What is Presto?

Presto is an open source distributed SQL query engine created by Facebook developers to run interactive analytics against large volumes of data. With Presto, organizations can use their existing SQL skills to query big data without having to learn complex new languages.

Learn how Presto on Dataproc can accelerate data analysis.

Ready to get started? New customers get $300 in free credits to spend on Google Cloud.

Presto defined

Presto is an open source SQL query engine. It uses industry-standard SQL to provide a fast, easy way to process big data and run ad hoc analysis on it, across multiple sources, both on-premises and in the cloud.

Presto's architecture is very similar to that of classic online analytical processing (OLAP) systems built on distributed computing: one master node, which Presto calls the coordinator, manages multiple worker nodes.
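
The pattern of one node coordinating several workers can be illustrated with a small Python sketch. This is a toy model, not Presto's implementation: the function names (make_splits, worker_partial_sum, coordinator_sum) are hypothetical, and threads on one machine stand in for worker nodes. It shows only the general idea of dividing the input into splits, computing partial results in parallel, and merging them.

```python
from concurrent.futures import ThreadPoolExecutor


def make_splits(rows, n_workers):
    """Coordinating step: divide the input into roughly equal splits."""
    return [rows[i::n_workers] for i in range(n_workers)]


def worker_partial_sum(split):
    """Worker step: compute a partial aggregate over one split."""
    return sum(split)


def coordinator_sum(rows, n_workers=4):
    """Fan the splits out to workers, then merge the partial results."""
    splits = make_splits(rows, n_workers)
    # Threads stand in for worker nodes in this toy model.
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        partials = list(pool.map(worker_partial_sum, splits))
    return sum(partials)


if __name__ == "__main__":
    print(coordinator_sum(list(range(1, 101))))  # prints 5050
```

The design point is that no single worker needs to see the whole dataset; only the small partial results travel back to the coordinating node.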

What is Presto used for?

With Presto, organizations can run federated queries, in which a single SQL statement spans several systems, across large-scale data repositories like BigQuery, Hadoop Distributed File System, Cloud Storage, Cloud SQL for MySQL, Apache Cassandra, or Apache Kafka. Here are some specific use cases.

Data warehousing

With Presto, you can run data warehouse queries, such as traditional OLAP workloads, on an open, distributed SQL query engine.

Ad hoc business intelligence

For fast data exploration and simple reporting, create a small Presto cluster, run queries against multiple data sources, then power it down.

Lightweight data prep

Quickly join and aggregate data to prepare your dataset and derived variables for ad hoc queries.
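
As a rough illustration of a join-and-aggregate step across two sources, the Python sketch below only builds the SQL text; it does not connect to a cluster. Presto addresses tables as catalog.schema.table, so a single statement can reference tables in different catalogs. The catalog, schema, table, and column names here (hive.sales.orders, mysql.crm.customers, customer_id, region, amount) and the helper functions are assumptions made up for this example.

```python
def qualified(catalog, schema, table):
    """Return a fully qualified table name in catalog.schema.table form."""
    return f"{catalog}.{schema}.{table}"


def build_join_aggregate_query(orders, customers):
    """Build SQL that joins two tables and sums order amounts per region.

    The column names are hypothetical and chosen only for illustration.
    """
    return (
        "SELECT c.region, SUM(o.amount) AS total_amount\n"
        f"FROM {orders} AS o\n"
        f"JOIN {customers} AS c ON o.customer_id = c.customer_id\n"
        "GROUP BY c.region"
    )


# Hypothetical catalogs: one backed by Hive, one by MySQL.
query = build_join_aggregate_query(
    qualified("hive", "sales", "orders"),
    qualified("mysql", "crm", "customers"),
)

if __name__ == "__main__":
    print(query)
```

The resulting statement could then be submitted through whichever Presto client you use; the per-region totals it produces are an example of a derived variable prepared for later ad hoc queries.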

The Presto optional component for Dataproc brings the full suite of support from Google Cloud, including fast cluster start-up times and integration testing with the rest of Dataproc.

In a Google Cloud solution, Dataproc with the Presto query engine component can analyze data held in BigQuery and Cloud Storage.