Introduction
Presto Workshop - Building an Open Data Lakehouse with Presto
Welcome to our workshop! In this workshop, you’ll learn the basics of Presto, the open-source SQL query engine, and its support for Iceberg. You’ll get Presto running locally on your machine and connect it to an S3-based data source and a REST catalog server, which enables the Iceberg support. This is a beginner-level workshop for software developers and engineers who are new to Presto and Iceberg. By the end of the workshop, you will understand the Iceberg table format and know how to integrate Presto with Iceberg and MinIO.
The goals of this workshop are to show you:
- What Apache Iceberg is and how to use it
- How to connect Presto to MinIO S3 storage and an Iceberg-compatible REST server using Docker
- How to take advantage of Iceberg from Presto, and why you would want to
About this workshop
The workshop is broken down into the following sections:
Agenda
| Section | Description |
| --- | --- |
| Introduction | Introduction to the technologies used |
| Prerequisite | Prerequisites for the workshop |
| Lab 1: Set up an Open Lakehouse | Set up a Presto cluster with REST catalog and data source connection |
| Lab 2: Set up the Data Source | Set up a storage bucket in MinIO |
| Lab 3: Exploring Iceberg Tables | Explore how to create Iceberg tables and use Iceberg features |
Compatibility
This workshop has been tested on the following platforms:
- Linux: Ubuntu 22.04
- macOS: M1 Mac
Technology Used
- Docker: A container engine to run several applications in self-contained containers.
- Presto: A fast and reliable SQL engine for data analytics and the open lakehouse.
- Apache Iceberg: A high-performance format for huge analytic tables.
- MinIO: A high-performance, S3-compatible object store.