Data Visualization
You've learned how to set up a Presto cluster, connect to multiple data sources, and run federated queries. These tasks are straightforward mainly because:
- Presto provides many built-in data source connectors
- It uses a single query language: SQL
- Its performance is optimized for large-scale distributed workloads
 
These same features make it easy to integrate a data visualization tool with Presto. In this section, you will learn how to set up Apache Zeppelin, connect it to the Presto cluster, and run data analytics and visualization.
This section consists of two steps: setting up Apache Zeppelin and visualizing data.
1. Set up Apache Zeppelin
Use the Official Docker Image
Once the container is running, you can check its logs with `docker logs` to confirm that Zeppelin has started.
Then open http://localhost:8443 in a browser to access the Zeppelin dashboard.
Note
If you run the lab on a remote server, replace localhost with the server's IP address, for example http://192.168.0.1:8443.

Add Presto JDBC Driver
Thanks to the Presto JDBC Driver, you can easily integrate Apache Zeppelin with Presto.
- Click `anonymous` in the upper-right corner and select `interpreter` on the pop-up menu.
- Create a new interpreter by clicking the `Create` button under `anonymous`.
- Enter presto as the `Interpreter Name`. This means you use %presto as the directive on the first line of a paragraph in a Zeppelin notebook. Then select jdbc as the `Interpreter group`.
- Set the following values in the `Properties` section:
  - default.url: jdbc:presto://coordinator:8080/
  - default.user: zeppelin
  - default.driver: com.facebook.presto.jdbc.PrestoDriver
- Scroll down to the `Dependencies` section and point to the Presto JDBC driver with its Maven coordinates: com.facebook.presto:presto-jdbc:0.290. Click the `Save` button to save the settings.

You have created an interpreter to talk to the Presto cluster.
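A quick way to confirm that the new interpreter reaches the coordinator is to run a statement that needs no catalog or schema. The paragraph below is a minimal sketch, assuming the interpreter was named presto as described above; which catalogs appear depends on how your cluster is configured.

```sql
%presto
-- List the catalogs exposed by the Presto coordinator.
-- The output depends on the connectors configured in your cluster.
SHOW CATALOGS;
```

If the paragraph returns a list of catalogs, the JDBC URL, user, and driver settings are working.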
Note
The interpreter connects to Presto without a specific catalog and schema. When using the %presto interpreter, you have to either specify the catalog and schema in your SQL or run use <catalog>.<schema>; first.
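The two options in the note can be sketched as follows. This example assumes the tpch catalog and its sf1 schema are available in your cluster, as in the query used later in this section; the LIMIT value is arbitrary.

```sql
%presto
-- Option 1: fully qualify the table as <catalog>.<schema>.<table>.
SELECT name FROM tpch.sf1.nation LIMIT 5;

-- Option 2: set the default catalog and schema first,
-- then refer to tables by their short names.
USE tpch.sf1;
SELECT name FROM nation LIMIT 5;
```

Both SELECT statements read the same table; the second form is more convenient when a note runs many queries against one schema.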
2. Data Visualization
Apache Zeppelin is a web-based notebook that enables data-driven, interactive data analytics and collaborative documents with SQL, Scala, Python, R and more. Here, we show how easily you can use Presto in Zeppelin to run data analytics and visualization.
- Create a Zeppelin note by clicking `Notebook` on the top navigation menu and selecting `Create new note`.
- Name it presto and select presto as the `Default Interpreter`. Click the `Create` button.
- Copy and paste the following SQL into the first paragraph of the note:

      use tpch.sf1;
      SELECT n.name,
             sum(l.extendedprice * (1 - l.discount)) AS revenue
      FROM "customer" AS c,
           "orders" AS o,
           "lineitem" AS l,
           "supplier" AS s,
           "nation" AS n,
           "region" AS r
      WHERE c.custkey = o.custkey
        AND l.orderkey = o.orderkey
        AND l.suppkey = s.suppkey
        AND c.nationkey = s.nationkey
        AND s.nationkey = n.nationkey
        AND n.regionkey = r.regionkey
        AND r.name = 'ASIA'
        AND o.orderdate >= DATE '1994-01-01'
        AND o.orderdate < DATE '1994-01-01' + INTERVAL '1' YEAR
      GROUP BY n.name
      ORDER BY revenue DESC;

  Then click the triangle run button on the top menu bar or in the upper-right corner of the first paragraph.
- After the query finishes, the results appear below the SQL.
- You can select different built-in charts to visualize the results.
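To experiment with the chart types on a smaller result set, you could add a second paragraph with a simple aggregation. The query below is an illustrative sketch, not part of the original lab; it assumes the same tpch.sf1 schema and uses fully qualified table names, so it does not depend on an earlier use statement.

```sql
%presto
-- Count orders per priority level.
-- A two-column result (category, count) maps naturally onto a bar or pie chart.
SELECT orderpriority,
       count(*) AS order_count
FROM tpch.sf1.orders
GROUP BY orderpriority
ORDER BY orderpriority;
```

After the paragraph runs, switch between the table, bar, and pie views to see how the same result renders in each.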
