presto - slac conferences, workshops and … · 3 what is presto? •open source distributed sql...


© 2015 Teradata

Presto XLDB Lightning Talk – 5/24/2016

Matthew.Fuller@Teradata.com


What is Presto?

• Open source distributed SQL query engine

• Designed and written from the ground up for interactive analytical queries

• Scales to the sizes needed by organizations like Facebook

• Query data where it lives

• Hadoop Distribution agnostic

• Extensible


Presto = Performance

• Horizontal scale out

• Query execution is pipelined in memory

• Vectorized columnar processing

• Optimized data source readers (e.g. ORC)

• Presto is written in highly tuned Java

  – Efficient in-memory data structures

  – Very careful coding of inner loops

  – Bytecode generation
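To make "pipelined" and "columnar" concrete, here is a minimal Python sketch. It is a toy model, not Presto's Java implementation: the `Page` type, the operator names (`scan`, `filter_min_qty`, `total_revenue`), and the sample data are all assumptions made for illustration. Each operator consumes and produces small columnar batches, so no stage materializes the full table.

```python
from typing import Dict, Iterator, List

# Hypothetical "page": a small columnar batch, column name -> list of values.
Page = Dict[str, List[float]]


def scan(rows: List[dict], page_size: int) -> Iterator[Page]:
    """Yield the table as columnar pages instead of materializing it whole."""
    for start in range(0, len(rows), page_size):
        chunk = rows[start:start + page_size]
        yield {
            "price": [r["price"] for r in chunk],
            "qty": [r["qty"] for r in chunk],
        }


def filter_min_qty(pages: Iterator[Page], min_qty: float) -> Iterator[Page]:
    """Filter each page column-wise; pages flow through without buffering."""
    for page in pages:
        keep = [i for i, q in enumerate(page["qty"]) if q >= min_qty]
        yield {name: [col[i] for i in keep] for name, col in page.items()}


def total_revenue(pages: Iterator[Page]) -> float:
    """Aggregate one page at a time, working over whole columns."""
    total = 0.0
    for page in pages:
        total += sum(p * q for p, q in zip(page["price"], page["qty"]))
    return total


# Illustrative data; the generators chain into a single pipeline.
rows = [
    {"price": 2.0, "qty": 3},
    {"price": 5.0, "qty": 1},
    {"price": 1.5, "qty": 4},
]
result = total_revenue(filter_min_qty(scan(rows, page_size=2), min_qty=2))
```

Because the stages are generators, the aggregate pulls one page at a time through the filter and the scan, which is the sense of pipelining this sketch is meant to convey.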


Presto Architecture

[Architecture diagram. Labeled components: Client; Coordinator (Parser/analyzer, Planner, Scheduler); three Workers. Pluggable interfaces: Metadata API, Data location API, Data stream API.]
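The division of labor between a coordinator and its workers can be sketched as follows. This is a toy model under stated assumptions, not Presto's scheduler: the function names, the round-robin assignment, and the word-count "query" are all hypothetical stand-ins chosen to keep the example short.

```python
from collections import Counter
from typing import Dict, List


def plan_splits(lines: List[str], split_size: int) -> List[List[str]]:
    """Coordinator side: divide the input into independent splits."""
    return [lines[i:i + split_size] for i in range(0, len(lines), split_size)]


def schedule(splits: List[List[str]], workers: List[str]) -> Dict[str, list]:
    """Coordinator side: assign splits to workers round-robin (an assumption)."""
    assignment: Dict[str, list] = {w: [] for w in workers}
    for i, split in enumerate(splits):
        assignment[workers[i % len(workers)]].append(split)
    return assignment


def run_worker(splits: List[List[str]]) -> Counter:
    """Worker side: compute a partial word count over its own splits."""
    partial: Counter = Counter()
    for split in splits:
        for line in split:
            partial.update(line.split())
    return partial


def run_query(lines: List[str], workers: List[str], split_size: int = 2) -> Counter:
    """Plan, schedule, run each worker, then merge the partial results."""
    assignment = schedule(plan_splits(lines, split_size), workers)
    final: Counter = Counter()
    for worker in workers:
        final.update(run_worker(assignment[worker]))
    return final


# Illustrative input and worker names.
lines = ["a b", "b c", "a a", "c"]
counts = run_query(lines, workers=["w1", "w2", "w3"])
```

Here the workers run sequentially in one process; in a real distributed engine they would run in parallel on separate machines, and only the partial results would travel back for merging.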


Presto Extensibility – Connectors

[Connector diagram. Labeled components: Coordinator (Parser/analyzer, Planner, Scheduler) and Worker. Each pluggable API is shown with the same four connectors:]

• Metadata API: Hive, Cassandra, Kafka, MySQL

• Data location API: Hive, Cassandra, Kafka, MySQL

• Data stream API: Hive, Cassandra, Kafka, MySQL
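One way to picture a connector is as an object that answers three questions: what does the table look like, where are its pieces, and what rows does each piece contain. The Python sketch below is hypothetical throughout. The class and method names (`Connector`, `get_columns`, `get_splits`, `read_split`, `InMemoryConnector`, `scan_table`) are invented for illustration and are not Presto's Java SPI; only the three-API split comes from the slide.

```python
from abc import ABC, abstractmethod
from typing import Dict, Iterator, List, Tuple

Split = Tuple[int, int]  # hypothetical split: a [start, end) row range


class Connector(ABC):
    """Hypothetical interface mirroring the three APIs named on the slide."""

    @abstractmethod
    def get_columns(self, table: str) -> List[str]:
        """Metadata API: describe a table's columns."""

    @abstractmethod
    def get_splits(self, table: str) -> List[Split]:
        """Data location API: list independently readable chunks."""

    @abstractmethod
    def read_split(self, table: str, split: Split) -> Iterator[tuple]:
        """Data stream API: stream the rows of one chunk."""


class InMemoryConnector(Connector):
    """Toy connector backed by Python lists instead of a real data source."""

    def __init__(self, tables: Dict[str, Tuple[List[str], List[tuple]]],
                 split_size: int = 2):
        self.tables = tables
        self.split_size = split_size

    def get_columns(self, table: str) -> List[str]:
        return self.tables[table][0]

    def get_splits(self, table: str) -> List[Split]:
        n = len(self.tables[table][1])
        return [(s, min(s + self.split_size, n))
                for s in range(0, n, self.split_size)]

    def read_split(self, table: str, split: Split) -> Iterator[tuple]:
        start, end = split
        yield from self.tables[table][1][start:end]


def scan_table(connector: Connector, table: str) -> List[dict]:
    """Engine side: read a table using only the connector interface."""
    columns = connector.get_columns(table)
    rows = []
    for split in connector.get_splits(table):
        for row in connector.read_split(table, split):
            rows.append(dict(zip(columns, row)))
    return rows


# Illustrative table; names and values are made up.
events = InMemoryConnector(
    {"runs": (["run_id", "energy"], [(1, 7.0), (2, 8.0), (3, 13.0)])}
)
scanned = scan_table(events, "runs")
```

The point of the design is that `scan_table` never learns where the data lives. Swapping in a different `Connector` subclass changes the source without touching the engine-side code.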


Presto Connectors

[Diagram: Teradata QueryGrid™, showing entry points and targets.]

• Systems shown: Teradata Database, Aster Analytics, Presto / Hadoop, Hive / HDFS (Hadoop), Other Databases, NoSQL Databases

• Categories shown: Integrated Data Warehouses, Multi-Genre Advanced Analytics™, Multiple Hadoop SQL Query Engines and Distributions, 3rd Party Relational DBs, Non-Relational DBs

• Presto connectors shown: Apache Kafka, Apache Cassandra, MySQL, Postgres, Presto API, Amazon S3


Presto-to-Teradata & Teradata-to-Presto


Teradata & Presto

• 100% open source contributions to Presto to increase adoption in the enterprise

• A multi-year roadmap commitment to enhancements of the open source code

• The first ever commercial support offering for Presto

• Providing ODBC / JDBC drivers to the community

• Driving Business Intelligence tool integrations

• QueryGrid connectivity


Contributing!

• GitHub, Presto users group, IRC, Twitter, Facebook

• https://prestodb.io/community.html

• https://github.com/prestodb

• https://github.com/Teradata


Presto to Query Scientific Data?

• Let’s make a connector for scientific data! e.g. https://root.cern.ch/doc/master/classTFile.html
