calcite meetup-2016-04-20

Post on 15-Apr-2017

Introduction to Apache Calcite

Josh Elser

MTS

2016-04-20

2 © Hortonworks Inc. 2011 – 2016. All Rights Reserved

About me

Apache Calcite is a project at the Apache Software Foundation. This name is a trademark of the Foundation.

Apache Calcite Committer and PMC

(Slowly) Re-learning SQL

Distributed systems nerd

Users

Apache Kylin

Apache Samza

Quark

SQL-Gremlin/Apache TinkerPop

See the respective project pages at the ASF

Brief History

Originally known as “Optiq” (https://github.com/julianhyde/optiq): 2012-05-07

Entered the Apache Software Foundation’s Incubator: 2014-05-25

Renamed to Apache Calcite (incubating): 2014-09-30

Graduated to top-level project (TLP): 2015-10-21

2 major releases since graduation: 2016-03-XX

Currently comprises 16 committers and 14 PMC members

“The foundation for your next high-performance database.”

Agenda

SQL Parser

SQL Parser

SELECT d.name, COUNT(*) as c
FROM Emps as e JOIN Depts as d ON e.deptno = d.deptno
WHERE e.age < 30
GROUP BY d.deptno
HAVING COUNT(*) > 5
ORDER BY c DESC

Plan diagram (operators, bottom to top): Scan, Scan → Join → Filter → Aggregate → Filter → Project → Sort

https://calcite.apache.org/docs/reference.html
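
The operator chain on this slide can be walked through by hand. The following is a minimal sketch in plain Java (16+ for records) that applies each operator in the same bottom-to-top order to in-memory data. It does not use Calcite; the class name, the records, and the sample rows are all invented for illustration. Like the query, it groups on deptno and looks up the department name when projecting.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Toy model only: plain Java collections, not Calcite classes.
class OperatorPipelineSketch {
    record Emp(String name, int age, int deptno) {}
    record Dept(int deptno, String name) {}
    record Row(String deptName, long c) {}

    static List<Row> run(List<Emp> emps, List<Dept> depts) {
        // Scan Depts, indexed by the join key.
        Map<Integer, Dept> deptByNo = depts.stream()
            .collect(Collectors.toMap(Dept::deptno, d -> d));
        return emps.stream()                                        // Scan Emps
            .filter(e -> deptByNo.containsKey(e.deptno()))          // Join ON e.deptno = d.deptno
            .filter(e -> e.age() < 30)                              // Filter: WHERE e.age < 30
            .collect(Collectors.groupingBy(Emp::deptno, Collectors.counting())) // Aggregate
            .entrySet().stream()
            .filter(g -> g.getValue() > 5)                          // Filter: HAVING COUNT(*) > 5
            .map(g -> new Row(deptByNo.get(g.getKey()).name(), g.getValue())) // Project
            .sorted(Comparator.comparingLong(Row::c).reversed())    // Sort: ORDER BY c DESC
            .collect(Collectors.toList());
    }

    // Invented sample data: 6 young employees in Sales, 7 in Eng, 2 in Ops.
    static List<Emp> sampleEmps() {
        List<Emp> emps = new ArrayList<>();
        for (int i = 0; i < 6; i++) emps.add(new Emp("s" + i, 25, 10));
        for (int i = 0; i < 7; i++) emps.add(new Emp("e" + i, 28, 20));
        for (int i = 0; i < 2; i++) emps.add(new Emp("o" + i, 22, 30));
        emps.add(new Emp("old", 45, 10)); // removed by the WHERE filter
        return emps;
    }

    static List<Dept> sampleDepts() {
        return List.of(new Dept(10, "Sales"), new Dept(20, "Eng"), new Dept(30, "Ops"));
    }

    public static void main(String[] args) {
        // Prints [Row[deptName=Eng, c=7], Row[deptName=Sales, c=6]]
        System.out.println(run(sampleEmps(), sampleDepts()));
    }
}
```

Ops is dropped by the HAVING filter because only 2 of its rows survive the WHERE filter, which shows why the two Filter operators sit on opposite sides of the Aggregate.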

Agenda

SQL Parser

Cost-Based Optimizer

Cost-Based Optimizer

Extensible

Java API
– Parser output
– Inline Java code

AKA Relational Algebra

RelBuilder builder = RelBuilder.create(config);
RelNode node = builder
    .scan("EMP")
    .project(builder.field("DEPTNO"), builder.field("ENAME"))
    .build();

SELECT ename, deptno FROM emp;

LogicalProject(DEPTNO, ENAME)
  LogicalTableScan(EMP)
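
To make the builder pattern above concrete without a Calcite dependency, here is a toy sketch of a stack-based plan builder in plain Java (16+). The `Rel`, `Scan`, `Project`, and `Builder` types are hypothetical stand-ins, not Calcite's `RelNode` or `RelBuilder`. The only thing borrowed from the slide is the idea that each call consumes the expression built so far and pushes a new one.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.List;

// Toy stand-ins for illustration; these are NOT Calcite's RelNode/RelBuilder classes.
class ToyRelBuilderSketch {
    interface Rel {
        String explain(int indent);
    }

    record Scan(String table) implements Rel {
        public String explain(int indent) {
            return "  ".repeat(indent) + "LogicalTableScan(" + table + ")\n";
        }
    }

    record Project(List<String> fields, Rel input) implements Rel {
        public String explain(int indent) {
            return "  ".repeat(indent) + "LogicalProject(" + String.join(", ", fields) + ")\n"
                + input.explain(indent + 1);
        }
    }

    // Each call pops its input(s) from the stack and pushes the new operator.
    static class Builder {
        private final Deque<Rel> stack = new ArrayDeque<>();

        Builder scan(String table) {
            stack.push(new Scan(table));
            return this;
        }

        Builder project(String... fields) {
            Rel input = stack.pop();
            stack.push(new Project(List.of(fields), input));
            return this;
        }

        Rel build() {
            return stack.pop();
        }
    }

    public static void main(String[] args) {
        Rel node = new Builder().scan("EMP").project("DEPTNO", "ENAME").build();
        // Prints:
        // LogicalProject(DEPTNO, ENAME)
        //   LogicalTableScan(EMP)
        System.out.print(node.explain(0));
    }
}
```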

Cost-Based Optimizer

SELECT d.name, COUNT(*) as c
FROM depts AS d
JOIN emp AS e on d.deptno = e.deptno
GROUP BY d.name;

Plan diagrams (two equivalent operator trees, bottom to top):

Scan Emp[*], Scan Depts[*] → Join → Aggregate → Project[name, c]

Scan Emp[deptno], Scan Depts[deptno, name] → Join → Aggregate
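
The choice between two equivalent plans like these can be sketched as a cost comparison. The example below is a deliberately tiny model in plain Java (16+): a plan's cost is assumed to be rows times columns read, summed over its scans. The row counts and column widths are invented, and Calcite's real cost model is richer than this; the sketch only illustrates picking the cheapest of several equivalent candidates.

```java
import java.util.Comparator;
import java.util.List;

// Toy cost model for illustration; not Calcite's planner or cost classes.
class ToyCostSketch {
    // Assumed cost of one scan: every row read times every column read.
    record ScanCost(String table, long rows, int columnsRead) {
        long cost() {
            return rows * columnsRead;
        }
    }

    record Plan(String description, List<ScanCost> scans) {
        long cost() {
            return scans.stream().mapToLong(ScanCost::cost).sum();
        }
    }

    // Cost-based choice: keep the candidate with the lowest estimated cost.
    static Plan cheapest(List<Plan> candidates) {
        return candidates.stream()
            .min(Comparator.comparingLong(Plan::cost))
            .orElseThrow();
    }

    // Invented statistics: Emp has 1000 rows and 8 columns, Depts has 10 rows and 4 columns.
    static Plan readEverything() {
        return new Plan("Scan Emp[*], Scan Depts[*]",
            List.of(new ScanCost("Emp", 1000, 8), new ScanCost("Depts", 10, 4)));
    }

    static Plan prunedColumns() {
        return new Plan("Scan Emp[deptno], Scan Depts[deptno, name]",
            List.of(new ScanCost("Emp", 1000, 1), new ScanCost("Depts", 10, 2)));
    }

    public static void main(String[] args) {
        Plan best = cheapest(List.of(readEverything(), prunedColumns()));
        // Prints: Scan Emp[deptno], Scan Depts[deptno, name] (cost 1020)
        System.out.println(best.description() + " (cost " + best.cost() + ")");
    }
}
```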

Agenda

SQL Parser

Cost-Based Optimizer

Pluggable Data Sources

Pluggable Data Sources

User-implemented
– Yes, you.

Custom optimizations
– Predicate pushdown
– Projections

Sources of Sources
– Federation

Everything but the data

Plan diagram (operators, bottom to top): Scan Emp[*], Scan Depts[*] → Join → Aggregate → Project
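
As a rough picture of what a user-implemented source with predicate and projection pushdown might look like, here is a sketch in plain Java. The `Table` interface and `InMemoryTable` class are hypothetical and much simpler than Calcite's adapter interfaces. The point is only that the engine hands the filter and the needed column indexes to the source, which applies them while scanning instead of returning every row and column.

```java
import java.util.List;
import java.util.function.Predicate;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical interfaces for illustration; not Calcite's adapter API.
class ToyAdapterSketch {
    interface Table {
        List<String> columns();

        // The caller passes the column indexes it needs and a row predicate;
        // the source is free to apply both during the scan.
        Stream<Object[]> scan(int[] projectedColumns, Predicate<Object[]> filter);
    }

    static class InMemoryTable implements Table {
        private final List<String> columns;
        private final List<Object[]> rows;

        InMemoryTable(List<String> columns, List<Object[]> rows) {
            this.columns = columns;
            this.rows = rows;
        }

        public List<String> columns() {
            return columns;
        }

        public Stream<Object[]> scan(int[] projectedColumns, Predicate<Object[]> filter) {
            return rows.stream()
                .filter(filter)                                   // predicate pushdown
                .map(row -> {                                     // projection pushdown
                    Object[] out = new Object[projectedColumns.length];
                    for (int i = 0; i < projectedColumns.length; i++) {
                        out[i] = row[projectedColumns[i]];
                    }
                    return out;
                });
        }
    }

    // Invented sample rows: (name, age, deptno).
    static Table sampleEmp() {
        return new InMemoryTable(
            List.of("name", "age", "deptno"),
            List.of(
                new Object[] {"Ann", 25, 10},
                new Object[] {"Bob", 41, 10},
                new Object[] {"Cy", 29, 20}));
    }

    // Equivalent of: SELECT name FROM emp WHERE age < 30
    static List<String> youngNames(Table emp) {
        return emp.scan(new int[] {0}, row -> (Integer) row[1] < 30)
            .map(row -> (String) row[0])
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(youngNames(sampleEmp())); // prints [Ann, Cy]
    }
}
```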

Agenda

SQL Parser

Cost-Based Optimizer

Pluggable Data Sources

Avatica

Avatica

Calcite sub-project

Wire protocol
– Protocol Buffers
– JSON

Metrics

Authentication

Clients
– JDBC client
– Python and Go (in-progress)

JDBC over HTTP – SQL for Everyone
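
From the client side, "JDBC over HTTP" means ordinary `java.sql` code pointed at an HTTP endpoint. The sketch below builds a connection URL for Avatica's remote JDBC driver and shows a standard JDBC helper that would run against it. The host, port, and class and method names are placeholders. Actually connecting assumes the Avatica client jar is on the classpath and a server is listening, so the connection call is left as a comment.

```java
import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

// Sketch of a JDBC client for an Avatica server; host and port are placeholders.
class AvaticaClientSketch {
    // Builds a URL for the Avatica remote driver.
    // serialization selects the wire format, e.g. "JSON" or "PROTOBUF".
    static String jdbcUrl(String host, int port, String serialization) {
        return "jdbc:avatica:remote:url=http://" + host + ":" + port
            + ";serialization=" + serialization;
    }

    // Plain JDBC: nothing here is specific to Avatica.
    static int countRows(Connection conn, String sql) throws SQLException {
        try (Statement stmt = conn.createStatement();
             ResultSet rs = stmt.executeQuery(sql)) {
            int n = 0;
            while (rs.next()) {
                n++;
            }
            return n;
        }
    }

    public static void main(String[] args) {
        String url = jdbcUrl("localhost", 8765, "PROTOBUF");
        System.out.println(url);
        // With the Avatica client jar available and a server running:
        // try (Connection conn = java.sql.DriverManager.getConnection(url)) {
        //     System.out.println(countRows(conn, "SELECT ename, deptno FROM emp"));
        // }
    }
}
```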

Thank You!

Email: elserj@apache.org
Twitter: @josh_elser
Mailing lists: dev@calcite.apache.org
Project info: https://calcite.apache.org/
