Impala: A Modern, Open-Source SQL Engine for Hadoop
TRANSCRIPT
![Page 1: Impala: A Modern, Open-Source SQL Engine for Hadoop](https://reader035.vdocuments.net/reader035/viewer/2022062514/55a409191a28abf45e8b469f/html5/thumbnails/1.jpg)
![Page 2: Impala: A Modern, Open-Source SQL Engine for Hadoop](https://reader035.vdocuments.net/reader035/viewer/2022062514/55a409191a28abf45e8b469f/html5/thumbnails/2.jpg)
![Page 3: Impala: A Modern, Open-Source SQL Engine for Hadoop](https://reader035.vdocuments.net/reader035/viewer/2022062514/55a409191a28abf45e8b469f/html5/thumbnails/3.jpg)
![Page 4: Impala: A Modern, Open-Source SQL Engine for Hadoop](https://reader035.vdocuments.net/reader035/viewer/2022062514/55a409191a28abf45e8b469f/html5/thumbnails/4.jpg)
![Page 5: Impala: A Modern, Open-Source SQL Engine for Hadoop](https://reader035.vdocuments.net/reader035/viewer/2022062514/55a409191a28abf45e8b469f/html5/thumbnails/5.jpg)
![Page 6: Impala: A Modern, Open-Source SQL Engine for Hadoop](https://reader035.vdocuments.net/reader035/viewer/2022062514/55a409191a28abf45e8b469f/html5/thumbnails/6.jpg)
![Page 7: Impala: A Modern, Open-Source SQL Engine for Hadoop](https://reader035.vdocuments.net/reader035/viewer/2022062514/55a409191a28abf45e8b469f/html5/thumbnails/7.jpg)
![Page 8: Impala: A Modern, Open-Source SQL Engine for Hadoop](https://reader035.vdocuments.net/reader035/viewer/2022062514/55a409191a28abf45e8b469f/html5/thumbnails/8.jpg)
![Page 9: Impala: A Modern, Open-Source SQL Engine for Hadoop](https://reader035.vdocuments.net/reader035/viewer/2022062514/55a409191a28abf45e8b469f/html5/thumbnails/9.jpg)
![Page 10: Impala: A Modern, Open-Source SQL Engine for Hadoop](https://reader035.vdocuments.net/reader035/viewer/2022062514/55a409191a28abf45e8b469f/html5/thumbnails/10.jpg)
![Page 11: Impala: A Modern, Open-Source SQL Engine for Hadoop](https://reader035.vdocuments.net/reader035/viewer/2022062514/55a409191a28abf45e8b469f/html5/thumbnails/11.jpg)
![Page 13: Impala: A Modern, Open-Source SQL Engine for Hadoop](https://reader035.vdocuments.net/reader035/viewer/2022062514/55a409191a28abf45e8b469f/html5/thumbnails/13.jpg)
![Page 14: Impala: A Modern, Open-Source SQL Engine for Hadoop](https://reader035.vdocuments.net/reader035/viewer/2022062514/55a409191a28abf45e8b469f/html5/thumbnails/14.jpg)
![Page 15: Impala: A Modern, Open-Source SQL Engine for Hadoop](https://reader035.vdocuments.net/reader035/viewer/2022062514/55a409191a28abf45e8b469f/html5/thumbnails/15.jpg)
![Page 16: Impala: A Modern, Open-Source SQL Engine for Hadoop](https://reader035.vdocuments.net/reader035/viewer/2022062514/55a409191a28abf45e8b469f/html5/thumbnails/16.jpg)
![Page 17: Impala: A Modern, Open-Source SQL Engine for Hadoop](https://reader035.vdocuments.net/reader035/viewer/2022062514/55a409191a28abf45e8b469f/html5/thumbnails/17.jpg)
…
![Page 18: Impala: A Modern, Open-Source SQL Engine for Hadoop](https://reader035.vdocuments.net/reader035/viewer/2022062514/55a409191a28abf45e8b469f/html5/thumbnails/18.jpg)
![Page 19: Impala: A Modern, Open-Source SQL Engine for Hadoop](https://reader035.vdocuments.net/reader035/viewer/2022062514/55a409191a28abf45e8b469f/html5/thumbnails/19.jpg)
![Page 20: Impala: A Modern, Open-Source SQL Engine for Hadoop](https://reader035.vdocuments.net/reader035/viewer/2022062514/55a409191a28abf45e8b469f/html5/thumbnails/20.jpg)
![Page 21: Impala: A Modern, Open-Source SQL Engine for Hadoop](https://reader035.vdocuments.net/reader035/viewer/2022062514/55a409191a28abf45e8b469f/html5/thumbnails/21.jpg)
![Page 22: Impala: A Modern, Open-Source SQL Engine for Hadoop](https://reader035.vdocuments.net/reader035/viewer/2022062514/55a409191a28abf45e8b469f/html5/thumbnails/22.jpg)
![Page 23: Impala: A Modern, Open-Source SQL Engine for Hadoop](https://reader035.vdocuments.net/reader035/viewer/2022062514/55a409191a28abf45e8b469f/html5/thumbnails/23.jpg)
[Figure: Impala architecture. A SQL App connects over ODBC and sends a SQL request to one of three identical nodes, each labeled Query Planner, Query Coordinator, Query Executor, HDFS DN, and HBase. Shared services: Hive Metastore, HDFS NN, Statestore & Catalog.]
![Page 24: Impala: A Modern, Open-Source SQL Engine for Hadoop](https://reader035.vdocuments.net/reader035/viewer/2022062514/55a409191a28abf45e8b469f/html5/thumbnails/24.jpg)
[Figure: Impala architecture. A SQL App connects over ODBC to one of three identical nodes, each labeled Query Planner, Query Coordinator, Query Executor, HDFS DN, and HBase. Shared services: Hive Metastore, HDFS NN, Statestore & Catalog.]
Planner turns request into collections of plan fragments
Coordinator initiates execution on remote nodes
![Page 25: Impala: A Modern, Open-Source SQL Engine for Hadoop](https://reader035.vdocuments.net/reader035/viewer/2022062514/55a409191a28abf45e8b469f/html5/thumbnails/25.jpg)
[Figure: Impala architecture. A SQL App connects over ODBC to one of three identical nodes, each labeled Query Planner, Query Coordinator, Query Executor, HDFS DN, and HBase. Shared services: Hive Metastore, HDFS NN, Statestore & Catalog. An arrow labeled "query results" returns to the SQL App.]
Intermediate results are streamed between nodes
Operation permitted, query results are streamed back to client
![Page 26: Impala: A Modern, Open-Source SQL Engine for Hadoop](https://reader035.vdocuments.net/reader035/viewer/2022062514/55a409191a28abf45e8b469f/html5/thumbnails/26.jpg)
![Page 27: Impala: A Modern, Open-Source SQL Engine for Hadoop](https://reader035.vdocuments.net/reader035/viewer/2022062514/55a409191a28abf45e8b469f/html5/thumbnails/27.jpg)
![Page 28: Impala: A Modern, Open-Source SQL Engine for Hadoop](https://reader035.vdocuments.net/reader035/viewer/2022062514/55a409191a28abf45e8b469f/html5/thumbnails/28.jpg)
![Page 29: Impala: A Modern, Open-Source SQL Engine for Hadoop](https://reader035.vdocuments.net/reader035/viewer/2022062514/55a409191a28abf45e8b469f/html5/thumbnails/29.jpg)
// Generic version: loops over the slots and switches on each slot's type.
void MaterializeTuple(char* tuple) {
  for (int i = 0; i < num_slots_; ++i) {
    char* slot = tuple + offsets_[i];
    switch (types_[i]) {
      case BOOLEAN:
        *slot = ParseBoolean();
        break;
      case INT:
        *slot = ParseInt();
        break;
      case FLOAT: …
      case STRING: …
      // etc.
    }
  }
}

// Specialized version for one tuple layout: loop unrolled, switch removed.
void MaterializeTuple(char* tuple) {
  // i = 0
  *(tuple + 0) = ParseInt();
  // i = 1
  *(tuple + 4) = ParseBoolean();
  // i = 2
  *(tuple + 5) = ParseInt();
}
Hot code path, called per row
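The contrast between the two versions of MaterializeTuple can be reproduced in a small, compilable sketch. It is a hand-written analogue, not Impala's generated code. The slot layout (INT at offset 0, BOOLEAN at 4, INT at 5) is taken from the slide; `Parser`, `MaterializeGeneric`, and `MaterializeSpecialized` are hypothetical names, and `memcpy` stands in for the slide's `*slot = ...` shorthand so that a full 4-byte int is stored.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <utility>
#include <vector>

enum SlotType { BOOLEAN, INT };

// Hypothetical stand-in for the scanner's field parsers: replays canned values.
struct Parser {
  std::vector<int32_t> ints;
  std::vector<bool> bools;
  std::size_t next_int;
  std::size_t next_bool;
  Parser(std::vector<int32_t> i, std::vector<bool> b)
      : ints(std::move(i)), bools(std::move(b)), next_int(0), next_bool(0) {}
  int32_t ParseInt() { return ints[next_int++]; }
  bool ParseBoolean() { return bools[next_bool++]; }
};

// Generic path: one loop iteration and one switch per slot, for every row.
void MaterializeGeneric(Parser& p, const std::vector<SlotType>& types,
                        const std::vector<int>& offsets, char* tuple) {
  assert(types.size() == offsets.size());
  for (std::size_t i = 0; i < types.size(); ++i) {
    char* slot = tuple + offsets[i];
    switch (types[i]) {
      case BOOLEAN: {
        uint8_t v = p.ParseBoolean() ? 1 : 0;  // booleans occupy one byte
        std::memcpy(slot, &v, sizeof(v));
        break;
      }
      case INT: {
        int32_t v = p.ParseInt();
        std::memcpy(slot, &v, sizeof(v));
        break;
      }
    }
  }
}

// Specialized path for the layout INT@0, BOOLEAN@4, INT@5: no loop, no switch.
void MaterializeSpecialized(Parser& p, char* tuple) {
  int32_t a = p.ParseInt();
  std::memcpy(tuple + 0, &a, sizeof(a));
  uint8_t b = p.ParseBoolean() ? 1 : 0;
  std::memcpy(tuple + 4, &b, sizeof(b));
  int32_t c = p.ParseInt();
  std::memcpy(tuple + 5, &c, sizeof(c));
}
```

Both functions fill a 9-byte tuple buffer identically for this layout. The specialized one can only exist once the layout is known, which is why such code has to be produced at query time rather than compiled ahead of time.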
![Page 30: Impala: A Modern, Open-Source SQL Engine for Hadoop](https://reader035.vdocuments.net/reader035/viewer/2022062514/55a409191a28abf45e8b469f/html5/thumbnails/30.jpg)
![Page 31: Impala: A Modern, Open-Source SQL Engine for Hadoop](https://reader035.vdocuments.net/reader035/viewer/2022062514/55a409191a28abf45e8b469f/html5/thumbnails/31.jpg)
![Page 32: Impala: A Modern, Open-Source SQL Engine for Hadoop](https://reader035.vdocuments.net/reader035/viewer/2022062514/55a409191a28abf45e8b469f/html5/thumbnails/32.jpg)
![Page 33: Impala: A Modern, Open-Source SQL Engine for Hadoop](https://reader035.vdocuments.net/reader035/viewer/2022062514/55a409191a28abf45e8b469f/html5/thumbnails/33.jpg)
[Figure: Impala Daemon. Three Query Fragments sit above an IO Manager, which sits above five disks; threads are labeled Thread0 through Thread4.]
![Page 35: Impala: A Modern, Open-Source SQL Engine for Hadoop](https://reader035.vdocuments.net/reader035/viewer/2022062514/55a409191a28abf45e8b469f/html5/thumbnails/35.jpg)
container format for all popular serialization formats: Avro, Thrift, Protocol Buffers
![Page 36: Impala: A Modern, Open-Source SQL Engine for Hadoop](https://reader035.vdocuments.net/reader035/viewer/2022062514/55a409191a28abf45e8b469f/html5/thumbnails/36.jpg)
From Twitter’s “Dremel Made Simple” blog
The most efficient IO is the one that never happens at all
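A rough way to see why a columnar layout avoids IO: if each column is stored as its own contiguous chunk, a query only has to read the chunks for the columns it references. The sketch below is an illustration under that assumption, not a model of any real file format. `ColumnChunk`, both function names, and all byte sizes are hypothetical.

```cpp
#include <cstdint>
#include <set>
#include <string>
#include <vector>

// Hypothetical description of one stored column: its name and on-disk size.
struct ColumnChunk {
  std::string name;
  uint64_t bytes;
};

// Row-oriented layout: values of all columns are interleaved, so a scan
// reads every byte no matter which columns the query asks for.
uint64_t BytesReadRowLayout(const std::vector<ColumnChunk>& cols) {
  uint64_t total = 0;
  for (const auto& c : cols) total += c.bytes;
  return total;
}

// Columnar layout: only the chunks of the projected columns are read.
uint64_t BytesReadColumnarLayout(const std::vector<ColumnChunk>& cols,
                                 const std::set<std::string>& projected) {
  uint64_t total = 0;
  for (const auto& c : cols) {
    if (projected.count(c.name) > 0) total += c.bytes;
  }
  return total;
}
```

With made-up sizes of 400, 3000, and 100 bytes for three columns, a query touching only the last column reads 100 bytes from the columnar layout versus 3500 bytes from the row layout.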
![Page 37: Impala: A Modern, Open-Source SQL Engine for Hadoop](https://reader035.vdocuments.net/reader035/viewer/2022062514/55a409191a28abf45e8b469f/html5/thumbnails/37.jpg)
![Page 38: Impala: A Modern, Open-Source SQL Engine for Hadoop](https://reader035.vdocuments.net/reader035/viewer/2022062514/55a409191a28abf45e8b469f/html5/thumbnails/38.jpg)
![Page 39: Impala: A Modern, Open-Source SQL Engine for Hadoop](https://reader035.vdocuments.net/reader035/viewer/2022062514/55a409191a28abf45e8b469f/html5/thumbnails/39.jpg)
![Page 40: Impala: A Modern, Open-Source SQL Engine for Hadoop](https://reader035.vdocuments.net/reader035/viewer/2022062514/55a409191a28abf45e8b469f/html5/thumbnails/40.jpg)
![Page 41: Impala: A Modern, Open-Source SQL Engine for Hadoop](https://reader035.vdocuments.net/reader035/viewer/2022062514/55a409191a28abf45e8b469f/html5/thumbnails/41.jpg)
![Page 42: Impala: A Modern, Open-Source SQL Engine for Hadoop](https://reader035.vdocuments.net/reader035/viewer/2022062514/55a409191a28abf45e8b469f/html5/thumbnails/42.jpg)
![Page 43: Impala: A Modern, Open-Source SQL Engine for Hadoop](https://reader035.vdocuments.net/reader035/viewer/2022062514/55a409191a28abf45e8b469f/html5/thumbnails/43.jpg)
![Page 44: Impala: A Modern, Open-Source SQL Engine for Hadoop](https://reader035.vdocuments.net/reader035/viewer/2022062514/55a409191a28abf45e8b469f/html5/thumbnails/44.jpg)
![Page 45: Impala: A Modern, Open-Source SQL Engine for Hadoop](https://reader035.vdocuments.net/reader035/viewer/2022062514/55a409191a28abf45e8b469f/html5/thumbnails/45.jpg)
![Page 46: Impala: A Modern, Open-Source SQL Engine for Hadoop](https://reader035.vdocuments.net/reader035/viewer/2022062514/55a409191a28abf45e8b469f/html5/thumbnails/46.jpg)
![Page 47: Impala: A Modern, Open-Source SQL Engine for Hadoop](https://reader035.vdocuments.net/reader035/viewer/2022062514/55a409191a28abf45e8b469f/html5/thumbnails/47.jpg)
![Page 48: Impala: A Modern, Open-Source SQL Engine for Hadoop](https://reader035.vdocuments.net/reader035/viewer/2022062514/55a409191a28abf45e8b469f/html5/thumbnails/48.jpg)
OVER PARTITION, RANK, LEAD, LAG, NTILE, ..
•VARCHAR, CHAR
![Page 49: Impala: A Modern, Open-Source SQL Engine for Hadoop](https://reader035.vdocuments.net/reader035/viewer/2022062514/55a409191a28abf45e8b469f/html5/thumbnails/49.jpg)
ROLLUP, CUBE, GROUPING SET
SET MINUS INTERSECT
![Page 50: Impala: A Modern, Open-Source SQL Engine for Hadoop](https://reader035.vdocuments.net/reader035/viewer/2022062514/55a409191a28abf45e8b469f/html5/thumbnails/50.jpg)
![Page 51: Impala: A Modern, Open-Source SQL Engine for Hadoop](https://reader035.vdocuments.net/reader035/viewer/2022062514/55a409191a28abf45e8b469f/html5/thumbnails/51.jpg)
SELECT question FROM audience WHERE has_question = true;