The Evolution of Apache Kylin by Luke Han
TRANSCRIPT
![Page 2: The Evolution of Apache Kylin by Luke Han](https://reader034.vdocuments.net/reader034/viewer/2022042619/587332701a28ab596c8b6e2b/html5/thumbnails/2.jpg)
About me…
§ Luke Han | 韩卿
§ Co-creator & VP of Apache Kylin
§ ASF Member
§ Co-founder & CEO at Kyligence Inc
§ Twitter: @lukehq
![Page 3: The Evolution of Apache Kylin by Luke Han](https://reader034.vdocuments.net/reader034/viewer/2022042619/587332701a28ab596c8b6e2b/html5/thumbnails/3.jpg)
Apache Kylin
![Page 4: The Evolution of Apache Kylin by Luke Han](https://reader034.vdocuments.net/reader034/viewer/2022042619/587332701a28ab596c8b6e2b/html5/thumbnails/4.jpg)
Why
Happiness
Latency 10s
![Page 5: The Evolution of Apache Kylin by Luke Han](https://reader034.vdocuments.net/reader034/viewer/2022042619/587332701a28ab596c8b6e2b/html5/thumbnails/5.jpg)
What have we tried?
Kylin
![Page 6: The Evolution of Apache Kylin by Luke Han](https://reader034.vdocuments.net/reader034/viewer/2022042619/587332701a28ab596c8b6e2b/html5/thumbnails/6.jpg)
About Apache Kylin
http://kylin.apache.org
Extreme OLAP Engine for Big Data
Apache Kylin is an open source Distributed Analytics Engine designed to provide SQL interface and multi-dimensional analysis (OLAP) on Hadoop, supporting extremely large datasets and sub-second level response time.
kylin /ˈkiːˈlɪn/ 麒麟 -- n. (in Chinese art) a mythical animal of composite form
![Page 7: The Evolution of Apache Kylin by Luke Han](https://reader034.vdocuments.net/reader034/viewer/2022042619/587332701a28ab596c8b6e2b/html5/thumbnails/7.jpg)
About Apache Kylin
OLAP / Data Mart
• Born for Big Data Analytics
• Sub-second Latency
• ANSI SQL
• Seamless Integration with BI Tools
• Pluggable Architecture
![Page 8: The Evolution of Apache Kylin by Luke Han](https://reader034.vdocuments.net/reader034/viewer/2022042619/587332701a28ab596c8b6e2b/html5/thumbnails/8.jpg)
[Cuboid lattice diagram for the dimensions time, item, location, supplier]
0-D (apex) cuboid
1-D cuboids: time; item; location; supplier
2-D cuboids: time, item; time, location; time, supplier; item, location; item, supplier; location, supplier
3-D cuboids: time, item, location; time, item, supplier; time, location, supplier; item, location, supplier
4-D (base) cuboid: time, item, location, supplier
• Base vs. aggregate cells; ancestor vs. descendant cells; parent vs. child cells
1. (9/15, milk, Urbana, Dairy_land) - <time, item, location, supplier>
2. (9/15, milk, Urbana, *) - <time, item, location>
3. (*, milk, Urbana, *) - <item, location>
4. (*, milk, Chicago, *) - <item, location>
5. (*, milk, *, *) - <item>
• Cuboid = one combination of dimensions
• Cube = all combinations of dimensions (all cuboids)
OLAP Cube
Cube - Balance Between Space and Time
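To make the cuboid lattice concrete, here is a minimal Python sketch that enumerates every cuboid for the four dimensions on this slide and rolls a base cell up to an aggregate cell. The function names `enumerate_cuboids` and `project` are made up for illustration and are not part of Kylin.

```python
from itertools import combinations

# The four dimensions used in the slide's example.
DIMENSIONS = ("time", "item", "location", "supplier")


def enumerate_cuboids(dimensions):
    """Map each level k to the list of k-D cuboids (k-sized dimension subsets)."""
    return {
        k: list(combinations(dimensions, k))
        for k in range(len(dimensions) + 1)
    }


def project(cell, cuboid, dimensions=DIMENSIONS):
    """Roll a base cell up to a cuboid: dimensions not in the cuboid become '*'."""
    return tuple(v if d in cuboid else "*" for d, v in zip(dimensions, cell))


lattice = enumerate_cuboids(DIMENSIONS)
for level, cuboids in lattice.items():
    print(f"{level}-D: {len(cuboids)} cuboid(s)")

# A cube over n dimensions has 2 ** n cuboids in total.
total = sum(len(cuboids) for cuboids in lattice.values())
print("total cuboids:", total)

base_cell = ("9/15", "milk", "Urbana", "Dairy_land")
print(project(base_cell, ("item", "location")))
```

The total doubles with every added dimension, which is the space side of the space/time balance: precomputing more cuboids answers more queries directly but costs more storage.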
![Page 9: The Evolution of Apache Kylin by Luke Han](https://reader034.vdocuments.net/reader034/viewer/2022042619/587332701a28ab596c8b6e2b/html5/thumbnails/9.jpg)
Architecture
MapReduce/Spark
Kylin
BI Tools, Web App…
ANSI SQL
![Page 10: The Evolution of Apache Kylin by Luke Han](https://reader034.vdocuments.net/reader034/viewer/2022042619/587332701a28ab596c8b6e2b/html5/thumbnails/10.jpg)
Apache Kylin Journey
§ Sept 2013: Project kickoff
§ Oct 2014: Go live at eBay & open source on GitHub
§ Nov 2014: Apache Incubator
§ June 2015: First Apache release v0.71
§ Sept 2015: InfoWorld Bossie Award, Best Open Source Big Data Tool
§ Sept 2015: Apache release v1.0
§ Nov 2015: Apache Top Level Project
§ Mar 2016: Kyligence founded
![Page 11: The Evolution of Apache Kylin by Luke Han](https://reader034.vdocuments.net/reader034/viewer/2022042619/587332701a28ab596c8b6e2b/html5/thumbnails/11.jpg)
Apache Kylin Global Adoptions
![Page 12: The Evolution of Apache Kylin by Luke Han](https://reader034.vdocuments.net/reader034/viewer/2022042619/587332701a28ab596c8b6e2b/html5/thumbnails/12.jpg)
Use Case: JD.com
![Page 13: The Evolution of Apache Kylin by Luke Han](https://reader034.vdocuments.net/reader034/viewer/2022042619/587332701a28ab596c8b6e2b/html5/thumbnails/13.jpg)
Use Case: Baidu Map
![Page 14: The Evolution of Apache Kylin by Luke Han](https://reader034.vdocuments.net/reader034/viewer/2022042619/587332701a28ab596c8b6e2b/html5/thumbnails/14.jpg)
Use Case: NetEase
![Page 15: The Evolution of Apache Kylin by Luke Han](https://reader034.vdocuments.net/reader034/viewer/2022042619/587332701a28ab596c8b6e2b/html5/thumbnails/15.jpg)
Performance and Throughput
By NetEase: http://www.bitstech.net/2016/01/04/kylin-olap/
![Page 16: The Evolution of Apache Kylin by Luke Han](https://reader034.vdocuments.net/reader034/viewer/2022042619/587332701a28ab596c8b6e2b/html5/thumbnails/16.jpg)
The Evolution
![Page 17: The Evolution of Apache Kylin by Luke Han](https://reader034.vdocuments.net/reader034/viewer/2022042619/587332701a28ab596c8b6e2b/html5/thumbnails/17.jpg)
Apache Kylin New Features
§ Pluggable architecture
§ New MR Cube Engine with fast cubing (1.5x faster)
§ New HBase Storage with parallel scan (2x faster)
§ Near real-time analysis
§ User defined aggregations
§ Excel / Power BI / Zeppelin integration
![Page 18: The Evolution of Apache Kylin by Luke Han](https://reader034.vdocuments.net/reader034/viewer/2022042619/587332701a28ab596c8b6e2b/html5/thumbnails/18.jpg)
The Freedom, Extensibility, Flexibility
§ Freedom
  § Zoo break, not bound to Hadoop anymore
  § Free to go to a better engine or storage
§ Extensibility
  § Accept any input, e.g. Kafka
  § Embrace next-gen distributed platforms, e.g. Spark
§ Flexibility
  § Choose a different engine for each dataset
![Page 19: The Evolution of Apache Kylin by Luke Han](https://reader034.vdocuments.net/reader034/viewer/2022042619/587332701a28ab596c8b6e2b/html5/thumbnails/19.jpg)
New generation design
Ø Online Analysis Data Flow
Ø Offline Data Flow
Ø Clients/users interact with Kylin via SQL
Ø OLAP Cube is transparent to users
[Architecture diagram labels]
Clients: 3rd Party App (Web App, Mobile…) via REST API; SQL-Based Tool (BI Tools: Tableau…) via JDBC/ODBC
Kylin: REST Server, Query Engine, Routing, Metadata, Cube Builder (MapReduce…)
Abstractions: Data Source Abstraction, Engine Abstraction, Storage Abstraction
Data: Hadoop Hive (Star Schema Data); Data Cube / OLAP Cubes (HBase) (Key Value Data)
Query path: SQL, Low Latency - Seconds
![Page 20: The Evolution of Apache Kylin by Luke Han](https://reader034.vdocuments.net/reader034/viewer/2022042619/587332701a28ab596c8b6e2b/html5/thumbnails/20.jpg)
Pluggable architecture
[Diagram labels] Cube Metadata; SourceFactory, EngineFactory, StorageFactory; Hive Source; MR Engine (IN, OUT); HBase Storage
![Page 21: The Evolution of Apache Kylin by Luke Han](https://reader034.vdocuments.net/reader034/viewer/2022042619/587332701a28ab596c8b6e2b/html5/thumbnails/21.jpg)
Pluggable architecture
Hive Source → load data → Hive Adapter (adapt to IN) → MR Engine → HBase Adapter (adapt to OUT) → save cube → HBase Storage
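The adapter idea on these two slides can be sketched roughly as follows. This is a toy illustration, not Kylin's actual interfaces: the classes `Source`, `Storage`, `ListSource`, `DictStorage`, `SumEngine` and the function `build_from_metadata` are all hypothetical names, with an in-memory list standing in for the Hive source and a dict standing in for HBase storage.

```python
from abc import ABC, abstractmethod
from collections import defaultdict


class Source(ABC):
    """Anything that can feed rows to an engine (the engine's IN side)."""

    @abstractmethod
    def load_data(self):
        ...


class Storage(ABC):
    """Anything that can persist a built cube (the engine's OUT side)."""

    @abstractmethod
    def save_cube(self, cube):
        ...


class ListSource(Source):
    """Hypothetical stand-in for a Hive source: rows held in memory."""

    def __init__(self, rows):
        self.rows = rows

    def load_data(self):
        return iter(self.rows)


class DictStorage(Storage):
    """Hypothetical stand-in for HBase storage: cells held in a dict."""

    def __init__(self):
        self.cells = {}

    def save_cube(self, cube):
        self.cells.update(cube)


class SumEngine:
    """Toy engine that only depends on the Source and Storage interfaces."""

    def __init__(self, dims, measure):
        self.dims = dims
        self.measure = measure

    def build(self, source, storage):
        cube = defaultdict(int)
        for row in source.load_data():
            key = tuple(row[d] for d in self.dims)
            cube[key] += row[self.measure]
        storage.save_cube(dict(cube))


# Registries play the role of the factories: metadata picks the plugins.
SOURCES = {"list": ListSource}
STORAGES = {"dict": DictStorage}


def build_from_metadata(metadata, rows):
    source = SOURCES[metadata["source"]](rows)
    storage = STORAGES[metadata["storage"]]()
    SumEngine(metadata["dims"], metadata["measure"]).build(source, storage)
    return storage


rows = [
    {"item": "milk", "location": "Urbana", "sales": 3},
    {"item": "milk", "location": "Chicago", "sales": 5},
    {"item": "milk", "location": "Urbana", "sales": 2},
]
metadata = {"source": "list", "storage": "dict",
            "dims": ("item", "location"), "measure": "sales"}
storage = build_from_metadata(metadata, rows)
print(storage.cells)
```

Because the engine sees only the two interfaces, swapping in a different source or storage means registering another class, with no change to the engine itself.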
![Page 22: The Evolution of Apache Kylin by Luke Han](https://reader034.vdocuments.net/reader034/viewer/2022042619/587332701a28ab596c8b6e2b/html5/thumbnails/22.jpg)
Parallel Scan
§ Slow queries are 5-10x faster.
§ New HBase storage enables partitioning of cuboids that are big enough.
§ Overall query time is 2x faster than before, summing results from 10,000+ queries.
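[Diagram: before, a query scans whole cuboids (Cuboid A, Cuboid B, Cuboid C) stored on Server 1, Server 2, and Server 3. After, cuboids A and B are split into partitions (A1, A2, A3; B1, B2) placed alongside C on the three servers, so one query is scanned by several servers in parallel.]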
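The partition-and-scan idea can be illustrated with a small Python sketch. It is not how Kylin or HBase implement parallel scan: the shards are plain in-memory lists, and `scan_shard` and `parallel_scan` are hypothetical helpers that scan each partition in its own thread and sum the partial results.

```python
from concurrent.futures import ThreadPoolExecutor


def scan_shard(shard, predicate):
    """Scan one partition and return the partial sum of matching cells."""
    return sum(value for key, value in shard if predicate(key))


def parallel_scan(shards, predicate):
    """Scan all partitions concurrently, then combine the partial sums."""
    with ThreadPoolExecutor(max_workers=len(shards)) as pool:
        partials = pool.map(lambda shard: scan_shard(shard, predicate), shards)
        return sum(partials)


# One cuboid <item, location> split into three partitions (made-up data).
shards = [
    [(("milk", "Urbana"), 5), (("bread", "Urbana"), 1)],
    [(("milk", "Chicago"), 5)],
    [(("eggs", "Chicago"), 2), (("milk", "Boston"), 4)],
]

total_milk = parallel_scan(shards, lambda key: key[0] == "milk")
print("milk sales:", total_milk)
```

This works because SUM is additive: each partition can be aggregated independently and the partial results combined at the end.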
![Page 23: The Evolution of Apache Kylin by Luke Han](https://reader034.vdocuments.net/reader034/viewer/2022042619/587332701a28ab596c8b6e2b/html5/thumbnails/23.jpg)
Near Real-time Incremental Build
§ Minute-level micro cubes
§ Kafka source
§ In-memory cubing
§ Auto merge
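A rough sketch of the micro-cube idea, under simplifying assumptions: each batch of events (standing in for a few minutes of Kafka messages) is aggregated in memory into a small segment, and segments are merged once there are too many of them. The names `build_micro_cube` and `auto_merge` and the `max_segments` threshold are hypothetical, not Kylin settings.

```python
from collections import Counter


def build_micro_cube(events, dims, measure):
    """Aggregate one batch of events in memory into a small cube segment."""
    cube = Counter()
    for event in events:
        cube[tuple(event[d] for d in dims)] += event[measure]
    return cube


def auto_merge(segments, max_segments):
    """Merge all segments into one when there are more than max_segments."""
    if len(segments) <= max_segments:
        return segments
    merged = Counter()
    for segment in segments:
        merged.update(segment)  # Counter.update adds the values per key
    return [merged]


# Three batches standing in for consecutive minute-level windows.
batches = [
    [{"item": "milk", "sales": 3}],
    [{"item": "milk", "sales": 2}, {"item": "bread", "sales": 1}],
    [{"item": "bread", "sales": 4}],
]

segments = []
for batch in batches:
    segments.append(build_micro_cube(batch, ("item",), "sales"))
    segments = auto_merge(segments, max_segments=2)

print(segments)
```

Small segments keep new data queryable quickly, while merging keeps the number of segments a query has to touch from growing without bound.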
![Page 24: The Evolution of Apache Kylin by Luke Han](https://reader034.vdocuments.net/reader034/viewer/2022042619/587332701a28ab596c8b6e2b/html5/thumbnails/24.jpg)
User Defined Aggregation Types
§ HyperLogLog Count Distinct
§ TopN
§ BitMap Precise Count Distinct
  § from Sun, Yerui (meituan.com)
§ Raw Records
  § from Wang, Xiaoyu (jd.com)
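Count distinct needs its own aggregation type because distinct counts cannot simply be added when cells are rolled up. The sketch below shows the idea behind a bitmap-based precise count distinct, using a plain Python integer as the bitmap. It illustrates the technique only and is not Kylin's implementation; `to_bitmap` and `cardinality` are hypothetical helpers, and the user IDs are made up.

```python
def to_bitmap(user_ids):
    """Encode a collection of non-negative integer IDs as a bitmap."""
    bitmap = 0
    for uid in user_ids:
        bitmap |= 1 << uid
    return bitmap


def cardinality(bitmap):
    """Number of distinct IDs in the bitmap (count of set bits)."""
    return bin(bitmap).count("1")


# Distinct buyers stored per <location> cell; user 3 bought in both cities.
urbana = to_bitmap([1, 2, 3])
chicago = to_bitmap([3, 4])

# Adding per-cell counts double-counts user 3.
naive = cardinality(urbana) + cardinality(chicago)

# Merging the bitmaps with OR and then counting gives the exact answer.
precise = cardinality(urbana | chicago)

print("naive sum:", naive)
print("precise distinct:", precise)
```

Storing a mergeable structure per cell, rather than a bare number, is what lets a rolled-up query stay exact. HyperLogLog follows the same pattern with a much smaller, approximate structure.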
![Page 25: The Evolution of Apache Kylin by Luke Han](https://reader034.vdocuments.net/reader034/viewer/2022042619/587332701a28ab596c8b6e2b/html5/thumbnails/25.jpg)
Support more BI & Visualization Tools
§ Supports Tableau 9.1
§ Supports MS Excel
§ Supports MS Power BI
§ Supports Zeppelin
![Page 26: The Evolution of Apache Kylin by Luke Han](https://reader034.vdocuments.net/reader034/viewer/2022042619/587332701a28ab596c8b6e2b/html5/thumbnails/26.jpg)
Roadmap
![Page 27: The Evolution of Apache Kylin by Luke Han](https://reader034.vdocuments.net/reader034/viewer/2022042619/587332701a28ab596c8b6e2b/html5/thumbnails/27.jpg)
Apache Kylin Roadmap
![Page 28: The Evolution of Apache Kylin by Luke Han](https://reader034.vdocuments.net/reader034/viewer/2022042619/587332701a28ab596c8b6e2b/html5/thumbnails/28.jpg)
2016 Focus…
§ Streaming and Real Time
§ Performance, performance and performance
§ Support more BI & visualization tools
§ SQL & OLAP Functions
![Page 29: The Evolution of Apache Kylin by Luke Han](https://reader034.vdocuments.net/reader034/viewer/2022042619/587332701a28ab596c8b6e2b/html5/thumbnails/29.jpg)
Q&A
§ More…
  § Website: http://kylin.apache.org
  § Twitter: @ApacheKylin
§ Contact Me:
  § [email protected]
  § @lukehq