Podling Hivemall in the Apache Incubator
TRANSCRIPT
![Page 1: Podling Hivemall in the Apache Incubator](https://reader036.vdocuments.net/reader036/viewer/2022062823/586fde8c1a28ab18428b6c15/html5/thumbnails/1.jpg)
Podling Hivemall in the Apache Incubator
Research Engineer Makoto YUI @myui
2016/11/08 Apache Hadoop Meetup at CWT2016
![Page 2: Podling Hivemall in the Apache Incubator](https://reader036.vdocuments.net/reader036/viewer/2022062823/586fde8c1a28ab18428b6c15/html5/thumbnails/2.jpg)
Hivemall entered Apache Incubator on Sept 13, 2016 🎉
hivemall.incubator.apache.org
@ApacheHivemall
![Page 3: Podling Hivemall in the Apache Incubator](https://reader036.vdocuments.net/reader036/viewer/2022062823/586fde8c1a28ab18428b6c15/html5/thumbnails/3.jpg)
Initial committers
• Makoto Yui <Treasure Data>
• Takeshi Yamamuro <NTT>
  ➢ Hivemall on Apache Spark
• Daniel Dai <Hortonworks>
  ➢ Hivemall on Apache Pig
  ➢ Apache Pig PMC member
• Tsuyoshi Ozawa <NTT>
  ➢ Apache Hadoop PMC member
• Kai Sasaki <Treasure Data>
![Page 4: Podling Hivemall in the Apache Incubator](https://reader036.vdocuments.net/reader036/viewer/2022062823/586fde8c1a28ab18428b6c15/html5/thumbnails/4.jpg)
Project mentors
Champion
Nominated Mentors
• Reynold Xin <Databricks, ASF member>, Apache Spark PMC member
• Markus Weimer <Microsoft, ASF member>, Apache REEF PMC member
• Xiangrui Meng <Databricks, ASF member>, Apache Spark PMC member
• Roman Shaposhnik <Pivotal, ASF member>, Apache Bigtop / Incubator PMC member
![Page 5: Podling Hivemall in the Apache Incubator](https://reader036.vdocuments.net/reader036/viewer/2022062823/586fde8c1a28ab18428b6c15/html5/thumbnails/5.jpg)
What is Apache Hivemall
Scalable machine learning library built as a collection of Hive UDFs
Multi/Cross-platform, Versatile, Scalable, Ease-of-use
![Page 6: Podling Hivemall in the Apache Incubator](https://reader036.vdocuments.net/reader036/viewer/2022062823/586fde8c1a28ab18428b6c15/html5/thumbnails/6.jpg)
Hivemall is easy and scalable…
Classification with Mahout

    CREATE TABLE lr_model AS
    SELECT
      feature, -- reducers perform model averaging in parallel
      avg(weight) as weight
    FROM (
      SELECT logress(features, label, ..) as (feature, weight)
      FROM train
    ) t -- map-only task
    GROUP BY feature; -- shuffled to reducers

ML made easy for SQL developers
Born to be parallel and scalable
This SQL query automatically runs in parallel on a Hadoop cluster
Ease-of-use
Scalable
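
The query on this slide trains in map-only tasks and then averages the weights per feature in the reducers. The following is a minimal Python sketch of that data flow only, assuming plain SGD logistic regression as the per-partition learner. The names `train_partition` and `average_models`, the learning rate, the epoch count, and the toy data are all hypothetical; this is not Hivemall's API or its actual `logress` implementation.

```python
import math
from collections import defaultdict


def train_partition(rows, eta=0.1, epochs=20):
    """Map-side step (assumed learner): SGD logistic regression on one partition.

    rows: iterable of (features, label), where features maps name -> value
    and label is 0 or 1. Returns (feature, weight) pairs, like the
    (feature, weight) rows emitted by the inner SELECT.
    """
    weights = defaultdict(float)
    for _ in range(epochs):
        for features, label in rows:
            z = sum(weights[f] * v for f, v in features.items())
            p = 1.0 / (1.0 + math.exp(-z))
            for f, v in features.items():
                weights[f] += eta * (label - p) * v
    return list(weights.items())


def average_models(partial_models):
    """Reduce-side step: the equivalent of avg(weight) ... GROUP BY feature."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for model in partial_models:
        for feature, weight in model:
            sums[feature] += weight
            counts[feature] += 1
    return {f: sums[f] / counts[f] for f in sums}


# Toy data: two partitions, each trained independently, then averaged.
partitions = [
    [({"x": 1.0, "bias": 1.0}, 1), ({"x": -1.0, "bias": 1.0}, 0)],
    [({"x": 2.0, "bias": 1.0}, 1), ({"x": -2.0, "bias": 1.0}, 0)],
]
model = average_models(train_partition(rows) for rows in partitions)
print(model)
```

Each feature is averaged only over the partitions that emitted a weight for it, which is how `avg(weight)` behaves under `GROUP BY feature`.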
![Page 7: Podling Hivemall in the Apache Incubator](https://reader036.vdocuments.net/reader036/viewer/2022062823/586fde8c1a28ab18428b6c15/html5/thumbnails/7.jpg)
Hivemall is a multi/cross-platform ML library
HiveQL, SparkSQL/DataFrame API, Pig Latin
Hivemall is Multi/Cross-platform..
Multi/Cross-platform
Prediction models built by Hive can be used from Spark, and conversely, prediction models built by Spark can be used from Hive
![Page 8: Podling Hivemall in the Apache Incubator](https://reader036.vdocuments.net/reader036/viewer/2022062823/586fde8c1a28ab18428b6c15/html5/thumbnails/8.jpg)
Hivemall on Apache Hive
![Page 9: Podling Hivemall in the Apache Incubator](https://reader036.vdocuments.net/reader036/viewer/2022062823/586fde8c1a28ab18428b6c15/html5/thumbnails/9.jpg)
Hivemall on Apache Spark DataFrame
![Page 10: Podling Hivemall in the Apache Incubator](https://reader036.vdocuments.net/reader036/viewer/2022062823/586fde8c1a28ab18428b6c15/html5/thumbnails/10.jpg)
Hivemall on SparkSQL
![Page 11: Podling Hivemall in the Apache Incubator](https://reader036.vdocuments.net/reader036/viewer/2022062823/586fde8c1a28ab18428b6c15/html5/thumbnails/11.jpg)
Hivemall on Apache Pig
![Page 12: Podling Hivemall in the Apache Incubator](https://reader036.vdocuments.net/reader036/viewer/2022062823/586fde8c1a28ab18428b6c15/html5/thumbnails/12.jpg)
Versatile
Hivemall is a Versatile library..
✓ Hivemall is not only for Machine Learning
✓ Hivemall provides a bunch of generic utility functions (e.g., top-k, NLP)
Each organization has its own set of UDFs for data preprocessing!
Don’t Repeat Yourself!
![Page 13: Podling Hivemall in the Apache Incubator](https://reader036.vdocuments.net/reader036/viewer/2022062823/586fde8c1a28ab18428b6c15/html5/thumbnails/13.jpg)
Conclusion and Takeaway
Hivemall is a machine learning library that is…
Multi/Cross-platform, Versatile, Scalable, Ease-of-use
We welcome your contributions to Apache Hivemall 🙂
hivemall.incubator.apache.org