CS 4604: Introduction to Database Management Systems
B. Aditya Prakash
Lecture #12: NoSQL and MapReduce
NOSQL (some slides from Xiao Yu)
Prakash 2016, VT CS 4604
Why NoSQL?
RDBMS
§ The predominant choice in storing data
– Not so true for data miners, since we work mostly with txt files.
§ First formulated in 1969 by Codd
– We are using RDBMS everywhere
Slide from Neo Technology, “A NoSQL Overview and the Benefits of Graph Databases”
When RDBMS met Web 2.0
Slide from Lorenzo Alberton, “NoSQL Databases: Why, what and when”
What to do if data is really large?
§ Peta-bytes (exabytes, zettabytes, ...)
§ Google processed 24 PB of data per day (2009)
§ FB adds 0.5 PB per day
BIG data
What’s Wrong with Relational DB?
§ Nothing is wrong. You just need to use the right tool.
§ Relational is hard to scale.
– Easy to scale reads
– Hard to scale writes
What’s NoSQL?
§ The misleading term “NoSQL” is short for “Not Only SQL”.
§ non-relational, schema-free, non-(quite)-ACID
– More on ACID transactions later in class
§ horizontally scalable, distributed, easy replication support
§ simple API
Four (emerging) NoSQL Categories
§ Key-value (K-V) stores
– Based on Distributed Hash Tables / Amazon’s Dynamo paper*
– Data model: (global) collection of K-V pairs
– Example: Voldemort
§ Column Families
– BigTable clones**
– Data model: big table, column families
– Example: HBase, Cassandra, Hypertable
* G. DeCandia et al., Dynamo: Amazon's Highly Available Key-value Store, SOSP 07
** F. Chang et al., Bigtable: A Distributed Storage System for Structured Data, OSDI 06
Four (emerging) NoSQL Categories
§ Document databases
– Inspired by Lotus Notes
– Data model: collections of K-V collections
– Example: CouchDB, MongoDB
§ Graph databases
– Inspired by Euler & graph theory
– Data model: nodes, relations, K-V on both
– Example: AllegroGraph, VertexDB, Neo4j
Focus of Different Data Models
Slide from Neo Technology, “A NoSQL Overview and the Benefits of Graph Databases”
C-A-P “theorem”
Consistency
Availability
Partition Tolerance
RDBMS
NoSQL (most)
When to use NoSQL?
§ Bigness
§ Massive write performance
– Twitter generates 7 TB per day (2010)
§ Fast key-value access
§ Flexible schema or data types
§ Schema migration
§ Write availability
– Writes need to succeed no matter what (CAP, partitioning)
§ Easier maintainability, administration and operations
§ No single point of failure
§ Generally available parallel computing
§ Programmer ease of use
§ Use the right data model for the right problem
§ Avoid hitting the wall
§ Distributed systems support
§ Tunable CAP tradeoffs
from http://highscalability.com/
Key-Value Stores
Table in relational db:
id    hair_color  age  height
1923  Red         18   6’0”
3371  Blue        34   NA
…     …           …    …
Store/Domain in Key-Value db
Find users whose age is above 18?
Find all attributes of user 1923?
Find users whose hair color is Red and age is 19?
(Join operation)
Calculate average age of all grad students?
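The questions above separate what a key-value store does well from what it does not. The following is a minimal sketch, assuming a plain in-memory Python dict stands in for the store; the names `store`, `get` and `scan` are hypothetical and are not the API of Voldemort or any real system. A lookup by key is one step, while any query on an attribute has to inspect every value.

```python
# Hypothetical in-memory key-value store: key = user id, value = attribute dict.
store = {
    1923: {"hair_color": "Red", "age": 18, "height": "6'0\""},
    3371: {"hair_color": "Blue", "age": 34},  # height is NA: no schema forces the field
}

def get(key):
    """Point lookup by key: the operation a K-V store is built for."""
    return store.get(key)

def scan(predicate):
    """Attribute query: without a secondary index, every value is inspected."""
    return sorted(k for k, v in store.items() if predicate(v))

# "Find all attributes of user 1923?" is a single lookup.
user_1923 = get(1923)

# "Find users whose age is above 18?" needs a full scan.
older_than_18 = scan(lambda v: v.get("age", 0) > 18)

# "Hair color is Red and age is 19?" is also a full scan.
red_and_19 = scan(lambda v: v.get("hair_color") == "Red" and v.get("age") == 19)

# An aggregate such as average age touches every record as well.
average_age = sum(v["age"] for v in store.values()) / len(store)
```

In a relational database the same attribute queries could use an index on `age` or `hair_color`; here the application has to do that work itself.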
Voldemort in LinkedIn
Sid Anand, LinkedIn Data Infrastructure (QCon London 2012)
Voldemort vs MySQL
Sid Anand, LinkedIn Data Infrastructure (QCon London 2012)
Column Families – BigTable like
F. Chang et al., Bigtable: A Distributed Storage System for Structured Data, OSDI 06
BigTable Data Model
The row name is a reversed URL. The contents column family contains the page contents, and the anchor column family contains the text of any anchors that reference the page.
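A rough sketch of this data model, assuming nested Python dicts stand in for the table. The helper names (`row_key`, `put`, `get_latest`) are hypothetical, and the per-cell timestamps are an assumption based on the Bigtable paper rather than on this slide.

```python
def row_key(hostname):
    """Reverse the hostname components so pages of one domain sort next to each other."""
    return ".".join(reversed(hostname.split(".")))

# table[row][column][timestamp] -> value, where column is "family:qualifier"
table = {}

def put(row, column, timestamp, value):
    table.setdefault(row, {}).setdefault(column, {})[timestamp] = value

def get_latest(row, column):
    """Return the most recent version of a cell, or None if the cell is empty."""
    versions = table.get(row, {}).get(column, {})
    return versions[max(versions)] if versions else None

row = row_key("www.cnn.com")
put(row, "contents:", 1, "<html>v1</html>")   # page contents, older version
put(row, "contents:", 2, "<html>v2</html>")   # page contents, newer version
put(row, "anchor:cnnsi.com", 1, "CNN")        # anchor text from a referring page
```

Each row can carry a different set of columns inside a family, which is what makes the table sparse.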
BigTable Performance
Document Database - mongoDB
Table in relational db
Documents in a collection
Initial release 2009
Open source, document db
JSON-like document with dynamic schema
mongoDB Product Deployment
And much more…
Graph Database
Data Model Abstraction:
• Nodes
• Relations
• Properties
Neo4j - Build a Graph
Slide from Neo Technology, “A NoSQL Overview and the Benefits of Graph Databases”
A Debatable Performance Evaluation
Conclusion
§ Use the right data model for the right problem
THE HADOOP ECOSYSTEM
Single vs Cluster
§ 4TB HDDs are coming out
§ Cluster?
– How many machines?
– Handle machine and drive failure
– Need redundancy, backup ...
How to analyze such large datasets?
First thing, how to store them?
Single machine? 4TB drive is out
Cluster of machines?
• How many machines?
• Need to worry about machine and drive failure. Really?
• Need data backup, redundancy, recovery, etc.
3% of 100,000 hard drives fail within first 3 months
Failure Trends in a Large Disk Drive Population
http://static.googleusercontent.com/external_content/untrusted_dlcp/research.google.com/en/us/archive/disk_failures.pdf
Hadoop
§ Open source software
– Reliable, scalable, distributed computing
§ Can handle thousands of machines
§ Written in Java
§ A simple programming model
§ HDFS (Hadoop Distributed File System)
– Fault tolerant (can recover from failures)
Open-source software for reliable, scalable, distributed computing
Written in Java
Scale to thousands of machines
• Linear scalability: ideally, doubling the number of machines halves the running time of your job
Uses simple programming model (MapReduce)
Fault tolerant (HDFS)
• Can recover from machine/disk failure (no need to restart computation)
http://hadoop.apache.org
Idea and Solution
§ Issue: Copying data over a network takes time
§ Idea:
– Bring computation close to the data
– Store files multiple times for reliability
§ Map-reduce addresses these problems
– Google’s computational/data manipulation model
– Elegant way to work with big data
– Storage Infrastructure – File system
• Google: GFS. Hadoop: HDFS
– Programming model
• Map-Reduce
Map-Reduce [Dean and Ghemawat 2004]
§ Abstraction for simple computing
– Hides details of parallelization, fault-tolerance, data-balancing
– MUST Read! http://static.googleusercontent.com/media/research.google.com/en/us/archive/mapreduce-osdi04.pdf
Hadoop VS NoSQL
§ Hadoop: computing framework
– Supports data-intensive applications
– Includes MapReduce, HDFS etc. (we will study MR mainly next)
§ NoSQL: Not only SQL databases
– Can be built ON Hadoop. E.g. HBase.
Storage Infrastructure
§ Problem:
– If nodes fail, how to store data persistently?
§ Answer:
– Distributed File System:
• Provides global file namespace
• Google GFS; Hadoop HDFS;
§ Typical usage pattern
– Huge files (100s of GB to TB)
– Data is rarely updated in place
– Reads and appends are common
Distributed File System
§ Chunk servers
– File is split into contiguous chunks
– Typically each chunk is 16-64MB
– Each chunk replicated (usually 2x or 3x)
– Try to keep replicas in different racks
§ Master node
– a.k.a. NameNode in Hadoop’s HDFS
– Stores metadata about where files are stored
– Might be replicated
§ Client library for file access
– Talks to master to find chunk servers
– Connects directly to chunk servers to access data
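The chunking and replica placement described above can be sketched in a few lines. This is an illustration under stated assumptions, not how GFS or HDFS actually choose locations: the round-robin placement policy, the rack and server names, and both function names are hypothetical.

```python
def split_into_chunks(file_size, chunk_size):
    """Return (offset, length) for each contiguous chunk of a file."""
    return [(off, min(chunk_size, file_size - off))
            for off in range(0, file_size, chunk_size)]

def place_replicas(chunk_id, servers_by_rack, replication=3):
    """Pick one server from each of `replication` different racks (round-robin)."""
    racks = sorted(servers_by_rack)
    chosen = []
    for i in range(min(replication, len(racks))):
        rack = racks[(chunk_id + i) % len(racks)]
        servers = servers_by_rack[rack]
        chosen.append((rack, servers[chunk_id % len(servers)]))
    return chosen

MB = 1024 * 1024
chunks = split_into_chunks(200 * MB, 64 * MB)   # a 200 MB file, 64 MB chunks
servers_by_rack = {"rack-a": ["a1", "a2"], "rack-b": ["b1", "b2"], "rack-c": ["c1"]}

# What the master node keeps as metadata: chunk id -> replica locations.
chunk_locations = {cid: place_replicas(cid, servers_by_rack)
                   for cid in range(len(chunks))}
```

A client would ask the master for `chunk_locations[cid]` and then read the bytes directly from one of the listed chunk servers.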
Programming Model: MapReduce
Warm-up task:
§ We have a huge text document
§ Count the number of times each distinct word appears in the file
§ Sample application:
– Analyze web server logs to find popular URLs
Task: Word Count
Case 1:
– File too large for memory, but all <word, count> pairs fit in memory
Case 2:
§ Count occurrences of words:
– words(doc.txt) | sort | uniq -c
• where words takes a file and outputs the words in it, one per line
§ Case 2 captures the essence of MapReduce
– Great thing is that it is naturally parallelizable
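Both cases can be sketched in Python. This is a minimal illustration, assuming a simple regular expression is good enough to split words; the function names are hypothetical. Case 1 streams the input and keeps only the counts in memory. Case 2 mirrors the `words | sort | uniq -c` pipeline: after sorting, equal words sit next to each other and each run can be counted on its own.

```python
import re
from collections import Counter
from itertools import groupby

def words(lines):
    """Yield the words of the input, one at a time (the `words` stage)."""
    for line in lines:
        for w in re.findall(r"[a-z']+", line.lower()):
            yield w

def count_in_memory(lines):
    """Case 1: stream the file, keep only the <word, count> pairs in memory."""
    return Counter(words(lines))

def count_by_sorting(lines):
    """Case 2: sort, then count each run of identical words (sort | uniq -c)."""
    return {w: sum(1 for _ in run) for w, run in groupby(sorted(words(lines)))}

doc = ["the crew of the space shuttle", "the space crew"]
case1 = count_in_memory(doc)
case2 = count_by_sorting(doc)
```

The in-memory `sorted` call here stands in for an external sort; on real data the sort is the step that gets distributed.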
MapReduce: Overview
§ Sequentially read a lot of data
§ Map:
– Extract something you care about
§ Group by key: Sort and Shuffle
§ Reduce:
– Aggregate, summarize, filter or transform
§ Write the result
Outline stays the same, Map and Reduce change to fit the problem
MapReduce: The Map Step
[Diagram: map is applied to each input key-value pair and produces intermediate key-value pairs]
MapReduce: The Reduce Step
[Diagram: intermediate key-value pairs → group by key → key-value groups → reduce → output key-value pairs]
More Specifically
§ Input: a set of key-value pairs
§ Programmer specifies two methods:
– Map(k, v) → <k’, v’>*
• Takes a key-value pair and outputs a set of key-value pairs
– E.g., key is the filename, value is a single line in the file
• There is one Map call for every (k, v) pair
– Reduce(k’, <v’>*) → <k’, v’’>*
• All values v’ with same key k’ are reduced together and processed in v’ order
• There is one Reduce function call per unique key k’
MapReduce: Word Counting
Big document:
The crew of the space shuttle Endeavor recently returned to Earth as ambassadors, harbingers of a new era of space exploration. Scientists at NASA are saying that the recent assembly of the Dextre bot is the first step in a long-term space-based man/machine partnership. "The work we're doing now -- the robotics we're doing -- is what we're going to need ……………………..
MAP (provided by the programmer): Reads input and produces a set of key-value pairs
(The, 1) (crew, 1) (of, 1) (the, 1) (space, 1) (shuttle, 1) (Endeavor, 1) (recently, 1) ….
Group by key: Collect all pairs with same key
(crew, 1) (crew, 1) (space, 1) (the, 1) (the, 1) (the, 1) (shuttle, 1) (recently, 1) …
Reduce (provided by the programmer): Collect all values belonging to the key and output
(crew, 2) (space, 1) (the, 3) (shuttle, 1) (recently, 1) …
Sequentially read the data. Only sequential reads.
Word Count Using MapReduce
map(key, value):
  // key: document name; value: text of the document
  for each word w in value:
    emit(w, 1)

reduce(key, values):
  // key: a word; values: an iterator over counts
  result = 0
  for each count v in values:
    result += v
  emit(key, result)
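The pseudocode above translates almost line for line into Python. The sketch below is a single-process stand-in, assuming whitespace splitting is an acceptable definition of a word; `run_mapreduce` is a hypothetical helper that plays the role of the framework (map, group by key, reduce) and is not a Hadoop API.

```python
from collections import defaultdict

def map_fn(key, value):
    # key: document name; value: text of the document
    for w in value.split():
        yield (w, 1)

def reduce_fn(key, values):
    # key: a word; values: an iterable of counts
    yield (key, sum(values))

def run_mapreduce(inputs, map_fn, reduce_fn):
    """Toy driver: apply map to every input pair, group by key, then reduce."""
    groups = defaultdict(list)
    for k, v in inputs:
        for k2, v2 in map_fn(k, v):
            groups[k2].append(v2)          # the group-by-key step
    output = []
    for k2 in sorted(groups):              # one reduce call per unique key
        output.extend(reduce_fn(k2, groups[k2]))
    return output

docs = [("doc1", "the crew of the space shuttle"), ("doc2", "the space crew")]
counts = dict(run_mapreduce(docs, map_fn, reduce_fn))
```

Only `map_fn` and `reduce_fn` are specific to word counting; the driver does not change from problem to problem.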
Map-Reduce (MR) as SQL
§ select count(*) from DOCUMENT group by word
Mapper
Reducer
Map-Reduce: Environment
Map-Reduce environment takes care of:
§ Partitioning the input data
§ Scheduling the program’s execution across a set of machines
§ Performing the group by key step
§ Handling machine failures
§ Managing required inter-machine communication
Map-Reduce: A diagram
Big document
MAP: Reads input and produces a set of key-value pairs
Group by key: Collect all pairs with same key (Hash merge, Shuffle, Sort, Partition)
Reduce: Collect all values belonging to the key and output
Map-Reduce: In Parallel
All phases are distributed with many tasks doing the work
Map-Reduce
§ Programmer specifies:
– Map and Reduce and input files
§ Workflow:
– Read inputs as a set of key-value-pairs
– Map transforms input kv-pairs into a new set of k'v'-pairs
– Sorts & Shuffles the k'v'-pairs to output nodes
– All k’v’-pairs with a given k’ are sent to the same reduce
– Reduce processes all k'v'-pairs grouped by key into new k''v''-pairs
– Write the resulting pairs to files
§ All phases are distributed with many tasks doing the work
[Diagram: Input 0, Input 1, Input 2 → Map 0, Map 1, Map 2 → Shuffle → Reduce 0, Reduce 1 → Out 0, Out 1]
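One common way to guarantee that all pairs with a given k’ reach the same reduce task is to hash the key modulo the number of reducers. The sketch below assumes that scheme; the use of `zlib.crc32` and the names `partition` and `shuffle` are illustrative choices, not the framework's actual partitioner. Python's built-in `hash` is avoided because it is randomized per process for strings.

```python
import zlib

def partition(key, num_reducers):
    """Deterministic reducer index for a key (crc32 is an illustrative choice)."""
    return zlib.crc32(key.encode("utf-8")) % num_reducers

def shuffle(map_outputs, num_reducers):
    """Route every (k', v') pair from every map task into one bucket per reducer."""
    buckets = [[] for _ in range(num_reducers)]
    for pairs in map_outputs:                 # one list of pairs per map task
        for k, v in pairs:
            buckets[partition(k, num_reducers)].append((k, v))
    return buckets

map_outputs = [
    [("the", 1), ("crew", 1), ("the", 1)],    # output of Map 0
    [("space", 1), ("the", 1)],               # output of Map 1
    [("crew", 1), ("shuttle", 1)],            # output of Map 2
]
buckets = shuffle(map_outputs, num_reducers=2)
```

Because the bucket depends only on the key, no two reducers ever see the same key, so each can aggregate independently.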
Data Flow
§ Input and final output are stored on a distributed file system (FS):
– Scheduler tries to schedule map tasks “close” to physical storage location of input data
§ Intermediate results are stored on local FS of Map and Reduce workers
§ Output is often input to another MapReduce task
Coordination: Master
§ Master node takes care of coordination:
– Task status: (idle, in-progress, completed)
– Idle tasks get scheduled as workers become available
– When a map task completes, it sends the master the location and sizes of its R intermediate files, one for each reducer
– Master pushes this info to reducers
§ Master pings workers periodically to detect failures
Dealing with Failures
§ Map worker failure
– Map tasks completed or in-progress at worker are reset to idle
– Reduce workers are notified when task is rescheduled on another worker
§ Reduce worker failure
– Only in-progress tasks are reset to idle
– Reduce task is restarted
§ Master failure
– MapReduce task is aborted and client is notified
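The worker-failure rules above amount to a small piece of bookkeeping on the master. The following sketch assumes the master tracks each task as a record with a kind, a status and an assigned worker; the `Task` class and `handle_worker_failure` are hypothetical names, not part of Hadoop.

```python
from dataclasses import dataclass
from typing import Optional

IDLE, IN_PROGRESS, COMPLETED = "idle", "in-progress", "completed"

@dataclass
class Task:
    kind: str                      # "map" or "reduce"
    status: str = IDLE
    worker: Optional[str] = None

def handle_worker_failure(tasks, failed_worker):
    """Reset the failed worker's tasks to idle; return the tasks to reschedule."""
    reset = []
    for t in tasks:
        if t.worker != failed_worker:
            continue
        # Completed map output sits on the failed worker's local disk, so it is lost.
        # Completed reduce output is already in the distributed file system.
        lost = t.status == IN_PROGRESS or (t.kind == "map" and t.status == COMPLETED)
        if lost:
            t.status, t.worker = IDLE, None
            reset.append(t)
    return reset

tasks = [
    Task("map", COMPLETED, "w1"),
    Task("map", IN_PROGRESS, "w1"),
    Task("reduce", COMPLETED, "w1"),
    Task("reduce", IN_PROGRESS, "w1"),
    Task("map", COMPLETED, "w2"),
]
reset = handle_worker_failure(tasks, "w1")
```

The asymmetry between map and reduce tasks follows from where their output lives: intermediate files on local disk versus final output on the distributed file system.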
PROBLEMS SUITED FOR MAP-REDUCE
Example: Host size
§ Suppose we have a large web corpus
§ Look at the metadata file
– Lines of the form: (URL, size, date, …)
§ For each host, find the total number of bytes
– That is, the sum of the page sizes for all URLs from that particular host
§ Other examples:
– Link analysis and graph processing
– Machine Learning algorithms
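For the host-size example, map emits (host, size) for each metadata record and reduce sums the sizes per host. This is a small sketch with made-up records; the URLs, sizes and dates are illustrative only, and the grouping loop stands in for the framework's shuffle.

```python
from collections import defaultdict
from urllib.parse import urlparse

def map_fn(record):
    """(URL, size, date) -> (host, size)"""
    url, size, _date = record
    yield (urlparse(url).netloc, size)

def reduce_fn(host, sizes):
    """Sum the page sizes of all URLs from one host."""
    yield (host, sum(sizes))

# Illustrative metadata lines of the form (URL, size, date).
records = [
    ("http://example.com/a.html", 1200, "2016-01-01"),
    ("http://example.com/b.html", 800, "2016-01-02"),
    ("http://example.org/index.html", 500, "2016-01-03"),
]

groups = defaultdict(list)                    # group by key (host)
for rec in records:
    for host, size in map_fn(rec):
        groups[host].append(size)

host_sizes = dict(kv for host in groups for kv in reduce_fn(host, groups[host]))
```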
Example: Language Model
§ Statistical machine translation:
– Need to count number of times every 5-word sequence occurs in a large corpus of documents
§ Very easy with MapReduce:
– Map:
• Extract (5-word sequence, count) from document
– Reduce:
• Combine the counts
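A sketch of the 5-word-sequence count, assuming whitespace tokenization and two tiny made-up documents. Map pre-aggregates counts within each document, which is why it emits (sequence, count) rather than (sequence, 1); reduce then adds the per-document counts together.

```python
from collections import Counter

N = 5  # length of the word sequences

def map_fn(document):
    """Emit (5-word sequence, count) pairs, counted within one document."""
    words = document.split()
    local = Counter(tuple(words[i:i + N]) for i in range(len(words) - N + 1))
    return local.items()

def reduce_fn(ngram, counts):
    """Combine the per-document counts for one sequence."""
    return (ngram, sum(counts))

docs = ["a b c d e f", "a b c d e"]   # illustrative corpus

grouped = {}                           # group by key (the 5-word sequence)
for doc in docs:
    for ngram, count in map_fn(doc):
        grouped.setdefault(ngram, []).append(count)

totals = dict(reduce_fn(g, c) for g, c in grouped.items())
```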
Degree of graph Example
§ Find degree of every node in a graph
Example: In a friendship graph, what is the number of friends of every person:
Node 6 = 1, Node 2 = 3, Node 4 = 3, Node 1 = 2, Node 3 = 2, Node 5 = 3
Degree of each node in a graph
§ Suppose you have the edge list == a table!
Schema? Edges(from, to)
from  to
6     4
4     6
4     3
3     4
4     5
5     4
...
Degree of each node in a graph
§ SQL for degree list?
SELECT from, count(*) FROM Edges GROUP BY from
Degree of each node in a graph
§ So in SQL:
SELECT from, count(*) FROM Edges GROUP BY from
§ MapReduce?
Mapper: emit(from, 1)
Reducer: emit(from, count())
I.e., essentially equivalent to the ‘word-count’ example :)
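The mapper and reducer can be tried on the rows of the edge list that appear on the slide. The sketch assumes the list "6 4 4 6 4 3 ..." reads as (from, to) pairs; since the rest of the list is elided, the result covers only those six rows and not the whole graph. Sorting followed by `groupby` stands in for the framework's group-by-key step.

```python
from itertools import groupby

# Only the rows visible on the slide, read as (from, to) pairs; the rest is elided.
edges = [(6, 4), (4, 6), (4, 3), (3, 4), (4, 5), (5, 4)]

def mapper(edge):
    """emit(from, 1)"""
    src, _dst = edge
    return (src, 1)

def reducer(node, ones):
    """emit(from, count())"""
    return (node, sum(ones))

pairs = sorted(mapper(e) for e in edges)     # sorting brings equal keys together
degrees = dict(
    reducer(node, (one for _, one in group))
    for node, group in groupby(pairs, key=lambda kv: kv[0])
)
```

Swap the node id for a word and this is the word-count job again.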
In HW5
§ You will have to find the degree distribution of a network.
Conclusions
§ Hadoop is a distributed data-intensive computing framework
§ MapReduce
– Simple programming paradigm
– Surprisingly powerful (may not be suitable for all tasks though)
§ Hadoop has specialized File System, Master-Slave Architecture to scale-up
NoSQL and Hadoop
§ Hot area with several new problems
– Good for academic research
– Good for industry
= Fun AND Profit :)
POINTERS AND FURTHER READING
Implementations
§ Google
– Not available outside Google
§ Hadoop
– An open-source implementation in Java
– Uses HDFS for stable storage
– Download: http://lucene.apache.org/hadoop/
§ Aster Data
– Cluster-optimized SQL Database that also implements MapReduce
Cloud Computing
§ Ability to rent computing by the hour
– Additional services e.g., persistent storage
§ Amazon’s “Elastic Compute Cloud” (EC2)
§ Aster Data and Hadoop can both be run on EC2
Reading
§ Jeffrey Dean and Sanjay Ghemawat: MapReduce: Simplified Data Processing on Large Clusters
– http://labs.google.com/papers/mapreduce.html
§ Sanjay Ghemawat, Howard Gobioff, and Shun-Tak Leung: The Google File System
– http://labs.google.com/papers/gfs.html
Resources
§ Hadoop Wiki
– Introduction
• http://wiki.apache.org/lucene-hadoop/
– Getting Started
• http://wiki.apache.org/lucene-hadoop/GettingStartedWithHadoop
– Map/Reduce Overview
• http://wiki.apache.org/lucene-hadoop/HadoopMapReduce
• http://wiki.apache.org/lucene-hadoop/HadoopMapRedClasses
– Eclipse Environment
• http://wiki.apache.org/lucene-hadoop/EclipseEnvironment
§ Javadoc
– http://lucene.apache.org/hadoop/docs/api/
Resources
§ Releases from Apache download mirrors
– http://www.apache.org/dyn/closer.cgi/lucene/hadoop/
§ Nightly builds of source
– http://people.apache.org/dist/lucene/hadoop/nightly/
§ Source code from subversion
– http://lucene.apache.org/hadoop/version_control.html
Further Reading
§ Programming model inspired by functional language primitives
§ Partitioning/shuffling similar to many large-scale sorting systems
– NOW-Sort ['97]
§ Re-execution for fault tolerance
– BAD-FS ['04] and TACC ['97]
§ Locality optimization has parallels with Active Disks/Diamond work
– Active Disks ['01], Diamond ['04]
§ Backup tasks similar to Eager Scheduling in Charlotte system
– Charlotte ['96]
§ Dynamic load balancing solves similar problem as River's distributed queues
– River ['99]