Flink London meetup 3 March 2016 - Flink basics
TRANSCRIPT
![Page 1: Flink London meetup 3 March 2016 - Flink basics](https://reader030.vdocuments.net/reader030/viewer/2022021422/58f06a281a28abcb7e8b459b/html5/thumbnails/1.jpg)
Motivation
![Page 2: Flink London meetup 3 March 2016 - Flink basics](https://reader030.vdocuments.net/reader030/viewer/2022021422/58f06a281a28abcb7e8b459b/html5/thumbnails/2.jpg)
The Evolution of Massive-Scale Data Processing, Tyler Akidau, Staff Software Engineer @ Google, https://goo.gl/5k0xaL
![Page 3: Flink London meetup 3 March 2016 - Flink basics](https://reader030.vdocuments.net/reader030/viewer/2022021422/58f06a281a28abcb7e8b459b/html5/thumbnails/3.jpg)
The Evolution of Massive-Scale Data Processing, Tyler Akidau, Staff Software Engineer @ Google, https://goo.gl/5k0xaL
![Page 4: Flink London meetup 3 March 2016 - Flink basics](https://reader030.vdocuments.net/reader030/viewer/2022021422/58f06a281a28abcb7e8b459b/html5/thumbnails/4.jpg)
We’re not the newest!
![Page 5: Flink London meetup 3 March 2016 - Flink basics](https://reader030.vdocuments.net/reader030/viewer/2022021422/58f06a281a28abcb7e8b459b/html5/thumbnails/5.jpg)
APACHE FLINK LONDON MEETUP
3rd March 2016 | Bonhill House, London
![Page 6: Flink London meetup 3 March 2016 - Flink basics](https://reader030.vdocuments.net/reader030/viewer/2022021422/58f06a281a28abcb7e8b459b/html5/thumbnails/6.jpg)
What we’ll cover today
- Hand-wavey bit
- Practical bit
- Textbook bit
![Page 7: Flink London meetup 3 March 2016 - Flink basics](https://reader030.vdocuments.net/reader030/viewer/2022021422/58f06a281a28abcb7e8b459b/html5/thumbnails/7.jpg)
Part 1: The hand-wavey bit
- Aim:
  - Make sure we all have the same basic understanding of what Flink is
  - Introduce key concepts
    - Not exhaustive
    - Not explaining much!
![Page 8: Flink London meetup 3 March 2016 - Flink basics](https://reader030.vdocuments.net/reader030/viewer/2022021422/58f06a281a28abcb7e8b459b/html5/thumbnails/8.jpg)
WTF is Flink?
![Page 9: Flink London meetup 3 March 2016 - Flink basics](https://reader030.vdocuments.net/reader030/viewer/2022021422/58f06a281a28abcb7e8b459b/html5/thumbnails/9.jpg)
Flink basics…
- Apache Foundation top level open source project…
- …for distributed data processing…
- …with a “streaming first” architecture…
- …running on the JVM.
Or: A ‘free’ way to process a lot of data (especially streaming data) on ‘commodity’ hardware, with a codebase that is continually improving. Useful for reporting, analytics, log processing, machine learning, etc.
![Page 10: Flink London meetup 3 March 2016 - Flink basics](https://reader030.vdocuments.net/reader030/viewer/2022021422/58f06a281a28abcb7e8b459b/html5/thumbnails/10.jpg)
Some key terms
- DataStream: A possibly unbounded immutable collection of data items of the same type
- DataSet: An abstract representation of a finite immutable collection of data of the same type that may contain duplicates
- Source: Can be file-based, socket-based, collection-based, or custom (e.g. Kafka)
- Sink: Consumes DataSets/DataStreams and forwards them to files, sockets, external systems, or prints them
- Operator: Represents an operation (or a data processing step) in the ‘JobGraph’ – includes properties like the actual code and desired parallelism.
![Page 11: Flink London meetup 3 March 2016 - Flink basics](https://reader030.vdocuments.net/reader030/viewer/2022021422/58f06a281a28abcb7e8b459b/html5/thumbnails/11.jpg)
Application architecture
![Page 12: Flink London meetup 3 March 2016 - Flink basics](https://reader030.vdocuments.net/reader030/viewer/2022021422/58f06a281a28abcb7e8b459b/html5/thumbnails/12.jpg)
Flink ‘skeleton’ program structure

DataStream
1. Obtain a StreamExecutionEnvironment
2. Connect to data stream sources
3. Specify transformations on the data streams
4. Specify output for the processed data
5. Execute the program [env.execute()]

DataSet
1. Obtain an ExecutionEnvironment
2. Load/create the initial data
3. Specify transformations on the data
4. Specify where to put results
5. Execute the program [env.execute(), print(), collect()]
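The five skeleton steps can be illustrated without Flink itself. The following is a minimal plain-Python sketch, not Flink's API: `MiniEnvironment`, `MiniStream`, `from_collection`, `add_sink` and `run_skeleton` are hypothetical names invented for this example. It only aims to show that steps 2 to 4 describe the job and nothing runs until step 5.

```python
from typing import Callable, Iterable, Iterator, List


class MiniStream:
    """Toy stand-in for a stream of items (hypothetical, not a Flink class)."""

    def __init__(self, env: "MiniEnvironment", make_iter: Callable[[], Iterator]):
        self._env = env
        self._make_iter = make_iter

    def map(self, fn: Callable) -> "MiniStream":
        # Step 3: a transformation only describes work; it does not run it.
        return MiniStream(self._env, lambda: (fn(x) for x in self._make_iter()))

    def filter(self, pred: Callable) -> "MiniStream":
        return MiniStream(self._env, lambda: (x for x in self._make_iter() if pred(x)))

    def add_sink(self, sink: Callable) -> None:
        # Step 4: register where the processed items should go.
        def run() -> None:
            for x in self._make_iter():
                sink(x)

        self._env._pipelines.append(run)


class MiniEnvironment:
    """Toy stand-in for an execution environment (hypothetical)."""

    def __init__(self) -> None:
        self._pipelines: List[Callable[[], None]] = []

    def from_collection(self, items: Iterable) -> MiniStream:
        # Step 2: connect to a collection-based source.
        return MiniStream(self, lambda: iter(items))

    def execute(self) -> None:
        # Step 5: only now do the registered pipelines actually run.
        for run in self._pipelines:
            run()


def run_skeleton() -> List[str]:
    env = MiniEnvironment()                                   # step 1
    results: List[str] = []
    words = env.from_collection(["flink", "is", "streaming", "first"])
    words.map(str.upper).filter(lambda w: len(w) > 2).add_sink(results.append)
    assert results == []                                      # nothing has run yet
    env.execute()
    return results
```

Calling `run_skeleton()` builds the job, checks that the sink is still empty, and only then executes it.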
![Page 13: Flink London meetup 3 March 2016 - Flink basics](https://reader030.vdocuments.net/reader030/viewer/2022021422/58f06a281a28abcb7e8b459b/html5/thumbnails/13.jpg)
Key Flink features
(in future meetups guest speakers will give us the juicy details…)
![Page 14: Flink London meetup 3 March 2016 - Flink basics](https://reader030.vdocuments.net/reader030/viewer/2022021422/58f06a281a28abcb7e8b459b/html5/thumbnails/14.jpg)
High Performance
Support for out-of-order events
Low latency
Exactly-once semantics
Flexible streaming windows
One runtime for stream & batch / ecosystem
Back pressure
Delta iterate operators
![Page 15: Flink London meetup 3 March 2016 - Flink basics](https://reader030.vdocuments.net/reader030/viewer/2022021422/58f06a281a28abcb7e8b459b/html5/thumbnails/15.jpg)
One runtime for stream & batch / ecosystem
Delta iterate operators
High Performance
Support for out-of-order events
Low latency
Exactly-once semantics
Flexible streaming windows
Back pressure
According to the Apache Flink site (http://flink.apache.org/)
![Page 16: Flink London meetup 3 March 2016 - Flink basics](https://reader030.vdocuments.net/reader030/viewer/2022021422/58f06a281a28abcb7e8b459b/html5/thumbnails/16.jpg)
High performance / Low latency
High throughput
Low latency
![Page 17: Flink London meetup 3 March 2016 - Flink basics](https://reader030.vdocuments.net/reader030/viewer/2022021422/58f06a281a28abcb7e8b459b/html5/thumbnails/17.jpg)
Flow control and back pressure
- Back pressure bottleneck: ‘pressure’ building up because data is arriving faster than it can be processed.
  - Temporary process slow-down (e.g. GC on JVM)
  - Temporary traffic spike
- “Flink achieves the maximum throughput allowed by the slowest part of the pipeline”
  - Not a configurable ‘feature’
  - Inherent in architecture (buffer-based)
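To make the buffer-based idea concrete, here is a small plain-Python sketch of two stages joined by a bounded buffer. It is an analogy rather than Flink's network stack: `run_pipeline`, `capacity` and `slow_delay` are names assumed for this example. When the slow stage falls behind, the buffer fills and the fast stage blocks instead of dropping data.

```python
import queue
import threading
import time
from typing import List, Tuple


def run_pipeline(n_items: int, capacity: int, slow_delay: float = 0.001) -> Tuple[List[int], int]:
    """Run a fast producer and a slow consumer joined by a bounded buffer."""
    buffer: queue.Queue = queue.Queue(maxsize=capacity)
    received: List[int] = []
    max_depth = 0
    done = object()  # sentinel marking the end of the stream

    def producer() -> None:
        for i in range(n_items):
            buffer.put(i)  # blocks while the buffer is full: this is the back pressure
        buffer.put(done)

    def consumer() -> None:
        nonlocal max_depth
        while True:
            max_depth = max(max_depth, buffer.qsize())  # deepest backlog observed
            item = buffer.get()
            if item is done:
                break
            time.sleep(slow_delay)  # the slow part of the pipeline
            received.append(item)

    threads = [threading.Thread(target=producer), threading.Thread(target=consumer)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return received, max_depth
```

Whatever the speed difference, every item arrives, in order, and the backlog never grows beyond `capacity`; the producer simply runs at the consumer's pace.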
![Page 18: Flink London meetup 3 March 2016 - Flink basics](https://reader030.vdocuments.net/reader030/viewer/2022021422/58f06a281a28abcb7e8b459b/html5/thumbnails/18.jpg)
Exactly-once semantics for state
- In the event of failure “Pick up where you left off”.
  - Means you need to remember where you left off (data and state)
- 3 levels of risk appetite:
  - L1 – Accept misses (“At most once”)
  - L2 – Accept duplicates (“At least once”)
  - L3 – Don’t accept either (“Exactly once”)
- Checkpointing / snapshots
  - Dependent on stream source – e.g. Kafka
  - Orchestration is tricky (see next slide)
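The three levels can be compared with a toy simulation. This is a sketch under simplifying assumptions, not Flink's checkpointing algorithm: a single operator sums integers from a replayable source (a list standing in for something like a Kafka topic), a checkpoint stores the source offset and the running total together, and the job crashes once. `run_with_failure` and the `mode` strings are hypothetical names.

```python
from typing import List


def run_with_failure(events: List[int], checkpoint_every: int, fail_at: int, mode: str) -> int:
    """Sum events, crashing once just before index fail_at, then recover per mode."""
    total = 0             # operator state
    offset = 0            # read position in the replayable source
    checkpoint = (0, 0)   # (offset, total) saved together
    failed = False

    while offset < len(events):
        if offset == fail_at and not failed:
            failed = True
            saved_offset, saved_total = checkpoint
            if mode == "at_most_once":
                total = saved_total          # state rolls back, source does not rewind
            elif mode == "at_least_once":
                offset = saved_offset        # source rewinds, state does not roll back
            elif mode == "exactly_once":
                offset, total = saved_offset, saved_total  # both restored together
            else:
                raise ValueError(f"unknown mode: {mode}")
            continue
        total += events[offset]
        offset += 1
        if offset % checkpoint_every == 0:
            checkpoint = (offset, total)
    return total
```

With the events 1 to 10 (true sum 55), a checkpoint every 4 events and a crash before index 6, the at-most-once run misses 5 and 6, the at-least-once run counts them twice, and only the run that restores offset and state together returns 55.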
![Page 19: Flink London meetup 3 March 2016 - Flink basics](https://reader030.vdocuments.net/reader030/viewer/2022021422/58f06a281a28abcb7e8b459b/html5/thumbnails/19.jpg)
Checkpointing orchestration
![Page 20: Flink London meetup 3 March 2016 - Flink basics](https://reader030.vdocuments.net/reader030/viewer/2022021422/58f06a281a28abcb7e8b459b/html5/thumbnails/20.jpg)
Support for Out-of-Order events
- Real life: messages will be delayed
- Every event is time-stamped
- It’s harder than it sounds (‘kinds’ of time, windows, watermarks, etc.)
t1 t2 t3 t5 t6 t7 t4 t8
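The arrival order on the slide (t4 turning up after t7) can be replayed through a simple reorder buffer. This is a plain-Python sketch of the watermark idea, not Flink's implementation: `reorder` and `max_delay` are assumed names, and the watermark here is simply the highest timestamp seen minus the allowed delay. Events older than the watermark are released in timestamp order; an event that arrives already behind the watermark is set aside as late.

```python
import heapq
from typing import Iterable, List, Optional, Tuple


def reorder(timestamps: Iterable[int], max_delay: int) -> Tuple[List[int], List[int]]:
    """Return (events emitted in timestamp order, events that arrived too late)."""
    pending: List[int] = []   # min-heap of buffered timestamps
    emitted: List[int] = []
    late: List[int] = []
    max_seen: Optional[int] = None

    for ts in timestamps:
        if max_seen is not None and ts < max_seen - max_delay:
            late.append(ts)   # behind the watermark: too late to be put in order
            continue
        heapq.heappush(pending, ts)
        max_seen = ts if max_seen is None else max(max_seen, ts)
        watermark = max_seen - max_delay
        # Everything strictly older than the watermark is considered complete.
        while pending and pending[0] < watermark:
            emitted.append(heapq.heappop(pending))

    while pending:            # end of input: flush whatever is still buffered
        emitted.append(heapq.heappop(pending))
    return emitted, late
```

With `max_delay=3` the sequence 1, 2, 3, 5, 6, 7, 4, 8 comes out fully ordered; with `max_delay=1` the event at time 4 arrives too far behind and ends up in the late list. The trade-off is the usual one: a larger delay tolerates more disorder but holds results back longer.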
![Page 21: Flink London meetup 3 March 2016 - Flink basics](https://reader030.vdocuments.net/reader030/viewer/2022021422/58f06a281a28abcb7e8b459b/html5/thumbnails/21.jpg)
Highly flexible streaming windows
A window defines the start and end of the slice of the data stream that is being processed.
- Different ways to define the window, including:
  - Time (from 9:00:00 to 9:00:04)
  - Count (from item 12 to item 18)
  - Session (from first ‘keyed event’ until we don’t see the same key for X time – analogous to a cookie session)
  - More complex logic driven by the data, and more complex windows depending on what is needed
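The first three window kinds are easy to sketch in plain Python. The functions below are illustrative only and are not Flink's window API: `tumbling_time_windows`, `count_windows` and `session_windows` are hypothetical names, timestamps are plain integers (seconds, say), and the session example handles the events of a single key.

```python
from collections import defaultdict
from typing import Dict, List, Sequence


def tumbling_time_windows(timestamps: Sequence[int], size: int) -> Dict[int, List[int]]:
    """Group timestamps into fixed, non-overlapping windows keyed by window start."""
    windows: Dict[int, List[int]] = defaultdict(list)
    for ts in timestamps:
        start = ts - ts % size   # e.g. size 5: timestamps 0..4 fall in window 0
        windows[start].append(ts)
    return dict(windows)


def count_windows(items: Sequence, size: int) -> List[list]:
    """Group items into consecutive windows of `size` items (the last may be short)."""
    return [list(items[i:i + size]) for i in range(0, len(items), size)]


def session_windows(timestamps: Sequence[int], gap: int) -> List[List[int]]:
    """Start a new session whenever more than `gap` passes with no event for the key."""
    sessions: List[List[int]] = []
    for ts in sorted(timestamps):
        if sessions and ts - sessions[-1][-1] <= gap:
            sessions[-1].append(ts)
        else:
            sessions.append([ts])
    return sessions
```

For example, `session_windows([1, 2, 3, 10, 11, 30], gap=5)` splits the events into three sessions, because the silences between 3 and 10 and between 11 and 30 are both longer than the gap.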
![Page 22: Flink London meetup 3 March 2016 - Flink basics](https://reader030.vdocuments.net/reader030/viewer/2022021422/58f06a281a28abcb7e8b459b/html5/thumbnails/22.jpg)
(Delta) iterate operators
Iterate operator
Delta iterate operator
Work on ‘hot’, don’t touch ‘cold’
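The ‘hot’ versus ‘cold’ distinction can be seen in a small example. The sketch below labels the connected components of a graph twice in plain Python: once re-examining every vertex each round (bulk iteration), and once keeping a workset of only the vertices whose label just changed (delta iteration). It is an illustration of the idea, not Flink's iterate operators; `bulk_iterate`, `delta_iterate` and the `examined` counter are names assumed for this example, and the delta version updates labels in place for brevity.

```python
from typing import Dict, List, Set, Tuple

Edge = Tuple[int, int]


def build_neighbours(vertices: List[int], edges: List[Edge]) -> Dict[int, Set[int]]:
    nbrs: Dict[int, Set[int]] = {v: set() for v in vertices}
    for a, b in edges:
        nbrs[a].add(b)
        nbrs[b].add(a)
    return nbrs


def bulk_iterate(vertices: List[int], edges: List[Edge]) -> Tuple[Dict[int, int], int]:
    """Every round recomputes every vertex, hot or cold."""
    nbrs = build_neighbours(vertices, edges)
    labels = {v: v for v in vertices}
    examined = 0
    changed = True
    while changed:
        changed = False
        new_labels = dict(labels)
        for v in vertices:
            examined += 1
            best = min([labels[v]] + [labels[n] for n in nbrs[v]])
            if best < labels[v]:
                new_labels[v] = best
                changed = True
        labels = new_labels
    return labels, examined


def delta_iterate(vertices: List[int], edges: List[Edge]) -> Tuple[Dict[int, int], int]:
    """Each round only looks at the workset: vertices whose label just changed."""
    nbrs = build_neighbours(vertices, edges)
    labels = {v: v for v in vertices}   # the solution set
    workset = set(vertices)             # everything is 'hot' at the start
    examined = 0
    while workset:
        next_workset: Set[int] = set()
        for v in sorted(workset):
            examined += 1
            for n in nbrs[v]:
                if labels[v] < labels[n]:
                    labels[n] = labels[v]   # push the smaller label to the neighbour
                    next_workset.add(n)     # the neighbour is now 'hot'
        workset = next_workset
    return labels, examined
```

Both versions label each vertex with the smallest vertex id in its component. On a chain 1-2-3-4 plus a separate pair 5-6, they agree on the result, but the delta version examines fewer vertices because settled parts of the graph drop out of the workset.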
![Page 23: Flink London meetup 3 March 2016 - Flink basics](https://reader030.vdocuments.net/reader030/viewer/2022021422/58f06a281a28abcb7e8b459b/html5/thumbnails/23.jpg)
(Delta) iterate operators
![Page 24: Flink London meetup 3 March 2016 - Flink basics](https://reader030.vdocuments.net/reader030/viewer/2022021422/58f06a281a28abcb7e8b459b/html5/thumbnails/24.jpg)
One runtime / library ecosystem
NB:
- Libraries in beta
- APIs in Java, Scala, [Python]
- FlinkCEP too?