![Page 1: Chris Boos, arago AG: Big Data means new programs](https://reader033.vdocuments.net/reader033/viewer/2022042813/5480687eb379596a2b8b5ae0/html5/thumbnails/1.jpg)
ANALYZING BIG DATA IS PROGRAMMING FOR THE CLOUD
Chris Boos (@boosc)
CloudCamp Frankfurt 24.5.2012
Thursday, 24 May 2012
![Page 2: Chris Boos, arago AG: Big Data means new programs](https://reader033.vdocuments.net/reader033/viewer/2022042813/5480687eb379596a2b8b5ae0/html5/thumbnails/2.jpg)
Data, lots of it
![Page 3: Chris Boos, arago AG: Big Data means new programs](https://reader033.vdocuments.net/reader033/viewer/2022042813/5480687eb379596a2b8b5ae0/html5/thumbnails/3.jpg)
Even in simple datasets, common statistics (avg, min, max, distribution) fail
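A minimal sketch of the avg/min/max part of that claim, using two made-up samples (not data from the talk): both share the same average, minimum and maximum, yet one is spread evenly and the other sits in two clusters. The names `evenly_spread`, `two_clusters` and `summary` are illustrative assumptions.

```python
from statistics import mean

# Two made-up samples: one spread evenly, one split into two clusters.
evenly_spread = [0, 2, 4, 5, 6, 8, 10]
two_clusters = [0, 0, 0, 5, 10, 10, 10]


def summary(values):
    """Return the (average, minimum, maximum) triple for a sample."""
    return (mean(values), min(values), max(values))


# Both summaries are identical although the samples clearly differ.
same_summary = summary(evenly_spread) == summary(two_clusters)
```

Here `same_summary` ends up true, so a dashboard showing only these three numbers could not tell the two samples apart.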
![Page 4: Chris Boos, arago AG: Big Data means new programs](https://reader033.vdocuments.net/reader033/viewer/2022042813/5480687eb379596a2b8b5ae0/html5/thumbnails/4.jpg)
One iPhone has 79 times more CPU power than was used in the Apollo missions
![Page 5: Chris Boos, arago AG: Big Data means new programs](https://reader033.vdocuments.net/reader033/viewer/2022042813/5480687eb379596a2b8b5ae0/html5/thumbnails/5.jpg)
Why you need big data
• 1950s-1960s: Data Processing (yield: Data)
• 1970s-1980s: Information Management (yield: Information)
• 1990s: Knowledge Management (yield: Knowledge)
• 2000s: Knowledge Ecology (yield: Intelligence)
• 2010s: Systems Thinking (yield: Wisdom), you are here!
![Page 6: Chris Boos, arago AG: Big Data means new programs](https://reader033.vdocuments.net/reader033/viewer/2022042813/5480687eb379596a2b8b5ae0/html5/thumbnails/6.jpg)
Finding clusters, evaluating outliers and interpreting white noise
![Page 7: Chris Boos, arago AG: Big Data means new programs](https://reader033.vdocuments.net/reader033/viewer/2022042813/5480687eb379596a2b8b5ae0/html5/thumbnails/7.jpg)
You are not looking for patterns, you are looking for anomalies
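One possible way to look for anomalies rather than patterns is a robust outlier test. The sketch below uses the modified z-score based on the median absolute deviation; this technique, the threshold of 3.5, the function name `find_anomalies` and the sample `readings` are assumptions chosen for illustration, not something the talk prescribes.

```python
from statistics import median


def find_anomalies(values, threshold=3.5):
    """Return the values whose modified z-score exceeds the threshold."""
    med = median(values)
    # Median absolute deviation: a spread measure that outliers barely move.
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        return []  # no spread at all, so nothing can be ranked as unusual
    return [v for v in values if 0.6745 * abs(v - med) / mad > threshold]


# Made-up readings with one value that does not fit the rest.
readings = [10, 11, 9, 10, 12, 10, 11, 95]
anomalies = find_anomalies(readings)
```

The median is used instead of the average because a single extreme value would otherwise pull the baseline towards itself and hide its own oddness.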
![Page 8: Chris Boos, arago AG: Big Data means new programs](https://reader033.vdocuments.net/reader033/viewer/2022042813/5480687eb379596a2b8b5ae0/html5/thumbnails/8.jpg)
Cloud Computing 1.0 Is
When the IT guys are finally able to explain to business
people what they were talking about 20 years ago!
![Page 9: Chris Boos, arago AG: Big Data means new programs](https://reader033.vdocuments.net/reader033/viewer/2022042813/5480687eb379596a2b8b5ae0/html5/thumbnails/9.jpg)
=
![Page 10: Chris Boos, arago AG: Big Data means new programs](https://reader033.vdocuments.net/reader033/viewer/2022042813/5480687eb379596a2b8b5ae0/html5/thumbnails/10.jpg)
Computation on demand
+ Pay as you go
![Page 11: Chris Boos, arago AG: Big Data means new programs](https://reader033.vdocuments.net/reader033/viewer/2022042813/5480687eb379596a2b8b5ae0/html5/thumbnails/11.jpg)
Cloud Computing 2.0 Is
When the IT guys realize that using this scalable
resource also calls for new ways of programming
![Page 12: Chris Boos, arago AG: Big Data means new programs](https://reader033.vdocuments.net/reader033/viewer/2022042813/5480687eb379596a2b8b5ae0/html5/thumbnails/12.jpg)
=
![Page 13: Chris Boos, arago AG: Big Data means new programs](https://reader033.vdocuments.net/reader033/viewer/2022042813/5480687eb379596a2b8b5ae0/html5/thumbnails/13.jpg)
Go beyond IaaS and start
thinking parallel
![Page 14: Chris Boos, arago AG: Big Data means new programs](https://reader033.vdocuments.net/reader033/viewer/2022042813/5480687eb379596a2b8b5ae0/html5/thumbnails/14.jpg)
and
![Page 15: Chris Boos, arago AG: Big Data means new programs](https://reader033.vdocuments.net/reader033/viewer/2022042813/5480687eb379596a2b8b5ae0/html5/thumbnails/15.jpg)
BASE (Basically Available, Soft state, Eventual consistency)
not
ACID (Atomicity, Consistency, Isolation, Durability)
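A toy illustration of what eventual consistency can look like in code, assuming two in-memory replicas and a simple last-writer-wins rule. The `Replica` class, the `sync` function and the version numbers are hypothetical and do not model any particular datastore.

```python
class Replica:
    """Hypothetical in-memory replica storing key -> (version, value)."""

    def __init__(self):
        self.data = {}

    def write(self, key, value, version):
        # Keep the entry only if it is newer than what we already hold.
        current = self.data.get(key)
        if current is None or version > current[0]:
            self.data[key] = (version, value)

    def read(self, key):
        entry = self.data.get(key)
        return None if entry is None else entry[1]


def sync(a, b):
    """Exchange entries so both replicas end up with the newest versions."""
    for key, (version, value) in list(a.data.items()):
        b.write(key, value, version)
    for key, (version, value) in list(b.data.items()):
        a.write(key, value, version)


r1, r2 = Replica(), Replica()
r1.write("likes", 41, version=1)  # accepted immediately by one replica
stale = r2.read("likes")          # the other replica has not seen it yet
sync(r1, r2)                      # background synchronisation
fresh = r2.read("likes")          # both replicas now agree
```

The write is available right away, the second replica briefly returns stale state, and the two converge once they synchronise; an ACID system would instead block or fail until both agree.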
![Page 16: Chris Boos, arago AG: Big Data means new programs](https://reader033.vdocuments.net/reader033/viewer/2022042813/5480687eb379596a2b8b5ae0/html5/thumbnails/16.jpg)
How to scale (AWS Example)
• Do not allocate instances manually
• Each component needs to be independent
• Plan for failure
• Actively provoke failure
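The last two points can be sketched without any cloud API at all: a component that fails on purpose at a configurable rate, and a caller that expects this and retries. `FlakyService`, `call_with_retry` and the failure rates are hypothetical names and values chosen for illustration; nothing here is AWS-specific.

```python
import random


class FlakyService:
    """Hypothetical component that injects failures at a configurable rate."""

    def __init__(self, failure_rate, seed=0):
        self.failure_rate = failure_rate
        self.rng = random.Random(seed)
        self.calls = 0

    def handle(self, request):
        self.calls += 1
        # Provoke failure deliberately so callers are forced to cope with it.
        if self.rng.random() < self.failure_rate:
            raise ConnectionError("injected failure")
        return f"ok:{request}"


def call_with_retry(service, request, attempts=5):
    """Plan for failure: retry a few times, then give up with the last error."""
    last_error = None
    for _ in range(attempts):
        try:
            return service.handle(request)
        except ConnectionError as error:
            last_error = error
    raise last_error


healthy_result = call_with_retry(FlakyService(failure_rate=0.0), "job-1")
```

Running the same caller against a service with a higher `failure_rate` in a test environment is one inexpensive way to check that the retry path is exercised before production does it for you.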
![Page 17: Chris Boos, arago AG: Big Data means new programs](https://reader033.vdocuments.net/reader033/viewer/2022042813/5480687eb379596a2b8b5ae0/html5/thumbnails/17.jpg)
Human Software
• Click workers and Mechanical Turks are not just cheap labour
• They let programmers hand tasks they cannot handle algorithmically to humans
• Make use of it to
  • Do things too complicated for machine learning
  • Pre-populate machine learning spaces
![Page 18: Chris Boos, arago AG: Big Data means new programs](https://reader033.vdocuments.net/reader033/viewer/2022042813/5480687eb379596a2b8b5ae0/html5/thumbnails/18.jpg)
Old Style (Imperative) Programming
• Step-by-step explanation of what to do
• Explaining WHAT to do rather than the RESULTS you want
• Always necessary for basic algorithms
![Page 19: Chris Boos, arago AG: Big Data means new programs](https://reader033.vdocuments.net/reader033/viewer/2022042813/5480687eb379596a2b8b5ae0/html5/thumbnails/19.jpg)
One New Style (Functional) Programming I
• Combine results to become a program
• Allows dynamic distribution
• Map-Reduce is only one way of doing it!
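As a rough sketch of combining results into a program, here is a word count written in map-reduce style. The `chunks` data and the helper names `count_words` and `merge` are made up for illustration, and a local thread pool stands in for real distribution across machines.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
from functools import reduce


def count_words(chunk):
    """Map step: count the words of one chunk, independent of all others."""
    return Counter(chunk.split())


def merge(left, right):
    """Reduce step: combine two partial counts into one."""
    return left + right


# Made-up input, already split into independent chunks.
chunks = ["big data big", "data cloud", "cloud cloud big"]

# The map step has no shared state, so the pool may run chunks in any order.
with ThreadPoolExecutor() as pool:
    partial_counts = list(pool.map(count_words, chunks))

totals = reduce(merge, partial_counts, Counter())
```

Because `count_words` depends only on its own chunk, a scheduler could move each call to a different worker without changing `totals`, which is the property that allows dynamic distribution.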
![Page 20: Chris Boos, arago AG: Big Data means new programs](https://reader033.vdocuments.net/reader033/viewer/2022042813/5480687eb379596a2b8b5ae0/html5/thumbnails/20.jpg)
Functional Programming II
F(G(H(A, B), C), D)
getMusicLikes(getFriends(facebookID))
instead of
for i in getFriends(facebookID): getMusicLikes(i)
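A runnable sketch of the two forms above, assuming a tiny in-memory social graph. `FRIENDS`, `LIKES`, `get_friends` and `get_music_likes` are hypothetical stand-ins for the slide's `getFriends` and `getMusicLikes`, not a real Facebook API.

```python
# Made-up in-memory data standing in for a social graph.
FRIENDS = {"alice": ["bob", "carol"]}
LIKES = {"bob": ["jazz"], "carol": ["rock", "jazz"]}


def get_friends(facebook_id):
    """Return the friend IDs of one user."""
    return FRIENDS.get(facebook_id, [])


def get_music_likes(user_ids):
    """Take a whole collection of users, so the work can be split freely."""
    return {user: LIKES.get(user, []) for user in user_ids}


# Composed form: the result of one function feeds directly into the next.
composed = get_music_likes(get_friends("alice"))

# Loop form: the caller fixes the order in which friends are processed.
looped = {}
for friend in get_friends("alice"):
    looped[friend] = LIKES.get(friend, [])
```

Both forms produce the same result here; the difference is that the composed form leaves it to `get_music_likes` (or a runtime behind it) to decide how and where the per-user lookups run.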
![Page 21: Chris Boos, arago AG: Big Data means new programs](https://reader033.vdocuments.net/reader033/viewer/2022042813/5480687eb379596a2b8b5ae0/html5/thumbnails/21.jpg)
Check out my tool list: http://www.hcboos.net/100-links/
![Page 22: Chris Boos, arago AG: Big Data means new programs](https://reader033.vdocuments.net/reader033/viewer/2022042813/5480687eb379596a2b8b5ae0/html5/thumbnails/22.jpg)
2 Examples
![Page 23: Chris Boos, arago AG: Big Data means new programs](https://reader033.vdocuments.net/reader033/viewer/2022042813/5480687eb379596a2b8b5ae0/html5/thumbnails/23.jpg)
The AMP3 Platform at Senzari.com
Adaptable Music Parallel Processing Platform
![Page 24: Chris Boos, arago AG: Big Data means new programs](https://reader033.vdocuments.net/reader033/viewer/2022042813/5480687eb379596a2b8b5ae0/html5/thumbnails/24.jpg)
MARS-o-Matic at arago.de
Big-data-based IT modelling and pricing app
![Page 25: Chris Boos, arago AG: Big Data means new programs](https://reader033.vdocuments.net/reader033/viewer/2022042813/5480687eb379596a2b8b5ae0/html5/thumbnails/25.jpg)
Thank You for Your Time
![Page 26: Chris Boos, arago AG: Big Data means new programs](https://reader033.vdocuments.net/reader033/viewer/2022042813/5480687eb379596a2b8b5ae0/html5/thumbnails/26.jpg)
Credits
• "Big Data Just Beginning to Explode" by CSC, http://www.csc.com/insights/flxwd/78931-big_data_just_beginning_to_explode
• "Social media network connections among twitter users" by Marc Smith, http://www.flickr.com/photos/marc_smith/
• Asteroid Datasets by Bruce Gary, http://brucegary.net/POVENMIRE/x.htm