fortune time institute: big data - challenges for smartcity


Upload: victoria-lopez

Post on 16-Apr-2017


Category: Business



Shanghai 2014

Big and Open Data: Challenges for Smartcity

Dr. Victoria López
Grupo G-TeC
www.tecnologiaUCM.es
Universidad Complutense de Madrid

August 26th, 2014
55 Exchange Place, NYC

Big and Open Data: Challenges for Smartcity
What about Big Data?
Fighting with Big Data.
Big Data. Big Projects.
Privacy.
Open Data. Transparency.
Smartcities.

What about Big Data?

From Data Warehouse to Big Data (large databases)
1970: relational model invented
RDBMS declared mainstream until the 90s
"One size fits all": elephant vendors, heavily encoded, even indexing by B-trees.

What about Big Data?

Big Data: 3+1+1 Vs

Value

Fighting with Big Data

Bioinformatics: genome data, DNA, RNA, proteins and, in general, all biological data require computer monitoring and storage in large databases at laboratories and research centers around the world.

The Human Genome Project

Customer point of view
Looking for flights
Not a simple search

Web issues: Shortest path
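A minimal sketch of the shortest-path idea behind a flight search. The slides do not say which algorithm a real flight site uses; Dijkstra's algorithm is swapped in here as one classic choice, and the airports and fares in `ROUTES` are hypothetical values invented for illustration.

```python
import heapq


def cheapest_route(graph, origin, destination):
    """Dijkstra's algorithm: return (total_cost, path), or (None, []) if unreachable."""
    queue = [(0, origin, [origin])]  # (cost so far, current airport, path taken)
    settled = set()                  # airports whose cheapest cost is already final
    while queue:
        cost, airport, path = heapq.heappop(queue)
        if airport == destination:
            return cost, path
        if airport in settled:
            continue
        settled.add(airport)
        for neighbour, price in graph.get(airport, {}).items():
            if neighbour not in settled:
                heapq.heappush(queue, (cost + price, neighbour, path + [neighbour]))
    return None, []


# Hypothetical one-way fares between airports (illustrative only).
ROUTES = {
    "MAD": {"LHR": 120, "JFK": 520, "CDG": 90},
    "CDG": {"JFK": 380, "LHR": 60},
    "LHR": {"JFK": 340},
    "JFK": {},
}

cost, path = cheapest_route(ROUTES, "MAD", "JFK")
print(cost, path)
```

With these made-up fares the direct flight is not the cheapest option, which is the point of "not a simple search": the search has to compare whole routes, not single legs.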

Joking aside, behind our comfortable position there is some math and programming.

Restrictions: total time, total costs, date/hour
How to sort the results?

http://www.sorting-algorithms.com/

Web issues: Searching & Sorting
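The restrictions listed on this slide (total time, total costs, date/hour) can be sketched as sort keys. This is only an illustration: the `Flight` record, its field names, and the sample flights are assumptions, not data from the talk.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Flight:
    code: str            # hypothetical flight identifier
    departure: datetime  # date/hour
    total_minutes: int   # total time
    total_cost: float    # total cost


# Invented sample results for one search.
FLIGHTS = [
    Flight("A1", datetime(2014, 8, 26, 9, 0), 540, 610.0),
    Flight("B2", datetime(2014, 8, 26, 7, 30), 495, 720.0),
    Flight("C3", datetime(2014, 8, 26, 13, 15), 540, 580.0),
]


def sort_flights(flights, max_cost=None):
    """Drop flights above an optional budget, then sort by cost, time, departure."""
    candidates = [f for f in flights if max_cost is None or f.total_cost <= max_cost]
    return sorted(candidates, key=lambda f: (f.total_cost, f.total_minutes, f.departure))


by_price = [f.code for f in sort_flights(FLIGHTS)]
by_time = [f.code for f in sorted(FLIGHTS, key=lambda f: (f.total_minutes, f.total_cost))]
print(by_price, by_time)
```

Changing the order of the fields in the key tuple changes which restriction dominates, which is one way to read the slide's question "How to sort the results?".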

How many?

Order your room now!
One teenager working = one afternoon at home

How many?
Order all New York rooms NOW!
One teenager working alone?

The solution: organization

Main feature: scalability to many nodes
Scan of 100 TB in 1 node @ 50 MB/sec = 23 days
Scan in a cluster of 1000 nodes = 33 minutes
Created by Google (2004)
Parallel programming model
Simple concept, smart, suitable for multiple applications
Big datasets, multi-node, in multiprocessors
Sets of nodes: clusters or grids (distributed programming)
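The two scan figures above can be checked with a few lines of arithmetic. The sketch assumes decimal units (1 TB = 10^12 bytes, 1 MB = 10^6 bytes) and perfectly linear scaling across nodes, which a real cluster would only approximate.

```python
TB = 10**12  # bytes, decimal units assumed
MB = 10**6   # bytes, decimal units assumed


def scan_seconds(data_bytes, bytes_per_second_per_node, nodes=1):
    """Time to scan the data if every node reads an equal share in parallel."""
    return data_bytes / (bytes_per_second_per_node * nodes)


single = scan_seconds(100 * TB, 50 * MB)               # one node
cluster = scan_seconds(100 * TB, 50 * MB, nodes=1000)  # 1000 nodes

print(f"{single / 86400:.1f} days")   # about 23 days
print(f"{cluster / 60:.1f} minutes")  # about 33 minutes
```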

Able to process 20 PB per day
Based on Map & Reduce, classical methods in functional programming related to the classic Divide & Conquer
They come from numerical analysis (big matrix products).

Big Data: Map Reduce
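A single-process sketch of the Map & Reduce idea, using word counting as the example task. It is not Google's or Hadoop's implementation; the function names `map_phase`, `shuffle`, and `reduce_phase` are illustrative, and in a real cluster each phase would run in parallel on many nodes.

```python
from collections import defaultdict
from functools import reduce


def map_phase(document):
    """Map: emit a (word, 1) pair for every word in one document."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle(pairs):
    """Group the emitted values by key, as the framework does between phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: combine all the values of one key into a single result."""
    return key, reduce(lambda a, b: a + b, values)


def word_count(documents):
    mapped = [pair for doc in documents for pair in map_phase(doc)]
    return dict(reduce_phase(k, v) for k, v in shuffle(mapped).items())


counts = word_count(["big data open data", "open data smart city"])
print(counts)
```

Because each document is mapped independently and each key is reduced independently, both phases can be split across nodes, which is the Divide & Conquer connection the slide mentions.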

Hadoop: open-source implementation of the MapReduce computational model
Used by Yahoo!, Facebook, Twitter, Amazon, eBay
Can be used in different architectures: both clusters (in-house) and grids (cloud computing)
Storm and Spark follow the same model in memory instead of on disk
https://hadoop.apache.org/
https://spark.apache.org/

Big Data: Hadoop, Spark

How much data?

Recommender Systems

Renew your car insurance

Semantic Web tools
Analysing & storing personal information

Businesses need to be competitive

Harvard Business Review (HBR) blog, "CMOs and CIOs Need to Get Along to Make Big Data Work"

Big Data & Business

Big Data for Big Projects
Real Time
The Obama 2012 campaign used data analytics and the experimental method to assemble a winning coalition vote by vote. In doing so, it overturned the long dominance of TV advertising in U.S. politics and created something new in the world: a national campaign run like a local ward election, where the interests of individual voters were known and addressed.

Big Data for Big Projects
Real Time
How Brazil vs. Germany played out on Twitter
Geotagged tweets mentioning key terms around the World Cup game, July 8, 2014

Where are my personal data?

Social Sensing

The near future: Internet of Things

Open Data

"Open data is data that can be freely used, reused and redistributed by anyone, subject only, at most, to the requirement to attribute and share alike." (OpenDefinition.org)
Availability and Access: the data must be available as a whole and at no more than a reasonable reproduction cost, preferably by downloading over the internet. The data must also be available in a convenient and modifiable form.
Reuse and Redistribution: the data must be provided under terms that permit reuse and redistribution, including the intermixing with other datasets. The data must be machine-readable.
Universal Participation: everyone must be able to use, reuse and redistribute; there should be no discrimination against fields of endeavour or against persons or groups. For example, non-commercial restrictions that would prevent commercial use, or restrictions of use for certain purposes (e.g. only in education), are not allowed.

Open Data

"Why Open Data", by the Open Knowledge Foundation

Our experience in developing systems for Madrid Open Data

Mariam Saucedo, Pilar Torralbo, Daniel Sanz

Recycla.me

Ana Alfaro, Sergio Ballesteros, Lidia Sesma

Héctor Martos, Álvaro Bustillo, Arturo Callejo

Belén Abellanas, Jaime Ramos, Ignacio P. de Ziriza

Victor Torres, Alberto Segovia, Miguel Bueno

Mar Octavio de Toledo, Antonio Sanmartín, Carlos Fernández

RECYCLA.TE RESOURCE MAP

Parks and gardens
Parking for: cars, motorbikes, bikes
Recycling points: fixed, mobile, clothes
Stations: bioethanol, gas oil, electric
Routes for bikes: bike lanes, safe streets
Residential Priority Areas

Madrid Smart City

RMap Demonstration

The way from data to value
Big Data Collection: monitoring, data cleaning and integration, hosted data platforms and the cloud
Big Data Storage: modern databases, distributed computing platforms, NoSQL, NewSQL
Big Data Systems: security, multicore scalability, visualization and user interfaces
Big Data Analytics: fast algorithms, data compression, machine learning tools, visualization & reporting

The stages proposed by MIT for dealing with Big Data

Conclusions
Big Data, Open Data and Smartcity

Era of Data Revolution (Alex 'Sandy' Pentland, http://www.media.mit.edu/people/sandy)
New technologies & development
New business
Great opportunities in Smartcity development

Dr. Victoria López
www.tecnologiaUCM.es
www.madrid.org
Madrid City Hall