How Big is Big Data? And NoSQL Databases

Post on 25-Feb-2016

DESCRIPTION

How Big is Big Data? And NoSQL Databases. University of California, Berkeley School of Information. IS 257: Database Management. Announcement: Change to presentations. Presentation is now OPTIONAL for extra credit. Now will be during finals week, Monday Dec. 16th from 1-4.

TRANSCRIPT

2013.11.21- SLIDE 1
IS 257 – Fall 2013

How Big is Big Data? And NoSQL Databases

University of California, Berkeley
School of Information

IS 257: Database Management

2013.11.21- SLIDE 2

Announcement
• Change to presentations
  – Presentation is now OPTIONAL for extra credit
  – Now will be during finals week
    • Monday Dec. 16th from 1-4
  – Final report also due then (or before)

2013.11.21- SLIDE 3

Lecture Outline
• Review
  – Big Data (introduction)
• More on Big Data and what it means
• RDBMS vs NoSQL databases

2013.11.21- SLIDE 4

Big Data and Databases
• “640K ought to be enough for anybody.”
  – Attributed to Bill Gates, 1981

2013.11.21- SLIDE 5

The Grid: On-Demand Access to Electricity

[Figure: chart with axes labeled "Time" and "Quality, economies of scale"]

Source: Ian Foster

2013.11.21- SLIDE 6

Big Data and Databases
• We have already mentioned some Big Data
  – The Walmart Data Warehouse
  – Information collected by Amazon on users and sales and used to make recommendations
• Most modern web-based companies capture EVERYTHING that their customers do
  – Does that go into a Warehouse or someplace else?

2013.11.21- SLIDE 7

Why the Grid? (1) Revolution in Science
• Pre-Internet
  – Theorize &/or experiment, alone or in small teams; publish paper
• Post-Internet
  – Construct and mine large databases of observational or simulation data
  – Develop simulations & analyses
  – Access specialized devices remotely
  – Exchange information within distributed multidisciplinary teams

Source: Ian Foster

2013.11.21- SLIDE 8

Why the Grid? (2) Revolution in Business
• Pre-Internet
  – Central data processing facility
• Post-Internet
  – Enterprise computing is highly distributed, heterogeneous, inter-enterprise (B2B)
  – Business processes increasingly computing- & data-rich
  – Outsourcing becomes feasible => service providers of various sorts

Source: Ian Foster

2013.11.21- SLIDE 9

How Big is Big Data
• How big is big?

Unit           Bytes
1 kilobyte     1,000
1 megabyte     1,000,000
1 gigabyte     1,000,000,000
1 terabyte     1,000,000,000,000
1 petabyte     1,000,000,000,000,000
1 exabyte      1,000,000,000,000,000,000
1 zettabyte    1,000,000,000,000,000,000,000
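
The unit ladder above can be sketched in a few lines of Python. This is only an illustration, assuming the decimal (powers of 1,000) definitions shown on the slide rather than binary (powers of 1,024) units; the names `UNITS`, `bytes_in`, and `humanize` are hypothetical and not part of the lecture.

```python
# Decimal storage units, smallest to largest; each step is a factor of 1,000.
# (Assumption: decimal units as on the slide, not 1,024-based binary units.)
UNITS = [
    "byte", "kilobyte", "megabyte", "gigabyte",
    "terabyte", "petabyte", "exabyte", "zettabyte",
]


def bytes_in(unit: str) -> int:
    """Return the number of bytes in one `unit`, using powers of 1,000."""
    return 1000 ** UNITS.index(unit)


def humanize(n_bytes: int) -> str:
    """Express a byte count in the largest unit that keeps the value >= 1."""
    value = float(n_bytes)
    idx = 0
    # Divide by 1,000 until the value drops below 1,000 or units run out.
    while value >= 1000 and idx < len(UNITS) - 1:
        value /= 1000
        idx += 1
    return f"{value:.1f} {UNITS[idx]}"
```

For example, `humanize(100 * bytes_in("terabyte"))` gives `"100.0 terabyte"`.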

2013.11.21- SLIDE 10

What is Big Data?
• Ran across some interesting slides from a decade ago that already frame the problem and did a fair job of predicting where we are today
  – Slides by Jim Gray and Tony Hey: “In Search of Petabyte Databases” ca. 2001

2013.11.21- SLIDE 11

Summary from Gray & Hey
• DBs own the sweet-spot:
  – 1GB to 100TB
• Big data is not in databases
• HPTS crowd is not really high performance storage (BIG DATA)
• Cost of storage is people:
  – Performance goal: 1 Admin per PB

From Jim Gray and Tony Hey: “In Search of Petabyte Databases” ca. 2001
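
The two numeric claims in this summary (the 1GB to 100TB sweet spot and the goal of one admin per PB) can be turned into simple arithmetic. The sketch below assumes decimal units and treats both figures as rules of thumb from the ca. 2001 slides; the function names are hypothetical and do not come from Gray and Hey.

```python
# Decimal unit sizes in bytes (assumption: powers of 1,000).
GIGABYTE = 1000 ** 3
TERABYTE = 1000 ** 4
PETABYTE = 1000 ** 5


def in_db_sweet_spot(n_bytes: int) -> bool:
    """True if the size falls in the 1GB to 100TB range cited on the slide."""
    return GIGABYTE <= n_bytes <= 100 * TERABYTE


def admins_needed(n_bytes: int, bytes_per_admin: int = PETABYTE) -> int:
    """Admins required at the stated goal of one admin per petabyte.

    Uses integer ceiling division so a partly filled petabyte
    still counts as needing one admin.
    """
    return -(-n_bytes // bytes_per_admin)
```

Under these assumptions, a 2,500 TB store is outside the sweet spot and would need three admins at the one-per-PB goal.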

2013.11.21- SLIDE 12

Why People?

[Image: One row of one of Google’s data centers]

2013.11.21- SLIDE 13

What counts as Big Data
• More OPS (other people’s slides)
  – Taming the Big Data Fire Hose
    • by John Hugg, VoltDB

2013.11.21- SLIDE 14

RDBMS vs NoSQL
• From a course at Dalhousie Univ. in Canada (including slides from Keith W. Hare “A Comparison of SQL and NoSQL Databases”)
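
As a rough illustration of the RDBMS vs NoSQL contrast, the sketch below stores the same records two ways: in a relational table queried with SQL, and in a key-value store holding JSON documents. It is a generic example, not taken from the Dalhousie or Hare slides. The `customer` table, the sample records, and the plain `dict` standing in for a key-value store are all assumptions made for illustration.

```python
import json
import sqlite3

# Relational side: a fixed schema, queried declaratively with SQL.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT, city TEXT)")
conn.executemany(
    "INSERT INTO customer (id, name, city) VALUES (?, ?, ?)",
    [(1, "Ada", "Berkeley"), (2, "Grace", "Oakland")],
)
berkeley_sql = [
    row[0]
    for row in conn.execute(
        "SELECT name FROM customer WHERE city = ? ORDER BY name", ("Berkeley",)
    )
]

# Key-value side: values are opaque to the store and fetched by key.
# (Assumption: a dict stands in for a real key-value database.)
kv_store = {}
kv_store["customer:1"] = json.dumps({"name": "Ada", "city": "Berkeley"})
# Records need not share a schema; this one carries an extra field.
kv_store["customer:2"] = json.dumps(
    {"name": "Grace", "city": "Oakland", "tags": ["vip"]}
)

# Lookup by key is direct.
ada = json.loads(kv_store["customer:1"])

# Filtering by a non-key field means scanning and decoding in application code.
berkeley_kv = sorted(
    doc["name"]
    for doc in map(json.loads, kv_store.values())
    if doc["city"] == "Berkeley"
)
```

The relational version pushes the filter into the database engine, while the key-value version gives fast lookup by key and a flexible record shape but leaves other queries to the application.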

2013.11.21- SLIDE 15

You can buy Big Data…
• Oracle will be happy to sell you systems (hardware and software) to manage your exabytes…

Oracle Exadata Database Machine X3-8

2013.11.21- SLIDE 16

And NoSQL too…
• Oracle Big Data Appliance
• With Oracle NoSQL Database (BerkeleyDB)
