Big data, Cloud Computing and No SQL

© Copyright SELA software & Education Labs Ltd. | 14-18 Baruch Hirsch St Bnei Brak, 51202 Israel | www.selagroup.com SELA DEVELOPER PRACTICE December 15-19, 2013 Manu Cohen-Yashar The Cloud, Big Data and NoSQL

Upload: manu-cohen-yashar

Post on 10-May-2015


DESCRIPTION

Cloud, Big Data and NoSQL are popular buzzwords today. This presentation shows how they all fit together. It makes sense of all of the above and shows how these new technologies can help the business become more productive.

TRANSCRIPT

Page 1: Big data, Cloud Computing and No SQL

© Copyright SELA software & Education Labs Ltd. | 14-18 Baruch Hirsch St Bnei Brak, 51202 Israel | www.selagroup.com

SELA DEVELOPER PRACTICE
December 15-19, 2013

Manu Cohen-Yashar

The Cloud, Big Data and NoSQL

Page 2: Big data, Cloud Computing and No SQL

Agenda

What is the cloud
Data boom
No SQL
Big Data
Cloud Distributions
What’s next

Page 3: Big data, Cloud Computing and No SQL
Page 4: Big data, Cloud Computing and No SQL

Make sense of: Cloud, Big Data and NoSQL

How they fit together

Make money !!!

Page 5: Big data, Cloud Computing and No SQL

What is the cloud

Cloud Computing is an Idea …

Infrastructure is provisioned by a cloud provider.
Automatic scale.
Elasticity.
Pay as you use.
Availability.
Simple, automatic, economic.

Page 6: Big data, Cloud Computing and No SQL

Types of Clouds

IaaS
PaaS
SaaS
and more…

Identity as a Service
Connectivity as a Service
Storage as a Service

Page 7: Big data, Cloud Computing and No SQL

Lots of Data

Data doubles every 18 months
Pictures
Web sites
Emails
Sensors
Geo information
Financial information
Science
Art
. . . (infinite list)
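To put the doubling claim in numbers, here is a minimal sketch. It assumes the 18-month doubling period from the slide stays constant; the function name `growth_factor` is hypothetical and chosen only for illustration.

```python
def growth_factor(months: float, doubling_period_months: float = 18.0) -> float:
    """Return how many times larger the data volume is after `months`,
    assuming it doubles once every `doubling_period_months`."""
    return 2.0 ** (months / doubling_period_months)


# After 3 years (36 months) the volume has doubled twice.
print(growth_factor(36))   # 4.0
# After 9 years (108 months) it has doubled six times.
print(growth_factor(108))  # 64.0
```

Under that assumption, storage planned for today's volume is outgrown by a factor of four within three years.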

Page 8: Big data, Cloud Computing and No SQL

No Limits

With the cloud it is now possible to mount a cluster of any size and run any computation at any scale.
The one who makes sense of all the available data will rule the world.

The conclusion: Use the cloud to analyze data at large scale.

Page 9: Big data, Cloud Computing and No SQL

Let's talk about data

When we think of data we think of …

Page 10: Big data, Cloud Computing and No SQL

Data has many forms

Yet data comes in many forms and shapes

Graphs
Documents
Time Series
Blobs
Geo
Sensors
Unstructured
Structured
Web

Page 11: Big data, Cloud Computing and No SQL

Non-Relational

Not all types of data fit well into the relational world.
Not all data use cases fit well into the ACID convention.
The relational model does not scale very well:

Difficult to distribute
Difficult to replicate

Page 12: Big data, Cloud Computing and No SQL

The CAP Theorem

RDBMS

Replicated NoSQL

Sharded NoSQL

During a network partition, a distributed system must choose either Consistency or Availability.
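The trade-off can be illustrated with a toy model. This is only a sketch and not how any particular database behaves: the classes `Replica` and `TinyCluster` and the `prefer` setting are hypothetical names invented for this example.

```python
class Replica:
    """One copy of the data, held as a plain dict."""

    def __init__(self, name: str):
        self.name = name
        self.data = {}


class TinyCluster:
    """Two replicas plus a flag that simulates a network partition."""

    def __init__(self, prefer: str):
        if prefer not in ("consistency", "availability"):
            raise ValueError("prefer must be 'consistency' or 'availability'")
        self.prefer = prefer
        self.a = Replica("a")
        self.b = Replica("b")
        self.partitioned = False

    def write(self, key, value) -> bool:
        """Send a write to replica a; return True if it was accepted."""
        if not self.partitioned:
            # Healthy network: update both copies.
            self.a.data[key] = value
            self.b.data[key] = value
            return True
        if self.prefer == "consistency":
            # Replica b is unreachable, so refuse the write:
            # the copies stay identical but the system is unavailable.
            return False
        # Accept the write locally: the system stays available,
        # but the two copies now disagree.
        self.a.data[key] = value
        return True

    def consistent(self) -> bool:
        return self.a.data == self.b.data


cp = TinyCluster("consistency")
cp.partitioned = True
print(cp.write("x", 1), cp.consistent())  # False True

ap = TinyCluster("availability")
ap.partitioned = True
print(ap.write("x", 1), ap.consistent())  # True False
```

The first cluster gives up availability to keep its replicas identical; the second keeps accepting writes and has to reconcile the replicas once the partition heals.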

Page 13: Big data, Cloud Computing and No SQL

NoSQL

Large family of databases
No schema
No relations enforced
Designed for high scale and distribution

Types of NoSQL databases:
Key-value
Wide column
Document
Graph
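The four families differ mainly in the shape of the data they store. The sketch below models the same record four ways using plain Python structures only; it does not use any real database API, and the sample values (`Dana`, `Haifa`, the key names) are invented for illustration.

```python
import json

# Key-value: the store sees an opaque value and can only look it up by key.
key_value = {"user:42": json.dumps({"name": "Dana", "city": "Haifa"})}

# Wide column: row key -> column family -> column -> value.
# Rows in the same table may carry different columns.
wide_column = {
    "user:42": {
        "profile": {"name": "Dana", "city": "Haifa"},
        "orders": {"2013-12-15": "book"},
    }
}

# Document: a nested, self-describing record whose fields can be queried.
document = {
    "_id": 42,
    "name": "Dana",
    "city": "Haifa",
    "orders": [{"date": "2013-12-15", "item": "book"}],
}

# Graph: nodes plus explicit, typed relationships between them.
graph = {
    "nodes": {"user:42": {"name": "Dana"}, "item:book": {"title": "book"}},
    "edges": [("user:42", "BOUGHT", "item:book")],
}

# The same question ("which city?") is answered differently in each model.
print(json.loads(key_value["user:42"])["city"])    # Haifa
print(wide_column["user:42"]["profile"]["city"])   # Haifa
print(document["city"])                            # Haifa
```

Which shape fits best depends on how the data will be queried, which is why the use case should drive the choice of database.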

Page 14: Big data, Cloud Computing and No SQL

Motivation for NoSQL

Large scale and distribution
Simplicity
Low cost
Good fit with the data model
Volume, Velocity and Variety

Page 15: Big data, Cloud Computing and No SQL

There is no one NoSQL solution for all use cases

Important

There are more than 150 possible offerings…

Page 16: Big data, Cloud Computing and No SQL

The Cloud and NoSQL

All cloud providers have NoSQL solutions:
Azure Tables
Google Bigtable
Amazon DynamoDB

NoSQL databases are deployed on a cluster

There is a large number of cloud hosting offerings for NoSQL clusters:

MongoHQ (MongoDB)
Cassandra on Google Compute Engine
Many more

Page 17: Big data, Cloud Computing and No SQL

Example – Mongo in Azure

Page 18: Big data, Cloud Computing and No SQL

Big Data

What is Big?
“Big” cannot fit on a single machine.

Conclusion: Big data has to be distributed.
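One common way to distribute data is to hash each record's key and use the hash to pick a machine (a shard). The sketch below is an illustration under that assumption, not the scheme of any specific product; `shard_for` and `partition` are hypothetical helper names.

```python
import hashlib


def shard_for(key: str, num_shards: int) -> int:
    """Map a key to a shard index in a stable, repeatable way.

    MD5 is used here only as a well-spread, deterministic hash,
    not for security.
    """
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    return int(digest, 16) % num_shards


def partition(keys, num_shards: int):
    """Group keys into `num_shards` lists, one list per shard."""
    shards = [[] for _ in range(num_shards)]
    for key in keys:
        shards[shard_for(key, num_shards)].append(key)
    return shards


keys = [f"user:{i}" for i in range(1000)]
for index, shard in enumerate(partition(keys, 4)):
    print(index, len(shard))
```

A known weakness of plain modulo hashing is that changing `num_shards` moves most keys to a different shard; distributed stores often use techniques such as consistent hashing to reduce that movement.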

Page 19: Big data, Cloud Computing and No SQL

Types of Big Data Processing

Query
General analysis
Classification
Recommendation
Clustering
Auditing and monitoring
More…

Page 20: Big data, Cloud Computing and No SQL

Challenges

Develop a parallel algorithm
Reduce the network traffic -> bring compute to data
Monitor and manage a large number of parallel tasks
Survive failures
Performance
Linear scale

Page 21: Big data, Cloud Computing and No SQL

Batch Processing VS Operational Intelligence

Batch Processing
Work on existing data
Provide results within minutes

Operational Intelligence
Work on a stream of data
Provide real-time results
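The difference between the two styles can be sketched with a simple average. This is a minimal illustration with made-up readings: the batch version needs the whole data set before it answers, while the streaming version emits an updated answer after every event.

```python
def batch_average(values):
    """Batch style: all data already exists, compute one result at the end."""
    return sum(values) / len(values)


def streaming_average(stream):
    """Streaming style: keep a small running state and emit a fresh
    result as each new value arrives."""
    count = 0
    total = 0.0
    for value in stream:
        count += 1
        total += value
        yield total / count


readings = [10, 20, 30, 40]
print(batch_average(readings))            # 25.0
print(list(streaming_average(readings)))  # [10.0, 15.0, 20.0, 25.0]
```

The streaming version never stores the full history, which is what makes it usable on an unbounded stream.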

Page 22: Big data, Cloud Computing and No SQL

Distributed File System

No one server can store Big Data files
Distribute files across the cluster
Failure is part of the game
Similar API to traditional file systems
Examples:

HDFS
GFS
Cassandra FS
Mongo FS
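The core idea, splitting a file into blocks and keeping several copies of each block on different nodes, can be sketched as follows. This is a simplified illustration: the round-robin placement and the helper names `split_into_blocks` and `place_blocks` are assumptions made for the example, and real systems such as HDFS use more elaborate placement policies.

```python
def split_into_blocks(data: bytes, block_size: int):
    """Cut a file's bytes into fixed-size blocks (the last one may be shorter)."""
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]


def place_blocks(num_blocks: int, nodes, replication: int = 3):
    """Assign every block to `replication` distinct nodes, round-robin.

    Returns a dict: block id -> list of node names holding a copy.
    """
    if replication > len(nodes):
        raise ValueError("not enough nodes for the requested replication")
    placement = {}
    for block_id in range(num_blocks):
        placement[block_id] = [
            nodes[(block_id + r) % len(nodes)] for r in range(replication)
        ]
    return placement


blocks = split_into_blocks(b"abcdefghij", 4)
print(blocks)  # [b'abcd', b'efgh', b'ij']
print(place_blocks(len(blocks), ["n1", "n2", "n3", "n4"], replication=2))
# {0: ['n1', 'n2'], 1: ['n2', 'n3'], 2: ['n3', 'n4']}
```

Because every block lives on more than one node, losing a single node does not lose data, which is how such a file system treats failure as part of normal operation.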

Page 23: Big data, Cloud Computing and No SQL

Hadoop

Big Data analysis platform
Batch processing
Brings compute tasks to data nodes
Parallel processing using Map-Reduce
Open source
Huge ecosystem
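The Map-Reduce model itself can be shown with the classic word count. The sketch below runs in a single Python process and does not use Hadoop's API; the function names are hypothetical. In a real cluster the map and reduce calls would run in parallel on the nodes that hold the data.

```python
from collections import defaultdict


def map_phase(line: str):
    """Map: turn one input line into (word, 1) pairs."""
    for word in line.split():
        yield word.lower(), 1


def shuffle(pairs):
    """Shuffle: group all values that share the same key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(word: str, counts):
    """Reduce: collapse the values for one key into a single result."""
    return word, sum(counts)


def word_count(lines):
    pairs = (pair for line in lines for pair in map_phase(line))
    return dict(reduce_phase(word, counts) for word, counts in shuffle(pairs).items())


print(word_count(["big data", "Big cloud"]))
# {'big': 2, 'data': 1, 'cloud': 1}
```

Each map call looks at one line only and each reduce call at one key only, which is what lets the framework spread the work across many machines.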

Page 24: Big data, Cloud Computing and No SQL

Hadoop Ecosystem

Writing a valuable Map-Reduce job for Hadoop is not simple
Many open source projects provide abstractions

Pig
Hive
HBase
Sqoop
Mahout
ZooKeeper
More

Page 25: Big data, Cloud Computing and No SQL

Hadoop on the Cloud

Hadoop runs on a cluster
You can use a cluster as a service on major cloud offerings

Page 26: Big data, Cloud Computing and No SQL

Storm

Real-time big data analytics
Process streams of data
Can be used with any programming language
Wide integration with data sources
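Storm describes a computation as a topology of spouts (stream sources) and bolts (processing steps). The sketch below imitates that shape with Python generators only; it does not use Storm's API, runs in one process, and all function names are hypothetical.

```python
from collections import Counter


def sentence_spout(sentences):
    """Spout: the source that emits items into the stream."""
    for sentence in sentences:
        yield sentence


def split_bolt(stream):
    """Bolt: split each incoming sentence into words."""
    for sentence in stream:
        for word in sentence.split():
            yield word


def count_bolt(stream):
    """Bolt: keep a running count and emit an update for every word seen."""
    counts = Counter()
    for word in stream:
        counts[word] += 1
        yield word, counts[word]


topology = count_bolt(split_bolt(sentence_spout(["a b", "a"])))
print(list(topology))  # [('a', 1), ('b', 1), ('a', 2)]
```

Results are emitted as each item flows through, which is the real-time behaviour the slide contrasts with batch processing.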

Page 27: Big data, Cloud Computing and No SQL
Page 28: Big data, Cloud Computing and No SQL

Check your schema

Be open to using NoSQL data stores
Identify your use case and find the right database for you
Create a simple POC

Page 29: Big data, Cloud Computing and No SQL

Look for Big Data

Ask yourself: What can I gain from big data?

How can the new data or analysis scope enhance your existing set of capabilities?
What additional opportunities for intervention or process optimisation does it present?

Identify your use case and find the right product and data model.
Look for web distributions and create a simple POC.

Page 30: Big data, Cloud Computing and No SQL

Questions