Data Modeling for Big Data & NoSQL Technologies with Karen Lopez

Upload: embarcadero-technologies

Post on 18-Nov-2014

DESCRIPTION

Watch the companion webinar for this presentation at http://embt.co/KLopez826. In this webinar, Karen Lopez of InfoAdvisors will cover 10 tips for the modern data architect and resources for coming up to speed on these new approaches. She will share how modern data modeling approaches address both SQL (relational) and NoSQL technologies. We'll look at the role of a data modeler, and how models, processes and data governance processes can add value to enterprise big data and NoSQL development projects.

TRANSCRIPT

Page 1: Data Modeling for Big Data & NoSQL Technologies with Karen Lopez

www.infoadvisors.com Aug 2014

Big Data, NoSQL & Data Modeling

10 Tips for Data Modeling Success on Modern Data Projects

Karen Lopez, InfoAdvisors

www.datamodel.com

Karen López

Karen has 20+ years of data and information architecture experience on large, multi-project programs.

She is a frequent speaker on data modeling, data-driven methodologies and pattern data models.

She wants you to love your data.

©InfoAdvisors - infoadvisors.com

#TEAMDATA

Aug 2014

Page 2: Data Modeling for Big Data & NoSQL Technologies with Karen Lopez

POLL: Who Are You?

POLL: NoSQL Much?

Page 3: Data Modeling for Big Data & NoSQL Technologies with Karen Lopez

“BIG DATA”

[x] Vs

“Data so big it’s awkward to work with”

Always capitalized Big Data

A confusing term because it defines what it IS NOT.

“NoSQL”

Scale

Not SQL?

Not Relational?

Not Only Relational?

A confusing term because it defines what it IS NOT.

Page 4: Data Modeling for Big Data & NoSQL Technologies with Karen Lopez

Terminology

ACID: Atomic, Consistent, Isolated, Durable

BASE: Basically available, Soft state, Eventual consistency

Eventual consistency

Schemaless

Constraints / Have-to / MUST / OBEY / Rigid / Inflexible
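
To make "eventual consistency" concrete, here is a minimal sketch in Python. It is not taken from the slides: the class names, the single `pending` replication log and the explicit `sync()` call are all assumptions made for illustration. Real stores replicate in the background and must also resolve conflicting writes.

```python
class Replica:
    """One copy of the data; applies writes in the order it receives them."""

    def __init__(self):
        self.data = {}

    def apply(self, key, value):
        self.data[key] = value


class EventuallyConsistentStore:
    """Hypothetical store: writes reach the primary at once, the secondary on sync()."""

    def __init__(self):
        self.primary = Replica()
        self.secondary = Replica()
        self.pending = []  # replication log that has not been shipped yet

    def write(self, key, value):
        self.primary.apply(key, value)
        self.pending.append((key, value))

    def read_secondary(self, key):
        # Reads served by the secondary may lag behind the primary.
        return self.secondary.data.get(key)

    def sync(self):
        # Ship every pending write; afterwards both replicas agree.
        for key, value in self.pending:
            self.secondary.apply(key, value)
        self.pending.clear()


store = EventuallyConsistentStore()
store.write("balance", 100)
stale = store.read_secondary("balance")  # None: the replica has not caught up
store.sync()
fresh = store.read_secondary("balance")  # 100 once replication completes
```

The read between `write` and `sync` is the "soft state" window: the system stays available, but the answer depends on which replica is asked.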

Relational

Tables with rows

same columns

with the same datatypes

with the same constraints

with the same domains

This is a FEATURE

On purpose

With many benefits

Write-optimized

Transaction-optimized

Data integrity

Data quality

Consistent
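
As a small illustration of constraints as a feature, the sketch below uses Python's built-in `sqlite3` module. The `customer` table, its columns and the sample rows are hypothetical and not from the presentation. The point is only that the database itself, not the application, rejects a row that breaks the rules.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE customer (
        customer_id INTEGER PRIMARY KEY,
        name        TEXT NOT NULL,
        state       TEXT NOT NULL CHECK (length(state) = 2)
    )
""")

# A row that satisfies every constraint is accepted.
conn.execute("INSERT INTO customer (name, state) VALUES (?, ?)", ("Ada", "ON"))

# A row that violates the CHECK constraint is refused by the database.
rejected = False
try:
    conn.execute("INSERT INTO customer (name, state) VALUES (?, ?)", ("Bob", "Ontario"))
except sqlite3.IntegrityError:
    rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM customer").fetchone()[0]
```

Every row that makes it into the table has the same columns and obeys the same rules, which is where the data integrity and data quality benefits come from.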

Page 5: Data Modeling for Big Data & NoSQL Technologies with Karen Lopez

Data Models – Traditional Process

Conceptual (Data) Model

Logical Data Model

Physical Data Model(s)

[Diagram: multiple physical data models, labeled OLTP and MART]

Relational

Page 6: Data Modeling for Big Data & NoSQL Technologies with Karen Lopez

Traditional Data Modeler Involvement

Project Initiation

Architecture and Infrastructure Design

SW Requirements

Development

Deployment

The Big Data Story

Lots of data

Coming at us fast

Lots of variety in format & quality

We want all the data

Highly available

“It’s web scale”

Page 7: Data Modeling for Big Data & NoSQL Technologies with Karen Lopez

What do we really mean by scale?

Bringing computing to the data

Massively parallel processing

Cheap, commodity hardware, but lots of it

Optimized for Query/Reads/Questions/Telling stories
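
A rough sketch of "bringing computing to the data", assuming three hypothetical partitions that stand in for data held on separate nodes. Threads play the role of nodes here purely for illustration. Each partition is summarized where it lives, and only the small partial results are combined.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor

# Hypothetical partitions: each list stands in for the data held on one node.
partitions = [
    ["ON", "NY", "ON"],
    ["NY", "TX"],
    ["ON", "TX", "TX"],
]


def count_local(rows):
    """Runs next to one partition and returns a small partial result."""
    return Counter(rows)


def count_by_state(parts):
    # Fan the same work out to every partition in parallel.
    with ThreadPoolExecutor(max_workers=len(parts)) as pool:
        partials = list(pool.map(count_local, parts))
    # Combine the partial counts; the raw rows never leave their partition.
    total = Counter()
    for partial in partials:
        total.update(partial)
    return total


totals = count_by_state(partitions)
```

Adding capacity means adding partitions and workers rather than buying a bigger machine, which is the sense of scale the slide describes.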

Can we fit another buzzword in?

Cloud

• Enable on-demand scaling

• Pay-as-you-go pricing

• Click to deploy

• Service licensing, not product licensing, if any

• Managed by others, not your data center

Page 8: Data Modeling for Big Data & NoSQL Technologies with Karen Lopez

We’ve been down this road before…

Traditional transactional applications

Reporting-optimized tables/structures

Data Warehouse / Dimensional Modeling

ETL

EDW

Data Mart

Data Mart

Page 9: Data Modeling for Big Data & NoSQL Technologies with Karen Lopez

Hadoop

ETL

EDW

Analytics Mart

Data Mart

NoSQL, Not Only SQL

Relational

Graph

Columnar / Column Family

Key Value

Document Databases

Others
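
To compare these families side by side, the sketch below expresses one hypothetical fact (customer 42, Ada, placed order 1001) in four shapes, using plain Python structures as stand-ins. None of the names or layouts come from the slides or from any specific product. They only show how the modeling choice changes with the store.

```python
import json

# Key-value: an opaque value looked up by a single key.
key_value = {"customer:42": json.dumps({"name": "Ada", "orders": [1001]})}

# Document: a nested, self-describing structure.
document = {"_id": 42, "name": "Ada", "orders": [{"order_id": 1001, "total": 25.0}]}

# Column family: a row key mapping to families of named columns.
column_family = {"42": {"profile": {"name": "Ada"}, "orders": {"1001": "25.0"}}}

# Graph: nodes plus explicitly typed relationships between them.
nodes = {
    "c42": {"label": "Customer", "name": "Ada"},
    "o1001": {"label": "Order", "total": 25.0},
}
edges = [("c42", "PLACED", "o1001")]


def orders_placed_by(node_id):
    """Follow PLACED relationships outward from one node."""
    return [dst for src, rel, dst in edges if src == node_id and rel == "PLACED"]


name_from_kv = json.loads(key_value["customer:42"])["name"]
placed = orders_placed_by("c42")
```

The same data story supports different questions cheaply in each shape: lookup by key, whole-document retrieval, wide-row scans, or relationship traversal.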

Page 10: Data Modeling for Big Data & NoSQL Technologies with Karen Lopez

Graph Databases

Key Value Pair

Page 11: Data Modeling for Big Data & NoSQL Technologies with Karen Lopez

Columnar

Sample Hive Statement

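-- EXTERNAL: Hive records only the table metadata; dropping the table leaves the underlying files in place.
CREATE EXTERNAL TABLE TaxRebateUsage (
  state string,
  zipcode string,
  agi_class int,
  n1 int,
  mars2 int,
  prep int,
  n2 int
)
-- The files are plain comma-delimited text; the column list above is applied when they are read.
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
STORED AS TEXTFILE;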

Page 12: Data Modeling for Big Data & NoSQL Technologies with Karen Lopez

Sample JSON/MongoDB Notation

Sample FoundationDB Statement

Page 13: Data Modeling for Big Data & NoSQL Technologies with Karen Lopez

Sample Cassandra Statement

Sample Vertica Statement

Page 14: Data Modeling for Big Data & NoSQL Technologies with Karen Lopez

Sample Neo4j Statement

What else is different, now?

Maturing of high availability technologies

Maturing of ____ as a Service business models

RDBMS vendors adopting non-relational features

Open source software models

100s of database options

Page 15: Data Modeling for Big Data & NoSQL Technologies with Karen Lopez

The Big Data Big Lies

Schemaless

• Schema on Read, not Schema on Write

• Polyschematic

Big

• New data stories

• New technologies
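
"Schema on read" can be sketched as follows, assuming a hypothetical comma-delimited feed whose first three columns borrow the names `state`, `zipcode` and `n1` from the Hive sample. Nothing is checked when the text is stored. The reader supplies the schema and decides what to do with rows that do not fit.

```python
import csv
import io

# Raw text kept exactly as it arrived; nothing was validated on write.
raw = "ON,M5V,3\nNY,10001,not-a-number\nTX,73301,7\n"

# The schema lives with the reader: column names plus a parser for each.
schema = [("state", str), ("zipcode", str), ("n1", int)]


def read_with_schema(text, schema):
    """Apply the schema at read time, separating rows that fit from rows that do not."""
    good, bad = [], []
    for fields in csv.reader(io.StringIO(text)):
        try:
            good.append({name: cast(value)
                         for (name, cast), value in zip(schema, fields)})
        except ValueError:
            bad.append(fields)
    return good, bad


rows, rejects = read_with_schema(raw, schema)
```

The schema has not gone away. It has moved from the writer to every reader, and a second reader could apply a different schema to the same text, which is the "polyschematic" case.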

Importing Structures

Page 16: Data Modeling for Big Data & NoSQL Technologies with Karen Lopez

10 Tips For Modeling in a Hybrid World

1. Models require a modeler

2. Data modeling tools are essential

3. There are many types of data models: know which ones you need

4. Modeling does not have to happen at the same time in every project. It should happen at the right time

5. Modeling is not just schema design. Think outside the boxes and lines

10 Tips for Modeling in a Hybrid World

6. A data model is much more than a diagram

7. You will need training.

8. Team members may not understand modeling. They will need training

9. NoSQL is not one thing. Learn many patterns

10. Modern data architectures are likely hybrid solutions. You can’t just support one part.

Page 17: Data Modeling for Big Data & NoSQL Technologies with Karen Lopez

Modern Data Modeler Involvement

Project Initiation

Architecture and Infrastructure Design

SW Requirements

Development

Deployment

What does this mean for data modelers?

There will be jobs for traditional, ERD, relational modelers…

…just like there are still jobs for RPG and COBOL programmers

All data has a data story. Many data stories.

A good modeler is an architect at heart – finding the right solution for the data story.

Page 18: Data Modeling for Big Data & NoSQL Technologies with Karen Lopez

Business Intelligence Journal

Look for the September issue article on Modern Data Architectures

Thank You!

www.infoadvisors.com

www.datamodel.com

www.dataversity.net

#TEAMDATA
