www.infoadvisors.com Aug 2014
Big Data, NoSQL & Data Modeling
10 Tips for Data Modeling Success on Modern Data Projects
Karen Lopez, InfoAdvisors
www.datamodel.com
Karen López
Karen has 20+ years of data and information architecture experience on large, multi-project programs.
She is a frequent speaker on data modeling, data-driven methodologies and pattern data models.
She wants you to love your data.
©InfoAdvisors - infoadvisors.com
#TEAMDATA
Aug 2014
POLL: Who Are You?
POLL: NoSQL Much?
“BIG DATA”
[x] Vs
“Data so big it’s awkward to work with”
Always capitalized Big Data
A confusing term because it defines what it IS NOT.
“NoSQL”
Scale
Not SQL?
Not Relational?
Not Only Relational?
A confusing term because it defines what it IS NOT.
Terminology
ACID: Atomic, Consistent, Isolated, Durable
BASE: Basically available, Soft state, Eventual consistency
Eventual consistency
Schemaless
Constraints / Have-to / MUST / OBEY / Rigid / Inflexible
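To make the "Atomic" in ACID concrete, here is a minimal sketch using Python's built-in sqlite3 module. The account table, the names and the amounts are made up for illustration; the point is only that a transfer either applies both updates or neither.

```python
import sqlite3

# In-memory database; the table and its data are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account ("
    " name TEXT PRIMARY KEY,"
    " balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO account VALUES (?, ?)", [("alice", 100), ("bob", 50)])
conn.commit()


def transfer(conn, src, dst, amount):
    """Move amount from src to dst; both updates commit or neither does."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        # The CHECK constraint refused an overdraft, so the credit to dst
        # made earlier in the same transaction was rolled back as well.
        return False


ok = transfer(conn, "alice", "bob", 30)       # fits within alice's balance
failed = transfer(conn, "alice", "bob", 500)  # would overdraw alice
balances = dict(conn.execute("SELECT name, balance FROM account"))
```

After the failed transfer, bob's balance does not include the 500 that was credited before the overdraft was detected. That is the behavior a BASE-style store typically trades away in exchange for availability.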
Relational
Tables with rows
• with the same columns
• with the same datatypes
• with the same constraints
• with the same domains
This is a FEATURE
On purpose
With many benefits
Write-optimized
Transaction-optimized
Data integrity
Data quality
Consistent
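A small sketch of why the sameness is a feature, again with Python's sqlite3 module. The customer table and the candidate rows are hypothetical. Rows that break a constraint are refused at write time, so every row that is stored can be trusted by every reader.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table: every row must carry the same columns and obey the same rules.
conn.execute(
    """
    CREATE TABLE customer (
        customer_id INTEGER PRIMARY KEY,
        email       TEXT NOT NULL UNIQUE,
        state       TEXT NOT NULL CHECK (length(state) = 2)
    )
    """
)

candidates = [
    (1, "a@example.com", "NY"),
    (2, None, "CA"),                # missing email
    (3, "c@example.com", "Texas"),  # outside the two-letter state domain
    (4, "a@example.com", "WA"),     # duplicate email
]

accepted, rejected = [], []
for row in candidates:
    try:
        with conn:  # one transaction per row
            conn.execute("INSERT INTO customer VALUES (?, ?, ?)", row)
        accepted.append(row[0])
    except sqlite3.IntegrityError as exc:
        # The database, not the application, refuses the bad row.
        rejected.append((row[0], str(exc)))

stored = conn.execute("SELECT COUNT(*) FROM customer").fetchone()[0]
```

Only the first row is stored. The other three are rejected by the NOT NULL, CHECK and UNIQUE constraints respectively.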
Data Models – Traditional Process
Conceptual (Data) Model
Logical Data Model
Physical Data Model(s)
[Diagram: multiple OLTP and MART physical data models]
Relational
Traditional Data Modeler Involvement
Project Initiation
Architecture and Infrastructure Design
SW Requirements
Development
Deployment
The Big Data Story
Lots of data
Coming at us fast
Lots of variety in format & quality
We want all the data
Highly available
“It’s web scale”
What do we really mean by scale?
Bringing computing to the data
Massively parallel processing
Cheap, commodity hardware, but lots of it
Optimized for Query/Reads/Questions/Telling stories
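The idea of bringing computing to the data can be sketched in a few lines of Python. This is a toy: threads stand in for cluster nodes, and the partitions of state codes are made up. Each "node" summarizes only its own partition, and only the small summaries are combined.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor

# Hypothetical data, already split across three "nodes".
partitions = [
    ["NY", "CA", "NY"],
    ["TX", "CA"],
    ["NY", "TX", "TX", "WA"],
]


def local_count(partition):
    """Work done where the data lives: summarize one partition."""
    return Counter(partition)


def parallel_count(partitions):
    """Run the local step on every partition in parallel, then merge."""
    with ThreadPoolExecutor(max_workers=len(partitions)) as pool:
        partials = list(pool.map(local_count, partitions))
    total = Counter()
    for partial in partials:
        total.update(partial)  # only the small summaries travel
    return total


totals = parallel_count(partitions)
```

Adding more data here means adding more partitions and more workers, not a bigger single machine. That is the sense of "scale" on this slide.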
Can we fit another buzzword in?
Cloud
• Enable on-demand scaling
• Pay as you go pricing
• Click to deploy
• Service licensing, not product licensing, if any
• Managed by others, not your data center
We’ve been down this road before…
Traditional transactional applications
Reporting-optimized tables/structures
Data Warehouse / Dimensional Modeling
ETL
EDW
Data Mart
Data Mart
Hadoop
ETL
EDW
Analytics Mart
Data Mart
NoSQL, Not Only SQL
• Relational
• Graph
• Columnar/Column Family
• Key Value
• Document Databases
• Others
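One way to compare the families listed above is to hold the same facts in each shape. The sketch below uses plain Python structures as stand-ins; the customer, the SKUs and the key naming are invented, and real products differ in many details.

```python
import json

# Key value: the store understands only the key; the value is an opaque blob.
key_value = {"customer:42": json.dumps({"name": "Ada", "orders": ["A1", "B7"]})}

# Document: nested and self-describing, fetched as one unit.
document = {
    "_id": 42,
    "name": "Ada",
    "orders": [{"sku": "A1", "qty": 2}, {"sku": "B7", "qty": 1}],
}

# Column family: row key -> column name -> value; rows need not share columns.
column_family = {
    "42": {"name": "Ada", "order:A1": 2, "order:B7": 1},
    "43": {"name": "Grace"},
}

# Graph: nodes plus typed relationships between them.
nodes = {
    "c42": {"label": "Customer", "name": "Ada"},
    "A1": {"label": "Product"},
    "B7": {"label": "Product"},
}
edges = [("c42", "ORDERED", "A1"), ("c42", "ORDERED", "B7")]


def skus_from_key_value(store, key):
    # The application must decode the blob itself.
    return sorted(json.loads(store[key])["orders"])


def skus_from_document(doc):
    return sorted(order["sku"] for order in doc["orders"])


def skus_from_column_family(cf, row_key):
    return sorted(c.split(":", 1)[1] for c in cf[row_key] if c.startswith("order:"))


def skus_from_graph(edges, customer):
    return sorted(dst for src, rel, dst in edges if src == customer and rel == "ORDERED")


answers = [
    skus_from_key_value(key_value, "customer:42"),
    skus_from_document(document),
    skus_from_column_family(column_family, "42"),
    skus_from_graph(edges, "c42"),
]
```

All four give the same answer to "what did this customer order?", but each shape makes different questions cheap. That is why the pattern should follow the data story.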
Graph Databases
Key Value Pair
Columnar
Sample Hive Statement
-- Columns are mapped onto comma-delimited text files when queried
CREATE EXTERNAL TABLE TaxRebateUsage (
  state string,
  zipcode string,
  agi_class int,
  n1 int,
  mars2 int,
  prep int,
  n2 int
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
STORED AS TEXTFILE;
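As a rough illustration of what a declaration like this does, the Python sketch below applies the same column list to comma-delimited text at read time. The two sample lines are invented, and turning an unparseable number into None is a simplification that only loosely mirrors how a text-backed table surfaces bad values.

```python
import csv
import io

# Column names and types taken from the TaxRebateUsage statement above.
SCHEMA = [
    ("state", str),
    ("zipcode", str),  # kept as text so leading zeros survive
    ("agi_class", int),
    ("n1", int),
    ("mars2", int),
    ("prep", int),
    ("n2", int),
]


def read_rows(text):
    """Apply SCHEMA to delimited text as it is read; the file itself is untouched."""
    for fields in csv.reader(io.StringIO(text)):
        row = {}
        for (name, cast), raw in zip(SCHEMA, fields):
            try:
                row[name] = cast(raw)
            except ValueError:
                row[name] = None  # unreadable value becomes a null, the row is kept
        yield row


# Hypothetical sample data, not taken from the original deck.
sample = "MA,01001,1,1530,260,890,2360\nMA,01002,2,n/a,110,370,980\n"
rows = list(read_rows(sample))
```

Nothing validated the second line when it was written; the problem shows up only when someone reads it.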
Sample JSON/MongoDB Notation
Sample FoundationDB Statement
Sample Cassandra Statement
Sample Vertica Statement
Sample Neo4j Statement
What else is different, now?
Maturing of high availability technologies
Maturing of ____ as a Service business models
RDBMS vendors adopting non-relational features
Open source software models
100s of database options
The Big Data Big Lies
Schemaless
• Schema on Read, not Schema on Write
• Polyschematic
Big
• New data stories
• New technologies
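"Schemaless" data still has schemas; they simply live in the data and in the code that reads it. A minimal Python sketch, using invented JSON events, shows both halves: discovering the shapes that are present (polyschematic) and interpreting them at read time (schema on read).

```python
import json
from collections import Counter

# Hypothetical events written by different producers with no shared schema.
raw_events = [
    '{"user": "ada", "action": "login"}',
    '{"user": "grace", "action": "purchase", "amount": 19.99}',
    '{"user_id": 7, "event": "login", "ts": "2014-08-01"}',
    '{"user": "linus", "action": "login"}',
]


def shape(doc):
    """The implicit schema of one document: its sorted field names."""
    return tuple(sorted(doc))


def discover_schemas(lines):
    """Count how many documents follow each implicit schema."""
    return Counter(shape(json.loads(line)) for line in lines)


def read_user(doc):
    """Schema on read: the reader decides how to interpret each shape."""
    if "user" in doc:
        return doc["user"]
    if "user_id" in doc:
        return f"id:{doc['user_id']}"
    return None


schemas = discover_schemas(raw_events)
users = [read_user(json.loads(line)) for line in raw_events]
```

Four documents yield three distinct schemas here. Someone still has to know that "user" and "user_id" mean the same thing, which is modeling work whether or not a diagram exists.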
Importing Structures
10 Tips for Modeling in a Hybrid World
1. Models require a modeler
2. Data modeling tools are essential
3. There are many types of data models: know which ones you need
4. Modeling does not have to happen at the same time in every project. It should happen at the right time
5. Modeling is not just schema design. Think outside the boxes and lines
10 Tips for Modeling in a Hybrid World
6. A data model is much more than a diagram
7. You will need training.
8. Team members may not understand modeling. They will need training
9. NoSQL is not one thing. Learn many patterns
10. Modern data architectures are likely hybrid solutions. You can’t just support one part.
Modern Data Modeler Involvement
Project Initiation
Architecture and Infrastructure Design
SW Requirements
Development
Deployment
What does this mean for data modelers?
There will be jobs for traditional, ERD, relational modelers…
…just like there are still jobs for RPG and COBOL programmers
All data has a data story. Many data stories.
A good modeler is an architect at heart – finding the right solution for the data story.
Business Intelligence Journal
Look for the September issue article on Modern Data Architectures
Thank You!
www.infoadvisors.com
www.datamodel.com
www.dataversity.net
#TEAMDATA