nosql - we know what it isn't, but what is it?

Upload: kevin-lawver

Post on 01-Nov-2014

DESCRIPTION

A presentation given at Refresh Savannah on the wonderful world of NoSQL, what it includes, what it means and where using new-style databases makes sense. There's also a demo that contrasts doing tags with MongoDB vs. doing it with a traditional RDBMS.

TRANSCRIPT

Page 1: NoSQL - We know what it isn't, but what is it?

NoSQL

Now we know what it’s not... what is it?

What are we running from?

• Relational databases are the de facto standard for storing data in a web application.

• A lot of times, that data isn’t really relational at all.

• RDBMSs have lots of rules that can impact performance.

Rules? What Rules?

• Classic relational databases follow the ACID rules:

• Atomicity

• Consistency

• Isolation

• Durability

Atomicity

• If any part of the update fails, it all fails.

• Databases have to be able to lock tables and rows for operations, which can block or delay other incoming requests.
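The all-or-nothing rule can be sketched in plain Ruby. This is only an illustration: `transfer` and the `accounts` hash are hypothetical names, and a real database gets the same effect with locks and logging, not by copying a hash.

```ruby
# Hypothetical in-memory "table"; a real RDBMS keeps this on disk.
def transfer(accounts, from, to, amount)
  snapshot = accounts.dup            # remember the state before the update
  accounts[from] -= amount
  raise ArgumentError, "insufficient funds" if accounts[from] < 0
  accounts[to] += amount
  true
rescue ArgumentError
  accounts.replace(snapshot)         # any failure rolls back every change
  false
end

accounts = { "alice" => 100, "bob" => 20 }
transfer(accounts, "alice", "bob", 50)    # both balances change
transfer(accounts, "alice", "bob", 500)   # fails, so neither balance changes
```

The second call debits the first account before it notices the problem, and the rollback is what keeps that half-finished update from being visible.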

Consistency

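• After a transaction, the data must be in a valid state that satisfies every integrity constraint. With replicas, that also means all copies of the data must be consistent with each other (my interpretation).

• Replication across lots of shards is expensive, especially if there’s locking involved.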

Isolation

• Data involved in a transaction must be inaccessible to other operations.

• Remember the thing about locked rows and tables?

• It’s a bummer.

Durability

• Once a user is notified that a transaction has completed, the data must persist (even through a crash) and be accessible, with all integrity constraints met.

I come not to bury MySQL...

• Relational databases are great for a lot of uses.

• If your data is actually relational, you need transactions and joins, and you have a limited number of data types, then an RDBMS will work for you.

But...

• RDBMSs have been treated like hammers and used for things they’re not good at and weren’t designed for.

• Like the web...

Thus were born...

• Key-Value Stores

• Wide-Column Stores

• Document Stores/Databases

• Graph Databases

All thrown together & clumsily dubbed...

NoSQL

Which, despite its negative sound, supposedly means:

“Not Only SQL”

Yeah, I don’t believe it either...

Key-Value

Just what it sounds like. You set a Key to a Value and can then retrieve it.

Key-Value Benefits

• Simple

• High performance (usually) because there are no transactions or relations, so it’s a simple bucket and lookup.

• Extremely flexible

• Commonly used as caches in front of slower resources (like MySQL - bazinga!)
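Here is a minimal sketch of the set/get idea and of the cache-in-front-of-something-slow pattern, in plain Ruby. `KeyValueStore` and `fetch` are hypothetical names for illustration and are not the memcached or Redis client API.

```ruby
# A minimal key-value store: set a key, get it back.
class KeyValueStore
  def initialize
    @data = {}
  end

  def set(key, value)
    @data[key] = value
  end

  def get(key)
    @data[key]
  end

  # Read-through: return the cached value, or run the block
  # (the slow lookup) once and remember its result.
  def fetch(key)
    return @data[key] if @data.key?(key)
    @data[key] = yield
  end
end

cache = KeyValueStore.new
slow_lookups = 0
2.times do
  cache.fetch("user:42") do
    slow_lookups += 1              # stands in for a slow MySQL query
    { "name" => "Kevin" }
  end
end
# slow_lookups is 1 here: the second call was served from the cache.
```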

Popular Players

• memcached - in memory only, extremely efficient hashing algorithm allows you to scale easily to hundreds of nodes.

• Redis - persistent, slightly more complex than memcached (has support for data structures like lists, sets and hashes) but still highly performant.

• Riak - The Rails Machine guys love it. Jesse?
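One common way clients spread keys over many cache nodes is a consistent hash ring. The sketch below is an assumption about the general technique, not the code of any particular memcached client; `HashRing`, `node_for` and the node names are hypothetical.

```ruby
require "zlib"

# Hypothetical client-side ring: each node gets several points on a
# circle of hash values, and a key belongs to the first point at or
# after its own hash, wrapping around at the end.
class HashRing
  def initialize(nodes, points_per_node = 100)
    @ring = {}
    nodes.each do |node|
      points_per_node.times { |i| @ring[Zlib.crc32("#{node}-#{i}")] = node }
    end
    @sorted_points = @ring.keys.sort
  end

  def node_for(key)
    hash = Zlib.crc32(key)
    point = @sorted_points.bsearch { |p| p >= hash } || @sorted_points.first
    @ring[point]
  end
end

ring = HashRing.new(%w[cache1:11211 cache2:11211 cache3:11211])
ring.node_for("user:42")   # the same key always maps to the same node
```

Adding a node only changes the owner of keys near that node's points, so most cached keys stay where they are. Plain `hash % node_count` would reshuffle nearly everything.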

My Uses

• memcached: Read-through cache for Rails with cache-money.

• redis: persistent cache for results from our algorithm, partitioned by version and instance.

Wide Column

• Family of databases modeled on either Google’s BigTable or Amazon’s Dynamo.

• Pick two out of three from the CAP theorem in order to get horizontal scalability.

• Data stored by column instead of by row.
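To picture "by column instead of by row", here is the same data laid out both ways in plain Ruby. The records are made up, `to_columns` is a hypothetical helper, and real wide-column stores group columns into families on disk rather than using in-memory arrays.

```ruby
# Row-oriented: one hash per record (illustrative data).
rows = [
  { id: 1, artist: "Artist A", plays: 12 },
  { id: 2, artist: "Artist B", plays: 40 },
  { id: 3, artist: "Artist A", plays: 7 }
]

# Column-oriented: one array per column, positions line up across columns.
def to_columns(rows)
  columns = {}
  rows.each do |row|
    row.each { |name, value| (columns[name] ||= []) << value }
  end
  columns
end

columns = to_columns(rows)
# Summing one column only touches that column's values.
total_plays = columns[:plays].sum
```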

CAP?

• Consistency: All clients always have the same view of the data.

• Availability: Each client can always read and write.

• Partition Tolerance: The system works well despite physical network partitions.

Use cases

• Making sense out of large amounts of data where you know your query scenario ahead of time.

• Large = 100s of millions of records.

• Data-mining log files and other sources of similar data.

Big Players

• HBase

• Cassandra

• Hypertable

• Amazon’s SimpleDB

• Google’s BigTable (the granddaddy of all of them)

Graph Databases

• Store nodes, edges and properties

• Think of them as Things, Connections and Properties

• Good for storing properties and relationships.

• Honestly, I don’t fully understand them... anyone?
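A tiny in-memory sketch of nodes, edges and properties may help. Everything here (the people, the `neighbors` helper, the edge labels) is hypothetical, and a real graph database like Neo4j has its own storage and query language.

```ruby
# Nodes (things) and edges (connections), both carrying properties.
nodes = {
  alice:   { type: "person",   name: "Alice" },
  bob:     { type: "person",   name: "Bob" },
  mongodb: { type: "database", name: "MongoDB" }
}
edges = [
  { from: :alice, to: :bob,     label: "knows" },
  { from: :alice, to: :mongodb, label: "uses" },
  { from: :bob,   to: :mongodb, label: "uses" }
]

# Follow outgoing edges with a given label from one node.
def neighbors(edges, node, label)
  edges.select { |e| e[:from] == node && e[:label] == label }
       .map { |e| e[:to] }
end

# "Which databases do the people Alice knows use?" is a two-hop walk.
friends   = neighbors(edges, :alice, "knows")
their_dbs = friends.flat_map { |f| neighbors(edges, f, "uses") }.uniq
```

The same question in SQL needs a join per hop, which is the kind of query graph databases are built to make cheap.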

The Players

• Neo4j

• FlockDB

• HyperGraphDB

Document Stores

• Short on relationships, tall on rich data types.

• Big on eventual consistency and flexible schemas.

• Hybrid of traditional RDBMS and Key-Value stores.

Use Cases

• Content Management Systems

• Applications with rapid partial updates

• Anything you would normally use an RDBMS for but that doesn’t need joins or transactions.

The Players

• CouchDB

• MongoDB

• Terrastore

MongoDB

• Support for rich data types: arrays, hashes, embedded documents, etc

• Support for adding and removing things from arrays and embedded documents (addToSet, for example).

• Map/Reduce support and strong indexes

• Regular expression support in queries
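To make those features concrete, here is a rough plain-Ruby imitation of what they do to a document. The helpers `add_to_set`, `pull` and `find_by_title` are hypothetical stand-ins operating on a Hash, not calls into the MongoDB driver.

```ruby
# A document is a nested structure; this plain Hash stands in for one.
post = {
  "title"  => "NoSQL at Refresh Savannah",
  "tags"   => ["nosql"],                # array field
  "author" => { "name" => "Kevin" }     # embedded document
}

# Rough equivalent of addToSet: append only if not already present.
def add_to_set(doc, field, value)
  doc[field] << value unless doc[field].include?(value)
  doc
end

# Rough equivalent of pull: remove every occurrence of the value.
def pull(doc, field, value)
  doc[field].delete(value)
  doc
end

# Regular expression match on a field, as a query filter would do.
def find_by_title(docs, pattern)
  docs.select { |doc| doc["title"] =~ pattern }
end

add_to_set(post, "tags", "mongodb")
add_to_set(post, "tags", "mongodb")     # no duplicate is added
pull(post, "tags", "nosql")
find_by_title([post, { "title" => "MySQL tuning", "tags" => [] }], /nosql/i)
```

In MongoDB itself these updates happen on the server against one field, so the whole document does not need to be read and rewritten.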

Design Considerations

• Embedded Documents - Use only if the embedded document will always be selected with the parent.

• Indexes - MongoDB punishes you much earlier for missing indexes than MySQL.

• Document size - As of this talk, documents are limited to 4MB (later MongoDB releases raised this to 16MB), which should be large enough, but if it’s not...

Real-World MongoDB

• We use MongoDB heavily at MIS.

• Statistics application and reporting

• Top-secret new application

• Web crawler and indexer

• CMS

Real-World Example

Let’s do tags. Everything is taggable now, right?

The MySQL Way

Schema

And to get a “thing’s” tags?

SELECT `tags`.* FROM `tags`
  INNER JOIN `taggings` ON `tags`.id = `taggings`.tag_id
  WHERE ((`taggings`.taggable_id = 237)
    AND (`taggings`.taggable_type = 'Song'))
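To see what that join is doing, here is the same lookup over plain Ruby arrays. The table shapes are inferred from the query. The `name` column, the sample rows and the `tags_for` helper are assumptions for illustration.

```ruby
# Tables as arrays of hashes. Columns not named in the query
# (for example :name) are assumptions.
tags = [
  { id: 1, name: "rock" },
  { id: 2, name: "live" }
]
taggings = [
  { tag_id: 1, taggable_id: 237, taggable_type: "Song" },
  { tag_id: 2, taggable_id: 237, taggable_type: "Song" },
  { tag_id: 1, taggable_id: 9,   taggable_type: "Album" }
]

# Same shape as the SQL: filter the join table, then look up each tag.
def tags_for(tags, taggings, id, type)
  tag_ids = taggings
    .select { |t| t[:taggable_id] == id && t[:taggable_type] == type }
    .map { |t| t[:tag_id] }
  tags.select { |tag| tag_ids.include?(tag[:id]) }
end

song_tags = tags_for(tags, taggings, 237, "Song").map { |t| t[:name] }
# song_tags is ["rock", "live"]
```

Two tables and a join table, just to answer "what are this thing's tags?".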

Yuck!

That’s a lot of pain for something so simple.

And I didn’t even show you finding things with tag “x”.

Or how to set and unset tags on a “thing”.

Ouch.

The MongoDB Way

Using MongoMapper and Rails 3

class Post
  include MongoMapper::Document

  key :title, String
  key :body, String
  key :tags, Array

  ensure_index :tags
end

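Let’s Make This Easy...

# Add a tag in memory and, for saved records, push it to MongoDB.
def add_tag(tag)
  tag = Post.clean_tag(tag)
  # Skip duplicates locally too, so a later save can't write one.
  self.tags << tag unless self.tags.include?(tag)
  self.add_to_set(:tags => tag) unless self.new_record?
end

# Remove a tag in memory and, for saved records, pull it from MongoDB.
def remove_tag(tag)
  tag = Post.clean_tag(tag)
  self.tags.delete(tag)
  self.pull(:tags => tag) unless self.new_record?
end

# Normalize one tag: trim, lowercase, spaces to hyphens, drop the rest.
def self.clean_tag(str)
  str.strip.downcase.gsub(" ", "-").gsub(/[^a-z0-9-]/, "")
end

# Split a comma-separated string into normalized tags.
def self.clean_tags(str)
  str.split(",").map { |t| self.clean_tag(t) }
end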
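If you’re reading this later and missed the console demo, the two cleaning helpers are easy to try without MongoMapper. These are standalone copies of the same logic and the sample strings are made up.

```ruby
# Standalone copies of the two helpers, no MongoMapper needed.
def clean_tag(str)
  str.strip.downcase.gsub(" ", "-").gsub(/[^a-z0-9-]/, "")
end

def clean_tags(str)
  str.split(",").map { |t| clean_tag(t) }
end

clean_tag("  Ruby on Rails ")        # => "ruby-on-rails"
clean_tags("NoSQL, Mongo DB, C++")   # => ["nosql", "mongo-db", "c"]
```

Note that anything outside a-z, 0-9 and hyphens is dropped, so "C++" and "C" end up as the same tag.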

Demo Time

Sorry if you’re looking at this later, but it’s console time!

Why I Love MongoDB

• Document model fits how I build web apps.

• For most apps, I don’t need transactions.

• Eventual consistency is actually OK.

• Partial updates and arrays make things that are a pain in SQL-land absolutely painless.

• It’s just smart enough without getting in the way.

What’s NoSQL, really?

• The right tool for the job.

• We’ve got lots of options for storing application data.

• The key is picking the one that solves our real problem.

• And if an RDBMS is the right tool, that’s OK too.

Questions?

Thanks!

• Kevin Lawver

• @kplawver

[email protected]

• http://kevinlawver.com