Java Web Development with MongoDB (presented at Devoxx 2010)


DESCRIPTION

In this presentation, we will try to answer the following:
- What is a document and a document database?
- How do replication and sharding enable me to scale my application?
- How does Java web development change when using MongoDB?
- How do I deploy my application with MongoDB?

TRANSCRIPT

Alvin Richards alvin@10gen.com

Topics

Overview
Data modeling
Replication & Sharding
Developing with Java
Deployment

Drinking from the fire hose

Part One: MongoDB Overview

Strong adoption of MongoDB

90,000 Database downloads per month

Over 1,000 Production Deployments

Web 2.0 companies started out using this, but now:
- enterprises
- financial industries

3 reasons:
- Performance: large numbers of readers / writers
- Large data volume
- Agility (ease of development)

NoSQL really means: non-relational, next-generation operational datastores and databases

RDBMS (Oracle, MySQL)

Past: one size fits all

RDBMS (Oracle, MySQL)

New Gen. OLAP (Vertica, Aster, Greenplum)

Present: business intelligence and analytics is now its own segment.

RDBMS (Oracle, MySQL)

New Gen. OLAP (Vertica, Aster, Greenplum)

Non-relational Operational Stores ("NoSQL")

Future: we claim the NoSQL segment will be:
- large
- not fragmented
- "platformitize-able"

Philosophy: maximize features - up to the "knee" in the curve, then stop

[Chart: scalability & performance vs. depth of functionality - memcached, key/value stores, and RDBMSs sit at different points on the curve]

Horizontally Scalable Architectures

+ no joins, no complex transactions

New Data Models: improved ways to develop

Platform and Language support

MongoDB is implemented in C++ for best performance

Platforms (32/64 bit):
• Windows
• Linux, Mac OS X, FreeBSD, Solaris

Language drivers for:
• Java
• Ruby / Ruby-on-Rails
• C#
• C / C++
• Erlang
• Python, Perl, JavaScript
• Scala
• others...

Ease of development is a surprisingly big benefit: faster to code, faster to change, avoid upgrades and scheduled downtime; more predictable performance; fast single-server performance means the developer spends less time manually coding around the database. Bottom line: usually, developers like it much better after trying.

Part Two: Data Modeling in MongoDB

So why model data?

A brief history of normalization:
• 1970: E.F. Codd introduces 1st Normal Form (1NF)
• 1971: E.F. Codd introduces 2nd and 3rd Normal Form (2NF, 3NF)
• 1974: Codd & Boyce define Boyce/Codd Normal Form (BCNF)
• 2002: Date, Darwen, Lorentzos define 6th Normal Form (6NF)

Goals:
• Avoid anomalies when inserting, updating or deleting
• Minimize redesign when extending the schema
• Make the model informative to users
• Avoid bias towards a particular style of query

* Source: Wikipedia

The real benefit of relational

• Before relational: data and logic combined

• After relational:
  • Separation of concerns
  • Data modeled independent of logic
  • Logic freed from concerns of data design

• MongoDB continues this separation

Relational made normalized data look like this

Document databases make normalized data look like this

Terminology

RDBMS          | MongoDB
Table          | Collection
Row(s)         | JSON Document
Index          | Index
Join           | Embedding & Linking
Partition      | Shard
Partition Key  | Shard Key
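To make the table concrete, here is a minimal sketch in plain Java (class and field names are illustrative, not from the talk): a blog post that would span two relational tables, posts and comments joined by a key, becomes one nested document, modeled as nested Maps just as the Java driver does later in this talk.

```java
import java.util.*;

public class RowVsDocument {
    // Build the document that replaces a posts row plus its comments rows:
    // the comments live inside the post, so no join is needed to read them.
    public static Map<String, Object> postDocument() {
        Map<String, Object> comment = new HashMap<String, Object>();
        comment.put("author", "Kyle");
        comment.put("text", "great book");

        Map<String, Object> post = new HashMap<String, Object>();
        post.put("author", "Hergé");
        post.put("text", "Destination Moon");
        post.put("tags", Arrays.asList("comic", "adventure"));    // embedded array
        post.put("comments", Collections.singletonList(comment)); // embedded documents
        return post;
    }

    public static void main(String[] args) {
        System.out.println(postDocument());
    }
}
```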

DB Considerations: How can we manipulate this data?

• Dynamic Queries

• Secondary Indexes

• Atomic Updates

• Map Reduce

Considerations
• No joins
• Document writes are atomic

Access Patterns ?

• Read / Write Ratio

• Types of updates

• Types of queries

• Data life-cycle

So today’s example will use...

Design Session

Design documents that simply map to your application:

post = {author: "Hergé",
        date: new Date(),
        text: "Destination Moon",
        tags: ["comic", "adventure"]}

>db.posts.save(post)

>db.posts.find()

{ _id : ObjectId("4c4ba5c0672c685e5e8aabf3"),
  author : "Hergé",
  date : "Sat Jul 24 2010 19:47:11 GMT-0700 (PDT)",
  text : "Destination Moon",
  tags : [ "comic", "adventure" ] }

Notes:
• _id must be unique, but can be anything you'd like
• MongoDB will generate a default _id if one is not supplied

Find the document

Secondary index for “author”

// 1 means ascending, -1 means descending

>db.posts.ensureIndex({author: 1})

>db.posts.find({author: 'Hergé'}) { _id : ObjectId("4c4ba5c0672c685e5e8aabf3"), author : "Hergé", ... }

Add an index, find via the index

Verifying indexes exist

>db.system.indexes.find()

// Index on ID { name : "_id_", ns : "test.posts", key : { "_id" : 1 } }

// Index on author { _id : ObjectId("4c4ba6c5672c685e5e8aabf4"), ns : "test.posts", key : { "author" : 1 }, name : "author_1" }

Query operators

Conditional operators: $lt, $lte, $gt, $gte, $ne, $in, $nin, $mod, $all, $size, $exists, $type, ...

// find posts with any tags >db.posts.find({tags: {$exists: true}})


Regular expressions:
// posts where author starts with h
>db.posts.find({author: /^h/i})


Counting:
// posts written by Hergé
>db.posts.find({author: "Hergé"}).count()

Extending the Schema

new_comment = {author: "Kyle",
               date: new Date(),
               text: "great book"}

>db.posts.update({_id: "..."},
                 {$push: {comments: new_comment},
                  $inc:  {comments_count: 1}})

{ _id : ObjectId("4c4ba5c0672c685e5e8aabf3"),
  author : "Hergé",
  date : "Sat Jul 24 2010 19:47:11 GMT-0700 (PDT)",
  text : "Destination Moon",
  tags : [ "comic", "adventure" ],
  comments_count: 1,
  comments : [
    { author : "Kyle",
      date : "Sat Jul 24 2010 20:51:03 GMT-0700 (PDT)",
      text : "great book" }
  ]}

Extending the Schema

// create index on nested documents: >db.posts.ensureIndex({"comments.author": 1})

>db.posts.find({"comments.author": "Kyle"})


// find last 5 posts: >db.posts.find().sort({date:-1}).limit(5)


// most commented post: >db.posts.find().sort({comments_count:-1}).limit(1)

When sorting, check if you need an index


Explain a query plan:

> db.blogs.find({author: 'Hergé'}).explain()
{
  "cursor" : "BtreeCursor author_1",
  "nscanned" : 1,
  "nscannedObjects" : 1,
  "n" : 1,
  "millis" : 5,
  "indexBounds" : {
    "author" : [ [ "Hergé", "Hergé" ] ]
  }
}

Watch for full table scans

> db.blogs.find({text: 'Destination Moon'}).explain()
{
  "cursor" : "BasicCursor",
  "nscanned" : 1,
  "nscannedObjects" : 1,
  "n" : 1,
  "millis" : 0,
  "indexBounds" : { }
}

Map Reduce

Map reduce: count tags

mapFunc = function() {
  this.tags.forEach(function(z) { emit(z, {count: 1}); });
}

reduceFunc = function(k, v) {
  var total = 0;
  for (var i = 0; i < v.length; i++) { total += v[i].count; }
  return {count: total};
}

res = db.posts.mapReduce(mapFunc, reduceFunc)

>db[res.result].find() { _id : "comic", value : { count : 1 } } { _id : "adventure", value : { count : 1 } }
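The same counting logic can be sketched in plain Java with the driver and server out of the picture (the method and input shape are invented for illustration): each post contributes an emit of (tag, 1), and the totals are summed per tag, mirroring mapFunc and reduceFunc above.

```java
import java.util.*;

public class TagCount {
    // "map": emit (tag, 1) for each tag of each post;
    // "reduce": sum the emitted counts per tag.
    public static Map<String, Integer> countTags(List<List<String>> postsTags) {
        Map<String, Integer> counts = new HashMap<String, Integer>();
        for (List<String> tags : postsTags) {   // one inner list per post
            for (String tag : tags) {           // emit(tag, {count: 1})
                Integer total = counts.get(tag);
                counts.put(tag, total == null ? 1 : total + 1);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        List<List<String>> posts = Arrays.asList(
            Arrays.asList("comic", "adventure"),
            Arrays.asList("comic"));
        System.out.println(countTags(posts));
    }
}
```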

Group

• Equivalent to a Group By in SQL

• Specify the attributes to group the data on

• Process the results in a Reduce function

Group

cmd = { key: { "author": true },
        initial: { count: 0 },
        reduce: function(obj, prev) { prev.count++; } };
result = db.posts.group(cmd);

[ { "author" : "Hergé", "count" : 1 },
  { "author" : "Kyle",  "count" : 3 } ]

Review

So far:
- Started out with a simple schema
- Queried data
- Evolved the schema
- Queried / updated the data some more

Single Table Inheritance

>db.shapes.find() { _id: ObjectId("..."), type: "circle", area: 3.14, radius: 1} { _id: ObjectId("..."), type: "square", area: 4, d: 2} { _id: ObjectId("..."), type: "rect", area: 10, length: 5, width: 2}

// find shapes where radius > 0 >db.shapes.find({radius: {$gt: 0}})

// create index >db.shapes.ensureIndex({radius: 1})

One to Many

- Embedded Array / Array Keys
  - $slice operator to return a subset of the array
  - some queries are hard, e.g. find the latest comments across all documents


- Embedded tree
  - Single document
  - Natural
  - Hard to query


- Normalized (2 collections)
  - most flexible
  - more queries

Many - Many

Example:
- Product can be in many categories
- Category can have many products

Products:
- product_id

Category:
- category_id

Product_Categories:
- product_id
- category_id

products:
{ _id: ObjectId("4c4ca23933fb5941681b912e"),
  name: "Destination Moon",
  category_ids: [ ObjectId("4c4ca25433fb5941681b912f"),
                  ObjectId("4c4ca25433fb5941681b92af") ] }

Many - Many

products:
{ _id: ObjectId("4c4ca23933fb5941681b912e"),
  name: "Destination Moon",
  category_ids: [ ObjectId("4c4ca25433fb5941681b912f"),
                  ObjectId("4c4ca25433fb5941681b92af") ] }

categories:
{ _id: ObjectId("4c4ca25433fb5941681b912f"),
  name: "Adventure",
  product_ids: [ ObjectId("4c4ca23933fb5941681b912e"),
                 ObjectId("4c4ca30433fb5941681b9130"),
                 ObjectId("4c4ca30433fb5941681b913a") ] }


// All categories for a given product
>db.categories.find({product_ids: ObjectId("4c4ca23933fb5941681b912e")})


products:
{ _id: ObjectId("4c4ca23933fb5941681b912e"),
  name: "Destination Moon",
  category_ids: [ ObjectId("4c4ca25433fb5941681b912f"),
                  ObjectId("4c4ca25433fb5941681b92af") ] }

categories:
{ _id: ObjectId("4c4ca25433fb5941681b912f"),
  name: "Adventure" }

Alternative


// All products for a given category
>db.products.find({category_ids: ObjectId("4c4ca25433fb5941681b912f")})


// All categories for a given product
>product = db.products.findOne({_id: some_id})
>db.categories.find({_id: {$in: product.category_ids}})


Trees

Full tree in document:

{ comments: [
    { author: "Kyle", text: "...",
      replies: [
        { author: "Fred", text: "...", replies: [] }
      ]}
]}

Pros: Single Document, Performance, Intuitive

Cons: Hard to search, Partial Results, 4MB limit

   

Trees: Parent Links
- Each node is stored as a document
- Contains the id of the parent

Child Links
- Each node contains the ids of its children
- Can support graphs (multiple parents per child)

Array of Ancestors
- Store the ancestors of each node

{ _id: "a" }
{ _id: "b", ancestors: [ "a" ], parent: "a" }
{ _id: "c", ancestors: [ "a", "b" ], parent: "b" }
{ _id: "d", ancestors: [ "a", "b" ], parent: "b" }
{ _id: "e", ancestors: [ "a" ], parent: "a" }
{ _id: "f", ancestors: [ "a", "e" ], parent: "e" }
{ _id: "g", ancestors: [ "a", "b", "d" ], parent: "d" }


// find all descendants of b:
>db.tree2.find({ancestors: "b"})


// find all ancestors of f:
>ancestors = db.tree2.findOne({_id: "f"}).ancestors
>db.tree2.find({_id: {$in: ancestors}})
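Both queries can be mimicked in memory to see why the ancestors array works. A plain-Java sketch (class and method names are made up) holding the same seven documents in a map:

```java
import java.util.*;

public class AncestorTree {
    // node id -> its ancestors array, mirroring the documents above
    static final Map<String, List<String>> ANCESTORS = new HashMap<String, List<String>>();
    static {
        ANCESTORS.put("a", Arrays.<String>asList());
        ANCESTORS.put("b", Arrays.asList("a"));
        ANCESTORS.put("c", Arrays.asList("a", "b"));
        ANCESTORS.put("d", Arrays.asList("a", "b"));
        ANCESTORS.put("e", Arrays.asList("a"));
        ANCESTORS.put("f", Arrays.asList("a", "e"));
        ANCESTORS.put("g", Arrays.asList("a", "b", "d"));
    }

    // like find({ancestors: "b"}): every node whose ancestors array contains the id
    public static Set<String> descendantsOf(String id) {
        Set<String> result = new TreeSet<String>();
        for (Map.Entry<String, List<String>> e : ANCESTORS.entrySet())
            if (e.getValue().contains(id)) result.add(e.getKey());
        return result;
    }

    // like find({_id: {$in: ancestors}}): just read the node's own array
    public static List<String> ancestorsOf(String id) {
        return ANCESTORS.get(id);
    }

    public static void main(String[] args) {
        System.out.println(descendantsOf("b")); // [c, d, g]
        System.out.println(ancestorsOf("f"));   // [a, e]
    }
}
```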

findAndModify

Queue example:

// find the highest priority job and mark it in progress
job = db.jobs.findAndModify({
  query: {inprogress: false},
  sort: {priority: -1},
  update: {$set: {inprogress: true, started: new Date()}},
  new: true})

Part Three: Replication & Sharding

Scaling

• Data size only goes up
• Operations/sec only go up
• Vertical scaling is limited
• Hard to scale vertically in the cloud
• Can scale wider than higher

What is scaling?Well - hopefully for everyone here.

Traditional Horizontal Scaling

• read only slaves• caching• custom partitioning code

Scaling isn't new; sharding isn't either. Manual re-balancing is painful at best.

New methods of Scaling

• relational database clustering• consistent hashing (Dynamo)• range based partitioning (BigTable/PNUTS)

Read Scalability: Replication

[Diagram: all writes go to the Primary of Replica Set 1; reads can go to the Primary or either Secondary]

Basics
• MongoDB replication is a bit like MySQL replication: asynchronous master/slave at its core
• Variations:
  - Master / slave
  - Replica Pairs (deprecated - use replica sets)
  - Replica Sets

• A cluster of N servers
• Any (one) node can be primary
• Consensus election of the primary
• Automatic failover
• Automatic recovery
• All writes go to the primary
• Reads can go to the primary (default) or a secondary

Replica Sets

Replica Sets – Design Concepts

1. A write is durable once it is available on a majority of members

2. Writes may be visible before a cluster wide commit has been completed

3. On a failover, if data has not been replicated from the primary, the data is dropped (see #1).

Replica Set: Establishing (Members 1, 2, 3 come online)

Replica Set: Electing primary (Member 2 is elected PRIMARY)

Replica Set: Failure of master (Member 2 goes DOWN; Members 1 and 3 negotiate a new master; Member 3 becomes PRIMARY)

Replica Set: Reconfiguring (Member 2 DOWN, Member 3 PRIMARY)

Replica Set: Member recovers (Member 2 RECOVERING, Member 3 PRIMARY)

Replica Set: Active (all members up, Member 3 PRIMARY)

Set Member Types
- Normal (priority == 1)
- Passive (priority == 0)
- Arbiter (no data, but can vote)

Write Scalability: Sharding

[Diagram: writes and reads are routed by shard key to Replica Set 1 (key range 0..30), Replica Set 2 (key range 31..60), or Replica Set 3 (key range 61..100); each replica set has one Primary and two Secondaries]
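Conceptually, the routing that makes this work is a lookup from shard-key value to key range. A toy sketch (ranges taken from the diagram above, class and shard names invented):

```java
import java.util.*;

public class RangeRouter {
    // chunk boundaries: shard i owns keys in [LOWER[i], next boundary)
    // ranges match the diagram: 0..30, 31..60, 61..100
    static final int[] LOWER = {0, 31, 61};
    static final String[] SHARDS = {"replicaSet1", "replicaSet2", "replicaSet3"};

    // what mongos does conceptually for a query on the shard key:
    // scan the chunk table (metadata it caches from the config servers)
    // and route to exactly one shard
    public static String shardFor(int key) {
        int i = LOWER.length - 1;
        while (i > 0 && key < LOWER[i]) i--;
        return SHARDS[i];
    }

    public static void main(String[] args) {
        System.out.println(shardFor(15));  // replicaSet1
        System.out.println(shardFor(42));  // replicaSet2
        System.out.println(shardFor(99));  // replicaSet3
    }
}
```

A query that does not include the shard key cannot use this table, which is why the talk describes it as scatter-gather across all shards.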

Sharding

• Scale horizontally for data size, index size, write and consistent read scaling

• Distribute databases, collections or objects in a collection

• Auto-balancing, migrations and management happen with no downtime

• Replica Sets for inconsistent read scaling

Sharding

• Choose how you partition data
• Can convert from a single master to a sharded system with no downtime
• Same features as a non-sharded single master
• Fully consistent

Range Based

• a collection is broken into chunks by range
• chunks default to 200MB or 100,000 objects

Architecture

[Diagram: clients connect to one or more mongos routers; the shards are groups of mongod processes; three mongod config servers hold the cluster metadata]

Config Servers

• Hold the metadata of where chunks are located
• 1 or 3 of them (3 for availability)
• Changes are made with two-phase commit
• If a majority are down, the metadata goes read-only
• The system stays online as long as at least one of the three is up

Shards

• Hold the actual data
• Can be a single master, master/slave, or a replica set
• Replica sets give sharding + full auto-failover
• Regular mongod processes

mongos

• Sharding router (or switch)
• Acts just like a mongod to clients
• Can have 1 or as many as you want
• Can run on the appserver so no extra network traffic

Writes

• Inserts: require the shard key, routed
• Removes: routed and/or scattered
• Updates: routed or scattered

Queries

• By shard key: routed
• Sorted by shard key: routed in order
• By non shard key: scatter gather
• Sorted by non shard key: distributed merge sort

Operations

• split: breaking a chunk into 2
• migrate: move a chunk from 1 shard to another
• balancing: moving chunks automatically to keep the system in balance
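A split can be pictured as pure bookkeeping on ranges. This sketch (types invented; the split point is simplified to the range midpoint, whereas the real server splits near the median of the actual keys) shows that no data moves during a split, only metadata changes:

```java
import java.util.*;

public class ChunkSplit {
    // a chunk is a key range [min, max) assigned to one shard
    static class Chunk {
        final int min, max;
        final String shard;
        Chunk(int min, int max, String shard) {
            this.min = min; this.max = max; this.shard = shard;
        }
    }

    // "split": break one chunk into two at a split point;
    // both halves stay on the same shard until a later "migrate"
    public static List<Chunk> split(Chunk c) {
        int mid = c.min + (c.max - c.min) / 2;
        return Arrays.asList(new Chunk(c.min, mid, c.shard),
                             new Chunk(mid, c.max, c.shard));
    }

    public static void main(String[] args) {
        List<Chunk> halves = split(new Chunk(0, 100, "shard1"));
        System.out.println(halves.get(0).min + ".." + halves.get(0).max); // 0..50
        System.out.println(halves.get(1).min + ".." + halves.get(1).max); // 50..100
    }
}
```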

Part Four: Java Development

Library Choices

• Raw MongoDB driver
  - Map<String, Object> view of objects
  - Rough but dynamic

• Morphia (type-safe mapper)
  - POJOs
  - Annotation based (similar to JPA)
  - Syntactic sugar and helpers

• Others
  - Code generators, other JVM languages

MongoDB Java Driver

• BSON package
  - Types
  - Encode/Decode
  - DBObject (Map<String, Object>)
  - Nested Maps
  - Directly encoded to the binary format (BSON)

• MongoDB package
  - BasicDBObject / BasicDBObjectBuilder
  - DB / DBCollection
  - DBQuery / DBCursor

BSON Package Types
- int and long
- Array / ArrayList
- String
- byte[] (binData)
- Double (IEEE 754 FP)
- Date (millis since epoch)
- Null
- Boolean
- JavaScript String
- Regex

MongoDB Package

• Mongo
  - Connection, thread safe
  - WriteConcern*

• DB
  - Auth, collections
  - getLastError(), command(), eval()
  - requestStart / requestDone

• DBCollection
  - insert / save / find / remove / update / findAndModify
  - ensureIndex

Simple Example

DBCollection coll = new Mongo().getDB("blogdb").getCollection("posts");

ArrayList<String> tags = new ArrayList<String>();
tags.add("comic");
tags.add("adventure");

coll.save(BasicDBObjectBuilder.start()
    .add("author", "Hergé")
    .add("text", "Destination Moon")
    .add("date", new Date())
    .add("tags", tags)
    .get());

Simple Example, Again

DBCollection coll = new Mongo().getDB("blogdb").getCollection("posts");

ArrayList<String> tags = new ArrayList<String>();
tags.add("comic");
tags.add("adventure");

Map<String, Object> fields = new HashMap<String, Object>();
fields.put("author", "Hergé");
fields.put("text", "Destination Moon");
fields.put("date", new Date());
fields.put("tags", tags);

coll.insert(new BasicDBObject(fields));

DBObject <-> (B/J)SON

{author: "Hergé", text: "Destination Moon", date: ...}

BasicDBObjectBuilder builder = new BasicDBObjectBuilder()
    .append("author", "Hergé")
    .append("text", "Destination Moon")
    .append("date", new Date());
DBObject dbObj = builder.get();

String text = (String) dbObj.get("text");

JSON.parse(...)

DBObject dbObj = (DBObject) JSON.parse(
  "{'author': 'Hergé', 'text': 'Destination Moon', " +
  "'date': 'Sat Jul 24 2010 19:47:11 GMT-0700 (PDT)'}");

Lists

List<String> tags = new ArrayList<String>();
tags.add("comic");
tags.add("adventure");
dbObj.put("tags", tags);

{..., tags: ['comic', 'adventure']}

Maps of Maps
- Can represent an object graph/tree
- Always keyed off a String (field name)

Morphia: MongoDB Mapper
- Maps POJOs
- Type-safe
- Access patterns: DAO/Datastore/???
- Data types
- JPA like
- Many concepts came from Objectify (GAE)

Annotations
@Entity("collectionName")
@Id
@Transient (not transient)
@Indexed(...)
@Property("fieldAlias")
@AlsoLoad({aliases})
@Reference
@Serialized
[@Embedded]

Lifecycle Events
@PrePersist
@PreSave
@PostPersist
@PreLoad
@PostLoad

EntityListeners
EntityInterceptor

Basic POJO

@Entity
class Person {
  @Id String author;
  @Indexed Date date;
  String text;
}

Datastore Basics
- get(class, id)
- find(class, [...])
- save(entity, [...])
- delete(query)
- getCount(query)
- update / updateFirst(query, upOps)
- findAndModify / findAndDelete(query, upOps)

Add, Get, Delete

Blog entry = new Blog("Hergé", new Date(), "Destination Moon");

Datastore ds = new Morphia().createDatastore();

ds.save(entry);

Blog foundEntry = ds.get(Blog.class, "Hergé");

ds.delete(entry);

Queries

Datastore ds = ...

Query q = ds.createQuery(Blog.class);

q.field("author").equal("Hergé").limit(5);

for (Blog e : q.fetch())
    print(e);

Blog entry = q.field("author").startsWith("H").get();

Update

Datastore ds = ...
Query q = ds.find(Blog.class, "author", "Hergé");
UpdateOperations uo = ds.createUpdateOperations(cls);

uo.inc("views", 1).set("lastUpdated", new Date());

UpdateResults res = ds.update(q, uo);
if (res.getUpdatedCount() > 0)
    // do something?

Update Operations
- set(field, val) / unset(field)
- inc(field, [val]) / dec(field)
- add(field, val) / addAll(field, vals)
- removeFirst/Last(field) / removeAll(field, vals)

Relationships

[@Embedded]
- Loaded/saved with the entity
- Update

@Reference
- Stored as DBRef(s)
- Loaded with the entity
- Not automatically saved

Key<T> (DBRef)
- Stored as DBRef(s)
- Just a link, but resolvable by Datastore/Query

MongoDB features in Java

• Durability
• Replication
• Sharding
• Connection options

Durability

What failures do you need to recover from?
• Loss of a single database node?
• Loss of a group of nodes?

Durability - Master only

• Write acknowledged when in memory on master only

Durability - Master + Slaves

• Write acknowledged when in memory on master + slave

• Will survive failure of a single node

Durability - Master + Slaves + fsync

• Write acknowledged when in memory on master + slaves

• Pick a “majority” of nodes

• fsync in batches (since it is blocking)

Setting default error checking

// Do not check or report errors on write
com.mongodb.WriteConcern.NONE;

// Use the default level of error checking. Does not send
// a getLastError(), but raises an exception on error
com.mongodb.WriteConcern.NORMAL;

// Send getLastError() after each write. Raise an
// exception on error
com.mongodb.WriteConcern.STRICT;

// Set the concern
db.setWriteConcern(concern);

Customized WriteConcern

// Wait for three servers to acknowledge the write
WriteConcern concern = new WriteConcern(3);

// Wait for three servers, with a 1000ms timeout
WriteConcern concern = new WriteConcern(3, 1000);

// Wait for three servers, 1000ms timeout, and fsync
// data to disk
WriteConcern concern = new WriteConcern(3, 1000, true);

// Set the concern
db.setWriteConcern(concern);

Using Replication from Java

slaveOk()
- tells the driver to send read requests to Secondaries
- the driver will always send writes to the Primary

Can be set on:
- DB.slaveOk()
- Collection.slaveOk()
- find(q).addOption(Bytes.QUERYOPTION_SLAVEOK);

Using sharding from Java

Before sharding

coll.save(BasicDBObjectBuilder.start()
    .add("author", "Hergé")
    .add("text", "Destination Moon")
    .add("date", new Date())
    .get());

Query q = ds.find(Blog.class, "author", "Hergé");

After sharding

No code change required!

Connection options

MongoOptions mo = new MongoOptions();

// Restrict the number of connections
mo.connectionsPerHost = MAX_THREADS + 5;

// Auto reconnect on connection failure
mo.autoConnectRetry = true;

Part Five: Deploying MongoDB


• Performance tuning
• Sizing
• O/S tuning / file system layout
• Backup

Backup

• Typically backups are driven from a slave
• Eliminates impact on client / application traffic to the master

Backup

• Two strategies:
  • mongodump / mongorestore
  • fsync + lock

mongodump

• binary, compact object dump
• each object written is individually consistent
• but the dump is not necessarily consistent from start to finish

fsync + lock

• fsync - flushes buffers to disk
• lock - blocks writes

db.runCommand({fsync: 1, lock: 1})

• Use a file-system / LVM / storage snapshot

• unlock:
db.$cmd.sys.unlock.findOne();

Slave delay

• Protection against app faults• Protection against administration mistakes

O/S Config

• RAM - lots of it

• Filesystem
  • EXT4 / XFS
  • Better file allocation & performance

• I/O
  • More disks the better
  • Consider RAID10 or other RAID configs

Monitoring

• Munin, Cacti, Nagios

Primary functions:
• Measure stats over time
• Tell you what is going on with your system
• Alert when a threshold is reached

Remember me?

Summary

MongoDB makes building Java web applications simple

You can focus on what the app needs to do

MongoDB has built-in

• Horizontal scaling (reads and writes)
• Simplified schema evolution
• Simplified deployment and operation
• Best match for development tools and agile processes
