Introducing MongoDB into Your Organization
1
Introducing MongoDB into
your Organization
Edouard Servan-Schreiber, Ph.D.
Director for Solution Architecture
@edouardss
2
• You are using, or want to use, MongoDB
– What benefits?
– Potential Use cases
– Steering the adoption of MongoDB
• Why is MongoDB Safe
– Execution
– Operational
– Financial
• Why 10gen?
– People
– Company
– Future
3
Your First MongoDB Project
4
Big Data
New Programming Models
New Hardware Architecture
5
Horizontally Scalable
Document Oriented
{ author: "roger",
  date: new Date(),
  text: "Spirited Away",
  tags: ["Tezuka", "Manga"] }
High Performance
– Indexes
– RAM
Application
6
• User Data Management
• High Volume Data Feeds
• Content Management
• Operational Intelligence
• Product Data Management
7
• “NoSQL databases are proving valuable for scaling out cloud and on-premises uses of numerous content types, and document-oriented open-source solutions are emerging as one of the leading choices.”
8
• Reassuring the Ops Team
• Reassuring the Business Team
• Start with low stakes – learn to trust
• Grow towards a mission critical use case
• LET US HELP YOU! [email protected]
9
Execution
10
11
{
  _id : ObjectId("4c4ba5c0672c685e5e8aabf3"),
  author : "roger",
  date : "Sat Jul 24 2010 19:47:11",
  text : "Spirited Away",
  tags : [ "Tezuka", "Manga" ],
  comments : [
    { author : "Fred",
      date : "Sat Jul 24 2010 20:51:03",
      text : "Best Movie Ever" },
    { author : "Bill",
      date : "Sat Jul 24 2010 21:13:23",
      text : "No Way!!" }
  ]
}
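The embedded-comments document above can be handled as an ordinary object in application code. The following is a minimal sketch in plain JavaScript, with no MongoDB server or driver involved. `addComment` and `postsWithTag` are hypothetical helpers introduced here for illustration; they only approximate in memory what a `$push` update and a query on the `tags` array would do on the server.

```javascript
// Illustrative only: an in-memory stand-in for the blog post document.
// No MongoDB driver is used; the helper names are hypothetical.
const post = {
  author: "roger",
  date: "Sat Jul 24 2010 19:47:11",
  text: "Spirited Away",
  tags: ["Tezuka", "Manga"],
  comments: []
};

// Roughly what a $push on "comments" does: append one embedded document.
function addComment(doc, author, text, date) {
  doc.comments.push({ author: author, date: date, text: text });
  return doc;
}

// Roughly what a query on the "tags" array does: match any element.
function postsWithTag(docs, tag) {
  return docs.filter(function (d) {
    return d.tags.indexOf(tag) !== -1;
  });
}

addComment(post, "Fred", "Best Movie Ever", "Sat Jul 24 2010 20:51:03");
addComment(post, "Bill", "No Way!!", "Sat Jul 24 2010 21:13:23");

console.log(post.comments.length);                 // 2
console.log(postsWithTag([post], "Manga").length); // 1
```

The post and its comments travel together as one object, so reading a post with its discussion needs no join.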
12
Iteration
13
• Start
• Develop
• Scale
14
Operational
15
• Elastic capacity
• Data center outages
• Upgrading DB versions
• Upgrade App versions
• Change/Evolve schema/representation
16
• Data Durability
– Journal
– Replicated Writes
• Data Consistency
– Single Master
– Shard to Scale
• YOU are in control!
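The durability choices on this slide correspond to MongoDB write concern options: `w` is the number of replica set members that must acknowledge a write, and `j` asks for a journal commit before acknowledgement. The sketch below is plain JavaScript that only builds those option objects. `writeConcernFor` and the three data categories are hypothetical names chosen for illustration, not part of any driver, and suitable settings depend on the deployment.

```javascript
// Illustrative only: picks write concern options per kind of data.
// "w" and "j" are MongoDB write concern fields; the categories and the
// function name are assumptions made for this sketch.
function writeConcernFor(kind) {
  switch (kind) {
    case "clickstream": // high volume; acknowledged by the primary only
      return { w: 1, j: false };
    case "profile":     // journaled on the primary before acknowledgement
      return { w: 1, j: true };
    case "payment":     // journaled and acknowledged by two members
      return { w: 2, j: true };
    default:
      throw new Error("unknown kind: " + kind);
  }
}

console.log(JSON.stringify(writeConcernFor("payment"))); // {"w":2,"j":true}
```

Choosing the option per operation is one way to read "YOU are in control": the application trades latency against durability for each class of write.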
17
• Millions of IO ops/sec
• Petabytes of data
• Commodity hardware – Virtual hardware
18
Economics
19
• Less code
• More productive coding
• Easier to maintain
• Contingency plans for turnover
• Commodity hardware
• No upfront license, pay for value over time
• Cost visibility for growth of usage
20
Wordnik uses MongoDB as the foundation for its “live” dictionary that stores its entire text corpus – 3.5T of data in 20 billion records
Problem
• Analyze a staggering amount of data for a system built on a continuous stream of high-quality text pulled from online sources
• Adding too much data too quickly resulted in outages; tables locked for tens of seconds during inserts
• Initially launched entirely on MySQL but quickly hit performance roadblocks
Why MongoDB
• Migrated 5 billion records in a single day with zero downtime
• MongoDB powers every website request: 20m API calls per day
• Ability to eliminate the memcached layer, creating a simplified system that required fewer resources and was less prone to error
Impact
• Reduced code by 75% compared to MySQL
• Fetch time cut from 400ms to 60ms
• Sustained insert speed of 8k words per second, with frequent bursts of up to 50k per second
• Significant cost savings and 15% reduction in servers
“Life with MongoDB has been good for Wordnik. Our code is faster, more flexible and dramatically smaller. Since we don’t spend time worrying about the database, we can spend more time writing code for our application.”
Tony Tam, Vice President of Engineering and Technical Co-founder
21
Why 10gen?
22
Dwight Merriman – CEO
Founder, CTO DoubleClick
Max Shireson – President
COO MarkLogic
9 Years at Oracle
Eliot Horowitz – CTO
Co-founder of ShopWiki, DoubleClick
Erik Frieberg – VP Marketing
HP Software, Borland, BEA
Ben Sabrin – VP of Sales
VP of Sales at JBoss, over 9 years of Open Source experience
23
• Community and Commercial
• Dedicated support staff across the globe
– NY
– CA
– Dublin
– London
– Australia
24
• Union Square Ventures
• Sequoia Capital
• Flybridge Capital
• NEA
• $80M raised overall
• Most recent round: $42M in May…
25
What’s in store…
26
• Authentication
• Data encryption
– At rest
– In flight
• Full Text Search
• Global Database lock?
• Monitoring
27
Version 2.2 (now)
• Database level locking
• Aggregation Framework
• TTL collections
• Geo-aware sharding
• Read Preferences
Version 2.4 (Q4 2012)
• Kerberos/LDAP authentication
• Collection level locking
• Full Text Search
• Improved Aggregation Framework