Webinar: General Technical Overview of MongoDB for Ops Teams
DESCRIPTION
MongoDB is the leading open-source document database. In this webinar we'll dive into the technical details of MongoDB, first focusing on what makes it different from traditional relational database management systems. We'll review data storage, high availability and scaling for MongoDB. Next we'll discuss what's involved in deploying MongoDB in production. Finally, we'll delve into some of the operational challenges, including performance tuning, capacity planning and what it takes to deploy a robust, highly available cluster topology.
TRANSCRIPT
General Technical Overview of MongoDB for Ops Teams
Senior Solutions Architect, MongoDB
Asya Kamsky
#MongoDB
MongoDB
The leading NoSQL database
Document Database
Open-Source
General Purpose
MongoDB Business Value
Enabling New Apps
Better Customer Experience
Lower TCO
Faster Time to Market
4,000,000+ MongoDB Downloads
100,000+ Online Education Registrants
20,000+ MongoDB User Group Members
20,000+ MongoDB Days Attendees
15,000+ MongoDB Management Service (MMS) Users
Global Community
MongoDB and Enterprise IT Stack
[Diagram: the enterprise IT stack]
Applications: CRM, ERP, Collaboration, Mobile, BI
Data Management (Online Data, Offline Data): RDBMS, EDW, Hadoop
Infrastructure: OS & Virtualization, Compute, Storage, Network
Alongside all layers: Management & Monitoring; Security & Auditing
Data Hub
User Data Management
Big Data
Content Mgmt & Delivery
Mobile & Social
MongoDB Solutions
• 10 of the Top Financial Services Institutions
• 10 of the Top Electronics Companies
• 10 of the Top Media and Entertainment Companies
• 8 of the Top Retailers
• 6 of the Top Telcos
• 5 of the Top Technology Companies
• 4 of the Top Healthcare Companies
Fortune 500 & Global 500
MongoDB Partners (200+)
Software & Services
Cloud & Channel
Hardware
Data Model
Operational Database Landscape
RDBMS
Agility
MongoDB
{
    _id : ObjectId("4c4ba5e5e8aabf3"),
    employee_name : "Dunham, Justin",
    department : "Marketing",
    title : "Product Manager, Web",
    report_up : "Neray, Graham",
    pay_band : "C",
    benefits : [
        { type : "Health",
          plan : "PPO Plus" },
        { type : "Dental",
          plan : "Standard" }
    ]
}
Document Data Model
Relational MongoDB
{
    first_name: 'Paul',
    surname: 'Miller',
    city: 'London',
    location: [45.123, 47.232],
    cars: [
        { model: 'Bentley', year: 1973, value: 100000, … },
        { model: 'Rolls Royce', year: 1965, value: 330000, … }
    ]
}
Document Model Benefits
• Agility and flexibility
  - Data models can evolve easily
  - Companies can adapt to changes quickly
• Intuitive, natural data representation
  - Developers are more productive
  - Many types of applications are a good fit
• Reduces the need for joins, disk seeks
  - Programming is simpler
  - Performance can be delivered at scale
MongoDB is full featured
MongoDB
Rich Queries
• Find Paul's cars
• Find everybody in London with a car built between 1970 and 1980
Geospatial
• Find all of the car owners within 5 km of Trafalgar Sq.
Text Search
• Find all the cars described as having leather seats
Aggregation
• Calculate the average value of every user's car collection
Map Reduce
• What is the ownership pattern of colors by geography over time? (Is purple trending up in China?)
{
    first_name: 'Paul',
    surname: 'Miller',
    city: 'London',
    location: [45.123, 47.232],
    cars: [
        { model: 'Bentley', year: 1973, value: 100000, … },
        { model: 'Rolls Royce', year: 1965, value: 330000, … }
    ]
}
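A minimal sketch of three of the queries listed above, written as plain JavaScript over an in-memory array so that it runs without a server. The `users` array, the second sample person and the function names are hypothetical. The comments show roughly equivalent shell queries against an assumed `users` collection.

```javascript
// Sample documents shaped like the slide's example. "Anna" is hypothetical.
const users = [
  {
    first_name: "Paul", surname: "Miller", city: "London",
    location: [45.123, 47.232],
    cars: [
      { model: "Bentley", year: 1973, value: 100000 },
      { model: "Rolls Royce", year: 1965, value: 330000 }
    ]
  },
  {
    first_name: "Anna", surname: "Smith", city: "London",
    location: [45.2, 47.1],
    cars: [{ model: "Mini", year: 1999, value: 5000 }]
  }
];

// Find Paul's cars.
// Shell form (assumed collection name):
//   db.users.find({ first_name: "Paul" }, { cars: 1 })
function carsOf(firstName) {
  return users
    .filter((u) => u.first_name === firstName)
    .flatMap((u) => u.cars);
}

// Find everybody in London with a car built between 1970 and 1980.
// Shell form:
//   db.users.find({ city: "London",
//                   cars: { $elemMatch: { year: { $gte: 1970, $lte: 1980 } } } })
function londonersWithSeventiesCar() {
  return users.filter(
    (u) => u.city === "London" &&
           u.cars.some((c) => c.year >= 1970 && c.year <= 1980)
  );
}

// Average value of one user's car collection (the Aggregation example).
function averageCarValue(user) {
  const total = user.cars.reduce((sum, c) => sum + c.value, 0);
  return total / user.cars.length;
}

console.log(carsOf("Paul").map((c) => c.model));
console.log(londonersWithSeventiesCar().map((u) => u.first_name));
console.log(averageCarValue(users[0]));
```

Because the cars are embedded in the owner's document, each of these questions is answered from a single document, with no join.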
Shell and Drivers
Shell
Command-line shell for interacting directly with the database
Drivers
Drivers for most popular programming languages and frameworks
> db.collection.insert({company: "10gen", product: "MongoDB"})
> db.collection.findOne()
{
    "_id" : ObjectId("5106c1c2fc629bfe52792e86"),
    "company" : "10gen",
    "product" : "MongoDB"
}
Java
Python
Perl
Ruby
Haskell
JavaScript
Developers are more productive
Scalability
Automatic Sharding
• Increase or decrease capacity as you go
• Automatic balancing
• Three types of sharding:
  - hash-based
  - range-based
  - tag-aware
Query Routing
• Multiple query optimization models
• Many sharding options appropriate for different apps
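To illustrate the difference between range-based and hash-based sharding, here is a simplified sketch of how a shard key value might be mapped to a shard. The shard names, the range boundaries and `toyHash` are all hypothetical. `toyHash` is not the hash function MongoDB uses, and a real cluster assigns ranges of hashed values to shards rather than taking a modulo. Tag-aware sharding is not sketched here.

```javascript
// Three hypothetical shards.
const shards = ["shardA", "shardB", "shardC"];

// Range-based: each shard owns a contiguous range of shard-key values.
// Illustrative boundaries: key < 1000 -> shardA, key < 2000 -> shardB, else shardC.
const upperBounds = [1000, 2000, Infinity];

function rangeShard(key) {
  const i = upperBounds.findIndex((bound) => key < bound);
  return shards[i];
}

// Toy string hash (NOT MongoDB's); it only needs to scatter nearby keys.
function toyHash(key) {
  let h = 0;
  for (const ch of String(key)) {
    h = (h * 31 + ch.charCodeAt(0)) >>> 0; // keep the value as unsigned 32-bit
  }
  return h;
}

// Hash-based: the shard is chosen from a hash of the key, so neighbouring
// key values tend to land on different shards.
function hashShard(key) {
  return shards[toyHash(key) % shards.length];
}

// Consecutive keys: one shard under range-based, spread out under hash-based.
for (const key of [1, 2, 3, 4, 5, 6]) {
  console.log(key, rangeShard(key), hashShard(key));
}
```

The trade-off this shows: range-based placement keeps adjacent keys together, which suits range queries, while hash-based placement spreads monotonically increasing keys across shards, which evens out write load.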
High Availability
• High Availability: ensure application availability during many types of failures
• Disaster Recovery: address the RTO and RPO goals for business continuity
• Maintenance: perform upgrades and other maintenance operations with no application downtime
Availability Considerations
Replica Sets
• Replica Set – two or more copies
• “Self-healing” shard
• Addresses many concerns:
- High Availability
- Disaster Recovery
- Maintenance
Replica Set Benefits
Business Needs        Replica Set Benefits
High Availability     Automated failover
Disaster Recovery     Hot backups offsite
Maintenance           Rolling upgrades
Low Latency           Locate data near users
Workload Isolation    Read from designated nodes
Data Consistency      Tunable consistency
Deployment Architecture
MongoDB Architecture
Deployment
• Automated failover
• Tolerates server failures
• Tolerates rack failures
• Number of replicas defines failure tolerance
Primary – A      Primary – B      Primary – C
Secondary – A    Secondary – B    Secondary – C
Secondary – A    Secondary – B    Secondary – C
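The point that the number of replicas defines failure tolerance can be made concrete with a little arithmetic. A replica set can elect a primary only while a strict majority of its voting members can reach each other. The sketch below works out how many members can be lost; the function names are hypothetical.

```javascript
// Votes needed to elect a primary: a strict majority of voting members.
function majority(votingMembers) {
  return Math.floor(votingMembers / 2) + 1;
}

// Members that can fail while a majority is still reachable.
function faultTolerance(votingMembers) {
  return votingMembers - majority(votingMembers);
}

for (const n of [1, 2, 3, 4, 5, 7]) {
  console.log(
    `${n} members: majority ${majority(n)}, tolerates ${faultTolerance(n)} failure(s)`
  );
}
```

Each three-member set in the diagram above tolerates the loss of one member. Going from three members to four does not raise the tolerance, which is why odd member counts are the usual choice.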
Global Deployment/Local Writes
Primary:NYC
Secondary:NYC
Primary:LON
Primary:SYD
Secondary:LON
Secondary:NYC
Secondary:SYD
Secondary:LON
Secondary:SYD
Global Data Distribution
[Diagram: one Primary and six Secondaries, with the links between them labeled "Real-time"]
Performance
Better Data Locality
Performance
In-Memory Caching
In-Place Updates
Entertainment Company: 1,400 servers
Craigslist: 5B documents
Carfax: 11B documents
Tier 1 Bank: 30K ops/sec
Major Retailer: 50K ops/sec
Fed Agency: 500K ops/sec
Wordnik: 20B documents, 35,000 ops/sec
Performance at Scale
MongoDB Performance*
Top 5 Marketing Firm
  Data: Key/value
  Queries: Key-based; 1–100 docs/query; 80/20 read/write
  Servers: ~250
  Ops/sec: 1,200,000
Government Agency
  Data: 10+ fields, arrays, nested documents
  Queries: Compound queries; Range queries; MapReduce; 20/80 read/write
  Servers: ~50
  Ops/sec: 500,000
Top 5 Investment Bank
  Data: 20+ fields, arrays, nested documents
  Queries: Compound queries; Range queries; 50/50 read/write
  Servers: ~40
  Ops/sec: 30,000
* These figures are provided as examples. Your application governs your performance.
Capacity Planning
Requirements
Testing
Monitoring
Key Deployment Considerations
Capacity Planning
Requirements
Testing
Monitoring
Performance Tuning
Understanding
Adjusting
Monitoring
Key Performance Considerations
Monitoring
• CLI and internal status commands
• mongostat; mongotop; db.serverStatus()
• Plug-ins for Munin, Nagios, Cacti, etc.
• Integration via SNMP to other tools
• MMS
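As a rough illustration of working with the status commands above: `db.serverStatus()` reports cumulative operation counters in its `opcounters` section, so a per-second rate has to be derived from two samples. The sketch below does that in plain JavaScript. The sample numbers and the `opsPerSecond` helper are hypothetical, and in practice the two objects would come from successive `db.serverStatus().opcounters` calls.

```javascript
// Two hypothetical snapshots of the opcounters section, taken 10 seconds apart.
// The counters are cumulative since the server started.
const before = { insert: 1000, query: 5000, update: 800, delete: 50, getmore: 20, command: 9000 };
const after = { insert: 1600, query: 8000, update: 1000, delete: 50, getmore: 40, command: 9500 };

// Convert two cumulative samples into per-second rates.
function opsPerSecond(prev, curr, elapsedSeconds) {
  const rates = {};
  for (const op of Object.keys(curr)) {
    rates[op] = (curr[op] - (prev[op] || 0)) / elapsedSeconds;
  }
  return rates;
}

const rates = opsPerSecond(before, after, 10);
console.log(rates);
```

Since the counters start again from zero when the server restarts, a negative rate from a helper like this is a sign that a restart happened between the two samples.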
MongoDB Management Service
Cloud-based suite of services for managing MongoDB deployments
Charts, custom dashboards and automated alerting
Tracks 100+ metrics – performance, resource utilization, availability and response times
15,000+ users
Backup and restore with:
- point-in-time recovery
- support for sharded clusters
• MMS On-Prem included with MongoDB Enterprise (backup coming soon)
A Picture Speaks a Thousand Words
Symptoms
High CPU Use
Similar Query Pattern
Monitoring Best Practices
• Monitor Logs
  - Alert, escalate
  - Correlate
• Disk
  - Monitor
• Instrument/Monitor App (including logs!)
• Know your application and application (write) characteristics
Questions?
Thank You
Senior Solutions Architect, MongoDB
Asya Kamsky
#MongoDB