mongodb use cases: healthcare, cms, analytics

MongoDB Use Cases Healthcare, CMS, Analytics Thomas O’Rourke Upstream Innovations Ltd. Oulu / Seattle

Page 1: MongoDB Use Cases: Healthcare, CMS, Analytics

MongoDB Use Cases

Healthcare, CMS, Analytics

Thomas O’Rourke
Upstream Innovations Ltd.

Oulu / Seattle

Page 2: MongoDB Use Cases: Healthcare, CMS, Analytics
Page 3: MongoDB Use Cases: Healthcare, CMS, Analytics

www.dashwire.com

Page 4: MongoDB Use Cases: Healthcare, CMS, Analytics

Dashwire Dashconfig

• Users configure their mobile phones on PC.
  o Email accounts, wallpapers, ringtones, bookmarks, contacts, etc.
  o Generates a lot of data!
• Wanted: Google Analytics + Splunk + BI.
  o Sensitive data:
    • Can’t send out => No Google Analytics.
  o Many sources
    • (Server log files, SQS, Web analytics, etc.)
  o Internal error reports &
    • UI issues (powerful paradigm)
  o Real time vs. Reports/Enterprise
• ~500,000 events a day
  o Store for a year

Page 5: MongoDB Use Cases: Healthcare, CMS, Analytics

Solution

• Eco-system in Mongo
  o Evolved
• Layered architecture
  o L1. Store - De-duplication.
    • Streaming live (syslog)
    • Playback of log files
  o L2. Parsing into key/value pairs.
  o L3. Processing.
  o L4. Reports.
• Trade-offs for real-time
  o Reconciler
  o Trade-offs for real time and offline

Page 6: MongoDB Use Cases: Healthcare, CMS, Analytics

Tools

• MongoDB
• Ruby
• Sinatra
• Ruby driver
  o (Connection pooling, multithreaded, replica set support)
• EventMachine + em-mongo
• ZeroMQ
• Sinatra/Rack/Thin
• Mixpanel
• Server Density
• Excel
• Highcharts
• SoftLayer

Page 7: MongoDB Use Cases: Healthcare, CMS, Analytics

Eco system

Integrity Checks
Once a day

Page 8: MongoDB Use Cases: Healthcare, CMS, Analytics

Parsing logs

"2012-08-17 13:08:11 app02 Passngr[20167]: I script(www-data) -- {\"analytics\":{\"scenario\":\"three\",\"initial scenario\":\"three\",\"phone\":\"Cool Phone\",\"name\":\"Facebook\",\"time\":\"2012-08-17 18:08:11.399 UTC\",\"event\":\"Bookmark Added\",\"browser_tracking_id\":\"857b307a4d1xxxxx08ebca70f6\",\"browser_time\":\"2012-08-17 18:08:14.794 UTC\",\"browser_event\":1,\"session_id\":\"68528379d5xxxxxxxcda27fd625fe\"}}"

{ scenario: "three", phone: "Cool Phone", event: "Bookmark Added", session_id: ... }

JSON.parse( )

Collection = Event_Bookmark_Added
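
The parsing step can be sketched in plain Ruby. This is a minimal illustration, not the production parser: it assumes the JSON payload follows a " -- " separator as in the sample line, uses a shortened sample with the inner quotes unescaped, and the helper name parse_log_line is hypothetical.

```ruby
require "json"

# Shortened, hypothetical version of the slide's log line (inner quotes unescaped).
SAMPLE_LINE = '2012-08-17 13:08:11 app02 Passngr[20167]: I script(www-data) -- ' \
              '{"analytics":{"scenario":"three","phone":"Cool Phone",' \
              '"event":"Bookmark Added","browser_event":1}}'

# Returns [collection_name, document], or nil when the line carries no usable event.
def parse_log_line(line)
  # Assumption: everything after the first " -- " is the JSON payload.
  _prefix, payload = line.split(" -- ", 2)
  return nil if payload.nil?

  parsed = JSON.parse(payload)
  return nil unless parsed.is_a?(Hash)

  doc = parsed["analytics"]
  return nil unless doc.is_a?(Hash) && doc["event"].is_a?(String)

  # "Bookmark Added" becomes "Event_Bookmark_Added", one collection per event type.
  ["Event_" + doc["event"].strip.gsub(/\s+/, "_"), doc]
rescue JSON::ParserError
  nil
end

collection, doc = parse_log_line(SAMPLE_LINE)
puts "#{collection}: #{doc["scenario"]} / #{doc["phone"]}"
```

Lines without a payload, or with malformed JSON, return nil so a caller can skip them rather than abort a log playback.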

Page 9: MongoDB Use Cases: Healthcare, CMS, Analytics

De-duplication

• Compound index over several keys
  o Integers perform well
• MD5 of entire log line as string (only use half of result)
• Unix time stamp (seconds)
• Fraction of second (if one is present)
  • Better to use milliseconds, but not required

@collections[collection].create_index(
  [ [:ts, Mongo::ASCENDING],
    [:ts_frac, Mongo::ASCENDING],
    [:dhash, Mongo::ASCENDING] ],
  { :unique => true, :drop_dups => true } )
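
One way the three indexed fields might be derived from a raw log line is sketched below. It assumes the line starts with a "YYYY-MM-DD HH:MM:SS" timestamp in UTC with an optional fraction, and it uses a Set as a stand-in for the unique index; the helper name dedup_key is hypothetical.

```ruby
require "digest"
require "set"

TIMESTAMP = /\A(\d{4})-(\d{2})-(\d{2}) (\d{2}):(\d{2}):(\d{2})(?:\.(\d+))?/

# Builds the :ts, :ts_frac and :dhash fields for one raw log line.
# Assumption: the leading timestamp is in UTC.
def dedup_key(line)
  m = TIMESTAMP.match(line)
  raise ArgumentError, "no leading timestamp" if m.nil?

  {
    ts: Time.utc(*m[1..6].map(&:to_i)).to_i,            # Unix seconds
    ts_frac: m[7] ? m[7][0, 3].ljust(3, "0").to_i : 0,  # milliseconds, 0 if absent
    dhash: Digest::MD5.hexdigest(line)[0, 16]           # first half of the MD5 hex digest
  }
end

# Stand-in for the unique index: a replayed line yields the same key and is skipped.
seen = Set.new
lines = [
  "2012-08-17 13:08:11.399 app02 first event",
  "2012-08-17 13:08:11.399 app02 first event",
  "2012-08-17 13:08:11.399 app02 second event"
]
fresh = lines.select { |line| seen.add?(dedup_key(line).values) }
puts fresh.size
```

Because the key is computed from the line itself, playing back a log file that was already streamed live produces identical keys, which is what lets the unique index reject the duplicates.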

Page 10: MongoDB Use Cases: Healthcare, CMS, Analytics

Process pattern

Pre-allocate "processed : 0" at insert time (creation)

Index (no dup)

Process

@collections[collection].insert( doc )
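
The pattern can be illustrated with an in-memory stand-in for one event collection. This is only a sketch of the flow (pre-allocated flag, duplicate rejection, later processing); the EventStore class and its method names are hypothetical and do not correspond to the MongoDB driver API.

```ruby
# In-memory stand-in for one event collection with a unique index.
class EventStore
  def initialize
    @docs = {} # unique key => document
  end

  # Mirrors an insert under a unique index: a duplicate key is rejected.
  # The "processed" field is written at creation so later updates do not grow the document.
  def insert(key, doc)
    return false if @docs.key?(key)

    @docs[key] = doc.merge("processed" => 0)
    true
  end

  # Yields each unprocessed document, then flips its flag in place.
  # Returns how many documents were handled in this pass.
  def process_pending
    pending = @docs.values.select { |d| d["processed"] == 0 }
    pending.each do |d|
      yield d
      d["processed"] = 1
    end
    pending.size
  end
end

store = EventStore.new
store.insert([1345208891, 399, "abc"], "event" => "Bookmark Added")
store.insert([1345208891, 399, "abc"], "event" => "Bookmark Added") # duplicate, ignored
handled = store.process_pending { |d| puts d["event"] }
puts handled
```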

Page 11: MongoDB Use Cases: Healthcare, CMS, Analytics
Page 12: MongoDB Use Cases: Healthcare, CMS, Analytics

Reports

• Needed both Real time and Enterprise (Excel Reports)
  o We use MongoDB for both and all intermediate tables
• Reports
  o Map/Reduce for Reports and Graphs
  o Considered MySQL but rejected as unnecessary
  o Write Excel (*.xlsx) directly using Ruby and accessing MongoDB.
    • https://github.com/randym/axlsx
• Real-time
  o Incremental Map/Reduce gives performance to do real time graphs.
    • http://www.highcharts.com
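
The idea behind incremental Map/Reduce can be shown in plain Ruby: only events that arrived since the last run are mapped, and their values are reduced together with the previously stored totals. In MongoDB the map and reduce functions would be JavaScript; the function names and the per-hour grouping below are illustrative assumptions.

```ruby
# map: one event document => [key, value]; the key is [event name, hour bucket].
def map_event(doc)
  [[doc["event"], doc["time"][0, 13]], 1]
end

# reduce: fold all values emitted for one key into a single value.
def reduce_counts(values)
  values.reduce(0, :+)
end

# Folds a batch of new documents into the running totals instead of recomputing everything.
def incremental_map_reduce(totals, new_docs)
  grouped = Hash.new { |h, k| h[k] = [] }
  new_docs.each do |doc|
    key, value = map_event(doc)
    grouped[key] << value
  end
  grouped.each do |key, values|
    prior = totals.key?(key) ? [totals[key]] : []
    totals[key] = reduce_counts(prior + values)
  end
  totals
end

totals = {}
incremental_map_reduce(totals, [
  { "event" => "Bookmark Added", "time" => "2012-08-17 18:08:11.399 UTC" },
  { "event" => "Bookmark Added", "time" => "2012-08-17 18:40:02.120 UTC" }
])
incremental_map_reduce(totals, [
  { "event" => "Bookmark Added", "time" => "2012-08-17 18:59:59.000 UTC" },
  { "event" => "Bookmark Added", "time" => "2012-08-17 19:00:01.000 UTC" }
])
puts totals.inspect
```

Each run touches only the new batch, so the cost of refreshing a graph depends on the volume of recent events rather than on a full year of stored data.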

Page 13: MongoDB Use Cases: Healthcare, CMS, Analytics

Server Density

Page 14: MongoDB Use Cases: Healthcare, CMS, Analytics

PART 2
Technical Discussion

• Performance
• Durability
• Replica sets
• Maintenance
• Transactions
• Drivers and Languages
• Demos

Page 15: MongoDB Use Cases: Healthcare, CMS, Analytics

Performance

• ~3000 inserts a second for unsafe mode.
• < 1000 for safe mode.
• Indexes = memory.
• Use slaves when possible for reads (note: consistency)
• Your driver makes a HUGE difference.
• Pre-allocate for updates!
• Safe mode is much slower
  o Not everything is required to be 100% safe
  o Not everything is unsafe.
  o Think! ARCHITECT your durability where you need it!

Page 16: MongoDB Use Cases: Healthcare, CMS, Analytics

Durability

SAFE/SLOWER

FAST/UNSAFE

Page 17: MongoDB Use Cases: Healthcare, CMS, Analytics

Replica set uses

• Redundancy
  o Data is at multiple nodes
  o A member kept n seconds behind (delayed mode) is an ‘ass’ saver (it’s very easy to accidentally drop a collection!)
• Failover
  o Sleep at night
• Maintenance
  o Backup slaves
  o Build indexes on slaves and promote them
• Load balancing
  o Reads on slaves

@collection.insert(doc, :safe => { :w => "majority" } )

Journal + replicate (the journal only applies to the primary), but this guarantees the rollback will be available if the primary fails before replication.
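
The earlier advice to architect durability per write, rather than choosing safe or unsafe globally, might look like the following sketch. The policy names and the 5000 ms timeout are hypothetical; the option keys (:w, :j, :wtimeout) are MongoDB write-concern fields, and passing them through :safe follows the call shown above, which assumes the legacy 1.x Ruby driver.

```ruby
# Hypothetical durability levels mapped to the value passed as :safe.
SAFE_OPTIONS = {
  fire_and_forget: false,                                 # fastest, no acknowledgement
  acknowledged:    true,                                  # primary confirms the write
  journaled:       { :j => true },                        # primary has journaled the write
  replicated:      { :w => "majority", :wtimeout => 5000 } # most members have the write
}.freeze

# Returns the insert options for a named durability level.
def insert_options(level)
  value = SAFE_OPTIONS.fetch(level) do
    raise ArgumentError, "unknown durability level: #{level}"
  end
  { :safe => value }
end

# Assumed usage with the legacy driver (not executed here):
#   @collection.insert(click_event, insert_options(:fire_and_forget))
#   @collection.insert(billing_doc, insert_options(:replicated))
puts insert_options(:replicated).inspect
```

Naming the levels in one place makes the trade-off explicit at each call site: high-volume analytics events can take the fast path while records that must not be lost pay for replication.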

Page 18: MongoDB Use Cases: Healthcare, CMS, Analytics

Maintenance

• Backup/Maintenance
  o Backup by stopping slave, copy files, start slave
    • /data/*
    • Can be copied and backed up and compressed
    • Compression is high! (Can be 70%!) because field names are stored uncompressed in every document
  o Mongo export and import of BSON (mongodump / mongorestore) can be run while the database is running
  o Server Density
    • Node health
    • Slave lag - time behind
    • Index size
    • Etc.

Page 19: MongoDB Use Cases: Healthcare, CMS, Analytics

Transactions

• findAndModify().
  o Atomically updates a document and returns it in the same operation
• Upserts and indexes.
• Plan for failure; do not assume transactions.
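
Why an atomic find-and-update matters without transactions can be shown with an in-memory analogue: several workers claim documents concurrently, and because finding and flagging happen as one step, no document is claimed twice. The ClaimQueue class is hypothetical and uses a Mutex where MongoDB would provide the atomicity on the server.

```ruby
# In-memory analogue of an atomic "find one unprocessed document and mark it".
class ClaimQueue
  def initialize(docs)
    @docs = docs
    @lock = Mutex.new
  end

  # Finds one document with processed == 0, sets it to 1, and returns it,
  # all inside one critical section. Returns nil when nothing is left.
  def claim
    @lock.synchronize do
      doc = @docs.find { |d| d["processed"] == 0 }
      doc["processed"] = 1 if doc
      doc
    end
  end
end

queue = ClaimQueue.new((1..100).map { |i| { "_id" => i, "processed" => 0 } })

# Four workers drain the queue; each collects the ids it claimed.
workers = Array.new(4) do
  Thread.new do
    mine = []
    while (doc = queue.claim)
      mine << doc["_id"]
    end
    mine
  end
end
claimed = workers.flat_map(&:value)
puts claimed.size
```

If a worker dies after claiming a document, that document stays flagged but unfinished, which is the kind of failure a periodic integrity check or reconciler has to pick up.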

Page 20: MongoDB Use Cases: Healthcare, CMS, Analytics

Driver and language

• Driver and Language
  o Use a dynamic language! Ruby, Python, etc.
  o Driver support for replica set, and connection pool preferred.
  o A Simple ORM/Mapper, etc. works great.
    • Mongoid
    • MongoMapper
    • Or even just plain driver (Mongo Ruby driver)
  o Learn Javascript!
    • Shell Javascript commands and Ruby driver methods are very similar
      o findOne vs find_one
    • Map/Reduce – is always Javascript
    • Everything is a Map/Reduce – get used to it.
    • (It’s not difficult for these purposes!)

Page 21: MongoDB Use Cases: Healthcare, CMS, Analytics
Page 22: MongoDB Use Cases: Healthcare, CMS, Analytics

Demos

• https://github.com/tomjoro/mongo_browser
  o JQuery tree view
  o Sinatra
  o Mongo
• Cool
  o Integrating R with MongoDB
  o Highcharts
• Contact information:
  o http://www.linkedin.com/in/tomjoro
  o [email protected]