MongoDB World 2016: Do What Matters: Migrating to Self-Managed Infrastructure

Upload: mongodb

Post on 15-Apr-2017

Brian Scanlan, Engineering Manager at Intercom

@brian_scanlan

What is Intercom?

Intercom is one place for every team in an internet business to communicate with customers, personally, at scale, on your website, inside web and mobile apps, and by email.

Do What Matters: Migrating to Self-Managed Infrastructure

Things that actually matter

🔒 Security 🔒

Making 💰💰💰

💩👖

Team Infrastructure core values

1. Security, Availability, Performance, Scalability, Cost, Efficiency - prioritize for maximum impact

2. Faster, Safer, Easier, Shipping

3. Zero Touch Ops

4. Run Less Software

“Let’s run more software”

(Mostly) downtime-free migration 👍

Intercom App and Intercom tags required a custom migration:

1. Create “DMZ” hosts with identically named replica-sets

2. Ask nicely for the keyfile from third party, install on DMZ hosts

3. Using rs.add(), add DMZ hosts to MongoHQ replica-sets and sync data over

4. Restart DMZ replica set hosts with MMS replica set key, detach from MongoHQ replica set

5. Low-level hackery to make the replica set look the way MMS expects: db.system.replset.update() to manipulate replica-set member IDs, and quickly removing and re-inserting admin users
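
Steps 3 and 5 both come down to editing the replica-set configuration document. As a rough sketch only, the Python below works on a plain dictionary shaped like the output of rs.conf() (`_id`, `version`, `members`), with no database connection. The host names, the set name `rs0`, and the helper names `add_member` and `renumber_members` are hypothetical. The talk does not say which member IDs MMS expected, so the renumbering here is purely an illustration.

```python
import copy


def add_member(config, host):
    """Return a new config with `host` appended, roughly what rs.add() does."""
    new_config = copy.deepcopy(config)
    next_id = max((m["_id"] for m in new_config["members"]), default=-1) + 1
    new_config["members"].append({"_id": next_id, "host": host})
    new_config["version"] += 1  # each configuration change bumps the version
    return new_config


def renumber_members(config, start=0):
    """Return a new config whose member _ids run start, start+1, ... in order."""
    new_config = copy.deepcopy(config)
    for offset, member in enumerate(new_config["members"]):
        member["_id"] = start + offset
    new_config["version"] += 1
    return new_config


# Hypothetical config: two hosted members whose IDs do not start at 0.
hosted = {
    "_id": "rs0",
    "version": 7,
    "members": [
        {"_id": 3, "host": "hosted-1.example.com:27017"},
        {"_id": 5, "host": "hosted-2.example.com:27017"},
    ],
}

with_dmz = add_member(hosted, "dmz-1.example.com:27017")
tidy = renumber_members(with_dmz)
print([m["_id"] for m in with_dmz["members"]])  # [3, 5, 6]
print([m["_id"] for m in tidy["members"]])      # [0, 1, 2]
```

Both helpers return copies, so the original document stays available for comparison or rollback. The usual route for applying a changed configuration is rs.reconfig(); the talk describes writing to system.replset directly, which is presumably why it is labelled hackery.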

(Almost) downtime-free migration 👍

“We were paying $x a month, now we’re paying AWS/MongoDB $y a month for our MongoDB hosting, a cost saving of 62%”
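
The saving in that sentence is simply the drop in monthly spend divided by the old spend. A minimal sketch of the arithmetic follows; the function name is hypothetical and the figures are placeholders, since the talk does not disclose x or y.

```python
def cost_saving_percent(old_monthly, new_monthly):
    """Percentage saved when moving from old_monthly to new_monthly spend."""
    if old_monthly <= 0:
        raise ValueError("old_monthly must be positive")
    return (old_monthly - new_monthly) / old_monthly * 100


# Placeholder figures only: the talk does not disclose x or y.
x, y = 100.0, 38.0
print(round(cost_saving_percent(x, y)))  # 62
```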

Interesting discoveries at scale

Ruby MongoDB driver defaults

/etc/security/limits.d/999-intercom.conf:
mongod - nofile 640000
mongod - nproc 640000
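
Limits like these only help if the running process actually picked them up. As a hedged sketch, the Python below compares the calling process's soft limits with the two thresholds from the slide, using the standard `resource` module on a Unix-like host. The names `WANTED` and `check_limits` are hypothetical, and it inspects whichever process runs it, not mongod itself. For a running mongod on Linux, /proc/<pid>/limits shows the same information.

```python
import resource

# Thresholds taken from the slide; everything else here is illustrative.
WANTED = {
    "nofile": (resource.RLIMIT_NOFILE, 640000),
    "nproc": (resource.RLIMIT_NPROC, 640000),
}


def check_limits(wanted=WANTED):
    """Compare this process's soft limits with the wanted values."""
    report = {}
    for name, (rlimit, minimum) in wanted.items():
        soft, _hard = resource.getrlimit(rlimit)
        unlimited = soft == resource.RLIM_INFINITY
        report[name] = {
            "soft": soft,
            "wanted": minimum,
            "ok": unlimited or soft >= minimum,
        }
    return report


for name, result in check_limits().items():
    print(name, result)
```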

/etc/sysctl.conf:
kernel.pid_max = 655360
net.ipv4.tcp_max_syn_backlog = 204800
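
A sysctl.conf entry takes effect only once it has been loaded, so it can be worth comparing the file with the live kernel values. The sketch below parses `key = value` lines and looks each key up under /proc/sys on Linux, where dots in the key map to path separators for keys like these two. The helper names `parse_sysctl` and `live_value` are hypothetical.

```python
from pathlib import Path


def parse_sysctl(text):
    """Parse 'key = value' lines, skipping blanks and # or ; comments."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith(("#", ";")):
            continue
        key, _, value = line.partition("=")
        settings[key.strip()] = value.strip()
    return settings


def live_value(key, root="/proc/sys"):
    """Read the running kernel's value, or None if it is not exposed here."""
    path = Path(root, *key.split("."))
    try:
        return path.read_text().strip()
    except OSError:
        return None


# The two settings shown on the slide.
wanted = parse_sysctl("""
kernel.pid_max = 655360
net.ipv4.tcp_max_syn_backlog = 204800
""")

for key, value in wanted.items():
    print(key, "wanted", value, "live", live_value(key))
```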

Wrote a document identifying problem
Cost estimates
Operations estimates
Started playing around with MMS
Wrote “Intermission”
Benchmarked available host types
Moved test accounts to AWS hosted MongoDB
Tuned drivers, hosts
Verified backups work well
Verified rollback plan
Pointed canary monitoring at new infrastructure
Set up new alarming
Broke a server to see what happened
Moved real customers over
Moved even more customers over
Finished moving customers over
Changed host type to i2.8xlarge
Rotated credentials
More host tuning…

🙀 problems are worth solving when the impact matters

dvara (proxying MongoDB FTW)

🙀 problems are worth solving when the impact matters

Thanks

@brian_scanlan