MongoDB as a fast and queryable cache

MongoDB Conference Berlin 2011 MongoDB as a queryable cache

Upload: mongodb

Post on 12-May-2015


DESCRIPTION

Martin Tepper of Travel IQ presents at Mongo Berlin

TRANSCRIPT

Page 1: MongoDB as a fast and queryable cache

MongoDB Conference Berlin 2011

MongoDB as a queryable cache

Page 2: MongoDB as a fast and queryable cache

MongoDB as a queryable cache · Martin Tepper, monogreen.de · 2011-03-25

About me

• Martin Tepper

• Lead Developer at Travel IQ

• http://monogreen.de

Page 3: MongoDB as a fast and queryable cache

Contents

• About Travel IQ

• The problem

• The solution

• The headaches

Page 4: MongoDB as a fast and queryable cache

About Travel IQ

• Meta Search Engine for Flights and Hotels

• 9 Hotel Providers

• 21 Flight Providers

• ~ 6000 searches per day

• ~ 64k provider queries per day

Page 5: MongoDB as a fast and queryable cache
Page 6: MongoDB as a fast and queryable cache

About Travel IQ

• Real-Time Aggregation

• Ruby/Rails based

• API-Driven

Page 7: MongoDB as a fast and queryable cache

Quick aside

• Ruby: object-oriented scripting language

• Rails: MVC Web application framework

• ActiveRecord: ORM framework

Page 8: MongoDB as a fast and queryable cache

The Problem

Page 9: MongoDB as a fast and queryable cache

Basic Architecture

Page 11: MongoDB as a fast and queryable cache
Page 12: MongoDB as a fast and queryable cache

Strongly Normalized

• Very organized

• Reuse of models

• Saves disk space

• But …

Page 13: MongoDB as a fast and queryable cache

sql = <<-SQL
  SELECT MIN(outerei.id)
  FROM (
    SELECT OBJ1.starts_at AS OBJ1_starts_at,
           OBJ1.ends_at AS OBJ1_ends_at,
           OBJ1.origin_id AS OBJ1_origin_id,
           OBJ1.destination_id AS OBJ1_destination_id,
           MIN(P1.price) AS the_price
    FROM packages P1
    LEFT JOIN journeys OBJ1 ON (P1.outbound_journey_id = OBJ1.id)
    LEFT JOIN results R1 ON (R1.package_id = P1.id)
    LEFT JOIN packagings PA1a ON (PA1a.package_id = P1.id AND PA1a.position = 1)
    LEFT JOIN offers O1a ON (PA1a.offer_id = O1a.id)
    WHERE R1.search_id IN (#{search_id})
      AND R1.search_type = 'FlightSearch'
      AND O1a.expires_at > #{expiring_after}
    GROUP BY OBJ1.starts_at, OBJ1.ends_at, OBJ1.origin_id, OBJ1.destination_id
  ) AS innerei
  JOIN (
    SELECT P2.id,
           OBJ2.starts_at AS OBJ2_starts_at,
           OBJ2.ends_at AS OBJ2_ends_at,
           OBJ2.origin_id AS OBJ2_origin_id,
           OBJ2.destination_id AS OBJ2_destination_id,
           P2.price
    FROM packages P2
    LEFT JOIN results R2 ON (R2.package_id = P2.id)
    LEFT JOIN journeys OBJ2 ON (P2.outbound_journey_id = OBJ2.id)
    LEFT JOIN packagings PA2a ON (PA2a.package_id = P2.id AND PA2a.position = 1)
    LEFT JOIN offers O2a ON (PA2a.offer_id = O2a.id)
    WHERE R2.search_id IN (#{search_id})
      AND R2.search_type = 'FlightSearch'
      AND O2a.expires_at > #{expiring_after}
  ) AS outerei ON (
    OBJ1_starts_at = OBJ2_starts_at
    AND OBJ1_ends_at = OBJ2_ends_at
    AND OBJ1_origin_id = OBJ2_origin_id
    AND OBJ1_destination_id = OBJ2_destination_id
    AND outerei.price = the_price
  )
  GROUP BY OBJ1_starts_at, OBJ1_ends_at, OBJ1_destination_id, OBJ1_origin_id
SQL
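
Read step by step, the query above appears to pick, for every distinct outbound journey (same start, end, origin and destination), the id of the cheapest package, taking the lowest id when prices tie. A minimal Ruby sketch of that same grouping on rows that are already joined; the `Package` struct, its fields and the sample data are assumptions for illustration, and the search and expiry filters are left out:

```ruby
# Hypothetical, already-joined rows: one struct per package.
Package = Struct.new(:id, :starts_at, :ends_at, :origin_id, :destination_id, :price)

# For each outbound journey, return the id of the cheapest package.
# Ties on price are broken by the lowest id, mirroring MIN(outerei.id).
def cheapest_package_ids(packages)
  packages
    .group_by { |p| [p.starts_at, p.ends_at, p.origin_id, p.destination_id] }
    .map { |_journey, group| group.min_by { |p| [p.price, p.id] }.id }
end

packages = [
  Package.new(1, "08:00", "10:00", "BER", "LHR", 120),
  Package.new(2, "08:00", "10:00", "BER", "LHR", 99),
  Package.new(3, "08:00", "10:00", "BER", "LHR", 99),
  Package.new(4, "12:00", "14:00", "BER", "LHR", 80),
]

p cheapest_package_ids(packages)  # => [2, 4]
```

The point of the comparison: once the data for one package sits in one place, the grouping is a few lines; the SQL needs two subqueries and eight joins to get there.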

Page 14: MongoDB as a fast and queryable cache

The problem

• Strongly normalized database

• Complex query requirements

• Lots of joins

• ActiveRecord and rendering overhead

• Slow API calls

Page 15: MongoDB as a fast and queryable cache

The Solution

Page 16: MongoDB as a fast and queryable cache

Solution 1: Schema

• Redo the schema

• Migration hard

• Some relationships hard to denormalize

Page 17: MongoDB as a fast and queryable cache

Solution 2: Memcached

• Memcached

• Very fast response times

• But no real queries

→ Horrible abstraction layer

Page 18: MongoDB as a fast and queryable cache

[Figure: Memcached response times over time. X-axis: seconds after search start (1 to 45). Y-axis: response time of API call in seconds (0 to 10).]

Page 19: MongoDB as a fast and queryable cache

Solution 3: MongoDB

• Document-oriented – less render overhead

• Grouping of offers

• Proper queries and counts

• Still quite fast
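
To make "document-oriented", "grouping of offers" and "proper queries and counts" concrete, here is a small sketch. It does not talk to MongoDB: the documents live in a Ruby array, and `matches?` is a toy matcher that understands only equality and MongoDB's `$lte` operator. The field names (`search_id`, `price`, `stops`, `offers`) are assumptions, not Travel IQ's actual schema.

```ruby
# Hypothetical cached documents: each package embeds its offers,
# so rendering an API response needs no joins.
PACKAGE_CACHE = [
  { "search_id" => 42, "price" => 99, "stops" => 0,
    "offers" => [{ "provider" => "A", "price" => 99 },
                 { "provider" => "B", "price" => 105 }] },
  { "search_id" => 42, "price" => 180, "stops" => 1,
    "offers" => [{ "provider" => "C", "price" => 180 }] },
  { "search_id" => 7, "price" => 60, "stops" => 0,
    "offers" => [{ "provider" => "A", "price" => 60 }] },
].freeze

# Toy matcher: supports plain equality and { "$lte" => limit } only.
def matches?(doc, filter)
  filter.all? do |field, condition|
    if condition.is_a?(Hash)
      condition.all? do |op, limit|
        op == "$lte" && !doc[field].nil? && doc[field] <= limit
      end
    else
      doc[field] == condition
    end
  end
end

# In-memory stand-in for a query against the cache collection.
def find_cached(filter)
  PACKAGE_CACHE.select { |doc| matches?(doc, filter) }
end

filter = { "search_id" => 42, "price" => { "$lte" => 150 } }
cheap = find_cached(filter)

puts cheap.size                  # number of matching packages
puts cheap.first["offers"].size  # the grouped offers travel with the package
```

A key-value cache such as Memcached can only answer "give me the value for this key"; filtering by price or counting matches has to be rebuilt in application code, which is the abstraction layer the previous slide complains about.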

Page 21: MongoDB as a fast and queryable cache

How we use MongoDB

• Replica set with 2 nodes and 2 arbiters

• Two servers with 16 cores / 64GB RAM

→ run MySQL and MongoDB

• ~ 600 writes/s and reads/s normal load

• ~ 6000 writes/s doable

Page 22: MongoDB as a fast and queryable cache

[Figure: MongoDB response times over time. X-axis: seconds after search start (1 to 45). Y-axis: response time of API call in seconds (0 to 10).]

Page 23: MongoDB as a fast and queryable cache

The Headaches

Page 25: MongoDB as a fast and queryable cache

Problems with MongoDB

• Segmentation Faults

• Only in production

→ Replica Set helped a lot

→ Fixed with nightly build

Page 27: MongoDB as a fast and queryable cache

Problems with MongoDB

• Write performance during peak load

• Lots of small concurrent writes

→ Solved by bundling writes
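
The slides do not show how the writes were bundled. One plausible shape is a small buffer that collects documents and hands them to the database in batches. The sketch below is an assumption, not the speaker's code: `WriteBuffer`, `batch_size` and the block-based sink are hypothetical names, and the sink here just records the batches so the example runs without a database.

```ruby
# Collects documents and passes them to a sink in batches
# instead of issuing one write per document.
class WriteBuffer
  def initialize(batch_size:, &sink)
    @batch_size = batch_size
    @sink = sink
    @pending = []
  end

  # Queue one document; flush automatically once the batch is full.
  def <<(doc)
    @pending << doc
    flush if @pending.size >= @batch_size
    self
  end

  # Hand all queued documents to the sink as a single batch.
  def flush
    return if @pending.empty?
    batch, @pending = @pending, []
    @sink.call(batch)
  end
end

batches = []
buffer = WriteBuffer.new(batch_size: 3) { |docs| batches << docs }

7.times { |i| buffer << { "offer_id" => i } }
buffer.flush  # write out the remainder

p batches.map(&:size)  # => [3, 3, 1]
```

With a real collection the sink would issue one bulk insert per batch (for example `insert_many` in the current Ruby driver). This sketch is not thread-safe; concurrent producers would need a lock around the buffer or one buffer per worker.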

Page 29: MongoDB as a fast and queryable cache

Problems with MongoDB

• Hotel data too big to denormalize

• In separate collection

→ Solved with app-level "join"
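
An app-level join usually means two lookups plus a merge in application code: fetch the results, collect the hotel ids they reference, fetch those hotels in one go, then attach each hotel to its results. A minimal sketch with two in-memory arrays standing in for the collections; all field names and sample values are assumptions:

```ruby
# Stand-ins for two collections (hypothetical fields and data).
HOTEL_RESULTS = [
  { "hotel_id" => 1, "price" => 80 },
  { "hotel_id" => 2, "price" => 120 },
  { "hotel_id" => 1, "price" => 95 },
].freeze

HOTELS = [
  { "_id" => 1, "name" => "Hotel Adler" },
  { "_id" => 2, "name" => "Pension Spree" },
  { "_id" => 3, "name" => "Not referenced" },
].freeze

# With MongoDB this would be a single query with a filter like
# { "_id" => { "$in" => ids } }; here it is a plain array scan.
def fetch_hotels(ids)
  HOTELS.select { |hotel| ids.include?(hotel["_id"]) }
end

# Attach the full hotel document to every result that references it.
def join_hotels(results)
  ids = results.map { |r| r["hotel_id"] }.uniq
  hotels_by_id = fetch_hotels(ids).to_h { |hotel| [hotel["_id"], hotel] }
  results.map { |r| r.merge("hotel" => hotels_by_id[r["hotel_id"]]) }
end

joined = join_hotels(HOTEL_RESULTS)
puts joined.map { |r| "#{r['hotel']['name']}: #{r['price']}" }
```

The large hotel documents are stored once and fetched once per request, rather than being copied into every cached result.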

Page 31: MongoDB as a fast and queryable cache

Problems with MongoDB

• Data consistency

• Typical caching problem

• Updates to MySQL must also be applied in MongoDB

→ Solved with callbacks in ActiveRecord
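
In Rails, the hooks for this are ActiveRecord lifecycle callbacks such as `after_save` and `after_destroy`. To stay runnable without Rails or a database, the sketch below imitates that pattern in plain Ruby: `CacheSync`, `Offer`, `OFFER_CACHE` and `to_document` are hypothetical names, and a Hash stands in for the MongoDB collection.

```ruby
# In-memory stand-in for the MongoDB cache collection.
OFFER_CACHE = {}

# Minimal imitation of ActiveRecord-style lifecycle hooks.
module CacheSync
  def save
    persist                        # stands in for the MySQL write
    OFFER_CACHE[id] = to_document  # what an after_save callback would do
    true
  end

  def destroy
    OFFER_CACHE.delete(id)         # what an after_destroy callback would do
    true
  end
end

class Offer
  include CacheSync
  attr_accessor :id, :price

  def initialize(id, price)
    @id = id
    @price = price
  end

  # No-op here; a real model would write to MySQL.
  def persist; end

  # The denormalized form stored in the cache.
  def to_document
    { "_id" => id, "price" => price }
  end
end

offer = Offer.new(1, 99)
offer.save
offer.price = 89
offer.save
puts OFFER_CACHE[1]["price"]  # the cached copy follows the update

offer.destroy
puts OFFER_CACHE.key?(1)      # the cached copy is gone
```

One design caveat worth checking in a real Rails app: `after_save` runs inside the database transaction, so a later rollback would leave a stale document in the cache. `after_commit` avoids that at the cost of a slightly later sync.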

Page 32: MongoDB as a fast and queryable cache

Thank you