brainrepublic - powered by no-sql

Andreas Jung, www.zopyxgroup.com. Powered by MongoDB & no-SQL. Mongo Berlin, Monday, October 4th, 2010


DESCRIPTION

Talk given at Mongo Berlin 2010 by Andreas Jung

TRANSCRIPT

Page 1: BRAINREPUBLIC - Powered by no-SQL

Andreas Jung, www.zopyxgroup.com

Powered by MongoDB & no-SQL

Mongo Berlin, October 4th, 2010

Monday, 4 October 2010

Page 2: BRAINREPUBLIC - Powered by no-SQL

/ME

• Developer (backend) and software-analyst

• Strong background in Python, Zope and Plone

• Former Zope 2 release manager

• Co-founder and chairman of the German Speaking Zope User Group (DZUG)

• Director of the Zope Foundation

• Member of Plone Foundation

• Author of tons of add-ons for Python, Zope and Plone

• Head of ZOPYX


Page 3: BRAINREPUBLIC - Powered by no-SQL

The Zope and Plone Expert Network

• Germany-based full-service partner network

• ZOPYX (Tübingen)

• Veit Schiele (Berlin)

• Zetwork (Oldenburg)

• Banality (Essen)

• Python, Zope, Plone & other cool stuff


Page 4: BRAINREPUBLIC - Powered by no-SQL

Agenda

• What is BRAINREPUBLIC?

• „no-SQL“ technologies used in the project

• Evaluation of technologies

• My view on MongoDB - pros and cons

• BRAINREPUBLIC architecture


Page 17: BRAINREPUBLIC - Powered by no-SQL

Choosing a database and tools for BRAINREPUBLIC

• Criteria:

• fast

• scalable

• distributed

• Special requirement:

• having fun :-)


Page 18: BRAINREPUBLIC - Powered by no-SQL

Powered by

repoze.bfg


Page 19: BRAINREPUBLIC - Powered by no-SQL

repoze.bfg

• BFG is a "pay only for what you eat" Python web framework

• based on WSGI (Web Server Gateway Interface)

• What makes BFG special:

• It's tested

• Simplicity

• Minimalism

• Documentation

• Speed
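
To make the WSGI bullet concrete, here is a minimal sketch of a plain WSGI application using only the Python standard library. It is not repoze.bfg code; the names `hello_app` and `call_app` are made up for illustration. It only shows the calling convention that BFG, like any WSGI framework, builds on.

```python
from wsgiref.util import setup_testing_defaults


def hello_app(environ, start_response):
    """A minimal WSGI application: a callable taking environ and start_response."""
    body = ("Hello from " + environ.get("PATH_INFO", "/")).encode("utf-8")
    headers = [
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ]
    start_response("200 OK", headers)
    # WSGI expects an iterable of byte strings as the response body.
    return [body]


def call_app(app, path):
    """Invoke a WSGI app in-process (no server) and capture its response."""
    environ = {"PATH_INFO": path}
    setup_testing_defaults(environ)  # fills in the remaining required keys
    captured = {}

    def start_response(status, headers, exc_info=None):
        captured["status"] = status
        captured["headers"] = dict(headers)

    body = b"".join(app(environ, start_response))
    return captured["status"], captured["headers"], body
```

Calling `call_app(hello_app, "/ping")` exercises the application without opening a socket, which is one reason small WSGI applications are easy to test.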


Page 20: BRAINREPUBLIC - Powered by no-SQL

Solr

• fulltext search engine based on Apache Lucene

• REST-style API for HTTP (XML/JSON)

• flexible field-based configuration through XML

• many plugins

• fast

• scales up/vertically (data partitioning)

• scales out/horizontally (clustering)
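
As a rough illustration of the REST-style API, the sketch below builds a Solr select URL that asks for JSON results. The base URL, the helper name `build_select_url`, and the `title` field in the usage note are assumptions, not details from the talk. The parameters `q`, `wt`, `rows` and `start` are standard Solr query parameters.

```python
from urllib.parse import urlencode

# Assumption: a local Solr instance on its default port.
SOLR_BASE = "http://localhost:8983/solr"


def build_select_url(query, rows=10, start=0, base=SOLR_BASE):
    """Build the URL for a Solr search request that returns JSON."""
    params = [
        ("q", query),      # the Lucene-style query string
        ("wt", "json"),    # response writer: JSON instead of the default XML
        ("rows", rows),    # page size
        ("start", start),  # offset of the first hit
    ]
    return base + "/select?" + urlencode(params)
```

For example, `build_select_url("title:mongodb", rows=5)` gives a URL that can be fetched with any HTTP client. The function only builds the URL and sends no request.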


Page 21: BRAINREPUBLIC - Powered by no-SQL

RabbitMQ

• AMQP (Advanced Message Queuing Protocol) based message queue

• open source (VMware)

• implemented in Erlang

• very fast (7500 messages/second)

• very stable

• flexible routing mechanisms for messages

• support for clustering

• implements producer & consumer pattern
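
The producer and consumer pattern can be sketched without a broker. The example below uses Python's in-process `queue.Queue` as a stand-in for a message queue; it does not speak AMQP and does not use any RabbitMQ client library. The names `producer`, `consumer` and `run_demo` are illustrative.

```python
import queue
import threading

STOP = object()  # sentinel telling the consumer that no more messages follow


def producer(q, messages):
    """Publish messages to the queue, then signal the end of the stream."""
    for message in messages:
        q.put(message)
    q.put(STOP)


def consumer(q, handled):
    """Take messages off the queue one by one and process them."""
    while True:
        message = q.get()
        if message is STOP:
            break
        handled.append(message.upper())  # stand-in for real work


def run_demo(messages):
    """Run one producer and one consumer, decoupled by the queue."""
    q = queue.Queue()
    handled = []
    worker = threading.Thread(target=consumer, args=(q, handled))
    worker.start()
    producer(q, messages)
    worker.join()
    return handled
```

The producer never calls the consumer directly; the queue is the only thing they share. A real broker adds what the slide lists on top of this: routing, clustering and delivery across processes and machines.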


Page 22: BRAINREPUBLIC - Powered by no-SQL

Possible database candidates

• ZODB (Zope Object DataBase)

• RDBMS

• „no-SQL“ databases

• key-value stores

• document-oriented databases:

• MongoDB

• CouchDB


Page 23: BRAINREPUBLIC - Powered by no-SQL

Evaluation of key-value stores

• breaking more complex data structures into key-value pairs is a pain

• Map-reduce is brainfuck

• implementations do not provide a „traditional“ query API
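
A small sketch of what breaking a nested structure into key-value pairs can look like. The key scheme (path segments joined with "/") and the sample record are invented for illustration and are not tied to any particular key-value store.

```python
def flatten(value, prefix=""):
    """Turn a nested dict/list structure into a flat {key: scalar} mapping."""
    pairs = {}
    if isinstance(value, dict):
        for key, child in value.items():
            pairs.update(flatten(child, prefix + "/" + str(key)))
    elif isinstance(value, list):
        for index, child in enumerate(value):
            pairs.update(flatten(child, prefix + "/" + str(index)))
    else:
        # A scalar: store it under the path accumulated so far.
        pairs[prefix] = value
    return pairs


# Hypothetical record; the field names are not from the talk.
profile = {
    "name": "Ada",
    "skills": ["python", "zope"],
    "address": {"city": "Berlin"},
}
flat = flatten(profile, "user:1")
```

One logical record becomes four separate keys here. Reading it back, updating the list, or asking "which users live in Berlin?" all need extra code on the application side, which is the kind of overhead the first bullet refers to.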


Page 24: BRAINREPUBLIC - Powered by no-SQL

Evaluation of document-oriented stores

• schema-less databases are nice

• easy to deal with requirement changes

• JSON suitable for complex data structures
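
To illustrate the schema-less point, the sketch below keeps two differently shaped documents side by side in a plain Python list that stands in for a collection. The documents and field names are hypothetical, and no database driver is involved.

```python
import json

# Hypothetical documents; a list stands in for a database collection.
collection = [
    {"_id": 1, "name": "Alice", "skills": ["python", "zope"]},
    # A later requirement adds a nested "address"; older documents stay as they are.
    {"_id": 2, "name": "Bob", "skills": ["plone"], "address": {"city": "Berlin"}},
]


def city_of(document):
    """Read an optional nested field, tolerating documents that lack it."""
    return document.get("address", {}).get("city")


def roundtrip(document):
    """Serialize to JSON and back to show the structure survives unchanged."""
    return json.loads(json.dumps(document))
```

No migration step is needed when the second document shape appears. The cost moves to the readers, which must cope with missing fields, as `city_of` does.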


Page 25: BRAINREPUBLIC - Powered by no-SQL

MongoDB | CouchDB
very fast (>10K ops/second) | pretty slow
native drivers (TCP/IP) | REST/HTTP API
Map-Reduce, rich query API | Map-Reduce
Master-Slave, Replica set, Sharding | easy replication


Page 26: BRAINREPUBLIC - Powered by no-SQL

So why MongoDB (and not CouchDB)?

• Performance, performance, performance

• implementing a fast system on top of HTTP-based web-services/APIs is a bad idea

• Rich query API (the world needs more than pure M-R)

• JSON-like queries are not my thing (better syntax needed?)
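
For readers who have not seen JSON-like queries, here is a toy in-memory matcher for a small subset of the MongoDB query style: equality plus the `$gt`, `$lt` and `$in` operators. It is a simplified sketch, not the MongoDB driver and not how the server evaluates queries. The sample data and the names `matches` and `find` are invented.

```python
def matches(document, query):
    """Return True if the document satisfies every condition in the query."""
    for field, condition in query.items():
        value = document.get(field)
        if isinstance(condition, dict):
            # Operator form, e.g. {"age": {"$gt": 30}}
            for op, operand in condition.items():
                if op == "$gt":
                    ok = value is not None and value > operand
                elif op == "$lt":
                    ok = value is not None and value < operand
                elif op == "$in":
                    ok = value in operand
                else:
                    raise ValueError("unsupported operator: " + op)
                if not ok:
                    return False
        elif value != condition:
            # Plain form, e.g. {"city": "Berlin"}: exact equality
            return False
    return True


def find(collection, query):
    """Filter a list of documents with a query dict."""
    return [doc for doc in collection if matches(doc, query)]


# Hypothetical sample data.
people = [
    {"name": "Alice", "city": "Berlin", "age": 34},
    {"name": "Bob", "city": "Berlin", "age": 28},
    {"name": "Carol", "city": "Essen", "age": 41},
]
```

A query such as `find(people, {"city": "Berlin", "age": {"$gt": 30}})` selects documents by value with no map or reduce function to write. The nested-dict syntax is also what the last bullet above is unhappy about.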


Page 27: BRAINREPUBLIC - Powered by no-SQL

BRAINREPUBLIC Architecture

• Front end (two stacks, HA Heartbeat): Varnish, HAProxy, App / App / App

• Back end: MongoDB, RabbitMQ, SOLR

• Payment, Billing


Page 28: BRAINREPUBLIC - Powered by no-SQL

Lessons learned/Looking back

• MongoDB is kind of the „Swiss army knife“ of the no-SQL DBs

• very fast and reliable

• very low entry-barrier

• easy programming

• offers more than Map-Reduce

• 10gen seems to have ambitious goals with MongoDB

• good documentation (updated website, books upcoming)

• very good community support (IRC, mailing list)


Page 29: BRAINREPUBLIC - Powered by no-SQL

My wish list...

• Poor replication performance (Master-Slave: 2.5-3 MB/sec)

• Indexes should fit completely into memory?

• A more fine-grained authentication model?

• Parallel map-reduce?

• Better usage of existing indexes (vs. compound indexes)?

• An alternative query API (not based on JSON) possible?


Page 30: BRAINREPUBLIC - Powered by no-SQL

www.brainrepublic.com

www.zopyxgroup.com
