Hermes: Free the Data! Distributed Computing with MongoDB
Hermes: Free the Data!
Distributed Applications with MongoDB
Presented by Warren Chang, VP Engineering @ Borderfree
2
Hermes {hur'-meez}
Hermès
Hermes 69230
3
Iconic French Brand since 1837
Known for luggage and handbags
4
(hur'-meez)
Greek God, Divine Messenger of the Gods
Link between the mortals and Olympians
5
Hermes
It’s a concept
Simple, Persisted, Message Bus
6
• Event-driven messaging
• Distributed Applications
• Flexible architecture
• Self-contained business rules
10
Let's take a step back
11
Consulting
Publishing
Ad Tech
12
60+ currencies
100+ countries
170+ customers
2014 GMV: $550M+
Borderfree helps online retailers go global
14
2012 Tech Stack Diversity
Java
15
2013 Tech Stack Diversity
Java, PHP
16
2014 Tech Stack Diversity
Java, PHP, Node.js, Python, Scala
17
THE CHALLENGE
18
The Challenge
• J2EE Best Practices
• Enterprise Application Server (WebLogic)
• Single Database (Oracle)
• Hibernate ORM
• 3-digit growth YOY
• Speed-to-market mentality
• Data Lock-in
19
WHAT DO WE DO?
20
The Challenge
What to do about the data?
• Decouple
• Stream
• Persist
• Monitor
21
SOLUTIONS
22
What we loved about it:
• Great messaging platform
• Highly scalable
• Reliable
What didn’t work for us:
• Lack of Persistence
• Administrative overhead
• Additional Infrastructure
• Complex message handling
23
What we loved about it:
• Highly Scalable
• Durable Messages
• Extensibility
What didn’t work for us:
• Expertise
• It is HARD!
• Additional Infrastructure
• Support
24
MongoDB
Features:
• Known & supported infrastructure
• Highly Scalable
• Fault Tolerant
• Simple & Flexible
Caveats:
• Initial sizing & design is critical
Components:
• Capped Collection
• Tail-able Cursor
• Replica set
• Flexible Schema
• Aggregation Framework
25
Hermes
26
Hermes Breakdown
{ Capped Collections }
http://docs.mongodb.org/v2.6/core/capped-collections/
Highlights:
• FIFO Queue
• Fixed Size Collection
• Guarantee sort by insertion order ($natural)
• Tail-able Cursor support!
• Very little support overhead
{Code}
use hermes
db.createCollection("orders", {'capped': true, 'size': 4000000000})
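For an application that creates the queue from Python rather than the mongo shell, a minimal sketch could look like the following. It assumes a PyMongo `Database` object is passed in; the helper name `ensure_capped_queue` and its defaults are illustrative and do not come from the talk.

```python
def ensure_capped_queue(db, name="orders", size_bytes=4000000000):
    """Return the capped collection backing the queue, creating it if needed.

    `db` is assumed to be a pymongo.database.Database. The helper name and
    defaults are hypothetical; the size mirrors the shell example.
    """
    if name in db.list_collection_names():
        # Already present: do not try to create it a second time.
        return db[name]
    # capped=True and size (in bytes) are passed through as collection options.
    return db.create_collection(name, capped=True, size=size_bytes)
```

Because the size of a capped collection is fixed at creation, the value passed here is the "initial sizing" decision the MongoDB slide flags as critical.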
27
Hermes Breakdown
{ JSON }
Basic Schema
{
  _id: ObjectId(),
  typ: (msg type: String),
  dt: MongoDate(),
  data: {
    /* flexible json object data here */
  }
}
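As an illustration of the basic schema, a producer might wrap each payload in the envelope before inserting it. This is only a sketch: the function name `make_message` and the message type `"order.updated"` are hypothetical, and `_id` is omitted because the driver assigns an ObjectId on insert.

```python
from datetime import datetime, timezone


def make_message(msg_type, data):
    """Wrap a payload in the basic envelope: typ, dt, data.

    `msg_type` is the message type string and `data` is any dict-like
    payload. `_id` is left for the driver to assign on insert.
    """
    if not isinstance(msg_type, str) or not msg_type:
        raise ValueError("msg_type must be a non-empty string")
    return {
        "typ": msg_type,
        "dt": datetime.now(timezone.utc),  # timezone-aware creation time
        "data": dict(data),                # copy so later edits do not leak in
    }
```

With PyMongo, publishing would then be a single call such as `coll.insert_one(make_message("order.updated", {"ORDER_ID": "1001"}))`.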
28
Hermes Breakdown
{ Tailable Cursor }
http://docs.mongodb.org/manual/tutorial/create-tailable-cursor/
Highlights:
• Non-exhausting cursor
• Query-’less’ feed of data
{code} Python
cursor = self.coll.find(tailable=True, await_data=True)
Or
cursor = self.coll.find({'_id': {'$gt': <objectId>}}, tailable=True, await_data=True)
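The `tailable` and `await_data` keyword arguments belong to PyMongo 2.x; PyMongo 3 and later replaced them with `cursor_type`. A hedged sketch of the same two cursors on a newer driver follows. The helper names `tail_filter` and `open_tail` are hypothetical, and `coll` is assumed to be a capped PyMongo collection.

```python
try:
    from pymongo import CursorType
except ImportError:  # PyMongo is only needed by open_tail()
    CursorType = None


def tail_filter(last_id=None):
    """Return the query for a tailing consumer.

    With no checkpoint, read the whole queue; otherwise read only documents
    whose _id is greater than the last one processed.
    """
    if last_id is None:
        return {}
    return {"_id": {"$gt": last_id}}


def open_tail(coll, last_id=None):
    """Open a tailable, awaiting cursor on a capped collection (PyMongo 3+)."""
    if CursorType is None:
        raise RuntimeError("PyMongo is required to open a cursor")
    return coll.find(tail_filter(last_id), cursor_type=CursorType.TAILABLE_AWAIT)
```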
29
Hermes Breakdown{ Replica Set }http://docs.mongodb.org/manual/core/replication/
Highlights:• Low overhead replication• Scalable redundancy• Optimize read/write efficiency
30
Simple Example
31
import pymongo

    # setup Mongo Connection (fragment from the consumer's setup code)
    try:
        self.conn = pymongo.MongoClient(host=servers)
        self.db = self.conn[db]
        self.coll = self.db[coll]
        self.startPoint = startpoint
    except Exception as e:
        self.logger.error(e)
        self.logger.error(str(self.__class__.__name__) + " :: Connection to Database Failed!")
        exit(1)

def run(self):
    self.logger.info(str(self.__class__.__name__) + " starting tailable cursor.")
    # tailable/await_data are PyMongo 2.x keyword arguments
    if self.startPoint is not None:
        # start from last queue position
        cursor = self.coll.find({'_id': {'$gt': self.startPoint}}, tailable=True, await_data=True)
    else:
        # start from top of queue
        cursor = self.coll.find(tailable=True, await_data=True)
    while cursor.alive:
        try:
            data = cursor.next()
            rec = data
            self.logger.debug("updt:" + rec['data']['UPDATED_DATE'] + " ordr:" + rec['data']['ORDER_ID'] + " objid: " + str(rec['_id']))
        except StopIteration:
            # no new documents yet; loop and wait on the cursor again
            self.logger.info(str(self.__class__.__name__) + " waiting ...")
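The example only logs each record. A consumer with its own business rules would typically route on the `typ` field and remember the last `_id` it handled, so a restart can resume from that position. The sketch below is one possible shape; `consume`, `handlers`, and `on_checkpoint` are hypothetical names, not part of the presented code.

```python
def consume(messages, handlers, on_checkpoint=None):
    """Dispatch each message to the handler registered for its `typ`.

    `messages` is any iterable of documents in the basic schema, for example
    a tailable cursor. Messages with no registered handler are skipped.
    Returns the `_id` of the last message seen, or None if there were none.
    """
    last_id = None
    for msg in messages:
        handler = handlers.get(msg.get("typ"))
        if handler is not None:
            handler(msg["data"])
        last_id = msg["_id"]
        if on_checkpoint is not None:
            # e.g. persist last_id so the next run can start after it
            on_checkpoint(last_id)
    return last_id
```

With a tailable cursor the `for` loop ends whenever no data is currently available, so a long-running consumer would call `consume` repeatedly while `cursor.alive` is true.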
32
Core System Migration
33
34
Questions?