unpredictable interactive terabytes of data laurent dolle.pdf · a mongodb daemon (mongod)...
TRANSCRIPT
![Page 1: Unpredictable interactive terabytes of data Laurent DOLLE.pdf · A MongoDB daemon (mongod) processes any incoming query on a single thread. Modern hardware architectures features](https://reader034.vdocuments.net/reader034/viewer/2022042307/5ed2ec24d33bf86ae87958f1/html5/thumbnails/1.jpg)
Insert Co-branding logo 1. Click on placeholder 2. Click ’Insert’ 3. Click ‘Picture’ 4. Locate the co-branding logo, click Insert 5. Align with bottom line of amadeus-logo
Unpredictable & interactive analysis of terabytes of data
Amadeus Revenue Accounting Metadata Search
Big Data Paris, 11 March 2015
Laurent Dollé [email protected]
265ced1609a17cf1a5979880a2ad364653895ae8
![Page 2: Unpredictable interactive terabytes of data Laurent DOLLE.pdf · A MongoDB daemon (mongod) processes any incoming query on a single thread. Modern hardware architectures features](https://reader034.vdocuments.net/reader034/viewer/2022042307/5ed2ec24d33bf86ae87958f1/html5/thumbnails/2.jpg)
Change the Year in the Copyright field 1. Click ‘Insert’ in Top menu 2. Click ’Header & Footer’ 3. Write new Year in field ‘Footer’ 4. Click ‘Apply to All’
Amadeus today
1
265ced1609a17cf1a5979880a2ad364653895ae8
![Page 3: Unpredictable interactive terabytes of data Laurent DOLLE.pdf · A MongoDB daemon (mongod) processes any incoming query on a single thread. Modern hardware architectures features](https://reader034.vdocuments.net/reader034/viewer/2022042307/5ed2ec24d33bf86ae87958f1/html5/thumbnails/3.jpg)
Change the Year in the Copyright field 1. Click ‘Insert’ in Top menu 2. Click ’Header & Footer’ 3. Write new Year in field ‘Footer’ 4. Click ‘Apply to All’
Amadeus In a few words
Amadeus is a technology company dedicated to the
global travel industry.
We are present in 195 countries with a worldwide team of more than 11,000 people.
Our solutions help improve the
business performance of travel agencies, corporations, airlines,
airports, hotels, railways and more.
![Page 4: Unpredictable interactive terabytes of data Laurent DOLLE.pdf · A MongoDB daemon (mongod) processes any incoming query on a single thread. Modern hardware architectures features](https://reader034.vdocuments.net/reader034/viewer/2022042307/5ed2ec24d33bf86ae87958f1/html5/thumbnails/4.jpg)
Change the Year in the Copyright field 1. Click ‘Insert’ in Top menu 2. Click ’Header & Footer’ 3. Write new Year in field ‘Footer’ 4. Click ‘Apply to All’
Connecting The travel industry
Cruiselines
Hotels
Car rental
Ground handlers
Ferry operators
Ground transportation
Airports
Travel agencies
Insurance companies
Airlines
![Page 5: Unpredictable interactive terabytes of data Laurent DOLLE.pdf · A MongoDB daemon (mongod) processes any incoming query on a single thread. Modern hardware architectures features](https://reader034.vdocuments.net/reader034/viewer/2022042307/5ed2ec24d33bf86ae87958f1/html5/thumbnails/5.jpg)
Change the Year in the Copyright field 1. Click ‘Insert’ in Top menu 2. Click ’Header & Footer’ 3. Write new Year in field ‘Footer’ 4. Click ‘Apply to All’
Supporting The traveler life cycle
Post-trip
On trip
Pre-trip Buy/Purchase
Search
Inspire
![Page 6: Unpredictable interactive terabytes of data Laurent DOLLE.pdf · A MongoDB daemon (mongod) processes any incoming query on a single thread. Modern hardware architectures features](https://reader034.vdocuments.net/reader034/viewer/2022042307/5ed2ec24d33bf86ae87958f1/html5/thumbnails/6.jpg)
Change the Year in the Copyright field 1. Click ‘Insert’ in Top menu 2. Click ’Header & Footer’ 3. Write new Year in field ‘Footer’ 4. Click ‘Apply to All’
Robust Global operations
We designed & own our Data Processing Centres _ Central DC @ Erding, Germany
_ Remote DCs all over the globe
_ Recovery DC on standby in case of natural disasters
1.6+ billion transactions
processed per day
502+ million travel agency bookings processed in 2013
615+ million Passengers Boarded in 2013
95% of the world’s scheduled network
airline seats
![Page 7: Unpredictable interactive terabytes of data Laurent DOLLE.pdf · A MongoDB daemon (mongod) processes any incoming query on a single thread. Modern hardware architectures features](https://reader034.vdocuments.net/reader034/viewer/2022042307/5ed2ec24d33bf86ae87958f1/html5/thumbnails/7.jpg)
Change the Year in the Copyright field 1. Click ‘Insert’ in Top menu 2. Click ’Header & Footer’ 3. Write new Year in field ‘Footer’ 4. Click ‘Apply to All’
Close To our customers
![Page 8: Unpredictable interactive terabytes of data Laurent DOLLE.pdf · A MongoDB daemon (mongod) processes any incoming query on a single thread. Modern hardware architectures features](https://reader034.vdocuments.net/reader034/viewer/2022042307/5ed2ec24d33bf86ae87958f1/html5/thumbnails/8.jpg)
Change the Year in the Copyright field 1. Click ‘Insert’ in Top menu 2. Click ’Header & Footer’ 3. Write new Year in field ‘Footer’ 4. Click ‘Apply to All’
Our commitment To innovation
_ Amadeus has invested €2.9bn in
Research & Development since 2004.
_ Nominated within “top 3” software companies in 2013 European Union Industrial R&D Investment Scorecard.
![Page 9: Unpredictable interactive terabytes of data Laurent DOLLE.pdf · A MongoDB daemon (mongod) processes any incoming query on a single thread. Modern hardware architectures features](https://reader034.vdocuments.net/reader034/viewer/2022042307/5ed2ec24d33bf86ae87958f1/html5/thumbnails/9.jpg)
Change the Year in the Copyright field 1. Click ‘Insert’ in Top menu 2. Click ’Header & Footer’ 3. Write new Year in field ‘Footer’ 4. Click ‘Apply to All’
Amadeus growth is powered by a
sustainable
transaction-based business model
Global air travel Is a growth industry
Source: IATA. Airline Industry forecast 2013-2017
2.98 billion air passengers
2012 2017
3.91 billion air passengers
31 % growth
![Page 10: Unpredictable interactive terabytes of data Laurent DOLLE.pdf · A MongoDB daemon (mongod) processes any incoming query on a single thread. Modern hardware architectures features](https://reader034.vdocuments.net/reader034/viewer/2022042307/5ed2ec24d33bf86ae87958f1/html5/thumbnails/10.jpg)
Change the Year in the Copyright field 1. Click ‘Insert’ in Top menu 2. Click ’Header & Footer’ 3. Write new Year in field ‘Footer’ 4. Click ‘Apply to All’
Amadeus Revenue Accounting
2
265ced1609a17cf1a5979880a2ad364653895ae8
![Page 11: Unpredictable interactive terabytes of data Laurent DOLLE.pdf · A MongoDB daemon (mongod) processes any incoming query on a single thread. Modern hardware architectures features](https://reader034.vdocuments.net/reader034/viewer/2022042307/5ed2ec24d33bf86ae87958f1/html5/thumbnails/11.jpg)
Change the Year in the Copyright field 1. Click ‘Insert’ in Top menu 2. Click ’Header & Footer’ 3. Write new Year in field ‘Footer’ 4. Click ‘Apply to All’
Revenue of a flight ticket
is shared
_ Travel agent
_ Governments
_ Airlines: many can be involved
(marketing & operating)
What for?
Passenger Revenue Accounting
Amadeus Revenue Accounting handles cash flows
on behalf of airlines
_ Tracking
_ Error handling & optimisation
_ Reporting: analysis & audit
![Page 12: Unpredictable interactive terabytes of data Laurent DOLLE.pdf · A MongoDB daemon (mongod) processes any incoming query on a single thread. Modern hardware architectures features](https://reader034.vdocuments.net/reader034/viewer/2022042307/5ed2ec24d33bf86ae87958f1/html5/thumbnails/12.jpg)
Change the Year in the Copyright field 1. Click ‘Insert’ in Top menu 2. Click ’Header & Footer’ 3. Write new Year in field ‘Footer’ 4. Click ‘Apply to All’
Distribution IT
• Data centres
• Platforms and applications
• Sales & marketing infrastructure
• Customers
In common
Increasing accuracy By leveraging our GDS position
Real-time tracking of airline’s
passenger sales revenue
_ at usage time: effective revenue
_ at sale time, weeks before:
expected revenue
![Page 13: Unpredictable interactive terabytes of data Laurent DOLLE.pdf · A MongoDB daemon (mongod) processes any incoming query on a single thread. Modern hardware architectures features](https://reader034.vdocuments.net/reader034/viewer/2022042307/5ed2ec24d33bf86ae87958f1/html5/thumbnails/13.jpg)
Change the Year in the Copyright field 1. Click ‘Insert’ in Top menu 2. Click ’Header & Footer’ 3. Write new Year in field ‘Footer’ 4. Click ‘Apply to All’
_Facilitate
strategic decisions
_Optimise revenue accounting
processes
Amadeus Revenue Accounting Key benefits & features
Web apps, APIs & feeds hosted in the Amadeus cloud (SaaS)
![Page 14: Unpredictable interactive terabytes of data Laurent DOLLE.pdf · A MongoDB daemon (mongod) processes any incoming query on a single thread. Modern hardware architectures features](https://reader034.vdocuments.net/reader034/viewer/2022042307/5ed2ec24d33bf86ae87958f1/html5/thumbnails/14.jpg)
Change the Year in the Copyright field 1. Click ‘Insert’ in Top menu 2. Click ’Header & Footer’ 3. Write new Year in field ‘Footer’ 4. Click ‘Apply to All’
Metadata Search business needs
3
265ced1609a17cf1a5979880a2ad364653895ae8
![Page 15: Unpredictable interactive terabytes of data Laurent DOLLE.pdf · A MongoDB daemon (mongod) processes any incoming query on a single thread. Modern hardware architectures features](https://reader034.vdocuments.net/reader034/viewer/2022042307/5ed2ec24d33bf86ae87958f1/html5/thumbnails/15.jpg)
Change the Year in the Copyright field 1. Click ‘Insert’ in Top menu 2. Click ’Header & Footer’ 3. Write new Year in field ‘Footer’ 4. Click ‘Apply to All’
One of our launch partners is a
large European airline
_ transporting 35m+ passengers a year
_ key player in the
revenue accounting industry
Business needs Gathered from a launch partner
They requested a user-friendly way to query any data in our main operational database
_ Unpredictable ad-hoc search
_ Many advanced reporting requirements
Migrating
_ from their
in-house data warehouse
_ to our
cloud-based solution
![Page 16: Unpredictable interactive terabytes of data Laurent DOLLE.pdf · A MongoDB daemon (mongod) processes any incoming query on a single thread. Modern hardware architectures features](https://reader034.vdocuments.net/reader034/viewer/2022042307/5ed2ec24d33bf86ae87958f1/html5/thumbnails/16.jpg)
Change the Year in the Copyright field 1. Click ‘Insert’ in Top menu 2. Click ’Header & Footer’ 3. Write new Year in field ‘Footer’ 4. Click ‘Apply to All’
_Graphical user interface edit, import, save & share queries
_Data warehouse fed in real time 4 years history (140m+ documents, versioned)
_ Interactive response times
_ Search further using
chained queries (patent pending)
Metadata Search The main promises
![Page 17: Unpredictable interactive terabytes of data Laurent DOLLE.pdf · A MongoDB daemon (mongod) processes any incoming query on a single thread. Modern hardware architectures features](https://reader034.vdocuments.net/reader034/viewer/2022042307/5ed2ec24d33bf86ae87958f1/html5/thumbnails/17.jpg)
Change the Year in the Copyright field 1. Click ‘Insert’ in Top menu 2. Click ’Header & Footer’ 3. Write new Year in field ‘Footer’ 4. Click ‘Apply to All’
November 2013 User acceptance testing
December 2014 Migration & parallel running validation on production
Summer 2015 Production cut-over
Post cut-over SLA & optimisation based on usage statistics
Project milestones And possible impacts
Any delay or functional gap may
impact the whole project as application is used to validate
migration and parallel running phases.
![Page 18: Unpredictable interactive terabytes of data Laurent DOLLE.pdf · A MongoDB daemon (mongod) processes any incoming query on a single thread. Modern hardware architectures features](https://reader034.vdocuments.net/reader034/viewer/2022042307/5ed2ec24d33bf86ae87958f1/html5/thumbnails/18.jpg)
Change the Year in the Copyright field 1. Click ‘Insert’ in Top menu 2. Click ’Header & Footer’ 3. Write new Year in field ‘Footer’ 4. Click ‘Apply to All’
User-friendly
SQL graphical user interface
4
265ced1609a17cf1a5979880a2ad364653895ae8
![Page 19: Unpredictable interactive terabytes of data Laurent DOLLE.pdf · A MongoDB daemon (mongod) processes any incoming query on a single thread. Modern hardware architectures features](https://reader034.vdocuments.net/reader034/viewer/2022042307/5ed2ec24d33bf86ae87958f1/html5/thumbnails/19.jpg)
Change the Year in the Copyright field 1. Click ‘Insert’ in Top menu 2. Click ’Header & Footer’ 3. Write new Year in field ‘Footer’ 4. Click ‘Apply to All’
SQL paradigm Split into 2 functional areas
2 functional areas can be defined
_ Search criteria predicates filtering the results
_ Displayed data projections and related functions
SELECT A, SUM(B) WHERE A > C AND B > D GROUP BY A ORDER BY A
![Page 20: Unpredictable interactive terabytes of data Laurent DOLLE.pdf · A MongoDB daemon (mongod) processes any incoming query on a single thread. Modern hardware architectures features](https://reader034.vdocuments.net/reader034/viewer/2022042307/5ed2ec24d33bf86ae87958f1/html5/thumbnails/20.jpg)
Change the Year in the Copyright field 1. Click ‘Insert’ in Top menu 2. Click ’Header & Footer’ 3. Write new Year in field ‘Footer’ 4. Click ‘Apply to All’
Graphical user interface Query editor
![Page 21: Unpredictable interactive terabytes of data Laurent DOLLE.pdf · A MongoDB daemon (mongod) processes any incoming query on a single thread. Modern hardware architectures features](https://reader034.vdocuments.net/reader034/viewer/2022042307/5ed2ec24d33bf86ae87958f1/html5/thumbnails/21.jpg)
Change the Year in the Copyright field 1. Click ‘Insert’ in Top menu 2. Click ’Header & Footer’ 3. Write new Year in field ‘Footer’ 4. Click ‘Apply to All’
Graphical user interface Query editor
![Page 22: Unpredictable interactive terabytes of data Laurent DOLLE.pdf · A MongoDB daemon (mongod) processes any incoming query on a single thread. Modern hardware architectures features](https://reader034.vdocuments.net/reader034/viewer/2022042307/5ed2ec24d33bf86ae87958f1/html5/thumbnails/22.jpg)
Change the Year in the Copyright field 1. Click ‘Insert’ in Top menu 2. Click ’Header & Footer’ 3. Write new Year in field ‘Footer’ 4. Click ‘Apply to All’
Technical constraints
5
265ced1609a17cf1a5979880a2ad364653895ae8
![Page 23: Unpredictable interactive terabytes of data Laurent DOLLE.pdf · A MongoDB daemon (mongod) processes any incoming query on a single thread. Modern hardware architectures features](https://reader034.vdocuments.net/reader034/viewer/2022042307/5ed2ec24d33bf86ae87958f1/html5/thumbnails/23.jpg)
Change the Year in the Copyright field 1. Click ‘Insert’ in Top menu 2. Click ’Header & Footer’ 3. Write new Year in field ‘Footer’ 4. Click ‘Apply to All’
Expecting fast answer to unpredictable queries
No index, no hint (almost)
_ Fields to be scanned unknown
_ Main-memory full scans to decrease response time
Need to scale out for sustainable performances
Support mainstream SQL DML statements
_ Aggregation
_ Cross-column comparison, Boolean logic
_ Sort
![Page 24: Unpredictable interactive terabytes of data Laurent DOLLE.pdf · A MongoDB daemon (mongod) processes any incoming query on a single thread. Modern hardware architectures features](https://reader034.vdocuments.net/reader034/viewer/2022042307/5ed2ec24d33bf86ae87958f1/html5/thumbnails/24.jpg)
Change the Year in the Copyright field 1. Click ‘Insert’ in Top menu 2. Click ’Header & Footer’ 3. Write new Year in field ‘Footer’ 4. Click ‘Apply to All’
Document timeline implemented to
retrieve efficiently the particular version of a document
based on arbitrary date, event name, flags
Efficient upserts & transactions needed to
replace or update multiple versions at each write
Resilient & user-friendly versioning Featuring a document timeline
1.0 Issuance
1.1 Issuance confirmation
2.0 Exchange
Timeline 3.0 Usage
3.1 Usage (replay)
3.2 Usage (replay)
Events out of timeline 2.1 Exchange (replay)
4.0 Exchange
conflict: 3.2 bumped out of timeline conflict
last issuance confirmation last 2.x last usage last issuance
last 1.x last 3.x last exchange
final event last 4.x
Flags
![Page 25: Unpredictable interactive terabytes of data Laurent DOLLE.pdf · A MongoDB daemon (mongod) processes any incoming query on a single thread. Modern hardware architectures features](https://reader034.vdocuments.net/reader034/viewer/2022042307/5ed2ec24d33bf86ae87958f1/html5/thumbnails/25.jpg)
Change the Year in the Copyright field 1. Click ‘Insert’ in Top menu 2. Click ’Header & Footer’ 3. Write new Year in field ‘Footer’ 4. Click ‘Apply to All’
Our main operational database is an Oracle document store containing
Protocol Buffers documents
(4000+ fields)
A schema-less document store would ease
_ the ETL transformation process
(400+ metadata fields to load)
_ the data model maintenance & synchronization between both databases
Schema-less document store For agile integration
![Page 26: Unpredictable interactive terabytes of data Laurent DOLLE.pdf · A MongoDB daemon (mongod) processes any incoming query on a single thread. Modern hardware architectures features](https://reader034.vdocuments.net/reader034/viewer/2022042307/5ed2ec24d33bf86ae87958f1/html5/thumbnails/26.jpg)
Change the Year in the Copyright field 1. Click ‘Insert’ in Top menu 2. Click ’Header & Footer’ 3. Write new Year in field ‘Footer’ 4. Click ‘Apply to All’
Consistency favoured over availability (CAP)
_ Expecting accuracy since data used by auditors
_ However: no operational impact application is not MCA
No contractual SLA
_ To be agreed after benchmarking on production
_ Interactive response times expected
with very few parallel users
_ Full outages out of business hours accepted
Consistency & availability And their impacts
![Page 27: Unpredictable interactive terabytes of data Laurent DOLLE.pdf · A MongoDB daemon (mongod) processes any incoming query on a single thread. Modern hardware architectures features](https://reader034.vdocuments.net/reader034/viewer/2022042307/5ed2ec24d33bf86ae87958f1/html5/thumbnails/27.jpg)
Change the Year in the Copyright field 1. Click ‘Insert’ in Top menu 2. Click ’Header & Footer’ 3. Write new Year in field ‘Footer’ 4. Click ‘Apply to All’
Runs on standard x86 architecture
C++, Python & Java drivers
Enterprise-grade security
_ SSL encryption
_ Kerberos authentication
_ Data-at-rest encryption
Integration In the Amadeus standards
![Page 28: Unpredictable interactive terabytes of data Laurent DOLLE.pdf · A MongoDB daemon (mongod) processes any incoming query on a single thread. Modern hardware architectures features](https://reader034.vdocuments.net/reader034/viewer/2022042307/5ed2ec24d33bf86ae87958f1/html5/thumbnails/28.jpg)
Change the Year in the Copyright field 1. Click ‘Insert’ in Top menu 2. Click ’Header & Footer’ 3. Write new Year in field ‘Footer’ 4. Click ‘Apply to All’
_ Oracle Mounting all data in memory is irrelevant for cost & hardware reasons: 90TB for our biggest prospect.
_ MySQL cluster Technical & functional limitations,
complex to implement & maintain.
_ Impala Still young, with a steep learning curve. Distributed data analysis not exactly matching our use-case.
Considered alternatives To MongoDB
_ Couchbase Slightly behind MongoDB for document
search (index mandatory).
N1QL not finalized.
Key-value store not exactly matching our use-case.
_ Crescando Amadeus in-house R&D database engine
(index-less, main-memory only,
partitioning data at CPU core level).
Project terminated.
![Page 29: Unpredictable interactive terabytes of data Laurent DOLLE.pdf · A MongoDB daemon (mongod) processes any incoming query on a single thread. Modern hardware architectures features](https://reader034.vdocuments.net/reader034/viewer/2022042307/5ed2ec24d33bf86ae87958f1/html5/thumbnails/29.jpg)
Change the Year in the Copyright field 1. Click ‘Insert’ in Top menu 2. Click ’Header & Footer’ 3. Write new Year in field ‘Footer’ 4. Click ‘Apply to All’
Technical architecture
6
265ced1609a17cf1a5979880a2ad364653895ae8
![Page 30: Unpredictable interactive terabytes of data Laurent DOLLE.pdf · A MongoDB daemon (mongod) processes any incoming query on a single thread. Modern hardware architectures features](https://reader034.vdocuments.net/reader034/viewer/2022042307/5ed2ec24d33bf86ae87958f1/html5/thumbnails/30.jpg)
Change the Year in the Copyright field 1. Click ‘Insert’ in Top menu 2. Click ’Header & Footer’ 3. Write new Year in field ‘Footer’ 4. Click ‘Apply to All’
Microsharding solves this issue.
Database is highly sharded – as many shards as cores –
so that each shard spawns its own thread,
thus sharing efficiently the workload on the whole CPU power.
Enforcing parallel processing To speed up aggregation queries
A MongoDB daemon (mongod) processes
any incoming query on a single thread.
Modern hardware architectures features
many sockets (2-4) and many cores (8-16),
meaning wasted computing power
if we do not enforce parallel processing.
Our online analytical processing use-case implies
intense workload (full scans)
with limited concurrency as queries are queued and
run sequentially.
![Page 31: Unpredictable interactive terabytes of data Laurent DOLLE.pdf · A MongoDB daemon (mongod) processes any incoming query on a single thread. Modern hardware architectures features](https://reader034.vdocuments.net/reader034/viewer/2022042307/5ed2ec24d33bf86ae87958f1/html5/thumbnails/31.jpg)
Change the Year in the Copyright field 1. Click ‘Insert’ in Top menu 2. Click ’Header & Footer’ 3. Write new Year in field ‘Footer’ 4. Click ‘Apply to All’
_Microsharding validated, from 6 to 48 shards on 6 physical servers
Performances increase almost linearly in respect to the number of shards
_On-the-fly rebalancing validated Cleaning step is mandatory (12 shards and +)
Benchmarking CPU usage Through in-memory microsharding
0
50
100
150
200
250
300
350
400
0 10 20 30 40 50 60
tim
e
shards
Full scan
0
200
400
600
800
1000
1200
1400
1600
1800
0 10 20 30 40 50 60
tim
e
shards
Full scan with aggregation
![Page 32: Unpredictable interactive terabytes of data Laurent DOLLE.pdf · A MongoDB daemon (mongod) processes any incoming query on a single thread. Modern hardware architectures features](https://reader034.vdocuments.net/reader034/viewer/2022042307/5ed2ec24d33bf86ae87958f1/html5/thumbnails/32.jpg)
Change the Year in the Copyright field 1. Click ‘Insert’ in Top menu 2. Click ’Header & Footer’ 3. Write new Year in field ‘Footer’ 4. Click ‘Apply to All’
_ Performances increase linearly in respect to the amount of scanned data
_ Positive impact of caching (light blue dots) validated on full scans only
Benchmarking scalability Through data ramp-up
0
2
4
6
8
10
12
0 200 400 600 800 1000 1200
tim
e
data size
Full scan
0
100
200
300
400
500
0 200 400 600 800 1000 1200
tim
e
data size
Full scan with aggregation
Behaviour reproduced for 2 shard distributions 24 & 48 shards on 6 physical servers, 100% in-memory
![Page 33: Unpredictable interactive terabytes of data Laurent DOLLE.pdf · A MongoDB daemon (mongod) processes any incoming query on a single thread. Modern hardware architectures features](https://reader034.vdocuments.net/reader034/viewer/2022042307/5ed2ec24d33bf86ae87958f1/html5/thumbnails/33.jpg)
Change the Year in the Copyright field 1. Click ‘Insert’ in Top menu 2. Click ’Header & Footer’ 3. Write new Year in field ‘Footer’ 4. Click ‘Apply to All’
Benchmarking scalability Through generated search criteria
0
2
4
6
8
10
0 10000 20000 30000 40000 50000
tim
e
search criteria pairs (A and B)
Full scan: OR & AND
0
0,5
1
1,5
2
0 10000 20000 30000 40000 50000
tim
e
search criteria
Full scan: IN
_ Performances increase linearly in respect to the amount of search criteria
![Page 34: Unpredictable interactive terabytes of data Laurent DOLLE.pdf · A MongoDB daemon (mongod) processes any incoming query on a single thread. Modern hardware architectures features](https://reader034.vdocuments.net/reader034/viewer/2022042307/5ed2ec24d33bf86ae87958f1/html5/thumbnails/34.jpg)
Change the Year in the Copyright field 1. Click ‘Insert’ in Top menu 2. Click ’Header & Footer’ 3. Write new Year in field ‘Footer’ 4. Click ‘Apply to All’
6 physical data servers
_ Server HP ProLiant DL580 Gen8
4 sockets, x86, rack
_ 4x CPU Intel Xeon E7-4850 v2
2.30 GHz, 12 physical cores
_ RAM 512GB 40GB/s scanning speed
_ 2x flash cards Fusion-io ioScale 3.2TB 1.5GB/s read
3 virtual config servers
_ RAM 8GB
Production cluster setup Facts & figures
Overall cluster
_ 288 cores, 288 sharded replica sets (2x+1)
_ 3TB RAM, 38.4TB flash card storage
Currently 1 year of production data (4 expected)
_ 250m+ docs (1bn)
_ Data size 2.8TB (11TB) docs with padding
_ Average object size 11.9KB
_ File size 3.97TB (16TB) data & index extents
![Page 35: Unpredictable interactive terabytes of data Laurent DOLLE.pdf · A MongoDB daemon (mongod) processes any incoming query on a single thread. Modern hardware architectures features](https://reader034.vdocuments.net/reader034/viewer/2022042307/5ed2ec24d33bf86ae87958f1/html5/thumbnails/35.jpg)
Change the Year in the Copyright field 1. Click ‘Insert’ in Top menu 2. Click ’Header & Footer’ 3. Write new Year in field ‘Footer’ 4. Click ‘Apply to All’
265ced1609a17cf1a5979880a2ad364653895ae8
Input queue
Error queue
RA
wo
rkfl
ow
Revenue Accounting operational database
Write
Read
REV
Sharded replica sets
Config servers
1st 2nd x
Mongo daemons & arbiter
Shell & drivers (C++, Python, Java)
mongoimport initial/massive feed
live feed
REV OBE BATCH CLUSTER - SLES
MONGODB CLUSTER - RHEL
on-call, debugging & ad-hoc investigation
AQG lib C++ driver
Shard router
service
live trigger
MSG live
gateway
Shard router
applicative
Shard router
applicative
REV OBE OLTP CLUSTER - SLES
SI
https
Browser
corrective feed
MSF front-end
edifact
JSON files
MSG batch
gateway AQG lib
C++ driver ORACLE C
LU
STER
Technical architecture
![Page 36: Unpredictable interactive terabytes of data Laurent DOLLE.pdf · A MongoDB daemon (mongod) processes any incoming query on a single thread. Modern hardware architectures features](https://reader034.vdocuments.net/reader034/viewer/2022042307/5ed2ec24d33bf86ae87958f1/html5/thumbnails/36.jpg)
Change the Year in the Copyright field 1. Click ‘Insert’ in Top menu 2. Click ’Header & Footer’ 3. Write new Year in field ‘Footer’ 4. Click ‘Apply to All’
Microsharding is a powerful way to increase response times, what else can bring value?
Database customisation And its results
NUMA
Kernel tuning
Striped replica set
Cgroups
Cgroups Prevent shards from competing for memory when data does not fit into RAM – especially with microsharding. Low-memory Cgroups may be compressed with zRAM/WiredTiger.
Kernel tuning Optimize Linux in case of CPU-bound effort (vs. IO-bound): small readahead, THP off, increase task scheduler.
NUMA Restrict access to CPU & memory for secondary daemons.
Striped replica set Span shards on all the available hardware, with secondary daemons replicated on different nodes for smooth failover.
![Page 37: Unpredictable interactive terabytes of data Laurent DOLLE.pdf · A MongoDB daemon (mongod) processes any incoming query on a single thread. Modern hardware architectures features](https://reader034.vdocuments.net/reader034/viewer/2022042307/5ed2ec24d33bf86ae87958f1/html5/thumbnails/37.jpg)
Change the Year in the Copyright field 1. Click ‘Insert’ in Top menu 2. Click ’Header & Footer’ 3. Write new Year in field ‘Footer’ 4. Click ‘Apply to All’
High availability & fault tolerance
265ced1609a17cf1a5979880a2ad364653895ae8
Mongo daemon
Mongo daemon
Mongo daemon
1st 2nd
Mongo daemons
1st 2nd x
Mongo daemons & arbiter
1st 2nd x
Mongo daemons & arbiter
1st 2nd x
Mongo daemons & arbiter
1st 2nd x
Mongo daemons & arbiter
1st 2nd x
Mongo daemons & arbiter
1st 2nd x
Mongo daemons & arbiter
2nd
1st 2nd
Mongo daemons
2nd
1st 2nd
Mongo daemons
2nd
UNSHARDED DATABASE SHARDS SHARDED REPLICA SETS SHARDED REPLICA SETS STRIPED & SHARDED REPLICA SETS
_ Many options & combinations possible
_ Updates performed on-the-fly
Horizontal scaling through sharding
High availability through replication (primary & secondary shards)
Cheaper, relaxed high-availability through arbiters (empty shards)
Hardware fault-tolerance through physical servers
C B A
Shard, replicate & stripe
![Page 38: Unpredictable interactive terabytes of data Laurent DOLLE.pdf · A MongoDB daemon (mongod) processes any incoming query on a single thread. Modern hardware architectures features](https://reader034.vdocuments.net/reader034/viewer/2022042307/5ed2ec24d33bf86ae87958f1/html5/thumbnails/38.jpg)
Change the Year in the Copyright field 1. Click ‘Insert’ in Top menu 2. Click ’Header & Footer’ 3. Write new Year in field ‘Footer’ 4. Click ‘Apply to All’
Production benchmarks
7
265ced1609a17cf1a5979880a2ad364653895ae8
![Page 39: Unpredictable interactive terabytes of data Laurent DOLLE.pdf · A MongoDB daemon (mongod) processes any incoming query on a single thread. Modern hardware architectures features](https://reader034.vdocuments.net/reader034/viewer/2022042307/5ed2ec24d33bf86ae87958f1/html5/thumbnails/39.jpg)
Change the Year in the Copyright field 1. Click ‘Insert’ in Top menu 2. Click ’Header & Footer’ 3. Write new Year in field ‘Footer’ 4. Click ‘Apply to All’
Full scan aggregation is CPU-bound,
with a fixed entry cost for unwinds.
_ no unwind 3s
_ unwinds on 1, 2 or 3 levels 70s
Interactive response times promise is complied with
on basic use-cases
In the absence of concurrency,
response times are consistent across all tests.
Production response times And their lessons learnt
Indexes have a linear impact on response times.
Complex query with 4 match criteria
_ full scan 100s
_ index, 40% selectivity 40s
Complex query with 4 match criteria,
including field-on-field comparison
_ full scan 190s
_ index, 40% selectivity 70s
_ index, 75% selectivity 145s
Position of the match operator in the
aggregation pipeline can impact index usage.
![Page 40: Unpredictable interactive terabytes of data Laurent DOLLE.pdf · A MongoDB daemon (mongod) processes any incoming query on a single thread. Modern hardware architectures features](https://reader034.vdocuments.net/reader034/viewer/2022042307/5ed2ec24d33bf86ae87958f1/html5/thumbnails/40.jpg)
Change the Year in the Copyright field 1. Click ‘Insert’ in Top menu 2. Click ’Header & Footer’ 3. Write new Year in field ‘Footer’ 4. Click ‘Apply to All’
Integrated monitoring
8
265ced1609a17cf1a5979880a2ad364653895ae8
![Page 41: Unpredictable interactive terabytes of data Laurent DOLLE.pdf · A MongoDB daemon (mongod) processes any incoming query on a single thread. Modern hardware architectures features](https://reader034.vdocuments.net/reader034/viewer/2022042307/5ed2ec24d33bf86ae87958f1/html5/thumbnails/41.jpg)
Change the Year in the Copyright field 1. Click ‘Insert’ in Top menu 2. Click ’Header & Footer’ 3. Write new Year in field ‘Footer’ 4. Click ‘Apply to All’
Ops Manager Flavours
MongoDB Ops Manager can be run
_ in the cloud
_ on premise
On-prem version features
_ an admin GUI
_ a monitoring API
![Page 42: Unpredictable interactive terabytes of data Laurent DOLLE.pdf · A MongoDB daemon (mongod) processes any incoming query on a single thread. Modern hardware architectures features](https://reader034.vdocuments.net/reader034/viewer/2022042307/5ed2ec24d33bf86ae87958f1/html5/thumbnails/42.jpg)
Change the Year in the Copyright field 1. Click ‘Insert’ in Top menu 2. Click ’Header & Footer’ 3. Write new Year in field ‘Footer’ 4. Click ‘Apply to All’
Ops Manager API Integrated in topology explorer
![Page 43: Unpredictable interactive terabytes of data Laurent DOLLE.pdf · A MongoDB daemon (mongod) processes any incoming query on a single thread. Modern hardware architectures features](https://reader034.vdocuments.net/reader034/viewer/2022042307/5ed2ec24d33bf86ae87958f1/html5/thumbnails/43.jpg)
Change the Year in the Copyright field 1. Click ‘Insert’ in Top menu 2. Click ’Header & Footer’ 3. Write new Year in field ‘Footer’ 4. Click ‘Apply to All’
Ops Manager API Integrated in ping watchdog
![Page 44: Unpredictable interactive terabytes of data Laurent DOLLE.pdf · A MongoDB daemon (mongod) processes any incoming query on a single thread. Modern hardware architectures features](https://reader034.vdocuments.net/reader034/viewer/2022042307/5ed2ec24d33bf86ae87958f1/html5/thumbnails/44.jpg)
Change the Year in the Copyright field 1. Click ‘Insert’ in Top menu 2. Click ’Header & Footer’ 3. Write new Year in field ‘Footer’ 4. Click ‘Apply to All’
Ops Manager API Integrated in real-time monitoring
![Page 45: Unpredictable interactive terabytes of data Laurent DOLLE.pdf · A MongoDB daemon (mongod) processes any incoming query on a single thread. Modern hardware architectures features](https://reader034.vdocuments.net/reader034/viewer/2022042307/5ed2ec24d33bf86ae87958f1/html5/thumbnails/45.jpg)
Change the Year in the Copyright field 1. Click ‘Insert’ in Top menu 2. Click ’Header & Footer’ 3. Write new Year in field ‘Footer’ 4. Click ‘Apply to All’
Ops Manager API Integrated in Ops workbench
![Page 46: Unpredictable interactive terabytes of data Laurent DOLLE.pdf · A MongoDB daemon (mongod) processes any incoming query on a single thread. Modern hardware architectures features](https://reader034.vdocuments.net/reader034/viewer/2022042307/5ed2ec24d33bf86ae87958f1/html5/thumbnails/46.jpg)
Change the Year in the Copyright field 1. Click ‘Insert’ in Top menu 2. Click ’Header & Footer’ 3. Write new Year in field ‘Footer’ 4. Click ‘Apply to All’
Feedback on 1 year of Open Source
9
265ced1609a17cf1a5979880a2ad364653895ae8
![Page 47: Unpredictable interactive terabytes of data Laurent DOLLE.pdf · A MongoDB daemon (mongod) processes any incoming query on a single thread. Modern hardware architectures features](https://reader034.vdocuments.net/reader034/viewer/2022042307/5ed2ec24d33bf86ae87958f1/html5/thumbnails/47.jpg)
Change the Year in the Copyright field 1. Click ‘Insert’ in Top menu 2. Click ’Header & Footer’ 3. Write new Year in field ‘Footer’ 4. Click ‘Apply to All’
Need some basic help? Some expert advice? Or the source code?
Google can definitely help, but MongoDB too.
_ Turn Pre-sales Engineers & Solutions Architects into Trainers & Evangelists
_ Everybody can open tickets in MongoDB’s JIRA, but Commercial Support can
process them even faster for you (premium)
_ A dedicated Technical Account Manager can follow your project, provide ad-hoc support and chase tickets internally
Turn your employees into smart creatives _ Empower small teams, embrace agility, set broad objectives & watch the magic
_ Even internal use-cases might be addressed by accident
Services & empowerment Can help you go the extra mile
![Page 48: Unpredictable interactive terabytes of data Laurent DOLLE.pdf · A MongoDB daemon (mongod) processes any incoming query on a single thread. Modern hardware architectures features](https://reader034.vdocuments.net/reader034/viewer/2022042307/5ed2ec24d33bf86ae87958f1/html5/thumbnails/48.jpg)
Change the Year in the Copyright field 1. Click ‘Insert’ in Top menu 2. Click ’Header & Footer’ 3. Write new Year in field ‘Footer’ 4. Click ‘Apply to All’
You can follow us on:
AmadeusITGroup amadeus.com/blog amadeus.com
Thank you
265ced1609a17cf1a5979880a2ad364653895ae8