oscon 2015 - openio: enabling the petabyte-sized mailboxes
TRANSCRIPT
2006
idea & firstconcept
2007
designdev starts
2009
1st massiveproduction
2012
the projectis opensourced
2014
10+ PBmanaged
2015
OpenIOfork
3 avenue Antoine PinayParc d'Activité des 4 Vents59510 Hem
@OpenIO OpenIO
github.com/open-io
180 Sansom Street, FI 4 San Francisco, 94104 CA
FRANCE USA
•Cyrus scalability issues
•OpenIO technology
•Cyrus IMAP 3.0
Storage infrastructure challenge
Open source object storage
OpenIO for a scale-out Cyrus
Summary
Cyrus scalability issues
Share-nothing clusters
• Migrate mailboxes to new clusters as local storage becomes full
• Storage cost with SAS drives • Ends up with unused CPU/RAM ressources
on Cyrus servers
Disk used
email size
IOPS zoneSSD drives
capacity zone SATA drives
Number of emails
Email distribution by size (based on actual data)Be smart with storage
Existing technologies
Distributed Hash Tables Consistent Hashing
Single name node
• Good for few large files (large web indexes, nosql tables, …)
• Bad for numerous small objects(emails,…)
• Good for trillion of objects
• Bad because of rebalancing of part of the data when adding capacity
OpenIO is different
1,000,000,000,000 = 100,000,000 x 10,000Mailboxes Emails
per Mailbox
Do not track objects, track containers!
Trillions of objects? Let’s do the math
IndirectionsPragmatic design
Containers
…
Data chunksContainers store locations of objects, not objects
……
Grid & Conscience
Grid of nodes
• Massively distributed • Each node takes part in directory &
storage services • no SPOF - Resilient to node failures
On-the-fly best matchmaking
Conscience
• Collects metrics of each node • Computes node scores • Real time asynchronous process • Advanced load balancing:
select the best nodes at a particular time route requests to them
Cyrus 3.0 + OpenIO
OpenIO
Steps
#1 Scale out archive storage
#2 Bring Conscience to Cyrus Murder
Sept 2015Available
#3 Scale-out backend for calendars & contacts
End of 2015
To provide the best Cyrus IMAP node when a mailbox is created or migrated
To provide per-user tiny databases spread across the grid of nodes
atlas.hashicorp.com/openio/boxes/sds-cyrus/
Get the Vagrant box $ vagrant up