HUAWEI TECHNOLOGIES CO. LTD.
www.huawei.com
Federated Mesos Clusters for Global Data Center Designs
Krishna M Kumar, Lead Architect, Huawei Cloud
What is Federation in General?
“A federation is a group of computing or network providers agreeing upon standards of operation in a collective fashion.” - Wikipedia
Regional Authority: an autonomously working body.
Federal Layer: helps the regional authorities cooperate with each other.
So cloud federation is the union of multiple cooperating data centers across geographies, working toward a common purpose.
[Diagram: a Federal Layer and four Regional Authorities]
Why Federation?
High Availability.
No Vendor Lock-in.
Cloud bursting (accommodating spikes in demand).
Load balancing across geographies.
Application upgrade and migration.
Policy Based Deployment.
Economic benefits among providers.
Cloud Federation: Nomad (HashiCorp)
Similar to Google’s Borg
[Diagram: Nomads communicating via Gossip]
Some Mesos federation designs considered in our research lab but dropped:
Design 1: Nested/Proxy approach
Design 2: Multi-Zone Mesos Super Cluster
Design 3: Mesos Global Manager (MGM)
Why Multi-Master?
It's really hard to control superheroes if you are not one. Ask this man!!!
Phew!!!
Big Day…
Benefits of Multi-Master
Each data center is a superhero, and they all cooperate with each other.
• No single point of failure.
• DCs cooperate with each other using a gossip protocol.
• Frameworks get fast feedback because they are connected to all the masters directly. The framework needs to be federated.
• Centralized data store layer.
• A simple policy engine to demonstrate cloud bursting.
Broad Overview
HashiCorp’s Consul stores all the policy information.
Each Mesos master is accompanied by a ‘Gossiper’, which represents that Mesos-run data center in the federation.
Gossipers talk to each other in the federation and learn the current policy.
Gossipers negotiate with each other and inform their respective masters which framework deserves the offers.
[Diagram: Data Centers 1 to 4, each with a Master and a Gossiper, plus a Framework and Consul]
Overview of Consul and Gossiper Interaction
Gossipers talk to each other using HashiCorp’s memberlist library.
HashiCorp’s Consul uses the same memberlist library.
[Diagram: Data Centers 1 to 4, each with a Gossiper and a Consul instance]
Federated Master
FedAlloc: an allocation module inherited from the master's default DRF allocator module.
FedComm: a Mesos module of type Anonymous, which the gossiper talks to.
[Diagram: the Master hosts FedAlloc (allocation module) and FedComm (anonymous module); the Gossiper connects to FedComm]
Internals of Federated Master
[Diagram: FedAlloc and FedComm are plug-ins of the Mesos Master and share the table below. FedComm, which talks to the Gossiper, only writes to the table; FedAlloc only reads it, after a conditional wait.]

F. Id     Suppress by FW   Suppress by Federation
1122001   True             True
1122007   True             False
1122005   False            True
1122004   False            False
Internals of Federated Master (Cont.)
[Diagram: same components and table as on the previous slide, with the table protected by a mutex]
FedComm (TCP read on Gossiper):
1. Lock table
2. Write
3. Unlock table
4. Signal condition
FedAlloc (condition variable):
1. Lock table
2. Read
3. Call suppress( )/revive( )
4. Unlock
FedAlloc is invoked automatically once the condition variable is signaled.
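The writer/reader handshake between FedComm and FedAlloc can be sketched with a condition variable. This is only an illustrative Python model, not the actual Mesos module code (which is C++): the class `SuppressTable`, the function `fed_alloc_loop` and the `actions` list are hypothetical names, and the sketch signals while still holding the lock because Python's `threading.Condition` requires that.

```python
import threading


class SuppressTable:
    """Hypothetical shared table: framework id -> 'suppress by federation' flag."""

    def __init__(self):
        self._cond = threading.Condition()  # condition variable wrapping a mutex
        self._flags = {}
        self._dirty = False                 # True when there are unread updates

    def write(self, framework_id, suppressed):
        """FedComm side: lock the table, write, then signal the condition."""
        with self._cond:
            self._flags[framework_id] = suppressed
            self._dirty = True
            self._cond.notify()             # Python requires the lock to be held here

    def wait_and_read(self, timeout=None):
        """FedAlloc side: wait for the signal, lock the table and read a snapshot."""
        with self._cond:
            if not self._cond.wait_for(lambda: self._dirty, timeout=timeout):
                return None                 # timed out without any update
            self._dirty = False
            return dict(self._flags)


def fed_alloc_loop(table, actions, rounds):
    """Record the suppress/revive decision that each update would trigger."""
    for _ in range(rounds):
        snapshot = table.wait_and_read(timeout=5)
        if snapshot is None:
            break
        for framework_id, suppressed in sorted(snapshot.items()):
            actions.append(("suppress" if suppressed else "revive", framework_id))
```

In this model the reader thread sleeps until the writer signals, so no polling of the table is needed. A real allocator module would call the allocator's suppress and revive operations instead of appending to a list.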
Gossiper
Anon Client: instructs the master when to start and when to stop sending offers.
MasterInfo: periodically performs an HTTP GET on its respective Mesos master to update its statistical information.
HTTP: HTTP server that exposes some REST APIs.
Member List (ML): module that actually implements the gossip layer.
Consul Lib: library to talk to Consul and replicate to other DCs. Also implements a watch for any update to the policy.
Policy Engine: reads from Consul and interprets two policies:
1. Max Threshold
2. Next Max DC
[Diagram: the Gossiper is made up of HTTP, Master Info, Anon Client, Policy Engine, Consul Lib and ML]
Master-Gossiper Interaction
[Diagram: the Master (FedAlloc, FedComm) next to its Gossiper (HTTP, Master Info, Anon Client, Policy Engine, Consul Lib, ML), together with Consul and Data Centers 2 to 5]
Framework Protocol: Sequence Diagram
M1: Mesos master managing our DC1
M2: Mesos master managing our DC2
M3: Mesos master managing our DC3
Sample policy: if we run out of resources in our DC, burst into the next cloud.
[Sequence diagram: the framework registers with Master 1, Master 2 and Master 3; Offer 1 / Launch Task 1, Offer 2 / Launch Task 2 and Offer 3 / Launch Task 3 follow, with Out Of Resource (OOR) messages exchanged between M1, M2 and M3]
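The sample policy can be illustrated with a small simulation. This is a sketch under stated assumptions, not the protocol implementation: the `Master` class, the `launch_with_bursting` function and the CPU-only resource model are all hypothetical, and a master that has reported OOR is never retried.

```python
from dataclasses import dataclass


@dataclass
class Master:
    """Hypothetical stand-in for a Mesos master that tracks only free CPUs."""
    name: str
    free_cpus: int

    def offer(self, cpus):
        """Reserve CPUs and return True, or return False to signal OOR."""
        if self.free_cpus < cpus:
            return False
        self.free_cpus -= cpus
        return True


def launch_with_bursting(masters, tasks):
    """Place each (task, cpus) on the current master, bursting to the next on OOR."""
    placements = []
    current = 0  # index of the master currently sending offers
    for task, cpus in tasks:
        # Skip forward while the current master is out of resources.
        while current < len(masters) and not masters[current].offer(cpus):
            current += 1
        target = masters[current].name if current < len(masters) else None
        placements.append((task, target))
    return placements


masters = [Master("M1", 2), Master("M2", 2), Master("M3", 2)]
tasks = [("Task 1", 2), ("Task 2", 2), ("Task 3", 2), ("Task 4", 2)]
placements = launch_with_bursting(masters, tasks)
```

With these made-up capacities each master accepts one task and then runs out of resources, so the three tasks land on M1, M2 and M3 in turn, and the fourth task cannot be placed anywhere.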
Gossiper - Exchange Framework
[Diagram: Gossipers 1 to 4, each broadcasting a set of frameworks]
Broadcast { Framework 11, Framework 7, Framework 5 }
Broadcast { Framework 1, Framework 7, Framework 10 }
Broadcast { Framework 8, Framework 7, Framework 4 }
Broadcast { Framework 8, Framework 7, Framework 4 }
Gossiper - Exchange Resource Information
[Diagram: Gossipers 1 to 4, each broadcasting its resource information]
Broadcast { CPU: 4, RAM: 16GB, Disk: 2TB }
Broadcast { CPU: 4, RAM: 8GB, Disk: 80GB }
Broadcast { CPU: 2, RAM: 4GB, Disk: 1TB }
Broadcast { CPU: 8, RAM: 4GB, Disk: 1.2TB }
Gossiper - Exchange Out Of Resource
[Diagram: Gossipers 1 to 4, with an Out of Resource (OOR) message exchanged between them]
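One way to picture what a gossiper keeps after these exchanges is a small in-memory view of its peers. The sketch below is a hypothetical model, not the Gossiper's actual data structure: the class `FederationView`, its method names and the peer names are assumptions, and the assignment of the sample broadcasts to particular gossipers is arbitrary because the slides do not show it.

```python
class FederationView:
    """Hypothetical per-gossiper view of the other data centers."""

    def __init__(self):
        self.resources = {}           # peer name -> last broadcast resource info
        self.out_of_resource = set()  # peers that have announced OOR

    def on_resource_broadcast(self, peer, cpu, ram_gb, disk_gb):
        """Remember the most recent resource broadcast from a peer."""
        self.resources[peer] = {"cpu": cpu, "ram_gb": ram_gb, "disk_gb": disk_gb}

    def on_oor(self, peer):
        """Remember that a peer has announced Out Of Resource."""
        self.out_of_resource.add(peer)

    def burst_candidates(self):
        """Peers with known resources that have not announced OOR."""
        return sorted(p for p in self.resources if p not in self.out_of_resource)


view = FederationView()
# Sample values from the resource slide; the peer-to-broadcast mapping is made up.
view.on_resource_broadcast("gossiper-1", cpu=4, ram_gb=16, disk_gb=2000)
view.on_resource_broadcast("gossiper-2", cpu=4, ram_gb=8, disk_gb=80)
view.on_resource_broadcast("gossiper-3", cpu=2, ram_gb=4, disk_gb=1000)
view.on_resource_broadcast("gossiper-4", cpu=8, ram_gb=4, disk_gb=1200)
view.on_oor("gossiper-3")
candidates = view.burst_candidates()
```

A policy engine could then pick a bursting target from `candidates` instead of considering every peer.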
Minimal Policy Engine Implemented for this Experiment
• We needed a minimal policy engine to demonstrate the cloud-bursting scenario.
• This policy engine is embedded in the Gossiper and can interpret only two simple rules.
• The content of the policy engine is an array of Policy objects.
• Each Policy object has a set of rules to be applied.
• We use HashiCorp’s Consul to store the policy, which is replicated across data centers to avoid a single point of failure.
• Any update to the policy in one DC is instantly propagated to the others. The Gossiper watches the Consul key-value store and keeps the latest copy of the policy.
{
"Name": "Policy_One",
"Rules": [{
"Name": "MinMax",
"Priority": 1,
"Scope": "",
"Content": {
"MinOrMax": "MAX"
}
}, {
"Name": "Threshold",
"Priority": 4,
"Scope": "",
"Content": {
"ResourceLimit": 90
}
}]
}
Simple Policy with Two Rules
Rule 1: When cloud bursting, which DC should be chosen: the one with the maximum resources or the one with the minimum resources?
Rule 2: When should cloud bursting be performed, i.e. at what resource percentage?
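How the two rules might be evaluated can be sketched as follows. This is a minimal illustration, not the Gossiper's policy engine: the function names are hypothetical, `ResourceLimit` is assumed to be a utilisation percentage at or above which bursting starts, the free-resource figures are made up, and the `Priority` and `Scope` fields are ignored.

```python
import json

# The sample policy shown above, as it might be stored in Consul.
POLICY_JSON = """
{
  "Name": "Policy_One",
  "Rules": [
    {"Name": "MinMax", "Priority": 1, "Scope": "", "Content": {"MinOrMax": "MAX"}},
    {"Name": "Threshold", "Priority": 4, "Scope": "", "Content": {"ResourceLimit": 90}}
  ]
}
"""


def parse_policy(text):
    """Return (min_or_max, resource_limit) extracted from the policy document."""
    policy = json.loads(text)
    rules = {rule["Name"]: rule["Content"] for rule in policy["Rules"]}
    return rules["MinMax"]["MinOrMax"], rules["Threshold"]["ResourceLimit"]


def should_burst(used_percent, resource_limit):
    """Rule 2: burst once local utilisation reaches the limit (assumed >=)."""
    return used_percent >= resource_limit


def choose_target(free_by_dc, min_or_max):
    """Rule 1: pick the DC with the most (MAX) or least (MIN) free resources."""
    if not free_by_dc:
        return None
    pick = max if min_or_max == "MAX" else min
    # Iterating over sorted names makes ties resolve deterministically.
    return pick(sorted(free_by_dc), key=lambda dc: free_by_dc[dc])


min_or_max, limit = parse_policy(POLICY_JSON)
free_cpus = {"DC2": 4, "DC3": 8, "DC4": 2}  # made-up free CPU counts per peer
target = choose_target(free_cpus, min_or_max) if should_burst(95, limit) else None
```

With the sample policy and these numbers, a DC at 95% utilisation exceeds the limit of 90 and bursts into DC3, the peer with the most free CPUs.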
Challenges / Future Work Planned
Policy engine with enhanced load balancing and affinity.
Optimize the gossip protocol for data consistency across clusters.
Network throughput and latency.
Service discovery (e.g. DNS).
Consolidated monitoring, health checks and alerts.
Security and compliance in the federation.
Work with the Mesos community for further refinement.
Thank you www.huawei.com
Copyright©2016 Huawei Technologies Co. Ltd. All Rights Reserved.