Copyright © 2010 Accenture All Rights Reserved. Accenture, its logo, and High Performance Delivered are trademarks of Accenture.
Confidential & Proprietary. Do not Distribute
Three Degrees of Mediation: Challenges & Lessons in Building Cloud-agnostic Systems
Copyright © 2014 Alex Maclinovsky All Rights Reserved.
Alex Maclinovsky,
Principal Engineer, Sears Holdings
What is Cloud-Agnostic and why should I care?
A Cloud-Agnostic System (CAS) consumes cloud services
while being loosely coupled to the underlying cloud
platforms and providers. Common CAS traits:
• Integrates with the underlying cloud
rather than just running on it
• Large contact surface with the cloud
• Leverages Cloud API for the integration
• Orchestrates cloud operations and capabilities
• Typically integrates on the lower (IaaS, STaaS)
levels of abstraction
Degrees of Cloud-Agnostic behavior
• Works with a TBD Cloud
• Works with multiple versions of a Cloud
• Can work with one of several clouds
• Can work with more than one cloud
• Can support new clouds
• Uses the same code to talk to multiple clouds
Will support future features and
capabilities of target clouds
[Figure axis: Value of mediation, from Marginal to Useful]
Technology Parallels:
Approaches for building Generic Clients
Lowest Common Denominator
• Implements only functionality which is present
and consistently implemented in all target systems
• Leaves all deviations out of scope
Reflection
• Builds rich canonical domain model encompassing
majority of the features found in the target systems
• Uses meta-model + discovery APIs to allow users to
discover feature set supported by specific target
Do This
• Implements only a single operation doThis()
that takes an XML document describing the request
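The three approaches can be contrasted in a minimal Java sketch. All type and method names below (LcdClient, ReflectiveClient, Capability, DoThisClient, GenericClientDemo) are hypothetical and belong to none of the libraries discussed in this deck; only doThis() comes from the slide.

```java
import java.util.EnumSet;
import java.util.Set;

// Hypothetical names throughout; only doThis() appears on the slide.

// 1. Lowest Common Denominator: exposes only what every target supports.
interface LcdClient {
    String launchServer(String imageId);
}

// 2. Reflection: a richer canonical model plus discovery of what a target supports.
enum Capability { LAUNCH_SERVER, CREATE_SUBNET, STATIC_PUBLIC_IP }

interface ReflectiveClient {
    Set<Capability> capabilities();

    default boolean supports(Capability c) {
        return capabilities().contains(c);
    }
}

// 3. Do This: one opaque operation; all meaning lives in the request document.
interface DoThisClient {
    String doThis(String xmlRequest);
}

class GenericClientDemo {
    // A made-up target that cannot create subnets.
    static ReflectiveClient flatNetworkCloud() {
        return () -> EnumSet.of(Capability.LAUNCH_SERVER, Capability.STATIC_PUBLIC_IP);
    }

    public static void main(String[] args) {
        ReflectiveClient cloud = flatNetworkCloud();
        for (Capability c : Capability.values()) {
            System.out.println(c + " supported: " + cloud.supports(c));
        }
    }
}
```

With the reflective style, calling code can branch on supports(...) instead of on the provider's name, which is what lets it keep working when a new target is added.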
Popular Multi-Cloud Integration Options
• Apache jclouds – often seen as the leader of the pack,
VM-centric – no networking, support for cloud-specific
features is largely done via provider contexts
• Apache Deltacloud – even more basic, with no networking
support. It is a REST API, not a Java library
• Apache Libcloud – a Python library that lacks even the
most basic canonical model, relying on the dynamic language
to hide feature differences between cloud drivers
• Dasein Cloud – the only one built on a real canonical
model. Supports broad variety of clouds. Has rich
networking. OSS foundation for Dell Cloud Manager
• Cisco CIAC - Cisco Intelligent Automation for Cloud
The Importance of Canonicals
• Much more variability between clouds than between RDBMSs
• Whether a cloud abstraction layer uses a rich, well-defined
canonical determines its mediation value
– … and, ultimately, the ability to write cross-cloud code
• The next 2 slides compare code snippets that launch
VMs with default configurations in EC2 and Terremark eCloud.
They highlight common, parameterizable and divergent code
and show the overall mediation score for two
integration libraries, one of which uses the other:
jclouds – What Canonical?
// EC2
ComputeServiceContext context = ContextBuilder.newBuilder("aws-ec2") …
Template template =
    context.getComputeService().templateBuilder().osFamily(OsFamily.CENTOS).build();
template.getOptions().as(AWSEC2TemplateOptions.class).subnetId(subnetId);
template.getOptions().as(EC2TemplateOptions.class).noKeyPair();
Set<? extends NodeMetadata> nodes =
    context.getComputeService().createNodesInGroup("webserver", 1, template);
NodeMetadata node = Iterables.get(nodes, 0);
// when you need access to very ec2-specific features, use the provider-specific context
AWSEC2Client ec2Client =
    AWSEC2Client.class.cast(context.getProviderSpecificContext().getApi());

// Terremark eCloud
ComputeServiceContext context = ContextBuilder.newBuilder("trmk-ecloud") …
RestContext<TerremarkECloudClient, TerremarkECloudAsyncClient> providerContext =
    context.getProviderContext();
CommonVCloudClient client = providerContext.getApi();
CatalogItem item = client.findCatalogItemInOrgCatalogNamed(null, null, "Centos");
VAppTemplate vAppTemplate = client.getVAppTemplate(item.getEntity().getHref());
VDC vdc = client.findVDCInOrgNamed(null, null);
VApp vApp = client.instantiateVAppTemplateInVDC(vdc.getHref(), vAppTemplate.getHref(), serverName);
// … (the deploy and power-on calls that yield 'task' are not shown)
RetryablePredicate<String> taskTester =
    new RetryablePredicate<String>(new TaskSuccess(providerContext.getAsyncApi()), 300, 10,
        TimeUnit.SECONDS);
if (!taskTester.apply(task.getHref()))
    throw new Exception("could not deploy and powerOn " + vApp.getHref());
Dasein Cloud
// EC2
provider = constructProvider(DASEIN_CLASS_AWS, ACCOUNT_IDENTIFIER_AWS, CREDENTIALS_PUBLIC_AWS,
    CREDENTIALS_PRIVATE_AWS, CLOUD_NAME_AWS, ENDPOINT_AWS, DEFAULT_REGION_AWS);
if (provider != null) {
    try {
        String name = "ServerName";
        String imageId = "ami-802c96e9";
        String productId = "m1.small";
        VMLaunchOptions vmOpts = VMLaunchOptions.getInstance(productId, imageId, name,
            "Minimal EC2 VM Launch Test");
        String result = vmOpts.build(provider);
        System.out.println("Resulting VM ID: " + result);
    } catch (CloudException e) { …

// Terremark eCloud
provider = constructProvider(DASEIN_CLASS_TMK, ACCOUNT_IDENTIFIER_TMK, CREDENTIALS_PUBLIC_TMK,
    CREDENTIALS_PRIVATE_TMK, CLOUD_NAME_TMK, ENDPOINT_TMK, DEFAULT_REGION_TMK);
if (provider != null) {
    try {
        String name = "ServerName";
        String imageId = "32452";
        String productId = "248:2379:TEMPLATE";
        VMLaunchOptions vmOpts = VMLaunchOptions.getInstance(productId, imageId, name,
            "Minimal TMK VM Launch Test");
        String result = vmOpts.build(provider);
        System.out.println("Resulting VM ID: " + result);
    } catch (CloudException e) { …
Dasein Cloud : Power of Reflection
The common method, which is identical across both clouds and
was factored out of the previous slide for brevity:

// Declared to throw because Class.forName alone throws the checked ClassNotFoundException.
static CloudProvider constructProvider(String providerClass, String account, String shared,
        String secret, String name, String endpoint, String regionId) throws Exception {
    Cloud cloud = Cloud.register(name, name, endpoint,
        (Class<? extends CloudProvider>) Class.forName(providerClass));
    // Ask the provider which configuration values it needs, then supply them
    ContextRequirements requirements = cloud.buildProvider().getContextRequirements();
    List<ContextRequirements.Field> fields = requirements.getConfigurableValues();
    List<ProviderContext.Value> values = new ArrayList<ProviderContext.Value>(fields.size());
    for (ContextRequirements.Field f : fields) {
        if (f.type.equals(ContextRequirements.FieldType.KEYPAIR)) {
            if (shared != null && secret != null) {
                values.add(ProviderContext.Value.parseValue(f, shared, secret));
            } else {
                throw new RuntimeException("Keypair parameters are not set up correctly");
            }
        } else {
            String value = System.getProperty(f.name);
            values.add(ProviderContext.Value.parseValue(f, value));
        }
    }
    ProviderContext ctx = cloud.createContext(account, regionId,
        values.toArray(new ProviderContext.Value[0]));
    CloudProvider provider = ctx.connect();
    return provider;
}
But Canonical & Reflection is not Enough
• Even when the code for each operation (create
network, launch VM, assign IP, etc.) is the same
across the target clouds, the code that implements the
same user story will differ, because the sequence
of operations differs between clouds:
– The user story of interest is not: Launch server A with
characteristics XYZ on cloud B
– but: Launch server A with characteristics XYZ in the [location
C of the] network D with access rights E on cloud B
– The devil is in the details [of cloud providers’ networking]
End-to-end Meaningful Use Case: A Tale of Two Clouds
Deploy a simple standalone application serving API calls from mobile clients over HTTPS from an existing image.
AWS
1. Create VPC
2. Add Internet Gateway
3. Create Public subnet
4. Create Firewall
5. Add Allow rule to the FW
6. Deploy a Server on the subnet into the FW
7. Create public IP
8. Assign Public IP to Server
NTTA / OpSource / Dimension Data
1. Create Network
2. Remove extra FW rules
3. Deploy a Server on the network
4. Check for free public IPs in the network and create a public IP block if necessary
5. Assign Public IP to Server
Consistent, in-order steps. Specific to a particular Cloud. Common enough to be modelled and inferred via reflection.
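To make the divergence concrete, the two sequences can be written down as data and compared. This is only an illustrative sketch: the class and method names are hypothetical, and the step names are abbreviated from the lists above so that equivalent steps share a label.

```java
import java.util.ArrayList;
import java.util.List;

// Step names are abbreviated from the slide; class and method names are hypothetical.
class TwoCloudsSteps {
    static final List<String> AWS = List.of(
        "Create VPC", "Add Internet Gateway", "Create Public subnet",
        "Create Firewall", "Add Allow rule to the FW",
        "Deploy a Server", "Create public IP", "Assign Public IP to Server");

    static final List<String> NTTA = List.of(
        "Create Network", "Remove extra FW rules", "Deploy a Server",
        "Ensure public IP block", "Assign Public IP to Server");

    // Steps that carry the same label in both sequences, in AWS order.
    static List<String> shared() {
        List<String> common = new ArrayList<>(AWS);
        common.retainAll(NTTA);
        return common;
    }

    public static void main(String[] args) {
        System.out.println("Shared steps: " + shared());
        System.out.println("AWS-only step count: " + (AWS.size() - shared().size()));
        System.out.println("NTTA-only step count: " + (NTTA.size() - shared().size()));
    }
}
```

Only the server deployment and the final IP assignment line up. Everything around networking differs in both content and order, which is why identical per-operation code still does not yield identical user-story code.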
Three Degrees of Mediation
While struggling to understand why even the best cloud abstraction layer
did not help me write truly generic code to implement the
above use case, I realized there was not just one but three
distinct levels of mediation that needed to be addressed:
• Syntactic – which can be done through a good
library based on a canonical model
• Semantic – which can be tackled to a degree via
reflection, and
• Idiosyncratic – which has to be addressed on a
case-by-case basis
Syntactic vs Semantic Model in mapping Network topology across cloud providers
Semantic Model:
Network Universe: Encapsulates a set of mutually routable networks with a distinct IP space and means of interconnection with the outside world.
Usable Network: A contiguous space of IP addresses representable by a CIDR, that can be used as a deployment target for servers. Has a distinct perimeter, with ACL, NAT, RNAT and Forwarding rules associated with it. All servers deployed within a UN are mutually reachable.
Network Partition: An optional subdivision within a Network that can be used for partitioning of IP space and controlling multicast boundaries.
Syntactic Model: Network, Subnet
[Figure: the two models mapped onto provider constructs. Amazon: VPC, Subnet. Dimension Data: Datacenter (implied), Network. Azure: Virtual Network, Subnet. Callout: "Deploy servers here..."]
Examples of Idiosyncratic Features
Provider | “Feature” | Mitigation
Terremark eCloud | Private image launches with original IP of source VM, making it unreachable | Use public images and configure on the fly
NTTA | Issuing too many networking Ops per DC locks cloud’s networking layer | Client-side throttling of networking requests
NTTA | When adding drives to an existing server, the 7th and 11th SCSI slots are left empty | Vertical scaling logic needs to handle this correctly
Google GCA | Default templates start without swap, so small VMs get stuck in heavy configuration during heavy provisioning | Launch large VMs, then replace with small after 10 minutes
Windows Azure | Incomplete emulation of IP protocol – some communication between 2 VMs on the same subnet might not work | Use only supported ports and protocols
AWS EC2 | When launching an instance in VPC without specifying subnet it appears to be quietly ejected into legacy EC2 | Avoid
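As an illustration of one mitigation above, client-side throttling of networking requests could look roughly like the sketch below. It caps the number of in-flight networking calls per data center. NetworkOpThrottle and its methods are hypothetical names, and the limit of 2 is a made-up value, since the deck does not state the threshold at which the provider's networking layer locks.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Semaphore;

// Hypothetical helper: caps in-flight networking calls per data center.
class NetworkOpThrottle {
    private final int maxInFlightPerDc;
    private final ConcurrentHashMap<String, Semaphore> permits = new ConcurrentHashMap<>();

    NetworkOpThrottle(int maxInFlightPerDc) {
        this.maxInFlightPerDc = maxInFlightPerDc;
    }

    // One fair semaphore per data center, created on first use.
    private Semaphore permitsFor(String dataCenter) {
        return permits.computeIfAbsent(dataCenter, dc -> new Semaphore(maxInFlightPerDc, true));
    }

    // Blocks until a slot is free for this data center, then runs the call.
    <T> T call(String dataCenter, Callable<T> networkOp) throws Exception {
        Semaphore slot = permitsFor(dataCenter);
        slot.acquire();
        try {
            return networkOp.call();
        } finally {
            slot.release();
        }
    }

    int availableSlots(String dataCenter) {
        return permitsFor(dataCenter).availablePermits();
    }

    public static void main(String[] args) throws Exception {
        NetworkOpThrottle throttle = new NetworkOpThrottle(2); // limit is an assumed value
        String result = throttle.call("DC-1", () -> "rule added");
        System.out.println(result + ", free slots: " + throttle.availableSlots("DC-1"));
    }
}
```

Keeping the throttle in a provider-specific wrapper, rather than in generic code, matches the idea that idiosyncratic behavior is handled case by case.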
Polymorphic Orchestration
• Structure cloud-facing code as a hierarchy of
workflows arranged according to abstraction level:
Level 0: Service Provider – common across all services
Level 1: Business Service – e.g. IaaS Provisioning
Level 2: Generic Technical Operation (provider-
agnostic) – e.g. Create Network or Launch Server
Level 3: Provider-Specific Technical Operation – e.g.
Create Amazon VPC or Reserve NTTA Public IP Block
• Higher level workflows specify generic behaviors
while lower levels provide necessary overrides for
semantic and idiosyncratic deviations
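One way to picture the hierarchy is a template method: a Level 2 workflow fixes the generic sequence, and Level 3 subclasses override individual steps. The sketch below is an assumption-laden illustration, not the author's implementation. All class names are hypothetical, the step names are borrowed from the application deployment workflow in this deck, treating unlisted steps as no-ops is my reading of that diagram, and the diagram's loop around Remove ACL is collapsed into a single step.

```java
import java.util.ArrayList;
import java.util.List;

// Level 2: generic, provider-agnostic workflow. All names are hypothetical.
abstract class DeployApplication {
    protected final List<String> log = new ArrayList<>();

    // Fixed generic sequence; subclasses override individual steps only.
    final List<String> run() {
        createNetworkUniverse();
        createUsableNetwork();
        createOrDiscoverFirewall();
        secureNetwork();
        openFirewallPort();
        deployServer();
        createPublicIp();
        assignIpToServer();
        return log;
    }

    protected void createNetworkUniverse() { }      // no-op unless overridden
    protected abstract void createUsableNetwork();
    protected void createOrDiscoverFirewall() { }   // no-op unless overridden
    protected void secureNetwork() { }              // no-op unless overridden
    protected abstract void openFirewallPort();
    protected void deployServer() { log.add("Deploy Server"); }
    protected abstract void createPublicIp();
    protected void assignIpToServer() { log.add("Assign IP to Server"); }
}

// Level 3: AWS-specific overrides.
class AwsDeployApplication extends DeployApplication {
    @Override protected void createNetworkUniverse() { log.add("Create VPC"); log.add("Create IG"); }
    @Override protected void createUsableNetwork() { log.add("Create Subnet"); }
    @Override protected void createOrDiscoverFirewall() { log.add("Create Security Group"); }
    @Override protected void openFirewallPort() { log.add("Add SG Rule"); }
    @Override protected void createPublicIp() { log.add("Create Elastic IP"); }
}

// Level 3: NTTA-specific overrides.
class NttaDeployApplication extends DeployApplication {
    private final boolean freePublicIpAvailable;

    NttaDeployApplication(boolean freePublicIpAvailable) {
        this.freePublicIpAvailable = freePublicIpAvailable;
    }

    @Override protected void createUsableNetwork() { log.add("Create Network"); }
    @Override protected void secureNetwork() { log.add("Remove ACL"); }
    @Override protected void openFirewallPort() { log.add("Create ACL"); }
    @Override protected void createPublicIp() {
        // Optional step: reserve a block only when no free public IP exists.
        if (!freePublicIpAvailable) log.add("Reserve Public IP Block");
    }
}

class PolymorphicOrchestrationDemo {
    public static void main(String[] args) {
        System.out.println("AWS:  " + new AwsDeployApplication().run());
        System.out.println("NTTA: " + new NttaDeployApplication(false).run());
    }
}
```

The generic run() method never mentions a provider. Semantic differences show up as overrides and no-ops, while an idiosyncratic workaround would live entirely inside one Level 3 class.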
Use of Polymorphic Workflows to Implement Application Deployment UC
[Sequence diagram: the L2 workflow Deploy Application delegating each generic step to the L3 workflows for NTTA and AWS. Legend: NoOP. A "-" below means no provider-specific operation is recoverable for that step.]
Generic step (L2) | L3: NTTA | L3: AWS
Create Network Universe | - | Create VPC, Create IG
Create Usable Network | Create Network | Create Subnet
Create or Discover Firewall | - | Create Security Group
Secure Network | Remove ACL (loop) | -
Open Firewall Port | Create ACL | Add SG Rule
Deploy Server | Deploy Server | Deploy Server
Create Public IP | Reserve Public IP Block (opt) | Create Elastic IP
Assign IP to Server | Assign IP to Server | Assign IP to Server
Conclusions
• Building Cloud-agnostic systems is really hard!
• But it is possible, given the right tools, design,
architecture and realistic expectations
• A CA system needs to address all 3 mediation levels:
– An abstraction layer with a good canonical provides syntactic mediation
– Reflective logic and polymorphic orchestration can ensure
semantic consistency
– Idiosyncratic mediation requires case-by-case handling by
provider-specific code
• This approach will work for new providers but not
new features
Q & A