Architecting for the Cloud
© 2009 RoCk SOLid KnOwledge 1
RoCk SOLid KnOwledge
http://www.rocksolidknowledge.com
Architecting for the Cloud
Richard Blewett
Agenda
The Problem Space
Managing State
Architect for Change
Dealing with Failure
Scaling: Mainframe Approach
Scaling: Cloud Approach
Effects of scaling
Applications run on many machines concurrently
Every request may hit a different machine
Hardware fails
Statelessness is King
Statelessness: Local state of any kind is unreliable
Store critical state in Azure Storage or SQL Azure
See Eric Nelson’s talk for more details
.NET Services Service Bus
Don’t be Scared of the Service Bus
Messaging backbone for the cloud
Use the Service Bus to bridge in-flight cloud data to the full data held on premises
Enables rich message exchange patterns
Prefer Asynchrony
Connect components using queues
Azure queue storage
Synchrony is a form of coupling
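The queue-based connection between components can be sketched as follows. This is a minimal illustration, not an Azure implementation: an in-memory `BlockingCollection<string>` stands in for Azure queue storage so the example is self-contained, and the names `QueueDecouplingSketch` and `Run` are hypothetical. The front end only enqueues work and never calls the worker directly, so neither side depends on the other's availability or speed.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading.Tasks;

public static class QueueDecouplingSketch
{
    // Assumption: an in-memory queue replaces a durable cloud queue
    // purely to keep this sketch runnable on its own.
    public static List<string> Run(IEnumerable<string> orders)
    {
        var queue = new BlockingCollection<string>();
        var processed = new List<string>();

        // Back end: drains the queue at its own pace.
        Task worker = Task.Run(() =>
        {
            foreach (string order in queue.GetConsumingEnumerable())
            {
                processed.Add("processed:" + order);
            }
        });

        // Front end: enqueues and moves on; it never waits for a reply.
        foreach (string order in orders)
        {
            queue.Add(order);
        }

        // Signal that no more work is coming so the worker loop can end.
        queue.CompleteAdding();
        worker.Wait();
        return processed;
    }
}
```

With a durable queue in place of the in-memory one, the front end and the worker could run on different machines and be scaled independently.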
Asynchrony = Load Levelling
Asynchrony = Scalability
It’s OK to Degrade
Some functionality is mission critical
Other functionality can wait
Degrade service to maintain mission critical functionality
You can do everything else later
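One way to picture degrading gracefully is sketched here, under stated assumptions: taking payment is treated as the mission-critical step and sending a confirmation email as the work that can wait. The class `OrderHandler`, its members, and the log strings are all hypothetical. The deferred work is held in memory only to keep the sketch self-contained; in a real deployment it would belong in a durable queue, since local state is unreliable.

```csharp
using System.Collections.Generic;

public sealed class OrderHandler
{
    // Assumption: in-memory stand-in for a durable queue of postponed work.
    private readonly Queue<string> deferred = new Queue<string>();

    // Set to true when the system is under pressure.
    public bool Degraded { get; set; }

    // Records what was done, so the behaviour is observable.
    public List<string> Log { get; } = new List<string>();

    public void Handle(string orderId)
    {
        // Mission critical: always performed.
        Log.Add("payment:" + orderId);

        if (Degraded)
        {
            // Non-critical work is postponed rather than dropped.
            deferred.Enqueue(orderId);
        }
        else
        {
            Log.Add("email:" + orderId);
        }
    }

    // Performs the postponed work once capacity returns.
    public int CatchUp()
    {
        int sent = 0;
        while (deferred.Count > 0)
        {
            Log.Add("email:" + deferred.Dequeue());
            sent++;
        }
        return sent;
    }
}
```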
Not Everyone Needs Perfect Consistency
Many services move through “inconsistent” states
Email
IM
SMS
Update is Hard
Difficult to update and remain available
Design for rolling update
Azure supports two concurrent deployed versions
Azure will support update groups
[Diagram: Update Group 1, Update Group 2, Update Group 3]
Update Code *or* Data
Do not update code and data at the same time
Design data to handle multiple code versions
Design code to handle multiple data versions
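As an illustration of code that tolerates more than one data version, the sketch below reads a customer row that may have been written by either of two schema versions. The schema is entirely hypothetical: version 1 rows store a single "Name" field, version 2 rows split it into "FirstName" and "LastName", and rows written before versioning was introduced carry no "SchemaVersion" field at all.

```csharp
using System;
using System.Collections.Generic;

public static class VersionedCustomerReader
{
    public static string GetDisplayName(IDictionary<string, string> row)
    {
        // Rows that predate versioning are treated as version 1 (assumption).
        string version;
        if (!row.TryGetValue("SchemaVersion", out version))
        {
            version = "1";
        }

        switch (version)
        {
            case "1":
                return row["Name"];
            case "2":
                return row["FirstName"] + " " + row["LastName"];
            default:
                // Fail loudly rather than guess at an unknown layout.
                throw new NotSupportedException("Unknown schema version: " + version);
        }
    }
}
```

Because the reader accepts both layouts, the data can be migrated gradually after the code is deployed, and the code can be rolled back before any version 2 rows have been written.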
Maintain Compatibility
Make sure that you can roll back a change without breaking everything
Timing of Update can be Hard
Depends on location of users
Depends on work pattern of users
Don’t forget the Pacific Ocean is big with a low population
Change is Inevitable
Expect things to change
Don’t hard-code values
Use Azure config
string val = RoleManager.GetConfigurationSetting("LoggingLevel");
Azure configuration
.cscfg
<ServiceConfiguration serviceName="PhotoGallery">
  <Role name="WebRole">
    <Instances count="1"/>
    <ConfigurationSettings>
      <Setting name="LoggingLevel" value="Error"/>
    </ConfigurationSettings>
  </Role>
</ServiceConfiguration>
.csdef
<ServiceDefinition name="PhotoGallery">
  <WebRole name="WebRole">
    ...
    <ConfigurationSettings>
      <Setting name="LoggingLevel" />
    </ConfigurationSettings>
  </WebRole>
</ServiceDefinition>
What Just Happened?
Debugging and diagnostics are non-trivial
Use local fabric for testing
Test against local storage
Test against cloud storage
Tracing is Key
Add trace statements in code
Can be filtered on log level from config
Critical errors raised as alerts
// Only emit informational traces when the configured level is "Verbose".
if (RoleManager.GetConfigurationSetting("LoggingLevel") == "Verbose")
{
    RoleManager.WriteToLog("Information", "Product Purchased");
}
Failures Often Transient
Build retry logic into your code
Remember to stop retrying eventually
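A bounded retry might look like the sketch below. It is one possible shape rather than a prescribed one: the helper `RetrySketch.Execute`, its parameters, and the doubling back-off are assumptions made for illustration. The caller supplies an `isTransient` predicate so that only failures worth retrying are retried, and the `maxAttempts` limit is what makes the code stop retrying eventually.

```csharp
using System;
using System.Threading;

public static class RetrySketch
{
    public static T Execute<T>(
        Func<T> operation,
        int maxAttempts,
        TimeSpan initialDelay,
        Func<Exception, bool> isTransient)
    {
        if (maxAttempts < 1)
        {
            throw new ArgumentOutOfRangeException(nameof(maxAttempts));
        }

        TimeSpan delay = initialDelay;
        for (int attempt = 1; ; attempt++)
        {
            try
            {
                return operation();
            }
            // The filter is false on the final attempt or for non-transient
            // errors, so in those cases the exception reaches the caller.
            catch (Exception ex) when (attempt < maxAttempts && isTransient(ex))
            {
                Thread.Sleep(delay);
                // Back off: wait twice as long before the next attempt.
                delay = TimeSpan.FromTicks(delay.Ticks * 2);
            }
        }
    }
}
```

A call such as `RetrySketch.Execute(() => LoadOrder(id), 5, TimeSpan.FromMilliseconds(200), ex => ex is TimeoutException)` would try at most five times, where `LoadOrder` is a hypothetical operation that can time out.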
Failures Can be “Catastrophic”
Don’t assume your “shutdown” logic will be executed
Try to keep state consistent enough at all times
Think about sanity checking on start
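A start-up sanity check could take a form like the following sketch, which assumes that work items carry a persisted state and that an instance which died mid-task leaves items marked as in progress. The `JobState` enum and `StartupSanityCheck` class are hypothetical, and a plain dictionary stands in for the durable store. The sketch also assumes that any in-progress item found at start-up is abandoned; with several instances running, a real check would need a lease or timestamp to tell abandoned work from work another instance is still doing.

```csharp
using System.Collections.Generic;

public enum JobState { Pending, InProgress, Done }

public static class StartupSanityCheck
{
    // Resets abandoned jobs to Pending and returns how many were recovered.
    public static int RecoverAbandonedJobs(IDictionary<string, JobState> jobs)
    {
        // Collect first: a dictionary must not be modified while enumerating it.
        var abandoned = new List<string>();
        foreach (KeyValuePair<string, JobState> job in jobs)
        {
            if (job.Value == JobState.InProgress)
            {
                abandoned.Add(job.Key);
            }
        }

        foreach (string id in abandoned)
        {
            jobs[id] = JobState.Pending;
        }

        return abandoned.Count;
    }
}
```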
Fault Domains
Fault domains allow you to divide your application for fault tolerance
Not available yet
[Diagram: Fault Domain 1, Fault Domain 2, Fault Domain 3]
The Challenges are Not Just Technical
Cost: estimating, actual
Lock-in: vendors have a vested interest
Legal: where is my data?
Q & A
Thanks for coming
[email protected]
http://rocksolidknowledge.com/blogs