Building Global and Highly Available Services Using Windows Azure
Name
Title
Microsoft Corporation
Agenda
- As the load increases, are you still available?
- If the platform fails, are you still available?
- How do you upgrade your service?
- Thinking globally
- If the compute is closer to the user, what about the dependencies?
Assumptions
- You know the basics: Web/Worker Roles, SQL Azure, Windows Azure Storage, asynchronous programming, Windows Azure Diagnostics
- You have deployed a service to Windows Azure
- Everything can and will (eventually) break
Why do services fail?
- Increased workload
- Failure: hardware, network, platform service, transient conditions
- Human
- Upgrades
What do we mean by available?
- Same functionality
- Degraded functionality
- Failsafe
As the load increases, are you still available?
It is better to have 50 x 1 GB databases than 1 x 50 GB database
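Splitting one large database into many small ones means every request has to be routed to the right database. The following is a minimal sketch of one common way to do that, hash partitioning on a key. The shard count comes from the slide; the key format and the `shard-NN` naming scheme are assumptions made for illustration, not part of the original deck.

```python
import hashlib

SHARD_COUNT = 50  # from the slide: 50 small databases instead of one large one


def shard_for(key: str, shard_count: int = SHARD_COUNT) -> int:
    """Map a partition key to a shard index using a stable hash."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    # Use the first 8 bytes as an unsigned integer, then reduce to a shard index.
    return int.from_bytes(digest[:8], "big") % shard_count


def database_name(key: str) -> str:
    """Hypothetical naming scheme: one database per shard."""
    return f"shard-{shard_for(key):02d}"


if __name__ == "__main__":
    for customer in ("customer-1", "customer-2", "customer-3"):
        print(customer, "->", database_name(customer))
```

Note that plain modulo hashing moves most keys when the shard count changes, so a real design would also need a plan for resharding.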
What is wrong with this?
Scale me out too
Everything needs to scale
What about this?
As the load increases, are you still available?
- Scale everything out
- Partition data (for size and performance)
- Test: test at scale; security test
- Feedback: enable Windows Azure Diagnostics*; set up external monitoring
* May increase the problem; scale that too
If the platform fails, are you still available?
Basics: what you get for free
- Elasticity: easily deploy compute resources and scale up and down
- Automated service management: Windows Azure will (automatically) recover bad nodes
- Fault domains: Windows Azure deploys services across fault boundaries
- Storage resilience: 3 copies of storage maintained
- Fault tolerance: when Windows Azure breaks, it fixes itself! Can your service?
Codifying Operations
Upgrade Domains
Configure in ServiceDefinition.csdef:
<ServiceDefinition name="RedDir" xmlns="http://schemas.microsoft.com/ServiceHosting/2008/10/ServiceDefinition" upgradeDomainCount="3">
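To get a feel for what an upgrade domain count of 3 means for capacity, the sketch below models an in-place upgrade that takes one upgrade domain offline at a time. It assumes instances are spread round-robin across domains, which is a simplification for illustration; the actual placement is decided by the platform, and the function names here are hypothetical.

```python
from typing import Dict, List


def assign_upgrade_domains(instance_count: int, domain_count: int) -> Dict[int, List[int]]:
    """Spread instance indexes across upgrade domains round-robin (an assumption)."""
    domains: Dict[int, List[int]] = {d: [] for d in range(domain_count)}
    for instance in range(instance_count):
        domains[instance % domain_count].append(instance)
    return domains


def min_available_during_upgrade(instance_count: int, domain_count: int) -> int:
    """Instances still serving while the largest upgrade domain is offline."""
    domains = assign_upgrade_domains(instance_count, domain_count)
    largest = max(len(members) for members in domains.values())
    return instance_count - largest


if __name__ == "__main__":
    for count in (1, 2, 3, 6):
        print(count, "instances ->", min_available_during_upgrade(count, 3), "available")
```

Under these assumptions a single-instance role has nothing left serving while its domain is upgraded, which is one reason to run at least two instances per role.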
Transient Datacenter Conditions
Do you have retry logic?
What did you mean, retry logic?
- Transient conditions in the datacenter/network/service
- Example: SQL Azure Error 40501. "The service is currently busy. Retry the request after 10 seconds."
Transient Fault Handling Framework
http://windowsazurecat.com/2011/02/transient-fault-handling-framework/
Retry against anything that might be external and have transient conditions*:
- SQL Azure
- Windows Azure Storage
- Service Bus
- 3rd party services
Retry
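The deck does not include its retry snippet, so here is a minimal sketch of the general idea: retry an operation on transient failures with exponential backoff and a little jitter. This is not the Transient Fault Handling Framework API. `TransientError`, `with_retry`, and all parameter names are hypothetical, and deciding which real errors count as transient (for example SQL Azure error 40501) is left to the caller.

```python
import random
import time


class TransientError(Exception):
    """Hypothetical stand-in for a transient failure such as a 'service busy' error."""


def with_retry(operation, max_attempts=5, base_delay=1.0, max_delay=30.0, sleep=time.sleep):
    """Call operation(); on TransientError, wait and try again up to max_attempts times."""
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except TransientError:
            if attempt == max_attempts:
                raise  # out of attempts: surface the failure to the caller
            # Exponential backoff (1s, 2s, 4s, ...) capped at max_delay.
            delay = min(max_delay, base_delay * 2 ** (attempt - 1))
            # Up to 10% jitter so many clients do not retry in lockstep.
            sleep(delay + random.uniform(0, delay / 10))


if __name__ == "__main__":
    attempts = {"count": 0}

    def flaky_query():
        attempts["count"] += 1
        if attempts["count"] < 3:
            raise TransientError("service busy")
        return "ok"

    print(with_retry(flaky_query, base_delay=0.01))
```

Only errors known to be transient are retried; anything else propagates immediately, which avoids hiding real bugs behind retries.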
demo
How do you upgrade your service?
Upgrade Strategies: VIP Swap
Upgrade Strategies: Upgrade
Upgrade Strategies: New Service & Swap DNS
Thinking Globally
Thinking Globally
- Network latency: put compute closer to the user; put data closer to the user
- Global availability: datacenter outages; synchronizing data
Network Latency
Serve Blobs from the Edge
- 24 global locations with 99.95% availability
- CDN now works for web apps, not just for public blobs
[Diagram: Blob Storage (possibly many hops or poor links) versus CDN, closest point of presence (few hops)]
Windows Azure Traffic Manager
Direct users to the service in the closest region with the Windows Azure Traffic Manager
[Diagram: Traffic Manager (policies, monitoring) answers the lookup for foo.cloudapp.net with the DNS response 1.2.3.4]
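The routing idea behind this slide, combining monitoring with a closest-region policy, can be sketched as "answer with the lowest-latency endpoint that is currently healthy". The code below illustrates that idea only and is not how Traffic Manager is implemented; the `Endpoint` type, host names, and latency figures are hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Endpoint:
    host: str          # hypothetical regional deployment host name
    latency_ms: float  # measured latency from the user's network
    healthy: bool      # result of the most recent monitoring probe


def closest_healthy(endpoints: Iterable[Endpoint]) -> Optional[Endpoint]:
    """Return the healthy endpoint with the lowest latency, or None if all are down."""
    candidates = [e for e in endpoints if e.healthy]
    return min(candidates, key=lambda e: e.latency_ms, default=None)


if __name__ == "__main__":
    deployments = [
        Endpoint("foo-us.example", 120.0, True),
        Endpoint("foo-eu.example", 35.0, False),  # closest, but failing its probe
        Endpoint("foo-asia.example", 210.0, True),
    ]
    print(closest_healthy(deployments))
```

Because the answer is delivered through DNS, clients keep using a resolved address until its record expires, so changes in health take effect only after that delay.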
demo
If the compute is closer to the user, what about the dependencies?
Windows Azure Platform Services
Windows Azure Compute: create multiple deployments and use Traffic Manager to route traffic. Traffic Manager should update DNS to clients.
Windows Azure Storage: roll your own synchronization. Service-specific implementation.
SQL Azure: use SQL Azure Data Sync Service. Service-specific implementation.
Reporting Services: deploy reports to different locations. Service-specific implementation.
Service Bus: create multiple namespaces. Service-specific implementation.
Access Control Service: future. Service-specific implementation.
Cache: create deployment-specific cache(s). Default programming model will handle cache failure.
Service Specific Implementations
- Does your service fail without that platform service?
- Can your service use the same platform services from another data center?
- Can your service not use that platform service temporarily?
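The last question, whether the service can temporarily do without a platform service, is easiest to see with a cache. The sketch below reads through a cache and falls back to the backing store when the cache is down, so the service degrades (slower reads) instead of failing. `CacheUnavailable`, `get_profile`, and the callables passed in are hypothetical names used for illustration.

```python
from typing import Callable, Optional


class CacheUnavailable(Exception):
    """Hypothetical error raised when the cache service cannot be reached."""


def get_profile(
    user_id: str,
    cache_get: Callable[[str], Optional[dict]],
    store_get: Callable[[str], dict],
) -> dict:
    """Prefer the cache, but keep working from the store if the cache is down."""
    try:
        cached = cache_get(user_id)
        if cached is not None:
            return cached
    except CacheUnavailable:
        pass  # degraded mode: skip the cache rather than fail the request
    return store_get(user_id)


if __name__ == "__main__":
    def broken_cache(_user_id):
        raise CacheUnavailable("cache service is down")

    def store(user_id):
        return {"id": user_id, "source": "store"}

    print(get_profile("42", broken_cache, store))
```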
Site Failover
If a site-specific dependency is out, fail over to another site
- Easy: use Traffic Manager
- Hard: code your own
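For the "code your own" option, the core of a failover routine is a priority-ordered list of sites and a health probe. The sketch below picks the first site whose probe succeeds; it assumes the probe reports network problems by returning False or raising `OSError`. Site names and the probe are hypothetical, and a real implementation would also need probe timeouts, thresholds for repeated failures, and a way to fail back.

```python
from typing import Callable, Optional, Sequence


def pick_site(sites: Sequence[str], probe: Callable[[str], bool]) -> Optional[str]:
    """Return the first site, in priority order, whose health probe succeeds."""
    for site in sites:
        try:
            if probe(site):
                return site
        except OSError:
            continue  # treat network errors as an unhealthy site and move on
    return None  # every site failed its probe


if __name__ == "__main__":
    unhealthy = {"primary.example"}

    def probe(site: str) -> bool:
        if site in unhealthy:
            raise OSError("connection refused")
        return True

    print(pick_site(["primary.example", "secondary.example"], probe))
```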
Site Failover
demo
Synchronizing Data
demo
Summary
- Windows Azure gives you high availability capabilities for free
- Think about scaling out
- Handle transient conditions
- Codify operations: automate redeployments, etc.
- Use global features for maximum availability and reach: Windows Azure Traffic Manager, SQL Data Sync
© 2011 Microsoft Corporation. All rights reserved. Microsoft, Windows, Windows Vista and other product names are or may be registered trademarks and/or trademarks in the U.S. and/or other countries.The information herein is for informational purposes only and represents the current view of Microsoft Corporation as of the date of this presentation. Because Microsoft must respond to changing market conditions, it should not be interpreted to
be a commitment on the part of Microsoft, and Microsoft cannot guarantee the accuracy of any information provided after the date of this presentation. MICROSOFT MAKES NO WARRANTIES, EXPRESS, IMPLIED OR STATUTORY, AS TO THE INFORMATION IN THIS PRESENTATION.