Azure Sydney 2015 Bootcamp Architecture Presentation


Upload: aaron-saikovski

Post on 19-Jul-2015


Cloud Solution Architecture

Planning for and designing for failure

Aaron Saikovski

Readify Lead Engineer – Platform

April 2015

Agenda

› Intros

› Design and build for Failure

› How can Microsoft Azure help?

› Demo

› Q & A

Copyright ©2014 by Readify Pty Ltd

About Me

› Readify Lead Engineer – Platform

› 20+ years in the IT industry

› Former Microsoftie (5+ year veteran)

› Ask me about Office365, Azure and AWS

› Follow me on Twitter @RuskyDuck72

› Email [email protected]

› MCPD, MCITP, MCTS, MCT (Alumni)

› AWS Solutions Architect - Associate

Design and build for failure

Overview

› Design and build your applications with failure in mind

› How will you react to a failure and contain its “blast radius”?

› What types of failures are you expecting?

› Virtual Machine

› Services or Web APIs

› Actual region failures or outages

› Tampering or unauthorised access

› Corrupt data

› Localised or application wide

› Bad network latency?

Services and applications will fail. It’s how you react that’s important!

Service Levels - Recap


“Gold standard”

Business Continuity Planning

› Recovery Point Objective (RPO):

› Defined by business continuity planning. It is the maximum targeted period in which data might be lost from an IT service due to a major incident.

› Recovery Time Objective (RTO):

› The targeted duration of time and service level within which a business process must be restored after a disaster (or disruption) in order to avoid unacceptable consequences associated with a break in business continuity.

› Recovery Level Objective (RLO):

› Defines the granularity with which you must be able to recover data: whether you must be able to recover the whole instance, a database or set of databases, or specific tables.

› Mean Time to Repair (MTTR):

› A basic measure of the maintainability of repairable items. It represents the average time required to repair a failed component or device.

(Source: Wikipedia)
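As a rough illustration of how RPO and RTO can be turned into something checkable, the sketch below compares a backup schedule and a measured restore time against the agreed objectives. The class, the function names and the example figures are hypothetical. It also assumes that worst-case data loss equals one full interval between backups, which ignores replication lag and the time a backup takes to complete.

```python
from dataclasses import dataclass
from datetime import timedelta


@dataclass(frozen=True)
class RecoveryObjectives:
    """Targets agreed with the business (hypothetical container)."""
    rpo: timedelta  # maximum tolerable window of lost data
    rto: timedelta  # maximum tolerable time to restore service


def meets_rpo(backup_interval: timedelta, objectives: RecoveryObjectives) -> bool:
    # Assumption: worst-case data loss is one full backup interval.
    return backup_interval <= objectives.rpo


def meets_rto(measured_restore_time: timedelta, objectives: RecoveryObjectives) -> bool:
    # Use a restore time measured during a real drill, not an estimate.
    return measured_restore_time <= objectives.rto


if __name__ == "__main__":
    targets = RecoveryObjectives(rpo=timedelta(minutes=15), rto=timedelta(hours=1))
    # Hourly backups checked against a 15 minute RPO.
    print("RPO met:", meets_rpo(timedelta(hours=1), targets))
    # A 40 minute restore checked against a 1 hour RTO.
    print("RTO met:", meets_rto(timedelta(minutes=40), targets))
```

A check like this is only as good as its inputs, so the restore time should come from an actual recovery exercise.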


Shared Responsibility Model

Planning for failure

› Observe the Azure SLAs, especially for VMs

› Automate everything – keep versions of your “stacks”

› Iterate and improve your cloud stacks (think Agile)

› Decouple your architecture into elastic, scalable tiers – scale out

› Observe any service limits, especially network latency and throughput

› Don’t forget to build in security at ALL tiers – network ACLs and the like

› How will each tier respond to its own or other tier failures?
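One common answer to how a tier should respond when another tier fails is to retry transient errors with exponential backoff and jitter, then give up and let the caller degrade. The sketch below is a minimal, generic version of that pattern. The function name and default values are illustrative assumptions, not something prescribed by the talk or by Azure.

```python
import random
import time


def call_with_retries(operation, max_attempts=4, base_delay=0.5,
                      max_delay=8.0, sleep=time.sleep):
    """Call operation(); on failure, retry with exponential backoff and jitter."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except Exception:
            if attempt == max_attempts:
                raise  # out of attempts: the caller decides how to degrade
            # The delay ceiling doubles each attempt (0.5s, 1s, 2s, ...),
            # capped at max_delay.
            ceiling = min(max_delay, base_delay * 2 ** (attempt - 1))
            # Random jitter keeps many clients from retrying in lockstep.
            sleep(random.uniform(0, ceiling))
```

The `sleep` parameter exists only so the delay can be replaced in tests. In production code it is usually worth retrying only errors known to be transient rather than every exception.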

Planning for failure (cont.)

› Optimise network traffic between the tiers – async and queues

› Monitor and gather statistics – monitor everything!

› Always assume you have been hacked or are about to be

› Spread your application across multiple regions

› Actually test a failure – run a “game day” simulation

› Always test a catastrophic failure. You don’t want the first test to be when it actually happens
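A game day can start small: deliberately make a dependency fail and confirm that the calling code degrades the way you expect. The sketch below wraps a function so that a configurable fraction of calls raise an error. All names here (`InjectedFault`, `with_fault_injection`, `price_or_cached`) are hypothetical and exist only for illustration; real game days usually also break infrastructure, not just function calls.

```python
import random


class InjectedFault(Exception):
    """Raised in place of a real dependency failure during a game day."""


def with_fault_injection(func, failure_rate, rng=None):
    """Wrap func so that roughly failure_rate of calls fail on purpose."""
    rng = rng or random.Random()
    name = getattr(func, "__name__", "call")

    def wrapper(*args, **kwargs):
        # rng.random() is in [0, 1): a rate of 1.0 always fails, 0.0 never does.
        if rng.random() < failure_rate:
            raise InjectedFault(f"injected failure in {name}")
        return func(*args, **kwargs)

    return wrapper


def price_or_cached(fetch_price, cached_price):
    """Code under test: fall back to a cached value when the dependency fails."""
    try:
        return fetch_price()
    except Exception:
        return cached_price


if __name__ == "__main__":
    def live_price():
        return 42.0

    broken = with_fault_injection(live_price, failure_rate=1.0)
    print(price_or_cached(broken, cached_price=41.5))
```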


Sample: Standard H/A Design


Sample: Better/Best Design

Common Tools

› “Ping” your service tiers:

› http://www.pingdom.com/

› Monitor your applications:

› OpsGenie - http://www.opsgenie.com/

› New Relic - http://newrelic.com/

› Manage your events and logs:

› Azure Application Insights - http://azure.microsoft.com/en-us/services/application-insights/

› LogStash - http://logstash.net/

› Automate your cloud builds:

› PowerShell and PowerShell DSC - http://technet.microsoft.com/en-au/library/dn249912.aspx

› Salt - http://saltstack.com/

› Chef - https://www.chef.io/

› Puppet - https://puppetlabs.com/

› Vagrant - http://www.vagrantup.com/
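To give a feel for what “pinging” a service tier involves, here is a minimal health probe using only the Python standard library. It is a sketch of the idea, not a replacement for the hosted tools listed above. The function name, the health URL and the rule that any 2xx status counts as healthy are assumptions made for this example.

```python
import time
import urllib.request


def check_endpoint(url, timeout=5.0, opener=urllib.request.urlopen):
    """Return (healthy, latency_seconds) for a single HTTP GET probe."""
    started = time.monotonic()
    try:
        with opener(url, timeout=timeout) as response:
            healthy = 200 <= response.status < 300
    except OSError:
        # URLError, HTTPError (4xx/5xx) and socket timeouts all derive
        # from OSError, so any of them marks the endpoint as unhealthy.
        healthy = False
    return healthy, time.monotonic() - started


if __name__ == "__main__":
    # Hypothetical endpoint; replace with a health URL for your own tier.
    ok, latency = check_endpoint("https://example.com/")
    print(f"healthy={ok} latency={latency:.3f}s")
```

The `opener` parameter is there so the probe can be exercised without a network. A real monitor would run this on a schedule from more than one location and alert on repeated failures rather than on a single miss.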

How can Microsoft Azure help?

How can Azure help?

› Most Azure services have a 99.9%+ SLA

› Usage is billed hourly/monthly

› Use automation to easily provision services – PowerShell

› Geo-redundancy is available – use it!

› Azure Web Apps + Azure Traffic Manager = SLA of 99.95% when configured in a failover configuration

› SQL Azure has an uptime SLA of 99.99%

› Virtual Networks can provide isolation of your stack and hybrid connectivity (site-to-site VPN, ExpressRoute)
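SLA percentages are easier to reason about as a downtime budget. The sketch below converts an SLA into allowed minutes of downtime over a 30-day month, and multiplies the SLAs of services that all have to be up at once to estimate a combined figure. The function names are hypothetical, and the combined figure assumes the services fail independently, which is a simplification rather than how any provider defines its SLA.

```python
MINUTES_PER_30_DAY_MONTH = 30 * 24 * 60  # 43,200 minutes


def allowed_downtime_minutes(sla_percent, period_minutes=MINUTES_PER_30_DAY_MONTH):
    """Downtime budget implied by an SLA over the given period."""
    return period_minutes * (1 - sla_percent / 100)


def composite_sla_percent(*sla_percents):
    """Estimated availability when every listed service must be up.

    Assumption: failures are independent, so availabilities multiply.
    """
    availability = 1.0
    for percent in sla_percents:
        availability *= percent / 100
    return availability * 100


if __name__ == "__main__":
    for sla in (99.9, 99.95, 99.99):
        print(f"{sla}% SLA -> {allowed_downtime_minutes(sla):.1f} min/month")
    # Web tier (99.95%) that depends on a database (99.99%).
    print(f"combined: {composite_sla_percent(99.95, 99.99):.3f}%")
```

The point of the second function is that chaining dependent services always lowers the overall figure, which is one reason decoupled tiers and fallbacks matter.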


Key Azure services

Azure Storage

Azure Storage - Overview

› Four types of storage available:

› Block blobs:

› Streaming and storing documents, videos, pictures, backups, and other unstructured text or binary data.

› Page blobs and disks:

› Random read and write operations; page blobs are ideal for VHD images.

› Tables and Queues:

› Tables - NoSQL storage for unstructured and semi-structured data.

› Queues - reliable messaging solution for your apps.

› Files (Preview):

› Creates a shared file system using the standard SMB 2.1 protocol.


Azure Storage

Azure Virtual Machines

Azure VMs - Overview

› Full support for Linux and Windows machines

› Pay-as-you-go (PAYG) model – billed by the minute, hour or month

› Compute tiers:

› Basic Tier (A0-A4):

› Good for dev/test workloads

› Standard Tier (A0-A7):

› Good for most workloads

› Compute Optimised (D1-D14):

› SSD backed

› 60% faster CPU than A-Series

› Performance Optimised (G1-G5):

› Intel® Xeon® processor E5 v3 family

› 2x the memory and 4x the SSD storage of the D-series

Azure VMs – Overview (cont.)

› Network Optimised (A8-A9):

› 40Gbit/s InfiniBand network support

› Remote direct memory access (RDMA) technology

› Compute Intensive (A10-A11):

› High-performance clusters, modeling and simulations, video encoding, and other compute or network intensive scenarios.

Azure VMs – High Availability

› Configure VMs to use Availability Sets to achieve the 99.95% SLA

› Grouping VMs into Availability sets allows for rolling updates

› Configure a load balancer with Availability sets
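The rolling-update idea can be sketched in a few lines: VMs in an availability set are spread across update domains, and maintenance touches one domain at a time so the others keep serving traffic. The code below only models that grouping. The function name, VM names and domain numbers are made up for illustration, and it does not call any Azure API.

```python
from collections import defaultdict


def rolling_update_batches(vm_update_domains):
    """Group VMs into batches, one batch per update domain.

    vm_update_domains maps a VM name to its update domain index.
    Batches are returned in domain order; updating one batch at a time
    leaves the VMs in every other domain in service.
    """
    batches = defaultdict(list)
    for vm_name, domain in vm_update_domains.items():
        batches[domain].append(vm_name)
    return [sorted(batches[domain]) for domain in sorted(batches)]


if __name__ == "__main__":
    # Hypothetical web tier of three VMs spread over two update domains.
    web_tier = {"web-0": 0, "web-1": 1, "web-2": 0}
    for number, batch in enumerate(rolling_update_batches(web_tier), start=1):
        print(f"batch {number}: update {batch}, the rest keep serving")
```

With a single VM, or with every VM in the same domain, there is only one batch and nothing is left to serve traffic, which is why a lone VM does not benefit from the grouping.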


Web Apps

Overview

› Fast, easy-to-deploy, enterprise-scale web application hosting platform

› Features Autoscale, A/B testing and deployment slots

› Supports .NET, Java, PHP, Python and Node.js

› Supports on-premises hybrid connections

› Available in Free, Shared, Basic and Standard editions

› Deploy code from Visual Studio Online, Git, FTP, Web Deploy and Dropbox


Azure Web Apps - Autoscale


Service Tiers

SQL Server

Overview

› Option of hosting SQL Server on SQL Azure or on VMs

› SQL Server Always On template available in the Azure Gallery

› SQL Azure is almost at parity with traditional SQL Server

› Which to choose?

› SQL Azure:

› Fully managed, easily administered DBaaS.

› Massive scale out and “sharding”

› 500GB database size limit

› SQL IaaS:

› Traditional SQL on Windows VMs

› Scale up

› Domain joined


SQL on Virtual Machines

SQL Mirroring is to be deprecated/removed in SQL v.Next!

SQL Azure Key Features

› Billed per hour regardless of usage/size

› Migration tools available e.g. SQL migration wizard

› Can use familiar SQL tools to export/import data (.bacpac files)

› Performance measured in Database Throughput Units (DTU)

› Mix of CPU, memory and read/write operations

SQL Azure Key Features (cont.)

› Geo-replication across regions and servers

› SQL Azure uses UTC DateTime by default

› SQL Azure uses SQL authentication only - AD/Azure AD authentication is not supported


SQL Azure – Service Tiers

SQL Azure – Geo-replication

› Standard geo-replication

› Standard and Premium tier

› Secondary offline replica

› Offline until an outage occurs

› Billed at 75% of the cost of the primary database

› Active geo-replication

› Premium tier only

› Max. 4 geo-replicated online secondaries

› Readable secondaries

› Billed at 100% of the cost of the primary database
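The billing rules for the two geo-replication modes can be worked through with a small calculation. The sketch below applies the percentages and the four-secondary limit from the slide. The function name and the example price are hypothetical, and standard geo-replication is modelled as exactly one secondary, which is how the slide describes it.

```python
def monthly_cost_with_geo_replication(primary_cost, mode, secondaries=1):
    """Total monthly cost of a primary database plus its geo-replicas."""
    if mode == "standard":
        # One offline secondary, billed at 75% of the primary.
        if secondaries != 1:
            raise ValueError("standard geo-replication is modelled with one secondary")
        return primary_cost * 1.75
    if mode == "active":
        # Up to four readable secondaries, each billed at 100% of the primary.
        if not 1 <= secondaries <= 4:
            raise ValueError("active geo-replication allows 1 to 4 secondaries")
        return primary_cost * (1 + secondaries)
    raise ValueError(f"unknown mode: {mode!r}")


if __name__ == "__main__":
    primary = 100.0  # hypothetical monthly price of the primary database
    print(monthly_cost_with_geo_replication(primary, "standard"))
    print(monthly_cost_with_geo_replication(primary, "active", secondaries=2))
```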


SQL Azure – Active Geo-replication


Checklist


Azure Availability Checklist

› Availability checklist

› Application design

› Deployment and maintenance

› Data management

› Errors and failures

Reference: https://github.com/mspnp/azure-guidance/blob/master/availability-checklist.md


DEMO:

Building a Highly Available Web Application (and watch me break it)

Download the lab files from here: https://azbootcamp2015au.blob.core.windows.net/arch-lab2/lab2-labfiles.zip

http://bit.ly/1FfIGgn


Slideshare:

http://www.slideshare.net/aaronsaikovski
