High Availability

Pascal Robert, Druide informatique

Business requirements

• Check your requirements!

• Recovery Time Objective (RTO)

• Recovery Point Objective (RPO)

• Budget

• SLA from your providers
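To compare a provider's SLA against your RTO, it can help to turn the percentage into hours of permitted downtime. The following is a minimal sketch; the function names are hypothetical, and the series formula assumes that every component must be up and that failures are independent.

```python
def allowed_downtime_hours(sla_percent: float, period_hours: float = 365 * 24) -> float:
    """Return the downtime (in hours) that an SLA percentage permits over a period."""
    if not 0 <= sla_percent <= 100:
        raise ValueError("sla_percent must be between 0 and 100")
    return period_hours * (1 - sla_percent / 100)


def combined_sla_percent(*sla_percents: float) -> float:
    """Availability (as a percentage) of components in series.

    Assumes all components must be up and that they fail independently.
    """
    availability = 1.0
    for pct in sla_percents:
        availability *= pct / 100
    return availability * 100
```

For example, `allowed_downtime_hours(99.9)` gives roughly 8.76 hours per year, and chaining two 99.9% services with `combined_sla_percent(99.9, 99.9)` gives roughly 99.8%.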

DNS

• Think about it!

• Use a solid DNS provider!

• TTL!

Cloud or my own?

• Cloud is more flexible and more scalable on demand

• Easy to grow with the cloud

• Own hardware might cost less

• Choose your provider wisely (read the fine print!)

• Snapshots!

Simple setup

• One Web server + one or more app servers + one db server

• Pros:

• Cost is not too high

• Scales your WO applications

• Cons:

• No high availability for your Web and database services

HA for databases

• Replication/standby

• Only failover, or read-only slaves

• Great for multiple locations

• Clustering/load balancing

• Might have some drawbacks

• Not ideal for multiple locations

• Amazon RDS

Replication/clustering doesn’t replace backups!

Neither does RAID!

Tools for the job

Tools for Linux

• Heartbeat

• HAProxy

• DRBD

Heartbeat

• Can bring up a virtual network interface

• Monitors services and switches over

• Failover or load balancing
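The active/passive switch-over idea can be sketched as follows. This is an illustration of the concept only, not Heartbeat's actual code or configuration; the class name, the `deadtime` field and the `holds_vip` flag are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class FailoverMonitor:
    """The passive node's view of the active peer (illustrative only)."""
    deadtime: float          # seconds of silence before the peer is declared dead
    last_seen: float = 0.0   # timestamp of the last heartbeat received
    holds_vip: bool = False  # True once this node has taken over the virtual IP

    def heartbeat(self, now: float) -> None:
        """Record a heartbeat received from the active peer."""
        self.last_seen = now

    def tick(self, now: float) -> bool:
        """Take over the virtual IP if the peer has been silent for too long."""
        if not self.holds_vip and now - self.last_seen > self.deadtime:
            # A real setup would bring up the virtual interface and start services here.
            self.holds_vip = True
        return self.holds_vip
```

The passive node calls `tick()` periodically; once the silence exceeds `deadtime` it claims the virtual IP and keeps it.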

HAProxy

• Load balancer as software

• Can use Heartbeat for LB failover

• Can look for the session ID in a cookie or the URL path

• Can act as a basic firewall

• Not only for HTTP(S)
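Session stickiness can be sketched like this: requests carrying a session ID always reach the same app server, while new requests are spread round-robin. Hashing the session ID is a technique swapped in to keep the example short; it is not a description of how HAProxy tracks sessions internally, and `pick_backend` and its parameters are hypothetical.

```python
import hashlib
from typing import Optional, Sequence


def pick_backend(backends: Sequence[str], session_id: Optional[str], counter: int) -> str:
    """Choose a backend: sticky by session ID when present, round-robin otherwise."""
    if not backends:
        raise ValueError("no backends available")
    if session_id:
        # Same session ID -> same digest -> same backend, as long as the list is unchanged.
        digest = hashlib.sha256(session_id.encode("utf-8")).digest()
        index = int.from_bytes(digest[:8], "big") % len(backends)
    else:
        # No session yet: rotate through the backends using the request counter.
        index = counter % len(backends)
    return backends[index]
```

Note that with plain hashing, adding or removing a backend remaps existing sessions, which is one reason real load balancers tend to use cookies or lookup tables instead.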

DRBD

• RAID 1 over a network

• Failover or clustering, depends on file system

Basic HA setup

• Two Web/apps servers with Heartbeat (active/passive)

• Two database servers and Heartbeat and DRBD

Average HA setup

• Two Web servers with Heartbeat

• Two app servers with Heartbeat (for Monitor)

• Two database servers and Heartbeat and DRBD

Fantastic HA setup

• Two load balancers with HAProxy and Heartbeat

• Two active Web servers

• Two or more app servers

• Two database servers, with Heartbeat and DRBD

Tools for the cloud

• Auto scaling

• Rackspace Auto Scale

• Amazon Auto Scaling

• Load balancers

• Amazon Elastic Load Balancer

• Linode NodeBalancers

• Rackspace Cloud Load Balancers

Rackspace Auto Scale

• Can check by memory, CPU, load, file system and network

• Has APIs

• Needs VM images

• Specify minimum and maximum

Amazon Auto Scaling

• Works with CloudWatch

• Will scale based on network requests or load

• Needs AMIs

• Has APIs

• Specify minimum and maximum
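Both auto scaling services follow the same basic pattern: a metric crosses a threshold, and the instance count moves within the minimum and maximum you specified. Below is a minimal sketch of one evaluation step; the function, its thresholds and the one-instance step size are assumptions for illustration, not either provider's API.

```python
def desired_instances(current: int, load: float, scale_up_at: float,
                      scale_down_at: float, minimum: int, maximum: int) -> int:
    """Return the instance count after one scaling evaluation."""
    if minimum > maximum:
        raise ValueError("minimum must not exceed maximum")
    if load > scale_up_at:
        target = current + 1   # add one instance when the metric is too high
    elif load < scale_down_at:
        target = current - 1   # remove one instance when the metric is low
    else:
        target = current       # inside the comfort zone: no change
    # Never go below the minimum or above the maximum.
    return max(minimum, min(maximum, target))
```

With a minimum of 2 and a maximum of 4, a group of 4 instances under heavy load stays at 4, and a group of 2 instances under light load stays at 2.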

Amazon Elastic Load Balancer

• Supports TCP, HTTP, HTTPS and SSL

• Can check path (URL)

• Can load balance between availability zones (within a region)

• Integration with CloudWatch

• Can use the application’s session cookie (wosid)

Linode NodeBalancers

• Supports TCP, HTTP and HTTPS

• Session stickiness works with a table of IPs or an HTTP cookie

• Health check can do a status (2xx, 3xx) check or regex on body
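The two health check styles can be sketched as a single function. Combining the status check with the body regex is an assumption made for this example, not documented NodeBalancer behavior, and `is_healthy` is a hypothetical name.

```python
import re
from typing import Optional


def is_healthy(status: int, body: str = "", body_pattern: Optional[str] = None) -> bool:
    """Treat a node as healthy on a 2xx/3xx status and, optionally, a body match."""
    if not 200 <= status < 400:
        return False
    if body_pattern is not None:
        # The node must also return a body containing the expected pattern.
        return re.search(body_pattern, body) is not None
    return True
```

A body check is useful when the app can answer with a 200 status while still reporting a problem in the page content.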

Rackspace Cloud Load Balancers

• Can cache content (images, audio, video, css)

• Can display error page when all nodes are down

• Session persistence by cookie for HTTP only

• Required for Auto Scale

Mixing

• You can use HAProxy and Heartbeat in the cloud

• DRBD might work, but I can’t confirm

Monitoring/relaunch

• You can use Nagios’ event handlers to restart stuck instances

• Amazon CloudWatch and Rackspace Cloud Monitoring are good too

• Start new app instances or VMs based on memory or other criteria
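A Nagios event handler is a command that Nagios runs on state changes, typically passing the `$SERVICESTATE$`, `$SERVICESTATETYPE$` and `$SERVICEATTEMPT$` macros as arguments. The decision logic for restarting a stuck instance could look like the sketch below; the function names, the `soft_attempt` threshold and the returned action strings are assumptions, and the actual restart command is left out because it depends on your deployment.

```python
def should_restart(state: str, state_type: str, attempt: int, soft_attempt: int = 3) -> bool:
    """Decide whether a stuck instance should be restarted.

    Restart on a confirmed (HARD) CRITICAL state, or early on a chosen SOFT attempt.
    """
    if state != "CRITICAL":
        return False
    if state_type == "HARD":
        return True
    return state_type == "SOFT" and attempt == soft_attempt


def handle_event(state: str, state_type: str, attempt: str) -> str:
    """Map the macro values passed by Nagios to an action name (hypothetical)."""
    if should_restart(state, state_type, int(attempt)):
        return "restart"  # a real handler would run the restart command here
    return "ignore"
```

Restarting on a late SOFT attempt gives the instance a chance to recover before Nagios sends notifications for the HARD state.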

Alternatives

• Use mod_proxy_balancer and Direct Connect

• Use Puppet/Chef

TODOs

• Scripts to monitor state of apps for scaling

• Event handlers for Nagios

• Replace JavaMonitor with something else

• Should we get away from wotaskd and the WO adaptor?
