
  • 8/14/2019 A High Availability Scalable Website

    1/12

    A High Availability Scalable Website

    Stefan Debattista C3126232

    Leeds Metropolitan University, Leeds, United Kingdom

    Abstract

    Purpose: To recommend a website hosting solution for RaisingMillions.com and to demonstrate real-life solutions that provide High Availability (HA) and follow best-practice standards.

    Design/Methodology/Approach: This paper reviews the real-world implementation of a website with the ambition to grow from small to large.

    Findings: The technology is readily available to scale gradually as needed, without having to invest up front in a server intended to last for years. This does not, however, mean that the application code need not be thoroughly planned and designed for scaling from day one.

    Colleagues: In this paper I refer to Josh Nesbitt, an Application Designer with whom I shall be collaborating during implementation to get our application online.

    Practical Implications: The solution provides HA for web servers on a Just in Time (JIT) basis. A systems analysis is recommended for implementing these ideas.

    Before commencing this assignment I carried out an analysis of the problem. As this was not a requirement, I have included it in the Appendices.

    1 Introduction

    RaisingMillions.com aims to be a world-leading site that helps raise money for charities across the globe. The application will be built with an agile Web development framework, Ruby on Rails, hosted on a Linux server. Our core process involves collecting 1 million images from donors to charity. When a user uploads an image it will be saved in three sizes; we will therefore be dealing with at least 3 million images, at a combined total of ~100 KB per image set. This does not include the additional text and images that people will upload, nor the size of the Web pages. We will therefore be handling at least 1.5 terabytes (TB) of data, and even more bandwidth than that. This is a fair amount of data, considering that a Seagate Cheetah hard disk takes 2.2 hours to read 1 TB at 125 MB/s. This volume will have an impact on hardware and software in terms of bandwidth, storage, and response times. My goal in this research is to explore the options for designing a scalable plan to host our website, which will begin with a handful of users and should eventually be able to serve a million.
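The figures in the paragraph above can be checked with a short calculation. The sketch below is illustrative only: it assumes decimal units (1 TB = 10^12 bytes) and a perfectly sustained read rate, and the function names are invented for this example.

```python
# Back-of-envelope check of the introduction's figures (decimal units assumed).
KB = 1_000
MB = 1_000_000
TB = 1_000_000_000_000


def read_time_hours(size_bytes: float, rate_bytes_per_s: float) -> float:
    """Hours needed to read size_bytes sequentially at a sustained rate."""
    return size_bytes / rate_bytes_per_s / 3600


def image_set_storage_bytes(image_sets: int, bytes_per_set: int) -> int:
    """Total bytes for image_sets uploads, each stored as one set of sizes."""
    return image_sets * bytes_per_set


if __name__ == "__main__":
    hours = read_time_hours(1 * TB, 125 * MB)
    print(f"Reading 1 TB at 125 MB/s takes about {hours:.1f} hours")

    sets_bytes = image_set_storage_bytes(1_000_000, 100 * KB)
    print(f"1 million image sets at ~100 KB each: {sets_bytes / TB:.1f} TB")
```

On these assumptions the image sets themselves come to roughly 0.1 TB, so the 1.5 TB estimate presumably leaves headroom for the additional uploads and pages mentioned in the text.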

    This paper sets out to identify:

    - The requirements of our site (see Appendices)
    - Software elements comprising a Web Application
    - Strategic deployment methods
    - Problems likely to be generated from a hardware perspective as a result of growth
    - Solutions for HA
    - Solutions for Data Management


    2 Review of methods and technologies

    2.1 What are the components of a Web Application and specifically of a Ruby on Rails Web app?

    Before reviewing how to host an HA Web Application it is important to understand the architecture of the elements (Booch, 2001) that constitute it. Jim Conallen, a Web modelling evangelist, offers a diagram of a canonical Web architecture, shown in Figure 1.

    Figure 1 - Web Architecture Diagram (Conallen, 2003)

    During the development stage RaisingMillions.com runs on a local machine, specifically Josh's Apple MacBook Pro. This is only possible because the demands on the machine's components are modest: the site is not yet serving load from the wider world.

    Figure 2 - Website Elements running on a single machine

    Mongrel (a Web server) executes the application during development. The script/server command runs the application in development mode on port 3000 (Thomas, Hansson, 2007). For the deployed application, however, Mongrel on its own is far from ideal: it is single-threaded and can therefore only process one request at a time, so it could not possibly keep up on the Web.

    Rails applications are deployed behind a front-end Web server such as Apache, which can handle hundreds of requests at the same time and takes the incoming requests from clients (Thomas, Hansson, 2007).

    This setup can scale to multiple application servers. Apache will distribute load to any number of Rails processes running on any number of back-end machines. This basic architecture allows Rails to scale until the Web or database server falls over; however, as you might expect, these too can scale out to multiple machines. See where we're going here? We'll see how to do this in just a bit.
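To make the idea of a front-end server spreading requests over several Rails processes concrete, here is a minimal sketch. It assumes a simple round-robin policy, which is only one possible strategy and is not taken from the paper or from Apache's documentation; the class name and back-end addresses are hypothetical.

```python
from itertools import cycle
from typing import Iterator, List


class RoundRobinBalancer:
    """Hands out back-end addresses in turn, one per incoming request."""

    def __init__(self, backends: List[str]) -> None:
        if not backends:
            raise ValueError("at least one back-end is required")
        self._backends = list(backends)
        self._next: Iterator[str] = cycle(self._backends)

    def pick(self) -> str:
        """Return the back-end that should serve the next request."""
        return next(self._next)


# Hypothetical addresses for three Rails processes on two machines.
balancer = RoundRobinBalancer(["10.0.0.1:8000", "10.0.0.1:8001", "10.0.0.2:8000"])
assignments = [balancer.pick() for _ in range(5)]
print(assignments)
```

Adding capacity in this model means appending another address to the list, which is the sense in which the application tier scales out.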

    [Diagram: Internet; Web Server (Apache or similar); Rails (Web App) x3]

    2.2 Deployment

    Deployment is the stage where the application is uploaded to a live Web server. It's when the beer and champagne are supposed to flow: the site will be written about in Wired magazine and will become an overnight name on the WWW. Unfortunately (for the marketers) it may not be that easy, but fortunately for us (techs) that gives us time to manage growth.

    It is best practice to deploy first onto a development server (Thomas, Hansson, 2007). This doesn't have to be, and indeed shouldn't be, the final deployment environment, nor does it have to be a heavy-duty machine just yet. The sole purpose of this stage is practice and testing. A code, test, commit, deploy routine (Thomas, Hansson, 2007) is good practice and worth getting used to. It will highlight teething problems that are likely to emerge during real deployment. It will also give clients, lecturers, and trusted friends the opportunity to feed back ideas and issues to us. The skills I will acquire during this stage are:

    - Server deployment
    - Migration

    This will also allow me to begin monitoring and testing the server. At some point the Pentium 4 laptop sitting in the hallway is going to run out of resources, and measuring when this happens will give an insight into the load the application places on the server with a given number of clients.

    So far the application has been hosted on PCs: the development server during coding, and the deployment server during the test stage. Now comes the time when we commit and deploy onto a Web server. A Web server is essentially a more powerful computer that can usually be upgraded and that is always connected to the Internet through a stable connection with a static address. There are three main options here:

    - To host the website yourself
    - To rent web space, although this gives you no control over the machine, which renders this type of service useless for us
    - To outsource it to a third-party Web hosting company

    Hosting in-house requires a WLAN link, firewall, router and fast Internet connection (a 6MB to T3 equivalent 1:1 connection to a provider like BT or Easynet costs in the region of £5,000 per annum) as well as the Web/domain server itself. Web hosting companies will rent you a machine for a monthly price and generally provide a range of hardware support. Figure 3 shows a real-life hosting service offered by Rackspace, one of the world's leading hosting companies, from whom I requested a quote.


    Figure 3 - Rackspace Options 1/2, and 3


    As the number of users increases, more demand will be put on the Web server, application server and database server. Each element must be available to perform its function (Microsoft MCSE 70-293, 2004), and depending on how critical each one is, it is important to take steps to ensure that it is up and running as much of the time as possible. Which resources are going to suffer? What needs to be done?

    2.3 Problems likely to be generated from a hardware perspective as a result of growth

    Scalability refers to the ability of a computer system to handle growing amounts of work in a graceful manner, or to be readily enlarged (A B. Bondi, 2000). To do this a scalable computer system needs to support three basic areas of growth: processing, storage, and bandwidth (Randal E. Bryant, 2009). One can achieve scalability by scaling vertically (scaling up) or horizontally (scaling out). To scale vertically means to add resources to a single node in the system, thereby improving its performance; this typically involves adding processors, memory or hard disk space. Scaling horizontally, on the other hand, involves distributing resources among multiple machines: a cluster of servers interconnected by a high-speed LAN and, in the case of multiple data centres, a high-speed WAN. Google, for example, has data centres throughout the globe, and its algorithms work out the closest centre to you, thereby minimising your response time (C D. Cuong, 2007).

    There is always the need to iterate constantly on bottlenecks (C D. Cuong, 2007). Software iterations deal with making the actual applications more efficient. The main hardware concerns are bandwidth, processing resources and data space. Gigabit Ethernet and switches between nodes (servers) address LAN bandwidth issues with their high data rates.

    2.4 Solutions for High Availability

    One surprising thing about web hosting is that a single machine can handle a large number of visitors as long as the data is mostly static.

    [Diagram: Web, Database, and Storage Server; Router; Switch; WAN (Internet)]

    Figure 4: A simple website works great on a standard PC

    A standard machine with a Core 2 Duo processor running Windows or Linux and Apache, connected with a 6MBps connection, could handle hundreds of thousands of visitors per day. This configuration will work well unless:

    - Traffic increases
    - The machine fails
    - Pages are large
    - Pages are dynamic
    - Back-end processing needs to take place

    This is because a single processor can only provide a limited amount of processing power. There are two strategies for handling the increased load:

    1. Upgrading the servers, mainly RAM and CPUs (vertical scaling)
    2. Clustering a number of machines (horizontal scaling)

    A server OS can ensure HA by means of clusters, which are groups of servers that function as a single entity. The elements that constitute the Web Application in Section 3.1 will be hosted on each of these clustered servers as and when needed. The general idea is that when the load is too great and nothing more can be done from a software point of view, the next step is to add another machine to distribute the load. Clients access the server applications using a specially assigned cluster name and cluster Internet Protocol (IP) address, and one or more of the servers in the cluster are responsible for responding to each client request (Microsoft MCSE 70-293, 2004). Should a server in the cluster fail, another server in the cluster will take over responsibility for the failed server's processes. This is called failover. When the malfunctioning machine comes back online it can restart its processes; this is called failback (Microsoft MCSE 70-293, 2004).
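The failover and failback behaviour described above can be sketched as a toy model. This is not how Windows Server or Linux clustering is implemented; it only tracks which node runs which process, and the class, node and process names are hypothetical.

```python
from typing import Dict, List


class Cluster:
    """Toy cluster that tracks which node currently runs each process."""

    def __init__(self, home: Dict[str, str]) -> None:
        # home maps process name -> the node it normally runs on.
        self.home = dict(home)
        self.running_on = dict(home)
        self.up = set(home.values())

    def fail(self, node: str) -> List[str]:
        """Failover: move the failed node's processes to a surviving node."""
        self.up.discard(node)
        if not self.up:
            raise RuntimeError("no surviving node to fail over to")
        target = sorted(self.up)[0]
        moved = [p for p, n in self.running_on.items() if n == node]
        for process in moved:
            self.running_on[process] = target
        return moved

    def recover(self, node: str) -> List[str]:
        """Failback: return processes to their home node once it is back."""
        self.up.add(node)
        moved = [p for p, n in self.home.items()
                 if n == node and self.running_on[p] != node]
        for process in moved:
            self.running_on[process] = node
        return moved


cluster = Cluster({"web": "node-a", "app": "node-b"})
cluster.fail("node-b")
after_failover = dict(cluster.running_on)
cluster.recover("node-b")
after_failback = dict(cluster.running_on)
print(after_failover, after_failback)
```

In the example, the "app" process moves to node-a while node-b is down and returns to node-b once it recovers; clients would keep using the same cluster name throughout.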

    Microsoft's solution technologies consist of server clusters and network load balancing. These Microsoft solutions, a part of MS Server 2003, provide excellent support and integration with the Windows environment.

    Linux's HA concept is very similar to Microsoft's. Google amended Red Hat Linux for load balancing (Harnur, 2000). They used 15,000 PCs to build the world's largest Linux cluster and achieved nearly 100% uptime for processing over 150 million queries per day with


    A SAN is a network that connects servers to storage, typically over fibre-optic cable, to provide HA for data (Microsoft 2002; Freedman, 1999). SAN devices generally deliver data very quickly (up to 2Gbps) and can connect up to 256 devices. A SAN is designed to eliminate single points of failure and is the way forward for large disk capacities, fast delivery and high redundancy, but at a much greater cost than RAID.

    3 Analysis

    3.1 Components of a Web Application

    Every application has its own unique setup and different requirements for accessing and processing information. YouTube, for example, had unforeseen problems with its thumbnails. Cuong Do described the realization that, although the team (of 7) were concerned with delivering video, they had not envisaged a problem with the little thumbnails. The issue was caused by the immense number of requests for data: whereas a video stream involves a single request for the video to be played, each video came with tens of thumbnail requests from clients.

    The thinking and methodology behind the scaling of the elements highlighted in this report are remarkable, in that the elements scale easily by cloning their processes and distributing them onto multiple servers. It is a simple solution to a complex problem.

    When it comes to dealing with problems between applications, and also within applications themselves, the basic principle is to simplify the solution as much as possible and deal with the problem, taking care not to break something else.

    3.2 Deployment

    In the data-driven world that we now live in, finding a deployment solution is not an issue. In an attempt to quantify the volume, researchers reckon there was enough digital data in 2006 to theoretically fill 12 separate stacks of novels, each of which would extend the 93 million miles from the Earth to the sun. By 2010, the accumulation of digital data would further extend these 12 stacks of books to reach from the sun to Pluto and back! (McLean, 2007)

    What is an issue, however, is coping with the vastness of the operating systems that Web Applications, like all applications, are deployed on. There are so many things to consider in setting up and maintaining a secure operating system that there are always major threats to the data.

    A fail-proof backup system is imperative: when everything is working, everything is great, but should data get corrupted, deleted, or lost, the business implications can quite literally ruin a company.

    I believe deployment is a quick and painless process, because someone with something to deploy will always have one of four options:

    - Hosting providers, where all one needs is a credit card to access one of the countless hosting providers
    - A static IP at home and a small computer or set of computers to experiment on
    - A largish system of his/her own at work, or a company's cash to buy one
    - A beast of a system at work capable of thousands, millions, or even trillions of processes every second

    Figure 6 - Google's first server


    3.3 Problems likely to be generated from a hardware perspective as a result of growth

    As with application components, every system has its own unique setup and is prone to its own problems. One major differentiating factor is that dealing with hardware issues is very likely to cost a lot more money.

    3.4 Solutions for HA and Data Management

    The flexibility of hardware allows us to build computer systems that serve different computing needs for different organisations.

    Solutions for HA systems have the potential to be very expensive, although from a hobby or make-do perspective they can be cheap but complicated. Forty clustered Pentium 4s may have the same processing power as a rack-mounted IBM blade server but are certainly a lot messier to deal with, though also something to be proud of.

    A differentiating factor between solving hardware and software problems is that there is a lot of reliance on hardware vendors. These companies sometimes also stray from regulations. Some SAN vendors, for example, are not part of the Storage Networking Industry Association (SNIA), and this can cause problems because of hardware conflicts. Reviews are excellent resources for finding out more.

    4 Proposed methodology

    My recommendation for assessing and implementing a hosting system for RaisingMillions.com adopts a Systems Analysis process for designing and planning the system, and a Just in Time (JIT) methodology for implementing new ideas. JIT is an established process for implementing better (usually more expensive or more complex) ideas only when needed, to keep overheads and time to a minimum. The Systems Analysis approach I recommend consists of four phases:

    - Phase 1 - Understanding the business needs
    - Phase 2 - Analysing systems requirements
    - Phase 3 - Analysing and making decisions
    - Phase 4 - Implementing the system

    Phase 1
      o Answering questions like budget
      o Timescales
      o Realistic HA needs
      o Realistic storage needs
      o Speed
      o Environmental considerations
      o Risk log

    Phase 2
      o Specific technical requirements (Use Case Document, Wireframes)
      o Research will help determine more precise hard disk requirements, processor requirements, and backup.
      o Operating system will need to be Linux to host Rails, but do we also need a Windows server?
      o Future technical requirements
      o A checklist of functional requirements should be compiled.

    Phase 3
      o Matching precise set-ups with the constraints and requirements set out in Phases 1 and 2.
      o Skills should be considered and technology chosen appropriately.
      o Testing - site should be loaded onto the deployment server.
      o Technologies and their set-up will be detailed.
      o Tests should be carried out to make implementation as smooth as possible.

    Phase 4
      o Implementation/project plan with all the above details.

    What I can envisage is demonstrating the site being hosted on a small cluster of laptops or virtual servers, and switching one off to demonstrate the failover and failback features. I would also implement some sort of backup and disaster recovery plan. It would also be interesting to explore S3 or EC2, depending on constraints.

    5 Summary

    There is an unlimited amount of money one can spend, in an unlimited number of ways, to host a Web Application, because of the vast number of systems and services available on the market.

    A website needs to support three areas of growth: processing, bandwidth, and storage. One can achieve scalability by distributing these among multiple physical or virtual servers. Usually the operating system can take care of all of these by means of clustering. Storage, however, is nowadays managed by means of SANs (Storage Area Networks), although a cheaper alternative is the older RAID configuration, which can still be effective if centralized storage, with its greatly reduced management time, is not the key objective. Database scalability is usually handled independently of the OS by means of its intrinsic design; this is a DBA's role.

    The important thing is to plan for the highest foreseeable simultaneous load on the system's software and hardware. There is no point in investing in a massive infrastructure for a website that may or may not become very popular. And becoming popular on the net is not as easy as one may think.

    The infrastructure available to us IT professionals is so vast that we can literally pick and choose specific areas to invest money in to improve performance, and the flexibility is unparalleled. Site and server statistics help us to identify when to scale and by how much.
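As a rough illustration of using server statistics to decide when to scale and by how much, consider the sketch below. The 75% CPU threshold, the per-server capacity of 400 users and the single spare node are all hypothetical values chosen for the example; real figures would come from monitoring the deployment server.

```python
import math
from typing import Sequence


def servers_needed(peak_users: int, users_per_server: int, spare: int = 1) -> int:
    """Servers required for peak_users, plus spare nodes kept for failover."""
    if users_per_server <= 0:
        raise ValueError("users_per_server must be positive")
    return math.ceil(peak_users / users_per_server) + spare


def should_scale_out(cpu_samples: Sequence[float], threshold: float = 0.75) -> bool:
    """True when average CPU utilisation (0..1) exceeds the threshold."""
    if not cpu_samples:
        return False
    return sum(cpu_samples) / len(cpu_samples) > threshold


# Hypothetical utilisation samples from a monitoring tool.
print(should_scale_out([0.82, 0.91, 0.78]))
# 1,500 simultaneous users at an assumed 400 users per server.
print(servers_needed(1500, 400))
```

This fits the Just in Time approach: a machine is added only once measurements show the existing ones are saturated.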

    This paper has helped me gain a sound understanding of the systems supporting a website and of the importance of planning the architecture, not least to avoid wasting money on buying or renting the wrong products or software. No engineer constructs a building without a solid plan in place, and neither should we.


    6 Appendices

    6.1 Systems Analysis of computing needs

    6.1.1 Problems and requirements

    This is the most important part of the project. It encompasses understanding the business requirements and designing a system to realize them. A systems analysis approach was undertaken; this generally consists of preliminary investigation, problem identification, requirements analysis, decision analysis, system implementation and operation/support (Whitten et al., 2001).

    The website is what our customers use to interact with our business, and it must be stable enough to provide true 24/7 high availability (HA). A measure of a system's availability is its uptime. Table 1 gives example ranges of uptime; however, there is no hard and fast rule about what the accepted rate is, and different organizations have different standards. For example, eBay achieved 99.94% uptime per annum (June 2004), an improvement of more than 4 percentage points on the 95.2% site availability of June 1999 (eWeek, 2004).

    Software and hardware failures are reported to account for 50% of unplanned downtime. Natural disasters and human errors are the other causes (Marcus and Stern, 2003). HA aims to minimise these failures.

    The following requirements were drawn up for RaisingMillions.com. They are speculative estimates from the RaisingMillions team:

    - RaisingMillions.com must (eventually) support 1000-1500 simultaneous users
    - RaisingMillions.com must be a true 24/7/365 HA site, minimizing downtime
    - RaisingMillions.com must be backed up
    - RaisingMillions.com requires 2TB of storage for the images and other files
    - RaisingMillions.com must have failover redundancy
    - RaisingMillions.com must be cost effective with the possibility to expand

    Uptime (%)   Downtime per year   Downtime per week (minutes)
    98.00        7.3 days            202
    99.00        3.65 days           101
    99.50        43.8 hours          50
    99.80        17.52 hours         20
    99.90        8.76 hours          10
    99.99        52.6 minutes        1

    Table 1 - Measuring Availability. Cited from Yan Han, An Integrated High Availability Platform
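The entries in Table 1 follow directly from the uptime percentage. The sketch below reproduces them, assuming a 365-day year and a 7-day week; the function names are invented for this example.

```python
def downtime_per_year_hours(uptime_percent: float) -> float:
    """Hours of downtime in a 365-day year at the given uptime percentage."""
    return (1 - uptime_percent / 100) * 365 * 24


def downtime_per_week_minutes(uptime_percent: float) -> float:
    """Minutes of downtime in a 7-day week at the given uptime percentage."""
    return (1 - uptime_percent / 100) * 7 * 24 * 60


for uptime in (98.0, 99.0, 99.5, 99.8, 99.9, 99.99):
    yearly = downtime_per_year_hours(uptime)
    weekly = downtime_per_week_minutes(uptime)
    print(f"{uptime:6.2f}%  {yearly:8.2f} h/year  {weekly:6.1f} min/week")
```

For example, 99.9% uptime allows 0.001 x 8,760 hours, or about 8.76 hours of downtime per year, which matches the table.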

    6.1.2 Insight into modern scalability solutions such as Grid/Cloud computing

    I had intended to write about this topic. I looked into it, and very much enjoyed researching it and watching some very interesting videos about Google's MapReduce programming model, Amazon's EC2 and S3, and Yahoo's M45, which comprise tens of thousands of processors and petabytes of disc (petabyte isn't even in my Microsoft Word!) and are capable of tens of millions of calculations per second. Unfortunately, I feel that the word count is simply not large enough to explore and report on this field of study, and it does not really tie into the aim of this assignment: to give a Web start-up (RaisingMillions.com) a realistic, cost-efficient solution for scaling an Internet site.


    7 Bibliography

    1. BONDI, A B. 'Characteristics of scalability and their impact on performance', Proceedings of the 2nd international workshop on Software and performance, Ottawa, Ontario, Canada, 2000, ISBN 1-58113-195-X, pages 195-203

    2. Video: YouTube Scalability. Google Tech Talk. CUONG DO (and an elite group of scalability ninjas). Google: 2007.

    3. BRYANT, R E. 2009. Data Intensive Super Scalable Computing, lecture notes distributed in the topic module code MSc COMPUTING. Carnegie Mellon University, Oxford LHB.

    4. BOOCH, G. 2001. IBM: The architecture of Web applications [online]. [03 March 2009]. Available from World Wide Web: http://www.ibm.com/developerworks/ibm/library/it-booch_web/

    5. D. THOMAS, D H. HANSSON. 2007. Agile Web Development with Rails. Texas: Pragmatic Bookshelf

    6. C. ZACHER. 2004. Windows Server 2003 Network Infrastructure MCSE 70-293. Washington: Microsoft Press

    7. RAID. 2006. Wikipedia: Redundant array of independent disks [online]. [04 March 2009]. Available from World Wide Web: http://en.wikipedia.org/wiki/Redundant_array_of_independent_disks

    8. MCLEAN, D. 2007. There's how much data? 15/03/2007. Internet News [online]. [06 Mar 2009]. Available from World Wide Web: http://www.websearchguide.ca/netblog/archives/005907.html

    9. Whitten, J.L., Bentley, L.D., Dittman, K.C. (2001), Systems Analysis and Design Methods, 5th ed., McGraw-Hill, Boston, MA.

    10. eWeek (2004), "Marketplace to the world", eWeek, Vol. 21 No. 35, pp. 22-4.

    11. Marcus, E., Stern, H. (2003), Blueprints for High Availability, 2nd ed., Wiley, New York, NY.

    12. Harnur, S. (2000), "Google relies exclusively on Linux platform to chug along", available at: www.hpworld.com/hpworldnews/hpw009/02nt.html (accessed 26 October 2004).

    13. Holzle, U. (2002), The Google Linux Cluster, University of Washington, Seattle, WA, available at: www.cs.washington.edu/info/videos/asx/colloq/UHoelzle_2002_11_05.asx (accessed 26 October 2004).

    14. Microsoft (2002), Microsoft Computer Dictionary, 5th ed., Microsoft Press, Washington, DC.
