sri-prj702- project report

Kurra Srirekha 1

NELSON MARLBOROUGH INSTITUTE OF TECHNOLOGY

BRIDGING AWS WITH AZURE FOR BACKUP AND FAILOVER PROVISION: USING VPN CONNECTION

Kurra Srirekha

Student ID: 13472618

Course ID: PRJ702

9/29/2016

Graduate Diploma Project in Information Technology

Abstract

There are many mission-critical businesses deployed via cloud computing. The recent Amazon

Web Services (AWS) cloud outage due to a Sydney storm impacted their business brand.

Similarly, Azure outages have impacted customers globally. All these recent examples point

to one challenge, and that is interruption in cloud computing. Hence there needs to be a better

solution than depending on one cloud. The solution that is discussed in this project is cloud-to-

cloud failover and backup.

Two clouds, AWS and Azure, are chosen for the implementation of this project. This

project has achieved the objective of business continuity with 100% data availability by

integrating AWS, Azure, MariaDB cluster, Openswan, IPsec, VPN tunnel, Secure Shell (SSH)

and Secure Copy Protocol (SCP) technologies. Sections 1 and 2 of this project cover the

potential challenges in cloud outages, provide details on some of the most recent outages and

actual challenges faced by businesses due to those cloud outages, a literature review, and the

theoretical study of the concept. Sections 3 and 4 cover the analysis, design, solution

architecture and implementation of the project. Section 5 covers the test scenarios, test cases,

results and evaluation.

Though there are lessons to be learned from this project implementation, it provides a

direction for future work by enhancing cloud-to-cloud failover to achieve high scalability and

performance.

This project bridges AWS with the Azure cloud environment using a virtual private

network (VPN) to provide backup and failover.

Key words: Cloud-to-Cloud Failover, Cloud Outage, Cloud Backup, Cloud Load Balancing,

AWS, Azure, MariaDB cluster, Openswan, IPsec, VPN tunnel, SSH and SCP.

Acknowledgement

It is always a delight to acknowledge the praiseworthy people in the Graduate program for the

great guidance I received in developing my practical and theoretical skills during the Graduate Diploma.

Firstly, it has indeed been a great privilege for me to have Mr. David Airehrour, as my

mentor for this project. His awe-inspiring personality, superb guidance and constant support

are the motive power behind this project work.

I take this opportunity to express my most extreme appreciation to him. I am additionally

obliged to him for his auspicious and significant advice.

I would also like to express my heartfelt gratitude to Mr. Ali Javan (IT

Lecturer) and Mrs. Charanya Mohanakrishnan (IT Lecturer), who consistently supported me in

every possible way, from their initial encouragement to their support to this date.

Finally, I am thankful to all technical and non-teaching staff of the Department of

Information Technology (Networks) and NMIT for their constant assistance and co-operation.

Table of Contents

List of Figures
List of Tables
Acronyms
Chapter 1: Introduction of the Project
1.1. Introduction
1.2. Project Objective
1.3. Significance of Project
1.4. Scope and Limitations of Project
1.5. Summary
Chapter 2: Background of Research
2.1. Introduction
2.2. Backup and Failover
2.3. Cloud Computing
2.3.1. Backup and Failover Approaches in Cloud
2.3.2. Cloud Outages
2.4. Multi-Cloud Environment
2.4.1. Cloud-to-Cloud Failover
2.4.2. Benefits of Connecting Two Clouds
2.5. VPN
2.5.1. Cloud-to-Cloud Connectivity using VPN
2.6. Ubuntu
2.7. Summary
Chapter 3: Resources and Technical Analysis
3.1. Introduction
3.2. Analysis of Resources
3.3. Analysis of Technical Challenges
3.4. Summary
Chapter 4: Design and System Implementation
4.1. Introduction
4.2. Problem and Context
4.3. Solution Structure
4.4. Implementation Process
4.4.1. Site-to-Site VPN Tunnel Setup
4.4.2. MariaDB Galera Cluster Setup (Replication and Failover)
4.4.3. SSH Keys and SCP Command (File System Backup)
4.5. Challenges
4.6. Summary
Chapter 5: Testing and Evaluation
5.1. Introduction
5.2. VPN Tunnel Testing
5.3. Database Cluster Replication
5.4. AWS-AZURE: Inter-Cloud Transfer (file transfers between two clouds)
5.5. Results Evaluation
5.5.1. AWS-AZURE Failover
5.5.2. AWS-AZURE Backup
5.6. Summary
Chapter 6: Recommendations and Future Scope
6.1. Introduction
6.2. Recommendations
6.3. Future Work
6.4. Summary
Chapter 7: Conclusion
7.1. Introduction
7.2. Conclusion
8. Reference List
9. Appendices
9.1. Implementation Screenshots
9.2. Coding Part of Implementation

List of Figures

1.1 Sydney Storm at Collaroy in Northern Sydney Impacted AWS Cloud (Source: SBS.com)
2.1 Multiple clustered applications before failover (Source: Microsoft TechNet)
2.2 Multiple clustered applications after failover (Source: Microsoft TechNet)
2.3 Network Load Balancing (Source: Microsoft TechNet)
2.4 Cloud Computing (Source: Wikipedia)
2.5 Amazon Aurora Database Backup for Replication and Clustering (Source: AWS)
2.6 Use of Multi Cloud Configuration using AWS and Azure (Source: Google Images)
2.7 Failover in Multi-Cloud Environment (Source: Google Images)
4.1 Solution Structure For Backup and Failover
4.2 Architectural Design of AWS and Azure Cloud to Cloud Connectivity
4.3 Virtual Network Gateway between AWS and Azure
5.1 Communicating Both VMs of AWS and Azure
5.2 Database Replication Between AWS and Azure
5.3 File Transferring Between AWS and Azure
6.1 Cloud to Cloud Load Balancer – Future Scope

List of Tables

1.1 Earlier Cloud Service Outages
3.1 Sample Linux Commands
3.2 MySQL Configuration
4.1 Openswan Configuration File
4.2 MariaDB Installation Command
4.3 Configuration for wsrep Option in my.config
4.4 Configuration for VSRep Option in my.config
5.1 Test Case Scenarios
5.2 VPN Tunnel Testing Result
5.3 Database Cluster Testing Result
5.4 Cloud to Cloud Backup Testing Result

Acronyms

AWS Amazon Web Services

VPN Virtual Private Network

VNet Virtual Network

SSH Secure Shell

SCP Secure Copy Protocol

VM Virtual Machine

VPC Virtual Private Cloud

ERP Enterprise Resource Planning

ELB Elastic Load Balancing

NAT Network Address Translator

OS Operating System

PaaS Platform as a Service

SaaS Software as a Service

Chapter 1: Introduction of the Project

1.1. Introduction

Most of the cloud-based businesses including banks in Sydney and surrounding areas were

interrupted on June 4, 2016, due to a storm in Sydney. Many services were affected when power

failed for Amazon Simple Storage Service (S3) and Elastic Compute Cloud (EC2) (Saarinen,

2016). This was the inspiration for this project. This project provides the study,

design and implementation of cloud-to-cloud failover to provide 100% availability of the

services that are deployed or dependent on cloud computing.

1.1 Sydney Storm at Collaroy in Northern Sydney Impacted AWS Cloud (Source: SBS.com)

In many cases, organisations with large-size deployments choose to operate their

framework on numerous platforms or environments based on requirements such as providing

backup, failover, redundancy or expanding their business. Cloud computing and virtual cloud

services provided by different vendors are well suited to backup and failover scenarios.

However, depending on one cloud service provider is not reliable for meeting customer

requirements in multiple geographic regions, because even the world's largest cloud computing

platforms are vulnerable to periodic failure or power outage (Head, 2016). That means

enterprise cloud users must still consider business continuity planning, providing backup for

their data and failover. Therefore, what is needed is a network of multiple clouds (Veena,

2013).

1.1 Previous Cloud Service Outages

Having various data centres at various geographical areas provided by large cloud

service vendors is one way to overcome this type of issue. However, that would not have been

an option for customers who do not have data centres in multiple locations in their cloud

network region. For instance, AWS does not have data centres in multiple locations in

Australia, only in the Sydney area (Head, 2016).

Along these lines, utilising the services of different cloud suppliers provides redundancy of

service, which offers benefits beyond simply limiting business impact. Users can host their own cloud

servers in data centres from different providers. This helps users to manage the risks related to

the business continuity of any single cloud service supplier. This is possible because every service

vendor operates independently. Hence, the multi-cloud environment is a major focus in helping

clients to utilise backup and failover services over various cloud suppliers and platforms

(Rawat, 2013).

This report gives an architectural example that illustrates the coordination of failover and

data backup between the AWS and Azure cloud environments with the help of a VPN

tunnel. In this project, the VPN tunnel plays a critical part in the networking

setup: with the help of the VPN, a computer is connected directly to the private

network (Huanhuan et al., 2015).

The remaining chapters of this report are structured as follows. Chapter 2

explains the background of the study. Chapter 3 presents the analysis for this report

based on the domain and technical level study. Chapter 4 describes the design and

implementation of this project. A detailed description of the results and evaluation of results is

given in Chapter 5. Chapter 6 describes the future scope and recommendations. Finally, the

conclusions drawn from the implementation are discussed in Chapter 7.

1.2. Project Objective

The primary objective of this project is to provide backup and failover by connecting Microsoft

Azure and Amazon Web Services using a VPN tunnel. This project creates a site-to-site IPsec

tunnel that connects a Virtual Private Cloud hosted in AWS to an Azure virtual

network for failover and backup. The detailed objectives are given below.

1. Creating a Virtual Private Cloud (VPC) in AWS and a Virtual Network (VNET)

in AZURE.

2. Deploying Virtual Machines (VMs) in both AWS and AZURE virtual

environments, each using Ubuntu.

3. Creating a Virtual Gateway on the AZURE end and installing Openswan on the

AWS end in order to connect both cloud networks by using VPN.

4. Establishing an IPsec tunnel between two cloud platforms and ensuring that the

two VMs are communicating with each other.

5. Providing backup and data migration by bridging two cloud environments using

the VPN tunnel.
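Objectives 1 and 4 depend on how the two private address ranges are planned: for traffic to be routed through a site-to-site tunnel, the VPC range and the VNet range generally must not overlap. The following is a minimal sketch of that check using Python's standard `ipaddress` module. The two address ranges and the function names are hypothetical and chosen only for illustration; this report does not state them here.

```python
import ipaddress

# Hypothetical address ranges, used only for illustration.
AWS_VPC_CIDR = "10.0.0.0/16"
AZURE_VNET_CIDR = "10.1.0.0/16"


def ranges_are_routable(vpc_cidr: str, vnet_cidr: str) -> bool:
    """Return True when the two private ranges do not overlap."""
    vpc = ipaddress.ip_network(vpc_cidr)
    vnet = ipaddress.ip_network(vnet_cidr)
    return not vpc.overlaps(vnet)


def side_of(address: str, vpc_cidr: str, vnet_cidr: str) -> str:
    """Tell which side of the tunnel a private address belongs to."""
    ip = ipaddress.ip_address(address)
    if ip in ipaddress.ip_network(vpc_cidr):
        return "aws"
    if ip in ipaddress.ip_network(vnet_cidr):
        return "azure"
    return "unknown"
```

A call such as `ranges_are_routable(AWS_VPC_CIDR, AZURE_VNET_CIDR)` could be run before any cloud resources are created, so that an overlapping plan is caught early.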

While implementing this project, the questions below are documented for study and

literature review purposes.

1. How do AWS and Azure support backup and failover with

the help of VPN tunnels?

2. What level of security will be achieved by this connection?

3. How effective can this connectivity be in terms of network workload or

failover?

4. What will be the cost of implementation of a site-to-site IPsec tunnel?

5. How well do AWS and Azure support the VPN tunnel?

1.3. Significance of Project

This study is important for companies who have their business-critical functionalities deployed

on the cloud. Past incidents show that major cloud companies such as Amazon and

Microsoft have had service failures. Many companies research cloud availability before

implementing their business on the cloud. This project helps companies to understand how

cloud-to-cloud failover works to support the continuity of their business.

There are scenarios where companies like HP had to bear a loss of $US160 million due

to the failure of a traditional ERP implementation in 2004. Similarly, Hershey and Nike ERP

implementations have failed in the past. According to Gartner research (Gartner, 2016), 55%

to 75% of traditional ERP implementations are at risk. This is one reason many businesses

are moving their operations to the cloud for safe and secure business functionality. This project

provides insight into cloud outages and failover mechanisms to provide 100% availability for

business functionality.

In addition, this project discusses the benefits of the multi-cloud environment. Different

approaches to connect two different clouds are studied in this project. This project provides

further understanding of new technology, such as the use of VPN to connect two clouds. This

research covers the implementation of cloud-to-cloud failover using VPN. This project is

significant to readers who want to enable seamless relocation of existing services between two

different dedicated cloud infrastructures.

The significance from a business point of view is listed below.

1. Increased Profits: Avoid losses caused by unpredicted system failures.

2. Peace of Mind: Making maximum use of computers to achieve 100% business

availability, providing a trusted business solution.

3. Sustain Brand Name: Avoid brand impact due to unexpected failures.

1.4. Scope and Limitations of Project

Project Scope

The scope of this study is based on cloud-to-cloud failover. For implementation and analysis

purposes, Amazon Web Services and Microsoft Azure are considered in scope for this project. This

project connects the two clouds so that the combination can support high-level

operations including backup, failover and data migration. For this setup, on the AWS end,

an Ubuntu server is selected as the underlying platform.

The scope of data collection is from other organisations with this connection setup, and

various books and journals are utilised to obtain in-depth knowledge of site-to-site

connectivity.

This research requires technical knowledge of the technologies listed below.

1. Network Connectivity: A working knowledge of network connectivity is required

because the setup requires high-level security.

2. Networking: A working grasp of routing setups and protocols is vital to execute the

proposed networking connectivity.

3. AWS and Azure Cloud Services: Marston et al. (2011, p.176) mentioned that AWS

and Azure offer similar capabilities in terms of storage and networking. It is essential

to gather in-depth knowledge of these two platforms in performing the network

connectivity. The functionality associated with each of these platforms needs to be

analysed, and based on that, the implementation process will be carried out.

4. Ubuntu: Hands-on knowledge of Ubuntu is required as there is a need to work with

Ubuntu on both AWS and AZURE.

In addition to the above, the resources below are requirements for the project.

1. Amazon Web Services active account

2. Azure cloud services active account

3. Minimum $100 credit in both Amazon and Azure

4. Laptop in working condition.

5. Ubuntu at both the Amazon and Azure ends is required in order to have site-to-

site secure IPsec tunnel connectivity (blogs.technet.microsoft.com, 2016).

Project Limitations

Time constraint is the primary limitation of this study. In addition, other project limitations are

listed below.

1. This research does not compare cloud technologies and providers.

2. The cost analysis of cloud failover is out of scope.

3. This research is based on secondary information, and the information is sourced

from the AWS and Azure company websites.

4. This research does not cover the testing of all failover scenarios.

1.5. Summary

In this section, the background, need, significance, scope and limitations are studied. This

section provides a basic understanding about AWS and Azure clouds as an introduction for

beginners. The primary motive of this project is to study and implement cloud-to-cloud failover

for 100% systems availability. Ongoing research in this area helps to indicate the scope for

future study. This study is important to examine business dependency on the software cloud.

The next section provides an in-depth background of the study, concepts for different terms,

and techniques. It also covers other researchers’ viewpoints on cloud failover.

Chapter 2: Background of Research

2.1. Introduction

This chapter describes the background of the research related to this project. This chapter

includes a discussion of cloud backup and failover, which covers how failovers and backups

help in data recovery and business continuity. The information provided in this section is based

on various scholarly studies and reviews in cloud backup and failover techniques.

2.2. Backup and Failover

For any organisation, business continuity is very important. For example, the Sydney storm in

June 2016 impacted millions of people due to unavailability of online services. The business

continuity issue has impacted many brands. Organisations want to have fully functioning

applications and services even in the event of a disaster. Thus, organisations need failover

mechanisms in order to ensure that services can run with only minor interruptions in the case

of a disaster. Generally, any enterprise maintains up-to-date copies of data in different

geographical locations so that data can be accessible without interruptions even if one location

fails or is disabled, which is known as the backup mechanism (Rouse, 2016).

Failover is the ability of a system to recover from a disruption within a short period

of time using a largely automated process. It is different from load balancing. In a failover

scenario, another server takes over operations when the usual one has failed,

whereas in load balancing, all the servers are operational but work in a load-sharing mode. The

failover concept is designed for maximum availability whereas load balancing is built for

scalability. Backup is the capacity to continue working with data in the case of outages, usually with

multiple copies of data in different locations, and normally including manual activities. It

addresses data recovery and continuity through two independent platforms or environments,

each containing its own information and executables (Huanhuan et al., 2015).
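The difference between failover and load balancing described above can be sketched in a few lines of Python. This is an illustrative model only, not the mechanism used in this project: the server names and the health dictionary are hypothetical, and a real system would obtain health status from monitoring rather than from a hand-written mapping.

```python
from itertools import cycle
from typing import Dict, Iterator, List, Optional


def failover_target(priority: List[str], healthy: Dict[str, bool]) -> Optional[str]:
    """Failover: send all traffic to the first healthy server in priority order."""
    for server in priority:
        if healthy.get(server, False):
            return server
    return None  # every server is down


def load_balanced_targets(servers: List[str], healthy: Dict[str, bool]) -> Iterator[str]:
    """Load balancing: rotate requests over every healthy server (round robin)."""
    active = [s for s in servers if healthy.get(s, False)]
    if not active:
        raise RuntimeError("no healthy servers")
    return cycle(active)
```

With both servers healthy, `failover_target` keeps returning the primary while the secondary stays idle, whereas `load_balanced_targets` alternates between the two. Only when the primary is marked unhealthy does the failover function move to the secondary.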

Both failover and load balancing concepts are achieved through clustering. The

diagrams shown below depict the architecture of server failover.

2.1 Multiple clustered applications before failover (Source: Microsoft TechNet)

2.2 Multiple clustered applications after failover (Source: Microsoft TechNet)

2.3 Network Load Balancing (Source: Microsoft TechNet)

2.3. Cloud Computing

Cloud computing is based on online data processing and availability. Applications are

deployed on shared infrastructure and accessed through the web. Before the widespread use of cloud

computing, the internet already provided online services for multiple users, such as Google Mail or

Google Docs, which are termed Software as a Service (SaaS) (Chang, 2011).

With so many organisations requiring SaaS for their own functionalities, Amazon

developed AWS, which runs on the cloud and thus empowers the organisations’ functionalities

to run on the cloud (Amazon Web Services, 2010). The benefits of AWS enable organisations

to consume these services, build their own functionalities, and build a private cloud. Ubuntu

Server Edition running Ubuntu Enterprise Cloud is an example of running a private cloud

(Eucalyptus Systems, 2010).

There has been much research into cloud computing, especially in the area of the

execution of virtualised resources in public and private clouds. Many scholarly studies have

been carried out on cloud computing and its components (Buyya et al., 2011).

The diagram below shows a typical cloud computing scenario where multiple devices

are connected to the cloud to operate multiple services built on one platform.

2.4 Cloud Computing (Source: Wikipedia)

Cloud computing has become a cost-effective solution for businesses due to shared and

globally distributed resources. Clients are provided global access to cloud platforms. IT

companies like IBM, Google, AWS, Microsoft and many others have built data centres across

regions to support cloud services. By 2020, cloud computing revenue is estimated to reach

$US241 billion (Reid et al., 2011). Ease of infrastructure and application setup in the cloud has

influenced companies to use cloud shared services (Arean, 2013). According to an IBM white

paper, 61% of UK organisations are dependent on cloud solutions (White paper, 2013).

However, there are a few security challenges, such as recovery mechanisms, trust, and risk

management, which ought to be considered to give better client satisfaction and business

continuity.

2.3.1. Backup and Failover Approaches in Cloud

Disaster recovery (DR) is a process used in most organisations as part of their Business

Continuity Plan (BCP). There are many components of BCP, and data backup and recovery is

one of them. Most databases and servers provide inbuilt backup and restore methods which are

easy to set up.

Failover is different from backup in that to handle a failover scenario, database

application and service backup is mandatory to provide continuous connectivity from the point

of failure. Hence in failover architecture, backup and database replication is an important

concept. Amazon provides database connectivity for six familiar databases (Oracle, Amazon

Aurora, MySQL, PostgreSQL, Microsoft SQL Server and MariaDB). Amazon Aurora is its

own database management system (DBMS), and according to Amazon (AWS, 2016) it provides

five times the performance of MySQL. The diagram illustrated below shows the Amazon

database backup and replication scenario.

2.5 Amazon Aurora Database Backup for Replication and Clustering (Source: AWS)

The cloud platform provides different approaches for backup, recovery, and failover

based on the infrastructure of organisations.

Backup Approaches

IT infrastructures are categorised into cloud-native, on-premises and hybrid environments.

i. Cloud-Native environment: This scenario deals with an infrastructure that relies

completely on the cloud. If an organisation is running all its services from the cloud,

that organisation can have many built-in features in order to back up, protect data

and support recovery requirements.

ii. On-premises environment: This scenario deals with an infrastructure that exists

on the premises, using no components in the cloud. However, this environment

allows some software vendors to directly connect their applications with cloud

storage solutions for providing backup and recovery support.

iii. Hybrid environment: This scenario deals with the two infrastructures discussed

above, cloud-native and on-premises environments. These two structures are

combined into a hybrid environment. Here, the network has both on-premises and

cloud infrastructure components. Thus, applications that are running in the cloud

will be connected to applications that are running on-premises. However, latency is

the main constraint while uploading data to the cloud, and consistent performance

is required in order to protect data (AWS, 2016).

Failover Approaches:

When the primary components in the network such as a server, processor, network or database

become unavailable due to either failure or downtime, then secondary components in that

network or other network components will take the responsibility to provide fault tolerance.

The cloud platform provides services like a load balancer to maintain the workloads between

systems or networks in the event of failure (Rouse, 2005).

i. Multi-Server: In this scenario, an organisation has only one data centre in the

cloud platform but runs its applications on multiple servers behind a load balancer.

If one server fails, the applications continue to run on another instance in the

cloud, so the business continues without interruption. The load balancer can also

distribute the workload between the instances according to spikes in usage (Robertson, 2016).

Limitations: Here, if one server fails, by using the load balancing service, the cloud

user can have business continuity. However, there is still an issue if that cloud load

balancer service fails or that entire data centre fails due to power outages or disasters.

ii. Multi-Datacentre: To overcome the limitations of a single data centre organisation,

the multiple data centre concept is proposed. An organisation with large

deployments can run the business in multiple data centres in the cloud. Thus, if one

data centre fails due to disaster or power outage in that data centre location, a user

can use another data centre to run their business by using a load balancing service.

(Robertson, 2016).

Limitations: Here, the load balancer is needed to distribute the workload between

data centres. Thus, if that load balancing service fails, or if the entire cloud

infrastructure fails due to disaster or electricity issues, the business continuity will

be lost.
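
The failover idea behind both approaches can be sketched in a few lines of Python. This is only an illustration of the selection logic, not the behaviour of any particular cloud load balancer; the function name `choose_active`, the server names and the `health` table are hypothetical.

```python
def choose_active(endpoints, is_healthy):
    """Return the first healthy endpoint in priority order, or None."""
    for endpoint in endpoints:
        if is_healthy(endpoint):
            return endpoint
    # Every endpoint is down: this is the case a single cloud cannot cover.
    return None


# Hypothetical health table: the primary server has failed.
health = {"server-a": False, "server-b": True}
active = choose_active(["server-a", "server-b"], lambda name: health[name])
print(active)
```

When every endpoint behind one provider is unhealthy, the function returns `None`, which corresponds to the limitation described above.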

Therefore, if an organisation's applications run in the cloud, such as SaaS or PaaS, their

crucial business data relies upon the cloud provider, which leads to risk in the event of data

loss or failure of that cloud platform.

The organisation itself has to take responsibility for protecting its important data,

regardless of where the data actually resides. Generally, PaaS and SaaS providers such as Amazon

Web Services, Microsoft Azure, and Google Apps will perform the task of backing up the

user's data. However, such backups are for their own benefit, not for users. For example, a user

who needs to recover deleted data might find that the cloud provider is unable or unwilling to

help due to disasters or power outages. That user may then be unable to recover the lost data.

Hence, one cannot depend on a single cloud provider; organisations should consider

their business continuity by having suitable backup and failover mechanisms in the event of

unexpected failure of a current cloud platform.

2.3.2. Cloud Outages

No matter how widespread large cloud platforms become in the world, or what level of

performance, improved availability or uptime is offered by cloud providers, the cloud is

vulnerable to periodic failures or power outages. (Head, 2016). This report describes the best-

known AWS and AZURE cloud outages in recent years, as this project mainly deals with

Amazon and Azure cloud environments.

Amazon Outages

1. 2016 (June 5): Sydney spent that Sunday struggling through severe storms. The

resulting outage affected AWS services including EC2, Database Migration Service,

ElastiCache, RDS (Relational Database Service), CloudFormation, Route 53

Private DNS, CloudHSM, Redshift, Elastic Beanstalk and Storage Gateway. (Juha

Saarinen, 2016).

2. 2015 (September 20): The Northern Virginia (US-EAST-1) region suffered a

major outage affecting AWS services, including CloudWatch, Cognito and

DynamoDB, which all broke down that day. (Fiveash, 2015).

3. 2012 (December 25): The Elastic Load Balancing (ELB) service was down in the US-East

region. It affected the applications that use ELB, and these applications were

disconnected for over 23 hours. For example, organisations like Netflix were

affected due to this outage (Fernand, 2014).

Azure Outages

1. 2016 (March 23): East Asian region users faced problems with App Service\Web

Apps and Virtual Machines due to an interruption to the physical network

infrastructure within the region. That problem later caused higher-than-usual latency

for those trying to reach cloud VMs. (Sharwood, 2016).

2. 2015 (December 3): The European region suffered from an outage which caused

users to be unable to access both Microsoft Azure infrastructure-as-a-service (IaaS),

and business productivity tools that are based on the Office 365 cloud for several

hours in that region. (Donnelly, 2015).

3. 2012 (February 29): Windows Azure experienced a vast outage. In response, the

service management system was switched off for about seven hours

worldwide. At the time of that incident, a senior software engineer from Microsoft

reported that the major cause of the incident was a certificate issue. (Parnell,

2012).

These cloud outages show that, even if cloud users use the best cloud platforms,

this does not prevent periodic failures or electric power outages due to natural disasters.

Therefore, cloud users should think about their business continuity before cloud outages

happen.

To eliminate dependency on one cloud, a multi-cloud environment can be used.

Connecting one cloud network to another cloud platform is the most promising way to

reduce interruptions to businesses.

2.4. Multi-Cloud Environment

This approach utilises various cloud platforms to diminish the threat of data loss or downtime

due to unexpected errors in cloud computing. Data that is limited to one cloud service may be

at risk in situations such as service failure, disaster, and power outages. In those situations, that

specific cloud data will be inaccessible (Woods et al., 2010).

Hence, this platform will be the best solution for users who want business continuity

by backing up data from cloud to cloud and using a failover provision to handle the failure of

even one cloud computing environment.

Limitations:

The limitations of multi-cloud implementation are listed below (Ravello, 2014).

1. Complexity: as different cloud services and different infrastructures are

connected, there is no standard terminology to build a multi-cloud platform.

2. Management Overhead: due to the complexity of the network architecture,

expertise is needed to determine what to move to the cloud, when, where, and

why.

2.4.1. Cloud-to-Cloud Failover

A cloud network architecture combining two different cloud networks is created to

provide backup and failover. The outcome in this study is achieved by providing end-to-end

security to two different networks in order to allow the users to manage the workload in both

cloud networks, providing full VM connectivity with the help of a secure IPsec tunnel. This

setup is a working system that supports high availability operations, backup and failover. Thus,

this project is suitable for organisations that want to operate their framework on various

platforms depending upon their requirements, such as providing backup, failover and

redundancy to expand their business.

This project demonstrates the failover from the Amazon cloud to the Azure cloud. The

diagram below illustrates the use of two cloud environments. The services provided by AWS

and Azure provide the functionality to set up the multi-cloud environment for the failover

scenario.

2.6 Use of Multi-cloud Configuration using AWS and Azure (Source: Google Images)

2.7 Failover in Multi-Cloud Environment (Source: Google Images)

There are further benefits of multi-cloud failover which overcome the limitations and promote

business continuity.

2.4.2. Benefits of Connecting Two Clouds

The benefit of this multi-cloud failover is that it provides multi-region application continuity

and a high performance database in the cloud. In addition, the multi-cloud takes advantage of

all the benefits provided by the single cloud environment as shown below. (Ravello, 2014).

1. Reduce dependency

2. High availability

3. Failover

4. Competitive prices

5. Business extensions

The benefits of multi-cloud failover are achieved through VPN.

2.5. VPN

The main focus of this project is connecting an AWS virtual private cloud to an Azure virtual

network by using a virtual private network (VPN), and hence understanding VPNs is essential to

this project.

A VPN builds a private network on top of a public network. It allows authenticated computers

to send and receive data across shared networks. VPNs normally permit remote access

connections which are authenticated and use encryption to secure private data. The

protected VPN protocol that is used in this project design for the purpose of end-to-end

security is Internet Protocol Security (IPsec), which was first created for IPv6. IPsec

utilises encryption, encapsulating an IP packet inside an IPsec packet to meet the security

objectives of integrity, authentication, and privacy. IPsec is well suited to network-to-network

(site-to-site) tunnelling.

In this project, VPN is used to connect two distinct virtual cloud networks. In general,

security is considered to be the key issue across the virtual networks, where users face endpoint

issues while using the cloud. As this project deals with AWS and Azure, the security

standards that AWS and Azure offer at all network levels help to

ensure user safety and privacy.

2.5.1. Cloud-to-Cloud Connectivity using VPN

Cloud service providers like Amazon and Azure support multi-cloud configurations through a

VPN gateway. A VPN gateway is a collection of resources that are used to send network traffic

between virtual networks and on-premises locations. Gateways are used for site-to-site (S2S),

point-to-site, and multi-point connections. This project implements the site-to-site connection

between the AWS and Azure virtual networks.

Site-to-Site connection: This is a connection over an IPsec/IKE VPN tunnel. This

kind of connection needs a VPN device situated on-premises that has a public IP address

associated with it and is not situated behind a Network Address Translation (NAT) device. S2S

connections can be used for cross-premises and hybrid configurations (McGuire, 2016).

2.6. Ubuntu

Ubuntu is a Linux-based operating system. The Ubuntu server is used for the project

implementation because it is free of cost. Information about Ubuntu is out of scope for this

project. The Ubuntu Server Guide (team, 2016) should be referred to for the installation in this

project. Ubuntu documentation provides step-by-step information on installation and

configuration of the various server applications on an Ubuntu system to fit the requirements.

2.7. Summary

In this chapter, the backup and failover processes are studied. This chapter provides a basic

overview of cloud backup and failover techniques and well-known cloud outages in recent

years. This chapter concludes that the multi-cloud environment is the solution to avoiding

unexpected cloud outages and describes how both AWS and Azure support the multi-cloud

configurations. Continuing research in this area helps to indicate the scope for future study.

This project is significant when considering business dependency on the software cloud. The

next chapter provides in-depth analysis of the project, related to the domain and technical levels

in the multi-cloud environment.

Chapter 3: Resources and Technical Analysis

3.1. Introduction

Analysis is important to the implementation of the project to overcome the challenges and

reduce the risk of project failure. After the review of the literature and concepts related to multi-

cloud failover, this section presents the analysis by integrating different components to form a

solution. This analysis establishes the connection between project objectives and technical

feasibility.

The objective of this project is to set up a project in the AWS and Azure cloud

environments, replicate databases to and from each server, back up the file system and test the

failover scenario. The tools used to achieve these objectives are Ubuntu, Openswan, IPsec,

VPN tunnel and SSH.

3.2. Analysis of Resources

As this project is based on two different cloud platforms, there is a need to analyse the most

suitable cloud platforms for the project objectives. Both cloud platforms should support the

VPN tunnel, as it plays a main role in the project. Choosing the appropriate cloud vendors

based on the project targets will reduce the complexity. In this project, the AWS and Azure cloud

platforms have been chosen. As these two platforms have many technical similarities, this

combination reduces the complexity of implementing a multi-cloud environment.

In the process of VPN tunnel implementation between AWS and Azure, a software

VPN has been used at the AWS end to make an EC2 instance act as a VPN device. As this project

deals with the Ubuntu server, the Openswan package is used. Openswan has been available as VPN

software for Linux-based operating systems since 2005, and it supports most IPsec extensions. It is

included in the distributions of Ubuntu, Gentoo, Red Hat and many others.

In the backup and failover pattern implementation, multi-master synchronous

replication has been used inside each database cluster for high availability and for the failover

provision.

In this project, synchronous replication is implemented using MariaDB Galera, which

is a multi-master cluster for MySQL/XtraDB/InnoDB databases. MariaDB is an upgraded,

drop-in substitute for MySQL, and it utilises the Galera library for database replication.

MariaDB Galera Cluster is only available for Linux-based operating systems and supports the

InnoDB or XtraDB storage engines, with the essential features of automatic joining of nodes,

synchronous replication, reads and writes on any cluster node, and direct client

connections. Hence, MariaDB Galera offers the advantages of a DBMS clustering

solution with no slave lag, no lost transactions, and low client

latencies (vexxhost, 2015).

3.3. Analysis of Technical Challenges

Ubuntu is a Linux-based OS, and hence it requires thorough understanding so that other

components like Openswan can be installed smoothly. To avoid the challenges of configuring

Openswan on Ubuntu, an understanding of the Linux text editor commands used to edit and save

the configuration is required. An analysis of the Openswan configuration file is also required, as

modifying it incorrectly results in rework. The commands listed below are useful for the implementation of this project.

3.1 Sample Linux Commands

Edit/insert: - i

Exit: - :q (or) :q! to discard unsaved changes

Save and Exit: - :wq (or) :x

Paste: Fn + Ins

The analysis of the firewall setting between the two clouds is important. It is required

to authenticate one cloud’s virtual machine with the other cloud. Hence it is important to update

the route table and security groups of the Openswan server in AWS correctly. These security

groups and route tables are used to allow traffic from remote networks.

Two custom UDP inbound rules are required, for ports 4500 and

500, both using the Azure gateway IP address with /32 as the source CIDR. The route table must

also include the network address of the Azure virtual network.
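
The two inbound rules can be sketched as plain data. This is a minimal Python illustration and not a call to any AWS API: the helper name `nat_t_inbound_rules`, the dictionary keys and the gateway address 203.0.113.10 (an address reserved for documentation) are assumptions made for the example.

```python
import ipaddress

# UDP ports used by IKE (500) and by IPsec NAT traversal (4500).
IPSEC_UDP_PORTS = (500, 4500)


def nat_t_inbound_rules(gateway_ip):
    """Return one inbound rule per IPsec UDP port, limited to a single host.

    The rule layout (protocol/port/source) is a hypothetical structure used
    for illustration; it is not the format of any particular cloud API.
    """
    # IPv4Address() raises a ValueError subclass for malformed input.
    host = ipaddress.IPv4Address(gateway_ip)
    # A /32 CIDR matches exactly one IPv4 address.
    source = f"{host}/32"
    return [
        {"protocol": "udp", "port": port, "source": source}
        for port in IPSEC_UDP_PORTS
    ]


rules = nat_t_inbound_rules("203.0.113.10")  # placeholder gateway address
for rule in rules:
    print(rule)
```

Restricting the source to a /32 means that only the Azure gateway itself can reach the IPsec ports on the Openswan server.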

In order to successfully install MariaDB in both cloud VMs, it is required to check the

compatibility of the Ubuntu server with the MariaDB version before implementation of the

project.

The analysis of the MySQL configuration files is important for setting up the MariaDB

cluster for database replication. Once the changes to the MySQL configuration files are complete, a

service restart is required using the following command.

service mysql restart

However, this command may result in failure if the databases on both nodes are not

authenticated. Hence, the MySQL configuration file is to be analysed and updated as shown

below to make it work.

3.2 MySQL Configuration

To transfer files from one cloud to another cloud, the SSH public keys of both VMs must be

exchanged. Then, bidirectional file transfer will be successful.

During the technical analysis of this project, the focus is kept on configuration changes,

and with continuous learning, all the challenges can be resolved.

3.4. Summary

This section provides the technical analysis of the different tools prior to the actual

implementation. The technical analysis is required to choose a different tool in case of a version

compatibility issue. In this project, each component is analysed, and for configuration changes,

a few online documents are referenced.

The next section provides the design of this project and the code, along with screenshots

from the project implementation.

Chapter 4: Design and System Implementation

4.1. Introduction

The previous sections covered techniques and tools related to cloud-to-cloud failover. This

section provides the design and implementation of cloud-to-cloud failover so that the services

are not impacted. This implementation is done in two stages. In the first stage, the failover of

AWS MySQL database to Azure MySQL database using MariaDB is performed. In the second

stage, the automated backup of the file system from AWS to Azure cloud is performed using

SCP over the IPsec tunnel.

4.2. Problem and Context

The significance of this implementation is the need for the business continuity plan to meet

100% availability of service even on the cloud platform. On September 15, 2016, a global DNS

outage impacted Azure services for all regions. There are many such examples of cloud

unavailability in 2016, as discussed in the literature for this project, impacting businesses across

the globe. Hence this project covers an implementation of 100% cloud computing availability

using the cloud-to-cloud failover concept.

While implementing this project, there were some challenges due to the integration of

different components to form a structured solution. For the failover to happen from AWS to

Azure, it is important that both clouds connect and communicate with each other. This bridge

is set up using Openswan, which is installed on the AWS Ubuntu virtual machine and connects to

the Azure virtual network gateway. Continuous learning and debugging helped in the successful

configuration and implementation of Openswan and in building a VPN tunnel between the two

clouds. The architecture of this project is described in later sections.

4.3. Solution Structure

Defining a solution is a systematic process in which different components are linked to build a

solution. The solution structure is a process of capturing all the technical details in an

architectural format. This project has followed a recurring procedure in which the architecture

rework is done based on pseudocode testing (H. J. La and S. D. Kim, 2009). The diagram below

is an illustration of the steps followed in the solution architecture of this project.

4.1 Solution structure for backup and failover

This procedure has helped in identifying functional and non-functional areas. One result

of the solution structuring is a process flow design of the cloud-to-cloud failover network

architecture components as illustrated below.

4.2 Architectural design of AWS and Azure cloud-to-cloud connectivity

An Amazon Virtual Private Cloud (VPC) hosts the primary virtual machine in the cloud

network of this project, and the Ubuntu server is installed on this VM. The backup server is

hosted in an Azure virtual network (VNet), on which Ubuntu is also installed. To connect both

virtual machines over the cloud, Openswan is installed on the primary cloud (AWS) and

configured with VPN so that both communicate with each other using a VPN tunnel. MariaDB

is used for MySQL database replication, and SCP over the IPsec tunnel is used for file backup

from AWS to Azure. The process flow and installation of this solution are explained next in the

implementation process section.

4.4. Implementation Process

The implementation of the project begins with creation of a VPC in the AWS US-West region

with a network address of 10.0.0.0/16 and one public subnet with an address range of

10.0.0.0/24. An Ubuntu server is deployed in the subnet and assigned an IP of 10.0.0.61. This

server is associated with an Elastic IP (EIP) address to access the Ubuntu server by using the

SSH command line interface. The EIP is required to connect with the Azure network.

A virtual network (VNet), similar to VPC in AWS, is created in the Azure US-East

region with a network address of 172.16.0.0/16 and an added subnet with network address

172.16.1.0/24. The Ubuntu server is deployed in the virtual network and assigned the IP

address of 172.16.1.5. The configuration of site-to-site connectivity is set by using EIP and the

address space of AWS. A virtual network gateway is created as shown in the screenshot below,

which generates a virtual gateway IP address and manages the shared key.

4.3 Virtual Network Gateway between AWS and Azure
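
The address plan described in this section can be checked mechanically before the tunnel is configured. The sketch below uses only Python's standard `ipaddress` module with the addresses given above; the function name `plan_is_consistent` is an assumption made for illustration.

```python
import ipaddress

# Address plan taken from the implementation description above.
aws_vpc = ipaddress.ip_network("10.0.0.0/16")
aws_subnet = ipaddress.ip_network("10.0.0.0/24")
aws_host = ipaddress.ip_address("10.0.0.61")

azure_vnet = ipaddress.ip_network("172.16.0.0/16")
azure_subnet = ipaddress.ip_network("172.16.1.0/24")
azure_host = ipaddress.ip_address("172.16.1.5")


def plan_is_consistent():
    """Check containment within each cloud and no overlap between clouds."""
    return (
        aws_subnet.subnet_of(aws_vpc)
        and azure_subnet.subnet_of(azure_vnet)
        and aws_host in aws_subnet
        and azure_host in azure_subnet
        # Overlapping ranges would make site-to-site routing ambiguous.
        and not aws_vpc.overlaps(azure_vnet)
    )


print(plan_is_consistent())
```

The overlap check matters most for a site-to-site tunnel, because each side routes the other side's address space through the VPN.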

4.4.1. Site-to-Site VPN Tunnel Setup

Openswan is deployed on the AWS Ubuntu server so that it acts as a VPN device for

the VPN tunnel. Openswan implements the IPsec protocol, which encrypts IP

traffic before packets are exchanged between source and destination. The VPN device acts as

the router that performs encryption, decryption, encapsulation and decapsulation. Security

groups and route tables are modified on the AWS end to allow traffic from Azure. The Openswan

/etc/ipsec.conf file’s connection section is given below.

4.1 Openswan Configuration File

Where,

# Left = AWS side Ubuntu server IP address

# Left subnet = AWS network address

# Right = Azure virtual network gateway

# Right subnet = Azure virtual network address
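
A connection section built from the four parameters above might be generated as follows. This is a hedged Python sketch, not the project's actual file: the connection name `aws-to-azure`, the helper `render_conn`, the choice of the `authby`, `auto` and `type` options, and the gateway address 203.0.113.10 are assumptions for illustration. The left and subnet values come from the implementation description.

```python
def render_conn(name, left, leftsubnet, right, rightsubnet):
    """Render an Openswan-style conn section as text."""
    options = [
        ("authby", "secret"),   # pre-shared key authentication
        ("auto", "start"),      # bring the tunnel up when IPsec starts
        ("type", "tunnel"),
        ("left", left),                # AWS side Ubuntu server IP address
        ("leftsubnet", leftsubnet),    # AWS network address
        ("right", right),              # Azure virtual network gateway
        ("rightsubnet", rightsubnet),  # Azure virtual network address
    ]
    lines = [f"conn {name}"]
    lines += [f"    {key}={value}" for key, value in options]
    return "\n".join(lines) + "\n"


AZURE_GATEWAY_IP = "203.0.113.10"  # placeholder, not the project's gateway

conn_text = render_conn(
    "aws-to-azure", "10.0.0.61", "10.0.0.0/16", AZURE_GATEWAY_IP, "172.16.0.0/16"
)
print(conn_text)
```

With `authby=secret`, the shared key managed by the Azure gateway would be stored separately in /etc/ipsec.secrets rather than in this section.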

The implementation process is divided into two stages: one for failover cluster setup

and the other for file system backup.

4.4.2. MariaDB Galera Cluster Setup (Replication and Failover)

MariaDB Galera Cluster is deployed on both VMs to synchronise the MySQL database through

replication. Below is the command used to install the MariaDB cluster.

4.2 MariaDB Installation Command

After the deployment of the MariaDB cluster, the configuration file “my.cnf”

is changed on each virtual machine with wsrep configuration options so that the MariaDB

cluster understands the endpoints for communication. In addition to this change, the

wsrep options in the configuration file are set to provide the IP address and related details

of the other VM. Hence in the AWS VM, the IP address of Azure is provided and vice versa,

so that MariaDB acts as a mediator for replication between the two clouds. To configure wsrep

under the [mysqld] section on each node, the “my.cnf” file is changed with the node's particular

hostname, IP address and root password.

4.3 Configuration for wsrep Option in my.cnf

4.4 Configuration for wsrep Option in my.cnf

The configuration is completed successfully, and hence the bi-directional replication of

MySQL DB is initiated using the MariaDB cluster.
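
The wsrep settings for one node might look like the output of the sketch below. It is an illustration under stated assumptions rather than the project's configuration: the cluster name, node name, provider path and the helper `render_galera_section` are hypothetical, while the two node addresses are the VM addresses given in the implementation process.

```python
def render_galera_section(cluster_name, node_name, node_ip, all_node_ips,
                          provider_path):
    """Render a [mysqld] section with Galera (wsrep) options for one node."""
    # Every node lists all cluster members in the gcomm:// address.
    cluster_address = "gcomm://" + ",".join(all_node_ips)
    options = [
        ("binlog_format", "ROW"),
        ("default_storage_engine", "InnoDB"),
        ("innodb_autoinc_lock_mode", "2"),
        ("wsrep_provider", provider_path),
        ("wsrep_cluster_name", cluster_name),
        ("wsrep_cluster_address", cluster_address),
        ("wsrep_node_name", node_name),
        ("wsrep_node_address", node_ip),
        ("wsrep_sst_method", "rsync"),
    ]
    lines = ["[mysqld]"] + [f"{key}={value}" for key, value in options]
    return "\n".join(lines) + "\n"


NODE_IPS = ["10.0.0.61", "172.16.1.5"]  # AWS VM and Azure VM

aws_section = render_galera_section(
    cluster_name="aws_azure_cluster",               # hypothetical name
    node_name="aws-node",                           # hypothetical name
    node_ip="10.0.0.61",
    all_node_ips=NODE_IPS,
    provider_path="/usr/lib/galera/libgalera_smm.so",  # assumed install path
)
print(aws_section)
```

The Azure node would use the same cluster name and address list with its own node name and address. MariaDB 10.1 and later additionally require `wsrep_on=ON`.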

4.4.3. SSH Keys and SCP Command (File System Backup)

SSH is a protocol that permits secure connections between virtual machines. The SCP

command is useful for transferring files across SSH connections from one cloud to another.

SSH public-key authentication is used to transfer the data between AWS and Azure. The SSH

public key of the AWS Ubuntu server is copied into the “authorized_keys” file in Azure and

vice versa so that required files can be transferred from one Ubuntu VM to another Ubuntu VM

by using an SCP command.
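
A backup script could assemble the SCP command as an argument list, as in the Python sketch below. The helper `build_scp_command`, the user name `ubuntu` and the remote directory are assumptions for illustration; only the Azure VM address comes from the implementation description.

```python
import shlex


def build_scp_command(local_path, user, host, remote_dir, identity_file=None):
    """Build the argument list for copying one file to a remote directory."""
    command = ["scp"]
    if identity_file is not None:
        # -i selects the private key whose public half is in authorized_keys.
        command += ["-i", identity_file]
    command += [local_path, f"{user}@{host}:{remote_dir}"]
    return command


# Hypothetical user and directory; 172.16.1.5 is the Azure VM address.
command = build_scp_command("examplefile", "ubuntu", "172.16.1.5", "/home/ubuntu/")
print(shlex.join(command))
```

Passing the list to `subprocess.run(command, check=True)` would perform the copy. Building a list rather than a single string avoids shell quoting problems with unusual file names.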

4.5. Challenges

Connecting two different clouds is not an easy process. While implementing the IPsec-based

VPN tunnel between AWS and Azure, a problem arises when the peers

of the tunnel are behind a NAT device. NAT rewrites the addresses in an IP packet, so the

packet is rejected by the other peer because its integrity check no longer matches. The solution

is commonly known as NAT-T (NAT Traversal); it works by encapsulating IPsec packets

in UDP packets. As these packets can pass through NAT routers, they are not dropped

on the way.

When using Openswan to create a VPN tunnel, there will be two parameters called Left

and Right, which are simply peers on the two ends of its tunnel. Various parameters for these

two ends will be configured in “/etc/ipsec.conf”, which is used to define a tunnel between two

nodes. The “ipsec.conf” file will have “nat_traversal=no” by default, and as there is a need to

support NAT Traversal, that setting should be changed to “nat_traversal=yes”.

There is an “auto” parameter in the conn section of the ipsec.conf file. This parameter is

used to set the automatic operation which should be done during IPsec startup. By default, this

parameter is defined as “auto=ignore”, which indicates that no operation is set as automatic

startup. Therefore, this value should be changed to “auto=start”, which indicates the automatic

connection. (Rosen, 2016).
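
The two changes described above can be applied to the configuration text programmatically. The following Python sketch works on an in-memory sample rather than the real /etc/ipsec.conf; the helper `set_option`, the sample text and the connection name are assumptions for illustration.

```python
import re


def set_option(config_text, key, value):
    """Set `key=value` on every line that assigns `key`, keeping indentation."""
    pattern = re.compile(
        rf"^([ \t]*){re.escape(key)}[ \t]*=.*$", re.MULTILINE
    )
    new_text, count = pattern.subn(rf"\g<1>{key}={value}", config_text)
    if count == 0:
        raise KeyError(key)  # the option is not present in the text
    return new_text


# Hypothetical excerpt showing the two default values mentioned above.
sample = (
    "config setup\n"
    "    nat_traversal=no\n"
    "\n"
    "conn aws-to-azure\n"
    "    auto=ignore\n"
)

updated = set_option(sample, "nat_traversal", "yes")
updated = set_option(updated, "auto", "start")
print(updated)
```

The IPsec service has to be restarted after such a change for the new values to take effect.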

Openswan supports NAT-T, which encapsulates IPsec packets in UDP packets so that they pass

through NAT routers without being dropped. Hence, to allow these UDP

packets, the inbound rules in the security group of the Openswan server that is located in the

AWS virtual private cloud should be modified with two custom UDP rules that allow ports

4500 and 500 for traffic from the Azure virtual network gateway.

4.6. Summary

This section provides the detailed configuration of the Ubuntu server on both nodes (AWS and

Azure) and other components’ configuration for failover and backup purposes. The sanity

check is done to test the connectivity between two nodes. The next section captures the test

results and evaluation of the objectives of this project.

Chapter 5: Testing and Evaluation

5.1. Introduction

The implementation of cloud-to-cloud failover and backup is achieved using the control panel

provided by AWS with components like Openswan, MariaDB cluster, VPN and SSH

integration. A virtual network gateway is created in the Azure virtual network, and Openswan

is installed on the AWS virtual machine. The configuration is done in AWS in order to create the

tunnel between AWS and Azure. For MariaDB, the configuration is done on both VMs.

This section covers the test cases, results and evaluation of how the implementation

achieves the objectives. The testing is carried out in two phases: one for cloud failover and one

for file system backup validation. The test case scenarios are shown in the table below.

5.1 Test Case Scenarios

Sr # Test Description Execution Step Test Result Remarks

1. VPN Tunnel Testing

2. Database Cluster Replication

3. File transfers between two clouds

Test Results Codes:

OK – Test condition is passed

NG – Not good (i.e., test condition failed)

NT – Not tested

The screenshots captured while testing are provided in the appendix.

5.2. VPN Tunnel Testing

The testing of the VPN tunnel is important because it shows the success or failure of

connectivity between the two clouds. In order to verify whether the VPN is up on the virtual

machines, the command show crypto ipsec sa is used. If the connection is successful, then the

output of this command shows both the inbound and outbound SPI. This result shows the

encaps/decaps counters incrementing if the traffic passes through the tunnel.

5.1 Communicating Both VMs of AWS and Azure

5.2 VPN Tunnel Testing Result

| Sr # | Test Description | Execution Step | Test Result | Remarks |
|---|---|---|---|---|
| 1. | VPN Tunnel Testing | a. Go to AWS; b. Open command prompt; c. Use command below: show crypto ipsec sa | OK | |
| 2. | VPN Session Check | a. Go to AWS; b. Open command prompt; c. Use command below: show vpn-sessiondb | OK | Result is Session status: UP-ACTIVE |

5.3. Database Cluster Replication

The cluster status monitoring is required to check if the cluster is operational. This test case

covers the monitoring, status updates and failover scenario.

5.2 Database Replication Between AWS and Azure

5.3 Database Cluster Testing Result

| Sr # | Test Description | Execution Step | Test Result | Remarks |
|---|---|---|---|---|
| | CHECKING CLUSTER INTEGRITY | | | |
| 1. | Check the Cluster Configuration | a. Open the MariaDB window and run the command below: SHOW GLOBAL STATUS LIKE 'wsrep_%'; | OK | It shows the wsrep protocol version as 5, last committed as 202 and thread count as 2 |
| 2. | Check Cluster Integrity | SHOW GLOBAL STATUS LIKE 'wsrep_cluster_state_uuid'; | OK | It shows the cluster state UUID which helps to determine that this node is part of the cluster. |
| 3. | Check the number of nodes in the cluster | SHOW GLOBAL STATUS LIKE 'wsrep_cluster_size'; | OK | The value is 2 |
| 4. | Check cluster status | SHOW GLOBAL STATUS LIKE 'wsrep_cluster_status'; | OK | The value is primary. |
| | CHECKING THE NODE STATUS | | | |
| 5. | Check if node is accepting queries | SHOW GLOBAL STATUS LIKE 'wsrep_ready'; | OK | The value is ON |
| 6. | Check if node has network connectivity with other node | SHOW GLOBAL STATUS LIKE 'wsrep_connected'; | OK | The value is ON |
| 7. | Check node state | SHOW GLOBAL STATUS LIKE 'wsrep_local_state_comment'; | OK | The value is Joined |
| | CHECKING THE REPLICATION HEALTH | | | |
| 8. | Check the average size of write set queue | SHOW STATUS LIKE 'wsrep_local_recv_queue_avg'; | OK | The value is 3.34. If the value is greater than 0.0, it means there is delay in replication. Ideally the value should be as close as possible to 0.0. |
| 9. | Check the pause status due to flow control | SHOW STATUS LIKE 'wsrep_flow_control_paused'; | OK | The value is 0.18. If the value is greater than 0.0, it means the node is paused due to flow control. Ideally the value should be as close as possible to 0.0. |
| | DETECTING SLOW NETWORK ISSUES | | | |
| 10. | Check average length of the send queue | SHOW STATUS LIKE 'wsrep_local_send_queue_avg'; | OK | The value is 0.14. Values greater than 0.0 indicate a network bottleneck. Ideally the value should be as close as possible to 0.0. |
| | FAILOVER TO AZURE | | | |
| 11. | Check if database is failed over | AWS database should be down and Azure database should be up and acting as primary | OK | Azure database is showing as primary |
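
The checks in this table can be combined into one routine. The Python sketch below is an illustration only: the function `evaluate_cluster_status` and its split into failures and warnings are assumptions, and the status values are entered by hand from the remarks above instead of being read from a live database.

```python
def evaluate_cluster_status(status, expected_nodes=2):
    """Split wsrep status values into hard failures and soft warnings."""
    failures = []
    warnings = []

    if int(status["wsrep_cluster_size"]) != expected_nodes:
        failures.append("unexpected cluster size")
    if status["wsrep_cluster_status"].lower() != "primary":
        failures.append("node is not in the primary component")
    for flag in ("wsrep_ready", "wsrep_connected"):
        if status[flag].upper() != "ON":
            failures.append(f"{flag} is not ON")

    # Averages above 0.0 point to replication delay or a slow network.
    for metric in (
        "wsrep_local_recv_queue_avg",
        "wsrep_flow_control_paused",
        "wsrep_local_send_queue_avg",
    ):
        if float(status[metric]) > 0.0:
            warnings.append(f"{metric} = {status[metric]}")

    return failures, warnings


# Values reported in the remarks column of the test table.
observed = {
    "wsrep_cluster_size": "2",
    "wsrep_cluster_status": "Primary",
    "wsrep_ready": "ON",
    "wsrep_connected": "ON",
    "wsrep_local_recv_queue_avg": "3.34",
    "wsrep_flow_control_paused": "0.18",
    "wsrep_local_send_queue_avg": "0.14",
}
failures, warnings = evaluate_cluster_status(observed)
print(failures)
print(warnings)
```

Treating the queue and flow-control averages as warnings rather than failures matches the table, where these tests pass although the values are above the ideal of 0.0.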

5.4. AWS-AZURE: Inter-Cloud Transfer (file transfers between two clouds)

SSH and SCP are used for the file transfer between two VMs for backup and restore purposes

in the case of cloud failure. SSH is the general protocol, and SCP is the Linux command that

copies files over SSH. This section covers the test cases and results of file transfer scenarios as

shown in the table below.

5.3 File Transferring Between AWS and Azure

5.4 Cloud-to-Cloud Backup Testing Result

| Sr # | Test Description | Execution Step | Test Result | Remarks |
|---|---|---|---|---|
| 1. | Check all SSH connected sessions | [root@router~]# netstat -tnpa \| grep 'ESTABLISHED.*sshd' | OK | It shows two IP addresses as established with SSH |
| 2. | Check if SSH service is running | sudo service ssh start | OK | |
| 3. | Check if SCP is transferring file through SSH | [virtual machine ~]$ scp examplefile yourusername@yourserver:/home/yourusername/ | OK | The file is copied to the virtual machine correctly. |

5.5. Results Evaluation

The test results need to be evaluated against the objectives of the project.

This testing is carried out to make sure the defined objectives are tested as part of this project

implementation. The evaluation is performed on two major test conditions as described next.

5.5.1. AWS-AZURE Failover

The failover is performed using the MariaDB cluster. The database on AWS is stopped manually,

and it is observed that service immediately fails over to Azure. Because both databases are in sync

with the help of replication services, there was no outage, and hence they provided 100%

availability of the data. This testing is important for the mission-critical businesses that are

dependent on cloud computing.

5.5.2. AWS-AZURE Backup

For the AWS-to-Azure backup, the SSH protocol is used. The files are backed up using an SCP

Linux command. The AWS virtual machine is made unavailable, and the files are still

accessible from the Azure virtual machine. This evaluation verifies that, from the point of

failure, the cloud-to-cloud file system backup and recovery still function without any data loss

or unavailability during a disaster.

5.6. Summary

The testing is a critical part of this project as it tests the project through multiple scenarios as

defined in the test cases and provides the results. The failover from AWS to Azure is achieved

using the MariaDB cluster, and file system synchronisation is performed using SCP through

the SSH protocol.

Chapter 6: Recommendations and Future Scope

6.1. Introduction

The implementation of this project overcomes the cloud failure challenges when using cloud

technologies. Though there are lessons to be learned from this project implementation, it

provides a direction for future work by enhancing cloud-to-cloud failover to achieve high

scalability and performance.

6.2. Recommendations

The VPN device must meet certain requirements in order to work correctly; for example, it must have a public-facing IPv4 address and support IKEv1. The VPN device must also support NAT-T, AES 128-bit encryption, SHA-1, etc. to establish the IPsec security associations in the VPN tunnel.

In this project, the software VPN Openswan, which supports all the above requirements, has been used at the AWS end of the VPN tunnel. However, the same result can be achieved by using different methods.

Windows Server 2012 R2 can be used as the tunnel’s endpoint on the AWS side, acting as the VPN device for Azure. In this method, a Windows PowerShell script installs RRAS (Routing and Remote Access Service) on the AWS server, which then needs to be configured in order to create the site-to-site IPsec tunnel between AWS and Azure.

Openswan can be used on both sides, or Strongswan can be used at the AWS end, to create the VPN tunnel. Strongswan is, however, more developed and better documented than Openswan (Michael, 2015). Another software VPN, TMG 2010, supports these prerequisites, yet achieving full network functionality with it has turned out to be harder than anticipated (MikeWo, 2013).

Other recommendations from this project are listed below.

1. In the cloud failover project, thorough understanding and hands-on experience of the

technologies help to save time in implementation.

2. Prior to cloud-to-cloud failover implementation, a compatibility study of the different

cloud technologies and tools is important.

3. In cloud-to-cloud failover testing, as many scenarios as possible should be covered.

6.3. Future Work

The scope of this project is limited to cloud clustering and file system backup. Therefore, the

future scope in cloud availability will be the implementation of cloud-to-cloud load balancing

for higher performance and scalability.

This can be implemented using HAProxy, an open-source package that can balance load between two different virtual networks. With this package, a virtual IP address that acts as a load balancer can be created between AWS and Azure to balance the workload between the primary and secondary sites. The solution structure can be as illustrated in the diagram below.

6.1 Cloud-to-Cloud Load Balancer – Future Scope
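
As a rough indication of what such a load balancer could look like, the sketch below prints a minimal HAProxy configuration that distributes MariaDB connections across one node in each cloud. It is a starting point under stated assumptions, not a tested design: render_haproxy_cfg is a hypothetical helper, the node addresses and port 3306 are placeholders, and the address in the bind line would play the role of the virtual IP described above.

```shell
# Hypothetical helper: print a minimal HAProxy TCP configuration that
# balances MariaDB connections between an AWS node and an Azure node.
# Arguments: <aws_node_ip> <azure_node_ip>
render_haproxy_cfg() {
    cat <<EOF
defaults
    mode tcp
    timeout connect 5s
    timeout client  1m
    timeout server  1m

frontend mariadb_in
    bind *:3306
    default_backend mariadb_nodes

backend mariadb_nodes
    balance roundrobin
    server aws $1:3306 check
    server azure $2:3306 check
EOF
}
```

For example, render_haproxy_cfg 10.0.0.10 10.1.0.10 prints a configuration that could be reviewed before being placed in /etc/haproxy/haproxy.cfg. Appending the backup keyword to the Azure server line would turn the setup into active/passive failover instead of load balancing.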

In this project the file backup is performed manually; it can be automated in the future.
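
A simple route to automation would be to wrap the scp step in a script and schedule it with cron. The sketch below assumes key-based SSH authentication is already configured so that no password prompt appears; backup_file and the COPY_CMD override are hypothetical names introduced here for illustration.

```shell
# Hypothetical helper: copy one file to the backup VM over SSH.
# COPY_CMD defaults to scp; setting COPY_CMD=echo gives a dry run that only
# prints the arguments. The -p flag preserves modification times and modes.
# Arguments: <source_file> <destination>
backup_file() {
    src=$1
    dest=$2
    ${COPY_CMD:-scp} -p "$src" "$dest"
}
```

A script calling backup_file examplefile yourusername@yourserver:/home/yourusername/ could then be scheduled with a crontab entry such as 0 2 * * * /usr/local/bin/backup.sh, where the script path and the 02:00 schedule are placeholders.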

The cost and security analysis of the multi-cloud environment can be considered for

future study and implementation.

6.4. Summary

A multi-cloud environment between AWS and Azure has been implemented by using an Openswan software VPN. This chapter discussed other approaches to achieving cloud-to-cloud connectivity and made a few recommendations regarding the VPN device. It also outlined, as future scope for this project, automated backup and a load balancer between AWS and Azure to balance the workload and provide high availability.

Chapter 7: Conclusion

7.1. Introduction

The integration of AWS and Azure provides cloud-to-cloud failover in order to overcome the

challenges arising from unexpected cloud outages. This chapter will discuss the overall

conclusion of this project.

7.2. Conclusion

This project offers a meaningful solution for large organisations that need to maintain business continuity even in the case of unexpected outages or failures.

The test results of this project showed that 100% business availability on the cloud is achievable in the scenarios tested. Recent cloud outages, such as the AWS outage caused by the 2016 Sydney storm or the Azure outages due to other natural calamities, have forced the business community to rethink the alternative options for their service availability. This project has implemented and tested the potential for 100% cloud availability by using cloud-to-cloud failover techniques.

8. Reference List

Books, Journals & Documents

1. Arean, O. (2013). Disaster recovery in the cloud. Network security, 9, 5-7.

doi:10.1016/S1353-4858(13)70101-6

2. Buyya, R., Broberg, J., & Goscinski, A. M. (2011). Cloud computing: Principles and

paradigms. Brooklyn, NY: Wiley.

3. Buyya, R., Yeo, C. S., Venugopal, S., Broberg, J., & Brandic, I. (2009). Cloud computing and emerging IT platforms: Vision, hype, and reality for delivering computing as the 5th utility. Future generation computer systems, 25(6), 599-616.

4. Cassidy, L. (2016, August 03). PROJECT MANAGEMENT: TIME ESTIMATES AND

PLANNING. Retrieved from projectsmart.

5. CIF (2016). UK cloud adoption and trends for 2016. Retrieved from https://www.cloudindustryforum.org/content/uk-cloud-adoption-trends-2016

6. Daniel, N., Rich, W., Chris, G., Graziano, O., Sunil, S., Lamia, Y., & Dmitrii, Z. (2009).

The eucalyptus open-source cloud-computing system. IEEE international symposium

on cluster computing and the grid, 12(9), 124-131. doi:10.1109/CCGRID.2009.93

7. Deepa. (2012, August 7). 14 cloud outages in 7 months, who is next? CIOL Bureau 2,

2-3. Retrieved from http://www.ciol.com/14-cloud-outages-months/

8. Donnelly, C. (2015, December 03). European Office 365 and Microsoft Azure users hit

by service outage. Retrieved from Computerweekly.

9. Ferguson, T. (2009). Salesforce.com outage hits thousands of businesses. Retrieved

from http://news.cnet.com/8301-1001_3-10136540-92.html

10. Fernand, F. (2014, April 23). Managing elasticity across Multi-cloud providers.

Retrieved from Slideshare: http://www.slideshare.net/fifiant/multicloud

11. Fiveash, K. (2015). AWS outage knocks Amazon, Netflix, Tinder and IMDb in MEGA

data collapse. London: the register.

12. Foley, M. J. (2016). Global DNS outage hits Microsoft Azure customers. Australia:

ZDNet.

13. Head, B. (2016). Ignore cloud faults at your peril. CWANZ (p. 15). Australia:

TechTarget.

14. Hofmann, P., & Woods, D. (2010). Cloud computing: The limits of public clouds for

business applications. IEEE internet computing, 6, 91-93.

15. Huanhuan X., Frank F., & Claus P. (2015). An architecture pattern for multi-cloud high

availability and disaster recovery. Workshop on Federated Cloud Networking

FedCloudNet, pp. 5-6

16. Jackson, T. (2008). We feel your pain and we are sorry. Retrieved from

http://gmailblog.blogspot.com/2008/08/we-feel-your-pain-and-were-sorry.html

17. John, M. (2010). Amazon elastic compute cloud (EC2). Retrieved from

http://aws.amazon.com/ec2/.

18. Juha Saarinen, A. C. (2016). AWS Sydney outage downs big-name web companies.

Australia: itnews.

19. Kevin J. (2009). Secure Cloud Computing: An Architecture Ontology Approach.

Retrieved from http://sunset.usc.edu/gsaw/gsaw2009/s12b/jackson.pdf, DataLine,

2009.

20. Lin, Y. K., & Chang, P. C. (2011). Maintenance reliability estimation for a cloud

computing network with nodes failure. Expert systems with applications, 38, 14185–

14189.

21. Marston, S., Li, Z., Bandyopadhyay, S., Zhang, J., & Ghalsasi, A. (2011). Cloud

computing - The business perspective. Decision Support Systems, 51, 176-189. doi:

10.1016/j.dss.2010.12.006

22. McGuire, C. (2016, September 21). About VPN Gateway. Retrieved from Microsoft Azure: https://azure.microsoft.com/en-us/documentation/articles/vpn-gateway-about-vpngateways/

23. Parnell, B.-A. (2012). Microsoft's Azure cloud down and out for 8 hours. London: The Register.

24. Perez, J. C. (2008). Extended Gmail outage hits apps admins. Retrieved from http://www.computerworld.com/s/article/9117322/Extended_Gmail_outage_hits_Apps_admins

25. Pierre R. (2016). Step-by-step: connect your AWS and Azure environments with a VPN tunnel. Retrieved from https://blogs.technet.microsoft.com/canitpro/2016/01/11/step-by-step-connect-your-aws-and-azure-environments-with-a-vpn-tunnel/

26. Rawat, V. (2013). Reducing failure probability of cloud storage services using multi-

cloud. Kota, Rajasthan: University College of Engineering, RTU. Retrieved from

https://arxiv.org/ftp/arxiv/papers/1310/1310.4919.pdf

27. Reid, S., Kicker, H., Matzke, P., Bartels, A., & Lisserman, M. (2011). Sizing the cloud.

technical report. Retrieved from http://www.forrester.com-/E-

/Sizing+The+Cloud/fulltext/RES58161objectid=RES58161

28. Robertson, B. (2009). Top five cloud computing adoption inhibitors. Enterprise

innovation, 5, 12-14.

29. Rouse, M. (2007). Line of business. Retrieved from http://searchcio.techtarget.com/definition/LOB

30. Rouse, M. (2016). Line of business. Retrieved from http://searchcio.techtarget.com/definition/LOB (accessed on 20 Nov 2012).

31. Saarinen, J., Coyne, A. C. (2016). AWS Sydney outage downs big-name web companies.

Australia: itnews.

32. Sharwood, S. (2016). Azure's wobbly day as three services glitch around the world.

London: The Register.

33. Team. (2016). Ubuntu Server Guide. Retrieved from helpubuntu:

https://help.ubuntu.com/lts/serverguide/serverguide.pdf

34. Veena R. (2013). Reducing failure probability of cloud storage services using multi-

cloud. Kota, Rajasthan: University College of Engineering, RTU. Retrieved from

https://arxiv.org/ftp/arxiv/papers/1310/1310.4919.pdf

35. Xiong, H. F. (2015). An architecture pattern for multi-cloud high availability and

disaster recovery. Workshop on Federated Cloud Networking FedCloudNet, 15.

Websites

http://www.rightscale.com/blog/enterprise-cloud-strategies/private-and-hybrid-clouds-9-use-cases-and-implementation-advice

https://www.opsgility.com/blog/2013/09/03/connecting-clouds-site-to-site-aws-azure/

http://developers-club.com/posts/196798/

http://www.gartner.com/

https://help.ubuntu.com/lts/serverguide/serverguide.pdf

http://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/CHAP_Aurora.html

https://azure.microsoft.com/en-us/documentation/articles/vpn-gateway-about-vpngateways/

9. Appendices

This section provides the screenshots used during the implementation of the project.

9.1. Implementation Screenshots

The following are the implementation screenshots of the procedure to connect the AWS and Azure cloud platforms using a VPN connection in order to provide backup and failover.

1) Created a VPC with a single public subnet in the AWS environment

9.1: Creating VPC with single public subnet

9.2: Details of VPC while creating

2) Deployed Ubuntu server in VPC

9.3: Selecting Ubuntu AMI in EC2 dashboard

9.4: Configuration details of Ubuntu instance while deploying

3) Created an Elastic IP (EIP) and associated it with the Ubuntu server that was created in the previous step

9.5: Allocating EIP in EC2 dashboard

9.6: Associate EIP with Ubuntu server

4) Created Azure virtual network

9.7: Created virtual network in Azure

5) Created site-to-site connectivity

9.8: Providing the AWS VPC address space and the AWS Elastic IP while creating the site-to-site connection

9.9: Created site-to-site connectivity

6) Defined the Azure virtual network subnet and added Gateway subnet

9.10: Adding subnet and gateway for virtual network in Azure

7) Created the Azure Virtual Network Gateway

9.11: Azure virtual network Gateway IP address produced

9.12: Manage shared key generated

8) Connected to the Ubuntu VM on the AWS side to configure Openswan

9.13: Accessing the Ubuntu server on the AWS side over SSH using PuTTY

9) Installed Openswan on Ubuntu server in AWS

Used the following command to install Openswan on Ubuntu.

sudo apt-get install openswan

9.14: Installing Openswan in the AWS Ubuntu server

10) Configured Openswan by editing its configuration files

The code for the Openswan configurations is presented in the next section (9.2).

Once the Openswan configurations were done, edited the “sysctl.conf” file:

Enabled IP forwarding on the Openswan VM by uncommenting the line below:

net.ipv4.ip_forward=1
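
Whether the line has actually been uncommented can be checked with a small script. The sketch below is an illustration rather than part of the original procedure; ip_forward_enabled_in is a hypothetical helper that looks for an uncommented net.ipv4.ip_forward=1 entry in the given file.

```shell
# Hypothetical helper: succeed if the given sysctl file contains an
# uncommented "net.ipv4.ip_forward=1" line (spaces around "=" are allowed).
# Argument: <path_to_sysctl_file>
ip_forward_enabled_in() {
    grep -Eq '^[[:space:]]*net\.ipv4\.ip_forward[[:space:]]*=[[:space:]]*1[[:space:]]*$' "$1"
}
```

After editing /etc/sysctl.conf, the setting can be loaded without a reboot using sudo sysctl -p, and the running value can be read back with sysctl net.ipv4.ip_forward.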

Next, disabled the “source/destination checking” option on the Openswan server:

9.15: Disabling source/destination check

11) Modified Security Groups to Allow Traffic from Windows Azure

9.16: Modified security groups for openswan server in AWS
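
The security group change was made in the AWS console. For reference, an equivalent set of rules could be expressed with the AWS CLI, as sketched below. The exact rules used in this project are not listed in the report, so the choice of UDP 500 (IKE) and UDP 4500 (NAT-T) is an assumption, and print_ipsec_ingress_rules is a hypothetical helper that only prints the commands for review.

```shell
# Hypothetical helper: print (not run) AWS CLI commands that allow IKE
# (UDP 500) and NAT-T (UDP 4500) from the Azure gateway address.
# Arguments: <security_group_id> <azure_gateway_ip>
print_ipsec_ingress_rules() {
    for port in 500 4500; do
        echo "aws ec2 authorize-security-group-ingress --group-id $1 --protocol udp --port $port --cidr $2/32"
    done
}
```

Printing the commands first makes it possible to inspect them before they are run against a real account; the group ID and gateway address are supplied by the caller.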

12) Azure Virtual Network Connected to Amazon AWS Virtual Private Cloud

Before restarting the Openswan server, the connection representation is as shown below:

9.17: Connection between two clouds before configuring openswan

Then, restarted the Openswan server by using the command below:

sudo service ipsec restart

Now, the graphical representation of the connection between the two clouds is as shown:

9.18: connection between two clouds after configuring Openswan

Then, used the ping command to check whether the VMs on the two clouds are communicating with each other.

9.19: Pinging both VMs in different clouds
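
Because the tunnel may take a few seconds to come up after the restart, a connectivity check is more reliable when it retries. The sketch below shows one possible approach; retry is a hypothetical helper introduced here and is not taken from the project.

```shell
# Hypothetical helper: run a command up to <attempts> times, sleeping
# <delay> seconds between tries; succeed as soon as the command succeeds.
# Arguments: <attempts> <delay_seconds> <command...>
retry() {
    attempts=$1
    delay=$2
    shift 2
    n=1
    while ! "$@"; do
        [ "$n" -ge "$attempts" ] && return 1
        n=$((n + 1))
        sleep "$delay"
    done
}
```

From the AWS VM, a call such as retry 5 2 ping -c 1 -W 2 10.1.0.4 would try to reach the Azure VM up to five times, where 10.1.0.4 is a placeholder for the Azure VM's private address.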

13) Deployed MariaDB cluster to perform database replication

Installed MariaDB cluster on two nodes by using the command below:

The MariaDB configurations on the two nodes are provided in the coding section (9.2).

14) Replication of the database between the two cloud VMs, which can be considered as failover

9.20: database replication between two VMs

15) File transfer between the two cloud VMs, which can be considered as file system backup

9.21: Files transferring between two VMs

9.2. Coding Part of Implementation

Openswan Configurations on AWS side:

Edited the ipsec.conf file:

9.22: Edited ipsec.conf file

Edited the amnazure.conf file:

9.23: Edited amnazure.conf file
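
The screenshot of the connection file is not reproduced here. As a rough guide to what an Openswan connection section for this kind of tunnel typically contains, the sketch below prints one with placeholder values. It is an assumption, not a copy of the project's file: render_conn is a hypothetical helper, the connection name and addresses are illustrative, and the ike and esp lines simply reflect the AES 128-bit and SHA-1 requirements listed in section 6.2.

```shell
# Hypothetical helper: print an Openswan connection section.
# Arguments: <local_private_ip> <aws_vpc_cidr> <azure_gateway_ip> <azure_vnet_cidr>
render_conn() {
    cat <<EOF
conn azure
    authby=secret
    auto=start
    type=tunnel
    left=$1
    leftsubnet=$2
    leftnexthop=%defaultroute
    right=$3
    rightsubnet=$4
    ike=aes128-sha1-modp1024
    esp=aes128-sha1
    pfs=no
EOF
}
```

For example, render_conn 10.0.0.5 10.0.0.0/16 203.0.113.10 10.1.0.0/16 prints a section in which left describes the Openswan VM and the AWS VPC, and right describes the Azure gateway and virtual network.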

Edited the ipsec.secrets file:

Specified the Azure gateway address and the shared key (obtained from Manage Shared Key) in the entry below:

9.24: Edited ipsec.secrets file
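
An ipsec.secrets entry for a pre-shared key has the general form shown by the sketch below. The helper name render_psk_line and all values are placeholders; the real key is the one generated by Azure and must not be published.

```shell
# Hypothetical helper: print an ipsec.secrets entry for a pre-shared key.
# Arguments: <local_ip> <azure_gateway_ip> <shared_key>
render_psk_line() {
    printf '%s %s : PSK "%s"\n' "$1" "$2" "$3"
}
```

Since the file contains the key in clear text, it is advisable to keep it readable by root only (for example, sudo chmod 600 /etc/ipsec.secrets).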

MariaDB Cluster Configurations:

MariaDB Galera Cluster has been deployed on each VM to provide synchronous MySQL database replication. The command below is used to install the MariaDB cluster on each VM:

Then, to configure the MariaDB cluster, the files below have been changed.

MySQL Settings:

First of all, opened the my.cnf file and commented out the following lines, which are uncommented by default on Ubuntu servers.

MariaDB Settings:

Now added the following lines for the wsrep configuration options to the my.cnf file under the [mysqld] section, as shown below, on both the AWS server and the Azure server.

wsrep Provider Configurations:

Here, configured the wsrep settings under the [mysqld] section on each node by adding the following lines to the /etc/mysql/my.cnf file with the node's specific IP address, hostname and root password.
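
The configuration listing itself is not reproduced in this section. To indicate the kind of settings involved, the sketch below prints a typical set of Galera options for one node. It is an assumption rather than the project's actual file: render_wsrep_cfg is a hypothetical helper, the provider path is the usual location on Ubuntu but may differ, the names and addresses are placeholders, and the credentials line is deliberately left out.

```shell
# Hypothetical helper: print Galera (wsrep) settings for one node.
# wsrep_on=ON is required on MariaDB 10.1 and later.
# Arguments: <cluster_name> <this_node_ip> <this_node_name> <comma_separated_node_ips>
render_wsrep_cfg() {
    cat <<EOF
[mysqld]
binlog_format=ROW
default_storage_engine=InnoDB
innodb_autoinc_lock_mode=2
bind-address=0.0.0.0
wsrep_on=ON
wsrep_provider=/usr/lib/galera/libgalera_smm.so
wsrep_cluster_name="$1"
wsrep_cluster_address="gcomm://$4"
wsrep_node_address="$2"
wsrep_node_name="$3"
wsrep_sst_method=rsync
EOF
}
```

For example, render_wsrep_cfg demo_cluster 10.0.0.5 aws-node 10.0.0.5,10.1.0.4 prints the block for the AWS node; the Azure node would use the same cluster name and address list with its own node address and name.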

In this project, bi-directional replication has been implemented between AWS and Azure VMs;

this replication provides the failover and failback between the AWS and Azure sites.