Kurra Srirekha 1
NELSON MARLBOROUGH INSTITUTE OF TECHNOLOGY
BRIDGING AWS WITH AZURE FOR BACKUP
AND FAILOVER PROVISION:
USING VPN CONNECTION
Kurra Srirekha
Student ID: 13472618
Course ID: PRJ702
9/29/2016
Graduate Diploma Project in Information Technology
Abstract
There are many mission-critical businesses deployed via cloud computing. The recent Amazon
Web Services (AWS) cloud outage due to a Sydney storm impacted their business brand.
Similarly, Azure outages have impacted customers globally. All these recent examples point
to one challenge, and that is interruption in cloud computing. Hence there needs to be a better
solution than depending on one cloud. The solution that is discussed in this project is cloud-to-
cloud failover and backup.
Two clouds, AWS and Azure, are chosen for the implementation of this project. This
project has achieved the objective of business continuity with 100% data availability by
integrating AWS, Azure, MariaDB cluster, Openswan, IPsec, VPN tunnel, Secure Shell (SSH)
and Secure Copy Protocol (SCP) technologies. Sections 1 and 2 of this project cover the
potential challenges in cloud outages, provide details on some of the most recent outages and
actual challenges faced by businesses due to those cloud outages, a literature review, and the
theoretical study of the concept. Sections 3 and 4 cover the analysis, design, solution
architecture and implementation of the project. Section 5 covers the test scenarios, test cases,
results and evaluation.
Though there are lessons to be learned from this project implementation, it provides a
direction for future work by enhancing cloud-to-cloud failover to achieve high scalability and
performance.
This project bridges AWS with the Azure cloud environment using a virtual private
network (VPN) to provide backup and failover.
Key words: Cloud-to-Cloud Failover, Cloud Outage, Cloud Backup, Cloud Load Balancing,
AWS, Azure, MariaDB cluster, Openswan, IPsec, VPN tunnel, SSH and SCP.
Acknowledgement
It is always a pleasure to thank the people in the Graduate programme whose guidance helped
me to develop my practical and theoretical skills during the Graduate Diploma.
Firstly, it has indeed been a great privilege for me to have Mr. David Airehrour as my
mentor for this project. His inspiring personality, superb guidance and constant support
are the driving force behind this project work.
I take this opportunity to express my deepest appreciation to him. I am also
indebted to him for his timely and valuable advice.
I would also like to express my heartfelt gratitude to Mr. Ali Javan (IT
Lecturer) and Mrs. Charanya Mohanakrishnan (IT Lecturer), who consistently supported me in
every possible way, from their initial encouragement to their support to this date.
Finally, I am thankful to all technical and non-teaching staff of the Department of
Information Technology (Networks) and NMIT for their constant assistance and co-operation.
Table of Contents
List of Figures
List of Tables
Acronyms
Chapter 1: Introduction of the Project
1.1. Introduction
1.2. Project Objective
1.3. Significance of Project
1.4. Scope and Limitations of Project
1.5. Summary
Chapter 2: Background of Research
2.1. Introduction
2.2. Backup and Failover
2.3. Cloud Computing
2.3.1. Backup and Failover Approaches in Cloud
2.3.2. Cloud Outages
2.4. Multi-Cloud Environment
2.4.1. Cloud-to-Cloud Failover
2.4.2. Benefits of Connecting Two Clouds
2.5. VPN
2.5.1. Cloud-to-Cloud Connectivity using VPN
2.6. Ubuntu
2.7. Summary
Chapter 3: Resources and Technical Analysis
3.1. Introduction
3.2. Analysis of Resources
3.3. Analysis of Technical Challenges
3.4. Summary
Chapter 4: Design and System Implementation
4.1. Introduction
4.2. Problem and Context
4.3. Solution Structure
4.4. Implementation Process
4.4.1. Site-to-Site VPN Tunnel Setup
4.4.2. MariaDB Galera Cluster Setup (Replication and Failover)
4.4.3. SSH Keys and SCP Command (File System Backup)
4.5. Challenges
4.6. Summary
Chapter 5: Testing and Evaluation
5.1. Introduction
5.2. VPN Tunnel Testing
5.3. Database Cluster Replication
5.4. AWS-AZURE: Inter-Cloud Transfer (file transfers between two clouds)
5.5. Results Evaluation
5.5.1. AWS-AZURE Failover
5.5.2. AWS-AZURE Backup
5.6. Summary
Chapter 6: Recommendations and Future Scope
6.1. Introduction
6.2. Recommendations
6.3. Future Work
6.4. Summary
Chapter 7: Conclusion
7.1. Introduction
7.2. Conclusion
8. Reference List
9. Appendices
9.1. Implementation Screenshots
9.2. Coding Part of Implementation
List of Figures
1.1 Sydney Storm at Collaroy in Northern Sydney Impacted AWS Cloud (Source: SBS.com)
2.1 Multiple clustered applications before failover (Source: Microsoft TechNet)
2.2 Multiple clustered applications after failover (Source: Microsoft TechNet)
2.3 Network Load Balancing (Source: Microsoft TechNet)
2.4 Cloud Computing (Source: Wikipedia)
2.5 Amazon Aurora Database Backup for Replication and Clustering (Source: AWS)
2.6 Use of Multi Cloud Configuration using AWS and Azure (Source: Google Images)
2.7 Failover in Multi-Cloud Environment (Source: Google Images)
4.1 Solution Structure For Backup and Failover
4.2 Architectural Design of AWS and Azure Cloud to Cloud Connectivity
4.3 Virtual Network Gateway between AWS and Azure
5.1 Communicating Both VMs of AWS and Azure
5.2 Database Replication Between AWS and Azure
5.3 File Transferring Between AWS and Azure
6.1 Cloud to Cloud Load Balancer – Future Scope
List of Tables
1.1 Earlier Cloud Service Outages
3.1 Sample Linux Commands
3.2 MySQL Configuration
4.1 Openswan Configuration File
4.2 MariaDB Installation Command
4.3 Configuration for wsrep Option in my.config
4.4 Configuration for VSRep Option in my.config
5.1 Test Case Scenarios
5.2 VPN Tunnel Testing Result
5.3 Database Cluster Testing Result
5.4 Cloud to Cloud Backup Testing Result
Acronyms
AWS Amazon Web Services
VPN Virtual Private Network
VNet Virtual Network
SSH Secure Shell
SCP Secure Copy Protocol
VM Virtual Machine
VPC Virtual Private Cloud
ERP Enterprise Resource Planning
ELB Elastic Load Balancing
NAT Network Address Translation
OS Operating System
PaaS Platform as a Service
SaaS Software as a Service
Chapter 1: Introduction of the Project
1.1. Introduction
Most cloud-based businesses in Sydney and the surrounding areas, including banks, were
interrupted on June 4, 2016, by a storm in Sydney. Many services were affected when
power failed for Amazon Simple Storage Service (S3) and Elastic Compute Cloud (EC2)
(Saarinen, 2016). This was the inspiration for this project. This project provides the study,
design and implementation of cloud-to-cloud failover to provide 100% availability of the
services that are deployed on or dependent on cloud computing.
1.1 Sydney Storm at Collaroy in Northern Sydney Impacted AWS Cloud (Source: SBS.com)
In many cases, organisations with large deployments choose to operate their
framework on numerous platforms or environments based on requirements such as providing
backup, failover, redundancy or expanding their business. Cloud computing and
virtual cloud services provided by different vendors are well matched to backup and
failover scenarios.
However, depending on one cloud service provider is not reliable for meeting customer
requirements in multiple geographic regions, because even the world's largest cloud computing
platforms are vulnerable to periodic failure or power outage (Head, 2016). That means
enterprise cloud users must still consider business continuity planning, providing backup for
their data and failover. Therefore, what is needed is a network of multiple clouds (Veena,
2013).
1.1 Previous Cloud Service Outages
Using data centres that large cloud service vendors provide in several geographical
areas is one way to overcome this type of issue. However, that is not
an option for customers whose cloud network region does not have data centres in multiple
locations. For instance, AWS does not have data centres in multiple locations in
Australia, only in the Sydney area (Head, 2016).
Along these lines, using the services of different cloud suppliers provides redundancy of
service, with benefits that go beyond reducing business impact. Users can host their own cloud
servers in data centres from different providers. This helps users to manage the risks related to
the business continuity of any one cloud service supplier. This is possible because every service
vendor operates independently. Hence, the multi-cloud environment is a major focus in helping
clients to utilise backup and failover services over various cloud suppliers and platforms
(Rawat, 2013).
This report presents an architectural example that illustrates the coordination of failover and
data backup between the AWS and Azure cloud environments with the help of a VPN
tunnel. In this project, the VPN tunnel plays a critical part in the networking design,
because it connects a machine in one cloud directly to the private network of the other
(Huanhuan et al., 2015).
The remaining chapters of this report are structured as follows. Chapter 2
states and explains the background of the study. Chapter 3 defines the analysis of this report
based on the domain and technical level study. Chapter 4 describes the design and
implementation of this project. A detailed description of the results and evaluation of results is
given in Chapter 5. Chapter 6 describes the future scope and recommendations. Finally, the
conclusions drawn from the implementation are discussed in Chapter 7.
1.2. Project Objective
The primary objective of this project is to provide backup and failover by connecting Microsoft
Azure and Amazon Web Services using a VPN tunnel. This project creates a site-to-site IPsec
tunnel that connects a Virtual Private Cloud hosted in AWS to an Azure virtual
network for failover and backup. The detailed objectives are given below.
1. Creating a Virtual Private Cloud (VPC) in AWS and a Virtual Network (VNet)
in Azure.
2. Deploying Virtual Machines (VMs) in both AWS and Azure virtual
environments, each using Ubuntu.
3. Creating a Virtual Gateway on the Azure end and installing Openswan on the
AWS end in order to connect both cloud networks by using VPN.
4. Establishing an IPsec tunnel between two cloud platforms and ensuring that the
two VMs are communicating with each other.
5. Providing backup and data migration by bridging two cloud environments using
the VPN tunnel.
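Objectives 1 and 4 depend on one planning detail that is easy to overlook: for traffic to be routed through a site-to-site tunnel, the address space of the AWS VPC and that of the Azure virtual network should not overlap. The following is a minimal sketch of such a check using Python's standard ipaddress module. The CIDR ranges and the function name are hypothetical values chosen for illustration; they are not the ranges used in this project.

```python
import ipaddress


def address_spaces_overlap(cidr_a: str, cidr_b: str) -> bool:
    """Return True if the two CIDR ranges share at least one address."""
    net_a = ipaddress.ip_network(cidr_a, strict=True)
    net_b = ipaddress.ip_network(cidr_b, strict=True)
    return net_a.overlaps(net_b)


# Hypothetical address plan, for illustration only.
AWS_VPC_CIDR = "10.0.0.0/16"
AZURE_VNET_CIDR = "10.1.0.0/16"

if __name__ == "__main__":
    if address_spaces_overlap(AWS_VPC_CIDR, AZURE_VNET_CIDR):
        print("Address spaces overlap: choose different ranges before building the tunnel.")
    else:
        print("Address spaces are disjoint: routing across the tunnel is unambiguous.")
```

Running a check of this kind before creating the VPC and the virtual network avoids having to rebuild either network after the gateway has been configured.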
While implementing this project, the questions below were documented for study and
literature review purposes.
1. How important are AWS and Azure for backup and failover with
the help of VPN tunnels?
2. What level of security will be achieved by this connection?
3. How effective can this connectivity be in terms of network workload or
failover?
4. What will be the cost of implementing a site-to-site IPsec tunnel?
5. How well do AWS and Azure support VPN tunnels?
1.3. Significance of Project
This study is important for companies that have their business-critical functionalities deployed
on the cloud. Past incidents show that major cloud companies such as Amazon and
Microsoft have had service failures. Many companies research cloud availability before
implementing their business on the cloud. This project helps companies to understand how
cloud-to-cloud failover works to support the continuity of their business.
There are cases where companies such as HP had to bear a loss of $US160 million due
to the failure of a traditional ERP implementation in 2004. Similarly, Hershey and Nike ERP
implementations have failed in the past. According to Gartner research (Gartner, 2016), 55%
to 75% of traditional ERP implementations are at risk. This is the reason most businesses
are moving their operations to the cloud for safe and secure business functionalities. This project
provides insight into cloud outages and failover mechanisms to provide 100% availability for
business functionality.
In addition, this project discusses the benefits of the multi-cloud environment. Different
approaches to connect two different clouds are studied in this project. This project provides
further understanding of new technology, such as the use of VPN to connect two clouds. This
research covers the implementation of cloud-to-cloud failover using VPN. This project is
significant to readers who want to enable seamless relocation of existing services between two
different dedicated cloud infrastructures.
The significance from a business point of view is listed below.
1. Increased Profits: Avoid losses caused by unpredicted system failures.
2. Peace of Mind: Making maximum use of computers to achieve 100% business
availability, providing a trusted business solution.
3. Sustain Brand Name: Avoid brand impact due to unexpected failures.
1.4. Scope and Limitations of Project
Project Scope
The scope of this study is based on cloud-to-cloud failover. For implementation and analysis
purposes, Amazon Cloud and Microsoft Azure are considered in scope for this project. This
project creates a combination of the clouds that is capable of providing connectivity for
high-level operations including backup, failover and data migration. For this setup, on the AWS end,
an Ubuntu server is selected as the underlying platform.
Data is collected from other organisations that use this connection setup, and
various books and journals are used to obtain in-depth knowledge of site-to-site
connectivity.
This research requires technical knowledge of the technologies listed below.
1. Network Connectivity: A working knowledge of network connectivity is required
because the setup requires high-level security.
2. Networking: A working grasp of routing setups and protocols is vital to execute the
proposed networking connectivity.
3. AWS and Azure Cloud Services: Marston et al. (2011, p.176) mentioned that AWS
and Azure offer similar capabilities in terms of storage and networking. It is essential
to gather in-depth knowledge of these two platforms in performing the network
connectivity. The functionality associated with each of these platforms needs to be
analysed, and based on that, the implementation process will be carried out.
4. Ubuntu: Hands-on knowledge of Ubuntu is required as there is a need to work with
Ubuntu on both AWS and AZURE.
In addition to the above, the resources below are requirements for the project.
1. Amazon Web Services active account
2. Azure cloud services active account
3. Minimum $100 credit in both Amazon and Azure
4. Laptop in working condition.
5. Ubuntu at both the Amazon and Azure ends is required in order to have site-to-
site secure IPsec tunnel connectivity (blogs.technet.microsoft.com, 2016).
Project Limitations
Time constraint is the primary limitation of this study. In addition, other project limitations are
listed below.
1. This research does not compare cloud technologies and providers.
2. The cost analysis of cloud failover is out of scope.
3. This research is based on secondary information, and the information is sourced
from the AWS and Azure company websites.
4. This research does not cover the testing of all failover scenarios.
1.5. Summary
In this section, the background, need, significance, scope and limitations are studied. This
section provides a basic understanding about AWS and Azure clouds as an introduction for
beginners. The primary motive of this project is to study and implement cloud-to-cloud failover
for 100% systems availability. Ongoing research in this area helps to indicate the scope for
future study. This study is important to examine business dependency on the software cloud.
The next section provides an in-depth background of the study, concepts for different terms,
and techniques. It also covers other researchers’ viewpoints on cloud failover.
Chapter 2: Background of Research
2.1. Introduction
This chapter describes the background of the research related to this project. This chapter
includes a discussion of cloud backup and failover, which covers how failovers and backups
help in data recovery and business continuity. The information provided in this section is based
on various scholarly studies and reviews in cloud backup and failover techniques.
2.2. Backup and Failover
For any organisation, business continuity is very important. For example, the Sydney storm in
June 2016 impacted millions of people due to unavailability of online services. The business
continuity issue has impacted many brands. Organisations want to have fully functioning
applications and services even in the event of a disaster. Thus, organisations need failover
mechanisms in order to ensure that services can run with only minor interruptions in the case
of a disaster. Generally, any enterprise maintains up-to-date copies of data in different
geographical locations so that data can be accessible without interruptions even if one location
fails or is disabled, which is known as the backup mechanism (Rouse, 2016).
Failover is the ability of a system to recover from a disturbance within a short period
of time using a largely automated process. It is different from load balancing. In a failover
scenario, one server takes over operations when the usual one has failed,
whereas in load balancing, all the servers are operational and share the load. The
failover concept is designed for maximum availability whereas load balancing is built for
scalability. Backup is the capacity to retain access to data in the case of outages, usually with
multiple copies of data in different locations, and normally including manual activities. It
addresses data recovery and continuity through two independent platforms or environments,
each containing its own information and executables (Huanhuan et al., 2015).
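The difference between failover and load balancing described above can be sketched in a few lines of Python. This is an illustrative model only, not the mechanism used by any particular clustering product. The server names and the health dictionary are hypothetical, and a real system would obtain health status from monitoring rather than from a hard-coded mapping.

```python
from itertools import cycle
from typing import Dict, List, Optional


def failover_target(servers: List[str], healthy: Dict[str, bool]) -> Optional[str]:
    """Failover: use the first healthy server in priority order (primary first)."""
    for name in servers:
        if healthy.get(name, False):
            return name
    return None  # no server is available


def load_balanced_targets(servers: List[str], healthy: Dict[str, bool],
                          requests: int) -> List[str]:
    """Load balancing: spread requests round-robin over all healthy servers."""
    pool = [name for name in servers if healthy.get(name, False)]
    if not pool:
        return []
    rotation = cycle(pool)
    return [next(rotation) for _ in range(requests)]


if __name__ == "__main__":
    servers = ["aws-vm", "azure-vm"]  # hypothetical names, primary listed first
    print(failover_target(servers, {"aws-vm": True, "azure-vm": True}))
    print(failover_target(servers, {"aws-vm": False, "azure-vm": True}))
    print(load_balanced_targets(servers, {"aws-vm": True, "azure-vm": True}, 4))
```

In the failover function the secondary server receives work only after the primary is marked unhealthy, which reflects the availability goal. In the load-balancing function every healthy server receives a share of the requests, which reflects the scalability goal.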
Both failover and load balancing concepts are achieved through clustering. The
diagrams shown below depict the architecture of server failover.
2.1 Multiple clustered applications before failover (Source: Microsoft TechNet)
2.2 Multiple clustered applications after failover (Source: Microsoft TechNet)
2.3 Network Load Balancing (Source: Microsoft TechNet)
2.3. Cloud Computing
Cloud computing is based on online data processing and availability. Applications are
deployed on shared infrastructure and accessed through the web. Before the widespread use of
cloud computing, the internet already provided online services for multiple users, such as
Google Mail or Google Docs, which are termed Software as a Service (SaaS) (Chang, 2011).
With so many organisations requiring SaaS for their own functionalities, Amazon
developed AWS, which runs on the cloud and thus empowers the organisations’ functionalities
to run on the cloud (Amazon Web Services, 2010). The benefits of AWS enable organisations
to consume these services, build their own functionalities, and build a private cloud. Ubuntu
Server Edition running Ubuntu Enterprise Cloud is an example of running a private cloud
(Eucalyptus Systems, 2010).
There has been much scholarly research into cloud computing and its components,
especially in the area of the execution of virtualised resources in public and private clouds
(Buyya et al., 2011).
The diagram below shows a typical cloud computing scenario where multiple devices
are connected to the cloud to operate multiple services built on one platform.
2.4 Cloud Computing (Source: Wikipedia)
Cloud computing has become a cost-effective solution for businesses due to shared and
globally distributed resources. Clients are provided global access to cloud platforms. IT
companies like IBM, Google, AWS, Microsoft and many others have built data centres across
regions to support cloud services. By 2020, cloud computing revenue is estimated to reach
$US241 billion (Reid et al., 2011). Ease of infrastructure and application setup in the cloud has
influenced companies to use cloud shared services (Arean, 2013). According to an IBM white
paper, 61% of UK organisations are dependent on cloud solutions (White paper, 2013).
However, there are a few security challenges, such as recovery mechanisms, trust, and risk
management, which ought to be considered to provide better client satisfaction and business
continuity.
2.3.1. Backup and Failover Approaches in Cloud
Disaster recovery (DR) is a process used in most organisations as part of their Business
Continuity Plan (BCP). There are many components of BCP, and data backup and recovery is
one of them. Most databases and servers provide inbuilt backup and restore methods which are
easy to set up.
Failover differs from backup in that, to handle a failover scenario, backups of the database,
application and services are mandatory in order to provide continuous connectivity from the point
of failure. Hence, in a failover architecture, backup and database replication are important
concepts. Amazon provides database connectivity for six familiar databases (Oracle, Amazon
Aurora, MySQL, PostgreSQL, Microsoft SQL Server and MariaDB). Amazon Aurora is its
own database management system (DBMS), and according to Amazon (AWS, 2016) it provides
five times the performance of MySQL. The diagram below shows the Amazon
database backup and replication scenario.
2.5 Amazon Aurora Database Backup for Replication and Clustering (Source: AWS)
The cloud platform provides different approaches for backup, recovery, and failover
based on the infrastructure of organisations.
Backup Approaches
IT infrastructures are categorised into cloud-native, on-premises and hybrid environments.
i. Cloud-Native environment: This scenario deals with an infrastructure that relies
completely on the cloud. If an organisation is running all its services from the cloud,
that organisation can have many built-in features in order to back up, protect data
and support recovery requirements.
ii. On-premises environment: This scenario deals with an infrastructure that exists
on the premises, using no components in the cloud. However, this environment
allows some software vendors to directly connect their applications with cloud
storage solutions for providing backup and recovery support.
iii. Hybrid environment: This scenario deals with the two infrastructures discussed
above, cloud-native and on-premises environments. These two structures are
combined into a hybrid environment. Here, the network has both on-premises and
cloud infrastructure components. Thus, applications that are running in the cloud
will be connected to applications that are running on-premises. However, latency is
the main constraint while uploading data to the cloud, and consistent performance
is required in order to protect data (AWS, 2016).
Failover Approaches:
When the primary components in the network, such as a server, processor, network or database,
become unavailable due to either failure or downtime, secondary components in that
network or other network components take over the responsibility to provide fault tolerance.
The cloud platform provides services such as a load balancer to maintain the workloads between
systems or networks in the event of failure (Rouse, 2005).
i. Multi-Server: In this scenario, an organisation has only one data centre in the
cloud platform but hosts its applications on multiple servers behind a load
balancer. If one server fails, the applications run on another instance in the
cloud, so the business continues without interruption. The load balancer also
distributes the workload between the instances according to spikes in usage
(Robertson, 2016).
Limitations: Here, if one server fails, by using the load balancing service, the cloud
user can have business continuity. However, there is still an issue if that cloud load
balancer service fails or that entire data centre fails due to power outages or disasters.
ii. Multi-Datacentre: To overcome the limitations of a single data centre organisation,
the multiple data centre concept is proposed. An organisation with large
deployments can run the business in multiple data centres in the cloud. Thus, if one
data centre fails due to disaster or power outage in that data centre location, a user
can use another data centre to run their business by using a load balancing service.
(Robertson, 2016).
Limitations: Here, the load balancer is needed to distribute the workload between
data centres. Thus, if that load balancing service fails, or if the entire cloud
infrastructure fails due to disaster or electricity issues, the business continuity will
be lost.
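Both failover approaches reduce to the same decision: try the candidates in priority order and
use the first one that is healthy. The sketch below illustrates only that decision, under
stated assumptions. The endpoint names and the health check are hypothetical and are not taken
from this project; a real load balancer performs the check over the network.

```python
from typing import Callable, Optional, Sequence


def pick_active_endpoint(
    endpoints: Sequence[str],
    is_healthy: Callable[[str], bool],
) -> Optional[str]:
    """Return the first healthy endpoint in priority order, or None if all are down."""
    for endpoint in endpoints:
        if is_healthy(endpoint):
            return endpoint
    return None


# Hypothetical endpoints: two servers in one data centre, then one in a second data centre.
endpoints = ["dc1-server-a", "dc1-server-b", "dc2-server-a"]

# Simulate the failure of the whole first data centre.
down = {"dc1-server-a", "dc1-server-b"}
active = pick_active_endpoint(endpoints, lambda name: name not in down)
print(active)
```

If every endpoint sits behind one provider, the function returns None when that provider
fails, which is the limitation described above.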
Therefore, if an organisation's applications run in the cloud, such as SaaS or PaaS, their
crucial business data relies upon the cloud provider, which leads to risk in the event of data
loss or failure of that cloud platform.
The organisation itself has to take responsibility for protecting its important data,
regardless of where the data actually resides. Generally, PaaS and SaaS providers such as Amazon
Web Services, Microsoft Azure, and Google Apps back up the
user's data. However, such backups are for the provider's own benefit, not for users. For example, a user
who needs to recover deleted data might find that the cloud provider is unable or unwilling to
help because of a disaster or power outage, and may then be unable to recover the lost data.
Hence, one cannot depend on a single cloud provider; organisations should consider
their business continuity by having suitable backup and failover mechanisms in the event of
unexpected failure of a current cloud platform.
2.3.2. Cloud Outages
No matter how widespread large cloud platforms become, or what level of
performance, availability or uptime cloud providers offer, the cloud remains
vulnerable to periodic failures and power outages. (Head, 2016). This report describes the
best-known AWS and Azure cloud outages in recent years, as this project mainly deals with
the Amazon and Azure cloud environments.
Amazon Outages
1. 2016 (June 5): Severe storms hit Sydney that Sunday and caused an AWS outage. The
outage affected AWS services including EC2, Database Migration Service,
ElastiCache, RDS (Relational Database Service), CloudFormation, Route 53
Private DNS, CloudHSM, Redshift, Elastic Beanstalk and Storage Gateway. (Juha
Saarinen, 2016).
2. 2015 (September 20): The Northern Virginia (US-EAST-1) region suffered a
major outage that affected AWS services including CloudWatch, Cognito and
DynamoDB. (Fiveash, 2015).
3. 2012 (December 25): The Elastic Load Balancing (ELB) service was down in the US-East
region. Applications that use ELB were affected and remained
disconnected for over 23 hours. For example, organisations such as Netflix were
affected by this outage (Fernand, 2014).
Azure Outages
1. 2016 (March 23): Users in the East Asia region faced problems with App Service\Web
Apps and Virtual Machines due to an interruption to the physical network
infrastructure within the region. The problem later caused higher-than-usual latency
for those trying to reach cloud VMs. (Sharwood, 2016).
2. 2015 (December 3): The European region suffered from an outage which caused
users to be unable to access both Microsoft Azure infrastructure-as-a-service (IaaS),
and business productivity tools that are based on the Office 365 cloud for several
hours in that region. (Donnelly, 2015).
3. 2012 (February 29): Windows Azure experienced a major outage. In response, the
service management system was switched off for about seven hours
worldwide. At the time of the incident, a senior software engineer from Microsoft
reported that the major cause of the incident was certificate issues. (Parnell,
2012).
These cloud outages show that even the best cloud platforms
cannot prevent periodic failures or electric power outages caused by natural disasters.
Therefore, cloud users should plan for business continuity before cloud outages
happen.
To remove the dependency on one cloud, a multi-cloud environment can be used.
Connecting one cloud network to another cloud platform is a promising way to
reduce interruptions to businesses.
2.4. Multi-Cloud Environment
A multi-cloud environment uses several cloud platforms to reduce the risk of data loss or downtime
due to unexpected errors in cloud computing. Data that is limited to one cloud service may be
at risk in situations such as service failure, disaster, and power outages, in which case that
specific cloud data will be inaccessible (Woods et al., 2010).
Hence, this approach suits users who want business continuity
by backing up data from cloud to cloud and using a failover provision to handle the failure of
one cloud computing environment.
Limitations:
The limitations of multi-cloud implementation are listed below (Ravello, 2014).
1. Complexity: as different cloud services and different infrastructures are
connected, there is no standard terminology to build a multi-cloud platform.
2. Management Overhead: due to the complexity of the network architecture,
expertise is needed to determine what to move to the cloud, when, where, and
why.
2.4.1. Cloud-to-Cloud Failover
A cloud network architecture that combines two different cloud networks is created to
provide backup and failover. The outcome in this study is achieved by providing end-to-end
security between the two networks so that users can manage the workload in both
cloud networks, with full VM connectivity over a secure IPsec tunnel. This
setup is a working system that supports high availability operations, backup and failover. Thus,
this project is suitable for organisations that want to operate their framework on various
platforms depending upon their requirements, such as providing backup, failover and
redundancy to expand their business.
This project demonstrates the failover from the Amazon cloud to the Azure cloud. The
diagram below illustrates the use of two cloud environments. The services provided by AWS
and Azure provide the functionality to set up the multi-cloud environment for the failover
scenario.
2.6 Use of Multi-cloud Configuration using AWS and Azure (Source: Google Images)
2.7 Failover in Multi-Cloud Environment (Source: Google Images)
There are further benefits of multi-cloud failover which overcome the limitations and promote
business continuity.
2.4.2. Benefits of Connecting Two Clouds
The benefit of this multi-cloud failover is that it provides multi-region application continuity
and a high performance database in the cloud. In addition, the multi-cloud takes advantage of
all the benefits provided by the single cloud environment as shown below. (Ravello, 2014).
1. Reduce dependency
2. High availability
3. Failover
4. Competitive prices
5. Business extensions
The benefits of multi-cloud failover are achieved through VPN.
2.5. VPN
The main focus of this project is connecting AWS virtual private cloud to Azure virtual network
by using a virtual private network, and hence understanding the VPN is significant to this
project.
A VPN builds a private network on top of a public network. It allows authenticated computers to send
and receive data across shared networks. VPNs normally permit remote access connections
that are authenticated and use encryption to secure private
data. The secure VPN protocol used in this project design for end-to-end
security is Internet Protocol Security (IPsec), which was first created for IPv6. IPsec
uses encryption, encapsulating an IP packet inside an IPsec packet, to meet the security
objectives of integrity, authentication, and privacy. IPsec is well suited to network-to-network
(site-to-site) tunnelling.
In this project, VPN is used to connect two distinct virtual cloud networks. In general,
security is considered to be the key issue across the virtual networks, where users face endpoint
issues while using the cloud. As this project deals with AWS and Azure, the complete security
standards that are offered by the AWS and Azure for security provision at all network levels
ensure user safety and privacy.
2.5.1. Cloud-to-Cloud Connectivity using VPN
Cloud service providers like Amazon and Azure support multi-cloud configurations through a
VPN gateway. A VPN gateway is a collection of resources that are used to send network traffic
between virtual networks and on-premises locations. Gateways are used for site-to-site (S2S),
point-to-site, and multi-point connections. This project implements the site-to-site connection
between the AWS and Azure virtual networks.
Site-to-Site connection: This is a connection over an IPsec/IKE VPN tunnel. This
kind of connection needs a VPN device situated on-premises that has a public IP address
associated with it and is not situated behind a Network Address Translation (NAT) device. S2S
connections can be used for cross-premises and hybrid configurations (McGuire, 2016).
2.6. Ubuntu
Ubuntu is a Linux-based operating system. The Ubuntu server is used for the project
implementation because it is free of cost. Information about Ubuntu is out of scope for this
project. The Ubuntu Server Guide (team, 2016) should be referred to for the installation in this
project. Ubuntu documentation provides step-by-step information on installation and
configuration of the various server applications on Ubuntu system to fit the requirements.
2.7. Summary
In this chapter, the backup and failover processes are studied. This chapter provides a basic
overview of cloud backup and failover techniques and well-known cloud outages in recent
years. This chapter concludes that the multi-cloud environment is the solution to avoiding
unexpected cloud outages and describes how both AWS and Azure support the multi-cloud
configurations. Continuing research in this area helps to indicate the scope for future study.
This project is significant when considering business dependency on the software cloud. The
next chapter provides in-depth analysis of the project, related to the domain and technical levels
in the multi-cloud environment.
Chapter 3: Resources and Technical Analysis
3.1. Introduction
Analysis is important to the implementation of the project to overcome the challenges and
reduce the risk of project failure. After the review of the literature and concepts related to multi-
cloud failover, this section presents the analysis by integrating different components to form a
solution. This analysis establishes the connection between project objectives and technical
feasibility.
The objective of this project is to set up a project in the AWS and Azure cloud
environments, replicate databases to and from each server, back up the file system and test the
failover scenario. The tools used to achieve these objectives are Ubuntu, Openswan, IPsec,
VPN tunnel and SSH.
3.2. Analysis of Resources
As this project is based on two different cloud platforms, there is a need to analyse the most
suitable cloud platforms for the project objectives. Both cloud platforms should support the
VPN tunnel, as it plays a main role in the project. Choosing the appropriate cloud vendors
based on the project targets reduces the complexity. In this project, the AWS and Azure cloud
platforms were chosen. As these two platforms have many technical similarities, this
combination reduces the complexity of implementing the multi-cloud environment.
In the VPN tunnel implementation between AWS and Azure, a software
VPN is used at the AWS end to make an EC2 instance act as a VPN device. As this project
uses an Ubuntu server, the Openswan package is used. Openswan has been available as VPN software
for Linux-based operating systems since 2005, and it supports most IPsec extensions. It is
included in distributions such as Ubuntu, Gentoo, Red Hat and many others.
In the backup and failover pattern implementation, multi-master synchronous
replication has been used inside each database cluster for high availability and for the failover
provision.
In this project, synchronous replication is implemented using MariaDB Galera, which
is a multi-master cluster for MySQL/XtraDB/InnoDB databases. MariaDB is an upgraded,
drop-in substitute for MySQL, and it utilises the Galera library for database replication.
MariaDB Galera Cluster is available only for Linux-based operating systems and supports the
InnoDB or XtraDB storage engines. Its essential features are automatic joining of nodes,
synchronous replication, reads and writes on any cluster node, and direct client
connections. Hence, MariaDB Galera has advantages over many other DBMS clustering
solutions: no slave lag, no lost transactions, and low client
latencies (vexxhost, 2015).
3.3. Analysis of Technical Challenges
Ubuntu is a Linux-based OS, and a thorough understanding of it is required so that other
components like Openswan can be installed smoothly. To avoid problems when configuring
Openswan on Ubuntu, familiarity with a Linux text editor such as vi is needed to edit and save
the configuration. The Openswan configuration file must also be analysed, as modifying it incorrectly
results in rework. The commands listed below are useful for the implementation of this project.
3.1 Sample Linux Commands
Edit/insert: - i
Exit without saving: - :q
Save and Exit: - :wq (or) :x
Paste: Fn + Ins
The analysis of the firewall setting between the two clouds is important. It is required
to authenticate one cloud’s virtual machine with the other cloud. Hence it is important to update
the route table and security groups of the Openswan server in AWS correctly. These security
groups and route tables are used to allow traffic from remote networks.
Two custom UDP inbound rules are required, one for port 500 and one for port 4500,
both with the Azure gateway IP address and a /32 CIDR as the source. The route table must also
include the network address of the Azure virtual network.
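The two inbound rules can be written down as plain data before they are entered in the AWS
console. The following is a minimal sketch under stated assumptions: the dictionary keys are
illustrative and are not the AWS API format, and 203.0.113.10 is a placeholder from the
documentation address range, not the gateway address used in this project.

```python
import ipaddress


def nat_t_inbound_rules(azure_gateway_ip: str) -> list:
    """Build the two custom UDP inbound rules: IKE (port 500) and NAT-T (port 4500)."""
    # Validate the address and express it as a single-host /32 CIDR.
    source = f"{ipaddress.IPv4Address(azure_gateway_ip)}/32"
    return [
        {"protocol": "udp", "port": port, "source_cidr": source}
        for port in (500, 4500)
    ]


# Placeholder gateway address (assumption, not the project's real value).
rules = nat_t_inbound_rules("203.0.113.10")
for rule in rules:
    print(rule)
```

Restricting the source to a /32 means that only the Azure gateway itself can reach the two
UDP ports on the Openswan server.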
In order to successfully install MariaDB in both cloud VMs, it is required to check the
compatibility of the Ubuntu server with the MariaDB version before implementation of the
project.
The analysis of the MySQL configuration files is important for setting up the MariaDB
cluster for database replication. Once the changes to the MySQL configuration files are complete, a
service restart is required using the following command.
service mysql restart
However, this command may result in failure if the databases on both nodes are not
authenticated. Hence, the MySQL configuration file is to be analysed and updated as shown
below to make it work.
3.2 MySQL Configuration
To transfer files from one cloud to the other, the SSH public keys of both VMs must be
exchanged. Then, bidirectional file transfer will be successful.
During the technical analysis of this project, the focus is kept on configuration changes,
and with continuous learning, all the challenges can be resolved.
3.4. Summary
This section provides the technical analysis of the different tools prior to the actual
implementation. The technical analysis is required to choose a different tool in case of a version
compatibility issue. In this project, each component is analysed, and for configuration changes,
a few online documents are referenced.
The next section provides the design of this project and the code, along with screenshots
from the project implementation.
Chapter 4: Design and System Implementation
4.1. Introduction
The previous sections covered techniques and tools related to cloud-to-cloud failover. This
section provides the design and implementation of cloud-to-cloud failover so that the services
are not impacted. This implementation is done in two stages. In the first stage, the failover of
the AWS MySQL database to the Azure MySQL database using MariaDB is performed. In the second
stage, the backup of the file system from AWS to the Azure cloud is performed over the
IPsec tunnel.
4.2. Problem and Context
The significance of this implementation is the need for the business continuity plan to meet
100% availability of service even on the cloud platform. On September 15, 2016, a global DNS
outage impacted Azure services for all regions. There are many such examples of cloud
unavailability in 2016, as discussed in the literature for this project, impacting businesses across
the globe. Hence this project covers an implementation of 100% cloud computing availability
using the cloud-to-cloud failover concept.
While implementing this project, there were some challenges due to the integration of
different components to form a structured solution. For the failover to happen from AWS to
Azure, it is important that both clouds connect and communicate with each other. This bridge
is set up using Openswan, which is installed on the AWS Ubuntu virtual machine and connects to
the Azure virtual network gateway. Continuous learning and debugging helped in the successful
configuration and implementation of Openswan and in building a VPN tunnel between the two
clouds. The architecture of this project is described in later sections.
4.3. Solution Structure
Defining a solution is a systematic process in which different components are linked to build a
solution. The solution structure is a process of capturing all the technical details in an
architectural format. This project has followed a recurring procedure in which the architecture
rework is done based on pseudocode testing (H. J. La and S. D. Kim, 2009). The diagram below
is an illustration of the steps followed in the solution architecture of this project.
4.1 Solution structure for backup and failover
This procedure has helped in identifying functional and non-functional areas. One result
of the solution structuring is a process flow design of the cloud-to-cloud failover network
architecture components as illustrated below.
4.2 Architectural design of AWS and Azure cloud-to-cloud connectivity
The primary virtual machine of this project runs in an Amazon Virtual Private Cloud (VPC),
and the Ubuntu server is installed on this VM. The backup server is a VM in an Azure
virtual network (VNet), on which Ubuntu is also installed. To connect both virtual machines over the cloud,
Openswan is installed on the primary cloud (AWS) and configured as a VPN endpoint so that both
communicate with each other using a VPN tunnel. MariaDB is used for MySQL database
replication, and the IPsec tunnel carries the file backup from AWS to Azure. The process flow and
installation of this solution are explained next in the implementation process section.
4.4. Implementation Process
The implementation of the project begins with creation of a VPC in the AWS US-West region
with a network address of 10.0.0.0/16 and one public subnet with an address range of
10.0.0.0/24. An Ubuntu server is deployed in the subnet and assigned an IP of 10.0.0.61. This
server is associated with an Elastic IP (EIP) address to access the Ubuntu server by using the
SSH command line interface. The EIP is required to connect with the Azure network.
A virtual network (VNet), similar to VPC in AWS, is created in the Azure US-East
region with a network address of 172.16.0.0/16 and an added subnet with network address
172.16.1.0/24. The Ubuntu server is deployed in the virtual network and assigned the IP
address of 172.16.1.5. The configuration of site-to-site connectivity is set by using EIP and the
address space of AWS. A virtual network gateway is created as shown in the screenshot below,
which generates a virtual gateway IP address and manages the shared key.
4.3 Virtual Network Gateway between AWS and Azure
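A site-to-site tunnel can only route between the two networks if their address spaces do not
overlap and each server address lies inside its subnet. The sketch below checks the address
plan stated in this section with the Python standard library. It is an illustrative check
only and was not part of the project implementation.

```python
import ipaddress

# Address plan as stated in this section.
aws_vpc = ipaddress.ip_network("10.0.0.0/16")
aws_subnet = ipaddress.ip_network("10.0.0.0/24")
aws_server = ipaddress.ip_address("10.0.0.61")

azure_vnet = ipaddress.ip_network("172.16.0.0/16")
azure_subnet = ipaddress.ip_network("172.16.1.0/24")
azure_server = ipaddress.ip_address("172.16.1.5")


def plan_is_consistent() -> bool:
    """Check subnet containment, host membership and non-overlapping address spaces."""
    return (
        aws_subnet.subnet_of(aws_vpc)
        and azure_subnet.subnet_of(azure_vnet)
        and aws_server in aws_subnet
        and azure_server in azure_subnet
        and not aws_vpc.overlaps(azure_vnet)
    )


print(plan_is_consistent())
```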
4.4.1. Site-to-Site VPN Tunnel Setup
Openswan is deployed on the AWS Ubuntu server so that it acts as a VPN device for
the VPN tunnel. Openswan relies on the IPsec protocol, which encrypts IP
traffic before packets are exchanged between source and destination. The Openswan server acts
as the router that performs encryption, decryption, encapsulation and decapsulation. Security groups and route tables are
modified on the AWS end to allow traffic from Azure. The connection section of the Openswan
/etc/ipsec.conf file is given below.
4.1 Openswan Configuration File
Where,
# Left = AWS side Ubuntu server IP address
# Left subnet = AWS network address
# Right = Azure virtual network gateway
# Right subnet = Azure virtual network address
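As an illustration of how the four parameters fit together, the sketch below renders a conn
section as text. It is not the project's actual configuration file. The connection name
"aws-to-azure", the placeholder gateway address 203.0.113.10, and the choice of type=tunnel
and authby=secret (for the shared key managed by the Azure gateway) are assumptions; only
left, leftsubnet, right and rightsubnet follow the legend above.

```python
def render_conn(name: str, left: str, leftsubnet: str, right: str, rightsubnet: str) -> str:
    """Render an ipsec.conf conn section; options are indented under the conn line."""
    options = [
        ("type", "tunnel"),      # assumption: site-to-site tunnel mode
        ("authby", "secret"),    # assumption: pre-shared key authentication
        ("auto", "start"),       # bring the connection up at IPsec startup
        ("left", left),
        ("leftsubnet", leftsubnet),
        ("right", right),
        ("rightsubnet", rightsubnet),
    ]
    lines = [f"conn {name}"] + [f"    {key}={value}" for key, value in options]
    return "\n".join(lines) + "\n"


conn_text = render_conn(
    name="aws-to-azure",         # hypothetical connection name
    left="10.0.0.61",            # AWS side Ubuntu server IP address
    leftsubnet="10.0.0.0/16",    # AWS network address
    right="203.0.113.10",        # placeholder for the Azure virtual network gateway
    rightsubnet="172.16.0.0/16", # Azure virtual network address
)
print(conn_text)
```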
The implementation process is divided into two stages: one for failover cluster setup
and the other for file system backup.
4.4.2. MariaDB Galera Cluster Setup (Replication and Failover)
MariaDB Galera Cluster is deployed on both VMs to synchronise the MySQL database through
replication. Below is the command used to install the MariaDB cluster.
4.2 MariaDB Installation Command
After the deployment of the MariaDB cluster, the MariaDB configuration file “my.cnf”
is changed on each virtual machine with wsrep configuration options so that the MariaDB
cluster knows the endpoints for communication. In addition to this change, the
configuration file is changed for the VSRep option to provide the IP address and related details
of the other VM. Hence in the AWS VM, the IP address of Azure is provided and vice versa,
so that MariaDB acts as a mediator for replication between the two clouds. To configure VSRep
under the [mysqld] section on each node, the “my.cnf” file is changed with the node's
hostname, IP address and root password.
4.3 Configuration for wsrep Option in my.cnf
4.4 Configuration for VSRep Option in my.cnf
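For readers without access to the screenshots, the sketch below renders the kind of wsrep
settings that a two-node Galera cluster typically needs in the [mysqld] section. It is an
assumption-laden illustration, not a copy of the project's file: the cluster name, node name,
provider library path and the rsync state transfer method are all assumed values, and the
node addresses are the server addresses given in Section 4.4.

```python
def render_galera_section(cluster_name: str, node_name: str, node_ip: str, peer_ips: list) -> str:
    """Render a [mysqld] section with common Galera (wsrep) options for one node."""
    members = ",".join([node_ip, *peer_ips])
    settings = {
        "binlog_format": "ROW",
        "default_storage_engine": "InnoDB",
        "innodb_autoinc_lock_mode": "2",
        "wsrep_on": "ON",
        # Assumed library path; it differs between distributions and versions.
        "wsrep_provider": "/usr/lib/galera/libgalera_smm.so",
        "wsrep_cluster_name": cluster_name,
        "wsrep_cluster_address": f"gcomm://{members}",
        "wsrep_node_name": node_name,
        "wsrep_node_address": node_ip,
        "wsrep_sst_method": "rsync",  # assumed state transfer method
    }
    lines = ["[mysqld]"] + [f"{key}={value}" for key, value in settings.items()]
    return "\n".join(lines) + "\n"


# Hypothetical names; the IP addresses are the AWS and Azure server addresses.
aws_node_config = render_galera_section("cloud_cluster", "aws-node", "10.0.0.61", ["172.16.1.5"])
print(aws_node_config)
```

On the Azure node the same function would be called with the two addresses swapped, so that
each node lists itself and its peer in the cluster address.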
The configuration is completed successfully, and hence the bi-directional replication of
MySQL DB is initiated using the MariaDB cluster.
4.4.3. SSH Keys and SCP Command (File System Backup)
SSH is a protocol that permits secure connections between virtual machines. The SCP
command transfers files across SSH connections from one cloud to another. SSH
public key authentication is used to transfer the data between AWS and Azure. The SSH public key of the AWS
Ubuntu server is copied into the “authorized_keys” file in Azure and vice versa so that
required files can be transferred from one Ubuntu VM to the other by using an SCP
command.
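Because the file backup is a single SCP invocation, it is easy to wrap in a small helper. The
sketch below only builds the command as an argument list. The file name, user name and
destination directory are hypothetical, and the destination address is the Azure server
address from Section 4.4.

```python
import shlex
from typing import List, Optional


def build_scp_command(
    local_path: str,
    remote_user: str,
    remote_host: str,
    remote_dir: str,
    identity_file: Optional[str] = None,
) -> List[str]:
    """Build an scp argument list that copies one local file to a remote directory."""
    command = ["scp"]
    if identity_file:
        # Use a specific private key instead of the default one.
        command += ["-i", identity_file]
    command += [local_path, f"{remote_user}@{remote_host}:{remote_dir}"]
    return command


# Hypothetical file and user names.
cmd = build_scp_command("backup.tar.gz", "ubuntu", "172.16.1.5", "/home/ubuntu/")
print(shlex.join(cmd))
```

The returned list can be passed to subprocess.run(cmd, check=True) on a machine whose public
key has been exchanged as described above.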
4.5. Challenges
Connecting two different clouds is not an easy process. When a VPN tunnel
based on IPsec is implemented between AWS and Azure, a problem arises if the peers
of the tunnel are behind NAT. NAT rewrites the addresses in an IP packet, so the
packet is rejected by the other peer because its signature no longer matches. The solution
is commonly known as NAT-T (NAT Traversal); it works by encapsulating IPsec packets
in UDP packets. As these packets can pass through NAT routers, the
packets are not dropped.
When using Openswan to create a VPN tunnel, there will be two parameters called Left
and Right, which are simply peers on the two ends of its tunnel. Various parameters for these
two ends will be configured in “/etc/ipsec.conf”, which is used to define a tunnel between two
nodes. The “ipsec.conf” file has “nat_traversal=no” by default, and as NAT Traversal must be
supported, that setting should be changed to “nat_traversal=yes”.
There is an “auto” parameter in the conn section of the ipsec.conf file. This parameter is
used to set the automatic operation which should be done during IPsec startup. By default, this
parameter is defined as “auto=ignore”, which indicates that no operation is set as automatic
startup. Therefore, this value should be changed to “auto=start”, which indicates the automatic
connection. (Rosen, 2016).
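The two changes described above are simple line edits, so they can be applied and verified
programmatically. The sketch below rewrites the two settings in a configuration text. The
sample text is a shortened, hypothetical ipsec.conf, not the file used in this project.

```python
import re


def enable_nat_t_and_autostart(config_text: str) -> str:
    """Switch nat_traversal=no to yes and auto=ignore to start, keeping indentation."""
    text = re.sub(
        r"(?m)^([ \t]*nat_traversal[ \t]*=[ \t]*)no[ \t]*$", r"\1yes", config_text
    )
    text = re.sub(r"(?m)^([ \t]*auto[ \t]*=[ \t]*)ignore[ \t]*$", r"\1start", text)
    return text


# Shortened, hypothetical configuration text.
sample = (
    "config setup\n"
    "    nat_traversal=no\n"
    "\n"
    "conn aws-to-azure\n"
    "    auto=ignore\n"
)
patched = enable_nat_t_and_autostart(sample)
print(patched)
```

Settings that already have the wanted value are left unchanged, so the function can be run
repeatedly on the same file content.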
Openswan supports NAT-T, which encapsulates IPsec packets in UDP packets so that they pass
through NAT routers without being dropped. Hence, to allow these UDP
packets, the inbound rules in the security group of the Openswan server that is located in the
AWS virtual private cloud should be modified with two custom UDP rules that allow ports
500 and 4500 for traffic from the Azure virtual network gateway.
4.6. Summary
This section provides the detailed configuration of the Ubuntu server on both nodes (AWS and
Azure) and other components’ configuration for failover and backup purposes. The sanity
check is done to test the connectivity between two nodes. The next section captures the test
results and evaluation of the objectives of this project.
Chapter 5: Testing and Evaluation
5.1. Introduction
The implementation of cloud-to-cloud failover and backup is achieved using the control panel
provided by AWS with components like Openswan, MariaDB cluster, VPN and SSH
integration. A virtual network gateway is created in the Azure virtual network, and Openswan
is installed on the AWS virtual machine. The configuration is written on the AWS side in order to create the tunnel between
AWS and Azure. For MariaDB, the configuration is done on both VMs.
This section covers the test cases, results and evaluation of how the implementation
achieves the objectives. The testing is carried out in two phases: one for cloud failover and one
for file system backup validation. The test case scenarios are shown in the table below.
5.1 Test Case Scenarios
Sr # Test Description Execution Step Test Result Remarks
1. VPN Tunnel Testing
2. Database Cluster Replication
3. File transfers between two clouds
Test Results Codes:
OK – Test condition is passed
NG – Not good (i.e., test condition failed)
NT – Not tested
The screenshots captured while testing are provided in the appendix.
5.2. VPN Tunnel Testing
The testing of the VPN tunnel is important because it shows the success or failure of
connectivity between the two clouds. In order to verify whether the VPN is up on the virtual
machines, the command show crypto ipsec sa is used. If the connection is successful, then the
output of this command shows both the inbound and outbound SPI. This result shows the
encaps/decaps counters incrementing if the traffic passes through the tunnel.
5.1 Communicating Both VMs of AWS and Azure
5.2 VPN Tunnel Testing Result
1. VPN Tunnel Testing
   Execution Step: a. Go to AWS
                   b. Open command prompt
                   c. Use command below
                      show crypto ipsec sa
   Test Result: OK
   Remarks: none
2. VPN Session Check
   Execution Step: a. Go to AWS
                   b. Open command prompt
                   c. Use command below
                      show vpn-sessiondb
   Test Result: OK
   Remarks: Result is Session status: UP-ACTIVE
5.3. Database Cluster Replication
The cluster status monitoring is required to check if the cluster is operational. This test case
covers the monitoring, status updates and failover scenario.
5.2 Database Replication Between AWS and Azure
5.3 Database Cluster Testing Result
CHECKING CLUSTER INTEGRITY
1. Check the Cluster Configuration
   Execution Step: Open MariaDB Windows and run command below
                   SHOW GLOBAL STATUS LIKE 'wsrep_%';
   Test Result: OK
   Remarks: It shows the wsrep protocol version as 5, last committed as 202 and thread count as 2
2. Check Cluster Integrity
   Execution Step: SHOW GLOBAL STATUS LIKE 'wsrep_cluster_state_uuid';
   Test Result: OK
   Remarks: It shows the cluster state UUID which helps to determine that this node is part of
            the cluster.
3. Check the number of nodes in the cluster
   Execution Step: SHOW GLOBAL STATUS LIKE 'wsrep_cluster_size';
   Test Result: OK
   Remarks: The value is 2
4. Check cluster status
   Execution Step: SHOW GLOBAL STATUS LIKE 'wsrep_cluster_status';
   Test Result: OK
   Remarks: The value is primary.
CHECKING THE NODE STATUS
5. Check if node is accepting queries
   Execution Step: SHOW GLOBAL STATUS LIKE 'wsrep_ready';
   Test Result: OK
   Remarks: The value is ON
6. Check if node has network connectivity with other node
   Execution Step: SHOW GLOBAL STATUS LIKE 'wsrep_connected';
   Test Result: OK
   Remarks: The value is ON
7. Check node state
   Execution Step: SHOW GLOBAL STATUS LIKE 'wsrep_local_state_comment';
   Test Result: OK
   Remarks: The value is Joined
CHECKING THE REPLICATION HEALTH
8. Check the average size of write set queue
   Execution Step: SHOW STATUS LIKE 'wsrep_local_recv_queue_avg';
   Test Result: OK
   Remarks: The value is 3.34. If the value is greater than 0.0, it means there is delay in
            replication. Ideally the value should be as close as possible to 0.0.
9. Check the pause status due to flow control
   Execution Step: SHOW STATUS LIKE 'wsrep_flow_control_paused';
   Test Result: OK
   Remarks: The value is 0.18. If the value is greater than 0.0, it means the node is paused due
            to flow control. Ideally the value should be as close as possible to 0.0.
DETECTING SLOW NETWORK ISSUES
10. Check average length of the send queue
   Execution Step: SHOW STATUS LIKE 'wsrep_local_send_queue_avg';
   Test Result: OK
   Remarks: The value is 0.14. Values greater than 0.0 indicate a network bottleneck. Ideally the
            value should be as close as possible to 0.0.
FAILOVER TO AZURE
11. Check if database is failed over
   Execution Step: AWS database should be down and Azure database should be up and acting as
                   primary
   Test Result: OK
   Remarks: Azure database is showing as primary
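The integrity and node status checks in the table can be combined into one automated check.
The sketch below evaluates a dictionary of status values such as the one returned by SHOW
GLOBAL STATUS LIKE 'wsrep_%'. How the values are fetched from the database is left out, and
the sample values are illustrative rather than captured from the project.

```python
# Expected values for a node that is an operational member of the cluster.
EXPECTED = {
    "wsrep_cluster_status": "Primary",
    "wsrep_ready": "ON",
    "wsrep_connected": "ON",
}


def failed_checks(status: dict, expected_size: int = 2) -> list:
    """Return the names of the status variables that do not have the expected value."""
    failures = []
    for name, wanted in EXPECTED.items():
        if str(status.get(name, "")).upper() != wanted.upper():
            failures.append(name)
    if int(status.get("wsrep_cluster_size", 0)) != expected_size:
        failures.append("wsrep_cluster_size")
    return failures


# Illustrative status values for a healthy two-node cluster.
status = {
    "wsrep_cluster_status": "Primary",
    "wsrep_ready": "ON",
    "wsrep_connected": "ON",
    "wsrep_cluster_size": "2",
}
print(failed_checks(status))
```

During the failover test the AWS node is down, so the check on the Azure node would be run
with expected_size=1.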
5.4. AWS-AZURE: Inter-Cloud Transfer (file transfers between two clouds)
SSH and SCP are used for the file transfer between the two VMs for backup and restore purposes
in the case of cloud failure. SSH is the underlying protocol, and SCP is the Linux command that
copies files over SSH. This section covers the test cases and results of the file transfer scenarios as shown in the
table below.
5.3 File Transferring Between AWS and Azure
5.4 Cloud-to-Cloud Backup Testing Result
1. Check all SSH connected sessions
   Execution Step: [root@router~]# netstat -tnpa | grep 'ESTABLISHED.*sshd'
   Test Result: OK
   Remarks: It shows two IP addresses as established with SSH
2. Check if SSH service is running
   Execution Step: sudo service ssh start
   Test Result: OK
   Remarks: none
3. Check if SCP is transferring file through SSH
   Execution Step: [virtual machine ~]$ scp examplefile yourusername@yourserver:/home/yourusername/
   Test Result: OK
   Remarks: The file is copied to the virtual machine correctly.
5.5. Results Evaluation
The test results need to be evaluated against the objectives of the project.
This testing is carried out to make sure the defined objectives are tested as part of this project
implementation. The evaluation is performed on two major test conditions as described next.
5.5.1. AWS-AZURE Failover
The failover is performed using the MariaDB cluster. The database on AWS was failed manually,
and it was observed that service immediately failed over to Azure. Because both databases are kept in sync
by replication, there was no outage, and hence the cluster provided 100%
availability of the data. This testing is important for the mission-critical businesses that are
dependent on cloud computing.
5.5.2. AWS-AZURE Backup
For the AWS-to-Azure backup, the SSH protocol is used. The files are backed up using an SCP
Linux command. The AWS virtual machine is made unavailable, and the files are still
accessible from the Azure virtual machine. This evaluation verifies that, from the point of
failure, the cloud-to-cloud file system backup and recovery still function without any data loss
or unavailability during a disaster.
5.6. Summary
The testing is a critical part of this project as it tests the project through multiple scenarios as
defined in the test cases and provides the results. The failover from AWS to Azure is achieved
using the MariaDB cluster, and file system synchronisation is performed using SCP through
the SSH protocol.
Chapter 6: Recommendations and Future Scope
6.1. Introduction
The implementation of this project overcomes the cloud failure challenges when using cloud
technologies. Though there are lessons to be learned from this project implementation, it
provides a direction for future work by enhancing cloud-to-cloud failover to achieve high
scalability and performance.
6.2. Recommendations
The VPN device must support some requirements in order to work correctly; for example, it
must have a public-facing IPv4 address and support IKEv1. The VPN device must also support
NAT-T, AES 128-bit encryption, SHA-1, etc. to establish the IPsec security associations in the
VPN tunnel.
In this project, the software VPN Openswan, which supports all the above requirements,
has been used at the AWS end of the VPN tunnel. However, the tunnel can also be set up by using different
methods.
Windows Server 2012 R2 can be used as the tunnel's end-point on the AWS side, where it
serves as the VPN device for the Azure connection. In this method, a Windows PowerShell
script installs RRAS (Routing and Remote Access Service) on the AWS server, which then needs
to be configured to create the site-to-site IPsec tunnel between AWS and Azure.
Openswan can be used on both sides, or Strongswan can be used at the AWS end, to
create the VPN tunnel. Strongswan is, however, better documented than Openswan
(Michael, 2015). Another software VPN called TMG 2010 supports these prerequisites, yet
achieving full network functionality with it has turned out to be harder than anticipated
(MikeWo, 2013).
Other recommendations from this project are listed below.
1. In a cloud failover project, a thorough understanding of and hands-on experience with the
technologies help to save time in implementation.
2. Prior to cloud-to-cloud failover implementation, a compatibility study of the different
cloud technologies and tools is important.
3. Cloud-to-cloud failover testing should cover as many scenarios as possible.
6.3. Future Work
The scope of this project is limited to cloud clustering and file system backup. Future
work on cloud availability is therefore the implementation of cloud-to-cloud load balancing
for higher performance and scalability.
This can be implemented using HAProxy, an open-source package that can balance load
between two different virtual networks. With this package, a virtual IP address that acts
as a load balancer can be created between AWS and Azure to balance the workload between the
primary and secondary sites. The solution structure can be as illustrated in
the diagram below.
6.1 Cloud-to-Cloud Load Balancer – Future Scope
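As an illustration of the load-balancing idea, the sketch below prints a minimal HAProxy listener that spreads MariaDB connections across the two nodes. It is a sketch under assumptions, not a tested design: the section name, server names and addresses are placeholders, and a complete haproxy.cfg would also need global and defaults sections with timeouts.

```shell
# Print a minimal HAProxy TCP listener for the two cloud nodes.
# Arguments: AWS node IP, Azure node IP (both placeholders).
# "bind *:3306" would be replaced by the virtual IP address in a real setup.
render_haproxy_cfg() {
  aws_ip=$1
  azure_ip=$2
  cat <<EOF
listen mariadb_cluster
    bind *:3306
    mode tcp
    balance roundrobin
    server aws_node ${aws_ip}:3306 check
    server azure_node ${azure_ip}:3306 check
EOF
}

# Example:
#   render_haproxy_cfg 10.0.0.10 10.10.0.4 | sudo tee -a /etc/haproxy/haproxy.cfg
```

Round-robin balancing sends traffic to both sites; adding the backup keyword to one server line would instead keep that site as a standby that only receives traffic when the other fails its health check.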
In this project the file backup is performed manually; it can be automated in
future.
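One possible way to automate the backup is to wrap the scp copy in a script and schedule it with cron. The sketch below assumes hypothetical values for the key path, remote account, directories and script location; none of them come from the project.

```shell
# Hypothetical settings; adjust to the real key, account and directories.
KEY="${KEY:-$HOME/.ssh/azure_vm.pem}"
REMOTE="${REMOTE:-azureuser@10.10.0.4}"

# Copy a directory to the Azure VM with scp. With DRY_RUN=1 the command
# is printed instead of executed, which helps when checking the schedule.
run_backup() {
  src_dir=$1
  dest_dir=$2
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "scp -r -i $KEY $src_dir $REMOTE:$dest_dir"
  else
    scp -r -i "$KEY" "$src_dir" "$REMOTE:$dest_dir"
  fi
}

# Example crontab entry (added with "crontab -e") that runs a script calling
# run_backup every night at 02:00; the script path is a placeholder:
#   0 2 * * * /usr/local/bin/cloud_backup.sh >> /var/log/cloud_backup.log 2>&1
```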
The cost and security analysis of the multi-cloud environment can be considered for
future study and implementation.
6.4. Summary
A multi-cloud environment between AWS and Azure has been implemented by using an
Openswan software VPN. This chapter discussed other approaches to achieving cloud-to-cloud
connectivity and made a few recommendations regarding the VPN device. It also outlined, as
future scope for this project, automated backup and a load balancer between AWS and Azure
to balance the workload and provide high availability.
Chapter 7: Conclusion
7.1. Introduction
The integration of AWS and Azure provides cloud-to-cloud failover in order to overcome the
challenges arising from unexpected cloud outages. This chapter will discuss the overall
conclusion of this project.
7.2. Conclusion
This project offers a meaningful solution for large organisations that need to maintain
business continuity even in the case of unexpected outages or failures.
The test results of this project showed that 100% business availability on the cloud is
achievable in the scenarios tested. Recent cloud outages, such as the AWS outage caused by
the 2016 Sydney storm or the Azure outages due to other natural calamities, have forced the
business community to reconsider its options for service availability. This project has
implemented and tested the potential for 100% cloud availability by using cloud-to-cloud
failover techniques.
8. Reference List
Books, Journals & Documents
1. Arean, O. (2013). Disaster recovery in the cloud. Network security, 9, 5-7.
doi:10.1016/S1353-4858(13)70101-6
2. Buyya, R., Broberg, J., & Goscinski, A. M. (2011). Cloud computing: Principles and
paradigms. Brooklyn, NY: Wiley.
3. Buyya, R., Yeo, C. S., Venugopal, S., Broberg, J., & Brandic, I. (2009). Cloud
computing and emerging IT platforms: Vision, hype, and reality for delivering
computing as the 5th utility. Future generation computer systems, 25(6), 599-616.
4. Cassidy, L. (2016, August 03). PROJECT MANAGEMENT: TIME ESTIMATES AND
PLANNING. Retrieved from projectsmart.
5. CIF (2016). UK cloud adoption and trends for 2016. Retrieved from
https://www.cloudindustryforum.org/content/uk-cloud-adoption-trends-2016
6. Daniel, N., Rich, W., Chris, G., Graziano, O., Sunil, S., Lamia, Y., & Dmitrii, Z. (2009).
The eucalyptus open-source cloud-computing system. IEEE international symposium
on cluster computing and the grid, 12(9), 124-131. doi:10.1109/CCGRID.2009.93
7. Deepa. (2012, August 7). 14 cloud outages in 7 months, who is next? CIOL Bureau 2,
2-3. Retrieved from http://www.ciol.com/14-cloud-outages-months/
8. Donnelly, C. (2015, December 03). European Office 365 and Microsoft Azure users hit
by service outage. Retrieved from Computerweekly.
9. Ferguson, T. (2009). Salesforce.com outage hits thousands of businesses. Retrieved
from http://news.cnet.com/8301-1001_3-10136540-92.html
10. Fernand, F. (2014, April 23). Managing elasticity across Multi-cloud providers.
Retrieved from Slideshare: http://www.slideshare.net/fifiant/multicloud
11. Fiveash, K. (2015). AWS outage knocks Amazon, Netflix, Tinder and IMDb in MEGA
data collapse. London: the register.
12. Foley, M. J. (2016). Global DNS outage hits Microsoft Azure customers. Australia:
ZDNet.
13. Head, B. (2016). Ignore cloud faults at your peril. CWANZ (p. 15). Australia:
TechTarget.
14. Hofmann, P., & Woods, D. (2010). Cloud computing: The limits of public clouds for
business applications. IEEE internet computing, 6, 91-93.
15. Huanhuan X., Frank F., & Claus P. (2015). An architecture pattern for multi-cloud high
availability and disaster recovery. Workshop on Federated Cloud Networking
FedCloudNet, pp. 5-6
16. Jackson, T. (2008). We feel your pain and we are sorry. Retrieved from
http://gmailblog.blogspot.com/2008/08/we-feel-your-pain-and-were-sorry.html
17. John, M. (2010). Amazon elastic compute cloud (EC2). Retrieved from
http://aws.amazon.com/ec2/.
18. Juha Saarinen, A. C. (2016). AWS Sydney outage downs big-name web companies.
Australia: itnews.
19. Kevin J. (2009). Secure Cloud Computing: An Architecture Ontology Approach.
Retrieved from http://sunset.usc.edu/gsaw/gsaw2009/s12b/jackson.pdf, DataLine,
2009.
20. Lin, Y. K., & Chang, P. C. (2011). Maintenance reliability estimation for a cloud
computing network with nodes failure. Expert systems with applications, 38, 14185–
14189.
21. Marston, S., Li, Z., Bandyopadhyay, S., Zhang, J., & Ghalsasi, A. (2011). Cloud
computing - The business perspective. Decision Support Systems, 51, 176-189. doi:
10.1016/j.dss.2010.12.006
22. McGuire, C. (2016, 09 21). About VPN Gateway. Retrieved from Microsoft Azure:
https://azure.microsoft.com/en-us/documentation/articles/vpn-gateway-about-vpngateways/
23. Parnell, B.-A. (2012). Microsoft's Azure cloud down and out for 8 hours. London: the
register. Retrieved from the register.
24. Perez, J. C. (2008). Extended Gmail outage hits apps admins. Retrieved from
http://www.computerworld.com/s/article/9117322/Extended_Gmail_outage_hits_Apps_admins
25. Pierre R. (2016). Step-by-step: connect your AWS and Azure environments with a VPN
tunnel. Retrieved from
https://blogs.technet.microsoft.com/canitpro/2016/01/11/step-by-step-connect-your-aws-and-azure-environments-with-a-vpn-tunnel/
26. Rawat, V. (2013). Reducing failure probability of cloud storage services using multi-
cloud. Kota, Rajasthan: University College of Engineering, RTU. Retrieved from
https://arxiv.org/ftp/arxiv/papers/1310/1310.4919.pdf
27. Reid, S., Kicker, H., Matzke, P., Bartels, A., & Lisserman, M. (2011). Sizing the cloud.
technical report. Retrieved from http://www.forrester.com-/E-
/Sizing+The+Cloud/fulltext/RES58161objectid=RES58161
28. Robertson, B. (2009). Top five cloud computing adoption inhibitors. Enterprise
innovation, 5, 12-14.
29. Rouse, M. (2007). Line of business. Retrieved from
http://searchcio.techtarget.com/definition/LOB
30. Rouse, M. (2016). Line of business. Retrieved from
http://searchcio.techtarget.com/definition/LOB, (accessed on 20 Nov 2012).
31. Saarinen, J., Coyne, A. C. (2016). AWS Sydney outage downs big-name web companies.
Australia: itnews.
32. Sharwood, S. (2016). Azure's wobbly day as three services glitch around the world.
London: The Register.
33. Team. (2016). Ubuntu Server Guide. Retrieved from helpubuntu:
https://help.ubuntu.com/lts/serverguide/serverguide.pdf
34. Veena R. (2013). Reducing failure probability of cloud storage services using multi-
cloud. Kota, Rajasthan: University College of Engineering, RTU. Retrieved from
https://arxiv.org/ftp/arxiv/papers/1310/1310.4919.pdf
35. Xiong, H. F. (2015). An architecture pattern for multi-cloud high availability and
disaster recovery. Workshop on Federated Cloud Networking FedCloudNet, 15.
Websites
http://www.rightscale.com/blog/enterprise-cloud-strategies/private-and-hybrid-clouds-9-use-cases-and-implementation-advice
https://www.opsgility.com/blog/2013/09/03/connecting-clouds-site-to-site-aws-azure/
http://developers-club.com/posts/196798/
www.Gartner.com
http://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/CHAP_Aurora.html
https://azure.microsoft.com/en-us/documentation/articles/vpn-gateway-about-vpngateways/
9. Appendices
This section provides the screenshots used during the implementation of the project.
9.1. Implementation Screenshots
The following are the implementation screenshots of the procedure to connect the AWS and
Azure cloud platforms using a VPN connection in order to provide backup and failover.
1) Created a VPC with a single public subnet in the AWS environment
9.1: Creating VPC with single public subnet
9.2: Details of VPC while creating
2) Deployed Ubuntu server in VPC
9.3: Selecting Ubuntu AMI in EC2 dashboard
9.4: Configuration details of Ubuntu instance while deploying
3) Created an Elastic IP (EIP) and associated it with the Ubuntu server that was created in
the previous step
9.5: Allocating EIP in EC2 dashboard
9.6: Associate EIP with Ubuntu server
4) Created Azure virtual network
9.7: Created virtual network in Azure
5) Created site-to-site connectivity
9.8: AWS VPC address space and AWS Elastic IP supplied while creating the site-to-site
connection
9.9: Created site-to-site connectivity
6) Defined the Azure virtual network subnet and added Gateway subnet
9.10: Adding subnet and gateway for virtual network in Azure
7) Created the Azure Virtual Network Gateway
9.11: Azure virtual network Gateway IP address produced
9.12: Manage shared key generated
8) Connected to the Ubuntu VM on the AWS side to configure Openswan
9.13: Accessing Ubuntu by using SSH PuTTY on AWS side
9) Installed Openswan on Ubuntu server in AWS
Used the following command to install Openswan on Ubuntu:
sudo apt-get install openswan
9.14: Installing Openswan in the AWS Ubuntu server
10) Configured Openswan by editing its configuration files
The Openswan configuration is presented in the next section (9.2).
Once Openswan was configured, edited the “sysctl.conf” file:
Enabled IP forwarding on the Openswan VM by uncommenting the line below:
net.ipv4.ip_forward=1
Next, disabled the “source/destination checking” option on the Openswan server:
9.15: Disabling source/destination check
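The forwarding setting can be applied and checked from the shell. The helper below is a hypothetical sketch: it reads the kernel flag that net.ipv4.ip_forward controls, and the instance ID in the AWS CLI comment is a placeholder.

```shell
# Report whether IPv4 forwarding is active by reading the kernel flag.
# The optional argument overrides the path (useful for testing).
ip_forward_state() {
  flag_file=${1:-/proc/sys/net/ipv4/ip_forward}
  if [ "$(cat "$flag_file" 2>/dev/null)" = "1" ]; then
    echo "enabled"
  else
    echo "disabled"
  fi
}

# After editing /etc/sysctl.conf, reload it and confirm the result:
#   sudo sysctl -p
#   ip_forward_state
#
# The source/destination check can also be disabled with the AWS CLI
# instead of the console (the instance ID is a placeholder):
#   aws ec2 modify-instance-attribute --instance-id i-0123456789abcdef0 --no-source-dest-check
```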
11) Modified Security Groups to Allow Traffic from Windows Azure
9.16: Modified security groups for openswan server in AWS
12) Azure Virtual Network Connected to Amazon AWS Virtual Private Cloud
Before restarting the Openswan server, the connection representation is as shown below:
9.17: Connection between two clouds before configuring openswan
Then, restarted the Openswan server by using the command below:
sudo service ipsec restart
Now, the graphical representation of the connection between the two clouds is as shown:
9.18: connection between two clouds after configuring Openswan
Then, used the ping command to check whether the VMs on the two clouds can communicate
with each other.
9.19: Pinging both VMs in different clouds
13) Deployed MariaDB cluster to perform database replication
Installed MariaDB cluster on the two nodes. The MariaDB configuration of both nodes is
provided in the coding section (9.2).
14) Replication of the database between the two cloud VMs, which can be considered failover
9.20: Database replication between two VMs
15) File transfer between the two cloud VMs, which can be considered file system backup
9.21: Files transferring between two VMs
9.2. Coding Part of Implementation
Openswan Configurations on AWS side:
Edited the ipsec.conf file:
9.22: Edited ipsec.conf file
Edited the amnazure.conf file:
9.23: Edited amnazure.conf file
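A connection definition of the kind edited in these files might look like the sketch below. It is an assumption-based illustration, not the project's configuration: the connection name azure and all addresses are placeholders, while the keywords are standard Openswan options chosen to match the IKEv1, AES-128 and SHA-1 requirements from section 6.2.

```shell
# Print an Openswan connection block for a site-to-site tunnel to Azure.
# Arguments: local private IP, local subnet, Azure gateway IP, Azure subnet.
# NAT-T is enabled separately with "nat_traversal=yes" in the
# "config setup" section of ipsec.conf.
render_conn() {
  cat <<EOF
conn azure
    authby=secret
    auto=start
    type=tunnel
    left=$1
    leftsubnet=$2
    leftnexthop=%defaultroute
    right=$3
    rightsubnet=$4
    ike=aes128-sha1-modp1024
    esp=aes128-sha1
    pfs=no
EOF
}

# Example (all addresses are placeholders):
#   render_conn 10.0.0.28 10.0.0.0/16 203.0.113.10 10.10.0.0/16
```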
Edited the ipsec.secrets file:
Specified the Azure gateway and the shared key in the entry below:
9.24: Edited ipsec.secrets file
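For reference, a pre-shared-key entry in ipsec.secrets has the form local address, remote address, then the key. The helper below is a hypothetical sketch that prints such an entry; the addresses and key in the example are placeholders.

```shell
# Print an ipsec.secrets entry of the form:
#   <local IP> <remote gateway IP> : PSK "<shared key>"
render_secret() {
  printf '%s %s : PSK "%s"\n' "$1" "$2" "$3"
}

# Example (placeholder values):
#   render_secret 10.0.0.28 203.0.113.10 'shared-key-from-azure' | sudo tee -a /etc/ipsec.secrets
```

Because the file holds the shared key in clear text, it is normally kept readable by root only.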
MariaDB Cluster Configurations:
MariaDB Galera Cluster has been deployed on each VM to provide synchronous MySQL
database replication. Then, to configure the MariaDB cluster, the files below have been
changed.
MySQL Settings:
First, opened the my.cnf file and commented out the following lines, which are
uncommented by default on Ubuntu servers.
MariaDB Settings:
Added the following lines for the wsrep configuration options to the my.cnf file
under the [mysqld] section, as shown below, on both the AWS server and the Azure server.
wsrep Provider Configurations:
Configured the wsrep settings under the [mysqld] section on each node by
adding the following lines to the /etc/mysql/my.cnf file with the node-specific IP
addresses, hostnames and root password.
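A wsrep section of the kind described here might look like the following sketch. It is an illustration under assumptions: the cluster name, node names and addresses are placeholders, the provider path is the usual Ubuntu location, and the option names are standard Galera settings rather than values taken from the project.

```shell
# Print a [mysqld] section with typical Galera (wsrep) settings for one node.
# Arguments: this node's IP, this node's name, comma-separated list of all
# node IPs. MariaDB 10.1 and later additionally require "wsrep_on=ON".
render_wsrep_cfg() {
  node_ip=$1
  node_name=$2
  peers=$3
  cat <<EOF
[mysqld]
binlog_format=ROW
default_storage_engine=InnoDB
innodb_autoinc_lock_mode=2
bind-address=0.0.0.0
wsrep_provider=/usr/lib/galera/libgalera_smm.so
wsrep_cluster_name="aws_azure_cluster"
wsrep_cluster_address="gcomm://${peers}"
wsrep_sst_method=rsync
wsrep_node_address="${node_ip}"
wsrep_node_name="${node_name}"
EOF
}

# Example for the AWS node (placeholder addresses):
#   render_wsrep_cfg 10.0.0.10 aws-node 10.0.0.10,10.10.0.4
```

The same function would be called on the Azure node with its own address and name, keeping the peer list identical on both sides.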
In this project, bi-directional replication has been implemented between AWS and Azure VMs;
this replication provides the failover and failback between the AWS and Azure sites.