
© The British Computer Society 2014. All rights reserved. For Permissions, please email: [email protected]

doi:10.1093/comjnl/bxu065

iMIG: Toward an Adaptive Live Migration Method for KVM Virtual Machines

Jianxin Li1∗, Jieyu Zhao1, Yi Li1, Lei Cui1, Bo Li1, Lu Liu2 and John Panneerselvam2

1State Key Laboratory of Software Development Environment, School of Computer Science & Engineering, Beihang University, Beijing, China
2School of Computing and Mathematics, University of Derby, Derby, UK
∗Corresponding author: [email protected]

With the energy and power costs increasing alongside the growth of the IT infrastructures, achieving workload concentration and high availability in cloud computing environments is becoming more and more complex. Virtual machine (VM) migration has become an important approach to address this issue; in particular, live migration of VMs across physical servers facilitates dynamic workload scheduling of cloud services as per the energy management requirements, and also reduces the downtime by allowing the migration of running instances. However, migration is a complex process affected by several factors such as bandwidth availability, application workload and operating system configurations, which in turn increases the complications in predicting the migration time in order to negotiate the service-level agreements in a real datacenter. In this paper, we propose an adaptive approach named improved MIGration (iMIG), in which we characterize some of the key metrics of live migration performance, and conduct several experiments to study the impacts of the investigated metrics on the Kernel-based VM (KVM) functionalities, as well as the energy consumed by both the destination and the source hosts. Our results reveal the importance of the configured parameters: speed limit, TCP buffer size and max downtime, along with the VM properties, and also their corresponding impacts on the migration process. Improper setting of these parameters may either incur migration failures or cause excess energy consumption. We witness a few bugs in the existing Quick EMUlator (QEMU)/KVM parameter computation framework, which is one of the most widely used KVM frameworks based on QEMU. Based on our observations, we develop an analytical model aimed at better predictions of both the migration time and the downtime during the process of VM deployment. Finally, we implement a suite of profiling tools in the adaptive mechanism based on the qemu-kvm-0.12.5 version, and our experimental results prove the efficiency of our approach in improving the live migration performance. In comparison with the default migration approach, our approach achieves a 40% reduction in the migration latency and a 45% reduction in the energy consumption.

Keywords: cloud computing; live migration; energy saving; configures; adaptive model

Received 17 December 2013; revised 1 May 2014
Handling editor: Zhangbing Zhou

1. INTRODUCTION

In recent years, cloud computing [1] has emerged as an efficient paradigm which enables ubiquitous, on-demand services for the continuous public computing and storage requirements. The continuous operation of massive datacenters and other computing resources in the cloud consumes a notable amount of energy, and the level of energy consumption has now become a serious question in the cloud arena. Virtualization has become an enabling technology for cloud computing as it simplifies service delivery by providing a platform for optimizing complex IT resources in an effective and energy-efficient way [2]. In the cloud datacenter, a virtual machine (VM) encapsulates the execution of a complete operating system (OS), allowing multiple virtual OS environments to co-exist on the same computer with strong isolation between each other [3, 4]. Live migration [5] is a key feature of VMs, which allows moving a continuously running VM from one physical host to another. The attractive benefits of live migration include energy saving, high availability and load balancing. Various research works have been carried out recently to optimize the performance of live migration, such as pre-copy [6], post-copy [7] and trace and replay [8]. Though these studies look comprehensive [6, 9–13], they do not account for the fact that configuration parameters affect the migration time, the downtime of the VMs and also the power of the computing nodes.

In a dynamic scenario, attempting a quick VM migration with a speed of 100 MB/s in a low-bandwidth network of 30 MB/s often results in poor performance. In such a case, deploying the task with the right parameters and performance metrics plays a crucial role in the overall performance of the network. With this in mind, we highlight the need to consider parameters such as the bandwidth, the service-level agreements (SLA), the local availability, the current network status and many others while attempting a VM migration in the cloud. An SLA, which is directly related to the perceivability of the end users, is an important part of the VM services. SLAs are mostly specified before the deployment of the VMs, which requires the cloud providers to gain prior knowledge about the desired configuration parameters in order to satisfy the agreement in terms of the migration performance and the datacenter energy management criterion. Migration is a complex process affected by many factors such as the network bandwidth, the local system capability, etc. The cloud environment is generally dynamic, and so the pre-configured parameters should be adapted according to the dynamic environment and the energy management requirements. An improper parameter setting might lead to unexpected migration failures and thus performance degradation.

Figure 1 depicts such an example where the max downtime is configured as 200 ms. Here, the migration lasts for a longer time than expected, because the estimated downtime is larger than 200 ms and the migration never enters the stop-and-copy phase. In this case, the migration is considered to be a failure. VM migration is very common in a datacenter environment with hundreds of VMs, and hence the energy consumption criterion is always attached to it. Hence, strategies for adapting these pre-configured parameters to achieve the desired live migration performance are important and desirable. To address the above problems, we propose an adaptive live migration approach named improved MIGration (iMIG) based on Quick EMUlator (QEMU)/Kernel-based VM (KVM) [14]. KVM is a Linux kernel module that allows a user space program to utilize the hardware virtualization features of various processors. QEMU makes use of KVM when running a target architecture that is the same as the host architecture.

FIGURE 1. An example of migration with a fluctuating expected downtime. In this example, the max downtime parameter is set to 200 ms by default, but this incurs a longer migration time without a tradeoff strategy, because the expected downtime is never satisfied.

FIGURE 2. The methodology of iMIG. iMIG provides two components: one is to analyze the parameters to build models, and the other is a patch on QEMU/KVM which supports adaptive live migration.

The KVM project maintains a fork of QEMU called qemu-kvm. We conduct several experiments in our study based on this QEMU/KVM framework.

The design goal is a two-fold strategy, as shown in Fig. 2. The first is to study the parameters and analyze their configured relationships with the VM via an empirical study, along with their impacts on the VM live migration performance. The second is to develop an analytical model with an adaptive mechanism to improve the migration performance and the success ratio. The major contributions of the paper are as follows:

(i) The impacts of the key parameters affecting KVM live migration have been evaluated through empirical studies based on QEMU/KVM. By analyzing the results of the KVM live migration performance under different conditions, we have identified a few constraints and also observed the impacts of these studied parameters on the live migration performance.

(ii) We design a new adaptive migration mechanism, called iMIG, to reduce the migration duration in order to gain better migration performance, addressing the existing issues of live migration in KVM. During our study, we have also identified a few drawbacks of the existing migration mechanism and adopted the necessary modifications to achieve better performance.

(iii) The proposed mechanism, iMIG, achieves better performance and exhibits higher adaptability in comparison with the existing live migration mechanism of QEMU/KVM. Also, iMIG provides a suite of profiling tools and methodologies for guiding the administrator to deploy VMs effectively.

The remainder of this paper is organized as follows: Section 2 gives the description of our experiments and results. In Section 3, we establish a model to describe the impact of the configuration parameters on the overall migration process. Section 4 presents the design and implementation of iMIG. In Section 5, we discuss related research works focused on live migration. Section 6 concludes this paper along with our future research directions.

2. OVERVIEW AND EMPIRICAL STUDY

In this section, we give a brief introduction to the functionalities of KVM live migration [15] and an overview of iMIG, and then we present our experiments in detail, including the environment that we use. This section also describes the configuration parameters and their impacts on the migration performance.

2.1. Overview of iMIG methodology

Live migration is affected by many parameters, such as the network bandwidth, the dirty memory speed (which reflects the application workload) and so on. The initial operation of iMIG is to assess the relationship between the configuration parameters and the live migration performance and their corresponding impacts. Based on this assessment and observation, we build our model, which can be used for adaptive migration guidance and SLA prediction. Thus, iMIG provides a patch on QEMU/KVM to enable adaptive migration and improve the performance of the existing migration.

2.2. Description of KVM live migration

KVM is an open source system virtualization module which has been integrated in every major distribution of Linux since Linux 2.6.20 [16], and it leverages the simulated devices implemented in QEMU. QEMU/KVM has now become the mainstream VM Monitor (VMM) in academia and industry, supporting a wide variety of guest OSs.

To evaluate the performance of KVM live migration, we must analyze the core code related to migration. KVM live migration does not involve the basic core state, and is mainly implemented in QEMU. The function do_migrate(Monitor *mon, const QDict *qdict, QObject **ret_data) is the entry function of live migration. The parameter uri indicates that migration can be carried out in one of four modes, i.e. tcp, exec, unix and fd. The files migration-tcp.c, migration-exec.c, migration-fd.c and migration-unix.c correspond to the four migration methods, respectively. The tcp mode is the main one used in practice, and is the one adopted in this study.

KVM uses the pre-copy strategy [6] to perform the migration, as shown in Fig. 3. The first stage of migration is carried out by the function qemu_savevm_state_begin(Monitor *mon, QEMUFile *f, int blk_enable, int shared). Then migrate_fd_put_ready(void *opaque) iteratively copies the dirty pages generated during the previous round. Finally, migrate_fd_cleanup stops the VM, copies the remaining dirty pages, closes and releases the socket, frees other resources, and sets the state to MIG_STATE_COMPLETED, completing the migration process.
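To make the control flow concrete, the following is a much-simplified sketch of this pre-copy loop. It is illustrative only and is not the actual qemu-kvm-0.12.5 code; the stub helpers are hypothetical stand-ins for the QEMU internals named above.

```c
#include <stdint.h>

/* Hypothetical stand-ins for the real QEMU internals, so the sketch compiles. */
static void     send_full_ram(void)      { /* stage 1: qemu_savevm_state_begin */ }
static void     send_dirty_pages(void)   { /* one round of migrate_fd_put_ready */ }
static uint64_t dirty_bytes_left(void)   { return 0; /* pages dirtied last round */ }
static void     stop_vm_and_finish(void) { /* stage 3: migrate_fd_cleanup */ }

void pre_copy_migrate(double max_downtime_ms, double bandwidth_bytes_per_ms)
{
    send_full_ram();                               /* first full copy of guest RAM */

    for (;;) {                                     /* iterative copy phase */
        double expected_downtime_ms =
            dirty_bytes_left() / bandwidth_bytes_per_ms;
        if (expected_downtime_ms <= max_downtime_ms)
            break;                                 /* remainder is small enough */
        send_dirty_pages();                        /* resend pages dirtied last round */
    }

    stop_vm_and_finish();                          /* stop-and-copy: pause VM, send rest */
}
```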

We need some criteria to judge the migration, and there are two key metrics to measure the performance of live migration [9].

• Migration time. It reflects the total time from the moment the migration begins to the time it completely finishes. We denote the migration time by Tmig.

• Downtime. It describes the time during which the service is unavailable, and is denoted by Tdown.

Based on the source code analysis, we found that several parameters impact these two metrics, as shown in Table 1.

2.3. The experimental environment setup

The environment is a PC cluster with 32 nodes connected by a 1 Gbps network. We choose two hosts from the cluster with Debian-6.0-amd-64 installed; the kernel version is 2.6.32.5 and the QEMU version is 0.12.5. During the experiments, we use the Network File System (NFS) [17] to provide these two hosts with a file sharing service. The NFS server is set up on Host1, and Host2 can mount the shared files with the 'nfs mount' command. The image of the VM waiting to be migrated is saved in the NFS directory. We use a Voltech PM1000+ power analyzer unit to measure and record the transient power and the total energy consumption.

The default values of some key parameters are shown in Table 2. The workload running on the VM is a program that randomly allocates a specified memory size and writes data onto it.


FIGURE 3. KVM live migration flow.

TABLE 1. Parameters of iMIG.

Parameter | Denotation | Brief description
VM memory | M | The larger the memory size, the longer it will take to finish the migration
Migration speed limit (used bandwidth) | speed_limit | The real-time throughput during the migration; we denote it as B
Page dirty rate (workload) | ω | The load running on the VM
Max downtime | Tmax−down | The longest time allowed during which the VM is unavailable
TCP_Buffer size | tcp_buffer | The size of the TCP_Buffer in our experiments
Iterate count | N | The number of iterative copies that happen during the migration
ith round | i | The current round of the migration
Time of current round | ti | The time it takes to finish the current round of copying

TABLE 2. Default parameter values.

VM memory | Workload | Max downtime | TCP_Buffer size | Speed limit
2 GB | 20 MB/s | 1000 ms | 16 MB | 50 MB/s

In the experiments, we vary some of the default values to identify the associated parameters. For example, the default value of the max downtime is 1000 ms, but in some experiments we change this value to 300 or 30 ms, as specially stated.
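As an illustration of this kind of workload, the following is a minimal sketch of a dirty-memory generator: it allocates a buffer of a given size and keeps rewriting it at a target rate. It is not the exact program used in our experiments; the command-line interface, the default values and the one-second pacing are assumptions made for the example.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

int main(int argc, char **argv)
{
    size_t total_mb = argc > 1 ? (size_t)atoi(argv[1]) : 1024;  /* memory to allocate */
    size_t rate_mb  = argc > 2 ? (size_t)atoi(argv[2]) : 20;    /* dirtying rate, MB/s */
    size_t mb       = 1024 * 1024;

    char *buf = malloc(total_mb * mb);
    if (!buf) {
        perror("malloc");
        return 1;
    }

    size_t offset = 0;
    for (;;) {
        /* dirty rate_mb megabytes, then sleep for the rest of the second */
        for (size_t i = 0; i < rate_mb; i++) {
            memset(buf + offset, rand() & 0xff, mb);   /* touch one megabyte */
            offset = (offset + mb) % (total_mb * mb);  /* wrap around the buffer */
        }
        sleep(1);
    }
    return 0;   /* not reached; the loop runs until the process is killed */
}
```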

2.4. The experimental results

During the experiments, we change one parameter at a time to study its respective effect on the final migration performance. The experiments mainly use the max downtime, the speed limit, the TCP_Buffer size, the rate of dirty pages and the VM memory size for analysis. Using the Statistical Product and Service Solutions (SPSS) software to fit curves to the experimental results, the specific impact of every single parameter on the migration performance is obtained, which in practice can give guidance for efficient migration.

(i) Impact of speed limit


FIGURE 4. Migration time vs speed limit.

FIGURE 5. Downtime vs speed limit.

In this experiment, we vary the speed limit from 10 to 200 MB/s and study the changes in the migration time and the downtime. We analyze the causes for these changes and use a curve fitting method to study the specific impact of the speed limit on the migration time and the downtime.

Migration time and downtime. From Figs 4 and 5, it can be seen that the total migration time is roughly inversely proportional to the speed limit, while the downtime curve escalates at the beginning and then turns into a smooth fluctuation as the speed limit increases. The inverse proportionality implies that with the initial increase of the speed limit (below about 100 MB/s in the experiment), the migration time reduces, and beyond a certain level of the speed limit (say, 100 MB/s) the effect of the speed limit on the migration time is negligible. We use SPSS to analyze the results precisely, obtaining Equation (1).

FIGURE 6. Throughput on different speed limits.

FIGURE 7. Throughput with speed limit = 110 MB/s.

Detailed data may not be the same, but this method can be used to analyze the results of experiments conducted in a totally different environment. The quantitative analysis with SPSS enables better predictions about the impacts of the parameters on the SLA, reflected in the improvement of the adaptive migration.

Tmig ≈ 1700/x + 8. (1)

Network throughput. We also measured the real-time bandwidth (throughput) during the migration process, as shown in Figs 6 and 7. From Fig. 6, it is evident that the throughput increases with the speed limit. Also, when the speed limit exceeds 80% of the physical bandwidth (1 Gbps), the throughput fluctuates rapidly, as shown in Fig. 7. We therefore advise not to consume all the bandwidth during migration; doing so might lead to network instability and degrade the performance quality.

Energy consumption. During the experiments, we measure the energy consumed when the speed_limit is 10, 30 and 100 MB/s, respectively.


FIGURE 8. Power vs speed limit at destination host.

FIGURE 9. Power vs speed limit at source host.

The results are shown in Figs 8 and 9, from which we can see that increasing the speed increases the power consumption of the hosts, while decreasing the speed results in excess energy consumption.

(ii) Impact of TCP_Buffer size

The analysis of the QEMU/KVM source code reveals that the TCP_Buffer size affects the control of the iteration loop, which in turn affects the entire migration process. We have also conducted experiments to study the impact of the TCP_Buffer size on the migration process.

Migration time and downtime. From Figs 10 and 11, it can be observed that with the increase in the TCP_Buffer size, the total migration time reduces and reaches a stable value, i.e. the migration lasts ∼34 s when the TCP_Buffer size is >6 MB. The downtime shows a similar trend, that is, the downtime lasts for 450 ms when the TCP_Buffer size is above 6 MB.

Network throughput. In the previous experiment, we saw the stabilization of the curves, as shown in Figs 10 and 11.

FIGURE 10. Migration time vs TCP_Buffer size.

FIGURE 11. Downtime vs TCP_Buffer size.

To explore this behavior, we vary the TCP_Buffer size and examine the throughput in each case.

Though we set the same speed limit, the throughput varies, as shown in Figs 12 and 13. So we conclude that the throughput is also related to the TCP_Buffer size.

From Figs 6 and 7, we know that the throughput is related to the speed limit. A collective illustration of the TCP_Buffer size and the speed limit is presented in Figs 14 and 15. From all these figures, we conclude that the throughput is decided by both the TCP_Buffer size and the speed limit.

Energy consumption. We then compare the energy consumption when the TCP_Buffer size is 3 and 16 MB. Figures 16 and 17, respectively, show the energy consumed on the destination and the source host. We can see that when the TCP_Buffer is 3 MB, the power on the destination host is ∼58 W and the migration lasts for ∼60 s.


FIGURE 12. Throughput on different TCP_Buffer sizes.

FIGURE 13. Throughput on different TCP_Buffer sizes.

When the TCP_Buffer is 16 MB, the power is ∼60 W but the migration latency is only 20 s. The results show that the latter setting consumes less energy even though its power level is higher, and a similar phenomenon is also seen on the source host. Sometimes, with a larger TCP_Buffer size, lower energy consumption can be achieved even at a higher power level.

(iii) Impact of max downtime

Max downtime is one of the key control metrics that allows the live migration to enter the stop-and-copy phase, and thus it has a key influence on the whole migration process. We change the max downtime from 30 to 3000 ms and study how the migration time and the downtime vary.

As we can see from Figs 18 and 19, the migration time and the downtime are both related to the max downtime.

FIGURE 14. Migration time vs speed limit on different TCP_Buffer sizes.

FIGURE 15. Downtime vs speed limit on different TCP_Buffer sizes.

FIGURE 16. Power vs different TCP_Buffer sizes on destination host.


FIGURE 17. Power vs different TCP_Buffer sizes on source host.

FIGURE 18. Migration time vs max downtime.

FIGURE 19. Downtime vs max downtime.

FIGURE 20. Power vs max downtime at destination host.

FIGURE 21. Power vs max downtime at source host.

Though we know that the migration time decreases with the increase in the max downtime, only a minor change in the migration time (from 32.8 to 34.6 s) is evident from Fig. 18.

In order to obtain a sharp change in the migration time, the max downtime would have to be increased considerably. But this practice is against the policies of the SLA and cannot be accepted. From Figs 20 and 21, we also conclude that changing the max downtime has only a small impact on the power consumption of both the source and the destination hosts during the migration process. However, a shorter max downtime leads to a longer migration latency and thus causes more energy consumption.

(iv) Impact of workload

The workload running on the VM also has a great impact on the migration metrics. In the experiments, we first use our own program to study how the workload affects the performance (Figs 22 and 23). The program allocates a certain amount of memory and sets random bytes in it to simulate the memory overheads.


FIGURE 22. Migration time vs workload.

FIGURE 23. Downtime vs workload.

Besides, we use Phoronix [18], a suite consisting of many Linux open source benchmarks, to imitate the workload.

For further analysis, we choose some typical benchmarks to simulate the actual situation. The experimental conditions are listed in Table 3, and the adopted benchmarks and the results are listed in Table 4.

From Table 4, we note that imagemagic generates the most data at run time and writes most of the memory, while gzip generates the least amount of data. The downtime of gzip is the longest, indicating that a large number of dirty pages remained to be transferred when entering the stop-and-copy stage. The downtime of tiobench is short, indicating a small number of dirty pages.

TABLE 3. Experimental conditions of benchmark.

VM memory | Max downtime | Bandwidth | Network
2 GB | 30 ms (default) | 32 MB/s (default) | GigE

TABLE 4. Benchmark and results.

Benchmark | apache | linux-kernel | gzip | tiobench | imagemagic
Migration time (s) | 63.94 | 110.84 | 12.83 | 21.65 | 171.34
Downtime (ms) | 41.73 | 34.22 | 86.22 | 31.32 | 36.6

TABLE 5. Experimental conditions of bandwidth revise.

VM memory | Max downtime | Bandwidth | Network
2 GB | 30 ms (default) | 32 MB/s (default) | GigE

FIGURE 24. Workload of different speed limits.

In addition, we also choose memcached to carry out a load test. We run a memcached server on a VM and carry out memory operations by sending read and write requests from the physical host. We use the amount of data sent out each time and the remaining dirty pages to calculate the generation rate of the current dirty data, since the size of the load cannot be quantified directly. In reality, the VMs waiting to be migrated always have a specific workload running on them, and the energy consumption should be considered case by case rather than only for the whole occasion.

Figure 24 shows a part of the curve of the migration process. This figure shows the memory dirty pages accumulated in each round. It can be found that the accumulated workload is lower when the bandwidth is higher.


FIGURE 25. Performance vs different VM memory.

The workload is 4 and 1.6 MB when the speed limit is 20 and 50 MB/s, respectively. This indicates that more dirty pages have been sent out to the destination and that the number of remaining dirty pages reduces per unit time. This also highlights the importance of the speed limit to the migration performance. The lower the bandwidth, the bigger the accumulated load, which increases the negative effect on the migration performance.

(v) Impact of VM memory

We set the VM memory size to 512, 1024, 1536, 2048 and 2560 MB, respectively (Fig. 25). As expected, we find that the migration time increases with the VM memory size. However, the downtime is related to the speed limit and to the data remaining in the last iteration, and shows no direct relation to the VM memory size.

We also measure the energy consumed when the VM memory is 2 and 1 GB, with all the other conditions remaining the same. From Figs 26 and 27, we conclude that increasing the VM memory size does not cause the power to rise, but a larger VM memory size leads to more energy consumption.

In the next section, we simulate different network environments as a part of our empirical study and interpret the obtained results. We have offered a general analysis method which in fact can be deployed in real application environments.

3. iMIG MODEL

In this section, we set up a model to describe the relation between the parameters and their corresponding impacts on the migration performance, as described in Section 2. The notation definitions are given in Table 1.

FIGURE 26. Power vs different VM memory at destination host.

FIGURE 27. Power vs different VM memory at source host.

3.1. Static model of Tmig

During the whole migration process, the amount of dirty data generated can be calculated as ω × (Tmig − Tdown). But Tdown ≪ Tmig, so Tdown can be ignored. The total amount of data that needs to be transferred is ω × Tmig + M. Then we obtain

Tmig = (ω × Tmig + M) / B, (2)

where B = min{speed_limit, tcp_buffer × 10} and ω ≤ B. According to the code analysis and the experimental results in Section 2, we know that not all the parameters are independent of each other (for example, the speed limit and the TCP_Buffer). When the TCP_Buffer size is less than 0.1 times the speed_limit, each round sends data consistent with the TCP_Buffer size approaching its maximum; but when the TCP_Buffer size is at its maximum, the data transferred in each round is related to the speed_limit. Based on this observation, we obtain the formula for B. To guarantee a successful migration, the workload should be less than the transfer speed.

This is a static model. We can use Equation (2) to predict Tmig when we know the workload, the VM memory, the tcp_buffer and the speed_limit before conducting the migration.
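As an illustration, Equation (2) can be rearranged to Tmig = M / (B − ω) and evaluated before the migration starts. The sketch below does this for the default setting of Table 2; the MB-based units and the use of the 0.1 workload coefficient introduced in Section 3.2 are our own choices for the example, not constants of QEMU/KVM.

```c
#include <stdio.h>

/* All rates in MB/s, memory in MB, result in seconds.  Returns -1 when the
 * (revised) dirty rate is not below the effective transfer speed B. */
double predict_tmig(double mem_mb, double workload_mbps,
                    double speed_limit_mbps, double tcp_buffer_mb)
{
    double b = speed_limit_mbps < tcp_buffer_mb * 10.0
                 ? speed_limit_mbps : tcp_buffer_mb * 10.0;   /* B = min{speed_limit, tcp_buffer*10} */
    double w = 0.1 * workload_mbps;   /* revised dirty rate (coefficient 0.1, Section 3.2) */
    if (w >= b)
        return -1.0;                  /* migration would not converge */
    return mem_mb / (b - w);          /* T_mig = M / (B - w) */
}

int main(void)
{
    /* Example with the defaults of Table 2: 2 GB memory, 20 MB/s workload,
     * 50 MB/s speed limit, 16 MB TCP buffer. */
    printf("predicted T_mig = %.1f s\n", predict_tmig(2048, 20, 50, 16));
    return 0;
}
```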

3.2. Dynamic model of Tdown

Tdown is also a key metric to describe the migration performance, but it is difficult to predict before the migration actually starts, so we use a dynamic model. We use ΔMi to denote the dirty data generated in the ith round and ΔMiL to represent the remaining dirty data in the ith round:

ΔMi = ω × ti,    ΔMiL = ω × ti−1 − B × ti,

where B = min{speed_limit, tcp_buffer × 10}.

After N − 1 rounds, the total dirty data that is left can be calculated as

(M − B × t1) + (ω × t1 − B × t2) + · · · + (ω × tN−2 − B × tN−1) = M + ω × ∑(i=1..N−2) ti − B × ∑(i=1..N−1) ti.

Then, we obtain the data that will be sent in the Nth round:

M + ω × ∑(i=1..N−2) ti − B × ∑(i=1..N−1) ti + ω × tN−1 = M + (ω − B) × ∑(i=1..N−1) ti.

The condition for the migration to enter the stop-and-copy stage is that the expected downtime calculated in the (N − 1)th round is less than the max downtime:

tN = (M + (ω − B) × ∑(i=1..N−1) ti) / B ≤ Tmax−down. (3)

We can use Equation (3) to predict the real Tdown during the migration. We calculate the theoretical values of the two models correspondingly. But there is still one problem: though our procedure allocates 20 MB/s, the rate of dirty data in a real migration is not as high as 20 MB/s, since some newly dirtied pages may overlap pages that have not yet been transferred. So we need to revise the value of the workload during the computation. Combining the results of many of our experiments, we add a coefficient and set its value to 0.1.
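The recurrence above can also be iterated numerically to obtain a predicted downtime. The sketch below follows Equation (3): each round resends the data dirtied during the previous round, and the loop stops once the expected downtime falls below the max downtime. The 0.1 coefficient is the calibration just described; the remaining values and units are illustrative.

```c
#include <stdio.h>

double predict_tdown_ms(double mem_mb, double workload_mbps,
                        double b_mbps, double max_down_ms, int max_rounds)
{
    double w = 0.1 * workload_mbps;   /* revised dirty rate */
    double to_send = mem_mb;          /* the first round sends the whole memory */
    for (int i = 0; i < max_rounds; i++) {
        double t_i = to_send / b_mbps;            /* duration of this round (s) */
        double dirtied = w * t_i;                 /* data dirtied during the round */
        double expected_down_ms = dirtied / b_mbps * 1000.0;
        if (expected_down_ms <= max_down_ms)
            return expected_down_ms;              /* stop-and-copy is entered */
        to_send = dirtied;                        /* next round resends the dirty data */
    }
    return -1.0;                                  /* did not converge within max_rounds */
}

int main(void)
{
    /* Defaults of Table 2, with B limited by a 50 MB/s speed limit. */
    printf("predicted T_down = %.1f ms\n",
           predict_tdown_ms(2048, 20, 50, 1000, 100));
    return 0;
}
```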

As we can see from Fig. 28, our model predicts Tmig effectively. We also calculate Tdown with model 2, and find that the model generally gives a higher value than the experimental results shown in Fig. 29.

FIGURE 28. Tmig of experiments and model 1.

FIGURE 29. Tdown of experiments and model 2.

This is because the dirty page generation rate may vary violently during the migration and is difficult to predict in (3), thus causing the variation in Tdown.

We have carefully built a static and a dynamic model, respectively, to analyze the migration process from different perspectives. Our models provide a clear interpretation of how to analyze the relationships between the configuration parameters, as well as strategies for predicting the migration time, which is an important index for evaluating the migration performance.

4. IMPLEMENTATION OF IMIG

In this section, we introduce the specific implementation of iMIG.


4.1. A Bug in QEMU/KVM

The network environment (Table 5) is a Gigabit network (1000 Mb/s), so the network speed should be ∼100 MB/s. However, the result shows that the real-time network bandwidth computed by QEMU/KVM is 1000 MB/s, which is obviously wrong. This causes a wrong prediction of the expected downtime for the rest of the migration, leading to an inaccurate migration phase transition, and ultimately makes the actual downtime longer than the max downtime.

In Table 6, bytes stands for the mean amount of data transferred in each iteration, iterate count represents the number of times the iterative copy occurs, t indicates the mean time period of every round, downtime is the real downtime and migration time is the total migration time in the real environment.

From our observations, we draw the following conclusions:

Total_Bytes_Transferred = bytes × iterate_count = 1528 MB ≈ mem, (4)

Total_Migration_Time = t × iterate_count = 1 s ≪ migration_time. (5)

The sum of the data transferred is close to the VM memory size, but the total migration time, calculated by multiplying the time interval of each iteration with the number of iterations, is far less than the actual time. Thus, the wrong calculation of t leads to an incorrect throughput. In QEMU/KVM, the function ram_save_live is called by qemu_savevm_state_iterate, and the latter is repeatedly invoked to realize the iterative transmission. So we acquire the time before entering the function qemu_savevm_state_iterate, and thus the time interval between two iterations is computed. This can be used to calculate the throughput, and the procedure is shown in Fig. 30.
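A minimal sketch of the corrected time accounting is shown below. It is not the actual patch applied to qemu-kvm-0.12.5; the helper names are hypothetical, and the point is simply that the timestamp is taken before each entry into the iterative save step, so the full interval between two successive iterations, rather than only the time spent inside ram_save_live, is charged to the round.

```c
#include <stdint.h>
#include <time.h>

static uint64_t now_ms(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (uint64_t)ts.tv_sec * 1000u + ts.tv_nsec / 1000000u;
}

static uint64_t prev_entry_ms;   /* time of the previous entry into the iterate step */

/* Call this right before qemu_savevm_state_iterate() is entered. */
uint64_t round_interval_ms(void)
{
    uint64_t now = now_ms();
    uint64_t dt  = prev_entry_ms ? now - prev_entry_ms : 0;
    prev_entry_ms = now;
    return dt;                   /* full interval between two iterations */
}

/* Throughput of the last round in MB/s, from the interval and the bytes sent. */
double round_throughput(uint64_t bytes_sent, uint64_t interval_ms)
{
    if (interval_ms == 0)
        return 0.0;
    return bytes_sent / 1048576.0 / (interval_ms / 1000.0);
}
```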

Total_Migration_Time2 = t × iterate_count = 46 570 ms ≈ migration_time. (6)

TABLE 6. Results before bandwidth revise.

Bytes | Iterate count | t | Downtime | Migration time
581 035 | 2757 | 395 μs | 315 ms | 46 550 ms

After the modification, the experiment is repeated and the obtained results are shown in Table 7. The total migration time calculated this time is close to the actual value.

4.2. The optimization of existing migration

The wrong bandwidth computation later affects the value of the expected migration time, and this causes the VM to stop the migration in undesirable situations, leading to migration failures that in turn bring great losses to the users and the service providers. After our modifications, we obtain a more accurate value of the expected migration time. As a result, we further obtain a more accurate downtime value, thereby reducing migration failures.

We have identified the prediction method used in the current QEMU/KVM mechanism to be less effective. In this section, we use a simplified method, depending on the current circumstances, to predict the metrics more accurately. The constraints we added are listed below; a sketch that combines these checks is given after the list.

(i) The number of iterations

We set a threshold on the number of iterations. Once the iterate_count reaches this threshold, the migration turns into the stop-and-copy phase. This approach proves to promote the success ratio by compromising on the migration convergence.

(ii) Migration time Tmig vs T′mig

The sum of the VM memory and the newly generated data of each iteration gives the total amount of data that needs to be transferred; from this we estimate the overall time, denoted by T′mig. Meanwhile, we record the time of each iteration to calculate the throughput, and with the obtained data the real migration time, denoted by Tmig, is computed. If Tmig exceeds T′mig, the whole migration might take a long time. In this condition, we appropriately increase Tdown and turn the migration into the stop-and-copy phase during the early stages of migration in order to ensure that Tmig is not too long.

TABLE 7. Results after bandwidth revise.

Bytes | Iterate count | t | Downtime | Migration time
581 035 | 2760 | 16 874 μs | 32 ms | 46 600 ms

FIGURE 30. New time calculation method.


In this way, we adjust Tdown dynamically according to the current circumstances; thus, Tdown is limited to a minimum value less than Tmig.

(iii) Data generating and transferring

When the rate of data generation (denoted by data_rate) is higher than the speed_limit, a counter (denoted by over_count) is increased by one. The expected number of remaining iterations is denoted by remain_count. If

over_count / remain_count ≥ percent_border (a threshold),

the migration turns into the stop-and-copy phase.

(iv) Optimized calculation of expected downtime

During our experiments, we found that once the max downtime has been assigned, KVM strictly adopts this value to determine whether to enter the stop-and-copy phase, even when the expected downtime is very close to the max downtime. This leads to continuous iteration, thus extending the migration time. However, repeating the iterations is not essential and only extends the migration time, as shown in Fig. 1.

To solve this problem, we turn the migration into the stop-and-copy phase once either the expected downtime gets close to the max downtime or the downtime converges to a certain value. These two scenarios imply the presence of steady workloads that do not decrease with time.

(v) Optimization of bandwidth

During the migration, we calculate the throughput and compare it with the speed_limit. If the mean throughput during the migration is far less than the speed_limit, we then compare the speed_limit with the tcp_buffer. If the restriction imposed by the tcp_buffer is responsible for the poor throughput, the tcp_buffer is increased in order to complete the migration process.
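The sketch below combines constraints (i)–(iv) into a single stop-and-copy decision and shows the buffer adjustment of (v). It is a simplified illustration rather than the shipped iMIG patch; the struct layout, the 1.2 closeness factor in (iv) and the doubling step in (v) are assumptions made for the example.

```c
#include <stdbool.h>
#include <stdint.h>

struct migration_state {
    uint64_t iterate_count;      /* iterations performed so far */
    uint64_t iterate_threshold;  /* constraint (i): hard cap on iterations */
    double   t_mig_s;            /* measured migration time so far */
    double   t_mig_est_s;        /* T'_mig estimated from memory + dirty data */
    double   expected_down_ms;   /* expected downtime of the last round */
    double   max_down_ms;        /* configured max downtime */
    uint64_t over_count;         /* rounds where data_rate > speed_limit */
    uint64_t remain_count;       /* expected remaining rounds */
    double   percent_border;     /* constraint (iii) threshold, e.g. 0.5 */
};

/* Decide whether the migration should enter the stop-and-copy phase. */
bool should_stop_and_copy(const struct migration_state *s)
{
    /* (i) too many iterations: force convergence */
    if (s->iterate_count >= s->iterate_threshold)
        return true;
    /* (ii) measured time already exceeds the estimate: cut the migration short */
    if (s->t_mig_s > s->t_mig_est_s)
        return true;
    /* (iii) dirty data is generated faster than it can be sent too often */
    if (s->remain_count > 0 &&
        (double)s->over_count / (double)s->remain_count >= s->percent_border)
        return true;
    /* (iv) expected downtime already close to the configured max downtime
       (the 1.2 factor is an assumption for this sketch) */
    if (s->expected_down_ms <= 1.2 * s->max_down_ms)
        return true;
    return false;
}

/* (v) if the mean throughput is far below the speed limit and the TCP buffer
   is the bottleneck, enlarge it (the 0.5 cut-off and doubling are assumptions). */
double adjust_tcp_buffer_mb(double tcp_buffer_mb, double mean_throughput_mbps,
                            double speed_limit_mbps)
{
    if (mean_throughput_mbps < 0.5 * speed_limit_mbps &&
        tcp_buffer_mb * 10.0 < speed_limit_mbps)
        return tcp_buffer_mb * 2.0;
    return tcp_buffer_mb;
}
```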

4.3. Experiment results

From Tables 8 and 9, it is evident that with only a minor increase in the downtime, the migration time of iMIG is reduced considerably, thus proving the effectiveness of iMIG.

From Figs 31 and 32, both the expected downtime and the iterate count are optimized with iMIG.

We know that the migration lasts for a very long time when the network environment is tough, which in turn causes excess energy consumption.

TABLE 8. Experiment conditions of iMIG.

VM memory | Workload | Max downtime | Bandwidth | Network
2 GB | 20 MB/s | 30 ms (default) | 32 MB/s (default) | GigE

After our optimizations, it is clear from Figs 33 and 34 that iMIG reduces as much as half of the energy consumed under similar network conditions. Our optimization techniques can be deployed in real cloud environments, and in the future we plan to extend our study by implementing our techniques in practice with cloud service providers.

TABLE 9. Experiment results of iMIG.

Scheme | Migration time (ms) | Downtime (ms)
Original QEMU/KVM | 51 960 | 48.79
iMIG | 23 620 | 117.83

FIGURE 31. Expected downtime of iMIG.

FIGURE 32. Iterate count of iMIG.


FIGURE 33. Comparison between iMIG and the original algorithm at destination host.

FIGURE 34. Comparison between iMIG and the original algorithm at source host.

5. RELATED WORKS

Live migration optimization. Clark et al. [6] proposed pre-copy live migration on the Xen platform. They first transfer the memory pages iteratively, then pause the source VM in the final phase in order to send the remaining pages, and finally resume the execution. However, in scenarios with a low bandwidth or a heavy workload, the pre-copy phase might fail, since a larger number of dirty pages requires transmission. Hines et al. [7] proposed post-copy live migration. They first transfer the VM processor state and resume the VM on the target host, and only then begin to copy the VM memory contents. The post-copy strategy is advantageous as it ensures the completion of the migration process with fewer page transfers, thus improving the performance. Liu et al. [8] introduced the ReVirt [19] migration, based on a full system trace and replay. They transfer the execution trace logs from the source host to the target host and then replay them on the target host. This approach reduces both the migration downtime and the network bandwidth consumption. To enhance the usability of their HiTrust service, Li et al. [20] proposed an adaptive trust negotiation strategy which gave us much inspiration. Cui et al. [21] presented a one-to-many migration method called VMScatter, which can simultaneously migrate VMs from one host to many others with a shorter migration time. Nguyen et al. [22] introduced a strategy to limit the number of slots occupied during the migration process in a data center. Though a lot of research works have been carried out on live migration, pre-copy is the schema most widely used owing to its ease of use.

Performance evaluation. Jin et al. [13] improved the performance of live VM migration with adaptive memory compression. This approach can decrease the amount of data transferred and balance the performance and the cost of VM migration. Their experiments demonstrate that it can reduce the downtime, the migration time and the total data transferred. Moghaddam et al. [23] proposed a method for memory transfer; they decreased the downtime by selecting memory pages based on a probability density function of memory changes. Jin et al. [24] presented a method using CPU scheduling; the application downtime can be reduced by up to 88% with an acceptable overhead. Song et al. [25] introduced parallelizing live migration, evaluated their method on Xen and KVM, and the results show that their approach can accelerate live VM migration and decrease the downtime. Sallam et al. [26] proposed a multi-objective migration policy which can direct the migration progress, and their experiments show that the policy can control the system performance.

A number of studies [23, 24, 27] have been proposed to improve the performance of live VM migration. But to the best of our knowledge, they all focus on reducing the amount of data transferred in order to improve the migration metrics, and none of them has focused on the migration environment. We consider the parameters affecting the migration itself to predict the time well before the migration actually starts. We also optimize the existing KVM live migration by dynamically changing these parameters to adapt to the environment during the migration.

Energy-saving computing. Jin et al. [28] investigated how virtualization influences the energy usage in servers under different workloads. Li et al. [2] presented an energy-aware algorithm to enable dynamic live placement of applications. Liu et al. [29] proposed an energy model of VM migration and found that the energy consumption is linearly related to the network traffic. Li et al. [30] designed an architecture called CyberGuarder and measured its energy-consumption overhead with four benchmark applications. Yang et al. [31] analyzed the negative impact of performance interference on energy efficiency after VM co-allocation.


6. CONCLUSIONS

Live migration of VMs is now widely used in the data centers of cloud infrastructures. Still, defects are witnessed in live migration, such as the lack of adaptive ability to make way for green computing environments. To solve this problem, we designed iMIG to predict the migration time and adopted a dynamic mechanism to improve the migration performance and to reduce the energy consumption. To achieve this, we analyzed the parameters that affect the performance of live migration and the energy consumption by conducting comprehensive experiments. We evaluated the impacts of the speed limit, TCP_Buffer size, max downtime, VM memory and workload, and we also developed an analytical model to characterize the impacts of all these parameters on the live migration performance. During our experiments, we found that the bandwidth calculation in QEMU/KVM live migration is not accurate, and to overcome this issue we adopted a few modifications in the time recording mechanism. Finally, we implemented a self-adaptive approach to optimize the existing KVM live migration mechanism. We proved the efficiency of our algorithm in improving the migration performance, supported by our experimental results. In the future, we plan to investigate techniques for measuring the fluctuating workload during the migration process to achieve more accurate predictions of the migration time and the downtime.

ACKNOWLEDGEMENTS

We thank the anonymous reviewers for their valuable comments and help in improving this paper.

FUNDING

This work is supported by the 973 Program (No. 2011CB302602), NSFC Programs (Nos. 91118008, 61202424), China MOST grant (No. 2012BAH46B04), 863 project (No. 2013AA01A213), New Century Excellent Talents in University 2010 and SKLSDE-2012ZX-21.

REFERENCES

[1] UCB/EECS-2009-28 (2009) Above the Clouds: A Berkeley View of Cloud Computing. EECS Department, University of California, Berkeley, USA.

[2] Li, B., Li, J., Huai, J., Wo, T., Li, Q. and Zhong, L. (2009) EnaCloud: An Energy-Saving Application Live Placement Approach for Cloud Computing Environments. Proc. IEEE Int. Conf. Cloud Computing, CLOUD 2009, Bangalore, India, September 21–25, pp. 17–24. IEEE Computer Society Press, USA.

[3] Grit, L., Irwin, D., Yumerefendi, A. and Chase, J. (2006) Virtual Machine Hosting for Networked Clusters: Building the Foundations for 'Autonomic' Orchestration. 1st Int. Workshop on Virtualization Technology in Distributed Computing (VTDC 2006), Tampa, FL, November 17, p. 7. IEEE Computer Society Press, USA.

[4] Meng, X., Pappas, V. and Zhang, L. (2010) Improving the Scalability of Data Center Networks with Traffic-Aware Virtual Machine Placement. Proc. INFOCOM, San Diego, CA, March 14–19, pp. 1154–1162. IEEE Computer Society Press, USA.

[5] Wood, T., Ramakrishnan, K.K., Shenoy, P. and Van der Merwe, J. (2011) CloudNet: Dynamic Pooling of Cloud Resources by Live WAN Migration of Virtual Machines. VEE'11 Proc. 7th ACM SIGPLAN/SIGOPS Int. Conf. Virtual Execution Environments, Newport Beach, CA, March 9–11, pp. 121–132. ACM Press, USA.

[6] Clark, C., Fraser, K., Hand, S., Hansen, J.G., Jul, E., Limpach, C., Ian, P. and Warfield, A. (2005) Live Migration of Virtual Machines. NSDI'05 Proc. 2nd Conf. Symp. Networked Systems Design & Implementation, Boston, MA, May 2–4, pp. 273–286. USENIX Association, Berkeley, USA.

[7] Hines, M.R. and Gopalan, K. (2009) Post-Copy Based Live Virtual Machine Migration Using Adaptive Pre-paging and Dynamic Self-Ballooning. VEE'09 Proc. 2009 ACM SIGPLAN/SIGOPS Int. Conf. Virtual Execution Environments, Washington, DC, March 11–13, pp. 51–60. ACM Press, USA.

[8] Liu, H., Jin, H., Liao, X., Hu, L. and Yu, C. (2009) Live Migration of Virtual Machine Based on Full System Trace and Replay. HPDC'09 Proc. 18th ACM Int. Symp. High Performance Distributed Computing, Munich, Germany, June 11–13, pp. 101–110. ACM Press, USA.

[9] Deshpande, U., Wang, X. and Gopalan, K. (2011) Live Gang Migration of Virtual Machines. Proc. 20th Int. Symp. High Performance Distributed Computing, San Jose, CA, June 8–11, pp. 135–146. ACM Press, USA.

[10] Nelson, M., Lim, B.H. and Hutchins, G. (2005) Fast Transparent Migration for Virtual Machines. Proc. USENIX Annual Technical Conf., Anaheim, CA, April 10–15, pp. 25–25. USENIX Association, Berkeley, USA.

[11] Bradford, R., Kotsovinos, E., Feldmann, A. and Schiöberg, H. (2007) Live Wide-Area Migration of Virtual Machines Including Local Persistent State. VEE'07 Proc. 3rd Int. Conf. Virtual Execution Environments, San Diego, CA, June 13–15, pp. 169–179. ACM Press, USA.

[12] Svärd, P., Hudzia, B., Tordsson, J. and Elmroth, E. (2011) Evaluation of Delta Compression Techniques for Efficient Live Migration of Large Virtual Machines. VEE'11 Proc. 7th ACM SIGPLAN/SIGOPS Int. Conf. Virtual Execution Environments, Newport Beach, CA, March 9–11, pp. 111–120. ACM Press, USA.

[13] Jin, H., Deng, L., Wu, S., Shi, X. and Pan, X. (2009) Live Virtual Machine Migration with Adaptive Memory Compression. IEEE Int. Conf. Cluster Computing and Workshops, New Orleans, LA, August 31–September 4, pp. 1–10. IEEE Computer Society Press, USA.

[14] Kivity, A., Kamay, Y., Laor, D., Lublin, U. and Liguori, A. (2007) KVM: The Linux Virtual Machine Monitor. Proc. Linux Symp., Ottawa, Canada, June 27–30, pp. 225–230. Linux Symposium, Canada.

[15] Lublin, U. and Liguori, A. (2007) KVM Live Migration. KVM Forum, Tucson, Arizona, August 29–31. KVM Forum.

[16] KVM: Main Page. http://www.linux-kvm.org/page/Main_Page.

[17] Sandberg, R., Goldberg, D., Kleiman, S., Walsh, D. and Lyon, B. (1985) Design and Implementation of the Sun Network Filesystem. Proc. Summer 1985 USENIX Conf., Portland, OR, June 1985, pp. 119–130. USENIX, USA.

[18] Phoronix. http://www.phoronix.com/scan.php?page=home.

[19] Dunlap, G.W., King, S.T., Cinar, S., Basrai, M.A. and Chen, P.M. (2002) ReVirt: Enabling Intrusion Analysis Through Virtual-Machine Logging and Replay. ACM SIGOPS Operating Systems Review—OSDI'02: Proc. 5th Symp. Operating Systems Design and Implementation, Boston, MA, December 9–11, pp. 211–224. ACM Press, USA.

[20] Li, J., Liu, X., Liu, L., Sun, D. and Li, B. (2013) HiTrust: building cross-organizational trust relationship based on a hybrid negotiation tree. Telecommun. Syst., 52, 1353–1365.

[21] Cui, L., Li, J., Li, B., Huai, J., Hu, C., Wo, T., Hussain, A. and Liu, L. (2013) VMScatter: Migrate Virtual Machines to Many Hosts. Proc. 9th ACM SIGPLAN/SIGOPS Int. Conf. Virtual Execution Environments, Houston, TX, USA, March 16–17, pp. 63–72. ACM Press, USA.

[22] Nguyen, T.D., Nguyen, A.T., Nguyen, M.D., Van Nguyen, M. and Huh, E.N. (2014) An improvement of resource allocation for migration process in cloud environment. Comput. J., 57, 308–318.

[23] Moghaddam, F.F. and Cheriet, M. (2010) Decreasing Live Virtual Machine Migration Down-Time Using a Memory Page Selection Based on Memory Change PDF. Proc. 2010 Int. Conf. Networking, Sensing and Control (ICNSC), Chicago, IL, April 11–13, pp. 355–359. IEEE Computer Society Press, USA.

[24] Jin, H., Gao, W., Wu, S., Shi, X., Wu, X. and Zhou, F. (2011) Optimizing the live migration of virtual machine by CPU scheduling. J. Network Comput. Appl., 34, 1088–1096.

[25] Song, X., Shi, J., Liu, R., Yang, J. and Chen, H. (2013) Parallelizing Live Migration of Virtual Machines. VEE'13 Proc. 9th ACM SIGPLAN/SIGOPS Int. Conf. Virtual Execution Environments, Houston, TX, March 16–17, pp. 85–96. ACM Press, USA.

[26] Sallam, A. and Li, K. (2014) A multi-objective virtual machine migration policy in cloud systems. Comput. J., 57, 195–204.

[27] Zhao, M. and Figueiredo, R.J. (2007) Experimental Study of Virtual Machine Migration in Support of Reservation of Cluster Resources. VTDC'07 Proc. 2nd Int. Workshop on Virtualization Technology in Distributed Computing, Reno, Nevada, November 12, pp. 1–8. ACM Press, USA.

[28] Jin, Y., Wen, Y., Chen, Q. and Zhu, Z. (2013) An empirical investigation of the impact of server virtualization on energy efficiency for green data center. Comput. J., 56, 977–990.

[29] Liu, H., Jin, H., Xu, C.Z. and Liao, X. (2011) Performance and Energy Modeling for Live Migration of Virtual Machines. HPDC'11 Proc. 20th Int. Symp. High Performance Distributed Computing, San Jose, CA, June 8–11, pp. 171–182. ACM Press, USA.

[30] Li, J., Li, B., Wo, T., Hu, C., Huai, J., Liu, L. and Lam, K.P. (2012) CyberGuarder: a virtualization security assurance architecture for green cloud computing. Fut. Gener. Comput. Syst., 28, 379–390.

[31] Yang, R., Moreno, I.S., Xu, J. and Wo, T. (2013) An Analysis of Performance Interference Effects on Energy-Efficiency of Virtualized Cloud Environments. Proc. 5th IEEE Int. Conf. Cloud Computing Technology and Science (CloudCom), Bristol, UK, December 2–5, pp. 113–119. IEEE Computer Society, USA.