CONFERENCE PAPER · JUNE 2015

Energy-Saving Adaptive Computing and Traffic Engineering for Real-Time-Service Data Centers

Mohammad Shojafar∗, Nicola Cordeschi∗, Danilo Amendola∗ and Enzo Baccarelli∗

∗Sapienza University of Rome, Rome, Italy

Email: {Shojafar, Cordeschi, Amendola, Enzo.Baccarelli}@diet.uniromal.it

Abstract—In this paper, we propose a traffic engineering-based adaptive approach to dynamically reconfigure the computing-plus-communication resources of networked data centers which support, in real time, the service requirements of mobile clients connected by TCP/IP energy-limited wireless backbones. The goal is to maximize the energy efficiency while meeting hard QoS requirements on the delivered transmission rate and processing delay. In order to cope with the (possibly unpredictable) fluctuations of the offered workload, the proposed optimal cross-layer resource controller is adaptive. It jointly performs: i) the balanced control and dispatching of the admitted workload; ii) the dynamic reconfiguration of the Virtual Machines (VMs) instantiated onto the parallel computing platform at the data center; and iii) the rate control of the traffic injected into the wireless backbone for delivering the service to the requesting clients. Our experimental results show that the proposed technique reduces the energy consumption of servers by 25% on average over the entire data center, compared with state-of-the-art approaches.

Keywords—TCP/IP connections; networked data center; energy-efficiency; adaptive resource management.

I. INTRODUCTION AND BACKGROUND

The forecast development of adaptive ubiquitous applications (such as iCloud) for highly parallel wireless (possibly mobile) processing platforms demands a novel design approach that integrates both computing and communication aspects and is capable of coping effectively with the inherently stochastic and time-varying nature of the wireless domain. This novel approach should be characterized by a tight interaction between two still distinct engineering fields, namely, Parallel Computing [1] and Wireless Mobile Communication [2]. From a communication perspective, over 50% of current wireless traffic leverages TCP/IP architectures [3]. One of the most challenging tasks in data center technology is resource and energy management for the applications [4]. Therefore, from an application-centered perspective, the computing platforms hosted on Networked Data Centers (NetDCs) need to exchange information with the underlying TCP/IP wireless communication infrastructures, in order to provide QoS guarantees to (possibly real-time) computing-intensive multimedia applications over energy-limited, congestion-prone TCP/IP mobile connections. In a nutshell, the goal is to minimize the overall energy consumed by the computing-plus-communication resources in NetDCs.

In this paper, we propose a new approach to decrease the energy consumption induced by the computing, communication and reconfiguration costs of virtualized clouds. Our approach takes dynamic load balancing into account, because we consider the state of the server for the next workload that enters the admission control system. For resource management, we resort to online job decomposition and scheduling (i.e., jobs are scheduled according to a global cost function that accounts simultaneously for energy saving and running time in the VMs). The energy-saving management of the computing resources in green Clouds is the specific topic of some quite recent contributions [5], [6], [7], [8]. In particular, [5] proposes a computing architecture for green Clouds that exploits Dynamic Voltage and Frequency Scaling (DVFS) techniques to increase the computing energy efficiency. In [6], the authors present a green Cloud architecture for reducing the computing-induced energy consumption, while attempting to meet the performance limits requested by the clients. The numerical results reported in [6] show, indeed, that the proposed green Cloud architecture is capable of saving up to 27% of the computing energy. The current state of the art on the exploitation of DVFS-based techniques for attaining the green Cloud paradigm is well summarized by the (recent) contribution in [7], which deals with the energy-aware optimized scheduling of jobs in computing clusters equipped with DVFS-enabled processors. Finally, the authors in [8] use the Lyapunov optimization technique to design an algorithm for joint job admission control, routing, and resource allocation in a virtualized data center.

Overall, although the target computing platforms considered in the aforementioned references are parallel and managed by Virtual Machine Monitors (VMMs), their frameworks differ from the one considered in our paper in three main aspects. First, the temporary input/output buffering of the arriving tasks for efficiently coping with both workload peaks and networking congestion is not considered. Second, no QoS guarantees are provided by the computing architectures considered in [5], [6], [7], [8] in terms of minimum processing rate and maximum allowed processing delay. Third, the presence of wireless backbones is not considered in [5], [6], [7], [8], and the effects induced by client mobility are not addressed.

Turning to the research area of Wireless Mobile Communication, a first research line focuses on the cross-layer analysis and optimization of TCP/IP traffic control mechanisms for single-antenna and multi-antenna mobile connections [9], [10], [11]. These contributions support the conclusion that an optimal control of the energy employed by the wireless transmission is an effective means to improve the resulting TCP goodput. However, this conclusion is partially offset by the fact that [9], [10], [11] neglect the computing aspects. An analogous conclusion holds for the works in [12] and [13], in which optimized schedulers are derived by exploiting nonlinear optimization and queuing theory. Specifically, the scheduler developed in [12] does not present adaptive capabilities. Finally, the scheduler in [13] does not account for the limitation on the energy budget available for transmission over the wireless backbone.

The paper is structured as follows. The considered model, the proposed method and the mathematical proofs are detailed in Section II. Simulation results are reported in Section III. Finally, Section IV summarizes the main results and outlines future research.

IEEE ICC 2015 - Workshop on Cloud Computing Systems, Networks, and Applications (CCSNA)

978-1-4673-6305-1/15/$31.00 ©2015 IEEE

II. MODEL AND SOLVING APPROACH

The goal of this section is twofold. First, we define the main elements involved in data center problems. Second, we introduce a mathematical optimization problem that captures the main issues of several energy minimization problems, and we solve the resulting nonconvex problem in closed form.

A. Basic Definitions

In this subsection, we describe the building blocks exploited in our approach. We apply competitive analysis [14] to analyze the subproblem of energy- and performance-efficient dynamic load balancing/VM consolidation and data center energy minimization.

Fig. 1 reports the proposed platform for the parallel real-time processing of workloads. It is composed of multiple physical servers that host $M$ virtual machines (VMs), which are interconnected by a switched rate-adaptive Virtual LAN (VLAN) and are managed by a central controller. Formally speaking, the physical servers are equipped with multi-frequency CPUs. We model each multi-frequency CPU through a frequency range: the minimum frequency considered for each VM is denoted by $f_i^{min}$, and the maximum available frequency is denoted by $f_i^{max}$.

The targeted system is an Infrastructure-as-a-Service (IaaS) environment. Each computing node comprises $M$ heterogeneous VMs or CPUs that can work in the aforementioned frequency ranges, and each node has $M$ independent congestion-free half-duplex channels powered by $P_i^{net}$, $i \in \{1, \ldots, M\}$. We observe that, in order to limit the implementation cost, current data centers utilize off-the-shelf rackmount physical servers, which are interconnected by commodity Fast/Giga Ethernet switches. For the sake of clarity, we consider a single physical server (host) and $M$ VMs allocated to the host. In this paper, we consider a discrete-time model in which the slot (i.e., time-slot) length matches the timescale at which the data center can adjust its capacity. The workload arrives in each time slot (whose duration is fixed) and is processed immediately (i.e., in real time); no queue is considered for the workload entering or leaving the system.

Fig. 1: Model of the considered communication-plus-computing technological platform.

In the considered model, the wireless channel should be modeled at the level of the Transport layer. In particular, all the preparation and execution of the services take place in the adaptive load dispatcher at the application level, so that the wireless channel (which, in general, can be assumed to be multi-hop) is placed between the transport-layer interface of the dispatcher and the corresponding transport-layer interface of the (generally mobile) client. Considering the structure of the platform described above, we must formulate the problem of optimizing the resource allocation together with the wireless backbone model. In particular, it is necessary to model the following: i) the energy consumption for computing and VM consolidation, which we call computing and denote by $E_{CPU}$ (J); ii) the energy consumption for switching the VM frequencies, which we call reconfiguration and denote by $E_{Reconf}$ (J); iii) the energy consumption of the virtual-link reconfigurable LAN, which we call communication and denote by $E_{LAN}$ (J); and iv) the wireless channel, i.e., the transmitted energy and the traffic patterns, which we call end-to-end wireless backbone and denote by $E_W$ (J). To sum up,

$$E_{TOT} \triangleq E_{CPU} + E_{Reconf} + E_{LAN} + E_{W} \quad \text{(J)}. \tag{1}$$

We model these terms and minimize the overall energy consumption of the system, $E_{TOT}$.

According to the described model, at the beginning of each time slot, a new job of size $L_{tot}$ (bit) arrives at the input of the scheduler (i.e., the VMM) of Fig. 1. The input job is characterized by: i) the size $L_{tot}$ of the workload to be processed; ii) the maximum tolerated processing delay $T_t$; and iii) the job granularity, that is, the (integer-valued) maximum number $M_T \geq 1$ of independent parallel tasks embedded into the submitted job. In principle, each VM may be modeled as a virtual server that is capable of processing at the rate $f_i$ [15]. The VMM of Fig. 1 must carry out two main operations at run time, namely, virtual machine management and load balancing. Specifically, the goal of virtual machine management is to adaptively control the Virtualization Layer of Fig. 1. In particular, the set of the (aforementioned) VM attributes

$$\left\{\Delta, \; f_i^{max}(i), \; \Phi_i(\eta_i), \; E_i^{max}(i), \; i = 1, \ldots, M\right\}, \tag{2}$$

are dictated by the Virtualization Layer and are then passed to the VMM of Fig. 1. Furthermore, due to the real-time nature of the considered application scenario, the time allowed for the VM to fully process each submitted task is fixed in advance at $\Delta$ (s), regardless of the actual size $L$ of the task currently assigned to the VM. Also, $E_i^{max}(i)$ (J) is the per-job maximum energy consumed by VM($i$). By definition, the utilization factor $\eta$ of the VM equals $\eta \triangleq f_i / f_i^{max} \in [0, 1]$. Then, as in [7], let $E_i = E_i(f_i)$ (J) be the overall energy consumed by the VM to process a single task of duration $\Delta$ at the processing rate $f_i$, and let $E_i^{max} = E_i(f_i^{max})$ (J) be the corresponding maximum energy when the VM operates at the maximum processing rate $f_i^{max}$. Hence, by definition, the (dimensionless) ratio

$$\Phi(\eta) \triangleq \frac{E_i(f_i)}{E_i^{max}} = \Phi\left(\frac{f_i}{f_i^{max}}\right), \tag{3}$$

is the so-called Normalized Energy Consumption (NEC) of the considered VM [15]. From an analytical point of view, $\Phi(\eta) : [0, 1] \to [0, 1]$ is a function of the actual value $\eta$ of the utilization factor of the VM. Its analytical behavior depends on the specific features of the resource provisioning policy actually implemented by the VMM of Fig. 1. A quite common expression is the quadratic form of [7], [16], i.e., $\Phi(\eta) = \eta^2$. So, the computing cost of the system can be summarized as

$$E_{CPU} \triangleq \sum_{i=1}^{M} \Phi(\eta) E_i^{max} \equiv \sum_{i=1}^{M} \left(\frac{f_i}{f_i^{max}}\right)^2 E_i^{max}. \tag{4}$$
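As a minimal illustration of (3) and (4), the sketch below evaluates the computing energy under the quadratic NEC model. The function names and the two-VM example are hypothetical and are not taken from the paper; the values 60 J and 105 Mbit/s simply echo the simulation settings of Section III.

```python
from typing import Sequence


def normalized_energy(eta: float) -> float:
    """Quadratic NEC model of (3): Phi(eta) = eta**2 for eta in [0, 1]."""
    if not 0.0 <= eta <= 1.0:
        raise ValueError("utilization factor must lie in [0, 1]")
    return eta ** 2


def computing_energy(f: Sequence[float],
                     f_max: Sequence[float],
                     e_max: Sequence[float]) -> float:
    """Computing energy of (4): sum over the VMs of Phi(f_i / f_i_max) * E_i_max (J)."""
    return sum(normalized_energy(fi / fm) * em
               for fi, fm, em in zip(f, f_max, e_max))


# Hypothetical example: one VM at half speed, one VM at full speed.
energy = computing_energy(f=[52.5, 105.0],
                          f_max=[105.0, 105.0],
                          e_max=[60.0, 60.0])
print(energy)  # 0.25 * 60 + 1.0 * 60 = 75.0
```

Halving the processing rate of a VM cuts its computing energy to one quarter under this model, which is why spreading the load over slower VMs can pay off.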

It is up to the VMM to implement a suitable frequency-scaling policy, in order to allow the VMs to scale their processing rates $f_i$ up or down in real time at the minimum cost [17]. In this regard, we note that switching from the processing frequency $f_1$ to the processing frequency $f_2$ entails an energy cost of $\varepsilon(f_1; f_2)$ (J). Although the actual behavior of the function $\varepsilon(f_1; f_2)$ may depend on the adopted DVFS technique, any practical $\varepsilon(f_1; f_2)$ function typically retains the following general properties: i) it depends on the absolute frequency gap $|f_1 - f_2|$; ii) it vanishes at $f_1 = f_2$ and is non-decreasing in $|f_1 - f_2|$; and iii) it is jointly convex in $f_1, f_2$. A quite common practical model, which retains the aforementioned formal properties, is the following one:

$$E_{Reconf} \triangleq \sum_{i=1}^{M} \varepsilon(f_1; f_2) = \sum_{i=1}^{M} k_e (f_1 - f_2)^2 \quad \text{(J)}, \tag{5}$$

where $k_e$ (J/(Hz)$^2$) dictates the per-VM reconfiguration cost induced by a unit-size frequency switching. Typical values of $k_e$ for current reconfigurable virtualized computing platforms are limited to a few hundred $\mu$J per (MHz)$^2$ [16]. For the sake of concreteness, we directly adopt the quadratic model in (5). The generalization to the case of $\varepsilon(\cdot; \cdot)$ functions that meet the aforementioned (more general) analytical properties is, indeed, direct. For the communication cost, the Shannon-Hartley exponential formula

$$P_i^{net}(R_i) = \zeta_i \left(2^{R_i / W_i} - 1\right) \quad \text{(W)}, \tag{6}$$

with $\zeta_i \triangleq N_0^{(i)} W_i / g_i$, $i = 1, \ldots, M$, where $N_0^{(i)}$ (W/Hz) is the noise spectral power density, $W_i$ (Hz) is the transmission bandwidth and $g_i$ is the (non-negative) gain of the $i$-th link [18], is an instance of the power-rate functions of practical interest that meet the above assumptions. Therefore, since the corresponding one-way transmission delay equals $D_i = L_i / R_i$, the one-way communication energy $E_{LAN}(i)$ needed to sustain the $i$-th virtual link of Fig. 1 is $E_{LAN}(i) = P_i^{net}(R_i) (L_i / R_i)$, where $R_i$ (bit/s) is the communication rate of the $i$-th virtual link and $L_i$ (bit) is the workload (received job) assigned to VM($i$). For each physical node, the overall service time depends on the (one-way) delays $\{D_i, i = 1, \ldots, M\}$ introduced by the Virtual LAN (VLAN) and on the allowed per-task processing time $\Delta$. Specifically, since the $M$ virtual connections of Fig. 1 are typically activated in parallel, the overall two-way communication-plus-computing delay induced by the $i$-th connection of Fig. 1 equals $2 D_i + \Delta$, so that the hard constraint on the overall per-job execution time reads

$$\max_{1 \leq i \leq M} \{2 D_i\} + \Delta \leq T_t. \tag{7}$$
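The link model in (6) and the deadline constraint in (7) can be sketched as follows. The function names, the value of $\zeta_i$ and the example loads are assumptions made for illustration; loads are expressed in Mbit, rates in Mbit/s and bandwidths in MHz, so that $R_i / W_i$ is dimensionless.

```python
from typing import Sequence


def link_power(rate: float, bandwidth: float, zeta: float) -> float:
    """Power-rate function of (6): zeta * (2**(R / W) - 1)."""
    return zeta * (2.0 ** (rate / bandwidth) - 1.0)


def link_energy(load: float, rate: float, bandwidth: float, zeta: float) -> float:
    """One-way energy of a virtual link: P(R_i) times the delay D_i = L_i / R_i."""
    delay = load / rate
    return link_power(rate, bandwidth, zeta) * delay


def meets_deadline(loads: Sequence[float], rates: Sequence[float],
                   delta: float, t_max: float) -> bool:
    """Hard constraint of (7): max_i 2 * D_i + delta <= T_t."""
    worst_round_trip = max(2.0 * l / r for l, r in zip(loads, rates))
    return worst_round_trip + delta <= t_max


# Hypothetical link: 4 Mbit sent at 2 Mbit/s over 1 MHz, with zeta = 0.5 (assumed).
print(link_energy(load=4.0, rate=2.0, bandwidth=1.0, zeta=0.5))  # 0.5 * 3 * 2 = 3.0
print(meets_deadline([4.0, 4.0], [2.0, 2.0], delta=0.1, t_max=5.0))  # True (4.1 <= 5)
print(meets_deadline([4.0, 4.0], [1.0, 2.0], delta=0.1, t_max=5.0))  # False (8.1 > 5)
```

Lowering the rate reduces the transmit power exponentially but lengthens the delay, so the deadline in (7) is what prevents the rate from being pushed arbitrarily low.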

The wireless connection of Fig. 1 is understood to model all layers of the protocol stack up to the Transport layer. At the physical level, we consider a (typically multi-hop) connection affected by multiple-access interference, noise and fading (the latter is considered constant over a time slot, i.e., a block-faded physical channel operating in the steady-state condition). The resulting state $\sigma(t) \in \mathbb{R}_0^+$ of the overall end-to-end connection at the $t$-th slot is modeled as a non-negative real random variable. We assume that the state $\sigma(t)$ is known to the controller at the beginning of slot $t$ (this is necessary because the controller has to manage a per-slot resource allocation). Regarding the goodput offered by the wireless connection of Fig. 1 and the energy cost required for the transmission, the random variable $r(t)$ depends on both $E_W(t)$ and $\sigma(t)$ through the rate function $R_W(\cdot; \cdot)$, which measures the instantaneous goodput that the wireless connection is able to offer. In particular, we can write:

$$r(t) \triangleq R_W\left(E_W(t), \sigma(t)\right), \quad t \geq 0 \quad \text{(byte/slot)}, \tag{8}$$

where $R_W(\cdot; \cdot)$ in (8) is a non-negative time-invariant function, whose arguments ($E_W(t)$ and $\sigma(t)$) are also non-negative. It depends on multiple factors, including: i) the performance of the modulation and coding adopted at the physical layer; ii) the statistical characteristics of the fading and interference that affect the wireless channel; iii) loss phenomena at the MAC layer; iv) the statistical characteristics of the delays introduced by the Network layer; and v) the characteristics of the client, in terms of speed of movement. Having established the above, it remains to analyze in more detail the goodput that the TCP/IP end-to-end wireless connection is able to offer. We now state the assumptions made for the analysis of the steady-state goodput of a mobile TCP/IP connection with Rayleigh fading. The Network layer of the connection is assumed to be an unreliable best-effort IP connection. This assumption implies that the Network layer introduces packet losses and uncertain time-varying delays [9]. At the Transport layer, we adopt the TCP-Reno protocol with Congestion Avoidance [9]. In this model, the Transport layer implements the "Triple Duplicate Acknowledgment" (TDACK) technique to notify the loss of packets. For all subsequent considerations, we assume the following conditions: (1) the underlying physical channel is simultaneously subject to two types of fading, namely, Rayleigh and log-normal distributed fading, which we consider flat in the frequency domain and constant over (at least) the duration of a time slot; (2) the feedback channel used to carry the ACK messages from the client is reliable and delay-free. In addition, when the buffer of the MAC layer is saturated, new incoming frames are directly discarded and definitively considered lost. Finally, if the MAC layer of the client receives a frame incorrectly, the encapsulated segment is irreversibly declared lost and a TDACK message is sent back to the controller; (3) in order to limit the end-to-end transport delay, we do not implement any kind of fragmentation. The IP-based link (generally considered multi-hop) arising at the Network layer is therefore characterized by packet loss and random delay.

According to the analysis presented in [9], we can model the sequence $\{\Delta_{IP}(t) \in \mathbb{R}_0^+, t \geq 1\}$ of the packet delays (in multiples of the slot period) as an i.i.d. random sequence. Each random variable $\Delta_{IP}$ is uniformly distributed over the interval $[0, \Delta_{IP}^{max}]$, where $\Delta_{IP}^{max}$ (measured in multiples of the slot period) is the (known) maximum packet delay introduced by the IP layer. At this point, under the usual assumption that the segment loss rate $P_L(t)$ at the input of the Transport layer is limited to $10^{-2}$, accounting for Rayleigh-distributed fading gives

$$P_L(t) \approx \left[C + \frac{A}{C B^2} \Gamma(1; CB)\right] \frac{1}{z(t) E_W(t)}, \quad t \geq 1, \tag{9}$$

where $\Gamma(\cdot, \cdot)$ is the incomplete Gamma function, the positive constants $A$, $B$ and $C$ completely describe the error performance of the FEC system in Fig. 1 [10], $E_W(t)$ is the energy needed for transmitting over the wireless channel of Fig. 1 at slot $t$, and $z(t)$ accounts for mobility. The latter is modeled as a time-correlated log-distributed sequence $\{z(t) \in \mathbb{R}_0^+, t \geq 1\}$, as in [19]: $z(t) \triangleq a_0 10^{0.1 x(t)}$, $\forall t \geq 1$, where $a_0 \approx 0.9738$ assures that $E\{z(t)\} \equiv 1$ (J)$^{-1}$, and $\{x(t), t \geq 1\}$ is a time-correlated, stationary, zero-mean and unit-variance Markov random sequence whose probability density function is uniform over the interval $[-\sqrt{3}, \sqrt{3}]$ [19]. As a result, the goodput $R_W(t)$ (byte/slot) is given by the following formula:

$$R_W(t) = \left(\frac{3}{2b}\right)^{1/2} \frac{MSS}{RTT(t) \left(P_L(t)\right)^{1/2}}, \quad t \geq 1, \tag{10}$$

where $b = 2$, $MSS$ (byte) (Maximum Segment Size) is the maximum permitted size of the segment, and $RTT(t)$ is the average Round Trip Time (measured in multiples of the slot period). $RTT(t)$ is calculated iteratively using the following formula:

$$RTT(t) = 0.75\, RTT(t-1) + 0.25\, \Delta_{IP}(t), \quad t \geq 1, \quad RTT(0) = 0. \tag{11}$$
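Two ingredients of this channel model are easy to check numerically. The sketch below computes the normalization constant $a_0$ from the stated uniform marginal of $x(t)$ (only the marginal matters for the mean, so the time correlation is ignored here) and runs the RTT recursion in (11). The function names and the example delays are hypothetical.

```python
import math
from typing import List, Sequence


def unit_mean_constant() -> float:
    """Return a0 such that E{a0 * 10**(0.1 * x)} = 1 for x uniform on [-sqrt(3), sqrt(3)].

    For x uniform on [-h, h], E{exp(c * x)} = sinh(c * h) / (c * h),
    with c = 0.1 * ln(10) because 10**(0.1 * x) = exp(c * x).
    """
    c = 0.1 * math.log(10.0)
    h = math.sqrt(3.0)
    mean = math.sinh(c * h) / (c * h)
    return 1.0 / mean


def rtt_sequence(ip_delays: Sequence[float]) -> List[float]:
    """Recursion (11): RTT(t) = 0.75 * RTT(t-1) + 0.25 * Delta_IP(t), with RTT(0) = 0."""
    rtt = 0.0
    history = []
    for delay in ip_delays:
        rtt = 0.75 * rtt + 0.25 * delay
        history.append(rtt)
    return history


print(unit_mean_constant())      # close to the quoted a0 of about 0.974
print(rtt_sequence([1.0, 1.0]))  # [0.25, 0.4375]
```

Because the recursion starts from $RTT(0) = 0$, the first few values underestimate the delay; the estimate settles only after several slots.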

Finally, by making the appropriate substitutions (e.g., inserting (9) and (10) into (8)), we obtain the following analytical expression for the instantaneous goodput that the wireless TCP/IP connection can provide in the steady state:

$$r(t) = \sigma(t) \left(E_W(t)\right)^{1/2} \quad \text{(byte/slot)}, \tag{12}$$

where the state of the connection at slot $t$ is

$$\sigma(t) \triangleq \frac{K_0 \left(z(t)\right)^{1/2}}{RTT(t)}, \quad t \geq 1, \tag{13.1}$$

while the positive constant

$$K_0 \triangleq \frac{(3/2b)^{1/2} MSS}{\left(C + (A / (C B^2)) \Gamma(1; CB)\right)^{1/2}} \quad \text{(byte)} \tag{13.2}$$

describes the performance of the FEC-based error-recovery system implemented at the Physical layer of Fig. 1. To find the optimum energy consumption for the end-to-end wireless backbone, it suffices to invert $r(\cdot)$ in (12) and minimize $E_W(t)$ with respect to $r(t)$. The average workload entering the system is strictly correlated with the average goodput of the channel while the system is in the steady state. Therefore, the average goodput is known, and it is calculated from the average workload entering the system in each time slot. As a consequence, the optimum energy consumption $E_W^*(t)$ of the end-to-end wireless backbone is a quadratic function of the goodput, available in closed form over the feasibility region. To this end, we bound the goodput as $r(t) \in [r_{min}, r_{max}]$ (byte/slot), and we take into account the stability of the network by looking for the goodput at which the energy consumed in the network in each time slot $t$ equals, on average, the energy available for the transmission over the wireless/wired network, denoted by $E_{ave}$ (J). Therefore, it is important to meet this energy budget for the transmission, besides finding $r^*(t)$. To find $E_W^*(t)$, we use the stochastic gradient projection algorithm [20].
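Inverting (12) gives the energy needed for a target goodput in closed form, $E_W = (r / \sigma)^2$. The sketch below illustrates this inversion only; it does not implement the stochastic gradient projection step that enforces the average budget $E_{ave}$. The function names are hypothetical, and clamping the target to $[r_{min}, r_{max}]$ is an assumption about how the bound is enforced.

```python
import math


def connection_state(k0: float, z: float, rtt: float) -> float:
    """State of (13.1): sigma = K0 * sqrt(z) / RTT."""
    return k0 * math.sqrt(z) / rtt


def goodput(sigma: float, energy: float) -> float:
    """Goodput of (12): r = sigma * sqrt(E_W) (byte/slot)."""
    return sigma * math.sqrt(energy)


def energy_for_goodput(target: float, sigma: float,
                       r_min: float, r_max: float) -> float:
    """Invert (12) after clamping the target goodput to [r_min, r_max]."""
    clamped = min(max(target, r_min), r_max)
    return (clamped / sigma) ** 2


# Hypothetical state sigma = 500 with the Section III bounds r_max = 4 * r_min = 2000.
needed = energy_for_goodput(target=1000.0, sigma=500.0, r_min=500.0, r_max=2000.0)
print(needed)                  # (1000 / 500)**2 = 4.0
print(goodput(500.0, needed))  # 1000.0
```

The quadratic dependence means that doubling the goodput at a fixed channel state costs four times the energy.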

B. Optimization Problem

The proposed scheduling algorithm determines the $3M$ parameters $\{f_i^*, L_i^*, R_i^*, i = 1, \ldots, M\}$ and is completely independent of the size $L_{tot}$ of the arrived workload and of the total capacity $R_t$ of the local LAN.

$$\min_{\{R_i, f_i, L_i\}} \sum_{i=1}^{M} \left[ \left(\frac{f_i}{f_i^{max}}\right)^2 E_i^{max} + k_e \left(f_i - f_i^0\right)^2 + 2 P_i^{net}(R_i) \frac{L_i}{R_i} \right], \tag{14.1}$$

$$\text{s.t.:} \quad L_i \leq f_i \Delta, \quad i = 1, \ldots, M, \tag{14.2}$$

$$\sum_{i=1}^{M} L_i = L_{tot}, \tag{14.3}$$

$$0 \leq f_i \leq f_i^{max}, \quad i = 1, \ldots, M, \tag{14.4}$$

$$L_i \geq 0, \quad i = 1, \ldots, M, \tag{14.5}$$

$$\frac{2 L_i}{R_i} + \Delta \leq T_t, \quad i = 1, \ldots, M, \tag{14.6}$$

$$\sum_{i=1}^{M} R_i \leq R_t, \tag{14.7}$$

$$R_i \geq 0, \quad i = 1, \ldots, M. \tag{14.8}$$
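To make problem (14) concrete, the sketch below evaluates the objective (14.1) and checks the constraints (14.2)-(14.8) for a candidate allocation. It is only an evaluator, not a solver. The function names, the value of $\zeta_i$ and the candidate allocation are hypothetical; the remaining numbers echo Section III (loads in Mbit, rates in Mbit/s, bandwidths in MHz).

```python
from typing import Sequence


def total_energy(f: Sequence[float], load: Sequence[float], rate: Sequence[float],
                 f0: Sequence[float], f_max: Sequence[float], e_max: Sequence[float],
                 k_e: float, zeta: Sequence[float], bandwidth: Sequence[float]) -> float:
    """Objective (14.1): computing + reconfiguration + two-way communication energy."""
    total = 0.0
    for i in range(len(f)):
        computing = (f[i] / f_max[i]) ** 2 * e_max[i]
        reconfiguration = k_e * (f[i] - f0[i]) ** 2
        network = 0.0
        if load[i] > 0.0:  # an idle link carries no traffic and costs nothing
            power = zeta[i] * (2.0 ** (rate[i] / bandwidth[i]) - 1.0)
            network = 2.0 * power * load[i] / rate[i]
        total += computing + reconfiguration + network
    return total


def is_feasible(f: Sequence[float], load: Sequence[float], rate: Sequence[float],
                f_max: Sequence[float], l_tot: float, delta: float,
                t_max: float, r_tot: float, tol: float = 1e-9) -> bool:
    """Check the constraints (14.2)-(14.8) for a candidate allocation."""
    for fi, li, ri, fm in zip(f, load, rate, f_max):
        if not 0.0 <= fi <= fm:                      # (14.4)
            return False
        if li < 0.0 or ri < 0.0:                     # (14.5), (14.8)
            return False
        if li > fi * delta + tol:                    # (14.2)
            return False
        if li > 0.0 and (ri == 0.0 or 2.0 * li / ri + delta > t_max + tol):  # (14.6)
            return False
    if abs(sum(load) - l_tot) > tol:                 # (14.3)
        return False
    return sum(rate) <= r_tot + tol                  # (14.7)


# Hypothetical candidate: two VMs sharing an 8 Mbit job equally.
f, load, rate = [50.0, 50.0], [4.0, 4.0], [2.0, 2.0]
print(is_feasible(f, load, rate, [105.0, 105.0], l_tot=8.0, delta=0.1,
                  t_max=5.0, r_tot=100.0))  # True
print(total_energy(f, load, rate, f0=[50.0, 50.0], f_max=[105.0, 105.0],
                   e_max=[60.0, 60.0], k_e=0.005, zeta=[0.5, 0.5],
                   bandwidth=[1.0, 1.0]))
```

Keeping evaluation and feasibility separate makes it easy to compare the allocation returned by any solver against a hand-built baseline.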

In the stated problem, the first two terms of the summation in (14.1) account for the computing-plus-reconfiguration energy $E_c(i)$ consumed by VM($i$), while the third term in (14.1) is the communication energy $E_{net}(i)$ (i.e., $E_{LAN}(i)$) required by the corresponding point-to-point virtual link for conveying $L_i$ bits at the transmission rate of $R_i$ (bit/s). Furthermore, $f_i^0$ and $f_i$ in (14.1) represent the current (i.e., already computed and consolidated) computing rate and the target one, respectively. Formally speaking, $f_i$ is the variable to be optimized, while $f_i^0$ describes the current state of VM($i$) and, therefore, plays the role of a known constant. Hence, $k_e (f_i - f_i^0)^2$ in (14.1) accounts for the resulting switching cost. The constraint in (14.2) guarantees that VM($i$) executes the assigned task within $\Delta$ seconds, while the (global) constraint in (14.3) assures that the overall job is partitioned into $M$ parallel tasks. According to (7), the set of constraints in (14.6) forces the considered platform of Fig. 1 to process the overall job within the assigned hard deadline $T_t$. Finally, the global constraint in (14.7) limits to $R_t$ (bit/s) the aggregate transmission rate sustainable by the underlying VLAN of Fig. 1, so that $R_t$ is directly dictated by the actually considered VLAN standard [6]. The first and second terms of the objective function in (14.1) are convex and non-decreasing. The third term is nonconvex; however, by using (14.6) to replace $L_i / R_i$ with $(T_t - \Delta)/2$ and $R_i$ with $2 L_i / (T_t - \Delta)$, we can make the communication cost convex in $f_i$ and $L_i$. As a result, it simplifies to $(T_t - \Delta) \sum_{i=1}^{M} \zeta_i \left(2^{\frac{2 L_i}{(T_t - \Delta) W_i}} - 1\right)$.
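The simplification of the communication term can be checked step by step. The worked derivation below assumes that the deadline constraint (14.6) is met with equality. This is plausible because, for a fixed $L_i$, the energy $P_i^{net}(R_i) L_i / R_i$ with the exponential power-rate function (6) is non-decreasing in $R_i$, so the smallest admissible rate is also the cheapest one.

$$
\begin{aligned}
\frac{2 L_i}{R_i} + \Delta = T_t \;&\Longrightarrow\; \frac{L_i}{R_i} = \frac{T_t - \Delta}{2}, \qquad R_i = \frac{2 L_i}{T_t - \Delta}, \\
2 P_i^{net}(R_i) \frac{L_i}{R_i} &= 2\, \zeta_i \left(2^{R_i / W_i} - 1\right) \frac{T_t - \Delta}{2} = (T_t - \Delta)\, \zeta_i \left(2^{\frac{2 L_i}{(T_t - \Delta) W_i}} - 1\right).
\end{aligned}
$$

Summing over $i = 1, \ldots, M$ yields the simplified communication cost, which depends on the loads $L_i$ only and is convex because an exponential of a linear function of $L_i$ is convex.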

We propose an iterative method that computes the optimum $L_i$ and $f_i$ of each VM($i$) for each incoming workload. The iterative method is flexible and reliable across multiple scenarios. After some iterations, it reaches the optimal (in the sense of energy saving) solution of the considered problem, which consists of the set of optimal parameters $\{\mu^*; L_i^*, \nu_i^*, f_i^*, i = 1, \ldots, M\}$. We iterate our method $n$ times (i.e., $n$ is the loop counter of the search for the optimum workload and frequency of each VM). The following initialization settings are needed to reproduce the approach: i) the iteration index satisfies $n \geq 1$; ii) $\beta$ and $\gamma$ are positive constants; iii) the VM index is $i = 1, \ldots, M$; and iv) $\mu^{(0)} = 0$, $\alpha^{(0)} = \beta$, $\nu_i^{(0)} = 0$ and $L_i^{(0)} = 0$ for $i = 1, \ldots, M$, and $V^{(0)} = 0$.

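$$\mu^{(n)} = \left[\mu^{(n-1)} - \alpha^{(n-1)} \left(\sum_{i=1}^{M} L_i^{(n-1)} - L_{tot}\right)\right]_+, \tag{15.1}$$

$$y_i^{(n)} = \left[\frac{(T_t - \Delta)}{2} W_i \log_2\left(\frac{\mu^{(n)}}{TH(i)}\right)\right]_+, \tag{15.2}$$

$$TH(i) = 2 W_i \ln(2) \frac{N_0^{(i)}}{g_i}, \tag{15.3}$$

$$\nu_i^{(n)} = \left[\nu_i^{(n-1)} + \alpha_{NEW}^{(n-1)} \left(y_i^{(n)} - \Delta f_i^{(n-1)}\right)\right]_+, \tag{15.4}$$

$$f_i^{(n)} = \left[\frac{2 k_e f_i^{(0)} + \nu_i^{(n)} \Delta}{2 k_e + 2 \dfrac{E_i^{max}}{(f_i^{max})^2}}\right]_0^{f_i^{max}}, \tag{15.5}$$

$$L_i^{(n)} = \min\left\{\Delta f_i^{(n)}, \; y_i^{(n)}\right\}, \tag{15.6}$$

$$\alpha^{(n)} \equiv \max\left\{0; \min\left\{\beta; \; \alpha^{(n-1)} - \gamma V^{(n-1)} \left(\sum_{i=1}^{M} L_i^{(n-1)} - L_{tot}\right)\right\}\right\}, \tag{15.7}$$

$$\alpha_{NEW}^{(n)} \equiv \max\left\{0; \min\left\{\beta_{NEW}; \; \alpha_{NEW}^{(n-1)} - \gamma V^{(n-1)} \left(\sum_{i=1}^{M} L_i^{(n-1)} - L_{tot}\right)\right\}\right\}, \tag{15.8}$$

$$V^{(n)} \equiv \left(1 - \alpha^{(n-1)}\right) V^{(n-1)} - \left(\sum_{i=1}^{M} L_i^{(n-1)} - L_{tot}\right), \tag{15.9}$$

$$V_{NEW}^{(n)} \equiv \left(1 - \alpha_{NEW}^{(n-1)}\right) V_{NEW}^{(n-1)} - \left(\sum_{i=1}^{M} L_i^{(n-1)} - L_{tot}\right). \tag{15.10}$$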
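The primal-dual updates (15.1)-(15.6) can be sketched in a few lines. This is a simplified illustration, not the authors' implementation: the step sizes are kept constant instead of being adapted through (15.7)-(15.10), the thresholds $TH(i)$ are passed in as given numbers, and $f_i^{(0)}$ in (15.5) is taken to be the current frequency $f_i^0$. All names and the example values are assumptions, and no claim is made about how quickly this simplified variant converges.

```python
import math
from typing import List, Sequence, Tuple


def clip(value: float, low: float, high: float) -> float:
    """Project value onto the interval [low, high]."""
    return max(low, min(high, value))


def primal_dual_iterations(l_tot: float, f0: Sequence[float], f_max: Sequence[float],
                           e_max: Sequence[float], k_e: float, th: Sequence[float],
                           bandwidth: Sequence[float], delta: float, t_max: float,
                           step_mu: float = 0.01, step_nu: float = 0.01,
                           n_iter: int = 1000
                           ) -> Tuple[float, List[float], List[float], List[float]]:
    """Run (15.1)-(15.6) with constant step sizes (a simplification of the adaptive rule)."""
    m = len(f0)
    mu = 0.0                 # mu^(0) = 0
    nu = [0.0] * m           # nu_i^(0) = 0
    load = [0.0] * m         # L_i^(0) = 0
    f = list(f0)             # assumed starting point: the current frequencies
    for _ in range(n_iter):
        gap = sum(load) - l_tot
        mu = max(0.0, mu - step_mu * gap)                                   # (15.1)
        for i in range(m):
            if mu > th[i]:
                y = (t_max - delta) / 2.0 * bandwidth[i] * math.log2(mu / th[i])
            else:
                y = 0.0      # the [.]_+ projection in (15.2)
            nu[i] = max(0.0, nu[i] + step_nu * (y - delta * f[i]))          # (15.4)
            numerator = 2.0 * k_e * f0[i] + nu[i] * delta
            denominator = 2.0 * k_e + 2.0 * e_max[i] / f_max[i] ** 2
            f[i] = clip(numerator / denominator, 0.0, f_max[i])             # (15.5)
            load[i] = min(delta * f[i], y)                                  # (15.6)
    return mu, nu, f, load


# Hypothetical two-VM run with thresholds TH(i) = 0.5 (assumed values).
mu, nu, f, load = primal_dual_iterations(
    l_tot=8.0, f0=[50.0, 50.0], f_max=[105.0, 105.0], e_max=[60.0, 60.0],
    k_e=0.005, th=[0.5, 0.5], bandwidth=[1.0, 1.0], delta=0.1, t_max=5.0)
print(mu, f, load)
```

By construction every iterate keeps $0 \leq f_i \leq f_i^{max}$ and $0 \leq L_i \leq \Delta f_i$, so only the allocation constraint (14.3) has to be driven to zero by the dual variable $\mu$.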

In (15), we propose two different formulas for the variables $\alpha$ and $V$, which represent the adaptation step size and the intermediate parameter used to update the $\alpha$'s, respectively; they are used separately in the calculation of the dual variables $\mu$ and $\nu$. This choice is justified by the fact that, in optimization problems in general, each variable should have its own step size. In practice, when $\Delta$ approaches $T_t$, $\beta$ moves far from $\beta_{NEW}$ and $\alpha$ moves far from $\alpha_{NEW}$. Specifically, $\beta$ and $\alpha_{NEW}$ become much larger simultaneously. To conclude the description of the implemented optimization algorithm, we note that, in addition to the sequence of steps just described, our software is composed of three basic parts. The sum of the three costs, together with the wireless backbone cost, gives the total cost. This value is essential to evaluate the performance of our scheduler in terms of energy savings and of its real utility in moving towards the green paradigm. In fact, in the simulations, it is through the total cost that we can appreciate the differences, in terms of energy saving, between one case study and another. Note that, in the following, the index $i$ indicates the $i$-th VM and $n$ represents the iteration index. In detail, we check three convergence conditions after each iterative loop. Formally speaking, at the end of each iterative cycle, we check conditions that determine whether the optimization process should continue or whether we can stop because the solution can be considered "excellent". The conditions are processed one after the other at each iteration, in order to verify that the solution found $\{\mu^*; L_i^*, \nu_i^*, f_i^*, i = 1, \ldots, M\}$ is optimal and admissible. In the following, the three aforementioned conditions are listed in the same order in which they are placed inside the iterative algorithm: i) $\left|\left(\sum_{i=1}^{M} L_i\right) - L_{tot}\right| / L_{tot} \leq a$ ensures that the total load ($L_{tot}$ is the total workload size) has been fully allocated with sufficient accuracy (according to the chosen small $a$); ii) $L_i \leq f_i \Delta$ verifies that the working frequency selected for the generic VM is sufficient to process the assigned load $L_i$ within the time limit $\Delta$; iii) the so-called complementary condition verifies that $(|\nu_i| \leq b)$ or $(|L_i - f_i \Delta| \leq c)$. If all three conditions are met, then our workload scheduler has converged to the optimal solution in terms of resource allocation, i.e., we have minimized the computing-plus-communication energy consumption of the proposed model.
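The three stopping conditions translate directly into a predicate that can be evaluated at the end of each iteration. The sketch below is one possible reading of them; the function name is hypothetical, and the default tolerances follow the values $a = b = c = 0.01$ used in Section III.

```python
from typing import Sequence


def has_converged(load: Sequence[float], f: Sequence[float], nu: Sequence[float],
                  l_tot: float, delta: float,
                  a: float = 0.01, b: float = 0.01, c: float = 0.01) -> bool:
    """Check conditions i)-iii) for the current iterate."""
    # i) the whole workload is allocated, up to a relative error of a
    allocated = abs(sum(load) - l_tot) / l_tot <= a
    # ii) every VM can finish its task within the time limit delta
    on_time = all(li <= fi * delta for li, fi in zip(load, f))
    # iii) complementary condition: the multiplier vanishes or the constraint is tight
    complementary = all(abs(vi) <= b or abs(li - fi * delta) <= c
                        for vi, li, fi in zip(nu, load, f))
    return allocated and on_time and complementary


print(has_converged([4.0, 4.0], [50.0, 50.0], [0.0, 0.0], l_tot=8.0, delta=0.1))  # True
print(has_converged([4.0, 3.0], [50.0, 50.0], [0.0, 0.0], l_tot=8.0, delta=0.1))  # False
```

The second call fails condition i) because only 7 of the 8 Mbit have been allocated, a relative gap of 12.5%.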

III. PERFORMANCE EVALUATION AND NUMERICAL TEST

This section presents the simulated performance of the proposed scheduler for a synthetic workload and compares the simulation results with the no-DVFS technique in [7] and with the well-known Lyapunov-based method in [8], which accounts for the CPU and reconfiguration costs. The simulations were carried out with the MATLAB numerical software under Microsoft Windows 8 x64 on an Intel Core i7. We want to evaluate the average per-job communication-plus-computing energy $E_{tot}$ consumed by the system managed by our optimum scheduler. The general scenario considered in this paper is as follows: DVFS frequencies for each VM $f_i = \{0, 5, 50, 70, 90, f_i^{max}\}$, $T_t = 5$ (s), $R_t = 100$ (Mb/s), $MSS = 120$ (byte), $k_e = \{0.005, 0.05\}$ (J/(MHz)$^2$), $f_i^{max} = 105$ (Mbit/s), $E_i^{max} = 60$ (J), $\Delta = 0.1$ (s), $W_i = 1$ (MHz), $E_{ave} = 8$ (J), $r_{max} = 4 r_{min} = 2000$ (byte/slot), $\#itr_{max} = 10^4$, $\gamma = 0.5$, $a = b = c = 0.01$, and a number of time slots equal to 2000.

We test the performance of the scheduler paying particular attention to the cost of reconfiguration when there are temporal fluctuations of the workload. Specifically, the workload sequence $\{L_{tot}(mT_t), m = 0, 1, \ldots\}$ is characterized by an inter-arrival period $T_t$ and a size given by a random variable uniformly distributed in $[L_{tot} - a, L_{tot} + a]$ with $L_{tot} = 8$ (Mbit) and $a = 2$ (Mbit), that is, a peak-to-mean ratio (PMR) equal to $(8 + 2)/8 = 1.25$. $f_i^0$ represents the initial state of the VMs in terms of their frequency (each VM has a working frequency). In general, for every workload after the first one has been served, $f_i^0$ matches the optimal frequency $\{f_i^*\}$ calculated for the previous task. To give our simulations a certain statistical reliability, 2000 workload cycles are used.

[Figure 2, three panels: (a) dual variable µ vs. iterations for M = {2, 100}, N = {50, 1000} and ∆ = {0.1, 4.4} (s); (b) dual variable ν vs. #iterations for M = {2, 100}, N = {50, 1000} and ∆ = 0.1 (s); (c) E_W (J) vs. #iterations for each t, with E_ave = 8 (J). Common settings: γ = 0.5, T_t = 5 (s), β = 0.15, #itr_max = 10^4.]

Fig. 2: Example of achieved convergence to the optimal value for µ∗ and ν∗ in scenario: M = {2, 100}, ∆ = {0.1, 4.4} (s) and ke = {0.005} (J/(MHz)^2)

The first simulation demonstrates the convergence rate of the proposed scheduler under various parameter fluctuations. The results are shown in Figs. 2(a), 2(b) and 2(c), which illustrate the convergence of the internal iterative loop for the parameters mentioned in (15). Specifically, Fig. 2(a) shows that: i) the proposed scheduler is able to converge even with few VMs (i.e., for M = 2, which is a hard case) and with a computing time close to the maximum tolerated delay $T_t$ (i.e., ∆ = 4.4 (s)), and ii) when the number of applications increases dramatically, the convergence of µ is reached faster. Fig. 2(b) indicates that, as M increases, not only does ν decrease but the iterative method also finds a suitable ν faster; that is, the scheduler balances the load more easily when more resources are at hand. Fig. 2(c) demonstrates that the iterative gradient method is able to find, within some iterations, the proper $E_W$ for each incoming workload transmitted over the end-to-end backbone channel in each time slot. The second simulation presents the goodput and RTT of the TCP connection when we vary the energy available for transmission, $E_W(t)$. As for the coded modulation of the system, we used QPSK with coding parameters (A = 90.2514, B = 3.4998, C = 1.0942, rate = 1.5). The energy available for wireless transmission $E_W(t)$ is modeled as a random variable with a unit mean value and variance. We performed simulations for three different values of $E_W$ in the case of unit variance ($\sigma^2_{E_W} = 1$). Here, for an average energy of 8 (J) on the wireless backbone channel, our average goodput is approximately 33.83 bits per slot.
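The synthetic workload used in these simulations (one arrival every T_t seconds, with a size uniformly distributed in [Ltot − a, Ltot + a]) can be sketched in a few lines. The Python code below is only an illustrative stand-in for the MATLAB setup; the names `generate_workloads` and `peak_to_mean_ratio`, the seeding scheme, and the default arguments are assumptions introduced here.

```python
import random
from typing import List


def generate_workloads(n_slots: int, l_mean: float = 8.0, spread: float = 2.0,
                       seed: int = 0) -> List[float]:
    """Draw one workload size (Mbit) per slot, uniform in [l_mean - spread, l_mean + spread]."""
    rng = random.Random(seed)  # fixed seed so that a run can be repeated
    return [rng.uniform(l_mean - spread, l_mean + spread) for _ in range(n_slots)]


def peak_to_mean_ratio(l_mean: float, spread: float) -> float:
    """Peak-to-mean ratio of the uniform workload: largest possible size over the mean."""
    return (l_mean + spread) / l_mean


# 2000 workload cycles, as in the scenario described in the text
workloads = generate_workloads(2000)
pmr = peak_to_mean_ratio(8.0, 2.0)
```

With Ltot = 8 (Mbit) and a = 2 (Mbit), `peak_to_mean_ratio` evaluates (8 + 2)/8, which is the PMR of 1.25 quoted in the text.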

Furthermore, we evaluated the average (per-job) energy consumed by the system while varying the number of available VMs and the profile of $\{\zeta_i\}$. In particular, in addition to the two constant values, we have carried out the simulation in the case of $\{\zeta_i\}$ increasing with the VM index. This describes the following scenario: every time we allocate a new VM, it has a higher channel cost, possibly because it is physically placed on a server that is farther away. We can thus see how the mean per-job communication-plus-computing energy $E_{TOT}$ changes with the parameters $k_e$ and $\zeta$, which represent the VM reconfiguration cost and the channel variations, respectively. The parameters varied in this simulation are M = 2, . . . , 15, $k_e$ = 0.05 and 0.005 (J/(MHz)$^2$) and, for the channels: i) [HMC] VMs with HoMogeneous Channels, with $\zeta_i = 0.5$ (mW), and ii) [HTC] VMs with HeTerogeneous Channels, with $\zeta_i = [0.5 + 0.05(i - 1)]$ and $\zeta_i = [0.5 + 0.1(i - 1)]$ (mW) (i.e., the program is executed 15 times, independently for each group of VMs).

[Figure 3: E_TOT (Joule) vs. number of VMs (1 to 15), with ∆ = 0.1 (s) and M = 15 (i.i.d. runs). Curves: [HMC] ke = 0.005 (J/(MHz)^2), ζ_i = 0.5 (mW); [HMC] ke = 0.05 (J/(MHz)^2), ζ_i = 0.5 (mW); [HTC] ke = 0.005 (J/(MHz)^2), ζ_i = 0.5 + 0.05(i−1) (mW); [HTC] ke = 0.05 (J/(MHz)^2), ζ_i = 0.5 + 0.1(i−1) (mW); i = 1, …, M.]

Fig. 3: E_tot vs. M with different ke for homogeneous and heterogeneous channels.
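The homogeneous and heterogeneous channel profiles described above differ only in the per-VM increment of ζ_i. A minimal Python sketch of how the profiles could be generated follows; the helper name `channel_costs` and its parameters are hypothetical, and only the formula ζ_i = base + step·(i − 1) is taken from the text.

```python
from typing import List


def channel_costs(m: int, base: float = 0.5, step: float = 0.0) -> List[float]:
    """Per-VM channel cost zeta_i (mW) for i = 1..m.

    step = 0.0 gives the homogeneous (HMC) profile; step = 0.05 or 0.1
    gives the two heterogeneous (HTC) profiles mentioned in the text.
    """
    return [base + step * (i - 1) for i in range(1, m + 1)]


# Profiles for every group size M = 2, ..., 15 considered in the simulation
hmc_profiles = {m: channel_costs(m) for m in range(2, 16)}
htc_profiles = {m: channel_costs(m, step=0.05) for m in range(2, 16)}
```

Under the HTC profile each newly allocated VM is more expensive to reach than the previous one, which matches the "server farther away" interpretation given in the text.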

Based on the synthetic workload traces, the comparison in Fig. 3 of the HTC plots with the corresponding HMC ones at the same $k_e$ confirms that the energy reduces as the number of VMs increases, by an amount ranging from 4% (case of $k_e = 0.005$, the two lower plots) to 8.5% (case of $k_e = 0.05$, the two upper plots). These results support the expectation [5] that noticeable energy savings may be attained by jointly changing the available computing-plus-communication resources. In the two uppermost plots of Fig. 3, at M = 2 the $E_{tot}$ increases suddenly because of the increased reconfiguration cost; as M increases, however, the scheduler manages the energy and decreases its components according to the aforementioned formulas (14) and (15).

In the last simulations, in order to evaluate the energy reduction due to scaling up/down the computing and reconfiguration rates while increasing the number of VMs (i.e., we process the results of a single run over 100 VMs), we have also implemented two well-known recent schedulers [7], [8] under the aforementioned general scenario; the results are presented in Figs. 4 and 5. According to Fig. 4, the average energy savings of the proposed method (green continuous plot, −∇−) over the Lyapunov-based method (light, thick blue continuous plot) and over the no-DVFS method (yellow-blue continuous plot with −− points) are about 60% and 25%, respectively. This confirms that, as the number of VMs increases, the proposed method adapts itself to the incoming workload faster than the no-DVFS method, which concentrates on the optimum frequency in each time slot. Also, Fig. 5 shows that the difference in average reconfiguration cost between the proposed method and the no-DVFS method [7] is negligible, whereas the reconfiguration cost of the Lyapunov-based method [8] is approximately 1000 times lower than that of our method. However, looking at Figs. 4 and 5 together, it is clear that this difference cannot fill the gap in the computing part; as a result, [8], even with a lower switching cost, has a much higher computing cost than the proposed method.

[Figure 4: E_CPU (J), logarithmic scale, vs. number of VMs (0 to 100); ∆ = 0.1 (s), T_t = 5 (s), itr_max = 10^4, #WL = 2000, f_max = 105 (Mbit/s), V = 100, E_max = 60 (J), P_min = 10 (W). Curves: E_CPU and E_CPU per VM for DVFS (proposed), for [8], and for no-DVFS [7].]

Fig. 4: E_CPU (J) for the proposed method, no-DVFS method in [7], and Lyapunov method in [8].

[Figure 5: E_Reconf (J), logarithmic scale, vs. number of VMs (10 to 100); same parameter settings as Fig. 4. Curves: E_Reconf and E_Reconf per VM for the proposed method, for [7], and for [8].]

Fig. 5: E_Reconf (J) for the proposed method (i.e., using DVFS) vs. no-DVFS method in [7] vs. Lyapunov method in [8].

IV. CONCLUSION

In this paper, we developed an iterative model for jointly managing the admitted workload, delivered throughput, and resource reconfiguration of computing-plus-communication platforms equipped with wireless Internet-based connections. The overall goal is the energy-saving support of QoS-demanding, computing-intensive, delay-sensitive services that use TCP/IP wireless connections to deliver remotely processed workload to clients. Its implementation indicates that the resulting complexity fully scales with the number of available VMs and that it minimizes the energy consumed by the overall platform for computing, communication and transmission over the wireless connection; moreover, it is capable of providing hard QoS guarantees in terms of minimum delivered instantaneous goodput. The energy-efficient adaptive management of the delay-vs.-throughput trade-off of WAN TCP/IP mobile connections is a topic for further research.

REFERENCES

[1] Z. Sanaei, S. Abolfazli, A. Gani, and R. Buyya, “Heterogeneity in mobile cloud computing: taxonomy and open challenges,” Communications Surveys & Tutorials, IEEE, vol. 16, no. 1, pp. 369–392.

[2] B. Hayes, “Cloud computing,” Commun. ACM, vol. 51, no. 7, pp. 9–11, Jul. 2008.

[3] S. Jin, L. Guo, I. Matta, and A. Bestavros, “A spectrum of tcp-friendly window-based congestion control algorithms,” IEEE/ACM Transactions on Networking (TON), vol. 11, no. 3, pp. 341–355, 2003.

[4] G. Aceto, A. Botta, W. De Donato, and A. Pescape, “Cloud monitoring: A survey,” Computer Networks, vol. 57, no. 9, pp. 2093–2115, 2013.

[5] R. Buyya, A. Beloglazov, and J. Abawajy, “Energy-efficient management of data center resources for cloud computing: A vision, architectural elements, and open challenges,” arXiv preprint arXiv:1006.0308, 2010.

[6] L. Liu, H. Wang, X. Liu, X. Jin, W. B. He, Q. B. Wang, and Y. Chen, “Greencloud: a new architecture for green data center,” in Proceedings of the 6th international conference industry session on Autonomic computing and communications industry session. ACM, 2009, pp. 29–38.

[7] N. Cordeschi, M. Shojafar, and E. Baccarelli, “Energy-saving self-configuring networked data centers,” Computer Networks, vol. 57, no. 17, pp. 3479–3491, 2013.

[8] R. Urgaonkar, U. C. Kozat, K. Igarashi, and M. J. Neely, “Dynamic resource allocation and power management in virtualized data centers,” in Network Operations and Management Symposium (NOMS), 2010 IEEE. IEEE, 2010, pp. 479–486.

[9] E. Baccarelli and M. Biagi, “Optimized power allocation and signal shaping for interference-limited multi-antenna ad hoc networks,” in Personal Wireless Communications. Springer, 2003, pp. 138–152.

[10] Q. Liu, S. Zhou, and G. B. Giannakis, “Cross-layer combining of adaptive modulation and coding with truncated arq over wireless links,” Wireless Communications, IEEE Transactions on, vol. 3, no. 5, pp. 1746–1755, 2004.

[11] E. Baccarelli and M. Biagi, “Error resistant space-time coding for emerging 4g-wlans,” in Wireless Communications and Networking, 2003. WCNC 2003. 2003 IEEE, vol. 1. IEEE, 2003, pp. 72–77.

[12] D. Mitra and Q. Wang, “Stochastic traffic engineering for demand uncertainty and risk-aware network revenue management,” IEEE/ACM Transactions on Networking (TON), vol. 13, no. 2, pp. 221–233, 2005.

[13] S. Faruque, “Traffic engineering for multi rate wireless data,” in Electro/Information Technology, 2008. EIT 2008. IEEE International Conference on. IEEE, 2008, pp. 280–283.

[14] A. Borodin and R. El-Yaniv, Online computation and competitive analysis. Cambridge University Press, 1998.

[15] R. Nathuji and K. Schwan, “Virtualpower: coordinated power management in virtualized enterprise systems,” in ACM SIGOPS Operating Systems Review, vol. 41, no. 6. ACM, 2007, pp. 265–278.

[16] D. Zhu, R. Melhem, and B. R. Childers, “Scheduling with dynamic voltage/speed adjustment using slack reclamation in multiprocessor real-time systems,” IEEE Trans. Parallel Distrib. Syst., vol. 14, no. 7, pp. 686–700, Jul. 2003.

[17] D. Warneke and O. Kao, “Exploiting dynamic resource allocation for efficient parallel data processing in the cloud,” Parallel and Distributed Systems, IEEE Transactions on, vol. 22, no. 6, pp. 985–997, 2011.

[18] N. Cordeschi, T. Patriarca, and E. Baccarelli, “Stochastic traffic engineering for real-time applications over wireless networks,” Journal of Network and Computer Applications, vol. 35, no. 2, pp. 681–694, 2012.

[19] M. Gudmundson, “Correlation model for shadow fading in mobile radio systems,” Electronics letters, vol. 27, no. 23, pp. 2145–2146, 1991.

[20] D. P. Bertsekas and J. N. Tsitsiklis, Parallel and distributed computation: numerical methods. Prentice-Hall, Inc., 1989.

