
Page 1: Fuzzy Self-Learning Controllers for Elasticity Management in Dynamic Cloud Architectures

Fuzzy Self-Learning Controllers for Elasticity Management in Dynamic Cloud Architectures

Pooyan Jamshidi, Imperial College London, [email protected]

Invited talk at Sharif University of Technology, 12 April 2016

Page 2: Fuzzy Self-Learning Controllers for Elasticity Management in Dynamic Cloud Architectures

Motivation

~50% = wasted hardware

[Figure: typical weekly traffic to Web-based applications (e.g., Amazon.com) plotted against statically provisioned capacity; roughly half of the hardware sits idle.]

Page 3: Fuzzy Self-Learning Controllers for Elasticity Management in Dynamic Cloud Architectures

Motivation

Problem 1: ~75% wasted capacity relative to actual demand.

Problem 2: customers lost when demand exceeds capacity.

[Figure: traffic during an unexpected burst in requests (e.g., end-of-year traffic to Amazon.com).]

Page 4: Fuzzy Self-Learning Controllers for Elasticity Management in Dynamic Cloud Architectures

Motivation

Would we really like this?

Auto-scaling promises this ideal of on-demand provisioning (capacity tracking demand over time).

But: enacting change in cloud resources is not real-time.

Page 5: Fuzzy Self-Learning Controllers for Elasticity Management in Dynamic Cloud Architectures

Motivation

Capacity we can provision with Auto-Scaling

A realistic figure of dynamic provisioning

Page 6: Fuzzy Self-Learning Controllers for Elasticity Management in Dynamic Cloud Architectures

Research Challenges

• Challenge 1. Parameters' value prediction ahead of time.
• Challenge 2. Qualitative specification of thresholds.
• Challenge 3. Robust control of uncertainty in measurement data.

Page 7: Fuzzy Self-Learning Controllers for Elasticity Management in Dynamic Cloud Architectures

Predictable vs. Unpredictable Demand

[Figure: four workload traces (requests over time) illustrating predictable and unpredictable demand patterns.]

Page 8: Fuzzy Self-Learning Controllers for Elasticity Management in Dynamic Cloud Architectures

Research Challenges

• Challenge 1. Parameters' value prediction ahead of time.
• Challenge 2. Qualitative specification of thresholds.
• Challenge 3. Robust control of uncertainty in measurement data.

Page 9: Fuzzy Self-Learning Controllers for Elasticity Management in Dynamic Cloud Architectures

An Example of an Auto-scaling Rule

These quantitative values are required to be determined by the user:
⇒ requires deep knowledge of the application (CPU, memory, thresholds)
⇒ requires performance-modeling expertise (when and how to scale)
⇒ requires a unified opinion among the user(s)

Examples: Amazon Auto Scaling, Microsoft Azure Watch, Microsoft Azure Auto-scaling Application Block.
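To make the point concrete, here is a minimal, hypothetical sketch of the kind of threshold rule these services expect; the metric names and every threshold in it are exactly the values the slide says the user has to determine up front.

```python
# Hypothetical threshold-based auto-scaling rule, in the style the slide
# criticizes: every number below must be chosen by the user in advance.
def scaling_decision(avg_cpu: float, queue_length: int, current_vms: int) -> int:
    """Return the change in VM count (+1, -1, or 0) for one control interval."""
    if avg_cpu > 0.80 or queue_length > 100:   # scale-out thresholds (user-chosen)
        return +1
    if avg_cpu < 0.30 and queue_length < 10:   # scale-in thresholds (user-chosen)
        return -1 if current_vms > 1 else 0    # never drop below one instance
    return 0                                    # otherwise stay where we are
```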

Page 10: Fuzzy Self-Learning Controllers for Elasticity Management in Dynamic Cloud Architectures

Research Challenges

• Challenge 1. Parameters' value prediction ahead of time.
• Challenge 2. Qualitative specification of thresholds.
• Challenge 3. Robust control of uncertainty in measurement data.

Page 11: Fuzzy Self-Learning Controllers for Elasticity Management in Dynamic Cloud Architectures

Sources of Uncertainty in Elastic Software

P. Jamshidi, C. Pahl, N. Mendonca, "Managing Uncertainty in Autonomic Cloud Elasticity Controllers", IEEE Cloud Computing, 2016.

P. Jamshidi, C. Pahl, "Software Architecture for the Cloud – a Roadmap towards Control-Theoretic, Model-Based Cloud Architecture", LNCS, 2015.

Page 12: Fuzzy Self-Learning Controllers for Elasticity Management in Dynamic Cloud Architectures

A Concrete Example of Uncertainty in the Cloud

Uncertainty related to enactment latency: the same scaling action (adding or removing a VM of exactly the same size) took different amounts of time to be enacted on the cloud platform (here, Microsoft Azure) at different points in time, and the differences were significant (up to a couple of minutes). The enactment latency also differs across cloud platforms.

Page 13: Fuzzy Self-Learning Controllers for Elasticity Management in Dynamic Cloud Architectures

Goal: take the burden away from the user

─ users specify the thresholds through qualitative linguistic terms
─ the auto-scaling controller should be fully responsible for scaling decisions
─ the auto-scaling should be robust against uncertainty

How are thresholds determined today?
- Offline benchmarking
- Trial-and-error
- Expert knowledge
These are costly and not systematic.

A. Gandhi, P. Dube, A. Karve, A. Kochut, L. Zhang, "Adaptive, Model-driven Autoscaling for Cloud Applications", ICAC'14.

[Figure: 95th-percentile response time (ms) vs. arrival rate (req/s); e.g., a 400 ms response-time target maps to about 60 req/s.]

Page 14: Fuzzy Self-Learning Controllers for Elasticity Management in Dynamic Cloud Architectures

RobusT2Scale: Architectural Overview

[Diagram: users provide the initial setting, elasticity rules, and a response-time SLA to RobusT2Scale; the controller combines environment monitoring and application monitoring with prediction/smoothing and fuzzy reasoning to issue scaling actions.]

Page 15: Fuzzy Self-Learning Controllers for Elasticity Management in Dynamic Cloud Architectures

Internal Details of RobusT2Scale

Code: https://github.com/pooyanjamshidi/RobusT2Scale

Page 16: Fuzzy Self-Learning Controllers for Elasticity Management in Dynamic Cloud Architectures

Why we decided to use type-2 fuzzy logic

- Words can mean different things to different people.
- Different users often recommend different elasticity policies.

[Figure: possibility vs. performance index for a type-1 MF and a type-2 MF; the type-2 MF distinguishes a region of definite satisfaction, a region of definite dissatisfaction, and a region of uncertain satisfaction in between.]

Page 17: Fuzzy Self-Learning Controllers for Elasticity Management in Dynamic Cloud Architectures

How we designed the fuzzy controller

- The fuzzy logic controller is completely defined by its "membership functions" and "fuzzy rules".
- Knowledge elicitation through a survey of 10 experts.

Page 18: Fuzzy Self-Learning Controllers for Elasticity Management in Dynamic Cloud Architectures

Survey Processing and Fuzzy MF Construction

[Figure: for each linguistic term of the workload and response-time variables, the mean and standard deviation of the experts' survey answers are used to construct an upper membership function (UMF) and a lower membership function (LMF); the embedded region between them is the footprint of uncertainty (FOU).]
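As an illustration only (the exact construction used in RobusT2Scale is not reproduced here), the sketch below builds an upper and a lower trapezoidal membership function for one linguistic term from an assumed mean breakpoint vector and standard deviation, so the band between the two curves plays the role of the FOU; all numbers are hypothetical.

```python
import numpy as np

def trapezoid(x, a, b, c, d):
    """Standard trapezoidal membership function over breakpoints a <= b <= c <= d."""
    y = np.zeros_like(x, dtype=float)
    rising = (x > a) & (x < b)
    top = (x >= b) & (x <= c)
    falling = (x > c) & (x < d)
    y[rising] = (x[rising] - a) / (b - a)
    y[top] = 1.0
    y[falling] = (d - x[falling]) / (d - c)
    return y

def interval_mf_from_survey(x, mean_breakpoints, sd):
    """Illustrative UMF/LMF for one linguistic term: widen the mean trapezoid by one
    standard deviation for the upper MF and shrink it for the lower MF; the band
    between them approximates the footprint of uncertainty (FOU)."""
    a, b, c, d = mean_breakpoints
    umf = trapezoid(x, a - sd, b - sd, c + sd, d + sd)
    lmf = 0.9 * trapezoid(x, a + sd, b + sd, c - sd, d - sd)  # assumes b + sd < c - sd
    return lmf, umf

x = np.linspace(0, 100, 1001)
lmf, umf = interval_mf_from_survey(x, mean_breakpoints=(20, 35, 55, 70), sd=5.0)
```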

Page 19: Fuzzy Self-Learning Controllers for Elasticity Management in Dynamic Cloud Architectures

Fuzzy Rule Elicitation

Each rule l has two antecedents (Workload, Response time) and a consequent: the number of instances to add or remove. The columns -2 to +2 below give the number of experts (out of 10) who voted for each scaling action; the last column is the vote-weighted average c_avg^l.

Rule | Workload  | Response time | -2 | -1 |  0 | +1 | +2 | c_avg
  1  | Very low  | Instantaneous |  7 |  2 |  1 |  0 |  0 | -1.6
  2  | Very low  | Fast          |  5 |  4 |  1 |  0 |  0 | -1.4
  3  | Very low  | Medium        |  0 |  2 |  6 |  2 |  0 |  0
  4  | Very low  | Slow          |  0 |  0 |  4 |  6 |  0 |  0.6
  5  | Very low  | Very slow     |  0 |  0 |  0 |  6 |  4 |  1.4
  6  | Low       | Instantaneous |  5 |  3 |  2 |  0 |  0 | -1.3
  7  | Low       | Fast          |  2 |  7 |  1 |  0 |  0 | -1.1
  8  | Low       | Medium        |  0 |  1 |  5 |  3 |  1 |  0.4
  9  | Low       | Slow          |  0 |  0 |  1 |  8 |  1 |  1
 10  | Low       | Very slow     |  0 |  0 |  0 |  4 |  6 |  1.6
 11  | Medium    | Instantaneous |  6 |  4 |  0 |  0 |  0 | -1.6
 12  | Medium    | Fast          |  2 |  5 |  3 |  0 |  0 | -0.9
 13  | Medium    | Medium        |  0 |  0 |  5 |  4 |  1 |  0.6
 14  | Medium    | Slow          |  0 |  0 |  1 |  7 |  2 |  1.1
 15  | Medium    | Very slow     |  0 |  0 |  1 |  3 |  6 |  1.5
 16  | High      | Instantaneous |  8 |  2 |  0 |  0 |  0 | -1.8
 17  | High      | Fast          |  4 |  6 |  0 |  0 |  0 | -1.4
 18  | High      | Medium        |  0 |  1 |  5 |  3 |  1 |  0.4
 19  | High      | Slow          |  0 |  0 |  1 |  7 |  2 |  1.1
 20  | High      | Very slow     |  0 |  0 |  0 |  6 |  4 |  1.4
 21  | Very high | Instantaneous |  9 |  1 |  0 |  0 |  0 | -1.9
 22  | Very high | Fast          |  3 |  6 |  1 |  0 |  0 | -1.2
 23  | Very high | Medium        |  0 |  1 |  4 |  4 |  1 |  0.5
 24  | Very high | Slow          |  0 |  0 |  1 |  8 |  1 |  1
 25  | Very high | Very slow     |  0 |  0 |  0 |  4 |  6 |  1.6

Example (rule 12): Medium workload, Fast response time, votes (2, 5, 3, 0, 0) over the 10 experts' responses, giving c_avg = -0.9.

R^l: IF the workload (x_1) is F_1^l AND the response time (x_2) is G_2^l, THEN add/remove c_avg^l instances.

c_avg^l = ( Σ_j w_j^l × C_j ) / ( Σ_j w_j^l ),

where C_j ranges over the candidate actions {-2, -1, 0, +1, +2} and w_j^l is the number of experts who chose C_j for rule l.

Goal: pre-compute the costly calculations so that elasticity reasoning based on fuzzy inference is efficient at runtime.
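The last column of the table is simply the vote-weighted average of the scaling actions; a small sketch (matching, e.g., rules 1 and 12 above):

```python
# Consequent levels (add/remove instances) and one rule's expert votes,
# e.g. rule 12: Medium workload, Fast response time -> votes (2, 5, 3, 0, 0).
LEVELS = (-2, -1, 0, +1, +2)

def consequent_avg(votes):
    """Vote-weighted average c_avg^l = sum(w_j * C_j) / sum(w_j)."""
    weighted = sum(w * c for w, c in zip(votes, LEVELS))
    return weighted / sum(votes)

assert abs(consequent_avg((2, 5, 3, 0, 0)) - (-0.9)) < 1e-9   # matches rule 12
assert abs(consequent_avg((7, 2, 1, 0, 0)) - (-1.6)) < 1e-9   # matches rule 1
```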

Page 20: Fuzzy Self-Learning Controllers for Elasticity Management in Dynamic Cloud Architectures

Elasticity Reasoning @ Runtime

Q. Liang, J. M. Mendel, "Interval type-2 fuzzy logic systems: theory and design", IEEE Transactions on Fuzzy Systems, 8(5), 535-550, 2000.

[Diagram: monitoring data flows into the type-2 fuzzy reasoning pipeline, which produces scaling actions.]

Page 21: Fuzzy Self-Learning Controllers for Elasticity Management in Dynamic Cloud Architectures

Fuzzification

[Figure: the crisp monitoring data (workload and response time) are mapped onto the interval type-2 membership functions; in the example, the workload sample activates a term with membership interval [0.3797, 0.5954] and the response-time sample a term with interval [0.0000, 0.2212].]

Page 22: Fuzzy Self-Learning Controllers for Elasticity Management in Dynamic Cloud Architectures

Inference Mechanism

[Figure: the membership intervals of the antecedents (e.g., [0.3797, 0.5954] and [0.9377, 0.9568]) are combined to obtain the firing interval of each rule.]

Page 23: Fuzzy Self-Learning Controllers for Elasticity Management in Dynamic Cloud Architectures

Output Processing

Page 24: Fuzzy Self-Learning Controllers for Elasticity Management in Dynamic Cloud Architectures

Control Surface (y_l, y_r)
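The control surface is just this mapping from the two fuzzified inputs to a crisp scaling action. The sketch below is a deliberately simplified pass through that mapping: it is not the Karnik-Mendel type reduction that produces the actual (y_l, y_r), only an average of the outputs computed with lower and upper firing levels. The rule subset and the response-time grades for the "Medium" term are hypothetical; the two measured intervals reuse the numbers from the fuzzification slide.

```python
def infer_scaling_action(rules, w_grades, rt_grades):
    """rules: list of (workload_term, rt_term, c_avg); grades: term -> (lower, upper).
    Returns a crisp scaling action (number of instances to add/remove)."""
    num_lo = den_lo = num_hi = den_hi = 0.0
    for w_term, rt_term, c_avg in rules:
        wl, wu = w_grades[w_term]
        rl, ru = rt_grades[rt_term]
        f_lo, f_hi = min(wl, rl), min(wu, ru)        # firing interval of the rule
        num_lo += f_lo * c_avg; den_lo += f_lo
        num_hi += f_hi * c_avg; den_hi += f_hi
    y_lo = num_lo / den_lo if den_lo else 0.0
    y_hi = num_hi / den_hi if den_hi else 0.0
    return round((y_lo + y_hi) / 2)                  # crude stand-in for type reduction

rules = [("Medium", "Fast", -0.9), ("Medium", "Medium", 0.6)]   # rules 12 and 13
w_grades = {"Medium": (0.3797, 0.5954)}                         # from the fuzzification slide
rt_grades = {"Fast": (0.0000, 0.2212), "Medium": (0.5, 0.7)}    # "Medium" interval is hypothetical
print(infer_scaling_action(rules, w_grades, rt_grades))
```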

Page 25: Fuzzy Self-Learning Controllers for Elasticity Management in Dynamic Cloud Architectures

Tool Chain Architecture

Page 26: Fuzzy Self-Learning Controllers for Elasticity Management in Dynamic Cloud Architectures

Experimental Setting: Process View

[Figure: the six synthetic workload traces (requests over time) used in the experiments.]

Page 27: Fuzzy Self-Learning Controllers for Elasticity Management in Dynamic Cloud Architectures

Estimation Errors w.r.t. Workload Patterns

[Figure: number of hits over time for the original data and double-exponential-smoothing estimates with different parameters: beta=0.10, gamma=0.94 (RMSE=308.16, RRSE=0.797); beta=0.27, gamma=0.94 (RMSE=209.79, RRSE=0.545); beta=0.80, gamma=0.94 (RMSE=272.63, RRSE=0.709).]

[Figure: root relative squared error of the estimates across the six workload patterns: Big spike, Dual phase, Large variations, Quickly varying, Slowly varying, Steep tri phase.]

Page 28: Fuzzy Self-Learning Controllers for Elasticity Management in Dynamic Cloud Architectures

Workload Prediction and Its Accuracy

[Figure: forecasting the number of hits with double exponential smoothing (observed data, smoothed data, and forecast over time).]

[Figure: root mean squared error (RMSE) as a function of the smoothing parameters alpha and gamma.]
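A minimal sketch of the forecasting step: Holt's double exponential smoothing with a level parameter and a trend parameter (named alpha and gamma here, matching the axis labels on the slide; the slides do not spell out which parameter is which, so treat the naming as an assumption). The workload series is hypothetical.

```python
def double_exponential_smoothing(series, alpha, gamma, horizon=1):
    """Holt's double exponential smoothing: maintain a level and a trend,
    then return the smoothed series and a `horizon`-step-ahead forecast."""
    level, trend = series[0], series[1] - series[0]
    smoothed = [level]
    for y in series[1:]:
        prev_level = level
        level = alpha * y + (1 - alpha) * (level + trend)   # smooth the level
        trend = gamma * (level - prev_level) + (1 - gamma) * trend  # smooth the trend
        smoothed.append(level)
    return smoothed, level + horizon * trend

workload = [310, 320, 335, 360, 390, 430, 470, 505]   # hypothetical hits per interval
smoothed, forecast = double_exponential_smoothing(workload, alpha=0.27, gamma=0.94)
print(round(forecast))
```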

Page 29: Fuzzy Self-Learning Controllers for Elasticity Management in Dynamic Cloud Architectures

Effectiveness of RobusT2Scale

SLA: rt_95 ≤ 600 ms for every 10 s control interval. For each workload pattern, the table reports the 95th-percentile response time and the number of VMs used.

SUT                      | Big spike | Dual phase | Large variations | Quickly varying | Slowly varying | Steep tri phase
RobusT2Scale (rt_95)     |  973 ms   |  537 ms    |  509 ms          |  451 ms         |  423 ms        |  498 ms
RobusT2Scale (vm)        |  3.2      |  3.8       |  5.1             |  5.3            |  3.7           |  3.9
Overprovisioning (rt_95) |  354 ms   |  411 ms    |  395 ms          |  446 ms         |  371 ms        |  491 ms
Overprovisioning (vm)    |  6        |  6         |  6               |  6              |  6             |  6
Underprovisioning (rt_95)| 1465 ms   | 1832 ms    | 1789 ms          | 1594 ms         | 1898 ms        | 2194 ms
Underprovisioning (vm)   |  2        |  2         |  2               |  2              |  2             |  2

• RobusT2Scale is superior to under-provisioning in terms of guaranteeing the SLA, and it does not require excessive resources.
• RobusT2Scale is superior to over-provisioning in terms of the resources it requires, while still guaranteeing the SLA.

Page 30: Fuzzy Self-Learning Controllers for Elasticity Management in Dynamic Cloud Architectures

Robustness of RobusT2Scale

[Figure: root mean squared error for alpha = 0.1, 0.5, 0.9, and 1.0 under a 10% noise level.]

Page 31: Fuzzy Self-Learning Controllers for Elasticity Management in Dynamic Cloud Architectures

Self-Learning Controller

Page 32: Fuzzy Self-Learning Controllers for Elasticity Management in Dynamic Cloud Architectures

Initial work (SEAMS paper)

Current work:
- Updating K in MAPE-K @ runtime (QoSA'16, IEEE Cloud)
- Design-time assistance
- Multi-cloud

Page 33: Fuzzy Self-Learning Controllers for Elasticity Management in Dynamic Cloud Architectures

A Model-Free Reinforcement Learning Approach

[Diagram: an RL agent interacting with its environment (states S1-S5, actions a1-a6). At each step the agent observes the state s_t, receives a reward/punishment r_{t+1}, and takes an action according to its policy; in the value-function-based variant, the policy is derived from the learned value function Q(s, a) rather than established and calibrated by hand.]

Page 34: Fuzzy Self-Learning Controllers for Elasticity Management in Dynamic Cloud Architectures

FQL4KE: Logical Architecture

[Diagram: an autonomic controller containing a fuzzy logic controller (fuzzifier, inference engine, defuzzifier, rule base) and a fuzzy Q-learning component that updates the knowledge (rule base). Monitoring of the cloud application provides w and rt as the system state to the controller, and (w, rt, th, vm) plus the system goal to the learning component; the actuator applies the scaling action sa to the cloud platform.]

Page 35: Fuzzy Self-Learning Controllers for Elasticity Management in Dynamic Cloud Architectures

Fuzzy Q-Learning

Algorithm 1: Fuzzy Q-Learning
Require: γ, η, ε

1: Initialize q-values: q[i, j] = 0, 1 ≤ i ≤ N, 1 ≤ j ≤ J
2: Select an action for each fired rule:
   a_i = argmax_k q[i, k] with probability 1 − ε (Eq. 5)
   a_i = random{a_k, k = 1, 2, ..., J} with probability ε
3: Calculate the control action by the fuzzy controller:
   a = Σ_{i=1}^{N} α_i(s) × a_i (Eq. 1), where α_i(s) is the firing level of rule i
4: Approximate the Q function from the current q-values and the firing levels of the rules:
   Q(s(t), a) = Σ_{i=1}^{N} α_i(s) × q[i, a_i],
   where Q(s(t), a) is the value of the Q function for the current state s(t) in iteration t and the action a
5: Take action a and let the system go to the next state s(t+1).
6: Observe the reinforcement signal r(t+1) and compute the value of the new state:
   V(s(t+1)) = Σ_{i=1}^{N} α_i(s(t+1)) · max_k q[i, k]
7: Calculate the error signal:
   ΔQ = r(t+1) + γ × V_t(s(t+1)) − Q(s(t), a) (Eq. 4), where γ is a discount factor
8: Update q-values:
   q[i, a_i] = q[i, a_i] + η · ΔQ · α_i(s(t)) (Eq. 4), where η is a learning rate
9: Repeat the process for the new state until it converges.

The Q function under policy π is formally defined as:

Q^π(s, a) = E^π{ Σ_{k=0}^{∞} γ^k r_{t+k+1} },   (3)

where E^π{·} is the expectation under policy π. When an appropriate policy is found, the RL problem at hand is solved. Q-learning is a technique that does not require any specific policy in order to evaluate Q(s, a); therefore:

Q(s_t, a_t) ← Q(s_t, a_t) + η [ r_{t+1} + γ max_a Q(s_{t+1}, a) − Q(s_t, a_t) ],   (4)

where η is the learning rate. In this case, policy adaptation can be achieved by selecting a random action with probability ε and an action that maximizes the Q function in the current state with probability 1 − ε; the value of ε is determined by the exploitation/exploration strategy (cf. V-A):

a(s) = argmax_k Q(s, k)   (5)

The fuzzy Q-learning algorithm is summarized in Algorithm 1. In the case of our running example, the state space is finite (i.e., 9 states as the full combination of 3 × 3 membership functions for the fuzzy variables w and rt) and our controller has to choose a scaling action among 5 possible actions {−2, −1, 0, +1, +2}. However, the design methodology that we describe is general. Note that convergence is detected when the change in the consequent functions is negligible in each learning loop.
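A compact sketch of one Algorithm 1 iteration, split into action selection (steps 2-4) and the q-value update (steps 6-8). It assumes the firing levels α_i are normalized and uses the 5 scaling actions named above, but the helper names and hyper-parameter defaults are illustrative, not the paper's.

```python
import random
import numpy as np

ACTIONS = (-2, -1, 0, +1, +2)                 # candidate scaling actions

def select_action(q, firing, eps):
    """Steps 2-4: epsilon-greedy choice per rule, crisp action a, and Q(s(t), a).
    q is an N x J array; firing holds the (normalized) firing levels alpha_i(s(t))."""
    chosen = [random.randrange(len(ACTIONS)) if random.random() < eps
              else int(np.argmax(q[i])) for i in range(len(q))]
    a = sum(f * ACTIONS[j] for f, j in zip(firing, chosen))
    q_sa = sum(f * q[i, j] for i, (f, j) in enumerate(zip(firing, chosen)))
    return a, q_sa, chosen

def update_q(q, chosen, firing, firing_next, reward, q_sa, eta=0.1, gamma=0.8):
    """Steps 6-8: value of the next state, temporal-difference error, q-value update."""
    v_next = sum(f * np.max(q[i]) for i, f in enumerate(firing_next))
    dq = reward + gamma * v_next - q_sa
    for i, j in enumerate(chosen):
        q[i, j] += eta * dq * firing[i]

# 9 rules (3 x 3 membership functions for w and rt), 5 actions, all q-values start at 0.
q = np.zeros((9, len(ACTIONS)))
```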

D. FQL4KE for Dynamic Resource Allocation

In this work, for simplicity, only one instance of the fuzzy controller is integrated. Using multiple controllers is also possible, but due to the intricacies of updating a central Q-table, we consider this a natural future extension of this work for problem areas that require coordination between several controllers, see [9].

Reward function. The controller receives the current values of w and rt, which correspond to the state of the system, s(t) (cf. Step 4 in Algorithm 1). The control signal sa represents the action a that the controller takes at each loop. We define the reward signal r(t) based on three criteria: (i) the number of violations of the desired response time, (ii) the amount of resources acquired, and (iii) throughput, as follows:

r(t) = U(t) − U(t−1),   (6)

where U(t) is the utility value of the system at time t. Hence, if a controlling action leads to increased utility, the action is appropriate. If the reward is close to zero, the action is not effective. A negative reward (punishment) warns that the situation has become worse after taking the action. The utility function is defined as:

U(t) = w1 · th(t)/th_max + w2 · (1 − vm(t)/vm_max) + w3 · (1 − H(t))   (7)

H(t) = (rt(t) − rt_des)/rt_des   if rt_des ≤ rt(t) ≤ 2·rt_des
H(t) = 1                        if rt(t) ≥ 2·rt_des
H(t) = 0                        if rt(t) ≤ rt_des

where th(t), vm(t) and rt(t) are the throughput, the number of worker roles, and the response time of the system, respectively, and w1, w2 and w3 are their corresponding weights, determining their relative importance in the utility function. In order to aggregate the individual criteria, we normalize them depending on whether they should be maximized or minimized.
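A direct transcription of Eqs. (6)-(7) as code; the weights w1-w3 and the example values are placeholders, not values from the paper.

```python
def h(rt, rt_des):
    """Response-time penalty H(t) from Eq. (7)."""
    if rt <= rt_des:
        return 0.0
    if rt >= 2 * rt_des:
        return 1.0
    return (rt - rt_des) / rt_des

def utility(th, vm, rt, th_max, vm_max, rt_des, w1=0.4, w2=0.3, w3=0.3):
    """U(t) from Eq. (7): normalized throughput, resource, and response-time terms."""
    return w1 * th / th_max + w2 * (1 - vm / vm_max) + w3 * (1 - h(rt, rt_des))

# r(t) = U(t) - U(t-1), Eq. (6): positive if the last scaling action improved utility.
u_prev = utility(th=550, vm=4, rt=700, th_max=1000, vm_max=8, rt_des=600)
u_now = utility(th=580, vm=5, rt=580, th_max=1000, vm_max=8, rt_des=600)
reward = u_now - u_prev
```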

Knowledge base update. FQL4KE starts controlling the allocation of resources with no a priori knowledge. After enough exploration, the consequents of the rules can be determined by selecting the actions that correspond to the highest q-value in each row of the Q-table. Although FQL4KE does not rely on design-time knowledge, if even partial knowledge is available (i.e., the operator of the system is confident enough to provide some of the elasticity policies) or there exists data regarding the performance of the application, FQL4KE can exploit such knowledge by initializing the q-values (cf. step 1 in Algorithm 1). This leads to quicker learning convergence.

IV. IMPLEMENTATION

We implemented prototypes of FQL4KE on Microsoft Azure and OpenStack. As illustrated in Figure 4, the Azure prototype comprises three components integrated according to Figure 6:

(i) a learning component, FQL, implemented in Matlab.¹

¹ Code is available at https://github.com/pooyanjamshidi/Fuzzy-Q-Learning

[Figure: membership functions of the two fuzzy variables: Workload with terms Low, Medium, High over breakpoints α, β, γ, δ; Response Time with terms Bad, OK, Good over breakpoints λ, μ, ν.]


Code: https://github.com/pooyanjamshidi/Fuzzy-Q-Learning

Page 36: Fuzzy Self-Learning Controllers for Elasticity Management in Dynamic Cloud Architectures

FQL4KE: Implementation Architecture

[Diagram: the FQL learning component (parameterized by γ, η, ε and the reward r) produces learned rules as a .fis file consumed by RobusT2Scale; monitoring provides the system state (w, rt, th, vm); the actuator enacts the scaling action sa on ElasticBench, which runs on the cloud platform and is driven by a load generator over WCF/REST interfaces.]

Page 37: Fuzzy Self-Learning Controllers for Elasticity Management in Dynamic Cloud Architectures

ElasticBench: The Experimental Platform

[Diagram: a PaaS deployment with a load balancer (LB), a web role (L), processing worker roles (P), a management worker role (M), a queue, a cache, and storage for results and a blackboard; an on-premise load generator (LG console) drives the system, while the auto-scaling logic (controller) with a policy enforcer monitors and actuates scaling.]

Code: https://github.com/pooyanjamshidi/ElasticBench

Page 38: Fuzzy Self-Learning Controllers for Elasticity Management in Dynamic Cloud Architectures

Learning Strategies

[Figure: exploration probability over learning epochs for five learning strategies S1-S5.]
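The exact schedules behind S1-S5 are not spelled out on the slide; the following are merely typical exploration-probability schedules of the kind such strategies compare (all constants hypothetical).

```python
import math

def eps_constant(epoch, eps=0.5):
    """Flat exploration probability."""
    return eps

def eps_linear(epoch, start=1.0, end=0.05, horizon=300):
    """Linear decay from `start` to `end` over `horizon` learning epochs."""
    return max(end, start - (start - end) * epoch / horizon)

def eps_exponential(epoch, start=1.0, decay=0.01):
    """Exponential decay: heavy exploration early, mostly exploitation later."""
    return start * math.exp(-decay * epoch)
```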

Page 39: Fuzzy Self-Learning Controllers for Elasticity Management in Dynamic Cloud Architectures

Q-value Evolution

[Figure: evolution of the q-value q(9,3) over learning epochs, with periods of punishment, reward, and no change.]

Page 40: Fuzzy Self-Learning Controllers for Elasticity Management in Dynamic Cloud Architectures

Temporal Evolution of Acquired Nodes

[Figure: number of VMs over experiment epochs, moving from high exploration to less frequent exploration and finally mostly exploitation.]

Page 41: Fuzzy Self-Learning Controllers for Elasticity Management in Dynamic Cloud Architectures

Control Surface Evolution

[Figure: four snapshots (1-4) of the control surface as learning progresses.]

Page 42: Fuzzy Self-Learning Controllers for Elasticity Management in Dynamic Cloud Architectures

Experimental Results

Page 43: Fuzzy Self-Learning Controllers for Elasticity Management in Dynamic Cloud Architectures

Workload Injected into the System

[Figure: six synthetic workload patterns (requests over time): Big spike, Dual phase, Large variations, Quickly varying, Slowly varying, Steep tri phase.]

[Figure: two real traces of user requests: one over 24 hours (2:00-21:07), and one by date from 31 May 1998 to 12 July 1998 (about one and a half months).]

Page 44: Fuzzy Self-Learning Controllers for Elasticity Management in Dynamic Cloud Architectures

Results

III. FQL4KE is capable of automatically constructing the control rules and keeping the control parameters updated through fast online learning. It executes resource allocation and learns to improve its performance simultaneously.

IV. Unlike supervised techniques that learn from training data, FQL4KE does not require offline training, which saves a significant amount of time and effort.

Table 3. Comparison of the effectiveness of FQL4KE, RobusT2Scale and Azure auto-scaling under different workloads.

Approach           | Criteria | Big spike | Dual phase | Large variations | Quickly varying | Slowly varying | Steep tri phase
FQL4KE             | rt_95    | 1212 ms   | 548 ms     | 991 ms           | 1319 ms         | 512 ms         | 561 ms
FQL4KE             | vm       | 2.2       | 3.6        | 4.3              | 4.4             | 3.6            | 3.4
RobusT2Scale       | rt_95    | 1339 ms   | 729 ms     | 1233 ms          | 1341 ms         | 567 ms         | 512 ms
RobusT2Scale       | vm       | 3.2       | 3.8        | 5.1              | 5.3             | 3.7            | 3.9
Azure auto-scaling | rt_95    | 1409 ms   | 712 ms     | 1341 ms          | 1431 ms         | 1101 ms        | 1412 ms
Azure auto-scaling | vm       | 3.3       | 4          | 5.5              | 5.4             | 3.7            | 4

5. Conclusions

We propose a new learning-based self-adaptation framework, called MAPE-KE, which is particularly suited for engineering elastic systems that need to cope with uncertain environments, such as cloud and big data, and need to be robust by taking the human out of the loop. We implemented the FQL4KE cloud controller according to this framework to adjust the required resources while the application is running, addressing the key problems of uncertainty, latency and noise.

Fuzzy rules are at the core of the proposed solutions. Our approach enables qualitative specification of elasticity rules and allows the elasticity controller to handle even conflicting rules. The proposed controller is also robust against noisy measurements that might arise from prediction added to the control loop or from monitoring uncertainties. The online learning of rules is a complementary solution that provides a completely user-independent cloud controller design.

What emerges from this discussion is an understanding of cloud controllers as control-theory-based tools. Our solutions use fuzzy control theory combined with machine learning to deal specifically with uncertainty in its various forms.

- FQL4KE performs better than Azure's native auto-scaling service and better than or similarly to RobusT2Scale.
- FQL4KE can learn to acquire resources for dynamic cloud systems.
- FQL4KE is flexible enough to allow the operator to set different elasticity strategies.

Page 45: Fuzzy Self-Learning Controllers for Elasticity Management in Dynamic Cloud Architectures

Runtime Overhead

[Figure: runtime overhead of the monitoring, learning, and actuation phases (axis scale ×10^4).]

Page 46: Fuzzy Self-Learning Controllers for Elasticity Management in Dynamic Cloud Architectures

Insight: MAPE-K -> MAPE-KE

[Diagram: the base level is the elastic system and its environment (cloud, sensors, actuators); the meta level is a MAPE-K loop (Monitoring, Analysis, Planning, Execution over shared Knowledge); a meta-meta level (KE) updates the knowledge through offline training by users and online learning.]

Page 47: Fuzzy Self-Learning Controllers for Elasticity Management in Dynamic Cloud Architectures

Current work

- Implementation on OpenStack, with Intel, Ireland
- Policy learning through other ML techniques (e.g., GP, BO)

Page 48: Fuzzy Self-Learning Controllers for Elasticity Management in Dynamic Cloud Architectures

[Summary slide: recaps the motivating challenges (Challenge 1: ~75% wasted capacity relative to actual demand; Challenge 2: customers lost), the FQL4KE logical architecture (fuzzy logic controller plus fuzzy Q-learning inside an autonomic controller), and the FQL4KE implementation architecture on top of RobusT2Scale and ElasticBench.]

Page 49: Fuzzy Self-Learning Controllers for Elasticity Management in Dynamic Cloud Architectures

Thank you!

More details? => http://www.doc.ic.ac.uk/~pjamshid/PDF/qosa16.pdf
Slides? => http://www.slideshare.net/pooyanjamshidi/
Code? => https://github.com/pooyanjamshidi

Page 50: Fuzzy Self-Learning Controllers for Elasticity Management in Dynamic Cloud Architectures

Submit to CloudWays 2016!

Paper submission deadline: July 1st, 2016
Decision notification: August 1st, 2016
Final version due: August 8th, 2016
Workshop date: September 5th, 2016

Collocated with ESOCC, Vienna. Topics: Cloud Architecture, Big Data, DevOps.
Details: https://sites.google.com/site/cloudways2016/call-for-papers