a game-theoretic resource manager for rt applications
DESCRIPTION
Presentation given at ECRTS 2013.
Code: https://github.com/martinamaggio/gtrm
Paper abstract: The management of resources among competing QoS-aware applications is often solved by a resource manager (RM) that assigns both the resources and the application service levels. However, this approach requires all applications to inform the RM of the available service levels. Then, the RM has to maximize the "overall quality" by comparing service levels of different applications which are not necessarily comparable. In this paper we describe a Linux implementation of a game-theoretic framework that decouples the two distinct problems of resource assignment and quality setting, solving them in the domain where they naturally belong to. By this approach the RM has linear time complexity in the number of the applications. Our RM is built over the SCHED_DEADLINE Linux scheduling class.
TRANSCRIPT
A Game-Theoretic Resource Manager for RT Applications
Martina Maggio, Enrico Bini, Georgios Chasparis, Karl-Erik Årzén
Lund University, Department of Automatic Control
Motivation
• Multiple applications sharing the same computing platform, especially in embedded systems
• Applications support multiple service levels with different execution requirements and quality of service
• Example: SL1: 640x480 (CPU: 30%), SL2: 800x600 (CPU: 60%), SL3: 1024x768 (CPU: 90%)
Outline
• Problem formulation
• The architecture
• Formal guarantees
• Implementation
• Experimental evaluation
• Conclusion and future work
Problem formulation
• Select:
the resource allocation
the service levels of the applications
to maximize the quality of the overall computation
The resource allocation naturally belongs to the resource management domain
The service level naturally belongs to the application domain
Existing approaches
• Centralized solution: the resource manager chooses both the resource allocation and the application service levels
[Figure: block diagram. Labels: OS, RM, applications app1 ... appn, service levels s1 ... sn, virtual platforms v1 ... vn, sensors f1 ... fn.]
Drawbacks:
• The resource manager solves an ILP problem, with high time complexity
• The resource manager compares service levels from different applications
• The resource manager requires a lot of information from the applications
Our contribution
• Decoupling service level assignment and resource allocation
• Formal guarantees that the operating point is a good match between the resource assigned and the service level
• Implemented in Linux with SCHED_DEADLINE
The application selects the service level
The resource manager assigns resources
Low information exchange between the two entities and no unwanted comparisons
Assumptions
• Every application is made of jobs
every job has a deadline (the expected execution time) and a real execution time
The architecture
[Figure: block diagram of the proposed architecture. Labels: OS, RM, applications app1 ... appn (service setting, service aware or service unaware), service levels s1 ... sn, weights λ1 ... λn between 0 and 1, virtual platforms v1 ... vn, sensors f1 ... fn.]
The application has a service level si - known only by the application itself
The resource manager selects the virtual platform vi assigned within a CBS scheduler
A weight λi determines how the adaptation should be done and who should adjust more
λi = 0: the application does all the adaptation by changing its service level
λi = 1: the resource manager adjusts by setting the virtual platform
Both the application and the resource manager sense a matching function fi
The matching function
[Figure: plot of the matching function fi over time; vertical axis from -5 to 3.75.]
The matching is abundant (fi > 0): increase si or decrease vi
The matching is scarce (fi < 0): decrease si or increase vi
The matching is perfect (fi = 0): do nothing
• The purpose of the resource manager is to find the allocation where the matching functions of all the applications are as close as possible to zero
Both the application and the resource manager should be able to measure the matching function independently
• Defines how good the match is between the resource assigned to an application and its current service level:
increases with vi
decreases with si
fi = βi vi / si - 1
βi: application-dependent constant
Not measurable!
• In order to get a function that both the resource manager and the application can measure, we chose to use:
fi = deadline / response time - 1 = -lateness / response time
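The measurable form of the matching function can be sketched in a few lines of Python. This is only an illustration, not the code from the gtrm repository: the function names, the use of seconds as the time unit, and the tolerance band around zero are assumptions made for the example.

```python
def matching_function(start, stop, deadline):
    """Return f = deadline / response_time - 1 for a single job.

    start, stop: job start and stop timestamps, in seconds (assumed unit).
    deadline: relative deadline of the job, in seconds.
    """
    response_time = stop - start
    if response_time <= 0:
        raise ValueError("stop must be later than start")
    return deadline / response_time - 1.0


def classify(f, tolerance=0.05):
    """Label the match. The tolerance band is an illustrative assumption."""
    if f > tolerance:
        return "abundant"   # job finished early: raise s_i or lower v_i
    if f < -tolerance:
        return "scarce"     # job finished late: lower s_i or raise v_i
    return "perfect"
```

A job with a deadline of 1 s that responds in 2 s has a lateness of 1 s, so both expressions agree: 1/2 - 1 = -0.5 and -1/2 = -0.5.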
Virtual platform
• The resource manager is designed with a game-theoretic approach and changes the virtual platform assignments according to
vi = vi - ε [λi fi - vi Σi (λi fi)]
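One step of this update rule could look like the following Python sketch. The function name, the list-based interface and the default step size are assumptions for illustration; clipping of the result to a valid bandwidth range is not shown. The weighted sum is computed once and reused, so a step costs time linear in the number of applications.

```python
def update_virtual_platforms(v, f, lam, eps=0.1):
    """Apply one step of v_i <- v_i - eps * (lam_i * f_i - v_i * sum_j(lam_j * f_j)).

    v:   current virtual platforms (bandwidths), one per application
    f:   measured matching functions, one per application
    lam: weights lambda_i, one per application
    eps: step size (the default value is an assumption)
    """
    if not (len(v) == len(f) == len(lam)):
        raise ValueError("v, f and lam must have the same length")
    # First pass: the weighted sum shared by every application.
    weighted_sum = sum(l_i * f_i for l_i, f_i in zip(lam, f))
    # Second pass: the per-application update.
    return [
        v_i - eps * (l_i * f_i - v_i * weighted_sum)
        for v_i, f_i, l_i in zip(v, f, lam)
    ]
```

With v = [0.5, 0.5], f = [1.0, -0.5] and λ = [0.5, 0.5], the first application (abundant match) drops to about 0.4625 and the second (scarce match) rises to about 0.5375. When the virtual platforms sum to 1 before the step, they still sum to 1 after it, because the two terms in the bracket cancel when summed over all applications.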
Formal guarantees
• The allocation converges to a stationary point where either the applications are performing reasonably well or they cannot reduce their service level anymore
If a stationary point where all the matching functions are zero exists, it is reached
If multiple such points exist, the assignment will depend on the weights λi
Implementation
• In TrueTime (RT kernel simulator)
• In Linux with SCHED_DEADLINE*, a Linux scheduling class implementing global EDF and CBS
* Juri Lelli, Giuseppe Lipari, Dario Faggioli, Tommaso Cucinotta. An efficient and scalable implementation of global EDF in Linux. International Workshop on Operating Systems Platforms for Embedded Real-Time Applications (OSPERT), Porto (Portugal), July 2011.
https://github.com/martinamaggio/gtrm
Communication
• The application communicates the start and stop time of each job via shared memory, which contains the values needed to compute the matching function
Shared memory communication is provided transparently with a library
https://github.com/martinamaggio/jobsignal
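The idea of exchanging job timestamps through shared memory can be illustrated with a small Python sketch. It does not reproduce the jobsignal library: the record layout (start, stop and relative deadline as three 64-bit floats per application) and all function names are hypothetical, and an anonymous `mmap` region stands in for a named segment shared between processes.

```python
import mmap
import struct

# Hypothetical record layout: job start, job stop, relative deadline (float64 each).
RECORD = struct.Struct("<ddd")


def make_region(n_apps):
    """Create an anonymous memory region with one record per application."""
    return mmap.mmap(-1, n_apps * RECORD.size)


def write_job(region, index, start, stop, deadline):
    """Application side: publish the timing of the last job."""
    RECORD.pack_into(region, index * RECORD.size, start, stop, deadline)


def read_matching_function(region, index):
    """Reader side: recompute f = deadline / response_time - 1 from the record."""
    start, stop, deadline = RECORD.unpack_from(region, index * RECORD.size)
    return deadline / (stop - start) - 1.0
```

Because both sides read the same record, the application and the resource manager can each compute the matching function without any further message exchange.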
Application
One job consumes acpu si + bcpu CPU and amem si + bmem memory
while (!finished) {              /* one iteration per job */
    signal_job_start();          /* write on shared memory */
    if (needed)                  /* e.g.: every 10 jobs */
        change_service_level();  /* si: based on performance function */
    do_work();                   /* wi = [acpu si + bcpu, amem si + bmem] */
    signal_job_end();            /* write on shared memory */
}
Resource manager
for i in applications {           /* one pass over the n applications: O(n) */
    read_matching_functions();    /* read from shared memory */
    compute_virtual_platforms();  /* vi = vi - ε [λi fi - vi Σi (λi fi)] */
    set_virtual_platforms();      /* set SCHED_DEADLINE budget */
    send_app_indications();       /* write on shared memory */
}
Time complexity: O(n)
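Setting a SCHED_DEADLINE budget from a virtual platform amounts to choosing a runtime that is the fraction vi of the reservation period. The sketch below shows only that conversion, under assumptions: the function name and the default period are hypothetical, and the dictionary keys mirror the `sched_runtime`, `sched_deadline` and `sched_period` fields (in nanoseconds) of the Linux `struct sched_attr`. The `sched_setattr` system call itself is not invoked here.

```python
def reservation_for(v, period_ns=100_000_000):
    """Translate a bandwidth v in (0, 1] into SCHED_DEADLINE-style parameters.

    The default period (100 ms) is an assumption for illustration.
    """
    if not 0.0 < v <= 1.0:
        raise ValueError("bandwidth must be in (0, 1]")
    if period_ns <= 0:
        raise ValueError("period must be positive")
    runtime_ns = int(v * period_ns)  # budget granted in every period
    return {
        "sched_runtime": runtime_ns,
        "sched_deadline": period_ns,  # implicit deadline: equal to the period
        "sched_period": period_ns,
    }
```

For example, a virtual platform of 0.25 with a 100 ms period corresponds to a 25 ms budget per period.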
Convergence test
• The first experiment is a single core experiment to test the convergence of virtual platforms to their predicted values
λ1 = 0.1, λ2 = 0.3, λ3 = 0.2, λ4 = 0.5
[Figure: virtual platforms (vi) of app1, app2, app3 and app4 versus time (sec), from 0 to 5 sec.]
Multicore
• The second one is a multicore experiment, where applications are demanding CPU and memory and have different weights
[Figure: virtual platforms (vi), matching functions (fi) and service levels (si) versus time (sec), from 0 to 10 sec, for app1...8, app9, app10, app11 and app12.]
Overhead
• The third experiment is still multicore and tests the overhead of the resource manager and its linear time complexity
[Figure: run-time (µsec) of the resource manager versus number of applications (n), with n from 0 to 25.]
Conclusion
• Decoupling the responsibility of choosing the applications' service levels and the CPU allocation
• Resource allocation has linear time complexity in the number of applications
• Guarantees in terms of zero-matching functions whenever a feasible allocation exists
Future work
• Managing asynchronous application updates [✔]
• Apply to web server jobs for the cloud [✔]
• Address multithreaded applications (SCHED_DEADLINE → cgroups)
• Deal with cheating applications
• Apply to different resources, like memory and network bandwidth
Thanks for the attention
Questions?
email: [email protected]
code: https://github.com/martinamaggio/jobsignal and https://github.com/martinamaggio/gtrm