A Lookup-table Driven Approach to Partitioned Scheduling
Bipasa Chattopadhyay, Sanjoy Baruah
Department of Computer Science, The University of North Carolina at Chapel Hill
-
Partitioned Scheduling
Given - a task system and m processors
Objective - partition the tasks onto the processors
Advantages - Good cache affinity, no migration delay
-
Implicit Deadline Sporadic Task System
Each task has a
  Worst Case Execution Time, $C_i$
  Time Period, $T_i$
  Deadline, $D_i = T_i$
  Utilization, $u_i = C_i / T_i$
-
Partitioned Scheduling
Implicit Deadline Sporadic Task System: utilization $u_i = C_i / T_i$
Each processor is identical and has computing capacity = 1
The scheduling algorithm on each processor is pre-emptive EDF
  EDF is an optimal uniprocessor scheduling algorithm: the tasks on a processor are schedulable if the sum of their utilizations is $\le 1$
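The per-processor EDF condition can be checked directly from the task parameters. The following is a minimal sketch, not code from the talk: the function names and the sample (C_i, T_i) pairs are hypothetical, and exact fractions are used only to avoid rounding at the boundary.

```python
from fractions import Fraction

def utilization(wcet, period):
    """Utilization u_i = C_i / T_i of an implicit-deadline sporadic task."""
    return Fraction(wcet, period)

def edf_schedulable(tasks):
    """EDF test on one unit-capacity processor: total utilization <= 1."""
    return sum(utilization(c, t) for c, t in tasks) <= 1

# Hypothetical (C_i, T_i) pairs, chosen only for illustration.
tasks = [(1, 4), (2, 5), (3, 10)]
print(edf_schedulable(tasks))               # 1/4 + 2/5 + 3/10 = 19/20
print(edf_schedulable(tasks + [(1, 10)]))   # 21/20 exceeds capacity 1
```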
-
Partitioning tasks is equivalent to bin-packing
Bin-packing is NP-hard in the strong sense
Hochbaum and Shmoys designed a polynomial time approximation scheme (PTAS)
If an optimal algorithm can partition a task system onto m processors each of computing capacity 1, then a PTAS algorithm can partition the task system onto m processors each of computing capacity $1 + \epsilon$ ($0 < \epsilon < 1$) in polynomial time.
-
The proposed PTAS partitioning algorithm has a large run time
We suggest a lookup-table driven approach
  Pre-compute the lookup-table
  The results of the expensive computation are stored in a lookup-table
  The table is consulted for task partitioning
-
Computing the Lookup-table
The lookup-table is computed only once for a given value of m and $\epsilon$
-
Computing the Lookup-table
Choosing $\epsilon$
Determining Utilization Values
Single Processor Configurations
Multiprocessor Configurations
-
Choosing $\epsilon$
If an optimal algorithm can partition a task system onto m processors each of computing capacity 1, then a PTAS algorithm can partition the task system onto m processors each of computing capacity $1 + \epsilon$ in polynomial time ($0 < \epsilon < 1$).
-
Choosing $\epsilon$
Scale both capacities by $\frac{1}{1+\epsilon}$:
If an optimal algorithm can partition a task system onto m processors each of computing capacity $\frac{1}{1+\epsilon}$, then a PTAS algorithm can partition the task system onto m processors each of computing capacity 1 in polynomial time ($0 < \epsilon < 1$).
-
For example - system utilization loss = 10%
  $\frac{1}{1+\epsilon} \ge 0.9$
  $\epsilon \le \frac{1}{9} \approx 0.11$
  $\epsilon = 0.1$
The size of the lookup-table is a function of m and $\epsilon$
If $\epsilon$ decreases, the size of the lookup-table and the time needed to compute it increase
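The arithmetic in this example generalizes: a tolerated utilization loss bounds $\epsilon$ from above, since $\frac{1}{1+\epsilon} \ge 1 - \text{loss}$ rearranges to $\epsilon \le \frac{\text{loss}}{1 - \text{loss}}$. A small sketch of that rearrangement follows; the function name is hypothetical and not taken from the talk.

```python
from fractions import Fraction

def epsilon_for_loss(loss):
    """Largest epsilon with 1/(1+epsilon) >= 1 - loss, i.e. loss / (1 - loss)."""
    loss = Fraction(loss)
    if not 0 < loss < 1:
        raise ValueError("loss must lie strictly between 0 and 1")
    return loss / (1 - loss)

eps_max = epsilon_for_loss(Fraction(1, 10))   # 10% utilization loss
print(eps_max, float(eps_max))
```

Any $\epsilon$ at or below this bound meets the loss target, so rounding 1/9 down to 0.1 is safe.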
-
Determining Utilization Values
Let tasks have fixed utilization values
$V(\epsilon)$ - set of fixed utilization values
$V(\epsilon) = \{\epsilon,\ \epsilon(1+\epsilon),\ \epsilon(1+\epsilon)^2,\ \ldots,\ \epsilon(1+\epsilon)^j\}$, where j is the largest non-negative integer such that $\epsilon(1+\epsilon)^j \le 1$
-
Let $\epsilon = 0.3$ (running example)
j    util. value $(0.3)(1.3)^j$
0    0.3000
1    0.3900
2    0.5070
3    0.6591
4    0.8568
5    1.114 (exceeds 1, so it is excluded)
V(0.3) = {0.3000, 0.3900, 0.5070, 0.6591, 0.8568}
As $\epsilon$ decreases the number of utilization values in $V(\epsilon)$ increases
If $\epsilon = 0.1$, size of $V(\epsilon)$ = 25
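The set of utilization values follows mechanically from its definition. Here is a minimal sketch, assuming exact rational arithmetic; the function name `utilization_values` is illustrative and not from the talk.

```python
from fractions import Fraction

def utilization_values(eps):
    """Return V(eps): eps * (1 + eps)**j for j = 0, 1, ... while the value is <= 1."""
    eps = Fraction(eps)
    if not 0 < eps < 1:
        raise ValueError("eps must lie strictly between 0 and 1")
    values = []
    v = eps
    while v <= 1:
        values.append(v)
        v *= 1 + eps
    return values

v03 = utilization_values(Fraction(3, 10))
print([round(float(v), 4) for v in v03])             # the five values of V(0.3)
print(len(utilization_values(Fraction(1, 10))))      # size of V(0.1)
```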
-
As $\epsilon$ decreases -
  The solution is closer to optimal - less inflation, less utilization loss
  The size of the lookup-table and the time needed to compute it increase
-
Single Processor Configurations
How many different tasks with utilization in $V(\epsilon)$ can be assigned to a single processor?
V(0.3) = {0.3000, 0.3900, 0.5070, 0.6591, 0.8568}
Configurations determine the number of tasks of each utilization
  ⟨3, 0, 0, 0, 0⟩ - 3 tasks of $u_i = 0.3$ are assigned to a processor
  ⟨1, 1, 0, 0, 0⟩ - 1 task of $u_i = 0.3$ and 1 task of $u_j = 0.39$ are assigned to a processor
-
V(0.3) = {0.3000, 0.3900, 0.5070, 0.6591, 0.8568}
Valid configuration - sum of utilizations of tasks $\le 1$
  ⟨3, 0, 0, 0, 0⟩ ($0.3 + 0.3 + 0.3 \le 1$)
  ⟨1, 1, 0, 0, 0⟩ ($0.3 + 0.39 \le 1$)
Maximal configuration - no other configuration can fit more tasks
  ⟨3, 0, 0, 0, 0⟩ is a maximal configuration
  ⟨1, 1, 0, 0, 0⟩ is not a maximal configuration ($0.3 + 0.39 + 0.3 = 0.99$)
  ⟨2, 1, 0, 0, 0⟩ is a maximal configuration. We say that ⟨2, 1, 0, 0, 0⟩ dominates ⟨1, 1, 0, 0, 0⟩
-
The lookup-table for $\epsilon = 0.3$
Maximal single processor configurations
Config ID   0.3000   0.3900   0.5070   0.6591   0.8568
A           3        0        0        0        0
B           2        1        0        0        0
C           1        0        1        0        0
D           1        0        0        1        0
E           0        2        0        0        0
F           0        1        1        0        0
G           0        0        0        0        1
The lookup-table for $\epsilon = 0.1$ has 9604 maximal single processor configurations
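The maximal single processor configurations can be enumerated by brute force. The sketch below is one possible way to do it, not the authors' implementation. It relies on an observation that holds under the dominance definition used in the slides: a valid configuration is maximal exactly when even the smallest utilization value no longer fits. The function names are hypothetical.

```python
from fractions import Fraction

def utilization_values(eps):
    """V(eps): eps * (1 + eps)**j for j = 0, 1, ... while the value is <= 1."""
    values, v = [], Fraction(eps)
    while v <= 1:
        values.append(v)
        v *= 1 + Fraction(eps)
    return values

def maximal_single_configs(values):
    """Enumerate count vectors with total utilization <= 1 to which
    no further task (not even the smallest) can be added."""
    smallest = min(values)
    result = []

    def extend(index, counts, used):
        if index == len(values):
            if used + smallest > 1:          # nothing more fits: maximal
                result.append(tuple(counts))
            return
        n = 0
        while used + n * values[index] <= 1:
            extend(index + 1, counts + [n], used + n * values[index])
            n += 1

    extend(0, [], Fraction(0))
    return result

configs = maximal_single_configs(utilization_values(Fraction(3, 10)))
for c in sorted(configs, reverse=True):      # same order as rows A..G
    print(c)
```

For $\epsilon = 0.3$ this yields the seven rows A to G of the table; the sketch has not been checked against the 9604 figure for $\epsilon = 0.1$.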
-
Multiprocessor Configurations
Maximal single processor configurations are used to determine maximal m processor configurations
If m = 2, combine any pair of maximal single processor configurations
Single-proc. Config ID   Configuration
<A>                      3 0 0 0 0
<B>                      2 1 0 0 0
<C>                      1 0 1 0 0
<D>                      1 0 0 1 0
<E>                      0 2 0 0 0
<F>                      0 1 1 0 0
<G>                      0 0 0 0 1
Example -
  ⟨1, 2, 1, 0, 0⟩  <C, E>
  ⟨2, 2, 1, 0, 0⟩  <B, F>
-
V(0.3) = {0.3000, 0.3900, 0.5070, 0.6591, 0.8568}
Valid configuration - sum of utilizations of tasks on each processor $\le 1$
  ⟨1, 2, 1, 0, 0⟩  <C, E>
  ⟨2, 2, 1, 0, 0⟩  <B, F>
Combining 2 maximal single processor configurations results in a valid m = 2 processor configuration.
Maximal configuration - a configuration is maximal if no other configuration dominates it
  Configuration <B, F> dominates <C, E>
  Remove configurations that are not maximal
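The combine-then-prune step can be sketched as follows. This is an illustrative reading of the slides rather than the authors' code: dominance is assumed to be componentwise comparison of count vectors, the names `dominates` and `combine` are hypothetical, and the single processor table for $\epsilon = 0.3$ is copied in as literal data.

```python
# Maximal single-processor configurations for epsilon = 0.3 (rows A..G).
SINGLE = {
    "A": (3, 0, 0, 0, 0), "B": (2, 1, 0, 0, 0), "C": (1, 0, 1, 0, 0),
    "D": (1, 0, 0, 1, 0), "E": (0, 2, 0, 0, 0), "F": (0, 1, 1, 0, 0),
    "G": (0, 0, 0, 0, 1),
}

def dominates(a, b):
    """True if a holds at least as many tasks of every utilization as b, and a != b."""
    return a != b and all(x >= y for x, y in zip(a, b))

def combine(table_a, table_b):
    """Add every pair of count vectors, then keep only the non-dominated sums.
    Tables map a count vector to the tuple of single-processor config IDs behind it."""
    merged = {}
    for vec_a, ids_a in table_a.items():
        for vec_b, ids_b in table_b.items():
            vec = tuple(x + y for x, y in zip(vec_a, vec_b))
            merged.setdefault(vec, tuple(sorted(ids_a + ids_b)))
    return {v: ids for v, ids in merged.items()
            if not any(dominates(w, v) for w in merged)}

single = {vec: (name,) for name, vec in SINGLE.items()}
two_proc = combine(single, single)
print(two_proc[(2, 2, 1, 0, 0)])       # ('B', 'F')
print((1, 2, 1, 0, 0) in two_proc)     # False: <C, E> is dominated by <B, F>
four_proc = combine(two_proc, two_proc)
print(four_proc[(0, 3, 3, 0, 1)])
```

Keeping the config IDs alongside each count vector means a later lookup can return the per-processor assignment directly.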
-
The lookup-table consists of only maximal m processor configurations
We compute maximal m processor configurations from maximal k processor configurations, k < m
If m = 4, combine any pair of maximal 2 processor configurations
0.3000   0.3900   0.5070   0.6591   0.8568   Single-proc. Config ID
3        2        1        2        0        <D, D, E, C>
3        4        2        0        0        <F, F, E, A>
0        3        3        0        1        <F, F, F, G>
4        1        1        1        1        <G, D, C, B>
4        0        1        3        0        <D, D, D, C>
-
Partitioning Tasks in Polynomial Time
-
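Utilization Inflation
Inflate the utilization value of each task to the nearest utilization value in $V(\epsilon)$ that is at least as large
Tasks with $u_i \ge \frac{\epsilon}{1+\epsilon}$ (large tasks) are inflated by at most a factor of $(1 + \epsilon)$
Inflation factor = $\frac{\epsilon(1+\epsilon)^{j+1}}{u_i}$, where $\epsilon(1+\epsilon)^{j+1}$ is the value in $V(\epsilon)$ to which $u_i$ is inflated
4 1 1 1 1 <G, D, C, B>
4 0 1 3 0 <D, D, D, C>
Output = <F, F, F, G>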
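The inflation rule for large tasks can be written as a short function. This is a sketch under stated assumptions: the function names are hypothetical, small tasks are signalled with `None`, and raising an error for a task above the largest value in $V(\epsilon)$ is a choice made here, since the slides do not say how such a task is treated.

```python
from fractions import Fraction

def utilization_values(eps):
    """V(eps): eps * (1 + eps)**j for j = 0, 1, ... while the value is <= 1."""
    values, v = [], Fraction(eps)
    while v <= 1:
        values.append(v)
        v *= 1 + Fraction(eps)
    return values

def inflate(u, eps):
    """Return the inflated utilization of a large task, or None for a small task."""
    u, eps = Fraction(u), Fraction(eps)
    if u < eps / (1 + eps):
        return None                  # small task: left as-is, handled in Phase 2
    for v in utilization_values(eps):
        if v >= u:
            return v                 # smallest value in V(eps) that is >= u
    # Assumption: the slides do not cover tasks above the largest value in V(eps).
    raise ValueError("utilization exceeds the largest value in V(eps)")

eps = Fraction(3, 10)
for u in [Fraction(1, 5), Fraction(1, 3), Fraction(2, 5), Fraction(3, 4)]:
    v = inflate(u, eps)
    print(u, "small" if v is None else (float(v), float(v / u)))
```

The second number printed for each large task is its inflation factor, which stays at or below $1 + \epsilon$.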
-
Given task system - 1/5, 1/5, 1/3, 7/20, 9/25, 2/5, 1/2, 1/2, 3/4
Inflate large tasks - 1/5, 1/5, 0.3900, 0.3900, 0.3900, 0.5070, 0.5070, 0.5070, 0.8568
⟨0, 3, 3, 0, 1⟩ (Input) ⇒ Lookup-table ⇒ <F, F, F, G> (Output)
Phase 1 - Large tasks are assigned as specified by configuration <F, F, F, G>
  <F> = ⟨0, 1, 1, 0, 0⟩ ; <G> = ⟨0, 0, 0, 0, 1⟩
Single-proc. Config ID   Task Assignment   Remaining Capacity
F                        1/3, 2/5          0.2667
F                        7/20, 1/2         0.15
F                        9/25, 1/2         0.14
G                        3/4               0.25
-
Phase 2 - Small tasks are accommodated in the remaining capacity
Single-proc. Config ID   Task Assignment   Remaining Capacity
F                        1/3, 2/5, 1/5     0.0667
F                        7/20, 1/2         0.15
F                        9/25, 1/2         0.14
G                        3/4, 1/5          0.05
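The two phases of this worked example can be reproduced in a few lines. The sketch below is illustrative only: the function names are hypothetical, the configuration <F, F, F, G> is supplied by hand instead of being looked up, and first-fit is assumed for Phase 2 because the slides say only that small tasks go into the remaining capacity.

```python
from fractions import Fraction

VALUES = [Fraction(3, 10) * Fraction(13, 10) ** j for j in range(5)]   # V(0.3)
THRESHOLD = Fraction(3, 13)                      # eps / (1 + eps) for eps = 0.3
CONFIGS = {"F": (0, 1, 1, 0, 0), "G": (0, 0, 0, 0, 1)}

def size_class(u):
    """Index in VALUES of the smallest value >= u (assumes u <= max(VALUES))."""
    return next(i for i, v in enumerate(VALUES) if v >= u)

def partition(tasks, config_ids):
    """Phase 1: place large tasks per configuration. Phase 2: first-fit small tasks."""
    queues = [[] for _ in VALUES]
    for u in tasks:
        if u >= THRESHOLD:
            queues[size_class(u)].append(u)
    processors = []
    for cid in config_ids:
        assigned = []
        for cls, count in enumerate(CONFIGS[cid]):
            for _ in range(count):
                if queues[cls]:
                    assigned.append(queues[cls].pop(0))
        processors.append(assigned)
    if any(queues):
        return None                              # configuration does not cover all large tasks
    for u in (t for t in tasks if t < THRESHOLD):
        target = next((p for p in processors if sum(p) + u <= 1), None)
        if target is None:
            return None                          # Phase 2 fails
        target.append(u)
    return processors

tasks = [Fraction(1, 5), Fraction(1, 5), Fraction(1, 3), Fraction(7, 20),
         Fraction(9, 25), Fraction(2, 5), Fraction(1, 2), Fraction(1, 2), Fraction(3, 4)]
result = partition(tasks, ["F", "F", "F", "G"])
for p in result:
    print([str(u) for u in p], round(float(1 - sum(p)), 4))
```

With the tasks taken in the listed order, the assignments and remaining capacities match the Phase 2 table.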
-
Performance Guarantee
If the partitioning algorithm fails to partition the tasks in the task system, then no algorithm can partition the task system on a platform of m processors each of computing capacity $\frac{1}{1+\epsilon}$
Phase 1 fails - tasks with utilization $\ge \frac{\epsilon}{1+\epsilon}$ are inflated by at most $(1 + \epsilon)$
  If the inflated tasks cannot be partitioned onto m processors each of capacity 1, then the unmodified tasks cannot be partitioned onto m processors each of capacity $\frac{1}{1+\epsilon}$
Phase 2 fails - tasks with utilization $< \frac{\epsilon}{1+\epsilon}$ cannot be accommodated
  The sum of utilizations of the tasks assigned to each processor must be $> 1 - \frac{\epsilon}{1+\epsilon}$
  $\Rightarrow$ total utilization of the task system $> m\left(1 - \frac{\epsilon}{1+\epsilon}\right) = m\left(\frac{1}{1+\epsilon}\right)$
-
Context and Summary
Build the lookup-table during platform synthesis
Ship the platform with the lookup-table
-
Context and Summary
Store the lookup-table on a central server
-
Future Work
What happens if the exact configuration does not exist in the table?
  Configuration ⟨0, 3, 3, 0, 1⟩ was easy to find; what if the configuration was ⟨0, 2, 3, 0, 0⟩?
  Need a good heuristic for the lookup function.
Partition the tasks onto processors and memory.
-
Thank You
-
Does better than First Fit -
  Task system 0.25, $0.25 + \delta$, 0.50, $0.50 + \delta$ (total utilization $1.5 + 2\delta$)
For m = 2 processors first fit cannot partition this task system. Our partitioning algorithm can.
$\epsilon = 0.1$, $\delta = 0.01$: tasks 0.25, 0.26, 0.50, 0.51 are inflated to 0.259, 0.285, 0.505, 0.556
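The first-fit failure on this task system is easy to reproduce. The sketch below assumes first fit considers the tasks in the order listed on the slide; the function name is hypothetical.

```python
from fractions import Fraction

def first_fit(tasks, m):
    """Place each task, in the given order, on the first processor with room.
    Returns the per-processor assignment, or None if some task does not fit."""
    processors = [[] for _ in range(m)]
    for u in tasks:
        for p in processors:
            if sum(p) + u <= 1:
                p.append(u)
                break
        else:
            return None
    return processors

tasks = [Fraction(25, 100), Fraction(26, 100), Fraction(50, 100), Fraction(51, 100)]
print(first_fit(tasks, 2))     # None: 0.25 and 0.26 share a processor, 0.51 fits nowhere
# A feasible partition exists: {0.25, 0.51} and {0.26, 0.50}, each totalling 0.76.
print(first_fit([tasks[0], tasks[3], tasks[1], tasks[2]], 2))
```

The inflated values also fit on two processors (0.259 + 0.556 and 0.285 + 0.505 are both below 1), which is consistent with the claim that the lookup-table approach succeeds here.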
-
Size of lookup-table for $\epsilon = 0.15$ (number of configurations in parentheses) -
  m = 1: 14 KB (252)
  m = 2: 240 KB (8523)
  m = 3: 3 MB (122599)
Size of lookup-table for $\epsilon = 0.1$ -
  m = 1: 250 KB (9604)
  m = 2: tens of MB
  m = 3: hundreds of MB