Dynamic Matching Models
Ana Busic
Inria Paris - Rocquencourt, CS Department of Ecole normale superieure
joint work with Varun Gupta, Jean Mairesse and Sean Meyn
3rd Workshop on Cognition and Control, January 16 & 17, 2015, University of Florida
Bipartite matching model: Model description
Dynamic Bipartite Matching Model
Static model: long history in economics. Finding Stable Matches: 2012 Nobel Prize awarded to L. S. Shapley.
Dynamic model introduced by Caldentey, Kaplan, and Weiss (2009). Multiclass queueing model in which Supply and Demand play symmetric roles.
Discrete-time queueing model with two types of arrival: “supply” and “demand”.
Arrivals of Supply/Demand are i.i.d., with
$|A^D(t)| = |A^S(t)|$ for all $t$.
Instantaneous matchings according to a bipartite matching graph.
Supply/Demand that cannot be matched are stored in a buffer.
Matching in Health-care
Matching Kidneys and Donors
Who can join this program?
For recipients: If you are eligible for a kidney transplant and are receiving care at a transplant center in the United States, you can join ... You must have a living donor who is willing and medically able to donate his or her kidney ...
For donors: You must also be willing to take part ...
(United Network for Organ Sharing, Talking About Transplantation)
Matching Policies
Model specified by 1) a matching graph, 2) a joint probability measure µ for arrivals of Supply/Demand, and 3) a matching policy.
Admissible policies
State feedback: the decision $U(t)$ depends only on the buffers $Q(t)$ and the immediate arrivals $A(t)$,
$Q(t+1) = Q(t) - U(t) + A(t)$
Match according to the graph, maintaining
$|U^D(t)| = |U^S(t)|$ and $|Q^D(t)| = |Q^S(t)|$ for all $t$.
$Q$ is a discrete-time Markov chain. Stability = positive recurrence of $Q$.
Bipartite matching model: Necessary conditions
Necessary stability conditions
For a matching graph $(D, S, E)$ we denote:
$D(s) = \{d \in D : (d, s) \in E\}$, $S(d) = \{s \in S : (d, s) \in E\}$,
and, for sets of classes, $S(U) = \bigcup_{d \in U} S(d)$ and $D(V) = \bigcup_{s \in V} D(s)$.
Necessary conditions: if the model is stable, then the marginals of µ satisfy
NCond: $\mu_D(U) < \mu_S(S(U))$ for all $U \subsetneq D$, and $\mu_S(V) < \mu_D(D(V))$ for all $V \subsetneq S$.
Prop. Given $[(D, S, E), \mu]$, there exists an algorithm of time complexity $O((|D| + |S|)^3)$ to decide whether NCond is satisfied.
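To make NCond concrete, here is a minimal Python sketch that checks it by brute force over all non-empty proper subsets. It is exponential in the number of classes, so it is not the cubic-time algorithm of the proposition, only a direct transcription of the inequalities. The function names, the string labels for classes, and the `NN_EDGES` edge set are assumptions made for illustration; the edge set is inferred from the facets and priorities mentioned on later slides, not stated explicitly in the talk.

```python
from fractions import Fraction
from itertools import combinations

# Assumed edge set of the "NN" graph: pairs (demand class, supply class).
NN_EDGES = {("1", "2'"), ("1", "3'"), ("2", "1'"), ("2", "2'"), ("3", "1'")}


def neighbors(nodes, edges, side):
    """Union of neighbours of `nodes`; side=0 means `nodes` are demand classes."""
    return {e[1 - side] for e in edges if e[side] in nodes}


def proper_nonempty_subsets(items):
    """Yield every subset of `items` except the empty set and the full set."""
    items = sorted(items)
    for k in range(1, len(items)):
        for combo in combinations(items, k):
            yield set(combo)


def satisfies_ncond(mu_d, mu_s, edges):
    """Check both families of strict inequalities in NCond by enumeration."""
    for U in proper_nonempty_subsets(mu_d):
        if not sum(mu_d[d] for d in U) < sum(mu_s[s] for s in neighbors(U, edges, 0)):
            return False
    for V in proper_nonempty_subsets(mu_s):
        if not sum(mu_s[s] for s in V) < sum(mu_d[d] for d in neighbors(V, edges, 1)):
            return False
    return True


def nn_marginals(x, y):
    """Marginals mu_D = mu_S = (x, y, 1 - x - y) on the three classes."""
    z = 1 - x - y
    return {"1": x, "2": y, "3": z}, {"1'": x, "2'": y, "3'": z}
```

With these assumptions, `satisfies_ncond(*nn_marginals(Fraction(1, 3), Fraction(2, 5)), NN_EDGES)` holds, while moving to `x = 1/2` or to `2x + y < 1` breaks one of the inequalities. Exact `Fraction` arithmetic is used so that strict inequalities are not blurred by rounding.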
Proof
Proof using network flow arguments:
$N = \bigl(D \cup S \cup \{i, f\},\; E \cup \{(i, d) : d \in D\} \cup \{(s, f) : s \in S\}\bigr)$.
Capacities: $\mu_D(d)$ for $(i, d)$, $\mu_S(s)$ for $(s, f)$, $\infty$ for $(d, s)$.
Lemma.
1. There exists a flow of value 1 in $N$ iff µ satisfies NCond≤ (NCond with < replaced by ≤).
2. There exists a flow $T$ of value 1 such that $T(d, s) > 0$ for all $(d, s) \in E$ iff µ satisfies NCond.
[Figure: the flow network, with source i, demand nodes D, supply nodes S, and sink f.]
Proof of Lemma 2
⇒ Follows easily from connectivity of the matching graph.
⇐ Fix η such that $0 < \eta < 1/|E|$. A strictly positive flow of value $|E|\eta$:
$T_\eta(x, y) = \eta$ for $(x, y) = (d, s) \in E$; $T_\eta(x, y) = |S(d)|\,\eta$ for $(x, y) = (i, d)$; $T_\eta(x, y) = |D(s)|\,\eta$ for $(x, y) = (s, f)$.
Define: $\tilde\mu_D(d) = \dfrac{\mu_D(d) - |S(d)|\eta}{1 - |E|\eta}$, $\tilde\mu_S(s) = \dfrac{\mu_S(s) - |D(s)|\eta}{1 - |E|\eta}$.
For η small enough, $\tilde\mu_D$, $\tilde\mu_S$ are probability measures satisfying NCond.
For $\tilde\mu_D$, $\tilde\mu_S$ there exists a flow $\tilde T$ of value 1.
A strictly positive flow of value 1: $T = T_\eta + (1 - |E|\eta)\,\tilde T$.
Verification algorithm
The pair $(\mu_D, \mu_S)$ satisfies NCond iff the η-perturbed pair $(\tilde\mu_D, \tilde\mu_S)$ from the proof of Lemma 2 satisfies NCond for η strictly positive and small enough.
Run MaxFlow on the input $(N, \tilde\mu_D, \tilde\mu_S)$, treating η as a formal parameter “as small as needed”.
Quantities of type $x + y\eta$ for $x, y \in \mathbb{R}$.
Addition: $(x_1 + y_1\eta) + (x_2 + y_2\eta) = (x_1 + x_2) + (y_1 + y_2)\eta$.
Comparisons:
$[x_1 + y_1\eta = x_2 + y_2\eta] \iff [x_1 = x_2,\ y_1 = y_2]$
$[x_1 + y_1\eta < x_2 + y_2\eta] \iff [(x_1 < x_2) \text{ or } (x_1 = x_2,\ y_1 < y_2)]$
On any given input, MaxFlow stops in finite time. A posteriori, assign to η a value small enough not to reverse any strict inequality.
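The formal arithmetic on quantities $x + y\eta$ is just addition componentwise and comparison in lexicographic order. A minimal Python sketch of such a number type is given below; the class name `EtaNumber` and the use of exact `Fraction` coefficients are illustrative assumptions, not part of the talk.

```python
from dataclasses import dataclass
from fractions import Fraction


@dataclass(frozen=True, order=True)
class EtaNumber:
    """Formal quantity x + y*eta, where eta is positive and 'as small as needed'.

    order=True compares the field tuples (x, y) lexicographically, which is
    exactly the comparison rule stated on the slide.
    """
    x: Fraction
    y: Fraction = Fraction(0)

    def __add__(self, other):
        # (x1 + y1*eta) + (x2 + y2*eta) = (x1 + x2) + (y1 + y2)*eta
        return EtaNumber(self.x + other.x, self.y + other.y)

    def __sub__(self, other):
        # Subtraction is needed for residual capacities in a max-flow routine.
        return EtaNumber(self.x - other.x, self.y - other.y)
```

A max-flow routine that only adds, subtracts, and compares capacities could then be run on such values unchanged. For example, a quantity of the form $\mu_D(d) - |S(d)|\eta$ would be written `EtaNumber(mu_d, Fraction(-degree))`, where `mu_d` and `degree` are placeholder names.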
Example
[Figure: matching graph (left) and arrival graph (right) on demand classes D = {1, 2, 3, 4} and supply classes S = {1′, 2′, 3′, 4′}.]
Consider any µ with supp(µ) = F. We have
$\mu_S(\{1', 2'\}) = \mu(3, 1') + \mu(4, 2') \le \mu_D(\{3, 4\})$,
which contradicts NCond for $U = \{3, 4\}$.
Bipartite matching model: Connectivity properties
Connectivity properties
Consider a bipartite matching structure $(D, S, E, F)$. Associated directed graph: the nodes are $D \cup S$ and the arcs are
$d \to s$ if $(d, s) \in E$, and $s \to d$ if $(d, s) \in F$.
[Figure: matching graph $(D, S, E)$, arrival graph $(D, S, F)$, and the associated directed graph.]
Thm. For a bipartite matching structure $(D, S, E, F)$ the following properties are equivalent:
1. There exists µ such that supp(µ) = F, supp($\mu_D$) = D, supp($\mu_S$) = S and µ satisfies NCond.
2. The associated directed graph is strongly connected.
Thm. If the associated directed graph of $(D, S, E, F)$ is strongly connected, then any bipartite matching model $[(D, S, E, F), \mu, \mathrm{Pol}]$ has a unique strongly connected component with all states leading to it.
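Bipartite matching model: Sufficient conditions
State space decomposition
The state space can be decomposed into facets, defined only by the non-empty classes.
Def. A facet is an ordered pair $(U, V)$ such that $U \subset D$, $V \subset S$ and $U \times V \subset (D \times S) \setminus E$.
[Figure: two copies of the matching graph on classes 1, 2, 3 and 1′, 2′, 3′, illustrating the facets $(\{3\}, \{3'\})$ and $(\{2, 3\}, \{3'\})$.]
For a facet $\mathcal{F} = (U, V)$, define:
$D_\bullet(\mathcal{F}) = U$, $D_\diamondsuit(\mathcal{F}) = D(V)$, $D_\circ(\mathcal{F}) = D \setminus (D_\bullet(\mathcal{F}) \cup D_\diamondsuit(\mathcal{F}))$,
$S_\bullet(\mathcal{F}) = V$, $S_\diamondsuit(\mathcal{F}) = S(U)$, $S_\circ(\mathcal{F}) = S \setminus (S_\bullet(\mathcal{F}) \cup S_\diamondsuit(\mathcal{F}))$.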
Sufficient conditions
Conditions SCond:
$\mu_D(D_\diamondsuit(\mathcal{F})) + \mu_S(S_\diamondsuit(\mathcal{F})) > 1 - \mu\bigl(E \cap (D_\circ(\mathcal{F}) \times S_\circ(\mathcal{F}))\bigr)$ for all $\mathcal{F} \neq (\emptyset, \emptyset)$.
Prop. (Sufficient conditions) A bipartite model with probability µ satisfying SCond is stable under any admissible matching policy.
Proof. Variation of the linear Lyapunov function (number of unmatched customers), by class of the arriving demand (columns) and of the arriving supply (rows):
        D♢     D◦        D•
S♢      −1     0         0
S◦       0     0 or 1    1
S•       0     1         1
Def. A facet $\mathcal{F}$ is called saturated if $D_\circ(\mathcal{F}) = \emptyset$ or $S_\circ(\mathcal{F}) = \emptyset$.
SCond ⟹ NCond (considering only the saturated facets).
[Figure: a non-saturated facet and a saturated facet on the graph with classes 1, 2, 3 and 1′, 2′, 3′.]
For the NN graph: SCond = NCond ∩ $\{\mu_D(1) + \mu_S(1') > 1 - \mu(2, 2')\}$.
For $\mu = \mu_D \times \mu_S$ and $\mu_D = \mu_S = (x, y, 1 - x - y)$:
NCond: $x < 0.5$ and $2x + y > 1$.
SCond: NCond and $2x + y^2 > 1$.
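The NN-graph conditions can be reproduced by enumerating facets directly. The Python sketch below does this for a product measure. Two things are assumptions rather than statements from the slides: the edge set `E` (inferred from the facets shown and from the priorities discussed later), and the restriction to facets with both `U` and `V` non-empty (natural because demand and supply buffers always hold equal totals). All function names are illustrative.

```python
from fractions import Fraction
from itertools import combinations

D = ["1", "2", "3"]
S = ["1'", "2'", "3'"]
# Assumed NN-graph edges: pairs (demand class, supply class).
E = {("1", "2'"), ("1", "3'"), ("2", "1'"), ("2", "2'"), ("3", "1'")}


def nonempty_subsets(items):
    for k in range(1, len(items) + 1):
        for combo in combinations(items, k):
            yield set(combo)


def facets():
    """Non-zero facets (U, V): both sides non-empty, no edge between U and V."""
    for U in nonempty_subsets(D):
        for V in nonempty_subsets(S):
            if not any((d, s) in E for d in U for s in V):
                yield U, V


def satisfies_scond(mu_d, mu_s):
    """Check SCond for the product measure mu = mu_d x mu_s."""
    for U, V in facets():
        d_mid = {d for (d, s) in E if s in V}   # D(V): demand matchable with V
        s_mid = {s for (d, s) in E if d in U}   # S(U): supply matchable with U
        d_free = set(D) - U - d_mid             # remaining demand classes
        s_free = set(S) - V - s_mid             # remaining supply classes
        lhs = sum(mu_d[d] for d in d_mid) + sum(mu_s[s] for s in s_mid)
        free_mass = sum(mu_d[d] * mu_s[s] for (d, s) in E
                        if d in d_free and s in s_free)
        if not lhs > 1 - free_mass:
            return False
    return True


def product_marginals(x, y):
    z = 1 - x - y
    return {"1": x, "2": y, "3": z}, {"1'": x, "2'": y, "3'": z}
```

Under these assumptions the enumeration yields six facets, and only the non-saturated facet `({3}, {3'})` contributes the extra inequality $2x + y^2 > 1$. For instance, `(x, y) = (9/20, 2/5)` passes, whereas `(1/3, 2/5)` fails because $2/3 + 4/25 < 1$.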
Policy-specific results: Match the Longest has maximal stability region
Match the Longest (ML) policy: a newly arriving customer of class c is matched to a server in S(c) with the largest buffer (similarly for a newly arriving server).
Thm. For any bipartite graph, ML has a maximal stability region.
Proof:
Quadratic Lyapunov function: $L(x, y) = \sum_{d \in D} x_d^2 + \sum_{s \in S} y_s^2$.
ML minimizes the value of this Lyapunov function at each step.
Facet-dependent randomized policy. In a non-zero facet $\mathcal{F}$: the server $s \in S_\diamondsuit(\mathcal{F})$ is matched to $d \in D_\bullet(\mathcal{F}) \cap D(s)$ with probability $P^{\mathcal{F}}_{sd}$. These probabilities can be chosen such that:
$\forall d \in D_\bullet(\mathcal{F}),\quad \sum_{s \in S(d)} \mu_S(s)\, P^{\mathcal{F}}_{sd} > \mu_D(d)$
(symmetrically for customers).
For this randomized policy, stability can be shown using the Foster-Lyapunov criterion.
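As an illustration of the buffer update under ML, the following Python sketch performs one time step with a single arriving demand and a single arriving supply. Several details are simplifying assumptions of this sketch, not claims about the authors' definition: the arriving demand is served first, the newly arrived supply unit counts towards its class's buffer length, and ties are broken by dictionary order. The function name and arguments are hypothetical.

```python
def match_the_longest_step(q_d, q_s, d_new, s_new, edges):
    """One step of Q(t+1) = Q(t) - U(t) + A(t) under a Match-the-Longest rule.

    q_d, q_s : dicts mapping every demand / supply class to its buffer level
    d_new, s_new : classes of the arriving demand and supply
    edges : set of compatible (demand class, supply class) pairs
    """
    q_d, q_s = dict(q_d), dict(q_s)   # leave the caller's state untouched
    q_d[d_new] += 1                   # A(t): one demand arrival ...
    q_s[s_new] += 1                   # ... and one supply arrival

    # Arriving demand: match with the longest compatible non-empty supply buffer.
    options = [s for s in q_s if (d_new, s) in edges and q_s[s] > 0]
    if options:
        best = max(options, key=lambda s: q_s[s])
        q_d[d_new] -= 1
        q_s[best] -= 1

    # Arriving supply, if a unit of its class is still waiting.
    if q_s[s_new] > 0:
        options = [d for d in q_d if (d, s_new) in edges and q_d[d] > 0]
        if options:
            best = max(options, key=lambda d: q_d[d])
            q_d[best] -= 1
            q_s[s_new] -= 1
    return q_d, q_s
```

Each match removes one unit from each side, so the invariant $|Q^D(t)| = |Q^S(t)|$ is preserved, and each match lowers the quadratic function $L$ by removing units from the longest available buffers.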
Policy-specific results: Priorities are not always stable
Priorities and Match the Shortest are not always stable
Prop. Consider the NN model with either the MS policy or the PR (priority) policy in which customers of class 1 (resp. servers of class 1′) give priority to servers of class 2′ (resp. to customers of class 2). For both policies, the stability region is not maximal.
[Figure: the NN graph on customer classes 1, 2, 3 and server classes 1′, 2′, 3′.]
Consider $\mu_D = (1/3, 2/5, 4/15)$, $\mu_S = \mu_D$, and $\mu = \mu_D \times \mu_S$. NCond is satisfied, but the Markov chain is transient (for MS or PR as above).
Stability region for Match the Shortest
Workload: Stabilizability
Optimization
Cost function c on buffer levels.
Average cost:
$\eta = \limsup_{N \to \infty} \frac{1}{N} \sum_{t=0}^{N-1} \mathbb{E}[c(Q(t))]$
Queue dynamics: $Q(t+1) = Q(t) - U(t) + A(t)$, $t \ge 0$.
The input process U represents the sequence of matching activities. Input space:
$U_\diamond = \bigl\{\sum_{e \in E} n_e u_e : n_e \in \mathbb{Z}_+\bigr\}$ with $u_e = 1_i + 1_j$ for $e = (i, j) \in E$.
$X(t) = Q(t) + A(t)$ is the state process of the MDP model:
$X(t+1) = X(t) - U(t) + A(t+1)$
The state space is $X_\diamond = \{x \in \mathbb{Z}_+^\ell : \xi_0 \cdot x = 0\}$ with $\xi_0 = (1, \dots, 1, -1, \dots, -1)$.
Workload
For any subset $D$ of the demand classes, the corresponding workload vector $\xi_D$ is defined so that
$\xi_D \cdot x = \sum_{i \in D} x^D_i - \sum_{j \in S(D)} x^S_j$
Necessary and sufficient condition for a stabilizing policy:
NCond: $\delta_D := -\xi_D \cdot \alpha > 0$ for each $D$, where $\alpha = \mathbb{E}[A(t)]$ is the arrival rate vector.
Why is this workload? Consistent with routing/scheduling models:
Fluid model: $\frac{d}{dt} x(t) = -u(t) + \alpha$.
The minimal time to reach the origin from $x(0) = x$: $T^*(x) = \max_D \dfrac{\xi_D \cdot x}{\delta_D}$.
Heavy traffic: $\delta_D \sim 0$ for one or more $D$.
Workload: Workload Relaxation
Workload Dynamics
Fix one workload vector $\xi_D$; denote $(\xi, \delta)$ for $(\xi_D, \delta_D)$.
The workload $W(t) = \xi \cdot X(t)$ can be positive or negative.
Dynamics as in other queueing models:
$\mathbb{E}[W(t+1) - W(t) \mid X(t), U(t)] \ge -\delta$
Equality is achieved ⟺ S(D) matches with D only.
Workload relaxation: take this as the model for control.
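The drift bound follows in a few lines from the state recursion $X(t+1) = X(t) - U(t) + A(t+1)$ and the definitions of $\xi$, $\delta$ and the matching vectors $u_e = 1_i + 1_j$. The derivation below is a sketch under the assumption that $A(t+1)$ is independent of $(X(t), U(t))$ with mean $\alpha$.

```latex
\begin{align*}
\mathbb{E}\bigl[W(t+1) - W(t) \mid X(t), U(t)\bigr]
  &= \xi \cdot \bigl(-U(t) + \mathbb{E}[A(t+1)]\bigr)
   = -\xi \cdot U(t) + \xi \cdot \alpha
   = -\xi \cdot U(t) - \delta, \\
\xi \cdot u_e
  &= \mathbf{1}\{i \in D\} - \mathbf{1}\{j \in S(D)\} \in \{0, -1\}
  \qquad \text{for } e = (i, j) \in E .
\end{align*}
```

The value $+1$ cannot occur, because $i \in D$ and $(i, j) \in E$ force $j \in S(D)$. Hence $\xi \cdot U(t) \le 0$ and the drift is at least $-\delta$, with equality exactly when no supply in $S(D)$ is matched with a demand outside $D$.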
Relaxations
A workload relaxation takes this as the model for control. One-dimensional workload relaxation:
$W(t+1) = W(t) - \delta + I(t) + N(t+1)$, with idleness $I(t) \ge 0$ and a zero-mean term $N(t+1)$.
Effective cost $\bar c : \mathbb{R} \to \mathbb{R}_+$: given a cost function c for Q,
$\bar c(w) = \min\{c(x) : \xi \cdot x = w\}$,
which is piecewise linear if c is linear.
Conclusions
Control of the relaxation = inventory model of Clark & Scarf.
Hedging policy, with threshold r: idling is not permitted unless $W(t) < -r$.
Heavy traffic: for average-cost optimal control, $r \sim \dfrac{1}{2} \dfrac{\sigma_N^2}{\delta} \log(1 + c_+/c_-)$.
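The heavy-traffic threshold is a closed-form expression, so it is easy to evaluate. The short Python sketch below does so; the function and parameter names are illustrative, and it assumes that $c_+$ and $c_-$ are the slopes of the effective cost for $w > 0$ and $w < 0$ respectively.

```python
import math


def hedging_threshold(sigma2, delta, c_plus, c_minus):
    """Approximation r ~ (1/2) * (sigma2 / delta) * log(1 + c_plus / c_minus).

    sigma2  : variance of the zero-mean term N(t)
    delta   : drift parameter of the workload relaxation (must be > 0)
    c_plus  : slope of the effective cost for w > 0 (assumed convention)
    c_minus : slope of the effective cost for w < 0 (assumed convention)
    """
    return 0.5 * (sigma2 / delta) * math.log(1.0 + c_plus / c_minus)
```

If the two examples later in the talk share the same $\sigma_N^2$ and $\delta$ (an assumption; the slides do not state these values), the formula predicts a threshold ratio of $\log(1 + 2/5) / \log(1 + 4/4) \approx 0.485$ between Example 2 and Example 1, which is close to the ratio of the reported values, $7.2 / 14.9 \approx 0.483$.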
Workload: Examples
Tracking the Relaxation
[Figure: matching graph with edges e1, ..., e5 linking the demand buffers $x^D_1, x^D_2, x^D_3$ and the supply buffers $x^S_1, x^S_2, x^S_3$.]
$\xi_1 = (0, 0, 1, -1, 0, 0)^T$, $W(t) = Q^D_3(t) - Q^S_1(t)$
Relaxation: matching of Supply 1 and Demand 2 allowed only if $W(t) < -r$.
Example 1
Cost: $c(x) = x^D_1 + 2x^D_2 + 3x^D_3 + 3x^S_1 + 2x^S_2 + x^S_3$
⟹ Effective cost: $\bar c(w) = 4|w|$
Example 2
Cost: $c(x) = 3x^D_1 + 2x^D_2 + x^D_3 + 3x^S_1 + 2x^S_2 + x^S_3$
⟹ Effective cost: $\bar c(w) = \max(2w, -5w)$
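Both effective costs can be checked numerically by minimising $c(x)$ over a small box of integer states. The Python sketch below is a brute-force check, not an efficient method. It assumes the coordinate order $(x^D_1, x^D_2, x^D_3, x^S_1, x^S_2, x^S_3)$ and imposes the balance constraint (total demand equals total supply) from the state-space definition; the names `effective_cost`, `COST_1` and `COST_2` are illustrative.

```python
from itertools import product

XI = (0, 0, 1, -1, 0, 0)           # workload vector: W = x^D_3 - x^S_1
BALANCE = (1, 1, 1, -1, -1, -1)    # total demand minus total supply

COST_1 = (1, 2, 3, 3, 2, 1)        # Example 1 cost coefficients
COST_2 = (3, 2, 1, 3, 2, 1)        # Example 2 cost coefficients


def dot(a, b):
    return sum(u * v for u, v in zip(a, b))


def effective_cost(cost, w, box=None):
    """Minimum of cost . x over integer x >= 0 with XI . x = w and balance 0.

    The search is limited to 0 <= x_k <= box; since all cost coefficients are
    positive, a box slightly larger than |w| is enough for these examples.
    """
    box = abs(w) + 1 if box is None else box
    best = None
    for x in product(range(box + 1), repeat=6):
        if dot(XI, x) != w or dot(BALANCE, x) != 0:
            continue
        value = dot(cost, x)
        if best is None or value < best:
            best = value
    return best
```

For $w = \pm 3$ this search returns 12 in both directions for Example 1, and 6 and 15 for Example 2, in agreement with $4|w|$ and $\max(2w, -5w)$.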
Tracking the Relaxation: Example 1
$W(t) = Q^D_3(t) - Q^S_1(t)$
Cost: $c(x) = x^D_1 + 2x^D_2 + 3x^D_3 + 3x^S_1 + 2x^S_2 + x^S_3$ ⟹ Effective cost: $\bar c(w) = 4|w|$
Matching of Supply 1 and Demand 2 allowed only if $W(t) < -r$.
Workload relaxation: $Q^S_1(t) = Q^S_2(t) = 0$ if $W(t) > 0$; $Q^D_2(t) = Q^D_3(t) = 0$ if $W(t) < 0$.
[Figure: matching graph with edges e1, ..., e5, and snapshots of the demand buffers $Q^D_1, Q^D_2, Q^D_3$ and the supply buffers $Q^S_1, Q^S_2, Q^S_3$.]
[Figure: average cost estimated in simulation as a function of the threshold r (r from 0 to 30), with r∗ = 14.9 marked.]
[Figure: average cost comparisons over time T (up to 5 × 10^6) for the Priority, MaxWeight, and Threshold (15) policies.]
Simulation with r∗ = 14.9
Workload Examples
Tracking the Relaxation W(t) = Q^D_3(t) − Q^S_1(t)
Example 2
Cost: c(x) = 3x^D_1 + 2x^D_2 + x^D_3 + 3x^S_1 + 2x^S_2 + x^S_3
⟹ Effective Cost: c(w) = max(2w, −5w)
[Figure: matching graph with edges e1 to e5 between demand classes (x^D_1, x^D_2, x^D_3) and supply classes (x^S_1, x^S_2, x^S_3), shown with the demand buffers Q^D_1, Q^D_2, Q^D_3 and supply buffers Q^S_1, Q^S_2, Q^S_3.]
[Plot: Average Cost Estimated in Simulation, as a function of the threshold r (0 to 14), with r∗ = 7.2 marked.]
[Plot: Average Cost Comparisons over time T for three policies: Priority, MaxWeight, Threshold (7).]
Matching of Supply 1 and Demand 2 allowed only if W(t) < −r
Workload Relaxation:
Q^S_1(t) = Q^S_2(t) = 0 if W(t) > 0
Q^D_1(t) = Q^D_3(t) = 0 if W(t) < 0
Simulation with r∗ = 7.2
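The effective costs quoted for Examples 1 and 2 can be checked numerically. The sketch below assumes the effective cost is the minimum of c(x) over nonnegative buffer levels with x^D_3 − x^S_1 = w and total demand equal to total supply; that definition, the function name, and the brute-force integer search are assumptions made here for illustration, not taken from the slides.

```python
from itertools import product


def effective_cost(cost_d, cost_s, w, extra=1):
    """Brute-force minimum of c(x) over integer x >= 0 subject to
    x^D_3 - x^S_1 = w and sum(x^D) == sum(x^S) (assumed balance constraint)."""
    top = abs(w) + extra              # search each coordinate in 0..top
    best = None
    for x in product(range(top + 1), repeat=6):
        xd, xs = x[:3], x[3:]
        if xd[2] - xs[0] != w or sum(xd) != sum(xs):
            continue                  # infeasible point
        c = sum(a * b for a, b in zip(cost_d, xd))
        c += sum(a * b for a, b in zip(cost_s, xs))
        if best is None or c < best:
            best = c
    return best


# Example 1: demand costs (1, 2, 3), supply costs (3, 2, 1); slide gives 4|w|.
ex1 = [effective_cost((1, 2, 3), (3, 2, 1), w) for w in (-2, 3)]
# Example 2: demand costs (3, 2, 1), supply costs (3, 2, 1); slide gives max(2w, -5w).
ex2 = [effective_cost((3, 2, 1), (3, 2, 1), w) for w in (-2, 3)]
```

In Example 2 the minimizer for w < 0 puts all demand in class 2, which is consistent with the relaxation condition Q^D_1(t) = Q^D_3(t) = 0 stated on the slide.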
24 / 29
h-MaxWeight and Approximate Optimality
h-MaxWeight
h-MaxWeight Policy: U(t) = φ^MW(Q(t)), with
φ^MW(x) = argmin_u E[∇h(x) · ∆(t + 1) | Q(t) = x, U(t) = u]
where ∆(t + 1) = X(t + 1) − X(t) = −U(t) + A(t + 1)
Hope that h approximates solution to a dynamic programming equation
For average cost optimality, this means,
E[h(Q(t + 1)) − h(Q(t)) | Q(t) = x] = ∇h(x) · [−φ^MW(x) + α] + (1/2) E[∆(t + 1)^T ∇²h(X) ∆(t + 1)]
where the first term on the right is ∼ −c(x) and the second term is bounded.
Average drift: −φ^MW(x) + α = E[∆(t + 1) | X(t) = x] = E[−U(t) + A(t + 1) | X(t) = x]
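A minimal sketch of the argmin step follows, under several simplifying assumptions that are not from the slides: at most one demand/supply pair is matched per step, the edge list is hypothetical, and h is the quadratic h(x) = ½‖x‖² (which gives the ordinary MaxWeight rule). Because the mean of A(t + 1) does not depend on u, minimizing E[∇h(x) · ∆(t + 1)] over u amounts to minimizing −∇h(x) · u.

```python
def grad_quadratic(x):
    """Gradient of the assumed choice h(x) = 0.5 * sum(x_i ** 2)."""
    return list(x)


def h_maxweight_step(x, edges, grad_h):
    """Return the matching vector u that minimizes grad_h(x) . (-u)
    over 'no match' and all single-pair matches that are feasible in state x."""
    g = grad_h(x)
    best_u, best_val = [0] * len(x), 0.0      # idling has objective value 0
    for i, j in edges:
        if x[i] >= 1 and x[j] >= 1:           # both buffers must be non-empty
            val = -(g[i] + g[j])
            if val < best_val:
                u = [0] * len(x)
                u[i], u[j] = 1, 1
                best_u, best_val = u, val
    return best_u


# State layout (assumed): [D1, D2, D3, S1, S2, S3].
# Hypothetical edge list with five edges; the slides do not spell out e1..e5.
EDGES = [(0, 3), (1, 3), (1, 4), (2, 4), (2, 5)]
x = [2, 1, 0, 1, 3, 0]
u = h_maxweight_step(x, EDGES, grad_quadratic)
```

Swapping `grad_quadratic` for the gradient of another function h changes the policy without touching the selection logic, which is the point of the h-MaxWeight formulation.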
25 / 29
h-MaxWeight and Approximate Optimality
Asymptotic optimality
Family of arrival processes {A^δ(t)} parameterized by δ ∈ [0, δ•]. Additional assumptions:
(A1) For one set D ⊊ 𝒟 we have ξ^D · α^δ = −δ, where α^δ denotes the mean of A^δ(t). Moreover, there is a fixed constant δ̲ > 0 (distinct from the parameter δ) such that ξ^{D′} · α^δ ≤ −δ̲ for any D′ ⊊ 𝒟, D′ ≠ D, and δ ∈ [0, δ•].
(A2) The distributions are continuous at δ = 0, with linear rate: for some constant b,
E[‖A^δ(t) − A^0(t)‖] ≤ bδ. (1)
(A3) The sets E and F do not depend upon δ, and the graph associated with E is connected. Moreover, there exist i0 ∈ S(D), j0 ∈ D^c, and ε_I > 0 such that
P{A^δ_{i0}(t) ≥ 1 and A^δ_{j0}(t) ≥ 1} ≥ ε_I, 0 ≤ δ ≤ δ•. (2)
26 / 29
h-MaxWeight and Approximate Optimality
Asymptotic optimality
There is a function h such that, under Assumptions (A1)–(A3), for sufficiently large κ > 0, β > 0, and sufficiently small δ_+ > 0 (each independent of δ), the average cost η under the h-MaxWeight policy satisfies
η̂∗ ≤ η∗ ≤ η ≤ η̂∗ + O(1)
where η∗ is the optimal average cost for the MDP model, η̂∗ is the optimal average cost for the workload relaxation, and the constant O(1) does not depend upon δ.
The average cost for the relaxation satisfies the uniform bound
η̂∗ = η̂∗∗ + O(1)
where η̂∗∗ is the optimal cost for the diffusion approximation for the relaxation:
η̂∗∗ = (1/Θ) c_− log(1 + c_+/c_−), where 1/Θ = (1/2) σ_∆^2 / δ.
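The diffusion-approximation cost formula above is easy to evaluate numerically. The sketch below assumes c_+ and c_− are the slopes of the effective cost for w > 0 and w < 0 (so c_+ = c_− = 4 in Example 1 and c_+ = 2, c_− = 5 in Example 2); the function name and the values of σ_∆^2 and δ are hypothetical and chosen only for illustration.

```python
import math


def diffusion_cost(c_plus, c_minus, sigma2, delta):
    """Evaluate (1/Theta) * c_minus * log(1 + c_plus / c_minus),
    with 1/Theta = 0.5 * sigma2 / delta."""
    inv_theta = 0.5 * sigma2 / delta
    return inv_theta * c_minus * math.log(1.0 + c_plus / c_minus)


# Hypothetical variance and drift parameters, not values from the talk.
cost_ex1 = diffusion_cost(4.0, 4.0, sigma2=2.0, delta=0.5)
cost_ex2 = diffusion_cost(2.0, 5.0, sigma2=2.0, delta=0.5)
# Halving delta doubles the cost, showing the 1/delta heavy-traffic scaling.
cost_ex1_half = diffusion_cost(4.0, 4.0, sigma2=2.0, delta=0.25)
```

The 1/δ growth is why an O(1) gap that does not depend on δ is a strong statement: the gap becomes negligible relative to the cost as δ tends to 0.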
27 / 29
Final remarks
Final remarks/related work
Performance bounds?
Approximate optimal control for relaxations in higher dimensions?
More general arrival assumptions. Admission control?
Non-bipartite matching? Networks?
Applications in energy systems and/or healthcare?
28 / 29
References
References
Related models
Tassiulas, Ephremides, IEEE TAC 1992.
McKeown, Mekkittikul, Anantharam, Walrand, IEEE Trans. Comm. 1999.
Bipartite matching model
Caldentey, Kaplan, Weiss, Adv. Appl. Probab. 2009.
Adan & Weiss, Operations Research, 2012.
Busic, Gupta, Mairesse, Stability of the bipartite matching model. Adv. Appl. Probab. 2013.
Busic, Meyn, Optimization of Dynamic Matching Models. ArXiv:1411.1044. 2014.
Adan, Busic, Mairesse, Weiss, Reversibility of the FCFS bipartite matching model. In preparation.
Workload relaxations
Meyn, Stability and asymptotic optimality of generalized MaxWeight policies. SIAM J. Control Optim., 2009.
Meyn, Control Techniques for Complex Networks. Cambridge University Press, 2007.
29 / 29