data-based science for service engineering and...
TRANSCRIPT
Data-Based Science forService Engineering and Management
or: Empirical Adventuresin Call-Centers and Hospitals
Avi Mandelbaum
Technion, Haifa, Israel
http://ie.technion.ac.il/serveng
IELM, Hong Kong UST, September 2011
1
Research PartnersI Students:
Aldor∗, Baron∗, Carmeli, Feldman∗, Garnett∗, Gurvich∗, Huang,Khudiakov∗, Maman∗, Marmor∗, Reich, Rosenshmidt∗, Shaikhet∗,Senderovic, Tseytlin∗, Yom-Tov∗, Zaied, Zeltyn∗, Zychlinski, Zohar∗,Zviran, . . .
I Theory:Armony, Atar, Gurvich, Jelenkovic, Kaspi, Massey, Momcilovic,Reiman, Shimkin, Stolyar, Wasserkrug, Whitt, Zeltyn, . . .
I Industry:IBM Research (OCR: Carmeli, Vortman, Wasserkrug, Zeltyn),Rambam Hospital, Hapoalim Bank, Mizrahi Bank, PelephoneCellular, . . .
I Technion SEE Center / Labaratory:Feigin; Trofimov, Nadjharov, Gavako, Kutsyy; Liberman, Koren,Rom, Plonsky; Research Assistants, . . .
I Empirical/Statistical Analysis:Brown, Gans, Zhao; Shen; Ritov, Goldberg; Allon, Ata, Bassamboo;Gurvich, Huang, Liberman; Armony, Marmor, Tseytlin, Yom-Tov;Zeltyn, Nardi, . . .
2
History, Resources (Downloadable)
I Math. + C.S. + Stat. + O.R. + Mgt. ⇒ IE (≥ 1990)
I Teaching: “Service-Engineering" Course (≥ 1995):http://ie.technion.ac.il/serveng - websitehttp://ie.technion.ac.il/serveng/References/teaching_paper.pdf
I Call-Centers Research (≥ 2000)e.g. <Call Centers> in Google-Scholar
I Healthcare Research (≥ 2005)e.g. OCR Project: IBM + Rambam Hospital + Technion
I The Technion SEE Center (≥ 2007)
3
The Case for Service Science / Engineering
I Service Science / Engineering (vs. Management) are emergingAcademic Disciplines. For example, universities (world-wide),IBM (SSME, a là Computer-Science), USA NSF (SEE), GermanyIAO (ServEng), ...
I Models that explain fundamental phenomena , which arecommon across applications:
- Call Centers- Hospitals- Transportation- Justice, Fast Food, Police, Internet, . . .
I Simple models at the Service of Complex Realities (Human)Note: Simple yet rooted in deep analysis.
I Mostly What Can Be Done vs. How To
4
Title: Expands the Scientific Paradigm
Physics, Biology, . . . : Measure, Model, Experiment, Validate, Refine.Human-complexity triggered above in Transportation, Economics.Starting with Data, expand to:Service Science/Engineering/Management
7. Feedback 1. Measurements / Data
6. Improvement 5. Implementation2. Modeling,
Analysis3. Validation
8. Novel needs, necessitating Science
4. Maturity enables Deployment
e.g. Validate, refute or discover congestion laws (Little, PASTA,SSC, ?, ?,...), in call centers and hospitals
5
Little’s Law: Call Center & Emergency Department
Time-Gap: # in System lags behind Piecewise-Little (L = λ × W )
Little’s Law and the Offered-Load: Empirical Adventures in Call Centers and Hospitals
The Black-Box View of a Call Center: Number in Q vs. Little shows a time-gap.
USBank Customers in queue(average), Telesales10.10.2001
0.00
20.00
40.00
60.00
80.00
100.00
120.00
140.00
160.00
180.00
200.00
07:00 08:00 09:00 10:00 11:00 12:00 13:00 14:00 15:00 16:00 17:00 18:00 19:00 20:00 21:00 22:00 23:00
Time (Resolution 30 min.)
Num
ber
of c
ases
Customers in queue(average) Little's law
HomeHospital Average patients in EDFebruary 2004, Wednesdays
0.00
10.00
20.00
30.00
40.00
50.00
60.00
70.00
00:00 02:00 04:00 06:00 08:00 10:00 12:00 14:00 16:00 18:00 20:00 22:00
Time (Resolution 30 min.)
Ave
rage
num
ber
of c
ases
Average patients in ED Lambda*E(S) smoothed
בהסתכלות ממוצעת על יום באמצע השבוע לאורך מספר חודשים
HomeHospital Average patients in EDTotal for February2004 March2004 April2004 May2004 June2004 July2004 August2004 September2004 October2004,Week days
0.00
10.00
20.00
30.00
40.00
50.00
60.00
70.00
00:00 02:00 04:00 06:00 08:00 10:00 12:00 14:00 16:00 18:00 20:00 22:00
Time (Resolution 60 min.)
Ave
rage
num
ber
of c
ases
Average patients in ED Lambda*E(S)
⇒ Time-Varying Little’s LawI Berstemas & Mourtzinou;I Fralix, Riano, Serfozo; . . .
6
QED Call Center: Staffing (N) vs. Offered-Load (R)IL Telecom; June-September, 2004; w/ Nardi, Plonski, Zeltyn
Empirical Analysis of a QED Call Center
• 2205 half-hour intervals of an Israeli call center • Almost all intervals are within R R− and 2R R+ (i.e. 1 2β− ≤ ≤ ),
implying: o Very decent forecasts made by the call center o A very reasonable level of service (or does it?)
• QED? o Recall: ( )GMR x is the asymptotic probability of P(Wait>0) as a
function of β , where x θµ=
o In this data-set, 0.35x θµ= =
o Theoretical analysis suggests that under this condition, for 1 2β− ≤ ≤ , we get 0 ( 0) 1P Wait≤ > ≤ …
2205 half-hour intervals in an Israeli Call Center7
QED Call Center: Performance
Large Israeli Bank
P{Wq > 0} vs. (R, N) R-Slice: P{Wq > 0} vs. N
• Intercepting plane of the previous plot for a "constant" offered load (actually
18 18.5R≤ ≤ ) • Smoother line was created using smoothing splines. • General shape of the smoothing line is as the Garnett delay functions! • In 2-d view, we get the following:
3 Operational Regimes:I QD: ≤ 25%
I QED: 25%− 75%
I ED: ≥ 75%
8
Empirical data: 2204 intervals of Technical category from Telecom call center
(13 summer weeks; week-days only; 30 min. resolution; excluding 6 outliers). Theoretical plot: based on approximations of Erlang-A for the proper rate.
Operational Regimes: Scaling, Performance,w/ I. Gurvich & J. HuangTable 1: Operational Regimes: Asymptotics, Scaling, Performance
Erlang-A Conventional scaling MS scaling NDS scaling
µ fixed Sub Critical Super QD QED ED ED+QED Sub Critical Super
Offered load per server 11+δ
< 1 1− β√n≈ 1 1
1−γ > 1 11+δ
1− β√n
11−γ
11−γ − β
√1
n(1−γ)31
1+δ1− β
n1
1−γ
Arrival rate λ µ1+δ
µ− β√nµ µ
1−γnµ1+δ
nµ− βµ√n nµ1−γ
nµ1−γ − βµ
√n
(1−γ)3nµ1+δ
nµ− βµ nµ1−γ
Number of servers 1 n n
Time-scale n 1 n
Abandonment rate θ/n θ θ/n
Staffing level λµ(1 + δ) λ
µ(1 + β√
n) λ
µ(1− γ) λ
µ(1 + δ) λ
µ+ β
√λµ
λµ(1− γ) λ
µ(1− γ) + β
√λµ
λµ(1 + δ) λ
µ+ β λ
µ(1− γ)
Utilization 11+δ
1−√
θµ
h(β)√n
1 11+δ
1−√
θµ
(1−α2)β+α2h(β)√n
1 1 11+δ
1−√
θµ
h(β)
n1
E(Q) α1
δ
√n
√µθ[h(β)− β] nµγ
θ(1−γ)1√2π
1+δδ2%n 1√
n
√n
√µθ[h(β)− β]α2
nµγθ(1−γ)
nµθ(1−γ)
(γ − β√n(1−γ)
) o(1) n√
µθ[h(β)− β] n2µγ
θ(1−γ)
P(Ab) 1n
1+δδ
θµα1
1√n
√θµ[h(β)− β] γ 1√
2π
θµ
(1+δ)2
δ2%n 1
n3/21√n
√θµ[h(β)− β]α2 γ γ − β
√1−γ√n
o( 1n2 ) 1
n
√θµ[h(β)− β] γ
P(Wq > 0) α1 ∈ (0,1) ≈ 1 1√2π
1+δδ%n 1√
n≈ 0 α2 ∈ (0,1) ≈ 1 ≈ 1 ≈ 0 ≈ 1
P(Wq > T ) α1e− δ
1+δµt 1 +O( 1√
n) 1 +O( 1
n) ≈ 0 G(T )1{G(T )<γ} α3, if G(T ) = γ ≈ 0 Φ(β+
√θµT )
Φ(β)1 +O( 1
n)
Congestion EWq
ES α11+δδ
√n
√µθ[h(β)− β] nµγ/θ 1√
2π
(1+δ)2
δ2%n 1
n3/21√n
√µθ[h(β)− β]α2 µ
∫∫∫ x∗
0G(s)ds µ
∫∫∫ x∗
0G(s)ds− µβ
√1−γ
hG(x∗)√n
o( 1n)
√µθ[h(β)− β] nµγ/θ
• δ > 0, γ ∈ (0, 1) and β ∈ (−∞,∞);
• QD:% = 11+δe
δ1+δ < 1;
• ED (ED+QED): G(x∗) = γ;
• QED:α2 = [1 +√
θµh(β)h(−β) ]−1, here β = β
√µθ and h(x) = φ(x)
Φ(x);
• ED+QED:α3 = G(T )Φ(β√
µg(T ));
• Conventional: critical: P(W > T ) = P( W√n> T√
n), super: P(W > T ) = P(Wn > T
n ); NDS: Super: P(W > T ) = P(Wn > Tn ).
1
9
Prerequisite I: Data
Averages Prevalent (and could be useful / interesting).But I need data at the level of the Individual Transaction:For each service transaction (during a phone-service in a call center,or a patient’s visit in a hospital, or browsing in a website, or . . .), itsoperational history = time-stamps of events .
Sources: “Service-floor" (vs. Industry-level, Surveys, . . .)
I Administrative (Court, via “paper analysis")I Face-to-Face (Bank, via bar-code readers)I Telephone (Call Centers, via ACD / CTI, IVR/VRU)I Hospitals (Emergency Departments, . . .)
I Expanding:I Hospitals, via RFIDI Operational + Financial + Contents (Marketing, Clinical)I Internet, Chat (multi-media)
10
Pause for a Commercial: The Technion SEE Center
11
Technion SEE = Service Enterprise Engineering
SEELab: Data-repositories for research and teaching
I For example:I Bank Anonymous: 1 years, 350K calls by 15 agents - in 2000.
Brown, Gans, Sakov, Shen, Zeltyn, Zhao (JASA), paved the wayfor:
I U.S. Bank: 2.5 years, 220M calls, 40M by 1000 agents.I Israeli Cellular: 2.5 years, 110M calls, 25M calls by 750 agents.I Israeli Bank: from January 2010, daily-deposit at a SEESafe.I Israeli Hospital: 4 years, 1000 beds; 8 ED’s- Sinreich’s data.
SEEStat: Environment for graphical EDA in real-time
I Universal Design, Internet Access, Real-Time Response.
SEEServer: Free for academic useRegister, then access (presently) U.S. Bank and Bank Anonymous.
Visitor: run mstsc, seeserver.iem.technion.ac.il ; Self-Tutorial
12
Technion - Israel Institute of Technology
The William Davidson Faculty of Industrial Engineering and Management
Center for Service Enterprise Engineering (SEE) http://ie.technion.ac.il/Labs/Serveng/
SEEStat 3.0 Tutorial
HKUST, Hong Kong, September 2011
Note: This tutorial is customized to HKUST. To become a regular user of SEEStat, please go to http://seeserver.iem.technion.ac.il/see-terminal/ click on “Register” (left menu), and follow the registration procedure.
2
, you are able to course-seminar/mini HKUSTAs a participant in the at the SEELab Serverlaptop) to the PC or (from your connect
Technion.
SEE taught -selfthe Once connected, you will be able to go through.that follows Tutorial
To start, you need a “User Name” and “Password”. Use the Number allocated to you in the seminar/mini-course (from 1-20):
Log In
On Page 3, you are asked to provide the following LogIn information:
Your User Name: visitor_1_20 Your Password: visitor_1_20
Log On to Windows
You will be needing the above also for the following step (Page 4):
Introduction: SEEStat is a environment for Exploratory Data Analysis (EDA) in real-time. It enables users to easily conduct statistical and performance analyses of massive datasets; in particular, datasets representing operational histories of large service operations (e.g. call centers, hospitals, internet sites), as available through the SEELab server. SEEStat can also automatically create sophisticated reports in Microsoft Excel, which support research and teaching. Both SEEStat and the SEELab Server were developed at the Faculty of Industrial Engineering and Management, Technion, Israel Institute of Technology. More information on the SEELab can be found at its homepage http://ie.technion.ac.il/Labs/Serveng/
1
SEEStat 3.0 Tutorial
Introduction ............................................................................................................................. 2
Connecting to SEEStat on the Technion SEELab Server ................................................... 3
SEEStat Tutorial ..................................................................................................................... 7
Part 1 ................................................................................................................................................. 7
Example 1.1: Distributions ............................................................................................................ 8
Example 1.2: Intraday time series ............................................................................................... 15
Example 1.3: Time series (Daily totals) ...................................................................................... 20
Part 2 ............................................................................................................................................... 24
Example 2.1: Distribution fitting ................................................................................................. 24
Example 2.2: Distribution mixture fitting .................................................................................... 26
Example 2.3: Survival analysis with smoothing of hazard rates ................................................. 28
Example 2.4: Smoothing of intraday time series ......................................................................... 30
Part 3 ............................................................................................................................................... 32
Example 3.1: Queue regulated by a protocol & annoucements .................................................. 32
Example 3.2: Queue length & state-space collapse .................................................................... 34
Example 3.3: Change-of_Shifts phenomena ................................................................................ 36
Example 3.4: Daily flow of calls .................................................................................................. 40
35
USBank Customers in queue(average)Total for May2002 June2002 July2002 August2002 September2002 October2002
November2002 December2002,Week days
0.0
0.5
1.0
1.5
2.0
2.5
3.0
3.5
4.0
4.5
0:00 2:00 4:00 6:00 8:00 10:00 12:00 14:00 16:00 18:00 20:00 22:00
Time (Resolution 1 min.)
Ave
rage
num
ber
of c
ases
Business Platinum
Platinum is a small-scale service. You will now normalize the chart in order to identify patters. Click "Output" on the main menu and then "Modify Tables and Charts". Open the "Options" tab and select Percent to mean. Click "OK".
USBank Customers in queue(average)Total for May2002 June2002 July2002 August2002 September2002 October2002
November2002 December2002,Week days
0
100
200
300
400
500
600
700
800
900
0:00 2:00 4:00 6:00 8:00 10:00 12:00 14:00 16:00 18:00 20:00 22:00
Time (Resolution 1 min.)
Per
cent
to
mea
n
Business Platinum
Note the essentially overlapping patterns of the queue lengths of the two customer types. (This phenomenon is predicted by asymptotic analysis of queues in heavy traffic, where it is referred to as State-Space-Collapse.)
eg. RFID-Based Data: Mass Casualty Event (MCE)
Drill: Chemical MCE, Rambam Hospital, May 2010
מאייר -קלים ודחק'מרתף פנימית ו-משפחותקרדיולוגיהעורנוירולוגיהבינוניים
נספח לנוהלקרדיולוגיה,עור,נוירולוגיה-בינונייםחדר אוכל-קשים
ד"מלר-משולבים
Focus on severely wounded casualties (≈ 40 in drill)Note: 20 observers support real-time control (helps validation)
14
Data Cleaning: MCE with RFID Support
Data-base Company report comment
Asset id order Entry date Exit date Entry date Exit date 4 1 1:14:07 PM 1:14:00 PM 6 1 12:02:02 PM 12:33:10 PM 12:02:00 PM 12:33:00 PM 8 1 11:37:15 AM 12:40:17 PM 11:37:00 AM exit is missing
10 1 12:23:32 PM 12:38:23 PM 12:23:00 PM 12 1 12:12:47 PM 12:35:33 PM 12:35:00 PM entry is missing 15 1 1:07:15 PM 1:07:00 PM 16 1 11:18:19 AM 11:31:04 AM 11:18:00 AM 11:31:00 AM 17 1 1:03:31 PM 1:03:00 PM 18 1 1:07:54 PM 1:07:00 PM 19 1 12:01:58 PM 12:01:00 PM 20 1 11:37:21 AM 12:57:02 PM 11:37:00 AM 12:57:00 PM 21 1 12:01:16 PM 12:37:16 PM 12:01:00 PM
22 1 12:04:31 PM 12:20:40 PM first customer is missing
22 2 12:27:37 PM 12:27:00 PM 25 1 12:27:35 PM 1:07:28 PM 12:27:00 PM 1:07:00 PM 27 1 12:06:53 PM 12:06:00 PM
28 1 11:21:34 AM 11:41:06 AM 11:41:00 AM 11:53:00 AMexit time instead of entry time
29 1 12:21:06 PM 12:54:29 PM 12:21:00 PM 12:54:00 PM 31 1 11:40:54 AM 12:30:16 PM 11:40:00 AM 12:30:00 PM 31 2 12:37:57 PM 12:54:51 PM 12:37:00 PM 12:54:00 PM 32 1 11:27:11 AM 12:15:17 PM 11:27:00 AM 12:15:00 PM 33 1 12:05:50 PM 12:13:12 PM 12:05:00 PM 12:15:00 PM wrong exit time 35 1 11:31:48 AM 11:40:50 AM 11:31:00 AM 11:40:00 AM 36 1 12:06:23 PM 12:29:30 PM 12:06:00 PM 12:29:00 PM 37 1 11:31:50 AM 11:48:18 AM 11:31:00 AM 11:48:00 AM 37 2 12:59:21 PM 12:59:00 PM 40 1 12:09:33 PM 12:35:23 PM 12:09:00 PM 12:35:00 PM 43 1 12:58:21 PM 12:58:00 PM 44 1 11:21:25 AM 11:52:30 AM 11:52:00 AM entry is missing 46 1 12:03:56 PM 12:03:00 PM 48 1 11:19:47 AM 11:19:00 AM 49 1 12:20:36 PM 12:20:00 PM 52 1 11:21:29 AM 11:50:49 AM 11:21:00 AM 11:50:00 AM 52 2 12:10:07 PM 1:07:28 PM 12:10:00 PM 1:07:00 PM recorded as exit 53 1 12:24:26 PM 12:24:00 PM 57 1 11:32:02 AM 11:58:31 AM 11:58:00 AM entry is missing 57 2 12:59:41 PM 1:14:00 PM 12:59:00 PM 1:14:00 PM 60 1 12:27:12 PM 12:48:41 PM 12:27:00 PM 12:48:00 PM 63 1 12:10:04 PM 12:10:00 PM 64 1 11:30:29 AM 12:43:38 PM 11:30:00 AM 12:43:00 PM
Imagine “Cleaning" 60,000+ customers per day (call centers) !
15
Beyond Averages: The Human Factor
Histogram of Service-Time in a (Small Israeli) Bank, 1999
January-OctoberJanuary-October
0
2
4
6
8
0 100 200 300 400 500 600 700 800 900
AVG: 185STD: 238
November-December
0
2
4
6
8
0 100 200 300 400 500 600 700 800 900
AVG: 201STD: 263
5.59%
?6.83%
Log-Normal
November-DecemberJanuary-October
0
2
4
6
8
0 100 200 300 400 500 600 700 800 900
AVG: 185STD: 238
November-December
0
2
4
6
8
0 100 200 300 400 500 600 700 800 900
AVG: 201STD: 263
5.59%
?6.83%
Log-Normal
I 6.8% Short-Services: Agents’ “Abandon" (improve bonus, rest),(mis)lead by incentives
I Distributions must be measured (in seconds = natural scale)I LogNormal service times common in call centers
16
Validating LogNormality of Service-Duration
Israeli Call Center, Nov-Dec, 1999
Log(Service Times)
0 2 4 6 8
0.0
0.1
0.2
0.3
0.4
Log(Service Time)
Pro
port
ion
LogNormal QQPlot
Log-normalS
ervi
ce ti
me
0 1000 2000 3000
010
0020
0030
00
I Practically Important: (mean, std)(log) characterizationI Theoretically Intriguing: Why LogNormal ? Naturally multiplicative
but, in fact, also Infinitely-Divisible (Generalized Gamma-Convolutions)
I Simple-model of a complex-reality? The Service Process:17
(Telephone) Service-Process = “Phase-Type" Model
RetailService(IsraeliBank)
StatisticsORIE
START
Hello14 / 20
I.D.24 / 23Request
15 / 8
Stock17 / 21
Question14 / 20
Password creation62 / 42
Processing49 / 24
Confirmation29 / 9
Answer32 / 19
END202/190
Dead time18 / 17
Paperwork22 / 12
End of call5 / 3
Credit34 / 32
Others21 / 21
Checking21 / 9
1
0.65
0.050.27
0.950.03
0.62
0.29
0.17
0.28
0.17
0.11
0.05
0.05 0.17
0.23 0.08
0.38
0.15
0.15
0.93
0.14
0.03
0.5
0.03 0.26
0. 4
0.23
0.590.09
0.05
0.67
0.11
0.17
0.67
0.330.7
0.25
0.66
0.43
0.57
0.130.2
0.93
0.04
0. 6
Password creation62 / 42
0.67
0.33
STDAVG
18
Individual Agents: Service-Duration, Variabilityw/ Gans, Liu, Shen & Ye
Agent 14115
Service-Time Evolution: 6 month Log(Service-Time)
I Learning: Noticeable decreasing-trend in service-durationI LogNormal Service-Duration, individually and collectively
19
Individual Agents: Learning, Forgetting, Switching
Daily-Average Log(Service-Time), over 6 monthsAgents 14115, 14128, 14136
Day Index
Mea
n Lo
g(S
ervi
ce T
ime)
0 20 40 60 80 100 120 1404.2
4.4
4.6
4.8
5.0
5.2
5.4
Day IndexM
ean
Log(
Ser
vice
Tim
e)
a 102−day break
0 20 40 60 80 100
4.4
4.6
4.8
5.0
5.2
5.4
5.6
Day Index
Mea
n Lo
g(S
ervi
ce T
ime)
switch to Online Banking after 18−day break
0 50 100 150
4.5
5.0
5.5
6.0
Weakly Learning-Curves for 12 Homogeneous(?) Agents
5 10 15 20
34
56
Tenure (in 5−day week)
Ser
vice
rat
e pe
r ho
ur
20
Why Bother?
In large call centers:+One Second to Service-Time implies +Millions in costs, annually
⇒ Time and "Motion" Studies (Classical IE with New-age IT)
I Service-Process Model: Customer-Agent InteractionI Work Design (w/ Khudiakov)
eg. Cross-Selling: higher profit vs. longer (costlier) services;Analysis yields (congestion-dependent) cross-selling protocols
I “Worker" Design (w/ Gans, Liu, Shen & Ye)eg. Learning, Forgetting, . . . : Staffing & individual-performanceprediction, in a heterogenous environment
I IVR-Process Model: Customer-Machine Interaction75% bank-services, poor design, yet scarce research;Same approach, automatic (easier) data (w/ Yuviler)
21
IVR-Time: HistogramsIsraeli Bank: IVR/VRU Only, May 2008
IVR_onlyMay 2008, Week days
0.000.100.200.300.400.500.600.700.800.901.001.101.201.301.40
00:00 00:30 01:00 01:30 02:00 02:30 03:00 03:30 04:00 04:30
Time(mm:ss) (Resolution 1 sec.)
Rel
ativ
e fr
eque
ncie
s %
mean=99st.dev.=101
Mixture: 7 LogNormals
0.00
0.10
0.20
0.30
0.40
0.50
0.60
0.70
0.80
0.90
1.00
1.10
1.20
1.30
00:00 00:30 01:00 01:30 02:00 02:30 03:00 03:30 04:00 04:30
Rel
ativ
e fr
eque
ncie
s %
Time(mm:ss) (Resolution 1 sec.)
Fitting Mixtures of Distributions for VRU only timeMay 2008, Week days
Empirical Total Lognormal Lognormal Lognormal
22
IVR-Process: “Phase-Type" Model
Start
3I.D.
7Query
5Agent
End
4828
49369
39634
43464
Change
127410
Bursa
11Poly
319
1110
30
15Error
15
157
4
41
2
2
2
68
16Kranot
4
1 2
1Pery
2Check
8Trans
9Nia
12Credit
6Fax
24
23
3
51
10
74
3
1
5
121
140
1
75
12
1157
308
140
2
99 549
1
5
10
219
14
2
1
10
37
8408
1233
655
29780
24
2
1
298
418
3,7,10,11P<<0.01
7P<<0.01
17380
23
Started with Call Centers, Expanded to Hospitals
Call Centers - U.S. (Netherlands) Stat.
I $200 – $300 billion annual expenditures (0.5)I 100,000 – 200,000 call centers (1500-2000)I “Window" into the company, for better or worseI Over 3 million agents = 2% – 4% workforce (100K)
Healthcare - similar and unique challenges:
I Cost-figures far more staggeringI Risks much higherI ED (initial focus) = hospital-windowI Over 3 million nurses
24
Call-Center Environment: Service Network
25
Call-Center Network: Gallery of Models
Agents(CSRs)
Back-Office
Experts)(Consultants
VIP)Training (
Arrivals(Business Frontier
of the21th Century)
Redial(Retrial)
Busy)Rare(
Goodor
Bad
Positive: Repeat BusinessNegative: New Complaint
Lost Calls
Abandonment
Agents
ServiceCompletion
Service Engineering: Multi-Disciplinary Process View
ForecastingStatistics
New Services Design (R&D)Operations,Marketing
Organization Design:Parallel (Flat)Sequential (Hierarchical)Sociology/Psychology,Operations Research
Human Resource Management
Service Process Design
To Avoid Delay
To Avoid Starvation Skill Based Routing
(SBR) DesignMarketing,Human Resources,Operations Research,MIS
Customers Interface Design
Computer-Telephony Integration - CTIMIS/CS
Marketing
Operations/BusinessProcessArchiveDatabaseDesignData Mining:MIS, Statistics, Operations Research, Marketing
InternetChatEmailFax
Lost Calls
Service Completion)75% in Banks (
( Waiting TimeReturn Time)
Logistics
Customers Segmentation -CRM
Psychology, Operations Research,Marketing
Expect 3 minWilling 8 minPerceive 15 min
PsychologicalProcessArchive
Psychology,Statistics
Training, IncentivesJob Enrichment
Marketing,Operations Research
Human Factors Engineering
VRU/IVR
Queue)Invisible (
VIP Queue
(If Required 15 min,then Waited 8 min)(If Required 6 min, then Waited 8 min)
Information Design
Function
Scientific Discipline
Multi-Disciplinary
IndexCall Center Design
(Turnover up to 200% per Year)(Sweat Shops
of the21th Century)
Tele-StressPsychology
27
Call-Center Network: Gallery of Models
Agents(CSRs)
Back-Office
Experts)(Consultants
VIP)Training (
Arrivals(Business Frontier
of the21th Century)
Redial(Retrial)
Busy)Rare(
Goodor
Bad
Positive: Repeat BusinessNegative: New Complaint
Lost Calls
Abandonment
Agents
ServiceCompletion
Service Engineering: Multi-Disciplinary Process View
ForecastingStatistics
New Services Design (R&D)Operations,Marketing
Organization Design:Parallel (Flat)Sequential (Hierarchical)Sociology/Psychology,Operations Research
Human Resource Management
Service Process Design
To Avoid Delay
To Avoid Starvation Skill Based Routing
(SBR) DesignMarketing,Human Resources,Operations Research,MIS
Customers Interface Design
Computer-Telephony Integration - CTIMIS/CS
Marketing
Operations/BusinessProcessArchiveDatabaseDesignData Mining:MIS, Statistics, Operations Research, Marketing
InternetChatEmailFax
Lost Calls
Service Completion)75% in Banks (
( Waiting TimeReturn Time)
Logistics
Customers Segmentation -CRM
Psychology, Operations Research,Marketing
Expect 3 minWilling 8 minPerceive 15 min
PsychologicalProcessArchive
Psychology,Statistics
Training, IncentivesJob Enrichment
Marketing,Operations Research
Human Factors Engineering
VRU/IVR
Queue)Invisible (
VIP Queue
(If Required 15 min,then Waited 8 min)(If Required 6 min, then Waited 8 min)
Information Design
Function
Scientific Discipline
Multi-Disciplinary
IndexCall Center Design
(Turnover up to 200% per Year)(Sweat Shops
of the21th Century)
Tele-StressPsychology
16
Skills-Based Routing in Call CentersEDA and OR, with I. Gurvich and P. Liberman
Mktg. ⇒
OR ⇒
HRM ⇒
MIS ⇒
29
SBR Topologies: I; V, Reversed-V; N, X; W, M
Israeli Cellular, March 2008Implication
Itai Gurvich (Kellogg/MEDS) Call Centers as Queueing Systems May, 2010 41 / 52
30
SBR: Class-Dependent Services
“Reduction" to V-Topology (Equivalent Brownian Control)“Many-server approximations” simplification
The class-dependent case
In what sense does the reduction work? Same Brownian controlproblem.Itai Gurvich (Kellogg/MEDS) Call Centers as Queueing Systems May, 2010 43 / 52
PhD’s: Tezcan, Dai; Shaikhet, w/ Atar; Gurvich, Whitt31
SBR: Pool-Dependent Services
“Reduction" to Reversed-V and I (Equivalent Brownian Control)Many-server approximations simplification
The pool-dependent case
e.g. Tezcan and Dai(09’), G. & Whitt (08’), Atar et. al. (10’)Itai Gurvich (Kellogg/MEDS) Call Centers as Queueing Systems May, 2010 42 / 52
PhD’s: Tezcan, Dai; Shaikhet, w/ Atar; Gurvich, Whitt32
Waiting Times in a Call Center (Theory?)
Exponential in Heavy-Traffic (min.)Small Israeli Bank
quantiles of waiting times to those of the exponential (the straight line at the right plot). The �t is reasonableup to about 700 seconds. (The p-value for the Kolmogorov-Smirnov test for Exponentiality is however 0 {not that surprising in view of the sample size of 263,007).
Figure 9: Distribution of waiting time (1999)
Time
0 30 60 90 120 150 180 210 240 270 300
29.1 %
20 %
13.4 %
8.8 %
6.9 %5.4 %
3.9 %3.1 %
2.3 % 1.7 %
Mean = 98SD = 105
Waiting time given agent
Exp
qua
ntile
s
0 200 400 600
020
040
060
0
Remark on mixtures of independent exponentials: Interestingly, the means and standard deviations in Table19 are rather close, both annually and across all months. This suggests also an exponential distributionfor each month separately, as was indeed veri�ed, and which is apparently inconsistent with the observerdannual exponentiality. The phenomenon recurs later as well, hence an explanation is in order. We shall besatis�ed with demonstrating that a true mixture W of independent random varibles Wi, all of which havecoeÆcients of variation C(Wi) = 1, can also have C(W ) � 1. To this end, let Wi denote the waiting time inmonth i, and suppose it is exponentially distributed with meanmi. Assume that the months are independentand let pi be the fraction of calls performed in month i (out of the yearly total). If W denotes the mixtureof these exponentials (W =Wi with probability pi, that is W has a hyper-exponential distribution), then
C2(W ) = 1 + 2C2(M);
where M stands for a �ctitious random variable, de�ned to be equal mi with probability pi. One concludesthat if themi's do not vary much relative to their mean (C(M) << 1), which is the case here, then C(W ) � 1,allowing for approximate exponentiality of both the mixture and its constituents.
6.2.1 The various waiting times, and their rami�cations
We �rst distinguished between queueing time and waiting time. The latter does not account for zero-waits,and it is more relevant for managers, especially when considered jointly with the fraction of customers thatdid wait. A more fundamental distinction is between the waiting times of customer that got served and thosethat abandoned. Here is it important to recognize that the latter does not describe customers' patience,which we now explain.
A third distinction is between the time that a customer needs to wait before reaching an agent vs. the timethat a customer is willing to wait before abandoning the system. The former is referred to as virtual waitingtime, since it amounts to the time that a (virtual) customer, equipped with an in�nite patience, would havewaited till being served; the latter will serve as our operational measure of customers' patience. While bothmeasures are obviously of great importance, note however that neither is directly observable, and hence mustbe estimated.
25
Routing via Thresholds (sec.)Large U.S. Bank
Chart1
Page 1
0
2
4
6
8
10
12
14
16
18
20
2 5 8 11 14 17 20 23 26 29 32 35
Time
Relative frequencies, %
Scheduling Priorities (sec) (later: Hospital LOS, hr.)Medium Israeli Bankwaitwait
Page 1
0.0
0.1
0.2
0.3
0.4
0.5
0.6
0.7
0.8
0.9
20 40 60 80 100 120 140 160 180 200 220 240 260 280 300 320 340 360 380
Time (Resolution 1 sec.)
Relative frequencies, %
33
ER / ED Environment: Service Network
Acute (Internal, Trauma) Walking
Multi-Trauma
34
Emergency-Department Network: Gallery of Models
ImagingLaboratory
Experts
Interns
Returns (Old or New Problem)
“Lost” Patients
LWBS
Nurses
Statistics,HumanResourceManagement(HRM)
New Services Design (R&D)
Operations,Marketing,MIS
Organization Design:Parallel (Flat) = ERvs. a true ED Sociology, Psychology,Operations Research
Service Process Design
Quality
Efficiency
CustomersInterface Design
Medicine
(High turnoversMedical-Staff shortage)
Operations/BusinessProcessArchiveDatabaseDesignData Mining:MIS, Statistics, Operations Research, Marketing
StretcherWalking
Service Completion(sent to other department)
( Waiting TimeActive Dashboard )
PatientsSegmentation
Medicine,Psychology, Marketing
PsychologicalProcessArchive
Human FactorsEngineering(HFE)
InternalQueue
OrthopedicQueue
Arrivals
Function
Scientific Discipline
Multi-Disciplinary
Index
ED-StressPsychology
OperationsResearch, Medicine
Emergency-Department Network: Gallery of Models
Returns
TriageReception
Skill Based Routing (SBR) DesignOperations Research,HRM, MIS, Medicine
IncentivesGame Theory,Economics
Job EnrichmentTrainingHRM
HospitalPhysiciansSurgical
Queue
Acute,Walking
Blocked(Ambulance Diversion)
Forecasting
Information Design
MIS, HFE,Operations Research
Psychology,Statistics
Home
I Forecasting, Abandonment = LWBS, SBR ≈ Flow Control36
Emergency-Department Network: Gallery of Models
ImagingLaboratory
Experts
Interns
Returns (Old or New Problem)
“Lost” Patients
LWBS
Nurses
Statistics,HumanResourceManagement(HRM)
New Services Design (R&D)
Operations,Marketing,MIS
Organization Design:Parallel (Flat) = ERvs. a true ED Sociology, Psychology,Operations Research
Service Process Design
Quality
Efficiency
CustomersInterface Design
Medicine
(High turnoversMedical-Staff shortage)
Operations/BusinessProcessArchiveDatabaseDesignData Mining:MIS, Statistics, Operations Research, Marketing
StretcherWalking
Service Completion(sent to other department)
( Waiting TimeActive Dashboard )
PatientsSegmentation
Medicine,Psychology, Marketing
PsychologicalProcessArchive
Human FactorsEngineering(HFE)
InternalQueue
OrthopedicQueue
Arrivals
Function
Scientific Discipline
Multi-Disciplinary
Index
ED-StressPsychology
OperationsResearch, Medicine
Emergency-Department Network: Gallery of Models
Returns
TriageReception
Skill Based Routing (SBR) DesignOperations Research,HRM, MIS, Medicine
IncentivesGame Theory,Economics
Job EnrichmentTrainingHRM
HospitalPhysiciansSurgical
Queue
Acute,Walking
Blocked(Ambulance Diversion)
Forecasting
Information Design
MIS, HFE,Operations Research
Psychology,Statistics
Home
I Fork-Join Q’s, eg. After Physician: Nurse, Lab-Tests, X-RayI Synchronization Control, with R. Atar and A. ZviranI ED-to-IW Routing
25
ED Design, with B. Golany, Y. Marmor, S. Israelit
Routing: Triage (Clinical), Fast-Track (Operational), . . . (via DEA)eg. Fast Track most suitable when elderly dominate
Triage
ED Area 1
“Hospital”
ED Area 2 ED Area 3
Patient Arrival
Patient Departure
(a) Triage Model
Triage
Fast TackLane*
“Hospital”
ED Area 1 ED Area 2
Patient Arrival
Patient Departure* operational criteria (short treatments time) –acute or walking patient
(b) Fast-Track Model
Admission
ED Area 1
“Hospital”
ED Area 2 ED Area 3
Wrong ED placement
Wrong ward placementPatient Arrival
Patient Departure
(c) Illness-based Model (d) Walking-Acute Model
ED Area 1 ED Area 2Room1 Room2
Walking Area Acute Area
“Hospital”Wrong ED placement
Wrong ward placement
Patient Arrival
Patient Departure
Admission
38
Emergency-Department Network: Flow Control
ImagingLaboratory
Experts
Interns
Returns (Old or New Problem)
“Lost” Patients
LWBS
Nurses
Statistics,HumanResourceManagement(HRM)
New Services Design (R&D)
Operations,Marketing,MIS
Organization Design:Parallel (Flat) = ERvs. a true ED Sociology, Psychology,Operations Research
Service Process Design
Quality
Efficiency
CustomersInterface Design
Medicine
(High turnoversMedical-Staff shortage)
Operations/BusinessProcessArchiveDatabaseDesignData Mining:MIS, Statistics, Operations Research, Marketing
StretcherWalking
Service Completion(sent to other department)
( Waiting TimeActive Dashboard )
PatientsSegmentation
Medicine,Psychology, Marketing
PsychologicalProcessArchive
Human FactorsEngineering(HFE)
InternalQueue
OrthopedicQueue
Arrivals
Function
Scientific Discipline
Multi-Disciplinary
Index
ED-StressPsychology
OperationsResearch, Medicine
Emergency-Department Network: Gallery of Models
Returns
TriageReception
Skill Based Routing (SBR) DesignOperations Research,HRM, MIS, Medicine
IncentivesGame Theory,Economics
Job EnrichmentTrainingHRM
HospitalPhysiciansSurgical
Queue
Acute,Walking
Blocked(Ambulance Diversion)
Forecasting
Information Design
MIS, HFE,Operations Research
Psychology,Statistics
Home
I Queueing-Science, w/ Armony, Marmor, Tseytlin, Yom-TovI Fair ED-to-IW Routing (Patients vs. Staff), w/ Momcilovic, TseytlinI Triage vs. In-Process / Release in EDs, w/ Carmeli, Huang, ShimkinI Workload and Offered-Load in Fork-Join Networks, w/ Kaspi, ZaeidI Synchronization Control of Fork-Join Networks, w/ Atar, ZviranI Staffing Time-Varying Q’s with Re-Entrant Customers, w/ Yom-Tov
39
ED Patient Flow: The Physicians View
&%'$
? ? ?
@@@@@@R
CCCCCCW
���
���
�������
�����
AAAAK
HHHH
HHHH
HHY�
?
66 66
-
λ01 λ0
2 λ0J
· · ·
P 0(j,k)
P (k, l)
d1 d2 dJ
m01 m0
2 m0J
Triage-Patients
IP-Patients
ExitsS
Arrivals
C1(·) C2(·) C3(·) CK(·)
m1 m2 m3 mK
· · ·
1
I Goal: Adhere to Triage-Constraints, then process/release In-Process PatientsI Model = Multi-class Q with Feedback: Min. convex congestion costs of
IP-Patients, s.t. deadline constraints on Triage-Patients.I Solution: In conventional heavy-traffic, asymptotic least-cost s.t. asymptotic
compliance, via threshold (w/ B. Carmeli, J. Huang, S. Israelit, N. Shimkin; asin Plambeck, Harrison, Kumar, who applied admission control).
40
Operational Fairness
1. “Punishing" fast wards in ED-to-IW Routing:
I Parallel IWs: similar clinically , differ operationallyI Problem: Short Length-of-Stay goes hand in hand with high
bed-occupancy, bed-turnover, yet clinically apt: unfair!I Solution: Both nurses and managers content, w/ P. Momcilovic
and Y. Tseytlin (3 time-scales: hour, day, week; “compare" withcall-centers SBR)
2. Balancing Load across Maternity Wards:
I 2 Maternity Wards: 1 = pre-birth, 2 = post-birth complications
I Problem: Nurses think the “others-work-less": unfair!I Goal: Balance workload, mostly via normal birthsI Challenge: Workload is Operational, Cognitive, Emotional
I Operational: Work content of a task, in time-unitsI Emotional: e.g. Mother and fetus-in-stress, suddenly fetus dies
⇒ Need help: A. Rafaeli & students (Psychology) - Ongoing41
LogNormal & Beyond: Length-of-Stay in a Hospital
Israeli Hospital, in Days: LN
0 2 4 6 8 10 13 16 19 22 25 28 31 34 37 40 43 46 49
Israeli Hospital, in Hours: Mixture
0 .2.4 .6 .8 11.21.51.82.12.42.7 33.23.53.84.14.44.7 55.25.55.86.16.46.7 7 7.37.67.98.28.58.89.19.49.7 10
Explanation: Patients releasedaround 3pm (1pm in Singapore)
Why Bother ?I Hourly Scale: Staffing,. . .I Daily: Flow / Bed Control,. . .
Workload at the Internal Ward (In Progress): Arrivals, Departures, # Patients in Ward A, by Hour
42
Prerequisite II: Models (Fluid Q’s)“Laws of Large Numbers" capture Predictable Variability
Deterministic Models: Scale Averages-out Stochastic Individualism
# Severely-Wounded Patients, 11:00-13:00 (Censored LOS)
Cleaning Data – An Example: RFID data in an MCE Drill
0
10
20
30
40
50
60
11:09 11:16 11:24 11:31 11:38 11:45 11:52 12:00 12:07 12:14 12:21 12:28 12:36 12:43 12:50 12:57 13:04 13:12 13:19 13:26
entries
exits
entries (original)
exits (original)
0
5
10
15
20
25
30
11:09 11:16 11:24 11:31 11:38 11:45 11:52 12:00 12:07 12:14 12:21 12:28 12:36 12:43 12:50 12:57 13:04 13:12 13:19 13:26
number of patients
number of patients (original)
I Paths of doctors, nurses, patients (100+, 1 sec. resolution)eg. (could) Help predict “What if 150+ casualties severely wounded ?"
I Transient Q’s:I Control of Mass Casualty Events (w/ I. Cohen, N. Zychlinski)I Chemical MCE = Needy-Content Cycles (w/ G. Yom-Tov)
43
The Basic Service-Network Model: Erlang-R
Erlang-R:
Needy (s-servers)
Content (Delay)
1-p
p
1
2
Arrivals Patient discharge
Needy (st-servers)
rate- μ
Content (Delay) rate - δ
1-p
p
1
2
Arrivals Poiss(λt) Patient discharge
Erlang-R (IE: Repairman Problem 50’s; CS: Central-Server 60’s) =2-station “Jackson" Network = (M/M/S, M/M/∞) :
I λ(t) – Time-Varying Arrival rateI S(·) – Number of Servers (Nurses / Physicians).I µ – Service rate (E [Service] = 1
µ)
I p – ReEntrant (Feedback) fractionI δ – Content-to-Needy rate (E [Content] = 1
δ)
44
Erlang-R: Fitting a Simple Model to a Complex Reality
Chemical MCE Drill (Israel, May 2010)
Arrivals & Departures (RFID) Erlang-R (Fluid, Diffusion)
!
"!
#!
$!
%!
&!
'!
""(!# ""("' ""($" ""(%& "#(!! "#("% "#(#) "#(%$ "#(&* "$("# "$(#'
!"#$%&'
()*+
,&"-&.
$#/+0#1
!/)+
+,-,./0123456612/.7
+,-,./01234839/60,637
!
"
#!
#"
$!
$"
%!
##&!$ ##&#' ##&%# ##&(" #$&!! #$&#( #$&$) #$&(% #$&"* #%&#$ #%&$'
!"#
$%&'(
)'*+,
'-./0%1/2'01',3
40#%
+,-./0123-4
50.67123-4
89:;<1=>?;09@;123-413AB;9<;-6,/04
C@@;<1=>?;09@;123-413AB;9<;-6,/04
50.6712#
I Recurrent/Repeated services in MCE Events: eg. Injection every 15 minutesI Fluid (Sample-path) Modeling, via Functional Strong Laws of Large Numbers
I Stochastic Modeling, via Functional Central Limit Theorems
I ED in MCE: Confidence-interval, usefully narrow for ControlI ED in normal (time-varying) conditions: Personnel Staffing
45
Prerequisite II: Models (Diffusion/QED’s Q’s)
Traditional Queueing Theory predicts that Service-Quality andServers’ Efficiency must be traded off against each other.
For example, M/M/1 (single-server queue): 91% server’s utilizationgoes with
Congestion Index =E [Wait ]
E [Service]= 10,
and only 9% of the customers are served immediately upon arrival.
Yet, heavily-loaded queueing systems with Congestion Index = 0.1(Waiting one order of magnitude less than Service) are prevalent:
I Call Centers: Wait “seconds" for minutes service;I Transportation: Search “minutes" for hours parking;I Hospitals: Wait “hours" in ED for days hospitalization in IW’s;
and, moreover, a significant fraction are not delayed in queue. (Forexample, in well-run call-centers, 50% served “immediately", alongwith over 90% agents’ utilization, is not uncommon ) ? QED
44
The Basic Staffing Model: Erlang-A (M/M/N + M)
agents
arrivals
abandonment
λ
µ
1
2
n
…
queue
θ
Erlang-A (Palm 1940’s) = Birth & Death Q, with parameters:
I λ – Arrival rate (Poisson)I µ – Service rate (Exponential; E [S] = 1
µ )
I θ – Patience rate (Exponential, E [Patience] = 1θ )
I n – Number of Servers (Agents).45
Testing the Erlang-A Primitives
I Arrivals: Poisson?I Service-durations: Exponential?I (Im)Patience: Exponential?
I Primitives independent (eg. Impatience and Service-Durations)?I Customers / Servers Homogeneous?I Service discipline FCFS?I . . . ?
Validation: Support? Refute?
46
Arrivals to ServiceArrival-Rates to Three Call Centers
Dec. 1995 (U.S. 700 Helpdesks) May 1959 (England)
Q-Science
May 1959!
Dec 1995!
(Help Desk Institute)
Arrival Rate
Time 24 hrs
Time 24 hrs
% Arrivals
(Lee A.M., Applied Q-Th)
28
Q-Science
May 1959!
Dec 1995!
(Help Desk Institute)
Arrival Rate
Time 24 hrs
Time 24 hrs
% Arrivals
(Lee A.M., Applied Q-Th)
28
November 1999 (Israel)
Arrival Process, in 1999
Yearly Monthly
Daily Hourly
Random Arrivals “must be"(Axiomatically)Time-Inhomogeneous Poisson
47
Arrivals to Service: only Poisson-Relatives
Arrival-Counts: Coefficient-of-Variation (CV), per 30 min.Israeli-Bank Call-Center, 263 regular days (4/2007 - 3/2008)Coefficient of Variation Per 30 Minutes, seperated weekdays
0.0
0.1
0.2
0.3
0.4
0.5
0.6
0.7
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23
Time
Coe
ffici
ent o
f Var
iatio
n
Sundays Mondays Tuesdays Wednesdays Thursdays
I Poisson CV (Dashed Line) = 1/√
mean arrival-rateI Poisson CV’s� Sampled CV’s (Solid) ⇒ Over-Dispersion
⇒ Modeling (Poisson-Mixture) of and Staffing ( >√· ) against
Time-Varying Over-Dispersed Arrivals (w/ S. Maman & S. Zeltyn)48
Service Durations: LogNormal Prevalent
Israeli Bank Service-ClassesLog-Histogram Survival-Functions
0
100
200
300
400
500
600
700
800
900
0.8 1 1.2 1.4 1.6 1.8 2 2.2 2.4 2.6 2.8 3 3.2 3.4 3.6 3.8
Log(service time)
Freq
uenc
y
frequency normal curve
Average = 2.24St.dev. = 0.42
31
Service TimeSurvival curve, by Types
Time
Surv
ival
Means (In Seconds)
NW (New) = 111
PS (Regular) = 181
NE (Stocks) = 269
IN (Internet) = 381
34- New Customers: 2 min (NW);
- Regulars: 3 min (PS);
- Stock: 4.5 min (NE);
- Tech-Support: 6.5 min (IN).
I Service Durations are LogNormal (LN) and Heterogeneous
49
(Im)Patience while Waiting (Palm 1943-53)Hazard Rate of (Im)Patience Distribution ∝ Irritation
Regular over VIP Customers – Israeli Bank 14
16
I VIP Customers are more Patient (Needy)I Peaks of abandonment at times of AnnouncementsI Challenges: Un-Censoring, Dependence (vs. KM), Smoothing
- requires Call-by-Call Data50
Dependent Primitives: Service- vs. Waiting-Time
Average Service-Time as a function of Waiting-TimeU.S. Bank, Retail, Weedays, January-June, 2006
Introduction Relationship Between Service Time and Patience Workload and Offered-Load Empirical Results Future Research
Motivation
Mean Service-Time as a Function of Waiting-TimeU.S. Bank - Retail Banking Service - Weekdays - January-June, 2006
190
210
230
250
270
290
150
170
190
210
230
250
270
290
0 50 100 150 200 250 300
Waiting Time
Fitted Spline Curve E(S|τ>W=w)
⇒ Focus on ( Patience, Service-Time ) jointly , w/ Reich and Ritov.E [S |Patience = w ], w ≥ 0: Service-Time of the Unserved.
51
Erlang-A: Practical Relevance?
Experience:I Arrival process not pure Poisson (time-varying, σ2 too large)I Service times not Exponential (typically close to LogNormal)I Patience times not Exponential (various patterns observed).
I Building Blocks need not be independent (eg. long waitassociated with long service; with w/ M. Reich and Y. Ritov)
I Customers and Servers not homogeneous (classes, skills)I Customers return for service (after busy, abandonment;
dependently; P. Khudiakov, M. Gorfine, P. Feigin)I · · · , and more.
Question: Is Erlang-A Relevant?
YES ! Fitting a Simple Model to a Complex Reality, bothTheoretically and Practically
52
Estimating (Im)Patience: via P{Ab} ∝ E[Wq]
“Assume" Exp(θ) (im)patience. Then, P{Ab} = θ · E[Wq] .
% Abandonment vs. Average Waiting-TimeBank Anonymous (JASA): Yearly Data
Hourly Data Aggregated
0 50 100 150 200 250 300 350 4000
0.1
0.2
0.3
0.4
0.5
0.6
0.7
0.8
Average waiting time, sec
Pro
bab
ility
to
ab
and
on
0 50 100 150 200 250
0.05
0.1
0.15
0.2
0.25
0.3
0.35
0.4
0.45
0.5
0.55
Average waiting time, sec
Pro
bab
ility
to
ab
and
on
Graphs based on 4158 hour intervals.
Estimate of mean (im)patience: 250/0.55 sec. ≈ 7.5 minutes.
53
Erlang-A: Fitting a Simple Model to a Complex Reality
I Bank Anonymous Small Israeli Call-Center
I (Im)Patience (θ) estimated via P{Ab} / E[Wq]
I Graphs: Hourly Performance vs. Erlang-A Predictions,during 1 year (aggregating groups with 40 similar hours).
P{Ab} E[Wq] P{Wq > 0}
0 0.1 0.2 0.3 0.4 0.5 0.60
0.1
0.2
0.3
0.4
0.5
Probability to abandon (Erlang−A)
Pro
babi
lity
to a
band
on (
data
)
0 50 100 150 200 2500
50
100
150
200
250
Waiting time (Erlang−A), sec
Wai
ting
time
(dat
a), s
ec
0 0.2 0.4 0.6 0.8 10
0.1
0.2
0.3
0.4
0.5
0.6
0.7
0.8
0.9
1
Probability of wait (Erlang−A)
Pro
babi
lity
of w
ait (
data
)
54
Erlang-A: Fitting a Simple Model to a Complex Reality
Large U.S. Bank
Retail. P{Wq > 0} Telesales. E[Wq]
0 0.2 0.4 0.6 0.8 10
0.1
0.2
0.3
0.4
0.5
0.6
0.7
0.8
0.9
1
Probability of wait (QED: aggregated)
Pro
babi
lity
of w
ait (
data
: agg
rega
ted)
0 10 20 30 40 50 60 70 80 900
10
20
30
40
50
60
70
80
90
Average wait (QED: aggregated), sec
Ave
rage
wai
t (da
ta: a
ggre
gate
d), s
ec
Partial success – in some cases Erlang-A does not work well(Networking, SBR).
Ongoing Validation Project, w/ Y. Nardi, O. Plonsky, S. Zeltyn
55
Erlang-A: Simple, but Not Too Simple
Practical (Data-Based) questions, started in Brown et al. (JASA):1. Fitting Erlang-A (Validation, w/ Nardi, Plonsky, Zeltyn).2. Why does it practically work? justify robustness.3. When does it fail? chart boundaries.4. Generate needs for new theory.
Theoretical Framework: Asymptotic Analysis, as load- andstaffing-levels increase, which reveals model-essentials:
I Efficiency-Driven (ED) regime: Fluid models (deterministic)I Quality- and Efficiency-Driven (QED): Diffusion refinements.
Motivation: Moderate-to-large service systems (100’s - 1000’sservers), notably Call-Centers.
Results turn out accurate enough to also cover <10 servers:I Practically Important: Relevant to Healthcare
(First: F. de Véricourt and O. Jennings; w/ G. Yom-Tov; Y. Marmor, S.Zeltyn; H. Kaspi, I. Zaeid)
I Theoretically Justifiable: Gap-Analysis by A. Janssen, J. vanLeeuwaarden, B. Zhang, B. Zwart.
56
Operational Regimes: Conceptual Framework
R: Offered LoadDef. R = Arrival-rate × Average-Service-Time = λ
µ
eg. R = 25 calls/min. × 4 min./call = 100
N = #Agents ? Intuition, as R or N increase unilaterally.
QD Regime: N ≈ R + δR , 0.1 < δ < 0.25 (eg. N = 115)I Framework developed in O. Garnett’s MSc thesisI Rigorously: (N − R)/R → δ, as N, λ ↑ ∞, with µ fixed.I Performance: Delays are rare events
ED Regime: N ≈ R − γR , 0.1 < γ < 0.25 (eg. N = 90)I Essentially all customers are delayedI Wait same order as service-time; γ% Abandon (10-25%).
QED Regime: N ≈ R + β√
R , −1 < β < +1 (eg. N = 100)I Erlang 1913-24, Halfin & Whitt 1981 (for Erlang-C)I %Delayed between 25% and 75%I E[Wait] ∝ 1√
N× E[Service] (sec vs. min); 1-5% Abandon.
57
Operational Regimes: Rules-of-Thumb, w/ S. ZeltynOperational Regimes in Practice
Constraint P{Ab} E[W ] P{W > T}Tight Loose Tight Loose Tight Loose
1-10% ≥ 10% ≤ 10%E[τ ] ≥ 10%E[τ ] 0 ≤ T ≤ 10%E[τ ] T ≥ 10%E[τ ]
Offered Load 5% ≤ α ≤ 50% 5% ≤ α ≤ 50%
Small (10’s) QED QED QED QED QED QED
Moderate-to-Large QED ED, QED ED, QED ED+QED
(100’s-1000’s) QED QED if τ d= exp
ED: n ≈ R − γR.
QD: n ≈ R + δR.
QED: n ≈ R + β√
R.
ED+QED: n ≈ (1 − γ)R + β√
R.
1
ED: N ≈ R − γR (0.1 ≤ γ ≤ 0.25 ).
QD: N ≈ R + δR (0.1 ≤ δ ≤ 0.25 ).
QED: N ≈ R + β√
R (−1 ≤ β ≤ 1 ).
ED+QED: N ≈ (1− γ)R + β√
R (γ, β as above).
WFM: How to determine specific staffing level N ? e.g. β.
58
Operational Regimes: Scaling, Performance,w/ I. Gurvich & J. HuangTable 1: Operational Regimes: Asymptotics, Scaling, Performance
Erlang-A Conventional scaling MS scaling NDS scaling
µ fixed Sub Critical Super QD QED ED ED+QED Sub Critical Super
Offered load per server 11+δ
< 1 1− β√n≈ 1 1
1−γ > 1 11+δ
1− β√n
11−γ
11−γ − β
√1
n(1−γ)31
1+δ1− β
n1
1−γ
Arrival rate λ µ1+δ
µ− β√nµ µ
1−γnµ1+δ
nµ− βµ√n nµ1−γ
nµ1−γ − βµ
√n
(1−γ)3nµ1+δ
nµ− βµ nµ1−γ
Number of servers 1 n n
Time-scale n 1 n
Abandonment rate θ/n θ θ/n
Staffing level λµ(1 + δ) λ
µ(1 + β√
n) λ
µ(1− γ) λ
µ(1 + δ) λ
µ+ β
√λµ
λµ(1− γ) λ
µ(1− γ) + β
√λµ
λµ(1 + δ) λ
µ+ β λ
µ(1− γ)
Utilization 11+δ
1−√
θµ
h(β)√n
1 11+δ
1−√
θµ
(1−α2)β+α2h(β)√n
1 1 11+δ
1−√
θµ
h(β)
n1
E(Q) α1
δ
√n
√µθ[h(β)− β] nµγ
θ(1−γ)1√2π
1+δδ2%n 1√
n
√n
√µθ[h(β)− β]α2
nµγθ(1−γ)
nµθ(1−γ)
(γ − β√n(1−γ)
) o(1) n√
µθ[h(β)− β] n2µγ
θ(1−γ)
P(Ab) 1n
1+δδ
θµα1
1√n
√θµ[h(β)− β] γ 1√
2π
θµ
(1+δ)2
δ2%n 1
n3/21√n
√θµ[h(β)− β]α2 γ γ − β
√1−γ√n
o( 1n2 ) 1
n
√θµ[h(β)− β] γ
P(Wq > 0) α1 ∈ (0,1) ≈ 1 1√2π
1+δδ%n 1√
n≈ 0 α2 ∈ (0,1) ≈ 1 ≈ 1 ≈ 0 ≈ 1
P(Wq > T ) α1e− δ
1+δµt 1 +O( 1√
n) 1 +O( 1
n) ≈ 0 G(T )1{G(T )<γ} α3, if G(T ) = γ ≈ 0 Φ(β+
√θµT )
Φ(β)1 +O( 1
n)
Congestion EWq
ES α11+δδ
√n
√µθ[h(β)− β] nµγ/θ 1√
2π
(1+δ)2
δ2%n 1
n3/21√n
√µθ[h(β)− β]α2 µ
∫∫∫ x∗
0G(s)ds µ
∫∫∫ x∗
0G(s)ds− µβ
√1−γ
hG(x∗)√n
o( 1n)
√µθ[h(β)− β] nµγ/θ
1 59
QED Call Center: Staffing (N) vs. Offered-Load (R)IL Telecom; June-September, 2004; w/ Nardi, Plonski, Zeltyn
Empirical Analysis of a QED Call Center
• 2205 half-hour intervals of an Israeli call center • Almost all intervals are within R R− and 2R R+ (i.e. 1 2β− ≤ ≤ ),
implying: o Very decent forecasts made by the call center o A very reasonable level of service (or does it?)
• QED? o Recall: ( )GMR x is the asymptotic probability of P(Wait>0) as a
function of β , where x θµ=
o In this data-set, 0.35x θµ= =
o Theoretical analysis suggests that under this condition, for 1 2β− ≤ ≤ , we get 0 ( 0) 1P Wait≤ > ≤ …
2205 half-hour intervals in an Israeli Call Center60
QED Call Center: Performance
Large Israeli Bank
P{Wq > 0} vs. (R, N) R-Slice: P{Wq > 0} vs. N
• Intercepting plane of the previous plot for a "constant" offered load (actually
18 18.5R≤ ≤ ) • Smoother line was created using smoothing splines. • General shape of the smoothing line is as the Garnett delay functions! • In 2-d view, we get the following:
3 Operational Regimes:I QD: ≤ 25%
I QED: 25%− 75%
I ED: ≥ 75%
61
QED Theory (Erlang ’13; Halfin-Whitt ’81; Garnett MSc; Zeltyn PhD)
Consider a sequence of steady-state M/M/N + G queues, N = 1, 2, 3, . . .Then the following points of view are equivalent, as N ↑ ∞:
Consider a sequence of M/M/N+G models, N=1,2,3,…
Then the following points of view are equivalent:
� QED %{Wait > 0} � � , 0 < � < 1 ;
� Customers %{Abandon} �N� , 0 < � ;
� Agents OCCN�� �
�� 1 � < � < ;
� Managers RRN ��� , � �R E(S) not small;
QED performance (ASA, ...) is easily computable, all in terms
of � (the square-root safety staffing level) – see later.
I QED performance: Laplace Method (asymptotics of integrals).I Parameters: Arrivals and Staffing - β, Services - µ,
(Im)Patience - g(0) = patience density at the origin.
62
Erlang-A: QED Approximations (Examples)
Assume Offered Load R not small (λ→∞).
Let β = β
õ
θ, h(·) =
φ(·)1− Φ(·) = hazard rate of N (0,1).
I Delay Probability:
P{Wq > 0} ≈[
1 +
√θ
µ· h(β)
h(−β)
]−1
.
I Probability to Abandon:
P{Ab|Wq > 0} ≈ 1√n·√θ
µ·[h(β)− β
].
I P{Ab} ∝ E[Wq] , both order 1√n :
P{Ab}E[Wq]
= θ.
63
Garnett / Halfin-Whitt Functions: P{Wq > 0}HW/GMR Delay Functions
α vs. β
0
0.1
0.2
0.3
0.4
0.5
0.6
0.7
0.8
0.9
1
-3 -2.5 -2 -1.5 -1 -0.5 0 0.5 1 1.5 2 2.5 3
Beta
Del
ay P
roba
bilit
y
Halfin-Whitt Garnett(0.1) Garnett(0.5) Garnett(1)
Garnett(2) Garnett(5) Garnett(10) Garnett(20)
Garnett(50) Garnett(100)
θ/µ
Halfin-Whitt
QED Erlang-A
64
Empirical data: 2204 intervals of Technical category from Telecom call center
(13 summer weeks; week-days only; 30 min. resolution; excluding 6 outliers). Theoretical plot: based on approximations of Erlang-A for the proper rate.
QED Intuition: Why P{Wq > 0} ∈ (0, 1) ?
1. Why subtle: Consider a large service system (e.g. call center).I Fix λ and let n ↑ ∞: P{Wq > 0} ↓ 0.I Fix n and let λ ↑ ∞: P{Wq > 0} ↑ 1.I ⇒ Must have both λ and n increase simultaneously:I ⇒ (CLT) Square-root staffing: n ≈ R + β
√R.
2. Erlang-A (M/M/n+M), with parameters λ, µ, θ; n, in which µ = θ:(Im)Patience and Service-times are equally distributed.
I Steady-state: L(M/M/n + M)d= L(M/M/∞)
d= Poisson(R), with
R = λ/µ (Offered-Load)I Poisson(R)
d≈ R + Z
√R, with Z d
= N(0, 1).
I P{Wq(M/M/n + M) > 0} PASTA= P{L(M/M/n + M) ≥ n} µ=θ
=
P{L(M/M/∞) ≥ n} ≈ P{R + Z√
R ≥ n} =
P{Z ≥ (n − R)/√
R}√· staffing≈ P{Z ≥ β} = 1− Φ(β).
3. QED Excursions65
QED Intuition via Excursions: Busy-Idle CyclesM/M/N+M (Erlang-A) with Many Servers: N ↑ ∞M/M/N+M (Erlang-A) with Many Servers: N ↑ ∞
0 1 N-1 N N+1
Busy Period
µ 2µNµ(N-1)µ Nµ +
Q(0) = N : all servers busy, no queue.
Let TN,N−1 = Busy Period (down-crossing N ↓ N − 1 )
TN−1,N = Idle Period (up-crossing N − 1 ↑ N )
Then P (Wait > 0) =TN,N−1
TN,N−1 + TN−1,N=
[1 +
TN−1,N
TN,N−1
]−1
.
Calculate TN−1,N =1
λNE1,N−1∼ 1
Nµ× h(−β)/√
N∼ 1√
N· 1/µ
h(−β)
TN,N−1 =1
Nµπ+(0)∼ 1√
N· β/µ
h(δ) /δ, δ = β
√µ/θ
Both apply as√
N (1− ρN) → β, −∞ < β < ∞.
Hence, P (Wait > 0) ∼[1 +
h(δ)/δ
h(−β)/β
]−1
.
1
Q(0) = N : all servers busy, no queue.
Let TN,N−1 = E[Busy Period] down-crossing N ↓ N − 1
TN−1,N = E[Idle Period] up-crossing N − 1 ↑ N )
Then P (Wait > 0) =TN,N−1
TN,N−1+TN−1,N=[1 +
TN−1,NTN,N−1
]−1.
66
QED Intuition via Excursions: Asymptotics
M/M/N+M (Erlang-A) with Many Servers: N ↑ ∞
0 1 N-1 N N+1
Busy Period
µ 2µNµ(N-1)µ Nµ +
Q(0) = N : all servers busy, no queue.
Let TN,N−1 = Busy Period (down-crossing N ↓ N − 1 )
TN−1,N = Idle Period (up-crossing N − 1 ↑ N )
Then P (Wait > 0) =TN,N−1
TN,N−1 + TN−1,N=
[1 +
TN−1,N
TN,N−1
]−1
.
Calculate TN−1,N =1
λNE1,N−1∼ 1
Nµ× h(−β)/√
N∼ 1√
N· 1/µ
h(−β)
TN,N−1 =1
Nµπ+(0)∼ 1√
N· β/µ
h(δ) /δ, δ = β
√µ/θ
Both apply as√
N (1− ρN) → β, −∞ < β < ∞.
Hence, P (Wait > 0) ∼[1 +
h(δ)/δ
h(−β)/β
]−1
.
1
67
Process Limits (Queueing, Waiting)• QN = {QN(t), t ≥ 0} : stochastic process obtained by
centering and rescaling:
QN =QN −N√
N
• QN(∞) : stationary distribution of QN
• Q = {Q(t), t ≥ 0} : process defined by: QN(t)d→ Q(t).
��
�
�
� �
QN(t) QN(∞)
Q(t) Q(∞)
t →∞
t →∞
N →∞ N →∞
Approximating (Virtual) Waiting Time
VN =√
N VN ⇒ V =
[1
μQ
]+
(Puhalskii, 1994)
stochastic process
Waiting Time
68
QED Erlang-X (Markovian Q’s: Performance Analysis)I Pre-History, 1914: Erlang (Erlang-B = M/M/n/n, Erlang-C = M/M/n)I Pre-History, 1974: Jagerman (Erlang-B)I History Milestone, 1981: Halfin-Whitt (Erlang-C, GI/M/n)I Erlang-A (M/M/N+M), 2002: w/ Garnett & ReimanI Erlang-A with General (Im)Patience (M/M/N+G), 2005: w/ ZeltynI Erlang-C (ED+QED), 2009: w/ ZeltynI Erlang-B with Retrial, 2010: Avram, Janssen, van LeeuwaardenI Refined Asymptotics (Erlang A/B/C), 2008-2011: Janssen, van Leeuwaarden,
Zhang, ZwartI NDS Erlang-C/A, 2009: AtarI Production Q’s, 2011: Reed & ZhangI Universal Erlang-R, ongoing: w/ Gurvich & HuangI Queueing Networks:
I (Semi-)Closed: Nurse Staffing (Jennings & de Vericourt), CCs with IVR (w/Khudiakov), Erlang-R (w/ Yom-Tov)
I CCs with Abandonment and Retrials: w. Massey, Reiman, Rider, StolyarI Markovian Service Networks: w/ Massey & Reiman
I Leaving out:I Non-Exponential Service Times: M/D/n (Erlang-D), G/Ph/n, · · · , G/GI/n+GI,
Measure-Valued DiffusionsI Dimensioning (Staffing): M/M/n, · · · , time-varying Q’s, V- and Reversed-V, · · ·I Control: V-network, Reversed-V, · · · , SBRNets
69
Back to “Why does Erlang-A Work?"
Theoretical (Partial) Answer:
M?,Jt /G∗/Nt + G
d≈ (M/M/N + M)t , t ≥ 0.
I Over-Dispersed Arrivals: R + βRc , c-Staffing (c ≥ 1/2).
I General Patience: Behavior at the origin matters most (only).
I General Services: Empirical insensitivity beyond the mean.
I Heterogeneous Customers / Servers: State-Collapse.
I Time-Varying Arrivals: Modified Offered-Load approximations.
I Dependent Building-Blocks: eg. When (Im)Patience andService-Times correlated (positively):
I Predict performance with E [S |Served].I Calculate offered-load with E [S] = E [S |Wait = 0].I Note: staffing← service-times← waiting / abandonment← staffing
70
“Why does Erlang-A Work?" General PatienceIsraeli Bank: Yearly Data
Hourly Data Aggregated
0 50 100 150 200 250 300 350 4000
0.1
0.2
0.3
0.4
0.5
0.6
0.7
0.8
Average waiting time, sec
Pro
bab
ility
to
ab
and
on
0 50 100 150 200 250
0.05
0.1
0.15
0.2
0.25
0.3
0.35
0.4
0.45
0.5
0.55
Average waiting time, sec
Pro
bab
ility
to
ab
and
on
Theory:Erlang-A: P{Ab} = θ · E[Wq]; M/M/N+G: P{Ab} ≈ g(0) · E[Wq].
g(0) = Patience-density at originRecipe:In both cases, use Erlang-A, with θ = P{Ab}/E[Wq] (slope above).References on g(0):
- Stationary M/M/N+GI, w/ Zeltyn- Process G/GI/N+GI: w/ Momcilovic; Dai & He;
71
“Why does Erlang-A Work?" Over-Dispersion
ln(STD) vs. ln(AVG) (Israeli Bank, 4/2007-3/2008)
Tue-Wed, 30 min resolutionln(sd) vs ln(average) per 30 minutes. Sundays
y = 0.8027x - 0.1235R2 = 0.9899
y = 0.8752x - 0.8589R2 = 0.9882
0
1
2
3
4
5
6
0 1 2 3 4 5 6 7 8
ln(Average Arrival)
ln(S
tand
ard
Dev
iatio
n)
00:00-10:30 10:30-00:00
Tue-Wed, 5 min resolutionln(Standard Deviation) vs ln(Average) per 5 minutes, thu-wed
y = 0.7228x - 0.0025R2 = 0.9937
y = 0.7933x - 0.5727R2 = 0.9783
0
1
2
3
4
5
0 1 2 3 4 5 6
ln(Average)
ln(S
tand
ard
Dev
iatio
n)
00:00-10:30 10:30-00:00
Significant linear relations (w/ Aldor & Feigin; then w/Maman & Zeltyn ):
ln(STD) = c · ln(AVG) + a
(Poisson: STD = AVG1/2, hence c = 1/2, a = 0.)
72
Over-Dispersion: Random Arrival-Rates
Linear relation between ln(STD) and ln(AVG) gives rise to:
Poisson-Mixture (Doubly-Poisson, Cox) model for Arrivals:Poisson(Λ) with Random-Rate of the form
Λ = λ + λc · X , c ≤ 1 ;
I c determines magnitude of over-dispersion (λc)c = 1, proportional to λ; c ≤ 1/2, Poisson-level;
- In Call Centers: c ≈ 0.75− 0.85 (significant over-dispersion).- In Emergency Departments, c ≈ 0.5 (Poisson).
I X random-variable with E [X ] = 0 (E [Λ] = λ), capturing themagnitude of stochastic deviation from mean arrival-rate:under conventional Gamma prior (λ large), X can be takenNormal with std. derived from the intercept.
QED-c Regime: Erlang-A, with Poisson(Λ) arrivals, amenable toasymptotic analysis (with S. Maman & S. Zeltyn)
73
Over-Dispersion: The QED-c Regime
QED-c Staffing: Under offered-load R = λ · E[S],
N = R + β · Rc , 0.5 < c < 1
Performance measures (M/M/N + G):
- Delay probability: P{Wq > 0} ∼ 1−G(β)
- Abandonment probability: P{Ab} ∼ E [X − β]+
n1−c
- Average offered wait: E [V ] ∼ E [X − β]+
n1−c · g0
- Average actual wait: EΛ,N [W ] ∼ EΛ,N [V ]
74
Why Does Erlang-A Work? Time-Varying Arrival Rates
Square-Root Staffing: Nt = Rt + β√
Rt , −∞ < β <∞What is Rt , the Offered-Load at time t ? ( Rt 6= λt × E[S] )
Arrivals, Offered-Load and Staffing
0
50
100
150
200
250
0 1 2 4 5 6 7 8
10
11
12
13
14
16
17
18
19
20
22
23
0
500
1000
1500
2000
Arr
iva
ls p
er
ho
ur
beta 1.2 beta 0 beta -1.2 Offered Load Arrivals
QDQED
ED
75
Time-Stable Performance of Time-Varying Systems
Delay Probability = As in the Stationary Erlang-A (Garnett)Delay Probability
0
0.1
0.2
0.3
0.4
0.5
0.6
0.7
0.8
0.9
10 1 2 4 5 6 7 8
10
11
12
13
14
16
17
18
19
20
22
23
beta 2 beta 1.6 beta 1.2 beta 0.8 beta 0.4 beta 0
beta -0.4 beta -0.8 beta -1.2 beta -1.6 beta -2
76
Time-Stable Performance of Time-Varying Systems
Waiting Time, Given Waiting:Empirical vs. Theoretical Distribution
Waiting Time given Wait > 0:
beta = 1.2 QD ( 0.1)
0
0.05
0.1
0.15
0.2
0.25
0.0
00
0.0
02
0.0
04
0.0
06
0.0
08
0.0
10
0.0
12
0.0
14
0.0
16
0.0
18
0.0
20
ho
urs
Simulated Theoretical (N=191)
Waiting Time given Wait > 0:
beta = 0 QED ( 0.5)
0
0.02
0.04
0.06
0.08
0.1
0.12
0.0
00
0.0
02
0.0
04
0.0
06
0.0
08
0.0
10
0.0
12
0.0
14
0.0
16
0.0
18
0.0
20
0.0
22
0.0
24
0.0
26
0.0
28
ho
urs
Simulated Theoretical (N=175)
Waiting Time given Wait > 0:
beta = -1.2 ED ( 0.9)
0
0.01
0.02
0.03
0.04
0.05
0.06
0.07
0.0
00
0.0
04
0.0
08
0.0
12
0.0
16
0.0
20
0.0
24
0.0
28
0.0
32
0.0
36
0.0
40
0.0
44
ho
urs
Simulated Theoretical (N=160)
- Empirical: Simulate time-varying Mt/M/Nt + M (λt ,Nt = Rt + β√
Rt )
- Theoretical: Naturally-corresponding stationary Erlang-A, with QEDβ-staffing (some Averaging Principle?)
- Generalizes up to a single-station within a complex network (eg.Doctors in an Emergency Department).
77
What is the Offered-Load R(t)?
I Offered-Load Process: L(·) = Least number of servers thatguarantees no delay.
I Offered-Load Function R(t) = E [L(t)], t ≥ 0.Think Mt/G/N?
t + G vs. Mt/G/∞: Ample-Servers.
Four (all useful) representations, capturing “workload before t":
R(t) = E [L(t)] =
∫ t
−∞λ(u) · P(S > t − u)du = E
[A(t)− A(t − S)
]=
= E[∫ t
t−Sλ(u)du
]= E [λ(t − Se)] · E [S] ≈ ... .
I {A(t), t ≥ 0} Arrival-Process, rate λ(·);I S (Se) generic Service-Time (Residual Service-Time).I Relating L, λ,S (“W ”): Time-Varying Little’s Formula.
Stationary models: λ(t) ≡ λ then R(t) ≡ λ× E[S].
QED-c: Nt = Rt + βRct , 1/2 ≤ c < 1; (c = 1 separate analysis).
78
The Offered-Load R(t), t ≥ 0I Backbone of time-varying staffing:
I Practically robust: up to a station within a complex network (ED).I Theoretically challenging: only Erlang-A with µ = θ tractable.
I Process: L(·) = Least number of servers that guarantees no delay.I Offered-Load Function R(·) = E [L(·)] (Mt/G/N?
t + G ↔ Mt/G/∞).
ILTelecom , Private12.05.2004
0.005.00
10.0015.0020.0025.0030.0035.0040.0045.0050.0055.0060.00
07:00 08:00 09:00 10:00 11:00 12:00 13:00 14:00 15:00 16:00 17:00 18:00 19:00 20:00 21:00 22:00 23:00
Time (Resolution 30 min.)
Num
ber
of c
ases
Offered load Abandons Av_agents_in_system
79
Estimating / Predicting the Offered-Load
Must account for “service times of abandoning customers".
I Prevalent Assumption: Services and (Im)Patience independent.I But recall Patient VIPs: Willing to wait more for longer services.
Survival Functions by Type (Small Israeli Bank)
31
Service Time (cont’)Survival curve, by types
Time
Surv
ival
Service times stochastic order: SNew
st< SReg
st< SVIP
Patience times stochastic order: τNew
st< τReg
st< τVIP
80
Dependent Primitives: Service- vs. Waiting-Time
Average Service-Time as a function of Waiting-TimeU.S. Bank, Retail, Weedays, January-June, 2006
Introduction Relationship Between Service Time and Patience Workload and Offered-Load Empirical Results Future Research
Motivation
Mean Service-Time as a Function of Waiting-TimeU.S. Bank - Retail Banking Service - Weekdays - January-June, 2006
190
210
230
250
270
290
150
170
190
210
230
250
270
290
0 50 100 150 200 250 300
Waiting Time
Fitted Spline Curve E(S|τ>W=w)
⇒ Focus on ( Patience, Service-Time ) jointly , w/ Reich and Ritov.E [S |Patience = w ], w ≥ 0: Service-Time of the Unserved.
81
(Imputing) Service-Times of Abandoning Customers
w/ M. Reich, Y. Ritov:
1. Estimate g(w) = E [S |Patience > Wait = w ], w ≥ 0:Mean service time of those served after waiting exactly w unitsof time (via non-linear regression: Si = g(Wi ) + εi );
2. Calculate
E [S |Patience = w ] = g(w)− g′(w)
hτ (w);
hτ (w) = hazard-rate of (im)patience (via un-censoring);
3. Offered-load calculations: Impute E [S |Patience = w ](or the conditional distribution).
Challenges: Stable and accurate inference of g,g′,hτ .
82
Extending the Notion of the “Offered-Load"
I Business (Banking Call-Center): Offered Revenues
I Healthcare (Maternity Wards): Fetus in stressI 2 patients (Mother + Child) = high operational and cognitive loadI Fetus dies⇒ emotional load dominates
I ⇒I Offered Operational Load
I Offered Cognitive Load
I Offered Emotional Load
I ⇒ Fair Division of Load (Routing) between 2 Maternity Wards:One attending to complications before birth, the other tocomplications after birth, and both share normal birth
83
The Technion SEE Center / LaboratoryData-Based Service Science / Engineering
2
84