Integrating Machine Learning with Optimization: Resource Matching
Michael Jaczynski (1), Olivier Noiret (1), Fernando Orozco (1), Juan Orozco (1), Tere Gonzalez (2), Pano Santos (1). August 2019
(1) Gurobi Optimization
(2) Hitachi Labs
Speaker Introduction
Dr. Cipriano (Pano) Santos
• Sr. Technical Content Manager at Gurobi
• Worked at Hewlett Packard Laboratories in Palo Alto, CA, for 23 years
• Retired as Distinguished Technologist
• Developed and implemented several decision support tools for Product Life-cycle Management, Customer Relationship Management, Large Data Centers Computing Resources Allocation, Professional Services Workforce Planning, Airline Dispatcher Workload Distribution Optimization, and Operating Room Planning & Scheduling
• Eighteen patents granted
• Seventeen refereed research publications
• Holds a Bachelor’s degree in Applied Mathematics from the University of Mexico (UNAM), and Master’s and PhD degrees in Operations Research from the University of Waterloo, Canada
© 2019, Gurobi Optimization, LLC
Speaker Introduction
MSc. Teresa (Tere) González
• Research engineer in the Industrial AI lab at the Center of Social Innovation, Hitachi Labs Americas
• Works on projects that combine images, voice, and text to implement AI solutions in transportation and mobility domains
• Previously worked at Hewlett Packard Laboratories in Palo Alto, CA
• Holds three granted patents and more than 10 in process
• MSc degree in Information Technologies from Monterrey Institute of Technology
• Experience and interests are in deep learning, machine learning, big data, and Augmented Reality (AR) that can help develop the next generation of services to improve people’s lives
Outline
· Resource Management in the Professional Services Industry
· Resource Matching Optimization (RMO) Problem Statement 5 min
· RMO Demo 15 min
· Machine Learning engine 15 min
· MIP engine 15 min
· RMO Demo Architecture 5 min
· Q&A 5 min
Resource Management
Professional Services Industry
Resource Management Problem
Motivation
• Professional service firms employ thousands of professionals, making labor the industry’s highest expense
• Examples of industries where professional service firms are commonly used include:
• Information technology services
• Business process outsourcing services
• Customized knowledge-based services offered to clients:
• Accounting
• Legal
• Advertising
• Engineering
• … Other specialized services
Resource Management Problem
Limitations of current processes & tools
• Current labor resource assignment processes and tools are manual and have many limitations, leading to:
• Poor demand fulfillment and low labor resource utilization
• High project delivery costs
• Poor customer satisfaction
Resource Management Problem
The integration of ML and MIP is essential:
• Machine Learning addresses the scoring of resources and jobs, but alone cannot address the complex combinatorial nature of the assignment problem
• MIP addresses the complex combinatorial nature of the assignment problem, but alone would require a human to provide a large number of matching scores, making the approach impractical
Resource Matching Optimization
Problem Statement
· Find an assignment of resources to jobs that maximizes the total matching score of resources and jobs, while satisfying the requirements of jobs and the availability of resources.
[Figure: bipartite graph linking resources (R) to jobs (J); each edge carries the job-resource matching score ms(r, j)]
Extra constraints:
+ A resource must satisfy a minimum matching score to qualify for a job
+ There is a priority for each job
+ If job demand cannot be satisfied with available resources, then a new resource can be hired. The number of new resources that can be hired is limited.
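A minimal sketch of the problem statement on a toy instance, assuming each job needs exactly one resource and each resource can take at most one job. The scores, the threshold `MIN_SCORE`, and the function name `best_assignment` are hypothetical. Exhaustive enumeration is used only because the instance is tiny; it is not the method used in the demo, and job priorities and hiring are left out.

```python
from itertools import permutations

# Hypothetical matching scores ms(r, j): outer keys are resources, inner keys are jobs.
SCORES = {
    "R1": {"J1": 0.9, "J2": 0.8},
    "R2": {"J1": 0.8, "J2": 0.1},
    "R3": {"J1": 0.4, "J2": 0.6},
}
MIN_SCORE = 0.3  # assumed minimum matching score to qualify for a job


def best_assignment(scores, min_score):
    """Enumerate every one-to-one assignment of resources to jobs and
    return (total_score, {job: resource}) for the best qualifying one."""
    resources = list(scores)
    jobs = list(next(iter(scores.values())))
    best_total, best_plan = float("-inf"), None
    for chosen in permutations(resources, len(jobs)):
        pairs = list(zip(jobs, chosen))
        # Skip plans that use an unqualified resource for any job.
        if any(scores[r][j] < min_score for j, r in pairs):
            continue
        total = sum(scores[r][j] for j, r in pairs)
        if total > best_total:
            best_total, best_plan = total, dict(pairs)
    return best_total, best_plan


total, plan = best_assignment(SCORES, MIN_SCORE)
print(plan, round(total, 2))
```

Note that the best plan does not give job J1 its individually best resource: the total score is what is maximized.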
Problem Statement
· Compute a matching score to indicate how well a resource is qualified for the job.
Qualification matrix (limited score granularity):
Job1, resource1, 0
Job1, resource2, 1
Jobi, resourcej, ms_{i,j}
…
Jobn, resourcem, ms_{n,m}
Resource-job score (rich score granularity):
Job1, resource1, 0.5
Job1, resource2, 0.7
Jobi, resourcej, ms'_{i,j}
…
Jobn, resourcem, ms'_{n,m}
Scores: discrete to continuous score
· Learn resource–job scores and then optimize allocations subject to problem constraints.
Optimal allocation (best resources to jobs):
Job1, resource1
Job2, resource2
Jobi, resourcej
…
Jobn, resourcem
ML/AI and Optimization Integration
[Figure: ML/AI and optimization integration workflow, with the following elements]
· Problem understanding: identify objectives, constraints, and key data parameters
· Data needs
· Data engineering: data gathering and transformation
· Machine Learning/AI: forecasting, classification, etc.
· Optimization: objective function and constraints, using learned parameters and collected parameters
· Actionable decisions: dispatching, scheduling, routing, etc.
Resource Matching Optimization
Live demo
RMO Demo
· https://demos.gurobi.com/rmo
This demo is the result of a collaboration with the Recruitology Data Science team (resumes).
Resource Matching Optimization
Machine Learning Engine
Machine learning component
Learning job and resource profiles from documents
Document Classification
1. Build resource-job profiles from documents (resume, job description).
• Classify resources and jobs into relevant classes:
✓ Role: Developer or Data Scientist
✓ Technology: Web or SQL/C++
✓ Domain: Marketing, Finance, or IT/Other
2. Compute the matching score (resource-job similarity) using weighted similarity of the class scores.
[Figure: text → class → score]
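A minimal sketch of step 2, assuming the matching score is a weighted average of per-attribute agreement between a resume profile and a job profile. The attribute keys follow the profile output shown in this talk (`roles`, `technology`, `domain`); the equal weights and the 0/1 agreement rule are assumptions, and a real system could use classifier probabilities instead to obtain finer-grained scores.

```python
# Hypothetical class weights; the talk does not state the actual values.
WEIGHTS = {"roles": 1.0, "technology": 1.0, "domain": 1.0}


def class_scores(resource_profile, job_profile):
    """Score each attribute 1 if the predicted classes agree, else 0."""
    return {k: int(resource_profile[k] == job_profile[k]) for k in WEIGHTS}


def matching_score(resource_profile, job_profile, weights=WEIGHTS):
    """Weighted average of the per-attribute class scores, in [0, 1]."""
    scores = class_scores(resource_profile, job_profile)
    return sum(weights[k] * scores[k] for k in weights) / sum(weights.values())


resume = {"roles": "developer", "technology": "web", "domain": "other"}
job_a = {"roles": "developer", "technology": "web", "domain": "other"}
job_b = {"roles": "developer", "technology": "sql", "domain": "finance"}

print(matching_score(resume, job_a))  # all three attributes agree
print(matching_score(resume, job_b))  # only the role agrees
```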
Model building - Learning job-resource profile from documents
Corpus Description
Training set:
1. 167 resumes
2. 132 technical documents (augmentation)
Job attributes (classes) of interest:
✓ Role
✓ Technology
✓ Domain
[Figure: bar chart "Document distribution used for resource profile classification", showing document counts per class, grouped by role (Data science, Developers), technology (Web, SQL), and domain (Finance, Marketing, IT/Other), with and without augmentation]
Document classification pipeline with small datasets
Input: document corpus (resumes/job descriptions)
Text Processing
• Tokenization
• Stemming
• Lemmatization
• Stop words
Data Augmentation
• Adding relevant documents from public sources
Feature Selection (vector space models)
• Distributed space vectors with TF/IDF
• Dimensionality reduction
Classification
• Random forest
• SVM
• Naïve Bayes
Output: classes of interest
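The text-processing and TF/IDF stages can be illustrated with a small standard-library sketch. The stop-word list, the sample documents, and the function names are hypothetical, and stemming, lemmatization, dimensionality reduction, and the classifiers themselves are omitted; this is not the pipeline code used in the demo.

```python
import math
import re
from collections import Counter

# Tiny hypothetical stop-word list and corpus, for illustration only.
STOP_WORDS = {"a", "an", "and", "in", "of", "the", "with"}
DOCS = [
    "Developer with experience in JavaScript and AngularJS",
    "Data scientist with experience in SQL and the finance domain",
    "Developer of SQL tools",
]


def tokenize(text):
    """Lowercase, split on non-letters, and drop stop words."""
    return [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOP_WORDS]


def tfidf(docs):
    """Return one {term: weight} dict per document, using
    tf = count / document length and idf = log(N / document frequency)."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        vectors.append({
            term: (count / len(tokens)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return vectors


vectors = tfidf(DOCS)
for doc, vec in zip(DOCS, vectors):
    top = sorted(vec, key=vec.get, reverse=True)[:3]
    print(doc[:30], "->", top)
```

Terms that appear in fewer documents (such as a specific technology name) receive higher weights than terms shared across the corpus, which is what makes the vectors useful as classifier features.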
Model Evaluation and Results
Experiments:
1. Only resumes
2. Resumes + augmentation
Datasets:
1. Testing dataset
2. Demo dataset (64 resumes)
Models:
1. Naïve Bayes
2. Random Forest (RF)
3. Support Vector Machine (SVM)
Accuracy Results of Best Models on Testing Dataset
Class        Model    Only resumes   Resumes + augmentation
Roles        SVM      90.0%          94.2%
Technology   RF       50.0%          95.0%
Domain       SVM/RF   37.0%          84.0%
Accuracy Results of Best Models on Demo Dataset
Class        Model    Only resumes   Resumes + augmentation
Roles        SVM      84.0%          96.3%
Technology   RF       64.0%          94.3%
Domain       RF       38.8%          83.1%
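For readers less familiar with the metrics, the sketch below shows how accuracy and per-class recall are computed from true and predicted labels. The label lists are hypothetical and are not taken from the experiments reported here.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of documents whose predicted class equals the true class."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def recall_per_class(y_true, y_pred):
    """For each true class, the fraction of its documents predicted correctly."""
    totals = Counter(y_true)
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return {c: hits[c] / totals[c] for c in totals}


# Hypothetical labels for the 'technology' attribute.
y_true = ["web", "web", "web", "sql", "sql"]
y_pred = ["web", "web", "sql", "sql", "sql"]

print(accuracy(y_true, y_pred))
print(recall_per_class(y_true, y_pred))
```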
Detailed Results – Testing dataset
[Figure: bar charts comparing the no-augmentation (only resumes) and with-augmentation series; the values below are the with-augmentation series]
Accuracy of job attribute classification on testing dataset (%)
Class        Naïve Bayes   RF     SVM
Role         91.7          91.7   94.2
Technology   85            95     86
Domain       80            84     84
Recall of job attribute classification on demo set (%)
Class        Naïve Bayes   RF     SVM
Role         92            92     94
Technology   81            94     88
Domain       79            84     84
Best models: Roles: SVM; Technology: RF; Domain: RF/SVM
Detailed Results – Demo dataset
[Figure: bar charts comparing the no-augmentation (only resumes) and with-augmentation series; the values below are the with-augmentation series]
Recall of job attribute classification on demo dataset (%)
Class        Naïve Bayes   RF     SVM
Role         96            96     97
Technology   93            94     88
Domain       61            88     71
Accuracy of job attribute classification on demo dataset (%)
Class        Naïve Bayes   RF     SVM
Role         94.1          94.1   96.27
Technology   89.5          94.3   87.3
Domain       57.1          83.1   56.7
Best models: Roles: SVM; Technology: RF; Domain: RF
Resource profile example
Input resume → resume classification:
--------------Results--------------
>> Classification Results <<
------- Document profile ---
- ID = 0
- filename = resourceA.txt
- Class = {'roles': 'developer', 'domain': 'other', 'technology': 'web'}
- MatchingScore = 1.0
- Scores = {'roles': 1, 'domain': 1, 'technology': 1}
- Relevant terms
{'Developer', 'tool', 'javascript', 'asp net', 'project', 'optimization', 'asp', 'software engineer', 'optimization', 'angularjs', 'Development', 'engineering', 'prototype'}
Machine Learning Component
· Extensions
• Document similarity between job description and resumes
• Add temporal models to consider job history
• Other classifier approaches:
• Gradient Boosting
• Other ensemble models
• Deep learning with transfer learning
Resource Matching Optimization
MIP Engine
RMO Demo MIP formulation .. 1
Input data
Decision variables
Budget constraint
[Figure: bipartite graph of resources (R) and jobs (J) with matching scores ms(r, j); in the demo, FTE_j = 1 and CAP_r = 1]
RMO Demo MIP formulation .. 2
Constraints
FTE_j = 1
CAP_r = 1
RMO Demo MIP formulation .. 3
Objective Function
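As a reading aid, one standard way to write an assignment model consistent with the problem statement is sketched below. It is an assumption, not the formulation shown in the talk: the binary variable x_{rj} (resource r assigned to job j), the threshold \tau, the priority weights w_j, the hiring costs c_j, and the budget B are hypothetical names, and nh_j is restricted to {0, 1} because every job needs one FTE in the demo.

```latex
% Assumed notation: R = resources, J = jobs, ms_{rj} = matching score,
% \tau = minimum qualifying score, w_j = priority weight of job j,
% c_j = cost of hiring for job j, B = hiring budget.
\begin{align*}
\max \quad & \sum_{j \in J} w_j \sum_{r \in R:\, ms_{rj} \ge \tau} ms_{rj}\, x_{rj} \\
\text{s.t.} \quad
& \sum_{r \in R:\, ms_{rj} \ge \tau} x_{rj} + nh_j = FTE_j = 1 && \forall j \in J \\
& \sum_{j \in J:\, ms_{rj} \ge \tau} x_{rj} \le CAP_r = 1 && \forall r \in R \\
& \sum_{j \in J} c_j\, nh_j \le B \\
& x_{rj} \in \{0, 1\}, \quad nh_j \in \{0, 1\}
\end{align*}
```

In this sketch a new hire contributes nothing to the objective, so the model prefers qualified existing resources and hires only when a job cannot otherwise be filled within the budget.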
RMO Extension .. 1
Consider job FTE requirements (e.g., 2.4 FTE data scientists) and resource fractional capacity (e.g., a data scientist is only available 60% of the time).
There is a fixed cost of recruiting one resource for each type of job.
There is a hiring budget that limits the number of new resources that can be hired.
RMO Extension .. 2
Challenge
The number of new resources to be hired must be an integer (headcount).
The amount of effort allocated from newly hired resources (headcount) to fill job FTE requirements can be fractional.
We hire headcount but allocate available capacity (effort).
RMO Extension .. 3
Addressing challenge:
Let’s define two types of decision variables for hiring.
· h_j: (effort) amount of effort allocated from new resources hired (headcount) to fill the FTE requirements of job j.
· nh_j: (headcount) number of new resources hired to fill the demand of job j.
The condition we would like to satisfy is: effort allocated is positive if and only if there is headcount available. That is: (h_j ≥ ε > 0) if and only if (nh_j ≥ 1).
Here ε is the minimum meaningful effort that can be allocated to perform a job; we assume ε = 0.01 (about 5 min of work).
RMO Extension .. 4
Addressing challenge:
· h_j: amount of effort allocated from new resources hired to fill the FTE requirements of job j.
· nh_j: number of new resources hired to fill the demand of job j.
The condition we would like to satisfy is: effort allocated is positive if and only if there is headcount available
(h_j ≥ ε > 0 if and only if nh_j ≥ 1)
To force this logical condition, we need an auxiliary binary variable uh_j such that:
· uh_j = 1 whenever h_j ≥ ε > 0, which in turn forces nh_j ≥ 1 (positive effort implies headcount hired).
And
· uh_j = 1 whenever nh_j ≥ 1, which in turn forces h_j ≥ ε > 0 (headcount hired implies positive effort).
RMO Extension .. 5
Addressing challenge:
The condition we would like to satisfy is: h_j ≥ ε > 0 if and only if nh_j ≥ 1.
· (L1) ε * uh_j ≤ h_j ≤ FTE_j * uh_j
· (L2) uh_j ≤ nh_j ≤ FTE_j * uh_j
Let’s check that constraints L1 and L2 force the if-and-only-if logical condition.
· If h_j > 0, then by (L1: h_j ≤ FTE_j * uh_j), uh_j = 1. By (L2: uh_j ≤ nh_j), we have nh_j ≥ 1.
· If h_j = 0, then by (L1: ε * uh_j ≤ h_j), uh_j = 0. By (L2: nh_j ≤ FTE_j * uh_j), we have nh_j = 0.
· If nh_j ≥ 1, then by (L2: nh_j ≤ FTE_j * uh_j), uh_j = 1. By (L1: ε * uh_j ≤ h_j), we have h_j ≥ ε > 0.
· If nh_j = 0, then by (L2: uh_j ≤ nh_j), uh_j = 0. By (L1: h_j ≤ FTE_j * uh_j), we have h_j = 0.
Q.E.D.
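The same check can be done numerically. The sketch below enumerates a small grid of candidate values for one job and confirms that every point satisfying (L1) and (L2) also satisfies the if-and-only-if condition. The value `FTE = 3` and the list of candidate efforts are hypothetical; ε = 0.01 is the value assumed above.

```python
from itertools import product

EPS = 0.01  # minimum meaningful effort, as assumed in the talk
FTE = 3     # hypothetical FTE requirement of job j (integer for simplicity)


def feasible(h, nh, uh):
    """Check constraints (L1) and (L2) for one candidate point."""
    l1 = EPS * uh <= h <= FTE * uh
    l2 = uh <= nh <= FTE * uh
    return l1 and l2


# Candidate efforts include 0, a value below EPS, EPS itself, and larger values.
efforts = [0.0, 0.005, EPS, 0.6, 2.4, float(FTE)]
headcounts = range(0, FTE + 1)

points = [
    (h, nh, uh)
    for h, nh, uh in product(efforts, headcounts, (0, 1))
    if feasible(h, nh, uh)
]

# Every feasible point satisfies: h >= EPS if and only if nh >= 1.
assert all((h >= EPS) == (nh >= 1) for h, nh, _ in points)
# Effort strictly between 0 and EPS is never feasible.
assert all(h == 0.0 or h >= EPS for h, _, _ in points)
print(len(points), "feasible points checked")
```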
RMO Extension MIP Formulation .. 1
Decision variables
Budget constraint
[Figure: bipartite graph of resources (R) and jobs (J) with matching scores ms(r, j); in the extension, FTE_j ≥ 0 and 0 ≤ CAP_r ≤ 1]
RMO Extension MIP Formulation .. 2
Constraints
FTE_j ≥ 0
0 ≤ CAP_r ≤ 1
RMO Extension MIP Formulation .. 3
Constraints
RMO Extension MIP Formulation .. 4
Objective Function
Conclusions: Integrating Machine Learning with Optimization
Core Message:
• Predicting is not enough.
• You need to react to your predictions and decide the best course of action
An upper bound on the number of possible assignments to explore is the number of ways of choosing 62 resources among the 2,620 positive matching scores:
C(2620, 62) ≈ 1.3208313298e+126
This is a daunting number. A MIP approach is required to efficiently navigate this solution space and identify the global optimal solution, or declare the problem infeasible.
Typically, people use a ranking approach based on the matching scores to assign resources to jobs. This heuristic may lead to poor, suboptimal assignments.
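Both points can be illustrated with a short sketch. It first evaluates the binomial coefficient quoted above, then compares a ranking heuristic with the true optimum on a hypothetical 2x2 instance. The score matrix and the function names are made up for illustration, and the exhaustive search stands in for a MIP solver only because the instance is tiny.

```python
import math
from itertools import permutations

# Upper bound quoted in the talk: choose 62 of the 2,620 positive scores.
upper_bound = math.comb(2620, 62)
print(f"{float(upper_bound):.4e}")

# Hypothetical 2x2 instance: ms[r][j] for resources r and jobs j.
ms = [[0.9, 0.8],
      [0.8, 0.1]]


def greedy_by_rank(ms):
    """Repeatedly take the highest remaining score whose resource and job are both free."""
    pairs = sorted(
        ((ms[r][j], r, j) for r in range(len(ms)) for j in range(len(ms[0]))),
        reverse=True,
    )
    used_r, used_j, total = set(), set(), 0.0
    for score, r, j in pairs:
        if r not in used_r and j not in used_j:
            used_r.add(r)
            used_j.add(j)
            total += score
    return total


def optimal(ms):
    """Exhaustive search over one-to-one assignments (tiny instances only)."""
    n = len(ms)
    return max(sum(ms[r][j] for j, r in enumerate(p)) for p in permutations(range(n)))


print(greedy_by_rank(ms), optimal(ms))
```

The ranking heuristic locks in the single best pair first and is then forced to accept a very poor match for the remaining job, while the optimal plan gives up that pair to obtain two good matches.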
Resource Matching Optimization
Architecture
RMO Architecture
© 2018 Gurobi Optimization
[Figure: RMO architecture diagram, with the following components]
· RMO Web UI (React)
· Communication: HTTP, WebSockets
· RMO Server (Node.js)
· Message Queue (Redis)
· RMO Worker (Python), which runs two steps:
· Machine Learning (Document Classification): compute job/resource matching scores
· Optimization (Gurobi): build the optimization model using the Gurobi Python API, solve using Gurobi Instant Cloud, and compute KPIs
· Scenario Database (MongoDB): store jobs, resources, matching scores, and the optimized assignment plan
· Machine Learning (Training): build the document classification model based on the training corpus, and deploy the classification model in a Docker container
Thank You – Questions?
Next Steps
• If you haven’t already done so, please register for an account at www.gurobi.com.
• Try the demo yourself! https://demos.gurobi.com/rmo
• For questions about Gurobi pricing contact [email protected].
• Additional resources:
• HP Enterprise Services Uses Optimization for Resource Planning
• https://doi.org/10.1287/inte.1110.0621
• Adaptive Employee Profile Classification for Resource Planning Tool
• https://www.researchgate.net/publication/261308627_Adaptive_Employee_Profile_Classification_for_Resource_Planning_Tool