integrating machine learning with optimization: resource ... · developers web sql finance...

42
Integrating Machine Learning with Optimization: Resource Matching Michael Jaczynski (1), Olivier Noiret (1), Fernando Orozco (1), Juan Orozco (1), Tere Gonzalez (2), Pano Santos (1). August, 2019 (1) Gurobi Optimization (2) Hitachi Labs

Upload: others

Post on 12-Jul-2020

18 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Integrating Machine Learning with Optimization: Resource ... · Developers Web SQL Finance Marketing ITOther role technology domain Document distribution used for resource profile

Integrating Machine Learning with Optimization: Resource MatchingMichael Jaczynski (1), Olivier Noiret (1), Fernando Orozco (1), Juan Orozco (1), Tere Gonzalez (2), Pano Santos (1). August, 2019

(1) Gurobi Optimization(2) Hitachi Labs

Page 2: Integrating Machine Learning with Optimization: Resource ... · Developers Web SQL Finance Marketing ITOther role technology domain Document distribution used for resource profile

Speaker IntroductionDr. Cipriano (Pano) Santos

• Sr. Technical Content Manager at Gurobi• Worked at Hewlett Packard Laboratories in Palo Alto, CA, for 23

years• Retired as Distinguished Technologist• Developed and implemented several decision support tools for

Product Life-cycle Management, Customer Relationship Management, Large Data Centers Computing Resources Allocation, Professional Services Workforce Planning, Airline Dispatcher Workload Distribution Optimization, and Operating Room Planning & Scheduling

• Eighteen patents granted• Seventeen refereed research publications• Holds a Bachelor’s Degree in Applied Mathematics from the

University of Mexico (UNAM), and a Master’s and PhD degrees in Operations Research from the University of Waterloo, Canada

© 2019, Gurobi Optimization, LLC2

Page 3: Integrating Machine Learning with Optimization: Resource ... · Developers Web SQL Finance Marketing ITOther role technology domain Document distribution used for resource profile

Speaker IntroductionMsC. Teresa (Tere) González

• Research engineer in the Industrial AI lab at the Center of Social innovation - Hitachi Labs Americas

• Projects that combine images, voice, and text to implement AI solutions in transportation and mobility domains.

• Prior, she worked in Hewlett Packard Laboratories in Palo Alto, CA. • Holds three granted patents and more than 10 in process.• MSc Degree in Information Technologies from Monterrey Institute of

Technology. • Experience and interests are in deep learning, machine learning, big

data, and Augmented Reality (AR) that can help develop the next generation of services to improve people’s lives.

·

© 2019, Gurobi Optimization, LLC3

Page 4: Integrating Machine Learning with Optimization: Resource ... · Developers Web SQL Finance Marketing ITOther role technology domain Document distribution used for resource profile

Integrating Machine Learning with Optimization: Resource MatchingMichael Jaczynski (1), Olivier Noiret (1), Fernando Orozco (1), Juan Orozco (1), Tere Gonzalez (2), Pano Santos (1). August, 2019

(1) Gurobi Optimization(2) Hitachi Labs

Page 5: Integrating Machine Learning with Optimization: Resource ... · Developers Web SQL Finance Marketing ITOther role technology domain Document distribution used for resource profile

Outline· Resource Management at Professional Services Industry

· Resource Matching Optimization (RMO) Problem Statement 5 min

· RMO Demo 15 min

· Machine Learning engine 15 min

· MIP engine 15 min

· RMO Demo Architecture 5 min

· Q&A 5 min

© 2019, Gurobi Optimization, LLC5

Page 6: Integrating Machine Learning with Optimization: Resource ... · Developers Web SQL Finance Marketing ITOther role technology domain Document distribution used for resource profile

© 2019, Gurobi Optimization, LLC6

Resource ManagementProfessional Services Industry

Page 7: Integrating Machine Learning with Optimization: Resource ... · Developers Web SQL Finance Marketing ITOther role technology domain Document distribution used for resource profile

Resource Management ProblemMotivation

• Professional service firms employs thousands of professionals making labor the industry’s highest expense

• Examples of industries where professional service firms are commonly used include :

• Information technology services

• Business process outsourcing services

• Offering customized knowledge-based services to clients• Accounting • Legal • Advertising• Engineering• … Other specialized services

© 2019, Gurobi Optimization, LLC7

Page 8: Integrating Machine Learning with Optimization: Resource ... · Developers Web SQL Finance Marketing ITOther role technology domain Document distribution used for resource profile

Resource Management Problem

Limitations of current processes & tools

• Current labor resources assignment processes and tools are manual and present many limitations leading to

• Poor demand fulfillment and low labor resource utilization,

• High project delivery costs,

• Poor customer satisfaction

© 2019, Gurobi Optimization, LLC8

Page 9: Integrating Machine Learning with Optimization: Resource ... · Developers Web SQL Finance Marketing ITOther role technology domain Document distribution used for resource profile

Resource Management Problem

The integration of ML and MIP is essential:

• Machine Learning addresses the scoring of resources and jobs, but alone cannot address the complex combinatorial nature of the assignment problem

• MIP addresses the complex combinatorial nature of the assignment problem, but alone would require a human to provide a large number of matching scores, making the approach impractical

© 2019, Gurobi Optimization, LLC9

Page 10: Integrating Machine Learning with Optimization: Resource ... · Developers Web SQL Finance Marketing ITOther role technology domain Document distribution used for resource profile

© 2019, Gurobi Optimization, LLC10

Resource Matching OptimizationProblem Statement

Page 11: Integrating Machine Learning with Optimization: Resource ... · Developers Web SQL Finance Marketing ITOther role technology domain Document distribution used for resource profile

Resource Matching OptimizationProblem Statement

· Find an assignment of resources to jobs that maximizes the total matching score of resources

and jobs, while satisfying the requirements of jobs and the availability of resources.

© 2019, Gurobi Optimization, LLC11

Resources Jobs

.

.

....

𝑅"

𝑅#

𝑅$

𝑅%

𝐽"

𝐽#

𝐽'

ms (r, j)

Extra constraints:

+ A resource must satisfy a minimum matching score to qualify for a job

+ There is a priority for each job

+ If job demand cannot be satisfied withavailable resources, then a new resourcecan be hired. The number of new resourcesthat can be hired is limited.

Page 12: Integrating Machine Learning with Optimization: Resource ... · Developers Web SQL Finance Marketing ITOther role technology domain Document distribution used for resource profile

Job-Resource Matching Score

© 2019, Gurobi Optimization, LLC12

Problem Statement

· Compute a matching score to indicate how well a resource is qualified for the job.

Job1, resource1, 0Job1, resource2, 1Jobi, resourcej, msi,j…Jobn, resourcem, msn,m

Qualification matrix

Job1, resource1, 0.5Job1, resource2, 0.7Jobi, resourcej, ms’i,j…Jobn, resourcem, ms’n,m

Resource-job score

Limited score granularity Rich score granularity

Scores

Learn resource–job scores and then optimize allocations subject to problem constraints.

Job1, resource1Job2, resource2Jobi, resourcej…Jobn, resourcem

Optimal Allocation

Best resources to jobs

Discrete to continuous score

Page 13: Integrating Machine Learning with Optimization: Resource ... · Developers Web SQL Finance Marketing ITOther role technology domain Document distribution used for resource profile

ML/AI and Optimization Integration

© 2019, Gurobi Optimization, LLC13

Problem Understanding

Identify objectives,constrains and key data parameters

DataNeeds

Data gathering and transformation

Optimization

- Forecasting- Classification,- Etc.

Machine Learning/AI

Data engineering

Actionabledecisions

Learned parameters

Collected parameters

Objective functionConstraints

Dispatching,Scheduling,Routing,Etc.

Page 14: Integrating Machine Learning with Optimization: Resource ... · Developers Web SQL Finance Marketing ITOther role technology domain Document distribution used for resource profile

© 2019, Gurobi Optimization, LLC14

Resource Matching OptimizationLive demo

Page 15: Integrating Machine Learning with Optimization: Resource ... · Developers Web SQL Finance Marketing ITOther role technology domain Document distribution used for resource profile

RMO Demo

· https://demos.gurobi.com/rmo

© 2019, Gurobi Optimization, LLC15

This demo is the result of a collaboration with the Recruitology Data Science team (resumes).

Page 16: Integrating Machine Learning with Optimization: Resource ... · Developers Web SQL Finance Marketing ITOther role technology domain Document distribution used for resource profile

© 2019, Gurobi Optimization, LLC16

Resource Matching OptimizationMachine Learning Engine

Page 17: Integrating Machine Learning with Optimization: Resource ... · Developers Web SQL Finance Marketing ITOther role technology domain Document distribution used for resource profile

Machine learning componentLearning job and resource profiles from documents

© 2019, Gurobi Optimization, LLC17

Document Classification

Technology

Role

Domain

Resumejob description

Resource-job similarity

ü Role: Developer or Data Scientist

1. Build resource – job profiles from documents.

• Classify resources and jobs into relevant classes.2. Compute matching score using weighting similarity

text

classscore

ü Technology: Web or SQL/C++

ü Domain: Marketing, Finance, or ITOther

Page 18: Integrating Machine Learning with Optimization: Resource ... · Developers Web SQL Finance Marketing ITOther role technology domain Document distribution used for resource profile

© 2019, Gurobi Optimization, LLC18

64

103

1531

2212

37

34

2

160 31

3811

0

20

40

60

80

100

120

Datascience

Developers Web SQL Finance Marketing ITOther

role technology domain

Document distribution used for resource profile classification

no augmentation augmentationTraining set1. - 167 resumes2. - 132 technical documents

(augmentation)

Job attributes (classes) of interest:ü - Roleü - Technologyü - Domain

Corpus Description

resumes

Page 19: Integrating Machine Learning with Optimization: Resource ... · Developers Web SQL Finance Marketing ITOther role technology domain Document distribution used for resource profile

Model building - Learning job-resource profile from documents

© 2019, Gurobi Optimization, LLC19

Text Processing

FeatureSelection ClassificationData

Augmentation

• Tokenization• Stemming• Lemmatization • Stop words

Adding relevant documents

• Distributed space vectors with TF/IDF

• Dimensionality reduction

documents corpus

Vectorspace models

• Random forest• SVM• Naïve Bayes

Resumes/Job description

Classesof Interest

Public sources

Document classification pipeline with small datasets

Page 20: Integrating Machine Learning with Optimization: Resource ... · Developers Web SQL Finance Marketing ITOther role technology domain Document distribution used for resource profile

Model Evaluation and Results

© 2019, Gurobi Optimization, LLC20

Experiments:1. Only resumes2. Resumes+ augmentation

Datasets:1. Testing dataset2. Demo dataset (64 resumes)

Models:1. Naïve Bayes2. Random Forest3. Support Vector Machine

Accuracy Results of Best Models on Testing Dataset

Class Model Only resumes Resumes + augmentation

Roles SVM 90.0% 94.2%

Technology RF 50.0% 95.0%

Domain SVM/RF 37.0% 84.0%

Accuracy Results of Best Models on Demo Dataset

Class Model Only resumes Resumes + augmentation

Roles SVM 84.0% 96.3%

Technology RF 64.0% 94.3%

Domain RF 38.8% 83.1%

Page 21: Integrating Machine Learning with Optimization: Resource ... · Developers Web SQL Finance Marketing ITOther role technology domain Document distribution used for resource profile

© 2019, Gurobi Optimization, LLC21

91.7 91.7 94.285

9586

8084 84

0

20

40

60

80

100

Naïve RF SVM Naïve RF SVM Naïve RF SVM

Role Technology Domain

Accuracy of job attribute classification on testing dataset

No-augmentation With-augmentation

Detailed Results – Testing dataset

Best. Models: Roles: SVM Technology : RF Domain : RF/SVM

%

92 92 9481

9488

7984 84

0

20

40

60

80

100

Naïve RF SVM Naïve RF SVM Naïve RF SVM

Role Technology Domain

Recall of Job attribute classification on demo set

No-augmentation With-augmentation

%

Only resumes Only resumes

Page 22: Integrating Machine Learning with Optimization: Resource ... · Developers Web SQL Finance Marketing ITOther role technology domain Document distribution used for resource profile

© 2019, Gurobi Optimization, LLC22

Detail Results – Demo dataset

96 96 9793 94

88

61

88

71

0

20

40

60

80

100

Naïve RF SVM Naïve RF SVM Naïve RF SVMRole Technology Domain

Recall of Job attribute classification on demo dataset

No-augmentation With-augmentation

94.1 94.1 96.2789.5

94.387.3

57.1

83.1

56.7

0

20

40

60

80

100

Naïve RF SVM Naïve RF SVM Naïve RF SVMRole Technology Domain

Accuracy of Job attribute classification on demo dataset

No-augmentation With-augmentation

% %

Only resumes Only resumes

Best. Models: Roles: SVM Technology : RF Domain : RF

Page 23: Integrating Machine Learning with Optimization: Resource ... · Developers Web SQL Finance Marketing ITOther role technology domain Document distribution used for resource profile

Resource profile example

© 2019, Gurobi Optimization, LLC23

--------------Results-------------->> Classification Results <<------- Document profile ---- ID = 0- filename = resourceA.txt- Class = {'roles': 'developer', 'domain': 'other', 'technology': 'web'}- MatchingScore = 1.0- Scores = {'roles': 1, 'domain': 1, 'technology': 1}- Relevant terms{‘Developer’,’tool’,’javascript', 'asp net', 'project', 'optimization’, 'asp', 'software engineer’,’optimization’, ‘angularjs’, ‘Development’,’engineering’,’prototype’}

Input resume

Resume classification

Page 24: Integrating Machine Learning with Optimization: Resource ... · Developers Web SQL Finance Marketing ITOther role technology domain Document distribution used for resource profile

Machine Learning Component· Extensions

• Document similarity between job description and resumes

• Add temporal models to consider job history

• Other classifier approaches

• Gradient Boosting• Other Ensemble models• Deep learning with transfer learning

© 2019, Gurobi Optimization, LLC24

Page 25: Integrating Machine Learning with Optimization: Resource ... · Developers Web SQL Finance Marketing ITOther role technology domain Document distribution used for resource profile

© 2019, Gurobi Optimization, LLC25

Resource Matching OptimizationMIP Engine

Page 26: Integrating Machine Learning with Optimization: Resource ... · Developers Web SQL Finance Marketing ITOther role technology domain Document distribution used for resource profile

RMO Demo MIP formulation .. 1

© 2019, Gurobi Optimization, LLC26

Input data

Decision variables

Budget constraint

𝐹𝑇𝐸+ = 1𝐶𝐴𝑃1 = 1 Resources Jobs

.

.

....

𝑅"

𝑅#

𝑅$

𝑅%

𝐽"

𝐽#

𝐽'

ms (r, j)

Page 27: Integrating Machine Learning with Optimization: Resource ... · Developers Web SQL Finance Marketing ITOther role technology domain Document distribution used for resource profile

RMO Demo MIP formulation .. 2

© 2019, Gurobi Optimization, LLC27

Constraints

𝐹𝑇𝐸+ = 1

𝐶𝐴𝑃1 = 1

Page 28: Integrating Machine Learning with Optimization: Resource ... · Developers Web SQL Finance Marketing ITOther role technology domain Document distribution used for resource profile

RMO Demo MIP formulation .. 3

© 2019, Gurobi Optimization, LLC28

Objective Function

Page 29: Integrating Machine Learning with Optimization: Resource ... · Developers Web SQL Finance Marketing ITOther role technology domain Document distribution used for resource profile

RMO Extension .. 1

Consider job FTE requirements (e.g. 2.4 FTE data scientists) and resource fractional capacity (e.g. data scientist is only available 60% of its time).

There is a fixed cost of recruiting one resource of each type of job.

There is a hiring budget that limits the number of new resources that can be hired.

© 2019, Gurobi Optimization, LLC29

Page 30: Integrating Machine Learning with Optimization: Resource ... · Developers Web SQL Finance Marketing ITOther role technology domain Document distribution used for resource profile

RMO Extension .. 2Challenge

The new resources to be hired need to be an integer number (headcount).

The amount of effort allocated to fill job FTE requirements from new resources hired (headcount) can be fractional numbers.

We hire headcount but allocate capacity available (effort).

© 2019, Gurobi Optimization, LLC30

Page 31: Integrating Machine Learning with Optimization: Resource ... · Developers Web SQL Finance Marketing ITOther role technology domain Document distribution used for resource profile

RMO Extension .. 3Addressing challenge:

Let’s define two types of decision variables for hiring.

· ℎ+: (effort) amount of effort allocated from new resources hired (headcount) to fill FTE requirements of job j.

· 𝑛ℎ+: (headcount) number of new resources hired to fill demand of job j.

The condition we will like to satisfy is: effort allocated is positive if and only if there is headcount available. That is: (ℎ+ ≥ 𝜖 > 0) if and only if (𝑛ℎ+ ≥ 1).

Where 𝜖 is the minimum meaningful effort that can be allocated to perform a job, we assume 𝜖 = 0.01 (about 5 min of work).

© 2019, Gurobi Optimization, LLC31

Page 32: Integrating Machine Learning with Optimization: Resource ... · Developers Web SQL Finance Marketing ITOther role technology domain Document distribution used for resource profile

RMO Extension .. 4Addressing challenge:ℎ+: amount of effort allocated from new resources hired to fill FTE requirements of job j.𝑛ℎ+: number of new resources hired to fill demand of job j.

The condition we will like to satisfy is: effort allocated is positive if and only if there is headcount available

(ℎ+ ≥ 𝜖 > 0 if and only if 𝑛ℎ+ ≥ 1)

To force this logical condition, we need an auxiliary binary variable 𝑢ℎ+ that:· 𝑢ℎ+ = 1 whenever ℎ+ ≥ 𝜖 > 0, then 𝑛ℎ+ ≥ 1 (Positive effort implies headcount hired).And· 𝑢ℎ+ = 1 whenever 𝑛ℎ+ ≥ 1, then ℎ+ ≥ 𝜖 > 0 (Headcount hired implies positive effort).

© 2019, Gurobi Optimization, LLC32

Page 33: Integrating Machine Learning with Optimization: Resource ... · Developers Web SQL Finance Marketing ITOther role technology domain Document distribution used for resource profile

RMO Extension .. 5Addressing challenge:

The condition we will like to satisfy is: ℎ+ ≥ 𝜖 > 0 if and only if 𝑛ℎ+ ≥ 1.· (L1) 𝜖 ∗ 𝑢ℎ+ ≤ ℎ+ ≤ 𝐹𝑇𝐸+ ∗ 𝑢ℎ+· (L2) 𝑢ℎ+ ≤ 𝑛ℎ+ ≤ 𝐹𝑇𝐸+ ∗ 𝑢ℎ+

Let’s check that constraints L1 and L2 force the if-and-only-if logical condition.· If ℎ+ > 0 then by (L1: ℎ+ ≤ 𝐹𝑇𝐸+ ∗ 𝑢ℎ+ ), 𝑢ℎ+ = 1. By (L2: 𝑢ℎ+ ≤ 𝑛ℎ+ ), we have 𝑛ℎ+ ≥ 1· If ℎ+ = 0 then by (L1: 𝜖 ∗ 𝑢ℎ+ ≤ ℎ+), 𝑢ℎ+ = 0. By (L2: 𝑛ℎ+ ≤ 𝐹𝑇𝐸+ ∗ 𝑢ℎ+), we have 𝑛ℎ+ = 0· If 𝑛ℎ+ ≥ 1 then (L2: 𝑛ℎ+ ≤ 𝐹𝑇𝐸+ ∗ 𝑢ℎ+ ), 𝑢ℎ+ = 1. By (L1: 𝜖 ∗ 𝑢ℎ+ ≤ ℎ+), we have ℎ+ ≥ 𝜖 > 0· If 𝑛ℎ+ = 0 then (L2: 𝑢ℎ+ ≤ 𝑛ℎ+), 𝑢ℎ+ = 0. By (L1:ℎ+ ≤ 𝐹𝑇𝐸+ ∗ 𝑢ℎ+), we have ℎ+ = 0

Q.E.D.

© 2019, Gurobi Optimization, LLC33

Page 34: Integrating Machine Learning with Optimization: Resource ... · Developers Web SQL Finance Marketing ITOther role technology domain Document distribution used for resource profile

RMO Extension MIP Formulation .. 1

© 2019, Gurobi Optimization, LLC34

Decision variables

Budget constraint

𝐹𝑇𝐸+ ≥ 00 ≤ 𝐶𝐴𝑃1 ≤ 1 Resources Jobs

.

.

....

𝑅"

𝑅#

𝑅$

𝑅%

𝐽"

𝐽#

𝐽'

ms (r, j)

Page 35: Integrating Machine Learning with Optimization: Resource ... · Developers Web SQL Finance Marketing ITOther role technology domain Document distribution used for resource profile

RMO Extension MIP Formulation .. 2

© 2019, Gurobi Optimization, LLC35

Constraints

𝐹𝑇𝐸+ ≥ 0

0 ≤ 𝐶𝐴𝑃1 ≤ 1

Page 36: Integrating Machine Learning with Optimization: Resource ... · Developers Web SQL Finance Marketing ITOther role technology domain Document distribution used for resource profile

RMO Extension MIP Formulation .. 3

© 2019, Gurobi Optimization, LLC36

Constraints

Page 37: Integrating Machine Learning with Optimization: Resource ... · Developers Web SQL Finance Marketing ITOther role technology domain Document distribution used for resource profile

RMO Extension MIP Formulation .. 4

© 2019, Gurobi Optimization, LLC37

Objective Function

Page 38: Integrating Machine Learning with Optimization: Resource ... · Developers Web SQL Finance Marketing ITOther role technology domain Document distribution used for resource profile

Conclusions: Integrating Machine Learning with Optimization

Core Message:

• Predicting is not enough.

• You need to react to your predictions and decide the best course of action

An upper bound on the number of possible assignment to explore is the combinations of assigning 62 resources among the 2,620 positive matching scores.

2,62062 = 1.3208313298e+126

This is a daunting number, a MIP approach is required to efficiently navigate this solution space and identify the Global optimal solution, or declare the problem infeasible.Typically, people use a ranking approach based on the matching scores to assign resources to jobs. This heuristic may lead to very poor suboptimal assignments.

© 2019, Gurobi Optimization, LLC38

Page 39: Integrating Machine Learning with Optimization: Resource ... · Developers Web SQL Finance Marketing ITOther role technology domain Document distribution used for resource profile

© 2019, Gurobi Optimization, LLC39

Resource Matching OptimizationArchitecture

Page 40: Integrating Machine Learning with Optimization: Resource ... · Developers Web SQL Finance Marketing ITOther role technology domain Document distribution used for resource profile

RMO Architecture

© 2018 Gurobi Optimization

RMO Server(Nodejs)

Message Queue(Redis)

RMO Worker(Python)Scenario

Database(MongoDB)

RMO Web UI(React)

Build optimization model using the Gurobi Python API. Solve using Gurobi Instant Cloud. Compute KPIs.

HTTPWebSockets

Optimization(Gurobi)

Machine Learning (Document

Classification)

Compute job/resource matching scores

Store jobs, resources, matching scores and optimized assignment plan

Step 1

Step 2

Machine Learning (Training)

Build document classification model based on training corpus

Deploy classification model in Docker container

Page 41: Integrating Machine Learning with Optimization: Resource ... · Developers Web SQL Finance Marketing ITOther role technology domain Document distribution used for resource profile

Thank You– Questions?

Page 42: Integrating Machine Learning with Optimization: Resource ... · Developers Web SQL Finance Marketing ITOther role technology domain Document distribution used for resource profile

Next Steps

• If you haven’t already done so, please register for an account at www.gurobi.com.

• Try the demo yourself! https://demos.gurobi.com/rmo

• For questions about Gurobi pricing contact [email protected].

• Additional resources:

• HP Enterprise Services Uses Optimization for Resource Planning

• https://doi.org/10.1287/inte.1110.0621

• Adaptive Employee Profile Classification for Resource Planning Tool

• https://www.researchgate.net/publication/261308627_Adaptive_Employee_Profile_Classification_for_Resource_Planning_Tool

© 2019, Gurobi Optimization, LLC42