NLP text recommender system: Journey to auto model training at scale

Aditya Sakhuja

Engineering Lead, Salesforce Einstein
@sakhuja

Agenda
● Goal
● Scenario
● Approach & Metrics
● ML System Architecture
  ○ Recs Serving
  ○ Feature Engineering
  ○ Model Training
● ML System Evolution
● Training CI, Deployments & Rollbacks
● Cloud Native
● Challenges & Takeaways

Customer Service Agent Assist

Goal

Agents rely on traditional search results to find relevant answers to customer questions that are often long and time-sensitive.

Scenario

Approach

Business Metrics
● Agent Time to Resolution
● Agent Time spent per case
● Case-Article Attach Rate

● # of recommendations served
● MAO, MAU
● Serving Latency

ML System Architecture

Recommendations Serving

Layer 1 : Candidate Generation
● NLP : Extract POS, NER, Noun and key terms from user query
● IR specific Query Formulation
● Candidates Generated

Layer 2 : Ranking Model
● <question, article> pairwise feature generation
● Candidates evaluated by model
● Candidates above the threshold are recommended
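The two layers above can be sketched roughly as follows. This is a minimal illustration, not the system described in the talk: the `Article` type, the term-overlap candidate generator, the two pairwise features, and the `score_fn` callable standing in for the trained ranking model are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Article:
    # Hypothetical article record; field names are assumptions.
    article_id: str
    text: str


def tokenize(text: str) -> List[str]:
    # Stand-in for the NLP step (POS, NER, key-term extraction):
    # lowercase, strip punctuation, drop very short tokens.
    cleaned = (t.strip(".,?!").lower() for t in text.split())
    return [t for t in cleaned if len(t) > 2]


def generate_candidates(question: str, corpus: List[Article],
                        limit: int = 50) -> List[Article]:
    # Layer 1: keep articles sharing at least one term with the question,
    # ordered by the number of shared terms.
    terms = set(tokenize(question))
    scored = [(len(terms & set(tokenize(a.text))), a) for a in corpus]
    scored = [(s, a) for s, a in scored if s > 0]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [a for _, a in scored[:limit]]


def pair_features(question: str, article: Article) -> List[float]:
    # Layer 2 input: <question, article> pairwise features.
    q, a = set(tokenize(question)), set(tokenize(article.text))
    overlap = len(q & a)
    jaccard = overlap / len(q | a) if q | a else 0.0
    coverage = overlap / len(q) if q else 0.0
    return [jaccard, coverage]


def recommend(question: str, corpus: List[Article],
              score_fn: Callable[[List[float]], float],
              threshold: float = 0.5) -> List[Tuple[str, float]]:
    # Score every candidate and recommend only those at or above the threshold.
    candidates = generate_candidates(question, corpus)
    scored = [(score_fn(pair_features(question, a)), a) for a in candidates]
    kept = [(s, a) for s, a in scored if s >= threshold]
    kept.sort(key=lambda pair: pair[0], reverse=True)
    return [(a.article_id, s) for s, a in kept]
```

In a real deployment `score_fn` would be the trained ranking model's probability output; any callable mapping a feature vector to a score works for experimentation.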

Recommendations Serving

Data Prep & Feature Engineering

● Multi-tenant data ingestion pipeline
● Data Cleansing and Sanity checks
● Precompute TDF, Corpus Statistics
● Feature Vectors computation
● 100+ NLP features across different statistical feature categories
● Serving Training Drift
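To illustrate why corpus statistics are precomputed before feature vectors, here is a small sketch of one possible statistical feature. It assumes the precomputed statistic is a per-corpus document-frequency table turned into smoothed IDF weights; the function names and the specific feature (IDF-weighted overlap) are illustrative assumptions, not features confirmed by the talk.

```python
import math
from collections import Counter
from typing import Dict, List


def document_frequencies(docs: List[List[str]]) -> Counter:
    # Count, for each term, the number of documents that contain it.
    df: Counter = Counter()
    for tokens in docs:
        df.update(set(tokens))
    return df


def idf_table(df: Counter, n_docs: int) -> Dict[str, float]:
    # Smoothed inverse document frequency: log((1 + N) / (1 + df)) + 1.
    return {t: math.log((1 + n_docs) / (1 + c)) + 1.0 for t, c in df.items()}


def idf_weighted_overlap(question: List[str], article: List[str],
                         idf: Dict[str, float]) -> float:
    # Fraction of the question's IDF mass that also appears in the article.
    # Terms missing from the precomputed table contribute nothing.
    q_terms = set(question)
    total = sum(idf.get(t, 0.0) for t in q_terms)
    if total == 0.0:
        return 0.0
    shared = sum(idf.get(t, 0.0) for t in q_terms & set(article))
    return shared / total
```

Computing the table once per tenant corpus and reusing it for both training and serving is one way to keep the two feature paths consistent.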

Model Training
● Ranking Model
● Auto tuned hyperparams
● Auto Model comparison
● Metrics
  ○ AUC
  ○ F-Measure
  ○ Precision, Recall
  ○ Hit Rate @K
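Some of the metrics listed above can be computed as in the sketch below. It assumes binary relevance labels for precision, recall and F-measure, and defines Hit Rate @K as the share of questions with at least one relevant article in the top K recommendations; the talk does not spell out its exact definitions, so treat these as common conventions rather than the author's formulas.

```python
from typing import Dict, List, Sequence, Set, Tuple


def precision_recall_f1(y_true: Sequence[int],
                        y_pred: Sequence[int]) -> Tuple[float, float, float]:
    # Binary labels: 1 = relevant / recommended, 0 = not.
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


def hit_rate_at_k(ranked: Dict[str, List[str]],
                  relevant: Dict[str, Set[str]], k: int) -> float:
    # ranked: question id -> article ids ordered by model score.
    # relevant: question id -> article ids known to be correct.
    if not ranked:
        return 0.0
    hits = sum(
        1 for q, articles in ranked.items()
        if set(articles[:k]) & relevant.get(q, set())
    )
    return hits / len(ranked)
```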

Model Auto Training Pipeline

ML System Evolution

version 0
● Heuristic-based answer recommendations POC. First pilot sign-up.
● Communities use case: community-selected bestAnswer used as the positive label.
● Generic model trained on the open source Stanford SQuAD dataset

version 1
● Ranking model : <question, answer> pairwise probability
● Notebook-based, on-demand training
● Statically configured data filtering

https://rajpurkar.github.io/SQuAD-explorer/

ML System Evolution

version 2
● Dynamically configured training dataset attributes
● Model retraining
● Multilingual Support
● Multitenant Auto-trained models
● Observability
● Trained Model Deployments & Rollbacks

Model Deployment, CI & Rollbacks
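One way to picture automated model comparison with deployments and rollbacks is a per-tenant version history in which a newly trained model goes live only if it does not regress on the chosen metric. The sketch below is hypothetical: `ModelVersion`, `TenantRegistry`, the use of AUC as the gating metric, and the `min_gain` parameter are assumptions for illustration and do not describe the actual pipeline.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ModelVersion:
    # Hypothetical record of one trained model for one tenant.
    version: str
    auc: float


@dataclass
class TenantRegistry:
    # Ordered history of promoted models; the last entry is live.
    history: List[ModelVersion] = field(default_factory=list)

    @property
    def live(self) -> Optional[ModelVersion]:
        return self.history[-1] if self.history else None

    def promote_if_better(self, candidate: ModelVersion,
                          min_gain: float = 0.0) -> bool:
        # Promote when there is no live model yet, or when the candidate's
        # AUC is at least the live AUC plus min_gain.
        current = self.live
        if current is not None and candidate.auc < current.auc + min_gain:
            return False
        self.history.append(candidate)
        return True

    def rollback(self) -> Optional[ModelVersion]:
        # Drop the live model and fall back to the previously promoted one.
        # The only remaining model is never removed.
        if len(self.history) > 1:
            self.history.pop()
        return self.live
```

Keeping the full history per tenant makes a rollback a cheap pointer change rather than a retraining job.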

Cloud Native Training

Challenges
● Data
  ○ Privacy and sharing compliance: GDPR, HIPAA, Accessibility
  ○ Freshness / Hydration
  ○ Handling encrypted data at rest and in motion
  ○ Too sparse, not meeting thresholds
  ○ Too dense, training performance SLA not met
● Custom, non-standard fields and datatypes
● Building ML Infrastructure along the way
● Training Serving Skew
● Cold start problem

Takeaways
● Start small, Ship and Iterate
● Prioritize ML infrastructure
● Start with simple interpretable models
● Scale model learning to the size of your data
● Prioritize Observability
● Prioritize Data privacy over model quality

Thank you!

Feedback

Your feedback is important to us.

Don’t forget to rate and review the sessions.
