Amazon: a Playground for Machine Learning
Cedric Archambeau
Principal Applied Scientist, Amazon, Berlin
Data Science Summer School, École Polytechnique, 2017
LeNet 5
Today, machine learning is creating a paradigm shift.
Jeff Bezos in Geekwire (May 6, 2017): “It is a golden age. Machine learning and AI is a horizontal enabling layer. It will empower and improve every business, every government organization, every philanthropy.”
Our Customers
• More than 2 million active seller accounts (>40% from 3P)
• Authors and content creators
• Over a million active AWS accounts
• Over 244 million active consumers
The Maturity of Deep Learning
Data
GPUs & Acceleration
Software
Algorithms
Machine learning has a long history at Amazon.
Recommendations & Search
Understanding Fashion & Style
Amazon Art
(Archambeau and Bach, NIPS 2008)
Shallow Latent Variable Models
Artificial Intelligence at Amazon
Thousands of Employees across the Company Focused on Machine Learning & AI
• Discovery & Search
• Fulfilment & Logistics
• Enhance Existing Products
• Define New Categories of Products
• Bring Machine Learning to All
Machine Translated Detail Pages
Neural Machine Translation (NMT) with Sockeye
• Open-sourced toolkit for sequence-to-sequence modeling in MXNet
• Implements encoder-decoder models with attention (Bahdanau, et al., 2014)
• Supports different attention models (Luong, et al., 2015)
• Applicable to Named Entity Recognition, Semantic Parsing, …
(Image credit: Washington Department of Fish & Wildlife.)
github.com/awslabs/sockeye
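Recurrent Neural Network Language Model
Language model without Markov assumption:
$P(\mathbf{v}) = \prod_{t=1}^{T} P(v_t \mid \mathbf{v}_{1:t-1})$
Embedding layer:
$\mathbf{y}_t = \mathbf{W}_E v_t$
Recurrent hidden layer (e.g., RNN, LSTM, GRU):
$\mathbf{s}_t = \tanh(\mathbf{U}\mathbf{s}_{t-1} + \mathbf{W}\mathbf{y}_t)$
Output layer:
$P(\text{house} \mid \text{<BOS>}, \text{the}, \text{white}) = \mathrm{softmax}(\mathbf{W}_O \mathbf{s}_3 + \mathbf{b}_O)$
(Figure: unrolled RNN. Inputs <BOS>, the, white, house with embeddings y_1 to y_4; hidden states s_0 to s_4; outputs the, white, house, <EOS>.)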
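Sequence-to-Sequence Model (Sutskever, et al., 2014)
Language model conditioned on the source sentence:
$P(\mathbf{v} \mid \mathbf{x}) = \prod_{t=1}^{T} P(v_t \mid \mathbf{v}_{1:t-1}, \mathbf{x})$
Encoded source sentence initializes decoder RNN:
$\mathbf{s}_0 = \tanh(\mathbf{W}_I \mathbf{h}_m + \mathbf{b}_I)$
(Figure: encoder RNN f_enc reads la, casa, blanca (x_1 to x_3) through states h_0 to h_3; decoder RNN f_dec starts from s_0, reads <BOS>, the, white, house (y_1 to y_4) through states s_1 to s_4, and emits the, white, house, <EOS>.)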
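The two language models above differ only in how the first decoder state is set, which a few lines of NumPy can make concrete. This is a minimal sketch with untrained random weights, not Sockeye's implementation: the toy dimensions, the encoder weights `U_enc` and `W_enc`, the random stand-in source embeddings, and the treatment of v_t as a one-hot vector are all assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
tgt_vocab = ["<BOS>", "the", "white", "house", "<EOS>"]
V, d_emb, d_hid, d_src = len(tgt_vocab), 4, 8, 4  # toy sizes (assumed)

def init(*shape):
    """Small random weights; the model is untrained."""
    return rng.normal(scale=0.1, size=shape)

# Names follow the slide equations where the slides give them.
W_E, U, W = init(d_emb, V), init(d_hid, d_hid), init(d_hid, d_emb)
W_O, b_O = init(V, d_hid), np.zeros(V)
W_I, b_I = init(d_hid, d_hid), np.zeros(d_hid)
U_enc, W_enc = init(d_hid, d_hid), init(d_hid, d_src)  # hypothetical encoder weights

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def encode(xs):
    """Run a plain tanh encoder RNN and return its final state h_m."""
    h = np.zeros(d_hid)  # h_0
    for x in xs:
        h = np.tanh(U_enc @ h + W_enc @ x)
    return h

def log_prob(target, source=None):
    """log P(v) if source is None, else log P(v | x)."""
    if source is None:
        s = np.zeros(d_hid)                        # unconditional: s_0 = 0
    else:
        s = np.tanh(W_I @ encode(source) + b_I)    # s_0 = tanh(W_I h_m + b_I)
    prev, total = tgt_vocab.index("<BOS>"), 0.0
    for word in target:
        y = W_E[:, prev]                 # embedding layer (one-hot lookup)
        s = np.tanh(U @ s + W @ y)       # recurrent hidden layer
        p = softmax(W_O @ s + b_O)       # output layer
        prev = tgt_vocab.index(word)
        total += np.log(p[prev])         # chain rule: sum of log-conditionals
    return total

target = ["the", "white", "house", "<EOS>"]
source = rng.normal(size=(3, d_src))     # stand-in embeddings for "la casa blanca"
lp_uncond = log_prob(target)
lp_cond = log_prob(target, source)
```

Because the weights are random, both values sit close to the uniform baseline 4 · log(1/5); training would raise the conditional log-probability for correct translations.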
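Sequence Decoding with Attention (Bahdanau et al., 2014)
Decoder consumes an attention vector:
$\mathbf{s}_t = \tanh(\mathbf{U}\mathbf{s}_{t-1} + \mathbf{W}[\mathbf{y}_t, \bar{\mathbf{s}}_{t-1}])$
The attention vector consumes a context vector:
$\bar{\mathbf{s}}_t = \tanh(\mathbf{W}_{\bar{s}}[\mathbf{s}_t, \mathbf{c}_t])$
The context vector is a linear combination of source states:
$\mathbf{c}_t = \sum_{i=1}^{m} \alpha_{ti} \mathbf{h}_i$, where $\alpha_{ti} = \mathrm{softmax}(\mathrm{score}(\mathbf{s}_t, \mathbf{h}_i))$.
(Figure: one decoding step with input y_3 "white" and output "house"; decoder states s_2, s_3; attention vectors \bar{s}_2, \bar{s}_3; context vector c_3 with attention weights α_3 over source states h_1, h_2, h_3, …)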
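One attention step can be sketched as follows: score every source state against the current decoder state, normalise with a softmax, form the context vector, and combine it with the decoder state. The dot-product score, the equal state sizes, and the weight name `W_sbar` are assumptions for illustration; the slides leave the score function open.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def attention_step(s_t, H, W_sbar):
    """Return (alpha_t, c_t, sbar_t) for decoder state s_t and source states H.

    H has one row per source position; a dot-product score is assumed.
    """
    scores = H @ s_t                                      # score(s_t, h_i) for every i
    alpha = softmax(scores)                               # attention weights, sum to 1
    c = alpha @ H                                         # context vector c_t
    s_bar = np.tanh(W_sbar @ np.concatenate([s_t, c]))    # attention vector
    return alpha, c, s_bar

rng = np.random.default_rng(2)
m, d = 3, 4                                   # 3 source positions, toy state size
H = rng.normal(size=(m, d))                   # stand-ins for h_1, h_2, h_3
s_t = rng.normal(size=d)
W_sbar = rng.normal(scale=0.1, size=(d, 2 * d))
alpha, c, s_bar = attention_step(s_t, H, W_sbar)
```

The attention vector returned here would be fed, together with the next word embedding, into the following decoder step.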
Attention Models in Sockeye
Name       Reference                 Available in Sockeye
mlp        (Bahdanau, et al., 2014)  ✓
concat     (Luong, et al., 2015)
dot        (Luong, et al., 2015)     ✓
location   (Luong, et al., 2015)     ✓
bilinear   (Luong, et al., 2015)     ✓
coverage   (Tu, et al., 2015)        ✓
Score function for coverage: $\mathbf{v}_a^\top \tanh(\mathbf{W}_u \mathbf{s} + \mathbf{W}_v \mathbf{h} + \mathbf{W}_c \mathbf{C})$
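For orientation, three of the score functions named above can be written in a few lines each. These are the commonly cited forms from the referenced papers, given here as a sketch; the slide only spells out the coverage variant, so whether Sockeye's implementations match these exactly is an assumption. The weight names `W_u`, `W_v` and `v_a` are borrowed from the coverage formula.

```python
import numpy as np

def score_dot(s, h):
    """dot: s^T h (requires equal state sizes)."""
    return s @ h

def score_bilinear(s, h, W):
    """bilinear: s^T W h."""
    return s @ W @ h

def score_mlp(s, h, W_u, W_v, v_a):
    """mlp: v_a^T tanh(W_u s + W_v h), the coverage formula without its W_c C term."""
    return v_a @ np.tanh(W_u @ s + W_v @ h)

s = np.array([1.0, 2.0])
h = np.array([3.0, 4.0])
dot = score_dot(s, h)
bilinear = score_bilinear(s, h, np.eye(2))   # identity W reduces to the dot score
mlp = score_mlp(s, h, np.zeros((3, 2)), np.zeros((3, 2)), np.ones(3))
```

Each function returns a single unnormalised score; the softmax over source positions turns these into attention weights.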
Amazon Fresh
Same-Day and Early-Morning Home Delivery of Groceries.
Strawberry Inspection by a Produce Specialist
Computer Vision-based Grocery Inspection
Illumination Clutter/occlusions Viewpoint Size Variability
Predicting Longevity
(Figure axes: Age, Strawberry ID.)
Demand Forecasting
• Scale: 20M+ products fulfilled by Amazon alone!
• Sparsity: Many products sell very infrequently
• Regionalised: 100+ FCs worldwide
• New products: No past demand
Seasonality and External Events
• Training Range: Non-fashion items have long(er) training ranges.
• Seasonality: This item has Christmas seasonality with higher growth over time.
• Missing Features/Inputs: Unexplained spikes in the demand.
Effect of Out-of-stock
Without accounting for out-of-stock. When accounting for out-of-stock.
Seeger, et al.: Bayesian Intermittent Demand Forecasting for Large Inventories. NIPS 2016.
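Why out-of-stock periods matter can be seen with simple arithmetic: zero sales on a day without stock say nothing about demand, so counting them as zeros biases the estimate downwards. The sketch below is a deliberate simplification with made-up numbers, not the model of Seeger, et al.; the function name and data are hypothetical.

```python
def mean_daily_demand(sales, in_stock):
    """Return (naive, stock_aware) estimates of mean daily demand.

    naive:       averages all days, treating out-of-stock days as zero demand.
    stock_aware: averages only the days on which the item was available.
    """
    naive = sum(sales) / len(sales)
    observed = [s for s, ok in zip(sales, in_stock) if ok]
    aware = sum(observed) / len(observed) if observed else float("nan")
    return naive, aware

# Made-up week of sales; the item was out of stock on days 2 and 3.
sales = [3, 0, 0, 4, 5, 0, 2]
in_stock = [True, False, False, True, True, True, True]
naive, aware = mean_daily_demand(sales, in_stock)
```

Note that the zero on day 6 is kept in the stock-aware estimate: the item was available and nobody bought it, which is genuine information about demand.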
Deep Autoregressive Recurrent Networks
Salinas, et al. (2017): DeepAR: Probabilistic Forecasting with Autoregressive Recurrent Networks. arXiv:1704.04110.
High atop the steps of the Pyramid of Giza a young woman laughed and called down to him. "Robert, hurry up! I knew I should have married a younger man!" Her smile was magic. ….
Named Entity Extraction
if (word is capitalized) and (word before is ‘in’) then
    PLACE
else if (word = ‘her’) or (word = ‘his’) or (word = ‘he’) or (word = ‘she’) then
    PERSON
...
Data (input) Annotation (output)
Program
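The hand-written rules above translate directly into a small program, and running it on the example passage shows how little such rules capture. This is a sketch: the regex tokenisation and the case-insensitive pronoun match are assumptions, and only the two rules shown on the slide are implemented.

```python
import re

PRONOUNS = {"her", "his", "he", "she"}

def tag_entities(text):
    """Apply the two hand-written rules and return (word, tag) pairs."""
    tokens = re.findall(r"[A-Za-z]+", text)    # crude tokeniser: letters only
    tags = []
    for i, word in enumerate(tokens):
        prev = tokens[i - 1].lower() if i > 0 else ""
        if word[0].isupper() and prev == "in":
            tags.append((word, "PLACE"))
        elif word.lower() in PRONOUNS:
            tags.append((word, "PERSON"))
    return tags

passage = (
    "High atop the steps of the Pyramid of Giza a young woman laughed "
    'and called down to him. "Robert, hurry up! I knew I should have '
    'married a younger man!" Her smile was magic.'
)
tags = tag_entities(passage)
```

On this passage the rules tag only "Her"; "Giza" and "Robert" are missed because neither follows the word "in" nor appears in the pronoun list.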
X-Ray: Enrich Every Piece of Digital Content
X-Ray for Videos
Use your voice to
• listen to music,
• control smart home devices,
• set timers when busy in the kitchen,
• ask for news, weather report, …
Amazon Polly
Converts text to life-like speech
• 47 voices
• 24 languages
• Low latency, real time
• Powers Alexa
Let’s take a listen…
A Focus On Voice Quality & Pronunciation
1. Automatic, Accurate Text Processing
   “Today in Seattle, WA, it’s 11°F”
   “We live for the music live from the Madison Square Garden.”
2. Intelligible and Easy to Understand
3. Add Semantic Meaning to Text
   “Richard’s number is 2122341237” (Telephone Number)
4. Customized Pronunciation
   “My daughter’s name is Kaja.”
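Adding semantic meaning to text is typically done with SSML markup, where a `say-as` element tells the synthesiser how to read a span. The sketch below builds such markup for the telephone-number example using only the Python standard library; that the talk used exactly this markup is an assumption, and the helper name `telephone_ssml` is hypothetical. No call to the speech service is made.

```python
import xml.etree.ElementTree as ET

def telephone_ssml(prefix, number):
    """Wrap `number` in a say-as element so it is read as a telephone number."""
    speak = ET.Element("speak")
    speak.text = prefix
    say_as = ET.SubElement(speak, "say-as", {"interpret-as": "telephone"})
    say_as.text = number
    return ET.tostring(speak, encoding="unicode")

ssml = telephone_ssml("Richard's number is ", "2122341237")
```

Without the annotation, a synthesiser has to guess whether the ten digits are a cardinal number or a phone number; the markup removes that ambiguity.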
Introducing Amazon AI
• Polly: Text-to-Speech
• Apache MXNet: Deep learning engine
• Rekognition: Image Analysis
• Lex: ASR & NLU
• Amazon ML: ML Applications
Apache MXNet is the deep learning framework of choice for Amazon
Why MXNet?
Flexible programming model:
• Symbolic API (computation graphs)
• Imperative API (NumPy on GPUs)
Bindings for Python, C++, Scala, R, Julia, Perl.
Fast & scalable:
• Almost linear speed-up with multiple GPUs
• High efficiency on single machine too (C++ backend)
(Figure: Google Inception v3 (image recognition).)
Amazon Rekognition
Real-time & batch image analysis
• Object & Scene Detection
• Face Detection
• Face Search
• Face Analysis
Object & Scene Detection
Category     Confidence
Bay          99.18%
Beach        99.18%
Coast        99.18%
Outdoors     99.18%
Sea          99.18%
Water        99.18%
Palm_tree    99.21%
Plant        99.21%
Tree         99.21%
Summer       58.3%
Landscape    51.84%
Nature       51.84%
Hotel        51.24%
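A consumer of such output would usually keep only the labels above some confidence threshold. The sketch below shows that filtering step on the values from the slide; pairing each category with a score in the order they appear is an assumption, and the helper `confident_labels` and the 90% threshold are hypothetical choices.

```python
def confident_labels(labels, threshold):
    """Return the names whose confidence (in percent) is at least `threshold`."""
    return [name for name, confidence in labels if confidence >= threshold]

# (category, confidence) pairs; the pairing is assumed from slide order.
labels = [
    ("Bay", 99.18), ("Beach", 99.18), ("Coast", 99.18), ("Outdoors", 99.18),
    ("Sea", 99.18), ("Water", 99.18), ("Palm_tree", 99.21), ("Plant", 99.21),
    ("Tree", 99.21), ("Summer", 58.3), ("Landscape", 51.84), ("Nature", 51.84),
    ("Hotel", 51.24),
]
kept = confident_labels(labels, 90.0)
```

With this threshold the nine high-confidence scene and vegetation labels survive, while the four categories scored below 60% are dropped.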
Face Detection
Face Analysis
• Emotion: calm: 73%
• Sunglasses: false (value: 0)
• Mouth open wide: 0% (value: 0)
• Eye closed: open (value: 0)
• Glasses: no glass (value: 0)
• Mustache: false (value: 0)
• Beard: no (value: 0)
Democratising Machine Learning
• Focus on the problem at hand
• Abstract away learning algorithms
• Abstract away feature engineering
This requires automating hyperparameter tuning!
Handwritten Digit Recognition
http://yann.lecun.com/exdb/mnist
Given an image of a digit, can we predict which digit it is?
Simplified Handwritten Digit Recognition
Applying Logistic Regression in Practice
The performance of machine learning models depends on meta-parameters that need to be tuned with care:
• Regularisation
• (Hyper)priors
• Model complexity
• Optimisation
• Markov Chain Monte Carlo
• Feature extraction
• Model validation
• Decision rule
Can we automate this process?
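A simple baseline for automating the process is random search: sample meta-parameters, train, and keep the setting with the best validation score. The sketch below tunes the L2 regularisation strength and learning rate of a logistic regression trained by gradient descent. Random search is a stand-in here, not necessarily the method advocated in the talk, and the synthetic two-class data, search ranges and trial count are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic two-class data standing in for a simplified digit task.
n, d = 400, 20
w_true = rng.normal(size=d)
X = rng.normal(size=(n, d))
y = (X @ w_true + rng.normal(scale=2.0, size=n) > 0).astype(float)
X_tr, y_tr, X_va, y_va = X[:300], y[:300], X[300:], y[300:]

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fit(X, y, l2, lr, steps=300):
    """L2-regularised logistic regression by full-batch gradient descent."""
    w = np.zeros(X.shape[1])
    for _ in range(steps):
        grad = X.T @ (sigmoid(X @ w) - y) / len(y) + l2 * w
        w -= lr * grad
    return w

def accuracy(w, X, y):
    return float(np.mean((sigmoid(X @ w) > 0.5) == (y == 1.0)))

def random_search(n_trials=20):
    """Sample (l2, lr) log-uniformly; return the best (val_accuracy, l2, lr)."""
    best = None
    for _ in range(n_trials):
        l2 = 10 ** rng.uniform(-4, 0)
        lr = 10 ** rng.uniform(-2, 0)
        acc = accuracy(fit(X_tr, y_tr, l2, lr), X_va, y_va)
        if best is None or acc > best[0]:
            best = (acc, l2, lr)
    return best

best_acc, best_l2, best_lr = random_search()
```

Every trial requires a full training run, which is why more sample-efficient strategies become attractive once models are expensive to train.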
github.com/awslabs/sockeye
aws.amazon.com/amazon-ai
aws.amazon.com/blogs/ai