ben medlock, swiftkey // building a better keyboard
TRANSCRIPT
BUILDING THE WORLD’S SMARTEST KEYBOARD
Ben Medlock
Co-founder, CTO
Data Driven NYC 2015
19 August, 2015
TouchType Ltd, 2014. CONFIDENTIAL - do not copy/distribute. All content for illustrative purposes only.
We believe the next generation of
technology won’t just be smart, it will provide
a more human experience – one that adapts
to you, not the other way around.
HOW DO WE THINK?
HOW CAN WE MODEL HOW WE THINK?
WHAT IS ARTIFICIAL INTELLIGENCE?
Copyright: Warner Bros. Pictures
PROBABILITY
P(A | B) = P(B | A) P(A) / P(B)
MACHINE LEARNING
INFERENCE
MODEL
INPUT
Narrow AI
General AI
Weak AI
Strong AI
AI
Web search
Collaborative filtering
Voice recognition
Machine translation
Driverless cars
Image processing
RETHINKING TYPING
P(s | e, M)
context, input, prior
• language detection
• next word prediction
• error correction
• tap / continuous
• unseen sequences
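One way to read P(s | e, M) is as a Bayesian ranking: each candidate sequence s is scored by a prior (for example from a language model given the context) multiplied by the likelihood of the observed input. The sketch below illustrates that reading only; the function name `rank_candidates` and all of the numbers are hypothetical and are not taken from the slides.

```python
def rank_candidates(prior, likelihood):
    """Rank candidates s by posterior P(s | e), proportional to P(e | s) * P(s)."""
    unnormalized = {s: prior[s] * likelihood.get(s, 0.0) for s in prior}
    total = sum(unnormalized.values())
    if total == 0.0:
        return []  # no candidate explains the evidence
    posterior = {s: p / total for s, p in unnormalized.items()}
    # Highest posterior first.
    return sorted(posterior.items(), key=lambda kv: kv[1], reverse=True)


# Hypothetical values: prior from context, likelihood from the touch input.
prior = {"will": 0.5, "we": 0.3, "quit": 0.2}
likelihood = {"will": 0.1, "we": 0.05, "quit": 0.6}
ranked = rank_candidates(prior, likelihood)
```

With these made-up numbers the input evidence outweighs the prior, so "quit" ranks first even though "will" is the more common word.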
DIFFERENT INTERPRETATIONS
• Independent probability distributions with smoothing parameters to govern level of belief
• Ranking signals as inputs to a rank preference learner
• Single distribution estimates, e.g. maximum entropy, where all evidence types can be expressed as features
LANGUAGE MODELING
• Use language models to capture domain usage:
  • Background
  • Conversational
  • Personal
  • Context-specific
• Combine multiple models using most confident, interpolation, etc.
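The two combination strategies named above can be sketched as follows. This is a minimal illustration assuming each model is a plain dictionary of word probabilities; the function names, the weights and the toy models are all hypothetical, and "most confident" is read here as taking the highest probability any model assigns, which is only one possible interpretation.

```python
def interpolate(models, weights, word):
    """Linear interpolation: P(word) = sum_i weight_i * P_i(word)."""
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    return sum(w * m.get(word, 0.0) for m, w in zip(models, weights))


def most_confident(models, word):
    """Score a word by the highest probability any single model gives it.

    The result is a ranking score, not a normalized probability.
    """
    return max(m.get(word, 0.0) for m in models)


# Hypothetical toy models.
background = {"the": 0.60, "lol": 0.01}
personal = {"the": 0.30, "lol": 0.20}

p_interp = interpolate([background, personal], [0.7, 0.3], "lol")
p_max = most_confident([background, personal], "lol")
```

Interpolation keeps the result a valid probability when the weights sum to 1, while the maximum lets a strongly personalised model override the background model for words it knows well.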
BUILDING LANGUAGE MODELS
• Smoothed n-gram models are fast and efficient
• Work well with optimized trie search
• Smoothing: interpolative, backoff, “stupid”…
• Morphemes
• Neural nets / representation learning
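As an illustration of the "stupid" smoothing option, here is a small sketch of stupid backoff scoring over n-gram counts. It assumes a plain `Counter` in place of the optimized trie mentioned above, and the function names, the toy corpus and the backoff factor of 0.4 are assumptions for the example rather than details from the slides.

```python
from collections import Counter


def train_counts(tokens, max_order=3):
    """Count every n-gram of order 1..max_order in the token list."""
    counts = Counter()
    for n in range(1, max_order + 1):
        for i in range(len(tokens) - n + 1):
            counts[tuple(tokens[i:i + n])] += 1
    return counts


def stupid_backoff(counts, total_tokens, context, word, alpha=0.4):
    """Relative frequency if the n-gram was seen, else alpha times the
    score under a shortened context. Scores are not normalized."""
    context = tuple(context)
    ngram = context + (word,)
    if not context:
        return counts[ngram] / total_tokens  # unigram relative frequency
    if counts[ngram] > 0:
        return counts[ngram] / counts[context]
    return alpha * stupid_backoff(counts, total_tokens, context[1:], word, alpha)


tokens = "i am happy i am here i was happy".split()
counts = train_counts(tokens)
seen = stupid_backoff(counts, len(tokens), ["i"], "am")
unseen = stupid_backoff(counts, len(tokens), ["was"], "here")
```

The seen bigram "i am" is scored by its relative frequency, while the unseen bigram "was here" falls back to the discounted unigram frequency of "here".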
DATA COLLECTION
INPUT MODELING
[Figure: the "A" key]
• Use Gaussian distributions to model interaction with the keyboard surface
• Linear Gaussians for e.g. spacebar
• Other distributions?
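A Gaussian key model can be sketched as follows: each key has a mean touch position and a spread, and a touch point is assigned a probability for each key. The example assumes axis-aligned Gaussians and a uniform prior over keys; the key coordinates, the standard deviations and the function names are hypothetical.

```python
import math


def gaussian_2d(x, y, mean, std):
    """Density of an axis-aligned 2D Gaussian (independent x and y)."""
    (mx, my), (sx, sy) = mean, std
    zx = (x - mx) / sx
    zy = (y - my) / sy
    return math.exp(-0.5 * (zx * zx + zy * zy)) / (2.0 * math.pi * sx * sy)


def key_posteriors(touch, keys):
    """P(key | touch) under a uniform prior over the given keys."""
    x, y = touch
    scores = {k: gaussian_2d(x, y, mean, std) for k, (mean, std) in keys.items()}
    total = sum(scores.values())
    return {k: s / total for k, s in scores.items()}


# Hypothetical layout: key -> ((mean_x, mean_y), (std_x, std_y)) in key widths.
keys = {
    "q": ((0.5, 0.5), (0.3, 0.4)),
    "w": ((1.5, 0.5), (0.3, 0.4)),
}
post = key_posteriors((0.9, 0.5), keys)
```

A touch between two keys gets a soft assignment rather than a hard one, so the language model can still favour the less likely key when the context supports it.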
INPUT MODELING
[Figure: the "Q" and "W" keys, with candidate words "will", "we", "quit"]
• Train using re-parameterized online MAP
• Track keystroke-character correlations and train models on a per-session basis
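The slides do not spell out the re-parameterization, so the sketch below shows only the textbook idea behind online MAP estimation: a key's mean touch position (one coordinate here) has a Gaussian prior, the observation variance is assumed known, and each new keystroke updates the estimate without storing earlier ones. The class name and all numbers are hypothetical.

```python
class OnlineMapMean:
    """Online MAP estimate of a Gaussian mean with a Gaussian prior
    and a known observation variance (conjugate update)."""

    def __init__(self, prior_mean, prior_var, obs_var):
        self.mean = prior_mean               # current MAP estimate
        self.precision = 1.0 / prior_var     # confidence in the estimate
        self.obs_precision = 1.0 / obs_var   # confidence in one observation

    def update(self, x):
        """Fold in one observed touch coordinate and return the new estimate."""
        new_precision = self.precision + self.obs_precision
        self.mean = (self.precision * self.mean + self.obs_precision * x) / new_precision
        self.precision = new_precision
        return self.mean


# Hypothetical session: the prior centre is 0.0, the user keeps tapping at 1.0.
estimator = OnlineMapMean(prior_mean=0.0, prior_var=1.0, obs_var=1.0)
first = estimator.update(1.0)
second = estimator.update(1.0)
```

The estimate moves from the prior centre toward the user's actual touch position, and each additional observation shifts it by a smaller amount as the accumulated precision grows.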
HYPERPARAMETER LEARNING
• Lots of hyperparameters!
Input confidence, prune threshold, dynamic LM, etc…
• Some can be learned automatically, e.g. prefix probability
• What kind of typist?
• Accurate and visual-led
• Fast and furious
LOTS OF OTHER LANGUAGE PROBLEMS!
• Profanity filtering
• Stochastic tokenisation (Chinese, Vietnamese etc.)
• Language detection
• Vocabulary evolution
• Clustering
TIME SAVED SO FAR
50 TRILLION CHARACTERS WRITTEN
15 TRILLION KEYSTROKES SAVED
WATCH FRIENDS 19 MILLION TIMES