Adventures on the Road to Enterprise Virtual Assistants

Post on 14-Feb-2017

  • Copyright 2017. All rights reserved.

    Pardon My French

    And Other Adventures on the Road to Enterprise Virtual Assistants

    Editt Gonen-Friedman

    Oracle Voice & Emerging Technologies

    Editt.gonen-friedman@oracle.com

  • Voice Interaction

    "Voice-based technologies are the most important area of growth for mobile user interfaces... hands-free use and always-on interfaces will drive increased use of speech recognition... enterprise application developers will need to accommodate new ways of accepting input." - Intelligence report, May 2015, Tractica

    "Enterprises are going to be affected by a worker's need to do more than type, click and swipe" - ITWC

    "2016 will be the year of Conversational Commerce" - Chris Messina on Medium

    https://www.tractica.com/research/emerging-interface-technologies-for-mobile-devices/

    http://www.itworldcanada.com/article/voice-input-poised-to-take-a-swipe-at-enterprise-applications/374824

    https://medium.com/chris-messina/2016-will-be-the-year-of-conversational-commerce-1586e85e3991

  • What Does it Take to Build an Enterprise Virtual Assistant?

    Automatic Speech Recognition (ASR)

    Voice UI

    Dialog Management

    Natural Language Understanding (NLU)

    Multiple technologies must come together to build it.

  • A Mobile Enterprise VA

    Needs speech recognition (SR)

    Build your own. Advantages:

    Build a massive language corpus (Google)

    Handle surround sound, priority by proximity (Amazon's Alexa)

    Use voice biometrics to identify speaker (Alexa, Nuance)

    Or use 3rd-party services

    Image source: itpro.co.uk

  • Mobile Enterprise VA: Automatic Speech Recognition

    Speech service considerations:

    Footprint: local install (Sensory) or cloud service

    Security: enterprise data is sensitive

    WER (word error rate)

    Device support

    Languages (global enterprise)

    Vocab customization: ability to add recurring entity names and industry jargon
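
    Word error rate, listed among the considerations above, is conventionally the word-level edit distance between a reference transcript and the recognizer output, divided by the number of reference words. The sketch below assumes whitespace tokenization and case-insensitive matching; the function name and sample phrases are illustrative and do not come from any particular speech service.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = (substitutions + deletions + insertions) / number of reference words."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1] / len(ref)


# One substituted word out of three reference words.
print(word_error_rate("go to leads", "go to leeds"))
```

    Comparing services on the same recorded utterances, including entity names and industry jargon, gives a fairer picture than a vendor's published figure.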

  • Speech Considerations: Vocabulary

    Compared to a general-purpose VA:

    Supported actions are limited

    Context is limited

    Is it easier?

    As a rule there's less ambiguity, but sometimes you need to resolve to a less popular meaning

    Example: the user says "Leads" or "Go to leads"

    Intent: navigate to the leads page in my speech-enabled mobile app for sales

  • Speech Considerations: Vocabulary

    Result: "Go to Leeds." A general-purpose VA might understand this to mean "bring up the map for Leeds, England"

    Leeds, Northern England

    In this case we had to add "Leads" to the ASR custom vocabulary with an increased weight of 50% instead of 10%.

    This could also be solved at the NLP step, with full NLP that resolves ambiguity.
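
    To illustrate why a 10% weight may not be enough while 50% is, here is a minimal sketch that rescores an n-best list of recognition hypotheses. It is an assumption-laden illustration: real ASR services usually apply vocabulary weights inside the decoder, and the `rescore` function, the confidence values, and the multiplicative boost are all hypothetical.

```python
# Hypothetical custom vocabulary: term -> boost weight.
# 0.5 echoes the 50% weight mentioned above.
CUSTOM_VOCAB = {"leads": 0.5}


def rescore(nbest, vocab=CUSTOM_VOCAB):
    """Return the hypothesis text with the highest boosted score.

    nbest is a list of (text, confidence) pairs. Every custom-vocabulary
    word found in a hypothesis multiplies its confidence by (1 + weight).
    """
    def boosted(item):
        text, confidence = item
        score = confidence
        for word in text.lower().split():
            score *= 1.0 + vocab.get(word, 0.0)
        return score

    return max(nbest, key=boosted)[0]


# Made-up confidences: the recognizer slightly prefers the city name.
nbest = [("go to Leeds", 0.62), ("go to leads", 0.48)]

print(rescore(nbest))                  # 50% weight
print(rescore(nbest, {"leads": 0.1}))  # 10% weight
```

    With these made-up numbers, the 10% boost leaves "go to Leeds" on top, while the 50% boost is enough for "go to leads" to win.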

  • A Mobile Enterprise VA

    Needs voice interaction design

    How to make it look like a speech app?

    How to deal with command discoverability?

    Can you wake it with a key word?

    In what case do you allow touch and voice combo?

    How to indicate listening?

  • A Mobile Enterprise VA

    Here are some attempts to answer those questions in a dedicated speech app, Oracle Voice

  • A Mobile Enterprise VA

    And this is a UI change to a regular app, Oracle Sales Cloud Mobile, where speech capabilities have been added

  • A Mobile Enterprise VA

    Needs dialog management

    One-step response: gives you a simple answer or link, or navigates you to another page

    Multi-step dialog: manages a back-and-forth dialog in which context is retained

    Perhaps add more useful interactions, such as business content reading (news, emails, app listings)
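
    The difference between a one-step response and a multi-step dialog can be sketched as follows. This is a toy example under stated assumptions: the `DialogManager` class, the "create task" flow, and the response strings are hypothetical and do not describe how any Oracle product is implemented.

```python
class DialogManager:
    """Toy dialog manager: one-step navigation plus a two-turn 'create task' flow."""

    def __init__(self):
        # State retained between turns of a multi-step dialog.
        self.context = {}

    def handle(self, utterance: str) -> str:
        text = utterance.strip().lower()

        # Multi-step: a flow is in progress, so this turn fills the pending slot.
        if self.context.get("pending") == "task_due":
            subject = self.context.pop("task_subject")
            self.context.pop("pending")
            return f"Created task '{subject}' due {text}."

        # Multi-step: start the flow and remember what we still need.
        if text.startswith("create task "):
            self.context["task_subject"] = text[len("create task "):]
            self.context["pending"] = "task_due"
            return "When is it due?"

        # One-step: answer or navigate immediately, no context required.
        if text in ("leads", "go to leads"):
            return "Navigating to the Leads page."

        return "Sorry, I did not understand that."


dm = DialogManager()
print(dm.handle("Go to leads"))
print(dm.handle("Create task call Acme"))
print(dm.handle("Friday"))
```

    The second and third turns only make sense together: "Friday" is interpreted as a due date because the manager remembered the pending question.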

  • A Mobile Enterprise VA

    Needs NLU

    Many 3rd-party solutions are available

    It's possible to start with a basic solution that understands a number of meanings and intents and can follow up with specific actions and task flows

    Soon you'll run into the need for full language and context intelligence
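
    A "basic solution" of the kind described above could be as small as a table of patterns mapped to intents. The sketch below is one hypothetical way to do it; the intent names and regular expressions are invented for illustration, and it also shows where such an approach stops working.

```python
import re

# Hypothetical intent table: intent name -> pattern with named slots.
INTENTS = {
    "navigate": re.compile(r"^(?:go to|open|show)\s+(?P<page>\w+)$"),
    "create_note": re.compile(r"^(?:add|create) (?:a )?note (?P<body>.+)$"),
}


def parse(utterance: str):
    """Return (intent, slots) for the first matching pattern, else ('unknown', {})."""
    text = utterance.strip().lower()
    for intent, pattern in INTENTS.items():
        match = pattern.match(text)
        if match:
            return intent, match.groupdict()
    return "unknown", {}


print(parse("Go to leads"))
print(parse("Add a note call back Monday"))
print(parse("Which of my deals are at risk?"))
```

    The last utterance falls through to "unknown": fixed patterns cannot cover open-ended questions, which is where fuller language and context intelligence becomes necessary.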

  • A Mobile Enterprise VA

    Robust NLU needs to resolve ambiguity in context

    Leads vs. Leeds is a simple example

    Diversification means investment variety in Finance, but getting rid of assets in Marketing
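
    One simple way to picture resolving ambiguity in context is a lookup keyed by both the word and the business domain the app is running in. The tables and the `resolve` helper below are hypothetical, and the two meanings of "diversification" are taken directly from the example above. A robust NLU would infer context rather than receive it as a parameter.

```python
# Hypothetical sense inventory: word -> {domain: meaning}.
SENSES = {
    "diversification": {
        "finance": "investment variety",
        "marketing": "getting rid of assets",
    },
}

# Hypothetical homophone table: recognized word -> {domain: intended word}.
HOMOPHONES = {
    "leeds": {"sales": "leads"},
}


def resolve(term: str, domain: str):
    """Return (word, meaning) for a term as understood within a domain.

    meaning is None when the sense inventory has no entry for the word.
    """
    word = term.lower()
    word = HOMOPHONES.get(word, {}).get(domain, word)
    meaning = SENSES.get(word, {}).get(domain)
    return word, meaning


print(resolve("Leeds", "sales"))
print(resolve("Leeds", "travel"))
print(resolve("Diversification", "finance"))
print(resolve("Diversification", "marketing"))
```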

    Image source: Oracle Intelligent UX

  • A Mobile Enterprise VA

    Adding an NLU solution to the mobile app is no simple task

    Test performance

    Word error rate

    Intent error rate
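
    Intent error rate can be measured the same way as any classification error: run a labeled set of utterances through the pipeline and count how often the predicted intent differs from the expected one. A minimal sketch, assuming one expected and one predicted intent label per test utterance; the labels shown are made up.

```python
def intent_error_rate(expected, predicted) -> float:
    """Fraction of test utterances whose predicted intent differs from the expected one."""
    if len(expected) != len(predicted):
        raise ValueError("expected and predicted must have the same length")
    if not expected:
        raise ValueError("at least one test utterance is required")
    errors = sum(1 for e, p in zip(expected, predicted) if e != p)
    return errors / len(expected)


# Made-up test run: the second utterance ("go to leads") was
# misread as a map search for Leeds.
expected = ["navigate", "navigate", "create_note", "unknown"]
predicted = ["navigate", "map_search", "create_note", "unknown"]
print(intent_error_rate(expected, predicted))
```

    Tracking this alongside word error rate helps separate recognition mistakes from understanding mistakes: an utterance can be transcribed perfectly and still be mapped to the wrong intent.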

    Image source: Right Now Intent Guide

  • Are We Done Yet?

    Users want language support

  • Languages

    Source: Technology Review

    Your speech service recognizes 40 languages. Why doesn't your app?

    User says "How are you doing?" -> Speech Engine -> text "How are you doing?"

    App: "The user asked how I'm doing. Respond that I'm doing well."

  • Languages

    Source: Technology Review

    A user speaks French. SR output is French text.

    User says "Comment allez-vous?" -> Speech Engine -> text "Comment allez-vous?"

    App: "I have no idea what that means." (error handling)

  • Languages

    A middle step is missing: a translation or a mapping

    You could translate the text to English before further processing, or

    You could add NLP in other languages

    When adding NLP in other languages you also essentially add a mapping between key words in English that associate intent with actions, and the corresponding words in the other supported languages.
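
    The mapping described above can be pictured as two lookup tables: the existing English key phrases that select an intent, and a per-language table that points localized phrases at those English keys. This is a minimal sketch with hypothetical table contents and intent names; a production system would need far richer matching than exact phrases.

```python
# Existing English-only mapping: key phrase -> intent.
KEYWORD_TO_INTENT = {
    "how are you doing": "small_talk_greeting",
    "hello": "small_talk_hello",
}

# Hypothetical per-language mapping: localized phrase -> English key phrase.
LOCALIZED = {
    "fr": {
        "comment allez-vous": "how are you doing",
        "bonjour": "hello",
    },
}


def intent_for(text: str, lang: str = "en") -> str:
    """Map recognized text to an intent, going through the English key phrase."""
    phrase = text.strip().lower().rstrip("?!. ")
    if lang != "en":
        phrase = LOCALIZED.get(lang, {}).get(phrase, phrase)
    return KEYWORD_TO_INTENT.get(phrase, "unknown")


print(intent_for("How are you doing?"))
print(intent_for("Comment allez-vous?"))             # no mapping applied
print(intent_for("Comment allez-vous?", lang="fr"))  # mapped through French table
```

    The second call reproduces the "I have no idea what that means" case: the French text reaches English-only logic unchanged. The third call shows the missing middle step in place.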

    Source: Technology Review

  • Languages

    Source: Technology Review

    Translation services work differently, using statistics on many translated examples

    In a late 2016 blog post, Googlers implied that Google's AI translation tool seems to have invented its own secret internal language: an internal representation, a machine-initiated mapping

    The tool was trained to translate between English and Korean, and between English and Japanese

    The team found that the tool had spontaneously acquired the ability to translate between Korean and Japanese

    Science fiction? Read here: https://techcrunch.com/2016/11/22/googles-ai-translation-tool-seems-to-have-invented-its-own-secret-internal-language/

  • Languages

    Source: Technology Review

    A visualization of the translation system's memory when translating a single sentence in multiple directions.

  • Are We Done Yet?

    Users want AI

    Source: Technology Review

  • Analytics=AI

    Users want AI

    What they are really asking for is analytics

    Simple analytics gives you hindsight about what happened

  • What Does It Take to Build an Enterprise Virtual Assistant?

    Components: Automatic Speech Recognition, VUI, Dialog Management, Natural Language Understanding, Languages, Descriptive Analytics, Predictive Analytics, Internet of Things, Prescriptive Analytics, Machine Learning

    Descriptive analytics allows answering more complex questions and gives you insight about what's happening

    Predictive analytics gives you foresight about what will happen. It should also pull data from the real world

    Prescriptive analytics tells you what to do to get specific outcomes

    Machine learning makes sure the system gets better and smarter with every interaction
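
    The three analytics tiers can be contrasted on a toy data set. The sketch below uses made-up monthly deal counts; the function names, the straight-line trend, and the wording of the recommendation are assumptions chosen for brevity, not a description of any real analytics product.

```python
from statistics import mean


def describe(history):
    """Descriptive: summarize what has happened so far."""
    return {"total": sum(history), "average": mean(history)}


def predict_next(history):
    """Predictive: extrapolate a least-squares straight line one period ahead."""
    n = len(history)
    xs = range(n)
    x_bar, y_bar = mean(xs), mean(history)
    slope = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, history))
             / sum((x - x_bar) ** 2 for x in xs))
    return y_bar + slope * (n - x_bar)


def prescribe(forecast, target):
    """Prescriptive: suggest an action that closes the gap to a target."""
    gap = target - forecast
    if gap <= 0:
        return "on track"
    return f"add {gap:.0f} more deals to the pipeline"


# Made-up data: deals closed in each of the last four months.
closed_deals = [10, 12, 14, 16]

print(describe(closed_deals))
forecast = predict_next(closed_deals)
print(forecast)
print(prescribe(forecast, target=20))
```

    Each tier builds on the previous one: the forecast needs the history, and the recommendation needs the forecast plus a goal.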

  • That's What It Takes to Build an Enterprise Virtual Assistant

    When will you be done?

    Source: http://theegeek.com/artificial-intelligence/

  • Editt Gonen-Friedman

    Editt.gonenfr@gmail.com

    Editt.gonen-friedman@oracle.com

    https://www.linkedin.com/in/editt