Talking to machines, listening to people - Gordon Plant


Post on 12-Apr-2017








Talking to machines, listening to humans
Gordon Plant, WIAD 2017
@gordonplant


Why talk to machines?

If I could talk to the animals machines
"If I could talk to the animals, just imagine it
Chatting to a chimp in chimpanzee
Imagine talking to a tiger, chatting to a cheetah
What a neat achievement that would be"
Doctor Dolittle (1967)

Humans anthropomorphise everything
54 per cent of people have verbally assaulted their computers, while 40 per cent have resorted to physical violence.


Why talk to machines?
Talking doesn't interrupt other tasks
Attention can remain focussed elsewhere
Machines are effort multipliers
Effort / reward ratio is improved as effort is reduced
Some intentions are hard to express via a GUI

Who's talking already?
Survey of 1250 people by Creative Strategies, October 2016
22% use a voice assistant four to six times a week
33% think it is more convenient to talk than type
27% would prefer to act with bots in the car
26% would prefer to act with bots in the home

How does a LUI work?

Utterance, intent, invocation
Utterance: the spoken words
Intent: a recognisable intent extracted from parsing the utterance
Invocation phrase: the phrase that launches the relevant skill

A skill is a bit like an App on your phone


Utterance structure
Alexa, tell HAL to open the pod bay doors




Connecting words
Invocation name

Let's turn on the heating

Let's turn on the heating
[Diagram labels: voice input, command, boiler]


[Diagram labels: touch input, app, direct input]

Let's turn on the heating
Alexa, tell Hive to turn the heating on


Wake word
Request
Invocation name
Connecting words

Alternative request words
Talk to
Open
Launch
Start
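
The utterance structure on these slides (wake word, request word, invocation name, connecting word, then the request itself) can be sketched as a small parser. This is only an illustration, not how Alexa actually parses speech: the `parse` function, the `ParsedUtterance` type, the list of connecting words, and the restriction to single-word invocation names are all assumptions made for the example. The request words are the ones listed on the slides, plus "ask", which appears in a later example.

```python
import re
from typing import NamedTuple, Optional, Tuple

WAKE_WORD = "alexa"
# Request words taken from the slides ("ask" appears in a later example).
REQUEST_WORDS = ("tell", "ask", "talk to", "open", "launch", "start")
# Assumption: a small illustrative set of connecting words.
CONNECTING_WORDS = ("to", "for", "and", "about")


class ParsedUtterance(NamedTuple):
    wake_word: str
    request_word: str
    invocation_name: str
    connecting_word: Optional[str]
    utterance: str


def parse(spoken: str,
          known_skills: Tuple[str, ...] = ("hive", "hal")) -> Optional[ParsedUtterance]:
    """Split a spoken phrase into its structural parts, or return None."""
    # Lower-case and strip punctuation so "Alexa," matches the wake word.
    words = re.sub(r"[^\w\s]", "", spoken.lower()).split()
    if not words or words[0] != WAKE_WORD:
        return None
    rest = " ".join(words[1:])

    # Try longer request words first so "talk to" is matched as one unit.
    for request in sorted(REQUEST_WORDS, key=len, reverse=True):
        if rest.startswith(request + " "):
            rest = rest[len(request) + 1:]
            break
    else:
        return None  # no request word: the phrase does not fit the structure

    parts = rest.split()
    # Assumption: invocation names are a single word.
    if not parts or parts[0] not in known_skills:
        return None
    name, remainder = parts[0], parts[1:]

    connecting = None
    if remainder and remainder[0] in CONNECTING_WORDS:
        connecting = remainder[0]
        remainder = remainder[1:]
    return ParsedUtterance(WAKE_WORD, request, name, connecting, " ".join(remainder))


print(parse("Alexa, tell Hive to turn the heating on"))
print(parse("Alexa, turn the heating on"))
```

In this sketch the first phrase splits cleanly into its parts, while the second returns `None` because it has no request word or invocation name, which is the sense in which a natural-sounding phrase can fail to fit the structure.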


Let's turn on the heating
Alexa, tell Hive to turn the heating on
Alexa, tell Hive to turn the heating on to 20 degrees
Alexa, tell Hive to put the heating on to 20 degrees
Alexa, tell Hive to boost my heating
Tell

Let's turn on the heating
Alexa, tell Hive to turn the heating on
Alexa, turn the heating on
Alexa, tell Hive to turn the heating on to 20 degrees
Alexa, Hive to 20 degrees
Alexa, tell Hive to put the heating on to 20 degrees
Alexa, put the heating on for 20 degrees
Alexa, tell Hive to boost my heating
Alexa, tell Hive to put the heating on for 1 hour
Tell

For every phrase that works, there are many similar ones that don't



Wake / Invoke app / Navigate app / Tap button / Confirmation
Alexa / tell Hive / to turn the heating on / ok


GUI
Model / modes / actions are discoverable at launch
User only needs to remember the name or location of the app
LUI
Model / modes / actions only discoverable by trial and error
User needs to remember complete, structured sentences

Heating on

Alexa, tell Hive to turn the heating on
Alexa, ask Hive heating on

Alexa, tell Hive to turn the heating off
Alexa, tell Hive heating off

Alexa, turn the heating on

GUI
We don't have to think about resolving the click; it just happens
LUI
Many inputs may resolve to the same click, and others may not resolve at all
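
The point that many inputs may resolve to the same "click", and others not at all, can be sketched as a lookup from utterances to intents. This is a deliberately naive illustration that assumes exact matching against a list of sample utterances. The intent names (`HeatingOn`, `HeatingOff`) and the `resolve_intent` function are hypothetical, and real voice platforms use more tolerant matching than this.

```python
from typing import Optional

# Hypothetical sample utterances for a heating skill.
# Several different phrasings map to the same intent (the same "click").
SAMPLE_UTTERANCES = {
    "turn the heating on": "HeatingOn",
    "heating on": "HeatingOn",
    "turn the heating off": "HeatingOff",
    "heating off": "HeatingOff",
}


def resolve_intent(utterance: str) -> Optional[str]:
    """Return the intent for an utterance, or None if nothing matches."""
    # Normalise case and whitespace before the exact-match lookup.
    normalised = " ".join(utterance.lower().split())
    return SAMPLE_UTTERANCES.get(normalised)


for phrase in ("turn the heating on", "heating on", "switch the heating on"):
    print(phrase, "->", resolve_intent(phrase))
```

The first two phrases resolve to the same intent. The third is just as reasonable to a human listener but resolves to nothing, because nobody painted that "button" onto the canvas.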

Heating off
Tell me a joke

After Matthew Honnibal

Heating on

After Matthew Honnibal
Heating off
Tell me a joke

What's the weather?

Play some jazz

What's the time in Seattle?

Play some rock

How's my diary looking?

Add beer to my shopping list

Did Arsenal win last night?

Get the Batmobile ready

Buy more dishwasher tabs

Lock the back door

Where's that beer I ordered?

The Invisible Canvas
"[With a LUI] you have a vastly bigger canvas on which the user can click
But you still have to paint buttons, forms, navigation menus etc. onto this canvas. You're still wiring a UI to some fixed underlying set of capabilities."
Matthew Honnibal


Listening to people

All conversation has a shared context
When two people talk, their context will modify tone and content
We have social rules around phone calls, texts, IM etc. These are modifications to the rules of face-to-face conversation
We use different language for work / home / social contexts
but machines have no context to share
We have to explicitly model the context for the machine
Alexa does not have common sense

Authentication
In human-to-human conversation we authenticate on the sound of a voice almost instantly
To Alexa, all voices are equal
Without authentication, many potential uses for LUIs are insecure

You talkin' to me?
We rely on tone of voice to provide meta-data about the message content
"It's not what you said, it's the way you said it."
Without the meta-data, the communications capacity (bandwidth) is greatly reduced

GSOH essential
"Amazon, Google, Apple and Facebook have been recruiting a diverse cast of script writers, audio specialists and comedians. It is part of a much wider drive in digital industry to hire those with an understanding of how etiquette, creativity, dramatic timing and humour can elevate a digital experience. Google, for instance, is reportedly working with joke writers from Pixar and The Onion to imbue its new Assistant with some personality."

Tone of voice

When will it be like the movies?


Consider the context
Spatial context: is the user in a space where speech can be recognised?
Social context: is it socially acceptable to talk to a machine?
Task context: is the user engaged in some other task?

The Hype Cycle & Amara's Law

"We tend to overestimate the effect of a technology in the short run and underestimate the effect in the long run"
Roy Amara
Sources are here

