Designing Systems for Next-Generation I/O Devices Mitchell Tsai, Peter Reiher, Jerry Popek UCLA May 20, 1999

Designing Systems for Next-Generation I/O Devices

Mitchell Tsai, Peter Reiher, Jerry Popek

UCLA

May 20, 1999

Problem

• Next-Generation I/O performs poorly with existing applications and operating systems.

– Examples of next-generation sensors/actuators

• Speech, vision, handwriting, physical location…

– AI meets real General-Purpose Systems.

• Not in the sandbox anymore!

– What should OSs provide for these technologies?

Current Systems

[Figure: block diagram of the keyboard/mouse and speech input paths. Labels: Keyboard & Mouse; GUI Interface; OS & Applications; Sounds; Speech Recognition Engine; Grammar; Best Phrase; Speech Enabler; Command; OS & Applications.]

80-99% accuracy

Requires 100% accuracy in critical situations

One input at a time, from one source

“Make the text blue” => TextRange.Font.Color = ppAccent1

Noise & Errors

• Existing Metrics (Accuracy & Speed) are not good enough.

• Dictation: 99% accuracy at 150 wpm

10 sec/error = 20% time correcting errors!

Type             Time (sec)   Speed (wpm)   % Total Time
Tspeech          38           160           16%
Tdelay           33           85            14%
Tcorrections     131          30            57%
Tproof-reading   29           26            13%
Ttotal           230          26            100%

Ttotal = Tspeech + Tdelay + Tcorrections + Tproof-reading
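
The 20% figure and the Speed column can be checked with a little arithmetic. The sketch below is not from the slides. It assumes the Speed column is cumulative (words divided by all time spent so far) and infers a word count of about 101 from the first row (38 s at roughly 160 wpm); both are assumptions. Under them, the computed speeds land within 1 wpm of the table.

```python
def correction_fraction(wpm, accuracy, sec_per_error):
    """Fraction of total time spent fixing errors, per minute of dictation."""
    errors_per_min = wpm * (1.0 - accuracy)
    correcting = errors_per_min * sec_per_error  # seconds of fixing per minute spoken
    return correcting / (60.0 + correcting)


# Rows of the table: (component, seconds).
COMPONENTS = [("speech", 38), ("delay", 33), ("corrections", 131), ("proof-reading", 29)]
WORDS = 101  # assumption: inferred from 38 s at about 160 wpm, not stated on the slide


def cumulative_speeds(components, words):
    """Effective wpm after each component is added to the running total."""
    elapsed, speeds = 0, {}
    for name, secs in components:
        elapsed += secs
        speeds[name] = round(words / (elapsed / 60.0))
    return speeds


fraction = correction_fraction(150, 0.99, 10)   # 1.5 errors/min * 10 s = 15 s per 75 s
speeds = cumulative_speeds(COMPONENTS, WORDS)
print(f"{fraction:.0%}", speeds)
```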


Command & Control Errors

1) Most programs have No Undo capability

2) One Keystroke Loss

– Cancel in MS Money

– Paste instead of Copy on PalmPilot

3) Undo requires advanced knowledge

– MS Word accidental shift to outline mode

4) Undo is inconsistent between programs

– One text selection (Outlook Mail) or two (Netscape Mail)

From Dictation to Commands

• Commands are worse than dictation

– Con: Errors can be irreversible and/or dangerous

– Con: Dictation delays processing to increase accuracy

– Pro: Smaller grammars produce higher accuracy

• Error handling “ad hoc” & insufficient

– Handled twice by sensor processor & application

– Programmers design custom interfaces (or programs!)

– Users confused by inconsistencies

• How to leverage new inputs?

– Context-sensitive and ambiguous commands

Outline

• Problems of Next-Generation Sensors

• BabySteps: Some Dialogue Management Services

• Related Work

• Design Issues for Post-GUI Environments

Next-Generation Sensors

• Direct

– speech, handwriting, vision (eye gaze, pointing, gesture)

• Indirect

– vision (head and eye focus), geographic location, identification badges, emotions (affective computing).

• Traditional

– network connectivity, computer resources.

4 Main Problems of Next-Generation Sensors

1) Noise

– “Make this b… red”, Sporadic incorrect GPS readings

2) Errors

– Accidental user errors, Sensor processor mistakes

3) Ambiguity

– “Make this box red”: Which box?

4) Fragmentation

– Simultaneous inputs from speech, pointing, & vision

Sequences of Errors

• Series of commands

– “cd thisdir; mv foo ..; rm *”

• Linear Undo Stack problems

– Accidentally undo a few operations (X, Y, Z)

– Type “A”

– Lose all operations on the stack (X, Y, Z)

• Quit without Save, Accidental Command Mode

– Oops!, Confirmed a “Yes/No/Cancel” box.
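
The linear undo stack problem above can be reproduced in a few lines. This is a minimal sketch of a generic linear undo/redo history, not any particular application's implementation; the class name `LinearUndo` and the extra starting operation "W" are made up for illustration.

```python
class LinearUndo:
    """Minimal linear undo/redo history, as found in many editors."""

    def __init__(self):
        self.done = []    # operations applied so far
        self.undone = []  # operations undone and still redoable

    def do(self, op):
        self.done.append(op)
        self.undone.clear()  # a new operation discards everything redoable

    def undo(self):
        if self.done:
            self.undone.append(self.done.pop())

    def redo(self):
        if self.undone:
            self.done.append(self.undone.pop())


history = LinearUndo()
for op in ["W", "X", "Y", "Z"]:
    history.do(op)
for _ in range(3):      # accidentally undo Z, Y, X
    history.undo()
history.do("A")         # type "A"
history.redo()          # nothing left to redo: X, Y, Z are gone
print(history.done)     # ['W', 'A']
```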

BabySteps: Some Dialogue Management Services

• Command Manager

– Command Services

– Command Properties

• Context Manager

– Analyze Behavior Patterns

– Explicit Contexts (Internal, Dialogue, and External)

• Communicating Ambiguous Information

– Probabilistic

– Richer, Task-based, Annotated

PowerPoint: Context-Sensitive Speech & Mouse

BabySteps

[Figure: BabySteps architecture. Labels: Context Management; Command Processing Modules; Sounds; Command Processing; Speech Interpreter; OS & Applications; Dangerous commands; Safe commands; “We are in context 7 now.”; Grammar for context 7; Command Properties for context 7.]

Command Management

1) Command Services must be provided by OS

– Recording, editing, filtering,...

2) Command Properties must be communicated to OS

– Ambiguous, context-sensitive events (from sensors)

– Safety, reversibility, usage patterns, cost (from applications)

3) Command Processing Modules

– Safety Filter, Usage Tracker, Cost Evaluator
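
One way to picture command properties flowing through processing modules is sketched below. The slides do not specify an interface, so the `Command` fields, the thresholds, and the "execute"/"confirm" outcomes are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Command:
    name: str
    likelihood: float  # confidence reported by the sensor processor (0 to 1)
    reversible: bool   # property declared by the application
    cost: float        # normalized cost of a mistake (0 to 1)


def safety_filter(cmd, min_likelihood=0.9, max_cost=0.5):
    """Return 'execute' or 'confirm' (thresholds are illustrative only)."""
    if cmd.reversible and cmd.cost <= max_cost:
        return "execute"   # cheap to undo, so let it through
    if cmd.likelihood >= min_likelihood:
        return "execute"   # risky, but recognized with high confidence
    return "confirm"       # risky and uncertain: ask the user


class UsageTracker:
    """Counts how often each command is issued."""

    def __init__(self):
        self.counts = {}

    def record(self, cmd):
        self.counts[cmd.name] = self.counts.get(cmd.name, 0) + 1


def process(cmd, tracker):
    """Run a command through the modules in order."""
    tracker.record(cmd)
    return safety_filter(cmd)


tracker = UsageTracker()
print(process(Command("Make the text blue", 0.7, True, 0.1), tracker))
print(process(Command("Quit", 0.7, False, 0.9), tracker))
```

Keeping the properties on the command object means the filter needs no application-specific knowledge, which is the point of communicating properties to the OS.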

How Speech Recognition Works

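[Figure: word lattice. Candidate words per position: Aisle / I’ll / I; loathe / of / love; you / view / Hugh. Candidate phrases: “I’ll of view”, “I loathe you”, “I love you”, “I love Hugh”. Labels: Acoustic Model Best Match; Language Model Best Match; Two Model Best Match; Best in different context.]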

4 Models in Current Systems: Acoustic, Language, Vocabulary, Topic
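
The idea of picking a phrase by combining an acoustic model with a language model can be sketched with the candidate words from the figure. All scores below are invented for illustration, and the two "context" language models are hypothetical; real engines use far richer models.

```python
from itertools import product

# Hypothetical per-position acoustic scores P(word | sound); not from the slide.
ACOUSTIC = [
    {"Aisle": 0.30, "I'll": 0.45, "I": 0.25},
    {"loathe": 0.25, "of": 0.40, "love": 0.35},
    {"you": 0.35, "view": 0.40, "Hugh": 0.25},
]

# Hypothetical language-model scores P(phrase) in two different contexts.
DEFAULT_LM = {"I love you": 0.60, "I loathe you": 0.20, "I love Hugh": 0.15}
HUGH_LM = {"I love Hugh": 0.70, "I love you": 0.20}
FLOOR = 0.001  # score for phrases the language model has never seen


def acoustic_score(words):
    score = 1.0
    for slot, word in zip(ACOUSTIC, words):
        score *= slot[word]
    return score


def language_score(words, model):
    return model.get(" ".join(words), FLOOR)


def best(score_fn):
    """Highest-scoring path through the lattice under score_fn."""
    candidates = product(*(slot.keys() for slot in ACOUSTIC))
    return " ".join(max(candidates, key=score_fn))


acoustic_best = best(acoustic_score)
language_best = best(lambda w: language_score(w, DEFAULT_LM))
two_model_best = best(lambda w: acoustic_score(w) * language_score(w, DEFAULT_LM))
other_context_best = best(lambda w: acoustic_score(w) * language_score(w, HUGH_LM))
print(acoustic_best, "|", two_model_best, "|", other_context_best)
```

With these made-up numbers the acoustic model alone prefers a nonsense phrase, while the combined score prefers a sensible one and changes when the context changes.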

Methods for Better Accuracy

• Speech Engines can produce scored output

Score(Phrase | Sound) = –100 to 100

• Combine sensor information with application or OS information using likelihoods (L).

L(Command | Sound, Context) = L(Command | Context) * L(Command | Phrase, Context) * L(Phrase | Sound)

where L(A) = F(A) / (Σ_A F(A) – F(A)) and F(A) can be P(A) or some other scoring function
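
A minimal sketch of the combination rule above, reading the denominator as the sum of F over all alternatives minus F(A), so that with F = P the likelihood is the odds P/(1 − P). The score tables are hypothetical, and shifting engine scores by +100 to make them non-negative is an assumption, not something the slides state.

```python
def likelihood(scores, key):
    """L(A) = F(A) / (sum of F over all alternatives - F(A))."""
    rest = sum(scores.values()) - scores[key]
    return float("inf") if rest <= 0 else scores[key] / rest


def combined_likelihood(cmd_given_context, cmd_given_phrase, phrase_given_sound,
                        command, phrase):
    """L(Command | Sound, Context) as the product of the three factors."""
    return (likelihood(cmd_given_context, command)
            * likelihood(cmd_given_phrase, command)
            * likelihood(phrase_given_sound, phrase))


# Hypothetical scores F(.) for illustration only.
context_prior = {"color blue": 0.5, "color black": 0.3, "quit": 0.2}
phrase_to_cmd = {"color blue": 0.9, "color black": 0.05, "quit": 0.05}
engine_scores = {"Make the text blue": 60, "Make the text black": 40}
# Assumption: shift engine scores (-100 to 100) by +100 so that F is non-negative.
phrase_scores = {p: s + 100 for p, s in engine_scores.items()}

result = combined_likelihood(context_prior, phrase_to_cmd, phrase_scores,
                             "color blue", "Make the text blue")
print(round(result, 3))
```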

Explicit Contexts From User Behavior Analysis

• Example:

– Context A = a priori probabilities for “editing” commands

– Context B = a priori probabilities for “viewing” commands

• Other Types of Explicit Contexts

– Variations on Least Recently Used (LRU)

– Simple Markov Models

– Hidden Markov Models (HMMs)

– Bayesian Networks
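
As a sketch of the simplest case above, each explicit context can be a table of a priori command probabilities, and recent behavior can be scored against each table. The command names and probabilities are invented; the Markov and Bayesian variants listed above would replace the independence assumption made here.

```python
import math

# Hypothetical a priori command probabilities for two explicit contexts.
CONTEXTS = {
    "editing": {"cut": 0.30, "paste": 0.30, "type": 0.30, "scroll": 0.05, "zoom": 0.05},
    "viewing": {"cut": 0.05, "paste": 0.05, "type": 0.10, "scroll": 0.50, "zoom": 0.30},
}


def log_likelihood(history, prior, floor=1e-6):
    """Log probability of the recent commands, treating them as independent."""
    return sum(math.log(prior.get(cmd, floor)) for cmd in history)


def most_likely_context(history):
    """Name of the context whose prior best explains the recent commands."""
    return max(CONTEXTS, key=lambda name: log_likelihood(history, CONTEXTS[name]))


print(most_likely_context(["scroll", "zoom", "scroll"]))
print(most_likely_context(["type", "paste", "cut"]))
```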

Probabilistic Context-Sensitive Events

[Figure: three levels of events]

Low-level Events: Fuzzy Mouse Movement

Mid-level Events: 90% Region X, 10% Region Y

High-level Events: Select “box 3”, “line 4”, and “box 10”

Probabilistic Objects in Events

Type = Speech

PClarification = 0.6

NCommands = 3

Command[1] = “Thicken line 11”, L[1] = 0.61

Command[2] = “Thicken line 13”, L[2] = 0.24

Command[3] = “Quit”, L[3] = 0.15

“Thicken”
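
The event record on this slide maps naturally onto a small data structure. The sketch below uses the slide's values; the class name, field names, and the `most_likely` helper are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class ProbabilisticEvent:
    type: str
    p_clarification: float
    commands: list = field(default_factory=list)  # (command text, likelihood) pairs

    def most_likely(self):
        """The (command, likelihood) pair with the highest likelihood."""
        return max(self.commands, key=lambda pair: pair[1])


event = ProbabilisticEvent(
    type="Speech",
    p_clarification=0.6,
    commands=[("Thicken line 11", 0.61), ("Thicken line 13", 0.24), ("Quit", 0.15)],
)
print(event.most_likely())
```

An application receiving this event sees all three interpretations rather than a single best guess, so it can notice that a dangerous alternative ("Quit") is among them.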

User Clarification

• Consider PClarification, the probability that we should clarify the command with the user:

PClarification = [1 - L(CommandML, Context)] * LReversible(CommandML, Context) * LCost(CommandML, Context)

CommandML is the Most Likely command.

LReversible = 0 to 1 (1 means fully reversible)

LCost = 0 to 1 (a normalized version of cost)

• Reversibility and cost can reduce seriousness of errors, but they may increase the total time required to finish a task!

• What is the relative utility of different types of clarification?
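
The PClarification formula can be written out directly. The sketch below multiplies the three factors exactly as the slide writes them and treats L as a value between 0 and 1. The transcript does not settle how LReversible is meant to be scaled beyond what is stated above, and the threshold rule in `should_clarify` is a hypothetical policy, not part of the slides.

```python
def p_clarification(l_most_likely, l_reversible, l_cost):
    """PClarification = [1 - L(CommandML)] * LReversible * LCost, as written on the slide."""
    for value in (l_most_likely, l_reversible, l_cost):
        if not 0.0 <= value <= 1.0:
            raise ValueError("all inputs are expected in the range 0 to 1")
    return (1.0 - l_most_likely) * l_reversible * l_cost


def should_clarify(p, threshold=0.25):
    """Ask the user when PClarification exceeds a threshold (hypothetical policy)."""
    return p > threshold


p = p_clarification(0.61, 1.0, 1.0)  # most likely command has L = 0.61
print(round(p, 2), should_clarify(p))
```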

BabySteps: Additional Factors

• Performance Evaluation

– Error Hierarchy

– New Commands

– “Ambiguity is a Strength, not a Problem”

• “Transparency is not the best policy.”

– How to get Feedback from the user?

• Passive/Active

– Different Types of “Cancel”

• “Oops”, “Wrong”, “Backtrack”

Application Performance: Error Types

• Desired Effect 2%

• Inaction 13%

• Confirmation 0%

• Minor 0%

– Undoable

• Medium

– Fixable (1 command)

– Fixable (Few commands)

– Unrecoverable (Many commands) 8%

• Major 2%

– Exit without Save, Application Crash/Freeze

[Further percentages appear on the slide but cannot be matched to rows in this transcript: 1%, 8%, 1%, 9%, 5%, 8%]

Extended Benefits for Applications

• Combining speech & mouse commands

– Speech: “Make these arrows red.”

– Mouse: Move around arrows and other objects.

[Figure labels: Command Processing Modules; Sound; Command Processing; Speech Interpreters; OS & Apps]

• Mouse: Fuzzy Pointing

Ambiguity & Context = Convenience
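
One simple way to combine “Make these arrows red” with loose mouse movement is to collect the objects of the named kind that the pointer passed over around the time of the utterance. The sketch below is only one possible reading; the `PointerSample` type, the object names, and the 2-second window are assumptions.

```python
from dataclasses import dataclass


@dataclass
class PointerSample:
    time: float  # seconds since start of session
    obj: str     # object under (or nearest to) the pointer
    kind: str    # e.g. "arrow", "box"


def resolve_referents(speech_time, wanted_kind, samples, window=2.0):
    """Objects of the wanted kind that the pointer touched near the utterance."""
    seen = []
    for s in samples:
        near = abs(s.time - speech_time) <= window
        if near and s.kind == wanted_kind and s.obj not in seen:
            seen.append(s.obj)
    return seen


samples = [
    PointerSample(0.5, "arrow1", "arrow"),
    PointerSample(1.0, "box3", "box"),
    PointerSample(1.5, "arrow2", "arrow"),
    PointerSample(9.0, "arrow7", "arrow"),
]
targets = resolve_referents(speech_time=1.2, wanted_kind="arrow", samples=samples)
print(targets)  # ['arrow1', 'arrow2']
```

The spoken word "arrows" filters out the box the pointer also crossed, so the user does not need to point precisely.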

Ambiguity can be a Strength

• Ambiguity is usually considered a problem.

– If the user makes a precise command, and sensors provide perfect interpretation, then the application should know exactly what to do.

• Exact precision by the user may be impossible or extremely time-consuming. Consider PowerPoint:

– Moving the cursor to change modes

• Select Object Move Object => Resize Object Copy Object

– Selecting objects (and groups of objects)

• Very close and/or overlapping (esp. with invisible boundaries)

• From layers of different groups

– Making object A identical with object B in size, shape, color, etc...

BabySteps Summary

• New sensors & user inputs present a family of problems

– Noise, Errors, Ambiguity, Fragmentation

• BabySteps: Some Dialogue Management Services

1) Command Management - Command Services & Command Properties

2) Context Management - Analyze Behavior Patterns, Explicit Contexts

3) Communicate Ambiguous Information - Probabilistic, Richer

• Performance Evaluation

– New Metrics: Total Task Time, Error Hierarchy

– New Commands: Will they pass usability threshold?

– Transparency vs. Communication (User Feedback & Control)

– Ambiguity is a Strength

BabySteps approach to 4 Main Problems

1) Noise

– Facilitate closer interaction between sensor processors & applications

– Reduce impact of errors through command & context management

2) Errors

– Use user behavior analysis to detect, fix, and/or override errors.

– Ask user for help based on context and command properties

3) Ambiguity

– Limited context-sensitive speech and mouse

4) Fragmentation

– Probabilistic, temporal multimodal grammars not handled yet

Related Work

• Context-Handling Infrastructures

– Context Toolkit: Georgia Tech

• Provides context widgets for reusable solutions to context handling [Salber, Dey, Abowd 1998, 1999].

• Multimodal Architectures (Human-Computer Interfaces)

– QuickSet: Oregon Graduate Institute

• First robust approach to reusable scalable architecture which integrates gesture and voice. [Cohen, Oviatt, et al. 1992, 1997, 1999].

• Context Advantages for Operating Systems

– File System Actions: UC Santa Cruz

• Uses Prediction by Partial Match (PPM) to track sequences of File System Events for a predictive cache [Kroeger 1996, 1999].

Related Work

• CHI-99

– “Nomadic Radio: Scaleable and Contextual Notification for Wearable Audio Messaging”: MIT

• Priority, Usage Level, & Conversations [Sawhney, Schmandt 1999].

– LookOut, “Principles of Mixed-Initiative User Interfaces”: MSFT

• Utility of Action vs. Non-action vs. Dialog. [Horvitz 1999].

– “Patterns of Entry and Correction in Large Vocabulary Continuous Speech Recognition Systems”: IBM/Univ. of Michigan

• Compares Dragon, IBM, & L&H. Speech 14 cwpm (vs. keyboard 32 cwpm). [Karat, Halverson, Horn, Karat 1999].

– “Model-based and Empirical Evaluation of Multimodal Interactive Error Correction”: CMU/Universität Karlsruhe

• Models multimodal error correction attempts using TAttempt = TOverhead + R*TInput [Suhm, Myers, Waibel 1999].

Related Work

• Multimodal Grammars

– Oregon Graduate Institute [Cohen, Oviatt, et al. 1992, 1997].

– CMU [Vo & Waibel 1995, 1997].

• Command Management

– Universal Undo [Microsoft]

– Task-Based Windows UI [Microsoft]

• Context Management (CONTEXT-97, CONTEXT-99)

– AAAI-99 Context Workshop

• “Operating Systems Services for Managing Context” [Tsai 1999]

– AAAI-99 Mixed-Initiative Intelligence

• “Baby Steps Towards Dialogue Management” [Tsai 1999]

• Probabilistic & Labeled Information in OS

– Eve [Microsoft]

Post-GUI Systems

[Figure labels: Operating Systems; Artificial Intelligence; User Interfaces; Real People; Computer People; Special People; General Public; Next-Generation Sensors/Actuators]

Design Issues for Post-GUI Environments

• Performance may be driven by mobility & ubiquity.

– Hard to beat desktop performance, except for specialized tasks

– But why not design good macros? Or use 2+ pointers/mice?

– Even with no video screen or keyboard, use buttons (e.g. PalmPilot)

– Speech and video good for rapid acquisition of data

• What are new tasks for smart mobile environments?

– Summarize ongoing tasks (e.g. “Car, what was I doing?”)

– Real dialogue is mixed-initiative (All commands are backgrounded!)

– Control of multiple applications (Consider JAWS. Is this needed?)

– Context-sensitive communication (Where’s the nearest pizza?)

Possible Changes

• Explicit Contexts for Communication

– For users, or for system services

– What format for communicating events & contexts?

– What command properties should applications support?

• Database-like Rollback/Transactions for Application Commands

– In addition to Elephant File System (HotOS 1999)

– Making the entire computer more bulletproof, temporal history

– Support dialogue management rather than linear commands

• Command and Task History

– How to handle? Databases? Trees? Human conversation?

– Real Dialogue Management
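
To make the “Trees?” option concrete, here is a minimal sketch of a command history kept as a tree, so that operations undone before a new command remain reachable (the situation described earlier under Linear Undo Stack problems). The class and method names are hypothetical, and this is only one of several possible designs.

```python
class HistoryNode:
    def __init__(self, op, parent=None):
        self.op = op
        self.parent = parent
        self.children = []


class TreeHistory:
    """Undo history kept as a tree, so undone branches are never discarded."""

    def __init__(self):
        self.root = HistoryNode(None)
        self.current = self.root

    def do(self, op):
        node = HistoryNode(op, self.current)
        self.current.children.append(node)
        self.current = node

    def undo(self):
        if self.current.parent is not None:
            self.current = self.current.parent

    def redo(self, branch=-1):
        """Follow a child branch (default: the most recently created one)."""
        if self.current.children:
            self.current = self.current.children[branch]

    def path(self):
        """Operations from the root to the current state."""
        ops, node = [], self.current
        while node.parent is not None:
            ops.append(node.op)
            node = node.parent
        return ops[::-1]


h = TreeHistory()
for op in ["W", "X", "Y", "Z"]:
    h.do(op)
for _ in range(3):
    h.undo()            # back to the state after "W"
h.do("A")               # creates a second branch; X, Y, Z are kept
after_a = h.path()
h.undo()
h.redo(0)               # take the older branch again
h.redo()
h.redo()
restored = h.path()
print(after_a, restored)
```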

Possible Changes II

• “Faster is not better.”

– “Courteous Computing” (Horvitz, Microsoft)

– Pre-executing tasks works best in MS Outlook with 1 sec delay

– Alternative to “Yes/No” dialog = Announce action & wait 1 sec

• User I/O must be buffered, filtered, & managed

– Normal dialog is a series of background commands

– Speech-only output may be a queue of application output requests

– Variable environment conditions

• low/high bandwidth connections & Video/PalmPilot

– What if user must switch modalities midstream?

• Separate SAPI, GUI may not work - Need Multimodal API
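
The “announce action & wait 1 sec” alternative to a Yes/No dialog can be sketched with a cancellable timer from Python's standard library. The class name, message text, and callback interface are hypothetical.

```python
import threading


class AnnouncedAction:
    """Announce an action, then run it unless cancelled within `delay` seconds."""

    def __init__(self, description, action, delay=1.0, announce=print):
        self.description = description
        self._timer = threading.Timer(delay, action)
        self._announce = announce

    def start(self):
        self._announce(f"About to: {self.description} (say 'cancel' to stop)")
        self._timer.start()

    def cancel(self):
        self._timer.cancel()  # has no effect once the action has started

    def wait(self):
        self._timer.join()    # block until the action has run or been cancelled


log = []
send = AnnouncedAction("send message", lambda: log.append("sent"), delay=0.05)
send.start()
send.wait()

delete = AnnouncedAction("delete folder", lambda: log.append("deleted"), delay=0.05)
delete.start()
delete.cancel()               # user objected within the delay
delete.wait()
print(log)                    # ['sent']
```

The user is interrupted only when they object, which is the appeal of this pattern over a blocking confirmation box.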

Possible Changes III

• Applications not designed for multiple commands.

– Currently submenus & dialog box sequences help narrow context.

– Procedures => GUI event loops => Post-GUI dialogue

• Windows event systems aren’t either.

• I/O not designed for rapid interactive haptic/visual systems.

– 1/3 sec (300 ms) responses good for conscious responses

– But not for unconscious actions

• 1 ms visual tracking, 70 ms haptic responses, 150 ms visual responses

• Cost/Delay of sensor processors extremely high

– How to give e-mail system priority responsiveness?

• Unified resource management, Soft Real-Time Systems

– Governed by new Command Properties and Context Knowledge

Possible Changes IV

• Use Probabilistic & Multi-faceted Info throughout OS

– Task-based file identification

– Multiple configuration setups (NT dialup)

• Applications could be designed for ambiguous and context-sensitive commands

• Context-based Adaptive Computing, Active Networks

• Will a more context-aware system provide resiliency?

– Rather than super-slow AI learning?

Possible Changes V

• How do we support transition to real English dialogue?

• “Computerese” may co-exist with

– natural human spoken & gestural languages

– command-line & GUI computer interfaces

• Can other protocols learn from human languages?

– Use ambiguity, synonyms.

– Different Types of ACKs, NACKs

Future Directions

• If the System & Algorithm people can provide X, can the UI people design good ways to use this information?

• If the UI or Device has characteristic Y, what must the system and algorithm people provide?

• New sensors & user inputs present a family of problems

– Noise, Errors, Ambiguity, Fragmentation

• User I/O may need a whole family of User Dialogue services, similar to networking, file management, or process control.