Realtime Voicewriting Education

2005 INTERSTENO CONGRESS
VIENNA, AUSTRIA

Presented by Phillip A. Kaufman, CRI



Just about anyone who fluently speaks English, German, Italian, French, Spanish, Dutch or Japanese can use IBM ViaVoice or Dragon NaturallySpeaking to produce short documents.

But there are several significant differences between the common use of these applications and realtime voicewriting. Let’s examine those differences.

The average user dictates fairly slowly at his/her own pace, pausing to compose a thought, then dictating, then pausing to compose another thought, then dictating, et cetera.

The realtime voicewriter repeats everything that is said by individuals in a colloquy or an individual giving a soliloquy, and has no control over what is said or the speed at which it is said.

The average user is likely to use ViaVoice or NatSpeak simultaneously for computer command and control, websurfing, e-mail and dictating documents.

The realtime voicewriter’s sole use of the speech recognition engine when voicewriting is the instantaneous transcription of the dialogue or monologue of others.

Most average users dictate short documents.

Realtime voicewriters typically dictate nonstop for at least 30 minutes, and often for several hours.

The average user is usually the only person that sees the rough draft of what he or she has dictated. Some errors are okay because the document will be corrected before others read it.

The realtime voicewriter’s text is often viewed by other people as it is produced; therefore, accuracy must be near perfect.

The average user may take his time dictating long strings of commands and punctuation, or go back and format later.

The realtime voicewriter cannot pause to think about formatting. He applies specialized dictation and software techniques to compress what must be dictated to produce formatting and inserts those utterances appropriately to produce all appropriate formatting on-the-fly.

The average user can dictate sloppily and obtain acceptable results.

The realtime voicewriter must learn and practice diligently to develop extremely good enunciation and diction along with control of his voice and breathing.

The average user can use an average computer and is often ignorant of performance and maintenance concerns.

The realtime voicewriter must have a high-end computer, and know how to perform adjustments and maintenance to obtain the best, consistent performance from that computer.

The average user typically uses speech recognition for the same type of subject matter.

Most realtime voicewriters encounter a very wide variety of subject matters and must have a very broad knowledge base.

And the list could go on and on…

There is a lot to learn.

It is not magic.

It is through mastery of

► Rapid, clear dictation and specialized dictation techniques

► Computer operation, performance and maintenance as they relate to speech recognition

► The application of specialized, structured principles of speech recognition software manipulation

► CAT software tools

that an individual becomes capable of realtime voicewriting

EDUCATION IS KEY

• WELL-STRUCTURED AND COMPREHENSIVE

• BEST PROVIDED THROUGH SCHOOLS
Most students need to be led to the water and coaxed to drink

• THE TEACHER’S INFLUENCE IS AS VITAL AS THE TEXTBOOKS, CURRICULUM AND OTHER RESOURCES

• THERE IS A LOT TO LEARN IN ORDER TO MASTER REALTIME VOICEWRITING

Typical, uneducated preparation and/or underdeveloped training techniques result in less than optimal results.

MOST TRAINING PROVIDED TO DATE HAS BEEN FROM NOTHING MORE THAN SOFTWARE MANUALS

What would a well-developed realtime voicewriting curriculum cover and how would it be structured?

• Introduction to realtime voicewriting equipment
• Dictation practice equipment and audio sources
• Introduction to developing dictation skills (Dictation skill development will go on throughout the course and includes over 40 specially developed exercises and a structured set of speed drills.)
• The History of Realtime Voicewriting
• Computer operating system basic knowledge for realtime voicewriting
• Basic computer performance considerations for realtime voicewriting
• How speech recognition works
• An overview of realtime voicewriting career options
• Intermediate dictation skills (speakers, punctuation and macros)
• The average user’s speech recognition experience (creating a set of user files like everyone else does)
• Word processor basics
• Three preparation steps before creating realtime voicewriting user files
• The vocabulary tools in NaturallySpeaking
• The Voice-Ed Foundation/Specialization Vocabulary theory in NaturallySpeaking
• Creating a structured set of realtime voicewriting files in NaturallySpeaking
• Voice-Ed golden rules of user file care and re-creation
• Improving recognition with the correction tool in NaturallySpeaking
• Improving recognition with the Vocabulary Editor in NaturallySpeaking
• Other accuracy improving techniques in DNS
• Advanced computer maintenance for realtime voicewriting
• Understanding and troubleshooting computer audio for speech recognition
• Creation of a better set of DNS user files
• The vocabulary tools in IBM ViaVoice
• Creating a set of user files in ViaVoice
• Improving recognition with the correction tool in ViaVoice
• Improving recognition with the Macro Editor in ViaVoice
• Realtime Voicewriting with CAT software
• Correcting and editing in the CAT software
• CAT system globals for realtime voicewriting
• Vocal health and overcoming colds and other problems with your voice
• Specialized vocabularies
• Dictation skills for obtaining accuracy with fast talkers
• Digital room audio recording for court reporting
• Dictation and software techniques for court reporting
• Dictation and software techniques for captioning
• Captioning academics for realtime voicewriters
• Dictation and software techniques for CART
• CART academics for realtime voicewriters

Classes for academics applicable to all career specialties, such as grammar, punctuation, computer basics and terminology, should be taught simultaneously with the basic realtime voicewriting classes.

A MODERATE LEVEL OF COMPETENCY IN BASIC REALTIME VOICEWRITING SHOULD BE ACHIEVED BEFORE FOCUSING ON A SPECIALIZATION

SOME CAREERS DO NOT REQUIRE SPECIAL ACADEMICS

• Telephone and Internet Transcription Services
• Other corporate realtime transcription venues

• CART
• Captioning
• Court Reporting
• Medical Transcription

Teachers would need to have:

• a laptop or desktop computer equivalent to that on which the students will learn
• one or both speech recognition engines
• a speech silencer mask and, optionally, an open-mic headset
• a USB speech processor
• instructors’ copies of all course books and materials
• speed building dictation tapes
• course outlines, syllabi and lesson plans
• a set of tests and quizzes
• the CAT software chosen by the school
• preparatory training

Students would need to have:

• a speech silencer mask and, optionally, an open-mic, sound-isolating headset
• one or two cassette recorder/players
• a high-end laptop computer
• ViaVoice Pro USB or ViaVoice and NatSpeak Pro
• a USB speech processor
• textbooks, audio exercises and other materials
• speed building dictation tapes, CDs and/or online practice dictation recordings
• CAT software, student or full version (not necessary to begin studying; it can be added and learned late in the program)

WILL UNMANNED, AUTOMATED SPEECH RECOGNITION EVER REPLACE STENOTYPISTS AND REALTIME VOICEWRITERS?

It’s not likely, at least not within the next few decades. Why? Because there are a few things that it takes a human’s input to accomplish, things that are almost certain to be required for some time into the future in order to obtain the most accurate realtime transcription. These include:

• Instantaneous insertion of punctuation at the appropriate places.

• Speaker identification with automated formatting thereof.

• Homonym and other forms of conflict resolution. The speech recognition engines are very good at handling a great deal of these because of grammatical modeling, but a speech recognition engine can only function on probabilities, while the human mind can discern countless variables.

• There will always be someone participating in a discourse who mumbles or does not speak loudly or clearly enough for the microphones and computer to pick up his/her speech and convert it to text accurately.

• People often talk over each other. It would be almost impossible for a speech recognition system to separate out the words of each speaker.

Certainly, speech recognition will improve year after year. But the original pioneers and experts in the field of speech recognition engine development say that they do not predict any major breakthroughs in the next couple of decades, and that the improvements will be incrementally smaller and smaller. The uses of speech recognition are certain to grow rapidly. It will probably be everywhere. But accurate realtime transcription of multiple speakers is a special use of speech recognition.

HAL won’t be taking over our jobs in our lifetimes.

REALTIME VOICEWRITING

REAL OPPORTUNITIES
FOR STUDENTS, TEACHERS, SCHOOLS, THE INDUSTRY…

AND SERVICES FOR THE WORLD