Mobile Dictation With Automatic Speech Recognition for Healthcare Purposes
Tuuli Keskinen, Aleksi Melto, Jaakko Hakulinen, Markku Turunen, Santeri Saarinen, Tamás Pallos
TAUCHI research center, School of Information Sciences, University of Tampere, Finland
Riitta Danielsson-Ojala, Sanna Salanterä
Department of Nursing Sciences, Faculty of Medicine, University of Turku, Finland
Kites symposium 2013
Content
• Background & Motivation
• Dictation application
• User evaluation
• Results
• Discussion and conclusion
• Ending words
Background
• First speech recognition systems for medical reporting were developed over 20 years ago [1]
• Doctors’ dictations are still commonly typed manually, but utilization of speech recognition is increasing especially in radiology and pathology
• Nurses’ use of speech recognition is rare and often limited to filling the templates
[X] Numbers refer to the actual references in the paper.
Background
• The use of speech recognition in Finnish healthcare has been studied, e.g., in [2], where radiologists were followed as they changed from cassette-based recording to speech recognition based dictating
• Several studies on speech recognition in healthcare have been done, e.g., [1, 3, 4, 5]
• Previous studies focus mainly on objective qualities, such as dictation durations and recognition error rates
Motivation
”Oh, if only we had the possibility to dictate!”
- Anonymous nurse at YTHS
Motivation for our study
• Scarce use of speech recognition in Finnish healthcare, especially in nursing
• Obvious and unnecessary delays in getting patient information to the next treatment steps
• Lack of research focusing on the user expectations and experiences of dictation applications utilizing speech recognition in healthcare
Dictation application
• Based on ”MobiDic” by Turunen et al. [6]
• The mobile client (Android application on a tablet) includes functionality for recording and editing dictations, and modifying the dictation texts
• The server side manages the dictations (audio and text) and communicates with speech recognition engines and the M-Files document management system
Dictation application
• Not only speech recognition is utilized, but a variety of other tools is included to improve results:
– State-of-the-art natural language processing tools (e.g., spelling and grammar checking)
– Statistics based on user actions
– Optimized multimodal touch-screen UI
• Distributed application model makes a variety of use cases possible:
– Real-time distributed assisted dictation
– Workflow management
– Plug-and-play component management (e.g., speech recognizer, NLP tools, document management)
– UI can be adapted for different usage cases and devices
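The plug-and-play idea above can be illustrated with a minimal sketch. The slides do not describe the actual MobiDic server interfaces, so every name here (`DictationServer`, `register_recognizer`, `add_tool`, `transcribe`) is hypothetical, and the recognizer is a stand-in function rather than a real speech recognition engine.

```python
from typing import Callable, Dict, List

# Hypothetical component signatures (assumptions, not the MobiDic API):
# a recognizer turns audio bytes into text, a text tool post-processes text.
Recognizer = Callable[[bytes], str]
TextTool = Callable[[str], str]


class DictationServer:
    """Toy server that lets recognizers and NLP tools be swapped freely."""

    def __init__(self) -> None:
        self._recognizers: Dict[str, Recognizer] = {}
        self._tools: List[TextTool] = []

    def register_recognizer(self, name: str, recognizer: Recognizer) -> None:
        # Later registrations under the same name replace earlier ones.
        self._recognizers[name] = recognizer

    def add_tool(self, tool: TextTool) -> None:
        # Tools run in the order they were added.
        self._tools.append(tool)

    def transcribe(self, audio: bytes, recognizer: str) -> str:
        # Run the chosen recognizer, then every registered text tool.
        text = self._recognizers[recognizer](audio)
        for tool in self._tools:
            text = tool(text)
        return text
```

Because components are plain callables registered by name, a different recognizer or an extra spell-checking step could be added without touching the client, which is the property the distributed model is described as providing.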
Dictation application – v2.0
User evaluation
• Real-world context, real users and real dictations
• Two wound care nurses in one of the University Hospitals in Finland
• Lasted three months in total, covering 30 and 67 dictations for the participants
• Wizard-of-Oz approach
– The medical language model available was based on medical and nursing documentation, and thus, it was not sufficient to recognize the language used by the wound care nurses
Methodology
• Background interview
– Main focus on participants’ normal practices on making and/or dictating patient entries
• Subjective data gathered with questionnaires
– User expectations and experiences (SUXES [8])
– Usability-related experiences (SUS [9])
– Open questions
• Log data
– All application and server events logged
SUXES method
• Enables comparison between user expectations before the usage and user experiences after the usage on a set of statements
• Expectations reported by giving two values
– acceptable level: the lowest acceptable quality level for even using the system (or property)
– desired level: the uppermost level that can even be expected of the system (or property)
• Experiences reported by giving a single value on the same statements
• Expectations form a gap where the experienced level is usually expected to be
– If below, something is wrong; if above, success
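The comparison described above can be sketched in a few lines. This is an illustration only, assuming integer ratings on a 1 to 7 scale; the names `Expectation` and `compare` are hypothetical and not part of the SUXES method itself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Expectation:
    acceptable: int  # lowest acceptable level (assumed scale: 1..7)
    desired: int     # uppermost level that can be expected (assumed scale: 1..7)


def compare(expectation: Expectation, experience: int) -> str:
    """Place an experienced level relative to the expectation gap."""
    if experience < expectation.acceptable:
        return "below"   # something is wrong
    if experience > expectation.desired:
        return "above"   # success beyond expectations
    return "within"      # the usual, expected outcome


# Example with made-up values for the statement "Using the phone is fast."
result = compare(Expectation(acceptable=3, desired=6), experience=5)
```

Here `result` is `"within"`, because the experienced level 5 lies between the acceptable level 3 and the desired level 6.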
SUXES method
[Figure: example statement ”Using the phone is fast.” rated on a scale from Low to High. Expectations: two marks; Experiences: one mark; Comparison.]
Expectations and experiences
• We used the nine original statements of SUXES
– speed, pleasantness, clearness, error free use, error free function, learning curve, naturalness, usefulness, and future use
• …and five additional statements comparing the dictation application to the normally used entry practice
– faster, more pleasant, more clear, easier, and prefer in the future
User expectations on the application
Median responses of acceptable – desired levels (grey areas), n=2.
User experiences on the application
Median responses of acceptable – desired levels (grey areas) and experiences (black circles), n=2. P1 and P2 refer to participants 1 and 2.
User expectations compared to normal entry practice
Median responses of acceptable – desired levels (grey areas), n=2.
User experiences compared to normal entry practice
Median responses of acceptable – desired levels (grey areas) and experiences (black circles), n=2.
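The captions above report medians over the two participants. A minimal sketch of that aggregation follows, assuming a 1 to 7 scale. The responses are made up for illustration and are not the study's data; the statement keys and the helper `median_levels` are hypothetical.

```python
from statistics import median

# Made-up responses for illustration only; these are NOT the study's data.
# Each participant gives (acceptable, desired, experienced) per statement.
responses = {
    "speed": {"P1": (4, 6, 6), "P2": (5, 7, 7)},
    "usefulness": {"P1": (5, 7, 7), "P2": (5, 7, 6)},
}


def median_levels(per_participant):
    """Return the median acceptable, desired and experienced levels."""
    acceptable, desired, experienced = zip(*per_participant.values())
    return median(acceptable), median(desired), median(experienced)


summary = {name: median_levels(vals) for name, vals in responses.items()}
```

With only two participants, each median is simply the midpoint of the two responses, so half-step values such as 4.5 can appear even though individual ratings are whole numbers.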
Discussion
• The desired level was 6 or 7 on all statements
• The experienced level was at least 6 on all but one statement
• The usefulness of the dictation application can clearly be seen in the results
• More importantly, the participants would prefer using the application in the future, i.e., they would be ready to drop their familiar and safe routines
Conclusion
• Due to not having an accurate enough language model for nurses’ purposes, we used a Wizard-of-Oz scenario to finalize the speech recognition results
• The user experience results show true potential for our dictation application, not only for smoothing the dictation process but also as a relevant option for writing the nursing entries
Future work
• Finalizing a language model for nurses and utilizing it in Finnish healthcare to enable a fully automatic dictation-to-text process is crucial
• We are not developing the language models ourselves, but will collaborate closely with our partners in the development and evaluation
• We are also developing our application further to provide an even more pleasurable user experience and a seamless process
Future Work
• To make this a reality, we need a proper process for iterative deployment, not a stand-alone product that can simply be sold to hospitals, for example
• We have developed all necessary components: client and backend software, connections to 3rd party components, tools to support deployment, and a complete deployment process
• Ready for commercialization – looking for partners!
Global market
Acknowledgements
• Project ”Mobile and Ubiquitous Dictation and Communication Application for Medical Purposes” (”MOBSTER”)
• Funded by the Finnish Funding Agency for Technology and Innovation (TEKES)
• Lingsoft and M-Files, and other project partners