Behrooz Chitsaz Lorrie Apple Johnson Microsoft Research U.S. Department of Energy


Post on 31-Dec-2015



TRANSCRIPT

Page 1: Behrooz Chitsaz Lorrie Apple Johnson Microsoft Research U.S. Department of Energy

Behrooz Chitsaz, Microsoft Research

Lorrie Apple Johnson, U.S. Department of Energy

Page 2

Multimedia Research

• Speech Search

• Face identification

• Object recognition

• Video browsing

• Semantic extraction

• (3D) Segmentation

• (3D) Image search

Page 3

Speech Applications

• Speech as interface

• Speech as 1st class content

Page 4

Speech recognition

[Diagram: speech (“Hello World”) → Spectral Analysis → observations $o_1 \ldots o_T$ → Matching (Decoding): time alignment, most likely hypothesis → word sequence $(\hat{w}_1 \ldots \hat{w}_N)$]

$$\hat{W} = \arg\max_{w_1 \ldots w_N} \; p(o_1 \ldots o_T \mid w_1 \ldots w_N)\, P(w_1 \ldots w_N)$$

• Acoustic Models: $p(o_1 \ldots o_T \mid \text{phoneme})$

• Dictionary: $P(\text{phonemes} \mid w)$

• Grammar (Language Model): $P(w_1 \ldots w_N)$
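The decoding rule on this slide can be illustrated with a toy example. The sketch below is not the recognizer used in the talk. It assumes a hypothetical handful of candidate word sequences with made-up acoustic likelihoods and language model probabilities, and picks the candidate that maximizes their product (computed in log space).

```python
import math

# Hypothetical toy scores. A real recognizer derives p(O | W) from acoustic
# models and a pronunciation dictionary, and P(W) from a language model.
acoustic_likelihood = {
    ("hello", "world"): 0.020,
    ("hollow", "word"): 0.025,
    ("yellow", "world"): 0.010,
}
language_model_prob = {
    ("hello", "world"): 0.30,
    ("hollow", "word"): 0.01,
    ("yellow", "world"): 0.05,
}


def decode(hypotheses):
    """Return the word sequence W maximizing log p(O | W) + log P(W)."""
    def score(words):
        return math.log(acoustic_likelihood[words]) + math.log(language_model_prob[words])
    return max(hypotheses, key=score)


best = decode(list(acoustic_likelihood))
print(" ".join(best))
```

Note that "hollow word" has the highest acoustic score in this toy data, yet the language model term makes "hello world" the overall winner, which is the point of combining the two factors.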

Page 5

MAVIS technology

• Indexing automatic transcripts as text
  – Automatic transcription accuracy is only 50-80%
• MAVIS techniques
  – Word-level lattice indexing
    • Index word alternatives: robust to recognizer errors
    • 50-140% accuracy improvement
    • Index timing: navigate to exact point in video
  – Vocabulary Adaptation
    • Use NLP and Bing Search to expand word dictionary
  – Automatic keywords to expose to search engines
    • Enables discovery of speech content through search engines
    • By-product of vocabulary adaptation
  – See http://research.microsoft.com/mavis
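The idea of indexing word alternatives together with their timing can be sketched as a small inverted index. This is only an illustration under stated assumptions: the lattice layout, the posterior values, and the names `build_index` and `search` are hypothetical and are not taken from MAVIS.

```python
from collections import defaultdict

# Hypothetical lattice: each time slot holds competing word alternatives
# as (word, start_seconds, posterior) tuples.
lattice = [
    [("nuclear", 12.4, 0.55), ("unclear", 12.4, 0.40)],
    [("fusion", 12.9, 0.70), ("fission", 12.9, 0.25)],
    [("reactor", 13.5, 0.90)],
]


def build_index(video_id, lattice):
    """Index every alternative, not just the top-scoring word per slot."""
    index = defaultdict(list)
    for slot in lattice:
        for word, start, posterior in slot:
            index[word].append((video_id, start, posterior))
    return index


def search(index, term, min_posterior=0.1):
    """Return (video_id, start_seconds, posterior) hits, best first."""
    hits = [hit for hit in index.get(term.lower(), []) if hit[2] >= min_posterior]
    return sorted(hits, key=lambda hit: hit[2], reverse=True)


index = build_index("talk-001", lattice)
print(search(index, "fission"))
```

A plain transcript of the top-scoring words would contain only "fusion" at 12.9 seconds. Because the lower-ranked alternative is indexed as well, a query for "fission" still finds the position in the video.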

Page 6

MAVIS Architecture

[Diagram: SQL Server(s) and Web server(s), with four numbered steps]

1. Submit audio/video RSS

2. Retrieve AIB

3. Import AIB in SQL

4. Search/Retrieve results

• Store content to be processed in temporary Azure storage

• Do vocabulary adaptation using Bing

• Run recognition engine on content

• Store results of recognition process (AIB)
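Steps 3 and 4 of the workflow (import the index into SQL, then search it) can be sketched with an in-memory SQLite database standing in for SQL Server. The slides do not describe the AIB format, so the rows, the `audio_index` table, and its columns below are all hypothetical.

```python
import sqlite3

# Hypothetical rows decoded from an audio index; the real AIB format
# is not described in the slides.
rows = [
    ("talk-001", "fusion", 12.9, 0.70),
    ("talk-001", "reactor", 13.5, 0.90),
    ("talk-002", "fusion", 301.2, 0.65),
]

# Step 3: import the index into a SQL table (SQLite stands in for SQL Server).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE audio_index "
    "(video_id TEXT, word TEXT, start_seconds REAL, confidence REAL)"
)
conn.execute("CREATE INDEX idx_audio_index_word ON audio_index (word)")
conn.executemany("INSERT INTO audio_index VALUES (?, ?, ?, ?)", rows)


def find_term(conn, term):
    """Step 4: return (video_id, start_seconds) hits, most confident first."""
    cursor = conn.execute(
        "SELECT video_id, start_seconds FROM audio_index "
        "WHERE word = ? ORDER BY confidence DESC",
        (term.lower(),),
    )
    return cursor.fetchall()


print(find_term(conn, "fusion"))
```

A web server could turn each returned offset into a link that starts playback at that point in the video.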

Page 7

U.S. Department of Energy Office of Scientific and Technical Information (OSTI) Mission

• DOE invests > $10 billion/year in basic sciences, clean energy technology, and nuclear research.

• The immediate output from this investment is Information…Knowledge… R&D results

• OSTI’s mission is to accelerate scientific progress by accelerating access to this information.

Page 8

OSTI’s Core Products

• Information Bridge

• Science Accelerator

• Science.gov

Page 9

WorldWideScience.org

Page 10

Emerging Forms of Scientific Information Require New Tools

• Numeric data, multimedia, and social media are emerging forms of scientific information

• Each form presents special opportunities and challenges

Page 11

Search and Retrieval Challenges with Multimedia Science Information

• Lack of written transcripts, i.e. no “full text” to search

• Metadata, if available, is often minimal

• Scientific, technical, and medical terminology/vocabulary

• Videos can be long, often up to an hour or more

Page 12

OSTI and Microsoft Research Partnership

• Video files collected from DOE’s National Laboratories

• RSS feeds with metadata and URLs sent to Microsoft Research

• Audio indexing performed via MAVIS

• Audio index blob (AIB) returned to OSTI and integrated with SQL servers

• Users can search for a precise term within the video, and be directed to the exact point in the video where the term was spoken
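An RSS feed carrying video metadata and URLs might look like the output of the sketch below. The slides do not specify the feed layout, so this assumes a plain RSS 2.0 feed with one `enclosure` element per video. The titles, URLs, and the `build_feed` helper are hypothetical placeholders.

```python
import xml.etree.ElementTree as ET


def build_feed(videos):
    """Build a minimal RSS 2.0 feed with one <item> per video."""
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = "Example video feed"
    ET.SubElement(channel, "link").text = "https://example.org/videos"
    ET.SubElement(channel, "description").text = "Videos submitted for audio indexing"
    for video in videos:
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = video["title"]
        ET.SubElement(item, "description").text = video["description"]
        # The enclosure element carries the media URL, size in bytes, and MIME type.
        ET.SubElement(
            item,
            "enclosure",
            url=video["url"],
            length=str(video["size_bytes"]),
            type="video/mp4",
        )
    return ET.tostring(rss, encoding="unicode")


feed_xml = build_feed([
    {
        "title": "Example lecture",  # hypothetical metadata
        "description": "Placeholder description",
        "url": "https://example.org/videos/lecture.mp4",
        "size_bytes": 123456,
    }
])
print(feed_xml)
```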

Page 13

Demonstration of ScienceCinema

Page 14

Looking to the Future

• Additional content from DOE researchers

• Integration of multimedia searches into WorldWideScience.org by June

• High-quality automatic closed captions

• Multilingual translation capabilities