multimedia information retrieval: what is it, and why isn't

74
Video Search: Opportunities and Challenges Keynote Speech at ACM MIR Workshop ACM Multimedia Conference 2005 Singapore Dr. Ramesh R. Sarukkai Yahoo! Search { [email protected] [email protected] }

Upload: webhostingguy

Post on 01-Nov-2014

1.800 views

Category:

Documents


0 download

DESCRIPTION

 

TRANSCRIPT

Page 1: Multimedia Information Retrieval: What is it, and why isn't

Video Search: Opportunities and Challenges

Keynote Speech at ACM MIR WorkshopACM Multimedia Conference 2005

Singapore

Dr. Ramesh R. Sarukkai

Yahoo! Search{ [email protected] [email protected] }

Page 2: Multimedia Information Retrieval: What is it, and why isn't

About the Presenter

Dr. Ramesh Rangarajan Sarukkai currently heads up Video Search engineering at Yahoo where he manages video search/core media search infrastructure. Over his 6+ years tenure at Yahoo, Dr. Sarukkai has grown and managed different teams including the Small Business (e-commerce stores, hosting, domains) billing platform, and architected wireless applications & voice portals including 1-800-my-yahoo. Prior to Yahoo, he has worked in Kurzweil A.I., IBM Watson Lab, and has collaborated with HP Labs.

Dr. Sarukkai is the author of the book entitled “Foundations of Web Technology” (Kluwer/Springer 2002). In addition, he holds the first patent on large vocabulary voice web browsing, and has a number of patents awarded/pending in the areas of speech, natural language modeling, personalized search, information retrieval, optimal sponsored search, media and wireless technologies. Dr. Sarukkai also served on the World Wide Web Consortium (W3C) Voice Browser working group, recently chaired a panel in World Wide Web 2005 conference (Japan) on networking effects of the Web, and has contributed as a reviewer/Program Committee in a number of leading conferences/journals. His current interests include next-generation media search, social annotation, networking effects on the Web, emergent web technologies and new business models.

Page 3: Multimedia Information Retrieval: What is it, and why isn't

Outline• Introduction

• Market Trends

• Video Search

• Opportunities

• Challenges

• Conclusion

Page 4: Multimedia Information Retrieval: What is it, and why isn't

Introduction

Key questions discussed in this talk:• Why is video search important?• What broadly constitutes video search?• How is Web Video Search different?• What are some opportunities and challenges?

Page 5: Multimedia Information Retrieval: What is it, and why isn't

Outline• Introduction

• Market Trends

• Video Search

• Opportunities

• Challenges

• Conclusion

Page 6: Multimedia Information Retrieval: What is it, and why isn't

Market Trends

Whoever controls the media - the images - controls the culture.

- Allen Ginsberg

American Poet

Page 7: Multimedia Information Retrieval: What is it, and why isn't

Market Trends• Broadband doubling over next 3-5 years• Video enabled devices are emerging rapidly• Emergence of mass internet audience• Mainstream media moving to the Web• International trends are similar• Money Follows…

Page 8: Multimedia Information Retrieval: What is it, and why isn't

Market Trends• How many of you are aware of video on the Web?• How many have viewed a video on the Web in

• The last 3 months?• The last 6 months?• Ever?

• Would you watch video on your devices (ipod/wireless)?

• How many of you have produced video (personal or otherwise) recently?

• How many of you have shared that with your friends/community? Would you have liked to?

Page 9: Multimedia Information Retrieval: What is it, and why isn't

Market Trends• How many of you are aware of video on the Web?

• Large portion of online users

• How many have viewed a video on the Web in • The last 3 months?

• 50%• The last 6 months?• Ever?

• Would you watch video on your devices (ipod/wireless)?• 1M downloads in 20 days (iPod)

• How many of you have produced video (personal or otherwise) recently?

• Continuing to skyrocket with digital camera phones/devices

• How many of you have shared that with your friends/community? Would you have liked to?

• Huge interest & adoption in viral communities

* Source: Forrester Report

Page 10: Multimedia Information Retrieval: What is it, and why isn't

Market Trends

• Technology more media friendly• Storage costs plummeting (GB TB)• CPU speed continuing to double (Moore’s law)• Increased bandwidth• Device support for media• Adding media to sites drives traffic• Web continues to propel scalable infrastructure

for media products/communities

Page 11: Multimedia Information Retrieval: What is it, and why isn't

Market Trends

Democratization of Mass Media in the 21st Century (aka the long tail)

• Of the People (media is ultimately defined by users)

• By the people (production,tagging)

• For the people (consumption, devices, personalization)

- Abraham Lincoln as a Media Mogul in the 21st Century!

“The Long Tail” was coined by Chris Andersen’s article in the Wired magazine.

Page 12: Multimedia Information Retrieval: What is it, and why isn't

Market Trends

Mainstream media migrating online & interactive

• Media content online is growing• Video Streaming continuing to grow aggressively online.• More interactive in nature (user voting):

• MTV Interactive• American Idol• Reality shows

Page 13: Multimedia Information Retrieval: What is it, and why isn't

Market Trends

Video search is a key part of the future of digital media

Page 14: Multimedia Information Retrieval: What is it, and why isn't

Outline• Introduction

• Market Trends

• Video Search

• Opportunities

• Challenges

• Conclusion

Page 15: Multimedia Information Retrieval: What is it, and why isn't

Video Search

What is Video Search?

Multimedia? As far as I'm concerned, it's reading with the radio on!

- Rory BremnerActor/Writer

Page 16: Multimedia Information Retrieval: What is it, and why isn't

Video Search

Meta-Content

Video Production

UnstructuredData

Video Consumption/Community

Content Features +Meta-Content

Structured +Unstructured Data

Page 17: Multimedia Information Retrieval: What is it, and why isn't

Video Search

• Media Information Retrieval• Been around since the late 1970s

• Text Based/DB• Issues: Manual Annotation, Subjectivity of Human Perception

• Content Based• Color, texture, shape, face detection/recognition, speech

transcriptions, motion, segmentation boundaries/shots• High Dimensionality• Limited success to date.

Citations:

“Image Retrieval: Current Techniques, Promising Directions, and Open Issues” [Rui et al 99]

Page 18: Multimedia Information Retrieval: What is it, and why isn't

Video Search

Active Research Area

Graph from “A new perspective on Visual Information Retrieval”, Horst Eidenberger, 2004Black: “Image Retrieval”; Grey:”Video Retrieval”; IEEE Digital Library

Page 19: Multimedia Information Retrieval: What is it, and why isn't

Video Search

Traditional 3-Step Overview:

• Meta-Data/Visual Feature Extraction• Multi-dimensional indexing• Retrieval System design

Page 20: Multimedia Information Retrieval: What is it, and why isn't

Video Search

Popular features/techniques:• Color, Shape, Texture, Shape descriptors• OCR, ASR• A number of prototype or research products with

small data sets• More researched for visual queries

Page 21: Multimedia Information Retrieval: What is it, and why isn't

Video Search: Features

1970 200019901980 2010

Texture: Autocorrelation; Wavelet transforms; Gabor Filters

Shape: Edge Detectors; Moment invariants; Animate Vision Marr; Finite Element Methods; Shape from Motion;

Color: Color Moments Color Histograms Color Autocorrelograms

Segmentation: Scene segmentation; Scene Segmentation; Shot detection;

OCR: Modeling; Successful OCR deployments;

Face: Face Detection algorithms; Neural Networks; EigenFaces

ASR: Acoustic analysis; HMMS; N-grams; CSR; LVCSR; Domain Specific;

NIST Video TREC StartsMedia IR systems

Web Media Search

Page 22: Multimedia Information Retrieval: What is it, and why isn't

Video Search: Features

Color• Robust to background • Independent of size, orientation• Color Histogram [Swain &

Ballard]• “Sensitive to noise and

sparse”- Cumulative Histograms [Stricker & Orgengo]

• Color Moments• Color Sets: Map RGB Color

space to Hue Saturation Value, & quantize [Smith, Chang]

• Color layout- local color features by dividing image into regions

• Color Autocorrelograms

Texture• One of the earliest Image

features [Harlick et al 70s]• Co-occurrence matrix• Orientation and distance on gray-

scale pixels• Contrast, inverse deference

moment, and entropy [Gotlieb & Kreyszig]

• Human visual texture properties: coarseness, contrast, directionality, likeliness, regularity and roughness [Tamura et al]

• Wavelet Transforms [90s]• [Smith & Chang] extracted mean

and variance from wavelet subbands

• Gabor Filters• And so on

Region Segmentation• Partition image into regions• Strong Segmentation: Object

segmentation is difficult.• Weak segmentation: Region

segmentation based on some homegenity criteria

Scene Segmentation• Shot detection, scene detection• Look for changes in color,

texture, brightness• Context based scene

segmentation applied to certain categories such as broadcast news

Page 23: Multimedia Information Retrieval: What is it, and why isn't

Video Search: FeaturesFace

• Face detection is highly reliable- Neural Networks [Rwoley]- Wavelet based histograms of facial features [Schneiderman]

• Face recognition for video is still a challenging problem.- EigenFaces: Extract eigenvectors and use as feature space

OCR• OCR is fairly successful

technology.• Accurate, especially with good

matching vocabularies.• Script recognition still an open

problem.

ASR• Automatic speech recognition

fairly accurate for medium to large vocabulary broadcast type data

• Large number of available speech vendors.

• Still open for free conversational speech in noisy conditions.

Shape

• Outer Boundary based vs. region based

• Fourier descriptors• Moment invariants• Finite Element Method

(Stiffness matrix- how each point is connected to others; Eigen vectors of matrix)

• Turing function based (similar to Fourier descriptor) convex/concave polygons[Arkin et al]

• Wavelet transforms leverages multiresolution [Chuang & Kao]

• Chamfer matching for comparing 2 shapes (linear dimension rather than area)

• 3-D object representations using similar invariant features

• Well-known edge detection algorithms.

Page 24: Multimedia Information Retrieval: What is it, and why isn't

Video Search: Video TREC• Overview:

• Shot detection, story segmentation, semantic feature extraction, information retrieval

• Corpora of documentaries, advertising films• Broadcast news added in 2003• Interactive and non-interactive tests• CBIR features• Speech transcribed (LIMSI)• OCR

Page 25: Multimedia Information Retrieval: What is it, and why isn't

Video Search: Video TREC• 1st TREC (2001)

• Mean Average Precision @100 items (MAP) 0.033 (in general category)• Transcript only was better than transcript+image aspects+ASR• Also there were different types of test: known items versus general. Known items

had at least one result.• 2nd TREC (2002)

• Interactive runs to rephrase queries• Again text based on only ASR was the best performing system• Mean Average precision of 0.137• Leading system employed multiple systems: TF-IDF variants (Mean Average

Precision MAP 0.093). • Other was boolean with query expansion (0.101 MAP)• OCR was not particularly applicable; Phonetic ASR did not help.

• 3rd TREC (2003)• Data changes radically (Broadcast news added – CNN, ABC, CSPAN)• Baseline ASR + CC MAP 0.155• ASR + CC + VOCR + Image Similarity + Person X retrieval MAP 0.218

* “Successful Approaches in the TREC Video Retrieval Evaluations”, Alexander Hauptmann, Michael Christel, ACM Multimedia 2004.

Page 26: Multimedia Information Retrieval: What is it, and why isn't

Video Search: Browsing• Search by text• Navigation with customized categories • Random Browsing • Search by example• Search by Sketch

Page 27: Multimedia Information Retrieval: What is it, and why isn't

Video Search:• Pre-2000 or Research

• QBIC (IBM Almaden)• Photobook (MIT Media Lab)• FourEyes (MIT Media Lab)• Netra (UCSB Digital Library)• MARS (UCI)• PicToSeek (UVA.nl)• VisualSEEK (Columbia)• PicHunter (NEC)• ImageRover (BU)• WebSEEK (Columbia)• Virage (now Autonomy)• Visual RetrievalWare (Convera)• AMORE (NEC)• BlobWorld (UC Berkeley)

Page 28: Multimedia Information Retrieval: What is it, and why isn't

Video Search

We have discussed the background of video search, techniques used, and applications. It should be clear that “video search” is a fairly broad term.

Page 29: Multimedia Information Retrieval: What is it, and why isn't

Video Search

Note the differences in these broader definitions of video search:• No mention of content based versus meta-data based.• No mention of the actual browsing paradigm.• Production and consumption are a key aspect of video

search, and should be taken into core consideration in the models.

• Device, personalization and on-demand needs addressed.

Page 30: Multimedia Information Retrieval: What is it, and why isn't

Video Search

Video Search has evolved a lot since its inception and opportunities much broader.

Multimedia technologists (We) are defining and influencing the emergent media culture!

Page 31: Multimedia Information Retrieval: What is it, and why isn't

Outline• Introduction

• Market Trends

• Video Search

• Opportunities

• Challenges

• Conclusion

Page 32: Multimedia Information Retrieval: What is it, and why isn't

Opportunities

Engineering is the art of compromise, and there is always room for improvement in the real world; but

engineering is also the art of the practical. Engineers realize that they must, at some point curtail design,

and begin to manufacture or build

- H. Petrovski

Invention by design (Harvard Univ. Press)

Page 33: Multimedia Information Retrieval: What is it, and why isn't

Opportunities

The Internet Revolution• Exposed the internet backbone to the masses• Standardized publication of web media

(HTML,CSS,XML)• Scaled to millions of users• Pre-dot-com bust- slow emergence of media

search (notably AltaVista Audio-Visual search)• Post-dot-com bust

• Recent trends of community tagging especially for Images: Flickr (Y!), DeliCious, and so on.

Page 34: Multimedia Information Retrieval: What is it, and why isn't

Opportunities• Media Search Now

• Yahoo! Image, Video, Audio, Podcast searches• Flickr(Y!)• AOL/SingingFish• BlinxTV• Google• Many smaller companies

Page 35: Multimedia Information Retrieval: What is it, and why isn't

OpportunitiesWeb Based Video Search

• Web Search meets Media Search• Leverage Meta-content from:

• Web (inferred meta-data)• Community (Tagging and user submission/mRSS)• Structured Meta-Data from different sources

• Augment with limited media analysis• OCR, ASR, Classification

Page 36: Multimedia Information Retrieval: What is it, and why isn't

Video Search

Meta-Content

Video Production

UnstructuredData

Video Consumption/Community

Content Features +Meta-Content

Structured +Unstructured Data

Web Video Search

Traditional M

edia Search

Page 37: Multimedia Information Retrieval: What is it, and why isn't

Opportunities Yahoo! Media Search

Billions of Images

Millions of Videos

Hundreds of Millions of users

Reasonable Precision rates

Page 38: Multimedia Information Retrieval: What is it, and why isn't

Opportunities

Example 1:• User query “zorro”

• User 1 wants to see Zorro videos• User 2 wants to see Legend of Zorro movie clips• User 3 just wants to see home videos about Zorro

• Can content based analysis help over structured meta-data query inference?

Page 39: Multimedia Information Retrieval: What is it, and why isn't
Page 40: Multimedia Information Retrieval: What is it, and why isn't

Opportunities

Example 2:• For main-stream head content such as news videos.• Meta-data are fairly descriptive • Usually queried based on non-visual attributes.

• Task: “Pull up recent Hurricane Katrina videos”

Page 41: Multimedia Information Retrieval: What is it, and why isn't
Page 42: Multimedia Information Retrieval: What is it, and why isn't

Opportunities

Example 3:• Creative Home Video• Community video rendering!

• The now “famous” Star Wars Kid• Example of “social buzz” combined with innovative tail

content video production.

Page 44: Multimedia Information Retrieval: What is it, and why isn't

Opportunities

A hard example:• Lets take an example:

“Supposing you want to find videos that depict a monkey/chimp doing karate”!

• CBIR Approach:

Train models for Chimps/Monkeys

Motion Analysis for Karate movement models

Many open issues/problems!

Page 46: Multimedia Information Retrieval: What is it, and why isn't

Opportunities

What are the key opportunities for Video Search?

Automatic Speech Recognition (ASR) as a comparable case study

Page 47: Multimedia Information Retrieval: What is it, and why isn't

OpportunitiesSpeech Recognition: Case study

• From small experimental models to usable large scale products in restricted domains.

• 60’s – Very small experimental tests; Digit recognition; • 70’s – Core Acoustic modeling work; Vocal Tract Modeling;

Formant analysis; Signal Processing (FFT, MFCC)• 80’s – HMMs; Statistical Language Models; Large collections of

acoustic data. Very Large collection of non-acoustic Text Data; Speaker ID;

• 90’s – Large vocabulary speech recognition deployments in many domains (IVR, LVCSR)

• Natural Language Understanding, Core Background Noise/Multiple speakers, Spontaneous Dialog still open research problems

Page 48: Multimedia Information Retrieval: What is it, and why isn't

Opportunities

1970 200019901980 2010

Acoustic Modeling Vocal Tract Modeling Formant Analysis DSP: FFT, MFCC

Digit Recognizers Viterbi search; HMMs Sub-phonetic Models

FSMs, Grammars, Statistical N-grams

Digit Recognition; Small Vocabulary Discrete Medium Vocabulary Continuous Large Vocabulary Continuous LVCSR Telephony/Broadcast

Page 49: Multimedia Information Retrieval: What is it, and why isn't

Opportunities

1970 200019901980 2010

Acoustic Modeling Vocal Tract Modeling Formant Analysis DSP: FFT, MFCC

Digit Recognizers Viterbi search; HMMs Sub-phonetic Models

FSMs, Grammars, Statistical N-grams

Digit Recognition; Small Vocabulary Discrete Medium Vocabulary Continuous Large Vocabulary Continuous LVCSR Telephony/Broadcast

Millions of non-audio Text Data

Hundreds of Hours of AudioData

Modeling FrameworkMilestones

Page 50: Multimedia Information Retrieval: What is it, and why isn't

Opportunities

Speech Recognition Case Study Key Milestones:• Foundations: HMMs, Viterbi search, Statistical Language

Models• Creative use of Data: Millions of non-audio text data from

Wall Street Journal Corpus applied for Language model training

• Data Crunching at the lowest levels: Hundreds of hours of data applied to model thousands of sub-phonetic models

Page 51: Multimedia Information Retrieval: What is it, and why isn't

Opportunities

Speech Recognition Case Study: • Exploiting non-media sources of meta/text data are

beneficial• Applying contextual restraints enables improved

applicability• Large scale data crunching does indeed help solve

hard problems

Page 52: Multimedia Information Retrieval: What is it, and why isn't

Opportunities

Web Video Search:• Mass Adoption: Large Community of

users/taggers/web developers• Different sources of Media and non-Media data• Highly Scalable systems

Page 53: Multimedia Information Retrieval: What is it, and why isn't

Opportunities

The ASR takeaways can be applied to video search: • Exploiting non-media sources of meta/text data are

beneficial (Web Meta-data, Community Tagging)• Applying contextual restraints enables improved

applicability (User models, structured data,Tailored)• Large scale data crunching does indeed help solve

hard problems (Throw Horsepower)

Page 54: Multimedia Information Retrieval: What is it, and why isn't

OpportunitiesWeb Browsing Paradigms

• Search by text (Very popular)• Navigation with customized categories (Very

Popular)• Random Browsing (Popular for certain

categories: Discovery)• Search by example (Not popular currently)• Search by Sketch (Not popular)• Combined usage models - users skipping to

scan and then pick selective videos to play.

Page 55: Multimedia Information Retrieval: What is it, and why isn't

Opportunities

• Leverage Community• Exploit Structured Meta-Data• Constrain with Context

Page 56: Multimedia Information Retrieval: What is it, and why isn't

Opportunities

Tying into social network modeling for media search

• How do you define social networks?• How do you integrate into video search

indexing/ranking?• How do you filter out the noise from the

authoritative ones?

Page 57: Multimedia Information Retrieval: What is it, and why isn't

Opportunities

Exploit Structured Meta-Data• Video has a number of different editorial

sources; Exploit such structured data• Tailor results to users’ perceived needs

Music Videos, Movies – aka traditional video markets

Page 58: Multimedia Information Retrieval: What is it, and why isn't

Opportunities

Constrain with context• Occam’s razor: Simpler the model, lesser the number of

parameters to train, the more general the model.• Rather than throw in all meta + media feature vectors,

tailor the model;• A decision tree based approach to picking the right

algorithms.• E.g. News broadcast will require certain type of content

features and meta-data.

• Context can also be constrained by user (preferences), device parameters (media type, geo-location)

Page 59: Multimedia Information Retrieval: What is it, and why isn't

Outline• Introduction

• Market Trends

• Video Search

• Opportunities

• Challenges

• Conclusion

Page 60: Multimedia Information Retrieval: What is it, and why isn't

Challenges A man will occasionally stumble over truth, but

usually manages to pick himself up, walk over or around it and carry on.

- Winston Churchill

Page 61: Multimedia Information Retrieval: What is it, and why isn't

Challenges

• Theoretical Frameworks• Leverage CBIR solutions• Learn User Needs/Behavior• Inherit Web Search Problems

Page 62: Multimedia Information Retrieval: What is it, and why isn't

ChallengesTheoretical Frameworks

• Establishing the right theoretical frameworks for video search is still a problem for a variety of reasons:

• Lack of proper classified data• Large dimensionality; Different sources of meta-data.• Difficult to define the correct “perceived relevance” metric

(“Semantic Gap*”)• Cost functions such as “average precision” are not

differentiable & need to be approximated with heuristics• Integrate with Application specific cost functions: E.g.

search: Click through rates, clicks, video views, customer time spent

• Sometimes ranking needs to be based on user buzz, quality and other non-IR metrics

* “Content-Based Image Retrieval at the End of the Early Years”, Arnold Smeulders et al, IEEE PAMI, Dec. 2000

Page 63: Multimedia Information Retrieval: What is it, and why isn't

ChallengesLeverage CBIR

• Video TREC Summary:• Text based retrieval still is much better than non-text based

searches.• Visual queries have poor MAP, but amenable for

interactive search.• Commercial detectors work okay with broadcast news, but

not okay with documentaries with no commercials. (also face detection)

• Visual features are seldom useful: Not without exceptions, edge detectors usually helped only in aircraft and animal categories when color features were not as discriminative.

• ASR was not as useful in tasks such as shot detection (as opposed to indexing whole clip). OCR is more closer to the actual occurrence of the word.

•“Successful Approaches in the TREC Video Retrieval Evaluations”, Alexander Hauptmann, Michael Christel, ACM Multimedia 2004. “Video Retrieval using Speech and Image Information”, Alex Hauptmann et al.

Page 64: Multimedia Information Retrieval: What is it, and why isn't

Challenges

Leverage CBIR• Apply what works in a tailored manner:

• Color, Texture, OCR, Speech Recognition• Classification (in restricted domains)• Copyright infringement prevention• Story-boarding/Shots• Music/Speech Detection

• Apply other techniques in selective domains• Broadcast news: ASR, Segmentation, Speaker Detection• Motion for event detection restricted to time regions

isolated by ASR

Page 65: Multimedia Information Retrieval: What is it, and why isn't

Challenges Overcome the temptation to “solve the

computer vision” problem*.• Yet leverage computer vision techniques• Today, CBIR is mostly image based• Leverage motion flow information more• Apply techniques in compressed domain (or

leverage compressed data- e.g. MPEG-4 already computes local/global motion)

• Strong Image Segmentation is hard- Consistent Segmentation across image sequences in video can be exploited.

* “Guest Introduction: The changing Shape of Computer Vision in the Twenty-First Century”, M. Shah, IJCV, 2002

Page 66: Multimedia Information Retrieval: What is it, and why isn't

ChallengesUser Needs/Behavior

• Understand user needs• Better modeling of query and user intent for video search• Not much work has been done on user intent analysis for

video search

• Leverage User Behavior• Implicit versus explicit relevance feedback• Challenges in tracking user feedback• Blind/Implicit feedback is useful for limited domains

• How do you assess user relevance?• More challenging for video since there is a temporal

duration associated with the actions.• Explicit ratings are an option, but not incentivizable

Page 67: Multimedia Information Retrieval: What is it, and why isn't

Challenges

Inherit Web Search Problems• Meta-data Spam

• Users spam through Web or tagging• Leverage Web Search solutions of spam detection

based on text, linkage analysis, or user disrepute.

• Potential dilution due to “long tail”• How do you balance classic web search

ranking with media based ranking?

Page 68: Multimedia Information Retrieval: What is it, and why isn't

Outline• Introduction

• Market Trends

• Video Search

• Opportunities

• Challenges

• Conclusion

Page 69: Multimedia Information Retrieval: What is it, and why isn't

Conclusion If you look back too much, you will soon be

headed that way.

- Unknown

Page 70: Multimedia Information Retrieval: What is it, and why isn't

ConclusionIn this presentation, we covered:• Market trends that are fuelling large scale

demand globally for video search.• The changing nature of video search

largely driven by the Web.• Discussed different systems and the

techniques that have been applied to media search.

• Identified key areas of opportunities and challenges.

Page 71: Multimedia Information Retrieval: What is it, and why isn't

Conclusion

Video Search is here to stay. We (the attendees) have an opportunity to

define and shape next generation media production, consumption, usage and

ultimately the culture.

Page 72: Multimedia Information Retrieval: What is it, and why isn't

ConclusionMany thanks to:

• MIR Workshop organizers• Qi Tian• Alex Jaimes• ACM Multimedia 2005 Conference Organizers

• Y! Search• John Thrall• Qi Lu• Bradley Horowitz• Video Search team, esp. Ruofei (Bruce) Zhang for help with

references• Y! Blore R&D: Sesha Shah, Srinivasan, YBL (Marc Davis et al) & YRL

(Malcolm Slaney)

- Prof. S.K. Rangarajan for help with quotes

Page 73: Multimedia Information Retrieval: What is it, and why isn't

Conclusion

Thanks to the audience

Questions?

Page 74: Multimedia Information Retrieval: What is it, and why isn't

Appendix: Leveraging Research & Industrial Communities

• Scalable systems deployed and processing millions of media files

• End-user testing with huge volume of traffic• General problem areas covered in R&D (e.g. VideoTREC)

are very relevant• Selective application and proper modelling of information

fusion is key.• How do we share or leverage that data to foster research?• How can industries help?• How can universities help?