HYP Progress Update By Zhao Jin

Upload: agatha-owen

Post on 01-Jan-2016


HYP Progress Update

By Zhao Jin

Outline

• Background

• Progress Update

Background

• Query (text-based)
  – The set of keywords entered into the system to retrieve the desired information or resources
  – Main categories
    • Traditional IR
    • Web (e.g. Google)
    • OPAC (e.g. LINC)
    • Video (e.g. TRECVID)

Background

• Query analysis
  – To analyze the patterns and hidden information in queries
  – To classify and support such queries efficiently

Progress update

• Mid-May to early June
  – Background reading
  – Around 30 to 40 papers on various topics
  – Summarized the key points of each paper

Progress update

• Mid-June to late June
  – Log analysis
    • BBC video queries
    • NUS OPAC queries
  – Background reading on OPAC and TRECVID

Progress update

• July to now
  – Follow-up on two main topics
    • Query classification and division into content-based and feature-based keywords (OPAC)
    • Identifying ASR-oriented keywords in a video query (TRECVID)
  – Background reading on MARC, WordNet and LOC subject headings

Progress update

• Plan for the near future
  – Refine and experiment with the current ideas
  – Log analysis
  – Background reading (textbooks and related papers)
  – Preparation for implementation

Q&A?

End of progress update

• Thank you for your attention!

Two types of keywords

• Content-Based Keyword (CBK)
  – Keywords that concern what the item is about
  – E.g. title, subject heading
• Feature-Based Keyword (FBK)
  – Keywords that concern the features of the item
  – E.g. author, publisher, genre, medium

Benefits

• Benefits:
  – Faster retrieval
  – More precise retrieval
  – Help in relevance ranking

Possible implementation

• Possible implementation:
  – Term co-occurrence for concept division
  – List of special words and machine learning for FBK and CBK division
  – WordNet for classification among CBKs
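As an illustration of the "list of special words" idea above, here is a minimal sketch that splits a query into CBKs and FBKs by dictionary lookup. The cue words, the feature labels and the function name `split_query` are assumptions made for this example and are not taken from the slides; the machine-learning part is not shown.

```python
# Hypothetical cue words mapped to the feature they signal.
# A real list would be derived from the OPAC query logs.
FBK_CUES = {
    "by": "author",
    "author": "author",
    "publisher": "publisher",
    "dvd": "medium",
    "cd": "medium",
    "ebook": "medium",
    "novel": "genre",
    "textbook": "genre",
}


def split_query(query):
    """Split a query into content-based and feature-based keywords."""
    cbk, fbk = [], []
    for token in query.lower().split():
        if token in FBK_CUES:
            # Keep the feature label so FBKs can be classified further.
            fbk.append((token, FBK_CUES[token]))
        else:
            cbk.append(token)
    return cbk, fbk


cbk, fbk = split_query("information retrieval textbook dvd")
print(cbk)  # ['information', 'retrieval']
print(fbk)  # [('textbook', 'genre'), ('dvd', 'medium')]
```

A plain lookup like this cannot handle ambiguous words (a word such as "novel" may be content or feature), which is presumably where the machine-learning component would come in.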

Possible implementation

• Possible implementation:
  – CL and IL search algorithms for the actual searching with CBKs
  – List of special words and machine learning for classification among FBKs
  – MARC record search algorithms for the actual searching with FBKs
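To make the FBK search over MARC records more concrete, the sketch below matches a classified FBK against the corresponding MARC field. It assumes records flattened to one string per tag, which real MARC records are not (they carry indicators and subfields); the records, the `FEATURE_TO_TAG` mapping and the function name are hypothetical.

```python
# MARC 21 tags used here: 100 = main entry (personal name),
# 245 = title statement, 260 = publication information.
FEATURE_TO_TAG = {"author": "100", "publisher": "260"}

# Hypothetical records, flattened to one string per tag for illustration.
RECORDS = [
    {"100": "Tan, A.", "245": "Library catalogue basics",
     "260": "Singapore : Example Press"},
    {"100": "Lee, B.", "245": "Searching video archives",
     "260": "Singapore : Sample House"},
]


def search_by_feature(records, feature, value):
    """Return records whose field for `feature` contains `value`."""
    tag = FEATURE_TO_TAG[feature]
    needle = value.lower()
    # Case-insensitive substring match; records lacking the tag never match.
    return [r for r in records if needle in r.get(tag, "").lower()]


for record in search_by_feature(RECORDS, "author", "tan"):
    print(record["245"])  # Library catalogue basics
```

Restricting the match to one field is what would give the faster and more precise retrieval claimed for the FBK/CBK split: an author name is only compared against author fields.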


Means to retrieve shots

• Example:
  – To find shots of “Bill Clinton”
    • Face recognition
    • Closed captions
    • Automatic Speech Recognition (ASR)
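The closed-caption and ASR routes both reduce to text search over per-shot transcripts. A minimal sketch, assuming each shot already has a transcript string; the shot IDs, the transcripts and the function name `find_shots` are invented for illustration.

```python
# Hypothetical per-shot transcripts (from closed captions or ASR output).
SHOTS = {
    "shot_001": "president bill clinton spoke at the white house today",
    "shot_002": "heavy rain is expected across the region",
    "shot_003": "clinton met with reporters after the speech",
}


def find_shots(shots, query):
    """Return IDs of shots whose transcript contains every query word."""
    words = query.lower().split()
    return sorted(
        shot_id
        for shot_id, text in shots.items()
        if all(w in text.lower().split() for w in words)
    )


print(find_shots(SHOTS, "Bill Clinton"))  # ['shot_001']
print(find_shots(SHOTS, "Clinton"))       # ['shot_001', 'shot_003']
```

This only finds shots where the name is actually spoken or captioned, which is why it matters whether a keyword is ASR-oriented in the first place.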

Metrics

• Common vs. special (in reality)
  – How common in reality the concept represented by the keyword is
• Generic vs. specific
  – How generic the concept represented by the keyword is
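One possible proxy for the generic vs. specific metric is the depth of a word in a hypernym hierarchy such as WordNet: the deeper the word, the more specific the concept. The sketch below uses a tiny hand-made hierarchy instead of WordNet itself; the hierarchy and the function names are assumptions, and the slides do not state how the metric would actually be computed.

```python
# Toy hypernym hierarchy: each word maps to its more generic parent.
# A stand-in for a WordNet lookup, for illustration only.
HYPERNYM = {
    "entity": None,
    "animal": "entity",
    "dog": "animal",
    "poodle": "dog",
    "vehicle": "entity",
    "car": "vehicle",
}


def depth(word):
    """Number of hypernym steps from `word` up to the root."""
    steps = 0
    while HYPERNYM[word] is not None:
        word = HYPERNYM[word]
        steps += 1
    return steps


def more_specific(a, b):
    """Return whichever word sits deeper in the hierarchy (a on ties)."""
    return a if depth(a) >= depth(b) else b


print(depth("animal"))                   # 1
print(depth("poodle"))                   # 3
print(more_specific("animal", "poodle")) # poodle
```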

Metrics

• Concrete vs. abstract
  – Whether the concept represented by the keyword is concrete or abstract
• Topic frequency (low vs. high)
  – How often the keyword becomes, or is closely related to, a topic

Metrics

• Formal vs. informal
  – Whether the keyword is in formal or informal language
• Written vs. spoken
  – Whether the keyword is in written or spoken language

Metrics

• Feature-level vs. content-level
  – Whether the keyword is about a feature of the video (e.g. camera motion) or the content of the video
