Post on 13-Jan-2016
Text Categorization and Images
Text Categorization
• Text categorization (TC) refers to the automatic labeling of documents, using natural language text contained in or associated with each document, into one or more pre-defined categories.
• TC techniques can be applied to image captions to label the corresponding images.
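As a rough illustration of labeling images through their captions, the sketch below trains a tiny bag-of-words Naive Bayes classifier on a handful of captions and predicts Indoor or Outdoor for a new one. The slides do not say which learning algorithm was used, so Naive Bayes is only a stand-in here, and the class name, training captions, and labels are all invented for the example.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the caption and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


class NaiveBayesCaptionClassifier:
    """Multinomial Naive Bayes over caption words (hypothetical helper)."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # label -> word frequencies
        self.doc_counts = Counter()              # label -> number of captions
        self.vocab = set()

    def fit(self, captions, labels):
        for caption, label in zip(captions, labels):
            tokens = tokenize(caption)
            self.word_counts[label].update(tokens)
            self.doc_counts[label] += 1
            self.vocab.update(tokens)
        return self

    def predict(self, caption):
        tokens = tokenize(caption)
        total_docs = sum(self.doc_counts.values())
        best_label, best_score = None, -math.inf
        for label in self.doc_counts:
            # Log prior plus add-one smoothed log likelihood of each word.
            score = math.log(self.doc_counts[label] / total_docs)
            total_words = sum(self.word_counts[label].values())
            for token in tokens:
                count = self.word_counts[label][token]
                score += math.log((count + 1) / (total_words + len(self.vocab)))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Made-up training captions; a real system would use a labeled corpus.
train_captions = [
    "Leaders meet inside the public library",
    "Delegates sit in a conference room during the meeting",
    "A train lies in the mud near a river",
    "Hikers cross a bridge over the river at the edge of a marsh",
]
train_labels = ["Indoor", "Indoor", "Outdoor", "Outdoor"]

classifier = NaiveBayesCaptionClassifier().fit(train_captions, train_labels)
print(classifier.predict("The meeting room in the library"))
print(classifier.predict("A train near the river bridge"))
```

The predicted caption label is then assigned to the image the caption belongs to; no pixel data is involved at this stage.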
Outdoor Indoor
Clues for Indoor/Outdoor: Text (as opposed to Vision)
Denver Summit of Eight leaders begin their first official meeting in the Denver Public Library, June 21.
The two engines of an Amtrak passenger train lie in the mud at the edge of a marsh after the train, bound for Boston from Washington, derailed on the bank of the Hackensack River, just after crossing a bridge.
NewsBlaster Categories
Entertainment, Science/Technology, Sports, U.S. News, World News, Finance
Events Categories
Politics, Struggle, Disaster, Crime, Other
Subcategories for Disaster Images
Politics, Struggle, Disaster, Crime, Other
Category F1
Politics 89%
Struggle 88%
Disaster 97%
Crime 90%
Other 59%
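For readers unfamiliar with the measure, the sketch below computes per-category F1 under its standard definition, the harmonic mean of precision and recall. The slides do not describe the evaluation setup, so the function name and the label lists are made up for illustration and do not reproduce the figures above.

```python
def precision_recall_f1(true_labels, predicted_labels, category):
    """Return (precision, recall, F1) for one category, treated as the positive class."""
    pairs = list(zip(true_labels, predicted_labels))
    tp = sum(1 for t, p in pairs if t == category and p == category)
    fp = sum(1 for t, p in pairs if t != category and p == category)
    fn = sum(1 for t, p in pairs if t == category and p != category)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    # F1 is the harmonic mean of precision and recall.
    return precision, recall, 2 * precision * recall / (precision + recall)


# Hypothetical gold and predicted labels for five images.
gold = ["Disaster", "Disaster", "Crime", "Other", "Disaster"]
predicted = ["Disaster", "Crime", "Crime", "Disaster", "Disaster"]

p, r, f1 = precision_recall_f1(gold, predicted, "Disaster")
print(f"precision={p:.2f} recall={r:.2f} F1={f1:.2f}")
```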
Affected People, Other, Wreckage, Workers Responding
Disaster Image Categories
Affected People, Other, Wreckage, Workers Responding
Words are Ambiguous: Workers Responding vs. Affected People
Philippine rescuers carry a fire victim March 19 who perished in a blaze at a Manila disco.
Hypothetical alternative caption: A fire victim who perished in a blaze at a Manila disco is carried by Philippine rescuers March 19.
Workers Responding Affected People
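One way to read this example is that a word-level representation can barely tell the two captions apart, even though their grammatical subjects differ. The sketch below compares the sets of words in the original and the hypothetical caption; the tokenizer and variable names are assumptions made for this illustration, not part of the original system.

```python
import re


def word_set(caption):
    """Lowercase the caption and return its set of word and number tokens."""
    return set(re.findall(r"[a-z0-9]+", caption.lower()))


original = ("Philippine rescuers carry a fire victim March 19 "
            "who perished in a blaze at a Manila disco.")
alternative = ("A fire victim who perished in a blaze at a Manila disco "
               "is carried by Philippine rescuers March 19.")

original_words = word_set(original)
alternative_words = word_set(alternative)

shared = original_words & alternative_words
only_original = original_words - alternative_words
only_alternative = alternative_words - original_words

# Jaccard similarity: shared words divided by all distinct words.
jaccard = len(shared) / len(original_words | alternative_words)

print("shared:", sorted(shared))
print("only in original:", sorted(only_original))
print("only in alternative:", sorted(only_alternative))
print(f"Jaccard similarity: {jaccard:.2f}")
```

The captions differ only in the verb form and two function words, so a classifier that looks at words alone has little to separate them; information about which noun is the subject would have to come from syntactic analysis.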
Collect Labels to Train Systems
Contributions of My Research
• Applied text categorization (TC) techniques to images using associated text.
• Created a corpus, which I hope to make public.
• Introduced two novel TC approaches.
• Integrated NLP with traditional approaches.
• Explored combination of approaches.
• Combined text and image features.