dl:lesson 11 multimedia search luca dini [email protected]
Post on 21-Dec-2015
240 views
TRANSCRIPT
![Page 2: DL:Lesson 11 Multimedia Search Luca Dini dini@celi.it](https://reader036.vdocuments.net/reader036/viewer/2022062300/56649d5d5503460f94a3b635/html5/thumbnails/2.jpg)
Definition
A multimedia search tool allows the user to search the Web for non-textual materials such as audio, image, radio, television and video files
The search tool can be a directory, meta search engine or a single search engine
![Page 3: DL:Lesson 11 Multimedia Search Luca Dini dini@celi.it](https://reader036.vdocuments.net/reader036/viewer/2022062300/56649d5d5503460f94a3b635/html5/thumbnails/3.jpg)
Types of multimedia
5 Types of Multimedia Search Engines:– Video– Audio– Radio– TV– Image
![Page 4: DL:Lesson 11 Multimedia Search Luca Dini dini@celi.it](https://reader036.vdocuments.net/reader036/viewer/2022062300/56649d5d5503460f94a3b635/html5/thumbnails/4.jpg)
Evaluation CriteriaScope and Navigation
How big and comprehensive is the index? Is this clearly stated? How often is the index updated? Is this clearly stated? Is the site easy to navigate? Is the screen cluttered? Is the user inundated with advertising and flashy marketing?
Search Tool Does the search tool use spiders to compile their own searchable databases on the Web? Or, does the search tool search the databases of multiple sets of individual search engines?
Search Options Are simple and advanced search options available? Is it possible to conduct Boolean, phrase and wildcard or truncation searches? Are field searches available? Can searches be filtered?
Performance Are the results quickly returned? How well do the results match the query? Are there dead links? Is there any obvious duplication of results?
Presentation How are the search results grouped? Can the search results be customized? How is relevancy determined?
Support Is there a link to help? Is the help information suitable for beginning and advanced searchers?
![Page 5: DL:Lesson 11 Multimedia Search Luca Dini dini@celi.it](https://reader036.vdocuments.net/reader036/viewer/2022062300/56649d5d5503460f94a3b635/html5/thumbnails/5.jpg)
Examples
Multimedia search tools which allow the user to search for video files
– AllTheWeb: www.alltheweb.com– AltaVista: www.altavista.com/video– Lycos: http://multimedia.lycos.com– Singingfish: www.singingfish.com– Dogpile: www.dogpile.com– Fazzle: www.fazzle.com
![Page 6: DL:Lesson 11 Multimedia Search Luca Dini dini@celi.it](https://reader036.vdocuments.net/reader036/viewer/2022062300/56649d5d5503460f94a3b635/html5/thumbnails/6.jpg)
Overview
“Conventional” methods: catalogs, databases and analog previewing
Why digitize? Discovering video structure Automatic and manual indexing Data models & user interfaces Prospects for the future: mobile and web
services
![Page 7: DL:Lesson 11 Multimedia Search Luca Dini dini@celi.it](https://reader036.vdocuments.net/reader036/viewer/2022062300/56649d5d5503460f94a3b635/html5/thumbnails/7.jpg)
Convetional Methods
Structured databases:– AV cataloging (AACR2, MARC 21)– Shot lists– Asset management systems
“Pathfinders” (librarians, archivists) Embedded markers: hints, chapters, scenes (DVD) Video logging systems Hardware browse/skim: FF, slow-mo, etc.
![Page 8: DL:Lesson 11 Multimedia Search Luca Dini dini@celi.it](https://reader036.vdocuments.net/reader036/viewer/2022062300/56649d5d5503460f94a3b635/html5/thumbnails/8.jpg)
Video Search in libaries
Mainly MARC– 245: Title (usually main entry)– 300: Description (physical piece)– 505: Contents– 508: Credits– 511: Performer note– 520: Summary
![Page 9: DL:Lesson 11 Multimedia Search Luca Dini dini@celi.it](https://reader036.vdocuments.net/reader036/viewer/2022062300/56649d5d5503460f94a3b635/html5/thumbnails/9.jpg)
520: Summary
505: Contents
650: Subject headings
![Page 10: DL:Lesson 11 Multimedia Search Luca Dini dini@celi.it](https://reader036.vdocuments.net/reader036/viewer/2022062300/56649d5d5503460f94a3b635/html5/thumbnails/10.jpg)
![Page 11: DL:Lesson 11 Multimedia Search Luca Dini dini@celi.it](https://reader036.vdocuments.net/reader036/viewer/2022062300/56649d5d5503460f94a3b635/html5/thumbnails/11.jpg)
Footage - Opening credits Chocolate factory workers. Alan Coxon and Kathy Sykes preparing food. Man biting into chocolate bar (0'00-0'50")Alan opening fridge and walking over to Kathy at table.Kathy grating orange. Alan showing ingredients for cheesecake. Cooking chocolate. Alan and Kathy breaking chocolate and smelling it.Breaking chocolate.Kathy tasting chocolate (0'51"-"2'00“)
![Page 12: DL:Lesson 11 Multimedia Search Luca Dini dini@celi.it](https://reader036.vdocuments.net/reader036/viewer/2022062300/56649d5d5503460f94a3b635/html5/thumbnails/12.jpg)
Browse and Skim: Media PlayersDVD player clones; can be enhanced with SDKs
Media Players are DECODERS Pause, FF, rewind Variable speed Navigate menus, chapters, tracks Insert markers Change audio subtitles Show closed captioning Shuttle/scrub
![Page 13: DL:Lesson 11 Multimedia Search Luca Dini dini@celi.it](https://reader036.vdocuments.net/reader036/viewer/2022062300/56649d5d5503460f94a3b635/html5/thumbnails/13.jpg)
Media Player Example
Start, stop, pause, rewind to beginning, FF to end, advance by frame
File markers; added by end-user
Play speed settings: 0.5 >> 3X
![Page 14: DL:Lesson 11 Multimedia Search Luca Dini dini@celi.it](https://reader036.vdocuments.net/reader036/viewer/2022062300/56649d5d5503460f94a3b635/html5/thumbnails/14.jpg)
What is video
Authored video has: Series of still images @25-30 fps Structure: frames >> shots >> scenes MODALITIES
– (Audio tracks)– (Text: captioning, subtitles, etc.)– (Graphics: logos, running tickers etc.)
Production metadata: timestamp, datestamp, flash on/off
![Page 15: DL:Lesson 11 Multimedia Search Luca Dini dini@celi.it](https://reader036.vdocuments.net/reader036/viewer/2022062300/56649d5d5503460f94a3b635/html5/thumbnails/15.jpg)
Advantages of Digital Video
Store and deliver over networksAllow analysis by computers
Allow auto & manual indexingUSING:
– Image processing – Signal processing– Information visualization
![Page 16: DL:Lesson 11 Multimedia Search Luca Dini dini@celi.it](https://reader036.vdocuments.net/reader036/viewer/2022062300/56649d5d5503460f94a3b635/html5/thumbnails/16.jpg)
Why Compress Video?
– 1 frame (@TV brightness) = 0.9 megabytes (MB) of storage
– At 29 fps, each second = 26.1 MB of storage– 30 minute film = 53 gigabytes (GB) of storage
OBJECT: Make file smaller; retain as much information as possible
![Page 17: DL:Lesson 11 Multimedia Search Luca Dini dini@celi.it](https://reader036.vdocuments.net/reader036/viewer/2022062300/56649d5d5503460f94a3b635/html5/thumbnails/17.jpg)
Encoding Formats
These formats use some kind of compression; similar encoding methods—many CODECS—some “lossy,” others “lossless”
AVI: audio-video interleave or interactive QuickTime MPEG family: MPEG-1, 2, 4 H261: for video conferencing New: H264; JPEG 2000
![Page 18: DL:Lesson 11 Multimedia Search Luca Dini dini@celi.it](https://reader036.vdocuments.net/reader036/viewer/2022062300/56649d5d5503460f94a3b635/html5/thumbnails/18.jpg)
CODECS
Compressor/Decompressor, or Coder/Decoder Produce and work with encoding formats. Central to compression and encoding; perform signal
and image processing tasks Examples: Cinepak, Indeo, Windows Media Video. MPEG-4: DivX, Xvid, 3ivX implementations of certain
compression recommendations of MPEG-4.
![Page 19: DL:Lesson 11 Multimedia Search Luca Dini dini@celi.it](https://reader036.vdocuments.net/reader036/viewer/2022062300/56649d5d5503460f94a3b635/html5/thumbnails/19.jpg)
How Do CODECS Work?
Movement creates “temporal aliasing”: human eye/brain fills in the gaps
Blurring produced by camera shutter softens edges
Modeled by CODECS and algorithms Goal: acceptable facsimile of moving scene
![Page 20: DL:Lesson 11 Multimedia Search Luca Dini dini@celi.it](https://reader036.vdocuments.net/reader036/viewer/2022062300/56649d5d5503460f94a3b635/html5/thumbnails/20.jpg)
Example
Jermyn, I. Psychovisual Evaluation of Image Database Retrieval and Image Segmentation
![Page 21: DL:Lesson 11 Multimedia Search Luca Dini dini@celi.it](https://reader036.vdocuments.net/reader036/viewer/2022062300/56649d5d5503460f94a3b635/html5/thumbnails/21.jpg)
Encoding Methods: predictive
Sampling: value of function @ regular intervals (example: brightness of pixels)
Quantization: frequency of sampling (1 in 10 vs. 1 in 100 frames)
Discrete cosine transforms (DCT) an array of data (not just one pixel) is transformed into another set of values.
Inter-frame vs. Intra-frame encoding
![Page 22: DL:Lesson 11 Multimedia Search Luca Dini dini@celi.it](https://reader036.vdocuments.net/reader036/viewer/2022062300/56649d5d5503460f94a3b635/html5/thumbnails/22.jpg)
Video Compression
Repetition and Patterns– Say we have a video of a horse running. The movement of the
horse’s legs is the same step after step.
– The codec recognizes this repetition of a string of numbers over several frames that form a repeated pattern (the horse’s moving legs).
– The codec saves this data using a “ditto, 319 more times”, or using a single number token to represent the repeated pattern.
– Relatively lossless
![Page 23: DL:Lesson 11 Multimedia Search Luca Dini dini@celi.it](https://reader036.vdocuments.net/reader036/viewer/2022062300/56649d5d5503460f94a3b635/html5/thumbnails/23.jpg)
Video Compression
Averaging– Same method a JPEG uses
Looks at a block of pixels and averages their color and brightness, saving one number rather than 4, 9, 16, etc.
![Page 24: DL:Lesson 11 Multimedia Search Luca Dini dini@celi.it](https://reader036.vdocuments.net/reader036/viewer/2022062300/56649d5d5503460f94a3b635/html5/thumbnails/24.jpg)
Video Compression
Range Reduction– The range in brightness of an original video might be on a scale of 1 to 500,
meaning that the lightest part of the sky is 500 times brighter than the darkest shadow under a forest.
– This brightness is recorded as a number for each pixel in each frame
– Wide range of brightness requires that a big number - 8 or 16 bits - be recorded for each pixel.
– The codec reduces the range to a scale of, say, 1 to 100. The sky is still brighter, but only 100 times or so brighter.
– Because the number is saved millions of times in the movie file, reducing its size can delete lots of data from the file…most people will not notice this range reduction.
![Page 25: DL:Lesson 11 Multimedia Search Luca Dini dini@celi.it](https://reader036.vdocuments.net/reader036/viewer/2022062300/56649d5d5503460f94a3b635/html5/thumbnails/25.jpg)
Video Compression
Frame-Difference– For the first frame of a video, every pixel is recorded.
For subsequent frames, only those pixels that have changed are recorded.
![Page 26: DL:Lesson 11 Multimedia Search Luca Dini dini@celi.it](https://reader036.vdocuments.net/reader036/viewer/2022062300/56649d5d5503460f94a3b635/html5/thumbnails/26.jpg)
Video Structure
Video
Scene
Shot
Frame
![Page 27: DL:Lesson 11 Multimedia Search Luca Dini dini@celi.it](https://reader036.vdocuments.net/reader036/viewer/2022062300/56649d5d5503460f94a3b635/html5/thumbnails/27.jpg)
Keyframes
The MPEG-4 codec used to encode this video clip allows for “keyframes” to be inserted at fairly short intervals.
Keyframes are are frames in which all the information has been encoded. We can navigate from keyframe to keyframe, gaining some quick info about the content.
Algorithms can be written to “grab” keyframes to create a storyboard, which can be used to make a visual index of the video.
![Page 28: DL:Lesson 11 Multimedia Search Luca Dini dini@celi.it](https://reader036.vdocuments.net/reader036/viewer/2022062300/56649d5d5503460f94a3b635/html5/thumbnails/28.jpg)
Shot Boundary Detection
Algorithms that compare the similarities between nearby frames. When the similarities fall below a pre-determined level, the limit of a “shot” is automatically defined:
Edge detection Compare color histograms Compare motion vectors
![Page 29: DL:Lesson 11 Multimedia Search Luca Dini dini@celi.it](https://reader036.vdocuments.net/reader036/viewer/2022062300/56649d5d5503460f94a3b635/html5/thumbnails/29.jpg)
Spatial & Temporal Segmentation
1. Use shot boundary detection and keyframes to define shots & choose representative frames
2. Use CBIR (Content-based Image Retrieval) techniques to reveal features in representative frames
(shapes, colors, textures)
![Page 30: DL:Lesson 11 Multimedia Search Luca Dini dini@celi.it](https://reader036.vdocuments.net/reader036/viewer/2022062300/56649d5d5503460f94a3b635/html5/thumbnails/30.jpg)
CBIR Techniques
Images (frames) have no inherent semantic meaning: only arrays of pixel intensities– Color Retrieval: compare histograms– Texture Retrieval: relative brightness of pixel pairs– Shape Retrieval: Humans recognize objects
primarily by their shape – Retrieval by position within the image