multimedia multimedia bible

Upload: stephen-kimani

Post on 10-Feb-2018


  • 7/22/2019 Multimedia multimedia Bible

    1/163

    Multimedia BSc Exam 2000 SOLUTIONS

    Setter: ADM

    CheckerACJ

    Additional Material: SMIL Language Description Sheet

    Answer 3 Questions out of 4

1. (a) What is meant by the terms Multimedia and Hypermedia? Distinguish between these two concepts.

    Multimedia ---- An Application which uses a collection of multiple media sources

    e.g. text, graphics, images, sound/audio, animation and/or video.

Hypermedia --- An application which uses associative relationships among information contained within multiple media data for the purpose of facilitating

    access to, and manipulation of, the information encapsulated by the data.

    2 MARKS ---- BOOKWORK

    (b) What is meant by the terms static media and dynamic media? Give two examples

    of each type of media.

    Static Media does not change over time, e.g. text, graphics

    Dynamic Media --- Time dependent (Temporal), e.g. Video, sound, animation.

    4 MARKS --- BOOKWORK

    (c) What issues of functionality need to be provided in order to effectively use a wide

    variety of media in Multimedia applications? Your answer should briefly address how

such functionality can be facilitated in general Multimedia applications.

    The following functionality should be provided:

    -Digital Representation of Media --- Many formats for many media

- Capture: Digitisation of Media --- special Hardware/Software
- Creation and editing --- assemble media and alter it
- Storage Requirements --- significant for multimedia
- Compression --- related to above and below, ie can save on storage but can

    hinder retrieval

- Structuring and retrieval methods of media --- simple to advanced DataBase Storage


- Display or Playback methods --- effect of retrieval, must view data
- Media Synchronisation --- display multimedia as it is intended

9 MARKS --- BOOKWORK

    (d) Different types of media will require different types of supporting operations to

provide adequate levels of functionality. For the examples of static and dynamic media given in your answer to part 1(b) briefly discuss what operations are needed to

    support a wide range of multimedia applications.

A selection of the items below is required for good marks, NOT ALL. Other Solns Possible.

    Typical Range of operations required for common media

    Text: Editing

    Formatting

Sorting

Indexing

    Searching

    Encrypting

ABOVE REQUIRE: Character Manipulation

    String Manipulation

Audio: Audio Editing

Synchronisation

Conversion/Translation

Filtering/Sound Enhancing Operators

Compression

Searching

Indexing

ABOVE REQUIRE:

Sample Manipulation

Waveform Manipulation

Graphics: Graphic primitive Editing

Shading

Mapping

Lighting

Viewing

Rendering

Searching

Indexing

ABOVE REQUIRE: Primitive Manipulation

    Structural/Group Manipulation


Image: Pixel operations

Geometric Operations

Filtering

Conversion

Indexing

Compression

    Searching

Animation: Primitive/Group Editing

Structural Editing

Rendering

Synchronisation

Searching

Indexing

Video: Pixel Operations

Frame Operations

Editing

Synchronisation

Conversion

Mixing

Indexing

    Searching

    Video Effects/Filtering

    12 MARKS --- UNSEEN


    2. (a) Why is file or data compression necessary for Multimedia activities?

    Multimedia files are very large therefore for storage, file transfer etc. file sizes need to

    be reduced. Text and other files may also be encoded/compressed for email and other

    applications.

    2 MARKS --- BOOKWORK

(b) Briefly explain, clearly identifying the differences between them, how entropy coding and transform coding techniques work for data compression. Illustrate your

    answer with a simple example of each type.

    Compression can be categorised in two broad ways:

    Lossless Compression

-- where data is compressed and can be reconstituted (uncompressed) without loss of detail or information. These are also referred to as bit-preserving or reversible compression systems.

    Lossy Compression

    -- where the aim is to obtain the best possible fidelity for a given bit-rate

or minimizing the bit-rate to achieve a given fidelity measure. Video and audio compression techniques are most suited to this form of

    compression.

Lossless compression frequently involves some form of entropy encoding and is based on information-theoretic techniques.

Lossy compression uses source encoding techniques that may involve transform encoding, differential encoding or vector quantisation.

    ENTROPY METHODS:

    The entropy of an information source S is defined as:

H(S) = SUM_i ( p_i log2 (1/p_i) )

where p_i is the probability that symbol S_i in S will occur.

log2 (1/p_i) indicates the amount of information contained in S_i, i.e., the number of bits needed to code S_i.

    Encoding for the Shannon-Fano Algorithm:

    A top-down approach


    1. Sort symbols according to their frequencies/probabilities, e.g., ABCDE.

2. Recursively divide into two parts, each with approximately the same number of counts. (The Huffman algorithm, indicated below, is also valid.)

    A simple transform coding example

A Simple Transform Encoding procedure may be described by the following

    steps for a 2x2 block of monochrome pixels:

1. Take the top left pixel as the base value for the block, pixel A.
2. Calculate three other transformed values by taking the difference between these (respective) pixels and pixel A, i.e. B-A, C-A, D-A.
3. Store the base pixel and the differences as the values of the transform. Given the above we can easily form the forward transform:

    and the inverse transform is trivial

The above transform scheme may be used to compress data by exploiting redundancy in the data: any redundancy in the data has been transformed to the difference values, Xi, so we can compress the data by using fewer bits to represent the differences. I.e. if we use 8 bits per pixel then the 2x2 block uses 32 bits. If we keep 8 bits for the base pixel, X0, and assign 4 bits for each difference then we only use 20 bits, an average of 5 bits/pixel instead of 8.

    8 MARKS --- BOOKWORK

    (c) (i) Show how you would use Huffman coding to encode the following set of tokens:

    BABACACADADABBCBABEBEDDABEEEBB

    How is this message transmitted when encoded?

    The Huffman algorithm is now briefly summarised:

    1. Initialization: Put all nodes in an OPEN list, keep it sorted at all times

    (e.g., ABCDE).


    2. Repeat until the OPEN list has only one node left:

    (a) From OPEN pick two nodes having the lowest

    frequencies/probabilities, create a parent node of them.

    (b) Assign the sum of the children's frequencies/probabilities to the

    parent node and insert it into OPEN.

(c) Assign code 0, 1 to the two branches of the tree, and delete the children from OPEN.

Symbol Count OPEN (1) OPEN (2) OPEN (3) OPEN (4)
A 8 18 30
B 10 -
C 3 7 12 -
D 4 -
E 5 -
Total 30

- indicates the node was merged with another node; the merged count appears in the column of the step at which the merge happened.


Finished Huffman Tree

    Symbol Code

A 01
B 00

    C 101

    D 100

    E 11

    How is this message transmitted when encoded?

    Send code book and then bit code for each symbol.

    7 Marks --- UNSEEN


(ii) How many bits are needed to transfer this coded message and what is its

    Entropy?

    Symbol Count Subtotal # of bits

    A 8 16

B 10 20
C 3 9

    D 4 12

    E 5 10

    Total Number bits (excluding code book) = 67

Entropy (average bits per symbol of the coded message) = 67/30 = 2.2333

4 MARKS --- UNSEEN

(iii) What amendments are required to this coding technique if data is generated live or is otherwise not wholly available? Show how you could use this modified scheme by adding the tokens ADADA to the above message.

    Adaptive method needed:

    Basic idea (encoding)

Initialize_model();
while ((c = getc(input)) != eof) {
    encode(c, output);
    update_model(c);
}

    So encode message as before:

    A= 01 D = 100

So add the stream:

    011000110001


Modify Tree:

Symbol Count OPEN (1) OPEN (2) OPEN (3) OPEN (4)
A 11 21 35
B 10 -
C 3 8 14 -
D 6 -
E 5 -

    Huffman tree drawn as before but different.

    6 Marks --- UNSEEN


    3 (a) What are the major factors to be taken into account when considering what

    storage requirements are necessary for Multimedia Systems?

    Major factors:

Large volume of data
Real-time delivery

    Data format

    Storage Medium

    Retrieval mechanisms

    4 MARKS --- Unseen/applied bookwork

    (b) What is RAID technology and what advantages does it offer as a medium for

    the storage and delivery of large data?

    RAID --- Redundant Array of Inexpensive Disks

    Offers:

    Affordable alternative to mass storage

    High throughput and reliability

    RAID System:

    Set of disk drives viewed by user as one or more logical drives

    Data may be distributed across drivesRedundancy added in order to allow for disk failure

    4 MARKS --- BOOKWORK

(c) Briefly explain the eight levels of RAID functionality.

Level 0 - Disk Striping --- distributing data across multiple drives
Level 1 - Disk Mirroring --- Fault tolerancing
Level 2 - Bit Interleaving and HEC Parity
Level 3 - Bit Interleaving with XOR Parity
Level 4 - Block Interleaving with XOR Parity
Level 5 - Block Interleaving with Parity Distribution
Level 6 - Fault Tolerant System --- Error recovery
Level 7 - Heterogeneous System --- Fast access across whole system

    8 MARKS --- BOOKWORK

(d) A digital video file is 40 MB in size. The disk subsystem has four drives and the controller is designed to support read and write onto each drive concurrently. The digital video is stored using the disk striping concept. A block size of 8 KB is used for each I/O operation.


    (i) What is the performance improvement in sequentially reading the

    complete file when compared to a single drive subsystem in terms of the number

    of operations performed?

We have 5120 segments to write to the RAID disks. Given 4 disks we have 1280 actual I/Os to perform.

    On 1 drive we clearly have 5120 operations to perform.

    (ii) What is the percentage performance improvement expressed as the

    number of physical I/O operations to be executed in on the RAID and single

    drive systems?

The improvement is

(5120 - 1280)/1280 * 100 = 300%. Obvious given 4 concurrent drives and RAID!!

    11 MARKS --- UNSEEN


    4 (a) Give a definition of a Multimedia Authoring System. What key features

    should such a system provide?

An Authoring System is a program which has pre-programmed elements for the development of interactive multimedia software titles. Authoring systems vary widely in orientation, capabilities, and learning curve.

There is no such thing (at this time) as a completely point-and-click automated authoring system; some knowledge of heuristic thinking and algorithm design is necessary.

Authoring is basically just a speeded-up form of programming --- VISUAL PROGRAMMING; you don't need to know the intricacies of a programming language, or worse, an API, but you do need to understand how programs work.

    2 MARKS ---- BOOKWORK

    (b) What Multimedia Authoring paradigms exist? Describe each paradigm

    briefly.

    There are various paradigms, including:

    Scripting Language

The Scripting paradigm is the authoring method closest in form to traditional programming. The paradigm is that of a programming language, which specifies (by filename) multimedia elements, sequencing, hotspots, synchronization, etc. A powerful, object-oriented scripting language is usually the centerpiece of such a system; in-program editing of elements (still graphics, video, audio, etc.) tends to be minimal or non-existent. Scripting languages do vary; check out how much the language is object-based or object-oriented. The scripting paradigm tends to be longer in development time (it takes longer to code an individual interaction), but generally more powerful interactivity is possible. Since most scripting languages are interpreted, instead of compiled, the runtime speed gains over other authoring methods are minimal.

The media handling can vary widely; check out your system with your contributing package formats carefully. Apple's HyperTalk for HyperCard, Asymetrix's OpenScript for ToolBook and the Lingo scripting language of Macromedia Director are examples of multimedia scripting languages.

    Here is an example lingo script to jump to a frame

    global gNavSprite

    on exitFrame


go the frame
play sprite gNavSprite
end

    Iconic/Flow Control

This tends to be the speediest (in development time) authoring style; it is best suited for rapid prototyping and short-development-time projects. Many of these tools are also optimized for developing Computer-Based Training (CBT). The core of the paradigm is the Icon Palette, containing the possible functions/interactions of a program, and the Flow Line, which shows the actual links between the icons. These programs tend to have the slowest runtimes, because each interaction carries with it all of its possible permutations; the higher-end packages, such as Authorware or IconAuthor, are extremely powerful and suffer least from runtime speed problems.

    Frame

The Frame paradigm is similar to the Iconic/Flow Control paradigm in that it usually incorporates an icon palette; however, the links drawn between icons are conceptual and do not always represent the actual flow of the program. This is a very fast development system, but requires a good auto-debugging function, as it is visually un-debuggable. The best of these have bundled compiled-language scripting, such as Quest (whose scripting language is C) or Apple Media Kit.

    Card/Scripting

The Card/Scripting paradigm provides a great deal of power (via the incorporated scripting language) but suffers from the index-card structure. It is excellently suited for Hypertext applications, and supremely suited for navigation-intensive (a la Cyan's MYST game) applications. Such programs are easily extensible via XCMDs and DLLs; they are widely used for shareware applications. The best applications allow all objects (including individual graphic elements) to be scripted; many entertainment applications are prototyped in a card/scripting system prior to compiled-language coding.

    Cast/Score/Scripting

The Cast/Score/Scripting paradigm uses a music score as its primary authoring metaphor; the synchronous elements are shown in various horizontal tracks, with simultaneity shown via the vertical columns. The true power of this metaphor lies in the ability to script the behavior of each of the cast members. The most popular member of this paradigm is Director, which is used in the creation of many commercial applications. These programs are best suited for animation-intensive or synchronized media applications; they are easily extensible to handle other functions (such as hypertext) via XOBJs, XCMDs, and DLLs.

Macromedia Director uses this paradigm.


    Hierarchical Object

The Hierarchical Object paradigm uses an object metaphor (like OOP) which is visually represented by embedded objects and iconic properties. Although the learning curve is non-trivial, the visual representation of objects can make very complicated constructions possible.

Hypermedia Linkage

The Hypermedia Linkage paradigm is similar to the Frame paradigm in that it shows conceptual links between elements; however, it lacks the Frame paradigm's visual linkage metaphor.

Tagging

    The Tagging paradigm uses tags in text files (for instance, SGML/HTML, SMIL

    (Synchronised Media Integration Language), VRML, 3DML and WinHelp) to link

    pages, provide interactivity and integrate multimedia elements.

    8 Marks --- BOOKWORK

(c) You have been asked to provide a Multimedia presentation that can support media in both English and French. You may assume that you

    have been given a sequence of 10 images and a single 50 second digitised

    audio soundtrack in both languages. Each Image should be mapped over

    consecutive 5 second fragments of the audio. All Images are of the same

    500x500 pixel dimension.

    Describe, giving suitable code fragments, how you would assemble such a

    presentation using SMIL. Your solution should cover all aspects of the

    SMIL presentation

    .


<audio system-language="en" src="english.au"/>
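Most of the SMIL listing for this answer was lost in transcription; only the English audio line above survives. The following is an illustrative reconstruction of the kind of document the question asks for. The filenames (img1.jpg ... img10.jpg, french.au) and the region name are assumptions; the language test uses the same system-language attribute as the surviving line:

```xml
<smil>
  <head>
    <layout>
      <root-layout width="500" height="500"/>
      <region id="slides" left="0" top="0" width="500" height="500"/>
    </layout>
  </head>
  <body>
    <par>
      <!-- one soundtrack is chosen to match the user's language preference -->
      <switch>
        <audio system-language="en" src="english.au"/>
        <audio system-language="fr" src="french.au"/>
      </switch>
      <!-- ten images, 5 seconds each, shown back to back over the 50 s audio -->
      <seq>
        <img src="img1.jpg" region="slides" dur="5s"/>
        <img src="img2.jpg" region="slides" dur="5s"/>
        <!-- ... img3.jpg to img9.jpg, each with dur="5s" ... -->
        <img src="img10.jpg" region="slides" dur="5s"/>
      </seq>
    </par>
  </body>
</smil>
```

The switch picks whichever soundtrack matches the user's language preference, while the par runs the audio and the seq of ten 5-second images in parallel, giving the required 50-second presentation in a 500x500 layout.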

    .

    17 Marks ---- UNSEEN


    Multimedia

    BSc Exam Questions Jan 2001 SOLUTIONS

    Exam paper format:

(b) Time Allowed: 2 Hours
(c) Answer 3 Questions out of 4
(d) Each Question Carries 27 Marks

    1. (a) What is meant by the terms Multimedia and Hypermedia? Distinguish

    between these two concepts.

    Multimedia ---- An Application which uses a collection of multiple media sources

    e.g. text, graphics, images, sound/audio, animation and/or video.

    Hypermedia --- An application which uses associative relationships among

    information contained within multiple media data for the purpose of facilitating

    access to, and manipulation of, the information encapsulated by the data.

    2 MARKS ---- BOOKWORK

    (b) What is meant by the terms static media and dynamic media? Give two

    examples of each type of media.

    Static Media does not change over time, e.g. text, graphics

    Dynamic Media --- Time dependent (Temporal), e.g. Video, sound, animation.

    4 MARKS --- BOOKWORK

(c) What are the main facilities that must be provided in a system designed to

    support the integration of multimedia into a multimedia presentation?

    The following functionality should be provided:

Digital Representation of Media --- Many formats for many media
Capture: Digitisation of Media --- special Hardware/Software


Creation and editing --- assemble media and alter it
Storage Requirements --- significant for multimedia
Compression --- related to above and below, ie can save on storage but can hinder retrieval
Structuring and retrieval methods of media --- simple to advanced DataBase Storage
Display or Playback methods --- effect of retrieval, must view data
Media Synchronisation --- display multimedia as it is intended

8 MARKS --- BOOKWORK

(d) Describe, giving suitable code fragments, how you would effectively combine and start at the same time a video clip and an audio clip in an MHEG application, and start a subtitle text display 19000 milliseconds into the video clip. You may assume that both clips are of the same duration and must start at the same instant.

The MHEG code listing below illustrates the solution:

{:Application ("SYNC_APP.mh5" 0)
  :OnStartUp ( // sequence of initialization actions
  )
}
{:Scene ("main_scene.mh5" 0)
  :OnStartUp ( // sequence of initialization actions
    preload (2)   // the connection to the source of the video clip is set up
    preload (3)   // the connection to the source of the audio clip is set up
    ...
    setCounterTrigger (2 3 190000)   // book a time code event at 190000 msec for example
    ...
  )
  :Items ( // both presentable ingredients and links
    {:Bitmap 1   // background bitmap NOT IMPORTANT FOR SOLN
      :InitiallyActive true
      :CHook 3   // JPEG
      :OrigContent :ContentRef ("background.jpg")
      :OrigBoxSize 800 600
      :OrigPosition 0 0
    }
    {:Stream 2   // video clip
      :InitiallyActive false
      :CHook 101   // MPEG-1
      :OrigContent :ContentRef ("video.mpg")
      :Multiplex (
        {:Audio 3   // audio component of the video clip
          :ComponentTag 1   // refers to audio elementary stream
          :InitiallyActive true
        }
      )
    }
    ...   // possibly more presentable ingredients
    {:Link 49   // the video clip crosses a predefined time code position
      :EventSource (2)   // video clip
      :EventType CounterTrigger
      :EventData 3   // booked at startup by setCounterTrigger (2 3 190000)
      :LinkEffect (
        :SetData (5   // text subtitle is set to a new string, that is
          :NewRefContent ("subtitle.txt"))   // "Subtitle text"
        :SetHighlightStatus (20 true)   // hotspot 20 is highlighted
      )
    }
    ...   // more links
  )
  :SceneCS 800 600   // size of the scene's presentation space
}

Key Points:

Preloading both clips is essential to start streaming
Need to book the 190000 msec event for the subtitles
Content loaded, and video and audio are multiplexed
Set a link transition for the subtitle

    13 MARKS - UNSEEN


    2. (a) Why is file or data compression necessary for Multimedia activities?

    Multimedia files are very large therefore for storage, file transfer etc. file sizes need to be

    reduced often as part of the file format. Text and other files may also be

    encoded/compressed for email and other applications.

    3 Marks -- Bookwork

(b) Briefly explain how the LZW Transform operates. What common compression methods utilise this transform?

    Suppose we want to encode the Oxford Concise English dictionary which contains about

    159,000 entries. Why not just transmit each word as an 18 bit number?

    Problems:

* Too many bits,
* everyone needs a dictionary,
* only works for English text.

Solution: Find a way to build the dictionary adaptively.

Original methods due to Ziv and Lempel in 1977 and 1978. Terry Welch improved the scheme in 1984 (called LZW compression).

    It is used in UNIX compress and GIF compression

The LZW Compression Algorithm can be summarised as follows:

w = NIL;
while ( read a character k ) {
    if wk exists in the dictionary
        w = wk;
    else {
        add wk to the dictionary;
        output the code for w;
        w = k;
    }
}

    * Original LZW used dictionary with 4K entries, first 256 (0-255) are ASCII codes.


    10 MARKS BOOKWORK

    (c) Show how the LZW transform would be used to encode the following 2D

    array of image data, you should use 2x2 window elements for the characters:

    SEE HANDWRITTEN SOLN attached

    14 Marks UNSEEN


3 (a) What key features of Quicktime have led to its adoption as an international multimedia format?

QuickTime is the most widely used cross-platform multimedia technology available today. QuickTime developed out of a multimedia extension for Apple's (proprietary) Macintosh System 7 operating system. It is now an international standard for multimedia interchange and is available for many platforms and as Web browser plug-ins.

The main features are:

Versatile support for web-based media
Sophisticated playback capabilities
Easy content authoring and editing
QuickTime is an open standard --- it embraces other standards and incorporates them into its environment. It supports almost every major Multimedia file format.

    4 Marks BOOKWORK

    (b) Briefly outline the Quicktime Architecture and its key components.

    The QuickTime Architecture:

    QuickTime comprises two managers: the Movie Toolbox and the Image Compression

    Manager. QuickTime also relies on the Component Manager, as well as a set of

    predefined components. Figure below shows the relationships of these managers and an

    application that is playing a movie.


The Movie Toolbox
--
Your application gains access to the capabilities of QuickTime by calling functions in the Movie Toolbox. The Movie Toolbox allows you to store, retrieve, and manipulate time-based data that is stored in QuickTime movies. A single movie may contain several types of data. For example, a movie that contains video information might include both video data and the sound data that accompanies the video.

The Movie Toolbox also provides functions for editing movies. For example, there are editing functions for shortening a movie by removing portions of the video and sound tracks, and there are functions for extending it with the addition of new data from other QuickTime movies.

The Movie Toolbox is described in the chapter "Movie Toolbox" later in this book. That chapter includes code samples that show how to play movies.

    The Image Compression Manager

    --


The Image Compression Manager comprises a set of functions that compress and decompress images or sequences of graphic images.

The Image Compression Manager provides a device-independent and driver-independent means of compressing and decompressing images and sequences of images. It also contains a simple interface for implementing software and hardware image-compression algorithms. It provides system integration functions for storing compressed images as part of PICT files, and it offers the ability to automatically decompress compressed PICT files on any QuickTime-capable Macintosh computer.

In most cases, applications use the Image Compression Manager indirectly, by calling Movie Toolbox functions or by displaying a compressed picture. However, if your application compresses images or makes movies with compressed images, you will call Image Compression Manager functions.

The Image Compression Manager is described in the chapter "Image Compression Manager" later in this book. This chapter also includes code samples that show how to compress images or make movies with compressed images.

    The Component Manager

    --

Applications gain access to components by calling the Component Manager. The Component Manager allows you to define and register types of components and communicate with components using a standard interface. A component is a code resource that is registered by the Component Manager. The component's code can be stored in a systemwide resource or in a resource that is local to a particular application.

Once an application has connected to a component, it calls that component directly. If you create your own component class, you define the function-level interface for the component type that you have defined, and all components of that type must support the interface and adhere to those definitions. In this manner, an application can freely choose among components of a given type with absolute confidence that each will work.

QuickTime Components:

movie controller components, which allow applications to play movies using a standard user interface
standard image compression dialog components, which allow the user to specify the parameters for a compression operation by supplying a dialog box or a similar mechanism
image compressor components, which compress and decompress image data
sequence grabber components, which allow applications to preview and record video and sound data as QuickTime movies
video digitizer components, which allow applications to control video digitization by an external device
media data-exchange components, which allow applications to move various types of data in and out of a QuickTime movie
derived media handler components, which allow QuickTime to support new types of data in QuickTime movies
clock components, which provide timing services defined for QuickTime applications
preview components, which are used by the Movie Toolbox's standard file preview functions to display and create visual previews for files
sequence grabber components, which allow applications to obtain digitized data from sources that are external to a Macintosh computer
sequence grabber channel components, which manipulate captured data for a sequence grabber component
sequence grabber panel components, which allow sequence grabber components to obtain configuration information from the user for a particular sequence grabber channel component

    10 Marks BookWork

(c) Quicktime provides many basic built-in visual effect procedures. By using fragments of Java code show how a cross-fade effect between two images can be created. Your solution should concentrate only on the Java code specific to producing the Quicktime effect. You may assume that the images are already imported into the application and are referred to as sourceImage and destImage. You should not consider any Graphical Interfacing aspects of the coding.

This code shows how a Cross Fade Transition effect could be built. NOT ALL THE INTERFACING STUFF INCLUDED BELOW IS REQUIRED; SEE COMMENTS AFTER FOR IMPORTANT PARTS THAT NEED ADDRESSING.

    /*

    * QuickTime for Java Transition Sample Code

    */

    import java.awt.*;

    import java.awt.event.*;

    import java.io.*;

    import quicktime.std.StdQTConstants;

    import quicktime.*;

    import quicktime.qd.*;

    import quicktime.io.*;

    import quicktime.std.image.*;

    import quicktime.std.movies.*;

    import quicktime.util.*;


    import quicktime.app.QTFactory;

    import quicktime.app.time.*;

    import quicktime.app.image.*;

    import quicktime.app.display.*;

    import quicktime.app.anim.*;

    import quicktime.app.players.*;

    import quicktime.app.spaces.*;

    import quicktime.app.actions.*;

    public class TransitionEffect extends Frame implements

    StdQTConstants, QDConstants {

    public static void main(String args[]) {

    try {

    QTSession.open();

TransitionEffect te = new TransitionEffect("Transition Effect");

    te.pack();

    te.show();

    te.toFront();

    } catch (Exception e) {

    e.printStackTrace();

    QTSession.close();

    }

    }

    TransitionEffect(String title) throws Exception {

    super (title);

    QTCanvas myQTCanvas = new

    QTCanvas(QTCanvas.kInitialSize, 0.5f, 0.5f);

    add("Center", myQTCanvas);

    Dimension d = new Dimension (300, 300);

    addWindowListener(new WindowAdapter() {

public void windowClosing (WindowEvent e) {
    QTSession.close();

    dispose();

    }

    public void windowClosed (WindowEvent e) {

    System.exit(0);

    }


    });

    QDGraphics gw = new QDGraphics (new QDRect(d));

    Compositor comp = new Compositor (gw,

    QDColor.black, 20, 1);

    ImagePresenter idb = makeImagePresenter (new

    QTFile (QTFactory.findAbsolutePath ("pics/stars.jpg")),

    new

    QDRect(300, 220));

    idb.setLocation (0, 0);

    comp.addMember (idb, 2);

    ImagePresenter id = makeImagePresenter (new

    QTFile (QTFactory.findAbsolutePath ("pics/water.pct")),

    new

    QDRect(300, 80));

    id.setLocation (0, 220);comp.addMember (id, 4);

    CompositableEffect ce = new CompositableEffect

    ();

    ce.setTime(800); // TIME OF EFFECT

    ce.setSourceImage(sourceImage);

    ce. setDestinationImage(destImage);

    ce.setEffect (createSMPTEEffect,

    kEffectCrossFade, KRandomCrossFadeTransitionType);

    ce.setDisplayBounds (new QDRect(0, 220, 300,

    80));

    comp.addMember (ce, 3);

    Fader fader = new Fader();

    QTEffectPresenter efp = fader.makePresenter();

    efp.setGraphicsMode (new GraphicsMode (blend,

    QDColor.gray));

    efp.setLocation(80, 80);

    comp.addMember (efp, 1);

    comp.addController(new TransitionControl (20, 1,

    fader.getTransition()));

    myQTCanvas.setClient (comp, true);

    comp.getTimer().setRate(1);

    }

    private ImagePresenter makeImagePresenter (QTFile

    file, QDRect size) throws Exception {

  • 7/22/2019 Multimedia multimedia Bible

    27/163

    GraphicsImporterDrawer if1 = new

    GraphicsImporterDrawer (file);

    if1.setDisplayBounds (size);

    return ImagePresenter.fromGraphicsImporterDrawer

    (if1);

    }

    }

    FULL CODE NOT REQUIRED, as above.

    Important bits:

    - Set up an atom container to use an SMPTE effect using createSMPTEEffect()
    - Set up a transition with the IMPORTANT parameters:

      ce.setTime(800); // TIME OF EFFECT
      ce.setSourceImage(sourceImage);
      ce.setDestinationImage(destImage);
      ce.setEffect(createSMPTEEffect(kEffectCrossFade, kRandomCrossFadeTransitionType));

    - A doTransition() or doAction() method performs the transition

    13 Marks --- UNSEEN

    4. (a) What is MIDI? How is a basic MIDI message structured?

    MIDI: a protocol that enables computers, synthesizers, keyboards, and other musical or (even) multimedia devices to communicate with each other.

    MIDI MESSAGE:

    A MIDI message includes a status byte and up to two data bytes.

    - Status byte: the most significant bit is set to 1; the 4 low-order bits identify which channel the message belongs to (four bits give 16 possible channels); the 3 remaining bits identify the message.
    - Data byte: the most significant bit is set to 0.

    6 Marks --- Bookwork
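For illustration (not part of the model answer), the bit layout above can be sketched in Java; the class and method names here are invented for the example:

```java
// Illustrative decoder for the MIDI status-byte layout described above.
public class MidiStatus {
    // A status byte has its most significant bit set to 1.
    public static boolean isStatusByte(int b) {
        return (b & 0x80) != 0;
    }
    // The 4 low-order bits identify the channel (16 possible channels).
    public static int channel(int statusByte) {
        return statusByte & 0x0F;
    }
    // The 3 remaining bits (between the MSB and the channel) identify the message.
    public static int messageType(int statusByte) {
        return (statusByte >> 4) & 0x07;
    }

    public static void main(String[] args) {
        int noteOn = 0x93; // binary 1001 0011: a Note On message on channel 3
        System.out.println(isStatusByte(noteOn)); // true
        System.out.println(messageType(noteOn));  // 1
        System.out.println(channel(noteOn));      // 3
    }
}
```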

    (b)In what ways can MIDI be used effectively in Multimedia Applications, as

    opposed to strictly musical applications?

    Many applications:

    - Low bandwidth/(low quality?) music on the Web; QuickTime etc. support the MIDI musical instrument set
    - Sound effects --- low bandwidth alternative to audio samples; sound set part of the GM sound set
    - Control of external devices --- e.g. synchronisation of video and audio (SMPTE), MIDI System Exclusive, audio recorders, samplers
    - Control of synthesis --- envelope control etc.
    - MPEG-4 compression control --- see Part (c)
    - Digital audio

    8 Marks --- Applied Bookwork: Discussion of Information mentioned in

    Notes/Lectures.

    (c)How can MIDI be used with modern data compression techniques?Briefly describe how such compression techniques may be implemented?

    We have seen the need for compression already in digital audio -- large data files. Basic ideas of compression (see next chapter) are used as an integral part of audio formats --
    MP3, RealAudio etc.

    MPEG-4 audio actually combines compression, synthesis and MIDI to have a massive impact on compression.

    MIDI and synthesis encode what note to play and how to play it with a small number of parameters -- a much greater reduction than simply having some encoded bits of audio. The responsibility for creating the audio is delegated to the generation side.

    MPEG-4 comprises 6 Structured Audio tools:

    - SAOL, the Structured Audio Orchestra Language
    - SASL, the Structured Audio Score Language
    - SASBF, the Structured Audio Sample Bank Format
    - a set of MIDI semantics, which describes how to control SAOL with MIDI
    - a scheduler, which describes how to take the above parts and create sound
    - the AudioBIFS part of BIFS, which lets you make audio soundtracks in MPEG-4 using a variety of tools and effects-processing techniques

    MIDI is the control language for the synthesis part:

    As well as controlling synthesis with SASL scripts, it can be controlled with MIDI files and scores in MPEG-4. MIDI is today's most commonly used representation for music score data, and many sophisticated authoring tools (such as sequencers) work with MIDI.

    The MIDI syntax is external to the MPEG-4 Structured Audio standard; only references to the MIDI Manufacturers Association's definition appear in the standard. But in order to make the MIDI controls work right in the MPEG context, some semantics (what the instructions "mean") have been redefined in MPEG-4. The new semantics are carefully defined as part of the MPEG-4 specification.

    13 Marks --- UNSEEN, but basic ideas mentioned in lectures and earmarked for further reading. Detailed application of lecture notes material.

    CARDIFF UNIVERSITY

    EXAMINATION PAPER

    SOLUTIONS

    Academic Year: 2001-2002

    Examination Period: Autumn 2001

    Examination Paper Number: CM0340

    Examination Paper Title: Multimedia

    Duration: 2 hours

    Do not turn this page over until instructed to do so by the Senior Invigilator.

    Structure of Examination Paper:

    There are three pages.

    There are four questions in total.

    There are no appendices.

    The maximum mark for the examination paper is 100% and the mark obtainable for a

    question or part of a question is shown in brackets alongside the question.

    Students to be provided with:

    The following items of stationery are to be provided:

    One answer book.

    Instructions to Students:

    Answer THREE questions.

    The use of translation dictionaries between English or Welsh and a foreign language

    bearing an appropriate departmental stamp is permitted in this examination.

    1. (a) Give a definition of multimedia and a multimedia system.

    Multimedia is the field concerned with the computer-controlled integration of

    text, graphics, drawings, still and moving images (Video), animation, audio, and

    any other media where every type of information can be represented, stored, transmitted and processed digitally.

    A Multimedia System is a system capable of processing multimedia data and

    applications.

    2 Marks - BOOKWORK

    (b) What are the key distinctions between multimedia data and more conventional

    types of media?

    Multimedia systems deal with the generation, manipulation, storage, presentation,

    and communication of information in digital form.

    The data may be in a variety of formats: text, graphics, images, audio, video.

    A majority of this data is large and the different media may need synchronisation -- the data may have temporal relationships as an integral property.

    Some media is time-independent (static or discrete media): normal data, text, single images and graphics are examples.

    Video, animation and audio are examples of continuous media.

    4 Marks Bookwork

    (c) What key issues or problems does a multimedia system have to deal with when

    handling multimedia data?

    A Multimedia system has four basic characteristics:

    - Multimedia systems must be computer controlled.
    - Multimedia systems are integrated.
    - The information they handle must be represented digitally.
    - The interface to the final presentation of media is usually interactive.

    Multimedia systems may have to render a variety of media at the same instant -- a distinction from normal applications. There is a temporal relationship between many forms of media (e.g. video and audio). There are two forms of problem here:

    - Sequencing within the media -- playing frames in the correct order/time frame in video.
    - Synchronisation -- inter-media scheduling (e.g. video and audio). Lip synchronisation is clearly important for humans watching playback of video and audio, and even animation and audio. Ever tried watching an out-of-(lip)sync film for a long time?

    The key issues multimedia systems need to deal with here are:

    - How to represent and store temporal information.
    - How to strictly maintain the temporal relationships on playback/retrieval.
    - What processes are involved in the above.

    Data has to be represented digitally, so many initial sources of data need to be digitised -- translated from an analog source to a digital representation. This will involve scanning (graphics, still images) and sampling (audio/video), although digital cameras now exist for direct scene-to-digital capture of images and video.

    The data is large -- several MB easily for audio and video -- therefore storage, transfer (bandwidth) and processing overheads are high. Data compression techniques are very common.

    7 Marks BOOK WORK

    (d) An analog signal has a bandwidth that ranges from 15 Hz to 10 KHz. What is the rate of the sampler and the bandwidth of the bandlimiting filter required if:

    (i) the signal is to be stored within computer memory.

    The Nyquist Sampling Theorem says that sampling must be at least twice the highest frequency component of the signal or transmission channel.

    The highest frequency is 10 KHz, so:

    Sampling rate = 20 KHz, or 20,000 samples per second. (1 Mark)

    Bandwidth of bandlimiting filter = 0 - 10 KHz (2 Marks)

    (ii) the signal is to be transmitted over a network which has a bandwidth

    from 200Hz to 3.4 KHz.

    The channel has a lower maximum frequency than the signal, so this must be chosen as the limiting high frequency:

    Sampling rate = 6.8 KHz, or 6,800 samples per second. (2 Marks)

    Bandwidth of bandlimiting filter = 0 - 3.4 KHz (2 Marks)

    7 Marks TOTAL: ALL UNSEEN
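For illustration (not part of the model answer), the Nyquist reasoning in (i) and (ii) can be checked with a small Java sketch; the class and method names are invented for the example:

```java
// Illustrative check of the Nyquist-rate reasoning above.
public class NyquistRate {
    // The sampling rate must be at least twice the highest frequency
    // component that survives the bandlimiting filter, i.e. twice the
    // lower of the signal's maximum and the channel's maximum frequency.
    public static double samplingRate(double signalMaxHz, double channelMaxHz) {
        double limit = Math.min(signalMaxHz, channelMaxHz);
        return 2.0 * limit;
    }

    public static void main(String[] args) {
        // (i) storage: no channel limit, so use the signal's own 10 KHz maximum
        System.out.println(samplingRate(10_000, Double.POSITIVE_INFINITY)); // 20000.0
        // (ii) the 3.4 KHz channel limits the usable bandwidth
        System.out.println(samplingRate(10_000, 3_400)); // 6800.0
    }
}
```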

    (e) Assuming that each signal is sampled at 8bits per sample what is the

    difference in the quantisation noise and signal to noise ratio expected for

    the transmission of the

    signals in (i) and (ii).

    Quantisation noise = Vmax / 2^(n-1)

    SNR = 20 log10(Vmax/Vmin)

    So for (i): Quantisation noise = 10,000/128 = 78.125
    SNR = 20 log (10,000/15) = 56.48 dB
    (3 Marks)

    And (ii): Quantisation noise = 3,400/128 = 26.56
    SNR = 20 log (3,400/15) = 47.11 dB
    (4 Marks)

    7 Marks TOTAL: ALL UNSEEN
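For illustration (not part of the model answer), the two formulas can be sketched in Java; the class and method names are invented, and Vmax/Vmin are taken as the highest/lowest frequencies, as in the worked answer:

```java
// Illustrative evaluation of the quantisation-noise and SNR formulas above.
public class QuantisationNoise {
    // Quantisation noise = Vmax / 2^(n-1), for n bits per sample.
    public static double noise(double vmax, int bits) {
        return vmax / Math.pow(2, bits - 1);
    }
    // SNR = 20 log10(Vmax / Vmin), in dB.
    public static double snrDb(double vmax, double vmin) {
        return 20.0 * Math.log10(vmax / vmin);
    }

    public static void main(String[] args) {
        System.out.printf("(i)  noise = %.3f, SNR = %.2f dB%n",
                noise(10_000, 8), snrDb(10_000, 15));
        System.out.printf("(ii) noise = %.2f, SNR = %.2f dB%n",
                noise(3_400, 8), snrDb(3_400, 15));
    }
}
```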

    2. (a) Why is data compression necessary for Multimedia activities?

    Audio, video and images take up too much memory, disk space or bandwidth uncompressed.

    3 Marks BookWork

    (b) What is the distinction between lossless and lossy compression?

    What broad types of multimedia data are each most suited to?

    Lossless Compression

    -- where data is compressed and can be reconstituted (uncompressed) without loss of detail or information. These are also referred to as bit-preserving or reversible compression systems.

    Lossy Compression

    -- where the aim is to obtain the best possible fidelity for a given bit-rate, or to minimize the bit-rate to achieve a given fidelity measure. Video and audio compression techniques are most suited to this form of compression.

    Type suitability:

    Lossless
    - Computer data files (compression)
    - Graphics and graphical images (GIF/LZW)

    Lossy
    - Audio (MP3)
    - Photographic images (JPEG)
    - Video (MPEG)

    5 Marks Bookwork

    (c) Briefly explain the compression techniques of zero length suppression and run

    length encoding. Give one example of a real world application of each

    compression technique.

    Simple Repetition Suppression

    The simplest form is suppression of zeros -- Zero Length Suppression.

    If in a sequence a series of n successive tokens appears, we can replace these with one token and a count of the number of occurrences. We usually need a special code to denote when the repeated token appears.

    For example:

    89400000000000000000000000000000000

    can be replaced with

    894f32

    where f is the code for zero.

    Examples:

    - Silence in audio data, pauses in conversation
    - Bitmaps
    - Blanks in text or program source files
    - Backgrounds in images

    Run-length Encoding

    This encoding method is frequently applied to images (or pixels in a scan line). It is a

    small compression component used in JPEG compression.

    In this instance, sequences of image elements (X1, X2, ..., Xn) are mapped to pairs (c1, l1), (c2, l2), ..., (cn, ln), where ci represents the image intensity or colour and li the length of the i-th run of pixels (not dissimilar to zero length suppression above).

    For example:

    Original Sequence:

    111122233333311112222

    can be encoded as:

    (1,4),(2,3),(3,6),(1,4),(2,4)

    The savings depend on the data. In the worst case (random noise) the encoding is larger than the original file: 2 integers rather than 1 integer, if the data is represented as integers.

    Examples: simple audio, graphics, images.

    7 Marks Bookwork
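For illustration (not part of the model answer), run-length encoding as described above can be sketched in Java; the helper class and method are invented for the example:

```java
// Illustrative run-length encoder: maps a token sequence to
// (token, run-length) pairs, as in the example in the notes.
public class RunLength {
    public static String encode(String s) {
        StringBuilder out = new StringBuilder();
        int i = 0;
        while (i < s.length()) {
            char c = s.charAt(i);
            int run = 0;
            // count the run of identical tokens starting at position i
            while (i < s.length() && s.charAt(i) == c) {
                i++;
                run++;
            }
            out.append('(').append(c).append(',').append(run).append(')');
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(encode("111122233333311112222"));
        // (1,4)(2,3)(3,6)(1,4)(2,4)
    }
}
```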

    (d) Show how you would encode the following token stream using zero length

    suppression and run length encoding:

    ABC000AAB00000000DEFAB00000

    Total length of token stream = 27

    Zero Length Suppression code:

    ABCf3AABf8DEFABf5

    Number of tokens = 17, where f is the code for 0.

    Run Length Encoding:

    A1B1C103A2B108D1E1F1A1B105

    Number of tokens = 26.

    (i) What is the compression ratio for each method when applied to the above

    token stream?


    Compression ratios:

    Zero Length Suppression = 17/27

    Run Length Encoding = 26/27

    3 Marks each for correct encoding

    2 Mark for each ratio

    10 Marks Total

    (ii) Explain why one has a better compression ratio than the other. What

    properties of the data lead to this result?

    The data has only one repeated token, the 0. So in run length encoding, coding is wasted on the rapidly changing remainder of the data, where every token's frequency count needs recording. (2 Marks)

    12 Marks for all of PART (d) ALL WORK UNSEEN

    3. (a) Briefly outline the basic principles of Inter-Frame Coding in Video Compression.

    Essentially, each frame is JPEG encoded:

    Macroblocks are 16x16 pixel areas on the Y plane of the original image. A macroblock usually consists of 4 Y blocks, 1 Cr block, and 1 Cb block.

    Quantization is by a constant value for all DCT coefficients (i.e., no quantization table as in JPEG).

    The macroblock is coded as follows:

    - Many macroblocks will be exact matches (or close enough), so send the address of each block in the image -> Addr

    - Sometimes no good match can be found, so send an INTRA block -> Type
    - We will want to vary the quantization to fine-tune compression, so send the quantization value -> Quant
    - Motion vector -> Vector
    - Some blocks in a macroblock will match well, others match poorly, so send a bitmask indicating which blocks are present (Coded Block Pattern, or CBP)
    - Send the blocks (4 Y, 1 Cr, 1 Cb) as in JPEG.

    8 Marks BOOKWORK

    (b) What is the key difference between I-Frames, P-Frames and B-Frames?

    Why are I-frames inserted into the compressed output stream relativelyfrequently?

    I-Frame --- basic reference frame for each group of pictures; essentially a JPEG-compressed image.

    P-Frame --- coded forward difference frame w.r.t. the last I or P frame.

    B-Frame --- coded backward (bidirectionally) interpolated difference frame w.r.t. the surrounding I or P frames.

    I-frames are needed regularly because difference coding cannot cope with drift too far from the reference frame. If they are not present regularly, poor image quality results.

    6 Marks BOOKWORK

    (c) A multimedia presentation must be delivered over a network at a rate of 1.5 Mbits per second. The presentation consists of digitized audio and video. The audio has an average bit rate of 300 Kbits per second. The digitised video, in PAL format, is to be compressed using the MPEG-1 standard. Assuming a frame sequence of:

    IBBPBBPBBPBBI..

    and average compression ratios of 10:1 and 20:1 for the I-frame and P-frame, what is the compression ratio required for the B-frame to ensure the desired delivery rate?

    You may assume that for PAL the luminance Signal is sampled at the spatial

    resolution of 352x288 and that the two chrominance signals are sampled at half

    this resolution. The refresh rate for PAL is 25Hz. You should also allow 15%

    overheads for the multiplexing and packetisation of the MPEG-1 video.

    Desired rate = 1.5 Mbits/sec

    Desired video rate = rate - audio rate = 1.5 - 0.3 = 1.2 Mbits/sec

    Physical rate = video rate less headroom = 1.2 / 1.15 = 1.044 Mbits/sec

    Each group of pictures has 12 frames: 1 I, 8 B and 3 P frames.

    So the average compressed fraction per frame = (0.1 + 3*0.05 + 8x)/12 = (0.25 + 8x)/12

    Each frame has 352*288*8 + 2*(176*144*8) bits (uncompressed) = 1,216,512 bits

    So average compressed bits per frame (averaged over the 12-frame GoP) = 1216512*(0.25 + 8x)/12

    Therefore bits per second at the 25 frames/sec rate = 25*1216512*(0.25 + 8x)/12

    We require:

    25*1216512*(0.25 + 8x)/12 = 1044000
    2534400*(0.25 + 8x) = 1044000
    (0.25 + 8x) = 0.412
    8x = 0.16
    x = 0.02

    Or: the compression ratio is 50:1 for the B-frame.

    13 MARKS UNSEEN
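For illustration (not part of the model answer), the budget arithmetic above can be checked with a short Java sketch; the class and method names are invented, and it solves the same GoP equation without the intermediate rounding:

```java
// Illustrative solver for the B-frame compressed-size fraction x in a
// 12-frame GoP (1 I at 10:1, 3 P at 20:1, 8 B at fraction x).
public class MpegBudget {
    public static double bFrameFraction(double totalRate, double audioRate,
                                        double overheadFactor,
                                        double frameBits, double fps) {
        // video rate after removing audio and packetisation headroom
        double videoRate = (totalRate - audioRate) / overheadFactor;
        // solve fps * frameBits * (1/10 + 3/20 + 8x)/12 = videoRate for x
        return (videoRate * 12 / (fps * frameBits) - (1.0 / 10 + 3.0 / 20)) / 8;
    }

    public static void main(String[] args) {
        // PAL frame: 352x288 luminance + two 176x144 chrominance planes, 8 bits each
        double frameBits = 352 * 288 * 8 + 2 * (176 * 144 * 8); // 1,216,512 bits
        double x = bFrameFraction(1_500_000, 300_000, 1.15, frameBits, 25);
        // x is about 0.0202; rounding x to 0.02, as in the worked answer, gives 50:1
        System.out.printf("x = %.4f, ratio about %.0f:1%n", x, 1 / x);
    }
}
```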

    4. (a) What key features of Quicktime have led to its adoption and acceptance as an

    international multimedia format?

    QuickTime is the most widely used cross-platform multimedia technology available today. QuickTime developed out of a multimedia extension for Apple's (proprietary) Macintosh System 7 operating system. It is now an international standard for multimedia interchange and is available for many platforms and as Web browser plug-ins.

    The main features are:

    - Versatile support for web-based media
    - Sophisticated playback capabilities
    - Easy content authoring and editing

    QuickTime is an open standard -- it embraces other standards and incorporates them into its environment. It supports almost every major multimedia file format.

    4 Marks BOOKWORK

    (b) Briefly outline the Quicktime Architecture and its key components.

    The QuickTime Architecture:

    QuickTime comprises two managers: the Movie Toolbox and the Image Compression Manager. QuickTime also relies on the Component Manager, as well as a set of predefined components. The figure below shows the relationships of these managers and an application that is playing a movie.

    The Movie Toolbox

    -- Your application gains access to the capabilities of QuickTime by calling functions in the Movie Toolbox. The Movie Toolbox allows you to store, retrieve, and manipulate time-based data that is stored in QuickTime movies. A single movie may contain several types of data. For example, a movie that contains video information might include both video data and the sound data that accompanies the video.

    The Movie Toolbox also provides functions for editing movies. For example, there are editing functions for shortening a movie by removing portions of the video and sound tracks, and there are functions for extending it with the addition of new data from other QuickTime movies.

    The Movie Toolbox is described in the chapter "Movie Toolbox" later in this book. That

    chapter includes code samples that show how to play movies.

    The Image Compression Manager

    --

    The Image Compression Manager comprises a set of functions that compress and

    decompress images or sequences of graphic images.

    The Image Compression Manager provides a device-independent and driver-independent means of compressing and decompressing images and sequences of images. It also contains a simple interface for implementing software and hardware image-compression algorithms. It provides system integration functions for storing compressed images as

    part of PICT files, and it offers the ability to automatically decompress compressed PICT files on any QuickTime-capable Macintosh computer.

    In most cases, applications use the Image Compression Manager indirectly, by calling Movie Toolbox functions or by displaying a compressed picture. However, if your application compresses images or makes movies with compressed images, you will call Image Compression Manager functions.

    The Image Compression Manager is described in the chapter "Image Compression Manager" later in this book. This chapter also includes code samples that show how to compress images or make movies with compressed images.

    The Component Manager

    --

    Applications gain access to components by calling the Component Manager. The Component Manager allows you to define and register types of components and communicate with components using a standard interface. A component is a code resource that is registered by the Component Manager. The component's code can be stored in a systemwide resource or in a resource that is local to a particular application. Once an application has connected to a component, it calls that component directly.

    If you create your own component class, you define the function-level interface for the component type that you have defined, and all components of that type must support the interface and adhere to those definitions. In this manner, an application can freely choose among components of a given type with absolute confidence that each will work.

    QuickTime Components:

    - movie controller components, which allow applications to play movies using a standard user interface
    - standard image compression dialog components, which allow the user to specify the parameters for a compression operation by supplying a dialog box or a similar mechanism
    - image compressor components, which compress and decompress image data
    - sequence grabber components, which allow applications to preview and record video and sound data as QuickTime movies, and to obtain digitized data from sources that are external to a Macintosh computer
    - video digitizer components, which allow applications to control video digitization by an external device
    - media data-exchange components, which allow applications to move various types of data in and out of a QuickTime movie
    - derived media handler components, which allow QuickTime to support new types of data in QuickTime movies
    - clock components, which provide timing services defined for QuickTime applications
    - preview components, which are used by the Movie Toolbox's standard file preview functions to display and create visual previews for files
    - sequence grabber channel components, which manipulate captured data for a sequence grabber component
    - sequence grabber panel components, which allow sequence grabber components to obtain configuration information from the user for a particular sequence grabber channel component

    10 Marks BookWork

    (c) JPEG2000 is a new image compression standard. Outline how this new standard might be incorporated into the QuickTime architecture. Your answer need not consider the details of the actual compression methods used in JPEG2000; instead it should focus on how, given the compression format, you could extend QuickTime to support it.

    A sketch of ideas is required by the solution; it builds on the QT architecture knowledge above.

    JPEG2000 is a still image format. Functionality needs to be added to the following:

    - Media data structure --- add knowledge of the new format's data structure
    - Component Manager --- add a new component to the Component Manager
    - Image compression --- add compression and decompression routines to the Image Compression Manager

    13 MARKS UNSEEN

    CARDIFF UNIVERSITY

    EXAMINATION PAPER

    SOLUTIONS

    Academic Year: 2002-2003

    Examination Period: Autumn 2002

    Examination Paper Number: CM0340

    Examination Paper Title: Multimedia

    Duration: 2 hours

    Do not turn this page over until instructed to do so by the Senior Invigilator.

    Structure of Examination Paper:

    There are four pages.

    There are four questions in total.

    There are no appendices.

    The maximum mark for the examination paper is 100% and the mark obtainable for a

    question or part of a question is shown in brackets alongside the question.

    Students to be provided with:

    The following items of stationery are to be provided:

    One answer book.

    Instructions to Students:

    Answer THREE questions.

    The use of translation dictionaries between English or Welsh and a foreign language

    bearing an appropriate departmental stamp is permitted in this examination.

    1. (a) What is MIDI?

    Definition of MIDI: a protocol that enables computers, synthesizers, keyboards, and other musical devices to communicate with each other.

    2 Marks Basic Bookwork

    (b)How is a basic MIDI message structured?

    Structure of MIDI messages:

    A MIDI message includes a status byte and up to two data bytes.

    - Status byte: the most significant bit is set to 1; the 4 low-order bits identify which channel the message belongs to (four bits give 16 possible channels); the 3 remaining bits identify the message.
    - Data byte: the most significant bit is set to 0.

    4 Marks Basic Bookwork

    (c) A piece of music that lasts 3 minutes is to be transmitted over a network. The piece of music has 4 constituent instruments: Drums, Bass, Piano and Trumpet.

    The music has been recorded at CD quality (44.1 KHz, 16 bit, stereo) and also as MIDI information, where on average the Drums play 180 notes per minute, the Bass 140 notes per minute, the Piano 600 notes per minute and the Trumpet 80 notes per minute.

    (i) Estimate the number of bytes required for the storage of a full

    performance at CD quality audio and the number of bytes for the Midi

    performance. You should assume that the general midi set of instruments is

    available for any performance of the recorded MIDI data.

    CD AUDIO SIZE:

    2 channels * 44,100 samples/sec * 2 bytes (16 bits) * 3*60 secs (3 mins) = 31,752,000 bytes = 30.3 MB

    MIDI:

    3 bytes per MIDI message.

    KEY THINGS TO NOTE:

    - Need to send 4 program change messages (to set up the General MIDI instruments) = 12 bytes (2 marks)
    - Need to send Note ON and Note OFF messages to play each note properly (4 marks)

    Then send 3 mins * 3 (MIDI bytes) * 2 (Note ON and OFF) * (180 + 140 + 600 + 80) = 18,000 bytes = 17.58 KB.

    8 Marks Unseen 2 for CD AUDIO 6 for MIDI

    (ii) Estimate the time it would take to transmit each performance over a network at 64 kbps.

    CD AUDIO

    Time = 31,752,000*8 bits / (64*1024 bits/sec) = 3,876 seconds = 1.077 hours

    MIDI

    Time = 18,000*8 bits / (64*1024 bits/sec) = 2.197 seconds

    2 Marks Unseen
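For illustration (not part of the model answer), the size and transmission-time estimates above can be sketched in Java; the class and method names are invented, and the MIDI total includes the 12 set-up bytes, so it comes to 18,012 rather than the rounded 18,000:

```java
// Illustrative check of the CD-audio vs MIDI size and transfer-time estimates.
public class PerformanceSize {
    // CD quality: stereo, 44.1 kHz, 16-bit (2-byte) samples.
    public static long cdBytes(int seconds) {
        return 2L * 44_100 * 2 * seconds;
    }
    // MIDI: 3-byte messages, Note ON + Note OFF per note, plus four
    // 3-byte program changes (the worked answer's assumptions).
    public static long midiBytes(int minutes, int notesPerMinuteTotal) {
        return 4 * 3 + (long) minutes * 3 * 2 * notesPerMinuteTotal;
    }
    public static double transmitSeconds(long bytes, double bitsPerSecond) {
        return bytes * 8 / bitsPerSecond;
    }

    public static void main(String[] args) {
        long cd = cdBytes(3 * 60);                       // 31,752,000 bytes
        long midi = midiBytes(3, 180 + 140 + 600 + 80);  // 18,012 bytes
        System.out.println(cd + " bytes vs " + midi + " bytes");
        System.out.printf("CD: %.0f s, MIDI: %.2f s at 64 kbps%n",
                transmitSeconds(cd, 64 * 1024), transmitSeconds(midi, 64 * 1024));
    }
}
```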

    (iii) Briefly comment on the merits and drawbacks of each method of transmission of

    the performance.

    Audio: Pro: exact reproduction of the source sounds. Con: high bandwidth/long file transfer for high-quality audio.

    MIDI: Pro: very low bandwidth. Con: no control over the playback quality of the MIDI sounds.

    4 Marks Unseen but extended discussion on lecture material

    (d) Suppose vocals (where actual lyrics were to be sung) were required to be added to each performance in (c) above. How might each performance be broadcast over a network?

    KEY POINT: Vocals cannot utilize MIDI

    Audio: Need to overdub vocal audio on the background audio track

    Need some audio editing package and then mix combined tracks for stereo

    audio.

    Assuming no change in sample rate or bit size the new mixed track will have exactly

    the same file size as the previous audio track so transmission is same as in (c).

    MIDI: MIDI alone is now no longer sufficient, so how to proceed?

    For best bandwidth keep backing tracks as MIDI and send Vocal track as Audio.

    To achieve such a mix some specialist music production software will be needed to

    allow a file to be saved with synchronized Midi and Audio.

    How to deliver over a network? Need to use a multimedia standard that supports MIDI and digital audio. QuickTime files support both (as do some Macromedia Director/Flash(?) files), so save the mixed MIDI/audio file in this format.

    The size of the file will be significantly increased due to the single-channel audio. If this is not compressed, and assuming a mono audio file, the file size will increase by around 15 MB, so the transmission time will increase drastically.

    7 Marks Unseen

    2. (a) What is meant by the terms frequency and temporal masking of two or more

    audio signals? Briefly, what is the cause of this masking?

    Frequency Masking: when an audio signal consists of multiple frequencies, the sensitivity of the ear changes with the relative amplitude of the signals. If the frequencies are close and the amplitude of one is less than that of the other close frequency, then the second frequency may not be heard. The range of closeness for frequency masking (the critical bands) depends on the frequencies and relative amplitudes.

    Temporal Masking: After the ear hears a loud sound it takes a further short while

    before it can hear a quieter sound.

    The cause for both types of masking is that within the human ear there are tiny hair

    cells that are excited by air pressure variations. Different hair cells respond to

    different ranges of frequencies.

    Frequency masking occurs because, after excitation by one frequency, further excitation of the same group of cells by a weaker similar frequency is not possible.

    Temporal Masking occurs because the hairs take time to settle after excitation to

    respond again.

    8 Marks BookWork

(b) How does MPEG audio compression exploit such phenomena? Give a schematic diagram of the MPEG audio perceptual encoder.

MPEG uses several perceptual coding concepts:

- The bandwidth is divided into frequency subbands using a bank of analysis (critical band) filters.
- Each subband uses a scaling factor, taken from the subband's maximum amplitude, for psychoacoustic modelling.
- An FFT (DFT) is used to compute signal-to-mask ratios; frequencies that fall below the audible (masking) threshold are discarded.
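The steps above can be sketched as a toy bit allocator. This is not the real MPEG-1 Layer I/II algorithm: the subband count, peak levels and masking thresholds below are invented for illustration, and the SMR is treated in simple dB-like units (one allocated bit buys roughly 6 dB of resolution).

```python
# Toy sketch of perceptual bit allocation (not the real MPEG algorithm):
# take a scale factor from each subband's peak, compute a signal-to-mask
# ratio (SMR), and give bits only where the signal rises above the mask.

def allocate_bits(subband_peaks, masking_thresholds, bit_pool):
    # Scale factor: peak amplitude of each subband (used to normalise samples).
    scale_factors = subband_peaks[:]
    # SMR: signal level minus masking level; subbands whose signal sits
    # below the mask are inaudible and get no bits at all.
    smr = [p - m for p, m in zip(subband_peaks, masking_thresholds)]
    bits = [0] * len(subband_peaks)
    need = smr[:]
    # Greedily give one bit at a time to the subband with the highest
    # remaining need; each bit granted reduces that need by ~6 dB.
    for _ in range(bit_pool):
        i = max(range(len(need)), key=lambda i: need[i])
        if need[i] <= 0:
            break
        bits[i] += 1
        need[i] -= 6
    return scale_factors, bits

peaks = [60, 55, 40, 20, 35, 10, 5, 2]    # per-subband peak levels (toy values)
masks = [30, 30, 35, 25, 20, 15, 10, 8]   # per-subband masking thresholds (toy)
sf, bits = allocate_bits(peaks, masks, bit_pool=12)
print(bits)  # masked subbands (SMR <= 0) receive zero bits
```

Note how subbands 3 and 5-7, whose peaks lie below their masks, receive no bits: this is exactly the saving the masking phenomena make possible.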

    8 Marks - BookWork


(c) The critical bandwidth for average human hearing is a constant 100 Hz for frequencies less than 500 Hz and increases (approximately) linearly by 100 Hz for each additional 500 Hz.

(i) Given a frequency of 300 Hz, what is the next highest (integer) frequency signal that is distinguishable by the human ear, assuming the latter signal is of a substantially lower amplitude?

Trick is to realise (remember?) the definition of the critical band:

Critical Band: the width of a masking area (curve) within which no signal may be heard, given a first carrier signal of higher amplitude within a given frequency range as defined above.

The critical band is 100 Hz for a 300 Hz signal, so the band ranges over 250 to 350 Hz.

So the next highest audible frequency is 351 Hz.

    4 Marks Unseen

(ii) Given a frequency of 5000 Hz, what is the next highest (integer) frequency signal that is distinguishable by the human ear, assuming the latter signal is of a substantially lower amplitude?

At 5000 Hz the critical bandwidth is 10 * 100 Hz = 1000 Hz.

So the range of the band is 4500 to 5500 Hz.

So the next highest audible frequency is 5501 Hz.
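Both parts follow from the same rule; a small sketch, using the piecewise-linear critical-bandwidth model stated in the question with the band centred on the masking tone:

```python
# Critical bandwidth model from the question: a constant 100 Hz below
# 500 Hz, then growing by 100 Hz for each additional 500 Hz (i.e. f/5).

def critical_bandwidth(f):
    return 100.0 if f < 500 else 100.0 * f / 500.0

def next_audible(f):
    # The masked band is centred on f; the first distinguishable integer
    # frequency lies just above the band's upper edge.
    upper = f + critical_bandwidth(f) / 2
    return int(upper) + 1

print(next_audible(300))   # 351
print(next_audible(5000))  # 5501
```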

    7 Marks Unseen


3. (a) What is the main difference between the H.261 and MPEG video compression algorithms?

H.261 has I- and P-frames only. MPEG introduces the additional B-frame, allowing backward (bidirectionally) interpolated prediction of frames.

    2 Marks - Bookwork

(b) MPEG has a variety of different standards, i.e. MPEG-1, MPEG-2, MPEG-4, MPEG-7 and MPEG-21. Why have such standards evolved? Give an example target application for each variant of the MPEG standard.

Different MPEG standards have been developed for different target domains that need different compression approaches, and more recently for formats for the integration and interchange of multimedia data.

MPEG-1 was targeted at Source Input Format (SIF) video: originally optimised to work at video resolutions of 352x240 pixels at 30 frames/sec (NTSC based) or 352x288 pixels at 25 frames/sec (PAL based), but other resolutions are possible.

MPEG-2 addressed issues directly related to digital television broadcasting.

MPEG-4 was originally targeted at very low bit-rate communication (4.8 to 64 kb/sec).

MPEG-7 is targeted at a Multimedia Content Description Interface.

MPEG-21 is targeted at a Multimedia Framework: describing and using multimedia content in a unified framework.

    8 Marks Bookwork


(c) Given the following two frames of an input video, show how MPEG would estimate the motion of the macroblock, highlighted in the first image, to the next frame.

For ease of computation in your solution: you may assume that all macroblock calculations may be performed over 4x4 windows. You may also restrict your search to 2 pixels in the horizontal and vertical directions around the original macroblock.

1 1 1 1 1 1 1 1
1 1 2 3 3 2 1 1
1 1 2 2 2 2 1 1
1 1 2 4 5 2 1 1
1 1 2 5 3 2 1 1
1 1 2 3 3 2 1 1
1 1 1 3 3 2 1 1
1 1 1 3 3 1 1 1

Frame n (Frame n+1 is shown below)

The basic idea is to search for the macroblock (MB) within a ±2 pixel window and find where the Sum of Absolute Differences (SAD) (or the Mean Absolute Error (MAE), which is computationally more expensive) is a minimum.

The SAD is given by:

For i = -2 to +2
  For j = -2 to +2

    SAD(i, j) = Σ_{k=0}^{N-1} Σ_{l=0}^{N-1} | C(x+k, y+l) - R(x+i+k, y+j+l) |

Here N = 2, (x, y) is the position of the original MB in the current frame C, and R is the reference frame region over which the SAD is computed.

It is sometimes applicable for an alpha mask to be applied to the SAD calculation to mask out certain pixels:

    SAD(i, j) = Σ_{k=0}^{N-1} Σ_{l=0}^{N-1} | C(x+k, y+l) - R(x+i+k, y+j+l) | * (alpha_C != 0)

In this case the alpha mask is not required.

1 1 1 1 1 1 1 1
1 1 1 1 1 1 1 1
1 1 2 1 2 2 2 2
1 1 2 1 4 3 3 2
1 1 2 1 4 3 4 3
1 1 2 1 4 4 5 4
1 1 2 1 4 5 4 5
1 1 2 1 2 4 4 4

Frame n+1


The search area is given by the dashed lines and an example window SAD by the bold dot-dash line (near the top right corner).

For each window the SAD score is (taking the top-left pixel as the window origin):

       -2  -1   0  +1  +2
  -2   12  12  12  11  11
  -1   11  11  11   6   7
   0   12  12   9   3   4
  +1   11  11   9   4   5
  +2   11  11  10   3   1

The minimum SAD (1) occurs at offset (+2, +2), so the motion vector is (+2, +2).
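The exhaustive search can be reproduced in a few lines. This is a sketch of the worked example only (a 2x2 block, per the solution's N = 2, with block origin at row 3, column 3 and a ±2 search window), not a general encoder:

```python
# Exhaustive block matching for the worked example above.

FRAME_N = [
    [1, 1, 1, 1, 1, 1, 1, 1],
    [1, 1, 2, 3, 3, 2, 1, 1],
    [1, 1, 2, 2, 2, 2, 1, 1],
    [1, 1, 2, 4, 5, 2, 1, 1],
    [1, 1, 2, 5, 3, 2, 1, 1],
    [1, 1, 2, 3, 3, 2, 1, 1],
    [1, 1, 1, 3, 3, 2, 1, 1],
    [1, 1, 1, 3, 3, 1, 1, 1],
]
FRAME_N1 = [
    [1, 1, 1, 1, 1, 1, 1, 1],
    [1, 1, 1, 1, 1, 1, 1, 1],
    [1, 1, 2, 1, 2, 2, 2, 2],
    [1, 1, 2, 1, 4, 3, 3, 2],
    [1, 1, 2, 1, 4, 3, 4, 3],
    [1, 1, 2, 1, 4, 4, 5, 4],
    [1, 1, 2, 1, 4, 5, 4, 5],
    [1, 1, 2, 1, 2, 4, 4, 4],
]

def sad(cur, ref, x, y, i, j, n):
    """Sum of absolute differences between the n x n block at (x, y) in
    cur and the block displaced by (i, j) in ref. Frames are [row][col]."""
    return sum(abs(cur[y + l][x + k] - ref[y + j + l][x + i + k])
               for k in range(n) for l in range(n))

def motion_vector(cur, ref, x, y, n=2, search=2):
    # Try every displacement in the search window; keep the minimum SAD.
    return min((sad(cur, ref, x, y, i, j, n), (i, j))
               for i in range(-search, search + 1)
               for j in range(-search, search + 1))

best_sad, (dx, dy) = motion_vector(FRAME_N, FRAME_N1, x=3, y=3)
print(best_sad, (dx, dy))  # minimum SAD is 1 at displacement (+2, +2)
```

The block [[4, 5], [5, 3]] at (3, 3) in frame n best matches [[4, 5], [5, 4]] at (5, 5) in frame n+1, confirming the motion vector (+2, +2).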


(d) Based upon the motion estimation, a decision is made on whether INTRA or INTER coding is used. What is the decision based on for the coding of the macroblock motion in (c)?

To determine INTRA/INTER mode we do the following calculation:

    MB_mean = ( Σ_{i=0}^{N-1} Σ_{j=0}^{N-1} C(i, j) ) / N

    A = Σ_{i=0}^{Nx-1} Σ_{j=0}^{Ny-1} | C(i, j) - MB_mean | * (alpha_C(i, j) != 0)

If A < (SAD_min - 2N), INTRA mode is chosen.

So for the above motion:

    MB_mean = 17/2 = 8.5
    A = 17

So 17 is not less than (1 - 4 =) -3, so we choose INTER frame coding.

    5 Marks Unseen
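The decision rule can be checked mechanically. A minimal sketch, using the solution's own convention of dividing the block sum by N rather than N² (note this yields A = 17 for this block; the INTER conclusion is unchanged either way):

```python
# INTRA/INTER mode decision for the worked macroblock, following the
# solution's convention MB_mean = sum(C) / N with N = 2.

def mode_decision(block, sad_min, n):
    flat = [v for row in block for v in row]
    mb_mean = sum(flat) / n                    # solution's convention: divide by N
    a = sum(abs(v - mb_mean) for v in flat)    # total deviation from the mean
    mode = "INTRA" if a < (sad_min - 2 * n) else "INTER"
    return mode, a

mode, a = mode_decision([[4, 5], [5, 3]], sad_min=1, n=2)
print(mode, a)  # INTER 17.0
```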


4. (a) What is the distinction between lossy and lossless data compression?

Lossless compression preserves the data undergoing compression exactly. Lossy compression aims to obtain the best possible fidelity for a given bit-rate, or to minimise the bit-rate required to achieve a given fidelity measure, but will not produce a complete facsimile of the original data.

    2 Marks Bookwork

(b) Briefly describe the four basic types of data redundancy that data compression algorithms can exploit in audio, image and video signals.

Four types of redundancy:

- Temporal -- in 1D data, 1D signals, audio etc.
- Spatial -- correlation between neighbouring pixels or data items.
- Spectral -- correlation between colour or luminance components. This uses the frequency domain to exploit relationships between the frequency of change in data.
- Psycho-visual / psycho-acoustic -- exploit perceptual properties of the human visual or aural system to compress data.

    8 Marks Bookwork

(c) Encode the following stream of characters using decimal arithmetic coding compression:

    MEDIA

    You may assume that characters occur with probabilities of

    M = 0.1, E = 0.3, D = 0.3, I = 0.2 and A = 0.1.

Sort the data into largest probabilities first and make cumulative probabilities:

E: [0, 0.3)   D: [0.3, 0.6)   I: [0.6, 0.8)   M: [0.8, 0.9)   A: [0.9, 1.0)

There are only 5 characters, so there are 5 segments of width determined by the probability of the related character.

The first character to be encoded is M, which is in the range [0.8, 0.9); therefore the final codeword is in the range 0.8 to 0.89999...

Each subsequent character subdivides the current range. So after coding M we get:

E: [0.8, 0.83)   D: [0.83, 0.86)   I: [0.86, 0.88)   M: [0.88, 0.89)   A: [0.89, 0.9)

To code E we take the range [0.8, 0.83) and subdivide it:

E: [0.8, 0.809)   D: [0.809, 0.818)   I: [0.818, 0.824)   M: [0.824, 0.827)   A: [0.827, 0.83)

The next range is for D, so we split the range [0.809, 0.818):

E: [0.809, 0.8117)   D: [0.8117, 0.8144)   I: [0.8144, 0.8162)   M: [0.8162, 0.8171)   A: [0.8171, 0.818)

The next character is I, so the range is [0.8144, 0.8162):

E: [0.8144, 0.81494)   D: [0.81494, 0.81548)   I: [0.81548, 0.81584)   M: [0.81584, 0.81602)   A: [0.81602, 0.8162)

The final character is A, which is in the range [0.81602, 0.8162).

So the completed codeword is any number in the range 0.81602 to 0.8162, e.g. 0.81602.
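The hand subdivision above can be checked with a short encoder. A sketch using Python floats in place of exact decimal arithmetic (so the bounds are only approximately 0.81602 and 0.8162):

```python
# Minimal decimal arithmetic encoder for the worked example.

RANGES = {"E": (0.0, 0.3), "D": (0.3, 0.6), "I": (0.6, 0.8),
          "M": (0.8, 0.9), "A": (0.9, 1.0)}

def arithmetic_encode(message, ranges):
    low, high = 0.0, 1.0
    for ch in message:
        # Narrow [low, high) to the sub-interval belonging to ch.
        width = high - low
        lo, hi = ranges[ch]
        low, high = low + width * lo, low + width * hi
    return low, high  # any number in [low, high) encodes the message

low, high = arithmetic_encode("MEDIA", RANGES)
print(low, high)  # ~0.81602 to ~0.8162
```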


    CM0340 SOLUTIONS

    CARDIFF UNIVERSITY

    EXAMINATION PAPER

    SOLUTIONS

    Academic Year: 2003-2004

    Examination Period: Autumn 2003

    Examination Paper Number: CM0340

    Examination Paper Title: Multimedia

    Duration: 2 hours

    Do not turn this page over until instructed to do so by the Senior Invigilator.

Structure of Examination Paper:
There are four pages.

    There are four questions in total.

    There are no appendices.

    The maximum mark for the examination paper is 100% and the mark obtainable for a

    question or part of a question is shown in brackets alongside the question.

Students to be provided with:
The following items of stationery are to be provided:

    One answer book.

    Instructions to Students:

    Answer THREE questions.

    The use of translation dictionaries between English or Welsh and a foreign language

    bearing an appropriate departmental stamp is permitted in this examination.


(c) For each of the following media types, graphics, images, audio and video, briefly discuss how Nyquist's Sampling Theorem affects the quality of the data, and the form in which sampling effects manifest themselves in the actual data.

Graphics

o Quality: not an issue with vector graphics.
o Sampling artifact: rendering may lead to aliasing effects in lines etc.

Images

o Quality: image size decreases, so less detail or sampling artifacts.
o Sampling artifact: aliasing effects in blocky images.

Audio

o Quality: lack of clarity in high frequencies; telephonic voices at low sampling frequencies.
o Sampling artifact: digital noise present in the signal; loss of high frequencies, or poor representation of high frequencies giving audio aliasing (should be filtered out before sampling).

Video

o Quality: video frame size decreases, so less detail or sampling artifacts; motion blur or loss of motion detail.
o Sampling artifact: aliasing effects in frame images, jittery motion tracking etc.

12 Marks (3 per media type) --- Unseen: extended reasoning on a few parts of the course


(d) Calculate the uncompressed digital output, i.e. data rate, if a video signal is sampled using the following values:

25 frames per second
160 x 120 pixels
True (full) colour depth

True colour = 24 bits (3 bytes) per pixel.

So the number of bytes per second is

3 * 160 * 120 * 25 = 1,440,000 bytes, or about 1.37 MB.

3 Marks --- Unseen: Application of basic knowledge

(e) If a suitable CD stereo quality audio signal is included with the video signal in part (d), what compression ratio would be needed to be able to transmit the signal on a 128 kbps channel?

Stereo audio = 44100 * 2 channels * 2 bytes (16 bits) = 176,400 bytes per second.

So the uncompressed stream is 1,440,000 + 176,400 = 1,616,400 bytes per second.

128 kbps is kilobits per second,

so the compressed stream may be at most (128 * 1024) / (1,616,400 * 8) ≈ 0.01 of the original, i.e. a compression ratio of about 99:1 is needed.
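The two calculations in (d) and (e) can be done explicitly, assuming, as the solution does, 24-bit colour, 44.1 kHz 16-bit stereo audio, and 1 kb = 1024 bits:

```python
# Data rate for the video signal and the compression ratio needed to fit
# video + CD stereo audio into a 128 kbps channel.

video_bytes_per_sec = 3 * 160 * 120 * 25        # 24-bit colour: 1,440,000 B/s
audio_bytes_per_sec = 44100 * 2 * 2             # stereo, 16-bit: 176,400 B/s
total_bits_per_sec = (video_bytes_per_sec + audio_bytes_per_sec) * 8

channel_bits_per_sec = 128 * 1024               # 128 kbps channel
ratio = total_bits_per_sec / channel_bits_per_sec
print(video_bytes_per_sec, round(ratio, 1))     # compression of roughly 99:1 needed
```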

    3Marks --- Unseen: Application of basic knowledge


2. (a) What characteristics of the human visual system can be exploited for the compression of colour images and video?

The eye is basically sensitive to colour intensity:

o Each neuron is either a rod or a cone. Rods are not sensitive to colour.
o Cones come in 3 types: red, green and blue.
o Each type responds differently -- non-linearly, and not equally for R, G and B -- to various frequencies of light.

    5 Marks --- Bookwork


(b) What is the YIQ colour model, and why is it an appropriate colour model to use in conjunction with compression methods such as JPEG and MPEG?

o YIQ has its origins in colour TV broadcasting.
o Y (luminance) is the CIE Y primary: Y = 0.299R + 0.587G + 0.114B.
o The other two components are: I = 0.596R - 0.275G - 0.321B and Q = 0.212R - 0.528G + 0.311B.
o The YIQ transform:

    [Y]   [0.299  0.587  0.114] [R]
    [I] = [0.596 -0.275 -0.321] [G]
    [Q]   [0.212 -0.528  0.311] [B]

How to exploit this for compression:

o The eye is most sensitive to Y, next to I, next to Q.
o Quantise with more bits for Y than for I or Q.

4 Marks (2 for the transform (matrix or eqn) and 2 for the compression scheme) --- Bookwork
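The transform is a direct dot product per pixel. A sketch using the coefficients quoted in this solution (inputs normalised to [0, 1]; note these quoted coefficients make Q for pure white come out at -0.005 rather than exactly 0):

```python
# RGB -> YIQ using the coefficients given above.

def rgb_to_yiq(r, g, b):
    y = 0.299 * r + 0.587 * g + 0.114 * b
    i = 0.596 * r - 0.275 * g - 0.321 * b
    q = 0.212 * r - 0.528 * g + 0.311 * b
    return y, i, q

# White is all luminance and (almost) no chroma.
y, i, q = rgb_to_yiq(1.0, 1.0, 1.0)
print(round(y, 6), round(i, 6), round(q, 6))
```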


    (c) Given the following YIQ image values:

    128 126 127 129 55 66 54 54 44 44 55 55

    124 123 124 124 56 57 56 56 44 44 55 55

    130 136 132 132 45 56 58 49 34 34 36 35

    154 143 132 132 34 36 39 37 35 35 34 34

    Y I Q

    What are the corresponding chroma subsampled values for a

    (i) 4:2:2 subsampling scheme

(ii) 4:1:1 subsampling scheme
(iii) 4:2:0 subsampling scheme

Basic idea required (from notes):

Chroma Subsampling

o 4:2:2 -> colour signals horizontally subsampled by a factor of 2. Each pixel is two bytes, e.g., (Cb0, Y0) (Cr0, Y1) (Cb2, Y2) (Cr2, Y3) (Cb4, Y4) ...
o 4:1:1 -> horizontally subsampled by a factor of 4.
o 4:2:0 -> subsampled in both the horizontal and vertical axes by a factor of 2 between pixels.


(i) 4:2:2 subsampling scheme

Take every second horizontal pixel in I and Q space.

Full YIQ:

128 126 127 129   55 66 54 54   44 44 55 55
124 123 124 124   56 57 56 56   44 44 55 55
130 136 132 132   45 56 58 49   34 34 36 35
154 143 132 132   34 36 39 37   35 35 34 34

4:2:2 YIQ:

128 126 127 129   55 54   44 55
124 123 124 124   56 56   44 55
130 136 132 132   45 58   34 36
154 143 132 132   34 39   35 34


(ii) 4:1:1 subsampling scheme

Take every fourth pixel in the horizontal direction.

Full YIQ:

128 126 127 129   55 66 54 54   44 44 55 55
124 123 124 124   56 57 56 56   44 44 55 55
130 136 132 132   45 56 58 49   34 34 36 35
154 143 132 132   34 36 39 37   35 35 34 34

4:1:1 YIQ:

128 126 127 129   55   44
124 123 124 124   56   44
130 136 132 132   45   34
154 143 132 132   34   35

(iii) 4:2:0 subsampling scheme

AVERAGE over every 2x2 block.

Full YIQ:

128 126 127 129   55 66 54 54   44 44 55 55
124 123 124 124   56 57 56 56   44 44 55 55
130 136 132 132   45 56 58 49   34 34 36 35
154 143 132 132   34 36 39 37   35 35 34 34

4:2:0 YIQ:

128 126 127 129   59 55   44 55
124 123 124 124
130 136 132 132   43 46   35 35
154 143 132 132

15 Marks --- Unseen: Practical application of bookwork knowledge
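The three schemes can be sketched directly on the 4x4 I plane from the question. Here 4:2:0 averages each 2x2 block and rounds half up, which reproduces the 59, 43 and 46 values in the worked answer:

```python
# Chroma subsampling schemes applied to one 4x4 chroma plane.

I_PLANE = [[55, 66, 54, 54],
           [56, 57, 56, 56],
           [45, 56, 58, 49],
           [34, 36, 39, 37]]

def sub_422(plane):            # keep every 2nd sample horizontally
    return [row[::2] for row in plane]

def sub_411(plane):            # keep every 4th sample horizontally
    return [row[::4] for row in plane]

def sub_420(plane):            # average each 2x2 block, rounding half up
    h, w = len(plane), len(plane[0])
    return [[int((plane[r][c] + plane[r][c + 1] +
                  plane[r + 1][c] + plane[r + 1][c + 1]) / 4 + 0.5)
             for c in range(0, w, 2)]
            for r in range(0, h, 2)]

print(sub_422(I_PLANE))  # [[55, 54], [56, 56], [45, 58], [34, 39]]
print(sub_411(I_PLANE))  # [[55], [56], [45], [34]]
print(sub_420(I_PLANE))  # [[59, 55], [43, 46]]
```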


3. (a) What is the distinction between lossy and lossless data compression?

Lossless Compression

Where data is compressed and can be reconstituted (uncompressed) without loss of detail or information. These are also referred to as bit-preserving or reversible compression systems.

Lossy Compression

Where the aim is to obtain the best possible fidelity for a given bit-rate, or to minimise the bit-rate to achieve a given fidelity measure. Video and audio compression techniques are most suited to this form of compression.

    2 Marks Bookwork


(b) Briefly describe two repetitive suppression algorithms and give one practical use of each algorithm.

1. Simple Repetition Suppression

If a series of n successive identical tokens appears in a sequence, we can replace these with one token and a count of the number of occurrences. We usually need a special flag to denote when the repeated token appears.

For example:

89400000000000000000000000000000000

can be replaced with

894f32

where f is the flag for zero.

Compression savings depend on the content of the data.

Applications of this simple compression technique include:

o Silence in audio data, pauses in conversation etc.
o Bitmaps
o Blanks in text or program source files
o Backgrounds in images
o Other regular image or data tokens

2. Run-length Encoding

In this method, sequences of (image) elements x1, x2, ..., xn are mapped to pairs (c1, l1), (c2, l2), ..., (cn, ln), where ci represents an image intensity or colour and li the length of the i-th run of pixels.

For example, the original sequence:

111122233333311112222

can be encoded as:

(1,4),(2,3),(3,6),(1,4),(2,4)

The savings are dependent on the data. In the worst case (random noise) the encoding is heavier than the original file: 2 * integer rather than 1 * integer, if the data is represented as integers.


Applications:

o This encoding method is frequently applied to images (or pixels in a scan line).
o It is a small compression component used in JPEG compression.

    10 Marks Bookwork


(c) Briefly state the LZW compression algorithm and show how you would use it to encode the following stream of characters:

MYMEMYMO

You may assume that single character tokens are coded by their ASCII codes, as per the original LZW algorithm. However, for the purpose of the solution you may simply output the character rather than the ASCII value.

The LZW compression algorithm can be summarised as follows:

w = NIL;
while ( read a character k )
{
    if wk exists in the dictionary
        w = wk;
    else
    {
        add wk to the dictionary;
        output the code for w;
        w = k;
    }
}
output the code for w;   // flush the final phrase at end of input

The original LZW used a dictionary with 4K entries; the first 256 (0-255) are the ASCII codes.

Encoding of MYMEMYMO:

W     K     Output       Index   Symbol
_______________________________________
nil   M
M     Y     M (ASCII)    256     MY
Y     M     Y            257     YM
M     E     M            258     ME
E     M     E            259     EM
M     Y     (w = MY, in dictionary, no output)
MY    M     <256>        260     MYM
M     O     M            261     MO
O     eof   O

So the token stream is:

<M> <Y> <M> <E> <256> <M> <O>

    12 Marks Unseen: Application of Algorithm
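The trace above can be verified with a direct transcription of the algorithm. Single characters are emitted as themselves, as the question permits; note the final pending phrase w must be flushed, which is what produces the trailing <O>:

```python
# LZW encoder; single-character codes are output as the characters
# themselves rather than their ASCII values.

def lzw_encode(data):
    # Codes 0-255 map single characters to themselves.
    dictionary = {chr(c): chr(c) for c in range(256)}
    next_code = 256
    w, out = "", []
    for k in data:
        wk = w + k
        if wk in dictionary:
            w = wk                       # extend the current phrase
        else:
            out.append(dictionary[w])    # emit code for the known prefix
            dictionary[wk] = next_code   # register the new phrase
            next_code += 1
            w = k
    if w:
        out.append(dictionary[w])        # flush the final phrase
    return out

print(lzw_encode("MYMEMYMO"))  # ['M', 'Y', 'M', 'E', 256, 'M', 'O']
```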

    4. (a) What is the basic format of an MHEG application?


An MHEG-5 application is made up of Scenes and of objects that are common to all Scenes.

A Scene contains a group of objects, called Ingredients, that represent information (graphics, sound, video, etc.) along with localised behaviour based on events firing (e.g., the 'Left' button being pushed activating a sound). At most one Scene is active at any one time. Navigation in an application is done by transitioning between Scenes.

    2 Marks Bookwork

(b) Briefly describe the MHEG client-server interaction and the role that the MHEG engine plays in this process.

Client-Server Interaction (4 Marks):

o The server streams out content requested by the MHEG application.
o The client's Run Time Engine (RTE), embedded in firmware, processes MHEG, and deals with the streaming of sourced data and its formatting for presentation.

Run Time Engine (RTE) main functions (6 Marks):

o The RTE is the kernel of the client's architecture.
o It issues I/O and data access requests to the client.
o It prepares the presentation and handles accessing, decoding and managing MHEG-5 objects in their internal format.
o Interpretation of MHEG objects: the actual presentation, which is based on an event loop where events trigger actions.
o These actions then become requests to the Presentation layer, along with other actions that internally affect the engine.

10 Marks Bookwork

(c) Using suitable fragments of MHEG code, illustrate how you would code the following presentation in MHEG:


Scene 1   Scene 2

The above presentation consists of two scenes. Scene 1 plays some video and is overlaid by some text information, and a next button is provided so that the user may elect to jump to Scene 2. Scene 2 plays some video and is overlaid by a visual prompt which, when selected, displays some further text information.

Note that the precise MHEG syntax and object attributes and attribute values are not absolutely required in your solutions. Rather you should concentrate on giving indicative object attribute values. In essence the structure of the code is more important than precise syntax.

Basic idea:

Need startup of the application --- here done in startup.mheg.

This fires up scene1.mheg --- only essential MHEG objects are listed. A button event triggers scene2.mheg. The important point is the button and link transition.

scene2.mheg --- fires up on the button event; fires up moreinfo.mheg on the visual prompt event trigger. Only essential MHEG objects are listed. The important point is some graphics/bitmap overlay icon plus a hot spot for the link.

moreinfo.mheg --- simply full of text. Not that important; the transition to it is what the question requires.

startup.mheg (3 Marks):

{:Application ("startup" 0)
  // :OnStartUp (:TransitionTo(("scene1.mheg" 0)))
  :Items (
    {:Link 1
      :EventSource 1
      :EventType IsRunning
      :LinkEffect (:TransitionTo(("scene1.mheg" 0)))
    }
  )
}

  • 7/22/2019 Multimedia multimedia Bible

    73/163

    CM0340 SOLUTIONS

scene1.mheg (object attributes and attribute values just indicative) 5 Marks:

{:Scene ("scene1.mheg" 0)
  :Items (
    {:Video 0
      :InitiallyActive true
      :OrigContent :ContentRef ("waterfall.mov")
      :OrigBoxSize 120 120
      :OrigPosition 225 175
      :ComponentTag 100
      :Termination loop
    }
    {:Text 1
      :OrigContent 'Some Text'
      :OrigBoxSize 95 95
      :OrigPosition 0 175
      :FontAttributes Bold.14
      :TextColour black
      :HJustification centre
      :VJustification centre
      :TextWrapping true
    }
    {:PushButton 3
      :OrigBoxSize 100 60
      :OrigPosition 540 280
      :ButtonRefColour gray
      :OrigLabel "next"
    }
    {:Link 4
      :EventSource 3 :EventType IsSelected
      :LinkEffect (:TransitionTo(("scene2.mheg" 0)))
    }
  )
}


scene2.mheg (object attributes and attribute values just indicative) 5 Marks:

{:Scene ("scene2.mheg" 0)
  :Items (
    {:Video 0
      :InitiallyActive true
      :OrigContent :ContentRef ("painter.mov")
      :OrigBoxSize 120 120
      :OrigPosition 225 175
      :ComponentTag 100
      :Termination loop
    }
    {:Bitmap 1
      :OrigContent :ContentRef ("overlay.gif")
      :OrigBoxSize 51 39 // 0 0
      :OrigPosition 10 15
      :Tiling false
    }
    {:HotSpot 2
      :OrigBoxSize 100 60
      :OrigPosition 540 280
    }
    {:Link 3
      :EventSource 2 :EventType IsSelected
      :LinkEffect (:TransitionTo(("moreinfo.mheg" 0)))
    }
  )
}


moreinfo.mheg (2 Marks):

{:Scene ("moreinfo.mheg" 0)
  :Items (
    {:Text 1
      :OrigContent 'Some Text'
      :OrigBoxSize 95 95
      :OrigPosition 0 175
      :FontAttributes Bold.14
      :TextColour black
      :HJustification centre
      :VJustification