multimedia and multi-modal


Upload: marvin-njenga

Post on 14-Nov-2014


Page 1: Multimedia and Multi-modal

MULTIMEDIA AND MULTIMODAL APPLICATIONS

The use of multiple sensory channels increases the bandwidth of the interaction between the human and the computer. It makes human-computer interaction more like the interaction between humans in their everyday environment. Discuss this statement, explaining the role and application of multi-sensory input in human-computer interaction.

Multi-sensory systems use more than one sensory channel in interaction, e.g. sounds, text, hypertext, animation, video, gestures and vision. They are used in a range of applications and are particularly good for users with special needs and for virtual reality.

Speech Recognition: currently useful? For a single user, limited-vocabulary systems can work satisfactorily. No general-user, general-vocabulary systems are commercially successful yet. There is large potential, however:
• When the user's hands are already occupied - manufacturing, for example
• For users with physical disabilities
• Lightweight, mobile devices

Speech Synthesis: Speech synthesis refers to the generation of speech. It is useful as a natural and familiar way of receiving information.

Problems - similar to those of recognition, particularly prosody (the alteration in tone and quality which allows variations in emphasis, stress, pauses and pitch to impart more meaning to sentences).

Additional problems:
• intrusive - either requires headphones or creates noise in the workplace
• transient - harder to review and browse

Successful in certain constrained applications, usually when the user is particularly motivated to overcome the problems and has few alternatives:
• screen readers - read the textual display to the user; utilised by visually impaired people
• warning signals - spoken information is sometimes presented to pilots whose visual and haptic skills are already fully occupied

Non-Speech Sounds: Examples are boings, bangs, squeaks, clicks etc.
• commonly used in interfaces to provide warnings and alarms

There is evidence to show they are useful:
• fewer typing mistakes with key clicks
• video games are harder without sound

Dual mode displays: information is presented along two different sensory channels.
• Allows for redundant presentation of information - the user can utilise whichever they find easiest
• Allows resolution of ambiguity in one mode through information contained in the other

Sound is especially good for transient information and background status information. It is also language/culture independent, unlike speech.


Example: Sound can be used as a redundant mode in the Apple Macintosh; almost any user action (file selection, window active, disk insert, search error, copy complete, etc.) can have a different sound associated with it.

Auditory Icons: Use natural sounds to represent different types of object or action. Natural sounds have associated semantics which can be mapped onto similar meanings in the interaction.
• e.g. throwing something away can be represented by the sound of something smashing

Problem: not all things have associated meanings, e.g. copying.

Application: SonicFinder for the Macintosh. Items and actions on the desktop have associated sounds:
• Folders have a papery noise
• Moving files is accompanied by a dragging sound
• Copying (a problem one) has the sound of a liquid being poured into a receptacle; the rising pitch indicates the progress of the copy
• Big files have a louder sound than smaller ones

Additional information can also be presented:
• Muffled sounds indicate the object is obscured or an action is in the background
• Use of stereo allows positional information to be added
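The copying example maps two quantities onto sound: progress onto pitch and file size onto loudness. A minimal sketch of such a mapping follows. It is not SonicFinder's actual implementation; the function name, the frequency range and the size ceiling are all assumptions chosen for illustration.

```python
def copy_sound(progress, file_size_bytes,
               low_hz=220.0, high_hz=880.0, max_size_bytes=10_000_000):
    """Return (pitch_hz, volume) for a copy operation in progress.

    All numeric defaults are hypothetical, picked only for illustration.
    progress: fraction of the copy completed, 0.0 to 1.0.
    """
    # Clamp progress so out-of-range values cannot produce odd pitches.
    progress = min(max(progress, 0.0), 1.0)
    # Rising pitch indicates how far the copy has got.
    pitch_hz = low_hz + (high_hz - low_hz) * progress
    # Bigger files sound louder, up to a ceiling; small files stay audible.
    size_fraction = min(file_size_bytes / max_size_bytes, 1.0)
    volume = 0.2 + 0.8 * size_fraction
    return pitch_hz, volume


start_pitch, _ = copy_sound(0.0, 1_000_000)
end_pitch, _ = copy_sound(1.0, 1_000_000)
_, small_volume = copy_sound(0.5, 100_000)
_, big_volume = copy_sound(0.5, 8_000_000)
```

A linear mapping is used here only because it is the simplest choice; a real design would need to be tuned by ear.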

Earcons: Synthetic sounds used to convey information. Structured combinations of notes, called motives, are used to represent actions and objects. Motives are combined to provide rich information.
• Compound earcons: multiple motives combined to make one more complicated earcon
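The idea of building a compound earcon out of motives can be sketched as a small data structure. This is only an illustration under assumed names: the `Motive` class, the note encoding and the two example motives are hypothetical and do not come from any particular earcon system.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Motive:
    """A short, structured sequence of notes standing for one action or object."""
    name: str
    notes: tuple  # each note is (pitch_name, duration_in_beats)


def compound_earcon(*motives):
    """Play the motives one after another: return their notes as one sequence."""
    notes = []
    for motive in motives:
        notes.extend(motive.notes)
    return notes


# Hypothetical motives, invented for this example.
CREATE = Motive("create", (("C4", 0.5), ("E4", 0.5)))
FILE = Motive("file", (("G4", 1.0),))

# "create" followed by "file" forms a compound earcon for "create file".
create_file = compound_earcon(CREATE, FILE)
```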

Family earcons: Similar types of earcons represent similar classes of action or similar objects: the family of “errors” would contain syntax and operating system errors. Family earcons are easily grouped and refined due to their compositional and hierarchical nature. They are harder to associate with the interface task since there is no natural mapping.

Handwriting recognition: Handwriting is another communication mechanism which we are used to.

Technology: Handwriting consists of complex strokes and spaces. It is captured by a digitising tablet - strokes are transformed to a sequence of dots.
• Large-scale tablets are available, more suitable for digitising maps and technical drawings
• Smaller devices, some incorporating thin screens to display the information, are becoming available, e.g. those produced by Apple as personal organizers


Recognition: Problems:
• personal differences in letter formation
• co-articulation effects

Systems that are trained on a few users, with separated letters, have had limited success. Generic multi-user recognition systems for naturally written text are not currently accurate enough to be commercially successful.

Text and Hypertext: Text is a common form of output, and very useful in many situations.
• imposes a strict linear progression on the reader, according to the author’s ideas of what is best - this may not be ideal

Hypertext structures blocks of text into a mesh or network that can be traversed in many different ways.
• allows a user to follow their own ideas and concepts through information
• hypertext systems comprise:
  • a number of pages, and
  • links, that allow one page to be accessed from another

Hypermedia: Hypermedia systems are hypertext systems that incorporate additional media, such as illustrations, photographs, video and sound. They are particularly useful for educational purposes.
• Animation and graphics can allow the user to see things happen as well as read
• Hypertextual structure allows users to explore at their own pace, following threads that interest them

Problems:
• “Lost in hyperspace” - users can be unsure as to where in the hypertext web they are. Maps of the hypertext are a partial solution, but since hypertexts can be large these can be daunting too
• Incomplete coverage of information - as there are so many different routes through the hypertext, it is possible to miss out chunks by taking routes that avoid these areas
• Difficult to print out and take away - printed documents require a linear structure; it can be difficult to get the relevant information printed out in a neat manner
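The pages-and-links structure, and the incomplete-coverage problem, can be illustrated with a small graph sketch. The page names and links below are invented for the example, and `reachable` and `missed_pages` are hypothetical helper names.

```python
from collections import deque

# A tiny hypothetical hypertext: each page maps to the pages it links to.
links = {
    "home": ["intro", "history"],
    "intro": ["details"],
    "history": ["home", "appendix"],
    "details": [],
    "appendix": ["details"],
}


def reachable(links, start):
    """Return every page that can be reached from `start` by following links."""
    seen = {start}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return seen


def missed_pages(links, route):
    """Return the pages a reader never saw on the route they actually took."""
    return set(links) - set(route)


# Every page is reachable from "home", yet one plausible route skips two of them.
all_pages = reachable(links, "home")
skipped = missed_pages(links, ["home", "intro", "details"])
```

Here all five pages are reachable, but the reader who goes home, intro, details never sees "history" or "appendix", which is the incomplete-coverage problem in miniature.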

Animation: Animation refers to the addition of motion to images, which change and move in time.

Simple examples:
• clocks
  Digital face - seconds flick past
  Analogue face - second hand sweeps round constantly
  Salvador Dali clock - digital numbers warp and melt, one digit into the next
• cursor
  Hourglass/watch/spinning disc indicates the system is busy
  Flashing cursor indicates typing position clearly
  Different types of cursor pointer indicate different functionality available, or a different mode

Animation is used to great effect to indicate temporally varying information.


Animation is useful in education and training: it allows users to see things happening, as well as providing interesting and entertaining images in their own right.

Video and Digital Video: Compact disc technology is revolutionizing multimedia systems: large amounts of video, graphics, sound and text can be stored and easily retrieved on a relatively cheap and accessible medium.

There are different approaches, characterised by different compression techniques that allow more data to be squeezed onto the disc:
• CD-I: excellent for full-screen work. Limited video and still image capability; targeted at domestic market
• CD-XA (eXtended Architecture): development of CD-I, better digital audio and still images
• DVI (Digital Video Interactive)/UVC (Universal Video Communications): support full motion video

Utilising animation and video: Animation and video are potentially powerful tools.
• Notice the success of television and arcade games

However, the standard approaches to interface design do not take into account the full possibilities of such media. We will probably only start to reap the full benefit from this technology when we have much more experience. We also need to learn from the masters of this new art form: interface designers will need to acquire the skills of film makers and cartoonists as well as artists and writers.

Applications: Users with special needs have specialized requirements which are often well served by multimedia and/or multimodal systems.
• Visual impairment - screen readers, SonicFinder
• Physical disability - speech input, gesture recognition, predictive systems (e.g. Reactive Keyboard)
• Learning disabilities (e.g. dyslexia) - speech input, output

Virtual Reality: Multimedia multimodal interaction at its most extreme, VR is the computer simulation of a world in which the user is immersed.
• Headsets allow the user to “see” the virtual world
• Gesture recognition is achieved with the DataGlove (a lycra glove with optical sensors that measure hand and finger positions)
• Eyegaze allows users to indicate direction with their eyes alone

Multi-modal from Multi-media systems.

Multimodal System: A multimodal system supports communication with the user through different modalities such as voice, gesture and typing. Literally, 'multi' refers to 'more than one', and the term 'modal' may cover the notion of 'modality' as well as that of 'mode'. Modality refers to the type of communication channel used to convey or acquire information. It also covers the way an idea is expressed or perceived, or the manner in which an action is performed.


Mode refers to a state that determines the way information is interpreted to extract or convey meaning. The modality defines the type of data exchanged, whereas the mode determines the context in which the data is interpreted. If we take a system-centred view, multimodality is therefore the capacity of the system to communicate with a user along different types of communication channels and to extract and convey meaning automatically. Both multimedia and multimodal systems use multiple communication channels, but a multimodal system is in addition able to automatically model the content of the information at a high level of abstraction. A multimodal system strives for meaning.
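The distinction between modality (the type of data exchanged) and mode (the context in which it is interpreted) can be shown with a toy example: the same keystroke means different things in different modes. The function, the mode names and the command table are hypothetical and exist only for this sketch.

```python
def interpret(modality, data, mode):
    """Interpret one piece of input data according to the current mode.

    modality: the channel the data arrived on (only "keyboard" is handled here).
    mode: the state that decides what the data means ("insert" or "command").
    """
    if modality != "keyboard":
        raise ValueError(f"unsupported modality: {modality}")
    if mode == "insert":
        # In insert mode a keystroke is simply text to add.
        return ("insert_text", data)
    if mode == "command":
        # In command mode the same keystroke selects an action.
        commands = {"d": "delete_line", "q": "quit"}  # hypothetical bindings
        return (commands.get(data, "unknown_command"), None)
    raise ValueError(f"unknown mode: {mode}")


# Same modality, same data, different mode, different meaning.
as_text = interpret("keyboard", "d", "insert")
as_command = interpret("keyboard", "d", "command")
```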

A multimodal user interface supports multiple computer input and output channels, e.g. using speech together with pen-based gestures. Multimodal computer interaction can be viewed from two perspectives: the human-centred and the technology-centred. According to the human-centred perspective, multimodal systems should support more than one sensory and response modality of the users. The technology-centred approach defines a multimodal system as one that supports the concurrent combination of (input) modes; alternatively, it should at least specify which mode is operational in each situation. While interacting with a multimodal system, users receive multimodal input and are able to respond using whichever modalities provide a convenient means of interaction. Whereas in multimedia systems the user has to adapt to the system’s perceptual capabilities, in multimodal systems the system adapts to the preferences and needs of the user.

A successful interaction with a multimodal system would be one that provides the user with procedures unified into an integrated experience. In the case of educational technology, a successful multimodal interaction would be one where users could overcome the difficulties they have while interacting with technology and are able to concentrate on the content of the information provided. In such a case the technology would fulfil its main aim of becoming the artifact that provides information/knowledge to the user. From the users’ perspective, users could unify their experience of interacting with technology into an integrated one that would focus on learning.

Multimedia system: From the user's point of view, multimedia means that information can also be represented as audio signals or moving images. 'Multimedia' focuses on the medium or technology rather than the application or user.

A multimedia user interface supports multiple outputs only, e.g. text with audio or tactile information provided to the user. As a result, multimedia research is a subset of multimodal research.

From the system’s point of view, a multimedia system is also multimodal because it provides, via different media, the user with multimodal output, i.e. audio and visual information, and multimodal input, e.g. typing with the keyboard or clicking the mouse. From the user’s point of view, a multimedia system makes users receive multimodal information. However, they can respond only by using specific media, e.g. keyboard and mouse, which are not adaptable to different users or contexts of use.


All computer systems, single-user or multi-user, interact with the work groups and organizations in which they are used. However, group working is more complex than that of a single person. Discuss this statement and briefly explain how organizational factors can make or break groupware or single-user systems.

Groupware is technology designed to facilitate the work of groups. This technology may be used to communicate, cooperate, coordinate, solve problems, compete, or negotiate. While traditional technologies like the telephone qualify as groupware, the term is ordinarily used to refer to a specific class of technologies relying on modern computer networks, such as email, newsgroups, videophones, or chat.

Groupware technologies are typically categorized along two primary dimensions:

1. Whether users of the groupware are working together at the same time ("real-time" or "synchronous" groupware) or different times ("asynchronous" groupware), and

2. Whether users are working together in the same place ("colocated" or "face-to-face") or in different places ("non-colocated" or "distance").
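These two dimensions form a simple two-by-two classification, which can be sketched as a small function. The function name and the label strings are assumptions; the two example systems are the email and videophone cases mentioned earlier.

```python
def classify_groupware(same_time, same_place):
    """Place a groupware system on the time and place dimensions."""
    time_label = "synchronous" if same_time else "asynchronous"
    place_label = "colocated" if same_place else "distance"
    return f"{time_label}, {place_label}"


# Email: used at different times, from different places.
email = classify_groupware(same_time=False, same_place=False)
# Videophone: used at the same time, from different places.
videophone = classify_groupware(same_time=True, same_place=False)
```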

For broadly targeted groupware applications, such as videophones or email, understanding users can boil down to understanding how human beings communicate in the first place. A design is also best informed by conducting user studies on system prototypes. In these cases user testing is often significantly more difficult than with single-user systems for the following reasons:

o Organizing and scheduling for groups is more difficult than for individuals.
o Group interaction style is hard to select for beforehand, whereas individual characteristics are often possible to determine before a study is conducted.
o Pre-established groups vary in interaction style, and the length of time they've been a group affects their communication patterns.
o New groups change quickly during the group formation process.
o Groups are dynamic; roles change.
o Many studies need to be long-term, especially when studying asynchronous groupware.
o Modifying prototypes can be technically difficult because of the added complexity of groupware over single-user software.
o In software for large organizations, testing new prototypes can be difficult or impossible because of the disruption caused by introducing new versions into an organization.

Adoption and Acceptance: Many groupware systems simply cannot be successful unless a critical mass of users chooses to use the system. Having a videophone is useless if you're the only one who has it. Two of the most common reasons for failing to achieve critical mass are lack of interoperability and the lack of appropriate individual benefit.


Interoperability: Lack of interoperability/compatibility. Compatibility issues lead to general wariness among customers, who want to wait until a clear standard has emerged.

Avoiding Abuse: Most people are familiar with the problem of spamming with email. Some other common violations of social protocol include: taking inappropriate advantage of anonymity, sabotaging group work, or violating privacy.

Customization and Grounding: When groups are working together with the same information, they may individually desire customized views. The challenge of customized views is to support grounding: the establishment of a common ground or shared understanding of what information is known and shared between the different users.

Groupware offers significant advantages over single-user systems. These are some of the most common reasons people want to use groupware:

o To facilitate communication: make it faster, clearer, more persuasive
o To enable communication where it wouldn't otherwise be possible
o To enable telecommuting
o To cut down on travel costs
o To bring together multiple perspectives and expertise
o To form groups with common interests where it wouldn't be possible to gather a sufficient number of people face-to-face
o To save time and cost in coordinating group work
o To facilitate group problem-solving
o To enable new modes of communication, such as anonymous interchanges or structured interactions

Organisational issues: Organisational factors can make or break groupware. Studying the work group is not sufficient: any system is used within a wider context, and the crucial people need not be direct users. Before installing a new system, the designer must understand:

o Who benefits
o Who puts in effort
o The balance of power in the organization and how it will be affected

Even when groupware is successful, it may be difficult to measure that success.