Inhabited Information Spaces: Living with your Data



Computer Supported Cooperative Work

Springer London Berlin Heidelberg New York Hong Kong Milan Paris Tokyo



Also in this series


Gerold Riempp
Wide Area Workflow Management
3-540-7643-4

Celia T. Romm and Fay Sudweeks (Eds)
Doing Business Electronically
3-540-76159-4

Fay Sudweeks and Celia T. Romm (Eds)
Doing Business on the Internet
1-85233-030-9

Elizabeth F. Churchill, David N. Snowdon and Alan J. Munro (Eds)
Collaborative Virtual Environments
1-85233-244-1

Christine Steeples and Chris Jones (Eds)
Networked Learning
1-85233-471-1

Barry Brown, Nicola Green and Richard Harper (Eds)
Wireless World
1-85233-477-0

Reza Hazemi and Stephen Hailes (Eds)
The Digital University – Building a Learning Community
1-85233-478-9

Elayne Coakes, Dianne Willis and Steve Clark (Eds)
Knowledge Management in the SocioTechnical World
1-85233-441-X

Ralph Schroeder (Ed.)
The Social Life of Avatars
1-85233-461-4

J.H. Erik Andriessen
Working with Groupware
1-85233-603-X

Paul Kirschner, Chad Carr and Simon Buckingham Shum (Eds)
Visualising Argumentation
1-85233-664-1

Christopher Lueg and Danyel Fisher (Eds)
From Usenet to CoWebs
1-85233-532-7

Kristina Höök, David Benyon and Alan J. Munro (Eds)
Designing Information Spaces: The Social Navigation Approach
1-85233-661-7

Bjørn Erik Munkvold
Implementing Collaboration Technologies in Industry
1-85233-418-5

Related Title

Richard Harper (Ed.)
Inside the Smart Home
1-85233-688-9

A list of out of print titles is available at the end of the book


David N. Snowdon, Elizabeth F. Churchill and Emmanuel Frécon (Eds)

Inhabited Information Spaces: Living with your Data

With 94 Figures



David N. Snowdon, BSc (Hons), MSc, PhD
Xerox Research Centre Europe, 6 Chemin de Maupertuis, 38240 Meylan, France.

Elizabeth F. Churchill, BSc, MSc, PhD
FX Palo Alto Laboratory Inc., 3400 Hillview Avenue, Building 4, Palo Alto, CA 94110, USA.

Emmanuel Frécon, MSc
Swedish Institute for Computer Science, Interactive Collaborative Environments Laboratory, Platforms for Collaborative Environments Group, Box 1263, 164 29 Kista, Sweden.

Series Editors

Dan Diaper, PhD, MBCS
Professor of Systems Science & Engineering, School of Design, Engineering & Computing, Bournemouth University, Talbot Campus, Fern Barrow, Poole, Dorset BH12 5BB, UK

Colston SangerSchool of Management, University of Surrey, Guildford, Surrey GU2 7XH, UK

British Library Cataloguing in Publication Data
Inhabited information spaces : living with your data. – (Computer supported cooperative work)
1. Human-computer interaction 2. Interactive computer systems
I. Snowdon, David N., 1968– II. Churchill, Elizabeth F., 1962– III. Frécon, Emmanuel
004′.019

ISBN 1852337281

Library of Congress Cataloging-in-Publication Data
A catalog record of this book is available from the Library of Congress

Apart from any fair dealing for the purposes of research or private study, or criticism or review, as permitted under the Copyright, Designs and Patents Act 1988, this publication may only be reproduced, stored or transmitted, in any form or by any means, with the prior permission in writing of the publishers, or in the case of reprographic reproduction in accordance with the terms of licences issued by the Copyright Licensing Agency. Enquiries concerning reproduction outside those terms should be sent to the publishers.

CSCW ISSN 1431-1496
ISBN 1-85233-728-1 Springer-Verlag London Berlin Heidelberg
Springer-Verlag is a part of Springer Science+Business Media
springeronline.com

© Springer-Verlag London Limited 2004

The use of registered names, trademarks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant laws and regulations and therefore free for general use.

The publisher makes no representation, express or implied, with regard to the accuracy of the information contained in this book and cannot accept any legal responsibility or liability for any errors or omissions that may be made.

Typeset by Florence Production, Stoodleigh, Devon, England
Printed and bound in the United States of America
34/3830-543210 Printed on acid-free paper SPIN 10910238



Foreword

The Human Touch: Reflections on i3

The Machine-centred Mind Set

At the Chicago World Fair of 1933, the official motto was: “Science Finds – Industry Applies – Man Conforms”. To many of us today this seems quite shocking, yet it has been the driving force of much development in the last century.

In particular, if you look at the rise of computing over the last 50 years, you will see that, on the whole, development has been extraordinary, but fairly straightforward: it can be characterised as trying to make “faster and faster machines fit into smaller and smaller boxes”.

Starting from the time of the ENIAC, one of the colossal computers of the 1940s, most IT progress has been driven from the point of view of the machine. Since then things have changed – but perhaps not really that much. Even if computers can today calculate many times over what was possible a few years ago, and the machines have become somewhat less obtrusive, much of the “mind set” has stayed the same. It is the visions of huge calculating machines spanning massive rooms, trying to recreate an absolute artificial intelligence, that still haunt much of the thinking of today.

Clearly, it is difficult to shake off old mind sets.

Alternatives

Alternatives to the idea of fitting computing into ever smaller boxes can mainly be attributed to Mark Weiser. In his paper, “The Computer for the 21st Century”, he outlined notions of how computing could become integrated into the fabric of everyday life by becoming completely distributed into the environment. In this way computing would become “ubiquitous”. More recently, similarly inspired work on “tangible media” by Hiroshi Ishii has emerged from the MIT Media Lab. Apart from this, the technological revolution of GSM and the mobile phone has also played its part in bringing information technology out of its “traditional shell”.

Alternatives to the machine-centred view of computing were also starting around the same time, such as the “anthropocentric” ideas proposed by Mike Dertouzos at the MIT Computer Science Lab; in a similar vein, the cognitive scientist Don Norman has been pointing out the lack of well-designed information environments.

Roughly at the same time, but from a different perspective, we started thinking about how to give technology more of a “human touch”. Now, in principle, this should not be that difficult, as technology is, after all, made by humans. In practice, however, one has to go quite far to break down the machine-centred and box-centred ways of thinking.

We decided that the only way to attack the problem with any significance was to try to invert the picture completely – that is, to start thinking from the human point of view and work outwards. Our idea of “human centredness” was that it should nurture technological innovation but within a broader context of human values and aspirations. This was not the same as “user driven” or “defined by user needs”, both of which tend to become stuck in improving the status quo, but not growing beyond it. At the same time, we also wanted to make sure to break out of the box-centred ways of thinking as much as possible and avoid doing “traditional HCI”, which was mainly concerned with improving computers as they were.

Our ideas were designed to balance questions of technically “how” with questions of “why?” and “what for?”. The aim was to see if we could start restoring the balance between people’s inventiveness in making new machines and the essence of being human. Our questions became rather: How can we reach a better and more fulfilling balance between technology and people? What could be new ways of thinking about the problems? What could be the new paradigms that could lay the paths for further research and development?

The i3 Research Programme

It is along these lines that we launched our first call for proposals back in 1995. Our general aim was to look at the relationship between people and information technology in the future: how could people access and use information, and exchange things with others using information technology as a medium?

A clear break was needed to get out of stale thinking. Therefore, we called for new paradigms of interaction and research on new interfaces between people and the world of information. We also asked how such work could intertwine human, societal and technological elements into one dynamic research activity. One of the main quotes from our call for proposals was:

The goal of i3 is to research and develop new human-centred interfaces for interacting with information, aimed at the broad population (1996).

To help define a specific research agenda, we first had a competition for more specific visions of the future. “Connected Community” and “Inhabited Information Spaces” were selected as the two visionary themes on which we based a subsequent call for research projects. Even though it took some time to have an extra layer of calls for proposals, in retrospect it was better to “reculer pour mieux sauter”.

The two selected themes had similar yet contrasting underlying philosophies. The Connected Community theme, proposed by a team headed by Irene MacWilliam (Philips Design, Eindhoven) and Marco Susani (Domus Academy, Milan), asked: forget about virtual environments and trying to fit people into some artificial world – how can we help people in their everyday environment, and integrate technology into this? The idea is to understand how information and communication tools start making a difference when they are embedded in a real context, and start being more meaningful for actual people and communities. How can technology enhance these environments and activities, rather than replace them?

The other theme, Inhabited Information Spaces, proposed by a team headed by Tom Rodden (University of Nottingham), took a slightly different perspective. It stated: the Internet and the Web already represent a suspended reality, and people want to participate more in these spaces. Given that this is a reality, how could it evolve in the future? How could we make it more accessible to the broadest possible public, and make it socially interactive for large groups of people, in meaningful ways? And in similar spirit to the first theme, how can such environments link to the physical everyday world rather than be removed from it?

At a later stage, we decided to supplement the research with an emphasis on learning. We wanted to explore new relationships between learning and technology. The idea was that a lot could be learnt about designing new interfaces by looking at how children interact, play and learn. Similar ideas had been experimented with in a Lego context by Seymour Papert of the MIT Media Lab. In 1997, we decided to have a call on experimental school environments (ese). This centred around learning for very young children, in fact, the 4–8-year-old age range. This age range struck us as being particularly challenging because at this stage children don’t have too many of the adult preconceptions of the world, and are still open to new things. Young children have a different kind of “language” – a form of communication and expression from which adults can learn a lot. From this we wanted to gain insights about how to design meaningful interaction tools for the population at large. The header of our call was:

The aim of i3-ese is to research new kinds of IT-based tools designed to enable new approaches to learning, focussing on the age range of 4 to 8 (1997).

From each of these programmes we selected a number of individual research projects. Together these spanned many universities, research centres and companies across Europe, and involved a mix of people from many walks of life – artists, designers, computer scientists, game companies, technology companies, experimental schools, teachers and children, people in communities, etc. At the same time, all these different outlooks were united by a common vision: exploring new relationships between people and technology.


Grains of the Future

In this book you will find some examples of work, in particular from the Inhabited Information Spaces grouping. It is interesting to see how some of these ideas are still “futuristic” while others have started to become part of mainstream thinking and made their way into products.

Some people say that you can find “grains of the future” in the present today – the only problem is, where do you start to look? One of the potential advantages of this book is that, by looking at research developments stretching out into the recent past, one can identify how some grains developed into trends of the present, while others are still just emerging.

For those still interested in seeking out “grains of the future”, this book will be a valuable source.

Jakub Wejchert
Information Society DG
European Commission

Jakub Wejchert grew up in Ireland, with a family background of artists and architects, of Polish origin. He studied natural science at Trinity College Dublin, specialising in physics, and holds a doctorate (modelling of non-linear networks) from the same institution. Later he worked in the USA with IBM Research, working on computer graphics and interface design. He joined the European Commission in 1992. At the Future and Emerging Technologies unit, he set up and managed a number of research programmes such as i3 – intelligent information interfaces; i3 – experimental school environments; and the “disappearing computer”. He now works as an advisor on vision and strategy to one of the Directors in the Information Society Programme. Jakub lives in Waterloo, south of Brussels, with his wife and three sons.

The opinions expressed here are those of the author and do not necessarily reflect the position of the European Commission.


Acknowledgements

The editors would like to acknowledge the European i3 initiative and all the authors of the chapters in this volume for their contributions. Much of the work described in this volume would not have taken place without funding from the European Commission. We would also like to thank SICS, XRCE and FX Palo Alto Laboratory for supporting our activities within this domain. Rosie Kemp and Melanie Jackson of Springer also deserve thanks for their help and support throughout the process of preparing this book for publication.


Contents

List of Contributors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xix

Part 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1

1. Inhabited Information Spaces: An Introduction
   Elizabeth Churchill, David Snowdon and Emmanuel Frécon . . . 3
   1.1 Introduction . . . 3
   1.2 Chapters in this Volume . . . 5
      1.2.1 Pure Virtual Environments . . . 5
      1.2.2 Mixed Reality Environments . . . 6
      1.2.3 Communication . . . 6
      1.2.4 Construction . . . 7
      1.2.5 Community . . . 8
   1.3 Summary . . . 8

Part 2. Pure Virtual Environments . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9

2. WWW3D and the Web Planetarium
   Mårten Stenius and David Snowdon . . . 11
   2.1 Introduction . . . 11
   2.2 Producing a 3D Representation of a Web Page . . . 12
   2.3 Browsing the Web Using WWW3D . . . 13
   2.4 Improving Scalability . . . 16
   2.5 The Web Planetarium: Creating a Richer Visualisation . . . 20
      2.5.1 Visual Differentiation of Nodes . . . 20
      2.5.2 The Web as a Road Network . . . 22
      2.5.3 Hybrid Browsing . . . 22
   2.6 Conclusion . . . 24

3. PlaceWorld, and the Evolution of Electronic Landscapes
   Steve Pettifer, Jon Cook and James Marsh . . . 25
   3.1 Introduction . . . 25
   3.2 Background: The Physical and the Abstract . . . 27
      3.2.1 Watching a Cityscape . . . 28
      3.2.2 The Distributed Legible City . . . 29
      3.2.3 Finding “Something to Do” . . . 31
      3.2.4 Abstract Influences: Nuzzle Afar . . . 33
   3.3 PlaceWorld . . . 33
      3.3.1 The Design of PlaceWorld . . . 34
      3.3.2 The User Interface and Presentation System . . . 36
   3.4 Technological Challenges for Electronic Landscapes . . . 37
      3.4.1 Synchronising the Behaviour of Entities . . . 39
      3.4.2 Distribution and Communications . . . 40
      3.4.3 Defining the Behaviour of Entities . . . 41
      3.4.4 Methods and Filters . . . 43
      3.4.5 The Distribution Architecture . . . 44
   3.5 System Support for PlaceWorld . . . 46
      3.5.1 Menus . . . 46
      3.5.2 Access Model . . . 46
      3.5.3 Exploiting Subjectivity . . . 47
      3.5.4 Becoming a Place Where Places Meet . . . 48
   3.6 Conclusions

4. Using a Pond Metaphor for Information Visualisation and Exploration
   Olov Ståhl and Anders Wallberg . . . 51
   4.1 Introduction . . . 51
   4.2 The Pond . . . 54
      4.2.1 The Pond Ecosystem Metaphor . . . 54
      4.2.2 The Pond Example Application . . . 55
      4.2.3 The Hardware Platform . . . 56
      4.2.4 The Software Platform . . . 57
   4.3 Interaction . . . 58
   4.4 The Pond Audio Environment . . . 63
   4.5 Observations from Use . . . 64
   4.6 Discussion . . . 65
   4.7 Summary and Future Work . . . 68

Part 3. Mixed Reality Environments . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 69

5. City: A Mixture of Old and New Media
   Matthew Chalmers . . . 71
   5.1 Introduction . . . 71
   5.2 Theory . . . 73
   5.3 System . . . 77
   5.4 Use . . . 82
   5.5 Ongoing and Future Work . . . 86
   5.6 Conclusion . . . 88


6. Soundscapes
   Tony Brooks . . . 89
   6.1 Introduction . . . 89
   6.2 The Soundscapes System . . . 89
   6.3 Therapeutic Uses of Soundscapes . . . 92
   6.4 Artistic Performances Based on Soundscapes . . . 94
      6.4.1 Interactive Painting . . . 94
      6.4.2 The Four Senses . . . 96
   6.5 Conclusion . . . 99

7. The Computational Interplay of Physical Space and Information Space
   Enric Plaza . . . 101
   7.1 Introduction . . . 101
   7.2 The Interplay of Physical and Information Spaces . . . 102
   7.3 A Framework for Context-aware Agents . . . 104
      7.3.1 Awareness and Delivery Services . . . 105
      7.3.2 Agents Requirements . . . 105
   7.4 The COMRIS Conference Centre . . . 107
      7.4.1 Delivery Service . . . 107
      7.4.2 Awareness Service . . . 108
      7.4.3 Tasks . . . 109
   7.5 Conclusions . . . 110

Part 4. Communication . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 113

8. Communicating in an IIS: Virtual Conferencing
   Adrian Bullock . . . 115
   8.1 Introduction . . . 115
   8.2 Virtual Conferencing – a Historical Perspective: Past, Present and Future . . . 116
      8.2.1 What Do We Mean by Virtual Conferencing? . . . 117
   8.3 Approaches to Virtual Conferencing . . . 117
      8.3.1 Early Videoconferencing . . . 117
      8.3.2 MUDs and MOOs . . . 118
      8.3.3 The Arrival of Graphics . . . 118
      8.3.4 Video Comes of Age . . . 120
      8.3.5 Graphics Come of Age . . . 122
   8.4 Using Virtual Conferencing . . . 124
      8.4.1 Understanding Collaboration . . . 124
      8.4.2 The Importance of First Impressions . . . 125
      8.4.3 Sharing Context . . . 125
      8.4.4 Scalability . . . 125
      8.4.5 Real Versus Abstract: The Role of Video? . . . 126


   8.5 Virtual Conferencing Versus Telephony . . . 127
   8.6 Guidelines for Using Virtual Conferencing Effectively . . . 129
      8.6.1 What Is the Task at Hand? . . . 129
      8.6.2 Communication Media . . . 130
      8.6.3 Infrastructural Support . . . 130
   8.7 Final Remarks . . . 131

9. Getting the Picture: Enhancing Avatar Representations in Collaborative Virtual Environments
   Mike Fraser, Jon Hindmarsh, Steve Benford and Christian Heath . . . 133
   9.1 Introduction . . . 133
   9.2 Method . . . 135
   9.3 Analysis . . . 137
      9.3.1 Awareness and Co-ordination . . . 137
      9.3.2 Anticipation . . . 140
      9.3.3 Occlusion . . . 142
   9.4 Summary . . . 145
   9.5 Reflections . . . 145
      9.5.1 Scaleability . . . 146
      9.5.2 Reciprocity of Perspective . . . 147
      9.5.3 Unrealism . . . 149
   9.6 Conclusions . . . 150

10. New Ideas on Navigation and View Control Inspired by Cultural Applications
   Kai-Mikael Jää-Aro and John Bowers . . . 151
   10.1 Introduction and Overview . . . 151
      10.1.1 Challenges for Interaction Design . . . 152
   10.2 Interactive Performances . . . 153
      10.2.1 Lightwork . . . 155
      10.2.2 Blink . . . 160
   10.3 Inhabited Television . . . 164
      10.3.1 Heaven and Hell – Live . . . 165
      10.3.2 Out of This World . . . 166
   10.4 Production Management . . . 169
      10.4.1 Finding and Framing the Action . . . 170
      10.4.2 The Round Table: A Physical Interface . . . 172
      10.4.3 Conclusions . . . 175
   10.5 Discussion: Navigation, Presence and Avatars . . . 176
      10.5.1 Avatar-centred Navigation . . . 176
      10.5.2 Object-centred Navigation . . . 177
      10.5.3 Activity-oriented Navigation . . . 178
      10.5.4 Navigation as Montage, Dispersed Avatars . . . 178
      10.5.5 Accomplishing Presence and Intelligibility . . . 179


11. Presenting Activity Information in an Inhabited Information Space
   Wolfgang Prinz, Uta Pankoke-Babatz, Wolfgang Gräther, Tom Gross, Sabine Kolvenbach and Leonie Schäfer . . . 181
   11.1 Introduction . . . 181
   11.2 Related Work and Requirements . . . 182
   11.3 User Involvement and Studies . . . 184
      11.3.1 Partner Settings and Evaluation Methods . . . 185
      11.3.2 Do Users Meet at all in a Shared Workspace? . . . 186
   11.4 The Tower Architecture . . . 188
   11.5 Personalised Overview of Activities: The Tower Portal . . . 189
   11.6 Awareness in a Working Context: Smartmaps . . . 191
   11.7 Symbolic Actions in a Context-based 3D Environment . . . 194
      11.7.1 The Tower World . . . 194
      11.7.2 User Feedback . . . 196
   11.8 DocuDrama . . . 198
   11.9 Ambient Interfaces . . . 201
   11.10 Lessons Learned About Awareness . . . 203
      11.10.1 Awareness Is Something One Is Not Aware of . . . 203
      11.10.2 Synchronicity of Awareness . . . 204
      11.10.3 Walking and Talking Are Means to Achieve Awareness . . . 205
      11.10.4 Peripheral Awareness in Electronic Settings . . . 205
      11.10.5 Awareness Is Double-situated: The Workspace’s and the Observer’s Situation . . . 206
   11.11 Summary and Conclusion . . . 207

Part 5. Construction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 209

12. DIVE: A Programming Architecture for the Prototyping of IIS
   Emmanuel Frécon . . . 211
   12.1 Introduction . . . 211
   12.2 The Virtual World as a Common Interaction Medium . . . 212
   12.3 Partial, Active Database Replication . . . 213
   12.4 Programming the System . . . 215
      12.4.1 The DIVE Programming Model . . . 216
      12.4.2 Programming Interfaces . . . 216
      12.4.3 Building your Application . . . 218
   12.5 DIVE as a Component-based Architecture . . . 223
      12.5.1 System Components . . . 223
      12.5.2 User-oriented Components . . . 224
      12.5.3 The DIVE Run-time Architecture . . . 225
   12.6 The London Demonstrator: An Example Application in More Detail . . . 226


12.6.1 Centre of London . . . 228
12.6.2 Collaboration Services for Use by Groups . . . 229
12.6.3 Tourist Information Data Visualisation Service . . . 229
12.6.4 Real-time Simulations . . . 230
12.7 Conclusion and Future Work . . . 231

13. Communication Infrastructures for Inhabited Information Spaces
David Roberts . . . 233
13.1 Introduction . . . 233
13.1.1 Requirements . . . 234
13.1.2 Information . . . 235
13.1.3 Avatars . . . 236
13.1.4 Interaction . . . 237
13.1.5 Communication Requirements . . . 238
13.1.6 Resources: Computers and Networks . . . 240
13.2 Principles . . . 240
13.2.1 Localisation . . . 241
13.2.2 Scaling . . . 246
13.2.3 Persistence . . . 251
13.2.4 Communication . . . 252
13.3 Architecture . . . 256
13.3.1 The DIVE Architecture . . . 256
13.3.2 PING . . . 260
13.4 Deployment . . . 263
13.4.1 Point-to-point . . . 264
13.4.2 Tunnelled Group . . . 265
13.4.3 Hybrid . . . 266
13.5 Conclusion . . . 266

Part 6. Community . . . 269

14. Peer-to-peer Networks and Communities
Mike Robinson . . . 271
14.1 Introduction . . . 271
14.2 Early Inhabited Information Spaces in CSCW . . . 274
14.2.1 Rendering the Invisible Visible . . . 274
14.2.2 ClearBoard . . . 275
14.2.3 Feather, Scent and Shaker: Supporting Simple Intimacy . . . 276
14.2.4 Gesture Cam: The Nodding Robot . . . 277
14.3 P2P Themes and Overall Direction . . . 278
14.4 Design for Community: Inhabited Information Spaces . . . 281
14.4.1 Communities: An Aside on Definitions . . . 281
14.4.2 Communities: An Aside on Use . . . 282
14.4.3 Communities: An Aside on Philosophy . . . 284


14.5 P2P, Community and the Design of Inhabited Information Spaces . . . 286
14.6 Concluding Remarks . . . 288

15. Inhabitant’s Uses and Reactions to Usenet Social Accounting Data
Byron Burkhalter and Marc Smith . . . 291
15.1 Introduction . . . 291
15.2 Related Work . . . 293
15.3 Netscan . . . 294
15.4 Findings . . . 295
15.4.1 Social Accounting Data and Author-assessment Threads . . . 295
15.4.2 Social Accounting Data and Newsgroup-assessment Threads . . . 301

15.5 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 305

References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 307

Index . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 325


List of Contributors

Steve Benford, Mixed Reality Laboratory, School of Computer Science, University of Nottingham, Jubilee Campus, Nottingham, NG8 1BB, [email protected]

John Bowers, Department of Numerical Analysis and Computer Science, Royal Institute of Technology, SE-100 44, Stockholm, [email protected]

Tony Brooks, Aalborg University, Niels Bohrs Vej 8, DK 6700 Esbjerg, [email protected]

Adrian Bullock, Swedish Institute of Computer Science, Box 1263, SE-164 29, Kista, [email protected]

Byron Burkhalter, Department of Sociology, University of California, Los Angeles, 264 Haines Hall, 375 Portola Plaza, Los Angeles, CA 90095–1551, USA, [email protected]

Matthew Chalmers, Computing Science, University of Glasgow, 17 Lilybank Gardens, Glasgow, G12 8QQ, [email protected]

Elizabeth Churchill, FX Palo Alto Laboratory, 3400 Hillview Avenue, Building 4, Palo Alto 94304, USA, [email protected]

Jon Cook, Department of Computer Science, The University of Manchester, Oxford Road, Manchester, M13 9PL, [email protected]

Mike Fraser, Mixed Reality Laboratory, School of Computer Science, University of Nottingham, Jubilee Campus, Nottingham, NG8 1BB, [email protected]

Emmanuel Frécon, Swedish Institute of Computer Science, Box 1263, SE-164 29, Kista, [email protected]

Wolfgang Gräther, Fraunhofer Institute for Applied Information Technology, Schloss Birlinghoven, 53754 Sankt Augustin, [email protected]

Tom Gross, Fraunhofer Institute for Applied Information Technology, Schloss Birlinghoven, 53754 Sankt Augustin, [email protected]

Christian Heath, Work, Interaction and Technology Research Group, The Management Centre, King’s College London, Franklin-Wilkins Building, London, SE1 8WA, [email protected]

Jon Hindmarsh, Work, Interaction and Technology Research Group, King’s College London, [email protected]

Kai-Mikael Jää-Aro, Department of Numerical Analysis and Computer Science, Royal Institute of Technology, SE-100 44, Stockholm, [email protected]

Sabine Kolvenbach, Fraunhofer Institute for Applied Information Technology, Schloss Birlinghoven, 53754 Sankt Augustin, [email protected]

James Marsh, Department of Computer Science, The University of Manchester, Oxford Road, Manchester, M13 9PL, [email protected]

Uta Pankoke-Babatz, Fraunhofer Institute for Applied Information Technology, Schloss Birlinghoven, 53754 Sankt Augustin, [email protected]

Steve Pettifer, Department of Computer Science, The University of Manchester, Oxford Road, Manchester, M13 9PL, [email protected]

Enric Plaza, IIIA, Artificial Intelligence Research Institute, CSIC, Spanish Council for Scientific Research, Campus UAB, 08193 Bellaterra, Catalonia, [email protected]

Wolfgang Prinz, Fraunhofer Institute for Applied Information Technology, Schloss Birlinghoven, 53754 Sankt Augustin, [email protected]

David Roberts, Department of Computer Science, University of Reading, Whiteknights, Reading, RG6 6AY, UK, [email protected]


Mike Robinson, Sageforce Ltd, 61 Kings Road, Kingston-on-Thames, Surrey, KT2 5JA, [email protected]

Leonie Schäfer, Fraunhofer Institute for Applied Information Technology, Schloss Birlinghoven, 53754 Sankt Augustin, [email protected]

Marc Smith, Microsoft Research, Microsoft Corporation, One Microsoft Way, Redmond, WA 98052, [email protected]

David Snowdon, Xerox Research Centre Europe, 6, chemin de Maupertuis, 38240 Meylan, [email protected]

Olov Ståhl, Swedish Institute of Computer Science, Box 1263, SE-164 29, Kista, [email protected]

Mårten Stenius, Alkit Communications AB, Aurorum 2, SE-977 75, Luleå, [email protected]

Anders Wallberg, Swedish Institute of Computer Science, Box 1263, SE-164 29, Kista, [email protected]


Part 1
Introduction


1
Inhabited Information Spaces: An Introduction

Elizabeth Churchill, David Snowdon and Emmanuel Frécon

1.1 Introduction

Studies of co-operative work have shown that, even when co-operation is not explicit, a surprisingly large amount of work relies on the knowledge of what other people are doing (or have done) so that work can be co-ordinated. Further, people collaborate over objects that are manipulated and exchanged: reports, diagrams, maps, books, models and drawings are all examples. People’s conversations and actions are part and parcel of producing the social frame within which work takes place (Giddens, 1984; Lave and Wenger, 1991). However, the technological circumstances within which that work takes place have a profound impact upon how it is achieved – technologies afford resources and constraints that affect practice.

This volume offers readers an introduction to the field of Inhabited Information Spaces (IIS). We attempt to shed some light on the most important issues, including examples of representing information, how people interact in such systems, how IIS systems are constructed, and emerging notions of communities. There are already many books dedicated to “pure” information visualisation (techniques for visually representing information), so in this volume we concentrate mainly on the value added by inhabited spaces rather than visualisation techniques per se. For a good overview of the field of “pure” information visualisation we recommend Card et al. (1999).

The question addressed by IIS design is how best to design spaces and places where people and digital data can meet in fruitful exchange – that is, how to create effective social workspaces where digital information can be created, explored, manipulated and exchanged. IIS are the confluence of research into distributed, augmented and virtual reality (VR) spaces, information visualisation and computer-supported co-operative work (CSCW).


The term “Inhabited Information Spaces” derived from work being carried out within a number of European research initiatives (e.g. INSCAPE, COMIC) and laid out an agenda whereby virtual reality and information visualisation techniques were explicitly combined in support of collocated and remote collaborative work. Thus, in Inhabited Information Spaces (IIS) both information and the people who are using that information (viewing it, manipulating it) are represented. This supports collaborative action on objects, provides awareness of others’ ongoing activities and offers a view of information in the context of its use. Thus, while information visualisation systems are useful tools in themselves, a representation of others who are also using the information, and of what they are doing with it, could add considerably to the value of such systems for the co-ordination of collaborative work. Just knowing that many people are accessing a particular piece of information could be almost as useful as the piece of information itself.

The specific representations can vary, but in all cases they are manipulable: data can be interrogated, and representations of people are mobile and interactive. Further, because these are shared, navigable, “live-in” spaces, information can be explicitly sought, “discovered by chance” (cf. Williamson, 1998), “encountered” (cf. Erdelez, 1999) or tacitly consumed as part and parcel of navigating an “information neighbourhood” (cf. Savolainen, 1995).

In Europe, the European Commission (EC) funds large programmes of research covering most fields of human endeavour (science, technology, medicine, culture etc.); one of the domains supported by the EC is Information Society Technologies (IST, http://www.cordis.lu/ist/). In 1996 the i3 (Intelligent Information Interfaces) network (http://www.i3net.org) was formed as a mechanism to create a community of people working on a number of research projects. One of its aims was to encourage researchers participating in European projects to exchange ideas and information and to allow collaboration on a larger scale. From 1997 to 2000 the i3 supported a programme called Inhabited Information Spaces, and much of the work that appears in this book resulted from this programme.

The field of IIS overlaps that of Collaborative Virtual Environments (CVEs) (see Churchill et al., 2001), as CVEs are one of the preferred implementation techniques for visualising information in a collaborative way. However, IIS does not necessarily imply the use of online virtual environment technology – for example, it is possible to imagine a system that enables co-located groups to co-operatively work with information using a display projected onto physical artefacts. Research into IIS also overlaps with work carried out in Social Navigation, which explicitly addresses social aspects of information seeking, searching and use (see Munro et al., 1999).

The chapters in this volume cover all variants of IIS, the technology required to make them work and the social and psychological issues raised by such work. We also present recent innovations in “hybrid environments” and augmented real-world environments. Our aim with this broad coverage is to offer readers the opportunity to reflect on the intersection of technology design, communication, representation, and collaborative work practices. Technological, psychological and sociological issues in the design and use of Inhabited Information Spaces are considered, including: design issues in the development of technology for human–human and human–system collaboration around information visualisations; applications demonstrating uses of the technology; and psychological/sociological analyses of the way such systems are used in practice. Our aim was to provide a broader perspective than solely the graphical aspects of visualising information, the systems aspects underlying the distribution and sharing of graphical and textual workscapes, or the communication and work practice aspects of the use of such systems.

1.2 Chapters in this Volume

We have divided the chapters in this volume into a number of broad areas, although observations made by the authors often span these areas. The areas are pure virtual environments; mixed reality; communication-oriented systems and applications; construction; and community.

1.2.1 Pure Virtual Environments

Stenius and Snowdon, in their chapter “WWW3D and the Web Planetarium” (Chapter 2), describe the WWW3D 3D web browser and how it turns browsing the web into the exploration of a 3D space. The chapter presents the initial version of WWW3D and traces how it evolved into the Web Planetarium. Not only is the Web Planetarium more aesthetically pleasing and more scalable than WWW3D, but in this incarnation it has also been re-purposed to serve as a gateway between 3D environments.

In Chapter 3, Pettifer, Cook and Marsh describe the Placeworld system and its implementation in the Deva virtual environment. Placeworld was inspired by Jeffrey Shaw’s artwork “PLACE – A User’s Manual” and, like the Web Planetarium, aims to provide a connection between virtual spaces. The chapter describes the series of user trials that led to the final Placeworld design and how the Deva system is used to create a high-performance virtual environment that implements that design. The chapter also gives some insight into the issues that must be tackled to implement large-scale IIS and CVE systems efficiently, and serves as an appetiser for topics covered in more depth in Part 5 of this book.

The final chapter in Part 2, Chapter 4, is by Wallberg and Ståhl, who use a pond metaphor for information visualisation and exploration. The Pond is a system that allows people to browse collections of multimedia data, such as music albums. The interface is presented in the form of a large back-projected display on a table surface, allowing several people to gather around it and use it collaboratively.

1.2.2 Mixed Reality Environments

Chalmers, in his chapter entitled “City: A Mixture of Old and New Media” (Chapter 5), argues that one of the problems with “traditional” CVE systems is that they are disconnected from the physical environment and from other media (for a review of CVE systems see Churchill et al., 2001). He calls for an approach in which there is an explicit linking between different media, where people are considered to inhabit the “real” (physical) world, not the virtual, and have a number of media, both “old” and “new”, available to them. The chapter presents an experiment in which visitors experience a gallery via different media – one by physically visiting it, one via the web and one via an immersive 3D VR environment. All three are able to communicate via an audio link and are given awareness of the locations and actions of the others. The chapter presents the results of this experiment and the ways in which the participants used the features of the technology to interact and share their experiences.

In Chapter 6, Brooks describes the Soundscapes system, which allows unencumbered interaction with visual and auditory systems projected into the physical world. This work illustrates a different form of information system, an auditory one, and offers an example of a mixed reality system; example applications are covered, including therapeutic use and public artistic performances.

In the next chapter, Plaza illustrates the computational interplay of physical space and information space. Drives within mobile computing push computers further “into” the physical world (see also Mark Weiser’s vision for ubiquitous computing, Weiser, 1991). In order to design tools that are context sensitive and not inappropriately intrusive, it has been argued that such devices need to be “aware” of the activities of their users. This chapter describes the approach used in the COMRIS project, in which wearable computers were linked with an information space composed of agents that attempt to find information useful to a person at a given moment.

1.2.3 Communication

In Chapter 8, Bullock describes human–human communication via the medium of an IIS, considering design issues in the development of virtual conferencing. He makes the point that, for an IIS to function effectively, all technological elements need to work in concert. With video conferencing as a backdrop, Bullock explores opportunities and pitfalls of using IIS for mediated communication.


Fraser, Hindmarsh, Benford and Heath, in their chapter “Getting the Picture: Enhancing Avatar Representations in Collaborative Virtual Environments”, discuss how humanoid avatars are the most widely used type of avatar in VR systems. They note that although the avatars have humanoid forms they do not have human-like perceptual abilities within the virtual worlds – the data that are relayed back to people from their avatar as prosthesis-in-the-virtual-world are in fact often misleading. The chapter examines some of the problems inherent in using different forms of avatar, and the problems this poses for collaboration in IIS. The authors describe some extensions they have made in an attempt to rectify some of the problems they have encountered.

Jää-Aro and Bowers describe some new ideas on navigation and view control that have been inspired by cultural applications. This chapter describes the lessons learnt from a number of public VR and mixed-reality performances. The authors discuss what they learnt in terms of the content of performances, the pacing and the means given to participants to navigate within the space. An important issue addressed is how such performances can be made accessible to non-interactive audiences who can only see a TV-like rendering of the event.

In Chapter 11, Prinz et al. consider how to present awareness of the activities of others for better support of collaborative work. This chapter describes the TOWER (Theatre of Work Enabling Relationships) system, which provides a number of mechanisms to communicate awareness information to members of a work group, both via 3D displays and via Smartmaps integrated with the Basic Support for Co-operative Work (BSCW) document management system. Smartmaps are 2D displays based upon the tree-map visualisation technique. The TOWER world is an automatically constructed 3D environment that represents both users and documents and indicates the actions that users are taking with respect to the documents via symbolic actions and gestures performed by the avatars. DocuDrama allows 3D presentations of the past actions taken by members of a project team as a sort of 3D virtual theatre, in which avatars look and turn towards one another to enhance the impression of an ongoing conversation, and camera navigation is carefully controlled in order to generate an interesting presentation. Finally, small simple robots were employed in order to give a tangible presentation of the activities of other users.

1.2.4 Construction

Frécon’s chapter in Part 5 introduces readers to DIVE, a programming environment for prototyping IIS. The DIVE CVE system is described, with a focus on the mechanisms it provides to allow developers to rapidly develop VR applications, including IIS. DIVE provides several different APIs (application programmer’s interfaces) and mechanisms for creating dynamic 3D content, thereby allowing developers to choose the combination that works best for them. The chapter concludes with a number of examples that show how significant applications have been built using DIVE. DIVE is one of the oldest and most mature VR systems and the current version is the result of many years of experience; both the Web Planetarium and Pond systems are built in DIVE. Readers are urged to read these three chapters (2, 4 and 12) to fully understand what is possible with a mature VR system.

In Chapter 13, Roberts considers communication architectures for IIS, describing the most important networking issues that need to be faced when trying to construct distributed IIS systems. He also details some of the techniques that can be used to create an illusion of a shared space in the face of delays caused by communications technology. A number of different CVE systems are described in order to give concrete examples of the techniques presented in the earlier part of the chapter.

1.2.5 Community

Robinson’s chapter deals with peer-to-peer networks and communities. Robinson argues that the concept of peer-to-peer and notions of community are heavily interdependent, and that the design of IIS would benefit from focusing more closely on community as the organising principle of peer-to-peer. The author first considers the early days of research into CSCW, analyses the metaphor of community and how it relates to electronically mediated communication, and draws parallels with the current state of peer-to-peer systems today.

Burkhalter and Smith move away from what might be considered a true IIS in Chapter 15 to explore how the availability of social accounting data can help the self-regulation of online communities. The authors consider Usenet news, but the principles could just as well be applied to other systems. The intent of this chapter is to show how similar information might be used to help CVE-based communities such as those being created by there.com (http://www.there.com), if they ever reach the scale of communities such as Usenet.

1.3 Summary

As noted above, the chapters in this volume cover a broad range of technical and social issues. All are focused on creating useful and “habitable” environments for information representation, seeking, searching and manipulation. There are many challenges to be faced at technical and at social levels, and far more research needs to be done on the long-term use of Inhabited Information Spaces and how they co-evolve as a result of being regularly inhabited. We hope you enjoy reading about the work as much as we have.


Part 2
Pure Virtual Environments


2
WWW3D and the Web Planetarium

Mårten Stenius and David Snowdon

2.1 Introduction

This chapter describes an Inhabited Information Space based around a 3D visualisation of a portion of the WWW. The system was originally called WWW3D (Snowdon et al., 1996), and evolved from an experiment in immersive 3D web browsing into a richer inhabited information space called the Web Planetarium during the i3 eSCAPE project (Stenius et al., 1998). We include it here as an example of an inhabited space supporting a common modern activity – web browsing – and also as an example of how a basic visualisation was extended to correct deficiencies in the earlier system.

WWW3D started life as an experiment in 3D web browsing. We were thinking about some of the original ideas behind VRML (Virtual Reality Modelling Language, www.vrml.org), namely that it would be a sort of 3D web. In fact many people working on VRML were inspired by the Cyberspace depicted in William Gibson’s novels:

Cyberspace. A consensual hallucination experienced daily by billions of legitimate operators, in every nation, by children being taught mathematical concepts . . . A graphic representation of data abstracted from the banks of every computer in the human system. Unthinkable complexity. Lines of light ranged in the nonspace of the mind, clusters and constellations of data. Like city lights, receding (Gibson, 1986).

However, VRML became more a means to represent 3D models with hyperlinks between them than a richer way of experiencing the web. It so happened that as we were thinking about this during the summer of 1996 we were asked to provide a demonstration for SICS’s (the Swedish Institute of Computer Science’s) new large-screen immersive VR system. This system, while having impressive graphical performance, did not support interaction via a standard 2D GUI (Graphical User Interface); we therefore decided to create a web browser that was entirely 3D. While the result was not a web browser as usable as Netscape’s or Microsoft’s for standard web browsing, the prototype had several interesting features and evolved into a more capable system during the i3 eSCAPE project. In this chapter we will describe the original system, WWW3D, and its evolution into the Web Planetarium.

Just like a normal web browser, WWW3D allowed users to follow links and view the web pages associated with those links. However, rather than simply showing the current web page the user was exploring, WWW3D also visualised the structure of the portion of the web that the users had explored, historical information showing when pages had last been viewed and (thanks to the DIVE environment for which WWW3D was written) other users browsing the web at the same time. Note that WWW3D was never intended to visualise the entire web – the web is far too large for this to be meaningful; instead we simply visualise the portion of the web that users have explored and are exploring. We assume that existing search engines are sufficient to locate new pages of interest. The two key features of WWW3D were its method for representing web pages in 3D and its method for organising multiple web pages in 3D space. We will describe how this was done before continuing with a description of how the basic prototype was improved.

2.2 Producing a 3D Representation of a Web Page

Many HTML tags provide semantic information about the marked-up text – that is, what the text is rather than how it should be displayed on the page. 2D web browsers will typically use this information to determine what style (i.e. font, font style, font size, etc.) to use when drawing a particular piece of text.

WWW3D uses the information contained in HTML tags to produce a representation of the document in 3D space. A web document is represented as a sphere that is labelled with the document’s title. The contents of the document are placed around the inside surface of the sphere. Displaying large amounts of text in a satisfactory way is difficult in current VR systems, so textual information is represented by icons that can be unfolded to reveal the entire text. The first few words of the piece of text are displayed under the icon to give some indication of the contents. Images are displayed by texture mapping them onto polygons on the inside surface of the sphere. Finally, links to other documents are represented as icons labelled with their destination.
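The mapping just described can be sketched in Python. This is a hypothetical illustration, not the WWW3D implementation: the `Node3D` record, the ring layout inside the sphere and the four-word icon label are our assumptions, chosen only to show how parsed HTML items might become positioned 3D elements.

```python
import math
from dataclasses import dataclass
from html.parser import HTMLParser

@dataclass
class Node3D:
    kind: str        # "text-icon", "image" or "link-icon"
    label: str
    position: tuple  # (x, y, z) on the inside surface of the sphere

class PageTo3D(HTMLParser):
    """Collect the title, text runs, images and links of an HTML page."""
    def __init__(self):
        super().__init__()
        self.title = ""
        self._in_title = False
        self.items = []  # (kind, label) pairs in document order

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "img":
            self.items.append(("image", a.get("src", "")))
        elif tag == "a" and "href" in a:
            # link icons are labelled with their destination
            self.items.append(("link-icon", a["href"]))

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data
        elif data.strip():
            # the first few words label the folded text icon
            self.items.append(("text-icon", " ".join(data.split()[:4])))

def layout_on_sphere(items, radius=1.0):
    """Distribute items evenly around a ring on the sphere's inside."""
    nodes = []
    for i, (kind, label) in enumerate(items):
        angle = 2 * math.pi * i / max(len(items), 1)
        pos = (radius * math.cos(angle), 0.0, radius * math.sin(angle))
        nodes.append(Node3D(kind, label, pos))
    return nodes
```

A renderer would then draw each `Node3D` at its position, with the sphere itself labelled by `title`.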

To reduce the visual complexity of the virtual environment, WWW3D makes extensive use of level of detail (LOD) operations. When viewed from outside, a document is represented as an opaque sphere and the actual document contents are not displayed. When a user enters a document to view it, the sphere is drawn in wire-frame so that the rest of the virtual environment is still visible.


Figure 2.1 shows the contents of a web document as displayed by WWW3D. Figure 2.2 shows the display generated when a user reads some of the text comprising a web document.

2.3 Browsing the Web Using WWW3D

Figure 2.1 A WWW3D representation of a web document seen from the inside. The spheres in the background represent other web pages.

When a user selects a link icon, WWW3D creates a new sphere representing the target document and places it near the document from which the user selected the link. In order to indicate the structure of the portion of the Web that the users have explored, WWW3D draws arrows between the spheres representing linked documents. If the documents are resident on the same Web server then the arrow is drawn in blue; otherwise it is drawn in green, thereby helping to provide additional information on the structure of the documents that the user has explored. In addition to this, the brightness of the arrow is dependent on the time since a user last followed that link, thereby providing users with a visual representation of their browsing history. If WWW3D fails to fetch a document then a small red arrow is attached to the source document to represent the "broken" link.
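These colouring rules can be condensed into a small function. The RGB values, the linear fade and the one-hour fade window below are assumptions chosen for illustration, not values taken from WWW3D:

```python
import time
from urllib.parse import urlparse

def link_appearance(src_url, dst_url, last_followed, now=None,
                    fade_seconds=3600.0, broken=False):
    """Sketch of the arrow colouring described above: red for broken links,
    blue for links between pages on the same server, green otherwise, with
    brightness decaying with the time since the link was last followed."""
    if broken:
        return (1.0, 0.0, 0.0)               # small red arrow on the source
    now = time.time() if now is None else now
    age = max(0.0, now - last_followed)
    # Linear fade with a floor so old links stay faintly visible (assumed).
    brightness = max(0.2, 1.0 - age / fade_seconds)
    same_server = urlparse(src_url).netloc == urlparse(dst_url).netloc
    base = (0.0, 0.0, 1.0) if same_server else (0.0, 1.0, 0.0)
    return tuple(brightness * c for c in base)
```

A renderer would call this per link each frame (or on a timer) to refresh the browsing-history display.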

WWW3D is implemented using multiple lightweight threads so users do not have to wait for a document to be retrieved before selecting another link. This behaviour is essential if multiple users are to be able to browse independently. Users are also free to navigate through the space and browse other documents while waiting for a document to be retrieved.

Figure 2.2 Reading part of the text of a web document.


As WWW3D parses a newly retrieved document, it checks for links to documents that the user has already explored and draws arrows to represent them. This means that at any given moment the complete set of links between documents is displayed without users having to follow every link. This is intended to aid the user by indicating links between documents that the user might have been unaware of. This also has the result that several users can be browsing different parts of the web and yet any links between the sets of documents they are exploring will be displayed. This might be useful since users will then have a visual representation of possible common interests.

To produce an acceptable layout of the set of linked documents, an incremental version of the Force Directed Placement (FDP) algorithm (Fruchterman and Reingold, 1991) is used. Links between documents act like spring forces that result in linked documents being moved closer together. Documents exert repulsive forces on one another, which prevents documents being placed closer together than a user-specified minimum separation.

At regular (user-specified) intervals WWW3D applies the FDP algorithm to refine the inter-document layout for a specified number of iterations. The more links a document has, the greater "inertia" it is considered to have when the FDP algorithm is applied. This has the result that heavily referenced documents are less likely to move, and provides some stability to the visualisation. In addition, damping is applied to try to prevent large changes to the visualisation for a given iteration of the FDP algorithm – this helps to prevent the user from becoming disorientated. The result of this is that the inter-document layout gradually evolves over time to produce clusters of inter-linked documents. Instead of running the FDP algorithm until it converges (which could take a long time) only a specified number of iterations are executed. The rationale for this is that the space will change anyway as users browse new web pages, so it is not necessary to force users to wait until the layout stabilises before allowing them to continue browsing. If there is no change to the space (no new web pages) then the system will gradually converge to a stable state. However, the FDP algorithm suffers from the disadvantage that it can take a great many iterations to converge to a stable state.
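As a rough illustration of this incremental scheme (not the original implementation; all parameter names and constants here are invented), one iteration can apply spring and repulsive forces, scale each document's displacement by a degree-based "inertia", and clamp the per-step movement as a crude form of damping:

```python
import math

def fdp_step(positions, links, min_sep=1.0, spring_k=0.05,
             repel_k=0.5, damping=0.3):
    """One iteration of a force-directed layout. positions maps a document
    id to a mutable [x, y, z]; links is a list of (a, b) pairs."""
    degree = {n: 0 for n in positions}
    for a, b in links:
        degree[a] += 1
        degree[b] += 1
    forces = {n: [0.0, 0.0, 0.0] for n in positions}
    nodes = list(positions)
    # Repulsion between every pair, strongest inside the minimum separation.
    for i, a in enumerate(nodes):
        for b in nodes[i + 1:]:
            d = [pa - pb for pa, pb in zip(positions[a], positions[b])]
            dist = max(math.sqrt(sum(c * c for c in d)), 1e-6)
            mag = repel_k * (min_sep / dist) ** 2
            for k in range(3):
                forces[a][k] += mag * d[k] / dist
                forces[b][k] -= mag * d[k] / dist
    # Spring attraction along links pulls linked documents together.
    for a, b in links:
        d = [pb - pa for pa, pb in zip(positions[a], positions[b])]
        for k in range(3):
            forces[a][k] += spring_k * d[k]
            forces[b][k] -= spring_k * d[k]
    # Heavily linked documents get more "inertia" and move less; damping and
    # a clamp bound each step to avoid disorienting jumps in the layout.
    for n in nodes:
        inertia = 1.0 + degree[n]
        for k in range(3):
            step = damping * forces[n][k] / inertia
            positions[n][k] += max(-min_sep, min(min_sep, step))
    return positions
```

Running a fixed number of such steps between browsing actions lets the layout evolve gradually instead of blocking the user until full convergence.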

Figure 2.3 shows the display produced by WWW3D after the user has browsed a number of documents. The FDP algorithm has resulted in the formation of clusters of closely linked documents. The colours of the documents provide some indication of how long ago the user last visited them.

Between invocations WWW3D stores information on the current set of documents, the links between them and the current 3D layout to a file. When a new instance of WWW3D is started, it reads this history file and displays the structure as it appeared in the last session. The contents of the documents found from the history file are not retrieved until the user enters a particular document. Since the document representations are opaque, this process is invisible to the user except for an occasional delay in seeing a document's contents.
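A minimal sketch of this between-session persistence might look as follows. The chapter does not specify the actual file format, so the JSON layout here is an assumption:

```python
import json

def save_history(path, positions, links):
    """Persist the document set, link structure and 3D layout between
    sessions. positions maps a URL to its [x, y, z] position; links is a
    list of (src_url, dst_url) pairs."""
    with open(path, "w") as f:
        json.dump({"positions": positions,
                   "links": [list(l) for l in links]}, f)

def load_history(path):
    """Restore the previous session's structure; page contents would then
    be fetched lazily, only when the user enters a document."""
    with open(path) as f:
        data = json.load(f)
    return data["positions"], [tuple(l) for l in data["links"]]
```

Because only URLs, links and positions are stored, start-up is cheap; the expensive fetching and parsing is deferred until a document is actually entered.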

2.4 Improving Scalability

Figure 2.3 A collection of web documents, showing the links drawn between them.

One problem with the WWW3D prototype is that it suffers from scalability problems, both in terms of display and number of users. Since a single DIVE application is responsible for management of the visualisation, it is vulnerable to becoming overloaded if several users simultaneously place demands on it. In addition to this, the complexity of the display increases as documents are added, lessening the advantage of the displayed web structure information and making it harder to browse. Even though WWW3D makes extensive use of level-of-detail (LOD) operations for individual nodes, there will still be a point where the world becomes too complex to be rendered on even the most powerful hardware. For this reason we created a new version of WWW3D running on the MASSIVE-2 CVE (Benford et al., 1997a) that attempted to solve the display problem by making more extensive use of LOD effects combined with clustering of web pages.

We decided to extend the use of LOD beyond the contents of a single web page to encompass groups of web pages (e.g. all pages on the same server, same domain name, owned by the same user, etc.). Doing this meant that instead of showing all the pages in a website, or all the personal pages (home page and related pages) for a user, we could display an object that represented the whole group. The user would only see the contents of the group when a certain criterion was met – in this case getting close enough to the group. This meant that when a user was outside a cluster their client had no need to know anything about the web pages in the cluster or the users that were currently browsing web pages in the cluster. This would have the following advantages:

● A reduction in the visual complexity of the world.
● A consistent metaphor, since clusters looked and behaved similarly to individual web pages when users entered them.
● Increased interactivity, since the computational load on the client machine was reduced.
● Reduced network bandwidth, since MASSIVE-2 assigned separate multicast communication channels to each cluster. This meant that a user who could not see the interior of a cluster had no need to receive network updates concerning objects or users contained in that cluster.

We used a simple scheme based on URLs to perform our clustering. A more advanced alternative would be to explicitly consider the legibility of the CVE and cluster accordingly, as is done by LEADS (Ingram and Benford, 1995), which is also capable of adding additional objects such as landmarks to aid users navigating through a CVE.
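A URL-based grouping rule of this kind might look as follows. The exact scheme used is not given in the chapter, so the `~user` convention for personal pages is an assumption chosen to match the groupings mentioned above (same server, same user):

```python
from urllib.parse import urlparse

def cluster_key(url):
    """Return a hashable key so that pages with the same key land in the
    same cluster: personal pages (paths beginning with ~user) cluster by
    user, everything else clusters by server."""
    parsed = urlparse(url)
    parts = [p for p in parsed.path.split("/") if p]
    if parts and parts[0].startswith("~"):
        return (parsed.netloc, parts[0])   # one user's personal pages
    return (parsed.netloc,)                # all other pages on a server
```

Grouping pages by `cluster_key` is then a simple dictionary build, with one container object created per distinct key.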

Our new implementation was based on two fundamental building blocks, containers and links. Containers have the ability to enclose other objects in a spatial sub-region of the virtual space. Links connect the containers. A special form of container is the page, which represents a web page and its contents. A cluster is a container that is able to treat its contents as a sub-visualisation complete with layout, other containers and links. Figure 2.4 shows a cluster both from a distance and as the user gets sufficiently close to see the contents.

Figure 2.4 A cluster abstraction dissolves to display its contents when the user approaches: (a) the exterior of the cluster; and (b) the view seen by a user as they get closer to the cluster.

In order to simplify the representation of links between clusters, we introduced compound links. A compound (or aggregate) link between two clusters is formed whenever there exists one or more links between the contained objects of the two clusters. Thus, a compound link both serves as a link structure between two clusters and as an abstraction for one or more lower-level links between the contents of these clusters; this is illustrated in Figure 2.5. Figure 2.6 shows a collection of linked clusters.
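The aggregation of page-level links into compound links can be sketched as below. The `cluster_of` mapping and the undirected treatment of compound links are assumptions made for illustration:

```python
from collections import defaultdict

def compound_links(page_links, cluster_of):
    """Aggregate page-level links into one compound link per cluster pair.
    page_links is a list of (src_page, dst_page); cluster_of maps a page
    to its cluster id. Each compound link keeps the lower-level links it
    abstracts, so it can be expanded when a cluster is entered."""
    agg = defaultdict(list)
    for src, dst in page_links:
        ca, cb = cluster_of[src], cluster_of[dst]
        if ca == cb:
            continue                       # intra-cluster links stay inside
        key = tuple(sorted((ca, cb)))      # one undirected compound link
        agg[key].append((src, dst))
    return dict(agg)
```

Only the compound links need to be rendered (and sent over the network) while both clusters are viewed from outside.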

When refined, the current application structure approach should lend itself to parallelisation: letting different processes handle different clusters. This in turn would enable a distribution of calculations on different hosts to allow more complex visualisations.

Figure 2.5 A compound link (AB) between two clusters (A, B).

Figure 2.6 A set of interconnected cluster abstractions. Their sizes roughly reflect the number of objects (here pages) contained within.

2.5 The Web Planetarium: Creating a Richer Visualisation

The Web Planetarium extends the WWW3D concept with the idea of the system becoming a gateway between different virtual spaces. The metaphor of a planetarium was used to create a new, more informative and more visually appealing look and feel, since the system was also going to be used in public settings using a large, dome-like display (Schwabe and Stenius, 2000). Nodes in the Web Planetarium display can represent either web pages, 3D models (in VRML or DIVE format) or 3D DIVE worlds that can be jumped to. The Web Planetarium can therefore also act as a space of portals to other spaces and provide a means of navigation between different virtual worlds. Compared to WWW3D, the Web Planetarium provides a more visually interesting virtual space and different layout options.

2.5.1 Visual Differentiation of Nodes

One problem with the WWW3D visualisation is that all nodes look alike. The only distinguishing feature is colour, and that colour represents the time since the last access and not a feature of the node itself. When creating the Web Planetarium, it was felt more important for users to be able to quickly scan the space for potentially interesting pages than to show the usage information. The Web Planetarium replaces the use of colour by texture maps extracted from the web pages. Alternatively, the sphere representing a node can be replaced by a 3D model. Not only does this result in a more interesting environment but it allows the users to distinguish more easily one node from another and gain some idea of what the node represents. Figure 2.7 shows an example Web Planetarium view in which the texture maps clearly distinguish nodes from one another.

A simple algorithm is used to select the image to texture onto a sphere – the first image found in a web page is used as the texture. In the case of a user's home page this image is typically a photo of the user – this is easily visible in Figure 2.8, in which several of the spheres clearly show people's faces.

Obviously, a simple algorithm such as this will sometimes fail to produce an interesting image, for one of several reasons:

● There is no image on the web page.
● A banner advert is selected instead of an image representative of the content of the page.
● The image selected is a piece of web decoration (a line, a bullet point) instead of something really representative of the subject of the web page.

Figure 2.7 The 3D layout alternative emphasises the concept of a planetarium.

Figure 2.8 A screenshot of the Web Planetarium. Compared to the original WWW3D, one of the obvious differences is the use of texture mapping to give more information about the contents of web pages instead of a simple colour indicating the time since last access.

It would be hard, probably impossible, to produce an algorithm guaranteed to work, but there are a number of simple filtering operations that could be used to ensure that in many cases unsuitable images are rejected.

● Reject images that are too small (according to predefined parameters) or are long and thin. This extremely simple approach was used successfully in the CWall electronic notice board (Snowdon and Grasso, 2002) and rejected most web decoration.
● Use a blacklist of URLs and URL components to reject banner advertisements, or configure a proxy server that performs this function (Hood, 2000). There are websites that maintain such blacklists (@Man, 2000) so each user would not have to shoulder this burden.
● Reject images that do not come from the same server as the web page. This is a very simple approach that will work in many cases, but it may sometimes reject legitimate images.
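Combining the three filters might look like this. Every threshold and blacklist token below is an invented placeholder, not a value from CWall or the chapter:

```python
from urllib.parse import urlparse

def acceptable_texture(img_url, page_url, width, height,
                       min_size=48, max_aspect=4.0,
                       blacklist=("doubleclick", "/ads/", "banner")):
    """Apply the three rejection heuristics described above to decide
    whether an image is worth texture mapping onto a node sphere."""
    # 1. Reject tiny or long-thin images (lines, bullets, spacers).
    if width < min_size or height < min_size:
        return False
    aspect = max(width, height) / max(1, min(width, height))
    if aspect > max_aspect:
        return False
    # 2. Reject URLs matching a blacklist of advert patterns.
    low = img_url.lower()
    if any(token in low for token in blacklist):
        return False
    # 3. Reject images served from a different server than the page.
    if urlparse(img_url).netloc != urlparse(page_url).netloc:
        return False
    return True
```

The selection algorithm would then take the first image on the page that passes this test, falling back to a plain sphere if none does.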

An alternative approach could be to use a conventional web browser – embedded within the application or running in parallel – to generate a snapshot of the web page, including background colour, text layout, images and so on. The snapshot would then be used as an icon (or "thumbnail") to be pasted on the outside of the 3D page representation. This method would have the advantage of creating a stronger connection to the original design of the web page as a whole.

2.5.2 The Web as a Road Network

WWW3D and the Web Planetarium can generate very tangled and convoluted visualisations of nodes. Although users can manually reposition nodes, this is not a solution that can be applied to large spaces. The Web Planetarium offers users the option of restricting the FDP layout algorithm to a horizontal plane, thereby giving the layout shown in Figure 2.9. A disadvantage of this approach is that in certain cases the FDP algorithm may take longer to converge.

2.5.3 Hybrid Browsing

The original aim of WWW3D was to provide an acceptable and novel web-browsing experience using an entirely 3D interface. This made an interesting demo and made sense on a large screen or headset-based system, but is less useful on the more common desktop-VR systems in which the VR system is just another window on the user's 2D desktop. The Web Planetarium therefore allows users to combine the 3D view of the web structure with a traditional web browser view of the contents of a web page. This is illustrated in Figure 2.10, which shows a Web Planetarium view and a web browser (Netscape) view of the same web page side by side.

Figure 2.9 The 2D layout alternative results in a road-like network of sites and links.

Figure 2.10 Using a conventional web browser as a side-viewer for HTML documents encountered in the Web Planetarium. Whereas the 3D view of the HTML document (to the right) gives an overview of the content and interaction points for links, Netscape provides a detailed close-up of the current page (to the left).


2.6 Conclusion

This chapter has described WWW3D, a simple 3D web browser, and its evolution into the Web Planetarium. Starting from a humble beginning, WWW3D has been changed to address the issues of scalability and of producing a more pleasing and informative visualisation. In addition, it can now be seen as providing a means for navigating between disjoint 3D worlds as well as between web pages.

However, the current version of the Web Planetarium is not without problems. One major issue is related to the viewing of standard 2D web content. Neither the current 3D view nor the separate browser and 3D views side by side really provide a convenient browsing experience. There is still work to be done to find the optimal way to provide information about web structure and a view on web page contents in an integrated way.

The second problem with the Web Planetarium is that, while it is perfectly possible for multiple users to share a single instance of the Web Planetarium, there is no support given to the interrelation of different planetaria nor to different views on the same planetarium. The use of subjective visualisations (Jää-Aro and Snowdon, 2001) may provide a technique for merging planetaria, since the same web page could be presented differently to different users, thus preserving individual layouts while allowing a degree of interaction between users.

3 PlaceWorld, and the Evolution of Electronic Landscapes

Steve Pettifer, Jon Cook and James Marsh

3.1 Introduction

The eSCAPE project (ESPRIT Long Term Research Project 25377) set out to investigate the idea of an "electronic landscape" (e-scape): large-scale inhabited information spaces represented as a three-dimensional (3D) virtual environment (VE) in which users could interact with applications, find routes to other virtual worlds, access sources of data, and interact with one another at a social level. The project brought together computer scientists, artists, social scientists and graphics designers from Lancaster University, the University of Manchester, the Swedish Institute of Computer Science and the ZKM (Zentrum für Kunst und Medientechnologie – Centre for Art and Media Technology) in Germany.

This chapter describes the evolution of PlaceWorld, an example e-scape. It is organised in four sections. We begin by chronicling the developments of the eSCAPE project that led to the implementation of PlaceWorld as its final demonstrator. We then consider in more detail the nature of PlaceWorld itself, and discuss how its development has influenced the underlying system technology. We conclude by tying together some of the technological issues with concrete examples of their use in the PlaceWorld landscape.

The vision of an electronic landscape is easy to articulate, but difficult to make real. Distributed 3D virtual environments of any kind push the limits of today's graphics, networking and processing technology; some of these issues will be considered in detail much later in the chapter. Add to these problems the social factors involved in making a virtual environment actually usable by the general public, and the issues become much more complex. The eSCAPE project's approach to these difficulties is described in this chapter.

The project set out to address two problems. First there was the technological difficulty of integrating diverse environments and their applications. There is as yet no accepted standard technology for building a VE, and though the majority of implementers end up using OpenGL to render the visual part of the world at some point in their implementation, the sources of semantics and behaviour vary wildly, from bespoke handcrafted pieces of code to those using higher-level, more generic systems such as MASSIVE (Greenhalgh, 1999), DIVE (Chapter 12) or DEVA (Pettifer et al., 2000). We will return to consider the way in which the DEVA VR system evolved to support electronic landscapes later in this chapter.

The second, and much more open-ended, issue is that of how to make an environment really inhabitable and of practical utility to a user, and it is to this that we first turn. The methodology used in the eSCAPE project to develop effective electronic landscapes was iterative. Inspiration for the environments was drawn from two sources: artistic vision and ethnographic study of suitable real-world situations. The environments were then implemented and put in front of the general public for testing. Further ethnographic studies were then carried out on the installations, and the environments were refined and redeveloped in the light of the results from these studies. PlaceWorld combines lessons and technological perspectives discovered during the three-year period of this process.

Figure 3.1 Snapshots of PlaceWorld.


Figure 3.1 shows snapshots from the final implementation of PlaceWorld, and will serve for now, with the brief description that follows, to give an overall flavour of the environment. Users entering PlaceWorld find themselves in an exotic landscape containing many artefacts. Situated here and there are cylindrical "buildings" covered with images from other worlds: these are portals that lead elsewhere. A network of pathways, hovering above the mist-covered landscape, leads off into the distance in all directions. The scene is populated with towers, billboards, flying craft and other miscellaneous structures, and in the sky ethereal floating ribbons of light drift back and forth. Other users move around the landscape, represented by animated walking stick-figure-like characters, and leave behind hazy coloured trails. Interaction with any of these entities brings up on screen a selection of possible actions. Touching a user's trail, for example, even when that user is a long way away, reveals that it can be used as a shortcut to find its owner. Touching any other artefact allows it to be "picked up" and carried around, to be deposited later at another location. The remainder of this chapter explains why the esoteric nature of PlaceWorld is as it is, beginning with the project's initial experiments with electronic landscapes.

3.2 Background: The Physical and the Abstract

eSCAPE examined two extreme styles of virtual environment, to extract from them appropriate features that would make an environment inhabitable. At one extreme, there was the physical or concrete "cityscape": familiar, constrained, slow to change, easily understood. At the other extreme, the "abstract landscape": unfamiliar, dynamic, unrestricted. Aspects from both these styles of environment were finally integrated into PlaceWorld. Virtual environments have the potential to include aspects of both these extremes: there is no technological reason why one should be confined to a groundplane in the virtual world just because gravity works in a particular way in the real world; there is no technological reason why one should be forced to walk from one part of a virtual world to another rather than simply "teleporting" to one's destination, much like clicking on a link in a web page. On the other hand, it is easy to imagine getting completely disorientated in a space where there are no constraints on navigation, or of missing out on chance meetings and opportunistic discoveries if one's only means of transport is by instantaneous hyperlink. Thus some balance needs to be achieved between these two extremes of physical and abstract.

The evolution of the project is easier to tell from the perspective of the concrete cityscape metaphor rather than the abstract, simply because of familiarity with its nature and terminology. The following vignettes chronicle various attempts at building electronic landscapes, and lead towards the final design of PlaceWorld.

3.2.1 Watching a Cityscape

The first environment built during the project was a straightforward cityscape (Figure 3.2), consisting of buildings, roadways, parks and street furniture generated by algorithms based on Hillier and Hanson's (1984) studies of urban evolution. A set of ethnographic studies was completed to see how users behaved in this environment, setting the participants the task of finding a number of landmarks in the city such as fountains, statues or pavilions. Studies were carried out at a desktop computer, using a standard mouse as an interaction device. The results were in some ways surprising. In spite of the user being a lone inhabitant of the cityscape, and without prompting from the scientist, wayfinding in the city was almost invariably carried out on the pedestrian areas (pavements, parks etc.) in spite of there being not a single car in sight. Though there was no collision detection preventing passage through the buildings, users mostly followed the road system. Even though the mouse navigation included an ability to fly up in the air (a feature that the participants were told about, and which clearly would give them an advantage in terms of spotting landmarks from on high), this was rarely used (indeed, one participant who did briefly levitate to a suitable vantage point rapidly came back down to ground level, having felt uneasy up in the air). Although this evidence was anecdotal rather than definitive (and there were a small number of exceptions in terms of behaviour), the studies suggested that, even with little effort by way of interaction device to induce the sense of presence, the general affordances of a real cityscape transferred rather strongly to the virtual world. A more detailed description of the study can be found in Murray et al. (2001). On this basis, a more ambitious city-like environment, the Distributed Legible City, was built.

Figure 3.2 A virtual cityscape.

3.2.2 The Distributed Legible City

The artist Jeffrey Shaw, a partner in eSCAPE, had once developed an artistic installation piece called the Legible City. In this work, a solitary cyclist, seated on a real bicycle and situated in a darkened room with a large projected wall, could cycle their way around interpretations of the cities of Amsterdam, Manhattan and Karlsruhe. Rather than ordinary buildings, the cities were populated with towering, solid letters, forming extracts from texts associated with the areas of the cities. For example, a virtual tourist in the Legible City could follow the ramblings of a taxi driver around Manhattan, or find sections of poetry in Amsterdam. Informal observation of the piece, which is still in situ in the ZKM in the real Karlsruhe, showed that, even with now outdated rendering technology and a simple interaction style, the piece would capture people's attention, and they would happily tour around the cities for some time, often trying to find out which piece of text was associated with their favourite part of the world. For entirely opportunistic reasons (much of the technology already existed and the piece was known to be "engaging"), the project decided to extend this installation to form a distributed environment, and so the Distributed Legible City (DLC) was constructed (Figure 3.3).

Using consumer-level graphical accelerators, the original database of city structures was reused, and modified exercise cycles were pressed into action as interface devices. Rather than the large screen projection system used in the original work, stands for monitors were built in front of the cycles, and where the first Legible City used an LCD panel attached to the handlebars to display maps of the cities, in this new version the map was displayed virtually as an overlay on the monitor screens, showing now the layout of the world as well as the positions of other inhabitants. For the purposes of communication, the users were provided with headphones and a microphone. Other users were represented by animated cycling avatars. Two versions of the DLC interface bike were situated around the ZKM, and another across a wide area network in Vienna, and the installation was made available to the public during a number of exhibitions.

For several days, the technology behaved perfectly; however, the overall result of the piece was extremely disappointing. No longer did users engage with the environment, and there was no interest in exploring the world for its own sake. Quickly, users realised that the buildings offered no resistance to them cycling through them, and the overlaid map was discarded, with users making a beeline towards one another and completely ignoring their surroundings. Once users had found each other, another disappointing interaction took place. Unexpectedly, users were resolute that they should be facing one another in order to carry out any conversation (this was not a requirement on the audio layer, which had no directional bias). The positioning of the monitor in front of the cycle, and the inability to pedal backwards, meant that a significant amount of effort was put into achieving "conversational orientation", cycling round in ever decreasing circles until this was achieved. With this effort expended, the users exchanged little more than a few perfunctory greetings, and then left the installation. Our expectations of multi-user cycling tours of the cityscape were far from met.

With hindsight, the problem was one of a mismatch between the affordances of the environment and the expectations of the users. First, the main novelty of the environment to its user turned out to be the potential to find another inhabitant, and perhaps then to tour the world.

Figure 3.3 Images from the Distributed Legible City.


Flashing arrows on the overlaid map made it clear that other users were around somewhere, and once users realised that the "buildings" did not block their attempt to cycle through them, the map became a meaningless diversion, which was not returned to as a useful navigation aid later (in spite of its apparently successful application in the original piece). Similarly, on a real bike one would expect to be able to cycle side by side with a friend, or at least to look over one's shoulder to carry out a conversation with a cyclist behind. The fixed position of the screen in front of the cyclist, and the difficulty of arranging conversational orientation with these limitations, made casual touring and chatting unrealistic.

To solve these problems, two simple modifications were made to the installation. First, the map was discarded (the alternative of forcing the utility of the map by making the buildings impenetrable seemed unproductive, since the cities were large and users simply wanted to find one another in the first instance). As a replacement, a bird was positioned in the sky above each participant, which would fly in the direction of the nearest other user, meaning that finding one another was now straightforward: simply follow the bird. The most significant alteration was the introduction of a tracked head-mounted display. The original fixed monitor was retained so that bystanders could see interactions within the environment; however, the cyclist was now able to look around in the environment, and to see the world in a different direction from that in which the cycle was facing.

The improvement to social interaction with the new version of the DLC, this time tested at ESPRIT IST98 in Vienna, was dramatic. With the ease of finding other users and of achieving a convenient orientation for comfortable communication, participants were much happier to spend extended periods of time exploring the cities together. The conclusion at this stage was that cityscapes provide easily understood metaphors for interaction, but these metaphors must be matched with the actual affordances of the environment and its input devices with great care.

3.2.3 Finding “Something to Do”

For all the lessons learned from the Distributed Legible City, it was far from being an e-scape: it lacked interaction with other environments, and had no “information” as such beyond the possible interest in the cities’ texts. A more concrete cityscape was built to address these limitations. An ethnographic study of a real-world tourist information centre, in the northern England seaside town of Morecambe, was carried out, concluding that it was an interesting enough site of social interaction and information browsing to warrant further attention. In particular, the question “What can we do here?” seemed as relevant to a tourist turning up in a new holiday location as it did to an inhabitant of a new electronic landscape. A virtual Tourist Information Centre (TIC) was constructed to see in what way the behaviours observed in the real-world TIC were transferred to the virtual landscape, and studies of this new environment were carried out in situ.

The virtual TIC (Figure 3.4) was based on a map of Morecambe, and included 3D representations of the various tourist attractions. The map and its contents distorted dynamically to bring objects or areas of interest to a particular user into view, according to search criteria. Once more, users tended to follow city-like conventions in their exploration of the virtual world, and there was evidence that to some extent the environment supported social interaction and exploration. Its limitation, however, was clearly one of content. The hand-crafted nature of the virtual Morecambe and the relatively small number of attractions represented within the world meant that its real potential as a source of holiday ideas was nowhere near reached. More importantly, it was also obvious that it would be unrealistically costly for any individual to attempt to populate such a world. From this it was concluded that an important aspect of any worthwhile e-scape was the ability for its inhabitants and stakeholders to straightforwardly introduce their own content and to modify existing artefacts to reflect their own experience.
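The chapter does not specify the deformation function used by the map, but a simple radial magnification conveys the idea: points near a focus of interest are scaled up, with the effect decaying smoothly with distance. All names and constants below are illustrative assumptions, not the actual virtual TIC code.

```cpp
#include <cmath>

struct Vec2 { double x, y; };

// Hypothetical sketch of the map's dynamic distortion: positions near a
// focus of interest are magnified, and the magnification falls off
// exponentially with distance, pulling relevant attractions into view.
Vec2 distort(const Vec2& p, const Vec2& focus, double magnify, double falloff)
{
    double dx = p.x - focus.x, dy = p.y - focus.y;
    double r = std::sqrt(dx * dx + dy * dy);
    // Scale factor decays from `magnify` at the focus toward 1 far away.
    double s = 1.0 + (magnify - 1.0) * std::exp(-r / falloff);
    return Vec2{focus.x + dx * s, focus.y + dy * s};
}
```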

Figure 3.4 A virtual Tourist Information Centre, showing the deformable map and buildings.

3.2.4 Abstract Influences: Nuzzle Afar

In parallel with the thread that has been described so far, informed very much by the “concrete” and “physical” end of the e-scape spectrum, a number of other studies and sub-projects were being explored under the eSCAPE banner. One of these in particular, a multimedia installation of an abstract nature called Nuzzle Afar, had a significant influence on the nature of navigation and wayfinding in PlaceWorld.

Nuzzle Afar, a multimedia installation by Masaki Fujihata, consisted of an enclosed room with left and right side entrances. Within the room were two podia, in front of which were projection screens. Trackballs embedded on top of the podia allowed users to control movement through the computer-generated environment displayed on the projection screens, and microphones similarly located allowed distributed users to communicate. As users moved through the electronic environment, a string-like trace was left upon the virtual landscape, which could be locked onto and followed by another user. The enclosed space of the virtual world consisted of four walls, a ground plane and a sky plane. Upon the walls were images of “sense organs” (e.g. a hand, an eye, an ear). In addition to spherical “avatars” of unique colour, a sphere and a cylinder were placed within the virtual room. These latter objects were the means by which users could enter or depart a series of three rooms.

Within each of the spaces in Nuzzle Afar navigation needs to be learnt anew: effort, space and travel have different relationships within each. When two or more users are in close proximity “within” any of the spaces, they are able to see a video image of one another mapped and wrapped, visor-like, around the middle of the sphere. This allows for recognition of the other’s identity. On meeting each other, any two users are able to enter a new virtual space which encompasses them, while locking out the previous environment and any other inhabitants. This new space is, however, visible to other users as a spherical object inside which the colours of the two users “inside” merge. Once inside this new space, users are represented via their video images on a 2D square. When one or both of the users leave this space, a video still of the two users remains, along with details of the time and physical locations of the encounter.

In spite of the very abstract nature of the Nuzzle Afar world, its means of wayfinding and of locating other users proved much more intuitive than our initial experiments with maps and other devices borrowed from the cityscape metaphor. The influence of this installation on the final implementation of PlaceWorld is described in the following sections.

3.3 PlaceWorld

PlaceWorld became the final eSCAPE demonstrator and assimilated much of the work done in the earlier parts of the project. It aimed to become an effective “place where places meet”, building on the lessons learnt from studies of the smaller-scale landscapes. Another multimedia installation piece by Jeffrey Shaw formed the basis for its artistic design, and a novel user interface device was commissioned for the purpose of public display of the new environment. The next sections describe the vision and rationale that underpinned the world.

3.3.1 The Design of PlaceWorld

The design of PlaceWorld took as its point of departure many features of Jeffrey Shaw’s 1995 artistic work “PLACE – A User’s Manual” (Shaw, 1997). The paradigm embodied in both implementations is that of a navigable virtual environment populated by cylindrical artefacts that form portals into other virtual environments. The world is also characterised by pathways that provide identifiable routes between these places.

As boundary architecture, cylindrical space is quite different from more typical rectangular spaces: the latter are perceived as enclosures, whereas the former constitutes what is in effect a scaleable panoramic horizon. This aspect of the virtual cylinder is extended into the real world by the panoramic navigator interaction device described later. Cylindrical spaces are especially appropriate within PlaceWorld because they allow all the idiosyncratic virtual environments to define their own unique enclosures within a neutral and scaleable circumference. Despite the apparent exterior uniformity of these cylinders, attaching images to their surfaces offers an expressive way to signal their interior contents. In this manner, rather than by architectural variation, they constitute a characteristic and very expressive visual syntax for PlaceWorld. This cylindrical uniformity does not preclude the possibility of scaling them variously in height and width, or of working with the relative transparency of their surfaces, and this allows further expression of spatial as well as temporal meanings, as will be described later.

From this initial vision, a graphic designer developed a design specification that took the earlier PLACE work and extended it with new features that addressed the particular needs set by the objectives of an electronic landscape. One important aspect was the fact that an everyday urban landscape is characterised by the presence or absence of its population of citizens. PlaceWorld also socialises its unusual landscape with representations of its visitors and occupants, but in accord with the inherent properties of this information space, these avatars have unique design characteristics that enable them to function effectively as true citizens of PlaceWorld.

A visitor in PlaceWorld would be expected to orient themselves to a large extent by the visual identities of the cylinders themselves – that is, by the individual images textured on the outer surfaces of these cylinders (a form of advertising their interior contents) – and by the general stability of the layout. Much as in the real world, it seemed useful to also establish some method of signposting that could guide visitors to the various locations. Such signposts could also be set up by persons who had built a place in PlaceWorld as pointers to their location.

The relative permanence of the overall geography of PlaceWorld and its cylindrically bounded environments was felt to be an important issue if regular visitors are to be expected to readily gain familiarity with it. On the other hand, the ability to introduce new content to an environment means that some controlled modification of the world must be possible, as was made clear by our experiments with the virtual TIC. A description of the access model – a set of malleable guidelines and policies that could develop over time under the influence of the inhabitants of the world – was developed to support coherent and manageable changes to the underlying structure. A brief description of the access model is given later in this chapter.

The issue of balancing familiar structure with the desire for modifiable content formed a fundamental feature of the temporal nature of PlaceWorld’s landscape, and reflects once more the desire to combine appropriate aspects of the concrete and the abstract. Masaki Fujihata’s project Nuzzle Afar provided an exemplary demonstration of the value and effectiveness of attaching traces to users’ movements through a virtual environment. Colour coded for each visitor, these lines gave temporal evidence of people’s presence and their paths of interest. The added feature that a visitor in such an environment could “hook on” to someone else’s trace and then get a roller-coaster ride along that convoluted line, bringing him face to face with that other person, was also very effective and enjoyable. However, further discussion of this methodology revealed a basic weakness: as soon as there were a large number of visitors in the virtual environment, these lines would become an unintelligible tangle. Because we had decided to restrict movement to ground level in PlaceWorld, such traces could only be drawn in one plane, adding to the confusion if there were many visitors. On the other hand, the restriction of these markings to one plane revealed the possibility of another very interesting design strategy. We realised that these traces could in fact be merged with the geographical and physical articulation of roadways in PlaceWorld. These roadways could be programmed to dynamically and temporally express the volume of their usage, being a virtual reflection of the “desire lines” in the real world that are created, say, when pedestrians make their own shortcuts across grassy areas in parks. The hierarchy could be, say, trail, path, road, highway, defined by the density of traffic experienced along these routes and changing as that density varies. In this way a new feature of informative comprehensibility could be added to PlaceWorld: the nature of the pathways would themselves express a history of visitor interests, so one could, for instance, choose the well-trodden paths, venture along trails into relatively unfrequented regions, or forge a new trail into previously unexplored territory.
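The traffic-to-hierarchy mapping amounts to a simple threshold function. The cut-off densities below are invented for illustration, since the chapter names the hierarchy but gives no concrete values:

```cpp
#include <string>

// Hypothetical classification of a PlaceWorld route by its usage density.
// The thresholds are illustrative assumptions only; in the design they
// would themselves change as traffic density varied over time.
std::string routeKind(double traffic)  // e.g. traversals per hour
{
    if (traffic < 10.0)  return "trail";
    if (traffic < 50.0)  return "path";
    if (traffic < 200.0) return "road";
    return "highway";
}
```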

Following this line of thought about the informative value of temporal signifiers in PlaceWorld, the notion was extended to the cylinders themselves, which expressed the temporal condition and/or the popularity of their contents. As with the tourist information centre, cylinders that were of interest or frequently visited were designed to increase in scale over time, with those of little interest shrinking into obscurity.

A point of debate was the notion of hyperlinking the individual cylindrically bounded environments so that a visitor could “teleport” instantly from one related place to another without having to negotiate the PlaceWorld landscape. A more radical extension of this facility would be that a visitor could at will completely reorganise the distribution of cylinders in PlaceWorld to suit a specified personal need or desire (e.g. he only wants to visit the popular places, so these all become clustered around him and the others disappear). However, going back to the premise of PlaceWorld as a physical electronic landscape that should be strongly anchored in real-world physics and experiences, it was felt that such hyperlinks and hyper-rearrangements should be kept to a minimum, since they were not entirely consistent with the “physical landscape” objectives set for this environment and shown to be effective in the previous e-scapes. As a compromise, users were given the ability to generate a hyperlink between places that they had already visited, with the link being visible only to that user as a glowing trail in the sky, anchored at both ends on the ground plane.

3.3.2 The User Interface and Presentation System

The Panoramic Navigator (PN) is a patented technology developed by Jeffrey Shaw and the ZKM, whose initial function was to provide a panoramic means for visitors to preview and explore the contents of the ZKM building. Using a 360-degree rotatable touch screen coupled to a video camera, this augmented reality approach enables visitors to look around the building (via the live camera image) and use the touch screen to access additional multimedia information attached to specific locations in that building.

While past implementations of the PN used the real environment as their frame of reference for interactive information delivery, it was realised that this technology could also be an ideal interface for exploring and interacting with wholly virtual environments, and that this would be eminently suitable for PlaceWorld. What was needed was a method of generating a representation of that virtual environment around the PN, so that the touchscreen could then be used to interact with it.

The design solution was to attach a video projector to the back of the PN (behind the touchscreen) and to put a circular projection screen around the PN. In this way a new presentation and interaction method was developed, embodying the innovative concept of augmented virtuality (PN-AV). This PN-AV technology is used in PlaceWorld to serve two distinct but interrelated functions:

● to display the 360-degree immersive representation of the virtual environment, wherein one also sees the effects of user interaction with that environment.

● to give the user a touchable interface that allows him/her to explore and interact with that surrounding virtual environment, by showing a parallel representation of the environment augmented by a dynamic set of user interaction tools.

The final version of the PN-AV, in use during a public demonstration, is shown in Figure 3.5.

Having described the evolution of the PlaceWorld environment, we now turn our attention to the problem of its actual implementation, and describe the development of the DEVA VR system, which grew in parallel with the e-scape demonstrators. We will return to how aspects of PlaceWorld were supported by this technology at the end of this chapter.

3.4 Technological Challenges for Electronic Landscapes

Until recently, technological limitations were the clear brake on progress for VEs (the graphics challenge is perhaps the clearest example). Beyond the technology, the key task is that of writing software. The scale and complexity of this task is often underestimated, and it is here that we believe the major problems lie. We do not at this time have adequate frameworks to simplify the task of implementing virtual worlds. Today, a person wishing to implement a challenging VE application has two broad options, as evidenced in current demonstrations. The most sophisticated VEs are usually bespoke applications constructed from the graphics layer upwards; this is a substantial undertaking. The alternative is to use a VE system. These fall into different categories (for example, VRML browsers at the lower end, up to VE systems such as DIVE and MASSIVE at the more ambitious extreme).

Figure 3.5 The Panoramic Navigator – Augmented Virtuality version, in use with PlaceWorld.

The software support challenge to facilitate large-scale VE applications is twofold. First, to find techniques and algorithms to address specific needs in VEs: collision detection, parallelism, distribution, synchronisation, navigation and so forth all require work of this kind. Secondly, and perhaps rather harder, to find frameworks that allow all the parts to be put together in “flexible yet powerful” ways. The rather trite nature of such a statement belies the difficulty of quantifying that task.

Finding the “right” framework is particularly difficult in the case of virtual reality (VR), since it brings together a number of complex technical issues and binds them with real-time constraints at the social/perceptual interface. A desirable approach, providing the flexibility necessary for experimentation, is to build as little into the system as possible, so that the system provides a set of mechanisms, policies and default behaviours that can be unplugged and tailored at each level.

The issue of scale is an important one; simple small-scale VEs that do not challenge today’s hardware can be constructed by any number of means. Building large-scale complex applications raises a number of challenges. For shared VEs these relate, broadly speaking, to the following areas: the number of entities; the complexity of behaviour required of these entities; the complexity of individual rendering techniques; the number and geographical distribution of simultaneous users; and the number of co-existing and interacting applications.

Matters of scalability and synchronisation, and some architectural and network topologies for achieving them, are discussed in Chapter 13. Here we present the approach implemented in the DEVA VR system, arising from the observation that for a shared VE there is a natural distinction between the users’ perceptions of the world and what is actually “going on” within it – that is, between the distribution and simulation of application behaviours on the one hand, and the task of presenting a coherent perception of the VE to each user on the other. This is similar to the traditional philosophical distinction between subjective perception and underlying objective reality. This distinction is useful because in VR it separates the challenging graphics/interaction tasks from the semantics of the underlying world simulation. It also legitimises efforts to make a perceptually smooth presentation of the world in the light of fundamental networking limitations. Architecturally, we have addressed these two aspects of the VE (perception/reality) separately. First, rendering and spatial management seem to need special treatment, different from current approaches to graphics toolkits. The MAVERIK system aims to address these issues. The unusual approach taken by MAVERIK is to avoid having any internal “structure” for the representation of VEs and their contents. Instead, an object-orientated framework is provided that supports an application builder in implementing rendering, interaction and spatial management routines that are tailor-made and appropriate to their particular purpose. This is as much a performance issue as one of programming elegance, for key graphics optimisations are highly application-specific, and are generally unavailable when the application must export its representation into the VR system. The system is described in detail in Hubbold et al. (2001). The more difficult problem is that of defining the underlying behaviour of entities in a way that can be distributed to multiple users. It is to this problem that we now turn our attention.

3.4.1 Synchronising the Behaviour of Entities

The distribution of semantics or behaviour in shared VEs is a particularly difficult issue. The ideal solution would be to describe the required behaviour in a single location and to have it instantaneously sent to all participants (a pure “client/server” approach). Current network technology is far too limited to make this solution feasible. An alternative is to replicate behaviour locally for each participant (a pure “peer-to-peer” approach); this introduces extreme synchronisation problems.

The approach taken in DEVA is a hybrid solution, part way between client/server and peer-to-peer. The model is one of an objective and a subjective reality, the former being located (logically) on the server, the latter being represented on each client. Each user interacts with the objective reality via their own subjective view. This introduces the idea that each entity in the virtual environment has two definitions, which may differ significantly in their semantics. Our usage of the term “subjective” is not intended to imply that each user experiences an entirely different VE (this would hardly count as a shared experience). Rather, we argue that an amount of the user’s experience may be decoupled from the “objective” behaviour of the world without disturbing consequences. Given that absolute synchronisation is in any case impossible, it is our contention that so long as the length of the delay is not too large, and causal events occur in the correct order, it is possible to accept a degree of subjectivity without it affecting the users’ understanding of the semantics of the application. Under these circumstances, given no other frame of reference, users are unlikely to even realise they are not receiving the same view of the world as each other (as will be described later, PlaceWorld demonstrated this phenomenon in extreme ways). In general it is therefore useful to differentiate between what is “actually” happening in a VE and what the users perceive. In this way it is possible to optimise the manner in which information is transferred so as to minimise causal discrepancies.

In the DEVA system this notion is implemented by describing an entity as comprising a single “object” and a number of “subjects”, the former being the “objective, non-perceived” aspect of the entity and the latter being a collection of its “subjective, perceivable” characteristics (e.g. its visible, audible and haptic parts). An alternative view of this distinction is that the “object” represents what an entity does, while its “subjects” represent what the entity looks/sounds/feels like when it is doing it.

Communication between an object and its subjects can be implemented using whatever high-level “vocabulary” is appropriate. In this way the system minimises the need for strict synchronisation while maximising the accuracy of causally important events. One example of this is the use of twines (Marsh et al., 1999). Twines are lightweight parametric curves that are used to “smooth out”, spatially and temporally, the discrete updates to the visible aspects (e.g. position and rotation) of appropriate entities by interpolation, thus increasing their “plausibility” by eliminating disconcerting jumps. We are also currently investigating an infrastructure for providing “quality of service”, based on balancing such smoothing against frequency of updates depending on the perceived importance of various events.
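The effect of twine-style smoothing can be sketched as easing between successive network samples rather than jumping to each one. The simple smoothstep curve below stands in for the actual parametric curves of Marsh et al. (1999), whose exact form is not given here; names are illustrative.

```cpp
struct Vec3 { double x, y, z; };

// Sketch of twine-like smoothing: rather than snapping to each discrete
// network update, the subject glides along a curve between the previous
// and latest reported positions. A smoothstep ease removes the abrupt
// velocity change at each update, eliminating disconcerting jumps.
Vec3 twineSample(const Vec3& prev, const Vec3& next, double t /* 0..1 */)
{
    double s = t * t * (3.0 - 2.0 * t);  // smoothstep easing
    return Vec3{prev.x + (next.x - prev.x) * s,
                prev.y + (next.y - prev.y) * s,
                prev.z + (next.z - prev.z) * s};
}
```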

The flexible nature of the communications is also advantageous for subject-to-object communication; for example, when a user “grabs” and moves an entity. With a classic client–server architecture such manipulation would involve a round trip: the event is sent from the client to the server to be processed, and the effects of the event are then transmitted back to the client to be visualised. With the commonly available network technologies we are targeting for the client–server connections, such a round trip would introduce an unacceptable lag. It is perceptually important for the cause and effect of the manipulation to be as tightly coupled as possible.

With the separation of subject and object, changes caused by the manipulation can be perceived immediately by the subject, with a user-definable policy function determining when the object is updated. For example, a fraction of the changes (say, one every fifth of a second) might be transmitted back from the subject to the object, and thus on to the other visualisers connected to the server. Alternatively, a change in position might only be propagated to the object once the subject has moved a certain distance from its previously synchronised location.
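A policy function of the kind just described might look like the following sketch, which combines both examples from the text: an update is pushed back to the object when a time interval has elapsed or when the subject has drifted far enough. The threshold values and names are assumptions, not DEVA's actual interface.

```cpp
#include <cmath>

struct Vec3 { double x, y, z; };

// Illustrative subject-to-object update policy: transmit a change either
// when enough time has passed since the last synchronisation, or when the
// subject has moved a given distance from the last synchronised position.
struct UpdatePolicy {
    double interval;        // seconds between time-driven updates
    double maxDrift;        // distance triggering an immediate update
    double lastSent = 0.0;  // time of the last transmitted update
    Vec3 lastPos{0.0, 0.0, 0.0};

    bool shouldSend(double now, const Vec3& pos) {
        double dx = pos.x - lastPos.x, dy = pos.y - lastPos.y,
               dz = pos.z - lastPos.z;
        double drift = std::sqrt(dx * dx + dy * dy + dz * dz);
        if (now - lastSent >= interval || drift >= maxDrift) {
            lastSent = now;
            lastPos = pos;
            return true;    // transmit to the server-side object
        }
        return false;       // keep the change local to this subject
    }
};
```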

The goal is to ensure that the entity behaves “correctly/optimally” on the visualiser performing the manipulation, and that it behaves “plausibly/acceptably” on any other visualiser connected to the cluster (while accepting that latency and bandwidth issues rule out strict and absolute synchronisation in such a distributed system).

3.4.2 Distribution and Communications

The programming model employed by the DEVA system is one of communicating “entities”, which can represent objects in the virtual environment, the properties of the environment itself, or abstract programming concepts that have no direct representation perceivable by the inhabitants of the environment. These entities are coarse-grained programming objects, exporting a number of methods that can be called by other entities, and implementing these internally using optimised imperative code (currently written in C++). The DEVA programming model makes entity behaviour explicit, allowing entities to query one another’s features and those of the environment in which they exist, and to reason about these, rather than having behaviour emerge as an implicit side effect of a piece of code being executed. In this section we describe the make-up of an entity, and the mechanisms provided for enabling transparent and lightweight communication between entities distributed around the system.

3.4.3 Defining the Behaviour of Entities

We can identify at least four conceptual sources of behaviour for each individual entity within a VE:

1. Behaviour unique to an entity. Entities each have their own particular role within a VE, for example the ability of a stopwatch to record elapsed time.

2. Behaviour common to a range of entities. Often many entities share some aspects of their behaviour through being of a similar type. For example, each pawn on a chessboard is subject to the same detailed rules governing movement; these rules do not apply in the same way to other pieces, which have their own restrictions. However, all the pieces have in common the rules determining whether they may be “captured” – a notion that is specific to chess pieces but not necessarily to other entities in the same environment. Thus a hierarchy of common groups can be determined.

3. Behaviour common to all entities in a particular world. We generally consider gravity to be a phenomenon associated with everything found in the world around us, rather than as a property of each individual entity. This also applies to social constructs such as “monetary value”.

4. Behaviour that is dynamically required at runtime. If an entity becomes inhabited by a user, for example, it will behave differently – being controlled by a navigation device and so forth – from when uninhabited. Similarly, an entity that is “set alight” will suddenly have properties previously unavailable to it (for example, being able to set fire to neighbouring entities or raising the temperature of an environment).

Commonly, VR systems define the behaviour of an entity by attaching pieces of code, often written in Java, Tcl or some other scripting language. With little specific architectural support, however, this is often laborious to code and it is difficult to ensure consistency. The DEVA programming model attempts to improve the situation by taking an object-orientated approach to the definition of behaviour and by providing features such as introspection (the ability to ask an object what methods it has, what parameters they take, etc.) and run-time combination of environmental influences with entity-specific behaviour. This essentially allows the traditional inheritance graph to be modified while an entity is “live” in an environment, adding or removing properties and behavioural traits as necessary.

Our solution to the problem of merging the various sources of behaviour that comprise an entity is to use “characteristics”. In DEVA a characteristic is a collection of methods and attributes relating to a single concept that can be attached to or detached from an entity at runtime. DEVA supports three types of characteristic:

1. Innate: behaviours that define an entity and make it different from those around it, for example its physical shape.

2. Imbued: behaviours that are offered to an entity when it joins an environment, but can be overridden by a more specific innate behaviour (for example, an entity lacking mass in an environment that requires such an attribute may inherit an imbued characteristic that approximates a default mass from its bounding volume; an entity that has a better idea of what its mass is can provide the functionality itself).

3. Enforced: behaviours that must be processed in order for an entity to conform to an environment’s requirements.

The methods and attributes that comprise a characteristic are divided into a single “object” and multiple “subjects”, as outlined in the previous section. While typically a characteristic will contain both parts, some characteristics are entirely abstract and have no directly perceivable representation in the virtual environment; that is, they have no “subject” part.

The researcher trying out new low-level ideas is free to write characteristics directly in C or C++ that interface to the VR kernel at whatever level is appropriate. More general users are free to use the library of existing characteristics to construct entities without concerning themselves with their implementation. We also speculate that this characteristic system has the added benefit of being usable from an immersive “graphical user interface”, allowing the user to “mix and match” characteristics at runtime.

To allow entities to move efficiently between worlds with different behaviours, each entity contains two lists of characteristics: one inherited from the environment, and the other containing its own innate behaviour. Traditional object-oriented inheritance is supported through characteristics being able to load other characteristics when they are initialised. The order in which the characteristics are searched enforces the correct precedence.


When a method is called upon an entity, the two lists are searched in strict order. First, the list of characteristics given to the entity by the environment is scanned for methods marked as being “enforced”. If one with the correct name is found, then this is called. Enforcing methods allows an environment to ensure all entities contain a particular method that cannot be overridden. Next, the entity’s innate characteristics are searched. If the method is still not found then the environment characteristics are searched again for methods marked as “imbued”. These are methods given to the entity by the environment but which can then be overridden by the entity itself. For example, an environment may enforce the notion of “solidity” upon all its entities (“you can’t pass through walls”). It may also “imbue” all entities with a concept of mass with a default value estimated from the entity’s volume, but which the entity is free to override should it prefer. This process is illustrated in Figure 3.6.
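The lookup order just described can be captured in a few lines. The following Python sketch is our own simplification of the precedence rules; the dictionaries stand in for DEVA's characteristic lists:

```python
def dispatch(env_methods, innate_methods, method_name, *args):
    """Resolve a method call against a DEVA-style entity.

    env_methods:    {name: (is_enforced, callable)} from the environment
    innate_methods: {name: callable} defined by the entity itself

    Search order: environment 'enforced' methods first, then the entity's
    innate methods, finally the environment's 'imbued' defaults.
    """
    # 1. Enforced environment methods cannot be overridden.
    if method_name in env_methods and env_methods[method_name][0]:
        return env_methods[method_name][1](*args)
    # 2. The entity's own innate methods.
    if method_name in innate_methods:
        return innate_methods[method_name](*args)
    # 3. Imbued environment defaults, used only when there is no override.
    if method_name in env_methods:
        return env_methods[method_name][1](*args)
    raise AttributeError(method_name)

# Example: the environment enforces solidity but only imbues a default mass.
env = {
    "is_solid": (True,  lambda: True),   # enforced
    "get_mass": (False, lambda: 1.0),    # imbued default
}
innate = {"get_mass": lambda: 42.0}      # the entity knows its own mass
```

Here `dispatch(env, innate, "get_mass")` yields the innate value, while an entity with no innate mass falls back to the imbued default; `is_solid` is answered by the environment regardless of what the entity defines.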

3.4.4 Methods and Filters

The strict order in which characteristics are searched for methods leads to another useful concept: the filter. As well as simply returning results, methods can be declared to be filters. These are allowed to return a new method call as their result, which then continues to


Figure 3.6 Structure of a DEVA entity, showing the ordering of method processing: an incoming method call is searched through the entity’s enforced, innate and imbued characteristic lists (each characteristic holding its own methods), producing an optional return value.


propagate along the characteristic lists. This is useful in a number of situations. Say, for example, it was necessary to constrain an entity to a plane. The entity inherits the imbued method setXYZ from its ThreeDSpace environment. In this case the entity would itself define a characteristic that declares setXYZ as a filter. Since this would be found before the imbued method, the entity’s filter is able to constrain the co-ordinates to a given plane before returning a new setXYZ message with the new co-ordinates, which then propagates through to the environment-imbued setXYZ method. Another example would be storing an entity’s state prior to migrating the entity to another server node process or prior to a complete shutdown of the system. Each characteristic defines a snapshot filter that adds any state variables it possesses to the input message and returns a new snapshot message. The final characteristic then returns a restore message containing all the variables, which is sent to the new entity. The new entity contains a restore filter in each characteristic that originally stored variables, which then unpacks the message.
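The plane-constraint example can be sketched as a walk along an ordered chain in which filters rewrite the call before it reaches an ordinary method. The chain layout and helper names below are our own illustration, not DEVA code:

```python
def call(chain, name, args):
    """Walk an ordered characteristic chain; filters may rewrite the call."""
    for kind, entry_name, fn in chain:
        if entry_name != name:
            continue
        if kind == "filter":
            name, args = fn(*args)   # filter returns a new call; keep walking
        else:
            return fn(*args)         # an ordinary method ends the chain
    return None

# The entity's innate filter clamps positions to the plane z = 0 before the
# environment-imbued setXYZ finally stores them.
position = {}

def constrain_to_plane(x, y, z):
    return ("setXYZ", (x, y, 0.0))

def env_set_xyz(x, y, z):
    position.update(x=x, y=y, z=z)
    return position

chain = [
    ("filter", "setXYZ", constrain_to_plane),  # found first: innate filter
    ("method", "setXYZ", env_set_xyz),         # imbued by ThreeDSpace
]
call(chain, "setXYZ", (1.0, 2.0, 5.0))  # z is clamped before storage
```

The snapshot/restore pattern uses the same walk: each snapshot filter appends its state variables to the message and re-emits the call, until the final characteristic returns the accumulated message.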

Continuous behaviour is supported by permitting characteristics to define a special method called “activity”, which is polled intermittently. Activities are not subject to the usual calling precedence, but are always polled if defined.
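A minimal sketch of this polling loop follows. The names are hypothetical, and DEVA's actual scheduling details are not given in the chapter:

```python
class Spinner:
    """A characteristic whose 'activity' is polled every cycle, outside
    the usual enforced/innate/imbued calling precedence."""
    def __init__(self):
        self.angle = 0.0

    def activity(self):
        self.angle = (self.angle + 1.0) % 360.0

def poll_activities(characteristics):
    # Every characteristic that defines 'activity' is polled; activities
    # are never shadowed by other characteristics.
    for c in characteristics:
        act = getattr(c, "activity", None)
        if callable(act):
            act()

spinner = Spinner()
for _ in range(3):
    poll_activities([spinner])
```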

3.4.5 The Distribution Architecture

DEVA is logically a client–server architecture, which to a first approximation provides a single definitive locus of control for the VE using its server component, with “mirrors” of the entities being maintained in each client process. Behind the scenes, however, DEVA pragmatically manages the delegation of control dynamically to the most appropriate parts of the system, thereby achieving the highest fidelity of perceptual and causal coherency attainable for the application at hand.

The “server” is in fact a cluster of processors running identical processes called “server nodes” that together form a single multi-threaded parallel virtual machine capable of processing large numbers of entities. The intention is that the server provides a computing resource for multiple virtual environments, and maintains a far heavier processing load than any one user’s client could manage at any one time. A networking layer provides lightweight position-independent messaging between entities. Entities are created in and managed by the server node processes, and client processes – such as a visualiser or user application – connect to the server to interact with and obtain state information about the entities.

The server is persistent: it remains alive, processing any entities regardless of whether or not any clients are connected. Administrative tools exist to simplify the start-up, monitoring and shutdown of the parallel server.


Creating and Addressing Entities

Each server consists of a (large) fixed number, M, of virtual servers. The M virtual servers are trivially mapped using a lookup table to the N server node processes that comprise the server. This conversion takes place at a low level within the system and essentially hides the configuration of the server.

An arbitrary virtual server is chosen to create and manage each entity (currently a random virtual server is chosen, but this selection process could take into account loading factors).

Each server node potentially manages multiple entities. Each entity is assigned a unique “pool ID” – an offset into the list of entities managed by a given server node.

The location of an entity in the server is uniquely defined by the virtual server and pool IDs. The pool ID is not strictly necessary and is provided for efficiency only, since each entity has a unique name that can be searched for in the list of entities managed by the server node.

When an entity is created, a hash function is applied to the entity’s name to obtain a second virtual server. This virtual server manages the name of that entity, that is, it definitively knows the virtual server where the entity is actually located, along with its pool ID. The name of the entity and the entity itself are therefore managed by separate virtual servers.

The same hash function is used throughout the system, allowing any DEVA process to obtain the location of a named entity. This data is cached for future use. Entity name-to-location lookup, while a lightweight process, is a central and frequent task in a distributed system. A scheme where this load is spread equally across all server nodes is therefore advantageous.
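In outline, the virtual-server scheme might look like this. The hash function, table sizes and names below are illustrative assumptions, not DEVA's actual values:

```python
M = 1024  # fixed number of virtual servers, chosen much larger than N
N = 4     # actual server node processes currently running

def hash_name(s):
    # Any stable string hash works, provided every process uses the same one.
    h = 0
    for ch in s:
        h = (h * 31 + ord(ch)) & 0xFFFFFFFF
    return h

def virtual_server(entity_name):
    """The virtual server that manages this entity's *name*."""
    return hash_name(entity_name) % M

# The lookup table hides the real configuration: remapping a virtual
# server onto another node only means editing this table.
vs_to_node = [vs % N for vs in range(M)]

def name_manager_node(entity_name):
    """Any DEVA process can compute which node to ask for an entity's
    current location, without consulting a central registry."""
    return vs_to_node[virtual_server(entity_name)]
```

Because names hash roughly uniformly over the M virtual servers, the name-lookup load spreads across all nodes, which is the advantage the text argues for.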

The main advantage of the addressing mechanism is that it allows entities to migrate dynamically across server nodes to help balance the processing load. When an entity moves it only needs to inform its name manager of its new location. DEVA processes – which now contain out-of-date cached data – will receive an error the next time they try to communicate with the entity at its old location. This error is trapped internally and the new location of the entity is obtained from the name manager; the originator of the communication with the entity is oblivious to the migration. The name manager can always be relied upon to know the correct information, and its location is trivially obtained.
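The transparent-migration protocol amounts to a cache-and-retry loop, sketched below with invented class names (the chapter does not describe DEVA's wire format):

```python
class StaleLocation(Exception):
    """Raised by a node when an entity is no longer resident there."""

class Node:
    def __init__(self):
        self.entities = {}

    def deliver(self, entity, message):
        if entity not in self.entities:
            raise StaleLocation(entity)
        return f"{entity} got {message}"

class Client:
    """Caches entity locations; the name manager stays authoritative."""
    def __init__(self, name_manager):
        self.name_manager = name_manager  # entity name -> node index
        self.cache = {}

    def send(self, entity, message, nodes):
        loc = self.cache.get(entity)
        if loc is None:
            loc = self.cache[entity] = self.name_manager[entity]
        try:
            return nodes[loc].deliver(entity, message)
        except StaleLocation:
            # The entity migrated; refresh from the name manager and retry.
            # The caller never notices the migration.
            loc = self.cache[entity] = self.name_manager[entity]
            return nodes[loc].deliver(entity, message)

# An entity migrates from node 0 to node 1; only the name manager is told.
nodes = [Node(), Node()]
nodes[0].entities["lamp"] = object()
name_manager = {"lamp": 0}
client = Client(name_manager)
client.send("lamp", "on", nodes)      # caches location 0
del nodes[0].entities["lamp"]
nodes[1].entities["lamp"] = object()  # migration
name_manager["lamp"] = 1              # name manager updated
client.send("lamp", "off", nodes)     # stale cache trapped, retried
```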

Server Reconfiguration

It is also possible to migrate the names managed by a given virtual server onto a different server node by updating the virtual server-to-server node lookup tables in every client and server process.

The migration of both entities and name management allows server nodes to be dynamically added to, and removed from, a running server.


Networking Protocol

Currently, standard TCP/IP point-to-point socket communication is employed, although some work has been undertaken investigating multicast, since for local area networks at least it promises improvements in performance for our application.

Although not a strict requirement, the communications strategy is based upon the assumption that inter-server node communication is fast compared to server–client communication. For example, the server nodes are connected via a dedicated network or protected to some extent from superfluous traffic by a bridge, while clients connect to the servers via a high-traffic shared LAN (local area network) or modem connection.

3.5 System Support for PlaceWorld

In this final section, we describe how a number of the artefacts and features of PlaceWorld, informed by the previous multimedia installations and ethnographic studies, were supported by the technology of the DEVA system.

3.5.1 Menus

Every entity in PlaceWorld is able to respond to the user in one way or another. When the user touches the entity on the navigator screen, a menu appears showing the available features. Most entities, for example, can be picked up and put in the user’s “pocket”, and in this way moved to another part of the same world, or indeed somewhere completely different. Even entities that were coded before PlaceWorld was conceived have this feature “enforced” upon them by PlaceWorld’s master environment, giving a degree of coherence to the interface. Other entities hand-crafted for the world, such as the generators or noticeboards, offer more sophisticated behaviour via the “innate” mechanism. Filters are used to cascade through the possible methods provided by each entity, combining the imbued, innate and enforced facilities available to the user and presenting these graphically at the interface.

3.5.2 Access Model

PlaceWorld’s access model demonstrates a more sophisticated use of the characteristic architecture. In order to maintain a degree of order in the environment, users are able to specify rules that govern the use of entities they have placed in the world. The mechanism for describing these rules is beyond the scope of this chapter; however, the method of


enforcing them can easily be explained. The PlaceWorld master environment enforces on its contents (which include all its sub-environments and their contents) an “authenticate” filter that responds to all possible method calls. This effectively intercepts communication of any kind arriving at any entity in the world from any source. The authenticate filter examines the incoming method call, and matches the credentials of the caller against whatever rules are currently being requested by the individual entity via an innate method higher up the call chain. If the attempted invocation is valid according to these rules, the filter “releases” the original method call to cascade through the remainder of that entity’s methods, with the caller effectively unaware of the validation process that has occurred. Methods that are not authenticated are returned to the caller, which in turn has a method imbued by the environment to cope with such “return to sender” results. As before, entities that were coded without any knowledge of the access model automatically pick up a minimum amount of functionality to enable them to interact with the access model in a suitable manner; no extra effort on the part of the implementer is necessary, while those entities aware of their surroundings are able to easily extend the rule base in bespoke ways.
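A toy version of the authenticate filter's release/bounce decision might look like this (the rule representation and names are our own simplification of the mechanism described above):

```python
def authenticate_filter(caller, method_name, args, rules):
    """Intercepts every incoming call on an entity. The call is 'released'
    to cascade through the remaining methods only if the caller's
    credentials satisfy the entity's current rules; otherwise the call
    bounces back as a 'return to sender' result."""
    allowed = rules.get(method_name, set())
    if caller in allowed or "*" in allowed:
        return ("release", (method_name, args))
    return ("return_to_sender", (caller, method_name))

# Rules an entity might request via its innate method higher up the chain:
rules = {"pick_up": {"alice"}, "read": {"*"}}
```

For example, a `pick_up` call from "alice" is released to continue down the chain, while the same call from "bob" is bounced, and "bob" remains free to `read`.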

3.5.3 Exploiting Subjectivity

PlaceWorld has comparatively relaxed synchronisation requirements. Interaction takes place mostly at a social level, and there is currently no detailed manipulation of shared entities in real time such as might be necessary in a collaborative CAD package. The subject/object distinction embodied by DEVA allows us to take advantage of this by programming quite relaxed synchronisation routines for these entities. Though it is necessary to synchronise, say, two users attempting to pick up the same entity at the same time, or to reflect the change of text on a noticeboard to all users in a timely fashion, it is not necessary to synchronise the gentle floating effect of the PlaceWorld artefacts as they hover above the groundplane, nor, as another example, the swaying of the glowing hyperlinks in the sky.

The public exhibition of PlaceWorld at the I3Net conference in Sweden and at the Sixth Doors of Perception conference in Amsterdam demonstrated that the relaxation of synchronisation and the use of subjectivity could be pushed further than we originally expected. Due to a technical problem with the graphics drivers at the time, it became necessary to use an extra rendering client to generate the display on the touch-sensitive panel of the PN-AV device, rather than using the camera as was originally intended: the projected display on the external cylinder showed the user’s view of the world, and the display on the touch screen showed the same view, but augmented with the controls and menus. The idea of relaxed subjective behaviour had always been intended to overcome


network lag where users were geographically distant and unable to see each other’s views into the environment: here, unusually, the same user was able to see one view on the touch screen in front of them, and “the same” view on the projected cylinder. The potential power of the subjective set-up was demonstrated when not a single user of the installation noticed that there were significant variations between the two displays that they could now see simultaneously: where their attention was focused (changes of text, avatars moving around the environment, entities coming and going), all appeared to behave normally. The aesthetics of the floating entities, which were in fact moving independently of one another with no attempt to synchronise the views, went unnoticed. More dramatically, one of the artefacts, a version of QSPACE (Pettifer et al., 2001), is a graphical network much in the same vein as the Web Planetarium described in Chapter 2. To preserve bandwidth, the distributed version of this artefact in DEVA was implemented with the force-placement routines as subjective behaviour, and the weights and connectivities that define the structure as objective and synchronised behaviour. What was overlooked during the implementation was that in the force-placement routines, a small random factor was introduced to the positions of the nodes prior to application of the algorithm, as this seemed to produce more aesthetically pleasing results. The implication of this, however, was that although the logical and gross structure of the resulting 3D network was the same for each run of the algorithm, these small perturbations in the initial state of the network resulted in slightly different placement solutions. For the several weeks of implementation, and during the public demonstration of PlaceWorld, this went entirely unnoticed, even though, with hindsight, it was clear that the various visual representations of the QSPACE being manipulated by the collaborating users were very different.
The fact that the logical structure of the artefact was coherent across views appeared to mask the visual differences in orientation and absolute positioning of the nodes.

3.5.4 Becoming a Place Where Places Meet

PlaceWorld aimed to bring together many diverse virtual environments, not just trivially by allowing a user to “switch” from one application to another, but by making it possible for artefacts created in one place to be used elsewhere, analogous to cutting and pasting a spreadsheet into a word processor document. It was decided to integrate a number of existing virtual environments, including other multimedia art installations, into the PlaceWorld system. Artefacts and applications that are written with the DEVA framework in mind have this facility by their very nature; more challenging, however, was the problem of making legacy applications, such as the Legible City, or the Memory Theatre VR (an installation by Agnes Hegedüs), work within this


framework (this last application, in particular, is interesting since it was written using Silicon Graphics Performer, rather than raw OpenGL, and therefore was already part of its own “system”). Figure 3.7 shows this and other environments embedded within PlaceWorld. MAVERIK’s (Hubbold et al., 2001) agnostic approach to data structures made this integration at the graphical level relatively straightforward, even to the extent of enabling the low-level graphics context to be shared between Performer and native MAVERIK objects, and DEVA’s ability to combine behaviour at runtime based on enforced and imbued characteristics allowed legacy applications such as the Memory Theatre VR to pick up appropriate interaction features such as the menus and the ability to be carried around by a user.


Figure 3.7 Images from PlaceWorld and its embedded environments: (a) the QSPACE; (b) the Legible City; (c) Memory Theatre VR; and (d) the Advanced Interfaces Group’s laboratory.



3.6 Conclusions

The vision of an electronic landscape remains an exciting one, and as graphics technology and higher-performance networking become more affordable and available, there are increasing opportunities for the kind of research and development described in this chapter. The iterative approach of the eSCAPE project has been an effective way of understanding the social, aesthetic and technological problems associated with making these worlds meaningful and useful to the general public.

Acknowledgements

PlaceWorld was influenced in many ways by all the partners and participants in the eSCAPE project: too numerous to mention, this chapter draws upon much of their work and our thanks go to them all. Special thanks are due to Paul Arnold, Annika Blunck, John Bowers, Andy Colebourne, Andy Crabtree, Timo Fleish, Simon Gibson, John Hughes, Andreas Kratky, John Mariani, Gideon May, Craig Murray, Andreas Schiffler, Jeffrey Shaw, Adrian West, and the project’s co-ordinator, Tom Rodden. The work was supported by a grant from the European Commission.


4 Using a Pond Metaphor for Information Visualisation and Exploration

Olov Ståhl and Anders Wallberg

4.1 Introduction

The constantly increasing amount of information and media available in electronic form leads to a growing demand for new methods for searching and browsing. Traditional text-based database queries can be limiting, often requiring a user to know exactly what it is she is looking for and to express this interest using predicates in a query language such as SQL. Furthermore, to be successful in locating the right information, a user will often have to be familiar with the standard interface metaphor of desktop computers and know how to use a mouse and keyboard efficiently.

Approaches to improve access to online information and to visualise it in an intuitive manner have been under development for a long time. Examples of systems that display information graphically in three dimensions include VR-VIBE (Benford et al., 1995a), QPIT (Colebourne et al., 1996) and BEAD (Chalmers, 1993). All of these systems require (at least to some extent) users to navigate in order to access particular information objects, because objects may obscure each other or may be out of view. However, navigation within 3D spaces is known to be difficult (Ingram and Benford, 1995), especially if the navigation needs to be precise.

In this chapter we describe The Pond, a system used to search for and visualise data elements on an engaging tabletop display. The Pond uses methods of unencumbered interaction and audio feedback to allow users to investigate data elements, and supports shoulder-to-shoulder collaboration, with the physical Pond artefact mediating the collaboration between those people gathered around it. The user interface is based on an ecosystem metaphor, presenting data elements in the form of shoals of aquatic creatures inside a virtual 3D pond. The design makes use of a


static view of the information space, making viewpoint navigation unnecessary. Instead the information creatures move and form groups, allowing the user to easily identify related information and to distinguish results from different queries.

The work draws heavily on our experiences in developing two previous systems, and concepts and approaches can be traced back to these systems. The Web Planetarium, a 3D interpretation of HTML structures in the form of a graph (Chapter 2), and the Library Demonstrator, an information landscape constructed from the contents of an online library database (Mariani and Rodden, 1999), were both interfaces to active online information. Common features of these systems were the spatial arrangement of data elements, navigation around these data elements and the introduction of new data elements into the display.

While these systems were successful in presenting information to end users, they were not necessarily easy to use. When using the Web Planetarium, new information was only introduced as a result of explicit interaction (clicking) by the user, and novice users could be too shy to discover this fact and so not load new information. The positioning of this new data could also be problematic. In the Planetarium the user must be observant, otherwise she may miss the introduction of the new data element. The Library Demonstrator addressed this problem using animation and self-organising models to show the emerging relationships between information, and it is this approach we build upon in The Pond with its shoal metaphor. Navigation around the data used a point-and-click method, automatically transporting the user to (or close to) the selected object. The core of this navigation technique was to adopt an object-centric philosophy where users were explicitly freed from the overhead of having to manage their navigation and movement in three dimensions. This restriction of the overall freedom of movement meant that users were able to focus on the exploration of the information space. However, users still had problems navigating the structures, ranging from getting completely lost to not being able to look upon the data in the way they wanted. These systems also supported only single-user interactions and did not encourage a social atmosphere for the exchange of gathered information.

When we started the work on The Pond, the objective was to design a multimedia user interface where users without any prior knowledge or acquired competence would be able to handle both single objects and groups of objects in an affordable and easy way. The objects in turn should be able to represent any type of information or media. They should be able to present themselves in a natural fashion at the user’s convenience. Also, they should disappear in a non-obtrusive fashion when no longer needed. The user should be able to select, move, sort and explore objects of interest without having to confront the rigid hierarchy that is the hallmark of traditional file handling and database applications. Instead of using colour, form or position to indicate group or class


membership, we wanted to use the motion dynamics of objects to indicate these properties. If new information is requested, objects should just float to the surface and present themselves to the user. Objects no longer relevant should slowly sink to the bottom and quietly disappear. The viewpoint should stay fixed above the virtual pond surface, thus making it unnecessary for the users to engage in navigation within the information space.

An observation made early in the design process was the difference in the behaviour of users when confronted with a horizontally placed display vis-à-vis a vertically placed one. When a group of people gathers in front of a vertically placed display – be it a big-screen TV or a wall projection – a somewhat authorial situation immediately tends to develop. The person in control of the display content is perceived as a teacher or lecturer. The rest of the group will play the role of an audience or school class. When, on the other hand, the display or projection area is positioned horizontally, people will gather on a more equal basis. With inspiration from sources as disparate as roulette tables, billiard tables and the military’s “classical” tactical war-gaming board, it was decided that The Pond should use a horizontal display and that none of the sides of the physical artefact should be more important than any of the others. Tabletop displays have been in use for a number of years now, for example in visualising information (Kreuger and Froehlich, 1994), for command and control scenarios (Chase et al., 1998) and for augmenting physical objects (Ishii et al., 1999). The table provides a natural centre for interaction to take place around and encourages collaboration between the users while they are interacting with the table (Benford et al., 2000). The developments in plasma display technology coupled with touch-sensitive surfaces now make it possible to dispense with potentially clumsy projected displays in favour of a neat, compact display. Interaction with The Pond is object based, using physical tags to load information into the application by placing them around the edge of the table (Ishii and Ullmer, 1997; Ullmer et al., 1998). Unlike other approaches where the physical objects are placed directly on the table and manipulated (Underkoffler and Ishii, 1999; Fjeld et al., 1999; Schäfer et al., 1997), the tags used in The Pond are passive. They make it possible to load and store information, but do not manipulate this information any further. Interacting with the contents of The Pond is supported through direct manipulation of the virtual pond objects (Shneiderman, 1983) using the touch-screen display. Typically, interactions with the tabletop systems discussed above make use of stereoscopic glasses, data gloves and magnetic position trackers. These techniques are not used in The Pond, as our aim was to create as easy and direct an interface as possible and not to encumber the user with devices needed to experience the material presented.

In the remainder of this chapter we describe The Pond as it stands today, details of its implementation, and observations from a study of the system in use. We conclude with a discussion of some design choices and their implications for the potential utility of The Pond in different settings.


4.2 The Pond

We now describe The Pond system in detail, examining its construction, both in terms of the physical artefact and the underlying software, and the way in which it is used.

4.2.1 The Pond Ecosystem Metaphor

The Pond’s user interface is based on an ecosystem metaphor. The objective was to present an aesthetic that would hide the work-chore aspect and act as a complementary backdrop for dialogue. When interacting with The Pond, the users see a 3D presentation of a virtual pool or pond (see Figure 4.1), an aquatic environment in which the information objects resulting from queries are presented as shoals of marine creatures. The visual presentation is complemented by a sound environment, consisting of a number of bubbling, splashing and whirling sounds, indicating various active processes inside the pond virtual environment.

Each information creature has a simple navigational behaviour that governs its movements within the virtual pond. The behaviour is inspired by the boid flock model described in Reynolds (1987). The basic steering rules are:

● Avoid collisions with nearby creatures.
● Match velocity with nearby shoal mates.
● Stay close to nearby shoal mates.
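The three rules above can be sketched for a single creature in 2D as follows. The weights and separation radius are invented for illustration; Reynolds' boids model is the actual reference:

```python
def steer(me, neighbours, sep=1.5, w_sep=0.5, w_align=0.1, w_coh=0.05):
    """One steering step for a creature given as (px, py, vx, vy).

    Combines the three rules: separation (avoid collisions), alignment
    (match velocity with shoal mates) and cohesion (stay close to them).
    """
    px, py, vx, vy = me
    ax = ay = 0.0
    if neighbours:
        n = len(neighbours)
        mean_vx = sum(b[2] for b in neighbours) / n
        mean_vy = sum(b[3] for b in neighbours) / n
        centre_x = sum(b[0] for b in neighbours) / n
        centre_y = sum(b[1] for b in neighbours) / n
        for bx, by, _, _ in neighbours:
            dx, dy = px - bx, py - by
            d2 = dx * dx + dy * dy
            if 0 < d2 < sep * sep:          # rule 1: avoid collisions
                ax += w_sep * dx / d2
                ay += w_sep * dy / d2
        ax += w_align * (mean_vx - vx)      # rule 2: match velocity
        ay += w_align * (mean_vy - vy)
        ax += w_coh * (centre_x - px)       # rule 3: stay close to the shoal
        ay += w_coh * (centre_y - py)
    return vx + ax, vy + ay
```

A creature far from its shoal is pulled towards it and towards its mates' mean velocity; one that gets too close to a neighbour is pushed away. Iterating this per creature per frame yields the shoaling behaviour with no shoal-level controller.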


Figure 4.1 The Pond virtual environment.


These simple rules are then combined with additional influences like curiosity (a creature might temporarily break away from its shoal to “examine” its surroundings) and competitiveness (an urge to be in the lead), resulting in a very dynamic and fascinating pond environment, where shoals of information creatures move around in tight groups, avoiding each other as well as the pond walls, and reacting to interactions from the users. The movement of a shoal is entirely dependent on the combined navigational behaviour of its creatures; that is, there is no shoal “intelligence” that determines where the shoal is going or how its creatures should behave in certain situations. The argument for using a flocking algorithm for positioning the information creatures is not based on aesthetics alone. Our earlier experiences in information visualisation, mixed with an ambition to explore a somewhat alternative user interface, made us choose a self-animating system for the data elements: not only animating during the creation/insertion stage, but during the whole time the elements are available. In addition, the human visual system is quite good at separating objects with common velocity vectors, making it easier to identify group belonging in a crowded environment.

The Pond does automatic garbage collection, which means that one or several of the shoals that exist in the virtual pond may be removed. This may happen, for instance, if the environment is too crowded or because a particular shoal has not been interacted with for a long time. A shoal that is selected for removal by the system will sink down towards the bottom of the virtual pond, where it will disappear. However, should a user interact with a sinking shoal, for example by selecting one of its creatures, the shoal will return to the surface and another shoal might be removed instead.
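The sink-and-rescue policy might be approximated as follows. The field names and the least-recently-used selection criterion are hypothetical; the chapter does not specify the exact heuristics:

```python
def choose_shoal_to_remove(shoals, max_shoals):
    """When the pond is too crowded, pick the shoal that has gone longest
    without user interaction and mark it as sinking."""
    if len(shoals) <= max_shoals:
        return None
    victim = min(shoals, key=lambda s: s["last_interaction"])
    victim["sinking"] = True
    return victim

def touch(shoal, now):
    # Any interaction rescues a sinking shoal and resets its timer;
    # the garbage collector will then pick a different victim.
    shoal["last_interaction"] = now
    shoal["sinking"] = False

shoals = [
    {"name": "abba",  "last_interaction": 1, "sinking": False},
    {"name": "miles", "last_interaction": 5, "sinking": False},
    {"name": "bach",  "last_interaction": 3, "sinking": False},
]
victim = choose_shoal_to_remove(shoals, max_shoals=2)   # oldest shoal sinks
touch(victim, now=10)                                   # user rescues it
victim2 = choose_shoal_to_remove(shoals, max_shoals=2)  # next oldest sinks
```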

4.2.2 The Pond Example Application

To investigate the concepts behind The Pond we developed an application that allows users to search for and browse music content on the Amazon.com web site. The music theme was chosen to maximise The Pond’s visual and sonic impact, being a theme common to most potential end-users. However, The Pond is not limited to this domain and is adaptable to other database resources.

A search task is initiated by a user providing a keyword string, for instance the name of an artist, group, or a musical genre. The Pond communicates the keyword to Amazon.com and presents the resulting album hits as a shoal of creatures, each representing one specific album. The virtual creatures are represented by simple geometric shapes, which are texture mapped with the albums’ cover images. By interacting with a creature, users are able to access the album data (e.g., artist name, album and track titles, etc.) and play samples from some of the tracks. The virtual pond itself consists of a 3D model of a deep, narrow shaft that extends down from a watery surface.


Using a Pond Metaphor for Information Visualisation and Exploration


The music-oriented Pond application was selected since we wanted an application that would allow us to explore and evaluate the ideas behind The Pond without having to implement a lot of advanced search functionality, and which would also be interesting and enjoyable to use (e.g., looking for and listening to your favourite music). Furthermore, the size of a “typical” query result was expected to be less than 50 items, which would allow us to present a number of simultaneous shoals in the virtual pond environment without making it appear too crowded.

4.2.3 The Hardware Platform

The physical Pond artefact has the form of a desk, on top of which a large touch-sensitive plasma display is horizontally placed. On top of the display surface is a wide wooden frame with an irregular curved outline, representing the bank of the virtual water pond rendered on the display. The frame is covered by pieces of thick carpet so that users standing around it can comfortably lean over the display when interacting with the virtual pond environment (see Figure 4.2).

Built into the frame are a number of speakers that are used to output various sounds and music samples. The use of audio is an important feature in interacting with The Pond, and the sound system consists of several devices including a sampler, sub-woofer and amplifier. The frame also encloses several RFID tag readers. Each reader is entirely embedded into the frame carpet and uses three light-emitting diodes to indicate its position and state to the users. Users may initiate queries by placing an RFID tag on such a reader. The tag’s identifier, sensed by the reader, will identify a query keyword or phrase and the query will be initiated.

Figure 4.2 The Pond desk.

4.2.4 The Software Platform

The Pond software platform consists of two different components (see Figure 4.3):

● The visualiser application renders the view from the virtual pond environment on the plasma display.

● The pond manager application accepts query keywords from the users and communicates these to the Amazon web server. The resulting information is used to introduce and control shoals of creatures in the virtual pond.

The visualiser and the pond manager are built using the DIVE (Chapter 12) distributed VR system from the Swedish Institute of Computer Science (SICS). The 3D virtual pond is in fact a DIVE virtual world, shared by the visualiser and the pond manager. When the pond manager application is started it loads a file containing a model of the pond graphical environment. Initially the virtual pond is empty; that is, it contains no information creatures, since these are only created as the result of user queries. When the visualiser is started it joins the world created by the pond manager and will receive a state transfer containing the graphical environment. From this point on the two applications share the pond environment, exchanging DIVE messages to notify each other of any world changes (e.g., the introduction of information creatures or creature movements).

Figure 4.3 Overview of The Pond software components: the pond manager exchanges queries and results with the Amazon web server via a web services API, shares the DIVE pond virtual world with the visualiser, and drives the tag readers and speakers.

The visualiser is responsible for detecting user interactions on the plasma display’s touch-sensitive surface. If a user clicks on a creature, the visualiser distributes an interaction event that will be received by the pond manager. After examining the event, the manager determines the appropriate action to take (if any), which typically involves some change to the virtual creature (e.g., a change of appearance or position). The change will generate a new DIVE message, which will be received and handled by the visualiser, thus making the world change visible to the users.
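The round trip above can be illustrated with a toy message bus standing in for DIVE's distributed event mechanism. The real DIVE API differs; the function and message names here are invented for illustration.

```python
import queue

bus = queue.Queue()  # stand-in for DIVE's distributed event channel

def visualiser_on_touch(creature_id):
    """The visualiser detects a touch and distributes an interaction event."""
    bus.put(("interaction", creature_id))

def pond_manager_step(world):
    """The manager examines the event, changes the creature, and announces
    the world change, which the visualiser then renders."""
    kind, creature_id = bus.get()
    if kind == "interaction":
        world[creature_id] = "selected"          # e.g. the frame turns green
        bus.put(("world_change", creature_id))   # visualiser redraws from this
    return world
```

The point of the design is that neither side manipulates the other directly: both only react to messages about the shared world.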

The visualiser uses a number of rendering plug-ins to enhance the visual appearance of the pond environment. A ripple plug-in creates ripples and waves that graphically deform the perceived environment when fingers are pressed on and moved over the touch screen of The Pond. A second plug-in is responsible for generating a caustic lighting effect on all objects in the water. These plug-ins use hooks in the DIVE renderer and operate on the image it generates.

User input in the form of query keywords is handled by the pond manager. Queries are initiated either by direct request, that is, a user placing a tag on a reader, or more indirectly through inferred user interest. The manager continuously monitors the tag readers embedded in The Pond table and immediately senses any change in their status (i.e., RFID tags being added or removed). To get the data requested by these queries the manager uses the Amazon.com web services (Amazon.com, 2002). With its web services program, Amazon.com offers third party web sites and applications the ability to search or browse Amazon’s product database. The manager retrieves information like album titles, artist names and URLs to cover images through the web services using XML over HTTP. This information is then used in the process of creating representations of individual albums and shoals representing more than one hit on a query subject. The pond manager also handles the audio output to The Pond table speakers, which is described further in Section 4.4.
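The XML-over-HTTP query path can be sketched as below. Note that the endpoint URL, query parameters and XML element names are placeholders: the 2002-era Amazon.com web services interface is not specified in the chapter, so this only illustrates the shape of the pipeline, not the real API.

```python
import xml.etree.ElementTree as ET
from urllib.parse import quote
from urllib.request import urlopen

def parse_albums(xml_text):
    """Turn a (hypothetical) XML result into album dictionaries, one
    prospective information creature per album."""
    root = ET.fromstring(xml_text)
    return [{"artist": d.findtext("Artist"),
             "title": d.findtext("ProductName"),
             "cover_url": d.findtext("ImageUrl")}
            for d in root.iter("Details")]

def search_albums(keyword, endpoint="http://example.com/onca/xml"):
    """Fetch album hits for a keyword over plain HTTP and parse the XML."""
    with urlopen(endpoint + "?mode=music&keyword=" + quote(keyword)) as r:
        return parse_albums(r.read())
```

Each dictionary returned would drive the creation of one creature, texture mapped with the image fetched from `cover_url`.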

4.3 Interaction

Users standing around The Pond table are able to interact with it in several ways to perform various tasks. In the absence of keyboard or mouse devices, the users perform most interactions by tapping or stroking the touch-sensitive display surface. Furthermore, prepared tags that are spread out on top of The Pond frame allow users to input information about queries without having to type on a keyboard.

Queries are initiated by placing tags on the tag readers in the table frame. Since each available query tag has a sticker attached to it showing the keyword text or an image of an artist or group, users are able to determine what keyword a certain tag represents (see Figure 4.4). When a tag is placed on a reader, the reader senses the tag’s unique identifier, which is pre-mapped to the keyword. As soon as a query is initiated, an empty shoal appears inside the virtual pond, representing the ongoing query. The shoal is indicated by a circle and a text string specifying the query keyword, as seen in Figure 4.5a. The circle then floats inside the pond environment, bouncing off the walls and avoiding other shoals.

When the query results are initially delivered from the Amazon.com web server, creatures start to appear inside the empty query shoal. Each such creature represents an information element from the query result; in this case a CD album. As soon as a particular result creature has been created, it will begin to move around inside the virtual pond. However, since all creatures resulting from a particular query stay close together, different query shoals are easily identifiable, even in a densely populated pond environment.


Figure 4.4 Query tags representing the keywords “Frank Sinatra”, “Billie Holiday”, “ABBA” and “Kraftwerk”.


Figure 4.5 Actions and events in The Pond: (a) To the right is an empty shoal, representing an ongoing query using the keyword “Nitin Sawhney”. To the left is a shoal of creatures representing a finished query. (b) Two shoals representing the results of queries using the keywords “Zeppelin” and “Queen”. (c) To the right, a selected creature dragged a short distance from its shoal. (d) Playing sample one (out of five available) on the album “Siamese Dream” by Smashing Pumpkins. (e) Three creatures dragged into a creel, bounded by the pond walls and two buoys. (f) Getting albums related to the album “Lamb” by the group Lamb.


When all the results have been delivered and the corresponding virtual creatures created, the shoal circle label will change to display only the query keyword (see Figure 4.5b). Shortly thereafter the shoal circle and label will disappear, leaving only the creatures visible to the users. However, it is possible for users to make the circle and label visible again by tapping on any of the creatures belonging to the shoal. Each shoal member represents an information element that is part of the result of the corresponding query. The information includes the name of the group or artist, the album title, and the URL of the album cover image, as well as URLs to a number of short RealAudio® samples of some of the album tracks. Users are able to access this information by manipulating the creatures in different ways.

A user selects a creature by tapping on it on the display, and is then able to see the information identifying the artist and title of the corresponding CD. This information is presented as a virtual text string, encircling the 3D creature and moving alongside it (see Figure 4.5c). The text will only be visible for a brief period (around five seconds) and will then disappear. The frame of the creature will become green to indicate the selection, and will remain so until changed back to white when another creature is selected.

Tapping once more on an already selected creature makes it float up to the surface and initiates playback of the corresponding album’s first RealAudio sample (see Figure 4.5d). Using the RealAudio player and the sample URL, the audio data is streamed over the network from the Amazon.com web site and output to the speakers embedded in the table frame. By tapping repeatedly on the creature being played, users are able to step through the sample tracks available for that particular album. The text encircling the creature displays the number of the sample being played as well as the total number of available samples.

By default, the viewpoint in the virtual pond environment is placed above The Pond surface, looking down, and at a distance from which the view is always guaranteed to include all the existing creatures. In this way the users are able to get a good overview of all the activity within the environment. However, this also constrains the creature representations to be rather small (as seen on the plasma display), which might make it difficult to identify, from the creature graphics, the CD albums they represent.

In order to allow users to get a closer view of one of the shoals, a zooming mechanism allows for the translation of the virtual viewpoint to a position close to a shoal centre. The viewport is just large enough to encompass the whole shoal, with the benefit of making the creatures in that shoal, together with their associated text strings and images, appear larger. As a result, other shoals may end up out of view, not visible from the new viewpoint position. Another feature of the zooming mechanism is that, while zoomed, the viewpoint is attached to the shoal, which means that it will move as the creatures within the shoal move. In this way the viewpoint will always stay centred on the chosen shoal, even as this shoal changes its position within the virtual pond. While zoomed, the shoal creatures may be interacted with in the same way as before, for example, selected to initiate replay of the music samples.

The zooming mechanism is triggered when a tag is placed on a reader while the corresponding shoal already exists in the pond environment. Removing the tag from the reader will reset the viewpoint to the default overview position. Thus, if a user places a tag associated with the keyword “Dylan” on a reader while a “Dylan” shoal exists, the viewpoint will change to a position close to this shoal, and stay there as long as the tag is on the reader. Whenever the tag is removed, the viewpoint is reset. Only one user at a time can use the zoom mechanism: if a zoom is active while a user initiates another zoom, the viewpoint state will not be overridden and the second request is ignored.
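This single-user zoom lock can be sketched as a small state machine. The class and method names are ours, not DIVE's, and a keyword string stands in for a shoal reference.

```python
class ZoomState:
    """At most one tag owns the viewpoint; further zoom requests are
    ignored until the owning tag is removed and the overview restored."""
    def __init__(self):
        self.owner = None  # keyword of the shoal currently zoomed, if any

    def tag_placed(self, keyword, existing_shoals):
        if keyword in existing_shoals and self.owner is None:
            self.owner = keyword   # viewpoint now tracks this shoal's centre
            return True
        return False               # new query, or a zoom is already active

    def tag_removed(self, keyword):
        if self.owner == keyword:
            self.owner = None      # reset to the default overview
```

Placing a tag for a keyword with no existing shoal falls through to the ordinary query path instead of zooming.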

As users initiate more and more queries, older shoals may have to be removed in order to accommodate newer ones. To prevent a particular creature from being removed from the environment, it is possible for users to move individual creatures to safe areas, called creels. This is done by touching the creature with the finger, and then dragging the finger (and thus the creature) along the surface and releasing it over the creel area (see Figure 4.5e). Creels exist in several places in the virtual pond environment, close to the tag readers. Once inside a creel, a creature is constrained to move only within the creel boundaries. By moving several creatures, possibly from different shoals, into a creel, a selection shoal is formed, consisting of creatures that a user finds interesting for some reason. Since this action signals the user’s interest in a specific data element, it triggers extra functionality: a more focused query is automatically launched pertaining to the corresponding album (usually resulting in fewer hits and consequently a smaller shoal), thus further populating the environment (see Figure 4.5f). In the case of Amazon.com the system issues a related-albums query, resulting in items that have a high chance of being interesting to the user as well. The creatures in the creels may be interacted with in the same way as other creatures (e.g., tapped on to play music samples), the only difference being that they won’t be removed from the environment as long as they stay inside the creel. Creatures that are dragged out of a creel will return to their native shoal, or form a new shoal if their native shoal no longer exists.
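Dropping a creature into a creel thus has two effects: it becomes safe from garbage collection and movement-constrained, and a focused related-albums query is fired. A minimal sketch, treating a creel as an axis-aligned box (a simplification: the real creels are bounded by the pond walls and buoys) and using invented field names:

```python
def drop_into_creel(creature, creel, launch_related_query):
    """Mark the creature as safe and trigger a focused follow-up query."""
    creature["creel"] = creel       # garbage collection now skips it
    launch_related_query(creature["album_id"])

def clamp_to_creel(x, y, creel):
    """Constrain a creel creature's movement to the creel's bounds."""
    x0, y0, x1, y1 = creel
    return (min(max(x, x0), x1), min(max(y, y0), y1))
```

The clamp runs every movement step for creel creatures, so the flocking rules still apply but can never carry a creature back out of the creel.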

By using a recordable tag, it is possible for a user to save the contents (i.e., creatures) of a creel selection shoal. A recordable tag is an RFID tag which is not pre-mapped to a search keyword. Placing such a tag on a tag reader next to a creel creates an association between the tag and the creatures within the creel area. When the tag is then removed from the reader, the creel’s selection shoal will disappear from the environment and may be regarded as being stored on the tag. By placing the same tag on a reader at a later stage, the creatures of the “saved” selection will reappear, added to the corresponding creel shoal. Thus, creels and recordable tags allow users to save references to one or several albums of particular interest, to gain quick access to these on a later occasion, and to share references with each other.
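The save/restore cycle can be sketched as follows. We assume the album references live in The Pond's own store keyed by the tag identifier rather than being written to the tag itself; the wording "may be regarded as being stored on the tag" suggests this but does not state it, so treat it as an assumption.

```python
tag_store = {}  # tag identifier -> saved album references

def recordable_tag_removed(tag_id, creel_shoal):
    """Removing the tag saves the creel's selection shoal and clears it."""
    tag_store.setdefault(tag_id, []).extend(creel_shoal)
    creel_shoal.clear()  # the selection shoal disappears from the pond

def recordable_tag_placed(tag_id, creel_shoal):
    """Placing a used tag makes the saved creatures reappear in the creel."""
    creel_shoal.extend(tag_store.pop(tag_id, []))
```

Because the state follows the physical tag identifier, handing the tag to someone else hands over the selection, which is what makes sharing work.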

4.4 The Pond Audio Environment

The sonic environment of The Pond consists of two different parts, the soundscape and the interface sounds. In accordance with the ecosystem metaphor, the nature of the soundscape is founded on a family of aquatic whirling sounds and a deep, obscure mumbling, giving the impression of the data elements ascending from an abyss of ooze or mud. These ambient sounds fade out when samples from a selected creature start to play, and fade smoothly back in when the music stops. The interface sounds act as a feedback mechanism to indicate user interactions like selecting/unselecting, clicking, dragging, etc. This collection also originates from a number of concrete water sounds. Samples of different types of bubbles, a dripping tap, whirlpools, etc. are heavily processed to suit their particular purposes. Examples include:

● When a user initiates a query, the appearance of the query shoal is accompanied by the sound of sluggish bubbles rising from the bottom.

● When a sound file is retrieved over the network, the waiting time is masked with a bubble vortex that after a few seconds is smoothly merged with the music sample.

● When the user removes a tag from a reader, the action is accompanied by the sound of a cluster of bubbles being rapidly inhaled by The Pond itself.

● When the user draws his finger across the touch screen to drag an object, a glass organ sound is heard, reminiscent of drawing a finger along the damp edge of a crystal wineglass.

● The visual zooming in on a shoal is illustrated by the familiar bubble sound gradually magnified through a lowering of the pitch. When zooming out, the process is reversed.

Every time an interface sound needs to be played, the system will randomly choose a sound from the collection of sounds available for that particular interaction. There are ten query sounds, ten RFID sounds, ten dragging sounds, etc. The idea is to give the impression of the sonic interface being somewhat organic and unpredictable.

The precise spatial placement of every sound is achieved through a built-in high-quality four-channel sound system. A subwoofer in the table foot produces a deep and suggestive bass. The computer-controlled software mixer makes it possible to physically move sounds around and to create expressive musical gestures.
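The "organic" variation comes down to a random draw from a per-interaction bank of ten variants; the bank keys and file names below are invented placeholders.

```python
import random

SOUND_BANKS = {  # ten processed water-sound variants per interaction type
    "query": ["query_%02d.wav" % i for i in range(10)],
    "rfid":  ["rfid_%02d.wav" % i for i in range(10)],
    "drag":  ["drag_%02d.wav" % i for i in range(10)],
}

def pick_interface_sound(interaction):
    """Choose one variant at random so repeated actions rarely sound alike."""
    return random.choice(SOUND_BANKS[interaction])
```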


4.5 Observations from Use

To test The Pond, a number of sessions were held with external participants, in the expectation that some qualitative assessment of the effectiveness of the system could be made. After being given an introduction to The Pond and the example application by one of the developers, the participants were told to explore the system using the available tags on the table frame. The focus of the studies was on the technology-in-use (Button, 1992), that is, the sequences of interaction within which The Pond came to be used in real time. The developers stayed close the whole time to answer questions and to help out when problems occurred. Most sessions involved two or three participants and lasted for about an hour. Video was used to record each session, and from this material a number of observations were drawn.

The general impression is that the participants had very few problems in understanding the basic principles of The Pond design and its operation, for example, how to use the tags to initiate queries or how to interact with the creatures in order to play sample music. On very few occasions did the developers have to repeat the instructions given at the start of the session. The Pond is engineered to support hands-on experience and ease of use, and after only a short time most participants did engage in various activities, together and alone, such as selecting CDs, listening to samples, transferring CDs to tags, and transferring CDs from tags. During a discussion after one of the sessions, a participant stated: “The main good thing about The Pond is that the interface doesn’t require any special level of computer skills for using it.”

Not only did the participants get the hang of The Pond fairly quickly, but they were also able to envisage its use for practical purposes in everyday settings. For example, one of the participants suggested:

“Place it in a Virgin Megastore, it can be used as a jukebox. Perhaps you should specialise it more so it fits into a category like film, books, articles, and radio. And you should be able to tell The Pond which CD you want and then be able to put it into your shopping basket.”

In some sequences of interaction it is noticeable that the sound worked as functional feedback. After the participants had recognised the connection between a function and its sound, the sound supported the user in interacting with The Pond. This can be seen in a sequence where a participant drags a creature out from a creel: the sound heard as the dragged creature passes the creel boundary makes it clear to the participant that she has succeeded and the creature is released.

The problems that participants had during sessions were mostly related to the hardware, and especially the touch screen. A serious drawback of the large-area touch screen technology we currently use is its inability to detect multiple simultaneous touches.


This sometimes caused conflicts when more than one participant was trying to interact with creatures at the same time. For instance, it was impossible for two users to drag creatures at the same time. Also, if one user was dragging a creature while another user tried to select a different creature, the selection would most often fail. The reason is that when two positions on the touch screen are pressed simultaneously, a position in between the two is returned by the driver software, most likely causing the second user to miss the intended creature. Future multi-user touch screens will alleviate this shortcoming. In the present set-up we try to work around the problem by making most interactions single-click based, dragging being the exception. Another problem with the touch screen was that of sensitivity. Sometimes when the participants were dragging creatures they seemed to unintentionally lose contact with the touch surface, causing the creature to stop following the finger and return to its shoal. What happened was that even though their finger was actually in contact with the surface, they didn’t put enough pressure on it, which made the driver report that the interaction had ended. A number of “phantom clicks” (i.e., the driver reported a click when there was none) were also experienced, which had a similar effect on the interaction. We expect most of these hardware problems to disappear in the future as the technology matures.
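This failure mode is easy to model: a single-touch controller reports roughly the centroid of all simultaneous contacts, so the second user's tap lands where neither finger actually is. A toy model of the driver's behaviour (not the real driver code):

```python
def reported_position(contacts):
    """Toy model of a single-touch driver: it can only return one point,
    effectively the centroid of all actual finger contacts."""
    xs = [x for x, _ in contacts]
    ys = [y for _, y in contacts]
    return (sum(xs) / len(xs), sum(ys) / len(ys))
```

With one user dragging at (100, 40) while another taps at (300, 200), the driver reports (200, 120), missing both intended creatures.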

One of the software mechanisms that caused some problems was zooming. It was evident from the video material that zooming could cause confusion if performed by one participant while others were trying to interact with creatures in shoals different from the one being zoomed (as these shoals would disappear from the view). Zooming probably requires synchronisation of the users’ activities in order to be useful.

As it turned out, the navigational behaviour of the information creatures became a source of fascination for many of the participants when encountering The Pond, and it was not uncommon to see them standing silently at the side of the display for some time, simply watching the creatures swim back and forth.

4.6 Discussion

One of the most frequently expressed comments from users was the desire to be able to search The Pond in a more dynamic fashion, that is, to issue custom search queries, which is not possible through the use of the tags at the moment. The seriousness of this limitation depends of course on the application and its intended use. The Pond is not primarily designed to be used for the task of quickly locating and browsing arbitrary known information items on the web, in which case the use of the pre-configured query tags would probably be unacceptable. Instead, the focus is more on applications where the size of the data set is fairly small (i.e., a product catalogue), so that the query tags more or less cover the whole search space. In addition, if the database is hierarchical or allows items to somehow contain references to each other, the users can traverse the data not only by issuing direct queries using the tags, but also by exploring the creatures already existing within The Pond (following the inter-creature links to discover other “parts” of the data space). In such settings the tags can be seen as a way to introduce information objects that probably won’t be the end result of a user’s query, but merely a starting point for further exploration. Examples include using The Pond in a record store setting where the top 100 albums are to be displayed and played, or in a hotel lobby where menus from local restaurants, museum information, etc., can be examined in a collaborative fashion. In the Amazon.com data source set-up, a means of search refinement and database traversal was available through the use of related-album queries. This introduces items that have a high chance of being interesting to the user. Combined with the fact that items considered uninteresting will sink and disappear, this has the effect of gradually refining the ecosystem to contain more and more interesting items. So even though very specific selections are used as “seed” material, the contents of the environment might develop over time to contain a mix of several general tastes, specific favourites and complete wildcards.

Our aim when designing the interface was a non-intrusive form of interaction, avoiding the introduction of keyboard-type input. The RFID tags do serve well in cases where a hierarchical, finite and discrete database structure exists, as with the music database in our case. Notice that the environment is for the most part populated by relations and not exact matches, so we force exploration on the users. It proved at times frustrating for users not to be able to directly summon the artist or track of their choice, and at times gratifying to discover alternative music to their liking. A solution that would allow users to input exact search keywords or query phrases without entailing the use of some sort of keyboard function would be to use voice as input.

An issue that came up during the user studies was the possibility of modifying or even replacing the example application to add new functionality or to support different types of media. For instance, instead of saving references to albums onto recordable tags, one participant wanted to be able to save the actual sample data onto some kind of portable device, for example, a PDA, which could then be used for playback. This could be achieved by using some kind of point-to-point (e.g., infra-red or Bluetooth) connection to transfer the audio data from The Pond to the PDA. Also, since DIVE runs under PocketPC™ it would be possible to develop a PDA application which would join the pond virtual world and so be able to access the information (e.g., music samples) represented by the virtual creatures. Modifying The Pond to support a different type of content media, or even combinations of different media, would require changes to the pond manager application (see Section 4.2.4 above) in two different ways. First, it would require the pond manager to be able to communicate and issue queries to a database service different from the one provided by Amazon.com (assuming that Amazon.com does not support access to data elements of the “new” media type). This would be fairly straightforward since the code responsible for handling queries is isolated in one specific module, which can easily be replaced. It is also possible to add a second (or further) module in parallel with the first if there is a need to connect to more than one database service during a session. Secondly, supporting new types of content media would require new rendering mechanisms to present the creatures to the users when they are interacted with. As is the case with the current Amazon.com web service version and the RealAudio player, special information rendering that goes beyond small pieces of text, images and texture-mapped videos will probably need external renderers. These can be in the form of DIVE plug-ins or external applications (e.g., Windows Media Player to render video material). The pond manager application defines a number of events (e.g., OnClick) that are fired when creatures are interacted with and details of their associated information need to be displayed (e.g., the album title in the Amazon.com application). By modifying or replacing the current handling of these events in the pond manager code, it is possible to change the way the creatures present themselves to the users.

One of the consequences of using a static view into the pond envi-ronment is that the display space is restricted, thus limiting the numberof creatures that can be presented simultaneously. As described earlier,garbage collection can alleviate the problem to some extent by removingold shoals when new ones need to be introduced, but it does not help if the new query result is too large to fit into the virtual pond by itself. Making creatures smaller in size will make more of them fit before the pond becomes to cluttered, but also makes it harder to dis-tinguish the album images, read the text strings, etc. One possible exten-sion of the shoal concept that could help in presenting large data sets ishierarchical shoals. Assume that a shoal could be made up of sub-shoalsas well as information creatures, and that a sub-shoal was initially rep-resented by a single graphical object, just like an ordinary creature. Whenclicking on a sub-shoal object it would expand to display all of its elements, some of which might themselves be sub-shoals, and so on. Anexpanded sub-shoal might collapse into the single sub-shoal object aftera certain amount of time, or possibly as the result of an explicit userinteraction (e.g., clicking once more on the sub-shoal object). In this wayit would be possible to make available a large number of informationcreatures to The Pond users without having to display them all at thesame time. The rules governing the creation of sub-shoals are probablyapplication specific, that is, the criteria used to determine when to createa sub-shoal and which creatures to add to it. Another possible solutionto handle large query results would be to make each shoal present onlya limited set of the resulting creatures at one time, and then allow theusers to somehow (via some shoal interaction) make the shoal present



the next set of creatures, and so on. This is the method used by most Internet search engines, where the user is presented with a list of about ten hits and is then expected to use the “next” or “previous” links to move forward or backward in the result material.
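This search-engine-style paging could be sketched as follows (a simplification with invented names; the real shoal interaction for requesting the next set is left open in the text):

```python
# Sketch of presenting a large query result one page at a time, as most
# search engines do; a shoal would show only the current page of creatures.

class PagedShoal:
    def __init__(self, creatures, page_size=10):
        self.creatures = creatures
        self.page_size = page_size
        self.page = 0

    def current(self):
        """Creatures currently presented by the shoal."""
        start = self.page * self.page_size
        return self.creatures[start:start + self.page_size]

    def next(self):
        """Advance to the next page, if one exists."""
        if (self.page + 1) * self.page_size < len(self.creatures):
            self.page += 1
        return self.current()

    def previous(self):
        """Move back to the previous page, if one exists."""
        if self.page > 0:
            self.page -= 1
        return self.current()

results = [f"creature {i}" for i in range(23)]
shoal = PagedShoal(results, page_size=10)
print(len(shoal.current()))   # first page of about ten hits
shoal.next()
shoal.next()
print(shoal.current())        # last, partial page
```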

4.7 Summary and Future Work

We have presented The Pond, a multi-user system for browsing information (currently CD record data) on the Web using an engaging table-top display. Users input musical search keywords (typically names of artists or groups) using RFID tags and are presented with matching Web information in the form of shoals of aquatic creatures inside a virtual 3D pond. The virtual pond environment is presented on a big touch-sensitive plasma display, which is placed horizontally to better support shoulder-to-shoulder collaboration between those people gathered around it. By touching the surface of the display users can interact with the creatures to access the information they represent, for example, play music samples from the corresponding CDs.

A series of user studies has provided initial insights into the utility of The Pond. The results indicate that the device and metaphor are easy to understand and use, but also identify a number of problem areas. For instance, the touch-sensitive display does not currently support multiple simultaneous interactions, which sometimes caused users’ actions to interfere with each other.

The Pond has been demonstrated to members of the research community as well as to the public on numerous occasions. The feedback and observations from these sessions have been used to refine The Pond in an ongoing process of evolution. Future ideas include support for multiple simultaneous interactions on the display surface, using a voice input system for entering search keywords, and using PDAs to extract information from and input information to The Pond environment.

Acknowledgements

We would like to thank Lennart E. Fahlén, Jan Humble, Jenny Lundberg and Jonas Söderberg for their participation in the development of The Pond system. We would also like to thank Adrian Bullock for his contribution in documenting The Pond work, as well as Andy Colebourne for his Pond-related efforts within the eSCAPE project. This work was undertaken as part of the Electronic Landscapes Project (ELP), funded by the Swedish Research Institute for Information Technology (SITI). We would like to thank all those who have experienced The Pond and provided us with invaluable feedback. The Pond work has also been described by Ståhl et al. (2002).


Part 3: Mixed Reality Environments


5 City: A Mixture of Old and New Media

Matthew Chalmers

5.1 Introduction

The majority of the chapters in this book describe what might nowadays be called “traditional” inhabited information spaces: collaborative virtual environments (CVEs) or virtual worlds. Although not all CVEs centre on computer-rendered 3D graphics, the paradigmatic CVE does so. Shared 3D virtual environments are emblematic of CVE research, but have only gained public acceptance in the form of computer games. The focused engagement in such games is designed to fit with the closed world of the virtual environment. A player can become immersed in a game – closed off from the “real” world – by attention as much as by apparatus. A personal computer (PC) at home can be as engaging as the head-mounted displays and immersive projection technologies in research labs. However, even a single-player non-networked game may be a resource for social interaction, for example played by one person while friends and family shout advice from the sofa, order pizza by phone and slip into the kitchen to get more drinks. A computer game is a resource for far more social interaction than the software’s architecture may suggest. In general, the wider context of use is hardly modelled or represented in the system. Games’ internal data structures are designed to be decoupled – that is, closed off – from the other media people use in everyday life, and this decontextualised design approach has worked well in this domain.

Although many households, schools and workplaces have computers that could support 2D or 3D virtual environments, such CVEs are rarely used as a medium of family interaction, education or work. One reason for this may be the decoupling from the more traditional interactions of family members, the overall educational activities of the school and the business of the workplace. The information within the CVE would have to correspond with each user’s wider context, and this would require sensing and tracking of users’ activity in media beyond that of the computer. To paraphrase part of this book’s introduction, a large amount of


our activity relies on the knowledge of what other people do, and what people do in the home, street, school and workplace involves many non-computational media. However, CVEs are designed on the basis of narrowly focused engagement, decoupled from their users’ wider context of family, friends, learning and work. Other than in games, CVEs’ decontextualised design approach has not led to popularity or widespread use.

As the introduction to the book also points out, a number of researchers have begun to work on inhabited information spaces that are more “out in the world” than traditional CVEs. A rhetorical example the editors give is “a system that enables co-located groups to co-operatively work with information by using a display projected onto physical artifacts”. An IIS may include tangible artefacts in more traditional media, such as urban models and interaction devices made from wood, wire and plastic (Underkoffler and Ishii, 1999), or tiles and pages made from toner and paper in augmented reality systems such as MagicBook (Billinghurst et al., 2001).

The work discussed in this chapter is intended to go further in this direction. As part of the Equator interdisciplinary research collaboration (www.equator.ac.uk), the City project explores digital spaces that are peers with others, rather than a digital space that is the primary focus or locus of activity. For example, in our system, one person’s use of a 3D VR model of a museum exhibition is coupled with another person’s use of hypermedia describing the exhibition, as well as a third person’s use of the traditional “bricks and mortar” exhibition. No one of the three is primary; each is part of the context of the other two. We combined the media of traditional exhibitions, mobile computers, hypermedia and virtual environments in one design, and support interaction between people using different subsets of this heterogeneous set of media.

The project was initially theory led. A number of theoretical issues were outlined in a discussion document, and then exemplified by scenarios of technology use. As the project grew and developed, our theoretical issues, design scenarios, system development, evaluation of pilot trials and observational studies all affected each other. Although I sometimes campaign for the rule of theory, not one of these areas is primary. Instead, each is part of the context of the others. The project has always aimed to get out into the streets of the city, but we decided to begin our work in a more controlled setting: the Mackintosh Interpretation Centre, a permanent exhibition devoted to the life and work of Charles Rennie Mackintosh (1868–1928). Mackintosh was a Glasgow architect, designer and artist, and several of his buildings and other institutions related to his work stand in the city. Often simply called the “Mack Room”, the Centre comprises textual and graphical displays with some original artefacts, as well as over 20 screens presenting video and interactive digital material. The Mack Room is in The Lighthouse, Scotland’s Centre for Design, Architecture and the City (www.thelighthouse.co.uk).


More generally, we are exploring the way that digital information is just another part of an interwoven set of media, the use of which constitutes inhabiting the city. Unlike traditional CVEs, we aim for systems coupled with and contextualised in everyday activity, and hence in accord with contemporary theory of the use of language and space. A basic theoretical premise is that we can only use digital media because of such interweaving and interdependence, and we are looking for ways to increase and take advantage of the interdependence of traditional and new media. This theoretical standpoint is set out in the next section of this chapter. The subsequent section outlines a system design based on this standpoint, followed by a discussion of user experience in trials of an implemented research prototype. Some details of our ongoing and future work are then outlined before a concluding section that offers more general reflections on the project.

5.2 Theory

This section focuses on conceptions of space and the media associated with work, and on how we often conceive of space as a medium that stands above or apart from others. It is this usually implicit assumption that lets us talk of information spaces as being “inhabited”. I would like to present an opposing view that treats information spaces as merely one medium among the many used in everyday life. My approach is based on experience with information visualisation and virtual environments, as well as some borrowing from structuralist semiotics (Saussure, 1983; Nöth, 1995) and philosophical hermeneutics (Grondin, 1994). Part of this latter work was set out in a recent paper in the Journal of Computer Supported Co-operative Work (Chalmers, 2003).

When discussing work, and designing systems for remote collaboration, we all too often concentrate on emulating the spatial aspects of the workplace, for example modelling spatial forms and supporting remote communication that appears to be like face-to-face interaction. All design has to be biased in some way, and the bias towards space in CVE research may be due to its being technologically led more than sociologically, semiologically or philosophically led. The arrival of cheap graphics hardware and the eye-catching novelty of 3D images gave rise to a good deal of work that focused rather narrowly on the construction of rather decoupled and decontextualised information spaces. This is true of some of my own work over the past twelve years, ranging from Chalmers (1991) to Morrison et al. (2002), and there are strengths, weaknesses and alternatives to such a bias (Chalmers, 1999).

Newer technologically driven research is weakening or revealing CVE research’s implicit assumptions of space’s primacy and independence. Many of the characteristic design principles and assumptions were established before the current fashion for mobile computers and ubiquitous


computing. Nowadays, it is possible to obtain tolerable frame rates for 3D graphics on a wirelessly net-connected handheld computer. A person can thus be walking down a city street with a friend, chatting about a museum they intend to visit, while simultaneously watching the avatar of another friend moving through a CVE – with that “remote” friend also taking part in the conversation. In this case, it would seem difficult to claim that the person “inhabits” the information space. One might ask whether the person is in digital or virtual space, or in real or physical space, but the question is based on two false dichotomies: digital media are no more or less real than older media, and computers are just as physical as buildings and books.

The workplace has always been affected by communication with people in other locations. Many traditional, everyday and non-digital media support remote interaction, for example letters, books, maps and the landline telephone. There are already digital media in the contemporary workplace that support remote interaction, such as email and mobile telephones. Nowadays, why do we not speak of “entering cyberspace” when we use email, as people did a decade ago? Why do we not inhabit telephone space, or speech space, or MacDonald’s employee name badge space? I suggest that a principle from philosophical hermeneutics is useful here: we don’t talk about these technologies in such marked ways because we have appropriated them into our everyday life and language. We no longer “enter cyberspace” because email is so interwoven in our everyday life and familiar in our experience that we don’t need to mark it out in such a way. We don’t inhabit telephone space because we understand telephones, in particular how to present ourselves through them and how to present ourselves to “spectators” nearby who can perceive our use of them. We only “inhabit” virtual worlds because their designs are so new and decoupled from other media. Experience and understanding of such coupling lets us focus on the task of communication, not on the tool for communication, just as a carpenter engaged in his work focuses on hammering and not the hammer.

We continually mix and couple media in our everyday communication: walking, gesturing and pointing while talking, and referring to places and what people did in them as one writes. Space is an essential part of this mix. It has its unique characteristics that differentiate it from other media, but it has no privileged position above or apart from them. It does not stand alone as a paragon for computational media to emulate. More generally, a medium cannot be fully used or understood in an isolated or “singular” way. People’s activity continually combines and cuts across different media, interweaving those media and building up the patterns of association and use that constitute experience and understanding. A person’s work or activity may be influenced by the configuration of space around them and the interactions that space affords, but also by books, telephones, hypermedia, 3D computer graphics and so forth. People act and work through the full range of media they have


ready to hand. A narrow emphasis on space as the paramount resource for activity underrates the influence of other media. Recent technological developments, such as mobile phones and email, heighten or highlight a phenomenon already familiar in the use of older media such as written text, maps and cinema, and well explored in disciplines older than computer science.

For example, a city’s meaning is not just in its bricks and mortar, but also in our understanding and use of the information about it. At any time, one is likely to have symbols in a number of heterogeneous media available for interpretation and use. As I walk through a train station towards a city square, the map in my hand, the voice of a friend on my mobile phone, the signs informing me of exit routes and the posters advertising exciting shopping opportunities are all open for my interpretation and action. Temporally, symbols in an even broader range of media influence me, as my activity is influenced by my past experience and my expectations of the future. Past experience may include my previous visits to that city, my browsing of a web site with good maps to print out, and my experience of magazines, books and films about urban life, and so forth. My language and culture, spanning media old and new, affect me as much as the immediate perception of spatial form. Since Heidegger and Saussure, a fundamental tenet of philosophy and linguistics has been that language is constituted by all the symbols and all the media one uses, with each symbol interpreted through immediate individual perception as well as past social experience. Contemporary neurophysiology is in strong accord with this view (Churchland and Churchland, 1998; Edelman and Tononi, 2000), as is the field most obviously related to the design of space, architectural and urban design (Leach, 1997).

The differences between media are usually very obvious. We can characterise media and treat each one as if it were an isolated, individuated entity because of the senses we use in perceiving each one, and also because of our understanding of how to relate and to distinguish examples of each one. For example, it is easy to distinguish the spoken word “red” from the written word red because of the senses one uses in each case. Despite having the same letters, it is easy to distinguish tar from rat by looking at the order of letters within each written word. Simple rules about what one can immediately see, hear, etc. within a word begin to strain and then break when one considers, for example, how we distinguish homonyms such as rose. The written word rose can mean many things, including a flower and having risen. When spoken, the same syllables can also mean linear structures (rows), about or belonging to fish eggs (roe’s), moving in a boat (rows), small deer (roes) and multiple occurrences of the Greek letter ρ (rhos). The word’s usage is understood through its context – one’s understanding of the other symbols co-occurring with its use – rather than perception of the word’s pattern of syllables or letters.


Context becomes progressively more important as we turn from thinking about the differences between media, and distinguishing symbols, to considering the similarities of media and the relatedness of symbols. For example, the spoken word “red” and the written word red are related because we can use either of them in the context of rose blooms, fresh blood, the former USSR and so forth. We understand, relate and differentiate symbols through experience of contexts of use within a culture. As shown in the early twentieth century by Saussure, this understanding is not solely dependent on the form or medium of each symbol, but also on how we use each symbol in the context of other symbols – and this context includes symbols in other media.

Taking fuller account of the interdependence of media enriches our understanding of space and of work. Considering how to make systems that are consistent with this standpoint opens up new possibilities for technology design and for computer-mediated social interaction. More particularly, it opens up a wealth of approaches based on coupling and contextualisation. For example, a museum exhibition might be associated with a set of web pages so that walking into a room devoted to a particular architect triggers the display of text describing the life and work of that architect. Similarly, reading the text might trigger display of a map or visualisation of the room, affording access to a structured collection of blueprints, design sketches and building models. The space of the room would be coupled with the text of the page, with each becoming part of the context of the other. In terms of social interaction, a person walking into the room might be made aware of a friend’s reading of the web page, and hence be open to conversation about the exhibition despite the two people being geographically remote from each other.

Our intention is to support social interaction, as is familiar in traditional museums, where co-visitors use awareness of each other’s interaction with exhibits as a resource for their interaction with each other, and use interaction with each other in interpreting the exhibits (Falk and Dierking, 1992; Galani and Chalmers, 2002). The City project explores the coupling of new and traditional media, weaving them together to form resources for social interaction and interpretation.

In particular, our 2002 experiment explored social interaction between people in different locations and contexts where, by definition, they have different resources at hand. As they discuss and refer to contextual information, heterogeneity of media is inevitable: one person can use the non-digital resources of his or her location while others have only digital representations of that location. A case that is more easily handled is audio: each person will hear his or her own voice, and sounds from other nearby sources, differently to others because of the digitisation and transmission of audio, but we have become relatively accustomed to handling this. A much more challenging heterogeneity is that of people’s position, orientation and gesture within rooms and buildings. For example, the Mack Room presents much greater visual and tactile richness than


the room’s digital representations, e.g. maps and VR models. Unlike most earlier CSCW research, the City project addresses this inevitable heterogeneity by coupling media together, tracking activity in each medium and representing it in the others, and so letting participants interweave these media in their social interaction.

5.3 System

This section outlines our prototype system, beginning with its infrastructure: the EQUIP platform, the Auld Leaky contextual link server, the VR Juggler framework and the Bristol ultrasonic positioning system. More detail of this system can be found in MacColl et al. (2002).

The EQUIP platform is being developed within Equator to support information sharing between heterogeneous devices. It provides a run-time infrastructure to support interoperation between Java and C++, and supports extensibility, for example dynamic loading of C++ classes. The University of Nottingham is leading the development of EQUIP, with contributions from the various Equator projects. City uses it as a blackboard architecture through which VR Juggler, Auld Leaky and the Bristol ultrasonics interoperate. Data items representing user context, an underlying spatial model and context-dependent content are stored for manipulation by City clients and services. Additional EQUIP facilities support real-time 3D graphics and mathematical operations, and abstract, renderable scene graph nodes. In addition, interfaces between EQUIP and a number of other systems have been developed, including the University of Iowa’s VR Juggler.
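The blackboard style of interoperation can be sketched roughly as follows (a minimal stand-in with invented names, not the EQUIP API, whose actual Java/C++ interfaces are not reproduced here): clients put typed data items into a shared space, and other components react to additions and updates.

```python
# Minimal blackboard/data-space sketch, standing in for EQUIP-style
# interoperation. Clients put items keyed by (user, kind); listeners
# registered for a kind are notified on every add or update.

class DataSpace:
    def __init__(self):
        self.items = {}       # (user, kind) -> value
        self.listeners = {}   # kind -> [callable(user, value)]

    def subscribe(self, kind, callback):
        """Register a callback for additions/updates of a kind of item."""
        self.listeners.setdefault(kind, []).append(callback)

    def put(self, user, kind, value):
        """Add or update an item, notifying interested components."""
        self.items[(user, kind)] = value
        for callback in self.listeners.get(kind, []):
            callback(user, value)

space = DataSpace()
seen = []
# e.g. a map client rendering every visitor's position
space.subscribe("position", lambda user, pos: seen.append((user, pos)))
space.put("Vee", "position", (3.5, 12.0))
space.put("Dub", "position", (1.0, 4.0))
print(seen)
```

The point of the blackboard style is that the ultrasonics, the link server and the renderers never call each other directly; they only read and write shared items.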

VR Juggler is used as the renderer for 3D graphics. It is described as a virtual platform for virtual reality (VR) application development. It is a high-level, cross-platform architecture supporting immersive and non-immersive presentations. Both UCL and Nottingham have immersive projection facilities, and the UCL facility has been used for development and pilot trials in the City project. The 3D graphics rendering is used to provide an analogue to the traditional exhibition space visited by traditional visitors. For World Wide Web visitors, the space is represented as a 2D map. We also require a presentation of the information in the exhibition displays, and this is provided by Auld Leaky.

Auld Leaky is a lightweight contextual link server being developed within Equator to store and serve hypermedia structures, using context to filter query results. The model used to define the structures is capable of representing a variety of hypermedia domains: link-based, spatial and taxonomic. Auld Leaky is being developed by the University of Southampton, and is written in Perl although it has a Java API. Information is encoded as an XML linkbase, loaded into Auld Leaky and queried using HTTP. The text and images of the hypermedia were taken from the Mack Room’s catalogue. Contextual queries are used to
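Contextual filtering of a linkbase can be sketched as follows (a simplification with invented fragment data and matching rules; the real Auld Leaky query model and its HTTP interface are not reproduced here): each hypermedia fragment carries context attributes, and a query returns only the fragments matching the visitor's device and location.

```python
# Sketch of context-filtered linkbase queries, loosely in the spirit of a
# contextual link server. Fragment data and matching rules are invented.

linkbase = [
    {"text": "Mackintosh's early watercolours...", "location": "timeline", "device": "any"},
    {"text": "Short caption for the timeline.",    "location": "timeline", "device": "handheld"},
    {"text": "The Willow Tea Rooms commission...", "location": "tearoom",  "device": "any"},
]

def query(linkbase, location, device):
    """Return the text of fragments matching the visitor's context."""
    return [f["text"] for f in linkbase
            if f["location"] == location and f["device"] in ("any", device)]

# A web visitor at the timeline gets the general fragment; a handheld
# visitor would additionally match handheld-specific fragments.
print(query(linkbase, "timeline", "web"))
print(query(linkbase, "timeline", "handheld"))
```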


generate location- and device-specific content to be delivered by the Apache World Wide Web server and servlet engine. For 2D or 3D renderings, a location can easily be derived from the position of a visitor’s avatar or map marker. For the mobile computer, an ultrasonic system is used to provide position information.

The University of Bristol is developing a low-cost indoor positioning system using a combination of radio frequency (RF) and ultrasonics (Randell and Muller, 2001) as part of its contribution to Equator. The system uses a single RF transmitter for synchronisation, with ultrasonic transmitters for positioning. The ultrasonics transmit at known intervals after the RF, and are received by a handheld or wearable receiver. Each second, the variations in flight time of the ultrasonic transmissions are used to calculate the spatial position of the receiver. The receiver incorporates a magnetic compass to provide orientation information. The City project installation involves eight ultrasonic transmitters covering the approximately 10 m by 20 m area of the Mack Room. The room is a challenging environment for ultrasonics, as it is split into two large areas by a partial “time line” wall and has some areas set up as cubicles within which ultrasonic reception is virtually impossible. For aesthetic and coverage reasons, the transmitters are set on top of walls, displays and cubicles so that ultrasonic transmissions are reflected off the ceiling.
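The underlying geometry can be illustrated with a simple 2D example (invented transmitter layout and a direct algebraic solution; the Bristol system's actual algorithm, which must also cope with reflections and noise, is not reproduced here): flight times give distances to transmitters at known positions, and intersecting the distance circles yields the receiver position.

```python
import math

# 2D multilateration sketch: recover a receiver position from distances
# (flight time x speed of sound) to transmitters at known positions.
# The transmitter layout and the exact solver are illustrative only.

def locate(beacons, distances):
    """Solve for (x, y) from three beacon positions and distances."""
    (x1, y1), (x2, y2), (x3, y3) = beacons
    d1, d2, d3 = distances
    # Subtracting pairs of circle equations gives two linear equations in x, y.
    a1, b1 = 2 * (x2 - x1), 2 * (y2 - y1)
    c1 = d1**2 - d2**2 + x2**2 - x1**2 + y2**2 - y1**2
    a2, b2 = 2 * (x3 - x1), 2 * (y3 - y1)
    c2 = d1**2 - d3**2 + x3**2 - x1**2 + y3**2 - y1**2
    det = a1 * b2 - a2 * b1
    return ((c1 * b2 - c2 * b1) / det, (a1 * c2 - a2 * c1) / det)

# Transmitters at three corners of a 10 m x 20 m room (invented layout).
beacons = [(0.0, 0.0), (10.0, 0.0), (0.0, 20.0)]
true_pos = (4.0, 7.0)
speed_of_sound = 343.0  # m/s, approximate
flight_times = [math.hypot(true_pos[0] - bx, true_pos[1] - by) / speed_of_sound
                for bx, by in beacons]
distances = [t * speed_of_sound for t in flight_times]
print(locate(beacons, distances))   # recovers (4.0, 7.0) up to rounding
```

In practice more than three transmitters are used (eight in the Mack Room), giving an overdetermined system that is solved in a least-squares sense and is more robust to the reflected, occluded signals the paragraph describes.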

The system supports a shared visiting experience, with one visitor using a handheld or wearable computer in the Mack Room, a second visitor using the World Wide Web on a laptop or PC in another room, and a third using 3D graphics on a similar machine in a third room. These computers communicate through 802.11 wireless Ethernet. A separate audio subsystem, which we will not detail here, handles the visitors’ speaking to and hearing each other. In discussion and development scenarios we name the visitors Vee, Dub and Ana respectively, and these names will also be used in the remainder of this chapter. The names do have roots: Vee is for ‘visitor’, and Vee was the user in our first design scenario; Dub is from the first syllable of “double-U”, as in WWW; and Ana stems from ‘analogue’, playing with the way that the digital space of the VR is an analogue of the Mack Room. Some people have suggested that Vee should be the ‘virtual visitor’ but we decided to keep the name, so as to irritate those who think that Vee’s experience is not strongly influenced by digital media.

Spatial awareness is supported by tracking activity in each of the 2D, 3D and handheld systems, sending position and orientation information for each one into EQUIP, and then rendering the information about all visitors to each individual visitor. The components of the prototype system operate similarly for each visitor, broadly as follows:

1. store spatial position and orientation in EQUIP;

2. retrieve and render positions of other visitors;


3. store named location in EQUIP in response to position change;

4. store content from Auld Leaky in EQUIP in response to location change;

5. format content for presentation and advise client program of availability in response to content change.
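Steps 3 to 5 can be sketched as a chain of transformations triggered by data-space changes (a schematic with invented helper names and data, not the actual implementation):

```python
# Schematic of the per-visitor pipeline: position -> named location ->
# hypermedia content -> formatted page. All names and data are invented.

LOCATIONS = {"timeline": ((0, 0), (10, 8))}             # name -> (min, max) extents
CONTENT = {"timeline": ["<p>Mackintosh timeline</p>"]}  # stand-in linkbase

def to_location(position):
    """Step 3: convert a position to a named location, if any."""
    x, y = position
    for name, ((x0, y0), (x1, y1)) in LOCATIONS.items():
        if x0 <= x <= x1 and y0 <= y <= y1:
            return name
    return None

def fetch_content(location):
    """Step 4: query the (stand-in) link server for the location."""
    return CONTENT.get(location, [])

def format_page(fragments):
    """Step 5: combine content fragments into an HTML page."""
    return "<html><body>" + "".join(fragments) + "</body></html>"

# Steps 1-2 (storing and rendering positions) feed this chain, whether the
# position comes from ultrasonics, an avatar or a map click.
position = (3.5, 4.0)
page = format_page(fetch_content(to_location(position)))
print(page)
```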

For Ana, position and orientation information is automatically published in the EQUIP data space by the VR Juggler client, and the positions of all visitors stored in the data space are automatically rendered as 3D avatars. Figure 5.1 shows a non-immersive spatial awareness display for Ana, with avatars representing Vee and Dub (displaying only heads rather than complete avatars).

Vee uses a Hewlett-Packard Jornada that polls position and orientation sensors, and sends the results via a proxy into EQUIP. The proxy is also responsible for retrieving the information about the other visitors, and the positions and orientations of all visitors are presented to Vee on a 2D map. Figure 5.2 shows a visitor in the Mack Room, with a handheld and an ultrasonics receiver. The figure also shows a close-up view of the handheld as an inset.


Figure 5.1 A non-immersive VR display of the Mack Room for Ana, with avatars representing Vee and Dub (displaying only heads rather than complete avatars).


Dub interacts with a Java applet in a World Wide Web browser frame. The applet communicates via a proxy that converts mouse clicks on a 2D map of the Mack Room to position and orientation information. The applet also displays representations of all visitors. An example of Dub’s map is shown in Figure 5.3, corresponding to Ana’s 3D display in Figure 5.1. The red boxes on Dub’s map are trigger zones, discussed in the next section. Vee’s map is similar to, but simpler than, Dub’s.

Shared visiting requires a sense of shared context, and hence some comparability of the information available to each visitor, but we also


Figure 5.2 A visitor in the Mack Room, “Vee”, with a handheld computer and ultrasonics receiver. The figure also shows a close-up view of the handheld as an inset.


wanted to maintain and explore a degree of the heterogeneity that is inevitable in remote collaboration. Vee has the rich environment of the Mack Room’s displays and artefacts, as shown in Figure 5.2. Providing hypermedia to Dub and Ana involves converting positions to named locations, querying Auld Leaky with the visitor’s device and location in order to generate informational content, and then formatting and presenting the content. In initial trials of our prototype system we did not deliver hypermedia content to Vee, encouraging her to use the rich content of the existing room when interacting with her friends. Dub and Ana have rich access to the web that Vee lacks, and they can move and jump between Mack Room locations in ways that Vee cannot. Ana’s


Figure 5.3 An example of Dub’s map. For paper publication, the image has been annotated with the names of the visitors.


3D view of the Mack Room VRML model has greater visual richness than the 2D maps of Dub and Vee, but she also has visual occlusions to deal with.

Positions are converted to locations by an EQUIP-connected service. These locations represent semantically significant volumes or extents within the spatial model (shown as red outlines on Dub’s map, as in Figure 5.3). Also, for each visitor a target is inserted into the EQUIP data space, currently equivalent to a 10 cm cube held in the hand. Detection of a collision between a target and a sensor invokes code that inserts a new (user, location) item into the data space. Adding or updating a user-location item in the data space triggers a query to Auld Leaky, and the results – a set of hypermedia fragments – are stored in the EQUIP data space. Adding or updating such a set of hypermedia fragments in the data space triggers formatting for delivery to the visitor. The content fragments are retrieved and combined into an HTML page. Dub’s applet displays the HTML page in a separate browser frame set aside for this purpose. Ana also runs a browser and applet that, when advised, displays HTML pages, i.e. Ana has no map display but does have a textual display.
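The target/sensor collision test amounts to a box-intersection check, sketched below with invented coordinates and location extents (the actual EQUIP collision service is not shown): the visitor's 10 cm target cube is tested against each location's axis-aligned extent.

```python
# Sketch of target/sensor collision detection: a 10 cm cube around the
# visitor's hand is intersected with each location's extent (axis-aligned
# boxes, coordinates in metres). The location extents here are invented.

def boxes_intersect(a, b):
    """a and b are ((min_x, min_y, min_z), (max_x, max_y, max_z))."""
    (a0, a1), (b0, b1) = a, b
    return all(a0[i] <= b1[i] and b0[i] <= a1[i] for i in range(3))

def target_box(centre, side=0.10):
    """The 10 cm target cube held in the visitor's hand."""
    h = side / 2
    return (tuple(c - h for c in centre), tuple(c + h for c in centre))

SENSORS = {"timeline": ((0.0, 0.0, 0.0), (10.0, 8.0, 3.0))}

def detect_location(hand_position):
    """Return the first location whose sensor volume the target hits."""
    target = target_box(hand_position)
    for name, extent in SENSORS.items():
        if boxes_intersect(target, extent):
            return name
    return None

print(detect_location((3.5, 4.0, 1.2)))   # inside the timeline extent
```

A detected collision would then insert the (user, location) item into the data space, setting off the Auld Leaky query and formatting steps described above.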

The overall effect, then, is that all three visitors have location information about their co-visitors – a shared awareness of location. Dub and Ana each have spatial and textual information about the exhibition, with the text updated as they use the 2D map and the 3D graphics. Vee has the traditional information of the exhibition room. Many of the artefacts and exhibits have corresponding representations for each of the three visitors. We sometimes refer to such artefacts and exhibits as "hybrid objects" because of the visitors' tight interaction around corresponding heterogeneous representations. Space, text and audio afforded sufficient interaction, context and reference to support a shared visiting experience, as the next section discusses.

5.4 Use

In the summer of 2002, the City project carried out a set of system trials in the Mackintosh Interpretation Centre. Rather than trying to make the best possible system for the Mack Room in particular, our focus was on general lessons we could learn for the design of systems involving heterogeneous representations and interactive media. We aimed to increase our experience and understanding of how these systems serve as constraints and resources for users' interaction. We had already studied the use of a number of cultural institutions, including the Mack Room, without our technology (Galani and Chalmers, 2002), and were interested in comparing the Mack Room with and without our technological intervention. Fuller discussion of the trial can be found in (Brown et al., 2003).

The trials involved 34 participants: ten groups of three and two groups of two. The groups of three consisted of a Dub (using the web), an Ana (using VR) and a Vee (using a mobile computer). Vee was in the Interpretation Centre, while Ana and Dub were in separate rooms on a different floor of The Lighthouse. The pairs explored different combinations: Dub and Ana visiting without the physical visitor (one trial), and Vee and Ana (one trial). For the first half of the trial, participants were asked to explore the Centre together, to familiarise themselves with the technology and how they could co-visit. Since we were specifically interested in how the system supported social interaction, we introduced an artificial task for the second half of the trial. Each participant was given three questions, and the group was asked to answer these questions together. Some questions were designed to provoke open-ended discussion and interaction between the participants. For example, participants were asked "What is the group's favourite Mackintosh painting?" and "What contribution has Mackintosh made to Glasgow?" as well as more factual questions such as "What was Mackintosh's birthday?" This combination of open and task-centred behaviour allowed us to study activity that was typical of a museum visit, such as finding exhibits, and to observe how the system supported the shared aspects of visiting a museum. During the trial, use of the system was heavily logged and each Dub was videotaped. After the trial, the participants were interviewed as a group in a recorded semi-structured debriefing.

For analysis, we combined the map view used by participants with the video and audio recordings. We analysed transcripts of the post-trial debriefings, and the logs of the visitors' use of the system. We paid close attention to the details of how users interact with each other and with technology, especially through video analysis. With particular interest in the use of location and of exhibits, we looked for "critical moments" where the system was used in a way that would let us reveal design lessons, consistencies and inconsistencies with theory, and comparisons with earlier studies.

The participants engaged in rich social interaction around the hybrid or coupled exhibits. When participants found that objects corresponded in this way they were quickly able to move on to using them in their shared tasks and activities, for example discussing the qualities of the object and comparing it to the other exhibits. In the following extract the participants discuss a set of Mackintosh pictures to decide which one they like the most. Square brackets (e.g. [pet]unias) show overlapping talk and italics shows a speaker's emphasis.

Vee: Petunias is errm better for me than Rosemaries
Ana: Ok [pet]unias
Dub: [hhh] Petunias it is
Vee: Early work
Ana: Hey guys see: this other one it's really nice. It's called Fort Mailly hhh Fort Mailly in nineteen twenty seven
Vee: Nineteen twenty seven
Ana: Yeah, it's got the light


Vee: Yeah I know but I like [Roses]
Ana: [Can you] see it?
Dub: Fort Mailly?
Ana: Hmmmm
Dub: Yeah that's quite nice
Vee: I still prefer Roses

The visitors do not focus on the system or the media involved, such as the differences between the digital and the printed reproductions of the Fort Mailly painting. Instead they focus on its aesthetic qualities and the task of deciding which picture they like most. They don't focus on the tool but on the task.

However, interacting around these hybrid exhibits was not without its problems. In ordinary face-to-face interaction, we assume a degree of commonality in the objects that we can see, hear, touch and so forth. Each user of our system had to build an understanding of the perspectives the other users had of the Centre. They could often see similar things, or related things, but not the same things from the same perspective.

Dub and Vee frequently guided Ana to specific exhibits verbally. We designed the system so that Ana did not have an overview map, and therefore might find that occlusion of objects in the exhibit was a problem. In turn, since Dub and Ana could shift attention between artefacts placed far apart in the room much more quickly than Vee, they frequently found information and then guided Vee to the corresponding location. Again this was often done verbally, but more spatial or graphical guiding is discussed later.

Participants also made use of shared location and orientation, using icons on an outline map for Vee and Dub, and avatars in Ana's 3D display. Shared awareness of location also allowed users to quickly move to their friends, and to quickly find or confirm the exhibits being discussed, that is, to quickly find what their friends were looking at and then move so as to look at the same or a closely related thing. They developed simple means to gesture. In one case, Ana moved her avatar back and forth in a "wiggle" so as to confirm which icon represented her and to show her location to Dub. This gesture was something like the wave of a hand used by someone to show he or she is in a crowd. Global location could be seen "at a glance" on the map, without the need for the visitors to use talk, but examples such as Ana's wiggle show participants' awareness of how they would be perceived by each other in different media.

Participants learned about each other's perspectives through questions and observations, building understanding of what they shared and what they did not, and thus how to more smoothly interact with each other through the resources at hand. Indeed, participants put considerable effort into designing their interactions to take into account the characteristics and limitations of their varying views of the Centre. For example, the above extract shows Vee emphasising and confirming the year in which "Fort Mailly" was painted ("nineteen twenty-seven"). One of the largest exhibits in the Mack Room is a long wall of panels, with each of the chronologically ordered panels showing a number of images associated with a year of Mackintosh's life. Mentioning a year to Vee would let her move quickly to a corresponding part of the exhibit. The information presented to Ana and Dub about this wall was broken up into a page for each year, so any one of the visitors could help guide the others to a particular image in that exhibit by specifying the year.

The use of our system involved more talk, and louder talk, between the co-visitors than we observed in conventional museum settings. For example, during a post-visit discussion two participants were asked:

Q: Is it different to a museum visit?
A: Yeah, it's really talkative.
B: You kind of go "Mmmm, that's nice" [. . .] If you find something interesting, you go "Look", and "That's over here"

Another commented on being able to talk without disturbing others:

I quite enjoyed the social engagement . . . being able to talk about everything more and not feeling that you are disturbing. Not thinking about other users in the gallery, you know it's kind of liberating . . .

I offer here a few of the many possible reasons for this, stemming from the use of multiple rooms, the audio hardware used and, perhaps more interestingly, the set of media available for interaction. Two of the participants were in rooms other than the traditional exhibition room, and so were less influenced by the normal hushed reverence adopted inside museums. All three participants were engaged in a trial of an unfamiliar technology, without established social norms for speech levels. Vee often talked more loudly and paid less attention to other members of the public in the Mack Room. Much like a telephone, the loudness of speech needed for audibility through the audio hardware may not be the same as in face-to-face interaction. Lastly, the participants often used shared audio because the coarse-grained representations of position and orientation did not afford familiar use of gesture and posture. However, participants were not solely using talk in order to circumvent the tools given; they were often talking while engaged in the tasks. Also, they developed gestures for the media at hand, for example Ana's wiggle: a gesture made in a 3D VR to be seen on a 2D map.

In everyday face-to-face situations, interaction between two people can be impeded for a number of reasons, such as one or both of the participants choosing to interact with others (e.g. on the telephone) or to interact with objects, or by participants being forced to interact with other people or objects because of interruption, breakdown, occlusion and so forth. In the trial, interaction would often pause when participants found a difference between the visitors' representations of the Centre. For example, the room's interactive video displays were only available to Vee. When a visitor started to use and talk about a display or exhibit that was not shared, the other participants would refrain from interacting and move on to other exhibits.

Similarly, Dub and Ana generally used movement on the map or VR in order to access information, in preference to the more conventional hyperlink navigation. This seems to have been partly due to the fact that such spatial movement was more a part of the shared experience, in terms of conversational references to locations, but also as a way of avoiding future confusion: following links to a web page about a new exhibit did not move the participant's icon or avatar to the corresponding new position. (This capability was implemented during the trial period but, to maintain consistency, was not deployed.) This meant that "web movement" could leave a visitor's icon in a potentially confusing place.

While there were occasional interactional breakdowns, they were not fatal for the sense of a shared visit or for interaction. Overall, participants showed skill in finding ways to handle the differences between the representations, and in exploiting corresponding and coupled features. The system successfully supported a shared experience by enabling users to talk about and interact around the exhibition, offering a socially engaging experience beyond that available to a conventional web site visitor.

5.5 Ongoing and Future Work

We continue to explore remote collaboration in cultural information and cultural institutions, in particular collaboration involving heterogeneous media. We support social context as a resource for the interpretation of information, and contextual information as a resource for social interaction. We are extending our system to be used in more of the city than The Lighthouse, adding GPS (Global Positioning System), dead reckoning, GPRS (General Packet Radio Service), and further 802.11 aerials. We are using the publicly available VRML model of central Glasgow from the University of Strathclyde, and 2D maps from services such as EDINA (www.edina.ac.uk). We have been undertaking field studies of visitors in a range of locations in Glasgow, seeing how their visits include far more than traditional cultural institutions, and how they use resources such as tourist information centres, maps, guidebooks and signage. We aim to run another field trial that involves a number of participants visiting the city, each of whom has a wearable computer on which tightly coupled 3D VR, 2D maps and hypermedia can be used. Each participant's activity will be available to the others synchronously, much as in the Mack Room, but also asynchronously.

One reason for this latter development is to move beyond the traditional objectifying or "scientific" systems of classification and retrieval that too often are the only means of access to digital information. Influenced by the theoretical standpoint outlined in Section 5.2 and the Recer recommender system (Chalmers et al., 1998), we have built a central information resource, connected to EQUIP and thus accessible via a variety of media and devices, and which stores a growing and evolving body of individuals' paths or narratives through a range of symbols: our own images and fragments of hypertext, annotations made by users, locations in the city, and other locations and web pages worldwide. The system uses this resource to make contextually specific recommendations of people, places and things by comparing each person's recent activity with similar sections of the past activity of selected others. Paths can also be more directly shown on maps, in VR models and woven into web pages. We will allow this body of information to grow as people use it, making new associations between symbols and adding in new ones. Information access based on evolving inter-subjective patterns of contextual association and use will complement access based on more static and objective interpretation. We are also experimenting with bridging between the two, using pre-written "official" explanations of the connections between symbols as a means to enrich the dynamically created recommendations.
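The comparison of a person's recent activity with similar sections of others' past paths can be sketched as follows. This is a toy illustration of the general idea, not Recer's actual algorithm: windows of past paths that overlap the user's recent symbols vote for the symbol that followed them, and the highest-scoring unseen symbols are recommended.

```python
# Toy sketch of path-based recommendation in the spirit of Recer
# (Chalmers et al., 1998); the real algorithm and data model differ.

def recommend(recent, past_paths, window=3, top=3):
    """Score each candidate symbol by how often, and how strongly, it
    followed a window of someone's past path that overlaps the user's
    recent activity. Symbols the user has just visited are excluded."""
    recent_set = set(recent)
    scores = {}
    for path in past_paths:
        for i in range(len(path) - window):
            segment = path[i:i + window]
            overlap = len(recent_set & set(segment))
            if overlap:
                following = path[i + window]
                if following not in recent_set:
                    scores[following] = scores.get(following, 0) + overlap
    ranked = sorted(scores.items(), key=lambda kv: -kv[1])
    return [symbol for symbol, _ in ranked[:top]]
```

Here a "symbol" might be an exhibit, a city location, an image or a web page, as in the mixed body of paths described above.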

Remote collaboration brings abstraction and approximation as a system monitors and senses activity such as a person walking across a city. Issues such as sampling, resolution, delay, disconnection and uncertainty have to be faced as one decides how to represent the activity inside a system, even before one considers how to represent the activity to a remote collaborator. There is no getting away from the fact that activity is going to be interpreted through sensors and transducers such as cameras, GPS and ultrasonic systems, and any digital medium has characteristic losses and uncertainties. While we want to make new resources for interaction available to people, combining new media with old in a perfectly seamless way is not going to happen. Uncertainties and inaccuracies ("seams") are an inherent part of any communicative medium, and people often learn to use these characteristics for their own ends. For example, mobile phones can be set to display the current cell, if the service provider permits, and some people choose to enable this facility. This is an elegant ambient or peripheral presentation of potentially useful information: users can choose what use to make of it, for example seeking a stronger signal by moving to a location that forces handover to another cell. Cell boundaries and signal strengths are interactional resources of the medium. Similarly, long-term use of video-mediated communication was reported by Dourish et al. (1996) to lead to "complex patterns of behaviour built up around the interactional details of the video medium . . . When the medium changes, the mechanisms change too; but the communicative achievements remain."

Recalling a term used by Mark Weiser (1994), we see seamfulness as an important design goal for our future work. We plan to design in explicit representations of the errors and uncertainties in our systems, letting people take account of the characteristic heterogeneity, errors and limitations of the systems we offer them. For example, we are starting to develop explicit presentations to accommodate uncertainty due to ultrasonic and GPS-based positioning, showing a person's sensed position as a spatial extent, rather than as a point, and showing estimates of sensing accuracy and communications bandwidth on our city maps and models.
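As a sketch of how "position as extent" might work, the fragment below turns a GPS fix into a circular extent whose radius grows with the reported horizontal dilution of precision (HDOP). The base error figure and the linear scaling are illustrative assumptions, not the project's actual error model.

```python
# Hedged sketch of a seamful position display: the 5 m base error and
# the linear HDOP scaling are assumptions for illustration only.

def position_extent(x, y, hdop, base_error_m=5.0):
    """Return a circular extent (centre plus radius in metres) for a
    sensed position, scaling a nominal receiver error by the dilution
    of precision so that poor fixes draw as larger, vaguer regions."""
    radius = base_error_m * max(hdop, 1.0)  # never smaller than base error
    return {"centre": (x, y), "radius_m": radius}


good_fix = position_extent(100.0, 200.0, hdop=0.8)  # clamped to base error
poor_fix = position_extent(100.0, 200.0, hdop=4.0)  # four times as uncertain
```

A map renderer can then draw the returned circle instead of a point icon, making the seam visible to the visitor rather than hiding it.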

5.6 Conclusion

The City project emphasises the interdependence of media, such as computer graphics and audio, with others. We have explored the combination of CVE technology with hypermedia and mobile computers, and also with the architecture and exhibits of the Mackintosh Interpretation Centre. Supporting broad social context in remote collaboration involves heterogeneity, and our project aimed to address this through coupling and correspondences between media. Users of a mixed collection of interactive media were able to enjoy a shared visit experience, engaging in collaboration through awareness of each other's activity and through more focused talk and interaction around "hybrid" objects.

By presenting the theoretical issues underlying much of this work, as well as system design and experiences of use, this chapter may serve to complement many other chapters of this book. Rather than considering that users inhabit our information space, we see people as inhabiting cities and towns, and using new technologies and older media to interact with friends, relatives and colleagues. Looking to the near future, CVE technologies will be widely available via phones and mobile computers. I suggest that taking fuller account of their use among a wider set of technologies and media, and designing for contextuality, heterogeneity and seamfulness, will greatly enrich our work.

Acknowledgements

Special thanks go to all the City project members past and present, especially Barry Brown, Areti Galani, Chris Greenhalgh, Ian MacColl, Dave Millard, Cliff Randell and Anthony Steed. Also, we are all grateful for the generosity of our hosts at The Lighthouse, especially Lynn Bennett and Stuart MacDonald. Thanks also to Ziggy Stardust, Fugazi and Low.

Equator is an Interdisciplinary Research Collaboration (IRC), supported by the UK Engineering and Physical Sciences Research Council, and the City project was supported by a donation from the Hewlett-Packard Art and Science programme.


6 Soundscapes

Tony Brooks

6.1 Introduction

Although purely virtual information spaces receive much attention and are indeed useful in certain situations, it can be difficult to integrate them with real-world activities and awkward to interact with them. As pointed out in Chapter 5, even notionally single-user activities often happen together with other people, and a purely virtual environment tends to cut people off from their surroundings, making it difficult for several co-located people to share an experience. Fraser et al. (Chapter 9) also show that collaborative interaction within a CVE is not quite as intuitive as it might first appear. For these reasons purely virtual environments may not be desirable in cases where the participants have a disability or where the environment is intended to be shared with other people, as in a public performance.

The creative process is something that is inherent in most people, even the severely disabled, yet it is an often uncharted channel within which to explore their expressive human potential. "Productive creativity" is becoming recognised as a beneficial therapeutic treatment for people with disabilities.

In this chapter the Soundscapes system, which is built to allow unencumbered creative expression for both able-bodied and handicapped people, will be discussed. The next section will describe the system itself, and then further sections will show how it has been used therapeutically and also to stage public performances.

6.2 The Soundscapes System

The origins of the Soundscapes system lie in a MIDI bass (a modified bass guitar that can generate MIDI – Musical Instrument Digital Interface – signals allowing the control of electronic synthesisers) and an expression pedal (a foot-controlled sound effects processor that can change the sound of an audio signal). From interactions with his severely handicapped uncle, in which the author played the bass and his uncle manipulated the expression pedal, the author realised that even a simple means of expression could open up ways for people to communicate and could provide immense satisfaction. The author started searching for a richer means of expression, and it soon became clear that the MIDI bass itself was too intimidating and required too much physical skill to be satisfying for people like his uncle. This led to a search for a rich expression medium that required no training to use. Although immersive reality hardware could track people's movements in a natural way, it was expensive and uncomfortable to use, since the equipment had to be physically worn by the user.

After much experimentation the author selected infrared movement sensors. Within the range of the sensor, movement from the flicker of an eye to a full body movement can be detected. The Soundscapes system uses three such sensors to allow triangulation of movement or triggering of three separate operations. Each sensor head is mounted on a flexible "gooseneck" support to allow for different configurations. The sensors output MIDI information that has been used to control movement and navigate through 3D space, as well as to control filters applied to computer-generated images.

Figure 6.1 Photograph of the first three-headed infrared movement sensor used by the Soundscapes system.

The Soundscapes system is formed from a library of tools for body function capture and a number of programs for generating the results determined by each user's responses. The library of input tools ranges from biofeedback sensors for brain activity, muscle tension, GSR (galvanic skin resistance) and heart rate, to infrared (see Figure 6.1), ultrasound and video tracking technologies (EyesWeb, http://www.eyesweb.org/, is used to perform the video capture). Most often the captured body movement data is transmitted using the MIDI protocol, and the MAX program (http://www.cycling74.com/products/maxmsp.html) is used as a central hub, directing and manipulating the data as needed.
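The central-hub role that MAX plays here might be pictured as a small message router: data arriving on a sensor's channel is forwarded to whichever output modules have been patched to it. This is a schematic stand-in to convey the architecture, not how MAX itself is programmed.

```python
# Schematic stand-in for the hub role of MAX: route incoming sensor
# values to patched output handlers. Channel numbers and handler
# names are illustrative assumptions.

class Hub:
    """Route values arriving on a channel to its patched handlers."""

    def __init__(self):
        self.patches = {}  # channel -> list of output handlers

    def patch(self, channel, handler):
        self.patches.setdefault(channel, []).append(handler)

    def receive(self, channel, value):
        # Values on unpatched channels are simply dropped.
        for handler in self.patches.get(channel, []):
            handler(value)
```

In the real system the "handlers" would be synthesis, image-filtering or logging modules; here they are just callables, so one sensor channel can drive audio and visuals at once.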

In conjunction with a physiotherapist, the Soundscapes operator calibrates the system such that the range of velocities measured by the sensors matches the movement capability of the individual who will be participating in the session. The system is configured so that movement triggers audio and visual feedback, with the exact nature of the feedback depending on the individual – we have found that some individuals respond better to audio stimulus and others to visual stimulus. System settings are maintained between sessions, and movement data are logged so that the therapist can measure progress from session to session.
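The calibration step can be sketched as a mapping from an individual's measured velocity range onto the full 0–127 MIDI control range, so that small movements by a participant with limited mobility still produce the whole span of feedback. The linear mapping below is an assumption for illustration; the text does not specify the system's actual mapping.

```python
# Illustrative per-participant calibration sketch. The linear scaling
# onto the MIDI 0-127 control range is an assumption, not the
# system's documented behaviour.

def make_calibration(v_min, v_max):
    """Return a function mapping an individual's velocity range
    [v_min, v_max] onto 0-127, clamping values outside the range."""
    span = v_max - v_min

    def to_midi(velocity):
        velocity = min(max(velocity, v_min), v_max)  # clamp
        return round(127 * (velocity - v_min) / span)

    return to_midi
```

A physiotherapist would supply `v_min` and `v_max` from observing the participant; the returned function can then drive whichever audio or visual feedback suits that individual.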


Figure 6.2 Version two of the three-headed infrared sensor system in use. The three sensor heads are visible at the centre of the image, in between and a little below the performer's hand and the image projected on the screen.


6.3 Therapeutic Uses of Soundscapes

One of the uses of the Soundscapes system is to act as an "expression amplifier" for people with physical disabilities – it provides them with the possibility to generate sounds and images from whatever movement they are able to control, and this can be a powerful therapeutic experience.

The therapeutic use of Soundscapes is illustrated by the i3-funded future probe project called The World Is As You See It (TWI-AYSI, http://www.bris.ac.uk/Twi-aysi/). TWI-AYSI was a successor to the CARESS project, which was also funded by i3 as part of the Experimental School Environments programme. The TWI-AYSI project members consisted of members of the Bristol University Electrical Engineering Department and Stefan Hasselblad, both of whom were involved in the EU CARESS project, and Tony Brooks. The CARESS project successfully motivated and empowered children to develop creativity, imagination and expression through interactive acoustic environments. The objective of TWI-AYSI was to answer the question "Can immersion in a visual environment hold similar potential for such children in terms of the aesthetic resonance they might derive from movement within such a visual space?"

During the project we brought young children from a school for multi-handicapped people into the Centre for Advanced Visualisation and Interactivity (CAVI, http://www.cavi.dk) in Denmark. The children ranged in age from three-and-a-half to five-and-a-half. A common experience for all of the children involved was the manipulation of sounds, robotic lights and coloured images. However, in CAVI we also wanted to explore the possibilities for using the movement sensors to enable 3D navigation.

Figure 6.3 Typical set up as used in TWI-AYSI (Sweden 2001) with Multiple Camera Analysis (MCA) utilising six video cameras and three infrared motion sensors.

MIDI signals from the movement sensors were received by a Linux workstation and translated to movement information that was used to control the rendering of the 3D world on an SGI Reality Monster (Silicon Graphics Inc.). In this experiment, the movement sensors allowed a person to control the movement of a rocket ship that was projected onto a large screen in front of them. It was possible to view the image in stereo using LCD shutter glasses.

However, the children did not want to wear the necessary headsets and shutter glasses that were used to view the virtual objects, and so we made the decision to project the ship in mono. The youngsters, once placed with their head inside the active sensor space, were able to control the various degrees of movement of the spaceship by small head gestures (a video of this is available online at http://media.nis.sdu.dk/video/twi-aysi.html). A small gesture to the right and the rocket ship moved to the right, a movement left and the rocket ship moved to the left, a head movement down and the ship's nose tipped down as if to dive, and a head movement backwards and the ship's nose tipped up as if to climb.
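The head-gesture mapping just described can be sketched as a small state update. The pose representation (lateral position plus pitch) and the step size are illustrative assumptions, not details of the actual SGI rendering code.

```python
# Sketch of the head-gesture mapping described above. The pose model
# (lateral position, pitch) and unit step size are assumptions.

def steer(rocket, gesture, step=1.0):
    """Update the rocket's pose from a small head gesture: left/right
    translate the ship, down tips the nose to dive, back tips it up
    as if to climb. Unknown gestures leave the pose unchanged."""
    x, pitch = rocket
    if gesture == "right":
        x += step
    elif gesture == "left":
        x -= step
    elif gesture == "down":
        pitch -= step  # nose down, as if to dive
    elif gesture == "back":
        pitch += step  # nose up, as if to climb
    return (x, pitch)
```

Because each gesture maps to one small, reversible change, even a very young participant can discover the control scheme by trial and error, which is consistent with the session described next.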

As an example of the results of this experiment we shall describe a session with a multi-handicapped five-year-old boy. At first we tried to place the 3D shutter glasses on him so he could view the scene in stereo, but he shook his head to remove them. We removed the glasses and changed to a mono projection. This was a big success, as the young boy was totally immersed for around six-and-a-half minutes. The boy was only able to express himself through a single verbal expression (a sort of "AYYE" phrase). While in the session and immersed in the interaction he could clearly be heard making his sound, which was interpreted by his helper as an expression of a joyful experience. This was probably due to the fact that he quickly understood the functioning of the interactive space and was able to control the image on screen – an experience of controlling his environment that was probably completely new for him.

Figure 6.4 Freja, a severely disabled girl with her helper Maggie. Freja is painting via her movement and can see herself in the window on the right. The colour painted depends on the velocity of the movement, with each individual having his/her own settings to match their ability. Freja's face tells the story. Reproduced with permission from Stefan Hasselblad.

As a result of the CARESS and TWI-AYSI projects we learned that when audio-only or visual-only feedback was present, some people reacted better to audio and some better to video. When both audio and video were present in the environment, a greater degree of immersion was observed. This work resulted in another European project to take these ideas further, called CARE HERE (http://www.bris.ac.uk/carehere), whose results are due to become available as this book goes to press.

6.4 Artistic Performances Based on Soundscapes

6.4.1 Interactive Painting

The inspiration for "The Interactive Painter" came from a discussion at a party in which it was stated that some people believed that painting on canvas was "dead", since computers enabled us to do so much more with images and colours. We decided to investigate a way for a traditional painter to utilise new technology while painting on canvas in the traditional way. The resulting performance involved the painter Manu Rich from Paris, France and, within Tony Brooks' COIL (Circle of Interactive Light) interactive installation, it toured a number of museums of modern art in Scandinavia in 1998 and 1999, culminating at the Danish NeWave festival performed at the Gershwin Hotel in Manhattan, New York, in 1999. More recently we have experimented with a therapeutic version of the installation for elderly and handicapped people.

Figure 6.5 Diagram of the equipment used in the interactive painting performances. The three sensors capture motion via infrared beams. In this diagram the video image is back projected, but front projection is also an option, to allow working by casting shadows and to overlap the video on a canvas.

Interactive painting is designed to be a live experience with an audience and takes place in a darkened room, which provides a challenge to the painter, since painting normally requires a well-lit room. The event makes use of projected computer-generated imagery and synthesised sounds that change according to the movement of the painter. In turn the painter may be influenced by the sound and images, thereby creating a feedback loop.

In order to give maximum freedom of movement, infrared and ultrasonic movement sensors were used to capture the painter’s movement, allowing him to move freely without trailing wires. Owing to the size of the canvas and the limited range of the infrared sensors, the painter wore a jacket with reflective strips attached, allowing movement to be detected from a greater distance. The sensors were strategically set up around the canvas and translated light reflected from the jacket into movement information, which a computer translated into sounds, colours and images.

One of the primary aims of the performance was to see how the painter would be influenced by images projected onto the canvas. The light from the projected image alters the appearance of the painted image and changes how both the artist and the audience perceive it. To achieve this, a video projector is set up to project images onto the canvas. The movement sensors control filters applied to the image such that when there is no movement there is no image; as the painter moves, portions of the image are projected onto the canvas.
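The no-movement/no-image behaviour can be sketched as a filter whose opacity follows overall sensor activity. The following Python fragment is purely illustrative; the function name, the normalisation to [0, 1] and the gain value are our assumptions, not the system’s actual filter:

```python
# Hypothetical sketch: mapping motion-sensor activity to how much of the
# projected image is revealed. Names and the gain value are illustrative only.

def reveal_alpha(sensor_levels, gain=2.0):
    """Return an opacity in [0, 1] for the projected image.

    sensor_levels: readings from the motion sensors, each normalised to [0, 1].
    With no movement the image is fully filtered out (alpha 0); stronger
    movement reveals more of the image, up to full opacity.
    """
    activity = sum(sensor_levels) / len(sensor_levels)
    return min(1.0, gain * activity)
```

With three still sensors the alpha is 0 and the canvas shows only the physical painting; any movement fades the projected image back in.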

Figure 6.6 Photograph of an interactive painting session, Gershwin Hotel, Manhattan, New York, 1999.


In addition to the projected image, the painter’s movement also triggers sounds from a sound synthesiser. The author changed the synthesiser’s patch data in real time to match the “mood” of the composition and also to try to influence the mood of the painter.

The audience of an interactive painting performance therefore has many elements to focus on:

● The painting that the painter is actually producing on the canvas.
● The movements of the painter, which the painter may change in order to change the sound and visuals as well as the painting.
● The sound and images caused by the movement of the painter.

The artist naturally tends to focus on the creation of the painting and not the multimedia experience. However, it was apparent that the sound and images had an effect on the painter and the painting. From experience we noted that smooth “pad” sounds tended to result in a correspondingly smooth motion of the artist’s brush and a passive choice of colour, while a hard-edged “Hendrix” guitar lick resulted in aggressive motion and a severe choice of colour.

6.4.2 The Four Senses

In April 2002 the first collaboration between artist/researcher Tony Brooks and artist Raewyn Turner resulted in a series of multisensory performances called The Four Senses in Auckland, New Zealand, with the Aotea Youth Symphony, Touch Compass (a mixed able-bodied/handicapped dance company) and HANDSS (Hearing ANd Deaf Sign Singers, a deaf signing choir). The performances were an improvisation in light, sound and olfactory information. Brooks and Turner created a real-time translation of sound, and of the gestures of making that sound, into light and colour and multiple layers of smell.

The concerts aimed to engage and reframe the perception of music and to play with subjective experiences and simulated synaesthesia. Each sensory element was constructed from information relating to the other elements. The associations and correspondences between the elements were made by the audience according to their own individual and personal experiences.

Tony Brooks utilised movement sensors and video cameras to capture the movement of body parts and translate it into painting with coloured light. In this way the orchestra conductor was able to “paint” the scene through his gestures. Similarly, images of orchestra members, dancers and a special signing choir for the deaf were blended into the backdrop in real time, such that their velocity of movement affected the colour of the generated imagery and the collage composition.
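The velocity-to-colour idea can be illustrated with a small sketch based on frame differencing; the hue range and the scaling factor below are invented for illustration and do not describe the actual pipeline used in the performance:

```python
# Illustrative sketch of a velocity-to-colour mapping via frame differencing.
# The real image pipeline is not documented here; names and constants are invented.
import colorsys

def motion_colour(prev_frame, cur_frame):
    """Map the amount of inter-frame change to an RGB colour.

    Frames are flat lists of grey values in [0, 1]; faster movement (a larger
    frame difference) shifts the hue from cool blue towards hot red.
    """
    diff = sum(abs(a - b) for a, b in zip(prev_frame, cur_frame)) / len(cur_frame)
    hue = 0.66 * (1.0 - min(1.0, diff * 5.0))   # 0.66 = blue, 0.0 = red
    return colorsys.hsv_to_rgb(hue, 1.0, 1.0)
```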


Raewyn Turner translated the music, through drawing and proportional relationships, into coloured light and olfactory stimulation. This was done using computer-controlled lighting rigs and by releasing aerosols into the ventilation system at key moments during the performance.

The light collage thus created was a play of interaction between live video feeds and sensors, and coloured light pre-programmed to an interpretation of the sound, each affecting the other in a dynamic visual loop.

The fourth sense employed was touch. In order to give hearing-impaired people a feel (pun intended!) for the concert experience, vibrating cushions were used. These cushions were developed as an accessory for the Sony PlayStation® and can transmit vibration at various levels of intensity. The cushions were coupled to the PA system so that they would vibrate according to the music played by the orchestra, and a number of them were distributed throughout the auditorium. Hearing-impaired people could sit on the cushions or hug them to their chests in order to get a “feel” for the music. In addition, a deaf signing choir participated in the performance to counterbalance the conventional (but sight-impaired) singer also performing on stage.


Figure 6.7 Diagram showing the locations of the performers (conductor, singer, lead violin, lead cello, basses, percussion/timpani, dancers and the deaf signing choir), the projection screens, the three roving video cameras and the primary camera capture areas for the Four Senses performance.


Figure 6.8 Two photos of the Four Senses performance: (a) dancers in front of the orchestra; (b) an example of the projected video generated during the Four Senses performance. Reproduced with permission from Milan Radojevic and Mirjana Devetakovic Radojevic.


6.5 Conclusion

We believe that there will one day be a programmable ion layer surrounding our body which will be able to stimulate our individual senses, so that we are enabled towards a truly augmented reality without wearables and special hardware such as screens. This is not pure fantasy, since a number of people in the nanotechnology community, such as Storrs Hall (http://discuss.foresight.org/~josh/UFog.html) and Drexler (1992), are exploring methods that would allow the creation of a “Utility Fog” (Storrs Hall) that would, among other things, allow our senses to be stimulated and our movement sensed.

Our human-centred work with people with special needs, whether disabled, elderly or in rehabilitation, allows us, in certain circumstances and with certain individuals, to get closer to finer nuances of the senses and of what the senses mean. By building on that research and working in cross-disciplinary teams including neuropsychologists, computer scientists, human–computer interaction (HCI) researchers and others, we hope to work slowly towards that goal.

We believe that immersive “play” within an interactive environment has much potential to do good. Specific to the current work is the belief that everyone is so individual that a system is required that can be tailored to each desire, facility and requirement; this entails adaptability with a capital A. Libraries of input HCI devices, together with libraries of mappings and libraries of output software, are the optimal way forward. By not fixating on the interactive space itself, and by making the system components invisible (disappearing computers, sensors embedded in environments and so on), we can obtain the subconscious mapping of body function that can help people. In relation to this work we prefer terms such as “proactive computing” (which focuses on improving performance and user experience through speculative or anticipatory actions) and “autonomic computing” (which focuses on improving user experience through the system’s self-regulation), since both relate to the user experience rather than to the artefacts often referred to in pervasive computing. We also believe that responsive audio/visual/haptic feedback may have much more to offer in the future, over and above what we now utilise; indeed, we believe that correspondences between synchronised feedback, especially sonic and visual in the first instance, are only just scratching the surface, and that there are many new discoveries waiting to be made.

Acknowledgements

The CARESS and TWI-AYSI projects were funded by the European Union via the i3 network.


7 The Computational Interplay of Physical Space and Information Space

Enric Plaza

7.1 Introduction

There is a current trend in computer science to develop new devices and applications that make the transition from desktop (and laptop) computing to computing devices that are embedded in the physical and social environment in which people live. Several approaches have been proposed in this direction; they have different names and focus on related but distinct issues. The first, known as ubiquitous computing, pervasive computing or the disappearing computer, focuses on embedding computing devices into the physical objects and surroundings where people work and live. A second, related trend is that of wearable computers, which focuses on embedding personal computing services in devices that people can carry or wear while moving around in their everyday activities. Next, augmented reality focuses on enriching people’s perception of their physical surroundings with computer-generated information. And finally, to be brief, there is a trend towards developing autonomous agents that take on people’s goals and try to achieve them on their own.

However different these approaches are, a common issue they all have to deal with is awareness of the physical (and social) surroundings in which people interact with computing devices. Traditionally, computers (ranging from mainframes to personal computers) operate in a purely informational world: typically screens and printers, plus a customised connection to manufacturing machines or task-specific sensors. The advent of the Internet and the World Wide Web links these computers into a common (or rather, shared) information world. The relation of the Internet, as an information space, to physical space is a research issue that essentially deals with context awareness: who is where and when, with whom, doing what. This chapter focuses on the interplay between the physical space, where people act and live, and an information space, where software programs reside, interact among themselves, perceive some properties of the physical world, and perform tasks and actions in both the informational and physical spaces.

7.2 The Interplay of Physical and Information Spaces

Since all computers, and the software they run, are potentially connected over the Internet, we can consider this an “information space”. The computing devices that populate the physical world, from personal digital assistants (PDAs) to the emerging “ubiquitous computing” devices, can then be considered as the interface between the physical spaces people inhabit and the “information space” inhabited by software programs. The most critical issue in improving this interface is that nowadays software programs have little or no awareness of the physical space and of the activities in which people engage in that physical space.

In this chapter we will first discuss the general issues that need to be addressed to improve the interfacing of physical spaces by the “information space” inhabitants, in particular awareness of people’s physical and social context. Later, we will present the COMRIS project and explain how these general issues are addressed and solved for a specific physical space (a conference centre) and a series of informational tasks useful in that space.

Awareness of physical space involves more than merely spatial or geographic reasoning. People perform activities and interact with other people while moving in physical space, and the more aware the software is of those activities, the better the interface with the information space. Two levels of context awareness are required:

1. Physical sensors, which determine the granularity of the perception of activity in physical space. The physical sensors can be a Global Positioning System (GPS, giving a co-ordinate point); wireless tags that people wear to detect who is close by or in which room; or even more sophisticated speech capture and analysis systems (e.g. trying to determine the topic, mood, etc. of ongoing activities).

2. Common sense knowledge, which determines the inferences a system can make about the “world situation” given the physical sensor information. For instance, if a wireless tag indicates that a person is in a room of type Meeting Room, there is a series of inferences that can be made from a knowledge base that models business activities (e.g. the person is in a business meeting with other people and should not be interrupted unless there is something urgent).
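The two levels can be illustrated together in a small sketch: a sensor fact (level 1) is interpreted by common-sense rules (level 2). The room types, rule and conclusions below are invented examples in the spirit of the Meeting Room inference, not an actual knowledge base:

```python
# Minimal sketch of two-level context awareness: a raw sensor fact plus
# common-sense rules over it. Room names, the rule and the "interruptible"
# conclusion are illustrative assumptions.

ROOM_TYPES = {"room-3": "MeetingRoom", "hall-1": "PublicArea"}

def infer_situation(person, room, others_present):
    """Level 1 yields the raw fact (who is in which room); level 2 applies
    common-sense rules to reach conclusions a person would find obvious."""
    facts = {"person": person,
             "location": room,
             "room_type": ROOM_TYPES.get(room, "Unknown")}
    if facts["room_type"] == "MeetingRoom" and others_present:
        facts["activity"] = "business meeting"
        facts["interruptible"] = False      # interrupt only for urgent matters
    else:
        facts["activity"] = "unknown"
        facts["interruptible"] = True
    return facts
```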

Perception using physical sensors establishes the baseline for awareness capabilities. For instance, GPS can be used together with personal devices like PDAs to yield information customised to the person (using the personal profile in the PDA) in that spatial situation (by a server that has a database of services available near that location). However, this approach is very centralised, depending on a service provider holding geographical information. For some tasks it is better to add more sensors that allow the person’s behaviour to be determined: a microphone can be used to determine whether the person is busy (talking with somebody) or idle. Currently, the Esprit project “IT for Mobility” 26900 is developing a sensor board that could be integrated with mobile phones or PDAs. The sensors on this board include two microphones, a dual-axis accelerometer, a digital temperature sensor and a touch sensor. With them, a computing device can locally infer the different contexts in which it is situated, such as “sitting in a pocket”, “lying on the desk” or “in user’s hand”, allowing the device to adapt its behaviour to each context.
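Such local context inference can be sketched as a simple classifier over the board’s readings. The thresholds, feature names and decision order below are our assumptions, not the project’s actual algorithm:

```python
# Hedged sketch of local context inference from a sensor board like the one
# described (accelerometer, temperature, touch). Thresholds are invented.

def device_context(accel_var, temperature, touched):
    """Classify where the device currently is from raw sensor features.

    accel_var: variance of recent accelerometer readings (movement).
    temperature: degrees Celsius at the device's skin.
    touched: whether the touch sensor is active.
    """
    if touched:
        return "in user's hand"
    if accel_var < 0.01 and temperature < 25.0:
        return "lying on the desk"        # still, and at room temperature
    if temperature >= 30.0:
        return "sitting in a pocket"      # warmed by the body
    return "unknown"
```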

Even in discussing the physical sensors used, we have had to include the idea of a computational model that transforms the raw input data into some interpretation of the state of affairs in the world. This is because context awareness is essentially an interpretation of the world situation and, as such, what is needed is plenty of knowledge about what the world is like; in other words, common sense knowledge is needed. There has been a great deal of research in artificial intelligence (AI) over the last 10–12 years concerning the issue of common sense, with the Cyc project (http://www.cyc.com) being the most well-known endeavour. The current understanding of common sense can be summarised as:

1. an ontology, defining the objects existing in the world that we want to talk about, and

2. an inference engine, capable of using a model of the state of affairs in the world, expressed in that ontology, to conclude new facts or statements about that state: new facts that are “obvious” or “implicitly” known by people, through what we are calling “common sense”.

Moreover, context-aware applications need to have some properties that differ from current applications: they need to be persistent, responsive and autonomous. We will call this collection of properties continuative computing,1 because these properties set context-aware applications apart from the usual applications oriented to input–output. First, a context-aware application needs to be persistent, that is, persistently in a runtime state, non-terminating. Commonly, an application accepts input and produces output (indeed, the very definition of an algorithm is based on the idea of transforming an input into an output); an exception is programs that are operating system services (which are difficult to model in the algorithm paradigm based on termination). A common application is a file that, when a user needs it, becomes runtime, receives an input and, after some processing time, yields a result and then goes offline.


1 Continuative: tending or serving to continue.


A context-aware application needs to be non-terminating: awake and running persistently, much like an operating system or a PDA. Moreover, it needs to be persistent in order to be responsive: able to adapt and produce adequate responses when something changes in a context, or when the context changes to become a new context. Finally, context-aware applications need to be autonomous in the sense of having an identity that is persistent in time and a memory (or internal state) that is individual. Since a changing context is one of the most important pieces of information a context-aware application can deal with, it makes no sense for each particular physical space to have a context-aware application that is independent from other locations. Since people move around, it is better to think of a context-aware application as centred on users, like a PDA that a user carries around. In this way the context-aware application can know the past history of the user’s contexts, and even learn to anticipate the user’s most likely future contexts and prepare for them.
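The three properties of continuative computing can be sketched as a user-centred agent with a (conceptually endless) event loop. The class below is a minimal illustration under our own naming, not an agent architecture taken from the literature:

```python
# Sketch of the "continuative computing" properties as a user-centred agent:
# persistent (a non-terminating run loop), responsive (reacts to context
# changes), autonomous (own identity and individual context history).
import queue

class ContinuativeAgent:
    def __init__(self, user):
        self.user = user                 # persistent identity
        self.history = []                # individual memory of past contexts
        self.events = queue.Queue()      # incoming context-change notifications

    def notify(self, context):
        """Called by an awareness source when the user's context changes."""
        self.events.put(context)

    def step(self):
        """One iteration of the conceptually endless run loop."""
        context = self.events.get()      # blocks until a change arrives
        self.history.append(context)     # memory enables anticipation later
        return f"adapting to {context}"  # responsiveness
```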

There is a current paradigm, with associated technology, that fits the requirements we have outlined as continuative computing: intelligent agents. Agents and multi-agent systems being developed in artificial intelligence are conceived of as autonomous, permanent entities capable of using ontologies to perform inferences for solving problems, and of co-operating and/or competing with other agents or people. Intelligent agents are classified as reactive or deliberative, depending on certain design properties, but we will show later in the chapter that agents designed with a particular architecture, like the one proposed here, can also be responsive and thus exploit awareness of physical and social context to improve their performance on behalf of users.

7.3 A Framework for Context-aware Agents

Our framework is composed of a collection of context-aware personal information agents (CAPIAs) working in an information space and a collection of human users interacting in the same physical space. A useful way to visualise this distinction is the dual space schema depicted in Figure 7.1. Human users, on the right-hand side of Figure 7.1, are in a location, interacting with other persons (who may or may not be users) in the context of social activities. Information agents, on the left-hand side of Figure 7.1, inhabit an information space where they interact with other agents and gather information in the interest of the users.

Moreover, we have mediation services connecting the information space of agents and the physical space of human users. Specifically, we are currently using two mediation services, namely an awareness service and a delivery service (see Figure 7.1).


7.3.1 Awareness and Delivery Services

The awareness service takes charge of pushing information from the physical space to the information space. Specifically, the awareness service provides real-time information about the physical location and movements of users to CAPIAs. The specific data provided depend on the particular sensors available to the awareness service in a particular application. For instance, in the conference centre application the awareness service provides real-time tracking of attendees’ locations, as well as the group of other attendees near a given attendee; see Section 7.4 for the features of the awareness service in the COMRIS Conference Centre application.

The delivery service offers mediation and brokerage capabilities (subscribed to by the human users) for delivering information from the information space to the physical space. Specifically, the delivery service provides the channels for delivering the information gathered by the CAPIAs to their corresponding users. For instance, in the conference centre application the delivery service allows information to be sent as audio output by means of a wearable computer and as HTML pages by means of screen terminals scattered through the conference building.

7.3.2 Agent Requirements

The society of agents has to be able to communicate using a common ontology for a specific application, and the agents have to share a collection of interaction protocols appropriate for that application. Our approach is to use the notion of an agent-mediated institution (Noriega and Sierra, 1999) to specify the ontology and interaction protocols to be used by a society of agents for a particular application.

Figure 7.1 A schema of the dual space of information agents and human users, with the mediation services (awareness and delivery services) between them.

In addition to supporting the ontology and interaction protocols of an agent-mediated institution, the agents should be able to handle context-awareness information. That is to say, a context-aware agent should be able to react dynamically when new physical context information is received from the awareness service. Moreover, since the future physical and social context of the user is not known, a desired feature of CAPIAs is the capability of gathering information that may become relevant in a future context. For instance, in the conference centre application, when an attendee is at a specific exhibition zone, the CAPIAs use the knowledge provided by the conference about the physical distribution of booths to try to anticipate the next movement of the attendee.

In our framework, CAPIAs are based on the distinction between two kinds of information valuation, namely interestingness and relevance. Information interestingness measures the intersection of a given piece of information with the user model a CAPIA has for the tasks with which it is charged. That is, interestingness: Info × UM → eI, where Info is a given piece of information, UM is the user model, and eI is the estimation of the interest that the user has in Info. For instance, in the conference application a preliminary criterion for determining the interestingness of a given paper presentation is computed by comparing the user’s interests (described as a collection of topics with different weights) with the keywords associated with the presentation (also described as a collection of topics with different weights). Then, other criteria, such as knowledge about the speaker or the user’s agenda, increase or decrease the initial assessment.

Depending on the physical and social context of the user and on the time, however, some information may be more or less relevant for the user at each particular point in time. Information relevance measures this intersection of a given piece of information with the time and the context of the user. That is, relevance: Info × Time × UC → eR, where Info is a given piece of information, UC is the user context, and eR is the estimation of the relevance of Info in UC. For instance, in the conference application, when an attendee is near an exhibition booth, the information related to the booth is estimated as more relevant. Another example of an increase in relevance is when a conference event is close to starting: a CAPIA then has a time constraint for deciding whether that event is useful for the user’s interests.
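A minimal sketch of the two valuations follows, using the weighted topic comparison described for interestingness; the combination rules for relevance (an urgency factor and a proximity factor) are our own assumptions, not the actual COMRIS formulas:

```python
# Illustrative computation of interestingness and relevance. Topic names,
# weights and the relevance combination rules are invented assumptions.

def interestingness(info_topics, user_model):
    """interestingness: Info x UM -> eI, here as a weighted topic overlap."""
    return sum(w * user_model.get(t, 0.0) for t, w in info_topics.items())

def relevance(base_interest, minutes_to_event, user_near_venue):
    """relevance: Info x Time x UC -> eR; imminent, nearby events score higher."""
    urgency = 1.0 if minutes_to_event <= 10 else 10.0 / max(minutes_to_event, 10)
    proximity = 1.5 if user_near_venue else 1.0
    return base_interest * urgency * proximity

profile = {"agents": 0.9, "vision": 0.2}            # user model (topic -> weight)
talk = {"agents": 0.8, "ontologies": 0.5}           # presentation keywords
eI = interestingness(talk, profile)                 # only "agents" overlaps: 0.8 * 0.9
```

The same base interest is then pushed or held back depending on when and where the user is, which is the role of the relevance valuation.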

We can say that, basically, the CAPIAs go about their tasks, interacting with other agents (and other accessible information resources), to gather information that is interesting for their users. Concurrently, each CAPIA uses an awareness service (see below) to keep track of the whereabouts of its user and decides which information is relevant for the user in a particular physical and social context. Clearly, interestingness and relevance are not completely independent, and the information gathering is correlated with the information the agent expects to deliver to the user, but for purposes of exposition it is handy to talk about them separately.


At this point it is useful to put this framework into a concrete application to illustrate the dual space and the personal information agents’ exploitation of context awareness.

7.4 The COMRIS Conference Centre

This section applies the framework of context-aware information agents to a particular application, that is to say, the physical location and social activity context of a conference centre in the COMRIS project.2 We view the conference centre (CC) as an agent-mediated institution where a society of information agents work for, and are aware of, the attendees of a conference (Plaza et al., 1998). The ontology of the CC institution defines the conference activities that take place in the CC. Examples of conference activities are exhibition booths and demo events, plenary and panel sessions, and so on. The ontology also defines the roles that a person takes on in different locations while performing different activities, such as speaker, session chair, attendee, organisation staff, etc. Other important elements defined by the CC ontology are the different locations of the conference, such as the exhibition areas, the conference rooms and the public areas, i.e. halls, cafeterias and restaurants. This information is used by the agents for reasoning about the movements of users in the conference. The schedule of conference events is also defined in the CC ontology.

Finally, the CC ontology supports the definition by each user of the “instruction set” that their CAPIA should follow. The instruction set is entered by the conference attendee using a WWW browser while registering, and basically includes (1) an interest profile (specifying the topics, with weights, in which the attendee is interested); (2) the tasks the user commissions the CAPIA to do on their behalf (e.g. whether they are interested in making appointments); and (3) the delivery modes that the CAPIA will use to communicate with the user.
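The instruction set might be represented as a small record such as the following; the field names and the default delivery mode are invented for illustration and do not reproduce the actual COMRIS data model:

```python
# Possible shape of the "instruction set" an attendee enters at registration.
# Field names and defaults are illustrative assumptions.
from dataclasses import dataclass, field

@dataclass
class InstructionSet:
    interest_profile: dict            # topic -> weight, e.g. {"HCI": 0.4}
    commissioned_tasks: list          # tasks the CAPIA may do on the user's behalf
    delivery_modes: list = field(default_factory=lambda: ["wearable_audio"])

# Example: an attendee interested in agents and HCI who allows appointments.
ana = InstructionSet(
    interest_profile={"multi-agent systems": 0.9, "HCI": 0.4},
    commissioned_tasks=["appointment_proposal", "proximity_alert"],
)
```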

We implemented two types of CAPIA in the CC application: CAPIAs representing the interests of attendees, and CAPIA advertisers. There is a CAPIA for each attendee, a CAPIA advertiser for each exhibition booth, and a CAPIA advertiser for each paper session. The goal of CAPIA advertisers is to convince people to attend the conference event they represent.

7.4.1 Delivery Service

The delivery service in COMRIS allows the users to receive information in two ways: by means of a wearable computer with text and audio output, and by means of screen terminals scattered throughout the Conference Centre. The wearable computer is used to convey short messages that are relevant for the user with respect to their current physical and social surroundings. The user can walk to a terminal if they wish to have more information about this message or other recent messages they have received. When the user approaches a screen, the wearable computer detects the terminal’s identifier and sends it to the user’s CAPIA. Once the CAPIA is aware of this situation, the agent sends to that screen a report of the tasks performed and a report of ongoing tasks.

2 COMRIS stands for Co-habited Mixed-Reality Information Spaces. More information is available at: http://arti.vub.ac.be/~comris/

The delivery service comprises several components. The first component is the natural language generation (NLG) component. The NLG component receives the message sent by a CAPIA and generates an English sentence explaining the message content, taking into account the current attendee context and the sentences previously generated. When the message has to be delivered as audio, the sentence structure is sent to a speech synthesis component that produces the actual audio heard by the user. Similarly, there are components that transform a CAPIA’s messages into HTML or VRML in order to be delivered to the screen terminals.
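The dispatch from a CAPIA message to a delivery channel can be sketched as follows; the message fields and the renderers are invented stubs standing in for the NLG, speech synthesis and HTML components:

```python
# Sketch of the delivery pipeline dispatch: the same CAPIA message is turned
# into speech for the wearable or HTML for a screen terminal. The message
# structure and mode names are illustrative assumptions.

def render_message(message, mode):
    """Return a (channel, payload) pair for the requested delivery mode."""
    text = f"{message['event']} may interest you: {message['detail']}"
    if mode == "wearable_audio":
        return ("speech", text)                  # would go to the synthesiser
    if mode == "screen_terminal":
        return ("html", f"<p>{text}</p>")        # would go to the terminal
    raise ValueError(f"unknown delivery mode: {mode}")
```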

7.4.2 Awareness Service

The awareness service keeps track of the whereabouts of the attendees in the Conference Centre. In the COMRIS CC the detection devices are a network of infrared beacons (marking the different rooms, places and locations in the CC) and the wearable computers carried by the attendees. The COMRIS wearable computer (shown in Figure 7.2 and commonly called the parrot) detects the infrared beacons and thus informs the awareness service of the location of its user. Moreover, the wearable device possesses an infrared beacon of its own, allowing the detection of other persons wearing a parrot who are located nearby. In order to have access to this information, each CAPIA in the information space “subscribes” its user to the awareness service. As a result, the CAPIA receives messages about the changes in location of that person and a list of other people close to that person. When the CAPIA interacts with other CAPIAs (representing other conference attendees) and decides that the attendees they represent are interesting persons, it subscribes those persons to the awareness service too. Consequently, the CAPIA is aware of the location of the persons most interesting to its user and detects, for instance, when one of these persons is in the same location as the user: a most relevant situation in which to push to its user the information concerning that person, who is both interesting and nearby.

Figure 7.2 The wearable computer, also known as “the parrot”. The CPU is in the front unit, while the back unit hosts the sensors and batteries.
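The subscription pattern behind the awareness service can be sketched with a simple callback registry; the class and method names below are illustrative, not the real COMRIS interfaces:

```python
# Sketch of the subscription pattern: a CAPIA subscribes people to the
# awareness service and is notified of location changes and who is nearby.
# Names and the notification shape are invented assumptions.

class AwarenessService:
    def __init__(self):
        self.subscribers = {}    # person -> list of notification callbacks

    def subscribe(self, person, callback):
        """A CAPIA registers interest in a person's whereabouts."""
        self.subscribers.setdefault(person, []).append(callback)

    def report(self, person, location, nearby):
        """Sensors report a sighting; all subscribed CAPIAs are notified."""
        for cb in self.subscribers.get(person, []):
            cb(person, location, nearby)

# A CAPIA subscribing its user and receiving a sighting:
seen = []
service = AwarenessService()
service.subscribe("ana", lambda p, loc, near: seen.append((p, loc, near)))
service.report("ana", "exhibition-hall", ["bob"])
```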

7.4.3 Tasks

The tasks that the COMRIS Conference Centre supports are at the core of the activity of the CAPIAs. It is important to remark here that, in order to perform these tasks, the information agents use both the CC ontology and the awareness service to infer the situation of the user. That is to say, knowing that the user is in a particular place, the current time, and the activity scheduled by the conference for that place at that time, the information agent can infer the social activity in which the user is involved.

The tasks performed by COMRIS CAPIAs, and the scenes in which they are involved, are summarised below.

● Information gathering: this task is responsible for establishing initial conversations with other CAPIAs in order to estimate the interestingness of the attendees or conference events they represent. We say that the information gathering task constructs the interest landscape of a given attendee. The interest landscape holds all the information considered to be useful for the interests of the attendee and is used and refined in the other tasks. When the information-gathering task assesses a conference event with a high interestingness valuation, the information is directly delivered to the attendee. This delivery strategy was adopted in order to bias the future decisions of the attendee. In CAPIA advertisers, this task has been specialised for attracting persons who might be interested in the conference events (exhibition booths or conference sessions) they represent.

● Appointment proposal: in this task, using the interest landscape, the CAPIAs try to arrange an appointment between two attendees. First, CAPIAs negotiate a set of common topics for discussion (the meeting content). When they reach an agreement, CAPIAs negotiate an appropriate meeting schedule.


● Proximity alert: in this task an attendee is informed that she is physically near another person with similar interests – or near an exhibition booth or a thematic session with similar topics.

● Commitment reminder: this task is responsible for checking whether attendees are aware of their commitments. The CAPIA uses context to determine that the user may be unaware of a commitment, e.g. if she is not near the location of an appointment (or a bus scheduled to leave) a few minutes before. Commitments of attendees are only notified when the context information available to CAPIAs indicates that the attendee is not aware of the commitment (e.g. it is five minutes before the start of a session chaired by the attendee and the attendee is physically in another place).

For each task several activities are launched in the CAPIA. For instance, when an agent in COMRIS is discussing appointments with several CAPIAs, each thread of interaction is managed by a distinct activity. The activities can start either by an internal decision of a CAPIA or because a CAPIA has received a request from another CAPIA.
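The commitment-reminder check described above can be sketched as a simple context test run shortly before each commitment. The five-minute warning window and the function name are illustrative assumptions, not values from the COMRIS system.

```python
from datetime import datetime, timedelta

def needs_reminder(commitment_time, commitment_place, user_place, now,
                   warning=timedelta(minutes=5)):
    """Remind only when context suggests the attendee is unaware:
    the commitment is about to start and she is somewhere else."""
    about_to_start = timedelta(0) <= commitment_time - now <= warning
    return about_to_start and user_place != commitment_place

# e.g. a session the attendee chairs starts at 14:00 in hall B;
# it is 13:56 and she is still in the exhibition area:
start = datetime(2003, 6, 12, 14, 0)
print(needs_reminder(start, "hall B", "exhibition",
                     datetime(2003, 6, 12, 13, 56)))  # True
```

If she is already in hall B, or if the session is still an hour away, no reminder is pushed: the agent stays silent whenever the context indicates the attendee is likely aware of the commitment.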

7.5 Conclusions

We have shown a specific context-aware application (the COMRIS Conference Centre) developed in the framework of the COMRIS project that illustrates the interplay of physical space and information space. We have seen that the physical infrastructure, consisting of an individual wearable computer and localisation beacons, was used as an “awareness service” by a society of agents inhabiting the information space. We have focused on the kind of software required to develop a context-aware application, showing that using an agent-based approach we can fulfil the properties we required in what we called continuative computing. We also discussed the kind of agent architecture that can exploit context awareness.

The approach we presented was based on the idea of having a personal agent per person. This allows the continuative dimension of processing to be user-centred. As the user changed context, the personal agent received the corresponding perception data from the awareness service and followed the user to the new physical context. In addition to this, services that need to be provided to the users (services provided by the Conference organisation in our example) are agentified, that is, they are also provided by agents (and they are also aware of the contexts in which they are interested). Moreover, this approach is scalable: the agent could use another awareness service at a different conference centre. Moving from one application context to another (from one conference to another) requires a standardisation effort for the awareness services, but this effort is reasonable since it can improve performance and lower costs.


We thus see that context awareness can be integrated into an agent-based paradigm in a well-understood way. A better infrastructure for context perception, such as we can expect to be developed in the next decade, can be integrated into the agent-based paradigm without major problems. The main reason for this is the AI approach to agents that employ ontologies describing the world. Clearly, with a better perception infrastructure the agents could perform better inferences about the state of the world. For instance, in the COMRIS conference centre the agents knew when two people were in the same room (using beacons) and when two people were in front of each other (using the wearable computer beacons), but since there was no microphone on the wearable computer there was no way to know whether the user was already busy talking with someone or not.

In addition to improved awareness services and perception infrastructures, a second issue that was considered in the COMRIS project but never tried was that of learning. Agent learning is developing into an active area of research, initially focused on reinforcement learning but rapidly broadening. Agents in context-aware applications should be able to adapt to new contexts but also to learn from the user’s satisfaction (or dissatisfaction) with the agent’s actions. However, learning from examples requires a sufficient number of examples to be worthwhile, and the experiments conducted in the framework of the COMRIS project assumed that the Conference only lasted one day – the amount of data was too sparse to allow significant learning. It turned out that learning would be interesting if personal agents were carried over by the user to different conferences, allowing the collection of examples sufficient in number and variability. Only when awareness infrastructures are more proficient and more readily available (and awareness services are more standardised, allowing agents to continue from one application context to the next) will AI and agent technologies be able to respond ubiquitously and intelligently to the requests and needs of people.

Acknowledgements

The research reported in this paper is partly supported by the ESPRIT LTR 25500-COMRIS (Co-habited Mixed-Reality Information Spaces) project.


Part 4: Communication


8 Communicating in an IIS: Virtual Conferencing

Adrian Bullock

8.1 Introduction

This chapter considers the role that virtual conferencing has to play in realising a successful Inhabited Information Space (IIS). For any IIS to be successful it needs to weave together many different constituent elements and present these in a coherent and seamless manner. For example, Maher et al. (2000) describe how many different components are used together to create a virtual design studio for architectural collaboration. For the IIS to function, all the elements must work both individually and collectively. Communication is one of the basic building blocks for an IIS, and can be in many modes across many media. Approaches to virtual conferencing offer support for communication across a number of media and can be utilised in an IIS. These approaches are also starting to offer support for collaboration. By providing an introduction to, and overview of, various possibilities for virtual conferencing this chapter aims to show how these solutions can provide the required and appropriate support for communication and collaboration between inhabitants in a shared information space. Of course virtual conferencing solutions exist at many levels of sophistication and fidelity. Communication media can range from text through 3D graphics to video representations. The aim of this chapter is to present these many and varied possibilities, drawing on the experience of the author as well as insights into the past, present and future. In this way it is possible to see how diverse a range of IISs can make use of virtual conferencing functionality.

We talk about virtual conferencing as an activity that takes place within an IIS, the aim being to examine a range of communication and collaboration possibilities available for use in an IIS. Many of the approaches could well be termed IISs in their own right, whereas others provide only part of the functionality required by an IIS. Our aim is not simply to overview virtual conferencing approaches, but to demonstrate


how these approaches can be incorporated, borrowed or used to complement other applications and so aid in the construction of an IIS. We begin by considering just what we mean by the phrase “virtual conferencing”, and then we look at a number of approaches, demonstrating the range of areas that virtual conferencing covers. Next, we examine how virtual conferencing can be used, and identify some issues that have a bearing on this use. We also consider the roles of video and graphics in realised solutions for virtual conferencing. These approaches are then compared to telephony, arguably the most pervasive virtual conferencing solution in use today in global terms, and we suggest areas that need to be addressed if conferencing is to approach these success levels in the future. We end by presenting general guidelines for making use of virtual conferencing approaches effectively.

8.2 Virtual Conferencing – a Historical Perspective: Past, Present and Future

We have come a long way since the first telephone call in 1876 in Boston (Farley, 2001), the first videoconference meeting in 1930 in New York (Rosen, 1996) and the first TV broadcasts in March 1935 by the German Post Office. The subsequent developments in these technologies have concentrated on reproducing lifelike representations of shared audio and/or video between participants. The introduction of computers in the 1960s saw the development of more abstract forms of communication: email, text chat and MUDs/MOOs (see Section 8.3.2) are early examples. As computer graphics developed so did the possibility of creating simple shared environments, something picked up by authors at that time. Gibson (1986) coined the term “cyberspace” in his writing in the 1980s to describe a virtual world which we will occupy as comfortably in the future as we do the physical world now. Stephenson’s (1992) Snow Crash further describes how physical and virtual worlds may become intertwined, with people undertaking actions in both worlds at the same time. In Star Trek we see a “Holodeck”, where the ship’s crew walk into an empty room, speak to the computer and suddenly it is as if they are in a completely different physical location, able to experience all that it has to offer – though we are still some way from achieving this today (Raja, 1998; Leigh et al., 1999). While early shared environments such as AlphaWorld (http://www.activeworlds.com/) attracted thousands of users and indeed have been used to host international conferences (Jones, 2000), it is perhaps with online gaming that we are starting to see the potential of IISs, with hundreds of thousands of users interacting together in shared gaming environments such as EverQuest (2003).

By examining past and current approaches to virtual conferencing, providing examples and experiences of use, we aim to present virtual


conferencing as an abstract area, fulfilling the requirements of many different environments and scenarios, including those listed above. It is an activity that takes place in the IIS to support the work that is the primary purpose of the system.

8.2.1 What Do We Mean by Virtual Conferencing?

We define a virtual conference as a meeting between two or more participants, where the participants are physically located at different places. Typically participants communicate and collaborate through some form of visual and audio communication channel, along with the sharing of task-related information. Videoconferencing systems such as the H.323-based Polycom ViaVideo (Polycom, 2003) and the IP-based Marratech (2003), telepresence systems such as MASSIVE (Greenhalgh and Benford, 1995) and DIVE (see Chapter 12), and text-based systems such as MUDs and MOOs (Burka, 1995) are common forms for supporting this communication. An important aspect of virtual conferencing is that there is a shared space that supports appropriate communication across a number of media in as natural a manner as possible; an IIS is such a space.

8.3 Approaches to Virtual Conferencing

We now examine how virtual conferencing has developed, considering its origins in video in the late 1960s and multi-user dungeons in the late 1970s, the effects of developments in graphical, computational and network capabilities, and the application areas that benefit from and drive forward the development of virtual conferencing systems.

8.3.1 Early Videoconferencing

Early videoconference developments dating from the 1960s placed the emphasis on communication, enabling two people in different geographical locations to talk to and see each other. The Picturephone system (Massey, 2003; Schindler, 1969) developed by AT&T was an early example of an attempt to enhance the experience of talking on the telephone. A telephone was augmented with an accompanying video display unit with a built-in camera, enabling the user to see the person to whom they were talking. Functionally the system was sound, but people felt that the unit was too bulky, the controls too unfriendly, the cost of use too expensive and the picture too small. Another difficulty was the high bandwidth requirement of a video call, 333 times higher than a standard phone call. Clearly this would not scale very well with the telephone


infrastructure of the early 1970s, and the Picturephone disappeared very quickly. However, with advances in compression techniques, improved networks (both Internet and telecommunications) and the increasing availability of low-cost bandwidth we are seeing conditions where videophones are becoming viable. Indeed, we increasingly see their use by the media when reporting from countries with little infrastructure (e.g. satellite videophones in Afghanistan in 2002).

8.3.2 MUDs and MOOs

The roots of more abstract forms of virtual conferencing can be traced back to MUDs (Multi User Dungeons) and MOOs (Object Oriented MUDs) in the 1980s (Bartle, 1990). At this time computational resources were limited and expensive (mainframes), so people had to rely on simple text message passing and, perhaps more interestingly, the use of text to construct some of the first IISs (even if they were not thought of as such then). Despite the impoverished appearance of such environments (only a screen of text, after all), they were extremely effective at engendering a sense of community and presence – a very effective IIS indeed! Emoticons were used to enrich communication (Rivera et al., 1996), perhaps the most famous being the smiley :-) (Fahlman, 2003). These textual environments showed that effective collaboration and community building was possible. Embodiment, information representation and technological issues were addressed in the minds and imaginations of the end users, and textual descriptions aided in this process (a textual description of how and where someone appears). It is not surprising that this works well, given the way in which it is possible to immerse oneself while, for example, reading a book. These environments had a range of uses, some being for games and role-playing, some a place to simply hang out and meet friends (Evard et al., 2001), while others were support environments for work (Evard, 1993; Churchill and Bly, 1999).

8.3.3 The Arrival of Graphics

In the early 1990s graphical add-ons to MUDs and MOOs started to appear, as graphics technology became something that was available on the desktop and no longer the preserve of expensive, high-powered servers. However, these early approaches to bolting on graphics (e.g. BSXMUD, 1994) did not really work (see Figure 8.1). The interesting and useful interaction continued to take place in the simple text exchanges, while the graphics were simply a series of updated still images and scenes that bore little resemblance to the ongoing dynamic activity. More successful approaches to graphical MUDs and MOOs were ActiveWorlds’


AlphaWorld (http://www.activeworlds.com/) and Blaxxun’s Contact (http://www.blaxxun.com/), realisations of the kinds of online communities described in Neal Stephenson’s (1992) Snow Crash (indeed Blaxxun takes its name from the virtual club in the book). Here there was a more direct correlation between the user and their embodiment (Cuddihy and Walters, 2000), albeit in a very constrained and rather unnatural way. Typed text was still the main communication channel, but now gestures could be performed by the embodiment as well as emoticons such as :-). These environments appealed to teenagers in particular, offering a customisable place to hang out with friends and meet new people. Creativity could also be expressed, with parallels between the early settlers in the USA and the early settlers in AlphaWorld, arguably the world’s largest virtual reality world, being easily made through the creation of homes and communities. Activeworlds Maps (2003) charts the development of this world from December 1996 to August 2001 in pictorial form. Currently the world is approximately the same size as California, i.e. 429,025 square kilometres. Today these companies have grown from their MUD roots to offer a 3D home on the Internet with sophisticated possibilities for interaction.

8.3.4 Video Comes of Age

Away from MUDs and MOOs, work in the late 1980s was also being undertaken on shared media spaces, where video technology was used to create a real shared space inhabited by real representations of people. Xerox EuroPARC pioneered this work with their RAVE media space (Gaver et al., 1992), in tandem with research into media spaces at PARC (Stults, 1986; Bly et al., 1993). ISDN-based videoconferencing was the


Figure 8.1 A screenshot from BSXMUD. Reproduced with permission from Henrik Rindlow.


standard for business and commercial use, and a new suite of software tools that ran over the Internet was developed in the academic community at this time: the MBone tools (Macedonia and Brutzman, 1994). However, workstations and peripherals to support video were expensive and not in widespread use, even by the mid-1990s. All this was to change with the arrival of inexpensive USB devices and dedicated video hardware in the latter part of the 1990s. Video became accessible on the desktop to all users, software became more widely available (e.g. Microsoft bundled NetMeeting with their operating system), plug-and-play all-in-one hardware devices such as the Polycom ViaVideo removed the complexity from local peripheral configuration, and network infrastructures made it possible to support the bandwidths required for video communication.

We give two examples of the types of videoconferencing solutions most commonly in use today. Figure 8.2 shows a room-based solution, where dedicated hardware is installed in a conference room, and a large screen displays the other members of the videoconference. This is purely videoconferencing, and not really an IIS in the way it is being used. However, if the other screens were used to display shared workspaces, e.g. using Virtual Network Computing (VNC) (Richardson et al., 1998) to share a computer desktop between all the sites, then this simple communication space is quickly transformed into an IIS.


Figure 8.2 A typical room-based videoconference session


Our second example shows how a desktop-based videoconference solution can be thought of as an IIS. Figure 8.3 shows what a typical desktop looks like when holding a meeting using the desktop videoconference system Marratech Pro (http://www.marratech.com). Four windows make up the session and we describe them moving from left to right. The viewer enables the user to load HTML web pages. By default these pages are local to the viewer, though users can choose to transmit their content to others in the same session. In this case the project web site, being discussed in the meeting, is displayed. The next window is a shared whiteboard that supports the import of images, Microsoft Word and PowerPoint documents as well as standard whiteboard functionality. This holds the meeting agenda and is updated with annotations as the meeting progresses (who is responsible for developing which sections of the web site, in our example). The video window shows who is currently talking, and the video image follows the audio source, changing dynamically. Finally, the participant window shows thumbnail video images of everyone who is in the meeting. Participants join the meeting by contacting a portal and joining a session. Sessions can be encrypted and/or password protected for privacy purposes. It is also possible to communicate privately with other participants in the meeting while still being present in the main meeting room.


Figure 8.3 A meeting in a desktop videoconference system.


8.3.5 Graphics Come of Age

From the mid-1990s onwards, technological advances were such that 3D graphical environments that supported audio communication between participants were possible on desktop workstations, opening the door to potentially rich interactions between standard users. Early examples of these types of systems included DIVE (see Chapter 12), a fully programmable distributed interactive virtual environment, and MASSIVE-1 (Greenhalgh and Benford, 1995), a 3D graphical teleconferencing system. Another important driving force in the development of CVEs at this point was the gaming market. Doom (1993, Id Software) was a revelation, demonstrating that engaging 3D environments into which users could immerse themselves for literally hours on end were possible, and on very lowly specified PCs. OK, so these were hand-crafted, specific and optimised solutions, but if such levels of realism and involvement were possible in a game then surely it would only be a matter of time before such environments were possible for business and leisure use?

The main drawback in the development of realistic environments in the early 1990s was the available computational power. Despite pushing


Figure 8.4 A virtual conference in DIVE (c. 1995).


the technological capabilities to the limits, the early environments supported by systems such as MASSIVE and DIVE appeared primitive and impoverished (see Figures 8.4 and 8.5). Textures helped increase realism, but at the time this had a major effect on performance. Unlike the predefined scenes of games, these new systems were programmable on the fly and very much dynamic in nature. The collaborative nature also had a major impact on networking, especially with the bandwidth requirements of audio communication between participants. Initial experiences showed that the configuration of such distributed environments was most definitely non-trivial, even for those who simply wanted to run a client. However, once sufficient experience and training had been acquired it was observed that people could collaborate well in such environments, could build personal and social relationships with others, and could adapt their behaviour quite flexibly (Greenhalgh et al., 1997; Bullock, 1997). It was possible to examine collaboration and usability issues, and not just be content with the technical “success” of merely giving everyone the possibility to communicate, though this was a result in its own right.


Figure 8.5 Participants examine a representation of network connectivity in MASSIVE-1. Reproduced with permission from The University of Nottingham.


8.4 Using Virtual Conferencing

In the previous section we examined the development of systems that have overcome the technical challenge of simply enabling people to meet together in a shared environment. This stable base allows the investigation of a number of interesting research questions and challenges for supporting virtual conferencing and meeting in an IIS. Some of these are quite technical in nature – such as how can scalable systems that support many thousands of simultaneous users be constructed, and how complex can the environments and users be? – while others are concerned with social factors – such as how do we convey subtle, involuntary communications inside an IIS, and how can we assess the success of the IIS as a whole? Perhaps an easy way to summarise is to say that we need to understand the process of collaboration and identify the key aspects and challenges to be addressed.

8.4.1 Understanding Collaboration

When we meet together with other people we are naturally immersed in a shared environment; it is the environment in which we spend our entire lives and we spend many years learning how to understand it and interact in it. Through the use of computing and telecommunication technology for supporting virtual meetings we introduce an explicit obstacle between each of the users, in that each person must first interact with the artificial shared environment and then interact with the occupants and contents of that environment (Churchill and Snowdon, 1998). There have been many studies of how people collaborate together in both virtual (Fraser et al., 2000) and real (Heath et al., 2001) environments. One of the major challenges of teleconferencing, and one that is still far from being solved, is the ability to interpret the many and complex interactions that take place in real life when two or more people meet together. Information is exchanged in what is said, how it is said, gestures, facial expressions and body orientations, as well as through people’s natural ability to somehow sense how the other person is feeling (possibly from the sources listed above but maybe from other factors as well). This information is very difficult to capture and process, though there are attempts to build dedicated installations to do this, such as in the EU IST-funded Virtue project (Schreer and Kauff, 2002) and the Office of the Future project (Wei-Chao Chen et al., 2000). There are also continuing developments in the area of image recognition and tracking, so that systems can be developed that automatically capture not just the voice and appearance of a user, but also their gestures and other subtle interactional components, and automatically translate these into input for the shared meeting system (e.g. Fuchs, 1998 and Intelligent Spaces in MIT’s


Project Oxygen (2003)). Another possibility for capturing user data would be to expect someone to explicitly control a virtual embodiment to represent their range of interactional possibilities. However, the complexity and cognitive load in using such an interface would most likely make it impossible to use – users would spend all their time trying to control their avatar, to the extent that they could not concentrate on the actual interaction itself, the whole reason for being in the environment in the first place!

8.4.2 The Importance of First Impressions

While meeting someone for the first time should be a real-life experience, repeat meetings can benefit from virtual conferencing approaches. There is already a shared experience on which to build (the first meeting), and people can use that to bridge the gap between the real and the virtual (be it graphical or video) world. These techniques are equally valid for graphical environments and videoconference systems.

8.4.3 Sharing Context

Another important aspect of collaboration is the ability to share the context or setting in which each user is situated. The communication itself is important, but so is the work that is going on (Luff et al., 2000). We need to share work as well as communication, and tools that let us do this are being continually developed, e.g. shared CAD design software, whiteboards and application sharing systems such as NetMeeting and VNC. We also need to be able to share experiences between local and remote participants, an example being the TELEP system from Microsoft (Jancke et al., 2000) that allows remote viewers to attend seminars, having a presence in the room and an ability to interact.

8.4.4 Scalability

When it comes to scale there are big problems to be overcome. What does it mean to be in a virtual audience of a thousand users listening to a presenter give a speech? Hard choices have to be made as to how much awareness of the rest of the audience we have while we naturally concentrate on the speaker. There should be the potential to interact with any individual in the room (e.g. in a plenary session at the ACM CHI conference anyone is able to ask a question, and that person is heard by everyone in the room, but only at the instant the question is asked; at the same time we are aware of what the people immediately around us in the audience are saying). Techniques to scope and limit


the interactions are necessary (Chapter 13; Benford et al., 1997a), and these techniques must be flexible enough for easy switching between the different scenarios, ideally without the awareness of the user. Here the technical issues have to be dealt with while considering the social implications of any given choice.
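One family of such scoping techniques, the spatial model of interaction associated with the work cited above, grades mutual awareness by how far each participant projects attention (focus) and presence (nimbus) into the shared space. The distance-based sketch below is a deliberately simplified illustration under our own assumptions, not the model's actual formulation.

```python
import math

def awareness(observer_pos, observed_pos, focus_radius, nimbus_radius):
    """Simplified distance-based awareness: full awareness when the
    observed lies within the observer's focus AND projects a nimbus
    that reaches the observer; partial when only one condition holds."""
    d = math.dist(observer_pos, observed_pos)
    in_focus = d <= focus_radius       # observer is attending this far out
    in_nimbus = d <= nimbus_radius     # observed is projecting this far out
    if in_focus and in_nimbus:
        return 1.0   # e.g. the speaker at the podium, heard by all
    if in_focus or in_nimbus:
        return 0.5   # e.g. the murmur of immediate neighbours
    return 0.0       # the rest of the thousand-strong audience
```

A questioner at a plenary session could temporarily be given a room-sized nimbus so everyone hears the question, then revert to a small one, which is the kind of scenario switching the text argues must happen without the user's awareness.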

8.4.5 Real Versus Abstract: The Role of Video?

Approaches to teleconferencing using 3D graphical systems (Greenhalgh et al., 1997; Lloyd et al., 2001) offer a “real” shared environment, graphically rendered with polygons and textures, where users are embodied in the same shared space and it is possible to observe the interactions between the users. This offers potential for developing and supporting the more involuntary aspects of communication that we touched on previously. However, the embodiments can be quite crude, and even when they are very realistic, as in computer games or state-of-the-art research (VR Lab, 2003; Mira Lab, 2003) (see Figure 8.6), it is still not natural to immediately associate the embodiments with actual people. There are also very real performance issues when using more realistic embodiments, as significant computational resources are needed as the levels of realism increase, especially if the embodiments are to react dynamically in real time. And how should such embodiments be animated and controlled? One danger with a lifelike embodiment is that there is an expectation of lifelike behaviour from the avatar. If an avatar has legs, is it not reasonable to expect them to move when the avatar moves? Often this is not the case.

With video the association between person and embodiment is automatic and we immediately get a sense of the person we are talking to. However, a simple video image is quite poor in the depth and quality it can offer, and much depends on the local user configuration (i.e. where the camera is placed, the lighting, the behaviour of the person themselves).


Figure 8.6 Realistic human embodiments. Reproduced with permission from Daniel Thalmann.


So while we recognise and relate to the person on one level, many of the subtle interactional methods are lost with a purely video-based approach. Therefore the immediate way forward would seem to be a hybrid approach, where video-based avatars are placed in a shared 3D space and can interact with each other (examples being Benford et al., 1996; Insley et al., 1997), or where the real and virtual worlds are mixed together (Koleva et al., 2001).

8.5 Virtual Conferencing Versus Telephony

In spite of the advances in support for virtual conferencing at a number of levels and across a number of media, perhaps the most used form of virtual meeting today remains the humble telephone and the conference call, where there is no user embodiment and no support for shared information. In this section we examine some of the reasons why the telephone enjoys widespread success while the seemingly superior virtual conferencing services lag some way behind. At first it appears strange that more use is not made of virtual conferencing, given the potential savings in time and money if people did not travel so often (Townsend et al., 2002). However, a Scandinavian survey (SAS, 2002), undertaken by Gallup for SAS, the region's major airline, found that for business, personal meetings are preferred to virtual ones (perhaps not too surprising given who commissioned the survey). Interestingly, the same survey also found that women were more positive towards virtual meetings than their male counterparts.

With the introduction of broadband (0.5 Mbit/s and above) to homes across Europe and the United States, one might think that the days of regular telephony were numbered (if everyone has a computer and it is always connected, then why use the telephone?). This is definitely not the case. Let us consider why the telephone is so successful. People are familiar and comfortable with the telephone; they know its limitations and know what to expect when using it. They simply lift the phone, dial a number and talk to the person at the other end. Technical glitches are rare and not the fault of the individual. Also, the interface has changed little over the past 50 years; indeed, if a 50-year-old phone is connected to today's system it is more likely to work than not. Contrast this with computers: relatively new and complex devices that serve many and varied purposes, that are constantly being improved and developed, and that offer little backwards compatibility in terms of both software and hardware (things quickly become obsolete).

With virtual conferencing we are often not dealing with a single device (the telephone), but a collection of devices: microphone, camera, screen, computer and the connections between them. The software that holds everything together is often not as stable as the telephone system, and much configuration and fine-tuning is necessary to have everything perform optimally. Also, much of this configuration is a local issue that users have to deal with themselves. Unfortunately many people have had poor experiences of trying to use virtual conferencing applications, and these experiences make them shy away from such approaches in future, be it consciously or unconsciously. However, many of the problems of the past concerning interoperability and configuration are being solved. USB devices, such as cameras, are standard and easily used across platforms, and software wizards make the configuration of applications a relatively easy task. Also, instant messaging systems such as ICQ and Microsoft Messenger are educating people about the potential advantages of virtual conferencing, even if it does not explicitly seem that way. These tools offer facilities much like the standard telephone, where people can select an icon and then start a conversation with another user. While this is not much different from a standard telephone call, it provides the user with experience of using his or her PC for communication, and perhaps more importantly some confidence and benefit from the process. The next step of experimenting with conferencing applications should be easier with such a user, as they are aware of the potential benefits and have some confidence in the technology in use (i.e. they have used it successfully and know at first hand the benefits to be gained).

Long-term usage of room-based videoconferencing (Bullock and Gustafson, 2001) identified a number of potential problems that need to be addressed if people are to concentrate on their meeting and on each other, and not on the infrastructure supporting the meeting. A consistent experience is necessary, as with the telephone. Participants rated their experiences very highly the first time they used the system, but for subsequent meetings they rated their experience less highly, even though there were no perceivable differences between configurations. We suggested that people were excited and taken in by the set-up the first time they used it, and subconsciously compensated for small glitches in audio and video. Once familiar with the set-up, however, they were more susceptible to minor imperfections and noticed them, and this made people feel more tired. So, while systems might appear to be good in the short term, they need to offer consistency and reliability like the telephone system if regular long-term use is to occur. Perhaps the most significant problem in maintaining a sense of presence between the participants concerned audio, and feedback in particular. Audio headsets help eliminate feedback, but in a room setting these are not practical. While microphone/speaker combinations that help eliminate feedback do exist, they can be expensive, or else the quality they offer is relatively poor. In order to take advantage of the increased quality that computational approaches offer (compared to the telephone), audio solutions offering feedback-free communication need to be developed.


8.6 Guidelines for Using Virtual Conferencing Effectively

When deciding what type of virtual conferencing solution to use, there are a number of factors to take into consideration. Perhaps the most important is the task being undertaken. It is also important to think about which media are most relevant for communication, and what types of communication need to be supported. Other factors to be aware of are the supporting infrastructures that will be used (e.g. networking, display possibilities) and issues of heterogeneity. We bring this chapter to a close by drawing up a set of guidelines to help in selecting appropriate approaches for communication in an IIS.

8.6.1 What Is the Task at Hand?

The most important factor to consider is what activity is being undertaken in the IIS. A scheme for classifying different types of activity involves the activity itself and two parameters describing it. The first parameter describes whether the activity naturally occurs in real, physical space or in a virtual, constructed space, and can take the value real or virtual. The second parameter concerns the information that will be shared between participants in the activity, and again takes the values real and virtual, depending on whether the information is real (e.g. Word documents) or virtual (e.g. a CAD model of a physical object). We give examples for each of the four resulting application areas and show how virtual conferencing can best be used to support them.

Activity(Real, Real)

An example of this kind of activity is a project meeting using desktop conferencing that supports a shared workspace. Returning once more to the system shown in Figure 8.3, video and audio provide the necessary communication between participants, and shared editing and browsing tools allow web pages and office documents to be shared between participants. There is no abstraction in the virtual conference; real-life meeting characteristics and properties map directly onto the virtual conference.

Activity(Real, Virtual)

An example of this kind of activity would be a system similar to the one described above, but where support is provided for sharing more abstract information between participants, rather than simply documents. The DING project (Törlind et al., 1999) is a good example of such an environment, where a videoconference system is augmented with a shared 3D virtual world in which participants are embodied and can act on shared 3D CAD models.

Activity(Virtual, Real)

This kind of activity would take place in a shared virtual environment, but the information shared in the environment would be real. The Web Planetarium (Chapter 2) is a good example of an IIS that supports this kind of activity. The shared 3D environment is provided by DIVE, but the information inside is real web pages.

Activity(Virtual, Virtual)

Finally, truly virtual activities inside an IIS would involve things like shared browsing of information visualisations or similar activities.
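
The classification above can be sketched compactly in code. This is a minimal, illustrative encoding only; the `Mode` and `Activity` names are hypothetical and do not come from any system described in this chapter.

```python
from dataclasses import dataclass
from enum import Enum

class Mode(Enum):
    REAL = "real"
    VIRTUAL = "virtual"

@dataclass(frozen=True)
class Activity:
    space: Mode        # does the activity naturally occur in real or virtual space?
    information: Mode  # is the information shared between participants real or virtual?

# The chapter's example for each of the four resulting quadrants.
EXAMPLES = {
    Activity(Mode.REAL, Mode.REAL):
        "desktop conference sharing web pages and office documents",
    Activity(Mode.REAL, Mode.VIRTUAL):
        "videoconference augmented with shared 3D CAD models (DING)",
    Activity(Mode.VIRTUAL, Mode.REAL):
        "shared 3D world containing real web pages (Web Planetarium)",
    Activity(Mode.VIRTUAL, Mode.VIRTUAL):
        "shared browsing of information visualisations",
}
```

Because `Activity` is a frozen dataclass it is hashable, so the four combinations can serve directly as dictionary keys when mapping activity types to recommended conferencing support.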

8.6.2 Communication Media

Above we gave examples of different types of real and virtual activities. What we did not explicitly mention was the possibilities for communication inside these IISs. Audio is arguably the most useful form of real-time communication in an IIS, and is equally applicable to any of the scenarios listed above. The one drawback with audio communication is the way it suffers from interference through dropped audio packets or poor availability of resources. Simple text messages are a very useful and very effective form of communication, supporting both synchronous and asynchronous communication, and providing an explicit history mechanism in the process. Video imagery is useful, and the more realistic the image the better the sense of presence will be, but video can normally be given a lower priority than the other media unless there is an explicit need for it in the activity.
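
The prioritisation implied above (audio first, text as a cheap fallback, video only on explicit need) can be sketched as a simple selection rule. The bandwidth costs below are illustrative assumptions, not figures from the chapter:

```python
# Assumed per-stream costs in kbit/s -- illustrative values only.
MEDIA_COST_KBPS = {"audio": 64, "text": 1, "video": 384}

def select_media(budget_kbps, video_required=False):
    """Pick media for an IIS session: audio first, then text,
    with video included only when the activity explicitly needs it."""
    chosen = []
    for medium in ("audio", "text", "video"):
        if medium == "video" and not video_required:
            continue  # video gets lower priority unless explicitly required
        if MEDIA_COST_KBPS[medium] <= budget_kbps:
            chosen.append(medium)
            budget_kbps -= MEDIA_COST_KBPS[medium]
    return chosen
```

On a constrained link the rule degrades gracefully: a 16 kbit/s budget yields text only, while a broadband link with an explicit need for video yields all three media.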

8.6.3 Infrastructural Support

In order to support high quality audio and/or video conferencing, high network bandwidths are required. If such networks are not available then simpler forms of communication need to be used, such as typed text messages or lower quality encodings for the media streams. However, given the increasing availability of network infrastructures and the bandwidths they support, it is better in the long run to make design choices based on the infrastructures of the future (e.g. Smile! – Johansson, 1998) rather than to work around problems that will soon cease to be problems, leaving no chance to examine the genuinely interesting issues.

8.7 Final Remarks

There is no doubt that virtual conferencing has a big part to play in future business and leisure scenarios and applications. As interfaces to the computer improve, and the computer as an artefact disappears, replaced by intelligent devices with built-in computational and communication facilities, we will no longer think of running virtual conferencing applications or configuring infrastructures for communication; we will simply take part in virtual meetings as if they were happening in reality. Of course, it will take some time before an experience of actually "being there" with the other person is possible, but using conferencing techniques in much the same way that the telephone is used today would represent a significant step forward.


Communicating in an IIS: Virtual Conferencing


9 Getting the Picture: Enhancing Avatar Representations in Collaborative Virtual Environments

Mike Fraser, Jon Hindmarsh, Steve Benford and Christian Heath

9.1 Introduction

Inhabited Information Spaces (IISs) can be either realistic or abstract. Realistic IIS systems presenting 3D graphics are usually termed "Collaborative Virtual Environments" (CVEs). However, difficulties encountered by designers of abstract spaces such as Populated Information Terrains (PITs) still arise in "realistic" CVEs. Notably, both abstract and quasi-realistic information spaces suffer the problem of supporting "natural" interaction in an unnatural world. Users of abstract spaces encounter difficulties because interactions in an unfamiliar world must be learnt. Users of CVEs encounter difficulties because interactions in a familiar world are assumed, where many remain unsupported.

In this chapter, we focus on the realism of inhabitant representations. The use of "avatars" is a fundamental part of CVE design. Users are represented in the 3D environment so that they can see their own representation and also see one another (Fraser et al., 1999). CVE systems usually take the approach of providing "realistic" avatars (e.g. Guye-Vuillème et al., 1999; Salem and Earle, 2000). Designers mirror human appearance and behaviour as closely as possible, even though there is some debate about how close designers are to creating a virtual experience that is "truly realistic" (e.g. Brooks, 1999). Visual representation and animation of human-like 3D figures has made great advances over the past few years. Avatars can look and move like realistic humanoid figures more than ever (e.g. Badler et al., 1999; Faloutsos et al., 2001). This approach has pervaded the representation of people in CVEs, and has commonly meant that CVE avatars are, at the least, pseudo-humanoid. Actions that can be performed at the interface are represented through the use of a human-like metaphor or model.

However, problems with the use of realistic representations remain. Rendering realistic 3D graphics must take into account constraints of the system such as graphics processing speed, network bandwidth and so on. Another often-overlooked factor, however, is that the user's control and perception are not completely realistic. For example, a human-like figure can suggest realistic perceptual and movement capabilities to other users. These capabilities are simply not supported by current, or even prototype, display and tracking technologies. Nor is it likely that many problems, particularly the speed-of-light limitation on data transmission, will be removed over time (Brooks, 1999; Fraser et al., 2000).

Recent studies (Hindmarsh et al., 1998; Fraser et al., 1999; Valin et al., 2000) have shown how traditional approaches to representation in CVEs can cause problems for users. Communications technologies have frequently ignored the importance of the contents of the world to users. Designers have tended to focus on supporting face-to-face communication rather than providing the ability to refer to the environment that users share. This does not take into account all manner of interactions that we regularly rely on in co-present communication. In the case of CVEs, designers have tended to work around these kinds of communication by focusing on support for simple meetings or coarse informal interactions, rather than attempting to allow users to collaborate around objects or features of interest within the world. The limitations of field of view, speed of movement, and ease of gaze changes mean that other users' avatars are often off-screen, a key problem for coherently working together around shared objects. This makes it difficult to use pointing gestures and references, not only because movement can be slow, but also because it is hard to show (parts of) an object to someone whose position and/or view are unclear.

The focus on representational realism in 3D shared spaces contrasts directly with that of 2D shared spaces, for which usability of the interface has been the key design focus. Research in 2D groupware has produced a number of concepts to describe how users' views are represented to each other, which do not rely on realism of representation. In particular, approaches to handling multiple views often involve "relaxed WYSIWIS" (What You See Is What I See) (Steffik et al., 1987). In this approach, each user's view is outlined with a rectangle to indicate that view to the other user(s). As with CVEs, the use of independent viewpoints in relaxed WYSIWIS systems offers greater flexibility for individuals to act by providing adaptable views of the shared space (Gutwin and Greenberg, 1998).
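
The relaxed-WYSIWIS outline is easy to picture in code: each client publishes its view rectangle, and peers draw its corners over the shared workspace. A minimal sketch follows; the class and field names are hypothetical, not from any of the systems cited.

```python
from dataclasses import dataclass

@dataclass
class Viewport:
    """Axis-aligned view rectangle in shared 2D workspace coordinates."""
    cx: float      # centre x of the user's view
    cy: float      # centre y of the user's view
    width: float
    height: float

    def outline(self):
        """Corner points another client would draw to show this view."""
        hw, hh = self.width / 2, self.height / 2
        return [(self.cx - hw, self.cy - hh), (self.cx + hw, self.cy - hh),
                (self.cx + hw, self.cy + hh), (self.cx - hw, self.cy + hh)]

    def contains(self, x, y):
        """Is a shared object at (x, y) visible to this user?"""
        return (abs(x - self.cx) <= self.width / 2 and
                abs(y - self.cy) <= self.height / 2)
```

The `contains` test is what lets a peer judge, without asking, whether a referent falls inside someone else's outlined view, which is the same judgment the frustum technique later supports in 3D.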

This chapter presents an investigation of an alternative approach to realistic simulation. Removing the focus on realism, we aim to reconsider the representation of people in CVEs using an approach analogous to 2D groupware systems, by investigating how people use and interact with particular representational forms. We have begun to experiment with different representations of people in CVEs. Instead of simply providing a human-like avatar, we provide a representation that embodies the capabilities of the particular user's interface. We have adapted the use of an outlined field of view, as provided in relaxed WYSIWIS systems, for 3D environments. A user's view is graphically embodied in the virtual world as a semi-transparent frustum. The extent of the frustum matches the user's horizontal and vertical field of view.
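
Geometrically, the embodiment is the volume between the near and far planes of the user's viewing pyramid. The sketch below computes a frustum's eight corner points in eye space from the horizontal and vertical fields of view; it is an illustrative reconstruction under stated assumptions, not MASSIVE-2's actual code, and the parameter names are our own.

```python
import math

def frustum_corners(h_fov_deg, v_fov_deg, near, far):
    """Corner points of a view frustum in eye space (x right, y up,
    z forward). At distance z, the half-width and half-height of the
    visible cross-section are z * tan(fov / 2)."""
    corners = []
    for z in (near, far):
        half_w = z * math.tan(math.radians(h_fov_deg) / 2)
        half_h = z * math.tan(math.radians(v_fov_deg) / 2)
        for sx in (-1, 1):
            for sy in (-1, 1):
                corners.append((sx * half_w, sy * half_h, z))
    return corners  # 8 points: 4 on the near plane, then 4 on the far plane
```

Rendering these eight points as a semi-transparent solid (transformed by the avatar's position and orientation) yields the outlined view volume other inhabitants see.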

We then present an observational analysis of the use of this technique. Our data are drawn from recordings of pairs of users performing a design task. Analysis reveals instances of detailed co-ordination of talk and gestures, anticipation of problems, and examples of difficulties with occlusion of the environment. On the basis of this investigation, we outline considerations for design, as well as reflecting upon our approach more generally.

9.2 Method

There are a number of ways of representing views in virtual environments. The traditional method has been to provide an avatar: a human-like representation whose head and eyes presumably represent the view that the corresponding user sees. Approaches that are less realistic and more explicit might include representation on the virtual environment itself, perhaps through subtle highlights or shadows on objects in view. However, the increased graphical complexity of rendering lighting or shadows on a moment-by-moment basis, as the avatar moves around the environment, would severely compromise system performance. Additionally, while lights or shadows would be rendered on the graphical features of the world, they would not be visible in the intervening environment.

We have extended the MASSIVE-2 CVE system (Greenhalgh, 1999) to support a different form of representation. The field of view is revealed in the virtual environment by outlining the user's view frustum. This method provides information on the extent of the field of view, and also embeds this information onto and within the environment, connecting the avatar to the objects in view. Our approach can therefore make a viewpoint visible to others who might not be able to see the source of the view (i.e. the avatar itself). The outline consists of a semi-transparent frustum. This "view frustum" technique is shown in Figure 9.1. Note how the black hi-fi object is bisected by the frustum, allowing observers to determine that it borders the other's view. The use of semi-transparency explicitly highlights the viewing area while showing the objects in the field of view.

There are two further aspects to the interface used in our study. Distorted windows called "peripheral lenses" (Robertson et al., 1997; Fraser et al., 1999) are included at the sides of the main view. An example is shown in Figure 9.2. Our design of the view frustum does not display the bounds of the peripheral lenses, only the main view. Bounding the distorted areas could mislead others into believing that objects in the peripheral view are easily seen, when in fact objects are fairly difficult to see in distorted conditions. Additionally, a complicated representation would be required to bound both the distorted and undistorted views. Our impression was that this representation might prove too complex and have the effect of confusing an observer. Instead, the representation displays only the undistorted view.

Arms that can be stretched out into the environment are provided on the avatar: a conceptually similar approach to stretching the representation of the field of view. Participants could point by stretching a single arm out into the world. Picking up and moving objects was represented by both arms stretching to touch an object, and by that object turning into a wire-frame for the duration of the grasp. This design helps pointing and grasping actions to be visible across the environment.

Using these representations, we asked pairs of volunteers to perform a design task. They were told to collaboratively arrange furniture in a virtual room (pictured in Figure 9.1). This task was designed to investigate geographically distributed interaction and has been used to study CVEs and other communications technologies (Gaver et al., 1993; Hindmarsh et al., 1998; Heath et al., 2001). Participants were given conflicting opinions on the ultimate layout of the room to encourage discussions. However, we were not interested in the final outcome or success of the participants' design. Rather, our data selection focused on elements of the data in which participants specifically used or discussed the view frustum as part of their interaction. Our method of data collection closely mirrored that reported by Hindmarsh et al. (1998). We collected audio-video data of each participant's screen and of their physical environment, including what was being said through their microphone and heard through their headphones. We asked 16 participants to perform the task. Each pair took about an hour to finish designing the room. Our method of analysis has been to account for particular examples of interaction that are recurrent within the data we have collected.

Figure 9.1 User's "traditional" avatar (circled) with field of view made visible as a semi-transparent frustum.

9.3 Analysis

Our study reveals three key issues with regard to the use of the frustum representation, which are discussed in the following sections. The first section describes how users are able to intricately co-ordinate their talk and gestures, implicitly using the frustum to understand views of the virtual environment. The second section describes how users are able to anticipate and circumnavigate potential problems that others may have with the visibility of the shared environment. Finally, the third section describes how view frusta may occlude actions without that occlusion being obvious to the frustum's "owner".

9.3.1 Awareness and Co-ordination

The depiction of the user's field of view extends into the virtual environment to make it visible to other users. As a result, it is possible that another user's position and orientation can be understood without recourse to fast and intricate gaze movement, and indeed without recourse to the other's avatar. Consider Example A, which follows a long period of silence in which Fred and Harry¹ have been moving objects in separate parts of the virtual room. Fred's avatar is located behind Harry. Fred rotates so that they are both facing in approximately the same direction (i.e. Fred can see the back of Harry's avatar).

Figure 9.2 Screenshot from Harry's screen in Example A as he says "you're behind me". The left-hand edge of Fred's view frustum is in Harry's view; the right-hand edge of Fred's view frustum appears in the right of Harry's view.

Example A² (Audio Perspective³: Harry)

Harry: you’re behind meFred: yeh (.) I’m gonna grab the white chairs and put em round the tableHarry: oh go (on then)

At the initiation of the talk, Fred's avatar has not been in Harry's view for some time. However, Fred's frustum has intermittently appeared in Harry's view as it extends across the virtual environment. At the start of the example, the left edge of Fred's frustum is in Harry's main view. As Fred rotates his view to the left, the right edge of his frustum appears in Harry's right peripheral lens (see Figure 9.2). At this point Harry says, "you're behind me".

Harry uses Fred’s frustum as a successful resource in locating Fred andunderstanding where he is looking, without having to re-orient to findFred’s avatar. Fred and Harry can easily proceed with their course ofaction regarding the design task – grabbing and placing the white chairsaround the table. This example shows how the view frustum is used as aresource in locating others, even when their avatar is completely out ofview. Two key aspects of the example are worth noting. First, direct visualcontact does not need to be explicitly established. The frustum can beused to maintain awareness of the other participant’s movements overtime. At key moments such as in Example A, this resource can be used

123456789101112345678920111234567893011123456789401112345611

Inhabited Information Spaces

138

1 All participants in these trials have been called Fred and Harry for purposes ofanonymisation. Actual participants are different pairs of users.2 Examples show the turns at talk that participants have. Numbers in parenthe-ses are the length of pauses in talk. A single period in parentheses means a pauseof 0.2 seconds or less. Letters in parentheses show uncertainty by the transcriberin what is said at that point. Square brackets show overlaps in conversation.Italics shows emphasis of those syllables with volume. Colons show elonga-tion of the previous sound, the number of colons being proportional to the elongation.3 Small delays mean that the time at which words are said and heard differsslightly, as audio is transmitted across the local network. This, in turn, meansthat timings may differ if talk is transcribed from one participant’s perspectivecompared to the other’s perspective. In examples presented in this chapter, thereis minimal discernible difference.

Page 151: Inhabited Information Spaces: Living with your Data

in interaction. Secondly, the resources provided by the frustum oftenfeature very plainly in conversations between participants. Thus, it isperhaps what is not said that is the most interesting feature of thisexample. An implicit recognition of viewpoint is displayed with “you’rebehind me”. There is no need to explicitly discuss each other’s positionand orientation for the purposes at hand. This implicit recognition showsthat talk can be co-ordinated with the use of the frustum without detract-ing from the task of designing the virtual environment.

Example A illustrates how easily interaction with the other participant occurs without the requirement of seeing their traditional avatar representation. The extension of the view into the environment means that users have a resource for collaborating without having to explicitly talk and find out about each other's viewpoint, even when the avatar itself is not visible. However, the example is one in which understanding the other's view to pinpoint accuracy is not demanding for the activity at hand. The frustum simply provides an understanding of the location from which the other is seeing, and therefore a means of co-ordinating conversation with one another.

In addition, however, the frustum representation also gives the opportunity to accurately co-ordinate around "visual" features of interaction, such as pointing gestures. In Example B, Fred and Harry are discussing where to put the table and chairs. Harry has stretched an arm to point towards an area of the room (Figures 9.3a and 9.3b). Fred has rotated his view from looking directly at Harry's avatar, following Harry's pointing arm (Figures 9.3c and 9.3d).

Example B (Audio Perspective: Harry)

Fred: yeh i- i- i see where you mean (0.4) right in the corner yeh?
Harry: well somewhere over here cos they're not really (.) that much use for much else but some people could sit (and eat out)
Fred: yeh (1.0)
Harry: play a game or something (0.8) cos the chairs and the table go together

As Fred and Harry discuss the placement of the furniture, Fred rotates his view left, following Harry's arm to look at the table and chairs, and the proposed location. Figures 9.3c and 9.3d show Fred and Harry's respective views at this point. As Harry says "go together", Fred's frustum passes the end of Harry's pointing arm. Harry seems to align to the frustum moving past his pointing gesture. He releases his mouse button and drops his arm. Compare their views again in Figures 9.3e and 9.3f.

The most noticeable aspect from Harry's perspective is the movement of Fred's frustum along his pointing arm. The edge of Fred's frustum progresses along the outstretched arm. The arm is slowly "revealed", as it is progressively less obscured by the frustum. As soon as his arm is wholly revealed, Harry drops his pointing gesture. He is able to co-ordinate the production of his gesture with whether Fred can see it. The detailed timing of holding and dropping the gesture relies on the frustum representation to indicate how, whether and when Fred can see the point. This allows the intricate co-ordination of their talk with the gesture.

Again, as with Example A, the participants' talk remains focused on the design task. Co-ordination of talk and gesture occurs without the need to explicitly talk about how to achieve those activities with the system. In other words, the use of a frustum representation can allow the participants to focus on design work rather than on the technology itself.

9.3.2 Anticipation

So far, we have seen how collaboration can be achieved by virtue of “knowing” what the other can see. The frustum allows gestures to beproduced in conjunction with understanding how those gestures areseen. Participants use the representation to design their actions for their

123456789101112345678920111234567893011123456789401112345611

Inhabited Information Spaces

140

Figure 9.3 Screenshots from Example B: (a) Fred’s view: begins to follow Harry’s gesture,rotating left; (b) Harry’s view: points to “somewhere over here”; (c) Fred’s view: rotates left,following Harry’s arm; (d) Harry’s view: his arm is still just inside Fred’s frustum; (e) Fred’sview: Harry’s pointing arm has dropped; (f) Harry’s view: Frustum passes Harry’s arm andhe drops the point.

e f

c d

a b

Page 153: Inhabited Information Spaces: Living with your Data

visibility by others. However, as the CVE interface does not provide rapid gaze movement, there are also many cases where the frustum makes it obvious that things will not be seen. In this case, the frustum allows a course of action that might be called “anticipation”. Participants can circumnavigate potential situations in which others are not able to see relevant features of the environment. This aspect of frustum use is shown in Example C. Harry begins by asking which of two large chairs can be disposed of. Fred is busy trying to place the television in the corner (he is finding this a tricky operation).

Example C (Audio Perspective: Harry)

Harry: which do you think is a- best of these two chairs the big three seater or the (0.3) big one comfy one (0.6)

Fred: (errl- l- let me) wait there (trying to) put the TV there

Harry: cos I don’t think we want this umm (0.3) this big seat that I’m moving now (.) I’ll move it into yer view so you can see it (0.7) oh: (.) sort of (1.0)

Fred: hhh

Harry: do you think we need this? (0.5)

Fred: need what

Harry: this- this one I’m selecting (0.4) its in yer view hhh

Fred: well just leave it out (of) the way for the time being (0.5)

Harry: well no do you think we need it at all (0.3)

Fred: we could do (0.3)

This example shows how participants can use the frustum to circumnavigate potential problems in which a referent might not be seen. Harry can see the chair is not in Fred’s main view, as it is not within the bounds of his frustum (Figure 9.4a). Therefore, in order to show the chair to Fred, he moves it into his frustum, so that they can discuss whether it can be disposed of (Figure 9.4c). The frustum allows Harry to anticipate problems with pointing the chair out to Fred, and instead to follow a course of action to ensure Fred can see which chair the “comfy one” is, before discussing its use.
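The visibility reasoning Harry performs here (checking whether the chair falls inside the rendered bounds of Fred’s frustum) amounts to a point-in-frustum containment test. The sketch below is our own illustrative reconstruction, not code from the system studied; all function and parameter names are assumptions.

```python
import math

def _sub(a, b): return (a[0] - b[0], a[1] - b[1], a[2] - b[2])
def _dot(a, b): return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]
def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])
def _unit(a):
    n = math.sqrt(_dot(a, a))
    return (a[0] / n, a[1] / n, a[2] / n)

def in_frustum(point, eye, forward, up, h_fov, v_fov, near, far):
    """True if `point` lies inside a symmetric view frustum (illustrative).

    The point is expressed in the viewer's camera frame and compared
    against the half-angles of the horizontal/vertical fields of view
    and the near/far clipping distances.
    """
    f = _unit(forward)
    u = _unit(up)
    r = _cross(f, u)                  # sign is irrelevant: abs() is used below
    rel = _sub(point, eye)
    z = _dot(rel, f)                  # depth along the view direction
    if not (near <= z <= far):
        return False
    x, y = _dot(rel, r), _dot(rel, u)
    return (abs(x) <= z * math.tan(h_fov / 2) and
            abs(y) <= z * math.tan(v_fov / 2))
```

On this view, “moving the chair into Fred’s frustum” is simply transforming the chair’s position until such a test succeeds for Fred’s eye point and view direction.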

Harry has some difficulties with moving the chair (displayed to Fred by saying “so you can see it (0.7) oh: (.) sort of (1.0)”). However, once the chair is visibly within Fred’s frustum, Harry re-states his question “do you think we need this?” He proceeds to select and deselect the chair by repeatedly grasping and letting go of it. This changes the appearance of the chair from filled-in polygons to wire-framed polygons. This representation is supposed to show that an object is being moved. Here, however, it seems as if Harry is making the chair “flash” in order to make it more visible to Fred. As Fred continues to ignore the question (and object), Harry’s voice becomes irritated, as he “knows” that the chair is now visible.

Harry’s talk in this example seems to show how the frustum confers a responsibility on its “owner” to attend to the object at hand. It is not



that the object must be viewed, but rather that there is a social, rather than technical, explanation for non-attendance. The possibility of technical reasons for not being able to see the object is reduced, because the frustum implies that what is contained within it can be seen. Harry seems to assume that technical problems with seeing things are no longer a potential “get-out clause” for ignoring the chair. The frustum renders its “owner” accountable for attending to the reference, once work has been carried out to establish that the potential referent can be seen.

When we see this example from Fred’s view (Figure 9.4d), however, the object and arm (circled) are indeed relatively difficult to see. There are representational issues which make the visibility of the object very different between Fred’s and Harry’s views. Specifically, the edge of Harry’s view extends between the chair and Fred’s location, and thus Fred’s view of the chair is obscured to some extent by Harry’s frustum. This is the issue to which we turn in the next section.

9.3.3 Occlusion

The previous examples have shown that the co-ordination and course of activities can be supported by the frustum. In effect, the frustum beneficially transforms the ways in which gestures and object grasps are accomplished. The last example, however, showed that the effect of rendering a representation of one’s own view on a co-participant’s view could not be seen. Despite this problem, participants are often wary of


Figure 9.4 Screenshots from Example C: (a) Harry’s view: at the start of the example; (b) Fred’s view: at the start of the example; (c) Harry’s view: holding the wire-framed “comfy one” inside Fred’s view frustum; (d) Fred’s view: Harry’s arms and the wire-framed chair (outlined) are obscured from Fred by his view frustum.



blocking each other’s line of sight. In this particular CVE, sight blocking is especially relevant for a number of reasons:

● there is no “solidity” programmed into the environment and thus participants can walk through each other’s avatars;

● there is no haptic feedback that an avatar collision has occurred. For example, one participant can stand in front of another and reverse into them without realising; and

● it can be difficult to quickly glance around to check another’s line of sight when their frustum is completely or partially out of view. For instance, having one side of the frustum in view may not indicate which side it is, leaving open vastly different possibilities as to where the other participant may be looking.

Example D shows how, as a matter of course, Fred and Harry organise sight-blocking movements. Harry anticipates blocking Fred’s line of sight to the object he is moving (the “desk thing”), and notifies him of the possibility.

Example D (Audio Perspective: Fred)

Fred: I’m gonna move that desk thing (0.3) [out the way

Harry: [I’m just walking across your (0.4) line of sight (I’ll) be out the way in a minute

Harry is walking towards the edge of Fred’s frustum, near to his avatar. At this moment, he interrupts Fred before he has finished speaking. He talks in an apologetic tone, and in a way that implies his turn is an “aside” from the main business of talking about the desk thing. He says “I’m just walking across your” and then pauses for 0.4 seconds as he crosses Fred’s frustum, continuing with “line of sight”. Harry is able to provide an apology for moving across Fred’s line of sight, and obscuring the “desk thing”, by virtue of the frustum representation. Blocking Fred’s line of sight onto the “desk thing” is seen by Harry as something to warn him about.

However, despite the accountability of obscuring another user’s line of sight, sight blocking occurs more often than one might expect within the video data. This is because the frustum represents the exact edges of the participant’s view, and thus cannot be seen by that participant (the frustum is not even rendered on the local user’s view, as it would be invisible and would slow movement through unnecessary rendering). It is simply a representation to other users, and only the frusta representing the views of others can be seen. As a result, it can be difficult to determine the effect of moving your own view on someone else’s view. Example E illustrates how Fred and Harry encounter this problem. Fred rotates his view until it centres on Harry’s avatar. He then rotates to centre the desk in his view.



Example E (Audio Perspective: Fred)

Fred: what you d- d- what you doin with that desk (0.7)

Harry: puttin it near a pink chair again (0.4) will you stop blinding me with that (.) flashlight!

Fred: what flashlight?

Harry: your [vision

Fred: [oh

Fred’s perspective is that he moves so that his view shows Harry’s avatar and then the desk. His reply “what flashlight?” seems perfectly reasonable. On the other hand, when the example is viewed from Harry’s perspective, Fred’s frustum appears across his entire view just prior to his exclamation (compare Figures 9.5a and 9.5b). Perhaps more importantly, it bisects Harry’s view from the object that he is grasping. The effect of his movement on Harry’s world is unavailable to Fred. However, it makes definite changes to Harry’s view, to the extent that he asks Fred to “stop blinding” him. Participants can be unaware that their own movements are causing significant changes to what another sees.

The occlusion effect that a participant is having on another’s view may be particularly hard to understand. Fred and Harry are standing side on to each other, a common configuration for discussing the same object. It may be hard to imagine that one’s representation would affect another who is not directly in one’s line of sight. In order to blind a person with a flashlight, it makes sense that they would be facing you, not standing to one side.

Example D showed how obscuring with the avatar can be anticipated and accounted for. However, the view frustum is entirely invisible to its owner. Thus, it causes problems where apparently insignificant movements of a participant’s view can cause significant changes to a co-participant’s view. If we now return to Example C, it can be seen that, as Harry is selecting the chair (by grasping and releasing it to make it flash between wire-frame and solid rendering), the chair arguably becomes less visible to Fred behind Harry’s frustum (Figure 9.4d). Whether Fred is not seeing the chair on purpose is not an issue here.


Figure 9.5 Screenshots from Example E: (a) Harry’s view: “putting it near a pink chair again”; (b) Harry’s view: “stop blinding me with that flashlight”.



Rather, it is clear that one participant sees an object as highly visible when, in fact, it is barely visible to the other participant.
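The asymmetry running through this section (each participant sees everyone’s frustum but their own) corresponds to a simple per-client filtering rule at render time. The sketch below is our own illustration of that design choice; the `User` class and function names are hypothetical, not drawn from the system studied.

```python
class User:
    """Minimal stand-in for a connected participant (illustrative only)."""
    def __init__(self, name, frustum):
        self.name = name
        self.frustum = frustum  # whatever frustum geometry the client would draw

def frusta_to_draw(local_user, all_users):
    """Select the view frusta a given client should render.

    A client draws the frusta of remote users only: the local user's own
    frustum would exactly coincide with the edges of their view (and so be
    invisible to them) while still costing render time.
    """
    return [u.frustum for u in all_users if u is not local_user]
```

The consequence described in the text follows directly: because your own frustum is never in the list your client draws, the effect of your movements on others’ views is unavailable to you.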

9.4 Summary

Our study suggests three key issues to consider when representing users’ views in CVEs.

1. Establishing locations and views. Talk about design between participants can be made possible by subtly co-ordinating references to objects or spaces. The explicit frustum representation enables this co-ordination of talk and gesture. Frusta, unlike traditional avatars, can give precise details about what can and cannot be seen by others. This allows simple cases to be assumed, such as whether an object is visible to another user. However, it also allows intricate co-ordination that is not possible with avatars: it offers the ability to design gestures for their visibility by others, and to observe gestures in such a way that the gesturing user knows their movements are potentially visible.

2. Anticipation of problems with visibility. The frustum representation does not comprehensively fix all problems with the visibility of the virtual world; it simply transforms the problem space encountered by users. However, one key feature of the frustum is that it can allow anticipation of such problems. In turn, it provides the necessary resources with which users can work around their issues. Thus, it is possible to insert simple sequences of action (e.g. moving an object into someone else’s view) to make features of the world visible to others.

3. Orientation to occlusion by the traditional avatar, but not by the frustum. The traditional human-like avatar representation is made relevant in interaction as a potential obscuring device. For example, users are very aware of blocking the line of sight on which another user may be acting or planning to act. Similar alertness is not generally apparent with the frustum representation, which often partially obscures others’ views.

9.5 Reflections

In this section, we review the implications of our study for the broader design of representations in shared 3D systems. First, some suggestions are derived for addressing problems of sharing viewpoints and other forms of 3D representation. Secondly, we discuss the principle of reciprocal perspectives, including a treatment of “out-of-body” camera approaches to shared applications. Thirdly, we consider the ways in which views of information spaces can be shared in “unrealistic” ways.



In particular, we describe some problems with the pervasive notion of “real-world metaphor”.

9.5.1 Scaleability

Although our study clearly shows some benefits of explicit view representations, we need to consider the scaleability of enhanced representations. We have provided explicit information about the user’s perception and actions by graphically embedding their properties within a representation. Such an explicit approach, however, can present challenges for scaleability. The more users we represent, the more visual information we must provide. We have observed problems with occlusion with just two users; multiple representations to support multiple users will most likely compound these difficulties. Providing so much information about so many activities could blind a user to the activities themselves. Additionally, consider our use of peripheral lenses (Robertson et al., 1997; Fraser et al., 1999) on the interface. Simply providing extra information about the extent of these distorted views would treble the number of graphical representations required.

Alternative representations of viewpoints are also likely to suffer from scaleability problems. For example, spatialised audio representations might be used to indicate locations or boundaries without occluding visual information. These sounds will soon compound, however, to “occlude” conversations between users. Another approach might be to allow users themselves to define their interest in explicit representations (Dyck and Gutwin, 2002). In this case, however, we need to ensure that the user controls themselves do not distract from the task at hand. Thus, we propose that representations of users (and, indeed, user controls) need to consider key contextual factors in order to address scaleability problems. Designers need to decide just what representations are appropriate, and when. These decisions about the context in which information should be presented must include the use of semantics and subtlety.

Semantically, we need to consider whether the choice of representation matches the application or task. For example, in the case presented in our study, the task involves decisions about objects and places. Perhaps our system could show which objects are being viewed at any particular moment in a way that fits the task itself? We might derive examples from studies of interior designers experimenting with scale models, observing how they look at and show objects to one another. In terms of subtlety, we would like the message conveyed to appear as part of the virtual task; to appear as a natural metaphor for conveying that kind of information. However, it seems clear that our definitions of the appearance of the representation, and of the context in which that representation is relevant, transform the ways in which that representation is used.



9.5.2 Reciprocity of Perspective

Schutz (1970) describes how in everyday life we tacitly assume that our individual perspectives are irrelevant for the collaborative task at hand; that everyone encounters objects in the same way, and that if you were me, this is what you would see. The assumption holds until evidence to the contrary appears.

This evidence arises frequently in cases involving occlusion with the view frustum. The assumptions that are made by participants regarding reciprocal perspectives are suddenly made problematic, and require work to overcome. This is not to deny that participants are extraordinary in their abilities to “interact their way out of troubles”. Rather, it is to say that the reality of CVE interaction is one where reciprocal perspectives are difficult to tacitly maintain, especially with regard to the mutual availability of features of the virtual environment.

In the case of our study, the frustum technique was provided to overcome problems of location and viewpoint confusion. However, rather than completely solving these problems, the frustum transforms the problem space that participants encounter in assuming the perspective of the other. The provision of the frustum makes it more difficult for participants to see the effect of moving their view on the world of the other. In Gutwin and Greenberg’s (1998) terms, “power” has been exchanged for “workspace awareness”. Providing awareness information about what you can see can make it harder for you to understand what others can see.

This raises important points about the provision of techniques to provide awareness in CVE systems. If we consider providing reciprocal visual awareness, we can broadly categorise CVE techniques into three kinds: camera viewpoints; exchanged viewpoints; and augmented viewpoints.

Camera Viewpoints

A common technique used in commercial and research CVEs is the use of out-of-body camera views. These generally take the form of a bird’s-eye view over the avatar, or an “over-the-shoulder” view to give a wider perspective on the environment, framing the scene or activity at hand. These techniques are commonly known as tethered viewpoints. Some computer games have implemented this technique with automated viewpoints in relation to the focus of activities within the “task”. For example, a martial arts game can be framed with both avatars “in shot”, or racing cars’ relative positioning shown. However, the success of tethered and automated views may be tied to the activity being supported. Racing in a driving simulation is quite a different pursuit from discussing objects with a colleague and making one’s actions intelligible. The problems that may be caused in such situations can, in any case, be incorporated into



a game as an additional challenge in winning the fight or race; participant problems can be useful additions for gaming. In a similar vein, activity-oriented techniques are suggested in this book for working with virtual cameras in real-time production activities (Chapter 10). Unlike the situation presented in our study, however, the viewpoint of the camera is also the locus of work in this case. In other words, viewing activities from different angles is part of the work of camera operators, and therefore the separation between bodies, objects and views is integral to the production process.4

The problems in “seeing what others can see” in our study suggest that difficulties could occur when using these camera techniques for co-ordinating real-time work. Whether participants are provided with the ability to change views with respect to their avatar, or whether the interface uses algorithms to change those perspectives, problems in interaction are likely to arise. Consider how difficulties in discerning the extent of another’s view through their representation can disrupt participants’ interaction. If camera views relative to the avatar are used to improve the user’s perceptual experience of the world, then the avatar becomes misleading for others not just with respect to their field of view, but also with respect to the very location from which the represented user is viewing the environment.
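A tethered “over-the-shoulder” viewpoint of the kind discussed above can be sketched as a camera pose derived from the avatar’s position and heading. This is a generic illustration of the technique, not any particular system’s implementation; the function name, offsets and coordinate conventions are our assumptions.

```python
import math

def over_shoulder_camera(avatar_pos, avatar_yaw, back=3.0, up=1.5):
    """Place a tethered camera behind and above an avatar, looking at it.

    `avatar_yaw` is the avatar's heading in radians in the xz-plane
    (0 = facing +z). Because the camera tracks the avatar rather than
    sitting at the avatar's eyes, the avatar no longer marks the point
    from which its user actually views the world.
    """
    dx, dz = math.sin(avatar_yaw), math.cos(avatar_yaw)  # unit heading vector
    cam_pos = (avatar_pos[0] - back * dx,
               avatar_pos[1] + up,
               avatar_pos[2] - back * dz)
    return cam_pos, avatar_pos  # camera position and look-at target
```

The gap between `cam_pos` and the avatar’s own position makes concrete the mismatch noted above: any representation other users see of the avatar misreports both the field of view and the location from which the scene is actually being viewed.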

Exchanged Viewpoints

Techniques have been outlined for CVEs in which different views of participants can be chosen at any moment, in order to support the exchange of reciprocal perspectives; in other words, the ability to actually adopt another user’s view of the 3D information space, to see what they are seeing (e.g. Valin et al., 2000). However, the change in metaphor between individually controlled views and shared views causes the representation of users again to become a critical issue. Just what does an avatar represent in cases where the individual-perspective metaphor is abandoned?

Augmented Viewpoints

The frustum technique we have presented here comes under an alternative category, “augmentation”, in which reciprocal perspectives are supported by augmenting the realistic appearance of the environment


4 Nonetheless, and for the same reasons we discuss in this chapter, the camera operators themselves may still need to be aware of each other, cf. Drozd et al. (2001).


with additional information about the “reality” of the user interface (cf. Fraser et al., 1999). Our study indicates that these kinds of technique are feasible, in the sense that participants are able to use the augmented information as part and parcel of their strategies for collaboration. Nonetheless, there are also ways in which the augmentation itself becomes part of the task collaboration.

All three techniques to support reciprocal perspectives may be useful in one situation or another. However, our study at least tentatively indicates that techniques of augmentation can support reciprocal perspectives well in situations of “closely coupled” collaborative work.

9.5.3 Unrealism

Although technologies to support virtual reality (VR) will continue to develop, it is likely that a range of obstacles will prevent an all-encompassing realistic experience. There remain severe difficulties in developing environments that satisfactorily support interaction between individuals within a convincing virtual world. Current haptic interfaces are crude, and expensive enough to be uncommon. Even though the realism of 3D graphics is rapidly advancing, interfaces to realistically capture movement and real-time expression for these illustrations lag behind. Network bandwidth (and the speed of light) means that different versions of events and causality are presented to different users. Yet, fundamentally, the success of the virtual reality programme as it stands depends on the success of the realism metaphor; it depends on the willing belief that one is sensing and acting in a computer-generated environment.

It may be worth reconsidering whether the successful design of VR applications rests upon our ability to completely simulate ordinary realities and physical worlds. This is particularly true of CVEs, where the nirvana of a “truly real” virtual world may cloud the fact that a communications technology exists which, however rudimentarily, has the potential to be deployed on currently available hardware. While the overriding interest in achieving a realistic sense of virtual presence continues, the shorter-term potential of CVEs to provide effective tools to support distributed collaborative work is being undermined.

Note that here we are not suggesting that shared spaces should necessarily present less realistic representations in favour of abstract information. Some might see the frustum representation as more appropriate for an abstract data space, in which coping with a realistic world is less of an issue. In some cases this may be true; however, we see this as an application-specific issue, to be considered in those terms alone. As outlined in the introduction to this chapter, whether abstract visualisations or real-world simulations are portrayed in CVEs, success will always depend on participants’ ability to collaboratively make sense of, and act



in, the environment. We would like to emphasise that our studies simply question the commonly assumed correlation between realism and usability.

Perhaps it is time to directly investigate the usability of CVEs rather than measuring their realism through sense-of-presence and co-presence metrics. We have shown that representations that might be termed unrealistic (in the sense of simulating human embodiments) can still, in fact, be used. These representations actually turn out to be realistic (or, rather, practical) in the sense that, unlike humanoid forms, they are a viable proposition for accomplishing collaborative tasks in virtual spaces.

9.6 Conclusions

The implication of the approach we have outlined here is that design should become far more pragmatic, rather than striving for a perfect representation of reality in information spaces. CVEs provide a technology that will currently support interaction between distributed participants and enable them to engage and co-operate in particular activities. Our study has shown that users are able to accomplish collaborative tasks in this way, given the right resources. Rather than assume that we need to simulate conventional realities, we have examined one way in which users can accomplish certain forms of collaborative work through CVEs. Instead of concealing the limitations of the technology through photorealistic portrayals, we have tried to provide individuals with a sense of the constraints of the CVE system.

Our approach has further implications for designers of abstract inhabited information spaces. The conclusion of our study must be that realistic portrayals of inhabitants should be tempered with representations that convey the in vivo characteristic properties of an interaction, rather than its “natural” properties. Unlike quasi-realistic approaches to inhabitant representation, therefore, designers might apply our strategy to abstract and quasi-realistic spaces alike.

Our study has shown some benefits and considerations of constructing representations that allow users to identify and deal with the characteristics of the system. We hope to have shown an alternative approach to designing information spaces, by shifting focus away from the general trend of hiding their properties and towards providing an enabling technology for remote collaboration.



10
New Ideas on Navigation and View Control Inspired by Cultural Applications

Kai-Mikael Jää-Aro and John Bowers

10.1 Introduction and Overview

In this chapter we describe some of the work which was conducted within the eRENA project of the EU’s Inhabited Information Spaces research activity. This project combined the expertise of partners from computing, social scientific, artistic and television backgrounds to investigate the application of advanced media technology in cultural settings. Our “inhabited information spaces” were “electronic arenas” for participation in cultural events of an entertaining or artistic nature. In this chapter we focus on how, in these application areas, we have rethought some familiar interaction issues in human–computer interaction (HCI) to offer the beginnings of an approach which may be useful in such complex settings.

Let us give a little more of the flavour of the application areas which concern us and the approach we have adopted. We organise this around five key themes.

1. Mixed reality technologies. All the demonstrators, applications and other developments within eRENA hovered on the border between the virtual and the real. While we were particularly interested in designing inhabited information spaces which were rendered with 3D graphical techniques familiar from virtual reality (VR) research, we were concerned to do so in ways which were sensitive to our final application settings (galleries, performance spaces, television studios). This required us to be concerned about how interaction with the system might be impacted by the real world, with all its interruptions, non-computer-mediated communication channels and physical limitations. Furthermore, as we shall see, some of our demonstrators directly explored physical, embedded interaction techniques in relation to VR.



2. Real-time applications. We were not developing offline methods. All the rendering, interaction, sound manipulation and other techniques we were working with had to give real-time results. To set ourselves tough test cases, we often worked with formats that allowed for a degree of improvisation on the part of performers or production crew.

3. Large-scale participation. eRENA applications went beyond the “one person, one computer” paradigm of classical human–computer interaction into interaction between crowds, over large virtual spaces and physically separated nodes. We were particularly interested in formats which enabled public participation, potentially on a large scale. This in turn set us challenges for formulating techniques that could be used with minimal training by a heterogeneous set of users.

4. Media-rich environments. Nor was interaction restricted to graphics presented on a computer screen and input through a keyboard; rather, whole-body interaction with physical props, display on multiple screens (and alternative projection technologies) and multi-channel audio were more commonly our concern.

5. Cultural context. The applications were intended for culture, art, performance and entertainment. We made a practice of regularly displaying our work to the public through relevant cultural outlets (television programmes, events at arts festivals, exhibitions, public performances). Again this set us challenges for delivering high-quality content appropriate to those settings and the expectations of audiences within them. It also gave us the opportunity to flexibly explore the roles of spectators, performers and producers – while obtaining direct feedback on the quality of our work.

10.1.1 Challenges for Interaction Design

As we have already noted, such settings and applications set core challenges for interaction design. In this chapter we describe our contributions to addressing these challenges in three areas:

1. Navigation and view control. How does a user choose what to see of an interactive artistic installation or of a performance? How, within an interactive digital television show, does a participant’s view relate to the television audience’s view, and both of these to the views available to production and direction personnel? What ordering is given to how views are selected and scenes navigated between? As we shall see, in eRENA we experimented with a number of techniques for supporting navigation and view control – in particular, ones which extend the conventional notion of an avatar as indicating a locus of unitary view control.

2. World and experience design. How are “electronic arenas”, the “inhabited information spaces” of art and entertainment, to be designed?



What are their constituents? How are they assembled? In particular, how does the structure and organisation of such environments serve to make them arenas for experience, that is, places where certain incidents might be encountered or effects enjoyed?

3. Production management. How does one produce an experience and, if it is a large-scale experience, how does one co-ordinate the possibly very many people and computer processes involved in the production? What are the “behind the scenes” activities like, and how in turn might we offer support for them?

Over the course of our work on eRENA, we came to interrelate our answers to such questions rather closely. We began, for example, to develop world design techniques in tandem with navigation methods for such worlds, and did so in such a way as to facilitate real-time production work. We unfold some specific examples of this approach in this chapter, most effectively through the presentation of a series of cultural events realised in the project.

10.2 Interactive Performances

Many of the cultural events within the eRENA project were realised as interactive performances in one sense or another (see, e.g. Carion et al., 1998; Bowers et al., 1998a, 1998b; Benford et al., 1999c). The underlying ideas and themes of these spanned a wide spectrum, both aesthetically and technically. We will first look at the scope of work on interactive performances in general, and then look in detail at some specific examples which involved the improvisation of intermedia environments – that is, where the construction of graphical (and, in some cases, sonic) environments took place live within the performance itself.

Interactive performances can be classified along several dimensions: How tightly scripted are the performances? What can be affected by the interactor(s)? What is the number of performers? What is the relation between “performer” and “audience”? What is the overall physical setting of the performance (staged, promenade, public space . . .)?

Some examples of interactive performances within eRENA demonstrate these factors.

● CyberDance (Carion et al., 1998) consisted of a human dancer equipped with motion capture sensors and a set of realistically rendered computer-animated dancers. The virtual dancers would move in synchrony with the human dancer. The choreography was carefully planned beforehand, yet the behaviour of the computer-generated parts was created in real time, in response to the movements of the human dancer. The setting was one of a “traditional” staged dance performance, but where some of the performers were computer-generated; the audience were off-stage and not involved in the interaction.


● To the Unborn Gods (for a description, see Norman et al., 1998) was a “virtual reality opera”, where a human singer interacted with computer-generated figures. Their behaviour was indirectly affected by the singer’s actions, with an off-stage operator choosing suitable paths. Here too, the setting was very much in the way of a traditional staged music performance, but where some of the actors were screen-projected.

● Lightwork (Bowers et al., 1998c) was an intermedia performance where two human musicians interacted with abstract computer-generated forms generated in the real time of the performance. The graphics were created through the actions of one player and responded to by the other. An audience faced a performance area in a conventional manner. We will describe Lightwork in more depth below.

While the above three examples involve interactivity between performers and computer-realised material, they involve quite conventional stagings. In each example, the audience has a space associated with it and the performers have theirs (in the first two examples, this was provided by a theatre stage). Other eRENA performances involved rather more flexible relations between the performance space, audience and performers.

● Desert Rain (Benford et al., 1999c; Shaw et al., 2000; Rinman, 2002) featured a number of discrete spaces through which participants moved as they engaged with a game-like activity based upon events in the Gulf War. At a number of points, actors directly engaged with the participants, giving them instructions or “debriefing” them. At one striking moment in the production, each participant interacts with a virtual environment projected onto a “rain curtain” – a fine water spray which holds a back-projected image. At a key moment, an actor appears between the projector and the rain curtain and slowly moves towards a participant, casting a silhouette in the projected image. The actor passes through the rain curtain and gives a swipe card (used later in the performance) to the participant. Desert Rain manifests a complex set of relationships between participants, actors and production crew. Performances have not been realised in conventional theatre settings (the debut of Desert Rain was in a disused factory).

● The crowd behaviour simulation and animation system ViCrowd (Hoch et al., 1999b; Lee et al., 2000) was used in a number of performances/installations for interaction by groups of people with computer-animated figures. People entering the interaction space were tracked and their combined behaviour caused various scripted behaviours in the computer-generated figures. As the tracking was only position-based, movements in the space could be made arbitrarily complex for the benefit of the other humans present, but the virtual humans would react the same way regardless of how the triggering conditions were achieved.


Let us give a little more detail of one aspect of all this endeavour, where we explore interactive performances with an eye to opening up the scope for improvisation by the performers.

10.2.1 Lightwork

While being an artistic performance combining electro-acoustic music and the real-time construction of graphical environments, Lightwork also served as a “live” occasion to test many of the ideas for navigation and interaction in virtual environments proposed within the eRENA project – all the more so as the format of an improvised performance would sharply bring to the fore any problems with these ideas and their implementation. The design goals, intermingling technical and aesthetic issues, included the following:

● The use of a minimum of encumbering apparatus – rather, the adaptation of the standard tools that our performers would most typically use (i.e. electronic music control devices).

● A large element of improvisation in the performance – we wished to avoid the use of prepared world models and instead generate material algorithmically on the fly.

● The gestures of interaction with the environment being made public and a legible, meaningful part of the performance.

● The “compressing” and “decompressing” of interactive gestures, so that small causes may have large effects and vice versa – the challenge being to allow experimentation with this while still ensuring the legibility of performer activity.

● The creation of an “infinite collage”, where sound, graphics and text are recombined and perceivable from a variety of viewpoints – any particular realisation of Lightwork is just one of an indefinitely large number of possible ones.

This being a real-time situation where any interaction would have immediate and publicly visible effects, a classic direct manipulation approach to interaction was felt to be insufficient: it would both require the following of similar steps to achieve similar results, and allow small excursions from the intended path to lead to unintended and unrecoverable effects (in the sense of requiring explicit, publicly visible and time-consuming actions to undo). Rather, we followed an indirect principle of “algorithmically mediated interaction”. The idea was to interpret the actions of the performers as input parameter values to content generation algorithms that would create visually rich, animated graphics.

At the same time, the interaction methods were chosen so that all actions would result in meaningful effects – thus movement, while being adjustable, would happen along paths that were guaranteed to always have something interesting in view. In a sense the performers would be shaping material independently flowing by.

The initial sequence of the performance consisted of a 90-second fly-through past large squares displaying textures representing the various themes of the performance. Finally, the viewpoint entered the large sphere that was the container for the graphics during the rest of the performance. This sphere was texture-mapped with maps and satellite images that were automatically switched every 80 seconds.

The viewpoint orbited around the centre of this sphere on either of two types of path:

a. A cloverleaf pattern according to the function r = m + n × sin ωt, where r is the radius modulated by the sinusoidal function, n determines the maximum deviation from the base radius m and ω gives the period of oscillation. See Figure 10.1 for some samples of this function.

b. A back-and-forth motion through the centre of the sphere.
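For concreteness, the type a orbit is easy to sketch in code (a minimal reconstruction from the formula above; the function name, the planar parametrisation and the sampling are our own assumptions – the chapter gives only the formula):

```python
import math

def cloverleaf_position(t, m=10.0, n=10.0, omega=3.0):
    """Point on the 'cloverleaf' orbit r = m + n * sin(omega * t),
    sampled in the orbital plane. m is the base radius, n the maximum
    deviation from it and omega the oscillation parameter of Figure 10.1."""
    r = m + n * math.sin(omega * t)
    return (r * math.cos(t), r * math.sin(t))

# With omega = 3 the radius oscillates three times per revolution,
# tracing the three-lobed path of Figure 10.1(a).
points = [cloverleaf_position(2 * math.pi * k / 360) for k in range(360)]
```

An integer ω closes the path after a single revolution, which is consistent with the closed, lobed curves shown in Figure 10.1.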


Figure 10.1 Three parameter settings for the “clover leaf” path: (a) m = 10, n = 10, ω = 3; (b) m = 20, n = 10, ω = 4; (c) m = 30, n = 10, ω = 5.


The plane of motion was slowly tilted back and forth, in order to give as many viewpoints as possible – yet the angle of tilt was kept within ±π/4 radians in order to retain a sense of up and down. For the type a orbits, the view direction could be chosen to point either in the direction of motion or towards the centre; for the type b orbits the view could point either towards or away from the centre. The position of the viewpoint would also determine the spatialisation of the sounds. While this was still a geometrical navigation method, we had started moving towards a more target-centred approach, where the path was not under the direct control of a performer but rather the viewpoint control was mediated by analyses of performer input.

Within the sphere various objects could be created. One of the performers (V) could select between five form generation algorithms, with the parameters of the algorithms being set by means of analyses of performer input. Specifically:

● scaffold, a grid of “pipes” – cylinders or parallelepipeds. The colour of the pipes as well as their density and lengths could be varied.

● formModulator, a tessellated sphere, the vertices of which were displaced according to a frequency modulation algorithm. The same fundamental algorithm was used to create an immersiveForm, sculptureForm or cave, using different parameter ranges. These forms could then be animated, giving the impression of an organic, “breathing” shape.

● chamber, a cube with rotated and scaled cubes inserted in its surfaces.

● plenumbulator, simple rectangles with text or images, set at random positions and orientations.

● orbiting forms, groups of textures and text strings orbiting around a common centre.

All shapes except scaffold were texture-mapped with images selected from one of 13 themes. Each new object would step to the next image theme.

Only one of each kind of object could be present at any time, but any combination of them could be displayed simultaneously.

V could create and destroy objects, as well as determine the type of path (a or b orbits, above), by pressing foot switches. V’s activity was analysed by a program we called “The Interactive Narrative Machine”, which computed magnitude and irregularity parameters over various time-scales. These (indirectly) determined m and n, the view control parameters, while also setting parameter values for the form generation algorithms and affecting the shape and motion of the created objects. For a full overview of the set-up, see Figure 10.2.
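The internals of the Interactive Narrative Machine are not given here, but the general scheme – reducing a stream of performer input to magnitude and irregularity measures over a time window, then scaling these into safe parameter ranges – can be sketched roughly as follows (class and function names, the window size and the particular statistics are our assumptions, not the original Max implementation):

```python
from collections import deque
from statistics import mean, stdev

class WindowedAnalyser:
    """Illustrative stand-in for one time-scale of the analysis:
    keeps the most recent performer input values and summarises them."""

    def __init__(self, window=50):
        self.samples = deque(maxlen=window)

    def feed(self, value):
        self.samples.append(value)

    def magnitude(self):
        # overall level of activity in the window
        return mean(abs(v) for v in self.samples) if self.samples else 0.0

    def irregularity(self):
        # spread of activity; near zero for steady playing
        return stdev(self.samples) if len(self.samples) > 1 else 0.0

def to_range(x, lo, hi, x_max=1.0):
    """Clamp-and-scale an analysis result into a parameter range,
    e.g. so that orbit radii stay inside the enveloping sphere."""
    x = max(0.0, min(x, x_max))
    return lo + (hi - lo) * (x / x_max)

analyser = WindowedAnalyser(window=8)
for v in (0.2, 0.4, 0.2, 0.4):
    analyser.feed(v)
m_param = to_range(analyser.magnitude(), 10.0, 30.0)  # indirect control of m
```

The clamping in `to_range` corresponds to the a priori calibration discussed below: whatever the performer does, the derived parameters stay within an intended range.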

At the end of the 15-minute performance, the viewpoint automatically retraced its path through the initial textured screens.


The sound was also created and mixed using algorithmically mediated interaction methods. There was a large set of prepared sounds, both physically modelled and sampled, that could be mixed together live – see Bowers et al. (1998c) for more details.

Performance Experience

The first performance was realised in December 1997 at the Royal Institute of Technology in Stockholm and revealed a number of difficulties with Lightwork as initially conceived. While the quality of the graphical images (cf. Figure 10.3) and the electro-acoustic sounds was admired, the indirect connections between elements often made it difficult to understand what was going on. Our techniques of “algorithmically mediated interaction” involved the analysis of performer activity in a number of time windows. Effectively, this introduces a smoothing and a delay into the relation between V’s activity and its graphical consequences. As the worlds were quite complex and the analyses of performer activity had several different consequences, this made it hard for the audience (and performers for that matter) to pick up on the interactive relationships in the work. V found himself exaggerating his pressing of the foot switches which introduced new world contents or removed objects which were present, as this gesture had a more immediate and legible effect. In short, while our methods of analysing performer activity and indirectly using this to parameterise world building and navigation algorithms worked in a technical sense, the degree of indirection in interactivity was excessive for enabling the audience and performers to track what was happening and why.

Figure 10.2 The connections between the units. The data from the wind instrument and foot switches are passed to the Interactive Narrative Machine, a Max program on an Apple Macintosh, making analyses of the behaviour of V over three different time periods, compressing these into two values, fed to T’s SGI O2, which runs the visualisation software. The foot switches determine what objects are to be created at the centre of the sphere as well as which type of orbit is to be used, while the data values generated from the wind instrument set the parameters for object creation as well as for the orbits. The spatial position data are forwarded to the Mac of S, where SO-2, yet another Max program, uses these to spatialise the music improvised by S.

In Lightwork there were multiple mappings requiring calibration. Performance data needed to be measured in different time windows. The results of these analyses needed to be mapped to parameter values for the world building and navigation algorithms. In principle, we could vary how the performance data were measured and scaled, the number and size of the time windows, the functions which mapped analysis results to parameter values and so forth. Ideally, one would wish to explore many different alternative mapping functions under different performance conditions. We were careful to set the ranges of values so that (for example) generated graphical objects would all be placed within the enveloping sphere and that the navigation paths would never be out of an intended range. That is, a number of calibration issues could be settled a priori. However, this alone does not guarantee that the range of worlds that will be generated will be appropriate and that the views obtained of them will be interesting. This requires an empirical approach to calibration and, before the performance deadline, we were not able to deal with all these matters fully. Accordingly, it was left to performers to troubleshoot calibration problems in performance itself. For example, V felt forced to adopt a playing style that would be guaranteed to keep a reasonable path, but which was less expressive than desired. However, our indirect, algorithmically mediated approach to interaction did not always give performers adequate resources to troubleshoot such problems. For example, rather more often than one would wish, the projected view was uninteresting and could not be swiftly adjusted.

Figure 10.3 A chamber in Lightwork.

10.2.2 Blink

Continuing the ideas from Lightwork, the next performance was Blink (Bowers and Jää-Aro, 1999). Many of the elements of Lightwork were retained, but expanded with new graphical themes and new texts (including randomly shuffled three-line poems which could appear superimposed upon 3D image material). Importantly, we introduced some new interaction and navigation ideas to avoid some of the problems encountered earlier. In particular, the viewpoint in virtual space could be attached to any of three cameras moving within the space. In this way the risk that nothing interesting would be visible was minimised. Cutting between views was supported with a direct interaction technique – so that “degenerate” views could be cut away from there and then. In short, Blink complemented our ideas of indirect, algorithmic interaction (e.g. to generate world materials and to control some features of camera movement) with direct interaction (e.g. to cut between views and in other ways override the algorithms if necessary).

This notion of navigation by cutting reveals the influence of a cinematic or television metaphor for view control and indeed, for us, was a direct result of our participation in the television-oriented applications of virtual environments which we will describe later. In Blink we defined three cameras which would tend to produce related but different views (again by analogy to a cinema–television convention of giving different camera operators different responsibilities). Two of the cameras circled the centre of the virtual environment, one near, the other further out, while an “explorer” would move radially from the centre to random points on the periphery and back, giving a further variety of views. The images from one of these cameras could be selected for “transmission” or “TX” in the vocabulary of television. Another virtual camera could be selected for “preview” without altering the TX view. In this way, shots could be prepared and meaningful edits made from one camera to another. (Ideally, perhaps, all three cameras should have their images continually available, with TX selection being a matter of just choosing the camera required. With the graphics hardware available to us, however, we found that computing four views – the cameras plus TX – was not feasible at the resolution and frame rate required.)
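The preview/TX arrangement amounts to a small piece of selection state, which might be sketched like this (the class and method names are our invention; only the three camera roles and the preview/TX distinction come from the text):

```python
class VisionMixer:
    """Sketch of Blink's preview/TX selection between the three
    virtual cameras described above."""

    CAMERAS = ("near orbit", "far orbit", "explorer")

    def __init__(self):
        self.preview = "near orbit"  # shot being lined up, unseen by the audience
        self.tx = "near orbit"       # shot currently on the public screens

    def set_preview(self, camera):
        # browse candidate shots without altering the transmitted view
        if camera in self.CAMERAS:
            self.preview = camera

    def cut(self):
        # commit the previewed shot to transmission
        self.tx = self.preview

mixer = VisionMixer()
mixer.set_preview("explorer")  # line up the explorer camera's view
mixer.cut()                    # edit: the explorer view goes out on TX
```

Separating preview from TX is what lets an operator search the view space privately and commit only deliberate edits – the property the Figure 10.4 interface provides with its “Set TX” button.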

Furthermore, based on experiences both from Lightwork and the Real Gestures, Virtual Environments workshop (Hirtes et al., 1999), it was decided to make the physical environment of Blink visually richer by using a large number of screens surrounding the audience, rather than having them as a backdrop for the performers.

Perhaps the biggest change was that instead of relying only on computed data for the input, the software was now equipped with a user interface with manual overrides for all functions, so that technical problems, bad parameter settings and so forth could be bypassed during the performance. See Figure 10.4 for a view of the interface.

Blink was performed as part of the Digital Clubbing event at the Now98 arts festival in Nottingham, England in October 1998. The venue was The Bomb, a nightclub in the city. On a small stage at The Bomb, 4hero performed in collaboration with Carl Craig, who was located in his studio in Detroit, through a video and data link. Two graphics operators were seated at separate SGI O2 workstations, each running the Blink software, though unsynchronised with each other. Twenty-one monitors were distributed over the premises in seven groups of three. In each group, one monitor showed the live transatlantic video link into Carl Craig’s studio while the others displayed images from the two workstations. In addition a large screen was placed adjacent to 4hero’s performance area, showing a video mix of these three sources. The groups were positioned so that the audience would typically have a line of sight to at least one set of monitors no matter where in the somewhat cavernous environment of The Bomb they were located.

The initial intention was that Carl Craig and 4hero were to exchange MIDI data with each other, and thus interact with each other’s music during the performance. The combined MIDI stream was to be analysed to yield input parameters to the graphics generation algorithms. The analysis was simpler than the one used for Lightwork in order to make the relation between music and objects more perceivable. In the event, the MIDI connection turned out to be unreliable and it was necessary to use the manual overrides throughout to set the graphics parameters. The graphics operators thus worked out a schedule in which they would gradually exercise the abilities of the software and independently, but in a generally co-ordinated way, increase the tempo of cutting and the complexity of objects during the progress of the performance. In this way, the dynamics of the performance of the Blink software approximately followed the dynamical shape of 4hero and Carl Craig’s musical performance.

Figure 10.4 The manual interface to Blink. At the top are a number of buttons, sliders and selection lists that define all the available functions and their parameters, and in addition the buttons that determine what views will be shown. The buttons marked “Near orbit”, “Far orbit” and “Explorer orbit” determine which camera view will be shown; “Left”, “Front”, “Right” and “Back” determine which direction the camera will look towards. “Set TX” will then actually put this image on the public screens. In this manner the view space can be explored to find an optimal shot without the audience being subject to the browsing. Texts and images can also be pasted onto the “visor”, i.e. placed right in front of the camera and moving along with it. These can be tested separately (using the buttons at the lower right) and when a suitable combination has been found, it can be sent to the TX screen.

Performance Experience

The audience reception of the event was extremely enthusiastic. This was of course due in large part to the participation of (in their field) popular and well-known musicians, yet the graphical work (see Figure 10.5) seemed to add to the event and make it even more exciting.


Figure 10.5 The TX window in Blink. We see a scaffolding and within it an immersive form. We can also see a box-like object, half hidden behind the scaffolding; this is one of the other cameras in the environment.


There were, however, problems with the system, mainly concerned with the pace of interaction and cutting. A mouse-based interface with many on-screen buttons, of which several had to be pressed in a particular sequence, was not conducive to the rapid cutting usually associated with (for example) the music videos which typically accompany the kind of music performed. The synchronisation of the “preview” window and the TX view was only accurate to within a second or so, so the actual transmitted view would tend to be slightly different from the one intended. This made it hard to perform closely co-ordinated edits (e.g. where there is a cut from one view to another at the moment when the camera collides with an object).

Overall, we felt our concept of multiple virtual cameras, and of supporting live editing between them, was validated in Blink. The software operators rarely found that they had nothing interesting to cut to. The camera paths had been selected to contrast with each other while yielding occasional overlaps of material, permitting meaningful edits to suggest themselves. The interface to the Blink graphics software combined direct and indirect algorithmic techniques, enabling the operator to override inappropriate algorithmically generated material while making edits that could be synchronised with the ongoing music. Furthermore, the combination of directly actioned cuts with machine-controlled camera paths allowed operators to “buy time” for themselves: a camera could be left to explore the environment while the operator thought about what to do next.

10.3 Inhabited Television

A different interpretation of the concept of interactive performances is that of inhabited television (Benford et al., 1998, 1999b; Greenhalgh et al., 1999; Craven et al., 2000). Here the concern is (in some way) to hybridise some of the traditional concerns of television (e.g. broadcast to a potentially mass audience) with the interactive possibilities of online virtual environments. The sources we have just cited describe a number of experiments in inhabited television. In its fullest form, inhabited television manifests the following features:

● The set for the show is a shared virtual environment.

● The actors-performers-presenters as well as (at least some) audience members access the shared environment through a computer interface.

● The audience can take an active role in the unfolding events.

● Views from the virtual set are broadcast to those audience members who cannot be present in the virtual environment.

The persons involved in such a production can thus be divided into the following groups: performers, those who have been engaged to create the show – this can be taken to also include staff not visible in the environment but controlling it, such as directors, camera operators and support staff of various kinds; inhabitants, members of the public present in the virtual environment and able to interact with it; and viewers, who receive images and sound from the virtual environment, corresponding to the traditional television audience. Inhabited television, then, can be regarded as a contribution to research on the possibilities of digital television and allied new media – one which particularly emphasises public participation in broadcast events through (characteristically) activity in an “inhabitable” virtual environment. In what follows we develop the theme of this chapter by particularly highlighting the interaction and navigation innovations that have been explored in this application of VR technology. We do this by contrasting two inhabited television shows: “Heaven and Hell – Live” and “Out of this World”.

10.3.1 Heaven and Hell – Live

Heaven and Hell – Live was a game show with a theme based loosely on Dante’s Inferno. It was broadcast live on British television’s Channel 4 on 19 August 1997. The show was realised as a collaboration between three eRENA project partners: the University of Nottingham, Illuminations Television and British Telecom. Two celebrities and the programme host were the performers, and 135 members of the public who had received copies of the software client for the virtual environments (Sony’s Community Place) were the inhabitants. The two performers took part in a treasure hunt, an avatar stacking game, a quiz and a gambling game, the intention being that the inhabitants would help the contestants with their tasks. The level of public interest can be gauged by noting that not only the 135 inhabitants but an estimated 200,000 others stayed up in the middle of the night to watch the show.

While it was a great technical achievement, Heaven and Hell – Live was problematic as a television show. There were insufficient resources to do a full-scale test beforehand. Thus both performers and inhabitants were working through the game environments, controlling their avatars and performing their tasks as best they could, live. This resulted in the ostensible game content – rule enforcement, point counting and so on – often not being taken very seriously, as the task of working in the environment while trying to make adequate improvised television of one’s efforts became the performers’ primary concern. The loose relevance of the game format to the show was also noted by the inhabitants, many of whom tended not to take the game very seriously either. Furthermore, as the software only supported text-based communication (the performers in the studio could of course speak directly to each other), interaction between performers and inhabitants was slow. As a consequence the inhabitants tended to ignore the game and instead drifted away to chat with each other.


The properties of the interface also made for a strange mix of paces – while the actors were working hard at their respective machines, typing text and moving through the environment, the view for the television audience was quite a lot slower than normal for an entertainment programme.

Furthermore, the Community Place software placed constraints on the number of other avatars a user could be aware of at any one time. Thus, inhabitants did not have the impression they were part of a mass-participatory event and, in some cases, became suspicious about the authenticity of the occasion (e.g. if they failed to see themselves on television, in spite of being in the camera field). Similarly, for the television audience, the environment often looked quite desolate, in spite of there actually being well over a hundred people “on set” (see Figure 10.6). Whatever else Heaven and Hell – Live was, it was scarcely recognisable as the participatory game show it was intended to be.

Figure 10.6 A scene from Heaven and Hell – Live. Inhabitants and actors in a graveyard setting. Reproduced courtesy of Illuminations.

10.3.2 Out of This World

Out of this World was a determined attempt to avoid the problems encountered with Heaven and Hell – Live and bring off an event which was more recognisable in television terms. Again the game show format was followed, but various production and navigation facilities were added into the show’s software to encourage (or even, if necessary, mandate) appropriate participation. Out of this World was not a television broadcast but was shown to a theatre audience as part of the International Symposium on Electronic Art in Manchester, England in 1998. (Full descriptions of the technologies in the show can be found in Benford et al., 1999b, while Bowers, 2001 recounts the story of the behind-the-scenes production work.)

The game scenario was of two teams, one composed of robots, one of aliens, pitted against each other in a series of games on a doomed space station, the winning team being allowed to escape (see Figure 10.7).

Figure 10.7 The opening briefing of Out of this World. The robot team stands in the foreground with the alien team in the distance. On the video screen is the game show host and producer, John Wyver of Illuminations. Reproduced courtesy of Illuminations.

There were still two performers who were the main contestants, but in this case they used an immersive interface with motion tracking and were placed on stage either side of a projection screen showing a live edited TX view of the show. The inhabitants were eight volunteer members of the audience who were seated by workstations in a separate room, using a joystick interface to move in the environment. MASSIVE-2, the system used, allowed all inhabitants and performers to be connected by audio and be visible to each other.

Two specific applications had been built in the MASSIVE-2 system to help address earlier difficulties. A production support application allowed the show to be represented as a series of phases with varied capabilities assigned to avatars and objects on a per-phase basis. In this way, the movement capabilities of the avatars could be constrained so as to aid their participation in a game depending on, for example, whether groundplane movement only needed to be supported or whether the avatars could climb upwards. This allowed a single familiar interaction device (a joystick) to be used without the user pressing modifier buttons. At key moments, control could also be removed from the team members so as to enable the smooth movement of their avatars to a new location. For example, travellators took the avatars from one game environment to the next. The transition from one phase to another was actioned by a member of the production crew. This enabled reasoned decisions to be made about the pacing of the action in discussion with the show’s director. In addition, some special-purpose phases were defined to speed things along or cover technical failures without suspending the show as a live event.
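The per-phase capability scheme described above can be sketched roughly as follows. This is a hypothetical reconstruction in Python, not the MASSIVE-2 code; the phase names and capability flags are invented for illustration:

```python
from dataclasses import dataclass

@dataclass
class Capabilities:
    """Movement rights granted to an avatar during one show phase."""
    ground_motion: bool = False      # may move in the groundplane
    vertical_motion: bool = False    # may climb upwards
    under_own_control: bool = True   # False => production crew drives the avatar

# Each phase of the show maps avatars onto a capability set
# (phase names invented for illustration).
PHASES = {
    "briefing":    Capabilities(ground_motion=False),
    "maze_game":   Capabilities(ground_motion=True),
    "climb_game":  Capabilities(ground_motion=True, vertical_motion=True),
    "travellator": Capabilities(under_own_control=False),  # scripted transit
}

def joystick_to_motion(phase: str, dx: float, dy: float):
    """Map a raw 2-axis joystick deflection onto the motion the current
    phase permits, so no modifier buttons are needed."""
    caps = PHASES[phase]
    if not caps.under_own_control or not caps.ground_motion:
        return (0.0, 0.0, 0.0)
    # groundplane motion always; the vertical component is only
    # honoured in phases that allow climbing
    vz = dy if caps.vertical_motion else 0.0
    return (dx, dy, vz)
```

The same joystick thus yields different avatar conduct in different phases, and the production crew can revoke control entirely simply by switching phase.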

In addition, a virtual camera interface was developed to assist members of the production crew in obtaining meaningful views of the action. Shots could be targeted on specific avatars or objects. The distance from the target and movement around it could be manually controlled to permit various zooming and panning effects. Free navigation could be engaged to enable “roaming” shots but the cameras could always “snap” to a particular target if the operator received an instruction from the director or picked up on a cue from the action itself.

Out of this World was much more successful than Heaven and Hell – Live in realising an event which was recognisably a game show. The pace of the action was more consistent with what one would expect from such a piece of entertainment. The inhabitants were much more focused on the games within the show and their role in competing against the other team in playing them. In large part this was due to furnishing the production crew of the show with applications which were specifically designed to assist in production management and camera control. It is also important to emphasise, though, the role of the production crew’s professional competence in working with these technologies. For example, there are, in principle, many different ways in which a game show in a virtual environment could be shot and edited from the views offered by the four virtual camera operators. Not every way of working with the cameras or of selecting shots would have led to a recognisable production of a game show. Indeed, the director experimented notably with different sets of instructions for the camera operators and different principles for making edits over the course of her work on Out of this World (Bowers, 2001) before settling on a style and pace of shot composition and editing which seemed appropriate.

The main difficulties with Out of this World, as identified by audience members, centred on three problems:

● A lack of empathy with the show and its characters. It was hard for audience members to engage with the concerns of the robots and aliens. The cover story for the show was too thin.

● A lack of a sense of history and context. In particular, no effort was given to elaborating any sense of history for the characters, why they came to be on the space station, what their past relationships with each other might have been, and so forth.

● Objections to the game show format. Many audience members thought that the choice of a game show format was unambitious.

In more recent work on inhabited television, these problems have been directly addressed by embedding the broadcast within a whole series of preparatory activities which help elaborate a participatory narrative. Avatar Farm/Ages of Avatar prepared for a live Internet broadcast show by providing a series of online worlds and allowing a sense of “community” to develop over the course of several months. Characters were introduced into these online worlds and narrative fragments were suggested before four inhabitants were invited to participate in the live event itself (for details of this work, see Craven et al., 2000).

10.4 Production Management

Whether we are concerned with cultural events in an artistic or entertainment tradition, it is clear that some major challenges are presented to virtual and mixed reality research if those technologies are to be used to realise the events. In the events described so far, we have addressed particular issues concerned with world design, and navigation and view control. In Lightwork and Blink, we were designing virtual environments of a visually rich and (hopefully) engaging nature. However, we also needed to ensure that the worlds were presented in such a way that interesting views on them would be relatively easy to find. In other words, there was a reciprocal relation between world design and view control issues. Technically, we addressed this problem on geometrical and optical grounds. For example, we “layered” world content within a sphere in such a way that chambers could be created and viewed inside and out by switching between an inner and an outer orbiting path. Practically, we provided performers with a variety of resources for creating worlds and, in Blink, choosing between cameras. Our intention was to give performers a rich range of resources for world creation and navigation but to structure them in such a way as to make them usable within the real time of an improvised performance setting.


As in the research on inhabited television, we realised early on that it was impractical and unnecessary for performers, inhabitants or the operators of virtual cameras to have unconstrained six degrees of freedom (6DOF) motion when trying to frame shots in the environment. It was all too easy to overshoot the target or end up at an inconvenient location. To remedy this, more application-specific movement methods were developed for the later inhabited television productions to enable camera operators to focus on particular persons or objects and move around them without losing them from sight. These camera vehicles could, for example, constrain their motion to the surface of a sphere centred on the relevant object, thus simplifying movement to a 2DOF (plus zoom) operation. In Out of this World, the virtual cameras interworked with a production management application so that, for example, cameras could be moved to set locations at the beginning of a new phase or be able to take as the object of their shot a particular entity which might be important to the events in the current phase. In all these respects, production design and management is a matter of conjointly designing virtual environments and navigation control so as to simplify the real-time burden on participants in live performance.
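A camera vehicle of the kind just described – motion constrained to a sphere centred on a target, leaving only two degrees of freedom plus zoom – can be sketched as follows. This is a hypothetical Python reconstruction; the class and parameter names are ours, not those of the production systems:

```python
import math

class OrbitCamera:
    """Camera constrained to a sphere centred on a target object.
    The operator controls only azimuth, elevation and radius (zoom),
    so the target can never be lost from shot."""

    def __init__(self, target, radius=10.0):
        self.target = target      # (x, y, z) of the tracked object
        self.azimuth = 0.0        # radians around the vertical axis
        self.elevation = 0.0      # radians above the horizontal
        self.radius = radius      # zoom: distance from the target

    def move(self, d_azimuth, d_elevation, d_zoom=0.0):
        self.azimuth += d_azimuth
        # clamp elevation so the camera cannot flip over the pole
        self.elevation = max(-1.5, min(1.5, self.elevation + d_elevation))
        self.radius = max(1.0, self.radius + d_zoom)

    def position(self):
        """Cartesian camera position; the view direction is simply
        towards self.target, so framing is automatic."""
        tx, ty, tz = self.target
        cos_e = math.cos(self.elevation)
        return (tx + self.radius * cos_e * math.cos(self.azimuth),
                ty + self.radius * math.sin(self.elevation),
                tz + self.radius * cos_e * math.sin(self.azimuth))
```

However the operator moves, the camera stays exactly `radius` from the target, which is what reduces the framing task from 6DOF to 2DOF plus zoom.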

10.4.1 Finding and Framing the Action

We have already mentioned that the director of Out of this World experimented with a number of ways of organising the camera operators and selecting from the shots they gave. One of the reasons for this is that in television, as in cinema, shot composition is not just a matter of selecting the right target and framing it. One has to compose shots and edit between them to convey a sense of the action in the scene. Sometimes it is enough to select a particular target (e.g. an actor or avatar performing a critical and clearly legible action). At other times it is vital to show characters in relationship to each other, or in mutual relationship to objects in their environment, or to juxtapose one shot with another to capture the action. The problem of finding the action and framing it is even more acute when there is a large number of participants (e.g. in Heaven and Hell – Live). Clearly, this is a different order of view control problem from those classically discussed in 3D computer graphics or virtual reality. How can we find and frame the action? Is it possible to design view control and navigation techniques which directly support this requirement?

To begin to investigate these issues, we developed SVEA (Sonification and Visualisation for Electronic Arenas), a tool which enabled us to view the patterns of people moving in the space and to place cameras in the environment. The intent was that these cameras would have semi-autonomous behaviour: left to themselves they would roam the environment, finding the hotspots of activity, and frame a shot so that it would capture the group of actors at that point, attempting (depending on the settings) to maximise either the number of faces or profiles in view. It was also possible to add additional weight to specific performers, so that they would be “favoured” in view. A human operator could then take over to make any adjustments, or even move the camera somewhere else deemed more important at the time. We give further details of our techniques for camera deployment shortly.

Attempting to further enrich the impression one could have of the behaviour of a populous crowd, we experimented with sonifying a number of parameters, so that one would be able to hear, for example, the levels of aggregation. (Further details of our sonification strategies and a preliminary experimental test of the sound model we used can be found in Bowers et al., 1999.)

To find the potential areas of interest in an environment, we made a novel use of the Spatial Interaction Model (Benford et al., 1995b). The central ideas in this model are that in a virtual environment (potentially) every object has a focus and a nimbus, where the former is (approximately) an abstraction of an object’s “attention”, and the latter (approximately) an abstraction of an object’s “projection of presence”. Typically these are taken to be functions over space, but it is possible to extend the notion over time as well (Sandor et al., 1997). The combination of focus and nimbus can then be used to model an object’s awareness of another object: the awareness A has of B is some function of A’s focus on B and B’s nimbus on A. An ordinary language approximation of these notions might be: my awareness of you depends upon my level of attention to you and the degree to which you are making your presence felt. Benford et al. (1995b) and Sandor et al. (1997) show how (given certain assumptions) awareness levels can be quantified in continuous and graphed “spaces” respectively. These notions have been used in a number of systems for collaborative virtual environments to control, among other applications, the level and kind of detailing in rendering in shared environments.
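As an illustration only – the model itself does not fix particular functions – focus and nimbus can be taken as simple orientation- and distance-dependent fields, with awareness as their product. The fall-off shapes and the `fov` and `reach` parameters below are our assumptions:

```python
import math

def focus(observer_pos, observer_heading, target_pos, fov=math.pi / 2):
    """Illustrative focus: attention falls off with the angle between
    the observer's heading and the direction to the target."""
    dx = target_pos[0] - observer_pos[0]
    dy = target_pos[1] - observer_pos[1]
    angle = abs(math.atan2(dy, dx) - observer_heading)
    angle = min(angle, 2 * math.pi - angle)   # wrap to [0, pi]
    return max(0.0, 1.0 - angle / fov)        # 1 dead ahead, 0 outside fov

def nimbus(source_pos, point, reach=10.0):
    """Illustrative nimbus: projected presence decays linearly with
    distance, vanishing beyond `reach`."""
    d = math.dist(source_pos, point)
    return max(0.0, 1.0 - d / reach)

def awareness(a_pos, a_heading, b_pos):
    """Awareness A has of B: some function (here, the product) of
    A's focus on B and B's nimbus at A's location."""
    return focus(a_pos, a_heading, b_pos) * nimbus(b_pos, a_pos)
```

An avatar facing another at close range thus scores high awareness; one with its back turned scores zero, however close the other stands.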

To support camera control and view selection in virtual environments on the basis of finding and framing activity, we make use of the spatial model on the assumption that objects (most specifically avatars) in the world will be differentially aware of the various parts of the scene and will themselves orient towards things of current interest. In principle, we can compute various “activity landscapes” using focus, nimbus or calculated awareness. For example, we can determine, for every point in space, the sum of the nimbus contributed at that point by every object in the environment, giving a “nimbus landscape”. An awareness landscape could be determined in the corresponding way using the awareness level for every object of others at each point. A nimbus landscape gives high values at points where many objects are nearby; an awareness landscape gives high values at points where, in addition, many objects are focusing upon each other. Such landscapes can then be visualised as a “heat map” of the environment, where the levels in the activity landscape are displayed.

In SVEA, we simplified this notion in a number of ways. We visualised only avatars. As motion in these particular settings was taking place in a horizontal plane, we could justifiably ignore the avatars’ heights above the ground plane. We computed the heat maps for just the points where avatars were located. This allowed us to visualise the avatars as small triangles (showing position and orientation) in a 2D “overview” environment and colour them to show the landscape value at their location. Naturally, other visualisation techniques could be experimented with – see Hirtes et al. (1999) for discussion.
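A nimbus landscape of the simplified SVEA kind – evaluated only at the avatars' own 2D positions in the groundplane – might be computed like this. This is a hypothetical sketch; the linearly decaying nimbus and its `reach` parameter are our assumptions:

```python
import math

def nimbus_at(source, point, reach=10.0):
    """Illustrative nimbus: presence decays linearly to zero at `reach`."""
    return max(0.0, 1.0 - math.dist(source, point) / reach)

def nimbus_landscape(avatar_positions, reach=10.0):
    """Evaluate the nimbus landscape only at the avatars' own 2D
    positions (the SVEA simplification): each value is the summed
    nimbus that all *other* avatars contribute at that spot."""
    values = []
    for i, p in enumerate(avatar_positions):
        values.append(sum(nimbus_at(q, p, reach)
                          for j, q in enumerate(avatar_positions) if j != i))
    return values  # one heat-map value per avatar, ready for colouring
```

An avatar in the middle of a cluster receives the highest value, which is exactly the “many objects are nearby” reading of the nimbus landscape.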

A visualised activity landscape (map) could be used for navigation and view control in a number of ways. First, directly: a user could move to areas of interest on the activity map by directly interacting with it. That is, one might click on a location and be teleported to it. Alternatively, the map could be used to inform the use of conventional navigation controls. That is, one might see an area of interest on the map and move towards it using whatever one’s current navigation vehicle might be. Finally, one could use the activity map algorithmically. For example, a region of the map might be selected (e.g. by drawing around it) and a camera location and orientation computed which maximises the number of avatar “faces” or “profiles” in the region that are in shot. In Hirtes et al. (1999), we describe a number of algorithms which could do this and similar view computations given a selected set of objects to display in shot.
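One crude way to realise such a face-favouring computation – not the algorithms of Hirtes et al. (1999), merely an illustrative heuristic – is to place the camera in front of the selected avatars, along their mean facing direction:

```python
import math

def face_shot(avatars, distance=8.0):
    """Given (x, y, heading) tuples for a selected group of avatars,
    return a camera position and look-at point chosen so the group is
    viewed roughly head-on (an illustrative heuristic only)."""
    cx = sum(a[0] for a in avatars) / len(avatars)
    cy = sum(a[1] for a in avatars) / len(avatars)
    # mean heading via vector averaging (robust to angle wrap-around)
    hx = sum(math.cos(a[2]) for a in avatars)
    hy = sum(math.sin(a[2]) for a in avatars)
    mean = math.atan2(hy, hx)
    # stand `distance` in front of the group, looking back at its centroid
    cam = (cx + distance * math.cos(mean), cy + distance * math.sin(mean))
    return cam, (cx, cy)
```

Swapping the sign of `distance` would instead favour profiles-from-behind; a fuller treatment would weight each avatar by how squarely it faces the candidate camera position.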

To give another example of how an activity map can be used algorithmically, consider its use as a map of dynamical “potential” which would govern an autonomously determined camera movement. As an example, we implemented what we termed a “puppy camera”: a camera which would follow the gradient of awareness until reaching a local maximum and, if this point was stable, stay there for a while until the camera got “bored” and set off in a random direction to find a new local maximum – a behaviour analogous to that observed in puppy dogs the authors have encountered. This camera thus roams an environment seeking out interesting places and capturing views from these. We experimented with a population of four puppy cameras. In order to keep them from flocking to the same highly interesting event (something that was a concern even with human camera operators in the inhabited television shows we studied), the cameras could avoid each other by associating a force of repulsion with each of them. As a proof of principle, we used SVEA to present data captured during Heaven and Hell – Live (for more information, see Hirtes et al., 1999 and Figure 10.8).
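The puppy-camera behaviour can be sketched as hill-climbing on the activity field with mutual repulsion and a random wander when nothing pulls the camera anywhere. This is a hypothetical reconstruction; the step size, repulsion strength and boredom trigger are invented:

```python
import math
import random

def puppy_step(pos, activity, others, step=0.5, repulsion=2.0):
    """One update of a 'puppy camera': climb the activity gradient,
    nudged away from other cameras so the pack does not flock to the
    same event. `activity` is any callable (x, y) -> float."""
    x, y = pos
    eps = 1e-3  # finite-difference gradient of the activity field
    gx = (activity(x + eps, y) - activity(x - eps, y)) / (2 * eps)
    gy = (activity(x, y + eps) - activity(x, y - eps)) / (2 * eps)
    # add a repulsive push away from each of the other cameras
    for ox, oy in others:
        d = math.dist((x, y), (ox, oy)) or eps
        gx += repulsion * (x - ox) / d**3
        gy += repulsion * (y - oy) / d**3
    norm = math.hypot(gx, gy)
    if norm < 1e-6:
        # flat spot: the camera is 'bored' and wanders off randomly
        angle = random.uniform(0, 2 * math.pi)
        return (x + step * math.cos(angle), y + step * math.sin(angle))
    return (x + step * gx / norm, y + step * gy / norm)
```

Iterating `puppy_step` drives each camera towards a local activity maximum; in the full behaviour described above, a dwell timer would trigger the bored wander even at a stable maximum, not only on flat ground.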

10.4.2 The Round Table: A Physical Interface

To concretise our work with SVEA, we needed to take the application beyond a proof of principle and prototype a use scenario. In the work on inhabited television we have discussed, shot selection and vision mixing was done by the director using conventional television equipment, a video line entering a mixing desk from each workstation running virtual camera software. In our initial version of SVEA, shot selection and camera deployment were accomplished by mouse operations. While our camera algorithms worked very well, it was awkward to manually orient cameras in the environment with a mouse-driven interface. Furthermore, if the mouse was moved to select a new camera, time delays were introduced in comparison with the simple button press used to select an input channel at a conventional video mixer. The direct yet speedy manipulation of a SVEA visualisation to compute camera locations and select views seemed to require a novel interface solution.

Figure 10.8 SVEA visualisation of a crowd based on the data recorded from Heaven and Hell – Live. The inhabitants are represented by coloured triangles, showing both their awareness levels (represented by saturation changes rather than colour scales in order to accommodate colour-deficient users) and their orientation in the plane. Open triangular objects represent cameras and their immediate field of view. As we did not have access to the geometry data of Heaven and Hell – Live, the background is empty, but the geometry is of course trivially added. At the bottom is a timeline showing the current time in the data set – in the pre-recorded data set a user could move back and forth in time at will; otherwise the timestep would be updated at the rate of incoming data.

In collaboration with Michael Hoch (then at the ZKM in Karlsruhe, Germany), we worked with the Round Table as a physical interface approach. The Round Table itself is composed of a projection surface and a camera (see Figure 10.9). Optically tracked physical objects (“phicons”) can be manipulated to accomplish interaction operations, their positions and orientations being reported by image analysis software. Bowers et al. (2000b) describe a number of applications of the Round Table, including a virtual space sound mixer and a presentation device for interactive artworks.

Figure 10.9 Michael Hoch using the Round Table. Inside the cylindrical bottom is a projector displaying images on the matt glass tabletop. In the can above the table is an infrared lamp and a camera with an IR filter. The manipulable props are coated with IR-reflective tape, making them visible to the camera. Reproduced with permission from Michael Hoch.

We projected SVEA visualisations on the table, placing differently shaped physical objects to represent cameras, to select groups of objects for framing by algorithmically deployed cameras, or for zooming up for detailed scrutiny. These latter two facilities made another use of the idea of an awareness landscape. Selection of a group for the camera would be done by placing a phicon at a particular point. Those avatars that were aware of that point would then be selected for a camera shot. Likewise, a zoom would be done so that the view grew to contain just those objects aware of the given point. By default there were four puppy cameras roving around, but the placement of a camera phicon would snap one of the cameras to that view, overriding the puppy behaviour. If a marker was placed on the camera phicon, that camera was selected for transmission (TX). In the real world environment of the Round Table, we placed other screens to display views from cameras and TX. Thus, we are envisaging a scenario in which users can share the physical environment of the table and co-operatively work using it as a shared display to manipulate views of a large-scale virtual environment. Depending on the application, these users could be behind-the-scenes production crew, selected inhabitants, or other participants.

We found that interacting with physical props, while allowing all the advantages we had foreseen – rapid manual interaction by multiple users in close proximity – raised new issues. Of course, the phicons would not move of their own accord when the display was updated. This meant, for example, that when zooming in on a section of the view, any previously placed camera markers would remain where they were while the objects they were presumably trained on would move away. Obviously this could not be taken to mean that their corresponding cameras would jump to those new points, so instead we chose to de-assign those cameras and let them continue moving on their own until picked up again by a camera phicon.

This meant we had to change the metaphor from placing representations of cameras to placing a tool that would in turn place a camera. This is somewhat indirect, but perhaps no more so than the indirection inherent in using a mouse to move objects on a separate screen. If such a solution seems opaque, then alternatives would be to disallow zooming or to compute zooming algorithms which distorted the display to maintain at least some of the camera–phicon associations, among other possibilities.

10.4.3 Conclusions

The most important points of our work with production management are the following:

● Real-time interaction. As noted in the introduction, all applications had real-time requirements, so our camera-control algorithms also had to operate in real time. We could not prepare camera paths beforehand but had to be able to find at least approximate shots automatically as events unfolded.

● Large-scale participation. We wanted to be able to accommodate large numbers of participants in virtual worlds, and therefore could not restrict ourselves to following only a few select performers; rather, we had to be able at any time to move the viewpoint to somewhere else in a possibly large space in order to capture an interesting event – and also to be able to notice that such an event was taking place.

● Understanding rules of practice. We need to start from how television actually is produced. Our suggested interfaces are indeed quite different from what is used by television producers today, but they do not in themselves preclude the tasks that need to be done, even if they suggest alternative fashions in which to perform them.

● Division of labour. Our systems are created with the goal of allowing a working group to divide the work fluently and in real time between themselves, as well as between computer system and human workers – a camera can be passed from autonomous algorithm to human camera operator, to a different operator, and back to the computer process, all by simply moving a wooden block over the projection surface.

● Improvised action. A script can help camera operators and directors to plan their work, but an inhabited television event is by necessity at least partially improvised on account of the invited inhabitants who make every show a live event, so it must be possible to fluently adapt one’s work as needed.

● Activity-orientation. Camera direction is concerned not with geometry but with activity. Therefore we have developed methods for finding activity and orienting cameras in relation to it.

10.5 Discussion: Navigation, Presence and Avatars

We have been discussing some of the work from the eRENA project which was concerned with supporting large-scale real-time interaction in cultural and entertainment events using virtual and mixed reality technologies. In these settings we have described the interaction techniques and world design principles we have worked with in a number of artistic and entertainment applications. While we have presented some specific applications, they demonstrate some general design principles. By way of discussion, we situate our design principles in relation to general questions of interaction, navigation and avatar design for virtual and mixed reality environments.

10.5.1 Avatar-centred Navigation

Classically, navigation in virtual environments is taken to mean something like “move my avatar in a certain direction with a certain orientation at a certain speed”. Various restrictions may be placed on what directions and speeds are allowed – e.g. one may use a “walking” or “driving” metaphor where the avatar is restricted to move along a ground plane and walls and other objects may be impenetrable, or one may use unrestricted six-degree-of-freedom motion, passing through any object at will.

A typical experience is that it is preferable to have additional constraints on motion, in order to avoid ending up at an awkward angle halfway through the groundplane, unsure of how to manipulate one’s interaction device so as to straighten up and stand on the ground again. In part this may be due to mismatches between the input devices and the available degrees of freedom – mouse-based interaction, for example, requiring modifiers and/or modes in order to support all six degrees of freedom. Still, even when using 6DOF devices, it is often difficult to reach one’s intended goal at the intended orientation.

We also questioned whether it is necessary to actively steer one’s avatar every single step of the way if one already knows where one is going. A common experience is that most of the time in virtual environments is spent on navigation – getting desired objects into view and then moving to them – both taking up valuable time and being a cause of frustration when one’s path goes astray or the intended target cannot be found.

Accordingly, when we have encountered avatar-centred navigation in the work reviewed above (e.g. to support the inhabitants in inhabited television), we have found it most effective to constrain motion and map it in an activity-informed way onto simple (low-DOF) navigation devices. Not only does this make for easier participation, it also makes the distribution of avatars more predictable, thus facilitating production and camera work. In short, rather than work with generic motion and self-representation notions, we believe that the capabilities of an avatar should be specifically configured for the activities it is to engage in, and so that its conduct can be appropriately picked up by others.

10.5.2 Object-centred Navigation

Another common navigational paradigm is based on objects in the environment. Viewpoints are predefined points that the designer of the environment deemed likely to be interesting for a visitor. By choosing the name of a viewpoint in a list, one’s avatar is transported to that spot and encounters an intended set of objects there. The transportation may be a smooth animation through the environment, or it may be an instantaneous relocation, a teleport. A teleport may also be undertaken to some arbitrary co-ordinates; this requires that there is some kind of map of the environment, or that the co-ordinates of some likely spot can be easily ascertained and saved, for example, as a URL. Target-based movement allows the user to point at some visible object in the environment and be brought there, again either through a smooth animation or through a teleport.
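The two transport styles just contrasted – instantaneous teleport versus smooth animated transit to a named viewpoint – amount to roughly the following. This is an illustrative Python sketch; the viewpoint names and linear interpolation are our choices:

```python
# Named viewpoints chosen by the world designer (names invented).
VIEWPOINTS = {
    "entrance": (0.0, 0.0, 0.0),
    "gallery":  (40.0, 0.0, 12.0),
}

def teleport(name):
    """Instantaneous relocation to a named viewpoint."""
    return VIEWPOINTS[name]

def animated_transit(start, name, steps=10):
    """Smooth transit: linear interpolation from the current position
    to the viewpoint, yielding one position per animation frame."""
    end = VIEWPOINTS[name]
    return [tuple(s + (e - s) * t / steps for s, e in zip(start, end))
            for t in range(1, steps + 1)]
```

A real system would ease in and out rather than interpolate linearly, and would route the animated path around obstacles, but the distinction between the two modes is exactly this: one position change versus a sequence of them.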

We have reported explorations which have further refined object-centred navigation so that, as in the later inhabited television experiments, significant individual avatars or the “centre of gravity” of a group of them can be targeted. The camera control application used in Out of this World also supported viewpoints but, significantly, these were integrated with a production management application to support a working division of labour between a show producer and a camera operator.


Furthermore, once targets were selected, view control was informed by the need to produce coherent camera shots, rather than only allowing unconstrained motion. Our experience, then, is again that navigation and view control need to be informed by the practical activities that participants are performing (e.g. as a camera operator or as a show producer) and specifically configured with those ends in mind.

10.5.3 Activity-oriented Navigation

We wanted to go further in the direction of informing interaction in terms of the practical activities that participants are engaged in by supporting navigation and view control where the intended target is not a geographical spot as such, but a place of interest and activity. We have described an approach which employs a model of awareness and activity in virtual environments to make inferences about where the “hot spots” might be so as to enable transportation there – in effect, like a helpful taxi driver in a foreign city when asked: “Take me where the action is.” We described how navigation informed by representations of activity (activity maps/landscapes) might interwork with algorithms for view computation, for example, to maximise the number of faces of a group of avatars who are at a particular hot spot. While we presented these notions in terms of the deployment of cameras in the production of events, similar tools could be made available to inhabitants and other participants so that, for example, they can visualise an overview of activities and occurrences that they happen to be interested in and manipulate this to control their movement in the environment.

10.5.4 Navigation as Montage, Dispersed Avatars

A consequence of the above is that we may loosen the idea that navigation is the moving of an avatar through space and make it more akin to real-time image cutting in television or film production, selecting from multiple viewpoints. A teleport still uses the metaphor that there is a single avatar, though one that can be instantaneously transferred from one place to another. But consider the following scenario: there are multiple users, each represented with an avatar indicating their position in the environment and serving as a focus for the interaction of others with them; at the same time there is a pool of cameras that any user can access to check out what’s going on in the rest of the world and, on finding an interesting place, can ask to be teleported there.

Now, in a sense the cameras in this scenario are parts of the user’s avatar, as the user is able to perceive the environment from the viewpoint of a given camera, yet it is not a unique access point, as several users may be peering through the same camera. If the cameras are extended with the ability to manipulate the environment, we have completed the dispersion of the avatar.

Furthermore, from this perspective it can be noted that our Round Table implementation of SVEA disperses not only views over multiple cameras, but also their control over several physical devices, which in turn can be shared between several users. In this way, we allow for flexible mappings between users, peripheral devices, displays, views, cameras and avatars.

10.5.5 Accomplishing Presence and Intelligibility

We are prepared, therefore, to entertain some radical consequences of our work for interaction concepts in virtual and mixed realities. Components of the avatar "function" might be distributed and disembodied (e.g. there is no necessity for cameras to be "owned" or visualised). Of course, we may wish to constrain the picture for particular purposes in design. Our emphasis throughout has been on informing navigation and related issues on the basis of general event design and production considerations. Particular events may indeed require a traditional avatar concept. Other events may allow or require very different approaches. Furthermore, one may need to support different realisations of the relationships between navigation, view control and embodiment for different participants in the same event (e.g. a traditional avatar for an inhabitant, a dispersed capability for production work).

By decoupling navigation from view control and by varying all these matters with respect to avatar-like embodiment, it might be argued that we are inviting complexity and unintelligibility. After all, much conventional thinking about traditional avatars and interaction in multi-user virtual environments is based around assumptions about how avatars can be used as a resource for mutual inferencing about participants' conduct and perception. For example, if I can see your avatar at such a spot, oriented in such a way, I should be able to make inferences about what you can see and, perhaps, what you are doing. On our more distributed and variably embodied model, how can this be true any more?

Interestingly, Bowers et al. (1996) present ethnographic evidence that it is not the case that such inferencing occurs automatically and non-problematically in all circumstances in conventional shared virtual environments. Participants commonly check whether their sight of another's avatar does indicate the other's presence before interacting with them, for example, by pre-calling the other's name. Avatars might be creatively used to indicate the momentary unavailability of their user, for example, by laying them down. Avatars might change "ownership" as someone else in the real world takes over at the workstation. Other communication media might be used alongside the virtual environment itself to troubleshoot problems and sort out identities. In short, the presence of persons in a virtual environment and the intelligibility of their actions are matters which are accomplished using designed technologies alongside whatever other resources participants have for making sense of what is going on. Participants "naturally" decouple navigation, view control, embodiment and activity, and reassemble them in their practical activity within virtual environments in whatever way is appropriate to the task at hand. From this point of view, our advocacy of an approach to interaction which regards navigation as montage and disperses the traditional capabilities of the avatar over multiple perceptual loci turns out to be a modest proposal.


11 Presenting Activity Information in an Inhabited Information Space

Wolfgang Prinz, Uta Pankoke-Babatz, Wolfgang Gräther, Tom Gross, Sabine Kolvenbach and Leonie Schäfer

11.1 Introduction

A group of people working together in the same spatial environment relies on various possibilities to observe the activities of partners, to monitor the progress of a project or to recognise the development of group structures. Such activities are often only perceived peripherally; nevertheless, they stimulate spontaneous and informal communication. Although the computer supported co-operative work (CSCW) research area has yielded a number of systems and solutions that enable and support distributed co-operation processes, distributed work is still significantly more difficult to manage than co-located work. A significant reason for this is the missing perception of the activities and actions within a distributed group. Distributed groups therefore often suffer from a lack of awareness of common activities: co-operation partners are often not aware of activities of other partners that are relevant for their own work. The resulting synchronisation problems often lead to decision problems, misunderstandings or duplicated work. Thus, effectiveness, spontaneity, and social orientation possibilities in distributed teams are limited. The social forces which facilitate behaviour–milieu synomorphy in an environment – that is, in a behaviour setting (Barker, 1968) – are very limited in electronic spaces. Awareness support can make the difference between an electronic behaviour setting and a pure electronic space (Pankoke-Babatz, 2000).

Apart from the lack of awareness of actions that could be co-operative, there are limited opportunities for chance meetings. In the local working environment, coincidental meetings often initiate communication and the exchange of experience and knowledge. Prussak (1997) describes this phenomenon very appropriately: "If the water cooler was a font of useful knowledge in the traditional firm, what constitutes a virtual one?". Social contacts that are initiated by chance encounters at copiers, printers or coffee machines are important for social orientation, the mutual exchange of information and knowledge, and the co-ordination of shared activities (Swan et al., 1999). The Tower system presented in this chapter addresses this problem. Tower (Theatre Of Work Enabling Relationships) provides different approaches for the support of awareness and the creation of chance encounters for local and distributed teams.

First, we describe requirements and the methods we applied for requirements analysis. This is followed by the architecture of our awareness environment. Then we introduce different means for the presentation of activity information using a portal, Smartmaps, a 3D environment, or ambient interfaces. The chapter concludes with lessons learned about awareness in distributed electronic settings.

11.2 Related Work and Requirements

The importance of awareness for CSCW was initially described and analysed for synchronous co-operation processes (Dourish and Bellotti, 1992). Later, different approaches were presented to stimulate awareness of shared activities through awareness widgets such as multi-user scroll bars or radar views (Roseman and Greenberg, 1996). For the generic support of synchronous applications, infrastructures have been developed that support the exchange of synchronisation and notification events between different applications through the provision of a notification server (Patterson et al., 1996; Segall and Arnold, 1997).

Awareness is of equal importance for the support of asynchronous co-operation processes. For successful co-operation it is essential that users are informed about relevant activities of co-operation partners in a situated and intuitive way (Schlichter et al., 1998). Most approaches that provide asynchronous awareness visualise user actions by appropriate awareness icons that indicate current or past activities on a shared object (Sohlenkamp et al., 2000). Infrastructures to support asynchronous awareness have been presented in Lövstrand (1991) and Prinz (1999). Video-based media spaces (Gaver, 1992; Lee et al., 1997) and video walls have been developed to bridge the spatial distance between different locations and to support chance encounters and ad hoc communication between different places. In Benford et al. (1997c) such an approach is applied in a VR environment to provide chance encounters for people who browse the web.

The Session Capture and Replay system is an example of a system that allows other users to replay past actions. This is particularly helpful for users who join the group process later. The system captures users' interactions with an application and stores the data in a session object. Group members can annotate, modify and exchange the session objects. Session objects consist of data streams representing captured interactions and audio annotations, which users can add while their interactions are captured. When a session is replayed, the data streams are re-executed by the application (Manohar and Prakash, 1995). Begole et al. (2002) performed statistical analyses of the availability of users of an instant messenger system and identified work rhythms. These work rhythms are used to predict the current or future availability of other users.

Common to all these approaches is the concentration on a particular co-operation type (synchronous/asynchronous) or a restricted application domain. However, co-operation processes are not limited to a particular application or a specific co-operation type. This is why Tower aims to develop a model that supports awareness across the boundaries of different applications and that can visualise activity information using different presentations. The system provides an application-independent infrastructure that receives event information submitted by user activities or activity sensors. It stores and administers the event database and forwards events for presentation by different indicators to users, based on user-specific interests. Such an infrastructure is able to provide comprehensive awareness support for different co-operation situations. In the following, we describe different scenarios and the resulting requirements for an awareness environment.

A prerequisite for the smooth operation of a co-operative process is the seamless interlinking of the actions of different team members. User activities are often initiated by the availability or explicit provision of information or working objects that are required for the next activity. To minimise the meta-communication by which co-operation partners tell each other what they have done, notifications about relevant activities must be produced automatically and presented to the user in the appropriate situation (Pankoke-Babatz and Syri, 1997).

In addition to immediate notification, users require a presentation that provides an overview of the current activities in the co-operative environment. Such an overview must be user configurable to allow each user to adapt it to their specific needs. We describe such a user-configurable awareness portal later in this chapter.

Users who work on a shared object or within a shared workspace require object-related information about the activities of others. Most systems provide activity symbols that indicate recent activities by others on an object. In addition, it is helpful to get an overview of all activities in a shared workspace or environment. We will show how such an overview can be provided using Smartmaps indicators, which are based on tree-maps (Johnson and Shneiderman, 1991). In addition to the indication of activities, they also support activity-based navigation in a shared workspace. This is exemplified by the integration of Smartmaps with BSCW shared workspaces (Appelt, 1999).


The relevance of peripheral awareness for supporting mutual orientation was identified in Heath and Luff (1991). In addition to the summarised presentation of awareness information via a portal, an environment such as Tower must provide possibilities for the presentation of user activities in the user's periphery. For this purpose, Tower applies a 3D environment in which user activities are visualised automatically. The layout of the 3D environment is based on shared working contexts. Thus, the environment also represents meeting places where people who work in a similar context but at different geographical locations meet coincidentally. We will show in this chapter how such a world is created and how user activities are represented in the 3D environment.

The settings discussed so far describe synchronous communication patterns. However, the support of long-term project work requires support for asynchronous awareness. As has also been shown for video-based media spaces (Fussell et al., 2000), the continuous observation of remote partners does not suit team needs. Therefore, support for the presentation of past activities was developed in Tower. This supports users who need information about recent activities after a temporal absence from the co-operation process. A similar approach is described in Greenhalgh et al. (2000a). In the DocuDrama section of this chapter (Section 11.8) we describe how Tower addresses this issue.

Although awareness has been studied extensively in synchronous and collocated work situations (e.g. Heath and Luff, 1991), the needs in asynchronous and distributed settings are still under-researched. Tower has provided the opportunity to study awareness needs in different settings and to use a tangible artefact (Brave et al., 1998) to elicit further needs and to discuss the potential of technical support in more detail with potential users. The results of these user studies augment the description of the presentation tools described in the following. Finally, we present the lessons learned about awareness.

11.3 User Involvement and Studies

From the very beginning of Tower development, user groups participated in the project team. Their participation enhanced the project team's understanding of user needs with respect to awareness. Planned Tower features and prototypes were discussed with the users, and their early feedback improved the system design. We selected such an iterative design process based on qualitative evaluations since the co-operation processes supported by Tower were themselves long term and asynchronous.

Studies of awareness needs and requirements for technical support were performed in two ways. On the one hand, awareness needs and the usability of awareness features were discussed with potential users at the application partners' sites, by means of interviews and workshops. On the other hand, the team process of the distributed Tower team itself was observed throughout the whole course of the project.

11.3.1 Partner Settings and Evaluation Methods

One application partner was a small German company with about 20 staff members. Its major aim was the provision of web-based support, mainly to engineers. The other partner was one of the world's leading providers of civil engineering consultancy and support services, currently expanding to about 10,000 employees. Representatives of the two Tower application partners also became members of the Tower team.

The collaboration began with expert interviews and expert workshops, in which we discussed the Tower features planned and under development as well as particular needs at the application partners' sites. This was followed by five user workshops performed at the application partners' sites, with between three and ten staff members from the respective application partner participating per workshop. The workshops comprised three kinds of topics: the need for awareness support in the respective settings, the demonstration of Tower features, and the discussion of the potential usability of those features. Thus, Tower features were used as tangible interfaces to elicit further user needs.

The Tower development team was distributed across four companies in two countries. In addition, one expert from each application partner joined this team. In total, about 40 people had access to the Tower workspace that was used to co-ordinate the project work; among them, about 15 used it regularly. In the course of the project we could study how awareness needs changed depending on the team process and on particular work situations. Seven persons joined the project at a later stage. From them we could study the particular awareness needs of a newcomer.

The course of the Tower team process was documented continuously from the members' perspective. In addition, technically measured data – that is, recorded event data – were analysed to understand the team process, the relevance of media usage, and the requested awareness support in the respective phases of team work. This enabled a detailed view into the awareness needs, but also into the way in which awareness was achieved in a distributed work setting. It also informed our study of how work activities are reflected in the recorded event data and how these can be processed to provide suitable awareness notifications.

The diversity of application partners in Tower provided the opportunity to study various work settings and to find out about their particularities with respect to needs for awareness support. With the introduction of early prototypes we were able to test potential effects on users. Where there was positive acceptance of a feature, we could learn from users how to improve it. Where users objected to or rejected a feature, more detailed analysis of the underlying rationale was needed. We had to analyse whether the rejections were due to shortcomings of the feature, which could be overcome, or whether they were due to misfits with user needs and mistaken assumptions about awareness.

Interestingly, any feedback we gained from the users also disclosed user-specific needs for awareness support, so we could learn a great deal about awareness needs in general. We found that awareness needs depend on the different settings, the particularities of the co-operation cultures, the modes of working, the current work process and the individual interests of an actor in a particular situation. When Tower started, the major aim was to design a theatre of work, i.e. a 3D stage which augments an existing shared workspace system with the requested awareness about ongoing activities. In the course of the work, however, awareness turned out to be much more complex than we had thought, and we realised that it cannot be supported with a single tool.

In real-world settings, awareness is a multi-channel phenomenon; it therefore requires multiple means for the indication of activity information in an electronic setting. Consequently, in the course of the project several different features were developed, from among which a user may choose the most appropriate one.

11.3.2 Do Users Meet at all in a Shared Workspace?

Being present in the same place usually provides an opportunity for chance encounters. One motivation for Tower is the provision of an awareness space that notifies users about the presence of others in their working context. We have analysed the log files of user activities in a shared project workspace to check whether situations occur in which more than one user is present at the same point in time. In such a case, knowing about the local presence of others would have opened up a chance to meet in the context of the actual work. Such functionality goes beyond the services offered by simple presence awareness systems such as ICQ ("I Seek You", a service to get in touch with people across the network: http://www.icq.com/).

The log file contains data for each user action in the shared project workspace for the period from January until October 2001 (approximately 200 working days). In this time period, about 15 users had access to the workspace, but there were only 8 active users who used the workspace frequently. The project workspace contained more than 1000 objects, e.g. documents, presentations, figures, etc.

The results of our analysis are shown in Figures 11.1 and 11.2. Each figure shows the number of meetings between two or more users within a certain time interval. A meeting is defined by two consecutive actions of different users on the same object. The x axis indicates the time interval between two actions in minutes; the y axis denotes the total number of meetings that occurred within a time interval. Figure 11.1 indicates that only 20 meetings occurred within a short time space of 1–2 minutes, while 60 meetings took place over a 10-minute period. This means that every third day two users met on exactly the same object.

If we consider folders within the project workspace as meeting places, then the number of meetings increases drastically. The grey line in Figure 11.1 indicates that within 5 minutes 60 meetings between two or more users took place in the same folder. Using 15 minutes as the time interval, we count 90 meetings, i.e. every second day people met coincidentally in the same folder. These numbers increase further if we consider the whole project workspace as a meeting place. In this case more than 400 meetings took place within a 5-minute period, i.e. two or more meetings on every working day. This frequency of coincidental meetings is similar to the number of meetings between two people in the same spatial environment.
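The meeting definition used in this analysis can be reproduced from a log with a short script. The following is an illustrative sketch, assuming a log of (timestamp, user, object) tuples; the `key` argument selects the granularity of the meeting place (object, folder, or whole workspace), which is what produces the different curves.

```python
from datetime import datetime, timedelta

def count_meetings(events, interval_minutes, key=lambda e: e[2]):
    """Count coincidental 'meetings': pairs of consecutive actions by
    different users on the same meeting place within the interval.
    `events` is a list of (timestamp, user, object) tuples; `key`
    maps an event to its meeting place."""
    by_place = {}
    for event in sorted(events, key=lambda e: e[0]):
        by_place.setdefault(key(event), []).append(event)
    meetings = 0
    for actions in by_place.values():
        # Compare each action with the one immediately following it.
        for (t1, u1, _), (t2, u2, _) in zip(actions, actions[1:]):
            if u1 != u2 and t2 - t1 <= timedelta(minutes=interval_minutes):
                meetings += 1
    return meetings
```

Passing a coarser `key` (e.g. the folder part of the object path, or a constant for the whole workspace) raises the counts, exactly as the numbers above show.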

Figure 11.1 Number of meetings on the same object and in the same folder, plotted against the time interval between actions (in minutes).

Figure 11.2 Number of meetings in the project workspace, plotted against the time interval between actions (in minutes).

The statistics provide evidence that users actually meet in a shared virtual project workspace, and they further indicate that it is important to choose the right level of detail for the construction of the meeting places. If the level of detail is too high, the number of coincidental meetings is too low. If it is too low, more meetings occur but the context of the meetings is too general. These empirical observations are confirmed by our practical experiences with the construction of the Tower world, which are reported later in this chapter.

11.4 The Tower Architecture

In order to support awareness in a distributed electronic environment, the events that take place at one location or in one application must be conveyed to the locations of the other users sharing the environment. First, this requires a means to detect events caused by user actions. Secondly, these events need to be recorded and processed. Finally, the events must be presented and situated in the local action field of the other users. Consequently, the Tower system provides three basic components: a set of various sensors that recognise user actions; an event and notification server that stores, administers, filters, and distributes events; and a set of different visualisation tools. In the following we describe the architecture of Tower.

Figure 11.3 illustrates the components and architecture of the Tower system. Different sensors recognise user actions. These are either integrated into applications or realised as agents that observe user actions, e.g. modifications of shared file systems or web pages. Sensors forward events to the central event and notification server (ENI) by calling appropriate common gateway interface (CGI) methods of a web server. This web-based approach was chosen to provide a simple, yet powerful interface that can be used by almost all applications (Prinz, 1999).
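Reporting an event through a CGI method amounts to building a URL that encodes the event's attributes and requesting it over HTTP. The sketch below illustrates this pattern; the endpoint path and attribute names are our assumptions for illustration, not Tower's actual interface.

```python
from urllib.parse import urlencode

def event_url(server, producer, artefact, operation):
    """Build the CGI request URL with which a sensor could report a
    user action to the event server. The path `/cgi-bin/event` and
    the attribute names are hypothetical."""
    query = urlencode({"producer": producer,
                       "artefact": artefact,
                       "operation": operation})
    return server + "/cgi-bin/event?" + query
```

A sensor would then submit the event with an ordinary HTTP request, e.g. `urllib.request.urlopen(event_url(...))`; it is this reliance on plain HTTP that makes the interface usable from almost any application.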

The ENI server stores and administers events. It forwards events to the appropriate indicators, which register their interest in events using predicates over event attributes. In addition, the server implements functions for the authentication of users, the authorisation of access rights, and the aggregation of events in a history. The transformation module allows a semantic transformation of events to satisfy the requirements of different applications. Further, it supports the interworking of different ENI servers by enabling servers to exchange events. The reciprocity module informs users about the interest of other users. This provides transparency to avoid the misuse of the awareness environment for control purposes.
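The core dispatching pattern, interest registration as predicates over event attributes, can be sketched in a few lines. This is a minimal illustration of the idea, not Tower's implementation; authentication, reciprocity and transformation are omitted.

```python
class EventServer:
    """Minimal sketch of the ENI pattern: indicators register interest
    profiles as predicates over event attributes; each submitted event
    is stored in the history and forwarded to every indicator whose
    predicate matches."""

    def __init__(self):
        self.history = []
        self.subscriptions = []  # (predicate, indicator) pairs

    def register(self, predicate, indicator):
        self.subscriptions.append((predicate, indicator))

    def submit(self, event):
        self.history.append(event)
        for predicate, indicator in self.subscriptions:
            if predicate(event):
                indicator(event)

# Example: an indicator interested only in write operations.
writes = []
server = EventServer()
server.register(lambda e: e["operation"] == "write", writes.append)
server.submit({"producer": "anna", "artefact": "x.doc", "operation": "write"})
server.submit({"producer": "bernd", "artefact": "y.doc", "operation": "read"})
```

Because interest is expressed over event attributes rather than application identity, the same server can serve sensors and indicators from entirely different applications.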

Additional modules can extend the ENI server. For example, the context module has been developed to contextualise incoming events. Examples of contexts are a project, a task or a collection of document folders that belong to a work package. The classification is done by matching event attributes with a predefined context description (Gross and Prinz, 2000). In the following sections, different indicators are described for the presentation of activity information in inhabited information spaces.


11.5 Personalised Overview of Activities: The Tower Portal

Our user studies revealed that an individual user may like to be made aware of several electronic locations. For example, a user may be involved in a project, organise an electronic seminar and, finally, be a shareholder. Such a user may want to perceive awareness information about all relevant contexts. If these contexts are relevant over a certain period of time, the user may assemble awareness information about all of them in a personalised Tower portal. Figure 11.4 shows such a portal of a Tower team member. It provides awareness about the people currently present in the Tower project space and the activities in the Tower workspace, as well as information about the current German stock market.

Figure 11.3 Tower architecture (sensors in the work environment, the event and notification server with its processing modules, and the presentation components of the theatre of work).

The Tower portal provides users with a personalised visualisation of the awareness information relevant for their personal work context. For this purpose users configure their own Tower web page, called MyTower. For the individual configuration of MyTower, users specify interest profiles to express which events and which awareness information they are interested in (Fuchs, 1999; Fuchs et al., 1995). This may be the notification of changes to documents in shared workspaces, activities of group members, or information delivered by agents observing web contents. In addition to the selection of contents in the Tower web page, users decide on the visualisation of the event information. For this purpose the Tower portal offers a multitude of configurable indicators.

The configuration of MyTower in Figure 11.4 contains five different indicators. In the first line, the RandomImage-Indicator shows the photos of all users currently online in Tower. The Smartmaps indicator gives an overview of the user actions in the Basic Support for Co-operative Work (BSCW) workspace of the Tower project. In the next row, a ticker tape informs in more detail about the project activities of the Tower team members. The last indicator is a URL-Indicator; it displays a picture of the current development of the German stock exchange.

Figure 11.5 illustrates the concept and architecture of the Tower portal. MyTower is an HTML document containing Java applets that inform users about new events and represent this awareness information in the web page. One of these applets is responsible for the communication with the ENI server: it administers the user-defined interest profiles, asks the ENI server for new events, and distributes these events to the appropriate indicators, which are likewise Java applets. With this modular architecture and the implementation in Java, the Tower portal is application independent. All indicators can also be integrated into external applications, so that existing systems can be extended with the Tower awareness widgets.
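The communication component's job of repeatedly asking the server for new events and fanning them out to the indicators can be sketched as a simple poller. All names here are illustrative, and `fetch_since` stands in for the HTTP round trip to the ENI server.

```python
class PortalPoller:
    """Sketch of the portal's communication component: it asks the
    server for events newer than the last one seen and hands each
    event to every registered indicator."""

    def __init__(self, fetch_since):
        self.fetch_since = fetch_since  # callable: last_id -> [(id, event), ...]
        self.indicators = []
        self.last_id = 0

    def add_indicator(self, indicator):
        self.indicators.append(indicator)

    def poll_once(self):
        for event_id, event in self.fetch_since(self.last_id):
            for indicator in self.indicators:
                indicator(event)
            self.last_id = max(self.last_id, event_id)

# Usage with a fake in-memory event store instead of the real server:
store = [(1, "anna is online"), (2, "report.doc changed")]
ticker = []
poller = PortalPoller(lambda since: [(i, e) for i, e in store if i > since])
poller.add_indicator(ticker.append)
poller.poll_once()
poller.poll_once()  # nothing new the second time
```

Remembering the last event identifier is what turns the pull-style HTTP interface into a stream of fresh notifications for the indicators.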


Figure 11.4 An example of MyTower.


11.6 Awareness in a Working Context: Smartmaps

Users asked for simple means of providing awareness. For them, awareness information and the possibility of accessing the respective objects were integral to one another. When connecting to a shared workspace, they requested the traces of the activities which had taken place since the last time they were connected. In addition, while connected, awareness about current activities was required. As in real-world settings, situated action (Suchman, 1987) requires awareness information about the working space in which action takes place; in contrast to real-world settings, however, situations in electronic workspaces may span over time. To meet these requirements, Smartmaps were developed.

Smartmaps provide both task-oriented and social awareness (Prinz, 1999), i.e. they yield information about the state of artefacts as well as the presence and activities of people. Furthermore, they provide a two-dimensional layout of the electronic workspace which may ease local orientation in the workspace itself. Traces of past or current actions are indicated as "footprints" in the Smartmaps. Smartmaps thus span synchronous and asynchronous work situations: past actions as well as current actions are indicated in the space.

Figure 11.5 Concept and architecture of the Tower portal.

Smartmaps provide an overview of all activities in shared information spaces such as shared file systems, web sites, or shared workspaces in 2D graphics. Figure 11.6 shows the Smartmap of a large file system. All files in the tree of folders are represented as small rectangles. The hierarchy of sub-folders becomes visible through the thick lines. Files in the same sub-folder are represented close to each other. Some rectangles, i.e. artefacts, are highlighted in a different colour to indicate activity.

Smartmaps is a Java applet implementation based on the tree-map visualisation technique (Johnson and Shneiderman, 1991). It shows the artefacts of information spaces, preserving a lexicographic ordering. Different actions in information spaces, such as read, write, rename, create and delete, are reported by the ENI server as events to the Smartmaps. The events carry, for example, data about the producer, artefact name, artefact type and operation. This information is interpreted and indicated in the Smartmaps by colouring the corresponding rectangles.

The default presentation mode conveys the overall activity and its distribution in the information space. Tool tips, which are activated when users move the mouse over the corresponding region, indicate the artefact's name. The tool tips also present the names of the persons currently acting on the artefacts. This lets people know that others are currently present and working in the same information space, in the same part of it, or even on the same artefact.

To explain the application of Smartmaps for the provision of awareness in shared workspaces, we describe the integration of Smartmaps into BSCW shared workspaces (Figure 11.7). Integration into other groupware platforms, e.g. Lotus Notes, is also possible. When combined with shared workspaces, Smartmaps augment the often-applied list-mode presentation of the shared objects with a 2D spatial representation of the complete workspace.

Figure 11.6 A Smartmap representing the files and user activities in a large file system.

When navigating in the shared workspace, the Smartmap highlights the user’s current position in the overall workspace with an orange rectangle. Thus, beyond the provision of awareness information, it also eases navigation in the BSCW workspace.

Users have the following options to interact with the Smartmaps. When moving the mouse over a Smartmap, the complete name of the represented artefact or information about recent user activities is displayed. A mouse click presents the pathname of the artefact in the status bar of the browser window, a shift-mouse click opens the artefact itself, and a control-mouse click opens the enclosing “folder”. A control-right-mouse click opens a pop-up menu to open the artefact, the enclosing workspace, and further enclosing workspaces up to the top-level workspace. These interactions enable activity-based navigation in the information space and reduce the time normally needed to navigate hierarchical structures.
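The modifier-key interactions described above amount to a small dispatch table. A sketch, with action names that are ours rather than the applet’s:

```python
# Map (modifiers, button) combinations to navigation actions, mirroring
# the interactions described in the text. Action names are illustrative.
ACTIONS = {
    (frozenset(), "left"): "show_pathname_in_status_bar",
    (frozenset({"shift"}), "left"): "open_artefact",
    (frozenset({"control"}), "left"): "open_enclosing_folder",
    (frozenset({"control"}), "right"): "open_popup_menu",
}

def on_click(modifiers, button):
    """Resolve a click to an action; unknown combinations are ignored."""
    return ACTIONS.get((frozenset(modifiers), button), "ignore")
```

A dispatch table of this kind keeps each modifier combination bound to exactly one navigation shortcut.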

Several parameters configure the visualisation. The colour for highlighting can be chosen, the highlighting intensity can be specified, and either only artefacts, or artefacts and enclosing “folders” up to a configurable level, can be highlighted with decreasing intensities. Furthermore, the granularity of the visualisation is adjustable: complete (default); only enclosing “folders”; only the first x levels of the hierarchy; and up to the last y levels of the hierarchy. In addition, the duration of highlighting can be set. Usage experiences have shown that three minutes is a good duration for Smartmaps representing busy web sites, and four hours for shared workspaces and shared file systems. An overview of the activities that happened during the last day can be achieved by selecting a duration of 24 hours.
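The parameters above can be gathered into a single configuration object; a sketch, in which the field names and defaults are invented for illustration and do not reflect the applet’s actual API:

```python
from dataclasses import dataclass

# The configurable visualisation parameters named in the text.
# Field names and defaults are illustrative, not the applet's API.
@dataclass
class SmartmapConfig:
    highlight_colour: str = "red"
    highlight_intensity: float = 1.0
    highlight_folders_up_to_level: int = 0  # 0 = highlight artefacts only
    granularity: str = "complete"           # or "folders", "first-x", "last-y"
    highlight_duration_minutes: int = 240   # 4 h for shared workspaces

# Suggested durations from the usage experiences reported above.
busy_web_site = SmartmapConfig(highlight_duration_minutes=3)
daily_overview = SmartmapConfig(highlight_duration_minutes=24 * 60)
```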


Figure 11.7 A Smartmap showing activity in a BSCW shared workspace.


Presenting Activity Information in an IIS


We have integrated the Smartmaps applet with several large project workspaces, and we observed that users quickly adopted the Smartmaps both to navigate the information space and for activity-based navigation. For example, they first check the places where activity of other users is indicated. They move the mouse to the corresponding highlighted rectangle to see who performed which operation on the object. Then that link is often followed and the corresponding folder or object is opened. The Smartmaps applet in the header of a workspace provides users with an overview of the complete workspace structure, easy navigation, and direct access to workspaces and artefacts, and informs them about group members carrying out actions, currently or in the past. In fact, such awareness-enhanced workspaces become an inhabited place for social encounters and activity-based communication.

11.7 Symbolic Actions in a Context-based 3D Environment

All the presentation modes described so far are text-based or use 2D graphics. To come closer to an intuitive way of presenting awareness information, the Tower world was developed. It presents awareness information in a 3D world. Actions in the electronic space are visualised by symbolic actions of avatars representing the respective actors. The avatars are located at the representation of the corresponding object. In the project, the Tower world was understood as the “Theatre of Work”, i.e. the stage on which the activities that take place in an electronic environment are played out.

11.7.1 The Tower World

The Tower world realises context-based presentation of user activities in a multi-user 3D environment. The environment consists of a landscape containing representations of shared working objects such as documents, folders, etc. In the following figures, boxes are used for the representation of these objects. This representation can be very detailed, e.g. each shared document is represented by one object in the 3D world. The construction of such a detailed world can be done automatically using the space module (Gavin et al., 2000). This module allows users to select the type of objects that shall be represented, as well as the grouping and arrangement of objects based on their semantics. That is, objects can be grouped based on their location in a shared folder, or on other attributes such as keywords, owner, modification dates, etc. More abstract worlds represent only contexts such as folders, or an aggregation of folders in a work package or task context. Figures 11.8 and 11.9 show examples of detailed and abstract worlds. Our experience has shown that most users prefer a more abstract representation, since this allows a better overview.
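The attribute-based grouping the space module performs can be illustrated with a short sketch. This is a simplification under our own assumptions, with invented data and attribute names, not the module’s implementation:

```python
from collections import defaultdict

# Group shared objects by a chosen attribute (folder, owner, keyword, ...)
# to decide which context each object belongs to in the generated world.
def group_objects(objects, attribute):
    groups = defaultdict(list)
    for obj in objects:
        groups[obj[attribute]].append(obj["name"])
    return dict(groups)

documents = [
    {"name": "report.doc", "folder": "/wp1", "owner": "alice"},
    {"name": "slides.ppt", "folder": "/wp1", "owner": "bob"},
    {"name": "minutes.txt", "folder": "/wp2", "owner": "alice"},
]
by_folder = group_objects(documents, "folder")  # e.g. work-package contexts
by_owner = group_objects(documents, "owner")    # alternative semantics
```

A detailed world would render one box per document within each group; an abstract world would render one box per group.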

Figure 11.8 An abstract overview showing different project contexts.

Figure 11.9 A detailed Tower world representing individual documents.

The Tower world is populated with avatars that represent users and their actions in a symbolic way (McGrath, 1998). This is realised by automatically moving avatars to places in the 3D world that correspond to the working context or the document the user is currently working on. Symbolic gestures of the avatar represent the current action of the user. For example, a read operation is visualised by an avatar reading a newspaper, while an avatar indicates a write operation with a large typewriter. The automatic placement of the avatar and the symbolic actions are controlled by the space and symbolic acting modules of the Tower system (see the architecture in Figure 11.3).
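The placement-plus-gesture behaviour can be sketched as a simple lookup. Only the two gestures named in the text are taken from the source; the data structures and fallback are our own assumptions:

```python
# Symbolic gestures for operations, following the examples in the text
# (read -> newspaper, write -> typewriter). The fallback is invented.
GESTURES = {
    "read": "reading a newspaper",
    "write": "typing on a large typewriter",
}

def place_avatar(world, user, document, operation):
    """Move the user's avatar to the document and select a gesture."""
    world[user] = {
        "position": document,  # avatar stands at the object's representation
        "gesture": GESTURES.get(operation, "standing idle"),
    }

world = {}
place_avatar(world, "alice", "/wp1/report.doc", "read")
```

The essential design choice is that the user never steers the avatar: position and gesture are derived entirely from the event stream.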

Team members are represented in Tower only when they are working on shared documents or when they perform activities in a public space. This distinguishes the Tower approach from video-based media spaces, where users are visible independently of their current working context. The exaggerated presentation of user activities by symbolic actions provides a good overview of the activities in a world and of the overall situation, even from a distance. Users can easily recognise on which documents or tasks colleagues are currently performing related activities. Users whose avatars are close to each other in the Tower world are also working in a similar working context. Communication channels such as audio or chat are provided to enable spontaneous conversation. Thus, the Tower world can serve as a context-based meeting space that facilitates chance encounters.

For peripheral awareness of user activities, the Tower world can be projected into the user’s office environment. Alternatively, it can also be included as a plug-in in the user’s web browser (Figure 11.10). Camera agents offer a guided tour inside the Tower world. The agents are configurable and provide a personalised entry to the Tower world and the represented events. A history camera agent guides visitors to places at which interesting events have taken place in the past.

11.7.2 User Feedback

User feedback gave evidence that easy orientation in the world representing a workspace is of high relevance. Therefore, simplified views had to be provided. Now several worlds representing different levels of detail are available: one overview world displaying the major folders in a workspace and, in parallel, a detailed world with all the documents. Experience with using the Tower world, as well as feedback from users, indicated that specific navigation support is needed. Users want to navigate to points of interest or to points of activity to take a more detailed look. This is an interesting finding, since at the beginning of the project it was considered beneficial that, in contrast to other 3D worlds, we “let the system do the walking”. That implies that the world was intended to be a display medium, but not for navigation. This is still true for the control of the avatars that represent the actions of their corresponding users. But in the position of a visitor, users considered it necessary to have easy means available to select suitable viewpoints.

To this end, users requested that the positions of avatars be clearly indicated. The position of an avatar indicates a location of interest, that is, a location of activity. Therefore, users requested large avatars relative to the size of the work objects. Furthermore, the feedback from various user groups pointed out that the appearance of avatars may depend on the formality of the social context. Workshop participants suggested starting with simple representations. Nevertheless, the avatars should be complex enough to be recognised and personalised to some degree. Participants demanded that it be easy to identify the person represented by the avatar. Furthermore, the avatars should look friendly. Among the participants, there was a clear preference for stylised avatars as opposed to naturalistic ones. Participants proposed using team colours for the avatars’ clothing. To satisfy these needs, Tower provides a tool with which users can select among six different avatars and change the colours of hair, eyes and dress, and thus shape their personal avatar.

Figure 11.10 Tower world as an integral part of the web browser.

Another interesting finding was that users wanted immediate access to the workspace objects represented in the Tower world, although we intended the world in the first instance only as a visualisation and not as an interaction medium. All this gives a clear indication of the need to integrate the supply of awareness tightly with the action options of an actor. The users are actors, not spectators: they want to interact through navigation but also through interaction with the presented objects, just as they would in a 3D game environment. We can conclude that, as in the real world, awareness is an integral part of action planning (Lantermann, 1980; Leont’ew, 1977).

The various user groups proposed using the Tower world mainly for places with high levels of activity. For example, one user group suggested using it in a public space to show the activities on the web portal they provide. Another group suggested displaying the Tower world in their coffee room to show the overall team activity in their office. However, in the usual team setting the level of activity was too low to benefit from the synchronous presentation of awareness in the Tower world. The same finding was made for the usage of video spaces to support ongoing team work (Fussell et al., 2000). Instead, the support of medium- and long-term teamwork requires that awareness spans activities over time. To meet these needs, we tried to replay the recorded event data over a given time period. However, the pure temporal order of events did not convey the right information about the activities in the electronic workspace either. Instead, more complex processing of event data is required. A possible solution for this requirement is presented in the next section.

11.8 DocuDrama

DocuDrama aims at recording the history of desktop events and activities generated by a project team in the Tower collaborative work environment. The replay of the team’s interactions with documents is visualised by avatars that enact the events as they occurred in the shared workspace. DocuDrama Conversation, a DocuDrama approach that focuses on interaction between people on documents, refines the idea of the history replay. DocuDrama Conversation rearranges the order of events for the history replay instead of a play-out strictly organised by time. This approach enables the user to focus on the activities that have taken place on a certain document. For example, a team member uploads a document to the team’s workspace and emails the project team about its existence and location. As a follow-up activity, other team members will open the document and read, change or annotate it. DocuDrama Conversation groups all these activities by different team members into a single scene. The events in that scene are replayed in the order of their occurrence, but the scene is not thematically interrupted by concurrent activities on a different topic.

In DocuDrama Conversation, avatars and their symbolic actions are at the centre of interest. The avatars look and turn towards each other to enhance the impression of an ongoing conversation. To enrich the story and to keep the user’s attention, special focus has been given to camera navigation and positioning. At the beginning of a scene, the camera approaches the centre of activity, the box of the current document, and remains in an overview position. The avatars appear one after the other and perform their symbolic actions. The camera chooses randomly between a variety of close-up views of the avatars. Figure 11.11 shows an example of possible camera positions and views in DocuDrama Conversation. The position of the avatars is determined dynamically. In scenes with only two or three avatars, the avatars are grouped facing each other. If a larger number of actors is involved, the avatars are grouped in circles on top of the document boxes. Figure 11.12 shows an arrangement of avatars in circles to give the impression of a conversation.

To start DocuDrama, users define a timeframe, which, by default, includes all past events. The user chooses a timeframe of the past, such as a day or a week, and defines the subjects, authors or activities that are relevant to them. Figure 11.13 shows the user interface of DocuDrama Conversation. The replay of events takes place in a single-user version of the Tower world. Text fields in the middle section give information about the user actions currently played out by avatars, about the document as the centre of activity, and about the period of time in which activities on this document have taken place. The section below includes the elements for the user’s personal configuration of DocuDrama.
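The scene-building idea behind DocuDrama Conversation — filter events by the user-defined timeframe and criteria, then group them per document so that each scene replays one document’s activities in temporal order — can be sketched as follows. Field names and data are our own assumptions:

```python
# Sketch of DocuDrama Conversation's scene building. One scene is the
# list of events on a single document, in order of occurrence.
def build_scenes(events, start, end, authors=None):
    scenes = {}
    for e in sorted(events, key=lambda e: e["time"]):
        if not (start <= e["time"] <= end):
            continue  # outside the user-defined timeframe
        if authors and e["author"] not in authors:
            continue  # filtered out by the user's selection criteria
        scenes.setdefault(e["document"], []).append(e)
    return scenes

events = [
    {"time": 1, "author": "alice", "document": "report.doc", "op": "upload"},
    {"time": 2, "author": "bob", "document": "slides.ppt", "op": "read"},
    {"time": 3, "author": "carol", "document": "report.doc", "op": "annotate"},
]
scenes = build_scenes(events, start=0, end=10)
```

Note how the concurrent activity on `slides.ppt` ends up in its own scene rather than interrupting the conversation around `report.doc`.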

Figure 11.11 Camera positions.

Users of DocuDrama Conversation reported that the history replay enabled a better understanding of the process of events and their coherence, compared to a sequence of event reports covering the same timeframe. The choice of selection criteria has proven useful and will be refined and extended in a future version of DocuDrama.

Figure 11.12 Avatars positioned in circles.

Figure 11.13 DocuDrama user interface.

11.9 Ambient Interfaces

Ambient interfaces support presenting and capturing information beyond the classical PC with its traditional hardware and desktop metaphor. Ambient interfaces include the whole environment of the user as a medium for the interaction between the user and the system. The subtle presentation of information exploits people’s ability to be peripherally aware of others, thus enabling users to be aware of activities without disturbance. Furthermore, ambient interfaces allow users to capture information permanently, even if they are not working at their PC (Gross, 2002).

Ishii et al. at the MIT Media Lab developed some early systems (e.g., ambientROOM). However, these systems were called ambient displays and primarily focused on presenting information, not on capturing information (Wisneski et al., 1998). We explicitly use the term ambient interfaces to denote systems that use the physical environment of the user both to present information and to capture events.

For the Tower environment we developed two types of ambient interfaces: binary ambient interfaces and AwareBots. The technological basis for binary ambient interfaces are relay boards, which can be connected to the parallel or serial port of the PC. A special client controls the individual relays. Examples of binary ambient interfaces that we developed are a fish tank with plastic fish (Figure 11.14a), a fan, a coffee machine, and so forth. The system can release bubbles into the fish tank, which can be seen and heard by the users. The different binary ambient interfaces can be configured in various ways. For instance, releasing bubbles can indicate that the user’s web pages are being accessed; the fan can blow air into the user’s face when a document is uploaded into the shared workspace; and the coffee machine can be switched on when the user logs into the Tower system from within the office.
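Because each interface is binary, the configuration reduces to a mapping from event types to single relays. A sketch under our own assumptions — the relay-control call is stubbed, the event names follow the examples in the text, and the relay numbers are arbitrary:

```python
# Hypothetical event-to-relay configuration for binary ambient interfaces.
RELAY_FOR_EVENT = {
    "web_page_accessed": 1,   # release bubbles into the fish tank
    "document_uploaded": 2,   # switch the fan on
    "user_logged_in": 3,      # switch the coffee machine on
}

switched = []

def set_relay(relay, on):
    """Stand-in for writing to a relay board on the serial/parallel port."""
    switched.append((relay, on))

def on_event(event_type):
    relay = RELAY_FOR_EVENT.get(event_type)
    if relay is not None:
        set_relay(relay, True)

on_event("document_uploaded")
```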

AwareBots are ambient interfaces that have the shape of robots. The Tower AwareBots were built with the LEGO Mindstorms Robotics Invention System. The first generation of AwareBots was RoboDeNiro (Figure 11.14b). RoboDeNiro uses two motors for the presentation of information and a touch sensor for capturing information: it can lift its hat, rotate its torso, and sense when its arm is pressed. RoboDeNiro was used both in individual users’ offices and in a public space. The following configuration was chosen: RoboDeNiro can wait for a specific user and lifts its hat when this user logs in; when changes occur in a specific shared workspace, RoboDeNiro rotates its body; when its arm is pressed, it sends a specific message to the ticker tape. On the whole, the user reactions were positive and the users could understand the semantics of the individual movements of RoboDeNiro. However, more and more users stated that they would like to personalise their robot or build their own robot.


Therefore, we distributed LEGO Mindstorms packages to the users and asked them to build their own AwareBots. This resulted in the second generation of AwareBots. One user built the EyeBot, which can roll its eyes and has a switch with a flag that can be moved to the front and to the back (Figure 11.14c). Whenever changes occur in the shared workspace, the EyeBot rolls its eyes. The switch can indicate the user’s availability. Another user built the ClassificatorBot, a mobile robot that can rotate to the left and to the right (Figure 11.14d). The ClassificatorBot can be used to indicate any kind of relationship; for example, it was used to indicate access to web pages: on an access to the personal pages of the user the ClassificatorBot turns to the left, and on access to other web pages of the research group it turns to the right.

Figure 11.14 (a) Fish tank; (b) RoboDeNiro; (c) EyeBot; (d) ClassificatorBot.

On the whole, both the binary ambient interfaces and the AwareBots allow the subtle presentation of up-to-the-moment information and events as well as the capture of information over time. Furthermore, several AwareBots remain in a certain position and can indicate status information that the users can capture later (e.g., when they return to their office).

The users of the ambient interfaces can not only design and construct their ambient interface, but they can also specify their individual mapping of the TOWER events to the symbols of the ambient interfaces; they do not need to use the default mapping. For instance, a user can specify which action the pressing of RoboDeNiro’s arm triggers. One user might want to have a login or logout event triggered, because she imagines that pressing the arm symbolises a greeting of saying hello or goodbye; another user might want to trigger a message to the ticker tape, asking if some colleagues are in the same building and would like to have a short coffee break together. Another example is rotating the body: the default mapping is that rotating the body shows changes in a specific shared workspace. Some users preferred the position of the torso to represent the number of new emails in the inbox: for each new email, RoboDeNiro rotates the torso a little more (some users even drew a scale, so they could see exactly at which position the torso was standing). For public places, such as a coffee room, the default settings were kept in order to avoid confusion. Users of personal ambient interfaces also reported that they used a personal mapping for privacy reasons: when other persons were in their office, the guests could see the changes in the ambient interfaces, but they could not tell what the mapping was. Therefore, only the respective owner could interpret the mapping.

11.10 Lessons Learned About Awareness

The demonstrations of the Tower features and the discussions with the different groups of users have shown that the needs for awareness depend on many different factors. Specifically, we found differences between the needs of tele-workers and those of collocated team members or of members of distributed teams. Another interesting finding was that not only the events that occurred were of interest, but also those that were expected but did not occur in time. For example, if a deadline in a project approaches and a requested contribution does not arrive in time, users wanted to be notified about those outstanding events.

11.10.1 Awareness Is Something One Is Not Aware of

All discussions with users in the various settings, however, have confirmed that awareness is something they are not aware of. Instead of permanently watching awareness information, they are performing task-related activities. “Seeking awareness information is an out-of-task activity”, a user said. Any extra effort is considered to be troublesome.

Furthermore, the team observation disclosed that, as in real world settings, all available information was read as conveying awareness information, almost unconsciously. For example, the stream of emails in the project was interpreted as indicating the level of activity. It conveyed presence information about the senders of the mails. Information about the topics of relevance was read from the subjects used in the emails. This was combined with the information gained from the BSCW daily activity report to achieve awareness about the team and the work process. In interviews, the team members confirmed this intuitive interpretation of email as a means of awareness.

From real world environments one can learn that awareness is one of the most unconscious and intuitive ways of human orientation in a surrounding environment. One is always aware without knowing it. It is impossible not to be aware, since this would imply not perceiving the environment. Tightly interrelated with perception is interpretation, that is, associating meaning with what one perceives (J. Gibson, 1986). Perception and interpretation are preconditions for being aware in electronic settings as well. Accepting the unconsciousness of awareness, it is not surprising that requirements analysis for electronic support of awareness using verbal discussions, interviews and so on did not give a complete picture. The availability of tangible interfaces in Tower often disclosed new aspects. They helped to improve the understanding of awareness needs in electronic environments considerably.

11.10.2 Synchronicity of Awareness

A purely synchronous supply of awareness information turned out to be much less useful than expected (see also Fussell et al., 2000). In particular, it was not suitable for the work of distributed teams, as the team observation disclosed. The fixed time interval of the daily BSCW activity report, which is compiled and distributed every night, was also not suitable in all situations. Our team observations revealed changing interaction rhythms depending on the actual work processes. For example, in the case of an approaching deadline and in the case of joint document production, a high level of awareness was requested to support co-orientation in the team. During programming phases, continuous mutual awareness was irrelevant, but immediate reactions from other partners were requested once a problem occurred and immediate help from others was needed.

The provision of awareness information should acknowledge these rhythms and also changes of rhythm. Thus, in phases with more activities, awareness information may be provided more often than in phases with a low level of activity. Actually, real world places also have their own rhythm and tempo, depending on the surrounding culture (Levine, 1997). Simultaneous awareness is required in situations where one wants to establish a synchronous contact. To this end, awareness about people and their actual availability is required. Orientation in the work process, by contrast, requires asynchronous awareness. This implies compiled stories about time periods.

11.10.3 Walking and Talking Are Means to Achieve Awareness

In order to better understand the particularities of asynchronous awareness, we asked team members how they achieve awareness in their local real world settings after a time of absence. “I gain awareness from going for a walk”, an interviewee said. Walking around and talking to colleagues were named by all interviewees as important means to gain awareness. After some time of absence, they ask colleagues about what happened in the meantime: “I like to listen by first getting the current state of things, what is going well, what is going badly, are there any hitches, and effects of hitches on the overall program.” For most of them, such a discussion lasts about five minutes.

This gives clear evidence that summaries are requested rather than detailed lists of all events. The feedback from interviewees explains why the pure presentation of events in temporal order failed. Instead, the challenge is to process the stream of events as recorded in Tower into meaningful stories about what happened over time. It also explains why the basic motto of the project, “Let the system do the walking”, did not comprise the whole story. Instead, the environment should “talk” to the users. In effect, this also means getting summaries and overview-like awareness stories. In case a user wants more detail, they want to be able to “walk around” in the electronic environment and ask the environment for more awareness information, i.e. to retrieve awareness information on request. The DocuDrama development is a first step in this direction, but further research is needed.

11.10.4 Peripheral Awareness in Electronic Settings

With Tower we wanted to provide means for peripheral awareness similar to real environments. Instead of users permanently watching events in the workspace, awareness should be immediately at hand when needed, and it must be adjusted to temporal fragmentation. For presence awareness this implies that a table of the people present must be accessible with the level of temporal accuracy that is requested for a personal contact. For workspace awareness this implies that the environment provides, on request, the story of intermediate events for the location a user is interested in. Peripheral awareness in electronic environments means immediate situative provision. In these cases, too, the awareness information spans time, that is, it comprises awareness information over a time interval.

11.10.5 Awareness Is Double-situated: The Workspace’s and the Observer’s Situation

Instead of a constant need for the same kind of awareness information, the relevance of such information depends on two situations: the situation in the actual shared electronic workspace, and the current situation of an observer, respectively a potential actor. For example, to keep all members of an electronic setting informed about the progress of work, a daily activity report that is sent out every day to all members has proven to be helpful. All team members in Tower found the BSCW daily activity report useful: “I look at BSCW daily activity reports everyday, it gives you information about who is doing what”. However, after an absence of several days, reading all intermediate reports does not suit the user’s needs. Instead, this user situation requires summaries of all intermediate events. The environment should compile a story from those events that took place since the actor’s last presence in the environment. This implies that instead of receiving a report giving the awareness information at fixed intervals of time, users want reports to be compiled dynamically according to their individual observation rhythm.
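The “since my last presence” idea can be sketched minimally. Real story generation, as argued above, would be far richer than this; counting operations per document here merely stands in for a compiled summary, and all data and names are invented:

```python
from collections import Counter

# Compile a summary of what happened since the actor's last presence,
# instead of a fixed-interval report.
def summarise_since(events, last_presence):
    recent = [e for e in events if e["time"] > last_presence]
    counts = Counter((e["document"], e["op"]) for e in recent)
    return [f"{doc}: {n}x {op}" for (doc, op), n in counts.items()]

events = [
    {"time": 5, "document": "report.doc", "op": "write"},
    {"time": 8, "document": "report.doc", "op": "write"},
    {"time": 9, "document": "plan.doc", "op": "read"},
]
summary = summarise_since(events, last_presence=4)
```

The key property is that `last_presence` varies per observer, so two team members returning at different times receive different summaries from the same event stream.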

On the other hand, when something unusual happens and the progress of work in the workspace needs the actor’s attention, awareness information must be pushed to the respective actor, e.g. by email or by activating an ambient interface in the local work setting. Thus, the awareness support in a workspace must keep track of what happens in the environment to provide the respective information to its actors. The stories compiled should be adjusted to the interaction rhythm of the environment as well as to the observation rhythms of its actors.

We can, for example, differentiate the following potential situations of an actor who joins a workspace: taking a break and looking around; finishing a document under time pressure; supervising a project’s progress; searching for a particular document or for information on a particular topic; or, as a newcomer, finding out about places of relevance and about the history of the project. All these situations differ with respect to the awareness needed.

The awareness information supplied must be dynamically adapted to these user situations. Instead of configuring tools, a user may be enabled to choose among the most suitable means of presenting awareness information, just like the choice of features offered by Tower. Furthermore, the environment should be able to detect, as far as possible, the user’s situation (by means of the events it has recorded from the user and by means of the presentation tool selected) and to adapt the provision of awareness information accordingly.

This calls for more research into concepts of adaptability to the situation of the shared environment and to the actual situation of the user and the group. Mechanisms to compile events into meaningful episodes and stories need further investigation. Similarly, the requirements of synchronicity for the provision of awareness have to be studied in more detail.

11.11 Summary and Conclusion

This chapter described an awareness environment for co-operative activities that is based on inhabited information spaces. It provides a number of user-configurable indicators. The 3D environment creates a shared space presenting users and their working context as an inhabited information space. The portal summarises all notifications in an overview presentation. We have described Smartmaps as a compact visualisation for an inhabited environment; they can be easily integrated into the information space itself to build social spaces. Furthermore, we presented ambient interfaces.

The complexity of the Tower system became evident at each demonstration and workshop, and so did the complexity of its usability. Many issues arose that could not have been foreseen until such a tool really existed. For example, the relationship in size between avatars and documents, the suitable level of detail with respect to documents and folders, a suitable mapping that facilitates navigation in the file system and the 3D world in parallel, and the navigation in the 3D world to find points of interest all had to be revised. In particular, the relevance of DocuDrama became evident and many new requirements could be raised.

Discussions with the different groups of users have shown that the needs for awareness depend on the different settings, modes of working and the particularities of the co-operation cultures. Awareness may contribute to project management and to knowledge management – e.g. for finding documents by patterns of their usage – and provide chance encounters in non-collocated teams. However, all discussions with users in the various settings have confirmed that awareness is a phenomenon of which one is not aware. Therefore, any extra effort required for the provision of awareness is considered troublesome. Instead, awareness notifications must be compiled into meaningful stories that span over time. Details must be available on request. The presentation of activity notifications must fit the changing local work situation of the particular observer. The availability of a medium- or long-term history of awareness may be useful in many situations. All this requires sophisticated mechanisms to aggregate and select the relevant data.


The effects of awareness support became evident when an interviewee said that he now worries “how I may appear to others, when they see what I am doing”. This is an aspect that requires further attention. Awareness is not only a means to see what others are doing; it will also become a means for presenting oneself to a distributed team.

Acknowledgments

The research presented here was carried out by the IST-10846 project TOWER, partly funded by the EC. We would like to thank all our colleagues from the TOWER team at Aixonix, Atkins, blaxxun, BTexact, Fraunhofer FIT, and UCL for their co-operation.


Part 5: Construction


12 DIVE: A Programming Architecture for the Prototyping of IIS

Emmanuel Frécon

12.1 Introduction

This chapter presents the Distributed Interactive Virtual Environment (DIVE) system. DIVE realises an architecture and a programming toolkit for the implementation and deployment of wide-area, Internet-based multi-user virtual environments. The platform is designed to scale to both a large number of simultaneous participants and a large number of objects populating the environments, while ensuring maximum interaction at each site.

Started as a lab tool in 1991, DIVE has now reached a stability and state that allows its use outside the research niche that it has occupied since its birth. DIVE can be compiled for a number of UNIX and Windows platforms. It benefits directly from the recent advances in three-dimensional (3D) graphics hardware that have moved 3D capabilities from professional workstations into everyday home personal computers (PCs) and probably soon to mobile terminals (see ATI, 2003; and Intel, 2003).

In this chapter, we present the DIVE system from the perspective of Inhabited Information Spaces (IIS). Our intention is to describe an example system in order to give a better understanding of the technological issues involved in the realisation of large-scale shared interactive 3D spaces: networking solutions to keep interaction high and ensure the illusion of a shared space, various interaction methods for input and output (including live audio and video communication), support for large spaces both in extent and detail, openness of the platform towards the outside world, etc. Our focus on programming interfaces to the system is justified by the very eclectic nature of IIS and the recurrent necessity to couple the space with a database.

At many different levels, DIVE tries to ensure an adequate trade-off between complexity for the programmer and capability. For example, the different programming interfaces that it offers allow the programmer to pick the interface and language that are most appropriate for the tasks at hand. However, offering such a palette of interfaces adds complexity in mastering all or some of the interfaces and in integrating all the components forming an application. Similarly, DIVE provides a number of building blocks to interface its communication architecture. Assembling these blocks in various ways lets the application developer experiment with different interest management techniques, at the cost of more programming complexity compared to other systems. However, the “standard” DIVE-based application will generally be developed without the need to focus on such complex problems.

The remainder of this chapter is structured as follows. In the next section, we describe the conceptual abstraction through which DIVE applications transparently communicate with each other, namely virtual worlds. We then focus on the wide range of programming interfaces and techniques and describe their use for IIS. Following this, we present the run-time architecture and a summary of the main modules composing the system. Finally, we examine a specific application in more detail to relate its architecture and design choices to the various key techniques presented in the rest of this chapter.

12.2 The Virtual World as a Common Interaction Medium

DIVE provides both an architecture and a programming model for the implementation of multi-user interactive virtual environments over the Internet. The architecture focuses on software and networking solutions that enable highly responsive interaction at each participating peer; that is, interaction results are, whenever possible, immediately shown locally at the interacting peer but slightly postponed at remote peers. The programming model hides networking details, allowing the programmer to focus on the space, its content and its application logic.

By peer, we mean an application process running at a specific host (computer) connected to the Internet. An application or process is any active program interfacing to the virtual environment by presenting and modifying entities (see below), monitoring and reacting on different types of events, and so on. A typical application is the 3D browser that handles interaction and visualisation of the environment for a specific user. Other applications may, for instance, perform animations and complex simulations, and present 3D user interfaces for different purposes. A number of IIS examples are presented at the end of this chapter and elsewhere in this book, e.g. the Web Planetarium (see Chapter 2) and the Pond (see Chapter 4).

A central feature in the programming architecture of DIVE is the shared distributed world database. All user and application interactions take place through this common medium. The world database acts as a world abstraction, since DIVE applications operate solely on the database and do not communicate directly with each other. This technique allows a clean separation between application and network interfaces. Thus, programming does not differ between single-user and multi-user applications. This model has proven successful: DIVE has changed its inter-process communication package a number of times without applications requiring any redesign, only minor code tweaking.

A DIVE world is a hierarchical database of what are called entities. DIVE entities can be compared to objects in object-oriented programming, although DIVE is written in plain ANSI C. The database is hierarchical to ease ordering of information and rendering. In addition to graphical information, DIVE entities can contain user-defined data and autonomous behaviour descriptions. We believe that the 3D rendering module should not dictate a structure to the content of the database. By making this distinction the system becomes more flexible, and can more easily be adapted to a large variety of rendering techniques.

Entity persistence is ensured as long as one application is connected to a world. When the last application “dies”, the world dies and its entities stop living. The next time an application connects to this world, it will reload its state from some initial description files or a Uniform Resource Locator (URL). Currently, active persistence is achieved by running monitoring processes that periodically save the state of the world to files, from which the world database can be restored if the system needs to be restarted.
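The periodic-snapshot style of persistence described above can be sketched as follows. This is a minimal illustration, not DIVE's file format or API; the function names, the JSON encoding and the atomic-rename trick are assumptions made for the example.

```python
import json
import os

def save_snapshot(world_state, path):
    """Persist the world database to a file; a monitoring process
    would call this periodically (illustrative sketch)."""
    tmp = path + ".tmp"
    with open(tmp, "w") as f:
        json.dump(world_state, f)
    os.replace(tmp, path)  # atomic swap: a crash never leaves a half-written file

def restore_snapshot(path, initial_state):
    """Reload the last saved state, falling back to the initial
    world description if no snapshot exists yet."""
    if os.path.exists(path):
        with open(path) as f:
            return json.load(f)
    return initial_state
```

When the last connected application dies and the world is later restarted, `restore_snapshot` plays the role of reloading the world from its last saved state or from its initial description.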

12.3 Partial, Active Database Replication

The DIVE architecture is based on active replication of (parts of) the database, so that a copy resides at each application process (see Figure 12.1). This model allows applications to access the world database directly from memory, which provides low-latency user interaction.

Typically, entity additions, removals and modifications are done on the local copy first, then distributed to all connected peers through network messages and applied to the local copy at each receiving peer. This is what we mean by saying that the replication of the database is active. Conceptually, programmers can think of a “global” central database residing somewhere on the network, but the database is indeed replicated at each process.

To achieve scalable real-time interaction, DIVE:

● uses a peer-to-peer multicast communication model where partitions of the world are associated to multicast groups;

● tolerates world copies that differ slightly and implements services that ensure their equality over time. Dead-reckoning techniques and run-time object update mechanisms are used to achieve this;


● divides the world into sub-hierarchies that are only replicated and used in-between the small number of applications that have expressed an interest in a particular hierarchy.

Modifications to the database are applied locally first, then transmitted and applied to all receiving copies. Thus, on the receiving side, world events are captured and transcribed into the database some time after they occur. The length of this time is highly dependent on the networks that packets will cross. At all levels, DIVE is tuned to accommodate varying round-trip times and packet losses.

Despite network latency, DIVE does not introduce differences between peers that are excessively large. The distribution principle fits with our experience of virtual environments: typically, entities will be modified in bursts and then stabilise into a common “inertia”, that is, a common state that lasts. Differences introduced by latency and lost network packets are “healed” over time by periodic synchronisation, using sequence numbers to keep track of entity versions, followed by possible update requests. As a result, with time, all connected peers will achieve the same “inertia” for these entities. DIVE pursues an idea originating in Scalable Reliable Multicast (SRM) (Floyd et al., 1995) to reduce the amount of message passing and thereby minimise network load and increase scalability. The method uses multicasting heavily, makes communication entity-based, and bases reliability on a negative acknowledgement request/response scheme.
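The sequence-number healing with negative acknowledgements can be illustrated with a small sketch. This is not DIVE's wire protocol; the `Replica` class, its method names and the per-entity version table are assumptions made to show the principle: late or lost updates are detected by comparing sequence numbers against periodic announcements, and a request is sent only when a copy is found to be stale.

```python
class Replica:
    """Per-entity version tracking with SRM-style negative
    acknowledgements (illustrative sketch)."""

    def __init__(self):
        self.versions = {}   # entity id -> last applied sequence number
        self.state = {}      # entity id -> current value

    def apply_update(self, entity, seq, value):
        # out-of-order or duplicate packets are simply ignored
        if seq > self.versions.get(entity, -1):
            self.versions[entity] = seq
            self.state[entity] = value

    def on_announcement(self, entity, seq):
        """Periodic synchronisation: return the entity to re-request
        if our copy lags behind the announced version, else None."""
        if seq > self.versions.get(entity, -1):
            return entity  # negative ack: ask the group to resend
        return None
```

Because requests are only sent when a gap is detected, quiet periods generate almost no traffic, which matches the “bursts then inertia” behaviour described above.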


Figure 12.1 The replication architecture of DIVE. A number of processes Px, hosted on several machines Hx, share two different virtual worlds W1 and W2, conceptually located “on the network”. Only P3 and P6 share both worlds; all other processes share one or the other world.


To remedy problems with network congestion and traffic overhead, and to offer the possibility for hundreds of participants to share a common place, DIVE provides a mechanism for dividing the world into sub-hierarchies that are only replicated and used in-between the small number of applications that are actually interested in them. This mechanism forms logical and semantic partitions. Each sub-hierarchy is associated with a multicast communication channel, called a lightweight group. As a result, processes that are not interested in these sub-hierarchies can simply ignore this branch of the entity tree and reduce network traffic. The top-most entity, i.e. the world, is itself always associated with a multicast channel, and every world member process must listen to it. More information on lightweight groups can be found in Frécon and Stenius (1998).
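The effect of lightweight groups can be sketched as follows. The sketch is an assumption-laden illustration, not DIVE's implementation: in reality IP multicast routing filters the traffic before it reaches the host, whereas here group membership is checked explicitly, and all names (`Peer`, `join`, `deliver`) are hypothetical.

```python
class Peer:
    """A peer joins only the multicast groups (sub-hierarchies) it is
    interested in; traffic for other branches is never processed."""

    def __init__(self, world_group):
        # the world's own channel is mandatory for every member process
        self.joined = {world_group}
        self.received = []

    def join(self, group):
        self.joined.add(group)

    def deliver(self, group, message):
        # stand-in for multicast filtering: drop traffic for
        # sub-hierarchies this peer has not subscribed to
        if group in self.joined:
            self.received.append(message)
```

A peer interested only in one room of a large world thus pays the network cost of the world channel plus that single room, regardless of how busy the rest of the world is.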

On top of lightweight groups, DIVE offers an abstraction called “holders”. This abstraction deals with the initialisation of the database branch encapsulated by the group, for example through a URL. Holders can be associated with empty multicast groups, which implies total locality of part of the virtual world and can be used for scenery or other parts of an environment that are guaranteed never to change. Holders, described in more detail in Frécon and Smith (1999) and Frécon et al. (2001), can also be used to drastically reduce network traffic through semantically richer application-driven protocols and the realisation of predictive behaviours.

12.4 Programming the System

In this section, we present a summary of the palette of techniques offered to DIVE application programmers. We will see that DIVE offers a wide range of programming interfaces, from a high-level scripting language to low-level C programming.

The variety of languages, the openness of the platform and the variety of architectural options (from stand-alone applications to distributed scripts) make DIVE a toolkit of choice for IIS systems. It fits requirements such as advanced rendering techniques, e.g. through specific plug-ins, and connectivity to external databases, e.g. through its ability to embed application data in the environment and its component-based approach.

Above all, DIVE offers a framework for the implementation of applications. Years of research development have led to a number of features that are present in few systems, such as the ability to support a wide range of avatars, spatialised audio communication between connected users, and visual environment subjectivity (Jää-Aro and Snowdon, 2001). This makes it a software platform ideally adapted to the prototyping of IIS systems.


12.4.1 The DIVE Programming Model

DIVE applications perform three distinct steps to exist and interact within the environment. Each of these steps is optional. An application will:

● introduce one or more (shared) entities to the environment;
● register for events on selected parts of the environment, usually the entities that it has added to the environment itself;
● react to events occurring in the environment through modifications to the database, i.e. modify the environment.

Consider the example of a multi-user whiteboard application such as the one described in Ståhl (1992). Such an application follows the steps above. It introduces a 3D model of a whiteboard together with icons for the different drawing tools available. It listens to user interaction events on the surface and icons, and reacts by drawing graphical elements on the surface or changing the drawing tool, e.g. from ellipse to free-hand tool.
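The three-step model (introduce entities, register for events, react by modifying the database) can be sketched with a minimal stand-in for the shared world database. All names here (`World`, `add_entity`, `register`, `modify`) are hypothetical, chosen for the example; they do not correspond to DIVE's actual C or DIVE/TCL API.

```python
class World:
    """Minimal stand-in for the shared database: entities plus
    per-entity event subscriptions (illustrative sketch)."""

    def __init__(self):
        self.entities = {}
        self.listeners = {}

    def add_entity(self, name, props):          # step 1: introduce entities
        self.entities[name] = props
        for cb in self.listeners.get(name, []):
            cb("created", props)

    def register(self, name, callback):         # step 2: register for events
        self.listeners.setdefault(name, []).append(callback)

    def modify(self, name, **changes):          # step 3: react by modifying
        self.entities[name].update(changes)
        for cb in self.listeners.get(name, []):
            cb("modified", changes)
```

In the whiteboard example, the application would register on the board's surface and tool icons, and its callback would react to interaction events by drawing elements or switching the current tool.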

12.4.2 Programming Interfaces

Given this very general principle, DIVE, itself written in ANSI C, offers a wide range of programming interfaces (see Figure 12.2).

Monolithic applications can be implemented by compiling some or all of the DIVE component libraries into a DIVE stand-alone application (or part of another application if necessary). Examples of such applications are earlier versions of VR-VIBE (Benford et al., 1995a), the Web Planetarium (Chapter 2), and the Pond (Chapter 4). DIVE offers interfaces for two major groups of languages: C/C++ and Java.

● The C/C++ interface gives full access to the component libraries of the system. This leads to both power and complexity in some cases.

● The Java interface translates the DIVE object-based approach into true object orientation. For example, all DIVE entities are mirrored by Java objects that instantiate a class in a class hierarchy that mirrors the DIVE class hierarchy (see Chapter 4 in Pettifer, 1999).

Applications can also be loaded dynamically at run-time. These typically need one or several monolithic applications to exist. There are two major interfaces for dynamic loading of applications:

● DIVE has support for plug-ins written in C or C++. By construction, plug-ins have full access to the component libraries of the system without any restrictions. Plug-ins can be loaded either when a stand-alone application starts or later on, programmatically or as a result of user actions. The standard 3D browser that comes with the system offers the user the option to select a plug-in from the local file system and load it into the 3D browser.


● More importantly, DIVE offers a high-level programming interface in the form of a scripting language. This interface is based on TCL (Tool Command Language). Any entity in the database can be associated with a script that will describe its behaviour over time. Scripts follow the DIVE programming model: they register for events and react to them. To this end, the set of standard TCL commands is enhanced with a number of DIVE-specific commands for event registration, modification of the environment, etc. This forms DIVE/TCL (see Frécon and Smith, 1999 for more information).

Finally, external applications are allowed to connect to a running stand-alone application through the DIVE Client Interface (DCI). In this case, the DIVE application acts as a server and the external application as a client (note that the communication that takes place is totally different from the standard DIVE communication mechanisms: it is a pure client–server solution tuned for usage within a machine or on a local network). External applications are represented in the environment by an entity of a specific type, which operates in the environment on their behalf. The language used here is, again, DIVE/TCL. This interface is tuned for situations where the external application cannot be easily modified but needs some DIVE facilities. Standard implementations of the client interface exist in many languages: C, TCL, Java, Oz, Prolog.


Figure 12.2 DIVE offers a wide range of programming interfaces. Most of them operate on the hierarchical database as a central abstraction. The Java interface mirrors part of this database in a pure object-oriented manner. The script interface (DIVE/TCL) is part of the database in the sense that scripts are associated to the replicated objects and describe their behaviour in a distributed manner.


12.4.3 Building your Application

Given the wide palette of programming interfaces and architectures, we now discuss the various ways of writing DIVE applications and the typical applications that these target.

“Monolithic” Programming

It is possible to write DIVE applications that apply the programming model in one sweep, i.e. within a single piece of code. There are different ways to implement these types of applications:

● C, C++ or Java applications can be programmed and linked together with the DIVE core libraries.

● It is also possible to build whole applications using the plug-in interface. These will be hosted within a stand-alone application such as above (typically the 3D browser).

● External applications can be programmed in any language of choice that has support for TCP/IP and connected to a running DIVE application through the DCI.

Such applications are integrated into a DIVE world through event registration. This offers transparent access to a networked environment. Monolithic DIVE programming is suitable for applications such as simulation engines, complex applications or simple renderers.

An example of an IIS application that uses this model is VR-VIBE (Benford et al., 1995a). VR-VIBE creates visualisations of sets of documents. Users specify keywords that they wish to use to generate the visualisation and place these keywords in 3D space. Representations of the documents are then displayed in the space according to how relevant each document is to each of the keywords (this relevance is computed by searching the documents for the keywords specified by the user and recording the number of matches).
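The relevance computation described for VR-VIBE (count keyword matches per document) can be sketched as follows. The match-counting follows the text above; the relevance-weighted placement rule is an assumption made for the example, since VR-VIBE's exact layout algorithm is not described here.

```python
def relevance(document_text, keyword):
    """Relevance as described for VR-VIBE: the number of matches
    of the keyword in the document."""
    return document_text.lower().count(keyword.lower())

def place_document(document_text, keyword_positions):
    """Place a document at the relevance-weighted average of the
    user-placed 3D keyword positions (hypothetical layout rule)."""
    total = 0.0
    pos = [0.0, 0.0, 0.0]
    for keyword, (x, y, z) in keyword_positions.items():
        w = relevance(document_text, keyword)
        total += w
        pos = [pos[0] + w * x, pos[1] + w * y, pos[2] + w * z]
    if total == 0:
        return None  # relevant to no keyword: not displayed
    return [c / total for c in pos]
```

A document matching one keyword twice and another once would thus sit twice as close to the first keyword's position as to the second's.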

Earlier versions of VR-VIBE were written as a stand-alone application entirely in C, mainly for scalability reasons. Above all, VR-VIBE is typically a compute-intensive application, e.g. the computation of document icon positions upon interaction and initialisation. It assumes that networking delays are acceptable, for example, the time for a new document position to arrive at all visualising sites. Over the years, VR-VIBE has benefited from the incorporation of DIVE/TCL. Consequently, it has now become a mixed-mode application (see below) where the core component is still written in C for scalability and the user interface uses the behaviour language.

Another example of an IIS application that uses this centralised programming model is the Pond (Chapter 4). The Pond is a multi-user, horizontally projected system for access and manipulation of information elements from the Internet, as well as for communication and collaboration between users present both remotely and physically. The current demonstrator provides a visualisation of a record database and focuses on touch and sound interfaces, “prohibiting” textual entry through a keyboard.

The Pond is mostly implemented as a stand-alone application (in Java), for performance reasons. The logic of the Pond application, e.g. flocking behaviour and Internet searches initiated by users, is deliberately separated from the visualisation (a specific DIVE 3D browser, with a number of rendering plug-ins).

Yet another application largely following this technique is the Web Planetarium (Chapter 2). The application visualises the structure behind World Wide Web documents and hyperlinks as a 3D virtual world of planet-like abstract objects and connection beams. An object is a 3D representation of a web page and, once the user is inside, displays the hyperlinks on that page as additional small objects on which the user is able to click in order to fetch new pages, and thus extend the 3D graph with new site representations.

The Web Planetarium is implemented as a stand-alone application (in C), for performance reasons. As for the Pond above, the logic of the application (parsing of Web pages, placement of documents, interaction with the users) is contained in an application separated from the visualisation. This allows for a number of different visualisation set-ups: standard desktop, multi-screen displays, etc.

Script Programming

A radically different way of programming consists of creating a DIVE world (or set of objects) with scripts attached to objects in the database. The TCL layer isolates clients from changes to the DIVE executables, which is important as DIVE/TCL has stabilised over the years and is generally kept backward compatible.

Of interest to this style of programming is the fact that the standard 3D browser provides a generic actor environment (with audio and visual input and output). On top of this generic environment, there can be any number of TCL/TK-based “skins”, presenting different user interfaces. This environment allows applications that live within the presented DIVE worlds and are programmed in the form of scripts to modify the layout of the 2D user interface at run-time, for example to add application-specific menus (see Steed et al., 1999 for an example).

Script programming is tuned for rapid prototyping, simple applications, interface experiments and animations. It is ideal when the behaviour of the application can be described through a number of somewhat independent visual objects that consume little CPU power. It is particularly suitable for adapting to changing environments of all sorts: adding new objects with new scripts to an environment will make it behave differently. This can even happen at run-time and is therefore very dynamic by nature.

An example of an IIS that uses this programming model is the DIVE Room System (Frécon and Avatare-Nöu, 1998). This application actively supports collaborative as well as individual work through the concept of rooms. The application uses a real-life metaphor and introduces virtual counterparts of objects that are usually found in meeting and shared group rooms: overhead projectors, notebooks, screens, documents, etc.

Every object in the environment has only a slight dependence on other objects. This design fact has driven the choice towards a scripting programming model. Some objects have to know about one another; for example, placing an overhead on the projector will have the effect of showing it in a bigger format on the screen. To achieve this, the application uses script-to-script communication, which is part of the scripting interface.
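The projector-to-screen coupling via script-to-script communication can be sketched as follows. This is an illustrative message-bus sketch, not DIVE/TCL's actual commands; the `ScriptBus` class, its method names and the message format are assumptions made for the example.

```python
class ScriptBus:
    """Hypothetical sketch of script-to-script messaging: each entity
    script registers a handler under its entity's name and can send
    messages to the scripts of other entities."""

    def __init__(self):
        self.handlers = {}

    def attach(self, entity, handler):
        self.handlers[entity] = handler

    def send(self, target, message):
        if target in self.handlers:
            self.handlers[target](message)
```

In the room-system scenario, the projector's script would react to a slide being placed on it by messaging the screen's script, which then shows the slide in a bigger format.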

Mixed Mode

As already alluded to several times, a key design feature of the DIVE system is the possible interconnection of all programming components. As such, the system encourages application programmers to mix programming interfaces in order to best suit the needs of the various components on which they are working.

Figure 12.3 Several views of the DIVE room system. Main picture: at the forefront is the remote control allowing remote operation of the screen, which is currently enlarging the document that has been placed on the overhead projector. Close to the projector is a set of virtual folders containing various documents. Inset: an avatar sitting at the conference table and examining a document.

A typical example is the combination of plug-ins for services and DIVE/TCL for application logic. This combination provides a very powerful prototyping process. Plug-ins register functions with the DIVE/TCL layer, and the TCL layer is able to inspect loaded plug-ins or request a plug-in to be loaded or unloaded.

Another example is external applications connecting through DCI. Such applications will often extend the entity that represents them in DIVE with some application-specific DIVE/TCL code in order to relieve the burden placed on the socket connection between DIVE and the application. WebPath (Frécon and Smith, 1998) is an example of such an application.

Mixed mode offers many possibilities; examples of use are face and body animation, new navigation styles, simulation of crowds, etc. This mixed mode is used in many of the latest IIS based on DIVE, since it combines rapid application development (using the scripting language), a component-based approach (allowing components of the application to be associated to specific programmers) and component reuse (both at the plug-in and scripting level).

An example of an IIS that uses this model is the London Traveller application described later in this chapter (Section 12.6). The application supports travellers by providing an environment where they can explore London, utilise group collaboration facilities, rehearse particular journeys and access tourist information data.

To adapt to the amount of data to visualise (a 3D model covering 16 × 10 km of London), the core DIVE rendering engine was modified. Additionally, the model is brought to life using avatar crowd and face animations. For performance reasons, this is implemented using several plug-ins. Finally, the different applications that are embedded within the London model were implemented using a set of objects and scripts, designed and implemented by different programming teams.

Another example of an IIS that uses this model is the Library Demonstrator application (Mariani and Rodden, 1999). The overall goal of this application is to provide visitors to the library with an alternative search method that will help in “fuzzy” searches, that is, searches that would typically require the visitor to speak to a librarian to get a grip on what to look for in the library. Rather than replacing the existing computer-based search methods, which work well for initiated and focused users, the demonstrator provides a complementary means of searching, adapted to users outside of this category.

The core of the library demonstrator is built on top of the Java interface in order to benefit from Java's ability to interact easily with external databases. A large part of the user interface is written using the DIVE/TCL interface, which makes it easy to experiment with a number of different approaches in order to find a suitable design.


Remote Dynamic Programming

The DIVE/TCL layer supports a remote procedure declaration and activation protocol that allows distributed script execution. Indeed, the dynamic nature of scripting languages and the distributed nature of the DIVE system make possible the real-time extension and modification of existing scripts within existing entities.

Building applications in such a context opens up new possibilities. For example, it allows for per-user customised behaviour, e.g. 2D versus 3D presentation, or the ability to extend the user interface of other participants by sharing scripts. Finally, it allows distributed application development, where the code of the application can be edited across multiple machines interactively from within the system.
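The script-sharing idea above can be sketched in a few lines. This is an illustrative Python analogue, not the DIVE API: a peer serialises a procedure as source text, sends it to other peers (the network transport is elided), and each receiver compiles and registers it locally, extending its behaviour at run time. All names here (`Peer`, `receive_script`, `handler`) are hypothetical.

```python
class Peer:
    def __init__(self, name):
        self.name = name
        self.handlers = {}          # event name -> callable

    def receive_script(self, event_name, source):
        # Compile the received source in a fresh namespace and
        # register the resulting handler, so a remote user can
        # extend this peer's behaviour while the system runs.
        namespace = {}
        exec(source, namespace)
        self.handlers[event_name] = namespace["handler"]

    def fire(self, event_name, *args):
        handler = self.handlers.get(event_name)
        return handler(*args) if handler else None

# A "script" shared by one participant to customise every peer's UI.
menu_script = """
def handler(user):
    return f"menu extended for {user}"
"""

peers = [Peer("a"), Peer("b")]
for p in peers:                     # distribution step (network elided)
    p.receive_script("on_enter", menu_script)

print(peers[1].fire("on_enter", "alice"))
```

The same mechanism supports per-user customisation: different peers could be sent different source strings for the same event.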

There are a number of situations where this model can be useful. For example, the various parts of the London Traveller application dynamically add a menu to the standard 3D browser to improve control over the information shown. The code for the menu extension is contained in the code of the London Traveller application and propagated to each user as they connect. Similarly, later versions of VR-VIBE added an extra window to any person (process) entering the world.

Figure 12.4 The main window shows the visualisation part of the library demonstrator. At the centre of the window is a focusing circle. The information for the document contained within that circle is shown in a semi-transparent legend in the top left corner of the visualisation window. To the right are a number of windows to perform queries within the library database.

12.5 DIVE as a Component-based Architecture

To complete our description of the system, the following sections discuss the main modules that compose the DIVE system and that are offered as part of the standard libraries.

12.5.1 System Components

In DIVE, an event system realises the operations and modifications that occur within the database. Consequently, all operations on entities, such as 3D transformations, will generate events to which applications can react. Additionally, there are spontaneous and user-driven events, such as collisions between objects or user interaction with input devices. An interesting feature of the event system is its support of high-level application-specific events, enabling applications to define their content and utilisation. This enables several processes composing the same application (or a set of applications) to exchange any kind of information using their own protocol. Most events occurring within the system will generate network updates that completely describe them.
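The event model described above can be sketched as follows. This is a minimal, hypothetical Python stand-in (not the DIVE API): applications subscribe to both built-in event types (e.g. transformations, collisions) and their own application-specific event types carrying arbitrary payloads, and most emitted events also produce a self-describing update for the network layer.

```python
from collections import defaultdict

class EventSystem:
    def __init__(self):
        self.subscribers = defaultdict(list)   # event type -> callbacks
        self.network_log = []                  # stands in for multicast updates

    def subscribe(self, event_type, callback):
        self.subscribers[event_type].append(callback)

    def emit(self, event_type, **payload):
        # Most events also generate a network update that completely
        # describes them, so remote peers can replay the change.
        self.network_log.append((event_type, payload))
        for callback in self.subscribers[event_type]:
            callback(payload)

events = EventSystem()
moved = []
events.subscribe("transform", lambda e: moved.append(e["entity"]))
# An application-defined event type, with content chosen by the app:
events.subscribe("app.vote", lambda e: print("vote:", e["choice"]))

events.emit("transform", entity="avatar-1", dx=0.0, dy=1.0, dz=0.0)
events.emit("app.vote", choice="hotel-b")
```

Since application-specific events flow through the same dispatcher and network path as system events, cooperating processes can use them as a private protocol.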


Figure 12.5 The different modules composing the system and their approximate layering and dependencies between the user and the network: the user interface, rendering, audio and video i/o, MIME handling, Tcl scripting, 3D i/o, the event system, the database, distribution and the SRM mechanisms.


In any application, the content of the database must be initialised. DIVE uses a module that manages several three-dimensional file formats and translates them into the internal data structures that best represent their content. Usually only one peer will load, and parse, a particular file, and the resulting entity hierarchy will be distributed to other connected peers through a series of (multicast) updates that describe the resulting entities. This mechanism differs from many other systems, which rely on being able to access the description files or URLs from all connected peers.

DIVE has an embedded scripting language that provides an interface to most of the services of the platform. Scripts register an interest in, and are triggered by, events that occur within the system. They will usually react by modifying the state of the shared database. Moreover, these modifications can lead to other events, which may in turn trigger additional scripts. A series of commands allows the logic of the scripts to gather information from the database and decide on the correct sequence of actions. For example, the simplistic script below would move the associated object 1.0 m upwards at every interaction with the mouse (2D or 3D) or any other connected interaction device. The procedure on_interaction is bound to any interaction event on the object (whose identifier is returned by [dive_self]).

proc on_interaction {event_type object_id type \
                     origin_id src_id x y z} {
    dive_move [dive_self] 0.0 1.0 0.0 LOCAL_C
}

dive_register INTERACTION_SIGNAL DIVE_IA_SELECT \
    [dive_self] "" on_interaction

12.5.2 User-oriented Components

The services described previously are independent of any DIVE application. This section focuses on the different modules present within the 3D browser.

The primary display module is the graphical renderer. Traditionally, the rendering module traverses the database hierarchy and draws the scene from the viewpoint of the user. This module has several implementations on top of various graphical libraries, such as Performer, OpenGL, Direct3D or PocketGL. Some versions support a constant frame-rate rendering mode (Steed and Frécon, 1999).

DIVE has integrated audio and video facilities. Audio and video streams between participants are distributed using unreliable multicast communication. Audio streams are spatialised so as to build a soundscape, where the perceived output of an audio source is a function of the distance to the source, the inter-aural distance and the direction of the source. The audio module supports mono-, stereo- or quadraphonic audio rendering through speakers or headphones connected to the workstation. Input can be taken from microphones or from audio sample files referenced by a URL. Similarly, the video module takes its input from cameras connected to the workstations or from video files referenced by URLs. Video streams can either be presented to remote users in separate windows or mapped onto textures within the rendered environment.
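A spatialisation model of the kind described, where perceived level falls off with distance and differs between the two ears, can be sketched as below. The attenuation curve and the function names are assumptions for illustration; the chapter does not specify DIVE's exact formula.

```python
import math

def ear_gain(ear_pos, source):
    # Assumed attenuation curve: gain falls off with distance.
    d = math.hypot(source[0] - ear_pos[0], source[1] - ear_pos[1])
    return 1.0 / (1.0 + d)

def stereo_gains(listener, source, inter_aural=0.2):
    # Place the two ears either side of the listener position, so
    # the inter-aural distance and the direction of the source both
    # influence the perceived output, as in the soundscape model.
    lx, ly = listener
    left_ear = (lx - inter_aural / 2.0, ly)
    right_ear = (lx + inter_aural / 2.0, ly)
    return ear_gain(left_ear, source), ear_gain(right_ear, source)

# A source directly to the listener's right is louder in the right ear.
left, right = stereo_gains((0.0, 0.0), (3.0, 0.0))
```

A real renderer would map these gains onto mono, stereo or quadraphonic output channels per source, then mix all sources into the final soundscape.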

Users may also be presented with a two-dimensional interface that offers access to rendering, collaboration and editing facilities. The interface itself is written using the same scripting language as offered by the world database. Consequently, applications can dynamically query and modify the appearance of the 2D interface. For example, the London Traveller application exploits this feature by adding an application-specific menu to the standard interface of the DIVE browser (see Section 12.6).

Finally, a MIME (Multipurpose Internet Mail Extensions) module is provided to better integrate with external resources. It automatically interprets external URLs. For example, an audio stream will be forwarded to the audio module, where it will be mixed into the final soundscape.

12.5.3 The DIVE Run-time Architecture

Not all components are present within the different programs of the DIVE run-time architecture:

● Session initiation: Sitting directly on top of the networking components, the name server allows other DIVE applications to enter worlds and sub-hierarchies of worlds controlled by holders. The name server listens for requests on a well-known multicast group and is queried once and only once for each world and holder. Upon request, a multicast address is returned, and this address will be used for all further communication. Several name servers, tuned to different addresses, can coexist on the Internet, thereby allowing a number of completely separate virtual universes to exist.

● Supporting architecture: On top of the networking component, the proxy server can interconnect sub-islands with multicast connectivity and/or single local networks. A thorough discussion of the proxy server and an analysis of typical DIVE traffic can be found in Lloyd et al. (2001).

● Environment evolution: On top of the system components, persistence managers ensure that the content of an environment will continue to exist and evolve even when no user is connected. One persistence manager is responsible for the state of one world, but there can be any number of managers ensuring this task on the Internet. The managers guarantee persistence with evolution for environments where the application logic is described using DIVE/TCL.

● Session participation: Finally, the 3D browser uses most components to give its user a presence within the environment. It introduces a new entity, called an actor, to the shared environment. This is the virtual representation of the real user. Additionally, it handles interaction with entities, and allows the user to move freely (or in a constrained way) within the environment, listen to all audio sources and talk through the mouth of the avatar. DIVE supports a number of input and output devices: from 3D trackers to standard mice, from multi-screen displays (Steed et al., 2001) to standard workstation screens and even personal digital assistants (PDAs).
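The session-initiation step above, where one multicast address is allocated per world on first request and returned unchanged thereafter, can be sketched as follows. The `NameServer` class, its method names and the address range are all hypothetical; the real protocol runs over a well-known multicast group rather than a local call.

```python
import itertools

class NameServer:
    """Toy stand-in for the name-server role (hypothetical class)."""

    def __init__(self):
        self.worlds = {}                 # world name -> multicast address
        self._next = itertools.count(1)

    def resolve(self, world):
        if world not in self.worlds:
            # Addresses in 224.0.0.0/4 are IPv4 multicast; the exact
            # range used here is an arbitrary assumption.
            self.worlds[world] = f"224.10.0.{next(self._next)}"
        return self.worlds[world]

ns = NameServer()
london = ns.resolve("london")        # first request allocates a group
same = ns.resolve("london")          # later requests return the same one
```

Running several such servers on different well-known addresses would give the completely separate virtual universes mentioned above, since their world-to-group mappings never meet.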

12.6 The London Demonstrator: An Example Application in More Detail

In this section we describe the London Demonstrator; our description focuses on the programming aspects of the application itself and how these relate to the core services offered by the DIVE system. A fuller description of the application and of the implementation choices can be found in Steed et al. (1999).

The broad aim of the demonstrator is to present an application to a group of users enabling them to specify and rehearse a meeting of any sort. This includes supporting the selection of features at a given location (hotels, conference venues, etc.), using both abstract and facsimile-based information visualisation approaches. The demonstrator provides users with the ability to navigate through a large virtual cityscape (representing a real location). Their navigation is aided by a number of dynamic information visualisation systems. A suite of collaborative features aids the users in constructing, rehearsing and participating in both virtual and real meetings.

Figure 12.6 An example of a running DIVE session with all involved processes, represented by discs of different grey tones for each class of process. Direct connections between processes are represented with a joining straight line; otherwise connection is through IP multicast. The DIVE application-level multicast backbone is formed by a number of proxy servers, PSx. The name server, NS, is used by all 3D browsers, 3DBx, for session initiation. Persistence managers, PMx, whose existence is sometimes controlled by monitoring applications, APx, ensure that the environments will continue living even when no 3D browser is connected.

The demonstrator consists of four main geometric and functional layers. All these services are integrated into a single coherent environment:

● A 16 × 10 km geometric model of the centre of London.
● Collaboration services for use by groups.
● A tourist information data visualisation service.
● Simulations of public transport and crowds.


Figure 12.7 A view of the enhanced 3D browser window that was developed for the London demonstrator. At the top of the figure, the application menu that the environment builds for each new user that connects to it is shown open. To the left are, from top to bottom, a list of viewpoints for quickly navigating and jumping to named points within the environment, and a list of all present users with a real-time summary of their activities (here: one user, standing still and talking). At the bottom of the figure is a text chat window.


Any user entering the London world is automatically supplied with an application menu that adds itself to the standard menus of the 3D browser. The menu offers a number of application-specific options, such as toggling the visibility of a personal compass or of a global orientation map. To achieve the modification of the menu bar, a script associated with an object of the world that has no geometrical representation watches for the events generated when new users enter the world. The script sends the combination of DIVE/TCL and Tk code required to generate the menu to the entering actor. Additionally, this object adds the geometry for the compass and the map to the visor, a logical object close to the eyes of the avatars. Both these objects have associated scripts that control their behaviour.

12.6.1 Centre of London

The model of London is based on two different types of buildings. The vast majority of the model is based on automated extrusions of the contours of the buildings, according to their height. The appearance is then controlled by a number of heuristics (texturing, shaping, etc.) to give an appropriate illusion. Additionally, a number of buildings are modelled at a higher level of detail, i.e. with external and internal architecture and furnishings.

London is divided into 16 × 10 tiles of one square kilometre each. Every tile is composed of an object whose representation is an invisible sphere. This object is associated with a script that listens to collision events in order to react to avatar presence within the sphere. Upon collision, the script will automatically toggle the loading or unloading of the content of the tile into or from the local database. This arrangement keeps in memory only the parts of the static model that are within the vicinity of the avatar. The application-specific menu offers a 2D graphical user interface to manually toggle the loading or unloading of tiles for machines with sufficient memory.
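The tile scheme above can be sketched as follows. The `TileManager` class and its loading radius are hypothetical, and the collision-driven triggering is simplified into a single `update` call that checks which tile centres lie within range of the avatar.

```python
class TileManager:
    """Keep only the tiles near the avatar loaded (hypothetical sketch)."""

    def __init__(self, cols=16, rows=10, radius=1.5):
        self.cols, self.rows = cols, rows
        self.radius = radius        # loading radius in km (assumed value)
        self.loaded = set()         # tile coordinates currently in memory

    def update(self, avatar_pos):
        ax, ay = avatar_pos
        for tx in range(self.cols):
            for ty in range(self.rows):
                # Distance from the avatar to the 1 km tile's centre.
                cx, cy = tx + 0.5, ty + 0.5
                inside = (cx - ax) ** 2 + (cy - ay) ** 2 <= self.radius ** 2
                if inside:
                    self.loaded.add((tx, ty))       # load into local database
                else:
                    self.loaded.discard((tx, ty))   # unload to free memory

tiles = TileManager()
tiles.update((8.0, 5.0))            # avatar near the middle of the model
```

In the real system each tile's own collision script does this test implicitly, so no process ever iterates over the whole grid; the sketch only illustrates the resulting memory behaviour.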

Within the conference centre and its detailed buildings, a number of scripted objects represented by invisible boundaries (boxes, for most of them) toggle the local visibility of appropriate parts of the model. For example, internal architecture is made progressively more visible as a user approaches and enters a building and its rooms. The implementation makes use of the collision and subjective-view mechanisms that are built into the DIVE platform.

Around the conference centre there are a number of invisible volumes that embrace the shape of the streets. These volumes react to collision with avatars by sending the name of the street to an information text object that was placed on the visor of the avatar when the user first entered the environment. The visor is a logical object that is always placed in front of the current eye through which the avatar is looking into the environment.


12.6.2 Collaboration Services for Use by Groups

Within the conference centre, a number of rooms are furnished with virtual counterparts of the different items that can generally be found in meeting rooms: notebooks, handouts, overhead projectors, etc. These rooms are furnished with all the elements that form the DIVE Room System, described in more detail in Frécon and Avatare-Nöu (1998). The room system is based on a number of scripted objects that understand one another when necessary. For example, when placed on top of an overhead projector, a virtual document will present its content on the nearest screen. This type of action is implemented through communication between the scripts associated with each object.

Additionally, there are a number of invisible boundaries to the rooms. Every conference room is associated with specific multicast groups for the audio and event communication within it. The content of the rooms is not present in the database of every peer. Loading of database content, subscription to multicast groups and other related operations are controlled by the scripts that are associated with the room boundaries. This arrangement provides for scalability in the usage of the rooms and in the number of users that the overall application can host.

12.6.3 Tourist Information Data Visualisation Service

The DIVE city visualisation tool has been developed to help users select places of interest according to predefined requirements. To enable this, the tool retains a database of attributes (e.g. hotel prices, star ratings, etc.) for each attraction type and provides a 3D interactive visualisation of the data above the city. The visualisation service is programmed using a number of DIVE/TCL scripts.

It reuses some of the techniques and components described above. For example, an invisible scripted boundary encloses the whole visualisation space so that all users within its vicinity are able to talk to one another without sound attenuation. By default, DIVE uses a model of sound based on distance and inter-aural distance, where the perceivable sound level diminishes with distance. However, this is undesirable in the demonstrator, where collaboration is required over large distances: the attenuation would prevent users from collaborating and talking to one another about the data being visualised.
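The boundary override can be captured in a couple of lines. The function names and the default attenuation curve below are assumptions for illustration, not the DIVE implementation:

```python
def default_gain(distance):
    # Default model: perceived level diminishes with distance
    # (assumed curve for illustration).
    return 1.0 / (1.0 + distance)

def gain(distance, inside_visualisation):
    # Inside the scripted visualisation boundary, attenuation is
    # disabled so users can discuss the data across the whole space.
    return 1.0 if inside_visualisation else default_gain(distance)
```

The point is that the override is a property of the enclosing scripted volume, not of the audio module, so other spaces keep the normal soundscape behaviour.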

A controlling window is brought up when a user interacts with the visualisation cube. The scripting code for its construction and operation is contained in the visualisation script and dynamically sent at the very moment of interaction. Interaction with this window lets users select hotels, bars and other attractions according to a number of preferences. Iconic representations of the attractions are dynamically moved within the visualisation cube when parameters are changed. This operation is very dynamic by nature and can lead to the movement of a large number of 3D objects in real time. The implementation uses aggregation of 3D transformations to relieve the burden on network throughput and ease real-time transmission of visualisation modifications at all sites.
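The aggregation idea above can be sketched as follows (hypothetical class; the actual DIVE batching granularity is not specified in the chapter): instead of sending one network update per object movement, successive moves of the same object are collapsed into a net transform, and the whole batch is flushed as a single update.

```python
class TransformAggregator:
    def __init__(self):
        self.pending = {}     # object id -> accumulated (dx, dy, dz)
        self.sent = []        # stands in for outgoing network messages

    def move(self, obj, dx, dy, dz):
        # Collapse successive translations of the same object.
        px, py, pz = self.pending.get(obj, (0.0, 0.0, 0.0))
        self.pending[obj] = (px + dx, py + dy, pz + dz)

    def flush(self):
        # One message carries the net transform of every moved object.
        self.sent.append(dict(self.pending))
        self.pending.clear()

agg = TransformAggregator()
for _ in range(100):                 # 100 small icon moves...
    agg.move("hotel-icon", 0.0, 0.01, 0.0)
agg.flush()                          # ...become one network update
```

Flushing on a timer (rather than once, as here) would bound both the latency and the message rate seen by remote sites.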

12.6.4 Real-time Simulations

The demonstrator includes three real-time simulations that enhance the travel scenario. The first is a simulation of a journey on the London Underground that arrives at a station close to the conference centre. The second is an audience in a seminar room, intended to enable talk rehearsals or to support those who suffer from a fear of public speaking (see Slater et al., 1999 for more information).

Finally, the third simulation component is a virtual crowd of avatars. The crowd simulation was designed with scalability in mind. The crowd itself is controlled through a plug-in implemented with the DIVE C interface. This plug-in, hosted within one and only one process, controls the position of the avatars that are present in the crowd. Animation of limbs is performed locally at all processes that have an interest in the crowd. This animation is implemented using a DIVE/TCL script associated with the main crowd avatar objects. Animation information is transmitted at a slow rate to give other processes a chance to catch up and to avoid swamping the network with animation messages.


Figure 12.8 A view of the main conference site. In the top right corner, a view of the tourist information data visualisation service and its controlling 2D window, which is brought to users by a simple click. In the top left corner, a view over Canary Wharf. In the bottom left corner, a view of one of the interiors of the conference centre.


12.7 Conclusion and Future Work

In this chapter, we have presented DIVE, focusing on its communication architecture and on the programming models and interfaces that it offers. We have seen that the various programming interfaces are of importance to many IIS systems. This variety, and the composite aspects that DIVE supports, allows application writers to focus on performance at crucial points only, and lets them experiment with different interfaces by reducing the cost of development through the introduction of a scripting language.

We believe that a component-based approach offering a number of interconnected programming interfaces is key to the development of an IIS application. It brings modern application development techniques into a novel domain and lets people experiment with an increased number of varying interfaces and test these on end users before the application is finalised. DIVE embodies this approach and adds a scalable communication architecture that allows for real-life Internet trials and deployment.

For the last few years, and for a number of years to come, work within the system has been heading in two distinct but complementary directions.

First, we have engaged in a finer componentisation of the system, through modularisation and the widespread use of plug-ins. Our goal is to reduce the system to a minimalist kernel on top of which a number of key components are built and through which they interface and communicate. This work is similar in some respects to platforms such as Bamboo (Watsen and Zyda, 1998), JADE (Oliveira et al., 2000) or NPSNET V (Kapolka et al., 2002).

Secondly, we are slowly moving towards a number of modules that will allow the system to serve as the core for applications that require real-time interaction and are distributed over the Internet, but that are not necessarily three-dimensional by nature. Part of this work consists in supporting different communication models: toning down the multicast orientation of the system and making multicast only one of the possible communication architectures available. For example, we are currently looking into the latest developments in peer-to-peer technologies such as CAN networks (Ratnasamy et al., 2001), SCRIBE (Castro et al., 2002) and the work derived from OceanStore (Chen et al., 2002).


13 Communication Infrastructures for Inhabited Information Spaces

David Roberts

13.1 Introduction

Inhabited Information Spaces (IIS) require advanced communication infrastructures that address the issues arising from the use of limited computational and network resources to place people within an interactive information space. This chapter describes these issues along with the ways in which they are typically addressed. IIS situate people in a social and information context where they can interact with each other and with the information itself. These users, possibly in remote geographical locations, access the environment through a variety of display devices through which they gain distinct levels of presence and immersion (Slater et al., 2001). Some may be co-located, seeing each other in the real world while immersed in the virtual environment, while others may be in some geographically remote location and represented locally as an avatar, a 3D graphic character capable of representing human-like communication, appearance, identity and activity. Information presented to the user may be shared or private, objective or subjective. It may be abstracted from live data in the real world or from simulation. Users may interact with the information to adapt its presentation, content or behaviour. Information objects often provide a focus for group activity (Greenhalgh and Benford, 1999).

IIS merge the real and the virtual. Ideally the latter should possess the richness and naturalness of the former. We would like to be able to interact with remote users as if they were standing next to us. Verbal and non-verbal communication and the use of objects in the environment are primary methods of social human communication in the real world (Burgoon et al., 1994). In IIS, shared interactive information objects may be observed, used to inform, explain, teach, heal and experiment, or serve as a basis for discussion. Sometimes it is important to see not only what each participant is doing, in relation to the shared information, but how he or she is feeling. Expressive human communication should include speech, gesture, posture and facial expressions. Shaking hands and passing task-related artefacts, from business cards to a model of a new product, are important activities in real-world group settings. We are, however, constrained by technology, physics and cost. The constraints of computers, networks, displays and acquisition devices introduce a gap between what we would like to achieve and what is currently realisable. In practice, we need to make trade-offs, reducing realism, naturalness and content where they are not needed, in order to maximise performance where necessary. This is typically addressed in terms of what each user can see and in what detail, as well as the objectivity and responsiveness of interactions with shared information.

IIS applications are numerous and have diverse requirements. Specialised IIS communication architectures attempt to strike this balance within various application genres. It has been found that striking the balance within one or more genres requires a complex architecture comprising many co-operating optimisation and balancing mechanisms. Common mechanisms will be dealt with in detail later. This chapter is concerned with the systems issues of communication in IIS; that is, how we make the best use of computers and networks to support co-located and geographically distinct users in an IIS.

13.1.1 Requirements

We set the scene by briefly introducing a number of application genres along with the balance the communication infrastructure needs to strike for each. A detailed discussion of application genres is beyond the scope of this chapter and we restrict our description to Table 13.1. The remainder of the section discusses common requirements in detail. Some architectures provide a level of configuration, and sometimes adaptation, to cope with differing application requirements, various computational resources and dynamic network characteristics.

Before we discuss the complexities of IIS communication architectures, it is important to understand what this technology can give us, what we can do with it and what kind of information needs to be communicated. The remainder of this section introduces some functional and non-functional requirements of an IIS communication infrastructure. Our discussion of functional requirements focuses on situating inhabitants in a social and information context. Non-functional requirements are derived from the various communication media used by IIS as well as from the computers and networks that these media must run on.

Information objects act as foci for activity and often collaboration. Users can collaboratively affect the presentation, content and behaviour of shared information. Simple interaction with information is often achieved through selection and manipulation tools, allowing the representation to be moved to a more suitable viewing perspective. Application-specific 3D toolbars give additional control and adaptation. The representation of information may itself incorporate handles or tools for natural interaction. The presented information can often only be understood in the context of how a group is working with it. It is therefore important to demonstrate how others are interacting with the data as well as supporting instructive and expressive communication within the group.

13.1.2 Information

Information may represent anything and be represented in many abstract forms. A reasonable question is: what can IIS offer in terms of information representation, over and above what we had before? IIS is a combination of advanced technologies and is not restricted to a particular set of these. Let us briefly look at the way in which some component technologies are changing what we can do with information. Access to unprecedented scales of data and processing is now available through technologies such as the e-Science GRID (GRID). 3D graphics, simulation and display devices give unprecedented naturalness in the viewing and steering of such information. Mobile and social agents provide powerful ways of finding, assessing and combining information. A Collaborative Virtual Environment (CVE) allows us to share information and observe those with whom we share it. IIS technology encompasses all of this and thus gives us novel ways of presenting and interacting with shared information in a distributed group setting. So what does this information look like? An advantage of computer graphics and virtual reality is that we can tailor the representation of information to any abstract form that best suits the user, application and display device. What can we do with it? We can alter the representation, change the detail, content and state, or steer the simulation. Most importantly, we can share information and share the way we work with it.

Table 13.1. Typical compromises for various application genres.

● Tele-conferencing — Maximise: expressive avatar communication. Reduce: group size; complexity and interactability of shared information.
● Scientific visualisation — Maximise: faithfulness of simulated behaviour; consistency. Reduce: group size; avatar communication.
● Cohabited agent spaces — Maximise: communication between agents and between agents and users. Reduce: avatar communication; responsiveness.
● Social meeting places — Maximise: group size. Reduce: avatar representation; responsiveness; consistency.
● Games — Maximise: responsiveness. Reduce: faithfulness; avatar representation; group size.
● Training and planning — Maximise: faithfulness; responsiveness; consistency; repeatability. Reduce: group size; avatar communication.

13.1.3 Avatars

The spoken word is often the most important medium for communication between users. It is, however, not sufficient to demonstrate how others are interacting with the information. When combined with video streaming, or a 3D avatar capable of reflecting gesture and posture, we have an effective tool for instructive and emotive communication. Viewing a remote user through a video window is, however, not effective for demonstrating how the remote user is interacting with information. Representing both the information and the remote user through 3D graphics gives a much better impression of how each user is interacting within the team and with the data. All that is left to situate the inhabitants in a social and information context is the support of expressive communication. This brings us to the topic of user-controlled, computer-generated characters (avatars).

Video avatars can provide high levels of detail, realism and expression. They faithfully reflect the actions and emotions of their user. Although there is little technical difficulty in placing a stereo video in a 3D world, it is much harder to capture imagery of the user. Problems of camera placement within a display system are exacerbated by the freedom of movement of the local user and any number of observers. Other problems include isolation of the user from his environment, occlusion of the displayed image by the cameras, and the high bandwidth requirements of multiple streams of video across a network. For these reasons, most IIS systems use avatars generated from 3D graphics. These are typically humanoid, with movable joints that provide a basic reflection of body movement. Although such avatars are not as realistic, they can provide instructive and emotive communication sufficient for many applications. In the real world we look at posture, gesture, subconscious movement and facial expression to gauge emotion. All of these can be represented through an avatar. The problem again relates to capture. A typical display device takes sparse input from the user to control the avatar. For example, a desktop system may use a mouse to control movement, mouse keys to interact with objects and the keyboard to chat. An immersive display system, such as a Head Mounted Display (HMD) or CAVE, would typically track the head and a wand held in the dominant hand. The wand provides additional input for moving long distances in the environment and for interaction with objects. Talking would normally be communicated through streamed audio. Such input is sufficient for demonstrating how a user is interacting with data. Showing any emotion through a current desktop interface is almost impossible without additional input. The combination of audio and freedom of single-handed gesture does allow a base level of expressive communication from an immersive device. It has been found that when desktop and immersive users work together, the latter take dominant roles, presumably owing to their greater ability to express themselves (Slater et al., 2000b). Further to this, we have found that where two immersive users share an environment with desktop counterparts, the former team up, mostly ignoring the latter. Greater levels of emotive communication may be achieved for any device by allowing the avatars to improvise (Slater et al., 2000a). Here the avatar will attempt to fill in the gaps left by the lack of input. Context and a profiled personality may be used to interpret user input, or the lack of it, and drive suitable emotive behaviour. Other behavioural techniques may further enhance the believability and realism of avatars. For example, behavioural reasoning may combine concurrent simple autonomous behaviours such as fidgeting, shifting weight between the feet, breathing and eye movement. Reactive behaviour is useful to define how objects, including avatars, react to given interactions. Diverse behaviour can be achieved through polymorphism, allowing something of a given type to behave in new ways to given stimuli.

13.1.4 Interaction

Some basic requirements for interaction within IIS are responsiveness, detail and intuitiveness. As they are of prime importance to the usability of the environment, they will now be discussed in more detail.

Responsiveness

A key aspect of usability and believability is maintaining responsiveness of interactions close to the level of human perception. Changes in the presented information must be represented as soon as a user affects it. Low responsiveness will make the system feel unnatural and cause frustration. Immersive displays render the environment from a new perspective every time the user moves his or her head. A low responsiveness in updating perspective causes disorientation and sometimes feelings of nausea. IIS introduce the issue of responsive sharing. This is a particular concern where users are in geographically distinct locations connected over a network. The communication infrastructure must provide sufficient responsiveness to support, and not confuse, the natural sequences of conversation and interaction.


Communication Infrastructures for IIS


Detail

Some interactions will require more detail than others. For example, to interact with another person it is often important to communicate both complex language and emotion. Even email users exchange icons to represent how they feel. The detail of presented information will be a balance between the data from which it is derived and what is useful and perceivable for each user. The communication infrastructure must support a wide range of detail in interactions and should do so in an optimum manner.

Intuitiveness

Interaction must be both natural and intuitive. A user should be able to interact with an object or peer without having to worry about overcoming shortcomings in the technology. Furthermore, an object should react in a believable way regardless of how or where it is implemented. This places requirements on both the device and the infrastructure. Display devices offer various input/output capabilities that may be mapped to interaction scenarios. For example, in an immersive system, users may use a joystick or a wand to move up to an object, their own body movement to position themselves at the correct aspect, and the wand to select and manipulate the object. Both the physical device and the way its inputs are interpreted must map to natural and believable behaviour in the virtual environment.

13.1.5 Communication Requirements

So where does this leave us in terms of communication? Representing information using computer-generated graphics gives unprecedented power to tailor its presentation. Virtual Reality (VR) uses 3D graphics to allow the user to control his or her position in the environment, giving natural access to spatially organised information. CVEs socially situate a group of users around information within a familiar spatial context. Although video avatars would offer a potentially higher level of realism, computer-generated avatars are easier to situate in an environment where users have freedom to walk around. It is not surprising that the majority of IIS systems rely primarily on 3D computer-generated graphics to present visual information. Unlike video, 3D graphics scenes, comprising the geometry and appearance of many objects, can be downloaded in advance. Where users are distributed, the scenegraph may be replicated at each user's computer with incremental changes sent across the network. This massively reduces bandwidth usage and also increases the responsiveness. Without such replication, any user movement would require a perspective recalculation of the scene on a server before the resultant images could be streamed back to the user's machine. This approach is generally unusable as the network delays result in feelings of disorientation and nausea as the user's visual inputs lag behind their internal senses of balance and proprioception.

3D graphics may be the primary medium for IIS but it is often combined with others. Natural language has been shown to be of vital importance to collaborative tasks. Audio streaming has been found much more effective than chat in IIS settings and does not require the use of a keyboard. Streaming of video and 3D graphics across the network is useful for rich and detailed images provided the observer perspective is constrained. An exception to this is tele-presence, which allows a single user to see through the eyes of a movable robot, but this is outside the scope of this chapter. Table 13.2 shows how various media are typically used together, when they are used and how they impact on available network bandwidth.

The degree to which each medium is used is application dependent. We have assumed so far that sharing the use of the information is the primary goal of an IIS communication infrastructure. Other applications, for example, may place more emphasis on emotive communication and thus use video as the primary medium. The remainder of this chapter deals with the typical and leaves specialisation to other works. We therefore restrict our discussion to systems that primarily use 3D graphics for vision, audio streaming for speech, and video and 3D graphics streaming for occasional supplementary, high detail, imagery.


Table 13.2. Usage of media in IIS and the effect on the network.

3D graphics (replicated scenegraph). Purpose: primary visual medium. Usage: continuous. Bandwidth usage: high during the initial download, then medium.

Audio. Purpose: primary natural language medium. Usage: when the user is speaking. Bandwidth usage: medium.

Video. Purpose: supplementary visual and audio medium for perspective-constrained, high-fidelity imagery. Usage: occasionally, as required. Bandwidth usage: high.

Streamed 3D graphics. Purpose: supplementary visual medium, e.g. for streaming high-end graphics to desktop systems. Usage: occasionally, as required. Bandwidth usage: high.

Text chat. Purpose: alternative to audio on desktop and public systems. Usage: when the user is chatting. Bandwidth usage: low.


13.1.6 Resources: Computers and Networks

Let us now take a brief look at the relevant characteristics of computers and networks. It is, after all, these that must underpin an IIS communication infrastructure. Computers have limitations on the amount of information they can store and process. An IIS will often contain users supported by computers of widely differing capabilities. These computers may be connected to various network technologies such as Ethernet, ATM and wireless. These networks are often part of the greater Internet and will communicate through intermediate networks of various technologies. These technologies have widely differing bandwidth, delay characteristics and reliability. IIS communication infrastructures use the Internet Protocol (IP), which deals with the heterogeneous nature of the Internet by making minimal assumptions about guaranteed service. That is, IP assumes that messages may be fragmented and individual fragments may arrive late, out of order or be lost. The Internet, and often the supercomputers running display devices and processing information, are shared resources offering highly dynamic levels of throughput depending on localised load. A final important point is that the speed of light will introduce perceivable network delays on many intercontinental links. An IIS communication infrastructure must be designed to run over a set of heterogeneous computers and networks, each with possibly very different dynamics, throughput and reliability.

13.2 Principles

IIS situate inhabitants in a social and information context that extends interaction in the real world in a natural manner. Technology, physics and cost create a gap between this ideal and reality. This section is concerned with balancing the throughput limitations of computers and networks with the requirements of IIS applications. IIS communication infrastructures employ a set of co-operating mechanisms and algorithms that effectively concentrate resources, maximising the fidelity of sharing where it is needed by reducing it where it is not. We have looked at what might reasonably be expected in terms of perception and interaction and how this may be supported through a combination of existing communication media. We have explained why 3D graphics with replicated scenegraphs have become the primary medium of communication in IIS and how these may be supplemented with other media. We restrict our discussion here to the mechanisms for improving the fidelity of sharing through 3D graphics and replicated scenegraphs.

A key requirement of IIS and VR is the responsiveness of the local system. Delays in representing a perspective change following a head movement are associated with disorientation and feelings of nausea. An IIS system supports a potentially unlimited reality across a number of resource-bounded computers interconnected by a network which induces perceivable delays. Key goals of an IIS communication infrastructure are to maximise responsiveness and scalability while minimising latency. This is achieved through localisation and scaling.

13.2.1 Localisation

Localisation is achieved through replicating the environment, including shared information objects and avatars, on each user's machine. Sharing experience requires that replicas be kept consistent. This is achieved by sending changes across the network. Localisation may go further than simply replicating the state of the environment and can also include the predictable behaviour of objects within it.

Object Model

The organisation and content of a scenegraph is optimised for the rendering of images. Although some systems, for example Cavernsoft (Leigh et al., 2000) and Arango (Tramberend, 2001), directly link scenegraph nodes across the network, most systems introduce a second object graph to deal with issues of distribution. Known as the replicated object model, we will from here on refer to it as the replication and to its nodes as objects. Objects contain state information and may link to corresponding objects within the local scenegraph.

Behaviour

A virtual environment is composed of objects which may be brought to life through their behaviour and interaction. Some objects will be static and have no behaviour. Some will have behaviour driven from the real world, for example by a user. Alternatively, object behaviour may be procedurally defined in a computer program. In order to make an IIS application attractive and productive to use, it must support interaction that is sufficiently intuitive, reactive, responsive, detailed and consistent. By replicating object behaviour we reduce dependency on the network and therefore make better use of available bandwidth and increase responsiveness. Early systems replicated object states but not their behaviour. Each state change to any object was sent across the network to every replica of that object. This is acceptable for occasional state changes but bandwidth intensive for continuous changes such as movement. Unfortunately, movement is one of the most frequently communicated behaviours in IIS. A more scalable approach is to replicate the behaviour model and only send changes to that behaviour. Such changes are known as events.


Deterministic Behaviour

Behaviour may be characterised as deterministic or non-deterministic. Deterministic behaviour need not be sent across the network provided it can be calculated independently at each replication. Most procedural behavioural descriptions, such as reactive, improvisational and emergent, may be defined in a repeatable and deterministic manner. Events can simply identify the name, and possibly the arguments, of a procedure, the execution of which will be replicated at each machine. Even non-deterministic behaviour can be approximated as deterministic provided the effects of bad approximations are not catastrophic.
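As a rough illustration, an event here need only name a deterministic procedure and its arguments, and every replica repeats the execution locally rather than receiving the resulting state. The following Python sketch uses invented names and is not taken from any particular IIS infrastructure:

```python
# Hypothetical sketch: replicating deterministic behaviour by sending
# only a procedure name and its arguments, then re-executing the same
# procedure at every replica instead of shipping the resulting state.

class Replica:
    def __init__(self):
        self.state = {}
        # Deterministic procedures: the same arguments always yield the
        # same state change, so execution can be repeated anywhere.
        self.procedures = {
            "open_door": self._open_door,
        }

    def _open_door(self, door_id):
        self.state[door_id] = "open"

    def apply_event(self, name, *args):
        self.procedures[name](*args)

def broadcast(replicas, name, *args):
    # Stands in for the network: the event, not the state, is sent.
    for replica in replicas:
        replica.apply_event(name, *args)

# Usage: two replicas converge on the same state from one small event.
a, b = Replica(), Replica()
broadcast([a, b], "open_door", "door_7")
```

The point of the sketch is the traffic saving: a single small event replaces the transfer of whatever state the procedure computes.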

Dead Reckoning

Constrained movement, such as that of a vehicle, may be determined approximately using a technique called dead reckoning (IEEE 1278.1, 1995). One of the earliest applications for large-scale virtual environments was battlefield simulation (SIMNET). Here embodiment was originally confined to vehicles, such as tanks, where the vast majority of communicated behaviour was movement around the battlefield. Dead reckoning was introduced to reduce the bandwidth consumption of movement information. A dead reckoned path represents a predicted approximation of near-future parametric movement based on recent samples of position over time. Paths are sent to other replicas in events. A remote replication then calculates the probable position of the vehicle based on the current time. Divergence is checked at the sender by comparing actual and predicted position, running the same algorithm as the receiver on the path it has sent. When divergence exceeds a threshold, a new path is calculated and sent. The algorithms for calculating the path are based on Newton's seventeenth-century laws of motion and Hamilton's quaternion expressions. Variations on the approach deal with first and second order integration, time constants and smoothing (Miller et al., 1989). The remote user is presented with an approximation of movement, the most noticeable aspect of which is sudden jumps in position when a new event is received (see Figure 13.1). The magnitude of this discontinuous jump is the product of the difference in velocity described in two adjacent events and the network delay.
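The sender-side logic can be sketched as follows. This is an illustrative first-order example in Python, not the IEEE 1278.1 algorithm itself; the class names and the threshold value are our own:

```python
# Hypothetical first-order dead-reckoning sketch. The sender runs the
# same prediction as its receivers and stays silent while the predicted
# position tracks the actual position closely enough.

class DeadReckonedPath:
    """A path event: a position sampled at time t0 plus a velocity."""
    def __init__(self, t0, position, velocity):
        self.t0, self.position, self.velocity = t0, position, velocity

    def predict(self, t):
        # First-order extrapolation: x(t) = x0 + v * (t - t0)
        dt = t - self.t0
        return tuple(x + v * dt for x, v in zip(self.position, self.velocity))

class DeadReckoningSender:
    THRESHOLD = 0.5  # metres of divergence that triggers a new event

    def __init__(self, send):
        self.send = send        # network send function
        self.last_path = None

    def update(self, t, actual, velocity):
        if self.last_path is not None:
            predicted = self.last_path.predict(t)
            error = max(abs(a - p) for a, p in zip(actual, predicted))
            if error < self.THRESHOLD:
                return          # receivers' prediction is still adequate
        self.last_path = DeadReckonedPath(t, actual, velocity)
        self.send(self.last_path)

# Usage: only the first sample and the later deviation produce events.
events = []
sender = DeadReckoningSender(events.append)
sender.update(0.0, (0.0, 0.0), (1.0, 0.0))  # initial path: sent
sender.update(1.0, (1.0, 0.0), (1.0, 0.0))  # as predicted: not sent
sender.update(2.0, (2.6, 0.0), (1.0, 0.0))  # diverged by 0.6 m: sent
```

Lowering the threshold trades bandwidth for smaller discontinuous jumps at the receiver.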

Consistency

Public switched networks, such as the Internet, introduce both dynamically changing delays and the possibility of loss. This can adversely affect the synchronisation, concurrency, causality and responsiveness of events. Synchronisation ensures that events are replicated within real-time constraints. Causal ordering ensures that causal relationships are maintained. Concurrency defines the ability of the system to allow events to occur simultaneously. Lastly, responsiveness is the delay in the user perceiving the effect of an action on the system. Concurrence, and therefore responsiveness, is reduced as the level of consistency is increased. This all leads to the need for consistency management, the role of which is to provide sufficient synchronisation and ordering while maximising concurrence and thus the responsiveness of the system. The optimal balance between sufficient synchronisation, ordering and responsiveness is application and scenario dependent. An ideal ordering mechanism provides a compromise between synchronisation and ordering on one side and responsiveness and concurrence on the other.

Synchronisation

Behaviour may be described parametrically. For example, dead reckoned paths describe movement through time. Some early systems based time on frame rate. This can be seen in some single-user computer games, where the movement of objects slows down as the complexity of the scene increases. This approach is unsuitable for IIS as shared behaviour should be consistent and not represented differently to each user dependent on the performance of the local machine. A common approach is to use the system clock on each computer to provide a continuous flow of time. Movement can then be described in terms of metres per second and will be represented at the same rate to each user.
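The difference can be sketched as follows (an illustrative Python comparison with invented function names): with wall-clock time the represented speed is independent of frame rate, whereas a fixed per-frame step slows down as frames take longer.

```python
# Hypothetical sketch: clock-based versus frame-based progression.

def advance_clock_based(position, speed_mps, elapsed_seconds):
    """Move by speed x elapsed time: identical on fast and slow machines."""
    return position + speed_mps * elapsed_seconds

def advance_frame_based(position, step_per_frame):
    """Move a fixed step per frame: slows down when the frame rate drops."""
    return position + step_per_frame

def simulate_clock_based(frames, duration_s, speed_mps):
    position, last_t = 0.0, 0.0
    for i in range(1, frames + 1):
        now = duration_s * i / frames
        position = advance_clock_based(position, speed_mps, now - last_t)
        last_t = now
    return position

# A fast machine (60 frames) and a slow one (6 frames) represent the
# same 2 seconds of movement at 1.5 m/s identically with a wall clock;
# a fixed per-frame step would cover ten times less distance on the
# slow machine over the same wall-clock interval.
fast = simulate_clock_based(60, 2.0, 1.5)
slow = simulate_clock_based(6, 2.0, 1.5)
```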

As well as progression it is important to synchronise the start of replicated events. Some systems, for example NPSNet (Macedonia et al., 1994), set the start time of a received event to the time at which it was received. This removes the need to synchronise local clocks accurately, which is a non-trivial task. The disadvantage with this approach is that any behaviour is offset by the network delay. Through synchronising local clocks it is possible to synchronise the state of objects from the time an event arrives until the time a subsequent overriding event is sent. The PaRADE system, developed as part of the author's PhD (Roberts, 1996), allows locally predictable events to be sent in advance, thus overcoming the network delay and synchronising from the start.

Figure 13.1 Effect of dead reckoning (panels: local movement; remote representation).

Concurrency Control

Concurrency control is an important subset of consistency management that deals with the prevention of concurrent conflicting updates. This is most apparent where two users try to move a given object in conflicting directions. Without concurrency control it is difficult to determine the outcome, but it will at least cause confusion and frustration and at worst an unrecoverable divergence between replicas. Many existing infrastructures do not include concurrency control. Those that do employ algorithms that are themselves adversely affected by network latency. This in turn affects the responsiveness of interaction between user and shared information objects. A conservative concurrency algorithm, as used in some analytical simulations, would lock the whole world and allow updates on a turn basis. This unnecessarily restricts responsive interaction to a level that is unworkable for general IIS applications. An optimisation is to increase the granularity of locking, to either sets of objects, object or object attribute level. A common mechanism for concurrency control is transferable object ownership, where a user can only affect an object once ownership has been transferred across the network. The effect of such latency is normally apparent in a delay in being able to interact with an object recently affected by another user. Optimisations have been developed for predicting interactions and transferring ownership in advance (Roberts et al., 1998).
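A minimal sketch of ownership-based concurrency control follows (illustrative names; the ownership transfer here resolves instantly, whereas in a real system it costs a network round trip, which is exactly the latency discussed above):

```python
# Hypothetical sketch: a replica may only update an object while it
# holds ownership; ownership is transferred on request.

class SharedObject:
    def __init__(self, object_id, owner):
        self.object_id = object_id
        self.owner = owner            # site currently holding the master
        self.position = (0.0, 0.0)

    def request_ownership(self, site):
        # Stands in for a network round trip to the current owner;
        # here the transfer always succeeds immediately.
        self.owner = site

    def move(self, site, new_position):
        if self.owner != site:
            raise PermissionError(f"{site} does not own {self.object_id}")
        self.position = new_position

# Usage: a second site must acquire ownership before moving the object.
chair = SharedObject("chair_1", owner="site_a")
chair.move("site_a", (1.0, 0.0))      # the owner may move it
try:
    chair.move("site_b", (9.0, 9.0))  # a non-owner is refused
except PermissionError:
    refused = True
chair.request_ownership("site_b")
chair.move("site_b", (2.0, 0.0))
```

Because only one site can hold ownership, conflicting concurrent moves are serialised rather than diverging.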

Causality

Events sent over the Internet may be lost or arrive in a different order from that in which they were sent. In many cases the current state is more important than history and can be derived from an old state and a new event, even when some preceding events have been missed. For example, a new dead reckoned path overrides the last and is not dependent upon it. Ordering is, however, often vital. A lack of ordering can cause complete confusion when collaborating with remote users and sharing objects. It is therefore surprising that the majority of IIS systems do not guarantee it. This is most likely a throwback to the conventional applications of collaborative virtual environments that did not properly support shared interaction.

Order must be balanced against responsiveness. The greater the level of ordering, the lower becomes the concurrence and thus the responsiveness. A true objective state of an environment cannot be guaranteed until all events have been received and processed in the correct order. Generating a new event before the objective environment state is known is dangerous and requires some strategy for dealing with events generated on the basis of an untrue state. To guarantee objectivity all replicas must be frozen while waiting for events to arrive, thus lowering the concurrence. Lamport developed an optimisation called causal ordering which removed the need to order events that could not have been related (Lamport, 1978). The definition of causal relationship was based on the subjective view of a replication. Total ordering and Lamport causal ordering work well in distributed analytical simulation but are not generally suited to IIS applications, which require continuous and responsive interaction with the environment. One solution is to allow the IIS infrastructure to decide when and where to apply ordering. This decision may be based on application knowledge of causality and importance of ordering, awareness (see below) and network conditions. Such approaches have been applied to various degrees in PaRADE, MASSIVE 3 (Greenhalgh et al., 2000b) and PING (Section 13.3.2).
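Lamport's logical clocks, on which causal ordering builds, can be sketched in a few lines. This illustrative Python shows only the timestamping rule, not a full causal delivery protocol:

```python
# Hypothetical sketch of Lamport logical clocks: if event a could have
# caused event b, then a's timestamp is strictly smaller than b's.

class LamportClock:
    def __init__(self):
        self.time = 0

    def tick(self):
        """A local event (including a send): advance the clock."""
        self.time += 1
        return self.time

    def receive(self, message_time):
        """On receipt, jump past the sender's timestamp."""
        self.time = max(self.time, message_time) + 1
        return self.time

# Usage: a send on one replica and the receive on another are ordered,
# as is anything the receiver does afterwards.
a, b = LamportClock(), LamportClock()
sent_at = a.tick()                  # a sends a message stamped 1
received_at = b.receive(sent_at)    # b jumps to 2
reply_at = b.tick()                 # any later event on b is stamped 3
```

The saving over total ordering is that events with no possible causal path between them need not be ordered at all.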

Application of Consistency

Now that we have introduced synchronisation, concurrency control and ordering as the basic components of consistency, we can look at how they are applied. Table 13.3 compares common alternative mechanisms for each, describing each mechanism, giving an application-level example of use, comparing typical delay in terms of the level of human perception, and giving some example infrastructures in which they are used.


Table 13.3. Comparison of consistency mechanisms.

Synchronisation (the behaviour of an object is synchronised over replicas):

Wall clock. Effect: a remote object update follows parametric behaviour. Example: dead reckoning. Induced delay: none. Example infrastructures: most.

Tick. Effect: replicas update in step. Example: a crowd walking in step. Induced delay: medium. Example infrastructure: RTI.

Concurrency (object replicas are affected concurrently):

Convergence. Effect: diverging states are converged. Example: a tug of war with an elastic rope. Induced delay: low. Example infrastructures: DIVE, Spelunk.

Ownership. Effect: prevents divergence through a unique key. Example: passing a business card. Induced delay: low. Example infrastructures: DIVE, PaRADE, PING.

Ordering (the order of object events over replicas):

Causal. Effect: events are ordered based on potential causality. Example: a player's activity is ordered within the ball game but not with the spectators. Induced delay: medium. Example infrastructures: MASSIVE 3, PaRADE.

Total. Effect: all events are ordered. Example: a player's action is delayed until an earlier spectator action has been observed. Induced delay: high. Example infrastructure: RTI.


13.2.2 Scaling

Scaling allows the amount of information in the environment, including the number of users, to increase without reducing the fidelity of experience to any one user. This is achieved by balancing each individual's need for information with what can be achieved given available computational and network resource (Benford and Fahlén, 1993a). Awareness management is the mechanism used to balance an individual's ideal awareness with resources. The scale of information provided to any one user or process may be controlled in terms of extent, granularity and detail. These define awareness in terms of object subsets of the environment, aggregation of many objects into few, and the attributes of a given object.

Extent

The majority of effort in attaining scalability has focused on subdivision of the environment and population according to interest. This is often referred to as interest management. Awareness of remote objects is determined by context-dependent interest. Distinct resources, such as servers or communication groups, discussed later, are used to support each area of interest. The interest of a user is dynamic and context dependent. For example, by walking into another virtual room a user becomes aware of its contents and occupants. A number of technical issues must be addressed in order to support this dynamic awareness. Subdivision should be natural and appear transparent, otherwise it can affect a user's behaviour. To be effective it must balance resource usage across the areas of interest. Changing awareness may require much data to be transferred. This can result in delays in the presentation of a new area, which may reach the order of seconds. Different application genres are suited to distinct definitions of interest and methods of subdivision. The granularity of subdivision may be tackled at world, object or intermediate level. We now survey some classic approaches to subdivision used in IIS, which are summarised in Table 13.4.

Multiple Worlds

A simple method for dividing the environment and population is to provide distinct multiple worlds. Each world is typically supported by a distinct server and hosts a distinct set of objects and users. As discussed in the deployment section, this is straightforward to support over the Internet and thus is prevalent in current systems used by the general public, for example in games such as Ultima Online (Electronic Arts, 2003) and social environments. Users typically inhabit a single world at a given time and, in some systems, may move between these worlds using portals (Snowdon et al., 2001). The disadvantage of this approach is the difficulty in balancing the number of users in each world. Figure 13.2 shows multiple worlds interconnected through portals and demonstrates the potential problems with balancing population.

Static Spatial Subdivision

Increasing the granularity of subdivision allows worlds to be split into areas of interest. An approach developed for battlefield simulation and training was to divide the environment into areas of interest in the shape of equal hexagonal tiles and map each to a communication group (Macedonia et al., 1995). A process sends information to the group associated with the tile occupied by its user and receives information from that group and those associated with adjacent tiles. The supporting process dynamically joins and leaves groups as the user moves between tiles. Receiving information from adjacent tiles removes the problem of not seeing spatially close objects across a border. Group communication provides a mechanism for limiting awareness at a message distribution level, with the added bonus of removing the need for a server.


Table 13.4. Overview of classic subdivision approaches.

Multiple worlds. Description: separate worlds connected through portals. Granularity: world. Example systems: Active Worlds, Ultima, DIVE.

Static spatial subdivision. Description: divide the world surface into tiles. Granularity: intermediate. Example system: NPSNet.

Dynamic spatial subdivision. Description: a flexible mesh of tiles that stretches to balance tile membership. Granularity: intermediate. Example system: VIVA.

Locales. Description: rooms. Granularity: intermediate. Example system: SPLINE.

Aura. Description: aura, focus and nimbus. Granularity: object. Example systems: DIVE, MASSIVE I and II.

Regions. Description: abstract spaces. Granularity: intermediate. Example systems: MASSIVE II, DIVE (COVEN version).

Figure 13.2 Multiple worlds, showing portals and possible population loading.


It is not yet, however, generally supported on the Internet, and so this method adds complexity to deployment, which is discussed later. Again, the static nature of this method can also produce an unbalanced population of areas (see Figure 13.3).
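The tile mechanism can be sketched as follows. A square grid stands in for the hexagonal tiles, and the tile size and group identifiers are invented, so this is illustrative only:

```python
# Hypothetical sketch of tile-based interest management: a process
# subscribes to the communication group of the tile its user occupies
# and of all adjacent tiles, changing subscriptions as the user moves.

TILE_SIZE = 100.0  # metres per tile (arbitrary for illustration)

def tile_of(x, y):
    return (int(x // TILE_SIZE), int(y // TILE_SIZE))

def groups_for(x, y):
    """The occupied tile plus its eight neighbours (square grid)."""
    tx, ty = tile_of(x, y)
    return {(tx + dx, ty + dy) for dx in (-1, 0, 1) for dy in (-1, 0, 1)}

def update_subscriptions(current_groups, x, y, join, leave):
    needed = groups_for(x, y)
    for group in needed - current_groups:
        join(group)   # e.g. join a multicast group
    for group in current_groups - needed:
        leave(group)
    return needed

# Usage: crossing a tile border joins one new column of three groups
# and leaves the column that fell out of range.
joined, left = [], []
groups = update_subscriptions(set(), 50.0, 50.0, joined.append, left.append)
groups = update_subscriptions(groups, 150.0, 50.0, joined.append, left.append)
```

Subscribing to the neighbouring tiles as well as the occupied one is what prevents objects just over a border from disappearing.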

Locales

Environment plays an important role in restricting and focusing human interaction. Spatial subdivision approaches are suited to open spaces but do not take advantage of the awareness limits imposed by buildings. Locales (Barrus et al., 1996) are areas of interest that map to physically divided spaces such as rooms in a building (Figure 13.4). This approach relies on the adequate provision of resources to support a crowded room and again suffers from its static nature. It is, however, sufficient for many applications.

Figure 13.3 Static spatial subdivision.

Figure 13.4 Locales.

Dynamic Space Subdivision

The above approaches rely on an even distribution of users across statically defined areas of interest. This suits them to particular implementations and restricts their general applicability. Dynamic space subdivision attempts to redefine divisions between areas of interest in order to balance the number of users in each (Figure 13.5). Robinson et al. (2001) divide the environment into a 2D mesh or 3D lattice and move the boundaries between the areas of interest to balance membership. Boundary movement is considered when an area becomes overpopulated and is determined through negotiation between servers dedicated to adjacent interest areas. Robinson's algorithm considers the cost of moving a boundary to both servers and clients.
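A one-dimensional caricature of this negotiation is sketched below. It is illustrative only: Robinson's algorithm also weighs the migration cost to servers and clients, which this sketch ignores.

```python
# Hypothetical sketch of dynamic subdivision: the boundary between two
# adjacent interest areas drifts toward the emptier side, shrinking the
# overpopulated area, until the two populations roughly balance.

def rebalance(boundary, user_positions, step=1.0, tolerance=1):
    left = sum(1 for x in user_positions if x < boundary)
    right = len(user_positions) - left
    if left > right + tolerance:
        return boundary - step   # shrink the crowded left area
    if right > left + tolerance:
        return boundary + step   # shrink the crowded right area
    return boundary

# Usage: with most users on the left, the boundary drifts left and
# then settles once the populations are within tolerance.
users = [1.0, 2.0, 3.0, 4.0, 9.0]
b = 5.0
for _ in range(3):
    b = rebalance(b, users)
```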

Aura

Interest may be determined at the granularity of object pairs by determining their potential for interaction based on spatial proximity. Spatial proximity may be efficiently detected by placing auras around objects and checking for aura intersection. In the case of avatars, this potential for interaction is increased when they face each other. Benford and Fahlén (1993b) encapsulate avatars in auras and use aura collision as a prerequisite for interaction. Within the aura, focus and nimbus spatially define attention and projection respectively (Figure 13.6). Both focus and nimbus reach out in front of the avatar but have distinct shapes.
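With spherical auras, the intersection test reduces to a distance comparison. A minimal Python sketch follows; the shapes, radii and any facing-direction refinement are application specific and the values below are invented:

```python
import math

# Hypothetical sketch: spherical auras intersect when the distance
# between their centres is no more than the sum of their radii.
# Intersection is only the prerequisite for interaction; focus and
# nimbus would then refine attention and projection.

def auras_intersect(centre_a, radius_a, centre_b, radius_b):
    distance = math.dist(centre_a, centre_b)
    return distance <= radius_a + radius_b

# Usage: two avatars 5 m apart with 3 m auras may begin to interact;
# at 20 m apart they remain unaware of each other.
near = auras_intersect((0.0, 0.0, 0.0), 3.0, (5.0, 0.0, 0.0), 3.0)
far = auras_intersect((0.0, 0.0, 0.0), 3.0, (20.0, 0.0, 0.0), 3.0)
```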

Regions

Both tiles and Locales are specific definitions of how to divide the environment and are applicable to distinct forms of interaction and application genres. MASSIVE 2 combines aura-based awareness within abstract regions which may be mapped to application-specific definitions of interest. Figure 13.7 depicts one possible way of dividing an environment into regions.


Figure 13.5 Dynamic spatial subdivision.

Figure 13.6 Aura: focus and nimbus.


Granularity

In the real world people are able to reason at different levels of granularity. For example, a lecturer must be aware of the attention and understanding of each student during a lecture, whereas a university chancellor sees the institute in terms of departments. This approach of aggregation may be adopted in IIS systems to further increase scalability. Aggregation reduces not only the rendering but also the amount of information needed by some observing processes. For example, in a battlefield simulation, the driver of a tank is interested in other tanks whereas a general is more concerned with tank divisions (Singhal and Cheriton, 1996). Another example is that of a crowded stadium represented by a single avatar (Greenhalgh, 1998). The size of the group, the team they support, and the sound they produce are represented through the avatar's size, colour and aggregated audio streams respectively. Emergent behaviour may be replicated and communicated in aggregated form to reduce the load on the network. For example, the behaviour of a flock of birds could theoretically be replicated by simply communicating the size of the group and then continuing to communicate the movements of whichever bird is in front. A reasonable flocking behaviour can then be replicated at each site through application of local rules based on following and collision avoidance. This aggregated emergent behaviour may be applied to many other group behaviours, for example the behaviour of a human crowd. A similar principle can be applied to an avatar, allowing the majority of body movement to be calculated locally and driven by the communication of movement of selected body points, such as the head and hand. Here, a combination of kinematics and selections of previously recorded motion tracking data can be used to improvise reasonable local behaviour based on head and hand movement.
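The flock example can be caricatured as below: only the lead bird's sampled positions cross the network, and each site grows the flock locally with a deliberately crude following rule (a real system would use proper flocking rules with collision avoidance):

```python
# Hypothetical sketch of aggregated emergent behaviour: reconstruct a
# whole flock at each site from just the communicated path of the lead
# bird, making bird i trail bird i-1 by one position sample.

def replicate_flock(leader_path, flock_size):
    flock = []
    for i in range(flock_size):
        # Bird i lags i samples behind the leader; before its first
        # sample it simply holds the leader's starting position.
        path = [leader_path[max(0, t - i)] for t in range(len(leader_path))]
        flock.append(path)
    return flock

# Usage: a flock of three birds reconstructed from a four-sample
# leader path, with no per-bird network traffic.
flock = replicate_flock([0.0, 1.0, 2.0, 3.0], flock_size=3)
```

However large the flock, the communicated data stays the size of one bird's path plus the group size.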

In order to reduce network traffic through aggregation it is necessary for the sender to know the level of aggregation. Although aggregation can increase the scalability of a receiving process, it can decrease the scalability of the sender and the use of the network when many receivers require distinct levels of aggregation for the same objects (Roberts, 1996).

Figure 13.7 Regions.

Detail

We have seen how scalability may be increased by reducing the number of communicating objects held on each machine. Scalability can be further increased by managing the detail at which individual objects are replicated. Heuristics of interest, such as distance or the relationship between the role of the observer and the use of the observed, may be applied. Many graphics languages, such as Inventor, Performer and VRML, support Level of Detail (LOD) modelling, where a sufficient frame rate is maintained by reducing the graphical complexity of distant objects. The scalability of communication and computation can be greatly increased by applying this reasoning to the communication of behaviour. Objects may be defined in terms of attributes in which remote processes can dynamically express and decline interest, for example as defined in IEEE 1516 and implemented in the DMSO RTI. Balancing the detail of communicated behaviour with the interest of remote users is an important, if under-researched, topic. The amount of information being received may be reduced through local filtering or by sending control messages back to the sender. The latter approach again suffers from the potential need to send distinct levels of information to different receivers. A hybrid approach might send the highest detail required by any receiver to all, and allow receivers to filter further.
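A minimal sketch of applying LOD reasoning to the communication of behaviour (the distance bands and update rates here are illustrative assumptions, not values from any cited system): the sender throttles an object's update frequency according to its distance from the observer.

```python
def update_interval(distance, bands=((10.0, 0.05), (50.0, 0.2))):
    """Choose the interval (in seconds) between movement updates for an
    object: near objects are updated often and in detail, distant ones
    rarely.  Bands are (maximum distance, interval) pairs, nearest first."""
    for max_dist, interval in bands:
        if distance <= max_dist:
            return interval
    return 1.0  # beyond the last band: a minimal background update rate
```

The same lookup could drive which attributes are sent at all, not just how often.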

13.2.3 Persistence

Users can join, leave and rejoin collaborative virtual environments at will. When in the environment, they can affect its state through interacting with, and introducing, objects. A real-world analogy is a bank account. When someone deposits money into a bank cash machine, the money should not be lost as soon as the card is withdrawn. Persistent environments will maintain the effect of changes when the user leaves. Supporting persistence is straightforward when the underlying CVE infrastructure hosts all master objects on servers. Where a localised approach has been adopted to increase scalability or responsiveness, master objects will be held in the memory of a user's machine. These must be moved to a participating machine when the user leaves. Provided the behaviour of an object is known at the target site, it is only necessary to move the current state and master status of the object.

There are two basic forms of persistence: state and evolutionary. State persistence maintains an object in a static state once its owner has left. Evolutionary persistence will support the ongoing behaviour of an object. For example, in a lecturer's bank account, which is always overdrawn, the money deposited will be reduced over time by interest payments.
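The distinction can be made concrete with the chapter's own overdrawn-account example (a hypothetical class of our own; the rate and compounding rule are invented for illustration):

```python
class Account:
    """An object whose behaviour continues while its owner is away."""

    def __init__(self, balance, monthly_rate=0.01):
        self.balance = balance
        self.monthly_rate = monthly_rate

    def evolve(self, months):
        """Evolutionary persistence: while the owner is absent, an
        overdrawn balance is charged compound interest each month.
        With months=0 this degenerates to state persistence: the
        balance is simply held unchanged."""
        for _ in range(months):
            if self.balance < 0:
                self.balance *= 1 + self.monthly_rate
        return self.balance
```

State persistence would freeze the balance at departure; evolutionary persistence keeps running the interest script.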

So far we have only considered what happens to objects when a user leaves. We must also consider the effect of the environment going offline. Such an occurrence may be planned or accidental. In either case we may wish to guarantee persistence. One solution is to store object state information to disk on a persistency server, both periodically and, where possible, when an imminent failure is predicted.

13.2.4 Communication

Previous sections have introduced the kind of information that must be passed through an IIS, and we have described object-level mechanisms for managing this information in order to maximise responsiveness and scalability. We now move down to the message level to examine how to actually communicate this managed information.

Requirements

The communication requirements of an IIS are complex. Those of responsiveness, reliability and scale of information transfer differ greatly depending on application, context and scenario. Before we describe the method of communication we must look at the content. We now examine some typical forms of information and their requirements on the underlying communication system. This is broken down into discovery of objects, events, audio and video.

Discovering Objects

When a client alters its awareness by entering a new world or area of interest, it must discover the objects within. Some mechanism is required for the client to obtain all the information about every object it discovers. This information includes state, behaviour and graphical appearance. Behaviours, and particularly appearance descriptions, tend to be much larger than state, but in most systems remain unchanged throughout the lifecycle of an object. Such information is typically in the order of kilobytes per object. Usually such data only needs to be sent to one client at a time, and it must be sent reliably, in order and preferably efficiently. Users frequently move between areas of interest, which results in traffic bursts as the local system downloads object state and possibly appearance and behaviour. In turn this can result in delays often reaching several seconds. It is therefore important to use an awareness management scheme that minimises movement between areas as well as the number of objects in each. Some systems download from an existing peer process, but this can cause that process to lock, which is disorientating for its user. The responsiveness of remote peers may be maintained by obtaining all object information from a persistence server.

Events

The behaviour of objects is driven and communicated by events. Events need to be propagated to any interested process as quickly as possible. They are typically very small in terms of network bandwidth. Many events are frequent and quickly superseded. Others are infrequent, and their loss might cause applications or users to act in an erroneous way.

The majority of events typically describe movement. Constant latency is important as it improves the realism of remote movement. As discussed above, in the context of event ordering, it is typically more important to reflect the current position than how the object arrived at it. Since we may send many movement events for a given object each second, and the probability of message loss is low, lost movement events will seldom be noticed. An important exception to this rule is introduced by dead reckoning, where the frequency of path generation is considerably lower. We presented a scheme for addressing this problem by reliably re-sending dead reckoned paths that had not been superseded within a time limit (Roberts, 1996). Tracking systems allow natural non-verbal communication but generate quantities of events that are difficult to support over the network. During trials between networked reality centres in the UK and Austria, we found it difficult to realistically approximate human movement with dead reckoning, but have had greater success limiting the frequency of outgoing events for given objects by simple filtering.
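The core of dead reckoning can be sketched in a few lines (a simplified illustration under our own assumptions; the threshold and the straight-line motion model are invented, and real systems such as DIS use richer path descriptions):

```python
def extrapolate(pos, vel, dt):
    """Receiver side: predict the current position from the last
    dead-reckoned path (position + velocity), with no new traffic."""
    return tuple(p + v * dt for p, v in zip(pos, vel))


def needs_update(true_pos, pos, vel, dt, threshold=0.5):
    """Sender side: run the same prediction the receivers run, and only
    re-send a path when the remote prediction has drifted more than
    `threshold` from the true position."""
    predicted = extrapolate(pos, vel, dt)
    error = sum((t - p) ** 2 for t, p in zip(true_pos, predicted)) ** 0.5
    return error > threshold
```

This is why a lost path event is so damaging: the receiver keeps extrapolating a stale path until the next one arrives.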

Bursts of events typically accompany interaction with other avatars, objects or both. For example, avatar communication may well include gesticulation and talking. This results in bursts of movement events and audio traffic. Such exchanges can occasionally swamp bandwidth and overrun receive buffers, resulting in high message loss. This is particularly the case for groups of interacting users. Remote events can sometimes be delayed for seconds while the IIS system attempts to catch up with the receive buffer, resulting in a temporary loss of responsiveness. In this case the loss of movement events is preferable, as it brings the system back to a synchronised state in a shorter time. Some systems, for example PING and DIVE, limit this time through a bucket algorithm.
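The recovery strategy of discarding superseded movement events while keeping everything else can be sketched as follows (a simplification of our own, not the actual bucket algorithm used by PING or DIVE; the tuple layout is invented):

```python
def compact_backlog(events):
    """Collapse a backlog of (object_id, kind, payload) events: keep only
    the most recent movement event per object, since earlier positions
    are superseded, but preserve every non-movement (e.g. structural)
    event, all in arrival order."""
    latest_move = {}  # object id -> index of its newest movement event
    for i, (obj, kind, _) in enumerate(events):
        if kind == "move":
            latest_move[obj] = i
    return [
        e for i, e in enumerate(events)
        if e[1] != "move" or latest_move[e[0]] == i
    ]
```

Dropping the stale positions lets a swamped receiver resynchronise quickly without risking the vital events discussed next.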

Some events may be vital, particularly where they affect the result of, or the ability to process, subsequent events. This includes any event that changes the structure of the scenegraph. Such events are commonplace where users interact with objects. Losing such events can cause significant divergence between users' views. For example, one user sees that he has taken an object out of another's hand, while the other sees herself still holding it. At best this causes confusion and at worst, an unrecoverable divergence.

Audio

Verbal communication considerably improves the performance of general collaborative tasks as well as the feeling of co-presence. In order for audio communication to support human conversations it must be continuous and have a constant rate and sufficient resolution. Network bandwidth and message loss can reduce resolution. Network jitter, where heavy network traffic causes temporary high delays, can alter the rate at which the data is delivered. The COVEN trials suggest that audio traffic is in the order of kilobytes per second for each user (Greenhalgh, 2001).

Video

Video has similar requirements for continuity and rate, but high resolution images can require much higher bandwidths. Typical IIS use video sparingly, mapping low resolution streams to polygons. For example, a low resolution video avatar might require tens of kilobytes per second.

Solutions

Now that we have described how information needs to be communicated, we will look at ways in which this is achieved in typical IIS. In particular we focus on how data is prepared for sending over the network and how it may be disseminated to one or many recipients with various Qualities of Service (QoS) of delivery.

Preparation

Before being sent over the network, data is marshalled into a flattened message. This message is split into packets and sent across the network. Transport level network protocols convert between messages and packets. The size of a packet is determined by the underpinning link level protocol. The Internet Protocol (IP) adopts a maximum packet size from the underlying network technology. In an IIS system a packet might contain one or several events, and a continuous flow of packets might support an audio or video stream.
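A hedged sketch of marshalling and packetisation (the event layout and the tiny MTU are invented for illustration; a real IP MTU is typically around 1500 bytes):

```python
import struct


def marshal_event(obj_id, x, y, z):
    """Flatten a movement event into network byte order: a 4-byte
    unsigned object id followed by three 32-bit floats (16 bytes)."""
    return struct.pack("!Ifff", obj_id, x, y, z)


def to_packets(message, mtu=8):
    """Split a flattened message into packets no larger than the MTU.
    An MTU of 8 bytes keeps the example visible; real links are larger."""
    return [message[i:i + mtu] for i in range(0, len(message), mtu)]
```

The receiver reassembles the packets and unmarshals with the same format string.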

Dissemination

Dissemination determines whether a message is sent to a single recipient or to a group of recipients. Distinct forms of dissemination are offered by various transport level protocols. Group communication multicast is often used in IIS to scale the number of users. Multicast allows hosts to express interest in any set of communication groups. A message sent to a group will be distributed to every member at no extra cost to the sender. The scalability of the sender is maximised, while that of the receiver can be increased by mapping awareness to groups. Group dissemination may alternatively be implemented above point-to-point protocols by sending a message to a set of connections, for example in Spline (Waters et al., 1997) and PaRADE (Roberts et al., 1999).

Delivery

Not all packets that are sent arrive, or arrive in the correct order. Their subsequent assembly into a message, and delivery to the application, may be delayed while these errors are overcome. QoS determines what criteria, in terms of reliability, order and timeliness, will be met before delivery. Generally, the higher the reliability and level of ordering, the lower the responsiveness and scalability. This is particularly the case for group communications. Different transport level protocols offer distinct QoS in addition to dissemination. Some systems, for example DIVE and PaRADE, implement additional or improved qualities of service above the transport level.

Mapping Information to Dissemination and Delivery

We have seen how various types of information in an IIS require distinct levels of dissemination and QoS. Some IIS systems simplify their design by using single dissemination and delivery methods and accept the drawbacks. Others, for example HLA RTI, PaRADE, PING and DIVE, combine various dissemination and QoS delivery methods to optimise performance. Table 13.5 suggests how an IIS might map information type to dissemination and QoS. This table is derived from combining best practice of PaRADE, PING and DIVE.

Table 13.5. How an IIS might map information type to dissemination and QoS.

Type              Example               Reliability  Order   Responsiveness  Dissemination  Throughput
Downloads         Object discovery      High         High    Low             One            High
Regular events    Movement              Low          Latest  High            Many           Medium
Irregular events  Object creation      High         High    High            Many           Low
Audio             Verbal communication  Low          Latest  Constant        Many           Medium
Video             Facial expression     Low          Latest  Constant        Many           Medium

Channels

Managing the mappings between information, dissemination and QoS becomes complex when an environment contains many users dynamically moving between areas of interest. The channel abstraction may be used to map dissemination to QoS. In PING, for example, events are routed to channels according to their type and the current area of interest. Some events may theoretically be sent down many channels, for example unreliably to user machines and reliably to a persistence server.
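Such routing can be sketched with a lookup table (a hypothetical table in the spirit of PING's event router; every channel name, QoS label and the persistence rule here are our own assumptions, not PING's API):

```python
# (event type, area of interest) -> (recipients, delivery QoS)
CHANNELS = {
    ("move", "room-1"):    ("multicast:room-1", "unreliable-latest"),
    ("create", "room-1"):  ("multicast:room-1", "reliable-ordered"),
    ("any", "persistence"): ("server:persist",  "reliable-ordered"),
}


def route(event_type, area):
    """Return every channel an event should be sent down.  Structural
    events (object creation) additionally go, reliably, to the
    persistence server, illustrating one event on many channels."""
    routes = []
    if (event_type, area) in CHANNELS:
        routes.append(CHANNELS[(event_type, area)])
    if event_type == "create":
        routes.append(CHANNELS[("any", "persistence")])
    return routes
```

A movement event thus takes one unreliable channel, while a creation event takes two, one of them reliable.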

13.3 Architecture

We have introduced the basic requirements and realities of communication within IIS, and outlined principles used to balance the two sufficiently to support fruitful collaboration between users socially situated in an information context. This section provides case studies of two example systems, DIVE and PING, describing each in terms of modular architecture and use of principles. The Distributed Interactive Virtual Environment (DIVE) is a widely adopted CVE platform that implements most of the principles we introduced. The Platform for Interactive Network Games (PING) attempts to bring together best practice from CVE architecture. Although the latter is still at the beta prototype stage, its design provides a good tool for explanation.

13.3.1 The DIVE Architecture

The Distributed Interactive Virtual Environment (DIVE) has a classic architecture consisting of seven modules (Table 13.6). Each represents a conceptual level and is implemented as a unique library. This provides flexibility when updating the platform.

Table 13.6. Modules of the DIVE architecture.

Module    Description
Video     Allows video to be texture mapped to polygons in the scene
Audio     Supports conversations between users, as well as attaching sounds to objects
Graphics  3D rendering of the graphical representation of objects, and thus the scene
Aux       Tools for application building, including the scripting language
Core      Object database and supporting functionality such as time and events
Sid       Communication
Threads   Thread library providing concurrency at each computer

DIVE introduces, adopts and adapts most of the principles described in the previous section. Best practice solutions have been added and iteratively improved for more than ten years. Widely used in research, this platform has proved the principles.

Localisation

DIVE uses localisation to maximise local responsiveness and make best use of the network. We will now look at the particular design decisions taken in implementing this localisation within a framework of the principles outlined in the previous section.

Object Model

The responsiveness of a user's interactions with the environment is maximised through object replication, negating the need for events to be passed across the network before the local model is updated. The replicated object database resides on participating machines according to awareness management. A replica is organised into a hierarchy of objects, each of which contains state information and may be attached to behaviour scripts and a graphical appearance. A scenegraph is coupled to a local replica and mirrors those qualities of objects necessary for rendering. An application reads and writes to a replica regardless of the fact that other replicas may exist.

Behaviour

A simple reactive behaviour associates triggers and responses with objects. An object's behaviour is defined in an attached script. The script language, DIVE/TCL, extends the Tool Command Language (TCL) to include useful commands for monitoring and updating objects. Typed events may be triggered through a user input device, collision with another object, a timer or world entry. An interest in events may be expressed through event call-backs, and responses mapped to event types. For example, an application programmer can register an interest in collision events for an avatar and define distinct responses to various types of collided object. Behaviour scripts are replicated along with the object. This allows objects to react to local interactions without network-induced delay. Remote scripts are not called directly but through the communication of the same event that triggered them locally. The concept of dead reckoning is supported, but the implementation of the algorithm is left to the application programmer. Each object is able to store a parametric path from which the current position may be calculated. Use of the path to communicate and calculate the current position is, however, optional. This is a useful feature, as not all objects move in a predictable way.
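The trigger/response pattern can be illustrated in outline (Python used purely for illustration; DIVE scripts are written in DIVE/TCL, and the class and method names below are our own):

```python
class ReactiveObject:
    """Associates typed-event triggers with response call-backs, in the
    spirit of the event call-backs described above."""

    def __init__(self):
        self.handlers = {}  # event type -> list of registered callbacks

    def on(self, event_type, callback):
        """Register an interest in an event type."""
        self.handlers.setdefault(event_type, []).append(callback)

    def trigger(self, event_type, **data):
        """Deliver an event locally and run every registered response.
        In DIVE the same event would also be communicated to remote
        replicas, which run their own copies of the script."""
        return [cb(**data) for cb in self.handlers.get(event_type, [])]
```

Because the script travels with the object, each replica reacts locally and only the triggering event crosses the network.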

Consistency

High responsiveness comes at the cost of low consistency. Replicas are loosely coupled, allowing divergence and attempting convergence over time. For example, when a user moves an avatar the remote representation will follow, delayed by the network, and catch up when the avatar stops moving.

There is no specific concurrency control within DIVE. Hence an object may be affected in conflicting ways by multiple users, causing the replicas to diverge. Mechanisms are provided to settle an object to a mean position after being pulled in opposing directions. Users, however, observe the object jumping wildly between them until the steady state is reached. A loose form of ownership allows an object to be attached to an avatar. Other users can still affect the object, for example by changing its relative position to the carrying avatar. An immersive extension to DIVE, Spelunk (Steed et al., 2001), implements concurrency control through object mastership.

Partial causal ordering is implemented at the communication level and is therefore described below.

Scaling

Awareness is managed at the world level as well as within the world through division of the object model hierarchy. All replicas must hold the root object but can be selectively pruned by local interest. Branches may be assigned to interest groups in which the application may express interest. This low-level approach can support any higher-level awareness management scheme that maps to the organisation of the hierarchy. Both subjective views (Snowdon et al., 1995; Jää-Aro and Snowdon, 2001) and aura-based focus and attention (Benford and Fahlén, 1993b) have been implemented on top of DIVE.

Level of Detail (LOD) is partially supported. Composite objects comprise a tree of objects within the hierarchy; therefore, interest-based tree pruning may be used to reduce their complexity. However, this does require some scripting. The LOD of an atomic object can only be switched within the graphics module. Thus, without custom scripting, LOD affects appearance and rendering performance but not behaviour and network traffic. The default renderer supports distance-based LOD switching. Adaptive rendering was incorporated in the COVEN extension to DIVE (Frécon et al., 2001). Here, distance culling and iterative rendering techniques can alter the detail of the rendered scene to meet specified frame rates.

Aggregation is not directly supported but again could be implemented at the application level by making use of interest management, this time to switch from a sub-tree to an alternative atomic object.

Persistence

Objects are not owned and thus can be created by an application and left in the world once the application has closed. Any application can remove the object from the world. By default, clients are responsible for persistency. Early versions had no persistency servers, but an object would remain in the world as long as one copy of it existed. Later versions of DIVE incorporated persistency servers. Object behaviour can be defined by scripts that are replicated along with the object at each host. Evolutionary persistence is maintained through the continued triggering and execution of scripts. The triggering events can come from the object itself or from other objects in the world.

Communication

DIVE uses a combination of point-to-point and group communication. The point-to-point protocol (TCP) is reliable and ordered. Group communication is supported at two qualities of service: unreliable, and partially reliable and ordered. IP Multicast provides the former and is extended into Scalable Reliable Multicast (SRM) for the latter.

Discovering Objects

The first client to enter a world downloads the initial world from an Internet location. Subsequent clients entering obtain the current world from a peer. This approach allows an up-to-date world to be downloaded immediately without the need for a world server. A downside of this approach is that the peer from which the world is obtained freezes while sending data. This typically takes tens of seconds depending on the complexity of the world. Later versions of DIVE address this problem by allowing downloads from persistency servers instead of from clients.

Clients can create objects at any time and must inform peers on doing so. When a client discovers a new object, either through a creation or update message, it may request the object. With the exception of the first client download, all requests and downloads are done over SRM. An algorithm attempts to transfer objects from the nearest client in terms of network delay.

Events

All events are sent using SRM. Partial reliability with ordering is mapped to three event categories: movement, geometry and general. By default, all three are set to reliable. Each object has a causal counter which is stamped onto outgoing messages. Partial ordering and reliability of events are implemented within SRM. Partial ordering ensures that two events from the same object are delivered in the order they were sent, but the same is not guaranteed for events from distinct objects. Partial reliability guarantees that if a lost event is detected through the arrival of a later event, the object state may be requested. The assumptions made are that event loss and disorder are rare, and that the partial reliability and ordering are thus sufficient to converge the databases over time. Both reliability and ordering are achieved through object sequencers, thus providing a high level of concurrence and thereby reducing the effect on responsiveness. Receiving an unexpectedly high sequencer detects message loss. When this occurs, the state of the object is requested rather than the set of lost updates. The downside is that loss is not detected until a subsequent event from the same object arrives. The loss of events for infrequently updated objects can make some applications unworkable. For example, a door might be unlocked by one user but remain locked to another. Furthermore, a dead reckoned path can result in considerable divergence if a subsequent path event is lost.
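The sequencer-based loss detection described above might look like this in outline (a sketch under our own assumptions, not DIVE's actual implementation):

```python
class ObjectReplica:
    """Tracks a per-object event sequencer.  A gap in the sequence means
    events were lost; the whole object state is then requested rather
    than the individual lost events."""

    def __init__(self):
        self.expected = 0        # next sequence number we expect
        self.state_requests = 0  # how many times we asked for full state

    def receive(self, seq, state_update):
        if seq > self.expected:
            # Gap detected: some events were lost; ask for full state.
            self.state_requests += 1
        if seq >= self.expected:
            self.expected = seq + 1
            return state_update  # apply the update (or refreshed state)
        return None              # stale or duplicate event: discard
```

Note the downside mentioned above is visible here: a loss goes unnoticed until the next, higher-numbered event for the same object arrives.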

Audio and Video

Both audio and video are streamed across unreliable multicast. Responsiveness and consistency are slackened to allow constant delivery rates suitable for human communication. Each world has a unique multicast group for audio and another for video. Sound is spatialised so that objects and avatars are heard from where they are seen.

13.3.2 PING

The Platform for Interactive Network Games (PING) was developed by a European consortium headed by France Telecom. It combined many best practice principles into a scalable architecture, implemented as a communications infrastructure for the support of massive multi-player games. Figure 13.8 and Table 13.7 summarise the PING architecture in terms of modules.

Table 13.7. Modules of the PING architecture.

Module         Description
Entities       Interfaces replicated persistent objects to the application program
Replication    Manages the replication of objects, including life cycle and synchronisation
Persistence    Maintains persistence using stable storage
Consistency    Balances synchronisation with responsiveness
Interest       Manages awareness in terms of world subdivision
Communication  Supports message passing between processes
Core           Provides core services used by, and linking, the other modules

Localisation

Object Model

At each process, objects are replicated by the replication service according to awareness determined by the interest service. The entities management service interfaces replicated persistent objects to the application program. It provides selective levels of transparency of distribution and replication. The replication service is responsible for the life cycle management of replicas and makes use of the consistency service to update them. The object model comprises both data objects and reactive objects. Data objects hold state information. Reactive objects are data objects that embed a reactive behaviour. Data objects may be shared and may also be made persistent. Sharable objects contain a selection of sharable attributes.

Behaviour

Two forms of behaviour support are provided: reactive and reflective. Reactive objects are associated with a reactive program. Within a given process, reactive objects communicate through local broadcast of events. The reactive behaviour is defined at the application level and then replicated within the object model by the replication service. The reactive program defines triggers and responses in terms of typed events.

Reflective behaviour allows the behaviour of objects to adapt to the availability and condition of computational and network resources. This facility is not core to the PING infrastructure but may be placed between it and the application as a filter.

Figure 13.8 The PING architecture.

Consistency

The consistency service is highly configurable, supporting a range of time management services. The consistency module sits below that of replication and above that of the event router, which in turn sits above communication. Its purpose is to balance synchronisation with responsiveness, and this is achieved by delaying the sending or delivery of events according to some interchangeable time management policy. Each iteration of the local simulation process is synchronised by a tick. This tick causes events held in the consistency module to be delivered to the replica according to the time management policy. Supported policies fall into two categories: non-causal and causal. The non-causal strategies are receive order, time stamp and predictive. Receive order simply delivers events to the replica in the order received. Time stamp delivers them in the order in which they were created. Predictive delivers predicted events at the predicted time, thus overcoming some effects of network latencies. The sending of predicted events may be delayed to reduce the likelihood of erroneous predictions. Causal order may be guaranteed with policies that define causality in terms of awareness or interaction. Some causal policies are based on object sequencers, and so use the exchange of sequencers to provide concurrency control.
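The receive-order and time-stamp policies can be contrasted in a few lines (a sketch of our own; the event field names are invented, and PING's actual policy interface is not shown here):

```python
def deliver(buffered, policy="receive"):
    """On a simulation tick, hand buffered events to the replica.
    'receive' keeps network arrival order; 'timestamp' re-orders events
    by the sender's creation time, trading extra delay for ordering."""
    if policy == "timestamp":
        return sorted(buffered, key=lambda e: e["t"])
    return list(buffered)
```

With two events arriving out of creation order, the two policies deliver them differently, which is exactly the synchronisation/responsiveness trade-off the module mediates.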

More general concurrency control is offered as a core service of PING, outside the consistency module. This includes read and write locking of objects using either a pessimistic or an optimistic approach, as well as a choice of explicit or implicit locking. Pessimistic concurrency control prevents inconsistencies, whereas the optimistic approach resolves them. The former is generally better for human-in-the-loop real-time systems and is used in PING by default.

Scaling

The interest management service provides support for world subdivision policies, which may be defined at the application level. The role of this module is to manage dynamic grouping, determine the set of object replicas needed within the local process, and inform other processes of changes in interest through the generation of events. Neither control of Level of Detail nor aggregation is supported within the infrastructure.

Persistency

Persistency is provided at two levels, relating to static and evolutionary persistence. Static persistence is supported over stable storage and is guaranteed even when all processes have exited. Evolutionary persistence maintains and evolves objects as long as one replica of them exists in any process.

123456789101112345678920111234567893011123456789401112345611

Inhabited Information Spaces

262

Page 273: Inhabited Information Spaces: Living with your Data

Communication

Discovering Objects

The discovery of objects is directed by the interest management service. The replication service is responsible for the life cycle of each replica and must thus fetch an object to a process when it is first discovered. A local caching service is provided so that an object need only be fetched once, even though an interest border may be traversed several times.
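The caching behaviour can be sketched as follows (class and attribute names are hypothetical, not PING's interface):

```python
class DiscoveryCache:
    """Fetch each discovered object at most once, even if the client's
    area of interest crosses the same border repeatedly."""

    def __init__(self, fetch):
        self.fetch = fetch   # function: object id -> object data
        self.cache = {}      # previously fetched objects
        self.fetches = 0     # network fetches actually performed

    def discover(self, obj_id):
        if obj_id not in self.cache:
            self.cache[obj_id] = self.fetch(obj_id)
            self.fetches += 1
        return self.cache[obj_id]
```

Re-entering an area thus costs a cache lookup rather than another multi-kilobyte download.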

Events

Events are used to synchronise replica updates as well as to communicate system messages. These events are synchronised by the consistency service described above. An event router service takes outgoing events from the consistency service and uses interest management to direct them to appropriate communication channels. Channels provide Application Level Framing (ALF) to map events to particular dissemination groups and qualities of service. The granularity of the ALF is that of an object. The event router maps unique object identifiers to channels using tables that are updated according to the interest management service.

Various underpinning transport level protocols are used, including UDP, TCP and IP multicast. SRM offers object level reliability and ordering above the latter. Channels hide the choice of protocol from the services above. The infrastructure may be configured to implement reliability and ordering at either the consistency or the communication level. The basic requirement of reliability on the communication service is that it will not deliver an incomplete or corrupted event to the consistency service.

13.4 Deployment

IIS bring together people, possibly from distinct geographical places, into a shared information space. We have shown how the environment may be replicated across many processes and synchronised through event communication. So far we have assumed that all the machines are connected to some network of reasonable bandwidth which allows them to communicate using a combination of peer-to-peer and group communication, with varying qualities of service. Unfortunately the use of a current wide area network, such as the Internet, introduces problems that must be addressed when deploying an IIS over it. This section considers the impact of real world problems of deployment on the existing Internet. These include firewalls, modems and the lack of multicast capability on the Internet. We consider three idealised approaches to deployment: point-to-point, tunnelled group, and hybrid.


Communication Infrastructures for IIS

13


Firewalls have become essential to maintain the security of corporate and academic networks connected to the Internet. A firewall restricts access to selected port numbers, protocols and remote sites. It is unlikely that an IIS process can communicate through a firewall without some adjustment or help.

IIS systems may allow inclusion of users from home or mobile computers. Such computers connect to the Internet using modems which typically offer low bandwidth compared to corporate and academic networks. Furthermore, modems offer only a point-to-point connection.

Multicast is supported on most local area networks (LANs) but is currently not supported on much of the Internet. This is because of problems with scaling routing strategies and global management of the address space, as well as the large number of legacy routers in use.

13.4.1 Point-to-point

The traditional approach to distributed processing on the Internet is based on the simple client–server model (Figure 13.9). This approach has been popular in supporting public IIS applications such as social meeting places and games. This popularity arises from the simplicity of access and reliability offered by restricting communications to point-to-point connections, as well as the simplicity of security, maintenance and consistency offered by servers. Clients connect to servers that maintain the current state of the environment. Scalability is increased by mapping servers to worlds or awareness management subdivisions. Servers decide the true state of the environment and thus simplify concurrency control. Many offer persistence. Home or wearable computers connect to the Internet through modem links and Internet Service Providers. Those on LANs connect through corporate routers. Although this model is fundamentally less scalable than those using group communication, some games applications have boasted tens of thousands of simultaneous users by mapping awareness management to servers and relaxing consistency (Ultima Online, 2003).

Figure 13.9 Point-to-point deployment across the Internet.
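The core of the client–server approach, a server that decides the true state of the environment, can be illustrated with a minimal sketch. The class below is invented for this example and uses a simple per-object version number as a stand-in for whatever concurrency control a real server would apply.

```python
class WorldServer:
    """Illustrative authoritative server: it decides the true state of the
    environment, which simplifies concurrency control for its clients."""
    def __init__(self):
        self.state = {}     # object id -> current value
        self.version = {}   # object id -> last accepted version

    def update(self, obj, value, version):
        # Accept only updates newer than the current version; stale or
        # conflicting client updates are simply rejected by the server.
        if version > self.version.get(obj, -1):
            self.state[obj] = value
            self.version[obj] = version
            return True
        return False

# Usage: two clients race to update the same object; the server's decision
# is final, so all clients converge on one true state.
server = WorldServer()
server.update("door", "open", 1)
accepted = server.update("door", "closed", 1)  # stale version, rejected
```

Because every update passes through one process, clients never need to reconcile divergent replicas; the cost, as the text notes, is that the server becomes the scalability bottleneck.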

13.4.2 Tunnelled Group

Group communication mapped to peer-to-peer distribution is generally more scalable than point-to-point (Figure 13.10). It does, however, complicate the development and deployment of an IIS system. This approach is dominant in research and defence simulation training, which both aim for optimal rather than simple solutions and, furthermore, do not make wide use of low-cost modems. To use multicast across the Internet it is currently necessary to join some multicast backbone, such as the MBone (Berkeley Laboratory, 2002), or to deploy a private equivalent. Multicast backbones use an approach called tunnelling. Each connected LAN has a tunnel process that converts between multicast and point-to-point network packets. Multicast packets are captured by a tunnel process, encapsulated in IP packets, sent through firewalls and across the Internet to peer tunnel processes on remote LANs, which strip off the IP headers and redistribute them as multicast. Private tunnels typically offer high security and low latency compared to tunnelling across public backbones.
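The capture/encapsulate/redistribute cycle of a tunnel process can be sketched as follows. This is a schematic model, not a working network tunnel: the class is invented for the example, peer links stand in for unicast connections across the Internet, and a list stands in for re-multicasting on the remote LAN.

```python
class Tunnel:
    """Illustrative tunnel process. It captures multicast packets on its LAN,
    encapsulates them, forwards them point-to-point to peer tunnels, and
    redistributes received packets as multicast on its own LAN."""
    def __init__(self, lan_name):
        self.lan = lan_name
        self.peers = []
        self.local_delivery = []   # stands in for re-multicast on this LAN

    def link(self, other):
        # Model a bidirectional unicast tunnel across the Internet.
        self.peers.append(other)
        other.peers.append(self)

    def capture_multicast(self, packet):
        # Encapsulate the multicast packet in a unicast envelope and send it
        # through the firewall to every peer tunnel.
        envelope = {"src_lan": self.lan, "payload": packet}
        for peer in self.peers:
            peer.receive_unicast(envelope)

    def receive_unicast(self, envelope):
        # Strip the envelope and redistribute as multicast on the local LAN.
        self.local_delivery.append(envelope["payload"])

# Usage: a packet multicast on LAN-A reappears as multicast on LAN-B.
a = Tunnel("LAN-A")
b = Tunnel("LAN-B")
a.link(b)
a.capture_multicast("state-update")
```

A real tunnel would capture raw multicast datagrams and forward them over UDP or TCP connections; the point of the sketch is only the conversion between group and point-to-point addressing at each LAN boundary.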

The servers can be placed at any LAN or stand-alone computer connected via a tunnel. Servers provide either initial or persistent worlds but are not generally responsible for maintaining the true state of the environment. Maintaining this true state is the responsibility of the clients with the help of distributed consistency control.



Figure 13.10 Tunnelled group deployment across the Internet.


13.4.3 Hybrid

A hybrid solution, pioneered in DIVE (Frécon et al., 1999) and refined in PING, is to allow private computers to link to multicast-connected service providers (Figure 13.11). Let us call these IIS providers, or IISPs. Tunnels link the IISPs and other servers. An IISP is responsible for converting point-to-point communication from a client into group multicast. Awareness management mapped to group addresses determines which clients, IISPs and other servers receive a given message. IISPs are positioned to minimise latency across the point-to-point link.
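The IISP's role of fanning a client's point-to-point message out to a group can be sketched briefly. The class and names below are invented for illustration; awareness management is reduced to named group subscriptions.

```python
class IISP:
    """Illustrative IIS provider: converts point-to-point traffic from
    modem-connected clients into group distribution, with awareness
    management mapped to group addresses."""
    def __init__(self):
        self.groups = {}   # group address -> set of subscribed endpoints

    def subscribe(self, endpoint, group):
        # Awareness management places an endpoint in a dissemination group.
        self.groups.setdefault(group, set()).add(endpoint)

    def from_client(self, sender, group, message):
        # Fan the client's unicast message out to the group, minus the sender.
        return {e: message for e in self.groups.get(group, set()) if e != sender}

# Usage: only members of the sender's awareness group receive the message.
iisp = IISP()
iisp.subscribe("alice", "region-1")
iisp.subscribe("bob", "region-1")
iisp.subscribe("carol", "region-2")
out = iisp.from_client("alice", "region-1", "hello")
```

In a deployed system the fan-out would be a single multicast send to the group address rather than per-recipient delivery; the sketch shows only the mapping from awareness groups to recipients.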

13.5 Conclusion

Inhabited Information Spaces (IIS) situate users in a social information context. In the real world, these users may be co-located or at different geographical locations. The unique combination of IIS technology provides us with unprecedented access to information, and ways of processing, presenting, interacting with and sharing it. The technology maps well to social human communication, supporting not only verbal and non-verbal communication but also unprecedented communication through information objects in the environment.

Both information, and the way in which users interact with and around it, must be supported in a natural and intuitive manner. This requires the issues of responsiveness, fidelity, consistency and scalability to be addressed. A multi-level architecture is required to focus these issues on representation, behaviour, synchronisation and communication.

Figure 13.11 Hybrid deployment across the Internet.

We have described the principles of supporting these issues at each level, and how this is done in example systems. Deploying systems over the Internet introduces additional problems of security, bandwidth and dissemination. We have shown three idealised models of deployment to explain how these issues may be addressed for different applications and networks.

For reasons of space, this chapter has focused on the support of IIS that allow people to inhabit information spaces through primarily graphical interfaces. We have not discussed important systems such as COMRIS, where the emphasis is placed on the large-scale co-habitation of agents and people within information space and on the primary use of audio interfaces.

The underpinning technology of IIS, and particularly the communication systems, are reaching maturity. Simple networked IIS are already in daily public and commercial use. More advanced systems in research offer considerably higher levels of realism and richness. Core to this is the shared interaction with dynamic and steerable information. Most of the core principles of IIS communication are well developed, and a deep understanding of the usability of such systems is being gained. An IIS communication infrastructure that addresses all the issues well is yet to emerge. The time has now come for the IIS research community to consolidate and bring together best practice at each architectural level to develop systems to a commercial standard.

Distinct applications have diverse requirements and it is unlikely that one system will fit all applications for the foreseeable future. However, we are yet to achieve true realism in social interaction with information in any system. We are some way from being able to work together in an IIS without constantly thinking about the effects of the system, but the light at the end of the tunnel is growing closer.

Acknowledgements

The author would like to thank his PhD students, particularly Robin Wolff and Oliver Otto, as well as Anthony Steed and his colleagues at UCL, Emmanuel Frécon and his colleagues at SICS, and Frederic Dang Tran and his colleagues within the PING consortium.


Part 6
Community


14 Peer-to-peer Networks and Communities

Mike Robinson

14.1 Introduction

Inhabited Information Spaces, and associated Virtual Communities, are by definition based on some form of computing and networking technology. The social phenomena displayed are constrained and conditioned by the available underlying technologies. There is a link between application design, actions and activities available to people using the technology, and the underlying technologies. This link is the “organising concept”. It acts as a boundary object (Star and Griesemer, 1989; Kaplan et al., 1994) between the three realms. It enables a conceptual grasp of the underlying technology as it relates to, and emerges into, the social world of people and activities. This chapter proposes that the organising concept of peer-to-peer is community, and that peer-to-peer thus has a special relevance to Inhabited Information Spaces.

In the mid-1980s, local area networks (LANs) became associated with the organising concept of “group”. This provided a vital lever for selling a novel and expensive leading-edge technology to organisations. “Group” enabled managers to understand what they could use a LAN for – email and file transfer between, and file and printer sharing for, “group” members, etc. From here, “group” also constrained and directed the genre of applications that were developed. Ideas for applications could be justified (and financed) if they “supported groups”. Group thus set up a form of positive feedback between LAN development, LAN diffusion through customer acceptance, and application design.

Although “group” has correctly been much critiqued, it nevertheless led to many interesting and innovative applications and experiments in Computer Supported Co-operative Work (CSCW) – within a time frame that lasted from the late 1980s to the early 1990s. We may call this period seven fat years for CSCW. As a collection, the CSCW innovations of these years met Thompson’s criteria for a major cultural change.


In his view (Thompson, 1972), the move from speech, to writing, to print effected three significant changes in the surrounding culture – a change in the ease with which stored human experience can be accessed, an increase in the size of the common information space shared by the communicants, and an increase in the ease with which new ideas can be propagated throughout society. As these features are difficult to measure directly, he proposes a “test of significance” for each, as follows:

1. Must affect the way in which people index information.
2. Must increase the range of strategies open to the communicants for the interrupt act.
3. Must increase the probability of transmitting or receiving an interesting but unexpected message (Bannon, 1989).

In the mid-1990s, the general focus of development shifted from LANs to the Web. Technically, the Web can be regarded as one of the major innovations of the twentieth century. At the social level it has transformed the notion of “information”. The Web has undoubtedly met Thompson’s first and (if the ratio of junk to interesting messages is discounted) third criteria.

Nevertheless, we should not lose sight of what the Web did not do. The Web did not meet Thompson’s second criterion, and it did not change, add to, facilitate, or provide important tools for the ways people communicate and interact with each other. Group and CSCW software was extended (sometimes successfully) to the Web, but “groupware” was not a main focus of interest or commercial success. Generally there was a lack of richness and innovation in the Web vis-à-vis interaction between people. This is not surprising, as the organising concept was “library”: one person in silence in front of a screen of information, linked to millions of other screens of information. This metaphor is neither an accurate model of real libraries (Bowker and Star, 1999), nor is it conducive to CSCW or to virtual community (despite some heroic attempts on the VR front). It would not be an overstatement to classify the years of the Web explosion, the mid-1990s to 2001, as seven lean years for CSCW. They were also seven lean years for community, as the widespread nostalgia for the community spirit of the early Internet shows.

Is the next period likely to be any better, in terms of support for interactions between people? This chapter takes the optimistic view that there could be another seven fat years. “Peer-to-peer” – better known simply as P2P – is a technology under development. It has some proven applications and some interesting promises and potential areas of application. The term P2P includes networks that expect end users to contribute their own files, computing time or other resources to the benefit of themselves and/or others.

Just as interesting is the emerging organising concept: “community”. First let us admit that there are some thin notions of community knocking about. It is stretching credibility (or at least the English language) to talk about Napster or Gnutella “communities” as communities. (It can also be noted that Napster is not a genuine P2P application either, although it is cited in most articles.) Music search and download may be popular, and it may be convenient to find the sounds on someone’s personal hard disk, not on an official server. To classify the people doing this as a “community” is beyond the current meaning of the term. But it does not do to be too didactic in the critique. Language is changing, and easily comes to include and reflect new uses. Similarly, practices are not static. A community that starts as a statistical or functional aggregate may easily develop forms of interaction, and may even stabilise as a community of practice. And even if the notion of community is, in some cases, thin, this does not mean it will not be useful. The concept of “group” was often used in equally stretched and bizarre ways: consider for instance the sleight of hand in Decision Support System research that considered a “nominal group” as equivalent to a face-to-face discussion. This did not prevent “group” making a significant contribution to innovative and stimulating experiments in the context of the LAN.

There is a better example than Napster of the way in which “community” is needed as a P2P organising concept. Peer-to-peer technology itself poses some hard technical questions that need “community” to lay out the terrain of possible answers. “Search” is an area which server-based Web technology has addressed extremely well. There are search engines to suit almost all needs. They are very powerful, and generally effective. If, however, the Internet expands to include the majority of personal hard disks as well as all the current Web servers, then the searchable space will expand about thirty times, on a conservative estimate. Such an order of magnitude change is beyond the capacity of current search engines and, worse, it is beyond the current technology of search engines. In addition to the problem of pure scalability, there is another feature that current search technology cannot address. The hard disks in the expanded space are not “always on”. They connect and disconnect, come and go in unpredictable ways. Applying current search techniques in an intermittent-connection context would lead to a situation where at least 90 per cent of results led nowhere. This is not acceptable by any standard.

How can these issues of scalability and intermittent connectivity be addressed? “Community” as an organising concept provides the necessary lever to subdivide Web space and deal technically with the “on/off” nature of the content providers. A general scenario could run like this. The Web remains the domain of the search engine as we know it. Within this space there is one additional entity, community. Thus existing engines identify communities. To search a community then becomes the work of a community search engine, again similar to those in existence today. These already (often) have the capability to search the limited subspace quickly and frequently. Thereby they do what the Web engines cannot do: maintain an up-to-date awareness of which files are currently available and which are not.
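The community-level half of this scenario can be sketched concretely. The class below is invented for illustration: it keeps a small index of which peers hold which files and which peers are currently connected, so that a query never returns a result pointing at an offline hard disk.

```python
class CommunityIndex:
    """Illustrative community search engine. It tracks which peers are
    currently connected, so results only point at reachable copies."""
    def __init__(self):
        self.files = {}      # filename -> set of peers holding it
        self.online = set()  # peers currently connected

    def register(self, peer, filenames):
        # A peer advertises the files on its hard disk to the community.
        for name in filenames:
            self.files.setdefault(name, set()).add(peer)

    def connect(self, peer):
        self.online.add(peer)

    def disconnect(self, peer):
        self.online.discard(peer)

    def search(self, name):
        # Return only copies held by peers that are currently reachable.
        return sorted(self.files.get(name, set()) & self.online)

# Usage: two peers hold the same file; one disconnects, and the search
# reflects that immediately.
idx = CommunityIndex()
idx.register("peer-a", ["report.txt"])
idx.register("peer-b", ["report.txt"])
idx.connect("peer-a")
idx.connect("peer-b")
idx.disconnect("peer-b")
hits = idx.search("report.txt")
```

The Web-level engine in the scenario would only need to identify the community and hand the query over; keeping this small, frequently refreshed index is what makes intermittent connectivity tractable at community scale.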


The practice of community can be formalised and utilised by combining both types of search engine to extend searchability into P2P space. This is driven by technical necessity. Socially, the search scenario suffers from the same thin notion of community as the earlier Napster and Gnutella examples. We need to understand how the gap between this and a richer notion of community can be bridged. This will be addressed in the following sections.

The next section will address in more detail early (CSCW) inhabited information spaces. It will present some examples of applications. It will explore how the notion and practice of design for “community” can be strengthened by this part of history.

The third section will explore the various meanings of P2P, and its overall directions. The fourth section will come back to the notion of community, the strengths and weaknesses of current usages, and design aspects. It will draw on the historical strengths and weaknesses of “group” as an organising concept in CSCW.

We will conclude that the notions of community and P2P are symbiotic in a profound sense. P2P has no raison d’être without community, and the development of P2P networking will develop our understanding of community. Lastly, we conclude that the design project of Inhabited Information Spaces may be fruitfully, although not exclusively, continued in the area of P2P under a CSCW perspective.

14.2 Early Inhabited Information Spaces in CSCW

14.2.1 Rendering the Invisible Visible

Many of the early inhabited information spaces in CSCW were quite real spaces. A large proportion of them were control rooms of one sort or another. Heath and Luff studied London Underground control (Heath and Luff, 1991, 1996) and a City of London trading room (Heath et al., 1993). Suchman (1982, 1997; Suchman and Trigg, 1997) studied office work, airport ground control, and “centres of coordination”. Goodwin and Goodwin (1996) studied aircraft ground control. The Lancaster group studied air traffic control (Harper et al., 1989, 1998). Nardi et al. (1993) studied medical co-ordination in complex neurophysiological operations. Bowers et al. (1995) studied the subtleties of workflow systems on the print industry shop floor. All these authors pioneered the tradition of ethnographically informed CSCW – in less grandiose terms, basing system design on people’s day-to-day practices.

There was, however, a more far-reaching social consequence of these ethnographic investigations. They changed the meaning of the word co-operation. Historically there had been a great deal of theory (much of it rather speculative) about co-operation. There was a public agreement that “co-operation is a good thing” (something we will see later explicitly claimed for collaboration). In some extremely penetrating articles, Fairclough (1986, 1987) demonstrated, with practical examples, how co-operation meant almost anything (good) to almost everyone. It was indeed a very special sort of word, more like an absolute moral imperative than a description or an activity. Thus one of the weaknesses of practitioners (especially in organisations terming themselves co-operatives) was plenty of ideology and a lack of feel for practice (Landry et al., 1986). From within the heart of the CSCW movement arose a new understanding of co-operation in practice – not in terms of ideology, principles and precepts, but in terms of everyday, minute-by-minute working life. Researchers such as those mentioned above rendered the previously invisible minutiae of co-operation visible. They showed that co-operation was not external to the work process, and had little to do with organisational rules about equality. Co-operation was that through which the work process was constituted. It was central and could not be deleted without deleting the work itself. Co-operation was thus regrounded as a very normal activity, as a part of what it is to be human (a social animal). Co-operation ceased to belong to the world of moral abstraction, and took its place at the heart of everyday activity.

To this author it seems very likely that the activities and investigations around embedding P2P in community, and community in P2P, will result in a similar seismic displacement and development of the notion of community. Hence a certain level of optimism about the coming period for inhabited information spaces beyond the confines of the Web.

To try and give some idea of the creativity in the fat years of CSCW, I have chosen three examples. All of them were experimental, and none are especially well known. Each of them is especially well suited to P2P technology. Examples of virtual reality IISs can be found in Chapters 2, 3 and 4.

14.2.2 ClearBoard

In a series of papers, Hiroshi Ishii and co-workers developed ClearBoard (Ishii et al., 1992; Ishii and Kobayashi, 1993). The initial metaphor was a simple glass screen between two people (Figure 14.1). They could both draw on the screen and both could see what the other was looking at. This was a breakthrough in the concept of video communication. It was possible to attend to “the work” and at the same time see what part of the work the other was attending to. Such awareness is a precondition of co-operation. It is a great pity that, 10 years later, most designers of videoconference systems still have not learned this simple lesson. The final version of ClearBoard (Figure 14.2) was rather sophisticated. It was networked, and used a large screen angled like a drawing board. By various tricks of video reversal, both the writing and the partner’s gaze were the correct way round for both parties (think about it!).


14.2.3 Feather, Scent and Shaker: Supporting Simple Intimacy

In 1996, at the end of the CSCW cusp, Strong and Gaver (1996) of the Royal College of Art presented a demonstration of some simple devices to support intimacy (Gaver, 2002). The word love hovers in the background. The scenario was that one partner, thinking of the other, picks up a framed photo of them. The movement of the photo triggers a simple pulse over the Internet. Somewhere else – another continent, another country – the beloved is at home. In the corner, decoratively, is a waist-high, slim glass tube. As the first partner thinks wistfully of the other, a feather floats softly up the tall glass tube and hovers, fluttering, near the top.

Sadly, no video grabs are sufficiently clear to reproduce here, and the reader will have to rely on words for the images. These ideas were never taken up in CSCW, since they are somewhat outside the idea of “work”. But what better application could P2P develop for acceptance at the most fundamental and intimate level of community?


Figure 14.1 The ClearBoard prototype. Reproduced with permission from Hiroshi Ishii.

Figure 14.2 ClearBoard. Reproduced with permission from Hiroshi Ishii.


14.2.4 GestureCam: The Nodding Robot

Kuzuoka and co-workers work in the field of remote instruction. They faced the normal difficulties when supporting instruction in a complex engineering context. The instructor (remote) needed to know the layout of the equipment he was dealing with in real time, and what the student was doing and what they were attending to; and the instructor needed to be able to point out physical items such as pulleys and knobs, and be able to indicate the direction they should turn in.

Undaunted by the impossibility of the requirements, the researchers produced GestureCam (Kuzuoka et al., 1994; Yamazaki et al., 1999). This was fundamentally a small robot with an eye (camera) and a finger to point with (laser); see Figure 14.3. The robot’s joints and motors were synchronised with an identical twin that the instructor could control in order to look about and point in the remote location. It suffices to say here that the experiments were successful. Both instructors and students were satisfied.

Three issues are of special interest here. First, this is an ideal P2P application. There is no reason why servers should be holding files that are only of interest to the parties concerned. Second, just as CSCW had to do, P2P needs to broaden its scope from PCs and look at other devices, including person surrogates (Buxton, 1993) and robots. Third, returning to the specifics of GestureCam, the interactions took on a specially rich character that the designers had not anticipated. For instance, in order for the instructor to see the direction of the student’s gaze, and the item they were looking at, the little robot had to turn frequently from one to the other. Sometimes the instructor would ask if the student had understood, and the student would nod. Reciprocally (although it was visually dysfunctional for it to do so) the robot learned to nod to the student as an affirmative answer. Then, at the end of the session (and outside the design parameters), the robot and the student would do a little bow to each other, a gesture of farewell and mutual respect (Figure 14.4).


Figure 14.3 GestureCam. Reproduced with permission from Hideaki Kuzuoka.


14.3 P2P Themes and Overall Direction

Peer-to-peer technology can be confusing (Red Herring, 7 May 2001). As already noted, P2P includes all networks that expect end users to contribute their own files, computing time, or other resources to the benefit of themselves and/or others. The notion is set to become even wider, since the P2P Working Group merged with the distributed computing community (Global Grid Forum) in April 2002. Most research and companies in the P2P area specialise in either distributed file sharing or in distributed computing. There are many well-known examples of systems to support (mainly music) file sharing: Napster, Gnutella, Limewire, KaZaA, Morpheus, Grokster, and others.

Less well known are the companies seeking to exploit the business-to-business potential of P2P interactive and collaborative file sharing. Groove is the largest, claiming 200 partners, including Microsoft, in its development programme. The keyword for these B2B applications is collaboration. The Groove website explicitly says “Collaboration is Good”. It focuses on supporting small-group and cross-enterprise collaborations and, importantly, emphasises collaboration in context. Here it says:

Collaboration rarely happens in a vacuum. People interact with each other within the context of well-defined and ad hoc business processes, using the content in which they are swimming all day. Process and content “surround” these daily business activities (http://www.groove.net/pdf/backgrounder-product.pdf).

Figures 14.5 and 14.6 show two typical CSCW scenarios involving co-ordination – but not as naively conceived. All the actors (office workers, journalists) are immersed in their own work, as can be seen from their gaze directions. At the same time, and not so obviously, they are mindful of the activities of others, synchronising with them, and being ready to change action in mid-course should it be needed. In these areas of collaboration, P2P appears to have learned these lessons of CSCW, and is targeting many of its application areas:

There are tools in Groove Workspace for sharing content (files, images, maps), having conversations around that content (discussions, instant messages, live voice, text-based chat), and working together on shared activities (real-time co-editing and co-viewing of documents, co-browsing, group project management and tracking, meetings management) (ibid.).

The overlap with CSCW is not surprising, since the founder of Groove is Ray Ozzie, one of the creators of the popular Lotus Notes program, which itself incorporates many findings from CSCW. In an interview, Ozzie outlined the origin of his P2P vision:

The epiphany of sorts was that I kept watching my daughter, Jill, doing her homework with her friends over [AOL Instant Messenger], and my son Neil playing Quake [a search-and-destroy game that can be run on a multiplayer network] with his online friends.


Figure 14.5 Office work in Greece.

Figure 14.6 Community of journalists in Finland.


In watching Neil in particular, I found that Quake was an immensely effective collaborative environment for a shared task: his team had to “capture the flag” of the other team. It used every bit of horsepower of the PC and network to help each player be efficient and effective at that one task. In business, we commonly have projects that require multiple people to self-organize and solve problems. But why are we stuck using e-mail, when technology is being used to serve these kids so much more effectively? (Kharif and Salkever, 2001).

On the other side, there are resource-sharing companies and researchers. The principle of resource sharing is to use idle PCs for distributed processing. P2P connections tap cheap, under-utilised processing power and bandwidth. Proof of concept came from David Anderson at the University of California at Berkeley. He recruited 3.2 million volunteers, each of whom downloaded a small program that parses radio telescope data as part of a massive search for extraterrestrial life. The program runs while the volunteers’ PCs are idle and sends the processed data back to Anderson every time they log on. The network created in this way has the processing power of 3.2 of IBM’s $100 million ASCI White supercomputers. Several companies are basing distributed processing for corporations on this experiment: for instance, United Devices, DataSynapse and Entropia.
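The work-distribution pattern behind this experiment can be sketched in miniature. This is an invented illustration, not code from the Berkeley project: a coordinator hands out independent work units, and each volunteer machine processes one while idle and submits the result on its next connection.

```python
class WorkPool:
    """Illustrative coordinator for volunteer distributed processing:
    independent work units go out, results come back whenever a
    volunteer next logs on."""
    def __init__(self, units):
        self.pending = list(units)
        self.results = {}

    def fetch(self):
        # A volunteer asks for the next unprocessed work unit, if any.
        return self.pending.pop() if self.pending else None

    def submit(self, unit, result):
        # The volunteer returns its result at the next connection.
        self.results[unit] = result

# Usage: one loop plays the part of many volunteers draining the pool.
pool = WorkPool([1, 2, 3, 4])
while (unit := pool.fetch()) is not None:
    pool.submit(unit, unit * unit)   # stand-in for parsing telescope data
```

The pattern works precisely because the units are independent: no volunteer needs to see any other volunteer's data, so intermittent, unreliable home PCs can still sum to supercomputer-scale throughput.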

Like the file-sharing systems, distributed processing ranges from the practical to the utopian. One especially beautiful idea was put forward by Todd Papaioannou, CEO of Distributed Artificial Life Inc. He suggested simulating the Indian Ocean on millions of desktop computers and other devices. Each screen simulates a tiny part of the Indian Ocean. Virtual wildlife – autonomous digital organisms with their own needs – swim from desktop to desktop of their own free will. Users create fish and turn them loose in the virtual sea, where they can pass through other computers, PDAs, cell phones, or any other Java-enabled device. Papaioannou hoped to use this experience as a stepping stone to creating large-scale distributed P2P games.

So far we have laid out what any book chapter or news article says about P2P. Namely, that under a general heading of distributed computing there are three generic types of application: file sharing; co-ordination support; and distributed processing. What unifies these three diverse application areas as a single research field? Certainly not the technology, which is diverse, often proprietary and application specific. Certainly not the idea of a network: the idea of machines acting as both server and client is as old as the Internet itself, and, pre-Web, was the usual state of affairs. What unifies P2P is the organising concept of community. A quick scan of P2P texts and commentaries shows "community" to be the key concept. It is often used instinctively and unreflectively, and there is rarely any recourse to definition.

Inhabited Information Spaces

14.4 Design for Community: Inhabited Information Spaces

Before we design for community, it is as well to look at some definitions, uses and issues connected with Inhabited Information Spaces (IIS). The term IIS originated in the early to mid-1990s in the context of "mixed reality" applications. These combined multi-person Virtual Reality with non-VR environments, such as television (Benford et al., 1999a), (real-life) offices (Robinson et al., 2001; Büscher et al., 2001) or theatres (Benford et al., 1997b). As the number of IIS projects blossomed (e.g. the i3 network), the term took on a more generic meaning: any social or human activity domain, or set of domains, with a technologically mediated information structure or system. There was much research interest in embedding new information technologies (e.g. VR, hand-held devices) in existing technological (e.g. television) or physical (e.g. theatre, neighbourhood) spaces. "Community" is widely used in IIS research, but with differences from its uses in P2P.

In P2P, as we have seen, community is the organising concept, without which the area fragments into disparate elements. Community is core to understanding the technology, the designs, the social domains, and the linkages between them. In IIS, community is more often a description of the target domain for a particular technology. The technology itself is usually prior, and given externally to the domain. The suggestion in this chapter is that IIS should learn the lesson of P2P, reconceiving community as a technical and design concept, as well as a social criterion. At best this would enable the co-evolution of technologies and social domains. At worst, there is little to lose.

Before looking more closely at IIS and P2P design issues, we need to examine the strengths and weaknesses of "community".

14.4.1 Communities: An Aside on Definitions

In a seminal and programmatic paper for CSCW, Schmidt and Bannon (1992) deconstructed the idea of "group" as used to define co-operative work. Since much of what they say has direct relevance for, or can be extended to, the idea of community, it is worth reflecting on some of their points. With respect to groups, they argue that "the very notion of a 'shared goal' is itself murky and dubious". This is prima facie even more relevant to community, despite the fact that "community goal" is often used in the literature. There may be partial objectives for some members of a community some of the time. Common goals cannot be used to define communities. "Cooperative ensembles", say Schmidt and Bannon, "are either large, or embedded within larger ensembles . . . [they are] often transient formations. Membership . . . is not stable and often even non-determinable. Cooperative ensembles typically intersect. Cooperative work is distributed physically, in time and space, [and] logically, in terms of control, in the sense that agents are semi-autonomous . . . Cooperative work involves incommensurate perspectives (professions, specialities, work functions, responsibilities) as well as incongruent strategies and discordant motives. There are no omniscient agents . . ."

All of these observations are pertinent for the deconstruction of naïve notions of community. Communities have neither common perspectives, nor motives, nor goals, nor strategies. Membership is often non-determinable, and boundaries intersecting, arbitrary and transient. Activity is physically and logically distributed over shifting eddies of temporary alliances. There are no omniscient agents to co-ordinate, set the agenda, or even to write the history afterwards. If there is a rule, it is simply that everything and anything can be contested, and that the dialectics of what gets contested and how (and what does not) set a trajectory that may become apparent with hindsight.

Whatever general definition of community is offered, it is easy to come up with a counter-example. This is partly because the term has many meanings: my Shorter Oxford Dictionary provides five meanings, each with several variants. It is partly because we have a very incomplete understanding of what community is.

Tony Salvador (1998) makes some similar points in the excellent, brief article "Communities, Schumanities". He examines three attempts to define, and then build software to support, communities. He shows that the definitions of community cover too little and too much, managing to be under- and over-inclusive at the same time. Even more trenchantly, he remarks "In no way that I can tell does their definition influence their design". Two definitions are cited with approval in the article:

Mark Jeffrey, whose experience is not in academia but rather in business, perhaps has the optimal definition of community: "a group of individuals, typically geographically dispersed, who communicate electronically in an on-line environment for some common purpose of activity". If only he left off the bit about common purpose, he'd have a fairly harmless and thus beneficial definition.

In fact, in his talk, [Mark Jeffrey] argues that "anything that allows people to get together can be a community building tool", which for design purposes seems to be on the right track.

14.4.2 Communities: An Aside on Use

Most people manage to use the word community without becoming ensnared in definitional questions. In this section, we will look briefly at some everyday uses in one P2P article. The choice is arbitrary, but has been influenced by the inherent interest of the article, which is well worth reading outside the linguistic perspective.

Minar and Hedlund (2001) provide an outline of the use of peer-to-peer models through the history of the Internet. Their position is that the Internet was peer-to-peer for most of its history, between 1969 and 1995. It was an inhabited information space: a medium for communication for machines that shared resources with each other as equals. Their argument is that today's P2P applications could learn a great deal from the protocols and methods of previous distributed Internet systems like Usenet and the Domain Name System (DNS). The article contains much interesting analysis: for instance, how Usenet's NNTP protocol avoids a flood of repeated messages – which Gnutella would do well to learn from – and how DNS distributes its load naturally over the whole network, so that any individual name server need only serve the needs of its clients and the namespace it individually manages. The article also displays a sharp understanding of the issues that have led to firewalls, dynamic IP addresses, and finally Network Address Translation (NAT), all of which present serious obstacles to P2P and potentially to IIS applications. These obstacles give rise to a host of more or less unsatisfactory work-arounds – the most common being the abuse of port 80 (supposedly the port that allows simple Web access).
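The flood control credited here to NNTP rests on a simple idea: a host offers an article to each of its peers, and a peer declines any article whose Message-ID it has already accepted, so copies do not circulate endlessly. A toy model of that duplicate suppression (a schematic of the idea, not the real protocol or its wire format):

```python
class NewsHost:
    """Toy Usenet host: floods articles to peers, suppressing duplicates
    by Message-ID, roughly in the spirit of NNTP's offer/decline exchange."""
    def __init__(self, name):
        self.name = name
        self.peers = []
        self.seen = set()     # Message-IDs already accepted
        self.received = 0     # copies actually transferred to this host

    def offer(self, message_id, body):
        if message_id in self.seen:
            return            # decline: already have it, so no re-flood
        self.seen.add(message_id)
        self.received += 1
        for peer in self.peers:   # flood onward to every peer
            peer.offer(message_id, body)

# A small, fully connected mesh of three hosts.
hosts = [NewsHost(n) for n in "ABC"]
for h in hosts:
    h.peers = [p for p in hosts if p is not h]

hosts[0].offer("<msg1@example>", "hello")
print([h.received for h in hosts])  # each host stores exactly one copy
```

Without the `seen` check, the same article would ricochet around the mesh forever – which is essentially the repeated-message flood that Gnutella's naïve query forwarding risked.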

The following extracts from the article illustrate some everyday ways in which community may be used sensibly and constructively. The first extract includes a rather strange, but perfectly understandable, use:

Since 1994, the general public has been racing to join the community of computers on the Internet, placing strain on the most basic of resources: network bandwidth.

The article repeatedly emphasises the critical importance of non-automated, human control of networks as the most effective, but nevertheless flawed, management method.

The beauty of Usenet is that each of the participating hosts can set their own local policies, but the network as a whole functions through the cooperation and good will of the community . . .

but

Usenet has been enormously successful as a system in the sense that it has survived since 1979 and continues to be home to thriving communities of experts. It has swelled far beyond its modest beginnings. But in many ways the trusting, decentralized nature of the protocol has reduced its utility and made it an extremely noisy communication channel.

The next extract shows a relation between community and accountability, and is a good example of the way community and P2P technology co-evolve (rather than, as in much IIS work, taking the technology as a given).

A key challenge in creating peer-to-peer systems is to have a mechanism of accountability and the enforcement of community standards. Usenet breaks down because it is impossible to hold people accountable for their actions. If a system has a way to identify individuals (even pseudonymously, to preserve privacy), that system can be made more secure against antisocial behavior. Reputation tracking mechanisms . . . are valuable tools here as well, to give the user community a collective memory about the behavior of individuals.

The notion of community develops to include technical and social meanings of community standards and collective memory.
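A minimal version of the "reputation tracking" and "collective memory" the extract mentions might simply record ratings against pseudonymous identities and read back an aggregate. The class, the pseudonym, and the ±1 scoring rule below are my own assumptions, sketched only to show how small the technical core of such a collective memory can be:

```python
from collections import defaultdict

class ReputationLedger:
    """Collective memory of how pseudonyms have behaved: peers file
    ratings, and the community reads back an aggregate score."""
    def __init__(self):
        self.ratings = defaultdict(list)   # pseudonym -> list of +1/-1 votes

    def rate(self, pseudonym, score):
        if score not in (+1, -1):
            raise ValueError("score must be +1 or -1")
        self.ratings[pseudonym].append(score)

    def reputation(self, pseudonym):
        """Mean of all votes; 0.0 for a pseudonym nobody has rated yet."""
        votes = self.ratings[pseudonym]
        return sum(votes) / len(votes) if votes else 0.0

ledger = ReputationLedger()
for s in (+1, +1, -1):
    ledger.rate("anon-42", s)        # pseudonymous: no real identity needed
print(ledger.reputation("anon-42"))  # (1 + 1 - 1) / 3 ≈ 0.33
```

Note that nothing in the sketch requires real names: accountability attaches to the persistent pseudonym, exactly the privacy-preserving arrangement the extract proposes.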

The shared communication channel of news.admin allows a community governance procedure for the entire Usenet community. These mechanisms of local and global control were built into Usenet from the beginning, setting the rules of correct behavior. New breed peer-to-peer applications should follow this lead, building in their own social expectations.

Thus we see the unreflective, but entirely appropriate, usage of the notion of community in one article on P2P. Moreover, the postulated control mechanisms for P2P (like the search mechanism outlined earlier in this chapter) depend crucially on a practical notion of community.

In Chapter 15, Burkhalter and Smith discuss uses of social accounting data in Usenet. Various data are tabulated for newsgroups, such as the number of messages, the number of postings per author, the number of responses to each posting, and so on. These statistics are available to Usenet participants. Various reactions are noted, from strong suspicion about the interest of Microsoft in sponsoring such information, to strong interest in the profile of the newsgroup: its place in the activity league of newsgroups, and participants' own rankings as "posters". Burkhalter and Smith use the terms "online community" and "newsgroup" more or less interchangeably. Part of the interest of the chapter is that gathering "social accounting data" depends on an assumption of community – yet when the data is taken up by participants it plays a role in transforming the assumption into a reality. This simple example shows the need in IIS for the technical (the social accounting system) to co-evolve with the social (the Usenet community). Although co-evolution may be the authors' intention, this is not made clear in the chapter – where the technology is treated as prior and given. This brings us to some philosophical considerations.

14.4.3 Communities: An Aside on Philosophy

Before assessing the roots of IIS in community, and some implications, we need to delve into a little philosophy. Earlier we claimed that our understanding of community was incomplete. Using a word correctly does not guarantee there is a corresponding referent in the world. "Unicorn", "the Jabberwock", and "the present King of France" can all be used correctly but have no referent. More interesting than pure fictions are objects that can only be seen through a glass darkly: complexes in process whose dynamics and attributes can never be directly inspected, and which can only be known through representations, e.g. (Lynch, 1991a, 1991b). We will never, to paraphrase Sartre (1965), meet a community "in person". The representations we construct of it, consciously or unconsciously, will largely determine our attitudes, actions and, in the case of P2P and IIS, our designs.

All we really know directly is that the idea of community is meaningful, that it does have serious ramifications in the way we mould our lives and on the paths we follow, and that it (whatever it is) is being changed by networking technologies. Until our experience is more complete, and our representations more adequate, Salvador's prescription for a "fairly harmless and thus beneficial definition" seems most likely to avoid logical blunders. But this is unsatisfactory from the point of view of those who wish for a closer understanding, a better representation of community, in order to build more appropriate community support networks.

Philosophy can help a little. We need representations in order to design applications and systems. We need to take design and implementation actions on the basis of the representations. But we also need to be able to suspend belief in our representations, to maintain flexibility. It helps with this difficult, even contradictory, task if we take a consciously philosophical position and regard community as a metaphor.

Richard Rorty cites Nietzsche's famous remark that truth is a mobile army of metaphors. He explicates this in a late twentieth-century context by saying:

I take its point that sentences are the only things that can be true or false, that our repertoire of sentences grows as history goes along, and that this growth is largely a matter of the literalization of novel metaphors (Rorty, 1991, p. 3).

Later he expands this view of metaphor, saying:

A metaphor is, so to speak, a voice from outside logical space, rather than an empirical filling up of a portion of that space, or a logical-philosophical clarification of the structure of that space. It is a call to change one's language and one's life, rather than a proposal about how to systematize either (ibid., p. 13).

Following Rorty, we may regard community as a metaphor. The repertoire of sentences we can articulate using it is growing and will grow with time. This explains our inability to produce useful definitions, combined with our appropriate use of the term in context. It also explains the attraction and excitement of the idea: there is a lot more to find out. This will not be achieved by attempts at "logical-philosophical clarification", but by active exploration. This is the point at which we may be able to understand the role of P2P in community, and the role of community in P2P and IIS, much better. Jeffrey was right to argue that "anything that allows people to get together can be a community building tool". P2P, and the family of community building tools it will probably lead to in IIS, will help us unroll the set of sentences that develop our understanding of what community is. In an epistemological sense, community comes into being as a result of ongoing and deepening interactions framed by the intention of building or supporting community. Hence the difficulty of providing useful definitions and representations – road maps of continents where roads have not yet been built.

Conversely, without the metaphor of community as an organising concept, P2P loses its coherence and falls apart into diverse technologies and applications with little obvious purpose or raison d'être. P2P and community are deeply symbiotic. This type of deep co-evolution is not yet a feature of IIS. There is an obvious case that it should be.

14.5 P2P, Community and the Design of Inhabited Information Spaces

The previous sections have shown that "community" is deeply involved with P2P, but that its use is uncritical, unreflective, intuitive and often utopian. None of this disqualifies the concept once its metaphorical nature is understood. "Group" and "co-operation" were similarly flawed, yet played a key role in the development and acceptance of CSCW applications. The lack of tight definition combines with fluent, natural and intuitive usage to mean that "community" can be a common currency between developers of different strains of P2P and IIS. It can act as a boundary object:

plastic enough to adapt to local needs and constraints of the several parties employing them, yet robust enough to maintain a common identity across sites . . . weakly structured in common use, and become strongly structured in individual-site use (Star, 1992).

Thus the loose but common notion of community in P2P is a source of dialogue, new insights and (undoubtedly) plausible mistakes, all of which will help the development of the field. In addition, ethnographic investigation of community will help to further our understanding of what community actually is. It may even result in a paradigm shift similar to that for co-operation in CSCW, where the understanding of "co-operation" moved from the structural-ideological to the second-by-second, ubiquitous process underlying most work activity.

But is P2P the appropriate technology for the design of Inhabited Information Spaces? It would be extremely foolish to argue this from a technical point of view. The functionalities of different types of IIS demand different architectures in order to function. Within this book, Roberts (Chapter 13) shows the advantages and disadvantages of different architectures in terms of the compromises that each implies for applications built on them. He also shows how different application genres have different priorities in order to be useful. For instance, the more immediately responsive an application needs to be, the less detail can be transmitted.

Nevertheless, from a design point of view, it can be argued that P2P is the most appropriate technology for IIS – despite limitations or awkwardness of functionality for any particular application. These will turn out to be fewer than we might anticipate, since P2P is not a single architecture, but an evolving set of architectures.

P2P is the technology that offers the most potential for IIS because it is the technology that offers the most interesting problems within the conceptual frame of community. In doing so, it is likely to result in solutions that bootstrap our understanding of community itself.

IIS cannot be defined by a specific technology, any more than CSCW can be defined by a specific technology (Schmidt and Bannon, 1992). However, CSCW was conditioned, constrained and located within a technology framework of the LAN. Similarly, I would argue, IIS is best located within a technology framework of P2P – but not, of course, any specific P2P technology. This means no more (and no less) than that every participant's machine is a server as well as a client. At a conceptual level, within the sociological framework of community, it is quite obvious that each person, each community member, is a source as well as a sink for information, viewpoints, prejudices, emotions, actions, and so on. It does not seem too much to ask that their mode of connectivity (machine) to an IIS should support being a source as well as a sink – in other words, that the IIS network should be a P2P network.
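In code terms, the structural claim here – every participant's machine both serves and requests – reduces to each node exposing a serving method and a requesting method over the same local state. The sketch below is schematic (the `Peer` class and its methods are invented, and no particular P2P stack is implied):

```python
class Peer:
    """Schematic P2P node: the same process both serves requests from
    others (acting as a source) and issues requests to others (as a sink)."""
    def __init__(self, name, items):
        self.name = name
        self.store = dict(items)

    def serve(self, key):
        # Server role: answer another peer's request from the local store.
        return self.store.get(key)

    def request(self, other, key):
        # Client role: ask another peer, and keep what comes back,
        # so this node can in turn serve it to others.
        value = other.serve(key)
        if value is not None:
            self.store[key] = value
        return value

alice = Peer("alice", {"view": "source material from alice"})
bob = Peer("bob", {})
bob.request(alice, "view")    # bob consumes from alice...
alice.request(bob, "view")    # ...and bob can now serve it back
print("view" in bob.store)    # True: every node is both ends of the wire
```

The contrast with a client–server design is that `serve` and `request` live on the same object: there is no node that only consumes.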

Some guesses about how this process might go are as follows. Some of us remember how, in CSCW, the ideological notion of co-operation resulted in bad social engineering and very troublesome applications. Similar things are likely to happen with community. A definitional approach, eschewing metaphor, will result in confusion, as illustrated by the earlier sections. We need to remember that real progress was not made in CSCW until the insights of ethnographers – called in as fire-fighters and consultants when applications were not being accepted, and it was not obvious to the designers why – showed that co-operation was a moment-by-moment business, found almost everywhere people interact, and having very little to do with ideology. The guess is that a similar process will happen with community. Grand sociology, rooted in searches for roles and rules, grand schemas and plans, is unlikely to do any better than it did in CSCW. Probably community will be found in the moment-by-moment interactions between distantly related, or even quite unrelated, people.

The secret of co-operation in groups lies with history, mutual knowledge and learning from experience. Almost all the co-ordinations described by, for instance, Heath and Luff, or by Suchman and co-workers, or by Button, Bowers and colleagues, are deeply skilled. The skill is a result of learning over time with the same people in the same (usually physical) context.

Community is prima facie a different context, since the ongoing interaction between the same people in the same place or context over long periods of time is absent. The interesting interactions will be between relative strangers. They may know each other in some way, yet lack an interpersonal history. That such people can interact, and interact in a way that is mutually beneficial, or that can be labelled "the kindness of strangers", is in need of empirical and detailed examination – not theoretical explanation. If these micro-processes can be studied and, at least to some extent, understood, then it may be possible to start building community-supporting applications based on them.

Inhabited Information Spaces will only be inhabited if there is a reason for people to inhabit them. Cities are attractive because they offer rich opportunities that are lacking in villages and the countryside generally. But cities are not built by plonking a large pile of "opportunity" in the middle of a field. "Something" starts, and that "something" replicates and mutates. That something is a form of interaction. Design for community, which is, in all but name, design for IIS, needs, by analogy, to start with simple supports for interaction, as recommended by Jeffrey (in Salvador, 1998) and by Burkhalter and Smith (Chapter 15).

14.6 Concluding Remarks

This chapter has argued that computer support for human interaction blossomed from the mid-1980s to the mid-1990s. There were seven fat years. Most of the advances, insights and novelty of CSCW – the research field on supporting co-operation – were generated in this period. Similar evidence could be drawn from other fields devoted to supporting forms of human interaction, such as HCI or Groupware. Then came the Web. While this was a massive advance in universal access to information provision, it did little to further support for interaction. The support applications that were generalised from the mid-90s to date (e.g. email, instant messaging, shared editors and whiteboards) were developed much earlier.

The chapter argued that this change was not primarily a matter of intention, but of the underlying technologies and organising concepts. The early period was based on the spread of LANs, with the organising concept of the group providing coherence and a trajectory that supported interaction. The later period was based on client–server Web technologies, and the organising concept of the library (most e-shopping takes place in library-like structures). Neither the technology nor the organising concept was conducive to developing new ways to support human interaction.

The emergence of the new field of P2P, with its organising concept of community, gives hope of another seven fat years for supporting human interaction – but this time centred on community rather than the group.

The chapter argued that community plays a central role in the very coherence of P2P. It is technically necessary for the development of appropriate search engines, and for maintaining overall control. It is organisationally necessary, since it provides a common perspective, direction or ambition for otherwise diverse file-sharing, collaborative and distributed processing applications. It is suggested that community should play a similar role in IIS, where it currently impacts the social, but not the technical. Conversely, technical attempts to provide community support are likely to result in interesting developments in, and far greater understanding of, communities. The conclusion is to try to move both P2P and "community" to the heart of IIS. There is much still to discover. That is what makes it exciting.

15 Inhabitant's Uses and Reactions to Usenet Social Accounting Data

Byron Burkhalter and Marc Smith

15.1 Introduction

Usenet and other text interaction systems are inhabited by populations of tens of millions of active contributors and potentially even larger populations of invisible observers. While text systems lack the explicit representation of bodies common to many graphical spaces, most text interfaces represent individuals as a corpus of messages over time. With effort, using existing tools, users reading collections of individual messages can piece together a distinct sense of the many participants in particular and the character of different spaces in general. New systems are extending the basic features of text interfaces to introduce measures and maps of spaces like newsgroups and email lists, as well as of their populations of contributors. Summary reports and visualisations of individual participants and spaces can be produced from the analysis of collections of messages exchanged in text spaces, bringing new resources to users of and contributors to these environments.

How do participants make use of enhanced context about text spaces and the contributors and conversations with whom they interact? In the following we report examples of the ways social accounting data are incorporated into threaded conversations created in Usenet newsgroups. Netscan (http://netscan.research.microsoft.com) is a research project that generates and publishes extensive social accounting data about the public Usenet. Netscan contains information on each of more than 103,000 newsgroups and 20 million unique authors who have contributed messages since the end of 1999. In effect, Netscan can take masses of conversational data and render a series of summary metrics that can be reflected back directly, or used as a way to select content, to the readers of and contributors to Usenet newsgroups. Thirty-eight thousand unique visitors have used the Netscan web site since the start of 2000. Some of these users go on to discuss the Netscan service or the data it reports by posting messages in Usenet itself, in some cases including URLs pointing to specific reports in the system or copying segments of the reports published on the Netscan site directly into messages posted to their favourite newsgroup.

We have discussed interface components and visualisations, as well as the value of social accounting data for selecting content from newsgroups, in other papers (Fiore et al., 2001). Here we want to explore how social accounting data are used by those who participate in Usenet. We do not address how people have used the Netscan web site itself, but rather how they have made use of Netscan data by posting it or making references to it in, and as, the context of their particular newsgroup. Our intent is to explore the ways new representations of social context are made use of in the very spaces they represent, and to document the ways these representations of participants and places are used by and between participants in these spaces.

Graphical representations of bodies and geometry are not the only way information spaces are inhabited. Groups of people who exchange simple ASCII text through online conversation systems, like email lists and Usenet newsgroups, also come to inhabit an information space. The approach of the Netscan project is to provide tools that support social awareness in text interaction spaces like Usenet by combining measures of each newsgroup's, author's and thread's activity with data visualisation and reporting interfaces that present these patterns back to community participants and interested observers. The underlying assumption is that social accounting measures of the activity of newsgroups, authors and conversation threads can be used as a social resource, providing context in support of social processes related to boundary maintenance, status contrasts between individuals and newsgroups, and the characterisation of conversation partners.
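The social accounting measures described – message counts, postings per author, replies per thread – are straightforward aggregations over per-message records. A sketch in the spirit of such metrics, over invented sample data (the function name and record format are my own, not Netscan's):

```python
from collections import Counter

# Hypothetical message log: one (author, thread_id) pair per posting.
messages = [
    ("ann", "t1"), ("bob", "t1"), ("ann", "t1"),
    ("cat", "t2"), ("ann", "t2"),
]

def social_accounting(msgs):
    """Newsgroup-level summary roughly in the spirit of Netscan's metrics."""
    posts_per_author = Counter(a for a, _ in msgs)
    posts_per_thread = Counter(t for _, t in msgs)
    return {
        "messages": len(msgs),
        "authors": len(posts_per_author),
        "top_poster": posts_per_author.most_common(1)[0],
        # Replies = postings in a thread beyond the initiating one.
        "replies_per_thread": {t: n - 1 for t, n in posts_per_thread.items()},
    }

report = social_accounting(messages)
print(report["top_poster"])  # ('ann', 3)
```

The computation is trivial; the chapter's interest lies in what happens socially when such a report is reflected back into the newsgroup it describes.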

To assess the impact of publishing social accounting data back into each space, we examined all mentions of the Netscan web site in Usenet messages. Since the Google Groups (http://groups.google.com, formerly Deja News) service has saved and indexed large collections of Usenet messages dating back to the first years of Usenet in 1981, we were able to search the contents of millions of Usenet discussion threads for mentions of Netscan and its data. Using this tool it is simple to retrieve the collection of messages containing a particular set of words, including those containing a particular web URL address. We searched for fragments of our system's URL ("netscan.research"). Our search through April of 2003 found 255 threads containing 7,430 messages, starting in June of 1997, with the bulk (87 per cent) occurring since 2002. These messages were reviewed, and the forms of social use they were put to are discussed in the next section. This strategy provided a broad general understanding of the types of messages that contained Netscan social accounting metadata, and allowed us to see how the data was used in various ways in the selected discussion threads.

15.2 Related Work

Several related systems visualise patterns of activity of authors and conversations. These systems focus on the representation of the social history of online conversation spaces and the members of their contributing (and in some cases observer) population.1

PeopleGarden (Xiong and Donath, 1999), for example, visualises message boards as a collection of flowers, where each flower represents a user in the conversational space and its petals represent his/her postings. The ordering of petals and the saturation of each petal indicate time: a message posted far in the past is less saturated than one posted more recently. Finally, the height of each flower represents how long a user has been active on the message board. Even though PeopleGarden's focus was web-based discussion boards and not Usenet newsgroups, it represented conversational spaces in terms of its participants' histories.
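The encoding just described – one petal per posting, saturation fading with a posting's age, flower height tracking the user's tenure – can be expressed as a small mapping from posting history to visual attributes. The exponential-decay formula and half-life below are my guesses at a plausible realisation, not the paper's actual functions:

```python
def flower_for_user(post_times, now, half_life=30.0):
    """Map one user's posting history to PeopleGarden-style attributes.
    post_times and now are measured in days since the board opened."""
    petals = [
        {
            "age_days": now - t,
            # Saturation decays with age: recent posts render as vivid petals.
            "saturation": 0.5 ** ((now - t) / half_life),
        }
        for t in sorted(post_times)
    ]
    return {
        "petals": petals,
        # Height reflects how long the user has been active on the board.
        "height": now - min(post_times) if post_times else 0.0,
    }

flower = flower_for_user([0.0, 30.0, 60.0], now=60.0)
print(len(flower["petals"]), flower["height"])  # 3 petals, height 60.0
```

The design point the sketch illustrates is that a purely textual history becomes glanceable: a tall flower with pale petals is a long-standing but lapsed contributor, a short vivid one a newcomer in full bloom.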

The Loom project focused on visualising social patterns within Usenet newsgroups. It highlighted varying patterns of participation, for example flagging rowdy, vociferous users as well as the number of participants in different threads over time. Although its focus was not on the authors per se, Loom managed to uncover interesting author dynamics found in newsgroups – for instance the marked difference between the average numbers of participants per thread in technical versus social newsgroups. A related project, Conversation Map (Sack, 2000), represents authors present in Usenet newsgroups as nodes in a social network in which patterns of reply form the links between contributors. It computes a set of social networks detailing who is talking to whom in the newsgroup, which visualises the degree centrality of each of the authors in the newsgroup. The system also analyses the text of messages to uncover sets of "discussion themes". Here, as in Loom, remarkable patterns emerge that are related to people's interactions in the conversational space. Babble (Erickson et al., 1999) is a similar effort to visualise the behaviour of groups of people interacting through a networked conversational system, in this case a proprietary message board system in use at a major corporation. The system attempts to provide a form of "social translucence", a rapidly graspable representation of the state and pattern of interaction of people participating in the space. Babble represents participants in a collection of chat/message boards as circles within a series of concentric rings which indicate how recently they have been active.


1 Newsgroup threads are not face-to-face conversations (see Garcia and Jacobs, 1999). The insight that informed this research is the notion of recipient design, which references the way that orderly, coherent conversations are constructed by both the messages that explicitly mention the Netscan data and the responses to those messages.


Inhabitant’s Uses and Reactions to Usenet Social Accounting Data


15.3 Netscan

Netscan generates and publishes social accounting data about the Usenet. It provides a web interface for information about the relationships and activities of the three major elements of Usenet: the newsgroups themselves, the authors, and the threads their messages create through patterns of turn and reply over time. Netscan collects messages but, unlike a search engine, extracts and retains only the FROM, NEWSGROUPS, DATE, MESSAGE-ID, LINES and REFERENCES headers from each message. Using these message elements Netscan creates aggregations of the multiple dimensions of the Usenet over time. Message bodies are retained for a few weeks or months but are not permanently stored.
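The header-only retention described above can be sketched with Python's standard-library email parser, which also handles RFC-1036-style Usenet messages. The sample message is invented for illustration; this is not Netscan's actual ingestion code.

```python
# Minimal sketch: keep only the six headers the text names and
# discard the body, using the stdlib email parser.
from email import message_from_string

KEEP = ("From", "Newsgroups", "Date", "Message-ID", "Lines", "References")

raw = """From: alice@example.com
Newsgroups: rec.arts.books,alt.books
Date: Mon, 03 Sep 2001 12:00:00 GMT
Message-ID: <abc123@example.com>
Lines: 2
References: <xyz789@example.com>
Subject: Re: reading list

Body text would be discarded after a few weeks.
"""

msg = message_from_string(raw)
record = {h: msg[h] for h in KEEP}  # only the six retained headers
print(record["Newsgroups"])  # rec.arts.books,alt.books
```

The REFERENCES header is what lets thread structures be rebuilt later from headers alone, since it lists the message IDs of the ancestors in the reply chain.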

The main Netscan interface takes a keyword and matches it against the names of newsgroups in Usenet instead of against the content of their messages. It then displays a report on the number of messages each matching newsgroup received in the given day, week or month selected. In addition, the number of authors (also referred to as "posters") is listed along with the number of those authors who had also posted in the prior time period (called "Returnees"). Measures of the way each newsgroup links to others are presented in the form of the count of the total number of messages "crossposted" (shared with another newsgroup) and the count of the total number of other newsgroups that are linked by even a single message.
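The "Returnees" measure can be illustrated as a set intersection between the authors of two adjacent time periods. Data structures and sample figures here are invented, offered only as a sketch of the metric the text defines.

```python
# Sketch of a per-newsgroup report: message count, distinct posters,
# and "returnees" (posters also active in the prior period).

def newsgroup_report(prior_authors, current_posts):
    authors = {p["author"] for p in current_posts}
    return {
        "messages": len(current_posts),
        "posters": len(authors),
        "returnees": len(authors & set(prior_authors)),
    }

march = ["ann", "bob", "cat"]
april = [{"author": "ann"}, {"author": "ann"}, {"author": "dee"}]
print(newsgroup_report(march, april))
# {'messages': 3, 'posters': 2, 'returnees': 1}
```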

From this report users can access a more focused report on a single newsgroup in a selected period of time. This "report card" displays daily activity in the newsgroup and its change in activity, in terms of the number and type of messages and participants, over the prior time period. These reports are very macro level and address some of the overall structure and dynamics of the newsgroup. The newsgroup report card also displays two examples of content selected from newsgroups on the basis of the size and structure of the threads and the behaviour pattern of authors who contributed messages to them. The "thread tracker" reports the 40 largest threads in terms of the number of messages that were added to the chain of turns and replies in the selected time period. This report, therefore, displays the most active and possibly most controversial topics in the newsgroup. A related report, the "author tracker", selects content by the behaviour of the authors who contributed to it. The report lists 40 authors in the newsgroup, selected by the number of different days each author contributed at least a single message in the time period selected. In some newsgroups the most active authors contribute messages nearly every day in a month and have consistently done so for many months or even, in rare cases, years. This measure, therefore, represents a kind of "costly signal" – a hard-to-falsify quality that is a relatively reliable indicator of the tenure of the contributor in the newsgroup. If users select a listed author, the Netscan system displays the ten threads to which the author contributed the most messages in the time period selected. This way the threads that most attracted the contribution of the most dedicated participants are easily accessed.
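The author tracker's ranking rule, ordering authors by the number of distinct days on which they posted at least one message, can be sketched as below. The data layout and sample posts are invented for illustration, not Netscan's actual implementation.

```python
# Sketch of the "author tracker" ranking: distinct posting days are a
# hard-to-falsify signal of sustained presence in a newsgroup.
from collections import defaultdict

def rank_by_active_days(posts, top_n=40):
    days = defaultdict(set)
    for author, date in posts:  # date as 'YYYY-MM-DD'
        days[author].add(date)
    ranked = sorted(days.items(), key=lambda kv: len(kv[1]), reverse=True)
    return [(a, len(d)) for a, d in ranked[:top_n]]

posts = [("ann", "2003-04-01"), ("ann", "2003-04-01"),
         ("ann", "2003-04-02"), ("bob", "2003-04-01")]
print(rank_by_active_days(posts))  # [('ann', 2), ('bob', 1)]
```

Note that using a set of dates, rather than a raw message count, is what makes the measure costly to inflate: posting many messages on one day still counts as one day.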

15.4 Findings

Within newsgroups, discussion of individual participants is quite common. In a study of soc.culture newsgroups, evidence suggested that discussants often attempt to categorise other discussants in order to "spin" the arguments in more favourable terms (Burkhalter, 1999). In newsgroup arguments one may be able to "win" an argument simply by successfully categorising the opponent in a discrediting fashion. Certainly in newsgroups where advice and other forms of help are offered, advisees have a need to understand the type of person with whom they are dealing in order to assess information that they are likely to be unable to evaluate on its face. The massive variety of newsgroups and respondents makes it very difficult to make these crucial judgments with confidence, however, without significant investment of effort over time as a pattern of messages is pieced together into an overarching picture of the author and an assessment of how reliable, trustworthy and valuable they are. The most common resource for categorising other participants is to consider the content of their most recent post. A more effective means of understanding who is who within the newsgroup comes from longstanding newsgroup members who provide an institutional history. The problem is identifying longstanding members.

Social accounting data is an alternative that goes beyond listing all the messages that contain the keyword or author name used as a query. For example, data on which newsgroups an author has participated in, the number of posts per time period in each, the first and last date of posting for a particular address, the number of threads initiated and how often their messages generate a response are all ways of typifying a particular author through structural data. As a result, social accounting data provides easy access to the most prolific and longstanding newsgroup members as well as to newcomers or visitors from other discussion spaces.
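The structural measures the paragraph lists can be gathered into a simple author profile. The field names ('group', 'date', 'is_initial', 'got_reply') and sample history are hypothetical, chosen only to mirror the measures named in the text.

```python
# Sketch of an author profile built from structural (not content) data:
# groups touched, first/last posting date, threads initiated, and the
# fraction of posts that drew a reply.

def author_profile(posts):
    """posts: dicts with 'group', 'date', 'is_initial', 'got_reply'."""
    dates = sorted(p["date"] for p in posts)
    return {
        "groups": sorted({p["group"] for p in posts}),
        "first_seen": dates[0],
        "last_seen": dates[-1],
        "threads_initiated": sum(p["is_initial"] for p in posts),
        "reply_rate": sum(p["got_reply"] for p in posts) / len(posts),
    }

history = [
    {"group": "rec.boats", "date": "2001-09-03", "is_initial": True,  "got_reply": True},
    {"group": "rec.boats", "date": "2001-10-01", "is_initial": False, "got_reply": True},
    {"group": "alt.flame", "date": "2001-10-05", "is_initial": True,  "got_reply": False},
]
print(author_profile(history))
```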

15.4.1 Social Accounting Data and Author-assessment Threads

The existing interfaces to Usenet and related discussion spaces present an overwhelming welter of individual messages from a potential population of thousands or tens of thousands of posters. The diversity of participants and the limits of most Usenet interfaces create a situation in which people, accustomed to the richness of face-to-face interaction, have difficulty typifying co-participants. Typifications are a regular practice in interaction and are used to formulate responses and reactions appropriate to the social status of the participants and the setting (see Burkhalter, 1999; Sudnow, 1972). In other words, participants seek to put other participants "in context" in order to conduct each subsequent interactional turn. Thus, a common occurrence in newsgroup messages is a search for information about particular authors. For example:

. . . Do any of you know Toni, the guy who told you who the perpetrator is? Is he a regular here? . . .

The post above speaks to an important type of newsgroup participant, "the regulars", who have long-standing relationships with the newsgroup. Regulars are replied to more often than new members. Given the disproportionate number of messages regulars produce, in a very real sense what regulars do in their messages is what the newsgroup, effectively, is. Regulars can be important for a number of reasons. Regulars in "Q&A/help" newsgroups have presumably answered a number of questions, and the accuracy and helpfulness of their answers may be validated by their continued presence and the absence of repudiation. Regulars are also an important sign of what the normal procedures are for the newsgroup. Regular authors, even those who fill large controversial threads, even in negative and hostile roles, are part of the normal operations of the newsgroup. The point is not that a person is virtuous by virtue of being a regular; the point is that by being a regular the person is a known quantity – for better or worse. The importance of known quantities should not be overlooked. Knowing the nuances of another's identity is necessary for competent responses.

Messages can be evaluated through social accounting data about their authors. For example, the fact that an author is a relative newcomer or is a prolific poster to hundreds of other newsgroups is made visible by the Netscan reports on that author. While such measures are open to interpretation (an identity with only recent activity could be the result of a newcomer or of a previously active user seeking to create a new identity and lose the baggage of a prior one), these histories, however fragmentary, combine to offer a picture of the social context of newsgroups, authors and threads. Those who post in undesirable newsgroups can be seen as outsiders or visitors. Thus, information about an author's position in the social structure of a newsgroup in particular and Usenet in general can be used to characterise particular posters. In effect, Netscan's author profiles are reputation measures, creating a definition of a particular poster as, for example, prolific or troublesome:

Most of politics belong in national local news groups. Foreign people can participate on existing local discussions. But [POSTER A] is posting in 115 news groups and initiates foreign issues. And just like politicians he suffers from shortness of life. He only started to use the name "[POSTER A]" begin September 2001


[link to Netscan statistics] October 2001
[Posts  Replies  FirstSeen  ThreadsTouched  Other NG's
[A]  721  554  09/03/2001  336  58]

The particular thread excerpted above involves a call for [POSTER A] to cease posting messages not relevant to the newsgroup's topic. For unmoderated newsgroups (which most are) this is a serious issue, since anyone can post anything to any newsgroup at any time. A recurring practice of newsgroup maintenance involves "patrolling" the newsgroup for "offensive posts". This may include admonishing those who are advertising, asking questions already answered in a FAQ guide, posting a large number of messages or, as in this case, posting irrelevant and offensive messages.

However, the message excerpted above does not draw out all the potential conclusions possible from the data utilised. The message specifically mentions that the address in question has only been in use for a short time. Researchers and participants must remember that messages are signed with an address that may or may not correlate to a particular person. The fact that an address is new might suggest that the participant is new; it might also suggest that a poor reputation was connected to a previous name. For this reason, measures beyond the length of participation are important. For example, the number of posts and the ratio of initial turns (posts that start a new thread) to replies (posts that are responses to previous messages) can help to distinguish a person who over-posts (also known as spam) to a newsgroup, in which case the number of initial turns is likely to be large in proportion to the number of replies, from an active participant who contributes replies to other posters' messages.
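The initial-turn-to-reply heuristic just described can be sketched as a toy classifier. The 0.8 threshold is an invented illustration, not a rule drawn from Netscan.

```python
# Toy heuristic: a high share of initial turns with few replies
# suggests spam-like posting; a reply-heavy mix suggests an engaged
# conversational participant. Threshold is illustrative only.

def posting_style(initial_turns, replies):
    total = initial_turns + replies
    if total == 0:
        return "inactive"
    return "spam-like" if initial_turns / total > 0.8 else "conversational"

print(posting_style(95, 5))     # spam-like
print(posting_style(167, 554))  # conversational
```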

An author is further placed in context by reference to their cross-posting data. At the extreme, a large number of newsgroups also suggests spamming behaviour, which might be evidence of a troublemaker or advertising or other "outsider" work. Participants are aware of these implications. Some messages expressed a concern that their own author profile might be misinterpreted:

I went to this [Netscan] URL and it worked (I'm using IE 6). searched myself, and apparently I've been posting in groups I don't post in. May have been accidental crossposting in replies, though. But I couldn't figure out if there was x-no "damage". I couldn't get to the actual posts. Nor did I find a way to look up stuff I know I x-nayed. Oh well.

Many users of Usenet do not examine the cross-posting header line of the messages they reply to, which frequently means that they "inherit" the cross-posting pattern of the initial message. As a result, their replies are not confined to the particular newsgroup in which the responding author found the message. In fact, users may be inadvertently cross-posting to ten, twenty, or more newsgroups. Thus the cross-posting data may not be completely indicative of posting intent; however, frequent posting to a particular newsgroup does suggest intentional participation. Stray cross-posts, as indicated by few or even just one message, may be entirely unintended or accidental, but that does not mean that others will accept that account. This is important because where a person posts is taken to reveal information about the person. Participants who examined their own profile constituted half of all author-oriented posts. And a large portion of these messages concerned perceived inaccuracies in the reports of cross-posting:

that's freaky! – try checking out user profiles. i see a whole load i've never been near (unless i've contributed to some crossposted thread) such as alt.obituaries, uk.gay.london(!), soc.culture.pakistan(!!)

The social accounting data included in author-oriented threads frequently involves discussions of the reputation of participants and/or a person's position within a particular newsgroup. This data provides a sense of whether a person is known to other (established) members of the newsgroup. Like message archives, social accounting data provides a way of tracing an individual identity's activity over time. However, social accounting data places the individual identity in multiple contexts at once. The date the identity was created (the date of its first posted message), the relation of this identity to other newsgroup authors, the amount of cross-posting and the names of other newsgroups touched by the individual name, along with the frequency of posting within each newsgroup, help to establish the position and history of each author. This context can be used to characterise and typify participants and allows other participants to design responses appropriate to their status.

Noting variation in the number of each author's contributions was the most common use of social accounting meta-data. For many newsgroups this was a one-time occurrence. However, a fair number of newsgroups have made this a monthly practice. Monthly or even weekly rankings were a regular part of the proceedings; at the end a winner is declared:

Ah, yes, the results for the whole of March are available. I've slipped into 19th place (uh-oh) but the top ten are: . . .
[results deleted]
Dom Robinson wins the Queen Mother Memorial Award for Most Helpful Poster

As with newsgroup comparisons (discussed below), the most frequent author comparison is in terms of the number of posts per member. Following these posts were self-evaluations accounting for the different positions won; eight short examples follow:

I'm at a lowly 40!!
Okay, who won in October?
I am hardly surprised I posted the most in December, as I was unemployed the entire month.


I think my trivia posts have inflated my position a little
The question is though, will I be the Xmas No. 1 . . .
Damn I'll have to try harder next time, I know I can make the top 3!
not me i aint even made the list, not even 5 posts??????
Ahh, there i am in 6th :o)
I was so close to breaking the top ten :(

However, not everyone in the newsgroup can appropriately justify their position. One poster suggested that more work was necessary to raise their own number of posts; a rebuke came quickly, simply stating "who are you?" This is more than a putdown; it is in many respects the point of these "competitions". They are competitions over who has the requisite status to participate in the newsgroup as full members. A person might rank as the fortieth ranked poster, yet still be able to join this discussion because they have a certain standing in the newsgroup.

Although these messages focus on the quantity of posting by particular authors, the issue is not one of characterising a particular author but of comparing authors who frequently contribute to the conversations in the newsgroup. Quite often the posting of comparative social accounting meta-data created a discussion of participation in the newsgroup. Occasionally people would express disbelief in the numbers, some took pride in the quantity of messages they posted, and others used the opportunity to claim that they were actually the most prolific newsgroup contributors. On numerous occasions, parties apologised for their lack of participation. While it may seem that more is better, the quantity issue is not clear-cut:

Average 7 per day for the last seven days (and that's counting a busy Thurs jousting with Virt).
You're ahead of me this weekend and you've yet to say anything interesting, just the usual name calling.

As this example illustrates, being the most prolific is hardly a guarantee of being beloved by all. We find that those who post frequently become the objects of discussion and attention themselves. There is status involved in being the object of discussion (even if the discussion is largely negative).

Interesting stuff, looks like they put Tracker as the number one contributor this month. Little 'ol me didn't even make the top 40, but you have to post stuff for that, right? ;p
Yes posting helps but as in all things quality as opposed to quantity is what's really important. Leaves you know who you know where.

Social accounting data provides an objective justification for doing the work of acknowledging participants, an important function of maintaining interactional newsgroups. Different participants occupy particular positions within the newsgroup. Newsgroups maintain their cohesion when their populations agree to a definition of the situation, including who is a member and who is not. Social accounting data allows roles connected to participation to be continually re-established through objective measures. This can be done informally by members; however, the objective data curtails arguments about the measures (although such arguments may also have a function within the newsgroup). Instead the discussion immediately moves to the characterisation of participants. These characterisations rely on the increasing complexity of social accounting data beyond the counting of posts:

Did you know that on 9th of September 2001, the Flonk got 1202 messages? The daily average in September was 713. The favorite crosspost destination was alt.fan.karl-malden.nose, with about 67% of the traffic crossposted there. The top five was rounded out by alt.flame, alt.usenet.kooks, alt.flamenet and alt.fan.scarecrow. The busiest poster was anonyme with 2334 messages, not counting the numerous morphs. That's well over 70 articles a day. OpI was second with 1897, and Dave Hillstrom third with 333. mimus started the most threads, 40 in all.

That's all mimus does, is start threads. Then he sits back and cackles while people jump all over them and get into fistfights and stuff. He's a Bad Man.

Those who make the list in terms of participation may be open to characterisation along different metrics, in this case the number of initial posts (messages that create new threads). Another member uses the longest post for the month to suggest another category won by that member:

. . . You'll reportedly want to use IE for best results (big surprise). For those interested in such information as: During the month of November, alt.coffee had: 4014 posts from 556 posters with an average line count of 26
3457 were replies from 438 repliers
96 posts went unreplied, and 76 were cross-posted
245 posters were returnees from the previous month, while 241 were drive-bys
Woohoo! I am NOT the mostestposter for the month! (Barry was!)
But I am tops in most stimulating conversations created . . . uh "Threads Initiated". :P
Warning signs (a thread I started – go figure) had the most traffic – 83 posts.
This thing is cool. Too bad there's no easy way to see how much reader traffic the ng has . . .

Along similar lines, those who initiate without garnering responses can be seen as holding a lower status within the newsgroup:

Well you are top with "most posts", but you fall down because many of your posts remain unreplied to and you don't start a lot of your own threads, which is what the "chart" is calculated on, I think.


15.4.2 Social Accounting Data and Newsgroup-assessment Threads

The overwhelming focus of Usenet messages using Netscan data was on newsgroup-level metrics. For the purposes of this analysis we will separately discuss those messages where social accounting data is used primarily to discuss single newsgroups from those where multiple newsgroups are compared. "Intra-newsgroup assessments" contain messages where some perceived characteristic of the newsgroup is discussed by reference to social accounting data. This might include a newsgroup that seemed to be dying or one which has experienced a sudden large influx of messages. The second type, inter-newsgroup assessments, includes those where a potential participant is searching for an appropriate newsgroup among a collection of potential newsgroups using social accounting data. In each case, social accounting data is used to gain a perspective not possible from within existing newsgroup browsers, which simply present long lists of available messages.

Usenet consists of at least 103,000 groups, which makes finding a suitable newsgroup a daunting task. Social accounting metrics allow searching for a newsgroup by properties of the patterns of interaction within each newsgroup. The excerpt below is an example:

Hey, I just used this tool for this news group
http://netscan.research.microsoft.com
and I see that the majority of threads here are related to talking about one console vs. the other, and the most posts and replies come from someone who maintains a FAQ for the news group, and there are some dedicated regulars and some amount of trolling.
Can any of you frequent posters tell me the scoop with this news group? Looking back at a week of messages here, am wondering if there is enough substance here to pay it close attention. It seems to be a lot better than the alt. newsgroup, and am willing to put up with some noise to find out what people are doing with XBox.

This poster details characteristics that, for them, comprise a worthwhile newsgroup: a particular topical range and focus, a group of regulars and few troublemakers or "trolls". Social accounting data allows this participant to see the newsgroup from a perspective otherwise difficult or impossible to assemble manually. From this vantage point an informed choice about which newsgroups to join can be effectively made. Central to the question of newsgroup health appears to be the issue of an adequate base of regular participants:

Anyone good at looking at stats from newsgroup software? How many regular posters do we have in chi.general? Just wondering . . . .
netscan.microsoft.com (doesn't work in Netscape, surprise!)
December stats: 3,485 posts from 260 posters, 106 of which were returnees (posted the month before)
November: 2,424 posts from 259 posters, 101 were returnees
October: 2,521 from 338, with 106 returnees
September: 3,470 from 358, 98 returnees
I'd say we have about 100 regulars.
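One way to arrive at the rough estimate in the excerpt is to treat the month-over-month returnee count as a proxy for the size of the core of "regulars". The figures below are copied from the quoted report; the averaging step is an assumed reading of how "about 100" was reached, not the poster's stated method.

```python
# Back-of-envelope estimate of a newsgroup's "regulars" from the
# returnee counts in the quoted chi.general report.

months = {  # month: (posts, posters, returnees)
    "Sep": (3470, 358, 98),
    "Oct": (2521, 338, 106),
    "Nov": (2424, 259, 101),
    "Dec": (3485, 260, 106),
}
avg_returnees = sum(r for _, _, r in months.values()) / len(months)
print(round(avg_returnees))  # 103, i.e. "about 100 regulars"
```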

The primary benefit for users (particularly users new to Usenet or to a particular newsgroup) is the ability to enter a discussion space not as a complete neophyte but having already acquired a perspective on the newsgroup (see Lawley, 1994). The number of posts and the ratio of replies to total messages can allow new users to characterise the population of the space, identifying prolific users with different patterns of contribution. Conversely, those who build and promote a particular newsgroup may orient to this feature and concern themselves with social accounting data as a sign of a newsgroup's health. The "regulars" within a newsgroup are certainly concerned with the social accounting data related to cross-posting and replies:

Of course this means poor uk.media.tv.misc gets lumped with thousands of off-topic posts a week . . . . but I don't think anyone really cares any more.

Regular newsgroup participants often express deep concern about their newsgroup; it takes effort to post over long periods of time as many regulars do. Many newsgroups have existed for a decade or more and members often express concerns about the health and vitality of their newsgroups. Social accounting data can be used to gain a perspective on the health and vitality of newsgroups. In the following example, a poll where participants vote on their favourite rollercoaster had fewer participants in the current year than in the past. An initial post expresses concern over the declining participation, and the second message below replies to that concern:

My assumption is that there are actually fewer people participating in roller coaster forums (or websites) when (February) there are not many roller coasters available to be ridden. Is there statistical data (posts/day etc.) available to prove or refute my assumption? In any case, that would not explain why fewer people participated in the 2001 poll than in the 1999 poll (which was also in February).

That month had roughly 2000 more posts than Jan 2002. Jan 2000 was slightly higher than Jan 2002, but lower than Feb 2000. Both 2000 months were corrupted slightly by what appears to be a slight feed loss in both months. Both of these months had average and maximum posting days quite a bit higher than Jan 2002. So . . . it's entirely possible that 2000 just had more posters than 2002. And Mitch, October 2001 was less than either Feb 2000 or Jan 2002, for what that's worth.

The social accounting data accounts for the declining participation based on a model where the availability of roller coasters corresponds to the participation in the newsgroup. Thus, the concern over participation is eased.


Newsgroup health and vitality is also established by comparison to other newsgroups. Newsgroups were most commonly contrasted in terms of the number of posts each received, as in the two excerpts below:

wow, I like this site, thanks Kent. look up windowsxp.general it did 34,000 posts in Oct. next one is only 20,000

Autism averages in the top 20 "alt.support" groups. Alt.support.depressed seems to have 4 times the number of posts as any other group, over virtually any period of time, which is a bit of a sad statement about our world.

Other messages engaged in more complex formulations, adding discussions of interactivity such as replies and poster-to-post ratios. Here are two examples:

Yeah, there is a separate mostly dead newsgroup for Bollywood films.
It's not all that dead – I did a indian.research. Indian.ft.com report on alt.asian-movies and rec.arts.movies.local.indian, and the Indian group had about 78% of our number of posts, 40% of our number of posters and about the same number of replies that we had (that ought to mean their threads run longer).

I put in "motorcycles" in the search field and out of the *37* motorcycle related newsgroups the top three were (totals derived from Nov 1, 2001 to present):
1. uk.rec.motorcycles total posts: 19,546 with 780 unique posters
2. aus.motorcycles total posts: 8,262 with 781 unique posters *and*
3. rec.motorcycles.harley total posts: 7,349 with 787 unique posters
which beat out our nr 4 newsgroup:
4. rec.motorcycles total posts: 6,965 with 838 unique posters.
What does all mean? Hell if I know, but it seems that LESS folks in the rmh groups post MORE than the rm newsgroup. That and those Brits on the other side of the pond are chatty as hell. Probably because of useless crap like I just posted.
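The "chattiness" comparison the poster is gesturing at can be restated as posts per unique poster. The totals below are taken from the quoted message; the ratio as a chattiness measure is our illustrative reading, not a Netscan report.

```python
# Posts per unique poster as a rough "chattiness" measure, using the
# totals from the quoted motorcycle-newsgroup message.

groups = {
    "uk.rec.motorcycles":     (19546, 780),
    "aus.motorcycles":        (8262, 781),
    "rec.motorcycles.harley": (7349, 787),
    "rec.motorcycles":        (6965, 838),
}
for name, (posts, posters) in sorted(
        groups.items(), key=lambda kv: kv[1][0] / kv[1][1], reverse=True):
    print(f"{name}: {posts / posters:.1f} posts per poster")
```

On these figures uk.rec.motorcycles tops the ranking at roughly 25 posts per poster, which is the structural pattern behind the "chatty" remark.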

This is not merely an opportunity for a structural perspective; many of these messages (and those in the next section) are explicitly competitive. The notion that members of one newsgroup "beat" another newsgroup on some dimension comes across in many of the messages. This contrast presents an opportunity to characterise outsiders and insiders, to express the difference between "us" and "them". The drawing of newsgroup boundaries by the establishment of difference is a crucial aspect of newsgroup identity and maintenance. Thus the ability to compare newsgroups becomes an opportunity to cast structural differences as evidence of the moral differences between newsgroups. The account of different structural outcomes is then explained by a certain "chattiness" that distinguishes one newsgroup from another. In the next example, a comparison of cross-posted messages in a fitness newsgroup becomes an opportunity to affirm membership:


Newsgroup  Shared MSGS  % Shared
Total Neighbors: 67 [distinct groups]  1282  42%
#1 Neighbor alt.sport.weightlifting  741  57.8%
#2 Neighbor misc.fitness.misc  342  26.68%
#3 Neighbor sci.med.nutrition  332  25.9%
#4 Neighbor uk.rec.bodybuilding  143  11.15%
#5 Neighbor alt.fitness.weights  132  10.3%
The only really suprising thing is that MFW appears to rate #1 or #2 on every measure of activity. I was always attracted to it by the high energy level – but now I know why it takes more time to read about weights than actually to lift them.
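The "neighbour" figures in the excerpt can be sketched as a count, for a home group, of how many of its messages were also posted to each other group named in the Newsgroups header. The sample messages and the home-group name are invented for illustration.

```python
# Sketch of the cross-posting "neighbour" measure: count messages the
# home group shares with each other group in its Newsgroups lists.
from collections import Counter

def neighbours(messages, home="misc.fitness.weights"):
    shared = Counter()
    for groups in messages:  # each message's Newsgroups list
        if home in groups:
            for g in groups:
                if g != home:
                    shared[g] += 1
    return shared.most_common()

msgs = [
    ["misc.fitness.weights", "alt.sport.weightlifting"],
    ["misc.fitness.weights", "alt.sport.weightlifting", "misc.fitness.misc"],
    ["misc.fitness.weights"],
    ["sci.med.nutrition"],
]
print(neighbours(msgs))
# [('alt.sport.weightlifting', 2), ('misc.fitness.misc', 1)]
```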

However, social accounting data is not only an opportunity to cheer for one’s own team. Indeed, as the complexity of the data increases, the potential interpretations of that data become increasingly complex. The next excerpt features social accounting data about the number of posts, the number of posters, the poster-to-post ratio, the number of returnees to a newsgroup, the average number of lines in each message, the number of replies and repliers, the number of unreplied-to messages and the number of cross-posted messages and cross-posted newsgroups. All of this brought out a discussion not simply of the pros of the newsgroup, but also the cons:

OK, I’m sure this will be garbled, but here’s a comparison of ASA to our more genteel counterpart, RBC for November:

     P     Ps   PP   Rt   ALC  R     Rrs  S    UnR  X    Xp
#A   6225  181  .03  95   26   5799  137  80   113  39   37
#B   2258  445  .20  172  33   2015  354  218  57   640  900

We had triple the posts, but less than half the posters – a much more active group. Our linecount was less, probably caused by one very terse poster whom I won’t name ([name omitted]). One other item stands out – more than a quarter of [Group B’s] posts were crossposted, with 900 targets! What’s going on there?? [Group A] had very few crossposts.

For the personal stats, [name omitted] was the most consistent poster, posting every day in November! But she only initiated 3 threads. Actually, as a group we were very consistent – 30 people posted 14 days or more during the month.

RB started the most threads, followed by MC and [name omitted]. Since CKW defected to AS, MC is on vacation, and RB is preoccupied, the number of interesting threads has diminished of late.

Content is harder to establish since so many of our threads meander. Amazingly, the great anchor chain debate was less than 10% of the posts. Beverages seemed to hold second place.

Very interesting! This is more a “club” group. These stats suggest that the “company” is good!

Hmmm. Now I see that as meaning we have the same people posting again and again with less to say of anything with any depth. It seems that rec.boat.cruisng has a broader base of contributors with something other than chit-chat to post. RBC tends toward the serious side of boating subjects and does not tolerate funny business . . . they kicked ole’ McNeal off months ago.
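The claims traded in this exchange (“triple the posts, but less than half the posters”; “more than a quarter of [Group B’s] posts were crossposted”) are simple ratios over the two rows of figures. A quick check, reading the columns as the posting labels them, and assuming X is the count of cross-posted messages and Xp the number of target groups (an inference consistent with the “900 targets” remark):

```python
# Figures as posted for November: group A (ASA) and group B (RBC).
# P = posts, Ps = posters, X = cross-posted msgs, Xp = cross-post targets.
a = {"P": 6225, "Ps": 181, "X": 39, "Xp": 37}
b = {"P": 2258, "Ps": 445, "X": 640, "Xp": 900}

post_ratio = a["P"] / b["P"]          # ~2.8, loosely "triple the posts"
poster_ratio = a["Ps"] / b["Ps"]      # ~0.41, "less than half the posters"
b_crosspost_share = b["X"] / b["P"]   # ~0.28, "more than a quarter"

print(f"{post_ratio:.2f}x posts, {poster_ratio:.2f}x posters, "
      f"{b_crosspost_share:.0%} of B's posts cross-posted")
```

The same arithmetic underlies the posting’s reading of group A as “a much more active group”: more messages from fewer people.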


The structural perspective in comparative mode precipitates a discussion both of how wonderful and club-like the newsgroup is and of how off-topic subjects are pursued. No doubt such discussions can be irritating to members; more consequentially, they can lead newsgroups to splinter. However, a newsgroup’s interaction style and topics evolve over time, and the structural data provides an opportunity for these topics to be broached.

15.5 Conclusion

Netscan social accounting data is applied both to newsgroups and to participants: to distinguish useful newsgroups from those that are noisy or fractious, and to distinguish authors who may be regulars or interlopers, nice people or not so nice, and those whose responses have quality as well as quantity. Through these and other structural measures, newsgroups and their members are typified, categorised and understood from a perspective not possible (or excessively costly to construct manually) through a simple archive of messages. Social accounting tools present the historical and sociological tracks of a newsgroup and are used to perform functions that seem extremely similar to those performed by offline groups and organisations. Social accounting data is thus not merely useful for our understanding of Usenet newsgroups; it may be becoming a vital and commonly used tool for the members of these kinds of discussion spaces themselves.

011

011

011

011

11

305

1● ● ● ● ● ● ● ● ● ● ● ● ● ● ●

Inhabitant’s Uses and Reactions to Usenet Social Accounting Data

15

Page 313: Inhabited Information Spaces: Living with your Data

References

@Man (2000) Web Ad Blocking Under Linux/Unix, BeOS, MacOS and Windows.http://www.ecst.csuchico.edu/~atman/spam/adblock.shtml

Activeworlds (2003) Activeworlds Maps, http://www.activeworlds.com/community/maps.asp

Adler, D. (1996) Virtual Audio: Three-Dimensional Audio in Virtual Environ-ments. Swedish Institute of Computer Science (SICS), Internal Report ISRNSICS-T—96/03-SE.

Amazon.com (2002) http://www.amazon.com/webservicesAppelt, W. (1999) WWW Based Collaboration with the BSCW System, In

SOFSEM’99, Springer Lecture Notes in Computer Science 1725, Milovy,Czech Republic, pp. 66–78.

ATI (2003) Wireless Solutions. http://www.ati.com/products/builtwireless.htmlBadler, N., Palmer, M. and Bindiganavale, R. (1999) Animation Control for Real-

Time Virtual Humans. In: Communications of the ACM, 42 (8): 64–73, ACMPress.

Bannon, L. (1989) Shared Information Spaces: Cooperative User SupportNetworks. Mutual Uses of Cybernetics and Science, Amsterdam, 27 March–1April, University of Amsterdam.

Barker, R. (1968) Ecological Psychology, Stanford University Press, Stanford.Barrus, J. W., Waters, R. C. and Anderson, D. B. (1996) Locales: Supporting Large

Multiuser Virtual Environments. IEEE Computer Graphics and Applica-tions, 16(6), 50–57.

Bartle, R. (1990) Early MUDHistory, http://www.ludd.luth.se/mud/aber/mud-history.html

Begole, J. B., Tang, J. C., Smith, R. B. and Yankelovich, N. (2002) Work RhythmsAnalysing Visualisations of Awareness Histories of Distributed Groups, In Proceedings of the ACM 2002 Conference on Computer-SupportedCooperative Work – CSCW 2002, ACM, New Orleans, LO, pp. 334–343.

Benford, S. and Fahlén, L. E. (1993a) A Spatial Model of Interaction in LargeVirtual Environments. Paper presented at the 3rd European Conference onComputer Supported Cooperative Working, Milan, Italy.

Benford, S. D. and Fahlén, L. E. (1993b) Awareness, Focus, Nimbus and Aura –A Spatial Model of Interaction in Virtual Worlds. Paper presented at theHCI International 1993, Orlando, FL.

Benford, S., Snowdon, D., Greenhalgh, C., Inrgam, R., Knox, I., and Brown, C.(1995a) VR-VIBE: A Virtual Environment for Co-operative InformationRetrieval, Computer Graphics Forum 14(3) (Proceedings of Eurographics’95), 30 August–1 September, NCC Blackwell, pp. 349–360.

011

011

011

011

11

307

Page 314: Inhabited Information Spaces: Living with your Data

Benford, S., Bowers, J., Fahlén, L E, Greenhalgh, C., Mariani, J. and Rodden, T.(1995b) Networked Virtual Reality and Cooperative Work, Presence 4(1):364–386.

Benford, S., Brown, C., Reynard, G. and Greenhalgh, C. (1996) Shared Spaces:Transportation, Artificiality and Spatiality. In Proceedings of the ACMConference on Computer Supported Cooperative Work (CSCW’96), Boston,pp. 77–86, ACM Press.

Benford, S., Greenhalgh, C. and Lloyd, D. (1997a) Crowded Collaborative VirtualEnvironments. In Proceedings of ACM CHI’97, Atlanta, GA, USA, March1997, pp. 59–66.

Benford, S., Greenhalgh, C., Snowdon, D. and Bullock, A. (1997b) Staging aPublic Poetry Performance in a Collaborative Virtual Environment. In J.Hughes, W. Prinz, T. Rodden and K. Schmidt (eds.), Proceedings of the Fifth European Conference on Computer Supported Cooperative Work –ECSCW’97. 9–11 September, Lancaster, England. Kluwer AcademicPublishers, Dordrecht, pp. 125–140.

Benford, S. D., Snowdon, D. N., Brown, C. C., Reynard, G. T. and Ingram, R. J.(1997c) Visualising and Populating the Web: Collaborative VirtualEnvironments for Browsing, Searching and Inhabiting Webspace, InJENC’97 – Eighth Joint European Networking Conference, Edinburgh.

Benford, S. D., Brazier, C.-J., Brown, C., Craven, M., Greenhalgh, C., Morphett,J. and Wyver, J. (1998) Demonstration and Evaluation of InhabitedTelevision, eRENA Deliverable 3.1.

Benford, S., Greenhalgh, C. et al. (1999a) Broadcasting On-Line Social Interactionas Inhabited Television. In Bodker, S., Kyng, M. and Schmidt, K. (eds.),Proceedings of the Sixth European Conference on Computer SupportedCooperative Work – ECSCW’99. 12–16 September, Copenhagen, Denmark.Kluwer Academic Publishers, Dordrecht, pp. 129–198.

Benford, S., Bowers, J., Craven, M., Greenhalgh, C., Morphett, J., Regan, T.,Walker, G. and Wyver, J. (1999b) Evaluating Out of this World: AnExperiment in Inhabited Television, eRENA Deliverable D7a.1.

Benford, S., Norman, S. J., Bowers, J., Adams, M., Row Farr, J., Koleva, B.,Rinman, M.-L., Martin, K., Schnädelbach, H. and Greenhalgh, C. (1999b)Pushing Mixed Reality Boundaries, eRENA Deliverable D7b.1.

Benford, S., Bederson, B., Åkesson, K., Banyon, V., Druin, A., Hansson, P., et al.(2000) Designing Storytelling Technologies to Encourage Collaborationbetween Young Children. In Proceedings of ACM CHI’00, The Hague.

Berkeley Laboratory (2002) Introduction to the MBone Distributed SystemsDepartment Collaboration Technologies Group at Ernest Orlando LawrenceBerkeley National Laboratory, http://www-itg.lbl.gov/mbone/

Billinghurst, M., Karo, H. and Poupyrev, I. (2001) The MagicBook: A TransitionalAR Interface. Computer Graphics, 25: 745–753.

Bly, S. A., Harrison, S. R., and Irwin, S. (1993) Media Spaces: Bringing PeopleTogether in a Video, Audio, and Computing Environment, Communicationsof the ACM, 36(1): 28–47.

Blunck, A. (1998) The World Generator – The Engine of Desire, an InteractiveInstallation by Bill Seaman, eRENA Deliverable.

Bowers, J., Button, G. and Sharrock, W. (1995) Workflow from Within andWithout: Technology and Cooperative Work on the Print IndustryShopfloor. In Marmolin, H., Sunblad, Y. and Schmidt. K. (eds.), Proceedings

123456789101112345678920111234567893011123456789401112345611

References

308

Page 315: Inhabited Information Spaces: Living with your Data

of the Fourth European Conference on Computer Supported CooperativeWork – ECSCW’95. 10–14 September, Stockholm, Sweden. Kluwer Acade-mic Publishers, Dordrecht.

Bowers, J., Pycock, J. and O’Brien, J. (1996) Talk and Embodiment inCollaborative Virtual Environments. In Proceedings of CHI ‘96, pp. 58–65.

Bowers, J., Hellström, S.-O., Jää-Aro, K.-M., Söderberg, J., Bino, H. P. and Fahlén,L. E. (1998a) Constructing and Manipulating the Virtual: GestureTransformation, Soundscaping and Dynamic Environments for ExtendedArtistic Performance, eRENA Deliverable 2.2.

Bowers, J., Hellström, S.-O. and Jää-Aro, K.-M. (1998b) Making Lightwork: TheAlgorithmic Performance of Virtual Environments. In Bowers et al. (1998a),pp. 6–20.

Bowers, J., Norman, S. J., Staff, H., Schwabe, D., Wallen, L., Fleischmann, M. andSundblad, Y. (1998c) Extended Performances: Evaluation and Comparison,eRENA Deliverable D2.3.

Bowers, J. and Jää-Aro, K.-M. (1999) Blink: Exploring and Generating Contentfor Electronic Arenas. In Hirtes et al. (1999), chapter 6.

Bowers, J., Hellström, S.-O. and Jää-Aro, K.-M. (1999) Supporting EventManagement by Sonifying Participant Activity. In Hirtes et al. (1999),chapter 4.

Bowers, J., Jää-Aro, K.-M., Hellström, S.-O., Lintermann, B., Hoch, M., Drozd,A., Taylor, I. and Whitfield, G. (2000a) Production and Management ofEvents in Electronic Arenas, eRENA Deliverable 4.5.

Bowers, J., Jää-Aro, K.-M., Hellström, S.-O., Hoch, M. and Whitfield, G. (2000b)Production Support Tools for Electronic Arenas: Using Tangible Interfacesfor Media Editing. In Bowers et al. (2000a), chapter 3.

Bowers, J. (2001) Crossing the Line: A Field Study of Inhabited Television,Behaviour and Information Technology 20(2): 127–140.

Bowker, G. and Star, L. (1999) Sorting Things Out: Classification and itsConsequences. MIT Press, Cambridge, MA.

Boyd, D., Lee, H-Y., Ramage, D. and Donath, J. (2002) Developing LegibleVisualizations for Online Social Spaces, 35th Annual Hawaii InternationalConference on System Sciences, Vol. 4, p. 115. Institute of ElectricalEngineers.

Brave, S., Ishii, H. and Dahley, A. (1998) Tangible Interfaces for RemoteCollaboration and Communication. In Poltrock, S. and Grudin, J. (eds.),CSCW ‘98 Computer Supported Co-operative Work, AACM Press, Seattle,pp. 169–178.

Brooks, Jr, F. P. (1999) What’s Real About Virtual Reality?, IEEE ComputerGraphics and Applications, 19(6): 16–27, IEEE.

Brown, B., MacColl, I., Chalmers, I., Galani, A., Randell, C. and Steed, A. (2003)Lessons from the Lighthouse: Collaboration in a Shared Mixed RealitySystem. To appear in Proceedings of the ACM Computer–HumanInteraction (CHI03), Fort Lauderdale.

BSXMUD (1994) http://www.lysator.liu.se/mud/bsxmud.htmlBullock, A. (1997) Inhabiting the Web: Highlights from a Series of VR Meetings,

Video Proceedings Fifth European Conference on Computer SupportedCooperative Work (ECSCW ‘97), 7–11 September 1997, Lancaster, UK.

Bullock, A. and Gustafson, P. (2001) The VITI Program: Final Report, SICSTechnical Report T2001:02, March 2001, ISSN 1100–3154.

011

011

011

011

11

1● ● ● ● ● ● ● ● ● ● ● ● ● ● ●

References

309

Page 316: Inhabited Information Spaces: Living with your Data

Burgoon, M., Hunsaker, F. G. and Dawson, E. J. (1994) Human Communication(3rd edn). SAGE Publications, London.

Burka, L. P. (1995) http://www.apocalypse.org/pub/u/lpb/muddex/mudline.html

Burkhalter, B. (1999) Reading Race Online, Communities in Cyberspace.Routledge, London.

Büscher, M., Krogh, P., Mogensen, P. and Shapiro, D. (2001) Vision on the Move:Technologies for the Footloose. Appliance Design 1(1): 11–14.

Button, G. (1992) The Curious Case of the Vanishing Technology. In Button, G.(ed.), Technology in Working Order: Studies of Work, Interaction andTechnology. Routledge, London, pp. 10–28.

Buxton, W. A. S. (1993) Telepresence: Integrating Shared Task and PersonSpaces. In Baecker, R. M. (ed.), Readings in Groupware and ComputerSupported Cooperative Work: Assisting Human-human Collaboration.Morgan Kaufmann, San Mateo, CA, pp. 816–822.

Card, S., Mackinlay, J. and Schneiderman, B. (1999) Readings in InformationVisualization: Using Vision to Think. Morgan Kaufmann.

Carion, S., Beylot, P., Magnenat-Thalmann, N., Emering, L., Raupp Musse, S. andThalmann, D. (1998) Mixed Reality Dance Performance, eRENA Deliverable2.1.

Castro, M., Druschel, P., Kermarrec, A.-M. and Rowstron, A. (2002) SCRIBE: ALarge-scale and Decentralised Application-level Multicast Infrastructure.IEEE Journal on Selected Areas in Communications (JSAC) (Special issueon Network Support for Multicast Communications).

Chalmers, M. (1991) Seeing the World through Word-Coloured Glasses. InProceedings of the Second International Conference on Cyberspace,University of California Santa Cruz.

Chalmers, M. and Chitson, P. (1992) Bead: Explorations in InformationVisualisation. In Proceedings of the ACM Conference on InformationRetrieval (SIGIR’92), Copenhagen. Published as a special issue of SIGIRForum, June 1992, ACM Press, pp. 330–337.

Chalmers, M. (1993) Using a Landscape Metaphor to Represent a Corpus ofDocuments. In Proceedings of the European Conference on SpatialInformation Theory, Elba.

Chalmers, M., Rodden, K. and Brodbeck, D. (1998) The Order of Things: Activity-centred Information Access. In Proceedings of the World Wide Web(WWW98), Brisbane. Published as Computer Networks and ISDN Systems,30: 359–367.

Chalmers, M. (1999) Comparing Information Access Approaches. J. ASIS 50thAnniversary Issue, 50(12): 1108–1118.

Chalmers, M. (2002) Awareness, Representation and Interpretation. Journal ofComputer Supported Co-operative Work 11: 389–409.

Chase, P., Hyland, R., Merlino, A., Talant, A., Maybury, M. and Hollan, R. (1998)Semantic and Content Visualization. Coling-Acl 98 workshop: ContentVisualization and Intermedia Representations, Montreal, Canada, August1998.

Chen, Y., Katz, R.-H. and Kubiatowicz, J.-D. (2002) SCAN: A Dynamic Scalableand Efficient Content Distribution Network. In Proceedings of theInternational Conference on Pervasive Computing (Pervasive 2002), Zurich,Switzerland.

123456789101112345678920111234567893011123456789401112345611

References

310

Page 317: Inhabited Information Spaces: Living with your Data

Churchill, E. F. and Snowdon, D. (1998) Collaborative Virtual Environments: AnIntroductory Review. Virtual Reality: Research Developments and Applica-tions, 3: 3–15.

Churchill, E. and Bly, S. (1999) It’s All in the Words: Supporting Work Activitieswith Lightweight Tools. In Proc. Group ‘99, Phoenix, AZ, November 1999.

Churchill, E.F., Snowdon, D. and Munro, A. (2001) Collaborative VirtualEnvironments. Digital Places and Spaces for Interaction. Springer Verlag,London.

Churchland, P. M. and Churchland, P. S. (1998) On the Contrary: Critical Essays1987–1997. MIT Press, Cambridge, MA.

Colebourne, A., Mariani, J. and Rodden, T. (1996) Q-PIT: A PopulatedInformation Terrain. In Proceedings of Visual Data Exploration andAnalysis III, San José.

Craven, M., Benford, S., Greenhalgh, C. and Wyver, J. (2000) Third Demonstra-tion of Inhabited Television, eRENA Deliverable 7a.3.

Craven, M., Taylor, I., Drozd, A., Purbrick, J., Greenhalgh, C., Benford, S., Fraser,M., Bowers, J., Jää-Aro, K.-M., Lintermann, B. and Hoch, M. (2001)Exploiting Interactivity, Influence, Space and Time to Explore Non-LinearDrama in Virtual Worlds. In Proceedings of CHI 2001, pp. 30–37.

Cuddihy, E. and Walters, D. (2000) Embodied Interaction in Social VirtualEnvironments. In Proc. ACM CVE 2000, San Francisco, September 2000.

Donath, J. (1999) Identity and Deception in the Virtual Community, inCommunities in Cyberspace. Routledge, London.

Donath, J., Karahalios, K. and Viégas, F. (1999) Visualizing Conversations, 32ndAnnual Hawaii International Conference on System Sciences. Institute ofElectrical Engineers.

Dourish, P. and Bellotti, V. (1992) Awareness and Coordination in SharedWorkspaces, In Turner, J. and Kraut, R. (eds.), CSCW 92 – SharingPerspectives, ACM Press, Toronto, Canada, pp. 107–114.

Dourish, P., Adler, A., Bellotti, V. and Henderson, A. (1996) Your Place or Mine?Learning from Long-Term Use of Audio-Video Communication. Journal ofComputer Supported Co-operative Work, 5(1): 33–62.

Drexler, E. (1992) Nanosystems: Molecular Machinery, Manufacturing andComputation, John Wiley & Sons, New York.

Drozd, A., Bowers, J., Benford, S., Greenhalgh, C. and Fraser, M. (2001)Collaboratively Improvising Magic: An Approach to Managing Participationin an On-Line Drama. In Proceedings of ECSCW 2001, pp 159–178, Kluwer.

Dyck, J., and Gutwin, C. (2002) Groupspace: A 3D Workspace Supporting UserAwareness. In: Extended Abstracts of CHI 2002, Minneapolis, MN, pp.502–503, ACM Press.

Edelman, G. and Tononi, G. (2000) Consciousness: How Matter BecomesImagination. Allen Lane Penguin Press.

Electronic Arts (2003) Ultima Online, http://www.uo.com/Erdelez, S. (1999) Information Encountering: It’s More Than just Bumping into

Information, Bulletin of the American Society for Information Science, 25:25–29.

Erickson, T., Smith, D. N. and Kellogg, W. A. (1999) Socially TranslucentSystems: Social Proxies, Persistent Conversation, and the Design of“Babble”. Proceedings of the Conference on Human Factors in ComputingSystems. ACM Press, New York.

011

011

011

011

11

311

1● ● ● ● ● ● ● ● ● ● ● ● ● ● ●

References

Page 318: Inhabited Information Spaces: Living with your Data

Evard, R. (1993) Collaborative Networked Communication – MUD as SystemsTools, in Seventh USENIX Systems Administration Conference Proceedings,pp. 1–8, Monterey, CA, November.

Evard, R., Churchill, E. and Bly, S. (2001) Waterfall Glen: Social Virtual Realityat Work. In Churchill, E., Snowdon, D. and Munro, A. (eds.), CollaborativeVirtual Environments, Springer.

EverQuest (2003) http://www.station.sony.com/Fahlman (2003) http://www-2.cs.cmu.edu/~sef/sefSmiley.htmFairclough, M. (1986) The Webbs’ Revenge ? Conditional Degeneration and

Producer Co-operatives: A Reappraisal of the Socialist Tradition. NationalConference for Research on Worker Co-operative, Co-operatives ReseachUnit, Open University, London.

Fairclough, M. (1987) Mondragon in Context, Department of Sociology,University of Bristol. Bristol.

Falk, H. and Dierking, L. (1992) The Museum Experience. Whalesback Books,Washington.

Faloutsos, P., Vanne de Panne, M. and Terzopoulos, D. (2001) ComposableControllers for Physics-based Character Animation. In: Proceedings of SIG-GRAPH 2001, pp. 251–260, ACM Press.

Farley, T. (2001) TelecomWriting.com’s Telephone History Series http://www.privateline.com/TelephoneHistory/History1.htm

Fiore, A. T., Lee Tiernan, S. and Smith, M. (2001) Observed Behavior andPerceived Value of Authors in Usenet Newsgroups: Bridging the Gap. InProceedings of the Conference on Human Factors in Computing Systems.ACM Press, New York.

Fjeld, M., Voorhorst, F., Bichsel, M., Lauche, K., Rauterberg, M. and Krueger, H.(1999) Exploring Brick-based Navigation and Composition in anAugmented Reality. In Gellersen, H.-W. (ed.), Handheld and UbiquitousComputing. Vol. 1707, Springer-Verlag, Berlin, pp. 102–116.

Floyd, S., Jacobson,. V., McCanne, S., Liu, C.-G. and Zhang, L. (1995) A ReliableMulticast Framework for Light-Weight Sessions and Application LevelFraming. In Proceedings of ACM SIGCOMM 95, ACM Press, New York, pp.242–256.

Fraser, M., Benford, S., Hindmarsh, J. and Heath, C. (1999) SupportingAwareness and Interaction through Collaborative Virtual Interfaces. InProceedings of UIST’99, pp. 27–36, ACM Press.

Fraser, M., Glover, T., Vaghi, I., Benford, S., Greenhalgh, C., Hindmarsh, J. andHeath, C. (2000) Revealing the Reality of Collaborative Virtual Reality. InProceedings of the Third ACM Conference on Collaborative VirtualEnvironments (CVE 2000), San Francisco, CA, September 2000, pp. 29–37,ACM Press.

Frécon, E. and Stenius, M. (1998) DIVE: A Scaleable Network Architecture forDistributed Virtual Environments. Distributed Systems Engineering Journal(DSEJ), 5: 91–100, Special Issue on Distributed Virtual Environments.

Frécon, E. and Avatare Nöu, A. (1998) Building Distributed Virtual Environmentsto Support Collaborative Work. In Proceedings of ACM Symposium onVirtual Reality Software and Technology (VRST ‘98), Taipei, Taiwan, pp.105–113.

Frécon, E. and Smith, G. (1998) WebPath – A Three-dimensional Web History.In Proceedings of the IEEE Symposium on Information Visualization

123456789101112345678920111234567893011123456789401112345611

References

312

Page 319: Inhabited Information Spaces: Living with your Data

(InfoVis ‘98), part of IEEE Visualization 1998 (Vis98), NC, USA, pp. 3–10.

Frécon, E., Greenhalgh, C. and Stenius, M. (1999) The DiveBone: An Application-Level Network Architecture for Internet-Based CVEs. Paper presented at theVRST’99 – Symposium on Virtual Reality Software and Technology 1999,20–22 December, University College London, UK.

Frécon, E. and Smith, G. (1999) Semantic Behaviours in Collaborative VirtualEnvironments. In Proceedings of Virtual Environments ‘99 (EGVE’99), pp.95–104, Vienna, Austria.

Frécon, E., Smith, G., Steed, A., Stenius, M. and Ståhl, O. (2001) An Overview ofthe COVEN Platform, Presence: Teleoperators and Virtual Environments,10(1): 109–127.

Fruchterman, T. M. J. and Reingold, E. M. (1991) Graph Drawing by Force-directed Placement. Software Practice and Experience 21(11): 1129–1164.

Fuchs, H. (1998) Beyond the Desktop Metaphor: Toward More Effective Display,Interaction, and Telecollaboration in the Office of the Future via a Multitudeof Sensors and Displays, AMCP:98, pp. 30–43, Osaka, Japan.

Fuchs, L., Pankoke-Babatz, U. and Prinz, W. (1995) Supporting CooperativeAwareness with Local Event Mechanisms: The GroupDesk System, InMarmolin, H., Sundblad, Y. and Schmidt, K. (eds.), Fourth EuropeanConference on Computer-Supported Cooperative Work: ECSCW ‘95,Kluwer Academic Publishers, Stockholm, pp. 247–262.

Fuchs, L. (1999) AREA: A Cross-application Notification Service for Groupware,In Bødker, S., Kyng, M. and Schmidt, K. (eds.), ECSCW’99: Sixth Conferenceon Computer Supported Cooperative Work, Kluwer Academic Publishers,Copenhagen, pp. 61–80.

Fussell, S. R., Kraut, R. E. and Siegel, J. (2000) Coordination of Communication:Effects of Shared Visual Context on Collaborative Work, In Whittaker, S.and Kellog, W. (eds.), CSCW 2000, ACM, Philadelphia, PA, pp. 21–30.

Galani, A. and Chalmers, M. (2002) Can You See Me? Exploring Co-Visitingbetween Physical and Virtual Visitors. Proc. Museums and the Web.Archives & Museum Informatics, Boston, USA.

Garcia, A. C. and Jacobs, J. B. (1999) The Eyes of the Beholder: Understandingthe Turn-Taking System in Quasi-Synchronous Computer-Mediated Com-munication, Research on Language and Social Interaction, 32(4): 337–367.

Gaver, B. (2002) Provocative Awareness. Computer Supported CooperativeWork: The Journal of Collaborative Computing (Special Issue onAwareness) 11(3–4): 475–493.

Gaver, W. W. (1992) The Affordance of Media Spaces for Collaboration. InTurner, J. and Kraut, R. (eds.), CSCW ‘92: Conference on ComputerSupported Cooperative Work – Sharing Perspectives, ACM Press, Toronto,Canada, pp. 17–24.

Gaver, W., Moran, T., MacLean, A., Lövstrand, L., Dourish, P., Carter, K., andBuxton, W. (1992) Realizing a Video Environment: EuroPARC’s RAVESystem, Proceedings of CHI’92. ACM, New York, pp. 27–35.

Gaver, W., Sellen, A., Heath, C., and Luff, P. (1993) One Is Not Enough: MultipleViews in a Media Space. In Proceedings of INTERCHI ‘93, pp. 335–341, ACMPress.

Gavin, L., Mottram, C., Penn, A. and Kueppers, S. (2000) Space Module – TOWERDeliverable D3.1.

011

011

011

011

11

313

1● ● ● ● ● ● ● ● ● ● ● ● ● ● ●

References

Page 320: Inhabited Information Spaces: Living with your Data

Gibson, J. J. (1986) The Ecological Approach to Visual Perception, LawrenceErlbaum Associates, Hillsdale, NJ.

Gibson, W. (1986) Neuromancer, Victor Gollancz, London.Giddens, A. (1984) The Constitution of Society, Polity Press, Cambridge, p. 402.Goodwin, C. and Goodwin, M. (1996) Formulating Planes: Seeing as a Situated

Activity. In Engestrom, Y. and Middleton, D. (eds.), Communication andCognition at Work. Cambridge University Press, New York, 61–95.

Google Groups, http://groups.google.com/Greenhalgh, C. M., and Benford, S. D. (1995) MASSIVE: A Virtual Reality System

for Tele-conferencing, ACM Transactions on Computer Human Interfaces(TOCHI), 2 (3), pp. 239–261, ACM Press, September.

Greenhalgh, C., Bullock, A., Tromp, J. and Benford, S. (1997) Evaluating theNetwork and Usability Characteristics of Virtual Reality Tele-conferencing,BT Technology Journal, 15(4), October.

Greenhalgh, C. (1998) Analysing Awareness Management in Distributed VirtualEnvironments. Paper presented at the Second Annual Workshop on SystemAspects of Sharing a Virtual Reality, at CVE’98, Manchester, UK.

Greenhalgh, C. M. (1999) Large Scale Collaborative Virtual Environments.Springer-Verlag. London.

Greenhalgh, C., Bowers, J., Walker, G., Wyver, J., Benford, S. and Taylor, I. (1999)Creating a Live Broadcast from a Virtual Environment. In Proceedings ofSIGGRAPH ‘99, pp. 375–384.

Greenhalgh, C. and Benford, S. (1999) Supporting Rich and DynamicCommunication in Large-Scale Collaborative Virtual Environments.Presence: Teleoperators Virtual Environments, 8(1): 14–35.

Greenhalgh, C., Purbrick, J., Benford, S., Craven, M., Drozd, A. and Taylor, I.(2000a) Temporal Links: Recording and Replaying Virtual Environments,In Proceedings of the Eighth ACM International Conference on Multimedia(MM 2000), ACM Press, pp. 67–74.

Greenhalgh, C., Purbrick, J. and Snowdon, D. (2000b) Inside MASSIVE-3:Flexible Support for Data Consistency and World Structuring. Paper pre-sented at the Third ACM Conference on Collaborative Virtual Environments(CVE 2000), San Francisco, CA.

Greenhalgh, C. (2001) Understanding the Network Requirements of Collabora-tive Virtual Environments. In Churchill, E., Snowdon, D. and Munro, A.(eds.), Collaborative Virtual Environments: Digital Places and Spaces forInteraction. Springer Verlag, London, pp. 56–76.

Grondin, J. (1994) Introduction to Philosophical Hermeneutics. Trans. J.Weinsheimer. Yale University Press.

Gross, T. (2002) Ambient Interfaces in a Web-Based Theatre of Work, InProceedings of the Tenth Euromicro Workshop on Parallel, Distributed, andNetwork-Based Processing – PDP 2002, IEEE Computer Society Press, GranCanaria, Spain, pp. 55–62.

Gross, T. and Prinz, W. (2000) Gruppenwahrnehmung im Kontext, In Reichwald,R. and Schlichter, J. (eds.), Verteiltes Arbeiten – Arbeit der Zukunft,Tagungsband der D-CSCW 2000. B.G. Teubner, Stuttgart/ Leipzig/Wiesbaden, pp. 115–126.

Gutwin, C. and Greenberg, S. (1998) Design for Individuals, Design for Groups:Trade-offs in Power and Workspace Awareness. In Proceedings of CSCW’98,pp. 207–216, ACM Press.

123456789101112345678920111234567893011123456789401112345611

References

314

Page 321: Inhabited Information Spaces: Living with your Data

Guye-Vuillème, A., Capin, T. K., Pandzic, I. S., Thalmann, N. M. and Thalmann,D. (1999) Non-verbal Communication Interface for Collaborative VirtualEnvironments. The Virtual Reality Journal, 4: 49–59, Springer.

Harper, R. H. R., Hughes, J., Randall, D., Shapiro, D. and Sharrock, W. (1998)Order in the Skies: Sociology, CSCW, and Air Traffic Control. Routledge,London.

Harper, R. R., Hughes, J. A. and Shapiro, D. Z. (1989) Working in Harmony: AnExamination of Computer Technology in Air Traffic Control. EC-CSCW ‘89.Proceedings of the First European Conference on Computer SupportedCooperative Work, Gatwick, London, 13–15 September.

Harrison, S. and Dourish, P. (1996) Re-Place-ing Space: The Roles of Place andSpace in Collaborative Systems. In Proceedings of the ACM Conference onComputer Supported Co-operative Work, ACM Press, pp. 67–76.

Heath, C. and Luff, P. (1991) Collaborative Activity and Technological Design:Task Coordination in London Underground Control Rooms, In Bannon, L.,Robinson, M. and Schmidt, K. (eds.), Second European Conference onComputer Supported Cooperative Work, Kluwer, Amsterdam, pp. 65–80.

Heath, C., Jirotka, M. Luff, P. and Hindmarsh, J. (1993) Unpacking Collaboration:The Interactional Organisation of Trading in a City Dealing Room. InMichelis, G. de, Simone, C. and Schmidt, K. (eds.), Proceedings of the ThirdEuropean Conference on Computer Supported Cooperative Work – ECSCW‘93. 13–17 September, Milan, Italy. Kluwer Academic Publishers, Dordrecht.

Heath, C. and Luff, P. (1996) Convergent Activities: Line Control and PassengerInformation on the London Underground. In Engestrom, Y. and Middleton,D. (eds.), Communication and Cognition at Work. Cambridge UniversityPress, New York, pp. 96–129.

Heath, C. C., Luff, P., Kuzuoka, H., Yamazaki, K. (2001) Creating CoherentEnvironments for Collaboration, Proceedings of ECSCW 2001, Bonn,Germany, pp. 119–138, Kluwer.

Hillier, B. and Hanson, J. (1984) The Social Logic of Space. Cambridge UniversityPress, Cambridge.

Hindmarsh, J., Fraser, M., Heath, C., Benford, S. and Greenhalgh, C. (1998)Fragmented Interaction: Establishing Mutual Orientation in VirtualEnvironments. In: Proceedings of CSCW’98, Seattle, WA, USA, pp. 217–226,ACM Press.

Hirtes, S., Hoch, M., Lintermann, B., Norman, S. J., Bowers, J., Jää-Aro, K.-M.,Hellström, S.-O. and Carlzon, M. (1999) Production Tools for ElectronicArenas: Event Management and Content Production, eRENA DeliverableD4.3/D4.4.

Hoch, M., Jää-Aro, K.-M. and Bowers, J. (1999a) Round Table: A PhysicalInterface for Virtual Camera Deployment in Electronic Arenas. In Hirtes etal. (1999), chapter 5.

Hoch, M., Schwabe, D., Shaw, J., Staff, H., Raupp Musse, S., Garat, F., Thalmann,D., Jää-Aro, K.-M., Bowers, J. and Hellström, S.-O. (1999b) Individual andGroup Interaction, eRENA Deliverable 6.3.

Hood, E. (2000) Ad blocking. http://www.nacs.uci.edu/indiv/ehood/gems/ad-blocking.html

Hubbold, R., Cook, J., Keates, M., Gibson, S., Howard, T., Murta, A., West, A.,and Pettifer, S. (2001) GNU/MAVERIK: A Micro-kernel for Large-scaleVirtual Environments. Presence: Teleoperators and Virtual Environments,10: 22–34.


Hughes, J. A., Prinz, W., Rodden, T. and Schmidt, K. (eds.) (1997) Proceedings of the Fifth European Conference on Computer-Supported Cooperative Work.

IEEE 1278.1 (1995) Standard for Distributed Interactive Simulation – Application Protocols. Institute of Electrical and Electronics Engineers.

Ingram, R. and Benford, S. (1995) Legibility Enhancement for Information Visualisation. In Proceedings of Visualization '95, Atlanta, GA, November.

Insley, J., Sandin, D. and DeFanti, T. (1997) Using Video to Create Avatars in Virtual Reality. Visual Proceedings of the 1997 SIGGRAPH Conference, Los Angeles, CA, p. 128.

Intel (2003) Intel Graphics Performance Primitives. http://developer.intel.com/design/pca/applicationsprocessors/swsup/gpp.htm

Ishii, H., Kobayashi, M. and Grudin, J. (1992) Integration of Inter-Personal Space and Shared Workspace: ClearBoard Design and Experiments. Proceedings of ACM CSCW '92 Conference on Computer-Supported Cooperative Work, pp. 33–42.

Ishii, H. and Kobayashi, M. (1993) ClearBoard: A Seamless Medium for Shared Drawing and Conversation with Eye Contact. In R. M. Baecker (ed.), Readings in Groupware and Computer Supported Cooperative Work: Assisting Human-human Collaboration. Morgan Kaufmann, San Mateo, CA, pp. 829–836.

Ishii, H. and Ullmer, B. (1997) Tangible Bits: Towards Seamless Interfaces between People, Bits, and Atoms. In Proceedings of ACM CHI'97, Atlanta, pp. 234–241.

Ishii, H., Wisneski, C., Orbanes, J., Chun, B. and Paradiso, J. (1999) PingPongPlus: Design of an Athletic-Tangible Interface for Computer-Supported Cooperative Play. In Proceedings of CHI'99, Pittsburgh.

Jää-Aro, K.-M., Bowers, J. M. and Hellström, S.-O. (1999) Activity-Oriented Navigation. In Hoch et al. (1999b), pp. 45–52.

Jää-Aro, K.-M. and Snowdon, D. (2001) How Not to Be Objective. In Collaborative Virtual Environments. Springer-Verlag, London, pp. 143–159.

Jancke, G., Grudin, J. and Gupta, A. (2000) Presenting to Local and Remote Audiences: Design and Use of the TELEP System. In Proc. CHI 2000, April, ACM Press, pp. 384–391.

Johansson, M. (1998) Designing an Environment for Distributed Real-Time Collaboration. IWNA '98, Kyoto.

Jones, M. L. W. (2000) Collaborative Virtual Conferences: Using Exemplars to Shape Future Research Questions. ACM CVE 2000, San Francisco, September.

Johnson, B. and Shneiderman, B. (1991) Tree-Maps: A Space-filling Approach to the Visualization of Hierarchical Information Structures. In Second International IEEE Visualization Conference, IEEE Press, San Diego, CA, pp. 284–291.

Kaplan, S., Leigh Star, S., Tolone, W. J. and Bignoli, C. (1994) Grounding the Metaphor of Space in CSCW: Meta-Structures and Boundary Objects. Draft MS.

Kapolka, A., McGregor, D. and Capps, M. (2002) A Unified Component Framework for Dynamically Extensible Virtual Environments. In Proceedings of the Fourth International Conference on Collaborative Virtual Environments (CVE 2002), Bonn, Germany, pp. 64–71.

Kelly, S. U., Sung, C. and Farnham, S. (2002) Designing for Improved Social Responsibility, User Participation and Content in On-line Communities. Proceedings of the Conference on Human Factors in Computing Systems. ACM Press, New York.

Kharif, O. and Salkever, A. (2001) A Chat with the Master of P2P. In Business Week: Special Report: Peer to Peer, 1 August.

Koleva, B., Schnädelbach, H., Benford, S. and Greenhalgh, C. (2001) Experiencing a Presentation through a Mixed Reality Boundary. In Proc. ACM SIGGROUP Conference on Supporting Group Work (GROUP'01), Boulder, CO, pp. 71–80, ACM Press.

Kreuger, W. and Froehlich, B. (1994) The Responsive Workbench. Computer Graphics and Applications 14(3): 12–15.

Kuzuoka, H., Kosuge, T. and Tanaka, M. (1994) GestureCam: A Video Communication System for Remote Collaboration. In CSCW '94: Transcending Boundaries, Chapel Hill, NC, ACM.

Landry, C., Morley, D., Southwood, R. and Wright, P. (1986) What a Way to Run a Railroad: An Analysis of Radical Failure. Comedia, London.

Lamport, L. (1978) Time, Clocks, and the Ordering of Events in a Distributed System. Communications of the ACM, 21(7): 558–565.

Lantermann, E. D. (1980) Interaktionen – Person, Situation und Handlung. Urban und Schwarzenberg, Munich.

Lave, J. and Wenger, E. (1991) Situated Learning: Legitimate Peripheral Participation. Cambridge University Press, Cambridge.

Lawley, E. (1994) The Sociology of Culture in Computer-mediated Communication: An Initial Exploration. Rochester Institute of Technology, available at: http://www.itcs.com/elawley/bourdieu.html.

Leach, N. (ed.) (1997) Rethinking Architecture: A Reader in Cultural Theory. Routledge, London.

Lee, A., Girgensohn, A. and Schlueter, K. (1997) NYNEX Portholes: Initial User Reactions and Redesign Implications. In Hayne, S. and Prinz, W. (eds.), Group '97 Conference, ACM Press, Phoenix, AZ, pp. 385–394.

Lee, W., Goto, T., Raupp Musse, S., Aubel, A., Garat, F. and Davary, M. (2000) Participants: Individual Virtual Humans and Crowds Simulation in Collaborative Virtual Environment. eRENA Deliverable 5.4.

Leigh, J., Johnson, A., DeFanti, T. et al. (1999) A Review of Tele-Immersive Applications in the CAVE Research Network. In Proceedings of IEEE VR '99, Houston, TX, 13–17 March 1999, pp. 180–187.

Leigh, J., Johnson, A. E., Park, K. S., Cho, Y. J., Scharver, C., Krishnaprasad, N. K. and Lewis, M. J. (2000) CAVERNsoft G2: A Toolkit for High Performance Tele-Immersive Collaboration. Paper presented at the Symposium on Virtual Reality Software and Technology, Seoul, Korea.

Leont'ew, A. N. (1977) Tätigkeit, Bewußtsein, Persönlichkeit. Klett, Stuttgart.

Levine, R. (1997) A Geography of Time. Basic Books, New York.

Lloyd, D., Steed, A., Bullock, A., Greenhalgh, C. and Frécon, E. (2001) Making Collaborative Environments Work. Presence: Teleoperators and Virtual Environments, 10(2), April.

Lövstrand, L. (1991) Being Selectively Aware with the Khronika System. In Bannon, L., Robinson, M. and Schmidt, K. (eds.), 2nd European Conference on Computer Supported Cooperative Work, Kluwer Academic Publishers, Amsterdam, pp. 265–277.

Luff, P., Hindmarsh, J. and Heath, C. C. (eds.) (2000) Workplace Studies: Recovering Work Practice and Informing System Design. Cambridge University Press, Cambridge.


Lynch, M. (1991a) Pictures of Nothing? Visual Construals in Social Theory. Sociological Theory 9(1): 1–21.

Lynch, M. (1991b) Science in the Age of Mechanical Reproduction: Moral and Epistemic Relations Between Diagrams and Photographs. Biology and Philosophy 6: 205–226.

MacColl, I., Millard, M., Randell, C., Steed, A., Brown, B. et al. (2002) Shared Visiting on EQUATOR City. In Proc. 4th International Conference on Collaborative Virtual Environments (CVE 2002), Bonn, Germany, pp. 88–94. ACM Press.

Macedonia, M. R. and Brutzman, D. P. (1994) MBone Provides Audio and Video Across the Internet. IEEE Computer, 27(4): 30–36.

Macedonia, M. R., Zyda, M. J., Pratt, D. R., Barham, P. T. and Zeswitz, S. (1994) NPSNET: A Network Software Architecture for Large Scale Virtual Environments. Presence: Teleoperators and Virtual Environments, 3(4).

Macedonia, M. R., Zyda, M. J., Pratt, D. R., Brutzman, D. P. and Barham, P. T. (1995) Exploiting Reality with Multicast Groups: A Network Architecture for Large Scale Virtual Environments. Paper presented at the IEEE Virtual Reality Annual Symposium, RTP, North Carolina, 11–15 March.

Maher, M. L., Simoff, S. J. and Cicognani, A. (2000) Understanding Virtual Design Studios. Springer Verlag.

Manohar, N. and Prakash, A. (1995) The Session Capture and Replay Paradigm for Asynchronous Collaboration. In Proceedings of the Fourth European Conference on Computer-Supported Cooperative Work – ECSCW'95, Kluwer Academic Publishers, Stockholm, Sweden, pp. 149–164.

Mariani, J. and Rodden, T. (eds.) (1999) The Library Abstract eSCAPE Demonstrator. eSCAPE Esprit Project 25377, Deliverable 4.1.

Marratech (2003) http://www.marratech.com

Marsh, J., Pettifer, S. and West, A. (1999) A Technique for Maintaining Continuity of Perception in Networked Virtual Environments. Proc. UKVRSIG'99. Salford University Press.

Massey, D. (2003) http://www.bellsystemmemorial.com/picturephone.html

McGrath, A. (1998) ACM SIGGROUP Bulletin, 19: 21–24.

Miller, D. C., Pope, A. C. and Waters, R. M. (1989) Long-Haul Networking of Simulators. Paper presented at the Tenth Interservice/Industry Training Systems Conference, Orlando.

Minar, N. and Hedlund, M. (2001) A Network of Peers: Peer-to-Peer Models Through the History of the Internet. In A. Oram (ed.), Peer-to-Peer: Harnessing the Power of Disruptive Technologies. O'Reilly & Associates.

Mira Lab (2003) http://miralabwww.unige.ch/

Morrison, A., Ross, G. and Chalmers, M. (2002) A Hybrid Layout Algorithm for Sub-Quadratic Multidimensional Scaling. In Proc. IEEE Information Visualisation, Boston, pp. 152–160.

Murray, C. D., Bowers, J. M., West, A. J., Pettifer, S. R. and Gibson, S. (2000) Navigation, Wayfinding and Place Experience within a Virtual City. Presence: Teleoperators and Virtual Environments, 9: 435–447, MIT Press.

Munro, A., Höök, K. and Benyon, D. (1999) Footprints in the Snow. In Munro, Höök and Benyon (eds.), Social Navigation of Information Space, pp. 1–14, Springer.

Nardi, B., Schwartz, H., Kuchinsky, A., Leichner, R., Whittaker, S. and Sclabassi, R. (1993) Turning Away from Talking Heads: The Use of Video-as-Data in Neurosurgery. Proc. INTERCHI '93, Amsterdam, 22–29 April, ACM.


Netscan: A Social Accounting Search Engine. http://netscan.research.microsoft.com

Noriega, P. and Sierra, C. (1999) Towards a Formal Specification of Complex Social Structures in Multi-agent Systems. Lecture Notes in Artificial Intelligence 1624, pp. 284–300. Springer-Verlag.

Norman, S. J., Staff, H., Schwabe, D. and Wallen, L. (1998) Extended Performance Staging: Background and Evaluation. In Bowers et al. (1998c), chapter 2.

Nöth, W. (1995) Handbook of Semiotics. Indiana University Press, Bloomington, IN.

Oliveira, M., Crowcroft, J. and Slater, M. (2000) Component Framework Infrastructure for Virtual Environments. In Proceedings of the Third International Conference on Collaborative Virtual Environments (CVE 2000), San Francisco, CA, pp. 139–146.

Pankoke-Babatz, U. and Syri, A. (1997) Collaborative Workspaces for Time Deferred Electronic Cooperation. In Hayne, S. and Prinz, W. (eds.), GROUP '97: International ACM SIGGROUP Conference on Supporting Group Work, ACM Press, Phoenix, AZ, pp. 187–196.

Pankoke-Babatz, U. (2000) Electronic Behaviour Settings for CSCW. AI and Society, 14(1): 3–30.

Patterson, J. F., Day, M. and Kucan, J. (1996) Notification Servers for Synchronous Groupware. In Ackermann, M. S. (ed.), Conference on Computer Supported Cooperative Work (CSCW'96), ACM Press, Boston, MA, pp. 122–129.

Pettifer, S. (ed.) (1999) eSCAPE Systems, Techniques and Infrastructures. eSCAPE Esprit Project 25377, Deliverable 5.1.

Pettifer, S., Cook, J., Marsh, J. and West, A. (2000) Deva3: Architecture for a Large-scale Virtual Reality System. In Proceedings of the ACM Symposium on Virtual Reality Software and Technology 2000, pp. 33–39. ACM Press.

Pettifer, S., Cook, J. and Mariani, J. (2001) Towards Real-time Interactive Visualisation in Virtual Environments: A Case Study of Q-space. In Proceedings of the International Conference on Virtual Reality 2001, pp. 121–129. ISTIA Innovations, Laval, France.

Plaza, E., Arcos, J. L., Noriega, P. and Sierra, C. (1998) Competing Agents in Agent-mediated Institutions. Personal Technologies, 2: 212–220.

Polycom (2003) http://www.polycom.com/products_services/products_groups/0,1422,pw-186–186–72,00.html

Prinz, W. (1999) NESSIE: An Awareness Environment for Cooperative Settings. In Bødker, S., Kyng, M. and Schmidt, K. (eds.), ECSCW'99: Sixth Conference on Computer Supported Cooperative Work, Kluwer Academic Publishers, Copenhagen, pp. 391–410.

Project Oxygen (2003) http://oxygen.lcs.mit.edu/E21.html

Prussak, L. (1997) Knowledge in Organizations. Butterworth-Heinemann, Oxford.

Raja, V. (1998) The Cybersphere. http://www.vr-systems.ndtilda.co.uk/sphere1.htm

Randell, C. and Muller, H. (2001) Low Cost Indoor Positioning System. In Proc. UbiComp 2001: Ubiquitous Computing, pp. 42–48, Springer.

Ratnasamy, S., Handley, M., Karp, R. and Shenker, S. (2001) Application-level Multicast using Content-Addressable Networks. In Proceedings of the Third International Workshop on Networked Group Communication (NGC '01), London.

Reynolds, C. (1987) Flocks, Herds, and Schools: A Distributed Behavioral Model. Computer Graphics, 21(4): 25–34.

Richardson, T., Stafford-Fraser, Q., Wood, K. R. and Hopper, A. (1998) Virtual Network Computing. IEEE Internet Computing, 2(1): 33–38.

Rinman, M.-L. (2002) Forms of Interaction in Mixed Reality Media Performance – A Study of the Artistic Event DESERT RAIN. Fil.lic. thesis, Royal Institute of Technology, TRITA-NA-0214.

Rivera, K., Cooke, N. J. and Bauhs, J. A. (1996) The Effects of Emotional Icons on Remote Communication. CHI Conference Companion, April.

Roberts, D. J. (1996) A Predictive Real Time Architecture for Multi-User, Distributed, Virtual Reality. Unpublished PhD thesis, University of Reading, Reading, UK.

Roberts, D. J., Lake, T. W. and Sharkey, P. M. (1998) Optimising Exchange of Attribute Ownership in the DMSO RTI. Paper presented at the Simulation Interoperability Workshop, SISO, Orlando, USA.

Roberts, D. J., Strassner, J., Worthington, B. G. and Sharkey, P. (1999) Influence of the Supporting Protocol on the Latencies Induced by Concurrency Control within a Large Scale Multi User Distributed Virtual Reality System. Paper presented at the International Conference on Virtual Worlds and Simulation (VWSIM), SCS Western Multi-conference '99, San Francisco, CA.

Robertson, G., Czerwinski, M. and van Dantzich, M. (1997) Immersion in Desktop Virtual Reality. In Proceedings of UIST'97, pp. 11–19, ACM Press.

Robinson, M., Pekkola, S., Korhonen, J., Hujala, S., Toivonen, T. and Saarien, M.-J. O. (2001) Extending the Limits of Collaborative Virtual Environments. In Snowdon, D., Churchill, E. F. and Munro, A. J. (eds.), Collaborative Virtual Environments: Digital Places and Spaces for Interaction. Springer Verlag, London, pp. 21–42.

Rorty, R. (1991) Essays on Heidegger and Others: Philosophical Papers Volume 2. Cambridge University Press, Cambridge.

Rosen, E. (1996) Personal Videoconferencing. Manning Publications Co.

Roseman, M. and Greenberg, S. (1996) TeamRooms: Network Places for Collaboration. In Ackermann, M. S. (ed.), Conference on Computer Supported Cooperative Work (CSCW'96), ACM Press, Boston, MA, pp. 325–333.

Sack, W. (2000) Conversation Map: A Content-Based Usenet Newsgroup Browser. Proceedings of the International Conference on Intelligent User Interfaces, New Orleans, LA, ACM Press.

Salem, B. and Earle, N. (2000) Designing a Non-verbal Language for Expressive Avatars. In Proceedings of CVE 2000, San Francisco, CA, pp. 93–101, ACM Press.

Salvador, T. (1998) Communities, Schumanities. SIGGROUP Bulletin 19(2): 37–39.

Sandor, O., Bogdan, C. and Bowers, J. (1997) Aether: An Awareness Engine for CSCW. In Hughes et al. (1997), pp. 221–236.

Sartre, J. P. (1965) The Psychology of Imagination. The Citadel Press, New York.

SAS (2002) SAS press release, 10 April 2002, "Personal meetings lead to more business", available from http://www.scandinavian.net/EC/Appl/Core/Templ/PRView/0,3463,SO%253D0%2526CID%253D434109%2526MKT%253DSE,00.html


Saussure, F. de (1983) Course in General Linguistics. Trans. Wade Baskin. McGraw-Hill. (Originally published in 1916.)

Savolainen, R. (1995) Everyday Life Information Seeking: Approaching Information Seeking in the Context of "Way of Life". Library and Information Science Research 17: 259–294.

Schäfer, K., Brauer, V. and Bruns, F. (1997) A New Approach to Human-Computer Interaction: Synchronous Modeling in Real and Virtual Spaces. In Proceedings of DIS'97, Amsterdam.

Schindler Jr, G. E. (ed.) (1969) Bell Laboratories RECORD, Vol. 47, No. 5, May/June 1969. Available online at http://www.bellsystemmemorial.com/pdf/picturephone.pdf

Schlichter, J. H., Koch, M. and Bürger, M. (1998) Workspace Awareness for Distributed Teams. In Conen, W. and Neumann, G. (eds.), Coordination Technology for Collaborative Applications – Organisations, Processes and Agents, Lecture Notes in Computer Science, Springer, Berlin, pp. 197–218.

Schmidt, K. and Bannon, L. (1992) Taking CSCW Seriously: Supporting Articulation Work. CSCW 1(1): 7–40.

Schreer, O. and Kauff, P. (2002) An Immersive 3D Video-Conferencing System Using Shared Virtual Team User Environments. ACM CVE 2002, Bonn, Germany.

Schutz, A. (1970) On Phenomenology and Social Relations. University of Chicago Press, Chicago.

Schwabe, D. and Stenius, M. (2000) The Web Planetarium and other Applications in the Extended Virtual Environment EVE. In Proceedings of the 16th Spring Conference on Computer Graphics, Budmerice, Slovakia, 3–6 May.

Segall, B. and Arnold, D. (1997) Elvin Has Left the Building: A Publish/subscribe Notification Service with Quenching. In AUUG, Brisbane, Australia. http://www.dstc.edu.au/Elvin/

Sharkey, P. M., Roberts, D. J., Tran, F. D. and Worthington, B. G. (2000) PING – Platform for Interactive Networked Games: IST Framework V.

Shaw, J. (1997) PLACE – A User's Manual: From Expanded Cinema to Virtual Reality. Hatje Cantz.

Shaw, J., Staff, H., Row Farr, J., Adams, M., vom Lehn, D., Heath, C., Rinman, M.-L., Taylor, I. and Benford, S. (2000) Staged Mixed Reality Performance, "Desert Rain" by Blast Theory. eRENA Deliverable 7b.3.

Shneiderman, B. (1983) Direct Manipulation: A Step Beyond Programming Languages. Computer 16(8): 57–69.

Singhal, S. K. and Cheriton, D. R. (1996) Using Projection Aggregation to Support Scalability in Distributed Simulation. Paper presented at the International Conference on Distributed Computing Systems (ICDCS '96).

Slater, M., Pertaub, D.-P. and Steed, A. (1999) Public Speaking in Virtual Reality: Facing an Audience of Avatars. IEEE Computer Graphics and Applications, 19(2): 2–5.

Slater, M., Howell, J., Steed, A., Pertaub, D.-P., Garau, M. and Springel, S. (2000a) Acting in Virtual Reality. Paper presented at ACM Collaborative Virtual Environments.

Slater, M., Sadagic, A., Usoh, M. and Schroeder, R. (2000b) Small Group Behaviour in a Virtual and Real Environment: A Comparative Study. Presence: Teleoperators and Virtual Environments, 9(1): 37–51.


Slater, M., Steed, A. and Chrysanthou, Y. (2001) Computer Graphics and Virtual Environments: From Realism to Real-Time. Addison Wesley Publishers, Harlow.

Smith, M. (2000) Invisible Crowds in Cyberspace: Measuring and Mapping the Social Structure of USENET. In Communities in Cyberspace. Routledge, London.

Snowdon, D., Greenhalgh, C. and Benford, S. (1995) What You See is Not What I See: Subjectivity in Virtual Environments. Paper presented at the Framework for Immersive Virtual Environments (FIVE'95), QMW University of London, UK.

Snowdon, D., Fahlén, L. and Stenius, M. (1996) WWW3D: A 3D Multi-user Web Browser. In Proceedings of WebNet'96, San Francisco, CA, October.

Snowdon, D., Churchill, E. F. and Munro, A. J. (2001) Collaborative Virtual Environments: Digital Spaces and Places for CSCW: An Introduction. In Snowdon, D., Churchill, E. F. and Munro, A. J. (eds.), Collaborative Virtual Environments: Digital Places and Spaces for Interaction. Springer Verlag, London, pp. 3–17.

Snowdon, D. and Grasso, A. (2002) Diffusing Information in Organisational Settings: Learning from Experience. In Proceedings of ACM CHI 2002, Minneapolis, MN, April, pp. 331–338.

Sohlenkamp, M., Prinz, W. and Fuchs, L. (2000) AI and Society – Special Issue on CSCW, 14: 31–47.

Ståhl, O. (1992) Tools for Cooperative Work in the MultiG TelePresence Environment. In Proceedings of the 4th MultiG Workshop, Stockholm-Kista, Sweden, pp. 75–88.

Ståhl, O., Wallberg, A., Söderberg, J., Humble, J., Fahlén, L., Bullock, A. et al. (2002) Information Exploration Using The Pond. In Proceedings of CVE'02, Bonn.

Star, S. L. (1992) The Trojan Door: Organisations, Work, and the "Open Black Box". Systems Practice 5: 395–410.

Star, S. L. and Griesemer, J. R. (1989) Institutional Ecology, "Translations" and Boundary Objects: Amateurs and Professionals in Berkeley's Museum of Vertebrate Zoology, 1907–39. Social Studies of Science 19: 387–420.

Steed, A. and Frécon, E. (1999) Building and Supporting a Large-scale Collaborative Virtual Environment. In Proceedings of 6th UKVRSIG, University of Salford, UK, pp. 59–69.

Steed, A., Frécon, E., Avatare, A., Pemberton, D. and Smith, G. (1999) The London Travel Demonstrator. In Proceedings of VRST'99 – Symposium on Virtual Reality Software and Technology, University College London, UK, pp. 50–57.

Steed, A., Mortensen, J. and Frécon, E. (2001) Spelunking: Experiences using the DIVE System on CAVE-like Platforms. In B. Frohlich, J. Deisinger and H.-J. Bullinger (eds.), Proceedings of Immersive Projection Technologies and Virtual Environments 2001, Springer-Verlag, Vienna, pp. 153–164.

Stefik, M., Bobrow, D. G., Foster, G., Lanning, S. and Tatar, D. (1987) WYSIWIS Revised: Early Experiences with Multiuser Interfaces. Transactions on Office Information Systems, 5(2): 147–167, ACM Press.

Stenius, M., Frécon, E., Fahlén, L., Simsarian, K. and Nord, B. (1998) The Web Planetarium Prototype – Visualising the Structure of the Web. In Mariani, J., Rouncefield, M., O'Brien, J. and Rodden, T. (eds.), eSCAPE Deliverable 3.1: Visualisation of Structure and Population within Electronic Landscapes, Esprit Long Term Research Project 25377, Lancaster University, pp. 117–125.

Stephenson, N. (1992) Snow Crash. Bantam, New York.

Storrs Hall, J. Utility Fog: The Stuff that Dreams are Made Of. http://discuss.foresight.org/~josh/Ufog.html

Strong, R. and Gaver, B. (1996) Feather, Scent, and Shaker: Supporting Simple Intimacy. In M. S. Ackerman (ed.), Proceedings of the ACM 1996 Conference on Computer Supported Cooperative Work. ACM, New York, p. 444.

Stults, R. (1986) MediaSpace. Technical Report, Xerox PARC.

Suchman, L. A. (1982) Systematics of Office Work. Office Studies for Knowledge-Based Systems, Digest, Office Automation Conference, San Francisco, 5–7 April.

Suchman, L. A. (1987) Plans and Situated Actions: The Problem of Human-machine Communication, Cambridge University Press, Cambridge.

Suchman, L. A. and Trigg, R. H. (1991) Understanding Practice: Video as a Medium for Reflection and Design. In Greenbaum, J. and Kyng, M. (eds.), Design at Work. Lawrence Erlbaum, London and New Jersey, pp. 65–89.

Suchman, L. (1997) Centers of Coordination: A Case and Some Themes. In Resnick, L. B., Säljö, R., Pontecorvo, C. and Burge, B. (eds.), Discourse, Tools, and Reasoning: Essays on Situated Cognition. Springer-Verlag, Berlin, pp. 41–62.

Sudnow, D. (ed.) (1972) Studies in Social Interaction. Free Press, New York, pp. 229–258.

Swan, J., Newell, S., Scarbrough, H. and Hislop, D. (1999) Knowledge Management and Innovation: Networks and Networking. Journal of Knowledge Management, 3: 262–275.

Thompson, G. (1972) Three Characterizations of Communications Revolutions. In Winkler, S. (ed.), Computer Communication: Impacts and Implications: International Conference on Computer Communication, New York.

Törlind, P., Stenius, M., Johansson, M. and Jeppsson, P. (1999) Collaborative Environments for Distributed Engineering. CSCWD'99 – Computer Supported Cooperative Work in Design 99, Compiègne, France, 29 September–1 October.

Townsend, A. M., Hendrickson, A. R. and DeMarie, S. M. (2002) Meeting the Virtual Work Imperative. CACM 45(1).

Tramberend, H. (2001) Avango: A Distributed Virtual Reality Framework. Paper presented at Afrigraph, ACM.

Underkoffler, J. and Ishii, H. (1999) Urp: A Luminous-Tangible Workbench for Urban Planning and Design. In Proceedings of the ACM Conference on Computer–Human Interaction (CHI 99), pp. 386–393.

Ullmer, B., Ishii, H. and Glass, D. (1998) MediaBlocks: Physical Containers, Transports, and Controls for Online Media. In Proceedings of SIGGRAPH'98, Orlando.

Valin, S., Francu, A., Trefftz, H. and Marsic, I. (2001) Sharing Viewpoints in Collaborative Virtual Environments. In Proceedings of HICSS-34, Hawaii, IEEE.

VR Lab (2003) http://vrlab.epfl.ch/


Waters, R. C., Anderson, D. B., Barrus, J. W., Brogan, D. C., Casey, M. A., McKeown, S. G., Nitta, T., Sterns, I. B. and Yerazunis, W. S. (1997) Diamond Park and Spline: A Social Virtual Reality System with 3D Animation, Spoken Interaction, and Runtime Modifiability. Presence: Teleoperators and Virtual Environments, 6(4): 461–480.

Watsen, K. and Zyda, M. (1998) Bamboo – A Portable System for Dynamically Extensible, Real-Time, Networked, Virtual Environments. In Proceedings of the Virtual Reality Annual International Symposium (VRAIS'98), Atlanta, GA, pp. 252–259.

Chen, W.-C., Towles, H., Nyland, L., Welch, G. and Fuchs, H. (2000) Toward a Compelling Sensation of Telepresence: Demonstrating a Portal to a Distant (Static) Office. IEEE Visualization 2000, pp. 327–333.

Weiser, M. (1991) The Computer for the Twenty-first Century. Scientific American, 265(3): 94–104.

Weiser, M. (1994) Creating the Invisible Interface (abstract). In Proc. ACM User Interface Software and Technology, p. 1.

Williamson, K. (1998) Discovered by Chance: The Role of Incidental Information Acquisition in an Ecological Model of Information Use. Library and Information Science Research, 20(1): 23–40.

Wisneski, C., Ishii, H., Dahley, A., Gorbet, M., Brave, S., Ullmer, B. and Yarin, P. (1998) Ambient Displays: Turning Architectural Space into an Interface between People and Digital Information. In Streitz, N. A., Konomi, S. and Burkhardt, H.-J. (eds.), Cooperative Buildings – Integrating Information, Organization, and Architecture. Springer LNCS, pp. 22–32.

Wittgenstein, L. (1958) Philosophical Investigations, 3rd edn, trans. G. E. M. Anscombe. Oxford University Press.

Xiong, R. and Donath, J. (1999) PeopleGarden: Creating Data Portraits for Users. Proceedings of the 12th Annual ACM Symposium on User Interface Software and Technology, pp. 37–44. ACM, New York.

Yamazaki, K., Yamazaki, A., Kuzuoka, H., Oyama, S., Kato, H. et al. (1999) Gesture Laser and Gesture Laser Car: Development of an Embodied Space to Support Remote Instruction. In Bodker, S., Kyng, M. and Schmidt, K. (eds.), Proceedings of the Sixth European Conference on Computer Supported Cooperative Work – ECSCW'99, 12–16 September, Copenhagen, Denmark. Kluwer Academic Publishers, Dordrecht.


Index


A
abstract landscape, 27
access model, 46
accountability, 143
activity landscape, 171, 172
activity-based navigation, 193
Activity-oriented Navigation, 178
affordances, 29, 30, 31
agent-mediated institution, 105
aggregate, 17
aggregation, 250
AlphaWorld, 116
Amazon.com, 55, 61
Ambient interfaces, 201, 203
Aotea Youth Symphony, 96
Apache, 78
appropriated, 74
architectural collaboration, 115
artefact, 27
attention, 71
audio communication, 130
audio feedback, 51
augmented reality, 36, 72, 101
Auld Leaky, 77
Aura, 249, 258
authenticity, 166
autonomous, 104, 170
avatar, 236, 253
Avatar-centred Navigation, 176
avatars, 133, 197, 199
AwareBots, 201, 203
awareness, 101, 102, 171, 184, 186, 189, 203, 204, 206, 246, 249
awareness service, 104, 105

B
banner advert, 20, 22
BEAD, 51
Behaviour, 41, 103
blacklist, 22
Blink, 160, 161, 164
boid flock model, 54
boundary architecture, 34
boundary object, 271, 286
brain activity, 91
"broken" link, 14

C
CAPIAs, 106
CARE HERE, 94
CARESS, 92, 94
Causal ordering, 242, 245
causality, 242, 244
CAVE, 236
challenges of teleconferencing, 124
chance encounter, 196
characteristics, 40
Charles Rennie Mackintosh, 72
City project, 72, 76
city's meaning, 75
cityscape, 27, 29
ClearBoard, 275
client/server, 39
closed world, 71
cluster, clustering, 17
coherent perception, 38
collaborate, 3
collaboration, 115
Colour coded, 35
Common goals, 281
common sense, 103
Common sense knowledge, 102
community, 272
Community Place, 165
compound link, 19
Computer Supported Co-operative Work, 271
COMRIS, 102, 105, 107, 108, 109, 110, 111
conceptions of space, 73
Concurrency control, 244
conference calls, 127
conference centre application, 106
Consistency, 242, 243
container, 17


context, 75
context aware, 103
context of use, 71
contextualisation, 76
continuative computing, 103, 104, 110
control rooms, 274
conversational orientation, 30
co-operation, 3, 274
co-presence, 150
costly, 32
creative process, 89
creels, 62
crowd, 154, 171
CSCW, 77, 271, 274, 277, 281
Cultural context, 152
CWall, 22
CyberDance, 153
Cyberspace, 11, 74, 116
Cyc, 103
cyclist, 29

D
dead reckoning, 213, 242
decontextualised information spaces, 73
delivery service, 104, 105
Desert Rain, 154
design, 287
desktop-VR, 22
DEVA, 26, 39, 40, 44, 47
disability, 89
Dispersed Avatars, 178
distributed computing, 278
distributed file sharing, 278
Distributed Legible City, 29
DIVE, 12, 16, 20, 26, 57, 66, 123, 211, 212, 215, 218, 223, 224, 252, 255, 256, 266
DocuDrama, 184, 198, 199, 205
Domain Name System (DNS), 283
Doors of Perception, 47
dual space, 104

E
ecosystem metaphor, 51, 54
electronic, 50
electronic arenas, 151, 152
electronic landscape, 25, 27
Embodiment, 118, 126
Emoticons, 118, 119
emotive communication, 237
Enforced, 42
Equator, 72, 78
EQUIP, 77
eRENA, 151, 152, 153, 155
eSCAPE, 11, 12, 25, 26, 27, 29, 33
e-scape, 25, 33
ethnographic, 179
ethnographic studies, 26, 28, 31
expectations, 30
experience, 32
experience design, 152
expression amplifier, 92
EyesWeb, 91

F
facial expression, 236
familiar structure, 35
FDP, 15, 22
feedback loop, 95
field of view, 134, 135
filter, 43
Firewalls, 264
flocking algorithm, 55
flocking behaviour, 250
fly, 28
focus, 171, 249
focused engagement, 71
Force Directed Placement (FDP), 15
Four Senses, 96

G
game show, 166, 168
GestureCam, 277
Gnutella, 273
Grokster, 278
Groove, 278
groundplane, 27, 36, 47, 168, 176
groupware, 272

H
haptic feedback, 143
Head Mounted Display, 236
head-set, 22
Hearing ANd Deaf Sign Singers, 96
heart rate, 91
Heaven and Hell – Live, 165, 168
Heidegger, 75
heterogeneity, 81
heterogeneity of media, 76
hierarchical database, 213
hot spots, 178
HTML, 12
HTTP, 58, 77
human–computer interaction, 151
human-like representation, 135
hybrid objects, 82
hyperlink, 27
hyperlinking, 36


I
icon, 12
Imbued, 42
immersive, 11, 37, 237, 238
implementing virtual worlds, 37
Indian Ocean, 280
informal interactions, 134
information creatures, 52
Information gathering, 109
infrared, 91, 95
infrared movement sensors, 90
Inhabited Television, 164, 165, 170
Innate, 42
input devices, 31
Instant Messenger, 279
Intelligibility, 179
interactional breakdowns, 86
Interactive painting, 95
interest landscape, 109
interest management, 246
intimacy, 276
introspection, 42

K
KaZaA, 278

L
LAN, 271
landscape, 35
Large scale participation, 152
large screen, 22
LEADS, 17
legibility, 17
Legible City, 29
LEGO Mindstorm, 202
LEGO Mindstorms, 201
Level of Detail, 251, 258
level of detail (LOD), 12
level-of-detail (LOD), 17
levitate, 28
light collage, 97
Lightwork, 154, 155, 158, 160
Limewire, 278
line of sight, 143
Linux, 93
local area networks, 271
Locales, 248, 249
LOD, 17
Loom project, 293

M
Mackintosh Interpretation Centre, 72
marked-up text, 12
Masaki Fujihata, 33
MASSIVE, 26, 123
MASSIVE 2, 249
MASSIVE 3, 245
MASSIVE-2, 17, 135, 168
mass-participatory, 166
MAVERIK, 38
MAX, 91
MBone, 120, 265
Media rich, 152
mediation services, 104
Memory Theatre VR, 48
metaphors, 31
Microsoft, 12
MIDI, 89, 91, 93
MIDI bass, 90
migrate, 45
migrating, 44
Mixed reality, 151
modify existing artefacts, 32
mood, 96
MOOs, 116, 118
Morecambe, 31
Morpheus, 278
MUDs, 116, 118
multi-agent systems, 104
multicast, 213, 224, 254, 259, 263
multiple simultaneous touches, 64
multiple users, 14
muscle tension, 91
mutual availability, 147

N
Nanotechnology, 99
Napster, 273
natural language generation, 108
navigable, 34
navigate, 194, 196
navigating, 24
navigation, 53, 152, 176
NetMeeting, 120
Netscan, 291, 292, 294
Netscape, 22
Netscapes, 12
Network Address Translation (NAT), 283
neuropsychologists, 99
nimbus, 171, 249
Nuzzle Afar, 33, 35

O
object, 39
object behaviour, 241
Object-centred Navigation, 177
objective reality, 38
objectivity, 234
OpenGL, 26, 49


organising concept, 271
Out of this World, 166, 168, 170, 177
out of view, 138
outlined field-of-view, 135
own content, 32

P
P2P, 272
Panoramic Navigator, 36
PaRADE, 245, 255
parallelisation, 19
parrot, 108
patch data, 96
pathways, 27
pedestrian areas, 28
peer-to-peer, 39, 213, 272
PeopleGarden, 293
perceptions, 38
Performer, 49
peripheral awareness, 184, 196, 205
peripheral lenses, 135, 146
Persistence, 251
persistent, 103
pervasive computing, 101
phicon, 174, 175
phicons, 175
physical landscape, 36
Physical sensors, 102, 103
physical space, 104
physical tags, 53
Picturephone, 117
PING, 245, 253, 256, 260, 266
PlaceWorld, 25, 27, 39
plasma display, 56
plug-in, 58, 216, 230
PocketPC™, 66
popularity, 36
Populated Information Terrains, 133
portals, 20, 27
Presence, 179
Production management, 153, 169
proprioception, 239
proxy server, 22
pseudo-humanoid, 133
public performance, 89, 152
puppy camera, 172, 174

Q
QPIT, 51
Q-SPACE, 15
Quake, 279

R
radio frequency, 78
realism, 134
Real-time applications, 152
reciprocal perspectives, 147, 148
recommender system, 87
relaxed WYSIWIS, 134
remote collaboration, 81, 87
representation, 133
representations, 285
responsiveness, 237, 242
RFID, 57, 58, 62, 66
ribbons of light, 27
Round Table, 174, 179

S
Saussure, 75
scalability, 16, 146, 251
scalable, 241
scenegraph, 238, 241, 253, 257
Script Programming, 219
seamfulness, 87
search engine, 273
searchability, 274
sense-of-presence, 150
seven fat years for CSCW, 271
seven lean years for CSCW, 272
SGI Reality Monster, 93
Shared awareness, 84
shared awareness of location, 82
shared environment, 124, 126
shared location, 84
shared objects, 134
shared visiting experience, 78
shared whiteboard, 121
Shared Workspace, 186
shoal, 55, 59, 61, 63
shutter glasses, 93
Smartmaps, 183, 190, 191, 192, 193
social accounting, 284, 292
Social accounting data, 295, 298
Social contacts, 182
social interaction, 71, 76, 83
social surroundings, 108
social translucence, 293
society of agents, 105
solidity, 143
sonifying, 171
Sony Playstation®, 97
Spatial awareness, 78
Spatial Interaction Model, 171
spatialised, 224
spatialised audio, 146, 215
speed of movement, 134
Spelunk, 258


Spline, 255
SQL, 51
structuralist semiotics, 73
subjective perception, 38
subjective view, 39, 228, 245, 258
subjective visualisations, 24
subjects, 39
Symbolic Actions, 194
Symbolic gestures, 196
synchronisation, 47
synomorphy, 181
synthesiser, 96

T
tangible artefacts, 72
technological limitations, 37
telephone space, 74
teleport, 177
teleporting, 27
telepresence, 117
tele-workers, 203
tethered viewpoints, 147
text chat, 116
texture, 20
texture mapping, 12
texture maps, 20
The Pond, 51, 52, 216, 219
theatre of work, 186
3D browser, 212
3D world, 93
To the Unborn Gods, 154
Touch Compass, 96
touch screen, 36
touchable interface, 37
touch-sensitive, 56
Tourist Information Centre, 31
trace, 33
Trackballs, 33
traditional media, 72
trails, 27
travellators, 168
tree-map, 192
trigger zones, 80
TWI-AYSI, 92, 94
twines, 40

U
ubiquitous computing, 101
Ultima Online, 246
ultrasonic, 95
ultrasonics, 78
unencumbered interaction, 51
urban design, 75
urban evolution, 28
urban models, 72
Usenet, 283, 291, 292, 294, 296
Utility Fog, 99

V
Video avatars, 236, 238
Video imagery, 130
Videoconferencing, 117
view control, 152, 170
Virtual Communities, 271
virtual conferencing, 115, 127
virtual creatures, 61
virtual reality, 275
visual complexity, 17
visual syntax, 34
visualiser, 40, 57, 58
VR Juggler, 77
VRML, 11, 20, 86, 251
VR-VIBE, 51, 216, 218

W
wayfinding, 28, 33
wearable computer, 101, 105, 108
web, 12
web browser, 11, 12, 22
web decoration, 20, 22
web page, 27
Web Planetarium, 52, 216, 219
whiteboard, 216
William Gibson, 11
working life, 275
workplace, 73

X
XML, 58, 77

Z
ZKM, 29, 36, 174
zooming, 61, 6


Out of print titles

Mike Sharples (Ed.)
Computer Supported Collaborative Writing
3-540-19782-6

Dan Diaper and Colston Sanger
CSCW in Practice
3-540-19784-2

Steve Easterbrook (Ed.)
CSCW: Cooperation or Conflict?
3-540-19755-9

John H. Connolly and Ernest A. Edmonds (Eds)
CSCW and Artificial Intelligence
3-540-19816-4

Duska Rosenberg and Chris Hutchison (Eds)
Design Issues in CSCW
3-540-19810-5

Peter Thomas (Ed.)
CSCW Requirements and Evaluation
3-540-19963-2

Peter Lloyd and Roger Whitehead (Eds)
Transforming Organisations Through Groupware: Lotus Notes in Action
3-540-19961-6

John H. Connolly and Lyn Pemberton (Eds)
Linguistic Concepts and Methods in CSCW
3-540-19984-5

Alan Dix and Russell Beale (Eds)
Remote Cooperation
3-540-76035-0

Stefan Kirn and Gregory O’Hare (Eds)
Cooperative Knowledge Processing
3-540-19951-9

Reza Hazemi, Stephen Hailes and Steve Wilbur (Eds)
The Digital University: Reinventing the Academy
1-85233-003-1

Alan J. Munro, Kristina Höök and David Benyon (Eds)
Social Navigation of Information Space
1-85233-090-2

Mary Lou Maher, Simeon J. Simoff and Anna Cicognani
Understanding Virtual Design Studios
1-85233-154-2

Elayne Coakes, Dianne Willis and Raymond Lloyd-Jones (Eds)
The New Sociotech
1-85233-040-6
