

Gesture-based Interaction for a Magic Crystal Ball

Li-Wei Chan, Yi-Fan Chuang, Meng-Chieh Yu, Yi-Liu Chao, Ming-Sui Lee, Yi-Ping Hung and Jane Hsu

Graduate Institute of Networking and Multimedia
Department of Computer Science and Information Engineering

National Taiwan University

Abstract

Crystal balls are generally regarded as media for divination or fortune-telling. These impressions come mainly from fantasy films and fiction, in which an augur can see into the past, the present, or the future through a crystal ball. With these distinct impressions, the crystal ball reveals itself as a perfect interface for users to access and manipulate visual media in an intuitive, imaginative and playful manner. We developed an interactive visual display system named Magic Crystal Ball (MaC Ball). MaC Ball is a spherical display system that allows users to see a virtual object or scene appearing inside a transparent sphere and to manipulate the displayed content with barehanded interactions. Interacting with MaC Ball makes users feel as if they are acting with magic power. With MaC Ball, the user can manipulate the display with touch and hover interactions. For instance, the user waves hands above the ball, causing clouds to blow from the bottom of the ball, or slides fingers on the ball to rotate the displayed object. In addition, the user can press a single finger to select an object or to press a button. MaC Ball takes advantage of the impressions of crystal balls, allowing users to interact with visual media following their imaginations. For applications, MaC Ball has high potential for advertising and demonstration in museums, product launches, and other venues.

CR Categories: I.3 [Computer Graphics]; H.5 [Information Interfaces and Presentation] (e.g., HCI)

Keywords: 3D interaction, haptics, entertainment

1 Introduction

In ancient times, many people believed that crystal balls had incredible power. Using a crystal ball for divination first appeared in the Middle Ages. This skill is called "Crystallomancy": practitioners gazed into a crystal ball to find indications about the future. As time went by, many fantasy films and works of fiction elaborated on the image of the crystal ball. In the magic world, the witch always keeps a crystal ball on the table, which reflects dim light and casts a mysterious radiance on her face. She then begins to mumble an incantation that nobody understands while waving her hands over the crystal ball. Allegedly, the augur can see into the past, the present, or the future through the crystal ball. The witch can have a distant wonderland or precious treasures revealed inside the ball. With these distinct impressions, the crystal ball reveals itself as a perfect interface for users to access and manipulate visual media with intuition and fantasy.

In this work, we developed an interactive visual display system named Magic Crystal Ball (MaC Ball). MaC Ball is a spherical display system that allows users to see a virtual object or scene appearing inside a transparent sphere and to manipulate the displayed content with bare hands. The display module of MaC Ball is based on the optical system of i-ball2 proposed in [Ushida et al. 2003]. We redesigned the interface of i-ball2 so that users feel they have magic power while playing with MaC Ball. MaC Ball lets users perform gestures with their bare hands. When the user waves hands above the ball, computer-generated clouds blow from the bottom of the ball and quickly surround the displayed content. When the user slides fingers on the ball, the viewing direction of the displayed content changes accordingly. In addition, MaC Ball provides a pointing gesture with which the user uses a single finger to select an object or to press a button. The motivation for MaC Ball is to realize the general public's imagination of crystal balls. The goal of MaC Ball is to transform impressions from movies and fiction into a medium for users to access multimedia in an intuitive, imaginative and playful manner.

Figure 1: The user is browsing relics using MaC Ball.

This paper starts by exploring the nature of the crystal ball, which makes it a perfect medium for accessing virtual content in an intuitive, playful and entertaining manner. We then provide pointers to related research in Section 2. Section 3 discusses the design principles MaC Ball follows to meet several expectations. Section 4 describes the configuration and implementation of MaC Ball, including the hardware and software aspects. Section 5 presents a relics exhibition application together with a discussion from an opening presentation of the system, followed by the conclusion and future work in Section 6.


2 Related Work

Display technology: Display technology plays an important role in changing our lives. Different display techniques have been developed and have made much progress in recent years. Some of them render imagery on different media for different purposes. Fogscreen¹ produces a thin curtain of dry fog that serves as a translucent projection screen, displaying images that literally float in the air. The Jeep waterfall display² organizes falling droplets into visible logos and other messages, which appear as the droplets descend into a trough. Instead of pursuing large displays or high display quality, these techniques provide viewers with different ways to experience digital media in a fun and entertaining manner. In the following, we focus on related work that constructs spherical displays and interactions for this kind of display.

Volumetric displays generate true volumetric 3D images by actually illuminating points in 3D space. The first kind of volumetric display is the swept-volume display, such as the display developed by Actuality Systems, Inc. [Favalora et al. 2002]. It generates a spherical, 360-degree-viewable 3D volumetric image by sweeping a semi-transparent 2D image plane around the vertical axis. A series of works that let users directly interact with and manipulate 3D data on a volumetric display has been proposed [Grossman et al. 2005][Grossman and Balakrishnan 2006]. One of the major concerns of these works is to provide selection mechanisms. Another kind of volumetric display is the static-volume display. SOLID Felix [Langhans et al. 2003], a static-volume 3D laser display, has a display medium doped with optically active rare-earth ions. These ions are excited in two steps by two intersecting IR laser beams with different wavelengths and afterwards emit visible photons.

To render images in a transparent ball, i-ball [Ikeda et al. 2001] applies another solution. With a special optical system, i-ball makes images displayed on an LCD monitor appear in the air. Note that the display provided by the i-ball system is quite different from a volumetric display. The display quality of i-ball inherits all the benefits of an LCD monitor, which makes it well suited to presenting high-quality imagery to viewers. However, the optical system provides only a single viewpoint and is therefore for single-person use. i-ball2 [Ushida et al. 2003] is the successor to i-ball.

The two systems share the same optical system but differ in their means of interaction. i-ball has a motor that automatically rotates the ball according to the motion of the user's hands captured by a camera. As the ball is rotated by the motor, the displayed content changes accordingly. Since only hand motion is detected, i-ball cannot tell whether the user touches the ball. In contrast, i-ball2 provides another interaction mode in which the user physically rotates the ball, using an optical sensor installed under the ball. The optical sensor detects the physical rotation of the ball for interacting with the displayed content. Displaying images on the ball surface, Globe4D³ projects the Earth's surface onto a physical sphere. Globe4D is a representative interactive four-dimensional globe. The sphere can be freely rotated along all axes and viewed from any angle, and it enables the user to control time as its fourth dimension.

¹ Fogscreen: http://www.FogScreen.com
² Jeep waterfall: http://www.pevnickdesign.com/index1.html
³ Globe4D: http://www.globe4d.com/

3 Design Principles

3.1 Obtaining photo-realistic displaying quality

We searched for display techniques capable of bringing photo-realistic imagery into a transparent ball. To this end, software and hardware must work together. On the software side, the displayed content should achieve photo-realistic quality. For the presentation of MaC Ball, we construct a virtual museum by combining techniques of image-based rendering, virtual reality and augmented reality.

Both image-based and model-based techniques can be used to build the virtual exhibition environment and can provide viewers with a realistic and interactive tour. In the image-based approach, photo-realistic scenes can be generated, but it is hard for the user to view the scene from arbitrary viewing directions. In the model-based approach, 3D models are constructed. By rendering the 3D models, this approach allows users to interactively view the virtual world from arbitrary viewing directions. However, the 3D models usually have to be created manually, and the generated virtual world is usually not very realistic. In our virtual museum application, the artifacts are presented as object movies (an image-based approach) in the virtual exhibition. The image of the artifact shown on the ball is selected according to the viewing direction of the user. In this way, artifacts can be rotated and moved in 3D. The exhibition rooms are presented as panoramas. By integrating the two image-based techniques [Hung et al. 2002], the virtual museum presents photo-realistic quality to viewers.

Since the content from image-based techniques is photo-realistic, on the hardware side we need a display mechanism that brings these images into the transparent ball. The i-ball2 system, which uses a special optical system, fits our expectations well. With this optical system, i-ball2 can render high-quality images inside the transparent ball. As noted in the related work, a volumetric display is another possible solution that displays inside the space of the ball. Yet it is designed for volume data, not images.

3.2 Seeing like a crystal ball

The transparent ball designed for i-ball2 is rotatable, which allows the user to manipulate the displayed content by physically rotating the ball. In MaC Ball, the transparent ball is fixed. Viewers manipulate the content by waving hands and pointing a single finger above and on the ball. The three reasons for fixing the ball are summarized as follows:

For the imagination reason: Instead of rotating the crystal ball physically, an augur's hands hover over the ball while scrying. Physically rotating the ball gives the user a better sense of control, but less fantasy in how we interact with the real world. The development of MaC Ball should match the general imagination of crystal balls so as to give the sensation of playing with a real magic crystal ball.

For the practical reason: It is better to retain as few movable components as possible in order to minimize hardware maintenance. MaC Ball has the ball fixed, so the whole body is rigid. In addition, the interaction means for MaC Ball are provided by sensors installed underneath the glass body, so the system will not be damaged by negligent users. Another consideration for fixing the ball is security: a movable ball might easily be taken away by guests during an official exhibition.

For the cognitive reason: Physically rotating the ball enhances the sense that the ball is strongly connected to the displayed content. Rotation of the ball is highly related to the rotation (viewing direction) of the displayed content. As a result, the viewer regards the ball and the displayed content as one unit. A tiny delay or mismatch between the ball movement and the rotation of the displayed content may lead to an undesired separation effect. To avoid this effect, an accurate and responsive sensor for measuring the degrees of rotation of the ball is required. In comparison, MaC Ball keeps the ball stationary while the user interacts with it. Instead of controlling the ball, the user directly manipulates the displayed content. MaC Ball therefore does not suffer from any separation effect.

3.3 Interacting like an augur/witch

Many systems provide gesture-based interactions that require users to learn pre-defined gestures. Though these gestures are in principle designed to be meaningful, users have to memorize them. Moreover, users may feel restricted since they need to force their body or hands into some specific shape to issue a certain function.

In designing MaC Ball, we hoped to provide a minimal learning curve for users. One way to achieve this goal is to allow users to mimic the motions of an augur during the fortune-telling process. In our research, we simulate such motions and integrate them into MaC Ball's user interface design. Two types of gestures are defined for MaC Ball, the waving gesture and the pointing gesture, which are recognized by image processing techniques.

With the waving gesture, users may wave their hands at will to indicate a direction to MaC Ball. This gesture is very natural to use since recognition is achieved by detecting the motion of the user's palms, imposing no constraint on the user's hand postures. For the pointing gesture, the user uses a single finger to issue a visual cursor for selection. Switching between the two gestures is automatically detected by the algorithm, so users are able to perform most interactions at will. In addition, pressure sensors are installed in MaC Ball as an auxiliary means to distinguish these two gestures into touch and hover interactions.

Figure 2: The touch and hover zone.

Recall the impression of an augur mumbling an incantation while waving hands in a spiral above the ball. After a while, clouds glow within the ball, followed by the revelation of some mysterious images. To realize this, the interaction space of MaC Ball comprises a touch surface and a hover zone, as shown in Figure 2. Waving hands or pointing a single finger in the hover zone issues hover interactions. In our design, hover interactions are weak interactions that generate only auxiliary visual effects, such as computer-generated flashes and clouds, leaving the major content unaffected. However, repeated weak interactions turn into a strong interaction. For example, a hovering waving gesture generates clouds spiraling around the major content. If the user keeps waving for a while, the clouds gather and finally cover the whole major content. Once the waving stops, the clouds disappear and the content has changed. This is how weak interactions grow into a strong interaction and further influence the major content. The interaction makes users feel that they are performing magic. In contrast, touch interactions are strong interactions that directly influence the major content. The user is allowed to rotate the content by sliding multiple fingers on the ball, or to press a button by pressing a single finger.
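The way repeated hover waving grows from a weak into a strong interaction can be pictured as a small per-frame accumulator. The following is only a minimal sketch under stated assumptions: the class name `CloudEffect`, the `frames_to_cover` parameter, and the rule that the content changes only once the clouds fully cover it are all hypothetical illustrations, not details given in the paper.

```python
class CloudEffect:
    """Accumulates hover waving (weak) into a content change (strong)."""

    def __init__(self, frames_to_cover=60):
        # frames_to_cover is a hypothetical tuning value, not from the paper.
        self.frames_to_cover = frames_to_cover
        self.waving_frames = 0

    @property
    def coverage(self):
        # Fraction of the major content hidden by clouds, in [0, 1].
        return min(1.0, self.waving_frames / self.frames_to_cover)

    def update(self, hover_waving):
        """Advance one frame; return True when the content should change."""
        if hover_waving:
            # Clouds keep gathering while the user waves in the hover zone.
            self.waving_frames += 1
            return False
        # Waving stopped: the clouds disappear. Assumption: the content
        # changes only if the clouds had fully covered it.
        change = self.coverage >= 1.0
        self.waving_frames = 0
        return change
```

A renderer could read `coverage` every frame to decide how dense the clouds should be, and swap the displayed object when `update` returns True.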

4 System Implementation

The construction of MaC Ball tries to follow the principles described in the previous section so as to give users the sensation of using a magic crystal ball. The system implementation, including the hardware configuration and software implementation, is described in detail as follows.

4.1 Hardware Configuration

Figure 3: The architecture of MaC Ball system.

The architecture of MaC Ball is shown in Figure 3. It consists of two modules, the display module and the detection module. For the display module, the optical system from i-ball2 is adopted since it gives the user the perception of a real crystal ball. The image displayed on the LCD (15 inch, XGA) is reflected by a mirror and then passes through the Fresnel lens. As a result, the user sees the image appearing inside the glass ball. In addition, there is a minor advantage to using i-ball2. The image shown in the ball is slightly distorted under the influence of the lenses, which gives users the experience of viewing real 3D objects in the glass ball.

The detection module of MaC Ball consists of one infrared camera and three pressure sensors. The infrared camera, coupled with an infrared illuminator, is placed underneath the ball to recognize the users' hand gestures. Since the content in the ball is best observed under dim light with indirect ambient lighting, the infrared camera works well with our detection algorithms. Three pressure sensors placed on the corbels supporting the glass ball are used to determine whether users touch the ball. The pressure sensors, called FlexiForce sensors, are commercial products from Tekscan. The FlexiForce sensor is an ultra-thin (0.008") and flexible printed circuit that can detect and measure a relative change in force or applied load. The sensor output exhibits a high degree of linearity, low hysteresis and minimal drift. FlexiForce sensors are available with different maximum forces. In the implementation, we use the sensor rated for a maximum force of 1 lb⁴. The sensors produce an analog signal, and the resolution depends on the electronics. We use a 16-bit (65,536 levels) A/D converter, which gives a resolution of approximately 0.007 gram (about 454 g divided by 65,536 levels). This is sensitive enough to detect changes in pressure on the ball. An attachment that functions like a shock absorber bridges the pressure sensor and the ball surface, as shown in Figure 3. When the user touches the ball, the pressure increment is transferred to the pressure sensor for further computation. As a practical consideration, the attachment is glued to the ball surface, so that the user cannot take the ball away.

⁴ 1 lb equals about 0.45 kg

4.2 Software Implementation

The detection module comprises one infrared camera and three pressure sensors. The infrared camera, coupled with an infrared illuminator, is placed underneath the ball, observing hand motions and fingertip positions above the ball. Three pressure sensors are placed on the corbels of the glass ball to determine whether users touch the ball. In the following, we describe the detection means for MaC Ball, which consist of (1) hand motion detection, (2) fingertip finding, and (3) pressure sensing.

4.2.1 Motion Detection for Waving Gesture

The waving gesture lets the user deliver direction indications to MaC Ball. We use a typical motion detection technique, optical flow, to implement waving gestures. The optical flows extracted from consecutive image frames are used to determine the major direction of the waving hands. This approach does not rely on hand shapes, so the user can perform the waving gesture in free form. The details of the implementation are as follows.

The Lucas-Kanade method [Bouguet 2000] is applied to extract optical flows from two consecutive images. This method starts by building image pyramids and then extracting Lucas-Kanade feature points from the two images at different scale levels. By iteratively maximizing a correlation measure over a small window, the displacement vectors between the feature points of the two images are found from the coarsest level to the finest one. The displacement vectors are then taken as the motions in the image. Though the motions of the waving gesture can be successfully extracted, some false motions do exist. We remove these false motions with a two-step filter to provide a robust estimate of the waving direction. In the first step, large motions are dropped since they probably result from mismatches of local features at coarser levels of the pyramid. In the second step, we build a motion histogram, H(j), according to the angles of the motions. In the implementation, we set the bin size of the histogram to 20 degrees. The bin with the maximal count is then selected, and the motions in that bin are averaged to obtain the major direction for this image frame.
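The two-step filtering described above can be sketched in a few lines. This is an illustrative sketch only: it assumes the optical-flow displacement vectors have already been extracted as `(dx, dy)` pairs, and the function name `major_direction` and the magnitude cutoff `max_magnitude` are hypothetical choices, since the paper does not state the cutoff it uses. Only the 20-degree bin size comes from the text.

```python
import math


def major_direction(motions, max_magnitude=30.0, bin_size_deg=20):
    """Return the dominant motion as an averaged (dx, dy), or None."""
    # Step 1: drop large motions, which probably come from feature
    # mismatches at coarse pyramid levels (cutoff is an assumed value).
    kept = [(dx, dy) for dx, dy in motions
            if 0.0 < math.hypot(dx, dy) <= max_magnitude]
    if not kept:
        return None

    # Step 2: build the motion histogram H(j) over the motion angles.
    n_bins = 360 // bin_size_deg
    bins = [[] for _ in range(n_bins)]
    for dx, dy in kept:
        angle = math.degrees(math.atan2(dy, dx)) % 360.0
        j = int(angle // bin_size_deg) % n_bins
        bins[j].append((dx, dy))

    # Select the bin with the maximal count and average its motions.
    best = max(bins, key=len)
    mean_dx = sum(dx for dx, _ in best) / len(best)
    mean_dy = sum(dy for _, dy in best) / len(best)
    return mean_dx, mean_dy
```

Averaging only within the winning bin, rather than over all surviving vectors, keeps a few stray motions in other directions from dragging the estimate away from the dominant waving direction.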

4.2.2 Fingertip Finding for Pointing Gesture

The pointing gesture is implemented by applying a fingertip finding algorithm. We use the method proposed in [Chan et al. 2007] with modifications to suit our case. The algorithm is capable of finding multiple fingertips in images. When the algorithm reports that only one fingertip is found, the pointing gesture is issued accordingly. We describe each step of the algorithm below.

(1) Background subtraction: We first extract the region of interest by applying background subtraction, as shown in the second column of Figure 4.

(2) Separating finger parts from the observed image by a morphological opening: After background subtraction, we extract the finger parts by a morphological opening operation with a structuring element that is larger than a normal finger and smaller than a palm. Specifically, we define a normal fingertip pattern with r as the radius of a circular fingertip (Figure 4a). The size of the structuring element for opening is set to twice r. In the implementation, we use a template consisting of a 25*25-pixel square containing a circle whose radius r is 7 pixels. This size is determined according to the distance between the camera and the observed hands. Finally, the finger part is binarized into finger regions and background. Identifying finger regions greatly reduces the potential area where fingertips might be located. The third column in Figure 4 shows the finger regions that are successfully extracted in all the cases.

(3) Calculating the principal axis for each finger region: In this step, we further reduce the potential area to a principal line by using principal component analysis. In each finger part, positions around the two ends of the principal line are selected as fingertip candidates and form a group. Candidates in each group are scored in the next step. The surviving candidate with the best matching score in the group is then selected as the fingertip. The principal lines of the finger regions are overlaid on the potential areas, as shown in the third column of Figure 4. This step reduces the search space from a region to a handful of points.

(4) Rejecting fake fingertips by pattern matching and false matching removal: After the previous steps, only a few fingertip candidates remain. In this step, we verify the fingertip candidates by using fingertip matching and false matching removal, two heuristics borrowed from [KOIKE and KOBAYASHI 2001] and modified to suit our case. The verification uses the background-subtracted image (the first column in Figure 4). In fingertip matching, for each candidate, a template-sized region located at the candidate's position in the background-subtracted image is copied; this region is referred to as the fingertip patch. We then binarize the patch with a threshold set to the average of the maximum and minimum intensities in the patch. Next, we compute the sum of absolute differences between the patch and the fingertip template. Candidates with low scores are discarded. In false matching removal, if pixels in the diagonal direction on the boundary of the fingertip patch coexist, the candidate is not considered a fingertip and is removed. Final results are shown in the last column of Figure 4.

Figure 4: The images produced during and after processing. Icon (a) is the fingertip template used in the process. Icons labeled (b), (c), and (d) are three cases of gestures. The images produced for each case are arranged in the corresponding row. The first three columns show intermediate results during processing. The last column shows the final results.

After fingertip detection, the detected fingertip positions are multiplied by a homography matrix, which transforms them from the camera coordinate system to the display coordinate system. The homography is computed in advance during a manual calibration phase.
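The fingertip matching part of step (4) can be illustrated with a small pure-Python sketch. Several points here are assumptions rather than details from the paper: the template is modeled as a filled circle of radius 7 centered in a 25*25 square, the helper names `make_template`, `binarize` and `sad` are hypothetical, and the sum of absolute differences is treated as a distance, so a smaller value means a closer match to the template.

```python
def make_template(size=25, radius=7):
    """Binary fingertip template: a filled circle centered in a square.

    The exact template layout is an assumption; the paper only gives
    the square size (25*25) and the circle radius (7 pixels).
    """
    c = size // 2
    return [[1 if (x - c) ** 2 + (y - c) ** 2 <= radius ** 2 else 0
             for x in range(size)] for y in range(size)]


def binarize(patch):
    """Threshold a grayscale patch at the average of its max and min."""
    flat = [v for row in patch for v in row]
    threshold = (max(flat) + min(flat)) / 2
    return [[1 if v > threshold else 0 for v in row] for row in patch]


def sad(patch, template):
    """Sum of absolute differences between a binary patch and template."""
    return sum(abs(p - t)
               for prow, trow in zip(patch, template)
               for p, t in zip(prow, trow))


template = make_template()
# A bright disc on a dark background matches the template exactly.
fingertip_patch = [[200 if t else 10 for t in row] for row in template]
# A featureless patch binarizes to all zeros and differs on every disc pixel.
flat_patch = [[50] * 25 for _ in range(25)]

print(sad(binarize(fingertip_patch), template))
print(sad(binarize(flat_patch), template))
```

Thresholding each patch at the midpoint of its own intensity range makes the comparison insensitive to overall brightness, which suits an infrared image whose illumination falls off with distance from the camera.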

4.2.3 Pressure Sensing for Separating Touch and Hover Interactions

As mentioned in Section 4.1, the pressure sensors are very accurate, responsive (approximately 200Hz), and capable of measuring applied force at sub-gram scales. In addition, the sensors are stable when a fixed force is applied; that is, the sensors report the measured force with small variations. Intuitively, given these characteristics, we could discern touch events with a simple threshold method that reports a touch event when the increment of force in one pressure sensor is larger than a predefined value. In our implementation, detection is more stable when we threshold on the absolute difference between the currently measured force and the background force. The reason is described in the next paragraph. Specifically, the steps are as follows. First, we record the background force for each of the three sensors when no external force, other than the base force from the glass ball, is applied to the pressure sensors. The background forces are denoted B(i), where i = 1, 2, 3. Touch events are detected by the formula

$$T(i) = \begin{cases} \text{True}, & \text{if } |P(i) - B(i)| > Th \\ \text{False}, & \text{otherwise,} \end{cases} \qquad (1)$$

where P(i) is the currently measured force from sensor i, Th is a predefined threshold, and T(i) is a boolean function indicating whether sensor i reports a touch event. The system reports a touch event when any of the T(i) is true.

It is common sense that if no other force is applied to the crystal ball, the pressure sensors receive only the force from the ball. In contrast, if some force is applied, such as when a user places a palm on the ball, the pressure sensors receive a greater force. However, this is not always the case when the user is playing with MaC Ball. Figure 5 is a plot of pressure data collected from the sensors when the user manipulates MaC Ball lightly and firmly. To visualize the data in a 2D feature space, we project the data onto the P1-P2 plane (P1 and P2 are two of the three pressure sensors, as indicated in Figure 5). In the plot, the x-axis value indicates the force received by P1, and the y-axis value indicates the force received by P2.

The values range from 0 to -3000, where a larger value (closer to 0) corresponds to a greater force measured by the sensors. The yellow diamond in the plot indicates the background force. The blue circles are data collected when the user manipulates the ball lightly, while the red crosses are collected when the user manipulates the ball firmly. If no external force is applied to the crystal ball, the received pressure data stay around the diamond. If the user places a palm on the ball, lightly or firmly, as indicated in Figure 5a, the data move toward the origin. However, if the user slides a palm or fingers from one side of the ball to the other side, as indicated in Figures 5b and 5c, one side of the ball is lifted. As a result, one sensor receives a greater force and the other receives a smaller force compared with the background force. Based on this observation, formula (1), which relies on the absolute difference, covers all the cases.
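Formula (1) translates almost directly into code. The sketch below is illustrative only: the function names `touch_flags` and `is_touch`, the threshold of 50, and the sample readings are assumptions chosen to fall in the 0 to -3000 range described above, not values from the paper.

```python
def touch_flags(pressure, background, threshold):
    """T(i) for each sensor: True if |P(i) - B(i)| > Th."""
    return [abs(p - b) > threshold for p, b in zip(pressure, background)]


def is_touch(pressure, background, threshold):
    """The system reports a touch when any T(i) is true."""
    return any(touch_flags(pressure, background, threshold))


# Hypothetical readings; values closer to 0 mean greater force.
background = [-2000, -2000, -2000]
idle = [-1995, -2004, -2001]       # small fluctuation around background
palm_press = [-1500, -1600, -1550] # every sensor receives more force
slide = [-1700, -2300, -2000]      # one side pressed, the other lifted

print(is_touch(idle, background, 50))
print(is_touch(palm_press, background, 50))
print(touch_flags(slide, background, 50))
```

In the sliding case the second sensor reads less force than the background, so a test that only looks for an increase in force would miss it on that sensor, whereas the absolute difference flags it.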

Figure 5: A plot showing pressure data collected from two sensors (P1 and P2) when the user manipulates MaC Ball lightly and firmly. The blue circles indicate that the user manipulates the ball lightly. The red crosses indicate that the user manipulates the ball firmly.

Since we use a threshold method to determine touch events, the predefined threshold should be small enough to give sensitive touch sensing. Note that, because the pressure sensors are accurate and responsive, slight changes to the system may shift the background force and thus cause false alarms. Moreover, while the user is operating MaC Ball, the background force may also shift imperceptibly. An update mechanism that corrects the background force is therefore required to provide sensitive touch sensing while preserving a low false alarm rate. In particular, we design two rules for updating the background force so that MaC Ball is self-correcting in touch sensing.

(1) Update with no palm shown in camera view: The first rule is simple and effective. If no foreground is detected by the camera, the pressure background updates. This rule ensures that the pressure background is reset between two users playing with MaC Ball. Moreover, while a single user is playing with MaC Ball, the pressure background updates in the intervals between operations, for instance when the user is appreciating a virtual treasure with the hands placed beside the ball.

(2) Update with moving palms and stable pressure sensing: Although the first rule works well in most cases, the pressure background might shift imperceptibly if the user plays with MaC Ball with the hands visible to the camera all the time. The second rule handles this situation. If the camera detects a moving palm and the pressure data shows only a small variation for a short time, the pressure background updates. In this case, the user is probably waving hands above the ball. This rule takes advantage of the sensitivity of the pressure sensing.
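The two background update rules can be sketched as a small Python class. Several details here are assumptions made for illustration and are not stated in the text: the class name, the window length, the spread limit used to decide that the pressure is "stable", and the choice to set the new background to the window mean under rule (2).

```python
from collections import deque
from typing import Deque, List, Sequence, Tuple


class BackgroundUpdater:
    """Keeps the background force B(i) current using rules (1) and (2)."""

    def __init__(self, background: Sequence[float], window: int = 10, max_spread: float = 5.0):
        self.background: List[float] = list(background)
        self.window = window          # assumed number of recent samples checked for stability
        self.max_spread = max_spread  # assumed max (max - min) per sensor to count as stable
        self._recent: Deque[Tuple[float, ...]] = deque(maxlen=window)

    def _is_stable(self) -> bool:
        """True when every sensor varied by at most max_spread over a full window."""
        if len(self._recent) < self.window:
            return False
        for i in range(len(self.background)):
            values = [sample[i] for sample in self._recent]
            if max(values) - min(values) > self.max_spread:
                return False
        return True

    def step(self, pressures: Sequence[float], foreground_detected: bool, palm_moving: bool) -> bool:
        """Feed one frame of sensor data; return True if the background was updated."""
        self._recent.append(tuple(pressures))
        if not foreground_detected:
            # Rule (1): nothing in the camera view, so adopt the current reading.
            self.background = list(pressures)
            return True
        if palm_moving and self._is_stable():
            # Rule (2): a palm is moving but the pressure is steady (hand above the ball).
            n = len(self._recent)
            self.background = [sum(s[i] for s in self._recent) / n for i in range(len(pressures))]
            return True
        return False
```

Calling `step` once per camera frame keeps the background fresh between users (rule 1) and during long sessions in which the hands never leave the camera view (rule 2).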

5 Virtual Exhibition


5.1 Content Production

5.1.1 Acquisition of Object Movies

Figure 6: Some of the artifacts displayed in MaC Ball.

We choose artifacts from the National Palace Museum to be displayed in our virtual museum. The chosen artifacts are valuable for academic research since they are representative pieces with complete excavation records. Some of the artifacts are shown in Figure 6.

In order to render high-quality, photo-realistic 3D artifacts in the virtual exhibition, we use an image-based technique (Object Movie). An object movie is a set of images taken from different views around a 3D object. When the images are played sequentially, the object appears to rotate about itself. This technique was first proposed in Apple QuickTime VR [Chen 1995], and its photo-realism makes it suitable for delicate artifacts. Furthermore, each image is associated with the angles of its viewing direction, so particular images can be chosen and shown in the ball according to the hand motion of the user. More specifically, when the user slides fingertips on the ball, the motion is computed and translated into a change in viewing direction, and the displayed image of the object movie is changed accordingly. In this way, the user can interactively rotate the virtual artifacts and enjoy what he/she cannot see or feel in ordinary static exhibitions.

For capturing object movies, we use Texnai's autoQTVR standard edition, which provides accurate angular control and automatic camera control. As Figure 7 shows, the entire system is controllable from an ordinary PC. After the setup process, pictures are taken automatically in response to commands.
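The mapping from finger motion to a displayed object-movie image can be sketched as follows. This is an illustrative Python sketch, not the system's code: the `Frame` record, the pan/tilt parameterization of the viewing direction, the `gain` factor, and the tilt range are all assumptions introduced for the example.

```python
from dataclasses import dataclass
from typing import Sequence, Tuple


@dataclass(frozen=True)
class Frame:
    """One image of an object movie, tagged with its viewing direction (hypothetical layout)."""
    pan_deg: float
    tilt_deg: float
    image_id: str


def angular_distance(a: float, b: float) -> float:
    """Shortest distance between two angles in degrees, accounting for wrap-around."""
    d = abs(a - b) % 360.0
    return min(d, 360.0 - d)


def update_view(pan: float, tilt: float, dx: float, dy: float, gain: float = 0.5,
                tilt_range: Tuple[float, float] = (0.0, 60.0)) -> Tuple[float, float]:
    """Translate a fingertip displacement (dx, dy) in pixels into a new viewing direction."""
    new_pan = (pan + gain * dx) % 360.0
    new_tilt = min(max(tilt + gain * dy, tilt_range[0]), tilt_range[1])
    return new_pan, new_tilt


def nearest_frame(frames: Sequence[Frame], pan: float, tilt: float) -> Frame:
    """Pick the captured image whose viewing direction is closest to the requested one."""
    return min(frames, key=lambda f: angular_distance(f.pan_deg, pan) ** 2 + (f.tilt_deg - tilt) ** 2)
```

With frames captured every 30 degrees, a slide that moves the view from a pan of 350 degrees to 10 degrees selects the frame captured at 0 degrees, so the rotation wraps smoothly around the object.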

Figure 7: Acquisition of Object Movies with autoQTVR.

5.1.2 Augmentation of Panoramas with Object Movies

The panorama is the most popular image-based approach and provides an omni-directional view. In this system, the panorama is a cylindrical view stitched from images acquired by rotating a camera at a fixed point. With this approach, users can only see the contents of the panorama from specific viewing directions. The panorama recorded on a cylinder is de-warped to an image plane when viewed, as shown in Figure 8. A panorama is recorded as one single 2D image, whereas an object movie is composed of a set of 2D images taken from different perspectives around a 3D object. The goal is to augment a panorama with object movies in a visually 3D-consistent way. Based on the method proposed by Hung et al. [Hung et al. 2002], a system for authoring and browsing augmented panoramas is implemented in this work.
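De-warping a cylindrical panorama to an image plane can be illustrated with the standard cylinder-to-perspective mapping. The sketch below is a textbook formulation under stated assumptions (a full 360-degree panorama, a pinhole view with focal length in pixels, the horizon at mid-height); it is not taken from the paper or from [Hung et al. 2002], and the function name and parameters are hypothetical.

```python
import math
from typing import Tuple


def dewarp_pixel(u: float, v: float, yaw: float, focal: float,
                 pano_width: float, pano_height: float) -> Tuple[float, float]:
    """Map a view-plane pixel (u, v), measured from the image center, to panorama coordinates.

    yaw is the viewing direction in radians; the panorama is assumed to span 360 degrees.
    """
    radius = pano_width / (2.0 * math.pi)           # cylinder radius in pixels
    phi = math.atan2(u, focal)                      # horizontal angle of the ray within the view
    theta = (yaw + phi) % (2.0 * math.pi)           # absolute angle around the cylinder
    x = theta * radius                              # panorama column
    y = pano_height / 2.0 + v * radius / math.hypot(u, focal)  # panorama row
    return x, y
```

Sampling the panorama at `(x, y)` for every pixel of the output view yields the de-warped image; changing `yaw` pans the view around the exhibition room.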

Figure 8: (a) The cylinder used to record the panorama. (b) The panorama (in part) of an exhibition room. (c) A de-warped image from the area within the red rectangle of panorama (b).

5.2 Application I: Virtual Museum

The virtual museum application includes two operation modes: the scene mode and the artifact mode. At the beginning of a tour, the user navigates the exhibition room in the scene mode, changing the viewing direction of the panorama to browse through different object movies. Once the user selects an artifact, the application shows a close-up view of the artifact and switches to the artifact mode. The user can then rotate the artifact and appreciate it from every angle. Figure 9 shows a shot of the virtual museum application in MaC Ball. For interacting with virtual exhibitions, we need to support a variety of basic operations such as rotation and selection.

Figure 9: A shot of the virtual exhibition seen in MaC Ball. The left image shows a view of the exhibition room and the right one shows a particular view of an artifact.

Rotation

In interactions with the virtual museum, rotation is the most basic operation. The rotation operation is achieved by recognizing the waving gesture. To perform a rotation, the user simply slides fingers on the ball. Since the waving gesture is based on the motion of the hands, the user can perform the gesture freely while appreciating a delicate artifact. Figure 10 shows a user sliding fingers on the ball to appreciate an artifact.


Figure 10: The user is sliding fingers on the ball to rotate a virtual artifact.

Selection

While browsing in a virtual exhibition room, the selection operation allows the user to activate an artifact of interest. The selection operation is achieved by the pointing gesture: the user presses a single finger on the ball to choose an artifact, and the system switches to the artifact mode. Note that the pointing gesture is only activated when a single finger is recognized by the fingertip finding algorithm. Since a rotation operation is usually performed with multiple fingers seen by the camera, the implementation separates the rotation and selection operations in a natural way.
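The separation of rotation and selection, together with the switch from scene mode to artifact mode, can be sketched as a small dispatcher. The enum names, function names, and input flags below are hypothetical; the only behavior taken from the text is that a single recognized fingertip pressing on the ball means selection, and that sliding with the hand in motion means rotation.

```python
from enum import Enum


class Mode(Enum):
    SCENE = "scene"
    ARTIFACT = "artifact"


class Gesture(Enum):
    NONE = "none"
    ROTATE = "rotate"
    SELECT = "select"


def classify(touching: bool, fingertip_count: int, motion_detected: bool) -> Gesture:
    """Decide which operation a touch frame represents."""
    if touching and fingertip_count == 1:
        # The pointing gesture requires exactly one recognized fingertip and is checked first.
        return Gesture.SELECT
    if touching and motion_detected:
        # Sliding fingers on the ball is treated as the waving gesture, i.e. rotation.
        return Gesture.ROTATE
    return Gesture.NONE


def next_mode(mode: Mode, gesture: Gesture) -> Mode:
    """Selecting an artifact in scene mode switches to the close-up artifact mode."""
    if mode is Mode.SCENE and gesture is Gesture.SELECT:
        return Mode.ARTIFACT
    return mode
```

Checking the single-finger case first means that a deliberate pointing pose is never swallowed by the less constrained waving gesture.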

5.3 Application II: Relic Browsing

The relic browsing application is designed to let users focus on the beauty of artifacts. Instead of browsing through artifacts in a virtual exhibition room, the user directly sees close-up views of artifacts in MaC Ball. One more operation is needed to switch among artifacts. Here, the application generates clouds when the switch is issued. In addition, a virtual magnifier is provided for the user to see the details of the artifact.

Hovering

MaC Ball also provides hover interaction, in which the user hovers a palm above the ball. Hover interaction is a weak interaction dedicated to activating supportive visual effects, such as the spiraling clouds in our case. In the relic browsing application, hovering is used to switch among artifacts. When hovering palms above the ball, the user sees computer-generated clouds rising from the bottom of the ball. If the user keeps hovering for a short time, the clouds gather quickly, as shown in Figure 11. Once the clouds cover the present artifact, the artifact is switched, and a new artifact is immediately revealed as the clouds disperse.
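The hover-driven artifact switch can be sketched as a small state machine in which cloud density grows while the palm hovers and the artifact changes once the clouds fully cover it. The class name, the density scale from 0 to 1, and both rate parameters are assumptions chosen for illustration; the text does not give the actual timing.

```python
class CloudSwitcher:
    """Gathers clouds while the user hovers and switches the artifact at full cover."""

    def __init__(self, artifact_count: int, gather_rate: float = 0.5, disperse_rate: float = 1.0):
        self.artifact_count = artifact_count
        self.gather_rate = gather_rate      # assumed density gained per second of hovering
        self.disperse_rate = disperse_rate  # assumed density lost per second otherwise
        self.index = 0                      # currently displayed artifact
        self.density = 0.0                  # 0.0 = clear ball, 1.0 = artifact fully covered
        self.dispersing = False             # True right after a switch, until the clouds clear

    def update(self, hovering: bool, dt: float) -> bool:
        """Advance by dt seconds; return True on the frame where the artifact switches."""
        if self.dispersing:
            # Reveal the new artifact before another switch can begin.
            self.density = max(0.0, self.density - self.disperse_rate * dt)
            if self.density == 0.0:
                self.dispersing = False
            return False
        if hovering:
            self.density = min(1.0, self.density + self.gather_rate * dt)
            if self.density >= 1.0:
                self.index = (self.index + 1) % self.artifact_count
                self.dispersing = True
                return True
        else:
            self.density = max(0.0, self.density - self.disperse_rate * dt)
        return False
```

A brief pass of the hand only raises a few clouds that fade again, while a sustained hover completes the switch, which matches the role of hovering as a weak interaction.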

Figure 11: The user hovers the palms above the ball. The clouds are generated and the artifact switches after a short time.

Other Interactions

In the relic browsing application, a virtual magnifier is provided for the user to see the details of the artifact. To bring up the magnifier, the user forms a circle with the fingers, as shown in Figure 12. The circle represents a telescope through which the user can see an enlarged view of the artifact. The interaction is implemented by finding a large connected component in the camera view.
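Finding a large connected component can be illustrated with a breadth-first flood fill over a binary mask. This is a generic sketch, not the system's implementation: it assumes the camera image has already been thresholded into a 0/1 foreground mask, uses 4-connectivity, and the `min_area` threshold is a hypothetical tuning parameter.

```python
from collections import deque
from typing import Sequence


def largest_component(mask: Sequence[Sequence[int]]) -> int:
    """Return the pixel count of the largest 4-connected region of nonzero cells."""
    rows = len(mask)
    cols = len(mask[0]) if rows else 0
    seen = [[False] * cols for _ in range(rows)]
    best = 0
    for r in range(rows):
        for c in range(cols):
            if not mask[r][c] or seen[r][c]:
                continue
            # Flood-fill one component starting from (r, c).
            size = 0
            queue = deque([(r, c)])
            seen[r][c] = True
            while queue:
                y, x = queue.popleft()
                size += 1
                for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    ny, nx = y + dy, x + dx
                    if 0 <= ny < rows and 0 <= nx < cols and mask[ny][nx] and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            best = max(best, size)
    return best


def magnifier_active(mask: Sequence[Sequence[int]], min_area: int) -> bool:
    """Show the magnifier when some connected region is at least min_area pixels."""
    return largest_component(mask) >= min_area
```

In practice a library routine for connected-component labeling would be faster; the explicit loop is used here only to keep the example self-contained.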

Figure 12: The user forms a circle with the fingers on the ball to bring up the virtual magnifier.

5.4 Discussion

MaC Ball has been demonstrated in an opening presentation, in which more than twenty people took part. We did not show the participants how MaC Ball works at the beginning; instead, the participants tried their own ways of interacting with the transparent ball. One staff member guided the participants in case they had any problem with the system, and two staff members observed the participants’ behaviors and their reactions to the system. In the following, we list some lessons learned from an analysis of the observations after the demonstration.

Constrained and unconstrained interactions

Interactions carried out by different means are subject to different constraints. In general, less constrained interaction leads to fluent operation and usually makes users confident with the system. However, unconstrained means can limit the richness of interaction available to the system. In contrast, constrained interactions, like those based on hand shape analysis, can provide many kinds of interactions, but users can easily be frustrated by the system if they cannot meet the given constraints well. In this work, we find that a good combination of constrained and unconstrained interactions can work well by exploiting the users’ intention.

In this work, the waving gesture based on motion detection is used. The gesture imposes almost no constraint on the user other than moving/waving the palm. Since the detection is robust and responsive, the user can quickly feel confident with the interaction. This is very important because an interactive system can easily frustrate users, especially on their first try. However, when an unconstrained gesture like waving is used, other gesture-based interactions can hardly be added to the system, since it is difficult to separate them from the waving gesture. To add other interactions to MaC Ball, we choose the pointing gesture because users rarely perform waving in the pointing posture (single-finger hand shape). In the demonstration, all users quickly became familiar with the waving gesture. However, when they were told to use the pointing gesture, many of them had to practice several times to successfully issue a pointing gesture before they realized that the single-finger hand shape is the basic requirement for the gesture. Note that the pointing gesture is more constrained than the waving gesture. When users press a button, they are assumed to be using the pointing gesture intentionally because they have to strictly hold a pointing hand shape. Therefore, when the two gestures occur simultaneously, they are separated by the users’ intention. Specifically, MaC Ball gives a higher priority to the pointing gesture.

Hover zone: an extended interaction space

The field of computer-human interface seeks ways to build more humane user interfaces. With this goal, researchers in this community have explored a variety of means, including but not limited to voice, gesture, and bio-signal recognition. Yet a common problem exists for interactive installations: people feel uneasy before they realize the correct way to communicate with the installation.


In this case, an interaction that responds to the user’s curiosity at an early stage can ease the user’s tension as soon as possible.

In this work, we provide hover and touch interactions. The two kinds of interactions occupy separate interaction spaces of the ball: the touch surface and the hover zone. The hover zone is regarded as an extended interaction space that can attract potential users at an early stage. In the demonstration, a beginner was allowed to explore the ball by himself/herself. At this stage, the user appeared uncomfortable since no specific instruction was given by our staff. However, if the user was curious enough to try the ball, he/she found that the ball reacted responsively. For example, a cautious user might merely stretch out a hand into the hover zone, and some clouds would then spiral in the ball interactively. This response clearly encouraged the user to go on to touch the ball or to keep waving the hand in the zone. If the user simply did not know what to do, he/she was given hints such as “touch the ball” and “imagine that you are a witch and are trying to do scrying”. The user could then quickly find the correct way to interact with MaC Ball. In MaC Ball, the hover zone provides an extended space in which the users’ curiosity can be discovered and amplified by the responsive visual effects.

6 Conclusion and Future Work

In this work, an interactive visual display system named Magic Crystal Ball (MaC Ball) is developed. MaC Ball is a spherical display system that allows users to see a virtual object/scene appearing inside a transparent sphere and to manipulate the displayed content with barehanded interactions. MaC Ball transforms impressions from movies and fiction into a medium through which users access multimedia in an intuitive, imaginative and playful manner.

Many suggestions from the opening presentation will be included in future work. We will explore other gesture interactions. For example, the positions of detected fingertips can be fed back to produce more sophisticated visual effects, integrating not only clouds but also flashes and lighting effects. In addition, other inputs, such as the volume and frequency of sounds from the user, can be blended into the effect design of MaC Ball. Beyond computer-generated effects, MaC Ball can also generate responses in the external world. For example, a computer-controllable ambient light installed behind the transparent ball would be fun, or a fog generator could be installed around the ball to enhance the magical sensation.

Acknowledgements

This work was partially supported by grants from NSC 95-2422-H-002-020 and NSC 95-2752-E-002-007-PAE.

References

BOUGUET, J.-Y. 2000. Pyramidal implementation of the Lucas Kanade feature tracker. OpenCV Documents.

CHAN, L.-W., CHUANG, Y.-F., CHIA, Y.-W., HUNG, Y.-P., AND HSU, J. 2007. A new method for multi-finger detection using a regular diffuser. In International Conference on Human-Computer Interaction.

CHEN, S. E. 1995. Quicktime vr: an image-based approach to virtual environment navigation. In SIGGRAPH ’95: Proceedings of the 22nd annual conference on Computer graphics and interactive techniques, ACM Press, New York, NY, USA, 29–38.

FAVALORA, G. E., NAPOLI, J., HALL, D. M., DORVAL, R. K., GIOVINCO, M., RICHMOND, M. J., AND CHUN, W. S. 2002. 100-million-voxel volumetric display. SPIE, D. G. Hopper, Ed., vol. 4712, 300–312.

GROSSMAN, T., AND BALAKRISHNAN, R. 2006. The design and evaluation of selection techniques for 3d volumetric displays. In UIST ’06: Proceedings of the 19th annual ACM symposium on User interface software and technology, ACM Press, New York, NY, USA, 3–12.

GROSSMAN, T., WIGDOR, D., AND BALAKRISHNAN, R. 2005. Multi-finger gestural interaction with 3d volumetric displays. In SIGGRAPH ’05: ACM SIGGRAPH 2005 Papers, ACM Press, New York, NY, USA, 931–931.

HUNG, Y.-P., CHEN, C.-S., TSAI, Y.-P., AND LIN, S.-W. 2002. Augmenting panoramas with object movies by generating novel views with disparity-based view morphing. Journal of Visualization and Computer Animation, Special Issue on Hallucinating the Real World from Real Images 13 (September), 237–247.

IKEDA, H., NAEMURA, T., HARASHIMA, H., AND ISHIKAWA, J. 2001. i-ball: Interactive information display like a crystal ball. In Conference Abstract and Applications of SIGGRAPH, 122.

KOIKE, H., AND KOBAYASHI, Y. 2001. Integrating paper and digital information on enhanceddesk: a method for realtime finger tracking on an augmented desk system. ACM Transactions on Computer-Human Interaction 8, 4, 307–322.

LANGHANS, K., GUILL, C., RIEPER, E., OLTMANN, K., AND BAHR, D. 2003. Solid felix: a static volume 3d-laser display. SPIE, A. J. Woods, M. T. Bolas, J. O. Merritt, and S. A. Benton, Eds., vol. 5006, 161–174.

USHIDA, K., HARASHIMA, H., AND ISHIKAWA, J. 2003. i-ball2: An interaction platform with a crystal-ball-like display for multiple users. In International Conference on Artificial Reality and Telexistence.