
    Copyright © 2011 by the Association for Computing Machinery, Inc. Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, to republish, to post on servers, or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from Permissions Dept, ACM Inc., fax +1 (212) 869-0481 or e-mail [email protected].

    VRCAI 2011, Hong Kong, China, December 11–12, 2011. © 2011 ACM 978-1-4503-1060-4/11/0012 $10.00

    ENHANCING ART HISTORY EDUCATION THROUGH MOBILE AUGMENTED REALITY

    Ann M. McNamara

    Department of Visualization, Texas A&M University

    College Station, Texas, USA

    Abstract

    This paper describes a new project which will focus on the integration of eye-tracking technology with mobile Augmented Reality (AR) systems. AR provides an enhanced vision of the physical world by integrating virtual elements, such as text and graphics, with real-world environments. The advent of affordable mobile technology has sparked a resurgence of interest in mobile AR applications. Inherent in mobile AR applications is the powerful ability to visually highlight information in the real world. We are working on new algorithms to harness this ability to direct gaze to Points of Interest (POIs). Combining mobile AR and image manipulation gives visual distinction to POIs in order to directly influence and direct gaze in real-world scenes. Our initial test domain is that of Art History Education. POIs are determined based on salient regions of paintings, as identified by the visual narrative of the painting. We are developing a new system, to be deployed at the Museum of Fine Arts, Houston, that will enhance visitor education through the use of gaze-directed mobile AR.

    1 Introduction

    Figure 1: Current web-browser-based educational tools use text pop-ups and rectangular outlines to highlight important information in a visual narrative. This not only distracts the viewer from appreciating the image, but also breaks the image into smaller pieces so that it is not viewed in a holistic manner. The red rectangle destroys the visual experience by superimposing a distracting overlay on the original painting.

    AR applications use factors such as location and navigation direction to deliver contextual information to the user [Azuma 1997] [Tumler et al. 2008] [Feng et al. 2008] [Sielhorst et al. 2008] [Carmigniani et al. 2011]. The opportunity to capitalize on AR to influence gaze and visually guide the viewer through a scene has gone largely unexplored. The goal of this work is to use the dynamic nature of AR elements to provide a heightened perception of the real world. We propose using eye-tracking to determine where users look in an augmented scene, and then use this information to accomplish these goals. The objectives for this work are to a) develop models that can transform eye-tracked information into interest, b) develop display visualizations that use these models to present more informed content which does not distract from the user's task, and c) use AR to give visual distinction to areas in the real world in order to visually guide the viewer to important or interesting features.

    e-mail: [email protected]

    Imagine a scenario in which an Art History major is trying to improve his visual literacy skills. Narrative art tells a story, either as a moment in an ongoing story or as a sequence of events unfolding over time. A synoptic narrative depicts a single scene in which a character (or characters) is portrayed multiple times within a frame to convey that multiple actions are taking place. This can cause the sequence of events to be unclear. Synoptic narratives typically provide visual cues that convey the sequence, but they may still be difficult to decipher for those unfamiliar with the story. For example, the student is studying The Tribute Money by Renaissance artist Masaccio, Figure 1. This painting depicts a scene from the Gospel of Matthew, in which Jesus directs Peter to find a coin in the mouth of a fish in order to pay the temple tax. The optimal way to visually navigate this piece is to begin in the center, where the tax collector demands the money and Jesus, surrounded by his disciples, instructs Peter to retrieve the money from the mouth of a fish. By moving their gaze to the left of the painting (perhaps counter-intuitive to Western viewers, who normally read left to right), viewers notice Peter executing Jesus' instruction. The viewer's eyes finally need to travel to the extreme right of the painting to view the third episode, in which Peter pays the tax collector. At the time it was painted, audiences understood the order in which each episode of the painting was to be viewed to convey the correct story. However, our ability, as artists and audiences, to correctly read these paintings may not be so accurate in the present day, because our visual literacy is not conditioned to follow the viewing pattern the artist intended. While web-based solutions exist to show the narrative, they manipulate a digital representation of a painting using strong outlines, or interruptive text over the image, to explain where the viewer should look, Figure 1. While these represent a good first start, a more elegant solution would not interrupt the visual experience of the audience. Employing mobile AR devices with eye-tracking capabilities would allow the viewer to see the actual painting with areas of interest accentuated in a manner that protects the visual experience. This scenario illustrates the need to display information in a manner that minimizes disruption to the view, but can direct gaze to certain locations of an image, in a specific sequence.

    Now imagine an AR scenario that accounts for where the user is looking on their mobile device, and delivers content based on gaze location. Not only that, but it delivers that information to an area of the screen that will not obstruct image features that are (or will become) important to the user. Also, imagine a complementary AR system that can influence where viewers look in a scene, both spatially and temporally. This work proposes strategies to realize these AR scenarios. The ideal outcome is an eye-tracking AR system that is fully integrated into mobile devices and can inform AR applications on the optimal placement of AR elements based on gaze information, and also manipulate AR elements to direct visual attention to specific regions of interest in the real world.
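    One plausible, purely illustrative reading of the placement idea above (chooseLabelAnchor, the candidate-slot scheme and the distance-based score are assumptions, not the paper's method): pick the annotation anchor that sits farthest from both the current gaze point and the region the viewer is being guided toward, so the label never covers what the viewer is, or should be, looking at.

    // Illustrative placement heuristic (not from the paper): choose an annotation
    // anchor from a few candidate slots, preferring slots far from the gaze point
    // and from the target POI. Assumes a non-empty candidate list.
    #include <opencv2/opencv.hpp>
    #include <algorithm>
    #include <cmath>
    #include <vector>

    cv::Point chooseLabelAnchor(const std::vector<cv::Point>& candidates,
                                const cv::Point& gaze, const cv::Point& poiCenter) {
        cv::Point best = candidates.front();
        double bestScore = -1.0;
        for (const auto& c : candidates) {
            // Score: distance to the nearer of (gaze, POI); larger means less intrusive.
            double toGaze = std::hypot(double(c.x - gaze.x), double(c.y - gaze.y));
            double toPoi  = std::hypot(double(c.x - poiCenter.x), double(c.y - poiCenter.y));
            double score  = std::min(toGaze, toPoi);
            if (score > bestScore) { bestScore = score; best = c; }
        }
        return best;
    }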

    A healthy number of (mobile) AR applications have successfully been applied in the Art domain [Gwilt 2009] [Damala et al. 2008] [Andolina et al. 2009] [Bruns et al. 2007] [Choudary et al. 2009] [Chou et al. 2005] [Srinivasan et al. 2009]. To date, however, few have proposed eye-tracking as an added dimension. The novelty of this approach lies in the eye-tracking and in attracting and directing the gaze to the correct region of the artwork, in a sequence that will encourage appropriate visual navigation and understanding of the image and strengthen observation skills.

    Approach

    1.1 Eye-Tracking

    Eye tracking refers to techniques used to record and measure eye movements [Yarbus 1967]. Recent years have seen a rapid evolution of eye-tracking technology, with systems becoming cheaper, easier to operate and less intrusive on the viewer [Duchowski 2003]. Generally, eye-tracking data is analyzed in terms of fixations and saccades. Saccades are rapid eye movements used to position the gaze. During each saccade, visual acuity is suppressed and the visual system is in effect blind. Only during fixations is clear vision possible. The brain virtually integrates the visual images that we acquire through successive fixations into a visual scene or object. Eye-tracking systems first emerged in the early 1900s [Dodge 1900; Dodge and Cline 1901] (see [Duchowski 2003] for a review of the history of eye-tracking). Until the 1980s, eye-trackers were primarily used to collect eye movement data during psychophysical experiments. This data was typically analyzed after the completion of the experiments. During the 1980s, the benefits of real-time analysis of eye movement data were realized as eye-trackers evolved as a channel for human-computer interaction [Levoy and Whitaker 1990]. More recently, real-time eye-tracking has been used in interactive graphics applications [Cole et al. 2006] [DeCarlo and Santella 2002] [Hyona et al. 2003] [Bourlon et al. 2011] and large-scale display systems to improve computational efficiency and perceived quality. These techniques follow gaze. For this work we need to influence gaze. Subtle Gaze Direction (SGD) [Bailey et al. 2011] [Bailey et al. 2009] [Bailey et al. 2007] [McNamara et al. 2009] [McNamara et al. 2008] is a technique that exploits the fact that our peripheral vision has very poor acuity compared to our foveal vision. By presenting brief, subtle modulations to the peripheral regions of the field of view, the technique draws the viewer's foveal vision to the modulated region. Additionally, by monitoring saccadic velocity and exploiting the visual phenomenon of saccadic masking, the modulation is automatically terminated before the viewer's foveal vision enters the modulated region. Hence, the viewer never sees the stimuli that attracted her gaze. This gaze-directing technique can successfully guide gaze about an image.
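    As a rough illustration of how such a modulation loop might be organized (a sketch under assumptions, not the published SGD implementation: the 40 px disc, ±12 gray-level amplitude, 200 px periphery radius and 1500 px/s saccade threshold are placeholder values, and GazeSample, modulateRegion and sgdStep are hypothetical names):

    #include <opencv2/opencv.hpp>
    #include <algorithm>
    #include <cmath>

    struct GazeSample { cv::Point2f pos; double t; };   // screen position (px), timestamp (s)

    // Apply a brief, low-amplitude luminance oscillation confined to a small disc.
    void modulateRegion(cv::Mat& frame, const cv::Point& center, double phase) {
        cv::Mat mask = cv::Mat::zeros(frame.size(), CV_8UC1);
        cv::circle(mask, center, 40, cv::Scalar(255), cv::FILLED);
        cv::Mat modulated;
        frame.convertTo(modulated, -1, 1.0, 12.0 * std::sin(phase));  // +/- ~12 gray levels
        modulated.copyTo(frame, mask);
    }

    // One display refresh of the gaze-direction loop.
    void sgdStep(cv::Mat& frame, const GazeSample& prev, const GazeSample& cur,
                 const cv::Point2f& target, double phase) {
        const double kPeripheryPx  = 200.0;   // only modulate while the target is peripheral
        const double kSaccadeVelPx = 1500.0;  // px/s, crude saccade-onset threshold

        double distToTarget = std::hypot(target.x - cur.pos.x, target.y - cur.pos.y);
        double dt = std::max(cur.t - prev.t, 1e-3);
        cv::Point2f vel((cur.pos.x - prev.pos.x) / dt, (cur.pos.y - prev.pos.y) / dt);
        double speed = std::hypot(vel.x, vel.y);

        // A fast eye movement heading toward the target means a saccade is under way;
        // stop modulating so saccadic masking hides the stimulus from foveal vision.
        cv::Point2f toTarget(target.x - cur.pos.x, target.y - cur.pos.y);
        bool saccadeToTarget = speed > kSaccadeVelPx && vel.dot(toTarget) > 0;

        if (distToTarget > kPeripheryPx && !saccadeToTarget)
            modulateRegion(frame, cv::Point(cvRound(target.x), cvRound(target.y)), phase);
    }

    The essential behaviour is that the stimulus exists only while the target sits in the periphery, and it is withdrawn as soon as a saccade heads toward it, so the viewer never inspects the cue foveally.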

    1.2 Gaze Direction using Mobile AR

    Tablet computing and mobile devices promise to have a dramatic impact on education [El Sayed et al. 2011] [Chen et al. 2011]. Imagine a student holding a tablet computer in front of an artifact or image and, instantly, that object is annotated with more information. The overlay may include virtual text, images, web links or even video. AR applications for education are exploding in many academic arenas [Medicherla et al. 2010] [Kaufmann and Schmalstieg 2002] [Kaufmann and Meyer 2008]. What separates this work from existing applications is the integration of eye-tracking. Eye-tracking gives information about where the student is looking, what they are looking at, and whether they have actually looked at all the most pertinent regions. Also, if they are not looking where they are supposed to, subtle techniques will be introduced to draw attention back to those regions. These may be subtle but do not have to be: the most promising solution may iteratively increase in strength until the user's gaze is drawn to that location. Building on SGD, we plan to incorporate innovative ways to attract and focus attention on visual information in mobile AR applications. The focus (initially) is on Art History Education, although the ideas presented here have potential application in many disciplines. A healthy number of (mobile) AR applications have successfully been applied in the Art domain [Gwilt 2009] [Damala et al. 2008] [Andolina et al. 2009] [Bruns et al. 2007] [Choudary et al. 2009] [Chou et al. 2005] [Srinivasan et al. 2009]. To date, however, none have proposed eye-tracking as an added dimension. The novelty of this approach lies in the eye-tracking and in attracting and directing the gaze to the correct region of the artwork, in a sequence that will encourage appropriate visual navigation and understanding of the image and strengthen observation skills.
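    One way the "start subtle, escalate if ignored" policy could be expressed, purely as an illustrative sketch (CueState, updateCue and every constant below are assumptions, not values from this project):

    #include <algorithm>

    // Start with a barely perceptible modulation and step up its amplitude
    // only if the viewer's gaze has not reached the target after a timeout.
    struct CueState {
        double amplitude = 4.0;      // gray levels; near-threshold starting point
        double elapsed   = 0.0;      // seconds since this cue was activated
    };

    // Returns the amplitude to use this frame; the caller feeds it to the modulator.
    double updateCue(CueState& cue, double dt, double gazeToTargetPx) {
        const double kHitRadiusPx = 60.0;   // gaze this close counts as "arrived"
        const double kTimeoutSec  = 2.0;    // escalate if ignored this long
        const double kStep        = 4.0;    // amplitude increment per escalation
        const double kMaxAmp      = 40.0;   // clearly visible upper bound

        if (gazeToTargetPx < kHitRadiusPx) {      // success: retire the cue
            cue.amplitude = 0.0;
            return 0.0;
        }
        cue.elapsed += dt;
        if (cue.elapsed > kTimeoutSec) {          // ignored: make the cue stronger
            cue.amplitude = std::min(cue.amplitude + kStep, kMaxAmp);
            cue.elapsed = 0.0;
        }
        return cue.amplitude;
    }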

    Obviously, conspicuous objects in a scene (such as a black sheep in a white flock) will draw the viewer's attention first. However, there are more subtle image characteristics that can also draw our gaze. Image properties such as color, size and orientation can be used to control attention [Veas et al. 2011] [Underwood and Foulsham 2006] [Underwood et al. 2009]. We are also researching complementary ways to use AR elements to filter the image under scrutiny by overlaying virtual templates to highlight or defocus image details. In movies, directors use an arsenal of cinematographic tricks to lead the audience to look where they want them to look (see [Bordwell 2011]). Taking an automated approach, Itti and Koch [Itti and Koch 2000] [Itti and Koch 2001] developed an algorithm to measure visual saliency (how likely people are to look at parts of an image) on the basis of image characteristics such as intensity distribution, color changes, and orientation. Saliency maps could prove to be a good candidate to indicate the initial attention in a painting. Then, by modifying the digital version of the painting to re-distribute saliency, we build several versions of the painting with the pre-selected interesting regions manipulated to increase saliency. For example, in The Tribute Money, when it is time to look at Peter retrieving the coin from the mouth of the fish, image processing of the AR overlay could boost the intensity contrast in that region and thereby influence the viewer to re-direct their gaze. We also plan to iteratively adjust emphasis until the desired result is achieved. Such a scenario is shown in Figure 2.
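    A minimal sketch of the kind of overlay manipulation described here, assuming a simple contrast-and-brightness boost inside a rectangular region of interest (the paper does not commit to a particular operator; boostRegionSaliency and its parameters are illustrative):

    #include <opencv2/opencv.hpp>

    // Raise the saliency of a pre-selected region in the AR overlay by stretching
    // pixel values about the region's mean and lifting its brightness slightly,
    // so it stands out against its unmodified surroundings.
    void boostRegionSaliency(cv::Mat& overlay, cv::Rect roi,
                             double contrastGain = 1.3, double brightnessLift = 15.0) {
        cv::Mat region = overlay(roi);                  // view into the overlay, edited in place
        double m = cv::mean(region)[0];                 // first-channel mean, a simple anchor
        region.convertTo(region, -1, contrastGain,
                         brightnessLift + (1.0 - contrastGain) * m);
    }

    Because only the AR overlay is modified, the physical painting is untouched; successive frames could apply this boost to each episode's region in narrative order.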

    We will also investigate gracefully degrading the regions of the image that are not important at that moment in time. Taking inspiration from work by DeCarlo and Santella, we can apply a softening filter to all areas of the AR overlay apart from the area where the user should be looking. DeCarlo and Santella [Cole et al. 2006] [DeCarlo and Santella 2002] used an eye tracker to identify regions in photographs that viewers tended to focus on. Taking this information, they generated abstract renderings of these photographs with the other, less interesting regions presented in reduced detail. They then used eye tracking to confirm that this abstraction was effective at directing the viewer's gaze. This would in essence visually fade unimportant information.
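    A corresponding sketch of the defocus variant, assuming a plain Gaussian blur stands in for DeCarlo and Santella's stylized abstraction (the kernel size and the function name are arbitrary choices):

    #include <opencv2/opencv.hpp>

    // Blur the whole AR overlay, then copy the sharp pixels back into the one
    // region the viewer should be attending to.
    cv::Mat defocusExceptRegion(const cv::Mat& overlay, cv::Rect keepSharp) {
        cv::Mat softened;
        cv::GaussianBlur(overlay, softened, cv::Size(21, 21), 0);  // soften everything
        overlay(keepSharp).copyTo(softened(keepSharp));            // restore the POI
        return softened;
    }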

    Figure 2: A mock-up of how manipulating the AR version of the image has the potential to draw visual attention. The image on the iPad shows the main characters brighter than the other elements to highlight them (top). An alternative approach is to blur out unimportant information, as shown in the mock-up on the bottom.

    This approach will also ensure that the viewer does not inadvertently miss any areas of importance by directing the gaze about the painting. Using eye-tracking we can ensure that viewers hit the high points of an image and receive all the salient visual information. The novelty lies in the use of eye-tracking and image features to guide the eye, neither of which will interrupt the visual experience of viewing the painting, or disrupt the original painting in any way (unlike current technical approaches, Figure 1). One of the most attractive features of incorporating eye-tracking in mobile AR is that the virtual layer of information lights up only when the user looks at a certain point. That means the information delivered is relevant at that particular gaze location, and at that particular time.
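    The "lights up only when looked at" behaviour could be prototyped along the following lines; PoiAnnotation, composeFrame, the 0.6 blend weight and the assumption that each annotation layer matches the frame size are all illustrative, not part of the described system:

    #include <opencv2/opencv.hpp>
    #include <vector>

    struct PoiAnnotation {
        cv::Rect region;       // POI in screen coordinates
        cv::Mat  layer;        // pre-rendered annotation, same size and type as the frame
    };

    // Composite an annotation layer only while the measured gaze point lies inside its POI.
    void composeFrame(cv::Mat& frame, const std::vector<PoiAnnotation>& pois,
                      const cv::Point& gaze) {
        for (const auto& poi : pois) {
            if (poi.region.contains(gaze)) {
                // Reveal only the annotation relevant at this gaze location, at this time.
                cv::addWeighted(frame, 1.0, poi.layer, 0.6, 0.0, frame);
            }
        }
    }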

    2 Implementation

    2.1 Image Retrieval

    To retrieve the appropriate AR information to present, image recognition is built into the application. OpenCV, a library of programming functions for real-time computer vision, will be used in the proposed work, as these functions can easily capture and analyze images and video [openCV 2011]. OpenCV has also been successfully ported to work with iOS, the mobile device's operating system. OpenCV can also handle event input (such as mouse events). Rather than use the x,y position from the mouse, we measure the x,y gaze position in order to drive the gaze direction events. OpenCV commands are used to stream video capture from the device camera, e.g. CvCapture* capture = cvCaptureFromCAM(0); for camera #0.
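    A sketch of that capture loop in OpenCV's C++ API (cv::VideoCapture is the modern counterpart of the legacy cvCaptureFromCAM call quoted above); readGazePosition is a stand-in for whatever eye-tracker interface supplies gaze samples, and the drawing in onGazeEvent is only a placeholder reaction:

    #include <opencv2/opencv.hpp>

    // Stand-in for an eye-tracker API; a real system would return measured gaze.
    static cv::Point readGazePosition() { return cv::Point(320, 240); }

    // Application-defined reaction to a gaze event; here we simply mark the point.
    static void onGazeEvent(const cv::Point& gaze, cv::Mat& frame) {
        cv::circle(frame, gaze, 8, cv::Scalar(0, 255, 0), 2);
    }

    int main() {
        cv::VideoCapture capture(0);              // camera #0, as in the C snippet above
        if (!capture.isOpened()) return 1;
        cv::Mat frame;
        while (capture.read(frame)) {
            cv::Point gaze = readGazePosition();  // gaze x,y replaces mouse x,y events
            onGazeEvent(gaze, frame);             // e.g. reveal or modulate a POI
            cv::imshow("AR view", frame);
            if (cv::waitKey(1) == 27) break;      // Esc to quit
        }
        return 0;
    }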

    2.2 Image Alignment

    Aligning the real-world image on the mobile device with the (enhanced) augmented version of the image is necessary for this work. Two popular algorithms for achieving this are SIFT and SURF. Scale-Invariant Feature Transform (SIFT) detects and describes local features in images. SIFT transforms an image into a large collection of local feature vectors, each of which is invariant to scaling, rotation or translation of the image.

    Speeded Up Robust Features (SURF) offers similar performance to SIFT, but executes faster, which is important for mobile devices with limited processing power. OpenSURF, an open-source vision library that finds salient regions in images, forms the basis of many vision-based tasks including object recognition and image retrieval, and will be used to address image recognition and registration [openSURF 2011] [Takacs et al. 2008] [Chen and Koskela 2011].
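    For illustration, a feature-based registration step of the kind described might look as follows, assuming an OpenCV build in which cv::SIFT is available (SURF itself lives in the separate contrib/nonfree module) and using default, untuned parameters:

    #include <opencv2/opencv.hpp>
    #include <vector>

    // Returns the homography mapping the reference painting image into the
    // current camera frame, or an empty matrix if registration fails.
    cv::Mat registerPainting(const cv::Mat& paintingGray, const cv::Mat& frameGray) {
        auto sift = cv::SIFT::create();
        std::vector<cv::KeyPoint> kpPainting, kpFrame;
        cv::Mat descPainting, descFrame;
        sift->detectAndCompute(paintingGray, cv::noArray(), kpPainting, descPainting);
        sift->detectAndCompute(frameGray,    cv::noArray(), kpFrame,    descFrame);

        // Match descriptors and keep only matches that pass Lowe's ratio test.
        cv::BFMatcher matcher(cv::NORM_L2);
        std::vector<std::vector<cv::DMatch>> knn;
        matcher.knnMatch(descPainting, descFrame, knn, 2);

        std::vector<cv::Point2f> src, dst;
        for (const auto& m : knn) {
            if (m.size() == 2 && m[0].distance < 0.75f * m[1].distance) {
                src.push_back(kpPainting[m[0].queryIdx].pt);
                dst.push_back(kpFrame[m[0].trainIdx].pt);
            }
        }
        if (src.size() < 4) return cv::Mat();     // not enough support for a homography
        return cv::findHomography(src, dst, cv::RANSAC, 3.0);
    }

    The resulting homography could then be passed to cv::warpPerspective so that the enhanced (saliency-adjusted) version of the painting lands exactly over its real-world counterpart in the camera view.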

    3 Conclusion

    Existing mobile AR devices use cameras, angular velocity sensors, and accelerometers to gauge the absolute position of the user [Lane et al. 2010]. This information is then used to inform placement of annotations in the virtual view. By contrast, very little work has focused on gauging where the user's attention is directed and leveraging that information for the placement and delivery of AR elements. Our work uses eye-tracking in conjunction with AR applications to determine where the viewer is looking at each point in time. We can then use where people look as a mechanism to inform the placement of AR elements in a manner that aligns with the user's visual attention and eliminates ambiguity by hiding information that the user ignores. This will ultimately lead to gaze-aware mobile AR applications that minimize visual clutter and enhance visual literacy by eliminating elements that are not being attended to. This project is in its preliminary stages, and has great potential to impact human-centered AR systems.

    References

    ANDOLINA, S., SANTANGELO, A., CANNELLA, M., GENTILE, A., AGNELLO, F., AND VILLA, B. 2009. Multimodal virtual navigation of a cultural heritage site: the medieval ceiling of Steri in Palermo. In Proceedings of the 2nd Conference on Human System Interactions, IEEE Press, Piscataway, NJ, USA, HSI '09, 559–564.

    AZUMA, R. T. 1997. A survey of augmented reality. Presence: Teleoperators and Virtual Environments 6, 4 (Aug.), 355–385.

    BAILEY, R., MCNAMARA, A., SUDARSANAM, N., AND GRIMM, C. 2007. Subtle gaze direction. In ACM SIGGRAPH 2007 Sketches, ACM, New York, NY, USA, SIGGRAPH '07.

    BAILEY, R., MCNAMARA, A., SUDARSANAM, N., AND GRIMM, C. 2009. Subtle gaze direction. ACM Trans. Graph. 28 (September), 100:1–100:14.

    BAILEY, R., MCNAMARA, A., COSTELLO, A., AND GRIMM, C. 2011. Impact of subtle gaze direction on short-term spatial information recall. In ACM SIGGRAPH 2011 Talks, ACM, New York, NY, USA, SIGGRAPH '11.

    BORDWELL, D., 2011. http://www.davidbordwell.net/blog/2011/02/14/watching-you-watch-there-will-be-blood/.

    BOURLON, C., OLIVIERO, B., WATTIEZ, N., POUGET, P., AND BARTOLOMEO, P. 2011. Visual mental imagery: What the head's eye tells the mind's eye. Brain Research 1367, 287–297.

    BRUNS, E., BROMBACH, B., ZEIDLER, T., AND BIMBER, O. 2007. Enabling mobile phones to support large-scale museum guidance. IEEE MultiMedia 14 (April), 16–25.

    CARMIGNIANI, J., FURHT, B., ANISETTI, M., CERAVOLO, P., DAMIANI, E., AND IVKOVIC, M. 2011. Augmented reality technologies, systems and applications. Multimedia Tools Appl. 51 (January), 341–377.

    CHEN, X., AND KOSKELA, M. 2011. Mobile visual search from dynamic image databases. In Proceedings of the 17th Scandinavian Conference on Image Analysis, Springer-Verlag, Berlin, Heidelberg, SCIA '11, 196–205.

    CHEN, N.-S., TENG, D. C.-E., LEE, C.-H., AND KINSHUK. 2011. Augmenting paper-based reading activity with direct access to digital materials and scaffolded questioning. Comput. Educ. 57 (September), 1705–1715.

    CHOU, S.-C., HSIEH, W.-T., GANDON, F. L., AND SADEH, N. M. 2005. Semantic web technologies for context-aware museum tour guide applications. In Proceedings of the 19th International Conference on Advanced Information Networking and Applications - Volume 2, IEEE Computer Society, Washington, DC, USA, AINA '05, 709–714.

    CHOUDARY, O., CHARVILLAT, V., GRIGORAS, R., AND GURDJOS, P. 2009. MARCH: mobile augmented reality for cultural heritage. In Proceedings of the 17th ACM International Conference on Multimedia, ACM, New York, NY, USA, MM '09, 1023–1024.

    COLE, F., DECARLO, D., FINKELSTEIN, A., KIN, K., MORLEY, K., AND SANTELLA, A. 2006. Directing gaze in 3D models with stylized focus. Eurographics Symposium on Rendering (June), 377–387.

    DAMALA, A., CUBAUD, P., BATIONO, A., HOULIER, P., AND MARCHAL, I. 2008. Bridging the gap between the digital and the physical: design and evaluation of a mobile augmented reality guide for the museum visit. In Proceedings of the 3rd International Conference on Digital Interactive Media in Entertainment and Arts, ACM, New York, NY, USA, DIMEA '08, 120–127.

    DECARLO, D., AND SANTELLA, A. 2002. Stylization and abstraction of photographs. ACM Trans. Graph. 21 (July), 769–776.

    DODGE, R., AND CLINE, T., 1901. The angle velocity of eye movements.

    DODGE, R., 1900. Visual perception during eye movement.

    DUCHOWSKI, A. T. 2003. Eye Tracking Methodology: Theory and Practice. Springer-Verlag New York, Inc., Secaucus, NJ, USA.

    EL SAYED, N. A. M., ZAYED, H. H., AND SHARAWY, M. I. 2011. ARSC: Augmented reality student card. Comput. Educ. 56 (May), 1045–1061.

    FENG, Z., DUH, H. B. L., AND BILLINGHURST, M. 2008. Trends in augmented reality tracking, interaction and display: A review of ten years of ISMAR. In Proceedings of the 7th IEEE/ACM International Symposium on Mixed and Augmented Reality, IEEE Computer Society, Washington, DC, USA, ISMAR '08, 193–202.

    GWILT, I. 2009. Augmented reality and mobile art. In Handbook of Multimedia for Digital Entertainment and Arts, B. Furht, Ed. Springer US, 593–599.

    HYONA, J., RADACH, R., AND DEUBEL, H. 2003. The Mind's Eye: Cognitive and Applied Aspects of Eye Movement Research. North-Holland, Boston.

    ITTI, L., AND KOCH, C. 2000. A saliency-based search mechanism for overt and covert shifts of visual attention. Vision Research 40, 10-12 (May), 1489–1506.

    ITTI, L., AND KOCH, C. 2001. Computational modelling of visual attention. Nature Reviews Neuroscience 2, 3 (Mar), 194–203.

    KAUFMANN, H., AND MEYER, B. 2008. Simulating educational physical experiments in augmented reality. In ACM SIGGRAPH ASIA 2008 Educators Programme, ACM, New York, NY, USA, SIGGRAPH Asia '08, 3:1–3:8.

    KAUFMANN, H., AND SCHMALSTIEG, D. 2002. Mathematics and geometry education with collaborative augmented reality. In ACM SIGGRAPH 2002 Conference Abstracts and Applications, ACM, New York, NY, USA, SIGGRAPH '02, 37–41.

    LANE, N., MILUZZO, E., PEEBLES, D., CHOUDHURY, T., AND CAMPBELL, A. T. 2010. A survey of mobile phone sensing.

    LEVOY, M., AND WHITAKER, R. 1990. Gaze-directed volume rendering. In Proceedings of the 1990 Symposium on Interactive 3D Graphics, ACM, New York, NY, USA, I3D '90, 217–223.

    MCNAMARA, A., BAILEY, R., AND GRIMM, C. 2008. Improving search task performance using subtle gaze direction. In Proceedings of the 5th Symposium on Applied Perception in Graphics and Visualization, ACM, New York, NY, USA, APGV '08, 51–56.

    MCNAMARA, A., BAILEY, R., AND GRIMM, C. 2009. Search task performance using subtle gaze direction with the presence of distractions. ACM Trans. Appl. Percept. 6 (September), 17:1–17:19.

    MEDICHERLA, P. S., CHANG, G., AND MORREALE, P. 2010. Visualization for increased understanding and learning using augmented reality. In Proceedings of the International Conference on Multimedia Information Retrieval, ACM, New York, NY, USA, MIR '10, 441–444.

    OPENCV, 2011. http://opencv.willowgarage.com/wiki/.

    OPENSURF, 2011. http://www.chrisevansdev.com.

    SIELHORST, T., FEUERSTEIN, M., AND NAVAB, N. 2008. Advanced medical displays: A literature review of augmented reality. J. Display Technol. 4, 4 (Dec), 451–467.

    SRINIVASAN, R., BOAST, R., FURNER, J., AND BECVAR, K. M. 2009. Digital museums and diverse cultural knowledges: Moving past the traditional catalog. The Information Society 25 (July), 265–278.

    TAKACS, G., CHANDRASEKHAR, V., GELFAND, N., XIONG, Y., CHEN, W.-C., BISMPIGIANNIS, T., GRZESZCZUK, R., PULLI, K., AND GIROD, B. 2008. Outdoors augmented reality on mobile phone using loxel-based visual feature organization. In Proceedings of the 1st ACM International Conference on Multimedia Information Retrieval, ACM, New York, NY, USA, MIR '08, 427–434.

    TUMLER, J., DOIL, F., MECKE, R., PAUL, G., SCHENK, M., PFISTER, E. A., HUCKAUF, A., BOCKELMANN, I., AND ROGGENTIN, A. 2008. Mobile augmented reality in industrial applications: Approaches for solution of user-related issues. In Proceedings of the 7th IEEE/ACM International Symposium on Mixed and Augmented Reality, IEEE Computer Society, Washington, DC, USA, ISMAR '08, 87–90.

    UNDERWOOD, G., AND FOULSHAM, T. 2006. Visual saliency and semantic incongruency influence eye movements when inspecting pictures. Q J Exp Psychol (Colchester) 59, 11 (Nov.), 1931–1949.

    UNDERWOOD, J., TEMPLEMAN, E., AND UNDERWOOD, G. 2009. Attention in Cognitive Systems. Springer-Verlag, Berlin, Heidelberg, ch. Conspicuity and Congruity in Change Detection, 85–97.

    VEAS, E. E., MENDEZ, E., FEINER, S. K., AND SCHMALSTIEG, D. 2011. Directing attention and influencing memory with visual saliency modulation. In Proceedings of the 2011 Annual Conference on Human Factors in Computing Systems, ACM, New York, NY, USA, CHI '11, 1471–1480.

    YARBUS, A. 1967. Eye Movements and Vision. Plenum Press, New York.
