Post on 23-Feb-2022
Evaluation in Human-Computer Interaction: Beyond Lab Studies
CHI 2021 Tutorial
Albrecht Schmidt, LMU Munich
Florian Alt, Bundeswehr University Munich
Ville Mäkelä, LMU Munich / Bundeswehr University Munich
Organizers
Albrecht Schmidt, LMU Munich, albrecht.schmidt@ifi.lmu.de
Florian Alt, Bundeswehr University Munich, florian.alt@unibw.de
Ville Mäkelä, LMU Munich, ville.maekelae@ifi.lmu.de; Bundeswehr University Munich, ville.maekelae@unibw.de
Course Outline
● Introductions
● Motivation
● Overview of Approaches to Out-of-Lab Studies
● Discussion
● Deep Dive
● Conclusion
Introductions - Breakout Groups
● Breakout groups of 5 people
● 5 minutes
● Each of you discuss two of your own studies:
○ One you ran within the last year
○ One you wanted to run, but didn’t/couldn’t
● Does not need to be resolved fully
… go and talk to each other in the break after the tutorial :-)
Motivation
● Does research need to be validated?
● Why do we evaluate?
● What is a good evaluation?
● Why is it difficult to change to an alternative evaluation?
Challenges and Opportunities
● Reproducibility - but we do not value replication (almost no replication papers are published)
● Internal vs external validity
● Long-term effects
Freely add your answers, ideas, and comments:
https://docs.google.com/document/d/18rbJORJnEepVF4CwyWaq6ZX-ocqjW_g3oY7SiH2R3d4/edit
Using Existing Stuff
● Use existing data sets
Example: [Khamis et al., 2018]
Search engine: https://datasetsearch.research.google.com/
● Use (previously collected) log data
Examples: [Henze et al., 2013], [Alt et al., 2020]
● Mine existing information
Example: [Mäkelä and Schmidt, 2020]
● Meta studies, synthesis of research results, and literature research
Examples: [Buschek et al., 2018], [Katsini et al., 2020]
[Alt et al., 2020] Alt, Buschek, Heuss & Müller. 2021. Orbuculum -
Predicting When Users Intend to Leave Large Public Displays. In Proc.
IMWUT.
[Buschek et al., 2018] Buschek, Hassib & Alt. 2018. Personal Mobile
Messaging in Context: Chat Augmentations for Expressiveness and
Awareness. In ToCHI.
[Henze et al., 2013] Henze, Sahami, Schmidt, Pielot, Michahelles. 2013.
Empirical Research through Ubiquitous Data Collection. In IEEE
Computer.
[Mäkelä and Schmidt, 2020] Mäkelä & Schmidt. 2020. I Don’t Care as
Long as It’s Good: Player Preferences for Real-Time and Turn-Based
Combat Systems in Computer RPGs. In Proc. CHI PLAY ‘20.
[Katsini et al., 2020] Katsini, Abdrabou, Raptidis, Khamis & Alt. 2020. The Role
of Eye Gaze in Security and Privacy Applications: Survey and Future
HCI Research Directions. In Proc. CHI ‘20.
[Khamis et al., 2018] Khamis, Baier, Henze, Alt & Bulling. 2018.
Understanding Face and Eye Visibility in Front-Facing Cameras of
Smartphones used in the Wild. In Proc. CHI ‘18.
Web and App Usage
● Create applications / web pages that
implement your study / evaluation
○ Example: [von Zezschwitz et al., 2016]
○ Use “interactive” survey tools (e.g., SoSci)
○ Distribution and Recruiting Channels
■ Crowdsourcing (e.g., MTurk, Prolific, etc.)
■ App Stores [Schneegass et al., 2014]
● Piggyback experiment into an app / web page
○ Example: ResearchIME [Buschek et al., 2018]
○ How to: [Henze et al., 2013]
[Buschek et al., 2018] Buschek, Bisinger & Alt. 2018. ResearchIME: A
Mobile Keyboard Application for Studying Free Typing Behaviour in
the Wild. In Proc. CHI '18.
[Henze et al., 2013] Henze, Sahami, Schmidt, Pielot, Michahelles.
2013. Empirical Research through Ubiquitous Data Collection. In
IEEE Computer.
[Schneegass et al., 2014] Schneegass, Steimle, Bulling, Alt & Schmidt.
2014. SmudgeSafe: geometric image transformations for
smudge-resistant user authentication. In Proc. UbiComp '14.
[von Zezschwitz et al., 2016] von Zezschwitz, Eiband, Buschek,
Oberhuber, De Luca, Alt, & Hussmann. 2016. On quantifying the
effective password space of grid-based unlock gestures. In Proc.
MUM '16.
Users at Home
● Engage with users through remote
communication
○ [Fröhlich et al., 2021]
○ [Rivu et al., 2021, case study 2]
● Create prototypes that can be experienced
remotely
○ [Rivu et al., 2021, case study 1]
● Supply study equipment to your users at home
○ [Prange et al., 2019]
○ [Bramley et al., 2018]
[Fröhlich et al., 2021] Fröhlich, Wagenhaus, Schmidt & Alt. 2021.
Don’t Stop Me Now! Exploring Challenges of First-Time
Cryptocurrency Users. In Proc. DIS ‘21 (to appear).
[Prange et al., 2019] Prange, Tiefenau, von Zezschwitz, Alt.
Towards Understanding User interaction in Future Smart
Homes. In Proc. CHI EA ’19.
[Rivu et al., 2021] Rivu, Mäkelä, Prange, Delgado Rodriguez,
Piening, Zhou, Köhle, Pfeuffer, Abdelrahman, Hoppe,
Schmidt, & Alt. 2021. Remote VR Studies: A Framework for
Running Virtual Reality Studies Remotely Via
Participant-Owned HMDs. arXiv preprint.
[Bramley et al., 2018] Bramley, Goode, Anderson, & Mary.
2018. Researching in-store, at home: Using virtual reality
within quantitative surveys. International Journal of Market Research 60, 4 (2018), 344–351.
New Approaches to Evaluation
● Run studies in virtual reality
● Use analytic methods (e.g. KLM)
● Do computational evaluation, i.e. prove that your design is better
(e.g. keyboard optimization, menu layout)
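As a rough illustration of the analytic route, a Keystroke-Level Model (KLM) estimate sums operator times over the steps of a task. The sketch below assumes commonly cited operator values from the KLM literature; the operator sequences for the two designs and the function name `klm_estimate` are hypothetical and serve only as an example.

```python
# Commonly cited KLM operator times in seconds.
# These are assumed defaults; calibrate them for your own user group.
KLM_SECONDS = {
    "K": 0.28,  # keystroke or button press (average typist)
    "P": 1.10,  # point with a mouse to a target
    "H": 0.40,  # home hands between keyboard and mouse
    "M": 1.35,  # mental preparation
}


def klm_estimate(sequence: str) -> float:
    """Return the predicted expert task time for a string of KLM operators."""
    return sum(KLM_SECONDS[op] for op in sequence)


# Hypothetical operator sequences for two designs of the same task.
menu_design = "HMPKMPK"   # reach for the mouse, open a menu, pick an entry
shortcut_design = "MKK"   # recall and press a two-key shortcut

if __name__ == "__main__":
    print(f"menu:     {klm_estimate(menu_design):.2f} s")
    print(f"shortcut: {klm_estimate(shortcut_design):.2f} s")
```

Such an estimate only covers error-free expert performance, so it complements a user study rather than replacing one.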
Mäkelä et al. 2020. Virtual Field Studies: Conducting Studies on Public Displays in Virtual Reality. In Proc. CHI ‘20.
Schneegaß et al. 2011. Support for modeling interaction with automotive user interfaces. In Proc. AutomotiveUI ‘11.
Appropriate Your Research Questions
● Study phenomena that happen online
○ study Facebook behavior, fake news
● Study HCI in the home and in home office using remote methods
● Change to technical evaluation rather than working with users
https://thomaskosch.com/wp-content/papercite-data/pdf/hoppe2021odins.pdf
Matthias Hoppe, Daria Oskina, Albrecht Schmidt, and
Thomas Kosch. 2021. Odin’s Helmet: A Head-Worn Haptic
Feedback Device to Simulate G-Forces on the Human Body
in Virtual Reality. Proc. ACM Hum.-Comput. Interact. 5,
EICS, Article 212 (June 2021), 15 pages.
https://doi.org/10.1145/3461734
Breakout Groups
● Topic per group: a specific approach (name of your breakout group)
○ What are the positive and negative aspects in contrast to lab studies? (report this in 30 secs)
○ Do you have examples: how was this used? how could it be used? where should it not be used?
● Topics:
○ Using Existing Stuff (don’t generate the data yourself - use data that is out there, incl. literature)
○ Web and App Usage (piggyback your experiment in a smartphone app / web site)
○ Users at Home (send equipment to people or use what they have at home)
○ New Approaches to Evaluation (just invent an evaluation method that works for your setting)
○ Appropriate Your Research Questions (change your research to make it fit what you can do)
● Google Doc for collecting research examples: https://docs.google.com/document/d/18rbJORJnEepVF4CwyWaq6ZX-ocqjW_g3oY7SiH2R3d4/edit
Discussion
● Summarize your discussion in 30 seconds
○ What are the positive and negative aspects in contrast to lab studies?
● Do you have examples: how was this used? how could it be used? where
should it not be used? Add them here:
○ https://docs.google.com/document/d/18rbJORJnEepVF4CwyWaq6ZX-ocqjW_g3oY7SiH2R3d4/edit
Breakout Groups
● Discuss in groups: How will reviewing need to change if we do more out-of-the-lab evaluations?
○ What would we have to report that we don’t report now?
○ What criteria should be added for judging papers?
○ How to ensure reproducibility?
○ What different insights would papers begin to generate?
○ What do we want to keep, what do we want to throw away after COVID?
● Google Doc for collecting research examples: https://docs.google.com/document/d/18rbJORJnEepVF4CwyWaq6ZX-ocqjW_g3oY7SiH2R3d4/edit
Discussion
● Summarize your points for each question in 30 seconds
○ What would we have to report that we don’t report now?
○ What criteria should be added for judging papers?
○ How to ensure reproducibility?
○ What different insights would papers begin to generate?
○ What do we want to keep, what do we want to throw away after COVID?
Deep Dive
● VR studies
○ Simulation studies
○ Remote studies
● Use data that’s “out there”
● Large-scale piggyback
● Technical evaluation (speed, forces)
VR Studies
● Simulation studies
○ Studies where we utilize VR as a research testbed; we study phenomena that exist outside
of VR, but we study them in VR
● Remote studies
○ Studies involving VR technologies - for whatever purpose - can be conducted remotely,
without having users come to the lab
○ The easiest way is to recruit people who already own the necessary hardware, such as a
VR head-mounted display (HMD), and have them participate using their own setup
Simulation of Studies in VR
● Can we simulate user studies in virtual reality?
○ Yes (sort of)
● Promising results from several studies, in which a real-world study and an identical VR study produced comparable results
Mäkelä et al. 2020. Virtual Field Studies: Conducting Studies on Public Displays in Virtual Reality. In Proc. CHI ‘20.
Rivu et al. 2021. Exploring Emotions and Emotion Elicitation Techniques in Virtual Reality. In Proc. INTERACT ‘21 (to appear).
Remote VR Studies
● Can we recruit people who own the necessary VR hardware, and have them participate from home?
● Yes! And there are many ways to do it
Rivu et al. 2021. When Friends become Strangers: Understanding the Influence of Avatar Gender On Interpersonal Distance Between Friends in Virtual Reality. In Proc. INTERACT ‘21 (to appear).
Rivu et al. 2021. Remote VR Studies: A Framework for Running Virtual Reality Studies Remotely Via Participant-Owned HMDs. arXiv preprint.
Saffo et al. 2021. Remote and Collaborative Virtual Reality Experiments via Social VR Platforms. In Proc. CHI ‘21.
Use Data That’s Out There
● Example: analysis of online discussions
about video game-related preferences
○ We gathered relevant discussion threads
from several websites, resulting in 546 total
posts from 17 discussion threads and eight
different websites.
○ Thematic analysis over multiple rounds
Mäkelä & Schmidt. 2020. I Don’t Care as Long as It’s Good: Player Preferences for Real-Time and Turn-Based Combat Systems in Computer RPGs. In Proc. CHI PLAY ‘20.
Use Data That’s Out There
Online discussions are:
● Authentic
● Insightful in unexpected ways
● Messy and unstructured
● Little to no background information
available
Mäkelä & Schmidt. 2020. I Don’t Care as Long as It’s Good: Player Preferences for Real-Time and Turn-Based Combat Systems in Computer RPGs. In Proc. CHI PLAY ‘20.
Large-Scale Piggyback - Example
● Investigating typing behavior on smartphones
● Applications: adaptive user interfaces, security
● Approach
○ Android keyboard app
○ Logging filter
○ 3-week field study
○ 6 million events
[Buschek et al., 2018] Buschek, Bisinger & Alt. 2018.
ResearchIME: A Mobile Keyboard Application for Studying
Free Typing Behaviour in the Wild. In Proc. CHI '18.
Large-Scale Piggyback - How to
● Identify research goals
● Select study method (relational or experimental)
● Devise an incentive mechanism
● Choose target platform
● Design and develop app
● Prepare data collection
● Implement scheme to obtain informed consent
● Distribute and promote app
● Monitor data collection
● Filter and analyze data to answer research question
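The consent and filtering steps in this list can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the procedure from [Henze et al., 2013]: the `LogEvent` record, the `filter_events` function, and the `min_events` threshold are hypothetical names and choices.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class LogEvent:
    """One logged interaction event (hypothetical schema)."""
    participant_id: str
    timestamp: float   # seconds since the epoch
    event_type: str    # e.g. "key_down"


def filter_events(events, consented_ids, min_events=2):
    """Keep events from consenting participants who logged at least min_events events."""
    # Drop everything from participants without recorded informed consent.
    consented = [e for e in events if e.participant_id in consented_ids]
    # Drop participants with too little data to analyze meaningfully.
    counts = Counter(e.participant_id for e in consented)
    return [e for e in consented if counts[e.participant_id] >= min_events]


if __name__ == "__main__":
    events = [
        LogEvent("p1", 0.0, "key_down"),
        LogEvent("p1", 0.1, "key_up"),
        LogEvent("p2", 0.2, "key_down"),
        LogEvent("p3", 0.3, "key_down"),
    ]
    kept = filter_events(events, consented_ids={"p1", "p2"})
    print(len(kept), "events kept")
```

Filtering on consent before any other processing keeps non-consented data out of every later analysis step.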
[Henze et al., 2013] Henze, Sahami, Schmidt, Pielot,
Michahelles. 2013. Empirical Research through Ubiquitous
Data Collection. In IEEE Computer.
Technical Evaluation
● Perform technical measurements
● Measure parameters that do not directly depend on a person using the system, device, or application, e.g. bandwidth requirements, presentation delay, forces experienced
● Compare technical features of your work to previously published work and show that your solution is “better” with regard to specific parameters
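A delay measurement of this kind could look like the sketch below. It assumes the operation under test can be called as a Python function; `measure_latencies`, `summarize`, and the placeholder workload are hypothetical and only illustrate the idea of reporting robust summary statistics rather than a single run.

```python
import math
import statistics
import time


def measure_latencies(operation, repetitions=100):
    """Call operation repeatedly and return each call's duration in milliseconds."""
    samples = []
    for _ in range(repetitions):
        start = time.perf_counter()
        operation()
        samples.append((time.perf_counter() - start) * 1000.0)
    return samples


def summarize(samples_ms):
    """Return the median and the nearest-rank 95th percentile of a non-empty sample list."""
    ordered = sorted(samples_ms)
    rank = math.ceil(0.95 * len(ordered))  # nearest-rank percentile, 1-based
    return {"median_ms": statistics.median(ordered), "p95_ms": ordered[rank - 1]}


if __name__ == "__main__":
    # Placeholder workload; replace with the operation you want to evaluate.
    samples = measure_latencies(lambda: sum(range(10_000)))
    print(summarize(samples))
```

Reporting the median together with a high percentile makes the comparison with previously published numbers less sensitive to occasional outliers.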
https://thomaskosch.com/wp-content/papercite-data/pdf/hoppe2021odins.pdf
Matthias Hoppe, Daria Oskina, Albrecht Schmidt, and Thomas Kosch. 2021. Odin’s Helmet: A Head-Worn Haptic Feedback Device to Simulate G-Forces on the Human Body in Virtual Reality. Proc. ACM Hum.-Comput. Interact. 5, EICS, Article 212 (June 2021), 15 pages. https://doi.org/10.1145/3461734
Mathematical Modelling
● Modelling Users and Interaction
Fischer, Florian; Fleig, Arthur; Klar, Markus; Grüne, Lars; Müller, Jörg. An Optimal Control Model of Mouse Pointing Using the LQR. Bayreuth, 2020. https://arxiv.org/pdf/2002.11596.pdf
Seinfeld, Sofia; Feuchtner, Tiare; Maselli, Antonella; Müller, Jörg. User Representations in Human-Computer Interaction. In Human–Computer Interaction (2020), pages 1-39. doi:10.1080/07370024.2020.1724790 ...
Computational Optimization
● Use optimization techniques to improve systems
● Prove that your approach/design is better with regard to an objective function
● Example: optimize key assignment, adaptive and predictive keyboards
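To make the idea of an objective function concrete, the toy sketch below assigns four letters to a single row of keys and searches all assignments for the one with the lowest cost. Everything here is an assumption for illustration: the bigram frequencies are made up, the cost (frequency times key distance) is a simplification, and `layout_cost` and `best_layout` are hypothetical names. Exhaustive search is only feasible for a handful of keys; larger problems typically need heuristic search.

```python
from itertools import permutations


def layout_cost(layout, bigram_freq):
    """Objective: sum over bigrams of frequency times distance between the two keys."""
    position = {letter: index for index, letter in enumerate(layout)}
    return sum(
        freq * abs(position[a] - position[b])
        for (a, b), freq in bigram_freq.items()
    )


def best_layout(letters, bigram_freq):
    """Exhaustively search all key assignments and return one with minimal cost."""
    return min(permutations(letters), key=lambda layout: layout_cost(layout, bigram_freq))


# Made-up bigram frequencies for a four-letter alphabet.
TOY_BIGRAMS = {("t", "h"): 5, ("h", "e"): 4, ("e", "r"): 2, ("t", "r"): 1}

if __name__ == "__main__":
    baseline = tuple("hter")
    optimized = best_layout("ther", TOY_BIGRAMS)
    print("baseline cost: ", layout_cost(baseline, TOY_BIGRAMS))
    print("optimized cost:", layout_cost(optimized, TOY_BIGRAMS))
```

Because the search covers every assignment, the result is provably optimal for this objective, which is the sense in which a computational evaluation can show that one design is better than another.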
Daniel Buschek. Intelligent Text Entry - Adaptive and predictive keyboards. Lecture in the Intelligent User Interfaces course. 2021. https://iui-lecture.org/