

MeetingCoach: An Intelligent Dashboard for Supporting Effective & Inclusive Meetings

Samiha Samrose1, Daniel McDuff2, Robert Sim2, Jina Suh2, Kael Rowan2, Javier Hernandez2, Sean Rintel2, Kevin Moynihan2, Mary Czerwinski2

1University of Rochester, 2Microsoft Research
[email protected], {damcduff,rsim,jinasuh,kael.rowan,javierh,serintel,kevinmo,marycz}@microsoft.com

ABSTRACT
Video-conferencing is essential for many companies, but its limitations in conveying social cues can lead to ineffective meetings. We present MeetingCoach, an intelligent post-meeting feedback dashboard that summarizes contextual and behavioral meeting information. Through an exploratory survey (N=120), we identified important signals (e.g., turn taking, sentiment) and used these insights to create a wireframe dashboard. The design was evaluated with in situ participants (N=16) who helped identify the components they would prefer in a post-meeting dashboard. After recording video-conferencing meetings of eight teams over four weeks, we developed an AI system to quantify the meeting features and created personalized dashboards for each participant. Through interviews and surveys (N=23), we found that reviewing the dashboard helped improve attendees’ awareness of meeting dynamics, with implications for improved effectiveness and inclusivity. Based on our findings, we provide suggestions for future feedback system designs of video-conferencing meetings.

CCS CONCEPTS
• Human-centered computing → Empirical studies in collaborative and social computing.

KEYWORDS
group, feedback, meeting, sensing, video-conferencing

ACM Reference Format:
Samiha Samrose, Daniel McDuff, Robert Sim, Jina Suh, Kael Rowan, Javier Hernandez, Sean Rintel, Kevin Moynihan, Mary Czerwinski. 2021. MeetingCoach: An Intelligent Dashboard for Supporting Effective & Inclusive Meetings. In CHI Conference on Human Factors in Computing Systems (CHI ’21), May 8–13, 2021, Yokohama, Japan. ACM, New York, NY, USA, 13 pages. https://doi.org/10.1145/3411764.3445615

1 INTRODUCTION
Meetings are a primary mode of work [58], but many employees find them frustrating and even counter-productive when good meeting practices are lacking or violated [1, 46]. Violations of general meeting norms and disrespectful behaviors have been shown to be negatively correlated with meeting effectiveness and satisfaction [45].


In 2020, the “stay at home” orders and travel restrictions of the COVID-19 pandemic dramatically accelerated the use of video-conferencing meetings for work. By March 2020, daily usage of video-conferencing services such as Zoom and Microsoft Teams had increased by 300% and 775% respectively, and video-conferencing apps jumped to the top of the Apple app store.1 Although video-conferencing has the potential to reduce the cost and effort of organizing travel, space, and scheduling for in-person meetings2, the “fractured ecologies” [37] of video-conferencing can aggravate negative outcomes and marginalize some members of the meeting [38]. Video-conferencing has consistently presented communicative challenges [19, 52] and introduced distractions [61] which can lead to ineffective and non-inclusive meetings [23, 30]. A primary goal of remote collaboration tools should be to support the most effective meetings possible for all participants. Cutler et al. [9] conducted a large-scale e-mail survey on remote meeting effectiveness (N=4,425) at a technology company (pre-COVID-19) and showed that online meeting effectiveness correlates with meeting inclusiveness, participation, and comfort in contributing. They identify a large potential financial benefit to companies that can achieve these goals, and an opportunity to establish and maintain a workplace culture in which everyone feels free to contribute. There are clearly opportunities for technological solutions that assist attendees in feeling more included and improving meeting effectiveness by helping them understand their own and others’ core meeting dynamics.

This paper reports on an exploratory study with in situ work teams to identify the challenges they face in video-conferencing based meetings, and proposes a post-meeting feedback system to address the issues. In particular, we aimed to provide insights on the following research questions: (1) What aspects of meetings do video-conferencing attendees need help with? (2) How can we leverage AI systems to make video-conferencing meetings more inclusive and effective? (3) How should AI-extracted meeting features (including content, behavioral measurements, and sentiment) be categorized and visualized in a feedback dashboard? and (4) What concerns exist regarding data privacy and accuracy for such systems?

Our work addresses these research questions through a series of user studies and design iterations. Via an initial exploratory requirement analysis survey of 120 information workers, we identified the communicative signals (e.g., participation, facial sentiment) which are important in improving meeting effectiveness and inclusion.

1 https://www.marketwatch.com/story/zoom-microsoft-cloud-usage-are-rocketing-during-coronavirus-pandemic-new-data-show-2020-03-30

2 https://financesonline.com/video-web-conferencing-statistics/


We conducted a longitudinal user study to record in situ video-conferencing meetings from eight teams over a four-week period. We used the insights from the requirement analysis survey to create a wireframe prototype of a post-meeting dashboard, and 16 participants from the user study teams evaluated the design and helped us further refine the components. Finally, we developed an AI sensing system to quantify these meeting dynamics features and created personalized, interactive, post-meeting dashboards for the participants. Through surveys (N=23) and interviews (N=9), we found that participants were able to become more aware of the group dynamics by reviewing the dashboard. Our study shed light on the privacy concerns participants had regarding such insights within the dashboard. The dashboard also helped participants identify and recollect important events of past meetings. Our findings also showed that participants perceived that the dashboard would improve meeting effectiveness and inclusivity. In addition, participants expressed that actionable suggestions were more helpful than data visualizations alone. The main contributions of this work are as follows:

• We developed MeetingCoach, an AI-driven dashboard that provides both contextual and actionable insights based on meeting behaviors.

• We implemented both behavioral (e.g., participation) and topical (e.g., questions) meeting dynamics features in our feedback system using state-of-the-art AI.

• We identified and implemented shared and private design approaches for different feature components based on users’ preferences from two design iterations and evaluations.

• We demonstrated that MeetingCoach helped improve behavioral awareness and recollection of past meeting events, and bears the potential to improve perceived effectiveness and inclusivity.

• We proposed design guidelines explaining the need for actionable suggestions, reminders or highlights based on timing, and multi-modal feature implementations to be adopted in future video-conferencing feedback systems.

2 RELATED WORK

2.1 Factors in Meeting Dynamics
Meeting effectiveness includes both task processing and interaction efficiency by a team [10, 14, 31, 47, 53]. Dickinson and McIntyre [10] emphasized the importance of goal specification, entailing identification and prioritization of tasks and sub-tasks, in agendas and other meeting resources. Even with clear goals, though, interaction efficiency has a clear impact on both outcomes and satisfaction. Balanced, active, and equal participation has been found to improve team performance [13, 32]. Depending on the type of meeting, equal participation may not always be applicable or feasible, but in a collaborative decision-making discussion, participation from all members ensures at least the exchange of opinions and a sense of “being heard”, which ultimately improves team satisfaction [32, 50]. Turn-taking patterns also influence team performance and satisfaction, as some members may dominate the discussion without realizing they are doing so, reducing time for other members to voice their opinion or expertise [14]. Lawford [31] found that rapport building through verbal and non-verbal signals was an important factor in effective and inclusive discussions.

To ensure coordination and rapport, affect management has been found to play an important role in a team’s success [4, 8]. Barsade [4] showed that a member’s positivity can improve the mood of the whole team, making it more inclusive and improving the quality of decision-making. Cannon-Bowers et al. [8] discussed the importance of effective strategy formulation to consider alternative courses of action in case of disagreement or task failure. Non-verbal gestures, such as head nodding and shaking, indicate signs of agreement or disagreement, and levels of interest, acknowledgement, or understanding [22, 34].

While the face-to-face views of video-conferencing intuitively seem to support the above, they have been found to constrain attention to non-verbal signals and the overall progress of the meeting [19, 23]. Sellen [51] showed that having video did not improve the interruption or turn-taking rate for video-conferencing meeting participants compared to audio-only ones. This implies that even though video is important in online meetings, it cannot fully resemble in-person meeting dynamics. Especially during long meetings, additional support for monitoring meeting progress and participation may be needed. Taking these concerns into consideration, we designed and developed an automated feedback system to summarize meeting content and attendee behaviors, with the goal of improving meeting dynamics over time.

2.2 Feedback Systems for Videoconference Meetings

Researchers have demonstrated the impact of feedback systems on meeting dynamics and discussion outcomes for in-person, text chat, and video-conferencing meeting setups [7, 11, 24, 27, 35, 43, 48, 56]. Feedback on participation [48], turn-taking [13], interruption [49], agreement [25], and valence [15] has effectively improved group discussion dynamics. These studies show that the timing (e.g., real-time, post-meeting) and the design (e.g., number of features, visualization strategies) of the feedback are important in effectively modulating collaboration behaviors in a group discussion.

Researchers have explored feedback systems with affective, behavioral, and topical group discussion features. Dimicco et al. [11] presented a real-time, shared-display feedback system measuring speaking contributions from audio recordings, visualized as bar graphs during a co-located meeting. They showed that effective visualization can help improve the group discussion, even though shared visualization can also motivate behavioral changes due to social pressure [39]. Nowak et al. [44] provided feedback on vocal arousal and explored the impact on one’s own and one’s partner’s behavior during a negotiation-based task conducted over the phone. They found that real-time feedback can be difficult to process during an ongoing task and can negatively impact user performance.

Therefore, even though real-time feedback has been found to be effective in modulating behaviors during a discussion, it can also be distracting and cognitively demanding [54, 56]. Samrose et al. [48] presented CoCo, an automated post-meeting feedback system providing summarized feedback on talk-time, turn-taking, speech overlap, and sentiment through a chatbot for video-conference group discussions. They showed that post-meeting feedback can effectively make successive discussions more balanced. Through a longitudinal study in a video-conference learning environment, EMODASH [15], an interactive dashboard providing feedback on affective meeting features, improved behavioral awareness over time.


Instead of showing numeric or categorical feedback on linguistic features, Tausczik and Pennebaker [56] provided real-time and individualized actionable suggestions in text chat group discussions. Their findings showed that individualized suggestions helped teams shape their behavior; however, too much information in the feedback can be cognitively taxing. Suggestion-oriented feedback has been found effective in behavior modulation [54], and could be useful post-meeting. As we will later explain, our design incorporated an individualized and suggestion-oriented approach to a post-meeting feedback system. While identifying the feedback features for our dashboard during our requirement analysis, we prioritized features from these prior systems, such as talking time and turn-taking, but with an eye toward reducing cognitive load.

Beyond the meeting context, researchers have developed a number of interfaces that allow users to view emotional or affective signals captured by self-report, diaries, or sensor-based systems. These have been used in several domains, such as self-reflection [21, 36, 41], data exploration and information retrieval [18, 59, 60], stress management [2, 20], and studying interpersonal dynamics [28, 29]. Data portraits can help people understand more about themselves and other people [12]. However, there is still a lot left to be understood about how to best represent complex and subjective data, such as emotions or group dynamics. Affective data is often highly dimensional, multi-modal, and continuous, all of which make designing useful visualizations difficult. There are also important privacy concerns raised by creating digital systems and artifacts that encode highly personal information [12].

The feedback needs of organizational teams conducting video-conferencing meetings require special attention, as these teams are relatively stable over time and the members need additional support in tracking the progress and the outcomes of their meetings [8, 26, 33]. As video-conferencing discussions can be prone to distraction and multitasking [30], we hypothesize that a meeting feedback system that helps members reflect on meeting goals and progress could be a useful tool. Feedback on non-verbal behaviors can also help make videoconference meetings effective and inclusive. The Matrix to a Model of Coordinated Action (MoCA) presented by Lee and Paine [33] is an elaborate framework that characterizes a complex collaborative environment along seven dimensions. Within the context of MoCA, the post-meeting feedback dashboard that we propose is characterized as a long-term asynchronous periodic event for small teams placed in different locations.

In this study, we observe the meeting challenges and needs of several in situ work teams, and propose and test technological solutions for them. Meetings have evolved from engaging in person, to fully computer-mediated (all members join remotely via audio/video/chat), and hybrid (some members join remotely to group(s) who are together in person), and each has its distinct character [17, 19, 58]. This study focuses on the fully computer-mediated meetings of teams in which all members were remote due to COVID-19’s mandatory requirement to work from home. All teams used the same video-conferencing system. We followed an iterative, human-centered design process to address our research questions. Our approach comprised two main phases. In the first phase, we performed a preliminary investigation via a survey to understand the current challenges and needs for online meetings, and gathered design requirements for our technology probe.

Informed by our findings from the preliminary study, in phase 2 we conducted two design iterations through a longitudinal study of actual remote, recurring team meetings. In the following sections, we describe the details of the requirements analysis and the longitudinal studies.

3 PHASE 1: REQUIREMENT ANALYSIS SURVEY STUDY

In the first phase of our work, we aimed to identify the challenges and the expected solutions through a survey study. Our goal was to understand how participation and inclusivity during meetings affect meeting effectiveness, what social signals are relevant in the context of meeting effectiveness, and what challenges people face in video-conferences. The findings from our survey were used to gather requirements and inform the design of our AI-assisted feedback system that we discuss later (Section 4). The survey was approved by the Microsoft Research Institutional Review Board (IRB).

3.1 Survey Design
The topics of our survey questions spanned the frequency of meetings, meeting effectiveness, challenges during meetings, and useful information and signals from meetings. We asked the participants to self-assess the meeting effectiveness, perceived inclusivity, and participation in the most recent effective and ineffective meeting that they had. We asked the participants to provide quantitative and qualitative feedback on the importance of a variety of information for meeting effectiveness, such as social signals, meeting summary, participation of the attendees (both time spent talking and turn-taking), tone of the meeting, etc. Participants were asked to reflect on when this information would be useful and how it could be presented to them (e.g., personalized view, highlight reel).

3.2 Participants
We recruited survey participants through an e-mail advertisement at a large technology company. A total of 120 completed survey responses were collected. While participating in the survey, participants were working from home and joining meetings via video-conferencing due to COVID-19 mandatory requirements. Participants reported working in the roles of program manager (25%), developer (23%), manager (17%), researcher (3%), and administrative assistant (1%). 58% of participants reported that they organized 1-5 meetings per week, and 40% of participants reported that they attended more than 15 meetings per week.

3.3 Analysis & Findings
We used a chi-square test on quantitative responses to investigate differences between effective and ineffective meetings. We also performed a thematic analysis [6] on open-ended responses to derive themes around topics of interest (e.g., challenges, information needs). We then categorized the responses into various themes and quantified their occurrence. Below, we highlight four findings from our analysis: (1) factors that influence meeting effectiveness, (2) challenges participants faced in online meetings, (3) solutions that participants proposed or desired, and (4) privacy- and trust-related concerns.
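To make the analysis style concrete, the sketch below shows how such a chi-square test of independence can be run in Python with SciPy. The contingency counts are hypothetical placeholders of our own; only the reported test statistics in this section come from the study.

```python
# A minimal sketch of the chi-square comparison described above,
# assuming hypothetical 2x2 contingency counts (the raw survey counts
# are not reported here; only chi2 and p values come from the study).
from scipy.stats import chi2_contingency

observed = [
    [95, 25],  # effective meetings: full participation vs. not (hypothetical)
    [35, 76],  # ineffective meetings: full participation vs. not (hypothetical)
]

chi2, p, dof, _expected = chi2_contingency(observed, correction=False)
n = sum(map(sum, observed))
print(f"chi2({dof}, N={n}) = {chi2:.2f}, p = {p:.2g}")
```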


Figure 1: Relationship between effective & ineffective meetings with respect to participation & inclusivity. (A) Full or equal participation prevailed more in effective meetings (χ²(1, N = 231) = 34.89, p < 0.001). (B) The feeling of inclusivity was more prominent in effective meetings (χ²(1, N = 240) = 21.78, p < 0.001).

3.3.1 Meeting Effectiveness & Inclusivity. From the meetings that participants reported as effective or ineffective, we compared the differences in self-assessed participation and feeling of being included. We found that full participation was significantly more prevalent in effective meetings than in ineffective ones (χ²(1, N = 231) = 34.89, p < 0.001). We also found that the feeling of being included was significantly more prevalent in effective meetings than in ineffective ones (χ²(1, N = 240) = 21.78, p < 0.001). Fig. 1 shows the comparison of effective and ineffective meetings across participation and inclusivity.

3.3.2 Challenges in Online Meetings. The top two challenges in online meetings that participants reported were related to connection issues (69.3%) and the meeting agenda, or lack thereof (64.91%), as can be seen in Fig. 2. Excluding tech issues, responses indicate that attendees faced difficulties with maintaining agenda items (64.91%), repetition (46.49%), reaching goals (30.7%), being heard (self: 28.95%, others: 27.19%), and negative tone (14.91%). A total of 78% of the participants reported understanding social signals to be very important in meetings, and 55% of the participants expressed that social signals are more difficult to understand in online meetings, even with audio/video, compared to face-to-face meetings.

Figure 2: Percentage of participants in our survey that reported experiencing challenges in workplace meetings. Connectivity issues: 69.3%; failure to stick to the agenda: 64.9%; repetitious discussion of topics: 46.5%; software issues: 35.1%; hardware issues: 32.5%; failure to reach the goals of the meeting: 30.7%; difficulties in communicating (myself): 29.0%; difficulties in communicating (other people): 27.2%; negative tone to discussions: 14.9%; other: 5.26%.

3.3.3 Proposed Solutions. Through thematic analysis of the responses, we identified five main categories of solutions brought up by the participants to address the aforementioned challenges. (1) Preparing for the meeting: 37% of participants wanted to have a clear agenda before the meeting and a summary of the goal/outcome after the meeting. (2) Ensuring productive meeting dynamics: 30% of participants mentioned specific features that could assist in maintaining productive meeting dynamics – 26% were participation- or turn-taking-related (e.g., attendees speaking as per their role in the agenda, who is participating and for how long, whether someone is getting a chance to speak), and others included having enough pauses for smoother turns, what questions are being asked, and what the agreement level is. (3) Improving the usability of the tools: 22% of participants asked to improve general hardware/software/usability-related components of the video-conferencing platform. (4) Conveying social cues: 14% of participants suggested keeping attendees’ cameras on to understand each other’s social cues. (5) Moderating the meeting: 8% of participants requested some form of effective moderation, of whom 4% brought up the need for a dashboard that summarizes the meeting events or notes, and the other 4% suggested having a meeting moderator who can keep the discussion on track and productive during the meeting. Nine out of ten participants expressed interest in a post-meeting feedback tool, with 56% asking for combined highlights of audio, video, and transcripts.

3.3.4 Privacy and Trust Concerns. Participants brought up concerns regarding with whom the meeting data would be shared. They did not want sentiment as a behavioral feature to be included in the post-meeting tool, for four reasons: (1) trust of AI: suspicion regarding how sentiment computation works, its accuracy, what the outcome would be if the computation is wrong, etc. (“I don’t feel that AI is good enough at sentiment.”); (2) privacy concerns: participants felt uncomfortable sharing their own or accessing others’ sentiment data (“I think it would be overly intrusive to have this data from other participants”); (3) applicability: participants were not certain how they would use sentiment information (“I’m not sure if I’d find that useful, or how.”); and (4) subjectivity: sentiment might be context-dependent, and a negative/neutral tone might not make a meeting ineffective (“I have some doubts on whether this can be done accurately, given how subjective some of this information would be. I would have to see an example.”). Still, people were intrigued to see how the real data would look (“As long as the intelligence could truly determine negativity or positivity it would be good documentation and a way to think about how to flip a negative meeting to the positive with the same group.”).

Overall, participants expressed challenges and opportunities regarding behavioral (e.g., participation, turn-taking) and contextual (e.g., agreement, questions) issues faced in online meetings. Their interest in a post-meeting tool with highlighted meeting events was also prominent. Based on these findings, we identified the features for and designed a post-meeting feedback tool, which we discuss in Section 4.


4 PHASE 2: LONGITUDINAL USER STUDY
The second phase of our work involved a 4-week longitudinal user study. We collected in situ meeting data to develop an AI system that extracts the salient and desired signals identified in our requirement study (Section 3). The collected data were used to conduct two design iterations and evaluations of a meeting feedback system with the participants, first with mocked data (Section 4.2) and second with real data from meetings captured during the user study (Section 4.3).

4.1 Longitudinal Study Design
Our longitudinal user study involved capturing audio and video streams from regularly scheduled weekly team meetings. We targeted recurring meetings of teams rather than one-off meetings because the members of recurring meetings have more opportunities to invest in improving their meeting effectiveness. We collected a separate audio-video stream from each participant of the meeting. We captured the meetings of in situ work teams to ensure that the data used for our system and the meeting feedback we provided were contextually appropriate and relevant. In addition, each member of the participating teams provided feedback on our initial wireframe design (Section 4.2) and evaluated their own personalized interactive dashboards (Section 4.3).

Participants were compensated $10 per meeting hour for taking part in the data collection, and an additional $10 for participating in an interview after the study. We recruited a total of 8 teams from the same company (49 participants, with an average of 6 participants per team (min = 3, max = 10)), who consented to us recording their weekly project meetings over 4 weeks. This second-phase study was also approved by the IRB. All teams conducted their meetings via the Microsoft Teams video-conferencing platform, closely mirroring their existing workplace meeting practices: at the time of the study, all meetings were conducted remotely due to the COVID-19 work-from-home policy. We recorded a total of 28 meetings with an average duration of 32 minutes (min = 11, max = 62).

4.2 Design and Evaluation of Meeting Feedback Wireframe

Our findings from Section 3 informed the initial design of the meeting feedback system. We first enumerated specific feedback features that could address the challenges and solutions expressed by our survey participants. We then designed a meeting feedback wireframe that included those feedback features. Using the prototype wireframe, we conducted a survey study with our longitudinal study participants at the end of the 2nd week of the 4-week longitudinal study. The goal of the survey was to evaluate the usability and effectiveness of each of these features and to inform which features or components should appear in the next design iteration. We collected a total of 16 completed evaluations from the participants. Here, we describe the details of the design of the meeting feedback wireframe and the results of the evaluation survey of our first design.

4.2.1 Design Strategies & Feature Definitions. As our previous requirement analysis survey study revealed the need for a post-meeting feedback tool with meeting highlights, we focused on designing a wireframe prototype around the meeting dynamics features. First, we categorized the feedback features into two classes: (1) behavioral (e.g., participation, sentiment), and (2) topical (e.g., transcription, questions about the meeting content). Second, we identified that participants felt more comfortable sharing topical contents with others, whereas they wanted most behavioral features to be kept private. We incorporated those priorities into the wireframe dashboard design. Finally, participants expressed interest in a combination of context-based summary and action items. As such, we organized the feature visualization in three ways: (1) summarized: showed a quick snippet of the average measurement of a feature; (2) suggestive: provided actionable suggestions related to the individual’s feature outcomes; and (3) temporal: presented meeting events in a contextualized timeline. For all temporal features, the idea was that upon clicking on any of the event components, the corresponding video and transcript portions would be played/highlighted.

Figure 3: Visualization of our wireframe dashboard design and its components: (A) Engagement Summary, (B) Suggestion on Engagement, (C) Tone Summary, (D) Suggestion on Tone, (E) History by Meeting, (F) Transcript, (G) Meeting Video, (H) Question Event Markers, (I) Agenda Distribution, (J) Consensus Event Markers, (K) Tone Modulation, (L) Speech Overlap Event Markers, (M) Speaking Pattern.


Table 1: Definitions of the features included in our wireframe dashboard. Each feature was associated with behavioral or topical information; some were summarized states, some temporal graphical visualizations, and others suggested actions. Those that were only shown to the individual are labeled as private.

(A) Engagement Summary: The percentage of time one spoke during the whole meeting. It also showed the number of attendees who spoke at least once. The intention was to help everyone feel included and heard. [Behavioral; Summarized; Shared]

(B) Suggestion on Engagement: Tips on improving meeting engagement, which were updated based on whether the “Engagement” metric was “too high”, “too low”, or “equal”. The intention was to provide specific guidance to improve group engagement in future. [Behavioral; Suggestive; Private]

(C) Tone Summary: The percentage of time one’s sentiment remained positive/neutral/negative. The sentiment was measured from facial signals. The intention was to grow awareness about perceived tone. [Behavioral; Summarized; Private]

(D) Suggestion on Tone: Tips on improving one’s meeting sentiment related to facial expression. The intention was to provide specific guidance to improve individual sentiment in future. [Behavioral; Suggestive; Private]

(E) History by Meeting: Average speaking time and sentiment in past meetings. The intention was to track behavior modulation over time. [Behavioral; Summarized; Private]

(F) Transcript: Timestamped text of what was spoken during the meeting, and by which participant. The intention was to allow reading the script to connect context to the other temporally visualized features. [Topical; Temporal; Shared]

(G) Meeting Video: The recorded audio-video feed of the meeting. The intention was to allow replaying the video to connect context to the other temporally visualized features. [Topical; Temporal; Shared]

(H) Question Event Markers: The moments when and what questions were asked, and by which attendee. [Topical; Temporal; Shared]

(I) Agenda Distribution: Identification of which agenda items were discussed during the meeting and when. [Topical; Temporal; Shared]

(J) Consensus Event Markers: The moments when attendees agreed or disagreed, measured by a combination of head-nods and head-shakes. The number represented the count of members contributing to that agreement/disagreement. [Topical; Temporal; Shared]

(K) Tone Modulation: Facial sentiment of the group and the individual throughout the meeting. The intention was to be able to connect affect-rich moments with other meeting attributes, such as agenda. [Behavioral; Temporal; Private]

(L) Speech Overlap Event Markers: The moments when two or more members spoke at the same time. The intention was to observe speaking-floor transfers, interruptions, etc. [Behavioral; Temporal; Shared]

(M) Speaking Pattern: Showed which member spoke during what portion of the meeting. The intention was to review turn-taking, speaking duration, agenda-wise speaker selection, etc. [Behavioral; Temporal; Shared]

We organized the left panel to hold the summarized and suggestive features, whereas the right panel presented all the temporal features on their timelines. The wireframe prototype design is shown in Fig. 3. We provide our primary feature definitions and the corresponding constraints in Table 1.

4.2.2 Results & Next Steps. The responses from the evaluation survey showed that participants perceived the system prototype to be important (Fig. 4A, M = 4.5, SD = 1.62) and capable of impacting meeting inclusivity (Fig. 4B, M = 4.63, SD = 1.32). Notably, as the wireframe version was not interactive, the contextual information tied to the temporal feedback could not be fully explored.

Participants were interested in using the interactive version of MeetingCoach and discussed the potential of the AI sensing capabilities:

P2: “I like the dashboard and the insights it provides on engagement and recommendation. The effectiveness and impact of the dashboard really depend on the definition (AI algorithm) of the metrics on the dashboard.”

P8: “This would be great if you could hover over the components to see what the questions, comments, and topics were.”


Figure 4: Survey responses evaluating the wireframe dashboard. (A) Perceived agreement on the impact of the dashboard on improving meeting effectiveness and inclusivity. (B) Rating of the potential usability of the wireframe.

Based on these responses, we modified the implementation of the interactive dashboard features: (1) how each feature was defined (e.g., hovering over the term consensus revealed that it was captured from visual signals through a combination of head-nods and head-shakes), and (2) what inherent information each feature holds (e.g., hovering over each consensus event revealed the exact distribution of head-nods/head-shakes leading to the overall consensus, and upon clicking on an event marker, the corresponding portion of the meeting video was played).

Among the features, speaking turn (M = 5.8, SD = 0.98), engagement (M = 5.53, SD = 1.32), and topic distribution (M = 5.47, SD = 1.15) were perceived as the most useful. The usefulness ratings of the wireframe feature components are shown in Fig. 7 (yellow bars). Participants valued the idea of quickly traversing through specific portions of the meeting and reviewing the discussion (“I would personally see more use for it (speaking turn) to jump to specific sections in a meeting recording where a specific speaker is talking”). Some participants asked for suggestions based on turn-taking patterns, whereas others mentioned that such suggestions might be too context-dependent. Temporal sentiment modulation was rated the least useful (M = 3.73, SD = 1.24). Participants found it interesting but were unsure how to utilize the information (“This might be useful, but not sure how to take action on it”). However, some also expressed an interest in evaluating it with their own real data. Participants also suggested some modifications to the representations of the components:

P7: “This (speech overlap) should be integrated into the speaking turn graphic if it is meant to overlap.”

P2: “Is it (average tone) accessible and how can I be sure the information is private and confidential to me?”

In the next step, we updated the design to integrate speaking patterns and speech overlap into one representation. We updated terms such as engagement to participation, as used in some previous literature, and tone to your sentiment to make its scope unambiguous. Accurately identifying the different agendas of a meeting and precisely marking the timestamp for each of them proved difficult to implement; another major challenge was differentiating between small talk and a short agenda discussion. Therefore, we excluded topic distribution from the final dashboard. The next section discusses the interactive dashboard in detail.

4.3 Design and Evaluation of Interactive Feedback Dashboard

Our evaluation of the prototype wireframe revealed the need for several design changes. We carried out these changes in a second design iteration, extracted the proposed features from the actual meeting data collected in the study, and implemented the interactive dashboard. At the end of the 4-week study, we evaluated the interactive dashboard through a mix of semi-structured interviews and a survey study with our study participants.

First, we conducted a 30-minute, semi-structured interview session with nine randomly selected participants, one at a time, to capture their interaction behaviors with the dashboard. We conducted the interviews first so that all participants were interviewed about their first impression of the dashboard. The interviews adopted a ‘think aloud’ or cognitive walk-through approach to understand interaction patterns while participants answered the interview questions. The semi-structured questions asked the interviewees to find out who spoke most and least in the meeting, how meeting participation changed from week to week, when meeting sentiment changed and whether they agreed, when a question was asked by attendee X and whether it was answered, who interacted the most, how likely it was that they would continue using the system, what meeting inclusivity meant to them and whether the dashboard supported it, and what further modifications the dashboard needed.

Second, the participants in the study received an individualized feedback dashboard populated with their actual meeting data. Each participant could see only their own participation history over multiple meetings, average and temporal sentiment, etc., as well as meeting-specific shared information, such as the meeting transcript, temporal group sentiment, etc. After interacting with the dashboard, the participants filled out a survey evaluating the system’s ability to improve meeting awareness, effectiveness, and inclusivity. Here, we describe the details of the system design, feature implementation, and the evaluation results of our interactive dashboard.

4.3.1 System Development. We used Microsoft Teams meetings for our data collection. A customized video recording bot3 was created to record the video and audio data from each participant in the meeting separately. This bot was added to all the meetings in our study, and the recordings were stored on a secure server. The recordings were then processed with a series of software pipelines (shown in Fig. 5) to automatically extract features related to the verbal and non-verbal content of the meeting. After extracting the feature signals and processing them into categorized feedback metrics, we implemented a web-based interactive dashboard using HTML, jQuery, D3.js, and the Google Charts Developer Tool.

3 https://docs.microsoft.com/en-us/microsoftteams/platform/bots/calls-and-meetings/real-time-media-concepts


Figure 5: We developed a custom Microsoft Teams bot to record and analyze the independent audio and video streams from each member of an online meeting. Each video stream was processed to detect the face of the participant and then analyze expressions and head gestures. Each audio stream was processed to detect voice activity of the participant and then determine turn-taking. The audio was transcribed and questions were detected from the resulting text.

Figure 6: A screenshot of our interactive web dashboard (left), with panels for the meeting video, transcript, sentiment, consensus, questions, speaking turn, participation, and per-meeting participation history. Hover-over functionalities, such as explainer text for the sentiment graph and the head-nod/head-shake counts behind each consensus marker, are shown on the right.

The dashboard was hosted on Microsoft Azure so that participants could interact with it from anywhere. Participants received unique access IDs to view their individualized dashboards. Fig. 6 shows the interface of the interactive dashboard; the right side of the figure shows the hover-over functionalities. Upon clicking a temporal event, the corresponding recorded meeting video was played.

4.3.2 Feature Implementation. After collecting the recorded meeting audio-video feeds, we used a multi-modal sensing pipeline (shown in Fig. 5) to extract the group dynamics signals from the collected feeds and process them into feedback features:

Transcription and Question Detection. We used the Microsoft conversation transcription API4 to extract transcripts of the meeting recordings. Given that the video and audio feeds of each attendee were recorded separately, we used the speech-to-text system to extract the transcript for each attendee and then synchronized them with timestamps.
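As an illustration of the synchronization step, the sketch below merges per-attendee transcript segments into one time-ordered meeting transcript. The (start, speaker, text) segment format and the helper name are our assumptions, not the transcription API’s actual payload.

```python
# Illustrative sketch: interleave per-attendee transcript segments by
# timestamp. The (start_seconds, speaker, text) format is an assumption;
# the real conversation transcription API returns richer payloads.
import heapq

def merge_transcripts(per_attendee_segments):
    """Merge already-recorded per-attendee segments into one timeline."""
    streams = [sorted(segments) for segments in per_attendee_segments]
    return list(heapq.merge(*streams, key=lambda segment: segment[0]))

alice = [(3.2, "Alice", "Shall we start?"), (40.0, "Alice", "Agreed.")]
bob = [(12.5, "Bob", "I have one agenda item.")]
for start, speaker, text in merge_transcripts([alice, bob]):
    print(f"[{start:6.1f}s] {speaker}: {text}")
```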

Participation & Turn-Taking Detection. Our participation metric was based on the duration for which each meeting attendee spoke during the meeting. The audio received from the recordings was sampled at 16 kHz and passed through the Microsoft Windows Voice Activity Detector (VAD) [55], which provided a Boolean value for every second of audio. For each attendee of a meeting, the participation feature was computed as the percent average talk-time. To implement speaking turns, the individually separated audio for each attendee was analyzed and then the metrics were synchronized using the video timestamps.
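The sketch below illustrates these two metrics under the stated design, where the VAD emits one Boolean per second of audio per attendee. The helper names and the silence threshold for closing a turn are our assumptions, not the production pipeline’s exact rules.

```python
# Illustrative sketch of percent talk-time and speaking turns from
# per-second VAD output. The min_gap turn boundary is an assumption.

def percent_talk_time(vad):
    """Share of meeting seconds in which this attendee was speaking."""
    return 100.0 * sum(vad) / len(vad)

def count_speaking_turns(vad, min_gap=2):
    """Count turns, treating pauses shorter than min_gap seconds as
    part of the same turn."""
    turns, silence = 0, min_gap  # start "silent long enough" for a new turn
    for active in vad:
        if active:
            if silence >= min_gap:
                turns += 1  # a fresh turn begins after enough silence
            silence = 0
        else:
            silence += 1
    return turns

vad = [0, 1, 1, 1, 0, 0, 0, 1, 1, 0]  # ten seconds of toy VAD output
print(percent_talk_time(vad), count_speaking_turns(vad))  # 50.0 2
```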

4 https://docs.microsoft.com/en-us/azure/cognitive-services/speech-service/conversation-transcription



Face and Facial Landmark Detection. We used the Microsoft Face API5 to detect the faces in each of the video frames and applied a landmark detector to identify the eyes, nose, and mouth. These data were used for the downstream components of head gesture detection and facial sentiment classification.

Agreement/Disagreement and Consensus Detection. We used a Hidden Markov Model (HMM) to calculate the probabilities of head nod and head shake gestures. The HMM used the head yaw rotation value over time to detect head shakes, and the head Y-position values over time to detect head nods. Consensus was marked whenever two or more attendees asserted either head nod or head shake signals on the synchronized timeline.
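Once nods and shakes have been detected, the consensus rule reduces to counting co-occurring gestures. The sketch below assumes the HMM stage has already run and produced, per attendee, the set of seconds containing a gesture; the data layout is our assumption.

```python
# Illustrative sketch of the consensus rule: mark an event whenever two
# or more attendees gesture in the same synchronized second. Upstream
# HMM-based nod/shake detection is assumed to have produced the sets.
from collections import Counter

def consensus_events(gesture_seconds, min_members=2):
    """gesture_seconds: {attendee: set of seconds with a detected gesture}.
    Returns sorted (second, member_count) pairs meeting the threshold."""
    counts = Counter(t for seconds in gesture_seconds.values() for t in seconds)
    return sorted((t, n) for t, n in counts.items() if n >= min_members)

nods = {"Tina": {110, 111, 300}, "Sarah": {111, 300}, "Abdul": {300}}
print(consensus_events(nods))  # [(111, 2), (300, 3)]
```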

Sentiment Classification. The faces were cropped from the video frames using the bounding box information provided by the face detector, and the resulting image patches were sent to a facial expression detection algorithm. The facial expression detector returned eight probabilities, one for each of the following basic emotional expressions: anger, contempt, disgust, fear, happiness, sadness, surprise, and neutral. This is a frequently employed categorization of facial expressions; however, it is not without critics, as displays of emotion are not uni-modal or necessarily universal [3, 16]. We used the publicly available perceived Emotion API6, allowing other researchers to replicate our method. The emotion detection algorithm is a Convolutional Neural Network (CNN) based on VGG-13 [5] which has been externally validated in prior work [40, 42].
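The dashboard ultimately displays positive/neutral/negative sentiment rather than the raw eight-way distribution. One plausible reduction is sketched below; the grouping of expressions into valence buckets is our assumption, as the exact mapping is not specified here.

```python
# Illustrative valence reduction from per-frame expression probabilities
# to the positive/neutral/negative sentiment shown in the dashboard.
# The bucket assignment below is an assumption, not the paper's mapping.

POSITIVE = {"happiness", "surprise"}
NEGATIVE = {"anger", "contempt", "disgust", "fear", "sadness"}

def frame_sentiment(probs):
    """probs: {expression_name: probability} for one detected face."""
    scores = {
        "positive": sum(probs.get(e, 0.0) for e in POSITIVE),
        "negative": sum(probs.get(e, 0.0) for e in NEGATIVE),
        "neutral": probs.get("neutral", 0.0),
    }
    return max(scores, key=scores.get)

frame = {"happiness": 0.55, "neutral": 0.35, "sadness": 0.10}
print(frame_sentiment(frame))  # positive
```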

Suggestion Construction. We prepared a list of suggestions that appeared based on the different levels of participation and sentiment. Examples of suggestions included “Before moving onto the next agenda, consider checking whether everyone is on the same page” (more than average participation), “While speaking is not always needed, sharing your agreement or disagreement on the agenda can make the meeting effective” (less than average participation), “Continue cultivating a welcoming environment for the team” (more positive sentiment), “Consider expressing any concern you may have about the agenda” (more negative sentiment), and “Your webcam was off during the meeting. Turning on video can help your team engage better” (no sentiment data due to no video feed).
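A rule table like this can be implemented as a few conditionals keyed on how an attendee’s metrics compare to the team’s. The sketch below reuses the suggestion strings quoted above; the threshold and the fallback message are our assumptions.

```python
# Illustrative rule-based suggestion selection. The suggestion strings
# are the paper's examples; the +/-5 point threshold and the fallback
# message are assumptions for this sketch.

def participation_suggestion(my_talk_pct, team_avg_pct, tolerance=5.0):
    if my_talk_pct > team_avg_pct + tolerance:
        return ("Before moving onto the next agenda, consider checking "
                "whether everyone is on the same page")
    if my_talk_pct < team_avg_pct - tolerance:
        return ("While speaking is not always needed, sharing your "
                "agreement or disagreement on the agenda can make the "
                "meeting effective")
    return "Your participation was well balanced with the team"  # assumed

def sentiment_suggestion(dominant_sentiment):
    if dominant_sentiment is None:  # webcam off: no facial data captured
        return ("Your webcam was off during the meeting. Turning on video "
                "can help your team engage better")
    if dominant_sentiment == "positive":
        return "Continue cultivating a welcoming environment for the team"
    return "Consider expressing any concern you may have about the agenda"

print(participation_suggestion(42.0, 25.0))
print(sentiment_suggestion(None))
```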

4.3.3 Results. The average responses to the survey questions are presented in Fig. 8. In general, on a scale of 1-7 (higher is better), participants found the system to be important (M = 4.91, SD = 1.41) and useful (M = 5.23, SD = 1.38). They agreed that the dashboard would improve their meeting awareness (M = 5.45, SD = 0.86). They perceived that the dashboard would improve meeting effectiveness (M = 4.45, SD = 1.47) and inclusivity (M = 5.23, SD = 1.23). Five participants specifically commented that they found the system useful enough to utilize across other meeting series (“I really like the types of data that are presented in the dashboard, and I would be curious to use it for more meetings to help drive inclusivity and engagement.”).

5 https://azure.microsoft.com/en-us/services/cognitive-services/face/
6 https://docs.microsoft.com/en-us/xamarin/xamarin-forms/data-cloud/azure-cognitive-services/emotion-recognition

Figure 7: Comparison of the usefulness ratings of the feature components in our iterative evaluation phases. Yellow and blue bars represent the ratings participants provided to each component after reviewing the wireframe and the interactive dashboards, respectively. Notably, Topic Distribution and Speech Overlap ratings are null for the interactive dashboard, as these features were excluded during iteration.


After interacting with the dashboard and reviewing their own meeting data, participants were able to draw insights from the meeting (M = 4.68, SD = 1.43). However, the impact varied across features. For example, participants could easily determine at which point each attendee spoke and for how long from the turn-taking feature (M = 6.00, SD = 0.82), and could deduce from the consensus feature who generally contributed to any decisions made (M = 3.45, SD = 1.22). Although the consensus feature visualized when an agreement/disagreement may have occurred, inspections of the meeting video and transcript were still needed to understand exact details, such as whether the group made a decision and precisely who contributed to it.

In terms of the usefulness of the feature components, video, speaking turn, and participation were highly rated (shown in Fig. 7, blue bars). Participants appreciated having the meeting video to provide detailed context for the event highlights. During the semi-structured interviews, participants expressed that having the video and the transcript helped them verify the accuracy of the features and understand the context of each event snippet. Sentiment-related features received lower ratings. The interview participants explained that, given the nature of workplace meetings, the sentiment features remained mostly neutral, so most of the time there were no specific events to focus on. However, participants also noted that sentiment feedback might be useful in case of a major shift towards negative sentiment.

P13: “I’d be worried about the accuracy of the sentiment measurements. Maybe a better understanding of what “negative” sentiment means, together with some examples, would help alleviate my worry.”

P23: “People just might be serious in the meeting, which might translate as negative, and that would not be appropriate.”

Participants expressed how comfortable they were with the privacy strategy adopted in our system. On a scale of 1-5 (“Not at all” to “A great deal”), participants felt comfortable knowing that their participation (M = 4.41, SD = 0.67) and turn-taking (M = 3.59, SD = 1.18) information was shared and that their sentiment information was kept private (M = 3.36, SD = 1.26).


Figure 8: User ratings evaluating the interactive dashboard (average scores on a 1-7 scale). Q1: Improve my awareness of meeting behaviors; Q2: Help identify important events in a meeting; Q3: Help prepare for the next meeting in a recurring series; Q4: Improve meeting effectiveness; Q5: Improve meeting inclusivity; Q6: I think the dashboard is important; Q7: I think the dashboard is useful; Q8: Satisfied with the dashboard; Q9: Draw insights from my meetings; Q10: Understand how my participation changed from week to week; Q11: Determine when each attendee spoke and for how long; Q12: Determine what each attendee said; Q13: Determine WHAT questions were asked, if any; Q14: Determine WHEN questions were asked, if any; Q15: Determine WHO asked questions, if any; Q16: Tell if consensus was reached; Q17: Determine who contributed to decisions made, if any; Q18: Determine if the overall sentiment in the meeting changed.


Participants also provided recommendations for improving the dashboard. They suggested having fewer and/or customizable features: “The dashboard should be customizable so that individuals can see at a glance the things they most care about.” Participants requested feature refinements: “It’s hard, for example, to identify important events of a meeting because there are so many events. On the questions view, for example, some events just say “Yes?” and some are paragraphs of text long.” New features were also suggested: “I’d love to see stats on interruptions as well. Maybe also stats on aggressive behavior, e.g., raising my voice.” To improve trust in the system, users mentioned that it could be more transparent: “I feel further work is required to improve transparency, information design and layout, and trust with the user using the system.”

For most feature components, participants expressed the need for actionable suggestions that could help them effectively modify their dynamics in future meetings, if needed. In our current dashboard, only two summarized features on the left panel (participation and sentiment) had explicit suggestions. The participants thought that those brief suggestions regarding how to improve meeting behaviors in the future made the feedback more impactful.

P2: “I really enjoy the actionable recommendations. I think it’s great to have these recommendations on the screen. The recommendations were actually the first thing I looked at.”

P3: “At the end of a meeting you don’t want to sit and study data. You want one nudge. Good nudges would be ... giving hints on what you could do next time to make the meeting more inclusive.”

Our dashboard was a combination of event highlights (e.g., what questions were asked) and behavior patterns (e.g., length of each speaking turn). The findings showed that information needs might differ depending on when feedback is delivered. Participants mentioned that actionable suggestions would be useful as a reminder right before the next meeting, whereas the event highlights would be useful to review after the meeting.

P11: “I feel there’s lots of interesting ideas in this ‘one dashboard’ that I’d rather see in different places rather than all in one overview. I would rather receive an email or a reminder just before a meeting about the information in the left column to serve as a reminder on how to improve my upcoming meeting. The right-column looks very useful in Teams as maybe a tab inside the meeting chat-channel? I’d personally split these two up into separate components.”

P13: “I feel like this would be more valuable to managers and new members on the team if they’re trying to figure out how they fit in. I would probably not use this for every meeting, but I would use it to reflect on meetings that I want to reflect on, either if I was feeling a meeting didn’t go as well as I wanted it to, or if I wanted to look back and remember what was said during the meeting.”

5 DISCUSSION
Our study reveals a unanimous agreement that attendees of video-conferencing meetings are in need of feedback assistance to make their meetings more effective and inclusive. The inability to read social cues, combined with hardware- and software-related challenges, makes online meetings difficult for workplace meeting attendees. However, no major video-conferencing platform currently incorporates such feedback elements. There is an opportunity to help attendees improve their meeting dynamics by leveraging the audio-video feed within such platforms.

The complexity of meeting dynamics makes it challenging to generate contextually appropriate feedback for the attendees. We explored the usefulness of various behavioral and topical features and provided the context through a temporal visualization. Participants found it useful to have the recorded meeting video and the transcript included with the dashboard, as these enabled them to quickly review the context of different feature events.

Group-based features require data from participating team members. If not everyone in the meeting agrees to be recorded and included in the feedback tool, these features may not fully describe the meeting dynamics. Although our work studied real workplace meetings, we only recruited teams in which the entire team agreed to participate in the study. We found that potential participants had concerns about how the data would be captured and shared, who would have access to the data and the dashboard, and how sensitive meeting contents would be excluded from external analysis. In addition, team members may feel coerced into participating in the group’s effort to improve its meeting effectiveness and inclusivity, even when they are not comfortable with sharing their video and audio feeds. Another concern is the use of this data for performance evaluations, with or without the employees’ direct knowledge. Making sure that the data capture and the dashboard visualization address such concerns for all attendees is key to deploying such a system. Future research should study how to appropriately capture consent and communicate the role of the feedback tool.
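One possible shape for such consent capture is sketched below, under the assumption of per-modality opt-ins; the `AttendeeConsent` record and its field names are hypothetical, intended only to illustrate gating group features on everyone's agreement.

```python
from dataclasses import dataclass

@dataclass
class AttendeeConsent:
    attendee_id: str
    record_audio: bool         # opt-in to audio capture
    record_video: bool         # opt-in to video capture
    share_in_dashboard: bool   # opt-in to inclusion in group feedback

def group_features_enabled(consents: list[AttendeeConsent]) -> bool:
    """Group-based features (e.g., consensus) run only when every
    attendee has opted in; with partial consent they would
    misrepresent the meeting dynamics."""
    return all(
        c.record_audio and c.record_video and c.share_in_dashboard
        for c in consents
    )
```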


5.1 Implications for Design
During the iterative process of the study, the participants mentioned suggestions, concerns, and expectations they had regarding a post-meeting dashboard. Extrapolating from those suggestions, we suggest some key design improvements and future possibilities.

5.1.1 Provide actionable suggestions. Being able to dive into the temporal features is valuable for examining specific events and cross-matching those with the meeting context. However, analyzing so many features is potentially time-consuming and overwhelming. Based on our findings, we propose that feedback systems, especially those with a high number of features, need to provide more actionable suggestions that can help direct behavior nudges.

From an implementation perspective, providing suggestions for every meeting dynamic is challenging, as the suggestions will by necessity be context dependent. For example, turn-taking patterns will vary greatly based on the type of the meeting (scrum, planning meeting, presentation, etc.). To understand the meeting context, AI systems need more information about the meeting type and the team culture, beyond simply capturing the meeting audio-video signals. A minimal sketch of such context-dependent suggestion rules follows.
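As one illustration of this context dependence, the sketch below keys a simple participation rule on a meeting-type label. The thresholds and suggestion wording are invented for illustration and are not taken from our system.

```python
# Hypothetical slack values: how far below an equal share an attendee's
# speaking time may fall before a nudge is generated, per meeting type.
PARTICIPATION_SLACK = {
    "scrum": 0.5,          # round-robin format: expect near-equal turns
    "planning": 0.6,
    "presentation": 0.9,   # one dominant speaker is expected
}

def participation_nudge(meeting_type: str, speaking_share: float,
                        n_attendees: int) -> str | None:
    """Return an actionable suggestion, or None if the behavior looks
    appropriate for this meeting type."""
    equal_share = 1.0 / n_attendees
    slack = PARTICIPATION_SLACK.get(meeting_type, 0.6)
    if speaking_share < equal_share * (1.0 - slack):
        return ("You spoke less than is typical for this meeting type; "
                "consider preparing one point to raise next time.")
    return None
```

The same speaking share that triggers a nudge in a scrum would pass silently in a presentation, which is exactly the kind of context sensitivity a naive audio-only metric cannot provide.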

5.1.2 Modify feedback design based on its delivery timing. Our iterative study revealed that meeting attendees have different needs before, after, and during the meeting. Therefore, feedback should be modified based on its timing. We categorize the prospective delivery timing of feedback information as below:

(a) Reminders before Meetings: Right before attending a meeting, quickly reviewing any actionable suggestions or feedback from past meetings can help maintain effectiveness and inclusivity. We found that participants wanted actionable suggestions as a reminder immediately before meetings. These notifications can help individual attendees remember behavioral goals. Therefore, we propose that summarized suggestions from the last meeting could be sent as a reminder before the next meeting of the series (a scheduling sketch appears after this list).

(b) Highlights after Meetings: Participants mentioned that the temporal information, especially regarding the topical features (e.g., transcript, questions), would be more useful when someone missed the meeting or needed to recall particular information from the previous meeting. They also mentioned that exploring the temporal behavior features (e.g., consensus) after the meeting would be useful for evaluating what actions needed to be taken in the next meeting. Thus, feedback systems built with the intent to help formulate goals for the next meeting should incorporate event highlights from previous meetings.
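A minimal sketch of delivery timing (a) follows: pairing the last meeting's summarized suggestions with the next occurrence of a recurring series. The fifteen-minute lead time and the returned reminder record (to be handed to some mailer or notification queue) are assumptions, not part of our system.

```python
from datetime import datetime, timedelta

def schedule_reminder(suggestions: list[str], next_meeting: datetime,
                      lead_time: timedelta = timedelta(minutes=15)) -> dict:
    """Queue last meeting's actionable suggestions as a reminder
    shortly before the next meeting in the series."""
    body = ("Before your meeting, a few nudges from last time:\n"
            + "\n".join(f"- {s}" for s in suggestions))
    return {
        "send_at": next_meeting - lead_time,  # when to deliver
        "body": body,                          # reminder content
    }
```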

5.1.3 Provide training opportunities based on feedback from a series of meetings. The current implementation focuses on per-meeting feedback, but participant interest in splitting feedback into pre-meeting and post-meeting time periods indicates the potential for further valuable longitudinal use of the feedback. Aggregating personal and team feedback across multiple meetings might enable deeper learning and foster good habits for the long term. Further, feedback from such a system could also be used to personalize meeting training programs for both managers and individuals who are seeking to measurably improve their effectiveness and inclusion.

5.1.4 Incorporate multi-modal signals to provide rich feedback. Our findings show that, to get deeper insights into a feature of group discussion, analyzing a single modality might not be enough. Multi-modal analysis of a feature may not only provide a more refined and accurate metric, but may also earn more trust from the user. For example, in our implementation we measured consensus using head-nod and head-shake visual signals. Combining this with language properties, to capture whether words related to agreement or disagreement were used co-temporally with head nods and shakes, could improve the understanding of consensus. Previous work [56] has also used agreement analysis from a single modality by applying Language Style Matching. Based on our findings, we propose that future systems should incorporate multi-modal signals to implement features for capturing group dynamics.
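To illustrate the kind of fusion we have in mind, the sketch below counts a moment as a consensus cue only when a detected head nod co-occurs, within a small time window, with agreement vocabulary in the transcript. The word list, the two-second window, and the upstream nod detector are stand-ins, not our production pipeline.

```python
# Illustrative agreement lexicon; a real system would use a richer
# language model or Language Style Matching as in prior work [56].
AGREEMENT_WORDS = {"yes", "agree", "agreed", "exactly", "right", "sure"}

def fuse_consensus(nod_times: list[float],
                   transcript: list[tuple[float, str]],
                   window: float = 2.0) -> list[float]:
    """Return timestamps where a head nod and agreement language
    co-occur within `window` seconds -- a stronger consensus cue
    than either modality alone."""
    agreement_times = [
        t for t, utterance in transcript
        if AGREEMENT_WORDS & set(utterance.lower().split())
    ]
    return [
        nod for nod in nod_times
        if any(abs(nod - t) <= window for t in agreement_times)
    ]
```

A nod with no nearby agreement language (e.g., nodding along to music) and agreement words with no nod (e.g., a sarcastic “sure”) would each be discarded, which is the noise-rejection benefit of requiring both modalities.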

5.2 Limitations
The meetings we recorded, while diverse in terms of gender and job role, were all from employees of a single large technology company during the COVID-19 mandatory work-from-home restrictions. Future work will be needed to validate whether our findings generalize beyond that sector and context. In addition, the team meetings were recurring, which meant that the study participants all knew one another to some degree. This may have influenced not only the perceived effectiveness and inclusivity of the meetings, but also the comfort level with sharing tone and participation information about the meetings. All participants were remote in this study; enabling such a system for hybrid meetings will add a significant level of extra complexity, both technically and in the way that people can use the system. Being recorded for a study might have caused a Hawthorne effect, impacting how participants behaved. However, we presume the Hawthorne effect to be minimal because our participants were already accustomed to being recorded during meetings for documentation purposes, and the meetings we recorded over four weeks were recurring meetings with already-familiar colleagues. In our study, participants interacted with the dashboard once; frequent and repeated usage over time could provide better insights into the impact of such a dashboard on meeting dynamics. In future work, we will explore dashboard usage over a longer period of time.

6 CONCLUSION
Video-conferencing meetings are a standard feature of modern work, even though meeting effectiveness and inclusivity can be diminished by constraints on the availability of social signals. We conducted an iterative study with two phases, a requirements analysis and an interactive system design (wireframe and interactive dashboards), to understand the effectiveness of these tools for video-conferencing meeting attendees. Our requirements-analysis phase showed that, even though equal participation is not always expected, participation to some degree can increase the perceived effectiveness and inclusivity of a meeting. It also revealed that, in online meetings, attendees are painfully aware of not having access to each other’s social cues, and that there is a need for a post-meeting dashboard to summarize meeting dynamic highlights. While some video-conferencing platforms provide real-time features to make up for these constraints (e.g., hand raises, thumbs up, clapping), no major commercial platform leverages affective signals to provide actionable meeting metrics.

In our iterative system design, we included behavioral and contextual features in a summarized, temporal, and suggestive user interface dashboard. Based on the privacy concerns expressed by the participants, we kept sentiment features private to the individual, while other features were shared just among the meeting attendees (exactly as would be true of a meeting recording). The evaluation showed that reviewing behavior history over time can improve attendees’ awareness of the meeting dynamics. Participants expressed an interest in comparing the meeting dynamics across different meeting series and expected more actionable suggestions that they could use in future meetings. Based on the feedback from our participants, we proposed having meeting reminders with suggestive feedback before a meeting, and event highlights with detailed insights after the meeting.

The kind of sensing explored in this study looks forward to a future of human-AI partnerships. We are at the point where we can influence the nature of those partnerships. On the one hand, such systems can act as prosthetics, enabling people to do things they otherwise could not, but also setting up a relationship of dependency. On the other hand, such systems could be empowering and change the skills and capabilities of users over time [57]. Meetings, even when mediated, are valuable precisely because they leverage the immediacy and intimacy of human social connection to achieve what would be difficult to achieve by other means. We hope future systems incorporating similar kinds of affective and actionable highlights will enable people to be more inclusive and effective during meetings and, in the long run, improve their comfort and productivity.

REFERENCES[1] Joseph A Allen, Nale Lehmann-Willenbrock, and Steven G Rogelberg. 2015. The

Cambridge handbook of meeting science. Cambridge University Press.[2] Yadid Ayzenberg, Javier Hernandez, and RosalindW. Picard. 2012. FEEL: frequent

EDA and event logging - a mobile social interaction stress monitoring system. InExtended Abstracts of Human Factors in Computing Systems. 2357–2362.

[3] Lisa Feldman Barrett, Ralph Adolphs, Stacy Marsella, Aleix M Martinez, andSeth D Pollak. 2019. Emotional expressions reconsidered: challenges to inferringemotion from human facial movements. Psychological Science in the Public Interest20, 1 (2019), 1–68.

[4] Sigal G Barsade. 2002. The ripple effect: Emotional contagion and its influenceon group behavior. Administrative science quarterly 47, 4 (2002), 644–675.

[5] Emad Barsoum, Cha Zhang, Cristian Canton Ferrer, and Zhengyou Zhang. 2016.Training deep networks for facial expression recognition with crowd-sourcedlabel distribution. In Proceedings of the 18th ACM International Conference onMultimodal Interaction. 279–283.

[6] Virginia Braun and Victoria Clarke. 2006. Using thematic analysis in psychology.Qualitative research in psychology 3, 2 (2006), 77–101.

[7] Dan Calacci, Oren Lederman, David Shrier, and Alex ‘Sandy’ Pentland. 2016.Breakout: An open measurement and intervention tool for distributed peerlearning groups. arXiv preprint arXiv:1607.01443 (2016).

[8] Janis A Cannon-Bowers, Scott I Tannenbaum, Eduardo Salas, and Catherine EVolpe. 1995. Defining competencies and establishing team training requirements.Team effectiveness and decision making in organizations 333 (1995), 380.

[9] Ross Cutler, Yasaman Hosseinkashi, Jamie Pool, Senja Filipi, Robert Aichner,and Johannes Gehrke. 2020. Meeting Effectiveness and Inclusiveness in RemoteCollaboration. arXiv preprint arXiv: To Appear (2020).

[10] Terry L Dickinson and Robert M McIntyre. 1997. A conceptual framework forteamwork measurement. Team performance assessment and measurement (1997),19–43.

[11] Joan Morris DiMicco, Katherine J Hollenbach, Anna Pandolfo, and Walter Bender.2007. The impact of increased awareness while face-to-face. Human–ComputerInteraction 22, 1-2 (2007), 47–96.

[12] Judith Donath, Alex Dragulescu, Aaron Zinman, Fernanda Viégas, and RebeccaXiong. 2010. Data portraits. In ACM SIGGRAPH 2010 Art Gallery. 375–383.

[13] Wen Dong, Bruno Lepri, Taemie Kim, Fabio Pianesi, and Alex Sandy Pentland.2012. Modeling conversational dynamics and performance in a social dilemmatask. In 2012 5th International Symposium on Communications, Control and SignalProcessing. IEEE, 1–4.

[14] Norah E Dunbar and Judee K Burgoon. 2005. Perceptions of power and interac-tional dominance in interpersonal relationships. Journal of Social and PersonalRelationships 22, 2 (2005), 207–233.

[15] Mohamed Ez-Zaouia, Aurélien Tabard, and Elise Lavoué. 2020. Emodash: Adashboard supporting retrospective awareness of emotions in online learning.International Journal of Human-Computer Studies 139 (2020), 102411.

[16] José-Miguel Fernández-Dols and Maria-Angeles Ruiz-Belda. 1997. Spontaneousfacial behavior during intense emotional episodes: Artistic truth and opticaltruth. In The psychology of facial expression, James A Russell and José-MiguelFernández-Dols (Eds.). Cambridge University Press, New York, NY, 255–274.

[17] C Marlene Fiol and Edward J O’Connor. 2005. Identification in face-to-face,hybrid, and pure virtual teams: Untangling the contradictions. Organizationscience 16, 1 (2005), 19–32.

[18] Michelle Gregory, Nancy Chinchor, Paul Whitney, Richard Carter, ElizabethHetzler, and Alan Turner. 2006. User-directed sentiment analysis: Visualizingthe affective content of documents. In Proceedings of the Workshop on Sentimentand Subjectivity in Text. 23–30.

[19] Steve Harrison. 2009. Media space 20+ years of mediated life. Springer Science &Business Media.

[20] J. Hernandez, X. Benavides, P. Maes, D. McDuff, J. Amores, and R.W. Picard. 2014.AutoEmotive: Bringing empathy to the driving experience to manage stress. InProceedings of the Conference on Designing Interactive Systems: Processes, Practices,Methods, and Techniques, DIS. https://doi.org/10.1145/2598784.2602780

[21] J. Hernandez, D. McDuff, R. Fletcher, and R.W. Picard. 2013. Inside-out: Reflectingon your inner state. In IEEE International Conference on Pervasive Computing andCommunications Workshops, PerCom Workshops 2013. https://doi.org/10.1109/PerComW.2013.6529507

[22] Dirk Heylen. 2005. Challenges ahead: Head movements and other social acts inconversations. Virtual Social Agents (2005), 45–52.

[23] Ellen A Isaacs and John C Tang. 1994. What video can and cannot do for collabo-ration: a case study. Multimedia systems 2, 2 (1994), 63–73.

[24] Alejandro Jaimes, Takeshi Nagamine, Jianyi Liu, Kengo Omura, and Nicu Sebe.2005. Affective meeting video analysis. In 2005 IEEE International Conference onMultimedia and Expo. IEEE, 1412–1415.

[25] Jeroen Janssen, Gijsbert Erkens, and Gellof Kanselaar. 2007. Visualization ofagreement and discussion processes during computer-supported collaborativelearning. Computers in Human Behavior 23, 3 (2007), 1105–1125.

[26] Florian Jentsch, John Barnett, Clint A Bowers, and Eduardo Salas. 1999. Whois flying this plane anyway? What mishaps tell us about crew member roleassignment and air crew situation awareness. Human Factors 41, 1 (1999), 1–14.

[27] Taemie Kim, Agnes Chang, Lindsey Holland, and Alex Sandy Pentland. 2008.Meeting mediator: enhancing group collaborationusing sociometric feedback. InProceedings of the 2008 ACM conference on Computer supported cooperative work.457–466.

[28] Uroš Krčadinac, Jelena Jovanović, and Vladan Devedžić. 2012. Visualizing theaffective structure of students interaction. In International Conference on HybridLearning. Springer, 23–34.

[29] Kostiantyn Kucher, Daniel Cernea, and Andreas Kerren. 2016. Visualizing excite-ment of individuals and groups. In Proceedings of the 2016 EmoVis Conference onEmotion and Visualization. 15–22.

[30] Anastasia Kuzminykh and Sean Rintel. 2020. Classification of Functional At-tention in Video Meetings. In Proceedings of the 2020 CHI Conference on HumanFactors in Computing Systems. 1–13.

[31] G Ross Lawford. 2003. Beyond success: Achieving synergy in teamwork. TheJournal for Quality and Participation 26, 3 (2003), 23.

[32] Harold J Leavitt. 1951. Some effects of certain communication patterns on groupperformance. The Journal of Abnormal and Social Psychology 46, 1 (1951), 38.

[33] Charlotte P Lee andDrew Paine. 2015. FromTheMatrix to aModel of CoordinatedAction (MoCA) A Conceptual Framework of and for CSCW. In Proceedings of the18th ACM conference on computer supported cooperative work & social computing.179–194.

[34] Jina Lee and Stacy C Marsella. 2010. Predicting speaker head nods and the effectsof affective information. IEEE Transactions on Multimedia 12, 6 (2010), 552–562.

[35] Gilly Leshed, Diego Perez, Jeffrey T Hancock, Dan Cosley, Jeremy Birnholtz, Soy-oung Lee, Poppy L McLeod, and Geri Gay. 2009. Visualizing real-time language-based feedback on teamwork behavior in computer-mediated groups. In Proceed-ings of the SIGCHI Conference on Human Factors in Computing Systems. 537–546.

[36] Madelene Lindström, Anna Ståhl, Kristina Höök, Petra Sundström, Jarmo Laak-solathi, Marco Combetto, Alex Taylor, and Roberto Bresin. 2006. Affective diary:designing for bodily expressiveness and self-reflection. In CHI’06 extended ab-stracts on Human factors in computing systems. 1037–1042.

[37] Paul Luff, Christian Heath, Hideaki Kuzuoka, Jon Hindmarsh, Keiichi Yamazaki,and Shinya Oyama. 2003. Fractured ecologies: creating environments for collab-oration. Human–Computer Interaction 18, 1-2 (2003), 51–84.

[38] Sandi Mann and Lynn Holdsworth. 2003. The psychological impact of telework-ing: stress, emotions and health. New Technology, Work and Employment 18, 3(2003), 196–211.

Page 13: MeetingCoach: An Intelligent Dashboard for Supporting

MeetingCoach: An Intelligent Dashboard for Supporting Effective & Inclusive Meetings CHI ’21, May 8–13, 2021, Yokohama, Japan

[39] Rob McCarney, James Warner, Steve Iliffe, Robbert Van Haselen, Mark Griffin,and Peter Fisher. 2007. The Hawthorne Effect: a randomised, controlled trial.BMC medical research methodology 7, 1 (2007), 30.

[40] Daniel McDuff, Eunice Jun, Kael Rowan, and Mary Czerwinski. 2019. Longitudi-nal Observational Evidence of the Impact of Emotion Regulation Strategies onAffective Expression. IEEE Transactions on Affective Computing (2019).

[41] Daniel McDuff, Amy Karlson, Ashish Kapoor, Asta Roseway, and Mary Czerwin-ski. 2012. AffectAura: an intelligent system for emotional memory. In Proceedingsof the SIGCHI Conference on Human Factors in Computing Systems. 849–858.

[42] Daniel McDuff, Kael Rowan, Piali Choudhury, Jessica Wolk, ThuVan Pham, andMary Czerwinski. 2019. A Multimodal Emotion Sensing Platform for BuildingEmotion-Aware Applications. arXiv preprint arXiv:1903.12133 (2019).

[43] Poppy Lauretta McLeod, Jeffrey K Liker, and Sharon A Lobel. 1992. Processfeedback in task groups: An application of goal setting. The Journal of appliedbehavioral science 28, 1 (1992), 15–41.

[44] Michael Nowak, Juho Kim, Nam Wook Kim, and Clifford Nass. 2012. SocialVisualization and Negotiation: Effects of Feedback Configuration and Status. InProceedings of the ACM 2012 Conference on Computer Supported Cooperative Work(Seattle, Washington, USA) (CSCW ’12). ACM, New York, NY, USA, 1081–1090.https://doi.org/10.1145/2145204.2145365

[45] Isabelle Odermatt, Cornelius J König, Martin Kleinmann, Maria Bachmann, HeikoRöder, and Patricia Schmitz. 2018. Incivility in meetings: Predictors and outcomes.Journal of Business and Psychology 33, 2 (2018), 263–282.

[46] Steven G Rogelberg. 2018. The surprising science of meetings: How you can leadyour team to peak performance. Oxford University Press, USA.

[47] Marcello Russo. 2012. Diversity in goal orientation, team performance, andinternal team environment. Equality, Diversity and Inclusion: An InternationalJournal (2012).

[48] Samiha Samrose, Ru Zhao, Jeffery White, Vivian Li, Luis Nova, Yichen Lu, Mo-hammad Rafayet Ali, and Mohammed Ehsan Hoque. 2018. Coco: Collaborationcoach for understanding team dynamics during video conferencing. Proceedingsof the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies 1, 4 (2018),1–24.

[49] Sanat Sarda, Martin Constable, Justin Dauwels, Shoko Dauwels, Mohamed El-gendi, Zhou Mengyu, Umer Rasheed, Yasir Tahir, Daniel Thalmann, and Nadia

Magnenat-Thalmann. 2014. Real-time feedback system for monitoring and facili-tating discussions. In Natural Interaction with Robots, Knowbots and Smartphones.Springer, 375–387.

[50] Mark Schittekatte andAlain VanHiel. 1996. Effects of partially shared informationand awareness of unshared information on information sampling. Small GroupResearch 27, 3 (1996), 431–449.

[51] Abigail J Sellen. 1995. Remote conversations: The effects of mediating talk withtechnology. Human-computer interaction 10, 4 (1995), 401–444.

[52] John Short, Ederyn Williams, and Bruce Christie. 1976. The social psychology oftelecommunications. John Wiley & Sons.

[53] Tony L Simons and Randall S Peterson. 2000. Task conflict and relationshipconflict in top management teams: The pivotal role of intragroup trust. Journalof applied psychology 85, 1 (2000), 102.

[54] M.I. Tanveer, E. Lin, and M.E. Hoque. 2015. Rhema: A Real-time in-situ Intelli-gent Interface to Help People with Public Speaking. In Proceedings of the 20thInternational Conference on Intelligent User Interfaces. 286–295.

[55] Ivan J Tashev. 2015. Offline voice activity detector using speech supergaussianity.In 2015 Information Theory and Applications Workshop (ITA). IEEE, 214–219.

[56] Yla R Tausczik and James W Pennebaker. 2013. Improving teamwork usingreal-time language feedback. In Proceedings of the SIGCHI Conference on HumanFactors in Computing Systems. 459–468.

[57] Anja Thieme, Ed Cutrell, Cecily Morrison, Alex Taylor, and Abigail Sellen. 2020.Interpretability as a dynamic of human-AI interaction. Interactions 27, 5 (2020),40–45.

[58] Wilbert van Vree. 2019. Formalisation and Informalisation of Meeting Manners.In Civilisation and Informalisation. Springer, 291–313.

[59] Shiliang Zhang, Qingming Huang, Shuqiang Jiang, Wen Gao, and Qi Tian. 2010.Affective visualization and retrieval for music video. IEEE Transactions on Multi-media 12, 6 (2010), 510–522.

[60] M. Zisook, J. Hernandez, M.S. Goodwin, and W. Picard R. 2013. Enabling Vi-sual Exploration of Long-term Physiological Data. In IEEE Conference on VisualAnalytics Science and Technology.

[61] Annuska Zolyomi, Andrew Begel, Jennifer Frances Waldern, John Tang, MichaelBarnett, Edward Cutrell, Daniel McDuff, Sean Andrist, and Meredith RingelMorris. 2019. Managing Stress: The Needs of Autistic Adults in Video Calling.Proceedings of the ACM on Human-Computer Interaction 3, CSCW (2019), 1–29.