ena user guide - gapsedgaps.org/gaps/wp-content/uploads/ena-user-guide.pdfuser guide for epistemic...


USER GUIDE FOR

EPISTEMIC NETWORK ANALYSIS

WEB VERSION 4.0

TECHNICAL REPORT 2016-1

DRAFT

DAVID WILLIAMSON SHAFFER

[email protected]

ABSTRACT

The purpose of this report is to describe the basic features of the online Epistemic Network Analysis (ENA) tool. Key definitions are provided for the concepts in creating and analyzing conversation-based interaction data in terms of codes, conversation, and units, and examples are provided from a sample data set.

EPISTEMIC GAMES GROUP

GAMES AND PROFESSIONAL SIMULATIONS (GAPS) TECHNICAL REPORT SERIES


A NOTE ON DATA SOURCES

The examples and analyses included in this user guide come from the sample data set RSdata1.csv, which can be downloaded from the ENA website at (http://www.epistemicnetwork.org).

DESCRIPTION OF THE RESCUSHELL DATA SET

The RSdata1.csv data set contains the logfile entries for all student chat messages in one virtual internship, called RescuShell, in which students design a robotic exoskeleton for rescue workers. The students were 48 first-year college students enrolled in an introductory engineering course.

Some of the students in the sample used RescuShell in the first part of the semester, before they used any other virtual internship. Others in the sample used RescuShell in the second part of the semester, after they had already used another virtual internship.

During the course, each student participated in two virtual internships, or simulation games in which the students role-play as interns at an engineering firm.

For each virtual internship, students review internal technical documents from the company, conduct background research, and examine research reports based on actual experimental data. Based on their research, they develop hypotheses, test those hypotheses in the provided design space, and analyze the results in teams. Students also become knowledgeable about consultants within the company who have a stake in their design choices. These consultants value different performance metrics for the device being designed. During the final days of the internship, students present their final prototype and justify their design decisions.

Students work in Groups during the virtual internship, communicating with each other and with their supervisor via email and chat. Each student works with two groups during the internship: one in the first half and one in the second half of the game.

Students also complete a survey of their Confidence in and Commitment to Engineering at the beginning and end of each virtual internship.

The variables in the data set are provided in the appendix to this report.

FORMATTING OTHER DATA SETS

If you would like to use your own data, you can find instructions for formatting datasets into standard ENA format in Technical Report 2014-1: Formatting Data for Epistemic Network Analysis. You can also download the formatting guide from the ENA website at (http://www.epistemicnetwork.org).


GETTING STARTED

URL

Epistemic Network Analysis Web version 4.0 can be accessed at (http://www.epistemicnetwork.org/live).

Alternatively, it can be reached through the main ENA Website (http://www.epistemicnetwork.org) by choosing “Launch” from the main menu.

SETTING UP AN ACCOUNT

To use ENA, you first need to create a user account.

1. Enter your information in the registration form on the left side of the page. You must select the checkbox indicating that you understand that you cannot upload data containing personally identifiable information about any human subjects in your data. This is a requirement of the University of Wisconsin-Madison Human Subjects Committee and a condition of use for ENA.

2. Click the Signup button. You can then access the system using your new username and password.

UPLOADING DATA

In order to analyze data, you must first upload it to your ENA account.

1. Click the upload icon and select the data file in ENA Format that you want to upload. The upload icon is on the upper left side of the Create page. It looks like a cloud with an upward-pointing arrow. (For more on formatting data, see Technical Report 2014-1: Formatting Data for Epistemic Network Analysis, which can be downloaded from the ENA website at http://www.epistemicnetwork.org.)

2. Click the Upload button.

3. The name of your data set will appear in the left column of the Create page under Data Files.


The Create page after uploading the RSdata1.csv sample data set, which can be downloaded from the ENA website at (http://www.epistemicnetwork.org)

A list of the Data Sets in your ENA account is always shown in the left column of the ENA Create page.

CREATING YOUR FIRST ENA SET

An ENA Set is the basic unit of an ENA analysis. It contains two key parts:

1. A Data Set that has been converted into a collection of Network Models using ENA.

2. A Loading Model (also known as a Rotation Matrix or Rotation Model) for visualizing the Network Models in the ENA set.

To create an ENA Set, select one of your Data Sets from the left column.


When a Data Set is selected, ENA populates the Create page with the variable names from the data file.


To create an ENA set:

1. Choose at least one variable to divide the data into units, and one variable to divide the units into Conversations.

2. Select the values of the unit and conversation variable(s) to include in your model.

When you choose a variable for units or conversations, ENA populates the middle column with all of the values of the variable you choose. If you select more than one variable, the middle column will be populated with all possible combinations of values for the variables.

By default, all of the combinations of values for variables are selected. You can select and unselect values to include in the model using the checkboxes in front of each value.

If the unit variables are not also selected as conversation variables, a moving stanza window is necessary. To use a moving stanza window, check the Stanza Window check box on the right and enter the number of previous utterances you want to include in the window. A brief overview of moving stanza windows follows; for a more in-depth description, see Siebert-Evenstone et al. (2016).

There are two primary ways of modeling conversations in ENA, using Strophes or Moving Stanza Windows. In both cases, ENA models connections among concepts: (I) by identifying coherent topics, activities, and/or conversations in the data as strophes; and (II) by defining collections of utterances within strophes that are related to one another as stanzas. The two methods differ in the relationship between strophes and stanzas. Specifically:

I. The Strophe Method models connections within an entire activity or strophe: that is, all the utterances within an activity are related to one another. Or, equivalently, each strophe is composed of a single stanza.

II. The Moving Stanza Window Method models connections within an activity or strophe by dividing the strophe into multiple stanzas: that is, utterances are related to one another only within some designated stanza window. In other words, the moving stanza window method models connections only when utterances are in close temporal proximity within a strophe.


For example, consider the coded data from one activity (a). The moving stanza window method analyzes connections within the referent utterance and between the referent utterance and the window (b). After analyzing a window, the moving stanza window method slides to the next utterance and repeats the process of finding connections within the referent utterance and between the referent utterance and the window (c). The strophe method analyzes all connections in an activity (d).

(a) (b) (c) (d)

3. Choose the variables that you want to use as Codes in the model. Multiple codes can be chosen by holding down the shift key while selecting variables.

4. Provide a Set Name for the ENA set.

5. Click the Create ENA Set button.

How the Create page looks when variables have been chosen to create an ENA set.

HOW TO CHOOSE VARIABLES FOR UNITS, CONVERSATIONS, AND CODES

ENA is designed to build Network Models from conversation-based interaction data.

Conversation-based interaction data is information about a set of units, the way they relate to one another, and a series of conversations which reveal evidence about the relations between the units:


1. Units can refer to people, concepts, or anything whose network of connections is being modeled.

2. Relations between units can refer to associations such as strength of social tie or conceptual similarity, or to any other connection, interaction, or association that links one unit to another.

3. Conversations can be units of time, steps in a process, or any way of identifying a unit in the data for quantifying relations between units.

4. Evidence refers to any specific elements of the data that can be used to identify the relations being modeled.

Data in ENA Format represents this information as separate variables, or columns in a .csv file. There are four types of variables in ENA Format.

UNIT VARIABLES

Unit variables distinguish the units of analysis for which ENA will build Network Models: for example, networks from different people, or from people in different conditions. For each row of data in the Data Set, the unit variables indicate to which unit the row belongs.

CONVERSATION VARIABLES

Conceptually, the key idea behind a conversation is that:

1. Units in lines anywhere within the same conversation are related to one another in the model.

2. Units in lines that are not in the same conversation are not related to one another in the model.

A conversation variable is a variable that indicates, for every row of data in each unit in the Data Set, to which conversation it belongs.

CODE VARIABLES

In ENA format, the relation information is represented in a series of columns that are referred to as Codes such that:

1. Each unit in the model is assigned a single code column.

2. The value in the code column for unit A, B, or C indicates, for each line in the data, whether that unit is being related to some other unit.

3. Every line in the data has a value for each code.

Typically, the values are binary (0 or 1). Fractional values or other weights can be used, but that is discussed later in this report.

RAW DATA VARIABLES

In addition to Unit, Conversation, and Code variables, an ENA-formatted data set may contain any other columns of identifying information about the data. In particular, the Raw Data that was coded with the units of interest in the model can be included in the data set. This might include a column for the raw data excerpts, for the speaker (in the case of discourse data), for the time of the action coded, and so on.


Raw Data variables do not need to be selected when ENA sets are created; they are automatically included in the final ENA set.
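As a rough illustration of the four variable types, the sketch below parses a tiny table laid out the way an ENA Format file might be. Only the column names (UserName, GroupName, ActivityNumber, text, K.data, K.design) come from this guide; the rows are invented, and the helper `key` is a hypothetical name used only for this example.

```python
import csv
import io

# Invented rows in an ENA-style layout; only the column names come from the guide.
SAMPLE = """UserName,GroupName,ActivityNumber,text,K.data,K.design
student a,1,1,The test results look good,1,0
student b,1,1,Then we should change the design,0,1
student a,1,2,The new design matches the data,1,1
"""

UNIT_VARS = ["UserName"]                             # one network per student
CONVERSATION_VARS = ["ActivityNumber", "GroupName"]  # one conversation per group per activity
CODE_VARS = ["K.data", "K.design"]                   # binary code columns

rows = list(csv.DictReader(io.StringIO(SAMPLE)))


def key(row, variables):
    """Combine the values of several columns into one identifier."""
    return tuple(row[v] for v in variables)


units = sorted({key(r, UNIT_VARS) for r in rows})
conversations = sorted({key(r, CONVERSATION_VARS) for r in rows})

# Every column that is not a unit, conversation, or code variable is raw data.
raw_vars = [c for c in rows[0] if c not in UNIT_VARS + CONVERSATION_VARS + CODE_VARS]

# Every line must have a value for each code (binary here).
assert all(r[c] in ("0", "1") for r in rows for c in CODE_VARS)
```

Here each student is a unit, each combination of activity and group is a conversation, and the `text` column is carried along as raw data without being selected.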

AN EXAMPLE ENA SET

To give an example of an ENA set using the RSdata1.csv sample data, we might choose the following:

Unit Variable: UserName
ENA will construct a Network Model for each student in the Data Set.

Conversation Variables: ActivityNumber, GroupName
ENA will construct a Network Model based on the co-occurrences of Codes for each group within each activity. Both variables are needed to ensure utterances from different groups and activities are not modeled together.

Code Variables: E.data, S.data, E.design, S.design, S.professional, E.client, V.client, E.consultant, V.consultant, S.collaboration, I.engineer, I.intern, K.actuator, K.rom, K.materials, K.power, K.sensor, K.designspecs, K.attribute, K.data, K.design
ENA will construct a Network Model based on the co-occurrences of these variables by group in each activity.

This set requires a moving stanza window. For this example, a stanza window size of five will be used. We might call this ENA set “students.by.activity.allcodes” to indicate how it was created.

HOW ENA CONSTRUCTS AN ENA SET

The full details of how ENA constructs an ENA set are beyond the scope of this user guide.

Briefly, however:

1. For each Unit in the Data Set, ENA constructs a cumulative adjacency matrix where each cell i,j represents the number of conversations in the data for the Unit that contain both code i and code j.

2. Each of the cumulative adjacency matrices is converted into an adjacency vector in a high-dimensional Analytic space of n choose 2 dimensions, where n is the number of codes (or units) in the original Data Set.

3. Each adjacency vector is converted to a normed adjacency vector by dividing by its length. This maps Units that have the same pattern of connections but different numbers of conversations to the same point in the space. That is, if two people use the same pattern of discourse, but one talks for longer, they will map to the same point in the space.


4. ENA uses singular value decomposition of the Analytic space to produce a Projection Space that maximizes the variance among the adjacency vectors.

5. The singular value decomposition produces a rotation matrix that is used to represent each Unit in the Data Set as a Network Model in the Projection Space. If R is the rotation matrix for an ENA set, the location of the Network Model corresponding to the normed adjacency vector V is given by V x R.
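Steps 1 through 3 above can be sketched in a few lines. This is a minimal illustration, not ENA's actual implementation: the function names are hypothetical, each conversation is reduced to the set of codes that occur in it, and "length" is assumed to mean Euclidean length. Steps 4 and 5 (the singular value decomposition and projection) are omitted because they need a linear algebra library.

```python
from itertools import combinations
from math import comb, sqrt


def adjacency_vector(conversations, codes):
    """Steps 1-2: count, for each pair of codes, the conversations containing both."""
    pairs = list(combinations(codes, 2))   # n choose 2 pairs, one per dimension
    counts = [0] * len(pairs)
    for present in conversations:          # each conversation: set of codes occurring in it
        for k, (a, b) in enumerate(pairs):
            if a in present and b in present:
                counts[k] += 1
    return pairs, counts


def normed(vector):
    """Step 3: divide by Euclidean length (assumed); a zero vector is left as zeros."""
    length = sqrt(sum(x * x for x in vector))
    return [x / length for x in vector] if length else list(vector)


codes = ["K.data", "K.design", "K.power"]
# Two invented units with the same pattern of connections; one simply talks three times as long.
short_talker = [{"K.data", "K.design"}, {"K.data", "K.power"}]
long_talker = short_talker * 3

pairs, v_short = adjacency_vector(short_talker, codes)
_, v_long = adjacency_vector(long_talker, codes)

n_short, n_long = normed(v_short), normed(v_long)
dimensions_for_21_codes = comb(21, 2)
```

After norming, the two units land on the same point even though one has three times as many conversations, which is the property step 3 describes. With the 21 codes in the example ENA set, the Analytic space has 21 choose 2, or 210, dimensions.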

ANALYZING YOUR FIRST ENA SET

When you click the Create ENA set button, ENA shows a progress update while it is constructing the ENA set. Once the set is complete, ENA progresses to the Sets page so you can begin your analysis.

You can return to the Create page by clicking its link in the upper right corner of the ENA window.

The ENA sets you have created appear in the left column of the Sets page. When you choose a set, the selections that you used to create it on the Create page are displayed in the second column of the Sets page. You can also export ENA data spaces from the Sets page. To export your ENA space, click the download icon to the right of the ENA set you wish to download. ENA sets are exported as an R data file for analysis in the R statistical computing software.

To analyze a set:

1. Select the set from the left column of the Sets page.

2. Click the Plot button on the upper left side of the Sets page.

ENA will open a new Analysis window in your browser. You may need to “allow” the site to open pop-up windows.


EXAMINING THE NETWORK SPACE

When the Analysis window opens, ENA shows a Data tab with a Selection tree containing all of the Network Models on the left panel of the window.

1. You can include or exclude points by marking their checkboxes in the Selection tree.

You can show or hide labels for the points using the checkbox in the Show/Hide section at the bottom of the right-side panel of the window.

2. At the center of the window, ENA shows a plot that represents the positions of each of the Network Models relative to each other in the Projection Space.

The Projection Space is a 2-dimensional representation of the location of each of the Networks in the high-dimensional Analytic space. (More details on ENA network analysis are provided below.)

For example, keegan q and christina b’s Network Models are close to each other in the Projection Space below, while mitchell h’s Network Model is further away. This suggests that keegan q and christina b’s Network Models are more similar to one another than they are to mitchell h’s Network.


EXAMINING THE NETWORK MODELS

We can examine each of these Network Models in more detail to understand the relationship between the chat discourse of these players.

Click on one of the points representing a player’s Network Model. ENA creates an Equiload tab containing the Network Model of the point you selected. The resulting Network Model shows connections between the codes (e.g. K.power and S.professional) used to analyze the RescuShell data set as a function of their co-occurrence in the discourse. The following Network Model shows connections for the player keegan q. Strength of connections is indicated by the thickness and saturation of the lines between codes. For example, we can see based on the thickness and saturation of connections in keegan q’s Network Model that keegan q made more connections between K.design and K.data than he did between S.collaboration and I.engineer.

To select a second Network for comparison:


1. Hide the first Network Model by unselecting the checkbox for the Equiload tab at the top of the window.

2. In the Camera section of the right panel of the Analysis window, click the Data button next to Zoom to.

3. Select another point.

You can now compare Equiload Projections of the Network Models by showing and hiding them using the checkboxes for Equiload tabs at the top of the window. If you select multiple Equiload tabs to display, ENA superimposes them on top of each other for comparison.

INTERPRETING THE NETWORK SPACE

You can now use the Equiload Projections to interpret the different Network Models in the Network space.

For example, here are the three Equiload Projections of the Network Models for mitchell h, keegan q, and christina b:

mitchell h keegan q christina b


Keegan q and christina b’s networks are, indeed, more similar to one another than they are to mitchell h’s, as we predicted based on the position of the points corresponding to their networks in the Projection Space:

The Equiload Projections of the Network Models also suggest what is different about the chat discourse of these three students. In particular, both keegan q and christina b connected their design thinking with data and attributes much more than mitchell h did, while mitchell h connected his thinking on power and sensors more than the other two.

In general, we would expect that Network Models that are located further down in the Projection Space would be from students who made more connections to explicit decisions based on design criteria.

Similarly, we might notice that all three of these students have points located on the center of the x-axis in the Projection Space. Thus we see that they made connections both to collaboration and professionalism in their chat discussions (on the left hand side of the Equiload models) and to understanding of, use of, and decision-making based on data (on the right hand side of the Equiload models).

We can thus interpret the Projection Space by saying that:

1. Points to the right in the Projection Space correspond to Network Models of students who made a lot of connections to data in their discourse.

2. Points to the left in the Projection Space correspond to Network Models of students who made a lot of connections to professionalism and collaboration in their discourse.

3. Points to the bottom of the Projection Space correspond to Network Models of students who made a lot of connections to decisions based explicitly on design criteria in their discourse.


4. Points to the top of the Projection Space correspond to Network Models of students who made a lot of connections to specific features of the device being designed in their discourse.

Finally, the numbers in parentheses after the axis labels indicate the total amount of variance among Network Models in the ENA set that is accounted for by each dimension. Thus, the “Professionalism v Data” axis accounts for 25% of the variance in discourse patterns. The “Device Details v Design Criteria” axis accounts for 21% of the variance in discourse patterns.

SUBTRACTING NETWORKS

Visual inspection of individual Network Models sometimes shows differences between networks, but using a Subtracted Model can help visualize these differences more clearly. A Subtracted Model simply subtracts the strengths of node connections in one Network Model from those in another. To make a Subtracted Model, select the two Network Models you would like to compare in the upper left corner of the ENA tool where it says “Subtract Equiloads” (indicated by the red rectangle). The two drop-down menus above this button allow you to select which Network Models you are comparing. In this case, we will compare mitchell h against christina b:

This subtracted view shows us that mitchell h and christina b’s respective networks are stratified across the x-axis. Mitchell h’s comparatively greater connections are shown in red and christina b’s in green. Analytically, we can conclude that mitchell h made more connections in the discourse between K.sensor and K.power, and christina b made more connections between K.data and K.actuator.
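The subtraction itself is simple edge-by-edge arithmetic. The sketch below is a hedged illustration rather than ENA's own code: `subtract_networks` is a hypothetical name, networks are represented as dictionaries from code pairs to connection strengths, and the strengths are made up.

```python
def subtract_networks(first, second):
    """Edge-by-edge difference: positive where `first` is stronger, negative where `second` is."""
    edges = set(first) | set(second)
    # An edge missing from one network is treated as having strength zero there.
    return {edge: first.get(edge, 0.0) - second.get(edge, 0.0) for edge in edges}


# Made-up connection strengths keyed by code pair.
unit_one = {("K.power", "K.sensor"): 0.6, ("K.actuator", "K.data"): 0.1}
unit_two = {("K.power", "K.sensor"): 0.2, ("K.actuator", "K.data"): 0.5}

difference = subtract_networks(unit_one, unit_two)
stronger_in_one = sorted(e for e, d in difference.items() if d > 0)
stronger_in_two = sorted(e for e, d in difference.items() if d < 0)
```

The sign of each difference corresponds to the two colors in the subtracted view: one color for edges that are stronger in the first network and the other for edges that are stronger in the second.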

EXAMINING THE CONNECTIONS IN A NETWORK MODEL

If you want to understand what it means that two codes or units are connected in a Network Model:

1. Click on any of the lines in an Equiload Projection.


ENA will open a new Utterance window.

2. Choose the variable (column) in the data that contains the raw data you want to examine for connections using the dropdown menu at the top of the Utterance window.

In the case of RescuShell, this would be the “text” column.

ENA shows all of the rows of data, colored according to the codes that are present.

So, for example, if we click on the K.actuator—K.data link in brandon’s Network Model, we would see this:

The top of the window indicates what ENA set and Network Model (unit name) is being shown, as well as the specific codes that are being linked in the data.

The codes are colored such that excerpts in the data with one code are shown in red, and with the other are shown in blue. Excerpts that have both codes have both red and blue dots.

The data is divided by conversation.

Uncoded excerpts are hidden by default, but can be shown by clicking the ellipses in the data.

The “link” between two codes in a Network Model is based on how many conversations contain excerpts with both codes in them. For example, let’s say we set our moving stanza window to four. This means that ENA will run through each of your codes and calculate the number of other codes that occur in the four excerpts preceding every instance of a code. When there is a co-occurrence, ENA treats this as a connection. The more connections there are, the stronger the “link” between two codes or nodes within a Network Model.
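The windowing idea can be sketched as follows. This is an illustration under stated assumptions, not ENA's implementation: `window_connections` is a hypothetical name, each excerpt is reduced to the set of codes present in it, the window is taken to be the excerpts strictly preceding the referent excerpt, and a pair of codes is counted at most once per referent excerpt.

```python
from collections import Counter
from itertools import combinations


def window_connections(lines, window=4):
    """Count code pairs linked by a moving stanza window over one conversation.

    `lines` is a list of sets, one per excerpt in order, giving the codes present.
    """
    totals = Counter()
    for t, referent in enumerate(lines):
        # Codes that occur in the `window` excerpts preceding the referent excerpt.
        earlier = set().union(*lines[max(0, t - window):t])
        linked = set()
        # Connections within the referent excerpt.
        linked.update(frozenset(p) for p in combinations(sorted(referent), 2))
        # Connections between the referent excerpt and the preceding window.
        linked.update(frozenset((r, w)) for r in referent for w in earlier if r != w)
        totals.update(linked)
    return totals


# An invented conversation of eight excerpts; most carry no codes.
conversation = [{"K.data"}, set(), {"K.design"}, set(), set(), set(), set(), {"K.power"}]

windowed = window_connections(conversation, window=4)

# Strophe method for comparison: treat the whole conversation as a single stanza.
strophe = window_connections([set().union(*conversation)], window=0)
```

With a window of four, K.data and K.design are connected because they occur two excerpts apart, while K.power is too far from either to be connected. Treating the whole conversation as one stanza connects all three codes.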


MORE FEATURES FOR INTERPRETING EQUILOADS

LINE THICKNESS AND THRESHOLDING

The thickness and saturation of lines in the Equiload Projection of a Network Model are proportional to the strength of association in the network that each line represents. The thickness of the line connecting two nodes, A and B, is proportional to the strength of association for that pair in the normed adjacency vector, V_AB.

By default, ENA sets the thickest line (and highest saturation) equal to the largest strength of association of any Network Model in the ENA set, and the thinnest line (and lowest saturation) to the smallest strength of association of any Network in the ENA set.

Line thickness can be optimized for viewing differences in strength of association for a single Network Model using the Thickness minimum and maximum in the Equiload Settings section on the right panel of the Analysis window.

Line thickness can also be adjusted by setting the maximum and minimum color manually in the same section of the right panel. This can be particularly useful if one hides weak connections in a Network Model (by setting a higher minimum color level) to show the strongest connections more clearly.

The orange Defaults button returns line thicknesses to their default setting.

All of the Equiload Projections in the same Analysis window have the same scaling so that Network Models can be accurately compared.
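One way to picture the shared scaling is a single mapping from strength of association to line thickness, built from the smallest and largest strengths across every Network Model in the set. The sketch below assumes a linear mapping and arbitrary pixel limits; `make_thickness_scale`, `thinnest`, and `thickest` are hypothetical names, and ENA's actual scaling may differ.

```python
def make_thickness_scale(networks, thinnest=0.5, thickest=6.0):
    """Build one weight-to-thickness function shared by all networks in a set."""
    weights = [w for net in networks for w in net.values()]
    low, high = min(weights), max(weights)
    span = high - low

    def thickness(weight):
        if span == 0:
            return thickest                       # all edges equal: draw them all the same
        clipped = min(max(weight, low), high)     # keep out-of-range weights inside the limits
        return thinnest + (clipped - low) / span * (thickest - thinnest)

    return thickness


# Invented strengths for two networks in the same set.
networks = [{"a-b": 0.1, "a-c": 0.5}, {"a-b": 0.9}]
scale = make_thickness_scale(networks)
widths = {w: scale(w) for w in (0.1, 0.5, 0.9)}
```

Because the limits come from the whole set rather than from one network, the same strength is drawn at the same thickness in every Equiload Projection, which is what makes visual comparison meaningful.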

ABOUT THE EQUILOAD PROJECTION

The critical property of an ENA analysis of Network Models is that a Projection Space (that shows each Network Model as a point) is related to the Equiload Projections of individual Network Models in the following ways:

1. The nodes of the network in the Equiload Projection (which correspond to the codes in the Data Set or units being related in the network) are located in the same place in the space for all networks in the ENA set.

This means that two Network Models can be compared by comparing the structure of their edges (the lines connecting the nodes) in their Equiload Projections.

2. The center of mass of the Equiload Projection of a Network Model approximates the location of the point that represents the Network in the Projection Space.

The center of mass is the center of a distribution of mass in space where the weighted position vectors of the units sum to zero. Put another way, the center of mass is the mean location of a distribution of mass in space. In this case, the center of mass is the mean of the edge weights of the Network Model distributed according to the Equiload Projection in space.


Point CM is the center of mass of the network with edge weights illustrated by the thickness of the lines connecting the nodes of the network (Points A, B, C, and D).
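A center of mass of this kind can be computed as a weighted mean. The sketch below assumes that each edge's weight acts at the midpoint between its two nodes; that placement is an assumption made for illustration, as are the function name `center_of_mass`, the node coordinates, and the weights.

```python
def center_of_mass(node_positions, edge_weights):
    """Weighted mean of edge midpoints, with edge weights as the masses."""
    total = sum(edge_weights.values())
    if total == 0:
        return (0.0, 0.0)      # no connections: place at the origin (arbitrary choice)
    x = y = 0.0
    for (a, b), w in edge_weights.items():
        (ax, ay), (bx, by) = node_positions[a], node_positions[b]
        x += w * (ax + bx) / 2
        y += w * (ay + by) / 2
    return (x / total, y / total)


# Invented node positions and edge weights.
nodes = {"A": (-1.0, 0.0), "B": (1.0, 0.0), "C": (0.0, 2.0)}
edges = {("A", "B"): 1.0, ("A", "C"): 3.0}
cm = center_of_mass(nodes, edges)
```

The heavier A to C edge pulls the center of mass up and to the left of the midpoint of A and B, in the same way that strong connections pull a Network Model's point toward the nodes involved.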

ENA’s Analysis window shows this relationship between Equiload Projections and the Projection Space by superimposing Equiload Projections on top of the points in the Projection Space. You can see this in the following Analysis window, which zooms the camera in on the original data points in the space, but also shows the Equiload Projection for one point: the small black point just above the x-axis on the left side. Note that this dot is on the left of the distribution of points because its Equiload has strong connections to nodes to the left of the projection space (S.collaboration, which is outside the viewing window to the left).


NETWORK OPTIMIZATION INTERPRETIVE CORRELATION MEASURES

The nodes are positioned in the projection space so as to maximize the interpretive correlation, a measure of the accuracy of network projections given the stress of dimensional reduction. The interpretive correlation is a measure of the correspondence between the projected points and the centers of mass of their network representations. For each pair of units in the data set, ENA computes the signed difference between their projected points and the signed difference between their associated centroids. The interpretive correlation is then the correlation between the signed differences of each pair of projected points and the signed differences of each pair of centroids. The interpretive correlations for any ENA set can be checked with the ‘calculate correlations’ button. Spearman and Pearson correlation coefficients are reported.
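The calculation described above can be sketched for a single dimension. This is a minimal illustration with invented coordinates, not ENA's code: `pearson` and `interpretive_correlation` are hypothetical names, and only the Pearson coefficient is shown (a Spearman coefficient would apply the same formula to ranks).

```python
from itertools import combinations
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(vx * vy)


def interpretive_correlation(points, centroids):
    """Correlate pairwise signed differences of points with those of centroids."""
    pairs = list(combinations(range(len(points)), 2))
    point_diffs = [points[i] - points[j] for i, j in pairs]
    centroid_diffs = [centroids[i] - centroids[j] for i, j in pairs]
    return pearson(point_diffs, centroid_diffs)


# Invented x-coordinates for four units: projected points and network centroids.
points_x = [-0.40, -0.10, 0.20, 0.35]
centroids_x = [-0.38, -0.12, 0.15, 0.40]

r = interpretive_correlation(points_x, centroids_x)
# Centroids that are an exact positive rescaling of the points correlate perfectly.
perfect = interpretive_correlation(points_x, [2 * p for p in points_x])
```

A value near 1 indicates that units which are far apart in the Projection Space also have centroids that are far apart in the same direction, so node positions can be used to interpret the location of the points.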


CAMERA POSITION AND ZOOM

Because the Analysis window is showing coordinated representations of both the Projection Space and the Equiloads, ENA provides the ability to zoom the plot window to optimize viewing of each representation.

1. When the Analysis window is first opened, it is optimized for viewing data points representing Network Models in the Projection Space.

2. Once an Equiload is created, the Analysis window automatically zooms out to optimize viewing the Equiload Projection.

3. The Zoom to Data button in the Camera section of the right panel of the Analysis window sets the camera to optimize viewing of data.

4. The Zoom to Equiload button in the Camera section of the right panel of the Analysis window sets the camera to optimize viewing Equiloads.

MORE FEATURES FOR COMPARING NETWORK MODELS

ENA provides some basic statistical support for comparing collections of Network Models.

GROUP MEANS

To show the means of a group of points:

1. Select the points that compose the first group of Network Models in the Data tab.

2. Click the checkbox next to Mean in the Show/Hide section of the right side panel in the Analysis window.

A square will appear in the plot indicating the location of the mean of the points selected.

MEAN EQUILOADS

Click on the square representing the mean of a group of points to show the mean Equiload for the group, which shows the mean strength of association for each pair of codes (units) in the model across all of the Network Models in the group.

COMPARING MEANS

To compare means between two groups:

1. Create a second Data tab by clicking on the + in the tab row at the top of the Analysis window.

2. Select points that compose the second group of Network Models in the new data tab.

3. Click the checkbox next to Mean in the Show/Hide section of the right side panel in the Analysis window.

This will show a second square corresponding to the location of the mean of the second set of points.

You can show the 95% confidence interval for the location of the means by selecting the corresponding checkbox in the Show/Hide section of the right panel of the Analysis window.

T-TESTS

Once you have used data tabs to select two groups of Network Models, you can compare the groups using a t-test.

1. On the left panel of the Analysis window, scroll down to find two drop down menus labeled “Sample 1” and “Sample 2.”

2. Select the Data tabs that correspond to the two groups you want to compare in the drop down menus.

3. Click the “Run” button. The t statistic, p value, and Cohen’s d will be displayed below the drop down menus.
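For readers who want to check the reported values by hand, the sketch below computes a t statistic and Cohen's d for two groups of coordinates along one dimension. It assumes a pooled-variance (Student's) two-sample t-test and a pooled standard deviation for Cohen's d; the guide does not state which variant the tool uses, so treat this as one plausible reading. The function name is hypothetical. The p value is omitted because it requires the t distribution, which is not in the Python standard library.

```python
from math import sqrt
from statistics import mean, variance


def compare_groups(sample1, sample2):
    """Return (t statistic, Cohen's d) for two independent samples."""
    n1, n2 = len(sample1), len(sample2)
    m1, m2 = mean(sample1), mean(sample2)
    # Pooled variance: sample variances weighted by their degrees of freedom.
    pooled_var = (
        (n1 - 1) * variance(sample1) + (n2 - 1) * variance(sample2)
    ) / (n1 + n2 - 2)
    t = (m1 - m2) / sqrt(pooled_var * (1 / n1 + 1 / n2))
    d = (m1 - m2) / sqrt(pooled_var)
    return t, d


# Hypothetical x coordinates for the Network Models in two Data tabs.
t_stat, cohens_d = compare_groups([1.0, 2.0, 3.0], [2.0, 3.0, 4.0])
```

The sign of both statistics follows the order of the samples: a negative value means the mean of Sample 1 lies below the mean of Sample 2 on that dimension.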

ITEM BANK AND PRIORITY LIST

ENA preserves all of the variables in the Data Set that are uniquely associated with the Units in the ENA set. So, for example, if the data contains a variable for Pretest Score (like the variable CONFIDENCE.Pre in the sample Data Set) or gender that is the same in every row of the data for a given student, then when a set is created, ENA associates this additional metadata with the Network Models in the set.

This additional data appears in the Item Bank in the upper part of the left panel in the Analysis window.
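One way to picture "uniquely associated with the Units" is to check, for each column, whether its value ever changes within a single unit. The sketch below does this for a few invented rows. The function name and the row values are hypothetical, and the rule is only a reading of the description above, not the tool's actual logic.

```python
from collections import defaultdict


def unit_level_variables(rows, unit_column):
    """Return the columns whose value never varies within any single unit."""
    seen = defaultdict(lambda: defaultdict(set))  # column -> unit -> values
    for row in rows:
        unit = row[unit_column]
        for column, value in row.items():
            if column != unit_column:
                seen[column][unit].add(value)
    return sorted(
        column
        for column, by_unit in seen.items()
        if all(len(values) == 1 for values in by_unit.values())
    )


# Invented rows that reuse column names from the sample Data Set.
rows = [
    {"UserName": "student_a", "CONFIDENCE.Pre": 4.0, "GameDay": 1},
    {"UserName": "student_a", "CONFIDENCE.Pre": 4.0, "GameDay": 2},
    {"UserName": "student_b", "CONFIDENCE.Pre": 5.5, "GameDay": 1},
]
metadata = unit_level_variables(rows, "UserName")
```

Here CONFIDENCE.Pre qualifies as metadata because it is constant for each student, while GameDay does not because it changes from row to row for student_a.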

Metadata from the Item Bank can be dragged to the Priority List (also in the upper part of the left panel of the Analysis window) and used to group and sort the Network Models in the ENA set.

The Analysis window below, for example, shows that the variable C.Change (indicating change from Pre to Post in survey items measuring commitment and confidence in engineering) has been dragged to the Priority List and used to sort the data in the RescuShell sample Data Set and plot groups of points. The plot shows the mean and confidence interval for each group, showing that the two groups (positive vs. negative change) are different along the x dimension of the Projection Space.

MORE FEATURES FOR CREATING ENA SETS

WEIGHTED DATA

While many Data Sets used in ENA analyses have binary codes (including the sample set), there are occasions where codes indicate not the presence or absence of a unit of interest in the data, but the probability or magnitude of the unit of interest in the data.

In these cases, weighted ENA models are often more appropriate.

Weighted models can be specified on the Create page by choosing the Weighted option on the right side of the page.

When a weighted model is chosen, each conversation indicates not the presence or absence of each code, but the sum of the values for each code across all rows in the conversation. The adjacency matrix for each conversation then represents not the co-occurrence of each pair of codes ij, but the product of the value of code i and code j in the conversation.

All other aspects of a weighted model are the same as for a binary model.
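The difference between the two model types can be illustrated with a short sketch that follows the description above: sum each code over the rows of a conversation, reduce the sums to presence or absence for a binary model, and take the product for each pair of codes. The function name and the code values are hypothetical, and only off-diagonal pairs are computed.

```python
from itertools import combinations


def conversation_adjacency(rows, codes, weighted=False):
    """Map each pair of codes to its connection strength in one conversation."""
    totals = {code: sum(row[code] for row in rows) for code in codes}
    if not weighted:
        # Binary model: keep only presence (1) or absence (0) of each code.
        totals = {code: int(total > 0) for code, total in totals.items()}
    return {(a, b): totals[a] * totals[b] for a, b in combinations(codes, 2)}


codes = ["E.data", "S.design", "K.power"]
# Invented weighted code values for the two rows of one conversation.
conversation = [
    {"E.data": 0.5, "S.design": 0.0, "K.power": 1.0},
    {"E.data": 0.25, "S.design": 0.0, "K.power": 2.0},
]
binary = conversation_adjacency(conversation, codes)
weighted = conversation_adjacency(conversation, codes, weighted=True)
```

In this example the binary model records a connection of 1 between E.data and K.power, while the weighted model records 0.75 × 3.0 = 2.25 for the same pair. Both models record 0 for every pair involving S.design.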

TRAJECTORIES

Because discourse unfolds over time, it is often useful to create ENA sets that reflect a trajectory of Network Models for each Unit in the Data Set. To construct a trajectory set:

1. On the Create page, turn on trajectories at the bottom of the page. Also, select “Advanced” on the right side of the trajectory box.

2. Choose a variable or variables that determine how to set data points within the trajectory. Often an easy choice is to use the same variable that determines conversations in the ENA set. When Basic is selected in place of Advanced, the conversation variables are selected by default. With that choice, the trajectory shows how the Network Model changes conversation by conversation for each unit.

3. ENA defaults to an accumulated trajectory, meaning that the Network Model at point 3 in the trajectory includes the strength of association at point 2 and adds to it the new associations at point 3.

4. To show independent Network Models at each point in the trajectory, choose the radio button labeled Separate. Each point in the trajectory for each unit will be computed independently.
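The difference between the accumulated and Separate options in steps 3 and 4 can be sketched as a running sum versus no sum at all. The sketch assumes each trajectory point for a unit is a vector of connection strengths (one entry per pair of codes). The function name and the numbers are hypothetical.

```python
from itertools import accumulate


def trajectory(points, mode="accumulated"):
    """Build the Network Model vectors for one unit's trajectory.

    points: one vector of new connection strengths per trajectory point.
    """
    if mode == "separate":
        # Each point is computed independently of the others.
        return [list(point) for point in points]
    if mode != "accumulated":
        raise ValueError("mode must be 'accumulated' or 'separate'")
    # Each point adds its new connections to everything that came before.
    running = accumulate(
        points, lambda total, new: [t + n for t, n in zip(total, new)]
    )
    return [list(vector) for vector in running]


# Invented connection counts for three pairs of codes at three points.
new_connections = [[1, 0, 0], [0, 1, 0], [1, 1, 0]]
accumulated = trajectory(new_connections)
separate = trajectory(new_connections, mode="separate")
```

With these inputs the accumulated model at point 3 is [2, 2, 0], which includes everything from points 1 and 2, while the separate model at point 3 is just [1, 1, 0].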

The Create page below illustrates the creation of a Trajectory set. The Analysis window that follows shows the trajectories of two students in the RescuShell data set.

ENA can show Equiload Projections for any point in a trajectory in a Trajectory set.

It can also display mean trajectories and confidence intervals for mean trajectories in a Trajectory set. It does this in the same way that it shows means and confidence intervals for groups of data points.

However, it is not currently possible to display Equiload Projections for points in a mean trajectory.

PROJECTS

For those who plan on uploading and analyzing multiple data sets, ENA provides the ability to create separate projects to hold different Data and ENA sets. To create a new project or to change projects, click on the folder icon in the upper left corner of the Create page.

ADVANCED FEATURE: LOADING SETS

LOADING SETS

Every ENA set is created with its Projection Space. The Projection Space for a set maximizes the variance in the Network Models along the dimensions shown in the Analysis window.

However, there are times when it is useful to display Network Models from one ENA set in the Projection Space of another. For example: if a Projection Space is created from a group of experts and novices doing a task, it is possible to project new data into the space to determine whether the Network Models from new participants are more like known experts or known novices.

In these cases, the ENA set from which the Projection Space comes is called the Loading set because the Projection Space is determined by a rotation matrix from singular value decomposition, and thus functions like the loadings in a principal components analysis.

In general, ENA set A can be used as a Loading set for ENA set B as long as both sets were created using the same collection of codes.

To show ENA set B in the Projection Space of ENA set A:

1. Choose set B on the Sets page and click the Plot button.

2. Click on the set name under Loading Set in the Axes section of the right panel of the Analysis window.

3. All of the ENA sets in the current Project that were made using the same codes as set B will appear in a drop-down menu.

4. Choose set A from the list.

The Analysis window will reset the Axes and data points to show the Projection Space from set A.
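Conceptually, using set A as a Loading set means applying A's rotation to B's Network Models. The sketch below assumes the rotation matrix and the mean vector of set A have already been obtained (for example, from the singular value decomposition mentioned above) and that centering uses A's mean; both are assumptions for illustration. The function name and all numbers are hypothetical.

```python
def project_into_loading_space(networks_b, mean_a, rotation_a):
    """Project set B's network vectors using set A's centering and rotation.

    networks_b: one vector of connection strengths per Network Model in B.
    mean_a:     mean connection-strength vector of set A (assumed centering).
    rotation_a: rotation matrix from set A, one row per connection and
                one column per dimension of the Projection Space.
    """
    n_dims = len(rotation_a[0])
    projected = []
    for vector in networks_b:
        # Both sets must be built from the same codes, so lengths must match.
        if len(vector) != len(mean_a) or len(vector) != len(rotation_a):
            raise ValueError("sets A and B must use the same collection of codes")
        centered = [value - mu for value, mu in zip(vector, mean_a)]
        projected.append([
            sum(centered[k] * rotation_a[k][d] for k in range(len(centered)))
            for d in range(n_dims)
        ])
    return projected


# Invented example: three connections projected onto two dimensions.
rotation_a = [[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]]
mean_a = [0.5, 0.5, 0.0]
points_b = project_into_loading_space([[1.0, 0.5, 0.2]], mean_a, rotation_a)
```

The length check mirrors the requirement that both sets be created from the same collection of codes: without it, the connections in B would not line up with the rows of A's rotation matrix.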

APPENDIX: VARIABLES IN THE RESCUSHELL DATA SET

VARIABLE | CONTENTS | TYPE
UserName | Name of student | Text
Condition | Indicates whether Student used RescuShell first or second. If Student used RescuShell first, they only participated in one virtual internship. If Student used RescuShell second, they participated in two virtual internships. | Categorical
CONFIDENCE.Pre | Student’s Confidence and Commitment score before the internship | Numerical
CONFIDENCE.Post | Student’s Confidence and Commitment score after the internship | Numerical
CONFIDENCE.Change | Change in Student’s Confidence and Commitment score during the internship (Post - Pre) | Numerical
C.Level.Pre | Categorical variable indicating whether Student’s initial confidence score was above the mean for the group | Categorical
C.Change | Categorical variable indicating whether Student’s confidence score increased during the internship | Categorical
Timestamp | Time at which the Chat message occurred | Time
ActivityNumber | Indicates in which activity in the game the chat took place | Numerical
GroupName | Indicates to which group the Student belonged when the chat was sent | Text
GameHalf | Indicates in which half of the game the chat was sent | Categorical
GameDay | Indicates on which day of game play the chat was sent | Numerical
text | The actual text of the chat | Text
E.data | Chat contains evidence that the student is making a decision based on data | Binary
S.data | Chat contains evidence that the student is using data | Binary
E.design | Chat contains evidence that the student is making a decision based on the design process | Binary
S.design | Chat contains evidence that the student is using the design process | Binary
S.professional | Chat contains evidence that the student is using professional tools or skills | Binary
E.client | Chat contains evidence that the student is making a decision based on the needs of the client | Binary
V.client | Chat contains evidence that the student is valuing the needs of the client | Binary
E.consultant | Chat contains evidence that the student is making a decision based on the criteria of the company’s consultants | Binary
V.consultant | Chat contains evidence that the student is valuing the criteria of the company’s consultants | Binary
S.collaboration | Chat contains evidence that the student is collaborating | Binary
I.engineer | Chat contains evidence that the student sees him or herself as an engineer | Binary
I.intern | Chat contains evidence that the student sees him or herself as an intern | Binary
K.actuator | Chat contains evidence that the student understands actuators, a component of the exoskeleton | Binary
K.rom | Chat contains evidence that the student understands range of motion, a feature of the exoskeleton | Binary
K.materials | Chat contains evidence that the student understands the materials being used in the exoskeleton | Binary
K.power | Chat contains evidence that the student understands power sources being used in the exoskeleton | Binary
K.sensor | Chat contains evidence that the student understands the sensors being used in the exoskeleton | Binary
K.attribute | Chat contains evidence that the student understands the evaluation criteria for the exoskeleton | Binary
K.data | Chat contains evidence that the student understands data being used in the design process | Binary
K.design | Chat contains evidence that the student understands the design process | Binary

ACKNOWLEDGMENTS

This work was funded in part by the National Science Foundation (DRL-0918409, DRL-0946372, DRL-1247262, DRL-1418288, DUE-0919347, DUE-1225885, EEC-1232656, EEC-1340402, REC-0347000), the MacArthur Foundation, the Spencer Foundation, the Wisconsin Alumni Research Foundation, and the Office of the Vice Chancellor for Research and Graduate Education at the University of Wisconsin-Madison. The opinions, findings, and conclusions do not reflect the views of the funding agencies, cooperating institutions, or other individuals.