
Understand My Steps - Using Edge-bundling to Visualize GPS Tracks

Daniel Sil

Técnico Lisboa

Lisbon, Portugal

[email protected]

Abstract—The number of devices capable of recording GPS data is growing, and so more people are interested in collecting and analysing their geospatial data. However, GPS analysis and visualization techniques are not simple or easy to use, especially when one needs to deal with large quantities of geospatial data.

Time is a relevant parameter when analysing movement, and having it together with geospatial data in the same visualization is what we aimed for in this work. We wanted to explore a solution to visualize sets of geospatial data featuring temporal information.

In order to make it easier to analyse large amounts of GPS data, we developed and studied a set of prototypes as a proof of concept of trajectory visualization that also presents time and uses edge-bundling techniques. We also developed an algorithm to process collected GPS data and prepare it to be visualized.

We established that our prototypes should be able to show multiple trajectories, allowing users to understand the start, end and intermediary points of each one, including points that different trajectories have in common, as well as presenting temporal information about each trajectory. The prototypes were tested with users, and the evaluation of the results showed that users were able to use and understand the information presented by the visualizations; hence, our goal for this work was achieved.

Index Terms—Trajectory visualization; Spatio-temporal data; edge-bundling

I. INTRODUCTION

The “key to good visualisation design is the appropriate visual encoding of data for the data characteristics and the task at hand” [1]. Nowadays, there is a huge amount of data about almost everything. Geospatial data is one example of data that has been growing through the years. What is contributing to this growth is that, now, everybody can create data. While, in the past, data could only be recorded with expensive and specific devices, electronic devices such as smartwatches and smartphones can now record the Global Positioning System (GPS) positions of their owners. Those devices are becoming more and more popular, creating a large quantity of data that needs satisfactory ways to be visualized. This data does not only contain GPS positions (geospatial data); it can also record time, associating it with the positions (temporal data). Therefore, we consider this data to be geo-temporal.

A user might not want to deal with collected GPS data and try to find interesting information in it, because doing so is too complicated. Geo-temporal data, per se, is not easy to understand; users need to visualize it in a way that allows them to draw conclusions from that visualization. This can be just for entertainment purposes, for help in planning future journeys or even for remembering events from the past. With the proper tools, everybody can be a visual analyst [2].

When visualizing a track, which can easily be represented by a line over a map, a user wants to be able to understand where that track starts, where it ends and where it has passed by. But if this is multiplied by a large quantity of tracks, all crossing the same streets, it might not be that easy to understand all this information about each track separately, nor to draw conclusions about the similarity of tracks, such as points in common between trajectories.

A visualization of movement must take into account time and space, because those are the two measures that define a movement. Space is a set of locations with some distance separating them, and it can be seen as an area, a line containing locations or a set of points in different locations. Time is a continuous set, but it can also be discrete, when referring to events, and it is linear and cyclic at the same time. A visualization that only gives information about space does not allow a user to know when a trajectory occurred. At the same time, a visualization that focuses much more on time might not be clear about how the trajectory and displacement really occurred.

To visualize geospatial data, maps are used with some additional features that present the information. These features can be glyphs over the map, colour-encoded values that colour the map itself or even arrows and lines, usually when one wants to represent trajectories.

Currently, there are ways to visualize both time and space; however, they are not accessible to all types of users, with all levels of knowledge. Options like space-time cubes or geospatial visualizations with complementary temporal visualizations are some examples, but each technique has limitations, and it is not always simple to decide which is the best technique, since it might depend on the data to be presented. A common solution might be to use a combination of visualization techniques, allowing the analyst to reach conclusions one way or another.


Therefore, it is considered useful to have a tool that can provide satisfactory visualizations of movement without requiring the user to be an expert. Current tools for non-expert users are simplistic and not very complete regarding the information that they can provide. They do not allow users to compare tracks regarding time (which is older, which is more recent, which happened at the same time...), and they hardly have the capability to compare and find similarity between trajectories. They also do not allow users to draw conclusions about movement patterns and the regularity of movements.

A. Objective

The goal of this work is to study techniques to visualize, in an integrated way, geo-temporal data collected over long periods of time, allowing geospatial comparison between trajectories and providing information about the times of all tracks.

In order to achieve this goal, we defined some sub-objectives:

• Identify common problems regarding collected GPS data;

• Develop an algorithmic solution to pre-process geo-temporal data, fixing those problems and preparing the data to be visualized;

• Develop a set of prototypes that provide information about multiple trajectories, in time and space, and allow comparisons between them, using edge-bundling;

• Evaluate and compare the prototypes in order to understand how usable they are and how users can retrieve and understand information from them.

Our work consists of four prototypes, created and evaluated through statistical analysis of the results obtained in usability tests. We also developed an algorithm that pre-processes the data and exports what will be used as input for our prototypes.

B. Document Structure

We start by presenting and discussing related work, focusing on Visual Analytics and looking at some works and techniques already in use to visualize geo-temporal data. At the end of that section, we define requirements for our solution. In the next section, we describe how we prepared data for our visualizations and present the prototypes created. After that, we explain how we conducted our usability tests and present their results. Finally, in the last section, we present our conclusions on this work and reflect on future work.

II. RELATED WORK

This section will present some of the existing works and techniques to visualize geo-temporal data. In the end, we identify what we want to achieve with this work, defining requirements and techniques to be used.

A. Visual Analytics

A visualization is a “visual representation of data or concepts” [3], but Visual Analytics (VA) is more than that. It takes into account the human factors of those who are going to analyse what is being visualized [4].

The usage of VA to analyse movement and trajectories is a topic that is becoming ever more popular, with the growth of available devices that can track personal GPS data about their owners. When applying VA to movement, time and space are the two most relevant dimensions. The simplest method to represent a geospatial trajectory of a single object is a line over a map, but it only gives us the set of positions where our object has been. We do not know how much time it spent in each position, and we cannot know the direction of the movement. Other techniques, like space-time cubes and heatmaps, are alternatives that provide more information on the space and time of the trajectory of our object. In the following subsections, we will present some different visualization techniques, first regarding just space and then both space and time.

B. Visualizations of Space

Maps are used to stimulate visual thinking about geospatial patterns. It is important to view the same geospatial data sets using multiple representations [5], using either 2D or 3D techniques, since both can be useful, depending on what one wants to conclude [6]. We now present some techniques to visualize geospatial data.

1) Heatmaps: A heatmap consists of encoding values as colours and applying those colours to a map, in order to present some information about a place [7]. Heatmaps are commonly used to represent information about a place, and not about movement. However, they can also be used to visualize trajectories.

In a recent work, called SmartAdP [8], heatmaps were used to help billboard placement companies visualize taxi trajectories, in order to understand which places are most frequented and, therefore, better for advertising. When a heatmap aggregates all trajectories passing through each street and gives an immediate visual impression of which places are the most frequently used in trajectories, it can be called a traffic density map.

2) Edge-Bundling: Edge-bundling consists of clustering similar data and presenting it as one [9], making the visualization cleaner and allowing more information to be presented at once. The most common usage is to display directed graphs, clustering edges that go to (or from) the same node [10].

Graser et al. explored the possibility of using this technique to visualize geospatial data sets [11]. Figure 1 shows an example of edge-bundling being used to visualize gull migrations in regions of Africa and Europe. Figure 1a shows the original data recorded, reduced to start and end points (origin-destination), and Figure 1b shows the edge-bundling solution, with tracks clustered and aggregated, resulting in far fewer lines being displayed.


Fig. 1. Edge-bundling showing gull migrations. Figure a) shows the raw data and Figure b) shows the clustered data using edge-bundling.

C. Visualizations of Space and Time

Currently, there are no systematic methods to detect the scales needed, both in space and time, to apply to visualization techniques. Therefore, a “trial-and-error” approach is usually used by analysts [2]. It is also not possible to define a method that can show all possible retrievable information equally, i.e., every VA technique will, inevitably, favour one interpretation and one conclusion over others [12]. We now present some existing techniques to visualize both time and space.

1) Space-time Cubes: A space-time cube is a three-dimensional approach to visualizing both space and time, first introduced in 1970 [13]. The term refers to a geographical representation where time is treated as a third dimension [14]. A vertical line in a space-time cube means that the object was in the same position for some amount of time [15].

2) Timelines and Maps: Timelines can deal with much larger datasets than other techniques, especially space-time cubes. Although they might not give the best geospatial view per se, timelines are usually a better option when one wants to draw conclusions about time intervals.

In a work by Zhang et al., timelines were used to understand which parts of Tallinn (Estonia) were most frequented by people, considering some socio-economic data about each person participating in the research, in an attempt to find a relationship between those socio-economic characteristics and the places most frequented by each person. The authors divided Tallinn into parts and assigned one colour to each part. Then, to represent each person, they used a bar in a timeline, where time was represented horizontally. For each person, the colour of the bar for a period meant that the person was in the area corresponding to that colour during that period.

D. Discussion and Requirements

Considering the works presented here, we could conclude that there are some issues regarding the visualization of space and time, especially regarding scalability and usability. Space-time cubes can represent trajectories over time but tend not to scale well, and there are not many solutions for non-expert users. Timelines and maps, however, can deal with scalability and usability much better, but do not give the perception of trajectories. The ideal solution should, therefore, be scalable, easy to use by non-experts and able to provide geospatial, temporal and additional information about the trajectories presented.

Having in mind what we have already presented, we intended to conduct a visualization design study that could create and evaluate a set of prototypes fulfilling the following list of requirements:

• Be able to visualize full trajectory paths (starting, ending and intermediary points).

• Be able to present when a trajectory occurred.

• Be able to present more than one trajectory and allow trajectory comparison.

• Be able to show the places where the object has been the most.

We decided to explore the edge-bundling solution but, this time, applied to “ground level” coordinates and trajectories, instead of only considering start and end points, as in the work by Graser et al. We also wanted to explore the capability of this technique to present space and time together. To accomplish this, we needed a way to understand which trajectories passed through the same streets, roads or paths, merge those segments into one, and then properly visualize them, also featuring information about time.

In order to do this, we needed to find ways to pre-process our GPS data algorithmically. This algorithm consists of a set of actions that includes treating each piece of data individually and then treating the data as a whole, applying comparing and merging actions between every element. The algorithm development methods and the visualization prototypes created are described in the following sections.

III. UNDERSTAND MY STEPS

The following subsections will address the solutions that we developed for this work. First, we describe the algorithm to process GPS data and, then, we present the prototypes created.

A. Preparing Data

GPS data is usually recorded in GPS Exchange Format (GPX) files, so we needed to develop a solution that could deal with those files and prepare the data to be properly visualized. We created an algorithm to deal with this problem. Even though it was not the main focus of this work, it was an important part of it, since it allowed us to pre-process and prepare the data to be visualized.
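As an illustration of the input side of such a pipeline, the following is a minimal sketch (not the algorithm developed in this work) of reading track points from a GPX 1.1 document using only the Python standard library. The sample data and the function name `read_track_points` are hypothetical.

```python
import xml.etree.ElementTree as ET
from datetime import datetime

# GPX 1.1 elements live in this XML namespace.
GPX_NS = {"gpx": "http://www.topografix.com/GPX/1/1"}

# Hypothetical two-point track, used only for illustration.
SAMPLE_GPX = """<gpx version="1.1" creator="example"
     xmlns="http://www.topografix.com/GPX/1/1">
  <trk><trkseg>
    <trkpt lat="38.7369" lon="-9.1387"><time>2019-05-01T08:00:00Z</time></trkpt>
    <trkpt lat="38.7372" lon="-9.1380"><time>2019-05-01T08:00:30Z</time></trkpt>
  </trkseg></trk>
</gpx>"""


def read_track_points(gpx_text):
    """Return a list of (lat, lon, time) tuples for every <trkpt> element."""
    root = ET.fromstring(gpx_text)
    points = []
    for trkpt in root.iterfind(".//gpx:trkpt", GPX_NS):
        time_el = trkpt.find("gpx:time", GPX_NS)
        when = None
        if time_el is not None and time_el.text:
            # GPX timestamps are ISO 8601 in UTC; map the "Z" suffix to an offset.
            when = datetime.fromisoformat(time_el.text.replace("Z", "+00:00"))
        points.append((float(trkpt.get("lat")), float(trkpt.get("lon")), when))
    return points


points = read_track_points(SAMPLE_GPX)
```

Each point keeps its timestamp next to its coordinates, which is what makes the data geo-temporal rather than purely geospatial.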

Recorded GPS data usually presents some problems, like outliers and inaccurate trajectories, due to recording problems, so we created an algorithm to mitigate those problems, while also identifying parts of tracks (segments) that were similar to each other, in order to have them grouped for the visualization prototypes. This process of identifying similar segments was done through an iterative process that compared all pre-determined segments (split at direction turns, so that each street, road or path of each trajectory is a different segment). Segments that were considered similar and close to each other were unified and treated as one from that point on in future comparisons. Figure 2 shows an example of two segments (X and Y) unifying.

Fig. 2. Example of two segments unifying.
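The idea of pre-determining segments at direction turns can be sketched as follows. This is only an illustration under stated assumptions: points are treated as planar (x, y) coordinates, a new segment starts whenever the heading changes by more than a threshold, and the 30 degree threshold and all function names are hypothetical. The similarity comparison and unification steps are not covered here.

```python
import math


def heading_deg(p, q):
    """Heading from p to q in degrees, treating (x, y) as planar coordinates."""
    return math.degrees(math.atan2(q[1] - p[1], q[0] - p[0])) % 360.0


def angle_diff(a, b):
    """Smallest absolute difference between two headings, in degrees."""
    d = abs(a - b) % 360.0
    return min(d, 360.0 - d)


def split_on_turns(points, turn_threshold_deg=30.0):
    """Split a track into segments wherever the heading changes sharply."""
    if len(points) < 3:
        return [list(points)]
    segments = [[points[0], points[1]]]
    previous = heading_deg(points[0], points[1])
    for p, q in zip(points[1:], points[2:]):
        current = heading_deg(p, q)
        if angle_diff(previous, current) > turn_threshold_deg:
            # The turning point p ends one segment and starts the next.
            segments.append([p, q])
        else:
            segments[-1].append(q)
        previous = current
    return segments


# An L-shaped track: two steps east, then two steps north.
track = [(0, 0), (1, 0), (2, 0), (2, 1), (2, 2)]
segments = split_on_turns(track)
```

Because only consecutive headings are compared, a slow curve would stay in one segment; a real implementation would likely need a more robust turn detector and geodesic bearings for latitude and longitude.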

B. Visualizing Data

We created four proof-of-concept prototypes, using HyperText Markup Language (HTML), Cascading Style Sheets (CSS) and JavaScript (JS), notably with the D3 library. All prototypes used the data output by the processing algorithm described in the previous subsection.

For our visualizations, we used a greyscale map with coloured lines to represent the trajectories made. The colour of the lines represented the relative time when those trajectories occurred: lighter blue lines are older trajectories than the ones represented by darker blue lines.
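The time-to-colour encoding described above can be sketched as a linear interpolation between a light and a dark blue. The two RGB endpoints below are assumptions chosen for illustration, not the colours used in the prototypes, and the prototypes themselves were written in JavaScript rather than Python.

```python
# Hypothetical endpoint colours (RGB): light blue for the oldest track,
# dark blue for the most recent one.
LIGHT_BLUE = (198, 219, 239)
DARK_BLUE = (8, 48, 107)


def lerp_colour(c0, c1, t):
    """Linearly interpolate between two RGB colours for t in [0, 1]."""
    return tuple(round(a + (b - a) * t) for a, b in zip(c0, c1))


def time_to_colour(timestamp, oldest, newest):
    """Map a timestamp to a hex colour relative to the oldest and newest tracks."""
    if newest == oldest:
        t = 1.0  # a single point in time is drawn as the most recent
    else:
        t = (timestamp - oldest) / (newest - oldest)
    r, g, b = lerp_colour(LIGHT_BLUE, DARK_BLUE, t)
    return f"#{r:02x}{g:02x}{b:02x}"


oldest_colour = time_to_colour(0, oldest=0, newest=100)
newest_colour = time_to_colour(100, oldest=0, newest=100)
```

Since the mapping is relative, adding a newer track to the dataset shifts the colours of all existing tracks towards the lighter end.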

The first prototype (presented in Figure 3) is the simplest solution, consisting of simply drawing each trajectory as a line, with a colour corresponding to its relative time. This prototype was intended as a standard starting point for comparison with the other prototypes, and it does not feature any kind of edge-bundling. It does not scale well, because the more tracks there are to display, the more confusing it is expected to be, especially if all those tracks are located within the same streets. It might be possible, however, to get an initial overview of the density of tracks in the map and know which areas of the map are more populated with data.

The second prototype is the first to use edge-bundling. It consists of presenting all segments unified, with the width of each line matching the total number of segments represented by that line. With this, it is possible to know which streets were taken more often. Each line is coloured according to the average of the times of the segments it represents. This prototype is shown in Figure 4.
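A minimal sketch of this encoding, assuming each unified line is stored as a list of the timestamps of the segments it represents: the line width grows with the segment count and the colour is derived from the mean time. The data layout, the `base_width` parameter and the function name are hypothetical.

```python
from statistics import mean


def bundle_style(bundles, base_width=1.5):
    """Derive a drawing style for each unified line.

    `bundles` maps a line identifier to the timestamps of the original
    segments that were unified into that line.
    """
    styles = {}
    for bundle_id, times in bundles.items():
        styles[bundle_id] = {
            "width": base_width * len(times),  # more segments, wider line
            "mean_time": mean(times),          # drives the line colour
        }
    return styles


# Made-up example: one street travelled three times, another only once.
bundles = {"street_a": [10, 20, 60], "street_b": [90]}
styles = bundle_style(bundles)
```

Averaging hides dispersion: a line built from one very old and one very recent segment gets the same colour as a line built from two segments in the middle of the time range.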

The third prototype joins the first and second visualizations, featuring the edge-bundling solution introduced by the latter but also displaying the original tracks at a lower opacity, which still makes it possible to understand the original tracks and the times of each segment represented in the edge-bundling solution. Figure 5 shows this prototype.

The fourth and last prototype, shown in Figure 6, uses the edge-bundling solution presented in prototype two but, this time, lines are not coloured according to the average of times. Instead, lines are coloured using gradients, to show the temporal dispersion of the segments represented by each line. Hence, a line that represents, for instance, three segments will feature a gradient with three colour positions, according to the time of each segment. This makes it possible to know whether that line is featured in more old or new tracks.
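One way to derive such gradient positions is sketched below. It assumes that the stops are evenly spaced along the line and ordered from oldest to newest, which the text does not specify; the function name is hypothetical. Each stop pairs an offset along the gradient with a recency value in [0, 1] that can then be mapped to a shade of blue.

```python
def gradient_stops(times, oldest, newest):
    """Return (offset, recency) pairs, one per segment in the line.

    offset  : position of the colour stop along the gradient, in [0, 1]
    recency : 0.0 for the oldest time in the dataset, 1.0 for the newest
    """
    ordered = sorted(times)  # assumption: stops run from oldest to newest
    span = newest - oldest
    stops = []
    for i, ts in enumerate(ordered):
        offset = i / (len(ordered) - 1) if len(ordered) > 1 else 0.0
        recency = (ts - oldest) / span if span else 1.0
        stops.append((offset, recency))
    return stops


# A line representing three segments yields three colour positions.
stops = gradient_stops([50, 0, 100], oldest=0, newest=100)
```

A line whose stops are mostly close to 1.0 appears predominantly dark, which is the visual cue for a street used mainly in recent tracks.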

Fig. 3. Prototype 1.

Fig. 4. Prototype 2.

Fig. 5. Prototype 3.

Fig. 6. Prototype 4.

IV. EVALUATION

With the four prototypes developed, we had 24 users test all of them. Each user was asked to complete a set of six tasks in all four prototypes. For each task, we measured the time taken, whether it was completed successfully, and its difficulty as rated by the user on a Likert scale from 1 (“no difficulty at all”) to 5 (“extreme difficulty”). For each prototype, users were also asked to complete the Raw NASA Task Load Index (NASA-TLX) questionnaire, which allowed us to evaluate the workload of that prototype [16].
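For reference, the Raw NASA-TLX score is the unweighted mean of the six subscale ratings, in contrast to the original NASA-TLX, which weights the subscales through pairwise comparisons. The sketch below shows that computation; the ratings are made up for illustration and are not data from this study.

```python
from statistics import mean

# The six NASA-TLX subscales.
SUBSCALES = (
    "mental_demand",
    "physical_demand",
    "temporal_demand",
    "performance",
    "effort",
    "frustration",
)


def raw_tlx(ratings):
    """Return the Raw TLX workload score: the plain mean of the six subscales."""
    missing = [name for name in SUBSCALES if name not in ratings]
    if missing:
        raise ValueError(f"missing subscale ratings: {missing}")
    return mean(ratings[name] for name in SUBSCALES)


# Made-up ratings on a 0 to 100 scale for one participant and one prototype.
example_ratings = {
    "mental_demand": 40,
    "physical_demand": 10,
    "temporal_demand": 30,
    "performance": 20,
    "effort": 50,
    "frustration": 30,
}
workload = raw_tlx(example_ratings)
```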

A. Tasks

To evaluate all prototypes equally, the evaluation methods should be the same for all of them, so we decided to apply the same six tasks to each prototype. We needed to change datasets between prototypes, otherwise users would be able to answer the questions just by remembering them, and that would not allow us to evaluate the prototypes. We wanted tasks that were not too complex but that explored the capabilities of our prototypes regarding what we are studying: how they can show time and space together in a map.

We started by defining one task exclusively about time and another exclusively about track count. Those were meant to be two simpler tasks, to test whether the user really understands the basic elements of our visualization prototypes. We then created four more complex tasks that mixed time, space and track count. With those tasks, we wanted to see how well our visualization elements were connected and whether they provided complete information to the user. Despite having only six tasks, we considered them enough to draw conclusions about how the elements are integrated and whether information is correctly presented. We also did not want to create many tasks because users would have to do them all for all four prototypes, and having to answer the same questions four times in a row could make them unhappy.

The complete set of six tasks is:

1) Select the street where you have been more times.

2) Select the start and end points of the trajectory made more recently.

3) Select a street where you have been multiple times recently.

4) Select a trajectory made through different paths. From those paths, select the oldest.

5) Select a trajectory made through different paths. From those paths, select the most used one.

6) Select a street where you have been multiple times in the past, but not recently.

Tasks 1 and 2 are simple tasks. Task 1 evaluates the usage of width to show the number of tracks being represented. We wanted to make sure a user could reach conclusions about the number of tracks being represented by just one line. Task 2 evaluates the usage of colour to represent time. We wanted to make sure users could reason about which tracks were more recent and which were older just by looking at the tracks on the map, with no interaction or extra information provided.

Tasks 3 to 6 are the most complex tasks. Task 3 focuses on the ability to select a street with a line that is wider but also has a darker shade of blue, meaning that the street was travelled more recently. Task 4 focuses on the ability to relate different streets to the same trip (in this case, a trip has a start and an end point, but it does not take into account the actual trajectories). It also tests whether users can relate those different paths to time, knowing which were made further in the past. Task 5 is similar to Task 4 but, instead of focusing on the perception of time among the paths of a multiple-path track, it focuses on the perception of quantity. We wanted to test whether users could reason about the number of times each path was used to cover the same trajectory. Task 6 is similar to Task 3, but asks users about the perception of regularity: for a street where they have been multiple times, users should be able to understand whether that street was travelled many times in a short period or regularly across time.

B. Results

We performed statistical analysis on the data collected during the tests with the goal of understanding whether any prototype differed significantly from the others (either for better or for worse, and always using a significance level of 0.05). We evaluated the successfulness, the time taken to answer, the difficulty as rated by the users and the NASA-TLX scores.

For time and difficulty, we used the Kruskal-Wallis H Test (a rank-based, non-parametric alternative to the one-way analysis of variance (ANOVA)), and for successfulness we used the Chi-Square Test of Independence. For NASA-TLX scores we used the traditional one-way ANOVA test.
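To make the rank-based nature of the Kruskal-Wallis H Test concrete, the following sketch computes the H statistic with the usual tie correction using only the standard library. It is an illustration rather than the statistical package used for this analysis, and the example data is made up. For k groups of reasonable size, H is compared against a chi-square distribution with k − 1 degrees of freedom to obtain the p-value.

```python
from collections import Counter


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        rank = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = rank
        i = j + 1
    return ranks


def kruskal_wallis_h(groups):
    """Return the tie-corrected Kruskal-Wallis H statistic for the groups."""
    pooled = [v for group in groups for v in group]
    n = len(pooled)
    ranks = average_ranks(pooled)
    total = 0.0
    start = 0
    for group in groups:
        rank_sum = sum(ranks[start:start + len(group)])
        total += rank_sum * rank_sum / len(group)
        start += len(group)
    h = 12.0 / (n * (n + 1)) * total - 3.0 * (n + 1)
    ties = sum(t ** 3 - t for t in Counter(pooled).values())
    correction = 1.0 - ties / (n ** 3 - n)
    if correction == 0:
        raise ValueError("all observations are identical")
    return h / correction


# Made-up task times for three groups with no overlap between them.
h_statistic = kruskal_wallis_h([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
```

Only the ranks enter the statistic, which is why the test suits task times that do not follow a normal distribution and ordinal difficulty ratings.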

1) Results by task: Table I shows the p-values calculated for each task, comparing success rate, time and difficulty across all prototypes. Those p-values were calculated with the test methods described above. Precisely, for time (since it did not follow a normal distribution in any case) and difficulty, we used the Kruskal-Wallis H Test, and for success we used the Chi-Square Test of Independence.


TABLE I
CALCULATED p-VALUE FOR ALL VARIABLES AND EACH TASK, COMPARING ACROSS ALL PROTOTYPES.

Task   Variable     p-value
1      Success      0.178
1      Time         0.011
1      Difficulty   0.00035
2      Success      0.363
2      Time         0.837
2      Difficulty   0.838
3      Success      0.564
3      Time         0.636
3      Difficulty   0.863
4      Success      0.682
4      Time         0.169
4      Difficulty   0.188
5      Success      0.679
5      Time         0.730
5      Difficulty   0.166
6      Success      0.155
6      Time         0.412
6      Difficulty   0.382

Any further steps taken, such as post-hoc tests and further conclusions, are described below.

Fig. 7. Tukey's box-plot for difficulty in Task 1 (x-axis: Prototype, 1 to 4; y-axis: Difficulty, 1 to 5).

In Task 1, time and difficulty were significantly different between prototypes. By performing post-hoc tests (pairwise comparisons, provided by the same Kruskal-Wallis H Test), we could conclude that the difference between prototype 1 and the others, regarding difficulty, is significant. By observing the Tukey's box-plot presented in Figure 7, we concluded that prototype 1 was considered by the users to be more difficult than all the others. Regarding time, post-hoc tests allowed us to conclude that the difference between prototypes 1 and 3 is significant; however, even though a difference was observed between prototype 1 and the others, it was not significant. Therefore, we can only conclude, with significance, that users needed more time to answer Task 1 in prototype 1 than in prototype 3.

In all other tasks, however, we could not reject the null hypothesis (the p-value was always greater than 0.05) for any of the variables, meaning that there was no significant difference between prototypes in any variable in the rest of the tasks.

2) Results by prototype: After comparing the results of each task between prototypes, we also compared the results of all tasks in each prototype, in order to see whether there was any task, for any prototype, with a statistically significant difference from the other tasks, in any variable.

TABLE II
CALCULATED p-VALUE FOR ALL VARIABLES AND EACH PROTOTYPE, COMPARING ACROSS ALL TASKS.

Prototype   Variable     p-value
1           Success      0.139
1           Time         0.005
1           Difficulty   0.017
2           Success      0.00046
2           Time         0.765
2           Difficulty   0.747
3           Success      0.127
3           Time         0.082
3           Difficulty   0.493
4           Success      0.168
4           Time         0.110
4           Difficulty   0.420

Table II shows the calculated p-values for the analysis ofvariance comparing success rate, time and difficulty acrossall prototypes. Those p-values were calculated by the testmethods described above. For time (since it did not follownormal distribution in any case) and difficulty, we used theKruskal-Wallis H Test, and, for success, we used the Chi-Square Test of Independence. Post-hoc tests, when applied,will be described below.

In prototype 1, there was a statistically significant difference between tasks for time and difficulty. Applying post-hoc pairwise comparisons, we were able to detect a significant difference between tasks 6 and 4 and between tasks 3 and 4; however, no other significant differences were detected. By observing the Tukey's box-plot in Figure 8, we can see that, in prototype 1, tasks 3 and 6 are the ones that took users the least time and task 4 is the one that took the most time, and the difference between them is, therefore, considered statistically significant. For difficulty, the same post-hoc tests only detected a significant difference between tasks 6 and 1; therefore, we are not able to conclude anything regarding the other tasks.

In prototype 2, a statistically significant difference between tasks was observed for success. As a post-hoc test, we considered adjusted residuals. An absolute value of the adjusted residual greater than 1.6 indicates that the task is significantly different from the others. Task 3 presented an adjusted residual of 2.6 for success, meaning that this task was significantly more successful than all the others. Indeed, this task was always completed successfully in this prototype. On the opposite side, Task 6 presented an adjusted residual of -3.7 for success, meaning that this task was significantly less successful than all the others in prototype 2.

Fig. 8. Tukey's box-plot for time in Prototype 1. [Box-plot: Time (0 to 60) by Task (1 to 6).]

3) Task Comparison: We wanted to compare task results regardless of prototype, in order to understand whether users did significantly better or worse in any of the tasks. Using the methods described above, we calculated the p-value for each variable; they are presented in Table III. For time and difficulty we used the Kruskal-Wallis H Test and for success we used the Chi-Square Test of Independence.

TABLE III
CALCULATED p-value FOR ALL VARIABLES, COMPARING BETWEEN ALL TASKS, REGARDLESS OF PROTOTYPES.

Variable     p-value
Success      0.00003
Time         0.004
Difficulty   0.187

For difficulty, tests showed that there is no statistically significant difference between tasks, but for success and time we needed to apply post-hoc tests.

For success, we used adjusted residuals as the post-hoc test. In our case, we could determine that tasks 2, 3 and 6 were significantly different. Since Task 3 had an adjusted residual for success greater than 1.6, we consider that Task 3 had a statistically significantly better success result than all the others. The opposite happened for tasks 2 and 6, which had an adjusted residual smaller than -1.6 for success, meaning that those tasks are statistically significantly worse than the others in terms of success.
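The adjusted-residual post-hoc step can be sketched as follows. The code computes the Pearson chi-square statistic of a contingency table and the adjusted standardized residual of each cell, then flags cells whose absolute residual exceeds a threshold (1.6 here, the value stated in the text). The function names and the success/failure counts are hypothetical and do not reproduce the study's data.

```python
import math
from typing import List, Sequence, Tuple

Table = Sequence[Sequence[int]]


def chi_square_statistic(table: Table) -> float:
    """Pearson chi-square statistic: sum of (observed - expected)^2 / expected."""
    row_tot = [sum(r) for r in table]
    col_tot = [sum(c) for c in zip(*table)]
    n = sum(row_tot)
    return sum(
        (obs - row_tot[i] * col_tot[j] / n) ** 2 / (row_tot[i] * col_tot[j] / n)
        for i, row in enumerate(table)
        for j, obs in enumerate(row)
    )


def adjusted_residuals(table: Table) -> List[List[float]]:
    """Adjusted standardized residual of every cell.

    Assumes every row and column total is non-zero and smaller than the
    grand total; otherwise the standard error below would be zero.
    """
    row_tot = [sum(r) for r in table]
    col_tot = [sum(c) for c in zip(*table)]
    n = sum(row_tot)
    result = []
    for i, row in enumerate(table):
        out_row = []
        for j, obs in enumerate(row):
            expected = row_tot[i] * col_tot[j] / n
            se = math.sqrt(expected * (1 - row_tot[i] / n) * (1 - col_tot[j] / n))
            out_row.append((obs - expected) / se)
        result.append(out_row)
    return result


def flag_cells(table: Table, threshold: float = 1.6) -> List[Tuple[int, int]]:
    """Return (row, column) indices whose |adjusted residual| exceeds threshold."""
    res = adjusted_residuals(table)
    return [
        (i, j)
        for i, row in enumerate(res)
        for j, value in enumerate(row)
        if abs(value) > threshold
    ]


# Hypothetical (success, failure) counts for tasks 1 to 6.
counts = [[22, 2], [20, 4], [24, 0], [21, 3], [22, 2], [15, 9]]
print(f"chi-square = {chi_square_statistic(counts):.3f}")
for task, (success_res, _) in enumerate(adjusted_residuals(counts), start=1):
    print(f"Task {task}: adjusted residual for success = {success_res:+.2f}")
```

A positive residual in the success column means more successes than expected under independence, and a negative one means fewer, which is how the text reads the values for tasks 3, 2 and 6.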

For time, post-hoc tests (pairwise comparisons) showed that Task 4 was significantly different from tasks 1, 3 and 6, and by observing the Tukey's box-plot presented in Figure 9 we conclude that Task 4 required more time to complete than all the other tasks.

Fig. 9. Tukey's box-plot for time in all prototypes, comparing tasks. [Box-plot: Time (0 to 80) by Task (1 to 6).]
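For readers unfamiliar with Tukey's box-plots, the sketch below computes the quantities such a plot draws: the quartiles, the whiskers (the most extreme values within 1.5 times the interquartile range of the box) and the outliers beyond them. The function name `tukey_summary` and the sample are hypothetical, and quartile conventions differ between tools, so the numbers need not match those of the package that produced the figures.

```python
from statistics import quantiles
from typing import Dict, Sequence


def tukey_summary(values: Sequence[float]) -> Dict[str, object]:
    """Quartiles, whisker ends and outliers as drawn in a Tukey box-plot."""
    data = sorted(values)
    # "inclusive" treats the sample minimum and maximum as the 0th and 100th
    # percentiles; other quartile definitions give slightly different boxes.
    q1, median, q3 = quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence, high_fence = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    inside = [v for v in data if low_fence <= v <= high_fence]
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "whisker_low": inside[0],
        "whisker_high": inside[-1],
        "outliers": [v for v in data if v < low_fence or v > high_fence],
    }


# Hypothetical completion times (seconds) for one task.
summary = tukey_summary([1, 2, 3, 4, 5, 6, 7, 8, 40])
print(summary)
```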

4) Prototype comparison: We also wanted to compare the NASA-TLX scores for all prototypes, to see if there was any significant difference among them. After verifying that the values were normally distributed, we were able to conclude that there were no statistically significant differences between the prototypes' NASA-TLX scores, as determined by one-way ANOVA, with p-value = 0.743. This means that no prototype has a NASA-TLX score significantly different from the others, leading us to conclude that the workload is similar across all prototypes. Table IV shows the result of the ANOVA test comparing NASA-TLX scores.

TABLE IV
ANOVA TABLE FOR NASA-TLX SCORES.

TLX              Sum of Squares   df   Mean Square   F       Sig.
Between Groups   339.563          3    113.188       0.415   0.743
Within Groups    25119.810        92   273.041
Total            25459.373        95
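The derived columns of Table IV follow from the sums of squares and degrees of freedom: each mean square is a sum of squares divided by its degrees of freedom, and F is the ratio of the two mean squares. The sketch below recomputes them from the reported values and adds a small helper, `one_way_anova` (a hypothetical name, shown with made-up scores), that derives the same quantities from raw data.

```python
from typing import Sequence, Tuple

# Values reported in Table IV.
ss_between, df_between = 339.563, 3
ss_within, df_within = 25119.810, 92

ms_between = ss_between / df_between   # mean square between groups
ms_within = ss_within / df_within      # mean square within groups
f_stat = ms_between / ms_within        # F statistic

print(f"MS between = {ms_between:.3f}")
print(f"MS within  = {ms_within:.3f}")
print(f"F          = {f_stat:.3f}")


def one_way_anova(groups: Sequence[Sequence[float]]) -> Tuple[float, float, float]:
    """Return (SS between, SS within, F) for a one-way ANOVA on raw scores."""
    all_values = [v for g in groups for v in g]
    grand_mean = sum(all_values) / len(all_values)
    means = [sum(g) / len(g) for g in groups]
    ss_b = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_w = sum((v - m) ** 2 for g, m in zip(groups, means) for v in g)
    df_b = len(groups) - 1
    df_w = len(all_values) - len(groups)
    return ss_b, ss_w, (ss_b / df_b) / (ss_w / df_w)


# Hypothetical workload scores for three groups, only to exercise the helper.
print(one_way_anova([[1, 2, 3], [2, 3, 4], [3, 4, 5]]))
```

The significance value (0.743) comes from the F distribution with 3 and 92 degrees of freedom, which this sketch does not evaluate.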

In order to compare prototypes, we performed statistical analysis of success, time and difficulty for every prototype, regardless of the task. Using the methods described at the beginning of this section, we calculated p-values for each variable, as presented in Table V. For time and difficulty we used the Kruskal-Wallis H Test and for success we used the Chi-Square Test of Independence.

TABLE V
CALCULATED p-value FOR COMPARISON BETWEEN ALL PROTOTYPES, REGARDLESS OF TASK.

Variable     p-value
Success      0.838
Time         0.033
Difficulty   0.002

For time and difficulty, tests showed a statistically significant difference, but post-hoc tests (pairwise comparisons) for time did not identify any pair as significantly different (all pairwise p-values > 0.05). For difficulty, post-hoc tests allowed us to conclude that prototype 1 differed significantly from all other prototypes, and Figure 10 shows that, overall, prototype 1 was considered more difficult to use than the others.

Fig. 10. Tukey's box-plot for difficulty in all prototypes. [Box-plot: Difficulty (1 to 5) by Prototype (1 to 4).]

5) Descriptive Analysis: Although we could not find overall significant differences indicating that one prototype is better than the others, the results were positive: users were able to complete tasks quickly, had a good success rate, and did not consider the prototypes hard to use.

By calculating a confidence interval (with α = 0.05), we were able to conclude, with 95% confidence, that users would take less than 10 seconds in prototypes 2, 3 and 4 (and less than 12 seconds in prototype 1). The best estimate of the success rate is above 85% in all prototypes, and difficulty is below level 2 in prototypes 2, 3 and 4.
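The text does not say how the confidence intervals were computed. As one possible reading, the sketch below uses a normal approximation for the mean completion time and a Wilson score interval for the success rate, both with α = 0.05. The choice of method, the function names and the sample values are assumptions made for illustration only.

```python
import math
from statistics import NormalDist, mean, stdev
from typing import Sequence, Tuple


def mean_confidence_interval(
    sample: Sequence[float], alpha: float = 0.05
) -> Tuple[float, float]:
    """Two-sided interval for the mean, using the normal approximation."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    centre = mean(sample)
    half_width = z * stdev(sample) / math.sqrt(len(sample))
    return centre - half_width, centre + half_width


def wilson_interval(
    successes: int, trials: int, alpha: float = 0.05
) -> Tuple[float, float]:
    """Wilson score interval for a proportion such as a success rate."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    p = successes / trials
    denom = 1 + z * z / trials
    centre = (p + z * z / (2 * trials)) / denom
    half_width = (
        z * math.sqrt(p * (1 - p) / trials + z * z / (4 * trials * trials)) / denom
    )
    return centre - half_width, centre + half_width


# Hypothetical completion times (seconds) and success counts for one prototype.
times = [6.2, 8.4, 7.1, 9.5, 5.8, 7.9, 8.8, 6.6]
low, high = mean_confidence_interval(times)
print(f"mean time: {mean(times):.2f} s, 95% CI [{low:.2f}, {high:.2f}]")

low, high = wilson_interval(successes=130, trials=144)
print(f"success rate: {130 / 144:.1%}, 95% CI [{low:.1%}, {high:.1%}]")
```

A statement such as "users would take less than 10 seconds" then corresponds to the upper end of the interval for the mean lying below 10. For small samples a Student's t quantile would give a wider interval than the normal one used here.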

6) Observations and Feedback: During tests, even though we were not formally using the think-aloud test protocol, we did not prevent users from talking while performing the tasks. From what they were saying and from our observation, we could gather some general conclusions about our prototypes:

• In prototype 1, some users tried to count the lines on each street, in order to know which street had been travelled more times. Although this may not be wrong or considered an error, users should be able to compare streets by zooming out and seeing where there was a higher density of lines.

• In prototype 4, some users complained about the gradient, claiming that it distracted and confused them regarding time and the number of tracks represented by a line. Some also considered that line orientation might be misleading at first.

• In prototypes 2 and 3, when asked about the most recent or oldest tracks (tasks 2 and 4), all users except four drew their conclusion from the average colour. While most of the time they were, indeed, correct for the task, they should have been able to conclude that in prototype 2 it was not possible to know the most recent (or oldest) track and that in prototype 3 they should look at the “original” tracks instead of the unified ones. In the presentations of those prototypes, users were clearly told that, in those visualizations, colour represented the average.

Besides statistical analysis and observation, we wanted to draw conclusions from users' feedback, collected at the end of each test session. We asked users to give their opinions on which prototype was the best and which was the worst, as well as to point out problems they had found and, if possible, to suggest corrections and features to add to our prototypes. In this feedback, some opinions diverged quite significantly but others were unanimous.

Prototype 1 was the least appreciated by the users. The great majority of them said it was the hardest prototype to use and that it required some visual effort to reach conclusions. This matches the results presented above, where the only significant differences detected were, precisely, those showing that prototype 1 was worse than all the others.

Regarding prototype 4, the great majority of users liked it, but considered that it could be improved. Some said it requires some more training, to get used to the gradient and what it means, because it is a new concept that they are not used to dealing with. Some also suggested that, instead of a colour gradient, we could use solid colours, each filling the corresponding percentage of the line. Prototype 4 was considered the most complete by users, in terms of the information provided.

About prototypes 2 and 3, opinions were almost unanimous as well. A large number of users stated that, along with prototype 4, prototype 2 (and, sometimes, prototype 3 as well) was also good. However, most of those users never realised that prototype 2 could not provide information regarding the exact times of the tracks, as mentioned before. Many claimed that prototype 2 was the most “balanced” one, because it was simple and not too “eye aggressive”, and still allowed them to come to conclusions. Some also stated that prototype 3 was very complete, because it allowed quick and detailed conclusions with the same visualization, but sometimes the original tracks were unintentionally ignored and wrong conclusions were drawn from the averaged lines.

Suggestions given by some users sometimes contradicted the compliments made by others. A few users suggested that the colour scale should be inverted, or that more colours should be used. One user suggested that the most recent track should be highlighted in all prototypes, using a completely different colour, and that this could be triggered by user choice. Interactivity was the most requested feature. Users suggested ways to analyse each trip individually with some kind of selection feature. They also suggested reserving some space on screen to present detailed information about a selected track, such as the exact number of times that trajectory was travelled, the exact date when it occurred, or even a highlight of the start and end points of that track.

V. CONCLUSION

We started by looking at current solutions to visualize time and space together, identifying the limitations of each one, in order to understand how we could use previous knowledge and work to develop a new, alternative and easy-to-use solution. We concluded that a map was still the best option to present geospatial information and that time could be mapped using colours. We then decided to use edge-bundling to visualize multiple tracks on a map while providing time perception.

Thus, we studied a way to prepare data to be visualized and a proper visualization technique, using edge-bundling. We developed a total of four proof-of-concept prototypes. One prototype (prototype 1) did not use edge-bundling and was intended as a starting point for the other three prototypes, but still used colour to represent time. Another prototype (prototype 4) used a technique that we wanted to test, which consists of using colour gradients in trajectory lines to display the distribution of times of the trajectories represented by those lines.

After creating the four prototypes, we tested them with users (usability tests), in order to find out whether some prototypes were better or worse than the others, and also whether users could use them successfully, easily and quickly. Our analysis concluded that there were few significant differences between prototypes, but prototype 1 was overall worse than the others. It was also possible to conclude that users were able to perform tasks in a short time with a good success rate, and did not consider our prototypes hard to use. However, we noticed that users could be misled about time in one prototype that only presented the average of times (prototype 2).

Given the above, we consider that all our sub-objectives were achieved and that our goal of studying techniques to visualize geo-temporal data collected through time, with the possibility to compare trajectories, was also achieved.

A. Future Work

Considering the current work and the results from usability tests, we consider that some features can be improved:

• Enhance the processing algorithm. Find a way so that similar segments are all unified into one, especially when start and/or end points are near each other. Currently, if there are three or more segments that start or end at nearby points, it is possible that they are not all unified into one, because each unifies with the first possible match, and the order of comparisons is not always the same.

• Add the possibility to have information about the absolute (instead of just relative) time of tracks.

• Add interactivity to prototypes, in order to improve the user experience of visualising tracks, as suggested by users, by providing more information regarding tracks and the trajectories they represent.

• Improve some visualization elements, studying their effects on users and comparing results with the current ones. This can mean either an improvement in existing elements (like opacity, colours, width, gradients...) and the way they are calculated (new scales, new algorithms...) or the creation and implementation of new elements.
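Regarding the first item above, one conceivable way to make segment unification independent of comparison order is to group segments transitively with a union-find structure instead of merging each segment with its first match. The sketch below is only an illustration of that idea, not the processing algorithm developed in this work: the function name `group_segments`, the planar coordinates, the distance threshold `eps` and the rule that both endpoints must be close are all assumptions.

```python
import math
from typing import Dict, List, Sequence, Tuple

Point = Tuple[float, float]
Segment = Tuple[Point, Point]


def group_segments(segments: Sequence[Segment], eps: float) -> List[List[int]]:
    """Group segment indices whose start and end points lie within eps.

    Grouping is transitive: if A is close to B and B is close to C, all three
    land in one group, whatever the order in which pairs are compared.
    """
    parent = list(range(len(segments)))

    def find(i: int) -> int:
        # Follow parent links to the group representative, halving the path.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    def close(a: Segment, b: Segment) -> bool:
        return math.dist(a[0], b[0]) <= eps and math.dist(a[1], b[1]) <= eps

    for i in range(len(segments)):
        for j in range(i + 1, len(segments)):
            if close(segments[i], segments[j]):
                root_i, root_j = find(i), find(j)
                if root_i != root_j:
                    parent[max(root_i, root_j)] = min(root_i, root_j)

    groups: Dict[int, List[int]] = {}
    for i in range(len(segments)):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


# Hypothetical segments: the first three start and end near each other.
segments = [
    ((0.0, 0.0), (10.0, 0.0)),
    ((0.5, 0.0), (10.5, 0.0)),
    ((1.0, 0.0), (11.0, 0.0)),
    ((50.0, 50.0), (60.0, 60.0)),
]
print(group_segments(segments, eps=0.6))
```

In this example the first and third segments are not within `eps` of each other, yet they end up in the same group through the second one. A transitive rule like this can also chain together segments that are far apart, so a real implementation would likely need an additional limit on group extent.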

REFERENCES

[1] Q. Zhang, A. Slingsby, J. Dykes, J. Wood, M.-J. Kraak, C. A. Blok, and R. Ahas, “Visual analysis design to support research into movement and use of space in Tallinn: A case study,” Information Visualization, vol. 13, no. 3, pp. 213–231, 2013. [Online]. Available: http://ivi.sagepub.com/lookup/doi/10.1177/1473871613480062

[2] G. Andrienko, N. Andrienko, U. Demsar, D. Dransch, J. Dykes, S. I. Fabrikant, M. Jern, M.-J. Kraak, H. Schumann, and C. Tominski, “Space, time and visual analytics,” International Journal of Geographical Information Science, vol. 24, no. 10, pp. 1577–1600, 2010. [Online]. Available: http://www.tandfonline.com/doi/abs/10.1080/13658816.2010.508043

[3] C. Ware, “Foundations for a Science of Data Visualization,” in Information Visualization - Perception for design, 2nd ed. San Francisco, CA, USA: Elsevier, 2004, ch. 1, pp. 1–30. [Online]. Available: http://linkinghub.elsevier.com/retrieve/pii/B9780123814647000016

[4] D. A. Keim, F. Mansmann, J. Schneidewind, J. Thomas, and H. Ziegler, “Visual analytics: Scope and challenges,” in Lecture Notes in Computer Science. Berlin, Germany: Springer, Oct. 2008, vol. 4404 LNCS, no. 7, pp. 76–90. [Online]. Available: http://www.mendeley.com/research/scope-challenges-visual-analytics/?utm_source=desktop&utm_medium=1.12.3-dev3&utm_campaign=open_catalog&userDocumentId=%7Bbca2e0b5-4d2a-48d2-8774-5743949c5e2a%7D http://link.springer.com/10.1007/978-3-540-71080-6_6 http://w

[5] M.-J. Kraak, “The Space-Time Cube Revisited from a Geovisualization Perspective,” in 21st International Cartographic Conference (ICC). Durban, South Africa: International Cartographic Association, 2003, pp. 1988–1996.

[6] S. Dubel, M. Rohlig, H. Schumann, and M. Trapp, “2D and 3D presentation of spatial data: A systematic review,” in 2014 IEEE VIS International Workshop on 3DVis (3DVis). Paris, France: IEEE, Nov. 2014, pp. 11–18. [Online]. Available: http://ieeexplore.ieee.org/document/7160094/

[7] D. Fisher, “Hotmap: Looking at Geographic Attention,” IEEE Transactions on Visualization and Computer Graphics, vol. 13, no. 6, pp. 1184–1191, Nov. 2007. [Online]. Available: http://ieeexplore.ieee.org/document/4376139/

[8] D. Liu, D. Weng, Y. Li, J. Bao, Y. Zheng, H. Qu, and Y. Wu, “SmartAdP: Visual Analytics of Large-scale Taxi Trajectories for Selecting Billboard Locations,” IEEE Transactions on Visualization and Computer Graphics, vol. 23, no. 1, pp. 1–10, Jan. 2017. [Online]. Available: http://ieeexplore.ieee.org/document/7534856/

[9] M. Dickerson, D. Eppstein, M. T. Goodrich, and J. Y. Meng, “Confluent drawings: visualizing non-planar diagrams in a planar way,” in International Symposium on Graph Drawing. Springer, 2003, pp. 1–12. [Online]. Available: https://doi.org/10.1007/978-3-540-24595-7_1


[10] D. Holten and J. J. Van Wijk, “Force-directed edge bundling for graph visualization,” in Computer Graphics Forum, vol. 28, no. 3. Wiley Online Library, 2009, pp. 983–990. [Online]. Available: https://doi.org/10.1111/j.1467-8659.2009.01450.x

[11] A. Graser, J. Schmidt, F. Roth, and N. Brandle, “Untangling origin-destination flows in geographic information systems,” Information Visualization, pp. 1–20, 2017. [Online]. Available: https://doi.org/10.1177/1473871617738122

[12] D. Keim, G. Andrienko, J.-D. Fekete, C. Gorg, J. Kohlhammer, and G. Melancon, “Visual Analytics: Definition, Process, and Challenges,” in Information Visualization - Human-Centered Issues and Perspectives, 1st ed. Berlin, Germany: Springer-Verlag Berlin Heidelberg, 2008, pp. 154–175. [Online]. Available: http://hal-lirmm.ccsd.cnrs.fr/lirmm-00272779 http://link.springer.com/10.1007/978-3-540-70956-5_7

[13] T. Hagerstraand, “What About People In Regional Science?” Papers in Regional Science, vol. 24, no. 1, pp. 7–24, Jan. 2005. [Online]. Available: http://doi.wiley.com/10.1111/j.1435-5597.1970.tb01464.x

[14] P. Gatalsky, N. Andrienko, and G. Andrienko, “Interactive analysis of event data using space-time cube,” in Proceedings. Eighth International Conference on Information Visualisation, 2004. IV 2004. London, UK: IEEE, 2004, pp. 145–152. [Online]. Available: http://ieeexplore.ieee.org/lpdocs/epic03/wrapper.htm?arnumber=1320137 http://ieeexplore.ieee.org/document/1320137/

[15] G. Andrienko, N. Andrienko, P. Bak, D. Keim, and S. Wrobel, Visual Analytics of Movement, 1st ed. Berlin, Heidelberg: Springer Berlin Heidelberg, 2013. [Online]. Available: http://www.springer.com/us/book/9783642375828 http://link.springer.com/10.1007/978-3-642-37583-5

[16] S. G. Hart, “NASA-Task Load Index (NASA-TLX); 20 years later,” Proceedings of the Human Factors and Ergonomics Society Annual Meeting, vol. 50, no. 9, pp. 904–908, 2006. [Online]. Available: https://doi.org/10.1177/154193120605000909

[17] K. Durrheim and C. Tredoux, Numbers, hypotheses & conclusions: A course in statistics for the social sciences. Juta and Company Ltd, 2004.