
Actinfo: Information Platform for Physical Activity

Alexandre Silva Carreira

Thesis to obtain the Master of Science Degree in

Biomedical Engineering

Supervisors: Prof. Dr. Mário Jorge Costa Gaspar da Silva

Prof. Dr. Maria de Fátima Marcelina Baptista

Examination Committee

Chairperson: Prof. Dr. João Miguel Raposo Sanches

Supervisor: Prof. Dr. Mário Jorge Costa Gaspar da Silva

Member of the Committee: Dr. Pedro Alexandre Barracha da Guerra Júdice

June 2019

Preface

The work presented in this thesis was performed at the Exercise and Health Lab, Faculty of Human Kinetics, University of Lisbon (Lisbon, Portugal), during the period September 2018-March 2019, under the supervision of Prof. Fátima Baptista. The thesis was co-supervised at Instituto Superior Técnico by Prof. Mário J. Silva and Prof. Bruno Martins.


Declaration

I declare that this document is an original work of my own authorship and that it fulfills all the requirements of the Code of Conduct and Good Practices of the Universidade de Lisboa.


Acknowledgments

First and foremost, I would like to express my appreciation to Professor Fátima Baptista, Professor Mário J. Silva and Professor Bruno Martins for all their assistance, guidance and availability throughout this project.

I would like to offer my special thanks to João Magalhães and Pedro Júdice for all the valuable input and feedback, and for welcoming me and showing me the ropes during my time at the Exercise and Health Lab.

I also wish to acknowledge the help provided by all the researchers at the lab who allowed me to use the data which made this work possible and, in particular, those who took the time to test the platform, providing me with valuable feedback for evaluating the prototype.

Finally, I wish to thank my parents and brother for their support and encouragement throughout this journey.


Resumo

Esta dissertação apresenta o desenvolvimento e avaliação de uma nova plataforma de informação para gestão de dados de actividade física (AF), Actinfo. Construída com base num stack de tecnologias JavaScript de código aberto (MEAN), esta plataforma é tanto um repositório de estudos de AF como uma ferramenta para operar sobre ficheiros de actigrafia. Actinfo fornece visualizações de estatísticas relevantes geradas a partir de informação em estudos de AF, bem como ferramentas para comparação de dados de AF de estudos diferentes. O modelo de dados utilizado tem por base o padrão FHIR, assegurando interoperabilidade com informação clínica. Para validar a precisão das ferramentas de processamento de dados implementadas foi conduzido um estudo comparativo, para comparar indicadores temporais de AF e o cumprimento com recomendações de AF entre dois grupos: uma população de adultos com diabetes tipo II, estudo “D2FIT” (n=73) e uma amostra da população adulta do município de Lisboa, estudo “ProjCML” (n=69). Concluiu-se que uma menor percentagem dos participantes do estudo D2FIT cumpre com as recomendações de AF (18% vs. 35%), e que os mesmos atingem, em média, um menor tempo em comportamento sedentário (71.89% vs. 72.12% do tempo de utilização do acelerómetro), menor tempo em AF de intensidade moderada a vigorosa (3.99% vs. 4.88% do tempo de utilização do acelerómetro) e um maior número de interrupções no comportamento sedentário (10.04 vs. 9.64 interrupções/hora em comportamento sedentário), por dia. Resumindo, foi desenvolvido um protótipo funcional de uma plataforma de gestão de dados de AF com boa usabilidade.

Palavras-chave: Actividade física, actigrafia, plataforma web, MEAN stack, FHIR


Abstract

This dissertation presents the development and assessment of Actinfo, a new information platform for the management of physical activity (PA) data. Built using a full stack of open-source JavaScript technologies (MEAN), this platform is both a repository of PA studies and a tool for performing a number of operations on actigraphy files. Actinfo provides visualizations of relevant statistics from information in PA studies, as well as tools for comparing PA data from different studies. Data in Actinfo is modelled after the FHIR standard for healthcare information exchange, to ensure interoperability with clinical data. To validate the accuracy of the data processing tools implemented, a comparative study was carried out to compare computed PA time indicators and compliance with PA recommendations between two studies: a population of adult patients with type II diabetes, i.e. the study “D2FIT” (n=73), and a sample of the adult population of the municipality of Lisbon, i.e. the study “ProjCML” (n=69). It was possible to conclude that a lower percentage of participants in study D2FIT attain sufficient physical activity (18% vs. 35%), and that subjects in this study average less sedentary time per day (71.89% vs. 72.12% of accelerometer wear time), less time in moderate- to vigorous-intensity PA per day (3.99% vs. 4.88% of accelerometer wear time) and a higher number of interruptions in sedentary behaviour (10.04 vs. 9.64 breaks/hour of sedentary time). In summary, it was possible to achieve a functional prototype of a PA data management platform with good usability.

Keywords: Physical activity, actigraphy, web platform, MEAN stack, FHIR


Contents

1 Introduction
1.1 Objectives
1.2 Methods
1.3 Contributions
1.4 Thesis Outline

2 Background on physical activity
2.1 Physical activity and health
2.1.1 Physical activity: role in health
2.1.2 Classifying physical activity
2.1.3 WHO’s recommendations on physical activity
2.2 Objectively measured physical activity: actigraphy
2.3 Chapter overview

3 Supporting technology
3.1 Web application architecture
3.2 The MEAN stack
3.2.1 Angular
3.2.2 Node.js
3.2.3 Express
3.2.4 MongoDB
3.3 The FHIR standard
3.4 Security and authentication

4 Actinfo
4.1 Overview of the platform
4.1.1 Database
4.1.2 Web server
4.1.3 Client
4.2 User interface
4.2.1 Administrator role
4.2.2 Researcher
4.3 Compliance with data protection regulations

5 Assessing Actinfo
5.1 Conformity with requirements
5.2 Platform usability
5.3 Comparative study with two adult populations
5.3.1 Studies
5.3.2 Data preparation
5.3.3 Analysis
5.3.4 Discussion of experimental results

6 Conclusions and future work
6.1 Conclusions
6.2 Future Work

Bibliography

A Entity types and document fields
B User interface
C Exported Excel file example
D Consent forms


List of Tables

5.1 Conformity of the platform’s features
5.2 SUS scores of Actinfo’s evaluation
5.3 Number of breaks per hour of sedentary time
5.4 Daily average ST, from Actilife
5.5 Daily average time in MVPA, from Actilife
5.6 Daily average number of breaks/ST hour, from Actilife
5.7 Mean absolute error for each computed PA time indicator
A.1 Fields in documents with the user entity type
A.2 Fields in documents with the researchStudy entity type
A.3 Fields in documents with the studyGroup entity type
A.4 Fields in documents with the researchSubject entity type
A.5 Fields in documents with the file entity type


List of Figures

2.1 Actigraph Corp. activity monitors
2.2 Processing of accelerometer data
2.3 *.agd file schema
3.1 3-tier architecture of web applications
3.2 Request-response flow in the MEAN stack
3.3 Relational database vs. MongoDB
3.4 Modelling relationships in MongoDB
4.1 Data model for Actinfo
4.2 Actinfo homepage, login menu and profile page
4.3 “admin” interface
4.4 “My studies” interface
4.5 “New study” form
4.6 Interface for a created study
4.7 Study group interface
4.8 Validation settings interface
4.9 Output interface (summary)
4.10 Output interface (detailed)
4.11 “Group statistics” page
5.1 SUS questionnaire
5.2 Subject information and validation details (example)
5.3 Demographics for the analyzed population
5.4 Distribution of compliance with PA recommendations
5.5 Distribution of compliance with physical activity recommendations (males vs. females)
5.6 Distribution of daily sedentary time for both studies
5.7 Distribution of daily sedentary time for both studies (males vs. females)
5.8 Distribution of daily time in MVPA both studies
5.9 Distribution of daily time in MVPA both studies (males vs. females)
B.1 File uploader interface
B.2 Custom cut point form
B.3 Custom cut point example
B.4 Commonly used bouts and breaks
B.5 Custom bout form
C.1 “Summary” sheet for exported Excel file
C.2 “Daily” sheet for exported Excel file
D.1 Consent form signed by participants of the D2FIT study (front)
D.1 Consent form signed by participants of the D2FIT study (back)
D.2 Consent form signed by participants of the CML study


Chapter 1

Introduction

The effect of physical activity (PA) on the health of individuals has been studied for decades and has motivated the development of more robust methods for exploring the PA-health relationship. Although the benefits of meeting adequate levels of physical activity are widely accepted (and are even common knowledge among the general population), interest in investigating the more intricate aspects of this relationship has by no means waned. In fact, research in the field of PA has progressed tremendously over the past few decades, making use of various emerging technologies and methods along the way. Specifically, current research often relies on accelerometers (usually integrated in an activity monitor, called an actigraph) for the objective measurement of physical activity. These devices emerged in the 1980s and 1990s and are now commercialized on a larger scale (Troiano et al., 2014). The availability and accuracy of such tools have made it easier to profile PA, not only at the level of the individual but also at the scale of a population, allowing the categorization of PA across a group of subjects. However, new tools also imply more data and, with that, the need to properly process, store and extract relevant information from that data. In research facilities in particular, there is a growing tendency to perform statistical analysis over larger datasets, originating from the aggregation of population samples from multiple PA studies. This results in a rising need for systems that centralize the vast amount of information generated, storing it under internationally recognized standards, while at the same time allowing the analysis of the stored data.

At the Exercise and Health Laboratory (EHLab, Faculty of Human Kinetics, University of Lisbon), research involving this type of data is part of everyday life. As stated in its mission, the lab strives to “(...) lead and innovate in research and dissemination of models, methods, and interventions to treat or prevent the unhealthy effects of sedentary behavior, to further understand the role of physical activity in health and disease (...)”¹. The constant need to generate, process and store data collected from activity monitors, in the form of actigraphy files, is therefore no surprise. The problem arises when that storage is scattered and unstandardized, hindering the researchers’ workflow when they try to reuse previously collected data in new studies. No system for centralizing the collected data is in place, nor is it straightforward to, for instance, compare results from different physical activity studies with one another. Additionally, there is a lack of readily available tools for analyzing the data, as the lab staff relies heavily on proprietary software to do so (currently, a very limited number of software licenses makes it difficult for more than two researchers to use it at the same time). Tackling these two problems, i.e., on the one hand, the absence of a system in which physical activity data is centralized and stored under international standards and, on the other hand, the heavy dependence on proprietary software, would greatly benefit research conducted at the EHLab, in terms of both workflow optimization and data integrity.

¹ EHLab: http://www.fmh.utl.pt/en/research/exercise-and-health

There is a clear need for a centralized information system, coupled with the need for a more accessible tool to handle and process the files. Additionally, there is an interest in improving on certain features of the software currently used to score and validate actigraphy data. This makes the EHLab a well-suited candidate for the development and implementation of a new platform capable of meeting these needs. This platform should store physical activity information under international information standards, ensuring interoperability with clinical data to the greatest extent possible. Furthermore, the platform should be equipped with tools for processing sets of actigraphy files and generating relevant statistical information from them.

1.1 Objectives

The main objectives of the work presented in this dissertation were as follows:

• Develop a new web information platform, Actinfo, for PA data, with the following requirements:
– Allow centralization of the currently scattered data from physical activity studies at the EHLab, integrating multiple studies under one system;
– Allow the visualization and export of results from individual studies, as well as the comparison between different physical activity studies in the platform;
– Offer tools for processing actigraphy files from studies uploaded to the platform, namely for validation of accelerometer wear time, computation of physical activity time indicators and generation of visualizations for relevant population statistics;
– Store data in the platform’s database following, as closely as possible, international standards for healthcare information, making it as interoperable with clinical data as possible;
– Process data in compliance with the EU General Data Protection Regulation (GDPR) and with Portugal’s Data Protection Authority (CNPD).

• Evaluate the platform’s current prototype in three steps:
– Assessment of the conformity of the platform’s features with requirements;
– Assessment of the platform’s usability using a standard usability questionnaire;
– Demonstration, in a comparative study of two distinct populations previously studied only independently of one another, of Actinfo’s ability to handle accelerometer data and produce relevant population statistics, namely the distributions of physical activity time indicators and compliance with recommendations for physical activity.

1.2 Methods

The tasks for developing the platform were the following:

1. Review of the literature on current research on physical activity and health, as well as on objectively measured physical activity through actigraphy;
2. Study of the workflow of the EHLab staff when conducting physical activity studies, in particular the life cycle of the accelerometry files after they have been collected and downloaded from the corresponding device, to understand how and where the generated information is currently stored;
3. Identification and characterization of the tools the platform should offer for processing the actigraphy files, extracting relevant information and analyzing the results of both individual studies and sets of studies;
4. Identification of the requirements for a platform suited to the researchers’ needs: research of standards for information storage and exchange, as well as of the steps needed to make it compliant with the GDPR and CNPD;
5. Selection of the state-of-the-art technologies to be employed in the development of Actinfo, namely the front-end and back-end interfaces, the database management system and the security protocols;
6. Development of Actinfo: creation of a framework, complete with an authentication system, and implementation of the previously defined features for data storage and processing;
7. Consultation of researchers at the EHLab regarding the implemented features: presentation of the prototype, creation of user accounts and implementation of the adjustments necessary to better meet the researchers’ needs;
8. Testing of the platform:
(a) Analysis of the conformity of the platform’s features with the requirements;
(b) Assessment of usability, via administration of a standard, anonymous questionnaire to Actinfo’s active users;
(c) Execution of a comparative study, to test the platform’s tools for comparison of physical activity studies, computation of physical activity time indicators and statistical analysis.


1.3 Contributions

Two main contributions resulted from the work conducted in this dissertation, which can be summarized as follows:

1. Actinfo, a platform for the centralization of PA studies and actigraphy data, storing them in a standardized manner, compliant with the EU’s GDPR and Portugal’s CNPD. Additionally, the platform allows users to easily access the studies’ contents and perform comparisons between different sets of data, and provides tools for processing accelerometry data.
2. A comparative study of the objectively measured PA profiles of two adult populations, conducted using Actinfo, drawing conclusions on differences in the distribution of various computed metrics for PA.

Regarding the first item, I developed a functional prototype, resulting in an integrated information platform which acts as a 2-in-1 system: on the one hand, a repository of PA studies, giving access to readily available information both from individual studies and from the comparison between different studies stored in the database; on the other hand, a tool for processing actigraphy data, equipped with features for a number of operations on accelerometer files. Tailored to suit the needs of researchers working on PA, not only at the EHLab but also at other faculties conducting research in this area, Actinfo benefited from constant feedback during conceptualization and implementation, which made it compliant with current research standards. As for the study conducted using Actinfo, it not only allowed the comparison of the distribution of PA time indicators between the two adult populations, but also validated the platform as a tool for storing, processing and generating relevant statistical information from actigraphy files and PA studies.

1.4 Thesis Outline

The remainder of the dissertation is structured as follows: Chapter 2 provides crucial definitions regarding physical activity, explores the PA-health relationship and elaborates on the role of actigraphy in objectively measured PA, explaining the state-of-the-art technologies in this field; Chapter 3 explores the technology used to build Actinfo, specifically web development technologies and standards for information exchange; Chapter 4 describes the platform in detail, focusing on its requirements, architecture, data model and the various interfaces and features for storing, processing and visualizing relevant data extracted from actigraphy files; Chapter 5 details the various steps of the assessment phase, starting with a conformity analysis, moving on to the usability of the platform and ending with the results of the proof-of-concept comparative study conducted to test Actinfo’s tools for processing PA data. Finally, Chapter 6 summarizes the most significant conclusions and achievements of this dissertation, while also reflecting on future work to improve the platform.


Chapter 2

Background on physical activity

The benefits of physical activity (PA) for overall health are clear in this day and age, and are by no means a recent discovery. In fact, the relationship between PA and health has been a subject of interest for humans since ancient times. Over the past century, this connection became a thoroughly researched topic, mostly due to the proven effectiveness of habitual PA in preventing a myriad of chronic diseases, such as certain types of cancer, cardiovascular disease, obesity and depression (Warburton et al., 2006). Recently, tools have emerged that allow for the objective measurement of PA, specifically hip-worn accelerometers (Troiano et al., 2014). It is therefore no surprise that more and more researchers have taken an interest in using such technologies to quantify the PA of individuals. Consequently, the data collected in such studies needs to be stored and organized, leading to the creation of systems to support the gathering of information. This chapter contains crucial definitions and relevant aspects of the impact of PA on health, as well as the importance of actigraphy data, emphasizing the major advancements in the area. Furthermore, existing technologies for the gathering and storage of this type of health data are also explored.

2.1 Physical activity and health

First and foremost, the term “physical activity” is not to be confused with the terms “physical exercise” or “physical fitness”. Although often used interchangeably, these expressions have distinct definitions. The World Health Organization (WHO) describes PA as “(...) any bodily movement produced by skeletal muscles that requires energy expenditure” (World Health Organization, 2017). The same entity goes on to define the term “exercise” as “(...) a subcategory of PA that is planned, structured, repetitive, and purposeful in the sense that the improvement or maintenance of one or more components of physical fitness is the objective” (World Health Organization, 2017). The previous definitions are based on Caspersen et al. (1985), who additionally consider “physical fitness” to be “(...) a set of acquired (genetic) or developed (training) attributes related to the ability to perform PA”. This section will focus on various aspects of PA which, simply put, is a behaviour resulting in energy expenditure above resting levels (Hills et al., 2014).


2.1.1 Physical activity: role in health

Hypertension, coronary heart disease, type II diabetes, stroke, colon and breast cancer and depression are some of the most common noncommunicable diseases (NCDs). Globally, it is estimated that NCDs account for 63% of all deaths, corresponding to 36 million people dying annually from these diseases (World Health Organization, 2013). Risk factors for NCDs include an unhealthy diet, smoking, overweight, high blood pressure and cholesterol, obesity and physical inactivity. Five of these are related to PA, whose absence is a major independent and modifiable risk factor for NCDs. PA is known to reduce blood pressure, improve HDL cholesterol and the control of blood sugar levels, and reduce the risk of developing colon cancer, breast cancer (in women) and prostate cancer (in men) (European Union, 2008). It is fundamental for controlling body weight and energy balance, which in turn provides additional benefits in preventing obesity. Regarding the musculoskeletal system, PA plays a role in preserving or potentiating bone mineralization and in maintaining and improving muscular strength and endurance (World Health Organization, 2004). Additionally, PA has an important effect in preserving cognitive function, decreases the risk of depression and dementia, decreases stress, improves sleep quality and improves self-esteem, which in turn decreases absenteeism. In older adults, PA is associated with a decrease in the risk of falls and in functional limitations, as well as with a delay (and even prevention) of chronic diseases associated with aging (European Union, 2008; World Health Organization, 2004).

Physical inactivity is one of the leading risk factors for global mortality (World Health Organization, 2010). When addressing insufficient PA, it is important to distinguish it from the term “sedentary behaviour”. Despite some inconsistencies in its definition (Yates et al., 2011; Pate et al., 2008), the latter term can be described from an energy expenditure point of view, as explained later in Section 2.1.2, while “physical inactivity” simply translates into not meeting recommended amounts of PA (Gonzalez et al., 2017). Nevertheless, both represent a threat to population health, and both translate into low levels of PA. Attaining sufficient levels of PA has become an increasingly difficult task. It is estimated that one in five people is not physically active enough, aggravating the global situation with regard to the increasing number of people suffering from chronic diseases (Gonzalez et al., 2017). On top of that, the effects of physical inactivity on a country’s economy are not to be ignored (Janssen, 2012); reducing the prevalence of physical inactivity has a significant impact on reducing healthcare costs (Cadilhac et al., 2011).

Regarding sedentary behaviour, i,e., daytime activities performed in the sitting or reclining position,

research has shown the various adverse outcomes of high levels of sedentary time (ST) on both the risk

of disease and in the risk of death (Brocklebank et al., 2015; Ekelund et al., 2016). As ST occupies the

better part of an individual’s day (Baptista et al., 2012; Clark et al., 2011), its effects on health are worth

investigating, especially since it is possible for a person to meet the minimum recommended levels of

PA but still spend most of their time in sedentary behaviour (Tremblay et al., 2017). In fact, meeting

the recommended amounts of PA does not attenuate the risks associated with high ST. The effects

of sedentary behavior may possibly be minimized if the person accumulates a weekly PA of moderate

or higher intensity of at least four times that which is recommended, i.e. approximately 600 min/week

(Ekelund et al., 2016). Aside from studying the total ST per day for an individual, research has also

6

focused on the patterns of accumulation of ST, i.e., bouts of sedentary activity and breaks (interruptions

in ST). Typically, in the context of PA research, when using the term bout, one refers to ST, however, a

bout can correspond to a time period in which a subject’s level of PA is equal to, or greater than, some

specific intensity, during a given time frame (Barrett et al., 2017). For the purposes of this dissertation,

when referring to the term ”bout”, I will be addressing sedentary time, with ”breaks” being interruptions

in ST, unless otherwise specified.
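As an illustration of these definitions, the following Python sketch extracts bouts and counts breaks from a sequence of per-epoch flags. It is a minimal sketch and not the method of any of the cited studies: the function names are hypothetical, each epoch is assumed to be already labelled as sedentary or not, and no minimum bout duration or tolerance for brief interruptions is applied.

```python
from itertools import groupby
from typing import List, Tuple


def sedentary_bouts(is_sedentary: List[bool]) -> List[Tuple[int, int]]:
    """Return (start_index, length_in_epochs) for each run of sedentary epochs."""
    bouts = []
    index = 0
    for sedentary, run in groupby(is_sedentary):
        length = len(list(run))
        if sedentary:
            bouts.append((index, length))
        index += length
    return bouts


def count_breaks(is_sedentary: List[bool]) -> int:
    """Count transitions from a sedentary epoch to a non-sedentary one."""
    pairs = zip(is_sedentary, is_sedentary[1:])
    return sum(1 for previous, current in pairs if previous and not current)


# Hypothetical sequence of nine epochs (True = sedentary).
epochs = [True, True, False, True, True, True, False, False, True]
print(sedentary_bouts(epochs))  # [(0, 2), (3, 3), (8, 1)]
print(count_breaks(epochs))     # 2
```

Under this simplified definition, a bout that is still ongoing at the end of the recording is not followed by a break, which is why three bouts but only two breaks are reported.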

Various studies have been conducted to investigate the connection between bouts and breaks in ST

and subjects’ physiology and health. Chastin et al. (2015) have demonstrated the positive effect that

interrupting sedentary behaviour has in controlling adiposity and blood sugar levels. In fact, in patients

suffering from certain NCDs, such as type II diabetes, breaking up ST has been shown to be a useful

method to mitigate the negative effects of sedentary behaviours (Sardinha et al., 2017). Prolonged bouts

of ST, on the other hand, have been shown to be associated with obesity (Judice et al., 2015). Sardinha

et al. (2015) showed improved physical function from breaking up ST. Recently, Santos et al. (2018)

have investigated how patterns of ST accumulation change throughout the lifespan, concluding, against

previous expectations, that longer bouts of ST are less common in adulthood than in late adolescence,

highlighting the possible existence of crucial periods in which ST increases, namely in adolescence and

the transition from adulthood into old age. This work emphasizes the importance of analyzing not only

total ST, but also how it is distributed across the day.

Current and past research shows the importance of adequate levels of PA and avoidance of pro-

longed periods of sedentary behaviour. Nevertheless, PA of any duration is better than none, with

various benefits for health (Saint-Maurice et al., 2018).

2.1.2 Classifying physical activity

The question ”how much PA is enough?” is the next logical inquiry to arise when trying to relate PA and

health. Such a question can only be answered by first quantifying PA.

PA can be quantified through a variety of approaches. One approach is its quantification through

energy expenditure, usually expressed in metabolic equivalent (MET) or kcal. The World Health Or-

ganization (2014) defines MET as ”(...) the ratio of a person’s working metabolic rate relative to their

resting metabolic rate”. It can be interpreted as the intensity of a specific task relative to the resting

metabolism. As such, one can define one MET as the energy cost of sitting quietly, which in quantitative

terms translates into a consumption of 1 kcal/kg/hour. It is important to note that, depending on fitness

levels, the subjective perception of effort or the heart rate may vary during the execution of a particular

task, amongst individuals. Hence the importance of using an absolute, task-independent method of

categorizing PA, such as expressing intensity in terms of METs.

On an absolute scale, using METs as a reference, the World Health Organization (2010) splits PA

across two major levels:

1. Moderate PA, with an energy expenditure between 3 and 6 METs;

2. Vigorous PA, corresponding to an activity being performed at an intensity greater than 6 METs.

For the purposes of the work developed in this dissertation, however, two added categories must be

considered:

3. Light PA, with an energy expenditure between 1.6 and 2.9 METs (Kim et al., 2013);

4. Sedentary behaviour, for activities with a cost ≤ 1.5 METs (Sedentary Behaviour Research

Network, 2012).

These four categories serve as a basis for grouping periods of PA or ST, for a specific individual, in a

given time frame.
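The four categories can be expressed as a simple threshold function. The Python sketch below is only an illustration under stated assumptions: the function name is hypothetical, and the handling of boundary values (for instance, values between 1.5 and 1.6 METs or between 2.9 and 3 METs, which the ranges above leave unspecified) is a choice made here, not one taken from the cited sources.

```python
def classify_met(met: float) -> str:
    """Map an energy expenditure in METs to one of the four categories.

    Assumed boundary handling: values up to 1.5 are sedentary, values
    below 3 are light, values up to 6 are moderate, anything above 6 is
    vigorous.
    """
    if met <= 1.5:
        return "sedentary"
    if met < 3.0:
        return "light"
    if met <= 6.0:
        return "moderate"
    return "vigorous"


for value in (1.0, 2.0, 4.5, 8.0):
    print(value, classify_met(value))
```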

2.1.3 WHO’s recommendations on physical activity

The World Health Organization reports that, in 2010, an estimated 23% of the global population (20%

men and 27% women) was not active enough¹. Sedentary behaviours both at home and in the

workplace, low activity during leisure and the choice of more passive means of transportation are the main

causes of physical inactivity. With the main goal of preventing NCDs through an increase in PA, the

WHO published the ”Global Recommendations on Physical Activity for Health” (World Health

Organization, 2010), a document detailing not only the recommended amounts of PA (in both duration

and intensity) for different age groups, but also various policies to meet those recommended levels, globally.

These range from measures for the nationwide implementation of guidelines to enhance PA, to the mon-

itoring of those implemented measures, so as to assure the promotion and maintenance of adequate

PA.

The guidelines state the recommended levels of PA should be met through accumulation during the

day or week, meaning a distribution of the recommended time across various activities during the given

time period. Additionally, for inactive individuals, the increase in PA should be gradual, improving on

frequency, duration and intensity over time.

Physical activity recommendations for children and youth aged 5 to 17 years old

A daily accumulation of a minimum of 60 minutes of moderate- to vigorous-intensity PA (MVPA, i.e., PA

that is at least of moderate intensity) is recommended for this age group. The WHO goes on to specify

that these amounts of PA should mostly come from aerobic activities (i.e., using oxygen as the main

source of energy, making the most use of aerobic metabolism, such as walking, running, swimming or

cycling), and that PA of vigorous intensity should be performed at least three times per week.

Examples of activities suited for children include planned physical exercise, sports, games, trans-

portation and physical education, inserted in the context of school, family or community activities.

¹ Data retrieved from https://www.who.int/news-room/fact-sheets/detail/physical-activity

Physical activity recommendations for adults aged 18 to 64 years old

Adults in this age range are recommended to accumulate a minimum of 150 minutes of

moderate PA per week or a minimum of 75 minutes of vigorous PA accumulated during the week.

Alternatively, a combination of both moderate and vigorous PA can be used to achieve these recommendations.

Furthermore, the WHO recommends aerobic PA to be performed in periods lasting at least 10 min-

utes. Added health benefits are possible through an increase in moderate or vigorous PA to double the

recommended amount per week, i.e., an accumulated 300 minutes for moderate PA or 150 minutes for

vigorous PA (or a combination of the two). Finally, engaging in activities for increasing muscle strength

is advised, twice per week, targeting the major muscle groups.
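The aerobic part of these recommendations can be checked with a short calculation. The Python sketch below assumes that one minute of vigorous PA counts as two minutes of moderate PA, an equivalence implied by the 150/75 minute targets but not stated explicitly above. The function names are hypothetical, and neither the 10-minute period requirement nor the strength training recommendation is considered.

```python
def moderate_equivalent_minutes(moderate_min: float, vigorous_min: float) -> float:
    """Weekly PA expressed in moderate-equivalent minutes (assumed 1:2 weighting)."""
    return moderate_min + 2.0 * vigorous_min


def meets_adult_aerobic_minimum(
    moderate_min: float, vigorous_min: float, target_min: float = 150.0
) -> bool:
    """True if the weekly total reaches the target (150 by default, 300 for added benefits)."""
    return moderate_equivalent_minutes(moderate_min, vigorous_min) >= target_min


print(meets_adult_aerobic_minimum(90, 30))                    # True: 90 + 2 * 30 = 150
print(meets_adult_aerobic_minimum(60, 40))                    # False: 60 + 2 * 40 = 140
print(meets_adult_aerobic_minimum(200, 60, target_min=300))   # True: 200 + 2 * 60 = 320
```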

PA in adults can be in the form of work, transportation, chores, leisure, sports or planned physical

exercise, under the daily, family or community environments.

Physical activity recommendations for adults aged 65 years old and above

In later stages of adulthood, the WHO recommends similar amounts of accumulated PA throughout the

week to those of the previous age group: either at least 150 minutes of moderate intensity PA or 75

minutes of vigorous PA (or a combination of both), with added benefits from doubling the duration and

with the recommendation of aerobic activity to be performed in periods no shorter than 10 minutes.

Strength training should also be included, with a frequency of two days per week.

If health conditions prevent compliance with these recommendations, adults should be as physically

active as their abilities and conditions allow.

In older adults, PA can be included in transportation, chores, work (when the person is still active in

that sense), leisure, sports or planned physical exercise, under the daily, family or community environ-

ments.

For all the aforementioned age groups, meeting the described recommendations brings benefits that

outweigh any potential drawbacks. In children and adolescents, an improvement in both muscular and

cardiorespiratory fitness is to be expected when aiming for the described targets; in adults aged 18 to 64

years old, engaging in the weekly recommended PA amounts also contributes greatly to bone health and

reduces the risk of NCDs; in older adults, there’s an additional benefit relating to maintaining cognitive

function and functional health.

2.2 Objectively measured physical activity: actigraphy

When trying to accurately measure PA, there’s an obvious interest in minimizing estimation errors as best

as possible, using appropriate methods to do so and avoiding subjective assessments of PA intensity.

Although self-report methods exist to measure PA, for instance through interviews, self-administered

surveys, questionnaires, diaries or a combination of these methods, questions arise regarding the va-

lidity and accuracy of such measures in estimating PA intensity, volume, bouts and breaks (Helmerhorst

et al., 2012; Ainsworth et al., 2012; Loney et al., 2011; Monyeki et al., 2018). To overcome these lim-

itations, the use of activity monitors has become standard for providing an objective measurement of

PA.

Actigraphy, a non-invasive monitoring method for human activity, is typically used for objectively

measuring PA. The devices used for this type of assessment, actigraphs, are small, portable watch-like

units, which are worn by participants of a given study for assessing PA, during a given period of time

(for example, five to seven consecutive days) and record the wake-time activity of the subject. Data

is then extracted from the devices, as most contain a USB port for this purpose. In PA research,

these devices are worn on the hip and the main sensor for registering activity is a built-in accelerometer.

Depending on the manufacturer and model, different devices may contain different sensors, such as light

and temperature sensors. In the context of this dissertation, as accelerometry is the sole method by

which PA was assessed, data from that sensor contains the most important information extracted from

the devices. Not exclusive to PA research, actigraphs are also employed in sleep research, being worn

on participants’ wrists during sleep time (Ibanez et al., 2018).

The use of actigraphs in PA research has been validated as a reliable, objective method for quan-

tifying activity (Plasqui et al., 2013). Among the various manufacturers for these devices, one stands

above the rest as having the most widely used and validated devices: Actigraph Corp (Actigraph, LLC;

Ft. Walton Beach, FL²). Out of all Actigraph Corp.’s models of activity monitors, two main devices are

used in the research conducted at the EHLab: Actigraph model GT1M³ and Actigraph model wGT3X+⁴

(Figure 2.1 (a) and (b), respectively). The main difference between these two devices is the fact that the

former only allows for uniaxial accelerometer data collection, while the latter allows triaxial accelerome-

ter data recording. Kaminsky and Ozemek (2012) assessed both models of accelerometers’ recordings

in uniaxial mode and found them to be comparable. To ensure comparability between findings using

different devices, researchers at the EHLab typically focus on data from only one axis and, as such, the

work developed in this dissertation follows the same method when making use of accelerometer data.

Actigraph’s accelerometers have been thoroughly validated since their release. One study in particu-

lar deserves special attention regarding the validation of these devices in an experimental setting. Using

data from the United States’ 2003-2004 National Health and Nutrition Examination Survey (NHANES)⁵,

Troiano et al. (2008) characterized a representative sample of the United States population (n=6329)

² Actigraph Corp: https://www.actigraphcorp.com/

³ Actigraph Model GT1M: https://www.actigraphcorp.com/support/activity-monitors/gt1m/

⁴ Actigraph model wGT3X+: https://actigraphcorp.com/support/activity-monitors/wgt3xplus/

⁵ NHANES: https://www.cdc.gov/nchs/nhanes/index.htm

Figure 2.1: Actigraph model GT1M (a) and wGT3X+ (b).

in terms of their levels of PA, integrating accelerometer data measured using an Actigraph model 7164

from children (6 to 11 years), adolescents (12 to 19 years), and adults (older than 20 years). Since then,

Actigraph Corp. have released new, improved devices, and studies have been conducted which com-

pare newer devices with the ones used in the aforementioned study (Cain et al., 2013). Nevertheless,

the settings for accelerometer initialization and data processing originally described by Troiano et al.

(2008) are still employed in current research, in particular in studies conducted by the EHLab.

Actigraph Corp.’s devices data: collection, conversion and processing

Since the work developed in this dissertation greatly revolves around processing data from Actigraph

Corp.’s accelerometers, it is important to address how that data is collected in the devices. Although

equivalent in terms of accuracy (Robusto and Trost, 2012), GT1M and wGT3X+ devices can produce

different files upon download, as they record accelerometer data in distinct manners. As per Actigraph’s

documentation (Actigraph Software Department, 2012), in the older GT1M devices, data is sampled at a

fixed 30 Hz, to be then filtered and accumulated into epochs of user-determined size (for example, 15 s,

30 s or 60 s). This process occurs in the device, with data being processed in relatively small chunks.

Data can then be downloaded using Actigraph Corp.’s proprietary software, Actilife, producing an epoch-

level file (containing, amongst other information, accelerometer data), an *.agd file. The newer wGT3X+

devices, however, sample data at a user defined frequency, ranging from 30 Hz to 100 Hz, with every

sample being stored in the device without accumulation. Downloading data from the devices produces

a raw, *.gt3x data file. This file is then filtered and accumulated into epochs through Actilife, creating the

*.agd file. Because processing occurs only after download, users can create different epoch-level files

from a single *.gt3x file. Figure 2.2 illustrates the differences between both devices.

Typical workflow at EHLab produces files at the already filtered and sampled *.agd level, which are

then used to extract all the necessary activity data, through Actilife. Nevertheless, researchers also

keep the raw *.gt3x files, when data collection is performed using wGT3X+ devices, so as to allow for future

analysis at different epoch levels. In their daily work, however, researchers operate at the filtered and processed *.agd file

level, relying on Actilife for extracting information from the accelerometer files. The software is needed

Figure 2.2: Differences in the processing of accelerometer data between older devices (a) and more recent models (b). Adapted from Actilife 6 User’s Manual (Actigraph Software Department, 2012).

Figure 2.3: *.agd file schema. From Actilife 6 User’s Manual (Actigraph Software Department, 2012).

for converting the raw *.gt3x files into *.agd files. This is due to the fact that the *.gt3x files are binary

files, their format being proprietary and belonging to specific copyright protected software, in this case,

Actilife. While this is true for the *.gt3x files, the *.agd files can be more easily accessed with the right

tools, without relying on Actilife, as described in the next section.

*.agd file format

The *.agd files, on which all operations on actigraphy data are performed, are in the open SQLite

format⁶, with the schema represented in Figure 2.3.

SQLite is a C-language library and a widely used database engine. As such, queries can be run

against *.agd files to operate on their contents, in a completely independent manner from Actilife. It

⁶ SQLite: www.sqlite.org

is, therefore, easy to develop code to process information stored in these files. As observable in the

file schema presented, files may contain more than just accelerometer data. Here, however, we’re in-

terested in the ”data” table and, in particular, in the columns ”dataTimestamp” and ”axis1” (the vertical

axis and the one researchers at the EHLab often use to measure activity). This is the time series for

the recorded accelerometer data, accumulated in user defined epochs. Each row of the ”axis1” column

contains a numeric value of ”counts”, i.e., Actigraph Corp.’s units of measurement of activity. According

to the company’s documentation, counts are obtained by adding post-filtered accelerometer data into

epoch-sized chunks. Count values vary according to the frequency and intensity of the raw acceleration.

Counts are produced by a proprietary filter, reserved to Actigraph Corp⁷.
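To illustrate how the time series can be read without Actilife, the following Python sketch queries the ”data” table using the standard sqlite3 module. The table and column names are those mentioned above; everything else is an assumption made for illustration. In particular, the function names are hypothetical, the timestamps are assumed to be stored as .NET ticks (100 ns units counted from 0001-01-01), which should be verified against real files, and an in-memory database with only two columns stands in for an actual *.agd file.

```python
import sqlite3
from datetime import datetime, timedelta

# Assumption: dataTimestamp holds .NET ticks (100 ns units since 0001-01-01).
TICKS_PER_SECOND = 10_000_000
TICK_ORIGIN = datetime(1, 1, 1)


def ticks_to_datetime(ticks: int) -> datetime:
    """Convert a tick count to a datetime (assumed encoding, see above)."""
    return TICK_ORIGIN + timedelta(microseconds=ticks // 10)


def datetime_to_ticks(moment: datetime) -> int:
    """Inverse of ticks_to_datetime, used here only to build example rows."""
    delta = moment - TICK_ORIGIN
    seconds = delta.days * 86_400 + delta.seconds
    return seconds * TICKS_PER_SECOND + delta.microseconds * 10


def read_axis1(connection: sqlite3.Connection):
    """Return (timestamp, counts) pairs from the "data" table, ordered by time."""
    query = "SELECT dataTimestamp, axis1 FROM data ORDER BY dataTimestamp"
    return [(ticks_to_datetime(ticks), counts) for ticks, counts in connection.execute(query)]


# In-memory stand-in for an *.agd file, holding only the two columns of interest.
connection = sqlite3.connect(":memory:")
connection.execute("CREATE TABLE data (dataTimestamp INTEGER, axis1 INTEGER)")
start = datetime(2019, 1, 1, 8, 0, 0)
for minute, counts in enumerate([0, 120, 2500]):
    ticks = datetime_to_ticks(start + timedelta(minutes=minute))
    connection.execute("INSERT INTO data VALUES (?, ?)", (ticks, counts))

rows = read_axis1(connection)
for timestamp, counts in rows:
    print(timestamp.isoformat(), counts)
```

For a real file, the in-memory connection would be replaced by a connection to the file itself, for example `sqlite3.connect("subject01.agd")`, where the file name is hypothetical.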

Having established how activity is measured in Actigraph Corp.’s accelerometers, it is now necessary

to understand how different values of counts translate into different levels of PA intensity. As a result of

research conducted over the years using Actigraph Corp.’s devices, several cut point sets are implemented

in Actilife in order to map counts to sedentary activity, light PA, moderate PA or vigorous PA. These

sets were originally defined for 60s epoch files (their unit being, therefore, counts per minute, CPM)

and are linearly scaled for files for which epochs are under 60s. Out of the 13 cut point sets currently

implemented in the software⁸, we will focus on two: the ”Evenson Children” and the ”Troiano” cut point

sets. In research conducted by the EHLab, these cut point sets are used, respectively, for children and

youth aged 17 or younger and for adults aged 18 or older.

The ”Evenson Children” cut point set was based on Evenson et al. (2008), who determined threshold

values for the intensity of physical activities in children, using Actigraph’s accelerometers. The resulting

cut point set, currently implemented in Actilife is as follows:

• Sedentary activity, for values in the range 0 to 100 CPM;

• Light PA, for counts between 101 and 2295 CPM;

• Moderate PA, for values between 2296 and 4011 CPM;

• Vigorous PA, when activity counts are higher than 4011 CPM.

As for the ”Troiano” cut point set, it was derived from previously cited research (Troiano et al., 2008).

Although the study mentions thresholds for various age groups, the cut point set implemented in Actilife

applies solely to adults (18 or older). The thresholds are the following:

• Sedentary activity, for values in the range 0 to 99 CPM;

• Light PA, for counts between 100 and 2019 CPM;

• Moderate PA, for values between 2020 and 5998 CPM;

• Vigorous PA, when activity counts are higher than 5998 CPM.
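The two cut point sets, together with the linear scaling for epochs shorter than 60 s, can be sketched in Python as follows. The threshold values are the ones listed above; the dictionary keys and function name are hypothetical, and the way scaled thresholds are compared (without rounding) is an assumption that may differ from Actilife’s own implementation.

```python
from typing import Dict, Tuple

# Inclusive upper bounds, in counts per minute, of the sedentary, light and
# moderate ranges listed above; anything higher is vigorous.
CUT_POINTS: Dict[str, Tuple[int, int, int]] = {
    "evenson_children": (100, 2295, 4011),
    "troiano_adult": (99, 2019, 5998),
}
LEVELS = ("sedentary", "light", "moderate")


def classify_counts(
    counts: float, epoch_seconds: int = 60, cut_point_set: str = "troiano_adult"
) -> str:
    """Classify the counts of one epoch, scaling the CPM thresholds linearly."""
    scale = epoch_seconds / 60.0
    for level, upper_bound in zip(LEVELS, CUT_POINTS[cut_point_set]):
        if counts <= upper_bound * scale:
            return level
    return "vigorous"


print(classify_counts(100))                                     # light
print(classify_counts(100, cut_point_set="evenson_children"))   # sedentary
print(classify_counts(600, epoch_seconds=15))                   # moderate
```

In the last call, the adult thresholds are scaled by 15/60, so 600 counts in a 15 s epoch falls above the scaled light bound (504.75) and below the scaled moderate bound (1499.5).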

⁷ Actigraph Corp.’s definition of counts: https://actigraphcorp.force.com/support/s/article/What-are-counts

⁸ Different cut point sets in Actilife: https://actigraphcorp.force.com/support/s/article/What-s-the-difference-among-the-Cut-Points-available-in-ActiLife

These cut point sets will serve as a basis for part of the work developed in this dissertation. Specif-

ically, the thresholds will guide the implementation of features for processing *.agd files, allowing the

distinction between the various levels of intensity of PA in the data contained in the actigraphy files.

Other tools for processing and storing accelerometer data

Apart from Actilife, another product from Actigraph Corp. deserves attention in the context of the work

developed in this dissertation. CentrePoint⁹ is Actigraph Corp.’s cloud-based system for managing

and analyzing accelerometer data. It replicates Actilife’s functionalities in a web platform, without the

need for installation of a heavy software package. A solution of this type could be employed to tackle the

problem presented here of centralizing accelerometer data from various studies conducted at the EHLab;

however, the problem of it being a commercial product would still apply. Moreover, the fact that certain

Actilife features do not directly satisfy the needs of the researchers indicates that a more personalized,

simple, free to use and easily deployable solution would be ideal. Specifically, some tools (such as

detection of bouts and breaks in ST) are not implemented in a way that yields the exact results

researchers want to extract from the data.

Regarding existing databases for accelerometry data, there are two that deserve focus, with data

which can be explored using solutions which replicate Actilife’s functionalities. In Portugal, the National

Observatory for Physical Activity and Sports (Observatório Nacional da Actividade Física e do Desporto,

ONAFD)¹⁰ aims to, amongst other objectives, monitor PA in the Portuguese population, via analysis of

data collected using Actigraph’s accelerometers. The work conducted by researchers has the ultimate

goal of promoting PA and health in Portugal. As such, great benefits can arise from having a readily

available tool to work on the collected data, without being so dependent on proprietary software. Regarding

international databases for accelerometry, ICAD, the International Children’s Accelerometry Database¹¹,

contains, as the name indicates, accelerometry data from children aged 3 to 18 years from various

countries, totaling variables from over 37000 subjects. ICAD’s contents have resulted in sound research

over the years, some of which in collaboration with the EHLab (Tarp et al., 2018; Hansen et al., 2018;

Tarp et al., 2018; Kuzik et al., 2017). Once again, accelerometer files used in this database are similar to

those which can be processed via Actilife, which means they can also be handled with a tool capable

of the same operations.

2.3 Chapter overview

Great benefits arise from attaining adequate levels of PA, such as those described by the WHO. PA plays a

crucial role in the prevention of chronic diseases, such as diabetes, cardiovascular diseases and certain

types of cancer. In contrast, physical inactivity constitutes a major factor in global mortality.

⁹ CentrePoint: https://www.actigraphcorp.com/centrepoint/

¹⁰ ONAFD, physical activity: http://observatorio.idesporto.pt/Conteudos.aspx?id=3

¹¹ ICAD: http://www.mrc-epid.cam.ac.uk/research/studies/icad/

Sedentary behaviour has proved to bring numerous adverse health effects. As important as the study

of total ST over a person’s day, patterns of accumulation of ST (bouts of sedentary activity and breaks

in ST) have emerged as a research topic over the last decade, with studies mainly concluding on the importance of

breaking up ST.

PA can be categorized, according to the intensity of the task performed, in four levels: sedentary

activity, light PA, moderate PA and vigorous PA. According to these levels, the WHO recommends mini-

mum amounts of MVPA to be achieved by individuals during a day’s or week’s time. These values vary

according to the age group of the individual.

Objectively measured PA via accelerometry, by using activity monitors (actigraphs) has emerged as

a reliable, effective method for assessing PA. Out of the various commercialized solutions, Actigraph

Corp.’s devices are the most widely used and validated. Activity data from these devices is currently

processed using proprietary software; however, the file format used for actigraphy files, *.agd, is an

open one, which makes the files easy to access with the right tools. Large databases for accelerometer

data exist, which contain files in this very format, and are ready to be explored by researchers.

Current research is being somewhat hindered by relying too heavily on proprietary software for

processing accelerometer data. A free solution that is easy to use and deploy would be of great benefit to researchers

working in the field of PA evaluation through accelerometry. Additionally, the centralization of PA infor-

mation allows for improved workflow.

Chapter 3

Supporting technology

As a web platform, Actinfo was developed using various technologies. This chapter describes the pro-

gramming languages, architecture and standards used to develop the platform, detailing the approach

followed to build Actinfo and model data in the platform. Furthermore, security and authentication mea-

sures are addressed.

3.1 Web application architecture

Web applications are computer programs which, as opposed to desktop applications, run on a com-

puting server and are accessed via a web browser. These applications follow the client-server model

structure, a computing model in which communication between the provider of a service (server) and

the requesters of that service (clients) occurs over a network. Advantages of web applications over

desktop ones include:

• Rapid deployment, without the need to download or install additional software apart from an inter-

net browser;

• Broad compatibility across different platforms, such as smartphones and tablets;

• Ease of development, due to the vast amount of resources and open source technologies available.

Over the last decade, web applications have witnessed a large increase in their capabilities, as more

and more tools are being added to web browsers. Specifically, JavaScript and HTML, the most widely

used languages in web development¹, have experienced phenomenal gains in terms of

performance, making it possible to develop web applications comparable to desktop ones. Although

initially conceived to allow for the execution of client-side scripts, incorporated in browsers, we now find

JavaScript implemented in server-side software. In fact, frameworks and libraries have been developed

to extend Vanilla JavaScript (i.e., plain JavaScript, without any additions), further increasing the number

of different features which can be developed using this programming language.

¹ Data from Stack Overflow’s 2019 Developer Survey, available at: https://insights.stackoverflow.com/survey/2019

Figure 3.1: 3-tier architecture of web applications.

Architecture of a web application

Applications of this nature are organized in tiers, each with specific roles. Although the number of tiers

in web applications can vary depending on the type of technologies used, the 3-tier architecture is the

most commonly used (and the one followed by Actinfo). In this structure, represented in Figure 3.1,

three logical modules comprise the web application:

• A presentation tier, accessible through a web browser, which acts as the client, and through which

information is presented via a graphical interface to end users;

• An application tier (or application server), containing the core logic which drives the application;

• A data tier (also referred to as the database server) for handling database functions.

To understand how communication between the three tiers occurs, one must refer to the term API.

Short for Application Programming Interface, it consists of a set of rules and methods acting as a

communication medium between tiers. Amongst the various existing types of API, REST APIs are the most

popular. REST is an acronym for Representational State Transfer and it corresponds to an architectural

style originally described by Fielding (2000). Web services using REST (i.e., RESTful web services)

make use of HTTP methods (GET, POST, PUT and DELETE being the most relevant in this context, for

getting, sending, updating or deleting contents, respectively (Fielding and Reschke, 2014)) to operate

on resources. In the World Wide Web, a resource is an item of interest which is identifiable by Uniform

Resource Identifiers (URI)². The flow of information in an application using REST APIs can be simplified

as follows: an agent (be it a person or software) makes a request to a specific resource, identifiable

by a URI, via a URL (Uniform Resource Locator, a specific type of URI). This request (which, since

the application makes use of the HTTP protocol, can be of the type GET, POST, PUT or DELETE, in-

dicating the action to be performed on the resource) will generate a response, in a specific format. In

the context of this dissertation, responses use a JavaScript-based format known as JSON (JavaScript

Object Notation³). The contents of the response indicate whether the method used in the request produced

the desired effect on the resource.
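The request-response flow described above can be illustrated with a small, network-free Python sketch, in which a single function stands in for the application tier and a dictionary stands in for the data tier. It is not Actinfo’s API: the class name, the ”/subjects” URIs, the JSON fields and the status codes chosen for each case are all assumptions made for illustration.

```python
import json
from typing import Dict, Tuple


class SubjectApi:
    """Toy RESTful interface over an in-memory collection of subjects."""

    def __init__(self) -> None:
        self._subjects: Dict[int, dict] = {}
        self._next_id = 1

    def handle(self, method: str, uri: str, body: str = "{}") -> Tuple[int, str]:
        """Apply an HTTP method to a resource and return (status, JSON payload)."""
        if method == "POST" and uri == "/subjects":
            subject_id = self._next_id
            self._next_id += 1
            self._subjects[subject_id] = json.loads(body)
            return 201, json.dumps({"uri": f"/subjects/{subject_id}"})

        prefix = "/subjects/"
        identifier = uri[len(prefix):] if uri.startswith(prefix) else ""
        if not identifier.isdigit() or int(identifier) not in self._subjects:
            return 404, json.dumps({"error": "resource not found"})

        subject_id = int(identifier)
        if method == "GET":
            return 200, json.dumps(self._subjects[subject_id])
        if method == "PUT":
            self._subjects[subject_id] = json.loads(body)
            return 200, json.dumps(self._subjects[subject_id])
        if method == "DELETE":
            del self._subjects[subject_id]
            return 204, ""
        return 405, json.dumps({"error": "method not allowed"})


api = SubjectApi()
print(api.handle("POST", "/subjects", json.dumps({"age": 42})))
print(api.handle("GET", "/subjects/1"))
print(api.handle("PUT", "/subjects/1", json.dumps({"age": 43})))
print(api.handle("DELETE", "/subjects/1"))
print(api.handle("GET", "/subjects/1"))
```

Each call returns a status code and a JSON payload, so the caller can tell from the response whether the requested action took effect on the resource.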

² Definition by the World Wide Web Consortium: https://www.w3.org/TR/2003/WD-webarch-20031209/

³ JSON definition: https://www.json.org/

3.2 The MEAN stack

Typically, to create a web application, different technologies are combined to form what is called

the full stack for web development, that is, software for implementing the various tiers. A very popular

stack, and one deeply embedded in many applications in production today, is the LAMP stack. LAMP

is an acronym which originally stood for Linux, Apache, MySQL and PHP (Lawton, 2005). Each one

of these serves a specific purpose in the application’s structure: Linux is the operating system, at the

base of the application; on top of that, Apache is used as the web server; MySQL is used for the data

tier, as the relational database management system (RDBMS); finally, PHP is used as the scripting

language, which offers the needed programming support. LAMP has since evolved to incorporate more

programming languages and frameworks for building applications and APIs which are still compatible

with the stack (Louridas, 2016).

In recent years, the MEAN stack has emerged as a strong competitor, slowly replacing LAMP as

the first choice for developing web applications (Louridas, 2016). MEAN is an acronym for MongoDB,

Express, Angular and Node.js. These open source components come together to create an end-to-

end framework for application development, from the database to the presentation tier. As such, the

MEAN stack uses: MongoDB as the non-relational database management system; Express as the

web framework, which runs on top of Node.js; Node.js, for the web server-side implementation of the

application in JavaScript; Angular for the presentation tier of the application. All of the aforementioned

components are operated using JavaScript, thus allowing developers the use of only one language for all

the tiers of the application. This constitutes an advantage over the LAMP stack, which requires different

programming languages and format conversions to exchange data between tiers.

A MEAN stack application follows the previously described 3-tier architecture: MongoDB can be

viewed as the data tier; Node.js contains code for the web server; Express is used to create REST

APIs and can be interpreted as a channel to allow communication between the server and presenta-

tion layer, comprising, together with Node.js, the application tier; Angular is the presentation tier. The

typical request-response flow is illustrated in Figure 3.2. Since only one format is used across all tiers

to structure the data (the JSON format), it is possible to avoid the data conversions required in

applications using SQL-based databases. In a MEAN application it is, therefore, faster to make HTTP

requests, present and store the data, as there is no need for reformatting.

Similar to MEAN, other stacks exist which share some of its components. These variants typically re-

place the components of the stack with other JavaScript frameworks. For example, in a MERN stack, the

front-end environment, which is used to build the presentation tier, is replaced by React.js, a JavaScript

library developed by Facebook4; in a MEVN stack, Vue.js, “(...) a progressive framework for building

user interfaces (...)”5, is used in the presentation tier. Nevertheless, I chose the original MEAN variant

for the platform developed in this project, as it is more widespread than the emerging alternatives, with

extensive documentation on each component available online, both individually and as part of the stack.

4 React.js: https://reactjs.org/

5 Vue.js: https://vuejs.org/


Figure 3.2: Request-response flow in a 3-tier architecture MEAN stack application.

The following sections explore each one of the four components in a MEAN stack application, from front

(starting in the client side, Angular) to back, ending with the database (MongoDB).

3.2.1 Angular

At the time of the development of Actinfo, Angular 6 was the current version of this presentation

environment. Developed and supported by Google, Angular6 is a framework for building interactive

single-page applications, i.e., web applications in which user interaction rewrites the current page

instead of loading new pages from the server. This makes for a better user experience, without

interruptions between pages, much like a desktop application. Angular is written in TypeScript, a typed

superset of JavaScript developed by Microsoft, and builds the client application in HTML and TypeScript.

Since the time it was first conceived, the MEAN stack benefited from upgrades to its various technolo-

gies, Angular being one of them. The most recent iteration of the stack makes use of a component-based

architecture for the presentation tier:

1. Angular’s building blocks, NgModules, provide context for compiling these components;

2. Metadata for these components associate them with templates, which define views, i.e., screen

elements which can be modified according to the program’s logic and data;

3. The templates combine HTML with Angular directives (providers of program logic) and binding

markup (which connects the application data and the DOM, Document Object Model, of the page),

which allow Angular to modify the HTML;

4. The page is finally rendered for display to the end user.

Angular’s power comes not only from its speed and performance, as it makes use of code splitting

to only load what is required to render the view the user requested, but also from its versatility as a

cross-platform client environment, with the views being easily rendered on mobile devices. The Angular

CLI (command line interface) makes it easy to generate various components, which can be routed to one

another and connected via services for sharing methods and data. Its ease of testing and deployment

makes it an obvious choice for use with the MEAN stack.

6 Angular: https://angular.io/

3.2.2 Node.js

Node.js7 is a server-side runtime environment. It is built on top of Chrome’s V8 JavaScript engine

and its architecture is event-driven, running on a single thread and an asynchronous, non-blocking I/O

(input/output) model. By using a single thread to service all the requests, Node.js creators hoped to

overcome the bottleneck of I/O operations by moving away from synchronous service of the requests

arriving at the server. This way, when the code being serviced needs, for instance, to query the database,

the web server does not wait for data to be returned; the main thread will continue running, moving

on to the next API call. When the database operation finishes, its corresponding callback is queued,

pending execution until the engine gets a chance to handle the response. This event-driven style

of programming used in Node.js has proven extremely efficient in I/O operations and resource

utilization (Chaniotis et al., 2015).
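The non-blocking behaviour described above can be illustrated with a minimal sketch. The function queryDatabase is a hypothetical stand-in for a database call, simulated here with a timer; it is not taken from Actinfo's code. The sketch only shows that the main thread accepts a second request while the first query is still pending, and that callbacks run in order of completion.

```typescript
// Hypothetical stand-in for a database query: resolves after `ms` milliseconds.
function queryDatabase(label: string, ms: number): Promise<string> {
  return new Promise((resolve) => setTimeout(() => resolve(label), ms));
}

// Handles two "requests" on the single main thread and records the order of events.
function runDemo(): Promise<string[]> {
  const log: string[] = [];

  // Request A starts a slow query; the thread does not wait for the result.
  const a = queryDatabase("A", 50).then((r) => { log.push(`callback ${r}`); });
  log.push("request A accepted");

  // Request B is accepted immediately, while A's query is still pending.
  const b = queryDatabase("B", 10).then((r) => { log.push(`callback ${r}`); });
  log.push("request B accepted");

  // The queued callbacks run later, in order of completion.
  return Promise.all([a, b]).then(() => log);
}

runDemo().then((log) => console.log(log.join("\n")));
```

Both requests are accepted before either callback runs, and the faster query (B) has its callback executed first.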

One major feature of Node.js is its extensibility through the pre-installed package manager, npm

(Node.js package manager). A command line interface and an online database of public and

paid-for packages greatly enhance the versatility of Node.js by providing open source JavaScript development

tools, available across 750,000 packages8.

Node.js allows for building highly scalable, real-time JavaScript web applications and is currently

used by many top companies and organizations, such as Netflix, PayPal, LinkedIn and NASA9.

3.2.3 Express

Specifically built for Node.js, Express10 is a web framework, providing developers with features for build-

ing web applications. Although minimalist and lightweight, it contains numerous HTTP utility methods,

facilitating the creation of APIs. Express makes use of middleware functions for handling requests.

These functions have access to both the request and response objects, as well as the next function in

the request-response cycle. As such, through middleware, Express handles HTTP requests, by either

returning a response or passing on the parameters to a different middleware function. Express is also a

routing framework11, allowing developers to determine how the application responds to a client request

to a given endpoint (a URI plus one of the HTTP request methods GET, POST, PUT or DELETE).

Express is crucial in the application tier, allowing the creation of REST APIs and thus ensuring commu-

nication between the server and the client.
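The middleware pattern described above can be sketched without Express itself. The following toy model is an assumption-laden illustration: the Req and Res shapes, the handle function, the route path and the user identifier are all hypothetical and much simpler than Express's real objects. It only shows how each function receives the request, the response and a next function, and either answers or passes control on.

```typescript
// Minimal request/response shapes (hypothetical, far simpler than Express's own).
interface Req { method: string; path: string; user?: string; }
interface Res { status: number; body: string; }
type Next = () => void;
type Middleware = (req: Req, res: Res, next: Next) => void;

// Runs the middleware functions in order; each one either writes a response
// or calls next() to hand the request to the following function.
function handle(stack: Middleware[], req: Req): Res {
  const res: Res = { status: 404, body: "Not found" };
  let index = 0;
  const next: Next = () => {
    const fn = stack[index++];
    if (fn) fn(req, res, next);
  };
  next();
  return res;
}

// Middleware 1: attaches information to the request, then passes it on.
const identify: Middleware = (req, _res, next) => {
  req.user = "researcher-1"; // hypothetical user id
  next();
};

// Middleware 2: a route handler bound to one endpoint (URI plus HTTP method).
const getStudies: Middleware = (req, res, next) => {
  if (req.method === "GET" && req.path === "/api/studies") {
    res.status = 200;
    res.body = `studies visible to ${req.user}`;
    return; // a response was produced, so the chain ends here
  }
  next();
};

const ok = handle([identify, getStudies], { method: "GET", path: "/api/studies" });
const miss = handle([identify, getStudies], { method: "GET", path: "/unknown" });
console.log(ok.status, ok.body);
console.log(miss.status, miss.body);
```

A request to the registered endpoint is answered by the route handler, while any other path falls through the chain and keeps the default 404 response.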

7 Node.js: https://nodejs.org/

8 Data gathered from https://www.npmjs.com/products/enterprise

9 Information retrieved from https://www.netguru.com/blog/top-companies-used-nodejs-production

10 Express: https://expressjs.com/

11 Routing in Express: https://expressjs.com/en/guide/routing.html


(a) Relational model.

(b) Data as documents.

Figure 3.3: Data models for a relational database (a) and MongoDB (b). Adapted from MongoDB Architecture Guide.

3.2.4 MongoDB

MongoDB is a non-relational database management system which stores data as documents, using a

binary representation of the JSON format called BSON (Binary JSON). BSON documents can contain

one or more fields, each field containing a value of a specific data type (such as arrays, strings, numbers,

Booleans, objects, binary data or sub-documents). Documents with similar structure are organized in

MongoDB as collections. In a traditional relational database, the equivalent of a collection would be

a table, with documents being the rows and fields being the columns. An example from MongoDB’s

Architecture Guide12 comparing a relational data model with MongoDB’s “data as documents” can be

found in Figure 3.3. In this example, which shows the modeling of data for a blogging application, the

relational approach would require multiple tables (here, we’re considering the tables “Category”, “User”,

“Article”, “Tag” and “Comment”). In MongoDB, on the other hand, it is possible to model data using two

collections of documents: one for the users, and another for the articles.

In an article of the blog, multiple comments, tags and categories may exist, each one expressed as

an embedded array in the article document. This approach of localizing data, using a single document

for all the data for a single record, is not only simpler for developers but also improves scalability and

performance, as it is possible to retrieve a full document, with all related data, in a single read from the

database, in contrast with relational databases, where data is spread across multiple tables.

Because the storing of data in MongoDB is flexible, fields in the JSON documents can be altered,

effectively changing the data structure of the document. These updates do not affect other documents

in the database, translating into the possibility of having documents with different fields in the same

collection. The dynamic schema that MongoDB offers allows developers to add or remove fields

in documents as they are needed, without altering a database schema, as would be required in

relational databases.

12 MongoDB Architecture Guide: https://www.mongodb.com/collateral/mongodb-architecture-guide

The use of flexible schemas in MongoDB means the collections do not enforce document structure by

default. As such, it is the responsibility of the developer to define constraints which ensure the integrity

of the data. While in SQL databases there is the need to define foreign key constraints (i.e., fields which

support uniquely identifying the relationship between two tables), in MongoDB the use of these fields

is optional. This may result in the insertion of invalid data in a document. Therefore, data must be

modelled in a way which ensures its integrity, while still matching the performance requirements.

A typical MongoDB database may contain several collections of entities (i.e., documents). The en-

tities in each collection share attributes, defining a loose entity type. To create the logical data model

of the document database, the developer needs to define relationships between the entity types. There

are two types of relationships relevant to this work: one-to-many relationships and many-to-many rela-

tionships. The official documentation for MongoDB13 provides strategies to model both types:

One-to-many relationships: there are two methods to model these types of relationships:

Embedded documents: this strategy was already alluded to in the example from Figure 3.3 (b):

an article of the blog may contain multiple comments, tags or categories. In this example, the

one-to-many relationship between the blog article and the comments, tags and categories

is expressed by embedding the collections of entities on the “many” side of the relationship

in the document for the blog article. In the JSON document, this is the equivalent of using

embedded arrays for each one of the entity types on the “many” side. An example of this

type of relationship in Actinfo is shown in Figure 3.4 (a), where the document belongs to a

collection of the type researchStudy. Documents in this collection may be linked to multiple

documents of the type studyGroup. Therefore, the studyGroup entity type is represented in

the form of embedded arrays of sub-documents in the researchStudy documents. This is

a denormalized data model, in which it is possible to retrieve all information regarding the

groups in a study through a single read.

References: this method consists of including a foreign key field in the documents on the “many”

side of the relationship. Figure 3.4 (b) shows an example from Actinfo. The figure displays

two documents of the file type, which are linked to the same document of the researchSubject

type in the database. The “subject” field acts as a foreign key, linking the file documents to the

researchSubject document. One advantage of this strategy is that it will not grow the original

document for the subject as more files are uploaded to the database. Instead, the references

are stored in the documents for each new file.

Many-to-many relationships: to model these relationships, we can use a strategy similar to the

referencing explained for one-to-many relationships. For many-to-many relationships, we can embed

the references in one side of the relationship, creating an array with foreign keys. This method of

“one way embedding” is often chosen for optimizing the read performance of a relationship of this

type, particularly when the relationship is uneven. Figure 3.4 (c) shows an example from Actinfo,

for a document in the collection with the researchSubject type. Documents in this collection can

be linked to many documents of the type studyGroup. Additionally, studyGroup documents can be

linked to many documents of the researchSubject type. Since there are many more subjects in a

particular group than groups linked to a single subject, we can embed the references to groups in

the subjects.

13 Data models in MongoDB: https://docs.mongodb.com/manual/data-modeling/

(a) Embedded documents.

(b) Documents with references.

(c) One way embedding.

Figure 3.4: Modelling relationships in MongoDB (examples from Actinfo, highlighting the relevant fields). Some fields were collapsed to improve readability.
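The three modelling strategies discussed in this section can be summarized in a short sketch using plain in-memory objects. The “subject” and “groups” field names follow the text; the studyGroups field name, the identifiers, the file names and the string-typed ids are assumptions made for illustration and do not reproduce Actinfo's actual schemas.

```typescript
// Embedded documents (one-to-many): groups are stored inside the study document.
interface StudyGroup { _id: string; name: string; }
interface ResearchStudy { _id: string; title: string; studyGroups: StudyGroup[]; }

// References (one-to-many): each file carries a "subject" foreign key.
interface FileDoc { _id: string; filename: string; subject: string; }

// One-way embedding (many-to-many): each subject holds an array of group ids.
interface ResearchSubject { _id: string; groups: string[]; }

const study: ResearchStudy = {
  _id: "study1",
  title: "Example PA study",
  studyGroups: [
    { _id: "g1", name: "control" },
    { _id: "g2", name: "intervention" },
  ],
};

const files: FileDoc[] = [
  { _id: "f1", filename: "visit1.gt3x", subject: "s1" },
  { _id: "f2", filename: "visit2.gt3x", subject: "s1" },
];

const subjects: ResearchSubject[] = [
  { _id: "s1", groups: ["g1"] },
  { _id: "s2", groups: ["g1", "g2"] },
];

// Embedded: a single read of the study document already contains every group.
const groupNames = study.studyGroups.map((g) => g.name);

// References: the files of a subject are found through the foreign key.
function filesOf(subjectId: string): FileDoc[] {
  return files.filter((f) => f.subject === subjectId);
}

// One-way embedding: the members of a group are found by searching the arrays.
function subjectsIn(groupId: string): ResearchSubject[] {
  return subjects.filter((s) => s.groups.indexOf(groupId) !== -1);
}

console.log(groupNames);                         // [ 'control', 'intervention' ]
console.log(filesOf("s1").length);               // 2
console.log(subjectsIn("g2").map((s) => s._id)); // [ 's2' ]
```

Note how uploading a third file for subject s1 would add one entry to the files array without touching the subject document, which is the advantage of referencing mentioned above.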


Notice also the “_id” field in the example documents. This is a unique ID automatically generated by

MongoDB for each document, which acts as a primary key. The value of this field is of the type ObjectId,

a 12-byte value usually written as a 24-character hexadecimal string, which encodes the timestamp of

the creation of the document together with a random value and an incrementing counter. This information

is accessible through MongoDB’s methods.
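As a minimal sketch of how the creation time can be recovered from an ObjectId, the function below decodes the first four bytes (eight hexadecimal characters) of the identifier as seconds since the Unix epoch. The function name is hypothetical and the identifier is an arbitrary example value; in practice MongoDB drivers expose this through the ObjectId's getTimestamp() method.

```typescript
// Extracts the creation time encoded in the first 4 bytes of an ObjectId.
function objectIdTimestamp(hex: string): Date {
  if (!/^[0-9a-fA-F]{24}$/.test(hex)) {
    throw new Error("an ObjectId is written as 24 hexadecimal characters");
  }
  const seconds = parseInt(hex.slice(0, 8), 16); // seconds since the Unix epoch
  return new Date(seconds * 1000);
}

const created = objectIdTimestamp("507f1f77bcf86cd799439011");
console.log(created.toISOString()); // 2012-10-17T21:13:27.000Z
```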

Lastly, it is also important to address the storage requirements of the platform. Since Actinfo should

be able to handle the storage of actigraphy files of a relatively large size, GridFS14 was used for storing

and retrieving all files uploaded to the platform. GridFS is a specification for MongoDB supporting the

handling of files with sizes exceeding 16 MB, the maximum size of a single BSON document (useful in

Actinfo, which supports the upload of the raw *.gt3x actigraphy files, with sizes ranging from 50 to

100 MB). GridFS has the particularity of automatically generating the collections needed for file storage

and, as such, no schema was needed for the

files. Nevertheless, additional fields with metadata were added as needed, for associating files with

documents in other collections in the database, as explained in more detail in Chapter 4. GridFS works

by dividing files into data chunks, storing them as separate documents. Therefore, one file can be asso-

ciated with more than one chunk, depending on its size. The file’s metadata and its actual contents are

kept in separate collections. In other words, GridFS creates one document per file in the collection for

file metadata, while all of the binary chunks are stored in a different collection. As such, when in need

of updating file metadata, only a single collection is accessed.
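The chunking scheme can be sketched as follows. This is an in-memory illustration under stated assumptions, not the GridFS implementation: the function names and the example identifiers are hypothetical, while the files_id and n fields, the 255 kB default chunk size and the default collection names (fs.files and fs.chunks) follow the GridFS specification.

```typescript
const DEFAULT_CHUNK_SIZE = 255 * 1024; // GridFS default chunk size (255 kB)

// One document per file in the metadata collection ("fs.files" by default).
interface FileMeta { _id: string; filename: string; length: number; chunkSize: number; }

// One document per chunk in the chunks collection ("fs.chunks" by default).
interface Chunk { files_id: string; n: number; data: Uint8Array; }

function toGridFsDocuments(
  id: string,
  filename: string,
  data: Uint8Array,
  chunkSize: number = DEFAULT_CHUNK_SIZE,
): { file: FileMeta; chunks: Chunk[] } {
  const chunks: Chunk[] = [];
  for (let offset = 0, n = 0; offset < data.length; offset += chunkSize, n++) {
    // subarray clamps the end index, so the last chunk may be shorter
    chunks.push({ files_id: id, n, data: data.subarray(offset, offset + chunkSize) });
  }
  return { file: { _id: id, filename, length: data.length, chunkSize }, chunks };
}

// Number of chunks needed for a file of a given size, without allocating it.
function chunkCount(fileSize: number, chunkSize: number = DEFAULT_CHUNK_SIZE): number {
  return Math.ceil(fileSize / chunkSize);
}

const stored = toGridFsDocuments("file1", "tiny.bin", new Uint8Array(10), 4);
console.log(stored.chunks.map((c) => c.data.length)); // [ 4, 4, 2 ]
console.log(chunkCount(50 * 1024 * 1024));            // 201
```

With the default chunk size, a 50 MB actigraphy file would therefore be stored as one metadata document plus 201 chunk documents, and renaming the file would only touch the metadata document.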

3.3 The FHIR standard

Despite the advantages of MongoDB’s schema flexibility, some additional constraints must be imple-

mented when trying to standardize the storage and sharing of data. Ultimately, Actinfo should be inter-

operable with clinical data to the fullest extent, making use of well established, international standards for

information exchange. Schema design cannot, therefore, be discarded simply because of MongoDB’s

flexible document structure.

As such, to ensure interoperability, the HL7 FHIR specification was used, whenever possible, for

modeling data in Actinfo. Published by HL7 (Health Level 715, a not-for-profit organization dedicated to

developing standards for handling electronic health information, including exchange, integration, sharing

and retrieval), FHIR16 (pronounced ”fire”) is an acronym for Fast Healthcare Interoperability Resources.

Being supported by over 1,600 members from over 50 countries, with stakeholders representing

not only healthcare providers, but also government, pharmaceutical companies and consulting firms17,

HL7’s standards were not, initially, open to the public. As the community using the standards grew,

the need for an open license to utilize the organization’s specifications for developing interoperable

applications became more and more evident. Thus, FHIR was created, with the goal of providing a

simple, easy to implement API for healthcare (Mandel et al., 2016; Bender and Sartipi, 2013).

14 GridFS: https://docs.mongodb.com/manual/core/gridfs/

15 Health Level 7: http://www.hl7.org/

16 FHIR: https://www.hl7.org/fhir/

17 Data from HL7’s website: https://www.hl7.org/about/index.cfm?ref=nav


In FHIR, data are represented as resources. Each of these modular components has a set of well-

defined fields, with specific data types, and can have multiple representations, in different formats. Re-

sources in FHIR have clear, intuitive definitions for their data elements and contain references to one

another, defining constraints and relationships between them. Together they constitute a collection of

information models and are one of the two main pillars of FHIR, the other being its RESTful APIs to

operate on resources. In the current official release of FHIR (version R4, as of April 2019),

resources are grouped in five layers, according to their role in the application: Foundation, Base, Clinical,

Financial and Specialized. With the current iteration of Actinfo being research-focused (and not exactly

a clinical application in the eyes of FHIR), this last layer contains the relevant resources used, as they

were designed for public health and research: the ResearchStudy and ResearchSubject resources.

FHIR resources can be described in multiple formats, such as XML, Turtle and, of particular interest

for the work developed in this dissertation, JSON. The representation of FHIR resources in the JSON

format allows for integration with a MEAN stack application such as Actinfo, as information can be

stored in the database with the template defined in FHIR’s documentation. It is, however, important

to mention that FHIR resources were not used out-of-the-box when modelling data in Actinfo, as they

either lack fields needed in this specific context, or provide ones which are not relevant under the scope

of this work. The resources were, therefore, followed as closely as possible, with some minor extensions

and some fields ignored, due to missing data (but no fields specifically required in the FHIR resource in

question were left empty). This is safeguarded by FHIR’s extensibility: FHIR resources were conceived

in such a way that, provided the document’s structure remains true to the original template, i.e., its

schema having all the fields specified in the documentation, and provided the required fields are not

left empty, documents can be extended with additional fields and the non-required fields can be ignored

without compromising the correct use of the standard. This flexibility is of great importance in Actinfo, as

it makes it possible for the platform to use the standard, even though some fields specific to a healthcare

context are left empty, due to the nature of this project being more oriented towards PA research.

The use of MongoDB as a flexible database proves to be a major advantage, since integration with

FHIR is possible by modeling documents after FHIR resources, in the JSON format. To accomplish this

goal, a library for MongoDB, Mongoose18, was used to model data, allowing the creation of schemas

following the structure of FHIR resources.
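The rule followed when adapting FHIR resources (required fields are always filled, optional fields may be omitted, and platform-specific fields may be added) can be sketched as a simple check. This is an illustration only, not Actinfo's Mongoose schema: the function name, the reference values and the sedentaryTime field are hypothetical, whereas status, study and individual are the elements that FHIR R4 marks as required in the ResearchSubject resource.

```typescript
// Elements with cardinality 1..1 in the FHIR R4 ResearchSubject resource.
const REQUIRED_FIELDS = ["status", "study", "individual"];

// Returns the names of the required fields that are absent or empty.
function missingRequired(doc: { [key: string]: unknown }): string[] {
  return REQUIRED_FIELDS.filter((field) => {
    const value = doc[field];
    return value === undefined || value === null || value === "";
  });
}

// A document with all required fields, no optional FHIR fields,
// and one platform-specific addition.
const subjectDoc = {
  resourceType: "ResearchSubject",
  status: "on-study",
  study: { reference: "ResearchStudy/study1" },
  individual: { reference: "Patient/p1" },
  sedentaryTime: 512, // hypothetical extra field, not part of FHIR
};

// A document that leaves required fields out.
const incompleteDoc = { resourceType: "ResearchSubject", status: "on-study" };

console.log(missingRequired(subjectDoc));    // []
console.log(missingRequired(incompleteDoc)); // [ 'study', 'individual' ]
```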

3.4 Security and authentication

In a platform such as Actinfo, where several requirements must be met to ensure compliance with the

GDPR (the General Data Protection Regulation) and Portugal’s own data protection authority, CNPD

(Comissao Nacional de Proteccao de Dados) guidelines, some measures must be implemented to es-

tablish a secure connection between the client and the server. Furthermore, client authentication for

navigating the platform is also to be considered.

18 Mongoose: https://mongoosejs.com/

HTTPS and SSL

An extension of HTTP, HTTPS is a protocol for secure communication over a network. When using

HTTPS, the browser is signaled to add an encryption layer, protecting the traffic, using SSL (Secure

Sockets Layer) or its successor, TLS (Transport Layer Security). By wrapping normal traffic in this

protected, encrypted layer, the server and client can communicate without the risk of messages being

read or altered by outside parties, effectively blocking what are referred to as “man-in-the-middle” attacks.

Actinfo is currently hosted in an SSL-protected domain. An SSL certificate was issued for the platform

and installed in the web server. This certificate contains a digital signature from a Certificate

Authority (CA). When the browser connects to the server, it tries to verify the authenticity of the

certificate, by checking the entity that issued it against a list of trusted organizations. In this case, the

SSL certificate is signed by DigiCert19, a company focused on digital security. Once the identity of the

website has been verified, an encrypted session is started to allow client-server communication. Users

can also verify the identity of the website they’re in: in most modern browsers, a padlock in the address

bar indicates a secure connection.

Client authentication: JWT

I implemented a platform specific user authentication system in Actinfo, to control access to the platform,

as explained in more detail in Chapter 4. Users registered in the platform are provided login credentials

for navigating Actinfo. When a registered user successfully logs in to the platform, a JSON Web Token

(JWT, pronounced “jot”) is generated. JWT is an open standard (RFC 7519)20 which allows for the

secure transmission of information between parties as a JSON object. Furthermore, because it is digitally

signed, the information can be trusted and verified. A JWT is a string composed of a header, payload

and a signature, separated by dots.

The most common use for JWTs (and the one which Actinfo makes use of) is authorization. Once the

user logs in, every subsequent request includes the JWT generated for that user, which allows access to routes,

services and resources, provided these were permitted with that token. The token is passed in the HTTP

Authorization header with every API call; the server will then check for a valid JWT which, if present,

allows access to the protected routes. Because the JWT is saved in the browser’s local storage, there’s

no need for exchanging credentials to verify the user’s identity with every request. However, JWTs

should not be kept in local storage for longer than required. I defined an expiration date of one day for

tokens generated for Actinfo’s users. Once the JWT expires, the user needs to log back in to access the

platform’s contents. Additionally, JWTs are removed from local storage every time a user logs out of

the platform.
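To make the token structure and the one-day expiration concrete, the sketch below issues and verifies an HS256-signed token using only Node.js's built-in crypto module. It is a simplified illustration, not Actinfo's implementation (which relies on the jsonwebtoken and passport-jwt packages): the function names, the secret, the user identifier and the choice of the HS256 algorithm are assumptions.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

const ONE_DAY_SECONDS = 24 * 60 * 60;

// Base64url-encodes a JSON object (used for the header and the payload).
function encode(obj: object): string {
  return Buffer.from(JSON.stringify(obj)).toString("base64url");
}

// HMAC-SHA256 signature over "header.payload".
function sign(data: string, secret: string): string {
  return createHmac("sha256", secret).update(data).digest("base64url");
}

// Builds "header.payload.signature" with an expiration one day after issue.
function issueToken(userId: string, secret: string, nowSeconds: number): string {
  const header = encode({ alg: "HS256", typ: "JWT" });
  const payload = encode({ sub: userId, iat: nowSeconds, exp: nowSeconds + ONE_DAY_SECONDS });
  return `${header}.${payload}.${sign(`${header}.${payload}`, secret)}`;
}

// Returns the claims if the signature is valid and the token has not expired.
// A production verifier would also validate the header (e.g. the "alg" value).
function verifyToken(
  token: string,
  secret: string,
  nowSeconds: number,
): { sub: string; exp: number } | null {
  const parts = token.split(".");
  if (parts.length !== 3) return null;
  const [header, payload, signature] = parts;
  const expected = Buffer.from(sign(`${header}.${payload}`, secret));
  const received = Buffer.from(signature);
  if (expected.length !== received.length || !timingSafeEqual(expected, received)) return null;
  const claims = JSON.parse(Buffer.from(payload, "base64url").toString("utf8"));
  if (typeof claims.exp !== "number" || nowSeconds >= claims.exp) return null;
  return claims;
}

const secret = "demo-secret"; // hypothetical; real secrets are never hard-coded
const now = 1700000000;       // fixed "current time" to keep the example deterministic
const token = issueToken("user-42", secret, now);

console.log(token.split(".").length);                                    // 3
console.log(verifyToken(token, secret, now + 60) !== null);              // true
console.log(verifyToken(token, secret, now + ONE_DAY_SECONDS) === null); // true (expired)
console.log(verifyToken(token, "wrong-secret", now + 60) === null);      // true (bad signature)
```

Because the signature covers both the header and the payload, any change to the claims (for example, extending the expiration) invalidates the token unless the secret is known.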

To implement this authentication protocol, Passport21, a Node.js middleware for authentication, was

used. Passport is easily integrated in Express-based applications, being flexible and modular. To

authenticate requests, Passport makes use of authentication mechanisms called strategies, which are

packaged as individual modules. In Actinfo, the JWT strategy was employed, along with a strategy for

authentication using a Google account, as explained in more detail in Chapter 4.

19 DigiCert: https://www.digicert.com/

20 JWT description: https://tools.ietf.org/html/rfc7519

21 Passport: http://www.passportjs.org/


Actinfo was built using the MEAN stack, a full stack of JavaScript-based open source technologies which,

when combined, form a framework for web application development following the

3-tier architecture: Angular is used for the presentation layer; together, Express and Node.js make up the

application tier; MongoDB is used as the non-relational database. Communication between layers is

made through REST APIs, in a request-response flow.

MongoDB’s flexibility and storage of data as JSON-like documents makes it possible to employ stan-

dards for electronic information exchange, such as the HL7 FHIR standard. FHIR addresses interop-

erability with clinical data using well structured data models, called resources. In its current iteration,

Actinfo does not deal with clinical data, focusing on accelerometry and physical activity (PA) data. How-

ever, the platform should be prepared to allow for future integration with data from multiple sources,

many of which contain health indicators, and allowing that information to be compared with PA time

indicators. Additionally, the FHIR resources used here are adaptable, to meet the requirements of the

information storage needs in the context of the work developed, which, paired with the robustness of the

standard, makes it an obvious choice for modelling data in Actinfo.

To ensure protection from man-in-the-middle attacks and provide a trusted, verifiable identity to the

website, an SSL certificate was installed, which allows encryption of data exchanged between the web

server and browsers. Additionally, Actinfo is equipped with an authentication system of its own, making

use of JWT for authenticating users in the platform, which allows for safe navigation in the platform.


Chapter 4

Actinfo

The present chapter describes Actinfo, the platform developed in the scope of this dissertation. Actinfo

is currently hosted by FCCN1, in a virtual machine, managed by INESC-ID2, its associated domain being

https://actinfo.inesc-id.pt. This chapter explains the platform in detail, starting with an overview

of its architecture, followed by a description of its various features and tools. Lastly, an explanation of

how Actinfo is compliant with current security and data protection regulations is provided.

4.1 Overview of the platform

Building on the model introduced in Chapter 3, Actinfo follows a 3-tier architecture based on the MEAN

stack. Each of these tiers can be analyzed individually to understand its contribution to the platform

as a whole.

4.1.1 Database

Figure 4.1 shows the logical data model for the database. The relationships between the different entity

types were modelled as explained in Section 3.2.4. Five different collections of entities exist in this

model. Each collection has an entity type:

• user : a collection of information profiles for the registered users, including the access credentials

for navigating the platform. The “role” field serves a specific purpose which is addressed in

Subsection 4.1.2.

• researchStudy : the collection for data from physical activity (PA) studies, its documents modelled

after FHIR’s ResearchStudy resource. A researchStudy establishes a many-to-many relationship

with the documents in the user collection via embedding of the user references in the researchStudy

document, under the “userPermissions” field. These documents contain metadata for the

PA studies. Additionally, documents in this collection contain embedded documents for groups

of study participants, establishing a one-to-many relationship. As explained in Section 3.3, FHIR

resources were used as a basis for modelling this type of data. However, because they are

more clinically oriented, some additional fields were needed for the purposes of this project, while

others were intentionally not used due to lack of FHIR-specific information. This is not to say the

core FHIR schema is incomplete, but instead that a request to a researchStudy in Actinfo returns

a response which differs slightly from the original FHIR resource, as only non-empty fields are

present in the JSON. Future iterations of the platform may contain documents with more FHIR-related

fields, which now act only as placeholders. A more detailed explanation of the contents of

each field can be found in Appendix A.

1 FCCN: https://www.fccn.pt/

2 INESC-ID: https://www.inesc-id.pt/

Figure 4.1: Data model for Actinfo.

• studyGroup: documents with this entity type appear in the form of embedded documents in the

researchStudy collection. As such, the studyGroup does not warrant the creation of a separate

MongoDB collection to store its documents. The studyGroup contains metadata resulting from

grouping the study participants.

• researchSubject : a collection for data from participants from the PA studies. Similarly to the re-

searchStudy collection, its documents were modelled after a FHIR resource, the ResearchSubject

resource. Once again, several non-required fields were not used, while some were added,

extending the original FHIR resource. This is the collection in which the outputs resulting from

operations on actigraphy files are stored, i.e., the PA time indicators (sedentary time, time in light

PA, time in moderate- to vigorous-intensity PA and breaks/bouts) and other information extracted

from the files. The “groups” field in the documents of this collection allows the establishment of a

many-to-many relationship with the studyGroup the subject belongs to, via one way embedding.

Note: there is also a “study” field, which references the study the subject is a part of. It was

included due to being required in the FHIR resource’s model, even though it is not used to establish

a relationship in the data model chosen for Actinfo.


• file: a collection for metadata for actigraphy files uploaded to the platform. A file establishes

a many-to-one relationship with the researchSubject, by including a field which references the

subject the file belongs to.

The documents of each collection follow schemas modelled using Mongoose (see Section 3.3),

except for those in the file collection, for which the model is generated automatically by GridFS. Note

that, as explained in Section 3.2.4, all entities have an “_id” field, generated automatically by MongoDB,

in addition to the fields represented in Figure 4.1. A more detailed description of the fields in each entity

of the data model can be found in Appendix A.

4.1.2 Web server

The platform’s web server is where I developed APIs to perform all the required tasks for CRUD (Create,

Read, Update, Delete) operations on documents in the researchStudy, researchSubject, user and file

collections, as well as process actigraphy data. When a user makes a request, via, for example, the

click of a button, an Angular service is triggered to send this HTTP request to the application tier, where

the appropriate responses are generated and sent to the web client. As such, different methods are

needed to handle requests targeting documents in different collections. A more structured

explanation of the operations performed in the application tier with every request is given in

Section 4.2.

The user management API

With the main goal of controlling access to the platform, contents of the user collection describe who

is navigating Actinfo. Methods defined in this API support not only the creation and management of

user accounts, but also the authentication of a user logging in to the platform. When creating new

user accounts, based on the information received from the web client, the actual passwords are never

stored in the database; instead, when the account is created, the password is hashed, and the

resulting string is stored in the “password” field of the user document. Regarding authentication,

an API method compares the hash stored in the document with the password provided at log

in; if they match, a JWT is generated (see Section 3.4), for user navigation on the platform. Optionally,

users registered on Gmail can access the platform without providing a password, using an alternative

authentication path based on a Passport.js strategy for that purpose: a single sign-on method is

employed, using an OAuth3 provider (in this case, Google), which enables Actinfo’s access to a portion of the

user’s Google account profile, specifically, their email address. This required the registration of Actinfo, as a

third-party application, able to request access to this information, in the official Google API console.

The implementation of the user authentication API required the installation of npm packages (see

Section 3.2) to handle registration and authentication, namely the bcryptjs package, for hashing the

passwords, and the passport, passport-jwt and jsonwebtoken packages for authentication.
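As an illustration of the registration and login flow described above, the following is a minimal sketch, not the platform's actual code. The names UserDoc, UserStore, Hasher, registerUser and authenticate are hypothetical, and the hasher is passed in as a parameter so that the sketch stays free of dependencies; with bcryptjs, its hashSync and compareSync functions would be natural candidates for the two methods.

```typescript
// Hypothetical document shape: the text only states that the "password"
// field holds a hash and that a "role" field exists.
interface UserDoc {
  username: string;
  password: string; // the hash, never the plain-text password
  role: "admin" | "researcher";
}

// Assumption: an in-memory stand-in for the user collection.
type UserStore = Record<string, UserDoc>;

// Assumption: a pluggable hasher, so that no specific library is required here.
interface Hasher {
  hash(plain: string): string;
  matches(plain: string, stored: string): boolean;
}

// Account creation: only the hash is written to the store.
function registerUser(
  store: UserStore,
  username: string,
  plain: string,
  role: UserDoc["role"],
  hasher: Hasher
): UserDoc {
  const doc: UserDoc = { username, password: hasher.hash(plain), role };
  store[username] = doc;
  return doc;
}

// Login: returns the user document when the password matches the stored hash
// (the point at which a JWT would be issued), and null otherwise.
function authenticate(
  store: UserStore,
  username: string,
  plain: string,
  hasher: Hasher
): UserDoc | null {
  if (!Object.prototype.hasOwnProperty.call(store, username)) return null;
  const doc = store[username];
  return hasher.matches(plain, doc.password) ? doc : null;
}
```

Keeping the hasher behind an interface also makes the flow easy to test without a real hashing library.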

³ OAuth: https://oauth.net/

In addition to the general authentication system, I implemented an authorization protocol, to further

restrict navigation in the platform, via the creation of the ”role” field in the user documents. Currently,

Actinfo supports two different types of user accounts, depending on the contents to be accessed, result-

ing in two different roles: ”admin”, for administrator accounts and ”researcher”, for researcher accounts.

While the former is only able to register and manage users in the platform, being restricted to these API

methods, the latter has access to all of the remaining features of the platform.
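The role-based restriction can be pictured with the following minimal sketch. The grouping of API methods into the areas "users", "studies", "subjects" and "files" and the names allowedAreas and canAccess are assumptions made for illustration; only the two role names come from the text.

```typescript
type Role = "admin" | "researcher";

// Hypothetical grouping of the API methods by the collection they target.
type ApiArea = "users" | "studies" | "subjects" | "files";

// Administrators only manage accounts; researchers get the remaining features.
const allowedAreas: Record<Role, ApiArea[]> = {
  admin: ["users"],
  researcher: ["studies", "subjects", "files"],
};

// True when an account with the given role may call methods in the given area.
function canAccess(role: Role, area: ApiArea): boolean {
  return allowedAreas[role].indexOf(area) !== -1;
}
```

A check of this kind would typically run on the server for every request, after the JWT has been verified.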

The researchStudy and researchSubject API

A repository of PA studies such as Actinfo requires standard CRUD operations for researchStudy and
researchSubject documents. Users with the “researcher” role can make requests in the front-end to
create new studies, access their contents, update them with new information (e.g., addition or
removal of participants) or delete them. These operations are performed via POST, GET, PUT and
DELETE requests, respectively, each generating a server-side response that is sent to the
presentation tier to reflect the changes. Similar methods are defined for the researchSubject entity
type.

The file API

Lastly, to support Actinfo’s functionalities for managing files uploaded to the platform and for
computing metrics from their contents, I defined a set of API methods for accessing and operating on
*.agd files. Routes for handling POST, GET and DELETE requests were created using Express on Node.js,
for file upload, file download and access, and deletion from the database, respectively. To read the
contents of the SQLite database in each file, I used the npm package sqlite3 together with Node.js’s
native methods for file streaming, allowing queries to be performed on files saved to GridFS and thus
enabling access to the tables presented in Figure 2.3 (Section 2.2). On top of this, I developed
methods that operate on the retrieved accelerometer time series and that extract personal information
stored in the device, such as the sex, height, weight and age of the subject.

4.1.3 Client

The presentation tier of the platform is built on various Angular components, which render the
different pages the user has access to. On the web client side, Actinfo makes use of Bootstrap⁴, the
popular HTML, CSS and JavaScript library originally developed at Twitter. Using these technologies, I
created a responsive, user-friendly set of web pages. Furthermore, certain small details were
implemented to improve the user experience, such as the flash messages module for Angular, available
from npm, which displays notification messages to the end user when certain actions succeed or fail.
As for more complex visualizations, such as charts for the distributions of PA time indicators, the
open-source libraries D3.js⁵ and Chart.js⁶ were used to generate HTML plots from JavaScript arrays
created using the implemented APIs.

⁴ Bootstrap: https://getbootstrap.com/

4.2 User interface

Upon landing on Actinfo’s website, the end user is presented with the homepage, shown in Figure
4.2 (a). This page, the login menu (Figure 4.2 (b)) and the user profile page (Figure 4.2 (c)) are
the only pages common to all users. Once logged in, users are presented with different interfaces
depending on the role of their account. From that point onward, different features of the platform
can be explored, as explained in this section.

The many operations which can be performed using Actinfo are grouped into two main categories,
one for each of the two user roles. The pages accessible to the researcher role are where the
platform’s potential truly manifests: Actinfo presents itself as both a hub for PA studies and a tool
for processing actigraphy data. As such, various interfaces were created for these two purposes: on
the one hand, at a broader level, CRUD operations on studies and files; on the other hand, processing
of actigraphy data and visualization of the computed metrics, with plots and relevant statistics.
This section describes these features following the intended flow for navigating the platform,
starting with account creation and management, in the case of administrator accounts, and continuing
with the various steps from study creation to data visualization, for researcher accounts.

4.2.1 Administrator role

Users with the “admin” role can perform CRUD operations on user accounts. Separating the users who
have access to the platform’s features for data storage and processing from the users who manage
accounts allows control over who can access the different contents in Actinfo.

Upon login, administrators are shown the page in Figure 4.3 (a). From this interface, they can update
the information of specific accounts or remove registered users from the database. Additionally, new
users can be registered by clicking the “New user” button. Figure 4.3 (b) presents the form for the
creation of user accounts.

Most of the form fields are self-explanatory. It should be noted, however, that the user’s email is
collected not only to support the alternative login method explained previously, but also as the
user’s contact information.

⁵ D3.js: https://d3js.org/
⁶ Chart.js: https://www.chartjs.org/

Figure 4.2: Actinfo’s homepage (a), login menu (b) and profile page (c).

Figure 4.3: The administrator interface: dashboard (a) and “Register” form (b).

Figure 4.4: ”My studies” interface, presented to users with researcher accounts upon login.

4.2.2 Researcher

Figure 4.4 shows the interface presented to users with researcher accounts upon successful login. Users

are shown the studies they have access to and are presented with an option to create a new study, by

clicking the ”Create new study” button. Additionally, clicking on a row of the table redirects the user to

that specific study’s interface. Lastly, an option to ”Compare studies” is also available.

New study

Upon clicking the “Create new study” option, the user is redirected to the study creation form page.
As shown in Figure 4.5, several fields are required to create a new PA study, most of which
correspond to FHIR-specific fields derived from the ResearchStudy resource (as explained in Section
3.3):

• Identifier: an identifier attributed to the research study by the responsible researcher.

• Title: a short and descriptive label for the study.

• Responsible person: the name of the researcher who oversees the study.

• Start and end dates: when the study began and ended, mapped to the ”period” field in the

corresponding JSON document.

• Status: the status of the study, based on FHIR version R3 (note: at the time of implementation

of this field, version R4 had not yet been officially released, hence the choice for the options from

version R3). Hovering over each one of the radio button options shows the user a description of

each status.


Figure 4.5: ”New study” form.

• Study groups: when designing the interface for studies, researchers at the EHLab requested a
system to group files within each study, resulting in the implementation of an organization method
based on study groups. Users must specify at least one group before uploading the files of a
study. More groups can be created using the “New group” button. This feature was designed to allow
study participants to be grouped by their geographical location and/or corresponding cohort. As
such, a study in which accelerometer data was collected at distinct moments for the same subjects
can have multiple groups, one for each of those moments. The cohort index allows for a distinction
between, for instance, a first baseline PA assessment and a second one some time later, after
participants followed a certain physical exercise protocol. Alternatively, if data was only
collected once for each subject, a cohort index of 1 should be specified for all groups, as
described in the helper text in the form.

Clicking the “Continue” button redirects users to the interface of the created study (Figure 4.6).
Users are prompted to upload the corresponding files to each of the created study groups. This
interface also introduces options to manage various aspects of the study, namely: management of the
study’s permissions, deletion of the study, creation of new groups and visualization of statistics
from data in the study. Permissions were introduced as a way to restrict access to a study’s
contents. By default, only the researcher who created the study can access it, unless he/she updates
the permissions by granting access to other registered researchers. The features of the “Compute
group statistics” button are explained in more detail later in this section. As for the groups list,
the “Validated?” column indicates whether the *.agd files in the study group have been processed
with Actinfo’s tools for actigraphy files, as will be explained later.

Figure 4.6: Interface for a created study. Two groups exist in this example study, one with files
already uploaded to it, the other still empty.

Clicking on an item of the group list redirects the user to an interface for that study group. There,
users are presented with options to delete the group, upload *.gt3x/*.dat/*.agd files to it or
download those files. The file uploader interface is shown in Appendix B (Figure B.1). Figure 4.7
shows an example group with files already uploaded to it, on which users can perform common file
management operations, such as file deletion, download or upload.

*.agd file validation and processing

The processing of actigraphy files starts with the “Validate *.agd files” operation, available from
the interface in Figure 4.7. Clicking this button redirects users to the page for setting the
parameters for validation and processing of the actigraphy files in a specific study group, as shown
in Figure 4.8. The processing of actigraphy data involves three major operations, which are carried
out via the API, based on the input parameters:

1. Wear time validation: to remove periods of non-wear time from the analysis, based on
standardized criteria for accelerometer reduction settings (International Children’s Accelerometry
Database (ICAD), 2017). A period of time in the *.agd time series (i.e., the “dataTimestamp” and
“axis1” columns of the “data” table, if we refer to the file schema from Figure 2.3) is considered
non-wear time (that is, a period during which the subject is assumed not to have been wearing the
monitor) if a minimum of 60 minutes of consecutive zero-value activity counts (in the “axis1”
column) is observed. From this definition, a day is considered valid if it accumulates at least 600
minutes of valid wear time. Lastly, at the file level, a file is considered valid if a minimum of
three valid days occur, one of them being a weekend day. This validation process takes place
independently of the chosen settings. Additionally, if the option “Define maximum wear time” is
selected, the accelerometer wear time validation loop will use the input value as the maximum time,
in minutes, a day of data should have to be considered valid. This option was implemented at the
users’ request, to account for situations in which participants do not follow the exact
instructions for wearing the accelerometer. Specifically, participants should remove the activity
monitors before going to bed; if this does not happen, the accelerometer will wrongly register sleep
time as wear time, which will then be incorrectly classified as sedentary time when the cut point
set is applied. By defining a maximum for wear time, adjusted for the number of hours of sleep, it
is possible to remove sleep time from the analysis.

Figure 4.7: Study group interface, indicating files uploaded to it. Options to search for files by
name are available to users.

Figure 4.8: Interface for setting the validation parameters for *.agd files. The “Compute bouts”
option is not selected, for better readability.

2. Application of cut points: users can select the cut point set to be applied to the *.agd time

series, to compute sedentary time and times in light, moderate and vigorous PA. The ”Troiano”

and ”Evenson” cut point sets are given as options, as they are the most commonly used sets in

research conducted by the EHLab for adults and children, respectively. These were modelled on
Actilife’s own cut point sets, the thresholds for each level of PA intensity being the ones
described in Section 2.2. Additionally, users have the option of defining their own cut point set,
and are thus not limited to the common four levels of intensity. Choosing this “Custom” option
presents users with a form for defining the thresholds for each custom level. An example form is
provided to help users understand how to fill in the fields. The custom form and the example are
shown in Appendix B, Figures B.2 and B.3, respectively. Based on the cut point set

chosen, the duration of the time periods in each level of PA intensity is calculated, resulting in the

accumulated time in each intensity, for each valid day.

3. (if selected) Computation of bouts/breaks: for each valid wear time period, bout detection takes
place, according to the selected bouts. To avoid manual input of the thresholds for the commonly
used bouts and breaks, a list of the most frequently chosen ones is presented to the user.
Additionally, users can define a custom bout of PA, not restricted to sedentary behaviour, via
manual input, in a similar fashion to the custom cut point set definition. A bout of PA requires the
definition of a minimum duration for the bout (in minutes) and a count level (in counts per minute).
The complete list of options for the commonly used bouts (all referring to sedentary behaviour) is
shown in Appendix B, Figure B.4. The form for custom bout definition is presented in Figure B.5.
Detecting a bout is a matter of looping through a valid wear time period and checking whether the
activity (data in the “axis1” column) remains within the interval of counts for that bout for at
least the bout’s minimum duration, and for a period of time which does not exceed the bout’s
maximum duration. The total number of detected bouts and their accumulated duration are saved for
each selected bout.
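The wear time criteria of the first operation can be sketched as follows. This is a minimal illustration under stated assumptions, not the platform's implementation: the names DayData, wearMinutes, isValidDay and isValidFile are hypothetical, each day is assumed to be available as an array of “axis1” counts with one value per epoch, and the optional maximum wear time is left out because the text does not pin down how it is applied.

```typescript
const NON_WEAR_MINUTES = 60;      // consecutive zero counts that mark non-wear time
const MIN_DAY_WEAR_MINUTES = 600; // wear time required for a valid day
const MIN_VALID_DAYS = 3;         // valid days required for a valid file

// Hypothetical per-day view of the time series.
interface DayData {
  isWeekend: boolean;
  counts: number[]; // "axis1" values for that day, one per epoch
}

// Wear time of a day, in minutes: total time minus every run of zero counts
// lasting at least NON_WEAR_MINUTES.
function wearMinutes(counts: number[], epochSeconds: number): number {
  const minRunEpochs = (NON_WEAR_MINUTES * 60) / epochSeconds;
  let nonWearEpochs = 0;
  let zeroRun = 0;
  for (const c of counts) {
    if (c === 0) {
      zeroRun += 1;
    } else {
      if (zeroRun >= minRunEpochs) nonWearEpochs += zeroRun;
      zeroRun = 0;
    }
  }
  if (zeroRun >= minRunEpochs) nonWearEpochs += zeroRun; // run reaching the end of the day
  return ((counts.length - nonWearEpochs) * epochSeconds) / 60;
}

function isValidDay(day: DayData, epochSeconds: number): boolean {
  return wearMinutes(day.counts, epochSeconds) >= MIN_DAY_WEAR_MINUTES;
}

// A file is valid with at least three valid days, one of them a weekend day.
function isValidFile(days: DayData[], epochSeconds: number): boolean {
  const validDays = days.filter((d) => isValidDay(d, epochSeconds));
  return validDays.length >= MIN_VALID_DAYS && validDays.some((d) => d.isWeekend);
}
```

Expressing the non-wear threshold in epochs rather than minutes keeps the same code usable for files recorded with different epoch lengths.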

These three operations are made possible by streaming files from the database to a temporary
on-disk location, using native Node.js methods in conjunction with the sqlite3 package. It is
important to note that, because cut point sets and count levels for bouts are defined for periods of
one minute, they need to be linearly scaled to account for the different epoch lengths of the *.agd
files (for example, in Chapter 5, the files correspond to sequences of 15-second epochs, which is a
typical epoch length in studies conducted at the EHLab).
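The linear scaling and the accumulation of time per intensity level can be illustrated with the sketch below. The names Level, scaleCutPoints and minutesPerLevel are hypothetical, and the thresholds in illustrativeLevels are round numbers chosen for the example; they are not the Troiano or Evenson values, which are given in Section 2.2.

```typescript
// A level is described by its lower bound in counts per minute (cpm).
interface Level {
  name: string;
  minCpm: number;
}

// Illustrative thresholds only (assumption), sorted by ascending lower bound.
const illustrativeLevels: Level[] = [
  { name: "sedentary", minCpm: 0 },
  { name: "light", minCpm: 100 },
  { name: "moderate", minCpm: 2000 },
  { name: "vigorous", minCpm: 6000 },
];

// Scales per-minute thresholds to the epoch length: with 15-second epochs,
// a threshold of 100 cpm becomes 25 counts per epoch.
function scaleCutPoints(levels: Level[], epochSeconds: number): { name: string; minCounts: number }[] {
  const factor = epochSeconds / 60;
  return levels.map((l) => ({ name: l.name, minCounts: l.minCpm * factor }));
}

// Accumulated minutes in each level. Each epoch is assigned to the highest
// level whose scaled lower bound it reaches; levels must be sorted ascending.
function minutesPerLevel(counts: number[], levels: Level[], epochSeconds: number): Record<string, number> {
  const scaled = scaleCutPoints(levels, epochSeconds);
  const totals: Record<string, number> = {};
  for (const level of scaled) totals[level.name] = 0;
  for (const c of counts) {
    let current: string | undefined;
    for (const level of scaled) {
      if (c >= level.minCounts) current = level.name;
    }
    if (current !== undefined) {
      totals[current] = (totals[current] ?? 0) + epochSeconds / 60;
    }
  }
  return totals;
}
```

In practice, only the epochs belonging to valid wear time would be passed to a function of this kind.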

Output page

The results of the described operations are presented to the end user in the output page, an example
of which is shown in Figures 4.9 and 4.10. The page is divided into two sections: a “Summary”
section, condensing the important information for the valid files, and a section with the detailed
outputs for all files (both valid and invalid). Each section presents a list with the information
for each file. Each item of the lists (file) can be expanded by clicking the arrow on the right side
of the card, revealing the file information. Options to expand/collapse all items at once are also
available. Users can also search for specific files by filename. Additionally, users can export the
analysis to an Excel file, by clicking the “Export to Excel file” button at the top of the page (an
example *.xlsx file is shown in Appendix C), and/or save the output to the database, by clicking the
“Save output” button. This last action makes an API call to generate researchSubject documents from
the files, saving them to MongoDB. This allows users to return to an already validated dataset
through the “study groups” menu from Figure 4.7, without having to set the validation parameters
again. Saved validations can also be wiped from the database, if users decide to process the files
with different settings.

In the ”Summary” section, only valid files are shown. For each item of the list, three tables are

presented to the user:

• ”Subject info. and validation details”: general information extracted from the *.agd file including the

subject ID, the file epoch, gender and age. The chosen cut point set is also shown.

• “Wear time validation”: summary information for the wear time validation results. Includes the
number of valid days, total valid wear time, total times for PA indicators, and an indication of
whether the subject meets the WHO’s recommendations for PA (World Health Organization, 2010).
Compliance with the PA recommendations is assessed taking into account the age of the participant.

• (if bout/break detection was selected) ”Bouts summary”: A summary of the total number and time

in each of the selected bouts.


The “Detailed validation outputs” section follows a similar structure to the “Summary” section,
except that all files are included, regardless of whether they are valid. Invalid files are
highlighted in red in the list.

Additional information is presented to the user in each table, for each file:

• ”Subject info. and validation details”: the table in this section is divided in ”Subject information” and

”Validation details”. The former presents personal information extracted from the *.agd file. The

latter shows the validation details, including the user defined settings.

• ”Wear time validation details”: in addition to the summary information for the wear time validation

results, it includes the detailed information by day for the PA time indicators (ST, time in light PA,

time in MVPA and breaks/bouts). Invalid days are highlighted in red.

• (if bout/break detection was selected) ”Bouts”: contains a summary of the total number and time

in each of the selected bouts plus the number and time in each bout by day.
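The bout counts and durations reported in these tables could be derived along the lines of the following sketch. It is one possible reading of the bout detection loop described earlier in this section, with hypothetical names (BoutDef, BoutSummary, detectBouts): a bout is taken to be a maximal run of consecutive epochs whose counts stay within the bout's interval, and the run is counted when its duration lies between the minimum and, if defined, the maximum duration (both inclusive, which is an assumption).

```typescript
interface BoutDef {
  minMinutes: number;  // minimum duration of the bout
  maxMinutes?: number; // maximum duration; undefined means no upper limit
  minCpm: number;      // lower bound of the count interval, in counts per minute
  maxCpm: number;      // upper bound of the count interval, in counts per minute
}

interface BoutSummary {
  count: number;        // number of detected bouts
  totalMinutes: number; // accumulated duration of the detected bouts
}

// counts: "axis1" values of one valid wear time period, one per epoch.
function detectBouts(counts: number[], epochSeconds: number, def: BoutDef): BoutSummary {
  const factor = epochSeconds / 60; // scales per-minute bounds to the epoch length
  const lower = def.minCpm * factor;
  const upper = def.maxCpm * factor;
  const summary: BoutSummary = { count: 0, totalMinutes: 0 };
  let run = 0; // length, in epochs, of the current run inside the interval

  const closeRun = (): void => {
    const minutes = run * factor;
    const withinMax = def.maxMinutes === undefined || minutes <= def.maxMinutes;
    if (run > 0 && minutes >= def.minMinutes && withinMax) {
      summary.count += 1;
      summary.totalMinutes += minutes;
    }
    run = 0;
  };

  for (const c of counts) {
    if (c >= lower && c <= upper) {
      run += 1;
    } else {
      closeRun();
    }
  }
  closeRun(); // a run may extend to the end of the period
  return summary;
}
```

Calling the function once per valid day and once per selected bout would give the per-day figures, which can then be summed to obtain the totals shown in the summary.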

It is important to mention that, while height and weight are extracted from the *.agd file, this informa-

tion is not used to compute any metric. Future iterations of the platform may make use of this information,

as explained in Chapter 6.

”Group statistics” and ”Compare studies” functionalities

The pages for these features are similar in the type of information they present. Both contain visualiza-

tions for the metrics shown in the output page. The difference between the two resides in the datasets

used to generate the plots and tables: while the ”Group statistics” functionality applies to participants

in user-selected study groups from a particular study, the ”Compare studies” page displays information

comparing results from two or more studies. This section will focus on the ”Group statistics” page, while

the ”Compare studies” tool is demonstrated in Chapter 5, as part of the comparative study. Figure 4.11

shows an example of the ”Group statistics” interface. The page is divided in two sections: ”Demograph-

ics” and ”Physical activity and sedentary time”. Additionally, options to filter the data are also available

to users.

The menu at the top of the page allows users to apply age and/or gender filters to the data. The

plots and tables are promptly updated with the new datasets. Regarding the age filters, I implemented

options for three distinct age groups, based on input from researchers at the EHLab. Any combination

of the three can be chosen:

• ”Children/adolescents”, for subjects aged 17 years old or younger;

• ”Adults”, for participants in the age range 18-65 years old;

• ”Elders”, which includes subjects aged older than 65 years.
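The three age filters amount to a simple classification by age, sketched below. The function names ageGroup and filterByAgeGroups are hypothetical; the boundaries are the ones listed above, and ages are assumed to be given in whole years.

```typescript
type AgeGroup = "Children/adolescents" | "Adults" | "Elders";

// Maps an age in whole years to one of the three filter groups.
function ageGroup(age: number): AgeGroup {
  if (age <= 17) return "Children/adolescents";
  if (age <= 65) return "Adults";
  return "Elders";
}

// Keeps the subjects whose age group is among the selected ones;
// any combination of the three groups can be passed.
function filterByAgeGroups<T extends { age: number }>(subjects: T[], selected: AgeGroup[]): T[] {
  return subjects.filter((s) => selected.indexOf(ageGroup(s.age)) !== -1);
}
```

For example, selecting "Adults" and "Elders" would keep every subject aged 18 or older.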

The “Demographics” section contains three elements. At the top, users can view a summary table with
the number of subjects and their distribution among the different age groups. Following the table,
two cards are displayed: one with a violin plot of the distribution of the participants’ ages, and
one containing a pie chart with the distribution of genders. Regarding the “Ages” card, users can
interact with the violin diagram by hovering over or clicking the plot, which shows quantitative
information: the y coordinate (in this case, the age), maximum and minimum, mean, median, and the
first and third quartiles. Additionally, since the D3.js library was used to generate the plot, users
have access to axes manipulation options, namely zoom and pan, as well as an option to download a
.png image of the diagram. As for the “Gender” card, interacting with one of the slices of the pie
chart (i.e., hovering over or clicking it) shows the user the percentage of subjects in each
category.

Figure 4.9: Output interface (“Summary” section). Two items (files) are shown in the list. The first
(filename: “LIS 015915sec.agd”) is expanded, revealing the summary information, while the second
(filename: “LIS 019015sec.agd”) is collapsed.

Figure 4.10: Output interface (“Detailed validation outputs” section). Only the first item
(filename: “LIS 015915sec.agd”) is expanded, for better readability. File “LIS 015015sec.agd” was
flagged as invalid, which is why it is highlighted in red.

Following the demographics, users can view information for the PA time indicators. Three cards are
presented in the section “Physical activity and sedentary time”: a card for the distribution of
subjects who meet the PA recommendations, a card for the distribution of ST and a card for the
distribution of time in MVPA. The charts offer the same interactions and options as the ones in the
previous section. Regarding the tables in the ST and MVPA cards, in addition to the average time per
day in each behaviour, I chose to also present these times as a percentage of wear time. Because wear
time differs between participants, this gives a more accurate representation of the indicators than
the average time per day alone, and it replicates the metrics used in current research conducted by
the EHLab (Santos et al., 2018).
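The percentage of wear time is a simple ratio, shown here as a small sketch. The function name percentOfWearTime and the figures in the usage note are illustrative assumptions, not values from the datasets.

```typescript
// Time in a behaviour (e.g., ST or MVPA) as a percentage of valid wear time.
// Both arguments are in minutes and refer to the same period (e.g., one day).
function percentOfWearTime(minutesInBehaviour: number, wearMinutes: number): number {
  if (wearMinutes <= 0) {
    throw new RangeError("wear time must be positive");
  }
  return (minutesInBehaviour * 100) / wearMinutes;
}
```

For instance, 540 minutes of sedentary time correspond to 60% of a 900-minute wear day, but to 75% of a 720-minute one, which is why the percentage is more comparable across participants than the raw minutes.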

Figure 4.11: “Group statistics” page.

4.3 Compliance with data protection regulations

The General Data Protection Regulation, GDPR (European Parliament and Council, 2016), in effect

since 25 May 2018, is the European regulation on personal data protection. In Portugal, the Comissao

Nacional de Proteccao de Dados (CNPD) is the data protection authority responsible for the control of

the processing of personal data, thus ensuring the respect for individual rights. Until the GDPR
entered into application in the European Union, personal data in Portugal was protected under the
Lei da Proteccao de Dados Pessoais no. 67/98, the LPDP (Assembleia da Republica, 1998). As of June
2019, the LPDP is still in effect for all matters that do not contradict the GDPR.

Actinfo aims to comply with the GDPR to the fullest extent possible. The articles of this regulation
that are relevant to the type of data handling performed in the platform are, in general, more
comprehensive (and thus cover a broader range of aspects) than the articles in the LPDP referring to
similar matters. As such, ensuring compliance with the GDPR will result in Actinfo also being
compliant with the LPDP.

The relevant articles of the GDPR in the context of the work developed in this dissertation are
presented in the following list, with a summary description of the key aspects and an explanation of
how Actinfo complies with each one:

• Articles 5 and 6: ”Principles relating to processing of personal data” and ”Lawfulness of

processing”:

– Description: personal data must be processed based on legitimate purposes and in a trans-

parent manner, informing subjects about the processing activities on the data. The data must

be collected for specified, explicit and legitimate purposes. The subjects must have given

consent to the processing of their personal data for one or more specific purposes.

– Actinfo: in Actinfo, user accounts are created by an administrator, for a specific individual,

who is present during the account creation process and gives verbal consent for the input of

his/her data. The user is informed of the purposes of the data collected: personal information

for the identification of the user and an email for contact. The data is not used for purposes

other than the ones stated to the user.

As for the data in actigraphy files, from study participants, its handling is safeguarded by

signed consents, which explain how the data is to be used. The participants sign the consent

forms prior to the data collection. The researcher who is responsible for the study must ensure

the data is used for the specific purposes stated to the participant. This must be guaranteed

by the researcher before the actigraphy data is uploaded to the platform. It is outside the

scope of this project to verify if all active users with researcher accounts are indeed complying

with the regulation.

• Articles 12 to 23: ”Rights of the data subject”:

– Description: the data subjects have the right to ask about the information stored and how it

is being handled and processed. The subjects have the right to ask for corrections, object to

the processing of their data or even have it deleted.


– Actinfo: clicking the “About” link in the platform’s navigation bar shows users an explanation
of how to contact the administrator with any questions, concerns or requests regarding the
platform. The administrator must attend to these requests, taking the necessary course of
action to protect the users’ rights.

Regarding data from study participants (i.e., actigraphy files), it is, once again, protected by

signed consents. This is external to Actinfo; the researcher who is responsible for the study

is the one who the participants must contact regarding any concerns about their data.

• Article 25, articles 32 to 34: ”Data protection” and ”Security of personal data”

– Description: aspects regarding privacy and protection should be considered from the start, at
system design (“privacy by design and by default”). Users must be notified of any personal data breaches, if

they occur. The system must be regularly tested to ensure security of the data. In the event of

a technical or physical incident, a mechanism must exist to restore the availability and access

to personal data.

– Actinfo: the administrator of the platform must ensure the software installed in the virtual

machine hosting the application is up to date, to prevent exploits of possible vulnerabilities

in the system by a party with malicious intent. Regular back-ups of the data must also be

performed. Furthermore, the platform is hosted in the FCCN data center, equipped with

protocols for testing the servers, to further aid in identifying possible technical issues.

In addition to the described articles from the GDPR, Article 15 of the LPDP, which addresses security

concerns, also refers to access control. As explained in Section 4.1.2, only registered users can navigate

the platform. Additionally, the ”role” system further restricts access to the data: only researcher accounts

have access to PA studies and tools to operate them. Finally, with the implemented permissions system,

the studies are only available to researchers with permissions to access them.
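Taken together, the role and permission checks can be expressed as a single access rule, sketched below. The Account and Study shapes and the function canOpenStudy are hypothetical; in particular, the way permissions are stored in the study document is an assumption.

```typescript
interface Account {
  id: string;
  role: "admin" | "researcher";
}

// Assumption: a study records its creator and the researchers granted access.
interface Study {
  ownerId: string;
  permittedIds: string[];
}

// A study can be opened only by a researcher who created it or who has been
// granted permission to access it.
function canOpenStudy(account: Account, study: Study): boolean {
  if (account.role !== "researcher") return false;
  return account.id === study.ownerId || study.permittedIds.indexOf(account.id) !== -1;
}
```

Administrator accounts are rejected by this rule, in line with their restriction to user management.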

The current version of Actinfo tries to comply with data protection regulations in all possible aspects,

both for the EU regulation and the guidelines defined by the CNPD. As the platform grows, improvements

must be made to further ensure compliance with these regulations and increase security.


Actinfo is a web application following a 3-tier architecture. It was built based on the MEAN stack.

The database is divided in different collections of entities, for user data, studies, study groups, study

participants and actigraphy files.

In the application tier, I developed methods for performing the necessary operations on each of the

documents in the database, such as CRUD in the case of the users, studies and participants, and

methods for computing PA time indicators from information in *.agd files.

As for the presentation tier, I expanded Angular with libraries for generating data visualizations, such

as plots and charts.


The platform’s features can be divided in two major groups, depending on the role of the user account.

Upon login, administrator accounts are presented with the option to manage user accounts. Researcher

accounts, on the other hand, have options to operate on studies. Several features are available to

researcher accounts:

• Creation of studies (upload of actigraphy files);

• Validation of files in studies, while at the same time applying cut point sets and bout detection;

• Export of the validation results to an Excel file;

• Visualization of PA statistics for files within a specific study;

• Comparison of PA statistics between two or more studies.

Actinfo aims to be compliant with current data protection regulations, namely the GDPR and the

LPDP. Continuous improvements must be made to ensure compliance with these regulations and further

increase the protection of personal data.


Chapter 5

Assessing Actinfo

Once a functional prototype of the platform had been achieved, I conducted an assessment phase

in order to evaluate the degree to which the implemented features are in agreement with what was

proposed and assess the website’s usability. Furthermore, I performed a comparative study with two

datasets managed by Actinfo, as a practical demonstration of the various tools for handling actigraphy

files. As such, this chapter is divided in three major sections, each tackling one of the aforementioned

topics.

In Section 5.1, I assess the extent of the completion of the platform’s major functionalities, i.e.,

the conformity of the implementation of the major features in Actinfo with the requirements for each

functionality. An overview of the most important features is given, accompanied by relevant comments,

including some gathered from the platform’s users.

Section 5.2 provides insight on user feedback regarding the ease of use of Actinfo. To assess the

website’s usability, an anonymous, standardized questionnaire was provided to the platform’s active

users.

Finally, Section 5.3 details the comparative study I conducted, demonstrating Actinfo’s capabilities
not only for extracting relevant data for the characterization of the sedentary time and objectively
measured physical activity (PA) profiles of two populations, but also for comparing the metrics
extracted for both. Data from two different previously conducted studies was uploaded to the
platform, allowing for the comparison of two populations in terms of PA time indicators.
Additionally, the results were validated using the “de facto standard” software for actigraphy data
analysis, Actilife.

5.1 Conformity with requirements

During the development of Actinfo, I collected input from researchers at the EHLab regarding the various functionalities as they were being implemented. This feedback served to determine exactly how each one of the main features of the platform should operate, i.e., what was required of each functionality. With the input from the researchers, it was possible to evaluate the implementation status of the major functionalities of the platform, once the first prototype of Actinfo was completed. This assessment of the implementation status of the features allowed me to perform a conformity analysis, aimed at evaluating whether the implemented functionalities were performing in accordance with the requirements. When a feature was found not to be performing exactly as expected, this was mostly due to requirements being revised in later stages, as the development approached completion. As I gathered more feedback from researchers, I was able to identify specific details they would like to see implemented differently (mostly pointing towards more customization options and interfaces). Table 5.1 presents an overview of the main features implemented in Actinfo and their conformity with the requirements. These features were grouped in three major areas of interest: “Users”, for functionalities relating to the user database and account management; “Studies”, for features affecting the manipulation of studies in the platform and corresponding outputs; and “Actigraphy files”, for the handling of actigraphy files and accelerometer data analysis.

Table 5.1: Conformity of the platform’s features.

| Category | Feature | Implementation status |
|---|---|---|
| Users | Account management | Performing as expected. |
| | Access control | Performing as expected. |
| Studies | Interface | Lacking in options to group studies. |
| | Study permissions | Currently, users can only add permissions to a study, not remove them. |
| | Comparison of studies | Performing as expected. |
| | Group statistics | Performing as expected. |
| | Filters for computed metrics and plots | Performing as expected, for age and gender. |
| Actigraphy files | File management | File handling is incomplete. |
| | Wear time validation | Lacking in customization options. |
| | Application of cut point sets | Performing as expected. |
| | Computation of activity bouts | Performing as expected. |
| | Visualization of the obtained output | User interface needs improvements for better readability. |
| | Export analysis to an Excel file | Exported file needs additional information. |

Observing the table, the major features for which the implementation is not in complete accordance

with the requirements refer to the handling of studies and actigraphy files:

• Studies:

– Regarding the CRUD operations on studies, the creation of studies in particular should include more FHIR-specific fields, improving interoperability with clinical information. Additionally, it should be possible to create studies from files already uploaded to the platform. As for the interface itself, it should be possible to group participants by location and/or cohort number.

– In the feature ”Study permissions”, permission management should be more complete, allow-

ing users to also remove access from a specific study.


• Actigraphy files:

– Regarding file management, in the current version of the platform, files must be associated with a specific study. Future iterations should allow the dataset to be separated from the study. Additionally, users showed interest in being able to perform operations on actigraphy files (such as wear time validation and application of cut point sets) on selected files only, instead of having to operate on all files in a study group.

– In the tool for wear time validation, users reported the minimum wear time should be a custom

field. Furthermore, when using the maximum wear time, if exceeded, users should be notified

of the specific file(s) in which this occurs, so as to define sleep time and subtract it from

sedentary time. Lastly, options for marking a file as valid should have more custom fields.

Users have shown interest in being able to define the minimum number of valid days and

which days of the week should have valid data, for marking a file as valid.

– As for the visualization of the validation output, some tweaks to the interface are needed

to make it more intuitive: larger font size and grouping of ”Summary” and ”Detailed” menus

under a single section. For the tables containing detailed information, averages should be

presented for all valid days, for time indicators, in a final table row.

– Lastly, regarding the generated Excel file, users stated the file should contain two additional

sheets: ”Summary” and ”Daily” sheets for valid files only.

This conformity analysis made it possible to understand the current status of Actinfo’s main features. While

the majority are performing in accordance with the requirements, some adjustments are necessary in

future versions of the platform, mainly regarding the addition of customization options and redesign of

certain pages, for improved readability and user experience.

5.2 Platform usability

Although conceived over two decades ago by Brooke (1996), the System Usability Scale (SUS) is still viewed as the industry-standard method for assessing usability. The SUS consists of a 10-item questionnaire which provides a “quick and dirty” method to evaluate the usability of various systems, and has been widely tested on a range of products. These include smartphone applications, websites, web applications and an array of other hardware and software, as it is an inexpensive tool to assess perceived usability (Lewis, 2018). Because it is a simple yet effective tool, reliable even when the sample size is small, I employed the SUS to evaluate Actinfo’s usability.

Embedded in Actinfo’s “About” section, accessible from the website’s navigation bar, users can open a website-tailored version of the SUS, as shown in Figure 5.1. The questionnaire contains 10 Likert items, i.e., statements with five levels of response, ranging from “strongly disagree” to “strongly agree”. The survey is composed of an equal number of positively and negatively worded items. Users are asked not to leave any item unanswered, as is standard when answering a questionnaire of this type (Bangor et al., 2009).


Figure 5.1: SUS questionnaire, found in Actinfo’s ”About” section.

Currently, there are nine registered users, seven of whom have accounts with the “researcher” role. These users were asked to answer the survey after testing Actinfo’s various features, and their answers were collected to obtain the SUS score. As per the method originally described by Brooke (1996), each response is coded from 1 (“strongly disagree”) to 5 (“strongly agree”). For every positively worded item, the expression x − 1 is applied, x being the average of all users’ responses to that item; for every negatively worded item, the expression 5 − x is used. Each item thus contributes between 0 and 4; the contributions of all items are then added and multiplied by 2.5, yielding the final SUS score. A score of 87.50 out of a possible 100 was obtained for Actinfo, resulting from the individual scores presented in Table 5.2.

Over 10 years of research compiled by Bangor et al. (2008) made it possible to establish benchmarks for an array of systems and products evaluated using the SUS, websites included, for which an average score of 70 was obtained. In a later study, Bangor et al. (2009) mapped results from 1000 surveys to a 7-point adjective scale, composed of words often associated with usability: “Awful” (no score data), “Worst imaginable” (mean SUS score of 25.00), “Poor” (mean SUS score of 25.00), “OK” (mean SUS score of 52.01), “Good” (mean SUS score of 72.75), “Excellent” (mean SUS score of 85.58) and “Best imaginable” (mean SUS score of 100). The score obtained for Actinfo falls between the mean scores for the “Excellent” and “Best imaginable” categories, which can be taken as an indicator of having achieved a prototype with good usability. However, this score is by no means final, since usability must be periodically assessed as more users register and new features are implemented in the platform.

Table 5.2: SUS scores of Actinfo’s evaluation (n=7)

| Item | Mean score | Mean SUS item score |
|---|---|---|
| “I felt very confident using the website” | 4.33 | 3.33 |
| “I found the various functions on this website were well integrated” | 4.67 | 3.67 |
| “I think that I would like to use this website frequently” | 5 | 4 |
| “I thought the website was easy to use” | 4.33 | 3.33 |
| “I would imagine that most people would learn to use this website very quickly” | 5 | 4 |
| “I thought there was too much inconsistency on this website” | 1.67 | 3.33 |
| “I needed to learn a lot of things before I could get going with this website” | 2.33 | 2.67 |
| “I found the website unnecessarily complex” | 1.67 | 3.33 |
| “I found the website very cumbersome to use” | 1 | 4 |
| “I think that I would need the support of a technical person to be able to use this website” | 1.67 | 3.33 |
| Final SUS score | | 87.5 |

5.3 Comparative study with two adult populations

As an integrated information system with the power to aggregate multiple PA studies, Actinfo is equipped

with tools for establishing comparisons between them, as explained in Chapter 4. To demonstrate the

platform’s ability to compute and compare PA metrics and their distribution between different, never

before compared, populations, I conducted a comparative study, using the various tools implemented

in Actinfo. To this end, I used accelerometer data from two populations, drawn from two different studies conducted by the EHLab: one study collected accelerometer data with the goal of assessing PA in a sample of the population of the municipality of Lisbon (from this point forward referred to as ProjCML); the other consisted of actigraphy data collected as part of a clinical trial aiming to determine the effect of different physical exercise protocols on biomarkers in patients with type II diabetes (study with the identifier D2FIT).

Using these two datasets, I employed the platform’s tools for computing PA time indicators to obtain

average sedentary time (ST) per day, average time in MVPA (moderate- to vigorous-intensity PA) per

day and average number of breaks per day per hour of sedentary time. Regarding this last metric,

I chose it as an alternative to simply obtaining the number of breaks per day, as current literature is

moving towards making it a standard for evaluating the patterns of sedentary behaviour (Chen et al.,

2018). This also served as a means to validate Actinfo’s break detection method. The platform also

allowed for comparing the distribution of subjects meeting the WHO’s recommendations for PA (World

Health Organization, 2010). Through Actinfo it was, therefore, possible to draw inferences about potential differences in


the PA profiles of a sample of the adult population of the municipality of Lisbon and an adult population

suffering from type II diabetes.

5.3.1 Studies

The D2FIT study consisted of a randomized controlled trial with a 12-month duration, with the goal of assessing the efficacy of different physical exercise protocols on biomarkers and quality of life of patients with type II diabetes. The project divided patients into three groups: two to which specific exercise protocols were administered and a control group. Only data from the control group was used for establishing the

comparison with the ProjCML study; in particular, actigraphy files from the first evaluation moment of

the participants (where the baseline for the study was created) were used. The PA aspect of the study

was measured using Actigraph’s wGT3X+ activity monitor: participants used the hip-worn device for 7

consecutive days, being asked to remove the accelerometers for any water-based activities and during

sleep. Initialization of the devices for data recording occurred on the morning of the first day. After the recording period ended, the EHLab staff downloaded the data from the devices, converted them into 15-second epoch *.agd files through Actilife and stored them for later analysis. These were the files uploaded to Actinfo.

ProjCML consisted of an assessment of the current level of physical fitness of the residents of the municipality of Lisbon, in order to understand the interventions needed to promote a more active lifestyle in the population. As such, actigraphy was used to objectively measure PA in participants. As in the D2FIT study, participants wore an Actigraph wGT3X+ monitor, with the same recommendations and initialization conditions. Once the collection period had finished, files were downloaded and converted into 15-second epoch *.agd files through Actilife.

For both studies, informed consent from all participants was obtained prior to the data collection, for

the specific purposes of evaluating PA, as explained in the consent forms in Appendix C. The goals of the comparative study described in this section, i.e., evaluating PA via time indicators (average sedentary time per day, average time in moderate- to vigorous-intensity PA and average number of breaks per ST hour) and assessing compliance with the global recommendations for PA, are aligned with the objectives stated in the consent forms. In fact, this analysis follows the same methodology as the research conducted at the EHLab with these exact data, ensuring compliance with the EU GDPR’s principles regarding the processing of personal data (see Section 4.3). Additionally, each participant was assigned a unique code by the researchers responsible for the study at the EHLab, so as to protect the subject’s identity.

5.3.2 Data preparation

Prior to the analysis, I used Actinfo to prepare the data for the comparative study. This process can be

summarized in the following steps:

1. I uploaded actigraphy files from a total of 174 subjects to the platform, grouping them by study (80

corresponding to the D2FIT study and 94 for ProjCML).


2. Using the implemented tools for validating files, I performed a preliminary screening analysis.

The platform flagged files having missing data needed for the ”Group statistics” feature, which I

excluded from the study. Additionally, I employed the platform’s wear time validation functionality

to determine invalid files, i.e., files in which the total wear time did not meet the criteria to be

considered valid. Of the original dataset of 174 files, 32 were excluded due to having missing data

and nine were flagged as invalid by Actinfo. A total of 142 files remained, which were used for the

analysis: 73 corresponding to the D2FIT study, the remaining 69 from the ProjCML study.

3. Lastly, I employed Actinfo’s tools for computing PA time indicators: the wear time validation feature,

application of cut point sets and break computation. Regarding the cut point set, since, in both

studies, the participants are adults, I opted for the ”Troiano” cut point set. For break detection, I

selected the option for detecting one minute breaks from the list shown in Figure B.4, Appendix B.
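As a rough illustration of the cut point step, the sketch below labels per-minute activity counts with intensity categories and sums the minutes spent in MVPA. The threshold values are those commonly attributed to the Troiano adult cut points and are an assumption here, as are all names; they are not taken from Actinfo’s code, and counts are assumed to have been aggregated from 15-second epochs to counts per minute beforehand.

```python
from collections import Counter

# Assumed adult thresholds in counts/min (lower bound inclusive), ordered
# from the most to the least intense category. Hypothetical stand-in for
# the "Troiano" cut point set selected in the platform.
CUT_POINTS = [
    (5999, "vigorous"),
    (2020, "moderate"),
    (100, "light"),
    (0, "sedentary"),
]


def classify(counts_per_minute):
    """Return the intensity label for one minute of activity counts."""
    for lower_bound, label in CUT_POINTS:
        if counts_per_minute >= lower_bound:
            return label
    raise ValueError("activity counts cannot be negative")


def minutes_per_category(minute_counts):
    """Count how many minutes fall into each intensity category."""
    return Counter(classify(c) for c in minute_counts)


def mvpa_minutes(minute_counts):
    """Minutes in moderate- to vigorous-intensity PA."""
    totals = minutes_per_category(minute_counts)
    return totals["moderate"] + totals["vigorous"]
```

For example, `mvpa_minutes([0, 150, 2500, 6000, 3000])` counts the three minutes at or above the assumed moderate threshold.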

Regarding the wear time validation described in step two, as previously explained in Chapter 4, a

period of time is considered as non-wear time whenever a minimum of 60 minutes of consecutive zero-

value activity counts occurs. Valid days are then defined as having a minimum of 600 minutes of valid

wear time. Additionally, valid data for at least three days must be present, one of which should be a

weekend day. These conditions apply to all participants, regardless of study, age group or sex. The de-

scribed reduction settings are adapted from standardized criteria (International Children’s Accelerometry

Database (ICAD), 2017).
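The validation rules described above can be sketched as follows. This is a simplified illustration under stated assumptions, not Actinfo’s implementation: counts are assumed to be given per minute, each day is scored independently (so a run of zeros spanning midnight is split in two), and the function names are hypothetical.

```python
from datetime import date


def wear_minutes(minute_counts, min_nonwear_run=60):
    """Minutes of wear time in one day of per-minute activity counts.

    A run of at least `min_nonwear_run` consecutive zero counts is
    treated as non-wear; shorter runs of zeros still count as wear time.
    """
    wear = 0
    zero_run = 0  # length of the current run of zero counts
    for count in minute_counts:
        if count == 0:
            zero_run += 1
        else:
            if zero_run < min_nonwear_run:
                wear += zero_run
            wear += 1
            zero_run = 0
    if zero_run < min_nonwear_run:
        wear += zero_run  # trailing short run of zeros
    return wear


def is_valid_file(days, min_wear=600, min_valid_days=3):
    """Apply the validity criteria to a mapping {date: minute counts}.

    A day is valid with at least `min_wear` minutes of wear time; a file
    is valid with at least `min_valid_days` valid days, one of which
    must fall on a weekend.
    """
    valid_days = [d for d, counts in days.items()
                  if wear_minutes(counts) >= min_wear]
    has_weekend_day = any(d.weekday() >= 5 for d in valid_days)
    return len(valid_days) >= min_valid_days and has_weekend_day
```

Exposing the thresholds as parameters mirrors the customization options requested by users in Section 5.1 (custom minimum wear time and number of valid days).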

As for the break detection referred to in step three, it is important to clarify we are defining a ”break”

as an interruption in ST, which translates into a period of time of a minimum duration of one minute in

which activity was higher than 100 counts/min. This metric was divided by the daily ST, obtaining the

variable number of breaks/ST hour, to ensure consistency with current research in the field of objectively

measured PA making use of this parameter (Chen et al., 2018; Santos et al., 2018).
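A minimal sketch of this definition is given below. It assumes per-minute counts restricted to valid wear time, treats a minute as non-sedentary when its count is strictly above 100 (how exactly 100 counts/min is handled is an assumption), and counts one break for each transition from a sedentary minute to a non-sedentary period. The function names are hypothetical and do not reflect Actinfo’s code.

```python
def count_breaks(minute_counts, threshold=100):
    """Number of interruptions in sedentary time.

    A break starts whenever a minute above `threshold` counts/min
    directly follows a sedentary minute; consecutive active minutes
    belong to the same break.
    """
    breaks = 0
    previous_sedentary = False
    for count in minute_counts:
        active = count > threshold
        if active and previous_sedentary:
            breaks += 1
        previous_sedentary = not active
    return breaks


def breaks_per_st_hour(minute_counts, threshold=100):
    """Breaks divided by the hours of sedentary time in the same period."""
    sedentary_minutes = sum(1 for c in minute_counts if c <= threshold)
    if sedentary_minutes == 0:
        return 0.0
    return count_breaks(minute_counts, threshold) / (sedentary_minutes / 60)
```

Normalizing by sedentary hours rather than by day makes subjects with different amounts of ST comparable, which is the motivation given for this metric.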

In Figure 5.2 the ”Subject info. and validation details” tab of Actinfo’s output page for one example

subject is presented, resulting from the described validation process. It is important to note that, while in-

formation regarding race, height and weight is extracted from the *.agd file, it is not used in this analysis,

as discussed in Chapter 6.

Once the data had been prepared, I used Actinfo’s ”Compare studies” feature for the analysis, de-

scribed in Section 5.3.3.

5.3.3 Analysis

After saving the outputs resulting from wear time validation, application of cut point sets and break

computation to the database, Actinfo’s ”Compare studies” tool was used to produce the visualizations

for the various computed metrics. Figure 5.3 shows the first section of the comparison page, with

demographic data for the population.

Figure 5.2: Subject information and validation details for an example file, from Actinfo’s output page.

Observing the violin plots for the distribution of ages, we can confirm that both studies follow similar distributions, with populations composed of subjects close to late adulthood. As explained in Chapter 4, hovering the mouse over or clicking the violin shows the user quantitative information, mainly the mean, median and interquartile ranges. As for the gender distribution, both studies show a higher percentage of females than males. However, study D2FIT has a more even distribution of genders, as opposed to study ProjCML, in which there is a much higher percentage of females than males.

Regarding the compliance with the global recommendations for PA (World Health Organization,

2010), in both studies the percentage of subjects who do not reach the recommended levels of PA is much higher than the percentage of subjects who do, as shown in the bar chart in Figure 5.4. Using Actinfo’s option to filter by gender, study ProjCML shows a much higher percentage of females (40%) meeting the recommendations than males (24%), while in study D2FIT the opposite happens, with a much smaller difference (16% of females vs. 19% of males meeting the recommendations), as shown in Figure 5.5.

As for the distribution of sedentary time, the plots shown in Figure 5.6 allow a comparison in daily

average ST between participants from both studies. Although mean ST per day is similar in both studies,

study ProjCML shows a slightly lower interquartile range, as observed in the violin plot. In addition

to ST per day, sedentary time was also calculated as a percentage of wear time, as explained in Chapter 4. Using Actinfo’s gender filters, a higher average ST/day in males, by approximately 20 min/day, can be observed in both studies, as shown in Figure 5.7.

Still regarding time indicators, the distributions of time per day in PA of moderate intensity or greater (moderate- to vigorous-intensity PA, MVPA) for the two populations were obtained, as shown in Figure 5.8. An overall lower average time in MVPA/day was obtained for study D2FIT, which also shows a

slightly lower interquartile range. When filtering by gender, a higher mean time in MVPA per day in

females is observed for study ProjCML, while in study D2FIT the opposite occurs, as shown in Figure

5.9.

Figure 5.3: Demographics for the analyzed population, from Actinfo’s “Compare studies” tool.

Figure 5.4: Distribution of compliance with PA recommendations (no filters active).

Figure 5.5: Distribution of compliance with PA recommendations, with filtering by males (a) and females (b).

Figure 5.6: Distribution of daily sedentary time for both studies (no filters active).

Figure 5.7: Distribution of daily sedentary time for both studies, with filtering by males (a) and females (b).

Figure 5.8: Distribution of daily time in MVPA for both studies (no filters active).

Figure 5.9: Distribution of daily time in MVPA for both studies, with filtering by males (a) and females (b).

Table 5.3: Number of breaks per hour of sedentary time (mean ± SD).

| Study | No filter | Males | Females |
|---|---|---|---|
| ProjCML | 9.64 ± 3.60 | 9.91 ± 3.80 | 9.52 ± 3.49 |
| D2FIT | 10.04 ± 4.82 | 8.86 ± 4.45 | 11.21 ± 4.88 |

Lastly, using Actinfo’s ”Export” function, I obtained the average daily number of breaks per hour of

sedentary time. Unlike the previously presented metrics, which were obtained directly through Actinfo’s

”Compare studies” tool, this parameter was computed from the exported Excel files for each study.

The outputs are summarized in Table 5.3. Study ProjCML shows a higher number of interruptions in

sedentary time in males when compared to study D2FIT, while the inverse happens when comparing

the females of the two studies. A higher number of breaks/ST hour was found in males from ProjCML

when compared to females in the same study, the opposite happening in study D2FIT.

Results validation

To assess the validity of the obtained results, I processed the same files used for the described analysis

in Actigraph Corp.’s Actilife software (regarded as the ”de facto standard” for actigraphy and the software

currently used at the EHLab). I selected equivalent parameters for wear time validation and data scoring

(i.e., using the Troiano cut point set and the same definition of valid wear time period). No optional

screening parameters were selected in the software. Specifically, Actilife’s options to define an activity threshold and a spike tolerance were not used in the analysis. The former, had it not been set to zero, would result in a period of wear time being classified as non-wear unless a specific intensity had been reached. The latter allows users to define a time threshold of non-zero values that must be exceeded before a non-wear period is interrupted, and was also set to zero. This ensures

consistency with Actinfo’s own implementation of wear time validation and segmentation of the activity

time series according to the defined cut points. Additionally, it replicates research conducted by the

EHLab, which uses these same settings.

Tables 5.4, 5.5 and 5.6 show the daily average ST, daily time in MVPA and number of breaks/ST

hour, respectively, as mean ± SD, obtained from Actilife.


Table 5.4: Daily average ST in min/day (mean ± SD), from Actilife.

| Study | All subjects | Males | Females |
|---|---|---|---|
| ProjCML | 624.35 ± 80.56 | 639.26 ± 97.14 | 620.62 ± 77.14 |
| D2FIT | 622.24 ± 104.39 | 624.79 ± 112.86 | 615.77 ± 92.22 |

Table 5.5: Daily average time in MVPA in min/day (mean ± SD), from Actilife.

| Study | All subjects | Males | Females |
|---|---|---|---|
| ProjCML | 43.07 ± 30.09 | 40.40 ± 24.95 | 44.93 ± 26.46 |
| D2FIT | 34.27 ± 27.81 | 39.95 ± 26.14 | 31.63 ± 25.62 |

Table 5.6: Daily average number of breaks/ST hour (mean ± SD), from Actilife.

| Study | All subjects | Males | Females |
|---|---|---|---|
| ProjCML | 7.25 ± 3.69 | 7.55 ± 3.43 | 7.11 ± 3.78 |
| D2FIT | 11.27 ± 3.03 | 10.58 ± 2.84 | 11.96 ± 3.06 |

Table 5.7 presents the obtained error values for each PA time indicator. Instead of simply comparing the averages of the time indicators with the ones obtained from Actinfo, I computed the mean absolute error (MAE), to obtain a measure of the error associated with the outputs from the platform. In the expression for the MAE,

$$\mathrm{MAE} = \frac{\sum_{i=1}^{n} |y_i - x_i|}{n}, \tag{5.1}$$

$y_i$ is the value of the output from Actilife for participant $i$, $x_i$ is the result from Actinfo for participant $i$ and $n$ is the number of participants in the study. The MAE was computed, for each study, for daily sedentary time, daily time in MVPA and number of breaks/day (since the variable number of breaks/ST hour is calculated using the number of breaks per day and daily ST), to assess the difference between outputs using the two methods (Actilife vs. Actinfo). This allows for an evaluation against what is considered the “de facto standard” software for objectively measured PA.
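For reference, the MAE of Equation 5.1 amounts to the following computation. The sketch is illustrative only: the function name is hypothetical and the sample values are made up for the example, not taken from the study data.

```python
def mean_absolute_error(reference, observed):
    """Mean absolute error between paired per-participant values.

    `reference` holds the Actilife outputs (y_i) and `observed` the
    Actinfo outputs (x_i), in the same participant order.
    """
    if len(reference) != len(observed) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(y - x) for y, x in zip(reference, observed)) / len(reference)


# Made-up daily sedentary times (min/day) for three participants.
actilife_st = [620.0, 630.5, 610.0]
actinfo_st = [620.1, 630.4, 610.2]
mae = mean_absolute_error(actilife_st, actinfo_st)
```

Unlike a comparison of group means, the MAE does not let positive and negative per-participant differences cancel out.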


Table 5.7: Mean absolute error (MAE) for each computed PA time indicator.

| Study | ST (min/day) | MVPA (min/day) | Number of breaks/day |
|---|---|---|---|
| ProjCML | 0.11 | 0.13 | 29.06 |
| D2FIT | 0.11 | 0.14 | 26.06 |

5.3.4 Discussion of experimental results

The distributions obtained in the ”Demographics” section of Actinfo’s ”Compare studies” tool indicate

similar age distributions for both studies, with a high density of participants close to the late adulthood

stage of life (around 60 years old). Actinfo’s age filters were not employed for this analysis, as the populations in both studies are composed of a mixture of older adults and elders; their use would be more relevant if the studies contained subjects spanning a broader range of ages, as was intended when they were implemented. Gender filters, on the other hand, allow a distinction between males and females in both studies, for the various distributions. It is important to note, however, that while study D2FIT shows an even distribution of subjects across both genders, the same is not observable for study ProjCML, where the percentage of males is much lower than that of females (30% vs. 70%).

Comparing the studies in terms of the distribution of subjects who meet the recommended amounts

of PA, patients of type II diabetes show a much lower percentage of subjects attaining sufficient PA than

participants from ProjCML (18% vs 35%). When comparing different genders in both studies, while in

ProjCML a higher percentage of females than males meet the recommendations, in study D2FIT the

opposite happens. The large discrepancy observed between the percentage of males and females who

meet the recommendations in ProjCML may be attributed to the uneven distribution of subjects across

genders, in this study.

As for ST, participants in study D2FIT average lower sedentary times per day when compared to

subjects in ProjCML, both in males and females. A higher interquartile range was also found in the former. Additionally, males in both studies were observed to spend more wear time in sedentary behaviours than females.

Observing the plots and metrics for time in MVPA, patients of type II diabetes spend less time in

moderate- to vigorous-intensity PA per day than subjects in ProjCML. A lower interquartile range was

also found for study D2FIT. Comparing males and females, while in study ProjCML the latter spend a

higher percentage of wear time in MVPA, the opposite happens in study D2FIT.

Lastly, comparing the number of breaks/ST hour between studies, an overall higher number of interruptions in ST was obtained for study D2FIT, with the exception of males in this study, who show fewer breaks/ST hour than males in study ProjCML. Additionally, while in ProjCML a higher number of breaks/ST hour is observed in males than in females, male participants in D2FIT present fewer interruptions in ST than females in the same study.


Based on the obtained metrics, it is possible to conclude that patients with type II diabetes not only show lower sedentary times, but also a higher number of interruptions in ST, when compared to participants from ProjCML. Participants from study D2FIT average less time spent in MVPA, however. As explained in

Chapter 2, breaking up sedentary time has positive health outcomes and can even mitigate the negative

effect of long periods of sedentary time. Females from both studies show lower sedentary times. Re-

garding time in MVPA, females from ProjCML spend more time in PA of this intensity, while the opposite

happens in study D2FIT. As for interruptions in ST, in study D2FIT, a higher number of breaks was found

in females, the opposite happening in ProjCML. It is important to understand, however, that a small

sample was used to obtain these statistics, which may not accurately represent profiles of PA in larger

populations with the same characteristics.

Although stored in the actigraphy files, the height, weight and race of the subjects were not taken into account for this analysis (as discussed in Chapter 6, these data are relevant for exploring energy expenditure, a feature to be implemented in future iterations of Actinfo).

When addressing the deviations in the computation of ST via application of cut points to the *.agd file’s time series, we should take into account the different implementations of the wear time validation cycles. Although Actilife’s code is proprietary, comparing the SQLite time series in *.agd files (see Section 2.2) with the data scoring export file the software produces reveals that some filtering of the data occurs during the validation. Currently, Actigraph Corp.’s documentation does not provide any explanation as to why this happens, even when a spike tolerance of zero and no activity threshold are selected when scoring the files. Actilife provides the option to keep classifying a certain period of wear time as non-wear unless a specified time threshold of non-zero values is exceeded or, similarly, unless a certain intensity is reached. For this analysis, neither of these options was selected, which should in theory result in every non-zero value being considered as an interruption of non-wear time. This, however, does not seem to be the case, since the wear time reported by Actilife is lower than the one computed from the actual data. This explains the slight error in the sedentary times and time in MVPA per day when comparing Actinfo’s outputs with the export from Actilife.

Regarding the number of breaks/ST hour, possible explanations for the high MAE point, once again,

towards different implementation strategies when computing this parameter. Actinfo’s bouts and breaks

detection tools were implemented based on feedback from the researchers at the EHLab. The main

goal was to develop a better suited tool for the specific research conducted in the field of profiling

accelerometer-derived sedentary time. Currently, whenever break detection is needed in a specific study

conducted by researchers at the lab, Actilife’s sedentary analysis tools are used. This may sometimes

constitute a problem, as, per Actigraph Corp.’s own documentation (Actigraph Corp., 2018), the total

time in breaks may be larger than the total wear time, since breaks are computed by subtraction of the

time in sedentary bouts from the total time, without taking non-wear time into consideration. With Actinfo,

however, break and bout calculation was implemented by detecting bouts and/or breaks only for valid

wear time periods.



This chapter focused on the evaluation of the platform’s conformity with requirements, its usability and its main tools for processing and analyzing actigraphy data.

The conformity analysis shows that, while most features are performing as expected, some tweaks

and improvements must be made to ensure a more complete version of Actinfo, with more customization

options for the management of studies and files in the platform and operating on said files.

Usability was assessed by administration of a survey following the System Usability Scale method.

The current prototype of Actinfo obtained a score which falls between “Excellent” and “Best imaginable”. Nevertheless, usability must be continuously assessed as the platform grows its user base.

A comparative study was conducted, as proof of concept of Actinfo’s ability to cross different studies

and compare demographics, distribution of compliance with PA recommendations and distributions of

computed PA time indicators between studies. Data from participants from two different studies, D2FIT

comprised of accelerometer data of patients of type II diabetes, and ProjCML, a sample of the residents

of the municipality of Lisbon, was uploaded to the platform and used to extract PA time indicators (aver-

age sedentary time per day, average time in MVPA and average number of breaks per hour of ST). In

study D2FIT, lower sedentary times and less time per day spent in MVPA were observed, as well as a

higher number of interruptions in ST. Overall, females in both studies are more active than males.
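The three PA time indicators named above can be derived from per-day summaries along the following lines. This is a sketch under stated assumptions, not the platform's code: the `DaySummary` structure and its field names are hypothetical, and the choice to compute breaks per hour of ST as a pooled ratio over all valid days is an assumption.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class DaySummary:
    """Hypothetical per-day summary for one valid wear day."""
    sedentary_min: float  # minutes of sedentary time (ST)
    mvpa_min: float       # minutes of moderate-to-vigorous PA
    breaks: int           # number of interruptions in ST


def pa_time_indicators(days: List[DaySummary]) -> Dict[str, float]:
    """Average ST per day, average MVPA per day and breaks per hour of ST."""
    if not days:
        raise ValueError("at least one valid day is required")
    total_sedentary = sum(d.sedentary_min for d in days)
    total_mvpa = sum(d.mvpa_min for d in days)
    total_breaks = sum(d.breaks for d in days)
    sedentary_hours = total_sedentary / 60
    return {
        "avg_sedentary_min_per_day": total_sedentary / len(days),
        "avg_mvpa_min_per_day": total_mvpa / len(days),
        # Pooled ratio (assumption): all breaks over all sedentary hours
        "breaks_per_sedentary_hour": (
            total_breaks / sedentary_hours if sedentary_hours > 0 else 0.0
        ),
    }
```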

Some deviations were obtained when assessing the validity of the obtained results, by comparison

with outputs obtained via the software Actilife (the de facto standard for actigraphy), which can be ex-

plained by different implementations of the wear time validation computations. Additionally, the indirect

method by which Actilife’s break detection works can, to an extent, justify the error obtained for this

parameter.


Chapter 6

Conclusions and future work

6.1 Conclusions

The work presented in this dissertation included the conceptualization, development and implementation

of Actinfo. This platform is, on the one hand, a repository of physical activity (PA) studies and, on the

other hand, a tool for operating on PA data. The motivation for the development of Actinfo derived from

the lack of a system to aggregate PA data from multiple studies, storing it in a standardized manner,

while also being able to perform typical processing tasks on actigraphy files.

The platform follows a 3-tier architecture, based on the MEAN stack, a full stack of JavaScript com-

ponents for developing web applications. These are open-source technologies, which allow further

improvements of the platform’s features and ease of maintenance. I followed the FHIR standard, when-

ever possible, to model the data. This standard, however, is oriented towards healthcare and clinical

research, while the current version of Actinfo is focused on accelerometry data. Nevertheless, FHIR

was followed to serve as a basis for future iterations of the platform, which may integrate data from

multiple sources, some of which may contain health indicators.

When developing the platform, I aimed to follow current regulations regarding the protection of per-

sonal data, namely the GDPR and LPDP.

During the development phase, I obtained input from researchers at the EHLab, in order to create

tools which support the needs of researchers tackling PA as best as possible. Following development,

I conducted an assessment of the platform, to evaluate the conformity with the requirements, usability

and the implemented tools for operating on actigraphy files. Regarding the assessment of the features

for handling PA data (studies and files), I performed a comparative study with data from two adult

populations, using Actinfo. The study made it possible to draw conclusions about the different objectively

measured PA profiles of the populations.

In summary, I achieved a functional prototype of a PA management platform with good usability.

The centralization of actigraphy data, paired with statistical analysis and validation features, makes the

platform a valuable tool for research in the field of PA. With Actinfo, it is possible to improve PA analysis

workflows while also contributing to interoperability with clinical information and to the reusability of the

data, which is stored in a standardized manner. The current version of the platform was evaluated

for conformity and usability. Furthermore, I validated the accuracy of the tools for processing actigraphy

data against the Actilife software, regarded as the “de facto standard” for analyzing objectively measured

PA via actigraphy.

6.2 Future Work

While the current version of Actinfo established a solid foundation for a system of this kind, the plat-

form can be improved upon, to support the growing needs of PA research. The following list describes

additional adjustments which can improve Actinfo:

• Encryption of the database: data stored in MongoDB should be encrypted, as an extra measure

of protection, which would also make Actinfo more compliant with the GDPR. Currently, only the

paid-for version of MongoDB, named MongoDB Enterprise, supports “Encryption at rest”, which

would solve this problem. Alternative ways to ensure the data are secured could be explored, either

by paying for a service such as MongoDB Enterprise or by hosting the application in a protected

server.

• Data from multiple sources: ultimately, Actinfo should allow for the integration of data from mul-

tiple sources, not just from actigraphy files. In future iterations, users should be able to cross

PA data with information from sources such as imaging exams (e.g., X-rays and DEXA scans for

bone density), physical performance tests or medical exams from which health indicators could be

extracted.

• Expand on FHIR: if Actinfo moves towards making use of clinical information, interoperability with

clinical data could be improved by using more FHIR-specific fields in the database documents.

• Energy expenditure: a feature for computing energy expenditure could be implemented in future

versions of the platform. This, in turn, could justify the use of the height and weight information

extracted from the *.agd files, which is not used in the current version of Actinfo.

• Read *.gt3x files: to access the true raw data recorded by the activity monitor, we would need to

be able to operate on the data stored in the *.gt3x files. A tool for extracting raw accelerometer

data from these files could be implemented in Actinfo.

• Re-integrate *.agd files: Actilife offers a feature to re-integrate *.agd files to higher epoch values,

which could be replicated in Actinfo.

• Expand the database: the platform could benefit from having a larger dataset of PA files and

studies available to its users, allowing for more comparisons and meta-analysis between different

populations.

• Improve the data model: it could be useful to expand the current data model. In fact, it could

be beneficial to support a different type of user account for study participants, who

would be able to upload their actigraphy data directly to the platform. Additionally, the model could

account for the separation of the datasets from the study, allowing users to create studies from

files already uploaded to the platform. The model would also need to be more generalized and

support new relationships between collections of entities.
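Of the items listed above, the re-integration of *.agd files lends itself to a short illustration: moving to a longer epoch amounts to summing the activity counts of consecutive shorter epochs. The sketch below is a minimal version of that idea, not Actilife's implementation; the function name `reintegrate` and the decision to discard a trailing incomplete epoch are assumptions.

```python
from typing import List


def reintegrate(counts: List[int], source_epoch_s: int, target_epoch_s: int) -> List[int]:
    """Sum consecutive epochs so that each output epoch lasts target_epoch_s seconds."""
    if source_epoch_s <= 0 or target_epoch_s < source_epoch_s:
        raise ValueError("target epoch must be at least as long as a positive source epoch")
    if target_epoch_s % source_epoch_s != 0:
        raise ValueError("target epoch must be a multiple of the source epoch")
    factor = target_epoch_s // source_epoch_s
    # Assumption: a trailing incomplete epoch is dropped rather than padded
    usable = len(counts) - len(counts) % factor
    return [sum(counts[i:i + factor]) for i in range(0, usable, factor)]
```

For example, six consecutive 10-second epochs are summed into one 60-second epoch.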


Bibliography

Actigraph Corp. (2018). How does Sedentary Analysis work? Retrieved from https://actigraphcorp.force.com/support/s/article/How-does-Sedentary-Analysis-work.

Actigraph Software Department (2012). ActiLife 6 User’s Manual. Actigraph Corp.

Ainsworth, B. E., Caspersen, C. J., Matthews, C. E., Masse, L. C., Baranowski, T., and Zhu, W. (2012).

Recommendations to improve the accuracy of estimates of physical activity derived from self report.

Journal of physical activity & health, 9 Suppl 1:76–84.

Assembleia da Republica (1998). Lei da Proteccao de Dados Pessoais. Diario da Republica n.o

247/1998, Serie I-A de 1998-10-26.

Bangor, A., Kortum, P., and Miller, J. (2009). Determining What Individual SUS Scores Mean: Adding

an Adjective Rating Scale. Technical report.

Bangor, A., Kortum, P. T., and Miller, J. T. (2008). An Empirical Evaluation of the System Usability Scale.

International Journal of Human-Computer Interaction, 24(6):574–594.

Baptista, F., Santos, D. A., Silva, A. M., Mota, J., Santos, R., Vale, S., Ferreira, J. P., Raimundo, A. M.,

Moreira, H., and Sardinha, L. B. (2012). Prevalence of the Portuguese Population Attaining Sufficient

Physical Activity. Med. Sci. Sports Exerc, 44(3):466–473.

Barrett, C., Dominick, G., and Winfree, K. N. (2017). Assessing bouts of activity using modeled clinically

validated physical activity on commodity hardware. In 2017 IEEE EMBS International Conference on

Biomedical & Health Informatics (BHI), pages 269–272. IEEE.

Bender, D. and Sartipi, K. (2013). HL7 FHIR: An Agile and RESTful approach to healthcare information

exchange. In Proceedings of the 26th IEEE International Symposium on Computer-Based Medical

Systems, pages 326–331. IEEE.

Brocklebank, L. A., Falconer, C. L., Page, A. S., Perry, R., and Cooper, A. R. (2015). Accelerometer-

measured sedentary time and cardiometabolic biomarkers: A systematic review. Preventive Medicine,

76:92–102.

Brooke, J. (1996). SUS - A quick and dirty usability scale. In Jordan, P. W., Thomas, B., Weerdmeester,

B. A., and McClelland, I. L., editors, Usability Evaluation In Industry, chapter 22, pages 189–194. Taylor & Francis.


Cadilhac, D. A., Cumming, T. B., Sheppard, L., Pearce, D. C., Carter, R., and Magnus, A. (2011). The

economic benefits of reducing physical inactivity: an Australian example. International Journal of

Behavioral Nutrition and Physical Activity, 8(1):99.

Cain, K. L., Conway, T. L., Adams, M. A., Husak, L. E., and Sallis, J. F. (2013). Comparison of older and

newer generations of ActiGraph accelerometers with the normal filter and the low frequency extension.

International Journal of Behavioral Nutrition and Physical Activity, 10(1):51.

Caspersen, C. J., Powell, K. E., and Christenson, G. M. (1985). Physical activity, exercise, and physical

fitness: definitions and distinctions for health-related research. Public health reports (Washington,

D.C. : 1974), 100(2):126–31.

Chaniotis, I. K., Kyriakou, K.-I. D., and Tselikas, N. D. (2015). Is Node.js a viable option for building

modern web applications? A performance evaluation study. Computing, 97(10):1023–1044.

Chastin, S. F., Egerton, T., Leask, C., and Stamatakis, E. (2015). Meta-analysis of the relationship

between breaks in sedentary behavior and cardiometabolic health. Obesity, 23(9):1800–1810.

Chen, T., Kishimoto, H., Honda, T., Hata, J., Yoshida, D., Mukai, N., Shibata, M., Ninomiya, T., and

Kumagai, S. (2018). Patterns and Levels of Sedentary Behavior and Physical Activity in a General

Japanese Population: The Hisayama Study. Journal of Epidemiology, 28(5):260–265.

Clark, B. K., Healy, G. N., Winkler, E. A. H., Gardiner, P. A., Sugiyama, T., Dunstan, D. W., Matthews,

C. E., and Owen, N. (2011). Relationship of Television Time with Accelerometer-Derived Sedentary

Time. Medicine & Science in Sports & Exercise, 43(5):822–828.

Ekelund, U., Steene-Johannessen, J., Brown, W. J., Fagerland, M. W., Owen, N., Powell, K. E., Bauman,

A., and Lee, I.-M. (2016). Does physical activity attenuate, or even eliminate, the detrimental associ-

ation of sitting time with mortality? A harmonised meta-analysis of data from more than 1 million men

and women. The Lancet, 388(10051):1302–1310.

European Parliament and Council (2016). Regulation (EU) 2016/679 of the European Parliament and

of the Council of 27 April 2016 on the protection of natural persons with regard to the processing of

personal data and on the free movement of such data, and repealing Directive 95/46/EC (General

Data Protection Regulation). OJ 2016 L 119/1.

European Union (2008). EU Physical Activity Guidelines Recommended Policy Actions in Support of

Health-Enhancing Physical Activity. Technical report.

Evenson, K. R., Catellier, D. J., Gill, K., Ondrak, K. S., and McMurray, R. G. (2008). Calibration of two

objective measures of physical activity for children. Journal of Sports Sciences, 26(14):1557–1565.

Fielding, R. and Reschke, J. (2014). Hypertext Transfer Protocol (HTTP/1.1): Semantics and Content.

Technical report.

Fielding, R. T. (2000). Architectural styles and the design of network-based software architectures.


Gonzalez, K., Fuentes, J., and Marquez, J. L. (2017). Physical Inactivity, Sedentary Behavior and

Chronic Diseases. Korean journal of family medicine, 38(3):111–115.

Hansen, B. H., Anderssen, S. A., Andersen, L. B., Hildebrand, M., Kolle, E., Steene-Johannessen, J.,

Kriemler, S., Page, A. S., Puder, J. J., Reilly, J. J., Sardinha, L. B., van Sluijs, E. M. F., Wedderkopp, N.,

Ekelund, U., and the International Children’s Accelerometry Database (ICAD) Collaborators (2018). Cross-Sectional Associations of Reallocating

Time Between Sedentary and Active Behaviours on Cardiometabolic Risk Factors in Young People:

An International Children’s Accelerometry Database (ICAD) Analysis. Sports Medicine, 48(10):2401–

2412.

Helmerhorst, H. J., Brage, S., Warren, J., Besson, H., and Ekelund, U. (2012). A systematic review

of reliability and objective criterion-related validity of physical activity questionnaires. International

Journal of Behavioral Nutrition and Physical Activity, 9(1):103.

Hills, A. P., Mokhtar, N., and Byrne, N. M. (2014). Assessment of physical activity and energy expendi-

ture: an overview of objective measures. Frontiers in nutrition, 1:5.

Ibanez, V., Silva, J., and Cauli, O. (2018). A survey on sleep assessment methods. PeerJ, 6:e4849.

International Children’s Accelerometry Database (ICAD) (2017). Suggested settings for accelerometer

data reduction in ICAD 2.0. Technical report.

Janssen, I. (2012). Health care costs of physical inactivity in Canadian adults. Applied Physiology,

Nutrition, and Metabolism, 37(4):803–806.

Judice, P. B., Silva, A. M., Santos, D. A., Baptista, F., and Sardinha, L. B. (2015). Associations of breaks

in sedentary time with abdominal obesity in Portuguese older adults. Age (Dordrecht, Netherlands),

37(2):23.

Kaminsky, L. A. and Ozemek, C. (2012). A comparison of the Actigraph GT1M and GT3X accelerometers

under standardized and free-living conditions. Physiological Measurement, 33(11):1869–1876.

Kim, J., Tanabe, K., Yokoyama, N., Zempo, H., and Kuno, S. (2013). Objectively measured light-intensity

lifestyle activity and sedentary time are independently associated with metabolic syndrome: a cross-

sectional study of Japanese adults. International Journal of Behavioral Nutrition and Physical Activity,

10(1):30.

Kuzik, N., Carson, V., Andersen, L. B., Sardinha, L. B., Grøntved, A., Hansen, B. H., and Ekelund,

U. (2017). Physical Activity and Sedentary Time Associations with Metabolic Health Across Weight

Statuses in Children and Adolescents. Obesity, 25:1762–1769.

Lawton, G. (2005). LAMP lights enterprise development efforts. Computer, 38(9):18–20.

Lewis, J. R. (2018). The System Usability Scale: Past, Present, and Future. International Journal of

Human–Computer Interaction, 34(7):577–590.


Loney, T., Standage, M., Thompson, D., Sebire, S. J., and Cumming, S. (2011). Self-report vs. objectively

assessed physical activity: which is right for public health? Journal of physical activity & health,

8(1):62–70.

Louridas, P. (2016). Component Stacks for Enterprise Applications. IEEE Software, 33(2):93–98.

Mandel, J. C., Kreda, D. A., Mandl, K. D., Kohane, I. S., and Ramoni, R. B. (2016). SMART on FHIR: a

standards-based, interoperable apps platform for electronic health records. Journal of the American

Medical Informatics Association, 23(5):899–908.

Monyeki, M. A., Moss, S. J., Kemper, H. C. G., and Twisk, J. W. R. (2018). Self-Reported Physical

Activity is Not a Valid Method for Measuring Physical Activity in 15-Year-Old South African Boys and

Girls. Children (Basel, Switzerland), 5(6).

Pate, R. R., O’Neill, J. R., and Lobelo, F. (2008). The Evolving Definition of "Sedentary".

Exercise and Sport Sciences Reviews, 36(4):173–178.

Plasqui, G., Bonomi, A. G., and Westerterp, K. R. (2013). Daily physical activity assessment with

accelerometers: new insights and validation studies. Obesity Reviews, 14(6):451–462.

Robusto, K. M. and Trost, S. G. (2012). Comparison of three generations of ActiGraphTM activity monitors

in children and adolescents. Journal of sports sciences, 30(13):1429–35.

Saint-Maurice, P. F., Troiano, R. P., Matthews, C. E., and Kraus, W. E. (2018). Moderate-to-Vigorous

Physical Activity and All-Cause Mortality: Do Bouts Matter? Journal of the American Heart Associa-

tion, 7(6).

Santos, D. A., Judice, P. B., Magalhaes, J. P., Correia, I. R., Silva, A. M., Baptista, F., and Sardinha,

L. B. (2018). Patterns of accelerometer-derived sedentary time across the lifespan. Journal of Sports

Sciences, pages 1–9.

Sardinha, L. B., Magalhaes, J. P., Santos, D. A., and Judice, P. B. (2017). Sedentary Patterns, Physical

Activity, and Cardiorespiratory Fitness in Association to Glycemic Control in Type 2 Diabetes Patients.

Frontiers in Physiology, 8:262.

Sardinha, L. B., Santos, D. A., Silva, A. M., Baptista, F., and Owen, N. (2015). Breaking-up Sedentary

Time Is Associated With Physical Function in Older Adults. The Journals of Gerontology Series A:

Biological Sciences and Medical Sciences, 70(1):119–124.

Sedentary Behaviour Research Network (2012). Letter to the Editor: Standardized use of the terms

“sedentary” and “sedentary behaviours”. Applied Physiology, Nutrition, and Metabolism, 37(3):540–

542.

Tarp, J., Bugge, A., Andersen, L. B., Sardinha, L. B., Ekelund, U., Brage, S., and Møller, N. C. (2018).

Does adiposity mediate the relationship between physical activity and biological risk factors in youth?:

a cross-sectional study from the International Children’s Accelerometry Database (ICAD). Interna-

tional Journal of Obesity, 42(4):671–678.


Tremblay, M. S., Aubert, S., Barnes, J. D., Saunders, T. J., Carson, V., Latimer-Cheung, A. E., Chastin,

S. F., Altenburg, T. M., and Chinapaw, M. J. (2017). Sedentary Behavior Research Network (SBRN)

– Terminology Consensus Project process and outcome. International Journal of Behavioral Nutrition

and Physical Activity, 14(1):75.

Troiano, R. P., Berrigan, D., Dodd, K. W., Masse, L. C., Tilert, T., and Mcdowell, M. (2008). Physical

Activity in the United States Measured by Accelerometer. Medicine & Science in Sports & Exercise,

40(1):181–188.

Troiano, R. P., McClain, J. J., Brychta, R. J., and Chen, K. Y. (2014). Evolution of accelerometer methods

for physical activity research. British journal of sports medicine, 48(13):1019–23.

Warburton, D. E. R., Nicol, C. W., and Bredin, S. S. D. (2006). Health benefits of physical activity:

the evidence. CMAJ : Canadian Medical Association journal = journal de l’Association medicale

canadienne, 174(6):801–9.

World Health Organization (2004). Global Strategy on Diet, Physical Activity and Health. Technical

report.

World Health Organization (2010). Global Recommendations on Physical Activity for Health. Technical

report.

World Health Organization (2013). 2013-2020 Global action plan for the prevention and control of non-

communicable diseases.

World Health Organization (2014). WHO | What is Moderate-intensity and Vigorous-intensity Physical Activity? Retrieved from https://www.who.int/dietphysicalactivity/physical_activity_intensity/en/.

World Health Organization (2017). WHO | Physical Activity. Retrieved from https://www.who.int/dietphysicalactivity/pa/en/.

Yates, T., Wilmot, E. G., Davies, M. J., Gorely, T., Edwardson, C., Biddle, S., and Khunti, K. (2011).

Sedentary Behavior. American Journal of Preventive Medicine, 40(6):e33–e34.


Appendix A

Entity types and document fields

Table A.1: Fields in documents with the user entity type

Field | Data type | Description
name | String | Name of the registered user.
email | String | Email of the registered user.
username | String | Username of the registered user. Identifies the user who navigates the platform. Part of the login credentials.
role | String | Role of the registered user, either “admin” or “researcher”, for controlling access to the platform’s features.
password | String | A string resulting from the hash of the user’s password.


Table A.2: Fields in documents with the researchStudy entity type

Field | Data type | FHIR-specific? | Description
identifier | String | Yes | An identifier assigned to the research study by the responsible researcher.
userPermissions | Array | No | Array containing usernames of users with permission to access the study.
meta | Array | No | Array containing metadata for the study, such as the user who created it and a timestamp of the last modification.
resourceType | String | Yes | String identifying the type of FHIR resource this document is (set to “ResearchStudy” for all documents in this collection).
title | String | Yes | A descriptive, short and user-friendly label for the study.
status | String | Yes | The current state of the study.
period | Object | Yes | Object with “start” and “end” timestamps for the study.
principalInvestigator | String | Yes | Name of the researcher who oversees the study.
studyGroup | Array | No | An array of embedded documents with metadata for the groups created for the study.

Table A.3: Fields in documents with the studyGroup entity type

Field | Data type | Description
groupName | String | Name of the study group.
cohort | Number | Index of the cohort.
country | String | Country of the subjects in the group.
city | String | City of the subjects in the group.


Table A.4: Fields in documents with the researchSubject entity type

Field | Data type | FHIR-specific? | Description
individual | Object | Yes | Object with an identifier for the participant in the study (a pseudonym is used in Actinfo).
period | Object | Yes | Object with “start” and “end” timestamps for the study the participant is part of.
outputInfo | Object | No | Object containing the outputs resulting from running the file through Actinfo’s validation and processing tool. Specifically, information regarding the computed physical activity time indicators and the personal information obtained from the *.agd file, such as the sex, height, mass and age of the subject. The validation parameters are also included as properties of this field (i.e., cut point set chosen and bouts selected). Additionally, some auxiliary arrays are saved, for generating tables in the platform’s output page.
study | String | Yes | Study the subject is part of.
status | String | Yes | The current state of the subject in the study (set to “on-study” by default).
groups | Array | No | References to the studyGroup the subject is part of.
resourceType | String | Yes | String identifying the type of FHIR resource this document is (set to “ResearchSubject” for all documents in this collection).

Table A.5: Fields in documents with the file entity type

Field | Data type | Description
length | Number | Size of the file, in bytes.
chunkSize | Number | Size of each file chunk, in bytes.
uploadDate | Date | Timestamp for when the document was stored in the database.
filename | String | Randomly generated filename, created by GridFS (unique names are generated to allow upload of the same file under different studies).
contentType | String | MIME type for the GridFS file.
subject | String | Reference to the subject the file belongs to.


Appendix B

User interface

Figure B.1: File uploader interface.


Figure B.2: Custom cut point form.

Figure B.3: Custom cut point example.


Figure B.4: List of commonly used bouts and breaks presented to users. Each bout is accompanied by a table detailing its settings (duration and count levels).


Figure B.5: Custom bout form. When de-selected, the maximum duration and count level are set to ∞.


Appendix C

Exported Excel file example


Figure C.1: “Summary” sheet for an example exported Excel file.


Figure C.2: ”Daily” sheet for an example exported Excel file.


Appendix D

Consent forms


Figure D.1: Consent form signed by participants of the D2FIT study (front).


Figure D.1: Consent form signed by participants of the D2FIT study (back).


Figure D.2: Consent form signed by participants of the CML study.
