d3.6 development of novel longitudinal statistical tools · sezen cekic unige researcher rogier...

30
This project has received funding from the European Union’s Horizon2020 research and innovation programme under grant agreement No 732592. D3.6 Development of novel longitudinal statistical tools Project title: Healthy minds from 0-100 years: Optimising the use of European brain imaging cohorts Due date of deliverable: 30 th June, 2018 Submission date of deliverable: 15 th June, 2018 Leader for this deliverable: Max Planck Institute for Human Development

Upload: others

Post on 04-Nov-2020

7 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: D3.6 Development of novel longitudinal statistical tools · Sezen Cekic UNIGE Researcher Rogier Kievit CAM Researcher Reviewer(s) Klaus Ebmeier UOXF WP3 Leader William Baaré REGIONH

This project has received funding from the European Union’s Horizon2020 research and innovation programme under grant agreement No 732592.

D3.6 Development of novel longitudinal statistical tools

Project title: Healthy minds from 0-100 years: Optimising the use of

European brain imaging cohorts

Due date of deliverable: 30th June, 2018

Submission date of deliverable: 15th June, 2018

Leader for this deliverable: Max Planck Institute for Human Development

Page 2: D3.6 Development of novel longitudinal statistical tools · Sezen Cekic UNIGE Researcher Rogier Kievit CAM Researcher Reviewer(s) Klaus Ebmeier UOXF WP3 Leader William Baaré REGIONH

| P a g e | 1 L i f e b r a i n D e l i v e r a b l e 3 . 6

Healthy minds from 0-100 years: Optimising the use of European brain imaging cohorts

Contributors to

deliverable: Name Organisation Role / Title

Deliverable Leader Andreas Brandmaier MPIB Researcher

Contributing

Author(s)

Ulman Lindenberger MPIB PI

Ylva Köhncke MPIB Researcher

Paolo Ghisletta UNIGE PI

Sezen Cekic UNIGE Researcher

Rogier Kievit CAM Researcher

Reviewer(s) Klaus Ebmeier UOXF WP3 Leader

William Baaré REGIONH WP2 Leader

Final review and

approval

Kristine Walhovd UiO Coordinator, WP4 Leader

Barbara B. Friedman UiO Administrative

Coordinator

Document History

Release Date Reason for Change Status (Draft/In-

review/Submitted Distribution

1.0 26.03.2018 Sent to all authors for commenting

Draft Email

1.1 24.04.2018 Draft for the General Assembly

In review OneDrive

1.2 05.06.2018 Final revision Final version OneDrive

Dissemination level

PU Public x

PP Restricted to other programme participants (including the Commission Services)

RE Restricted to a group specified by the consortium (including the Commission

Services)

CO Confidential, only for members of the consortium (including the Commission

Services)

Page 3: D3.6 Development of novel longitudinal statistical tools · Sezen Cekic UNIGE Researcher Rogier Kievit CAM Researcher Reviewer(s) Klaus Ebmeier UOXF WP3 Leader William Baaré REGIONH

| P a g e | 2 L i f e b r a i n D e l i v e r a b l e 3 . 6

Healthy minds from 0-100 years: Optimising the use of European brain imaging cohorts

Table of contents Executive Summary .............................................................................................................................. 3

List of acronyms / abbreviations .......................................................................................................... 4

1. Introduction .................................................................................................................................. 5

1.1. Background ......................................................................................................................... 5

1.2. Objectives ........................................................................................................................... 6

1.3. Collaboration among partners ........................................................................................... 6

2. Description of activities................................................................................................................. 7

2.1. Research Line I .................................................................................................................... 7

2.1.1. Meta Analysis .............................................................................................................. 7

2.1.2. Structural Equation Modelling, Power, Effect Size, and Reliability ........................... 13

2.1.3. Reliability of Brain Imaging Studies ........................................................................... 15

2.2. Research Line II ................................................................................................................. 16

2.2.1. Longitudinal Models of Correlated Change ............................................................... 16

2.2.2. The Variable Selection Problem: Novel approaches using SEM Trees, SEM Forests,

and Regularized SEM ............................................................................................................... 19

2.2.3. Joint Longitudinal and Survival Models ..................................................................... 22

3. Results ......................................................................................................................................... 25

3.1. Summary ........................................................................................................................... 25

3.2. Publications and Tools ...................................................................................................... 25

4. Conclusion ................................................................................................................................... 27

5. References................................................................................................................................... 28

Page 4: D3.6 Development of novel longitudinal statistical tools · Sezen Cekic UNIGE Researcher Rogier Kievit CAM Researcher Reviewer(s) Klaus Ebmeier UOXF WP3 Leader William Baaré REGIONH

| P a g e | 3 L i f e b r a i n D e l i v e r a b l e 3 . 6

Healthy minds from 0-100 years: Optimising the use of European brain imaging cohorts

Executive Summary

This deliverable is a report on the implementation of part of Task 3.4 “Development of novel

longitudinal statistical tools“ in Work Package 3.

The complex mixed cross-sectional- and longitudinal nature of the pooled Lifebrain data poses

various statistical challenges to exploit the potential of the combined big cohort studies fully. These

challenges include differences between cohorts with respect to which instruments were used to

measure the same constructs, and as a result, the varying precision of measurement to detect

effects of interest, mixed cross-sectional and longitudinal designs, different patterns of missing data,

and a potentially large number of possible predictors (risk factors and protective factors) that

moderate individual change in brain, cognitive, and mental health over time.

To address these challenges, various approaches were developed:

(I) theoretical frameworks and tools for comparative analysis of data sets and research

designs taking into account differences in effect size, precision, and power across studies;

(II) statistical modelling tools to account for the complex nature of lifespan development

using multivariate and dynamic variants of longitudinal structural equation modelling

(SEM), data driven approaches inspired from classification and regression trees (CART),

and joint longitudinal and survival models. The goal is to identify statistical approaches

that are optimally adjusted to the needs of the consortium and statistical models that

best represent the consolidated findings of the consortium, and hence contribute to

theory development and generalizability.

Page 5: D3.6 Development of novel longitudinal statistical tools · Sezen Cekic UNIGE Researcher Rogier Kievit CAM Researcher Reviewer(s) Klaus Ebmeier UOXF WP3 Leader William Baaré REGIONH

| P a g e | 4 L i f e b r a i n D e l i v e r a b l e 3 . 6

Healthy minds from 0-100 years: Optimising the use of European brain imaging cohorts

List of acronyms / abbreviations

CAM University of Cambridge

CART Classification and Regression Tree

CI Confidence Interval

GAM Generalized Additive Model

ICED Intra-class Effect Decomposition

LCS Latent Change Score

LIFESPAN Longitudinal – Interactive Front-End – Study Planner

MRI Magnetic Resonance Imaging

MPIB Max Planck Institute for Human Development

PS Processing Speed

RegSEM Regularized Structural Equation Modelling

SEM Structural Equation Model

SES Socio-economic Status

UNIGE University of Geneva

UOXF University of Oxford

Page 6: D3.6 Development of novel longitudinal statistical tools · Sezen Cekic UNIGE Researcher Rogier Kievit CAM Researcher Reviewer(s) Klaus Ebmeier UOXF WP3 Leader William Baaré REGIONH

| P a g e | 5 L i f e b r a i n D e l i v e r a b l e 3 . 6

Healthy minds from 0-100 years: Optimising the use of European brain imaging cohorts

1. Introduction

1.1. Background

The rich and complex data structure in Lifebrain requires methodological approaches that are

tailored to its specific demands. First, we have identified the main requirements for successful data

analyses in Lifebrain so that we can tackle the substantive questions laid out in the tasks of Work

Package 4. Particular challenges include the mixed nature of cross-sectional and longitudinal data,

the choice of different measurement instruments with different precisions, reliabilities, and

statistical power across studies, potentially non-linear relationships between variables, and a

potentially large number of missing data. Last, but not least the large number of potential risk and

protective factors (such as genes, lifestyle, health-related factors, demographics, etc.) form a

particular challenge for data analysis.

We had proposed three lines of research in our application:

(I) Theoretical frameworks and practical tools for comparative analysis of data sets and research

designs with respect to effect size and power;

(II) statistical modelling tools to account for the complex nature of lifespan development, including

multivariate and dynamic variants of longitudinal structural equation modelling (SEM), data-driven

approaches inspired from classification and regression trees (CART), and flexible modelling using

generalized additive mixed modelling (GAMM);

(III) tool refinement and optimizing of model selection to tackle the important problem of identifying

and selecting those models that best summarize data across sites. The goal of this third line of

research is to identify models that best represent the consolidated findings of the consortium, and

hence contribute to theory development and generalizability.

This deliverable (D3.6) describes our work along the first two lines of research (I & II).

The consolidated findings of all three lines of research (I, II, and III) will be presented in deliverable

D3.7.

Page 7: D3.6 Development of novel longitudinal statistical tools · Sezen Cekic UNIGE Researcher Rogier Kievit CAM Researcher Reviewer(s) Klaus Ebmeier UOXF WP3 Leader William Baaré REGIONH

| P a g e | 6 L i f e b r a i n D e l i v e r a b l e 3 . 6

Healthy minds from 0-100 years: Optimising the use of European brain imaging cohorts

1.2. Objectives

Our objectives were to design and implement a flexible and general analysis pipeline to facilitate

analyses, and specifically at this first stage of the project, meta-analyses of the consortium data for

cross-sectional, longitudinal and mixed data analyses. First, this entailed building a data-analysis

pipeline and developing appropriate statistical techniques for analysing common constructs of

interest (such as working memory, brain health, etc.) while allowing for differences in how the

constructs were measured. A second objective was to advance methodological developments with

regard to formal definitions of measurement precision, reliability, and statistical power in order to

be able to quantify the precision with which each study contributes evidence to a pooled analysis

and to build a power analysis tool that can inform future studies in a priori power analyses. Our third

objective was to provide a pool of existing and novel longitudinal methods that are specifically

suitable to address the challenges of large-scale multi-site longitudinal data and allow us to test for

various risk and protective factors explaining individual differences in brain and mental health

changes over the lifespan.

In the following, we will describe our progress in each of these lines of research in detail and will

point to scientific manuscripts and scientific software that have originated from this work.

1.3. Collaboration among partners

Collaboration started with the kick-off meeting in Brussels (January 2017). Method development

proceeded in core groups according to expertise and practical tasks that were not necessarily

restricted to individual sites. Some findings were presented at the general meeting in Barcelona

(November 2017). For example, members from Cambridge and Berlin jointly worked on correlated

change-score models, and members from Berlin and Geneva jointly worked on matters of reliability

and effect size. Throughout, we made use of all digital communication devices in Lifebrain including

email exchange, Slack as central broadcasting and organization device, Skype calls, and personal

meetings. The partners mainly involved in the development of these methods were Andreas

Brandmaier, Ylva Köhncke, and Ulman Lindenberger (MPIB), Sezen Cekic and Paolo Ghisletta

(Geneva) and Rogier Kievit (CAM). Klaus Ebmeier (UOXF) supervised the work package.

Page 8: D3.6 Development of novel longitudinal statistical tools · Sezen Cekic UNIGE Researcher Rogier Kievit CAM Researcher Reviewer(s) Klaus Ebmeier UOXF WP3 Leader William Baaré REGIONH

| P a g e | 7 L i f e b r a i n D e l i v e r a b l e 3 . 6

Healthy minds from 0-100 years: Optimising the use of European brain imaging cohorts

2. Description of activities

2.1. Research Line I

2.1.1. Meta Analysis

Across the human life span, there are several phases marked by profound changes. For example,

many people suffer loss of memory performance in late life. Although memory performance gets

worse on average across the lifespan, there are large differences between individuals. Similarly,

people differ in rates of change across many functional domains, such as cognitive ability, mental

health, but also in brain structure and function (e.g., Lindenberger, 2014). Describing, explaining,

and ultimately modifying between-person differences in change are the central goals of lifespan

research, so as to be able to promote healthy ageing (Baltes, Lindenberger, & Staudinger, 2006;

Baltes & Nesselroade, 1979; Ferrer & McArdle, 2010; McArdle & Nesselroade, 2014). Major goals of

our combined efforts are to describe and explain these changes across major European cohort

studies, and to determine which person-related factors are detrimental to and which facilitate

cognitive, mental, and brain health.

A major task of our consortium is the development of theories that allow us to establish a solid

foundation of knowledge for understanding how brain, cognitive and mental health can be

optimized through the lifespan. The integration of major European longitudinal neuroimaging

studies of age-related changes in brain, cognition, and mental health requires a unified statistical

framework to formalize and test our theories in form of hypotheses. We focus on structural

equation modelling (SEM) (Bollen, 1989) as one general framework for multivariate statistical

inference. Theories often posit causal explanations, and SEM’s are uniquely powerful tools to

translate causal hypotheses into models that can be rejected or provisionally supported with

empirical data (provided the data meet the assumptions of the method). A particular virtue of the

SEM technique is its inherent approach of latent variable modelling. Latent variables represent

constructs that are not directly observable (such as a person’s working memory capacity), but are

instead posited as an underlying explanation of observed phenomena. Oftentimes, there are several

possible instruments measuring a construct of interest (such as several tasks that require working

memory). If several measurements have been assessed in the same group of study participants, one

can take advantage of the fact that certain commonalities among them should reflect the latent,

unobserved construct. In a measurement model within the SEM framework, estimating the latent

variable is then achieved by partitioning between-person variance into a part that is shared across

measurement instruments – the latent construct – and variance that is unique to each measurement

instrument. In the ideal case, the unique variance of each instrument is a mix of variance specific to

Page 9: D3.6 Development of novel longitudinal statistical tools · Sezen Cekic UNIGE Researcher Rogier Kievit CAM Researcher Reviewer(s) Klaus Ebmeier UOXF WP3 Leader William Baaré REGIONH

| P a g e | 8 L i f e b r a i n D e l i v e r a b l e 3 . 6

Healthy minds from 0-100 years: Optimising the use of European brain imaging cohorts

one measurement instrument and random measurement error (or, noise), neither of which is of

interest when focusing on the latent construct. These latent construct variables can then be

correlated with other latent construct variables within the same SEM, which gives us estimates of

correlations that are less contaminated by noise than correlations between the observed variables

would have been, since the nuisance variance has been filtered out in a first step.

Structural equation models are theory-based statistical models, because the researcher postulates

a mapping of observed variables on latent constructs (e.g. a working memory test, or maximal

aerobic capacity is an indicator of fitness) or a certain type of relationship between latent variables

(e.g. fitness level predicts working memory capacity). SEM is extremely flexible and allows for

postulating and testing a wide range of models, including models of longitudinal change. These

models represent theoretical assumptions formalized as patterns of associations among many

variables.

Pooling the available European lifespan studies offers the unique possibility to estimate effects with

high precision while explicitly controlling for differences between studies or countries. A particular

challenge in the analysis of Lifebrain data is the fact that the pooled data sets originated from

independent sources and, thus, are not fully harmonized. That is, theoretically identical constructs

are measured with different instruments and some constructs of interest were not assessed at every

site. We address these problems with SEM-based meta-analysis, by allowing the measurement

instruments to vary across studies, but unifying the hypotheses among constructs. In sum, this

approach allows us to triangulate psychological constructs such as intelligence, memory, and well-

being while taking into account differences across studies.

A first SEM-based meta-analysis will target the relations between Socio-economic status (SES),

cognition, and brain structure. SES has been used as a proxy for cognitive reserve over the lifespan

(Jefferson et al., 2011), and the relation of education, income and occupational status to both early

and late life brain and cognition has been quantified in different studies across nations (Livingston

et al., 2017; Noble et al., 2015; Walhovd et al., 2016). Using SEM-based meta-analysis, we will

calculate per-study effect sizes for the associations between SES (education and income), general

cognitive ability, and brain volume (focusing on total brain volume and hippocampal volume,

(Mackey et al., 2015; Staff et al., 2012; Yu et al., 2017). SEM-based meta-analysis allows us to

estimate latent variables and relate them to each other. As described above, latent variables are

estimated as being free of measurement error and are thus a more valid and reliable representation

of a construct of interest. For all relations between latent variables of interest, effect sizes (Pearson’s

r) and precision of effect size (standard errors and confidence intervals) estimates are computed on

each site separately, and then gathered into a single meta-analysis to test whether the general

associations between SES, brain, and cognition hold across study populations.

Page 10: D3.6 Development of novel longitudinal statistical tools · Sezen Cekic UNIGE Researcher Rogier Kievit CAM Researcher Reviewer(s) Klaus Ebmeier UOXF WP3 Leader William Baaré REGIONH

| P a g e | 9 L i f e b r a i n D e l i v e r a b l e 3 . 6

Healthy minds from 0-100 years: Optimising the use of European brain imaging cohorts

Figure 1 A Structural Equation Model representing a latent factor model of cognition and socio-

economic status. Cognition is represented as two latent factors, Fluid Intelligence (Flu) and Crystallized

Intelligence (CRY). Socio-economic status is represented as a further latent variable (SES). This

schematic model can be used to obtain meta-analytic estimates of the relation of cognition and socio-

economic status across all Lifebrain cohorts.

Figure 1 shows a Structural Equation Model representing a SEM-based meta-analysis of relations

between cognition and SES. This model is part of a larger model that also includes relations with

brain measures (such as grey-matter volume or intracranial volume) but for the sake of simplicity

we show only a simpler model structure here. The model has three latent variables, depicted in

circles, representing socioeconomic-status (SES; shown in red), fluid intelligence (FLU; shown in

blue), and crystallized intelligence (CRY; shown in green). The one-headed arrows indicate that each

latent construct is measured by one or more observed variables, which are shown as rectangles.

The choice of observed variables and the number of observed variables per construct vary from site

to site depending on which measures were chosen in the original studies to represent a given

construct. Double-headed arrows represent variances and covariances. The double-headed arrows

attached to the observed variables represent variances unique to the variable, whereas the double-

headed arrows attached to the latent constructs represent between-person differences in those

constructs. The double-headed arrows between constructs represent correlations between

constructs and, thus, they represent our primary outcome of this analysis. Modelling associations

between latent variables allows us to account for between-study differences in how latent

constructs were measured. By aggregating the effect size estimates (point estimates of latent

correlations) and their precisions (standard errors) in a meta-analysis, we can estimate a single

overall, meta-analytic effect size assuming that either there is a single “fixed” underlying effect size

or that there is “random” variation of effect sizes across studies.

Page 11: D3.6 Development of novel longitudinal statistical tools · Sezen Cekic UNIGE Researcher Rogier Kievit CAM Researcher Reviewer(s) Klaus Ebmeier UOXF WP3 Leader William Baaré REGIONH

| P a g e | 10 L i f e b r a i n D e l i v e r a b l e 3 . 6

Healthy minds from 0-100 years: Optimising the use of European brain imaging cohorts

We will adopt a random effects modelling approach, which in principle allows to test for predictors

of the between-study differences in effect sizes in a so-called meta-regression approach.

Figure 2 illustrates preliminary results of the proposed meta-analytic approach. Here we show how

a general factor of cognition correlates with socio-economic status across a subset of the data pool.

We will use an identical approach to fully answer the question of how SES, cognition, and brain

relate and we will adapt this approach to further questions that we are addressing in Work Package

4. From this analysis, we can obtain that the average correlation of SES and cognition is .65 [95% CI

.56;.75], which means that .652 = 42% of the between-person variance in SES and cognition is shared.

The forest plot in Figure 2 displays the individual effect size estimates and their confidence intervals

for each study. The confidence intervals are proportional to the magnitude with which individual

studies contribute evidence to our overall meta-analytic estimate. A larger CI means lower precision,

which will mean a lower contribution to the aggregate estimates. First of all, we can deduce that

there is converging evidence from the individual studies (range of effect sizes from .53 to .77)

without any clear outliers. This plot reveals the particular virtue of our full information approach,

which makes use of all available information, even if some studies (or a subset of participants in

some studies) contribute only partial information. Specifically in Lifebrain, we face the problem that

not all studies included measures of each cognition, brain, and SES construct. But even in studies

that have all three domains assessed on all participants, not all participants were measured with all

measurement instruments (for example, because of early drop-out). Our weighting scheme takes

into account all available pieces of information and weights the meta-analytic results accordingly.

For example, even though LCBC has by far the largest sample size, it has the lowest precision (and

such the lowest meta-analytic weight) because the actual available data on SES are relatively sparse

at this preliminary analysis stage. This analysis serves only the purpose of demonstrating the

proposed analysis strategy and estimates are likely to change once data from all cohorts are fully

exported.

We follow our analysis with a probe for evidence of selection effects such as publication bias, p-

hacking (in short, any practice of searching multiple hypotheses until at least one of them becomes

significant) or any other type of systematic bias. Such biases distort meta-analytic effects and

typically show dependencies of effect size on precision. The funnel plot in Figure 2 shows that effect

sizes fall within expected deviations around the meta-analytic effect size and there is no evidence

for selection effects (even though power to detect those violations may be rather low). As further

diagnostics of our meta-analytic results, we will also use radial plots (also known as Galbraith plots)

and plots of standardized residuals. For a fixed-effects model, the plot shows the inverse of the

standard errors on the horizontal axis against the individual observed effect sizes or outcomes

standardized by their corresponding standard errors on the vertical axis (random effects models

additionally take into account the between-study heterogeneity). Radial plots allow us to eyeball

heterogeneity and outliers in the effect size estimates.

Page 12: D3.6 Development of novel longitudinal statistical tools · Sezen Cekic UNIGE Researcher Rogier Kievit CAM Researcher Reviewer(s) Klaus Ebmeier UOXF WP3 Leader William Baaré REGIONH

| P a g e | 11 L i f e b r a i n D e l i v e r a b l e 3 . 6

Healthy minds from 0-100 years: Optimising the use of European brain imaging cohorts

Vertical scatter of effect sizes reflects between-study heterogeneity, which clearly is within the

expected bounds (95% CI) here. Standardized residual plots are useful for outlier detection

(standardized residuals are standard-normally distributed) based on threshold techniques (e.g.,

absolute value >2 as shown by the dashed lines in Figure 2, bottom left).

SEM-based meta-analyses allow us to tackle questions of relations between cognition, brain, and

other factors, which we demonstrated in a cross-sectional data analysis. We are currently extending

these approaches to a longitudinal data analysis that will exploit the full potential of the Lifebrain

dataset.

Page 13: D3.6 Development of novel longitudinal statistical tools · Sezen Cekic UNIGE Researcher Rogier Kievit CAM Researcher Reviewer(s) Klaus Ebmeier UOXF WP3 Leader William Baaré REGIONH

| P a g e | 12 L i f e b r a i n D e l i v e r a b l e 3 . 6

Healthy minds from 0-100 years: Optimising the use of European brain imaging cohorts

Figure 2 Results from a preliminary meta-analysis of the relation of cognition and SES in Lifebrain. A

subset of the available studies (BASE, Betula, LCBC, UB, and Whitehall) were used to illustrate the

meta-analytic approach. Top-left: Forest plot depicting individual studies point estimates and

precisions (confidence intervals). Top-right: Funnel plot showing the relation of effect size and precision

showing no evidence for selection bias. Bottom-left: Radial plots showing the relation of standardized

effect size and precision showing no evidence for outliers or selection bias. Bottom-right: Standardized

residual plot showing no evidence for outliers.

RE Model

0.3 0.5 0.7 0.9

Observed Outcome

Whitehall

UB

LCBC

Betula

BASE

0.63 [0.51, 0.74]

0.72 [0.63, 0.81]

0.60 [0.39, 0.80]

0.53 [0.45, 0.62]

0.77 [0.67, 0.88]

0.65 [0.56, 0.75]

Forest Plot

Observed Outcome

Sta

nd

ard

Err

or

0.1

04

0.0

52

0

0.4 0.5 0.6 0.7 0.8 0.9

Funnel Plot

xi = 1 vi + t2

0 4 8 12

−202

zi =yi

vi + t2

0.530.570.610.650.690.730.77

●●

Radial Plot

−2

−1

01

2

Study

1 2 3 4 5

Standardized Residuals

Page 14: D3.6 Development of novel longitudinal statistical tools · Sezen Cekic UNIGE Researcher Rogier Kievit CAM Researcher Reviewer(s) Klaus Ebmeier UOXF WP3 Leader William Baaré REGIONH

| P a g e | 13 L i f e b r a i n D e l i v e r a b l e 3 . 6

Healthy minds from 0-100 years: Optimising the use of European brain imaging cohorts

2.1.2. Structural Equation Modelling, Power, Effect Size, and Reliability

A central aspect of all questions pertaining to such large meta-analytic or pooled analyses is the

question of how precise our measurements are. Only when we take into account the precision with

which each study measures a construct of interest or a relationship between two constructs of

interest, can we draw meaningful and statistically sound conclusions about the presence and

magnitude of effects. Our goals of methodological development in this regard were two-fold. First,

we needed to conceptualize precision of measurement for general latent variable frameworks

covering both cross-sectional models of age-differences as well as longitudinal models of age-

related change. Second, we wanted to provide a vehicle for optimal study planning that enables

both most efficient use of resources and optimal power under given resource constraints such as

limited time, budgetary restrictions, or participant burden (Brandmaier, von Oertzen, Ghisletta,

Hertzog, & Lindenberger, 2015). Underpowered studies are problematic for several reasons

(Wagenmakers et al., 2015): They waste scarce resources, raise ethical concerns, increase the

likelihood of errors of magnitude and even of sign (Gelman & Carlin, 2014) and lower power is

associated with a decreased likelihood of successful replication.

The Lifebrain consortium set out with the goal to understand how risk and protective factors affect

specific cognitive and mental functions, and what the pathways may be that operate at the levels

of brain, genes and specific health indicators. Single neuroimaging studies incorporating highly

refined cognitive and mental health measurements are so expensive and time-consuming that they

are oftentimes restricted in the number of participants, measurement occasions and number of

measurements per construct so that they lack the statistical power to detect small or medium-size

effects. We have introduced effective precision, and its inverse, the effective error, as a generic

measure to gauge the sensitivity with which a given study design can detect an effect of interest

(Brandmaier, von Oertzen, Ghisletta, Lindenberger, & Hertzog, 2018). Effective error is a major

determinant of statistical power but does not need to specify a population effect size; together with

sample size it completely determines a design’s efficiency to measure a given true effect of interest.

When relating a hypothesized, estimated, or known true effect size and effective error, we can

obtain a measure of reliability. Reliability for a given sample size directly links to statistical power.

In other words, effective error, reliability, and statistical power are different measures that each

quantify a study’s ability to detect change under differently restrictive assumptions (Brandmaier,

von Oertzen, et al., 2018). In a different context, we have applied the concept of effective error to

derive meta-analytic weights for analysing the degree to which cognitive tasks are correlated across

domains and, thus, may give rise to a unitary factor of cognitive ageing (Tucker-Drob, Brandmaier,

& Lindenberger, 2018). Here, we have implemented this approach as a freely available software

package for the statistical programming language R.

Page 15: D3.6 Development of novel longitudinal statistical tools · Sezen Cekic UNIGE Researcher Rogier Kievit CAM Researcher Reviewer(s) Klaus Ebmeier UOXF WP3 Leader William Baaré REGIONH

| P a g e | 14 L i f e b r a i n D e l i v e r a b l e 3 . 6

Healthy minds from 0-100 years: Optimising the use of European brain imaging cohorts

We demonstrated the utility of the semper (Brandmaier, 2018) package as a practical tool for

evaluating statistical power in SEM with emphasis on the evaluation of longitudinal designs but

generality for all aspects of precision, reliability, and statistical power in latent variable models.

semper includes templates for fast model specification, derivation of indices of change sensitivity,

such as effective error, and statistical power. Figure 3 shows a Monte-Carlo-based power curve for

longitudinal study designs generated using the semper package with only few lines of code. In this

simulated example, we are interested in the minimum sample size for adequate power in a

longitudinal design; using semper, such considerations may be expanded to all facets of a study

design, including number of measurement instruments, instrument reliability, number of time

points, or total study time (Brandmaier et al., 2015). We hope that the availability of this package

will further increase the attention to statistical power considerations in research across all

disciplines, particularly, with regard to a priori statistical power evaluations of longitudinal studies.

In particular, it will enable Lifebrain researchers to compare the included studies with respect to

how well the study designs are powered to detect change and how reliable their measures are.

Figure 3. Results of a Monte Carlo Power Simulation using the semper package. Here, we see the

dependency of Statistical Power on Sample size for an exemplary longitudinal study design involving

four measurements over three years with a relatively unreliable measurement instrument ( intercept

variance = 10, residual variance = 50, Intra-class correlation=.17). If a Power of at least 80% is desired,

sample size needs to be at least 260 to detect a slope variance of 2.

100 150 200 250 300 350 400

0.4

0.5

0.6

0.7

0.8

0.9

Monte Carlo Power Simulation

Sample Size

Sta

tistica

l P

ow

er

N>= 260

Page 16: D3.6 Development of novel longitudinal statistical tools · Sezen Cekic UNIGE Researcher Rogier Kievit CAM Researcher Reviewer(s) Klaus Ebmeier UOXF WP3 Leader William Baaré REGIONH

| P a g e | 15 L i f e b r a i n D e l i v e r a b l e 3 . 6

Healthy minds from 0-100 years: Optimising the use of European brain imaging cohorts

2.1.3. Reliability of Brain Imaging Studies

Magnetic resonance imaging (MRI) has become an indispensable tool for studying brain structure

and function in their relationship to cognition and behaviour, but studies using MRI have

traditionally suffered from low power. In longitudinal studies, a major contributor to power is

reliability of the measurement instrument. We have developed a formal and general framework

(ICED; intra-class effect decomposition) to assess reliability in neuroimaging studies that involve

repeated measures of an outcome of interest (Brandmaier, Wenger, et al., 2018) measured close in

time, such that all differences between measurements should largely reflect measurement error

and not change in the underlying construct. The framework is based on Structural Equation

Modelling and allows researchers to partition the observed between-person variance in a given

outcome into systematic variance and several, uniquely identifiable sources of error variance. For

example, this enables researchers to gauge day-specific or session-specific variance in relation to

systematic between-person differences (see Figure 4 for illustration). Moreover, it yields not only

the absolute precision of a measure but also its diagnostic potential, that is, its ability to discriminate

between persons from a given population. We demonstrate this approach using published data on

myelin content indices by Arshad, Stanley, and Raz (2017). Myelin content indices have been

proposed as neuroanatomical substrates of cognitive decline. In our work, we demonstrated how

ICED can be used to decompose residual variance into individual sources of error variance and

obtain more reliable estimates of the true-score variation in myelin content indices (Brandmaier,

Wenger, et al., 2018).

Within this line or research (research line I), we hope that awareness for the necessity of assessing

reliability in brain imaging studies will further increase, and demonstrate the benefits of tools to

identify and tease apart sources of measurement error in order to arrive at more efficient study

designs. Ensuring high precision is the critical value that determines a design’s propensity to detect

individual differences, which again is a necessary condition to explain these individual differences.

In other words, when future studies aim at investigating specific risk factors or protective factors for

mental and brain health, this approach optimally informs a priori power planning.

Page 17: D3.6 Development of novel longitudinal statistical tools · Sezen Cekic UNIGE Researcher Rogier Kievit CAM Researcher Reviewer(s) Klaus Ebmeier UOXF WP3 Leader William Baaré REGIONH

| P a g e | 16 L i f e b r a i n D e l i v e r a b l e 3 . 6

Healthy minds from 0-100 years: Optimising the use of European brain imaging cohorts

Figure 4 Schematic display of how different sources of measurement error act (circles in green, red,

and blue representing day-specific errors, session-specific errors, and residual errors) may act upon the

repeated measurements (white rectangles) of a latent true score of interest (yellow circle). The ICED

framework describes a formal approach for orthogonalising and identifying these measurement errors.

2.2. Research Line II

2.2.1. Longitudinal Models of Correlated Change

Modelling neurocognitive changes using longitudinal structural equation modelling is an important

part of answering our substantive questions related to the goals in Work Package 4. In a

collaboration within Lifebrain, we published a tutorial article that explicates the use of structural

equation modelling (SEM) for the analysis of longitudinal data, with a focus on latent change score

(LCS) models (Kievit et al., 2017). The SEM framework is well suited to analyse longitudinal data and

superior to most statistical approaches currently used by neuroscientists studying change in brain

and cognition, e.g. in child and adolescent development. Traditional approaches, such as age

mediation models on cross-sectional data sets, fail to identify individual differences in change

(Lindenberger, Von Oertzen, Ghisletta, & Hertzog, 2011). Only longitudinal data as we have it

available in the Lifebrain data pool allows us to study inter-individual differences in intra-individual

change. Furthermore, latent variable modelling in SEM allows us to account for measurement error,

and provides statistical tools for model comparisons, which is not the case when we only work with

observed variables.

LCS models based on data from 2 measurement occasions estimate the rate of change between the

two occasions, and can easily be extended to multi-occasion data. They can be used to examine

distinct classes of brain–behaviour relationships, each associated with distinct parameters.

Specifically, associations between brain structure (or function) of a particular region and some

True Score

Session-specific Errors

Day-specific Errors

Residual Errors

Observed Scores

Page 18: D3.6 Development of novel longitudinal statistical tools · Sezen Cekic UNIGE Researcher Rogier Kievit CAM Researcher Reviewer(s) Klaus Ebmeier UOXF WP3 Leader William Baaré REGIONH

| P a g e | 17 L i f e b r a i n D e l i v e r a b l e 3 . 6

Healthy minds from 0-100 years: Optimising the use of European brain imaging cohorts

behavioural outcome may reflect one (or more) of the following patterns: Simple covariance (a

covariance parameter at T1); plasticity-induced structural change (cognitive states driving neural

change); maturity-induced cognitive development (neural states driving cognitive change); or

correlated change (correlated rates of change between cognition and brain structure, suggested an

omitted third variable driving change). We illustrate the value of the LCS approach using data from

the COGITO project (Schmiedek, Lövdén, & Lindenberger, 2010), a high-intensity (100 day) training

intervention with pre and post-tests cognitive scores for 204 adults: 101 young (range=20-31) and

103 old (age range=64-80). We examine changes between pre- and post-test scores on a latent

variable of processing speed, measured by three standardized tests from the Berlin Intelligence

Structure test measured on two occasions (see Schmiedek et al., 2010, for more details). The neural

measure of interest was Fractional Anisotropy in the sensory region of the Corpus Callosum.

Inspection of the four parameters of interest, reflecting the four possible brain-behaviour

relationships outlined above, shows evidence (only) for correlated change. In other words, those

with greater gains in processing speed (due to practice) were, on average, those with less positive

change in fractional anisotropy after taking into account the other dynamic parameters (r=-0.465).

Such models can and will provide a solid basis for further meta-analytic investigations in the

Lifebrain consortium.

Page 19: D3.6 Development of novel longitudinal statistical tools · Sezen Cekic UNIGE Researcher Rogier Kievit CAM Researcher Reviewer(s) Klaus Ebmeier UOXF WP3 Leader William Baaré REGIONH

| P a g e | 18 L i f e b r a i n D e l i v e r a b l e 3 . 6

Healthy minds from 0-100 years: Optimising the use of European brain imaging cohorts

Figure 5. A Bivariate change score model of correlated change in processing speed and white matter

plasticity. left: Structural Equation Model including parameter estimates and standard errors. The

latent variable Processing Speed is measured by three subtests of the Berlin Intelligence Structure

test (BIS1-BIS3) measured before (pre) and after (post) an intensive training intervention (see

Schmiedek et al., 2010). Observed variable means and variance estimates are omitted for visual

clarity. Right: Scatterplots of raw scores changing across two occasions.

[This Figure is reproduced from Kievit et al. (2017)]

Our tutorial provides a step-by-step introduction into the foundations of SEM in general, and

explains how to estimate and interpret key parameters. Along with the tutorial we also provide an

interactive online tool that allows users to examine the consequences of change in a given

parameter for other parameters in the model (Kievit et al., 2017). We believe that such a

comprehensive multivariate analysis approach will be particularly suitable to address questions of

shape of change, dynamics of change, and inter-individual differences in those parameters.

Page 20: D3.6 Development of novel longitudinal statistical tools · Sezen Cekic UNIGE Researcher Rogier Kievit CAM Researcher Reviewer(s) Klaus Ebmeier UOXF WP3 Leader William Baaré REGIONH

| P a g e | 19 L i f e b r a i n D e l i v e r a b l e 3 . 6

Healthy minds from 0-100 years: Optimising the use of European brain imaging cohorts

2.2.2. The Variable Selection Problem: Novel approaches using SEM Trees, SEM Forests, and

Regularized SEM

Building models fully informed by theory is especially challenging when data are “big”. Equally,

deriving predictions for all variables of interest is difficult. In such instances, researchers may start

with a core model guided by theory, and eventually face the problem of which additional variables

need to be included. Brandmaier, von Oertzen, McArdle, and Lindenberger (2013) introduced SEM

Trees to provide a viable and versatile solution to this variable selection problem. SEM Trees

hierarchically split empirical data into homogeneous groups sharing similar parameters of a model.

They achieve this by recursively selecting optimal predictors from a potentially large set of candidate

predictors. These model-based trees can be thought of as adaptive multiple group models, in which

the group structure and its predictors are learnt from the data. Importantly, this allows us to explore

those variables (and interactions between variables) that best predict differences in multivariate

outcomes. Particularly, this includes the possibility to explore predictors of individual differences in

models of (correlated) change over time. In initial work, SEM trees have been used within Lifebrain

to address the longstanding hypothesis of age differentiation. This hypothesis suggests that

covariance between cognitive factors may change across the lifespan, reflecting various mechanism

of compensation or reorganization. For the first time, we were able to use SEM trees to investigate

the question of age differentiation in terms of neural factors, specifically factors of grey matter

volume (de Mooij et al., 2018). We showed that after middle age, the covariance, or similarity,

among specific grey matter and white matter factors decreased (see Figure 6).

Figure 6 Left: Optimal age break-point with maximal differences between sub groups as retrieved by the SEM Tree algorithm. Right: Path diagram of white matter factor structure. [This Figure is reproduced from de Mooij, Henson, Waldorp, and Kievit (2018).]

Page 21: D3.6 Development of novel longitudinal statistical tools · Sezen Cekic UNIGE Researcher Rogier Kievit CAM Researcher Reviewer(s) Klaus Ebmeier UOXF WP3 Leader William Baaré REGIONH

| P a g e | 20 L i f e b r a i n D e l i v e r a b l e 3 . 6

Healthy minds from 0-100 years: Optimising the use of European brain imaging cohorts

Using SEM trees in this manner represents a considerable improvement over traditional methods

that often require the specification of somewhat arbitrary age-related subgroups, by instead

allowing for a flexible and continuous investigation of age-related changes in model structure.

SEM Forests (Brandmaier, Prindle, McArdle, & Lindenberger, 2016) are a recent extension of SEM

Trees. They are large ensembles of SEM Trees, comprising hundreds to thousands of individual trees,

each based on a random sample of the original data. By aggregating the predictive information in a

forest, one obtains a measure of variable importance that is more robust than corresponding

measures from single trees. Variable importance guides researchers on what variables may be

missing from their models and the underlying theories. Our most recent development was the

combination of SEM Trees and SEM Forests with dynamic continuous time models, which are

dynamic models for the analysis of longitudinal data that adequately account for unequal

assessment intervals within as well as across individuals (Brandmaier, Driver, & Voelkle, in press).

Those models generalise the well-known autoregressive and cross-lagged panel models, which

allows us to effectively explore interacting predictors of differences in dynamics across time.

In the process of theory development, SEM Trees and Forests support researchers in hypothesis

generation and in challenging established theory-driven models by providing predictive benchmarks

(Brandmaier et al., 2016). The combination of SEM and Trees and Forests provides researchers

across disciplines with a powerful “off-the-shelf” tool for exploratory data analysis after a first step

of theory-driven and confirmatory modelling. Both SEM Trees and SEM Forests are freely available

open-source programs.

SEM Trees can be understood as a method to tackle the question what variables from a potentially

large pool of variables may best explain or predict other variables. This is also known as the variable

selection problem in statistics and machine learning, and it is of fundamental importance for the

discovery of yet unknown pathways in which complex sets of risk and protective factors influence

cognitive, mental, and brain health. Typical questions we would ask are: “What is the relative

importance of a set of candidate genes for predicting mental health?” or “What subset of variables

(lifestyle-related, genes, health factors) is most predictive of (or most relevant for) healthy ageing?”

As an alternative venue to approach this question, we have developed a novel data analysis

technique inspired from machine learning approaches that allows estimating complex multivariate

models with large numbers of potential predictors, called regularized SEM. Regularization in a

regression model with many predictors penalizes small regression coefficients and thus helps

pruning the model such that it retains only the most important predictors. Regularization is achieved

by a modified regression estimation procedure that adds a penalty to the likelihood or least-squares

fitting function. We have developed a regularization approach amenable to SEM and demonstrated

the method using simulated data and real data from the Cambridge Study of Cognition, Aging and

Neuroscience (Cam-CAN; Shafto et al., 2014), which is part of the Lifebrain data pool.

Page 22: D3.6 Development of novel longitudinal statistical tools · Sezen Cekic UNIGE Researcher Rogier Kievit CAM Researcher Reviewer(s) Klaus Ebmeier UOXF WP3 Leader William Baaré REGIONH

| P a g e | 21 L i f e b r a i n D e l i v e r a b l e 3 . 6

Healthy minds from 0-100 years: Optimising the use of European brain imaging cohorts

In a sample of 627 participants who participated in a large battery of cognitive tests, demographic

and lifestyle measurements and MRI scans (for more detail on the cohort and sampling

methodology see Taylor et al., 2017), we set up a factor model with a latent factor of visual short

term memory. We asked to what degree markers of neural white-matter integrity from 48 different

regions of interest may predict visual short-term memory. Specifically, given the large number of

candidate tract (48), we hoped to use regularization to yield a more parsimonious predictive model.

Figure 7 demonstrates the effect of regularization for estimating this model. The x-axis represents

regularization penalties, and the y-axis represents estimated regression coefficients. Lines represent

the 48 regions of interest. Grey lines represent those regions that were not selected in the final

regularized model whereas coloured lines represent those six regions that “survived” regularization,

and that, were actively selected as important predictors of visual working memory from the total of

48 regions.

Figure 7 In the final regularized model six non-zero white-matter tracts remain active predictors of visual working memory. They are shown as individual colours whereas the tracts regularized to zero are shown in grey.

[This Figure is reproduced from Jacobucci, Brandmaier, and Kievit (2018)]

Page 23: D3.6 Development of novel longitudinal statistical tools · Sezen Cekic UNIGE Researcher Rogier Kievit CAM Researcher Reviewer(s) Klaus Ebmeier UOXF WP3 Leader William Baaré REGIONH

| P a g e | 22 L i f e b r a i n D e l i v e r a b l e 3 . 6

Healthy minds from 0-100 years: Optimising the use of European brain imaging cohorts

In an un-regularized model, the coefficients (represented by the left-most vertical cut through all

lines where Lambda=0), we find that all coefficients were estimated to be non-zero, even though

many were generally small. The further we increased our penalty, that is, the larger Lambda gets,

the stronger the weak predictors are shrunken towards zero. In contrast, the strong and stable

predictors (several of which are related to crucial memory systems in the brain) remain relatively

unaffected. Our simulations demonstrated regularization could significantly increase power to

detect significant effects in modest to intermediate sample sizes.

We believe that both SEM Trees and regularized SEM will help us address crucial questions of Work

Package 4. The methods will play out their full potential when we start combining longitudinal

models (such as described earlier; see Section Longitudinal Models of Correlated Change) with these

formal exploratory approaches. The richness of the consortium data will allow us to use SEM Trees

and regularized SEM to generate hypotheses about potential pathways, risk factors, and protective

factors in subsets of the pooled data and validate our findings on the remaining data, effectively

turning this second analysis phase into an as-if replication study.

2.2.3. Joint Longitudinal and Survival Models

Recent advances in statistics aim at describing relations between a dependent variable and a

predictor in the most flexible fashion. When studying how cognitive performance or brain

characteristics (e.g., density, volume) may undergo developmental changes, Generalized Additive

Models (GAM; Hastie & Tibshirani, 1990; Wood, 2006) are particularly promising. These models use

a spline approach, rather than a priori assumptions about the functional form (linear, quadratic, or

others) to best describe the relation between, for instance, age and cognition or brain structure.

Furthermore, within this framework moderating covariates (such as sex) can maximally express their

effects on the outcome across the entire age range by interacting with the spline, so that the age

spline for males could differ from that of females. This methodology thus assures that sex effects

are tested with high power, which decreases the likelihood of type-2 errors (for an application, see

for instance Ghisletta, Renaud, Fagot, Lecerf, and de Ribaupierre (2018)).

In this project, we have focussed on combining flexible models that describe intraindividual

trajectories over time with discrete events. For instance, one can model how trajectories of

cognitive performance relate to mortality risk using a spline approach (such as GAM). This type of

modelling allows conclusions such as “the hazard of death dropped by 5.13% per unit increase in a

[cognitive] score" (Muniz-Terrera et al., in press, p. 7). This statistical approach is called a joint

model, or joint longitudinal-survival model, and it allows for studying how parameters that describe

individuals' trajectories are integrated in a survival model that describes the hazard associated to a

specific outcome event. This model has been proposed in different versions and most recent

versions developed allow including a spline approach for the longitudinal trajectory and allow

Page 24: D3.6 Development of novel longitudinal statistical tools · Sezen Cekic UNIGE Researcher Rogier Kievit CAM Researcher Reviewer(s) Klaus Ebmeier UOXF WP3 Leader William Baaré REGIONH

| P a g e | 23 L i f e b r a i n D e l i v e r a b l e 3 . 6

Healthy minds from 0-100 years: Optimising the use of European brain imaging cohorts

multiple types of associations between the trajectory parameters and the survival model

(Rizopoulos, 2012, 2016).

We dedicated a large part of our work with this model to difficulties of its practical implementation,

including questions of how to decide for a suitable longitudinal model, how to relate the longitudinal

model to the survival model, as well as interpretation issues. In this work, we used the R package

JMbayes, Rizopoulos (2016). This package is the most flexible in terms of associating longitudinal

models to survival models. It allows for many types of models for the longitudinal component,

including linear and generalized linear mixed-effects models, nonlinear mixed-effects models, GAM.

The survival model can be specified by parametric models (e.g., Weibull, exponential) or semi-

parametric models (Cox proportional hazard model). The association between the two models can

also follow multiple specifications. The longitudinal component can be characterized by a person’s

actual value at a given time point (where one is), by the derivative of that value at a given time point

(where one is going), and by the time-invariant overall level or change score. We illustrate here the

joint modelling methodology with the Manchester Longitudinal Study of Cognition dataset (Rabbitt

et al., 2004). With this example, we sought to determine if the change in processing speed (PS)

performances across life span has an impact on the risk of mortality. Figure 8 illustrates trajectories

of processing speed changes for deceased and alive participants. The differences between the

groups convincingly show the necessity to condition longitudinal trajectories on survival, as the two

are clearly related. Furthermore, this relationship is sex-specific, which requires stratifying by sex.

This type of joint modelling of longitudinal and survival models is promising within Lifebrain as a

tool that helps understanding how cognitive change or structural/functional brain changes over long

time spans are related to discrete events such as death or disease diagnosis.

Page 25: D3.6 Development of novel longitudinal statistical tools · Sezen Cekic UNIGE Researcher Rogier Kievit CAM Researcher Reviewer(s) Klaus Ebmeier UOXF WP3 Leader William Baaré REGIONH

| P a g e | 24 L i f e b r a i n D e l i v e r a b l e 3 . 6

Healthy minds from 0-100 years: Optimising the use of European brain imaging cohorts

Figure 8. Processing Speed (PS) trajectories of living and deceased participants split according to

gender (female = red, male = blue). Circles represent individual measurement time points. This pattern

clearly illustrates the necessity for joint models of survival status and longitudinal change.

Page 26: D3.6 Development of novel longitudinal statistical tools · Sezen Cekic UNIGE Researcher Rogier Kievit CAM Researcher Reviewer(s) Klaus Ebmeier UOXF WP3 Leader William Baaré REGIONH

| P a g e | 25 L i f e b r a i n D e l i v e r a b l e 3 . 6

Healthy minds from 0-100 years: Optimising the use of European brain imaging cohorts

3. Results

3.1. Summary

The development of methods has been going on since day one of the consortium grant agreement

and we are maintaining our rapid development pace. Our plan entailed three major lines of research

to equip the consortium with the tools and expertise necessary to tackle the substantive questions

(relating to the tasks in Work Package 4) regarding epidemiological analyses across European

cohorts, the identification of individual pathways/mediators to risk for mental health and cognitive

function. These lines of research included setting up an analysis pipeline ready to process cross-

sectional and longitudinal data in a meta-analytic fashion to synthesize results across studies.

Second, we advanced methodological understanding of how differences in study designs influence

the power to detect effects as to both inform the informativeness of existing and future planned

studies, which led to a new software package “semper”. Third, we planned to provide a toolbox with

state-of-the-art and novel data-analytic tools to tackle the specific challenges of our data pool. This

toolbox includes the tools SEM, LCSM, SEM Trees, SEM Forests, regularized SEM, GAM, joint

longitudinal and survival models, and (SEM-based) Meta-Analysis. These tools allow us to address

various aspects of all questions relating to the estimation of within- and between-person differences

in various aspects of human development, and potential moderators (pathways, risk factors,

protective factors) of these differences. Our on-going work will entail adaptions and refinements of

the methods as needed once the specific analysis tasks start. The consolidated findings will be

reported on in deliverable D3.7.

3.2. Publications and Tools

First, we have prepared a data-analytic framework for comparative analysis of data sets with our

consortium data. In response, we have created a meta-analysis framework that is available to the

consortium via a GitHub repository. The first test beds are our on-going analyses to answer the

question in how far SES, brain, and cognition are related. The framework is not bound to this specific

question but allows broader conceptualizations and questions to be asked and meta-analysed.

In research line I, we had promised to provide a tool for comparative analysis of data sets and

research designs with respect to effect size and power, with the particular goal of informing future

studies in a priori power analysis. In response to this, we have developed a statistical software

package. This package is freely available to the public under an Open Source License (GPL-3) from

Page 27: D3.6 Development of novel longitudinal statistical tools · Sezen Cekic UNIGE Researcher Rogier Kievit CAM Researcher Reviewer(s) Klaus Ebmeier UOXF WP3 Leader William Baaré REGIONH

| P a g e | 26 L i f e b r a i n D e l i v e r a b l e 3 . 6

Healthy minds from 0-100 years: Optimising the use of European brain imaging cohorts

GitHub: https://github.com/brandmaier/semper. An accompanying manuscript is currently in

preparation. A manuscript, which originated from a different research context, on the conceptual

and statistical background is provisionally accepted (Brandmaier et al., 2018), so that we hope that

soon, not only members of the consortium but also other researchers working with SEM in general

can benefit from our tools to improve power analysis and study planning.

In research line II, we have been working on software packages, tutorials, and substantial

publications.

We have published a tutorial paper on how to conceptualize change over time and correlated

change over time using SEM (Kievit et al., 2017). The tutorial illustrates how these

conceptualizations can be implemented using free software (lavaan within R and Onyx) and we hope

that this tutorial will not only help the researchers in our consortium to move from cross-sectional

to longitudinal data analyses (specifically addressing the challenges laid out in all tasks of WP 4) but

will also empower other researchers interested in longitudinal relations of brain and cognition over

the human lifespan.

Software for SEM Trees (Brandmaier & Prindle, 2017) and regularized SEM (Jacobucci, Grimm,

Brandmaier, & Serang, 2017) have already been freely available. Our most recent developments

regarding regularized SEM in the context of Lifebrain are currently submitted for publication and

freely available as preprint (Jacobucci et al., 2018).

Another line of research in the development of novel longitudinal methods focussed on joint

longitudinal and survival models. The methodological framework was implemented using open and

free software, the JMbayes package by Rizopoulos (2016). Our approach is ready to be used by all

consortium members. An accompanying tutorial manuscript is currently in preparation.

To foster the use of the proposed techniques, specifically longitudinal models of change and joint

longitudinal and survival models, in the consortium, we will organize a workshop during the June

2018 consortium meeting in Oslo to familiarize all Lifebrain researchers with these techniques.

Page 28: D3.6 Development of novel longitudinal statistical tools · Sezen Cekic UNIGE Researcher Rogier Kievit CAM Researcher Reviewer(s) Klaus Ebmeier UOXF WP3 Leader William Baaré REGIONH

| P a g e | 27 L i f e b r a i n D e l i v e r a b l e 3 . 6

Healthy minds from 0-100 years: Optimising the use of European brain imaging cohorts

4. Conclusion

Due to the concerted efforts of all sites involved in the methodological development of novel

statistical tools for longitudinal data analysis, we were able to meet all of our objectives. We have

equipped the consortium with tools ready to answer the substantive questions at the heart of Work

Package 4. In response to the particular challenges of the consortium, we have developed novel

longitudinal methods. To the greatest extent possible we have made our tools and innovations

freely available to the public through preprints, Open Access publications and Open Source software

packages. Development does not stop here; rather, we will continue to invest time and effort to

closely monitor the data-analytic needs as the theory development and concomitant substantive

analyses unfold over time, and continue to contribute to the methodological foundation of

Lifebrain.

Page 29: D3.6 Development of novel longitudinal statistical tools · Sezen Cekic UNIGE Researcher Rogier Kievit CAM Researcher Reviewer(s) Klaus Ebmeier UOXF WP3 Leader William Baaré REGIONH

| P a g e | 28 L i f e b r a i n D e l i v e r a b l e 3 . 6

Healthy minds from 0-100 years: Optimising the use of European brain imaging cohorts

5. References

Arshad, M., Stanley, J. A., & Raz, N. (2017). Test–retest reliability and concurrent validity of in vivo myelin content indices: Myelin water fraction and calibrated T1w/T2w image ratio. Human Brain Mapping, 38(4), 1780–1790. doi:10.1002/hbm.23481

Baltes, P. B., Lindenberger, U., & Staudinger, U. M. (2006). Lifespan theory in developmental psychology. In W. Damon & R. M. Lerner (Eds.), Handbook of Child Psychology (Vol. 1, pp. 569--664). New York: Wiley.

Baltes, P. B., & Nesselroade, J. R. (1979). History and rationale of longitudinal research. In J. R. Nesselroade & P. B. Baltes (Eds.), Longitudinal research in the study of behavior and development (pp. 1-39). New York: Academic Press.

Bollen, K. A. (1989). Structural Equations with Latent Variables. Oxford, UK: John Wiley. Brandmaier, A. M. (2018). semper. R package version 0.3. Retrieved from

https://github.com/brandmaier/semper

Brandmaier, A. M., Driver, C., & Voelkle, M. C. (in press). Recursive partitioning in continuous time analysis. . In K. van Montfort, J. Oud, & M. C. Voelkle (Eds.), Continuous time modeling in the behavioral and related sciences. New York: Springer.

Brandmaier, A. M., & Prindle, J. J. (2017). semtree: Recursive partitioning for Structural Equation Models. Retrieved

from http://www.brandmaier.de/semtree

Brandmaier, A. M., Prindle, J. J., McArdle, J. J., & Lindenberger, U. (2016). Theory-Guided Exploration with Structural Equation Model Forests. Psychological Methods, 21, 566--582.

Brandmaier, A. M., von Oertzen, T., Ghisletta, P., Hertzog, C., & Lindenberger, U. (2015). LIFESPAN: A Tool for the Computer-Aided Design of Longitudinal Studies. 6(272). doi:10.3389/fpsyg.2015.00272

Brandmaier, A. M., von Oertzen, T., Ghisletta, P., Lindenberger, U., & Hertzog, C. (2018). Precision, Reliability, and Effect Size of Slope Variance in Latent Growth Curve Models: Implications for Statistical Power Analysis.

Frontiers in Psychology, 9, 294. doi:https://doi.org/10.3389/fpsyg.2018.00294

Brandmaier, A. M., von Oertzen, T., McArdle, J. J., & Lindenberger, U. (2013). Structural equation model trees. Psychol Methods, 18(1), 71-86. doi:10.1037/a0030001

Brandmaier, A. M., Wenger, E., Bodammer, N., Kühn, S., Raz, N., & Lindenberger, U. (2018). Assessing Reliability in Neuroimaging Research Through Intra-Class Effect Decomposition (ICED). Manuscript under review.

de Mooij, S. M. M., Henson, R. N. A., Waldorp, L. J., & Kievit, R. A. (2018). Age differentiation within grey matter, white matter and between memory and white matter in an adult lifespan cohort. The Journal of Neuroscience.

Ferrer, E., & McArdle, J. J. (2010). Longitudinal modeling of developmental changes in psychological research. Current Directions in Psychological Science, 19(3), 149--154.

Gelman, A., & Carlin, J. (2014). Beyond power calculations assessing type s (sign) and type m (magnitude) errors. Perspectives on Psychological Science, 9(6), 641--651.

Ghisletta, P., Renaud, O., Fagot, D., Lecerf, T., & de Ribaupierre, A. (2018). Age and sex differences in intra-individual variability in a simple reaction time task. International Journal of Behavioral Development, 42(2), 294-299. doi:10.1177/0165025417739179

Hastie, T. J., & Tibshirani, R. J. (1990). Generalized additive models. Boca Raton, FL: Chapman & Hall/CRC. Jacobucci, R., Brandmaier, A. M., & Kievit, R. A. (2018). Variable Selection in Structural Equation Models with

Regularized MIMIC Models. OSF. doi:10.17605/OSF.IO/BXZJF Jacobucci, R., Grimm, K. J., Brandmaier, A. M., & Serang, S. (2017). regsem: Regularized Structural Equation Modeling

(R package version 1.0.6). Jefferson, A. L., Gibbons, L. E., Rentz, D. M., Carvalho, J. O., Manly, J., Bennett, D. A., & Jones, R. N. (2011). A life course

model of cognitive activities, socioeconomic status, education, reading ability, and cognition. J Am Geriatr Soc, 59(8), 1403-1411. doi:10.1111/j.1532-5415.2011.03499.x

Kievit, R. A., Brandmaier, A. M., Ziegler, G., van Harmelen, A. L., de Mooij, S. M. M., Moutoussis, M., . . . Dolan, R. J. (2017). Developmental cognitive neuroscience using latent change score models: A tutorial and applications. Dev Cogn Neurosci. doi:10.1016/j.dcn.2017.11.007

Lindenberger, U. (2014). Human cognitive aging: Corriger la fortune? Science, 346(6209), 572--578.

Page 30: D3.6 Development of novel longitudinal statistical tools · Sezen Cekic UNIGE Researcher Rogier Kievit CAM Researcher Reviewer(s) Klaus Ebmeier UOXF WP3 Leader William Baaré REGIONH

| P a g e | 29 L i f e b r a i n D e l i v e r a b l e 3 . 6

Healthy minds from 0-100 years: Optimising the use of European brain imaging cohorts

Lindenberger, U., Von Oertzen, T., Ghisletta, P., & Hertzog, C. (2011). Cross-sectional age variance extraction: what's change got to do with it? Psychology and Aging, 26(1), 34.

Livingston, G., Sommerlad, A., Orgeta, V., Costafreda, S. G., Huntley, J., Ames, D., . . . Mukadam, N. (2017). Dementia prevention, intervention, and care. Lancet, 390(10113), 2673-2734. doi:10.1016/S0140-6736(17)31363-6

Mackey, A. P., Finn, A. S., Leonard, J. A., Jacoby Senghor, D. S., West, M. R., Gabrieli, C. F. O., & Gabrieli, J. D. E. (2015). Neuroanatomical Correlates of the Income Achievement Gap. Psychological Science, 26(6), 925-933. doi:10.1177/0956797615572233

McArdle, J. J., & Nesselroade, J. R. (2014). Longitudinal Data Analysis Using Structural Equation Models. Washington, DC: American Psychological Association.

Muniz-Terrera, G., Massa, F., Benaglia, T., Johansson, B., Piccinin, A., & Robitaille, A. (in press). Visuospatial Reasoning Trajectories and Death in a Study of the Oldest Old: A Formal Evaluation of Their Association. Journal of Ageing and Health.

Noble, K. G., Houston, S. M., Brito, N. H., Bartsch, H., Kan, E., Kuperman, J. M., . . . Sowell, E. R. (2015). Family income, parental education and brain structure in children and adolescents. Nat Neurosci, 18(5), 773-778. doi:10.1038/nn.3983

Rabbitt, P. M. A., McInnes, L., Diggle, P., Holland, F., Bent, N., Abson, V., . . . Horan, M. (2004). The University of Manchester Longitudinal Study of Cognition in Normal Healthy Old Age, 1983 through 2003. Aging, Neuropsychology, and Cognition, 11(2-3), 245-279. doi:10.1080/13825580490511116

Rizopoulos, D. (2012). Joint models for longitudinal and time-to-event data: With applications in R: CRC Press. Rizopoulos, D. (2016). The r package jmbayes for fitting joint models for longitudinal and time-to-event data using

mcmc. Journal of Statistical Software, Articles, 72(7), 1-46. Schmiedek, F., Lövdén, M., & Lindenberger, U. (2010). Hundred days of cognitive training enhance broad cognitive

abilities in adulthood: findings from the COGITO study. Frontiers in Aging Neuroscience, 2(27). doi:10.3389/fnagi.2010.00027

Shafto, M. A., Tyler, L. K., Dixon, M., Taylor, J. R., Rowe, J. B., Cusack, R., . . . Cam, C. A. N. (2014). The Cambridge Centre for Ageing and Neuroscience (Cam-CAN) study protocol: a cross-sectional, lifespan, multidisciplinary examination of healthy cognitive ageing. BMC Neurol, 14, 204. doi:10.1186/s12883-014-0204-1

Staff, R. T., Murray, A. D., Ahearn, T. S., Mustafa, N., Fox, H. C., & Whalley, L. J. (2012). Childhood socioeconomic status and adult brain size: Childhood socioeconomic status influences adult hippocampal size. Annals of Neurology, 71(5), 653-660. doi:10.1002/ana.22631

Tucker-Drob, E., Brandmaier, A. M., & Lindenberger, U. (2018). Coupled Cognitive Change in Adulthood: A Meta-analysis. Submitted for publication review.

Wagenmakers, E.-J., Verhagen, J., Ly, A., Bakker, M., Lee, M. D., Matzke, D., . . . Morey, R. D. (2015). A power fallacy. Behavior Research Methods, 47(4), 913--917.

Walhovd, K. B., Krogsrud, S. K., Amlien, I. K., Bartsch, H., Bjornerud, A., Due-Tonnessen, P., . . . Fjell, A. M. (2016). Neurodevelopmental origins of lifespan changes in brain and cognition. Proc Natl Acad Sci U S A, 113(33), 9357-9362. doi:10.1073/pnas.1524259113

Wood, S. (2006). Generalized Additive Models: An Introduction with R. Boca Raton, FL: Chapman and Hall/CRC. Yu, Q., Daugherty Ana, M., Anderson Dana, M., Nishimura, M., Brush, D., Hardwick, A., . . . Ofen, N. (2017).

Socioeconomic status and hippocampal volume in children and young adults. Developmental Science, 21(3), e12561. doi:10.1111/desc.12561