Research in Developmental Disabilities 34 (2013) 795–800


High intra- and inter-rater chance variation of the movement assessment battery for children 2, ageband 2

Inger Holm a,*, Anne Therese Tveter b, Vibeke Smith Aulie c, Britt Stuge a

a Division of Surgery and Clinical Neuroscience, Orthopaedic Dept., Section of Research, Oslo University Hospital, Postal Address Box 4950 Nydalen, 0424 Oslo, Norway

b Institute of Health and Society, Medical Faculty, University of Oslo, Postal Address Box 1089 Blindern, 0317 Oslo, Norway

c Women and Children’s Division, Oslo University Hospital, Postal Address Box 4956 Nydalen, 0424 Oslo, Norway

Article info

Article history: Received 15 August 2012; received in revised form 5 November 2012; accepted 5 November 2012; available online 5 December 2012.

Keywords: Movement assessment battery for children; Second edition; Reliability; Limits for clinical changes

Abstract

The aim of the present study was to evaluate the intra- and inter-tester reliability of the movement assessment battery for children – second edition (MABC-2), ageband 2. We wanted to analyze the collected data, with adequate statistical methods, to provide relevant recommendations for physical therapists who are interpreting changes in the context of daily clinical practice. Forty-five healthy children, 23 girls and 22 boys with a mean age of 8.7 ± 0.7 years, participated in the study. The inter-tester procedures were performed on the same day and the intra-tester procedures within a one to two week interval. The statistical methods used were intra-class correlation coefficient (ICC), standard error of measurement (SEM), and smallest detectable change (SDC).

The children had no failed items during the tests. The ICC values ranged from 0.23 to 0.76. The items “threading lace” and “one-board balance” showed the highest measurement errors for both the intra- and inter-rater reliability. The SDC90% values were 9.7 and 18.5 for the intra- and inter-rater reliability, respectively. The present study showed high intra- and inter-rater chance variation of the MABC-2, ageband 2. A change of more than ±9.7 and ±18.5 on the total test score (TTS) should be required to state (with 90% confidence) that a real change in a single individual has occurred, for intra- and inter-rater testing, respectively. These findings may indicate that the MABC-2 might be more suitable for diagnostic or clinical decision making purposes than for evaluation of change over time.

© 2012 Elsevier Ltd. All rights reserved.

1. Introduction

The movement assessment battery for children (MABC) is a frequently used test to identify mild motor performance disturbances in children. The test has become the evaluation tool of choice among pediatric physiotherapists and psychologists working in children’s hospitals and primary care units, both in screening programs identifying children with motor impairments and in clinical and/or scientific evaluation of treatment efficacy. The MABC was launched in 1992 (Henderson & Sugden, 1992) and in 2007 its successor, the MABC-2, was introduced (Henderson, Sugden, & Barnett, 2007). The test has been extensively revised: the age range has been extended from 4–12 year olds to 3–16 year olds, some items have been revised, some new items have been added, the test material has been improved, the scoring scales have been totally changed and a traffic light system to help interpretation of the results has been introduced. Because of these considerable changes, previously published studies on reliability and validity based on the MABC can no longer be used as documentation of satisfactory psychometric properties.

* Corresponding author. Tel.: +47 23072278; fax: +47 23072920.

http://dx.doi.org/10.1016/j.ridd.2012.11.002

In 2009, Brown and Lalor published the paper “MABC-2. A Review and Critique” (Brown & Lalor, 2009). They concluded that fundamental information about intra- and inter-rater reliability and internal consistency was lacking. Hence, therapists should be cautious in making clinical decisions solely based on the MABC-2 until further reliability and validity studies are completed (Brown & Lalor, 2009). However, some studies on the psychometric properties of the MABC-2 have recently been published. Two studies have evaluated its reliability, one for ageband 1 (3–6 years) (Ellinoudis et al., 2011) and one in 3 year olds (Smits-Engelsman, Niemeijer, & Van, 2011). In the first study (Ellinoudis et al., 2011) the authors concluded that the MABC-2 can be a reliable and valid tool for the assessment of movement difficulties among 3–5-year-old children, and in the second study (Smits-Engelsman et al., 2011) the authors stated that the revised test can be applied to assess motor performance in typically developing 3-year-old children.

However, several objections may be raised to these previously published methodological studies. First, none of the studies have analyzed the psychometric properties of the test on the basis of the actually measured raw scores (given in measured units). The analyses have only been performed on the basis of the transformed Item Standard Scores, where the raw data have been collapsed into categorical variables, which may considerably reduce the differences from one test to the next.

Second, when evaluating the psychometric properties of a specific test, the choice of statistical analysis is crucial. Using only intra-class correlation coefficient (ICC) analysis to evaluate the reliability of a measurement tool provides clinicians and researchers with insufficient and not very useful information. The ICC only gives information about how well the participants can be distinguished from each other (de Vet, Terwee, Mokkink, & Knol, 2011). However, one study (Smits-Engelsman et al., 2011) included ICC, standard error of measurement (SEM) and smallest detectable difference (SDD) in its calculation of reliability. The results from these statistical methods give clinicians useful guidance not only on the magnitude of the relationship between the tests, but also on what can be classified as the lower limits for clinically significant changes. This exact and important information will help physical therapists interpret changes and make clinical decisions.

The purpose of the present study was to use a test–retest design to evaluate the intra- and inter-tester reliability of the MABC-2, ageband 2. We wanted to analyze the collected data, especially the raw scores, with adequate statistical methods, to provide relevant recommendations for physical therapists who are interpreting changes in the context of daily clinical practice or intervention studies.

2. Materials and methods

2.1. Participants

A convenience sample of 45 healthy children from the second and third grades (7–9 years of age) in primary school was recruited from two schools situated near the Oslo University Hospital. Thirty children were included in the inter-tester part of the study and 29 children in the intra-tester part; 14 of them were overlapping cases, taking part in both studies. The children received oral and written information about the study and an invitation letter to bring home. Written informed consent was obtained from their parents. The study was approved by the Regional Committee for Medical and Health Research Ethics of Eastern Norway.

2.2. Instruments

Demographic data about age, weight, height, nationality, siblings, diseases or injuries were collected. The movement assessment battery for children – second edition (MABC-2) (Henderson et al., 2007) was the test of interest for the reliability study. The revised Test and Checklist make it possible to identify and describe impairments in motor performance of children aged 3–16 years (divided into 3 agebands: ageband 1, 3–6 years; ageband 2, 7–10 years; and ageband 3, 11–16 years) and can be used to identify children who are significantly behind their peers in motor development (Henderson et al., 2007). There are 8 items within each ageband, divided into three subgroups. For ageband 2 these are: manual dexterity (placing pegs, threading lace and drawing trail), aiming and catching (catching with two hands and throwing beanbags onto mats) and static and dynamic balance (one-board balance, walking heel-to-toe forwards and hopping on mats). It takes about 20–30 min to complete the test.

Age-adjusted standard scores and percentiles are provided for the three components of the battery and for the total score. The best score was used for data analysis when there were items with more than one test trial. Because several of the items within ageband 2, e.g. “placing pegs” and “one-board balance”, involved testing of both preferred and non-preferred limbs, eleven raw item scores were obtained from a total of eight MABC-2 tasks.

On the basis of the raw scores, standard item scores, component standard scores (CSS), total test scores (TTS) and total standard scores (TSS) were calculated. The TSS is often used for classification purposes. A TTS below or equal to 56 points places the child at or below the 5th percentile (red zone), a TTS between 57 and 67 points places the child between the 5th and the 15th percentile (amber zone) and a TTS above 67 points is above the 15th percentile (green zone) (Smits-Engelsman et al., 2011). On the basis of the TTS each child was categorized into one of three movement difficulty categories: “no movement problems detected” (green zone), “potential motor problems” (amber zone) and “impaired motor problems” (red zone).
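The zone cut-offs quoted above can be written as a small lookup. This is only an illustrative sketch: the function name `classify_tts` and the zone labels as return values are assumptions introduced here, while the thresholds (56 and 67 points) are the ones stated in the text.

```python
def classify_tts(tts):
    """Map a total test score (TTS) to the traffic-light zone described in the text.

    Hypothetical helper for illustration; thresholds are those quoted above.
    """
    if tts <= 56:
        return "red"    # at or below the 5th percentile
    if tts <= 67:
        return "amber"  # between the 5th and the 15th percentile
    return "green"      # above the 15th percentile


if __name__ == "__main__":
    for score in (50, 60, 88):
        print(score, classify_tts(score))
```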


2.3. Procedure

Children were individually assessed with the MABC-2 in a quiet room at school. On the first day each child was tested twice by two physiotherapists who scored them independently (inter-rater reliability). Within a one to two week interval the children were re-tested by one of the two assessors (intra-tester reliability). Both examiners were pediatric physical therapists with more than 15 years of clinical experience, and both had been using the MABC-2 in clinical settings before the study was carried out.

2.4. Data analysis

PASW Statistics, version 18, was used for analyzing the collected data. Demographic data are presented as mean ± one standard deviation (SD). The raw scores from the 8 different items were transformed into CSS according to the tables given in the manual, a TSS was calculated and on the basis of the TSS each child was categorized into one of three movement difficulty categories (Henderson et al., 2007).

The intra- and inter-tester reliability was analyzed in different ways. First, the intra-class correlation coefficient (ICC 2,1) was used. Standard error of measurement (SEM) was calculated to provide an estimate of the measurement error given in the same units as the original measurement. SEM (agreement) was estimated with values from a two-way ANOVA according to the formula of de Vet et al. (2011). Smallest detectable change (SDC) was then calculated. SDC is the magnitude of change necessary to exceed the measurement error and represents the smallest change that can be detected beyond the measurement error in a single individual (de Vet et al., 2011; Wagner, Rhodes, & Patten, 2008). SDC was calculated as $1.68 \times \sqrt{2} \times \mathrm{SEM}$ to obtain a 90% confidence interval (CI).
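The calculation chain described above (two-way ANOVA, ICC 2,1, SEM agreement, SDC90%) can be sketched as follows. The paper does not print the de Vet et al. formula, so this sketch assumes the usual variance-components form, SEM agreement = sqrt(rater variance + residual variance). The function name `reliability`, the clipping of a negative rater variance to zero and the example scores are assumptions made for illustration only; the multiplier 1.68 is the one given in the text.

```python
import math


def reliability(scores, z=1.68):
    """Estimate ICC(2,1), SEM(agreement) and SDC from a children x raters table.

    scores: list of rows, one row per child, one score per rater or occasion.
    z: multiplier for the 90% level as stated in the text (assumed default).
    """
    n = len(scores)       # number of children
    k = len(scores[0])    # number of raters or test occasions
    grand = sum(sum(row) for row in scores) / (n * k)
    row_means = [sum(row) / k for row in scores]
    col_means = [sum(row[j] for row in scores) / n for j in range(k)]

    # Two-way ANOVA without replication: sums of squares and mean squares
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_err = max(ss_total - ss_rows - ss_cols, 0.0)
    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_err = ss_err / ((n - 1) * (k - 1))

    # Variance components (negative rater variance clipped to zero: assumption)
    var_child = (ms_rows - ms_err) / k
    var_rater = max((ms_cols - ms_err) / n, 0.0)
    var_resid = ms_err

    total = var_child + var_rater + var_resid
    icc = var_child / total if total > 0 else None  # undefined with no variation
    sem = math.sqrt(var_rater + var_resid)
    return {"icc_2_1": icc, "sem_agreement": sem, "sdc_90": z * math.sqrt(2) * sem}


if __name__ == "__main__":
    # Made-up scores for four children tested by two raters (not study data)
    print(reliability([[10, 12], [20, 21], [30, 33], [40, 40]]))
```

When every child obtains the same score on both occasions, all variance components are zero and the ICC is undefined, which the sketch reports as `None`.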

3. Results

Twenty-three girls and 22 boys with a mean age of 8.7 ± 0.7 years participated in the study. Their height and body weight were 137.2 ± 7.5 cm and 29.6 ± 5.8 kg, respectively. Five children (11%) were left-handed, 51% had two or three siblings and none had any known disease or injury. No gender differences were found. The children had no failed items during the tests, so there were no missing data in the raw scores collected during the testing. The results from the baseline test are shown in Table 1. The oldest group, the 9-year-olds, showed the best raw score for all items and the highest TTS. Of the 45 children included in the study, 43 (95.5%) had no movement problems (green zone) and 2 (4.5%) were classified as children with impaired motor problems (red zone).

Table 1

MABC-2. Descriptive statistics (mean ± SD) of the 8 items, domains and total scores divided by age (ageband 2).

| Item | 7 year olds (n = 10) | 8 year olds (n = 14) | 9 year olds (n = 21) | All (n = 45) |
|---|---|---|---|---|
| Manual dexterity | | | | |
| Placing pegs, preferred hand, sec | 30.5 ± 4.5 | 27.6 ± 2.3 | 26.2 ± 3.2 | 27.6 ± 3.6 |
| Placing pegs, non-preferred hand, sec | 41.2 ± 6.2 | 32.0 ± 3.9 | 28.1 ± 3.7 | 32.2 ± 6.7 |
| Threading lace, sec | 30.7 ± 8.2 | 25.3 ± 4.5 | 22.9 ± 4.0 | 25.4 ± 6.1 |
| Drawing trail, no. of errors | 1.4 ± 1.3 | 0.8 ± 0.9 | 0.2 ± 0.5 | 0.6 ± 1.0 |
| Aiming and catching | | | | |
| Catching with two hands, no. of catches | 6.7 ± 2.1 | 7.0 ± 2.7 | 7.7 ± 2.4 | 7.2 ± 2.4 |
| Throwing beanbag onto mat, no. of hits | 6.5 ± 2.3 | 7.1 ± 1.5 | 8.3 ± 1.4 | 7.5 ± 1.8 |
| Balance | | | | |
| One-board balance, right leg, sec | 22.0 ± 9.3 | 27.6 ± 5.0 | 28.4 ± 4.7 | 26.7 ± 6.5 |
| One-board balance, left leg, sec | 16.9 ± 11.4 | 16.1 ± 9.0 | 24.8 ± 7.7 | 20.3 ± 9.7 |
| Walking heel-to-toe forwards, no. of steps | 12.6 ± 3.2 | 15.0 ± 0.0 | 15.0 ± 0.0 | 14.5 ± 1.8 |
| Hopping on mats, right leg, no. of hops | 5.0 ± 0.0 | 4.9 ± 0.3 | 5.0 ± 0.0 | 5.0 ± 0.1 |
| Hopping on mats, left leg, no. of hops | 4.2 ± 1.2 | 4.9 ± 0.4 | 5.0 ± 0.0 | 4.8 ± 0.7 |
| Domain | | | | |
| Manual dexterity, component score | 27.8 ± 6.6 | 27.9 ± 4.2 | 31.0 ± 5.0 | 29.3 ± 5.3 |
| Aiming and catching, component score | 21.6 ± 4.7 | 20.2 ± 3.5 | 22.9 ± 3.3 | 21.8 ± 3.8 |
| Balance, component score | 31.2 ± 7.6 | 34.1 ± 2.9 | 35.4 ± 1.3 | 34.1 ± 4.2 |
| Total test score | 80.4 ± 17.0 | 82.1 ± 5.7 | 89.3 ± 5.7 | 88.4 ± 11.0 |

3.1. Intra-tester reliability

The raw scores from the two tests included in the intra-tester analysis are given in Table 2. The ICC values ranged from 0.23 to 0.76. The items “threading lace” and “one-board balance” (both preferred and non-preferred legs) showed the highest measurement errors. The SDC values indicated that a change of greater than 9.7 or 2.6 for the TTS and TSS, respectively, would be required to be 90% certain that a change would not be the result of intra-tester variability or measurement error, but of a real change (Table 2).

Table 2

MABC-2. Intra-tester reliability given as intra-class correlation coefficient (ICC2,1) and confidence intervals (95% CI), standard error of measurement (SEMagreement) and smallest detectable change (SDC90%).

| Item | Test 1 (N = 29) | Test 3 (N = 29) | Difference Test 1–Test 2 | ICC2,1 | 95% CI | SEMagreement | SDC90% |
|---|---|---|---|---|---|---|---|
| Manual dexterity | | | | | | | |
| Placing pegs, preferred hand, sec | 27.9 ± 3.9 | 26.0 ± 3.5 | −1.8 ± 3.5 | 0.50 | [0.16, 0.73] | 2.8 | 5.6 |
| Placing pegs, non-preferred hand, sec | 30.7 ± 6.5 | 30.2 ± 4.9 | −0.4 ± 4.0 | 0.77 | [0.56, 0.88] | 2.8 | 6.6 |
| Threading lace, sec | 25.0 ± 6.3 | 22.0 ± 4.4 | −2.9 ± 6.0 | 0.35 | [0.02, 0.62] | 4.7 | 11.1 |
| Drawing trail, no. of errors | 0.4 ± 0.7 | 0.3 ± 0.7 | −0.1 ± 0.6 | 0.68 | [0.43, 0.84] | 0.4 | 0.9 |
| Aiming and catching | | | | | | | |
| Catching with two hands, no. of catches | 7.3 ± 2.2 | 8.2 ± 1.6 | 0.9 ± 1.9 | 0.48 | [0.15, 0.72] | 1.5 | 3.5 |
| Throwing beanbag onto mat, no. of hits | 7.6 ± 1.7 | 7.7 ± 1.3 | 0.03 ± 1.4 | 0.59 | [0.29, 0.79] | 1.0 | 2.3 |
| Balance | | | | | | | |
| One-board balance, right leg, sec | 26.0 ± 6.6 | 27.2 ± 5.6 | 2.1 ± 4.9 | 0.56 | [0.26, 0.77] | 4.0 | 9.6 |
| One-board balance, left leg, sec | 20.6 ± 9.9 | 21.2 ± 9.8 | 0.7 ± 8.0 | 0.70 | [0.45, 0.85] | 5.3 | 12.7 |
| Walking heel-to-toe forwards, no. of steps | 14.5 ± 1.8 | 14.8 ± 1.3 | 0.3 ± 2.1 | 0.75 | [0.53, 0.87] | 0.9 | 1.9 |
| Hopping on mats, right leg, no. of hops | 5.0 ± 0.0 | 5.0 ± 0.0 | 0 ± 0.0 | –ᵃ | –ᵃ | –ᵃ | –ᵃ |
| Hopping on mats, left leg, no. of hops | 4.8 ± 0.8 | 4.8 ± 0.6 | 0.1 ± 0.7 | 0.24 | [−0.15, 0.56] | 0.6 | 1.5 |
| Domains (component score) | | | | | | | |
| Manual dexterity | 29.8 ± 5.3 | 32.4 ± 4.4 | 2.6 ± 3.8 | 0.62 | [0.21, 0.82] | 3.2 | 7.7 |
| Aiming and catching | 21.9 ± 3.6 | 22.7 ± 3.1 | 0.8 ± 3.4 | 0.49 | [0.17, 0.72] | 2.4 | 5.7 |
| Balance | 33.9 ± 4.2 | 34.4 ± 3.3 | 0.5 ± 3.8 | 0.49 | [0.15, 0.72] | 2.7 | 6.4 |
| Total score | | | | | | | |
| Total test score | 85.5 ± 9.2 | 89.5 ± 7.2 | 4.0 ± 5.8 | 0.68 | [0.28, 0.85] | 4.9 | 11.7 |
| Total standard score | 11.6 ± 2.4 | 12.7 ± 7.2 | 1.1 ± 1.6 | 0.64 | [0.23, 0.84] | 1.4 | 3.3 |

ᵃ Because of no variation, ICC and SEM could not be calculated.

3.2. Inter-tester reliability

The raw scores from the two tests on the first test day are given in Table 3. The mean TTS increased from Test 1 to Test 2. Table 3 also shows the results from the reliability analysis. The ICC values ranged from 0.35 to 0.67. The items “threading lace” and “one-board balance” (both preferred and non-preferred legs) also showed the highest measurement errors concerning inter-tester reliability, with the highest values for both the SEM and the SDC. The SDC values indicated that a change of greater than 18.5 or 4.5 for the TTS and TSS, respectively, would be required to be 90% certain that a change would not be the result of measurement error, but of a real change (Table 3).

Table 3

MABC-2. Inter-tester reliability given as intra-class correlation coefficient (ICC2,1) and confidence intervals (95% CI), standard error of measurement (SEMagreement) and smallest detectable change (SDC90%).

| Item | Test 1 (N = 30) | Test 2 (N = 30) | Difference Test 1–Test 2 | ICC2,1 | 95% CI | SEMagreement | SDC90% |
|---|---|---|---|---|---|---|---|
| Manual dexterity | | | | | | | |
| Placing pegs, preferred hand, sec | 27.9 ± 3.9 | 26.6 ± 3.5 | −1.4 ± 2.6 | 0.57 | [0.27, 0.77] | 2.5 | 5.9 |
| Placing pegs, non-preferred hand, sec | 30.7 ± 6.4 | 29.9 ± 5.3 | −0.8 ± 2.9 | 0.68 | [0.38, 0.84] | 3.5 | 8.4 |
| Threading lace, sec | 25.0 ± 6.4 | 23.0 ± 4.6 | −2.0 ± 3.4 | 0.53 | [0.23, 0.75] | 4.1 | 9.8 |
| Drawing trail, no. of errors | 0.4 ± 0.7 | 0.5 ± 0.7 | 0.03 ± 0.6 | 0.67 | [0.42, 0.83] | 0.6 | 1.5 |
| Aiming and catching | | | | | | | |
| Catching with two hands, no. of catches | 7.1 ± 2.6 | 7.4 ± 2.4 | 0.3 ± 1.2 | 0.66 | [0.40, 0.82] | 1.3 | 3.1 |
| Throwing beanbag onto mat, no. of hits | 7.6 ± 1.7 | 7.7 ± 1.9 | 0.1 ± 1.0 | 0.62 | [0.33, 0.80] | 1.1 | 2.5 |
| Balance | | | | | | | |
| One-board balance, right leg, sec | 26.8 ± 6.5 | 24.7 ± 8.0 | −2.2 ± 8.0 | 0.39 | [0.05, 0.65] | 5.8 | 13.7 |
| One-board balance, left leg, sec | 18.7 ± 10.0 | 16.3 ± 10.6 | −2.4 ± 10.3 | 0.50 | [0.19, 0.73] | 7.3 | 17.4 |
| Walking heel-to-toe forwards, no. of steps | 14.2 ± 2.1 | 14.2 ± 2.1 | 0.0 ± 2.3 | 0.42 | [0.06, 0.67] | 1.6 | 3.1 |
| Hopping on mats, right leg, no. of hops | 5.0 ± 0.2 | 4.9 ± 0.4 | −0.1 ± 0.5 | –ᵃ | –ᵃ | 0.3 | 0.8 |
| Hopping on mats, left leg, no. of hops | 4.7 ± 0.8 | 4.7 ± 0.6 | 0.0 ± 1.0 | –ᵃ | –ᵃ | 0.7 | 1.7 |
| Domains (component score) | | | | | | | |
| Manual dexterity | 28.0 ± 5.5 | 29.7 ± 5.0 | 1.7 ± 4.4 | 0.63 | [0.35, 0.80] | 3.2 | 7.7 |
| Aiming and catching | 21.9 ± 4.0 | 23.0 ± 4.2 | 1.1 ± 2.6 | 0.77 | [0.56, 0.89] | 2.0 | 4.7 |
| Balance | 33.5 ± 5.0 | 32.4 ± 5.6 | −1.1 ± 6.3 | 0.29 | [−0.07, 0.58] | 4.5 | 10.6 |
| Total score | | | | | | | |
| Total test score | 88.4 ± 11.0 | 85.1 ± 11.0 | −1.8 ± 9.5 | 0.62 | [0.35, 0.80] | 6.8 | 16.0 |
| Total standard score | 11.1 ± 2.5 | 11.6 ± 2.7 | 0.5 ± 2.2 | 0.63 | [0.36, 0.80] | 1.6 | 3.8 |

ᵃ Because of small variation, ICC could not be calculated.

4. Discussion

The present study showed a relatively high intra- and inter-rater chance variation of the MABC-2, ageband 2. A change of more than ±9.7 and ±18.5 on the TTS should be required to state (with 90% confidence) that a real change in a single individual has occurred, for intra- and inter-rater testing, respectively (Tables 2 and 3). The results indicate that a change of almost ±10 on the TTS is necessary to state that a real change has taken place when the same assessor tests the child on two occasions. The corresponding value when different assessors test the child on the two occasions, which is quite common in daily clinical practice, is ±18.5. Taking into account that the TTS scale goes from 1 to 108+, changes of more than 10 (same assessor) and 18.5 (different assessors) are quite extensive, and the results may indicate that the MABC-2 might be more suitable as a diagnostic or clinical decision making tool than as a sensitive tool for evaluation of change over time.
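As a worked illustration of how these limits might be applied to a single child, the sketch below compares an observed TTS change against the SDC90% values quoted above. The function name `exceeds_sdc` and the example scores are hypothetical; only the two limits (9.7 and 18.5) come from the text.

```python
# SDC90% limits for the total test score (TTS) as quoted in the text
SDC90_TTS = {"same_assessor": 9.7, "different_assessors": 18.5}


def exceeds_sdc(tts_before, tts_after, same_assessor=True):
    """Return True if the TTS change is larger than the relevant SDC90% limit."""
    key = "same_assessor" if same_assessor else "different_assessors"
    return abs(tts_after - tts_before) > SDC90_TTS[key]


if __name__ == "__main__":
    # Hypothetical child: TTS 60 at baseline, 68 at follow-up, same assessor.
    # The child moves from the amber to the green zone, yet the 8-point change
    # stays within the limit, so it cannot be separated from measurement error.
    print(exceeds_sdc(60, 68))
    print(exceeds_sdc(60, 75, same_assessor=False))
```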

In contrast to 3-year-old children (Smits-Engelsman et al., 2011), all children in the present study completed all items on the two occasions, indicating that the tasks were not as challenging for children in ageband 2 as for those in ageband 1. However, based on the amount of change, Tables 2 and 3 show that some items seem to be more difficult to perform in a consistent way than others. Both the items “threading lace” and “one-board balance” (bilaterally) show considerably higher SEMs and SDCs compared to the other items. These findings might be an indication that the two aforementioned tasks are more challenging than the other items, at least in healthy children.

The SEM and SDC values for the total test score (Tables 2 and 3) were quite high compared to the results from Smits-Engelsman et al. (2011), who found a smallest detectable difference (SDD = SDC) for the TSS of 1.7 and 3.4 for the intra-tester and inter-tester test–retest, respectively. The corresponding values from the present study were 2.6 and 4.5. The children included in Smits-Engelsman’s study were younger than the children included in the present study. Because the youngest children normally have a limited attention span and a limited ability to understand the instructions given by the examiners, we would have expected higher variation in the youngest group. However, the children included in Smits-Engelsman’s study showed a high number of failed items, and children who had ≥4 failed or missing items were excluded from the analysis. If all the included children had accomplished all 8 items, the SDCs would probably have been higher. In the present study, all children, independent of level of motor competence, performed all tasks, which might have increased the variation from one test to the next.

For practical reasons, the children were recruited from schools in the immediate neighborhood of the hospital and may therefore be a selected group. However, when comparing the anthropometric data from our study sample to data from more extensive populations of the same age groups in previously published papers (Beenakker, van der Hoeven, Fock, & Maurits, 2001; Holm, Fredriksen, Fosdahl, & Vøllestad, 2008), they are very similar with regard to height and weight. In addition, as the purpose of the study was to analyze the reliability of the MABC-2 and not to provide reference values, representativeness is of minor importance.

The results revealed that 4.5% of the children included in the study had motor problems. The prevalence is close to the findings from previously published studies (Chow & Henderson, 2003; Holm, Fredriksen, Fosdahl, Olstad, & Vollestad, 2007), which showed that 6–10% of children without any known medical conditions obtained scores below the 5th percentile and were characterized as having impaired motor competence. It might be speculated that the real prevalence should have been somewhat higher. Participation in research studies is voluntary, and the recruitment procedure might fail to persuade sedentary and inactive children or children with motor problems to accept an invitation to take part in tests where motor competence is the main objective.

The present study has some major methodological limitations. The intra- and inter-tester reliability studies include 14 overlapping cases, meaning that only about half of the children took part in both the intra- and the inter-tester part of the protocol. Ideally, estimates of the intra- and inter-tester reliability should have been based on the same samples, to allow straightforward comparison of the two sets of ICC (consistency versus agreement). For practical reasons this was not possible. Alternatively, two independent samples could have been included.

5. Conclusion

In contrast to previously published studies, the present study showed high intra- and inter-rater chance variation of the MABC-2, ageband 2. A change of more than ±9.7 and ±18.5 on the TTS should be required to state (with 90% confidence) that a real change in a single individual has occurred, for intra- and inter-rater testing, respectively. These findings may indicate that the MABC-2 might be more suitable for diagnostic or clinical decision making purposes than for evaluation of change over time.

Acknowledgments

We want to thank the children who participated in the study. Grant support was provided by Sophies Minde Foundation and the Norwegian Fund for Postgraduate Training in Physiotherapy.


References

Beenakker, E. A. C., van der Hoeven, J. H., Fock, J. M., & Maurits, N. M. (2001). Reference values of maximum isometric muscle force obtained in 270 children aged 4–16 years by hand-held dynamometry. Neuromuscular Disorders, 11, 441–446.

Brown, T., & Lalor, A. (2009). The movement assessment battery for children – second edition (MABC-2): A review and critique. Physical & Occupational Therapy in Pediatrics, 29, 86–103.

Chow, S. M., & Henderson, S. E. (2003). Interrater and test–retest reliability of the movement assessment battery for Chinese preschool children. American Journal of Occupational Therapy, 57, 574–577.

de Vet, H. C., Terwee, C. B., Mokkink, L. B., & Knol, D. L. (2011). Measurement in medicine. Cambridge: Cambridge University Press.

Ellinoudis, T., Evaggelinou, C., Kourtessis, T., Konstantinidou, Z., Venetsanou, F., & Kambas, A. (2011). Reliability and validity of age band 1 of the movement assessment battery for children – second edition. Research in Developmental Disabilities, 32, 1046–1051.

Henderson, S., Sugden, D. A., & Barnett, A. (2007). Movement assessment battery for children-2. Pearson Assessment.

Henderson, S. E., & Sugden, D. A. (1992). Movement assessment battery for children: Manual. London: Psychological Corporation.

Holm, I., Fredriksen, P. M., Fosdahl, M. A., Olstad, M., & Vollestad, N. (2007). Impaired motor competence in school-aged children with complex congenital heart disease. Archives of Pediatrics & Adolescent Medicine, 161, 945–950.

Holm, I., Fredriksen, P. M., Fosdahl, M. A., & Vøllestad, N. K. (2008). A normative sample of isotonic and isokinetic muscle strength measurements in children 7 to 12 years of age. Acta Paediatrica, 97, 602–607.

Smits-Engelsman, B. C., Niemeijer, A. S., & Van, W. H. (2011). Is the movement assessment battery for children-2nd edition: A reliable instrument to measure motor performance in 3 year old children? Research in Developmental Disabilities, 32, 1370–1377.

Wagner, J. M., Rhodes, J. A., & Patten, C. (2008). Reproducibility and minimal detectable change of three-dimensional kinematic analysis of reaching tasks in people with hemiparesis after stroke. Physical Therapy, 88, 652–663.