Reliability Analysis
A technique to determine the scalability and reliability of a scale with multiple items.
Cronbach’s alpha
Spearman-Brown split-half reliability
Guttman split-half reliability
Factor analysis & scale validity
Reliability Analysis: Charles M. Friel Ph.D., Criminal Justice Center, Sam Houston State University
Key Concepts
Reliability Analysis
The concept of a scale
Difference between a scale and an index
UCR Index Crime per 100,000
Sellin-Wolfgang Crime Seriousness Scale
Salient Factor Scale of the US Parole Commission
Rand Seven-Factor Scale
Key questions to ask about a scale
The concept of reliability
Operational definition of reliability
Test-retest reliability
Alternative forms reliability
Odd-even reliability
Split-half reliability
Inter-rater reliability
The concept of validity
Operational definitions of validity
Face validity
Content validity
Concurrent validity
Predictive validity
Inter-rater validity
Scale score: the sum of the scores across items
Average score: (scale score) (1/n)
Classical theory of reliability
Observed score
True score
Error
Reliability as the ratio of the true score variance to the observed score variance
The relationship between the reliability of a scale and the number of items
Interpretation of Cronbach’s alpha
The effect on the reliability of a scale of deleting one or more items
Interpretation of Spearman-Brown split-half reliability
Assumptions
Interpretation of Guttman split-half reliability
Assumptions
Strictly parallel v. parallel models of reliability and their assumptions
The use of factor analysis in reliability analysis
The use of regression analysis and analysis of variance in reliability analysis
Lecture Outline
The concept of a scale & criminal justice examples of scales
Issues in assessing a scale: reliability & validity
An example: the training needs of court administrators
Classical theory of reliability
Cronbach’s Alpha (α)
Reliability analysis of training needs data
The concept of a split-half reliability
Spearman-Brown split-half reliability
Guttman split-half reliability
Testing assumptions about scale items:
Strictly parallel vs parallel models
Factor analysis and scale validity
Validating a scale using external criteria: factor analysis, regression and ANOVA
Reliability Analysis
Interdependency Technique
Designed to determine the consistency with which multiple items in a scale measure the same underlying trait
Assumptions
Since reliability analysis uses correlational techniques, the assumptions of correlation apply
Variables are metric
Variances of the various variables are comparable
Covariances among the various combinations of variables are comparable
Absence of outliers
The Concept of a Scale
A measuring instrument from which …
A single number can be derived
Across multiple items
Which indicates the quantity of a trait a subject possesses.
Some criminal justice examples of scales
The UCR Index Crime Rate
Sellin-Wolfgang Crime Seriousness Index
US Parole Commission Salient Factor Score
Rand Seven-Factor Index
Uniform Crime Report Index of Part I Offenses Per 100,000 Population
The UCR provides an index of crime based upon the sum of reported crimes in seven categories, including:
Violent Crimes        Property Crimes *
Homicide              Burglary
Forcible Rape         Larceny-Theft
Robbery               Auto Theft
Aggravated Assault
This index is tracked year-to-year and it is assumed that:
If the index rises, so does the total incidence of crime, both reported & unreported
E.g., an index of 4000 crimes is twice an index of 2000, which is taken to indicate twice the incidence of reported and unreported crime. This is a questionable assumption.
* Arson is considered a property crime but is not included in the Crime Index Total. It is included in the Modified Crime Index Total, however.
Sellin-Wolfgang Crime Seriousness Index
(Sellin, T. & Wolfgang, M.E. The Measurement of Delinquency, Wiley, 1966)
Sellin & Wolfgang developed a technique to account for not only …
The number of crimes reported to police
But also their relative seriousness
Based upon surveys of various populations they found differential seriousness weights for various crimes. For example:
Crime                     Seriousness Weight
Assault (death)           26
Forcible Rape             11
Robbery (weapon)          5
Larceny $5000             4
Auto Theft (no damage)    2
Larceny $5                1
Assault (minor)           1
They proposed that crimes be weighted for seriousness first, then added together to provide an index which reflects both the …
Amount of crime
And the relative seriousness of crimes
The Salient Factor Score of the US Parole Commission
(Hoffman, P. Screening for risk: a revised salient factor score. J. of Criminal Justice, 11, 1984, 539-547)
The US Parole Commission has used the Salient Factor Score to predict the likelihood of recidivism on parole.
The score ranges from 0 (poor risk) to 10 (very good risk) based on the weighting of the following factors:
1 Number of prior convictions
2 Prior commitments longer than 30 days
3 Age at the time of the current offense
4 How long the offender was at liberty since the last commitment
5 Whether the offender was on probation, parole or escape status at the time of the current offense
6 Record of heroin dependency
The Salient Factor Score combined with the seriousness of the current offense is also used by the US Sentencing Commission to provide sentencing guidelines.
Rand Seven-Factor Index: Selective Incarceration of Career Criminals
(Greenwood, P.W. & Abrahamse, A. Selective Incapacitation, Rand Corp., Santa Monica, Calif.: 1982)
The Rand Corporation developed a seven-factor scale to identify defendants likely to be high-rate serious offenders if not incarcerated.
The research was based upon self-report surveys of incarcerated robbers and burglars.
The seven factors of the scale included:
Prior conviction for the same charge
Incarcerated more than 50% of the previous 2 years
Convicted before the age of 16
Served time in a juvenile facility
Drug use in the previous 2 years
Drug use as a juvenile
Unemployed more than 50% of the last 2 years
The Index ranges from 0 (low risk) to 7 (high risk)
Key Questions About A Scale
Are the items in the scale reliable, valid?
Are the scale items additive?
Can a scale score be derived from the items?
From the sum of the items, or
The average of the items?
Is the scale score reliable, valid?
Do the items in the scale measure one or more than one trait?
To what extent are the items in the scale intercorrelated?
Can parallel forms of the scale be developed?
How well can individual items predict the scale score?
What external criteria should be used to validate the scale?
The Concept of Reliability
Reliability
How accurate is the instrument?
How accurately does the instrument measure “whatever” it measures?
How well does the instrument correlate with itself?
Operational definitions of reliability
Test-retest reliability
Alternative forms reliability
Split-half reliability
Odd-even reliability
Inter-rater reliability
Operational Definitions of Reliability
Test-Retest Reliability
Measure the same subjects twice (t1 & t2) with the same instrument & under the same conditions.
Reliability = the correlation between t1 & t2
Problems: pretest sensitivity, history, and maturation
Alternative Forms Reliability
Odd-Even Reliability: correlate the odd numbered items in a scale or test with the even numbered items.
Split-Half Reliability: correlate the 1st half of the items on the scale or test with the 2nd half of the items.
Inter-Rater Reliability
Used to determine the consistency with which 2 or more raters can independently rate the same subjects the same way.
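Several of the operational definitions above reduce to a correlation between two sets of scores (t1 vs. t2, odd vs. even items, first half vs. second half). A minimal sketch of that calculation follows; the scores are hypothetical and the helper name `pearson_r` is an illustrative assumption, not part of the lecture.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

# Hypothetical scale scores for six subjects measured at t1 and again at t2
t1 = [30, 42, 25, 51, 38, 47]
t2 = [32, 40, 27, 49, 41, 45]

# Test-retest reliability = correlation between the two administrations
test_retest = pearson_r(t1, t2)
print(round(test_retest, 3))
```

The same function could be applied to odd-item and even-item subtotals, or to first-half and second-half subtotals, to obtain the odd-even and split-half correlations.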
The Concept of Validity
Validity
What is being measured?
To what extent does the instrument measure what it is designed to measure?
Is more than one trait being measured?
How well does the instrument correlate with validated external criteria?
Operational Definitions of Validity
Face validity
Content validity
Concurrent validity
Predictive validity
Inter-rater validity
Operational Definitions of Validity
Face Validity
On its face, does the measuring instrument “look” like it measures what it is designed to measure? (non-empirical standard)
Content Validity
As on an examination, the extent to which the items on a scale or test adequately sample the full range of content to be measured
Concurrent Validity
Does the instrument measure the intended concept as it exists “now”, at the present time, as opposed to some future time?
Predictive Validity
Does the instrument measure the intended concept as it will be at some future point in time, as in a forecast of recidivism?
Inter-Rater Validity
The correlation between the independent assessment made by a valid expert and the assessment made with the measuring instrument
An Example: Training Needs of Court Administrators
A survey that included 13 Likert scale training needs items was distributed to 202 court administrators to determine their relative need for continuing professional education.
The items were designed to determine the perceived need for training in the following areas.
Administrative Issue
Case flow management (case_flo)
Judge/administrator relations (jud_rel)
Communication skills (com_skl)
Integrated justice systems (int_jus)
Court's role in corrections (com_cor)
Management information systems (info_sys)
Court reporting technology (rep_tec)
The court as a human organization (hum_org)
Security management (sec_man)
Program evaluation (eval)
Judicial ethics (ethics)
Strategic planning (plan)
Human resource management (hum_res)
(The terms in parentheses are the database code names of the variables)
An Example: Training Needs of Court Administrators (cont.)
Each of the 13 items in the survey was rated on the following Likert scale.
1 = no training needed
2 = minor need
3 = needed
4 = growing need
5 = very critical need
Calculating a scale score for a subject
Minimum scale score = (13) (1) = 13
Maximum scale score = (13) (5) = 65
Converting the scale score to a Likert value
Scale conversion factor: 1/13 = 0.07692
For a minimum scale score of 13
(13) (0.07692) ≈ 1 level of need
For a maximum scale score of 65
(65) (0.07692) ≈ 5 level of need
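The scale-score arithmetic above can be sketched for a single respondent as follows. The ratings are hypothetical and the variable names are assumptions chosen for illustration.

```python
# Hypothetical Likert ratings (1-5) from one administrator on the 13 items
ratings = [3, 4, 2, 5, 3, 3, 2, 4, 3, 4, 3, 1, 4]

n_items = len(ratings)

# Scale score: the sum of the ratings across items (13 to 65 for this scale)
scale_score = sum(ratings)

# Average score: scale score times the conversion factor 1/13,
# which puts the result back on the 1-5 Likert metric
average_score = scale_score * (1 / n_items)

print(scale_score, round(average_score, 2))
```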
Research Questions About the Training Needs of Court Administrators
How much variability is there in the need for training across the various items?
In what area(s) is there the greatest need?
In what area(s) is there the least need?
What is the average need for training across all the 13 items?
To what extent are the training need items correlated?
Can a scale score, the sum of the items or their average, be used as an overall measure of the need for training?
Is the scale score a reliable measure of whatever the scale measures?
Can reliable alternative forms of the scale be constructed?
What is the effect on the reliability of the scale of deleting one or more items?
What does the scale measure? What are the external correlates of the scale?
Classical Theory of Reliability
The trait being measured (need for training)
True Score (tau, $\tau$)
Random Error (e)
The observed score on an item ($X_{ij}$)
$X_{ij} = \tau + e$
Definition of Reliability
Index of Reliability
The proportion of the true score variability captured across all items
Relative to the total observed score variability across all the items
$r = \sigma^2_{\text{true score}} / \sigma^2_{\text{observed score}}$
Assumptions
If the error associated with the observed scores is random,
Then when the scores are summed across items,
The errors should cancel, and
The scale score should approximate the true score being measured.
Therefore, the more items in the scale, the better the estimate of the true score due to the greater opportunity for the errors to cancel each other
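The cancellation argument can be made explicit with a short derivation. This is a sketch under added assumptions not stated on the slide: k items that each measure the same true score $\tau$, with errors $e_j$ that are mutually uncorrelated, uncorrelated with $\tau$, and share a common variance $\sigma^2_e$.

```latex
% Single item: observed score = true score + random error
X_j = \tau + e_j, \qquad
\sigma^2_{X} = \sigma^2_{\tau} + \sigma^2_{e}, \qquad
r = \frac{\sigma^2_{\tau}}{\sigma^2_{\tau} + \sigma^2_{e}}

% Scale score: sum over k items
S = \sum_{j=1}^{k} X_j = k\,\tau + \sum_{j=1}^{k} e_j, \qquad
\sigma^2_{S} = k^2 \sigma^2_{\tau} + k\,\sigma^2_{e}

% Reliability of the scale score
r_S = \frac{k^2 \sigma^2_{\tau}}{k^2 \sigma^2_{\tau} + k\,\sigma^2_{e}}
    = \frac{k\,\sigma^2_{\tau}}{k\,\sigma^2_{\tau} + \sigma^2_{e}}
```

Under these assumptions the true-score variance grows with $k^2$ while the error variance grows only with $k$, so $r_S$ approaches 1 as items are added.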
Lee J. Cronbach’s Alpha
(Cronbach, L.J. Coefficient alpha and the internal structure of tests. Psychometrika, 16, 1951, 297-334)
Alpha (α) measures the extent to which the scale score measures the true score
Indicates the reliability of the scale
Ranges between 0.0 and 1.0
0.0 = no reliability
1.0 = perfect reliability
$\alpha = \dfrac{k\,(\mathrm{cov}/\mathrm{var})}{1 + (k - 1)\,(\mathrm{cov}/\mathrm{var})}$
k = the number of items in the scale
cov = the average covariance between pairs of items
var = the average variance of the items
If the scale items have been standardized
$\alpha = \dfrac{k\,r}{1 + (k - 1)\,r}$
r = the average correlation between pairs of items
Structure of Cronbach’s Alpha
The greater the correlation among the items …
The higher the value of α (ranges from 0 to 1)
The greater the covariance among the items …
The higher the value of α
The greater the number of items …
The higher the value of α
Items with high covariance are measuring the same thing, namely …
Tau, the true score
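The dependence of α on the number of items and on the inter-item correlation can be seen by evaluating the standardized formula for a few values. This is a minimal sketch; the function name `standardized_alpha` and the example values of k and r are assumptions for illustration.

```python
def standardized_alpha(k, r):
    """Alpha for k standardized items with average inter-item correlation r."""
    return (k * r) / (1 + (k - 1) * r)

# Holding the average correlation fixed, alpha rises with the number of items
for k in (2, 5, 13, 26):
    print("k =", k, "alpha =", round(standardized_alpha(k, 0.44), 4))

# Holding the number of items fixed, alpha rises with the average correlation
for r in (0.2, 0.4, 0.6):
    print("r =", r, "alpha =", round(standardized_alpha(13, r), 4))
```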
Descriptive Data on the Survey of Court Administrators
Item          Mean     Std Dev   Cases
1. CASE_FL    3.3692   1.3222    202.0
2. COM_COR    2.4667   1.2628    202.0
3. COM_SKL    3.3692   1.1958    202.0
4. ETHICS     3.0923   1.3127    202.0
5. EVAL       2.7077   1.3352    202.0
6. HUM_ORG    2.8814   1.1553    202.0
7. HUM_RES    2.4330   1.2845    202.0
8. INF_SYS    3.0258   1.2476    202.0
9. INT_JUS    2.8299   1.3071    202.0
10. JUD_REL   3.0464   1.3868    202.0
11. PLAN      2.8308   1.2327    202.0
12. REP_TEC   1.9077   1.1337    202.0
13. SEC_MAN   3.1077   1.3727    202.0
Intercorrelation of Scale Items
The higher the intercorrelation among the scale items, the greater the reliability of the scale and the higher the value of Cronbach's alpha
Correlation Matrix
          CASE_FL   COM_COR   COM_SKL   ETHICS    EVAL
CASE_FL   1.0000
COM_COR   .2396     1.0000
COM_SKL   .4733     .3901     1.0000
ETHICS    .4510     .3049     .5336     1.0000
EVAL      .3355     .2260     .4115     .5429     1.0000
HUM_ORG   .3596     .3646     .5197     .6004     .4842
HUM_RES   .3435     .3870     .4654     .5006     .5370
INF_SYS   .4770     .2893     .4408     .4938     .4524
INT_JUS   .3716     .2612     .3117     .4607     .2855
JUD_REL   .4849     .2890     .5753     .6811     .4882
PLAN      .4462     .2921     .4765     .5628     .5149
REP_TEC   .2345     .2794     .3143     .4602     .3508
SEC_MAN   .4064     .4426     .5190     .5386     .4021

          HUM_ORG   HUM_RES   INF_SYS   INT_JUS   JUD_REL
HUM_ORG   1.0000
HUM_RES   .5262     1.0000
INF_SYS   .5647     .4155     1.0000
INT_JUS   .4813     .4098     .5243     1.0000
JUD_REL   .5871     .5253     .5025     .4982     1.0000
PLAN      .6462     .5333     .6369     .4824     .5779
REP_TEC   .4216     .4590     .3744     .3898     .3791
SEC_MAN   .4372     .4242     .3820     .4087     .4811

          PLAN      REP_TEC   SEC_MAN
PLAN      1.0000
REP_TEC   .4626     1.0000
SEC_MAN   .3603     .3419     1.0000
Item Variances and Covariances
Statistics for   Mean      Variance   Std Dev   Variables
Scale            37.0678   132.4273   11.5077   13

                          Mean     Minimum   Maximum   Range    Max/Min   Variance
Item Means                2.8514   1.9077    3.3692    1.4615   1.7661    .1637
Item Variances            1.6262   1.2853    1.9233    .6380    1.4964    .0390
Inter-item Covariances    .7134    .3515     1.2399    .8884    3.5276    .0313
Inter-item Correlations   .4398    .2260     .6811     .4551    3.0134    .0102

Slide callouts: the scale mean (37.0678) is the average scale score; the mean item variance (1.6262) is var; the mean inter-item covariance (.7134) is cov.
The scale score is the sum of the Likert ratings across the 13 items in the scale. The mean score for the 202 administrators is 37.0678.
Cronbach's alpha is calculated from the average variance (var) and average covariance (cov) among the scale items.
Calculation of Cronbach’s Alpha (α)
$\alpha = \dfrac{k\,(\mathrm{cov}/\mathrm{var})}{1 + (k - 1)\,(\mathrm{cov}/\mathrm{var})}$
$\alpha = \dfrac{(13)\,(0.7134 / 1.6262)}{1 + (13 - 1)\,(0.7134 / 1.6262)}$
α = 0.9104 A high degree of reliability
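The calculation can be reproduced from the summary statistics reported for the 13 items. This is a minimal sketch; the function and argument names are assumptions chosen for illustration.

```python
def cronbach_alpha(k, avg_cov, avg_var):
    """Cronbach's alpha from the number of items, the average inter-item
    covariance, and the average item variance."""
    ratio = avg_cov / avg_var
    return (k * ratio) / (1 + (k - 1) * ratio)

# Values taken from the item variance and covariance summary for the survey
alpha = cronbach_alpha(k=13, avg_cov=0.7134, avg_var=1.6262)
print(round(alpha, 4))  # 0.9104
```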
Average scale score
(2.8514, average Likert rating) (13) = 37.0682 ≈ 37.07
(2.8514) / (0.07692) ≈ 37.07
The scale score can range from a low of 13 to a high of 65.
The Effect of Deleting an Item on the Reliability of the Scale
Q: If the item caseflow management (case_fl) is deleted from the scale, would the reliability (α) decline appreciably?
Average Likert score for all 13 items = 2.8514
Average scale score for all items = (2.8514) (13) = 37.0682
For case_fl, the average score = 3.3692
If case_fl is deleted from the scale
The mean scale score declines to
(37.0682) - (3.3692) = 33.699
α declines from 0.9104 to 0.9071 (cf. table in the next exhibit)
Conclusion
Deletion of case_fl does not affect the reliability of the scale very much, since its deletion does not change α appreciably
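One way to compute "alpha if item deleted" is to drop the item's row and column from the item covariance matrix and recompute α on what remains. The sketch below assumes that approach and uses a hypothetical 4-item covariance matrix, not the survey data; the function names are illustrative.

```python
def alpha_from_cov(cov):
    """Cronbach's alpha from a k x k item covariance matrix (list of lists)."""
    k = len(cov)
    avg_var = sum(cov[i][i] for i in range(k)) / k
    off_diag = [cov[i][j] for i in range(k) for j in range(k) if i != j]
    avg_cov = sum(off_diag) / len(off_diag)
    ratio = avg_cov / avg_var
    return (k * ratio) / (1 + (k - 1) * ratio)

def alpha_if_deleted(cov, drop):
    """Alpha recomputed after removing item `drop` from the matrix."""
    keep = [i for i in range(len(cov)) if i != drop]
    reduced = [[cov[i][j] for j in keep] for i in keep]
    return alpha_from_cov(reduced)

# Hypothetical data: items 0-2 covary strongly, item 3 barely covaries with them
cov = [
    [1.0, 0.6, 0.6, 0.1],
    [0.6, 1.0, 0.6, 0.1],
    [0.6, 0.6, 1.0, 0.1],
    [0.1, 0.1, 0.1, 1.0],
]

full_alpha = alpha_from_cov(cov)
print("full scale:", round(full_alpha, 4))
for item in range(len(cov)):
    print("without item", item, ":", round(alpha_if_deleted(cov, item), 4))
```

In this made-up example, deleting the weakly related item raises α, while deleting any of the strongly related items lowers it.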
Summary Table of the Impact of Deleting Items on the Reliability of the Scale
Cronbach's α for the full scale = 0.9104
Item-total Statistics
Item      Scale Mean        Scale Variance    Corrected Item-     Squared Multiple   Alpha
          if Item Deleted   if Item Deleted   Total Correlation   Correlation        if Item Deleted
CASE_FL   33.6985           115.0976          .5492               .3710              .9071
COM_COR   34.6011           118.7323          .4397               .2811              .9113
COM_SKL   33.6985           114.3094          .6526               .4906              .9028
ETHICS    33.9755           110.2300          .7428               .6047              .8987
EVAL      34.3601           113.6023          .5988               .4418              .9050
HUM_ORG   34.1863           113.3235          .7224               .5686              .9002
HUM_RES   34.6348           112.7329          .6615               .4871              .9023
INF_SYS   34.0420           113.2100          .6652               .5248              .9022
INT_JUS   34.2379           114.4985          .5799               .4213              .9058
JUD_REL   34.0214           109.2204          .7342               .5978              .8990
PLAN      34.2370           112.0915          .7209               .5995              .8999
REP_TEC   35.1601           118.1503          .5271               .3309              .9076
SEC_MAN   33.9601           112.6420          .6144               .4535              .9044
Reliability Coefficients 13 items
Alpha = .9104 Standardized item alpha = .9108
Interpretation: Based upon the decrease in α,
The most reliable items are ethics, judicial/ administrator relations, & planning.
The least reliable items are community corrections, caseflow management, & court reporting technology.
Interpretation of the Item Deletion Summary Table
Scale mean if item deleted
The mean scale score if the associated item is deleted. The mean scale score for all 13 items is 37.0682. (cf. p. 24)
Scale variance if item deleted
The scale variance if the associated item is deleted. The variance for all 13 items is 132.4273. (cf. p. 24)
Corrected item-total correlation
The correlation between a deleted item and the scale score based on the remaining 12 items
If low, the item contributes little to the scale's reliability
If high, the item contributes a lot to the scale's reliability
Interpretation of the Item Deletion Summary Table (cont.)
Squared multiple correlation ($R^2$)
Regression of the deleted item on the 12 remaining items in the scale
$X_d = a + b_1X_1 + b_2X_2 + \dots + b_{12}X_{12}$
$X_d$ = deleted item
Indicates the proportion of variance in the deleted item explained by the 12 remaining items in the scale
If $R^2$ is high, the deleted item contributes substantially to the reliability of the scale
If $R^2$ is low, the deleted item contributes little to the reliability of the scale
Alpha if item deleted
The effect on the reliability of the scale (α) if the item is deleted.
Compare with the value of Cronbach's α for the scale including all 13 items (0.9104, cf. p. 27)
Split-Half Reliability
Sometimes alternative forms of the same scale are desirable, as in pre-post designs.
But will the two forms be equally reliable?
Spearman-Brown Split-Half Reliability
$r_{SB} = \dfrac{2\,r_{xy}}{1 + r_{xy}}$
$r_{xy}$ = correlation between the two halves of the scale
Guttman Split-Half Reliability
$r_G = \dfrac{2\,(S_t^2 - S_{t1}^2 - S_{t2}^2)}{S_t^2}$
$S_t^2$ = total variance of entire scale
$S_{t1}^2$ = variance of the 1st half of the scale
$S_{t2}^2$ = variance of the 2nd half of the scale
Split-Half Reliability of the Training Needs Scale
Statistics for   Mean      Variance   Std Dev   Variables
Part 1           20.3196   40.0103    6.3254    7
Part 2           16.7482   32.2112    5.6755    6
Scale            37.0678   132.4273   11.5077   13

Item Means       Mean     Minimum   Maximum   Range    Max/Min   Variance
Part 1           2.9028   2.4330    3.3692    .9362    1.3848    .1534
Part 2           2.7914   1.9077    3.1077    1.2000   1.6290    .2008
Scale            2.8514   1.9077    3.3692    1.4615   1.7661    .1637

Item Variances   Mean     Minimum   Maximum   Range    Max/Min   Variance
Part 1           1.6091   1.3347    1.7828    .4481    1.3357    .0286
Part 2           1.6462   1.2853    1.9233    .6380    1.4964    .0583
Scale            1.6262   1.2853    1.9233    .6380    1.4964    .0390

Inter-item Covariances   Mean    Minimum   Maximum   Range   Max/Min   Variance
Part 1                   .6845   .3811     .9515     .5705   2.4969    .0261
Part 2                   .7445   .5295     .9879     .4584   1.8657    .0259
Scale                    .7134   .3515     1.2399    .8884   3.5276    .0313

Inter-item Correlations   Mean    Minimum   Maximum   Range   Max/Min   Variance
Part 1                    .4284   .2260     .6004     .3744   2.6565    .0105
Part 2                    .4535   .3419     .6369     .2950   1.8629    .0072
Scale                     .4398   .2260     .6811     .4551   3.0134    .0102

Reliability Coefficients
N of Cases = 202.0
N of Items = 13
Correlation between forms = .8385
Equal-length Spearman-Brown = .9122
Unequal-length Spearman-Brown = .9126
Guttman Split-half = .9093
7 items in Part 1, 6 items in Part 2
Alpha for Part 1 = .8382
Alpha for Part 2 = .8320
Interpretation
The 13 items in the scale are divided into two parts or forms: Part 1 and Part 2
Part 1 has 7 items, Part 2 has 6 items
Split-Half Reliability of the Training Needs Scale (cont.)
The Spearman-Brown reliability for equal-length forms: rSB = 0.9122, unequal length forms: rSB = 0.9126
The Guttman reliability: rG = 0.9093
Cronbach's α
Part 1 = 0.8382
Part 2 = 0.8320
Compare the α's of the two parts to the α for the scale with 13 items: α = 0.9104
Calculation of the Spearman-Brown Split-Half Reliability
For forms of equal length
$r_{SB} = \dfrac{2\,r_{xy}}{1 + r_{xy}}$
$r_{SB} = \dfrac{(2)\,(0.8385)}{1 + 0.8385} = 0.9122$
$r_{xy}$ = correlation between the two forms of the scale
For forms of unequal length
$r_{SB} = \dfrac{(2.00097)\,(0.8385)}{1 + 0.8385} = 0.9126$
Assumptions
Parts 1 and 2 are equally reliable
Equal variances in Parts 1 and 2
Interpretation
rSB = 0.9126 indicates the reliability of a 13 item scale made up of two parts that correlate 0.8385
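The equal-length Spearman-Brown figure can be checked directly from the reported correlation between the two forms. This is a minimal sketch; the function name is an assumption for illustration.

```python
def spearman_brown(r_xy):
    """Equal-length Spearman-Brown split-half reliability from the
    correlation between the two halves."""
    return (2 * r_xy) / (1 + r_xy)

# Correlation between forms reported for the training needs scale
r_sb = spearman_brown(0.8385)
print(round(r_sb, 4))  # 0.9122
```

The stepped-up value exceeds the raw correlation of 0.8385 because that correlation describes two half-length forms, while the coefficient estimates the reliability of the full-length scale.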
Calculation of Louis Guttman’s Split-Half Reliability
$r_G = \dfrac{2\,(S_t^2 - S_{t1}^2 - S_{t2}^2)}{S_t^2}$
$S_t^2$ = variance of the 13 item scale (cf. p. 31)
$S_{t1}^2$ = variance of Part 1, 7 items (cf. p. 31)
$S_{t2}^2$ = variance of Part 2, 6 items (cf. p. 31)
$r_G = \dfrac{(2)\,(132.43 - 40.01 - 32.21)}{132.43}$
$r_G = 0.9093$
Assumptions
Assumes neither equal reliability in Parts 1 and 2
Nor equal variances in Parts 1 and 2
Interpretation
rG = 0.9093 indicates the reliability of a 13 item scale made up of two parts that correlate 0.8385
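The Guttman coefficient can be reproduced from the three variances in the split-half output. The sketch below also recovers the correlation between the two parts from the same variances, using the identity Var(total) = Var(part 1) + Var(part 2) + 2 Cov(part 1, part 2). Function and variable names are assumptions for illustration.

```python
from math import sqrt

def guttman_split_half(var_total, var_part1, var_part2):
    """Guttman split-half reliability from the total and part variances."""
    return 2 * (var_total - var_part1 - var_part2) / var_total

# Variances reported for the full scale, Part 1 (7 items), and Part 2 (6 items)
var_total, var_part1, var_part2 = 132.4273, 40.0103, 32.2112

r_g = guttman_split_half(var_total, var_part1, var_part2)
print(round(r_g, 4))  # 0.9093

# The covariance between the parts is half of what the part variances leave over
cov_parts = (var_total - var_part1 - var_part2) / 2
r_xy = cov_parts / sqrt(var_part1 * var_part2)
print(round(r_xy, 4))  # 0.8385
```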
Testing Assumptions
Do the scale items have
Equal mean estimates of the true score?
Equal variance estimates of the true score?
Two models which can be tested
Strictly Parallel Model
Parallel Model
Strictly Parallel Model
Tests whether all the items have the same means and variances for the true score
Parallel Model
Tests whether all the items have the same variances for the true score
But not necessarily the same means for the true score
Test of the Strictly Parallel Model of Assumptions
Test for Goodness of Fit of Model Strictly Parallel
Chi-square = 633.3580
Degrees of Freedom = 101
Log of determinant of unconstrained matrix = -.009346
Log of determinant of constrained matrix = 3.206150
Probability = .0000
Parameter Estimates
Estimated common mean = 2.8514
Estimated common variance = 1.7773
Error variance = 1.0720
True variance = .7053
Estimated common inter-item correlation = .3943
Estimated reliability of scale = .8943
Unbiased estimate of reliability = .8959
Strictly Parallel Model
Null hypothesis: The items have the same means and variances for the true score.
$\chi^2 = 633.358$, df = 101, p < 0.0001
Decision: Reject the null hypothesis; the items have significantly different means and variances.
Test of the Parallel Model of Assumptions
Test for Goodness of Fit of Model Parallel
Chi-square = 242.7770
Degrees of Freedom = 89
Log of determinant of unconstrained matrix = -.009346
Log of determinant of constrained matrix = 1.226618
Probability = .0000
Parameter Estimates
Estimated common variance = 1.6262
Error variance = .9128
True variance = .7134
Estimated common inter-item correlation = .4387
Estimated reliability of scale = .9104
Unbiased estimate of reliability = .9113
Parallel Model
Null hypothesis: The items have the same variances for the true score, but not necessarily the same means.
$\chi^2 = 242.777$, df = 89, p < 0.0001
Decision: Reject the null hypothesis; the items have significantly different variances.
To What Extent Do the 13 Training Needs Items Measure the Same Thing?
Factor analysis of the 13 training needs items
Principal Component Analysis
With Varimax Rotation
Results
Only one factor extracted with an eigenvalue greater than 1.0
This factor accounts for 48.94% of the variance in the 13 training needs items
Conclusion
The one underlying trait being measured by the 13 item scale is the need for training.
Results of the Factor Analysis of the 13 Training Needs Items
Communalities
Variable   Initial   Extraction
CASE_FL    1.000     .380
COM_SKL    1.000     .508
COM_COR    1.000     .251
REP_TEC    1.000     .354
SEC_MAN    1.000     .451
ETHICS     1.000     .638
HUM_RES    1.000     .521
JUD_REL    1.000     .630
INT_JUS    1.000     .420
INF_SYS    1.000     .532
HUM_ORG    1.000     .613
EVAL       1.000     .450
PLAN       1.000     .614
Extraction Method: Principal Component Analysis.
Factor Analysis
Total Variance Explained
Component   Total   % of Variance   Cumulative %   Total   % of Variance   Cumulative %
1           6.363   48.943          48.943         6.363   48.943          48.943
2           .975    7.500           56.443
3           .882    6.788           63.231
4           .798    6.142           69.373
5           .653    5.021           74.394
6           .581    4.470           78.864
7           .551    4.235           83.099
8           .501    3.856           86.955
9           .421    3.238           90.193
10          .363    2.793           92.986
11          .325    2.501           95.486
12          .308    2.368           97.854
13          .279    2.146           100.000
Extraction Method: Principal Component Analysis.
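The "eigenvalue greater than 1.0" rule and the percentage of variance can be checked from the eigenvalues in the Total Variance Explained table. This is a sketch under the standard assumption that each of the 13 standardized items contributes one unit of variance, so the eigenvalues sum to 13; the variable names are illustrative.

```python
# Eigenvalues ("Total" column) from the Total Variance Explained table
eigenvalues = [6.363, 0.975, 0.882, 0.798, 0.653, 0.581, 0.551,
               0.501, 0.421, 0.363, 0.325, 0.308, 0.279]

n_items = len(eigenvalues)

# Keep only components whose eigenvalue exceeds 1.0
retained = [ev for ev in eigenvalues if ev > 1.0]

# Share of total variance carried by the first component, in percent
pct_first = 100 * eigenvalues[0] / n_items

print(len(retained), round(pct_first, 2))
```

Because the tabled eigenvalue is rounded to three decimals, the percentage computed here can differ from the reported 48.943 in the second decimal place.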
Results of the Factor Analysis of the 13 Training Needs Items (cont.)
Component Matrix
Variable   Component 1
CASE_FL    .617
COM_SKL    .713
COM_COR    .501
REP_TEC    .595
SEC_MAN    .672
ETHICS     .799
HUM_RES    .722
JUD_REL    .793
INT_JUS    .648
INF_SYS    .729
HUM_ORG    .783
EVAL       .671
PLAN       .783
Extraction Method: Principal Component Analysis.
a 1 components extracted.
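With a single extracted component, each item's communality is its squared loading, and the squared loadings sum to the component's eigenvalue. The sketch below checks both relationships against the reported values, allowing for rounding in the printed loadings; the dictionary and variable names are assumptions for illustration.

```python
# Loadings on the single extracted component, from the Component Matrix
loadings = {
    "CASE_FL": 0.617, "COM_SKL": 0.713, "COM_COR": 0.501, "REP_TEC": 0.595,
    "SEC_MAN": 0.672, "ETHICS": 0.799, "HUM_RES": 0.722, "JUD_REL": 0.793,
    "INT_JUS": 0.648, "INF_SYS": 0.729, "HUM_ORG": 0.783, "EVAL": 0.671,
    "PLAN": 0.783,
}

# Communality of each item = squared loading (one component only)
communalities = {name: value ** 2 for name, value in loadings.items()}

# Sum of squared loadings should approximate the first eigenvalue (6.363)
sum_squared_loadings = sum(communalities.values())

print(round(communalities["ETHICS"], 3), round(sum_squared_loadings, 3))
```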
Validation of the Scale
To what extent is the need for training, as measured by the scale score sum, correlated with the following experiential variables? (* metric variable, ** nonmetric variable)
Years of experience (years)*
Education (edu)*
Participation in the state’s professional development program (pdp)*
Participation in annual professional conferences (conf)*
Years of membership in the state's professional court administration association (year_mem)*
Type of court administered (type_c)**
The Analysis
Dependent Variable = the scale score (scl_sum)
For metric independent variables
Multiple regression analysis
For the nonmetric independent variable
One-way ANOVA
The Scale Score as a Function of Metric Predictor Variables
Regression Analysis
Scale score regressed on all metric experiential variables
The metric independent variables were not significantly related to the scale score for training needs
$R^2 = 0.047$, p = 0.267
Model Summary
Model   R      R Square   Adjusted R Square   Std. Error of the Estimate
1       .217   .047       .011                11.6954
a Predictors: (Constant), YEAR_MEM, EDU, PDP, CONF, YEARS
b Dependent Variable: SCL_SUM
ANOVA
Model          Sum of Squares   df    Mean Square   F       Sig.
1 Regression   889.993          5     177.999       1.301   .267
  Residual     18055.359        132   136.783
  Total        18945.352        137
a Predictors: (Constant), YEAR_MEM, EDU, PDP, CONF, YEARS
b Dependent Variable: SCL_SUM
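The summary statistics in the regression output follow from the sums of squares and degrees of freedom in the ANOVA table. A minimal sketch of that arithmetic, with variable names chosen for illustration:

```python
# Sums of squares and degrees of freedom from the regression ANOVA table
ss_regression, ss_total = 889.993, 18945.352
df_regression, df_residual = 5, 132

ss_residual = ss_total - ss_regression

# R squared: share of the scale-score variance explained by the predictors
r_squared = ss_regression / ss_total

# F: mean square for regression over mean square for residual
f_stat = (ss_regression / df_regression) / (ss_residual / df_residual)

# Adjusted R squared penalizes for the number of predictors
n_cases = df_regression + df_residual + 1
adjusted_r_squared = 1 - (1 - r_squared) * (n_cases - 1) / df_residual

print(round(r_squared, 3), round(f_stat, 3), round(adjusted_r_squared, 3))
# 0.047 1.301 0.011
```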
The Scale Score as a Function of Metric Predictor Variables (cont.)
Coefficients
Model          Unstandardized B   Std. Error   Standardized Beta   t       Sig.   95% CI for B: Lower Bound   Upper Bound
1 (Constant)   29.659             5.003                            5.928   .000   19.762                      39.556
  YEARS        -.139              .285         -.060               -.487   .627   -.703                       .426
  EDU          2.078              1.176        .152                1.766   .080   -.249                       4.405
  PDP          2.783              2.869        .085                .970    .334   -2.893                      8.459
  CONF         .898               .615         .146                1.462   .146   -.317                       2.114
  YEAR_MEM     -.192              .330         -.076               -.581   .562   -.844                       .461
a Dependent Variable: SCL_SUM
The Scale Score as a Function of Type of Court
One-way ANOVA
IV = type of court, DV = scale score
A marginally significant difference was found in the mean scale score for training needs
F = 3.344, p = 0.069
Administrators in county courts at law indicated a greater need for training than district court administrators
Univariate Analysis of Variance
Descriptives
Dependent Variable: SCL_SUM
TYPE_C   N     Mean      Std. Deviation   Std. Error   95% CI for Mean: Lower Bound   Upper Bound   Minimum   Maximum
1.00     132   36.0032   11.4240          .9943        34.0361                        37.9702       13.00     61.00
2.00     67    39.1652   11.7273          1.4327       36.3047                        42.0257       18.00     59.00
Total    199   37.0678   11.5946          .8219        35.4469                        38.6886       13.00     61.00
The Scale Score as a Function of Type of Court (cont.)
ANOVA
Dependent Variable: SCL_SUM
                 Sum of Squares   df    Mean Square   F       Sig.
Between Groups   444.347          1     444.347       3.344   .069
Within Groups    26173.536        197   132.861
Total            26617.883        198
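The F ratio in the one-way ANOVA table is the between-groups mean square divided by the within-groups mean square. A minimal sketch reproducing it from the tabled sums of squares, with variable names chosen for illustration:

```python
# Sums of squares and degrees of freedom from the one-way ANOVA table
ss_between, df_between = 444.347, 1
ss_within, df_within = 26173.536, 197

# Mean square = sum of squares divided by its degrees of freedom
ms_between = ss_between / df_between
ms_within = ss_within / df_within

# F ratio for the difference between the two types of court
f_stat = ms_between / ms_within

print(round(ms_within, 3), round(f_stat, 3))  # 132.861 3.344
```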