Appendix N: Between-Subjects Measurement Invariance — Translation Study Analyses

In addition to the within-subject invariance approach outlined in Content and DSM Symptom Scales in chapter 13, Translations, a between-subjects approach was used to examine whether ratings from the French (Canada) or Spanish (North American) language versions of the Conners Adult ADHD Rating Scale 2nd Edition (CAARS™ 2) differed from ratings in a demographically matched sample drawn from the English Normative Sample. Individuals were matched on gender, age, race/ethnicity, and education level. Two methods were applied to the between-subjects samples: multiple-group confirmatory factor analysis (CFA) and item response theory (IRT), specifically differential test functioning (DTF). Details on the methodology for these analyses are covered in appendix M, Methods of Evaluating Measurement Bias.

Results

Results of the between-subjects measurement invariance (MI) analyses for the French and Spanish translations are presented in Table N.1 to Table N.4. Overall, the models were found to be invariant between the French and English versions and between the Spanish and English versions of the forms, as evidenced by non-decreasing ΔCFI values and nonsignificant Satorra-Bentler chi-square tests. At some steps of the modeling procedure, partial invariance adjustments were required to obtain nonsignificant chi-square tests, but these adjustments were infrequent and did not compromise the overall comparability of the scales between the language versions (Dimitrov, 2010). The results mirror the findings from the within-subject MI analysis reported in chapter 13, Translations, and provide additional evidence for the validity of the CAARS 2 French (Canada) and Spanish (North American) translations as parallel measures to the English version.
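As an illustration of how the nonsignificant chi-square difference tests can be read, the following minimal sketch converts the Satorra-Bentler difference values reported for the Inattention/Executive Dysfunction scale in Table N.1 into p-values. It assumes that each reported difference statistic is referred to a chi-square distribution with the listed difference in degrees of freedom, and that .05 is the significance level; neither detail is stated in this appendix. The function name `chi2_sf` and the data layout are illustrative only, not part of the CAARS 2 analyses.

```python
import math


def chi2_sf(stat, df):
    """Upper-tail probability P(X >= stat) for a chi-square variable with df degrees of freedom."""
    if stat <= 0:
        return 1.0
    a, x = df / 2.0, stat / 2.0
    log_prefix = -x + a * math.log(x) - math.lgamma(a)
    if x < a + 1.0:
        # Series expansion of the regularized lower incomplete gamma function.
        term = 1.0 / a
        total = term
        n = a
        for _ in range(1000):
            n += 1.0
            term *= x / n
            total += term
            if abs(term) < abs(total) * 1e-15:
                break
        return 1.0 - total * math.exp(log_prefix)
    # Continued fraction (modified Lentz method) for the upper incomplete gamma function.
    tiny = 1e-300
    b = x + 1.0 - a
    c = 1.0 / tiny
    d = 1.0 / b
    h = d
    for i in range(1, 1000):
        an = -i * (i - a)
        b += 2.0
        d = an * d + b
        if abs(d) < tiny:
            d = tiny
        c = b + an / c
        if abs(c) < tiny:
            c = tiny
        d = 1.0 / d
        delta = d * c
        h *= delta
        if abs(delta - 1.0) < 1e-15:
            break
    return h * math.exp(log_prefix)


# Model degrees of freedom and Satorra-Bentler difference values copied from
# Table N.1 (Self-Report, French vs. English, Inattention/Executive Dysfunction).
model_df = [("Configural", 810), ("Weak", 839), ("Strong", 868), ("Strict", 891)]
sb_difference = {"Weak": 22.72, "Strong": 36.25, "Strict": 33.90}

ALPHA = 0.05  # assumed significance level, not stated in the appendix

p_values = {}
for (_, previous_df), (model, current_df) in zip(model_df, model_df[1:]):
    df_difference = current_df - previous_df  # matches the df column next to the test
    p_values[model] = chi2_sf(sb_difference[model], df_difference)
    verdict = "nonsignificant" if p_values[model] > ALPHA else "significant"
    print(f"{model}: df difference = {df_difference}, p = {p_values[model]:.3f} ({verdict})")
```

Under these assumptions, each step from the configural to the strict model yields a p-value above .05, which is the pattern the text describes as support for invariance.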

Table N.1. Between-Subjects Measurement Invariance by Language Version (French vs. English): CAARS 2 Self-Report

Scale Invariance Model χ2 df RMSEA CFI TLI SRMR Satorra-Bentler χ2 df ΔCFI
Inattention/Executive Dysfunction Configural 1733.34*** 810 .066 .954 .951 .070 --
Weak 1756.82*** 839 .065 .955 .953 .070 22.72 29 .002
Strong 1755.77*** 868 .063 .956 .956 .071 36.25 29 .003
Strict 1760.44*** 891 .061 .957 .958 .071 33.90 23 .002
Hyperactivity Configural 568.80*** 130 .114 .937 .925 .087 --
Weak 583.69*** 143 .109 .937 .931 .087 9.01 13 .006
Strong 570.87*** 155 .101 .941 .940 .087 15.73 12 .009
Strict 563.76*** 162 .098 .943 .945 .087 11.10 7 .005
Impulsivity Configural 454.06*** 130 .098 .924 .909 .079 --
Weak 475.31*** 143 .094 .922 .915 .079 18.18 13 .006
Strong 465.68*** 155 .088 .927 .927 .079 17.98 12 .012
Strict 468.31*** 167 .083 .930 .934 .081 21.00 12 .012
Emotional Dysregulation Configural 174.254*** 54 .092 .986 .981 .045 --
Weak 180.742*** 63 .085 .986 .984 .045 5.37 9 .003
Strong 178.013*** 71 .076 .987 .987 .045 8.29 8 .003
Strict 186.31*** 77 .074 .987 .988 .045 11.51 6 .001
Negative Self-Concept Configural 141.43*** 28 .125 .981 .971 .044 --
Weak 154.51*** 35 .114 .980 .976 .044 7.87 7 .005
Strong 145.50*** 41 .099 .982 .982 .045 3.44 6 .006
Strict 150.33*** 44 .096 .982 .983 .045 6.12 3 .001
Note. N = 274 French version; N = 274 English version. RMSEA = Root mean square error of approximation; CFI = Comparative Fit Index; TLI = Tucker-Lewis Index; SRMR = Standardized root mean square residual; ΔCFI = change in CFI. *p < .05, **p < .01, ***p < .001. Exploration of partial invariance models for the Inattention/Executive Dysfunction, Hyperactivity, Emotional Dysregulation, and Negative Self-Concept scales revealed that six, five, three, and two intercepts, respectively, had to be released for the strict invariance hypothesis to hold.

Table N.2. Between-Subjects Measurement Invariance by Language Version (French vs. English): CAARS 2 Observer

Scale Invariance Model χ2 df RMSEA CFI TLI SRMR Satorra-Bentler χ2 df ΔCFI
Inattention/Executive Dysfunction Configural 1255.13*** 810 .054 .975 .974 .065 --
Weak 1279.74*** 839 .052 .976 .975 .065 29.14 29 .001
Strong 1281.81*** 868 .050 .977 .977 .065 22.98 29 .001
Strict 1301.93*** 893 .049 .977 .978 .065 37.15 25 .000
Hyperactivity Configural 510.17*** 130 .124 .933 .920 .101 --
Weak 528.25*** 142 .119 .932 .925 .101 10.45 12 -.001
Strong 532.32*** 153 .114 .933 .932 .101 14.44 11 .001
Strict 523.63*** 165 .107 .937 .940 .101 17.12 12 .004
Impulsivity Configural 393.54*** 130 .103 .955 .946 .074 --
Weak 414.26*** 143 .100 .954 .949 .074 18.93 13 -.001
Strong 410.87*** 155 .093 .956 .956 .075 13.13 12 .002
Strict 412.59*** 165 .089 .958 .960 .075 14.50 10 .002
Emotional Dysregulation Configural 255.60*** 54 .140 .972 .963 .062 --
Weak 265.98*** 63 .130 .972 .968 .062 6.77 9 .000
Strong 261.91*** 71 .119 .974 .973 .062 8.62 8 .002
Strict 262.41*** 78 .111 .974 .976 .062 10.74 7 .000
Negative Self-Concept Configural 68.69*** 28 .087 .986 .978 .048 --
Weak 74.54*** 35 .077 .986 .983 .048 4.97 7 .000
Strong 83.49*** 40 .075 .985 .984 .048 9.73 5 -.001
Strict 91.69*** 46 .072 .984 .985 .048 10.67 6 -.001
Note. N = 195 French version; N = 195 English version. RMSEA = Root mean square error of approximation; CFI = Comparative Fit Index; TLI = Tucker-Lewis Index; SRMR = Standardized root mean square residual; ΔCFI = change in CFI. *p < .05, **p < .01, ***p < .001. Exploration of partial invariance models for the Hyperactivity and Negative Self-Concept scales revealed that one factor loading in each scale had to be released to meet the strong invariance hypothesis. To meet the strict invariance hypothesis, four item intercepts had to be released for the Inattention/Executive Dysfunction scale, two for the Impulsivity scale, and one for the Emotional Dysregulation scale.

Table N.3. Between-Subjects Measurement Invariance by Language Version (Spanish vs. English): CAARS 2 Self-Report

Scale Invariance Model χ2 df RMSEA CFI TLI SRMR Satorra-Bentler χ2 df ΔCFI
Inattention/Executive Dysfunction Configural 1441.51*** 810 .053 .974 .972 .056 --
Weak 1465.66*** 840 .051 .974 .973 .056 30.15 30 .000
Strong 1468.67*** 869 .050 .975 .975 .056 33.77 29 .001
Strict 1475.00*** 896 .048 .976 .977 .056 36.47 27 .001
Hyperactivity Configural 501.77*** 130 .101 .944 .933 .077 --
Weak 514.84*** 141 .097 .944 .938 .077 11.87 11 .000
Strong 510.05*** 153 .091 .946 .945 .077 19.20 12 .002
Strict 499.11*** 163 .086 .949 .951 .078 14.93 10 .003
Impulsivity Configural 363.97*** 130 .080 .956 .947 .066 --
Weak 374.31*** 143 .076 .956 .952 .066 6.97 13 .000
Strong 370.15*** 155 .070 .959 .959 .067 15.56 12 .003
Strict 368.81*** 167 .066 .962 .964 .068 16.43 12 .003
Emotional Dysregulation Configural 218.84*** 54 .104 .969 .959 .054 --
Weak 229.38*** 63 .097 .969 .965 .054 6.90 9 .000
Strong 223.60*** 71 .087 .972 .971 .054 10.49 8 .003
Strict 223.39*** 79 .081 .973 .975 .055 12.17 8 .001
Negative Self-Concept Configural 94.63*** 28 .092 .986 .979 .046 --
Weak 103.54*** 35 .083 .986 .983 .046 6.76 7 .000
Strong 95.02*** 41 .068 .989 .988 .047 5.00 6 .003
Strict 95.84*** 45 .063 .989 .990 .047 4.64 4 .000
Note. N = 283 Spanish version; N = 283 English version. RMSEA = Root mean square error of approximation; CFI = Comparative Fit Index; TLI = Tucker-Lewis Index; SRMR = Standardized root mean square residual; ΔCFI = change in CFI. *p < .05, **p < .01, ***p < .001. Exploration of partial invariance models found that two item intercepts had to be released for the Inattention/Executive Dysfunction, Hyperactivity, and Negative Self-Concept scales for the strict invariance model to hold.

Table N.4. Between-Subjects Measurement Invariance by Language Version (Spanish vs. English): CAARS 2 Observer

Scale Invariance Model χ2 df RMSEA CFI TLI SRMR Satorra-Bentler χ2 df ΔCFI
Inattention/Executive Dysfunction Configural 1366.89*** 810 .055 .980 .979 .056 --
Weak 1394.48*** 840 .054 .980 .980 .056 31.37 30 .000
Strong 1395.46*** 869 .052 .982 .981 .056 22.75 29 .002
Strict 1407.05*** 896 .050 .982 .983 .056 37.69 27 .000
Hyperactivity Configural 445.78*** 130 .103 .965 .958 .073 --
Weak 461.34*** 143 .099 .964 .961 .073 8.59 13 -.001
Strong 455.62*** 154 .093 .966 .966 .074 7.18 11 .002
Strict 450.05*** 165 .087 .968 .970 .074 16.51 11 .002
Impulsivity Configural 225.72*** 130 .057 .989 .987 .047 --
Weak 239.64*** 143 .055 .989 .988 .047 12.88 13 .000
Strong 246.99*** 154 .052 .990 .989 .047 12.40 11 .001
Strict 258.85*** 165 .050 .989 .990 .048 16.46 11 -.001
Emotional Dysregulation Configural 219.12*** 54 .116 .982 .977 .049 --
Weak 231.52*** 63 .109 .982 .979 .049 9.30 9 .000
Strong 233.83*** 71 .101 .983 .982 .049 9.28 8 .001
Strict 239.49*** 77 .096 .983 .984 .049 10.95 6 .000
Negative Self-Concept Configural 87.98*** 28 .097 .972 .959 .059 --
Weak 94.01*** 35 .086 .973 .967 .059 3.34 7 .001
Strong 96.59*** 38 .082 .973 .970 .059 4.33 3 .000
Strict 105.44*** 43 .080 .971 .972 .060 10.21 5 -.002
Note. N = 230 Spanish version; N = 230 English version. RMSEA = Root mean square error of approximation; CFI = Comparative Fit Index; TLI = Tucker-Lewis Index; SRMR = Standardized root mean square residual; ΔCFI = change in CFI. *p < .05, **p < .01, ***p < .001. Exploration of partial invariance models found that one loading had to be released for the Hyperactivity and Impulsivity scales, and three loadings had to be released for the Negative Self-Concept scale, for the strong invariance model to hold. Further, one item intercept had to be released for the Hyperactivity, Impulsivity, and Negative Self-Concept scales, and two item intercepts for the Inattention/Executive Dysfunction and Emotional Dysregulation scales, for strict invariance to hold.

In addition to the MI analyses, DTF was examined for the Content Scales. For both the French versus English and the Spanish versus English comparisons, test characteristic curves for the Content Scales were found to be statistically equivalent, as evidenced by overlapping 95% confidence intervals (see Figure N.1 and Figure N.2 for examples featuring the Inattention/Executive Dysfunction scale). More specifically, considerable overlap occurred in the area approaching and exceeding 1.5 standard deviations above mean theta (see Test Information in chapter 8, Reliability, for more information on how these graphs are interpreted). This pattern of results indicates that the Content Scales function equivalently across the French and English versions and across the Spanish and English versions, a finding further supported by the negligible effect sizes presented in Table N.5 and Table N.6.

Figure N.1. Differential Test Functioning for French and English Language Versions (Inattention/Executive Dysfunction)

a) Self-Report




b) Observer




Figure N.2. Differential Test Functioning for Spanish and English Language Versions (Inattention/Executive Dysfunction)

a) Self-Report




b) Observer




Table N.5. Differential Test Functioning by Language Version (French vs. English)

Scale Self-Report Observer
Inattention/Executive Dysfunction -0.03 -0.05
Hyperactivity -0.11 -0.02
Impulsivity -0.03 -0.03
Emotional Dysregulation 0.00 -0.11
Negative Self-Concept 0.02 0.10
Note. Values presented are expected test score standardized differences (ETSSD). Guidelines for interpretation: small effect size, |ETSSD| ≥ 0.20; medium effect size, |ETSSD| ≥ 0.50; large effect size, |ETSSD| ≥ 0.80. Positive ETSSD values indicate that individuals with equal amounts of the constructs being measured who took the French translation as part of the translation study sample scored higher than individuals who took the English version.

Table N.6. Differential Test Functioning by Language Version (Spanish vs. English)

Scale Self-Report Observer
Inattention/Executive Dysfunction -0.02 0.00
Hyperactivity -0.01 -0.03
Impulsivity -0.07 0.01
Emotional Dysregulation -0.11 0.00
Negative Self-Concept -0.05 0.11
Note. Values presented are expected test score standardized differences (ETSSD). Guidelines for interpretation: small effect size, |ETSSD| ≥ 0.20; medium effect size, |ETSSD| ≥ 0.50; large effect size, |ETSSD| ≥ 0.80. Positive ETSSD values indicate that individuals with equal amounts of the constructs being measured who took the Spanish translation as part of the translation study sample scored higher than individuals who took the English version.
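The interpretation guidelines in the notes to Table N.5 and Table N.6 can be applied mechanically, as in the minimal sketch below. The ETSSD values are copied from the two tables in scale order (Inattention/Executive Dysfunction, Hyperactivity, Impulsivity, Emotional Dysregulation, Negative Self-Concept). The function name `classify_etssd` and the dictionary layout are illustrative, and the label "negligible" for values below the smallest cutoff follows the wording of the surrounding text rather than the table notes.

```python
def classify_etssd(value):
    """Label an ETSSD value using the cutoffs given in the table notes."""
    size = abs(value)
    if size >= 0.80:
        return "large"
    if size >= 0.50:
        return "medium"
    if size >= 0.20:
        return "small"
    return "negligible"


# ETSSD values from Table N.5 (French vs. English) and Table N.6 (Spanish vs. English).
etssd = {
    ("French", "Self-Report"): [-0.03, -0.11, -0.03, 0.00, 0.02],
    ("French", "Observer"): [-0.05, -0.02, -0.03, -0.11, 0.10],
    ("Spanish", "Self-Report"): [-0.02, -0.01, -0.07, -0.11, -0.05],
    ("Spanish", "Observer"): [0.00, -0.03, 0.01, 0.00, 0.11],
}

labels = {key: [classify_etssd(v) for v in values] for key, values in etssd.items()}
largest = max(abs(v) for values in etssd.values() for v in values)

for (language, form), form_labels in labels.items():
    print(f"{language} {form}: {form_labels}")
print(f"Largest absolute ETSSD: {largest:.2f}")
```

Every value in both tables falls below the 0.20 cutoff for a small effect.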

Both the multiple-group CFA and DTF approaches support the invariance of the factor structure for Content Scales on the CAARS 2 French and Spanish versions. This between-subjects approach adds support to the within-subject approach presented in chapter 13, Translations.
