CAARS 2 Manual

Chapter 13: French (Canada) Translation Study


Sample

A total of 307 Canadians completed the CAARS 2 Self-Report, and 215 Canadians completed the CAARS 2 Observer. Data were cleaned prior to analysis, based on data quality metrics, including indicators of careless responding (e.g., random responding) and response acquiescence (e.g., an irregular number of consecutive item responses). Data for 33 individuals were removed from the Self-Report sample and those for 20 individuals were removed from the Observer sample. The final samples included 274 individuals for Self-Report and 195 for Observer. Refer to appendix J for the demographic characteristics for the individuals being rated and for the Observers.

Reliability

The reliability of the French version of the CAARS 2 was assessed via internal consistency estimates and test information functions (see Internal Consistency and Standard Error of Measurement and Test Information in chapter 8, Reliability, for more details). Coefficients alpha and omega were used as estimates of internal consistency, and information functions were generated using the mirt package in R (Chalmers, 2012).

Internal Consistency

Table 13.1 presents alpha and omega coefficients for the French version of the CAARS 2. Internal consistency estimates were excellent for both the Self-Report and Observer forms; the median coefficient omega value across scales was .94 for the Self-Report (ranging from .92 to .97) and .96 for the Observer (ranging from .89 to .98). These estimates are comparable to those found for the English version from the Normative Samples (see Tables 8.1a, 8.1b, 8.2a, and 8.2b in chapter 8, Reliability), as well as to values derived from the English version completed by individuals in the current sample (those estimates are not reported because all scales were within .01 of each other across language versions). Overall, results show that the French version of the CAARS 2 is internally consistent and comparable to the English version.
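As a rough illustration of the alpha coefficient discussed above, the following sketch computes coefficient alpha from a respondents-by-items score matrix. The function name and the toy data are hypothetical, and this is not the procedure used to produce the estimates in this manual; coefficient omega, which requires a fitted factor model, is not shown.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Coefficient alpha for a list of respondents, each a list of k item scores.

    Illustrative sketch only: assumes complete data and at least two items.
    """
    k = len(scores[0])
    if k < 2 or len(scores) < 2:
        raise ValueError("need at least two items and two respondents")
    # Sample variance of each item (column) across respondents.
    item_variances = [variance(column) for column in zip(*scores)]
    # Sample variance of the total (summed) score.
    total_variance = variance([sum(row) for row in scores])
    if total_variance == 0:
        raise ValueError("total scores have zero variance")
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)


# Hypothetical toy data: 4 respondents x 3 items on a 0-3 rating scale.
toy_scores = [
    [0, 1, 0],
    [1, 1, 2],
    [2, 3, 2],
    [3, 3, 3],
]
alpha = cronbach_alpha(toy_scores)
```

Values closer to 1 indicate that the items vary together, which is the property summarized by the coefficients in Table 13.1.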

Test Information

Figure 13.1 shows the test information functions for the CAARS 2 by Content and DSM Symptom Scale. As can be seen in the figure, the French versions of the scales display high information across the relevant range of the ability scale (approaching and exceeding 1.5 standard deviations above the mean). Further, the peaks of the information curves are broad with a wide area beneath them, implying the precision of measurement remains consistent across the relevant range of the scales. The peaks of the information functions are also equal to or greater than a value of 10, indicating very high precision and excellent reliability for the French version of the CAARS 2. These information functions are similar in overall shape and magnitude to the English version (see Test Information in chapter 8, Reliability).
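To make the link between information and precision concrete, the sketch below converts a test information value into a standard error and an approximate reliability. It relies on two assumptions that are not stated in this chapter: the standard error is taken as the reciprocal square root of the information, and the latent trait is scaled to a variance of 1, so that reliability is approximately 1 minus the squared standard error. The function names are hypothetical.

```python
import math


def standard_error(information):
    """Standard error of the trait estimate at a point with the given information."""
    if information <= 0:
        raise ValueError("information must be positive")
    return 1.0 / math.sqrt(information)


def approx_reliability(information):
    """Approximate reliability, assuming the latent trait has variance 1."""
    return 1.0 - standard_error(information) ** 2


# An information value of 10, the threshold mentioned in the text:
se_at_10 = standard_error(10)            # about 0.316
reliability_at_10 = approx_reliability(10)  # about 0.90
```

Under these assumptions, information of 10 or more corresponds to a standard error of roughly 0.32 or less on the standardized trait scale.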

Validity

An important goal of this study was to ensure that the CAARS 2 item content was parallel across the English and French versions of the forms. As described in Creation of Translated Forms, a cultural translation was used to ensure the meaning of each item was captured in the translation, rather than just the literal wording. The expectation was that the French translated items should perform similarly to their English counterparts. Evidence of validity was explored for the Content Scales in terms of consistency in the factor structure across languages (tested via measurement invariance [MI] methods). Given the within-subjects design (in which individuals completed the CAARS 2 in both languages), a within-subjects MI approach was taken (Liu et al., 2017). Results are also available for a between-subjects approach involving the Normative Samples (see appendix N). After invariance was established, correlations were computed, and mean scale scores were compared for both the Content and DSM Symptom Scales to examine whether scores were consistent between the French and English versions. The CAARS 2–ADHD Index, Associated Clinical Concern Items, Impairment & Functional Outcome Items, and Validity Scales were also analyzed for differences between the French and English versions. Results for all analyses are reported in the following sections.

Content and DSM Symptom Scales

The factor structure of the Content Scales was compared via a within-subjects MI approach (note that this differs from the methodology described in appendix M), in accordance with recommendations from Liu et al. (2017). Conducting MI with this approach involves testing the following four steps, in order:

  1. Configural Invariance: Language versions have the same factor structure.

  2. Weak Invariance: Language versions have the same factor structure and factor loadings.

  3. Strong Invariance: Language versions have the same factor structure, factor loadings, and item thresholds.

  4. Strict Invariance: Language versions have the same factor structure, factor loadings, item thresholds, and item residuals.
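The four steps listed above are cumulative: each step keeps every equality constraint from the previous step and adds one more. A minimal sketch of that nesting follows; the data structure and function name are hypothetical and are not taken from any modeling software.

```python
# Order of the invariance steps and the constraint each one adds.
STEPS = ["configural", "weak", "strong", "strict"]
ADDED_CONSTRAINT = {
    "configural": "factor structure",
    "weak": "factor loadings",
    "strong": "item thresholds",
    "strict": "item residuals",
}


def constraints_for(step):
    """Return the set of parameters held equal across languages at a given step."""
    position = STEPS.index(step)
    return {ADDED_CONSTRAINT[name] for name in STEPS[: position + 1]}


for name in STEPS:
    print(name, sorted(constraints_for(name)))
```

Because the models are nested in this way, each step can be compared with the one before it using a chi-square difference test.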

Results of the MI investigations are presented in Table 13.2 and Table 13.3. Overall, the CAARS 2 Content Scales were found to be invariant across the French and English versions, as evidenced by non-decreasing CFI values and nonsignificant Satorra-Bentler chi-square tests (Satorra & Bentler, 2001). As part of the modeling procedure, some steps required partial invariance adjustments to result in nonsignificant chi-square tests, but the adjustments were infrequent and did not compromise the overall comparability of the scales between the two languages (Dimitrov, 2010). The results provide strong evidence for the validity of the CAARS 2 French translation as a parallel measure to the English version.
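The CFI comparison across steps can be sketched as follows, using the rounded CFI values reported for the Self-Report Hyperactivity scale in Table 13.2. The tolerance of .01 for a drop in CFI is an assumption based on a commonly cited rule of thumb; this chapter does not state a cutoff. Because the sketch works from rounded values, its differences may not match the ΔCFI column exactly for every scale.

```python
def cfi_changes(cfi_values):
    """Change in CFI from each invariance step to the next (rounded to 3 decimals)."""
    return [round(later - earlier, 3) for earlier, later in zip(cfi_values, cfi_values[1:])]


def cfi_supports_invariance(cfi_values, max_drop=0.01):
    """True if CFI never drops by more than max_drop between steps.

    max_drop is an assumed tolerance, not a value taken from this chapter.
    """
    return all(change >= -max_drop for change in cfi_changes(cfi_values))


# Configural, Weak, Strong, Strict CFI for Self-Report Hyperactivity (Table 13.2).
hyperactivity_cfi = [0.965, 0.966, 0.968, 0.970]
changes = cfi_changes(hyperactivity_cfi)
supported = cfi_supports_invariance(hyperactivity_cfi)
```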

Table 13.2. Within-Subjects Measurement Invariance by Language Version (French vs. English): CAARS 2 Self-Report

| Scale | Invariance Model | χ2 | df | RMSEA | CFI | TLI | SRMR | Satorra-Bentler χ2 | Satorra-Bentler df | ΔCFI |
|---|---|---|---|---|---|---|---|---|---|---|
| Inattention/Executive Dysfunction | Configural | 2666.97*** | 1679 | .046 | .960 | .957 | .070 | -- | -- | -- |
| | Weak | 2689.54*** | 1708 | .046 | .961 | .958 | .070 | 41.63 | 29 | .000 |
| | Strong | 2733.17*** | 1766 | .045 | .961 | .960 | .070 | 69.16 | 58 | .000 |
| | Strict | 2622.17*** | 1794 | .041 | .966 | .967 | .072 | 35.25 | 28 | .005 |
| Hyperactivity | Configural | 633.34*** | 285 | .067 | .965 | .958 | .070 | -- | -- | -- |
| | Weak | 632.81*** | 297 | .064 | .966 | .961 | .070 | 7.81 | 12 | .001 |
| | Strong | 642.28*** | 322 | .060 | .968 | .966 | .070 | 31.82 | 25 | .002 |
| | Strict | 623.12*** | 334 | .056 | .970 | .970 | .073 | 20.12 | 12 | .002 |
| Impulsivity | Configural | 688.05*** | 285 | .072 | .948 | .938 | .073 | -- | -- | -- |
| | Weak | 686.96*** | 296 | .070 | .950 | .942 | .073 | 7.66 | 11 | .002 |
| | Strong | 698.19*** | 322 | .065 | .952 | .949 | .073 | 29.12 | 26 | .002 |
| | Strict | 676.44*** | 335 | .061 | .956 | .956 | .077 | 20.73 | 13 | .004 |
| Emotional Dysregulation | Configural | 283.10*** | 125 | .068 | .983 | .978 | .046 | -- | -- | -- |
| | Weak | 288.52*** | 133 | .065 | .984 | .980 | .046 | 10.10 | 8 | .001 |
| | Strong | 291.61*** | 150 | .059 | .985 | .984 | .047 | 14.06 | 17 | .001 |
| | Strict | 270.69*** | 159 | .051 | .988 | .988 | .049 | 9.18 | 9 | .003 |
| Negative Self-Concept | Configural | 247.35*** | 69 | .097 | .981 | .972 | .050 | -- | -- | -- |
| | Weak | 259.48*** | 75 | .095 | .980 | .973 | .050 | 12.04 | 6 | .000 |
| | Strong | 270.82*** | 88 | .087 | .980 | .977 | .050 | 11.03 | 13 | .000 |
| | Strict | 254.37*** | 94 | .079 | .981 | .981 | .052 | 10.17 | 6 | .001 |
Note. N = 274 French version; N = 274 English version. RMSEA = Root mean square error of approximation; CFI = Comparative Fit Index; TLI = Tucker-Lewis Index; SRMR = Standardized root mean square residual; ∆CFI = change in CFI. *p < .05, **p < .01, ***p < .001. Exploration of partial invariance models at the Strict invariance step for the Inattention/Executive Dysfunction, Hyperactivity, and Negative Self-Concept scales showed that releasing one item residual (Hyperactivity and Negative Self-Concept) or two item residuals (Inattention/Executive Dysfunction) was sufficient to achieve partial invariance.

Table 13.3. Within-Subjects Measurement Invariance by Language Version (French vs. English): CAARS 2 Observer

| Scale | Invariance Model | χ2 | df | RMSEA | CFI | TLI | SRMR | Satorra-Bentler χ2 | Satorra-Bentler df | ΔCFI |
|---|---|---|---|---|---|---|---|---|---|---|
| Inattention/Executive Dysfunction | Configural | 2325.64*** | 1679 | .045 | .975 | .973 | .065 | -- | -- | -- |
| | Weak | 2347.82*** | 1708 | .044 | .975 | .973 | .065 | 30.09 | 29 | .000 |
| | Strong | 2389.34*** | 1766 | .043 | .976 | .975 | .065 | 58.90 | 58 | .001 |
| | Strict | 2312.57*** | 1796 | .039 | .980 | .980 | .070 | 39.41 | 30 | .004 |
| Hyperactivity | Configural | 531.95*** | 285 | .067 | .972 | .967 | .068 | -- | -- | -- |
| | Weak | 539.93*** | 297 | .065 | .973 | .969 | .068 | 9.86 | 12 | .000 |
| | Strong | 558.69*** | 321 | .062 | .973 | .972 | .069 | 25.66 | 24 | .001 |
| | Strict | 548.94*** | 333 | .058 | .976 | .975 | .073 | 16.52 | 12 | .003 |
| Impulsivity | Configural | 610.89*** | 285 | .077 | .966 | .960 | .071 | -- | -- | -- |
| | Weak | 623.59*** | 297 | .075 | .966 | .961 | .071 | 18.22 | 12 | .000 |
| | Strong | 640.95*** | 321 | .072 | .967 | .965 | .072 | 24.81 | 24 | .001 |
| | Strict | 628.81*** | 334 | .067 | .970 | .969 | .074 | 22.36 | 13 | .003 |
| Emotional Dysregulation | Configural | 301.95*** | 125 | .085 | .986 | .981 | .048 | -- | -- | -- |
| | Weak | 305.21*** | 133 | .082 | .986 | .983 | .048 | 6.21 | 8 | .000 |
| | Strong | 320.59*** | 150 | .077 | .986 | .985 | .048 | 24.89 | 17 | .000 |
| | Strict | 287.61*** | 159 | .065 | .990 | .989 | .052 | 10.78 | 9 | .003 |
| Negative Self-Concept | Configural | 107.18*** | 69 | .053 | .992 | .989 | .046 | -- | -- | -- |
| | Weak | 114.93*** | 75 | .052 | .992 | .990 | .046 | 9.58 | 6 | .000 |
| | Strong | 127.57*** | 88 | .048 | .992 | .991 | .048 | 16.37 | 13 | .000 |
| | Strict | 139.46*** | 95 | .049 | .991 | .991 | .055 | 13.84 | 7 | .001 |
Note. N = 195 French version; N = 195 English version. RMSEA = Root mean square error of approximation; CFI = Comparative Fit Index; TLI = Tucker-Lewis Index; SRMR = Standardized root mean square residual; ∆CFI = change in CFI. *p < .05, **p < .01, ***p < .001. Exploration of partial invariance models for the Hyperactivity scale showed that one item residual had to be released to meet the strict invariance hypothesis.

Results from the correlation and mean group difference analyses are presented in Table 13.4 and Table 13.5. Corrected correlations (Sackett et al., 2000) revealed statistically significant and very strong relationships across scales between the French and English versions of the CAARS 2, with a median correlation of .91 for the Self-Report (range .90 to .95) and .89 for the Observer (range .86 to .91). Further, Welch’s paired t-tests (Welch, 1947) showed no statistically significant differences between scales when comparing obtained scores on the two language versions (p > .01); Cohen’s d effect sizes were also negligible (maximum Cohen’s d = 0.14). Taken together, the strong correlations between scales across language versions and the lack of statistical and practical differences in mean scale scores indicate that similar scores can be expected between the French and English versions. These results provide evidence for the validity of the French version of the CAARS 2.
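For readers who want to see the shape of the mean comparison, the sketch below computes a standard paired t statistic and a standardized mean difference from two sets of scores for the same individuals. Several points are assumptions: the function name and toy scores are hypothetical, the effect size shown is the mean difference divided by the standard deviation of the differences (this chapter does not state which variant of Cohen's d was used), and the correction applied to the correlations is not reproduced here.

```python
import math
from statistics import mean, stdev


def paired_t_and_d(french, english):
    """Paired t statistic, degrees of freedom, and standardized mean difference.

    A positive result means higher scores on the French version.
    """
    if len(french) != len(english) or len(french) < 2:
        raise ValueError("need two equally long lists with at least two pairs")
    differences = [f - e for f, e in zip(french, english)]
    n = len(differences)
    sd_diff = stdev(differences)
    if sd_diff == 0:
        raise ValueError("differences have zero variance")
    t_stat = mean(differences) / (sd_diff / math.sqrt(n))
    d = mean(differences) / sd_diff  # assumed effect size variant
    return t_stat, n - 1, d


# Hypothetical T-scores for four individuals on each language version.
t_stat, df, d = paired_t_and_d([50, 52, 48, 51], [49, 50, 49, 50])
```

With this variant of the effect size, t equals d multiplied by the square root of the number of pairs, so a small d can still accompany a moderately large t in a large sample.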

Table 13.4. Correlations and Mean Differences by Language Version (French vs. English): CAARS 2 Self-Report

| Scale | Obtained r | Corrected r | French M | French SD | English M | English SD | Cohen's d | t(273) | p |
|---|---|---|---|---|---|---|---|---|---|
| Inattention/Executive Dysfunction | .91 | .95 | 49.5 | 8.2 | 49.3 | 8.2 | 0.05 | 0.88 | .378 |
| Hyperactivity | .84 | .90 | 47.9 | 8.6 | 48.5 | 8.7 | 0.11 | -1.87 | .063 |
| Impulsivity | .87 | .90 | 48.5 | 9.0 | 48.8 | 9.4 | 0.06 | -0.99 | .323 |
| Emotional Dysregulation | .86 | .90 | 48.1 | 9.1 | 48.4 | 9.1 | 0.05 | -0.90 | .368 |
| Negative Self-Concept | .88 | .91 | 50.4 | 9.2 | 50.1 | 8.9 | 0.07 | 1.13 | .258 |
| DSM ADHD Inattentive Symptoms | .89 | .95 | 49.5 | 8.1 | 49.1 | 7.9 | 0.11 | 1.84 | .067 |
| DSM ADHD Hyperactive/Impulsive Symptoms | .84 | .91 | 48.1 | 8.5 | 48.7 | 8.7 | 0.13 | -2.18 | .030 |
| DSM Total ADHD Symptoms | .89 | .95 | 48.7 | 8.1 | 48.8 | 8.0 | 0.03 | -0.43 | .668 |
Note. N = 274. All r significant, p < .001. Guidelines for interpreting |r|: very weak < .20; weak = .20 to .39; moderate = .40 to .59; strong = .60 to .79; very strong ≥ .80. Guidelines for interpreting Cohen's |d|: negligible effect size < 0.20; small effect size = 0.20 to 0.49; medium effect size = 0.50 to 0.79; large effect size ≥ 0.80. A positive Cohen's d value indicates higher scores for the French version than the English version.

Table 13.5. Correlations and Mean Differences by Form Language (French vs. English): CAARS 2 Observer

| Scale | Obtained r | Corrected r | French M | French SD | English M | English SD | Cohen's d | t(193) | p |
|---|---|---|---|---|---|---|---|---|---|
| Inattention/Executive Dysfunction | .90 | .91 | 49.5 | 9.2 | 49.3 | 10.2 | 0.03 | 0.48 | .635 |
| Hyperactivity | .88 | .87 | 48.9 | 9.9 | 49.3 | 10.5 | 0.08 | -1.13 | .262 |
| Impulsivity | .88 | .88 | 49.6 | 9.8 | 49.3 | 10.7 | 0.05 | 0.69 | .494 |
| Emotional Dysregulation | .91 | .91 | 50.8 | 9.9 | 50.2 | 10.2 | 0.14 | 1.98 | .049 |
| Negative Self-Concept | .84 | .91 | 50.9 | 8.2 | 51.3 | 8.5 | 0.08 | -1.11 | .267 |
| DSM ADHD Inattentive Symptoms | .87 | .89 | 50.3 | 9.3 | 49.9 | 10.0 | 0.09 | 1.20 | .232 |
| DSM ADHD Hyperactive/Impulsive Symptoms | .87 | .86 | 49.6 | 10.0 | 49.6 | 10.6 | 0.00 | -0.04 | .971 |
| DSM Total ADHD Symptoms | .89 | .89 | 50.1 | 9.6 | 49.8 | 10.4 | 0.05 | 0.67 | .503 |
Note. N = 195. All r significant, p < .001. Guidelines for interpreting |r|: very weak < .20; weak = .20 to .39; moderate = .40 to .59; strong = .60 to .79; very strong ≥ .80. Guidelines for interpreting Cohen's |d|: negligible effect size < 0.20; small effect size = 0.20 to 0.49; medium effect size = 0.50 to 0.79; large effect size ≥ 0.80. A positive Cohen's d value indicates higher scores for the French version than the English version.
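
The interpretation guidelines given in the table notes can be expressed as two small lookup functions. This is only a convenience sketch with hypothetical function names; the cut points are those listed in the notes to Table 13.4 and Table 13.5, applied to unrounded values.

```python
def label_correlation(r):
    """Verbal label for |r| following the guidelines in the table notes."""
    size = abs(r)
    if size < 0.20:
        return "very weak"
    if size < 0.40:
        return "weak"
    if size < 0.60:
        return "moderate"
    if size < 0.80:
        return "strong"
    return "very strong"


def label_cohens_d(d):
    """Verbal label for |d| following the guidelines in the table notes."""
    size = abs(d)
    if size < 0.20:
        return "negligible"
    if size < 0.50:
        return "small"
    if size < 0.80:
        return "medium"
    return "large"


print(label_correlation(0.86))  # lowest corrected r in Table 13.5
print(label_cohens_d(0.14))     # largest Cohen's d across both tables
```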

CAARS 2–ADHD Index

The CAARS 2–ADHD Index was examined to confirm that it performed similarly across the French and English versions (see chapter 12, CAARS 2–ADHD Index, for more information on the development, scores, and psychometric properties of the CAARS 2–ADHD Index). The probability scores for the French version of the CAARS 2–ADHD Index were compared to the English version using the Wilcoxon Signed Rank Test (Wilcoxon, 1945). This non-parametric approach was favored because the probability score does not follow a normal distribution. An effect size, r, is also provided, which can be interpreted using the correlation guidelines provided in this chapter (Rosenthal, 1991).

The difference in probability scores between the French and English versions was not statistically significant, and effect sizes were very weak (Self-Report: V = 7029, p = .107, r = -.10; Observer: V = 3252, p = .259, r = -.08). Thus, the CAARS 2–ADHD Index operates similarly in English as it does in French, adding to the validity evidence for the French translation.
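A minimal sketch of the signed-rank statistic and its effect size follows. It makes several assumptions that this chapter does not confirm: V is taken as the sum of the ranks of the positive differences, pairs with a zero difference are dropped before ranking, the z value comes from a normal approximation without a tie or continuity correction, and r is computed as z divided by the square root of the number of pairs. The function names and toy scores are hypothetical.

```python
import math


def signed_rank_v(french, english):
    """Return (V, z, n_used) for paired scores, dropping zero differences."""
    diffs = [f - e for f, e in zip(french, english) if f != e]
    n = len(diffs)
    if n == 0:
        raise ValueError("all paired differences are zero")
    # Rank the absolute differences, giving tied values their average rank.
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    start = 0
    while start < n:
        end = start
        while end + 1 < n and abs(diffs[order[end + 1]]) == abs(diffs[order[start]]):
            end += 1
        average_rank = (start + end) / 2 + 1
        for position in range(start, end + 1):
            ranks[order[position]] = average_rank
        start = end + 1
    v = sum(rank for rank, diff in zip(ranks, diffs) if diff > 0)
    expected = n * (n + 1) / 4
    spread = math.sqrt(n * (n + 1) * (2 * n + 1) / 24)
    return v, (v - expected) / spread, n


def effect_size_r(z, n_pairs):
    """Effect size r = z / sqrt(N); using the number of pairs for N is an assumption."""
    return z / math.sqrt(n_pairs)


# Hypothetical probability scores for four individuals.
v, z, n_used = signed_rank_v([0.30, 0.50, 0.10, 0.40], [0.10, 0.10, 0.20, 0.40])
r = effect_size_r(z, 4)
```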

Associated Clinical Concern Items and Impairment & Functional Outcome Items

The Associated Clinical Concern Items and Impairment & Functional Outcome Items of the CAARS 2 were also examined to ensure language versions operated similarly. To gauge this, the proportion of individuals with concordant item elevations or endorsements across language versions was calculated (see Associated Clinical Concerns: Item Selection and Scoring and Impairment & Functional Outcome Items: Item Selection and Scoring in chapter 6, Development, for more information on how endorsed and elevated responses were determined for items in these scales). Item responses were considered concordant across language versions if an individual’s ratings for a given item were either elevated/endorsed or not elevated/endorsed in both languages. Conversely, if an item was elevated/endorsed in one language version but not the other, the person was considered to have discordant item elevations/endorsements. McNemar’s test (McNemar, 1947) with a continuity correction, which yields a chi-square test statistic, was used to check that item elevations/endorsements were not more frequent on one language version than the other (analyses were conducted with the stats package in R).

Results of the item-level analyses are presented in Table 13.6. For the Associated Clinical Concern Items and the Impairment & Functional Outcome Items, the percentage of individuals with concordant item endorsements or elevations was very high, with agreement above 85% for every item and above 90% for most items on both forms. McNemar’s tests also showed that item elevations/endorsements were not significantly more frequent on one language version than the other (p > .01), with the exception of a single item for Self-Report (though it still displays 89.4% concordance). Taken together, the results demonstrate that elevations and endorsements on the Associated Clinical Concern Items and the Impairment & Functional Outcome Items are highly similar between the French and English versions, supporting the validity of the French translation.
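The concordance percentage and the continuity-corrected McNemar statistic can be sketched from the four cells of a paired 2 x 2 table. The function names and counts below are hypothetical, and this is not the code used for the manual's analyses. The p value uses the fact that, for one degree of freedom, the upper tail of the chi-square distribution equals erfc(sqrt(chi-square / 2)). Returning a statistic of 0 when the two discordant counts are equal is a choice made for this sketch.

```python
import math


def concordance_percent(both, neither, french_only, english_only):
    """Percentage of individuals elevated/endorsed on both versions or on neither."""
    total = both + neither + french_only + english_only
    return 100.0 * (both + neither) / total


def mcnemar_corrected(french_only, english_only):
    """McNemar chi-square (df = 1) with continuity correction, and its p value."""
    if french_only == english_only:
        # No imbalance between discordant cells (sketch convention).
        return 0.0, 1.0
    chi_square = (abs(french_only - english_only) - 1) ** 2 / (french_only + english_only)
    p_value = math.erfc(math.sqrt(chi_square / 2.0))
    return chi_square, p_value


# Hypothetical counts: 40 elevated on both, 200 on neither, 10 French only, 4 English only.
percent = concordance_percent(40, 200, 10, 4)
chi_square, p_value = mcnemar_corrected(10, 4)
```

Only the discordant cells enter the test statistic, which is why an item can show high concordance and still produce a significant result when nearly all disagreements fall in the same direction.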

Table 13.6. Concordance of Item Elevations/Endorsements by Language (French vs. English)

| Item Set | Item Stem | Self-Report % Concordant | Self-Report χ2 | Self-Report p | Observer % Concordant | Observer χ2 | Observer p |
|---|---|---|---|---|---|---|---|
| Associated Clinical Concern Items | Suicidal thoughts/attempts | 97.8 | 1.50 | .221 | 92.3 | 0.00 | 1.00 |
| | Self-Injury | 96.7 | 1.78 | .182 | 94.4 | 0.36 | .546 |
| | Sadness/emptiness* | 90.5 | 0.04 | .845 | 93.3 | 0.00 | 1.00 |
| | Anxiety/Worry | 85.4 | 0.03 | .874 | 87.7 | 0.04 | .838 |
| Impairment & Functional Outcome Items | Bothered by things endorsed on the CAARS 2 | 91.2 | 0.04 | .838 | 90.3 | 0.21 | .646 |
| | Things endorsed on the CAARS 2 interfere with life | 89.4 | 13.79 | < .001 | 88.7 | 3.68 | .055 |
| | Problems in romantic/marital relationship(s) | 92.3 | 0.76 | .383 | 91.8 | 0.56 | .453 |
| | Problems in relationships with family members | 93.8 | 0.00 | 1.00 | 90.8 | 0.06 | .814 |
| | Problems in relationships with friends, coworkers, or neighbors | 96.0 | 0.00 | 1.00 | 91.3 | 0.24 | .628 |
| | Problems at work and/or school | 91.2 | 1.04 | .307 | 92.8 | 0.07 | .789 |
| | Has a harder time with things than other people do | 90.9 | 0.16 | .689 | 86.7 | 3.12 | .078 |
| | Underachiever | 89.1 | 0.00 | 1.00 | 93.3 | 0.00 | 1.00 |
| | Sleep problems | 93.8 | 0.00 | 1.00 | 87.2 | 0.00 | 1.00 |
| | Problems with money management | 92.3 | 1.71 | .190 | 93.3 | 0.31 | .579 |
| | Neglects family or household responsibilities | 90.1 | 1.33 | .248 | 93.3 | 0.00 | 1.00 |
| | Risky driving | 92.0 | 2.23 | .136 | 94.4 | 1.45 | .228 |
| | Problems due to time spent online | 92.0 | 2.23 | .136 | 93.3 | 0.31 | .228 |
Note. The chi-square test statistic and its associated p value are for the McNemar's tests (df = 1).
* The item stem for this Screening Item is Sadness/Emptiness for Self-Report and Sadness for Observer.

Validity Scales

The CAARS 2 Validity Scales were examined to ensure that they operated similarly in the French and English versions. For both the Negative Impression Index and the Inconsistency Index, the proportion of individuals with concordance across the French and English versions for scale elevations (that is, raw scores that exceeded the cut-off) was compared (details provided in Response Style Analysis: Item Selection and Score Creation in chapter 6, Development).

As can be seen in Table 13.7, the proportion of individuals with concordant elevations was very high (above 90%) for both the Negative Impression Index and the Inconsistency Index. Further, McNemar’s tests indicated that scoring above the cut-off on one language version was not statistically significantly more likely than scoring above the cut-off on the other (p > .01). Taken together, both Validity Scales operated similarly in the French and English versions of the CAARS 2, contributing supporting evidence for the validity of the French language version.

Table 13.7. Concordance of the Validity Scales by Language (French vs. English)

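| Scale | Self-Report % Concordant | Self-Report χ2 | Self-Report p | Observer % Concordant | Observer χ2 | Observer p |
|---|---|---|---|---|---|---|
| Negative Impression Index | 92.7 | 0.45 | .502 | 97.9 | 0.00 | 1.00 |
| Inconsistency Index | 93.1 | 0.00 | 1.00 | 91.8 | 0.00 | 1.00 |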
Note. The chi-square test statistic and its associated p value are for the McNemar's tests (df = 1).

Summary

The reliability and validity of the French (Canada) version of the CAARS 2 were examined in a translation study in which individuals completed both the French and the English versions consecutively (with order counterbalanced across individuals). Both the Self-Report and Observer forms displayed excellent internal consistency and high levels of measurement precision for all Content and DSM Symptom Scales, with coefficients and information functions comparable to those of the English version for both the current sample and the normative sample. These results provide strong evidence for the reliability of the French version of the CAARS 2.

Further, the French versions of the Content Scales were shown to be invariant with respect to the English versions, indicating that the measurement models of the French and English Content Scales are statistically similar. Examination of obtained scores showed very strong correlations and no statistically significant mean differences between language versions for the Content Scales and DSM Symptom Scales. Analyses of the CAARS 2–ADHD Index showed no statistically significant difference in probability scores between language versions, and analyses of the Associated Clinical Concern Items, Impairment & Functional Outcome Items, and Validity Scales showed high concordance in item endorsements and in item and scale elevations across language versions. Taken together, these findings provide strong evidence for the validity of the French version of the CAARS 2 and justify expectations that scores generated from the French and English forms should be highly similar.
