CAARS 2 Manual

Chapter 9: Internal Structure


Internal Structure

The internal structure of the CAARS 2 Content Scales was explored to provide evidence for the validity of the measurement of the intended constructs. The extent to which items interrelate and conform to the theoretical framework can provide evidence for the intended interpretation and use of the instrument (AERA, APA, & NCME, 2014). The structure of the CAARS 2 was examined through confirmatory factor analysis (CFA), and alternative and competing measurement models were tested to determine which model best fit the data for the CAARS 2 Self-Report and Observer.

The underlying relationships among the CAARS 2 Content Scales (i.e., Inattention/Executive Dysfunction, Hyperactivity, Impulsivity, Emotional Dysregulation, and Negative Self-Concept) were inspected to provide evidence concerning the internal structure of the CAARS 2. The nature of the multidimensionality of constructs measured in the CAARS has been the subject of some debate in the literature (e.g., Adler et al., 2017; Martel et al., 2012; Park et al., 2018), specifically regarding the separation or unification of Inattention and Executive Dysfunction, as well as Hyperactivity and Impulsivity. To address these considerations, the models described in Table 9.1 were tested.

The following criteria for goodness-of-fit statistics were used to evaluate these models:

  • Comparative Fit Index (CFI; Bentler, 1990): ≥ .90 for acceptable fit and ≥ .95 for good fit (Hu & Bentler, 1999; McDonald & Ho, 2002).

  • Tucker-Lewis Index (TLI; Tucker & Lewis, 1973): ≥ .90 for acceptable fit and ≥ .95 for good fit (Hu & Bentler, 1999; McDonald & Ho, 2002).

  • Root mean square error of approximation (RMSEA; Browne & Cudeck, 1992): ≤ .08 for acceptable fit and ≤ .06 for good fit.

  • Standardized root mean square residual (SRMR; Bentler, 1995): ≤ .08 for good fit.

CFI and TLI generally range from 0 to 1, with higher values indicating better fit; conversely, RMSEA and SRMR generally range from 0 to 1, with lower values indicating better fit.
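The cutoffs listed above can be expressed as a small lookup. The following is a minimal sketch in Python, not part of the original analysis (which was run in R). The function name `classify_fit` and the label "does not meet cutoff" are illustrative assumptions; the manual defines only "acceptable" and "good".

```python
def classify_fit(cfi, tli, rmsea, srmr):
    """Label each fit index using the cutoffs listed in this chapter.

    The label "does not meet cutoff" is an assumption for values that
    fall outside the ranges the manual defines.
    """
    def higher_is_better(value):
        # CFI and TLI: >= .95 good, >= .90 acceptable
        if value >= 0.95:
            return "good"
        if value >= 0.90:
            return "acceptable"
        return "does not meet cutoff"

    def rmsea_label(value):
        # RMSEA: <= .06 good, <= .08 acceptable
        if value <= 0.06:
            return "good"
        if value <= 0.08:
            return "acceptable"
        return "does not meet cutoff"

    return {
        "CFI": higher_is_better(cfi),
        "TLI": higher_is_better(tli),
        "RMSEA": rmsea_label(rmsea),
        # SRMR: only a "good" cutoff (<= .08) is given
        "SRMR": "good" if srmr <= 0.08 else "does not meet cutoff",
    }


# Self-Report 5-factor values as reported in Table 9.2
print(classify_fit(cfi=0.953, tli=0.951, rmsea=0.047, srmr=0.042))
```

With the Self-Report 5-factor values from Table 9.2, all four indices fall in the "good" range.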

The models were evaluated for statistically significant differences, given their nested structure. A scaled chi-square (χ2) difference test with a conservative significance level of p ≤ .01 was used to compare models, because multiple comparisons were examined and χ2 is known to be sensitive to large sample sizes (Tanaka, 1987). The difference in CFI was also evaluated: CFI had to improve by more than .01 for the difference between models to be considered meaningful (Cheung & Rensvold, 2002). In addition to comparing nested models, the results of each model were evaluated by first examining overall fit indices, and then examining factor loadings and correlations among factors. In this last step, a correlation between factors at or above .95 indicates that the factors are not meaningfully distinct, and parsimony should be favored (i.e., in this case, the selection of the model in which those factors are combined, rather than separated). Confidence intervals around the inter-factor correlations were also examined: an interval that does not include 1 (a perfect correlation, meaning the two factors overlap completely) is understood to indicate distinct constructs (Brown, 2006), so the two factors can be retained separately.
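The two checks on inter-factor correlations described above can be sketched as a single function. This is an illustrative reading of the rule, assuming both conditions must hold for factors to be kept separate. The function name `factors_distinct` and all confidence bounds in the examples are hypothetical, because the manual does not report the interval limits.

```python
def factors_distinct(r, ci_lower, ci_upper, threshold=0.95):
    """Return True when two factors can be treated as separate constructs.

    Assumed reading of the rule in this chapter: the correlation must be
    below the .95 threshold AND its confidence interval must exclude 1.
    """
    ci_includes_one = ci_lower <= 1.0 <= ci_upper
    return r < threshold and not ci_includes_one


# Hypothetical values, for illustration only
print(factors_distinct(r=0.90, ci_lower=0.88, ci_upper=0.92))  # below .95, interval excludes 1
print(factors_distinct(r=0.96, ci_lower=0.93, ci_upper=0.99))  # at or above .95
print(factors_distinct(r=0.94, ci_lower=0.88, ci_upper=1.00))  # interval includes 1
```

Only the first hypothetical pair would be retained as two factors; the other two would favor the more parsimonious, combined model.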

Analyses were conducted with complete cases from Total Samples, including all available data from the clinical and general population groups (N = 2,226 for Self-Report; N = 2,150 for Observer; see Standardization Phase in chapter 6, Development, for details about these samples), using correlated-factor models with robust estimation methods for ordinal items via the lavaan package in R (Rosseel, 2012). As can be seen in Table 9.2, results for these competing models for Self-Report and Observer all demonstrated strong fit and performed similarly to one another. The fit indices met or exceeded typical guidelines for good fit. Model fit improved (i.e., CFI and TLI increased, and SRMR and RMSEA decreased) as more factors were added to the model.

Table 9.2. Fit Indices for Confirmatory Factor Analysis Models: CAARS 2 Content Scales

Form         Model     χ2        df    CFI   TLI   SRMR  RMSEA  RMSEA Confidence Interval
Self-Report  4-factor  15621.08  2478  .949  .947  .044  .049   .049, .050
             5-factor  14321.28  2474  .953  .951  .042  .047   .047, .048
             6-factor  12875.15  2469  .957  .956  .041  .045   .045, .046
Observer     4-factor  17910.77  2478  .940  .938  .051  .051   .050, .051
             5-factor  16013.27  2474  .945  .943  .048  .049   .048, .049
             6-factor  13388.60  2469  .953  .951  .044  .045   .044, .046
Note. CFI = Comparative Fit Index; TLI = Tucker-Lewis Index; SRMR = Standardized root mean square residual; RMSEA = Root mean square error of approximation. All χ2 values are significant, p < .01.

All three models displayed good fit to the data, so additional analyses were conducted to determine which model had the best fit across CAARS 2 Self-Report and Observer. Results of the χ2 difference test for nested models can be seen in Table 9.3. All model comparisons displayed statistically significant differences (p < .01), indicating that a greater number of factors did significantly improve fit, yet models with additional factors did not show a meaningful gain in CFI (change in CFI less than or very close to .01 for all comparisons). Therefore, further investigation was warranted to determine the most appropriate model for the data.

Table 9.3. Comparison of Nested Confirmatory Factor Analysis Models: CAARS 2 Content Scales

Form         Models Compared         χ2      df  p      ΔCFI
Self-Report  4-factor vs. 5-factor   108.47  4   < .01  .004
             5-factor vs. 6-factor   135.32  5   < .01  .008
Observer     4-factor vs. 5-factor   72.40   4   < .01  .005
             5-factor vs. 6-factor   132.77  5   < .01  .013
Note. ΔCFI = change in Comparative Fit Index value. All χ2 values are significant, p < .01.
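The two comparison criteria (a significant χ2 difference at p ≤ .01 and a CFI gain of more than .01) can be applied to the rows of Table 9.3 as follows. This is a minimal sketch under a strict reading of both cutoffs. The class and function names are illustrative, and because the table reports only "p < .01", significance is stored as a yes/no flag rather than an exact p value.

```python
from dataclasses import dataclass


@dataclass
class Comparison:
    form: str
    models: str
    p_below_01: bool   # Table 9.3 reports p < .01 for every row
    delta_cfi: float   # ΔCFI as reported in Table 9.3


def meaningful_improvement(c, cfi_cutoff=0.01):
    """Strict reading: significant difference AND ΔCFI greater than .01."""
    return c.p_below_01 and c.delta_cfi > cfi_cutoff


TABLE_9_3 = [
    Comparison("Self-Report", "4-factor vs. 5-factor", True, 0.004),
    Comparison("Self-Report", "5-factor vs. 6-factor", True, 0.008),
    Comparison("Observer", "4-factor vs. 5-factor", True, 0.005),
    Comparison("Observer", "5-factor vs. 6-factor", True, 0.013),
]

for c in TABLE_9_3:
    print(f"{c.form:12s} {c.models:22s} meaningful: {meaningful_improvement(c)}")
```

Read strictly, only the Observer 5-factor vs. 6-factor comparison (ΔCFI = .013) clears the CFI cutoff, and only by a small margin. That is why fit statistics alone did not settle the choice of model.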

Inspection of the inter-factor correlations of the models was the next step in this series of analyses. In the 6-factor model, the correlation between Inattention and Executive Dysfunction was at or near the .95 threshold at which factors are no longer considered meaningfully distinct (Self-Report r = .951, Observer r = .942), and the confidence intervals for these estimates, when rounded to three decimals, included a value of 1, indicating possible overlap. Given this finding, the 6-factor model was rejected, as separating Inattention and Executive Dysfunction was not supported by the data. All goodness-of-fit statistics and the χ2 difference tests indicated that the 4-factor model performed worse than the 5-factor model; therefore, the 5-factor model was inspected further. Close examination of the inter-factor correlations of the 5-factor model, as seen in Tables 9.4 and 9.5, indicated that Hyperactivity and Impulsivity were strongly correlated but were not entirely overlapping constructs (Self-Report r = .897, Observer r = .910). In addition, the confidence intervals for these correlations did not include a correlation of 1, providing further evidence that they could be viewed as distinct. Therefore, the 5-factor model was chosen as the best fit for the CAARS 2 Content Scales for both Self-Report and Observer.

Examination of the factor loadings provided additional support for the 5-factor model. All factor loadings were positive, statistically significant, and exceeded a typical minimum threshold (loading ≥ .40; Tabachnick & Fidell, 2007). For Self-Report, loadings ranged from .470 to .951 (median = .799). For Observer, loadings ranged from .519 to .968 (median = .815). The strength of this model provides strong evidence for the structural validity of the CAARS 2 domains.

Table 9.4. Five-Factor Model Inter-Factor Correlations: CAARS 2 Self-Report Content Scales

Scale                                  1     2     3     4     5
1. Inattention/Executive Dysfunction   --    --    --    --    --
2. Hyperactivity                       .818  --    --    --    --
3. Impulsivity                         .877  .910  --    --    --
4. Emotional Dysregulation             .777  .765  .873  --    --
5. Negative Self-Concept               .774  .621  .664  .738  --
Note. N = 2,226. Guidelines for interpreting |r|: very weak < .20, weak = .20 to .39, moderate = .40 to .59, strong = .60 to .79, very strong ≥ .80.

Table 9.5. Five-Factor Model Inter-Factor Correlations: CAARS 2 Observer Content Scales

Scale                                  1     2     3     4     5
1. Inattention/Executive Dysfunction   --    --    --    --    --
2. Hyperactivity                       .778  --    --    --    --
3. Impulsivity                         .838  .897  --    --    --
4. Emotional Dysregulation             .738  .729  .871  --    --
5. Negative Self-Concept               .695  .480  .544  .634  --
Note. N = 2,150. Guidelines for interpreting |r|: very weak < .20, weak = .20 to .39, moderate = .40 to .59, strong = .60 to .79, very strong ≥ .80.
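The interpretation guidelines in the table notes can be applied mechanically. The sketch below is illustrative only: the function name `describe_correlation` is hypothetical, and because the guidelines are stated to two decimals, treating each band as running up to (but not including) the next band's lower bound is an assumption for three-decimal values.

```python
def describe_correlation(r):
    """Map |r| to the verbal labels given in the table notes.

    Assumption: bands are treated as half-open, e.g. strong = [.60, .80).
    """
    size = abs(r)
    if size < 0.20:
        return "very weak"
    if size < 0.40:
        return "weak"
    if size < 0.60:
        return "moderate"
    if size < 0.80:
        return "strong"
    return "very strong"


# Selected values from Table 9.5 (Observer)
for label, r in [
    ("Hyperactivity / Impulsivity", 0.897),
    ("Inattention/Executive Dysfunction / Emotional Dysregulation", 0.738),
    ("Hyperactivity / Negative Self-Concept", 0.480),
]:
    print(f"{label}: r = {r:.3f} ({describe_correlation(r)})")
```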