CAARS 2 Manual

Chapter 10: Gender


Gender

Gender, for the purposes of these fairness-related analyses, is defined as the rated individual’s gender identity. Analyses were conducted to compare males and females on the CAARS 2 in terms of MI, DTF, and mean group differences. The very small sample size for individuals who are non-binary or indicated “Other” for gender (Self-Report N = 11; Observer N = 5) did not allow for meaningful testing. Therefore, when assessing invariance by gender, only males (N = 1,028 for Self-Report; N = 1,021 for Observer) and females (N = 1,186 for Self-Report; N = 1,123 for Observer) were included.

Invariance between males and females for the CAARS 2 was first explored via MI analyses (see Table 10.1). For both Self-Report and Observer, some models had a significant Satorra-Bentler χ² test (e.g., the Strict models); however, many other fit statistics, such as CFI and SRMR, showed no decline, which suggests that invariance was upheld. As more constraints were added throughout the process of testing MI, model fit did not change in a meaningful way, indicating that the factor model is invariant between males and females.

In addition to MI results, DTF analyses were conducted to explore the invariance of the CAARS 2 for males and females through a different framework. An example of a DTF graph for both Self-Report and Observer is provided in Figure 10.1. Test functioning for males and females is depicted, along with a shaded band that displays a 95% confidence interval. The two groups' curves overlap almost completely, demonstrating a lack of difference for the Inattention/Executive Dysfunction scale.

The effect sizes of the DTF analyses for all CAARS 2 Content Scales, as measured by the ETSSD, are summarized in Table 10.2. Negligible differences (i.e., |ETSSD| ≤ .04) between males and females were found across the CAARS 2 Content Scales and across both forms, demonstrating invariance by gender.

Figure 10.1. Differential Test Functioning by Gender: Inattention/Executive Dysfunction

Table 10.2. Differential Test Functioning Effect Sizes by Gender

Scale                              Self-Report  Observer
Inattention/Executive Dysfunction  .03          .02
Hyperactivity                      .00          .02
Impulsivity                        .00          .03
Emotional Dysregulation            -.04         .03
Negative Self-Concept              .03          .02
Note. Values presented are expected test score standardized differences (ETSSD); guidelines for interpreting |ETSSD| values: negligible effect size < 0.20; small effect size = 0.20 to 0.49; medium effect size = 0.50 to 0.79; large effect size ≥ 0.80. Positive ETSSD values indicate that females would receive higher scores than males who had the same level of the construct being measured.
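
The interpretation guidelines in the note above can be applied mechanically. The following is a minimal sketch, not part of the CAARS 2 scoring software: the function name `effect_size_label` and the dictionary layout are illustrative assumptions, and the cutoffs treat each band as ending just below the next one (e.g., anything under 0.50 that is not negligible is labeled small).

```python
def effect_size_label(value: float) -> str:
    """Label an effect size by its absolute value.

    Cutoffs follow the guidelines in the Table 10.2 note; the function
    itself is a hypothetical helper written for illustration.
    """
    magnitude = abs(value)
    if magnitude < 0.20:
        return "negligible"
    if magnitude < 0.50:
        return "small"
    if magnitude < 0.80:
        return "medium"
    return "large"


# ETSSD values copied from Table 10.2 as (Self-Report, Observer).
etssd = {
    "Inattention/Executive Dysfunction": (0.03, 0.02),
    "Hyperactivity": (0.00, 0.02),
    "Impulsivity": (0.00, 0.03),
    "Emotional Dysregulation": (-0.04, 0.03),
    "Negative Self-Concept": (0.03, 0.02),
}

# Label every value and find the largest absolute ETSSD across both forms.
labels = {
    scale: tuple(effect_size_label(v) for v in values)
    for scale, values in etssd.items()
}
largest = max(abs(v) for values in etssd.values() for v in values)

for scale, (self_report, observer) in labels.items():
    print(f"{scale}: Self-Report {self_report}, Observer {observer}")
print(f"Largest |ETSSD|: {largest:.2f}")
```

Every value in the table falls well below the 0.20 cutoff, which is why all scales are described as showing negligible differences.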

To examine observed group differences between genders, matched samples of males and females were selected at random from the Normative Samples. Individuals were matched by education level (EL), language(s) spoken, clinical status, race/ethnicity, and age. The demographic characteristics of the rated individuals in the matched samples (and their raters, where applicable) are presented in appendix J.

The matched samples of males and females were then compared for significant differences in mean scores. Results of the ANOVAs and descriptive statistics for each scale are presented in Tables 10.3 and 10.4. For Self-Report, there were statistically significant effects for the Inattention/Executive Dysfunction, Hyperactivity, and Impulsivity scales, wherein males yielded slightly higher scores than females; however, the effect sizes were negligible to small (Cohen's |d| = 0.18 to 0.28). For Observer, the only statistically significant effect was on the Negative Self-Concept scale, wherein ratings of females yielded slightly higher scores than ratings of males; however, the effect size was small (Cohen's |d| = 0.24). The effect of gender on the remaining scale scores was not statistically significant.

Overall, these results support the absence of meaningful gender differences. Taken together, results from the MI, DTF, and mean group difference analyses indicate psychometric equivalence between males and females for the CAARS 2 Content Scales. There was no strong evidence of meaningful differences in either latent structure or test functioning between the two gender groups, and scores for males and females were not meaningfully different. Although results of these analyses suggest that use of Combined Gender as the primary reference group is appropriate, there may be instances where specific comparisons to a gender group are desired. Gender Specific and Combined Gender normative scoring options are available; please see Scoring and Report Options in chapter 3, Administration and Scoring, for details.

Table 10.3. Group Differences by Gender (Male vs. Female): CAARS 2 Self-Report

Scale                              Statistic  Male (N = 463)  Female (N = 463)  Cohen's d  F(1, 924)  p       η²
Inattention/Executive Dysfunction  M          50.7            48.6              0.21       10.33      .001    .01
                                   SD         9.9             9.3
Hyperactivity                      M          50.5            48.8              0.18       7.75       .005    .01
                                   SD         10.0            9.3
Impulsivity                        M          51.0            48.3              0.28       17.94      < .001  .02
                                   SD         9.9             9.4
Emotional Dysregulation            M          50.1            49.1              0.10       2.28       .131    .00
                                   SD         9.7             9.6
Negative Self-Concept              M          49.8            49.8              -0.01      0.01       .917    .00
                                   SD         10.2            9.6
Note. Guidelines for interpreting η²: negligible effect size < .01; small effect size = .01 to .059; medium effect size = .06 to .13; large effect size ≥ .14. Guidelines for interpreting Cohen's |d|: negligible effect size < 0.20; small effect size = 0.20 to 0.49; medium effect size = 0.50 to 0.79; large effect size ≥ 0.80. A positive Cohen's d value indicates higher scores for males than females.
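
As a rough check on how the effect sizes in Table 10.3 relate to the reported means, standard deviations, and F statistics, the sketch below recomputes them from the published summary values. The manual does not state which formulas were used, so two standard ones are assumed here: Cohen's d with a pooled standard deviation for equal-sized groups, and η² recovered from F and its degrees of freedom for a one-way ANOVA. The function names are illustrative. Because the inputs are rounded to one decimal place, the results can differ from the tabled values in the second decimal.

```python
import math


def cohens_d(mean_a: float, sd_a: float, mean_b: float, sd_b: float) -> float:
    """Cohen's d for two equal-sized groups, using the pooled SD (assumed formula)."""
    pooled_sd = math.sqrt((sd_a ** 2 + sd_b ** 2) / 2)
    return (mean_a - mean_b) / pooled_sd


def eta_squared(f_value: float, df_effect: int, df_error: int) -> float:
    """Eta squared for a one-way ANOVA, recovered from F and its degrees of freedom."""
    explained = f_value * df_effect
    return explained / (explained + df_error)


# Summary values from Table 10.3 (Self-Report, N = 463 per group):
# scale -> (male M, male SD, female M, female SD, F(1, 924))
rows = {
    "Inattention/Executive Dysfunction": (50.7, 9.9, 48.6, 9.3, 10.33),
    "Hyperactivity": (50.5, 10.0, 48.8, 9.3, 7.75),
    "Impulsivity": (51.0, 9.9, 48.3, 9.4, 17.94),
}

results = {}
for scale, (m_male, sd_male, m_female, sd_female, f_value) in rows.items():
    # Male minus female, so a positive d means higher scores for males.
    d = cohens_d(m_male, sd_male, m_female, sd_female)
    eta2 = eta_squared(f_value, df_effect=1, df_error=924)
    results[scale] = (d, eta2)
    print(f"{scale}: d = {d:.2f}, eta squared = {eta2:.3f}")
```

Passing the male values first matches the sign convention in the table note, where a positive d indicates higher scores for males than females.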

Table 10.4. Group Differences by Gender (Male vs. Female): CAARS 2 Observer

Scale                              Statistic  Male (N = 444)  Female (N = 444)  Cohen's d  F(1, 886)  p       η²
Inattention/Executive Dysfunction  M          49.8            49.0              0.08       1.27       .260    .00
                                   SD         9.9             9.8
Hyperactivity                      M          49.5            49.4              0.00       0.00       .974    .00
                                   SD         9.2             10.4
Impulsivity                        M          49.9            49.3              0.06       0.69       .407    .00
                                   SD         9.6             10.3
Emotional Dysregulation            M          49.8            49.9              -0.01      0.03       .874    .00
                                   SD         10.1            10.3
Negative Self-Concept              M          48.5            50.9              -0.24      12.98      < .001  .01
                                   SD         9.1             10.4
Note. Guidelines for interpreting η²: negligible effect size < .01; small effect size = .01 to .059; medium effect size = .06 to .13; large effect size ≥ .14. Guidelines for interpreting Cohen's |d|: negligible effect size < 0.20; small effect size = 0.20 to 0.49; medium effect size = 0.50 to 0.79; large effect size ≥ 0.80. A positive Cohen's d value indicates higher scores for males than females.