Chapter 10: Fairness

CAARS 2 Manual

Education Level

An individual’s education level (EL) can sometimes be considered a proxy for, or a contributing factor to, socioeconomic status (SES), which is among the sociocultural variables that may influence the fairness of a test. It was expected that the constructs measured by the CAARS 2 would be independent of EL. To test this hypothesis and ensure the generalizability of scores from the CAARS 2 Content Scales, individuals in the Self-Report and Observer samples reported the EL of the rated individual using one of five options: No high school diploma (EL 1), High school diploma/GED (EL 2), Some college/university or associate degree (EL 3), Bachelor’s degree (EL 4), or Graduate or professional degree (EL 5). More information about the representativeness of these groups can be found in Education Level in chapter 7, Standardization. For the measurement invariance (MI) and differential test functioning (DTF) analyses, EL was re-categorized into two groups comprising individuals without and with post-secondary education (i.e., Group 1 consists of EL 1 and EL 2: N = 710 for Self-Report and N = 796 for Observer; Group 2 consists of EL 3, EL 4, and EL 5: N = 1,515 for Self-Report and N = 1,134 for Observer). Analyses of mean group differences, by contrast, compared all five levels of education.
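
The regrouping described above can be sketched as a simple lookup. This is an illustration only: the names `EL_LABELS` and `binary_el_group` are hypothetical and are not part of the CAARS 2 materials.

```python
# Hypothetical helper: collapse the five reported education levels (EL 1-5)
# into the two groups used for the MI and DTF analyses.
EL_LABELS = {
    1: "No high school diploma",
    2: "High school diploma/GED",
    3: "Some college/university or associate degree",
    4: "Bachelor's degree",
    5: "Graduate or professional degree",
}


def binary_el_group(el: int) -> int:
    """Return 1 for no post-secondary education (EL 1-2), 2 otherwise (EL 3-5)."""
    if el not in EL_LABELS:
        raise ValueError(f"unknown education level: {el}")
    return 1 if el <= 2 else 2


for el, label in EL_LABELS.items():
    print(f"EL {el} ({label}) -> Group {binary_el_group(el)}")
```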

First, differences in the factor structure across the two EL groups were evaluated with MI. As increasingly stringent models were tested at each level, neither the CAARS 2 Self-Report nor the CAARS 2 Observer displayed meaningful deterioration in model fit (see Table 10.17). For Self-Report, some model comparisons were significant using the Satorra-Bentler χ² test (e.g., the intercept model, p < .001); however, the indicators must be considered in tandem, and many other model fit statistics did not show meaningful deterioration of fit (e.g., ∆CFI did not exceed .001 for any comparison). Therefore, the observed changes in model fit were minor and not meaningful, such that invariance between the EL groups on the constructs assessed by the CAARS 2 can reasonably be assumed. These results support the invariance of the CAARS 2 factor model between individuals with and without post-secondary education, meeting the first step of the criteria for establishing unbiased and generalizable use of the test across these populations.

Table 10.17. Measurement Invariance by Education Level: CAARS 2

Form	Model	χ²	df	RMSEA	CFI	TLI	SRMR	Comparison	Satorra-Bentler χ²	∆df	∆CFI
Self-Report	Configural	15029.34***	4948	.043	.963	.961	.046	--
	Weak	15101.93***	5020	.043	.963	.962	.046	configural vs. weak	76.56	72	.000
	Strong	15018.49***	5087	.042	.963	.963	.046	weak vs. strong	85.01	67	.000
	Strict	14984.68***	5154	.041	.964	.964	.046	strong vs. strict	169.72***	67	.001
Observer	Configural	15539.54***	4948	.045	.953	.952	.051	--
	Weak	15627.29***	5020	.044	.953	.952	.051	configural vs. weak	99.73	72	.000
	Strong	15566.46***	5087	.044	.954	.954	.051	weak vs. strong	86.56	67	.001
	Strict	15510.30***	5154	.043	.954	.955	.051	strong vs. strict	183.57	67	.000

Note. N = 710 individuals with high school education or less (EL 1 and EL 2) and N = 1,515 individuals with post-secondary education (EL 3, EL 4, and EL 5) for Self-Report; N = 796 individuals with high school education or less (EL 1 and EL 2) and N = 1,134 individuals with post-secondary education (EL 3, EL 4, and EL 5) for Observer. df = degrees of freedom; RMSEA = Root mean square error of approximation; CFI = Comparative Fit Index; TLI = Tucker-Lewis Index; SRMR = Standardized root mean square residual; ∆CFI = change in CFI. *p < .05, **p < .01, ***p < .001.
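
The step-by-step comparison of nested models can be checked with a few lines of arithmetic. The sketch below recomputes the change in CFI and RMSEA between adjacent models from the values in Table 10.17. The cutoff of .01 for a drop in CFI is an assumption: it is a commonly cited rule of thumb, not a criterion stated in this section, and the function names are hypothetical.

```python
# Fit indices copied from Table 10.17: (model, CFI, RMSEA), ordered from the
# least to the most constrained model.
FIT = {
    "Self-Report": [("Configural", .963, .043), ("Weak", .963, .043),
                    ("Strong", .963, .042), ("Strict", .964, .041)],
    "Observer": [("Configural", .953, .045), ("Weak", .953, .044),
                 ("Strong", .954, .044), ("Strict", .954, .043)],
}

# ASSUMPTION: a drop in CFI of .01 or more is treated as meaningful. This is a
# common guideline, not a value taken from the manual.
DELTA_CFI_CUTOFF = 0.01


def fit_changes(models):
    """Return (comparison, change in CFI, change in RMSEA) for adjacent models."""
    changes = []
    for (prev_name, prev_cfi, prev_rmsea), (name, cfi, rmsea) in zip(models, models[1:]):
        changes.append((f"{prev_name} vs. {name}",
                        round(cfi - prev_cfi, 3),
                        round(rmsea - prev_rmsea, 3)))
    return changes


def shows_meaningful_deterioration(models, cutoff=DELTA_CFI_CUTOFF):
    """True if any step lowers CFI by at least the cutoff."""
    return any(d_cfi <= -cutoff for _, d_cfi, _ in fit_changes(models))


for form, models in FIT.items():
    for comparison, d_cfi, d_rmsea in fit_changes(models):
        print(f"{form}: {comparison}: change in CFI = {d_cfi:+.3f}, "
              f"change in RMSEA = {d_rmsea:+.3f}")
    print(f"{form}: meaningful deterioration? {shows_meaningful_deterioration(models)}")
```

Under this assumed cutoff, neither form shows a meaningful loss of fit, because CFI never decreases from one model to the next.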

Next, differences in the functioning of the CAARS 2 Content scales across the two broad EL groups were explored with DTF. An example of a DTF graph for the comparison of low EL and high EL groups is provided in Figure 10.4. Test functioning curves for each group are depicted with a shaded 95% confidence interval band; the two groups’ curves are almost completely overlapping, demonstrating a lack of differential functioning for the Inattention/Executive Dysfunction scale. Very similar patterns of results were found for all Content scales across both forms. The effect sizes from the DTF analyses are summarized in Table 10.18. Results from both Self-Report and Observer show negligible differences between EL groups (e.g., the maximum absolute expected test score standardized difference, |ETSSD|, was .05). The lack of differential functioning of the CAARS 2 between the EL groups is further evidence of the test’s equivalence across demographic subgroups.

Figure 10.4. Differential Test Functioning by Education Level: Inattention/Executive Dysfunction

Table 10.18. Differential Test Functioning Effect Sizes by Education Level

Scale	Self-Report	Observer
Inattention/Executive Dysfunction	.00	.00
Hyperactivity	-.05	.01
Impulsivity	.01	.00
Emotional Dysregulation	.02	.02
Negative Self-Concept	.00	.00

Note. Values presented are expected test score standardized differences (ETSSD); guidelines for interpreting |ETSSD|: negligible effect size < 0.20; small effect size = 0.20 to 0.49; medium effect size = 0.50 to 0.79; large effect size ≥ 0.80. A positive ETSSD value indicates that higher scores would be expected for individuals without post-secondary education (EL 1 and EL 2) relative to individuals with post-secondary education (EL 3, EL 4, and EL 5) who have the same level of the construct being measured.
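
As an illustration, the interpretation guidelines in the note can be applied to the values in Table 10.18 as follows. The function name `effect_size_label` is hypothetical; the thresholds are those given in the note.

```python
# ETSSD values copied from Table 10.18: scale -> (Self-Report, Observer).
ETSSD = {
    "Inattention/Executive Dysfunction": (.00, .00),
    "Hyperactivity": (-.05, .01),
    "Impulsivity": (.01, .00),
    "Emotional Dysregulation": (.02, .02),
    "Negative Self-Concept": (.00, .00),
}


def effect_size_label(value: float) -> str:
    """Classify an effect size by its absolute value using the note's guidelines."""
    magnitude = abs(value)
    if magnitude < 0.20:
        return "negligible"
    if magnitude < 0.50:
        return "small"
    if magnitude < 0.80:
        return "medium"
    return "large"


for scale, (self_report, observer) in ETSSD.items():
    print(f"{scale}: Self-Report {effect_size_label(self_report)}, "
          f"Observer {effect_size_label(observer)}")

# Largest absolute ETSSD across both forms.
largest = max(abs(v) for pair in ETSSD.values() for v in pair)
print(f"maximum |ETSSD| = {largest:.2f}")
```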

Next, observed mean score group differences were compared using data from the original five EL groups (entire Normative Sample). EL groups were compared via analysis of covariance (ANCOVA), with covariates to statistically control for the effects of other demographic factors (i.e., gender, language[s] spoken, clinical status, race/ethnicity, and age). Significant ANCOVA results (i.e., p < .01) were followed up with Tukey’s Honestly Significant Difference (HSD) post-hoc test to evaluate pairwise comparisons, alongside estimates of effect sizes for the omnibus test and pairwise differences.

Results of the ANCOVAs for Self-Report are provided in Table 10.19a, with effect sizes of the pairwise comparisons between group means provided in Table 10.19b. Corresponding results for Observer are provided in Tables 10.20a and 10.20b. There were no statistically significant effects of EL on the CAARS 2 Observer scales, and effect sizes were negligible to small (partial η² ranged from .00 to .01). A statistically significant difference between EL groups was observed only for the CAARS 2 Self-Report Emotional Dysregulation scale (p = .009), but the size of this effect was not practically significant (partial η² = .01). The post-hoc analysis revealed that for Emotional Dysregulation, the EL 1 group (i.e., individuals without a high school diploma) scored higher than the EL 4 group (i.e., individuals with a Bachelor’s degree); however, the effect size was small (Cohen’s d = 0.26).

Overall, these results support the absence of meaningful differences in the measurement properties of the test across low and high EL groups. Taken together, results from the MI, DTF, and mean group difference analyses indicate that the CAARS 2 Content Scales generalize across EL groups. There was no strong evidence of meaningful differences between the two groups in either latent structure or test functioning, and scores were not meaningfully different across the five education levels, supporting the unbiased use of the CAARS 2 for individuals with a range of educational backgrounds and levels.

Table 10.19a. Group Differences by Education Level: CAARS 2 Self-Report

Scale		EL 1 (N = 127)	EL 2 (N = 378)	EL 3 (N = 385)	EL 4 (N = 281)	EL 5 (N = 149)	F (4, 1298)	p	Partial η²
Inattention/Executive Dysfunction	EMM	57.1	55.4	56.7	56.3	56.4	1.39	.235	.00
Inattention/Executive Dysfunction	SD	11.7	16.0	15.0	13.5	11.3	1.39	.235	.00
Hyperactivity	EMM	55.4	53.1	54.0	53.0	53.4	1.93	.103	.01
Hyperactivity	SD	12.3	16.7	15.7	14.1	11.8	1.93	.103	.01
Impulsivity	EMM	55.5	52.9	54.1	53.6	54.0	1.97	.097	.01
Impulsivity	SD	12.3	16.7	15.7	14.1	11.8	1.97	.097	.01
Emotional Dysregulation	EMM	56.8	54.2	54.3	53.2	53.1	3.40	.009	.01
Emotional Dysregulation	SD	12.4	16.9	15.9	14.2	11.9	3.40	.009	.01
Negative Self-Concept	EMM	55.6	54.6	54.9	54.9	54.0	0.51	.729	.00
Negative Self-Concept	SD	12.2	16.5	15.6	13.9	11.7	0.51	.729	.00

Note. EMM = estimated marginal means. EL = Education level; EL 1 = No high school diploma; EL 2 = High school diploma/GED; EL 3 = Some college or associate degree; EL 4 = Bachelor's degree; EL 5 = Graduate or professional degree. Guidelines for interpreting η²: negligible effect size < .01; small effect size = .01 to .059; medium effect size = .06 to .13; large effect size ≥ .14. EMMs without a common superscript letter differ (p < .01) as per Tukey's HSD post-hoc tests; values with common superscript letters are not significantly different.
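
The partial η² values in Table 10.19a can be approximately recovered from the reported F statistics and degrees of freedom through the standard identity partial η² = (F × df_effect) / (F × df_effect + df_error). The sketch below applies it; the function name is hypothetical, and the manual's values were presumably computed from the underlying sums of squares, so agreement is expected only up to rounding.

```python
# F statistics copied from Table 10.19a (Self-Report), all with df = (4, 1298).
F_VALUES = {
    "Inattention/Executive Dysfunction": 1.39,
    "Hyperactivity": 1.93,
    "Impulsivity": 1.97,
    "Emotional Dysregulation": 3.40,
    "Negative Self-Concept": 0.51,
}
DF_EFFECT, DF_ERROR = 4, 1298


def partial_eta_squared(f_value: float, df_effect: int, df_error: int) -> float:
    """Recover partial eta squared from an F statistic and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


for scale, f_value in F_VALUES.items():
    eta = partial_eta_squared(f_value, DF_EFFECT, DF_ERROR)
    print(f"{scale}: F = {f_value:.2f}, partial eta squared = {eta:.2f}")
```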

Table 10.19b. Group Differences by Education Level: CAARS 2 Self-Report Effect Sizes

Scale	EL 1 vs. EL 2	EL 1 vs. EL 3	EL 1 vs. EL 4	EL 1 vs. EL 5	EL 2 vs. EL 3	EL 2 vs. EL 4	EL 2 vs. EL 5	EL 3 vs. EL 4	EL 3 vs. EL 5	EL 4 vs. EL 5
Inattention/Executive Dysfunction	0.11	0.02	0.06	0.06	-0.09	-0.06	-0.07	0.03	0.03	-0.01
Hyperactivity	0.15	0.09	0.18	0.17	-0.06	0.01	-0.02	0.07	0.04	-0.03
Impulsivity	0.16	0.09	0.14	0.12	-0.07	-0.05	-0.07	0.03	0.01	-0.03
Emotional Dysregulation	0.16	0.17	0.26	0.30	-0.01	0.06	0.07	0.07	0.08	0.00
Negative Self-Concept	0.06	0.05	0.05	0.13	-0.02	-0.02	0.04	0.00	0.06	0.07

Note. EL = Education level; EL 1 = No high school diploma; EL 2 = High school diploma/GED; EL 3 = Some college or associate degree; EL 4 = Bachelor's degree; EL 5 = Graduate or professional degree. Values presented are Cohen's d effect size estimates; guidelines for interpreting Cohen's |d|: negligible effect size < 0.20; small effect size = 0.20 to 0.49; medium effect size = 0.50 to 0.79; large effect size ≥ 0.80. A positive Cohen's d value indicates that individuals in the first group had higher scores than individuals in the second group.
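
For illustration, a pairwise Cohen's d can be approximated from the estimated marginal means, standard deviations, and group sizes in Table 10.19a. The sketch assumes the pooled-standard-deviation form of Cohen's d; this section does not state the exact formula used, so the function `cohens_d` is a hypothetical reconstruction rather than the manual's procedure.

```python
import math


def cohens_d(mean_1: float, sd_1: float, n_1: int,
             mean_2: float, sd_2: float, n_2: int) -> float:
    """Cohen's d using the pooled standard deviation of two groups.

    ASSUMPTION: the pooled-SD formula is used here for illustration only.
    """
    pooled_variance = ((n_1 - 1) * sd_1 ** 2 + (n_2 - 1) * sd_2 ** 2) / (n_1 + n_2 - 2)
    return (mean_1 - mean_2) / math.sqrt(pooled_variance)


# Self-Report Emotional Dysregulation, EL 1 vs. EL 4 (values from Table 10.19a).
d = cohens_d(56.8, 12.4, 127, 53.2, 14.2, 281)
print(f"EL 1 vs. EL 4: d = {d:.2f}")
```

A positive result means the first group (here EL 1) had the higher mean, matching the sign convention in the note above.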

Table 10.20a. Group Differences by Education Level: CAARS 2 Observer

Scale		EL 1 (N = 130)	EL 2 (N = 386)	EL 3 (N = 380)	EL 4 (N = 268)	EL 5 (N = 156)	F (4, 1303)	p	Partial η²
Inattention/Executive Dysfunction	EMM	56.2	53.7	53.9	53.1	53.4	2.24	.063	.01
Inattention/Executive Dysfunction	SD	12.6	17.1	16.0	14.2	11.9	2.24	.063	.01
Hyperactivity	EMM	53.4	50.9	51.4	51.3	51.9	1.50	.200	.00
Hyperactivity	SD	13.0	17.6	16.4	14.6	12.3	1.50	.200	.00
Impulsivity	EMM	53.3	50.6	50.7	49.9	51.0	2.53	.039	.01
Impulsivity	SD	13.0	17.6	16.5	14.6	12.3	2.53	.039	.01
Emotional Dysregulation	EMM	54.6	51.9	51.8	50.9	52.5	3.06	.016	.01
Emotional Dysregulation	SD	13.0	17.6	16.4	14.6	12.3	3.06	.016	.01
Negative Self-Concept	EMM	57.1	54.3	55.2	54.1	54.0	2.92	.020	.01
Negative Self-Concept	SD	12.3	16.7	15.5	13.8	11.6	2.92	.020	.01

Note. EMM = estimated marginal means. EL = Education level; EL 1 = No high school diploma; EL 2 = High school diploma/GED; EL 3 = Some college or associate degree; EL 4 = Bachelor's degree; EL 5 = Graduate or professional degree. Guidelines for interpreting η²: negligible effect size < .01; small effect size = .01 to .059; medium effect size = .06 to .13; large effect size ≥ .14.

Table 10.20b. Group Differences by Education Level: CAARS 2 Observer Effect Sizes

Scale	EL 1 vs. EL 2	EL 1 vs. EL 3	EL 1 vs. EL 4	EL 1 vs. EL 5	EL 2 vs. EL 3	EL 2 vs. EL 4	EL 2 vs. EL 5	EL 3 vs. EL 4	EL 3 vs. EL 5	EL 4 vs. EL 5
Inattention/Executive Dysfunction	0.15	0.15	0.22	0.23	-0.01	0.04	0.02	0.05	0.03	-0.02
Hyperactivity	0.15	0.12	0.15	0.12	-0.03	-0.02	-0.06	0.01	-0.03	-0.05
Impulsivity	0.16	0.17	0.24	0.18	0.00	0.04	-0.03	0.05	-0.03	-0.08
Emotional Dysregulation	0.17	0.18	0.27	0.17	0.00	0.06	-0.04	0.06	-0.05	-0.12
Negative Self-Concept	0.18	0.13	0.23	0.26	-0.06	0.01	0.02	0.08	0.09	0.01
