CAARS 2 Manual, Chapter 10: Key Findings
Results presented in this chapter provide strong evidence that the CAARS 2 meets or exceeds the fairness requirements outlined in the Standards for Educational and Psychological Testing (AERA, APA, & NCME, 2014).
Gender. Results provide evidence for equivalent measurement of males and females and support the absence of meaningful gender differences on the CAARS 2 Content Scales. Main results include the following:
- No evidence of measurement bias was found (the factor structure, loadings, thresholds, and intercepts were invariant between males and females).
- No evidence of differential test functioning was found (negligible effect sizes observed: median ETSSD = |0.04|).
- In comparing Content Scale mean scores between males and females, no meaningful gender differences were found on the Self-Report or Observer forms.
Race/Ethnicity. Results within the U.S. sample demonstrate an absence of measurement bias for the CAARS 2 Content Scales across racial/ethnic groups, supporting appropriate use of the measure with racially and ethnically diverse populations. These results include the following:
- No evidence of measurement bias was found between the tested groups (the factor structure, loadings, thresholds, and intercepts were invariant).
- No evidence of differential test functioning was found (negligible effect sizes observed: median ETSSD = |0.06|).
- In comparing mean scores between groups, different patterns were found for the different comparisons.
  - For Self-Report, while no statistically significant differences were found for the White vs. Asian comparisons, there were statistically significant differences for several scales for the White vs. Hispanic and White vs. Black comparisons. For all significant effects, White individuals had slightly higher scores than Hispanic and Black individuals (Cohen’s d ranging from 0.31 to 0.59), indicating slightly more endorsement or greater severity of symptoms for White individuals.
  - For Observers, no meaningful race/ethnicity differences were found for the White vs. Hispanic, White vs. Black, and White vs. Asian comparisons.
Country of Residence. Results revealed no evidence of measurement bias related to country of residence when comparing individuals in the U.S. and Canada on the CAARS 2 Content Scales. Main results include the following:
- No evidence of measurement bias was found (the factor structure, loadings, thresholds, and intercepts were invariant between the U.S. and Canada).
- No evidence of differential test functioning was found (negligible effect sizes observed: median ETSSD = |0.05|).
- In comparing mean scores between countries, no statistically significant differences were found for Observer, and only one scale showed a difference for Self-Report (Hyperactivity); overall, effect sizes were negligible to small (Cohen’s d = |0.01| to |0.41|).
Education level (EL). Results provide evidence for equivalence between individuals with different levels of education on the CAARS 2, as shown by the following findings:
- No evidence of measurement bias was found (the factor structure, loadings, thresholds, and intercepts were invariant between individuals with low EL or high EL).
- No evidence of differential test functioning was found (negligible effect sizes observed: median ETSSD = |0.05|).
- In comparing mean scores between the levels of educational attainment, no statistically significant differences were found for Observer, and only one scale was observed to have a significant but small effect on Self-Report (Emotional Dysregulation; Cohen’s d = |0.26|).
Associated Clinical Concern Items and Impairment & Functional Outcome Items
- In comparing mean scores, no meaningful differences were found across Self-Report and Observer for gender, race/ethnicity, country of residence, or education level (EL); overall, effect sizes were negligible to small (Cohen’s d = 0.00 to |0.17|).
Methods of Evaluating Measurement Bias
Potential measurement bias was assessed by exploring differences between groups with regard to the following demographic characteristics: gender, race/ethnicity, country of residence, and education level (EL). Two main methods of evaluating bias were employed for the CAARS 2 Content Scales: invariance testing and mean group difference tests. All items on the DSM Symptom Scales are represented within the Content Scales, so the DSM Symptom Scales are not independently analyzed in this section. Fairness properties of the CAARS 2 Associated Clinical Concern Items and Impairment & Functional Outcome Items were evaluated by examining observed group differences, as invariance testing (which relies on latent variable modeling) is not applicable to individual item scores. Interested readers can find a description of these methods in Appendix M.
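As a rough illustration of the mean group difference effect sizes reported above, the sketch below computes Cohen's d using the conventional pooled standard deviation. This is the standard textbook formula, not necessarily the exact procedure used for the CAARS 2 (see Appendix M for that); the function name and the sample scores are hypothetical.

```python
import math
from statistics import mean, variance


def cohens_d(group_a, group_b):
    """Standardized mean difference between two independent groups.

    Assumption: the conventional pooled-standard-deviation form of
    Cohen's d. The manual's own computation may differ in detail.
    """
    n_a, n_b = len(group_a), len(group_b)
    # Pool the two sample variances, weighting each by its degrees of freedom.
    pooled_var = (
        (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
    ) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / math.sqrt(pooled_var)


# Hypothetical scores, for illustration only (not CAARS 2 data).
example_d = cohens_d([2, 4, 6], [1, 3, 5])
```

The sign of d depends only on which group is listed first, which is presumably why the chapter reports many values as absolute magnitudes (for example, |0.26|).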