CAARS 2 Manual, Chapter 11: Key Findings
Development and Standardization. The CAARS 2–Short was developed following best practices, as outlined within this chapter. Items from the full-length CAARS 2 Self-Report and Observer forms were selected to ensure optimal construct coverage while retaining psychometric properties similar to the full-length forms. Shortened scales were developed for the Inattention/Executive Dysfunction, Hyperactivity, Impulsivity, Emotional Dysregulation, and Negative Self-Concept scales by examining item performance, interrelationships and scale structure, associations with the full-length scales, and relationship with criterion variables. Standardized scores (i.e., T-scores and percentiles) were calculated based on data from the CAARS 2 Normative and ADHD Reference Samples.
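As a rough illustration of what a T-score and a percentile express, the sketch below uses the conventional T-score metric (mean of 50, standard deviation of 10 in the reference group). It assumes a simple linear transformation and a normal distribution; this summary does not describe the actual norming procedure, so the function names and the example norm values (`norm_mean`, `norm_sd`) are hypothetical and not taken from the manual.

```python
from statistics import NormalDist


def linear_t_score(raw: float, norm_mean: float, norm_sd: float) -> float:
    """Convert a raw score to a linear T-score (mean 50, SD 10).

    Assumption: a plain linear transformation against a single reference
    mean and SD; the manual's own norming method may differ.
    """
    if norm_sd <= 0:
        raise ValueError("norm_sd must be positive")
    return 50.0 + 10.0 * (raw - norm_mean) / norm_sd


def percentile_from_t(t_score: float) -> float:
    """Approximate percentile rank of a T-score under a normal distribution."""
    return 100.0 * NormalDist(mu=50.0, sigma=10.0).cdf(t_score)


# Hypothetical norm values, for illustration only.
t = linear_t_score(raw=20, norm_mean=12, norm_sd=4)
print(t, round(percentile_from_t(t), 1))
```

Under these assumptions, a raw score two standard deviations above the reference mean corresponds to a T-score of 70.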
Reliability. Evidence of reliability was established through the following:
- Excellent internal consistency (median omega coefficient in the Normative Sample: Self-Report = .91 and Observer = .93).
- Low standard error of measurement (SEM) values (indicating very little error in the estimated true scores; median SEM in the Normative Samples: Self-Report = 3.06 and Observer = 2.69).
- A high degree of test information (mirroring similar trends seen in the full-length CAARS 2 scales, providing evidence for precision of measurement).
- Statistically significant (p < .001), strong, and positive test-retest reliability across a 2- to 4-week interval (correlation coefficient ranges for Self-Report: r = .78 to .95; Observer: r = .76 to .90).
- Moderate to strong inter-rater reliability between two observers of the same type (e.g., two friends; r = .40 to .69), and weak to moderate relationships between different rater types (e.g., comparing self-report ratings to observer ratings or two different observer types; r = .25 to .48), highlighting the importance of examining information from multiple sources with meaningfully different perspectives.
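The link between the internal consistency and SEM figures above can be sketched with the classical test theory formula SEM = SD × sqrt(1 − reliability). The sketch assumes the T-score metric (SD = 10) and treats the omega coefficient as the reliability estimate. The function names are hypothetical, and the manual may compute SEM differently.

```python
import math


def standard_error_of_measurement(reliability: float, sd: float = 10.0) -> float:
    """Classical test theory SEM: SD * sqrt(1 - reliability).

    Assumption: scores are on the T-score metric (SD = 10) unless another
    SD is supplied.
    """
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability must be between 0 and 1")
    return sd * math.sqrt(1.0 - reliability)


def score_band(score: float, reliability: float, z: float = 1.96,
               sd: float = 10.0) -> tuple[float, float]:
    """Approximate band of +/- z * SEM around an observed score.

    This is a simplification for illustration; it is not a procedure
    taken from the manual.
    """
    sem = standard_error_of_measurement(reliability, sd)
    return (score - z * sem, score + z * sem)


# Median omega values reported above for the Normative Sample.
print(round(standard_error_of_measurement(0.91), 2))  # Self-Report
print(round(standard_error_of_measurement(0.93), 2))  # Observer
print(score_band(65.0, 0.91))
```

With omega = .91 and .93, this gives SEM values of about 3.00 and 2.65, close to the reported medians of 3.06 and 2.69. Exact agreement is not expected, because the median of the per-scale SEM values need not equal the SEM computed from the median omega.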
Validity. Evidence of validity was established through the following:
- A high degree of association between the CAARS 2–Short and the full-length CAARS 2 (median tau: Self-Report = .88, Observer = .87).
- Replication of the internal structure results from the confirmatory factor analyses (CFA) of the full-length CAARS 2 Content scales, which provided evidence to support the structure of the CAARS 2–Short scales (final models for the scales had strong fit statistics for both raters [CFI and TLI ≥ .962; SRMR ≤ .047; RMSEA ≤ .057]).
- Meaningful differences (high degree of criterion-related validity) between average scores of clinical groups on the CAARS 2–Short, such that individuals with ADHD yielded higher scores than individuals from the General Population (median Cohen’s d = 1.08 across all raters and scales) and higher scores than individuals with Depression and/or Anxiety (median Cohen’s d = 0.61 across all raters and scales).
- Moderate to high levels of classification accuracy were demonstrated using scores from the CAARS 2–Short when distinguishing between individuals from the General Population and those diagnosed with ADHD (overall correct classification rates: 89.7% Self-Report, 84.1% Observer).
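The group differences above are reported as Cohen’s d. A minimal sketch of one common form of this statistic (the mean difference divided by the pooled standard deviation) is given below. The scores are made-up illustrative values, not CAARS 2 data, and the manual may use a different variant (for example, one based on adjusted means).

```python
import math
from statistics import mean, variance


def cohens_d(group_a: list[float], group_b: list[float]) -> float:
    """Cohen's d using the pooled standard deviation of two independent groups.

    A positive value means group_a has the higher mean.
    """
    n_a, n_b = len(group_a), len(group_b)
    if n_a < 2 or n_b < 2:
        raise ValueError("each group needs at least two scores")
    # Pooled variance weights each sample variance by its degrees of freedom.
    pooled_var = ((n_a - 1) * variance(group_a) +
                  (n_b - 1) * variance(group_b)) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / math.sqrt(pooled_var)


# Hypothetical T-scores, for illustration only.
adhd_scores = [68.0, 72.0, 65.0, 70.0, 75.0]
general_scores = [50.0, 55.0, 48.0, 52.0, 45.0]
print(round(cohens_d(adhd_scores, general_scores), 2))
```

A value of d = 1.08 means that the two group means differ by slightly more than one pooled standard deviation.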
Fairness. Evidence of fairness, in terms of the absence of meaningful psychometric differences, is provided with regard to gender, race/ethnicity, country of residence, and education level (EL):
- Gender. Results provide evidence for equivalent measurement across ratings of males and females and for the absence of meaningful gender differences. Measurement invariance was supported, there was no evidence of measurement bias (maximum ETSSD = |.06|), and scores for ratings of males and females did not differ statistically significantly on most scales across rater forms, with the exception of slightly higher self-reported ratings of males on Hyperactivity and Impulsivity and slightly higher scores for ratings of females on Negative Self-Concept (Cohen’s d = |0.01| to |0.29|).
- Race/ethnicity. There was no evidence of measurement bias across race/ethnicity (maximum ETSSD = |.12|), and measurement invariance was supported across comparisons (ratings of White individuals compared to ratings of Black and Hispanic individuals). For Asian vs. White comparisons, no meaningful scale score differences were found across all raters. Scores for Hispanic vs. White and Black vs. White individuals did not differ based on Observer ratings, whereas Self-Report ratings yielded small effects on three scales (Cohen’s d = |0.30| to |0.71|), and ratings of White individuals resulted in slightly higher scores than ratings of Hispanic and Black individuals.
- Country of residence. There was no evidence of measurement bias in terms of country of residence when comparing ratings of individuals in the U.S. and Canada. Assumptions of invariance were upheld, there was no evidence of measurement bias (maximum ETSSD = |.06|), and mean score differences were not statistically significant, with negligible to small effect sizes (Cohen’s d = 0.02 to |0.39|).
- Education level (EL). Results provide evidence for equivalence between ratings of individuals with different levels of education. Assumptions of invariance were upheld, there was no evidence of measurement bias (maximum ETSSD = |.07|), and mean score differences were not statistically significant with negligible to small effect sizes for all but one scale (Cohen’s d = 0.00 to |0.32|).