CAARS 2 Manual

Chapter 12: Key Findings


Development and Standardization. The CAARS 2–ADHD Index was developed using a modern statistical approach, namely gradient-boosting machine learning (GBM; Friedman, 2001) algorithms, to select the items that most effectively distinguish between ratings of individuals with and without ADHD. Through an iterative process, 12 items were selected from the full-length CAARS 2 for the Self-Report ADHD Index, and 12 items were independently selected for the Observer ADHD Index. Raw scores were calculated based on the relative importance of the items on the ADHD Index, and the ratio of the distributions of scores for individuals with and without ADHD was used to derive a probability score. The probability score communicates the likelihood that a given score reflects the score profile of a person diagnosed with ADHD.
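To illustrate how a ratio of two score distributions can yield a probability score, the following is a minimal sketch and not the procedure used to build the CAARS 2–ADHD Index. It assumes discrete raw scores, equal weighting of the two groups, and add-one smoothing; the function name `probability_score` and the sample data are hypothetical.

```python
from collections import Counter


def probability_score(raw_score, adhd_scores, non_adhd_scores, smoothing=1.0):
    """Estimate how likely a raw score is to come from the ADHD reference group.

    Assumptions (not taken from the manual): raw scores are discrete,
    both groups are weighted equally, and additive smoothing is applied
    so that unseen scores do not produce a division by zero.
    """
    # All distinct score values observed, plus the score being evaluated.
    support = set(adhd_scores) | set(non_adhd_scores) | {raw_score}
    k = len(support)

    adhd_counts = Counter(adhd_scores)
    non_counts = Counter(non_adhd_scores)

    # Smoothed relative frequency of this raw score in each group.
    p_adhd = (adhd_counts[raw_score] + smoothing) / (len(adhd_scores) + smoothing * k)
    p_non = (non_counts[raw_score] + smoothing) / (len(non_adhd_scores) + smoothing * k)

    # Share of the combined density that belongs to the ADHD group.
    return p_adhd / (p_adhd + p_non)


# Made-up reference samples, for illustration only.
adhd_sample = [8, 9, 9, 10, 10, 10]
non_adhd_sample = [1, 1, 2, 2, 2, 3]
print(probability_score(10, adhd_sample, non_adhd_sample))
```

A score that is common in the ADHD sample and rare in the comparison sample produces a value near 1, while a score seen equally often in both produces .50.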

Reliability. Evidence of reliability was established through the following:

  • Excellent internal consistency (median omega coefficient in the Normative Sample: Self-Report = .90 and Observer = .90).

  • Low standard error of measurement (SEM) values (indicating very little error in the estimated true scores; median SEM in the Normative Samples: Self-Report = 3.16 and Observer = 3.18).

  • A high degree of test information (indicating high precision of measurement).

  • Statistically significant (p < .001), strong, and positive test-retest reliability across a 2- to 4-week interval (correlation coefficient for Self-Report: r = .84; Observer: r = .86).

  • High inter-rater reliability between two observers of the same type (e.g., two friends; r = .65), and moderate relationships between different rater types (e.g., comparing self-report ratings to observer ratings or two different observer types; r = .54), highlighting the importance of examining information from multiple sources with meaningfully different perspectives.
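The relationship between a reliability coefficient and the SEM can be sketched with the classical test theory formula SEM = SD × √(1 − reliability). The example below assumes a score metric with a standard deviation of 10, which this section does not state; under that assumption, a reliability of .90 gives an SEM of about 3.16. The manual's values may have been derived differently (for example, as medians across subgroups), so this is an illustration only, and the function name is hypothetical.

```python
import math


def standard_error_of_measurement(sd, reliability):
    """Classical test theory SEM: sd * sqrt(1 - reliability)."""
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability must be between 0 and 1")
    return sd * math.sqrt(1.0 - reliability)


# Assumed SD of 10 (not stated in the text); reliability of .90 from the text.
sem = standard_error_of_measurement(sd=10.0, reliability=0.90)
print(round(sem, 2))  # 3.16
```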

Validity. Evidence of validity was established through

  • the ability of the ADHD Index to correctly classify individuals with ADHD versus those without ADHD (the overall correct classification rate was 93.1% for Self-Report and 85.5% for Observer).
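An overall correct classification rate is conventionally the proportion of all cases that are assigned to the correct group. The sketch below shows that calculation with made-up counts; the counts and the function name are hypothetical and are not taken from the CAARS 2 samples.

```python
def overall_correct_classification_rate(true_pos, true_neg, false_pos, false_neg):
    """Proportion of all cases classified correctly.

    true_pos:  individuals with ADHD classified as ADHD
    true_neg:  individuals without ADHD classified as non-ADHD
    false_pos: individuals without ADHD classified as ADHD
    false_neg: individuals with ADHD classified as non-ADHD
    """
    total = true_pos + true_neg + false_pos + false_neg
    if total == 0:
        raise ValueError("at least one case is required")
    return (true_pos + true_neg) / total


# Illustrative counts only: 186 of 200 cases classified correctly.
rate = overall_correct_classification_rate(
    true_pos=90, true_neg=96, false_pos=4, false_neg=10
)
print(f"{rate:.1%}")
```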

Fairness. Evidence of fairness, in terms of the absence of meaningful psychometric bias, is provided with regard to gender, race/ethnicity, country of residence, and education level (EL) in the following:

  • Gender. There was no evidence of measurement bias related to gender for Self-Report or Observer (maximum ETSSD = |0.01|). Negligible to small differences in probability scores were found (maximum Cliff’s d = .04) when comparing ratings of males and females.

  • Race/ethnicity. There was no evidence of measurement bias due to race/ethnicity (maximum ETSSD = |.06|) when comparing ratings of White individuals to ratings of Black and Hispanic individuals. There were negligible differences in probability scores for all group comparisons analyzed across both raters (White vs. Hispanic, White vs. Black, White vs. Asian; maximum Cliff’s d = |.14|).

  • Country of residence. Ratings of individuals from the U.S. and Canada displayed equivalent measurement on both Self-Report and Observer forms (maximum ETSSD = |0.01|). Negligible differences in probability scores were found (maximum Cliff’s d = .05) when comparing scores of individuals from the U.S. and Canada.

  • Education level (EL). There was no evidence of measurement bias based on education level for Self-Report or Observer (maximum ETSSD = |0.02|) when comparing ratings of individuals with high and low EL. Negligible differences in probability scores were found (maximum Cliff’s d = .08) between different levels of education.
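Cliff’s d, the effect size reported for the group comparisons above, is the proportion of cross-group score pairs in which the first group's score is higher minus the proportion in which it is lower. The following is a minimal sketch of that standard definition using made-up probability scores; it is not the analysis code behind the reported values, and the function name is hypothetical.

```python
def cliffs_delta(group_a, group_b):
    """Cliff's d: P(a > b) - P(a < b) over all cross-group pairs.

    Ranges from -1 (every score in group_a is lower) to +1 (every
    score in group_a is higher); 0 indicates complete overlap.
    """
    if not group_a or not group_b:
        raise ValueError("both groups must contain at least one score")
    greater = sum(1 for a in group_a for b in group_b if a > b)
    less = sum(1 for a in group_a for b in group_b if a < b)
    return (greater - less) / (len(group_a) * len(group_b))


# Made-up probability scores for two groups, for illustration only.
group_1 = [0.2, 0.5, 0.7]
group_2 = [0.3, 0.5, 0.6]
print(cliffs_delta(group_1, group_2))
```

Because the statistic depends only on the ordering of scores, it suits bounded, non-normal measures such as probability scores.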