CAARS 2 Manual, Chapter 11
Development
The goal of the CAARS 2–Short is to assess core and associated symptoms of ADHD. Therefore, all five CAARS 2 Content Scales were included: Inattention/Executive Dysfunction, Hyperactivity, Impulsivity, Emotional Dysregulation, and Negative Self-Concept. Note that the CAARS 2–Short also includes the Response Style Analysis, ADHD Index, and Additional Questions; these components are parallel to the full-length form and are not analyzed or discussed separately in this chapter. Please see Table 1.1 in chapter 1, Introduction, for a comparison of the content included in the full-length CAARS 2 and CAARS 2–Short. To mitigate the risks to measurement precision, reliability, and validity that can occur with abbreviated versions of scales, recommended practices for developing short forms were followed for the CAARS 2–Short (Emons et al., 2007; Kruyen et al., 2013; Smith et al., 2000; Ziegler et al., 2014).
Samples
The CAARS 2–Short was derived and validated using the CAARS 2 Total Sample (see Table 6.4 in chapter 6, Development). The Total Sample, which included individuals from the general population and from clinical groups, comprised 2,232 individuals aged 18 or older who completed the CAARS 2 Self-Report and 2,150 observers who rated adults aged 18 or older. The Total Sample was used to select and validate the items for the shortened Content Scales. Note that two individuals from the Observer Total Sample were excluded from analyses because they had omitted items that affected analyses relevant to the creation of a shortened form. The samples were split into calibration and validation subsamples (see Tables 11.1a and 11.1b for demographic characteristics of the rated individuals and the raters, respectively).
Table 11.1a. Demographic Characteristics of the Rated Individuals: CAARS 2–Short Calibration and Validation Samples
Demographic | Self-Report | Observer | |||||||
Calibration | Validation | Calibration | Validation | ||||||
N | % | N | % | N | % | N | % | ||
Gender | Male | 544 | 46.3 | 487 | 46.1 | 517 | 47.6 | 504 | 47.4 |
Female | 621 | 52.9 | 569 | 53.8 | 565 | 52.0 | 559 | 52.5 | |
Other | 10 | 0.9 | 1 | 0.1 | 4 | 0.4 | 1 | 0.1 | |
U.S. Race/Ethnicity | Hispanic | 117 | 10.0 | 112 | 10.6 | 126 | 11.6 | 106 | 10.0 |
Asian | 50 | 4.3 | 42 | 4.0 | 34 | 3.1 | 35 | 3.3 | |
Black | 94 | 8.0 | 93 | 8.8 | 97 | 8.9 | 102 | 9.6 | |
White | 693 | 59.0 | 643 | 60.8 | 650 | 59.9 | 630 | 59.2 | |
Other | 28 | 2.4 | 15 | 1.4 | 20 | 1.8 | 21 | 2.0 | |
U.S. Region | Northeast | 165 | 14.0 | 176 | 16.7 | 165 | 15.2 | 172 | 16.2 |
Midwest | 232 | 19.7 | 198 | 18.7 | 219 | 20.2 | 198 | 18.6 | |
South | 382 | 32.5 | 328 | 31.0 | 333 | 30.7 | 328 | 30.8 | |
West | 203 | 17.3 | 203 | 19.2 | 210 | 19.3 | 196 | 18.4 | |
Canadian Region | Central | 124 | 10.6 | 84 | 7.9 | 105 | 9.7 | 111 | 10.4 |
East | 9 | 0.8 | 15 | 1.4 | 11 | 1.0 | 14 | 1.3 | |
West | 60 | 5.1 | 53 | 5.0 | 43 | 4.0 | 45 | 4.2 | |
Canadian Race/Ethnicity | Not a visible minority | 161 | 13.7 | 125 | 11.8 | 129 | 11.9 | 133 | 12.5 |
Visible minority | 32 | 2.7 | 27 | 2.6 | 30 | 2.8 | 37 | 3.5 | |
Education Level | No high school diploma | 87 | 7.4 | 67 | 6.3 | 88 | 8.1 | 85 | 8.0 |
High school diploma/GED | 291 | 24.8 | 268 | 25.4 | 301 | 27.7 | 322 | 30.3 | |
Some college or associate degree | 389 | 33.1 | 369 | 34.9 | 340 | 31.3 | 325 | 30.5 | |
Bachelor's degree | 251 | 21.4 | 222 | 21.0 | 214 | 19.7 | 204 | 19.2 | |
Graduate or professional degree | 157 | 13.4 | 131 | 12.4 | 143 | 13.2 | 128 | 12.0 | |
Diagnosis | ADHD Inattentive | 64 | 5.4 | 50 | 4.7 | 35 | 3.2 | 30 | 2.8 |
ADHD Hyperactive/Impulsive | 0 | 0.0 | 0 | 0.0 | 1 | 0.1 | 8 | 0.8 | |
ADHD Combined | 76 | 6.5 | 55 | 5.2 | 60 | 5.5 | 36 | 3.4 | |
Anxiety | 105 | 8.9 | 86 | 8.1 | 71 | 6.5 | 67 | 6.3 | |
Depression | 90 | 7.7 | 75 | 7.1 | 58 | 5.3 | 61 | 5.7 | |
Other Diagnosis | 69 | 5.9 | 45 | 4.3 | 49 | 4.5 | 44 | 4.1 | |
No Diagnosis | 930 | 79.1 | 863 | 81.6 | 923 | 85.0 | 912 | 85.7 | |
Age in years M (SD) | 47.4 (19.3) | 47.6 (19.3) | 47.7 (19.8) | 47.8 (19.6) | |||||
Total | 1,175 | 100.0 | 1,057 | 100.0 | 1,086 | 100.0 | 1,064 | 100.0 |
Table 11.1b. Demographic Characteristics of Raters: CAARS 2–Short Calibration and Validation Samples
Rater Demographic | Calibration | Validation | |||
N | % | N | % | ||
Gender | Male | 693 | 63.8 | 371 | 34.9 |
Female | 393 | 36.2 | 691 | 64.9 | |
Other | 0 | 0.0 | 2 | 0.2 | |
U.S. Race/Ethnicity | Hispanic | 120 | 11.0 | 118 | 11.1 |
Asian | 25 | 2.3 | 34 | 3.2 | |
Black | 95 | 8.7 | 89 | 8.4 | |
White | 665 | 61.2 | 626 | 58.8 | |
Other | 19 | 1.7 | 23 | 2.2 | |
Canadian Race/Ethnicity | Not a visible minority | 134 | 12.3 | 137 | 12.9 |
Visible minority | 28 | 2.6 | 37 | 3.5 | |
U.S. Region | Northeast | 160 | 14.7 | 168 | 15.8 |
Midwest | 217 | 20.0 | 192 | 18.0 | |
South | 356 | 32.8 | 351 | 33.0 | |
West | 191 | 17.6 | 179 | 16.8 | |
Canadian Region | Central | 109 | 10.0 | 115 | 10.8 |
East | 10 | 0.9 | 15 | 1.4 | |
West | 43 | 4.0 | 44 | 4.1 | |
Education Level | No high school diploma | 38 | 3.5 | 30 | 2.8 |
High school diploma/GED | 257 | 23.7 | 252 | 23.7 | |
Some college or associate degree | 403 | 37.1 | 405 | 38.1 | |
Bachelor's degree | 259 | 23.8 | 260 | 24.4 | |
Graduate or professional degree | 129 | 11.9 | 117 | 11.0 | |
Relation to Individual Being Rated | Spouse | 334 | 30.8 | 286 | 26.9 |
Friend | 257 | 23.7 | 273 | 25.7 | |
Other Family Member | 482 | 44.4 | 491 | 46.1 | |
Other | 13 | 1.2 | 14 | 1.3 | |
Length of Relationship | 1-5 months | 8 | 0.7 | 6 | 0.6 |
6-11 months | 11 | 1.0 | 5 | 0.5 | |
1-3 years | 66 | 6.1 | 74 | 7.0 | |
More than 3 years | 1,001 | 92.2 | 979 | 92.0 | |
How well does the rater know the individual being rated? | Moderately well | 61 | 5.6 | 80 | 7.5 |
Very well | 1,025 | 94.4 | 984 | 92.5 | |
How often does the rater interact with the individual being rated? | Monthly | 64 | 5.9 | 86 | 8.1 |
Weekly | 315 | 29.0 | 292 | 27.4 | |
Daily | 707 | 65.1 | 686 | 64.5 | |
Age in years M (SD) | 43.7 (16.0) | 44.0 (15.5) | |||
Total | 1,086 | 100.0 | 1,064 | 100.0 |
Analyses and Results
Identical procedures were used to develop the CAARS 2–Short for the Self-Report and Observer forms. Consistent with recommended practice in developing shortened forms, both statistical methods and expert judgment were employed to ensure breadth of coverage of the target construct was retained in the shortened forms (Kruyen et al., 2013; Smith et al., 2000; Ziegler et al., 2014). The steps involved in item selection and subsequent validation of the shortened forms were as follows:
Step 1–Core items selected. Five experts in adult ADHD (see Acknowledgements) were asked to identify items from the full-length CAARS 2 that best represented the core construct for each scale. Experts were asked to identify core items for Self-Report and Observer separately. The number of experts who endorsed each item as core was summed, producing a score that ranged from 0 to 5 for each item. Expert consensus on a core item was defined as an item score of 4 or higher, which represented agreement from at least 4 out of 5 experts. All core items were initially included in the shortened scales, though a small subset of core items were later excluded due to statistical considerations as outlined in Step 2.
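The consensus rule in Step 1 can be illustrated with a minimal sketch. The item labels and endorsement patterns below are hypothetical and are not CAARS 2 data; only the 0-to-5 scoring and the cutoff of 4 come from the text.

```python
# Hypothetical endorsements: 1 = expert marked the item as core, 0 = did not.
# Item labels and votes are invented for illustration only.
endorsements = {
    "item_a": [1, 1, 1, 1, 1],
    "item_b": [1, 1, 0, 1, 1],
    "item_c": [1, 0, 0, 1, 1],
    "item_d": [0, 0, 0, 0, 1],
}

CONSENSUS_CUTOFF = 4  # agreement from at least 4 of 5 experts (from the text)


def score_items(votes_by_item):
    """Sum the expert endorsements for each item (possible range 0 to 5)."""
    return {item: sum(votes) for item, votes in votes_by_item.items()}


def select_core(scores, cutoff=CONSENSUS_CUTOFF):
    """Return the items whose summed score meets the consensus cutoff."""
    return sorted(item for item, score in scores.items() if score >= cutoff)


scores = score_items(endorsements)
core = select_core(scores)
print(scores)
print(core)
```

In this toy example, an item endorsed by three experts falls short of consensus and would only enter the short form through the later statistical steps.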
Step 2–Items excluded due to statistical considerations. Items were examined for local dependence (LD) as well as differential item functioning (DIF) across demographic groups (i.e., gender, race/ethnicity, language(s)¹ spoken, and education level [EL]) in the Total Sample. LD is a violation of the IRT assumption that the items in a scale share variance only due to a common factor and are not related to one another in other ways (e.g., a response to one question depends on a response from an earlier question; Embretson & Reise, 2000). DIF is a violation of the IRT assumption that there is no statistical item bias in terms of group differences (Embretson & Reise, 2000). If there was evidence of significant and meaningful LD or DIF, the item was excluded from consideration for the CAARS 2–Short, because including it could mean the scale score would be affected by factors other than the construct being measured: significant DIF would indicate that item responses are unduly influenced by group characteristics, while meaningful LD would suggest that items are related for a reason beyond their shared latent construct. While LD and DIF had negligible impact on the full-length CAARS 2 (see chapter 6, Development, for the item selection procedures that evaluated these same statistics), both can have a larger influence on shorter scales, and therefore more stringent criteria were set for the CAARS 2–Short.
Items with a medium DIF effect size for any of the tested demographic groups were excluded from consideration for the short form. Using this criterion, no items were excluded from the Self-Report form and only one item was excluded from the Observer form (because of a medium effect size for the difference between Hispanic and White individuals).
LD was assessed using (a) residual correlations among items greater than .15, (b) modification indices for 1-factor confirmatory factor analysis (CFA) models for each of the scales to assess residual correlation pairs, and (c) the presence of a significant χ² test (Chen & Thissen, 1997). When LD was detected for an item pair, the item with the better measurement properties overall was considered for inclusion on the shortened forms.
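Criterion (a) above can be sketched as a simple screen over a residual correlation matrix. The matrix and item labels below are hypothetical, and comparing the absolute value against the cutoff is an assumption (the text states only "greater than .15"). Criteria (b) and (c) require a fitted CFA or IRT model and are not shown.

```python
from itertools import combinations

LD_CUTOFF = 0.15  # residual-correlation cutoff named in the text


def flag_ld_pairs(items, residual_corr, cutoff=LD_CUTOFF):
    """Return (item_i, item_j, r) for every pair whose residual correlation
    exceeds the cutoff in absolute value (absolute value is an assumption)."""
    flagged = []
    for i, j in combinations(range(len(items)), 2):
        r = residual_corr[i][j]
        if abs(r) > cutoff:
            flagged.append((items[i], items[j], r))
    return flagged


# Hypothetical residual correlations left over after fitting a 1-factor model.
items = ["item_a", "item_b", "item_c"]
residual_corr = [
    [0.00, 0.21, 0.03],
    [0.21, 0.00, -0.08],
    [0.03, -0.08, 0.00],
]

flagged = flag_ld_pairs(items, residual_corr)
print(flagged)
```

A flagged pair would then be resolved as described above, by keeping whichever item has the better overall measurement properties.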
Step 3–Remaining items selected. Item selection was done using the calibration sample, by systematically adding items one at a time to the core set of items for each scale, based on the following considerations:
- Statistical Properties. Additional items were added based on item discrimination, precision of measurement, and ability to discriminate between the General Population and ADHD samples.
  - Item discrimination assesses an item’s ability to distinguish individuals at low versus high levels of the trait. Item discrimination was measured using the slope parameter of each item from an IRT model. Higher values (e.g., > .75) were favored, as they indicate better discrimination (Embretson & Reise, 2000).
  - Precision of measurement is inversely related to the amount of error, so that an item with low error has high precision. Precision of measurement for items was assessed using item information curves (IICs). An IIC graphically shows precision of measurement across the range of the construct being measured, also known as theta. Precision at or above 1.5 SD from the average level of the construct was targeted, to best capture both subclinical and clinical levels of the construct. Greater amounts of information indicate higher precision of measurement and lower standard error (more details can be found in Test Information in chapter 8, Reliability).
  - Cliff’s delta (Cliff’s d; Cliff, 1993) was employed to examine how well each item distinguished between the General Population and ADHD samples. Cliff’s d is a measure of effect size used for non-parametric data. Items with higher effect sizes were preferred as they indicate better discrimination between groups.
- Expert ratings. When items had similar statistics, the item with the higher expert rating was retained.
- Content represented. Many scales assess different content areas or facets within the construct. For example, for the full-length CAARS 2 Hyperactivity scale, 61% of the items assess behavioral aspects of hyperactivity, 31% assess verbal hyperactivity, and 8% assess both behavioral and verbal aspects. A similar ratio of items was retained for the shortened form, and across both raters, to ensure proportional coverage of all facets of the construct measured.
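Of the statistics listed above, Cliff's delta is the simplest to show directly. The sketch below uses the standard definition (the proportion of cross-group pairs in which the first group scores higher, minus the proportion in which it scores lower). The response values are hypothetical, and confidence intervals are not computed.

```python
def cliffs_delta(group_a, group_b):
    """Cliff's delta: P(a > b) - P(a < b) over all cross-group pairs.

    Ranges from -1 to 1; 0 means the two distributions overlap completely.
    """
    greater = sum(1 for a in group_a for b in group_b if a > b)
    less = sum(1 for a in group_a for b in group_b if a < b)
    return (greater - less) / (len(group_a) * len(group_b))


# Hypothetical item responses on a 0-3 rating scale (not CAARS 2 data).
adhd_responses = [3, 2, 3, 1]
general_population_responses = [0, 1, 1, 2]

delta = cliffs_delta(adhd_responses, general_population_responses)
print(delta)
```

Because the statistic depends only on the ordering of responses, it suits ordinal item ratings, which is consistent with its description above as an effect size for non-parametric data.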
Note that the CAARS 2–Short Self-Report and Observer were developed separately; while they both cover the same core content areas, they differ at the item-level. Experts identified different items as core for the different rater types, and statistical analysis dictated empirical selection of certain items for the Self-Report and other items for the Observer form. As a result, the CAARS 2–Short Self-Report and Observer items are overlapping, but not completely aligned.
Step 4–Alternate shortened versions compared. The development team set a minimum and maximum length for each scale on the short form (see Table 11.2). The minimum was the smallest number of items that would still allow for reasonable breadth of coverage; the maximum was approximately two-thirds of the full-length scale. For example, Inattention/Executive Dysfunction is the longest scale, with the most content to cover, and therefore required more items than other scales (as seen in Table 11.3). Starting with the minimum length, alternate-length short forms were created sequentially by adding one item at a time; for example, a 7-item version was compared to an 8-item version that differed only by one additional item. This approach enabled testing for the ideal length that balanced efficiency with reliability and validity (Smith et al., 2000).
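The sequential construction of alternate-length forms can be sketched as follows. The function name, item labels, and candidate ranking are hypothetical. The only behavior taken from the text is that each longer form adds exactly one item to the previous one, between a set minimum and maximum length.

```python
def nested_forms(core_items, ranked_candidates, min_len, max_len):
    """Build alternate-length forms in which each form extends the previous
    one by a single item, starting from the core set.

    ranked_candidates is assumed to be ordered from most to least preferred.
    """
    if len(core_items) > min_len:
        raise ValueError("core set is longer than the minimum form length")
    ordered = list(core_items) + [c for c in ranked_candidates if c not in core_items]
    longest = min(max_len, len(ordered))
    return {n: ordered[:n] for n in range(min_len, longest + 1)}


# Hypothetical scale: three core items plus four ranked candidates.
forms = nested_forms(
    core_items=["core_1", "core_2", "core_3"],
    ranked_candidates=["cand_1", "cand_2", "cand_3", "cand_4"],
    min_len=4,
    max_len=7,
)
for length, form_items in forms.items():
    print(length, form_items)
```

Because the forms are nested, any difference in fit, reliability, or precision between adjacent lengths can be attributed to the single added item.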
The following criteria were used to assess reliability and validity:
- Measurement precision of the scale, with an emphasis on peak precision at 1.5 or 2 standard deviations above the mean for a given construct. Ensuring precision at this range was the focus, as that is typically understood to capture the clinical range of the constructs measured (see also Test Information in chapter 8, Reliability). Information values greater than 10 indicate high precision, values below 10 are moderately precise, and values near 5 are considered adequate (Flannery et al., 1995; Reeve & Fayers, 2005).
- Goodness-of-fit statistics were explored to ensure consistency in the factor structure between shortened and full-length scales. This comparison is helpful for ensuring that construct validity is retained (Rammstedt & Beierlein, 2014) and that all dimensions of the construct are proportionally represented in the short form (Maloney et al., 2011). A detailed discussion of the multiple fit indices considered is provided in Internal Structure in chapter 9, Validity.
- Internal consistency, as measured by alpha and omega, was evaluated (see Internal Consistency in chapter 8, Reliability, for a detailed discussion of these metrics).
- Correlations between raw scores on the shortened scales and the full-length scales were assessed (via Kendall’s tau coefficient, given the non-normality of the distribution of the scales). High correlation coefficients provide evidence that the scales are measuring the same construct. Reliability, validity, and construct coverage were prioritized over correlation between form lengths.
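As an illustration of the internal consistency criterion, coefficient alpha can be computed from a respondents-by-items matrix using its standard formula. The responses below are hypothetical; omega requires a fitted factor model and is not shown.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Coefficient alpha for a list of respondents, each a list of item scores.

    alpha = k / (k - 1) * (1 - sum of item variances / variance of total score)
    """
    k = len(rows[0])
    item_variances = [variance([row[i] for row in rows]) for i in range(k)]
    total_variance = variance([sum(row) for row in rows])
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)


# Hypothetical responses of six people to a 4-item scale rated 0-3.
responses = [
    [0, 1, 0, 1],
    [1, 1, 2, 1],
    [2, 2, 2, 3],
    [3, 2, 3, 3],
    [1, 0, 1, 1],
    [2, 3, 2, 2],
]

alpha = cronbach_alpha(responses)
print(round(alpha, 2))
```

Alpha tends to fall as items are removed, which is one reason the short-form values are compared against the full-length values rather than judged in isolation.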
The statistical properties for each of the alternate versions were evaluated, and results for each were compared against the full-length CAARS 2 as a reference point. In instances where a shorter version performed as well statistically as a version with more items, the version that included the fewest items was favored. The process is illustrated with the CAARS 2–Short Observer Impulsivity scale as an example. As seen in Table 11.3, 4-, 5-, 6-, and 7-item versions of this scale were compared, and the analyses revealed acceptable and similar results for all versions in terms of correlations to the full-length scale, internal consistency, and model fit. However, compared to the other versions, the 6-item version had slightly less desirable fit statistics (higher RMSEA and SRMR and lower CFI and TLI), and the 4-item version had slightly lower internal consistency estimates. The precision of measurement, as seen in Figure 11.1, showed that the 7-item version was the only one to surpass a value of 10. Based on these results, the 7-item version was selected for the CAARS 2–Short Impulsivity scale. This process of comparing various scale lengths for each scale on the CAARS 2–Short was repeated until a final set of items was selected for all scales.
Table 11.3. Comparison of Short Form Options: CAARS 2 Observer Impulsivity Scale
Form | Number of Items | Correlation with Full-Length (τ) | Internal Consistency (α) | Internal Consistency (ω) | χ² | df | CFI | TLI | RMSEA (95% CI) | SRMR | General Population & ADHD Group Differences: Cliff's d (95% CI)
Full-Length | 13 | -- | .91 | .91 | 232.661*** | 65 | .973 | .968 | .072 (.066, .079) | .045 | .60 (.50, .69)
Short Form Option | 7 | .83 | .88 | .88 | 47.454*** | 14 | .988 | .982 | .076 (.062, .090) | .034 | .61 (.51, .70)
Short Form Option | 6 | .81 | .86 | .86 | 41.435*** | 9 | .986 | .977 | .089 (.073, .107) | .036 | .63 (.54, .71)
Short Form Option | 5 | .79 | .85 | .85 | 16.712** | 5 | .993 | .986 | .079 (.057, .103) | .025 | .65 (.55, .72)
Short Form Option | 4 | .77 | .82 | .83 | 6.88 | 2 | .996 | .987 | .085 (.052, .123) | .020 | .65 (.56, .73)
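The df column of Table 11.3 can be checked with a short calculation. The sketch assumes, which the chapter does not state, that each 1-factor model was fit to an item correlation matrix with one free loading per item and no other free structural parameters, so that df equals the number of unique correlations minus the number of loadings.

```python
def one_factor_df(n_items):
    """Degrees of freedom for a 1-factor model under the stated assumption:
    unique correlations p(p-1)/2 minus p free loadings."""
    return n_items * (n_items - 1) // 2 - n_items


# Number of items -> df as reported in Table 11.3.
reported_df = {13: 65, 7: 14, 6: 9, 5: 5, 4: 2}

for n_items, df in reported_df.items():
    print(n_items, one_factor_df(n_items), df)
```

Under that assumption the computed values agree with every reported df, which also shows how little room a 4-item model leaves for detecting misfit (df = 2).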
Step 5–Final short form tested. The final set of items selected for the CAARS 2–Short Content Scales included 37 items each for Self-Report and Observer. Once the final versions for each scale were selected, items were recalibrated using IRT analyses for both the calibration and validation samples. The items selected for the CAARS 2–Short Content Scales, along with the slope (a) and location (b) parameters of the recalibration, can be found in Tables 11.4 and 11.5. Overall, the CAARS 2–Short demonstrated strong item discrimination, with a minimum slope greater than 1.0 for all samples tested. These results suggest that the selected items distinguish well between low and high levels of the construct being measured by each scale.
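To show how a slope (a) and three location parameters (b1 to b3) translate into response probabilities and item information, the sketch below assumes a graded response model with a logistic link. The chapter says only "IRT model", so the specific model is an assumption. The parameter values are the Self-Report calibration estimates reported in Table 11.4 for the first Inattention/Executive Dysfunction item.

```python
import math


def logistic(x):
    return 1.0 / (1.0 + math.exp(-x))


def boundary_probs(theta, a, thresholds):
    """P(response at or above each category), padded with 1 and 0."""
    return [1.0] + [logistic(a * (theta - b)) for b in thresholds] + [0.0]


def category_probs(theta, a, thresholds):
    """Probability of each response category at a given theta."""
    p_star = boundary_probs(theta, a, thresholds)
    return [p_star[k] - p_star[k + 1] for k in range(len(p_star) - 1)]


def item_information(theta, a, thresholds):
    """Item information: sum over categories of (dP_k/dtheta)^2 / P_k."""
    p_star = boundary_probs(theta, a, thresholds)
    info = 0.0
    for k in range(len(p_star) - 1):
        p_k = p_star[k] - p_star[k + 1]
        slope_k = a * (p_star[k] * (1 - p_star[k]) - p_star[k + 1] * (1 - p_star[k + 1]))
        if p_k > 0:
            info += slope_k ** 2 / p_k
    return info


# a, b1, b2, b3 from Table 11.4 (Self-Report, calibration sample).
a, b = 2.47, [-0.07, 1.18, 2.02]

for theta in (0.0, 1.5, 2.0):
    probs = [round(p, 3) for p in category_probs(theta, a, b)]
    print(theta, probs, round(item_information(theta, a, b), 2))
```

Summing item information across the items of a scale gives the test information used elsewhere in this chapter to judge precision at 1.5 to 2 SD above the mean.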
Table 11.4. IRT Parameters: CAARS 2–Short Self-Report
CAARS 2–Short Scale | Item Stem | CAARS 2–Short: Calibration Sample | CAARS 2–Short: Validation Sample | Full-Length CAARS 2: Total Sample | |||||||||
a | b1 | b2 | b3 | a | b1 | b2 | b3 | a | b1 | b2 | b3 | ||
Inattention/Executive Dysfunction | Loses focus in conversations | 2.47 | -0.07 | 1.18 | 2.02 | 2.72 | -0.02 | 1.26 | 2.07 | 3.92 | 0.05 | 0.77 | 1.34 |
Has trouble with multi-step tasks | 2.65 | 0.30 | 1.23 | 2.06 | 2.70 | 0.40 | 1.44 | 2.17 | 3.69 | -0.33 | 0.58 | 1.27 | |
Difficulty prioritizing | 3.24 | 0.17 | 1.09 | 1.78 | 3.14 | 0.26 | 1.19 | 1.79 | 2.72 | 0.01 | 0.99 | 1.84 | |
Has difficulty paying attention to details | 3.06 | 0.24 | 1.29 | 1.99 | 3.07 | 0.37 | 1.35 | 2.18 | 2.69 | -0.05 | 1.01 | 1.78 | |
Difficulty organizing | 2.87 | -0.09 | 0.98 | 1.71 | 2.61 | 0.03 | 1.04 | 1.80 | 2.31 | -0.02 | 1.13 | 2.02 | |
Makes careless mistakes | 2.25 | -0.23 | 1.26 | 2.16 | 2.35 | -0.12 | 1.33 | 2.22 | 2.27 | -0.26 | 0.77 | 1.70 | |
Difficulty planning ahead | 2.31 | 0.26 | 1.26 | 2.19 | 2.18 | 0.34 | 1.29 | 2.08 | 1.67 | 0.41 | 1.71 | 2.84 | |
Misses deadlines | 2.29 | 0.44 | 1.56 | 2.34 | 2.41 | 0.43 | 1.59 | 2.22 | 2.15 | 0.66 | 1.89 | 2.78 | |
Forgets to do things | 2.54 | -0.39 | 1.10 | 1.92 | 2.71 | -0.34 | 1.15 | 1.96 | 2.08 | -0.30 | 1.25 | 2.20 | |
Distracted easily | 3.07 | -0.35 | 0.72 | 1.45 | 2.65 | -0.27 | 0.84 | 1.52 | 3.13 | -0.32 | 0.75 | 1.47 | |
Difficulty following instructions | 3.20 | 0.33 | 1.40 | 2.25 | 3.33 | 0.48 | 1.45 | 2.21 | 2.86 | 0.40 | 1.48 | 2.33 | |
Inattentive | 2.61 | 0.29 | 1.31 | 2.06 | 2.93 | 0.35 | 1.38 | 2.25 | 2.61 | 0.06 | 1.14 | 1.97 | |
Hyperactivity | Distracts others | 1.91 | 0.56 | 1.75 | 2.58 | 1.76 | 0.56 | 1.80 | 2.74 | 1.99 | 0.10 | 1.39 | 2.43 |
Taps hands or feet | 1.60 | -0.10 | 1.02 | 1.80 | 1.50 | -0.06 | 1.09 | 2.00 | 2.60 | -0.06 | 1.22 | 2.06 | |
Feels restless when still | 2.39 | -0.38 | 0.74 | 1.75 | 2.42 | -0.26 | 0.88 | 1.75 | 2.53 | -0.07 | 1.09 | 1.92 | |
Difficulty staying still | 2.91 | 0.01 | 0.96 | 1.81 | 3.05 | 0.03 | 0.98 | 1.74 | 3.18 | 0.19 | 1.15 | 1.82 | |
Moves around when they should not | 3.49 | 0.18 | 1.08 | 1.84 | 3.83 | 0.18 | 1.12 | 1.84 | 2.26 | 0.30 | 1.47 | 2.32 | |
Struggles with being quiet | 1.72 | 0.26 | 1.36 | 2.38 | 1.55 | 0.31 | 1.58 | 2.70 | 2.75 | -0.38 | 1.11 | 1.93 | |
Leaves seat when they shouldn't | 2.04 | 0.80 | 1.90 | 2.67 | 2.10 | 0.84 | 1.96 | 2.94 | 1.66 | -1.31 | 0.03 | 0.91 | |
Impulsivity | Speaks without thinking first | 2.00 | -0.28 | 1.22 | 2.28 | 2.16 | -0.30 | 1.28 | 2.11 | 2.01 | 0.53 | 1.71 | 2.56 |
Intrudes | 2.38 | 0.59 | 1.75 | 2.62 | 2.05 | 0.75 | 2.02 | 2.88 | 2.64 | 0.17 | 1.36 | 2.25 | |
Risky behavior | 1.84 | 0.38 | 1.64 | 2.64 | 1.80 | 0.43 | 1.67 | 2.79 | 1.70 | -0.08 | 1.00 | 1.81 | |
Difficulty with turn-taking | 2.02 | 0.26 | 1.46 | 2.39 | 2.20 | 0.37 | 1.56 | 2.37 | 1.93 | -0.19 | 1.15 | 2.09 | |
Impulsive | 2.00 | -0.16 | 1.09 | 2.06 | 2.14 | -0.22 | 1.15 | 1.99 | 3.51 | -0.16 | 0.86 | 1.68 | |
Interrupts others | 2.31 | 0.14 | 1.38 | 2.33 | 2.52 | 0.22 | 1.42 | 2.29 | 2.89 | 0.29 | 1.35 | 2.12 | |
Rushes | 1.91 | -0.29 | 1.26 | 2.34 | 2.18 | -0.31 | 1.21 | 2.34 | 3.12 | 0.18 | 1.13 | 1.92 | |
Emotional Dysregulation | Difficulty controlling anger | 2.03 | 0.07 | 1.37 | 2.36 | 1.86 | 0.14 | 1.45 | 2.54 | 2.38 | 0.35 | 1.37 | 2.21 |
Moods change quickly | 2.35 | -0.07 | 1.10 | 1.95 | 2.71 | -0.05 | 1.10 | 1.91 | 2.45 | -0.32 | 0.80 | 1.75 | |
Easily frustrated | 3.14 | -0.20 | 0.86 | 1.68 | 3.38 | -0.11 | 0.89 | 1.72 | 2.24 | -0.19 | 1.31 | 2.23 | |
Overreacts | 2.94 | -0.18 | 1.07 | 1.92 | 2.85 | -0.14 | 1.17 | 1.91 | 2.10 | 0.29 | 1.31 | 2.21 | |
Difficulty controlling emotions | 2.31 | -0.09 | 1.09 | 1.99 | 2.41 | 0.06 | 1.16 | 2.01 | 2.26 | 0.43 | 1.61 | 2.34 | |
Difficulty calming down | 2.70 | 0.07 | 1.05 | 1.96 | 2.93 | 0.06 | 1.18 | 1.90 | 2.68 | -0.11 | 0.88 | 1.69 | |
Negative Self-Concept | Lacks confidence from past failures | 2.88 | 0.03 | 0.80 | 1.44 | 3.72 | 0.09 | 0.79 | 1.35 | 2.02 | -0.31 | 1.24 | 2.35 |
Lacks confidence | 5.04 | -0.33 | 0.53 | 1.16 | 3.77 | -0.31 | 0.59 | 1.35 | 2.87 | -0.16 | 1.12 | 1.92 | |
Feels inferior | 2.47 | -0.29 | 0.72 | 1.59 | 2.21 | -0.23 | 0.79 | 1.80 | 1.78 | 0.27 | 1.41 | 2.42 | |
Avoids challenges | 2.78 | -0.17 | 0.84 | 1.58 | 2.68 | -0.05 | 0.90 | 1.79 | 2.10 | 0.81 | 1.93 | 2.80 | |
Self-critical | 1.64 | -1.33 | -0.01 | 0.92 | 1.76 | -1.27 | 0.09 | 0.90 | 2.76 | 0.30 | 1.35 | 2.16 |
Table 11.5. IRT Parameters: CAARS 2–Short Observer
CAARS 2–Short Scale | Item Stem | CAARS 2–Short: Calibration Sample | CAARS 2–Short: Validation Sample | Full-Length CAARS 2: Total Sample | |||||||||
a | b1 | b2 | b3 | a | b1 | b2 | b3 | a | b1 | b2 | b3 | ||
Inattention/Executive Dysfunction | Loses focus in conversations | 2.24 | 0.51 | 1.77 | 2.56 | 2.46 | 0.45 | 1.56 | 2.31 | 2.38 | 0.47 | 1.65 | 2.42 |
Has trouble with multi-step tasks | 2.91 | 0.59 | 1.49 | 2.15 | 3.03 | 0.54 | 1.42 | 2.11 | 2.82 | 0.56 | 1.48 | 2.16 | |
Difficulty prioritizing | 3.82 | 0.36 | 1.11 | 1.77 | 3.37 | 0.30 | 1.17 | 1.94 | 3.71 | 0.32 | 1.13 | 1.85 | |
Has difficulty paying attention to details | 3.30 | 0.54 | 1.41 | 2.22 | 3.91 | 0.45 | 1.31 | 1.96 | 3.43 | 0.49 | 1.37 | 2.11 | |
Difficulty organizing | 2.75 | 0.28 | 1.19 | 1.95 | 3.02 | 0.20 | 1.20 | 1.95 | 2.93 | 0.23 | 1.18 | 1.94 | |
Makes careless mistakes | 2.26 | 0.30 | 1.52 | 2.28 | 2.74 | 0.22 | 1.43 | 2.13 | 2.36 | 0.25 | 1.50 | 2.24 | |
Difficulty planning ahead | 2.62 | 0.20 | 1.25 | 1.96 | 2.65 | 0.19 | 1.18 | 1.88 | 2.60 | 0.19 | 1.22 | 1.93 | |
Misses deadlines | 2.49 | 0.52 | 1.57 | 2.28 | 2.67 | 0.46 | 1.50 | 2.31 | 2.56 | 0.48 | 1.54 | 2.30 | |
Forgets to do things | 2.71 | -0.11 | 1.34 | 2.10 | 2.77 | -0.03 | 1.30 | 2.04 | 2.92 | -0.07 | 1.30 | 2.03 | |
Distracted easily | 2.64 | 0.14 | 1.13 | 1.81 | 3.02 | 0.09 | 1.18 | 1.86 | 2.90 | 0.11 | 1.14 | 1.83 | |
Difficulty following instructions | 3.29 | 0.56 | 1.51 | 2.19 | 3.92 | 0.46 | 1.38 | 2.12 | 3.25 | 0.51 | 1.47 | 2.21 | |
Inattentive | 2.05 | 0.60 | 1.79 | 2.57 | 2.69 | 0.57 | 1.55 | 2.29 | 2.35 | 0.58 | 1.66 | 2.42 | |
Hyperactivity | Distracts others | 1.84 | 0.72 | 1.82 | 2.75 | 1.53 | 0.77 | 1.97 | 2.82 | 1.96 | 0.69 | 1.75 | 2.57 |
Taps hands or feet | 1.61 | 0.63 | 1.69 | 2.31 | 1.61 | 0.57 | 1.76 | 2.51 | 1.70 | 0.59 | 1.67 | 2.33 | |
Appears restless when still | 2.73 | 0.27 | 1.22 | 1.98 | 3.17 | 0.27 | 1.17 | 1.94 | 2.70 | 0.28 | 1.22 | 2.00 | |
Difficulty staying still | 4.03 | 0.47 | 1.31 | 2.01 | 4.73 | 0.46 | 1.27 | 1.89 | 3.73 | 0.48 | 1.32 | 1.99 | |
Moves around when they should not | 4.55 | 0.57 | 1.38 | 2.05 | 3.78 | 0.57 | 1.40 | 1.98 | 3.72 | 0.58 | 1.41 | 2.05 | |
Struggles with being quiet | 1.69 | 0.37 | 1.40 | 2.26 | 1.55 | 0.34 | 1.44 | 2.48 | 1.92 | 0.33 | 1.31 | 2.17 | |
Leaves seat when they shouldn't | 2.34 | 1.06 | 2.01 | 2.75 | 2.26 | 0.97 | 1.91 | 2.73 | 2.48 | 0.99 | 1.90 | 2.66 | |
Impulsivity | Rushes | 2.01 | 0.22 | 1.60 | 2.43 | 2.28 | 0.13 | 1.35 | 2.20 | 2.08 | 0.17 | 1.48 | 2.34 |
Interrupts others | 3.01 | 0.44 | 1.32 | 2.11 | 3.09 | 0.37 | 1.48 | 2.08 | 3.28 | 0.40 | 1.37 | 2.07 | |
Impulsive | 2.11 | 0.17 | 1.35 | 2.19 | 2.13 | 0.21 | 1.34 | 2.22 | 2.02 | 0.19 | 1.37 | 2.26 | |
Difficulty with turn-taking | 3.07 | 0.51 | 1.39 | 2.21 | 2.92 | 0.61 | 1.53 | 2.19 | 3.18 | 0.54 | 1.44 | 2.18 | |
Risky behavior | 2.05 | 0.59 | 1.66 | 2.33 | 2.07 | 0.52 | 1.64 | 2.46 | 1.92 | 0.57 | 1.69 | 2.47 | |
Intrudes | 2.36 | 0.68 | 1.64 | 2.30 | 2.39 | 0.64 | 1.63 | 2.49 | 2.34 | 0.66 | 1.65 | 2.42 | |
Speaks without thinking first | 2.28 | -0.04 | 1.24 | 2.03 | 2.38 | -0.14 | 1.22 | 1.93 | 2.39 | -0.10 | 1.22 | 1.96 | |
Emotional Dysregulation | Difficulty controlling anger | 2.38 | 0.14 | 1.20 | 1.94 | 2.47 | 0.18 | 1.22 | 2.11 | 2.45 | 0.15 | 1.20 | 2.02 |
Moods change quickly | 3.06 | 0.11 | 1.14 | 1.94 | 3.32 | 0.20 | 1.22 | 1.90 | 3.22 | 0.14 | 1.18 | 1.93 | |
Easily frustrated | 3.10 | -0.03 | 1.02 | 1.78 | 3.14 | -0.03 | 1.04 | 1.91 | 3.18 | -0.04 | 1.03 | 1.84 | |
Overreacts | 3.45 | 0.04 | 1.04 | 1.70 | 3.41 | 0.03 | 1.12 | 1.84 | 3.47 | 0.03 | 1.07 | 1.77 | |
Difficulty controlling emotions | 2.64 | 0.09 | 1.28 | 2.04 | 2.80 | 0.11 | 1.26 | 2.01 | 2.76 | 0.09 | 1.27 | 2.03 | |
Difficulty calming down | 3.29 | 0.27 | 1.21 | 1.94 | 3.07 | 0.23 | 1.22 | 1.90 | 2.97 | 0.25 | 1.23 | 1.96 | |
Negative Self-Concept | Lacks confidence from past failures | 3.47 | 0.32 | 1.14 | 1.77 | 3.57 | 0.29 | 1.17 | 1.74 | 3.94 | 0.30 | 1.13 | 1.71 |
Lacks confidence | 3.74 | 0.03 | 0.93 | 1.59 | 3.44 | 0.04 | 1.00 | 1.65 | 3.23 | 0.04 | 0.98 | 1.65 | |
Feels inferior | 1.92 | 0.40 | 1.47 | 2.32 | 1.86 | 0.37 | 1.58 | 2.53 | 1.83 | 0.39 | 1.54 | 2.46 | |
Avoids challenges | 2.52 | 0.22 | 1.20 | 2.11 | 2.37 | 0.33 | 1.28 | 2.07 | 2.49 | 0.27 | 1.22 | 2.07 | |
Self-critical | 2.03 | -0.28 | 0.89 | 1.73 | 1.77 | -0.33 | 0.99 | 1.84 | 1.90 | -0.30 | 0.94 | 1.79 |
The same criteria used in selecting the items for the CAARS 2–Short scales were used to evaluate the shortened forms with the validation sample. Correlations were computed with Kendall’s tau to evaluate the relationship between the full-length and shortened forms. The full-length CAARS 2 and CAARS 2–Short showed very strong, positive, and statistically significant (p < .001) correlations for all scales, ranging from .83 to .93 across forms (see Table 11.6).
Table 11.6. Correlations Between CAARS 2 and CAARS 2–Short Scales: Validation Sample
Scale | Self-Report: Full-Length & Short (τ) | Observer: Full-Length & Short (τ)
Inattention/Executive Dysfunction | .87 | .86
Hyperactivity | .88 | .87
Impulsivity | .84 | .83
Emotional Dysregulation | .90 | .91
Negative Self-Concept | .93 | .89
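For readers unfamiliar with Kendall's tau, the sketch below computes the tau-b variant, which adjusts for tied scores. The chapter does not say which variant was used, so tau-b is an assumption, and the scores are hypothetical.

```python
import math


def kendall_tau_b(x, y):
    """Kendall's tau-b from pairwise comparisons, with a tie correction."""
    concordant = discordant = ties_x_only = ties_y_only = 0
    n = len(x)
    for i in range(n):
        for j in range(i + 1, n):
            dx = x[i] - x[j]
            dy = y[i] - y[j]
            if dx == 0 and dy == 0:
                continue  # tied on both variables: excluded from both terms
            elif dx == 0:
                ties_x_only += 1
            elif dy == 0:
                ties_y_only += 1
            elif dx * dy > 0:
                concordant += 1
            else:
                discordant += 1
    denominator = math.sqrt(
        (concordant + discordant + ties_x_only)
        * (concordant + discordant + ties_y_only)
    )
    return (concordant - discordant) / denominator


# Hypothetical raw scores for four respondents on the two form lengths.
full_length_scores = [10, 14, 14, 20]
short_form_scores = [4, 6, 7, 7]

tau = kendall_tau_b(full_length_scores, short_form_scores)
print(round(tau, 2))
```

The pairwise loop is quadratic in the number of respondents, which is fine for illustration; statistical packages use faster algorithms for samples of the size reported in this chapter.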
Estimates of internal consistency for the CAARS 2–Short scale scores within the validation sample all demonstrated high reliability, with alpha and omega values at or above .85 for Self-Report and Observer (see Table 11.7). The maximum decrease in internal consistency from the CAARS 2 to the CAARS 2–Short was .05 (see Table 11.7), indicating minimal compromises when using the shortened version.
Test information was also explored in the validation samples for Self-Report and Observer, as seen in Figure 11.2. Nearly all scales on both forms had test information values above 10 at 2 SD above the mean; for Self-Report, Impulsivity showed moderate test information, with a peak value greater than 5. These results are similar to the full-length CAARS 2 (see Test Information in chapter 8, Reliability). The test information of the CAARS 2–Short shows minimal loss in reliability compared to the full-length CAARS 2, providing strong evidence for the validity of the shortened scales.
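The information benchmarks used in this chapter (10 and 5) can be related to more familiar quantities. In IRT the standard error of a theta estimate is 1/sqrt(information). If theta is scaled to unit variance, which is an assumption about the metric rather than something stated here, a reliability-like value is approximately 1 - 1/information.

```python
import math


def standard_error(information):
    """Standard error of the theta estimate at a given information value."""
    return 1.0 / math.sqrt(information)


def approx_reliability(information):
    """Approximate reliability, assuming theta has unit variance."""
    return 1.0 - 1.0 / information


for info in (5, 10):
    print(info, round(standard_error(info), 3), round(approx_reliability(info), 2))
```

Under that assumption, an information value of 10 corresponds to a reliability of about .90 and a value of 5 to about .80, in line with the "high" and "adequate" labels applied to those benchmarks.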
Table 11.7. Internal Consistency of CAARS 2–Short and Full-Length CAARS 2 Content Scales: Validation Sample
Scale | Self-Report Full α | Self-Report Full ω | Self-Report Short α | Self-Report Short ω | Observer Full α | Observer Full ω | Observer Short α | Observer Short ω
Inattention/Executive Dysfunction | .97 | .97 | .95 | .95 | .97 | .97 | .94 | .94
Hyperactivity | .92 | .92 | .88 | .88 | .91 | .92 | .87 | .87
Impulsivity | .92 | .92 | .90 | .90 | .91 | .91 | .87 | .87
Emotional Dysregulation | .93 | .93 | .91 | .91 | .92 | .92 | .89 | .89
Negative Self-Concept | .90 | .90 | .85 | .86 | .91 | .91 | .88 | .88
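The reported maximum decrease of .05 can be reproduced directly from the values in Table 11.7. The sketch below transcribes the table and takes the largest full-minus-short difference across scales, raters, and coefficients; the variable names are illustrative.

```python
# (full alpha, full omega, short alpha, short omega), transcribed from Table 11.7.
table_11_7 = {
    ("Self-Report", "Inattention/Executive Dysfunction"): (0.97, 0.97, 0.95, 0.95),
    ("Self-Report", "Hyperactivity"): (0.92, 0.92, 0.88, 0.88),
    ("Self-Report", "Impulsivity"): (0.92, 0.92, 0.90, 0.90),
    ("Self-Report", "Emotional Dysregulation"): (0.93, 0.93, 0.91, 0.91),
    ("Self-Report", "Negative Self-Concept"): (0.90, 0.90, 0.85, 0.86),
    ("Observer", "Inattention/Executive Dysfunction"): (0.97, 0.97, 0.94, 0.94),
    ("Observer", "Hyperactivity"): (0.91, 0.92, 0.87, 0.87),
    ("Observer", "Impulsivity"): (0.91, 0.91, 0.87, 0.87),
    ("Observer", "Emotional Dysregulation"): (0.92, 0.92, 0.89, 0.89),
    ("Observer", "Negative Self-Concept"): (0.91, 0.91, 0.88, 0.88),
}

drops = {}
for key, (full_a, full_w, short_a, short_w) in table_11_7.items():
    # Round to two decimals to avoid floating-point noise in the differences.
    drops[key] = max(round(full_a - short_a, 2), round(full_w - short_w, 2))

max_drop = max(drops.values())
largest = sorted(key for key, drop in drops.items() if drop == max_drop)
print(max_drop, largest)
```

The largest drops occur for Self-Report Negative Self-Concept (alpha) and Observer Hyperactivity (omega).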
Figure 11.2. Test Information for CAARS 2–Short: Validation Sample. Panels: (a) Inattention/Executive Dysfunction, (b) Hyperactivity, (c) Impulsivity, (d) Emotional Dysregulation, (e) Negative Self-Concept.
The ability of the shortened scales to distinguish between the General Population and individuals diagnosed with ADHD (Predominantly Inattentive or Combined Presentation) was explored in the validation sample. Results, as measured by Cliff’s d effect sizes of group differences, are presented in Table 11.8, and show that the full-length CAARS 2 and the CAARS 2–Short are comparable with respect to how well they differentiate between General Population and ADHD groups. For both Self-Report and Observer, effect sizes are only marginally different between the two versions, and the overlapping confidence intervals indicate that differences between the form lengths are not significant. Replicating the discriminating ability of the CAARS 2 scales with the CAARS 2–Short scales provides additional evidence that the selected items for the CAARS 2–Short perform well.
Table 11.8. Clinical Group Differences: CAARS 2–Short Validation Sample
Values are Cliff's d (95% CI).
Form | Scale | ADHD Inattentive vs. General Population: Full-Length | ADHD Inattentive vs. General Population: Short Form | ADHD Combined vs. General Population: Full-Length | ADHD Combined vs. General Population: Short Form
Self-Report | Inattention/Executive Dysfunction | .81 (.68, .89) | .78 (.63, .87) | .79 (.66, .87) | .75 (.62, .85)
Self-Report | Hyperactivity | .44 (.24, .60) | .45 (.26, .60) | .70 (.56, .80) | .71 (.58, .80)
Self-Report | Impulsivity | .35 (.13, .53) | .38 (.19, .57) | .59 (.41, .72) | .60 (.41, .74)
Self-Report | Emotional Dysregulation | .27 (.05, .47) | .26 (.04, .46) | .61 (.44, .73) | .59 (.42, .71)
Self-Report | Negative Self-Concept | .62 (.43, .76) | .65 (.47, .77) | .65 (.48, .77) | .66 (.49, .77)
Observer | Inattention/Executive Dysfunction | .93 (.90, .96) | .90 (.85, .94) | .97 (.94, .98) | .95 (.92, .97)
Observer | Hyperactivity | .63 (.48, .75) | .65 (.52, .75) | .95 (.93, .97) | .93 (.90, .95)
Observer | Impulsivity | .71 (.56, .81) | .68 (.53, .78) | .94 (.91, .96) | .93 (.90, .95)
Observer | Emotional Dysregulation | .52 (.35, .65) | .50 (.33, .64) | .84 (.76, .89) | .85 (.78, .90)
Observer | Negative Self-Concept | .63 (.49, .74) | .65 (.51, .76) | .75 (.64, .84) | .76 (.64, .84)
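The statement that the full-length and short-form confidence intervals overlap can be verified from Table 11.8. The sketch below transcribes each full-length and short-form interval pair and checks for overlap. Note that interval overlap is a descriptive heuristic rather than a formal significance test of the difference between two effect sizes.

```python
def intervals_overlap(ci_a, ci_b):
    """True if two (lower, upper) intervals share at least one point."""
    return ci_a[0] <= ci_b[1] and ci_b[0] <= ci_a[1]


# (full-length CI, short-form CI) pairs transcribed from Table 11.8:
# five scales per form, ADHD Inattentive comparison then ADHD Combined.
ci_pairs = [
    # Self-Report, ADHD Inattentive vs. General Population
    ((0.68, 0.89), (0.63, 0.87)), ((0.24, 0.60), (0.26, 0.60)),
    ((0.13, 0.53), (0.19, 0.57)), ((0.05, 0.47), (0.04, 0.46)),
    ((0.43, 0.76), (0.47, 0.77)),
    # Self-Report, ADHD Combined vs. General Population
    ((0.66, 0.87), (0.62, 0.85)), ((0.56, 0.80), (0.58, 0.80)),
    ((0.41, 0.72), (0.41, 0.74)), ((0.44, 0.73), (0.42, 0.71)),
    ((0.48, 0.77), (0.49, 0.77)),
    # Observer, ADHD Inattentive vs. General Population
    ((0.90, 0.96), (0.85, 0.94)), ((0.48, 0.75), (0.52, 0.75)),
    ((0.56, 0.81), (0.53, 0.78)), ((0.35, 0.65), (0.33, 0.64)),
    ((0.49, 0.74), (0.51, 0.76)),
    # Observer, ADHD Combined vs. General Population
    ((0.94, 0.98), (0.92, 0.97)), ((0.93, 0.97), (0.90, 0.95)),
    ((0.91, 0.96), (0.90, 0.95)), ((0.76, 0.89), (0.78, 0.90)),
    ((0.64, 0.84), (0.64, 0.84)),
]

all_overlap = all(intervals_overlap(full, short) for full, short in ci_pairs)
print(len(ci_pairs), all_overlap)
```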
Mirroring what was tested in the full-length CAARS 2 (see Internal Structure in chapter 9, Validity), a CFA was used to determine whether the structure of the full-length CAARS 2 was retained in the CAARS 2–Short. Results from the CFA, presented in Table 11.9, indicate that the 5-factor model is an excellent fit for both the full-length and shortened forms of the CAARS 2. The fit statistics for the full-length and shortened versions support the replicated factor structure across the two lengths.
Table 11.9. Confirmatory Factor Analysis Model Fit Comparison: CAARS 2 and CAARS 2–Short
Form | Version | χ² | df | CFI | TLI | RMSEA | RMSEA 95% CI | SRMR
Self-Report | Full-Length CAARS 2 | 7328.74 | 2474 | .960 | .959 | .043 | .042, .044 | .044
Self-Report | CAARS 2–Short | 1536.57 | 619 | .974 | .972 | .049 | .047, .051 | .039
Observer | Full-Length CAARS 2 | 8546.31 | 2474 | .954 | .952 | .045 | .044, .046 | .051
Observer | CAARS 2–Short | 2142.23 | 619 | .965 | .962 | .057 | .055, .059 | .047
The development phase goal was to create a shortened version of the CAARS 2 that measured symptoms associated with ADHD efficiently and with minimal reduction of empirical psychometric properties. Overall, the results from this validation sample demonstrated that the CAARS 2–Short has psychometric properties that are comparable to the full-length CAARS 2, in terms of correspondence, internal consistency, test information, ability to distinguish between General Population and ADHD groups, and internal structure.
1 Raters were asked to indicate what languages the individual speaks, and response options included English only, English and Non-English, and Non-English only. For ease of presentation in this chapter, this variable will be referred to as “language(s) spoken.”