CAARS 2 Manual Appendix I: Calculating Statistical Differences in Scores
This appendix describes how to determine the statistical significance of differences in scores between raters and
differences in scores across time. To compare scores between raters, critical values for the Conners Adult ADHD
Rating Scales 2nd Edition (CAARS™ 2) and the CAARS™ 2–Short are provided. To compare scores (from the same rater)
across time, Reliable Change Index (RCI) values for both T-scores and raw scores of the CAARS 2 and the CAARS
2–Short are provided.
Differences Between Raters
Critical values can be used to establish statistical significance when comparing CAARS 2 T-scores obtained from
different raters within the same time period. These critical values can be calculated using the formula provided by
Anastasi and Urbina (1997). This formula takes into account the standard error of measurement (SEM) for each of
the scales (see chapter 8, Reliability, for the SEM values for the CAARS 2 scales).
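A minimal sketch of this kind of calculation, assuming the commonly cited form in which the standard error of the difference between two scores is the square root of the sum of the two squared SEMs, multiplied by the z value for the chosen significance level (1.96 for a two-tailed p < .05). The function name and the SEM values below are hypothetical placeholders rather than values from chapter 8, and how the tabled critical values were rounded is not stated here, so the result is left unrounded.

```python
import math

Z_TWO_TAILED_05 = 1.96  # z value for a two-tailed test at p < .05


def critical_difference(sem_a: float, sem_b: float, z: float = Z_TWO_TAILED_05) -> float:
    """Smallest score difference that reaches significance at the given z.

    Assumes the standard error of the difference is sqrt(sem_a**2 + sem_b**2).
    """
    se_diff = math.sqrt(sem_a ** 2 + sem_b ** 2)
    return z * se_diff


# Hypothetical SEM values for illustration only (not taken from chapter 8).
print(round(critical_difference(sem_a=3.0, sem_b=4.0), 2))  # prints 9.8
```

With these placeholder SEMs the standard error of the difference is 5, so two T-scores would need to differ by about 10 points to be considered significantly different.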
The critical values needed to determine statistical significance between a pair of T-scores at the p < .05 level of
significance are provided in Tables I.1a through I.4c. Tables I.1a to I.2c present critical values for the
full-length CAARS 2; Tables I.3a to I.4c present critical values for the CAARS 2–Short.
To determine whether the difference between two raters’ scores is statistically significant:
- Find the appropriate table based on the length of the form (full-length or short), the relevant pair of
  raters (e.g., Self-Report to Observer, Observer to Observer), and the type of Normative Sample selected for
  scoring (Combined Gender, Male, or Female).
- Find the column for the age group of the individual being rated.
- Find the row containing the scale being compared and identify the critical value.
- Compare the absolute difference between the two raters’ T-scores with the critical value. The difference is
  significant (p < .05) when it is equal to or greater than the critical value in the table.
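The steps above can be sketched in code. This is an illustrative outline only: the function names and the string keys ("full", "short", "self-observer", "observer-observer", "combined", "male", "female") are hypothetical, while the table labels follow the table titles in this appendix.

```python
def table_for_rater_comparison(form: str, rater_pair: str, norm: str) -> str:
    """Return the label of the look-up table for a between-rater comparison."""
    # Table number depends on form length and on which raters are compared.
    number = {
        ("full", "self-observer"): "1",
        ("full", "observer-observer"): "2",
        ("short", "self-observer"): "3",
        ("short", "observer-observer"): "4",
    }[(form, rater_pair)]
    # Table letter depends on the normative sample used for scoring.
    letter = {"combined": "a", "male": "b", "female": "c"}[norm]
    return f"I.{number}{letter}"


def raters_differ(t_score_1: int, t_score_2: int, critical_value: int) -> bool:
    """True when the absolute T-score difference meets or exceeds the critical value."""
    return abs(t_score_1 - t_score_2) >= critical_value


print(table_for_rater_comparison("full", "observer-observer", "combined"))  # prints I.2a
print(raters_differ(63, 59, critical_value=7))  # prints False
```

The comparison uses "greater than or equal to", so a difference exactly equal to the critical value counts as significant.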
For example, a 65-year-old woman was rated on the CAARS 2 Observer (full-length) by both her husband and her best
friend. Her Impulsivity T-score (Combined Gender normative sample) was 63 when rated by her husband and 59 when
rated by her best friend, an absolute difference of 4 points. Because the ratings came from the full-length CAARS 2
and the comparison is Observer to Observer using Combined Gender norms, Table I.2a is the appropriate look-up
table. The person being rated is 65 years old, so the clinician used the “60 to 69” column for age. The clinician
then found the “Impulsivity” row and noted the appropriate critical value of 7. The clinician compared the absolute
difference between the two Observer T-scores (|63 − 59| = 4) with the critical value of 7. Because the absolute
difference of 4 is less than 7, the difference between these two scores is not statistically significant at the
p < .05 level. In other words, the Impulsivity scores produced by her husband’s and best friend’s ratings are not
statistically different. Even though these two scores fall into different interpretation guideline categories
(“Slightly Elevated” versus “Not Elevated”), the difference is not meaningful. Return to Interpretation Step 5 (see
Step 5: Integrate and compare CAARS 2 results in chapter 4, Interpretation) to consider how this non-significant
finding relates to other information from the CAARS 2, as well as other sources of information.
In contrast, suppose the same 65-year-old woman was also rated by her supervisor at work, with an Impulsivity
T-score of 72. This observer’s score is 9 points higher than the husband-rated T-score of 63. The same critical
value of 7 from Table I.2a is used, because this is also an Observer to Observer comparison, but this time the
absolute difference of 9 exceeds the critical value of 7. In this instance, it is appropriate to report that the
supervisor’s rating is significantly higher than the husband’s rating (at the p < .05 level). It would be reasonable
to hypothesize that this woman’s difficulties are more evident or more impairing in the workplace than in her home
setting (or perhaps that her home setting is optimized to reduce the impact of impulsivity) and to review other
information that could reject or support that hypothesis.
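The two Impulsivity comparisons in this example can be reproduced with a short sketch. The critical values are the Impulsivity row of Table I.2a; the variable names and the `age_group` helper are hypothetical and exist only for illustration.

```python
AGE_GROUPS = ["18-24", "25-29", "30-39", "40-49", "50-59", "60-69", "70+"]

# Impulsivity row of Table I.2a (two Observers, Combined Gender, full-length).
IMPULSIVITY_I2A = dict(zip(AGE_GROUPS, [7, 7, 6, 7, 6, 7, 7]))


def age_group(age: int) -> str:
    """Map an age in years to the age-group column used in the tables."""
    if age < 18:
        raise ValueError("the tables start at age 18")
    upper_bounds = [(24, "18-24"), (29, "25-29"), (39, "30-39"),
                    (49, "40-49"), (59, "50-59"), (69, "60-69")]
    for upper, label in upper_bounds:
        if age <= upper:
            return label
    return "70+"


critical = IMPULSIVITY_I2A[age_group(65)]  # 60-69 column, value 7
husband, friend, supervisor = 63, 59, 72

print(abs(husband - friend) >= critical)      # prints False (4 < 7)
print(abs(supervisor - husband) >= critical)  # prints True (9 >= 7)
```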
CAARS 2 (Full-Length)
Comparisons between Self-Report and Observer T-scores
Table I.1a. CAARS 2 (full-length) Critical Values Denoting Statistically Significant Differences Between Self-Report and Observer Raters: Normative Sample–Combined Gender
| Scale | Age 18–24 | Age 25–29 | Age 30–39 | Age 40–49 | Age 50–59 | Age 60–69 | Age 70+ |
|---|---|---|---|---|---|---|---|
| Content Scales | | | | | | | |
| Inattention/Executive Dysfunction | 5 | 4 | 4 | 4 | 4 | 4 | 5 |
| Hyperactivity | 7 | 6 | 7 | 7 | 7 | 7 | 8 |
| Impulsivity | 8 | 7 | 7 | 8 | 7 | 7 | 8 |
| Emotional Dysregulation | 7 | 7 | 7 | 6 | 7 | 7 | 7 |
| Negative Self-Concept | 8 | 8 | 7 | 7 | 8 | 8 | 9 |
| DSM Symptom Scales | | | | | | | |
| ADHD Inattentive Symptoms | 6 | 6 | 6 | 5 | 6 | 6 | 7 |
| ADHD Hyperactive/Impulsive Symptoms | 7 | 7 | 7 | 7 | 7 | 7 | 8 |
| Total ADHD Symptoms | 5 | 5 | 5 | 5 | 5 | 5 | 6 |
Table I.1b. CAARS 2 (full-length) Critical Values Denoting Statistically Significant Differences Between Self-Report and Observer Raters: Normative Sample Gender Specific–Males
| Scale | Age 18–24 | Age 25–29 | Age 30–39 | Age 40–49 | Age 50–59 | Age 60–69 | Age 70+ |
|---|---|---|---|---|---|---|---|
| Content Scales | | | | | | | |
| Inattention/Executive Dysfunction | 5 | 4 | 4 | 4 | 4 | 4 | 5 |
| Hyperactivity | 8 | 6 | 7 | 7 | 7 | 7 | 8 |
| Impulsivity | 8 | 8 | 7 | 8 | 8 | 8 | 8 |
| Emotional Dysregulation | 7 | 7 | 7 | 7 | 7 | 7 | 7 |
| Negative Self-Concept | 9 | 8 | 7 | 7 | 9 | 8 | 8 |
| DSM Symptom Scales | | | | | | | |
| ADHD Inattentive Symptoms | 7 | 6 | 5 | 6 | 5 | 6 | 6 |
| ADHD Hyperactive/Impulsive Symptoms | 8 | 7 | 7 | 7 | 8 | 8 | 8 |
| Total ADHD Symptoms | 6 | 5 | 5 | 5 | 5 | 6 | 6 |
Table I.1c. CAARS 2 (full-length) Critical Values Denoting Statistically Significant Differences Between Self-Report and Observer Raters: Normative Sample Gender Specific–Females
| Scale | Age 18–24 | Age 25–29 | Age 30–39 | Age 40–49 | Age 50–59 | Age 60–69 | Age 70+ |
|---|---|---|---|---|---|---|---|
| Content Scales | | | | | | | |
| Inattention/Executive Dysfunction | 5 | 5 | 4 | 4 | 4 | 4 | 5 |
| Hyperactivity | 6 | 8 | 6 | 7 | 6 | 7 | 8 |
| Impulsivity | 7 | 7 | 7 | 7 | 7 | 7 | 8 |
| Emotional Dysregulation | 7 | 7 | 7 | 6 | 6 | 6 | 8 |
| Negative Self-Concept | 8 | 7 | 7 | 7 | 8 | 7 | 11 |
| DSM Symptom Scales | | | | | | | |
| ADHD Inattentive Symptoms | 6 | 5 | 6 | 5 | 6 | 6 | 7 |
| ADHD Hyperactive/Impulsive Symptoms | 7 | 9 | 7 | 7 | 7 | 7 | 8 |
| Total ADHD Symptoms | 5 | 7 | 5 | 5 | 5 | 5 | 6 |
Comparisons between Two Observers’ T-scores
Table I.2a. CAARS 2 (full-length) Critical Values Denoting Statistically Significant Differences Between Two Observer Raters: Normative Sample–Combined Gender
| Scale | Age 18–24 | Age 25–29 | Age 30–39 | Age 40–49 | Age 50–59 | Age 60–69 | Age 70+ |
|---|---|---|---|---|---|---|---|
| Content Scales | | | | | | | |
| Inattention/Executive Dysfunction | 4 | 4 | 4 | 4 | 4 | 4 | 5 |
| Hyperactivity | 7 | 6 | 6 | 6 | 7 | 7 | 7 |
| Impulsivity | 7 | 7 | 6 | 7 | 6 | 7 | 7 |
| Emotional Dysregulation | 7 | 7 | 7 | 6 | 5 | 7 | 7 |
| Negative Self-Concept | 8 | 8 | 7 | 8 | 9 | 8 | 9 |
| DSM Symptom Scales | | | | | | | |
| ADHD Inattentive Symptoms | 6 | 6 | 5 | 5 | 5 | 6 | 6 |
| ADHD Hyperactive/Impulsive Symptoms | 7 | 6 | 6 | 6 | 7 | 7 | 7 |
| Total ADHD Symptoms | 5 | 5 | 4 | 5 | 5 | 5 | 5 |
Table I.2b. CAARS 2 (full-length) Critical Values Denoting Statistically Significant Differences Between Two Observer Raters: Normative Sample Gender Specific–Males
| Scale | Age 18–24 | Age 25–29 | Age 30–39 | Age 40–49 | Age 50–59 | Age 60–69 | Age 70+ |
|---|---|---|---|---|---|---|---|
| Content Scales | | | | | | | |
| Inattention/Executive Dysfunction | 5 | 4 | 3 | 3 | 4 | 4 | 5 |
| Hyperactivity | 7 | 6 | 6 | 6 | 7 | 7 | 8 |
| Impulsivity | 8 | 7 | 6 | 7 | 6 | 7 | 8 |
| Emotional Dysregulation | 7 | 7 | 6 | 6 | 5 | 7 | 7 |
| Negative Self-Concept | 9 | 8 | 7 | 7 | 10 | 8 | 7 |
| DSM Symptom Scales | | | | | | | |
| ADHD Inattentive Symptoms | 7 | 6 | 4 | 5 | 5 | 6 | 6 |
| ADHD Hyperactive/Impulsive Symptoms | 8 | 7 | 6 | 6 | 7 | 7 | 8 |
| Total ADHD Symptoms | 6 | 5 | 4 | 4 | 5 | 5 | 6 |
Table I.2c. CAARS 2 (full-length) Critical Values Denoting Statistically Significant Differences Between Two Observer Raters: Normative Sample Gender Specific–Females
| Scale | Age 18–24 | Age 25–29 | Age 30–39 | Age 40–49 | Age 50–59 | Age 60–69 | Age 70+ |
|---|---|---|---|---|---|---|---|
| Content Scales | | | | | | | |
| Inattention/Executive Dysfunction | 4 | 6 | 4 | 4 | 4 | 5 | 5 |
| Hyperactivity | 6 | 10 | 6 | 7 | 6 | 7 | 7 |
| Impulsivity | 6 | 6 | 6 | 7 | 7 | 7 | 7 |
| Emotional Dysregulation | 7 | 6 | 7 | 6 | 5 | 7 | 7 |
| Negative Self-Concept | 8 | 7 | 7 | 7 | 8 | 7 | 10 |
| DSM Symptom Scales | | | | | | | |
| ADHD Inattentive Symptoms | 6 | 6 | 6 | 5 | 6 | 7 | 6 |
| ADHD Hyperactive/Impulsive Symptoms | 6 | 10 | 6 | 7 | 7 | 7 | 6 |
| Total ADHD Symptoms | 5 | 9 | 4 | 5 | 5 | 6 | 5 |
CAARS 2–Short
Comparisons between Self-Report and Observer T-scores
Table I.3a. CAARS 2–Short Critical Values Denoting Statistically Significant Differences Between Self-Report and Observer Raters: Normative Sample–Combined Gender
| CAARS 2–Short Scale | Age 18–24 | Age 25–29 | Age 30–39 | Age 40–49 | Age 50–59 | Age 60–69 | Age 70+ |
|---|---|---|---|---|---|---|---|
| Inattention/Executive Dysfunction | 5 | 4 | 5 | 4 | 5 | 5 | 5 |
| Hyperactivity | 7 | 6 | 7 | 7 | 7 | 7 | 8 |
| Impulsivity | 7 | 7 | 7 | 7 | 7 | 7 | 8 |
| Emotional Dysregulation | 6 | 7 | 6 | 6 | 6 | 6 | 7 |
| Negative Self-Concept | 7 | 6 | 6 | 7 | 7 | 7 | 9 |
Table I.3b. CAARS 2–Short Critical Values Denoting Statistically Significant Differences Between Self-Report and Observer Raters: Normative Sample Gender Specific–Males
| CAARS 2–Short Scale | Age 18–24 | Age 25–29 | Age 30–39 | Age 40–49 | Age 50–59 | Age 60–69 | Age 70+ |
|---|---|---|---|---|---|---|---|
| Inattention/Executive Dysfunction | 6 | 5 | 5 | 5 | 5 | 5 | 6 |
| Hyperactivity | 8 | 7 | 8 | 8 | 8 | 8 | 9 |
| Impulsivity | 8 | 8 | 9 | 8 | 8 | 8 | 9 |
| Emotional Dysregulation | 7 | 8 | 7 | 7 | 8 | 7 | 8 |
| Negative Self-Concept | 9 | 7 | 7 | 8 | 8 | 8 | 10 |
Table I.3c. CAARS 2–Short Critical Values Denoting Statistically Significant Differences Between Self-Report and Observer Raters: Normative Sample Gender Specific–Females
| CAARS 2–Short Scale | Age 18–24 | Age 25–29 | Age 30–39 | Age 40–49 | Age 50–59 | Age 60–69 | Age 70+ |
|---|---|---|---|---|---|---|---|
| Inattention/Executive Dysfunction | 4 | 4 | 4 | 4 | 4 | 4 | 5 |
| Hyperactivity | 6 | 5 | 6 | 6 | 6 | 6 | 6 |
| Impulsivity | 6 | 6 | 6 | 6 | 6 | 6 | 7 |
| Emotional Dysregulation | 5 | 6 | 5 | 5 | 6 | 5 | 6 |
| Negative Self-Concept | 6 | 5 | 5 | 6 | 6 | 6 | 7 |
Comparisons between Two Observers’ T-scores
Table I.4a. CAARS 2–Short Critical Values Denoting Statistically Significant Differences Between Two Observer Raters: Normative Sample–Combined Gender
| CAARS 2–Short Scale | Age 18–24 | Age 25–29 | Age 30–39 | Age 40–49 | Age 50–59 | Age 60–69 | Age 70+ |
|---|---|---|---|---|---|---|---|
| Inattention/Executive Dysfunction | 5 | 4 | 4 | 4 | 4 | 5 | 5 |
| Hyperactivity | 5 | 4 | 4 | 4 | 4 | 5 | 5 |
| Impulsivity | 6 | 6 | 5 | 6 | 6 | 7 | 6 |
| Emotional Dysregulation | 6 | 5 | 6 | 5 | 5 | 6 | 6 |
| Negative Self-Concept | 7 | 7 | 7 | 7 | 8 | 7 | 8 |
Table I.4b. CAARS 2–Short Critical Values Denoting Statistically Significant Differences Between Two Observer Raters: Normative Sample Gender Specific–Males
| CAARS 2–Short Scale | Age 18–24 | Age 25–29 | Age 30–39 | Age 40–49 | Age 50–59 | Age 60–69 | Age 70+ |
|---|---|---|---|---|---|---|---|
| Inattention/Executive Dysfunction | 6 | 5 | 4 | 5 | 5 | 5 | 6 |
| Hyperactivity | 7 | 6 | 6 | 7 | 7 | 7 | 7 |
| Impulsivity | 7 | 7 | 6 | 7 | 7 | 8 | 7 |
| Emotional Dysregulation | 7 | 6 | 7 | 6 | 5 | 7 | 7 |
| Negative Self-Concept | 9 | 8 | 8 | 8 | 10 | 8 | 9 |
Table I.4c. CAARS 2–Short Critical Values Denoting Statistically Significant Differences Between Two Observer Raters: Normative Sample Gender Specific–Females
| CAARS 2–Short Scale | Age 18–24 | Age 25–29 | Age 30–39 | Age 40–49 | Age 50–59 | Age 60–69 | Age 70+ |
|---|---|---|---|---|---|---|---|
| Inattention/Executive Dysfunction | 4 | 4 | 3 | 3 | 4 | 4 | 4 |
| Hyperactivity | 5 | 4 | 4 | 5 | 5 | 5 | 5 |
| Impulsivity | 5 | 5 | 5 | 5 | 5 | 6 | 5 |
| Emotional Dysregulation | 5 | 5 | 5 | 4 | 4 | 5 | 5 |
| Negative Self-Concept | 6 | 6 | 6 | 6 | 7 | 6 | 7 |
Differences Across Time
When the same rater completes the CAARS 2 more than once, those different points in time can be compared
statistically. The Reliable Change Index (RCI, see Jacobson & Truax, 1991) is a commonly used method to determine
whether a change in scores between test administrations is statistically significant.
A statistically significant result means that the measured change can be attributed to reliable differences between
the scores, rather than random fluctuations in behavior or error in measurement. If the CAARS 2 score has increased
significantly from the pre-test to the post-test, then the individual’s ratings reflect a significant increase in
concerns or worsening of symptoms. If the CAARS 2 score has decreased significantly, then the individual’s ratings
reflect a significant decrease in concerns, that is, improvement.
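As a rough illustration of the idea behind the RCI, the sketch below follows the general Jacobson and Truax (1991) approach: the standard error of measurement is derived from the score's standard deviation and reliability, and the standard error of the difference between two administrations is the square root of twice the squared SEM. The reliability value used here is a hypothetical placeholder, and whether the tabled values in this appendix were derived in exactly this way is an assumption.

```python
import math


def standard_error_of_difference(sd: float, reliability: float) -> float:
    """S_diff = sqrt(2 * SEM**2), with SEM = sd * sqrt(1 - reliability)."""
    sem = sd * math.sqrt(1.0 - reliability)
    return math.sqrt(2.0 * sem ** 2)


def reliable_change_index(score_1: float, score_2: float, sd: float, reliability: float) -> float:
    """Change from Time 1 to Time 2 expressed in units of S_diff."""
    return (score_2 - score_1) / standard_error_of_difference(sd, reliability)


def minimum_reliable_change(sd: float, reliability: float, z: float = 1.96) -> float:
    """Smallest absolute change that is significant at the given z (1.96 for p < .05)."""
    return z * standard_error_of_difference(sd, reliability)


# T-scores have a standard deviation of 10; the reliability of .80 is hypothetical.
print(round(minimum_reliable_change(sd=10.0, reliability=0.80), 1))
```

A change is treated as reliable when the absolute value of `reliable_change_index` is at least 1.96, which is equivalent to the raw change meeting or exceeding `minimum_reliable_change`.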
The following tables provide RCI values (p < .05) for comparing CAARS 2 ratings across two points in time. Use
Tables I.5 and I.7 when comparing T-scores from the full-length CAARS 2 (Table I.5) or CAARS 2–Short form
(Table I.7). In some instances, it may be necessary to compare raw scores across two points in time, such as when
enough years pass that the individual changes normative samples (e.g., a 24-year-old turns 25, moving from the
18–24 years group into the 25–29 years group). When comparing raw scores, use Table I.6 (full-length CAARS 2) or
Table I.8 (CAARS 2–Short).
Remember that the scores being compared must be obtained from the same rater about the same person, and the time
interval between administrations should be more than four weeks. Note that age and type of normative sample
(Combined Gender, Female, or Male) are not considerations for this comparison.
To determine whether the difference between scores over time is statistically significant:
- Find the appropriate table based on the form length (full-length or short form) and the type of score being
  compared (T-score or raw score).
- Find the column for the rater type (Self-Report or Observer).
- Find the row for the scale that is being compared and identify the RCI.
- Compare the absolute difference between the two scores with the RCI. The difference is significant (p < .05)
  when it is equal to or greater than the RCI.
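These steps can be outlined in code. The sketch below is illustrative only: the dictionary keys and function names are hypothetical, the table labels follow the table titles in this appendix, and the rule of switching to raw scores whenever the normative age group differs between administrations is one reading of the guidance above rather than a stated requirement.

```python
# Look-up table label by (form length, score type).
RCI_TABLES = {
    ("full-length", "T-score"): "I.5",
    ("full-length", "raw"): "I.6",
    ("short", "T-score"): "I.7",
    ("short", "raw"): "I.8",
}


def score_type_to_compare(age_group_time_1: str, age_group_time_2: str) -> str:
    """Suggest raw scores when the normative age group changed between administrations."""
    return "T-score" if age_group_time_1 == age_group_time_2 else "raw"


def is_reliable_change(score_time_1: int, score_time_2: int, rci: int) -> bool:
    """True when the absolute change meets or exceeds the RCI from the table."""
    return abs(score_time_2 - score_time_1) >= rci


# A person who moved from the 18-24 group to the 25-29 group between administrations.
score_type = score_type_to_compare("18-24", "25-29")
print(score_type, RCI_TABLES[("full-length", score_type)])  # prints raw I.6
```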
For example, a spouse’s rating on the full-length CAARS 2 of a 34-year-old man resulted in a baseline Hyperactivity
T-score of 85 and a post-treatment T-score of 70 after 6 months in treatment. The absolute difference between these
scores is 15. Because the comparison involved T-scores from the full-length CAARS 2, the clinician identified
Table I.5 as the correct look-up table. Using the Observer column and the Hyperactivity row, the clinician
identified the RCI value of 12. Because the absolute difference of 15 is greater than the RCI of 12, this change is
statistically significant at the p < .05 level. The score dropped 15 points between Time 1 and Time 2, indicating a
decreased report of concerns; this difference suggests improvement in the area of Hyperactivity. The score remains
in the Very Elevated range, which means this man likely continues to have challenges related to his restlessness
and activity levels, but he is making progress. In addition to stating that the change is statistically
significant, the clinician might also determine that the change is clinically meaningful, given observable
improvement.
Conversely, the spouse’s ratings for the Inattention/Executive Dysfunction scale were T-score = 74 at baseline and
T-score = 70 at post-treatment, an absolute difference of 4 points. Using the same table (Table I.5 for
full-length, T-score comparison) and column (Observer), the Inattention/Executive Dysfunction RCI is 9. Because the
absolute difference of 4 is smaller than the RCI of 9, this difference is not statistically significant. Clinical
judgment is needed to determine if this represents a need for extending the treatment plan and/or revising
interventions (e.g., changing medication type/dosage, increasing the frequency of therapy, revising therapy
modalities).
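The two comparisons in this example can be checked with a short sketch. The RCI values are the Content Scales rows of Table I.5; the dictionary layout, the function name, and the wording of the returned labels are hypothetical.

```python
# Content Scales rows of Table I.5 (full-length CAARS 2, T-scores).
TABLE_I5 = {
    "Inattention/Executive Dysfunction": {"Self-Report": 7, "Observer": 9},
    "Hyperactivity": {"Self-Report": 8, "Observer": 12},
    "Impulsivity": {"Self-Report": 8, "Observer": 12},
    "Emotional Dysregulation": {"Self-Report": 10, "Observer": 12},
    "Negative Self-Concept": {"Self-Report": 12, "Observer": 12},
}


def describe_change(scale: str, rater: str, time_1: int, time_2: int) -> str:
    """Classify a Time 1 to Time 2 T-score change against the tabled RCI."""
    rci = TABLE_I5[scale][rater]
    change = time_2 - time_1
    if abs(change) < rci:
        return "no significant change"
    # Lower scores indicate fewer reported concerns.
    return "significant decrease" if change < 0 else "significant increase"


print(describe_change("Hyperactivity", "Observer", 85, 70))
# prints significant decrease
print(describe_change("Inattention/Executive Dysfunction", "Observer", 74, 70))
# prints no significant change
```

Statistical significance is only one input here; whether a change is clinically meaningful remains a matter of clinical judgment.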
CAARS 2 (Full-Length)
Table I.5. CAARS 2 (full-length) Minimum Values Needed for Significance When Comparing Time 1 to Time 2 T-scores
| Scale | Self-Report | Observer |
|---|---|---|
| Content Scales | | |
| Inattention/Executive Dysfunction | 7 | 9 |
| Hyperactivity | 8 | 12 |
| Impulsivity | 8 | 12 |
| Emotional Dysregulation | 10 | 12 |
| Negative Self-Concept | 12 | 12 |
| DSM Symptom Scales | | |
| ADHD Inattentive Symptoms | 7 | 9 |
| ADHD Hyperactive/Impulsive Symptoms | 8 | 12 |
| Total ADHD Symptoms | 7 | 9 |
Table I.6. CAARS 2 (full-length) Minimum Values Needed for Significance When Comparing Time 1 to Time 2 Raw Scores
| Scale | Self-Report | Observer |
|---|---|---|
| Content Scales | | |
| Inattention/Executive Dysfunction | 13 | 17 |
| Hyperactivity | 7 | 8 |
| Impulsivity | 6 | 9 |
| Emotional Dysregulation | 5 | 8 |
| Negative Self-Concept | 6 | 5 |
| DSM Symptom Scales | | |
| ADHD Inattentive Symptoms | 6 | 7 |
| ADHD Hyperactive/Impulsive Symptoms | 6 | 7 |
| Total ADHD Symptoms | 10 | 12 |
CAARS 2–Short
Table I.7. CAARS 2–Short Minimum Values Needed for Significance When Comparing Time 1 to Time 2 T-scores
| Scale | Self-Report | Observer |
|---|---|---|
| Inattention/Executive Dysfunction | 7 | 9 |
| Hyperactivity | 9 | 10 |
| Impulsivity | 8 | 13 |
| Emotional Dysregulation | 10 | 12 |
| Negative Self-Concept | 13 | 14 |
Table I.8. CAARS 2–Short Minimum Values Needed for Significance When Comparing Time 1 to Time 2 Raw Scores
| Scale | Self-Report | Observer |
|---|---|---|
| Inattention/Executive Dysfunction | 6 | 7 |
| Hyperactivity | 5 | 4 |
| Impulsivity | 4 | 6 |
| Emotional Dysregulation | 4 | 5 |
| Negative Self-Concept | 5 | 5 |