CAARS 2 Manual

Appendix K: Impact of COVID-19 on Normative Scores

The Conners Adult ADHD Rating Scales 2nd Edition (CAARS™ 2) released norms based on samples collected in 2019 (see chapter 7, Standardization, for details about the Normative Samples). Since those data were collected, however, the COVID-19 pandemic has disrupted life for many people around the world. Part of the disruption came from public health measures (e.g., physical distancing, working from home) that helped control the spread of COVID-19 but also had unintended consequences, such as adverse effects on mental health. For example, 4 in 10 adults in the U.S. reported symptoms of anxiety or depression during the pandemic, compared to 1 in 10 in 2019, before the pandemic (National Center for Health Statistics, 2022).

Although adverse mental health symptoms appeared to increase during this time, it is important to investigate whether a fundamental shift occurred in the normative reports of ADHD symptoms and features, as measured by the CAARS 2. If the pandemic has affected how ADHD manifests, updates to the CAARS 2 may be needed to prevent over- or under-identification of ADHD and to maintain the CAARS 2’s strong psychometric properties.

To investigate this, a study was conducted to compare a general population sample collected after the pandemic outbreak (termed the “2022 sample” throughout this section) with a demographically matched subset of the existing Normative Sample established before the pandemic outbreak (termed the “2019 sample” throughout this section; see chapter 7, Standardization). In this study, the psychometric properties of the two samples were compared to determine whether a meaningful change had occurred. Evidence of reliability and validity was investigated for the CAARS 2 scales and item-level content, and findings are reported in this appendix. Overall, minimal differences were found, supporting the continued use of the Normative Samples.

Sample

Data collection for the 2022 sample took place from late May to early June 2022, with responses to the CAARS 2 collected digitally via online recruitment. Participants accessed the assessment via an email link that was shared with them, and they completed either the Self-Report or Observer form for a small monetary compensation.

Three hundred and sixteen adults completed the CAARS 2 Self-Report form, and 314 adults completed the CAARS 2 Observer form. Eligible individuals for this study were from the U.S., were aged 18 years or older, read English “very well,” and did not have a clinical diagnosis. Data were cleaned prior to analysis. Based on data quality metrics (e.g., indicators of careless/random responding), 51 and 64 individuals were removed from the Self-Report and Observer samples, respectively. To compare these samples with the 2019 Normative Samples, individuals were matched on gender, age (which was coarsened to five ordered levels), race/ethnicity, and education level. This process resulted in the further exclusion of 16 cases from each of the Self-Report and Observer samples, due to a lack of matching demographic characteristics in the Normative Sample.
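The matching step can be pictured with a minimal sketch. It assumes one-to-one exact matching on the coarsened covariates, which is only one way the procedure described above could have been carried out; the age cut points, field names, and the `match_samples` helper are hypothetical and are not taken from the manual.

```python
from collections import defaultdict

# Hypothetical age cut points giving five ordered levels (not from the manual).
AGE_EDGES = [25, 35, 50, 65]

def age_level(age):
    """Coarsen age in years to one of five ordered levels (0-4)."""
    return sum(age >= edge for edge in AGE_EDGES)

def match_key(person):
    """Covariates used for matching: gender, coarsened age, race/ethnicity, education."""
    return (person["gender"], age_level(person["age"]),
            person["race_ethnicity"], person["education"])

def match_samples(new_sample, normative_pool):
    """Pair each new case with one unused normative case sharing the same key.

    Returns (pairs, unmatched); unmatched cases are the ones that would be excluded.
    """
    pool = defaultdict(list)
    for person in normative_pool:
        pool[match_key(person)].append(person)

    pairs, unmatched = [], []
    for person in new_sample:
        candidates = pool[match_key(person)]
        if candidates:
            pairs.append((person, candidates.pop()))
        else:
            unmatched.append(person)
    return pairs, unmatched

# Toy records for illustration only.
normative = [
    {"id": "n1", "gender": "F", "age": 29, "race_ethnicity": "White", "education": "College"},
    {"id": "n2", "gender": "M", "age": 41, "race_ethnicity": "Black", "education": "High school"},
]
new = [
    {"id": "a", "gender": "F", "age": 33, "race_ethnicity": "White", "education": "College"},
    {"id": "b", "gender": "M", "age": 70, "race_ethnicity": "Black", "education": "High school"},
]
pairs, unmatched = match_samples(new, normative)
print(len(pairs), len(unmatched))
```

In this toy example, case "a" finds a partner in the same age level, while case "b" has no normative counterpart and would be dropped, mirroring the exclusions described above.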

The final sample size for the 2022 sample was 249 individuals for the Self-Report (with a corresponding 249 matched individuals from the 2019 Normative Sample; see chapter 7, Standardization, for information about the full Normative Sample) and 234 individuals for Observer (with a corresponding 234 matched individuals from the 2019 Normative Sample). The demographic characteristics for both the 2019 and 2022 samples for the Self-Report and Observer forms can be seen in Appendix J, Tables J.36 and J.37.
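The sample flow can be checked with simple arithmetic. The short sketch below uses only the counts reported in this section; the dictionary layout and the `final_n` helper are illustrative.

```python
# Counts taken from the text above.
flow = {
    "Self-Report": {"completed": 316, "failed_quality_checks": 51, "unmatched": 16},
    "Observer": {"completed": 314, "failed_quality_checks": 64, "unmatched": 16},
}

def final_n(step):
    """Completed cases minus data-quality removals and unmatched cases."""
    return step["completed"] - step["failed_quality_checks"] - step["unmatched"]

final = {form: final_n(step) for form, step in flow.items()}
print(final)  # {'Self-Report': 249, 'Observer': 234}
```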

Reliability

The reliability of the CAARS 2 scales for the 2022 samples was assessed via internal consistency estimates (alpha and omega) and test information functions (see Internal Consistency and Test Information in chapter 8, Reliability, for more details). Internal consistency estimates were excellent (see Table K.1) and in line with those for the full Normative Samples (see Tables 8.1a, 8.1b, 8.2a, and 8.2b in chapter 8, Reliability); the median omega was .96 for both the Self-Report and Observer.
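For readers who want to see how an alpha coefficient is obtained, the sketch below computes it from a small respondent-by-item matrix using the standard formula (number of items, sum of item variances, variance of total scores). The responses are hypothetical, not CAARS 2 data, and the `cronbach_alpha` helper is illustrative; omega is not shown because it requires fitting a factor model.

```python
from statistics import variance

def cronbach_alpha(rows):
    """Coefficient alpha for a list of respondents, each a list of item scores."""
    k = len(rows[0])                       # number of items
    items = list(zip(*rows))               # one tuple of scores per item
    item_var_sum = sum(variance(col) for col in items)
    total_var = variance([sum(r) for r in rows])
    return (k / (k - 1)) * (1 - item_var_sum / total_var)

# Hypothetical responses (4 respondents x 3 items), not CAARS 2 data.
responses = [
    [0, 1, 1],
    [1, 1, 2],
    [2, 3, 2],
    [3, 3, 3],
]
alpha = cronbach_alpha(responses)
print(round(alpha, 2))
```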

Test information functions for the CAARS 2 Self-Report and Observer were also examined for the 2019 and 2022 samples (see Figure K.1 and Figure K.2 for the Self-Report and Observer, respectively). All test information functions had high information across the relevant range of the ability scale, with peaks exceeding values of 10, indicating very high measurement precision. Test information functions were similarly above recommended guidelines for both the 2019 and 2022 samples.
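To give a sense of what an information value of 10 implies, the sketch below applies the usual item response theory relationship between test information and the conditional standard error of the trait estimate (SE = 1 / sqrt(information)). The reliability conversion additionally assumes a latent trait variance of 1, which is an assumption of this sketch and not a statement from the manual.

```python
import math

def standard_error(information):
    """Conditional standard error of the trait estimate at a given information value."""
    return 1.0 / math.sqrt(information)

def approx_reliability(information, trait_variance=1.0):
    """Rough conditional reliability, assuming the latent trait has the given variance."""
    return 1.0 - (standard_error(information) ** 2) / trait_variance

for info in (5, 10, 20):
    print(info, round(standard_error(info), 3), round(approx_reliability(info), 2))
```

Under these assumptions, an information value of 10 corresponds to a standard error of about 0.32 trait units and a conditional reliability of about .90.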

Taken together, results from the internal consistency analysis and test information functions revealed that scale reliability for the CAARS 2 Self-Report and Observer has remained sufficiently precise since the outbreak of the COVID-19 pandemic.

Table K.1. Internal Consistency: 2022 Samples for the CAARS 2

Scale		Self-Report		Observer
Scale		Alpha	Omega	Alpha	Omega
Content Scales	Inattention/Executive Dysfunction	.98	.98	.98	.98
	Hyperactivity	.96	.96	.96	.96
	Impulsivity	.94	.95	.95	.96
	Emotional Dysregulation	.95	.95	.96	.96
	Negative Self-Concept	.91	.92	.91	.91
DSM Symptom Scales	ADHD Inattentive Symptoms	.96	.96	.97	.97
	ADHD Hyperactive/Impulsive Symptoms	.95	.96	.95	.96
	Total ADHD Symptoms	.98	.98	.98	.98

Note. Self-Report N = 249; Observer N = 234.

Figure K.1. Test Information Functions for the CAARS 2 Self-Report: 2019 and 2022 Samples. Panels: (a) Inattention/Executive Dysfunction; (b) Hyperactivity; (c) Impulsivity; (d) Emotional Dysregulation; (e) Negative Self-Concept; (f) DSM ADHD Inattentive Symptoms; (g) DSM Hyperactive/Impulsive Symptoms; (h) DSM ADHD Total Symptoms.

Figure K.2. Test Information Functions for the CAARS 2 Observer: 2019 and 2022 Samples. Panels: (a) Inattention/Executive Dysfunction; (b) Hyperactivity; (c) Impulsivity; (d) Emotional Dysregulation; (e) Negative Self-Concept; (f) DSM ADHD Inattentive Symptoms; (g) DSM Hyperactive/Impulsive Symptoms; (h) DSM ADHD Total Symptoms.

Validity

Content and DSM Symptom Scales

CAARS 2 Content Scales were analyzed for invariance of the factor structure and latent traits using measurement invariance (MI) and differential test functioning (DTF); see appendix M for details about these methodologies. The DSM Symptom Scales were not analyzed in this way because their items are subsumed by the Content Scales and the scales are not factor-derived. Mean scale score differences between the two samples were also computed. Results of the MI analyses are presented in Tables K.2 and K.3, and examples of results for the DTF approach are presented graphically in Figure K.3, with effect sizes quantifying the differences reported in Table K.4.

Overall, the CAARS 2 Content Scales were found to be invariant across the 2019 and 2022 samples and no DTF was detected. These results provide strong evidence that the structure of the CAARS 2 Content Scales did not change as a result of the COVID-19 pandemic.
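As an illustration of how the ΔCFI column in Tables K.2 and K.3 can be read, the sketch below recomputes the change in CFI between successive invariance models for one scale. The cutoff used here (a drop in CFI of more than .01 suggesting non-invariance) is a commonly cited guideline and an assumption of this sketch; the criteria actually applied are described in appendix M.

```python
# CFI values for the Self-Report Hyperactivity scale, copied from Table K.2.
cfi = {"Configural": 0.955, "Weak": 0.954, "Strong": 0.955, "Strict": 0.956}

# Assumed cutoff (not stated in this appendix): a drop in CFI larger than .01
# is often read as a sign of non-invariance.
MAX_CFI_DROP = 0.01

def cfi_changes(cfi_by_model):
    """Change in CFI for each model relative to the previous, less constrained one."""
    names = list(cfi_by_model)
    return {
        later: round(cfi_by_model[later] - cfi_by_model[earlier], 3)
        for earlier, later in zip(names, names[1:])
    }

changes = cfi_changes(cfi)
print(changes)  # {'Weak': -0.001, 'Strong': 0.001, 'Strict': 0.001}
print(all(delta >= -MAX_CFI_DROP for delta in changes.values()))  # True
```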

Table K.2. Measurement Invariance (2019 vs. 2022): CAARS 2 Self-Report

Scale	Invariance Model	χ²	df	RMSEA	CFI	TLI	SRMR	Satorra-Bentler χ²	df	ΔCFI
Inattention/Executive Dysfunction	Configural	1467.92***	810	.057	.975	.973	.054	--
	Weak	1495.29***	840	.056	.975	.974	.054	24.41	30	.000
	Strong	1507.07***	869	.054	.976	.976	.054	34.74	29	.001
	Strict	1517.01***	897	.053	.977	.977	.054	39.58	28	.001
Hyperactivity	Configural	485.30***	130	.105	.955	.946	.071	--
	Weak	508.50***	143	.102	.954	.950	.071	17.04	13	-.001
	Strong	510.52***	155	.096	.955	.955	.071	16.62	12	.001
	Strict	512.44***	167	.091	.956	.959	.072	23.86	12	.001
Impulsivity	Configural	285.40***	130	.069	.974	.968	.055	--
	Weak	298.48***	143	.066	.974	.971	.055	8.76	13	.000
	Strong	295.08***	155	.060	.976	.976	.056	9.79	12	.003
	Strict	315.98***	167	.060	.975	.977	.057	24.76	12	-.001
Emotional Dysregulation	Configural	204.23***	54	.106	.980	.973	.051	--
	Weak	219.07***	63	.100	.979	.976	.051	8.17	9	-.001
	Strong	216.78***	71	.091	.980	.980	.051	5.96	8	.001
	Strict	210.81***	79	.082	.982	.984	.051	7.21	8	.002
Negative Self-Concept	Configural	88.83***	28	.094	.990	.985	.033	--
	Weak	100.51***	35	.087	.989	.987	.033	9.16	7	-.001
	Strong	106.35***	41	.080	.989	.989	.034	10.03	6	.000
	Strict	111.82***	47	.075	.989	.990	.034	10.93	6	.000

Note. N = 249 for the 2022 sample; N = 249 for the 2019 sample. RMSEA = Root mean square error of approximation; CFI = Comparative Fit Index; TLI = Tucker-Lewis Index; SRMR = Standardized root mean square residual; ∆CFI = change in CFI. *p < .05, **p < .01, ***p < .001. Exploration of partial invariance models for the Inattention/Executive Dysfunction scale revealed that one intercept had to be released for the strict invariance hypothesis to hold.

Table K.3. Measurement Invariance (2019 vs. 2022): CAARS 2 Observer

Scale	Invariance Model	χ²	df	RMSEA	CFI	TLI	SRMR	Satorra-Bentler χ²	df	ΔCFI
Inattention/Executive Dysfunction	Configural	1275.40***	810	.050	.984	.983	.050	--
	Weak	1312.35***	840	.049	.984	.984	.050	46.70	30	.000
	Strong	1323.18***	869	.047	.985	.985	.050	24.92	29	.001
	Strict	1340.83***	898	.046	.985	.986	.050	39.01	29	.000
Hyperactivity	Configural	510.52***	130	.112	.959	.951	.087	--
	Weak	537.87***	143	.109	.958	.954	.087	20.05	13	-.001
	Strong	542.49***	155	.104	.959	.958	.087	13.41	12	.001
	Strict	509.77***	167	.094	.963	.966	.088	6.93	12	.005
Impulsivity	Configural	249.98***	130	.063	.984	.981	.049	--
	Weak	269.72***	143	.062	.984	.982	.049	19.87	13	.000
	Strong	281.34***	155	.059	.984	.984	.049	16.55	12	.000
	Strict	271.00***	167	.052	.987	.987	.049	7.36	12	.003
Emotional Dysregulation	Configural	247.54***	54	.124	.980	.973	.048	--
	Weak	268.58***	62	.120	.978	.975	.048	17.79	8	.000
	Strong	263.43***	70	.109	.980	.979	.049	8.00	8	.002
	Strict	250.82***	78	.098	.982	.983	.049	9.28	8	.002
Negative Self-Concept	Configural	57.49***	28	.067	.992	.988	.038	--
	Weak	67.09***	35	.063	.991	.989	.038	8.06	7	-.001
	Strong	61.33***	41	.046	.994	.994	.038	2.17	6	.003
	Strict	66.00***	47	.042	.995	.995	.040	7.21	6	.001

Note. N = 234 for the 2022 sample; N = 234 for the 2019 sample. RMSEA = Root mean square error of approximation; CFI = Comparative Fit Index; TLI = Tucker-Lewis Index; SRMR = Standardized root mean square residual; ∆CFI = change in CFI. *p < .05, **p < .01, ***p < .001. Exploration of partial invariance models for the Emotional Dysregulation Scale revealed that one threshold had to be released for the weak invariance hypothesis to hold.

Figure K.3. Differential Test Functioning (2019 vs. 2022). Example panels: (a1) Self-Report, Inattention; (a2) Self-Report, Hyperactivity; (b1) Observer, Inattention; (b2) Observer, Hyperactivity.

Table K.4. Differential Test Functioning Effect Sizes (2019 vs. 2022)

Scale	Self-Report	Observer
Inattention/Executive Dysfunction	.01	.01
Hyperactivity	.00	.00
Impulsivity	.08	.00
Emotional Dysregulation	.00	.00
Negative Self-Concept	.00	.00

Note. Values are expected test score standardized differences (ETSSD). Guidelines for interpreting the absolute value of the effect size: small ≥ 0.20; medium ≥ 0.50; large ≥ 0.80. Positive ETSSD values indicate that, at equal levels of the construct being measured, individuals in the 2022 sample scored higher than individuals in the 2019 sample.

Mean scale scores for the 2019 and 2022 samples were compared, and results are presented in Tables K.5 and K.6. No statistically significant differences were observed in Content Scale or DSM Symptom Scale scores between the 2019 and 2022 samples (all p > .01), and all effect sizes were small or negligible (median d = 0.12 for both the Self-Report and Observer). Thus, mean observed scale scores did not appear to differ between the 2019 and 2022 samples.
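The effect sizes and t statistics in Tables K.5 and K.6 can be approximated from the reported means, standard deviations, and sample sizes. The sketch below assumes Cohen's d was computed with a pooled standard deviation, which the manual does not state explicitly, and the helper names are illustrative. Because the tabled means and standard deviations are rounded, recomputed values can differ slightly from those reported.

```python
import math

def cohens_d(m1, sd1, n1, m2, sd2, n2):
    """Standardized mean difference (group 2 minus group 1) using the pooled SD."""
    pooled_var = ((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / (n1 + n2 - 2)
    return (m2 - m1) / math.sqrt(pooled_var)

def t_from_d(d, n1, n2):
    """Independent-samples t statistic implied by d; df = n1 + n2 - 2."""
    return d / math.sqrt(1 / n1 + 1 / n2)

# Self-Report Impulsivity row of Table K.5 (2019 sample first, 2022 sample second).
d = cohens_d(49.3, 9.8, 249, 51.7, 12.2, 249)
t = t_from_d(d, 249, 249)
print(round(d, 2), round(t, 2))
```

For this row, the recomputed d rounds to the reported 0.22, and the recomputed t is close to the reported 2.40.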

Table K.5. Mean Differences (2019 vs. 2022): CAARS 2 Self-Report

Scale		2019 Sample		2022 Sample		Independent t-tests
Scale		M	SD	M	SD	Cohen's d	t (496)	p
Content Scales	Inattention/Executive Dysfunction	49.5	9.2	50.6	11.1	0.11	1.21	.226
	Hyperactivity	50.1	10.1	51.4	11.8	0.12	1.30	.195
	Impulsivity	49.3	9.8	51.7	12.2	0.22	2.40	.017
	Emotional Dysregulation	49.7	10.2	51.0	11.1	0.13	1.43	.154
	Negative Self-Concept	50.2	10.2	50.4	10.0	0.02	0.23	.815
DSM Symptom Scales	ADHD Inattentive Symptoms	49.6	9.4	50.5	11.0	0.08	0.95	.345
	ADHD Hyperactive/Impulsive Symptoms	49.9	9.6	51.2	12.0	0.12	1.37	.172
	Total ADHD Symptoms	49.7	9.5	50.9	11.8	0.11	1.18	.239

Note. N = 249 for the 2019 sample, and 249 for the 2022 sample. Guidelines for interpreting Cohen's |d|: negligible effect size < 0.20; small effect size = 0.20 to 0.49; medium effect size = 0.50 to 0.79; large effect size ≥ 0.80. A positive Cohen's d value indicates higher scores for the 2022 sample vs. the 2019 sample.

Table K.6. Mean Differences (2019 vs. 2022): CAARS 2 Observer

Scale		2019 Sample		2022 Sample		Independent t-tests
Scale		M	SD	M	SD	Cohen's d	t (466)	p
Content Scales	Inattention/Executive Dysfunction	49.2	9.9	50.3	11.7	0.10	1.09	.277
	Hyperactivity	49.4	10.0	51.0	12.3	0.14	1.55	.121
	Impulsivity	49.9	10.3	50.6	11.7	0.06	0.68	.498
	Emotional Dysregulation	49.6	9.9	50.6	11.7	0.09	0.95	.344
	Negative Self-Concept	48.8	9.3	49.8	10.1	0.10	1.09	.277
DSM Symptom Scales	ADHD Inattentive Symptoms	49.3	9.9	50.7	11.7	0.13	1.43	.155
	ADHD Hyperactive/Impulsive Symptoms	49.4	10.1	51.0	12.0	0.14	1.51	.132
	Total ADHD Symptoms	49.3	10.2	50.8	12.3	0.14	1.50	.134

Note. N = 234 for the 2019 sample, and 234 for the 2022 sample. Guidelines for interpreting Cohen's |d|: negligible effect size < 0.20; small effect size = 0.20 to 0.49; medium effect size = 0.50 to 0.79; large effect size ≥ 0.80. A positive Cohen's d value indicates higher scores for the 2022 sample vs. the 2019 sample.

CAARS™ 2–ADHD Index

The CAARS 2–ADHD Index was examined to confirm that it performed similarly across the 2019 and 2022 samples (see chapter 12, CAARS 2–ADHD Index, for more information on the development, scores, and psychometric properties of the CAARS 2–ADHD Index). The probability scores for each sample were compared using the Wilcoxon Rank Sum Test (Wilcoxon, 1945); this non-parametric approach was favored because the probability score does not meet assumptions of normality. An effect size, r (Rosenthal, 1991), is also provided, which can be interpreted using the correlation guidelines provided in Test-Retest Reliability in chapter 8, Reliability.

There was no statistically significant difference in the probability scores between the 2019 and 2022 samples for either the Self-Report (V = 32,864; p = .221; r = -0.05) or the Observer (V = 27,438; p = .966; r = -0.002), and the magnitude of both effects was very weak. Thus, the CAARS 2–ADHD Index appears to have operated similarly in the 2019 and 2022 samples.
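The size of r can be related to the reported p-values with a small sketch. It assumes that r is the standardized test statistic divided by the square root of the total number of observations (r = Z / sqrt(N)), that Z follows from the two-sided p-value under a normal approximation, and that N is the two matched samples combined. These are assumptions of the sketch, not details confirmed by the manual, and only the magnitude of r is recovered (not its sign).

```python
import math
from statistics import NormalDist

def z_from_two_sided_p(p):
    """Absolute z value whose two-sided normal tail probability equals p."""
    return NormalDist().inv_cdf(1 - p / 2)

def effect_size_r(p, n_total):
    """Magnitude of r = Z / sqrt(N), with Z recovered from a two-sided p-value."""
    return z_from_two_sided_p(p) / math.sqrt(n_total)

# Reported p-values, with N assumed to be both matched samples combined.
r_self = effect_size_r(0.221, 249 + 249)
r_obs = effect_size_r(0.966, 234 + 234)
print(round(r_self, 2))  # Self-Report
print(round(r_obs, 3))   # Observer
```

Under these assumptions, the recovered magnitudes round to 0.05 and 0.002, in line with the reported values.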

Associated Clinical Concern Items and Impairment & Functional Outcome Items

The Associated Clinical Concern Items and Impairment & Functional Outcome Items of the CAARS 2 were examined in both the 2019 and 2022 samples. For each item, the proportion of individuals endorsing the item or providing an elevated item response in each sample was determined (see the Associated Clinical Concerns: Item Selection and Scoring section in chapter 6, Development, for more information on how endorsed and elevated responses were determined for items in these scales). A chi-square test of independence was performed for each item to determine whether these proportions differed significantly between the 2019 and 2022 samples.
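A chi-square test of independence for two proportions can be sketched as follows. The example uses the Observer "Anxiety/worry" row of Table K.7; the counts (22 and 47 of 234) are inferred here from the reported percentages and are not given in the manual, and the sketch assumes no continuity correction was applied.

```python
import math

def chi_square_2x2(yes_a, n_a, yes_b, n_b):
    """Pearson chi-square (no continuity correction) for two independent proportions."""
    a, b = yes_a, n_a - yes_a
    c, d = yes_b, n_b - yes_b
    n = n_a + n_b
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    p = math.erfc(math.sqrt(chi2 / 2))   # upper tail of chi-square with 1 df
    return chi2, p

# Observer "Anxiety/worry": 9.4% and 20.1% of 234 correspond to about 22 and 47
# respondents (counts inferred from the percentages, not reported in the manual).
chi2, p = chi_square_2x2(22, 234, 47, 234)
print(round(chi2, 2), round(p, 3))
```

With these inferred counts, the statistic and p-value round to the values reported for this row (10.62 and .001).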

Results of the item-level analyses are presented in Table K.7. Some statistically significant differences were found for the Associated Clinical Concern Items (Observer only) and the Impairment & Functional Outcome Items (both Self-Report and Observer). Wherever a difference was significant, the proportion was higher in the 2022 sample than in the 2019 sample. Because the Content Scale scores appear unchanged over the same period, as described earlier in this appendix, it is possible that the pandemic and its associated struggles, rather than changes in the nature of ADHD symptoms and features themselves, are responsible for the increased frequency and/or severity of these item-level responses. Thus, it would be prudent to monitor these changes over the coming years.

Table K.7. Proportion of Associated Clinical Concern and Impairment & Functional Outcome Items Endorsed/Elevated for the 2019 and 2022 Samples

Item Set	Item Stem	Self-Report				Observer
Item Set	Item Stem	2019 Sample %	2022 Sample %	χ²	p	2019 Sample %	2022 Sample %	χ²	p
Associated Clinical Concern Items	Anxiety/worry	32.9	29.7	0.60	.440	9.4	20.1	10.62	.001
	Sadness/emptiness*	25.3	23.7	0.17	.677	8.1	12.4	2.32	.128
	Suicidal thoughts/attempts	35.3	32.5	0.44	.508	12.4	18.4	3.22	.073
	Self-Injury	15.7	22.1	3.36	.067	5.6	15.0	11.24	.001
Impairment & Functional Outcome Items	Bothered by things endorsed on the CAARS 2	14.9	21.3	3.47	.062	9.0	17.9	8.09	.004
	Things endorsed on the CAARS 2 interfere with life	18.5	16.2	0.45	.503	13.2	18.4	2.31	.128
	Problems in romantic/marital relationship(s)	27.3	31.7	1.17	.280	27.4	27.4	0.00	1.000
	Problems in relationships with family members	18.5	24.3	2.50	.114	16.2	21.8	2.34	.126
	Problems in relationships with friends, coworkers, or neighbors	14.5	20.6	3.21	.073	10.3	18.4	6.29	.012
	Problems at work and/or school	19.3	29.3	6.82	.009	18.0	21.4	0.82	.364
	Finds things harder than other people	12.0	16.5	1.99	.159	6.4	11.5	3.77	.052
	Underachiever	14.9	17.4	0.00	.000	11.1	12.8	0.32	.569
	Sleep problems	19.7	20.5	0.05	.823	23.9	27.4	0.72	.397
	Problems managing money	15.3	22.9	4.70	.030	17.1	24.4	3.76	.053
	Neglects family/household responsibilities	12.9	13.7	0.08	.778	10.3	16.7	4.13	.042
	Risky driving	14.9	25.3	8.46	.004	10.7	17.5	4.52	.034
	Problems due to time spent online	14.5	20.1	2.75	.097	17.1	22.6	2.27	.132

Note. * The item stem for this Screening item is Sadness/emptiness for Self-Report and Sadness for Observer.

Summary

The COVID-19 pandemic and its associated public health response led to worse mental health outcomes for many individuals (NCHS, 2022). To determine whether the CAARS 2 structure and its normative scores had changed, a study compared responses collected in 2022 with demographically matched subsets of the CAARS 2 Normative Samples. Overall, the reliability of the CAARS 2 in 2022 was comparable to that in 2019. Further, the validity of the CAARS 2 was not compromised, as evidenced by the invariance of its Content Scales, a lack of statistical and practical differences in observed mean scale scores, and a lack of difference in probability scores associated with the CAARS 2–ADHD Index. Although some differences were found for select Associated Clinical Concern Items and Impairment & Functional Outcome Items, the fact that the Content Scales did not change significantly suggests that these differences may reflect the influence of the pandemic itself. These elevations may return to their 2019 levels as the pandemic recedes, or they may remain elevated if its impacts persist.
