Appendix K: Impact of COVID-19 on Normative Scores


The Conners Adult ADHD Rating Scales 2nd Edition (CAARS™ 2) released norms based on samples collected in 2019 (see chapter 7, Standardization, for details about the Normative Samples). Since those data were collected, however, the COVID-19 pandemic has disrupted life for many around the world. Part of the disruption came from public health measures (e.g., physical distancing, working from home), which helped control the spread of COVID-19 but also had unintended consequences, such as adversely affecting mental health. For example, 4 in 10 adults in the U.S. reported symptoms of anxiety or depression during the pandemic, compared to 1 in 10 in 2019, before the pandemic (National Center for Health Statistics, 2022).

Although adverse mental health symptoms appeared to increase during this time, it is important to investigate whether a fundamental shift occurred in the normative reports of ADHD symptoms and features, as measured by the CAARS 2. If the pandemic has affected how ADHD manifests, updates to the CAARS 2 may be needed to prevent over- or under-identification of ADHD and to maintain the CAARS 2’s strong psychometric properties.

To investigate this, a study was conducted to compare a general population sample collected post-pandemic outbreak (termed the “2022 sample” throughout this section) with a demographically matched subset of the existing Normative Sample established before the pandemic outbreak (termed the “2019 sample” throughout this section; see chapter 7, Standardization). In this study, the psychometric properties of the two samples were compared to determine whether a meaningful change had occurred. Evidence of reliability and validity was investigated for the CAARS 2 scales and item-level content, and findings are reported in this appendix. Overall, minimal differences were found, supporting the continued use of the Normative Samples.

Sample

Data collection for the 2022 sample took place from late May to early June. Participants were recruited online and responded to the CAARS 2 digitally: they accessed the assessment via an emailed link and completed either the Self-Report or Observer form for a small monetary compensation.

Three hundred and sixteen adults completed the CAARS 2 Self-Report form, and 314 adults completed the CAARS 2 Observer form. To be eligible for this study, individuals had to be from the U.S., be aged 18 years or older, read English “very well,” and have no clinical diagnosis. Data were cleaned prior to analysis: 51 individuals were removed from the Self-Report sample and 64 from the Observer sample for violating one or more rigorous data quality metrics (e.g., indicators of careless/random responding). To compare these samples with the 2019 Normative Samples, individuals were matched on gender, age (which was coarsened to five ordered levels), race/ethnicity, and education level. This process resulted in the further exclusion of 16 cases from each of the Self-Report and Observer samples because no individuals with matching demographic characteristics were available in the Normative Sample.

The final sample size for the 2022 sample was 249 individuals for the Self-Report (with a corresponding 249 matched individuals from the 2019 Normative Sample; see chapter 7, Standardization, for information about the full Normative Sample) and 234 individuals for Observer (with a corresponding 234 matched individuals from the 2019 Normative Sample). The demographic characteristics for both the 2019 and 2022 samples for the Self-Report and Observer forms can be seen in Appendix J, Tables J.36 and J.37.
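The sample flow described above can be checked with simple arithmetic. The following is a minimal sketch; the function name is illustrative only, and the counts are the ones reported in this section.

```python
def final_sample_size(completed, removed_for_quality, removed_unmatched):
    """Cases remaining after data cleaning and demographic matching."""
    return completed - removed_for_quality - removed_unmatched


# Counts reported in the Sample section of this appendix.
self_report_n = final_sample_size(completed=316, removed_for_quality=51, removed_unmatched=16)
observer_n = final_sample_size(completed=314, removed_for_quality=64, removed_unmatched=16)

print(self_report_n)  # 249
print(observer_n)     # 234
```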

Reliability

The reliability of the CAARS 2 scales for the 2022 samples was assessed via internal consistency estimates (alpha and omega) and test information functions (see Internal Consistency and Test Information in chapter 8, Reliability, for more details). Internal consistency estimates were found to be excellent (see Table K.1) and in line with the full normative sample (see Tables 8.1a, 8.1b, 8.2a, and 8.2b in chapter 8, Reliability); the median omega for both the Self-Report and Observer was .96.
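For readers unfamiliar with coefficient alpha, the sketch below shows how it is commonly computed from an item-response matrix. The response data are hypothetical and are not CAARS 2 data, and the function is an illustration of the standard formula rather than the procedure used to produce Table K.1.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Coefficient alpha for a list of respondents, each a list of item scores."""
    n_items = len(responses[0])
    # Sample variance of each item across respondents.
    item_variances = [variance([person[i] for person in responses]) for i in range(n_items)]
    # Sample variance of the total (summed) score.
    total_variance = variance([sum(person) for person in responses])
    return (n_items / (n_items - 1)) * (1 - sum(item_variances) / total_variance)


# Hypothetical responses: 5 respondents by 4 items scored 0 to 3.
toy_responses = [
    [0, 1, 1, 0],
    [1, 1, 2, 1],
    [2, 2, 3, 2],
    [3, 3, 3, 3],
    [1, 0, 1, 1],
]
alpha = cronbach_alpha(toy_responses)
print(round(alpha, 2))
```

Alpha increases as items covary more strongly relative to their individual variances, which is why highly consistent item sets yield values near 1.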

Test information functions for the CAARS 2 Self-Report and Observer were also examined for the 2019 and 2022 samples (see Figure K.1 and Figure K.2 for the Self-Report and Observer, respectively). All test information functions had high information across the relevant range of the ability scale, with peaks exceeding values of 10, indicating very high measurement precision. Test information functions were similarly above recommended guidelines for both the 2019 and 2022 samples.
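To illustrate why information values above 10 indicate high precision, the sketch below converts test information into a standard error of measurement and an approximate reliability. It assumes the latent trait is scaled to unit variance, which is a common convention but an assumption here; the function names are illustrative.

```python
import math


def standard_error(information):
    """Standard error of the trait estimate at a given level of test information."""
    return 1.0 / math.sqrt(information)


def approximate_reliability(information, trait_variance=1.0):
    """Reliability implied by the information value (assumes the stated trait variance)."""
    return 1.0 - (1.0 / information) / trait_variance


se_at_10 = standard_error(10)
reliability_at_10 = approximate_reliability(10)
print(round(se_at_10, 3))           # 0.316
print(round(reliability_at_10, 2))  # 0.9
```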

Taken together, results from the internal consistency analysis and test information functions revealed that the CAARS 2 Self-Report and Observer scales have remained sufficiently reliable and precise since the outbreak of the COVID-19 pandemic.


Validity

Content and DSM Symptom Scales

CAARS 2 Content Scales were analyzed for invariance of the factor structure and latent traits using measurement invariance (MI) and differential test functioning (DTF); see appendix M for details about these methodologies. The DSM Symptom Scales were not analyzed in this way, as they are not factor-derived and their items are subsumed by the Content Scales. Mean scale score differences between the two samples were also computed. Results of the MI analyses are presented in Tables K.2 and K.3. Examples of results for the DTF approach are presented graphically in Figure K.3, with effect sizes quantifying the differences reported in Table K.4.

Overall, the CAARS 2 Content Scales were found to be invariant across the 2019 and 2022 samples and no DTF was detected. These results provide strong evidence that the structure of the CAARS 2 Content Scales did not change as a result of the COVID-19 pandemic.


Table K.2. Measurement Invariance (2019 vs. 2022): CAARS 2 Self-Report

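| Scale | Invariance Model | χ2 | df | RMSEA | CFI | TLI | SRMR | Satorra-Bentler χ2 | Satorra-Bentler df | ΔCFI |
|---|---|---|---|---|---|---|---|---|---|---|
| Inattention/Executive Dysfunction | Configural | 1467.92*** | 810 | .057 | .975 | .973 | .054 | -- | -- | -- |
| Inattention/Executive Dysfunction | Weak | 1495.29*** | 840 | .056 | .975 | .974 | .054 | 24.41 | 30 | .000 |
| Inattention/Executive Dysfunction | Strong | 1507.07*** | 869 | .054 | .976 | .976 | .054 | 34.74 | 29 | .001 |
| Inattention/Executive Dysfunction | Strict | 1517.01*** | 897 | .053 | .977 | .977 | .054 | 39.58 | 28 | .001 |
| Hyperactivity | Configural | 485.30*** | 130 | .105 | .955 | .946 | .071 | -- | -- | -- |
| Hyperactivity | Weak | 508.50*** | 143 | .102 | .954 | .950 | .071 | 17.04 | 13 | -.001 |
| Hyperactivity | Strong | 510.52*** | 155 | .096 | .955 | .955 | .071 | 16.62 | 12 | .001 |
| Hyperactivity | Strict | 512.44*** | 167 | .091 | .956 | .959 | .072 | 23.86 | 12 | .001 |
| Impulsivity | Configural | 285.40*** | 130 | .069 | .974 | .968 | .055 | -- | -- | -- |
| Impulsivity | Weak | 298.48*** | 143 | .066 | .974 | .971 | .055 | 8.76 | 13 | .000 |
| Impulsivity | Strong | 295.08*** | 155 | .060 | .976 | .976 | .056 | 9.79 | 12 | .003 |
| Impulsivity | Strict | 315.98*** | 167 | .060 | .975 | .977 | .057 | 24.76 | 12 | -.001 |
| Emotional Dysregulation | Configural | 204.23*** | 54 | .106 | .980 | .973 | .051 | -- | -- | -- |
| Emotional Dysregulation | Weak | 219.07*** | 63 | .100 | .979 | .976 | .051 | 8.17 | 9 | -.001 |
| Emotional Dysregulation | Strong | 216.78*** | 71 | .091 | .980 | .980 | .051 | 5.96 | 8 | .001 |
| Emotional Dysregulation | Strict | 210.81*** | 79 | .082 | .982 | .984 | .051 | 7.21 | 8 | .002 |
| Negative Self-Concept | Configural | 88.83*** | 28 | .094 | .990 | .985 | .033 | -- | -- | -- |
| Negative Self-Concept | Weak | 100.51*** | 35 | .087 | .989 | .987 | .033 | 9.16 | 7 | -.001 |
| Negative Self-Concept | Strong | 106.35*** | 41 | .080 | .989 | .989 | .034 | 10.03 | 6 | .000 |
| Negative Self-Concept | Strict | 111.82*** | 47 | .075 | .989 | .990 | .034 | 10.93 | 6 | .000 |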
Note. N = 249 for the 2019 sample; N = 249 for the 2022 sample. RMSEA = Root mean square error of approximation; CFI = Comparative Fit Index; TLI = Tucker-Lewis Index; SRMR = Standardized root mean square residual; ∆CFI = change in CFI. *p < .05, **p < .01, ***p < .001. Exploration of partial invariance models for the Inattention/Executive Dysfunction scale revealed that one intercept had to be released for the strict invariance hypothesis to hold.
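As an illustration of how the ΔCFI column can be read, the sketch below recomputes ΔCFI for the Self-Report Hyperactivity scale from the CFI values in Table K.2, taking each model's CFI minus that of the preceding, less constrained model. The cutoff of .01 is an assumption based on a commonly used rule of thumb; the criteria actually applied in this study are described in appendix M.

```python
# CFI values for the Self-Report Hyperactivity scale, copied from Table K.2.
hyperactivity_cfi = {"Configural": 0.955, "Weak": 0.954, "Strong": 0.955, "Strict": 0.956}

# Assumed rule of thumb, not taken from this manual.
ASSUMED_DELTA_CFI_CUTOFF = 0.01


def delta_cfi(cfi_by_model):
    """Change in CFI for each model relative to the preceding nested model."""
    names = list(cfi_by_model)
    return {
        names[i]: round(cfi_by_model[names[i]] - cfi_by_model[names[i - 1]], 3)
        for i in range(1, len(names))
    }


deltas = delta_cfi(hyperactivity_cfi)
within_cutoff = all(abs(change) <= ASSUMED_DELTA_CFI_CUTOFF for change in deltas.values())

print(deltas)         # {'Weak': -0.001, 'Strong': 0.001, 'Strict': 0.001}
print(within_cutoff)  # True
```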

Table K.3. Measurement Invariance (2019 vs. 2022): CAARS 2 Observer

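| Scale | Invariance Model | χ2 | df | RMSEA | CFI | TLI | SRMR | Satorra-Bentler χ2 | Satorra-Bentler df | ΔCFI |
|---|---|---|---|---|---|---|---|---|---|---|
| Inattention/Executive Dysfunction | Configural | 1275.40*** | 810 | .050 | .984 | .983 | .050 | -- | -- | -- |
| Inattention/Executive Dysfunction | Weak | 1312.35*** | 840 | .049 | .984 | .984 | .050 | 46.70 | 30 | .000 |
| Inattention/Executive Dysfunction | Strong | 1323.18*** | 869 | .047 | .985 | .985 | .050 | 24.92 | 29 | .001 |
| Inattention/Executive Dysfunction | Strict | 1340.83*** | 898 | .046 | .985 | .986 | .050 | 39.01 | 29 | .000 |
| Hyperactivity | Configural | 510.52*** | 130 | .112 | .959 | .951 | .087 | -- | -- | -- |
| Hyperactivity | Weak | 537.87*** | 143 | .109 | .958 | .954 | .087 | 20.05 | 13 | -.001 |
| Hyperactivity | Strong | 542.49*** | 155 | .104 | .959 | .958 | .087 | 13.41 | 12 | .001 |
| Hyperactivity | Strict | 509.77*** | 167 | .094 | .963 | .966 | .088 | 6.93 | 12 | .005 |
| Impulsivity | Configural | 249.98*** | 130 | .063 | .984 | .981 | .049 | -- | -- | -- |
| Impulsivity | Weak | 269.72*** | 143 | .062 | .984 | .982 | .049 | 19.87 | 13 | .000 |
| Impulsivity | Strong | 281.34*** | 155 | .059 | .984 | .984 | .049 | 16.55 | 12 | .000 |
| Impulsivity | Strict | 271.00*** | 167 | .052 | .987 | .987 | .049 | 7.36 | 12 | .003 |
| Emotional Dysregulation | Configural | 247.54*** | 54 | .124 | .980 | .973 | .048 | -- | -- | -- |
| Emotional Dysregulation | Weak | 268.58*** | 62 | .120 | .978 | .975 | .048 | 17.79 | 8 | .000 |
| Emotional Dysregulation | Strong | 263.43*** | 70 | .109 | .980 | .979 | .049 | 8.00 | 8 | .002 |
| Emotional Dysregulation | Strict | 250.82*** | 78 | .098 | .982 | .983 | .049 | 9.28 | 8 | .002 |
| Negative Self-Concept | Configural | 57.49*** | 28 | .067 | .992 | .988 | .038 | -- | -- | -- |
| Negative Self-Concept | Weak | 67.09*** | 35 | .063 | .991 | .989 | .038 | 8.06 | 7 | -.001 |
| Negative Self-Concept | Strong | 61.33*** | 41 | .046 | .994 | .994 | .038 | 2.17 | 6 | .003 |
| Negative Self-Concept | Strict | 66.00*** | 47 | .042 | .995 | .995 | .040 | 7.21 | 6 | .001 |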
Note. N = 234 for the 2022 sample; N = 234 for the 2019 sample. RMSEA = Root mean square error of approximation; CFI = Comparative Fit Index; TLI = Tucker-Lewis Index; SRMR = Standardized root mean square residual; ∆CFI = change in CFI. *p < .05, **p < .01, ***p < .001. Exploration of partial invariance models for the Emotional Dysregulation Scale revealed that one threshold had to be released for the weak invariance hypothesis to hold.

Table K.4. Differential Test Functioning Effect Sizes (2019 vs. 2022)

| Scale | Self-Report | Observer |
|---|---|---|
| Inattention/Executive Dysfunction | .01 | .01 |
| Hyperactivity | .00 | .00 |
| Impulsivity | .08 | .00 |
| Emotional Dysregulation | .00 | .00 |
| Negative Self-Concept | .00 | .00 |
Note. Values are expected test score standardized differences (ETSSD). Guidelines for interpreting |ETSSD|: small effect size ≥ 0.20; medium effect size ≥ 0.50; large effect size ≥ 0.80. Positive ETSSD values indicate that, among individuals with equal levels of the construct being measured, those in the 2022 sample scored higher than those in the 2019 sample.

Mean scale scores for the 2019 and 2022 samples were compared, and results are presented in Tables K.5 and K.6. No statistically significant differences were observed between Content Scale or DSM Symptom Scale scores for the 2019 and 2022 samples (all p > .01), and all effect sizes were small or negligible (median d = 0.12 for both the Self-Report and Observer). Thus, mean observed scale scores did not appear to differ across the 2019 and 2022 samples.


Table K.5. Mean Differences (2019 vs. 2022): CAARS 2 Self-Report

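| Scale Type | Scale | 2019 Sample M | 2019 Sample SD | 2022 Sample M | 2022 Sample SD | Cohen's d | t (496) | p |
|---|---|---|---|---|---|---|---|---|
| Content Scales | Inattention/Executive Dysfunction | 49.5 | 9.2 | 50.6 | 11.1 | 0.11 | 1.21 | .226 |
| Content Scales | Hyperactivity | 50.1 | 10.1 | 51.4 | 11.8 | 0.12 | 1.30 | .195 |
| Content Scales | Impulsivity | 49.3 | 9.8 | 51.7 | 12.2 | 0.22 | 2.40 | .017 |
| Content Scales | Emotional Dysregulation | 49.7 | 10.2 | 51.0 | 11.1 | 0.13 | 1.43 | .154 |
| Content Scales | Negative Self-Concept | 50.2 | 10.2 | 50.4 | 10.0 | 0.02 | 0.23 | .815 |
| DSM Symptom Scales | ADHD Inattentive Symptoms | 49.6 | 9.4 | 50.5 | 11.0 | 0.08 | 0.95 | .345 |
| DSM Symptom Scales | ADHD Hyperactive/Impulsive Symptoms | 49.9 | 9.6 | 51.2 | 12.0 | 0.12 | 1.37 | .172 |
| DSM Symptom Scales | Total ADHD Symptoms | 49.7 | 9.5 | 50.9 | 11.8 | 0.11 | 1.18 | .239 |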
Note. N = 249 for the 2019 sample, and 249 for the 2022 sample. Guidelines for interpreting Cohen's |d|: negligible effect size < 0.20; small effect size = 0.20 to 0.49; medium effect size = 0.50 to 0.79; large effect size ≥ 0.80. A positive Cohen's d value indicates higher scores for the 2022 sample vs. the 2019 sample.
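The effect sizes and test statistics in Table K.5 can be approximately reproduced from the reported means and standard deviations. The sketch below does so for the Self-Report Impulsivity scale, assuming a pooled-variance independent-samples t-test (consistent with the reported 496 degrees of freedom). Because the inputs are rounded to one decimal place, the recomputed values may differ slightly from the tabled ones.

```python
import math


def cohens_d_and_t(mean_1, sd_1, n_1, mean_2, sd_2, n_2):
    """Cohen's d and pooled-variance t statistic from summary statistics.

    A positive result means group 2 scored higher than group 1.
    """
    df = n_1 + n_2 - 2
    pooled_sd = math.sqrt(((n_1 - 1) * sd_1**2 + (n_2 - 1) * sd_2**2) / df)
    d = (mean_2 - mean_1) / pooled_sd
    t = (mean_2 - mean_1) / (pooled_sd * math.sqrt(1 / n_1 + 1 / n_2))
    return d, t, df


# Self-Report Impulsivity, values from Table K.5 (2019 sample first, then 2022 sample).
d, t, df = cohens_d_and_t(49.3, 9.8, 249, 51.7, 12.2, 249)
print(round(d, 2), df)  # 0.22 496
print(round(t, 2))      # close to the tabled 2.40; differs slightly due to rounded inputs
```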

CAARS™ 2–ADHD Index

The CAARS 2–ADHD Index was examined to confirm that it performed similarly across the 2019 and 2022 samples (see chapter 12, CAARS 2–ADHD Index, for more information on the development, scores, and psychometric properties of the CAARS 2–ADHD Index). The probability scores for each sample were compared using the Wilcoxon Rank Sum Test (note that this non-parametric approach was favored, as the probability score does not meet assumptions of normality; Wilcoxon, 1945); an effect size, r (Rosenthal, 1991), is also provided, which can be interpreted using the correlation guidelines provided in Test-Retest Reliability in chapter 8, Reliability.

There was no statistically significant difference in the probability scores between the 2019 and 2022 samples for either the Self-Report (V = 32,864; p = .221; r = -0.05) or the Observer (V = 27,438; p = .966; r = -0.002), and the magnitude of the effects was very weak. Thus, it appears the CAARS 2–ADHD Index operated similarly in both the 2019 and 2022 samples.
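The effect size r for a rank-based test is often computed as the standardized test statistic divided by the square root of the total sample size. The sketch below checks the magnitude of the reported r values under that assumption, recovering |Z| from the reported two-sided p-values via a normal approximation. Both the formula and the normal approximation are assumptions made for illustration; the sign of r is not recoverable from a p-value, so only magnitudes are compared.

```python
import math
from statistics import NormalDist


def effect_size_r_from_p(p_two_sided, n_total):
    """Magnitude of r = |Z| / sqrt(N), with |Z| recovered from a two-sided p-value."""
    z = NormalDist().inv_cdf(1 - p_two_sided / 2)
    return z / math.sqrt(n_total)


# Reported p-values; N is the combined size of the 2019 and 2022 samples.
r_self_report = effect_size_r_from_p(0.221, 249 + 249)
r_observer = effect_size_r_from_p(0.966, 234 + 234)

print(round(r_self_report, 2))  # 0.05
print(round(r_observer, 3))     # 0.002
```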

Associated Clinical Concern Items and Impairment & Functional Outcome Items

The Associated Clinical Concern Items and Impairment & Functional Outcome Items of the CAARS 2 were examined in both the 2019 and 2022 samples. For each item, the proportion of individuals endorsing an item or providing an elevated item response in each sample was determined (see the Associated Clinical Concerns: Item Selection and Scoring section in chapter 6, Development for more information on how endorsed and elevated responses were determined for items in these scales). A chi-square independence test was performed to see whether these proportions were statistically significantly different in the 2019 and 2022 samples.

Results of the item-level analyses are presented in Table K.7. Some statistically significant differences were found for the Associated Clinical Concern Items (for Observer) and the Impairment & Functional Outcome Items (for both Self-Report and Observer). For these significant differences, proportions were higher in the 2022 sample than in the 2019 sample. Given that the Content Scale scores appear to be unchanged during this same period, as described earlier in this appendix, but significant differences were noted for some of the item-level content, it is possible that the pandemic and its associated struggles are responsible for the increase in frequency and/or severity of these impairment items, as opposed to changes in the nature of ADHD symptoms and features themselves. Thus, it would be prudent to monitor the changes observed here over the coming years.


Table K.7. Proportion of Associated Clinical Concern and Impairment & Functional Outcome Items Endorsed/Elevated for the 2019 and 2022 Samples

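| Item Set | Item Stem | Self-Report 2019 Sample % | Self-Report 2022 Sample % | Self-Report χ2 | Self-Report p | Observer 2019 Sample % | Observer 2022 Sample % | Observer χ2 | Observer p |
|---|---|---|---|---|---|---|---|---|---|
| Associated Clinical Concern Items | Anxiety/worry | 32.9 | 29.7 | 0.60 | .440 | 9.4 | 20.1 | 10.62 | .001 |
| Associated Clinical Concern Items | Sadness/emptiness* | 25.3 | 23.7 | 0.17 | .677 | 8.1 | 12.4 | 2.32 | .128 |
| Associated Clinical Concern Items | Suicidal thoughts/attempts | 35.3 | 32.5 | 0.44 | .508 | 12.4 | 18.4 | 3.22 | .073 |
| Associated Clinical Concern Items | Self-Injury | 15.7 | 22.1 | 3.36 | .067 | 5.6 | 15.0 | 11.24 | .001 |
| Impairment & Functional Outcome Items | Bothered by things endorsed on the CAARS 2 | 14.9 | 21.3 | 3.47 | .062 | 9.0 | 17.9 | 8.09 | .004 |
| Impairment & Functional Outcome Items | Things endorsed on the CAARS 2 interfere with life | 18.5 | 16.2 | 0.45 | .503 | 13.2 | 18.4 | 2.31 | .128 |
| Impairment & Functional Outcome Items | Problems in romantic/marital relationship(s) | 27.3 | 31.7 | 1.17 | .280 | 27.4 | 27.4 | 0.00 | 1.000 |
| Impairment & Functional Outcome Items | Problems in relationships with family members | 18.5 | 24.3 | 2.50 | .114 | 16.2 | 21.8 | 2.34 | .126 |
| Impairment & Functional Outcome Items | Problems in relationships with friends, coworkers, or neighbors | 14.5 | 20.6 | 3.21 | .073 | 10.3 | 18.4 | 6.29 | .012 |
| Impairment & Functional Outcome Items | Problems at work and/or school | 19.3 | 29.3 | 6.82 | .009 | 18.0 | 21.4 | 0.82 | .364 |
| Impairment & Functional Outcome Items | Finds things harder than other people | 12.0 | 16.5 | 1.99 | .159 | 6.4 | 11.5 | 3.77 | .052 |
| Impairment & Functional Outcome Items | Underachiever | 14.9 | 17.4 | 0.00 | .000 | 11.1 | 12.8 | 0.32 | .569 |
| Impairment & Functional Outcome Items | Sleep problems | 19.7 | 20.5 | 0.05 | .823 | 23.9 | 27.4 | 0.72 | .397 |
| Impairment & Functional Outcome Items | Problems managing money | 15.3 | 22.9 | 4.70 | .030 | 17.1 | 24.4 | 3.76 | .053 |
| Impairment & Functional Outcome Items | Neglects family/household responsibilities | 12.9 | 13.7 | 0.08 | .778 | 10.3 | 16.7 | 4.13 | .042 |
| Impairment & Functional Outcome Items | Risky driving | 14.9 | 25.3 | 8.46 | .004 | 10.7 | 17.5 | 4.52 | .034 |
| Impairment & Functional Outcome Items | Problems due to time spent online | 14.5 | 20.1 | 2.75 | .097 | 17.1 | 22.6 | 2.27 | .132 |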
Note. * The item stem for this Screening item is Sadness/emptiness for Self-Report and Sadness for Observer.
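As an illustration of the chi-square independence test applied to each item, the sketch below recomputes the Observer result for the Anxiety/worry item in Table K.7. The endorsement counts (22 and 47 out of 234) are inferred from the reported percentages and are therefore an assumption. The sketch uses the Pearson statistic without a continuity correction, which is also an assumption, although it reproduces the tabled value for this item.

```python
import math


def chi_square_two_proportions(count_a, n_a, count_b, n_b):
    """Pearson chi-square (df = 1, no continuity correction) for a 2x2 table."""
    observed = [[count_a, n_a - count_a], [count_b, n_b - count_b]]
    row_totals = [n_a, n_b]
    col_totals = [count_a + count_b, (n_a - count_a) + (n_b - count_b)]
    total = n_a + n_b
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / total
            chi2 += (observed[i][j] - expected) ** 2 / expected
    # For df = 1 the upper-tail probability has a closed form via erfc.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Observer Anxiety/worry: 9.4% (2019) vs. 20.1% (2022) of 234; counts inferred from percentages.
chi2, p_value = chi_square_two_proportions(22, 234, 47, 234)
print(round(chi2, 2))     # 10.62
print(round(p_value, 3))  # 0.001
```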

Summary

The COVID-19 pandemic and its associated public health response led to worse mental health outcomes for individuals (NCHS, 2022). To determine whether the CAARS 2 structure and its normative scores had changed, a study was conducted to compare responses collected in 2022 to demographically matched subsets of the CAARS 2 Normative Samples. Overall, the CAARS 2 had comparable reliability in 2022 and 2019. Further, the validity of the CAARS 2 was not compromised, as evidenced by the invariance of its Content Scales, a lack of statistical and practical differences in observed mean scale scores, and a lack of difference in probability scores associated with the CAARS 2–ADHD Index. Although there were some differences for select items in the Associated Clinical Concern Items and Impairment & Functional Outcome Items, the fact that the Content Scales did not significantly change indicates that these impairments may have been influenced by the pandemic itself. These elevations in the Associated Clinical Concern Items and Impairment & Functional Outcome Items may return to their 2019 levels as we proceed past the pandemic, or they may remain elevated if impacts persist.
