Conners 4 Manual

Chapter 4: Step-by-Step Interpretation Guidelines

Step 1: Examine the Response Style Analysis.
Step 2: Examine responses to Critical & Indicator Items.
Step 3: Interpret scale scores.
Step 4: Consider item-level responses of the Conners 4 Scales and the Additional Questions.
Step 5: Integrate results across multiple raters, with other sources of information, and monitor change over time.

A sequential approach to the interpretation of Conners 4 scores is provided next, followed by information for comparing results across different raters and across time (if applicable). Although these interpretation steps are specific to the full-length Conners 4, relevant steps can be applied to the Conners 4–Short and Conners 4–ADHD Index. For examples of the interpretation process, see chapter 5, Case Studies.

The following procedure is a systematic step-by-step strategy that can be used when interpreting scores obtained from one rater or multiple raters who provided observations about a single youth. This step-by-step sequence is not meant to be the only acceptable method for interpretation but is intended as a general guide for assessors as they go through the results of the Conners 4. To aid the interpretation process, the digital reports follow the same sequence. Whenever possible, the appropriate section of the report is highlighted, and a sample image of that section is provided for illustration.

Before going through the step-by-step sequence, examine the Overview section of the Conners 4 Single-Rater report (see Figure 4.1). This section is a one-page snapshot of the Conners 4 results and provides a good visual summary of the areas that are flagged (i.e., may be of critical concern or require immediate and further evaluation).

Figure 4.1. Conners 4 Single-Rater Report: Overview

Step 1: Examine the Response Style Analysis.

The first step in interpretation is to consider whether the rater provided usable data or if they provided inconsistent or misleading responses. The Conners 4 results will only be accurate if the information provided by the rater is a reasonable reflection of the youth who was rated. The Response Style Analysis section of the report (see Figure 4.2) provides results from the two validity indices—Negative Impression Index and Inconsistency Index—as well as a review of the number of omitted items and the average number of items completed per minute (Pace). As seen in Figure 4.2, the report will indicate if there is a possibility that the validity of the responses may be compromised, both with a symbol and with text.

Figure 4.2. Conners 4 Single-Rater Report: Response Style Analysis

The Negative Impression Index includes items that describe improbable symptoms or unlikely presentations of problems or behaviors (e.g., “I have no control over my behavior”) or provide an overly negative description (e.g., “Nothing makes me happy”). A rater’s high endorsement of the Negative Impression Index items (see appendix C for a list of these items) may indicate an attempt to provide a more severe or exaggerated presentation of problems or a less favorable impression of the youth (for the Parent or Teacher form) or of themself (for the Self-Report form). A high score on the Negative Impression Index does not provide any insight into reasons or motivation for creating a more negative impression than the clinical picture may warrant, but such scores can highlight cases when responses may reflect noncredible responding. Potential reasons for the Negative Impression Index to be indicated for any rater (including parents, teachers, or the youth) include: (a) the rater being highly motivated to get access to medication for the youth being rated or to help the youth receive special accommodations or services; (b) the rater feeling ignored when they talk about their concerns and feeling the need to exaggerate the youth’s problems in order to be heard; or (c) in the case of the youth self-report, the youth experiencing acute psychological and emotional difficulties and signaling a need for help.

The Negative Impression Index score is compared to the cut-off values presented in Table 4.2 (for information on the development of the Negative Impression Index and its cut-off score, see chapter 6, Development). When the score exceeds the cut-off value for this scale, it does not automatically invalidate the results of the Conners 4. However, it does merit further examination of other information from various sources (e.g., clinical interviews with family and friends, an examination of developmental history, rater’s patterns of responses on other scales) and a discussion with the rater to identify the potential reasons for high endorsement of the items on the Negative Impression Index.

Table 4.2. Interpretation Guidelines for Negative Impression Index

Parent Raw Score	Teacher Raw Score	Self-Report Raw Score	Interpretive Guideline
0–7	0–9	0–8	There was no indication of exaggerated responding.
≥ 8	≥ 10	≥ 9	An unrealistic or exaggerated presentation of the youth’s problems may have been provided. This index includes items for which high endorsement is either unlikely to be true or is extremely uncommon, even for youth with a confirmed diagnosis of ADHD. These items are likely to be endorsed in an attempt to present a less favorable impression of the youth. The score on this index can be elevated due to a number of reasons; for example, the rater may be highly motivated to describe the youth in a negative manner in order for the youth to receive accommodation or services.
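
The cut-off comparison in Table 4.2 can be sketched in a few lines of Python. This is an illustrative sketch only, not the scoring program's implementation: the function name and the form labels (`"parent"`, `"teacher"`, `"self_report"`) are assumptions introduced here, and the only values taken from the manual are the cut-offs in Table 4.2.

```python
# Lowest raw score that triggers the flag for each form (from Table 4.2).
# The dictionary keys are hypothetical labels chosen for this sketch.
NEGATIVE_IMPRESSION_CUTOFF = {"parent": 8, "teacher": 10, "self_report": 9}


def negative_impression_flagged(form: str, raw_score: int) -> bool:
    """Return True when the raw score falls in the flagged band of Table 4.2."""
    if raw_score < 0:
        raise ValueError("raw score cannot be negative")
    return raw_score >= NEGATIVE_IMPRESSION_CUTOFF[form]


print(negative_impression_flagged("parent", 8))   # flagged band for Parent
print(negative_impression_flagged("teacher", 9))  # below the Teacher cut-off
```

A flag from a check like this is a prompt for follow-up with the rater, not a verdict that the ratings are invalid.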

The Inconsistency Index score indicates the level of inconsistency in responses to similar items (see appendix C for a list of the item-pairs). This index includes pairs of items that were selected based on their high correlation with each other in both the general population and clinical sample; thus, rating these pairs differently would be unusual. However, the correlations among the item-pairs are not perfect (see chapter 6, Development, for the correlations). Accordingly, having some inconsistent ratings for the item-pairs is possible. For example, the item, “Actively refuses to do what adults tell them to do” may be rated as 2, “Pretty much true (Often/Quite a bit),” but it is reasonable to imagine its pair—“Actively refuses to follow rules”—could be rated slightly differently. However, too many of these inconsistencies will result in a flag on this scale suggesting the possibility of random responding or not properly attending to item content.

An item-pair counts as having an inconsistent rating when there is more than a 1-point difference between their ratings. For example, an item rated as 3, “Completely true (Very often/Always),” and its pair rated as 2 (a difference of 1 point) would not count towards the Inconsistency Index. However, an item rated as 3 and its pair rated as 1, “Just a little true (Occasionally),” resulting in a difference of 2 points, would count towards the Inconsistency Index. The Inconsistency Index raw score is calculated by summing the differences between ratings for pairs with differences greater than 1 point. This score is compared to the cut-off values presented in Table 4.3 (for information on the development of the Inconsistency Index and its cut-off score, see chapter 6, Development).

Inconsistent responding can occur intentionally or unintentionally, and could be due to deliberate non-compliance, fatigue, a misunderstanding of the items or instructions, rushing, inattention, disinterest, or lack of motivation. Inconsistencies can also occur when the rater interprets the items to have nuanced differences. As with the Negative Impression Index, when the rater provides responses to the item-pairs that result in an elevated Inconsistency Index, it does not necessarily make the Conners 4 results invalid. It is best to review the item-pairs with the rater to explore why such differences in responses may have occurred.

Table 4.3. Interpretation Guidelines for Inconsistency Index

Parent Raw Score	Teacher Raw Score	Self-Report Raw Score	Interpretive Guideline
0–3	0–2	0–4	There was no indication of inconsistent responding.
≥ 4	≥ 3	≥ 5	Responses to similar items showed a high level of inconsistency. This inconsistency may have been due to careless responding or difficulty comprehending some items.
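
The Inconsistency Index scoring rule described earlier (sum the rating differences for item-pairs that differ by more than 1 point) and the Table 4.3 cut-offs can be sketched together as follows. This is a minimal illustration under stated assumptions: the function names and form labels are hypothetical, the item-pairs themselves are listed in appendix C and are not reproduced here, and the sketch is not the scoring program's code.

```python
from typing import Iterable, Tuple

# Lowest raw score that triggers the flag for each form (from Table 4.3).
# The dictionary keys are hypothetical labels chosen for this sketch.
INCONSISTENCY_CUTOFF = {"parent": 4, "teacher": 3, "self_report": 5}


def inconsistency_raw_score(pair_ratings: Iterable[Tuple[int, int]]) -> int:
    """Sum the absolute differences of pairs that differ by more than 1 point."""
    total = 0
    for first, second in pair_ratings:
        difference = abs(first - second)
        if difference > 1:  # a 1-point difference does not count
            total += difference
    return total


def inconsistency_flagged(form: str, raw_score: int) -> bool:
    """Return True when the raw score falls in the flagged band of Table 4.3."""
    return raw_score >= INCONSISTENCY_CUTOFF[form]


# Three hypothetical item-pairs: (3, 2) is ignored, (3, 1) adds 2, (0, 3) adds 3.
score = inconsistency_raw_score([(3, 2), (3, 1), (0, 3)])
print(score)                                   # 5
print(inconsistency_flagged("parent", score))  # True
```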

The number of omitted items should also be examined. For most scales, an interpretation of the scores with omitted items can be obtained using a prorating method (omissions are handled differently for certain scales and scores; for more information on this, see appendix A), as long as the number of omitted items does not exceed the maximum number allowed (see Omitted Responses in chapter 3, Scoring and Reports and appendix A for more information). Although the prorating method provides the best estimate of the individual’s scores in the presence of omitted items, these scores could still be underestimates or overestimates of the individual’s actual level of functioning, and it is not possible to determine the exact extent of the under- or over-estimation. Given the potential impact of omitted items on the scale scores, it is strongly advised that raters complete all items to obtain a more accurate estimate. In order to limit the influence of omitted items, if the number of omissions is greater than the maximum number allowed, scales will not be scored and scores for that scale will not be provided in the report. The report will show a “?” to indicate any scale score that cannot be scored due to too many omitted items.
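
To illustrate the general idea of prorating, the sketch below scales the sum of the answered items up to the full scale length and returns no score when too many items are omitted. This is an assumption made for illustration: the actual prorating method and the maximum number of omissions allowed for each scale are documented in appendix A and may differ. The function name and the `max_omitted` parameter are hypothetical.

```python
from typing import Optional, Sequence


def prorated_raw_score(
    responses: Sequence[Optional[int]], max_omitted: int
) -> Optional[float]:
    """Estimate a scale raw score when some items are omitted (None).

    ASSUMPTION: simple proportional prorating. The manual's own method and
    omission limits are given in appendix A, not reproduced here.
    """
    answered = [rating for rating in responses if rating is not None]
    omitted = len(responses) - len(answered)
    if omitted > max_omitted or not answered:
        return None  # too many omissions: the report would show "?" instead
    return sum(answered) * len(responses) / len(answered)


print(prorated_raw_score([3, 2, None, 1], max_omitted=1))     # 8.0
print(prorated_raw_score([3, None, None, 1], max_omitted=1))  # None
```

Whatever method is used, a prorated value remains an estimate, which is why complete responses are preferred.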

Omissions can occur deliberately (e.g., the rater decides not to respond to the item, the rater intends to respond to it later but forgets to go back to the skipped item) or unintentionally (e.g., the rater missed the item). Unintentional omissions may be less likely for online administrations compared to paper-and-pencil administrations, because with online administrations, the rater is provided with a pop-up prompt if they do not respond to an item. The prompt reminds them that they have not responded to the item and asks them to select a response; if they choose not to, they can move to the next item. Thus, when omissions occur, they should be explored further to determine whether they are accidental or deliberate (i.e., whether they appear to be random or share a common underlying theme), especially when the assessment is completed in paper-and-pencil format. Omitted items can be found in the Items by Scale section of the report.

Pace (available only for online administrations) provides the average number of items completed per minute. This metric identifies whether an unusually fast or unusually slow response rate is indicated. Determinations of what is too fast or too slow versus the typical response rate were based on average item response times during data collection in the Normative and ADHD Reference Samples (see Table 4.4 for the cut-off values and guidelines to interpret scores for Pace; see also chapter 6, Development, for details on the development of the Pace metric).

If the rater completed the assessment too quickly, this fast pace may indicate that they were responding randomly, did not take adequate time to read and understand the items, or did not consider the response options and respond accordingly. In such cases, little confidence can be placed in the rated individual’s scores. On the other hand, if the rater took too long to respond to items, this slow pace may indicate, for example, that the rater had difficulty reading and understanding the items (e.g., verbal administration of the test) or took breaks while completing the test. In either circumstance, assessors should follow up with raters to learn why the rating scale was completed so quickly or so slowly. If responses are entered from a paper-and-pencil administration of the Conners 4, there will be no value calculated for Pace.

Table 4.4. Interpretation Guidelines for Pace

Parent (Items per Minute)	Teacher (Items per Minute)	Self-Report (Items per Minute)	Interpretive Guideline
≥ 17	≥ 20	≥ 16 (ages 8 to 11); ≥ 18 (ages 12+)	This is an unusually fast pace. There could be many reasons for this; for example, the rater may have rushed through the task, or they may not have spent enough time reading the items or thinking about their responses.
1 to 16	1 to 19	1 to 15 (ages 8 to 11); 1 to 17 (ages 12+)	This pace was consistent with expectations for this form.
< 1	< 1	< 1	This is an unusually slow pace. There could be many reasons for this, such as being interrupted, distracted, or having difficulty comprehending the items while completing this form.
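
The bands in Table 4.4 can be expressed as a small classification function. The sketch below is illustrative only: the function names, form labels, and return strings are assumptions introduced here, and fractional values that fall between the tabled bands (e.g., 16.5 for the Parent form) are treated as within expectations, which is a choice made for this sketch rather than a rule stated in the manual.

```python
from typing import Optional


def fast_pace_cutoff(form: str, age: Optional[int] = None) -> int:
    """Return the lowest items-per-minute value classed as fast in Table 4.4."""
    if form == "parent":
        return 17
    if form == "teacher":
        return 20
    if form == "self_report":
        if age is None or age < 8:
            raise ValueError("Self-Report cut-offs are given for ages 8 and up")
        return 16 if age <= 11 else 18
    raise ValueError(f"unknown form: {form!r}")


def classify_pace(
    items_per_minute: float, form: str, age: Optional[int] = None
) -> str:
    """Classify pace as 'slow', 'expected', or 'fast' (hypothetical labels)."""
    if items_per_minute < 1:
        return "slow"
    if items_per_minute >= fast_pace_cutoff(form, age):
        return "fast"
    return "expected"


print(classify_pace(17, "parent"))               # fast
print(classify_pace(16, "self_report", age=10))  # fast
print(classify_pace(17, "self_report", age=12))  # expected
print(classify_pace(0.5, "teacher"))             # slow
```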

All available information should be considered when examining the Response Style Analysis, including scores on the Negative Impression Index and Inconsistency Index, the number of omitted items, and the pace (if applicable). The presence of a flagged metric in the Response Style Analysis is a very strong indicator that the responses are atypical and potentially invalid, but the absence of a flagged metric does not guarantee that the responses are valid. It should be noted that in clinical settings, it is best practice to use independent validity assessment tools to provide a more thorough evaluation of validity of responding. Such information should also be integrated with the information provided in the Response Style Analysis. The validity of ratings is not an absolute yes or no decision. In this first step of the interpretation sequence, consider all possible concerns and use this information to guide your interpretation of the results.

Step 2: Examine responses to Critical & Indicator Items.

The Conners 4 has two sets of critical items—Severe Conduct and Self-Harm (see chapter 6, Development, for a discussion of how these items were selected). Severe Conduct Critical Items represent severe misconduct and behaviors that describe past violence, destructive behaviors, or harm to others (e.g., “Has intentionally set fires for the purpose of causing damage” and “Is cruel to animals”). Self-Harm Critical Items consist of two items on the Conners 4 Parent and Teacher forms, and three items on the Conners 4 Self-Report form that ask about past self-injurious thoughts or behavior (e.g., “I have planned or tried to hurt myself”). When a critical item is endorsed, meaning the response to any of the critical items is anything other than 0, “Not true at all (Never/Rarely),” it is strongly recommended that assessors investigate immediately. Additional information should be gathered through interviews with family and close friends, as well as the youth themself. Asking the youth directly may inform if the potential for self-harm or severe conduct is present currently or only historical in nature (see chapter 9, Validity, for details on endorsement differences as a function of raters). It would also be appropriate to consider administering other measures that go beyond the screening items to explore these areas further when they have been flagged on the Conners 4.
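
The endorsement rule for critical items (any response other than 0 counts as endorsed) can be sketched as a simple filter. The function name is hypothetical, and skipping omitted items (represented as `None`) is an assumption of this sketch, not a rule taken from the manual.

```python
from typing import Dict, List, Optional


def endorsed_critical_items(responses: Dict[str, Optional[int]]) -> List[str]:
    """List critical items whose rating is anything other than 0.

    Omitted items (None) are skipped; that handling is an assumption.
    """
    return [
        item
        for item, rating in responses.items()
        if rating is not None and rating != 0
    ]


ratings = {
    "Is cruel to animals": 0,
    "Has intentionally set fires for the purpose of causing damage": 1,
}
print(endorsed_critical_items(ratings))
# ['Has intentionally set fires for the purpose of causing damage']
```

Because even a rating of 1 counts as an endorsement, every item returned by such a check warrants immediate follow-up.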

The Sleep Problems Indicator consists of items that reflect behaviors that may suggest problems or difficulties with sleep (e.g., “Has trouble falling or staying asleep”). There are two screening items on the Conners 4 Parent and Self-Report forms, and one item on the Conners 4 Teacher form. If the level of endorsement exceeds the cut-offs (see Critical & Indicator Items: Item Selection & Scoring in chapter 6, Development, for more details about the determination of cut-offs), it suggests that the reported potential sleep issues are higher than typical for youth of the same age (and gender, if Gender Specific Normative samples are selected). When the Sleep Problems Indicator is flagged, it does not automatically mean that the youth is experiencing significant sleep problems. It means that a more in-depth assessment of sleep difficulties is recommended to examine the relationship between the sleep problems and other problems reported in the Conners 4.

Follow-up can be done using the PROMIS Sleep Disturbance–Short Form 8a and/or the PROMIS Sleep-Related Impairment–Short Form 8a (which are available for free to Conners 4 users on the MHS Online Assessment Center+). The full test description and scoring manual can be found here: www.healthmeasures.net. Note: For the English version of the Parent-Proxy, items have been modified from the source item. Specifically, gender-specific pronouns have been replaced with gender-inclusive pronouns.

Sleep Disturbance–Short Form 8a: The PROMIS Sleep Disturbance instruments assess self-reported perceptions of sleep quality, sleep depth, and restoration associated with sleep. This includes perceived difficulties and concerns with getting to sleep or staying asleep, as well as perceptions of the adequacy of and satisfaction with sleep. The Sleep Disturbance–Short Form does not focus on symptoms of specific sleep disorders, nor does it provide subjective estimates of sleep quantities (total amount of sleep, time to fall asleep, amount of wakefulness during sleep). The Sleep Disturbance–Short Form is universal rather than disease specific. It assesses sleep disturbance over the past seven days.
Sleep-Related Impairment–Short Form 8a: The PROMIS Sleep-Related Impairment items focus on self-reported perceptions of alertness, sleepiness, and tiredness during usual waking hours, and the perceived functional impairments during wakefulness associated with sleep problems or impaired alertness. Though the Sleep-Related Impairment–Short Form does not directly assess cognitive, affective, or performance impairment, it does measure waking alertness, sleepiness, and function within the context of overall sleep-wake function. The Sleep-Related Impairment–Short Form is universal rather than disease-specific. It assesses sleep-related impairment over the past seven days.

Critical & Indicator Items that warrant immediate or recommended follow-up are flagged in the reports (see appendix C for a list of these items; see Figure 4.3 for a sample of this section).

Figure 4.3. Conners 4 Single-Rater Report: Critical & Indicator Items

Step 3: Interpret scale scores.

The third step is divided into two stages. Step 3a involves examining the scores from the Content Scales, Impairment & Functional Outcome Scales, DSM Symptom Scales, and the Conners 4–ADHD Index to determine if there are any scale-level elevations. It also includes an examination of the Within-Profile Comparisons among the Content Scales directly associated with ADHD (i.e., Inattention/Executive Dysfunction, Hyperactivity, Impulsivity, and Emotional Dysregulation) and the Impairment & Functional Outcome Scales (i.e., Schoolwork, Peer Interactions, and Family Life on the Parent and Self-Report forms; Schoolwork and Peer Interactions on the Teacher form). Step 3b involves comparing how the scores from the Content Scales, Impairment & Functional Outcome Scales, DSM Symptom Scales, and the Conners 4–ADHD Index relate to each other. To review the item content of each scale, please see appendix C. Figure 4.4 provides a sample of the Conners 4 Scales section of the report that provides the various scale scores.

Figure 4.4. Conners 4 Single-Rater Report: Conners 4 Scales

Step 3a: Examine Conners 4 Scale Scores.

First, the scores from the six Conners 4 Content Scales are examined. These scales include content that is representative of constructs that are directly or indirectly associated with ADHD in youth. Table 4.5 provides a brief description of each content scale. High scores on the Content Scales are indicative of problems in the specified area.

In addition to examining each scale individually, Within-Profile Comparisons for the Conners 4 Content Scales directly related to ADHD (i.e., Inattention/Executive Dysfunction, Hyperactivity, Impulsivity, and Emotional Dysregulation) can also be investigated. This analysis compares each scale’s T-score to the youth’s average T-score across these four scales. The Within-Profile Comparisons identify which scores, if any, are significantly higher than the rated youth’s average score across these scales, as well as which scores, if any, are significantly lower than the rated youth’s average score. Identifying these differences is useful for understanding the best areas to focus on first during intervention or treatment, especially if the youth’s profile features multiple scale elevations. It is important to note that if any of these four Content Scales could not be scored due to omitted responses, the Within-Profile Comparison is not provided. The scoring program includes the Within-Profile Comparisons (this option is turned on by default), although the user can choose to exclude this analysis in the report. See chapter 3, Scoring and Reports, for details about report options.
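
The arithmetic behind a Within-Profile Comparison can be sketched as follows: each scale's T-score is compared with the mean T-score of the profile, and no comparison is produced if any scale is unscored. The function name is hypothetical, and the sketch returns only the deviations from the mean; the criterion the scoring program uses to decide whether a deviation is statistically significant is not described in this section and is therefore not modeled.

```python
from typing import Dict, Optional


def within_profile_deviations(
    t_scores: Dict[str, Optional[float]],
) -> Optional[Dict[str, float]]:
    """Return each scale's T-score minus the mean T-score of the profile.

    Returns None when any scale is unscored (None), mirroring the rule that
    the comparison is not provided if a scale could not be scored.
    """
    if not t_scores or any(score is None for score in t_scores.values()):
        return None
    mean_t = sum(t_scores.values()) / len(t_scores)
    return {scale: score - mean_t for scale, score in t_scores.items()}


profile = {
    "Inattention/Executive Dysfunction": 70,
    "Hyperactivity": 60,
    "Impulsivity": 60,
    "Emotional Dysregulation": 50,
}
print(within_profile_deviations(profile))
# {'Inattention/Executive Dysfunction': 10.0, 'Hyperactivity': 0.0,
#  'Impulsivity': 0.0, 'Emotional Dysregulation': -10.0}
```

The same calculation applies to any set of scales passed in, so long as every scale in the set has a score.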

Table 4.5. Conners 4 Content Scale Descriptions

Scale	Description	Common Problems Reported by High Scorers
Inattention/Executive Dysfunction	Items on this scale relate to issues the youth may have with focusing, sustaining, and shifting attention, as well as self-management.	May report trouble with paying attention and being easily distracted, as well as difficulty with other areas of executive function such as planning, organizing, and time management.
Hyperactivity	Items on this scale reflect the youth’s level of motor or verbal activity and restlessness.	May report difficulty with staying still or sitting still for long periods of time, needing to move around, getting overly excited, and/or talking when they should be quiet.
Impulsivity	Items on this scale reflect difficulties a youth may have with response inhibition.	May report problems with inhibition, both verbal (e.g., talking out of turn) and behavioral (e.g., acting without thinking).
Emotional Dysregulation	Items on this scale reflect the youth’s experience of, or difficulty with, regulating or managing emotions (can include emotional impulsivity, anger management, and over-reacting).	May report trouble calming down when upset and quick and drastic mood changes.
Depressed Mood	Items on this scale assess features of depression.	May report feeling helpless, hopeless, and worthless, as well as reporting tiredness and decreased enjoyment of favorite activities.
Anxious Thoughts	Items on this scale reflect the youth’s experience of, and difficulty with, regulating fears and worries.	May report appearing (or feeling) tense or nervous and worrying too much about different things.

Next, the Impairment & Functional Outcome Scales are considered. These scales consist of impairments reported in key functional domains, namely Schoolwork, Peer Interactions, and Family Life. Evaluating the results of these scales is a critical part of the interpretation process because responses to these items indicate the level of impairment experienced by the youth in the different areas of their day-to-day functioning. It is also linked to diagnosis and identification. The Diagnostic and Statistical Manual of Mental Disorders¹ (DSM) requires evidence of clinically significant impairment in social, academic, or occupational functioning, as well as the presence of impairment in at least two settings (e.g., home, school) for a diagnosis of ADHD. In addition, the Individuals with Disabilities Education Improvement Act (IDEA 2004) requires evidence that the reported problems adversely impact the youth’s functioning for determination of eligibility. Refer to Table 4.6 for the descriptions of each scale and the common concerns reported by high scorers. Following up with a more comprehensive measure of impairment, such as the Weiss Functional Impairment Rating Scales™ (WFIRS™; Weiss et al., 2018) or the Rating Scales of Impairment™ (RSI™; Goldstein & Naglieri, 2016), will allow for a more in-depth assessment of the youth’s level of impairment.

Within-Profile Comparisons are also provided for the Impairment & Functional Outcome Scales; these compare each scale’s T-score to the youth’s average across these scales. The Within-Profile Comparisons identify which scores, if any, are significantly higher than the rated youth’s average score across these scales, as well as which scores, if any, are significantly lower than the rated youth’s average score. Identifying these differences is useful for understanding the best areas to focus on first during intervention or treatment, especially if the youth’s profile features multiple scale elevations. If any of the Impairment & Functional Outcome Scales could not be scored due to omitted responses, the Within-Profile Comparison is not provided. As described earlier, the scoring program includes the Within-Profile Comparisons by default, although the user can choose to exclude this analysis in the report. See chapter 3, Scoring and Reports, for details about report options.

Table 4.6. Conners 4 Impairment & Functional Outcome Scale Descriptions

Scale	Description	Common Problems Reported by High Scorers
Schoolwork	Items on this scale reflect typical problems or difficulties that youth with ADHD experience in their schoolwork.	May report turning in late or incomplete work, losing homework, and not checking work for mistakes.
Peer Interactions	Items on this scale reflect typical problems that youth with ADHD experience when interacting with peers.	May report being perceived as annoying by peers, not being invited by others to play or go out, and others not wanting to be friends with them.
Family Life	Items on this scale reflect typical problems or difficulties that youth with ADHD experience or contribute to in family interactions.	May report creating stress and chaos among family members, as well as causing family to be late for appointments.

Next, the DSM Symptom Scales are examined. The Conners 4 DSM Symptom Scales include the DSM ADHD Inattentive Symptoms, DSM ADHD Hyperactive/Impulsive Symptoms, DSM Total ADHD Symptoms², DSM Oppositional Defiant Disorder Symptoms, and DSM Conduct Disorder Symptoms. These scales are rationally derived and map onto symptom criteria from the DSM (refer to Table 4.7 for the descriptions of each scale and the common concerns reported by high scorers). However, these scales do not include the full diagnostic criteria. They focus on DSM Diagnostic Criterion A only, and therefore the evaluation of additional criteria (e.g., course, differential diagnosis, level of impairment, pervasiveness) must be completed before a DSM diagnosis can be assigned. Furthermore, items on the Conners 4 are approximations of the DSM symptoms; they are intended to represent the main clinical construct in a format that most individuals can understand. Rewording the professional language of the DSM into more common language is likely to lead to more informative responses because the rater can understand what is being asked. However, because of this rewording, some aspects of the DSM criteria may not be fully represented. It should also be noted that certain aspects of the DSM symptom criteria cannot be completely covered by a rating scale and must be independently determined by the assessor. For full details about the scoring criteria for the Conners 4 DSM Symptom Scales, see appendix D.

Table 4.7. Conners 4 DSM Symptom Scale Descriptions

Scale	Description	Common Problems Reported by High Scorers
ADHD Inattentive Symptoms	Items on this scale reflect each of the DSM Diagnostic Criteria A for DSM ADHD Predominantly Inattentive Presentation.	May report often failing to pay attention to detail, making careless mistakes, having difficulty sustaining attention, being easily distracted, and being forgetful.
ADHD Hyperactive/Impulsive Symptoms	Items on this scale reflect each of the DSM Diagnostic Criteria A for DSM ADHD Predominantly Hyperactive/Impulsive Presentation.	May report often fidgeting, running around or climbing in inappropriate situations, blurting out responses before questions are completed, interrupting, and intruding.
Total ADHD Symptoms	Items on this scale are the combination of all items from DSM ADHD Inattentive and DSM ADHD Hyperactive/Impulsive symptom scales. Combining items from these two scales provides a dimensional representation of the ADHD symptoms, irrespective of presentation type.	May report problems that reflect mainly inattentive symptoms, or mainly hyperactive and/or impulsivity symptoms, or both.
Oppositional Defiant Disorder Symptoms	Items on this scale reflect each of the DSM Diagnostic Criteria A for DSM Oppositional Defiant Disorder.	May report often having an angry or irritable mood, often engaging in defiant behavior, and being vindictive.
Conduct Disorder Symptoms	Items on this scale reflect each of the DSM Diagnostic Criteria A for DSM Conduct Disorder.	May report engaging in aggression towards others, destruction of property, stealing, and engaging in serious rule violations.

The Conners 4 DSM Symptom Scales are reported with both standardized scores and Symptom Counts. It should be noted that because the DSM Total ADHD Symptoms combines the items from both the DSM ADHD Inattentive and DSM ADHD Hyperactive/Impulsive Symptoms scales to reflect a dimensional assessment of the ADHD symptoms, there is no Symptom Count for this scale.

Standardized scores, such as T-scores, and Symptom Counts provide two different perspectives—relative and absolute, respectively. The relative perspective refers to comparing one individual to a group of individuals (e.g., how the person is doing in comparison to others of the same age [and gender, if Gender Specific reference samples are selected]). The absolute perspective refers to an absolute rule (e.g., based on counts of levels of symptom endorsement, in reference to a criterion).

The DSM Symptom Counts of the Conners 4 reflect an absolute perspective, wherein the presence of a certain number of symptoms from a finite list must be documented. To be considered for a diagnosis of the DSM ADHD Predominantly Inattentive Presentation, or a diagnosis of the DSM ADHD Predominantly Hyperactive/Impulsive Presentation, a symptom count of 6 or greater from the DSM Diagnostic Criterion A must be demonstrated by a youth aged 6 to 16 years, or a symptom count of 5 or greater for individuals aged 17 years or older. For a diagnosis of DSM Oppositional Defiant Disorder, a symptom count of 4 or greater from the DSM Diagnostic Criterion A must be presented, whereas for a diagnosis of DSM Conduct Disorder, a symptom count of 3 or greater is needed. For each of the DSM Symptom Scales (except for the DSM Total ADHD Symptoms scale), each of the symptoms outlined in Criterion A is represented by at least one item. Depending on the specific criterion, an endorsement or specific level of endorsement (e.g., an item response of 2 or 3) will add to the Symptom Count. Some of the DSM symptom criteria have a combination of items to represent the symptom (see appendix D, DSM Symptom Scales Scoring Criteria, for more information). As outlined in appendix D, there are specific scoring notes and interpretive considerations for each of the DSM Symptom Scales. For example, for the DSM Conduct Disorder Symptoms scale, the assessor must ensure that the truancy occurred before the age of 13 years in order for Criterion A15 (truancy) to be indicated and thereby counted towards the Symptom Count. These specific scoring notes and interpretive considerations all need to be considered during evaluation (see appendix D for a full list of considerations). To review the item ratings that led to the Symptom Count, see the Items by Scale section of the report, where the criterion status for each symptom is identified.
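
The Symptom Count thresholds stated above can be collected into a small lookup. This sketch covers only the comparison of a Symptom Count with its threshold; it does not model how item responses are converted into a Symptom Count (see appendix D), and the function names and scale labels are hypothetical.

```python
# Hypothetical labels for the two DSM ADHD Symptom Scales that have counts.
ADHD_SCALES = {"adhd_inattentive", "adhd_hyperactive_impulsive"}


def symptom_count_threshold(scale: str, age: int) -> int:
    """Return the Criterion A symptom threshold stated in the text."""
    if scale in ADHD_SCALES:
        if age < 6:
            raise ValueError("the text gives ADHD thresholds for ages 6 and up")
        return 6 if age <= 16 else 5  # 6+ for ages 6 to 16, 5+ for ages 17+
    if scale == "oppositional_defiant":
        return 4
    if scale == "conduct":
        return 3
    raise ValueError(f"unknown scale: {scale!r}")


def meets_symptom_threshold(scale: str, age: int, symptom_count: int) -> bool:
    """Return True when the Symptom Count reaches the threshold for the scale."""
    return symptom_count >= symptom_count_threshold(scale, age)


print(meets_symptom_threshold("adhd_inattentive", 12, 6))  # True
print(meets_symptom_threshold("adhd_inattentive", 17, 5))  # True
print(meets_symptom_threshold("conduct", 14, 2))           # False
```

Reaching a threshold addresses Criterion A only; the remaining DSM criteria still have to be evaluated by the assessor.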

For the DSM ADHD Scales, T-scores and Symptom Counts are integrated in the Interpretive Summary of the report to provide information about ADHD Combined Presentation and Other Specified ADHD.

DSM ADHD Combined Presentation. The Interpretive Summary section of the report combines information from the Symptom Counts and T-scores for both the DSM ADHD Inattentive Symptoms scale and the DSM ADHD Hyperactive/Impulsive Symptoms scale to determine if follow-up about ADHD Combined Presentation may be recommended. First, it is required that the DSM ADHD Inattentive Symptoms scale and the DSM ADHD Hyperactive/Impulsive Symptoms scale each have a Symptom Count of 6 or greater for youth aged 6 to 16 years, or 5 or greater for individuals aged 17 years and older. This requirement is based on the symptom threshold, as outlined by the DSM. In addition, the DSM requires that reported symptoms be outside of developmental expectations to warrant consideration of a diagnosis of the ADHD Combined Presentation. Thus, in addition to meeting the Symptom Count requirements, the T-scores for both scales must be at least in the Slightly Elevated range (i.e., 60 or higher). If both conditions are met, the report will provide interpretive text indicating that a diagnosis of ADHD Combined Presentation merits further consideration.
DSM Other Specified ADHD. The Interpretive Summary section of the report combines information from the Symptom Counts and T-scores for both the DSM ADHD Inattentive Symptoms scale and the DSM ADHD Hyperactive/Impulsive Symptoms scale, as well as the T-score from the DSM Total ADHD Symptoms scale. When the DSM Total ADHD Symptoms T-score is Slightly Elevated or higher (indicating that the Total ADHD Symptoms score was higher than what is typically reported) in the absence of elevations on both the DSM ADHD Inattentive Symptoms and DSM ADHD Hyperactive/Impulsive Symptoms scales (i.e., the T-scores are both in the Low or Average range and the Symptom Counts are both below the DSM threshold), then the pattern of results is inconclusive. If other sources of information suggest the possibility of ADHD, a designation of Other Specified ADHD may be appropriate.
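The two decision rules above can be expressed as a minimal sketch. The class and function names are hypothetical and are not taken from the report software; the cut-offs (T-score of 60, Symptom Count of 6 or 5 depending on age) are the ones stated in the text.

```python
from dataclasses import dataclass


@dataclass
class ScaleResult:
    """T-score and Symptom Count for one DSM ADHD symptom scale."""
    t_score: int
    symptom_count: int


def adhd_count_threshold(age_years: int) -> int:
    # 6 or greater for ages 6 to 16; 5 or greater for ages 17 and older.
    return 6 if age_years <= 16 else 5


def combined_presentation_merits_follow_up(
    inattentive: ScaleResult, hyperactive: ScaleResult, age_years: int
) -> bool:
    """Both scales must meet the Symptom Count threshold and have T >= 60."""
    cutoff = adhd_count_threshold(age_years)
    return all(
        s.symptom_count >= cutoff and s.t_score >= 60
        for s in (inattentive, hyperactive)
    )


def other_specified_pattern(
    inattentive: ScaleResult, hyperactive: ScaleResult,
    total_t_score: int, age_years: int
) -> bool:
    """Total T >= 60 while both subscales have T < 60 and sub-threshold counts."""
    cutoff = adhd_count_threshold(age_years)
    subscales_not_elevated = all(
        s.t_score < 60 and s.symptom_count < cutoff
        for s in (inattentive, hyperactive)
    )
    return total_t_score >= 60 and subscales_not_elevated


# Hypothetical 10-year-old with elevations on both scales.
print(combined_presentation_merits_follow_up(
    ScaleResult(68, 7), ScaleResult(63, 6), age_years=10))  # True
```

In both cases the result is a prompt for further evaluation and not a diagnosis.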

Because both T-scores and Symptom Counts are provided for the DSM Symptom Scales, the Interpretive Summary of the report integrates both score types to guide next steps for the assessor. For example, for the DSM ADHD Inattentive Symptoms scale, if the T-score is in the Low or Average range (i.e., < 60) and the Symptom Count is 0, then the report will indicate that symptoms of ADHD Predominantly Inattentive Presentation are not present. On the other end of the spectrum, if the T-score is Very Elevated (i.e., ≥ 70) and the DSM threshold for the Symptom Count was met or exceeded (i.e., ≥ 6 for ages 6 to 16 years; ≥ 5 for ages 17 years and older), then the report will indicate that clinically significant symptoms of ADHD Predominantly Inattentive Presentation are present. In these examples, both T-scores and Symptom Counts are aligned; however, at other times, there may be discrepancies between them.

Discrepancies are to be expected, given that the T-score and Symptom Count are based on different metrics (i.e., relative versus absolute, respectively). In contrast to Symptom Counts, the DSM Symptom Scale T-scores take age (and gender, if Gender Specific reference samples are selected) into account.

In addition, the calculation of these scores also differs. T-scores are based on the sum of the ratings on all the scale items. For example, a rating of 1, “Just a little true (Occasionally),” adds 1 point to the raw score. Conversely, Symptom Counts are based only on specific ratings of the items that satisfy the DSM symptom criteria. For example, an endorsement of a 2, “Pretty much true (Often/Quite a bit),” or a 3, “Completely true (Very often/Always),” on an item in the DSM ADHD Hyperactive/Impulsive Symptoms scale would count towards the Symptom Count for this scale. For more information on how these two scores are calculated, see chapter 6, Development.
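The difference between the two calculations can be illustrated with a small sketch. It makes simplifying assumptions that are not from the manual: each symptom is represented by exactly one item, every item counts at a rating of 2 or higher, and the rating lists are invented. The actual scoring rules vary by criterion (see appendix D), and the conversion from raw score to T-score requires the normative tables, so only the raw score is shown.

```python
from typing import Sequence


def raw_score(ratings: Sequence[int]) -> int:
    """Sum of all item ratings (0-3); this sum is what feeds the T-score."""
    return sum(ratings)


def symptom_count(ratings: Sequence[int], min_rating: int = 2) -> int:
    """Number of items rated at or above the level that indicates a symptom."""
    return sum(1 for r in ratings if r >= min_rating)


# Hypothetical ratings for nine items, one per symptom.
mild_but_broad = [1, 1, 1, 1, 1, 1, 1, 1, 1]
severe_but_narrow = [3, 3, 0, 0, 0, 0, 0, 0, 0]

print(raw_score(mild_but_broad), symptom_count(mild_but_broad))        # 9 0
print(raw_score(severe_but_narrow), symptom_count(severe_but_narrow))  # 6 2
```

The first profile has the higher raw score but no counted symptoms, while the second has a lower raw score and two counted symptoms, which shows how the two metrics can diverge.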

Thus, it is possible for T-scores and Symptom Counts to differ. While these differences can occur for all DSM Symptom Scales, this is especially relevant for the DSM Conduct Disorder Symptoms scale (see Why are DSM Conduct Disorder T-scores so Easily Elevated? in this chapter for details).

Why are DSM Conduct Disorder T-scores so Easily Elevated? The DSM Conduct Disorder Symptoms scale includes 15 items (13 items on the Teacher form) that map onto the diagnostic criteria for Conduct Disorder (refer to appendix D). These items assess aggression to people and animals, destruction of property, deceitfulness, theft, and serious violations of rules. Overall, these items are very rarely endorsed in the Normative Samples. Therefore, any endorsement of these items can easily lead to T-score elevations. For example, a T-score of 61 can result if a parent rater endorses any two DSM Conduct Disorder items at a level of 3, “Completely true (Very often/Always).” However, because only two of the symptoms are endorsed, the Symptom Count is below the DSM cut-off score of 3 out of the 15 symptoms (American Psychiatric Association, 2013). Thus, even endorsement of only two symptoms at a level of 3 can lead to this slight elevation in T-scores, resulting in a difference between the T-score and the Symptom Count. It is important, therefore, that the assessor carefully considers the items endorsed by the rater to guide interpretation. As discussed in Step 2: Examine responses to Critical & Indicator Items, endorsement of any of the Severe Conduct Critical Items would always warrant immediate attention. It is important to note that DSM Conduct Disorder does not require that the symptoms exceed developmental expectations, as the majority of symptoms listed are not appropriate at any age and do not have a developmental trajectory. Therefore, while the T-scores can help guide an assessor in terms of how typical a given level of endorsement is relative to peers, it is the absolute Symptom Count that must be examined in determining if further investigation of a possible Conduct Disorder diagnosis is warranted.

Finally, the Conners 4–ADHD Index is examined. The Conners 4–ADHD Index consists of 12 items on each rater form (Parent, Teacher, and Self-Report) that best distinguish individuals in the ADHD Reference Sample from individuals in the general population. The index provides a probability score that ranges from 1% to 99%. The score represents the likelihood that the youth being evaluated has ratings that are either more similar to those of individuals in the general population, more similar to those of youth with ADHD, or fall in the borderline category (see chapter 12, Conners 4–ADHD Index, for more information on the development of the Conners 4–ADHD Index).

Suggested interpretation guidelines are provided in Table 4.8. The higher the probability score, the more likely the youth’s scores correspond to the scores of youth diagnosed with ADHD; the lower the probability score, the more likely the youth’s scores correspond to the scores of youth in the general population. An elevated ADHD Index score suggests that an ADHD diagnosis should be considered. If the ADHD Index score is low, the person may still qualify for a diagnosis of ADHD, but their scores are not like those of typical individuals with ADHD. Note that the Conners 4–ADHD Index is available as a standalone form.

Table 4.8. Conners 4–ADHD Index Probability Score Guidelines

Probability Score	Guideline
90% to 99%	The probability score is in the Very High range, indicating very high similarity with youth of the same age who have ADHD. The ADHD Index score is very dissimilar to scores from the general population.
80% to 89%	The probability score is in the High range, indicating high similarity with youth of the same age who have ADHD. The ADHD Index score is dissimilar to scores from the general population.
60% to 79%	The probability score is in the Moderate range, indicating the score is slightly more similar to scores from youth of the same age who have ADHD, compared to the general population. Scores in this range require careful examination of scale- and item-level elevations from the remaining Conners 4 Scales.
40% to 59%	The probability score is in the Borderline range, indicating the score is similar to those produced by youth of the same age, whether they are in the general population or have been diagnosed with ADHD. Estimating whether the youth is more likely to be in one of these groups than the other will require consideration of additional findings.
10% to 39%	The probability score is in the Low range, indicating low similarity with youth of the same age who have ADHD. The ADHD Index score is more similar to scores from the general population.
1% to 9%	The probability score is in the Very Low range, indicating very low similarity with youth of the same age who have ADHD. The ADHD Index score is much more similar to scores from the general population.
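The ranges in Table 4.8 can be expressed as a simple lookup. The function name is hypothetical; the boundaries and range labels are those listed in the table.

```python
def adhd_index_range(probability_pct: int) -> str:
    """Map a Conners 4-ADHD Index probability score (1-99) to its Table 4.8 range."""
    if not 1 <= probability_pct <= 99:
        raise ValueError("probability score must be between 1 and 99")
    if probability_pct >= 90:
        return "Very High"
    if probability_pct >= 80:
        return "High"
    if probability_pct >= 60:
        return "Moderate"
    if probability_pct >= 40:
        return "Borderline"
    if probability_pct >= 10:
        return "Low"
    return "Very Low"


print(adhd_index_range(85))  # High
print(adhd_index_range(45))  # Borderline
```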

Step 3b: Examine the Profile of Conners 4 Scale Scores.

The next step is to examine the profile of all the scores to evaluate the overall presentation. In the report, refer to the Overview section for a graph (see Figure 4.1 for an example) and the Conners 4 Scales section for a table (see Figure 4.4 for an example) illustrating the scores for the Conners 4 Content, Impairment & Functional Outcome, DSM Symptom Scales, and the Conners 4–ADHD Index.

If all, or many, of the scores are elevated, this result may indicate pervasive problems across many areas. It might also indicate exaggerated symptoms (exaggeration is likely if the Negative Impression Index score is also high). If all the scores are low, this result may suggest relatively few problems or a denial of problems. If there is a mix of elevated and low scores, the elevated scores will indicate which areas are most problematic for the youth and could offer areas to focus on for treatment or intervention. In most cases, scores among these scales would complement each other; however, there will be instances where they might not, which would present a much more complex clinical profile (see Possible Discrepancies: Content Scales and DSM Symptom Scales and Possible Discrepancies: Content/DSM Symptom Scales and the Conners 4–ADHD Index for these various scenarios). When faced with seemingly inconsistent scores, a discrepancy does not necessarily indicate an error or invalid administration (although it is always wise to double-check data entry if the Conners 4 was administered through a paper form, and to consider possible response style concerns). The components of the Conners 4 entail multiple perspectives to enrich a user’s understanding of each youth being rated, and these different scores provide different types of information. In such cases, a closer examination of the ratings is needed to determine what may be leading to the scale elevations.

Possible Discrepancies: Content Scales and DSM Symptom Scales

There might be differences between the Content Scale scores and the DSM Symptom Scale scores. On the Conners 4, the DSM ADHD Inattentive Symptoms Scale is a subset of the items from the Inattention/Executive Dysfunction Content Scale, and the DSM ADHD Hyperactive/Impulsive Symptoms Scale is a subset of the items from the Hyperactivity and Impulsivity Content Scales. Thus, the Content Scales go beyond what is measured on the DSM Symptom Scales, and that is why they should be considered in conjunction with each other. Due to the overlap between the Content and DSM Symptom Scales, their results are often aligned (i.e., both are elevated, or both are low). In these cases, the same broad interpretive guidelines can be applied. However, because the Content Scales go beyond the symptom criteria outlined in the DSM and include items that are important for capturing the broader construct being assessed, it is possible for the Content Scales to not be elevated even though the narrower DSM Symptom Scales are. For example, the Inattention/Executive Dysfunction Scale goes beyond the DSM ADHD Inattentive Symptoms Scale by including items that ask about problems with areas of executive functioning other than attention (e.g., planning, organization, time management). Similarly, the Hyperactivity and Impulsivity Content Scales focus on problems in each of these areas separately, whereas the DSM ADHD Hyperactive/Impulsive Symptoms Scale combines these two areas and places greater emphasis on hyperactive behavior (six criteria in DSM) than impulsive behavior (three criteria in DSM). For this reason, it is possible for the DSM Symptom Scales to be elevated while the Content Scales are not, and vice versa. When this mismatch occurs, it is best to examine the item-level ratings (i.e., the Items by Scale section in the report) to determine where the discrepancy lies. For instance, the rater might have given the non-DSM items more extreme ratings than the DSM items, or vice versa.

Possible Discrepancies: Content/DSM Symptom Scales and the Conners 4–ADHD Index

Low Scales with High ADHD Index. If all or most of the T-scores for the scales that measure ADHD symptoms (i.e., Inattention/Executive Dysfunction, Hyperactivity, Impulsivity, Emotional Dysregulation, and the DSM ADHD Symptom Scales) are in the Low or Average range (i.e., T-score < 60) but the Conners 4–ADHD Index probability score is High or Very High (i.e., 80% or higher), this suggests that the youth being described has similarities to the ADHD Reference Sample, although symptoms and associated features of ADHD were not endorsed at very high levels. This pattern of results is the most common type of discrepancy found on the Conners 4 because the Conners 4–ADHD Index was created to be a sensitive screener that minimizes the number of false negatives. Note that the Conners 4–ADHD Index was designed to discriminate between an ADHD Reference Sample and a General Population Sample, not to differentiate among various clinical groups. When this pattern of results occurs, consider other diagnostic possibilities that could account for this mismatch in scores. For instance, the youth might benefit from accommodated settings and/or effective interventions that improve the core symptoms of ADHD but do not have as much impact on behaviors captured by the Conners 4–ADHD Index. In this scenario, it is best to turn to the scale- and item-level elevations of the Conners 4 Content and DSM Symptom Scales to help in determining a possible diagnosis of ADHD, as well as to help target areas for treatment and intervention.
Elevated Scales with Low ADHD Index. It is possible (although it is rare) to have several of the scale T-scores as Slightly Elevated or higher (i.e., T-score ≥ 60) while the Conners 4–ADHD Index probability score is Low or Very Low (i.e., 39% or lower). This suggests that the youth being described has many symptoms and associated features of ADHD but does not show a similar pattern to what is often seen for youth of the same age diagnosed with ADHD. It is possible to have a diagnosis of ADHD even when the Conners 4–ADHD Index probability score is 39% or lower. This pattern of results simply means the youth’s presentation is different from the ADHD Reference Sample used in the development of the Conners 4. When this pattern occurs, it is best to turn to the scale- and item-level elevations of the Conners 4 Content and DSM Symptom Scales to help in determining a possible diagnosis of ADHD.
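The two discrepancy patterns above can be screened for with a short sketch. The text does not quantify "all or most" or "several", so the operational definitions here are assumptions made for illustration: "most" is taken as more than half of the supplied scales, and "several" as at least three (the hypothetical `min_elevated` parameter). The function name and returned labels are also hypothetical.

```python
from typing import Mapping, Optional


def index_discrepancy(
    scale_t_scores: Mapping[str, int],
    adhd_index_pct: int,
    min_elevated: int = 3,  # assumed meaning of "several"; not from the manual
) -> Optional[str]:
    """Label a scale/index discrepancy pattern, or return None if neither applies."""
    n_scales = len(scale_t_scores)
    n_elevated = sum(1 for t in scale_t_scores.values() if t >= 60)
    n_not_elevated = n_scales - n_elevated

    # Assumed meaning of "all or most": more than half of the scales have T < 60.
    mostly_low = n_not_elevated > n_scales / 2

    if mostly_low and adhd_index_pct >= 80:
        return "low scales with high ADHD Index"
    if n_elevated >= min_elevated and adhd_index_pct <= 39:
        return "elevated scales with low ADHD Index"
    return None


# Hypothetical T-scores for four scales.
scores = {
    "Inattention/Executive Dysfunction": 55,
    "Hyperactivity": 52,
    "Impulsivity": 58,
    "Emotional Dysregulation": 61,
}
print(index_discrepancy(scores, adhd_index_pct=85))
```

Either label is a cue to review scale- and item-level elevations, as described above, and not a conclusion in itself.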

Step 4: Consider item-level responses of the Conners 4 Scales and the Additional Questions.

To better understand scale scores, it is important to review individual item responses by examining the Items by Scale section of the report. The utility of item-level responses is especially notable when elevated T-scores are obtained and/or significant differences from the within-profile comparisons have been identified, so that one can determine which items may have contributed to the elevated scale scores. It is also informative to examine the items in the absence of scale elevations to identify whether any items are rated at high levels and marked as elevated compared to youth of the same age (and gender, if Gender Specific reference samples are selected). Items endorsed at levels that were deemed infrequent in the Conners 4 Normative Sample (that is, endorsed at approximately the 85th percentile or higher) are flagged as “elevated” in the Items by Scale section of the report. For instance, one or two items may be rated at 3, “Completely true (Very often/Always),” while the rest are rated mostly at 0, “Not true at all (Never/Rarely),” or 1, “Just a little true (Occasionally).” This pattern of item responses would typically not lead to scale elevations, especially in scales with many items (such as the Inattention/Executive Dysfunction Scale, with 20 items on the full-length form and 10 items on the shortened form). However, it would be helpful for the assessor to examine which items, if any, were elevated, to explore potential difficulties the youth may be experiencing despite not having sufficient difficulties for a full-scale elevation. It is also possible to have an elevated scale score without any elevated items for that scale. This result can occur when several items are endorsed just below the threshold for elevation and, when summed together, indicate an overall content area that may be of concern.

Whether or not there are scale elevations, looking at the individual item responses for a scale will ultimately result in improved interpretation and intervention. Items rated most problematic may help prioritize which treatment targets represent the most immediate needs in that setting, or they may indicate additional treatment goals.

Finally, each rater is asked to complete three Additional Questions at the end of the rating scale. The first question asks the rater to provide extra information on the pervasiveness of the problems reported in the different domains of functioning. The second question gives the rater the opportunity to describe any other current issues or problems that have not been captured by the Conners 4 items. The rater’s response to this question may indicate other areas that should be investigated. For both questions, the rater may describe problems that are already captured by their responses on the Conners 4; this reiteration typically represents high levels of concern about the issues and indicates that the rater wants to be certain that their concern is communicated clearly to the assessor. Finally, the third question asks the rater to describe the youth’s strengths and skills. This question encourages consideration of the youth’s positive qualities, which can increase rapport with the youth and their family and reduce a problem-focused, negative perspective on the youth being rated. Recognition of strengths and skills is important for effective interventions, in that the youth’s abilities can be used as building blocks for success in treatment.

Step 5: Integrate results across multiple raters, with other sources of information, and monitor change over time.

Across Raters

Comparing Conners 4 scores obtained from different raters (Parent, Teacher, and Self-Report) not only helps to gain an understanding of the relationship between behaviors and different contexts, but it also allows for different perspectives on the rated youth to be considered. For example, the youth who does not get what they want may be able to control their emotions in a structured setting (such as school or work) but may lose control in unstructured settings (e.g., at home). Similarly, the youth may lose the ability to concentrate in noisy environments (e.g., when peers are talking or there is construction in the vicinity) but find it easy to work effectively in a quiet setting (e.g., a quiet room). These discrepancies can provide insight into factors that may improve the youth’s functioning and those factors that should be managed when setting up the youth’s environment. It is recommended that different raters complete the Conners 4 around the same time period so that the behaviors rated by all raters reflect current/recent functioning and are less impacted by the time elapsed or potential development over time.

Comparing scores obtained from different raters must take measurement error into consideration by examining statistically significant differences between raters’ scores. For example, a 16-year-old male’s T-score of 67 (Elevated) for the Inattention/Executive Dysfunction Scale on the Conners 4 Self-Report might be interpreted as higher than the score of 63 (Slightly Elevated) for the same scale on the Conners 4 Teacher Form. However, the difference is not statistically significant when the measurement error associated with these scales is taken into account. These two scores should be interpreted as similar, rather than different, despite their differing classifications (see Table 4.1). To compare the Conners 4 T-scores obtained from different raters, values needed to establish statistical significance were calculated. The Conners 4 Multi-Rater Report compares scores from multiple raters (see chapter 3, Scoring and Reports, for details) and significant differences between raters’ scores are automatically calculated and included in this report.
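The idea of a measurement-error-aware comparison can be sketched as follows. This is a generic approach based on the standard error of the difference between two scores and is not confirmed to be the method used to derive the values in the Multi-Rater Report; the SEM values in the example are invented, and the function names are hypothetical.

```python
import math


def critical_difference(sem_a: float, sem_b: float, critical_z: float = 1.96) -> float:
    """Smallest T-score difference treated as significant (two-tailed p < .05 by default)."""
    # Standard error of the difference between two scores with independent errors.
    se_diff = math.sqrt(sem_a ** 2 + sem_b ** 2)
    return critical_z * se_diff


def raters_differ(t_a: float, t_b: float, sem_a: float, sem_b: float) -> bool:
    """True when the gap between two raters' T-scores exceeds the critical difference."""
    return abs(t_a - t_b) >= critical_difference(sem_a, sem_b)


# Self-Report T = 67 vs. Teacher T = 63, with hypothetical SEMs of 3.0 and 3.5.
print(raters_differ(67, 63, sem_a=3.0, sem_b=3.5))  # False
```

With these assumed SEM values, a 4-point gap falls well inside the critical difference, which matches the conclusion in the example above that the two scores should be treated as similar.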

Other Sources of Information

A complete evaluation must include multiple modalities of data collection. Information from the Conners 4 should be integrated with information collected in other ways, such as interviews, record reviews, observation, other rating scales, and other forms of testing. The richness of this information can help interpret the validity and clinical significance of the Conners 4 scores.

Changes Over Time

The American Academy of Pediatrics recognizes that ADHD is a chronic condition that requires ongoing monitoring of both symptoms and treatment efficacy across the lifespan (Wolraich et al., 2019). The Conners 4 works well for periodic monitoring, given that it describes a discrete time period (1 month), allowing for valid retesting at regular intervals, and because test-retest reliability is strong (see Test-Retest Reliability in chapter 8, Reliability). Ideally, the time between administrations should be at least one month. The measurement of change to evaluate treatment effectiveness can be difficult because it involves differentiating actual change brought about by intervention from random fluctuations in behavior or measurement error (Jensen, 2001; Ogles et al., 2001; Tingey et al., 1996). If a score decreases from pre-treatment to post-treatment, this change suggests that the treatment is working in the right direction. But first, one needs to evaluate whether the change between the test scores is statistically significant. Note that the scores being compared should be obtained from the same rater.

A commonly used method for gauging statistically significant change has been outlined by Jacobson and Truax (1991), which involves calculating the Reliable Change Index (RCI) to determine whether a change in scores between test administrations is statistically significant. The values needed to establish statistical significance when comparing Time 1 to Time 2 scores were calculated for all scales on the Conners 4. These values take the SEM of each of the scales into account (see chapter 8, Reliability), creating different cut-off values for each level of significance. For instructions on manually calculating the difference between scores on pairs of administrations over time, as well as the values needed to determine the significance of differences at the p < .05 level for the Conners 4 and Conners 4–Short, see appendix E.
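As an illustration of the RCI described by Jacobson and Truax (1991), the sketch below divides the change in scores by the standard error of the difference, computed from the scale's SEM. The SEM value in the example is invented, and the function names are hypothetical; for actual interpretation, use the tabled values in appendix E, which may have been derived with different inputs.

```python
import math


def reliable_change_index(time1: float, time2: float, sem: float) -> float:
    """RCI = (Time 2 - Time 1) / S_diff, where S_diff = sqrt(2 * SEM^2)."""
    s_diff = math.sqrt(2.0 * sem ** 2)
    return (time2 - time1) / s_diff


def is_reliable_change(time1: float, time2: float, sem: float,
                       critical_z: float = 1.96) -> bool:
    """True when |RCI| exceeds the critical value (1.96 for p < .05, two-tailed)."""
    return abs(reliable_change_index(time1, time2, sem)) > critical_z


# Hypothetical pre- and post-treatment T-scores from the same rater, SEM assumed to be 3.0.
print(round(reliable_change_index(72, 62, sem=3.0), 2))  # -2.36
print(is_reliable_change(72, 62, sem=3.0))               # True
print(is_reliable_change(67, 63, sem=3.0))               # False
```

A negative RCI indicates a decrease in the score from Time 1 to Time 2; whether that decrease reflects improvement depends on the scale being compared.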

¹ Throughout this manual, DSM refers to the Diagnostic and Statistical Manual of Mental Disorders, Fifth Edition, Text Revision (DSM-5-TR, 2022).

² The DSM Total ADHD Symptoms scale score is based on ratings provided for all the DSM ADHD items (i.e., a combination of the items from the DSM ADHD Inattentive and DSM ADHD Hyperactive/Impulsive Symptoms scales).
