Conners 4 Manual

Chapter 7: Standardization Procedures and Continuous Norming


Trends in the raw scores of the Conners 4 were examined to determine normative groups with regard to gender and age. Analyses of variance (ANOVAs) were conducted to compare mean differences between males and females for each raw scale score for Parent, Teacher, and Self-Report. Consistent with previous research on the presence of gender differences in the symptomatology of ADHD (e.g., Arcia & Conners, 1999; Arnett et al., 2014), small but statistically significant differences (p < .01) were observed between genders across many of the Conners 4 scales. As a result of these findings, normative groups were created for Combined Gender (comprising male, female, and other genders), as well as separate normative groups for Males and for Females. Gender differences are explored in more detail in Chapter 10, Fairness.

Furthermore, analyses were conducted on the Normative Samples to determine whether statistically and practically significant age differences could be observed within each gender, using Kruskal-Wallis chi-square tests for statistical significance and Cohen’s d effect sizes for standardized differences. There was a statistically significant linear effect of age, treated as discrete groups by full-year intervals (with ages 17 and 18 combined into a single group), on many raw scale scores (see Tables 7.42 to 7.44 for full results). Additionally, many scales showed a small but statistically significant association between raw score and age when age was treated as a continuous variable (as measured by Kendall’s tau [τ] correlation coefficient; p < .01 for the majority of the scales; see Table 7.45 for a summary of results).
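As an illustration of the continuous-age analysis, the following minimal Python sketch computes a Kendall rank correlation between age and raw score. It is not the procedure used for the Conners 4 (those analyses were run in R). The tau-b variant, which adjusts for tied values, the function name kendall_tau_b, and the example data are all assumptions made for illustration, and the significance test is not shown.

```python
from itertools import combinations
from math import sqrt


def kendall_tau_b(x, y):
    """Kendall's tau-b rank correlation (adjusts for ties in x and in y)."""
    concordant = discordant = ties_x_only = ties_y_only = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        dx, dy = x1 - x2, y1 - y2
        if dx == 0 and dy == 0:
            continue  # tied on both variables: excluded from every count
        elif dx == 0:
            ties_x_only += 1
        elif dy == 0:
            ties_y_only += 1
        elif dx * dy > 0:
            concordant += 1
        else:
            discordant += 1
    denom = sqrt((concordant + discordant + ties_x_only)
                 * (concordant + discordant + ties_y_only))
    return (concordant - discordant) / denom if denom else float("nan")


# Hypothetical data: ages in years and raw scale scores for five children.
example_ages = [6, 7, 8, 9, 10]
example_scores = [5, 4, 4, 2, 1]
tau = kendall_tau_b(example_ages, example_scores)
```

A negative value of tau indicates that raw scores tend to decrease as age increases; a value near zero indicates little monotonic association.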






Effect sizes between all adjacent age intervals were then examined for the Combined Gender and Gender Specific samples, separately, to determine whether single-year intervals or larger age bands were more meaningful. Effect sizes, as measured by Cohen’s d standardized mean difference ratios, for adjacent intervals ranged from -0.28 to 0.32 for Parent, -0.22 to 0.25 for Teacher, and -0.22 to 0.26 for Self-Report. Taken together, these results indicate that, beyond the overall age trend previously noted, there are small but potentially important differences between pairwise age intervals. These modest differences were deemed meaningful enough to warrant the creation of 12 age-based normative groups (i.e., 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, and 17/18 for Parent and Teacher; note that Self-Report begins at age 8 and therefore has only 10 normative age groups).
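The comparison of adjacent age intervals can be sketched in Python as follows. This is a minimal illustration, not the code used for the Conners 4. It assumes the common pooled-standard-deviation form of Cohen’s d and an "older minus younger" sign convention, neither of which is specified in the text; the function names and the data are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def cohens_d(group_a, group_b):
    """Standardized mean difference (a minus b) using the pooled SD."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_var = ((n_a - 1) * variance(group_a)
                  + (n_b - 1) * variance(group_b)) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / sqrt(pooled_var)


def adjacent_effect_sizes(scores_by_age):
    """Cohen's d for each pair of adjacent age groups (older minus younger)."""
    ages = sorted(scores_by_age)
    return {
        (younger, older): cohens_d(scores_by_age[older], scores_by_age[younger])
        for younger, older in zip(ages, ages[1:])
    }


# Hypothetical raw scores keyed by age group.
example = {6: [1, 3, 5], 7: [2, 4, 6], 8: [2, 4, 6]}
effects = adjacent_effect_sizes(example)
```

Each key of the returned dictionary is a pair of adjacent ages, so values close to zero for every pair would suggest that wider age bands lose little information.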

To best capture the important relationship between age and raw scores on the Conners 4, continuous norming was selected as the method for creating standardized scores. Through this regression-based method (Roid, 1983; Zachary & Gorsuch, 1985; Zhu & Chen, 2011), the means and standard deviations of the normative age groups were statistically smoothed to mitigate the effects of sampling variability and to better model the progression of symptoms across ages. By establishing a line or curve that best fits the data, continuous norming makes efficient use of information from the whole sample rather than drawing upon one age group at a time (Angoff & Robertson, 1987). All analyses were conducted in RStudio, using the stats package (version 3.6.2; R Core Team, 2013).

For each scale on the Conners 4, standard assumptions for general linear models were checked (e.g., normality, absence of outliers, and homoskedasticity; Tabachnick & Fidell, 2007). Due to the nature of the constructs (symptomatology that is not present in much of the general population), the raw scores for most scales violated the assumption of normality. Where necessary, the positive skew was corrected by applying a square root transformation prior to entering the scores into regression models (Lenhard, Lenhard, Suggate, & Segerer, 2016). Each regression model included age (in years) and age-squared as predictors, testing both a linear and a curvilinear relationship, with the scale score as the outcome variable. If the curvilinear (quadratic) term was not statistically significant at the .05 level (i.e., p ≥ .05), this term was dropped from the regression and a linear model was used instead. The resultant parameter estimates (i.e., unstandardized beta weights) from the regression model were extracted and used to derive a smoothed predicted mean for each age group.
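The smoothing of means described above can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the procedure used for the Conners 4 (which was run in R): ordinary least squares is solved through the normal equations, the significance test of the quadratic term is not implemented (the caller passes the hypothetical flag use_quadratic instead), and all function names are invented for this example.

```python
from math import sqrt


def solve(a, b):
    """Solve a * x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x


def fit_polynomial(ages, scores, degree):
    """Least-squares coefficients [b0, b1, ...] for score ~ age + age^2 + ..."""
    design = [[age ** p for p in range(degree + 1)] for age in ages]
    size = degree + 1
    xtx = [[sum(row[i] * row[j] for row in design) for j in range(size)]
           for i in range(size)]
    xty = [sum(row[i] * y for row, y in zip(design, scores)) for i in range(size)]
    return solve(xtx, xty)


def predict(coefs, age):
    """Predicted (smoothed) value at a given age."""
    return sum(b * age ** p for p, b in enumerate(coefs))


def smooth_means(ages, raw_scores, use_quadratic=True, transform=True):
    """Smoothed predicted mean per age, optionally on the square-root scale."""
    y = [sqrt(s) for s in raw_scores] if transform else list(raw_scores)
    coefs = fit_polynomial(ages, y, degree=2 if use_quadratic else 1)
    return {age: predict(coefs, age) for age in sorted(set(ages))}
```

Note that when the square root transformation is applied, the smoothed means returned by this sketch are on the transformed scale, so any score compared against them would need the same transformation.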

The same process was applied to the standard deviations. The standard deviation of each scale score was calculated within each age group and served as the outcome variable, with age (and age-squared, where statistically significant) entered as a continuous predictor. Again, the parameter estimates of these regression models were extracted and used to compute smoothed predicted standard deviations for each age group. The smoothed means and standard deviations were then used to calculate T-scores; these standard scores showed reduced noise from sampling variability and no discontinuity between adjacent age groups. The Conners 4 Combined Gender and Gender Specific Normative Samples, for Parent, Teacher, and Self-Report, were standardized to have T-scores with a mean of 50 and a standard deviation of 10.
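The conversion from a score to a T-score can be sketched in Python as shown below. This is the standard linear T-score formula and is offered only as an illustration. It assumes that the score and the smoothed mean and standard deviation are on the same scale (for example, all after any square root transformation); the function name and the numbers are hypothetical, and any rounding or truncation rules used for the published norms are not shown.

```python
def t_score(score, smoothed_mean, smoothed_sd):
    """T-score (mean 50, SD 10) from a score and its age group's smoothed
    predicted mean and standard deviation."""
    z = (score - smoothed_mean) / smoothed_sd
    return 50 + 10 * z


# Hypothetical values: a score one smoothed SD above the smoothed mean.
example_t = t_score(score=3.0, smoothed_mean=2.0, smoothed_sd=1.0)
```

A score equal to the smoothed mean yields T = 50, and each smoothed standard deviation above or below the mean shifts the T-score by 10 points.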

Empirical percentiles were also calculated within each group in the Normative Samples. Empirical percentiles are generated from the frequency distribution of the actual scores: if 90% of the scores are at or below a given raw score, that raw score is assigned the 90th percentile. Theoretical percentiles, in contrast, could be calculated from the T-scores by assuming that they follow a normal distribution, such that a T-score of 50 is equivalent to a percentile rank of 50. However, because many of the Conners 4 scales are skewed and non-normal, empirical percentiles were selected for the Normative Samples to better reflect the shape of the distributions.
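The "at or below" definition of an empirical percentile can be sketched in Python as follows. This is a minimal illustration with hypothetical data and a hypothetical function name; any rounding or capping applied to the published percentiles is not shown.

```python
from bisect import bisect_right


def empirical_percentile(raw_score, group_scores):
    """Percentage of scores in the group that are at or below raw_score."""
    ordered = sorted(group_scores)
    at_or_below = bisect_right(ordered, raw_score)
    return 100 * at_or_below / len(ordered)


# Hypothetical, positively skewed raw scores for one normative group.
group = [0, 0, 1, 1, 2, 2, 3, 4, 5, 9]
pct = empirical_percentile(5, group)
```

In this example, 9 of the 10 scores are at or below a raw score of 5, so that raw score is assigned the 90th percentile regardless of how skewed the distribution is.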

This continuous norming process was also applied to the ADHD Reference Samples to create their standardized scores. For consistency with the Normative Samples, the same age groups were used where possible; however, because the data were sparse, ages 6 and 7 were collapsed into a single age group (i.e., 6/7) for the Parent and Teacher ADHD Reference Samples. For each scale, a smoothed predicted mean and standard deviation were calculated from a regression predicting the scale’s raw score from age (and age-squared, where statistically significant) as a continuous variable. Given the shape of the distributions in the ADHD Reference Samples, corrections for positive skew were not necessary, in contrast to the Normative Samples, and untransformed raw scores were entered into the regressions.

With this method, scores for all ages can be interpolated from the resultant regression line, even for age groups with a small sample size (e.g., N = 7 for 13-year-olds for Self-Report). Because the entire sample is used to estimate the shape of the regression line, the estimate is stable and depends less on small samples. This method, in turn, reduces noise from sampling variability and produces scores that can be interpreted across all age groups without discontinuous jumps between groups.

Using the same process as for the Normative Samples, the Combined Gender and Gender Specific ADHD Reference Samples, for Parent, Teacher, and Self-Report, were standardized to have T-scores (M = 50, SD = 10) for each scale on the Conners 4. To calculate percentiles for the ADHD Reference Samples, theoretical percentiles were chosen rather than empirical percentiles. The raw scores of the ADHD Reference Samples were more normally distributed than those of the Normative Samples, so theoretical percentiles better capture the shape of the distribution of responses.
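A theoretical percentile can be derived from a T-score by treating T-scores as normally distributed with a mean of 50 and a standard deviation of 10. The following Python sketch illustrates that conversion; the function name is hypothetical, and any rounding applied to the published percentiles is not shown.

```python
from statistics import NormalDist

# T-scores are assumed to follow a normal distribution with M = 50, SD = 10.
T_DISTRIBUTION = NormalDist(mu=50, sigma=10)


def theoretical_percentile(t_score):
    """Percentile rank implied by a T-score under the normal assumption."""
    return 100 * T_DISTRIBUTION.cdf(t_score)
```

Under this assumption a T-score of 50 corresponds to the 50th percentile, and a T-score of 60 (one standard deviation above the mean) corresponds to roughly the 84th percentile.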

