Conners 4 Manual

Chapter 10: Race/Ethnicity


Race/Ethnicity

The historical and continuing marginalization of certain racial and ethnic groups in North America has the potential to disadvantage these groups (e.g., by yielding a higher score) or to introduce unintended bias during test construction. Accordingly, it was important to examine how the Conners 4 operates within these groups to ensure that equity standards are met. For the U.S. portion of the Conners 4 Normative Samples, race and ethnicity were categorized according to the U.S. Census Bureau classifications into the following five groups: Hispanic (regardless of race), Black, Asian, White, and Other (includes Native American, Multiracial, and other racial identities not otherwise listed). More details about the correspondence to the U.S. Census classifications can be found in Race/Ethnicity in chapter 7, Standardization.

Youth whose race/ethnicity was classified as Asian or Other were excluded from the measurement invariance (MI) and differential test functioning (DTF) analyses due to small sample sizes; for Other in particular, group-level scores are difficult to interpret because the category combines a multitude of racial groups. Additionally, race/ethnicity analyses in this section are limited to youth who live in the U.S., as the Canadian sample sizes were too small to permit meaningful analyses. Differences among the U.S. racial and ethnic groups were explored with regard to the Conners 4 structure and scores. Negligible differences were expected, as the test was designed to minimize the influence of cultural background, with the goal of generalizing to diverse populations, as described in chapter 6, Development.

First, MI was explored within the U.S. subsamples of the Conners 4 Total Samples. Tables 10.8 and 10.9 present the MI results for Conners 4 Parent and Teacher, respectively, for the comparison of Hispanic youth (N = 520 for Parent and N = 411 for Teacher) and White youth (N = 1,727 for Parent and N = 1,545 for Teacher). For Parent, there were no meaningful decreases in model fit, as hypothesized. The Satorra-Bentler χ² test was significant for the intercept model of the Content Scales, the Impairment & Functional Outcome Scales, and the DSM Oppositional Defiant Disorder Symptoms Scale, but the other fit statistics did not show a meaningful decline in fit, indicating no evidence for a lack of invariance. For Teacher, the comparison between the loading and intercept models (the test of strong invariance) for the Content Scales was statistically significant when examined with the Satorra-Bentler χ² test, but no other fit statistic showed a meaningful decrease in fit, indicating that strong invariance was met.
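The step-by-step comparison of nested models described above can be illustrated with a small sketch. It takes the CFI values reported for the Parent Content Scales in Table 10.8 and computes the change in CFI between each pair of adjacent models. The cutoff of .01 for a "meaningful" decline is an assumption used here for illustration (a common rule of thumb); this section does not state the criterion that was applied, and the function name is hypothetical.

```python
# Assumed rule of thumb: a drop in CFI larger than .01 between nested models
# is treated as a meaningful decline. This cutoff is NOT stated in the section.
DELTA_CFI_CUTOFF = 0.01

# (model, CFI) for the Parent Content Scales in Table 10.8,
# ordered from least to most constrained.
CONTENT_SCALES_PARENT = [
    ("configural", 0.974),
    ("threshold", 0.975),
    ("loading", 0.975),
    ("intercept", 0.976),
]


def delta_cfi_steps(models, cutoff=DELTA_CFI_CUTOFF):
    """Compare each model with the less constrained model before it.

    Returns (comparison, signed change in CFI, meaningful_decline) tuples.
    A decline is flagged only when CFI drops by more than `cutoff`.
    """
    results = []
    for (prev_name, prev_cfi), (name, cfi) in zip(models, models[1:]):
        change = round(cfi - prev_cfi, 3)
        results.append((f"{prev_name} v. {name}", change, change < -cutoff))
    return results


steps = delta_cfi_steps(CONTENT_SCALES_PARENT)
for comparison, change, declined in steps:
    print(f"{comparison}: change in CFI = {change:+.3f}, meaningful decline = {declined}")
```

In this example CFI never decreases as constraints are added, so no step is flagged, which is consistent with the conclusion that invariance was supported even where the Satorra-Bentler test was significant.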

Tables 10.10 and 10.11 present the MI results for Parent and Teacher, respectively, for Black (N = 312 for Parent and N = 376 for Teacher) and White youth (N = 1,727 for Parent and N = 1,545 for Teacher). Neither rater form showed a meaningful decrease in model fit when comparing these two racial groups. While the Satorra-Bentler χ² test was significant for some comparisons (e.g., the intercept model when testing Content Scales), many other fit statistics did not show a decline in fit, ultimately supporting the overall invariance of the Conners 4 scales.

Due to the smaller sample size for Black youth on the Conners 4 Self-Report (N = 171), Hispanic youth (N = 286) and Black youth were combined into a larger group (N = 457), and the combined group was compared to White youth (N = 791) for MI analysis. Although the limited sample size made this necessary, combining the groups in this way may limit the interpretability of the results. Future research will explore White vs. Hispanic and White vs. Black comparisons in larger samples to confirm the pattern of results presented within this section. Table 10.12 presents the MI results for Self-Report. The Satorra-Bentler χ² test was significant for all three comparisons of the Content Scales and for the loading model of the Impairment & Functional Outcome Scales. However, many other fit statistics did not show a decline in fit across factor structure, thresholds, loadings, and intercepts for all scales, supporting the conclusion that the Conners 4 measures its constructs with the same structure for White youth as it does for a combined sample of Black and Hispanic youth.


Table 10.8. Measurement Invariance by U.S. Race/Ethnicity (White vs. Hispanic): Conners 4 Parent

| Scales | Model | χ² | df | RMSEA | CFI | TLI | SRMR | Comparison | Satorra-Bentler χ² | df | ∆CFI |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Content Scales | Configural | 9110.26*** | 3274 | .040 | .974 | .973 | .042 | | | | |
| | Threshold | 9158.18*** | 3333 | .039 | .975 | .974 | .042 | configural v. threshold | 63.07 | 59 | .001 |
| | Loading | 9095.52*** | 3386 | .039 | .975 | .975 | .042 | threshold v. loading | 52.56 | 53 | .000 |
| | Intercept | 9019.27*** | 3439 | .038 | .976 | .976 | .042 | loading v. intercept | 86.53** | 53 | .001 |
| Impairment & Functional Outcome Scales | Configural | 1751.44*** | 298 | .066 | .982 | .980 | .045 | | | | |
| | Threshold | 1788.04*** | 317 | .064 | .982 | .981 | .045 | configural v. threshold | 19.89 | 19 | .000 |
| | Loading | 1759.53*** | 333 | .062 | .983 | .982 | .045 | threshold v. loading | 21.86 | 16 | .001 |
| | Intercept | 1713.73*** | 349 | .059 | .983 | .984 | .046 | loading v. intercept | 30.26* | 16 | .000 |
| DSM Oppositional Defiant Disorder Symptoms Scale | Configural | 660.92*** | 70 | .087 | .982 | .977 | .038 | | | | |
| | Threshold | 680.33*** | 80 | .082 | .982 | .979 | .038 | configural v. threshold | 7.93 | 10 | .000 |
| | Loading | 638.92*** | 89 | .074 | .983 | .983 | .038 | threshold v. loading | 8.18 | 9 | .001 |
| | Intercept | 609.47*** | 98 | .068 | .984 | .986 | .039 | loading v. intercept | 19.49* | 9 | .001 |
| DSM Conduct Disorder Symptoms Scale | Configural | 787.31*** | 180 | .055 | .971 | .966 | .076 | | | | |
| | Threshold | 812.44*** | 195 | .053 | .971 | .968 | .076 | configural v. threshold | 15.32 | 15 | .000 |
| | Loading | 798.18*** | 209 | .050 | .972 | .972 | .076 | threshold v. loading | 14.83 | 14 | .001 |
| | Intercept | 707.97*** | 223 | .044 | .977 | .978 | .076 | loading v. intercept | 14.04 | 14 | .005 |

Note. N = 520 Hispanic youth; N = 1,727 White youth. RMSEA = Root mean square error of approximation; CFI = Comparative Fit Index; TLI = Tucker-Lewis Index; SRMR = Standardized root mean square residual; ∆CFI = change in CFI. *p < .05, **p < .01, ***p < .001.


Table 10.9. Measurement Invariance by U.S. Race/Ethnicity (White vs. Hispanic): Conners 4 Teacher

| Scales | Model | χ² | df | RMSEA | CFI | TLI | SRMR | Comparison | Satorra-Bentler χ² | df | ∆CFI |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Content Scales | Configural | 9932.68*** | 3274 | .046 | .969 | .968 | .054 | | | | |
| | Threshold | 9983.78*** | 3333 | .045 | .970 | .969 | .054 | configural v. threshold | 61.16 | 59 | .001 |
| | Loading | 9927.18*** | 3386 | .044 | .970 | .970 | .054 | threshold v. loading | 58.39 | 53 | .000 |
| | Intercept | 9754.35*** | 3439 | .043 | .971 | .971 | .054 | loading v. intercept | 72.88* | 53 | .001 |
| Impairment & Functional Outcome Scales | Configural | 881.43*** | 106 | .087 | .984 | .980 | .052 | | | | |
| | Threshold | 917.16*** | 118 | .083 | .984 | .982 | .052 | configural v. threshold | 12.52 | 12 | .000 |
| | Loading | 901.49*** | 128 | .079 | .984 | .984 | .052 | threshold v. loading | 10.61 | 10 | .000 |
| | Intercept | 883.86*** | 138 | .074 | .985 | .985 | .052 | loading v. intercept | 11.16 | 10 | .001 |
| DSM Oppositional Defiant Disorder Symptoms Scale | Configural | 695.40*** | 70 | .096 | .986 | .982 | .042 | | | | |
| | Threshold | 720.91*** | 80 | .091 | .986 | .984 | .042 | configural v. threshold | 7.75 | 10 | .000 |
| | Loading | 689.30*** | 89 | .083 | .987 | .987 | .042 | threshold v. loading | 10.96 | 9 | .001 |
| | Intercept | 600.96*** | 98 | .072 | .989 | .990 | .042 | loading v. intercept | 7.38 | 9 | .002 |
| DSM Conduct Disorder Symptoms Scale | Configural | 457.97*** | 130 | .051 | .975 | .970 | .115 | | | | |
| | Threshold | 472.88*** | 139 | .050 | .975 | .972 | .115 | configural v. threshold | 11.67 | 9 | .000 |
| | Loading | 445.70*** | 151 | .045 | .978 | .977 | .118 | threshold v. loading | 13.50 | 12 | .003 |
| | Intercept | 456.46*** | 159 | .044 | .978 | .978 | .117 | loading v. intercept | 13.41 | 8 | .000 |

Note. N = 411 Hispanic youth; N = 1,545 White youth. RMSEA = Root mean square error of approximation; CFI = Comparative Fit Index; TLI = Tucker-Lewis Index; SRMR = Standardized root mean square residual; ∆CFI = change in CFI. *p < .05, **p < .01, ***p < .001.


Table 10.10. Measurement Invariance by U.S. Race/Ethnicity (White vs. Black): Conners 4 Parent

| Scales | Model | χ² | df | RMSEA | CFI | TLI | SRMR | Comparison | Satorra-Bentler χ² | df | ∆CFI |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Content Scales | Configural | 7948.80*** | 3274 | .037 | .975 | .974 | .044 | | | | |
| | Threshold | 7986.59*** | 3333 | .037 | .975 | .974 | .044 | configural v. threshold | 67.97 | 59 | .000 |
| | Loading | 7946.09*** | 3386 | .036 | .975 | .975 | .044 | threshold v. loading | 65.65 | 53 | .000 |
| | Intercept | 7853.37*** | 3439 | .036 | .976 | .976 | .044 | loading v. intercept | 79.83* | 53 | .001 |
| Impairment & Functional Outcome Scales | Configural | 1624.32*** | 298 | .066 | .983 | .980 | .048 | | | | |
| | Threshold | 1646.85*** | 317 | .064 | .983 | .981 | .048 | configural v. threshold | 23.07 | 19 | .000 |
| | Loading | 1606.40*** | 333 | .061 | .983 | .983 | .048 | threshold v. loading | 21.66 | 16 | .000 |
| | Intercept | 1563.55*** | 349 | .058 | .984 | .984 | .048 | loading v. intercept | 32.62** | 16 | .001 |
| DSM Oppositional Defiant Disorder Symptoms Scale | Configural | 600.33*** | 70 | .086 | .983 | .978 | .039 | | | | |
| | Threshold | 619.63*** | 80 | .081 | .982 | .980 | .039 | configural v. threshold | 11.30 | 10 | .001 |
| | Loading | 612.51*** | 89 | .076 | .983 | .983 | .039 | threshold v. loading | 13.09 | 9 | .001 |
| | Intercept | 596.34*** | 98 | .071 | .984 | .985 | .039 | loading v. intercept | 14.33 | 9 | .001 |
| DSM Conduct Disorder Symptoms Scale | Configural | 681.68*** | 180 | .052 | .974 | .970 | .083 | | | | |
| | Threshold | 696.69*** | 194 | .050 | .974 | .972 | .083 | configural v. threshold | 12.56 | 14 | .000 |
| | Loading | 667.32*** | 208 | .047 | .977 | .976 | .083 | threshold v. loading | 15.27 | 14 | .003 |
| | Intercept | 594.27*** | 221 | .041 | .981 | .982 | .083 | loading v. intercept | 10.89 | 13 | .004 |

Note. N = 312 Black youth; N = 1,727 White youth. RMSEA = Root mean square error of approximation; CFI = Comparative Fit Index; TLI = Tucker-Lewis Index; SRMR = Standardized root mean square residual; ∆CFI = change in CFI. *p < .05, **p < .01, ***p < .001.


Table 10.11. Measurement Invariance by U.S. Race/Ethnicity (White vs. Black): Conners 4 Teacher

| Scales | Model | χ² | df | RMSEA | CFI | TLI | SRMR | Comparison | Satorra-Bentler χ² | df | ∆CFI |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Content Scales | Configural | 9766.24*** | 3274 | .045 | .970 | .969 | .054 | | | | |
| | Threshold | 9818.22*** | 3333 | .045 | .970 | .970 | .054 | configural v. threshold | 55.42 | 59 | .000 |
| | Loading | 9798.50*** | 3386 | .044 | .971 | .970 | .054 | threshold v. loading | 78.03* | 53 | .001 |
| | Intercept | 9732.96*** | 3439 | .044 | .971 | .971 | .054 | loading v. intercept | 103.61*** | 53 | .000 |
| Impairment & Functional Outcome Scales | Configural | 909.08*** | 106 | .089 | .983 | .979 | .049 | | | | |
| | Threshold | 951.18*** | 118 | .086 | .982 | .980 | .049 | configural v. threshold | 14.38 | 12 | .001 |
| | Loading | 949.13*** | 128 | .082 | .982 | .982 | .049 | threshold v. loading | 9.10 | 10 | .000 |
| | Intercept | 993.80*** | 138 | .080 | .982 | .983 | .049 | loading v. intercept | 36.15*** | 10 | .000 |
| DSM Oppositional Defiant Disorder Symptoms Scale | Configural | 826.94*** | 70 | .106 | .985 | .981 | .041 | | | | |
| | Threshold | 860.22*** | 80 | .101 | .985 | .983 | .041 | configural v. threshold | 5.95 | 10 | .000 |
| | Loading | 864.62*** | 89 | .095 | .985 | .985 | .041 | threshold v. loading | 10.14 | 9 | .000 |
| | Intercept | 830.89*** | 98 | .088 | .986 | .987 | .041 | loading v. intercept | 10.71 | 9 | .001 |
| DSM Conduct Disorder Symptoms Scale | Configural | 548.95*** | 130 | .058 | .975 | .970 | .115 | | | | |
| | Threshold | 572.46*** | 141 | .056 | .974 | .971 | .115 | configural v. threshold | 14.94 | 11 | .001 |
| | Loading | 565.47*** | 153 | .053 | .975 | .975 | .115 | threshold v. loading | 12.32 | 12 | .001 |
| | Intercept | 590.70*** | 164 | .052 | .974 | .975 | .118 | loading v. intercept | 39.57*** | 11 | .001 |

Note. N = 376 Black youth; N = 1,545 White youth. RMSEA = Root mean square error of approximation; CFI = Comparative Fit Index; TLI = Tucker-Lewis Index; SRMR = Standardized root mean square residual; ∆CFI = change in CFI. *p < .05, **p < .01, ***p < .001.


Table 10.12. Measurement Invariance by U.S. Race/Ethnicity (Hispanic/Black vs. White): Conners 4 Self-Report

| Scales | Model | χ² | df | RMSEA | CFI | TLI | SRMR | Comparison | Satorra-Bentler χ² | df | ∆CFI |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Content Scales | Configural | 6298.23*** | 3390 | .037 | .955 | .953 | .055 | | | | |
| | Threshold | 6373.18*** | 3450 | .037 | .954 | .953 | .055 | configural v. threshold | 94.95** | 60 | .001 |
| | Loading | 6414.82*** | 3504 | .037 | .955 | .954 | .055 | threshold v. loading | 103.69*** | 54 | .001 |
| | Intercept | 6394.55*** | 3558 | .036 | .956 | .956 | .055 | loading v. intercept | 75.24* | 54 | .001 |
| Impairment & Functional Outcome Scales | Configural | 1095.85*** | 298 | .066 | .945 | .936 | .067 | | | | |
| | Threshold | 1127.31*** | 317 | .064 | .944 | .939 | .067 | configural v. threshold | 16.37 | 19 | .001 |
| | Loading | 1160.22*** | 333 | .063 | .942 | .941 | .069 | threshold v. loading | 47.84*** | 16 | .002 |
| | Intercept | 1136.58*** | 349 | .060 | .945 | .946 | .070 | loading v. intercept | 23.28 | 16 | .003 |
| DSM Oppositional Defiant Disorder Symptoms Scale | Configural | 478.67*** | 70 | .097 | .947 | .931 | .065 | | | | |
| | Threshold | 507.39*** | 80 | .093 | .944 | .937 | .065 | configural v. threshold | 12.61 | 10 | .003 |
| | Loading | 459.53*** | 89 | .082 | .952 | .951 | .065 | threshold v. loading | 3.36 | 9 | .008 |
| | Intercept | 425.50*** | 98 | .073 | .957 | .961 | .065 | loading v. intercept | 6.54 | 9 | .005 |
| DSM Conduct Disorder Symptoms Scale | Configural | 454.47*** | 180 | .049 | .953 | .945 | .087 | | | | |
| | Threshold | 470.69*** | 194 | .048 | .952 | .948 | .087 | configural v. threshold | 11.44 | 14 | .001 |
| | Loading | 467.13*** | 208 | .045 | .955 | .955 | .087 | threshold v. loading | 14.52 | 14 | .003 |
| | Intercept | 443.84*** | 222 | .040 | .962 | .964 | .091 | loading v. intercept | 18.57 | 14 | .007 |

Note. N = 457 Black and Hispanic youth; N = 791 White youth. RMSEA = Root mean square error of approximation; CFI = Comparative Fit Index; TLI = Tucker-Lewis Index; SRMR = Standardized root mean square residual; ∆CFI = change in CFI. *p < .05, **p < .01, ***p < .001.


DTF was examined next to explore the invariance of the Conners 4 scales between Hispanic and White youth and between Black and White youth. An example DTF graph for the comparison of Black and White youth is provided in Figure 10.2. Test functioning curves for White youth and Black youth are depicted, each with a shaded band showing a 95% confidence interval. The two groups’ curves overlap almost completely, demonstrating a lack of difference for the Inattention/Executive Dysfunction scale. Similar results were found for all scales across all forms, as well as for the comparisons between Hispanic and White youth.

The effect sizes of the DTF statistics, as measured by the ETSSD, are summarized in Table 10.13. The largest effect size across Parent, Teacher, and Self-Report, when comparing either Black or Hispanic youth to White youth, was |ETSSD| = 0.11, representing a very small effect. The differences between groups were negligible, demonstrating a lack of measurement bias between White, Black, and Hispanic groups and reinforcing the generalizability of the Conners 4.
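To make the idea of a standardized DTF effect size more concrete, the sketch below computes a standardized difference in expected test scores for a focal group under two sets of item parameters (its own and the reference group's). This section does not give the formula behind the ETSSD, so the pooled-standard-deviation form used here is an assumption. The example also uses a simple two-parameter logistic model with dichotomous items for brevity, whereas rating-scale items would call for a polytomous model. All item parameters and function names are hypothetical and are not Conners 4 values.

```python
import math
import random
from statistics import mean, stdev


def p_endorse(theta, a, b):
    """Two-parameter logistic probability of endorsing an item."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def expected_test_score(theta, items):
    """Expected test score at theta: the sum of item endorsement probabilities."""
    return sum(p_endorse(theta, a, b) for a, b in items)


def etssd(focal_thetas, focal_items, reference_items):
    """Standardized difference in expected test scores (assumed pooled-SD form).

    Both sets of expected scores are evaluated at the focal group's trait
    levels, so any difference reflects the item parameters alone.
    """
    focal_scores = [expected_test_score(t, focal_items) for t in focal_thetas]
    ref_scores = [expected_test_score(t, reference_items) for t in focal_thetas]
    pooled_sd = math.sqrt((stdev(focal_scores) ** 2 + stdev(ref_scores) ** 2) / 2)
    return (mean(focal_scores) - mean(ref_scores)) / pooled_sd


# Hypothetical (discrimination, difficulty) pairs for a four-item scale.
reference_items = [(1.2, -0.5), (0.9, 0.0), (1.5, 0.4), (1.1, 1.0)]
# Same items for the focal group, with one difficulty shifted slightly.
focal_items = [(1.2, -0.5), (0.9, 0.0), (1.5, 0.5), (1.1, 1.0)]

rng = random.Random(0)  # fixed seed so the illustration is reproducible
focal_thetas = [rng.gauss(0.0, 1.0) for _ in range(500)]

no_dif = etssd(focal_thetas, reference_items, reference_items)
small_dif = etssd(focal_thetas, focal_items, reference_items)
print(f"identical parameters: {no_dif:.2f}")
print(f"one shifted item:     {small_dif:.2f}")
```

When the two groups share item parameters the statistic is exactly zero, and a small shift in a single item produces a small negative value, which is the kind of near-zero result the overlapping curves in a DTF plot correspond to.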

Taken together, both MI and DTF analyses indicate that the Conners 4 demonstrates psychometric equivalence for White and Hispanic youth, and for White and Black youth, providing evidence to support the fair use of the test.


Figure 10.2. Differential Test Functioning by U.S. Race/Ethnicity: Inattention/Executive Dysfunction

Panels: a) Parent; b) Teacher; c) Self-Report



To investigate mean score differences between racial/ethnic groups, subsamples of Hispanic, Black, and Asian youth from the U.S. portion of the Normative Samples were each compared to a corresponding subsample of White youth from the Normative Samples, matched on gender, age, language(s) spoken, clinical status, and PEL (PEL was matched for Parent only). To preserve sample size, youth in the Asian and White samples for Self-Report were matched only on age and gender. The demographic characteristics of the rated youth in the matched samples (and their raters, where applicable) are presented in appendix F, Tables F.38 to F.42.

Across all forms, comparisons between the matched Hispanic and White samples, Black and White samples, and Asian and White samples were analyzed via a series of ANOVAs, with results presented in Tables 10.14 to 10.22. These tables present the means and standard deviations of the Conners 4 scale scores, along with the significance tests and effect sizes.

When comparing ratings of Hispanic and White youth across all three forms, no statistically significant effects were observed on any scale. Cohen’s d effect sizes, which capture the size of the difference between group means, were negligible to small (|d| ranging from 0.00 to 0.26).

When comparing ratings of Black and White youth, the Parent and Self-Report results showed no statistically significant differences on any scale, and Cohen’s d effect sizes were negligible (|d| ranging from 0.00 to 0.19). On the Teacher form, the effect of student race was statistically significant for Inattention/Executive Dysfunction, Hyperactivity, Impulsivity, Schoolwork, and DSM Oppositional Defiant Disorder Symptoms, with ratings of Black students resulting in slightly higher scores than ratings of White students (|d| ranging from 0.00 to 0.43 across the Teacher scales). These differences were small but statistically significant, amounting to scores up to 4 points higher for Black students, with the largest effect observed on the Schoolwork scale (i.e., teachers were more likely to indicate that Black students exhibit more impairment at school when compared to White students).

When comparing ratings of Asian and White youth across all forms, no statistically significant effects were observed on any scale. The effect of race was minimal, as evidenced by negligible to small effect sizes (Parent Cohen’s d ranged from -0.25 to 0.44, Teacher Cohen’s d ranged from 0.00 to 0.34, and Self-Report Cohen’s d ranged from -0.29 to 0.29). These small effects were observed for a variety of scales that differed between rater forms, and they may manifest as a gap of approximately 3 T-score points between Asian and White youth.

Overall, the differences observed on the Conners 4 between White youth and Hispanic, Black, and Asian youth were small, and the majority were not statistically significant. While the results presented in this chapter provide evidence for a lack of measurement bias in the Conners 4, when considering obtained scores (as described in chapter 4, Interpretation), it is important to note that there was a small but statistically significant trend for teachers to rate Black students higher than White students on some scales of the Teacher form. This trend may be important to keep in mind when initiating contact with teacher raters, to encourage mindfulness of potential threats to providing unbiased ratings, or when comparing results for Black youth from the Parent and Self-Report forms to the Teacher form, as difficulties may appear to vary across domains. This trend also reflects what has been documented in previous literature: teachers may perceive Black students’ behavior as more problematic than that of their White peers (Neal et al., 2003; Rowley et al., 2014). Together with the absence of evidence for measurement bias, these findings support equity across racial/ethnic groups for the Conners 4 and its appropriate use in racially and ethnically diverse populations.









Table 10.21. Group Differences by U.S. Race/Ethnicity (White vs. Asian): Conners 4 Teacher

| Scale | Statistic | White (N = 43) | Asian (N = 43) | Cohen’s d | F(1, 84) | p | η² |
|---|---|---|---|---|---|---|---|
| Content Scales | | | | | | | |
| Inattention/Executive Dysfunction | M | 47.7 | 45.8 | 0.22 | 1.07 | .304 | .01 |
| | SD | 8.1 | 8.8 | | | | |
| Hyperactivity | M | 49.6 | 47.4 | 0.26 | 1.47 | .228 | .02 |
| | SD | 9.5 | 7.4 | | | | |
| Impulsivity | M | 49.3 | 46.6 | 0.34 | 2.51 | .117 | .03 |
| | SD | 9.0 | 7.0 | | | | |
| Emotional Dysregulation | M | 49.2 | 47.1 | 0.28 | 1.66 | .201 | .02 |
| | SD | 7.5 | 7.4 | | | | |
| Depressed Mood | M | 48.5 | 47.7 | 0.09 | 0.18 | .672 | .00 |
| | SD | 9.3 | 7.1 | | | | |
| Anxious Thoughts | M | 49.7 | 48.3 | 0.15 | 0.50 | .481 | .01 |
| | SD | 10.2 | 8.6 | | | | |
| Impairment & Functional Outcome Scales | | | | | | | |
| Schoolwork | M | 46.6 | 45.2 | 0.17 | 0.61 | .435 | .01 |
| | SD | 8.9 | 8.0 | | | | |
| Peer Interactions | M | 49.2 | 48.1 | 0.11 | 0.27 | .607 | .00 |
| | SD | 10.1 | 8.0 | | | | |
| DSM Symptom Scales | | | | | | | |
| Oppositional Defiant Disorder Symptoms | M | 48.5 | 46.9 | 0.24 | 1.23 | .271 | .01 |
| | SD | 8.1 | 5.7 | | | | |
| Conduct Disorder Symptoms | M | 49.2 | 49.1 | 0.00 | 0.00 | .989 | .00 |
| | SD | 7.7 | 9.8 | | | | |

Note. Guidelines for interpreting η²: negligible effect size < .01; small effect size = .01 to .05; medium effect size = .06 to .13; large effect size ≥ .14. Guidelines for interpreting Cohen’s |d|: negligible effect size < 0.20; small effect size = 0.20 to 0.49; medium effect size = 0.50 to 0.79; large effect size ≥ 0.80. A positive Cohen’s d value indicates that ratings of White youth resulted in higher scores than ratings of Asian youth.
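As a rough check on how the effect sizes in Table 10.21 relate to the reported means and standard deviations, the following sketch recomputes Cohen's d, the one-way ANOVA F statistic, and η² from summary statistics for two groups, and labels |d| using the guidelines in the table note. The function names are hypothetical, and the pooled-standard-deviation form of Cohen's d is an assumption about how d was computed. Because the inputs are the rounded values printed in the table, small discrepancies from the reported statistics are expected.

```python
import math


def two_group_effects(m1, sd1, n1, m2, sd2, n2):
    """Cohen's d (pooled SD), one-way ANOVA F, and eta squared for two groups."""
    df_within = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df_within
    d = (m1 - m2) / math.sqrt(pooled_var)

    grand_mean = (n1 * m1 + n2 * m2) / (n1 + n2)
    ss_between = n1 * (m1 - grand_mean) ** 2 + n2 * (m2 - grand_mean) ** 2
    ss_within = pooled_var * df_within
    f_stat = ss_between / (ss_within / df_within)  # between-groups df is 1
    eta_sq = ss_between / (ss_between + ss_within)
    return d, f_stat, eta_sq


def label_d(d):
    """Label |d| using the guidelines given in the note to Table 10.21."""
    size = abs(d)
    if size < 0.20:
        return "negligible"
    if size < 0.50:
        return "small"
    if size < 0.80:
        return "medium"
    return "large"


# Inattention/Executive Dysfunction row of Table 10.21 (White first, then Asian).
d, f_stat, eta_sq = two_group_effects(47.7, 8.1, 43, 45.8, 8.8, 43)
print(f"d = {d:.2f} ({label_d(d)}), F(1, 84) = {f_stat:.2f}, eta squared = {eta_sq:.2f}")
```

For this row the sketch gives d of about 0.22 and η² of about .01, matching the table, while F comes out near 1.09 rather than the reported 1.07, a gap that is plausibly due to rounding in the published means and standard deviations.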


