This supplement provides detailed tables and methodology supporting the YouthTruth Concurrent and Predictive Validity Study report.
Searchable Tables
- Descriptive statistics: The sample for each analysis varied based on outcome availability and data quality. Descriptive statistics are available for each school level and outcome sample.
- Full results: Coefficients for each variable, along with their standard errors and t and p values, are available.
Methodology
Control variable derivations
Most of the control variables used in the analyses come from EDFacts. These variables include:
- Percent English Learner
- Percent with a disability
- Percent homeless
- Percent Asian, non-Hispanic
- Percent Black, non-Hispanic
- Percent Hispanic
- Percent multiple or other race, non-Hispanic
- Percent male
- State
The source of two control variables varied by analysis: percent economically disadvantaged and school enrollment.
Percent economically disadvantaged
- For the reading and mathematics proficiency rate analyses, percent economically disadvantaged is the number of assessment takers each year who were reported as economically disadvantaged divided by the total number of students taking the assessment. Separate estimates of economic disadvantage were generated for reading and mathematics assessment takers for grades 5, 8, and high school in 2017 and 2018 (see the sketch following this list).
- Because the CRDC did not collect information on student economic disadvantage, the chronic absence rate and log suspension index analyses utilize the percent economically disadvantaged estimates used in the mathematics proficiency rate models.
- For the supplemental graduation rate analyses, percent economically disadvantaged is the number of economically disadvantaged students in the denominator of the Adjusted Cohort Graduation Rate (ACGR) divided by the full denominator. The denominator of the ACGR is ninth-graders three years earlier (e.g., 2015 ninth-graders for 2018 graduation rate) plus any transfers in and minus any transfers out.
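To make the derivation concrete, the sketch below computes the assessment-taker-based estimate. The data frame and column names are hypothetical stand-ins for the EDFacts counts described above.

```python
import pandas as pd

# Hypothetical EDFacts-style counts: one row per school, subject, and year, with the number
# of assessment takers and the number of takers reported as economically disadvantaged.
takers = pd.DataFrame({
    "school_id":     ["A", "A", "B", "B"],
    "subject":       ["reading", "math", "reading", "math"],
    "year":          [2018, 2018, 2018, 2018],
    "n_takers":      [118, 120, 62, 60],
    "n_takers_econ": [50, 48, 14, 15],
})

# Percent economically disadvantaged = economically disadvantaged takers / all takers,
# computed separately by subject (and, in the study, by grade span and year).
takers["pct_econ_disadv"] = takers["n_takers_econ"] / takers["n_takers"]
print(takers[["school_id", "subject", "pct_econ_disadv"]])
```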
School enrollment (logged). We took the natural log of enrollment variables to adjust for right skewness (i.e., a small number of schools have much larger enrollments than most); a brief sketch follows the list below.
- For the reading and mathematics proficiency rate analyses, school enrollment was the number of assessment takers reported each year by EDFacts.
- For the chronic absence rate and log suspension index analyses, we drew school-wide enrollment numbers from the CRDC. Similarly, we pulled ninth-grade enrollment information from the CRDC for the ninth-grade retention analysis.
- For the supplemental graduation rate analyses, enrollment was the number of students in the denominator of the Adjusted Cohort Graduation Rate (ACGR). The denominator of the ACGR for a given year is ninth-graders three years earlier (e.g., 2015 ninth-graders for 2018 graduation rate) plus any transfers in and minus any transfers out.
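The sketch below illustrates the log transformation. The enrollment column is a hypothetical stand-in for whichever enrollment count (EDFacts, CRDC, or ACGR denominator) a given analysis uses.

```python
import numpy as np
import pandas as pd

# Hypothetical school-level enrollment counts (from EDFacts, the CRDC, or the ACGR
# denominator, depending on the analysis).
schools = pd.DataFrame({"school_id": ["A", "B", "C"], "enrollment": [180, 650, 2400]})

# The natural log pulls in the long right tail created by a few very large schools.
schools["log_enrollment"] = np.log(schools["enrollment"])
```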
Analysis and model specifications
Each YouthTruth scale’s concurrent validity was assessed with respect to five outcomes: three school-level academic outcomes (reading proficiency rate, math proficiency rate, and ninth-grade retention) and two school-level behavioral outcomes (chronic absence rate and log suspension index). Each scale’s predictive validity was assessed for four of these outcomes, not including ninth-grade retention rate. Concurrent and predictive validity were assessed for each scale separately in elementary, middle, and high school. Four of the scales were assessed at all school levels, while belonging and peer collaboration was assessed only in middle and high school, and college and career readiness was assessed only in high school.
Concurrent validity
For our concurrent validity analyses, we fitted Ordinary Least Squares (OLS) regressions of outcomes measured in 2018 on predictors of interest and control variables measured in 2018. For the ninth-grade retention concurrent validity model, the outcome (retention in grade) was measured at the completion of the school year. For our predictive validity analyses, we fitted OLS regressions of outcomes measured in 2018 on predictors of interest and control variables measured in 2017.
More specifically, to investigate concurrent validity, we fitted a series of j × k OLS regression models, where j indexes 14 outcomes of interest:
- Grade 5 reading proficiency
- Grade 5 math proficiency
- Grade 8 reading proficiency
- Grade 8 math proficiency
- High school reading proficiency
- High school math proficiency
- Elementary school chronic absence rate
- Middle school chronic absence rate
- High school chronic absence rate
- Elementary school log suspension index
- Middle school log suspension index
- High school log suspension index
- Ninth-grade retention rate
- High school graduation rate
and k indexes a series of YouthTruth student survey predictors: six student experience scales measuring student engagement, academic challenge, student-teacher relationships, culture, student belonging and peer collaboration (middle and high school only), and, in high school, college and career readiness, along with supplemental individual survey items. The models we fitted are represented by the equation:

Yijk = β0 + β1Xik + γ’Di + π’Si + eijk          (Equation 1)
where for each pairing of outcome and key YouthTruth predictor, for individual school i, the outcome Yijk is a function of the predictor of interest, Xik, a vector of school demographic variables, Di, and a vector of dummy variables representing the state in which the school is located, Si. YouthTruth predictor variables and demographic control variables are all for the year 2018. The intercept is represented by β0. For each of j outcomes and the k YouthTruth student survey predictors, the primary coefficient of interest is represented by β1, which represents the relationship between the predictor of interest and the outcome of interest, accounting for state and the demographic variables in the model. The coefficients on demographic variables are represented by γ’ while the coefficients on the state dummies are represented by π’. The error term is represented by eijk.
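For illustration, the sketch below fits one such outcome-by-predictor model with statsmodels. The file and column names are hypothetical, and the control terms are written out to mirror the demographic and state variables described above.

```python
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical 2018 analytic file: one row per school with the outcome, the YouthTruth
# scale score, demographic controls, and state.
df = pd.read_csv("analytic_file_2018.csv")

# One (j, k) pairing, e.g., grade 8 reading proficiency on the engagement scale.
# C(state) expands into the state dummy variables; the coefficient on `engagement`
# corresponds to beta_1 in Equation 1.
model = smf.ols(
    "reading_prof_g8 ~ engagement"
    " + pct_econ_disadv + pct_el + pct_disability + pct_homeless"
    " + pct_asian + pct_black + pct_hispanic + pct_multi_other + pct_male"
    " + log_enrollment + C(state)",
    data=df,
).fit()
print(model.summary())
```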
Predictive validity
Our approach to predictive validity was similar, but year 2018 outcomes were predicted by YouthTruth and demographic variables measured in the prior year, 2017. The equation representing the models used to test predictive validity is:

Yijkt = β0 + β1Xik,t-1 + γ’Di,t-1 + π’Si + eijkt          (Equation 2)

where t and t-1 subscripts have been added to the time-varying model variables and to the error term to represent clearly that 2018 outcomes are modeled as a function of 2017 predictor variables.
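In practice, the only change from the concurrent specification is the timing of the right-hand-side variables. The hypothetical sketch below pairs 2018 outcomes with 2017 predictors before fitting the same kind of OLS model (controls abbreviated).

```python
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical files: 2017 predictors and controls, 2018 outcomes, one row per school.
pred_2017 = pd.read_csv("predictors_2017.csv")   # YouthTruth scales + demographics, 2017
out_2018  = pd.read_csv("outcomes_2018.csv")     # proficiency, absence, suspension outcomes, 2018

# Join on school so each 2018 outcome is regressed on 2017 predictors; the formula then
# matches the concurrent specification above.
lagged = out_2018.merge(pred_2017, on="school_id", how="inner")
model = smf.ols("reading_prof_g8 ~ engagement + pct_econ_disadv + log_enrollment + C(state)",
                data=lagged).fit()
```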
Change over time: supplemental model specifications
For our supplemental change-over-time analyses we utilized a first-difference estimator. That is, we fitted OLS regressions of the change in outcomes from 2017 to 2018 on the change in predictors of interest and control variables from 2017 to 2018:

∆Yijk = δ0 + β1∆Xik + γ’∆Di + ∆uijk          (Equation 3)
where for each pairing of outcome and key YouthTruth predictor, for individual school i, the change from 2017 to 2018 in the outcome, ∆Yijk, is a function of change from 2017 to 2018 in the predictor of interest, ∆Xik, and a vector of changes from 2017 to 2018 in school demographic variables, ∆Di. Note that there is no vector of dummy variables representing state in this model. The reason for this is that state is constant from 2017 to 2018, and when differencing, all time-constant predictors have changes of zero, and, as a result, they fall out of the model.
Similarly, if we revisit the error term from Equation 1, eijk, we can break it into two parts: unobserved individual school heterogeneity (αijk) and idiosyncratic error (uijkt). The former contains everything that is constant over the study period and not otherwise measured in the present analysis, while the latter represents random error and the influence of any unobserved time-varying variables. What remains after first-differencing is ∆uijk, the change in the idiosyncratic error from 2017 to 2018. Finally, note that the intercept, δ0, represents the 2018 year effect (relative to 2017) on the outcome of interest.
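A minimal sketch of the first-difference setup follows. The file and column names are hypothetical, and "scale" stands in for whichever YouthTruth predictor is being tested.

```python
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical school-level files for 2017 and 2018 with identical column names.
y17 = pd.read_csv("school_file_2017.csv").set_index("school_id")
y18 = pd.read_csv("school_file_2018.csv").set_index("school_id")

# First differences (2018 minus 2017) for the outcome, the YouthTruth scale, and the
# time-varying controls; state drops out because its change is zero for every school.
cols = ["outcome", "scale", "pct_econ_disadv", "log_enrollment"]
diffs = (y18[cols] - y17[cols]).add_prefix("d_")

# The intercept in this regression estimates the 2018 year effect (delta_0 above).
fd_model = smf.ols("d_outcome ~ d_scale + d_pct_econ_disadv + d_log_enrollment",
                   data=diffs).fit()
```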
Limitations
The present study has several limitations. Although we attempted to investigate a diverse array of outcomes of interest, including academic outcomes (reading and mathematics proficiency rates and ninth-grade retention rates) and non-academic outcomes (chronic absence rate and log suspension index), the number of outcomes we could investigate was limited by both the availability of relevant extant data and by the resources available to execute the present study. Relationships between YouthTruth student experience scales and outcomes of interest not investigated in the present study, such as socioemotional outcomes, are likely to exist.
It is important to understand that the concurrent and predictive analyses presented in this report are correlational—not causal. For example, just because a statistically significant positive relationship existed between belonging and peer collaboration and high school reading proficiency rate, that does not mean that increased belonging and peer collaboration caused increased high school reading proficiency rates. Rather, it could be that higher reading proficiency levels led to increased belonging and peer collaboration, or that a third factor or constellation of additional factors are linked to both belonging and peer collaboration and reading proficiency rate.
Related to this, the analyses presented in this report may suffer from omitted variable bias. Even though we incorporated a set of school-level demographic covariates in our analytic models, including percent economically disadvantaged, a set of race and ethnicity variables, log enrollment size, and state, results were likely influenced by variables excluded from the analysis. For example, we found that better student-teacher relationships in middle school were linked to higher reading proficiency rates at the school level, conditional on the other variables in our analysis. Perhaps middle schools in the analytic sample that had teachers who were better at forging positive relationships with middle schoolers also had teachers who were better trained in reading instruction. Because we were unable to include a variable measuring teacher training in reading instruction in our analyses, our estimate of the relationship between teacher-student relationships and reading proficiency rate would be biased.
Another way in which the results of our analyses may have been undermined is via aggregation bias. Aggregation bias occurs when misleading results arise because data are aggregated from lower to higher units of analysis before relationships between variables are investigated. The research team only had access to school- or grade-level aggregated data on both YouthTruth student experience measures and student outcomes. As a result, student-level responses on YouthTruth student experience measures and student-level outcomes were aggregated to the grade or school level before the authors could analyze relationships among them. Therefore, all results are based on between-school variation in predictors and outcomes of interest. The fact that five out of six YouthTruth student experience scales demonstrated predictive validity with respect to ninth-grade retention (our most granular analysis) suggests that further disaggregating data to the student level would enable the detection of underlying relationships. After all, aggregating data to the school level limits variation. Student-level analyses would enable the investigation of within-school, between-student variation in both outcomes and predictors of interest, leading to a fuller understanding of the concurrent and predictive validity of YouthTruth student experience measures.
Related to the limitations associated with aggregation, our analyses were, in general, low in statistical power. Instead of being able to investigate thousands of individual records, our analyses were based on no more than 170 school records and, for some outcomes, as few as 30. As a result, our estimates of the relationships between YouthTruth student experience scales and outcomes of interest were relatively imprecise, with wide confidence intervals. This lack of precision made it more difficult to detect statistically significant findings.
The analyses with very small sample sizes also pointed to issues with data quality for the outcomes we drew from the Civil Rights Data Collection (CRDC): chronic absence rate, log suspension index, and ninth-grade retention rate. For example, with respect to chronic absence rate, more than 10 schools, all from a single state, reported unusually high chronic absence rates—above 50 percent. Furthermore, it was unclear whether zeros on the CRDC data file were true zeros or actually represented missing data. As a result, we excluded schools with chronic absence rates of zero or above 50 percent from the chronic absence rate analyses, limiting the analytic sample size as well as the generalizability of the results. Similarly, we excluded schools that reported zero suspensions from the log suspension analysis and schools that reported zero ninth grade retentions from the ninth-grade retention analysis. Again, this limited sample sizes, as well as the generalizability of results. Results do not necessarily generalize to those schools that truly have no suspensions, no chronic absences, and no students retained in ninth grade.
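These exclusion rules can be expressed as simple filters. The sketch below assumes hypothetical CRDC-derived column names, with the chronic absence rate expressed as a proportion.

```python
import pandas as pd

# Hypothetical CRDC-derived school-level file; rates expressed as proportions (0 to 1).
crdc = pd.read_csv("crdc_outcomes.csv")

# Chronic absence analysis: drop rates of zero (possible missing data) and above 50 percent.
absence_sample = crdc[(crdc["chronic_absence_rate"] > 0) & (crdc["chronic_absence_rate"] <= 0.50)]

# Suspension and ninth-grade retention analyses: drop schools reporting zero events.
suspension_sample = crdc[crdc["n_suspended"] > 0]
retention_sample  = crdc[crdc["n_ninth_grade_retained"] > 0]
```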
A final limitation is that the reading and mathematics proficiency rate outcomes were limited to a single grade (grade 5 in elementary schools, grade 8 in middle schools, and a single grade in high school that varies by state). In contrast, our predictors of interest were drawn from students enrolled in all grades within each school, except for those used in the ninth-grade retention analyses. Findings for reading and mathematics proficiency rates may have been different had the match between outcomes and predictors been more exact.