Wealth and grades: Compare Connecticut’s school districts


Students in school districts where families live closer to poverty are, on average, four grade levels behind their counterparts in the wealthiest school districts of Connecticut, according to a new analysis.

Researchers used data from 2009 to 2012 from the National Assessment of Educational Progress – which includes standardized math and reading tests – to develop a scale.

The scatter plot above shows that the higher the median family income in a school district, the higher the average grade equivalent tends to be.

That trend holds true in Connecticut.

In New Canaan, with a median family income of $210,000, test scores show students are almost three grades ahead of the national average.

On the other end of the spectrum, Hartford’s school district has a median income of $27,000 and students there are about 1.7 grade levels behind the national average.
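
As a quick check on the two figures just quoted, the gap between these districts can be worked out directly. This is a minimal sketch in Python, not part of the original analysis; the variable names are illustrative, and "almost three" is rounded to 3.0 as an assumption.

```python
# Grade levels relative to the national average, as quoted in the article.
# Assumption: "almost three" is rounded up to 3.0 for illustration.
new_canaan = 3.0
hartford = -1.7

# Difference between the two districts, in grade levels.
gap = round(new_canaan - hartford, 1)
print(f"New Canaan is roughly {gap} grade levels ahead of Hartford")
```

That gap between the two extremes is somewhat wider than the four-grade-level average difference reported at the top of the article.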

Salisbury’s school district is an outlier.

Though the town has a median family income of $38,000, sixth grade students there have scored almost two grade levels ahead of the national average. Salisbury scores at the same level as towns with higher incomes, like Brookfield and Canton.

Update: Thanks to Mark in the comments below, we know that Salisbury’s family median income is incorrect in Stanford’s data, which would explain why it’s an outlier. The family median income is around $76,000, not $38,000 as the chart above indicates.

Explore the other school districts below.

Check our work: The GitHub repository containing our work is available here. We encourage you to look over our calculations and expand upon our analysis.
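
One way to expand on the analysis is to fit a straight line to income and grade equivalent and look at how far each district sits from it. The sketch below is in Python and is not the code from the repository. It uses only the three districts quoted in this article, with the grade figures rounded ("almost three" as 3.0, "almost two" as 2.0), so the function names and inputs are assumptions and the numbers it prints are illustrative only. Three points are far too few for a meaningful fit; the point is the mechanics of comparing Salisbury's residual before and after the income correction.

```python
from math import fsum


def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x.

    Returns (slope, intercept, r_squared).
    """
    n = len(xs)
    mean_x = fsum(xs) / n
    mean_y = fsum(ys) / n
    sxx = fsum((x - mean_x) ** 2 for x in xs)
    syy = fsum((y - mean_y) ** 2 for y in ys)
    sxy = fsum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = (sxy * sxy) / (sxx * syy)
    return slope, intercept, r_squared


# (district, median family income in $1,000s, grade levels vs. national average)
# Figures are rounded from the article text; they are not the full data set.
DISTRICTS = [
    ("New Canaan", 210, 3.0),
    ("Hartford", 27, -1.7),
    ("Salisbury", 38, 2.0),
]


def salisbury_summary(salisbury_income):
    """Refit with a given Salisbury income; return (r_squared, Salisbury residual)."""
    rows = [
        (name, salisbury_income if name == "Salisbury" else income, grade)
        for name, income, grade in DISTRICTS
    ]
    xs = [income for _, income, _ in rows]
    ys = [grade for _, _, grade in rows]
    slope, intercept, r_squared = fit_line(xs, ys)
    predicted = intercept + slope * salisbury_income
    return r_squared, 2.0 - predicted


r2_reported, resid_reported = salisbury_summary(38)   # income as charted
r2_corrected, resid_corrected = salisbury_summary(76)  # income after the update

print(f"charted income:   r^2={r2_reported:.2f}, Salisbury residual={resid_reported:.2f}")
print(f"corrected income: r^2={r2_corrected:.2f}, Salisbury residual={resid_corrected:.2f}")
```

The same two functions could be applied to the full district-level file to rank districts by residual, which is one simple way to spot data problems like the Salisbury income figure.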

What do you think?

  • Joseph Brzezinski

    Have you done any correlation and/or regression analysis? It would appear from the charts that you would get some high r-squared results. With regression you could get the relative importance of ethnicity, gender, etc. I experimented last year and got close to .9.

    • Andrew Ba Tran

      Here’s the link to all their data. https://cepa.stanford.edu/seda/download?nid=1727&destination=node/1717 I couldn’t find the grade-equivalent scores broken out by ethnicity or gender but there’s a bunch of other interesting categories you’d have a lot of fun playing around with.

      • Joseph Brzezinski

        Thanks for the link. Your readers should link to cepa.stanford.edu to look at some of their working papers. CSDE should look at them as well and duplicate that research for CT, but at the school and grade-within-school levels. I would recommend against considering any results for teacher evaluation, though I suspect that clustering and grouping of results would provide valuable insights into what works or does not in improving student achievement or reducing achievement gaps. Hopefully, the new student privacy legislation or budget legislation would permit the state to conduct research paralleling CEPA….

  • Dross1958

    This is a well-known effect in education. Indeed, CT used to publish this curve with an R value; it was around .9.
    CT at one point plotted this against ERG, Economic Resource Group. The breakdown by gender and race is available, but the data sets start to get into small sample sizes. It is fascinating.

    • Joseph Brzezinski

      District-level data for the entire country should have a large enough sample size for a few levels of breakdown in regression analyses, though for CT, with 180 districts, sample size is a problem. School-level data would alleviate that problem, as would grade-within-school data. Data at that level are not released by CSDE, if they are available at all within CSDE.

      • Dross1958

        Oh, they (CSDE) have it.

        I notice RHAM Regional 8 is in the list. RHAM is 7-12, no 6th grade. So if this is 6th grade data why would RHAM or any of the Regional Middle/High Schools be listed?

        • Joseph Brzezinski

          You make a good point re RHAM. Such anomalies often invalidate this kind of analysis even though they may or may not impact results. Attention and close scrutiny are essential quality control tasks in analyses so that this kind of data anomaly is eliminated or at the least explained with how it affects the result.

  • mark

    The income figures for Salisbury are not correct. Median household income there is in the $65,000 to $90,000 per year range. The other towns look right.

  • Dross1958

    In the GitHub repository, what language is that? Python? R? Something else?

  • John Schnee

    But what’s the cause and what’s the effect? At first glance it appears that more family income leads to smarter students. But perhaps it’s the other way around. Maybe smart parents end up with higher incomes. In turn, they encourage their own children to study more. In which case, it’s smart parents that lead to smart kids. It’s an important distinction because it suggests that throwing money at education might not have the desired effect. Instead, parents setting higher expectations for their children could be the key.

    • Dross1958

      Bingo. Expectations. Actually the one thing that seems to be a predictor (and I’d love to see if this data holds that up) is MOM. Is Mom. Mom’s education level. Now does Mom pass on her expectations or appreciation of education, or is Mom smart and passing that smartness on? We get into causal issues. The data seems to indicate that after a certain level of funding, increased spending does not lead to increased performance. Fascinating stuff.

  • Dross1958

    I’ve spent some time poking around the data. Unfortunately, I made a deliberate decision 2 years ago to learn Python rather than R (R has a strange syntax; those of us who program find it quite odd), and Python turned out to be limiting in its own ways. But having been on a BOE and having some familiarity with school policy and other districts, I am concerned some data is not right in this data set. I think the conclusions the author reached are correct and that the data for that conclusion is good. However, the classroom size data seems off: classrooms are way too small, the % of parents with degrees seems high and the % Free and Reduced seems low. For example, most classroom sizes were reported as 20 students per teacher. I know that is not the case in many districts. And that doesn’t address what paras may be in the classroom.

    Clearly using aggregate data (by that I mean one set of data for all grades in one district) is dangerous. But then when you look at individual grades sample sizes get small, especially if you just look at one state like Ct.

    But this is as rich a data set as I’ve ever seen. I applaud the author for publishing the R code. Great job.

    I’d love to get all the CT data in one Excel file. That way I could feed it into my Python code. Someone versed in R could create this data set for me easily…….

    • Andrew Ba Tran

      Right, grade-specific data can be troublesome at this scale but it’s a good start. It’d be good to compile this data for these specific school districts; it’s just tough for districts that don’t align with town borders.

      I went and subsetted the Connecticut-related data. Look for the files with the prefix “ct-”.


      • Dross1958

        This is too good to be true. Thank you!
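
For readers who would rather do this kind of subsetting in Python than in R, here is a minimal sketch using only the standard library. It is not the code from the repository: the function name, the file names in the usage comment and the `state` column are hypothetical and would need to match the actual downloaded files.

```python
import csv


def subset_rows(src_path, dst_path, column, value):
    """Copy rows of a CSV file where row[column] == value; return the count kept."""
    kept = 0
    with open(src_path, newline="", encoding="utf-8") as src, \
         open(dst_path, "w", newline="", encoding="utf-8") as dst:
        reader = csv.DictReader(src)
        # Reuse the source header so the subset has the same columns.
        writer = csv.DictWriter(dst, fieldnames=reader.fieldnames)
        writer.writeheader()
        for row in reader:
            if row[column] == value:
                writer.writerow(row)
                kept += 1
    return kept


# Hypothetical usage; file and column names are assumptions:
# subset_rows("districts-all.csv", "ct-districts.csv", "state", "CT")
```

The resulting CSV opens directly in Excel and can be read back into Python with the same `csv` module.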