COVID-19 story recipe: Analyze school enrollment changes in the districts you cover
Where to find the data, how to explore it, and questions to ask to reproduce the story for your community
Schools have been through a roller coaster throughout the COVID–19 pandemic—navigating a patchwork of state and local guidance, trying to provide quality education as students transition between in-person, hybrid, and remote-learning environments. Parents and their children have faced tough decisions about whether to continue in their current school or make a change.
In late 2020, Big Local News and OpenNews surveyed local journalists about what they needed to cover the pandemic’s effects in their communities, and education data was high on the list. We knew there would be lots of stories to pull from 2020–2021 school enrollment figures, but there was no national dataset for this information and no way to compare across states. Even within states, the ways of reporting school data vary widely. The Stanford School Enrollment Project is a collaborative project that collects and normalizes enrollment data from dozens of states (so far), creating a dataset that didn’t exist before. That data has already led to stories that couldn’t have been told without it, and maybe it can do the same for you.
The stories we found
A national story from The New York Times found that kindergarten enrollment across the country dropped by nearly 10% in 2020, particularly affecting neighborhoods near and below the poverty line. Across the 33 states where we were able to gather data, more than 10,000 public schools lost at least one-fifth of their kindergarten enrollment. The enrollment picture played out in different ways state by state, and two of our collaboration partners investigated California and Colorado.
The obvious story in the California data was the enormous 3% drop in enrollment statewide year-over-year—almost three times larger than any single-year decline in the past two decades. It didn’t take much digging to discover that plummeting kindergarten enrollment accounted for almost 40% of the reduction, and four out of five districts with kindergarteners saw a drop in their numbers.
Beyond that, there wasn’t a clear pattern to the decline. The drop in California transcended regions, socioeconomics, demographics, even politics. Charter schools were slightly less affected, but charters and non-charters both saw large enough drops in enrollment to be considered a major issue.
In Colorado, the data also revealed a microcosm of what was happening around the country: large enrollment declines between the 2020 and 2021 school years, and among the very youngest students in particular—full-day kindergarten, half-day kindergarten and pre-kindergarten.
When we sliced the Colorado data by race and grade, the most interesting finding was that white kindergartners represented 54 percent of the enrollment from 2020, but 65 percent of the decline in 2021. So white students were pulled away from kindergarten more than expected for their share of the population.
How you can analyze the data
The enrollment data for these stories was acquired through lots and lots of public-records requests and downloads from state and local agencies. But if your state is one of the more than 30 covered by the Stanford School Enrollment Project, you get to skip all that. The data at Big Local News has already been cleaned and normalized, and it’s easy to download a csv file ready for analysis.
To get to the platform, visit https://biglocalnews.org/ and log in with any Google account. Once you are logged in, click “Open Projects” and wait for the list of databases to load (sometimes it can take a minute). Once loaded, search for the “Stanford School Enrollment Project” and open it. You will see all the csv files for each state that can be downloaded to your computer via the cloud icon next to the file name.
One note before we get started: These data files are generally too large to work with in Google Sheets, so we’ll be using Excel for this walkthrough.
- Download the csv file for your state.
- Open the csv file in Microsoft Excel and save it as an Excel workbook (.xlsx). Within this new Excel workbook, you’ll be able to do all your pivot table analysis in different worksheets.
- The dataset includes both school-level and district-level data, so you’ll have to be careful in filtering admin_level by district so that you don’t overcount.
- In the Excel menu, go to Data > Summarize with PivotTable (on a Mac, you may need to use the menus at the very top of your screen). That will open up a new sheet where the pivot table output will generate.
- In the PivotTable Fields menu, filter by admin_level and set it to “district.” (Depending on your version of Excel, you may need to click and drag the column names you want into each pivot table area.) Put year in Columns and grade in Rows, and set Values to total, displayed as the sum.
- The pivot table should then show you total enrollment per grade over time. From there, you can calculate raw and percent changes between the 2020 and 2021 school years.
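If you would rather script this than click through Excel, the same pivot can be sketched in Python. This is a minimal sketch, not the method used for the stories above: the column names admin_level, year, grade and total come from the walkthrough, but the sample rows, the grade labels and the way years are written are made up for illustration. Check your state's file before relying on them.

```python
import csv
import io
from collections import defaultdict

# Hypothetical sample rows. A real state file has many more rows and columns,
# and its grade labels and year format may differ from these.
SAMPLE_CSV = """admin_level,year,grade,total
district,2020,kindergarten,1000
district,2021,kindergarten,900
district,2020,grade_1,1100
district,2021,grade_1,1080
school,2020,kindergarten,1000
school,2021,kindergarten,900
"""

def enrollment_by_grade(rows, level="district"):
    """Sum `total` by (grade, year), keeping only one admin_level."""
    totals = defaultdict(int)
    for row in rows:
        if row["admin_level"] != level:
            continue  # skip school rows so students are not counted twice
        totals[(row["grade"], row["year"])] += int(row["total"])
    return totals

def change(totals, grade, start, end):
    """Return (raw change, percent change) for one grade between two years."""
    before, after = totals[(grade, start)], totals[(grade, end)]
    raw = after - before
    return raw, 100 * raw / before

totals = enrollment_by_grade(csv.DictReader(io.StringIO(SAMPLE_CSV)))
raw, pct = change(totals, "kindergarten", "2020", "2021")
print(f"kindergarten: {raw:+d} students ({pct:+.1f}%)")
```

To run it on a real download, swap the `io.StringIO(SAMPLE_CSV)` reader for one that opens your state's csv file. Blank or non-numeric values in total would need handling first, since `int()` will raise an error on them.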
If you’re interested in looking at the same data by race, then create another pivot table…
- In the Excel menu, go to Data > Summarize with PivotTable to generate another pivot table in a new sheet.
- Filter by admin_level and set it to “district.” Then, also filter by grade and set it to the grade you’re focusing on. Uncheck all other grades in the filter.
- Your Columns should include year.
- Make sure your Rows are set to show values. Your pivot table may put values into Columns by default—if it does, just drag it over to the Rows area.
- The pivot table should then show you enrollment by race over time. From there, you can calculate raw and percent changes between the 2020 and 2021 school years. You can also calculate each race category’s share of enrollment and compare it with its share of the decline to see whether that category was over- or under-represented.
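The over- or under-representation check is just two percentages per group, compared side by side. Here is a rough sketch of that arithmetic in Python. The group names and counts are invented (they are chosen to echo the 54 percent and 65 percent pattern described for Colorado, not taken from the real data), and the function assumes enrollment fell overall.

```python
def shares_of_decline(before, after):
    """Compare each group's share of base-year enrollment with its share of the decline.

    Assumes total enrollment dropped; a zero total decline would divide by zero.
    """
    total_before = sum(before.values())
    total_decline = sum(before[g] - after[g] for g in before)
    result = {}
    for group in before:
        enrollment_share = 100 * before[group] / total_before
        decline_share = 100 * (before[group] - after[group]) / total_decline
        result[group] = (enrollment_share, decline_share)
    return result

# Hypothetical kindergarten counts by race for two school years.
before = {"white": 540, "hispanic": 330, "black": 130}
after = {"white": 475, "hispanic": 305, "black": 120}

for group, (enroll, decline) in shares_of_decline(before, after).items():
    label = "over" if decline > enroll else "under"
    print(f"{group}: {enroll:.0f}% of enrollment, "
          f"{decline:.0f}% of decline ({label}-represented)")
```

A group whose share of the decline is larger than its share of starting enrollment left at a higher rate than the student body as a whole.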
Things to watch out for in the data
If you’re doing any district-level analysis in particular, be aware that district names can change. We had to consolidate entries for several Colorado districts that were referred to in the data by different names over time—and not just your average “slightly misspelled seven different ways” variations. For example, these refer to the same district:
- Upper Rio Grande School District C–7 & DEL NORTE C–7
- Elizabeth School District & ELIZABETH C–1
- District 49 & FALCON 49
We found about a dozen total cases like this in the Colorado data.
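One simple way to handle this is a lookup table of known aliases that you apply before grouping by district. The sketch below uses the three pairs listed above. Which name to treat as the canonical one is an arbitrary choice here, and the exact strings in your file (including hyphens versus en dashes) may differ, so treat the keys as assumptions to verify against the data.

```python
# Alias -> canonical name. The pairs come from the examples above; picking the
# longer name as canonical is an arbitrary choice for this sketch.
DISTRICT_ALIASES = {
    "DEL NORTE C-7": "Upper Rio Grande School District C-7",
    "ELIZABETH C-1": "Elizabeth School District",
    "FALCON 49": "District 49",
}

def canonical_district(name):
    """Map a known alias to its canonical name; leave other names unchanged."""
    # Normalize en dashes to hyphens and collapse stray whitespace first.
    cleaned = " ".join(name.replace("\u2013", "-").split())
    return DISTRICT_ALIASES.get(cleaned, cleaned)

for raw_name in ["FALCON 49", "  DEL NORTE  C\u20137 ", "Some Other District"]:
    print(repr(raw_name), "->", canonical_district(raw_name))
```

Names that are not in the table pass through untouched, so you can add aliases one at a time as you find them.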
Most importantly: No matter what you find, double-check your results before publishing. If one school seems like a massive outlier, give them a call to confirm. If your intuition tells you the numbers seem off, go back to the original source data and make sure it lines up with other sources’ reports. Be your own worst critic, because if you aren’t, you can expect your story to attract others who will tell you what is wrong.
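To build a call list of outliers, you could flag every school whose enrollment moved by more than some threshold. The sketch below is one possible approach with invented school names and counts. The 20 percent default is an assumption, loosely echoing the one-fifth kindergarten losses mentioned earlier, and a flagged school is a prompt to check the number, not a finding.

```python
def flag_large_changes(before, after, threshold=20.0):
    """Return (name, pct_change) pairs whose absolute percent change meets the threshold."""
    flagged = []
    for name, start in before.items():
        if name not in after or start == 0:
            continue  # cannot compute a change without both years
        pct = 100 * (after[name] - start) / start
        if abs(pct) >= threshold:
            flagged.append((name, pct))
    # Largest drops first.
    return sorted(flagged, key=lambda item: item[1])

# Hypothetical school totals for two school years.
before = {"School A": 400, "School B": 250, "School C": 90}
after = {"School A": 390, "School B": 120, "School C": 95}

for name, pct in flag_large_changes(before, after):
    print(f"{name}: {pct:+.1f}% (worth a phone call)")
```

Schools missing from either year are skipped rather than flagged, so it is worth checking separately for schools that opened, closed or were renamed.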
Where to look for your story
After California issued a press release describing the historic crash, we tried to break down the enrollment data by race, socioeconomics, urban/rural, and region, but it was as close to a universal problem as you’ll ever see in a state this size. That became the story, and we had the receipts to back up those claims.
That said, EdSource primarily covers California on a statewide basis. While we did look at counties and regions, we did not give them the attention or scrutiny we gave to the state as a whole. Localizing your story by looking at a specific county—or even schools within a specific district for which you have strong institutional knowledge—could bring great stories to the surface, be they trend stories or even humanizing stories about an especially hard-hit school struggling to adapt or prepare.
Your existing knowledge and expertise are key here. Two reporters with different backgrounds can look at the exact same data, sorted and filtered, and see two different stories. Institutional knowledge and personal experience are critical in reporting on data, so never be afraid to cover already-trodden ground.
Find more step-by-step COVID–19 data story recipes like this one. If you have questions about a story you’re working on, our free peer data review program is here to help.
Programs like these are part of the OpenNews COVID–19 community care package. If you’re using this story recipe, please let us know—we’d love to promote your work! If you’ve got a story recipe idea, we’d love to hear about it. Drop us a line at email@example.com.
Vignesh Ramachandran is a freelance journalist and co-founder of Red, White and Brown Media. He has written for the Colorado Sun, Knight Foundation and NPR and previously worked for ProPublica, the Stanford Computational Journalism Lab, NBC News Digital and Mashable.
Daniel J. Willis is EdSource’s data analyst and database designer. He previously spent 10 years at the Oakland Tribune, Contra Costa Times, and San Jose Mercury News in a variety of roles. His work has been honored by the Education Writers of America, Northern California Society of Professional Journalists, California Newspaper Publishers Association, and White House Correspondents Association among others. He is an alumnus of the University of California at Santa Cruz where he studied economics.