US Shutdown Scuttles Data



The US Government Shutdown has affected federal feeds from Mars to the Census—how news developers are adapting and collaborating

As the government shutdown grinds into its third day, many news developers, civic data hackers, and open gov activists are starting to feel the hurt due to the suspension of most government data feeds, APIs, and websites. For the last few days, Twitter has been active with developers helping each other find alternate sources of data, troubleshoot temporarily dead-ended projects, and collaborate on open data repositories.

We approached a number of these developers to get their tips, workarounds, and thoughts on the current shutdown’s effect on their data work.

“It hasn’t happened, yet.”

For Ben Welsh, at the Los Angeles Times, the shutdown isn’t a crisis—yet. “We have local copies of most of the things we work with day-to-day,” he explains, “and an occasion hasn’t cropped up yet where we’ve tried to reach for something online and it wasn’t there.” However, Welsh says, “The longer this goes on, the more likely it is that will happen. But it hasn’t happened just yet.”
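
The “local copies” habit Welsh describes can be captured in a small fallback pattern: try the live federal feed first, cache what comes back, and serve the cached copy when the request fails. This is a minimal sketch, not code from any of the newsrooms quoted here; the URL and file paths are illustrative:

```python
import json
import urllib.request

def fetch_with_fallback(url, cache_path):
    """Fetch JSON from a live feed, caching it locally; on failure
    (e.g. a shutdown-darkened .gov site), serve the cached copy."""
    try:
        with urllib.request.urlopen(url, timeout=10) as resp:
            data = json.load(resp)
        # Refresh the local mirror while the source is still up.
        with open(cache_path, "w") as f:
            json.dump(data, f)
        return data
    except Exception:
        # Live source unavailable -- fall back to the local mirror.
        with open(cache_path) as f:
            return json.load(f)
```

The trade-off is staleness: the cache is only as fresh as the last successful fetch, which is exactly the problem a prolonged shutdown creates.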

Matt Stiles at National Public Radio echoes Welsh: “I haven’t encountered any insurmountable problems—yet.”

“Yet” appears to be the operative word right now. Some sites, feeds, APIs, and other data sources went dead on October 1, others have stayed up but aren’t updated, and others still have continued chugging along. If it all feels a little arbitrary, that’s because it kind of is. As a story in Quartz explains, “Basically, until Congressional funding is restored, it is illegal to do anything that would give someone an excuse to ask the government for money when it re-opens—and that includes hosting fees and electricity bills. And the varied responses underscore how confusing and messy it is when the government has to stop usual operations on a dime.”

That confusion has scuttled some in-process projects. As Chris Keller, a developer at KPCC in Southern California explains, “On Monday I wrote a couple scripts that used the Census Bureau’s API to loop through some tables to get the basic breakdowns of the tracts. I had planned to look at American Community Survey data on Tuesday. Needless to say that went nowhere.”
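
Scripts like the ones Keller describes typically build a query URL against the Census Bureau’s data API and then unpack its response, which arrives as a header row followed by data rows. The following is a hedged sketch of that pattern, not Keller’s actual code; the table variable and FIPS codes are illustrative:

```python
import json
from urllib.parse import urlencode

# Base of the Census Bureau's data API (dark during the shutdown).
CENSUS_API = "http://api.census.gov/data/2010/sf1"

def tract_query_url(table_vars, state_fips, api_key):
    """Build a request URL for per-tract figures from a Census table."""
    params = urlencode({
        "get": ",".join(table_vars),
        "for": "tract:*",
        "in": "state:%s" % state_fips,
        "key": api_key,
    })
    return "%s?%s" % (CENSUS_API, params)

def rows_to_dicts(payload):
    """The API answers with a header row followed by data rows;
    zip them together into one dict per tract."""
    header, *rows = json.loads(payload)
    return [dict(zip(header, row)) for row in rows]
```

With the API offline, the URL builder returns addresses that simply don’t resolve—which is why Keller’s Tuesday plans “went nowhere.”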

The Census was a problem for Stiles as well: “I did have a slight hiccup on Tuesday when I wanted to check the Census Bureau’s website to see whether the agency maintained a list of diversity index scores at the tract level. Of course, I couldn’t. The site was down.”

“Whole-Hog Survivalist”

Every newsroom developer we pinged for this story mentioned the Census as a potential fail point in their projects. Thankfully, the CensusReporter project has been a rock-solid fail-safe, offering a mirrored cache (and better API) of all the census data.

“The purpose of the CensusReporter project is to make it easier for journalists to get U.S. Census information,” explains Ryan Pitts, the lead developer of CensusReporter (full disclosure: Pitts is also a developer on Source), “so before we started building anything, we had to have our own copies of the data. Much of that has been in place for months, so the government shutdown isn’t that much of a tarpit for us, really.”

Pitts and the CensusReporter team went the extra mile and shared their raw data. “We added a quick S3 bucket on Tuesday morning to share the raw files various team members still had sitting on hard drives. We obviously weren’t the only ones thinking along these lines,” he adds.

They weren’t. Mike Migurski, a Code for America Fellow, had been working with Census geodata files last week. “If I had chosen this week to do the same work I would be looking for another project,” he explains. But instead, he’s decided to go “whole-hog survivalist canned-food mountain man” and has collected shapefiles from “other census nuts” and made them available to everyone.

“It’s exposed a single point of failure”

Whether the shutdown lasts for a few more days or a few more weeks, going “whole-hog survivalist” and sharing cached data among the communities of interest that need it is a good idea, says Waldo Jaquith, a long-time open-data activist: “In the long run, it’s going to be really healthy for folks to be cut off of open government data for a bit, because it’s exposed a single point of failure.”

Pitts agrees, and says the shutdown has kicked off a needed conversation around data infrastructure, pointing to work by the Sunlight Foundation among others.

Jaquith sees an opportunity for a fundamental re-think of how governments supply their data: “For many types of data, I favor bulk-data-as-API,” he explains. “This is the practice of providing bulk data not as a monolithic file, but as thousands of JSON files, which functions exactly like an API. These files can be mirrored en masse, providing the same functionality as an API, should the master, .gov version disappear.”
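
Jaquith’s bulk-data-as-API idea is straightforward to sketch: explode a bulk dataset into one JSON file per record, named by ID, so that any static file server—or any mirror of the directory—answers the same lookups an API would. A minimal illustration under that assumption, with made-up field names:

```python
import json
from pathlib import Path

def publish_bulk_as_api(records, out_dir):
    """Write each record to its own JSON file, named by its id.
    Serving out_dir statically then behaves like GET /<id>.json,
    and the whole tree can be mirrored en masse (rsync, wget -r)."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    for record in records:
        (out / ("%s.json" % record["id"])).write_text(json.dumps(record))

def lookup(out_dir, record_id):
    """A mirror answers the same 'API call' from plain files,
    even if the master .gov copy has gone dark."""
    return json.loads((Path(out_dir) / ("%s.json" % record_id)).read_text())
```

Because the “API” is just files, there is no single server to fail: every mirror is a complete, working replacement.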

In the meantime, news developers, open-data activists, and those who simply want to keep programmatic tabs on the government’s work are going to have to scramble until the feeds start back up.

About Dan Sinker
