How 21 Newsrooms Pooled Funds for Texas Public Data
Together we spent thousands for important voter registration data—a collaboration model that you can use, too.
In September, twenty-one newsrooms banded together to purchase Texas’s voter registration database and voting history data from the Texas Secretary of State’s office. The data cost more than $3,500 for tens of millions of records. It was the largest collaboration in recent Texas history for the sole purpose of purchasing public records.
When news of our collaboration hit social media, journalists were enthusiastic. Reporters saw it as a potential model for getting expensive data or records where they work. Other journalists were encouraged by so many competing organizations willing to collaborate on a big project. At the same time, the non-media public seemed mystified that big media companies couldn’t afford a $3,500 bill on their own.
The truth is newsrooms are getting squeezed by two big trends: Government agencies are charging larger amounts for public records while newsrooms are getting smaller and budgets are shrinking.
New Fee Structures for Public Data
Journalists are relying more on public data for enterprise and investigative reporting. Until the last decade or so, state public record laws were often out of date and didn’t address how much agencies could charge for data. It was fairly common to see an agency attempt to charge a journalist for a database by the number of pages it would take up if the records were printed out on paper.
As we all transitioned to a more digital world, it should have become easier to get public records electronically. This is particularly true for databases. That’s not what happened. Instead of being charged for paper copies of electronic data, journalists in Texas and other states are now being charged for “programming time” or “computer processing time.” These cost estimates can be difficult to refute and can lead to huge bills.
Here’s an example: a recent request made to the Texas Department of Public Safety for a small amount of data yielded an $800+ bill for 16 hours of programming time. I know from my data experience that it’s highly unlikely that this request would take that long to produce. Reporters can try haggling with agencies, but if the government insists that’s what it takes to produce the record, there are few options to appeal. Lawsuits are unlikely because they’re difficult to win and expensive to fight. What usually happens is the newsroom can’t afford it, and the data remains out of the hands of journalists and the public at large. It’s a win for secrecy and a loss for transparency.
When You Can’t Pay, Collaborate
When you combine these rising fees with reduced newsroom budgets, you have a huge problem for watchdog journalism. I recently conducted an informal survey of data journalists on this issue. The responses show this isn’t just a Texas problem. Out of 18 newsrooms surveyed, seven said a bill between $100 and $300 would likely get a no from editors. Five said a bill between $300 and $500 would be too rich for their newsroom’s blood. Yes, national newsrooms can afford larger bills than smaller community papers, but no one has a blank check for records.
Newsroom budgets aren’t getting any larger. Government agencies aren’t going to lower the costs of records out of the goodness of their bureaucratic hearts. The only way through is collaboration.
In the past, competitive pressures made it almost impossible for newsrooms in the same markets to work together to get records. Consolidation and tightening budgets may have dragged us to a tipping point. Collaborative efforts like the one we did in Texas are the key to breaking through this cost barricade.
What Collaboration Can Look Like
The whole thing started when our newsroom wanted these data sets for election stories. Haggling over the price with the Secretary of State’s office wasn’t getting us anywhere. The collaboration idea was a long shot of sorts. We embarked on it with a healthy dose of skepticism. Here’s what I expected: maybe a larger outlet like the Texas Tribune goes in on it with us, but everyone else demurs because it’s too risky.
Newsrooms are collaborating now more than ever, but they often follow a different, more time-intensive model than the one we ultimately used. Typical collaborations involve one to three newsrooms and focus on a single investigative or enterprise project. These projects can take months to produce and can involve multiple layers of reporters and editors from each newsroom. We didn’t have time for something like that.
Another model is more nimble but still wasn’t right for us: ProPublica has made great strides in convincing large numbers of newsrooms to work together on projects like Documenting Hate and Electionland. These projects help partners write more and better stories on highly important subjects like hate crimes and voting issues. We needed a collaboration that was even more flexible, and not focused on producing a specific kind of story.
What We Tried Instead
What we developed was something else. We just wanted to get data. We didn’t have a particular story under development. Instead, we wanted data that was broadly useful, not just for stories but also as a people finder and for backgrounding individuals in the news. Still, when we put out the call for newsrooms to collaborate with us, it was a big ask. Without a large number of newsrooms involved, the whole thing wouldn’t be worth it.
It was only when we started making calls that we realized how many newsrooms were willing to make this happen. I started calling just Houston newsrooms. Then I tried to call or email every TV station, newspaper, or public radio station in each of the larger markets in Texas. After reaching out to almost all the newsrooms in San Antonio, Dallas, and Austin, I worked on the smaller markets like Corpus Christi, El Paso, Laredo and more.
For each newsroom, I’d find the politics editor or the news director. The pitch went something like this: you can get thousands of dollars’ worth of data for bargain-basement prices. But the only way you can get that data is if you work with us and act now. If I knew other outlets in their market were in already, I’d stress that no one wants to be the only newsroom that doesn’t have this database.
Editors were quick to see the value of the data we sought. There was little worry about competition. Instead, we got more questions about possible coordination on stories related to the data. But rather than trying to navigate story collaboration, we kept the focus on getting the data first. Almost everyone jumped on board. Most newsrooms that didn’t participate just never got back to me after voicemails or emails. In the end, 21 newsrooms were involved, bringing the cost of a massive amount of voter registration and voter history data to around $180 per participant.
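For anyone weighing a similar pitch, the arithmetic behind it can be sketched in a few lines of Python. This is only an illustration, not the method we used: the function name `share_per_newsroom` is made up for this example, and the $3,500 total is an assumed lower bound, since the bill was "more than $3,500." A share of about $180 across 21 newsrooms would imply a total closer to $3,780.

```python
def share_per_newsroom(total_cost: float, participants: int) -> float:
    """Split a records bill evenly and return each newsroom's share in dollars.

    Hypothetical helper for illustration; assumes an equal split.
    """
    if participants < 1:
        raise ValueError("need at least one participating newsroom")
    return round(total_cost / participants, 2)


# Assumed lower-bound total; the actual bill was somewhat higher.
ASSUMED_TOTAL = 3500.00

# Show how the per-newsroom share falls as more partners sign on.
for n in (1, 5, 10, 21):
    print(f"{n:>2} newsrooms: ${share_per_newsroom(ASSUMED_TOTAL, n):,.2f} each")
```

The point of the pitch shows up in the numbers: every newsroom that joins lowers the price for everyone already in.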
Where We Are Now
We’re still in the middle of the project. Checks are still being written and data is still being distributed. Some newsrooms are more capable of handling such a large amount of data than others. We’re working on building bridges between organizations so newsrooms with more data skills can help those that might not have a dedicated data team or even a data person.
The data has already proven useful. In Waller County, students from the local university had a voter registration issue. Since the university does not have individual mailboxes for each student, it was unclear what address they should list as their residence if they wanted to vote in Waller County. The county gave one set of directions earlier this year and was now changing the directions shortly before the registration deadline. Thanks to having the voter registration database in hand, we could show how many people had registered by the older instructions. We could illustrate the size and scope of the problem easily. (Our original story is here, with follow-ups here and here.)
The sincere hope of many of us is that this is not a one-hit wonder. The challenges newsrooms face right now can feel insurmountable sometimes. The powerful frequently call the press the enemy of the people. Shrinking budgets mean fewer reporters and editors. Sometimes, it feels like no one is paying attention to anything other than slanted coverage that tells them what they want to hear. But if we can set aside our competitive nature and work together where we can, we can overcome almost any obstacle in our way.
Let a thousand collaborations like this take flight. The resulting journalism will be more than worth it.
Collaborators: Texas Tribune, KFOX/KDBC, WFAA, Dallas Morning News, KRIS, KVUE, Electionland/ProPublica, KVIA, Houston Public Media, KTRK, Univision-Houston, Corpus Christi Caller-Times, El Paso Times, KEYE, KPRC, WOAI/KABB, Texas Monthly, San Antonio Express-News, BigLocalNews/Stanford University, KHOU, Laredo Morning Times
Matt Dempsey is the data editor at the Houston Chronicle. He worked on projects involving wildfires, state pensions, and the chemical industry. His passion for public records frequently leads to disclosure of data from all levels of government. His series Chemical Breakdown won the 2016 IRE Innovation award and the National Press Foundation’s “Feddie” award. His work was a key part of the Chronicle’s Pulitzer Prize finalist entry for Breaking News.