How (and Why) ProPublica Got Into the Elections Game
A Q&A with the team behind Electionland and the Election DataBot
Yesterday morning, ProPublica announced two new projects: Electionland (announcement post), a large-scale initiative to report on voting access and problems in the upcoming US elections, and Election DataBot (announcement post), a comprehensive election-info data tracker and feed. We spoke with four members of the team on the OpenNews community call and by phone, and asked them to walk us through the projects. The transcripts have been edited for length and clarity.
Source: These are two incredible, ambitious projects. Tell us what we’re looking at, please.
Derek Willis: Electionland is, first and foremost, a project to study and report on—in real time—how elections are conducted here in the United States. We obviously have a big one coming up, and the weird thing about elections in the US is that they’re not actually one election, it’s more like 50-plus individual state elections. And monitoring those is a big job: There’s no federal election authority in the US as there is in many other countries…states have their own rules and procedures for how people are allowed to vote, and when they’re allowed to vote, and it’s very difficult to get a handle on that nationally. Most elections are pretty well run in the US, but there are definitely some problems, and there’s a lot of litigation going on right now about the ways you can vote and who is eligible to vote.
What we wanted to do with Electionland was to bring as much power as possible to bear on trying to figure out how well the elections are actually being conducted. There’s more data available around this question now than there ever has been before. We’re partnering with great organizations—Google and the Lawyers’ Committee for Civil Rights and others who have been working in the space for a long time—to surface and collect information about the voting process and procedures… and drive [that information] to news organizations so that they can better cover this in real time—not a day or a week or a month after the election has occurred.
The other thing we launched today is Election DataBot, which is a real-time feed of election data: campaign finance filings, some congressional votes, polls—sort of an IV drip of election-related information about the federal races in the fall election.
Scott Klein: [Election DataBot] gathers data from every source we know of, every source we can think of, and presents it in local snippets. It’s essentially all the polls in the HuffPost Poll API, all the forecasts that FiveThirtyEight runs, all of the campaign finance data, and all the Google Search trending data for the campaign. Instead of reporters needing to understand that stuff from a data journalism perspective, we cut it up for them so they can cover their local candidates—and ultimately, after the election, their congresspeople can use all these data sources.
Ken Schwencke: One cool part: [DataBot] allows you to get email alerts for campaign finance filings and things like that—everything in the feed, you can get alerts for, which alleviates the pain of going to websites and checking them for information all the time. I’m signed up for a bunch of them, and I’ve learned a lot about races I was never interested in until now.
Lena Groeger: On DataBot, I’ll speak to the design angle, which is what I was mostly focused on. This was a pretty big challenge in terms of how much data we were trying to feed people. Do we focus on data displays in some kind of cartogram of what states are most active? Or other kinds of maps or charts? At some point I [realized]—the most interesting thing is really this firehose, that you can see these updates in real time. We shifted to focus on that and have that be the centerpiece.
Source: What was the first conversation about these projects like?
Scott Klein: For years, I have been very jealous of my news-nerd friends who make such awesome stuff on Election Day. As an investigative newsroom, we mostly go home early on Election Day, and the question has always been in the back of my mind: “What does an investigative newsroom do on Election Day, how can we be useful as investigative data journalists?”
I had a conversation with Simon Rogers at the Google News Lab about a project we might be able to do together, and one of the overlaps in our Venn diagrams is that we both see it as core to our mission to empower local newsrooms to do great data projects by kind of preparing the data for them.
So we thought of some election-related projects that might overlap in that way, and we came up with two ideas: one was Election DataBot, and the other was back to, “What do we do on Election Day?”
Everyone is looking at who’s winning. It’s a very important question and I’m glad newsrooms spend so much time on it, but there are a ton of other questions that they don’t spend as much time on—and maybe they should spend more time on, and maybe we can help make it easier for them to spend more time on—and that’s the status of voting itself. How are people experiencing the polls? Are the lines so long they can’t vote, or are they getting turned away? Is there a confusing voter ID requirement that’s making it hard for them to vote? And to be sure, have they seen someone voting two or three times? We’re absolutely open to that as a possibility.
Ken Schwencke: The more research we did—and we did a lot of research and reading—into ways to look at that, and what sort of problems arise [in voting], we realized, “Oh man, this is probably one of the first years when you could do this really well, really comprehensively.” Once we realized that, we were all in.
On Election Day, our plan is to be a giant hub of the US’s largest—or maybe second largest after the AP—newsroom working on election-related problems.
Source: How many people are working on this at ProPublica, and how many do you expect to dedicate to this immediately before/during/after the election?
Scott Klein: There were four full-time here at ProPublica working on this, and as we got closer to the end, maybe six. But there’s a lot of excitement here, people are asking if they can help, and I think that number will grow. I’ll be [looking] for lots more volunteers on Election Day. It’s not clear to me how many people, but it could be as many as 100 people—that’s bigger than ProPublica entirely. As many as will come!
Source: What were the major technical challenges of an undertaking like this?
Ken Schwencke: To answer that question, I would like to say that if there are any developers or people who specialize in building systems that process signals or do natural language processing, if you’d like to help, we would absolutely love to hear from you. There’s a lot of signal collection, and then a lot of filtering, and a lot of that filtering is going to be a human endeavor. But we do have lots of technical tasks and we would love help on that.
Derek Willis: The other thing I would add is that in terms of the ethics of it…obviously we’re interested in when voting goes badly, when there are mistakes or barriers to voting, or things like that, so those incidents usually involve individuals. But we’re more interested in and focused on the systems and the processes themselves that lead to those problems. So for example, the Lawyers’ Committee has an Election Protection hotline that people call in to report issues or problems or ask questions. And there is an ethical and a privacy concern about taking individuals’ names and saying, “Hey, this person is having this problem.” Our focus has been much more, “This problem exists in this area.” We’re trying to collect signals that might suggest issues or problems with the existing processes, and then have our reporting partners figure out if [improper] things are actually going on and who’s impacted by them.
Lena Groeger: And hopefully then solve it in real time rather than write about it three days later. That’s the other thing we’re excited about.
Source: Tell us about the partnerships. Have you had another project where you worked with this many organizations, or this many kinds of organizations before? And how did it all come together?
Scott Klein: It’s probably the widest-ranging, because it also pulls in outlets not thought of as journalism, and we’re bringing technologists in as well.
Derek Willis: It came together because we realized that we couldn’t do this alone—no single news organization can do this well, there’s just too much to cover. That said, at ProPublica, we’re in a pretty good position to drive something like this. We don’t do live election results on Election Day or night, we do more long-term projects. So while other folks are covering the races on a day-to-day basis, we have the opportunity to do something different and meaningful.
It was natural to turn to news organizations that have that kind of scope, like the USA Today Network (formerly Gannett), with newspapers and television stations in a lot of these markets; NPR member stations with folks on the ground; and Univision—we want to try and make sure we have the broadest possible participation, particularly in areas that have had troubles in the past, whether it’s things like long lines to vote or voter ID laws that have different effects on different people, or other changes or discrepancies in how voting is conducted.
Scott Klein: We’ve got a lot of people who are working really hard on this. The First Draft people are helping us think about how we’re going to train all of the students and how we’ll process signals, and the USA Today Network brings tons of papers of their own, and they’re bringing all of them aboard. Univision helps us reach one of the most important constituents here when it comes to voting rights and voting problems, and obviously Google. The folks at Google News Lab not only bring knowledge and API access to the Google Search Trends, they also bring a lot of the project thinking and project management that non-journalists are able to do. So they’re bringing lots of smarts and lots of expertise from different places.
WNYC has run a live newsroom on Election Day, so they’re helping us make sure we’re doing that right, and they’re also bringing a huge number of member stations—they’ve done multi-station projects as well. There are a lot of us involved in supervising and recruiting and understanding the even bigger circle.
Derek Willis: We’re going to have a newsroom on Election Night in New York—our partners will be all around the country, and we have some journalism schools involved as well. Television networks have “decision desks” on Election Night to call races. We’re going to have desks that are similar except they’re going to be filtering through reports of voting problems or other things that bubble up—through the Lawyers’ Committee for Civil Rights hotline or through signals we can detect on social media, or other things. [We will provide] a tip-service to our partners, so they can hopefully jump on it and move pretty quickly.
I don’t think any of us really know exactly how it’s going to work on Election Night, but we’re really excited about trying to figure out what the best way to do this is, and to be able to bring to bear some things that we do very well at ProPublica—gathering data and using it to help push us in the right direction. We’ve got less than two months to figure it out, so I’m really looking forward to it.
Source: What kind of testing can you do for an experiment like this? Are you doing trial runs?
Scott Klein: So first, there are a few primaries before November 8. Next Tuesday, we’re going to get together at CUNY, and it’s very early, we don’t have any software written yet, but we’re going to look at Facebook, look at Twitter, and see who’s talking about the primary in New York, and the other primaries that are also coming up.
But we can also use the Twitter API and ask, what did the day of the Arizona primary in March look like? [We can] replay that day on Twitter, replay that day on Facebook, take a look at the calls that came in to Election Protection on that day, and see what Google Search looked like.
In part because of the great archiving of all of these services, we can make one of the primaries happen again and test our software against it, test our assumptions against it, and see what the pace looks like, see what people would have needed to be doing at that time.
Once we’ve got systems and ideas and software in place, we’ll use it against old data. The most important thing is that early voting will start soon. And a third of the country votes by November 8th, so we will have long lines and provisional balloting. The same problems we’ll have on Election Day, we’ll have on a smaller scale around the country for early voting. So whatever software we build and systems we make and hypotheses we have will have been tested for about a month by the time November 8th comes around.
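[Editor’s sketch: The replay approach Klein describes, re-running an archived day of signals through new software, might look roughly like the Python below. This is a minimal illustration, not ProPublica’s code. The `Signal` record, the `replay` and `hourly_pace` helpers, the source names, the placeholder date, and every sample record are hypothetical and made up for the example. It assumes each archive has already been exported as a time-sorted list.]

```python
import heapq
from collections import Counter
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Signal:
    """One archived report (hypothetical schema, invented for this sketch)."""
    timestamp: datetime
    source: str   # e.g. "twitter", "hotline", "search"
    county: str
    text: str


def replay(*archives):
    """Merge per-source archives, each already sorted by time,
    into a single chronological stream."""
    return heapq.merge(*archives, key=lambda s: s.timestamp)


def hourly_pace(signals):
    """Count signals per clock hour to see how busy each hour would have been."""
    counts = Counter()
    for s in signals:
        hour = s.timestamp.replace(minute=0, second=0, microsecond=0)
        counts[hour] += 1
    return counts


def at(hour, minute):
    # Placeholder date for the sample data, not a real primary date.
    return datetime(2016, 3, 1, hour, minute)


# Made-up sample records standing in for real archives.
twitter = [
    Signal(at(7, 5), "twitter", "County A", "line is around the block"),
    Signal(at(9, 30), "twitter", "County B", "machines down?"),
]
hotline = [
    Signal(at(7, 40), "hotline", "County A", "caller reports long wait"),
    Signal(at(8, 15), "hotline", "County A", "caller asked about ID rules"),
]

stream = list(replay(twitter, hotline))
pace = hourly_pace(stream)

for hour, count in sorted(pace.items()):
    print(hour.strftime("%H:%M"), count)
```

[Feeding `stream` to whatever filtering software is under test, instead of printing counts, would approximate “making one of the primaries happen again.”]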
Source: You’ve mentioned that you hope to address voting problems in real time rather than writing about them after the fact. What does that look like?
Scott Klein: ProPublica is about impact. We’re not just about publishing important stories. On November 9th, fixing problems is too late. People who have missed an election have been, for practical purposes, disenfranchised. The brass ring we’re aiming for is that those problems go away—on Election Day while there’s still a chance for people to vote.
Source: What do you hope will be done on Election Day to ensure people get to vote?
Scott Klein: This is why we focused on the local newsrooms. If a county in Ohio sees that a nonprofit newsroom in New York has published a story about how their lines are too long, they’re so busy that day—it’s just not going to appear on their radar. Whereas if the local NPR station or the local newspaper starts talking about it, those stories become a voice that [officials] have to respond to.
Journalists covering elections and non-journalists who want to help ProPublica and its partners gather data can sign up at propublica.org/electionland.
Lena Groeger is an investigative journalist and developer at ProPublica, where she makes interactive graphics and other data-driven projects. She also teaches design and data visualization at The New School and CUNY. Before joining ProPublica in 2011, Groeger covered health and science at Scientific American and Wired magazine. She is particularly excited about the intersection of cognitive science and design, as well as creating graphics and news apps in the public interest.
Scott Klein directs a team of journalist/programmers building large interactive software projects that tell journalistic stories, and that make complex national statistics relevant to readers and their communities. He is also co-founder of DocumentCloud, a service that helps news organizations search, manage, and present their source documents.
Ken Schwencke is a journalist and programmer building @propublica’s @electionland. Formerly @nytimes @latimes. Go Gators.
Derek Willis is a news applications developer at ProPublica, focusing on politics and elections. He previously worked as a developer and reporter at the New York Times, a database editor at The Washington Post, and at the Center for Public Integrity and Congressional Quarterly.