How curiosity and tinkering let Al Jazeera America publish historical data for a derailed train’s route without Amtrak’s cooperation.
Sometimes you write a piece of software and it gets used for purposes you didn’t quite imagine at the time. Sometimes you write a piece of software and it unexpectedly rearranges your life.
Derek Willis breaks down the three stages of scraping (denial, annoyance, and acceptance) while confronting the election-results form from hell.
D’Vera Cohn on Everything You Ever Wanted to Know About Marriage Data But Were Afraid to Ask
Jeremy Merrill and Ken Schwencke explore the fine art of anticipating and catching errors while wrangling the eccentricities of US elections data.
Michael Maciag’s walk-through of this under-utilized goldmine.
Emily Alpert Reyes on how to find promising needles in Census haystacks.
Jake Harris reverse-engineers Twee-Q to evaluate its use of data (and to see whether his ratio is as disappointing as Twee-Q says it is).
Basile Simon walks through the process of building a new tool that aims to help reporters cover beats, prompted by work from Knight-Mozilla Fellows and a presentation at Hacks/Hackers London.
The Chronicle of Higher Education set out to compare the net cost of colleges and found an unexpected discrepancy. The team describes the piece they created to help explain why net costs are difficult to compare.
The BBC News Labs team explores ways of exposing linked data in public-facing election coverage, and encounters some interesting challenges.
Ronald Campbell on using census data to find facts in a world of speculation.
Jacob Harris on six ways to make mistakes with data—and how to avoid them.
Paul Overberg explains base tables and how to get the best data from them (hint: ask good questions!).
Jacqui Maher says it’s not just the numbers, it’s what they mean about the audience.
People tweet what they think, when they think it—and, crucially, we wanted to provide a visualization for the State of the Union speech which reflected that. This wouldn’t be a (shudder) word cloud based on frequencies but a way to track the conversation on Twitter as it was directly influenced by the President’s speech.
Jonathan Stray’s guide to turning documents into data you can run with.
Tyler Dukes on combining the power of data-sorting tools with old-fashioned digging.