Articles

Project walkthroughs, tool teardowns, interviews, and more.

Articles tagged: python

What I Learned Recreating One Chart Using 24 Tools

By Lisa Charlotte Rost

Posted on December 8, 2016
Lessons learned from trying to create one chart with as many applications, libraries, and programming languages as possible.
Peda(bot)gically Speaking: Teaching Computational and Data Journalism with Bots

By Nicholas Diakopoulos

Posted on April 25, 2016
Bots encapsulate how data and computing can work together in journalism. And when we use bots to teach concepts and skills in computational journalism, we’re actually teaching two kinds of thinking: editorial and computational.
How I Investigated Uber Surge Pricing in D.C.

By Jennifer A. Stark

Posted on April 18, 2016
As part of my research with the Tow Center, I investigated the geographical and demographic data around how Uber works in D.C., to find out if its wait times varied by neighborhood (and, as a result, by demographic). Here’s how I did it.
How La Nación Listened to 20,000 (Possibly Interesting) Audio Files

By Juan Elosua and Francis Tseng

Posted on February 8, 2016
With about 20,000 unlabeled audio files to classify, as part of a big breaking story, we created a process to help us focus on the files we actually needed.
Inside the Wall Street Journal’s Prediction Calculator

By Martin Burch

Posted on January 8, 2016
We made an interactive based on a government agency’s method of predicting a person’s race and ethnicity based solely on their name and address. The response from our readers was, fittingly, a little unpredictable.
Introducing Elex, a Tool to Make Election Coverage Better for Everyone

By Jeremy Bowers and David Eads

Posted on December 10, 2015
“End the elections arms race” has become a rallying cry in American data journalism. Many newsrooms spend tremendous resources writing code to simply load and parse election data. It’s time we stopped worrying about the plumbing and started competing on the interesting parts. We decided it was time we put some code against our beliefs – our contribution is a tool we’re calling Elex. And it needs your help, too.
Introducing agate: a Better Data Analysis Library for Journalists

By Christopher Groskopf

Posted on October 27, 2015
Meet agate, a Python data analysis library optimized not for performance, but for the performance of the human who is using it. That means focusing on designing code that is easy to learn, readable, and flexible enough to handle any weird data you throw at it. Here’s why you should try it.
Introducing broca

By Francis Tseng

Posted on July 31, 2015
Made at our recent code convening, broca creates a system for easier experimentation and implementation of natural language processing.
Introducing Bedfellows

By Nikolas Iubel

Posted on December 18, 2014
The financial relationship between PAC contributors and recipients can be difficult to divine from the information reported to the FEC. Bedfellows is a new Python library based on a model developed at The Upshot for understanding those relationships via several different measures.
A Botmaking Primer

By Joseph Kokenge

Posted on March 27, 2014
Not sure where to begin with this whole bot thing? Joseph Kokenge is here to help you get started with botmaking 101.
To Scrape, Perchance to Tweet

By Abe Epton

Posted on January 14, 2014
At the Chicago Tribune, we had a simple goal: to automatically tweet contributions to Illinois politicians of $1,000 or more, which campaigns are required to report within five business days. To see, in something approximating real time, which campaigns are bringing in the big bucks and who those big-buck-bearers are. The Illinois State Board of Elections (ISBE) has helpfully published exactly this data for years online, in a format that appears to have changed very little since at least the mid-2000s. There’s no API for this data, but the stability of the format is encouraging. A scraper is hardly an ideal tool for anything intended to last for a while and produce public-facing data, but if we can count on the format of the page not to change much over at least the next several months, it’s probably worth it.
Introducing Treasury.IO

By Michael Keller and Cezary Podkul

Posted on January 7, 2014
The U.S. Treasury’s Daily Treasury Statement lists actual cash spending down to the million on everything the government spent money on each day, as well as how it funded the spending. But the Treasury releases these files only as PDFs or fixed-width text files, making any analysis very difficult. To liberate the data and make it easy to analyze federal money flows across time, we created Treasury.IO. The system we built downloads and parses the fixed-width files into a standard schema, creating a SQLite database that can be directly queried via a URL endpoint.
How We Made the (New) California Cookbook

By Megan Garvey, Erin Kissane, Lily Mihalik, and Anthony Pesce

Posted on November 21, 2013
At the Los Angeles Times, a design-editorial-programming team has resurrected the spirit of the beloved, out-of-print California Cookbook as a new website collecting hundreds of recipes from the Times Test Kitchen. In our Q&A, the project’s editor, designer, and lead programmer share their goals and challenges, and offer a peek at the site’s building blocks and planned future.
How We Made Lobbying Missouri

By Danny DeBelius, Christopher Groskopf, Erin Kissane, and Matt Stiles

Posted on November 12, 2013
Lobbying Missouri is a collaboration between St. Louis Public Radio and members of NPR’s news apps teams. We spoke with three team members about the project, their design process, and the code under the hood.
All About Reporter

By Erin Kissane and Jeremy Singer-Vine

Posted on September 10, 2013
The Wall Street Journal’s Jeremy Singer-Vine recently released Reporter, an open source tool that makes it easy to hide and reveal the code behind common forms of data visualization presented on the web. We spoke with him about the tool’s makeup, design goals, and future development plan.
Introducing csvdedupe

By Derek Eder and Forest Gregg

Posted on August 15, 2013
Introducing csvdedupe, an open source command line tool for de-duplication and entity resolution.
Chase Davis on fec-standardizer

By Chase Davis and Erin Kissane

Posted on January 28, 2013
Chase Davis breaks down his fec-standardizer project and explains where it’s going next.
