Project walkthroughs, tool teardowns, interviews, and more.
Articles tagged: python
What I Learned Recreating One Chart Using 24 Tools
Lessons learned from trying to create one chart with as many applications, libraries, and programming languages as possible.
Peda(bot)gically Speaking: Teaching Computational and Data Journalism with Bots
Bots encapsulate how data and computing can work together in journalism. And when we use bots to teach concepts and skills in computational journalism, we’re actually teaching two kinds of thinking: editorial and computational.
How I Investigated Uber Surge Pricing in D.C.
As part of my research with the Tow Center, I investigated the geographical and demographic data around how Uber works in D.C., to find out if its wait times varied by neighborhood (and, as a result, by demographic). Here’s how I did it.
How La Nación Listened to 20,000 (Possibly Interesting) Audio Files
By Juan Elosua and Francis Tseng
With about 20,000 unlabeled audio files to classify, as part of a big breaking story, we created a process to help us focus on the files we actually needed.
Inside the Wall Street Journal’s Prediction Calculator
By Martin Burch
We made an interactive based on a government agency’s method of predicting a person’s race and ethnicity based solely on their name and address. The response from our readers was, fittingly, a little unpredictable.
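Predictions like the one behind this interactive typically combine surname and geographic evidence with Bayes’ rule (the approach used in the government’s BISG proxy method). A minimal sketch of that general pattern follows; every probability table here is a hypothetical toy value, not real census data, and the function name `bisg_proxy` is our own.

```python
# Sketch of a Bayesian surname-geocoding proxy prediction.
# All probability tables below are HYPOTHETICAL toy numbers.

RACES = ["white", "black", "hispanic", "asian"]

# P(race | surname), e.g. derived from a census surname list (toy values)
P_RACE_GIVEN_SURNAME = {
    "garcia": [0.05, 0.01, 0.92, 0.02],
    "smith":  [0.73, 0.23, 0.02, 0.02],
}

# P(race | census block group) from census geography (toy values)
P_RACE_GIVEN_GEO = {
    "block_a": [0.10, 0.05, 0.80, 0.05],
    "block_b": [0.85, 0.05, 0.05, 0.05],
}

# Overall population shares P(race) (toy values)
P_RACE = [0.60, 0.13, 0.18, 0.09]

def bisg_proxy(surname, geo):
    """Combine the two sources of evidence via Bayes' rule:
    P(race | surname, geo) ∝ P(race|surname) * P(race|geo) / P(race)."""
    scores = [
        s * g / r
        for s, g, r in zip(P_RACE_GIVEN_SURNAME[surname],
                           P_RACE_GIVEN_GEO[geo],
                           P_RACE)
    ]
    total = sum(scores)
    return {race: score / total for race, score in zip(RACES, scores)}

probs = bisg_proxy("garcia", "block_a")
best_guess = max(probs, key=probs.get)
```

The normalization step is what makes the output read as probabilities; the ranking of candidates comes entirely from the product of the two conditional tables.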
Introducing Elex, a Tool to Make Election Coverage Better for Everyone
By Jeremy Bowers and David Eads
“End the elections arms race” has become a rallying cry in American data journalism. Many newsrooms spend tremendous resources writing code to simply load and parse election data. It’s time we stopped worrying about the plumbing and started competing on the interesting parts. We decided it was time we put some code against our beliefs – our contribution is a tool we’re calling Elex. And it needs your help, too.
Introducing agate: a Better Data Analysis Library for Journalists
Meet agate, a Python data analysis library optimized not for performance, but for the performance of the human who is using it. That means focusing on designing code that is easy to learn, readable, and flexible enough to handle any weird data you throw at it. Here’s why you should try it.
Introducing broca
Made at our recent code convening, broca creates a system for easier experimentation and implementation of natural language processing.
Introducing Bedfellows
The financial relationship between PAC contributors and recipients can be difficult to divine from the information reported to the FEC. Bedfellows is a new Python library based on a model developed at The Upshot for understanding those relationships via several different measures.
A Botmaking Primer
Not sure where to begin with this whole bot thing? Joseph Kokenge is here to help you get started with botmaking 101.
To Scrape, Perchance to Tweet
By Abe Epton
At the Chicago Tribune, we had a simple goal: to automatically tweet contributions to Illinois politicians of $1,000 or more, which campaigns are required to report within five business days. To see, in something approximating real time, which campaigns are bringing in the big bucks and who those big-buck-bearers are. The Illinois State Board of Elections (ISBE) has helpfully published exactly this data for years online, in a format that appears to have changed very little since at least the mid-2000s. There’s no API for this data, but the stability of the format is encouraging. A scraper is hardly an ideal tool for anything intended to last for a while and produce public-facing data, but if we can count on the format of the page not to change much over at least the next several months, it’s probably worth it.
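The scrape-then-filter-then-post pattern described above can be sketched in a few lines of standard-library Python. The HTML below is a made-up stand-in for an ISBE report page, the column layout is hypothetical, and the Twitter step is left out; this only illustrates the shape of the pipeline, not the Tribune’s actual code.

```python
from html.parser import HTMLParser

# Made-up stand-in for a scraped contributions page (layout is hypothetical).
SAMPLE_HTML = """
<table>
<tr><td>Friends of Example</td><td>Big Donor LLC</td><td>$5,000.00</td></tr>
<tr><td>Citizens for Sample</td><td>Jane Doe</td><td>$250.00</td></tr>
</table>
"""

class ContributionParser(HTMLParser):
    """Collect each table row as a list of cell strings."""

    def __init__(self):
        super().__init__()
        self.rows, self._row, self._in_td = [], [], False

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag == "td":
            self._in_td = True

    def handle_endtag(self, tag):
        if tag == "td":
            self._in_td = False
        elif tag == "tr" and self._row:
            self.rows.append(self._row)

    def handle_data(self, data):
        if self._in_td:
            self._row.append(data.strip())

def big_contributions(html, threshold=1000):
    """Yield tweet-ready strings for contributions at or above threshold."""
    parser = ContributionParser()
    parser.feed(html)
    for committee, donor, amount in parser.rows:
        value = float(amount.replace("$", "").replace(",", ""))
        if value >= threshold:
            yield f"{donor} gave ${value:,.0f} to {committee}"

tweets = list(big_contributions(SAMPLE_HTML))
```

In a real deployment the HTML would come from a scheduled fetch of the ISBE pages, and each string would be posted rather than collected; the filtering threshold mirrors the $1,000 reporting trigger mentioned above.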
By Michael Keller and Cezary Podkul
The U.S. Treasury’s Daily Treasury Statement lists actual cash spending down to the million on everything the government spent money on each day, as well as how it funded the spending. But, the Treasury only releases these files in PDF or fixed-width text files like this one, making any analysis very difficult. To liberate the data and make it easy to analyze federal money flows across time, we created Treasury.IO. The system we built downloads and parses the fixed-width files into a standard schema, creating a SQLite database that can be directly queried via a URL endpoint.
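The slice-parse-load pattern the teaser describes can be sketched with nothing but the standard library. The column layout and sample rows below are hypothetical, not the Daily Treasury Statement’s real format, and the schema is simplified to two columns; the point is only the shape of the pipeline Treasury.IO automates.

```python
import sqlite3

# Hypothetical fixed-width layout: (column name, start, end) slices.
LAYOUT = [("account", 0, 30), ("amount_millions", 30, 40)]

# Toy sample rows, built to match the layout above.
SAMPLE_LINES = [
    "Defense Vendor Payments".ljust(30) + "1234".rjust(10),
    "Medicare".ljust(30) + "5678".rjust(10),
]

def parse_line(line):
    """Slice one fixed-width line into a dict keyed by column name."""
    return {name: line[start:end].strip() for name, start, end in LAYOUT}

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE dts (account TEXT, amount_millions INTEGER)")
for line in SAMPLE_LINES:
    row = parse_line(line)
    conn.execute("INSERT INTO dts VALUES (?, ?)",
                 (row["account"], int(row["amount_millions"])))

total = conn.execute("SELECT SUM(amount_millions) FROM dts").fetchone()[0]
```

Once the rows land in SQLite, questions about money flows become plain SQL queries, which is what makes exposing the database behind a URL endpoint practical.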
How We Made the (New) California Cookbook
By Megan Garvey, Erin Kissane, Lily Mihalik, and Anthony Pesce
At the Los Angeles Times, a design-editorial-programming team has resurrected the spirit of the beloved, out-of-print California Cookbook as a new website collecting hundreds of recipes from the Times Test Kitchen. In our Q&A, the project’s editor, designer, and lead programmer share their goals and challenges, and offer a peek at the site’s building blocks and planned future.
How We Made Lobbying Missouri
Lobbying Missouri is a collaboration between St. Louis Public Radio and members of NPR’s news apps teams. We spoke with three team members about the project, their design process, and the code under the hood.
All About Reporter
By Erin Kissane and Jeremy Singer-Vine
The Wall Street Journal’s Jeremy Singer-Vine recently released Reporter, an open source tool that makes it easy to hide and reveal the code behind common forms of data visualization presented on the web. We spoke with him about the tool’s makeup, design goals, and future development plan.
By Derek Eder and Forest Gregg
Introducing csvdedupe, an open source command line tool for de-duplication and entity resolution.
Chase Davis on fec-standardizer
By Chase Davis and Erin Kissane
Chase Davis breaks down his fec-standardizer project and explains where it’s going next.