Articles

Project walkthroughs, tool teardowns, interviews, and more.

Articles tagged: python

  1. What I Learned Recreating One Chart Using 24 Tools

    By Lisa Charlotte Rost

    Lessons learned from trying to create one chart with as many applications, libraries, and programming languages as possible.

  2. Peda(bot)gically Speaking: Teaching Computational and Data Journalism with Bots

    By Nicholas Diakopoulos

    Bots encapsulate how data and computing can work together in journalism. And when we use bots to teach concepts and skills in computational journalism, we’re actually teaching two kinds of thinking: editorial and computational.

  3. How I Investigated Uber Surge Pricing in D.C.

    By Jennifer A. Stark

    As part of my research with the Tow Center, I investigated the geographical and demographic data around how Uber works in D.C., to find out if its wait times varied by neighborhood (and, as a result, by demographic). Here’s how I did it.

  4. How La Nación Listened to 20,000 (Possibly Interesting) Audio Files

    By Juan Elosua and Francis Tseng

    With about 20,000 unlabeled audio files to classify, as part of a big breaking story, we created a process to help us focus on the files we actually needed.

  5. Inside the Wall Street Journal’s Prediction Calculator

    By Martin Burch

    We made an interactive built on a government agency’s method of predicting a person’s race and ethnicity based solely on their name and address. The response from our readers was, fittingly, a little unpredictable.

  6. Introducing Elex, a Tool to Make Election Coverage Better for Everyone

    By Jeremy Bowers and David Eads

    “End the elections arms race” has become a rallying cry in American data journalism. Many newsrooms spend tremendous resources writing code to simply load and parse election data. It’s time we stopped worrying about the plumbing and started competing on the interesting parts. We decided it was time we put some code against our beliefs – our contribution is a tool we’re calling Elex. And it needs your help, too.

  7. Introducing agate: a Better Data Analysis Library for Journalists

    By Christopher Groskopf

    Meet agate, a Python data analysis library optimized not for performance, but for the performance of the human who is using it. That means focusing on designing code that is easy to learn, readable, and flexible enough to handle any weird data you throw at it. Here’s why you should try it.

  8. Introducing broca

    By Francis Tseng

    Made at our recent code convening, broca creates a system for easier experimentation and implementation of natural language processing.

  9. Introducing Bedfellows

    By Nikolas Iubel

    The financial relationship between PAC contributors and recipients can be difficult to divine from the information reported to the FEC. Bedfellows is a new Python library based on a model developed at The Upshot for understanding those relationships via several different measures.

  10. A Botmaking Primer

    By Joseph Kokenge

    Not sure where to begin with this whole bot thing? Joseph Kokenge is here to help you get started with botmaking 101.

  11. To Scrape, Perchance to Tweet

    By Abe Epton

    At the Chicago Tribune, we had a simple goal: to automatically tweet contributions to Illinois politicians of $1,000 or more, which campaigns are required to report within five business days. To see, in something approximating real time, which campaigns are bringing in the big bucks and who those big-buck-bearers are. The Illinois State Board of Elections (ISBE) has helpfully published exactly this data for years online, in a format that appears to have changed very little since at least the mid-2000s. There’s no API for this data, but the stability of the format is encouraging. A scraper is hardly an ideal tool for anything intended to last for a while and produce public-facing data, but if we can count on the format of the page not to change much over at least the next several months, it’s probably worth it.

  12. Introducing Treasury.IO

    By Michael Keller and Cezary Podkul

    The U.S. Treasury’s Daily Treasury Statement lists actual cash spending, down to the million, on everything the government spent money on each day, as well as how it funded the spending. But the Treasury releases these files only as PDFs or fixed-width text, making any analysis very difficult. To liberate the data and make it easy to analyze federal money flows across time, we created Treasury.IO. The system we built downloads and parses the fixed-width files into a standard schema, creating a SQLite database that can be directly queried via a URL endpoint.

  13. How We Made the (New) California Cookbook

    By Megan Garvey, Erin Kissane, Lily Mihalik, and Anthony Pesce

    At the Los Angeles Times, a design-editorial-programming team has resurrected the spirit of the beloved, out-of-print California Cookbook as a new website collecting hundreds of recipes from the Times Test Kitchen. In our Q&A, the project’s editor, designer, and lead programmer share their goals and challenges, and offer a peek at the site’s building blocks and planned future.

  14. How We Made Lobbying Missouri

    By Danny DeBelius, Christopher Groskopf, Erin Kissane, and Matt Stiles

    Lobbying Missouri is a collaboration between St. Louis Public Radio and members of NPR’s news apps teams. We spoke with three team members about the project, their design process, and the code under the hood.

  15. All About Reporter

    By Erin Kissane and Jeremy Singer-Vine

    The Wall Street Journal’s Jeremy Singer-Vine recently released Reporter, an open source tool that makes it easy to hide and reveal the code behind common forms of data visualization presented on the web. We spoke with him about the tool’s makeup, design goals, and future development plan.

  16. Introducing csvdedupe

    By Derek Eder and Forest Gregg

    Introducing csvdedupe, an open source command line tool for de-duplication and entity resolution.

  17. Chase Davis on fec-standardizer

    By Chase Davis and Erin Kissane

    Chase Davis breaks down his fec-standardizer project and explains where it’s going next.
