Learning Convening Python Code natural language processing Introducing broca

Introducing broca

Made at our recent code convening, broca creates a system for easier experimentation and implementation of natural language processing.

Project campaign finance Python Introducing Bedfellows

Introducing Bedfellows

The financial relationship between PAC contributors and recipients can be difficult to divine from the information reported to the FEC. Bedfellows is a new Python library based on a model developed at The Upshot for understanding those relationships via several different measures.

How-to Python #botweek bots A Botmaking Primer

A Botmaking Primer

Not sure where to begin with this whole bot thing? Joseph Kokenge is here to help you get started with botmaking 101.

Project Python scraping Twitter data To Scrape, Perchance to Tweet

To Scrape, Perchance to Tweet

At the Chicago Tribune, we had a simple goal: to automatically tweet contributions of $1,000 or more to Illinois politicians, which campaigns are required to report within five business days. We wanted to see, in something approximating real time, which campaigns are bringing in the big bucks and who those big-buck-bearers are. The Illinois State Board of Elections (ISBE) has helpfully published exactly this data online for years, in a format that appears to have changed very little since at least the mid-2000s. There’s no API for this data, but the stability of the format is encouraging. A scraper is hardly an ideal tool for anything intended to last for a while and produce public-facing data, but if we can count on the format of the page not to change much over at least the next several months, it’s probably worth it.
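
As a rough illustration of the filtering step described above, the sketch below picks out contributions of $1,000 or more that have not been handled yet and formats a line of text for each. The record fields (`id`, `donor`, `recipient`, `amount`), the function names, and the sample data are all hypothetical; the actual scraper and the ISBE page layout are not shown here, and posting to Twitter is left out entirely.

```python
THRESHOLD = 1000  # dollars; the cutoff mentioned in the text


def big_contributions(records, seen_ids):
    """Yield records at or above THRESHOLD that have not been seen before."""
    for rec in records:
        if rec["amount"] >= THRESHOLD and rec["id"] not in seen_ids:
            seen_ids.add(rec["id"])  # remember it so a re-scrape does not repeat it
            yield rec


def format_tweet(rec):
    """Build the text for one contribution (hypothetical wording)."""
    return "{donor} gave ${amount:,.0f} to {recipient}".format(**rec)


# Made-up sample records standing in for scraped rows.
SAMPLE = [
    {"id": "a1", "donor": "Jane Doe", "recipient": "Example Committee", "amount": 2500},
    {"id": "a2", "donor": "John Roe", "recipient": "Example Committee", "amount": 250},
]

seen = set()
tweets = [format_tweet(r) for r in big_contributions(SAMPLE, seen)]
repeat = [format_tweet(r) for r in big_contributions(SAMPLE, seen)]
```

Tracking the identifiers already handled is one simple way to keep a scraper that runs repeatedly from announcing the same contribution twice.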

Project SQLite3 Treasury Python data Introducing Treasury.IO

Introducing Treasury.IO

The U.S. Treasury’s Daily Treasury Statement lists the government’s actual cash spending each day, down to the million dollars, along with how that spending was funded. But the Treasury releases these statements only as PDFs or fixed-width text files, which makes any analysis very difficult.

To liberate the data and make it easy to analyze federal money flows across time, we created Treasury.IO. The system we built downloads and parses the fixed-width files into a standard schema, creating a SQLite database that can be directly queried via a URL endpoint.
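
The general approach of slicing fixed-width text into columns and loading it into SQLite can be sketched as follows. This is a minimal illustration, not Treasury.IO's actual parser: the column layout, the `spending` table, and the sample lines are invented for the example, and the real Daily Treasury Statement files are considerably messier.

```python
import sqlite3

# Hypothetical layout: (column name, start offset, end offset).
LAYOUT = [("date", 0, 10), ("item", 10, 40), ("amount_millions", 40, 50)]


def parse_line(line):
    """Slice one fixed-width line into a dict keyed by column name."""
    row = {name: line[start:end].strip() for name, start, end in LAYOUT}
    # Amounts are assumed to be whole millions with thousands separators.
    row["amount_millions"] = int(row["amount_millions"].replace(",", ""))
    return row


def load(lines, conn):
    """Create the (hypothetical) table if needed and insert the parsed rows."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS spending "
        "(date TEXT, item TEXT, amount_millions INTEGER)"
    )
    rows = [parse_line(line) for line in lines if line.strip()]
    conn.executemany(
        "INSERT INTO spending VALUES (:date, :item, :amount_millions)", rows
    )
    conn.commit()


# Made-up sample lines padded to match LAYOUT.
SAMPLE = [
    "2013-06-03" + "Example program A".ljust(30) + "1,250".rjust(10),
    "2013-06-03" + "Example program B".ljust(30) + "75".rjust(10),
    "2013-06-04" + "Example program A".ljust(30) + "300".rjust(10),
]

conn = sqlite3.connect(":memory:")
load(SAMPLE, conn)
total = conn.execute("SELECT SUM(amount_millions) FROM spending").fetchone()[0]
first = parse_line(SAMPLE[0])
```

Once the rows share a standard schema, questions like "how much was spent on one item over time" become ordinary SQL queries, which is what makes exposing the database through a query endpoint practical.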

Project recipes natural language processing Python Django Varnish NLTK edge side includes Armstrong ESI How We Made the (New) California Cookbook

How We Made the (New) California Cookbook

At the Los Angeles Times, a design-editorial-programming team has resurrected the spirit of the beloved, out-of-print California Cookbook as a new website collecting hundreds of recipes from the Times Test Kitchen. In our Q&A, the project’s editor, designer, and lead programmer share their goals and challenges, and offer a peek at the site’s building blocks and planned future.

Project finance Python politics flask MapBox Peewee How We Made Lobbying Missouri

How We Made Lobbying Missouri

Lobbying Missouri is a collaboration between St. Louis Public Radio and members of NPR’s news apps teams. We spoke with three team members about the project, their design process, and the code under the hood.

Project Ruby Python analysis Jekyll data All About Reporter

All About Reporter

The Wall Street Journal’s Jeremy Singer-Vine recently released Reporter, an open source tool that makes it easy to hide and reveal the code behind common forms of data visualization presented on the web. We spoke with him about the tool’s makeup, design goals, and future development plan.

Tool CSV Python de-duplication dedupe Introducing csvdedupe

Introducing csvdedupe

Introducing csvdedupe, an open source command line tool for de-duplication and entity resolution.

Roundup events Event Roundup, June 3

Event Roundup, June 3

Two local ONA chapter meetups this week, and later this June, join Knight-Mozilla OpenNews at MIT.

Roundup events Event Roundup, Apr 22

Event Roundup, Apr 22

Journalists gather in Italy this week, while Hacks/Hackers chapters hold meetups on balloon mapping and HTML 5, plus a cryptoparty.

Learning data Olympics Lessons: Data Journalists, Meet Your Audience

Olympics Lessons: Data Journalists, Meet Your Audience

The NYT’s Tiff Fehr on figuring out what Olympics fans expected and how her team made them happy.

Roundup events Event Roundup, Mar 18

Event Roundup, Mar 18

The Knight News Challenge deadline is Tuesday at 5pm Eastern Daylight Time. (Due to technical difficulties, the deadline has been extended one day.)

Roundup events Event Roundup, Mar 4

Event Roundup, Mar 4

Conference season is gearing up: Last week NICAR, this week SXSW Interactive.

Tool documentation style validation design ProPublica’s News Apps Guides

ProPublica’s News Apps Guides

Yesterday morning, the ProPublica apps team released a series of documents outlining their coding philosophy, app design and development practices, data validation techniques, and more. We spoke with Scott Klein about how his team’s processes evolved and how they made the time to document it all.

Interview dataviz illustration Meet Mariana Santos

Meet Mariana Santos

The first in a series of interviews with Knight International Journalism Fellows.

Roundup events Event Roundup, Feb 25

Event Roundup, Feb 25

This week, the National Institute for Computer-Assisted Reporting Conference hits Louisville, Kentucky.

Roundup events Event Roundup, Jan 28

Event Roundup, Jan 28

The Google Journalism Fellowship deadline is this week. This weekend brings learning with Code with me and hacking with the Sunlight Foundation and Digital Democracy.

Project campaign finance Python Django machine learning Chase Davis on fec-standardizer

Chase Davis on fec-standardizer

Chase Davis breaks down his fec-standardizer project and explains where it’s going next.

Roundup events Event Roundup, Jan 7

Event Roundup, Jan 7

New year and lots of event planning underway. Plus, we’re entering awards entry season: the IRE deadline is this Friday.