Articles

Project walkthroughs, tool teardowns, interviews, and more.

Articles tagged: data

  1. Marriage Data: It’s Complicated

    By D’Vera Cohn

    D’Vera Cohn on everything you ever wanted to know about marriage data, but were afraid to ask.

  2. Everything You Ever Wanted to Know About Elections Scraping

    By Jeremy B. Merrill and Ken Schwencke

    Jeremy Merrill and Ken Schwencke explore the fine art of anticipating and catching errors while wrangling the eccentricities of US elections data.

  3. The Census of Governments Has Your Number

    By Mike Maciag

    Michael Maciag’s walk-through of this under-utilized goldmine.

  4. Finding Stories in Census Data

    By Emily Alpert Reyes

    Emily Alpert Reyes on how to find promising needles in Census haystacks.

  5. Gender, Twitter, and the Value of Taking Things Apart

    By Jacob Harris

    Jake Harris reverse-engineers Twee-Q to evaluate its use of data (and see if his ratio is as disappointing as Twee-Q says it is).

  6. From the BBC News Labs: Datastringer

    By Basile Simon

    Basile Simon walks through building a new tool that aims to help reporters cover beats, prompted by work by Knight-Mozilla Fellows and a presentation at Hacks/Hackers London.

  7. When and How to Use Census Microdata

    By Robert Gebeloff

    Robert Gebeloff’s primer on working microdata magic.

  8. Comparing the Net Cost of College

    By Soo Oh, Erika Owens, and Beckie Supiano

    The Chronicle of Higher Education set out to compare net cost of colleges and found an unexpected discrepancy. The team describes the piece they created to help explain the difficulty in comparing net costs.

  9. Covering the European Elections with Linked Data

    By Basile Simon

    The BBC News Labs team explores ways of exposing linked data in public-facing election coverage, and encounters some interesting challenges.

  10. Pushing Hot Buttons with Census.gov

    By Ronald Campbell

    Ronald Campbell on using census data to find facts in a world of speculation.

  11. Distrust Your Data

    By Jacob Harris

    Jacob Harris on six ways to make mistakes with data—and how to avoid them.

  12. How to Use the Census Bureau’s American Community Survey like a Pro

    By Paul Overberg

    Paul Overberg explains base tables and how to get the best data from them (hint: ask good questions!).

  13. Newsroom Analytics: A Primer

    By Jacqui Maher

    Jacqui Maher says it’s not just the numbers, it’s what they mean about the audience.

  14. How We Made the SOTU Twitter Visualization

    By Nicolas Belmonte and Simon Rogers

    People tweet what they think, when they think it—and, crucially, we wanted to provide a visualization for the State of the Union speech which reflected that. This wouldn’t be a (shudder) word cloud based on frequencies but a way to track the conversation on Twitter as it was directly influenced by the President’s speech.

  15. You Got the Documents. Now What?

    By Jonathan Stray

    Jonathan Stray’s guide to turning documents into data you can run with.

  16. Human-Assisted Reporting Gets the Story

    By Tyler Dukes

    Tyler Dukes on combining the power of data-sorting tools with old-fashioned digging.

  17. How to Make a News App in Two Days

    By Al Shaw

    As part of the orientation week for the 2014 class of Knight-Mozilla OpenNews Fellows, fellow nerd-cuber Mike Tigas and I led a hackathon at Mozilla’s headquarters in San Francisco…

  18. To Scrape, Perchance to Tweet

    By Abe Epton

    At the Chicago Tribune, we had a simple goal: to automatically tweet contributions to Illinois politicians of $1,000 or more, which campaigns are required to report within five business days. To see, in something approximating real time, which campaigns are bringing in the big bucks and who those big-buck-bearers are. The Illinois State Board of Elections (ISBE) has helpfully published exactly this data for years online, in a format that appears to have changed very little since at least the mid-2000s. There’s no API for this data, but the stability of the format is encouraging. A scraper is hardly an ideal tool for anything intended to last for a while and produce public-facing data, but if we can count on the format of the page not to change much over at least the next several months, it’s probably worth it.

  19. Introducing Treasury.IO

    By Michael Keller and Cezary Podkul

    The U.S. Treasury’s Daily Treasury Statement lists actual cash spending, down to the million, on everything the government spent money on each day, as well as how it funded the spending. But the Treasury releases these files only as PDFs or fixed-width text files, making any analysis very difficult. To liberate the data and make it easy to analyze federal money flows across time, we created Treasury.IO. The system we built downloads and parses the fixed-width files into a standard schema, creating a SQLite database that can be directly queried via a URL endpoint.

  20. Open Your Data

    By Waldo Jaquith

    Waldo Jaquith on the whys and wherefores of making it open.
