The HuffPost Predict-o-Tron is a tool we built to let people make their own March Madness bracket predictions using basketball statistics, expert ratings, and results from the past four tournaments.
There are some interesting tidbits to be found in the data, although they all come with a caveat: model performance is based on only four years of data, which leaves us at risk of overfitting. Slider combinations that appear to do very well for the past four years may not perform as well when tested against the past 10 years. With that said, it looks like the experts are very good at picking a bracket, taller teams tend to do better than shorter teams, younger teams do better than older teams, and teams with more depth (both in scoring and playing time) do better than teams with less depth.
People tweet what they think, when they think it—and, crucially, we wanted to provide a visualization for the State of the Union speech which reflected that. This wouldn’t be a (shudder) word cloud based on frequencies but a way to track the conversation on Twitter as it was directly influenced by the President’s speech.
Hacks/Hackers groups are meeting up in NYC, London, and Berlin this week. Plus, the deadline for submissions to the Knight News Challenge, which asks: “How can we strengthen the Internet for free expression and innovation?”
Jonathan Stray’s guide to turning documents into data you can run with.
I was asked to join BBC News Labs a couple of weeks ago to work on a project that, when it was first briefly explained to me by email, left me with no idea what it was about. (Imagine the discomfort before my job interview with Matt Shearer, Innovation Manager at the Lab.)
The project is called #newsVane—and yes, we refer to it with the hash sign every time, don’t ask me why.
We see a moment coming when the collection of endless streams of data is commonplace. As this transition accelerates, it is becoming increasingly apparent that our existing toolset for dealing with streams of data is lacking. Over the last 20 years we have invested heavily in tools that deal with tabulated data, from Excel, MySQL, and MATLAB to Hadoop, R, and Python+NumPy. These tools, when faced with a stream of never-ending data, fall short and diminish our creative potential.
In response to this shortfall we have created streamtools—a new, open source project by the New York Times R&D Lab which provides a general purpose, graphical tool for dealing with streams of data. It offers a vocabulary of operations that can be connected together to create live data processing systems without the need for programming or complicated infrastructure. These systems are assembled using a visual interface that affords both immediate understanding and live manipulation of the system.
Talking open data this weekend in Montevideo, plus lots of Hacks/Hackers events around the world this month.
Eva Constantaras on training data journalists where data journalism isn’t a standard practice.
In the year-and-a-bit we’ve been publishing Source, we’ve built up a solid archive of project walkthroughs, introductions to new tools and libraries, and case studies. They’re all tagged and searchable, but as with most archives presented primarily in reverse-chron order, pieces tend to attract less attention once they fall off the first page of a given section. We’ve also been keeping an eye out for ways of inviting in readers who haven’t been following along since we started Source, and who may be a little newer to journalism code—either to the “code” or the “journalism” part.
Tyler Dukes on combining the power of data-sorting tools with old-fashioned digging.
If you want to show information with a geographical component, you should start with a map, right? Not so fast, writes Tasneem Raja. Questioning your assumptions can help you make something much more effective.
The NICAR conference is this week, but the news nerd fun doesn’t end there.
The Reuters Graphics team’s unusual Fed interactive grabbed our attention when it appeared late last month and sparked some interesting conversations on Twitter. Reuters Global Head of Graphics Maryanne Murray and Interactive Data Designer Charlie Szymanski kindly wrote up their rationale and process for us.
Today, we’re launching Source Jobs, a new place to list jobs for the newsroom designers and developers already populating our Community section—and for the curious developers and designers who don’t yet realize that their future lies in journalism. As the global journalism-code community continues to grow, our goal is to offer a simple, scalable listings service that newsrooms can edit on their own.
The Source roundup returns with biweekly summaries of notable interactive features, news apps, data work, and newsroom code commentary.
Christopher Groskopf’s tricks for going to the office without going to the office.
This week, vote on NICAR lightning talks and pitch ideas to Tribeca Hacks.
Vox Media’s VP of technology breaks down the hard and frequently messy lessons his organization has learned about clearing the way for successful tech collaboration in newsrooms.
Send your NICAR lightning talk proposals now and check out ONA and Hacks/Hackers meetups this week.