How We Made ‘Homan Square: A Portrait of Chicago’s Detainees’

A Q+A with The Guardian U.S. interactive team

On October 19, the Guardian published Homan Square: A Portrait of Chicago’s Detainees as a part of its ongoing investigation into the Chicago Police Department’s alleged abuses of detainee rights at a warehouse facility on Chicago’s west side. We spoke with the Guardian interactive team responsible for the interactive feature, both in their NYC offices and via email. The following interview has been edited for length and clarity.

The Story & the Data

Q. Can you each give us a quick introduction to your role on the interactives team and on this project?

The Guardian US interactive team is Kenan Davis, Rich Harris, Nadja Popovich, and Kenton Powell. We’re all editors/designers/developers and we each wear many hats depending on the day. For this piece, we all worked with the rest of the newsroom on shaping and refining the contours of the project, analyzing the data, and designing and developing the interactive.

Q. How and why did this project get started?

Guardian reporter Spencer Ackerman broke the initial Homan Square story back in February and shortly thereafter, the Guardian filed suit against the Chicago Police Department seeking records of individuals taken to the facility. It finally became clear over the summer that the CPD would be compelled to release some of this data.

There were a lot of records to parse (over 3,500 in the first batch of disclosures, and another 3,500+ in the second batch), so we helped make sense of the data dump. For the first story, we spent most of our time doing data analysis and producing a few graphics. After it became clear another several thousand records were forthcoming, our editors thought it was important to do a bigger, explanatory interactive that illustrated the full scale of detentions at the facility.

Q. How does this piece fit into your overall reporting on Homan Square?

Spencer Ackerman, Zach Stafford, and other Guardian reporters have been reporting on Homan Square and corruption in Chicago’s police department since the beginning of this year. Our interactive piece serves as a primer on what we know about the scale of detentions at the facility.

Q. Do community organizations who’ve been observing Chicago policing know about the piece?

Throughout the Guardian’s reporting on Homan Square, our reporters have continually interviewed representatives of groups from the Chicago chapter of Black Lives Matter to the NAACP, from city council members to members of Congress. So stakeholders are very aware of this ongoing project, and we’ve seen them sharing it online and discussing since it came out.

Q. Tell us about actually getting your hands on this data, from CPD.

The data results from an ongoing lawsuit. We sued for all detention records, but the CPD haven’t given us everything. Right now, they have released two batches of records of people arrested and detained at Homan Square between 2004 and 2015, totaling 7,185 arrests. But they haven’t proffered any information about people detained and not ultimately charged, nor any records from before 2004, which is why the lawsuit is ongoing.

The police gave us the current records in two batches over the summer, half in July and half in August.

Q. What’s the story with the second disclosure?

It’s just more: it’s over 3,500 records in addition to that first tranche. It’s very similar in terms of the format of the data we got. They actually gave us race data, which they didn’t give us at first, so we’d had to find it. We had people picking out 3,000 records and writing them down by hand. There was a team of researchers painstakingly [researching], literally in the Cook County Records Office, because that’s the only way we were able to get it.

There were actually three sets [of data] that we combined: There was one initial set of one-sheets that were copies of original arrest records, and then subsequent records for both sets of around 3,500 were tabular-style data—we originally thought we might get the original arrest records for everything, and then subsequently it ended up being quasi-machine-readable. The first tranche was not machine readable at all, it was a lot of OCR and manual data entry, and quite unpleasant.

Why there were two [disclosures]: they told us that the complete set was the first 3,500, and then right around the time we were finishing that first story, they said there would be more. Later they said there would be that many more again, and we were like…oh. But these were just people who were detained and charged.

Q. So you’re still waiting—

For people who were detained but not charged.

Q. And it’s TBD what’s going to happen with that?

Right. And that’s the tough thing about the story, is that those people are potentially even more interesting in some ways. But there’s been pretty significant resistance on [the CPD’s] part, I believe.

Q. Is there any rationale—that you can talk about—that they’ve offered?

There have been some talking points from the CPD—the “fact sheet.”

In some cases they did the classic thing that happens when you try to FOIA data and said it would be overly burdensome to produce. So we requested things in separate tranches: they gave us the digitized data, and the rest is a work in progress.

The burdensome comment was about any data prior to 2004. They’ve had the warehouse since 1995, and the police moved in shortly after. And they said that none of those records are digitized, prior to 2004, and that’s why we only got them from August, 2004 on.

Q. You send people into Chicago, or you have people stationed there?

Zach Stafford is there. And then we also send people in and out, and we have a couple of freelancers. And Spencer takes trips there.

Process & Design

Q. How did you arrive at this interface? How did the idea of people being “disappeared” factor into your design choices, and what were your inspirations?

Because this is such a complex story, we wanted to walk readers through what we know about Homan Square step-by-step—but using the more natural act of scrolling, rather than clicking.

We also decided early on that we wanted to overwhelm the reader with the sheer quantity of arrests, so the idea of rendering a “cloud” of documents—representing arrest records—made sense. But we soon realized a document-centric visual representation wouldn’t work, both conceptually and due to data constraints. (For one, we don’t let readers explore/read the arrest records because we’re not revealing details of individual cases).

For us, it was important to remind people that numbers are about human stories and that pixels in data visualizations often represent people, so we chose to represent the arrestees with more human icons. From there, the idea of rearranging these icons into different visualizations felt incredibly natural.

Our influences include scroller interactives like Fewer Helmets, More Deaths from the New York Times and A Really Small Slice of Americans Get to Decide Who Will Rule the Senate from Bloomberg. We think persistent visualizations that adapt to where you are in the story can be really powerful ways of helping readers understand the numbers, compared to embedding a series of discrete charts.

Q. Can you say more about the decision not to disclose names/faces without explicit permission?

This was one of the first decisions we made when we began to examine the data. Homan Square has a reputation as a place where police pressure people into becoming informants. It would have been irresponsible to compromise the identities of these individuals. The data we obtained also reveals sensitive details about the arrests of over 7,000 individuals. These, however, are just arrest records, not court proceedings, and don’t include the final disposition of any arrests.

When we stop to tell the story of a named person, it’s because our reporters talked to them and got permission to tell their story. Mostly, this story is about a questionable police practice, not about the arrestees, and including arrest details for all 7,185 arrests simply wasn’t necessary to tell this story, and could have put people at risk.

Q. What were your major design challenges?

I would say our first challenge was figuring out how we were going to make records that we had to keep anonymous more human. Our first idea was having [visuals] look like the police one-sheeter records. But having something that looked more like a paper than a person wasn’t what we wanted. Though we weren’t using photographs of the arrestees, we still wanted to humanize them, so we hired an illustrator, Oliver Munday, to make a series of abstract portraits to represent the people held at Homan Square.

Q. The kinetics on this project are beautiful. What was your process like, for dialing in the transition animations?

The four of us all work very closely together and we all sketched out the bones of the project. There’s little individual ownership, and we all adjust the design as we go, fixing or noting issues as we encounter them. As our publication date loomed, we gathered around a screen to tweak and finalize the visual design and transitions. And this is just how we usually work, we’ll edit each other and build on each other’s ideas collectively—this time we gave ourselves enough time to do even the smallest things, so that was nice.

Design-wise, what usually happens is that one of us will come up with something a little… “out there,” and the rest of the team imposes a bit of restraint, so there’s a gradual process of refinement. At one point we had lots of colors, at one point we had the camera doing crazy things, looking at the documents from stupid angles. And then we were like…no, no stupid angles. [We’re] trying to allow taste to triumph over possibility. And that’s where our team dynamic comes into play, because we work in this very collaborative, egalitarian way, none of us is shy about telling the others “That’s ridiculous, stop doing that.”

There have been multiple generations of our team, but the culture now is very egalitarian, and there’s shared ownership, and you can be more critical of something you feel is your own work. It’s not like you’re being critical of a team or of other people’s work, we’re being self-critical. We all do that, and respect that. I think it’s a really important part of our process.

Q. What was your project timeline like?

We worked on it on and off while juggling a few other projects. In all, it took about four weeks.

There was one big push during the initial design phase, and then this later push. We started thinking it through with the first 3,500 [records]; we had an idea that we would do something like this, though it wasn’t fully formed. The first real question was whether we could get the interactive done in time for the first article; we decided that we couldn’t, and that it was important to publish that article as soon as possible.

Q. And after you got all the data, how long did it take you to produce the interactive?

Two weeks, [in addition to] the two weeks up front.

Q. Did the project change between those pushes? Did the arc change?

The story structure changed. It was always going to be explainery, but the question was, “Can we weave more of the actual narrative, Spencer’s story, into the interactive? Can they be one and the same?” At a certain point we decided that it would be better to have [the interactive] be more succinct.

We always had this idea that the visual form would be something like what we ended up with. It was always going to be a scroll-driven thing with documents flying around the screen rearranging themselves into visualizations. So even though the story changed, we agreed on [the format] fairly early. And Spencer’s story changed a lot, too, so that kept the two pieces converging and diverging.

Q. How do you-all work with a reporter on a story like this? Is it just constant, informal back-and-forth?

We made a layout for them, and then we had the two reporters go in there and add their own text in and edit all of the stuff. It’s not that complex a piece—you have the big numbers and the little sections, so it wasn’t that hard for them to understand it and to run with that concept, and I think it worked really well that way. You do have to build out quite a bit before you can show them what you’re doing with something like this because we start being like, “Well there’s documents falling and there’s gonna be numbers!” But once we were able to show them with mock text, the lightbulb went on and they were able to shape it.

One of the things that helped was that we started referring to the text that accompanied our piece as a “script.” We tried early on to say, “There are a number of goalposts, and we can rearrange the order in which we do that, and we can say whatever we want around those things, but we’re going to find a way to get from here to here to here to here.” We referred to them as scenes, and to the script, and I think that’s important: it’s not a narrative or a story. And I think that helped [the reporters] break their normal ideas about how these things come together.

Q. Where did that language about scenes and scripts come from, for your team?

It may have been the technical implementation, because you need to be able to think of things as discrete moments in the code—otherwise, the code has no structure. And so you step back from that and know that your content is going to have to follow roughly the same form. I think that helped avoid the classic thing of “Here’s a story and now we need to bolt the visual moments onto that,” which is a very easy trap to fall into, and we were quite determined not to do that from the very beginning. And I think—I hope—that we mostly succeeded.
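To make the idea of scenes as discrete moments in the code concrete, here is a minimal TypeScript sketch. It is not the Guardian team’s code: the `Scene` interface, the three example layouts, and the `locate` and `positionAt` helpers are all hypothetical names invented for illustration. It assumes each scene can compute a target position for every item, and that scroll progress is available as a fraction between 0 and 1.

```typescript
// Hypothetical sketch: each "scene" is a discrete moment with its own layout.
interface Point {
  x: number;
  y: number;
}

interface Scene {
  name: string;
  // Where item `index` (out of `total`) sits while this scene is on screen.
  layout: (index: number, total: number) => Point;
}

// Three illustrative scenes; the real piece has its own set.
const scenes: Scene[] = [
  { name: "origin", layout: () => ({ x: 0, y: 0 }) },
  { name: "grid", layout: (i) => ({ x: i % 100, y: Math.floor(i / 100) }) },
  { name: "column", layout: (i) => ({ x: 0, y: i }) },
];

// Turn a scroll fraction into "between scene `from` and scene `to`, `t` of the way".
// Assumes at least two scenes.
function locate(scroll: number, sceneCount: number): { from: number; to: number; t: number } {
  const clamped = Math.min(Math.max(scroll, 0), 1);
  const scaled = clamped * (sceneCount - 1);
  const from = Math.min(Math.floor(scaled), sceneCount - 2);
  return { from, to: from + 1, t: scaled - from };
}

// Position of one item at any scroll fraction: a linear blend of two scene layouts.
function positionAt(scroll: number, index: number, total: number): Point {
  const { from, to, t } = locate(scroll, scenes.length);
  const a = scenes[from].layout(index, total);
  const b = scenes[to].layout(index, total);
  return { x: a.x + (b.x - a.x) * t, y: a.y + (b.y - a.y) * t };
}

console.log(positionAt(0.25, 205, 7185));
```

Because every position is derived from the scroll fraction alone, the text and the visuals can both be organized around the same list of scenes, and reordering scenes means reordering one array.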

Underlying Technologies

Q. Tell us about the tech behind the interactive. Is it WebGL?

It is. It’s Three.js—that’s the backbone.

Q. But it’s so light!


Q. Is it not? It performed really well for me!

Thank you!

Q. So why WebGL and not another approach, like pre-rendered videos, for example?

Pre-rendered videos hobble you in all sorts of ways: you have to sacrifice quality for file size, or vice-versa. You’re locked to a particular aspect ratio (in some slides, e.g. the intro “cloud” and the map, we extend the view beyond the box that “frames” the data visualisation itself). And everything has to be strictly linear—you can’t smoothly transition from slide 3 to slide 7. It also becomes much, much harder to iterate, because you have to re-render and redeploy any changes before you can actually test the updated clips, and small changes such as tweaking the icons mean you have to trash everything.

Doing it programmatically—“blindly manipulating symbols,” to quote Bret Victor—is certainly more complex than using (say) After Effects, but in terms of what we’re able to achieve, there’s no contest, particularly given the skill sets on our team.

We always planned to use WebGL, because we knew that we would be dealing with thousands of items onscreen at once. SVG just can’t cope with that many objects, particularly on mobile. Even with WebGL, we had to do a ton of optimization. For example, the 7,185 squares you see are actually a single piece of geometry, which means that Three.js only needs to make one “draw call” for each frame. The geometry (all the vertices, faces, “normals,” and texture coordinates) is encoded as “array buffers,” a wonderful JavaScript feature that allowed us to interact with memory at a binary level, the same way a game developer might if they were working in a low-level language like C++. Initially, we’d taken a more naive approach, and mobile devices struggled to render the animations smoothly.
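As a rough illustration of the single-geometry idea, the TypeScript sketch below packs every square into one `Float32Array` (a typed-array view over an array buffer). It is a sketch under assumptions, not the team’s implementation: the `buildSquares` function, the grid layout, and the `size` and `gap` parameters are hypothetical, and it covers vertex positions only, not normals or texture coordinates.

```typescript
// Hypothetical sketch: all squares live in ONE flat position buffer.
const FLOATS_PER_VERTEX = 3; // x, y, z
const VERTICES_PER_SQUARE = 6; // two triangles, no index buffer

function buildSquares(count: number, columns: number, size = 1, gap = 0.25): Float32Array {
  const positions = new Float32Array(count * VERTICES_PER_SQUARE * FLOATS_PER_VERTEX);
  const step = size + gap;

  for (let i = 0; i < count; i++) {
    // Lower-left and upper-right corners of square i on a simple grid.
    const x0 = (i % columns) * step;
    const y0 = Math.floor(i / columns) * step;
    const x1 = x0 + size;
    const y1 = y0 + size;

    // Two counter-clockwise triangles covering the square, as (x, y) pairs.
    const corners = [x0, y0, x1, y0, x1, y1, x0, y0, x1, y1, x0, y1];

    let offset = i * VERTICES_PER_SQUARE * FLOATS_PER_VERTEX;
    for (let v = 0; v < VERTICES_PER_SQUARE; v++) {
      positions[offset++] = corners[v * 2];
      positions[offset++] = corners[v * 2 + 1];
      positions[offset++] = 0; // every square sits on the z = 0 plane
    }
  }
  return positions;
}

const positions = buildSquares(7185, 100);
console.log(positions.length, positions.byteLength);
```

In Three.js, an array like this could be wrapped in a `BufferAttribute` on a `BufferGeometry` and drawn as one mesh, so the renderer issues a single draw call instead of one per square.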

We didn’t really have any experience using WebGL—we’d used it in one project before, in 2014 for making a spinning-globe thing, and we’ve used Mapbox GL on a few projects, but this was the first time we’ve used WebGL in this context, so there was a good week or so of…“Is this possible? Is this going to work?” And figuring out how to bend the technology to our will.

Bending the technology to our will was a really interesting thing. One of the things that bothers me about WebGL is that you can’t get crisp, sharp type. There is some cool trickery for our labels: Rich [Harris] figured out how to glue on an annotation layer—you can put an SVG in there, you can put in HTML—and figured out coordinates of WebGL, so we can coordinate those two spaces. This was great, because we could use SVG for what SVG is really good for—which is amazing, sharp type on Retina displays, and then use WebGL for performance.
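Coordinating the two spaces comes down to projecting a point from the 3D scene into pixel coordinates, so an SVG or HTML label can be placed on top of it. The TypeScript sketch below shows that math in isolation. It is an assumption about the general approach rather than Rich Harris’s actual code: `projectToPixels` and the `Mat4` type are hypothetical names, and the matrix is assumed to be a combined view-projection matrix in WebGL’s column-major order.

```typescript
// 16 numbers in column-major order, as WebGL expects.
type Mat4 = readonly number[];

// Project a 3D scene point to pixel coordinates for an overlay (SVG or HTML) layer.
function projectToPixels(
  point: [number, number, number],
  viewProjection: Mat4,
  width: number,
  height: number
): { x: number; y: number } {
  const [x, y, z] = point;
  const m = viewProjection;

  // Multiply the matrix by (x, y, z, 1) to get clip-space coordinates.
  const clipX = m[0] * x + m[4] * y + m[8] * z + m[12];
  const clipY = m[1] * x + m[5] * y + m[9] * z + m[13];
  const clipW = m[3] * x + m[7] * y + m[11] * z + m[15];

  // Perspective divide gives normalized device coordinates in [-1, 1].
  const ndcX = clipX / clipW;
  const ndcY = clipY / clipW;

  // WebGL's y axis points up; SVG and HTML's y axis points down, so flip it.
  return {
    x: ((ndcX + 1) / 2) * width,
    y: ((1 - ndcY) / 2) * height,
  };
}

const identity: Mat4 = [1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1];
console.log(projectToPixels([0, 0, 0], identity, 800, 600));
```

Three.js exposes the same projection through `Vector3.project(camera)`, which returns normalized device coordinates; the final scale-and-flip step is what lines the label up with the canvas.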

Q. The piece really does seem super lightweight (2MB?) for the level of interaction. With performance such a discussion point now, how much did that factor in?

It’s actually even less [than 2 MB], including the site CSS, JavaScript and fonts, and by far the biggest chunk is the map of Chicago (and we load a smaller version on smaller screens).

Performance and weight are things we’re always very conscious of—we test on real mobile devices early on during development so that we’re not scrambling to remove features at the last minute. That’s something we learned the hard way: on one occasion, we produced a series of interactive embeds for a long story, and the combination of the embeds plus the article, images, site JavaScript and ad code was too much for the webview inside our mobile app to handle on older phones. So even though we’re trying to push the technology as far as it’ll go, we’re pretty conservative about what we can expect in terms of bandwidth and processor power. In practical terms, that means loading assets lazily where possible, and building workflows that make it easy to do the right thing—for example, we created Rollup, which generates much smaller JavaScript bundles than competing tools.

We learned a huge amount, doing this [project], about how to structure code in this context. If you work with the DOM, you work with this persistent structure, and you are manipulating it to reflect the current state of the interactive, and that’s not true with WebGL—you’re just rendering a ton of pixels over and over again. So by virtue of how that works, and how you don’t have things like events, it forces a very different sort of programming model, which none of us was really familiar with.

So we had to figure that out, and we had to figure out how to maximize the benefit of getting the GPU to do all this work for you. Because it turns out (disappointingly!) that it’s not enough to say “We’re going to use Three.js and we’re going to have 7,000 squares in our Three.js scene instead of 7,000 rectangles in an SVG plane.” You don’t get any benefit from that; you have to figure out how to pass the information to the GPU in such a way that it can do the heavy lifting for you. So the whole thing was eye-opening for us, and if we were to do this project again, it would be better code and we would take less time.
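One common way to let the GPU do the heavy lifting is to upload both the start and end position of every vertex once, then animate by changing a single number per frame. The TypeScript sketch below illustrates that pattern; it is a guess at the general technique, not the shader the team shipped. The `targetPosition` attribute and `progress` uniform are hypothetical names, and the shader assumes Three.js’s `ShaderMaterial`, which supplies `position`, `projectionMatrix`, and `modelViewMatrix` automatically.

```typescript
// Hypothetical vertex shader: the GPU blends two layouts for every vertex in parallel.
const vertexShader = `
  attribute vec3 targetPosition; // where this vertex ends up (uploaded once)
  uniform float progress;        // 0 = current layout, 1 = target layout

  void main() {
    vec3 p = mix(position, targetPosition, progress);
    gl_Position = projectionMatrix * modelViewMatrix * vec4(p, 1.0);
  }
`;

// CPU equivalent of GLSL's mix(), shown only to make the math explicit.
function mix(a: number, b: number, t: number): number {
  return a * (1 - t) + b * t;
}

// The slow alternative: recompute every coordinate in JavaScript on each frame
// and re-upload the whole buffer to the GPU.
function interpolateOnCpu(from: Float32Array, to: Float32Array, t: number): Float32Array {
  if (from.length !== to.length) {
    throw new Error("layouts must have the same number of coordinates");
  }
  const out = new Float32Array(from.length);
  for (let i = 0; i < from.length; i++) {
    out[i] = mix(from[i], to[i], t);
  }
  return out;
}

console.log(interpolateOnCpu(new Float32Array([0, 0, 0]), new Float32Array([2, 4, 6]), 0.5));
```

With the shader version, JavaScript only updates `progress` each frame, so the per-frame cost no longer depends on how many squares are on screen.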

It’s quite a heavy lift to work in a space that’s quite unfamiliar. I think we’ve all been burned by our prior experiences of using certain features of the web. Everyone who works on news apps everywhere. It’s like “Oh, you’re doing that? You’re going to have a bad time, the performance is going to suck.”

It’s like electrical fences: After a cow’s been shocked a couple of times, the cow no longer goes to the electrical fence, they’ve learned to stay away from the electrical fence, and so we’ve all kind of learned, “Don’t use those features, don’t try to render that many things at once.” The technology is changing, and some of the fences are no longer electric. You need to keep trying to find out how to break out of this field and go and eat the green grass over there. (Sorry, this is a terrible metaphor.)

But we’re finding that you can do things now that were incredibly difficult, if not impossible, two years ago. The boundaries of what’s possible, storytelling-wise, with web technologies are changing. And we need to keep trying to break outside of our comfort zones, picking these new things up.

We did target specific webviews inside apps a little bit differently, because scroll events are handled in a very peculiar way, and it’s a very scroll-intensive piece. But it was performance too—in our app, webview was not able to handle any of the transitions, so it falls back to PNGs, inside the app.

But that also comes with the territory. At one point [in testing] we thought, “Oh my god! It’s terrible in Firefox!” and then we realized it’s not terrible in Firefox, it’s terrible if you’re using it in Firefox and in Chrome. For whatever reason, if you’re using it in two browsers [at once] and they’re both trying to access the graphics card, it’s awful! But if you’re just using it in Firefox or you’re just using it in Chrome, it’s fine.

Q. I’m going to go home and pull it up in four browsers.

It’ll be terrible, you should try it out! It’s these weird things—we encountered all these weird things, and again, I think it’s just the maturity of the technology. I mean, people have done weirder things with HTML and SVG and CSS, so that stuff is a little better. But when you use WebGL you’re like “Oooooh it’s weird.” It’s intended for games and people [in newsrooms] don’t typically do games, so some of that stuff you’re not likely to ever figure out. You just have to work through it. It’s just these weird things, like with the two browsers—but that’s not a use case that needed to be supported.

Q. I will be your only user running it on four browsers.

Yeah, @ us [on Twitter]: “It DOESN’T work on four browsers. Sigh.”

Q. And are the viz tools reusable? This simple way of displaying numbers seems very applicable to a range of visualizations.

The idea is certainly reusable; the code probably less so. Working with WebGL is very different than working with the DOM, and it took us a long time to figure out how to structure the code for this particular project in a sensible way. There were a lot of dead ends, and the codebase reflects that. If we were to do something similar in future, we’d probably start from scratch, using the lessons we learned on this project.

Q. What was really interesting or challenging or hard about this piece that I haven’t asked you about?

I think it’s important for all of us to reflect on the fact that just because you have it doesn’t mean you should publish it. Early on, we said… “there’s a lot more information about a lot of these arrests.” But that’s not the story. Or in part, it’s not the story; in part it’s irresponsible and dangerous given the context of this location.

The Guardian is a very transparent organization. As data journalists these days, it often makes a lot of sense to just say “Let’s just publish it!” That was something that we dealt with quite early on—especially because earlier this summer we were working on “The Counted” which is about pushing out as much as possible and being as transparent as possible, and kind of the opposite [of this]. In this…we don’t need to name people. And we [thought] about data protection, making sure that what we put in the source code was quite clean, and those special considerations. It was a different approach that we all need to be reminded of. Good journalism is being as transparent as possible, sometimes—and good journalism is also sometimes withholding what’s not responsible to put out.

And there are, frankly, some really compelling arrest narratives [in the data], and those are really interesting. Each one of them is its own story, and you think “Wow, this is great, these are really interesting stories!” But they’re not our stories to tell. Especially en masse. Those are not data stories—they can play a part in a data story, but those are individual narratives that need to be reported on.

Where From Here?

Q. Are you getting any reactions from Chicago media?

Related to the Guardian’s prior reporting, CJR has written about the Chicago media’s coverage of Homan Square.

Q. Is there more to come from you on this story?

The Guardian plans to continue the lawsuit against the Chicago Police Department in order to compel them to turn over new data. At the end of our new interactive piece, we detail what we still don’t know about Homan Square and still hope to find out.




