Yo Dawg I Heard You Like Bots
Bots on bots on bots on bots on
As a card-carrying old, I hate to admit that Xzibit and his meme were an inspiration to anything. But I have to be honest: We put a bot, on a bot, on a bot to make a website that gives people a near real-time view of the Platte River basin, which stretches from the Continental Divide, 10,000 feet above sea level in Colorado, across more than 500 miles of midwestern prairie to the confluence with the Missouri River in eastern Nebraska.
Yo dawg indeed.
This all starts with the Platte Basin Timelapse Project, based at the University of Nebraska-Lincoln’s Institute of Agriculture and Natural Resources and housed at NET, the statewide public broadcaster in The Good Life. PBT started in March 2011 with the goal of placing timelapse cameras throughout the basin and documenting time passing along one of Nebraska’s most important water resources. Now, they have more than 40 cameras placed, each taking photos during daylight, every day, every hour, all year long. Over the life of the project, they’ve gathered more than a million images and terabytes of data.
Last year, as part of my job at the university, I was appointed to PBT to help them with data and web-making projects. And that’s when they told me that a few cameras were special. A few cameras had cellular cards attached to them. And they weren’t in such a remote location that they couldn’t pick up a signal. And every hour, at least when they had enough power to do so, they sent their photos to Dropbox.
What Could We Do with That?
Sometimes you have to look at what you have, what people might want, and just get the hell out of the way. What did we have? We had big, beautiful images of the Platte. In near real-time. Day after day after day. In places people might be interested in. So we just needed to make a website that showed people those big, beautiful images, organized by location, as soon as we could. People should be able to watch the day unfold. And could a computer make a timelapse video automatically at the end of every day?
And Could We Do It at Little to No Expense?
Any software project, bot or not, is usually best accomplished first by thinking about what your user needs, then by thinking procedurally about what it will take to accomplish that. First we must accomplish this task; then we need to accomplish that one. Solve enough problems, complete enough tasks, and you’ve got yourself a bot, or a website, or whatever you set out to build.
Yep. Here’s How We Did It.
Our first task was to get the photos. Since Dropbox has an API, this was really very simple. They even provide libraries in various languages to access it.
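Roughly, that fetch step could look like the sketch below. It is an illustration, not the project's actual script: it assumes the official `dropbox` Python SDK in its current form, an access token, and a folder layout and helper names (`pick_new_photos`, `fetch_new_photos`) that are made up for the example.

```python
import os


def pick_new_photos(remote_names, local_dir):
    """Return remote JPEG names that are not already saved in local_dir."""
    have = set(os.listdir(local_dir)) if os.path.isdir(local_dir) else set()
    return sorted(
        name for name in remote_names
        if name.lower().endswith((".jpg", ".jpeg")) and name not in have
    )


def fetch_new_photos(token, remote_folder, local_dir):
    """Download photos from a Dropbox folder that we have not seen yet."""
    # Imported here so pick_new_photos() works even without the SDK installed.
    import dropbox

    dbx = dropbox.Dropbox(token)
    os.makedirs(local_dir, exist_ok=True)

    # Dropbox returns folder listings in pages, so keep going until done.
    result = dbx.files_list_folder(remote_folder)
    names = [entry.name for entry in result.entries]
    while result.has_more:
        result = dbx.files_list_folder_continue(result.cursor)
        names.extend(entry.name for entry in result.entries)

    new_names = pick_new_photos(names, local_dir)
    for name in new_names:
        # Arguments are (local download path, remote Dropbox path).
        dbx.files_download_to_file(
            os.path.join(local_dir, name), f"{remote_folder}/{name}"
        )
    return new_names
```

Skipping files that are already on disk keeps each hourly run small, which matters on a slow machine.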
Next, we needed to resize them. The photos come in as billboards, at high resolution, each one far too large to display on a website, let alone the dozens that could make up a day. Our first iteration didn’t resize the photos. A single page load was over 40MB, dwarfing even the most bloated, horrifically user-hostile news websites. So we turned to the Python Imaging Library, which has been around for quite some time and has a few little quirks in the documentation if you’re not reading carefully. For instance, setting the JPEG quality above 95 doesn’t buy you any real improvement in the image; it mostly just makes the file bigger (read this Stack Overflow thread for more). Once we figured this out, page sizes dropped through the floor.
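A minimal sketch of the resize step, assuming Pillow (the maintained fork of the Python Imaging Library). The 1200-pixel width, the quality setting of 85, and the function names are assumptions for illustration, not the values the project uses.

```python
def scaled_size(width, height, max_width=1200):
    """Shrink (width, height) to fit max_width, keeping the aspect ratio."""
    if width <= max_width:
        return width, height
    return max_width, max(1, round(height * max_width / width))


def resize_photo(src_path, dest_path, max_width=1200, quality=85):
    """Write a web-sized JPEG copy of src_path to dest_path."""
    # Imported here so scaled_size() works even without Pillow installed.
    from PIL import Image

    with Image.open(src_path) as img:
        new_size = scaled_size(img.width, img.height, max_width)
        small = img.resize(new_size, Image.LANCZOS)
        # Pillow's documentation advises against quality values above 95.
        small.convert("RGB").save(
            dest_path, "JPEG", quality=quality, optimize=True
        )
    return new_size
```

For example, `scaled_size(4000, 3000)` works out to a 1200 by 900 image, a small fraction of the original pixel count.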
The next task was multifaceted. We needed to make a website out of these images, and since it was going to update every hour, that argued for a dynamic backend of some variety. However, we had zero budget for servers, and no capacity to maintain one if we did have the money. That argued for a static website: simple HTML, no database, no caching layer, no middle-of-the-night alerts that something has gone down.
Enter: The Raspberry Pi 2. A tiny computer that costs $35 total and is plenty powerful enough to run a Linux distribution and a series of Python scripts. I firmly believe that organizational IT managers were put on earth to put the No in Innovation, so I’ve written about the Pi as a way to thwart useless bans on crontabs or running scrapes from inside networks. We did not have that problem here: ours cheerfully ordered the Pi for us and wished us luck. But I love RasPis. Cheap. Easy. Does the job. Shockingly durable for the money. One of mine ran for more than 400 days straight before a power failure in the building knocked it offline.
So for the sum total of $35, our “server” fires off a Python script at the top of the hour that fetches photos, resizes them, and loads them into our backend, which is a Django app that stores the data in a locally hosted SQLite database and pushes the images to Amazon’s S3 service. At Pi speeds, that takes about 20 minutes to complete. Then, at 30 minutes past the hour, the Pi fires off a second script. This time, the script uses the LA Times Data Desk’s Django Bakery library to create static HTML of each page, and then pushes that static HTML to S3. That S3 bucket is rigged as a static web host, and answers to a subdomain. I haven’t seen the S3 bills yet, but by my estimates, it’s costing PBT pennies to host Current there.
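To make the "baking" idea concrete, here is a standard-library stand-in. It is not Django Bakery's API and not the project's code: the `camera` table, its columns, the page template, and the `bake_site` name are all hypothetical. The point is only that a script can read rows from SQLite and write plain HTML files that any static host can serve.

```python
import html
import os
import sqlite3
from string import Template

# Hypothetical page template; a real site would use proper templates.
PAGE = Template(
    "<html><body><h1>$name</h1>"
    '<img src="$photo_url" alt="Latest photo from $name">'
    "</body></html>"
)


def bake_site(conn, build_dir):
    """Write one static index.html per camera row and return the paths."""
    written = []
    rows = conn.execute(
        "SELECT slug, name, photo_url FROM camera ORDER BY slug"
    ).fetchall()
    for slug, name, photo_url in rows:
        page_dir = os.path.join(build_dir, slug)
        os.makedirs(page_dir, exist_ok=True)
        path = os.path.join(page_dir, "index.html")
        with open(path, "w", encoding="utf-8") as fh:
            # Escape values so odd characters cannot break the markup.
            fh.write(
                PAGE.substitute(
                    name=html.escape(name),
                    photo_url=html.escape(photo_url),
                )
            )
        written.append(path)
    return written
```

Once the files exist on disk, publishing is just copying the build directory to the bucket, so nothing has to stay running between bakes.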
Then, when the day is done and the sun has set, the Pi does one last thing before going to bed: It takes all of the images for the day and stitches them together into a timelapse movie using FFmpeg. After each movie is done and uploaded, the Pi stops for the night. It will then fire up again at 8 a.m. Central time, giving the cameras time to build up some images after sunrise to make it interesting for users.
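A sketch of how a script might hand a day's frames to FFmpeg. The frame rate, codec choices, file naming, and function names here are assumptions for illustration; it also assumes the day's images have been copied to sequentially numbered files (0001.jpg, 0002.jpg, and so on) and that `ffmpeg` with libx264 is installed.

```python
import subprocess


def timelapse_command(frames_pattern, output_path, fps=12):
    """Build an ffmpeg command that turns numbered JPEGs into an MP4."""
    return [
        "ffmpeg",
        "-y",                    # overwrite the output file if it exists
        "-framerate", str(fps),  # how many photos to show per second
        "-i", frames_pattern,    # e.g. frames/%04d.jpg
        # H.264 with yuv420p needs even dimensions, so round them down.
        "-vf", "scale=trunc(iw/2)*2:trunc(ih/2)*2",
        "-c:v", "libx264",
        "-pix_fmt", "yuv420p",   # widely playable pixel format
        output_path,
    ]


def make_timelapse(frames_pattern, output_path, fps=12):
    """Run ffmpeg and raise CalledProcessError if it fails."""
    subprocess.run(
        timelapse_command(frames_pattern, output_path, fps), check=True
    )
```

Building the command as a list, rather than one shell string, avoids quoting problems with odd file names.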
Bot cameras, to a bot fetching photos, to a bot making a website, all on a computer that’s about the size of a deck of cards.
Doom Doom Doom
Now the engineering-minded are probably itching with all the points of failure here. And yes, that’s a problem that we had to account for early on. It starts in the field.
The cameras rely on solar to get enough power to take a photo and send it to the cloud. If it’s cloudy for a few days, or if snow covers the solar panel, then that camera goes dark until the sun comes back and charges it back up. “Power is always the biggest issue,” said David Weber of TRLcam, the company that builds the timelapse cameras for PBT.
The cameras are housed in rugged weather-resistant boxes, but they are out in the open on the Great Plains of America. Stuff happens. “When one goes down, you don’t know,” Weber said. “We have them stolen, vandalized, shot at, all kinds of unique things. And it’s never the one you expect to happen.”
One, in early April of this year, was struck by lightning. Yep. Lightning. “It’s, uh, down for a little while,” project co-founder Mike Forsberg told me.
Here’s the thing I love about engineers. When I asked Weber—jokingly—about why their cameras couldn’t take a few gigawatts of lightning and keep on ticking, they’d actually thought about it. “All the scenarios that you run through, that of course is one of them,” he said plainly. “Well, what are we going to do about it? You can’t stop lightning from striking it.”
Suffice it to say, the whole system has to handle failure gracefully. It has to deal with scrambled data from weak cell connections, or nothing at all. The camera systems have gone through a series of iterations over the years (the latest exchanges a bespoke logic board for…a Raspberry Pi), so each system behaves a little differently. Some places have good cell reception, some only work when atmospheric conditions are right. And, on the web-facing side of things, every week is a new wrinkle in the fabric of how it can all break. Post-launch tidying up has been fun.
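One cheap guard against scrambled or cut-off uploads, sketched here as an assumption rather than a description of what the project runs: a JPEG file begins with the bytes FF D8 and normally ends with FF D9, so a file missing either marker can be set aside instead of crashing the rest of the run. Some cameras append extra bytes after the end marker, so a real check might need to be more forgiving. The function names are hypothetical.

```python
JPEG_START = b"\xff\xd8"  # start-of-image marker
JPEG_END = b"\xff\xd9"    # end-of-image marker


def looks_like_complete_jpeg(data):
    """Return True if the bytes start and end like a whole JPEG file."""
    if len(data) < 4:
        return False
    return data.startswith(JPEG_START) and data.endswith(JPEG_END)


def split_usable(photos):
    """Split {name: bytes} into (usable names, skipped names), both sorted."""
    good, bad = [], []
    for name in sorted(photos):
        if looks_like_complete_jpeg(photos[name]):
            good.append(name)
        else:
            bad.append(name)
    return good, bad
```

An empty or truncated file lands in the skipped list, and the hour's page simply goes out with whatever photos did arrive intact.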
But the result, if I may be so bold, is a breathtaking view of life in the Platte Basin. You can watch as life literally unfolds before you. In March, we watched the annual migration of Sandhill Cranes. The birds roost by the thousands along the river, coming in at night, and leaving shortly after the sun is up. You can watch storms go by, surprise spring snows, and the occasional critter if you watch closely. Planting season starts soon. Before you know it, it’ll be summer.
Humans + Bots = Something Wonderful
But, for me, this is what bot-making is all about. Take in data, make something beautiful. Human journalists on PBT are making beautiful timelapse movies out of years of photos, requiring all kinds of subjective, aesthetic human decisions. It would be criminal to take them from that work and have them grind out daily videos devoid of decisions. We can make aesthetic decisions on the front end, and let the bot produce that at scale.
Or, put another way, we can take the big beautiful photos we have, give them to people, and get out of the way.