An Open Source Bot Factory

How the New York Times Interactive team uses Huginn agents


If Huginn is a good enough agent for Odin (Norse god pictured above), why not for a news apps team too? (Wikimedia Commons)

Huginn is a project built with Ruby on Rails that’s kind of like an open-source If This Then That. With it you can create automated agents and schedule them to do different tasks at different times. More powerfully, these agents can talk to each other, passing JSON data from one step to the next.

In short, it’s kind of like a bot factory.

Here on the Interactive News team at the New York Times, we’ve started running an instance of Huginn to send us alerts and update emails about a variety of things.

RSS Shim

Huginn comes in handy for monitoring web pages and emailing you whenever their content changes, which is useful if a site you’re interested in doesn’t provide an RSS feed.

You can also use it to keep tabs on competitors or sources. One agent I have set up alerts me any time a new health or fitness app makes it to the iTunes top charts. Others I use to watch the corporate websites of companies and organizations I’m interested in.
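The core of this kind of monitoring is remembering what a page looked like last time and noticing when it differs. Here is a minimal sketch of that idea in plain Ruby. The `ChangeTracker` class and the example URL are hypothetical and not part of Huginn, whose own website-watching agent may detect changes differently.

```ruby
require "digest"

# Hypothetical helper, not part of Huginn: remembers a fingerprint of the
# last content seen for each URL and reports whether new content differs.
class ChangeTracker
  def initialize
    @fingerprints = {} # url => SHA-256 hex digest of the last content seen
  end

  # Returns true when the content differs from what was last seen for this
  # URL. The first time a URL is seen counts as a change.
  def changed?(url, content)
    fingerprint = Digest::SHA256.hexdigest(content)
    previous = @fingerprints[url]
    @fingerprints[url] = fingerprint
    previous != fingerprint
  end
end

tracker = ChangeTracker.new
tracker.changed?("https://example.com/news", "<h1>Old headline</h1>") # first fetch
tracker.changed?("https://example.com/news", "<h1>Old headline</h1>") # same content
tracker.changed?("https://example.com/news", "<h1>New headline</h1>") # content changed
```

Storing only a digest keeps the remembered state small. A real monitor would usually extract just the part of the page it cares about before fingerprinting, so that ads or timestamps don’t trigger false alarms.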

Olympics Monitoring

Most prominently, we used it during our Olympics coverage to monitor the results of the API we built and let us know if the data ingestion pipeline ever grew stale. To do that, we set up a pipeline:

Website Agent -> Peak Detector Agent -> Email Agent
              \-> Peak Detector Agent -> Twilio Agent

Every two minutes, the Website Agent checked an API endpoint that reported the last time the data was updated. The Peak Detector Agents watched the events coming from that agent. If the reported value ever went more than three standard deviations above the mean, we got an email warning. If it went five standard deviations above, I would start to get text message warnings.
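To make the thresholds concrete, here is a rough sketch of the standard-deviation check in plain Ruby. The `alert_level` helper and the sample numbers are made up for illustration. This is not the Peak Detector Agent’s actual code, which may compute its statistics differently.

```ruby
# Hypothetical helper, not Huginn's PeakDetectorAgent: classifies the newest
# reading by how many standard deviations it sits above the mean of the
# earlier readings.
def alert_level(history, latest, email_sigma: 3.0, sms_sigma: 5.0)
  return :ok if history.size < 2 # not enough data for a meaningful deviation

  mean = history.sum.to_f / history.size
  variance = history.sum { |value| (value - mean)**2 } / history.size
  std_dev = Math.sqrt(variance)
  return :ok if std_dev.zero? # flat history: no scale to compare against

  deviations = (latest - mean) / std_dev
  if deviations > sms_sigma
    :sms
  elsif deviations > email_sigma
    :email
  else
    :ok
  end
end

# Seconds since the last data update at each check (made-up numbers).
history = [30, 45, 40, 35, 50, 40]
alert_level(history, 42)  # => :ok
alert_level(history, 65)  # => :email
alert_level(history, 600) # => :sms
```

Measuring against the recent mean and spread, rather than a fixed cutoff, means the alert adapts to whatever the normal update rhythm happens to be.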

We set up similar alerts for our database status, and to let us know any time the medal standings changed.

Huginn Agents

Huginn’s agents are written as slim Ruby classes that do only a few things.

  • Run a check at some scheduled time.
  • Receive events, any kind of JSON object.
  • Emit their own JSON object events with create_event.

For example, here is a simplified version of the TwitterUserAgent, which is scheduled to check a Twitter user’s timeline at a regular interval. The actual code, with validation and other bits, is in the Huginn repository.

require "twitter"
module Agents
  class TwitterUserAgent < Agent
    include TwitterConcern

    default_schedule "every_1h"

    def default_options
      {'username' => "tectonic"}
    end

    def check
      # Only ask for tweets newer than the last one seen on a previous run.
      since_id = memory['since_id']
      opts = {:count => 200, :include_rts => true, :exclude_replies => false, :include_entities => true, :contributor_details => true}
      opts.merge! :since_id => since_id unless since_id.nil?

      tweets = twitter.user_timeline(options['username'], opts)

      tweets.each do |tweet|
        # Remember the highest tweet ID so the next check starts after it.
        memory['since_id'] = tweet.id if !memory['since_id'] || (tweet.id > memory['since_id'])

        # Emit each tweet as a JSON event for downstream agents.
        create_event :payload => tweet.attrs
      end

      # Persist the updated memory.
      save!
    end
  end
end

Here’s an agent that takes events and tweets them out.

require "twitter"
module Agents
  class TwitterPublishAgent < Agent
    include TwitterConcern

    cannot_be_scheduled!

    def default_options
      {'message_path' => "text"}
    end

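    def receive(incoming_events)
      incoming_events.each do |event|
        # Pull the tweet text out of the event payload at the configured path.
        tweet_text = Utils.value_at(event.payload, options['message_path'])
        begin
          publish_tweet tweet_text
          # Report the outcome as a new event so other agents can react to it.
          create_event :payload => {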
            'success' => true,
            'published_tweet' => tweet_text,
            'agent_id' => event.agent_id,
            'event_id' => event.id
          }
        rescue Twitter::Error => e
          create_event :payload => {
            'success' => false,
            'error' => e.message,
            'failed_tweet' => tweet_text,
            'agent_id' => event.agent_id,
            'event_id' => event.id
          }
        end
      end
    end

    def publish_tweet(text)
      twitter.update(text)
    end
  end
end

For these to work, you do have to manually register a Twitter app and user and provide the proper API keys.
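The `message_path` option in the publishing agent names where in the incoming payload the tweet text lives. As a rough sketch of that idea, assuming a simple dot-separated path, a lookup could work like the one below. The `value_at_path` helper is hypothetical and is not Huginn’s `Utils.value_at`, which may accept a richer path syntax.

```ruby
# Hypothetical stand-in for a path lookup; Huginn's own Utils.value_at may
# support a richer path syntax than the dot-separated keys assumed here.
def value_at_path(payload, path)
  path.split(".").reduce(payload) do |node, key|
    return nil unless node.is_a?(Hash) # path runs past the end of the data
    node[key]
  end
end

payload = { "text" => "Hello from Huginn", "user" => { "screen_name" => "tectonic" } }
value_at_path(payload, "text")             # => "Hello from Huginn"
value_at_path(payload, "user.screen_name") # => "tectonic"
value_at_path(payload, "user.missing.key") # => nil
```

Making the path an option is what lets one publishing agent accept events from many different upstream agents, each with its own payload shape.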

To create a simple echo bot, you could set up a TwitterUserAgent and have it pass events to a TwitterPublishAgent. One of my original motivations for getting involved with Huginn was to try to automate the syncing of my Twitter account to a Weibo account. (Unfortunately, that Weibo account ultimately ran afoul of the real identity verification policy and tweeted too many New York Times links for its own good.)
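To show how the pieces of an echo bot fit together without touching Twitter at all, here is a self-contained toy version of the check, create_event, and receive flow. Every class below (`ToyEvent`, `ToyAgent`, `ToySourceAgent`, `ToyEchoAgent`) is a made-up stand-in. Huginn’s real Agent base class is a Rails model with scheduling and persistence, none of which is reproduced here.

```ruby
# Toy stand-ins for illustration only; these are not Huginn classes.
ToyEvent = Struct.new(:payload)

class ToyAgent
  attr_reader :receivers

  def initialize
    @receivers = []
  end

  # Wrap the payload in an event and hand it to every registered receiver.
  def create_event(payload:)
    event = ToyEvent.new(payload)
    receivers.each { |agent| agent.receive([event]) }
  end

  # Agents that only emit events can ignore incoming ones.
  def receive(_events); end
end

# Plays the role of TwitterUserAgent: check emits one event per new item.
class ToySourceAgent < ToyAgent
  def initialize(items)
    super()
    @items = items
  end

  def check
    @items.each { |text| create_event(payload: { "text" => text }) }
    @items = [] # nothing left to report on the next check
  end
end

# Plays the role of TwitterPublishAgent: records what it would have tweeted.
class ToyEchoAgent < ToyAgent
  attr_reader :published

  def initialize
    super
    @published = []
  end

  def receive(events)
    events.each { |event| @published << event.payload["text"] }
  end
end

source = ToySourceAgent.new(["first post", "second post"])
echo = ToyEchoAgent.new
source.receivers << echo

source.check
echo.published # => ["first post", "second post"]
```

The source agent knows nothing about what its receivers do with an event, which is why the same upstream agent can feed an email alert, a text message, or a tweet.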

The best part of Huginn is that the above examples are just about all the code that’s needed to create a new type of agent, and it’s all written in Ruby. To make things even easier, a newly added agent type lets you write agents in JavaScript, which runs sandboxed within the Rails environment.

Future Uses

Huginn also has a TwitterStreamAgent for consuming Twitter’s streaming API, and Huginn’s creator, Andrew Cantino, uses it to monitor Twitter for trending terms he’s interested in, such as “san francisco fire” and “san francisco earthquake.”

Peak Detector Agent charts for the San Francisco search terms (figure)

There are also other agents to interact with weather APIs, Twitter, Weibo, and Amazon’s Mechanical Turk.

Of course, it’s still early days for Huginn, and the interface isn’t quite ready for non-programmers to use it smoothly. Fortunately, it feels like the project is picking up steam and gaining contributors as people dream up more potential uses for it.
