Low-Budget Natural Language Processing

Let’s talk about natural language processing.

Easier ways to include natural language processing in your work

If you’ve ever talked to Siri or performed a Google Search using colloquial language and gotten the right answer, you probably had that magical feeling of being understood by a machine. The discipline that studies the interactions between human languages and computers is called natural language processing (NLP), and it’s a very active field. Companies and computer scientists are developing amazing techniques for improving performance on this task, but adding these features to our sites and apps can be very complex. Even great, free resources aren’t useful if you don’t have the time or skills to use them.

The good news is that we can take advantage of our human ability to analyze natural language and use really simple techniques to assist and amaze our users. I’ll explore a couple of ways to use these techniques in your own projects. These examples use web technologies but can be translated to other platforms and systems easily.

Basic Example: Predicting a Form Field

One of the goals of the Coral Project team while building Ask, a web product that enables news organizations to ask questions of their readers, was to build the form generation side of the project as an API.

One of the benefits of an API is that it allows developers to create their own integrations and user interfaces for creating and editing the forms. To showcase some possibilities, I built an alternative form creator targeting journalists and news devs who were setting up Ask for the first time.

Using NLP techniques to shape UI

When creating a form, it’s important to try and select the most appropriate UI input for the question. This helps the user understand how to complete the information, and it helps us understand the data. Since every question in the questionnaire needs a title, I thought it was the perfect scenario for applying a silly but effective NLP technique. The idea is simple:

  • The user types the question title

  • An algorithm analyzes the content and guesses the question type

  • If the algorithm identifies the question type, it returns it—otherwise it returns the default

  • If the user already selected a question type, we assume they made the right choice and do nothing

  • Otherwise we switch the question type select input to the guessed type
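The steps above can be sketched without any UI framework. This is a minimal sketch under stated assumptions, not the code used in Ask: `createTypeSuggester`, `onTitleChange`, `onUserPickedType`, and `toyInfer` are hypothetical names, and the guessing function is passed in so any rule can be plugged in.

```javascript
// Hypothetical helper: remembers whether the user picked a type by hand,
// and only applies guesses while they have not.
function createTypeSuggester(infer, defaultType) {
  var state = { type: defaultType, dirty: false };

  return {
    // Call this whenever the question title changes.
    onTitleChange: function (title) {
      if (!state.dirty) {
        // infer() may return nothing; fall back to the default type.
        state.type = infer(title) || defaultType;
      }
      return state.type;
    },
    // Call this when the user chooses a type in the select input.
    onUserPickedType: function (type) {
      state.dirty = true;
      state.type = type;
      return state.type;
    }
  };
}

// Toy rule for demonstration only.
function toyInfer(title) {
  return /e-?mail/i.test(title) ? 'EmailField' : null;
}

var suggester = createTypeSuggester(toyInfer, 'TextField');
console.log(suggester.onTitleChange('What is your email?')); // EmailField
console.log(suggester.onTitleChange('What is your name?'));  // TextField
suggester.onUserPickedType('TextArea');
console.log(suggester.onTitleChange('What is your email?')); // TextArea
```

Keeping the "dirty" flag next to the current type means the guess can never override an explicit choice, whatever the UI layer looks like.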

I used Preact for writing this website (source code), just because I like over-engineering my experiments. But we can implement this easily with jQuery:

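// Guess the type as the title changes ('input' also covers pasting and deleting).
$('.question-title').on('input', function() {
  var inferredType = inferQuestionType($(this).val());
  if (!$('.question-type').data('dirty')) {
    $('.question-type').val(inferredType);
  }
});

// Once the user picks a type by hand, stop overriding it.
$('.question-type').on('change', function() {
  $(this).data('dirty', true);
});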

And of course, this is easy to implement because we didn’t show the hardest part: the inferQuestionType algorithm.

If you want to get really advanced, try this: Before taking a look at this finished inferQuestionType algorithm below, start creating a form yourself. Go to the first question of the form you’re working on, and see if you can figure out what the algorithm might look like. Once you’ve given that a try, check out how I implemented it:

function inferQuestionType (str) {
  // Rules are checked in order; the first match wins.
  if (/e-?mail/i.test(str)) {
    return 'EmailField'
  } else if (/age|how many|how old/i.test(str)) {
    // Note: 'age' also matches inside words such as 'message'.
    return 'NumberField'
  } else if (/^when/i.test(str)) {
    return 'DateField'
  } else if (/^which/i.test(str)) {
    return 'MultipleChoice'
  } else if (/\?.*\?/.test(str)) {
    // Two question marks: probably a multi-part question.
    return 'TextArea'
  } else if (/phone/i.test(str)) {
    return 'PhoneNumber'
  } else {
    return 'TextField'
  }
}

That’s it. That’s the way I modeled the English language for my use case. Even if you don’t know what a regular expression is, you can get an idea of how to implement your own model. In case you didn’t try it for yourself or my algorithm didn’t work for you, here is an animated GIF of what you should see:

Screen shot

Is this algorithm covering every possibility? No. Is this going to work in every case? No. But this function runs in microseconds in our user’s browser (I actually measured it); it’s really simple to implement; and it helps most of the users choose the right question type, saving time on form creation.
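If you want to check the timing claim on your own rules, a rough measurement could look like the sketch below. It assumes an environment with a global `performance` object (browsers and recent Node.js versions); `averageMicroseconds` and `standIn` are hypothetical names, and `standIn` is a simplified placeholder for the real function so the snippet runs on its own.

```javascript
// Hypothetical helper: average time per call, in microseconds.
function averageMicroseconds(fn, input, runs) {
  var start = performance.now();
  for (var i = 0; i < runs; i++) {
    fn(input);
  }
  var elapsedMs = performance.now() - start;
  return (elapsedMs * 1000) / runs;
}

// Stand-in for inferQuestionType, so the sketch is self-contained.
function standIn(str) {
  return /e-?mail/i.test(str) ? 'EmailField' : 'TextField';
}

var micros = averageMicroseconds(standIn, 'What is your email address?', 100000);
console.log(micros.toFixed(3) + ' microseconds per call');
```

Averaging over many runs smooths out timer resolution, but the number is only indicative: it will vary by machine and browser.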

Testing and improving your model

Once you have your script working, you may want to know the “success rate,” which in this case can be something like: “In what percentage of cases did the model choose a question type other than the default, and the user didn’t change it?”

This is easy to measure:

  • When we infer the type, we set a variable for the question with the inferred type.

  • On form submission, we check whether the inferred type exists and is different from the submitted type. If so, the model loses a point. If the types are the same, it wins one.
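The scoring described above can be sketched as a small counter. This is only an illustration: `createSuccessTracker`, `record`, and `successRate` are hypothetical names, and it assumes the caller stores an inferred type only when the guess differs from the default.

```javascript
// Hypothetical tracker for how often an inferred type survives to submission.
function createSuccessTracker() {
  var hits = 0;
  var misses = 0;

  return {
    // inferredType: what the model guessed (null or undefined if no guess).
    // submittedType: the type the question had when the form was submitted.
    record: function (inferredType, submittedType) {
      if (!inferredType) return; // no guess was made; nothing to score
      if (inferredType === submittedType) {
        hits += 1;
      } else {
        misses += 1;
      }
    },
    // Fraction of guesses the user kept, or null when there is no data yet.
    successRate: function () {
      var total = hits + misses;
      return total === 0 ? null : hits / total;
    }
  };
}

var tracker = createSuccessTracker();
tracker.record('EmailField', 'EmailField'); // user kept the guess
tracker.record('NumberField', 'TextField'); // user changed it
tracker.record(null, 'TextField');          // no guess, not counted
console.log(tracker.successRate()); // 0.5
```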

How can you store all of these events? An easy way: if you are using an analytics solution, you probably get events for free. I usually send this event to Google Analytics, where it’s easy to add up the results and get the success rate. After all, this success rate is a measure of the behavior of your users on the site.

Improving your model

You can always improve your model by adding, modifying, or removing rules. The good thing is that if your rules don’t match the user’s input, you simply haven’t helped that user: they get a normal form, and the app still works as intended. The only things that can really hurt are false positives. If the model decides that a question is (for example) an email every time the user writes “the,” the user will probably have to change the question type every time. That’s a bad scenario. Detecting an option is cool, but good defaults can help a lot.
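One way to make rules easier to add, remove, and tighten is to keep them in a list. The sketch below is an assumption-laden variation, not the model used in the form creator: `rules` and `inferFromRules` are hypothetical names, and the `\b` word boundaries are one possible way to cut false positives such as “age” matching inside “message.”

```javascript
// Each rule pairs a pattern with the type it suggests. Order matters:
// the first matching rule wins.
var rules = [
  { pattern: /\be-?mail\b/i, type: 'EmailField' },
  { pattern: /\b(age|how many|how old)\b/i, type: 'NumberField' },
  { pattern: /phone/i, type: 'PhoneNumber' } // left unbounded so "telephone" matches
];

function inferFromRules(str, ruleList, defaultType) {
  for (var i = 0; i < ruleList.length; i++) {
    if (ruleList[i].pattern.test(str)) {
      return ruleList[i].type;
    }
  }
  return defaultType;
}

console.log(inferFromRules('What is your age?', rules, 'TextField'));  // NumberField
console.log(inferFromRules('Leave us a message', rules, 'TextField')); // TextField
// The same pattern without word boundaries would be a false positive:
console.log(/age|how many|how old/i.test('Leave us a message'));       // true
```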

A More Complex Example: Creating VR from a Text Editor

As part of my OpenNews fellowship, I’m exploring easy ways to produce virtual reality (VR) content. Since VR is a great resource for telling stories, I think it should be easy for people who don’t have coding skills to create prototypes or even full experiences. That belief led me to build GuriVR, an online editor that can interpret natural language. Users describe what their VR scene should look like, in their own words, and GuriVR tries to convert it into an actual VR scene.

Screen shot

Creating something like GuriVR may seem like an impossible mission, especially for someone like me, a developer with no graphics knowledge who is certainly not a linguist. A lot of people ask me what machine learning techniques GuriVR uses.

But the domain (for basic VR scenes) is not that broad. We just want to place objects onto the screen in a specific position and probably transition from one scene to another. The GuriVR parser is just one JavaScript file with fewer than 350 lines of code that runs both in the browser and in Node.js, in microseconds, for any story ever created in the editor. That kind of speed is a bonus that anyone can appreciate, whether or not they know how to code. (And it’s especially cool given that the code is a mess, and I totally need to refactor it.)

Creating GuriVR: identifying objects

So how did I make GuriVR work using NLP? It’s all about anticipating user behavior. Here are some names of the objects that GuriVR is designed to support:

  • Text

  • Image

  • Video

  • Panorama

  • Audio

  • 3D Model

This is the basic stuff that anyone wants in a barebones VR scene.

I also got a little fancier and added some aliases for the objects. Some of these are text aliases, like “sound” for “audio.” Users can also input any of these emoji and GuriVR understands them.

  • Image -> Picture, Picture emoji

  • Audio -> Sound, Sound emoji

  • Chart -> Chart emoji
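An alias list like this one can be modeled as a plain lookup table. The following is a minimal sketch, not the GuriVR parser itself: `ALIASES` and `findObjects` are hypothetical names, and the specific emoji are placeholders picked for the example.

```javascript
// Hypothetical alias table: every key maps to a canonical object type.
// The emoji are placeholders chosen for this sketch.
var ALIASES = {
  image: 'image', picture: 'image', '📷': 'image',
  audio: 'audio', sound: 'audio', '🔊': 'audio',
  chart: 'chart', '📊': 'chart'
};

// Returns the canonical types mentioned in a sentence, in order of appearance.
function findObjects(sentence) {
  return sentence
    .toLowerCase()
    .split(/[\s.,;:!?]+/) // split on whitespace and basic punctuation
    .filter(function (word) {
      return Object.prototype.hasOwnProperty.call(ALIASES, word);
    })
    .map(function (word) {
      return ALIASES[word];
    });
}

console.log(findObjects('There is a picture and a 🔊 playing.')); // [ 'image', 'audio' ]
```

Because words and emoji live in the same table, adding a new alias is a one-line change.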

You might think adding support for emoji is idiotic, and maybe you are right, but it’s low-hanging fruit. Emoji, unlike words, are language agnostic, so they are ideal for a lot of scenarios (and really fun to use).

Adding modifiers

For each object that GuriVR supports, I needed to get different kinds of information. For example, a panorama needs a URL, but the text object needs the text content and a position. In some cases when the patterns can be easily identified, I try not to force a format. For example, for URLs, I use a URL regular expression, and for specifying position I use a list of possible values. Listing values is not the best idea for capturing every way that people will try to say something, but for things like position it works most of the time. And it gives you that magical sensation of being understood by the computer.

(Note: I am working on a full 3D positional interface for GuriVR. For now you can only say: “I have text to my left,” but in the future you will be able to say: “I have text 20 meters to my left and 30 meters above me.”)
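To make the two approaches concrete, here is a rough sketch of pulling a URL and a position out of a sentence. Everything in it is an assumption for illustration: the URL pattern is deliberately simplified, the position phrases are invented, and `URL_PATTERN`, `POSITIONS`, and `extractModifiers` are hypothetical names rather than GuriVR internals.

```javascript
// Simplified URL pattern for this sketch; real-world URL matching is harder.
var URL_PATTERN = /https?:\/\/[^\s"]+/i;

// Hypothetical position phrases mapped to canonical names.
var POSITIONS = {
  'to my left': 'left',
  'to my right': 'right',
  'in front of me': 'front',
  'behind me': 'behind'
};

function extractModifiers(sentence) {
  var lower = sentence.toLowerCase();
  var urlMatch = sentence.match(URL_PATTERN);
  var position = null;

  // Take the first listed phrase that appears in the sentence.
  Object.keys(POSITIONS).forEach(function (phrase) {
    if (position === null && lower.indexOf(phrase) !== -1) {
      position = POSITIONS[phrase];
    }
  });

  return { url: urlMatch ? urlMatch[0] : null, position: position };
}

console.log(extractModifiers('I have an image to my left https://example.com/cat.png'));
// { url: 'https://example.com/cat.png', position: 'left' }
```

The URL is found by pattern, so it can appear anywhere in the sentence, while the position only works for phrases that were anticipated in the list.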

When it comes to free-text properties, like the content of a text object, I use a specific format that maybe doesn’t feel very natural. But it does work. I use double quotes (“) to enclose text content because it’s easy to parse and feels better than other patterns like {} or [].
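Extracting quoted content takes very little code, which is part of why the format is attractive. The sketch below is an assumption, not the parser’s actual rule: `QUOTED` and `extractTextContent` are hypothetical names, and accepting curly quotes alongside straight ones is a choice made for this example.

```javascript
// Matches text between straight or curly double quotes.
var QUOTED = /["“]([^"“”]*)["”]/;

function extractTextContent(sentence) {
  var match = sentence.match(QUOTED);
  return match ? match[1] : null;
}

console.log(extractTextContent('A text saying "Hello, world" in front of me')); // Hello, world
console.log(extractTextContent('A text saying “Hola” to my left'));             // Hola
console.log(extractTextContent('A panorama with no text'));                     // null
```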

To help users actually follow these formats, it’s important to show some examples. In GuriVR, the editor is always pre-filled with a basic scene that holds the language I’m using, to give the user an idea of its patterns. It also highlights the keywords recognized by the parser.

(There is a GuriVR user guide, but there’s a rule that says, “A user interface is like a joke. If you have to explain it, it’s not that good.” Adding explanatory text in the editor isn’t the most elegant solution, but reading a guide? No thanks.)

So that’s it. The “natural language parser” that runs on top of GuriVR is just that one small file, which lists a bunch of words and orders them. Nothing fancy, huh?

Adding moar languages

Screen shot

In case you didn’t notice, GuriVR not only works in English, it also interprets Spanish. What kind of sorcery is this? Well, since I’m just identifying arbitrary keywords and patterns, this kind of silly model works for any language. Just like I added the word “sound” as an alias of “audio,” I also added “sonido,” and now it recognizes audio in Spanish. There is no magic around it. Could I add Italian or German? Maybe I will or hey, maybe you’ll send a pull request with the additions.

As Evelyn Eastmond explains in her post about her experience with GuriVR, you can find yourself writing in Spanglish. This is because (maybe intentionally, maybe because I’m too lazy) it recognizes any alias under any circumstance. In the hypothetical situation of a conflict between language keywords, I can just add a language parameter and limit the keywords, but I guarantee you that the Spanglish scenes are the funniest.
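The language parameter mentioned above could look something like this. It is a hypothetical sketch: GuriVR does not necessarily organize its keywords this way, and `KEYWORDS` and `lookupKeyword` are invented names.

```javascript
// Hypothetical keyword table, grouped by language code.
var KEYWORDS = {
  en: { audio: 'audio', sound: 'audio' },
  es: { audio: 'audio', sonido: 'audio' }
};

// Without a language, every alias is accepted (Spanglish allowed).
// With a language, only that language's keywords are recognized.
function lookupKeyword(word, language) {
  var languages = language ? [language] : Object.keys(KEYWORDS);
  var key = word.toLowerCase();
  for (var i = 0; i < languages.length; i++) {
    var table = KEYWORDS[languages[i]] || {};
    if (Object.prototype.hasOwnProperty.call(table, key)) {
      return table[key];
    }
  }
  return null;
}

console.log(lookupKeyword('sonido'));       // audio
console.log(lookupKeyword('sonido', 'en')); // null
console.log(lookupKeyword('sound', 'en'));  // audio
```

Making the language optional keeps the permissive behavior as the default and only restricts keywords when a conflict actually shows up.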

To See is to Believe

I encourage you to try to implement NLP techniques for yourself. Once you see how easy it is to improve the user experience of your apps by adding a small function with almost zero cost, you’ll become a “Low-budget NLP” fan. Writing these kinds of algorithms is also a challenge for yourself: How good can it get? How can I improve the success rate?

If you have any questions and/or want to share some of your ideas, you can find me on Twitter as @impronunciable.

About Dan Zajdband

Find me on Twitter as @impronunciable.
