Features:

Introducing Sheetdown

A New Command Line Node.js Module for Turning a Google Spreadsheet into a Markdown Table


Sheetdown is a command line Node.js module for turning a Google Spreadsheet into a Markdown (well, actually, a GitHub Flavored Markdown) table. Sheetdown started with a tweet:

As of late I’ve been honing my node-module-making skills and when I saw the tweet, and realized I knew how it could be done, I was so excited that I stopped what I was doing and wrote it.

Getting the Raw CSV

The first snag was that I knew there was a query string for viewing a Google Spreadsheet as a raw CSV but I couldn’t remember it and Google, the search engine, was failing me. @takempf to the rescue with output=csv. (Maybe I’m doing it wrong, but I still can’t come up with a search that gives this answer.)

So first things first, sheetdown makes a request to Google and returns with the raw CSV. We make the request using the Node module request and a URL we construct like so:

var base = 'https://docs.google.com/spreadsheet/pub?hl=en_US&hl=en_US&key='
var query = '&single=true&gid=0&output=csv'
var URL = base + KEY + query

Next, I wrote the rest of the module by creating functions that pulled out the headers from the CSV (a string), replaced commas (,) with pipes (|) and wrote a row of bars and dashes between the headers and the table body. It worked! And just a wee 47 lines, not bad! But wait—

Turns out, it only worked if you didn’t have commas within your spreadsheet cells. So for instance, having "Oakland, California" would not work, it would be treated as two cells instead of one.

Parsing as JSON

I was prepared to embark on a long regular expression journey for finding commas only outside of quotes when @maxogden suggested parsing the CSV as JSON first and schooled me on a couple of helpful modules he’d written.

Parsing the CSV as JSON means we’ll know what commas to avoid because we’ll be dealing with them as object values and not one giant string—meaning we won’t have to come up with the perfect , fail-proof regular expression.

Max’s node module binary-csv reads the CSV and then by setting the option for JSON to true, it returns JSON objects.

Concatenate All the Things

Next, Max’s other module concat-stream takes that stream of objects and puts them together as the complete array of objects we want to work with.

Each of these can be chained together to our request like so:

request(URL).pipe(csvParser).pipe(concat(rows))

Above, concat (concat-stream) passes the data to a callback function, rows:

  function rows(data) {
    var table = '|'
    var headers = Object.keys(data[0])
    var underHeaders = ''
    headers.map(function(key) {
      table += key + '|'
      underHeaders += ' ------ |'
    })

    table += '\n|' + underHeaders + '\n'
    data.map(function(row) {
      var values = headers.map(function(h) { return row[h] })
      table += '|' + values.join('|') + '|\n'
    })
    return callback("null", table)
  }
}

When rows gets our data it’s an array of objects, each object contains key/value pairs corresponding to the spreadsheet’s headers and a row’s cell value. For example:

{ 
      ID: '0',
      'PLACE NAME': 'Oakland Zoo',
      'PHOTO-URL': 'http://farm4.staticflickr.com/3744/12293538223_84b099d982_b.jpg',
      VENUE: 'Oakland Zoo',
      IMAGE: 'elephant',
      NOUN: 'animal',
      LAT: '37.7502645413531',
      LONG: '-122.14675065',
      HEXCOLOR: '#FFE7E7',
      CITY: 'Oakland',
      STATE: 'California',
      YEAR: '2013' 
}

Making the Table

Next we get our future table started with a pipe, which will become the top leftmost pipe in our final Markdown table. We get the spreadsheet’s headers, an array of strings, by getting all the keys from one object in the data. Headers:

[ 
  'ID',
  'PLACE NAME',
  'PHOTO-URL',
  'VENUE',
  'IMAGE',
  'NOUN',
  'LAT',
  'LONG',
  'HEXCOLOR',
  'CITY',
  'STATE',
  'YEAR' 
]

The underHeaders are my best attempt to describe the row of dashes between the headers and table body. For every header that exists we add it to the table string followed by a pipe. Then we create the underHeaders by adding dashes and a pipe for every header that exists.

Next we add the underHeaders to the table on a new line below the headers, which are now in table, and with a new line after.

To put the spreadsheet’s body into the table we take just the values from each object (the same as a row in the spreadsheet) in the data and add a pipe per value, then a new line at the end. When it’s completed it returns the Markdown table.

We added 2 modules but cut the lines of code by 17. Voila!

This is a command line module, so the entire process is triggered when someone runs sheetdown from their terminal. Depending on if they choose to save it as a file or not, sheetdown either writes the table to a file or prints the table into the terminal.

Everything was done within a couple of exciting (at least for me) hours. I set out thinking I already knew how to do it and still ended up learning many new things. Always learning!

Sheetdown has already gotten some useful pull requests for: adding the ability to use the full spreadsheet URL or the key and getting a message when it’s done writing your file.

The Future

I’d like to see some more options for sheetdown so that:

  • a user can request a range of rows from the spreadsheet instead of all rows.
  • it can be required and used inside of other code.
  • the user can chose the file name it saves your table with

I also need to get in the test-writing frame of mind and write some tests for it.

If you’d like to give sheetdown a try:

You’ll need Node.js and NPM installed

npm install -g sheetdown
sheetdown SPREADSHEETURLORKEY --save

People

Code

Credits

Recently

Current page