I spent a lot of years believing that I couldn’t program, whatever that meant. On the advice of a friend in the late ’90s I had bought the O’Reilly Perl books and taken a stab at it, with the Llama book perched on my knee as I pecked in example code, but I didn’t have any particular problem to solve and it didn’t stick.

Ten years later, I was working in online media and had become my group’s SME on web analytics. The company I worked for was going through layoffs and the number of sites I had to manage went from three to fifteen, but I had to keep up the level of reporting I’d been producing by hand. Suddenly, I had a problem to solve, and it was a pretty quick path from AppleScript to standalone Ruby to Rails.

Though I don’t really follow it exactly, I do keep a copy of this XKCD strip handy to remind me of how to think about automation and utility programming generally.

XKCD: Is it worth the time

I differ with the sentiment a little:

Coding to solve problems yields more than time savings. It also helps you test your understanding of the problem by having to describe it to a very literal-minded partner (your computer). In the early going you’re going to spend more time learning than solving, but that’s an investment beyond the immediate problem.

This page has a few tools I’ve written over the years to solve assorted problems, from web reporting to content classification, to assessing my team’s priorities.

The thing they all have in common is that they bias toward cheap and cheerful. When I couldn’t get an IT department to give me server space to host a web front-end for an analytics tool, I just routed around the problem and wrote an intake script, processing reporting requests by hand. When it was simpler to download an archive and process it with a utility script than to connect via an API, I just did that.

For a period I had a little bit of an inferiority complex about it all. I wasn’t doing all the things professional developers do. Over time, though, I realized I was solving my particular problems just fine, and learned to trust my instincts about when it was time to add a layer of complexity or sophistication: It had to be about whether it would help me get more done more quickly, not because my inner critic said so.


PerfDB

https://github.com/pdxmph/perfdb (Ruby on Rails)

PerfDB started out pretty painfully, with AppleScript, massaging downloaded CSV files by automating spreadsheets. Then I realized that if I could figure out how to talk to it, Google Analytics had an API that could automate pulling those reports. Then I realized that all of the article metadata I’d been typing into my reporting could be extracted from the RSS feeds on all my sites. I couldn’t figure out how to do that with AppleScript, but the first tutorial I read on how to parse RSS feeds was done with Ruby, so I learned that.
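Pulling article metadata out of a feed can be sketched roughly as below. This is a minimal illustration using Ruby's `rss` library, not PerfDB's actual code: the inlined sample feed, the `article_metadata` helper, and the fields it collects are all assumptions made for the example.

```ruby
require 'rss'

# Hypothetical feed, inlined so the sketch runs offline. A real run would
# read each site's feed instead.
SAMPLE_FEED = <<~XML
  <?xml version="1.0" encoding="UTF-8"?>
  <rss version="2.0">
    <channel>
      <title>Example Site</title>
      <link>https://example.com/</link>
      <description>Hypothetical feed for illustration</description>
      <item>
        <title>First article</title>
        <link>https://example.com/articles/1</link>
        <category>Tutorials</category>
        <pubDate>Mon, 04 Jan 2010 09:00:00 GMT</pubDate>
      </item>
      <item>
        <title>Second article</title>
        <link>https://example.com/articles/2</link>
        <category>News</category>
        <category>Security</category>
        <pubDate>Tue, 05 Jan 2010 09:00:00 GMT</pubDate>
      </item>
    </channel>
  </rss>
XML

# Hypothetical helper: turn each feed item into a plain hash of the
# metadata a report might otherwise have been typed in by hand.
def article_metadata(xml)
  feed = RSS::Parser.parse(xml, false) # false: skip strict validation
  feed.items.map do |item|
    {
      title: item.title,
      url: item.link,
      categories: item.categories.map(&:content),
      published: item.pubDate # parsed into a Time by the library
    }
  end
end

article_metadata(SAMPLE_FEED).each do |article|
  puts "#{article[:published].year}: #{article[:title]} (#{article[:categories].join(', ')})"
end
```

On recent Ruby versions `rss` ships as a bundled gem rather than part of the core standard library, so it may need to be installed separately.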

Within a year I’d written a Rails application that could do my reporting for me. A few months after that, I set it up so I could produce reporting for my teammates.

In the process of writing the tool, I learned a lot about writing for extensibility, so when a big Google algorithm change came through and flattened the traffic on our site, it was simple to extend the application to sift through the analytics data I’d been caching to figure out what penalized articles had in common.

With that success, the business became more interested in what the tool could do to analyze the cost-effectiveness of content, so internal business teams opened up their site revenue data, allowing me to see how much each article from across several dozen sites over ten years had made. We learned which kinds of content were the most cost-effective and performed the best over five-year periods instead of the default 30-day view most editors went to when they did their reporting. By looking at how much time editors spent working on each writer’s copy, we learned that a lot of our “free” content, submitted by tech pundits trying to boost their consultancies, actually cost more than the work we paid professionals for.
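The arithmetic behind that last finding can be sketched as follows. Everything here is hypothetical: the `Article` struct, the hourly rate, and the figures are invented for illustration and do not come from the real data or from PerfDB itself.

```ruby
# Hypothetical model: an article's real cost is the writer's fee plus the
# editor time spent getting the copy into shape.
Article = Struct.new(:title, :writer_fee, :editing_hours, :revenue, keyword_init: true)

EDITOR_HOURLY_RATE = 40.0 # assumed cost of one editor hour

def true_cost(article, hourly_rate: EDITOR_HOURLY_RATE)
  article.writer_fee + article.editing_hours * hourly_rate
end

# Revenue earned per dollar of real cost; nil when nothing was spent.
def return_per_dollar(article, hourly_rate: EDITOR_HOURLY_RATE)
  cost = true_cost(article, hourly_rate: hourly_rate)
  return nil if cost.zero?

  (article.revenue / cost).round(2)
end

paid = Article.new(title: "Paid tutorial", writer_fee: 300.0, editing_hours: 1.0, revenue: 1200.0)
free = Article.new(title: "Contributed op-ed", writer_fee: 0.0, editing_hours: 9.0, revenue: 400.0)

[paid, free].each do |article|
  puts "#{article.title}: cost #{true_cost(article)}, return per dollar #{return_per_dollar(article)}"
end
```

With these made-up numbers the "free" piece costs 360.0 in editor time against 340.0 for the paid one, which is the shape of the surprise described above.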


Priorities

https://github.com/pdxmph/priorities (Ruby on Rails)

Priorities app screenshot

I really struggled to have a good conversation about my team’s priorities with my boss. After we had butted heads over headcount for a few weeks, he got frustrated and told me the three things he wanted to know about everything I was doing: How important did I think it was, how well was I doing it, and what did it cost me to do it?

I went away and modeled that in a spreadsheet, walked my team through putting the information together, and took that back to him. The lights came on between us and we were able to talk about what mattered, what didn’t, what he was willing to fund, and what I could drop.

The thing I realized, though, was that when I looked at what the team was doing, there were disconnects: We didn’t do the most important things as well as we believed we should, and sometimes seemingly trivial things represented pretty bad heat loss over time. So I developed some formulas to help look for those disconnects and visualize them.
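One way to look for that kind of disconnect is sketched below. These are not the actual Priorities formulas, which aren't given here: the 1 to 5 scoring scale, the two difference scores, and the threshold of 2 are all assumptions chosen for illustration.

```ruby
# Hypothetical scoring: each activity gets a 1-5 score for the boss's three
# questions (how important, how well done, how costly).
Activity = Struct.new(:name, :importance, :performance, :cost, keyword_init: true) do
  # Positive when something important is being done poorly.
  def performance_gap
    importance - performance
  end

  # Positive when something costs more than its importance seems to justify.
  def cost_overrun
    cost - importance
  end
end

# Return [name, kind] pairs for every activity whose gap meets the
# (assumed) threshold.
def disconnects(activities, threshold: 2)
  activities.each_with_object([]) do |activity, found|
    found << [activity.name, :underperforming] if activity.performance_gap >= threshold
    found << [activity.name, :overpriced] if activity.cost_overrun >= threshold
  end
end

team = [
  Activity.new(name: "Incident response", importance: 5, performance: 2, cost: 4),
  Activity.new(name: "Weekly status deck", importance: 1, performance: 4, cost: 3),
  Activity.new(name: "Onboarding", importance: 4, performance: 4, cost: 3)
]

disconnects(team).each { |name, kind| puts "#{name}: #{kind}" }
```

Simple differences like these are crude, but they are enough to surface both cases described above: the important thing done badly and the trivial thing that quietly eats time.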

Because I was running a services team, I also had a lot of external scrutiny to deal with: Everyone had an opinion about what should be most important to us. So I took my formulas and built Priorities, which allows you to run through my prioritization exercise in a web app that provides a page anyone can look at to see what you are prioritizing.

You can visit the working version and try it for yourself.

HipChat transcript retriever

https://github.com/pdxmph/hipchat_transcripts (Ruby/Sinatra)

I was director of IT operations at Puppet when Slack bought HipChat from Atlassian and killed it. Our legal and security teams were opposed to doing a full lift of all the historical data from HipChat to Slack, seeing it as an opportunity to remove a significant source of liability.

At the same time, we learned that a lot of our customer-facing teams had been using HipChat as an informal knowledge base of sorts, keeping information in private team rooms where they could search for it later. They weren’t happy at the thought of all that going away, but they also weren’t sure where all of the data they wanted to keep was.

I spent an afternoon putting together this tool, which allowed us to store our HipChat archives and make them retrievable as Markdown or a PDF if it turned out a team had left critical information behind: With the legal team’s approval, the IT services team could run the app from an encrypted drive, retrieving a transcript for the team, with the understanding we’d format the drive after a year.

Docs Decomposer

https://github.com/pdxmph/docs_decomposer (Rails)

After a sudden growth spurt in developer resources and a tripling of available technical writers, my docs team was struggling to keep up with all the changes in the product they were documenting. Screenshots were aging, we were missing documentation that needed to be updated, and people were having a hard time telling us about bugs they found in the docs.

The Docs Decomposer provided a few tools to help. It ingested the docs Git repos, which were all versioned, and:

  • Checksummed screenshots and compared them across versions, providing reporting about how old screenshots in each part of the docs were, making it easier to replace them.
  • Provided a way for people to comment, Google Docs style, on preview documentation.
  • Provided a bookmarklet that allowed people to report on a bug in the docs on our production site by highlighting the text and creating a JIRA ticket that was pre-filled with the URL of the page and the problematic text.
  • Let the tech writers assign error risk estimates to each of their pages to help direct editing and review resources.
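
The screenshot comparison in the first item can be sketched with Ruby's `Digest` library. This is an illustration of the technique rather than the Docs Decomposer's own code: the helper names, the PNG-only glob, and the idea of comparing two checked-out version directories are assumptions.

```ruby
require 'digest'

# Map each screenshot's path (relative to the version directory) to a
# SHA-256 digest of its contents.
def checksum_images(dir)
  Dir.glob('**/*.png', base: dir).sort.to_h do |relative_path|
    [relative_path, Digest::SHA256.file(File.join(dir, relative_path)).hexdigest]
  end
end

# Screenshots with the same digest in both versions were carried over
# unchanged, which makes them candidates for being out of date.
def unchanged_screenshots(old_sums, new_sums)
  new_sums.select { |path, sum| old_sums[path] == sum }.keys
end

# Usage, assuming two docs versions checked out side by side
# (hypothetical paths):
#   stale = unchanged_screenshots(checksum_images('docs/3.7'),
#                                 checksum_images('docs/3.8'))
```

Repeating the comparison across each pair of adjacent versions gives a rough age for every screenshot: the more releases a digest has survived, the older the image.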


Reclassr

https://github.com/pdxmph/reclassr/blob/master/reclassr.pdf (Ruby/Sinatra)

After the analytics app I wrote figured out why Google was punishing our sites, we realized that we had a dozen editors running several dozen sites who each needed to re-sort thousands of articles to quarantine things that had aged out, retire categories that didn’t make sense anymore, and move articles around to new categories.

It wasn’t the kind of thing we could completely automate: Humans needed to decide which content was still relevant and how to recategorize it.

The IT team was willing to automate the last mile of the problem by not forcing us to do it in the CMS. They gave us all a CSV file and told us to do the remappings by hand. My own CSV file had 3,000 lines in it, and others had even more.

So I wrote a little Sinatra app that provided a more efficient front-end to handle that: Editors just had to run it and click instead of typing things into a spreadsheet by hand. What had looked like days of work was reduced to a few hours of clicking, max, and the tool gave the IT team a correctly formed CSV that had the benefit of being typo-free.
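
The typo-free part comes from never letting anyone type a category name. A rough sketch of that idea, leaving out the Sinatra front-end entirely, might look like this; the column names, the `ALLOWED_CATEGORIES` list, and the `remap_csv` helper are hypothetical rather than taken from the actual app.

```ruby
require 'csv'

# Hypothetical list of valid target categories. In a click-driven UI these
# would be the only choices on offer.
ALLOWED_CATEGORIES = %w[networking security storage archive].freeze

# Build the remapping CSV from a list of editor decisions, refusing any
# category that isn't on the list. The CSV library handles quoting.
def remap_csv(decisions)
  CSV.generate do |csv|
    csv << %w[article_id old_category new_category]
    decisions.each do |decision|
      unless ALLOWED_CATEGORIES.include?(decision[:new_category])
        raise ArgumentError, "unknown category: #{decision[:new_category]}"
      end

      csv << [decision[:article_id], decision[:old_category], decision[:new_category]]
    end
  end
end

decisions = [
  { article_id: 101, old_category: "tips", new_category: "networking" },
  { article_id: 102, old_category: "news, old", new_category: "archive" }
]

puts remap_csv(decisions)
```

Letting the CSV library do the writing also means fields containing commas or quotes come out correctly escaped, which is easy to get wrong by hand in a spreadsheet export.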