Deploying Sinatra On Ubuntu: In Which I Employ A Secretary

As mentioned previously, I really hate getting woken up at 3 AM.  This happens fairly frequently for me, though, because I live in Japan and about half of the people who call me do not. I have not been effective at getting them to check what time it is here before they call, but I certainly want them to call, and even to call me in the middle of the night if it is an emergency.

So I made myself a phone secretary with Twilio, their Ruby gem, and Sinatra (a lightweight Ruby web framework).  I gave my friends and family a US number assigned to me by Twilio. Dialing it causes Twilio’s computer to talk to my server and figure out what I want to do with the call. The server runs a Sinatra app which checks the time in Japan and either forwards the call to the most appropriate phone or gently informs the caller that it is 4:30 AM.

The code for this took 10 minutes. Reasoning my way through a deployment took, hmm, 3 hours or so. I am a programmer not a sysadmin, what can I say. I thought I’d write down what I did so that other folks can save themselves some pain.

Code (You’re probably not too interested in the exact logic, but feel free to use it as a springboard if you want to make a secretary/call forwarding app):

require 'rubygems'
require 'sinatra'
require 'twiliolib'
require 'time'

@@HOME = "81xxxxxxxxxx"  #This is not actually my phone number.
@@CELL = "81xxxxxxxxxxxx"  #Neither is this.

# 12-hour clock, e.g. "04:30 AM".  (%H would produce times like "16:30 PM".)
def pretty_time(time)
  time.strftime("%I:%M %p")
end

# Japan is UTC+9 all year round (no daylight saving time).
def time_in_japan()
  Time.now.utc + 9 * 3600
end

def is_weekend?(time)
  (time.wday == 0) || (time.wday == 6)
end

# str holds two clock times, e.g. "2:00 AM, 10:00 AM".  Only the time of day
# matters: everything is pinned to today's date before comparing.
def in_range?(t, str)
  time = Time.parse("#{Date.today.to_s} #{pretty_time(t)}")
  range_bounds = str.split(/, */)
  start_time = Time.parse("#{Date.today.to_s} #{range_bounds[0]}")
  end_time = Time.parse("#{Date.today.to_s} #{range_bounds[range_bounds.size - 1]}")
  (time >= start_time)  && (time < end_time)
end

# Says a short message, then dials the given number.
def forward_call(number, surpress_intro = false)
  @r = Twilio::Response.new
  say = Twilio::Say.new("#{"This is Patrick's computer secretary.  " unless surpress_intro}I'm putting you through.  Wait a few seconds.")
  call = Twilio::Dial.new(number, :timeLimit => "45")
  @r.append(say)
  @r.append(call)
  @r.respond
end

# Tells Twilio to fetch its instructions from another route on this server.
def redirect_twilio(url)
  @r = Twilio::Response.new
  rd = Twilio::Redirect.new("/#{url.sub(/^\/*/, "")}")
  @r.append(rd)
  @r.respond
end

# Entry point: Twilio posts here when somebody dials the number.
post '/phone' do
  time = time_in_japan
  if (is_weekend?(time))
    if in_range?(time, "2:00 AM, 10:00 AM")
      redirect_twilio("wakeup")
    else
      forward_call(@@CELL)
    end
  else #Not a weekend.
    if in_range?(time, "2:00 AM, 8:30 AM")
      redirect_twilio("wakeup")
    elsif in_range?(time, "8:30 AM, 6:30 PM")
      redirect_twilio("working")
    elsif in_range?(time, "6:30 PM, 9:00 PM")
      forward_call(@@CELL)
    else
      forward_call(@@HOME)
    end
  end
end

post '/wakeup' do
  if (params[:Digits].nil? || params[:Digits] == "")
    @r = Twilio::Response.new
    say = Twilio::Say.new("This is Patrick's computer secretary.  He is asleep right now because it is #{pretty_time(time_in_japan)}.  If this is an emergency, hit any number to wake him up.")
    g = Twilio::Gather.new(:numDigits => 10)
    g.append(say)
    @r.append(g)
    @r.respond
  else
    forward_call(@@HOME, true)
  end
end

post '/working' do
  if (params[:Digits].nil? || params[:Digits] == "")
    days_left = (Date.parse("2010-04-01") - Date.today).to_i
    @r = Twilio::Response.new
    say = Twilio::Say.new("This is Patrick's computer secretary.  He is at work as it is #{pretty_time(time_in_japan)}.  Only #{days_left} days left!  If this is an emergency, hit any number to call him at work.")
    g = Twilio::Gather.new(:numDigits => 10)
    g.append(say)
    @r.append(g)
    @r.respond
  else
    forward_call(@@CELL, true)
  end
end

get '/' do
  'Hello from Sinatra!  What are you doing accessing this server anyway?'
end

This script is a bit ugly but, hey, what do you want in ten minutes? (Memo to self: correct it after leaving my job.)

Sinatra Deployment On Ubuntu

A quick look around the Internet didn’t show any cookbook recipes for deploying Sinatra. I thought I’d write up what I’m using, which uses Apache reverse proxying to Sinatra. (Instructions included for Nginx as well.) It assumes you already have your webserver running and are familiar with basic Ruby usage and the Linux command line.

1) Install the daemons gem (sudo gem install daemons). We’re going to daemonize Sinatra so that it runs detached from our console and starts and stops without our intervention, much like Apache does.

2) Create an /opt/pids/sinatra directory. (It seemed as good a place as any.) Let a non-privileged user write to that directory, for example by executing “sudo chown www-data /opt/pids/sinatra; sudo chmod 755 /opt/pids/sinatra”. Make a note of which non-privileged user you use. I am just reusing www-data because Apache has conveniently provided him for me and he is guaranteed not to be able to screw up anything important if he is compromised.

2) Write a quick control script (the init script later in this post calls it control.rb) and put it in the same directory as your Sinatra app (called phone_sinatra.rb for the purposes of this demonstration). I threw these in /var/www/phone.example.com/ but you can put them anywhere. Make sure the scripts are readable, but not writable, by www-data. (sudo chmod 755 /var/www/phone.example.com/ will accomplish this: it makes only the owner able to write to it, but any user on the system, including www-data, can read from it.)

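require 'rubygems'
require 'daemons'

# Daemonizing changes the working directory, so remember where the app lives.
# Using this file's own directory (rather than Dir.pwd) keeps this working when
# the script is started from somewhere else, e.g. by an init script.
app_dir = File.expand_path(File.dirname(__FILE__))
Daemons.run_proc('phone_sinatra.rb', {:dir_mode => :normal, :dir => "/opt/pids/sinatra"}) do
  Dir.chdir(app_dir)
  exec "ruby phone_sinatra.rb"
end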

3) (Optional) Add in a reverse proxy rule to Apache or Nginx to send requests to the subdomain of your choice to Sinatra instead. I ended up deploying this through Apache, so the rule is pretty quick:


<VirtualHost *:80>
  ServerName phone.example.com
  ProxyPass / http://phone.example.com:4567/
</VirtualHost>

You could also do this on Nginx and it is similarly trivial.

server {
  listen       80;
  server_name  phone.example.com;
  # proxy_pass is only valid inside a location block.
  location / {
    proxy_pass http://phone.example.com:4567/;
  }
}

The main reason I do this is to not have to remember non-standard ports in my URLs. It also simplifies firewall management if you’re into that sort of thing.

4) Add a control script to /etc/init.d/sinatra so that we can start and stop Sinatra just like we do other services, like Apache.

#!/bin/bash
#
# Written by Patrick McKenzie, 2010.
# I release this work unto the public domain.
#
# sinatra      Startup script for Sinatra server.
# description: Starts Sinatra as an unprivileged user.
#

sudo -u www-data ruby /var/www/phone.example.com/control.rb $1
RETVAL=$?

exit $RETVAL

5) Tell Ubuntu to start your daemon when the computer starts up and shut it off when the computer shuts down: sudo update-rc.d sinatra defaults

6) Start the service manually for the first and only time: sudo /etc/init.d/sinatra start

There you have it: Sinatra is running the application you wrote, and it will start and stop with your Ubuntu server. If you were doing this for Twilio now you’d check your Twilio account settings to make sure it has the right URL set up for your phone number, and then try calling yourself. Preferably NOT from the phone you try to forward to.

All code in this blog post was written by Patrick McKenzie in early 2010. I release it unto the public domain. Feel free to use it as the basis for your own apps.

Twilio development makes me feel like a kid in a candy store — you can affect the real world through an API, how cool is that? I think next time I have a few hours to kill I’m going to make a similar secretary for my business. I don’t give folks my phone number because a) I live in Japan and b) they don’t pay me enough to do telephone support. However, quoting a telephone number on your website instantly says “There is a real business behind this!”

I think I’ll whip up a computer secretary for the business which handles the most common two support requests (“I didn’t get my Registration Key” and “I lost my password.”), and for anything else takes their message and emails it to me. That sort of thing costs megacorporations bazillions and can be whipped up these days by a single programmer on Saturday morning for under $5 a month in operating costs. Like I said, candy store.

The Solo Founder (Startup) Rap

I saw the title Will Single Founders Please Stand Up and got a bit inspired.  Substantive comment continues below the song.  With apologies to Eminem, Weird Al, and fans of quality music everywhere:

Download (or hit icon to play inline): MP3 / Ogg

Won’t The Solo Founders Please Stand Up?

May I have your attention please?
May I have your attention please?
Will all you solo founders please stand up.

I say again, will all you solo founders please stand up.
We're gonna have a problem here.
Y'all act like you've never heard a solo founder before
Jaws all on the floor, slower than the App Store
Can't approve a new idea in a year or four
Like it takes a team of a million monkeys
To make a web 2.0 app.  Oh, sorry -- did
I just imply your Twitter client was easy?
But Paul Graham said...
(spoken)  Well, yeah, but he said "Use Lisp", too.
So that's two things he's wrong on.  Sorry Paul.  Back in character.

Customers love solos.  Ca-ching!
Matz, heard of him?
His left pec once wrote an interpreter while sleeping.
What do you mean you can't market an app
In your spare time.

Yeah, we've probably got a couple cycles in our threads loose
Banging out code in darkened bedrooms
Sometimes want to log on WoW and let loose
But it's the business that brings the phat loots.

"You've got to quit your job.
You've got to quit your job."
And if you're lucky get some funding you SOB.
And that's the message we send to part-time slobs.
And expect them to keep banging on MS Bob.

Of course they're gonna blog six days a week for fun.
But that ain't real marketing.  Engineers can't market, can they?
"We're just code monkeys."  Well some of us apes
Can write Java and copy.  Why you gape?
And if we can grok code, AdWords, and email
Then there's no reason we can't do software retail.
EWW!  Oh wait, seventy percent margins ain't fail.

So you solos out there,
Put your hands in the air!

Chorus:

I'm a solo founder, yes, I'm a solo founder,
All you VCs can go eat clam chowder.
So won't the solo founders please stand up, please stand up, please stand up.

Repeats.

DHH got to cuss in his keynotes to get attention,
Well I don't, but I guess it works for him
(By the way thanks for Rails.)
Half of you Redditors don't buy software,
"But if I don't, who does?"
About half the population.  300 million in the nation.
Try selling to women -- you'll be swimmin' in it.
Sometimes a profit beats an exit.

Shoot, I help your kid's teacher play bingo
If that don't prove there's a niche for everyone
I don't know when you gonna believe.

"Yeah, that's real cutting edge, hehe."
Yeah some of us ain't downloading MP3s.
My customers can't even spell PHP.
So what?  Their money's green as your ping stats
Better your wallet than your disk FAT.

And there's a million of us just like me
Who code like me, who sell stuff just like me,
Who write like me, blog, talk, and act like me,
It ain't the next best thing, the real deal's me!

Chorus.

I'm like a lecture to listen to, cause I'm only giving you,
Things to joke about on IRC in your chatroom.
The only difference is I've got a mailing list.
With 2,000 people paying to be on it.
It's called "Customers" -- what a concept.
I'm not the only one on the forum yakking
About making apps, the overflow's stacking,
And I'm building stuff but it ain't rocket science,
Every single one of y'all could try it.
You could be working at Burger King,
Laying out the router rings,
Or in the cubicle banging, Screaming "That ain't null!"

With your collar down and your temper up,
So won't you single founders please stand up
And put three fingers on your hand up?
And be proud to be out of your mind and in control.
And one more time,
Loud as you can,
How does it go?

Chorus x2

Haha,
Guess there's a single founder in all of us.
Let's all stand up.

Actual discussion:

OK, now that my burning urge to karaoke is out of the way: there are a lot of business models out there, and many of them can work.  You can probably even make money with a Twitter client written in Lisp.  I wouldn’t want to have to do it but, then again, I don’t have to, just like you don’t have to wake up in the morning and help little old ladies get their bingo cards in large print.  And that’s perfectly fine.

If it wasn’t totally obvious, tongue was planted firmly in cheek while writing the above, and when given the choice between saying what I really think and making lame attempts at humor, the humor may have won out.  For example, you might on listening be under the impression I have not quit my day job, which is not accurate as of last Monday. (I’m still kind of in shock — and still have work to go to in the morning. See you on a more permanent basis in a few months.)

I don’t claim to be anything particularly special as a businessman and, as you can see, I have few musical skills to speak of.  But if you can’t laugh at yourself then who can you laugh at?

The above song and lyrics are copyright Patrick McKenzie 2010 and released under the Creative Commons By Attribution license.  Go nuts.  It was recorded with assistance from Audacity.

Visualizing Your Commit History

I was curious today about whether I tend to work consistently or in bits and spurts, so I decided to check by asking my SVN repository.  First I dumped all my commit messages since I started using SVN (back in October 2007, which was over a year after starting my business — I know, I know, I was young and stupid once) into a text file.

Then, I parsed the file with a quick Ruby one-liner to pull out how many commits were made in each month.  My commits when I’m working in Rails can range from a single character to a feature worth of code, but I figure over the years the variation probably averages out.
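
For the curious, the one-liner might look something like the sketch below once it is stretched over a few lines. This is an illustration rather than the original script: it assumes the default svn log header format, commits_per_month is a made-up name, and the sample log entries (author, dates, messages) are invented.

```ruby
# Count commits per month from the text of `svn log`.
# Assumes the default header line format:
#   r123 | author | 2009-06-01 12:34:56 +0900 (Mon, 01 Jun 2009) | 1 line
HEADER = /^r\d+ \| [^|]+ \| (\d{4}-\d{2})-\d{2} /

def commits_per_month(log_text)
  counts = Hash.new(0)
  log_text.each_line do |line|
    match = HEADER.match(line)
    counts[match[1]] += 1 if match
  end
  counts.sort.to_h  # oldest month first
end

# Invented sample data, just to show the shape of the input.
SAMPLE_LOG = <<~LOG
  ------------------------------------------------------------------------
  r3 | someone | 2007-11-02 09:15:00 +0900 (Fri, 02 Nov 2007) | 1 line

  Fix typo.
  ------------------------------------------------------------------------
  r2 | someone | 2007-10-21 22:40:10 +0900 (Sun, 21 Oct 2007) | 1 line

  Add print preview.
  ------------------------------------------------------------------------
  r1 | someone | 2007-10-20 20:01:44 +0900 (Sat, 20 Oct 2007) | 1 line

  Initial import.
  ------------------------------------------------------------------------
LOG

commits_per_month(SAMPLE_LOG).each { |month, count| puts "#{month},#{count}" }
# 2007-10,2
# 2007-11,1
```

On a real dump you would feed it the contents of the text file instead of SAMPLE_LOG, and the month,count lines paste straight into a spreadsheet.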

I then dumped them into Excel 2007 (totally worth the price just for the pretty graphs, although if I was doing this dynamically I’d do it differently) and annotated the graph in Paint.NET.

You can probably tell a lot about me from the shape of this graph:

  • I tend to work in spurts when I have inspiration.
  • I prefer to ship v1.0 of something and iterate like mad rather than spending more time shipping something more ambitious to start with.
  • I sometimes get slammed by the day job and am left with no energy for the business.  (See, e.g., summer of 2008.)

What you can’t tell from the shape of the graph is that, happily, code is not everything in the software business.  Here’s another graph which I’ve truncated to fit the same time scale:

I had another thought: why not visualize code bloat, er, how much valuable code I’ve written over the years?  So I checked out a copy of revision 1 into a temporary directory, then used another Ruby script to step through each revision and use rake stats to count how many lines of code were present.

I really don’t recommend doing this at peak time for the server with your SVN repository on it, incidentally, as it is wasteful as heck.  This script took about half an hour to execute on my slice.  Two ways to improve it: number one, use a local copy rather than using svn+ssh to grab the code.  Number two, instead of rake stats use a purpose-built code counter like CLOC, which doesn’t require loading the entire Rails framework to run.  Loading the entire Rails framework 1,334 times will not make your CPU happy.  Of course, smarter than any of these options would have been figuring out what revisions I wanted to look at a priori, and then just looking at those ~20 revisions rather than deep scanning all 1,334 of them.

Then I combined the per-revision LOC counts with the SVN log to graph dates against code size.  Tada!

I guess slow and steady wins the race.
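
A rough sketch of the per-revision counting, following the two improvements suggested above (a local working copy, and a cheap counter instead of rake stats). It is not the script that produced the graph: ruby_loc and loc_by_revision are hypothetical helpers, the counter only looks at .rb files, and the SVN step assumes svn is on the PATH and the working copy is already checked out.

```ruby
require 'tmpdir'

# Count non-blank, non-comment lines in every .rb file under dir.
# A crude stand-in for `rake stats` or CLOC, not a replacement for either.
def ruby_loc(dir)
  Dir.glob(File.join(dir, '**', '*.rb')).sum do |path|
    File.readlines(path).count do |line|
      stripped = line.strip
      !stripped.empty? && !stripped.start_with?('#')
    end
  end
end

# Move a local working copy to each revision in turn and record its size.
# Returns [[revision, lines_of_code], ...].
def loc_by_revision(working_copy, revisions)
  revisions.map do |rev|
    system('svn', 'update', '-q', '-r', rev.to_s, working_copy) or
      raise "svn update to r#{rev} failed"
    [rev, ruby_loc(working_copy)]
  end
end

# Demo of the counter alone on a throwaway directory (no SVN needed).
Dir.mktmpdir do |dir|
  File.write(File.join(dir, 'card.rb'), "# a comment\n\nclass Card\nend\n")
  puts ruby_loc(dir)  # 2
end
```

Passing a hand-picked list of ~20 revisions to loc_by_revision, rather than every revision, is the cheap version of the "figure out what you want to look at first" advice.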

Engineering Your Way To Marketing Success

I visited Thomas Ptacek and the gang at Matasano (who are developing a firewall management product) over Christmas break and had a very productive discussion about marketing.  One of the things Thomas mentioned was that I should probably blog out how you can use engineering resources to improve your marketing.

In Which I Have A Revelation

Have you ever been talking to someone and had them crystallize an idea you’ve been fumbling around with but could never seem to put into words?  That was what I felt like when Thomas mentioned the engineering-to-marketing conversion: the lightbulb went off.  This is what I’ve been doing so much of the last few years, with automated scalable approaches to SEO, A/B testing the living daylights out of my site, optimizing user interactions, tracking tracking tracking, implementing Mailchimp APIs, etc etc.  It is all engineering means to marketing ends: make your customers happy, get your name out there, sell more stuff.

And really, we don’t do nearly enough of it in our industry.  We’re sitting on top of more software and programmer brain juice than Saudi Arabia has oil, and when we deploy it to maximize our sales… we build marginal features that nobody has asked for and that nobody will see.

I am as guilty of that as anybody else: version 3.0 of the desktop Bingo Card Creator included a prodigious amount of time spent on integrating a bit of a client/server application into it.  I thought, hey, customers empirically ask me all the time how to get access to the same cards on their home and school computers, so allowing them to save it to the server (wait, I have to be buzzword-compliant) cloud would solve a problem for them.

It turns out that those customers with the problem had already either figured out how to email files to themselves or were part of the mass exodus to the web version of my program.  The client/server features of BCC got terrifically underused: exactly 22 of my customers have used them.  That works out to be about 1% of the customers I’ve added in the last year.  By way of comparison, adding color to my bingo cards scratched off one of my Top 3 user feature requests, took me a tenth the time, and gets used by about 8.5% of customers.

As if that weren’t painful enough, not a single customer was enticed by the client/server features being pay-only to purchase BCC to get them.  (If you’re wondering “Hmm that sounds like a curiously specific claim for someone who is not telepathic” tie a mental string around your finger to read the forthcoming part about analytics carefully.)

Features Do Not Sell Your Software

For the last three years and change I’ve been hanging out on the Business of Software discussion boards and have advised more programmers on websites than I have friends.  You can tell when someone is on version 1.0 of their website: they have The Traditional Shareware Website with six pages listed in the horizontal navigation, featuring Trial, Purchase, Features, Screenshots, Support, and About Us, and the main content on their front page and the Features page is a list of features.

The first advice we always give them is strike Features and replace with Benefits because benefits, not features, sell software.  (I have heard that this is not true in some markets composed of very technical experts who are looking for exactly the tool to fit their problem and, if you sell to these people, hats off to you and please don’t ever take my advice on anything.  Well, OK, one little thing: you may not actually be selling to that customer, regardless of what you think, so go get some data.)

A quick sidenote for people who have not repeated the benefits-not-features mantra on every sunrise for the last several years: features are things that your software does.  Objectively speaking, Microsoft PowerPoint reads PPT files and lets you animate bullet points.  Benefits are the perceived improvements in the user’s life that will result from purchasing your software.  Subjectively speaking, customers of Microsoft PowerPoint buy it because it will let them close the deal, please their bosses, and get promoted.

Closing the deal has nothing to do with the internal structures of PPT files.  It answers a human need for the customer.  You could sell PowerPoint to a tribe of cavemen spear-hunting lions on the savanna.

  • Ogg see lion.
  • Ogg poke lion with sharp pointy bits.
  • Tribe eat lion.
  • Ogg get to synergize with Ogga.

Marketers Get It.  Engineers Don’t.

I have a bit of the tribal programmers’ disdain for marketing, but marketers get this concept.  You’ll very rarely (God willing) have people in Marketing obsessing about features because they understand that benefits bring home the bacon.  Sadly, engineers typically work on features, features, features, and more features, when we could do so many more productive things with our time.

For example, when Marketing says “This software should really make the user feel like they just killed an effing lion!”, they’ll typically have, well, no clue whether it actually does or not.  Nobody in the organization does, which is when the problem gets kicked to Management, and we all know that is where interesting questions go to die and, if they were wicked in life, get reborn as meetings.

Your mission as an engineer is to stop thinking the job is building features and start thinking that the job is building systems to answer interesting questions like that.  For example, you could pretty trivially put an item on the sidebar of the software saying “Do you feel like you just killed an effing lion?  (Thumbs up)  (Thumbs down)”, and very quickly you’d have actual data on whether the software is delivering on the promise of visceral feline slaughtering action.

Measuring Vicarious Lion Slaying As A Process

Of course, engineers are expensive and building a bunch of one-off “Do you feel like you just killed a lion?” quizzes is unlikely to result in you covering your desired hourly salary.  Instead, you should be thinking of building tools and processes: give people the resources they need to ask questions like “Do you feel like you just killed a lion?” and make it so brain-dead easy and so ingrained in the culture that if somebody asked “I wonder if users prefer gazelles to lions” and didn’t immediately start designing an experiment, it would result in chatter about them at the water cooler.

This notion of tools and processes to use engineering as a force multiplier for everything else you do is the key to decoupling productivity from hours worked.  This is a handy feature to have for startups and small businesses.

For example, I’m a one-man shop with occasional help from freelancers, and virtually by definition I’m the most qualified man alive to write content for my website.  However, writing content for my website is a poor use of my time. While it is quite profitable for me, it is much more efficient to build a system to let somebody else do it.  This frees me up to build more force multipliers rather than grind out 757 bingo activities with 28,761 words of content about them.  (That is, incidentally, about as many as a young-adult novel.  The chief difference is that I pay more.)

Enough With The Lions.  Give Me Actual Things I Could Build Today.

A/B testing that anybody can use.  I hate to harp on A/B testing so much since it is just one arrow in the quiver and I would hate if it blinded anybody (including myself) to other productive uses of their time.  That being said, in terms of dollars gained per hour invested, it is really, really hard to beat.  You need to make it brain-dead easy so that whoever does your website and, ideally, whoever is developing the application can quickly iterate through text, button designs, and workflows to find what works for you.  Feel free to crib design points from A/Bingo, my OSS Rails testing library.
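
To make "brain-dead easy" concrete, here is a minimal in-memory sketch of the core of an A/B test: assign each visitor to an alternative, record conversions, compare rates. SplitTest and its methods are hypothetical names for illustration, not A/Bingo's API, and the button texts and purchase behaviour in the demo are made up.

```ruby
require 'digest/md5'
require 'set'

# Minimal in-memory A/B test. Hypothetical class, not A/Bingo's API.
class SplitTest
  attr_reader :name, :alternatives

  def initialize(name, alternatives)
    @name = name
    @alternatives = alternatives
    @participants = Hash.new { |hash, alt| hash[alt] = Set.new }
    @conversions  = Hash.new { |hash, alt| hash[alt] = Set.new }
  end

  # Hash the test name plus visitor ID so the same visitor always
  # lands on the same alternative, with no stored assignment needed.
  def alternative_for(visitor_id)
    digest = Digest::MD5.hexdigest("#{name}:#{visitor_id}")
    choice = alternatives[digest.to_i(16) % alternatives.size]
    @participants[choice] << visitor_id
    choice
  end

  # Sets mean each visitor is counted at most once per alternative.
  def convert!(visitor_id)
    @conversions[alternative_for(visitor_id)] << visitor_id
  end

  def conversion_rate(alternative)
    seen = @participants[alternative].size
    seen.zero? ? 0.0 : @conversions[alternative].size.to_f / seen
  end
end

test = SplitTest.new('buy_button_text', ['Buy Now', 'Get Your Cards'])
(1..1000).each do |visitor_id|
  test.alternative_for(visitor_id)                   # visitor sees the page
  test.convert!(visitor_id) if visitor_id % 40 == 0  # made-up purchase behaviour
end
test.alternatives.each do |alt|
  puts format('%-15s %.1f%%', alt, test.conversion_rate(alt) * 100)
end
```

A real tool would also persist the counts somewhere and run a significance test before declaring a winner; the point here is only that the moving parts are few.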

Scalable content generation.  SEO is sort of my first love in marketing, probably because of the obvious potential for automating it.  Essentially nobody hand writes every page on their website these days, which is A Good Thing because your CMS of choice will make it much less painful and greatly improve the quality of the output.

If you take that to the next step and figure out how to inject content into the CMS without having you personally type it into the HTML form area, you can fluff up your website and collect an awful lot of long-tail search traffic without overly distracting you from the business of running your business.  For example, Demand Media has creating vast oceans of garbage down to a science.  With a bit of creativity, you can use similar techniques (freelancers available as a utility, algorithmic discovery of topics to write about, and automating the quality control) and combine them with existing data sources to actually create value in your niche.

For example, a quick script I wrote up in five minutes to dump the most commonly used words on bingo cards that are not used by a bingo card I have available reported that more than 50 people in December independently typed these into their cards: Star of China, darjeeling, genmaicha, jasmine.  This taught me something I had no clue of: there is a group of people in the world who really want to play Tea Bingo.  With a little more packaging my ad hoc Ruby script can be incorporated directly into the interface for my freelancers.  Then, they could just take a gander at the list of words at the top and use them for inspiration for new writing assignments.
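
A sketch of what such a script could look like, assuming the custom cards are available as a hash of user ID to typed words and the existing catalog as a flat word list. Both shapes, the function name uncovered_popular_words, and the sample data are assumptions for illustration; the original script is not shown in this post.

```ruby
require 'set'

# user_cards:    { user_id => [words typed into custom cards] }  (assumed shape)
# catalog_words: words already covered by a published activity
# min_users:     how many distinct users must have typed a word
def uncovered_popular_words(user_cards, catalog_words, min_users)
  covered = catalog_words.map(&:downcase).to_set

  # Track distinct users per word, so one enthusiast typing "jasmine"
  # fifty times still counts as one person.
  users_per_word = Hash.new { |hash, word| hash[word] = Set.new }
  user_cards.each do |user_id, words|
    words.each { |word| users_per_word[word.strip.downcase] << user_id }
  end

  users_per_word
    .reject { |word, _users| covered.include?(word) }
    .select { |_word, users| users.size >= min_users }
    .sort_by { |word, users| [-users.size, word] }
    .map { |word, users| [word, users.size] }
end

cards = {
  1 => ['Darjeeling', 'jasmine', 'heart'],
  2 => ['darjeeling', 'genmaicha'],
  3 => ['jasmine', 'darjeeling', 'heart']
}
p uncovered_popular_words(cards, ['heart'], 2)
# [["darjeeling", 3], ["jasmine", 2]]
```

With real data the threshold would be something like the 50 users mentioned above rather than 2.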

Automated Error Detection/Correction. I was so amused by the popularity of Tea Bingo I checked to see if I already had a tea-related activity and discovered, much to my surprise, that I did.  A handful of boring technical problems resulted in it not getting spidered properly by Google.  (Typically large numbers of customers typing the same activity into the program, rather than starting from one I’ve provided for them, indicates that I don’t have anything responsive to their needs or that they can’t find it.)  I’ve since fixed those problems, and am now contemplating how I can have the computer check this for me in an automated manner so that I never have to expend effort on it again.

There are probably bugs in your own marketing/advertising/etc systems which are leaking a percent here and a percent there of prospects.  Since improvements in many things we do are multiplicative, a percent here and a percent there is worth real money if you can recover them.  Consider automating the process of detecting and addressing these things, so it isn’t merely an ad hoc task you do when it is brought to your attention.  (Or, more often, that you don’t do, because it is boring.  I’m quite guilty of that.)
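
A quick worked example of the multiplicative point. The funnel stages and most of the rates below are invented for illustration (only the 2.4% trial-to-purchase figure echoes a number from this post); what matters is that small fixes at each stage compound.

```ruby
# Each stage's rate is illustrative; only the multiplication matters.
funnel = {
  search_impression_to_visit: 0.05,
  visit_to_trial: 0.20,
  trial_to_purchase: 0.024
}
impressions = 100_000

# Sales are the starting audience multiplied through every stage.
def sales(audience, funnel)
  funnel.values.reduce(audience.to_f) { |people, rate| people * rate }
end

before = sales(impressions, funnel)

# Plug a leak worth a 1% relative improvement at every stage.
patched = funnel.transform_values { |rate| rate * 1.01 }
after = sales(impressions, patched)

lift = (after / before - 1) * 100
puts format('%.2f%% more sales', lift)  # 3.03% more sales
```

Three separate one-percent fixes come out slightly better than three percent in total, and every later improvement multiplies on top of them.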

Write your own CMS.  I would have totally disagreed with this advice up until last week or so, but Thomas convinced me: writing a single-purpose CMS is pretty much the new Hello World for modern web frameworks (heck, it is the official Rails demo), and with a man-week or two you can make something much more productive for your purposes than using, e.g., WordPress.  (Though if you can do whatever functionality you need as a WordPress plugin, I’d still be inclined to suggest that.  No need to reinvent the wheel for basic CRUD operations on textual content, or HTML parsing.)

Lifecycle customer contact.  One of my big realizations in 2009 was that I was avoiding sending customers emails mostly because I hate receiving emails, and since I am not a forty-something schoolmarm with two kids, my opinion does not count.  So I signed up with MailChimp, spent three hours incorporating their API into my site, and started sending customers what I call “lifecycle” emails: thanks for signing up, wait a day, you signed up yesterday here’s some stuff you can do, wait a week, hey remember us by the way here’s advanced features.
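
The scheduling half of this is small enough to sketch. The code below only decides who is due which email on a given day; actually sending it is left to whatever mail service you use, and no MailChimp calls are shown. The step names, the STEPS table, and both helper functions are hypothetical, and the waits (a day, then a week) follow the sequence described above.

```ruby
require 'date'

# [days to wait after the previous email, template name]. Names are made up.
STEPS = [
  [0, :thanks_for_signing_up],
  [1, :things_you_can_do],
  [7, :advanced_features]
]

# Turn the waits into concrete send dates for one signup.
def lifecycle_schedule(signup_date)
  send_on = signup_date
  STEPS.map do |wait_days, template|
    send_on += wait_days
    [send_on, template]
  end
end

# signups: { email_address => signup_date }
# Returns [[email_address, template], ...] for everything due today.
def emails_due(signups, today)
  signups.flat_map do |address, signup_date|
    lifecycle_schedule(signup_date)
      .select { |send_on, _template| send_on == today }
      .map { |_send_on, template| [address, template] }
  end
end

signups = {
  'a@example.com' => Date.new(2010, 1, 1),
  'b@example.com' => Date.new(2010, 1, 8)
}
p emails_due(signups, Date.new(2010, 1, 9))
# [["a@example.com", :advanced_features], ["b@example.com", :things_you_can_do]]
```

Run once a day from cron, something like this is the whole "without my intervention" part.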

This is stupidly cost effective relative to finding new prospects.  (It costs me a penny to send an existing trial user an email but about a quarter to recruit a new trial user via AdWords.)  Since 97.6% or so of trial users aren’t buying, scraping back a mere fraction of the waste generates great returns for me, and it is incredibly scalable.  (I write the API integration once and test variations on the emails periodically, they get sent to thousands of people without my intervention, money hats all around.)
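
The arithmetic behind "stupidly cost effective" can be worked through with the figures in the paragraph above. This treats "a penny" and "about a quarter" as exact and assumes a new AdWords trial user converts at the same 2.4% as everybody else, so take the result as a ballpark.

```ruby
EMAIL_COST         = 0.01   # dollars per email to an existing trial user
ADWORDS_TRIAL_COST = 0.25   # dollars to recruit one new trial user
TRIAL_TO_SALE      = 0.024  # 97.6% of trial users do not buy

# What one sale costs when bought through new AdWords trials.
adwords_cost_per_sale = ADWORDS_TRIAL_COST / TRIAL_TO_SALE

# The fraction of emailed users who must buy for email to cost the same per sale.
break_even_rate = EMAIL_COST / adwords_cost_per_sale

puts format('AdWords cost per sale: $%.2f', adwords_cost_per_sale)
puts format('Break-even email conversion: %.3f%%', break_even_rate * 100)
# AdWords cost per sale: $10.42
# Break-even email conversion: 0.096%
```

In other words, an email only has to win back roughly one recipient in a thousand to match AdWords dollar for dollar; anything beyond that is gravy.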

That is, incidentally, a pretty brain-dead way to do things: with a little more work, I could e.g. send emails only to customers who weren’t active on the site, or vary the email contents with respect to how active or how sophisticated a user appeared to be, etc.  These are both things I intend to try out in 2010.

Similarly, you can create scalable systems to have your users do retention-improving activities for you.  By far the most brilliant implementation I’ve ever seen of this is on Facebook.  If you look to the top right of your main Facebook page right now, you’ll see “Suggestions” where Facebook tells you to add somebody as a friend or reconnect with someone on Facebook.  I will bet you a dollar that anyone who they suggest adding has few friends and anyone they suggest reconnecting with has not logged in recently.  Go check right now, I’ll wait.

Pretty amazing, right?  That is a few hours of engineer time, but it is going to get amazing increases in retention for Facebook (a key marketing goal) because it leverages the spontaneous-looking social pressure of a person’s own friends to keep them in the service.  And no Facebook engineer or marketer has to touch that system again, except trivially to test improvements to the textual calls to action.  You could have done this, or something which is similar for your niche or service.

Automatically generating advertising creatives. If you can create content for your website in a scalable fashion, why do you still have highly paid artisans fashioning exquisite one-off works of art for your landing pages again?  Generate a couple hundred, throw traffic at them, see what works and iterate.  If your analytics are sophisticated enough to track conversions back to whatever creative someone saw prior to signup (hint: this isn’t really all that hard but it also isn’t out-of-the-box behavior), you can quickly identify what works and what doesn’t.  Better yet, you can have a computer quickly identify what works and what doesn’t, so that you don’t have to worry about it.

I did this by repurposing the same content I use for my website and slotting it into a landing page template, which gives me about 750 distinct landing pages to work with.  If I took it to the next level and made variations on that template, I’d have thousands available for very little extra cost.  After that you just design a strategy for splitting traffic coming to them and Bob’s your uncle.

Don’t do what I do, but I just split half of my incoming traffic into the best landing page I’ve handwritten and half into the landing page my system thinks is best.  (Check out how complicated the logic is: “Send people to the landing page corresponding to the most popular content on the site this week.”  This tends to select for holiday bingo in the runup to holidays and my most popular generic activity, currently baby shower bingo, in dull times.)
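The 50/50 split described above can be sketched roughly like this. The page names and view counts are made up, and `choose_landing_page` is a hypothetical helper, not code from my site:

```ruby
# Picks the landing page for one visitor: half go to the handwritten page,
# half to whichever page matches this week's most popular content.
def choose_landing_page(handwritten, weekly_views, rng = Random.new)
  algorithmic = weekly_views.max_by { |_page, views| views }.first
  rng.rand < 0.5 ? handwritten : algorithmic
end

# Hypothetical view counts for this week's content.
weekly_views = { "halloween-bingo" => 4200, "baby-shower-bingo" => 3100 }
puts choose_landing_page("handwritten-landing", weekly_views)
```

Passing the random number generator in as an argument makes the split easy to test with a fixed seed.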

This should be the point where I tell you “My system beats the stuffing out of me, here are the numbers to prove it” but I actually don’t have the numbers handy, because I apparently had more important things to do with an hour of my time back in September than making a few thousand dollars.  Oh, that’s right, I was busy implementing the client/server feature.  Anyhow, forensic evaluation of my conversion rates for all my AdWords suggests that the 50/50 handwritten/algorithmic mix converts better than my previous 100% handwritten mix for the same landing page, so I’m betting that the system does indeed trounce my intuitions, but that is itself an intuition only marginally supported with data.  Let me get back to you on that in a few months.

Remove friction in your processes.  Another hat tip to Thomas for this idea.  One of the key insights to increasing productivity is changing things you do from disconnected tasks to processes.

This one idea explains a huge amount of why Toyota ran roughshod over Detroit, and has been discussed so often in the business literature I’d forgive you if you thought it was false.  Stopped clocks are right twice a day, and the hype about Toyota management you’ll find in your Business Books section is based on reality.

One of the corollaries to this notion is that processes which include steps that are boring, annoying, or tedious tend to fail to get performed.  For example, if anything you do for your business includes boring manual processing of data which you (consciously or otherwise) consider an insult to your intelligence, you probably will fail to do it despite the process being designed to be executed, e.g., weekly.  This is an example of friction in the process, and computers are really good at eliminating it.  You can either automate the boring bits or automate their assignment to someone more qualified than you to do them (i.e. freelancers), then automate the quality control, and then automate the notification to you that the raw data has been massaged and you can now continue with the work that actually matters.

There are literally infinite opportunities for this in your business.  Eliminate the friction in content creation by creating your own CMS or re-using existing data sources, as suggested above.  Eliminate the friction in testing by writing automatically executable tests.  Eliminate the friction in bookkeeping by having the computer do it for you.  Eliminate the friction in using your APIs by redesigning or wrapping them such that the common cases take no work at all.  etc, etc.

Try It.  You’ll Like It.

Hopefully the above list got the juices flowing on how you can do a bit of programming to improve the marketing in your business.  I’m also going to be exploring the topic quite a bit in the New Year, so stay tuned to the blog if you’re interested in it.

Credits: The beautiful lightbulb was lightly edited from a Creative Commons licensed work available through Flickr.

Twilio (phone call web API) is crazy fun

I live in Japan but my family lives in Chicago.  I wanted to make it simple for them to call me, so I looked for a service which would provide a US number and forward it to an international number.  This way they can call me without having to pay for the call to Japan or figure out how to do it, which nobody managed in nearly five years.

For the last year I’ve used TollFreeForwarding.com, which is… adequate.  Aside from the dropped calls, poor call quality, times when they call and I am connected to an irate Chinese man wondering what happened to his wife, and service interruptions, they pretty much do what they say on the tin.

My other point of annoyance with having a Chicago number so that my family can call me at any time is that my family sometimes calls me at any time, and while I really like speaking to them I would prefer not to do it at 4:30 AM on a work day.  (This also goes to the Bosnian high school student who found my number on my whois records and decided to chat about software strategy at 4:30 AM.  I’m totally willing to do that, but please folks, learn to use this website.)  Of course I can’t just block calls between 3 AM and 8 AM, because if it is an emergency then I’ll deal.

Which is a long-winded way of introducing why I love Twilio, a startup that makes it easy to script phone calls with an API.  (Which uses — cue cursings from the Big Freaking Enterprise Java Web App programmer — XML.  But only a little XML.)

They were featured on Hacker News the other night with a quick guide to making a “customized international calling card”.  I promptly saw the possibilities:

  1. Mom (et al) calls the 312-XXX-YYYY number they provide me.
  2. Twilio’s computers answer the call and do an HTTP request to my website to get my call script.
  3. If I am likely to be home and awake, the website says “Play them a little message then connect the call to my home in Japan.”
  4. If I am likely to be not at home, the website says “Play them a little message then connect the call to my cell in Japan.”
  5. If I am likely to be asleep, then the website says “Tell them it is ‘3:47 AM Japan time’ and ask them to call back tomorrow or, if it is an emergency, hit 1 to have Patrick woken up by Ride of the Valkyries.”
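Here is a rough sketch of what the website's side of steps 3 through 5 might look like: a plain Ruby method that returns the XML call script. The hour boundaries, the `/emergency` callback path, and the spoken messages are assumptions for illustration; `<Response>`, `<Say>`, `<Dial>`, and `<Gather>` are Twilio's documented verbs.

```ruby
HOME = "81xxxxxxxxxx"    # placeholder, not a real number
CELL = "81xxxxxxxxxxxx"  # placeholder, not a real number

# Japan is UTC+9 and does not observe daylight saving time.
def time_in_japan(now = Time.now)
  now.getutc + 9 * 3600
end

# Returns the call script for a call arriving at `now`.
# Assumed schedule: asleep before 08:00, out of the house until 18:59,
# at home from 19:00 onward.
def call_script(now = Time.now)
  t = time_in_japan(now)
  body =
    if t.hour < 8
      %(<Gather numDigits="1" action="/emergency">) +
        "<Say>It is #{t.strftime('%I:%M %p')} Japan time. Please call back " \
        "tomorrow, or press 1 if this is an emergency.</Say></Gather>"
    elsif t.hour < 19
      "<Say>Connecting you to the cell phone.</Say><Dial>#{CELL}</Dial>"
    else
      "<Say>Connecting you to the home phone.</Say><Dial>#{HOME}</Dial>"
    end
  %(<?xml version="1.0" encoding="UTF-8"?><Response>#{body}</Response>)
end

puts call_script
```

In a Sinatra app this string would simply be the return value of the route Twilio is configured to request.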

This took really disgustingly little code to accomplish.  My v1.0 of the “app”, which just redirected between two phone numbers with a message, was a hard-coded XML file that I whipped up in vi.  It was five lines long.  The more complicated version — not quite deployed yet, as I am still in the US for Christmas — is a Sinatra application and will be about 30 lines long, with a bit more for config files for the web server.  (Nginx, naturally.  I’m debating whether to try out Passenger.)

This would have been a bit easier if I had just deployed another Rails app to the same server as BCC but that would be gratuitous overkill, and Thomas Ptacek is trying to convince me that Sinatra is great for speedy little web dev tasks.

Anyhow, Twilio appears to be cheaper than my existing solution for my usage levels, have features more in line with my needs, and hopefully will not disrupt anyone’s marriage.  Which is sort of a plus.

Now if I can just find a way to use Twilio for my next application…

Bingo Card Creator Year In Review 2009

My name is Patrick McKenzie and for the last three years and change I’ve run a small software business selling Bingo Card Creator, which creates… OK, so I’m not the world’s most creative namer.  I traditionally publish stats and other observations from the business that I think are interesting. You can see my automatically compiled statistics and reports for 2006, 2007, and 2008 elsewhere on my blog.  2009 technically isn’t over yet but business typically comes to a standstill after this point in the year, so the overall financial picture is likely accurate in broad strokes, with perhaps a few hundred in sales yet to happen and some expenses which may or may not happen in this calendar year depending on the vagaries of when vendors charge my credit card.

Business Stats for This Year:

Sales: 1049 (up from 815 — 29% increase)

Refunds: 24 (unchanged from last year, down from 2.9% to 2.3% of sales)

Sales Net of Refunds: $31,156.18 (up from $21,141.60 — 47% increase)

Expenses: $12,630.47 (up from $12,318.54 )

Profits: $18,525.71 (up from $8,823.06 — 110% increase)

Approximate wage per hour worked: $125 ~ $150 (I have never been good with timesheets.  Sorry.)
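For anyone who wants to check my arithmetic, the year-over-year percentages above can be reproduced with a few lines of Ruby. The `pct_increase` helper is just an illustration; the inputs are the figures quoted in this list:

```ruby
# Percentage increase from old to new, rounded to the nearest whole percent.
def pct_increase(old, new)
  ((new - old) / old.to_f * 100).round
end

sales_count_growth = pct_increase(815, 1049)
sales_growth       = pct_increase(21_141.60, 31_156.18)
profit_growth      = pct_increase(8_823.06, 18_525.71)
profit             = (31_156.18 - 12_630.47).round(2)  # sales minus expenses

puts "Sales count: +#{sales_count_growth}%"
puts "Sales: +#{sales_growth}%"
puts "Profit: $#{profit} (+#{profit_growth}%)"
```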

Web Stats For This Year

(All stats are for bingocardcreator.com unless otherwise noted)

Visits: 546,000

Unique Visitors: 470,000

Page Views: 1.6 million

Traffic sources of note: Google (48%), AdWords (20%), “My Sprawling Bingo Empire” (4%, see below).

Trial downloads: 56,000 (flat from last year)

Trial signups to online version: 17,000 (new this year)

Guest signups to online version:  8,500 (new this year)

Approximate download-to-purchase conversion rate: 1.17%

Approximate online trial-to-purchase conversion rate: 2.33%

Narrative Version:

The defining event for my business this year was releasing the web version of my application this summer, which was extraordinarily successful for me.  In addition to the (clearly visible above) massive increase in conversion rates it afforded me, it has substantially decreased my support burdens, development costs per feature added, and headaches due to version incompatibilities.  (Well, with the noted exception of that time I cost myself $3,500 due to a CSS bug.)

I’m currently making more than enough money to live off of on Bingo Card Creator (I live rather simply in a rural prefecture in Japan — think Kansas with rather less white people) and, depending on the month, have made more than my day job a few times.  (My highest month in sales was October at $4.5k… and that is even with the $3.5k lost to that bug.)

Things That Went Right

  • The launch of the web application.
  • Raising prices, from $24.95 to $29.95.
  • My development of A/Bingo, a Rails A/B testing framework.  In addition to collecting numerous mentions from luminaries in the community, actually using A/Bingo has been key to my ongoing conversion optimization efforts.  (Last year, for example, I had approximately 1.4% conversion rates for the downloadable version of the software.)  A/Bingo is also deployed in production at over a dozen other businesses, helping to make other people lots of money, which makes me very happy.
  • Iterating on the Christmas Bingo Cards experiment from last year to develop a stable of mini-sites, currently mostly centered on holiday bingo from Valentines through Halloween.  While a full ROI breakdown is outside the scope of this article, systematizing the process, automating the deployments, and using freelancers to do a lot of the repetitive work has resulted in greater than a 10x ROI.  This isn’t quite as impressive as the returns on the freelancer-produced content -> SEO engine that made my year in 2008 and continues to produce dividends, but it has been both fun and profitable.  I hope to improve more on it next year.
  • Mailing list marketing through Mailchimp.  In the course of signing up for the online trial of my site, many teachers choose to accept a semi-monthly newsletter from me.  Since the typical user behavior on my site is to use it for an immediate need and forget about it, this gives me further bites at the conversion apple.  (It costs me about 24 cents to get a trial download or trial signup through AdWords at the margin, but only 2 cents to remind someone that their account still exists and that it would be perfect for the Halloween Thanksgiving Christmas festivities around the corner.)  I feel like I’ve only scratched the surface of email marketing, and it will pay for all my Christmas presents this year and then some.
  • Meat and potatoes SEO, marketing, customer support, and all that jazz.

What Didn’t Go Quite Right

  • Did I mention I lost $3.5k to a CSS bug?
  • AdWords has made me very unhappy at several points this year, from when they turned off my account to their inability to approve my new ad copy within a month of submission to the strange partial limitation for “gambling content” that my account is sporadically flagged for.  (Attempts to resolve this through AdWords customer support have been…  you know what, in the Spirit of Christmas (TM), I think I’m just not going to go there.)
  • I again failed at my goal to launch a 2nd product, largely due to lack of time and mental bandwidth.  (Although I suppose the online version practically counts as a second product.)
  • I spent a whole lot of time implementing online features in the desktop version of my app — probably something on the order of half the time I spent making the online version.  This was largely done on the theory “Hey, it adds value to the desktop offering for current customers, who wouldn’t want to switch to the web app.”  I’m too cool to actually ask users about their feelings beforehand, of course.  Fans of the Lean Startup can already predict how this story ends up, right?  To date, twenty users have touched those three features.  Let’s see: add marginal feature to application, make twenty users happy, or for just a few hours more develop and market a new application which doubles my revenue streams.  Decisions decisions!
  • I am not satisfied with the level of attention I gave the business at some points during the year.  In particular, communication with my freelancers was subpar, and on three occasions I took more than 24 hours to respond to a customer.
  • I was sort of hoping to hit the nice round $20k number for profit.  Didn’t quite get there.

Plans For 2010

  1. In 2007, I had vague dreams of someday going full time on this.  In 2008, I had a half-formed wish that maybe I would go full-time on this in 2008.  In 2009… I have a date circled in red on my calendar, forms from the tax office and immigration authority ready for submission as soon as the new year starts, and I’ve had the first conversation with my boss to inform him that I am considering my options.
  2. I’d like to make $45,000 in sales from BCC.  This is probably a little on the low side, considering that if I don’t have a repeat of the CSS bug it would only require sustaining my current performance rather than markedly improving.  But, hey, I don’t want to get too caught up in hitting this to the exclusion of my other business and personal goals for the year.
  3. I’d like to have about $30,000 in profit from BCC.  This is a bit on the aggressive side if you assume $45,000 in sales — it means I’ll have to get most of the growth out of doing the old things better rather than just spending up through AdWords.  (Which would be wonderful if it were possible, don’t get me wrong.)
  4. Personal goals: I’d like to attend (and, ideally, present something fun) at an industry event overseas and at one in Japan.  (I got a bit of an invitation to talk software design in Osaka this January.  Jeepers.)  I’d also like to spend a lot more time with my family, which will be easier after I no longer have to ask for permission four months in advance to fly out to see them.  (I’m still not entirely decided on where I’ll end up with my business, but the great thing about it is the whole thing fits in a laptop bag, so aside from some irksome tax and visa issues I have a lot of flexibility in where I live.)

Bringing A/B Testing To The Fortune 5 Million

After writing an A/B testing library and blogging about the subject for a couple of years, I somehow unwittingly sleepwalked into being that most loathsome of creatures, a technology evangelist.  This means that periodically I get emails from folks with the Next Big Thing who want my opinion on it.  I rather enjoy this, as I’m passionate about the subject, can usually find ideas to steal, er, become inspired by, and always enjoy a good tech demo.

One of the folks who asked my opinion was Paras Chopra over at Wingify.  Wingify is an analytics startup which, from my brief impression of using it, was trying to be an awful lot of things to an awful lot of people.  That is sort of the nature of the beast with enterprise software.  I did the requisite “install the tracking code” thing, commented to Paras that I would have felt a little lost if I didn’t breathe analytics systems, and more or less forgot about it.

But Paras and his team kept iterating and they produced a real gem — probably the best single piece of software I’ve seen in this field.  As Paras explains:

One of the biggest [lessons] I have had is that there is a huge gap in what we (split testing community) profess and what small businesses actually adopt. I have [learned] that many website owners are curious about split and multivariate testing but don’t have a clue where to start and what to use. Though [the] Google Website Optimizer guys are doing a great job to the community by evangelising split testing, … the difficulty in using the tool and fiddling with code leaves most people wondering how to really [use] testing.

I think this is an extraordinarily good job of learning from your customers, like that “customer development” that seems to be going the rounds these days.  Wingify initially set out to be an enterprise-class analytics system, but when trying to sell it Paras et al found out that the customers don’t need an enterprise-class analytics system.  They need the Microsoft Word of Internet marketing, a simple pick-up-and-go tool that you can install, employ, and benefit from without needing cooperation from the operations, marketing, and engineering teams which you either a) don’t have or b) can’t get to do your bidding.

This is sort of like the process I went through with making A/Bingo, except from another direction.  I couldn’t use the market-leader A/B testing tool (Google Website Optimizer) because it was built for non-technical marketers and made too many compromises to be useful for me.  Paras’ customers can’t use GWO because it isn’t nearly non-technical enough — it still requires inserting multiple chunks of Javascript code, knowing HTML to make alternatives, being comfortable with regular expressions and URLs, etc.  These aren’t core concerns if your market is Rails developers, but if your market is e.g. real estate agents with 5 page brochureware sites who want to split test the call to action to join their mailing list without having to engage a freelancer to do it, they’re huge stumbling blocks.

Genius UI

Visual Website Optimizer, Wingify’s new product, has a UI for creating A/B tests so simple that it will crush the life out of all other solutions for non-technical users:

  1. VWO opens your site in a browser.
  2. You click the element on the page you want to A/B test.
  3. You click “Add variation.”
  4. You add a variation by typing into a WYSIWYG editor.  (TinyMCE, if I don’t miss my guess.  Score one for OSS.)
  5. Copy/paste the Javascript we give you into your page.  You don’t have to identify sections, massage your HTML, or create alternate URLs.  We do that %(#$ for you.

For example, here is me clicking on the headline for BCC:

And I think I’ll rewrite it to say “PC or Mac” instead of computer.

Dead easy.  The remainder of the process involves a bit of Javascript cut/pasting and some URL specifying.  (The interface for this could still be improved a little bit, to be sure.  I don’t think it is quite at the level where it needs to be for non-technical folks to intuitively grasp it.  But, hey, ship and iterate, right?)

Why I’m Particularly Impressed With This

Aside from the work going on in the background to make this process so pain-free (that is live, unaltered HTML they’re working with, and I didn’t do anything when coding it to make it particularly easy for them to rewrite the DOM model on the fly with their injected Javascript), this software impresses me as a business.  It solves a clear need for a huge number of small businesses, and brings a powerful technique to people who would never have been able to use it before.  Moreover, it does it so disruptively, embarrassingly better than Google does that it puts a smile on my face.  I like Google, don’t get me wrong, they’ve made me a lot of money.  But all the king’s horses and all the king’s men apparently can’t deliver a UI as good as a small team.

A quick note: Paras et al are from India.  After a few years of doing outsourcing management I’m quite happy to see a young team producing something very worthwhile rather than doing the traditional thing and being a wee little cog in a giant corporation working on grinding out back-office software.  I have to say one thing, though, and it is straight out of the Economic History of Japan playbook: people don’t take copiers seriously, and it is a hard impression to shake once you’ve gotten it attached to you, fairly or otherwise.  The design “inspired by” Basecamp is… ahem…  well, suffice it to say that it does not demonstrate nearly as much originality as the software does.  I’d hate for folks to write this startup off just for that, but first impressions matter.

Rather than taking (deserved) lumps for flying the Jolly Roger, I’d suggest folks either use an open source web design, pick one of the attractive reasonably priced templates the Internet is overflowing with, or hire somebody with design skills to bang out something decent for v1.0.  StyleShout (OSS) and ThemeForest (paid, but sinfully cheap) both have very attractive Web 2.0-y designs.  After you have revenue you can always improve it to your heart’s content, particularly if you’ve kept the design mostly separate from your program logic.  (A taller order than it needs to be in PHP, I know…)

Want To Try It Out?

Paras provided me with an invite code (“patrick-bcc” without quotes), which will work for the first 30 folks who use it.  Visual Website Optimizer is free while it is in beta, and will be a paid tool after that.

I’d expect pricing to be “reasonable”, although my advice (to Paras and anyone else) is to charge more than he thinks it is worth.  Trust me, it would be cheap at ten times the price if it worked for dentist offices, real estate agents, car mechanics, and the other constituents of the Fortune 5 Million.  One conversion at the margin is potentially worth hundreds or thousands of dollars, and even a single lead would pay for a month of most software-as-a-service products.

It Seems I Have A Podcast Now

I was just checking referrals today and found some from HearABlog, a new startup which narrates blogs for consumption on the go.  Apparently they used my blog as an example to demo their service, making podcasts of the following three posts:

  • Practical Metaprogramming: Storing User Preferences (download)
  • Tracking Down a Subtle Analytics Bug (download)
  • The IE CSS Bug Which Cost Me A Month’s Salary (download)

The quality is quite good — I’m especially impressed at their oral descriptions of the screenshots and graphs that were crucial to understanding some of those posts.  The translator in me approves immensely.  My only constructive criticism: Cal-zoo-ME-us, not Cal-zoo-MAY-us.  Because even made up words have a canonical pronunciation, even if only in my mind. :)

I’m not sure if strictly speaking creating derivative works from this blog is copyright kosher without asking me first, but in the interests of supporting people doing cool things, HearABlog has my explicit permission to produce and distribute audio recordings of all my blog posts on this site, and that will continue until I say otherwise.  (I have no particular plans of ever doing it but, hey, you never know.)

A request to people doing cool things in the future: I have an email address and a marked aversion to saying “No”, so please, feel free to tell me about it in advance.

Edited to add: I just went back and checked my email, and Pablo from HearABlog did try to get in touch with me on October 29th.  Unfortunately, he managed to email me when I was on a plane over the Pacific, and I didn’t see the email later in the crush of payment notifications.  (A high class problem, to be sure.  Halloween is my busy season.)

Practical Metaprogramming with Ruby: Storing Preferences

All code in this article is copyright Patrick McKenzie 2009 and released under the MIT license. Basically, you can feel free to use it for whatever, but don’t sue me.

The other day on Hacker News, commenting on a recent Yehuda Katz explanation of the nuts and bolts of metaprogramming, I mentioned that I thought discussions of programming theory are improved by practical examples of how the techniques solve problems for customers.  After all, toy problems are great, but foos and bars don’t get me home from the day job quicker or convince my customers to pay me money.

My claim: Metaprogramming allows you to cut down on boilerplate code, making your programs shorter, easier to write, easier to read, and easier to test. Also, it reduces the impact of changes.

I’m going to demonstrate this using actual code from my PrintJob class, which encapsulates a single request to print out a set of bingo cards in Bingo Card Creator.  PrintJobs can have any number of properties associated with them, and every time I implement a new feature the list tends to grow.  For example, when I added in the ability to print in color, that required adding five properties. This pattern is widely applicable among many types of one-to-many relationships where you never really look at the many outside of the context of their relationship to the one; user preferences would be an obvious example.

There are a few ways you can do this in Rails. The most obvious is put each property as a separate column in your table. This means that

  1. you’d do a database migration (downtime! breakage! unnecessary work!) every week you add a new property.

If you’re getting into the associations swing of things, you might consider creating a has_many relationship between PrintJob and PrintJobProperties, with each PrintJobProperty having a property_name and a property_value.  Swell.  Now you need to:

  1. Do twenty joins every time you inspect a single PrintJob.
  2. Add a bunch of unique constraints (in your DB or via Rails validations — I hope you have earned the favor of the concurrent modification gods) to prevent someone from assigning two properties of the same name to the same print job.
  3. Have very intensely ugly syntax for accessing the actual properties.  (Let’s see, print_job.options.find_or_create_by_name("foo").value = "bar" ought to do it.)

Instead of either of these methods, I save the properties to an options hash, and then serialize that to JSON to save to and load from the database column.  Rails takes care of most of the details for me.
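To make the idea concrete, here is a minimal sketch of the round trip using only Ruby's standard `json` library. It is not the Rails mechanism itself, and the option names and values are hypothetical:

```ruby
require "json"

# A hypothetical handful of print job options.
options = { :background_color => "FFFFFF", :card_count => 25, :call_list => true }

# What would be written to the single text column.
column_value = JSON.generate(options)

# What comes back on load. JSON has no symbols, so ask the parser to turn
# the keys back into symbols; numbers, strings, and booleans survive as-is.
restored = JSON.parse(column_value, :symbolize_names => true)

puts column_value
puts restored == options
```

The one gotcha worth remembering is the key type: without `:symbolize_names`, the keys come back as strings and lookups by symbol quietly return nil.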

Enter The Metaprogramming Magic

However, this means I would have to write code like print_job[:options][:background_color], which is excessively verbose, and every time I referred to it I would need to possibly provide a default value in the event it was nil.  Too much work!

Instead, we’ll use this Ruby code:

#goes in /lib/current_method.rb

#Returns the method name ruby is currently executing.
#I use this just to make my code more readable to me.
module CurrentMethodName
  def this_method
    caller[0][/`([^']*)'/, 1]
  end
end

#goes in /app/models/print_job.rb
class PrintJob < ActiveRecord::Base
  include CurrentMethodName

  #The name of the first key (:rows) is assumed.
  DEFAULT_OPTIONS = {
    :rows => 5, :columns => 5, :column_headers => "BINGO",
    :free_space => "Free Space!", :cards_per_page => 1, :card_count => 1,
    :page_size => "LETTER", :title => nil, :title_size => 36,
    :font => "Times-Roman", :font_size => 24, :good_randomize => true,
    :watermark => false, :footer_text => "Omitted for length",
    :call_list => true, :background_color => COLOR_WHITE,
    :second_background_color => COLOR_GREY, :border_color => COLOR_BLACK,
    :text_color => COLOR_BLACK, :color_pattern => "plain"
  }

#set up accessors for options
  DEFAULT_OPTIONS.keys.each do |key|
    define_method key do
      unless options.nil?
        options[this_method.to_sym]
      else
        nil
      end
    end

    define_method "#{key}=".to_sym do |value|
      #self. is required here: a bare "options = {}" would only create a
      #local variable, and the new value would be silently thrown away.
      self.options = {} if options.nil?
      options[this_method.to_s.sub("=","").to_sym] = value
    end
  end

#Other stuff omitted. Sorry, I'm not OSSing my whole business today.
end

What This Code Does

OK, what does this do? Well, first I define a bunch of default options, which are later used (code not shown) to initialize PrintJobs right before they’re fed into the actual printing code. Each default option is used to create a getter/setter pair for PrintJob, so that instead of typing print_job[:options][:background_color] I can just type print_job.background_color. You’ll also notice that the accessors guard against an uninitialized options hash: the getters return nil instead of blowing up, and the setters create the hash if I haven’t done it already. This saves me from accidentally forgetting to initialize it and then winding up calling nil[:some_option].
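If you want to play with the technique outside of Rails, here is a stripped-down sketch in plain Ruby. `OptionBag` and its two defaults are hypothetical stand-ins, and this version uses the `key` variable captured by the block rather than looking up the current method name:

```ruby
class OptionBag
  DEFAULT_OPTIONS = { :rows => 5, :background_color => "white" }  # hypothetical
  attr_accessor :options

  # Generate one getter and one setter per default option.
  DEFAULT_OPTIONS.keys.each do |key|
    # Getter: nil if the hash has not been created yet.
    define_method(key) { options.nil? ? nil : options[key] }

    # Setter: create the hash on first use, then store the value.
    define_method("#{key}=") do |value|
      self.options ||= {}
      options[key] = value
    end
  end
end

job = OptionBag.new
p job.background_color   # options is still nil at this point
job.background_color = "red"
p job.background_color
p job.options
```

Adding a new option is then a one-line change to `DEFAULT_OPTIONS`; the accessors appear on their own.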

Why This Code Is Useful

Clearly this saves keystrokes for using getters/setters, but how does it actually save work? Well, because each of the properties are now methods on the ActiveRecord object (the PrintJob), all of the Rails magic which you think works on columns actually works on these options, too. This includes:

  • validations
  • form helpers
  • various pretty printing things

Since card_count is just another property on the PrintJob ActiveRecord object, Rails can validate it trivially. Try doing that within a hash — it isn’t fun. I sanity check that card_count (the number of cards printed for this print job) is an integer between 1 and 1,000, and additionally check that, for users who aren’t registered, it is between 1 and 15. (I’ve omitted the message which tells folks trying to print more to upgrade.)

  validates_numericality_of :card_count, :greater_than => 0, :less_than => 1000
  validates_numericality_of :card_count, :greater_than => 0, :less_than => 16,
:unless => Proc.new {|print_job| print_job.user && print_job.user.is_registered?}
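Spelled out in plain Ruby, the two validations above amount to roughly the following check. This is a sketch of the logic only, not what Rails runs internally, and `valid_card_count?` is a hypothetical name:

```ruby
# Mirrors the bounds in the two validations: greater than 0 and less than
# 1000 for everyone, and additionally less than 16 unless the user is registered.
def valid_card_count?(card_count, registered)
  limit = registered ? 1000 : 16
  card_count.is_a?(Numeric) && card_count > 0 && card_count < limit
end

puts valid_card_count?(15, false)
puts valid_card_count?(16, false)
puts valid_card_count?(16, true)
```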

Here’s an example of a portion of the form helper from the screen where most of these options are set:

#Just a part of the form.

<%= ... :card_count, ... => 4, :title => "Total number of cards to print." %>

Ordinarily in the above code you’d expect card_count to correspond to a column in the database, and then the column would cause there to be card_count and card_count= methods on PrintJob, and this would be used to initialize the above text field. Well, Rails doesn’t really care how those methods came to be — they could be placed there by ActiveRecord magic, or attr_accessor, or defining by hand, or creative use of metaprogramming, as above. It takes about 7 lines to define a getter/setter pair in most languages. I have twenty properties listed up there. Instant savings: 140 lines of code.
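The "doesn't care how those methods came to be" point is plain duck typing, and it is easy to demonstrate without Rails. In this sketch `text_field_for` is a hypothetical stand-in for a form helper (it does no HTML escaping), and both classes are invented for the example:

```ruby
# A stand-in for a form helper: it only needs the object to respond to the
# field name, and has no idea whether a database column backs it.
def text_field_for(object, field)
  value = object.public_send(field)
  %(<input type="text" name="#{field}" value="#{value}" />)
end

# Accessors written the conventional way.
class HandWritten
  attr_accessor :card_count
end

# Accessors generated with define_method.
class Generated
  define_method(:card_count) { @card_count }
  define_method(:card_count=) { |v| @card_count = v }
end

[HandWritten.new, Generated.new].each do |job|
  job.card_count = 25
  puts text_field_for(job, :card_count)
end
```

Both objects produce identical markup, which is the whole point.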

Similarly, I’m saved from having to write a bunch of repetitive non-sense in the controller, too.

def some_controller_method
  @print_job = PrintJob.new
  @print_job.sensible_defaults! #initializes defaults for options not already set
  #update all parameters relating to the print job
  params[:print_job].each do |key, value|
    if @print_job.options.include? key.to_sym
      @print_job[:options][key.to_sym] = value
    end
  end
end

This walks over the param map for things being set to the PrintJob and, if they’re an option, sets them automatically. This saves about twenty lines of manual assignment. (Nota bene: PrintJob.new(param) will not work because the virtual columns are not real columns. In general, I hate mass assignment in Rails anyhow — sooner or later it will bite my hindquarters for security if I use it, so I assign only those columns which I know to be safe. Note that nothing in the options hash is sensitive — after all, they’re just user options.)
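The same whitelist idea in isolation, as a sketch you can run without Rails. `ALLOWED_OPTIONS` and `assign_known_options` are hypothetical names, and the fixed list stands in for "whatever keys the options hash already has after defaults are applied":

```ruby
ALLOWED_OPTIONS = [:card_count, :font, :background_color]  # hypothetical subset

# Copies only recognized option keys from the submitted params into options,
# ignoring everything else the request happens to contain.
def assign_known_options(options, params)
  params.each do |key, value|
    sym = key.to_sym
    options[sym] = value if ALLOWED_OPTIONS.include?(sym)
  end
  options
end

submitted = { "card_count" => "30", "font" => "Helvetica", "admin" => "true" }
p assign_known_options({ :card_count => 1 }, submitted)
```

The stray "admin" parameter never makes it into the hash, which is exactly the failure mode that blind mass assignment invites.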

This controller is extraordinarily robust against change. When I added five extra options to the print jobs to accommodate my new features (font, color, and pattern selection), I didn’t change one single line of the associated controllers.

But wait, there’s more! You see, 200 lines of negacode (code that you don’t have to write) means 200 lines of code that you don’t have to read, test, maintain, or debug. I didn’t have to change the controller at all. I didn’t have to check to see if the new properties were automatically initialized to their starting values, since the code which performed the initialization was already known to work. I didn’t have to debug typos made in the accessors. It all just worked.

This is the power of metaprogramming. The less boilerplate code you have to write, understand, read, test, debug, and maintain, the more time you can spend creating new features for your customers, doing marketing, or doing other things of value for your business. The last three features I added caused five new properties to be added to my PrintJob model. I just diffed the pre- and post-commit code in SVN. Three features required:

  • No change to the schema.
  • No change to the controller.
  • 35 lines to implement three features in the model. (Counting the white space.)

(The view required about 25 lines of new code, mostly inline JavaScript, because a good UI for picking colors is tricky. I ended up liberally borrowing from this chap, who has an awesome color picker in Prototype that was amenable to quick adaptation to the needs of technically unsophisticated users who wouldn’t think #F00F00 is a color.)

This is a much, much more effective use of my time than writing several hundred lines worth of boilerplate model code, then repeating much of it in XML configurations, just so that I can actually have access to the data needed to begin implementing the features. (How did you guess I program in Java?)

A Note To DBAs In The Audience

Yeah, I hear you — stuffing preferences in JSON is all well and good, but doesn’t this ruin many of the benefits of using SQL? For instance, isn’t it true that I can no longer query for PrintJobs which are colored red in any convenient manner? That is absolutely true. However, it isn’t required by my use cases, at all.

PrintJobs are only ever accessed per user and per id, and since both of those have their own columns, having all the various options be stored in an opaque blob doesn’t really hurt me. I sometimes regret not being able to do “select sum(card_count) from print_jobs;”, but since I don’t have to calculate “total number of cards printed since I opened” all that frequently, it is perfectly adequate to just load the entire freaking table into memory and then calculate it in Ruby. (It takes about 5 seconds to count: 217,264 cards printed to date. Thanks users!)
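For a sense of what "calculate it in Ruby" might look like, here is a minimal sketch. The rows below are made-up JSON blobs standing in for the options column; how the real table is loaded is not shown, and the key names are assumptions.

```ruby
require 'json'

# Hypothetical rows: each string stands in for one print job's options blob.
rows = [
  '{"card_count": 30, "color": "red"}',
  '{"card_count": 15}',
  '{"color": "blue"}'   # no card_count recorded for this job
]

# Parse each blob and add up card_count, treating a missing value as zero.
total_cards = rows.inject(0) do |sum, blob|
  sum + JSON.parse(blob).fetch("card_count", 0).to_i
end
```

This is linear in the size of the table, which is tolerable for an occasional report and a bad idea for anything on a request path.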

Note Regarding Security

Programmatically instantiating methods is extraordinarily dangerous if you don’t know what you’re doing. Note that I create the methods based on keys used in a constant hash specified in my code. You could theoretically do this from a hash created anywhere — for example, Ruby will happily let you create methods at runtime so you might decide, eh, PrintJob needs a method for everything in the params hash. DO NOT DO THIS. That would let anyone with access to your params (which is, well, anyone — just append ?foo=bar to the end of a URL and see what happens) create arbitrarily named methods on your model objects. That is just asking to be abused — setting is_admin? to 1 or adding can_launch_nuclear_weapons to role, for example.
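A small sketch of the safe direction: let a constant list in the code decide which names exist, and silently drop everything else that arrives in the params. The list contents and the helper name are hypothetical.

```ruby
# Assumed whitelist; the point is that it is a constant written by the programmer.
ALLOWED_OPTIONS = %w[card_count font color].freeze

# Copy only whitelisted keys from untrusted input into the options hash.
# Keys are compared as strings, so unknown input is never turned into a symbol.
def assign_safe_options(options, untrusted_params)
  untrusted_params.each do |key, value|
    options[key.to_sym] = value if ALLOWED_OPTIONS.include?(key.to_s)
  end
  options
end

# "is_admin?" arrives from the query string but is never stored.
result = assign_safe_options({}, { "color" => "red", "is_admin?" => "1" })
```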


Tracking Down A Subtle Analytics Bug

I have a confession to make: I often trust code that I think works.  For example, after I’ve got analytics code up and running, and verify to my satisfaction that it in fact increments the count of registrations by one when I sign up, I generally assume “OK, that is the last time I have to worry about that.”

This is, perhaps, a weakness.  For the last several months, for example, I’ve periodically checked in to Mixpanel or my A/Bingo stats and thought “Hmm, that is a lot less conversions than I expected over that time interval”, but I knew my conversion code worked and it wasn’t displaying zeroes, so I mentally pushed it out of mind.

Today, I was finally motivated into checking what was up when I reviewed the statistics for my recent Halloween promotions.  Portions of these did exceptionally well: for example, in the week before Halloween, I had over 2,000 trial signups.  In the course of writing a blog post about this, I tried to figure out how many of those trial signups were caused by the Halloween promotion, and to do this I opened my statistics at Mixpanel.  (Mixpanel is an analytics service with a wonderful API.  I highly recommend them. They’re blameless for the bug I’m about to talk about.)

Mixpanel, however, reported that over the same week, I only had 375 people open the sign up form and some 180 of them actually sign up.  While it is quite common for multiple analytics sources to disagree about exact numbers, a discrepancy that large meant that there was probably a bug somewhere.

Having previously had some issues with my Mixpanel integration, the first place I looked was my log files, to see if I was actually calling the Mixpanel API.  Some light grepping suggested that I was, in about the frequencies I expected to.  Then I thought “Hmm, I wonder if the unique IDs I am passing with each visitor are, in fact, unique?”

A quick check on the command line suggested, no, I was in fact duplicating IDs.  By the scads.  This resulted in me pulling out the code which assigned Mixpanel IDs:

  def fetch_mixpanel_id
    user_id = #code which sets current user ID, or nil for "no logged in user"

    mixpanel_id = session[:mixpanel_id]
    if (user_id)
      mixpanel_id = Rails.cache.read("mixpanel_id_for_#{user_id}") || mixpanel_id
      mixpanel_id ||= user_id
      Rails.cache.write("mixpanel_id_for_#{user_id}", mixpanel_id)
    else
      mixpanel_id ||= rand(10 ** 10).to_i
    end
    session[:mixpanel_id] = mixpanel_id
    #omission for clarity
  end

I was flummoxed, because the code appeared correct, if a little convoluted on first glance. The idea is to persist the Mixpanel ID for a given user — first, by stuffing it in their session cookie, and second, by stuffing it in Memcached so that if they bounce between multiple different machines I can still track them after they log in. If they don’t have an ID set anywhere, it gets randomized — that is the rand (10 ** 10) call.

Given that my user login code was correct (or there would be much, much more serious problems than analytics failing — people would be seeing my admin screens, bingo cards would bleed across accounts, cats and dogs would be friends, etc) and I trust Memcached not to have critical data-corruption bugs, the only place that made sense for introducing the error was the random statement. Which, of course, makes no sense at all — 10 digit random numbers should, by definition, not routinely collide. (There is the birthday paradox to worry about, I know, but the odds of that happening with only 15,000 accounts on the system were fairly slim and the odds of it causing the current issue were too low to measure.)
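To put a number on "fairly slim", the usual birthday-problem approximation can be evaluated directly. This is a back-of-the-envelope sketch that assumes every one of the 15,000 accounts received an independent, uniformly random ID.

```ruby
# Birthday-problem approximation: P(at least one collision) ~ 1 - exp(-n(n-1) / 2N)
n = 15_000        # accounts on the system, from the post
space = 10**10    # number of possible values of rand(10 ** 10)

collision_probability = 1 - Math.exp(-n * (n - 1) / (2.0 * space))
# roughly 0.011, i.e. about a 1% chance of even a single collision
```

A single accidental collision is unlikely; hundreds of copies of the same ID cannot be explained this way.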

So I dropped into my Rails console and used it to inspect the memcached IDs, to see if I could find any patterns.

  mixpanel_ids = (1..User.count).inject({}) {|hash, user_id| hash[user_id] = Rails.cache.read("mixpanel_id_for_#{user_id}"); hash}
  counts = mixpanel_ids.to_a.map {|tuple| tuple[1]}.inject({}) {|hash, id| hash[id] ||= 0; hash[id] +=1; hash}
  counts.to_a.sort {|a,b| a[1] <=> b[1]}.reverse[0..9]

For those of you not fluent in Ruby, these are just some quick, basic operations on data sets. Ruby excels at them and the ability to do them in real time on the console is fantastic. (I don’t suggest doing them with truly massive data sets but when you count things in thousands, oh well, Slicehost gave me 8 cores for a reason.)
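The same tally-and-rank idea, spelled out on a toy array for readers who find the one-liners dense. The IDs are invented, and each_with_object assumes Ruby 1.9 or later.

```ruby
# Made-up IDs standing in for the values read back from the cache.
ids = [111, 222, 111, 333, 111, 222]

# Tally how many times each ID appears.
counts = ids.each_with_object(Hash.new(0)) { |id, tally| tally[id] += 1 }

# Sort by count, largest first, and keep the top entries.
top = counts.sort_by { |_id, count| -count }.first(2)
```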

Anyhow, this exploration showed that the most common IDs were repeated literally hundreds of times each. However, it wasn’t a uniform distribution. Instead, it looked like exponential decay, from the most repeated ID having 800 copies to a long tail of truly unique IDs.

As soon as I saw that pattern, I knew what had to be causing it: an srand bug. srand sets the random seed for your process. The random seed determines which random number the next call to rand gives you. Two pieces of code which execute rand with the same random seed will always get the same random number. Since this is not desirable 99% of the time, if you execute rand before setting a seed through srand, the seed is initialized from a combination of the current time, process ID, and a sequence number. This virtually guarantees that you’ll get a unique random seed, and thus you get mostly unique streams of random numbers.
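The seed behavior is easy to confirm in irb. A minimal sketch (the seed value 42 is arbitrary):

```ruby
# Same seed, same sequence.
srand(42)
first_run = Array.new(3) { rand(10**10) }

srand(42)
second_run = Array.new(3) { rand(10**10) }

sequences_match = (first_run == second_run)

# Calling srand with no argument picks a fresh seed, so the stream diverges again.
srand
third_run = Array.new(3) { rand(10**10) }
```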

This narrowed down the possible causes of my bug quite precipitously: either there was a bug in Ruby’s srand (unlikely) or I was setting srand deterministically somewhere. So I did a quick search through my code base and, yep, there it was:

if (options[:good_randomize])
  srand
else
  srand(12345678)
end

This lovely bit of poorly thought out code is in the code which controls printing bingo cards. Each card has the words scrambled, but trial users (whose options[:good_randomize] gets set to false) don’t get truly scrambled cards. They get a sequence of cards which is always the same, capped at 15 cards, so that no matter how many times they try to print they can never create more than 15 unique cards. This encourages them to purchase the software, which removes that limitation.

Since that code snippet is executed for every print job, every user either sees a properly random random seed or the fixed random seed. However, after the code finishes, the random seed lives on in the Mongrel process. This was the ultimate cause of my bug.

Imagine a sequence of events like so:

1) Trial user Bob prints out a single bingo card, which sets the random seed to the deterministic value and uses 25 calls to rand.
2) Jane comes to the site for the first time and has a random number assigned to her. It is guaranteed to be the 26th element (call it R26) in the sequence from a fresh call from the deterministic random seed.
3) Bob prints out another bingo card, which sets the random seed to the deterministic value and uses 25 calls to rand.
4) Frederick comes to the site for the first time and has a random number assigned to him. It is guaranteed to be the 26th element (call it R26) in the sequence from a fresh call from the deterministic random seed.

Thus, Jane and Frederick end up getting the same “random” ID assigned to them. Thus, when I report that both Jane and Frederick signed up for the free trial, Mixpanel decides “Patrick’s conversion code is reporting the same event twice, OK, no biggie: I’m going to discard that second funnel event.”

In addition, since both Jane and Frederick shared the same ID for their A/Bingo identity, both of them would always see the same A/B test alternatives. That wouldn’t be too bad, except they also were counted as the same person for conversion purposes, so if Jane converted Frederick’s conversion wouldn’t be counted.

In actual practice, many users still end up getting random IDs, since any particular Mongrel’s random seed got unborked when a paying user printed out a bingo card… at least until the next trial user printed out a bingo card on that Mongrel. (Random seeds are maintained at the process level in Ruby, and each Mongrel is a separate process.)
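The Bob/Jane/Frederick sequence can be reproduced in a few lines within a single process. The method names are hypothetical stand-ins, and rand(75) is just an assumed placeholder for whatever the card code draws; only the seed and the count of 25 calls come from the description above.

```ruby
FIXED_SEED = 12345678

# Stand-in for printing one trial card: reseed, then consume 25 random numbers.
def print_trial_card
  srand(FIXED_SEED)
  25.times { rand(75) }
end

# Stand-in for assigning an ID to a first-time visitor.
def new_visitor_id
  rand(10**10)
end

print_trial_card                 # Bob prints a card
jane_id = new_visitor_id         # Jane gets the 26th number in the sequence

print_trial_card                 # Bob prints again, rewinding the sequence
frederick_id = new_visitor_id    # Frederick gets the same 26th number
```

Both visitors end up with identical IDs, even though neither of them ever touched the printing code.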

Anyhow, long story short:

1) No users except me were adversely affected by this bug — it only affected analytics and A/B test results.
2) Other folks using my public Mixpanel APIs and A/Bingo A/B testing code were unaffected — it was only the interaction of this code with the srand(12345678) code that caused the issue.
3) Two months of my analytics and A/B test data is corrupted.

I’m not sure it matters to the bottom line of the A/B test results, which are shockingly robust against programming errors like this: as long as the source of error doesn’t correlate with your A/B choices, the errors will be evenly distributed across both alternatives, so it just washes out. For example, it is likely upwards of 60% of data for the A/B tests was thrown out as “duplicate” erroneously, but since the “duplicates” were evenly distributed across A and B this borks the measured totals and percentages but not the measured significances. That’s some comfort to me, cold as it is.

The moral of the story: don’t use srand(constant) anywhere in your production code, even if it is the easiest way to get what you want, because sooner or later you will use rand() in some unrelated code called by the same process and now you have extraordinarily subtle bugs caused by reuse of random numbers. If you absolutely must use srand(constant), call srand() on a per-request basis to clear out anything which may be lurking around from previous requests. (For example, in an application-wide before filter.)
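One other way to get a repeatable sequence without touching the process-wide seed is a private generator. This is a different technique from the per-request srand mentioned above, and it assumes Ruby 1.9.3 or later, where Random.new and the :random option to Array#shuffle are available. The word list and method name are hypothetical.

```ruby
# Hypothetical word list; the real card-generation code is not shown in the post.
WORDS = %w[apple banana cherry date elderberry fig grape].freeze

# Shuffle with a private generator so the global seed is never modified.
def trial_card_words(seed = 12345678)
  rng = Random.new(seed)
  WORDS.shuffle(:random => rng)
end

card_a = trial_card_words
card_b = trial_card_words   # identical ordering to card_a
```

Because the fixed seed lives only inside the local Random object, later calls to the global rand in the same process are unaffected.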