Lesson from Madlibs Signup Fad: Do Your Own Tests

Periodically, news of an innovative, goofy, compelling, or compellingly goofy design decision will sweep across the Internets like wildfire.  Most recently, this happened with a madlibs-looking lead generation form.

I think it has much to recommend it in the context of lead generation forms (long, arduous monstrosity that you sign up for in the hopes you are contacted but not spammed to death), but I didn’t see much possible upside for using it on a new user registration form (short form which you sign up to use something).

However, I’m wary of trusting my instincts on such things when I could trust data instead.  There is a key point about A/B testing: trust your data, not somebody else’s data.  After all, you only make money when it improves your conversion rate, not their conversion rate.  You can feel free to use other folks’ successful experiments for inspiration, but for heaven’s sake use them to inspire you to run tests, rather than inspire you to fire blindly.

I was particularly wary about trusting this result because, as pointed out by numerous people in the Hacker News discussion, roughly seven things changed between the two forms in the A/B test performed on the standard form versus the madlib form, and there is no particular reason to assume that the salient difference was caused by the part which strikes us as creative as opposed to more boring things like the call to action in the header.

When In Doubt, Test.  (When Not In Doubt, Test Twice.)

No fewer than six people said “Hey Patrick have you seen this madlibs thing yet?  You’ve got to try it.”  Since I have an A/B testing framework that makes this a one-line proposition, knocking something together would take less than 10 minutes, so I decided I’d humor them.  I isolated just the madlibs versus standard style for the test, knocked up an alternative with my (decidedly limited) CSS and Javascript skills, and set them against each other.  My conversion goal for this test is successfully inducing someone to sign up for the free trial of Bingo Card Creator.

My Usual Registration Form

The Madlibs Registration Form

P.S. If you have good eyes you’ll spot the other A/B test ongoing on this page.  I’m using the traditional way of mitigating cross-test interaction… ignoring the possibility of it.  Don’t tell your college stats professor, but this actually works pretty well in practice.

Results

I ran this test until A/Bingo, my A/B testing framework for Rails, told me that further testing was just a waste of my time.  It didn’t take long at all — 34 hours after the test alternative went live for the site, the first time I checked the results, they were already overwhelming.  Let me copy/paste right off my public results page:

Signup Madlibs Versus Standard
  Standard: 27.55% (winner)
  Madlibs:  21.73%
  Confidence: 95%

By my count that is roughly a 21% decrease in conversion rates for using the madlibs signup style over the standard signup style, and the fact of the decrease (but not the magnitude) is significant at the 95% confidence level.

For the curious: there were 736 participants in this test, split roughly 50/50, as you would expect.  I love the Internet because where else can you get 736 people to help you improve your website while you sleep, work at the day job on Saturday, have an evening out with friends, and then sleep some more?
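If you want to sanity-check a result like this by hand, here is a minimal sketch of a two-proportion z-test in Ruby. This is not A/Bingo's code: the method names are made up for illustration, and the even 368/368 split is an assumption, since all I said above is "roughly 50/50".

```ruby
# Sketch: two-proportion z-test on the published conversion rates.
# ASSUMPTION: 368 participants per arm (the post only says "roughly 50/50").

# Relative change of the challenger's rate versus the baseline, as a fraction.
def relative_change(baseline_rate, challenger_rate)
  (challenger_rate - baseline_rate) / baseline_rate
end

# z-score for the difference between two observed conversion rates,
# using the unpooled standard error.
def z_score(rate_a, n_a, rate_b, n_b)
  standard_error = Math.sqrt(rate_a * (1 - rate_a) / n_a +
                             rate_b * (1 - rate_b) / n_b)
  (rate_a - rate_b) / standard_error
end

standard_rate = 0.2755
madlibs_rate  = 0.2173
n_per_arm     = 368 # hypothetical even split of the 736 participants

change = relative_change(standard_rate, madlibs_rate)
z      = z_score(standard_rate, n_per_arm, madlibs_rate, n_per_arm)

puts format("relative change: %.1f%%", change * 100)
puts format("z-score: %.2f", z)
# A z-score above 1.645 clears the one-tailed 95% threshold.
puts "significant at 95% (one-tailed)" if z > 1.645
```

The real arm sizes will shift the z-score a little, which is one more reason to trust the framework's numbers over back-of-the-envelope ones.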

Anyhow: test ended, not touching the madlibs idea again.  Before adopting this or any other fad (or good suggestion, for that matter): do your own A/B tests.


Women, Men, And Other Things Done Wrong By Silicon Valley

This post is waaaaaaay outside the usual ambit of my blog, as it is at least arguably political and about cultural norms in Silicon Valley.  (I’m a sometimes visitor and spiritual resident, but I’ve never lived there.)  I’ll be back to software blogging on the weekend if all goes well. 

There was a bit of a dustup recently about there not being enough women in Startup Land.  By this, they really mean “the startups we can see in the Valley”, because the Valley thinks that it is the beginning and end of all things tech and startup.  This bit of hubris has a lot more going for it than some other bombastic nonsense I’ve heard over the years, since quite a lot of tech innovation has indeed happened in the Valley.   I live in the heart of Japan’s automobile industry.  We’re justifiably proud of our cars.  That doesn’t mean I think your cars suck.

Strictly speaking the complaint was phrased in terms of “diversity”.  This is the peculiar diversity of the American academy, where a gay Jewish man in New York, an Englishman in London, a 4th generation zainichi kankokujin (ethnically Korean who was born in Japan), and an Irish Catholic dogmatist living in a rice field in Central Japan are so close they are practically brothers.  True diversity, of course, is the 5-member iStockPhoto of attractive twenty-somethings sitting on the college quad who check different boxes on the demographic inventory and think alike in every way that matters.

A serious question for metrics-focused individuals: If demographic diversity is a proxy for diversity of thought, is there some reason we’re not measuring diversity of thought?  Is it hard to measure somehow?  We sound like we’re counting hits because grepping the Apache log is easy and implementing conversion funnel tracking is hard, despite the fact that we know hits are meaningless and we’re really interested in conversions.  (Cards on the table: I think we’re the pathological PHBs who learned only half the lesson and now seek hits as a goal unto themselves.)

I mean, I would be sympathetic to “We can’t build products for women if we don’t have more women in the room” if it weren’t so laughably false.  (Context if you need it: 90% of my customers are ladies.  They’re also older, better educated, less coastal, and more religious than would be anticipated of the customer base of most B2C startups.  I’m pretty much your typical 27 year old male engineer… well, for certain quirky values of “typical”.)  If you wanted to recruit a team with experience building products for women, rather than quickly polling their X chromosome count, you could just ask “What have you made for women?”

That is probably worth doing as the Valley creates a persistent undersupply of products targeting the needs of women.  “Persistent undersupply” is one of those phrases that should be music to your ears if you are a capitalist, because market failures are opportunities to make lots and lots of money.  (For that matter, if you think that there is a vast pool of untapped female talent working for 80% of the price of equivalent male talent… what are you doing hiring men, again?  That would suggest that you could field whole teams of ladies and clean up.  I tend to be skeptical of the “persistent underpricing of female labor” hypothesis in the large, but there’s at least one example which has produced the outcome Econ 101 suggests: you can hire a stay-at-home mom with a graduate degree in Middle America for less than $10 an hour.  If you figure out a way to exploit that, you’ll end up very, very rich.  This is the unsung secret to Demand Media’s success.  If you think I’m wrong on the probable lack of this opportunity for female computer programmers, please go prove me wrong and make billions.)

My beef with the discourse of “diversity” in a nutshell: it screams “give us more women” and whispers “give us more women like us”.  We want more women to be early stage startup employees working for equity and battling code until 2 AM.  We want more women making products to pump VC cash into so that they can be flipped to Google in two years despite having fewer paying customers than your local Girl Scout troop’s worst cookie salesman.  We want more women mentors and women VCs and women industry group organizers so that we can pat ourselves on the back for embracing change while making sure that the Valley stays the way it is.

After quite a bit of time studying diversity in college (yay, liberal arts degree; there was a buy-one-get-one deal the day I majored in CS) I can rattle off all the hypotheses for you: there’s a biological basis, no it’s a cultural issue, no it’s a pipeline issue, no it’s a lack of role models, no it’s a …  and it is probably a witches’ brew of all of the above and more.  But my gut instinct has always been that people avoid joining startups because joining startups sucks.  The question isn’t what are we doing that’s keeping ladies out of the Valley, gentlemen.  The question should be why in God’s name are we still here.

Let’s review:

  • Most startups require you to be at or near the top of the game in a very difficult, competitive field, requiring a college degree (or equivalent education gained in the School of Hard Knocks) in a subject which is widely agreed to be difficult.
  • The no-risk option for anybody capable of doing a startup is to go to their local insurance company and get a job cranking out CRUD apps.  They will immediately be in the upper middle class, have ample opportunities for professional advancement, and leave work each day at about 5 PM.
  • Of course, just because you might possibly be good at programming doesn’t mean you’re limited to doing it.  You could go into a host of fields in engineering or outside of it.  You could have the societal respect of being a doctor, or the material rewards of going into finance, or the work/life balance of teaching, or the rock-solid stability of being a technocrat ensconced in a minor government office somewhere.
  • Your sales pitch as a startup is “Turn your back on all that!  We’ll work you 100 hours a week, pay you nothing while requiring you to live in a freakishly expensive area, give you social status one rung above the homeless, take two to three years of your life, ruin your relationships, and with better than 90% probability subject you to the most crushing defeat of your professional career with no lateral move except into doing the same thing over again.”
  • Your upside, should you make it to the pinnacle of your profession and do everything right, is theoretically unbounded but, practically speaking, what’s left after the VCs get their share will probably work out to a few million for most founders and barely cover the opportunity cost for early non-founder employees.  I don’t mean to say that is totally insane, but it requires that you have the risk-tolerance slider bumped to the maximum.

Unsurprisingly, the conversion rate for the above sales pitch has lagged expectations.

I don’t know which factor makes the Valley most gender-skewed but I’m pretty certain that casting a jaundiced eye at the reality of the situation turns off many intelligent women.  It certainly has to turn off many intelligent men, too.

So here’s a quick action plan to fix some Valley pathologies and make the whole thing a little more palatable:

  1. Make stuff for people who pay money for stuff.  This is a shortcut for getting paid money for stuff.  It also puts fun little infusions of funds between startup, series A round, and flipping to Google… which creates a whole spectrum of successful options for the business other than “achieve flipping to Google”.  (If you want more evangelizing on this subject, I suggest checking out DHH’s speech at Startup School.)
  2. Ditch the Valley.  Recognize that the same factors which make that tiny tip of the Silicon Valley distribution go infinite can make just about anybody, anywhere scale freakishly well: with respect to capital, with respect to time, with respect to team size, with respect to any metric you want to name.  OSS doesn’t stop working because you’re not in the Valley.  You can write an A/B test while sitting in a rice field.  (Trust me.)  SEO can be accomplished from anywhere on the intertubes and still scales worldwide.  There is a rich ecosystem of businesses and APIs which lower costs and barriers to entry for us, many of them built by the blood, sweat, and tears of twenty-something guys in the Valley (thanks for doing the work so I don’t have to, guys).  It is the best time in the history of the world to create a business.
  3. Don’t worry about the gatekeepers.  A lot of the angst about the old boys’ network is that the old boys are perceived as controlling access to funding, which gives them power in the same way that the cartels gain power by rationing access to cocaine.  OK, quick solution: don’t seek funding.  Wham.  With a single stroke you’ve just managed to make the opinions of everyone but your customers utterly irrelevant.  (After you’re profitable, if you really want to, you can go to the Valley.  My guess is they’ll fall over themselves trying to give you money because they have no freaking clue how to create success in a reproducible fashion; cf. the 90+% failure rate.)
  4. Send people home at six.  I’m a poor example of this because I’m a Japanese salaryman (which means I get to say “100 hours a week!?  Slackers!” and then cry into my sake softly), but you really can get an awful lot done in something vaguely resembling a human existence.  (I had originally described this as “a traditional workweek”, but I’ve got no particular love for 40 hours.  It is one arbitrary point you could find success at.  I know some people with businesses at 60 and at least one at five.)  One of the curious cultural pathologies of the Valley (and I suppose if I were a gender feminist I might describe it as “macho”, except I’m a Republican so I’ll just go for “stupid”) is that we treat overwork as a badge of pride and model it as the correct behavior.  This is insanity on a societal scale.  You mean we can scale to millions of visitors, our PCs got a bazillion times faster, and you can download twenty thousand man-years of software created by engineers smarter than you or I will ever be, legally, for free, with explicit encouragement to build a business on it… and the only thing that hasn’t gotten better is the work week?  What.  The.  Heck.  I checked my Ruby standard library: there is no TriedToHaveALifeOutsideOfWorkException.
  5. Take all advice with a grain of salt.  OK, so in deference to my liberal arts degree I’ll hit a few notes from it: institutions tend to try to perpetuate themselves.  Valley culture is a lot like an institution: it has its peculiar jokes and rhythms and closely-held shadow beliefs which owe a bit more to repetition than they owe to empirical reality.  A quick survey of the rest of this post should show you what I think of a few of these memes.  Remember that people, and I’m no exception, have a tendency to privilege the things they know and can think of easily over the things which are foreign to their experience.

If we fix this, it will result in more ladies at the margin seeing startups as an attractive career choice.  It might not change the percentages in the Valley.  Heck, it might even make it more skewed towards the guys.  I don’t profess to know and, honestly, I don’t really care that much either — it is worth doing regardless for the benefits to human welfare.


I Had Downtime Today. Here's What I'm Doing About It.

I screwed up in a major way yesterday evening. This post is part of my attempt to fix it.

This morning I woke up to an email from a paying customer saying that they tried to print cards but couldn’t. Specifically, they said that they were able to use the Print Preview feature, but that using the actual print button, quote, “caused the server to hang.” That can’t actually happen, but it was sufficiently detailed as a bug report to immediately clue me in on what probably happened: the Delayed::Job workers must be down. A quick check of the server (ps -A | grep ruby) showed that this was indeed the case.

I quickly restarted the Delayed::Job workers then logged into the Rails console to check how many jobs had piled up. Six thousand.  Oof.  Most of them were low priority tasks (e.g. pinging the Mixpanel server with stats updates, which I do asynchronously to avoid having a failure there affect my users), but sixty users were affected — their print jobs were delayed.  Print jobs normally take under five seconds to execute and are checked with a bit of AJAX magic which polls the server until the job is ready, which means that most of these users probably got an animated GIF spinner to look at until they got tired and closed the web page.  The worst affected jobs took over twelve hours.

Happily, the downtime hit on a Saturday, which is the lightest day of the week for me.  If this had happened a week ago, right before Valentine’s Day, over 5,000 users would have been affected.

Apologizing To Affected Users

I used the Rails console to create a list of users affected by this, and have sent individual apology emails to the 2 paying customers affected (including attachments for the cards they had tried to print).  I will be contacting the trial users in a more scalable fashion.  Since I don’t have permission to email free trial users (the anti-spam guarantee I give is fairly strict), I dropped the development I had planned for this morning and built a simple messaging system into the site (~20 lines of code — I love you, Rails).  It gives me one-way “drop a message directly to your dashboard” functionality.

For example:

I prefer using this feature to the standard industry responses to outages:

  • “Outage?  What outage?”
  • “Please see our status page, which we’ve conveniently located in electronic Siberia.”
  • “ATTENTION ALL USERS!  0.7% of you were affected by very serious sounding things yesterday!  Please be worried unnecessarily even if you weren’t affected, and swamp our support line, who we will provide no effective tools to tell you whether you’ve been affected or not!”

It allows me to apologize directly to affected users, makes minimal demands on their attention while still almost certainly reaching them, and does not cause any issue for the other 25,000 users.  Plus I can re-use this feature later in the event of needing to contact specific users without needing to email them (one obvious candidate would be plopping something straight on the screens of anonymous guests if I found something they individually needed to know, for example, if one of my automated processes caught that a recent print job of theirs did not come out right).
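To give a feel for how little is involved, here is a rough sketch of the idea in plain Ruby. It is not the code on my site: the class and method names are invented for illustration, and a real version would persist messages in the database rather than in memory.

```ruby
# Sketch of one-way "drop a message on the dashboard" messaging.
# Class and method names are hypothetical; a real app would persist these rows.

DashboardMessage = Struct.new(:user_id, :body, :read)

class MessageBoard
  def initialize
    @messages = []
  end

  # Queue the same message for each affected user.
  def notify(user_ids, body)
    user_ids.each { |id| @messages << DashboardMessage.new(id, body, false) }
  end

  # Messages to show on this user's dashboard; marks them read once shown.
  def unread_for(user_id)
    pending = @messages.select { |m| m.user_id == user_id && !m.read }
    pending.each { |m| m.read = true }
    pending.map(&:body)
  end
end

board = MessageBoard.new
board.notify([17, 42], "Sorry! Your print job was delayed by an outage. It is ready now.")

puts board.unread_for(42)       # the affected user sees the apology once
puts board.unread_for(99).size  # everybody else sees nothing
```

Only the users on the affected list ever see anything, which is the whole point.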

Preventing It From Happening Again

I’m something of a fan of Toyota’s Five Whys methodology for investigating issues like this.  (It has recently been popular with the lean startup crew.  My coworkers at the day job enjoyed some mostly justifiable smirks when I told them that.)

  1. Why couldn’t my users print?   Because the Delayed::Job workers were terminated when I upgraded the production server to Ubuntu Karmic Koala last night.
  2. Why didn’t the post-deploy checklist catch that users couldn’t print?  The post deploy checklist has “manually verify you can print cards” on it. I didn’t follow the post-deploy checklist with sufficient attention to detail because it was late (midnight) and I was tired (because I worked a six day crunch week at the day job… 30 days to go).  Here, I used the Print Preview feature to verify that I could print cards (“Hey, it tests the same code path, right?”), not realizing that while it tests the same code path they have different failure scenarios if e.g. Delayed::Job workers are down.  Fix: Quit day job and, regardless of how tired you are, follow the freaking checklist.
  3. Why weren’t you woken up by the Ride of the Valkyries playing on your cell phone when the site failed?  Don’t we have a system in place to do that? It turns out that the automated diagnostic (an external service pings a URL, the URL runs various tests and throws an HTTP error if any fail, the service mails my cell phone if there is an HTTP error twice in a row) tests nginx, mongrel, the D/B, and core program logic but doesn’t test the Delayed::Job processes or sanity check the job counts.  Fixed.
  4. Why didn’t the ‘god’ process monitor detect the workers were down? God sees every sparrow, but god only knows about the processes you tell it to manage, and my god_config.rb file has the Delayed::Job bits commented out with the notation “#This is buggy.”  I don’t remember why it was buggy and my notes in SVN are similarly unhelpful.  New task: unbuggy it.
  5. Why don’t you have commit notes, comments, or a development journal telling you what you were thinking when you found it was “buggy”? Failure to keep adequate records for “minor” changes and failure to follow up on a bug that was prioritized “Eh, get to that whenever” and then never gotten to.  Fix:  Look into beefing up developer documentation practices.
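The fix for item 3 might look something like the sketch below: the health-check URL also looks at the background job queue and fails if the workers are missing or the backlog looks unhealthy. The thresholds, method names, and the idea of passing in pre-collected numbers are all illustrative assumptions, not the code actually running on the site.

```ruby
# Sketch of a job-queue sanity check for a health-check endpoint.
# All names and thresholds here are hypothetical.

MAX_PENDING_JOBS       = 500 # assumed ceiling for a healthy backlog
MAX_OLDEST_JOB_SECONDS = 300 # assumed: no job should wait over 5 minutes

# Returns a list of human-readable problems; an empty list means healthy.
def job_queue_problems(worker_count:, pending_jobs:, oldest_job_age_seconds:)
  problems = []
  problems << "no job workers running" if worker_count.zero?
  if pending_jobs > MAX_PENDING_JOBS
    problems << "#{pending_jobs} jobs pending (limit #{MAX_PENDING_JOBS})"
  end
  if oldest_job_age_seconds > MAX_OLDEST_JOB_SECONDS
    problems << "oldest job has waited #{oldest_job_age_seconds}s"
  end
  problems
end

# Maps the check onto the HTTP status the external monitor looks at.
def health_check_status(**stats)
  job_queue_problems(**stats).empty? ? 200 : 500
end

# Roughly the situation described above: workers dead, thousands of jobs queued.
outage = { worker_count: 0, pending_jobs: 6000, oldest_job_age_seconds: 12 * 3600 }
puts health_check_status(**outage)
puts job_queue_problems(**outage)
```

In a Rails app the three numbers would come from the jobs table and the process list; keeping the decision logic in a plain method means it can be tested without either.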

In the course of investigating this I discovered the update to Koala also killed Memcached on the server.  (Thankfully, Memcachedb — where I persist long-term user data that for whatever reason isn’t in the database, such as A/B testing participation data — is on another server.)  Unbeknownst to me, my use of memcached fails totally silently: if Rails can’t find the data in the cache it just regenerates it.  That would have had very unpleasant consequences for users if it had continued until Monday, and none of my automated tests would have picked up on it, because they all ignore timing.  I’ve added an explicit check to see if memcached is up and running.  I’ll also look into doing something about monitoring response times.
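An explicit cache check can be as simple as a write followed by a read. The sketch below assumes a cache object that responds to write and read the way Rails cache stores do; FakeCache is a stand-in so the example runs on its own, and the method name is made up.

```ruby
# Sketch of an explicit "is the cache actually up?" check.
# ASSUMPTION: the cache object responds to write(key, value) and read(key),
# as Rails cache stores do. FakeCache is a stand-in for demonstration only.

class FakeCache
  def initialize(up: true)
    @up = up
    @data = {}
  end

  def write(key, value)
    @data[key] = value if @up
    @up
  end

  def read(key)
    @up ? @data[key] : nil
  end
end

# Writes a unique token and reads it back. A cache that is down (or silently
# dropping writes) fails the round trip instead of "just regenerating".
def cache_alive?(cache)
  token = "#{Time.now.to_f}-#{rand(1_000_000)}"
  cache.write("health_check_token", token)
  cache.read("health_check_token") == token
rescue StandardError
  false
end

puts cache_alive?(FakeCache.new(up: true))
puts cache_alive?(FakeCache.new(up: false))
```

The round trip matters because a plain read miss is indistinguishable from a dead server, which is exactly how the failure stayed silent.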

What I Learned From Japanese Engineering

I’m indebted to my day job for teaching me both a) how to do this and b) the absolute necessity of doing it, in spite of my longtime cavalierness with software testing. It was quite a culture shock for me the first time I logged into the test server at work to deploy something and got a rap on the knuckles for not:

  • Having a written explanation of exactly what commands I was going to enter.
  • Having a written checklist describing what tests to perform to ensure the deploy worked, and what the expected results would be.
  • Writing in the wiki that I was doing the deploy for a particular version done to close out a particular bug, so that there would be a trail to follow if the version I was about to deploy failed years from now.

That’s what we do for the test server.

All of the writing, test suites, automated test processes, and monitoring take some time to set up, and much of it generates additional overhead on all your tasks.  However, in the last three years, I’ve come to recognize that it is a net time-savings over writing apology letters and doing emergency incident response, neither of which is ever fun or quick.

Alright, development journal entry over.  Back to new development.


A/Bingo 1.0.0 Official Release

Back in August I released A/Bingo, an MIT-licensed OSS Rails A/B testing framework.  I have been using it continuously on Bingo Card Creator, and judging from the support requests I’ve been getting it has gotten some traction in the Rails world.  The 5,000 or so people seeing A/B tests on my site on Valentine’s Day are almost certainly less than 1% of the beneficiaries of the software now. Yay.

As A/Bingo has grown in popularity, I have begun to get requests for features that I did not need urgently for my own development, as well as the usual support requests, patches, and the like.  I want to make your use of the software as pleasant as possible to further evangelize the cause of A/B testing, so here you go:

New features:

A/Bingo now ships with a default dashboard.  Previously, I assumed that everyone would be writing their own dashboard code, so I just included the absolute minimum to show you what you’d need to do to get data out of A/Bingo.  Many people have remarked that they would really appreciate a “works out of the box” solution.  Your wish is my command: you can now enable a default dashboard in about 30 seconds. It would work totally out of the box, but there are security implications, so I wanted you to have to think for a moment prior to enabling it.

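#Create a new controller.  The name is up to you -- this example uses abingo_dashboard_controller.rb
class AbingoDashboardController < ApplicationController
  #TODO add some authorization
  include Abingo::Controller::Dashboard
end

#Somewhere in routes.rb
map.abingo_dashboard "/abingo/:action/:id", :controller => :abingo_dashboard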

You can customize the dashboard code yourself. Nota bene: it uses your application layout, and has CSS classes applied to most of the elements, so you can style it quickly with CSS if you desire to. By default, it probably looks terrible. If you want to send me a patch to make it pretty, be my guest.

Experiments can now be stopped: Using either the built-in links on the above controller or, if you prefer programmatically scripting things, experiment.end_experiment!(alternative_content), you can now stop an experiment without touching the code.  Stopping an experiment causes all users to get the specified alternative rather than what they would have gotten randomly.  It also ceases stats collection.  Stopping an experiment is irreversible (currently — that might change later).  I tried to make this feature not affect the performance of A/Bingo for larger sites — it makes each test require one extra cache access.  (*cough* Rounding error, hopefully.)

A/Bingo internals are now fairly thoroughly tested: Unit tests are not exactly my cup of tea (“Argh, it works in production, what else do you want from me?!”), but Rails developers look askance at software that does not include them.  So I knuckled down and wrote a test suite.  (Hat tip to Nathaniel Talbott for mentioning A/Bingo in a conference presentation.  The constructive criticism regarding testing drove this change.)

I have not written thorough integration tests for the syntax sugar that you get via the included helper methods, but I’ll fix that eventually.

Named conversions: Previously, all A/Bingo tests required one line to add the test and one line somewhere else to track conversions.  Typically, since businesses have very many tests and fairly few conversion events, this resulted in code like:

#A controller method
def purchase
#Business logic goes here.
  bingo!("new_button_test")
  bingo!("email_copy_test_january")
  bingo!("microcopy_test")
  bingo!("button_colors")
  bingo!("login_button_alignment")
end

That isn’t very DRY at all.

Now, A/Bingo will take an optional parameter :conversion (or :conversion_name) when you’re defining a test, telling it to listen to a particular named conversion. This way, you can reuse the same conversion for as many tests as you want, decreasing the lines of code needed to create most new tests from two to one.

def some_method_with_a_test
  @alternative = ab_test("some_test_name", %w{altA altB}, :conversion => "purchase")
end

def some_other_method_with_a_test
  @foo = ab_test("bar_test", %w{coke water}, :conversion => "purchase")
end

def purchase
  #Business logic goes here!
  bingo!("purchase")  #Calls conversions for both of the above tests.
end

A/Bingo handles tests with spaces in them more gracefully: Although I still don’t recommend doing it, A/Bingo has been improving its handling of test names which have a space in them.  (The reason I don’t recommend it is because some cache stores — particularly memcached — do not support this well.)

Official support for Redis: Assaf Arkin picked Redis for his awesome Vanity project (which also does A/B testing for Rails, among other things), which inspired me to take a look into it.  It appears to be a much, much better alternative for a key/value store than Memcachedb, which is what I use for persistence.  A/Bingo has always accepted any cache store that Rails does, but I want to make it explicitly clear that I run tests against Redis, Memcached, and MemcacheDB. Just add the following to your environment:

#Goes in environment.rb
config.gem  'ezmobius-redis-rb',
  :source => 'http://gems.github.com',
  :lib => false

config.gem  'jodosha-redis-store',
  :source => 'http://gems.github.com',
  :lib => 'redis-store'

#Goes in whatever environment you're using:
require 'redis-store'
Abingo.cache = ActiveSupport::Cache::RedisStore.new

I intend to migrate my own deployment to Redis when it becomes reasonably convenient to do so.

Versioning: Previously I’ve just released patches to the A/Bingo git repository when I got done coding them, but I feel that is suboptimal now that there are substantial deployments which I could potentially break with changes.  So, here’s the skinny: A/Bingo is now, as of this blog post, 1.0.0.  I’ll communicate breaking changes by bumping that number up.  If it goes up by a tenth or more, expect that you need to re-run the migrations and that you will probably lose data on any tests in-progress, so plan ahead for that.  Version increases in that last number should be safe to apply directly.

I do not anticipate breaking the published A/Bingo API (i.e. methods mentioned in the docs) until at least v2.0.0, if ever, so upgrading A/Bingo should almost never cause you to need to update your own code.
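If you script your upgrades, the rule above is easy to encode. The sketch below uses RubyGems' version parser; the method name and the returned symbols are hypothetical and are not part of A/Bingo.

```ruby
# Sketch of the upgrade rule described above, using RubyGems' version parser.
# The method name and return symbols are hypothetical.

require "rubygems"

# :rerun_migrations when the first or second number increases,
# :safe when only the last number increases,
# :not_an_upgrade when the target is not newer than the current version.
def upgrade_impact(from, to)
  old_version = Gem::Version.new(from)
  new_version = Gem::Version.new(to)
  return :not_an_upgrade unless new_version > old_version

  old_major, old_minor = old_version.segments
  new_major, new_minor = new_version.segments
  # to_i treats a missing second number (e.g. "2") as zero.
  if [new_major, new_minor.to_i] == [old_major, old_minor.to_i]
    :safe
  else
    :rerun_migrations
  end
end

puts upgrade_impact("1.0.0", "1.0.1")
puts upgrade_impact("1.0.0", "1.1.0")
```

Anything reported as :rerun_migrations is a cue to finish in-progress tests before upgrading.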

How To Contribute

I would like to thank everyone who has submitted bug reports and patches. As usual, I’m always happy to get bug reports or feature requests. If you’d like to contribute code, make it available via git anywhere you please, and then send me an email telling me about it.

How Do I…

If the question isn’t answered in the (copious) documentation, feel free to ask me over email. If your business has particular needs for A/Bingo or you just want to talk A/B testing strategy with somebody who breathes it, I’m available for consulting engagements starting April 1st.

You Should Be Doing A/B Testing

I really can’t stress this enough: A/B testing is an easy, reproducible process that you can use to improve your marketing, website copy, product, user experience, etc. If you haven’t started yet, take A/Bingo, Vanity, or your other framework of choice for a spin. It won’t take you five minutes until you’re getting actionable data which you can use to make money.


Using CrazyEgg on Pages Requiring A Login

Long-time readers of this blog know I’m absolutely goo-goo for CrazyEgg, principally because they keep making me money.  They’re seriously my favorite $19 to pay every month, even when I don’t actually use them, because some day I know I’ll get the itch again and then bam actionable insights into what my customers are doing. Today is an itchy day.

One thing I have never gotten around to is tracking how users click on pages in my web application, behind the login screen.  CrazyEgg can do this but you need a bit of magic to let their screenshot bot grab samples of the page being interacted with — otherwise, they won’t be able to match up the events their Javascript recorded with the form fields on, um, your login screen.

CrazyEgg’s current user-agent string is:

Mozilla/5.0 (Windows; U; Windows NT 5.2; en-US; rv:1.9.1.4) Gecko/20091016 (CrazyEgg 2 screenshot agent) Firefox/3.5.4

Then just short-circuit your login procedure for people with that user agent.  (This may be undesirable if you are very, very security conscious.  I sell bingo cards to 60 year old women.  If you are security conscious, email CrazyEgg and they’ll provide a listing of IPs to whitelist.)

In Rails, doing something on the basis of the user agent is easy but, and I know this might come as a surprise, not covered in the documentation.

def authenticate_user
  if (request.env["HTTP_USER_AGENT"] =~ /CrazyEgg/)
    #Whatever you need to do to let them in as CrazyEgg
  else
    #Actual logic goes here
  end
end

For a quick 30 second solution, I signed up as a trial user for my own service with the email address crazyegg@bingocardcreator.com, and have the analogue to the above method on my site just pretend that anybody with the appropriate User Agent is authenticated as that user.
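If you would rather test the matching logic away from a running Rails app, it can be pulled out into a plain method. This is only a sketch: the method and constant names are hypothetical, and the second user-agent string is simply the one above with the CrazyEgg marker removed.

```ruby
# Sketch: the user-agent test pulled out into a plain method so it can be
# exercised without a running Rails app. Names here are hypothetical.

CRAZYEGG_AGENT_PATTERN = /CrazyEgg/

def crazyegg_screenshot_agent?(user_agent)
  # to_s guards against a missing header (nil).
  CRAZYEGG_AGENT_PATTERN.match?(user_agent.to_s)
end

crazyegg_ua = "Mozilla/5.0 (Windows; U; Windows NT 5.2; en-US; rv:1.9.1.4) " \
              "Gecko/20091016 (CrazyEgg 2 screenshot agent) Firefox/3.5.4"
ordinary_ua = "Mozilla/5.0 (Windows; U; Windows NT 5.2; en-US; rv:1.9.1.4) " \
              "Gecko/20091016 Firefox/3.5.4"

puts crazyegg_screenshot_agent?(crazyegg_ua)
puts crazyegg_screenshot_agent?(ordinary_ua)
puts crazyegg_screenshot_agent?(nil)
```

Anybody can send this header, so the account it unlocks should hold nothing you would mind a stranger seeing.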


Dashboard Design For Metrics-savvy Software Companies

I have a confession to make: I’m something of a metrics junkie.  I have lost entire days of my life just staring at Google Analytics reports.  Metrics have always activated that same part of my brain that WoW did: ooh, a page view, ooh, a sale, ooh, if this had purple bars on it I’d pay $15 a month.  So I would flip from email to Analytics to e-junkie (the extremely appropriately named payment processor I use) to Analytics to… and end up accomplishing nothing of real importance.

Because although I’m a metrics junkie, I’m a smart metrics junkie.  And any smart metrics junkie can tell you that if your metrics aren’t giving you actionable insights to make decisions that matter to your business, well, you might as well go play WoW for all the good you’re doing.

That’s why, in one of my periodic bits of investing in the business, I built myself a dashboard.

Goals of the Dashboard

A dashboard is, simply, an easy to digest one-glance view on how your business is doing.  Mine is implemented in Rails and it is the page that greets me if I visit my site (password protected, naturally).  The purpose is threefold:

  1. Minimize the time it takes me to do common repetitive tasks.
  2. Show me information about my business at a glance, so that I don’t feel the need to log into my various other sources of info.
  3. Arrange for easy access to drilling down into important things for me.

Like all of my other software, it is a work in progress.  (Incidentally, some folks have offered to buy it when I have mentioned it previously.  I can’t sell it, since it is tied very tightly to my business needs and the data I have available.  At all of ~200 lines of code, though, you can knock one out for yourself in an afternoon.)

Here’s a screengrab from it:

Some comments on what this shows:

Search box (“The Omnibox”): The Omnibox is my Swiss Army Knife support tool.  Given absolutely anything I know about a customer — from her name to email address to transaction number to a (hopefully unique) phrase she has used in her bingo card — it goes off and fetches her customer record.  This is a real timesaver because many of my customers don’t remember what email address they purchased the software under, told Google to obscure their email address (a misfeature in Checkout, if you ask me), purchased with their husband’s credit card, etc etc.  The Omnibox saves me from having to actually do work to find customer records about 80% of the time.  Search results are the same as…

Customer Entries (Latest First): The vagaries of the bingo business mean that I’m overwhelmingly more likely to get a support incident from you in your first 24 hours of use than at any other time.  Accordingly, as soon as I open the dashboard, the last 10 customers pop straight up so I don’t even have to search for them.  (This also serves as a quick visual health check and lets me see if a customer’s transaction didn’t go through.)  This illustrates a core principle of dashboards: do less work, get more done.  Ten keystrokes saved doesn’t sound like a lot until you’ve done it 200 times.
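The Omnibox lookup described above can be sketched roughly like this. It is an in-memory illustration only: the `Customer` fields, the sample records, and the substring-matching rule are assumptions, and a real version would presumably query the database rather than scan an array.

```ruby
Customer = Struct.new(:name, :email, :transaction_id, :card_words)

# Case-insensitive substring match across every field we know about.
def omnibox_search(customers, query)
  needle = query.to_s.strip.downcase
  return [] if needle.empty?

  customers.select do |c|
    haystack = [c.name, c.email, c.transaction_id, *c.card_words]
    haystack.compact.any? { |field| field.to_s.downcase.include?(needle) }
  end
end

# Hypothetical sample data.
CUSTOMERS = [
  Customer.new("Jane Doe", "jane@example.com", "TX-1001", ["photosynthesis", "chlorophyll"]),
  Customer.new("Mary Roe", "mary@example.com", "TX-1002", ["pumpkin", "ghost"]),
]

p omnibox_search(CUSTOMERS, "chlorophyll").map(&:name)
```

Searching every field with one box is the point: the person doing support does not have to know in advance whether the scrap of information in hand is a name, an email, or a phrase from a card.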

Customer Support Options:  You’ll note some customers have their names in green.  This means they are using the online version of Bingo Card Creator, rather than (or in addition to) the desktop version.  Previously I just made that green to satisfy my curiosity (it gives instant, accurate visual feedback that 70% of my sales come from the online version), but as I got a feel for customer support needs I decorated their records with a pair of hyperlinks.

“Ghost me”: This is the same link you get if you try the Forgot Password function in the online app — one click logs you in.  I right-click it, open it in a Chrome incognito window (to avoid overwriting my cookie), and suddenly I’m you.  This lets me see exactly what a customer is seeing, so that I can diagnose problems more easily, or in a pinch just do what they need done.

“Email password”: The same as the password recovery functionality on my site — mails them a link to let them in so that they can change their password.

Sales Counter: This is only on the page because otherwise I’d a) log into e-junkie to check and then b) start trying to guesstimate how much money I was going to make this month.  Both are bad habits.  Putting it front and center decreased my logins to e-junkie from several times per day to once in a blue moon (only when I need to speak with them, which since they run a very tight ship is “Almost never”).

Edit Bingo Cards: Takes me into the original core of the Bingo Card Creator site: a CMS which makes, you guessed it, bingo cards.  This is where I approve work submitted by freelancers, make minor content edits to it, or create new bingo activities.  Again, instant access on the dashboard saves time.  (There are also a few affordances there, like AJAX approval so that all I have to do is mouseover a new card and click OK to approve it.  Do less work, get more done.)

Downloads Per Month: This shows a graph of how many PDFs have been downloaded from my site, on a monthly basis.  It is a quick one glance indicator of how effective my SEO is, from back when I didn’t have the user stats to look at.  I could probably demote it from the dashboard these days.  This information is public, by the way.

All-time Sales: Graph of sales by month.  I mostly use this to check on market seasonality, year over year increase (70%, ho), and whether I’m on pace to make my revenue goals.  This is also public.

A/B Test Results: If you’ve been around this blog much you probably know that I obsessively test and measure.  When I have a particular A/B test on the front burner of my mind it gets promoted to the top of the dashboard.  At the moment nothing I have running is super-critical so they’ve been placed a click away.  You can see some of my A/B test results here — I’m a big believer in code reuse so a portion of my test results are also used for the documentation for my A/B testing library.

E-junkie: Convenience link to my payment processor, which I used to log into a lot.  This should be demoted from the dashboard.

uISV email: The Google Apps for Domains email account I use.  (uISV stands for “Micro-independent software vendor” — i.e. like “small software company” with a more pretentious acronym.)

Logins Today / This Week: This is my “one glance health check” for the business.  I have a rough idea of what these numbers should be.  If they go down far below that, I have broken the login button and need to fix it immediately.  If they go high above that, it must be Halloween.  If they stay flat I might have broken the login button close to Halloween.

Vanity Stats: I keep these around just to satisfy my primal WoW player urges.  They have no particular relevance to my business, but they’re fun to quote to people.

You might think “conversion rate” is a really important number.  See, conversion rate for a channel or a creative or a landing page is a very important number.  Conversion rate for all of your visitors, on the other hand, is very, very sensitive to traffic mix.  Since customers arriving by AdWords always outconvert customers arriving by organic search in aggregate (which is an artifact of my SEO strategy and not to be worried about), a gyration in conversion rate is generally caused more by a change in prospect mix than a strong reaction to a change in my product.
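To make the traffic-mix point concrete, here is a small worked example. The per-channel rates and visitor counts are invented for illustration; the only thing taken from the paragraph above is that AdWords visitors outconvert organic ones. Neither channel's conversion rate changes between the two weeks, yet the blended number moves a lot.

```ruby
# Hypothetical per-channel conversion rates, held constant across both weeks.
RATES = { adwords: 0.04, organic: 0.01 }

# Blended conversion rate for a given number of visitors per channel.
def blended_rate(visitors)
  conversions = visitors.sum { |channel, count| count * RATES.fetch(channel) }
  conversions / visitors.values.sum.to_f
end

WEEK_1 = { adwords: 1_000, organic: 9_000 }
WEEK_2 = { adwords: 3_000, organic: 7_000 }

puts format("week 1: %.2f%%", blended_rate(WEEK_1) * 100) # week 1: 1.30%
puts format("week 2: %.2f%%", blended_rate(WEEK_2) * 100) # week 2: 1.90%
```

The aggregate rate rises from 1.30% to 1.90% purely because a larger share of visitors came from the better-converting channel, which is why the per-channel numbers are the ones worth watching.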

Things Not Pictured

This dashboard can potentially display a few other types of information.  Thankfully, I don’t have an example to show you today.

Exceptions in payment processing: An exception happening in most places in my code is probably a misbehaving spider or a bug.  An exception happening in the callback for successful payments is a bug.  A very critical bug, because it often means that a customer loses access to the software they paid for.  If that happens, the computer sends me an email, my cell phone gets rung, and this whole page gets hidden under a stacktrace written in blaze orange until I hit the “I dealt with it” button.  (Obviously, this probably won’t happen if the whole site goes down — mon.itor.us watches that for me.)

Regular exceptions get written to my log files.  That is, candidly speaking, where data goes to die.

Bug of The Day: I have a confession to make: I’m not much of a test driven developer.  It is really, sincerely difficult to anticipate every corner case which could happen with customers trying all possible combinations of words and bingo card options.  This causes a huge portion of the user-visible bugs in BCC, and because the symptom is typically “It looks… ugly?”  (for a value of “ugly” not known at runtime) it is sort of hard to capture ahead of time in the predominant Rails testing frameworks.

That’s why I have a wee little daemon who periodically sanity checks all the print jobs people have run recently, looking for anything I’ve identified as the symptom of the Bug of the Day.  Currently, the Bug of the Day is that under certain circumstances the combination of a very long title and a bingo card with four to six short words in a certain box can cause the box to render across multiple pages, which is not desirable.  This is algorithmically checkable, but only in retrospect.

If the daemon detects that a particular print job caused one of the Bugs of the Day to pop up, it displays that fact in red, along with a link which copies the word list to my personal account so that I can reproduce the behavior and start trying to squash it.

This system also lets me sanity check upgrades to the site.  If I cause a regression in behavior, (squashed) Bugs of the Day will typically resurface, and my dashboard lights up like a Christmas tree.
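A rough sketch of what such a retrospective check could look like. Everything here is hypothetical: the `PrintJob` fields, the character thresholds, and the exact rule are stand-ins for the real symptom (a very long title plus a box holding four to six short words), which is only described loosely above.

```ruby
PrintJob = Struct.new(:id, :title, :cells)

LONG_TITLE_CHARS = 60 # hypothetical threshold for a "very long title"
SHORT_WORD_CHARS = 4  # hypothetical threshold for a "short word"

# Flags jobs that match the symptom of the current Bug of the Day.
def box_spill_suspect?(job)
  return false unless job.title.length > LONG_TITLE_CHARS

  job.cells.any? do |cell|
    words = cell.split
    words.size.between?(4, 6) && words.all? { |w| w.length <= SHORT_WORD_CHARS }
  end
end

# Each Bug of the Day is a name plus a check; squashed bugs stay in the list.
BUGS_OF_THE_DAY = {
  "box renders across multiple pages" => ->(job) { box_spill_suspect?(job) },
}

# Returns one entry per (job, bug) hit for the dashboard to show in red.
def scan_print_jobs(jobs)
  jobs.flat_map do |job|
    BUGS_OF_THE_DAY.select { |_name, check| check.call(job) }
                   .keys
                   .map { |name| { job_id: job.id, bug: name } }
  end
end

LONG_TITLE = "Mrs. Henderson's Third Grade Class End of Year Vocabulary Review Bingo"
RECENT_JOBS = [
  PrintJob.new(1, "Halloween Bingo", ["bat", "a cat in a hat"]),
  PrintJob.new(2, LONG_TITLE, ["to be or not to", "photosynthesis"]),
  PrintJob.new(3, LONG_TITLE, ["photosynthesis", "chlorophyll"]),
]

p scan_print_jobs(RECENT_JOBS)
```

Keeping the checks for already-squashed bugs in the list is what turns this into a regression alarm: if an upgrade reintroduces an old symptom, the same scan catches it.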

What’s On Your Dashboard?

I love hearing implementation details from other businesses, and this is about as wonky as it gets.  What do you put on your dashboard?  Do you have any suggestions for things I should consider adding to mine?


What My User Survey Taught Me

Some weeks ago I mentioned that I was implementing a user survey in Bingo Card Creator, using Wufoo.  About forty of my customers have now taken the time to give me detailed advice.  I thought I’d share some things I learned.  A few takeaways may be applicable to your business, and at the very least “detailed, actionable advice can be yours if you just ask for it” should convince you to start a survey this week.

Incentivize Any Surveys You Do

I ran two A/B tests on the survey itself.  The first selected whether a user was asked to take it at all.  The second tested what prompt was more effective at inducing people to take it.  Half of participants were invited to take it for altruistic reasons (“Would you like to take a survey about Bingo Card Creator?  We’d like your feedback so that we can improve the website and software for all of our customers.”)  The other half were incentivized (“Would you like to take a survey about Bingo Card Creator?  If you do, we’ll let you print 5 extra free bingo cards.”)

I gave all users who completed the survey the free cards, regardless of whether they had been promised it or not.

I have never had a more lopsided A/B test.  The response rate of incentivized users so resoundingly crushed the response rate of non-incentivized users right out of the gate that I was scared of a bug.  It ended up being more than a 2X difference in conversion: 2.51% vs 1.17%, which is significant at 95% confidence.
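For readers who want to check a result like this themselves, here is a sketch of a two-proportion z-test. The counts are hypothetical: the post gives only the two rates and roughly forty total responses, so 27 of 1,076 and 13 of 1,111 were picked to reproduce 2.51% and 1.17%. The pooled z-test is one common way to test this kind of difference, not necessarily the calculation used for the figure quoted above.

```ruby
# Two-proportion z-test using a pooled standard error.
def two_proportion_z(successes_a, trials_a, successes_b, trials_b)
  rate_a = successes_a.to_f / trials_a
  rate_b = successes_b.to_f / trials_b
  pooled = (successes_a + successes_b).to_f / (trials_a + trials_b)
  std_err = Math.sqrt(pooled * (1 - pooled) * (1.0 / trials_a + 1.0 / trials_b))
  (rate_a - rate_b) / std_err
end

Z_CRITICAL_95 = 1.96 # two-sided critical value for 95% confidence

# Hypothetical counts: incentivized arm first, altruistic arm second.
z = two_proportion_z(27, 1_076, 13, 1_111)
puts format("z = %.2f, significant at 95%%: %s", z, z.abs > Z_CRITICAL_95)
```

With these assumed counts the statistic clears the 1.96 threshold, which is consistent with the claim of significance at 95% confidence; with much smaller samples the same two rates would not.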

Thus the conclusion: if you want responses, give away free stuff.  (Incidentally, you might think that you’d get lower quality feedback from incentivized users.  I did not get that impression from reading the results, but I can’t reduce that to a simple statistical measure.)

Surveys Don’t Cost You Sales

A major worry I had was that putting the survey prominently on the trial version of my application would cost conversions.  Nope — there was no significant difference in sales given to folks asked to take the survey versus folks not asked to take the survey.  (I actually got very, very marginally more sales from the folks asked — but not statistically significant.)

Opinions Confirmed About My Customers

Prior to instituting the survey I had guessed that my customers were mostly female, older (I was guessing normally distributed around 40), and that they were probably teachers.

  1. I was dead on the money on gender.  85% of participants were ladies — this roughly comports with my experience looking at customer names and doing customer support.
  2. My customers are, indeed, mostly older: fully half of them are over 40.  Another 30% are in their 30s.  About 10% are in their 20s and a little less than 10% are below that.
  3. I now have data to substantiate that most of my users are teachers: 30% of respondents use BCC in elementary schools and another 20% use it in high schools (that number is much higher than I would have expected).  I was mildly surprised with the number of people playing with adult family members (~15%).  All other uses are fairly marginal.  Confirming that most of my users are teachers is going to be very helpful in crafting my marketing messages.  (Though I suppose I have to worry about stumbling into the local optimum: I might be seeing this because I market to teachers well and market to, e.g., parents poorly.)
  4. Two survey takers, as predicted, used the survey free response box to ask customer support questions.  One of them I was able to de-anonymize, and I fixed the issue she had (she had created a new trial account rather than using the old registered account and wondered why it was asking her to purchase).  The other one, sadly, was using an anonymous guest account from an IP I’ve never seen before, so I couldn’t track them to an email address to resolve their issue with the purchase.

Things I Didn’t Know Prior to Asking

  1. A lot of my users are very, very appreciative of Bingo Card Creator.  I mean, I knew it saved them some time and that lots liked it, but I got stories of it saving a lesson plan, brightening the day of a room full of seniors, and teaching a son to read.  A few of the results warrant follow up emails to ask if I can quote them publicly.  (Like, right next to the Buy Now button.)
  2. I am terrible at Quality Assurance.  Well, I knew that.  But, specifically, a particular combination of A/B tests would result in a user getting textual instructions on one screen which conflicted with what was written on the buttons on the next screen.  Thank you for the report, ma’am — bug squashed.
  3. A few customers reported generalized anxiety about using “new programs” and said they wanted more handholding in the instructions.  I beefed up instructions throughout the application.
  4. Surprisingly few of my customers reported problems with ease of use — over 90% rated BCC either “Very easy to use” or “Mostly easy to use”.  I also got a lot of free-form comments praising the daylights out of that — many of my customers compared it favorably to other unpleasantness they had had doing routine computer tasks.
  5. Surprisingly many of my customers self-evaluate as comfortable with computers.  50% were “very comfortable”, and 30% were “mostly comfortable”.  These numbers are, candidly speaking, not what I would have assigned on the basis of reading support requests for three years.  It is possible both the survey and I are right, just looking at different segments of reality: 80% of the customers are good with computers and fairly rarely email me, and 80% of my inbox is caused by the remaining 20%.
  6. Customers respond very strongly to features I consider so core to the program I scarcely mention them — in particular, I learned twelve different ways to express the thought “Every card is different!” and another eight for “I love that I can customize the word list for the lessons I am teaching that week.”  I will be incorporating additional variations of those into my copy.  Repetition never hurt any teacher.

The Number One Complaint My Users Have

Bingo Card Creator isn’t free.  Some variations on the theme:

Make it totally free by having advertisers. Many, many sites that elementary teachers use are kept free by the magic of Google AdWords.  I should know — I’m the guy paying for the ads.  While you can keep the free bubble going by passing around VC dollars to Internet firms to Google to Internet firms to Google to Internet firms to Google to … for a while, eventually, if you’re not getting money from customers, everyone dies.  This was, essentially, the last Internet bubble.  I will continue charging because charging keeps the rest of the ecosystem alive.

Relatedly, the fact that Bingo Card Creator is so easy to use is directly related to the fact that I charge money for it.  I obsess over getting customers (or trial users) through the pages to their beautiful bingo cards, because people who see beautiful bingo cards are very inclined to pay me money.  The sites that I advertise on do not optimize for their user experience because if their website is better than my textual ad, they don’t get paid.  That is why the experience of using them sucks.  (I won’t out anybody who I essentially have a business relationship with, but take a look at the free options in my market some time, or look at pages which compete with mine in the search results.)

Charging also subsidizes the experience of the 97.3% of trial users who don’t actually pay me money.

“15 card limit does not allow for a class set in which everyone can have a different card 25 would be more appropriate for teachers.”  I regularly get asked what my business model is — i.e. how I convince people to pay money for my software when I give so much away for free.  This lady nailed it in one sentence.

Incidentally, I don’t feel any rancor at folks who believe that everything should be free on the Internet.  I just will not accommodate your preferences.  You’re welcome to use my free competitors if they better fit your needs.  (I actually provide folks with lists, on request.)

Takeaway Lesson

What are you waiting for?  Go sign up for Wufoo (or whatever — they have a few competitors) and do a survey.  You’ll learn stuff that you can use to make helpful decisions for your business.


Four Open Letters To The Book Industry

Dear Publishers:

Hiya.  You don’t know me, but I’m a pretty good customer of yours.  I buy several thousand dollars of books a year, in almost every genre you sell: fiction, non-fiction, fantasy, sci-fi, classics, mysteries, you name it.  I have bought everything from Gladwell to the most obscure author in your backlists and back again.  I may well be the only heterosexual male in the entire world who spends more on urban fantasy alone than he does on video games, movies, and newspapers combined.  I really love books.

For the last decade or so, I’ve bought many of my books through Amazon.  Amazon knows me.  They know what I like, they send me recommendations via email, and on those rare instances when I have a problem with a book they fix it for me within 24 hours.  I really like Amazon.

That is a new experience for me in a book vendor.  Typically, bookstores are anonymous entities who happen to be in the same airport, mall, or street as I am when the craving strikes.  I have absolutely no loyalty to them.  (I hear that before my time there were neighborhood book stores where you’d go in to get a recommendation from someone who knew your tastes intimately.  I’ve never been in a neighborhood book store.)

Some months ago, I bought a Kindle.  Publishers, if you thought I was a good customer before, you should see me now.  I don’t even have to find an anonymous dealer to get my fix — I just punch a button and bam, new book.  And punch that button I do — about four times as frequently as I did previously.  Amazon now sells me over 90% of the books I buy.

Recently, it has come to my attention that some of you are having a bit of a spat with Amazon, centering on release schedules, pricing issues, and, above all, control.  This sent me walking over to my book shelf to check whether those of you who are having the spat with Amazon actually publish authors I read.  The fact that I didn’t know this off the top of my head, and that this is the first time I’ve thought about individual publishing companies in my entire life, should be a preview of coming attractions for you as regards which company I am backing in this fight.

Let me be perfectly clear: I have no price sensitivity with regard to books.  I read the books I want to read when I want to read them.  I have never bought or avoided buying a book based on whether it was hardcover or trade paperback. (Incidentally, since we’re all businessmen here, let’s be honest: you want to extract as much money out of me as possible because I am price insensitive, and staggering hardcover and paperback release dates is just a way to accomplish that.  Neither of us really cares about the physical format in the slightest.)

It does not matter to me what you charge for the books on my Kindle.  However, I’m hearing things about you windowing Kindle releases — i.e. delaying them so that you can protect your hardcover sales.  You think my likely behavior is to go to the bookstore where no one knows my name and pay extra so that I can have the hardcover on release day.  Words cannot express how mistaken you are.

I will read books on my Kindle.  Whether they are your books or the books of your competitors matters to me not one whit.

Dear Authors:

We’re quite the odd ducks, aren’t we.  You don’t know my name either, despite the fact that we’re on surprisingly intimate terms.  I spend most of my leisure time rattling around in worlds of your creation.

I understand you feel a bit of connection to the people who liberated you from the slush pile, sign your royalty checks, and respond to your emails.  You know their names, after all.

I want to read your books as fast as you can write them, on my Kindle.  If you support me in this, I will stick with you like the plucky heroine to the aloof and semi-abusive vampire lord who turned her.  (P.S. urban fantasy authors: stake his worthless carcass.  Signed, beta males everywhere.)

If, on the other hand, you should support your publishers’ interests over my desire to pay you money, mark me on this: I will buy books from authors willing to sell them to me.  I might get a little depressed over not being able to read my favorites, but if you haven’t noticed, I read a lot faster than you can possibly write and that makes me promiscuous by nature.  Any regret I feel over losing you will be quickly assuaged by epic heroism, vile betrayal, true love, and other themes of investing advice books.

Dear Book Stores:

Good luck with that coffee thing.

Dear Amazon:

Keep being awesome.

Regards,

Patrick McKenzie

(A man of no particular importance, who bought more books in 2009 than 20 average American households.)


Followup Questions for "Strategic SEO for Startups"

Peter Christensen had a few questions for me regarding my last blog post about SEO for startups.  I thought the questions were interesting enough to require a bit more than a comment on his post, so I’m going to answer them in detail here.  The details are very, very specific to my particular business — if you want a high-level strategic overview, I suggest reading that post instead.

In the past you’ve talked about outsourcing your content creation to your “army of freelancers”.  What did that consist of on your end?  My guess is you looked at terms and topics people were searching for (you mentioned “baby shower bingo” once) and then sent a job to your freelancers to come up with 80 or so baby shower words that you feed into your card generator and sample bingo card landing pages.

Periodically, when I have an idea for a new project, I put out a call for freelancers on my blog similar to this (for blog writing on my “sprawling bingo empire”) or this (for creating bingo cards).  (Incidentally, “army” is an overstatement: I think in my business career I’ve used a bit fewer than a dozen, but don’t have my expenses report in front of me.  One woman in particular is easily 80%+ of that.  Why mess with something that works?)

The work-flow for those two projects is a bit different, but in general I write up the general outline of what I expect (you can see the most important bits in those posts) and then let my freelancers run with them.  For bingo cards I typically give them discretion to choose their own topics (although I let them see my stats for what previous cards were popular — for example, sorted by genre or popular this week).  For the blog creation project I came up with a list of 14 mini-sites via a one-off SQL query.

The deliverable for bingo cards has changed over the years as I’ve upgraded my CMS.  Currently, there is a back-end one page web form on my site which asks for a title for the card, a subtitle, a brief sentence of description, and then a word list.  Anything submitted there goes into my database and awaits my review, which given that my freelancer is very good at what she does is typically “Oh, good, here’s 30 lists for this month.  Approve All.  Goes off to bank site to mail check.”  Within a few seconds of me hitting approve, the CMS backing my site turns the word list into a PDF file, grabs a screenshot of it, and does a bit of content page generation.

The deliverable for the mini-sites is just pages made in WordPress, extolling the virtues of Valentine’s Day bingo or what have you.

How do you analyze and rank your SEO strategies?  I see your sample card landing pages have an id that they pass to the registration page so you know how the different landing pages are converting.  What other methods do you use to determine which SEO methods are most valuable to you?

The flippant answer is that if I make more money than I expect to then I guess everything is working.  Seriously speaking, though, I do very little backwards facing analysis (“Did that work?”) and concentrate mostly on forward facing analysis (“What opportunities can I exploit now?”), with the exception of when I’m writing a blog post to comment on how something worked.

One of the reasons I’ve cooled on Google Analytics over the years is it doesn’t really lend itself to providing data which lets you make actionable decisions in a reasonable amount of time.  For example, if I look at my stats, I can tell you with arbitrary precision how much more popular baby shower bingo cards are than football bingo cards.  Whee.  That doesn’t tell me anything I can do to improve my business today.  Most of the things which can tell me stuff that will improve my business are the domain-specific analytics functions I’ve created (like the above) or fun little one-off explorations of my database that I do from the Rails console.

For example, I might play around one day and see what the most common 25 words are for customers making bingo cards.  (That was what clued me into baby shower bingo.)  That usually identifies a weak spot in my pre-made card lineup, which I can either tell a freelancer about or just fill myself.
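That kind of one-off exploration amounts to a frequency count. The sketch below works on plain arrays so it runs anywhere; in a Rails console the word lists would come out of the database instead. The method name and sample data are assumptions for illustration.

```ruby
# Returns the `limit` most frequent words across all word lists,
# as [word, count] pairs, ties broken alphabetically.
def most_common_words(word_lists, limit = 25)
  counts = Hash.new(0)
  word_lists.each do |list|
    list.each { |word| counts[word.strip.downcase] += 1 }
  end
  counts.sort_by { |word, count| [-count, word] }.first(limit)
end

# Hypothetical customer word lists.
WORD_LISTS = [
  ["Diaper", "bottle", "rattle"],
  ["diaper", "Bottle", "crib"],
  ["diaper", "pumpkin"],
]

p most_common_words(WORD_LISTS, 2) # [["diaper", 3], ["bottle", 2]]
```

A cluster of baby-related words at the top of a list like this is the sort of signal that points to a missing pre-made activity.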

Incidentally, you mention that you think the ID I pass to the registration page is for tracking conversions.  Actually, not so much.  I track conversions with Mixpanel.  The reason that ID gets passed is to provide continuity of experience for new trial users.  I’m actually really proud of this hack: if you show up on my landing page for, I don’t know, tea bingo, and you click “Create Your Own Bingo Cards” and sign up for the free trial, your free trial account gets pre-initialized with my set of tea bingo cards already in it and “personalized” instructions on the dashboard about how you can print bingo cards like the tea bingo cards you were just interested in.

This greatly increases funnel success in A/B tests.  (You are roughly 20% more likely to successfully download a set of customized bingo cards if I give you the “personalized” treatment than you are if I drop you at a blank dashboard and expect you to fight your way through.)
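The continuity trick can be sketched in a few lines. The catalog, the ID value, the struct, and the wording of the instructions are all hypothetical; the idea taken from the text is only that the landing page's ID travels to signup and decides what the new trial account starts with.

```ruby
# Hypothetical catalog: landing page ID => topic and its pre-made word list.
ACTIVITIES = {
  17 => { topic: "tea bingo", words: ["oolong", "kettle", "scone"] },
}

TrialAccount = Struct.new(:email, :word_lists, :instructions)

# Creates a trial account, pre-filled when the signup came from a known landing page.
def create_trial(email, landing_id = nil)
  # nil or junk input converts to 0, which is not a key in the catalog.
  activity = ACTIVITIES[landing_id.to_i]
  if activity
    note = "Here is how to print bingo cards like the #{activity[:topic]} cards you were just looking at."
    TrialAccount.new(email, [activity[:words]], note)
  else
    TrialAccount.new(email, [], "Welcome! Start by typing in a word list.")
  end
end

puts create_trial("teacher@example.com", "17").instructions
```

Falling back to the blank dashboard for unknown IDs keeps the signup path working even when the parameter is missing or has been tampered with.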

I also do this for my PPC (AdWords) campaigns: if you respond to an ad for Halloween Bingo Cards, then by jove I’m going to do everything short of dropping a pumpkin on your desktop.

Best idea here: I don’t think enough software companies unify the marketing and product sides, incidentally.  We tend to treat everybody coming in to the top of the funnel as absolutely the same.  Then we treat everybody who makes it through funnel step N exactly the same.  But we’ve got data that says they are different — why not use the data to enhance their experience and, not incidentally, improve their propensity to buy the product?

For example, if I were in charge of World of Epic Dragonslaying, and I had a PPC

Your Bingo Card landing pages allow you to programmatically generate tons of pages from content in your product.  What other tips do you have for getting lots of good SEO content for a low investment of time/money?

I suggest reading the parts of the article about scalable content generation.  I don’t have another magic secret that I use for my own business.  OK, maybe half a secret: data begets data.  For example, I’ve got my 800 or whatever the number is bingo card activities that my freelancers and I cooked up.  I use that in several places: each bingo activity becomes

  • a content page
  • a PPC landing page
  • an activity in the downloadable version of the software
  • an activity in the online version of the software

This gives me usage/popularity data about the same subjects.  I use that for:

  • automated interlinking of content pages  (see left hand sidebar, “Related Activities”)
  • automated decisions of promoted content on the front page
  • my popular activities list
  • widgets across my “sprawling bingo empire” which list popular activities
  • semi-automated decisions on which content to promote to mini-sites
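As an illustration of how one popularity number can drive several of those uses, here is a small sketch. The `Activity` fields, the download counts, and the choice to treat "related" as "most popular in the same category" are assumptions, not a description of how the site actually does it.

```ruby
Activity = Struct.new(:slug, :category, :downloads)

# Most downloaded activities first; ties broken by slug for stable output.
def popular(activities, limit = 5)
  activities.sort_by { |a| [-a.downloads, a.slug] }.first(limit)
end

# Sidebar links: the most popular other activities in the same category.
def related(activities, current, limit = 3)
  candidates = activities.select do |a|
    a.category == current.category && a.slug != current.slug
  end
  popular(candidates, limit)
end

# Hypothetical usage data.
CATALOG = [
  Activity.new("halloween", "holidays", 900),
  Activity.new("valentines-day", "holidays", 400),
  Activity.new("thanksgiving", "holidays", 650),
  Activity.new("baby-shower", "parties", 1200),
]

p popular(CATALOG, 2).map(&:slug)
p related(CATALOG, CATALOG[1]).map(&:slug)
```

The same `popular` helper can feed the front page, the popular activities list, and the cross-site widgets, so each new activity improves every one of them without extra work.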

Anyhow, if I should come up with a good second idea to generate content for the website, you’ll likely hear about it here roughly contemporaneously with me implementing it.  Many of my friends have suggested I might be at the point of diminishing returns for BCC.  I think that is likely accurate, and so my very best ideas this year are probably going to be in service of my next software project.  However, given that BCC has always been nights and weekends for me, that doesn’t necessarily mean “maintenance mode” for it will be totally bereft of new ideas.

I hope that answers your questions, Peter.  Thanks for asking.


Strategic SEO for Startups

One way I’ve found to cut down on support requests is to make sure I write publicly about any issue that keeps coming up for my customers.  Other small companies contact me for advice fairly frequently, and that also tends to retread the same issues, so I’m going to blog it in depth once rather than giving fifteen people 30% of my thoughts on the same issue. One common issue is “How do I improve our SEO?”

Strategy as opposed to tactics: SEO has a lot of opportunities for micro-optimizations in it, from rewriting title tags to dynamically interlinking content pages.  They’re all interesting subjects and I’m not going to talk about them.  If you don’t feel comfortable in your meat & potatoes SEO yet, head on over to SEOBook or SEOMoz.  Both are excellent resources.  I’m going to focus on core decisions you make about your business and marketing approaches rather than page-level optimization.

Why Startup SEO Is Different

Essentially every business on the Internet from multi-billion dollar giants like Bank of America down to a one-man software business is dependent on SEO, because Google has become the primary navigation tool for the Internet.  (I suppose I could write “search engines” but I feel no particular need to maintain the polite fiction that there is more than one search engine in the United States.)

SEO for a small business is very different than it is for Bank of America.

Limited budgets: Startups cannot devote huge amounts to advertising, branding campaigns, or link acquisition.  (Paying for links will theoretically draw the wrath of Google to you.  In practice, once you’re above a certain size, you’re immune.  If you’re reading this article, you do not have immunity.)

Low domain strength / trust: Google tends to trust older domains, domains with lots of links, and domains with lots of older links.  All of these are signals of what one might call trust: the longer you’ve been on the Internet and the more people who asserted your quality by linking to you, the less likely you are to be a useless spammer.  However, if you just registered your domain last Tuesday, Google has a priori no reason to trust you over the other billion pages on the Internet.

Cultural aversion to SEO: There is a pernicious myth among startups that SEO is a black art aimed at perverting the purity of the search results.  This is partially because search engine spam is indeed a problem and partially because Google is very good at influencing the culture of technically adept people, and it is in Google’s best interest to make people think that their algorithms are the authoritative voice of God.  (Google, for all its image as an open company with significant OSS contributions, yadda yadda yadda, guards its index and algorithms with a ferocity that would do Microsoft credit.)

Algorithms have no moral status.  If your engineering team sorts records using an n^2 sorting algorithm, then tells you that they did it because the sorting has always been n^2 and therefore this is the Morally Correct Way To Sort, you need to whack your engineering team over the head and tell them to do better.  Similarly, your SEO strategy is simply the input you provide Google’s black-box algorithm which sorts search results: just because it is ineffective does not mean it is the Morally Correct Way To Sort.

A related worry is that SEO hurts the user experience.  It certainly doesn’t have to — a good deal of SEO is about creating stuff your users want to use, surfacing content in a way that is understandable to them, and not breaking your site’s usability when seen from the primary Internet navigation method (Google).  I wouldn’t advocate black hat methods: the black hatters are better than you are at them, and if you use them you’re in a constant arms race with Google (who has billions of dollars, thousands of sharp engineers, and the peaceful conflict resolution skills of Darth Vader) when as a startup you’re already biting off more than you can chew.

Why Startup SEO Is Better

On the plus side, you do have some advantages as a startup:

Strong Technical Skills: I’m a moderator in charge of programming topics at SEOBook and we get an awful lot of nuts and bolts questions like “How do I edit a title tag?” or “How do I do a 301 redirect in Apache?”  Thankfully, since you presumably have programmers who know what they’re doing, you’ll never need to ask either of those.  In addition, you can program tools and content to improve your marketing, including SEO.  We’ll discuss specifics in a moment.

Link Richness: SEO is, at competitive levels, mostly about link acquisition.  It is very difficult to get a link without paying for it in many sectors of the information economy.  For example, while there is probably a thriving micro-community of online taxidermists, they probably control relatively few links compared to their numbers.  However, if you’re a startup, you probably hang out on Hacker News or similar where the blogs-to-person ratio is 6.3, a new useful bit of OSS can make news in four continents on the first day, and online interaction forms a substantial portion of the personal and professional identities of your peers.

There are pluses and minuses to this: a lot of people overadapt to the fickle preferences of TechCrunch et al.  That reminds me of dodgeball in fourth grade except there are 100,000 kids and it is mathematically possible for all of them to be picked last.  Appealing to your peers can’t be your only marketing strategy.  However, it is helpful for when you’re making a cold start, to help get the link to rankings snowball running.  One business which did this very well is Balsamiq, which sent letters to blogs big and small to get coverage.  Steal Peldi’s approach to writing them: it is aboveboard and works.

Strategic SEO Objectives

Ideally speaking, well prior to launch you should figure out exactly what you hope to get from SEO.  “Rankings” is not an acceptable answer.  Neither is “visitors”.  I could get your startup ranked for [fried squirrels with wasabi] by the end of the day, but unless you’re selling a book of very eclectic recipes that probably won’t do you much good.

If you’re selling display advertising, coating every search result under the sun might actually work for you.  (Display advertising is, essentially, search advertising’s less talented brother: it is a second bite at the apple for advertisers to get a click when users avoided the AdWords ads on Google.  I have deep, deep doubts about the sustainability of display advertising as a business model.)

If on the other hand you’re trying to get users or sales for your application, you have to balance the needs of your SEO operation with the need to convert users.  For example, your homepage will almost invariably be the strongest page on your site.  It probably has to be conversion-oriented rather than conversation-oriented.  However, outside of the home page, conversion-oriented pages don’t attract links that frequently.  Almost nobody blogs “Hey guys, I saw an awesome sales letter today, check it out” and if they do you probably don’t want their attention anyhow.

So your SEO strategy is likely going to involve a mix: non-commercial offerings designed purely to solicit links/attention, semi-commercial scalable content generation which we’ll talk about in a minute, and sales funnels supported by the rest of your website.

Aiming at a moving target: The first cut of your SEO strategy will be wrong, just like v1.0 of your product will be non-responsive to the needs of your users.  That is OK: after you start you’ll begin collecting insights and data which let you refine it.  You want to get something out the door as soon as possible so that you can begin collecting links, other indicia of trust, and data on what is working for you.  Many startups wait until launch to put a significant amount of content on their websites.  This is almost always a mistake.  If you can’t show the application yet, no problem, talk about the problem domain.  Talk about the needs of your customers.  The “media launch” where Steve Jobs comes down and presents the iCommandments works very well if you have a built-in base of millions of radical fans and a PR budget which could buy Chile.  If you’re reading this, that probably doesn’t apply to you.  Google is going to hate your bones when your website first debuts onto the world stage: start that clock ticking as soon as possible.

There is no Google sandbox: If you’re well read about SEO you’ve probably heard about the “Google sandbox”, where sites languish for months or years prior to ranking.  There is no Google sandbox per se: a site doesn’t magically jump from zero to hero because it is 180 days old.  Google can find sites within minutes of them appearing on the Internet and rank them inside of an hour if Google has sufficient reason to.  The sandbox is the perceived reality, though, because from a cold start it takes a while to build up symbols of trust, such as links from trustworthy domains.  All the more reason to get started early.

SEO Is A Feedback Loop

Sites tend to build self-reinforcing authority: the site at the top of the rankings for teddy bears (almost certainly Wikipedia, I can tell you without looking) is the first place people go for teddy bears and the most likely to collect another citation when someone is writing about teddy bears.  That will help that site rank for teddy bears and everything else in the future.  In this sense, winners win in SEO.

What does that mean for you?  Well, if your startup does designer teddy bears, Wikipedia has a built-in advantage over you for ranking for [teddy bears] and that advantage gets stronger with each passing day.  However, all is not lost: by moving further down the long tail of search terms, you too can benefit from self-reinforcing authority.  If you’re the best place on the Internet to go for [kimono teddy bears], your site will get stronger each passing day just by virtue of that.

If you’ve done much conversion optimization this should not be a big surprise to you, but things at the top of a page get clicked much more than things lower on the page, all else being equal.  This is equally true of search results: when AOL released its data, the top result got over 40% of the clicks, the second result 11.9%, etc.  The entire second page, by comparison, got only 10%.  SEO is a winners take most game: for a given search term, the vast majority of the benefits flow to the handful of sites at the top of the first page.
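To put rough numbers on that, here is a minimal Python sketch.  The click shares are the AOL figures quoted above (the top result got "over 40%", so I use 40% as a floor); the monthly search volume is a made-up number for illustration, not data from any real keyword.

```python
# Click shares quoted above from the AOL data. The top result got "over 40%",
# so 0.40 is a floor rather than an exact figure.
CLICK_SHARE = {
    "result #1": 0.40,
    "result #2": 0.119,
    "entire second page": 0.10,
}

# ASSUMPTION: hypothetical search volume, chosen only for illustration.
MONTHLY_SEARCHES = 10_000


def estimated_clicks(monthly_searches, share):
    """Estimate clicks per month for a slot that receives `share` of all clicks."""
    return monthly_searches * share


for slot, share in CLICK_SHARE.items():
    clicks = estimated_clicks(MONTHLY_SEARCHES, share)
    print(f"{slot}: ~{clicks:,.0f} clicks/month")
```

Note that the second-page figure is shared by every result on that page, so any single result there gets only a slice of it.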

What does this mean to you?  It means focus on search terms you can win.  You will not prevail against the likes of Microsoft, Google, et al for head keywords in most circumstances, unless your product becomes synonymous with the niche.  (A head search term is at the popular end of the search frequency distribution, as opposed to on the long tail.  This is completely relative: [money] is a head term relative to [bingo cards], and [bingo cards] is a head term in the bingo niche relative to [valentines day bingo].)

Incidentally, I can’t recommend The Long Tail enough for anyone interested in SEO.  If you’ve been on the Internet the last few years you’re probably sick to death of it and have read the (accurate) criticisms of conclusions about books and music being overstated.  However, no single book will improve your thinking on SEO as much as The Long Tail will.  (In particular, read up on tails within tails.)

For the amount of effort it would take you to rank #12 for the head term of your choice, which will result in marginal traffic even if the head is huge, you could rank in the top three for a huge basket of tail terms.  Additionally, one of the things you’ll notice is that conversion rates for head terms are terrible.  People searching for the terms on the head are either just beginning their research into a topic or are less sophisticated.  Generally, those are not the searchers you want.  Longer, specific queries are more common among people who have done the research and are nearing a purchasing decision.

Here’s an example for you: for the last several years I’ve ranked on the first page for [bingo cards] most of the time.  At the moment I’m probably, oh, eightish or so.  That was worth about 6,300 visits in 2009.  That resulted in three purchases of my software, for a value per visitor of a bit more than a penny.  Wheeeee.

By comparison, [free bingo cards] gets less than a fifth as much traffic, according to Google’s keyword tool.  However, the 1,200 visitors there also bought 3 copies.  (If you didn’t expect people explicitly looking for free things to convert at five times the rate of undifferentiated searchers, welcome to the Internet.  Nothing makes sense except the data you collect.  Get something out there today so you can find which 90% of everything you know is wrong.)

Now if we go waaaay down the tail to [geography bingo], we find that despite it having fairly few searchers (I only got about 300 visits a year from it), it is quite lucrative ($70 CPM).  I could spend my entire life working in bingo and never be #1 for [bingo cards], but for a non-competitive tail term like [geography bingo], I’m #1 by virtue of showing up.
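If you want to check the arithmetic behind those three keywords, here is a small sketch.  The visit and purchase counts are the ones quoted above, and I read "CPM" as revenue per 1,000 visits.  The price is an assumption: this article never states one, so the $30 below is a placeholder used only to show how value per visitor falls out.

```python
def conversion_rate(purchases, visits):
    """Fraction of visits that ended in a purchase."""
    return purchases / visits


def revenue_from_cpm(cpm_dollars, visits):
    """Revenue implied by a CPM, read here as dollars earned per 1,000 visits."""
    return cpm_dollars * visits / 1000


# Figures quoted in the text above.
head_rate = conversion_rate(purchases=3, visits=6300)  # [bingo cards]
free_rate = conversion_rate(purchases=3, visits=1200)  # [free bingo cards]
geography_revenue = revenue_from_cpm(cpm_dollars=70, visits=300)  # [geography bingo]

# ASSUMPTION: the article does not state a price; $30 is a placeholder.
ASSUMED_PRICE = 30.00
head_value_per_visit = 3 * ASSUMED_PRICE / 6300

print(f"[free bingo cards] converts at {free_rate / head_rate:.2f}x the [bingo cards] rate")
print(f"[geography bingo] revenue from 300 visits: ${geography_revenue:.2f}")
print(f"[bingo cards] value per visit at the assumed price: ${head_value_per_visit:.3f}")
```

The same three functions work on your own analytics export: swap in your visits, purchases, and real price per keyword and sort by value per visit rather than by raw traffic.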

Sadly, a lot of startups of my acquaintance are so focused on the product that they don’t bother showing up for the topics that matter to their customers.  I won’t pick on anybody in particular (sidenote: write “It’s OK to mention this conversation publicly” on an email to me and you might get a backlink when I need an illustrative example, like here), but it is very common for startups to launch with less than 1,000 words of text on their website and all the content behind the sign in screen.  That essentially cedes the long tail to your competitors.

Thus, my generic SEO strategy for a startup is a) be the best on the Internet for b) as many topics as you possibly can be that c) matter to your paying customers.

Making SEO Scale

Everything about a startup has to scale ridiculously disproportionately to the time invested in it, because you have too much to do and not enough people to do it with.

Some people say this is why you have to work 80 ~ 100 hour weeks.  If I worked 100 hour weeks, Scholastic Publishing would still be able to afford to devote a thousand man-hours for every one I can, if they chose to.  Your only hope for rising above the din on the Internet is to work smarter than your competitors.  Happily, your small size, technical skill, and agility let you run rings around the other guys.  One way is through scalable content generation.

Content in SEO is sort of a dirty word.  It can mean anything your users can consume: text, video, whatever.  Sadly, when people talk about content they are mostly talking about commoditized garbage, because the quality levels of content produced at scale are generally terrible, as you’re about to see.

There are about four approaches for creating content at Internet scale:

User-generated content.  Strategies centering around user generated content really devolve into two things: one, you hope people will steal hand-crafted content from elsewhere and put it on your site while you look the other way long enough to build traction (hello, Youtube, Scribd, etc) and two, you generate vast amounts of mostly excruciatingly worthless content which happens to match an equally vast amount of search terms.  Then, you sell display advertising against the visits for those searches.  This is essentially the business model for WordPress — give a blog to anybody who asks for one, display AdSense ads to folks who arrive on old posts via Google.  The ads give them the answers the content could not.

I don’t mean to malign user-generated content too much.  Sturgeon’s Law says that 90% of everything is garbage, which implies that 10% is not.  However, it is very difficult to use that 10% that is not garbage to advance your business goals, because it is not conversion-oriented and your advertisers don’t pay premium CPM rates just because the page the user landed on is worthwhile.  (Actually, in practice it tends to work out the other way around: if the page the user lands on is worthwhile, it will likely satisfy their desire, and economic value from that searcher ends.  That means low CTRs to ads and, accordingly, low CPMs.  If on the other hand the page is useless, then they might click on an AdSense link to continue the search.  This is the perverse incentive by which advertisers pay to make the Internet a mass of garbage.)

Mass Semi-Amateur Content Creation: The Demand Media model is capturing quite a bit of attention these days: take an authority domain like eHow, use sophisticated algorithms to generate article ideas for it, pay an army of underemployed freelancers minuscule wages to write uninspired content about the suggested titles, collect hundreds of millions in AdSense revenue.

The quality of Demand Media (et al) content is a cut above Youtube comments, but not by all that much.  I don’t really recommend implementing this model for startups.  First of all, I think Google is going to have to crush it like a bug in the next 12 months, because currently it is a license to print money and is polluting far too much of the search space.  Second, the amount of sophistication it requires is considerable, and while I think that is probably duplicable for a startup (particularly if you used something like TextBroker to automate dealing with the freelancer army) I think you’re better off with your engineering investments in more defensible places.

That being said, study this model and study it well: they’ve got a tight analytics-to-pipeline loop, they’ve got almost everything automated, and their margins are out of this world.  There is no reason you can’t do those things while producing great content by taking advantage of focus and engineering ability that Demand Media cannot devote to every microniche they want to expand into.  Demand Media can saturate the world in How To questions but will never be able to outpublish me for bingo cards, because they will never detail someone to write a CMS to let their freelance army make those easily.

Talented expert workers: You can have all of your website content created by talented artisans who laboriously polish every bit to perfection.  For example, you could write every page by hand yourself, or hire a team of journalists to do it for you.  Have you seen the financial results for the New York Times recently?  Still want to do this except without the 200 year old megabrand?  Good, moving on.

Scalable Content Creation That Works

So how are you going to create large amounts of content that satisfies needs for your users while still advancing your business needs and not being garbage?  You leverage the unfair advantages that you have because you’re the smallest guy in the room.

Data You Can’t Get Anywhere Else: If you hang out around geeks who can’t get dates, you’ve seen a series of posts by OKCupid on topics such as how your race affects responses in online dating.  This is brilliantly done linkbait: it takes a huge amount of proprietary data (OKCupid response analytics) and exposes it in such a way that it is interesting (“Whoa, the very hottest women really do get hit on less than you would expect”), easily consumable (“Whoa, this pretty picture demonstrates that black guys have it hard when dating.”), and easily shareable (“Guys, I found scientific proof of why we need to take our shirts off!”)  If you’re J. Random Dating Affiliate, you can’t possibly duplicate that linkbait.  OKCupid can do it over and over and over again, though: they’ve written the analytics tools, they’ve figured out how to do the research and visualizations, all they need to do is come up with a new hook and bam they’re at the top of the social news sites collecting links again.

If you don’t have interesting data, you should start collecting interesting data.  However, in the meanwhile you can start visualizing or crunching existing data.  This is less defensible — anybody can go to the Census and get a few gigs of various poorly conceived slices to fill their hard drive — but you can add a whole lot of value in less time than you think with some SQL, your graph library of choice, and a well-written executive summary.
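As a rough idea of how little code "some SQL and an executive summary" can be, here is a sketch using Python's built-in sqlite3 module.  Everything about the data is hypothetical: the `postings` table, its columns, and the rows are invented for illustration, and in practice you would load a real public dataset instead.

```python
import sqlite3

# ASSUMPTION: a made-up table of job postings standing in for real public data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE postings (city TEXT, salary INTEGER)")
conn.executemany(
    "INSERT INTO postings VALUES (?, ?)",
    [
        ("Chicago", 60000),
        ("Chicago", 80000),
        ("Osaka", 50000),
        ("Osaka", 70000),
        ("Osaka", 60000),
    ],
)

# One GROUP BY is often enough to turn raw rows into a publishable table.
rows = conn.execute(
    "SELECT city, COUNT(*) AS n, AVG(salary) AS avg_salary "
    "FROM postings GROUP BY city ORDER BY avg_salary DESC"
).fetchall()

# The "executive summary": one plain sentence per row, ready for a blog post.
for city, n, avg_salary in rows:
    print(f"{city}: {n} postings, average salary ${avg_salary:,.0f}")

conn.close()
```

From here, feeding `rows` to your graph library of choice is the remaining step; the analysis itself is the single query.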

One of the few bright points for the New York Times is that they’re capable of doing things like this, for example.  You could have done that.  If you were in the job board industry, you could do something like that every Friday afternoon, by using open source, agile development, and all that jazz.  Pretty soon you’ll be cited as an authority on the subject — because, ahem, someone who publishes repeated analyses of raw data is an authority on the subject (or at least appears to be, which is 90% of what matters on the Internet, for better or worse).

Focus on evergreen content: A lot of people like blogs as content generation engines, and indeed, I think every startup should probably have a blog.  Then people blog on current events.  Bad call!  You see, today’s news is worth reading for about a day — less, in some sectors of the economy.  You’re a hamster on a wheel if you’re trying to keep up with the news — tomorrow, everything you write today is worth markedly less, and a week from now it will be almost totally forgotten.  Instead, pick the concerns of your audience that are roughly static and that will be pretty much the same next week, next month, next decade.  Alternatively, create resources that don’t go stale.

For example, for a bit of extra work that NYT visualization above could use live data, and instead of being a wonderful piece of technology becoming quickly irrelevant to a story from years ago, it could be a hub for the enduring issue of Racial Difference In America.  The NYT is interested in that issue and still will be in 2012.  They don’t have the strategic vision to make that graph with live data, though.  Luckily, your business is not a maladapted dinosaur reacting too little and too late to the changing business landscape.

I like to call this “evergreen content.”  For example, if you have a website selling a service teaching people Japanese, a page on how to make requests in Japanese will be good for generations.  It is evergreen.  Or エバーグリーン, I suppose.

Agile — Not Just For The Product: Because you have excellent internal analytics (you do, right?) and you track what is working and what isn’t (you do, right?) and you can quickly bring resources to “market” because you’re using highly productive programming environments (you are, right?), you can try ten things, watch eight fail, and then try ten variations on the best two.

For example, suppose you have a mailing list of customers or fans (you do, right?).  Pitch (comparatively) low cost explorations of ideas to them, like blog posts about topics A/B/C/D/E.  Observe which one gets the most play with your existing customers.  Build (more expensive) resources about that topic, like something which requires custom programming.  (Bonus points: credit your customers with the inspiration for building the new thing!  You want a 95% certain way to get a link from Bob Smith’s blog to your new article?  Cite his contribution to it.  Help them help you get the ball rolling with their blogs, Twitter accounts, blah blah.)
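The "observe which one gets the most play" step can be as simple as ranking topics by click-through rate.  A minimal sketch, assuming you can export sends and clicks per topic from your mailing list tool; the topic names and counts below are invented.

```python
def rank_topics(results):
    """Return topic names sorted by click-through rate, best first.

    `results` maps a topic name to a (emails_sent, clicks) tuple.
    """
    return sorted(
        results,
        key=lambda topic: results[topic][1] / results[topic][0],
        reverse=True,
    )


# ASSUMPTION: hypothetical numbers for five cheap blog-post experiments.
results = {
    "A": (500, 10),
    "B": (500, 45),
    "C": (500, 5),
    "D": (500, 30),
    "E": (500, 12),
}

# Keep the best two and build the expensive resources around those.
best_two = rank_topics(results)[:2]
print("Invest further in topics:", best_two)
```

With samples this small the ranking is noisy, so treat it as a hint about where to spend engineering time rather than as proof.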

Obviously, if one idea works out well for you, going in more depth or breadth on the same theme allows you to possibly re-use code, link sources (“Hey Cindy, this is Patrick from Random Job Startup.  A few months ago you had some great comments  about our unemployment visualization.  We’re putting together something similar and I wanted to ask if you had any more insights…”), marketing tacks that worked, etc.  (A great micro-idea I heard the other day: watch what people tweet about your stuff, use that as the title next time.  This may be the first time I’ve ever heard of an idea to get actual value out of Twitter.)

Pillar Content vs. Bill ‘er Content

As mentioned, you’re going to have to strike a balance between creating content designed to spread and gather links, attention, etc. and content designed to sell your stuff.  They’re not totally disjoint sets, but in practice non-commercial content will form the vast majority of your links.

If you don’t have any great ideas for non-commercial content (“How do we get people talking about our new squeegee brush?  It is a boring subject”), here’s a couple:

Open Source Software: You’re a programmer and you probably use vast amounts of OSS.  It is highly likely that in the process of creating your startup you will write some plumbing which is not your source of competitive advantage, but would solve problems for other people.  Since you already wrote it, why not OSS it?  Spend a few hundred on a nice logo (this is rounding error next to the engineering time you have invested and will greatly increase spread, trust me), write up a decent page on your website with examples and documentation, and send it to folks you think could use it.

I did this for my Rails A/B testing software, which at the time was a sorely underserved niche.  That is probably my best links-to-unit-effort idea ever, and it got links from authoritative sources like the Ruby on Rails official site who may not have been interested to hear about my new and improved Jane Austen bingo cards.  (Some people have no appreciation for the finer things in life — at least according to the rabid Jane Austen fans on the Internet.)

I have one comment on OSS for SEO which may cost me geek cred: does Github pay your salary?  I love them.  They’re wonderful people.  They contribute a lot to OSS.  They are also quite good at marketing their business and do not require your help to do it.  If you’re going to do OSS to get links, get links to your own site.

Blog Your Email: Do you get pre-sales inquiries or support requests?  Take careful note of how your customers ask questions, because they speak a different language than you do.  I describe bingo cards as “unique”, my customers frequently describe them in email as “not the same” or “not alike”, as in “How do I make bingo cards that are not the same?”  Using the same language that your customers use, answer their questions in public.  This can be bill ‘er content, since somebody asking this question likely has a need they’re interested in paying money to solve (after all, a person just like them has sent you an email about it, knowing that your answer is going to involve “Oh, you do this on our product”).  Thus, while you are answering the question, you can probably work in a plug for your product.
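If you have more email than you can eyeball, a few lines of Python will tell you which wording your customers actually use.  This is only a sketch: the sample emails are invented, and the candidate phrases are the ones from my bingo card example above.

```python
from collections import Counter


def count_phrases(emails, phrases):
    """Count case-insensitive occurrences of each candidate phrase across emails."""
    counts = Counter()
    for email in emails:
        text = email.lower()
        for phrase in phrases:
            counts[phrase] += text.count(phrase)
    return counts


# ASSUMPTION: invented support emails, standing in for your real inbox export.
emails = [
    "How do I make bingo cards that are not the same?",
    "My cards are not alike, is that right?",
    "Are the cards not the same every time I print?",
]
phrases = ["unique", "not the same", "not alike"]

counts = count_phrases(emails, phrases)
for phrase, n in counts.most_common():
    print(f"{phrase!r}: {n}")
```

Whatever phrase wins is the one to put in the title and first paragraph of the public answer, even if it is not the word you would have picked.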

Good SEO Can Make Your Startup

Your startup can succeed at SEO via the sweat of your brow and a bit of focused creativity, without having to spend hundreds of thousands to do so.  In terms of cost efficiency, organic SEO is probably the most efficient distribution method ever created.  Even with very modest amounts of resources, you can get hundreds of thousands of visits and add thousands of users to your product.  (I do, and I’m certainly not a towering giant conquering the Internet from my local rice field.  You can do better.)

If you take one thing from this article, please, take this: you cannot afford to not have an SEO strategy.  If the idea of being an SEO gets your dander up, get over it, or drop me a comment and I’ll suggest something you can do that you won’t dislike but will still improve your SEO.

The usual disclaimers: I don’t get compensated for using people as examples.  I do try to write most people who ask for advice (odds are better if you ask good focused questions, let me get a blog post out of it, etc) but I know a few have slipped through the cracks as of late.  I’m by no means the world expert at this — take everything I say with a grain of salt.
