I get asked reasonably often to help companies and individuals come up with a cost model for their cloud computing. People get really exercised about the cost of hundreds of compute nodes and terabytes of storage. I know what these models should look like because I built an insanely complex model for PutPlace in 2006 when we founded the company and decided to deploy it on Amazon. I had the same concern that most people have: was I building a business that was going to explode in my face because I had made some fundamentally flawed economic assumptions?
Once we launched PutPlace we rapidly discovered a number of interesting facts about our cost structures. The first one was that in a small online business such as PutPlace the compute costs dominate to the point that storage, bandwidth and transaction costs are essentially rounding errors. The second thing we discovered is that attracting enough users to move the needle from a compute perspective is “ahem” challenging for most companies. With consistent upload rates of over 10k files per day (with occasional peaks exceeding 100k files daily) our grid wasn’t even breaking a sweat. We had absolutely no red-line events on compute and of course AWS happily absorbed everything we threw at it without blinking.
Even at the end of 2008, when the service had been up and running for 6 months, compute nodes still accounted for over 75% of our costs.
So if you want to understand your cost basis, a very simple model is to work out the number of compute nodes you want to run, price those nodes in AWS (or Slicehost or Rackspace) and use that as the monthly cost model for your whole environment. Once you have a few months of billing data, you can subtract your compute costs from each bill to find your variable costs in storage, transactions and bandwidth (which I’m betting will be marginal). Divide those variable costs by your number of active users and you have your variable cost per active user, which is the basis for a cost-plus price model for each user.
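As a rough sketch of that arithmetic, here is a small Python function. The function name, the parameter names and every number in the example are made up for illustration (they are not PutPlace's actual figures), and it assumes you run a fixed number of identically priced nodes all month.

```python
def cost_model(monthly_bills, node_count, node_monthly_price, active_users):
    """Split cloud bills into compute and variable costs (illustrative only).

    monthly_bills      -- total bill for each month, in one currency
    node_count         -- number of compute nodes run all month (assumed fixed)
    node_monthly_price -- price of one node per month (assumed identical nodes)
    active_users       -- active users over the same period (assumed constant)
    """
    # Compute is treated as a fixed monthly cost.
    compute = node_count * node_monthly_price

    # Whatever is left on each bill is storage, transactions and bandwidth.
    variable_by_month = [bill - compute for bill in monthly_bills]
    avg_variable = sum(variable_by_month) / len(variable_by_month)

    return {
        "compute_per_month": compute,
        "variable_per_month": avg_variable,
        "compute_share": compute / (compute + avg_variable),
        "variable_per_user": avg_variable / active_users,
    }


if __name__ == "__main__":
    # Hypothetical numbers: 4 nodes at 75.0 a month, three months of bills.
    model = cost_model([380.0, 400.0, 420.0], 4, 75.0, 2000)
    for name, value in model.items():
        print(f"{name}: {value:.2f}")
```

The variable cost per user is the number to add your margin to; the compute share is just a sanity check on how much of the bill is nodes.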
You should still analyse your bill once a month to prevent surprises (like when we discovered we had 12 months of database snapshots, taken at 5-minute intervals, that no one was cleaning up) and to understand how the dynamics of your system are changing, but your key focus should be on your overall business model and your customer acquisition strategy.
Think of cloud computing like any other variable cost in your business. When you are small it is marginal (have you ever priced up electricity usage in your startup financials?) and if you get big it just becomes a cost of doing business, so don’t sweat it!
(Most of the above applies equally well to building a “scalable” system. NoSQL boosters should take note!)