Gaming the Cloud: Balancing IaaS versus PaaS

This is the third post in the Gaming the Cloud series. In the first two posts I wrote about having the right use case and the need for cost-aware applications in order to “win in the cloud.” These are important initial steps, at a conceptual level, to adopting a cloud computing model. This post dives deeper into the tools available and how to balance cost versus control and cloud vendor “lock-in.”

A robust discussion about cloud computing should cover the two ways of consuming the cloud: Infrastructure as a Service (IaaS) and Platform as a Service (PaaS). Let’s define these now and then compare and contrast the optimal way to utilize IaaS and PaaS in our pursuit to game the cloud.

Cloud computing in 2011 is mostly used to support large-scale web applications. Over time the cloud will start to power traditional IT, but for now big SaaS apps are where the innovation is occurring. The state-of-the-art thinking about the software and infrastructure required to power software as a service (SaaS) systems has evolved in stages: SaaS delivered from non-cloud co-location morphed into cloud in IaaS livery, and then finally into cloud with PaaS. Cloud PaaS is at odds with human nature’s desire for more control. But more control also comes at a cost, and the industry is collectively reconciling the most efficient way to balance cost versus control. The pros and cons of Platform as a Service are at the center of this debate.

Certainly the cloud is now seen as a credible provider of IT services, and the sniping between the no-cloud “Co-Lo” naysayers and cloud supporters is subsiding. But now that we’re firmly in the pure cloud world, another skirmish is brewing: IaaS versus PaaS. IaaS today is regarded the way Co-Lo was just five years ago. Accelerating technology evolution will make PaaS the new accepted standard in the next few years, and the industry will look back on 2011 with the attitude “what was all the fretting over IaaS versus PaaS?” The debate about IaaS versus PaaS can be summarized as:

If all things are equal in terms of raw material costs, and there is no fear of vendor “lock-in,” then what’s the best choice to maximize time and effort?

Comparing IaaS versus PaaS with respect to cost, control and lock-in, along the same continuum, looks like this:

One advantage IaaS has over PaaS is more predictable infrastructure cost estimating. Compute and storage are easier to model when the building blocks are CPU hours and gigabytes consumed per month. Extrapolating PaaS costs is a more challenging exercise because the cost models are multi-dimensional. Over time multi-dimensional pricing will be a benefit, since software written specifically for PaaS systems can operate more efficiently. With PaaS there is a one-time learning curve to master the APIs and operational characteristics. The payback on this one-time investment will yield dividends forever. IaaS also has a learning curve, but it’s less steep than that of PaaS, and IaaS also has long-term operational costs that do not go away.
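To make the difference in modeling effort concrete, here is a minimal Python sketch of the two cost shapes described above. It is an illustration only: the class names, the usage dimensions chosen for the PaaS side, and the keys of the `rates` dictionary are assumptions made for this example, not any vendor's actual price list, and every rate has to be supplied by the reader.

```python
from dataclasses import dataclass


@dataclass
class IaasUsage:
    instance_hours: float  # CPU/instance hours consumed per month
    storage_gb: float      # gigabytes stored per month


@dataclass
class PaasUsage:
    storage_gb: float      # gigabytes stored per month
    machine_hours: float   # metered compute consumed by requests
    requests: int          # API requests or transactions per month
    transfer_gb: float     # gigabytes transferred out per month


def iaas_monthly_cost(usage, hourly_rate, storage_rate_per_gb):
    """Two-dimensional model: instance hours plus gigabytes stored."""
    return (usage.instance_hours * hourly_rate
            + usage.storage_gb * storage_rate_per_gb)


def paas_monthly_cost(usage, rates):
    """Multi-dimensional model: one metered rate per usage dimension.

    `rates` is a hypothetical dict with the keys used below.
    """
    return (usage.storage_gb * rates["storage_per_gb"]
            + usage.machine_hours * rates["machine_hour"]
            + usage.requests * rates["per_request"]
            + usage.transfer_gb * rates["transfer_per_gb"])
```

The IaaS function needs two inputs that are known up front. The PaaS function needs four, and two of them (machine hours and request count) depend on how the application behaves under load, which is why extrapolating PaaS costs is the harder exercise.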

PaaS allows teams to focus on core domain expertise and not get bogged down “fighting yesterday’s fires.”

Within a specific cloud eco-system, Amazon Web Services for example, we can game each of the building block components (S3, EC2, EBS, RDS, SDB) to achieve the best cost / performance advantage.

IaaS – Infrastructure as a Service

IaaS provides the basic building blocks for a cloud application. “Basic” as in the raw materials such as compute, storage, and networking. A “Do it yourself” (DIY) mindset is at play when one is using IaaS. (As an aside, migrating from co-location hosting to cloud-powered IaaS is an admission that DIY isn’t always the best route, but weighing the loss of total control against the cost of letting another entity manage a service is a hurdle some teams can’t get past.)

But even in the cloud there are gradations between DIY and outsourcing the equivalent functionality to a PaaS offering. The decision of IaaS or PaaS comes down to costs, domain expertise, the best use of people’s talent, and where to focus innovation efforts.

Infrastructure as a Service is the next logical step from co-location hosting in the evolution of adopting a cloud computing mindset. Moving to the cloud, even if only to utilize IaaS and not PaaS, means an evolved thinking that the DIY approach is not always the best way to purchase compute infrastructure.

PaaS – Platform as a Service

PaaS is the concept that common application building blocks such as database, queueing and file sharing (among others) should be consumed as commodity services rather than managed as stand-alone systems that soak up the team’s time. The idea is that the cloud provider can operate a database system more reliably and at lower cost than you can by doing it yourself. The same uber argument about cloud versus non-cloud is now heard at the basic building blocks level, with the same pros and cons.

Most architectures use a mix-and-match scheme blending some IaaS and some PaaS because each functional area has different strategic importance to the project. For example, a web application that has light database needs but intense queueing could easily use a PaaS database solution and an IaaS queueing system. Choosing PaaS or IaaS is a separate analysis for each major sub-system in the application’s software stack.
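The per-sub-system analysis can be sketched as a small Python routine. This is a deliberately simplified illustration: the `Subsystem` fields, the cost figures a caller would pass in, and the rule itself (pick IaaS when lock-in is unacceptable, otherwise pick the cheaper option) are assumptions made for this example. A real decision would also weigh domain expertise and strategic importance, which do not reduce to a single number.

```python
from dataclasses import dataclass


@dataclass
class Subsystem:
    name: str
    iaas_monthly_cost: float   # estimated DIY cost, including operations
    paas_monthly_cost: float   # estimated metered service cost
    lock_in_acceptable: bool = True


def choose_model(subsystem):
    """Pick a delivery model for one sub-system (simplified rule)."""
    if not subsystem.lock_in_acceptable:
        return "IaaS"
    if subsystem.paas_monthly_cost < subsystem.iaas_monthly_cost:
        return "PaaS"
    return "IaaS"


def plan_stack(subsystems):
    """Run the analysis independently for each sub-system in the stack."""
    return {s.name: choose_model(s) for s in subsystems}
```

Given estimates where the database is cheaper as a service and the queueing layer is cheaper self-hosted, `plan_stack` returns a mixed plan, matching the light-database, heavy-queueing example above.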

Within many technical teams the usage of PaaS fights against the tinkerer’s urge for total control. Face it, geeks like to tinker and PaaS has less obvious tinkering ability. But really this is a falsehood. While IaaS offers more knobs and dials (and the subsequent long-term ownership responsibility), PaaS offers new greenfield ways to focus on the most important value-add efforts (research) for your project. When PaaS becomes the base level from which we operate, a whole bunch of new innovation will occur “gaming PaaS” the same way we think about gaming the cloud now. PaaS has a lower base cost (for most use cases), so any effort spent figuring out how to process more transactions for less money with PaaS will be time well spent. Spending time supporting IaaS is doing the same old thing over and over, at increasing opportunity costs.

Below is an IaaS versus PaaS cheat sheet:

Let’s compare three popular services that can be delivered with either IaaS or PaaS approaches. In these examples the IaaS and PaaS platforms are both powered by Amazon Web Services.

Comparison: Relational Databases

MySQL is the default SQL database for nearly every web framework (with a shout out to PostgreSQL too, but for this example let’s focus on MySQL). Every project needs to figure out how to get reliable, economical SQL instances running in the cloud. You can run your own MySQL instances or use Amazon’s RDS (Relational Database Service) for SQL functionality. Running your own MySQL is the do-it-yourself IaaS method and using RDS is the PaaS approach. Each has pros and cons, but it’s good to have a choice, right? The cloud is all about choice and having multiple options to solve a problem.

The first part of an IaaS MySQL versus PaaS RDS analysis should be cost. Below is a cost comparison for two common implementation scenarios: a small instance managing a 100 GB database and a large instance for a 500 GB database. In both scenarios there is a resiliency requirement, so the calculations are for a multi-zone implementation in both IaaS and PaaS modes.

The cost estimates speak for themselves. For many use cases the RDS option is less expensive than doing it yourself. For the small multi-zone database RDS is 15% less expensive. The savings get better for the large database example, with RDS 19% less than IaaS MySQL. Not factored into the calculation is a quantifiable dollar amount representing the distraction factor of running MySQL yourself. If we could get an honest number for that, it would show RDS to be even more cost effective.
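For readers who want to rerun this comparison with their own numbers, the following Python sketch shows one way to compute the percentage savings and to fold in the distraction factor. The function names and the idea of pricing the distraction as operations hours multiplied by a loaded hourly rate are assumptions for illustration. The post does not provide that number, so no value for it is suggested here.

```python
def percent_savings(iaas_cost, paas_cost):
    """Percentage by which the PaaS option undercuts the IaaS option."""
    if iaas_cost <= 0:
        raise ValueError("iaas_cost must be positive")
    return (iaas_cost - paas_cost) * 100.0 / iaas_cost


def iaas_cost_with_operations(infrastructure_cost, ops_hours, loaded_hourly_rate):
    """Add the staff time spent running the database yourself.

    ops_hours and loaded_hourly_rate are hypothetical inputs standing in
    for the "distraction factor"; measure them for your own team.
    """
    return infrastructure_cost + ops_hours * loaded_hourly_rate
```

Passing the result of `iaas_cost_with_operations` as the `iaas_cost` argument of `percent_savings` shows how quickly the gap widens once even a few hours of monthly database administration are counted.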

In 2007, running a database in the cloud required significant technical expertise because it was hard. In 2011, RDS is the answer to database in the cloud. As an industry our collective technical expertise should be focused on more value-add problems, instead of jumping over the hurdles of the past.

Comparison: NoSQL Databases

Key value stores are becoming a common component in today’s web applications. NoSQL is the popular classification name for these systems. Typically a NoSQL installation is a cluster of compute nodes that shard data assets across multiple instances to achieve performance and reliability. Operational costs aren’t seen as a prime NoSQL design objective, but as a practical matter cost management has to be considered in any NoSQL IaaS versus PaaS comparison. There are many NoSQL options, and they do not compare apples to apples. Cassandra, MongoDB, Riak and CouchDB are some of the better-known projects to choose from. For this comparison we’ll use Riak as the sample NoSQL database IaaS option and Amazon’s SimpleDB (SDB) and S3 for the PaaS equivalent. Riak versus SDB/S3 isn’t a direct feature-to-feature comparison like MySQL versus RDS, but there is enough use-case overlap to make for a credible operational cost comparison.

For example, to use Riak instead of SDB/S3 requires different programming APIs and an application designed around how Riak behaves. The same for SDB/S3. By the way, the reason it’s both SDB and S3 as a comparison to Riak is because Riak is both a key value store and distributed file system. For the Amazon equivalent, SDB is the key value store and S3 is the distributed file system.

A fair cost comparison between IaaS Riak and PaaS SDB/S3 requires a multi-dimensional analysis of compute, storage, and number of transactions. Riak is easier to model because it is billed for compute and storage only, but SDB/S3 charges by compute, storage AND transactions. Knowing how many transactions occur in a fast-evolving software stack is difficult to quantify. But it is safe to say that SDB/S3 will be less expensive up to a certain tipping point of transaction volume, beyond which Riak could be the less expensive option.
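The tipping point can be estimated with simple arithmetic. If the IaaS cluster has a fixed monthly cost C, and the PaaS option has a base cost B (storage, for instance) plus a price p per transaction, the two are equal at n = (C - B) / p transactions per month. The Python sketch below encodes that formula. The linear per-transaction pricing and all parameter names are simplifying assumptions for illustration, since real metered pricing usually has tiers and several dimensions.

```python
import math


def break_even_transactions(iaas_fixed_cost, paas_base_cost,
                            paas_cost_per_transaction):
    """Monthly transaction count at which PaaS cost equals IaaS cost.

    Returns 0.0 if the PaaS base cost alone already matches or exceeds
    the IaaS cost, and math.inf if transactions are free on PaaS.
    """
    if paas_cost_per_transaction < 0:
        raise ValueError("paas_cost_per_transaction must not be negative")
    if paas_base_cost >= iaas_fixed_cost:
        return 0.0
    if paas_cost_per_transaction == 0:
        return math.inf
    return (iaas_fixed_cost - paas_base_cost) / paas_cost_per_transaction


def cheaper_option(transactions, iaas_fixed_cost, paas_base_cost,
                   paas_cost_per_transaction):
    """Name the cheaper option at a given monthly transaction volume."""
    paas_total = paas_base_cost + transactions * paas_cost_per_transaction
    return "PaaS" if paas_total < iaas_fixed_cost else "IaaS"
```

The hard part is not the formula but the input: the break-even volume is only useful if the team can estimate how many transactions the application will actually generate.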

The number one barrier preventing greater PaaS adoption is the fear of the unknown: not being able to accurately predict transaction volume and its associated costs.

In the cost comparison below, the Riak configuration is a 6-node cluster with 500 GB of attached storage per node. This would be considered a medium-sized architecture to handle an average load for a growing enterprise-focused web application. A Riak cluster can be as small as a single node, or much larger with fifteen to twenty nodes. The more nodes, the more aggregate storage, the faster the processing time, and the higher the costs.

The Amazon SDB/S3 configuration doesn’t require pre-sizing or specifying a number of compute nodes, so costing is “pay as you go.” Instead Amazon charges for the amount of data stored in SDB and the compute time for inserting data and running queries against the stored data. To make a fair comparison the SDB scenario has 6 client instances making the equivalent of 24×7 query execution to match Riak’s 6-node 24×7 cluster. And herein is the difference between PaaS and IaaS for the NoSQL use case. The Riak cluster needs to run all 6 nodes 24×7, even if there is no activity. With the SDB option you are only charged for each query execution. But let’s assume there is some query running 24 hours a day to make this a fair comparison.

The PaaS NoSQL SDB option is still two-thirds less expensive than the dedicated IaaS NoSQL Riak cluster. Where SDB could get more expensive is if query volume grew to the point where the hourly charges exceeded the combined cost of the 6-node Riak cluster. This is where software engineers get stuck trying to extrapolate steady-state usage at scale to understand whether IaaS or PaaS is the right path.
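The always-on versus metered distinction can be sketched in Python as a function of utilization, meaning the fraction of the month during which queries are actually running. Everything here is a simplified assumption for illustration: the 730-hour month, the single machine-hour rate for the metered service, and the parameter names are placeholders, and the rates must come from the reader's own price sheet.

```python
HOURS_PER_MONTH = 730  # average hours in a month (8,760 / 12)


def always_on_cluster_cost(nodes, node_hourly_rate, hours=HOURS_PER_MONTH):
    """An IaaS cluster bills for every node-hour, busy or idle."""
    return nodes * node_hourly_rate * hours


def metered_cost(clients, machine_hour_rate, utilization,
                 hours=HOURS_PER_MONTH):
    """A metered service bills only for the time queries are running."""
    if not 0.0 <= utilization <= 1.0:
        raise ValueError("utilization must be between 0 and 1")
    return clients * machine_hour_rate * utilization * hours


def cost_by_utilization(nodes, node_hourly_rate, machine_hour_rate,
                        steps=(0.0, 0.25, 0.5, 1.0)):
    """Return (utilization, cluster_cost, metered_cost) rows."""
    cluster = always_on_cluster_cost(nodes, node_hourly_rate)
    return [(u, cluster, metered_cost(nodes, machine_hour_rate, u))
            for u in steps]
```

In every row the cluster cost is the same, while the metered cost scales with utilization. Setting utilization to 1.0 reproduces the 24×7 assumption used in the comparison above, which is the least favorable case for the metered option.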

(N.b. This analysis doesn’t suggest Riak is a direct replacement for SDB. SDB has its own set of issues that make it inappropriate for many needs. But if you happen to have a need for its positive attributes, SDB can lower your costs compared with a dedicated cluster approach like Riak.)

Summary

The cloud offers many tooling choices to build your web application. Each choice, whether a DIY IaaS solution or a “pay as you go” PaaS solution, has its pros and cons and a different cost structure. IaaS will always be the choice when vendor lock-in cannot be tolerated. PaaS will be the choice when you can reliably measure application API queries to the compute hour pricing model and predict total expense at scale.
  • http://twitter.com/cobiacomm Chris Haddad

    The goal you state has a good intent, “PaaS will be the choice when you can reliably measure application API queries to the compute hour pricing model and predict total expense at scale.”    What do you think about removing ‘compute hour pricing’ out of the equation and predicting total expense based solely on number of business transactions?   Do you believe an application platform should expose API, message queues, and business process engines as opposed to compute/storage/network facets?   I propose a PaaS should ideally charge based on business activities instead of server instance uptime.