Rethinking SLAs in a “Cloudy World”

First came Software as a Service (SaaS) powered by dedicated data centers. All of a sudden, enterprises had another way to solve their IT needs. Let’s call this SaaS 1.0. SaaS took off in CRM (salesforce.com), collaboration (hosted Exchange, Google Apps, etc.), file sharing (Box.net, SendUIt) and a myriad of more niche offerings that were typically less expensive and easier to use than on-premises software and hardware installations. This first wave of SaaS to hit IT shores also started the industry conversation about how to keep SaaS vendors accountable to their customers. Thus service level agreements (SLAs) entered the common enterprise IT lexicon. With SLAs, SaaS vendors could publish their expected “uptime” and be measured on performance. SLAs could also be used to compare and rank SaaS vendors within the same product space (for example, does salesforce.com have better uptime than Netsuite.com?). The IT industry became comfortable with SLAs as a quality measurement that helps IT buyers decide which SaaS vendors can be trusted to manage a service that would normally have been provided by an in-house IT department.
The common SLA measurement for SaaS 1.0 is “levels of availability.” 99.9% available is called “three nines,” 99.99% is “four nines,” etc. It is measured as “unplanned downtime” against 100% uptime: any shortfall from 100% is time when a SaaS customer was not able to use the system to its full capabilities, and when availability drops below the promised level the customer is typically entitled to compensation. See the chart below for the correlation between SLA availability percentage and downtime per week, month and year.
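
To make the “nines” concrete, here is a minimal sketch that converts an availability percentage into the downtime it allows. The function names are illustrative, and the period lengths (a 7-day week, a 30-day month, a 365-day year) are simplifying assumptions, not part of any vendor’s SLA definition.

```python
MINUTES_PER_DAY = 24 * 60

# Assumed period lengths in days; real SLAs define their own billing periods.
PERIODS_IN_DAYS = {"week": 7, "month": 30, "year": 365}


def allowed_downtime_minutes(availability_percent, period_days):
    """Return the minutes of downtime permitted at a given availability level."""
    unavailable_fraction = 1 - availability_percent / 100
    return unavailable_fraction * period_days * MINUTES_PER_DAY


def downtime_table(levels=(99.0, 99.9, 99.99, 99.999)):
    """Map each availability level to its allowed downtime per period."""
    return {
        level: {
            name: allowed_downtime_minutes(level, days)
            for name, days in PERIODS_IN_DAYS.items()
        }
        for level in levels
    }


if __name__ == "__main__":
    for level, row in downtime_table().items():
        cells = ", ".join(f"{name}: {minutes:.2f} min" for name, minutes in row.items())
        print(f"{level}% -> {cells}")
```

Under these assumptions, “three nines” allows about 525.6 minutes (roughly 8.8 hours) of unplanned downtime per year, and each additional nine cuts that allowance by a factor of ten.
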
SLAs also helped SaaS 1.0 vendors compare their service to the on-premises equivalent. For example, SaaS vendors could show an IT decision maker that a hosted version of Exchange could deliver “more nines” of uptime than an equivalent in-house Exchange server.

With SaaS 1.0 and SLA 1.0, one vendor was typically on the hook for delivering the service and meeting the SLA. This all changes in the cloud computing world.

Starting in 2008, cloud computing infrastructures such as Amazon Web Services started to be used to power the next generation of SaaS offerings. Let’s call this SaaS 2.0. With cloud computing, software developers now had tremendous “on-demand” power, and a whole new class of SaaS offerings started to emerge. SaaS could now be used to economically and reliably solve “big data” problems like information archiving and analytics, business intelligence, and backup/recovery. But with “cloud-powered SaaS 2.0” the old SLA model didn’t quite fit. Instead of one single vendor (SaaS 1.0) on the hook to deliver a service with a pre-defined uptime guarantee, SaaS 2.0 combines two separate companies, the cloud provider and the software vendor, working together to deliver a service.

Cloud and SaaS 2.0 mean inherited responsibilities for an uptime SLA. Two or more vendors work in concert. This is a positive that is not widely discussed. Each cloud vendor, and let’s use Amazon Web Services as an example, publishes an SLA for the major services it offers. In our example we’ll use core services like storage and compute. Amazon’s S3 storage offering is designed for an incredible 99.999999999% (eleven nines!) of durability for stored data. Amazon’s EC2 compute offering has a very respectable 99.99% SLA. These are great “building blocks” to host a SaaS 2.0 service. So great, in fact, that Amazon has emerged as the leader in cloud computing and attracted many leading-edge companies to host their SaaS 2.0 applications.
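
One way to reason about inherited SLAs is to combine the availability of each building block. The sketch below is a simplified model, not how any vendor actually computes its SLA: it assumes components fail independently, that a service needing all of its components is only up when every one of them is up, and that redundant copies keep the service up as long as at least one copy is running. The 99.99% compute figure is the one quoted above; the storage and application-layer figures are hypothetical placeholders.

```python
from math import prod


def serial_availability(availabilities):
    """Availability of a service that needs every component to be up.

    Assumes independent failures, so the individual availabilities multiply.
    """
    return prod(availabilities)


def redundant_availability(availability, copies):
    """Availability when at least one of `copies` independent replicas is up."""
    return 1 - (1 - availability) ** copies


if __name__ == "__main__":
    compute = 0.9999  # compute SLA figure quoted in the text
    storage = 0.999   # hypothetical storage availability, for illustration only
    app = 0.999       # hypothetical availability of the SaaS vendor's own software

    single = serial_availability([compute, storage, app])
    with_redundancy = serial_availability(
        [redundant_availability(compute, 2), storage, app]
    )

    print(f"single instance of each layer: {single:.6f}")
    print(f"two redundant compute copies:  {with_redundancy:.6f}")
```

In this model a chain of dependencies is always less available than its weakest link, so the software vendor only ends up with a stronger end-to-end figure by adding redundancy on top of the infrastructure it inherits.
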

The advent of cloud computing powering SaaS 2.0 applications has expanded the thinking about SLAs. Vendors see software SLAs and infrastructure SLAs as decoupled, but customers don’t care and just want to work with one vendor. The SaaS vendor selling the service is the responsible party, and the SaaS vendor can use the positive attributes of cloud computing to deliver a service that builds upon the infrastructure SLAs. The result should be a more reliable, more available service powered by the cloud than one run from a legacy dedicated data center.

In summary, cloud computing has changed the way SaaS vendors think about measuring SLAs by introducing the concept of inherited responsibilities. The cloud vendor provides an SLA on core infrastructure, and the software vendor builds upon that SLA. The end result for the customer is a more reliable service and an SLA that can far exceed on-premises capabilities.