A Brief History: Cloud CPU Costs Over the Past 5 Years

When I started Sonian in 2007, one of the driving forces for beginning what would become my third start-up journey was the allure of all-you-can-consume “ten-cent-per-hour” cloud computing. Amazon Web Services was the new IT game changer in town, and the on-demand compute platform they launched in August 2006 literally brought cloud computing to the masses overnight. These past five years I have been studying “cloud costs” in different ways, and this weekend I looked back at the compute pricing history and uncovered some interesting trends.

Before I continue with this post, here’s a brief history of my experience with previous “clouds,” which illustrates why in 2006 I was ready to take a big leap into the AWS cloud as an early adopter.

In 2004 I was involved with another SaaS information archiving project, and I worked with a team at SUN Microsystems to create a reference architecture for our archive software stack to live on the “SUN Utility Compute Grid.” At the time we were hosting the archiving software on dedicated co-located hardware racks, and planning a large capital expenditure to increase capacity. In the spirit of “there has to be a better way!” we entertained the idea of moving our software to SUN’s “cloud.” (In 2004 the term cloud computing was pretty alien; the common term for this type of shared virtual computing was “utility computing.”) SUN offered the promise of true utility computing, but at the end of six months of effort, we could not make the underlying cost structures work. SUN was charging one dollar per CPU hour and one dollar per gigabyte per month for storage. We ended up adding more capacity to our existing co-located hardware plant because our “all-in” internal unit costs were less than what SUN was willing to sell their compute grid for.

Now back to the purpose of this post… a historical analysis of the cost of cloud CPU from 2006 through 2011.

Beginning in August 2006, Amazon Web Services’ new Elastic Compute Cloud (EC2) service introduced the concept of the EC2 Compute Unit (ECU): a standardized way to define a unit of cloud computing, the associated characteristics of that unit (processor speed and memory), and a revolutionary hourly cost model requiring no up-front expense. Amazon achieved what SUN, IBM and others had been talking about for years, but could never bring to market. In 2006, for ten cents per hour 1 EC2 Compute Unit could be rented with no up-front costs. In 2006, a single ECU was defined as equivalent to a 1.0-1.2 GHz 2007 Opteron or 2007 Xeon processor with 1.7 GB of RAM. This 1 ECU reference is still in effect today.
Read more…

State of Cloud Computing in Europe

I have just returned from a week in the United Kingdom meeting Sonian customers and business partners. The purpose of the trip was to expand Sonian relationships, but an added benefit was the opportunity to glean perspectives and adoption attitudes toward cloud computing in the greater EU market. Sonian is in a unique position to observe cloud adoption trends since our SaaS service is powered by true cloud computing infrastructures, and the conversations with EU business and technology leaders revealed their true thoughts about the state of cloud computing and indicators on adoption curves in 2012, and beyond.

The Buying Market

The EU market is not one single cohesive market, but rather smaller subsets that share some ideas, and diverge on others. The UK and Ireland are (as you would expect) similarly aligned, and appear to share more in common with the Scandinavian countries than with Germany and France, which have their own country-centric view of the cloud. The French language institute can’t even come to an agreement on what to call “cloud computing” in France, settling on “informatique en nuage” as a placeholder, but still searching for a unique French term that doesn’t break their rules on language purity and consistency. (Other examples: software development is called “software addition” and the people that create software are called “software editors” since they literally “edit” source code files.) From my observations, the UK, Ireland and Scandinavian countries share more in common with the US thinking about the cloud, compared to Germany, France, Italy and Spain, which diverge on a number of key issues around data locality.

The total addressable information technology market in the EU is roughly equal to that of the US. But instead of a single national set of business rules, the EU market is fractured into separate countries, languages, tax systems and local business customs. This separation dramatically reduces the business efficiencies of technology providers attempting to service the EU community. The cloud could be seen as an antidote to inefficiency. Imagine an “EU Cloud” operating in a locality that pleases all consumers, and is the trusted provider. But it feels like a stretch goal to expect a single EU cloud to be accepted with the current barriers to a cohesive EU business strategy.

The Role of Government

The role of government in EU countries is more pronounced than we see in the United States, but there is no evidence yet that EU governments are pushing cloud computing as a generic trend onto the private sector. Just recently the UK government established their “G-Cloud” initiative, which looks similar to the US government’s “Cloud.gov” and Data.gov initiatives. This trend could be described as a “lead by example” scenario, with central government adopting cloud computing as proof it’s safe, cost effective and viable for the private sector. A myriad of data handling regulations seek to enforce “privacy” and “resiliency” to ensure citizens are protected from unauthorized access to personal information.

Read more…

The Secret Life of a Cloud Cost Control Czar

It’s a little after 7 in the morning and I tap the space bar key to wake my MacBook Air from its slumber. I click a tab in Chrome, hit refresh, and with a slight pang of “what will I see,” look at the balance for our October cloud infrastructure bill. You may know this feeling… think about a recent time opening the credit card bill and dreading an unwanted surprise.

“I don’t think I overspent this month, but… there was that steak dinner in New York City …”

Monitoring the rate of spend to make sure we’re not going to break the budget is one task in the routine as the “cloud cost czar.” It’s a daily task to track the trend lines and sound the alarm if expenses start to creep off plan.
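The daily check described above can be sketched as a simple run-rate projection. This is only an illustration under stated assumptions: the function names, the linear extrapolation, and the 5% tolerance are hypothetical and not a description of any actual tooling.

```python
import calendar
from datetime import date


def projected_month_end_spend(spend_to_date: float, today: date) -> float:
    """Linearly extrapolate month-to-date spend to the end of the month."""
    days_in_month = calendar.monthrange(today.year, today.month)[1]
    daily_rate = spend_to_date / today.day  # average spend per elapsed day
    return daily_rate * days_in_month


def is_off_plan(spend_to_date: float, budget: float, today: date,
                tolerance: float = 0.05) -> bool:
    """Return True when the projected bill exceeds the budget by more
    than the (hypothetical) tolerance, i.e. time to sound the alarm."""
    projected = projected_month_end_spend(spend_to_date, today)
    return projected > budget * (1 + tolerance)
```

A linear projection is the crudest possible trend line; a real workload with bursty usage would likely want a smoothed or per-service view instead.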

The “czar” is the human gas pedal – modulating the enormous pulsing “cloud software engine” as we process half a terabyte of data a day.

But I am not a solo act in this cloud pageant. Making the cloud work from an economic perspective is a total team effort. It all starts in the engineering group with cloud-appropriate core architecture designs. It continues with quality testing, and then with the team that manages daily operations. Everyone plays a role and has responsibility for our prime directive: process the most data at the least cost, without sacrificing customer satisfaction.

“Obsessively” managing cost is one of the three design requirements, alongside reliability and performance. “Gaming the cloud,” our internal slang for all we do to maximize efficiency, is a multi-disciplined effort the engineering and service delivery teams rally around. But there has to be at least one person who focuses on the trends, from the 30,000-foot view down to sea level: The Cost Czar.

Read more…

Gaming the Cloud: Balancing IaaS versus PaaS

This is the third post in the Gaming the Cloud series. In the first two posts I wrote about having the right use case and the need for cost-aware applications in order to “win in the cloud.” These are important initial steps, at a conceptual level, to adopting a cloud computing model. This post dives deeper into the tools available and how to balance cost versus control and cloud vendor “lock-in.”

A robust discussion about cloud computing should cover the two ways of consuming the cloud: Infrastructure as a Service (IaaS) and Platform as a Service (PaaS). Let’s define these now and then compare and contrast the optimal way to utilize IaaS and PaaS in our pursuit to game the cloud.

Cloud computing in 2011 is mostly used to support large-scale web applications. Over time the cloud will start to power traditional IT, but for now big SaaS apps are where the innovation is occurring. In terms of the software and infrastructure required to power software as a service (SaaS) systems, state-of-the-art thinking has been an evolutionary process: we have seen SaaS delivered by non-cloud co-location morph into cloud in IaaS livery and then finally cloud with PaaS. Cloud PaaS is at odds with human nature’s desire for more control. But having more control also comes at a cost, and the industry is collectively reconciling the most efficient way to balance cost versus control. The pros and cons of Platform as a Service are at the center of this debate.

Certainly the cloud is now seen as a credible provider of IT services and the sniping between the no-cloud “Co-Lo” naysayers and cloud supporters is subsiding. But now that we’re firmly in the pure cloud world, another skirmish is brewing: IaaS versus PaaS. IaaS today is what we used to think about Co-Lo just 5 years ago. Accelerating technology evolution will have PaaS becoming the new acceptable standard in the next few years, and the industry will look back on 2011 with the attitude “what was all the fretting over IaaS versus PaaS?” The debate about IaaS versus PaaS can be summarized as:

If all things are equal in terms of raw material costs, and there is no fear of vendor “lock-in,” then what’s the best choice to maximize time and effort?

Comparing IaaS versus PaaS with respect to cost, control and lock-in, along the same continuum, looks like this:

One advantage IaaS has over PaaS is more predictable infrastructure cost estimating. Compute and storage are easier to model when the building blocks are CPU hours and gigabytes consumed per month. Extrapolating PaaS costs is a more challenging exercise because cost models are multi-dimensional. Over time multi-dimensional pricing will be a benefit, since the software written specifically for PaaS systems can operate more efficiently. With PaaS there is a one-time learning curve to master the APIs and operational characteristics. The payback for this one-time investment will yield dividends forever. IaaS also has a learning curve, but it’s less steep than PaaS, and IaaS also has long-term operational costs that do not go away.
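To illustrate why IaaS costs are comparatively easy to model, here is a minimal sketch of a monthly estimate built from just two building blocks, CPU hours and gigabytes stored. The function name and the default rates are placeholders chosen for illustration, not a quote of any provider’s price list.

```python
HOURS_PER_MONTH = 730  # average hours in a month (8,760 hours / 12)


def estimate_monthly_iaas_cost(instances: int, stored_gb: float,
                               cpu_hour_rate: float = 0.10,
                               gb_month_rate: float = 0.15) -> dict:
    """Estimate a monthly IaaS bill from always-on instances and stored data.

    Both rates are illustrative assumptions; substitute current prices.
    """
    compute = instances * HOURS_PER_MONTH * cpu_hour_rate
    storage = stored_gb * gb_month_rate
    return {"compute": compute, "storage": storage, "total": compute + storage}
```

A PaaS estimate would need more inputs (requests, operations, bandwidth and so on, depending on the platform), which is the multi-dimensional modeling problem described above.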

PaaS allows teams to focus on core domain expertise and not get bogged down “fighting yesterday’s fires.”

Within a specific cloud ecosystem, Amazon Web Services for example, we can game each of the building block components (S3, EC2, EBS, RDS, SDB) to achieve the best cost / performance advantage. Read more…

Gaming the Cloud: You Need “Cost Aware” Applications

This is the second post in my “Gaming the Cloud” series. You can read the first post here.

There are two primary reasons to adopt cloud computing for SaaS applications: one, save money, and two, be more reliable (and the great aspect of the cloud is both can be achieved with the same engineering effort). There is no reason to use cloud computing unless you have a “cost aware” application. If you use the cloud to power traditional enterprise software you won’t save money, and will probably be less reliable too. See my previous post in the “Game the Cloud” series about having the right use case. Go read it now and come back. Do you have the right use case? If so let’s continue. After validating your use case, the next step in your cloud journey is to design your application to be elastic and at the same time “cost aware.”

So what exactly is a cost aware application and why should you care?

In the old world of SaaS, using traditional co-located data centers and co-mingled hardware, it was nearly impossible to figure out at a granular level how much each software component costs to run. With the cloud this all changes in a very positive way. As compute and storage are consumed in small units, and each of these units has a cost (for example compute at ten cents an hour or storage at fifteen cents a gigabyte), it’s a requirement to think about software designs that focus on operational efficiency because we can now measure costs at the atomic level. When I started Sonian in 2007, with a mandate to be purely cloud focused, we had access to one CPU type. Very quickly, our cloud provider Amazon offered more CPU variety and our reference architecture matured in real-time as we were able to optimize the software to match virtual compute units that had more memory, more cores, or both.

A cost aware application is software with an inherent design to “game the cloud” and be ultra-efficient on every transaction. This in essence means granular workload management, the ability to right-size the CPU profile for the task, and the ability to take advantage of several long-term cost management features offered by the cloud infrastructure providers.
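As a rough illustration of what “right-sizing the CPU profile for the task” could look like in code, the sketch below picks the cheapest compute profile that satisfies a task’s core and memory needs. The catalog, its profile names, and its prices are hypothetical placeholders, not an actual provider offering.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class ComputeProfile:
    name: str
    cores: int
    memory_gb: float
    hourly_cost: float


# Hypothetical catalog: names, sizes and prices are illustrative only.
CATALOG = (
    ComputeProfile("small", 1, 1.7, 0.10),
    ComputeProfile("large", 2, 7.5, 0.40),
    ComputeProfile("xlarge", 4, 15.0, 0.80),
)


def right_size(min_cores: int, min_memory_gb: float,
               catalog: Sequence[ComputeProfile] = CATALOG
               ) -> Optional[ComputeProfile]:
    """Return the cheapest profile meeting the task's needs, or None."""
    candidates = [p for p in catalog
                  if p.cores >= min_cores and p.memory_gb >= min_memory_gb]
    return min(candidates, key=lambda p: p.hourly_cost, default=None)
```

In practice the choice would also weigh how long the task runs on each profile, since a pricier instance that finishes faster can cost less per transaction.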

Read more…

Abundant Innovation – Sonian Summer 2011 CodeFest Delivers Impressive Results

The first quarterly all-engineering code fest completed Tuesday (Aug 16, 2011) evening with 3 winning teams, one dramatic performance, and many laughs.

This post is linked to the Sonian Blog. Joe Kinsella, Sonian VP Engineering, wrote about the CodeFest here.

The entire company was invited to view the presentations and vote for their favorites. The only voting rule was you can’t vote for your own team. The judging was based on three criteria: 1. Impact on solving a Sonian or customer pain point (50%), 2. “Cool-ness” factor (25%), and 3. Presentation style and effectiveness to convey the idea (25%).

Thirteen teams competed, representing the four functional units in the Sonian Engineering organization: SAFE (back-end), Website (front-end), DevOps (systems management) and QA. There were several teams from each group. The themes each team chose ranged from automation and performance measurement to UI beautification and speed. Each team gravitated toward their “natural” inclinations. The DevOps teams focused on automating manual tasks and removing friction from deployments. The SAFE team (back-end) showcased applying “math” to measuring performance and data classification. The website team looked at speed and a better user experience, and the QA team showed us new ways to think about cost-testing alongside bug testing.

Six teams had a metrics or analytics theme. Two teams focused on user interface improvements, and four teams came up with solutions for automation and deployment problems.

Instead of Ernst and Young tallying the votes, our Harvard MBA trained ROI analyst Chris H. stepped in to ensure a fair and accurate accounting.

And thanks to all the non-technical folks who sat patiently through presentations where terms like “latency,” “lazy loading,” “grepping logs” and “foreground queues” were discussed.

Teams chose their presentation order, and the QA team volunteered to go first. Below is an accounting of each presentation with some context on how the idea fits into Sonian’s needs and long-term vision.

Congratulations to all the teams who competed! The next CodeFest is sure to be another interesting event.

Team 1: “You paid what for that …. Export job, Object list request, or ES cluster?”

Andrea, Gopal, Bryan and Jitesh from the quality assurance team got together around an idea to extend testing methodologies into infrastructure cost analysis. In order to maximize the cloud’s economic advantage, the engineering team is always thinking about the cost of software operating at “big data scale” levels of activity. From architecture to implementation, the goal is to infuse “cost conscious” thinking at every level. The QA team came up with a novel idea on this theme.

The proposed idea is to extend the testing framework to set a baseline of feature infrastructure costs, and then measure successive releases against the baseline. A significant cost deviation from the baseline could be considered a design flaw, implementation error or a SEV1 bug. Some sample features with measurable costs would be an import job, export request, or a re-index. Over time the entire app suite could have an expense profile established.
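The baseline comparison proposed above might be sketched as follows. This is an assumption-laden illustration, not the team’s implementation: the function name, the feature names, and the 20% deviation threshold are all hypothetical.

```python
def cost_regressions(baseline: dict, current: dict,
                     threshold: float = 0.20) -> dict:
    """Return features whose measured cost rose more than `threshold`
    (as a fraction) relative to the recorded baseline cost."""
    flagged = {}
    for feature, base_cost in baseline.items():
        new_cost = current.get(feature)
        if new_cost is None or base_cost <= 0:
            continue  # nothing measured this release, or no usable baseline
        change = (new_cost - base_cost) / base_cost
        if change > threshold:
            flagged[feature] = change
    return flagged


# Hypothetical per-run costs in dollars for two successive releases.
baseline = {"import_job": 1.00, "export_request": 0.50, "reindex": 4.00}
current = {"import_job": 1.05, "export_request": 0.75, "reindex": 4.00}
regressions = cost_regressions(baseline, current)
```

Here only `export_request` is flagged, since its cost rose by half while the other features stayed within the threshold. A test suite could fail the build, or file a bug, whenever the returned dictionary is non-empty.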

Having QA be an additional “cost analysis layer” in the full development cycle will only help make the Sonian software as efficient as possible.

Bonus points to the team for the most elaborate props and “dramatic performance” used in their presentation.

Read on for details on the twelve other teams

Read more…

Gaming the Cloud: Start with the Right “Cloud” Use Case

The other day I met with a group of Boston start-up CTOs to share ideas on technology and team building. Typically these discussions veer into helping each other with technical challenges scaling SaaS software. All the companies present are pushing the envelope with big data, large subscriber bases, and managing lots of cloud infrastructure. Many great ideas were exchanged, but one major theme resonated: The cloud may not be the best place for every kind of software stack. The group represents many different use cases: big data archiving, analytics, social networking for niche audiences, video encoding, application performance analysis, advertising sales networks, and others.

To understand how we got to this place today, let’s step into the time machine elevator and press the button for “Year 2008.” Down, down we go and the doors open to “the cloud” (and specifically Amazon Web Services) just coming onto the tech scene. All of a sudden software architects and developers could control their own infrastructure (and not have to hassle with hardware sales reps or CFOs and their purchase orders). We technologists, myself included, with big ideas “projected” our hopes and dreams onto the cloud as the panacea for all our infrastructure needs. We understood for the most part that utilizing the cloud would require new ways of thinking about software architectures. We also had “real world” time-to-market pressures to get our service running quickly and start to prove out business viability. The cloud has tremendous potential to help accomplish great things, but used incorrectly, the cloud could cause a lot of harm (financial and technical stability).

In the “gold rush” to the cloud we (the nascent start-ups pioneering in the cloud) were all learning the same “lessons” at roughly the same time: how and when to scale, how to analyze costs and efficiencies, the need for different kinds of monitoring and alerting, how to deploy software updates to a cloud-based environment, and the list goes on. Many lessons were learned, and the theme of mastering the cloud tilted more toward “let’s figure out how to game the cloud.” Because there’s no reason to use the cloud unless you can make the economics work, and making the economics work requires a mindset to “game the cloud”: process more data and transactions at the least cost. Easy to write, but very technically challenging and rewarding to pull off.

Read more…

Greg+ Circles versus Google+ Circles

I can appreciate the Google+ Circles feature, which allows segmentation of content and sharing between the different “natural” audiences we all serve and follow. I’ve been dutifully cataloging my new Google+ connections into what I hope are the right circles. At first I was amused and concerned some connections may take offense if I put them in “Following” as opposed to “Friend” or “Acquaintance.” But as the old adage goes, life’s too short to sweat the small stuff.

A couple years ago, about the same time I started this blog, I created my own version of “Greg+ circles.” Family and close friends see one version of me. Business acquaintances, casual friends and the Internet in general see another view. It’s just natural the way any of us want to control “who knows what” about ourselves. [n.b. I'm a bit horrified as I watch 20-something year-old family members "max share" every minute of their life on Facebook. I don't get that, but I'm nearly twice their age.]

I created “Greg+” circles with several different web services, and they all aggregate here at this blog.

Facebook is strictly family and close friends. Sharing photos and links and updates. Feels very close and intimate, just like in real life.

Twitter, LinkedIn and Flickr are strictly “business casual.” Updates, photos and conversations from my professional life occur over these networks.

I’m going to give Google+ Circles a real chance. I want Google to be successful with the whole suite of “Plus” services, but last night over dinner conversation with a friend discussing our mutual Google Plus ramp-up, I realized that I had already created my own version of circles and hadn’t thought to call attention to this until now.


Three Secrets to Cloud Computing Harmony

A Haiku for the big data cloud:

cloud promises quick

beware hidden expenses

elastic apps fix

In 2006 Amazon Web Services flashed brilliance with a “light bulb moment” that sparked the imaginations of leading edge technologists and entrepreneurs. Literally overnight “The Cloud” had arrived. The cloud offered the ability to create, launch and operate SaaS applications in a way never before possible. Using simple and secure APIs a software engineer could harness vast quantities of compute and storage services, on-demand and with no up-front costs, without touching a single physical atom. The cloud allowed small, efficient teams to build an application that could serve a very large audience.

Within a couple years of launch, Amazon Web Services, a.k.a. “the cloud,” brought cloud computing to the technical masses. Thus was coined the phrase “infrastructure as code,” and with it came a new reckoning by the old-guard enterprise software, hardware and hosting companies that times were changing (and in 2011 hindsight, the times would be changing rather quickly).

For a technologist, cloud concepts seemed deceptively simple: Just design a software architecture that is tuned for cloud computing operating characteristics. The cloud offered the capability to automatically scale up and scale down. The cloud offered cost efficiencies. The cloud offered incredible reliability. And all these fine attributes were possible without sacrificing one for the other. For software architects used to the traditional co-located hosting design patterns, the cloud was something entirely new to comprehend. The differences are many, but the reward for adopting the cloud and succeeding was greater than the hardship to change our collective thinking.

The core requirements for every SaaS application are scale-up, reliability and efficient infrastructure utilization. Scaling in the cloud means harnessing the on-demand capabilities. Reliability in the cloud means designing for failure by making software mirror what “physical” hardware used to supply in the co-located world. Operating cost-efficiently means “gaming” the cloud to find every place where you can process more work with fewer compute resources.

But in reality, taming the cloud is not a trivial pursuit. If in this new “cloudy” world infrastructure is code, and invoking more API calls can launch more infrastructure, then we thought we were dining at the “all you can eat buffet.” Heartburn ensues. The best guidance is before you start to build a cloud-enabled application you need the scaffolding in place to “raise the app” with all the necessary supporting infrastructure required to operate a dynamic cloud-based SaaS system. Retro-fitting the augmenting management framework after the fact is the wrong approach for the cloud. This approach worked in the old world with software designed for dedicated hosting, but the cloud is such a different environment the old world thinking does not transfer well to the new cloud world.

Below are three prime areas to focus on when planning to build a cloud-based software stack.

1. Effective Budgeting with a Cost Control System

Compared to a traditional dedicated data center environment, it’s way too easy to spend money in the cloud. “Purchasing” in the cloud is psychologically different: two mindsets (using purchase orders to buy everything up front versus consuming small bits at a time) have to be reconciled with the vastly different operating styles of dedicated and cloud environments. In the dedicated environment, big capital expenditures get multiple approvals and are on many people’s radar. But in the cloud, most teams start their cloud relationship with a credit card and pay monthly for the previous 30 days of small micro-charges for gigabytes of storage and hours of CPU time consumed. Read more…

RE-Thinking SLA’s in a “Cloudy World”

First came Software as a Service (SaaS) powered by dedicated data centers. All of a sudden enterprises had another way to solve their IT needs. Let’s call this SaaS 1.0. SaaS took off in CRM (salesforce.com), collaboration (hosted Exchange, Google Apps, etc.), file sharing (Box.net, SendUIt) and a myriad of more niche offerings that were typically less expensive and easier to use than on-premises software + hardware installations. This first wave of SaaS to hit IT shores also started the industry conversation about how to keep SaaS vendors accountable to their customers. Thus service level agreements (SLA’s) were introduced into the common enterprise IT lexicon. With SLA’s, SaaS vendors could publish their expected “uptime” and be measured on performance. SLA’s could also be used to compare and rank one SaaS vendor over another within the same product space (for example does salesforce.com have better uptime than Netsuite.com?). The IT industry became comfortable with SLA’s as a quality measurement to help IT buyers make good decisions on which SaaS vendors can be trusted to manage a service that might normally have been provided by an in-house IT department.

The common SLA measurement for SaaS 1.0 is “levels of availability.” 99.9% available is called “three nines,” 99.99% is “four nines,” etc. It’s measured as “unplanned downtime” against 100% up-time; anything less than 100% means a SaaS customer was not able to use the system to its full capabilities (and was thus entitled to compensation). See the chart below for the correlation of SLA availability percentage to downtime per week, month and year.

SLA’s also helped SaaS 1.0 vendors compare their service to the on-premises equivalent. For example, SaaS vendors could show an IT decision maker that a hosted version of Exchange could deliver “more nines” in up-time compared to an equivalent in-house Exchange server.
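The relationship between “nines” and permitted downtime is simple arithmetic, sketched below. The function name is hypothetical, and the month is assumed to be an average month of 730 hours.

```python
# Hours in each period; "month" is an average month (8,760 hours / 12).
PERIOD_HOURS = {"week": 7 * 24, "month": 730, "year": 8760}


def allowed_downtime_minutes(availability_pct: float) -> dict:
    """Minutes of downtime permitted per period at a given availability."""
    unavailable_fraction = 1 - availability_pct / 100
    return {period: hours * 60 * unavailable_fraction
            for period, hours in PERIOD_HOURS.items()}


three_nines = allowed_downtime_minutes(99.9)
four_nines = allowed_downtime_minutes(99.99)
```

At three nines the yearly allowance works out to roughly 8.8 hours, and at four nines to roughly 53 minutes, which shows how much each additional nine tightens the commitment.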

With SaaS 1.0 and SLA 1.0, one vendor typically was on the hook for delivering the service and meeting the SLA. This all changes in the cloud computing world.

Read more…