Here’s an all-too-common scenario from the “cloud chronicles.” A virtual machine that has been operating just fine for days, and has 50 identical twins with the same configuration, starts to exhibit problems: slow virtual disk performance, network brown-outs, disconnecting from and reconnecting to its functional cluster. Monitoring systems alert on degrading performance, and the knee-jerk response is to jump on the box (née VM) and start troubleshooting. The problem is that spending any time troubleshooting an anomaly in the “cloud” is the wrong reaction. In the cloud, the first response when a node starts to exhibit erratic behavior should be to replace, not fix.
Replacing, instead of fixing, goes against the ingrained habits of over two decades of entrenched IT best practices. In the pre-cloud world, when real hardware was the base, we had to “fix IT” because replacing was too expensive and not practical. There was not an endless pile of spares lying about for a “replace IT” mindset.
But in the cloud, with nearly infinite CPU available in theory, the remediation for an errant node should be to replace it immediately and move on.
Why Is This?
Because there are too many causes beyond our control at the OS level in a cloud environment. Think of the cloud like living in a high-rise building. Each unit in the building, just like each cloud customer, can have whatever interior it wants, but there are also massive shared resources in the building. So while our interior may be a candidate for the next Architectural Digest cover, our neighbor could “kill our chill” with a too-loud boom box. The cloud suffers from the noisy neighbor problem just like our theoretical high-rise. But in the cloud, we can choose to move and jump back into the random lottery for a new unit. We can’t change the building, but we can change our location within the building.
Of course, you need the right cloud-centric architecture to be able to simply “replace IT” instead of “fix IT.” Having cloud dexterity is critical to operating a successful cloud deployment.
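To make “replace IT” concrete, here is a minimal sketch of what such a remediation loop might look like. Everything in it is an assumption made for illustration: the Cluster class is an in-memory stand-in for whatever provider API you actually use, and the names Node, launch, terminate, and replace_unhealthy are hypothetical. It only shows the shape of the policy: no diagnosis, just launch a fresh node and retire the errant one.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass
class Node:
    node_id: str
    healthy: bool = True  # in real life this would come from monitoring


class Cluster:
    """Hypothetical in-memory stand-in for a cloud provider's API."""

    def __init__(self, size: int) -> None:
        self._counter = 0
        self.nodes: Dict[str, Node] = {}
        for _ in range(size):
            self.launch()

    def launch(self) -> Node:
        # Assumption: new nodes come from an identical image/configuration.
        self._counter += 1
        node = Node(f"node-{self._counter}")
        self.nodes[node.node_id] = node
        return node

    def terminate(self, node_id: str) -> None:
        del self.nodes[node_id]


def replace_unhealthy(
    cluster: Cluster, is_healthy: Callable[[Node], bool]
) -> List[Tuple[str, str]]:
    """Replace every node that fails the health check; never try to fix it."""
    replaced = []
    # Iterate over a snapshot so nodes launched here are not re-checked.
    for node in list(cluster.nodes.values()):
        if not is_healthy(node):
            fresh = cluster.launch()         # launch first so capacity never dips
            cluster.terminate(node.node_id)  # then retire the errant node
            replaced.append((node.node_id, fresh.node_id))
    return replaced


if __name__ == "__main__":
    cluster = Cluster(size=3)
    cluster.nodes["node-2"].healthy = False  # simulate a noisy-neighbor victim
    print(replace_unhealthy(cluster, lambda n: n.healthy))
    # prints [('node-2', 'node-4')]
```

The sketch only works because every node is assumed to be an interchangeable twin with no unique state on it, which is exactly the architectural prerequisite described above.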
The cloud requires us to “un-learn” the best practices of the past and embrace a new way of thinking about “break fix.” While replacing instead of fixing may seem wasteful, it’s really not. The time spent troubleshooting the random problem will not yield significant insights, and could be better spent focusing on more value-add projects. Usually after extensive diagnosis, the only recourse is to replace the node, since the original problem was an outlier.

What Do GitHub and GrabCAD Have in Common?
But a different thought took hold: hmmm… GrabCAD is to the CAD/CAM professional what GitHub is to the software professional. We’re witnessing the rise of start-ups that cater to “niche” audiences who create a certain kind of content as their prime means of professional affiliation. Don’t take offense at the term “niche audience” applied to software or CAD professionals. It’s just a way to say “not a mass audience” of the kind served by a general-purpose content creation site like Tumblr, WordPress.com, etc.
GrabCAD targets the CAD/CAM professional with a CAD-specific sharing space augmented with a thin “social network layer.” Create a drawing, upload it to GrabCAD, post a “hey, look what I created” link, and share, trade, and sell your work product. It’s not a place to generate generalized content (like Google Docs, Zoho, or Office 365), but rather a sharing space for affiliated professionals who want to showcase “their wares.”
GitHub is a content sharing system that targets the software professional. In this case, “content equals source code.” It’s really “social source code management” with a bunch of other goodies like wikis and pastes (gists) mixed in. Source code management has been around forever, but GitHub makes it really easy to share and integrate code from various projects. Developers don’t actually write their code in GitHub; they do that in their own development environments, just as CAD professionals don’t use GrabCAD to create drawings.
In the software world, it is now common for developers to tout their GitHub account URL as a living resume. You can imagine the CAD/CAM professional one day sharing links to their GrabCAD creativity just like software developers share their GitHub awesomeness.
Catering to a large niche audience with a custom experience is a successful end-run around mass-appeal social networks like Facebook. The core required features, such as file upload, link sharing, and comment curation, exist in many platforms, from WordPress to Drupal to Facebook. But a generic user experience will not suffice. GrabCAD speaks the language of the CAD industry. GitHub does the same for the software industry.
There are other examples of this trend, although none as focused as GrabCAD or GitHub:
I can imagine other industries ripe for this niche audience approach: legal (specialized documents), chemical (formulas), education (lesson plans), music (lyrics). Clear, easy attribution of content ownership will need to be worked out for some of these ideas to be successful.
I’m excited to see the next GrabCAD come to life. If you know any vertically aligned professions where content creation is the core work-product, scratch your entrepreneurial itch and create a niche audience user experience now.
Posted on February 22nd, 2012 in Commentary FWIW