by Scott Wilson on August 12, 2008
Every silver lining's got a touch of grey
- The Grateful Dead
Yesterday's Gmail outage, affecting both individual and Apps corporate customers, has the blogosphere all abuzz yet again over the unreliability of cloud computing. With this coming only weeks after Amazon's latest Simple Storage Service (S3) outage, many are linking the two and drawing some grand conclusions about cloud computing as a whole. You can't trust it yet! You need off-line synchronization! Better make a back-up plan; this isn't ready for prime time yet! It's fortunate that S3 didn't go out at the same time; people wouldn't have had Twitter available to complain to one another about Gmail.
All these histrionics are typical reactions to any failure, but they themselves fail to look at the real issue and ask the most pertinent question: what solution can provide the service in question at the highest availability for the lowest cost?
First, as Harry McCracken points out, this issue is hardly new and certainly not restricted to cloud service providers. Netflix, eBay, MSN... they have all had their outages, far eclipsing those at Google and Amazon, yet people seem to be able to manage their movie rentals, auctions, and messaging just fine on the whole today. Failures are a fact of life in the computing world, and as they go, these recent ones have been relatively minor.
But how about the most dramatic failures, the ones you never hear about: the private ones? Although there aren't spectacular news stories and blogger melt-downs to point to, internal failures within the enterprise are more common and more dramatic than anything that has happened yet at Amazon or Google. I have never worked with any business, of any size, which has not had internal systems melt-downs far exceeding those exhibited by Gmail and S3, putting users out of action for far longer periods and often more completely than a single service failure such as storage or e-mail can do. These things don't get press, for the obvious reasons, but they are every bit as significant to the users affected... and more importantly, to the profits of the business involved.
Cloud-based services are insanely cheap compared to their internal equivalents. Most Gmail users pay nothing; corporate Apps accounts might cough up $50 per user per year if they want some additional features. Amazon S3 costs $0.15 per gigabyte per month for storage and around $0.10 per gigabyte for transfer. Compare these to the alternatives: in a quick survey of hard drive costs at my favorite online store, I come up with an average of around $0.20 per gigabyte for hard drive storage, which doesn't include the server to put it in, the electrical costs of keeping it running, or the staffing for maintenance and installation. Network bandwidth isn't commonly metered these days, but if you have a DS3 at $2000/month, you'd have to pump about 20 terabytes out over the line each month to make it worth your while.
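The bandwidth comparison is simple enough to check for yourself. Here is a rough sketch of the break-even arithmetic in Python. It uses the DS3 price and the S3 transfer rate quoted above, and it assumes the transfer rate is charged per gigabyte, a 30-day month, and the nominal DS3 line rate of 44.736 Mbps. The function names are made up for illustration.

```python
# Figures from the paragraph above. The per-gigabyte transfer unit and the
# 30-day month are assumptions made for this sketch.
S3_TRANSFER_USD_PER_GB = 0.10
DS3_USD_PER_MONTH = 2000.0
DS3_LINE_RATE_MBPS = 44.736  # nominal DS3 line rate
SECONDS_PER_MONTH = 30 * 24 * 3600


def break_even_gb(flat_monthly_usd: float, metered_usd_per_gb: float) -> float:
    """Monthly volume at which a flat-rate line costs the same as metered transfer."""
    return flat_monthly_usd / metered_usd_per_gb


def line_capacity_gb(rate_mbps: float, seconds: int) -> float:
    """Upper bound on what a line can carry in the period, in decimal gigabytes."""
    bits = rate_mbps * 1_000_000 * seconds
    return bits / 8 / 1_000_000_000


if __name__ == "__main__":
    needed = break_even_gb(DS3_USD_PER_MONTH, S3_TRANSFER_USD_PER_GB)
    ceiling = line_capacity_gb(DS3_LINE_RATE_MBPS, SECONDS_PER_MONTH)
    print(f"Break-even volume: {needed:,.0f} GB per month")
    print(f"DS3 ceiling at full saturation: {ceiling:,.0f} GB per month")
```

Under these assumptions the break-even volume comes out at 20,000 GB a month, while a fully saturated DS3 tops out at roughly 14,500 GB, so plug in your own line cost and rate before drawing conclusions.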
While we're talking about costs, let's talk about the cost of reliability. Gmail was down for a couple of hours; it's dropped offline before, but not globally and not for such an extended period in recent history. That's 99.7% available for the month, for free. S3 was off-line for 6 hours in its most recent failure. Unlike Google, Amazon provides a Service Level Agreement which specifies 99.9% up-time and allows for a sliding scale of compensation should outages exceed that. There is some discussion that their compensation is too low, but let's look at the number. Three nines sounds pretty good to the layman, but in the industry most people know that equates to a little over 40 minutes of allowable down-time a month. In critical systems, that's a lot. What Amazon actually achieved was about 99.2% availability for the month, which is even worse. This all sounds bad, but tell me: what do you spend to get even 99.2% up-time out of your internal systems, at equivalent functionality? Let's see, that's redundant servers, on-call staff, a lot of planning time, and oh, you are going to have to test your outage responses from time to time, too, and that's time-consuming.
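If you want to check the availability figures, the arithmetic is short. The sketch below assumes a 31-day month (744 hours) and takes "a couple of hours" to mean two; the function names are invented for illustration.

```python
HOURS_PER_MONTH = 31 * 24  # assumption: a 31-day month


def availability_pct(outage_hours: float, period_hours: float = HOURS_PER_MONTH) -> float:
    """Percentage of the period during which the service was up."""
    return 100.0 * (1.0 - outage_hours / period_hours)


def allowed_downtime_minutes(sla_pct: float, period_hours: float = HOURS_PER_MONTH) -> float:
    """Minutes of down-time a given up-time percentage permits in the period."""
    return (1.0 - sla_pct / 100.0) * period_hours * 60.0


if __name__ == "__main__":
    print(f"Gmail, 2 hour outage: {availability_pct(2):.2f}%")
    print(f"S3, 6 hour outage:    {availability_pct(6):.2f}%")
    print(f"99.9% SLA allows:     {allowed_downtime_minutes(99.9):.1f} minutes")
```

With these assumptions the script prints 99.73% for Gmail, 99.19% for S3, and 44.6 minutes of allowable down-time under a three-nines SLA. Pass `period_hours=720` for a 30-day month and the allowance drops to about 43 minutes.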
Obviously this number is different for every business, but if you sit down, right now, with a calculator and figure out what it would (or does) take in your enterprise to achieve the reliability that Amazon and Google have, I imagine you're going to discover it costs quite a bit more to do it in-house. Maybe you can mitigate those numbers with other services you'd have to provide anyway, maybe you can cost-justify them against the cost of productivity when systems are down, maybe you don't need even that much reliability and could spend less on it internally... there are a lot of possible answers. But in the main, I think you'll find you're getting quite good reliability for the cost with most cloud-based providers.
None of this obviates the recommendations that other pundits have been making for lining up back-up services, making contingency plans, and seriously considering your various options before you leap into the cloud for corporate services. But my advice to you is: run the numbers. Don't buy into the hysteria; think about your requirements, and calculate which approach meets them at the lowest cost. An outage, any outage, is annoying and provokes a reaction. At the end of the day, though, stay calm and look to your bottom line to guide you.
Permalink: A touch of gray