In addition to taking down power lines that left millions of people on the East Coast without power for the last few days, the unusual, fast-moving storm system known as a "derecho" also took down Amazon data centers, including ones that had been running well-known application services such as Netflix, Pinterest, Heroku and Instagram.
The outage was the second one to hit Amazon’s Ashburn, Va., data center in the last 30 days. While the fact that a data center might go offline because of weather-related issues is understandable, the fact that application services are going offline as a result is not.
It’s pretty clear that the IT organizations running those services have put all their eggs in one proverbial Amazon data center basket. Amazon, of course, urges customers to take advantage of what it describes as "availability zones" across multiple data centers to maintain high availability. But for whatever reason, the people managing those application services have either not implemented zones or not set them up correctly.
For old-time infrastructure specialists, the dependency on a single vendor for anything is a little baffling. In a traditional data center, it’s not uncommon to have, for example, IBM systems on standby to back up primary systems that might be from EMC. The point is that there is no single point of failure. Steve Zivanic, vice president of marketing for the cloud storage service provider Nirvanix, notes that this is much easier to accomplish across multiple cloud service providers today, simply by invoking a few scripts and active replicas of applications, than it ever has been in traditional data center environments. And just to put a finer point on the argument for having two or more cloud service providers to rely on, Zivanic asks the question: When was the last time you heard, for example, of an Amazon cloud and an IBM cloud going down at the same time?
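To make the idea of failing over between providers a little more concrete, here is a minimal sketch in Python. It is not how Nirvanix, Amazon or any of the affected services actually handle failover; the provider names, the `first_healthy` function and the simulated health check are all hypothetical, and a real setup would also have to deal with data replication and DNS.

```python
from typing import Callable, Optional, Sequence


def first_healthy(
    providers: Sequence[str],
    is_healthy: Callable[[str], bool],
) -> Optional[str]:
    """Return the first provider whose health check passes, or None."""
    for name in providers:
        try:
            if is_healthy(name):
                return name
        except Exception:
            # A health check that raises is treated as the provider being down.
            continue
    return None


# Hypothetical replicas of the same application at two different providers.
replicas = ["primary-cloud", "secondary-cloud"]

# Simulated outage (assumed data): the primary is down, the secondary is up.
status = {"primary-cloud": False, "secondary-cloud": True}

chosen = first_healthy(replicas, lambda name: status[name])
print(chosen)
```

In this simulated outage the check skips the primary and settles on the secondary replica, which is the whole point of not depending on a single provider.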
Zivanic says that not only do too few people give enough thought to redundancy in an era when replication in the cloud is relatively simple; too many of them are also afraid to admit that they have outgrown the cloud computing platforms on which they initially launched their services. The simple fact of the matter, says Zivanic, is that not all clouds are created equal. The more dependent your business becomes on the availability of a cloud application, he adds, the more important it becomes to work with a cloud service provider that can provide the appropriate level of service.
Of course, the bitter irony in all this, notes Zivanic, is that for some reason none of the retail services that Amazon provides ever seem to be affected by outages in the Amazon cloud, which is supposed to be an extension of the company’s excess IT capacity. Perhaps that was true at one point in time, but increasingly it’s starting to look like Amazon has one set of IT rules and processes in place for its own applications and possibly quite another for everybody else’s.