A year ago, we wrote a blog post on the dangers of cloud storage, primarily focusing on privacy issues. This post was motivated by the publicity associated with leaked celebrity photos, and pointed out that cloud storage of confidential corporate data can be very risky. The risks are heightened if your data is stored in a geographical location that may have an entirely different legal framework for privacy of data.
The recent Amazon AWS outage offers a different perspective on the cloud, and illustrates how heavily much of the Internet now relies on Amazon cloud services. On Sunday, Amazon's US-EAST-1 region, located in Northern Virginia, experienced major problems, affecting sites such as Airbnb, Reddit, Netflix and IMDb. It took six hours for Amazon to resolve the issue, and even longer for all Amazon services to return to normal. Update: Amazon has posted an explanation of what occurred here.
This isn't the first time AWS has experienced major problems. In April 2011, a major outage took almost two days to rectify. There were two severe outages in 2012, described here and here, both in US-EAST-1, which is the largest and oldest of Amazon's regions.
Clearly, if your business relies on the cloud, contingency plans are necessary. Netflix has shown how it can avoid significant impact from such outages by aggressively planning for them. Its "Simian Army" is a group of software tools that deliberately and continuously try to simulate or induce failures in Netflix's infrastructure, including its Amazon instances - so that when a real outage occurs, Netflix is prepared and can rapidly mitigate the issue.
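To make the idea of deliberately inducing failures concrete, here is a minimal sketch in Python. It is not Netflix's code and does not touch any real infrastructure: the `Instance`, `Service`, `induce_failure` and `run_drill` names are hypothetical, and the "service" is just an in-memory list of instances that counts as healthy while a minimum number are still running.

```python
import random


class Instance:
    """Hypothetical stand-in for a cloud instance; it is either running or not."""

    def __init__(self, name):
        self.name = name
        self.running = True


class Service:
    """Toy service that is healthy while at least `min_healthy` instances run."""

    def __init__(self, instances, min_healthy=1):
        self.instances = instances
        self.min_healthy = min_healthy

    def running_instances(self):
        return [i for i in self.instances if i.running]

    def is_healthy(self):
        return len(self.running_instances()) >= self.min_healthy


def induce_failure(service, rng):
    """Stop one randomly chosen running instance; return it, or None if none are left."""
    candidates = service.running_instances()
    if not candidates:
        return None
    victim = rng.choice(candidates)
    victim.running = False
    return victim


def run_drill(service, rounds, rng):
    """Induce up to `rounds` failures and return how many the service survived."""
    survived = 0
    for _ in range(rounds):
        if induce_failure(service, rng) is None:
            break  # nothing left to stop
        if not service.is_healthy():
            break  # this failure took the service down
        survived += 1
    return survived


if __name__ == "__main__":
    service = Service([Instance(f"web-{n}") for n in range(3)], min_healthy=2)
    print("failures survived:", run_drill(service, rounds=3, rng=random.Random(0)))
```

In a real environment the drill would stop actual instances and check real health endpoints. The value of running it regularly is that you learn how much failure you can absorb before a genuine outage tells you.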
An important strategy is the ability to migrate to a different Amazon region (and data centre) as soon as problems are detected in the current region. To do this effectively, an active-active replication system is required, where data is continuously replicated between Amazon regions as it is generated, so that another region is ready to take over at any time. This adds to data centre costs, but should be seen as an insurance policy against failures.
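The failover logic itself can be quite small. The following Python sketch shows the general shape under some loud assumptions: `choose_region` and `ActiveActiveStore` are hypothetical names, the replicas are in-memory dictionaries, replication is a simple synchronous write to every reachable region, and the region names are only labels. It does not call any AWS API.

```python
def choose_region(regions, is_healthy, current):
    """Stay in `current` if it is healthy, otherwise pick the first healthy
    alternative in `regions` order. Returns None if no region is healthy."""
    if is_healthy(current):
        return current
    for region in regions:
        if region != current and is_healthy(region):
            return region
    return None


class ActiveActiveStore:
    """Toy key-value store with one replica per region (illustration only)."""

    def __init__(self, regions):
        self.replicas = {region: {} for region in regions}
        self.down = set()  # regions currently marked as failed

    def is_healthy(self, region):
        return region not in self.down

    def write(self, key, value):
        # Assumption: every write is copied to all reachable regions at once.
        for region, data in self.replicas.items():
            if self.is_healthy(region):
                data[key] = value

    def read(self, key, preferred):
        # Serve from the preferred region, failing over if it is down.
        region = choose_region(list(self.replicas), self.is_healthy, preferred)
        if region is None:
            raise RuntimeError("no healthy region available")
        return region, self.replicas[region][key]


if __name__ == "__main__":
    store = ActiveActiveStore(["us-east-1", "us-west-2"])
    store.write("invoice-42", "paid")
    store.down.add("us-east-1")  # simulate a regional outage
    print(store.read("invoice-42", preferred="us-east-1"))
```

Because the data was already present in the second region before the first one failed, the read succeeds without any recovery step. A production system would also need to handle replication lag, conflicting writes, and resynchronising a region when it comes back.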
CompleteFTP customers will be pleased to know that CompleteFTP is well suited to deployment in the cloud - we use it ourselves on Amazon AWS. Using its clustering feature, it could also be deployed in an active-active configuration between Amazon regions, in conjunction with some form of Amazon data replication service.