Amazon cloud failure temporarily takes down well-known sites such as Imgur and HootSuite


Amazon Elastic Block Store (Amazon EBS), an option of Amazon EC2, the cloud service provided by Amazon.com, ran into trouble, and well-known sites such as the social media management tool HootSuite and the image sharing service Imgur temporarily went down. The failure was resolved in about seven hours.

Amazon EBS failure brings down Reddit, Imgur, others

Amazon Cloud Goes Down Again, Breaks Foursquare and Others | Wired Enterprise |

For users, cloud services have the advantage that they do not have to worry about server management and maintenance. For providers, there are economies of scale: standardization cuts costs, resource utilization improves, and service fees can be set low. For these reasons, cloud services are also widely used to deliver web services.

This time, multiple failures occurred at the data center in Northern Virginia. In addition to HootSuite and Imgur, the social news site Reddit, the social network Foursquare, the cloud platform Heroku, Minecraft, the social coding site GitHub, and others went down or were affected.

The main sequence of events was as follows.

· 2:38 (10:38 on the 22nd, local time)
Performance degradation is reported on some EBS volumes in the Amazon EC2 "US-EAST-1" region at the Northern Virginia data center, and an investigation begins.

· 3:11 (11:11 on the 22nd, local time)
The performance degradation is confirmed. Instances using the affected EBS volumes are also affected.

· 3:26 (11:26 on the 22nd, local time)
Notice that launching a new instance will fail if it uses an affected EBS volume. Recovery work continues for about three hours with no change in the situation. During this time, errors also occur in Amazon Relational Database Service, CloudSearch, CloudWatch, ElastiCache, and other services, and work on them proceeds in parallel with the EBS recovery.

· 6:20 (14:20 on the 22nd, local time)
About half of the affected EBS volumes are recovered.

· 7:48 (15:48 on the 22nd, local time)
Notice that the pace of recovery is improving and that service will be restored shortly. By this point, the other errors have mostly been resolved.

· 8:42 (16:42 on the 22nd, local time)
Recovery reaches the point where new instances can be launched even on the affected EBS volumes.

· 9:44 (17:44 on the 22nd, local time)
Almost all of the failed EBS volumes are restored.

According to the AWS Service Health Dashboard, a yellow icon (some problems remain) was still shown as of 11:00, so restoration was not yet complete.

Heroku has published a report on the whole incident, and comparing against the earlier server status on Heroku Status shows just how long the impact of this failure lasted.

turntable.fm, which was similarly affected, tweeted a down report from its official Twitter account. As of 10:30, when this article was written, the site had not yet been restored.
Twitter / turntablefm: Alright, we are down. :( We're ...

Amazon's Northern Virginia data center also suffered a failure in July, which affected the online rental service Netflix, the photo sharing services Instagram and Pinterest, and others.

Amazon Blames Generators for Blackout That Crushed Netflix | Wired Enterprise |

in Note, Web Service, Posted by logc_nt