How big was the Amazon CloudFront outage?


Amazon's "CloudFront" is a content delivery network (CDN) that provides stable, low-cost distribution of broadband content such as large applications, music, movies, and online games.

A serious failure occurred in CloudFront's DNS, and content delivered via CloudFront, such as images, could no longer be displayed.

The outage also affected major web services such as Instagram, to the point that nothing was displayed on the site.

Amazon itself appears to have been affected as well.

"TurboBytes", a multi-CDN service provider, has shown on its company blog just how big this outage was.

Global outage of AWS CloudFront CDN on Nov 20 2014 - TurboBytes

The CloudFront failure lasted about 90 minutes, starting at 9:15 on November 27, 2014, Japan time. For the first 45 minutes after the trouble began, the AWS status page reported the CloudFront failure as shown below, flagging it as merely "informational" rather than as a major problem.

In reality, however, images and icons that make up site pages stopped appearing on many sites, and even banner advertisements and visitor counters were no longer displayed. The outage made it clear just how much content is delivered via CloudFront.

The online shop "Cotton On Asia" was reduced to a very plain, stripped-down page while CloudFront was down.

Under normal conditions, the Cotton On Asia site looks like this.

TurboBytes apparently monitors the performance of multiple CDNs in real time so that it can use the data in its own service, which means CloudFront's failure and performance during the outage were also closely tracked. TurboBytes' system appears to judge a load as "failed" if it takes more than 5 seconds to load 15 KB of content from a CDN.
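The failure rule described above can be sketched in a few lines of Python. This is only an illustration of the rule as stated in the article, not TurboBytes' actual code: the `Sample` type, the `load_failed` function, and the choice to treat a missing response as a failure are all assumptions.

```python
from dataclasses import dataclass
from typing import Optional

# Threshold taken from the article: a 15 KB load taking more than 5 seconds fails.
TIMEOUT_SECONDS = 5.0


@dataclass
class Sample:
    """One hypothetical measurement of a 15 KB test object fetched from a CDN."""
    cdn: str
    elapsed_seconds: Optional[float]  # None = no response at all (assumption)


def load_failed(sample: Sample, timeout: float = TIMEOUT_SECONDS) -> bool:
    """Return True if the load counts as failed under the 5-second rule."""
    if sample.elapsed_seconds is None:
        return True  # assumed: a load that never completes is also a failure
    return sample.elapsed_seconds > timeout
```

For example, `load_failed(Sample("cloudfront", 7.2))` would be a failure, while a load finishing in 0.4 seconds would not.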

The graph below shows CDN performance measured according to this rule. The vertical axis shows the rate at which TurboBytes' monitoring system judged data loads to have failed. Compared with the other CDNs, it is easy to see how badly CloudFront (blue line) failed to deliver content during the outage.
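A failure rate like the one on the vertical axis can be computed by counting failed loads per CDN. The following is a minimal sketch under assumed inputs: the function name `failure_rate_by_cdn` and the `(cdn, failed)` pair format are hypothetical, and the sample values are made up for illustration rather than taken from TurboBytes' data.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def failure_rate_by_cdn(results: Iterable[Tuple[str, bool]]) -> Dict[str, float]:
    """Return the percentage of failed loads for each CDN.

    `results` is a sequence of (cdn_name, failed) pairs, where `failed`
    is True when the load was judged to have failed.
    """
    totals = defaultdict(int)
    failures = defaultdict(int)
    for cdn, failed in results:
        totals[cdn] += 1
        if failed:
            failures[cdn] += 1
    return {cdn: 100.0 * failures[cdn] / totals[cdn] for cdn in totals}


# Made-up measurements for illustration only.
example = [
    ("cloudfront", True), ("cloudfront", True),
    ("cloudfront", False), ("cloudfront", True),
    ("other-cdn", False), ("other-cdn", False),
]
rates = failure_rate_by_cdn(example)
```

Computing this per time window (for example, per minute) would give one point per CDN on a graph of this kind.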

The data from TurboBytes' real-time monitoring system is not perfectly detailed. However, all of it is collected from live online measurements, and it makes clear that CloudFront's DNS was not responding most of the time.

The graph below shows the average DNS response time (ms) on the vertical axis. CloudFront normally has a longer response time than other CDNs, and during the outage it peaked at more than 10 times its normal value.
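The "more than 10 times" comparison is simply the ratio between the average DNS response time during the outage and the average under normal conditions. A minimal sketch, where the function name `slowdown_factor` and all millisecond values are hypothetical and not TurboBytes' measurements:

```python
from statistics import mean
from typing import Sequence


def slowdown_factor(baseline_ms: Sequence[float], outage_ms: Sequence[float]) -> float:
    """Ratio of the average outage response time to the average baseline time."""
    return mean(outage_ms) / mean(baseline_ms)


# Made-up DNS response times in milliseconds, for illustration only.
normal = [100, 120, 80]       # average: 100 ms
during = [1000, 1200, 1100]   # average: 1100 ms
factor = slowdown_factor(normal, during)
```

With these invented numbers the factor is 11, which is the kind of jump the graph describes.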

In the chart below, the blue line shows successful loads via CloudFront (15 KB of data loaded within 5 seconds) and the red line shows failed loads. It clearly shows that data loading failed for 75 minutes.

in Web Service, Posted by logu_ii