Amazon's Chief Technology Officer shares what he learned from operating AWS for 10 years

By Robert Scoble

It has been 10 years since Amazon's cloud computing service, Amazon Web Services (AWS), first appeared. Werner Vogels, CTO of Amazon and AWS, has published the lessons learned from operating it over that time on his own blog.

10 Lessons from 10 Years of Amazon Web Services - All Things Distributed

◆ Build evolvable systems
The software built today will not be the software running a year from now, says Vogels. From the very beginning of AWS he expected that the architecture would need to be revisited and revised, but it was also clear that the old approach of stopping the service for maintenance in order to upgrade the system was not an option. The reason AWS could not be stopped is simple: many businesses around the world have adopted it precisely because it is available 24 hours a day, 7 days a week.

For that reason, Vogels and the other engineers who develop and operate AWS decided to build an architecture in which new software components can be introduced without taking the service down. Marvin, one of the engineers working at Amazon, described the evolution of Amazon S3, one of the AWS services, as starting out "like a single-engine Cessna aircraft". The comparison presumably works because an aircraft that is refueled in mid-air to fly beyond its normal range resembles the way AWS is upgraded again and again without stopping the service.

◆ Expect the unexpected
AWS is developed and operated on a wide variety of components, from routers and hard disks to operating systems and memory units, but whether a component is top-of-the-line hardware or a low-cost part, it will inevitably fail at some point, says Vogels. That seemingly obvious fact is an important lesson.

For example, at the enormous volume of data that S3 processes, even errors with a very small probability do occur. Some of these failures can be anticipated in advance, but Vogels says that far more of the failures AWS encountered were unknown at design and build time than were known. He says, "Even if we do not know what the failure will be, we need to build systems that can absorb failure as a natural occurrence," and adds that the system has to keep running regardless. What matters is a system in which a major failure in one part does not bring down the whole.

Vogels also says that AWS has developed ways to gauge and manage the extent of a failure so that the system as a whole stays healthy.
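The idea of containing a failure so that the rest of the system keeps running can be illustrated with a minimal sketch. This is not how AWS implements it; the function `run_isolated`, the "cell" names, and the simulated fault are all hypothetical and exist only to show the principle that one broken part is recorded rather than allowed to take everything down.

```python
from typing import Callable, Dict, List, Tuple


def run_isolated(cells: Dict[str, Callable[[], int]]) -> Tuple[Dict[str, int], List[str]]:
    """Run each cell independently; a failure in one cell is recorded, not propagated."""
    results: Dict[str, int] = {}
    failed: List[str] = []
    for name, work in cells.items():
        try:
            results[name] = work()
        except Exception:
            # Contain the failure: note which cell broke and keep serving the rest.
            failed.append(name)
    return results, failed


def broken_cell() -> int:
    # Hypothetical fault standing in for any hardware or software failure.
    raise RuntimeError("simulated disk failure")


cells = {
    "cell-a": lambda: 1,
    "cell-b": broken_cell,
    "cell-c": lambda: 3,
}
results, failed = run_isolated(cells)
```

Here `cell-a` and `cell-c` still return their results while `cell-b` is only listed in `failed`, so the extent of the failure is limited to the one cell that broke.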

By Kathyturner 1

◆ Primitives, not frameworks
Because many customers had until then been tied to old hardware and data centers, AWS worked on systems that would let them use its services in new and interesting ways that nobody had seen before, says Vogels. To do that, AWS had to be highly sensitive to what its services needed to offer in order to meet customers' needs.

One of the most important mechanisms Vogels describes is offering customers a collection of primitives and tools. Providing only a single framework forces customers into a single way of doing things, whereas letting them choose their own approach from many building blocks is what satisfied them. The same method has been used for later generations of AWS services.

From this, Vogels concludes that, because customers' priorities are hard to predict, it is important to get on with building the actual service.

◆ Automation is important
Developing a software service that has to be managed after launch is fundamentally different from developing software that is finished once it has been delivered to customers. The key is to automate as much of the management as possible and to eliminate error-prone manual operations wherever possible, says Vogels.

To achieve this, AWS needed to build management APIs that can handle the tasks humans would otherwise perform. At AWS, these APIs can now assist customers in place of human operators. To build them, AWS broke human tasks down into their essential components and applied automation rules to them so that performance stays reliable and predictable.
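As a rough illustration of an automation rule driven through a management API, the sketch below keeps a fleet at a desired size without an operator stepping in. Everything in it is an assumption made for the example: `FleetAPI` is an in-memory stand-in, not a real AWS API, and `reconcile` is just one simple way such a rule might be written.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class FleetAPI:
    """Hypothetical in-memory stand-in for a management API (not a real AWS API)."""
    instances: Dict[str, bool] = field(default_factory=dict)  # instance id -> healthy?
    next_id: int = 0

    def launch(self) -> str:
        instance_id = f"i-{self.next_id}"
        self.next_id += 1
        self.instances[instance_id] = True
        return instance_id

    def terminate(self, instance_id: str) -> None:
        del self.instances[instance_id]

    def unhealthy(self) -> List[str]:
        return [i for i, ok in self.instances.items() if not ok]


def reconcile(api: FleetAPI, desired: int) -> None:
    """Automation rule: remove unhealthy instances, then top up to the desired count."""
    for instance_id in api.unhealthy():
        api.terminate(instance_id)
    while len(api.instances) < desired:
        api.launch()


api = FleetAPI()
reconcile(api, desired=3)      # initial launch of three instances
api.instances["i-1"] = False   # simulate one instance failing
reconcile(api, desired=3)      # the rule repairs the fleet with no manual step
```

Because every action goes through the same small set of API calls, the rule can be run repeatedly and always converges on the same state, which is what makes the outcome predictable.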


◆ APIs are forever
The importance of APIs was a lesson already learned from earlier operations, but at AWS it became even more critical. Once customers start building applications and services on the APIs Amazon provides, those APIs can no longer be changed, because changing them would deal a serious blow to the customers' businesses, says Vogels. In other words, there is only one chance to get an API's design right.
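One common way to live with an API that can never be changed is to evolve it only by addition. The sketch below assumes a hypothetical `put_object` function (it is not the S3 API): a new capability is added as an optional keyword argument with a default, so code written against the original call keeps working unchanged.

```python
from typing import Any, Dict


def put_object(store: Dict[str, Dict[str, Any]], key: str, data: bytes,
               *, tier: str = "standard") -> None:
    """Hypothetical storage call.

    The original version took only (store, key, data). The `tier` option was
    added later as keyword-only with a default, so existing callers are unaffected.
    """
    store[key] = {"data": data, "tier": tier}


store: Dict[str, Dict[str, Any]] = {}

# A caller written against the original API: still valid, same behavior.
put_object(store, "report.txt", b"hello")

# A newer caller opting in to the added capability.
put_object(store, "backup.bin", b"\x00\x01", tier="archive")
```

Removing or renaming a parameter, or changing what the default means, would break the first caller, which is the kind of change the lesson warns against.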

By Tsahi Levent-Levi

◆ Know your resource usage
AWS needs to know how much it actually costs to provide the services that its customers use. With that knowledge, Vogels says, AWS should keep raising efficiency and cutting costs so that it can afford to offer customers prices that are as low as possible.

◆ Build security in from the ground up
Vogels says that "keeping customers secure is the most important thing," and that security is the area in which AWS invests the most.

One of the things Vogels learned early in the development and operation of AWS is that, to build a secure service, security has to be built in at the very first stage of service design.

By Aaron Patterson

◆ Encryption is one of the most important issues
Encryption is a key mechanism for ensuring that customers have complete control over access to their data. Ten years ago, encryption tools and services were very difficult to use, and it took AWS a couple of years to build encryption into its services. Today, however, tools such as AWS CloudHSM and AWS Key Management Service let customers manage their encryption keys themselves.

Presentation of AWS Key Management Service (Japanese subtitles) - YouTube

◆ No gatekeepers
AWS has no gatekeeper telling customers what they can and cannot do. This is said to have led to innovative processes and unexpected inventions on AWS. In fact, the Philips HealthSuite Digital Platform and Ohpen's retail banking platform are both built on AWS.

in Note, Posted by logu_ii