Dropbox moves from NGINX to Envoy for its traffic control infrastructure, which handles millions of requests per second

Dropbox, which provides cloud storage services, has migrated its traffic control platform from NGINX to Envoy Proxy. Dropbox has published a blog post explaining why it moved from NGINX to Envoy and how the migration went.

How we migrated Dropbox from Nginx to Envoy - Dropbox

NGINX served as Dropbox's traffic control platform without problems for the past 10 years, but as Dropbox adopted more modern practices, such as moving service-to-service communication from REST APIs to gRPC and managing builds with Bazel, NGINX no longer fit its best practices for deployment. In addition, the NGINX-based traffic control platform was assembled from multiple components such as YAML, Jinja2, and Python, and its high operating cost was also a problem. Dropbox considered Bandaid, a proxy written in Go, as the replacement for NGINX, but settled on Envoy because Go has a larger overhead than C and C++.

NGINX handles traffic with multiple processes called 'worker processes', while Envoy handles it with multiple threads. By migrating from NGINX to Envoy, Dropbox was able to free up to 60% of the server resources that NGINX had occupied in its environment. In addition, the NGINX-based traffic infrastructure relied on a custom log collection system implemented in Lua, whereas Envoy can export a variety of metrics in Prometheus format and can also stream access logs over gRPC, so monitoring became easier, Dropbox explains.
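To give a feel for why metrics in Prometheus format are easy to consume, here is a minimal sketch of reading the Prometheus text exposition format. The metric names, labels, and values in SAMPLE are made up for illustration and are not taken from Dropbox or from real Envoy output; parse_metrics and error_ratio are hypothetical helpers, and the parser ignores optional timestamps.

```python
import re

# Illustrative sample in the Prometheus text exposition format.
# Metric names, labels, and values are invented for this sketch.
SAMPLE = """\
# TYPE proxy_requests_total counter
proxy_requests_total{cluster="api",code="200"} 1500
proxy_requests_total{cluster="api",code="503"} 12
proxy_active_connections 42
"""

# name, optional {labels}, then a single value (timestamps are not handled).
LINE_RE = re.compile(r'^([A-Za-z_:][A-Za-z0-9_:]*)(?:\{(.*)\})?\s+(\S+)$')
LABEL_RE = re.compile(r'([A-Za-z_][A-Za-z0-9_]*)="((?:[^"\\]|\\.)*)"')


def parse_metrics(text):
    """Return a list of (name, labels, value) tuples from exposition text."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        match = LINE_RE.match(line)
        if match is None:
            continue  # skip anything this simple parser does not understand
        name, raw_labels, value = match.groups()
        labels = dict(LABEL_RE.findall(raw_labels or ""))
        samples.append((name, labels, float(value)))
    return samples


def error_ratio(samples, metric="proxy_requests_total"):
    """Share of requests whose 'code' label starts with 5 (server errors)."""
    total = sum(v for n, _, v in samples if n == metric)
    errors = sum(
        v for n, labels, v in samples
        if n == metric and labels.get("code", "").startswith("5")
    )
    return errors / total if total else 0.0
```

Calling error_ratio(parse_metrics(SAMPLE)) computes the share of 5xx responses from the sample, the kind of figure a dashboard or alert would derive from such metrics.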

NGINX relies on widely used mechanisms such as static configuration files and syslog, which makes it simple and highly compatible. These properties are advantages in small environments, but as systems grow larger, testability and standardization become more important, Dropbox explains. Envoy provides a stable API called xDS and encourages the use of gRPC, so it was a good fit for Dropbox's systems, where services communicate over gRPC. Envoy also has excellent scalability and security, and its development is open and highly transparent.
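The difference between a static configuration file and an API-driven configuration can be sketched with a toy model. The following is not the xDS protocol, which is a gRPC streaming API; it is a simplified polling illustration under that stated assumption, and ControlPlane, Proxy, set_cluster, fetch, and sync are hypothetical names invented for this example.

```python
from dataclasses import dataclass, field


@dataclass
class ControlPlane:
    """Toy stand-in for a management server that owns the routing config."""
    version: int = 0
    clusters: dict = field(default_factory=dict)

    def set_cluster(self, name, endpoints):
        # Every change bumps the version so that proxies can detect it.
        self.clusters[name] = list(endpoints)
        self.version += 1

    def fetch(self, known_version):
        # Return (version, snapshot) if the caller is out of date, else None.
        if known_version == self.version:
            return None
        snapshot = {name: list(eps) for name, eps in self.clusters.items()}
        return self.version, snapshot


@dataclass
class Proxy:
    """Toy proxy that applies new configuration without restarting."""
    version: int = 0
    clusters: dict = field(default_factory=dict)

    def sync(self, control_plane):
        # Ask the control plane for changes; apply them if there are any.
        update = control_plane.fetch(self.version)
        if update is None:
            return False
        self.version, self.clusters = update
        return True
```

In this model, a change made through set_cluster reaches the proxy on its next sync call, and every applied configuration carries a version number that tests and tooling can check, which is harder to achieve with hand-edited static files.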

The biggest problem in the transition from NGINX to Envoy was inconsistent behavior in API services. Because NGINX has a proven track record as an industry standard, most libraries assume NGINX's behavior, and many discrepancies surfaced where API clients relied on that behavior. There was also a bug in Envoy itself, but it has already been resolved in cooperation with the Envoy community.

At the time of writing, both NGINX and Envoy are in use, and Dropbox is reportedly shifting traffic gradually by changing DNS. Dropbox also reportedly plans to support HTTP/3 and to migrate Bandaid, which it uses internally, to Envoy.
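The article does not describe how the DNS change is carried out. One common approach, assumed here purely for illustration, is weighted DNS: a chosen fraction of lookups is answered with the new pool, and that fraction is raised step by step. The function names resolve and envoy_share and the pool labels are hypothetical.

```python
import random


def resolve(envoy_weight, rng):
    """Return which pool a single simulated DNS answer points at."""
    if not 0.0 <= envoy_weight <= 1.0:
        raise ValueError("envoy_weight must be between 0 and 1")
    # rng.random() is in [0, 1), so weight 0 never picks envoy
    # and weight 1 always does.
    return "envoy" if rng.random() < envoy_weight else "nginx"


def envoy_share(envoy_weight, lookups=10_000, seed=0):
    """Simulate many lookups and return the fraction sent to the envoy pool."""
    rng = random.Random(seed)  # seeded so the simulation is repeatable
    hits = sum(resolve(envoy_weight, rng) == "envoy" for _ in range(lookups))
    return hits / lookups
```

Raising envoy_weight in small steps, for example 0.05, then 0.25, then 1.0, moves a growing share of clients to the new proxy while leaving the option to roll back by lowering the weight again.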

in Software, Web Service, Posted by darkhorse_log