Envoy Is the Real Deal
Proxies (or load balancers) have long been fixtures in network implementations of any significant scale. But as distributed systems, cloud-native architectures place new demands on familiar concepts, and the proxy is no exception.
Microservices can simplify application development by allowing developers to focus on discrete units of functionality. But when many applications consist of many services, the number of service-to-service communication paths grows rapidly. Without a single source of information on the full network of microservices, often called "the mesh," diagnosing performance, security, and other problems can be difficult, if not impossible.
In the microservices architecture, the “data plane” provides many of the same functions as traditional proxies, including load balancing for network and application protocols. But the data plane must also provide visibility into the mesh, enabling a “control plane” that allows managers to understand what's happening, diagnose problems, and quickly change routing and other policies for the entire mesh of microservices. Many traditional network and application proxies weren't built for the microservices environment, and can't provide the capabilities that cloud-native architectures require.
For these reasons, Envoy has garnered a great deal of attention, and deservedly so. Originally developed at Lyft and released as open source in 2016, Envoy is a distributed proxy explicitly designed for microservice environments. Envoy operates at L3/L4 and L7, providing a robust set of features. It can work for a single service or application, or provide a communication bus and "universal data plane" for large service mesh architectures. Envoy recently joined Kubernetes and Prometheus as a Cloud Native Computing Foundation graduated project, with an active community of 273 contributors, including Google and IBM. Lyft, Google, eBay, Bloomberg, and Reddit are just a few examples of companies that have successfully deployed Envoy in production environments.
In the eyes of many, Envoy has joined Docker, Kubernetes, gRPC, Kafka, and others as a critical component of the cloud-native technology stack. As load balancers race toward commoditization, it presents an interesting opportunity for standardization of crucial network functions and should be on the shortlist for any organization evaluating how best to implement cloud-native systems.
Envoy’s rapid rise is due to several factors:
Flexibility: Envoy works as either an edge or a sidecar proxy, for a single application or as a distributed mesh in complex distributed systems.
Transparency: As a sidecar proxy, Envoy runs alongside applications, abstracting the network by providing common features in a platform- and language-agnostic manner, easing deployments and allowing applications to inherit capabilities with no code changes.
Performance: Written in C++11, Envoy uses native code and has a small footprint. It has proven its ability to scale in performance-critical environments.
Observability: Envoy provides a rich set of statistics and logs, as well as distributed tracing, making it much easier to troubleshoot and manage complex microservice deployments.
Extensibility: Envoy provides a pluggable filter architecture, allowing it to support a wide variety of use cases and environments. Developers can add custom filters to Envoy’s stack of L3/L4 and L7 filters, for example to implement functionality not present in the core software.
Programmability: Envoy is API-driven, making it much more dynamic. Its configuration APIs allow separate management systems to change routing and other policies on the fly without disrupting either the proxy or the application. All instances of the Envoy proxy share these policies, and changes affect the entire mesh. These APIs have led to the emergence of separate control planes that maximize control and operational flexibility.
All of these features are important, of course, and other proxies provide some of them. But Envoy combines performance with transparency, extensibility, and dynamic reconfigurability, offering a compelling solution for deploying and managing cloud-native systems.
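To make the extensibility point more concrete, here is a minimal Python sketch of a filter chain. It illustrates the general pattern only and is not Envoy's C++ filter API; the names run_chain, require_auth, and add_request_id, and the dictionary-based request, are hypothetical and chosen for illustration.

```python
from typing import Callable, Dict, List, Optional

Request = Dict[str, str]
Response = Dict[str, str]
# A filter inspects (and may modify) the request. Returning a response
# stops the chain early; returning None passes control to the next filter.
Filter = Callable[[Request], Optional[Response]]


def require_auth(request: Request) -> Optional[Response]:
    """Hypothetical filter: reject requests without an authorization header."""
    if "authorization" not in request:
        return {"status": "401"}
    return None


def add_request_id(request: Request) -> Optional[Response]:
    """Hypothetical filter: tag the request so it can be traced later."""
    request.setdefault("x-request-id", "req-1")
    return None


def run_chain(
    filters: List[Filter],
    request: Request,
    upstream: Callable[[Request], Response],
) -> Response:
    """Run each filter in order, then forward to the upstream service."""
    for current_filter in filters:
        early_response = current_filter(request)
        if early_response is not None:
            return early_response
    return upstream(request)
```

With this shape, adding behavior means appending another function to the list; the application behind upstream is unchanged, which is the property the filter architecture is meant to provide.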
And Now a Word from the OSI Model
While it may seem archaic, most people still describe proxy and load balancing functionality in terms of the OSI model, with the primary distinction being between Layers 3 and 4 (network and transport) on the one hand and Layer 7 (application) on the other.
L3/L4 proxies have been around a long time. But the trend toward microservices has driven a significant requirement for proxy and load balancing functions at the application layer. Microservices communicate using L7 protocols such as HTTP, HTTP/2, gRPC, Kafka, and MongoDB. Consequently, L7 proxies have become a core concern in modern application architectures.
But there are tradeoffs, and real-world applications don’t quite fit the binary distinctions of the OSI layers. Because of the analysis, transformation, and routing services they provide, L7 load balancers generally can't handle the traffic volumes that an L3/L4 load balancer can; parsing each HTTP request as it comes in adds latency. L3/L4 proxies are much faster, but they often can't provide the features that L7 proxies can, such as proxying different URLs to different back ends. Both have their limitations. The trick in microservice environments is balancing the need for functionality that spans both realms.
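The difference between the two layers can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not how any particular proxy is implemented: l4_pick and l7_pick are hypothetical names, the L4 decision sees only the connection 4-tuple, and the L7 decision uses longest-prefix matching on the request path.

```python
import hashlib
from typing import Dict, List, Tuple

# (source IP, source port, destination IP, destination port)
Connection = Tuple[str, int, str, int]


def l4_pick(conn: Connection, backends: List[str]) -> str:
    """L3/L4 style: choose a backend from connection metadata only.

    No request parsing is needed, which is why this path is cheap.
    """
    key = "|".join(str(part) for part in conn).encode()
    digest = hashlib.sha256(key).digest()
    index = int.from_bytes(digest[:8], "big") % len(backends)
    return backends[index]


def l7_pick(path: str, routes: Dict[str, str], default: str) -> str:
    """L7 style: route on the longest URL prefix that matches the request.

    The proxy must parse the request first, which costs time but allows
    different URLs to reach different back ends.
    """
    matches = [prefix for prefix in routes if path.startswith(prefix)]
    if not matches:
        return default
    return routes[max(matches, key=len)]
```

For example, with routes {"/api": "api-svc", "/api/users": "users-svc"}, a request for /api/users/1 goes to users-svc, while the L4 function could never make that distinction because it never sees the path.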
Built for Microservices
Envoy's designers (primarily Matt Klein of Lyft) set out to create a proxy architecture specifically for microservices, making the network transparent to applications and problems easier to troubleshoot, all while providing the high performance and flexibility needed at scale. It achieves these goals by providing the following features and benefits:
Edge proxy support: Envoy is usable as an edge proxy, providing features such as TLS termination, HTTP/1.1 and HTTP/2 support, and HTTP L7 routing. Using the same proxy software both at the edge and for internal service-to-service communications can ease management burdens and create a more consistent architecture, a must given the distributed nature of most microservice deployments.
Sidecar topology: For intra-network load balancing, Envoy runs in a distributed sidecar topology, operating alongside the service instances as a separate process, eliminating a single point of failure and making the network transparent. Integration with existing services, written in any language, is automatic. Services inherit Envoy's functionality with no changes, and managers can reconfigure the proxy without making changes to the application. The flexibility and transparency this topology provides are ideally suited for distributed microservices environments. Developers can focus on microservice functionality, knowing Envoy handles the infrastructure.
Dynamic configuration: In simple deployments supporting one or two applications, Envoy can use YAML-based static configuration files. In large implementations, however, Envoy's xDS APIs support dynamic configuration for centralized management via separate control planes. This dynamic configurability is crucial to microservices and DevOps environments, where change and the need to adapt are constants. DevOps personnel can reconfigure the proxy without disrupting service instances. And all proxy instances share configuration, routing policies, and load balancing information, giving the system scale. Several control planes that leverage the xDS APIs have emerged, part of the growing community of products and services that leverage Envoy.
L3/L4 and L7 processing: Envoy operates on L3, L4, and L7 simultaneously, bypassing many of the limitations of proxies that work either at L3/L4 or L7. Consequently, Envoy can accommodate sophisticated routing policies based on L7 intelligence but with a performance level more akin to an L4 proxy. Since microservices live in an “L7 world,” Envoy can provide high performance and horizontal scale while providing the L7 load balancing microservices need.
Advanced load balancing: Currently, Envoy supports a variety of load balancing algorithms, including weighted round-robin, weighted least request, ring hash, maglev, and random. It includes support for automatic retries, circuit breaking, global rate limiting via an external rate-limiting service, request shadowing, and outlier detection. Because Envoy is a sidecar, its advanced load balancing techniques are accessible to any application, again providing a great deal of flexibility for microservices deployments. Developers can test and tune using different algorithms, finding the best fit for their environments without making any changes to service instances.
Observability: Envoy includes robust statistics support for all its subsystems. It uses statsd for statistics aggregation, and statistics are also viewable via the administration port. Envoy also supports distributed tracing via third-party tools, such as Zipkin. This information is invaluable when trying to troubleshoot problems in microservice implementations. It gives both security and DevOps personnel visibility into what's happening, in real time if need be, allowing them to diagnose problems more quickly and effectively.
Robust protocol support: Envoy provides a stack of existing filters, working with a long list of protocols. L3/L4 filters support tasks such as raw TCP proxy, HTTP proxy, and TLS client certificate authentication, for example. At L7, Envoy supports HTTP/HTTPS, HTTP/2, gRPC, MongoDB, and DynamoDB. And managers can filter requests based on a variety of parameters. Its pluggable filter architecture also allows developers to write filters as need be, but most developers find that Envoy supports the protocols they need for microservice implementations.
Health checking: Envoy’s health checking subsystem can perform active health checking of upstream service clusters. Envoy uses the union of service discovery and health checking information to determine healthy load balancing targets.
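Two of the ideas in this list, combining discovery with health checking and then choosing among the remaining hosts, can be sketched in Python. This is a rough illustration and not Envoy's implementation: the Host class and the functions eligible_hosts, weighted_round_robin, and least_request are hypothetical, and the rule "discovered and passing its health check" is a simplifying assumption about how the two sources of information are combined.

```python
import itertools
from dataclasses import dataclass
from typing import Iterator, List, Set


@dataclass
class Host:
    address: str
    weight: int = 1           # relative share of traffic, must be >= 1
    active_requests: int = 0  # requests currently in flight to this host


def eligible_hosts(discovered: List[Host], unhealthy: Set[str]) -> List[Host]:
    """Keep hosts that service discovery reports and health checks pass.

    Assumption: a host is a target only if it is discovered AND healthy.
    """
    return [host for host in discovered if host.address not in unhealthy]


def weighted_round_robin(hosts: List[Host]) -> Iterator[str]:
    """Cycle through hosts, each appearing `weight` times per cycle."""
    schedule = [host.address for host in hosts for _ in range(host.weight)]
    if not schedule:
        raise ValueError("no eligible hosts")
    return itertools.cycle(schedule)


def least_request(hosts: List[Host]) -> Host:
    """Pick the host with the fewest active requests relative to its weight."""
    if not hosts:
        raise ValueError("no eligible hosts")
    return min(hosts, key=lambda host: host.active_requests / host.weight)
```

Because the selection functions take only a list of hosts, swapping one algorithm for another does not touch the service itself, which mirrors the point above about tuning load balancing without changing service instances.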
The Data Plane and the Mesh
This combination of features makes Envoy a robust part of the microservices architecture, and it's for this reason that Envoy's creators call it a "universal data plane." By abstracting the network from the application, Envoy works universally, with virtually any application or service. Organizations can start small, using Envoy for one or two applications, managing it via YAML config files, and scale up to larger deployments.
At scale, Envoy provides core functions for the microservice network, or mesh. As the data plane, running alongside services in a distributed architecture, Envoy touches every packet/request in the system, providing service discovery, health checking, routing, load balancing, authentication/authorization, and observability. Leveraging Envoy's APIs, separate control planes provide the mechanisms for dynamically managing and updating Envoy's routing, service discovery, and other configurations for all of the data planes in the system. This combination creates the mesh, allowing all Envoy instances to share routing and health information and enforce sophisticated routing policies.
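The relationship between a control plane and its data planes can be sketched as a toy model in Python. This is an assumption-laden illustration of the idea of pushing versioned configuration to every proxy, not the xDS protocol itself; the ControlPlane and Proxy classes and their methods are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Proxy:
    """Stand-in for one data plane instance running next to a service."""
    name: str
    version: int = 0
    routes: Dict[str, str] = field(default_factory=dict)

    def apply(self, version: int, routes: Dict[str, str]) -> None:
        # Only accept configuration that is newer than what is loaded.
        if version > self.version:
            self.version = version
            self.routes = dict(routes)


@dataclass
class ControlPlane:
    """Stand-in for a management system that owns the routing policy."""
    proxies: List[Proxy] = field(default_factory=list)
    version: int = 0
    routes: Dict[str, str] = field(default_factory=dict)

    def register(self, proxy: Proxy) -> None:
        # A newly registered proxy immediately receives the current policy.
        self.proxies.append(proxy)
        proxy.apply(self.version, self.routes)

    def update_routes(self, routes: Dict[str, str]) -> None:
        # One change is pushed to every proxy; no service is restarted.
        self.version += 1
        self.routes = dict(routes)
        for proxy in self.proxies:
            proxy.apply(self.version, self.routes)
```

In this model a single update_routes call changes the behavior of every registered proxy, and a proxy that joins later catches up to the current version, which is the essence of managing the mesh from one place.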
Conclusion
Explicitly built for microservices and cloud-native environments, Envoy is the real deal. It has gained significant traction by combining performance with transparency, extensibility, and dynamic reconfigurability. How it combines these capabilities makes it a strong candidate for inclusion in the standard cloud-native stack.