This is part 4 of our cloud native and Kubernetes application checklist. Part 1 reviewed cloud native development tools, part 2 outlined CI/CD tools and part 3 analysed cloud native network tools. The criteria that we used to evaluate tools in each category are based on ease of deployment in cloud native environments as well as the feature-sets they provide.
In this installment we will evaluate service mesh tools for cloud native environments. We will start off with a brief primer on service meshes, why they are needed and how they work. We will then move on to a comparison of a handful of service mesh tools from the CNCF cloud native landscape.
The cloud native approach to application development advocates architecting those applications as loosely coupled containerised microservices deployed using an orchestration engine like Kubernetes with the cloud serving as an underlying infrastructure layer.
Breaking down applications into distributed functional microservices improves speed and flexibility and makes it easier to scale, but it also introduces some challenges. An obvious one is an increase in the volume of inter-service communication. Managing these inter-service communications at the scale required by today’s enterprise applications quickly becomes infeasible with existing networking tools.
The need to secure, monitor and orchestrate these communications and to implement observability paradigms like distributed tracing and logging adds further complexity. This is where service meshes like Istio, Linkerd and Consul come in.
The term service mesh refers to both the microservices that make up an application and the communications between them. Service mesh tools create an abstraction layer on top of microservices that helps manage, orchestrate, monitor and secure the communications between those services.
Service meshes decouple networking from both the application code and the systems that code runs on. This way DevOps spend less time configuring, securing and monitoring inter-service communication.
Most service mesh tools bundle together network proxies with a central control plane. Proxies constitute the data plane piece of service mesh architectures and are usually deployed alongside individual microservices. In Kubernetes environments, proxies are deployed as sidecar containers in the same pod as the microservice. All traffic is channeled via the proxy to other microservices or network endpoints based on policies configured in the central control plane.
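To make the split between data plane and control plane more concrete, here is a minimal sketch in Python. It is not taken from any of the tools reviewed below: the ControlPlane and SidecarProxy classes, their methods and the endpoint address are hypothetical names chosen purely for illustration. The only idea it tries to capture is that the proxy, not the service, decides where traffic goes, and that it does so by consulting centrally configured policy.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class ControlPlane:
    """Hypothetical central store of routing policy (not a real mesh API)."""
    routes: Dict[str, str] = field(default_factory=dict)

    def set_route(self, service: str, endpoint: str) -> None:
        # An operator configures where traffic for a service should go.
        self.routes[service] = endpoint

    def resolve(self, service: str) -> Optional[str]:
        # Proxies ask the control plane where a destination lives.
        return self.routes.get(service)


@dataclass
class SidecarProxy:
    """Hypothetical data-plane proxy deployed next to one microservice."""
    owner: str
    control_plane: ControlPlane

    def send(self, destination: str, payload: str) -> str:
        # The service hands traffic to its proxy; the proxy applies policy.
        endpoint = self.control_plane.resolve(destination)
        if endpoint is None:
            return f"{self.owner} -> {destination}: rejected (no route configured)"
        return f"{self.owner} -> {destination} via {endpoint}: {payload}"


control_plane = ControlPlane()
control_plane.set_route("orders", "10.0.0.5:8080")  # made-up address
proxy = SidecarProxy(owner="checkout", control_plane=control_plane)
print(proxy.send("orders", "GET /orders/42"))
print(proxy.send("billing", "GET /invoices"))
```

In a real mesh the proxy would typically cache configuration pushed from the control plane rather than look it up on every request; the direct lookup here just keeps the sketch short.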
In this section, we will review service mesh tools from the CNCF cloud native landscape and compare their traffic management, security, monitoring and observability feature-sets.
Istio is the highest rated service mesh tool in the CNCF cloud native landscape and is backed by teams from Google, IBM and Lyft. It has a broad integration footprint with Kubernetes and can be installed and configured using Helm as well. A number of other installation and configuration options are also available. Most cloud providers also integrate Istio into their managed Kubernetes services and provide seamless installation and setup.
Istio uses the Kubernetes sidecar pattern where a sidecar container is placed alongside application containers within pods. All pods in the mesh must be running an Istio sidecar proxy.
Istio’s traffic management features allow DevOps to configure granular rules to control the flow of traffic and API calls between services, without having to make changes to the services themselves. On top of this, they also make it easier to create and configure policies for circuit breakers, timeouts, retries, A/B testing, canary rollouts, and staged rollouts.
On Kubernetes, Istio creates a service registry by automatically adding all services and endpoints in the cluster using Pilot adapters. Once they are added, Envoy proxies kick in to load balance traffic across multiple instances of the same microservice.
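A canary rollout boils down to sending a small, configurable share of requests to the new version of a service. The sketch below illustrates that idea with a weighted random choice. It does not use Istio's API; the function names, the version labels v1 and v2 and the 90/10 split are all assumptions made for the example.

```python
import random
from typing import Dict


def pick_version(weights: Dict[str, int], rng: random.Random) -> str:
    """Pick a service version in proportion to its configured weight."""
    versions = list(weights)
    return rng.choices(versions, weights=[weights[v] for v in versions], k=1)[0]


def simulate(weights: Dict[str, int], requests: int, seed: int = 0) -> Dict[str, int]:
    """Route a number of requests and count how many each version received."""
    rng = random.Random(seed)  # seeded so that repeated runs are comparable
    counts = {version: 0 for version in weights}
    for _ in range(requests):
        counts[pick_version(weights, rng)] += 1
    return counts


# Hypothetical canary: roughly 10% of traffic goes to the new version.
print(simulate({"v1": 90, "v2": 10}, requests=10_000))
```

Shifting the weights step by step (for example 90/10, then 50/50, then 0/100) is what turns this into a staged rollout, and none of it requires a change to the services themselves.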
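The combination of a service registry and per-service load balancing can be sketched in a few lines. The ServiceRegistry class below is a hypothetical in-memory stand-in, not Pilot or Envoy code, and the endpoint addresses are made up. Round robin is used only because it is the simplest policy to illustrate; it is not a statement about which policy Envoy applies by default.

```python
from collections import defaultdict
from itertools import count


class ServiceRegistry:
    """Hypothetical in-memory registry mapping service names to endpoints."""

    def __init__(self) -> None:
        self._endpoints = defaultdict(list)
        self._counters = defaultdict(count)  # one request counter per service

    def register(self, service: str, endpoint: str) -> None:
        # Ignore duplicates so re-registering an instance is harmless.
        if endpoint not in self._endpoints[service]:
            self._endpoints[service].append(endpoint)

    def next_endpoint(self, service: str) -> str:
        """Return the next instance of a service in round-robin order."""
        endpoints = self._endpoints.get(service)
        if not endpoints:
            raise LookupError(f"no endpoints registered for {service!r}")
        index = next(self._counters[service]) % len(endpoints)
        return endpoints[index]


registry = ServiceRegistry()
registry.register("reviews", "10.0.0.1:9080")
registry.register("reviews", "10.0.0.2:9080")
for _ in range(4):
    print(registry.next_endpoint("reviews"))
```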
DevOps can also add their own traffic configurations using Istio’s traffic management API. Configurations can be added for using different load balancing policies for specific sections of services or to apply custom rules to ingress and egress traffic.
The native security feature-set allows DevOps to mitigate both insider and external threats. Built-in security features include granular authentication and authorization policies, certificate management and mutual TLS (mTLS) based encryption.
Istio’s observability feature-set allows DevOps to gain granular insights into how services interact with each other as well as with Istio components. DevOps can monitor service metrics for latency, traffic, errors and saturation along with gaining access to distributed traces and logs.
Consul is backed by HashiCorp and has the second highest rating on the CNCF cloud native landscape. It can be easily installed in cloud native Kubernetes environments as well as in most managed Kubernetes services using Helm charts.
Consul automatically injects a Connect sidecar container running Envoy in Kubernetes clusters. Each pod in the cluster receives a Connect sidecar, which in turn enables the pod to accept and establish connections with other pods or services over secure, encrypted and authorized connections.
As with Istio, DevOps can also implement granular traffic management policies using Consul’s L7 traffic management feature. It can be configured to support A/B testing, Blue/Green deployments, circuit breaking and fault injection. DevOps can also configure bespoke policies to control and manage ingress and egress traffic. Consul also has a dedicated registry that keeps track of all running nodes and services as well as their health. Services can be registered with local Consul agents either manually, or automatically by container orchestrators.
The security feature-set includes support for mTLS, certificate management, and authentication and authorization. Authentication and authorization can be handled using dedicated ACLs which, along with securing inter-service communication, can also be used to secure UI, API, CLI and agent communications. DevOps have multiple options for certificate management: the built-in CA system, Vault or AWS Certificate Manager.
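Circuit breaking, one of the policies mentioned above, is easy to picture with a small example. The CircuitBreaker class below is a hypothetical, heavily simplified sketch and not Consul or Envoy code: it opens after an assumed number of consecutive failures and then rejects calls without contacting the upstream service. Production breakers usually also include a timed recovery (half-open) state, which is left out here.

```python
class CircuitBreaker:
    """Hypothetical breaker that opens after N consecutive failures."""

    def __init__(self, failure_threshold: int = 3) -> None:
        self.failure_threshold = failure_threshold  # assumed value
        self.consecutive_failures = 0

    @property
    def is_open(self) -> bool:
        return self.consecutive_failures >= self.failure_threshold

    def call(self, func):
        """Invoke func unless the circuit is open."""
        if self.is_open:
            # Fail fast so a struggling upstream is not hit with more traffic.
            raise RuntimeError("circuit open: request rejected")
        try:
            result = func()
        except Exception:
            self.consecutive_failures += 1
            raise
        self.consecutive_failures = 0  # any success closes the streak
        return result

    def reset(self) -> None:
        """Manually close the circuit (stands in for a recovery timer)."""
        self.consecutive_failures = 0


def flaky_upstream():
    raise ConnectionError("upstream unavailable")


breaker = CircuitBreaker(failure_threshold=2)
for attempt in range(3):
    try:
        breaker.call(flaky_upstream)
    except (ConnectionError, RuntimeError) as error:
        print(f"attempt {attempt + 1}: {error}")
```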
The observability feature-set supports metric collection for all Envoy proxies in Consul Connect. Envoy proxy metrics can be captured in a Prometheus time series and graphed using Grafana. The observability feature-set also includes distributed tracing and logs.
Kuma is an open source service mesh from Kong. Similar to Istio and Consul, it uses Envoy as a sidecar proxy. Kuma is platform agnostic in that it can operate equally well across a number of platforms including Kubernetes, VMs and bare metal. This enables enterprises to secure, manage and monitor inter-service communication as a first step and then move on to containerizing applications and porting them over to Kubernetes.
On Kubernetes, Kuma does not require any external data storage since it stores all of its state and configuration in the Kubernetes API server. During installation a single instance of the kuma-injector executable is spun up per cluster. Kuma-injector in turn spins up an instance of the kuma-dp sidecar container (which invokes Envoy) alongside each service pod. Kuma-dp instances connect to the Kuma control plane (kuma-cp) as soon as they start up. The Kuma control plane can then be used to spin up a service mesh and configure its behaviour using policies.
Kuma packs functionality to secure, observe and manage connectivity between microservices for L4/L7 traffic. Kuma’s traffic permissions and routing features allow DevOps to configure security rules specifying which destination services can be consumed by source services, and to configure routing rules that enable Blue/Green deployments and canary releases.
Kuma’s security feature-set includes mTLS encryption for all inter-service traffic in the mesh. DevOps can use either the built-in certificate authority (CA) or a third-party one. mTLS is also used for AuthN/Z, since each data plane is assigned a SPIFFE-compatible workload identity certificate.
Kuma is fully integrated with Prometheus and supports metric collection across all data planes. It also provides pre-built Grafana dashboards that DevOps can use to monitor metrics for data planes, meshes or service-to-service traffic. Besides these, DevOps can also configure policies for automated health checks, traffic logging and tracing.
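A rule that says which source services may consume which destination services can be modelled as a simple allow-list. The sketch below is not Kuma's policy format or API: the TrafficPermission dataclass, the is_allowed helper, the "*" wildcard and the deny-by-default behaviour are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class TrafficPermission:
    """Hypothetical rule allowing one source to call one destination."""
    source: str
    destination: str


def is_allowed(permissions: Iterable[TrafficPermission],
               source: str, destination: str) -> bool:
    """Deny by default; '*' in a rule is assumed to match any service."""
    return any(
        rule.source in ("*", source) and rule.destination in ("*", destination)
        for rule in permissions
    )


rules = [
    TrafficPermission(source="frontend", destination="orders"),
    TrafficPermission(source="*", destination="metrics"),
]
print(is_allowed(rules, "frontend", "orders"))   # permitted by the first rule
print(is_allowed(rules, "frontend", "billing"))  # no matching rule
```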
Linkerd is a Kubernetes-specific service mesh tool and has the fourth highest rating in the CNCF cloud native landscape service mesh category. Being Kubernetes centric, it has wide ranging integration and can be easily installed using Helm or kubectl. As opposed to most other service mesh tools, Linkerd uses a native proxy rather than Envoy. The proxy is tailored to Linkerd and supports a large feature-set including proxying for HTTP, HTTP/2, and TCP protocols, automated Prometheus metrics export for HTTP and TCP traffic, layer 4/layer 7 load balancing and automated TLS among others.
Once installed, Linkerd spins up a proxy next to each service instance. All traffic to and from services is routed through the proxies, which share telemetry data with the control plane and receive signals from it. The Linkerd control plane runs in a dedicated namespace in the Kubernetes cluster and is responsible for aggregating telemetry data, providing a user-facing API and communicating with the data plane.
Linkerd has a wide ranging feature set providing debugging, observability, reliability and security for inter-service communication in the service mesh. The traffic split functionality allows DevOps to configure rollout strategies including Blue/Green deployments, canary deployments as well as fault injection. Linkerd automatically load balances all HTTP, HTTP/2, and gRPC traffic using EWMA (exponentially weighted moving average).
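An exponentially weighted moving average gives recent observations more weight than older ones, so a balancer built on it reacts quickly when an endpoint slows down. The sketch below shows the general technique only and is not Linkerd's implementation: the EwmaBalancer class, the smoothing factor alpha of 0.3 and the "pick the lowest average latency" rule are assumptions made for the example. Each update computes new_average = alpha * sample + (1 - alpha) * old_average.

```python
class EwmaBalancer:
    """Hypothetical balancer that prefers the endpoint with the lowest EWMA latency."""

    def __init__(self, endpoints, alpha=0.3):
        self.alpha = alpha  # assumed smoothing factor, between 0 and 1
        # None means "no latency observed yet" for that endpoint.
        self.latency = {endpoint: None for endpoint in endpoints}

    def observe(self, endpoint, latency_ms):
        """Fold one latency sample into the endpoint's moving average."""
        previous = self.latency[endpoint]
        if previous is None:
            self.latency[endpoint] = float(latency_ms)
        else:
            self.latency[endpoint] = (
                self.alpha * latency_ms + (1 - self.alpha) * previous
            )

    def pick(self):
        """Choose the endpoint with the lowest average; untried ones go first."""
        def score(endpoint):
            value = self.latency[endpoint]
            return -1.0 if value is None else value
        return min(self.latency, key=score)


balancer = EwmaBalancer(["pod-a", "pod-b"])
balancer.observe("pod-a", 100)
balancer.observe("pod-b", 10)
print(balancer.pick())           # pod-b is currently the faster endpoint
balancer.observe("pod-b", 1000)  # pod-b suddenly slows down
print(balancer.pick())           # the moving average now favours pod-a
```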
mTLS is automatically enabled for most HTTP traffic using secure, authenticated TLS connections between proxies. Communications between control plane components are also automatically encrypted using mTLS.
Linkerd also provides a wide ranging observability feature-set allowing DevOps to monitor request volume, success rate, and latency distribution metrics for HTTP, HTTP/2 and gRPC traffic as well as TCP level metrics for TCP traffic. Metrics can be recorded across a number of different groupings including per service, per caller/callee pair, or per route/path. Once recorded, metrics can be consumed using the Linkerd CLI, the dashboard or Prometheus. It also provides a number of pre-built Grafana dashboards that provide both high-level metrics as well as per pod granular visibility.
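Request volume, success rate and latency distribution are straightforward to derive once every request is recorded against its route. The sketch below is a hypothetical aggregation and not how Linkerd computes or exposes its metrics: the RouteMetrics class, the choice to count any status below 500 as a success, and the nearest-rank percentile method are assumptions made for illustration.

```python
import math
from collections import defaultdict


def nearest_rank(sorted_values, percentile):
    """Nearest-rank percentile of an already sorted, non-empty list."""
    rank = math.ceil(percentile / 100 * len(sorted_values))
    return sorted_values[max(rank, 1) - 1]


class RouteMetrics:
    """Hypothetical per-route collector of request outcomes and latencies."""

    def __init__(self):
        self._records = defaultdict(list)  # route -> [(succeeded, latency_ms)]

    def record(self, route, status_code, latency_ms):
        # Assumption: only 5xx responses count as failures.
        self._records[route].append((status_code < 500, latency_ms))

    def summary(self, route):
        records = self._records.get(route)
        if not records:
            raise LookupError(f"no requests recorded for {route!r}")
        latencies = sorted(latency for _, latency in records)
        successes = sum(1 for succeeded, _ in records if succeeded)
        return {
            "request_count": len(records),
            "success_rate": successes / len(records),
            "p50_ms": nearest_rank(latencies, 50),
            "p95_ms": nearest_rank(latencies, 95),
        }


metrics = RouteMetrics()
for status, latency in [(200, 10), (200, 20), (503, 30), (200, 40)]:
    metrics.record("GET /orders", status, latency)
print(metrics.summary("GET /orders"))
```

Keying the records by service or by caller/callee pair instead of by route would give the other groupings described above.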