Introduction to Service Mesh: Why Is It Critical for Microservice Apps?

Recently, the service mesh has become an essential component of the cloud stack. High-traffic companies already run one in their production applications. Linkerd, an open source service mesh for cloud applications, is now an official project of the Cloud Native Computing Foundation, which also hosts Kubernetes and Prometheus. So what exactly does a service mesh do, and why is it so critical?

A service mesh is a dedicated infrastructure layer that manages service-to-service communication. It is responsible for reliable delivery of requests through the complex topology of services that makes up a modern Cloud Native application. In practice, it is usually implemented as an array of lightweight network proxies deployed alongside the application code.

The concept of the service mesh is tied to the rapid growth of Cloud Native applications. In this model, a single application can consist of hundreds of services, each service can have thousands of instances, and each instance can change state as an orchestration tool such as Kubernetes dynamically schedules it. In such a world, communication between services is both very complex and fundamental to correct runtime behavior. Managing it is essential for performance and reliability.

The service mesh is a networking model that sits at a layer of abstraction above TCP/IP. It assumes that the underlying L3/L4 network is present and can transfer bytes from point to point. It also assumes that this network, like every other aspect of the environment, is unreliable, so the service mesh must handle network failures.
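To make "handling network failures" concrete, here is a minimal Python sketch of one technique a mesh proxy might apply on behalf of the application: retrying a failed request with exponential backoff and jitter. The names call_with_retries and TransientNetworkError are hypothetical and exist only for this illustration; they are not part of Linkerd or any other mesh. Real meshes typically combine retries with deadlines, retry budgets, and circuit breaking, none of which is shown here.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class TransientNetworkError(Exception):
    """Hypothetical error standing in for a dropped connection or a timeout."""


def call_with_retries(
    send: Callable[[], T],
    max_attempts: int = 3,
    base_delay: float = 0.05,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call send() until it succeeds or max_attempts is exhausted."""
    for attempt in range(1, max_attempts + 1):
        try:
            return send()
        except TransientNetworkError:
            if attempt == max_attempts:
                raise  # out of attempts: surface the failure to the caller
            # Exponential backoff plus random jitter, so that many clients
            # retrying at once do not hit the service in lockstep.
            delay = base_delay * (2 ** (attempt - 1))
            sleep(delay + random.uniform(0, delay))
    raise ValueError("max_attempts must be at least 1")
```

The application code only supplies send; it never sees the intermediate failures. The sleep parameter is injectable purely so the sketch can be exercised without real waiting.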

The service mesh is in some ways similar to TCP/IP. Just as the TCP stack abstracts the mechanics of reliably delivering bytes between network endpoints, the service mesh abstracts the mechanics of reliably delivering requests between services. Like TCP, it does not care about the actual payload or how it is encoded. The application has a high-level goal (such as sending something from point A to point B), and the job of the service mesh, like that of TCP, is to accomplish this goal while handling any failures along the way. Unlike TCP, however, the service mesh has a significant goal beyond just making something work: it also has to provide a uniform, application-wide point for introducing visibility and control into the runtime.
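The idea of a single place that forwards opaque payloads while making traffic observable can be sketched in a few lines of Python. MeshProxy and RouteStats are hypothetical names invented for this illustration, and in-process function calls stand in for real network hops. An actual mesh runs separate proxy processes next to each service instance and exports far richer metrics.

```python
import time
from collections import defaultdict
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class RouteStats:
    requests: int = 0
    failures: int = 0
    total_seconds: float = 0.0


class MeshProxy:
    """Hypothetical stand-in for a sidecar proxy; payloads are opaque bytes."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._routes: Dict[str, Callable[[bytes], bytes]] = {}
        self._clock = clock
        self.stats: Dict[str, RouteStats] = defaultdict(RouteStats)

    def register(self, service: str, handler: Callable[[bytes], bytes]) -> None:
        """Associate a service name with the callable that handles its requests."""
        self._routes[service] = handler

    def send(self, service: str, payload: bytes) -> bytes:
        """Forward payload to service without inspecting it, recording metrics."""
        stats = self.stats[service]
        stats.requests += 1
        started = self._clock()
        try:
            return self._routes[service](payload)
        except Exception:
            # Any error, including an unknown service name, counts as a failure.
            stats.failures += 1
            raise
        finally:
            stats.total_seconds += self._clock() - started
```

Because every request passes through send, request counts, failure counts, and cumulative latency per destination are available in one place without any change to the services themselves. The same choke point is where a mesh could apply control, such as routing rules or retries.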

So the explicit goal of the service mesh is to move service-to-service communication out of the realm of the invisible and make it a full participant in the ecosystem, where everything is monitored, managed, and controlled.