Introduction to Service Mesh: Basic Features

Previously we described what a service mesh is and why it is essential for microservice applications. Now, let’s take a look at what a service mesh actually does.

Reliable request delivery in a cloud infrastructure is a very complex process. A service mesh aims to make it manageable through a set of techniques: circuit breaking, latency-aware load balancing, service discovery, retries, and deadlines. All these features must work together, and the interactions between them and with their environment can be very complicated.

A simplified sequence of actions in a service mesh looks like this:

  1. Dynamic routing rules are applied to determine which service a request is destined for. These rules are configured dynamically and can be applied either globally or to a selected slice of traffic.
  2. Once the target service is determined, the service mesh retrieves the pool of suitable instances from the relevant service discovery endpoint (there may be several). If this information differs from what the mesh observes in practice, it decides which source of information to trust.
  3. It then chooses the instance most likely to return a fast response, based on a number of factors (including the observed latency of recent requests).
  4. Next, the service mesh attempts to send the request to that instance, recording the result (latency and response type).
  5. If the instance is down, unresponsive, or unable to process the request, the service mesh retries the request on another instance (but only if it knows the request is idempotent).
  6. If the instance consistently returns errors, the service mesh removes it from the load balancing pool and rechecks it periodically (since the failure may be only temporary).
  7. If the deadline for the request has passed, the service mesh proactively fails the request (rather than adding load with further retries).
  8. It captures every aspect of the behavior described above as metrics, which are sent to a centralized metrics system.
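
Steps 3 to 8 can be sketched in a few dozen lines of Python. This is only an illustrative model under stated assumptions, not how any particular service mesh is implemented: the names `MeshClient`, `Instance`, and `DeadlineExceeded`, the thresholds, and the use of an exponentially weighted moving average (EWMA) for latency are all hypothetical choices made for the example. A `ConnectionError` raised by the caller-supplied `send` function stands in for "the instance is down or cannot process the request".

```python
import time
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Set


class DeadlineExceeded(Exception):
    """Raised when the overall deadline passes before any attempt succeeds."""


@dataclass
class Instance:
    name: str
    ewma_latency: float = 0.0       # smoothed latency of recent successes (seconds)
    consecutive_failures: int = 0
    ejected_until: float = 0.0      # instance is skipped until this clock value


@dataclass
class MeshClient:
    instances: List[Instance]
    max_failures: int = 3           # hypothetical ejection threshold
    eject_seconds: float = 10.0     # hypothetical ejection period
    alpha: float = 0.3              # EWMA smoothing factor (assumption)
    clock: Callable[[], float] = time.monotonic
    metrics: list = field(default_factory=list)

    def pick(self, exclude: Set[str]) -> Optional[Instance]:
        """Step 3: choose the non-ejected instance with the lowest recent latency."""
        now = self.clock()
        pool = [i for i in self.instances
                if i.ejected_until <= now and i.name not in exclude]
        return min(pool, key=lambda i: i.ewma_latency) if pool else None

    def record(self, inst: Instance, latency: float, ok: bool) -> None:
        """Steps 4, 6 and 8: record the outcome, eject on repeated errors."""
        if ok:
            inst.consecutive_failures = 0
            inst.ewma_latency = (self.alpha * latency
                                 + (1 - self.alpha) * inst.ewma_latency)
        else:
            inst.consecutive_failures += 1
            if inst.consecutive_failures >= self.max_failures:
                # Temporary removal: the instance becomes eligible again later.
                inst.ejected_until = self.clock() + self.eject_seconds
        self.metrics.append((inst.name, latency, ok))

    def call(self, send: Callable[[str], object], idempotent: bool, timeout: float):
        """Send a request, retrying on other instances only if it is idempotent."""
        deadline = self.clock() + timeout
        tried: Set[str] = set()
        while True:
            # Step 7: fail fast once the deadline has passed instead of retrying.
            if self.clock() >= deadline:
                raise DeadlineExceeded("deadline reached before a successful response")
            inst = self.pick(tried)
            if inst is None:
                raise RuntimeError("no healthy instance available")
            tried.add(inst.name)
            start = self.clock()
            try:
                result = send(inst.name)
            except ConnectionError:
                self.record(inst, self.clock() - start, ok=False)
                if not idempotent:
                    raise           # Step 5: unsafe to retry a non-idempotent request
                continue            # Step 5: try a different instance
            self.record(inst, self.clock() - start, ok=True)
            return result
```

A caller would construct `MeshClient([Instance("a"), Instance("b")])` and invoke `client.call(send, idempotent=True, timeout=0.5)`, where `send` is whatever function actually performs the network request. The `clock` parameter is injectable only so that the behavior can be exercised deterministically; a real proxy would also re-probe ejected instances actively rather than simply waiting for the ejection period to expire.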

This is only a simplified picture of a service mesh: it can also initiate and terminate TLS, perform protocol upgrades, dynamically route traffic, and fail over between data centers.
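
To make the dynamic routing from step 1 a little more concrete, here is a minimal sketch of rules that apply either globally or only to a slice of traffic. The `Rule` type, the `route` function, the service names, and the `x-canary` header are all hypothetical; real meshes express such rules in their own configuration formats.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class Rule:
    target: str                                # service that receives the request
    path_prefix: str = "/"                     # "/" matches every path
    header: Optional[Tuple[str, str]] = None   # (name, value); None means global


def route(rules: List[Rule], path: str, headers: Dict[str, str]) -> Optional[str]:
    """Return the target of the first matching rule, or None if nothing matches."""
    for rule in rules:
        if not path.startswith(rule.path_prefix):
            continue
        if rule.header is not None:
            name, value = rule.header
            if headers.get(name) != value:
                continue   # rule is limited to a traffic slice that excludes this request
        return rule.target
    return None


# Rules are evaluated in order, so the most specific ones come first.
rules = [
    Rule("users-v2", "/users", ("x-canary", "true")),  # selected traffic slice
    Rule("users-v1", "/users"),                        # all other /users traffic
    Rule("web"),                                       # global fallback
]
```

Because the rules are plain data, they can be replaced at runtime without redeploying the services themselves, which is the sense in which the routing is "dynamic".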