Multi-Service Patterns combine readers and writers into larger, more complex systems

Microservices Reliability Playbook

  1. Introduction to Risk
  2. Introduction to Microservices Reliability
  3. Microservices Patterns
  4. Read Patterns
  5. Write Patterns
  6. Multi-Service Patterns
  7. Call Patterns

Download the full Playbook as a free PDF

Command Query Responsibility Segregation (CQRS) Pattern

The CQRS pattern combines the ideas of the Writer Pattern with a Queue and the Reader Pattern.

CQRS Pattern Diagram

The pattern mandates that writes are submitted as Commands to a queue. A Command is just a message with instructions to mutate data (insert, update or delete) in one or more tables or services. A processor service accepts the Command, performs the processing, enrichment and any other operations needed, and writes read-optimized data to a database used by a reader service.
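The flow described above can be sketched in a few lines. This is a minimal in-memory illustration, not a production design: the `Command` fields, the `write`, `process_pending` and `read` functions, and the product data are all hypothetical names chosen for the example, and a real system would use a durable queue and a real database.

```python
import queue
from dataclasses import dataclass


@dataclass(frozen=True)
class Command:
    """A message with instructions to mutate data."""
    action: str       # "insert", "update" or "delete"
    product_id: str
    fields: dict


# Hypothetical in-memory stand-ins for the queue and the read database.
command_queue = queue.Queue()
read_db = {}


def write(command):
    """Writer service: only enqueues the Command, nothing else."""
    command_queue.put(command)


def process_pending():
    """Processor service: applies Commands and stores read-optimized rows."""
    while not command_queue.empty():
        cmd = command_queue.get()
        if cmd.action == "delete":
            read_db.pop(cmd.product_id, None)
            continue
        # Merge the new fields into whatever the reader already has.
        row = {**read_db.get(cmd.product_id, {}), **cmd.fields}
        # Enrichment: precompute a display label so the reader does no work.
        row["label"] = f"{row.get('name', '?')} (${row.get('price', 0):.2f})"
        read_db[cmd.product_id] = row


def read(product_id):
    """Reader service: a plain key lookup on the read-optimized data."""
    return read_db.get(product_id)


write(Command("insert", "p1", {"name": "Mug", "price": 8.5}))
write(Command("update", "p1", {"price": 9.0}))
process_pending()
print(read("p1"))
```

Note that all the complexity (merging, enrichment) lives in the processor, while the writer and the reader stay trivial.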

There are multiple variations of the CQRS pattern: some use an event log instead of a queue, and they differ in when and how they update the reader service's data (sometimes called materialized views). Yet the main principles remain the same: simple writer and simple reader services.

CQRS raises consistency questions: what is the source of truth? How does a client read its own writes? How do we ensure that two conflicting writes are not both applied? Each of these has multiple solutions, from optimistic locking to clients retrying reads until they see their latest write, all with advantages and disadvantages that are a whole topic in themselves.
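Two of the options mentioned above, optimistic locking and retrying reads, can be illustrated with a small sketch. It assumes every row carries a version number; `VersionedStore`, `VersionConflict` and `read_own_write` are hypothetical names for this example. Here the same store serves both reads and writes, so the retry succeeds on the first attempt; in a real CQRS system the reader lags behind the writer and several attempts may be needed.

```python
import time


class VersionConflict(Exception):
    """Raised when a write is based on a stale version of the row."""


class VersionedStore:
    """Hypothetical store where every row carries a version number."""

    def __init__(self):
        self._rows = {}  # key -> (version, value)

    def get(self, key):
        # Unknown keys are reported as version 0 with no value.
        return self._rows.get(key, (0, None))

    def put(self, key, value, expected_version):
        """Optimistic locking: accept the write only if nobody wrote in between."""
        current_version, _ = self.get(key)
        if current_version != expected_version:
            raise VersionConflict(
                f"{key}: expected v{expected_version}, found v{current_version}"
            )
        self._rows[key] = (current_version + 1, value)
        return current_version + 1


def read_own_write(read_fn, key, min_version, attempts=5, delay_s=0.0):
    """Retry the read until the reader has caught up with our write."""
    for _ in range(attempts):
        version, value = read_fn(key)
        if version >= min_version:
            return value
        time.sleep(delay_s)
    return None  # the reader never caught up within the allowed attempts


store = VersionedStore()
version_seen_by_a, _ = store.get("cart-1")
version_seen_by_b, _ = store.get("cart-1")

# Client A writes first; client B's write is based on a stale version.
written_version = store.put("cart-1", {"items": 1}, version_seen_by_a)
try:
    store.put("cart-1", {"items": 5}, version_seen_by_b)
    conflict_detected = False
except VersionConflict:
    conflict_detected = True

latest = read_own_write(store.get, "cart-1", written_version)
```

The trade-off is visible even at this size: optimistic locking pushes conflict handling onto the losing client, and read retries add latency to the client that wants to see its own write.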

The main advantage of the CQRS pattern is the decoupling of the system into 3 components - the writer which is simple, the reader which is simple, and the processor which can be complex.

CQRS is frequently combined with Event Sourcing to create a log of events or commands (the writes). Those events can be partial writes, which the processor then summarizes into the full, up-to-date system state available to readers.
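As a rough sketch of how partial writes are summarized into state, the processor can fold the event log from start to end. The event shape (`cart`, `type`, `sku`, `qty`) and the `fold` function are assumptions made for this illustration only.

```python
from collections import defaultdict

# Hypothetical event log: each entry is a partial write to a cart.
events = [
    {"cart": "c1", "type": "item_added", "sku": "mug", "qty": 2},
    {"cart": "c1", "type": "item_added", "sku": "pen", "qty": 1},
    {"cart": "c1", "type": "item_removed", "sku": "mug", "qty": 1},
]


def fold(event_log):
    """Processor: replay the events in order and build the current state."""
    state = defaultdict(dict)  # cart id -> {sku: quantity}
    for event in event_log:
        items = state[event["cart"]]
        sign = 1 if event["type"] == "item_added" else -1
        qty = items.get(event["sku"], 0) + sign * event["qty"]
        if qty > 0:
            items[event["sku"]] = qty
        else:
            # Quantity dropped to zero: the item is no longer in the cart.
            items.pop(event["sku"], None)
    return dict(state)


current_state = fold(events)
print(current_state)
```

Because the log is the record of what was written, the state can be rebuilt at any time by replaying it.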

In addition, this architecture allows creating multiple readers, each getting the slice of data optimized for its use case, all based on the same written data.
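To make the idea of multiple readers concrete, here is a small sketch in which two read models are built from the same written data. The log entries and the two projection functions are hypothetical; in practice each view would live in its own reader service and storage.

```python
from collections import Counter

# Hypothetical written data shared by all readers.
order_log = [
    {"order": "o1", "customer": "ann", "sku": "mug", "qty": 2},
    {"order": "o2", "customer": "bob", "sku": "mug", "qty": 1},
    {"order": "o3", "customer": "ann", "sku": "pen", "qty": 3},
]


def orders_by_customer(log):
    """Read model for an order history page: customer -> order ids."""
    view = {}
    for entry in log:
        view.setdefault(entry["customer"], []).append(entry["order"])
    return view


def units_sold_by_sku(log):
    """Read model for analytics: sku -> total units sold."""
    view = Counter()
    for entry in log:
        view[entry["sku"]] += entry["qty"]
    return dict(view)


customer_view = orders_by_customer(order_log)
sales_view = units_sold_by_sku(order_log)
```

Each reader answers its own query with a single lookup, and adding a third reader does not require touching the writer.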

The pattern optimizes latency for both reads and writes and ensures high reliability of both, at the price of added complexity in the system.

Split by SLO Pattern

The Split by SLO Pattern is similar to the previous Split Read / Write by SLO Pattern, but on a larger scale. Consider a group of microservices that supports multiple business flows. Those business flows may have different SLAs, and as a result operations with different SLOs run on the same set of microservices. The pattern suggests splitting the group into two separate groups of microservices, such that one group supports the higher SLO operations independently of the lower SLO group, while the lower SLO group can rely on the higher SLO group.

Split by SLO Pattern diagram

For instance, consider the cart example we had before. The higher SLO operations are the operations that support the checkout flow, while the lower SLO operations are operations such as updating the product catalog, the shipping configuration or the tax configuration. In addition, analytical operations are also of lower SLO. The split by SLO pattern mandates that we split the system into checkout supporting services and all other services. As we do so, we may employ the Split Read / Write by SLO Pattern for services such as the product catalog, tax and shipment.
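The dependency rule of this pattern (the higher SLO group must not rely on the lower SLO group, while the reverse is allowed) can be checked mechanically. The sketch below assumes a hand-written map of services to groups and of callers to callees; the service names are illustrative and not taken from a real system.

```python
HIGH, LOW = "high", "low"

# Hypothetical assignment of services to SLO groups.
slo_group = {
    "checkout": HIGH,
    "cart": HIGH,
    "shipping-calculator": HIGH,
    "tax-calculator": HIGH,
    "catalog-admin": LOW,
    "shipping-rules-admin": LOW,
    "analytics": LOW,
}

# Hypothetical call graph: caller -> services it depends on.
calls = {
    "checkout": ["cart", "shipping-calculator", "tax-calculator"],
    "shipping-rules-admin": ["shipping-calculator"],
    "analytics": ["cart"],
}


def violations(groups, call_graph):
    """Return (caller, callee) pairs where a high SLO service relies on a low one."""
    return [
        (caller, callee)
        for caller, callees in call_graph.items()
        for callee in callees
        if groups[caller] == HIGH and groups[callee] == LOW
    ]


print(violations(slo_group, calls))
```

A check like this could run in CI against a service catalog, so that a new call from a checkout service into a lower SLO service is flagged before it ships.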

Let's focus on the shipping calculator service. The service has one main function: calculating the shipping costs of items in a cart. However, it has more functions: managing the shipping rules (probably with CRUD APIs, including validations), maybe simulating calculations for some rules management application, and maybe more. We want to decouple the risk of change and the risk of load of all the other functions from the higher SLO function of calculating shipping for a cart. We split the service into a reader, which only performs the shipping calculation, and a writer, which does everything else, including the CRUD operations on shipping rules.
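One possible shape for this split is sketched below. It assumes a simple weight-based rule model and a shared `RulesStore` that the writer publishes to and the reader loads from; the class names, the rule fields and the pricing logic are all hypothetical and only serve to show where each responsibility lands.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ShippingRule:
    max_weight_kg: float
    cost: float


class RulesStore:
    """Hypothetical shared storage holding the published, read-optimized rules."""

    def __init__(self):
        self.published = ()  # rules sorted by max_weight_kg


class ShippingRulesWriter:
    """Lower SLO side: CRUD, validation and publishing of shipping rules."""

    def __init__(self, store):
        self._store = store
        self._draft = {}

    def upsert(self, name, rule):
        if rule.max_weight_kg <= 0 or rule.cost < 0:
            raise ValueError("invalid shipping rule")
        self._draft[name] = rule

    def delete(self, name):
        self._draft.pop(name, None)

    def publish(self):
        # Pre-sort once on write so the reader can do a simple scan.
        self._store.published = tuple(
            sorted(self._draft.values(), key=lambda r: r.max_weight_kg)
        )


class ShippingCalculator:
    """Higher SLO side: only calculates shipping for a cart."""

    def __init__(self, store):
        self._store = store

    def calculate(self, item_weights_kg):
        total = sum(item_weights_kg)
        for rule in self._store.published:
            if total <= rule.max_weight_kg:
                return rule.cost
        raise LookupError("no shipping rule covers this cart")


store = RulesStore()
writer = ShippingRulesWriter(store)
writer.upsert("small", ShippingRule(max_weight_kg=1.0, cost=4.0))
writer.upsert("large", ShippingRule(max_weight_kg=20.0, cost=12.0))
writer.publish()

calculator = ShippingCalculator(store)
print(calculator.calculate([0.3, 0.5]))
```

In this sketch a bug or a load spike in the rules management code can only affect the calculator through what gets published, which is the decoupling the pattern is after.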

Example diagram of Split by SLO for shipping calculator operation

Note: Why do we consider checkout operations to have a higher SLO, while all other operations have a lower SLO? An SLO is a business definition that weighs risk and the ability to absorb delays or other problems against the investment required for a higher SLO system. For online commerce, the most critical operation is the checkout operation, as it and it alone generates money. All other operations are supporting, and even if they fail, as long as checkout continues to operate the business is in a good situation.