In the third part of the Microservices Reliability Playbook, we explore how to build reliable microservices systems using our understanding of how to predict reliability.
Microservices Reliability Playbook
- Introduction to Risk
- Introduction to Microservices Reliability
- Microservices Patterns
  - Read Patterns
  - Write Patterns
  - Multi-Service Patterns
  - Call Patterns
Download the full Playbook as a free PDF
The ideal system, and in fact the only system that meets the five-nines target, is the following
Obviously, the ideal system looks like a monolithic system!
So, are microservices the wrong choice? Actually, no. Microservices mitigate other types of risk: they lower the risk of change and reduce the blast radius when a problem occurs. They also solve the organizational problem of multiple teams working concurrently and independently.
So the next question is: how do we regain reliability with a microservices architecture? This is where microservices patterns come in.
Single Read / Write / Process (RWP) Service Pattern
The baseline - the classic microservice
This type of system couples the risk of change with the risks of load and latency in a single process. A write operation may block a read operation due to load, and a change in write or processing logic can cause read operations to fail, or vice versa.
Consider such a system in which, during read or write operations, the processing calls another 10 microservices (transitively). The system reliability drops, as per the above formula for predicting reliability.
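As a rough illustration of that drop, the sketch below assumes the common serial-dependency model: a request succeeds only if the service and every downstream call succeed, and failures are independent, so the composite reliability is the product of the individual reliabilities. The function name `composite_reliability` and the 99.99% per-service figure are assumptions made for illustration, not values taken from the playbook.

```python
import math


def composite_reliability(own: float, dependencies: list[float]) -> float:
    """Reliability of a service that must successfully call every dependency.

    Assumes independent failures and that every call is required (serial model).
    """
    return own * math.prod(dependencies)


# Hypothetical numbers: one service plus 10 downstream services, each at 99.99%.
single = 0.9999
result = composite_reliability(single, [single] * 10)
print(f"{result:.4%}")  # 99.8901%
```

Under these assumptions, eleven services at four nines each yield roughly 99.89%, well short of a five-nines target.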
For instance, with a product catalog service, the additional calls may validate a write, enrich the catalog with inventory or other information, categorize products, and so on.
This is the simplest and most straightforward way of building a microservice, which makes it a good starting point for discussing how to increase reliability.
Microservice Patterns
When considering microservice patterns, the most common concept is that of CRUD services, which support reading (R) and writing (CUD) data. However, we have to keep in mind that services commonly also process data, so that whatever is read from their database or from other services is processed before a read operation. The processing can be synchronous or asynchronous.
For this discussion, we define a Read / Write / Process (RWP) service as a service that accepts written data, processes that data itself (possibly fetching data from other services as enrichments), and supports reading the processed and enriched data.
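The definition above can be sketched as a minimal in-memory service. Everything here is hypothetical and chosen for illustration: the `CatalogService` class, the `fetch_inventory` callable that stands in for a call to another microservice, and the `in_stock` field. Processing is done synchronously on write; an asynchronous variant would defer `_process` instead.

```python
from typing import Callable, Dict


class CatalogService:
    """A single RWP service: write, process (with enrichment), and read in one process."""

    def __init__(self, fetch_inventory: Callable[[str], int]) -> None:
        # fetch_inventory is a stand-in for a network call to another service.
        self._fetch_inventory = fetch_inventory
        self._raw: Dict[str, dict] = {}
        self._processed: Dict[str, dict] = {}

    def write(self, product_id: str, product: dict) -> None:
        """Write: store the raw data, then process it synchronously."""
        self._raw[product_id] = dict(product)
        self._process(product_id)

    def _process(self, product_id: str) -> None:
        """Process: enrich the raw record with data fetched from another service."""
        enriched = dict(self._raw[product_id])
        enriched["in_stock"] = self._fetch_inventory(product_id) > 0
        self._processed[product_id] = enriched

    def read(self, product_id: str) -> dict:
        """Read: return the processed and enriched record."""
        return self._processed[product_id]


service = CatalogService(fetch_inventory=lambda product_id: 3)
service.write("sku-1", {"name": "Kettle"})
print(service.read("sku-1"))  # {'name': 'Kettle', 'in_stock': True}
```

Because all three responsibilities share one process, a failure in `fetch_inventory` makes the write fail as well, which is the kind of coupling the patterns aim to remove.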
We categorize the patterns into 4 groups - Read Patterns, Write Patterns, Multi-Service Patterns and Call Patterns.
Read Patterns focus on how to read data in a reliable and fast way, given constraints such as multi-service reads.
Write Patterns focus on how to write data in a reliable way, including writing to multiple microservices.
Multi-Service Patterns combine readers and writers into larger and more complex systems.
Call Patterns focus on improving reliability of a single network call, such as retry and cache.
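To make the call patterns more concrete, here is a minimal sketch of a retry with an optional fallback around a single call. It is one possible shape under stated assumptions, not the playbook's implementation: the helper name `call_with_retry`, its parameters, the exponential backoff, and the use of `ConnectionError` to represent a network failure are all illustrative choices.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def call_with_retry(
    call: Callable[[], T],
    attempts: int = 3,
    base_delay: float = 0.1,
    fallback: Optional[Callable[[], T]] = None,
) -> T:
    """Retry a call with exponential backoff, then use a fallback if one is given."""
    for attempt in range(attempts):
        try:
            return call()
        except ConnectionError:
            # Wait before the next attempt; no wait is needed after the last one.
            if attempt < attempts - 1:
                time.sleep(base_delay * 2 ** attempt)
    if fallback is not None:
        return fallback()
    raise ConnectionError(f"call failed after {attempts} attempts")


# Simulated downstream call that fails twice, then succeeds.
failures = iter([True, True, False])


def flaky_inventory_call() -> int:
    if next(failures):
        raise ConnectionError("simulated network failure")
    return 3


print(call_with_retry(flaky_inventory_call, base_delay=0.01))  # 3
```

Retries only help with transient failures. When the downstream service is down for longer, the fallback (or a circuit breaker) is what keeps the caller responsive.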
| Type | Name | Optimize for |
|---|---|---|
| | Single RWP Service | Simplicity |
| Read | Reader | High reliability, any database schema |
| Read | Reader with Preprocessing and Enrichments | High reliability, simple database schema, preventing other service calls during reads |
| Read | Isolated Reader | High reliability, non-simple database schema, enrichments. This is the most reliable and fastest reader pattern. |
| Write | Writer | High reliability and fast consistent writes |
| Write | Write with a Queue | Decoupling data processing from writes |
| Write | Multi-Writer | Multi-service writes and processing |
| Multi-Service | Command Query Responsibility Segregation (CQRS) | High reliability and fast consistent writes, plus high reliability and fast reads |
| Multi-Service | Split by SLO | Focused high reliability for sub-systems |
| Call | Pass Through Cache | Improving latency and reducing load on the downstream system |
| Call | Retry | Overcoming random network failures |
| Call | Fallback | High reliability by expecting failure and having a contingency plan |
| Call | Circuit Breaker | Detecting failure and preventing load on a failed system |
Next: Read the details of the patterns in