Lars-Erik Kindblad
Writing about software development, architecture and security

Scaling Microservices Part 6 - From Sync to Async Write Operations Using Messaging

In this part, we will see how messaging can be used to improve the scalability of our services by changing write operations from synchronous to asynchronous execution.

Typical Solution - Synchronous Execution

Service operations responsible for writing data usually consist of one or more steps that create, update, and/or delete data against one or more databases and/or services. In many systems, these steps are executed synchronously.

Diagram showing synchronous execution
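
As a minimal sketch of the synchronous approach, the hypothetical `place_order` operation below runs its steps in sequence and only returns once all of them have finished. The step names, the in-memory dictionary and list standing in for a database and an e-mail gateway, and the simulated latency are assumptions made for illustration.

```python
import time

# In-memory stand-ins for a database and a downstream service (assumptions).
orders_db = {}
sent_emails = []


def save_order(order_id, items):
    orders_db[order_id] = {"items": items, "status": "created"}


def reserve_stock(order_id):
    time.sleep(0.05)  # simulate a slow downstream service call
    orders_db[order_id]["status"] = "stock_reserved"


def send_confirmation(order_id):
    time.sleep(0.05)  # simulate a slow e-mail gateway
    sent_emails.append(order_id)


def place_order(order_id, items):
    """Run every step in sequence; the caller waits for all of them."""
    save_order(order_id, items)
    reserve_stock(order_id)
    send_confirmation(order_id)
    return {"order_id": order_id, "status": orders_db[order_id]["status"]}


started = time.perf_counter()
response = place_order("order-1", ["book"])
elapsed = time.perf_counter() - started  # includes the latency of every step
```

The response can include the final state because every step has completed, but the consumer pays for the combined latency of all the steps.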

Advantages of synchronous execution:

  • Data are immediately consistent since the changes are committed at once.
  • The code is usually (not always) easy to debug and follow since it's sequential and synchronous.

Disadvantages:

  • Slow response time for the consumer if any of the steps are slow. Slowness can be caused by various issues such as complex computations that must be done, unstable service calls, high traffic load, high network latency, etc.
  • Slow write operations consume resources and might negatively impact read operation performance.
  • Transaction management and maintaining data integrity are often challenges. Components and processes can crash, steps can fail, and subsequent steps might not be executed.

Alternative Solution - Asynchronous Execution Using Messaging

An alternative approach is to use messaging to delegate the execution of the steps to one or more asynchronous processes. This is done by publishing one or more messages to a message queue. A worker will then connect to the queue, read, and process the messages.

Diagram showing asynchronous execution
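
The flow can be sketched as follows, assuming an in-process `queue.Queue` as a stand-in for a real message broker and a background thread as the worker. The message shape and the names `place_order` and `worker` are hypothetical.

```python
import queue
import threading
import time

message_queue = queue.Queue()  # in-process stand-in for a real message broker
orders_db = {}


def place_order(order_id, items):
    """Publish a message and return immediately; no step has run yet."""
    message_queue.put({"type": "ProcessOrder", "order_id": order_id, "items": items})
    return {"order_id": order_id, "status": "accepted"}


def worker():
    """Read messages from the queue and process them in the background."""
    while True:
        message = message_queue.get()
        if message is None:  # sentinel used to stop the worker
            message_queue.task_done()
            break
        time.sleep(0.05)  # simulate the slow steps
        orders_db[message["order_id"]] = {
            "items": message["items"],
            "status": "completed",
        }
        message_queue.task_done()


worker_thread = threading.Thread(target=worker, daemon=True)
worker_thread.start()

response = place_order("order-1", ["book"])  # returns before processing ends
message_queue.join()  # wait here only so the final state can be inspected
final_status = orders_db["order-1"]["status"]

message_queue.put(None)
worker_thread.join()
```

Unlike this sketch, a real broker persists the messages and the worker typically runs in a separate process, so the two sides can be scaled and deployed independently.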

Advantages:

  • Faster response time since the service operation only needs to publish the messages to the message queue before the response can be returned. The actual message processing is done asynchronously in the background.
  • Since the messages are persisted in the queue, they are not lost if a worker crashes before processing them. If processing a message fails, the message can be retried. This helps maintain transactional integrity across services.
  • Even though messaging adds complexity, sometimes (not always) it can make the processes simpler, more isolated, and easier to test. For instance, by splitting a complex synchronous process into multiple messages that handle a subset of the functionality in isolation.
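
The retry behaviour mentioned above can be illustrated with a small sketch. It assumes a retry limit of three attempts, a hypothetical handler that fails twice before succeeding, and a plain list acting as a dead-letter store. Real brokers usually implement redelivery through acknowledgements, which is only simulated here by putting the message back on the queue.

```python
import queue

MAX_ATTEMPTS = 3  # assumed retry limit

message_queue = queue.Queue()
dead_letters = []
processed = []
calls = {"count": 0}


def flaky_handler(message):
    """Hypothetical handler that fails on its first two invocations."""
    calls["count"] += 1
    if calls["count"] <= 2:
        raise ConnectionError("downstream service unavailable")
    processed.append(message["id"])


def drain(handler):
    """Process messages until the queue is empty, re-queuing failures."""
    while not message_queue.empty():
        message = message_queue.get()
        try:
            handler(message)
        except Exception:
            message["attempts"] += 1
            if message["attempts"] < MAX_ATTEMPTS:
                message_queue.put(message)  # retry later
            else:
                dead_letters.append(message)  # give up; needs follow-up


message_queue.put({"id": "msg-1", "attempts": 0})
drain(flaky_handler)
```

Because a message can be delivered more than once, a handler should be written so that repeating it does not cause harm.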

Disadvantages:

  • Since the service operation returns a response before the messages are processed, the system is only eventually consistent: the state in the data stores can be inconsistent until all the messages are processed and committed. This leads to the following challenges:
    • Operation responses cannot include data or state created by the asynchronous steps.
    • Consumers might see missing or stale state in subsequent service calls while the messages are still pending.
  • Messaging adds complexity. Message queues and workers are needed, and different processes publish and consume messages. Reading the code alone is often insufficient to get an overview of how the system works and interacts. Instead, documentation is needed to know how the messages are published and consumed throughout the system.
  • Handling messages that fail permanently is a challenge. The completed steps must be rolled back, and it must be communicated to the consumer from an asynchronous process.
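
One possible way to handle a permanent failure is to pair each step with an action that undoes it, as in the sketch below. The step names, the simulated "card declined" error, and the `status_store` dictionary used to report the failure back to the consumer are all assumptions made for illustration.

```python
state = {"stock_reserved": False, "payment_charged": False}
status_store = {}  # where the consumer can later read the outcome


def reserve_stock():
    state["stock_reserved"] = True


def release_stock():
    state["stock_reserved"] = False


def charge_payment():
    raise ValueError("card declined")  # simulated permanent failure


def refund_payment():
    state["payment_charged"] = False


# Each step is paired with the action that undoes it.
STEPS = [(reserve_stock, release_stock), (charge_payment, refund_payment)]


def handle_message(order_id):
    completed = []
    try:
        for step, compensate in STEPS:
            step()
            completed.append(compensate)
    except Exception as error:
        for compensate in reversed(completed):
            compensate()  # roll back the steps that already succeeded
        status_store[order_id] = {"status": "failed", "reason": str(error)}
        return
    status_store[order_id] = {"status": "completed"}


handle_message("order-1")
```

Here the stock reservation is released again, and the failure is recorded where the consumer can find it, since the original response was returned long before the failure occurred.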

Dealing with Eventual Consistency

Issues with eventual consistency can be reduced by ensuring messages are quick to process and ensuring the workers have high uptime by having robust code and good monitoring.

If this is not enough, the following solutions can be implemented.

Solution 1 - Check the Asynchronous Completion Status Using Polling

One solution is to provide a service operation that can be polled every few seconds to check if the process has been completed:

  1. The initial service operation needs to accept or return an ID. This ID can be generated in various ways:
    • The ID can be generated by the consumer and added to the request.
    • The consumer can call an ID generation service and add it to the request.
    • The service operation can generate the ID and return it on the response.
  2. The initial service operation and its message handlers need to update the progress by storing the latest state in a data store.
  3. The check status operation can accept the ID on the request, look up the status by the ID in the data store, and return it on the response.
  4. The consumer can then call the check status operation at given intervals, like every few seconds.

Diagram showing polling
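
The four steps above can be sketched as follows. The example assumes the service generates the ID, an in-memory dictionary acts as the status data store, and a background thread plays the role of the message handler. The function names and status values are hypothetical.

```python
import queue
import threading
import time
import uuid

message_queue = queue.Queue()
status_store = {}  # in-memory stand-in for the data store holding progress


def start_operation(payload):
    """Initial service operation: generate an ID, record status, publish."""
    operation_id = str(uuid.uuid4())
    status_store[operation_id] = "pending"
    message_queue.put({"id": operation_id, "payload": payload})
    return operation_id


def check_status(operation_id):
    """Check status operation: look up the latest status by ID."""
    return status_store.get(operation_id, "unknown")


def worker():
    """Message handler: update the progress as the work advances."""
    message = message_queue.get()
    status_store[message["id"]] = "processing"
    time.sleep(0.1)  # simulate the asynchronous steps
    status_store[message["id"]] = "completed"


def poll_until_done(operation_id, interval=0.02, timeout=2.0):
    """Consumer side: call check_status at a fixed interval."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        status = check_status(operation_id)
        if status in ("completed", "failed"):
            return status
        time.sleep(interval)
    return "timed_out"


threading.Thread(target=worker, daemon=True).start()
operation_id = start_operation({"items": ["book"]})
result = poll_until_done(operation_id)
```

The timeout keeps the consumer from polling forever if a message is stuck. The interval is a trade-off between how quickly the consumer sees the result and how much load the status checks generate.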

Polling is simple to implement and usually works fine for this scenario. But if better scalability or real-time status updates are needed, push can be used instead. Push is more complex to implement but can be built using Server-Sent Events, WebSockets, or Webhooks. The service can then notify the consumer as soon as the process is completed, without the consumer having to wait for the next polling interval.
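
The idea behind push can be shown with a simplified, in-process sketch. A registered callback stands in for a webhook URL or an open WebSocket connection, and the names `subscribe` and `complete_operation` are hypothetical. A real implementation would send the notification over the network instead.

```python
import threading

subscribers = {}  # operation ID -> callback; stands in for a webhook URL


def subscribe(operation_id, callback):
    """Consumer registers interest in the outcome of an operation."""
    subscribers[operation_id] = callback


def complete_operation(operation_id, result):
    """Called by the worker when processing finishes."""
    callback = subscribers.pop(operation_id, None)
    if callback is not None:
        callback(operation_id, result)  # push instead of waiting to be polled


received = []
done = threading.Event()


def on_completed(operation_id, result):
    received.append((operation_id, result))
    done.set()


subscribe("op-1", on_completed)
worker = threading.Thread(target=complete_operation, args=("op-1", "completed"))
worker.start()
notified = done.wait(timeout=2.0)  # the consumer is woken up, not polling
worker.join()
```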

Solution 2 - Mix Synchronous & Asynchronous

For this solution, we will mix synchronous and asynchronous execution. Steps that are important to complete before the response is returned to the consumer will be executed synchronously, such as updating the database. The other steps are executed asynchronously. This gives a nice balance between consistency and responsiveness.

Diagram showing mixing sync and async
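
A minimal sketch of this mix, assuming the database write is the important step and a confirmation e-mail is the step that can wait. The in-memory stand-ins and the names used are hypothetical.

```python
import queue
import threading

message_queue = queue.Queue()
orders_db = {}
sent_emails = []


def place_order(order_id, items):
    # Synchronous step: the order is stored before the response is returned.
    orders_db[order_id] = {"items": items}
    # Asynchronous step: the confirmation e-mail is delegated to a worker.
    message_queue.put({"type": "SendConfirmation", "order_id": order_id})
    return {"order_id": order_id, "items": items}


def worker():
    while True:
        message = message_queue.get()
        if message is None:  # sentinel used to stop the worker
            break
        sent_emails.append(message["order_id"])


response = place_order("order-1", ["book"])
order_visible_immediately = "order-1" in orders_db  # before any worker runs
emails_before_worker = list(sent_emails)

worker_thread = threading.Thread(target=worker)
worker_thread.start()
message_queue.put(None)
worker_thread.join()
```

The consumer can read the order back right after the response, while the e-mail is sent whenever the worker gets to it.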

Other Considerations

In upcoming articles, we will discuss additional considerations related to messaging, including managing transactions across services and infrastructure components, ensuring execution occurs only once, achieving complete rollback, etc.