Big Tech Coach

What is a Message Broker?

date

May 23, 2024

summary

Explore the critical role of message brokers in system design, facilitating scalable and reliable communication between distributed services to enhance system performance.

In the realm of system design, effectively managing communication between distributed services is a cornerstone for building scalable and resilient applications. Message brokers play a pivotal role in this context, providing a robust mechanism for decoupling services and facilitating asynchronous interactions. This introduction explores the integration of message brokers within system architectures, emphasizing their importance in achieving efficient and reliable service communication—critical for complex systems like file-sharing platforms and beyond.

Service-Service Communication

Up until now, we've operated on the presumption that our services somehow communicate with each other. Now, it's pivotal for us to delve deeper and understand how this communication is realistically implemented.

At its simplest, services could adopt the pull architecture, merely exposing APIs for mutual interaction.

While this serves as a foundational approach, as the multitude of services increases, leaning solely on REST APIs for system expansion and maintenance becomes arduous.

Impact of Sync Communication

One underlying challenge is the inherently synchronous nature of HTTP. In this protocol, the initiating party remains in limbo, awaiting the recipient's response. Such a waiting game brings its own set of predicaments.

Consider a scenario where the responding service crashes pre-response and remains inaccessible. Would it be acceptable for the initiating service to attempt reconnection thrice, only to subsequently flag an error?

This possibility paints a grim picture for our file-sharing system. Envision a situation where the notification service experiences downtime. A fresh data block gets recorded in the database, prompting the watch service to spring into action, relaying all checksums to the notification service. If the latter merely responds with an HTTP 500 server error message, that crucial information dissipates into thin air. This results in a disrupted synchronization flow, leaving clients oblivious to their outdated data.
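The failure mode described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `ServiceUnavailable`, `flaky_notify`, and `send_with_retries` are hypothetical names, and the failing function stands in for an HTTP call that keeps returning a 500 error.

```python
class ServiceUnavailable(Exception):
    """Raised when the downstream service cannot handle a request."""


calls = []


def flaky_notify(checksums):
    # Hypothetical stand-in for an HTTP call to a notification
    # service that is down and always answers with a 500 error.
    calls.append(checksums)
    raise ServiceUnavailable("HTTP 500")


def send_with_retries(send, payload, attempts=3):
    """Call `send` synchronously, retrying up to `attempts` times."""
    last_error = None
    for _ in range(attempts):
        try:
            return send(payload)
        except ServiceUnavailable as error:
            last_error = error
    # Every attempt failed. The caller can only report the error, and the
    # payload is lost unless the caller stores it somewhere itself.
    raise RuntimeError(f"gave up after {attempts} attempts") from last_error


try:
    send_with_retries(flaky_notify, ["abc123"])
    delivered = True
except RuntimeError:
    delivered = False
```

After three failed attempts the caller is left with an error and no place to park the checksums, which is exactly the gap a broker is meant to fill.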

Inserting a message broker into our system architecture emerges as a potential cure.

This addition helps decouple our services, making them less reliant on one another and enabling asynchronous communication.

So, what does asynchronous communication entail?

In such a communication pattern, the caller kicks off a process, not lingering for its culmination or even bothering about its eventual outcome.

Does this ring a bell? Absolutely! The push approaches we dissected in the preceding section, namely WebSockets and server-sent events (SSE), embody this asynchronous style.

Async Communication

Sync Communication Pattern

This style of message-centric communication not only boasts asynchronicity but also flips the data flow paradigm. Instead of a service instigating requests with specific intent, it proclaims a specific event occurrence, anticipating other services to instinctively respond.

Async Communication Pattern

Harnessing this concept internally paves the way for our services to function independently. As a result, the malfunction of isolated services doesn't wreak havoc on the entire system. Additionally, since services aren't idly waiting for responses, it mitigates latency issues. This is invaluable, especially when dealing with time-intensive tasks, such as synchronizing expansive files. With the foundational concept now evident, our next step involves delving deeper into the intricacies of the message broker, which orchestrates this asynchronous messaging dance.

Message Broker explained

A message broker stands out as a pivotal system component renowned for facilitating asynchronous modes of communication between services.

At its essence, a message broker operates as an autonomous server that bridges the communication gap between data producers and consumers. Here's the straightforward workflow: Producers channel their messages into the broker, and in turn, consumers tap into the broker to retrieve these messages.
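The producer and consumer workflow can be sketched with a tiny in-memory broker. This is a minimal illustration, not a real broker: `InMemoryBroker`, its method names, and the event payloads are assumptions made up for this example, and everything runs in a single process.

```python
import queue


class InMemoryBroker:
    """Toy broker: holds messages until a consumer retrieves them."""

    def __init__(self):
        self._messages = queue.Queue()

    def publish(self, message):
        # Returns immediately; the producer does not wait for a consumer.
        self._messages.put(message)

    def consume(self):
        """Return the next message, or None if nothing is waiting."""
        try:
            return self._messages.get_nowait()
        except queue.Empty:
            return None


broker = InMemoryBroker()

# The producer publishes while no consumer is connected.
broker.publish({"event": "block_stored", "checksum": "abc123"})
broker.publish({"event": "block_stored", "checksum": "def456"})

# A consumer connects later and drains whatever is waiting.
received = []
while (message := broker.consume()) is not None:
    received.append(message["checksum"])
```

The producer never references the consumer; both only know the broker, which is the decoupling the text describes.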

By centralizing communication between services through a distinct system component, the system can more readily accommodate clients that frequently connect, disconnect, or crash.

Reliability then becomes the broker's responsibility, and the broker can be distributed across multiple physical nodes, just as non-relational databases are.

The core premise of a message broker is to facilitate data transfer between various services, ensuring these services don't need to be aware of each other. This arrangement empowers the independent refactoring, deployment, and scaling of individual services without impacting the overarching system.

Most message brokers accommodate two distinct communication models: point-to-point messaging and pub/sub. We'll delve into each of these in the subsequent sections.

Point-to-point Messaging

In Point-to-point messaging, there is a one-to-one relationship between the sender and the receiver of the message. Each message is both sent and consumed once. This pattern is suitable when a certain action needs to be executed just a single time.

You might be wondering, “How does this differ from a REST API?” The distinction is clear. The sending service remains unaware of the receiver; it merely dispatches its data to the message broker asynchronously, without anticipating a response.

The message broker ensures the message isn't lost even if the consumer fails. Should that happen, the message is redelivered to another consumer for another attempt.

This message is securely housed within the message broker's queue. Owing to their foundational data structure, message brokers are also commonly termed "message queues".
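One common way to get this behavior is an acknowledgment scheme: a message stays with the broker until a consumer confirms it was processed. The sketch below assumes such a scheme; `PointToPointQueue` and its `ack`/`nack` methods are hypothetical names for illustration, not the API of any particular broker.

```python
from collections import deque


class PointToPointQueue:
    """Toy queue: each message is handed out until one consumer acks it."""

    def __init__(self):
        self._pending = deque()   # messages waiting for a consumer
        self._in_flight = {}      # delivered but not yet acknowledged
        self._next_id = 0

    def send(self, body):
        self._pending.append((self._next_id, body))
        self._next_id += 1

    def receive(self):
        """Hand the next message to a consumer, or return None if empty."""
        if not self._pending:
            return None
        message_id, body = self._pending.popleft()
        self._in_flight[message_id] = body
        return message_id, body

    def ack(self, message_id):
        # Processing succeeded: the broker can forget the message.
        del self._in_flight[message_id]

    def nack(self, message_id):
        # Processing failed: put the message back for another attempt.
        body = self._in_flight.pop(message_id)
        self._pending.appendleft((message_id, body))


work_queue = PointToPointQueue()
work_queue.send("sync file 42")

# The first consumer fails before finishing its work.
message_id, body = work_queue.receive()
work_queue.nack(message_id)

# A second consumer picks up the same message and succeeds.
message_id, body = work_queue.receive()
processed = body
work_queue.ack(message_id)
```

Once the message is acknowledged, the queue is empty again, so the work is carried out exactly once from the sender's point of view.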

Publish/subscribe

The publish-subscribe pattern, often referred to as "pub-sub," is the second communication method supported by message brokers.

This pattern varies slightly from the previous one. In pub-sub, there's a one-to-many relationship. The producer publishes a message to a specific topic. Subsequently, the message broker distributes it to all services subscribed to that topic. This method is especially useful for functions such as implementing notification systems or distributing independent tasks.
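The one-to-many fan-out can be sketched as follows. This is a simplified, in-process illustration: `PubSubBroker`, the topic names, and the handler lists are assumptions for the example, and a real broker would deliver over the network rather than by calling functions directly.

```python
from collections import defaultdict


class PubSubBroker:
    """Toy pub/sub broker: every subscriber of a topic gets each message."""

    def __init__(self):
        self._subscribers = defaultdict(list)  # topic -> list of handlers

    def subscribe(self, topic, handler):
        self._subscribers[topic].append(handler)

    def publish(self, topic, message):
        """Deliver `message` to all subscribers; return how many got it."""
        handlers = self._subscribers.get(topic, [])
        for handler in handlers:
            handler(message)
        return len(handlers)


broker = PubSubBroker()
email_inbox, push_inbox = [], []

# Two independent services subscribe to the same topic.
broker.subscribe("file.changed", email_inbox.append)
broker.subscribe("file.changed", push_inbox.append)

delivered = broker.publish("file.changed", "report.pdf updated")
ignored = broker.publish("file.deleted", "old.txt removed")  # no subscribers
```

The publisher only names a topic. Adding a third subscriber would require no change on the publishing side.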

Message brokers vs. Databases

It might not be evident at first glance, but message brokers persist messages until consumers have processed them. This essentially makes a message broker a specialized kind of database, optimized for handling message streams.

While certain message brokers retain messages only in memory, others record them to disk, ensuring their retention even in the event of a broker crash. Some brokers, striving for state consistency across nodes, employ two-phase commit protocols akin to ACID databases. These characteristics make them quite analogous to databases.

However, when conceptualizing them, it's helpful to consider message brokers as highly specialized NoSQL databases with a few distinct attributes:

Traditional databases generally retain data until deliberately deleted. In contrast, many message brokers automatically expunge a message once it's successfully relayed to its consumers. As a result, such brokers aren't ideal for long-term data storage.

Predicated on the rapid deletion of messages, most message brokers operate under the assumption of a limited working set—short queues, in essence. An overflow of buffered messages due to sluggish consumers can extend the processing time for each message, potentially diminishing the overall throughput.
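To make the short-queue assumption concrete, here is a minimal sketch using Python's standard `queue.Queue` with a hypothetical cap of three messages. The cap and the message names are assumptions for illustration only; real brokers differ in whether and how they limit a backlog.

```python
import queue

# Hypothetical cap: the buffer holds at most three unprocessed messages.
buffer = queue.Queue(maxsize=3)

accepted, rejected = [], []
for n in range(5):
    try:
        buffer.put_nowait(f"message-{n}")
        accepted.append(n)
    except queue.Full:
        # The consumer is too slow, so the backlog has hit the cap.
        rejected.append(n)

# Once the consumer finally processes one message, a slot frees up
# and the producer can enqueue again.
buffer.get_nowait()
buffer.put_nowait("message-5")
```

With a slow consumer the backlog hits the cap after three messages. How a producer should react at that point (block, drop, or retry later) is a design decision this sketch leaves open.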

In conventional databases, query results usually reflect a snapshot of the data at a specific point in time. If another client later changes the data in a way that affects the query result, the first client is not told that its result is outdated (unless it repeats the query or polls). Message brokers, by contrast, do not support such queries, but they actively notify clients when data changes, that is, when new messages become available.

Popular Implementations

The landscape of open-source message brokers is led by several notable entities, namely RabbitMQ, Kafka, and Redis. Although each one is potent and efficient in its own right, the subtleties in their operational nuances become particularly discernible when applied to extensive scales and intricate use cases. Let’s briefly delve into their distinctive features and capabilities.

RabbitMQ

RabbitMQ supports advanced and complex routing options. Messages are not sent to queues directly but published to exchanges, which then distribute message copies to queues based on custom rule sets.
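The idea of routing through an exchange can be modeled in plain Python. The sketch below is a toy model of a direct-style exchange that matches on an exact routing key. It does not use RabbitMQ's client API, and `DirectExchange`, the queue names, and the routing keys are all made up for illustration.

```python
from collections import defaultdict, deque


class DirectExchange:
    """Toy exchange: copies a message to every queue bound to its key."""

    def __init__(self):
        self._bindings = defaultdict(list)  # routing key -> bound queues

    def bind(self, target_queue, routing_key):
        self._bindings[routing_key].append(target_queue)

    def publish(self, routing_key, message):
        # The producer addresses the exchange, never a queue directly.
        for target_queue in self._bindings.get(routing_key, []):
            target_queue.append(message)


email_queue = deque()
audit_queue = deque()

exchange = DirectExchange()
exchange.bind(email_queue, "file.shared")
exchange.bind(audit_queue, "file.shared")
exchange.bind(audit_queue, "file.deleted")

exchange.publish("file.shared", "report.pdf shared")
exchange.publish("file.deleted", "old.txt deleted")
```

Here the bindings play the role of the rule set: the email queue only sees share events, while the audit queue receives a copy of both.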

Kafka

Kafka was originally created to track website activity, which required handling massive loads over long periods of time. That is what Kafka is good at. It can even be used for streaming data to storage systems.

Redis

Redis is not a classic message broker; it is primarily an in-memory data store. Even so, it is often used as a message broker. If your system requires an extremely fast broker and the lack of data durability is not a showstopper, Redis can be a good choice.

Azure Service Bus & Amazon MQ

Venturing beyond open-source implementations, several cloud providers proffer managed services, alleviating the burden of setup and maintenance from your operational responsibilities.

Azure Service Bus is Microsoft's enterprise message broker with message queues and publish-subscribe topics.

Amazon MQ is AWS's managed message broker service for RabbitMQ and Apache ActiveMQ.

Cloud Tasks and Cloud Pub/Sub are Google's solutions, which can be used to implement message passing and asynchronous integration.

All three reduce your operational responsibilities by handling the setup and maintenance of a message broker for you.

Let's conclude by summarizing the advantages and disadvantages of Message Brokers.

Summary

Advantages

Producers can send messages even if the consumer isn't available. As long as the message broker is operational, no data is lost.

Message Brokers facilitate asynchronous processing, enhancing system performance. They also enable services to be decoupled, simplifying development, maintenance, and scaling.

Disadvantages

It's also important to be aware of the disadvantages of message brokers:

Limited Use Case - They're not suited for scenarios like login and purchase where the calling service expects a direct response.

Increased System Complexity - Integrating a message broker introduces additional complexity, meaning more components require oversight to ensure the system functions properly.

Throughput Degradation - If the broker gets overloaded with buffered messages due to slow consumers, the time taken to process each individual message might increase, reducing overall throughput.
