Choosing the Right Messaging Solution for Your Architecture
About this video
This session was presented at NDC Melbourne 2025.
Asynchronous messaging is becoming increasingly popular in the architecture of distributed systems. And why not, when you have the benefits of increased scalability, fault-tolerance, performance, and decoupling? From ordering your latest gadget to dealing with mission-critical, high-performance systems—messaging is gaining traction.
A message queue is the most critical infrastructure in distributed systems that use messaging. There are plenty of choices, ranging from databases to cloud solutions. But how do you decide what is the best fit for your organization and your system?
Join me as I discuss some of the most popular queuing systems - benefits, trade-offs and gotchas, with code samples. By the end of this session, attendees will have a clear understanding of each system’s strengths and weaknesses and will be equipped with the knowledge to make informed decisions on the best messaging solution for their specific application architecture.
Transcription
- 00:00:04 Poornima Nayar
- Good to go? Okay. So thank you for being here, everyone. I'm going to dive straight into the subject. So imagine you are building, or trying to bring messaging-based architecture into, one of your projects, a high-throughput, high-performance system. You have been researching quite heavily into what messaging involves, what messaging architecture is, and you've understood the ins and outs and you are convinced that this is the path you want to take. And when you embark on that journey, one of the first things you want to do is decide on the message broker or message queue. That's the first thing that you need technically to start implementing your system. So on that journey, what are the tools that you might come across? What are the things that you will learn? What could the conclusions be? And what are the questions that you might ask yourself if you want to consider a message queue in your system?
- 00:00:59 Poornima Nayar
- So these are the questions that I'll try to answer with my session today. I am Poornima Nayar, I'm a software engineer and solutions architect at Particular Software. You might have seen our booth by room number one, so feel free to come to us and talk all things messaging, because we build NServiceBus and the entire service platform around NServiceBus. I have traveled from the northern hemisphere to be here with all of you, so thank you once again for being in my audience, and thank you to NDC Melbourne for having me as a speaker. I am on LinkedIn, X, and Bluesky as Poornima Nayar, so feel free to connect with me there, of course. If you have any questions, save them for the end of the session, because I'll be at the booth trying to answer your questions, and we have some cool books on the fallacies of distributed computing to give away, along with some cool stickers.
- 00:01:47 Poornima Nayar
- One of the questions that you might get on your messaging architecture journey, from your team, maybe from other teams across the organization, is: why are you bringing in messaging? Why not HTTP? Because you have been using gRPC or GraphQL or a RESTful API. Why? The short answer is temporal coupling. Think of old-school telephone communication, where both parties have to be up at the same time, without which the communication cannot happen. That is temporal coupling for you. Two services can talk to each other only if they are both up and running at the same time. That is how HTTP works. It's synchronous processing. HTTP cannot do async processing. And with HTTP, we are reliant on the HTTP toolkit, the first one being request-response, where the client sends a request to the server and the server responds back with an answer.
- 00:02:45 Poornima Nayar
- And what if the server goes down? The request never reaches the server. What if the response dies on the way back? No answer. So imagine you're trying to build your high-performance, high-throughput system connecting these services in a chain: one service failure means there's cascading failure. It blows up in the face of the user, which might not be something that you can tolerate in your system. So temporal coupling and the request-response pattern are the two things that we try and avoid with message queues and messaging architecture, because your request gets stored durably in a message queue or a message broker. So if I am talking to Mike, who's my friend here, I do not say, "Hi Mike." Instead of that, I go and put my "Hi Mike" in a message queue, and Mike takes that "Hi Mike" from the message queue and processes it. Yeah? So good to go.
- 00:03:41 Poornima Nayar
- And when it comes to the tools that are out there, you are spoiled for choice. I am pretty proud of this picture that I took yesterday. These are lamington cakes. I tried lamington cakes for the first time. I love them and I love the colors as well, so I'm putting it up there and you will have to tolerate it. But I don't have lamington cakes for you. Instead, I have Azure Service Bus, SQS, and RabbitMQ, the three most popular choices out there. But just go and have a look for queuing systems out there: there's ActiveMQ, there's IBM MQ, ZeroMQ, even Kafka, which is kind of pushed into that slot as a queuing system. But more of that at the booth. Come find us. These are all message queues and message brokers, and they all share some similar patterns: send-receive, publish-subscribe, dead letter queues, message ordering, and delivery guarantees. These are features and patterns that you see over and over.
- 00:04:45 Poornima Nayar
- So think of it like two variants of an iPhone. One has some features; maybe the more expensive model has more features. It's the same pattern with message queues. So wherever possible, I will highlight the similar patterns and then give you the standout features or gotchas that you might need to watch out for. There are some standout features as well; I cannot go back to the UK without discussing those. And we'll start with SQS, which is a simple and reliable fully managed queue that requires minimal maintenance. Why? Because all you're doing is provisioning it on the cloud in AWS, and you are just managing the queues, not the infrastructure itself.
- 00:05:33 Poornima Nayar
- One of the key things with SQS is it can auto-scale on demand, so there's a huge number of messages that you can store within SQS. Oh my god, it's hot in here and the light is directly on me. I'll try not to be a pool of butter by the end of the session. And being a cloud offering, what you would expect is integrations with other cloud offerings, and that's exactly what SQS gives you. The first one, or the only one, I want to highlight here is AWS Lambda. You can use all of these, say CloudWatch for monitoring, Step Functions, but once you start using these offerings with SQS, what you have is a cloud-native application on SQS. With AWS Lambda, you can go serverless as well. So all good.
- 00:06:24 Poornima Nayar
- Talking about message size, you are usually looking at smaller payloads with messaging, and with SQS you can have a message size of up to 256 kilobytes, but an unlimited number of messages can be stored in the queue. Remember auto-scaling? That's where this kicks in. And if you really, really want to have large messages as a part of your architecture, you can go for S3 buckets. There's native integration there. So in this case, your message itself gets stored in the S3 bucket, and the queue holds a pointer to that message in the S3 bucket. You can have batching as well. You can batch up to 10 messages in a single batch. That's the upper limit. And you can do that with sending, receiving, or even deleting. But the total batch size cannot surpass 256 kilobytes either. So that's a gotcha there, if you are into batching.
- 00:07:22 Poornima Nayar
- Of course, being on the cloud, there is a cost involved. The first 1 million requests a month are free, and you need to remember that every SQS action is a request. Receive, send, delete: all are requests, and even if you use the AWS console to poll for messages, that is also a receive action that costs money. And if you have a 256-kilobyte message, that is four different requests, because each 64-kilobyte chunk of payload is billed as a single request. One of the features of SQS that I want to start talking about is standard queues. Very useful, because they give you a very high, nearly unlimited number of API calls. Have you heard of Hermione talking to Nearly Headless Nick, the nearly headless ghost? That's what this makes me think of. Nearly unlimited throughput. I don't know what that means, but I'm just going to go by the documentation. Nearly unlimited throughput.
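The billing arithmetic above is easy to sanity-check. A minimal sketch of the 64 KB chunk rule mentioned in the talk (the function name is mine, purely for illustration):

```python
import math

# SQS bills each 64 KB chunk of a request payload as one request.
CHUNK_SIZE = 64 * 1024

def billed_requests(payload_bytes):
    """Number of billable requests a single SQS action counts as."""
    return max(1, math.ceil(payload_bytes / CHUNK_SIZE))

print(billed_requests(10 * 1024))    # 1: a small message is one request
print(billed_requests(256 * 1024))   # 4: a maximum-size message is four requests
```

So a maximum-size 256 KB message burns through the free tier four times as fast as a small one.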
- 00:08:21 Poornima Nayar
- But you have the concept of distributed queues. So what you see as a queue is like a virtual concept. Behind the scenes, a copy of your message gets stored on multiple AWS servers, so that you get durability and redundancy. As for message delivery, it's best-effort ordered delivery. Standard queues are not for ordered delivery, but most of the time it works okay, unless you need ordered delivery. And what you have is an at-least-once delivery guarantee, which means that your messages can get delivered more than once, which means that at the receiving side, your code needs to be idempotent. Come talk to us again, we can explain more about that. A talk for a different day, trust me. We can have two-hour workshops. Actually, my colleagues Szymon and Tomek have a two-day workshop on idempotency and delivery guarantees. That's how vast the subject is.
- 00:09:23 Poornima Nayar
- So the first pattern we'll see is send and receive. This is me talking to Mike using a message queue. Simple communication, the message getting stored durably. But there can be many, many versions of Poornima and many, many versions of Mike, all of us trying to send messages into the message queue, and many versions of Mike trying to get the messages out of the message queue. Horizontal scaling of services. But SQS makes sure that only one version of Mike can actually consume a given message at a time. That is taken care of for you. So this is point-to-point communication, and the underlying message itself is kind of an instruction. It's not just "Hey Mike"; it's "Hey Mike, do something for me." So it's a command, an instruction, usually a very high-value message.
- 00:10:12 Poornima Nayar
- And to see some code in action, we start off with the SQS client, creating an object of type AmazonSQSClient using the access key and a secret, which can be set up using an IAM user. Again, a vast topic, but just understand that this is needed. And to send a message, I use the SQS client, and using the SendMessageAsync API that you see there, I can send the message. The message is of type SendMessageRequest, with a queue URL, which is obtained from the console, and the message body. In my case, the message body is a serialized JSON object, but you can have plain text or XML if needed. Now, there are a lot of slides with a lot of code like this. All of it can be found in a GitHub repo, which I will share at the end. So create your own resources, connect to the GitHub repo, and you can get working and playing with the code. And at the end of the slides, I'll share the QR code. The repo also contains a comparison table of the three systems that I'm going to talk about today.
- 00:11:22 Poornima Nayar
- So that's the SendMessageAsync API. And on the receiving side, again, we start with the SQS client and we start polling for messages using the ReceiveMessageAsync API; that is a ReceiveMessageRequest there. In this case, I'm only going with very simple defaults in place, with the queue URL. And what I get back after polling is a response with maybe zero messages and a maximum of one message, unless and until I specify batching. So once I have got my response, I can loop through the messages and then process them, as I've shown in lines 12 and 13, but the most important part is to make sure that you issue a delete command back to the queue, saying that I have processed the message, now you can safely remove it, and that requires a receipt handle. The receipt handle is associated with the receive operation itself. Remember, you can receive the same message multiple times. Every time you receive a message, the receipt handle varies, so you can't cache it. You need to use the correct receipt handle to go and delete it.
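The receipt-handle behaviour described above can be sketched with a toy in-memory queue. This is an illustrative model, not the AWS SDK; all the names are mine:

```python
import uuid

class ToyQueue:
    """Toy in-memory model of SQS receive/delete semantics (not the AWS SDK)."""
    def __init__(self):
        self._messages = {}   # message_id -> body
        self._handles = {}    # receipt_handle -> message_id

    def send(self, body):
        message_id = str(uuid.uuid4())
        self._messages[message_id] = body
        return message_id

    def receive(self):
        # A fresh receipt handle is issued on every receive, even for the
        # same message, which is why handles must not be cached.
        for message_id, body in self._messages.items():
            handle = str(uuid.uuid4())
            self._handles[handle] = message_id
            return body, handle
        return None

    def delete(self, receipt_handle):
        # Deleting requires the handle from the latest receive operation.
        message_id = self._handles.pop(receipt_handle, None)
        if message_id is not None:
            self._messages.pop(message_id, None)

queue = ToyQueue()
queue.send("Hi Mike")
_, first_handle = queue.receive()
_, second_handle = queue.receive()      # redelivery of the same message
print(first_handle != second_handle)    # True: a new handle per receive
queue.delete(second_handle)
print(queue.receive())                  # None: the message is gone
```

The point of the sketch: deleting is a separate, explicit step, and it is tied to the most recent receive, not to the message itself.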
- 00:12:30 Poornima Nayar
- This is a very, very important step, because if you forget this, the message goes back into the queue, gets re-delivered again, and at one point it will dead-letter. But more on that later. You can also have attributes sent as part of your message, which are called message attributes: a dictionary of key-value pairs, with string as the key type and MessageAttributeValue as the dictionary value type. And you can send that alongside the message body with the message attributes property. Now, there is a gotcha here. You can only have up to 10 attributes on a single message. On the receiving side is the biggest gotcha, which wasted one hour of my time. Don't be Poornima, because you need to ask explicitly for the message attribute names. So here I have said I need all of them, but you can list just the ones you want as well. Once you have put that in place, you can receive the attributes alongside your message and then use them in processing.
- 00:13:31 Poornima Nayar
- One way to use message attributes would be to route to different handlers based on what is in a particular message attribute value, but then you would be building a mediator yourself. Don't do that. Don't reinvent the wheel. One of the standout features of SQS which can help you cut the cost down is long polling and short polling. Short polling is the default. Think about the distributed queues that we have with SQS. You get an immediate response when you use short polling because it samples a subset of the servers, which means that you have a responsive application, but you might get empty responses because there are no messages in the distributed queues.
- 00:14:14 Poornima Nayar
- You can also have false empty responses. Why? Because you are sampling a subset of servers. SQS samples a subset of, say, five servers. We don't have control over that. The five servers did not have any messages, but the sixth server did, and that was not returned to me. That message eventually gets picked up, but you have to keep trying, and remember, each receive operation is a request. So you're burning through that 1 million requests pretty quickly. Long polling is what you might want to use if you can afford a less responsive application, because it queries all the servers and waits for a certain period of time before it gets back to you with a response; it can go up to 20 seconds without responding to you. It waits until it has at least one message to respond back with, which means that you have fewer empty responses, fewer false empty responses, and it can reduce costs.
- 00:15:10 Poornima Nayar
- And all of this can be set at the queue level in the console at creation time. You can do that with the management APIs as well. But if you want to do it via code, the first option shows you the default. The default uses short polling, with WaitTimeSeconds set to zero or just not specified. If you have WaitTimeSeconds set to anything from 1 to 20, that is long polling for you. It waits for that amount of time before it tries to return a response back to you. So it can be useful. And this is a standout feature as well. You have some control over how you poll for messages.
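The rule above can be captured in a few lines (the function name is mine; the 0-to-20-second range is from the talk):

```python
def polling_mode(wait_time_seconds=0):
    """Classify an SQS receive call by its WaitTimeSeconds value (0 to 20)."""
    if not 0 <= wait_time_seconds <= 20:
        raise ValueError("WaitTimeSeconds must be between 0 and 20")
    return "short" if wait_time_seconds == 0 else "long"

print(polling_mode())      # short: the default, immediate (possibly empty) response
print(polling_mode(20))    # long: waits up to 20 seconds for a message to arrive
```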
- 00:15:51 Poornima Nayar
- Next is visibility timeout. This is a pattern you will see under different names in Azure Service Bus and RabbitMQ. Basically, the idea is to protect your messages against unhealthy receivers that just cannot process the message, or that hold onto the message without giving it back to the queue. Once the message has been delivered to a consumer, it remains hidden for a set amount of time, during which the message is not visible to any other consumers for processing. The idea is that within that visibility timeout, you need to process the message and also delete it from SQS. Otherwise, again, the message goes back into the queue, becomes visible, and gets re-delivered.
- 00:16:41 Poornima Nayar
- So this again can be set at the queue level, but you can also adjust it per receive, using the visibility timeout option here in the ReceiveMessageRequest. The default is 30 seconds; you can go up to 12 hours. And if you have a legitimate consumer that wants to keep track of how far the message has been processed, and you want to renew that period periodically, you can use the ChangeMessageVisibilityAsync method to do that. So you would implement a heartbeat mechanism and keep renewing the timeout. But even then, you can only go up to 12 hours, not beyond that.
- 00:17:26 Poornima Nayar
- So as a best practice, you have to make sure that you adjust your visibility timeout based on the actual time it takes to process a message. And remember, a long visibility timeout can cause delayed retries. Say you actually need 15 seconds to process a message, the consumer dies midway, and the visibility timeout you have set is one minute. The message won't become visible in the queue immediately; it'll take that entire minute to become visible. So you can have delays if you don't judge the visibility timeout properly. And what happens when things go a bit sideways? There can be messages that just cannot be processed, and that is where dead letter queues come in. It's a very similar, repeating pattern that you see across the board. And in SQS, the only way a message can hit the dead letter queue, as far as I understand, is by exceeding the maximum receive count.
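The delay described above, where a crashed consumer's message stays hidden for the full timeout, can be illustrated with a toy model using a simulated clock (illustrative only, not the real SDK; the names are mine):

```python
class ToyVisibility:
    """Toy model of the SQS visibility timeout, with a simulated clock."""
    def __init__(self, visibility_timeout):
        self.visibility_timeout = visibility_timeout
        self.invisible_until = 0   # simulated time at which the message reappears
        self.deleted = False

    def receive(self, now):
        """Deliver the message if it is visible at simulated time `now`."""
        if self.deleted or now < self.invisible_until:
            return False
        self.invisible_until = now + self.visibility_timeout
        return True

    def delete(self):
        self.deleted = True

# Processing actually takes ~15 s, but the timeout is set to 60 s.
message = ToyVisibility(visibility_timeout=60)
message.receive(now=0)           # delivered; hidden until t=60
# The consumer crashes at t=15 without deleting the message...
print(message.receive(now=16))   # False: still hidden from everyone
print(message.receive(now=60))   # True: visible again only after the full minute
```

A healthy consumer would call `delete()` well before `invisible_until`; the oversized timeout only hurts when processing fails.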
- 00:18:25 Poornima Nayar
- So whenever a message is received by a consumer (not processed, received), there is an attribute called the receive count that gets incremented. So imagine you are exceeding the visibility timeout period, or say you are forgetting to delete the message: the message goes back into the queue and gets re-delivered. So it can go around a few times, incrementing that receive count. At one point it exceeds the maximum receive count, and then it automatically gets moved into the DL queue. But note that DL queues are not created by default. You have to create them using the console or the management API and then configure them. So they can sometimes be forgotten if you're not careful. And one of the other things is that you cannot dead-letter a message explicitly. The receiver cannot say, "I am lazy, I cannot process this." That's not going to happen. It makes the receiver do its work. So that's the mum in me making my child do her work properly.
- 00:19:27 Poornima Nayar
- But if I have to tell you how to configure a DL queue... I'll try explaining it. So I am creating a queue to act as the DL queue for my standard queue, and I'm saying here that the source queue can use this queue as its DL queue. So a DL queue is just a standard queue. And then you go to the standard queue and say that this is the DL queue for this queue, and then specify the maximum receive count, which is five. So on the sixth delivery, it'll dead-letter. And then, once you have configured all of this, you can go to the dead letter queue and start redriving the messages; that is, replaying your messages back to the source queue. Which means that in the source queue, the messages can end up in jumbled order: there could be senders sending new messages interleaved with the replayed messages coming out of the DL queue.
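The maximum-receive-count rule can be sketched as a one-liner (illustrative only; in reality this decision happens inside SQS, and the function name is mine):

```python
def deliver(receive_count, max_receive_count=5):
    """Where a delivery attempt lands, given the message's receive count."""
    # A message dead-letters once its receive count exceeds the maximum.
    return "dlq" if receive_count > max_receive_count else "source-queue"

for attempt in range(1, 7):
    print(attempt, deliver(attempt))
# attempts 1 to 5 stay on the source queue; the 6th delivery dead-letters
```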
- 00:20:22 Poornima Nayar
- And one of the biggest gotchas is that if you have expired messages, as far as I understand, they cannot be DL-queued. They just sit in the SQS queue and then evaporate into thin air. So that is all about SQS standard queues. You might also want to look into FIFO queues, which are all about first-in, first-out message delivery. Everything that we spoke about with SQS standard queues, like long polling, short polling, visibility timeout, dead-lettering, all of that matters here, but FIFO focuses on exactly-once delivery. So messages are always delivered exactly once. Retries do not disrupt order. And there's message deduplication baked in, using a deduplication ID that you can provide, or you can enable content-based deduplication.
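Content-based deduplication can be sketched as hashing the body and rejecting repeats. This is a toy model; real FIFO queues scope deduplication to a five-minute window, which is omitted here, and all names are mine:

```python
import hashlib

class ToyFifoDedup:
    """Toy model of FIFO content-based deduplication (not the real SDK)."""
    def __init__(self):
        self._seen = set()
        self.accepted = []

    def send(self, body, dedup_id=None):
        # With content-based dedup enabled, the deduplication ID defaults
        # to a hash of the message body.
        key = dedup_id or hashlib.sha256(body.encode()).hexdigest()
        if key in self._seen:
            return False          # duplicate: the retry is swallowed
        self._seen.add(key)
        self.accepted.append(body)
        return True

queue = ToyFifoDedup()
print(queue.send("order-42 placed"))   # True: first copy is enqueued
print(queue.send("order-42 placed"))   # False: the retry is deduplicated
```

Providing an explicit deduplication ID lets two messages with identical bodies both go through, or two different bodies be treated as duplicates.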
- 00:21:14 Poornima Nayar
- And in a FIFO queue, the delivery order and the processing order are strictly maintained. So this is going from zero to a hundred in two seconds; this is a two-hour topic in itself, because it gets so complex, with a lot of edge-case scenarios. Next is publish-subscribe: your event-driven architecture, or the basics of it. Me talking to you, this broadcast, is me publishing the event, that is my talk, and you all are subscribers because you're listening to me. So this is all about something that has happened in the past, a fact that cannot be changed. Something has happened, and that is an event. So that is publish-subscribe. I am publishing and you are all subscribing to my talk. Happy days again: SQS is just a queue; it does not have pub-sub. If you want to bring in pub-sub using AWS, you need to look into the Simple Notification Service, SNS. And with this in place, you have a service that publishes to the SNS topic, and you can bring in SQS queues, standard queues or FIFO queues, and plug them into the SNS topic as a subscription.
- 00:22:23 Poornima Nayar
- And from there on, the topic pushes out messages to the standard or FIFO queues, and then everything that we spoke about for queues kicks in: the visibility timeout, the long polling, short polling, all of that. So again, some code in action. We start with the Simple Notification Service client, and the API is the PublishAsync API, so you're publishing a message to the topic. I am using a PublishRequest object which has a topic ARN and a message. Here I'm using a simple plain-text message, but it can be JSON if needed. And on the receiving side, there are two steps. First, I have to bring in a queue and say that I want to subscribe to the topic. That is done using the SubscribeAsync API. So in here I start off with the SNS client, then create a new SubscribeRequest specifying the topic ARN and the endpoint.
- 00:23:19 Poornima Nayar
- So SNS can have many, many endpoints, like emails and SMSes, and SQS is treated as one of those endpoints. So you can bring in that endpoint and connect to it, and using the SubscribeAsync API method here, I can subscribe the queue to the topic. The gotcha here is that the endpoint you see there is the ARN of a queue. That queue needs to be created beforehand, before you start subscribing to the topic. And from there on, the receive side is all what we have discussed; nothing new there. Receive from the queue, read the message, process it, and delete it. So this is like a broadcast, like you all listening to me at this moment. What if you don't want to hear about Azure Service Bus? You close your ears or you go out. Don't go out, by the way.
- 00:24:11 Poornima Nayar
- So in that case, if you want to select a subset of messages into the subscription, you can bring in what are known as filter policies. So in here I have defined a variable called filterPolicy, which has some JSON inside it, which says my attribute int and value five. So how it works is, as a part of the subscribe request, I can pass in attributes and say I have a FilterPolicy set to that filter policy, and a FilterPolicyScope set to MessageAttributes. So what then happens is, whenever there's a message arriving at the topic, it looks at the filter policy here and understands, "Oh, there's a filter policy and the scope is set to message attributes," so it goes and looks in the attributes of the message, checks whether there's an attribute called my attribute int, checks whether its value is five, and only then selects that message into the subscription. So you're being selective about what you get.
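The matching behaviour just described can be sketched like this. It is a simplified matcher covering exact values only; real SNS filter policies also support operators like prefix, numeric-range, and anything-but:

```python
import json

def matches_filter_policy(policy, message_attributes):
    """Simplified SNS filter-policy check: exact-value matching only."""
    for attribute_name, allowed_values in policy.items():
        if message_attributes.get(attribute_name) not in allowed_values:
            return False
    return True

policy = json.loads('{"my_attribute_int": [5]}')
print(matches_filter_policy(policy, {"my_attribute_int": 5}))   # True: selected
print(matches_filter_policy(policy, {"my_attribute_int": 7}))   # False: filtered out
print(matches_filter_policy(policy, {}))                        # False: attribute missing
```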
- 00:25:07 Poornima Nayar
- With SNS, a standard feature is that you can not only have a message-attribute-based filter policy, you can also apply it to the message body, provided your message body is JSON. So that is filters. This is a repeating pattern which you'll see elsewhere as well. Another gotcha is about raw message delivery, because SNS, when it pushes out messages into SQS, wraps them in its own payload. So ensure that raw message delivery is set to true if you just want your message. That is the attribute there. Without this, this is the payload that arrives at SQS, and your message is buried in here. So you'll be digging deep into the message body that lands in SQS.
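The wrapping can be illustrated by unwrapping a mocked-up envelope. The field names follow the SNS notification format; the IDs and ARN below are made up for illustration, and `unwrap` is my own helper:

```python
import json

# Shape of the SNS envelope that lands on SQS when raw message delivery is off.
envelope = json.dumps({
    "Type": "Notification",
    "MessageId": "11111111-2222-3333-4444-555555555555",
    "TopicArn": "arn:aws:sns:ap-southeast-2:123456789012:my-topic",
    "Message": "{\"orderId\": 42}",
    "Timestamp": "2025-01-01T00:00:00.000Z",
})

def unwrap(sqs_body):
    """Dig the actual payload out of an SNS envelope, if there is one."""
    outer = json.loads(sqs_body)
    if isinstance(outer, dict) and outer.get("Type") == "Notification":
        return outer["Message"]
    return sqs_body   # raw message delivery: the body already is the payload

print(unwrap(envelope))            # {"orderId": 42}
print(unwrap('{"orderId": 42}'))   # raw delivery passes straight through
```

With raw message delivery enabled, the consumer is spared this unwrapping step entirely.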
- 00:25:55 Poornima Nayar
- The biggest gotcha that wasted my time was making sure that SNS has the permissions to publish to the SQS queues. If you don't do that, no message will arrive. And if you have DL queues as part of your SNS setup, SNS can put a message into the DL queue directly, say if there are no permissions here, but then again, it needs the permissions set up on the DL queue itself so that it can publish the message to it. So there's a lot of management going on. And SNS can talk to FIFO queues as well. So think about it: I could stand here and talk for a day, and I don't think I'd be done with SQS.
- 00:26:33 Poornima Nayar
- But I'm going to move on, because I need to talk to you about Azure Service Bus as well. And this is my favorite because, so far, it does what it says on the tin. And this is all about being a fully managed enterprise message broker. So that's a new term. All we have been talking about is queues. This is a broker because it can do clever things, things like routing... Yeah?
- 00:26:55 Speaker 2
- Just a quick question: if you have a local queue, what would you use for development?
- 00:27:09 Poornima Nayar
- Okay. The question is: for local development, what would you use? At this point, at least from what we see with our customers, they use RabbitMQ for local development, and then they move on to other things. That works because we have a kind of same-language API layer, but you might want to rethink that approach if you're using native clients. But more on that later. Again, come to the booth; we can discuss this further. This is a message broker because it can do clever things like message transformation, for example, and routing, and there's transactionality. So this is a message broker; you have much bigger functionality. It is a PaaS offering in Azure, and just like AWS, you will find integration with other Azure services. I think my favorite one is Azure Functions, because that's something I've used in the past and I like it. So hence it's here.
- 00:28:09 Poornima Nayar
- You have some local development experience support these days using the Service Bus emulator, which is a newer tool, but I don't think it is entirely free; you may still have to pay a little bit for it. And there's always the Service Bus Explorer to administer messaging entities. And of course there's a cost to Azure Service Bus. With the standard tier, which is meant for development usage, you might pay around $10 for the month, and that makes use of shared capacity behind the scenes. With the premium tier, which is recommended for production workloads, you enjoy dedicated capacity, but there's a bigger cost involved. Please do not use the standard tier in production. And if I'm to go by what Clemens Vasters has said in his talks, the standard tier uses SQL Azure for persisting all the messages. With the premium tier, it is something else. Don't ask me what that something else is; I have no clue.
- 00:29:04 Poornima Nayar
- Send and receive: we saw the pattern, but there are two things I want to talk about with send and receive in Azure Service Bus, which are the modes in which you can do the send and receive. The first one is the destructive read. Use this if you can tolerate message loss. This is called the receive-and-delete mode, because here your messages are marked as consumed and then sent to the consumer. So if your consumer restarts or crashes while processing, the message is lost from the queue. So here we have at-most-once delivery. So that is a third delivery guarantee. Normally, message brokers and message queues go for at least once, because that gets rid of message loss as much as possible. The second mode, which is the recommended mode, is the peek-lock mode, or non-destructive read. Here the message is locked and then sent to the consumer, again for a set period of time, like the visibility timeout period.
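The difference between the two modes under a consumer crash can be sketched with a toy model (illustrative only, not the Azure SDK; all names are mine):

```python
class ToyServiceBusQueue:
    """Toy model contrasting the two receive modes (not the Azure SDK)."""
    def __init__(self, body):
        self._body = body
        self._locked = False

    def receive_and_delete(self):
        # Destructive read: marked consumed before the consumer processes it.
        body, self._body = self._body, None
        return body

    def peek_lock(self):
        # Non-destructive read: the message stays on the queue, just locked.
        self._locked = True
        return self._body

    def abandon(self):
        # Crash or lock expiry: the message simply becomes available again.
        self._locked = False

    def complete(self):
        # Successful processing: only now is the message removed.
        self._body = None

# Receive-and-delete: a crash mid-processing loses the message.
queue = ToyServiceBusQueue("Hi Mike")
queue.receive_and_delete()
print(queue._body)   # None: gone, even though processing never finished

# Peek-lock: a crash just releases the lock; the message survives.
queue = ToyServiceBusQueue("Hi Mike")
queue.peek_lock()
queue.abandon()
print(queue._body)   # Hi Mike: still there for redelivery
```

This is why peek-lock gives at-least-once delivery and receive-and-delete gives at-most-once.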
- 00:30:06 Poornima Nayar
- And the expectation, again, is that the consumer processes the message and then marks it as complete (not deleted) and sends that request to Azure Service Bus, and that needs to happen within what is known as the peek-lock timeout. So it's called the peek-lock timeout here, and the visibility timeout period in SQS. Here we have at-least-once delivery, and this is usually the recommended mode, because your message is safer, which is what we want. Looking at the code, you have ServiceBusSender, which is the type of object I want to start with, and then in lines 12 and 13, I can use the SendMessageAsync method to send my message, which is of the ServiceBusMessage type, which has my message body. So it can be a JSON-serialized object as well. No problem. I'm going for something simpler here. But note that it is SendMessageAsync here, because we will come back to it later.
- 00:31:02 Poornima Nayar
- And on the receiving side, we start with the ServiceBusProcessor, and the processor class has event handlers which you can implement: one for processing a message, and an event handler to process an error as well, which it is more or less mandatory to implement in some form. And looking at the implementation of the event handler itself, you get your message as part of the argument here, ProcessMessageEventArgs, process it, and then issue the complete-message command, at which point it gets deleted from the queue. If this again is forgotten, if it doesn't happen within the peek-lock timeout period, the message goes back into the queue and gets retried.
- 00:31:48 Poornima Nayar
- You can also have message attributes, and they're called application properties here. It's not as extensive as in SQS; it's a very simple key-value pair collection, and I think the upper limit on the size of the properties put together is 64 kilobytes. There's no upper limit on the number; the limit is on the total size of the properties themselves. And you can receive them as part of the receive operation. You don't need to do anything; you get them and process as normal. So that's easy peasy. Talking about message sizes: standard tier is 256 kilobytes. Premium tier, you can go up to a hundred megabytes per message, but there is an upper limit on the queue size itself. It's 5 gigabytes; that is the upper limit on the queue size in Azure Service Bus. And if you are into batching, you can go up to 256 kilobytes as the size of the batch. With the premium tier, it is 1 MB. There's no upper limit on the number of messages in the batch; the upper limit is on the size of the batch itself.
- 00:32:53 Poornima Nayar
- So talking about the queue size again, you can increase it up to 80 gigabytes if you use partitioning, which is an advanced topic in Azure Service Bus. But it comes with a cost, because you are looking at a limit of 10,000 entities per namespace if you're using partitioning. But it improves the message throughput, reliability, resilience, availability, all of that, because you are using something like a distributed queue architecture behind the scenes. So, an advanced topic, again something for a later day: partitioning. Happy days, because we have publish-subscribe in Azure Service Bus: you have a sender publishing to a topic, and then you bring in your code, that is your receivers, and subscribe, and that is it. You don't need to set up a queue and then subscribe, because subscriptions are like virtual queues in Azure Service Bus. Azure Service Bus manages it for you as long as you give a subscription name to your subscription.
- 00:33:56 Poornima Nayar
- There's an upper limit on the number of subscriptions you can have per topic, which is 2,000, but you can get cheeky and go further if you have auto-forwarding in place. So you can have one topic sending to another, then that forwarding to another, and so on. But you can only go up to four hops of topics as far as auto-forwarding goes. That's called chaining and auto-forwarding. So you can get cheeky and creative, but you cannot go really far with it, because messages get dead-lettered. You can bring in subscription rules, like the filter policies we had with SNS, if you want to receive a subset of messages, and a subscription rule has a lot more functionality than just selecting messages, because you can have filters to select the messages in, and you can have actions to annotate or transform the message as well, or change something in the message saying that it has now passed the subscription. And the best thing is that with each subscription you can have multiple rules specified as well.
- 00:34:59 Poornima Nayar
- And when it comes to the filters, they are a little bit more expansive than what we saw on the AWS side, because you can have SQL filters, which use a SQL-like expression against user-defined and system-defined properties. You can have Boolean filters: you have the true filter, which selects everything and is the default, and you can have the false filter as well. I think I have a use case for it. For example, if you want to extend your system but you're still not ready to receive messages, you can use the false filter. That's the only thing I could think of, so I'm just putting it out there. And there are correlation filters, which are a set of conditions, again matched against user-defined properties as well as system properties, but it is a case-sensitive string comparison. But filters cannot work against the message body. It is always against the message's application properties. So that's a thing that stands out.
- 00:35:54 Poornima Nayar
- SNS says, "I win here." But if I'm to show you filters and actions: the publishers publish to a topic, the subscriptions come in, and on the first subscription, say, I have a Boolean filter, the true filter, so no filtering takes place; every message from the topic hits that subscription. In subscription two, I have a SQL filter in place which says color equals red, so only messages which have an application property called color set to red get into that subscription. In subscription three, I am putting a correlation filter in place with the subject being blue, and I'm then adding a new property to the message itself which says actionset equals yup. And these are the subscriptions in place.
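The three subscriptions in this demo can be sketched as predicates plus an optional action, evaluated against each message's application properties. This is an illustrative Python model of the filter semantics, not the Service Bus rule engine; the property and subscription names mirror the talk's demo:

```python
def make_subscriptions():
    # Filters see the subject and application properties, never the body.
    true_filter = lambda msg: True                                   # default rule
    sql_like    = lambda msg: msg["properties"].get("color") == "red"  # ~ "color = 'red'"
    correlation = lambda msg: msg.get("subject") == "blue"             # case-sensitive match

    def annotate(msg):
        # Action: add a property to the copy as it passes the subscription.
        return dict(msg, properties=dict(msg["properties"], actionset="yup"))

    return [("sub1", true_filter, None),
            ("sub2", sql_like, None),
            ("sub3", correlation, annotate)]

def publish(msg, subscriptions):
    """Deliver a copy of msg to every subscription whose rule matches."""
    delivered = {}
    for name, rule, action in subscriptions:
        if rule(msg):
            delivered[name] = action(msg) if action else msg
    return delivered

subs = make_subscriptions()
out = publish({"subject": "blue", "body": "...", "properties": {"color": "blue"}}, subs)
print(sorted(out))                              # ['sub1', 'sub3']
print(out["sub3"]["properties"]["actionset"])   # yup: the action annotated the copy
```

A red-colored message would instead land on sub1 (true filter) and sub2 (the SQL-style color filter).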
- 00:36:42 Poornima Nayar
- So if I now go and show you the code in action: on the send side I am setting a subject, the chosen color, which is a color chosen at random from the three colors in the colors array here: red, blue, and green. I'm setting that as the subject, the subject being the purpose of the message as far as the application goes. And I'm also setting that chosen color into an application property called color, and then sending the message. So here comes the first difference. Remember, in SNS the API was all about publishing. Here, publishing an event is also sending a message. So the API has differences there. And on the receiving side the code is... I don't know how visible it is, but it is there in my GitHub repo, so you can download it and have a look.
- 00:37:34 Poornima Nayar
- So in the first example I'm saying create a rule, bring in a subscription, and the rule I want is the true filter. Remember, there are absolutely no queues in place here; they are created under the hood for you. In my second example, where I wanted to select all messages with the color red application property, that is the way I would have a SQL filter in place. It looks like SQL, because you can bring in SQL expressions, and LIKE, and things like that. And in the third one I'm using a correlation rule, a correlation filter. I'm selecting all messages with the subject blue and then adding an action to set a new property called actionset equals yup. And then it is business as usual: you receive the messages and you process them. The main difference here is that you don't need to create the queue yourself to subscribe to the topic.
- 00:38:30 Poornima Nayar
- Use correlation filters over SQL filters, because they are much more performant and have less impact on the throughput as well. If you want to see a demo of what I just showed you in action, come to the booth; I'll show you the demo myself. You have dead letter queues again in Azure Service Bus, but thank God they don't need extra management, because they are created automatically for you, and the DL queue can be accessed at the queue path plus /$DeadLetterQueue. But there are many, many different ways in which a message can get into the dead letter queue, such as exceeding the max delivery count, which is like exceeding the receive count in SQS. You can configure expired messages to go into the Azure Service Bus dead letter queue, which means that you have even less message loss here.
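The max-delivery-count path into the dead letter queue can be sketched like this. Plain Python, with an illustrative limit of 10 (the real limit is configurable per queue); the handler that always fails stands in for a poison message:

```python
MAX_DELIVERY_COUNT = 10   # illustrative; configurable per queue in real brokers

def process_with_dead_letter(queue, dead_letter_queue, handler):
    """Retry each message until it succeeds or exhausts its delivery count."""
    while queue:
        msg = queue.pop(0)
        msg["delivery_count"] += 1          # every receive bumps the count
        try:
            handler(msg)                    # success: message is completed and gone
        except Exception:
            if msg["delivery_count"] >= MAX_DELIVERY_COUNT:
                dead_letter_queue.append(msg)   # poison message parked for inspection
            else:
                queue.append(msg)               # back on the queue for a retry

queue, dlq = [{"body": "always-fails", "delivery_count": 0}], []

def handler(msg):
    raise RuntimeError("simulated processing failure")

process_with_dead_letter(queue, dlq, handler)
print(len(dlq), dlq[0]["delivery_count"])   # -> 1 10: dead-lettered after 10 attempts
```

The point of the pattern is that a message which can never be processed stops blocking the queue but is never silently lost.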
- 00:39:19 Poornima Nayar
- If you have errors in processing subscription rules, then things can dead-letter. If a message does more than four hops in an auto-forwarding scenario, remember I spoke about chaining of topics, so it can go up to the fourth topic and then it dead-letters. So that is where Azure Service Bus tells us, "Yep, this is where I have to limit you." And most importantly, you can have application-level dead-lettering. Like, me as a consumer, I can say, "You know what? I'm just so tired, I'm lazy, I just want to restart, so I'm going to dead-letter you." Yeah, go ahead. Who is stopping you? Then you have message sessions, which is... Okay, let me look at the time. Yes, what is it?
- 00:40:02 Speaker 3
- Just a quick question [inaudible]
- 00:40:09 Poornima Nayar
- As far as I know they don't, but I can check and give you a more precise answer. But as far as I know, with Azure Service Bus it doesn't. It is with SQS that that happens, yeah. So message sessions have an overlap with the FIFO queues in SQS, but it is not as big a piece of functionality. This is more of an "I am going to focus on ordered processing of messages," and that is what message sessions are all about. But you have some kind of an overlap. So with message sessions, it creates kind of virtual queues behind the scenes, because you can group messages and put them into session IDs, or pockets of information. And within that pocket, which is your session, messages are processed in an ordered manner. So that is message sessions for you.
- 00:41:00 Poornima Nayar
- So say I have three different sessions: one purple, one green, and one peach in color. The purple ones have a session ID of C, the green ones have a session ID of B, and the peach ones have a session ID of A. So within each of those sessions, A, B, and C, messages will be processed in the order in which they get into the session. Which means that on the receiving side, the minute a receiver gets the first message of a session, so it has not seen that session until that point, I have received the first message of the session, so then I am locking that entire session to myself. So every message from, say, message session A will come to me. No one else can process it.
- 00:41:47 Poornima Nayar
- But if I want, I can take up messages from sessions A and B and process them, but no one else can take the messages from A and B. Those sessions are locked to me. And if I have to release a lock, I can do that as well, when the lock expires or when I close the session. But you have guaranteed message order within a session, and retries do not disrupt the order here. And on the sending side, the main difference is that when I create my ServiceBusMessage here, I add a session ID, which is usually an application-generated unique ID, and Azure Service Bus doesn't know when you have the last message of the session. So in this case, I'm adding an application property here called isLast to denote whether this particular message is the last message of the session, which is a simple Boolean condition that I have. And I send the message.
- 00:42:42 Poornima Nayar
- On the receiving side, everything is the same as usual; Azure Service Bus will do the work for us. What we need to make sure of as developers is that, of course, you have to process and complete the message, but if it is the last message, as shown here, we need to process the message, call SetSessionStateAsync with null, and release the session. That is very, very important to do. So you can use sessions for related messages, you can have a unique session ID which is application-generated, and the maximum number of sessions that a receiver can process concurrently is eight. Azure Service Bus says you can kind of wedge sessions into use as a workflow, but don't do that. Bring in something like a saga pattern using messaging middleware to achieve that.
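The session-locking behaviour can be sketched as routing logic: the receiver that sees the first message of a session owns that session, and every later message with the same session ID goes to the same receiver, in send order. A simplified Python model (the round-robin choice of owner is an illustrative stand-in for whichever receiver happens to accept the session first):

```python
from collections import defaultdict

def assign_sessions(messages, receivers):
    """Route messages so that a whole session sticks to the receiver
    that locked it, preserving send order within each session."""
    session_owner = {}                  # session_id -> receiver (the session lock)
    inbox = defaultdict(list)
    next_receiver = 0
    for msg in messages:
        sid = msg["session_id"]
        if sid not in session_owner:    # first message of the session locks it
            session_owner[sid] = receivers[next_receiver % len(receivers)]
            next_receiver += 1
        inbox[session_owner[sid]].append(msg)
    return inbox

msgs = [{"session_id": s, "seq": i} for i, s in enumerate("ABABCA")]
inbox = assign_sessions(msgs, ["r1", "r2"])
# All of session A lands on one receiver, in the order it was sent.
print([m["seq"] for m in inbox["r1"] if m["session_id"] == "A"])   # -> [0, 2, 5]
```

Because a session is pinned to one receiver, ordering holds within the session even while other sessions are processed in parallel elsewhere.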
- 00:43:35 Poornima Nayar
- Transactionality is also a part of Azure Service Bus. Here you can group two or more operations and execute them in a single transaction scope to achieve atomicity. So if I'm receiving a message, I've done some processing, and now I want to send two different messages to a single queue, I can make sure that both those messages get sent and persisted into Azure Service Bus atomically, and you won't have a scenario where one message completes and one message fails. That is a possible edge case: one message goes in, and then at that very moment Azure Service Bus fails, and the next one doesn't get persisted. Things happen. So you are further increasing the reliability of your system by bringing in transactionality, and the operations that can be put into a transaction scope are send, complete, abandon, dead-letter, defer, which is message deferral, which I'm not covering today but something that you might want to look into, and renew lock, which is extending the peek lock timeout.
- 00:44:38 Poornima Nayar
- So what about receiving? Receiving is not part of those operations at all, because receiving is always assumed to be under peek lock, so there is some transactionality built into Azure Service Bus there. So once you receive the message, what happens afterwards is what goes into the transaction. To kind of explain what goes on behind the scenes, I'm sending two messages here, one after another, in line 5 and line 11. Now, between line 5, where I send the first message, and the second message, if I try to get the number of messages in the queue, I would get the queue count as one. If I then send the second message and get the number again, the queue count would be two. So with every send, it is persisted into Azure Service Bus. But with the transaction scope enabled, as in line 1, which is highlighted, if I then try to send messages and get the number, it would constantly show as zero, because it is at the transaction complete on line 15, when that is committed, that the two messages go and persist into the queue.
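The queue-count behaviour just described comes down to buffering: inside a transaction scope, sends are held back and only become visible when the scope completes. A minimal sketch of that idea in plain Python (the class name and the plain-list queue are illustrative, not the Service Bus implementation):

```python
class TransactionalSender:
    """Sends are buffered locally and only hit the queue when the scope commits."""
    def __init__(self, queue):
        self.queue = queue
        self._pending = []

    def send(self, msg):
        self._pending.append(msg)          # not visible on the queue yet

    def complete(self):
        self.queue.extend(self._pending)   # all pending messages appear atomically
        self._pending.clear()

queue = []
tx = TransactionalSender(queue)
tx.send("message-1")
tx.send("message-2")
print(len(queue))   # -> 0: nothing persisted until the transaction completes
tx.complete()
print(len(queue))   # -> 2: both messages land together
```

Without the scope, each send would persist immediately, which is exactly the one-then-two count the talk walks through.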
- 00:45:45 Poornima Nayar
- And at that point, sorry, if I try to get the number of messages in the queue, I would get the number of messages as two. So that is transactionality for you, and you can have cross-entity transactions as well: complete a message and then send messages to two different queues. That is also possible. So this is the scenario here, and the code for that would be this. So you receive a message, do some processing, and then you complete the message, send one message into one queue and a second message to another queue, and then mark the transaction as complete. So underneath the hood there is a lot going on; there's a great explanation in the docs for it. Again, come find me if you want the resources; I'll share them with you. But you have seen that the processing is not a part of this. Why? Because Azure Service Bus transactionality cannot offer transactionality over your business data. For that, you need to bring in the outbox pattern using messaging middleware.
- 00:46:49 Poornima Nayar
- So this is where you can have exactly-once, or more accurately effectively exactly-once, processing going on. And that takes me to RabbitMQ, which is like the Swiss army knife with a jackhammer and a, what is it, hacksaw on top of it. Again, a tall order for the next 15 minutes, but let's see how it goes. So this is a completely open-source message broker with commercial offerings available, and it supports multiple protocols: AMQP, MQTT, STOMP. And it offers an at-least-once delivery guarantee. You have active community support with RabbitMQ, and if you want to support high availability and distribute the load, clustering is the recommendation. Again, that is all about RabbitMQ management. And if you have to host RabbitMQ, you can do it on-prem or even in the cloud, and RabbitMQ is compatible with Docker and Kubernetes. So if I'm developing locally using RabbitMQ, I usually do that with my Docker container.
- 00:47:53 Poornima Nayar
- So all of this shouts out loud that when you are looking into RabbitMQ, you need to bring RabbitMQ management in as a skill set in your team. You need to learn extensively about RabbitMQ management before you go down this path, because this is a vast, vast, vast topic. With RabbitMQ the queue size is limited by the server disk capacity. We usually see customers putting it at 80 gig and then saying, "Hey, my queue is not getting any messages. Why?" That is because your queue has built up and you need to get rid of the messages. But you can have arbitrary message sizes that go up to 512 mebibytes. But please don't do that. Don't have huge message sizes. Keep them low and rely on something like the claim check pattern, using a data bus, to support large message sizes. The claim check pattern is on par with the large message support that I spoke about with S3 buckets and SQS.
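The claim check pattern mentioned here is simple to sketch: payloads over the broker's size limit go to external storage, and the queue carries only a reference. The dict-based blob store and the 256 KB limit below are illustrative stand-ins for S3 or Azure Blob Storage and a real broker quota:

```python
import uuid

BODY_LIMIT = 256 * 1024   # illustrative broker message-size limit, in bytes
blob_store = {}           # stand-in for S3 / Azure Blob Storage

def send_with_claim_check(queue, payload: bytes):
    """Large payloads go to external storage; the queue carries a claim ticket."""
    if len(payload) > BODY_LIMIT:
        ticket = str(uuid.uuid4())
        blob_store[ticket] = payload
        queue.append({"claim_check": ticket})   # small message, big payload elsewhere
    else:
        queue.append({"body": payload})

def receive_with_claim_check(queue) -> bytes:
    msg = queue.pop(0)
    if "claim_check" in msg:
        return blob_store[msg["claim_check"]]   # redeem the ticket
    return msg["body"]

q = []
big_payload = b"x" * (1024 * 1024)              # 1 MB, over the illustrative limit
send_with_claim_check(q, big_payload)
print("claim_check" in q[0])                    # True: only a ticket travels on the queue
print(receive_with_claim_check(q) == big_payload)   # True: payload recovered from the store
```

Messaging middleware typically automates exactly this dance, including cleaning up the stored payload.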
- 00:48:51 Poornima Nayar
- Okay, send and receive get really interesting with RabbitMQ, because they have really thought about the competing consumers pattern. Remember the scale-out scenario of receivers? There are two modes in which you can have competing consumers. There is round robin, which makes sure that messages are delivered to the next consumer in sequence, regardless of whether the consumer is ready to process them or not. And there's fair dispatch, which makes sure that consumers get messages only when they're ready to process them. So if you have a very busy consumer, it won't receive the next message; it will go to the one which is less busy so that it can be worked on. So looking at the code here, the first thing I start with is declaring a queue. Note that there's a little thing here called the durable parameter, set to true. With RabbitMQ you can have transient queues, which do not persist to disk, but here I'm setting the queue to be durable, so it is always stored on disk, which means that if your RabbitMQ broker restarts, because you are managing it, the queue still exists.
- 00:49:56 Poornima Nayar
- But that doesn't mean your messages are persistent because you need to have this persistent property set to true on the message for the message itself to be persistent on the queue. And then I can use the basic publishAsync method to publish to the queue. So note that it is all about publish here. So you are always publishing. I think I know the reason why, but more on that later. And on the receiving side, I start off with creating a consumer, and there is a callback called the receiveAsync method where I process the message and then I want to highlight the basic async. So there I'm telling the RabbitMQ that, "Hey, I'm acknowledging that I've received and processed the message. Now you are going to go and delete it." So that is where the delivery tag comes in from. It is similar to the ReceivedHandles scenario.
- 00:50:48 Poornima Nayar
- Now, the way I am simulating a delay in the worker, showcasing a busy worker, is by looking at the number in the message. So in my message I have something called a number, which is part of the message body, which I have put in. I try to get that number out and add that many milliseconds of delay. So as the number goes up, the more delay it adds. But I also wanted to bring in something here, which is autoAck set to false. It is only when you have autoAck false that you need to use the delivery tag. You can also have autoAck true, which is on par with the destructive read mode, the receive-and-delete mode, in Azure Service Bus.
- 00:51:36 Poornima Nayar
- So, a quick view of round robin dispatch. I have two receivers and one sender which keeps sending messages. If you look at the numbers that are getting received there, it is sequential: 1, 2, 3, 4, and so on. It doesn't really care whether the consumer is busy or not. It just hands over the message. So there's probably a message coming in and waiting for a worker which already has a 6,000 millisecond, or six second, delay. Now, moving on, if I want to have fair dispatch, what I need to do is make this API call here, which is BasicQosAsync, and set the prefetch count to one. And again, this is BasicAckAsync and the task delay which I explained before, which is here. But if I'm to show this in action, there are three receivers, I think, or maybe two, I don't know, but you can see that the messages that are received are not sequential; it just waits and watches who is less busy, and the message goes in there. So this one is at 3 and 7, but the others have already processed four messages by that time.
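The difference between the two dispatch modes can be sketched without a broker. In this illustrative Python model the message's number doubles as its processing cost, matching the talk's demo; fair dispatch (prefetch count of one) is approximated by always giving the next message to whichever consumer frees up first:

```python
from collections import defaultdict

def round_robin(messages, consumers):
    """Messages go to the next consumer in turn, busy or not."""
    inbox = defaultdict(list)
    for i, msg in enumerate(messages):
        inbox[consumers[i % len(consumers)]].append(msg)
    return inbox

def fair_dispatch(messages, consumers, cost):
    """prefetch=1 style: each message goes to the consumer that frees up first."""
    finish_at = {c: 0 for c in consumers}   # when each consumer becomes idle
    inbox = defaultdict(list)
    for msg in messages:
        c = min(consumers, key=lambda c: finish_at[c])   # least busy consumer
        inbox[c].append(msg)
        finish_at[c] += cost(msg)
    return inbox

msgs = [6, 1, 1, 1, 1, 1]   # one slow message, then quick ones
rr = round_robin(msgs, ["w1", "w2"])
fd = fair_dispatch(msgs, ["w1", "w2"], cost=lambda m: m)
print(rr["w1"])         # -> [6, 1, 1]: round robin keeps feeding the busy worker
print(fd["w1"], fd["w2"])   # -> [6] [1, 1, 1, 1, 1]: fair dispatch routes around it
```

Round robin blindly alternates, so messages queue up behind the slow worker; fair dispatch lets the idle worker drain the quick messages instead.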
- 00:52:45 Poornima Nayar
- So that is fair dispatch for you. You also have publish-subscribe, and that is where we start talking about exchanges and bindings. With RabbitMQ, you never, ever publish a message into the queue. It is always published into an exchange, and that, I think, is the reason why it is a publish and not a send. And if you are asking me why everything worked so far, the reason is that when RabbitMQ starts up, it has a default exchange. So if you do not bring in an exchange, you're always sending messages into that default exchange, and that exchange knows which queue to send them to. Okay, I'm getting hit by the jet lag now, so let me go quick, because, what is it? Yeah, I would normally be asleep at home at this point of time.
- 00:53:38 Poornima Nayar
- So an exchange is basically your routing table. When you have an exchange defined, that is your empty routing table; it doesn't know what to do. And that is where bindings come in, which are your routing rules, and they tell the exchange, "Hey, exchange, this is the routing rule that I have in place. I am bringing in a queue and binding it to you using this routing rule. So work according to this rule, don't go away from the rule, work according to the rule, and make sure that messages are received." So a binding is kind of a relation between an exchange and a queue. And if you are thinking of the routing logic, it is implemented using the type of the exchange and the routing keys. And there are four different types of exchange: direct, topic, headers, and fanout.
- 00:54:22 Poornima Nayar
- Fanout is what is happening now, which is a broadcast: a copy of the message is received by all the consumers. There are no subscriptions in place here. It is all about bindings. And if you have a routing key, which is like a message property, that is completely ignored. So the key thing here is to ensure that there is an exchange type called fanout, and I can then publish to the exchange. So here the exchange name is logs. And if you look here, there's a string.Empty; that is saying there's no routing key, this is a fanout. And on the receiving side... sorry, so that was the publish. On the receiving side I am binding my queue to the exchange called logs, and there's no routing key. So once that is in place, I can receive the messages, process them, and then ack them.
- 00:55:18 Poornima Nayar
- There's also the direct exchange, where you are binding by a specific routing key. So the routing key attached to the message might be, say, error, and if on the binding side, that is the consumer side, I am binding using that routing key, the messages get to me. So that is the scenario at play here. You have producers publishing to the exchange, and then you have queues coming in, binding using the routing key error. So if I am publishing using the routing key error, the binding which uses the routing key error gets the message, the binding with the info routing key gets the messages with the routing key info, and so on. So an exact match of the routing key is the pattern here.
- 00:56:05 Poornima Nayar
- So queues must be bound using that exact routing key. On the sending side, I am declaring a direct exchange type and publishing a message into the exchange using a routing key. And the routing key that I've set up here is one taken randomly from this array; it can be info, warning, or error. So that is the routing key. And on the receiving side, I am binding my queue to the direct logs exchange using a routing key. So here there is a binding using the info routing key and a binding using the warning routing key, which means that I'll get all the messages which have either info or warning as the routing key. So that is possible. So those are the routing keys highlighted here. It's an exact match on the routing key that we are after.
- 00:56:58 Poornima Nayar
- And if I have to go further with this pattern, we can bring in the topic exchange, where you bring in patterns of routing keys. So you have a routing key specified as part of sending the message, and that routing key must be delimited by periods, or full stops, so it could be like log.error, or log.error.level1, for example. And on the binding side I would specify a pattern which contains either a hash or a star, and that gets matched against whatever routing key is coming in, and then I get the messages. You can also have the headers exchange, which then compares the values in the message headers, but for that, the type of the exchange would be headers, and the routing key is completely ignored. So there's a lot going on around just the publish-subscribe pattern when it comes to RabbitMQ.
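The routing behaviour of the three key-based exchange types can be sketched as a small matcher. This illustrative Python model follows the RabbitMQ topic conventions described above, where `*` matches exactly one dot-delimited word and `#` matches zero or more:

```python
def topic_matches(pattern: str, key: str) -> bool:
    """Topic-exchange match: '*' = exactly one word, '#' = zero or more words."""
    return _match(pattern.split("."), key.split("."))

def _match(p, k):
    if not p:
        return not k                    # pattern exhausted: key must be too
    if p[0] == "#":                     # '#' absorbs zero or more words
        return any(_match(p[1:], k[i:]) for i in range(len(k) + 1))
    if not k:
        return False
    return (p[0] == "*" or p[0] == k[0]) and _match(p[1:], k[1:])

def route(exchange_type, bindings, routing_key):
    """bindings: list of (queue, binding_key). Returns queues that get a copy."""
    if exchange_type == "fanout":
        return [q for q, _ in bindings]                 # routing key ignored: broadcast
    if exchange_type == "direct":
        return [q for q, bk in bindings if bk == routing_key]   # exact match only
    if exchange_type == "topic":
        return [q for q, bk in bindings if topic_matches(bk, routing_key)]
    raise ValueError(f"unsupported exchange type: {exchange_type}")

bindings = [("q_all", "#"), ("q_errors", "log.error"), ("q_logs", "log.*")]
print(route("topic", bindings, "log.error"))        # all three bindings match
print(route("direct", [("q", "error")], "error"))   # exact routing-key match
```

The headers exchange is the odd one out: it ignores the routing key entirely and matches on header values instead, so it is left out of this sketch.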
- 00:57:51 Poornima Nayar
- You have dead letter exchanges, not dead letter queues, in RabbitMQ, because remember, it is all about exchanges. And the ways a message can dead-letter are: when it exceeds the delivery limit, again like the max delivery count in Azure Service Bus; when a message expires; when a consumer rejects the message, that is, abandons it; and messages can get dropped when the queue length is exceeded as well. So that's something to be mindful of. You can have transactionality in RabbitMQ, but transactionality actually costs, because it's less performant, although very, very highly consistent. So use it for critical operations, because it supports atomic transactions, again within the queue itself.
- 00:58:41 Poornima Nayar
- But you can also rely on what is known as publisher confirms, where you get an acknowledgement from the broker when a message is safely stored. This is more performant, but a weaker, ack-based consistency. So if you have high-throughput needs, then rely on publisher confirms. They are safe to use. One of the standout features I want to talk about in RabbitMQ is priority queues, where you can prioritize certain messages over others. You can define a queue as a priority queue when you set it up by giving it a header called x-max-priority. Once that is set up, you can send messages with a priority value in their header, and such prioritized messages will get delivered and processed ahead of other messages.
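The delivery behaviour of a priority queue can be sketched with a heap: higher priority wins, and messages of equal priority keep their arrival order. This is an illustrative Python model of the semantics, not RabbitMQ's implementation; the message names and priority values are made up:

```python
import heapq
import itertools

class PriorityQueue:
    """Higher priority (x-max-priority style) is consumed first;
    FIFO order is preserved among messages of equal priority."""
    def __init__(self):
        self._heap = []
        self._seq = itertools.count()   # tie-breaker preserves arrival order

    def publish(self, body, priority=0):
        # Negate priority: heapq is a min-heap, we want the highest first.
        heapq.heappush(self._heap, (-priority, next(self._seq), body))

    def consume(self):
        return heapq.heappop(self._heap)[2]

q = PriorityQueue()
q.publish("routine-report", priority=1)
q.publish("fraud-alert", priority=9)
q.publish("routine-cleanup", priority=1)
print(q.consume())   # -> fraud-alert: jumps the queue
print(q.consume())   # -> routine-report: FIFO among equal priorities
```

This is also why the feature is resource-intensive on a real broker: behind the scenes it has to maintain this kind of per-priority ordering for you.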
- 00:59:27 Poornima Nayar
- But this comes at a cost, because it's very resource-intensive, so use it only when it's needed, because behind the scenes what RabbitMQ is doing is managing virtual queues for you. So use it very carefully. And one of the things that you need to be aware of is the queue types in RabbitMQ. There are the classic queues, which give you optional replication and optional persistence, but high throughput, because they can be transient and not persisted to disk. But what we usually recommend is the quorum queues, which have built-in replication using the Raft consensus algorithm. There was recently a blog post just about quorum queues on the RabbitMQ doc site. So have a read. It's really, really advanced-level stuff. A quorum queue is always persistent and has better ordering guarantees and durability, but slightly lower throughput compared to classic queues, because a classic queue doesn't need to be persisted.
- 01:00:25 Poornima Nayar
- RabbitMQ uses a plugin architecture, so you can bring in plugins to extend the broker itself, and if I have to highlight one plugin to you, that would be the web-based management plugin, where you can go and see all your exchanges and queues. But you have other options like ServicePulse and ServiceInsight, if you come to us, which do the same thing for you. So you have picked up all this information. Now where do you go? How do you start off with deciding what is best for you? The first thing would be considering: are you targeting the cloud? If you're targeting the cloud, what is your preferred cloud provider? Azure Service Bus if you are in Azure, and SQS if you are in AWS. But if you want to go on-prem, RabbitMQ is what you want to watch out for. So this is very simple. You might have a much more elaborate decision diagram when you're considering other queues.
- 01:01:14 Poornima Nayar
- But it doesn't stop there. You need to think about a lot of other things, like: what patterns do you need support for? What is the average message size? Do you need complex routing? What about message delivery order? What is your operations team like? What development support do you want? All of this matters. And quickly you'll figure out that this is all about being at the tip of the iceberg, and there's more to it than meets the eye. That is where all these integration patterns come in: message routing, workflow, transactions, the claim check pattern, serialization, retries, through to the tests that can verify all the edge cases that I spoke to you about. All in all, you would be looking at implementing all these 34 patterns, which I have taken out of the Enterprise Integration Patterns book by Gregor Hohpe. So what is the output here? Should you be attempting something like this? That would be like building your own car rather than going to, say, a showroom and buying a car for yourself.
- 01:02:14 Poornima Nayar
- So you don't want to do that. Use messaging middleware. There are plenty of options out there, starting with NServiceBus, MassTransit, Brighter, Rebus, Wolverine. Use one of these. Please don't build your own, because whatever you build would be reinventing the wheel, rebuilding one of the patterns that I just spoke to you about, and it'll always be a cheaper copy. Save yourself the money, focus on the business data, use messaging middleware. That's a very important takeaway that I want to give you as part of my talk. So that is it from me. I'm two minutes over, but that's good; I've got all my points across. These are the resources for the day: there's the GitHub repo and the comparison chart of the three tools that I spoke to you about, and that is my email if you want to reach out to me. And come find me at the booth if you have questions. Thank you.