Azure Messaging Crossroads

00:00:01 Adam Ralph

Hello again everyone, and thanks for joining us for another Particular live webinar. This is Adam Ralph. Today I'm joined by my colleague and solution architect Sean Feldman. He's going to show us how to navigate the vast array of Azure offerings in the messaging space. Just a quick note before we begin. Please use the Q&A feature to ask any questions you may have during today's live webinar. I will be sure to address them at the end of the presentation.

00:00:32 Adam Ralph

We'll follow up offline to answer all the questions we won't be able to answer during this live webinar. We're also recording the webinar and everyone will receive a link to the recording via email. Okay, let's find out which way to turn at the Azure messaging crossroads. Sean, welcome.

00:00:56 Sean Feldman

Thank you, Adam. Hello everyone. Welcome to the webinar, Azure Messaging Crossroads. I'm Sean, an Azure MVP working with Particular Software. This webinar is all about Azure messaging services, so let's begin. We'll start by listing the available Azure messaging services that anyone can start using today if they have access to an Azure subscription. There are quite a few of them; we're focusing specifically on four.

The first one is Storage Queues, one of the oldest messaging services. Even though it is not coming specifically from the messaging product group, it's coming from the storage group, it's still considered to be a messaging service. The next one is Service Bus, also a veteran of the messaging services in Azure, as well as Event Hubs, followed by the new kid on the block, or relatively new kid on the block, Event Grid. So this session is going to focus on these four services, and I hope you will find the information you're looking for.

So let's begin. Now, when we're looking at messaging services, the idea is that usually we want to do something such as sending a message. To send a message, we can use either Storage Queues or Service Bus, for example. Both provide a mechanism called a queue that can be utilized to send messages and receive those messages later. So the very first question that anyone encounters is: why would I choose one over another? Why would I use Storage Queues if there's Service Bus?

Well, maybe Storage Queues is a dead service and I should never touch it. Rather than using Storage Queues, I should use Service Bus. That's one of the questions. Or, for example, if we're doing pub/sub, or publish/subscribe, and want to publish an event, broadcast an event to multiple subscribers, well, when we talk about events, Service Bus is capable of handling events. So is Event Hubs. It can ingest events. So is the new kid on the block, Event Grid. It also handles events.

So which one of these is the right service? For example, when Event Grid became generally available, the rumor on the street was, hey, Service Bus is a dead service. You should not be using Azure Service Bus. Instead, use Event Grid. That's not necessarily true. Another scenario is when we have services such as Azure SignalR that feel like and seem like messaging services, but they are not. The reason why this specific service is not a messaging service is because it's not durable, meaning that messages are not stored anywhere if the delivery is failing.

So this kind of service I do not consider to be a true messaging service. So with that many options and so many choices to select from, which is the messaging service I should be using in my projects, greenfield or brownfield? How do they fit together in the bigger picture, and do they at all? So the idea of this presentation is to help navigate these services and understand their pros and cons. But I think probably the biggest value is to be able to avoid big mistakes in whatever greenfield or brownfield projects you're going to have, specifically because you can force the services to achieve what you need. But the question is how much pain it's going to cause you and at what cost you will build what you need to build.

So with that, we'll start with the first one, Storage Queues. I also call this service simple queues. The idea behind it is that it's not intended for sophisticated scenarios. Let's look at what I mean by a sophisticated scenario. So Storage Queues are task queues. What that means is it's designed to coordinate work across compute in Azure. For example, if you have functions or app services or virtual machines that execute work and they want to communicate tasks between themselves or send instructions, usually messages with the intent of something to take place, something to happen, Storage Queues are intended for that kind of message.

These messages usually contain data, and there is an expectation for something to happen, something specific. These are also known as commands. The sender has an expectation for the receiver to perform something with the data that it's sending, the task to be executed. The service is extremely low cost, meaning that with Storage Queues, the number of messages and operations that you're going to consume comes at a very low cost, which makes it extremely appealing.

It's also a very easy to use service. We'll look into the reasons why it's so easy, but there really aren't that many features associated with it, so it's both positive and negative in a way. The number of queues is unlimited, which is very important if you're trying to build a system that requires an unlimited number of queues or a very large number of queues, as well as an unlimited number of concurrent clients. The reason for that is the service is implemented on top of HTTP, so whatever number of HTTP connections you are capable of establishing to the data center, that is the number of concurrent clients you might have.

Now, while the queues are unlimited, the number of messages that can be found within a single queue is limited to a size, not number itself, which is 500 terabytes, and that's huge. Normally you would strive in your applications, in your systems, not to store that many messages, especially because these are tasks. They're supposed to be executed, but Storage Queues, having the word storage in it, is definitely allowing excessively large number of messages to be stored.

Now, individual message size is constrained to 64 kilobytes only. That makes perfect sense considering these are tasks. We're not distributing data, we're not sending anything large, large objects serialized into messages, so it's only 64 kilobytes. Limited features, and when I say limited features, those are compute features: what the service is capable of doing for us, on our behalf, on the broker side, on the service side. As we move to the next service, you'll see the difference between Storage Queues and the other services.

So what is the sacrifice with Storage Queues? For example, no headers, no metadata. The payload that is sent in Storage Queues is the only thing that can be transmitted. There are no headers of the kind traditionally associated with messages in other messaging technologies, such as RabbitMQ for example, or MSMQ. They just don't exist, meaning that you have to encode them into the payload. There are no features such as publish/subscribe or pub/sub. It is only sending messages to a single destination, nothing else.
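To make the "encode it into the payload" point concrete, here is a minimal sketch of a custom envelope. It does not use any Azure SDK; the `wrap` and `unwrap` helpers, the header names, and the JSON layout are all assumptions made for illustration. The only figure taken from the talk is the 64 KB message limit.

```python
import json
from typing import Tuple

MAX_MESSAGE_BYTES = 64 * 1024  # per-message limit mentioned in the talk


def wrap(headers: dict, body: str) -> bytes:
    """Pack hypothetical headers and the body into one payload."""
    payload = json.dumps({"headers": headers, "body": body}).encode("utf-8")
    if len(payload) > MAX_MESSAGE_BYTES:
        # The envelope itself counts against the limit, so the usable
        # body is always somewhat smaller than 64 KB.
        raise ValueError(f"envelope is {len(payload)} bytes, over the 64 KB limit")
    return payload


def unwrap(payload: bytes) -> Tuple[dict, str]:
    """Recover headers and body on the receiving side."""
    envelope = json.loads(payload.decode("utf-8"))
    return envelope["headers"], envelope["body"]


payload = wrap({"content-type": "text/plain", "intent": "ResizeImage"}, "image-42.png")
headers, body = unwrap(payload)
```

Both sender and receiver have to agree on this envelope format, which is exactly the kind of plumbing a service with native headers would take care of.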

No scheduling features. It is impossible to send a message in the future or to set an expiration option for a message to discard the message once a certain date/time has occurred. There's also no dead-lettering concept. Dead lettering is basically an automated way of setting aside messages that are continuously failing processing. With Storage Queues, if a message is failing processing, it is the responsibility of the system that is attempting to process the message to remove that message from the queue, which means that you have to build those features yourself.
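As a rough illustration of what "build those features yourself" can look like, the sketch below moves a message to a separate poison list after a fixed number of failed attempts. It is an in-memory simulation, not Storage Queues code; `MAX_ATTEMPTS`, `drain`, and the message shape are hypothetical.

```python
from collections import deque

MAX_ATTEMPTS = 3  # hypothetical threshold chosen for this sketch


def drain(queue, poison_queue, handler):
    """Process every message; park the ones that keep failing."""
    while queue:
        message = queue.popleft()
        message["attempts"] += 1
        try:
            handler(message["body"])
        except Exception:
            if message["attempts"] >= MAX_ATTEMPTS:
                poison_queue.append(message)  # hand-rolled "dead lettering"
            else:
                queue.append(message)  # make it available for another try


processed = []


def handler(body):
    if body == "malformed":
        raise ValueError("cannot parse")
    processed.append(body)


queue = deque({"body": b, "attempts": 0} for b in ["task-1", "malformed", "task-2"])
poison_queue = []
drain(queue, poison_queue, handler)
```

After the run, the two healthy tasks are processed and the malformed message sits in `poison_queue` with three recorded attempts.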

No transactional guarantees, but that's not something new to Azure services in general or any cloud services, because transactions, specifically DTC, are not something that is possible. Though, when we talk about the next messaging services, we'll see something interesting there. Now, for production, one of the limitations of this fairly simple service is that we cannot peek more than 32 messages on a queue, meaning that if you have more than 32 messages in production, you won't be able to see anything below the top 32.

Again, the intent for Storage Queues is tasks that are processed fast, therefore you shouldn't really be peeking too much into the queues. Now, one of the biggest limitations that was there for Storage Queues was the retention period, limited to only a week, or seven days. That limitation was removed about a year or two ago, and messages can be stored infinitely. But please be careful with that. You never ever want to store messages infinitely. Usually you want something that is meaningful. Otherwise, the messaging service turns into a database.

There are some other compute-related features that I'm not necessarily going to list here, but when contrasting with other services, we'll mention those. How does it work? In a simple queue, it works in a fairly simple way. There's a sender and a receiver. The sender sends a message to the service, to Storage Queues, and the service acknowledges that the message is received. The receiver is polling for messages. It receives the messages and confirms whether each message is completed or not.

If the message is not acknowledged as completed, the message remains on the broker, on the server side, and is available for reprocessing. So this is pretty much Storage Queues. Let's sum it up real quick. Things to remember: these are task queues with work items as messages. Very small messages. I mentioned 64K. If you do some sort of custom envelope to indicate headers or whatnot, even smaller than 64K. It is HTTP based, meaning that no matter what SDK you're using, you can talk to the service from any language, and there are various SDKs available. And it's an extremely low cost service, which makes it quite appealing.

The next messaging service is Azure Service Bus, also known as an Enterprise Service Bus. Now let's look at the traditional environment where an Enterprise Service Bus fits in. For example, eCommerce. We can talk about eCommerce where we have a front end where customers can, for example, post their orders. Those orders are not processed on the front end. Usually they're processed on the back end, which is scaled differently. The communication between the front end and the back end is done by a queuing or messaging system.

Now, with Azure Service Bus, it's not just queues. There are three types of queue-like entities, and Service Bus calls them entities. These are queues, topics and subscriptions. I'll elaborate on topics and subscriptions in a little bit. So the idea behind this is that we want to scale the front end independently from the back end for processing. For example, on Black Friday, we want to scale the front end substantially, scale it up. Sorry, scale it out. But we don't necessarily want to scale out the back end just because we have an influx of orders within 24 hours.

The other important thing is that we don't want to lose any of those orders. This is where we could definitely use Storage Queues, but there is something more important here, which is that we're dealing with money, moving money for example, or purchase orders, and we want to ensure and guarantee a certain transactionality on the messaging level. Yes, I said there are no transactions, but hold on. We'll see something interesting with this specific service.

Service Bus provides advanced features that are not found with Storage Queues. Some of those features are message headers. You guessed it right. Message headers are metadata about the message. Why is that important? Well, headers allow more complex processing. It's not just a basic task. We want, for example, to indicate what type of serialization was used on the message. We might want to add additional data that describes the message or how to process the message, sort of internal instructions for ourselves.

Also, headers could be used to manipulate the routing of the message and not just necessarily send it from one point to another. Duplicate detection. If we go back to the example of the online eCommerce system, there's potentially a chance that someone has submitted the same purchase order more than once. So an identical message will be arriving and we want to be able to drop the duplicates. Service Bus, for example, allows message deduplication. The caveat though is that the deduplication is performed based on the message ID, not on the payload. So a possible workaround is to create a hash from the payload and assign it as the message ID to ensure that identical messages are deduplicated. So that's another feature.
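The hash-as-message-ID workaround could be sketched roughly as follows. This is plain Python with no Service Bus SDK involved; `message_id_for`, `accept`, and the `seen_ids` set (which merely stands in for the broker's duplicate-detection behaviour) are assumptions for illustration.

```python
import hashlib
import json


def message_id_for(payload: dict) -> str:
    """Derive a deterministic ID from the payload content."""
    # Canonical JSON so that key order does not change the hash.
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


seen_ids = set()  # stand-in for the broker's record of recent message IDs


def accept(payload: dict) -> bool:
    """Return True if the message is new, False if it is a duplicate."""
    message_id = message_id_for(payload)
    if message_id in seen_ids:
        return False
    seen_ids.add(message_id)
    return True


order = {"customer": "c-17", "items": ["sku-1", "sku-2"], "total": 59.90}
resubmitted = {"total": 59.90, "items": ["sku-1", "sku-2"], "customer": "c-17"}
results = [accept(order), accept(resubmitted), accept({**order, "total": 10.0})]
```

The design choice that matters is canonicalizing the payload before hashing; otherwise two logically identical orders serialized in a different key order would get different IDs and slip past deduplication.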

Auto forwarding, which I mentioned a bit earlier, is being able to route messages not just to a single queue, but among different queues. This is also known as topology. Azure Service Bus allows that message routing by using auto forwarding. Dead lettering is built into Service Bus, and the beauty of this feature is that, rather than building your own mechanism to put aside a message that is continuously failing processing, Azure Service Bus does it on our behalf automatically whenever a message is reprocessed multiple times.

The reason why a message could be reprocessed multiple times is either because the message is a poison message, a malformed message, or because we have some problem in our code so that the message is continuously failing processing. Therefore, it potentially could be a poison message.

Scheduled delivery is extremely powerful. When messages are scheduled in the future, they will not be delivered immediately, but on the date/time that was set for the scheduled delivery. Not before. On the date/time or slightly later, depending on the queue length. Now, why this feature is so powerful is, for example, whenever you create systems that require some sort of scheduling ... for example, a doctor's office with appointments. If an appointment is set in three weeks, rather than sending a message and having the receiver postpone the message for three weeks, the receiver will not get anything until the scheduled delivery date comes up.

Message expiration is also a powerful feature provided by Azure Service Bus out of the box. The idea behind this is that messages can be purged if they have not been processed by a certain date/time. So while the example that I will provide might not necessarily be the best one, think about stock updates. If a stock update is not received within, let's say, 30 seconds or one minute, then that message is no longer relevant. Rather than wasting processing power on receiving the message and processing it, it's better for it to be discarded because there's likely to be a next message with the next update, which should be processed.
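Scheduled delivery and expiration both boil down to a time window in which the broker is willing to hand a message to a receiver. The sketch below models only that rule; it is not Service Bus code, and the `scheduled_for` and `expires_at` field names are hypothetical.

```python
from datetime import datetime, timedelta


def deliverable(messages, now):
    """Return the bodies a receiver could get at `now`."""
    return [
        m["body"]
        for m in messages
        # Not before the scheduled time, and not once it has expired.
        if m["scheduled_for"] <= now < m["expires_at"]
    ]


start = datetime(2024, 1, 1, 9, 0, 0)
messages = [
    {
        "body": "appointment reminder",
        "scheduled_for": start + timedelta(weeks=3),
        "expires_at": start + timedelta(weeks=4),
    },
    {
        "body": "stock tick",
        "scheduled_for": start,
        "expires_at": start + timedelta(seconds=30),
    },
]

right_away = deliverable(messages, start + timedelta(seconds=5))
one_minute_later = deliverable(messages, start + timedelta(minutes=1))
three_weeks_later = deliverable(messages, start + timedelta(weeks=3))
```

The stock tick is only visible in its first 30 seconds, and the appointment reminder stays invisible until its scheduled date, so the receiver never has to hold or discard messages itself.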

Message deferral is an interesting feature which helps receivers that have received a message but are not capable of processing it yet to postpone the message, to put it back on the queue without running the risk that competing consumers are going to snatch that message and try to process it. So it's almost like being able to set a message aside and say, I will take care of this message when I'm ready, on my conditions. Until then, this message is not going to be sent to any other receiver.

Auto deletion on idle is another fantastic feature that Service Bus provides: being able to drop both the messages and the queue that contains the messages if there was no action on the queue for a defined period of time. So for example, if your system is dynamic in its nature and provisions queues dynamically that are used for a short period of time and then no longer required, auto deletion is your friend because it can drop the queue without you necessarily going to Azure Service Bus manually and removing those queues.

Batching and transactions are additional features. But transactions, I want to focus on that one, because it is not a transaction that will span your business data, or operations against the database for example, and the messaging operations. We're talking about messaging transactions only. Usually it comes in the context of the incoming message and the outgoing messages. So for example, let's say we have received a message that requires us to perform certain work and, as a result of that work, we want to announce to the rest of the system that the work has been executed, or schedule some task or whatnot. We want to have the incoming message and the outgoing messages processed atomically.

So either we succeed to complete the incoming message and send out the outgoing, or we revert the outgoing messages and the incoming message becomes, again, available for reprocessing. So, that is a messaging transaction only available with Azure Service Bus.
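The all-or-nothing behaviour of a messaging transaction can be illustrated with a small in-memory sketch. It is not Service Bus code: the `inbox` and `outbox` lists, `process_in_transaction`, and `place_order` are hypothetical names, and the staging list simply stands in for sends that are held until the transaction commits.

```python
def process_in_transaction(inbox, outbox, handler):
    """Complete the incoming message and release outgoing ones together."""
    if not inbox:
        return False
    incoming = inbox[0]  # stays in the inbox until it is completed
    staged = []          # outgoing messages held back until commit
    try:
        handler(incoming, staged.append)
    except Exception:
        # Roll back: staged sends are dropped, incoming remains available.
        return False
    outbox.extend(staged)  # commit: release outgoing messages...
    inbox.pop(0)           # ...and complete the incoming one
    return True


def place_order(message, send):
    send(f"OrderAccepted:{message}")
    if message == "bad-order":
        raise ValueError("cannot process")
    send(f"BillCustomer:{message}")


inbox = ["order-1", "bad-order"]
outbox = []
first_ok = process_in_transaction(inbox, outbox, place_order)
second_ok = process_in_transaction(inbox, outbox, place_order)
```

Note that the failing handler had already "sent" one message before it raised; because the send was only staged, nothing leaked out and `bad-order` is still in the inbox for reprocessing.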

Then we're also talking about message sessions if you need ordered or guaranteed order delivery for the messages that you are sending. We're talking about pub/sub, which I'll mention a little bit later, being able to decouple between the receivers and the publishers or senders. As a publisher, you don't care about who is subscribing to the messages. They will receive it when the message is broadcasted.

Filtering and actions allow you to tweak how the subscriptions take place, or what the receivers are subscribing to when messages are published. Claim check, while it's not built into Azure Service Bus: Azure Service Bus has two tiers, the standard tier and the premium tier. The standard tier and premium tier have differences, and I'll mention this in a bit. But one of the biggest pain points, for example, is that the maximum message sizes do not match. So there are options to use something like the claim check pattern to overcome these limitations, and some of the SDKs, the Java SDK and I believe Node.js, are about to add support for claim check.

All those features, I've spent five minutes talking about them and those are not even all of them. Those are great. How do we receive messages? Well, there are two options. There's the traditional approach where messages need to be pumped all the time. Keep polling for the messages, no different from Storage Queues. Exactly the same idea. But there is also a serverless approach where we are not receiving the messages. Messages are pushed to us when we need them.

We'll talk later about how we can mix and match. If you happen to use the .NET client, or .NET SDK, you're in luck, because to receive messages you don't necessarily have to write your own infinite loop, or what we call a message pump. You can use helper methods. One of those is called RegisterMessageHandler, where all you need to do is provide a callback to your custom code. Whenever a message is retrieved or received by the SDK, it will be given to your code and you'll be able to execute the logic.

Now, this built-in message pump allows you to fully control the concurrency, so you can specify up to how many messages you would like to be able to process concurrently. The message completion can either be done automatically on our behalf, which is the default, or we can control in the callback whether to complete the message or not. And also lock renewal. That one is an interesting one because Service Bus, unlike Storage Queues, does not allow you unlimited message locking.

The message is usually locked for no more than five minutes. Now sometimes there's a need to go through processing that takes longer than five minutes. In that case, the lock can be renewed, and the message pump or the message handler provided by the SDK can do it automatically. One word of caution that I will mention here is that you need to be careful with lock renewal because it's not a guaranteed operation. It's not initiated by the broker. It's a client-side initiated operation, meaning that if there is a connectivity issue or some problem, the renewal can fail, meaning that the message will lose its lock and eventually will be available for other competing consumers.

Those consumers are looking at the same queue to retrieve messages and process them concurrently. All right, our next one is an interesting one. If you're not using .NET and you are in Java, using the Java SDK, one of the nice features that has been published recently is the first-round support for JMS. The idea behind this is extremely simple. Without changes to the existing applications, by simply replacing the connection string and providing a connection factory, you can connect existing JMS-based systems to work with Azure Service Bus.

Obviously, there are some limitations and you should consult the documentation, but full JMS support is coming. I've mentioned before the premium tier versus the standard tier, and I would like to highlight some of the premium features that are important to understand in order to differentiate it from the standard tier. First of all, the ability to control scaling up and down, also known as messaging units. When you provision a premium namespace, an instance of Service Bus, you can specify the number of messaging units. It starts with one and goes all the way to eight.

What you're basically locking yourself into ... or locking is maybe not necessarily a good word. What you're promising yourself is specific throughput and latency. Usually, based on the number of messages or the throughput load on the system, you will be able to scale up and down. The next one is Geo-DR, geographic disaster recovery. This is basically what happens if Godzilla decides to visit the data center where your Azure Service Bus namespace is provisioned, where your service is running. There is no way the data center is coming back, which means that there will be a failover.

You can have a failover namespace, but there is a caveat. Right now, it only supports metadata, meaning that only the queues, topics and subscriptions will be found on the failover namespace. Not the messages themselves. Hopefully the data plane, DR for the messages, is going to come this year. It really depends on whether the messaging group can pull this off or not. It's not a simple feature to implement.

Availability zones, on the other hand, are available by default for the premium tier. Availability zones serve the purpose of high availability. There are three replicas of all the messages and all the entities in a given data center, and you are promised to have access to your messages without failing to retrieve those. So don't mix up Geo-DR, which is disaster recovery, where the data center is gone, nuked, with availability zones, where within the data center you are able to retrieve the data at any point in time.

Another feature is virtual network service endpoints, basically making sure that, if you have virtual networks in Azure, none of your messages are traveling on the public network. Metered throttling: again, throttling is also happening on the standard tier, actually more than on premium. But since the premium tier is dedicated hardware, you can see the utilization and you can decide for yourself at what point in time, at what utilization point, you will be scaling up, or if you need to scale down. But throttling will occur if there is excessive utilization of CPU and memory, which Azure Monitor should help with.

Now what happens if you're using the standard tier and thinking of migrating, for example, your production to the premium tier? Well, the good news is that there's no downtime. You literally can migrate on the fly from the standard to the premium tier. You will be using the same connection string for the senders and receivers as before. You don't have to stop those endpoints. You don't have to stop your services. They will continue functioning.

You will obviously have to have an additional premium namespace provisioned before you start the migration process. Some messages might need to be drained, and that's because, at the cutover point in time, if the senders are continuously sending messages, there will be some messages that don't make it over. You will basically need to drain them manually and move them to the other namespace. Migration only supports up to 1000 entities. Now the question is what to do if your namespace contains more than 1000 entities.

My personal recommendation is to open a support case with Microsoft and work with their engineering team, because they will be able to assist you. There is nothing that you will be able to do on your own unless you reduce the number of entities and stay within the limit of 1000. Now, the migration can be done in the portal, and it's fairly straightforward. As you can see, three simple steps. Now, the UI might have changed since it was announced.

Let's sum it up. What is Azure Service Bus? The things to remember are queues and topics. So it's not just simple messages that we send to a single destination. We can also broadcast. Rich compute-based features, many of those. I encourage you to read the documentation and review those features to fully understand Azure Service Bus and how to take advantage of what it provides. It is the only service that has transactional support on the messaging level. Storage Queues doesn't do that. Neither do the other services that we'll cover in a bit.

It's extremely reliable and allows you to implement complex workflows thanks to auto forwarding, metadata and other features that exist. With that, we'll switch to the next messaging service, which is Event Hubs.

Now Event Hubs is substantially different from Storage Queues and Service Bus. Why is that? Well, I'll use the analogy of a tape. First of all, Event Hubs, while it receives messages, doesn't handle individual messages. It handles those as streams of messages. The recording of those messages is always moving forward, just like the tape. You can play a stream over and over again, so you can basically retrieve the same messages again and again. Similar to a cassette, it has channels. A cassette has two channels, left and right. Event Hubs has a few more. We'll talk about this.

The data on each channel is different, so it's not the same messages. These channels, rather than calling them channels, we'll call them partitions. So let's have a look at some of the internals of Azure Event Hubs. As a service, it has partitions. We agreed upon that, and those partitions record different messages. Those are streams of messages, usually provided by event producers that can communicate with Event Hubs using one of the following protocols: HTTP, AMQP or Kafka. Yes, Kafka, even though it's not an Azure service. We'll talk about that in a few minutes.

Then on the other side, there are event receivers that are not receiving single events, individual events. They read through consumer groups and they read the streams. I'd like to point out the inverse direction of the consumption. It is not a message going to the receiver. It's receivers picking up those messages, and I'll elaborate on this a little bit later when we talk about additional aspects of Event Hubs. We can obviously have multiple receivers with their consumer groups that are reading messages from the partitions.

So does this sound familiar? Yes, and I've already mentioned it can consume events from Kafka. Are these two similar services? Yes, in a way. Both are ingesting technologies. They're not queues. It's an important differentiator. If we were talking about Storage Queues, that was a queuing service. When we talk about Azure Service Bus, that was a messaging but still a queuing service in a way. Event Hubs is an ingester. It's not a queue.

Now, both Event Hubs and Kafka use partitioned consumers, and those partitions are independent. The important concept that I want to point out is that the client is managing the cursor and not the other way around. In other words, it's the consumers, the clients, that need to remember the location of the last message in the stream that they have processed or read, as opposed to Storage Queues or Azure Service Bus. With those services, the broker remembers which message needs to be given to the client when the client asks for the next message.

Back to Event Hubs. Just like Kafka, it can scale to very high workloads. It's also conceptually nearly the same. Both are ingestors, which means that every Kafka user is close to being an Event Hubs user if they decide so. As a matter of fact, as of Kafka 1.0, there's full compatibility between Event Hubs and Kafka. If your system is on Kafka today, you could switch the connection string to Event Hubs and start using Event Hubs. Existing Kafka applications and tools will work with Event Hubs. Just the connection strings are different.
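A small in-memory sketch can show what a client-managed cursor means in practice. Nothing here is Event Hubs or Kafka API; the `partition` list, the `checkpoints` dictionary, and the `read` and `checkpoint` helpers are assumptions for illustration.

```python
partition = ["e0", "e1", "e2", "e3", "e4"]  # append-only stream of events
checkpoints = {}  # consumer group -> offset of the next event to read


def read(group, max_events):
    """Read from wherever this group's own cursor points."""
    offset = checkpoints.get(group, 0)
    return partition[offset:offset + max_events]


def checkpoint(group, processed_count):
    """The client, not the service, records how far it has got."""
    checkpoints[group] = checkpoints.get(group, 0) + processed_count


first = read("analytics", 2)
checkpoint("analytics", len(first))
second = read("analytics", 2)
checkpoint("analytics", len(second))

archive_view = read("archive", 10)  # a separate group has its own cursor

checkpoints["analytics"] = 0        # rewind the tape
replayed = read("analytics", 2)
```

Reading never removes anything from the partition, which is why a second consumer group sees the full stream and why rewinding the cursor replays the same events.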

Now, let's talk a little bit about scaling because, while it's an ingestor and it's capable of processing large amounts of events coming in, there is a certain throughput unit that you can utilize. Again, there is standard and there is premium tier. Premium tier, we're talking about throughput units. Very similar concept to the messaging units in Azure Service Bus. Now, the capacity that you can reserve for throughput units is only on the premium tier or Event Hubs dedicated tier.

A single throughput unit is about one megabyte per second of ingress, or about 1000 events per second. Egress is twice that and, if it's not enough, you can go up with throughput units and you can scale to a substantially larger amount of messages, both ingress and egress. Now, overages are throttled, and an exception will be thrown. But to handle that, there is a specifically designed feature called auto inflation. Now this is not auto scaling, and it's called auto inflation on purpose because, with insufficient throughput units, senders will be throttled.
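As a back-of-the-envelope check, the figures quoted above can be turned into a small sizing calculation. The constants come straight from the talk (about 1 MB/s or 1000 events/s of ingress per unit, egress twice the ingress bandwidth); the function name and the sample workloads are made up for illustration, and current service limits should be checked against the documentation.

```python
import math

INGRESS_MB_PER_TU = 1.0       # from the talk: about 1 MB/s per unit
INGRESS_EVENTS_PER_TU = 1000  # from the talk: about 1000 events/s per unit
EGRESS_MB_PER_TU = 2.0        # from the talk: egress is twice the ingress


def throughput_units_needed(ingress_mb_per_s, events_per_s, egress_mb_per_s):
    """Smallest number of units that satisfies every limit at once."""
    return max(
        1,
        math.ceil(ingress_mb_per_s / INGRESS_MB_PER_TU),
        math.ceil(events_per_s / INGRESS_EVENTS_PER_TU),
        math.ceil(egress_mb_per_s / EGRESS_MB_PER_TU),
    )


busy = throughput_units_needed(3.5, 2500, 9.0)
quiet = throughput_units_needed(0.2, 100, 0.4)
```

For the busy workload the egress requirement dominates (9 MB/s needs 5 units), even though ingress bandwidth alone would need only 4.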

With auto inflation turned on, the throughput units will be increased up to a cap which you have to define yourself. But those throughput units will never be scaled down if the ingress drops, for example. That's why it's auto inflation and not auto scaling. The other thing to remember about Event Hubs is the default retention period, which is seven days. Again, there is an additional feature specifically designed to handle this issue, called Event Hubs Capture, that allows you to work around this retention period, but not exactly the way that you would think, as in, oh, I can keep it for a longer period of time. Rather, it automatically sends those events, the Event Hubs data, into either a storage account that you provision or Azure Data Lake.
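Returning to auto inflation for a moment, the "up but never down" rule can be written as a one-line function. This only models the behaviour as described in the talk; `auto_inflate` and the sample load figures are hypothetical.

```python
def auto_inflate(current_units, required_units, cap):
    """Grow toward the required units, never past the cap, never down."""
    return max(current_units, min(required_units, cap))


cap = 5
units = 1
history = []
for required in [1, 3, 8, 2]:  # hypothetical load over time, in units
    units = auto_inflate(units, required, cap)
    history.append(units)
```

The trace ends at the cap and stays there after the load drops, which is the difference from auto scaling: bringing the units back down is a manual decision.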

What's nice about this is that it allows you to minimize the overhead associated with the downstream that is required to process the data. So for example, if you're auto inflating and receiving a boatload of events, rather than scaling up or scaling out your processing at the same time, you can redirect the data, those events, to a storage account, and process them on your own terms and conditions, which is very handy. So you're focusing on processing and not so much on the capturing.

So how does Capture work? With an Event Hub and the partitions that have the data, we define a capture rule for a storage account. We usually need a container to be defined, pre-provisioned, and two parameters, size and time. So for example, in my case, I'm going to specify that the size is no more than 100 megabytes for an output file or blob. The time should not exceed 10 minutes. So whichever comes first will generate a blob, and those blobs are automatically generated as the data streams into Azure Event Hubs.
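The "whichever comes first" rule can be sketched as a simple windowing function. This is a simplified model, not how the service is implemented: it only closes a window when the next event arrives, and `capture_windows` plus the sample events are assumptions for illustration.

```python
def capture_windows(events, max_bytes, max_seconds):
    """Group (timestamp_seconds, size_bytes) events into blobs.

    A blob is closed when adding the next event would exceed max_bytes
    or when max_seconds have passed since the blob's first event.
    """
    blobs = []
    current = []
    current_bytes = 0
    window_start = None
    for timestamp, size in events:
        too_old = current and timestamp - window_start >= max_seconds
        too_big = current and current_bytes + size > max_bytes
        if too_old or too_big:
            blobs.append(current)
            current, current_bytes = [], 0
        if not current:
            window_start = timestamp
        current.append((timestamp, size))
        current_bytes += size
    if current:
        blobs.append(current)
    return blobs


# Sizes in megabytes for readability; 100 MB and 10 minutes as in the talk.
events = [(0, 40), (60, 40), (120, 40), (800, 10), (810, 10)]
blobs = capture_windows(events, max_bytes=100, max_seconds=600)
blob_sizes = [len(blob) for blob in blobs]
```

The first blob is closed by the size limit, the second by the time limit, and the third holds whatever was left when the stream ended.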

The formatting is storage account and the container that we provide. After that, it's namespace with the Event Hub name partitioned in day time stamp, and the format is well known Kafka format. Now what happens if you use the .NET helpers again, if you're using .NET SDK, there is a safe batching support, for example, which means that you can send events without worrying that people exceed massive number of events that could be batched together, which is very convenient.

You build up the batch, knowing up front that whatever you send is going to be successful, unless obviously there was a communication failure. The consumption of events and managing the cursor on the client side can be simplified by the Event Hub client and Event Processor Host. So those two are basically an abstraction on top of a storage account with a blob, and it can manage the cursor for you. You don't have to write your own module.
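The safe batching idea can be sketched without any SDK. The `EventBatch` class and `pack` helper below are hypothetical stand-ins that loosely mirror a "try to add, start a new batch when full" style of API; the size limit is a parameter, not the service's real limit.

```python
from typing import Iterable, List


class EventBatch:
    """A batch that refuses events which would push it over its size limit."""

    def __init__(self, max_size_bytes: int) -> None:
        self.max_size_bytes = max_size_bytes
        self.events: List[bytes] = []
        self.size = 0

    def try_add(self, payload: bytes) -> bool:
        if self.size + len(payload) > self.max_size_bytes:
            return False  # caller should send this batch and start a new one
        self.events.append(payload)
        self.size += len(payload)
        return True


def pack(payloads: Iterable[bytes], max_size_bytes: int) -> List[List[bytes]]:
    """Split payloads into batches that are each known to fit before sending."""
    batches: List[List[bytes]] = []
    batch = EventBatch(max_size_bytes)
    for payload in payloads:
        if batch.try_add(payload):
            continue
        if batch.events:
            batches.append(batch.events)
        batch = EventBatch(max_size_bytes)
        if not batch.try_add(payload):
            raise ValueError("a single event exceeds the batch size limit")
    if batch.events:
        batches.append(batch.events)
    return batches
```

Because every batch is validated while it is built, the only remaining failure mode at send time is the communication failure the speaker mentions.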

So let's sum it up. The important things about Event Hubs. Remember, throughput over features. We haven't seen that many features. I've highlighted two of them, auto inflation and storing data rather than trying to process it in real time. Again, this is because the purpose of Event Hubs is not the same as a queuing or general messaging service. It is ingestion. It is not a queue replacement. It's important to realize that. And you have to manage the cursor yourself or use a helper library or SDK that provides that management for you. It is not something the service is going to do for you.

With that, the next and the last service is Event Grid, the new kid on the block. Let's understand why we need all of this. We've seen Storage Queues, Service Bus, Event Hubs; a combination of those three could potentially give us probably everything we needed. So why Event Grid? Well, traditional event receiving is usually based on the fact that there's something happening, some sort of event occurred, and that event needs to be communicated to a system for processing. The reliable way of doing so would be a queue, and we would have a process that would run 24/7 that would retrieve that message, that event, and process it.

The only challenge is that the processing part, the right side of the screen, would have to be running 24/7, polling the queues continuously, receiving those messages. If we're talking about Storage Queues, we're going to have to poll. If we're talking about Service Bus, we're going to have to run our process 24/7, pulling the events, even in situations where there are no events happening at all. An event is a lightweight notification of a condition or a state change.

I'm going to highlight the lightweight notification. So something very simple that happens might come up to us or might not, depending on whether the event happened or not. There is a theme that we have seen in recent years: more and more systems prefer the pub/sub model, where we want to subscribe to the events and be notified when they happen rather than retrieving them. We want to build reactive systems and reactive applications, where we only have to respond to an event that happened as opposed to running continuously.

We want the architecture as a whole to be event driven. There's also the conflict of push versus pull. With all the services that we've seen so far, messaging services, it was based on the pull model, not the push model. But the push model is much easier to comprehend and sometimes makes much more sense in a reactive world. So for example, one scenario that we can think about is serverless applications. Let's say I've got a storage account, and with that storage account I'm using Cognitive Services to process images.

If there are no new images uploaded to the storage account, I don't want to run 24/7 and check: are there any new blobs, are there any new blobs, are there any new blobs? No, I'm only interested in doing processing, doing work, when new blobs are detected. Another example is operations automation. When we provision new resources, we want to capitalize those or we want to kick off some process, for example. Again, it's very reactive in its nature. Something has to happen. The resource has to be created. Only then are we willing to kick off the work.

Not to mention the fact that we might want to do some third party integration, when we just want to announce that we have an event, but we don't want to control who's subscribing at what point or how many subscribers there are. We don't want to think about that aspect. All we want is to broadcast to the world and have the world handle whatever we have. So scenarios such as these have triggered the thought that, okay, the existing messaging services are not quite the services to fulfill this role. This is where Event Grid is coming in.

Event Grid is the man in the middle, if you wish. It connects the systems or services that have events to publicize, on the left side, with the event handlers that will be processing those, on the right side. Now, this is just an example. We'll go into details, but you will see a lot of Azure services on the left side, except the last two items, cloud events and custom events. Cloud events are cross-cloud events, so those do not have to be coming from Azure services or Azure at all. Custom events are events that you can emit yourself; you come up with whatever events make sense when you want to perform an integration or notifications.

Now, on the handling side, on the right side, we have several categories of what handlers can be. So for example, it can be serverless code with Functions. A canonical example: I'm creating a storage blob and a function is automatically triggered. Or the same thing could be done with serverless workflow integrations with Logic Apps. Or we can leverage the previous messaging services, such as Service Bus, Event Hubs or Storage Queues, to do the later processing, leveraging Event Grid.

So Event Grid can actually submit or send those messages to the other messaging services. Other applications and services are supported too, including the simplest thing, webhooks. So anything that can receive a webhook can become a subscriber to Event Grid. Now, Event Grid is a fairly new service. There are continuous additions coming; for example, Key Vault was added recently, and Machine Learning. It's not advanced as well. Azure SignalR. In preview, as of recently, are Azure App Service web apps and Azure Redis Cache.

Now why don't we see all of the Azure services emitting those events? It's very simple. Microsoft is taking a very conservative approach: customers should let them know what events they're interested in. Rather than building events and hoping that everybody will come to use them, customers should be letting Microsoft know what events they're interested in and what the use cases are, and if there's enough merit, they build those events. So if you're interested in Azure Event Grid events, let Microsoft know.

So let me briefly go over some of the concepts of the Event Grid. Some of them we've seen. The event is what is happening, what are we notifying about. Event source is where it's taking place. It is important because, if it's Azure service, there is less work to do. But if, for example, it's a custom topic, then there's a little bit more work that needs to be done because we need to provide a topic, create a custom topic where those messages will be published.

Event subscription: a subscription is the intent to receive messages with some criteria. So for example, I'm interested in all the blobs, or I'm only interested in the blobs that have a jpeg extension and whatnot. Then event handlers: these are the specific applications or services that will be reacting to those events that subscriptions are attaching. Now let's have a look at an example of an Event Grid message, or Event Grid event. Well, it's JSON, and while JSON is schemaless, there is a certain structure.

For example, there is topic, subject and event type. Event type is extremely important because it uniquely identifies the type of event that is coming in for processing. The other mandatory block is the data block, which is all the data required to be able to process that event. But be aware, in this case, we have an event called Blob Created. So in storage, under a container, the blob was created. It is not the blob data that is coming in here, it's the metadata because, remember, this is a lightweight notification. It should be enough for the subscribers, for the receivers, to identify whether they want to process it or not.

So in this case, the content length of the blob is 500-something bytes. We also have the URL, which indicates where the actual blob data can be found for processing if need be. The message size used to be 64 kilobytes. Now it can go up to one megabyte, but keep in mind that the charge is based on the number of 64K chunks. So don't go too crazy with it. You're not going to save money if you pack it into a single message.
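To make the structure and the chunk-based charging concrete, here is a small sketch. The event below follows the general shape described above, but every value in it (IDs, names, URL, sizes) is a made-up placeholder, and `billable_chunks` is a hypothetical helper that only illustrates the arithmetic of counting 64K chunks.

```python
import math

CHUNK_BYTES = 64 * 1024  # events are charged per 64 KB chunk, per the talk

# Placeholder Blob Created event: metadata about the blob, not the blob itself.
sample_event = {
    "id": "00000000-0000-0000-0000-000000000000",
    "topic": "/subscriptions/<subscription-id>/.../storageAccounts/<account>",
    "subject": "/blobServices/default/containers/images/blobs/photo.jpg",
    "eventType": "Microsoft.Storage.BlobCreated",
    "eventTime": "2019-01-01T12:00:00Z",
    "data": {
        "contentType": "image/jpeg",
        "contentLength": 524,
        "url": "https://<account>.blob.core.windows.net/images/photo.jpg",
    },
    "dataVersion": "1",
}


def billable_chunks(event_size_bytes: int) -> int:
    """Number of 64 KB chunks an event of the given size is counted as."""
    return max(1, math.ceil(event_size_bytes / CHUNK_BYTES))
```

Under this arithmetic a 1 MB event counts as 16 chunks, which is why packing more into a single message does not reduce the charge.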

Now about filtering. I'm not going to go into filtering in detail because the documentation is there and can explain it all, but you can pretty much filter on any property within the schema that we've seen, including values in the data, and there is a range of different functions and possibilities. Now, Event Domains is an interesting feature within Event Grid that simplifies publishing. Probably the question is: why do I need to simplify publishing of the events? It's fairly straightforward.

What happens if we have multiple subscribers, but not all of those subscribers should be able to receive one or the other type of message? Well, in that case, rather than creating multiple topics or a topic per customer, there could be a single topic with the event domain endpoint, which can control, using Azure Active Directory, which customer can retrieve what message types. So from the publisher perspective, the work is simplified because the publisher just emits the messages to a single topic, and Azure Active Directory ensures that only those that can access specific message types will be seeing those, and they will not receive the message types they are not supposed to.
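The subscription filtering mentioned earlier (all blobs versus only blobs with a jpeg extension) can be sketched as a plain predicate. The function and parameter names are hypothetical; they loosely echo the idea of filtering on event type and on how the subject begins or ends, and leave out the richer filtering on data values.

```python
from typing import Iterable, Optional


def matches(
    event: dict,
    included_event_types: Optional[Iterable[str]] = None,
    subject_begins_with: str = "",
    subject_ends_with: str = "",
) -> bool:
    """Decide whether a subscription with these criteria would receive the event."""
    if included_event_types is not None and event["eventType"] not in set(included_event_types):
        return False
    subject = event["subject"]
    # Empty prefix/suffix match everything, i.e. "all the blobs".
    return subject.startswith(subject_begins_with) and subject.endswith(subject_ends_with)


blob_created = {
    "eventType": "Microsoft.Storage.BlobCreated",
    "subject": "/blobServices/default/containers/images/blobs/photo.jpg",
}
```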

Now when we talked about topics, subscriptions, filters, we've seen it all. We've seen it at least with Azure Service Bus, so then why Event Grid? Why is it so different? Well, it's cloud native by design. Unlike other services, for example Service Bus, which was designed and went into production eight or nine years ago, Event Grid was designed recently with cloud in mind, with the scale that is required. It is serverless friendly. It's very easy to start using Event Grid. If you have played with Azure Functions, for example, you have seen that wiring blob storage to an Azure Function using Event Grid is extremely simple.

It is engineered with reliability and scale in mind, so the messages are not failing if the receiver is not available. I'll talk about it in a little bit. And it embraces support for CNCF, the Cloud Native Computing Foundation, CloudEvents. Sorry, forgot the name. Basically a standardized format for events. We'll look into that in a second.

So cloud native by design: availability with a four-nines SLA, near real time event delivery, so it's less than one second end to end. At-least-once delivery. Dynamic scale: you don't have to worry about provisioning anything. It's all handled by the service. You don't have throughput units, you don't have messaging units, none of that. Internally, the service can handle tens of millions of events per second per region, or hundreds of millions of subscriptions per region. We don't have to do anything about that.

It is platform agnostic because it's technically based on the webhook. Anything that can receive a webhook can handle Event Grid events. It is language agnostic because the transport is HTTP based. Now, when I'm talking about scale and reliability, by default there are 30 delivery attempts within the timeframe of 24 hours. If the messages are failing, there are retries. This is the default scheme for those retries. They can be reconfigured and redefined if you want fewer delivery attempts or a shorter time.
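The retry limits described above boil down to two independent bounds: a maximum number of attempts and a time window. A minimal sketch, assuming a hypothetical `should_retry` helper in which the attempt limit is passed in explicitly and the window defaults to the 24 hours mentioned in the talk:

```python
def should_retry(
    attempts_made: int,
    minutes_since_first_attempt: float,
    max_attempts: int,
    time_to_live_minutes: float = 24 * 60,
) -> bool:
    """Return True while another delivery attempt is still allowed."""
    # Delivery stops as soon as EITHER limit is reached; the event then
    # becomes a candidate for dead lettering (if that is configured).
    within_attempts = attempts_made < max_attempts
    within_window = minutes_since_first_attempt < time_to_live_minutes
    return within_attempts and within_window
```

Lowering either parameter is the equivalent of reconfiguring the policy for fewer attempts or a shorter time. The actual spacing between attempts is decided by the service and is not modeled here.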

There is also the possibility of dead lettering. It does require a storage account and a container to be provisioned. Every message that fails delivery after the retries is stored as a blob, as an individual blob. The downside of this feature currently is that, if you want to retry any messages, unlike Storage Queues and unlike Azure Service Bus, you cannot just resend the message back to the destination. In this case, that would be republishing the blob to the topic. Actually, you have to convert the contents of the blob and publish it again as a message. Hopefully Microsoft will patch this hole and address this issue, but messages are not lost. That's the important thing.
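The conversion step mentioned above, turning a dead-lettered blob back into publishable events, might look roughly like this. It is a sketch under assumptions: that the blob holds JSON (one event or an array of events) and that the delivery diagnostics are carried in the extra properties named in `DEAD_LETTER_FIELDS`. Verify both against real dead-letter blobs before relying on it.

```python
import json
from typing import List

# Assumed names of the diagnostic properties added to dead-lettered events.
DEAD_LETTER_FIELDS = {
    "deadLetterReason",
    "deliveryAttempts",
    "lastDeliveryOutcome",
    "publishTime",
    "lastDeliveryAttemptTime",
}


def events_to_republish(blob_text: str) -> List[dict]:
    """Parse a dead-letter blob and strip the diagnostics from each event."""
    records = json.loads(blob_text)
    if isinstance(records, dict):
        records = [records]  # tolerate a single event as well as an array
    return [
        {key: value for key, value in record.items() if key not in DEAD_LETTER_FIELDS}
        for record in records
    ]
```

The returned events would then be published to the topic again with whatever client is already used for publishing.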

Now when we're talking about Event Grid and CNCF cloud events, what it does is open up the possibility of leveraging not just Azure but also other clouds. Because the cloud event is an open format, agreed upon by all the vendors, major vendors at least, you could be publishing your events from GCP or from Azure and consuming those either in Azure Functions or AWS Lambda. So the interop between various cloud providers is becoming a much better story.

What that cloud event schema looks like is very similar to what the Event Grid event schema looks like. Ignore the little typo for the event version. It's 1.0, not 0.1. You will see the same pattern. There is an event type which specifically identifies the type of the event so that we can handle it. And there is a data block which provides the metadata that is required. So this is exactly the same event that we've seen with the storage blob, just expressed as a cloud event. Azure services can handle it, or you can handle it on any other cloud.
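The similarity between the two schemas can be shown as a field-by-field mapping. This is a sketch of one plausible mapping onto the core CloudEvents 1.0 attributes (`specversion`, `id`, `source`, `type`, plus the optional `subject` and `time`); the choice of the Event Grid topic as the `source` is an assumption, and the function name is hypothetical.

```python
def to_cloud_event(event_grid_event: dict) -> dict:
    """Re-express an Event Grid schema event using CloudEvents 1.0 attribute names."""
    return {
        "specversion": "1.0",
        "id": event_grid_event["id"],
        "source": event_grid_event["topic"],   # assumed mapping: topic -> source
        "subject": event_grid_event["subject"],
        "type": event_grid_event["eventType"],
        "time": event_grid_event["eventTime"],
        "data": event_grid_event["data"],      # metadata payload is carried over as-is
    }
```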

So it is ubiquitous. Today we've seen 10-plus Azure publishers. Hopefully at some point in time, most of the Azure services will have publishers. But again, you have to remember, if you want something, you have to ask for it and explain why. It is serverless friendly, as I mentioned before. So for example, if you go to any service that supports Event Grid already, you will find in the menu the Events section, which allows you to wire that service instance to whatever subscribers will be receiving the events the service will be publishing.

We can also achieve that using PowerShell or the Azure CLI, whatever is more convenient. So events and streams, that's an interesting one. I've mentioned before that Event Grid topics can actually have subscribers that are other messaging services, such as Storage Queues, Service Bus or Event Hubs. The reason for that is that these messaging services can be publishers or subscribers. The nice thing about it is that we can level the load associated with Event Grid, because Event Grid can be pretty much floodgates that are opened, and our system would not necessarily be able to cope with the amount of events that are received.

This is where the combination of the services is very helpful. Now when it comes to security, there's a validation handshake for the subscribers to ensure that no one has been subscribed accidentally when they did not intend to receive those events. What happens is that an event of a very specific type, a subscription validation event, is sent, and its payload contains a validation code that has to be sent back to a very specific URL. To basically prove the intent to subscribe, you need to echo back that validation code.
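The echo-back part of the handshake can be sketched as a framework-free webhook handler. The event type string and the `validationCode` / `validationResponse` property names follow the commonly documented shape of this handshake, but treat them as assumptions to check against current documentation; the handler function itself is hypothetical.

```python
from typing import List, Optional

VALIDATION_EVENT_TYPE = "Microsoft.EventGrid.SubscriptionValidationEvent"


def handle_webhook(events: List[dict]) -> Optional[dict]:
    """Return the JSON body to respond with, or None for ordinary events."""
    for event in events:
        if event.get("eventType") == VALIDATION_EVENT_TYPE:
            # Prove the intent to subscribe by echoing the code back.
            return {"validationResponse": event["data"]["validationCode"]}
    # Ordinary events would be processed here; nothing to echo.
    return None
```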

All right, so let's sum it up. Event Grid, what's important? Event driven systems. Whenever we want to deal with events on a massive scale, that's Event Grid. The pub/sub: yes, we have pub/sub with Service Bus, but when we want to make it more scalable, across clouds and services, Event Grid is a better candidate. It is not an enterprise messaging replacement, and it's important to remember that. High scale, high throughput, retries baked in. So there is reliability. Messages are never lost if they cannot be delivered. They can be dead lettered into a storage account.

Support for cloud events with the open format, agreed upon by multiple cloud vendors, which makes it a good candidate for cross cloud. So let's summarize everything we've seen so far for these services. It's great. I've highlighted a lot of cons, some of the pros of these services, but how do we really choose the right Azure messaging services? Well, it would be extremely simple if I could give you a diagram where you would just follow the arrows with simple yes and no and get to the services that you need to use on your project or multiple projects.

Unfortunately, the reality is way more complicated and sophisticated, but still I hope that from this webinar you have learned a few things such as, for example, that when you're working with serverless, most likely you will evaluate Storage Queues and Event Grid because of the nature or the requirements of serverless implementations. If you're dealing with big data, it's a no brainer that Event Hubs should probably be the first one to evaluate. Also, if you're using another service, let's say Storage Queues, and you have to deal with big data, don't try to overload Storage Queues with the responsibility of an ingestor, because that's not what they were designed for.

Now, if you're building enterprise systems, workflows dealing with money, every single message counts and is important. It's not telemetry. Then most likely you're going to be evaluating Azure Service Bus, especially if the system is an organization specific system. It doesn't have to interact with anything external. But, at the end of the day, you have to remember that it's the right tools for the right job. It's not a single tool. It's a combination.

An example could be the traditional system that we looked at, with the back end that is continuously pulling messages from the queues to process those. It might not necessarily be the same solution in the future. For example, let's say we have messages that are coming in, but not a continuous flow of those messages. They are sporadic. Sometimes we can have tens of messages or thousands of messages per second, and sometimes we can sit idle for a few hours. Rather than having, on the right side, back end processing 24/7, we could have messages sent to a queue, Storage Queues, Azure Service Bus queues specifically in my example, and combine it with Event Grid.

Event Grid can work with Azure Service Bus, and it has an event called "active messages available with no listeners". So we could subscribe to that event in our system. Upon receipt of this event, we would kick off the processing of the messages until we are done, with no more messages to process, and then could shut down our processing. This is the idea that serverless in a way implements for us, because for example our AWS Lambda or Azure Functions are not running 24/7. They only run for the period of time that is necessary. But it doesn't have to be Azure Functions or AWS Lambda to take advantage of this pattern.
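The wake-up, drain, shut-down pattern described above can be sketched with an in-memory queue standing in for the Service Bus queue. The event type string is the commonly documented name for the "active messages available with no listeners" event, taken here as an assumption; the function and handler names are hypothetical.

```python
from collections import deque
from typing import Callable, Deque

WAKE_EVENT_TYPE = "Microsoft.ServiceBus.ActiveMessagesAvailableWithNoListeners"


def on_event(event: dict, queue: Deque[str], handle: Callable[[str], None]) -> int:
    """Start processing only when notified, drain the queue, then stop.

    Returns the number of messages processed during this activation.
    """
    if event.get("eventType") != WAKE_EVENT_TYPE:
        return 0  # not our wake-up call; stay idle
    processed = 0
    while queue:  # keep receiving until no more messages are available
        handle(queue.popleft())
        processed += 1
    return processed  # the processor can now shut down until the next event
```

In a real system the `while` loop would be a receive loop against the queue with a short idle timeout, hosted in whatever process is started by the notification.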

We can combine in this case Event Grid and Azure Service Bus to benefit. So with that, I would like to thank you and to switch to the questions.

01:04:22 Adam Ralph

Thanks very much, Sean. So yes, as you mentioned, we've got a bit of time for some Q&A now. So I'm going to pick out a couple of questions, and apologies in advance if I pronounce anyone's name incorrectly. Mark Tarling has asked an interesting question. Mark says, "Does scheduled delivery of the messages in Azure Service Bus always deliver the message on the end of the queue? If so, isn't it better to use another system or queue to handle this?"

That's a good question. Indeed, what will happen is that the message will show up at the end of the queue. It really depends on whether the queue is busy or not. If the queue has hundreds of thousands of messages sitting in front of that scheduled message, it will have to wait for the processing of the other messages. The idea is to be able to scale out message processing so as not to be in that situation. Now, whether it would be more meaningful to use another system, another queue: potentially yes, but it really depends on the scenario.

Sometimes customers prefer to store something that is scheduled in the database, but then you have to have periodic querying of the data to see what's up for grabs or what needs to be executed. Sometimes it's much easier to just schedule a message. If we're talking about real time systems, for example when a message has to be scheduled at a specific point in time and has to be processed exactly at that time, probably this feature is not relevant. But for general use, when you want to schedule something as a reminder, for example, and ensure that it takes place, just not before that date and time, then this feature is fine.

01:06:24 Adam Ralph

Okay, thanks Sean. Another good question has come in from Adelina Rodriguez. This was actually sent in ahead of time by email. Adelina asks, "We're intensively using messaging in my organization, but in an on premises fashion using JMS and applications deployed on an Oracle WebLogic server. I'd be interested in finding out whether Azure offers up options for message handling in a transactional way. For example, in the case of an error when saving to a database, rolling back the transaction would also roll back the message to its starting point."

Yeah, it's not an easy question. So first of all, JMS support, I mentioned that it's already in place, not 100% completed but they're getting there. In terms of transactional processing, that would cover both messages and other operations, it is not possible. DTC does not execute well in any environment that is a large environment, let alone cloud environment. What would need to happen is probably ... Sorry, I will step back.

With Azure Service Bus, transactional message processing is possible, is doable. So the incoming message is in the same transactional context as the outgoing messages. Now what happens is that the operations, such as writing to a database, would need to be performed before the messaging transaction is complete. So in essence, we can create code that would guarantee that incoming messages will not be completed and outgoing messages will not be dispatched if the database operation was not successful. So, that is possible.
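The ordering described above (do the database work first, complete the messaging transaction only afterwards) can be sketched with plain callables standing in for the database and the broker. Everything here is a hypothetical in-memory illustration of the control flow, not Service Bus API usage.

```python
from typing import Callable


def process_message(
    message: str,
    save_to_db: Callable[[str], None],
    send: Callable[[str], None],
) -> bool:
    """Return True if the incoming message may be completed, False to retry it."""
    # Outgoing messages are only prepared here, not yet dispatched.
    outgoing = [f"processed:{message}"]
    try:
        save_to_db(message)  # the database work happens before messaging completes
    except Exception:
        # Database failed: nothing is sent and the incoming message is not
        # completed, so it becomes available for processing again.
        return False
    for out in outgoing:
        send(out)  # in a real broker this would be part of one messaging transaction
    return True
```

Note that the database write and the messaging transaction are still two separate steps, so the database operation should be safe to repeat if the message is processed again.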

It's possible with Azure Service Bus because of its native support for a feature called Send Via, which in essence supports messaging transactions.

01:08:35 Adam Ralph

Okay, thanks Sean. So there are some more questions outstanding, but as I promised, we will follow up with those offline. Before I wrap up, I'd just like to mention that our colleagues will be speaking later this month at the Segfault University in Warsaw, Poland, and Sean will be speaking at Prairie Dev Con in Calgary. So if you can get yourself over there, then you'll get to meet Sean in person. You'll also see us at NDC Minnesota in September, and shortly after at Explore DDD in Keystone, Colorado.

01:09:15 Adam Ralph

You can also go to particular.net/events and find us a conference near you. That's all we have time for today. On behalf of Sean Feldman, this is Adam Ralph saying goodbye for now and see you at the next Particular live webinar. Thank you.
