What they don't tell you about migrating a message-based system to the cloud
Migrating a message-based system from on-premises to the cloud is a colossal undertaking. If you search for “how to migrate to the cloud”, there are reams of articles that encourage you to understand your system, evaluate cloud providers, choose the right messaging service, and manage security and compliance.
Curiously, what’s often missing from the discussion are the details, like how to handle all the other on-prem systems that integrate with your system, both upstream and downstream, and that can’t be upgraded at the same time. This gets even trickier when those integrations run over on-prem-only technologies, like MSMQ, that don’t integrate out of the box with cloud alternatives like Azure Service Bus or Amazon SQS. It’s as if they’re saying, “Have you documented your system? Great! Have you chosen a cloud provider? Awesome! Do you have all the services in place? Wonderful! Now go rewrite all your code… we’ll wait… are you done yet? What are you looking at me for? I’ve already told you to plan carefully. I can’t do EVERYTHING for you.”
In short, there’s a big gap between “everything works on-prem” and “everything works entirely on the cloud” that often gets glossed over. So we’re going to explore this scenario with a small, fictitious airline, called (and really, what else would we call it) ParticulAir.
🔗I want to move one of my on-prem systems to the cloud
ParticulAir has a legacy system that’s been running successfully for many years with a number of features, including flight upgrades. These upgrades are handled asynchronously, as the airline wants to prioritize upgrades for its most valuable frequent flyers over others. Technically, this is all done over MSMQ where requests are processed and eventually granted or rejected, notifying other services of the outcome. Here’s a simplified diagram of how that works:
Now, the business wants a new mobile app that will enable users to do all of the things currently available over the web, including requesting flight upgrades. They’re also thinking of migrating the legacy system to the cloud to save on costs, get dynamic scaling, and all the other benefits of the cloud.
While they would like to eventually migrate, refactor, or rewrite the system to be cloud-native, that could potentially take years for a big system. However, if they could get that new mobile app up and running by integrating it with the existing systems, the shorter time-to-market would definitely be appreciated.
Luckily, there’s a way to do just that.
🔗The Messaging Bridge Pattern
The Messaging Bridge is an intermediary component that receives messages from one queuing system and transfers them to a different, otherwise incompatible queuing system elsewhere.
In ParticulAir’s case, that would mean the cloud-hosted back-end for the mobile app would put a message in the relevant cloud queuing service (Azure Service Bus or Amazon SQS) and use the “bridge” to route it to the legacy MSMQ on-prem system. Here’s what that would look like:
The immediate benefit of a bridge in this scenario is that new functionality (e.g. the mobile app) can be built using modern, cloud-based technology while still leveraging the tried-and-true code in the various legacy systems. This provides some breathing room for the cloud migration. New features can be added without having to rewrite the legacy system at the same time. Even better, depending on the implementation of the bridge, the legacy systems may not need to be touched at all. As long as they receive the MSMQ messages with the required data, they shouldn’t care where those messages originated.
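To make the pattern concrete, here is a minimal in-memory sketch. It assumes nothing about any real transport: `InMemoryQueue`, `Bridge`, and the `RequestUpgrade` message are hypothetical stand-ins invented for illustration, not part of Azure Service Bus, Amazon SQS, MSMQ, or NServiceBus.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Message:
    """A transport-agnostic envelope: a body plus optional headers."""
    body: dict
    headers: dict = field(default_factory=dict)


class InMemoryQueue:
    """Hypothetical stand-in for a real queue on some transport."""

    def __init__(self, name: str) -> None:
        self.name = name
        self._messages = deque()

    def send(self, message: Message) -> None:
        self._messages.append(message)

    def receive(self) -> Optional[Message]:
        # Return the oldest message, or None when the queue is empty.
        return self._messages.popleft() if self._messages else None


class Bridge:
    """Receives from a queue on one transport and forwards to another."""

    def __init__(self, source: InMemoryQueue, destination: InMemoryQueue) -> None:
        self.source = source
        self.destination = destination

    def pump(self) -> int:
        """Forward every pending message and return how many were moved."""
        moved = 0
        while (message := self.source.receive()) is not None:
            self.destination.send(message)
            moved += 1
        return moved


# The cloud side and the on-prem side use the same logical queue name.
cloud_queue = InMemoryQueue("ParticulAir.UpgradeService (cloud)")
msmq_queue = InMemoryQueue("ParticulAir.UpgradeService (on-prem)")
bridge = Bridge(cloud_queue, msmq_queue)

# The mobile back-end only ever talks to the cloud queue.
cloud_queue.send(Message(body={"type": "RequestUpgrade", "booking": "ABC123"}))
moved = bridge.pump()

# The legacy system reads from its own queue, unaware of the origin.
received = msmq_queue.receive()
```

A real bridge would also have to deal with retries, transactions, and differences in message format between transports, all of which this sketch leaves out.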
Now eventually, ParticulAir does want to migrate their systems away from the on-prem, MSMQ technology. This is another instance where the Messaging Bridge Pattern can help. With a bridge in place, the entire system doesn’t need to be migrated all at once. Instead, a more gradual process can be used, moving one endpoint at a time from MSMQ to the cloud, with the bridge transparently taking care of the routing. This can remove a lot of the complexity and risk inherent in a large-scale migration. Let’s see how with an example.
🔗Don’t everyone migrate all at once now
Remember from the diagram above that the Upgrade component also publishes UpgradeFulfilled events that the Marketing component listens to (all using MSMQ). Once the Upgrade component is migrated to Azure, those very same events will be published to an Azure Service Bus topic called “UpgradeFulfilled”. With a bridge in place, configured to route messages from the “UpgradeFulfilled” Azure Service Bus topic to the MSMQ “UpgradeFulfilled” queue, the Marketing component can continue running unchanged in the on-prem environment.
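The event flow in this example can be sketched in a few lines. This is an illustrative toy, assuming a hypothetical `Topic` class and a plain `deque` standing in for the on-prem MSMQ queue; the event payload is invented, and no real messaging API is used.

```python
from collections import deque


class Topic:
    """Hypothetical stand-in for a cloud topic that fans events out to subscribers."""

    def __init__(self, name):
        self.name = name
        self._subscribers = []

    def subscribe(self, handler):
        self._subscribers.append(handler)

    def publish(self, event):
        for handler in self._subscribers:
            handler(event)


# Stand-in for the on-prem MSMQ queue the Marketing component already reads.
marketing_msmq_queue = deque()


def bridge_topic_to_queue(topic, queue):
    """Subscribe the bridge to the topic on behalf of an on-prem queue."""
    topic.subscribe(queue.append)


upgrade_fulfilled = Topic("UpgradeFulfilled")
bridge_topic_to_queue(upgrade_fulfilled, marketing_msmq_queue)

# The migrated Upgrade component publishes as it would to any cloud topic.
upgrade_fulfilled.publish({"booking": "ABC123", "cabin": "Business"})

# The unchanged Marketing component drains its queue exactly as before.
event = marketing_msmq_queue.popleft()
```

The point of the sketch is that neither side refers to the other: the publisher knows only the topic, and the subscriber knows only its own queue.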
Without using some kind of bridge, both components would need to be migrated or at the very least “duplicated” (after modifying the on-prem component to talk to a cloud-accessible database). The thing is, that Marketing component probably talks to other components itself, which then would have to go through the same migration or duplication exercise (together with the components they talk to).
This is far riskier than if just one component could be migrated, because it means all those components would need to be tested and deployed in tandem. Imagine if any issues arose during testing or (shudder) in production, and you had to pinpoint where the problem lay. This is much easier if you deployed only a single component compared to a series of interdependent ones. Not to mention that it would be far easier to roll back a single component to a previous version. It gets even more complicated if the different components are managed by different teams.
All of this would also slow down the timeline for the mobile application to release its flight upgrade feature.
These problems go away if you have the ability to migrate a single endpoint at a time. Once a messaging bridge is in place and configured, teams can migrate their endpoints however they see fit without worrying about how their outgoing messages get routed to other endpoints.
So far, we’ve been almost as hand-wavy as most of the traditional cloud migration literature. It’s all well and good to say, “just use a bridge”, but how do you implement one?
Here’s the good part: you don’t have to.
🔗The NServiceBus Messaging Bridge
The NServiceBus Messaging Bridge was designed specifically for these scenarios. It’s an implementation of the Messaging Bridge Pattern that takes care of routing messages between different queuing systems.
In our initial mobile app Flight Upgrade scenario, the bridge sits between the Azure-hosted mobile back-end and the MSMQ-based on-prem system, routing messages from Azure Service Bus to MSMQ:
The code in the mobile back-end can send messages to an Azure Service Bus queue with the same name as the MSMQ queue of the on-premises upgrade component (say, “ParticulAir.UpgradeService”). It can ignore that a bridge is being used at all and behave as if the upgrade component were also hosted in Azure. When the NServiceBus Messaging Bridge is configured appropriately, messages from that Azure Service Bus queue are transparently forwarded to where they need to go.
This means that when we eventually migrate our Upgrade component to Azure, where it will listen to the “ParticulAir.UpgradeService” Azure Service Bus queue, we won’t need to touch our mobile back-end. Instead, we would reconfigure the Bridge to stop listening to the “ParticulAir.UpgradeService” queue and have it listen to the “ParticulAir.UpgradeFulfilled” Azure Service Bus topic, forwarding those events over MSMQ to the downstream Marketing component, which wouldn’t need to be modified either.
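One way to picture this reconfiguration is as a table recording which transport each endpoint currently lives on, from which the bridge works out which queues it must stand in for on the other side. The sketch below is a simplified model, not the NServiceBus Messaging Bridge API; `proxy_queues`, the transport labels, and the endpoint names `ParticulAir.MobileBackend` and `ParticulAir.Marketing` are assumptions made for illustration.

```python
TRANSPORTS = ("msmq", "azureservicebus")


def proxy_queues(endpoint_locations):
    """For each transport, list the endpoints the bridge must stand in for there.

    An endpoint needs a proxy on every transport except the one it lives on,
    so that senders on the other side can address it by its usual name.
    """
    proxies = {transport: [] for transport in TRANSPORTS}
    for endpoint, home in endpoint_locations.items():
        for transport in TRANSPORTS:
            if transport != home:
                proxies[transport].append(endpoint)
    return {transport: sorted(names) for transport, names in proxies.items()}


# Before the migration: only the mobile back-end lives in the cloud.
before = proxy_queues({
    "ParticulAir.MobileBackend": "azureservicebus",
    "ParticulAir.UpgradeService": "msmq",
    "ParticulAir.Marketing": "msmq",
})

# After migrating the Upgrade component: one entry in the table changes.
after = proxy_queues({
    "ParticulAir.MobileBackend": "azureservicebus",
    "ParticulAir.UpgradeService": "azureservicebus",
    "ParticulAir.Marketing": "msmq",
})
```

In this model, migrating an endpoint changes a single entry. The bridge stops standing in for the Upgrade component on the Azure Service Bus side, while the mobile back-end keeps sending to the same queue name.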
Through this process, we could migrate all the relevant components in this scenario, one at a time, to run on the cloud. When the last component is migrated, we’d remove the Bridge from the solution completely.
Until this goal is met, however, the messaging bridge can make sure your migration can happen safely and in smaller, more manageable chunks.
🔗Summary
We’ll reiterate the standard advice that migrating a complex distributed system to the cloud requires a well-planned, incremental approach that maintains system integrity and minimizes risks.
The Messaging Bridge Pattern can be a crucial part of your migration and, if your system uses NServiceBus, you can even wash your hands of most of the implementation details.
To see it in action, check out our sample on bridging messages between endpoints using MSMQ and Azure Service Bus.