Lost messages are just as bad as lost luggage
You’re standing in the airport, waiting to pick up your bag. Did you ever stop to think about all the software systems involved in tracking your luggage on your journey? Everything that happens between the moment you drop it off at your departure airport and the moment you breathe that sigh of relief as it shows up on the baggage carousel is a complex story of messaging and system integrations.
I recently had the opportunity to chat with the lead developer for the luggage arrival system at a major Asian international hub. 1 He told me how using NServiceBus made it possible to get all the systems in the airport to work together reliably.
🔗Reading from the firehose
ARINC is a global service used by many airlines, but it’s not the easiest service to work with. Once you connect and subscribe, you process the information coming in through the TCP connection. This is not JSON or XML; it’s a low-level byte stream made up of fixed-width fields. An arcane set of rules determines what constitutes a message and how each one is formed. Translating the stream of incoming bytes into a series of messages can be tricky, given that there’s little to mark where one message ends and the next begins.
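To make the framing problem concrete, here is a minimal Python sketch of splitting a fixed-width byte stream into records. The record layout (a 10-byte bag tag, a 6-byte flight number, and a 5-byte departure date) and the names `BagRecord` and `FixedWidthFramer` are invented for illustration. The real ARINC format is not described in this post and is certainly more involved.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical layout, invented for illustration only:
# 10-byte bag tag + 6-byte flight number + 5-byte departure date (DDMMM).
RECORD_SIZE = 10 + 6 + 5


@dataclass(frozen=True)
class BagRecord:
    bag_tag: str
    flight: str
    departure: str


class FixedWidthFramer:
    """Buffers raw TCP chunks and returns only complete fixed-width records."""

    def __init__(self) -> None:
        self._buffer = bytearray()

    def feed(self, chunk: bytes) -> List[BagRecord]:
        self._buffer.extend(chunk)
        records = []
        # A chunk may hold half a record or several, so slice off whole
        # records and keep any remainder buffered for the next chunk.
        while len(self._buffer) >= RECORD_SIZE:
            raw = bytes(self._buffer[:RECORD_SIZE])
            del self._buffer[:RECORD_SIZE]
            records.append(
                BagRecord(
                    bag_tag=raw[0:10].decode("ascii").strip(),
                    flight=raw[10:16].decode("ascii").strip(),
                    departure=raw[16:21].decode("ascii"),
                )
            )
        return records
```

Because nothing in the stream marks a record boundary, the framer has to count bytes itself, which is why a single dropped byte would misalign everything that follows.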
The biggest challenge here is processing this data reliably, because it’s not straightforward to reread information from ARINC. If you miss it the first time, you have a problem. Messages usually arrive at a rate of only about 20 per second, but the system needs to be reliable and scalable enough to handle between 1,000 and 2,000 messages per second.
The only safe way to deal with information like this is to immediately append it to a file on disk, and then pass it to a message queue using NServiceBus.
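The write-to-disk-first idea can be sketched in a few lines of Python. This is an illustration under assumptions, not the airport's code: `journal_then_enqueue` and `read_journal` are hypothetical names, the length-prefixed journal format is invented, and a standard-library `queue.Queue` stands in for sending a message with NServiceBus.

```python
import os
import queue
from typing import List


def journal_then_enqueue(raw: bytes, journal_path: str, outbox: queue.Queue) -> None:
    """Append raw bytes to a journal file, force them to disk, then enqueue."""
    with open(journal_path, "ab") as journal:
        # A 4-byte length prefix lets the journal be replayed entry by entry.
        journal.write(len(raw).to_bytes(4, "big"))
        journal.write(raw)
        journal.flush()
        os.fsync(journal.fileno())  # ask the OS to persist before continuing
    # Only after the bytes are on disk is the message handed to the queue
    # (a stand-in here for an NServiceBus send).
    outbox.put(raw)


def read_journal(journal_path: str) -> List[bytes]:
    """Replay every complete entry from the journal, oldest first."""
    entries = []
    with open(journal_path, "rb") as journal:
        while True:
            header = journal.read(4)
            if len(header) < 4:
                break
            size = int.from_bytes(header, "big")
            payload = journal.read(size)
            if len(payload) < size:
                break  # incomplete entry at the tail; stop replaying
            entries.append(payload)
    return entries
```

The ordering is the point of the design: if the process dies between the two steps, the data is still in the journal and can be replayed into the queue.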
Once the bag source message information is contained in an NServiceBus message, we don’t have to worry about losing it. The recovery capability in NServiceBus makes sure that a failing message is first retried, in case the error is transient, and is otherwise safely written to an error queue where developers can inspect the problem, retry the messages, or even fix malformed messages before retrying them.
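The retry-then-error-queue behavior can be illustrated with a simplified Python stand-in. This is not the NServiceBus API; `handle_with_recovery`, its parameters, and the plain list used as an error queue are assumptions made for the sketch, and delays between retries are left out.

```python
from typing import Any, Callable, List, Tuple


def handle_with_recovery(
    message: Any,
    handler: Callable[[Any], None],
    error_queue: List[Tuple[Any, str]],
    max_attempts: int = 3,
) -> bool:
    """Try the handler up to max_attempts times, then park the message."""
    last_error = None
    for _ in range(max_attempts):
        try:
            handler(message)
            return True  # handled; nothing else to do
        except Exception as exc:  # a transient failure may pass on retry
            last_error = exc
    # Every attempt failed: keep the message and the reason so a developer
    # can inspect it, fix it, and retry it later instead of losing it.
    error_queue.append((message, str(last_error)))
    return False
```

A handler that fails twice and then succeeds never reaches the error queue, while a message that always fails ends up parked there with its last error.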
Without processing the bag source messages through a queue, it would be impossible to complete all the steps required for each bag message (parsing it, saving it to a database, and then matching bag information with flight information) at the speed required by the data stream coming from ARINC. Additionally, any faulty data or flaw in application logic could result in a loss of baggage information from ARINC that can’t be easily recovered.
🔗Flight tracking and matching
Incoming luggage has to be matched up with flights as well. Flight information comes from a separate flight information system, which is the same system that drives the large Arrivals and Departures screens inside the airport terminal.
Then, baggage and flight information must be joined together into a bag list containing a bag number (found in the barcode attached to the bag), a flight number, and a departure date (typically with no year) for each bag.
The problem is that uniquely identifying a specific flight is surprisingly complex.
Assuming a made-up airline code XX, the canonical form of a flight number is XX0460, but some systems might represent that as a shorter XX460. And that’s just the start of it.
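Normalizing the short and long forms is the easy part, and a small Python sketch shows the idea. The function name `canonical_flight_number` is hypothetical, and the pattern assumes a two-letter airline code followed by up to four digits, as in the XX0460 example. Real airline codes can also contain digits, which this sketch does not handle.

```python
import re

# Assumption: two letters, optional whitespace, then one to four digits.
_FLIGHT_PATTERN = re.compile(r"^([A-Z]{2})\s*(\d{1,4})$")


def canonical_flight_number(raw: str) -> str:
    """Return the zero-padded form, e.g. 'XX460' becomes 'XX0460'."""
    match = _FLIGHT_PATTERN.match(raw.strip().upper())
    if match is None:
        raise ValueError(f"unrecognized flight number: {raw!r}")
    airline, number = match.groups()
    # int() drops existing leading zeros, then the number is padded to 4.
    return f"{airline}{int(number):04d}"
```

With this in place, `canonical_flight_number("XX460")` and `canonical_flight_number("xx 0460")` both return `"XX0460"`, so records from different systems can be compared on the same key.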
Depending on the flight, the arrival date could differ significantly from the departure date, especially for long-haul flights crossing the Pacific Ocean and the International Date Line. Other factors come into play too, such as a flight being delayed, or canceled and rescheduled. Even a canceled and rescheduled flight would carry the original departure date, which is not necessarily the date the bag got loaded onto the plane.
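One small piece of this puzzle, a departure date that arrives without a year, can be sketched in Python. The `DDMMM` text format (such as `31DEC`) and the function `resolve_departure_date` are assumptions for illustration. The rule used here, picking the year that puts the date closest to a reference date, is one plausible heuristic and not necessarily what the airport system does.

```python
from datetime import date

_MONTHS = {
    "JAN": 1, "FEB": 2, "MAR": 3, "APR": 4, "MAY": 5, "JUN": 6,
    "JUL": 7, "AUG": 8, "SEP": 9, "OCT": 10, "NOV": 11, "DEC": 12,
}


def resolve_departure_date(day_month: str, reference: date) -> date:
    """Attach a year to a 'DDMMM' value by choosing the date nearest reference."""
    day = int(day_month[:2])
    month = _MONTHS[day_month[2:5].upper()]
    candidates = []
    for year in (reference.year - 1, reference.year, reference.year + 1):
        try:
            candidates.append(date(year, month, day))
        except ValueError:
            continue  # e.g. 29FEB does not exist in a non-leap year
    if not candidates:
        raise ValueError(f"no valid date for {day_month!r}")
    return min(candidates, key=lambda d: abs((d - reference).days))
```

The interesting cases sit around New Year: a bag tagged `31DEC` that is processed on January 2 belongs to the previous year, not to a flight almost twelve months in the future.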
Flights can also be cross-listed on other airlines, such as when a Delta flight is “operated by” KLM, one of its airline partners.
A myriad of logic like this goes into matching bag numbers with the flights. Using NServiceBus allowed the lead developer to divide this logic up using different message handlers and publish/subscribe techniques so that they could design the overall flow of messages through the system, and pass off the implementation of individual message handlers (representing much more contained and well-defined problems) to other developers on their team.
When a flight arrives, some bags will be routed to connecting flights, while others will make their way to the arrivals hall to be picked up at a baggage carousel.
For the bags headed to connecting flights, NServiceBus message handlers translate queue messages back to the byte-level protocol to be transmitted back to the ARINC service. But for the bags headed to the arrivals hall, the lead developer wanted to automate the system that displays which flights are being served by which carousels and the status for each carousel.
As the bags are unloaded, a baggage crew member armed with a barcode scanner scans the barcode on each bag before it is placed on the conveyor belt. The scanner connects to an API that generates an NServiceBus message, and then one by one, each bag on the bag list is accounted for.
Once every bag has been checked off the list, the bag arrival system is automatically updated to display that all bags have been unloaded. It’s at this moment that, unfortunately, some weary traveler might realize their bag isn’t going to appear on the carousel after all, and they will need to go report a lost bag.
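The check-off logic described above can be sketched as a small Python class. `BagList` and its members are hypothetical names, and in the real system each scan would arrive as an NServiceBus message handled by a message handler, not as a direct method call.

```python
from typing import Iterable, Set


class BagList:
    """Tracks which expected bags for one flight have been scanned."""

    def __init__(self, flight: str, expected_tags: Iterable[str]) -> None:
        self.flight = flight
        self._outstanding: Set[str] = set(expected_tags)
        self._unloaded: Set[str] = set()
        self.unexpected: Set[str] = set()  # scanned but not on the list

    def record_scan(self, bag_tag: str) -> bool:
        """Check a bag off the list; return True once every bag is unloaded."""
        if bag_tag in self._outstanding:
            self._outstanding.remove(bag_tag)
            self._unloaded.add(bag_tag)
        elif bag_tag not in self._unloaded:
            # Not a repeat scan, so this bag was never expected on the flight.
            self.unexpected.add(bag_tag)
        return self.all_unloaded

    @property
    def all_unloaded(self) -> bool:
        return not self._outstanding

    @property
    def missing(self) -> Set[str]:
        return set(self._outstanding)
```

Scanning the same bag twice changes nothing, which matters because messages and barcode scans can both be repeated. The carousel display would flip to "all bags unloaded" on the scan that returns `True`.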
Many of these systems can be overridden manually, for instance, to say that all bags have been unloaded. Still, for the most part, the automation that comes from tracking each bag lets the whole system run on its own, freeing operators to focus on other tasks.
An airport is a prime example of a very public place where a whole lot of software and a litany of business rules come together in ways that most people never think about, let alone fully comprehend. By comparison, other business domains might appear simple at first, but every domain has this tendency to hide its own complexity until you really start to dive into the scenarios.
In all these business domains, NServiceBus can provide the ability to break down this complexity. Each problem gets broken down into processes, each process into a series of steps, and each step is represented as a message handler processing a message.
The safety of business data encoded into messages means you can safely read from a firehose of external data, knowing you can’t lose data. The discrete nature of messages means it’s easier to reason about what’s happening within one well-defined interaction or related to one specific business rule. And the ability to build orchestrations around the results of multiple messages makes it easier to design and build processes around individual events over longer periods of time.
Air travel is only one example. So how could NServiceBus make your business domain better? Give us a shout, and let’s talk about it.
Unfortunately, it's impossible to name names without getting a bunch of lawyers involved. The names have been changed (well, just omitted) to protect the innocent.
Well, hopefully. The bag source messages really represent the bags checked in from the outbound port, not the aircraft. So there's no way to know whether a bag made it onto the aircraft! 😬