Posts related to architecture

  • Let's talk about Kafka

    Statue of Franz Kafka in Prague, Czech Republic
    Head of Franz Kafka, a kinetic sculpture in Prague, Czech Republic

    We get a lot of questions about Kafka. Is it good? Does it live up to the hype? And most frequently, when are we going to support Kafka in NServiceBus?

    But to fully answer these questions, it’s essential to understand what Kafka is, and more importantly what it isn’t, and then think about the kinds of problems that Kafka solves. So, let’s dive into the (heavily footnoted) details…

    Read more

    1. An overloaded term if ever there was one, but that's a subject for another post. Suffice it to say, this does not mean the same thing as an NServiceBus event.

    2. This is a bit of an oversimplification. Ordering in Kafka is only guaranteed per partition, and the number of partitions heavily influences how your topic can scale with the load.

    3. While this may sound like a small detail, it's one of the most important aspects that make technologies like Kafka so popular for data distribution. Guaranteed ordering is one of the fundamental requirements to ensure correct data replication.

    4. That is, assuming that the retention period or maximum size has not been reached, at which point Kafka will start overwriting data. On the other hand, queues will not apply retention settings by default, and if your queue gets full, it will reject further writes until you consume messages from it to free up space. That may sound bad, but it ensures you don't overwrite critical business data.

    5. A NACK or negative acknowledgment

    6. See how NServiceBus retries message processing to determine if a message is a poison message before forwarding it to an error queue in “I caught an exception. Now what?”

    7. See “Error Handling Patterns for Apache Kafka Applications” and “Kafka Connect Deep Dive – Error Handling and Dead Letter Queues” for more information.

    8. …and the folks at Kafka will back us up on this one.

  • RPC vs. Messaging – which is faster?

    Sometimes developers only care about speed. Ignoring all the other advantages messaging has, they’ll ask us the following question:

    “But isn’t RPC faster than messaging?”

    In place of RPC,¹ they may substitute a different term or technology like REST, microservices, gRPC, WCF, Java RMI, etc. However, no matter the specific word used, the meaning is the same: remote method calls over HTTP. So we’ll just use “RPC” for short.

    Some will claim that any type of RPC communication ends up being faster (meaning it has lower latency) than any equivalent invocation using asynchronous messaging. But the answer isn’t that simple. It’s less of an apples-to-oranges comparison and more like apples-to-orange-sherbet.

    Let’s take a look at the bigger picture.

    Read more

    1. short for remote procedure call

    2. Did you notice we build a product that helps you build message-driven systems? So yeah, maybe we're selling you something too, but at least we're honest about it. 😉

    3. Note: This is not and should not be considered the only definition of microservices.

    4. In NServiceBus, we default to 2 * LogicalProcessorCount, but this is fully configurable.

  • Fallacy #8: The network is homogeneous

    Interoperability is painful.

    Around 2005 or 2006, it wasn’t so bad. Most of the code running on the planet, at least the code that mattered, was written in .NET or Java, and interoperability via web services was at least serviceable. Since then, things have gotten gradually worse.

    First came Ruby and Ruby on Rails. In the early days, it did not support web services like other platforms did. Next came the dawn of the NoSQL movement, driven at least partially by large companies with no incentive to interoperate. Google built BigTable, Amazon built Dynamo, Facebook built Cassandra, LinkedIn came up with Voldemort. None of these things can talk to each other.

    Then came REST, or in other words, “something I invented myself over HTTP.” Each RESTful endpoint is that developer’s definition of what REST means, which is different from what every other developer thinks REST means.

    Competitive pressures, together with companies’ desire to create vendor lock-in, suggest that we can expect more divergence to occur in the future.

    Read more
  • Fallacy #7: Transport cost is zero

    Of course, there are upfront and ongoing costs associated with any computer network. The servers themselves, cabling, network switches, racks, load balancers, firewalls, power equipment, air handling, security, rent/mortgage, not to mention experienced staff to keep it all running smoothly, all come with a cost.

    Companies today have, for the most part, accepted this as just another cost of doing business in the modern world.

    With cloud-based server resources, this equation changes only slightly. Instead of paying for a lot of these things upfront, we instead lease them from the cloud providers. It may change how a company can represent these costs on a balance sheet, but overall, it’s the same concept.

    But, we also have to pay for bandwidth. In order to connect our data center to the rest of the world, we must exchange currency for the transport of our bits and bytes. In the cloud, we must pay this also, whether directly or as part of the cost of whichever service we’re using.

    Read more
  • Autosave for your business

    If you’re a long-time video game player like me, your muscle memory vividly remembers the F5 key’s location on your keyboard. For everyone else: F5 is a common key binding for “quicksave” in computer games. And like many others, I learned how to use it the hard way. After spending hours sneaking through dungeons, battling orcs, and looting valuable treasures, some nasty troll made an unexpectedly quick end to my character. That’s when I would realize that I hadn’t saved my game for a very long time and had to start over. From that moment on, I’d save my game as often as I could, and F5 became my closest ally.

    Modern games now provide a built-in feature called autosave. These games save your progress automatically now and then so that you won’t lose all your progress—only a few minutes at worst. This might sound trivial and obvious, but it is a game-changer for player experience. The player can now focus on solving their mission rather than the mechanics of the game itself.

    Why are we talking about video games when we have important business (let’s say, selling video games) to do? Let me ask a different question: what if that nasty end boss doesn’t come in the form of a troll? What if the end boss is a network error, power outage, concurrency conflict, or even squirrels?

    Read more
  • Fallacy #6: There is one administrator

    In small networks, it is sometimes possible to have one administrator. This is usually the developer who creates and deploys a small project. As a result, this developer has all of the information about this project readily available in their head and, if anything goes wrong, will know precisely what to do.

    I know quite a few developers and managers who talk about “bus theory” as a way to promote communication of critical knowledge. The central point is this: having only one person holding critical knowledge is dangerous because of what would happen if that person got run over by a bus. The term bus factor was coined to represent the number of people on your team who have to be hit by a bus before the project is in serious trouble.

    Read more
  • Fallacy #5: Topology doesn't change

    It’s easy for something that started out simply to become much more complicated as time wears on. I once had a client who started out with a very simple server infrastructure. The hosting provider had given them ownership of an internal IP subnet, and so they started out with two load-balanced public web servers: X.X.X.100 and X.X.X.101 (Public100 and Public101 for short). They also had a third public web server, Public102, to host an FTP server and a couple of random utility applications.

    And then, despite the best laid plans, the slow creep of chaos eventually took over.

    Read more
  • Infrastructure soup

    Five-Layer Brisket Chili

    When it starts to get colder outside, I start to think about soup. I love soup, especially chili. But I don’t want any of that watery gunk that’s just tomato soup with a few lonely beans floating in there somewhere. No sir! I want it thick and chunky. Load it up with ground meat, beans, onions, tomatoes, cheese, green peppers, jalapeños, pineapple–it’s all good!

    Just like with chili, we sometimes see code that feels kind of “thick and chunky.” It’s got validation, logging, exception handling, database communication, business logic, and so much more. But unlike chili, the result does not taste good.

    We see this kind of bloated, muddled code all over the place, regardless of what language or framework is being used, and NServiceBus is no exception. Here’s an example where someone has stuffed an NServiceBus message handler full to the breaking point:

    Read more
  • Multi-tenancy support in SQL Persistence

    Multi-tenant systems are a popular way to use the same codebase to provide services to different customers while minimizing the effect they have on each other. In a distributed message-based system you need to partition customer information and segregate messages from different customers as well. Additionally, you have to make sure different system components are tenant-aware.

    In NServiceBus SQL Persistence 4.6, we have added new features that make it a bit easier to create multi-tenant systems. Let’s see how it all works.

    Read more
  • Fallacy #4: The network is secure

    There are a myriad of security-obsessed organizations scattered throughout the world that take security concerns to the verge of paranoia.

    In one such organization I’ve heard of, there existed two separate networks. Everyone had two computers without external disk drives of any kind. Inserting a USB drive would not work, and trying to use one would instantly alert the sysadmins that a workstation was compromised. To get data from a different network, you needed to browse in a separate room, as workstations did not have access to the Internet.

    Once you found the data you needed, you could download it to a floppy disk and then hand the floppy over to a sysop. The sysop would copy the contents to a mirror folder, which would analyze the contents with every virus scanner imaginable before mirroring them to the development network. But that sync only occurred once per hour.

    Paranoid? Maybe. If you’re just selling widgets on a website, then probably. But if your organization is working on defense contracts or controls critical infrastructure like electrical grids, perhaps the paranoia is justified.

    The only truly secure computer is one that is disconnected from any and all networks, turned off, buried in the ground, and encased in concrete. But that computer isn’t terribly useful.

    Read more
  • What does idempotent mean in software systems?

    You’ll often come across the term idempotent in software, especially when you design distributed, message-based systems for the cloud. It seems like a concept that is easy to grasp at first, but it’s important to know the intricacies of idempotence if you want your systems to be scalable and reliable.
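
    As a minimal sketch of the idea, assuming every message carries a unique ID: an operation is idempotent if applying it twice leaves the system in the same state as applying it once. The `Account` class and its `handle_deposit` method are hypothetical names used for illustration only; they are not part of NServiceBus.

```python
from dataclasses import dataclass, field


@dataclass
class Account:
    """Hypothetical in-memory account used only to illustrate idempotence."""
    balance: int = 0
    processed_ids: set = field(default_factory=set)

    def handle_deposit(self, message_id: str, amount: int) -> None:
        # Ignore messages that were already applied, so a redelivered
        # message does not change the balance a second time.
        if message_id in self.processed_ids:
            return
        self.balance += amount
        self.processed_ids.add(message_id)


account = Account()
account.handle_deposit("msg-1", 50)
account.handle_deposit("msg-1", 50)  # duplicate delivery of the same message
print(account.balance)
```

    Tracking processed IDs is only one possible approach, and it has intricacies of its own: in a real system the ID record and the state change would have to be stored atomically.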

    Learn more about the challenges of implementing idempotent solutions: See Udi Dahan's presentation on Advanced API and Integration Problems and Patterns.
    Read more
  • Fallacy #3: Bandwidth is infinite

    Everyone who is old enough to remember the sound of connecting to the Internet with a dial-up modem or of AOL announcing that “You’ve got mail” is acutely aware that there is an upper limit to how fast something can be downloaded, and it never seems to be as fast as we would like it.

    The availability of bandwidth increases at a staggering rate, but we’re never happy. We now live in an age when it’s possible to stream high definition TV, and yet we are not satisfied. We become annoyed when we run a speed test on our broadband provider only to find that, on a good day, we are getting maybe half of the rated download speed we are paying for, and the upload speed is likely much worse. We amaze ourselves by our ability to have a real-time video conversation with someone on the other side of the world, but then react with extreme frustration when the connection quality starts to dip and we must ask “are you there?” to a face that has frozen.

    Today, we have DSL and cable modems; tomorrow, fiber may be widespread. But although bandwidth keeps growing, the amount of data and our need for it grows faster. We’ll never be satisfied.

    Read more
  • You don't need ordered delivery

    In our family it’s a tradition that you get to decide what we’ll have for dinner when it’s your birthday. On my daughter’s last birthday, she picked pizza. I took her to the nearby pizza shop to decide what pizza to get.

    A large screen dominates one wall of the pizza place, showing each order as it progresses through each stage of preparation. As I was looking at the screen, I noticed some names suddenly switched. Some pizzas with fewer toppings could be placed in the oven faster, and some would take longer to bake than others. Every step on the way to the box could take more or less time depending on the pizza. My daughter’s pizza required additional preparation time, so other customers were able to leave before we were. In short, pizzas were not being delivered in the same sequence as they were ordered.

    Read more
  • Break that big ball of mud!

    This post is part of the NServiceBus Learning Path.

    Have you ever had to deal with a function that had hundreds and hundreds of lines? Code that had duplication all over the place? Chances are you were dealing with legacy code that was written years ago. If you're a Star Wars fan like I am, it's like dealing with the Force. As Yoda would say, “Fear is the path to the dark side. Fear leads to anger. Anger leads to hate. Hate leads to suffering.” In my 15+ years of coding, every single time I've dealt with legacy code, fear, anger, hate, and suffering were pretty common.

    Read more
  • Putting your events on a diet

    This post is part of the NServiceBus Learning Path.

    Time for a diet

    Anybody can write code that will work for a few weeks or months, but what happens when that code is no longer your daily focus and the cobwebs of time start to sneak in? What if it's someone else's code? How do you add new features when you need to relearn the entire codebase each time? How can you be sure that making a small change in one corner won't break something elsewhere? Complexity and coupling in your code can suck you into a slow death spiral toward the eventual Major Rewrite. You can attempt to avoid this bitter fate by using architectural patterns like event-driven architecture. When you build a system of discrete services that communicate via events, you limit the complexity of each service by reducing coupling. Each service can be maintained without having to touch all the other services for every change in business requirements.

    Read more
  • Fallacy #2: Latency is zero

    The speed of light is actually quite slow. Light emitted from the sun this very instant will not reach us here on Earth for 8.3 minutes. It takes a full 5.5 hours for sunlight to reach Pluto and 4.24 years to reach our closest neighboring star, Proxima Centauri. And we cannot communicate at the speed of light; we must bounce data around between Ethernet switches, slowing things down considerably.
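
    As a rough check on those figures, here is a small Python sketch of the arithmetic. The distances are assumptions rather than values from the post: one astronomical unit for the Earth-sun distance, and roughly 39.5 AU for Pluto's average distance from the sun.

```python
# Back-of-the-envelope light travel times (distances are assumed values).
SPEED_OF_LIGHT_KM_S = 299_792.458  # speed of light in a vacuum, km/s
AU_KM = 149_597_870.7              # one astronomical unit, in km
PLUTO_AU = 39.5                    # assumed average distance; varies over Pluto's orbit


def light_travel_seconds(distance_km: float) -> float:
    """Time in seconds for light to cover the given distance in a vacuum."""
    return distance_km / SPEED_OF_LIGHT_KM_S


sun_to_earth_min = light_travel_seconds(AU_KM) / 60
sun_to_pluto_h = light_travel_seconds(PLUTO_AU * AU_KM) / 3600

print(f"Sun to Earth: {sun_to_earth_min:.1f} minutes")
print(f"Sun to Pluto: {sun_to_pluto_h:.1f} hours")
```

    These are best-case numbers for a signal in a vacuum; real network traffic is slower still.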

    Meanwhile, human expectations of speed are pretty demanding. A 1993 Nielsen usability study found that, for web browsing, a 100-millisecond delay was perceived as reacting instantly, and a one-second delay corresponded to uninterrupted flow. Anything more than that is considered a distraction.

    Read more
  • What does your particular system look like?

    Have you ever been pulled into a software project and had to figure out how everything works? Often, your options are limited to either sifting through piles of documentation or diving into thousands of lines of code. Unfortunately, the documentation probably became obsolete as the software grew and evolved and, while the code is accurate by definition, it requires a lot of concentration to trace through and figure out how everything fits together.

    Ideally, what we really want is a method that reflects the accuracy of the code but is more accessible and easier to understand. The best strategy for large software systems to achieve this is to create a kind of living documentation. The idea with living documentation is to generate it from the existing codebase. This way, the documentation is accurate and can be easily kept up-to-date – just regenerate it as needed (for example, as part of an automated build).

    Read more
  • Fallacy #1: The network is reliable

    Anyone with a cable or DSL modem knows how temperamental network connections can be. The Internet just stops working, and the only way to get it going again is to unplug it for 15 seconds. (Or, put another way, “Have you tried turning it off and on again?”)

    Thankfully, better solutions exist for professional data centers than consumer-grade modems, but problems can persist.

    As a company that does reliable messaging, we’ve really heard it all. Don’t worry, the names will be changed to protect the innocent.

    Read more
  • What Starbucks can teach us about software scalability

    Starbucks Drive Thru

    In 2004, Gregor Hohpe published his brilliant post “Starbucks Does Not Use Two-Phase Commit.” When I read it, my time working at Starbucks during my college years suddenly became relevant. Over the years, I gradually realized there’s even more that programmers can learn from the popular coffee chain.

    Although many people may want to build scalable software, it can be much harder than it first appears. As we work on individual tasks, we can fall into a trap, believing all things are equally important, need the same resources, and happen synchronously in a predefined order.

    It turns out they don’t—at least not in scalable systems, and certainly not at Starbucks.

    Read more
  • I caught an exception. Now what?

    This post is part of the NServiceBus Learning Path.

    I can’t draw to save my life, but I love comics, especially ones that capture the essence of what it’s like to be a software developer. They capture the shared pain we all go through and temper it with humor. Luckily, I no longer work for large corporations, so it’s easier now to read Dilbert and laugh without also wincing.

    Read more
  • Death to the batch job

    This post is part of the NServiceBus Learning Path.

    The Batch Job Ogre
    There’s something dangerous lurking in your software. Not just the general lurking, murky, ickiness you might expect. Oh no, it’s much worse than that. It’s something specific. Something big; something ugly. There might even be more than one. It can’t decide if it’s angry or hungry or both. All it knows is it’s having a very bad day. And it’s going to eat you. Maybe it won’t literally eat you, but it will come after your family time, your sleep,… Read more
  • Empires fall: Decentralize your code to avoid total collapse

    Alexander the Great
    Alexander the Great by Wikimedia Commons — Public Domain
    Ruling the world is hard. Alexander the Great may arguably have come closer than anyone to being “Emperor of the World”. In 334 BC, his armies left his home in Macedon (modern-day Greece) and conquered a swath of territory stretching to Egypt and halfway across Asia to northwest India, most of the known world at that point. But he ultimately failed to rule all of humanity, dying at the age of 32 under somewhat suspicious circumstances. Of… Read more
  • UI composition - the blind spot of distributed systems

    When architecting a distributed system, it’s easy to get caught up in the excitement of all the cloud backend services available to us. Azure Service Bus! AWS Lambda! CosmosDB! We spend so much time decoupling our back-end into microservices, each with its own responsibility and stack, that we often forget: the front-end has to talk to all of these services. But it can be decomposed into components just as easily as the back-end services can. Here’s a story of how we did it.

    Read more