Webinar recording
Fireside chat: Orchestration and choreography with Laila Bougria & Udi Dahan
Watch the Virtual DDD session and learn more about managing complexity with orchestration and choreography.
Why watch?
When building event-driven architectures, one of the challenges we face is coordinating work across many services. How do we implement complex data flows or complex business transactions that consist of multiple asynchronously executed steps?
Luckily, there are patterns that can help us manage this complexity: orchestration and choreography.
In this fireside chat, Udi Dahan and Laila Bougria discuss how each pattern works, the pros and cons of each, and the trade-offs involved when choosing one over the other in specific contexts.
See Udi and Laila respond to questions such as:
- Are there specific technologies that are well suited for orchestration and choreography and why?
- What key components should impact the preference for one pattern over the other and why?
- What are the fundamental differences between event-driven orchestration and choreography in managing complex data flows or business transactions?
- Are there real-world scenarios where event-driven orchestration proves advantageous over choreography, and vice versa?
- How does event-driven orchestration or choreography influence the maintainability and extensibility of a distributed system over its lifecycle?
Transcription
- 00:00:00 Kenny Baas
- Welcome everyone to this new virtual DDD meetup. My name is Kenny Baas. With me are Marco and Krisztina. We're the organizers, and with us today we have two special panelists: Udi Dahan, who is the founder and CEO of Particular Software, creator of NServiceBus, and one of the world's foremost experts on service-oriented architecture and domain-driven design, and Laila Bougria, who is a software engineer at Particular Software, a Microsoft Azure MVP and a frequent speaker at conferences across the world. Thank you for joining us. For the people on YouTube or in Zoom, we shared a Miro board which you can use to ask your questions. You can also chat away in all the chats available. We try to scratch all these questions. We already have a few questions from people beforehand, which we're going to discuss, but shall we first dive into, well, some quick overview of orchestration and choreography? Let me share my screen and then let's go. Who wants to pick it up? Laila or Udi? Just a quick overview of what we're going to talk about today before we go to the questions.
- 00:01:27 Laila Bougria
- Happy to take it away. The first picture that we have here is a visual representation of orchestration. Now, I actually created these slides based on a bunch of blog posts and resources that I've read out there and I differentiated on purpose between orchestration that's based on synchronous communication and orchestration that's based on asynchronous communication, because, depending on what you're reading, you sometimes come across opinions that are based on a belief that whenever you use the orchestration pattern you are therefore using synchronous communication, based on some form of API communication, gRPC or whatnot, which then exposes you to certain downsides as well. Now, that doesn't necessarily have to be the case. You can also use the orchestration pattern and base that off of asynchronous communication, still using commands and events as sort of shown in this example.
- 00:02:31 Laila Bougria
- As shown in the legend, the blue arrows represent a command, a sort of imperative sort of message that is telling another service what to do, whereas the events are represented by the yellow arrows and are just publishing that an event has occurred within a certain service. Now, in the orchestration pattern, there's basically a central component that is taking responsibility for the entire workflow and making sure that the flow is basically handled accordingly, that the services are called based on the right prerequisites that are required for those steps, and so forth. Maybe, Udi, you can take over the choreography alternative pattern.
- 00:03:20 Udi Dahan
- Okay. Well, I think that the main thing that I'd like to call out is that what you see in the orchestration slide that you don't see in the choreography slide is that there is that additional component, the orchestrator. If you could scroll back up. Exactly. The idea in orchestration, if you can think of it kind of like there is an orchestra and there is a conductor in an orchestra and the conductor is live. They are involved in what is actually happening in real time telling the various other actors, players, and musicians what to do. Whereas in a choreography style, there isn't an equivalent conductor that is live on the stage telling people what to do. The source of these terms is indeed orchestration from the concept of an orchestra and having a conductor there that is live. Choreography, taken from the concept of the way that dance choreography works, is that you'll have a choreographer who works with the various dancers, actors as it were to say, to essentially set up the choreography, meaning that when this player does this, then that one does that, and essentially we decide on how the flow is going to work.
- 00:05:05 Udi Dahan
- But at runtime, you don't see the choreographer up on the stage involved in what's happening, so that's sort of the primary distinction. In orchestration, you have an orchestrator, you have the conductor that is a core part of the runtime flow that is instructing the various actors as to what to do, when to do it, et cetera, and in choreography, you don't have a choreographer, orchestrator or a conductor that is live at runtime instructing the players what to do. Instead, the players themselves essentially play off of and react to each other, so the concept of events, when this other actor does X, then I'm going to go and do Y, is a much more significant part of the choreography style. In orchestration, the sort of standard orchestration styles, you might not even have events at all. You'll often have commands and responses. Again, those can be synchronous HTTP style request response immediate type things or asynchronous full duplex request responses. But as sort of a general theme, we tend to see a whole lot fewer events in an orchestration style and we tend to see quite a bit more events in a choreography style.
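The distinction described above can be sketched in a few lines of Python. This is a minimal illustration and not code from the session: the in-memory `Bus` class, the event names, and the command handlers are all hypothetical, and a real system would use a message broker rather than direct function calls.

```python
from collections import defaultdict


class Bus:
    """Tiny in-memory publish/subscribe bus (hypothetical, illustration only)."""

    def __init__(self):
        self._subscribers = defaultdict(list)
        self.log = []  # names of events in the order they were published

    def subscribe(self, event_name, handler):
        self._subscribers[event_name].append(handler)

    def publish(self, event_name, payload):
        self.log.append(event_name)
        for handler in list(self._subscribers[event_name]):
            handler(payload)


def run_choreography(order_id):
    """No central component: each service reacts to another service's event."""
    bus = Bus()
    # Billing reacts to Sales; Shipping reacts to Billing.
    bus.subscribe("Sales.OrderPlaced",
                  lambda p: bus.publish("Billing.OrderBilled", p))
    bus.subscribe("Billing.OrderBilled",
                  lambda p: bus.publish("Shipping.OrderShipped", p))
    bus.publish("Sales.OrderPlaced", {"order_id": order_id})
    return bus.log


def run_orchestration(order_id):
    """A central orchestrator sends commands and acts on each response."""
    steps = []

    def bill_order(oid):  # stands in for a command handled by Billing
        steps.append("BillOrder")
        return {"order_id": oid, "billed": True}

    def ship_order(oid):  # stands in for a command handled by Shipping
        steps.append("ShipOrder")
        return {"order_id": oid, "shipped": True}

    # The orchestrator owns the flow: it decides what runs next and when.
    response = bill_order(order_id)
    if response["billed"]:
        ship_order(order_id)
    return steps
```

In the choreographed version the flow only exists as the sum of the subscriptions, while in the orchestrated version it is readable in one place, which matches the observation that orchestration tends to use commands and responses and choreography tends to use events.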
- 00:06:42 Kenny Baas
- Oh, thanks. I hope that sets the stage for this conversation. We have already some questions and let's start off from the start from Nick who's currently moving their monolith to services and the biggest problem they have is designing which data elements to send as part of the event message and which parts to expose through API calls where they have cross-domain information. What are some methodologies people use to determine this? Who wants to give it a go to answer this one?
- 00:07:20 Udi Dahan
- I can start off. I'd emphasize that the context that Nick raised, starting with a monolith and migrating that to a different architectural style is probably the most significant context here, more so than the orchestration versus choreography question. A lot of how one goes from a monolith to something better is very much influenced by, well, what is the specific state of your monolith? Which bits are the tightest coupled to each other? Which bits are less coupled to each other? I think the first part is recognizing that along the way as you're transitioning from monolith to something better, you're probably going to be transitioning through a bunch of suboptimal transitionary states.
- 00:08:31 Udi Dahan
- In my course, I talk about this context a fair bit, but it's kind of like trying to refactor a bus into an airplane, and for a while it kind of looks weird that you've got this bus with wings that is driving down the road and those wings are bumping up against telephone poles and whatever. It's not very elegant. It's not the kind of thing that you'd say, "This is exactly what my architecture wants to look like." But you realize that in order to refactor a running bus into eventually a flying plane, you're going to have a transitionary state that nobody would look at that and say, "Yes, this is exactly a good state for a given system." It's acknowledging the fact that if you are making use of events you might be having what are sometimes called fat events, meaning events with lots and lots of data in them, as a useful transitionary mechanism where it could be that two or three transitions later that you've been able to pull apart the data and clean it up and then you move more to a thin event type style.
- 00:09:53 Udi Dahan
- That's not to say that thin events are good and fat events are bad because when you're in the context of transitioning something, you use the tools that you need as you need them and it's all about trying to keep the system functional at each sort of transitionary step. We don't want to do the big bang, nothing works for a year and a half, two years type of situation. I think that context should be sort of the driving factor more so than fat events versus thin events, orchestration versus choreography. We can talk about the strengths and weaknesses of those patterns in sort of a theoretical way, but the context of migrating a monolith, it's really... Another analogy that I use is it's like climbing a mountain. You don't kind of say, "Oh, I'm at the base of the mountain. I want to get to the top of the mountain. I'm just going to go in a straight line up the mountain." If you try to do that, you'll probably fall and hurt yourself. It's not an effective way of climbing mountains. Instead, you kind of need to work your way gradually around the mountain going up little bits at a time. But a lot of the energy that you're doing is transitionary, but that's really the only way to safely get up a mountain, and that's the same thing with scaling or transitioning a monolith.
- 00:11:29 Kenny Baas
- Yeah. I think if I also look at the question itself, underneath, it's this thing, right? Once they try to dissect it, there's this API calls which leans more maybe towards orchestration, right? Sending synchronous commands or events, choreography, but it's a whole... Yeah, so what methodologies would you use to decide that? What's your get-go? Maybe Laila, you want to answer that one, but I think that underneath these are two questions, right?
- 00:12:05 Laila Bougria
- I think one of the things I caught when you first presented the question was the question of how do I make that decision between what should basically be almost asynchronous type of communication, and what should be more synchronous type of communication, especially since you're coming from that monolithic context. Yeah. This is something that I've discussed before in one of the sessions I've presented and I think one of the sort of easiest ways to look at that is to sort of look at your business processes and to start to think of what things happen naturally at a different point in time. Like our always go-to example of a sale, an Amazon order fulfillment type of process where you basically submit an order, but then we also need to get that shipped and all of that. From a just natural point of view of looking at the timeline of such a process, shipping that is going to happen at another time. Packaging that is going to happen later in time, so those are the parts that you could already easily sort of cut through and make those asynchronous because they are asynchronous by nature.
- 00:13:19 Laila Bougria
- I think another way to sort of look at it is to ask yourself, if you look at individual steps that make up a workflow, whether you're thinking of doing that in an orchestrated form or in a choreographed form to sort of think of, if I execute this piece of the workflow, this individual step and that fails, does that mean that I need to compensate or roll back, if you will, previous steps that have already been executed and sort of try to understand how much coupling there is inside of that to see where you can start cutting things apart, and I found that a very good way to understand where you can start to divide things up and then it starts to become easier and easier as well to make that transition.
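The question raised above, "if this step fails, do I need to compensate or roll back previous steps?", can be made concrete with a small sketch. It is an illustration under assumptions, not the speakers' implementation: `run_workflow`, the step names, and the `noop`/`fail` helpers are hypothetical.

```python
def run_workflow(steps):
    """Run (name, action, compensate) steps in order.

    If a step raises, undo the already-completed steps in reverse order
    and stop. Returns a history of what happened.
    """
    completed = []
    history = []
    for name, action, compensate in steps:
        try:
            action()
        except Exception:
            history.append(f"failed:{name}")
            for done_name, undo in reversed(completed):
                undo()
                history.append(f"compensated:{done_name}")
            break
        history.append(f"done:{name}")
        completed.append((name, compensate))
    return history


def noop():
    """Placeholder for a step or compensation that succeeds."""


def fail():
    """Placeholder for a step that fails."""
    raise RuntimeError("carrier unavailable")


order_steps = [
    ("reserve_stock", noop, noop),
    ("charge_card", noop, noop),
    ("book_shipment", fail, noop),
]
```

Counting how many earlier steps must be compensated when a later one fails gives a rough sense of how tightly those steps are coupled, which is the signal suggested above for deciding where a workflow can be cut apart.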
- 00:14:09 Marco Heimeshoff
- I have a follow-up question on that from Alex. In choreography, what's a good pattern for namespacing events on the bus as they bridge different services and components especially that you talked about, right, coming from one context in a monolith, getting into more context over time? How does this evolve and what's the strategy there?
- 00:14:29 Udi Dahan
- Do you want to continue on that Laila?
- 00:14:34 Laila Bougria
- Sure. Well, I think at that point we are almost starting to talk about what type of topologies you're using as well, which is a whole other sort of topic in itself. But assuming you want to create a sort of topic per event, I'm going to assume that for a moment, then I believe it's a good match to match the event name as well and that you could reuse the service in which that is running to namespace or prefix that event, and that's usually sort of the model that I would use if I would use that type of topology.
- 00:15:20 Udi Dahan
- If I might add a layer to that. I think the first thing that potentially some of us take for granted when talking about events is that there is a single logical owner to events and not everybody accepts or approaches events from that context. Part of the issue also is that the term event can be sometimes overloaded or not everybody uses it the same way, so in some cases people might use the concept of notifications as a kind of event, so I'll distinguish. When we're talking about events, what we're talking about is events at the level of business contexts, bounded contexts if you will, domains rather than notifications that we might be pushing to users or to other systems.
- 00:16:29 Udi Dahan
- While at a technical level both of these things as payloads may behave in sort of a one-to-many broadcast fashion, they are used for different things, so when talking about notifications and or another angle of this is the concept of data distribution where you've got let's say data in system A and you need to get that data into systems B, C and D. That also tends to correlate with events containing lots of data. But essentially the context there is it's distributing the data rather than the event being significant in the sense that something at a business domain level happened, so beyond kind of saying this record changed and everybody who has a copy of this record, I'd like them to receive that information, and this is a lot of what folks that are using Kafka as a system end up doing. They're essentially using it as a data distribution service to keep lots of different systems in sync. There's that kind of event, if you will, which I'd say is a different category of thing. That's a data distribution type of thing.
- 00:18:02 Udi Dahan
- Then you've got notifications that are often going to users, so things like emails and SMS and mobile push type of notification type things. They may correlate with business domain type of events, but a bunch of times, like you'll see this in mobile banking or something, you'll get a push notification when money arrives in your account or a transfer that you've initiated has been fulfilled or all those types of things where you've got the business event itself at sort of one level and then a user push notification that happens on top of that, so you want to distinguish user push notifications, which are optional additional technological things from the underlying business event. When people are talking about events or asking those questions, the first thing we need to essentially clarify is when you say event, what are you referring to? Are you referring to this or this or this or that? Because that will help answer the question of, all right, so how do we go about doing namespacing and or mapping these things to various other technological concerns?
- 00:19:25 Udi Dahan
- One of the challenges I see when people do data distribution or user notification types of things is that you'll often have many sources that can initiate a push of a notification like that, so over there you don't tend to immediately find or see a single clear logical owner to that. Whereas when you put those to one side and you go into business event type territory and say there's a very clear, logical bounded context that is the single source of truth, the logical owner for that thing, then we'd likely use the name of that bounded context, that business domain as a part of the namespace name, which then as I was saying would likely map to a topic in the infrastructure that we're setting up.
- 00:20:25 Udi Dahan
- In the picture that we showed in the slides at the beginning where we had sales and shipping and billing and those kinds of things, those things are, if we interpret them as logical domains, then they will likely be involved in the name of the topic and slash the namespace that we're setting up. If we're doing a data distribution type thing where there could be any number of producers of the same type of thing and saying, well, who's the logical owner? You might not get a clear answer to that. To say what name would you then incorporate in the namespace of the event that you're setting up, it isn't as immediately clear what you're going to be doing there. There's a lot of sort of sifting around to do before being able to answer a question of how do you set up namespace.
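The naming convention discussed above (one topic per event, prefixed with the bounded context that logically owns it) can be sketched as follows. The function name `topic_for` and the `Owner.EventName` format are assumptions chosen for illustration; the talk does not prescribe a specific format.

```python
def topic_for(owner: str, event_name: str) -> str:
    """Build a topic name from the owning bounded context and the event name.

    Raises ValueError when no single logical owner is given, which is the
    situation described for data-distribution style messages.
    """
    owner = owner.strip()
    event_name = event_name.strip()
    if not owner:
        raise ValueError("a business event needs a single logical owner")
    if not event_name:
        raise ValueError("event name must not be empty")
    return f"{owner}.{event_name}"
```

For example, `topic_for("Sales", "OrderPlaced")` yields `Sales.OrderPlaced`. When several systems can produce the same kind of record change, there is no obvious value to pass as `owner`, which mirrors the difficulty described above.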
- 00:21:26 Krisztina Hirth
- Can I follow up on this? I have my own question. It's not on the board, sorry. If you say data distribution doesn't have a real owner, how can this be if we have data ownership, we have bounded contexts, we have domains and a domain caring about payments can only distribute the data about payments, or you say this domain could aggregate and get other data and add other sources in this, which is not their domain.
- 00:21:59 Udi Dahan
- The tricky part is that many environments don't necessarily... What's the best way to say it? They may be applying domain-driven design and bounded context and all those things, but that tends to be limited in scope to a given system or set of subsystems or a sub-org in the organization. But saying, "Oh, we've got a payment still." It's like, well, that's great, but there's this other part of the company that's doing wire transfers that is not being handled by your system, so who owns payments? The answer is, well, kind of sort of both of us do. We handle this part of payments, they handle those parts of payments. Then it's like, oh, okay, great. What about refunds? It's like, oh, right, okay, so we've got actually 13 different systems that can process refunds in different ways.
- 00:23:02 Udi Dahan
- Once you start going beyond the scope of a single sort of clean environment where you kind of have full control and look at the surrounding things that are often covered. Oh, no, no, no. That's integration. It's like, well, right, but that integration has a business context to it, so to say who's going to publish the events about payments received or payments returned or all of that, it's like, well, if we look around broadly enough, we end up seeing a very large number of systems that engage in the very same business processes and we say, well, which one of those is the source of truth? There isn't always, and I'd say more often than not, there's a whole lot of overlap between them. That's why it's hard. Yeah.
- 00:24:01 Kenny Baas
- Yeah. It can also be different business lines, right? I saw the first question, which was also focused on information and you two both started talking about bounded context, so maybe that's a nice bridge to the next question which, well, I saw this triggered it a bit, right, on Twitter, use orchestration within the bounded context, but use choreography between bounded contexts. Why orchestration within the bounded context? Would you always use orchestration or favor orchestration within the bounded context? How do they relate? Maybe we can, it is a virtual DDD meetup as well, so how do these two patterns relate with these bounded contexts? Who wants to? Maybe Laila, you want to kick that off?
- 00:24:46 Laila Bougria
- Sure. It's going to be an "it depends" type of question, at least in my opinion. It makes sense to sort of say yes, use orchestration within a specific bounded context, but I feel that giving that generic answer is dangerous and I would rather any of us, any developer, architect, basically building a system, look at a whole workflow and rather just look at the complexity of that workflow, how many steps are involved, how many failure scenarios can occur, how much of that does trigger sort of compensating transactions or actions or even alternative flows and those types of questions to basically decide on whether orchestration or choreography is a better pattern to use. Because if we would base it solely on the fact of whether it's within or overlapping certain bounded contexts, I think there's a danger of then making it a sort of black-and-white type of decision leading to eventually making the wrong decision in some cases.
- 00:26:03 Laila Bougria
- I'd rather sort of focus on those components instead and also sort of include whether it crosses a bounded context or not, and if it does, maybe it could even sort of lead us to reevaluate did we get one of our service boundaries wrong? Because that might also be something that is hidden in there, which is why I sort of struggle with saying, yeah, sure, follow the service boundary. It's like, okay, but did you make sure that the service boundary is correctly defined as well? Which is, as we know, I think one of the most challenging exercises that we do when applying domain-driven design practices. That's usually my recommendation is to rather look at the entire workflow, how complex is it? The way I see it is the more steps you have, or to say it simply, the more complex the entire workflow is and therefore the more compensating types of things that you need to do, the more you're probably going to benefit from an orchestrated approach to handle those things. Udi, I'm super interested to hear your angle as well.
- 00:27:16 Kenny Baas
- Just to add one thing before we go there, because it also says, would you ever use orchestration cross-bounded context? My follow-up question would be in that, would you then model that also as well as a new bounded context?
- 00:27:32 Laila Bougria
- Maybe. Yes, I would definitely consider it, and if I see that there's great complexity, then I would feel more inclined to do so.
- 00:27:45 Kenny Baas
- What do you think, Udi?
- 00:27:47 Udi Dahan
- I definitely agree with and extend Laila's statement that there's a good chance that people get their boundaries wrong, definitely early on. I'd like to present one cause as to why that happens. It is essentially the way that organizations tend to charter projects and teams to work on things. Often there'll be various discussions at business-type management levels. One of them might be that the old system is causing us a lot of grief, let's have a team go in and fix it, where essentially the bounded context which is defined is that old system. But part of the reason why there are issues in that old system is it's not very well designed. It could be a monolith, but part of the statement of monolith is talking about its internals. The part that isn't discussed enough is probably it has all sorts of overlapping responsibilities, both in terms of data and functionality with a whole bunch of other systems in the organization, meaning that it is not a clean logical boundary to begin with.
- 00:29:30 Udi Dahan
- When a team gets started and they're given a certain scope of work, and that usually correlates with physical systems, those physical systems don't tend to map to clean logical boundaries. Why? Because just sort of the history of organizations that different systems got built by different sub-parts of the organization and they got integrated together and then they acquired another company and then they integrated those things and everything sort of patched together and request-response calls here and database batch jobs over there. The initial boundaries were never really thought out that well. Then subsequent to that, the projects which are kicked off tend to align with those archaic historical boundaries. Right out of the gate, things are not clean. But for a lot of people, they don't spend a minute thinking about that or dealing with that, so they're looking sort of at their scope trying to ignore everything else and saying, all right, how do we come up with the correct sub bounded context of this space?
- 00:30:57 Udi Dahan
- I don't want to say it's a fool's errand, but it's really hard to do when you're starting from an unclean type of initial place, so that element of saying, oh, under these conditions, these are my bounded contexts and therefore I'm going to be using orchestration within them and I'm using choreography between them, it's realizing things have not been set up cleanly enough for such guidelines to necessarily be helpful for that project. That's part of the reason why I think that Laila's recommendation is to say, well, hold on a second. Take a more "it depends" type of approach. Really look at the details of what's going on to decide what's appropriate.
- 00:31:46 Udi Dahan
- Now, all of that being said, yes, sort of from a clean perspective, if such a thing could exist, or let's call it the academic type of answer to say if there were no other constraints and things were set up cleanly and you've got these bounded contexts and they are indeed well decoupled from each other, and that means that they're not sharing raw business data with each other, meaning that there's really clean ownership and separation between them and that whatever larger workflows that, for example, Laila was talking about, have already been well subdivided between these various bounded contexts, then yes, under those circumstances choreography is likely to be the better choice of interaction between those loosely coupled highly autonomous bounded contexts, and that within them, the reason why orchestration makes more sense is that you have things that are already logically coupled to each other. They need to share more data.
- 00:33:03 Udi Dahan
- There is a certain element of, I can't know that I can take the next step until they get the response from the previous step. A bunch of, let's call them sub workflows, have that nature to them, whereas between one sub-workflow and another sub-workflow, there can be a looser type of relationship. But essentially that's the work of identifying the bounded context to begin with. It's looking at all of your workflows and all of your data and all of your systems and trying to make sense and get things into the right pile and decouple them cleanly from each other. But again, the reality of most organizations is that that sort of clean academic exercise is not going to map well to what you're actually dealing with. Having these guidelines, it's important to say, yes, under conditions of academic cleanliness and purity, these are the guidelines. However, in your context, be aware that it's kind of like migrating a monolith. You need to be a lot more practical, tactical in the weeds. I'm doing this now to transition to the next step and then from there I'll do some other change.
- 00:34:24 Krisztina Hirth
- Thinking about maturity here, isn't it? Maturity of the product, maturity of the people, maturity of the timeline, how long this product is not changed? I have some follow-up questions.
- 00:34:41 Udi Dahan
- We could use a maturity model to talk about organizations and their systems and the knowledge and the skill of the people involved, and all of those elements. Say that if you're earlier in your journey and you have a much messier initial state to start from, then it's more sort of the tactical, no holds barred, whatever works, works and live to fight another day type of thing. Versus a, if you've been transitioning gradually climbing up that mountain for the past, whatever it is, 10, 15 years and you've got things much more cleanly separated, you're going to have more degrees of freedom and then the guidelines, you'll be able to follow them effectively in more cases. In that sense, I'd say, I'm not sure I'd use the word maturity by itself. I'd say there's a maturity model that for different organizations, different tactics are appropriate.
- 00:35:51 Laila Bougria
- Actually, I'd argue that given that we know that even if we at some point reach a sort of stage in our systems or applications where we feel like, oh, we've reached a sort of good level of maturity and we're pretty sure that we've got our boundaries right, still that system is evolving, and especially in today's day and age requirements keep coming in. We keep making changes and that design and that architecture needs to sort of evolve with it, and that's why I'm way more an advocate of continuously looking at all of these components and basically making that decision over and over again of which are the components that would lead me to this pattern versus this pattern as even an additional way to validate whether your boundaries are still valid over time and with the changing requirements and all of that. I think it's also just a good practice, yeah, to continuously make that exercise. Also, new people join the team. They might have different opinions, see it differently or see some things changing that people who have been working on the project for a longer time are just unable to sort of see or spot.
- 00:37:07 Kenny Baas
- Yeah. I think from a domain driven design perspective, one model is no model, right? At least three. You can play around with that as well.
- 00:37:18 Krisztina Hirth
- I have a question from John regarding exactly this. It is not about the pattern and orchestration or choreography, but about the evolution of events. While developing applications, new concepts and features emerge over time. These new concepts and features may affect existing payloads within events. What is the best way? Can you move the sticky? What is the best way to mitigate these types of evolving messages? So event evolution, event data evolution, information evolution.
- 00:37:59 Udi Dahan
- Right. I think we need to include sort of another example of events that we didn't mention here that is often raised as a challenge or that serves as a challenge for the kind of context of this question, and that's when people do event sourcing, that is, event-sourced domain models, which essentially correlates with event-sourced data models, and essentially the payload of the events is there's an internal element even for a given bounded context and then saying even within that bounded context I might be using events and then I have to evolve those based on the new requirements that are coming in. You've got that sort of internal form of events as well as the, I'm doing events between bounded contexts, and again, those bounded contexts are loosely coupled. I think we can say at sort of a general level that the less data that your events contain, likely the easier it is going to be to version them, but for the most part in the sense that you're not going to have to version them, so if the business is changing something and that is not affecting the payload of an event, then that makes life easy.
- 00:39:39 Udi Dahan
- Now how do we get there is part of sort of the question. People say, all of that sounds good, but how do I get there? This is where I think we call it view model composition, UI composition, micro front end type of architectural styles come into play. By having, whether it's a workflow like the Amazon checkout process where you might build up a very large order object that has the items, the quantities, the prices, the shipping address, the billing address, the shipping options, all of that as sort of a large event, and then you're passing that between systems, when the business comes along and says, "Oh, we want to change something. Now we want to deal with people buying MP3s and all sorts of other digital items," then what is the shipping address of a digital item? That new feature, that new set of requirements changes the payload of the event because we put a lot of data into the event to begin with.
- 00:41:01 Udi Dahan
- If on the other hand we set up that workflow to say that it's actually a series of bounded contexts where each of them has a micro front end that is capturing the set of data that is relevant to it, and the only information that ends up being passed between those bounded contexts via an event is essentially an order ID or an order ID and a customer ID where all of the other data is collected and managed top to bottom vertical slice within a given bounded context. Then when the business says, "Oh, I want to change. I want to allow digital goods," we might need to make some change within a bounded context, let's say the shipping bounded context, but then we wouldn't need to change the other bounded context. We wouldn't need to change the events between them.
- 00:41:59 Udi Dahan
- The question of what techniques are there for evolving events, the first technique is to use thin events. Don't put lots of data in the events. That will mean you don't have to change them nearly as much. How do you do that? Micro front end, UI composition, view model composition, vertical slice architecture, all of that. But all of that is a non-event-based style. But layering that architectural style in solves a bunch of the problems that people would otherwise end up having to deal with at the level of events between bounded contexts and evolving them. That's sort of the big one. There are all sorts of other technical tactical techniques for saying if I did include a bunch of data in an event, what ways could I tactically evolve that to minimize the impact on subscribers or consumers of those events, et cetera. But I think the person asking the question is familiar with a lot of those tactics. What they're kind of looking for is the, I'm not happy with any of the tactics, and part of the reason is there was too much data in the events to begin with.
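To make the thin-event idea concrete, here is a minimal Python sketch. Every name in it (`OrderPlaced`, `ShippingContext`, `capture_address`) is hypothetical and not taken from the talk or from any library. It assumes the shipping bounded context collects its own data through its own front end, so the event between contexts only carries IDs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OrderPlaced:
    # Hypothetical "thin" event: only the identifiers shared between contexts.
    order_id: str
    customer_id: str


class ShippingContext:
    """Owns shipping data top to bottom; other contexts never see it."""

    def __init__(self):
        # order_id -> address captured by this context's own front end
        self._addresses = {}

    def capture_address(self, order_id, address):
        self._addresses[order_id] = address

    def handle(self, event: OrderPlaced) -> str:
        # Supporting digital goods is a local change: an order without
        # a captured address simply has nothing to ship.
        address = self._addresses.get(event.order_id)
        return f"ship to {address}" if address else "nothing to ship"


shipping = ShippingContext()
shipping.capture_address("order-1", "1 Main St")
print(shipping.handle(OrderPlaced("order-1", "customer-9")))  # physical order
print(shipping.handle(OrderPlaced("order-2", "customer-9")))  # digital order
```

In this sketch, allowing digital goods changed only `ShippingContext.handle`; the `OrderPlaced` contract stayed the same, which is the property the thin-event approach is after.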
- 00:43:22 Krisztina Hirth
- Maybe I don't have more information, but all these options have trade-offs because... John answers yes, he's happy.
- 00:43:36 Udi Dahan
- I think I'd love to hear Laila's comments on sort of the technical tactical patterns for the event evolutions.
- 00:43:44 Laila Bougria
- Right. Well, so let's say that you're not in that ideal scenario, right, and you are basically dealing with a bunch of in-flight messages while you basically need to evolve those contracts. This is always tricky and basically you are going to have to live with having to support those types sort of side-by-side for at least a while. Let's say that, I don't think it matters that much, whether it's an event or a command, what type of message it is, but let's say that there's something added to the payload that is then needed by one of those services. You would still have those two contracts that you basically both have to support, and at that point you'd have to make the decision that when one of the messages using the old contract comes in, you have to at that point either decide to say, I'm going to assign some kind of default value for the value that I'm missing here in the contract, or you change sort of that implementation of handling that specific message and go fetch that additional information while you wait for all of those in-flight messages to be handled and transition basically to that new contract, which includes all of the information that you need.
- 00:44:59 Laila Bougria
- That's basically it. You'd have to have that transition phase, make sure that all of the in-flight messages have been handled and allow the rest of your system to basically evolve to that new contract, but you'll have to handle them both. That's one of the biggest challenges that we also face all the time in our work at Particular: if we change anything, what happens to in-flight messages? It's definitely a massive challenge.
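The transition phase described here could be sketched as follows. This is only an illustration in Python: the message types, the `currency` field, `DEFAULT_CURRENCY`, and `lookup_currency` are all assumptions made up for the example. The talk presents "assign a default" and "go fetch the missing value" as two alternatives; the sketch combines them by trying the fetch first and falling back to the default.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class OrderAcceptedV1:
    order_id: str


@dataclass
class OrderAcceptedV2:
    order_id: str
    currency: str  # field added by the new contract


DEFAULT_CURRENCY = "USD"  # assumed default, purely illustrative


def lookup_currency(order_id: str) -> Optional[str]:
    # Hypothetical stand-in for fetching the missing value from its owner.
    return {"order-7": "EUR"}.get(order_id)


def handle_v2(message: OrderAcceptedV2) -> str:
    return f"{message.order_id} billed in {message.currency}"


def handle_v1(message: OrderAcceptedV1) -> str:
    # Old in-flight message: fetch the missing value, else use the default,
    # then continue through the same logic as the new contract.
    currency = lookup_currency(message.order_id) or DEFAULT_CURRENCY
    return handle_v2(OrderAcceptedV2(message.order_id, currency))


# Both contracts stay registered until the old in-flight messages drain.
HANDLERS = {OrderAcceptedV1: handle_v1, OrderAcceptedV2: handle_v2}


def dispatch(message) -> str:
    return HANDLERS[type(message)](message)


print(dispatch(OrderAcceptedV1("order-7")))
print(dispatch(OrderAcceptedV2("order-9", "GBP")))
```

Once no `OrderAcceptedV1` messages remain in flight, the old handler and its entry in `HANDLERS` can be removed.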
- 00:45:26 Kenny Baas
- A question to follow up there, Laila. Have you ever tried consumer-driven contract testing in an event-driven way? I know Pact, or I'm not sure which one of those tools, but have you ever?
- 00:45:41 Laila Bougria
- I've heard and read about it, but I haven't actually tried it. No, I haven't.
- 00:45:46 Kenny Baas
- Very curious how that would relate, right, with what you just mentioned. You still have the same problem that you just mentioned, by the way, but at least you have more grip on the contracts between it, right?
- 00:45:57 Laila Bougria
- That's a good question. I'll definitely check it out.
- 00:46:01 Kenny Baas
- To follow up, do you want to say something, Udi?
- 00:46:05 Udi Dahan
- Yeah. I mean, that idea of saying you're going to have potentially multiple consumers of a given event and each of them has slightly different needs and in terms of sort of a validation mechanism using a consumer-driven contract, so if we translate that to programming concepts, we could say that each consumer has a kind of interface. They're saying I would like the payloads that are coming in to conform to a thing that has these attributes, right, so it's kind of like an interface. I don't really care about the class that's coming in so long as it implements this interface, right? You could essentially have multiple consumers, essentially each of them with their own interface, and essentially what you want is that whatever is the thing that you're publishing, it's implementing all of those interfaces so that essentially each one of those consumers saying, "Oh, yeah. The payload that you're producing conforms to the interface that I expect."
- 00:47:19 Udi Dahan
- Now, things get interesting when they themselves start to version the things that they want to receive. Right? You had a V1 interface saying, this is what I want to consume, and then sometime later that consumer now has a V2 interface that might have yet another field to it, right? Then it's kind of a question to say in terms of the logical relationship between a producer and any one of the many consumers that it has kind of say, well, who's the one that gets to dictate that another thing should be in the payload? Is it the consumer or is it the producer? Even when talking about consumer-driven contracts and consumer-driven contract testing as a result of that, you still need to come back to the logical question of the relationship between them. Where if you're saying, well, if the consumer can dictate what the producer needs to produce, then are they really that decoupled from each other?
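The "each consumer has a kind of interface" analogy can be sketched with Python's `typing.Protocol`. This is an illustration of the idea only, not how Pact or any other contract-testing tool works, and the consumer and field names (`BillingNeeds`, `ShippingNeedsV2`, `delivery_window`) are hypothetical.

```python
from dataclasses import dataclass
from typing import Protocol, runtime_checkable


@runtime_checkable
class BillingNeeds(Protocol):
    # What the (hypothetical) billing consumer expects from the payload.
    order_id: str
    customer_id: str


@runtime_checkable
class ShippingNeedsV1(Protocol):
    order_id: str


@runtime_checkable
class ShippingNeedsV2(Protocol):
    # The consumer later versions its expectation and asks for one more field.
    order_id: str
    delivery_window: str


@dataclass
class OrderPlaced:
    # What the producer actually publishes.
    order_id: str
    customer_id: str


def unmet_contracts(payload, contracts):
    """Return the names of consumer interfaces the payload does not satisfy."""
    return [c.__name__ for c in contracts if not isinstance(payload, c)]


event = OrderPlaced("order-1", "customer-9")
print(unmet_contracts(event, [BillingNeeds, ShippingNeedsV1]))
print(unmet_contracts(event, [BillingNeeds, ShippingNeedsV2]))
```

The second check reports the V2 interface as unmet. The code can detect the mismatch, but it cannot answer the logical question raised above: whether the consumer or the producer gets to dictate that `delivery_window` belongs in the payload.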
- 00:48:30 Udi Dahan
- It's kind of like I send you an event saying, "Give me some data and I'm going to pretend like I don't know you," so I'm just going to sort of shout into the ether and say, "Hey, if somebody named Kenny would happen to have some data that conforms to 1, 2, 3, 4, 5, 6, I'd love to hear that in the next five minutes." That's not really a decoupled relationship, is it? Right? Sometimes people, they get involved in the tactics and they put all sorts of things into place without realizing that, look, essentially you have two components here that are talking to the same database table. You're trying to pretend like they're decoupled. They're not. Just bite the bullet. Accept the fact that they're logically coupled and use simpler, more straightforward communication mechanisms to represent the reality rather than sort of twisting things around to try to make them behave as if they should be decoupled.
- 00:49:34 Kenny Baas
- I guess that also counts for more of a sociotechnical one like the upstream downstream in a context map, right? Sometimes we try to decouple it while the business is not at all decoupled in that way. I once was in a situation where they tried to do it, but they're saying, yeah, well, they're not solving the issues out of our events. Well, then it's not. Right? They're trying to go around the situation that the business is not there.
- 00:50:04 Udi Dahan
- Yes. Allow me to add maybe just 30 seconds on that because this is a common failure scenario when people do try to design their bounded context using the upstream downstream type of thing. They're looking at that sort of from a time-based separation. These are often data flow type situations. The data starts here, then it goes here, then it goes there, it goes there. That type of data flow or time-based separation often does not correlate with logical separation, so trying to represent things in a context map based on time and then introducing events as a way of doing that, you end up with fat events, which is indicating there is logical coupling there, and that usually means that you need to look for logical boundaries to say that a given logical boundary will have some time zero functionality, some time one functionality, some time two. In a given data flow, a given logical boundary will be involved at multiple points and only through that you can sort of piece together multiple logical elements without having to share a bunch of things between them beyond IDs. But it's a very common failure point. The data flow time-based separation model leads to poor logical boundaries.
- 00:51:30 Kenny Baas
- Yeah. Yeah. That's why I started, well, I think many in the DDD community started to call it a sociotechnical context map, right? Sociotechnical, not only a time-based one. Marco, I think you have a...
- 00:51:44 Marco Heimeshoff
- Yeah. There's kind of two questions that are counter related. One person asks, is orchestration a single point of failure because it is less robust? On the flip side, how to avoid a situation in choreography when the services become so decoupled over time that no one understands which systems will react to certain events and it becomes difficult to understand the whole process at large. How do you deal with this trade-off, and is orchestration, let's start with that. Is orchestration a single point of failure? How can you ensure robustness there?
- 00:52:21 Udi Dahan
- Laila, do you want to start?
- 00:52:22 Laila Bougria
- Sure. I'll start with that first one about orchestration. Again, sort of depending on the flavor of orchestration that you're using, I think that's a very important component in how robust or how failure prone it's going to be. I think if you can make use of the message-based flavor of orchestration, you are setting yourself up for success a whole sort of way down the road. Why? Because you're capturing those incoming requests to the orchestrator at least on a message queue, so if that orchestration component fails, even if it's unavailable for a while, when it's back up then those requests are safely stored in the queue. If there's any events it's subscribed to, if there's any incoming commands, it can basically just resume the work.
- 00:53:10 Laila Bougria
- Now, it is definitely true that there's a lot of contention there, right, because basically given that you have this orchestrating component that is deciding which step is going to happen next, basically anything that is related to that workflow needs to pass through that orchestration because otherwise that state is not consistent anymore. It doesn't have the full overview anymore. It's definitely a point of contention, if you will, in that regard, which is an additional reason to think about robustness, and which is why I would be way more inclined to take the message-based approach, especially because of that. Yeah. That's definitely one thing that I would consider for that first part of the question.
- 00:54:01 Marco Heimeshoff
- This is my own, just thought about that. Would you then implement the orchestrator as some kind of saga pattern when you use messages in state-based orchestration?
- 00:54:11 Laila Bougria
- Okay, so for the people that know NServiceBus, NServiceBus sagas, I think we should differentiate because there's the saga distributed transaction pattern, which is basically a mechanism to sort of create or mimic a sort of distributed transaction where you're accessing multiple types of infrastructure that are still part of a sort of logical transaction. For example, I don't know, I want to, for example, process an invoice and that means that I need to store a PDF format of that invoice on Blob, but I also need to make some changes to SQL, to my SQL Server database, and you can't really do that in a distributed transaction, so a way to solve that is using the saga distributed transactions pattern, right? But the way we look at sagas in NServiceBus is a little bit different and it's rather a sort of a way to model a longer running process or workflow, and I think that would be more applicable to the question that you're asking, Marco, in this context, so I just wanted to differentiate those to make sure that everyone is on the same page.
- 00:55:20 Laila Bougria
- NServiceBus sagas basically allow you the flexibility to use commands or events where they make sense, but still give you that sort of framework, if you will, to build a sort of orchestrated solution because you have that component that says, "Hey, I am started by this specific type of message," and it's basically going to take the ownership of the flow. It can also store some state and basically keep that around to make decisions. One of the things that comes up a lot is ordering issues, right, because that usually is a problem, and one of the things that I find really useful when we start talking about ordering of messages is to think in terms of prerequisites rather than order. Specifically in a workflow, I think that makes a lot of sense because usually if you look at a workflow in sort of logical terms and you look at each individual step, it might be that for a specific step that's part of that workflow, you have multiple prerequisites.
- 00:56:22 Laila Bougria
- For example, I only want to ship this product once it has been paid for and once it has been stored in the database, and those are my two prerequisites. Basically, you could still from the orchestrator's position sort of execute those two steps of storing that data and then making sure it's paid at the same time, but you're going to wait until those two prerequisites have been fulfilled to then say, "Okay, and now we can move ahead and ship." That's one of the strengths that we have in NServiceBus sagas.
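Thinking in prerequisites rather than order could look roughly like the sketch below. It is plain Python, not the NServiceBus saga API, and the class and handler names (`ShippingSaga`, `on_payment_received`, `on_order_stored`) are assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class ShippingSaga:
    """Hypothetical orchestrator state for one order."""
    order_id: str
    paid: bool = False
    stored: bool = False
    shipped: bool = False
    log: list = field(default_factory=list)

    def on_payment_received(self):
        self.paid = True
        self._ship_if_ready()

    def on_order_stored(self):
        self.stored = True
        self._ship_if_ready()

    def _ship_if_ready(self):
        # Ship once, and only when every prerequisite is fulfilled,
        # regardless of which message arrived first.
        if self.paid and self.stored and not self.shipped:
            self.shipped = True
            self.log.append(f"ship {self.order_id}")


first = ShippingSaga("order-1")
first.on_payment_received()
first.on_order_stored()

second = ShippingSaga("order-2")
second.on_order_stored()      # messages arrive in the opposite order
second.on_payment_received()

print(first.log, second.log)
```

Neither handler assumes it runs first, so the two steps can execute in parallel and their messages can arrive in any order.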
- 00:57:01 Marco Heimeshoff
- Yeah, thank you. Thanks a lot. Then there's the other part of the question, how to avoid a situation in choreography when the services become so decoupled after a while that no one understands which system will react to a certain event and it becomes difficult to understand the whole process at all.
- 00:57:20 Udi Dahan
- Right. I think that that's where the expectations, or I'll start with the popularity of the microservices architectural style sort of gets intermingled with that. Let's say you had let's say a dozen or two dozen logical bounded contexts and they were collaborating in sort of a choreography style, then probably for most people they'd say, "Yeah, that's complex, but I can draw a picture of that and continue to reason about it at sort of that level of scope." But if I have 200 or 500 bounded contexts, that's when I feel like this is too much and I can no longer reason about things when they are that decoupled from each other. I'd say that the premise of the question is related to the number of parts that are interacting in a choreography style.
- 00:58:30 Udi Dahan
- Now, by virtue of those things that we said before to say we'd be using a choreography style for things that are logically decoupled from each other to a high extent, which means they're highly autonomous, they have a high degree of data ownership, you don't tend to see those criteria being able to be fulfilled when doing sort of the really small microservices, right? Because those microservices oftentimes are doing request response. They don't fully own their data. Multiple other microservices are touching the same data. That's where, again, when people are asking these questions that say, if you're thinking about microservices, then your first step is to consider the fact that if you have microservices, there's probably going to be a bunch of those within a given bounded context, so the bounded context would be larger potentially than a single microservice. That's really the only way you're going to be able to have the high degree of autonomy and data ownership in all of those elements.
- 00:59:41 Udi Dahan
- By doing that, the 12 to 20 bounded contexts that you've got, each of them might have 10, 15, 20 microservices inside it resulting in a total number of 200, 500 microservices across everything. Yeah, if you didn't have the bounded context concept on top of those and everything was sort of eventing and loosely coupling with everything else, it would be a mess. When you put in that sort of organizational concept of the bounded context on top of those microservices, then in my experience, and I've done this across a large number of domains, you don't often see more than a dozen of those bounded contexts, two dozen when you're starting to talk about a whole business unit, so at those levels, it's usually within the scope of reason.
- 01:00:48 Udi Dahan
- That's not to say that there aren't technical tactical challenges to kind of say, wait a minute. Who is publishing this event? Who are all the subscribers of this event? There can still be a lot of moving parts and it can be difficult to figure that out. This is one of the reasons why around NServiceBus, we built the rest of the Particular Service Platform that essentially extracts a lot of this metadata. Who is publishing which events, who is subscribing to which events? Essentially we have a database of all of this information, and quite some time ago we wrote a blog post that shows how you can do something like that. I'm pasting the link to that in the chat for those of you that want to look at that. Oh, wait, we need to pass that on to... This is the wrong chat. All right. I'm going to have one of the organizers pass that along.
- 01:01:46 Udi Dahan
- But essentially we're able to have a picture that looks like this, if I can share my screen briefly, where essentially we've extracted what those logical endpoints are, as they're called in Particular lingo, and we have the ability via the audit log of what the production system is actually doing to identify the flow of commands and the flow of events between all of those things. Here you're seeing a picture that has five endpoints where hopefully you can sort of extrapolate from something like this to say, yeah, with 12 to 20 endpoints, and yeah, you'll probably end up with potentially 50 to 60-ish events at that bounded context level. That would be a big complex picture. But for the most part, you don't need to look at all of it all of the time, and the difference here is that this is automatically generated from your running system rather than being an architectural diagram that somebody drew to say, "I think the system works like this, or we're going to try to design the system to work like this." This actually shows you, well, this is what the system's actually doing, including all of its elements.
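As a rough illustration of deriving "who publishes what, who subscribes to what" from what a running system actually did, here is a small Python sketch. It is not how the Particular Service Platform is implemented; the audit record shape (message type, sending endpoint, receiving endpoint) and the endpoint names are assumptions made for the example.

```python
from collections import defaultdict

# Hypothetical audit records: (message type, sending endpoint, receiving endpoint)
audit_log = [
    ("OrderPlaced", "Sales", "Billing"),
    ("OrderPlaced", "Sales", "Shipping"),
    ("OrderBilled", "Billing", "Shipping"),
    ("OrderPlaced", "Sales", "Billing"),  # repeated traffic collapses into one edge
]


def build_flow_map(records):
    """Group observed traffic into publishers and subscribers per message type."""
    publishers = defaultdict(set)
    subscribers = defaultdict(set)
    for message_type, sender, receiver in records:
        publishers[message_type].add(sender)
        subscribers[message_type].add(receiver)
    return publishers, subscribers


publishers, subscribers = build_flow_map(audit_log)
for message_type in sorted(publishers):
    print(message_type,
          sorted(publishers[message_type]),
          "->",
          sorted(subscribers[message_type]))
```

Because the map is built from observed messages rather than from a hand-drawn diagram, it reflects what the system does, not what someone believes it does.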
- 01:03:26 Udi Dahan
- Again, for 10 to 20-ish endpoints, a picture like this is sufficiently understandable, tractable for doing a choreography type style. It would also be useful in an orchestration type style as well. Right? If you had an orchestrator, so essentially the diagram that we showed at the beginning that kind of says, okay, sending commands and getting responses from these five or 15 other endpoints. It could also be visualized in this kind of way. But I'd say generally speaking, you'd want some kind of tooling to be able to extract an architectural type diagram from a running system to enable and support the teams that are working on it to make good decisions about how to continue to evolve it. Again, regardless of using orchestration or choreography. But the whole concept of is choreography going to break down? Is it going to be too difficult to manage? Usually the only reason that occurs is there are too many microservices-y type things without larger contexts around.
- 01:04:44 Laila Bougria
- If I can add to that, actually, ServiceInsight is a really good example of an observability tool. Right? We can also look at that in sort of a broader context. Because I think one of the strengths of ServiceInsight is that it can, without you having to do anything, extract all of that information and visualize how the messages are flowing through the system. But the thing is that a system is usually also composed of other components, and some of it is still REST APIs while you're transitioning from your monolith into a more sort of event-driven system and all of that, and one of the ways to get there and to increase the observability is also to adopt something like OpenTelemetry, which is by the way also something that we do support within NServiceBus as well. But that sort of allows you to sort of see how all of the system components are interacting from basically your message broker to your database to across REST APIs and so forth, so that gives you an even wider picture, which can also be overwhelming. I think if you're looking in the context of how are the messages and the events flowing through my system, then ServiceInsight gives you that sort of more condensed view of what is flowing and what is subscribed to what and so forth. But if you need that sort of wider context, then OpenTelemetry is also incredibly powerful.
- 01:06:07 Udi Dahan
- This is an image of ServiceInsight again. This is part of the Particular Service Platform around NServiceBus. These are the types of things that we've realized are very helpful to people when working on asynchronous message driven, event driven types of systems. There's a bunch of tooling here. For those of you that are working in .NET, this is going to be great for you. For those of you that are working in other platforms, the tooling there is not developed to this degree, but hopefully this can give you the idea of what's possible.
- 01:06:47 Kenny Baas
- Cool. Thanks. Yeah, and I think, I'm not sure whoever mentioned this, but to relate to that last question, right, if you're not careful you're ending up with an umbilical cord anti-pattern, but I'm not sure whoever mentioned that, if you do choreography and it's still too entangled with each other, but I'm not sure where that came from. It reminded me a bit if you're not watching correctly how they relate to each other, you're ending up with coupling everything again through choreography. Not sure who mentioned it.
- 01:07:20 Laila Bougria
- That's interesting, Kenny. I'll give you another one to sort of think of that I came across somewhere. I wish I knew where. But it was pinball architecture where basically events are basically bouncing from here to there and it's like a pinball machine basically. I found that also to be a good fit to describe that same thing.
- 01:07:41 Kenny Baas
- To follow up on this one, when you have complex workflows that has a lot of compensating failure recovery paths, which one would you prefer? Maybe which heuristics do you have when that happens a lot? Is it orchestration or choreography?
- 01:08:04 Udi Dahan
- I think that I want to come in with sort of a preliminary part because we've been talking about workflows a bunch and business workflows, technical workflows. I think that's sort of the first part, to distinguish different layers of workflow, if you will. Like the example that Laila gave us, I need to generate a PDF and put it on Blob storage, and then I also need to put the, let's say the URL of the Blob storage thing into my database so that it can be looked up. That is a technical type of workflow situation versus a business type of workflow situation, where the first step is getting a supervisor's approval for an employee's request of absence.
- 01:09:01 Udi Dahan
- Now, if the supervisor does not respond within one day, then that can escalate to an HR request for approval, so it's a more pure business type workflow that is not related to the technical underlying systems that might take part in that workflow. When people talk about complex workflows, if you will, the first step is to say, "Well, let's try to make sense of the workflow that we're talking about and try to partition it, break it up, layer it in various ways so that it's not just one big monolithic thing." That's one element of it. Other parts involve sort of validating or challenging the way that the workflow is defined and failure conditions of that, and therefore the need to compensate. In academia, one of the classical examples that is... There are two classical examples that are given around workflows and compensation that are also mentioned in the literature around the initial saga distributed transaction pattern.
- 01:10:30 Udi Dahan
- One of them is transferring money between accounts. Saying you're transferring money between account A and account B as a kind of workflow, and then it's like step one is decrement account A, and step two is increment account B, and then you'll have a compensating part of that workflow saying if that fails, then you return the money into account A. Right? Now, all of that sounds great until you go and talk to people in banking and they say, well, so there's actually a whole bunch of other intermediate steps, and yet if it fails, you don't necessarily get all of your money back because there are fees that you pay regardless of whether the thing succeeded or failed. The talk about compensation is itself not necessarily a pure complete compensation, and in some cases you have things that, so let's say I'm trying to transfer money to an account and that account has been flagged for money laundering. Then the compensation, it's not just, am I going to get my money back? I might not, because now my account becomes flagged as possibly being involved in money laundering as well.
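The textbook transfer with compensation, plus the wrinkle raised here that fees are paid whether or not the transfer succeeds, can be sketched as below. This is a toy Python model under stated assumptions: the flat `FEE`, the `blocked` set standing in for a flagged target account, and the function name are all invented for illustration and say nothing about how real banking systems behave.

```python
FEE = 2  # assumed flat fee, charged whether or not the transfer succeeds


def transfer(balances, source, target, amount, blocked=frozenset()):
    """Move money between accounts; compensate if the credit step fails."""
    # Step 1: debit the source for the amount plus the fee.
    balances[source] -= amount + FEE
    if target in blocked:
        # Compensation: the amount goes back, the fee does not,
        # so the source does not return to its original balance.
        balances[source] += amount
        return "compensated"
    # Step 2: credit the target.
    balances[target] += amount
    return "completed"


balances = {"A": 100, "B": 0}
print(transfer(balances, "A", "B", 50, blocked={"B"}), balances)
```

Even in this tiny model the "compensated" outcome is a new business state rather than a clean rollback, which is the point being made about compensation.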
- 01:11:52 Udi Dahan
- There are all sorts of variations to these workflows that as you dig into them, they don't end up nearly as pure and clean in kind of the sense of, okay, so things are going to compensate. Usually when I hear the term compensation, that is a technical person making assumptions about the business domain. When I talk to business people, I tend not to hear the word compensation. I hear all sorts of domain terminology as flagged for money laundering or all sorts of other business states that then themselves result in yet other workflows that are happening. I'd say that if you're hearing the term compensation, then if it's a business workflow, you're probably misunderstanding. You're not talking to the right people. Go find some other domain experts to talk to and you'll often hear different things.
- 01:12:48 Udi Dahan
- At a technical workflow, the thing that I talked about at the beginning, yes, at that level, sometimes you'll have compensation in the sense that after I do this, if that fails, then I want to clean up. Unless the data can arrive in both of the places together, it should arrive in none of the places because essentially I'm building a kind of distributed database. There's a lot of stuff that needs to essentially be teased apart whenever somebody says complex workflow before you even get to the part of what am I doing for those various parts. I'm sorry. That was a very long-winded non-answer to the question essentially saying, here's why you should be careful when asking a question like you're asking. Here's a whole bunch of contexts that may lead you to ask a different set of questions to begin with. But again, back to those questions, happy to transfer it over to Laila. This is the "Udi doesn't answer the question but talks for five minutes," and then Laila comes along and actually answers the question webinar.
- 01:14:03 Laila Bougria
- The tricky part is I lost track of the question now.
- 01:14:05 Kenny Baas
- The question was for complex workflows that have lots of compensation and failure recovery paths, which is better, orchestration or choreography?
- 01:14:15 Laila Bougria
- Thanks. Thanks, Kenny. Well, I covered this a little bit earlier, I believe, but I think as the complexity of the workflow grows, which means more compensating transactions or alternative workflows that need to be invoked somehow, I think the more that complexity grows, the more you will benefit from having an orchestrated approach. However, sometimes specifically actually if it's an alternative flow that is being invoked, it could also be natural to choreograph that instead and say, for example, let's say there's an order that comes back in. The customer just sent it back. They tried on the dress, it doesn't fit, they don't like it, whatever it is. Right? You send it back and at that point it's not that initial workflow that needs to be invoked, but it's rather an alternative workflow where you say, "Okay, we just need to refund this," and it may completely be separate from that initial workflow. Although in our minds they are connected to the same order, but still they're completely different things and they can be handled separately.
- 01:15:29 Laila Bougria
- I think also this sort of exercise of should we do choreography or should we do orchestration is a sort of continuous exercise. I think especially when you're adding steps to the workflow, I think that is a very good moment to sort of reflect back of, are we still using the right pattern? Because it may have been the right choice for a while until requirements start to change or new insights start to emerge which lead you to sort of reevaluate whether you should jump from one to the other. I think one common one to sort of go from a choreographed solution to an orchestrated one, if you see that adding steps is easy in a choreographed solution because it's like, okay, I can just subscribe to this event and do whatever I want. But imagine that there's a requirement change and you've always been shipping physical products and then at some point you're delivering, I don't know, a digital product to just someone's inbox or whatever, and that might break up that entire flow causing you to have to change multiple services that now have to react to other events in order to make the flow work.
- 01:16:47 Laila Bougria
- I think that is a very good point to say, does it even make sense to have this in a choreographed form and how confident are we at this point in time that this won't continue changing? Also, the flip side of having an orchestrated solution is that it can also sort of run the risk of becoming its own little monolith or even distributed monolith. Yikes. Because we have this component there, so it's so easy to go there and to say, okay, we can just add it there. But I think going through that exercise of asking yourself, okay, if I want to add this individual step to this workflow, does it make sense? If I would leave this out, does that make this workflow incomplete? Does it mean that it's not valid anymore? And if that answer is yes, then it makes sense as part of that workflow, but if it doesn't, you might just have a case where you can sort of say, no, this part can just be done through a sort of choreographed solution. But it's a continuous exercise.
- 01:18:00 Udi Dahan
- I'd like to layer in one other part because I don't think it gets discussed enough when developers and architects are engaging with requirements slash workflows, especially around sort of failure points, and here I want to talk about logical failure points, not like the technological, "I tried to call an API and I got 500 back." All right? At a logical level you might say, to use our e-commerce example, I'm going to buy a product, and one of the steps in that workflow that we need to check in the system is to see that the product has not been recalled. It's not defective. All right? We kind of say, oh, okay, great. We've got a workflow and one of the steps is check if the product has been recalled and if it has then essentially fail the workflow, right? That's kind of a roll back the process type of scenario. We tend to receive those requirements and say, yes, that makes perfect sense.
- 01:19:11 Udi Dahan
- At this point in the workflow, do this kind of check. It could be calling a database, calling a microservice, calling an API which gives me that information and we implement that workflow and then we're done. We don't often consider the fact that there was another workflow that set that flag in the first place, right? That a product was fine for a while and then somebody in the back office clicked a button that said, mark this product as defective, mark this product as recalled, mark this product as no longer for sale. Now essentially you've got two users, or multiple users, those that are wanting to buy the product that are in a kind of race condition with somebody else who is marking the product as defective. While we kind of say, oh, okay, so whoever placed their order before the product was marked as defective, that's fine, they get the product, and whoever tried to place their order after it was marked as defective, then they don't get it.
- 01:20:26 Udi Dahan
- But if we zoom out just like a couple of inches from that statement and say, but wait a minute, should the millisecond of who pushed the button first mean that now this family is feeding their baby defective baby food? It's like, no. The fact that a product is defective, it shouldn't just be a check that we do in, let's say, the first workflow, but as a part of the workflow of marking a product as defective, we might need to expand that workflow to say, well, let's look at orders that were received within the last certain period of time and make sure that we then get in touch with those customers, letting them know that the product is defective and that if they've received it, that they should return it and that we're issuing a refund. This whole make a product defective is not necessarily a simple click a button, set a record in a database type of workflow, but it's a much longer thing and we only discover that that is needed because we're looking at sort of the other side of the race condition.
- 01:21:53 Udi Dahan
- Whether you have business analysts that are doing this or you are interacting with business analysts or domain experts, anytime you're getting a requirement, an if statement that is checking some flag about some entity state, open up that conversation with your business stakeholders and say, "Wait a minute. What about the race condition happening the other way? What if this happened a second earlier? Should those things be ignored? Should we not have any workflow that handles those?" There's often a whole bunch of other requirements that if you don't handle them early, then later on you're going to get them surfacing because that's just sort of the way that the business works. These compensations are usually not technical compensations, these are examples of yet longer running workflows that might be happening in a different workflow than the one that you're currently looking at. Just as ways to broaden people's perspectives when looking at a single if statement.
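To make the two sides of that race condition concrete, here is a minimal Python sketch. All names (`Order`, `can_purchase`, `recall_product`) and the 30-day lookback window are assumptions made for illustration; the real period and the follow-up actions would come from the business. The first function is the single if statement in the purchase workflow; the second is the longer workflow behind setting the flag.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class Order:
    order_id: str
    product_id: str
    customer: str
    placed_at: datetime


def can_purchase(product_id, recalled):
    """The 'if statement' side: block new orders for recalled products."""
    return product_id not in recalled


def recall_product(product_id, recalled_at, orders, recalled,
                   lookback=timedelta(days=30)):
    """The other side of the race: marking a product as recalled also
    follows up on orders placed shortly before the flag was set.

    The 30-day default is a placeholder, not a recommendation.
    """
    recalled.add(product_id)
    cutoff = recalled_at - lookback
    # Orders that 'won the race' still need a business-level compensation.
    return [
        f"notify {o.customer} and refund order {o.order_id}"
        for o in orders
        if o.product_id == product_id and cutoff <= o.placed_at <= recalled_at
    ]
```

In a real system the follow-up actions would more likely be messages or events than strings, and the recall itself could be a long-running workflow. The point of the sketch is only that setting the flag and following up on earlier orders belong to the same business process.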
- 01:23:06 Kenny Baas
- Great. Thanks. We're almost out of time. One thing there, what we talked about in the last session is, right, that's the power of DDD. Technically we talk about composition, but we are aligning to the business, right? These are nice heuristics in DDD to say, well, if you talk too technical, or generally if you use create, read, update, delete, maybe you should find out what's really happening in your business, which is great. Perhaps before we close down, any last words from you, Laila or Udi? Any tips maybe you have out of the hat, like start here, or just some last minute things you want to share?
- 01:23:51 Laila Bougria
- Regarding this topic specifically? I'm planning a session, so come watch it if you have the opportunity. I usually do share a bunch of resources, so even if you don't catch me at any of the conferences I speak at, I do usually tend to share a lot of resources. Yeah, or if it's recorded somewhere, I also tend to collect that all together to give everyone the ability to read up on that. But apart from that, maybe there's one thing that I would like to add that is often forgotten, I think, specifically when comparing orchestration and choreography, and it's the topic of SLAs. If you have a certain business flow, business transaction that needs to happen, it may happen that there's a business requirement to say this has to be done within this amount of time.
- 01:24:55 Laila Bougria
- I think if you run into such a scenario, doing that in a choreographed solution is going to be a lot harder, especially if things are scattered across multiple services because then there's no real owner, no specific point where you could say, "Hey, you have to keep track of this time component, and when that is exceeded you need to trigger some alternative flow," or we need to raise whatever red flag or ring whatever bell is relevant. I think that's also an interesting way to think about it because it's usually an afterthought. Usually, you build or you design a system and then someone walks in and says, "Hey, this has to happen within 48 hours," and you're like, "What?" That's maybe a question to ask upfront that I noted down as a good tip as well.
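As a rough illustration of why an orchestrator makes SLAs easier, the sketch below gives a single workflow object ownership of the deadline. The `OrderWorkflow` class, its fields, and the `"escalate"` signal are hypothetical; a real orchestrator would typically rely on a durable timeout mechanism rather than being polled with the current time.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class OrderWorkflow:
    """One owner for the whole flow, so one place to track the time component."""
    started_at: datetime
    sla: timedelta
    completed: bool = False
    escalated: bool = False

    def check_deadline(self, now):
        """Return "escalate" the first time the SLA is exceeded, else None."""
        overdue = now - self.started_at > self.sla
        if overdue and not self.completed and not self.escalated:
            self.escalated = True  # trigger the alternative flow only once
            return "escalate"
        return None
```

With a 48-hour SLA, a check at 47 hours returns `None`, the first check after 48 hours returns `"escalate"`, and later checks return `None` again because the alternative flow has already been triggered. In a choreographed flow there is no single object like this, so some service would have to be given that responsibility explicitly.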
- 01:25:49 Krisztina Hirth
- This is one of the usual failures, not thinking about things which do not happen. What do we do if it does not happen? Nobody thinks about this. "It's a bug." No, it's not. It wasn't designed. It wasn't even thought about.
- 01:26:05 Kenny Baas
- Thanks. Any final tips about orchestration and choreography from your side, Udi?
- 01:26:11 Udi Dahan
- Well, I'm not sure I'd call it sort of a tip, but I've kind of alluded to the fact that I talk about a lot of these things in my course. My guess is that a bunch of people that are on this session have already taken the free online version of the course that we've got. I am running the course again in person in London this September and we are doing a giveaway, so for those of you that would like to win a seat at the next Distributed Systems Design course, again, that's September 18th to 22nd, go to this URL, go.particular.net/virtualddd. Fill in your details. I know it's an expensive course. That's part of why we're doing this. We also have another offer available: if any of you would really like to go, but you're paying out of pocket, we have special discounts that are available for those of you that don't have an employer that is paying for them.
- 01:27:28 Udi Dahan
- Please go to that link, sign up and reach out, and I'd love to see you there. We still have a couple of seats available. Yeah, that's it. There's a lot more to talk about each of these topics. There's a reason why I've got a five day course on it and why some people end up going back to the course again more than once because it is, a lot of it is sort of a relearning and an unlearning that happens as well as learning the new thing along the way. That's how I'd wrap it up.
- 01:28:04 Kenny Baas
- Yeah. I remember when I did my, well, I did a DDD course and then over the next five to 10 years I kept coming back to concepts from that first time. Right? Once you start working on them, you need to come back, and there's so much to grasp, and later on you'll understand it better. Thanks. Thank you, well, for sharing your information, knowledge, expertise, and answering the questions. Thanks to every participant who was here answering or asking the questions. Thanks. Without you, we wouldn't have had this really nice conversation. Next month we will have Barry O'Reilly back at our meetup. We will deep dive a bit more into residuality theory, and then in September, we'll do an open space. More on that later. Thanks, and see you next time. Bye-bye.
- 01:28:57 Marco Heimeshoff
- Bye.
- 01:28:57 Laila Bougria
- Thanks for having us.
About Udi Dahan
Founder and CEO of Particular Software, creator of NServiceBus, and one of the world’s foremost experts on Service-Oriented Architecture and Domain-Driven Design.
About Laila Bougria
A software engineer at Particular Software - makers of NServiceBus, and Microsoft Azure MVP. She's passionate about software and always looking for patterns, both in code and in yarn.