Design more decoupled services with one weird trick
About this video
This session was presented at NDC Oslo 2024.
Developers have been trying to build more decoupled systems for years, but the same challenge keeps popping up: how much data do you pass between things, whether via an API or through events?
Lots of parameters means lots of coupling. Passing over just an identifier means the receiver needs to look up the data it needs, essentially sweeping that coupling under the rug and pretending it isn’t there anymore.
Join Udi to see the “one weird trick” that rotates the design space enabling services to be so much more decoupled, each encapsulating their data that much more completely.
You won’t believe what happens next!
Transcription
- 00:04 Udi Dahan
- All right. Good afternoon. Welcome. Thanks for coming and finding your seats in such an orderly manner. I love coming to the Scandinavian countries. The folks are so polite and organized. For those of you who don't know me, my name's Udi Dahan. I'm going to be spending the next hour talking to you about how to decouple your services. I'll be covering a fairly wide range of topics, and at the end, I will come to this one weird trick. So if you're wondering what that is, you're going to have to stick around a little bit longer. If you want, you can follow me on all the socials. I'm @UdiDahan, I'd be happy to connect with you afterwards. I work at a company called Particular Software. We have a booth over there. Come see me afterwards, we'll talk some more.
- 00:59 Udi Dahan
- So why am I even talking about decoupling? Well, I mean it's been the biggest problem in software since, when I say practically forever, in software terms, I looked this up on Wikipedia, the earliest reference that they have documented for the concept of spaghetti code is 1977. So if you count that in internet years, that's like several hundred years. If you count that in TikTok years, that's several thousand years at this point. So this idea of coupling, again, from the earliest days of software, it's really always been an issue. Now, we've never really had a good definition for it. It's one of those things that, as developers, we know it when we see it. That challenge of things that stick together too much. You pull on one and you end up getting a whole bunch of others. The spaghetti metaphor is actually quite apt for anybody who's ate spaghetti. It's really hard. It takes a great deal of effort to extract one single strand of spaghetti and do something with it, and code is like that as well.
- 02:16 Udi Dahan
- Now, from the code side, well, as an industry, we didn't just stop there and say, "Well, we're just going to have spaghetti code." We ended up creating spaghetti databases as well. So databases, relational databases, and subsequently to that, document databases, and graph databases, and all those things, same idea went along. It's like, well, we've got this entity, and it's got one-to-many relationships with these other entities. And those entities have many-to-many relationships with five other different entities, and just those strands of coupling enmesh all of the data in the database. And again, it's a you try to change just one thing, and you end up breaking all sorts of unrelated things.
- 03:05 Udi Dahan
- Databases are particularly pernicious as places of coupling because, unlike code, for the most part, with code, you can step your way through it. It might be a mess, but you can set a breakpoint and say, "All right, let's just follow this thread of spaghetti and see where that takes me." Databases on the other hand, well they're used by lots of code, right? Who's got a batch job in their system? Yes, batch jobs. Can I get a show of hands? Okay. The problem with batch jobs is that it's some other code somewhere else, and our code over here that we're working on has no awareness of that other code over there.
- 03:52 Udi Dahan
- Now, on the one hand, you might say, "Well, that's good. It's technically decoupled from the batch job," right? Operative word being technically decoupled because we find out that we change our code and we get all the unit tests to pass and we get all the integration tests to pass and get all the acceptance tests to pass and we're thinking, "Okay. This is good. I got this. I'm going to deploy it." And then I'm checking production, I'm checking the logs and miracle happened and just first try it worked. And you're thinking to yourself, "It can't be this easy." And you wait, you count the hours and you're like, "No, nothing's blowing up." And then at the end of the week, you get a call, "Something blew up." "What was it?" "Batch job."
- 04:46 Udi Dahan
- So that problem of a shared database and coupling coming in through the database is often one of the most difficult forms of coupling that we've got. Unfortunately, we keep doing more of it. Now, at some point in time, we tried to say, "All right, what if we tried to decouple there as well?" And as an industry, we went to this concept of microservices. Who's heard of microservices before? Yes. Okay, everybody. Who's doing microservices? Hands up. Okay. Still, most of you. Who's happy with the microservices that they've got? Oh, not nearly as many hands going up. It's like a, "Well, with some of them," right? Some of them more so than others.
- 05:35 Udi Dahan
- So I mean, when microservices came along, we had the Netflix story and the Twitter story and the AWS story, and everybody with all of these hundreds of microservices. And as an industry, we got a little bit of microservices envy. I want to have lots of microservices too. That's what's called resume-driven development, right? "Oh, you only have 20 microservices, we have 50 microservices." And somebody says, "50? That's nothing. We've got 78 microservices." It becomes a pissing competition. Who has the most microservices? But the interesting part again with microservices is back to that element of data. Because if all of these microservices are talking to a shared common database, then once again we have that same problem of coupling where we make a change to one microservice and then we end up breaking who knows how many other microservices.
- 06:33 Udi Dahan
- So that's where when we started doing microservices, the idea was, well, in order to actually have this work, we need to have these microservices own some of their own data. Now, that can help, but also, it can cause problems and we're going to talk about a bunch of those problems. But before we do, I want to come back to this whole microservices versus monolith thing because, well, modular monoliths are now back in fashion, so we need to talk a little bit about that. All right, so let's start with sort of an original monolith, before the modular monolith became a thing, and we had two pieces of code. Components, let's call them. Component one, component two. Component one calls a method Foo on component two, passing in a bunch of parameters. All right, simple.
- 07:27 Udi Dahan
- Now, we decide to microservice-ize our monolith. Microservice-ize is an actual verb now. And we've done this where we have taken component number two, and we are now deploying that into its own process. And instead of doing a regular in-process call between component one and component two, now we're doing an HTTP call. But not just any HTTP call, it's all going through some kind of service mesh, gRPC secured with all sorts of wonderful tokens, serialized with Protobuf because that's the fastest. Until somebody says, "It's not the fastest." I know there are others. And we go through all of this and say, "Great. Now we've got a microservice." But if we look at the logical coupling, the nature of the two pieces of code, what's the difference?
- 08:28 Udi Dahan
- We've added a lot of technology around the code that was there before, but logically, we're still doing an invocation. We're still passing the same number of parameters between these two pieces of code. We haven't really changed the nature of the coupling. And now, by the same token, if we're saying, "Well, we're not going to do microservices anymore. We're going to go back and do a modular monolith." So we're going to take all of that technology that we added over the past five years. Docker, don't need that. Kubernetes, don't need that. Serialization, don't need that. Put everything back in the same monolith and we're in the same coupling story that we were before. So that's what we're going to be talking about here is the how to deal with the logical coupling first. If you can't deal with the logical coupling, technology is not going to help you. If you can, then you might start getting some benefits from it.
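As a minimal sketch of this point (not from the talk; the function names and the JSON stand-in for the network hop are illustrative assumptions), the same four parameters cross the boundary whether the call is in-process or goes through serialization:

```python
import json


def foo(a, b, c, d):
    """Component two: the method that component one depends on."""
    return f"{a}-{b}-{c}-{d}"


def call_in_process(a, b, c, d):
    # Monolith style: a plain method call with four parameters.
    return foo(a, b, c, d)


def call_over_wire(a, b, c, d):
    # Microservice style, simulated: serialize, "send", deserialize.
    # A real system would use HTTP or gRPC here; JSON is only a stand-in.
    request_body = json.dumps({"a": a, "b": b, "c": c, "d": d})
    payload = json.loads(request_body)
    # The receiver still needs exactly the same four parameters.
    return foo(**payload)


in_process_result = call_in_process(1, 2, 3, 4)
over_wire_result = call_over_wire(1, 2, 3, 4)
```

Adding or renaming a parameter on `foo` forces a change in both callers, which is the logical coupling that the transport technology does not remove.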
- 09:27 Udi Dahan
- Now, other things that people have tried, they say, "Well, okay, yeah, we're doing an invocation. We're passing a lot of parameters. Every time we need to make a change to one of those parameters. Well, it ends up breaking the API, breaking our consumers. So how about instead of passing a whole lot of parameters around, we'll take those parameters, and we'll put them in a database somewhere? And then we'll get an identifier back and then we'll do the invocation with an ID. And then on the other side, they'll pull the information out based on that ID." Please, don't do this. Just in case you were wondering, no, I'm not recommending doing this. That's just what I said before about having too much coupling going through the database of saying, "Oh, I've got this logical coupling, what can I do with it?" It's not a let's sweep it under the rug.
- 10:21 Udi Dahan
- Now, of course, developers say, "Well, I'm not doing it via database. I've got a REST API. I've got an entity service and I'm going to put things into the entity service from one place and I'm going to read things from the entity service in another place." And you've still done the same thing. Wrapping up a database with some sort of CRUD entity service, you're still not actually changing the logical coupling that you got. So coupling is like that mass-energy thing in physics. You can transform it, and you can move it around, but it's really hard to destroy it. It'll just change shape and go somewhere else. So the attempts to hide it usually make things worse.
- 11:13 Udi Dahan
- So if you're doing this, please, it's just not worth it. Go back to simple calls between things. Don't sweep the coupling under the rug. But again, for as long as we've been doing software, we've essentially been playing this game and sometimes it feels like we're stuck in the spiral. This is a picture of a thing called an ant mill. An ant mill is what happens when ants lose direction and then they default into following the ant that's in front of them. Now, the thing is when enough ants lose direction, they start following the one in front of them and they're just sort of walking around in a circle. I'm sorry, I'm going to not have this up. It's a little disorienting, but that's what we've been doing as an industry.
- 12:09 Udi Dahan
- It's like coupling was bad, we're going to do microservices. Microservices is bad, we're going back to monolith. And when that doesn't work, what then? We'll have another name for microservices. We'll call them actors this time. Actors have been a thing for a while. Let's do distributed actors. How will actors look? They'll probably look either something like this or something like that. So the idea is to stop repeating the same patterns over and over again and try to find how can we handle the coupling better rather than shuffling it around and always feeling that pain.
- 12:50 Udi Dahan
- So there is a better way. You've come to the right place and luckily there's almost always a better way for anything. However, it requires a number of changes to be done, all right? So it's not just a little tweak to your architecture. There's actually three different parts that we're going to be talking through. The first one is event-driven architecture. So that's number three on the list. There's talks about it at the conference. Event-driven architecture has kind of... It's also been around for a long time. It's been popular. It's gotten more popular over time as event sourcing became a thing and event storming. So now you have event anything and it's like, "Oh yes, absolutely. Give me some of that." We want to eventicize all of the pieces of software. So coincidentally that's not what we're going to be recommending. Event-driven architecture has a piece to play in this change that we're going to be doing to coupling.
- 13:58 Udi Dahan
- Number two is this thing called vertical slice architecture. If you haven't heard about it, we're going to be talking about it. Essentially, it's a different style than layered architecture. So layering is probably one of the better-known approaches where you've got layers that are request-responsing with lower layers beneath them. Vertical slice architecture says, "Well, instead of slicing things up this way into layers, we're going to slice them up top to bottom into these vertical slices." And that's going to have a piece to play together with the event-driven architecture. And then we've got this one weird trick that I promised you and that's going to help us get some true data encapsulation where we were missing it before, both in the vertical slice architecture and the event-driven architecture. So that's our plan for today. That's what we're going to do starting with event-driven architecture.
- 15:00 Udi Dahan
- So what is event-driven architecture? Well, like before we have our components: component one and component two. Now, the idea with event-driven architecture is when I've got those components, instead of doing the classical request-response between them where I've got the calling component, component one, invoking some kind of method, getting some kind of response value from that. Now, again, request-response can be synchronous HTTP request-response. It can be full duplex request-response over some sort of message broker. But the idea with event-driven architecture is that we do a publish-subscribe type of semantic where it's not in the same order exactly as request-response.
- 15:49 Udi Dahan
- So the first part of publish-subscribe is actually the subscribe. So in that way, publish-subscribe is actually incorrectly named. It should have been called subscribe-publish because if there is no subscriber there, if nobody has expressed any interest to the events that are being published, nothing is actually going to happen. So at some later point after the subscribe, the first component will then be publishing events saying, "Oh, Foo happened." So that's the important thing to realize about publish-subscribe versus request-response. A, it's subscribe-publish, not publish-subscribe. B, the roles are reversed between request-response and publish-subscribe.
- 16:39 Udi Dahan
- In request-response, usually, you have a client requesting from a server. When you're talking about publish-subscribe, it's usually the server that is publishing. So where the events are coming from is usually from the thing in the back rather than the thing in the front. Now, this becomes interesting when we put it into all sorts of workflow contexts. We're going to get to there in just a minute, so bear with me. Now, one thing I want to call out, it's important, is notice the past tense. It's Foo happened. Something already occurred. It's not a request, it's not a command, it's not a do something. So past tense for your events.
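A minimal in-memory sketch of the subscribe-then-publish ordering (not from the talk; `EventBus` and the event name `FooHappened` are hypothetical names chosen for illustration, not any real broker's API):

```python
from collections import defaultdict


class EventBus:
    """Toy in-memory publish-subscribe bus, for illustration only."""

    def __init__(self):
        self._handlers = defaultdict(list)

    def subscribe(self, event_name, handler):
        self._handlers[event_name].append(handler)

    def publish(self, event_name, payload):
        # Deliver to whoever has subscribed so far; return the delivery count.
        handlers = self._handlers.get(event_name, [])
        for handler in handlers:
            handler(payload)
        return len(handlers)


bus = EventBus()
received = []

# Publishing before anyone has subscribed: nothing happens.
delivered_before = bus.publish("FooHappened", {"id": 1})

# Subscribe first, then publish: the event is delivered.
bus.subscribe("FooHappened", received.append)
delivered_after = bus.publish("FooHappened", {"id": 2})
```

The event name is in the past tense: it states that something already occurred and does not ask the subscriber to do anything.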
- 17:20 Udi Dahan
- Now, I know some of you are saying, "Udi, we know this stuff." I know. I got to lay the groundwork for anybody who hasn't done event-driven architecture, not familiar with these terms. We're going to get through them quickly and move on. So past tense, something that already happened, not something that we're requesting to happen. Now, unfortunately, because event-driven architecture has become really popular, people are now using it for request-response as well. So they'll publish a save-customer-requested event and they're subscribing to a customer-saved event. You are doing request-response by another name at that point. It's not really pub/sub what you're doing.
- 18:01 Udi Dahan
- So one of the indications is save customer requested. Is that truly a past-tense thing? Is it a thing that already happened? If the answer is no, it's not an event. You're not doing event-driven architecture. That's fine. You don't get brownie points for more events in your system. But there's this belief, unfortunately, I'm seeing more of it. It's a, "Oh, event-driven architecture is more decoupled than the other forms of architecture." And then we essentially just force all of the code to become event-driven even if it doesn't really want to. And then it actually makes the code a lot worse rather than better. It makes it more tightly coupled rather than less tightly coupled.
- 18:46 Udi Dahan
- So I've heard this occasionally and I'm harping on the point because I've seen this happen. Developers do this and then where does this blow up? It blows up actually in production at scale. So when you're doing an event-driven request-response style, it looks fine on your dev box because you only have a frontend process and a backend process running and you're doing the pub/sub between them. In test, usually, also you have a small version of your production environment. You don't have a scaled-out web tier or something like that. And then you go into production and now you have a scaled-out web tier, and you have one server that publishes save customer requested. But when that event comes back, another web server, because they're all listening to the customer-saved topic, steals that event from the first web server because it didn't know that it wasn't supposed to listen for that specific event. And then users don't get their feedback and developers are like, "Wow, that's weird. How did that happen?" Check it on their machine. Everything works great. Why? Single publisher, single subscriber.
- 19:59 Udi Dahan
- Once you scale things out, then there are problems. That's when people start to say, "Oh, okay. There's this great feature that many of the pub/sub systems have, you can subscribe based on a filter." So I'll subscribe not just to all of the customer-saved events, I'll subscribe to the customer-saved events by some identifier, but then what would that identifier be because the customer hasn't been saved yet? So am I going to use the name of the customer, the phone number of the customer, their social security number? It becomes a tricky thing to solve. But for the most part, people are like, "In for a penny in for a pound." They keep trying to make this square peg fit into that round hole without stopping to say, "It shouldn't be this hard. It wasn't this hard with request-response."
- 20:47 Udi Dahan
- So event-driven architecture is for past tense things, not for command imperative do a thing now. So there's the backend side. I do want to mention the frontend side has some uses for event-driven architecture as well. So React and all of those types of tools that have adopted a more event-driven style to decompose the front end, it's absolutely great, but we're not going to be talking very much about that now. It's more on the backend side of things.
- 21:21 Udi Dahan
- All right, so on the backend, what makes event-driven architecture useful, helpful, et cetera, is when we've got complex workflows. So if it's just a simple, take a command, persist something to a database, and tell the user success or failure, if it's a CRUD type of scenario, well, you don't need event-driven architecture. You don't really need any fancy patterns. But when the logic becomes more complex, that's where event-driven architecture starts to help. It's one piece of the puzzle.
- 21:57 Udi Dahan
- So let's take a familiar type of scenario. My guess is that you've all bought something online, and you have some understanding of the workflow that might go into that. Some of you might be actually working at e-commerce companies. That involves things like, well, we need to look up the items from the database, and get their up-to-date prices, and calculate the total. And then we need to check the checkout service and the customer information and what kind of discounts they need to get. And then the inventory levels of the items, and there's this whole big, long, workflowy thing that we need to go through in order to be able to say, "All right. This purchase has now been completed successfully."
- 22:38 Udi Dahan
- Where do events help? Well, we can start to invert some of these interactions and break them up into pieces. Something like this where the consumer is interacting with the website saying, "I'd like to place the order." And then instead of this sales service, for lack of a better term, request-responsing and orchestrating all of the downstream pieces, it does a subset of the workflow. It does steps one, two, and three and then publishes an event saying, "I've decided to accept the order." Now, deciding to accept the order turns into an event that it publishes that other downstream services are subscribed to. Now, in this case, we've decided to say, "Well, the billing process can be done asynchronously with respect to the first sales process."
- 23:34 Udi Dahan
- Now, this is a part technical decision, part business decision. For example, what would need to happen if the customer's card couldn't go through? Now, under the traditional type of orchestration style, something fails. Essentially, the whole workflow rolls back to the beginning. When we apply event-driven architecture, again, it involves both a technical decomposition and a business decomposition. The story goes: well, if their credit card was declined, don't just revert the order, ask them if they have a different credit card. What we found is a lot of times they do, and the purchases can still be successful. So essentially, we break down what would be an otherwise monolithic workflow, and we turn that into smaller modular sub-workflows where each piece of it can succeed and fail independently of the others. That's again, not just a technical decision, there are business decisions there.
- 24:44 Udi Dahan
- Now, as this happens, the same thing on the shipping side. What happens if we don't have the items in inventory, right? Now, under the traditional workflow style, we would say, "Items aren't in inventory." We'll roll back the whole thing, say it's an invalid order, and tell the user, "Oh, that item's not in inventory anymore." Quick question. Those of you who've bought something on Amazon, have you ever gotten a message saying, "Oops, we're out of stock on that product"? Anybody got that? Yes, out of stock. Yeah, some of you. What happened? Did you say, "Well, I'm not going to buy that item then." My guess is that most of you accepted the message from Amazon that said, "But we'll have it in stock in another two weeks and then we'll ship it to you." So the majority of the cases, when we don't have the items in inventory, it's not a failure scenario, it's a it just succeeds differently a little bit later.
- 25:44 Udi Dahan
- So when choosing to introduce events and to modularize your complex workflows, find those places where it maps to business-independent success and failure conditions. Don't just throw events at everything, find those places, and then it will help. Now, shipping here needs to also wait to see, well, did they actually pay for their product? So we don't just say, "Oh, the order was accepted, let's go ship the products." We actually want to wait until billing does its process, publishes the order billed event, and then we get both of the events and say, "Oh, okay. Great. Now, I know that I actually should be shipping the products to the customer." All right?
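A minimal sketch of shipping waiting for both events before acting (not from the talk; the class, the event names `OrderAccepted` and `OrderBilled`, and the in-memory state are illustrative assumptions, and a real system would persist this state):

```python
class Shipping:
    """Ships an order only after both required events have arrived."""

    REQUIRED = frozenset({"OrderAccepted", "OrderBilled"})

    def __init__(self):
        self._seen = {}    # order_id -> set of event names received so far
        self.shipped = []  # order ids in the order they were shipped

    def handle(self, event_name, order_id):
        seen = self._seen.setdefault(order_id, set())
        seen.add(event_name)
        # Events may arrive in either order, so check for the full set.
        if self.REQUIRED <= seen and order_id not in self.shipped:
            self.shipped.append(order_id)


shipping = Shipping()

shipping.handle("OrderAccepted", "order-1")
shipped_after_first_event = list(shipping.shipped)  # accepted but not yet billed

shipping.handle("OrderBilled", "order-1")

# For a second order, the billed event happens to arrive first.
shipping.handle("OrderBilled", "order-2")
shipping.handle("OrderAccepted", "order-2")
```

Because the handler only checks that both events have been seen, it does not depend on which of the two upstream services finishes first.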
- 26:27 Udi Dahan
- So these are the core bits of event-driven architecture. One thing that I want to mention on the infrastructure side because most of this stuff is technically not that challenging. When you're processing a given command like the place order command and persisting something to a database and publishing an event afterwards, there are all sorts of ways where that thing can go wrong. So if there was a failure, if there was a deadlock in the database, if the web server crapped out, if there was any retry that's in there, that can cause a situation where you get the right data in the database, but the wrong data in the event. Or the right data in the database and the event doesn't get published.
- 27:15 Udi Dahan
- So there are techniques to handle this stuff. I'm not going to go into them right now. There's this link particular.net/transactional-session. Read up on that. It'll tell you all of the ways in which this event-driven architecture style can fail technologically and then the ways to go about resolving that. Again, this is an infrastructure piece. It doesn't actually change the event-driven architecture piece that you're putting over the top. All right, I see everybody got a picture of that URL. Moving on.
- 27:46 Udi Dahan
- So that was number three, event-driven architecture. Number two, vertical slice architecture. So vertical slice architecture. Some people know it by a different name. The term micro frontends has started to become more popular. We've also got the backend for frontend pattern. A lot of those things are describing the same idea from different angles. Now, what is it? Vertical slice architecture. As I mentioned before, we've got traditional layered architecture which categorizes things into layers like user interface, API, business logic, data access logic. The idea with vertical slices is, as the name states, not layers, vertical slices. What that means is inside each vertical slice you will have some UI pieces, some API pieces, some business logic pieces, some data access pieces, and yes, even some database pieces. We'll talk about how those things pan out.
- 28:50 Udi Dahan
- So what's the idea over here? Well, there are a couple of elements. One of them is at the database level. We talked about that before. We said if we share things, if we have too much coupling at the database level, too many pieces of code that are talking to the same tables, that ends up breaking things. We want to stop doing that. So we need to decompose at the database level and also, we see that at the UI level, things can get quite monolithic. So if we're looking at a given UI, and I apologize for continuing to use the same Amazon case, I'll give you another one a little bit later. So we've got product details that have images of the book, names of the book, the rating of the book, the pricing of the book, the inventory of the book. And we see this in a lot of systems, there's often some central entity to the system, and over time, the business keeps telling us, "I need another attribute on that thing. I need another attribute on that thing."
- 29:49 Udi Dahan
- And that goes all the way from the UI. It's like, "I want to see this attribute over here, that attribute over there." And then we have to go and add that attribute to the API, and add that attribute to our business entities, and add that attribute to our data access logic, and add it to the tables. So over the years, a lot of us have come to this conclusion, say having to change code in five different places every time the business wants to add an attribute, something there seems, at the very least, inefficient. We should be able to do better.
- 30:21 Udi Dahan
- So one part of this is to say, "Well, does all of this data really need to sit together?" Now, to answer that question, that's actually more of a business question than a technical question just like the event-driven architecture story that I was talking about before. We could try to partition the data into different vertical slices, but that is predicated on the assumption that there is no business relationship between them. Let me give you an example. So the connection between the name of the product and the price of the product. If we put both of those things in the same table in the database, any time I run a transaction on a record in a database, it locks the whole record, right? I can't just lock a field, just lock the name column of that product table, but not lock the price column. That's the way that databases work. So you put the stuff together in a table, that means every record is all locked together as part of one transaction.
- 31:39 Udi Dahan
- But why would we need that? We'd only need that if there were multiple transactions working on the same product at the same time where potentially, they were touching different attributes. Now, imagine we had a situation where a user's saying, "Well, I want to change the image of the book. We've got a better image for the blue book. We want to use it." And another user is saying, "I want to update the price." And both of those users issue their commands, go to the database at the same time. The way that technically it would be built underneath a layered architecture is one of those transactions would succeed. First one wins, the second one would fail because, again, the record in the database was changed and we'd have optimistic concurrency checking and that kind of stuff.
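A minimal sketch of that conflict (not from the talk; `VersionedStore` is a hypothetical in-memory stand-in for a table with a version column, and the keys and values are made up). The first half keeps image and price in one record, the second half keeps them in separate stores:

```python
class ConcurrencyError(Exception):
    pass


class VersionedStore:
    """Toy store with optimistic concurrency: a write must name the version it read."""

    def __init__(self):
        self._rows = {}  # key -> (version, data)

    def read(self, key):
        version, data = self._rows.get(key, (0, {}))
        return version, dict(data)

    def write(self, key, expected_version, data):
        current_version, _ = self._rows.get(key, (0, {}))
        if current_version != expected_version:
            raise ConcurrencyError(key)
        self._rows[key] = (current_version + 1, dict(data))


# One record holding everything: unrelated edits collide.
products = VersionedStore()
products.write("book-1", 0, {"image": "old.png", "price": 20})

version_a, row_a = products.read("book-1")  # user A wants a new image
version_b, row_b = products.read("book-1")  # user B wants a new price
row_a["image"] = "new.png"
row_b["price"] = 25

products.write("book-1", version_a, row_a)  # first one wins
try:
    products.write("book-1", version_b, row_b)
    price_update_failed = False
except ConcurrencyError:
    price_update_failed = True

# Separate stores per attribute group: the same two edits no longer collide.
images = VersionedStore()
prices = VersionedStore()
images.write("book-1", 0, {"image": "old.png"})
prices.write("book-1", 0, {"price": 20})

image_version, image_row = images.read("book-1")
price_version, price_row = prices.read("book-1")
image_row["image"] = "new.png"
price_row["price"] = 25
images.write("book-1", image_version, image_row)
prices.write("book-1", price_version, price_row)
```

In the single-record case the price change is rejected even though it never touched the image; with separate stores both changes go through.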
- 32:30 Udi Dahan
- But if you ask a business stakeholder, say, "Would you want that to fail?" They'd say, "No. Why? What possible relationship is there between name of a product and the price of the product?" Now, we can try to come up with some requirements, if we'll call them that, that the business might want in the future. Like, "Well, wouldn't you potentially want to have a rule that states that if the name of the product is more than 30 characters, that the price must be at least $30?" And the business looks at you and says, "You don't understand anything about business, do you?"
- 33:06 Udi Dahan
- But technically that's how we've built the system. We've built the system in such a way to support that kind of requirement. By putting the name and the price in the same table, giving consistent locking around all of that, we're saying we can support that requirement. And the business says, "Well, good for you. I'm never going to want that." But then they come to us with requirements that they do want and then what do we say? "Oh, no, that's a hard one. I have to rewrite the system from scratch for that one."
- 33:41 Udi Dahan
- So part of both doing event-driven architecture and vertical slice architecture is aligning the boundaries with the business. And for a bunch of things, when you look at them at a field-by-field basis and you talk to your domain experts, they'll say, "I don't need transactions between those things. The price and the name and the image are independently volatile. Put those things in different places, I don't care." "What about the inventory?" "What about the inventory? Obviously, the inventory should not be affected by the name of the book and vice versa." "What about inventory and price?" They're like, "Ooh, that's a good question. Here's the thing: for this type of B2C e-commerce where we can always get more inventory of the stuff that we're selling, the fact that inventory has dropped to a low level is not going to be a reason for us to increase the price. So good question, but no, we're not going to need transactions that span inventory and pricing."
- 34:41 Udi Dahan
- And then that leads us to the conclusion to say, "Ah, we can put all those things in different vertical slices." I can take the name of the product, the image of the product, product ID, that kind of stuff, put that in the yellow vertical slice. And then I can take the rating of the book, and I can put that in the blue vertical slice. And then I can take the pricing of the book, and I can put that in the green vertical slice. And I can take the inventory levels and put that in the brown vertical slice. So long as they're all indexed by the same product ID and I can compose together a UI that has all of this stuff together, then my decoupling works.
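A minimal sketch of composing one product view from independent slices keyed by the same product ID (not from the talk; the dictionaries stand in for each slice's own storage, and all names and values are illustrative assumptions):

```python
# Each slice owns its own data, indexed by the shared product ID.
catalog = {"book-1": {"name": "Blue Book", "image": "blue.png"}}
ratings = {"book-1": {"rating": 4.5}}
pricing = {"book-1": {"price": 20}}
inventory = {"book-1": {"in_stock": 3}}


def compose_product_page(product_id, slices):
    """Merge whatever each slice knows about the product into one view."""
    view = {"product_id": product_id}
    for slice_data in slices:
        # A slice with no data for this product simply contributes nothing.
        view.update(slice_data.get(product_id, {}))
    return view


page = compose_product_page("book-1", [catalog, ratings, pricing, inventory])
```

The only thing the slices share is the product ID; none of them needs to know which attributes the others hold.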
- 35:22 Udi Dahan
- Again, this is not just at the level of the UI as a micro frontend as it were. It's also at the backend, the microservice backend, and the micro frontend. And I'd have a micro database thing at the back as well because, again, there's no reason for me to have a single big database that has all of the data inside it. Now, this is for showing a single type of item page. If you need to show a grid of information, so lists that have data that are coming from multiple different vertical slices, it can be more technically involved. Luckily... Now, again, that's a whole other presentation. That presentation is online, and you can find that, and it goes through and shows how to actually build out all the technical bits in order to support single-item composition and multi-item grid type of composition. So designing a UI for microservices, that's the link over there. Again, those are the ideas, the technique, the technology, whole presentation, it's all available online. Has everybody taken a picture? Yes, no, come back to me later if you need the links.
- 36:41 Udi Dahan
- So we're getting close, we're getting real close where we've got services, microservices, whatever you want to call them, modular micro, macro services where they've got some micro frontendy bits that might be running in the browser. We might also do that on the web server. So there are multiple styles. We can do micro frontends using React, using Angular, Vue, et cetera, or we can do them server-side using regular ASP.NET on the backend as well. But the tricky part is at the database level. Services seem like we're so close that they could fully encapsulate their own data. The only problem is those pesky workflows that we were talking about before. So that's just that one problem that we've got left, and that's going to lead us to our one weird trick.
- 37:40 Udi Dahan
- So we've got these services, vertical slices, top to bottom, micro frontends, micro backends, micro databases. We're using an event-driven architecture for them to interact with each other. We've chopped up our big complex workflows into these subparts, each of them well encapsulated. But now the question is how much data ends up being passed around between these events? Because that's where we started with the problem when we were talking about coupling at the beginning of this presentation, that Foo method that had all of those attributes, A, B, C, D, that was the logical coupling that we were concerned about. So we need to find some way to handle that. Now, the question is, well, how much data is there? Well, let's look at this checkout workflow and get a sense of what that's like.
- 38:28 Udi Dahan
- So in Amazon, you have to select a shipping address. So shipping address data, we've got name, address, phone number. The address itself is a value object that has city, street, house number, and in different places, it's actually more complex than that. Then we've got billing information, the credit cards that the person might have on file, all sorts of other ways that they can pay for it. Then we've got things like, well, what's the estimated delivery date and what delivery options do you want for that type of thing? And then ultimately, we summarize all of that to the user and say, "Okay. Great. That's your summary. That's how much it's going to cost." Got a fair bit of data here, right?
- 39:15 Udi Dahan
- Now, where and when does this data end up going via events? Well, when the user clicks place your order, essentially, we need to collect up all of that data. So the shipping address, the billing address, the credit card information, the delivery option. It's a big payload of data, and essentially, we need to put all of that into that order accepted event that we were talking about right over here. So when we say place order, place order includes these 25 different parameters in it. And then when sales publishes the event order accepted, that includes the same 25 parameters in it primarily because well, billing needs information about how much does this thing cost, and what are the credit card details, and all of that. So it extracts those pieces of information, does the billing, publishes its event. Shipping as well needs to know what the shipping details were. What's the shipping address? What's the delivery options that were selected?
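To make the size of that payload concrete, here is a rough sketch of what such an event might look like, assuming a handful of illustrative fields rather than the full 25. The class name `FatOrderAccepted`, the field names, and the two handler functions are hypothetical.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class FatOrderAccepted:
    """One shared contract that carries every service's data."""
    order_id: str
    total_cents: int        # needed by billing
    card_number: str        # needed by billing
    billing_address: str    # needed by billing
    recipient_name: str     # needed by shipping
    shipping_address: str   # needed by shipping
    phone_number: str       # needed by shipping
    delivery_option: str    # needed by shipping


def billing_handle(event):
    # Billing extracts only its own fields from the shared payload.
    return (event.card_number, event.total_cents)


def shipping_handle(event):
    # Shipping extracts a different subset of the same payload.
    return (event.shipping_address, event.delivery_option)


event = FatOrderAccepted(
    order_id="order-1",
    total_cents=1999,
    card_number="4111-0000-0000-0000",
    billing_address="Billing Street 1",
    recipient_name="Example Person",
    shipping_address="Shipping Street 2",
    phone_number="555-0100",
    delivery_option="standard",
)
field_count = len(fields(FatOrderAccepted))
```

Every subscriber depends on the whole contract, so a change to any one field touches the publisher and potentially all subscribers.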
- 40:23 Udi Dahan
- So while we've gone some ways along in the fact that things are more asynchronous, slightly better encapsulated, we still have lots of data in the events, sometimes known as fat events. Now, I'm not going to go fat shaming or anything, but it's that same problem of coupling that we mentioned before. The more data you're sharing, the more logical coupling you have. The more logical coupling you have, the more difficult it's going to be to handle changes when the business says, "Hey, we want to do a new thing now. It's called e-books. You might have heard of them. So I want to sell e-books on the website." You're like, "Okay. Cool. What's the shipping address for an e-book?" The answer is, "Good question. How exactly technically do we ship e-books?" But it's probably not to a person's physical address anymore, right?
- 41:18 Udi Dahan
- So when the business says, "I want to sell and ship e-books," then now we need to make a change to the sales code that is collecting up the data. It needs to be aware there are different types of products. Some of them are shipped physically, some of them are shipped digitally. The ones that are shipped physically need a shipping address that is a physical shipping address. The ones that are digital need a digital Kindle ID, whatever thing for shipping the thing to the person's Kindle device, or it could be to their email address, or what have you. So that's our problem. As we mentioned before, the business wants to do changes and once again, we need to change things in multiple places, potentially breaking things that we didn't want to break. So that's our problem. Again, we've done event-driven architecture originally, we've added vertical slices, we've decomposed things, we've gotten them small. But we still have a bunch of data that is flowing between our services via events, and that coupling is what hurts us. So how do we solve that? Well, time for that one weird trick.
- 42:29 Udi Dahan
- It is so small and trivial, you're just going to be saying, "Is that it? I can't believe that nobody told me this thing before." It's so easy. And you guys are sitting there saying, "Just shut up and tell us," right? So the one weird trick is: generate the IDs upfront at the beginning of the workflow. At step minus one of the workflow. So when the user's going to check out, they click a button that says proceed to checkout, and that's when we generate the ID, instead of waiting until the end where the user has completed their checkout process, okay? So they've gone through everything, they've completed the checkout process, we're now in sales. We persist the order into the database with all of its data, and then the database gives us an identifier and we say, "Ah, yes, now I have an order ID." Getting the identifier only at the end of your workflow is what's causing the problem.
- 43:32 Udi Dahan
- If you were to generate the ID upfront, you can do this by generating a GUID, universally unique ID. You can do that on the browser, you can do that on the web server, but the more important thing is that you do that right at the beginning of the workflow. Why? Here's how everything twists once you've got an ID. We mentioned before vertical slices. We've got this type of composite workflow where each service could have its own micro frontend. So shipping is responsible for the bit of the, "Well, what shipping addresses do we have on file?" When the user says, "I want to ship to this address," they click a button in there, it's not the sales service that holds onto that. The shipping service says, "Oh, that's the shipping address that you want for this order ID. The order ID that we have in session state from before. I can hold that inside my service indexed by that order ID."
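A minimal sketch of generating the ID at "step minus one", assuming the session is a plain dictionary and the shipping service keeps an in-memory store. The names `begin_checkout`, `ShippingService`, and `choose_address` are hypothetical; `uuid.uuid4()` is the standard-library call for a random UUID.

```python
import uuid


def begin_checkout(session):
    """Create the order ID before any checkout data has been collected."""
    session["order_id"] = str(uuid.uuid4())
    return session["order_id"]


class ShippingService:
    def __init__(self):
        self._address_by_order = {}  # shipping's own store

    def choose_address(self, session, address):
        # Shipping keeps the address itself, indexed by the order ID
        # that is already in session state.
        self._address_by_order[session["order_id"]] = address

    def address_for(self, order_id):
        return self._address_by_order[order_id]


session = {}
order_id = begin_checkout(session)

shipping = ShippingService()
shipping.choose_address(session, {"city": "Oslo", "street": "Example gate 1"})
```

Because the ID is generated by the application rather than handed out by a database on insert, it exists before any service has persisted anything.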
- 44:42 Udi Dahan
- When we get to the next step of the workflow and we need to choose the payment information, again, instead of sales collecting that information up into one big batch, the billing service with its micro frontend says, "I'm going to take that. It's my data. I'm the one that has to use it later on. I'm the one that's going to have to validate it. I'm going to be the one that needs to charge the credit card. So because I've got the order ID already there in session, I can hold onto my data indexed by that order ID already at that point in time." And the same thing with the shipping details bit, sorry, the delivery options later on.
- 45:23 Udi Dahan
- Essentially, all of the pieces by having the micro frontend and that micro backend and the ID, each of them can collect and store up their own data. Such that when the order is placed, so the user clicks that button and says, "I'd like to place my order," place order just needs to pass along that order ID. Order accepted, as an event, only needs to pass along that ID because when billing gets the order accepted event with the ID, it can look up all of the previous information that it stored indexed by that very same order ID. And same thing with shipping. When shipping gets its event, it says, "Oh, I need to ship. I need to know what the shipping address is. I need to know what the delivery options are." Again, it has previously persisted that information based on that same order ID. It can look it up, and now, the order accepted event, the order billed event, just need to have IDs.
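The flow above can be sketched roughly as follows, assuming in-memory stores and a trivial synchronous `publish` function in place of a real message broker. All class, method, and field names are hypothetical, and the sketch leaves out amounts and validation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OrderAccepted:
    order_id: str  # the event carries nothing else


class BillingService:
    def __init__(self):
        self._card_by_order = {}  # billing's own store
        self.charged = []

    def choose_payment(self, order_id, card_token):
        self._card_by_order[order_id] = card_token

    def handle(self, event):
        # Look up what billing stored earlier under the same order ID.
        self.charged.append((event.order_id, self._card_by_order[event.order_id]))


class ShippingService:
    def __init__(self):
        self._address_by_order = {}  # shipping's own store
        self.shipped = []

    def choose_address(self, order_id, address):
        self._address_by_order[order_id] = address

    def handle(self, event):
        self.shipped.append((event.order_id, self._address_by_order[event.order_id]))


def publish(event, subscribers):
    """Stand-in for a broker: deliver the event to every subscriber."""
    for subscriber in subscribers:
        subscriber.handle(event)


billing, shipping = BillingService(), ShippingService()
billing.choose_payment("order-1", "card-token-abc")
shipping.choose_address("order-1", "Example gate 1, Oslo")

publish(OrderAccepted(order_id="order-1"), [billing, shipping])
```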
- 46:30 Udi Dahan
- So we've done that thing that I told you at the beginning that we shouldn't do, where I said instead of passing all of the data via the invocation, put it in a database. But what we did here is it wasn't a shared database. That's the thing that we changed. By having each service be its own vertical slice, each of them with their own database, and by generating the ID upfront, now each of them can collect that up, store their data, look it up afterwards. And when the business says, "Hey, I want to do e-books," in order to do e-books, we're saying, "Oh, I need to change the shipping code." The code that says, "I need to ask for a shipping address," it's the same code that says, "I know I don't need to ask for a shipping address because this is an e-book. I know I don't need to ask you for delivery options because this is an e-book." All of that remains fully encapsulated inside the shipping service.
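A small sketch of how the e-book change could stay inside shipping, assuming shipping keeps its own list of which product IDs are delivered digitally. How shipping learns that is not covered in the talk; the `digital_product_ids` parameter, the question names, and the `place_order` function are hypothetical.

```python
class ShippingService:
    def __init__(self, digital_product_ids):
        # Assumption: shipping owns the knowledge of how a product is delivered.
        self._digital = set(digital_product_ids)
        self._answers_by_order = {}

    def questions_for(self, product_id):
        """Decide what the shipping micro frontend needs to ask."""
        if product_id in self._digital:
            return ["delivery_email"]
        return ["shipping_address", "delivery_option"]

    def save_answers(self, order_id, answers):
        self._answers_by_order[order_id] = dict(answers)

    def answers_for(self, order_id):
        return self._answers_by_order[order_id]


def place_order(order_id):
    """Sales: unchanged by e-books, it only passes the order ID along."""
    return {"event": "OrderAccepted", "order_id": order_id}


shipping = ShippingService(digital_product_ids={"ebook-1"})
ebook_questions = shipping.questions_for("ebook-1")
paper_questions = shipping.questions_for("book-42")

shipping.save_answers("order-1", {"delivery_email": "reader@example.com"})
event = place_order("order-1")
```

Adding another delivery type would change `questions_for` and nothing in `place_order`.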
- 47:34 Udi Dahan
- So it changes its bits of micro frontend and logic on the backend, but most importantly, sales didn't need to change anything. Sales remains totally ignorant of whether we're selling e-books or regular books. I don't care. An item is an item. It has a price. We do what we need to do. So one more example, similar but different, booking a hotel room. My guess is you've all been through a process like this, whether it's on Booking.com, Marriott, or whatever hotel website you've used. The process usually goes something like this. A person searches for availability. They put in the dates of their stay, their trip, click search, or view rates in this case, and the system comes and shows them, "Oh, these are the types of rooms that are available for you during those dates. Here's their information. Here are their prices."
- 48:33 Udi Dahan
- Again, this is a type of grid as it were. You've got a list of items and information that technically we'd say, "Well, the price of the product and the image of the product, the name of the product, those should probably come from different services." So how to generate this grid, already gave you the reference for that, but there's an extra interesting piece that we're going to get to. But as a part of your checkout/make-a-booking process, then you need to go to the next step. You need to fill in your guest information. You need to fill in your credit card information where, again, intuitively we'd say guest information should probably be in its own service, separate from the credit card information a lot of times because we've got business travelers. Business travelers, they're staying at the hotel, but the company is paying for their stay.
- 49:22 Udi Dahan
- And also all of that information is logically decoupled from the information about the types of the room and their prices and their availability. So again, if we were to do a traditional style of making a booking, a traditional style of workflow where, as a part of this process, the user is filling in information, and next, next, nexting, and we're building up a big, gigantic booking object, which then we're going to be sending to a backend. Anytime there's going to be a change, well, that event will end up having to change as well, and we end up getting a lot, a lot of data in it. Big events also make the broker slow, so it hurts our performance.
- 50:07 Udi Dahan
- Now, once again, that trick. Generate the ID upfront. Now, when I say generate the ID upfront, the reason that I use this example is sometimes people are like, it can't be that upfront. Like the person comes to the website and the very first click where they're saying, "These are my dates. Search." As developers, we view that, say, "But that's just a search. A search shouldn't really be a beginning of a workflow. Maybe on the next screen when they're actually making the booking, but at the time that they're actually doing the search, why then?" Well, I'm going to get back to that, but I want to plant that idea in your mind, say, "How upfront could upfront be?"
- 50:56 Udi Dahan
- So the next bits of the make a booking, yeah, those are the easy parts. Say, "All right. We're going to have a guest microservice where it has a guest micro frontend that is collecting up its data, and then we'll have the payment vertical slice with the payment micro frontend and the payment backend." Note that in this case, we have a single book now button at the end. This is a different composite. This is a composite form. Essentially, it's two forms pretending to be one. So each bit of those microservices, each one of those micro frontends is a kind of micro form unto itself. When you click the book now button, essentially each one of them gets an independent callback.
- 51:45 Udi Dahan
- So the yellow vertical slice says, "Oh, you want to book now. Great, I'll go persist the guest information." The blue microservice says, "Oh, you want to book now. Okay. Great. I'll go persist the payment information." So we can do that decomposition on form submission as well and have a nice clear separation over there so each of them can collect their own data, which is good. That's what we were talking about before, but why did we need to generate the ID already at the time of the search? Well, there's this other business requirement that oftentimes you won't hear about until much later in a system, the prices.
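The composite form can be sketched like this, assuming the single button simply notifies every registered slice and each slice persists only its own fields under the booking ID. The `BookNowButton` class and the slice classes are hypothetical and stand in for whatever frontend mechanism is actually used.

```python
class BookNowButton:
    """One visible button; each slice registers its own callback."""

    def __init__(self):
        self._callbacks = []

    def subscribe(self, callback):
        self._callbacks.append(callback)

    def click(self, booking_id):
        for callback in self._callbacks:
            callback(booking_id)


class GuestSlice:
    def __init__(self):
        self.form = {}   # what the guest micro frontend has collected
        self.saved = {}  # guest slice's own store

    def on_book_now(self, booking_id):
        self.saved[booking_id] = dict(self.form)


class PaymentSlice:
    def __init__(self):
        self.form = {}
        self.saved = {}  # payment slice's own store

    def on_book_now(self, booking_id):
        self.saved[booking_id] = dict(self.form)


guest, payment = GuestSlice(), PaymentSlice()
button = BookNowButton()
button.subscribe(guest.on_book_now)
button.subscribe(payment.on_book_now)

guest.form["name"] = "Example Guest"
payment.form["card_token"] = "card-token-abc"
button.click("booking-7")
```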
- 52:30 Udi Dahan
- The prices that are shown when you do a search have to be respected when the person actually makes their booking and also checks in and checks out of the hotel weeks and months later. Now, clearly, the price of those rooms is going to change a thousand times from when you make the booking to when you actually show up to when you actually check out. But you'd be pretty upset if, at the time that you're making a booking, they're saying, "Oh, it's $100 a night." And you're like, "Oh, great. I'll book that." And then when it comes time to check in, they're like, "Just a friendly reminder, it now costs $400 a night." And you're like, "Wait, what?" They're like, "Well, yeah, the price changed." You're like, "I don't think you can do that." "Well, the price in our database has changed," right? Saying, "Well, no, there are actually legal requirements that state when you're showing a price, it actually needs to be respected over the long term."
- 53:38 Udi Dahan
- So essentially, what we need to do, we need to snapshot the prices as well. Once we show the price to the user, that is now what's known as advertising. Advertising means you can't do false advertising. You cannot present one price and then charge a different price later on. So when they're doing the view rates, that's the time when you're going to the database to get the prices. By having their booking ID already at the time that they clicked view rates, you can then snapshot the prices connecting them to that booking ID so that when they make their booking, when they check in, when they check out, you have the correct price that was shown to them at that time. Again, if you don't do this, your company may get sued for false advertising. That's not a nice thing to have happen to you, all right?
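A minimal sketch of snapshotting prices against the booking ID at the moment rates are shown, assuming an in-memory pricing service. The `PricingService` class and its methods are hypothetical; the $100 and $400 figures mirror the example in the talk.

```python
import uuid


class PricingService:
    def __init__(self, current_rates):
        self._current = dict(current_rates)  # live prices, free to change
        self._snapshots = {}                 # prices as shown, per booking ID

    def view_rates(self, booking_id):
        # Copy the live prices so later changes do not affect this booking.
        snapshot = dict(self._current)
        self._snapshots[booking_id] = snapshot
        return snapshot

    def change_rate(self, room_type, new_rate):
        self._current[room_type] = new_rate

    def price_for(self, booking_id, room_type):
        """The price to charge is the one that was shown for this booking."""
        return self._snapshots[booking_id][room_type]


pricing = PricingService({"standard": 100, "suite": 250})

booking_id = str(uuid.uuid4())           # generated when the user clicks view rates
shown = pricing.view_rates(booking_id)

pricing.change_rate("standard", 400)     # the live price moves later on
charged = pricing.price_for(booking_id, "standard")
```

A user who searches again under a new booking ID would see the new rate, while the earlier booking keeps the rate it was shown.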
- 54:33 Udi Dahan
- Now, I know with all of these things, you're sitting there, you're like, "Yes. That makes sense, but I have questions. So many questions." It's like, "Yes, you have solved all of the problems that you have said that you've solved, but how do we search across all of this data?" It's like, "Okay. You've put the prices over here and the room types over there and the guest information over here, and then that thing over there. It's all nicely decoupled. You're not sharing it via events. Neat, groovy, but users want to search, and they want to be able to search across this thing, and that thing, and that thing, and that thing." We see that also in Amazon. I want to search based on price. I want to search based on the color of the item. I want to search based on the brand of the item. I want to buy a new phone. Am I looking for an Apple iPhone or am I looking for a Samsung Galaxy? Am I looking for sub-$1,000 or sub-$2,000? Am I looking for gray? Am I looking for black?
- 55:30 Udi Dahan
- We need to support search across all of these services. Great question. I don't have the time to get into it right now. There's another one you're saying, "What about calculations? Again, we might need to do calculations across all of these services. How do we do that?" Again, I apologize for all of the moving graphics. There is a way of doing that, but it also is a slightly different, more advanced technique. I got you covered for that as well. There's this other talk that I've given a number of years ago at this very same NDC talking about rules engines. So the engine pattern is actually a fairly old pattern in software. You might remember back in the day, those things were called search engines, right? We didn't call them search services or search AIs. They were called search engines. The reason that the term engine was used is because it was a software pattern.
- 56:30 Udi Dahan
- So engines are a way to compose multiple microservices, vertical slices, into a given search process to give those types of results across services. Same things with things like pricing or discounts. Before we got all services crazy as an industry, we used to call those things, "We have a pricing engine," right? So engines were used in a bunch of different ways. Fraud as well. We'd have a fraud engine for solving these things. So there's this engine pattern. I've given a talk about it. That link is available online. Hopefully, you'll be able to find more information about that.
- 57:11 Udi Dahan
- Again, I know you got questions. There's lots of information, but if you can take one thing away from this course, it's that one weird trick, generate those IDs upfront. Once you generate the ID upfront, it enables you to do that vertical slice architecture and event-driven architecture that extra little bit better. Your data ends up being that extra bit encapsulated. Your events end up being that much slimmer. The performance of your message broker, everything gets so much faster as a result of that.
- 57:44 Udi Dahan
- Now, if you have more questions, come see me at the booth. I'll be heading over there in just two minutes. Or for those of you that want the full comprehensive A through Z with all of the questions and answers, I cover this stuff as a part of my five-day course. That's at the link over there, particular.net/adsd. The next one is in London in September. Hope to see some of you there, it isn't sold out yet. And thank you all for being such a wonderful audience. I hope you've enjoyed our time together, and I'll see you at the booth in just a few minutes. Thank you.