All our aggregates are wrong

00:00:01 David Boike

Hello again, everyone. And thanks for joining us for another Particular live webinar. This is David Boike. Today I'm joined by my colleague and solution architect, Mauro Servienti, who is going to talk about how to take requirements apart to build the right software.

But before we begin, please use the Q&A feature to ask any questions you may have during today's live webinar, and we'll be sure to address them at the end of the presentation. We will follow up offline to answer all the questions we won't be able to answer during this live webinar. We are recording the webinar and everyone will receive a link to the recording via email. Okay. Let's talk about why our aggregates are all wrong, Mauro. Welcome.

00:00:39 Mauro Servienti

Thanks, David. And good day, everyone. So welcome to all our aggregates are wrong. I'm a remote worker, as David is and everyone in Particular Software is. We are all remote. And every one of us works from home, or like in my case, given that the flat we're living in with my family is too small, I rented a small office that is something like a kilometer and a few meters away from where we live.

I usually bike ride to the office, and my favorite snacks for coffee breaks are bananas. And if you try to picture that, imagine yourself bike riding with a banana in the backpack, that's not going to end well. So after it ended very badly for my backpack a couple of times, I decided to try to fix the thing and headed to my favorite online website and tried to look for a solution. And here is what I found, a banana protector.

So I stopped for a while and realized that there's a banana protector with 26 customer reviews, really? And by the way, when I received it, there were instructions as well inside the box. But that's another story. So I selected the banana protector I was looking for. And then I put that into the shopping cart. And as I transitioned to the shopping cart, the architect that is constantly working in the back of my mind all of a sudden stopped everything and asked this question, doesn't it look like a very nice aggregate?

So it's interesting. Because if we look at the shopping cart, we can identify a set of data that clearly, or at least, seems to be clearly belonging to the shopping cart. For example, the item ID that on the page is not displayed, but you can imagine that every single item in the shopping cart has an identifier, the price, the quantity, inventory, name, and description.

And again, talking about aggregates, it's interesting because the shopping cart even has behaviors. So we can, for example, delete an item from the shopping cart, save it for later, whatever it means in the context of this kind of shopping cart, and change the quantity. So it seems that we are looking at the perfect aggregate from the data and behaviors definition.

But let's try to look at a few business cases and try to understand how they affect the design we just talked about. So let's say that the business comes along and says, look, we want to be sure that while an item is in the shopping cart, the item price doesn't change over time, even if we change the price of that item. So let's say that I put this banana protector in the shopping cart a week ago. And two days later, the price changed from $9.99 to $11, whatever.

And since I put that in the shopping cart before the price changed, the price in the shopping cart shouldn't change. It should remain the same as the time I put the item in the shopping cart. That's a business requirement. So it sounds like a straightforward one. So price is probably owned in some way by sales. And what we can do is that whenever we put the item in the shopping cart, we can copy the price over there. And it's a regular denormalization kind of approach.
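
The price copy described here can be sketched in a few lines. This is a minimal illustration in Python, not code from the talk; `sales_prices`, `CartItem`, and `ShoppingCart` are hypothetical names, and an in-memory dictionary stands in for whatever sales really uses to store prices.

```python
from dataclasses import dataclass, field
from decimal import Decimal

# Hypothetical in-memory stand-in for the price list owned by sales.
sales_prices = {"banana-protector": Decimal("9.99")}


@dataclass
class CartItem:
    item_id: str
    quantity: int
    price: Decimal  # copied from sales at the moment the item is added


@dataclass
class ShoppingCart:
    items: list = field(default_factory=list)

    def add(self, item_id: str, quantity: int) -> None:
        # Denormalize: snapshot the current sales price into the cart.
        self.items.append(CartItem(item_id, quantity, sales_prices[item_id]))


cart = ShoppingCart()
cart.add("banana-protector", 1)

# Two days later sales changes the price; the cart keeps its own copy.
sales_prices["banana-protector"] = Decimal("11.00")
```

After the price change, `cart.items[0].price` is still the value that was copied when the item was added.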

So we're copying the price. So if sales now changes the price of the item identified by the item ID, the price won't change in the shopping cart. Okay. Let's look at another requirement. So the business comes again and says, look, for psychological reasons, we want to tell users how many items are left in stock for that specific item in their shopping cart. So that if we are low on stock, if the inventory quantity is low, it should convince them to buy it, instead of leaving it in the shopping cart for too long. And it's not a big deal.

So we can basically do the same thing. Probably the inventory is owned by the warehouse. And when we put the item in the shopping cart, we can do the exact same thing. So we can copy the inventory to the shopping cart. However, availability changes over time. So whenever the availability changes in warehouse, we now need to go to the shopping cart and say, hey, do you have any shopping cart in your world where there is this item ID? And if yes, can you please update the inventory of them to the new value? And it goes on and on and on.
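
To make the cost of this visible, here is a small sketch of what warehouse ends up doing on every inventory change. It is an assumption-laden illustration in Python: `carts` and `on_inventory_changed` are hypothetical names, and a dictionary stands in for the shopping cart storage.

```python
# Hypothetical carts, each mapping item_id -> the data copied into the cart.
carts = {
    "cart-1": {"banana-protector": {"quantity": 1, "inventory": 11}},
    "cart-2": {"bike-helmet": {"quantity": 1, "inventory": 40}},
    "cart-3": {"banana-protector": {"quantity": 3, "inventory": 11}},
}


def on_inventory_changed(item_id: str, new_inventory: int) -> int:
    """Warehouse pushes the new value into every cart that holds the item."""
    touched = 0
    for lines in carts.values():
        if item_id in lines:
            lines[item_id]["inventory"] = new_inventory
            touched += 1
    return touched


# One unit is sold: every cart containing the item has to be rewritten.
updated = on_inventory_changed("banana-protector", 10)
```

The amount of work grows with the number of carts that contain the item, and warehouse has to know how the shopping cart stores its data.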

So basically every single time inventory in warehouse changes, we need to update all the items in all the shopping carts if they match what just changed. And a third business requirement could be something like, you know what? Due to rules, laws, regulatory purposes, different reasons, if the price of an item in a shopping cart changes, we are not allowed to leave it in the shopping cart. So for example, it works like this in Italy and in the US as well, I guess, because this screenshot comes from the US website.

We're not allowed to leave the item in the shopping cart because we cannot allow users to buy something at a price different from the one it had when it was put in the shopping cart. So we must remove the item from the shopping cart and warn users that the price of an item that was in the shopping cart changed and we removed it, and possibly moved it to a section called saved for later. So it's still on the same page as the shopping cart, but it's not currently in the cart.

At which point we can start smelling that something is not really behaving correctly. Because the information that we have in the shopping cart is not enough anymore to satisfy these new business requirements. So we need to change the shopping cart to track current price and last price, instead of just one single price. And now sales needs to do something very similar to what warehouse is doing.

So whenever the price changes in sales, we need to look for items in the shopping carts. If there's any item in a shopping cart for which the price changed, we swap prices, moving the current price to the last price and setting the new price as the current price, and finally we move the item to saved for later. And these couple of actions will go on and on and on, basically forever.
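
The price swap and the move to saved for later could look roughly like this. It is a minimal sketch in Python under stated assumptions: `CartLine`, `Cart`, and `on_price_changed` are hypothetical names, and the carts live in memory.

```python
from dataclasses import dataclass, field
from decimal import Decimal
from typing import List, Optional


@dataclass
class CartLine:
    item_id: str
    current_price: Decimal
    last_price: Optional[Decimal] = None


@dataclass
class Cart:
    lines: List[CartLine] = field(default_factory=list)
    saved_for_later: List[CartLine] = field(default_factory=list)


def on_price_changed(carts, item_id, new_price):
    """Sales reaches into every cart that holds the repriced item."""
    for cart in carts:
        affected = [line for line in cart.lines if line.item_id == item_id]
        for line in affected:
            line.last_price = line.current_price  # keep the old price
            line.current_price = new_price        # record the new one
        # Move the repriced lines out of the cart and into "saved for later".
        cart.lines = [line for line in cart.lines if line.item_id != item_id]
        cart.saved_for_later.extend(affected)


cart = Cart(lines=[CartLine("banana-protector", Decimal("9.99")),
                   CartLine("bike-helmet", Decimal("40.00"))])
on_price_changed([cart], "banana-protector", Decimal("11.00"))
```

Note that the function belongs to sales but manipulates the internals of the cart, which is exactly the kind of interaction the talk goes on to question.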

As long as there are items in shopping carts, we will go on trying to change elements in the shopping cart from sales and trying to change elements in the shopping cart from warehouse. And the big picture is even messier. So if we look at all the other data in the shopping cart, we probably realize that we have something that we can call marketing that takes care of the name and the description. And we copy that information there.

And very similarly, we have shipping that has a fairly similar behavior to warehouse. So shipping is responsible to provide, for example, estimated delivery times, and based on the current load, the shipping might want to change information related to items in the shopping cart over time. And so they need to query the shopping cart. And if there's something in the shopping cart they care about, update delivery estimates in the shopping cart.

And if you look at this image, with all these arrows going back and forth, and probably the one with the red border attracting your attention, it's probably becoming clear that something is not going as we expect. Because what we are basically losing here is autonomy. So as soon as we have all these interactions between things that should be clearly isolated and autonomous, autonomy is gone.

So basically now, whenever sales needs to change something, they need to talk to the shopping cart. And we are even violating a sort of single responsibility principle. Because now we have sales telling the shopping cart, "Do this for me," instead of the shopping cart being responsible for that action based on something that happens. So a question we can ask ourselves is, can we get rid of all this coupling? Because that's basically coupling.

So all these arrows going from left to right and right to left are telling us that there's a lot of coupling going on between all these, let's call them components for now. And it doesn't really matter if they are talking to each other through messages over a queue, or they're using HTTP, or they're talking to a shared database. Still coupling it is.

So the medium we are using to allow them to talk to each other doesn't change the fact that they are coupled and probably also temporally coupled. Because whenever the price in sales changes, we want to update the price in the shopping cart as fast as we can. So there's a sort of temporal coupling between them. The reality is that if we stop for a second and go back to the drawing board, we might realize that the shopping cart doesn't exist.

So we can do something that we can call decomposition or model decomposition, and try to say, do we really need a shopping cart in this picture? Or can we decompose the shopping cart and do something like this? That is, can we remove stuff that is owned by sales and put that into sales? So can we remove current price and quantity because sales needs quantity in order to be able to calculate the total value of the shopping cart? And move that to sales, because that's where things belong.

And then can we do the same for warehouse? And say, well, in warehouse, we might need the item ID, the inventory and the quantity. Because based on the quantity, warehouse can, for example, forecast how many items would be shipped or delivered or removed from the inventory in the next few days, and probably lie to users on the website saying there are just 11 items available for this item, even if that's not true.

But based on the items currently in shopping carts, they can forecast that quantity. And the same for shipping. We can move item ID, quantity again, and the delivery estimate calculation there. And quantity is important for shipping as well. Because if you're buying one banana protector that's a thing to ship, it can be shipped probably in an envelope, using the simplest shipping method ever.

Or if we are buying thousands of them because we are resellers, then we need something completely different. And then delivery estimates will probably change drastically. If we look at marketing, we don't need to move anything there. To some extent, we can say that marketing is very, very stable. The only good reason for a name or description to change over time is to fix typos.

It's not that we are changing the name of a banana protector into, I don't know, glasses, because we are reusing an item ID. So there's really no point in denormalizing that kind of information to cover the case where this data changes, because that's not going to happen. If you look at the big picture now, do we still need the shopping cart at all?

I mean, what we have in the shopping cart now are just the cart identifier and the list of item IDs. And can we move that responsibility to sales, for example? So what if we move the cart identifier to sales, and remove the shopping cart concept from the system entirely? Why move it into sales? This is an interesting scenario.

So whenever we look at a business scenario where there are multiple components, or maybe we can start calling them services or microservices or aggregates, if you will, involved in defining that business scenario, then from the logical perspective there will be one of these components that is responsible for that concept, or we can say that owns the concept.

So is the fact that there is something in the shopping cart owned, logically, from the business perspective, by warehouse, shipping, marketing, or sales? Probably the answer is sales. So it makes sense to have a kind of virtual master-details relationship, where sales is the master. If we agree with this picture, it's interesting now to realize that the act of copying data over to the shopping cart still exists.

So whenever an item is put in the shopping cart, sales will copy the price to the current price and move the old price to last price. Whenever the warehouse does that, it copies the inventory to the inventory value in the shopping cart. And shipping does the same. What's the difference from before? That we just resolved the coupling. So now all the interactions happen within the boundaries of a single service, aggregate, or microservice, if you will.

So within the boundaries of what we can now start calling an autonomous component. So these components are really responsible for one single thing. The shopping cart component living in sales is responsible just for prices and probably for calculating the total amount of the cart. The component handling the shopping cart part that lives in warehouse is responsible for inventory only, and the one in shipping for delivery estimates and nothing else.
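
One way to picture the decomposed cart is three small components that share only the cart and item identifiers. The sketch below is in Python and is not taken from the talk: the class names, the in-memory storage, the stock figure, and especially the delivery rule in `ShippingCart` are made up for illustration.

```python
from decimal import Decimal


class SalesCart:
    """Sales owns prices and the cart total."""

    def __init__(self):
        self._lines = {}  # cart_id -> {item_id: (unit_price, quantity)}

    def add(self, cart_id, item_id, unit_price, quantity):
        self._lines.setdefault(cart_id, {})[item_id] = (unit_price, quantity)

    def total(self, cart_id):
        lines = self._lines.get(cart_id, {}).values()
        return sum((price * qty for price, qty in lines), Decimal("0"))


class WarehouseCart:
    """Warehouse owns inventory and can forecast demand from carts."""

    def __init__(self, inventory):
        self._inventory = dict(inventory)  # item_id -> units in stock
        self._lines = {}                   # cart_id -> {item_id: quantity}

    def add(self, cart_id, item_id, quantity):
        self._lines.setdefault(cart_id, {})[item_id] = quantity

    def forecast_left(self, item_id):
        in_carts = sum(l.get(item_id, 0) for l in self._lines.values())
        return self._inventory[item_id] - in_carts


class ShippingCart:
    """Shipping owns delivery estimates."""

    def __init__(self):
        self._lines = {}  # cart_id -> {item_id: quantity}

    def add(self, cart_id, item_id, quantity):
        self._lines.setdefault(cart_id, {})[item_id] = quantity

    def delivery_estimate_days(self, cart_id, item_id):
        # Made-up rule: small orders ship fast, bulk orders take longer.
        return 2 if self._lines[cart_id][item_id] < 100 else 10


sales = SalesCart()
warehouse = WarehouseCart({"banana-protector": 11})
shipping = ShippingCart()

cart_id, item_id = "cart-42", "banana-protector"
sales.add(cart_id, item_id, Decimal("9.99"), 2)
warehouse.add(cart_id, item_id, 2)
shipping.add(cart_id, item_id, 2)
```

No component reads or writes another component's data; the only things they have in common are `cart_id` and `item_id`.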

So we can transition from this kind of image, where we have interactions between components, to vertical slices. So if these components are autonomous, we can basically say every single service now lives in a vertical slice where it deals with behavior and data, and it fully owns the behaviors and data belonging to that slice. There is no need anymore for interactions between these components.

As we saw before, when we had the shopping cart, it was acting like a shared resource across multiple components. So what vertical slices bring to the table is basically autonomy. And that's probably the most important thing that we need to care about when designing a complex business system. So think about, for example, sales. Let's say sales now wants to evolve by, for example, introducing Bitcoin payments. Bitcoins are everywhere or are supposed to be everywhere nowadays.

So if the business comes along and says, "Look, folks, we want to allow our customers to pay in Bitcoins," there is no need to change the shopping cart anymore. This kind of change affects sales only. No one else in this very small system is affected by this kind of change. Warehouse doesn't care about the currency we're using, and neither do marketing or shipping.

The same happens, for example, if we want to introduce digital downloads. Let's say that starting tomorrow, we're going to sell, other than books and banana protectors, MP3s or eBooks. In that case, shipping doesn't make sense anymore. So if the business comes along and says, "Hey, starting tomorrow, we're going to sell stuff that can be digitally downloaded," we just need to change shipping to basically ignore orders or shopping cart items that are digitally downloadable.

And again, warehouse doesn't care at all. And marketing neither. And sales probably not at all. What's the problem now? And I already see a couple of questions that I might answer in the next few minutes. So it's obvious that as soon as we remove the shopping cart that sits in the middle and we chop it, moving information into their own components or services, the information structure doesn't fit the users anymore.

So the problem is obviously that if we have information living in different systems or different databases or even different tables that are not linked together, with no foreign key linking them, we cannot expect users to go from page to page in order to understand what's in the cart or what a product looks like.

So we should try now to find a solution to a problem we just created ourselves. So the problem we solved before was we wanted to get rid of all the coupling in the system by assigning responsibilities, basically following the single responsibility principle to services or autonomous components or aggregates. They're interchangeable at this point, this terminology.

So we achieved that by making sure that data are owned by the right component. Obviously we're now in this new scenario where we have data in the right place. So we have autonomous components. But we have now, in some way, broken the UI. So the first question is, can Read Models be a solution? Because one thing that we could do is basically say, let's say that we have our five lovely services plus whatever we want. So we might have many, many services involved at this point.

One thing that we could do is basically push information from these services into a sort of let's call it a ViewModel storage or Read Models storage. Let's imagine, for example, Elasticsearch. Lots of people use Elasticsearch to achieve this kind of solution. And then have the shopping cart built into this ViewModel storage, or have the product page built into this ViewModel storage.

So whenever something changes in sales, sales takes care of updating the ViewModel storage. Whenever something changes in marketing, marketing will update the ViewModel storage, and so on and so on and so on. Unfortunately, as soon as we introduce something like this that is clearly outside the boundaries of the services defined up there, we have basically built a cache. That's the sad reality.

So if we give it its proper name, it's not that fancy anymore. So the problem here is that... I'm not against Read Models at all. And to be clear, I'm not against CQRS at all. What I'm against is Read Models outside the boundaries of a service or outside the bounded context. So in the specific sample before, that ViewModel storage is not owned by anyone.

Now, imagine that the Read Model that represents my current shopping cart lives in that storage. And now sales wants to invalidate the price. The only thing they can do is basically delete that monolithic model that represents my shopping cart, which basically forces every other service to be invoked once again because they need to repopulate it.

The problem is that there's no ownership. It's a kind of no man's land, that ViewModel storage, where no one really owns anything. And so it's really hard to say who can change what. And evolution is very, very complex. It's, again, a huge source of coupling. And the problem is that not everything can be cached. So I intentionally picked the screenshot from the US website because most of the items from the US website cannot be shipped to the place where I live, especially things not directly sold by Amazon in this case, but sold in the marketplace.

This information cannot be cached. So it means that there are parts of the page or of the shopping cart that cannot really be cached, unless they are caching every single possible combination of products, shipping destinations, and customers, so that they can precalculate that they cannot ship that specific item to Garlasco. But that's impractical in the end because they have millions of customers and millions of items. So that cache would be super, super big.

It means that this specific information cannot really be cached. And at least that part of the page needs to be evaluated on the fly for every single request. So let's go back for a second to the drawing board. And let's look at what we have when we assigned every single component the responsibility for their own data. So we assigned them data. And the thing that we should notice here is that we have shared identifiers.

So basically the item ID is not going to change. The primary key that identifies a product in our system or an item, whatever item it is, won't change. Because if in a system primary keys are going to change, we have a much bigger problem than information structure for the user. So we can assume that these identifiers won't change ever, structurally won't change ever. So we can use those to build the kind of virtual foreign key across components.

If that's true, we can start talking about ViewModel composition instead of storing the Read Model into a cache. So imagine a client, in this case a browser, that is trying to retrieve a product that is identified by the identifier one. It can be whatever you want. So it can be a global unique identifier or even a surrogate key. It doesn't really matter in this case.

And then we have the four services, plus all the other services that are interested in providing information for that specific product identified by the key 1. If that key 1 is shared across services, what the browser, or the client in this case, can do is basically go to every single service, retrieve all the information, compose this information on the fly into the ViewModel, and return that to the user in front of the screen.

Let's pause here a second and try to dive deeper into what I just said. So the flow would be something like this. There's an incoming HTTP request, /product/1, that goes to something that we can call a composition gateway. If you picture that in your mind as a reverse proxy, that's what it is. That composition gateway does two things. The first one is request matching.

What request matching means is that the composition gateway goes to every service and says, are you interested in handling this incoming request identified by this URL, /product/1? And all the interested services just reply yes. In this case, the four services that we have will reply yes. But there might be just one or two. It doesn't really matter. In this case, there are four.

And then what the composition gateway does is ask the services that replied yes to the previous question to do the composition now. So act on it and go back to your backends or to your databases, retrieve data, and finally compose the ViewModel. And the composed ViewModel is then returned as an HTTP response to the client. So trivially thinking about it, think about two foreach loops, one after the other.

The first foreach will iterate over all the available services, saying, are you interested in this request? The second foreach will iterate over everyone that replied yes or true to the previous one. And will say, hey, compose this ViewModel, where compose this ViewModel means: here is a dictionary that is empty. Put data in it.
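
The two loops can be sketched as follows. This is a simplified illustration in Python rather than the C# of the talk's demo; the handler classes, the `compose` function, and the hard-coded values are hypothetical, and real handlers would fetch data from their own backends.

```python
class MarketingHandler:
    def matches(self, verb, path):
        return verb == "GET" and path.startswith("/product/")

    def handle(self, path, view_model):
        view_model["name"] = "Banana protector"


class SalesHandler:
    def matches(self, verb, path):
        return verb == "GET" and path.startswith("/product/")

    def handle(self, path, view_model):
        view_model["price"] = "9.99"


class OrdersHandler:
    """Not interested in product pages, so the gateway never invokes it."""

    def matches(self, verb, path):
        return path.startswith("/orders/")

    def handle(self, path, view_model):
        view_model["orders"] = []


def compose(handlers, verb, path):
    # First loop: request matching. Who is interested in this request?
    interested = []
    for handler in handlers:
        if handler.matches(verb, path):
            interested.append(handler)

    # Second loop: everyone who said yes fills the initially empty dictionary.
    view_model = {}
    for handler in interested:
        handler.handle(path, view_model)
    return view_model


page = compose([MarketingHandler(), SalesHandler(), OrdersHandler()],
               "GET", "/product/1")
```

Here `page` ends up holding the name from marketing and the price from sales, and neither handler knows the other exists.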

And again, if you think about it, a JSON object works very well in this case, because a JSON object works like a dictionary. It's an empty thing that can be filled on the fly by services. So let's focus on the price. Let's look at this product page. I put a hypothetical product page. And let's try to understand how sales, in this case, can handle and compose the price on such a page.

You can imagine something like this, a class that implements an interface called IHandleRequests. And in this case, we have the class called ProductDetailsGetHandler. The first thing that the interface demands the class to do is what's called request matching. So the interface expects that we implement a method called Matches that returns true or false and receives as input route data and a string, the verb.

I'm using MVC Core as the underlying infrastructure in the demo. And route data is the routing information provided by MVC Core. And the verb is the current HTTP method. The only thing that sales does at matching time is basically check, is it a GET? Yes. If it is a GET, is it for products and does it contain an ID?

So does the template contain an ID? Yes. So I'm interested in that. So this Matches method will return true. And if that returns true, it means that the secondary invocation will come in. And what sales does is basically, again, retrieve from the route data the interesting values, in this case, the ID.

In this sample, using HTTP, sales will go back to its own backend, let's say a web API somewhere, retrieve the information it needs to retrieve, and finally compose the price. The price is composed over that VM object. And if you look at the signature of the method, the VM object is a dynamic object.

So I'm using C# here. And in order to keep this as simple as possible and to reduce the infrastructure required, we're using dynamic to mimic what the JSON object would look like. So basically every single handler is free to put into that dynamic object whatever they want. And that thing will appear in the page just because they are saying, hey, DLR, Dynamic Language Runtime, please make it so that the product price property exists on this dynamic VM instance.
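
The slide with the C# class is not part of this transcript, so the following is only an approximation of its shape, written in Python. The class and method names follow the talk; the route data keys (`controller`, `id`), the `fetch_price` callable standing in for the call to the sales backend, and the use of `SimpleNamespace` in place of a C# `dynamic` object are all assumptions.

```python
from types import SimpleNamespace


class ProductDetailsGetHandler:
    """Sales-owned handler that composes the price onto the product page."""

    def __init__(self, fetch_price):
        # Hypothetical call into the sales backend (web API, database, cache).
        self._fetch_price = fetch_price

    def matches(self, route_data: dict, verb: str) -> bool:
        # Request matching: a GET for products that carries an id.
        return (
            verb.upper() == "GET"
            and route_data.get("controller", "").lower() == "products"
            and "id" in route_data
        )

    def handle(self, route_data: dict, vm) -> None:
        product_id = route_data["id"]
        # Attributes can be added freely, much like on a dynamic object.
        vm.product_price = self._fetch_price(product_id)


prices = {"1": 9.99}  # stand-in for data owned by sales
handler = ProductDetailsGetHandler(fetch_price=prices.__getitem__)

route_data = {"controller": "Products", "id": "1"}
vm = SimpleNamespace()
if handler.matches(route_data, "GET"):
    handler.handle(route_data, vm)
```

After the handler runs, `vm.product_price` exists even though nothing declared it up front, which is the behavior the talk relies on `dynamic` for.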

There's an important bit here. It's important to remember that we are not constrained to use HTTP to issue requests to our own backend. So these handlers are logically owned by services, meaning that this class, even if it's run within the context and deployed within the context of the composition gateway, it's logically owned by sales.

This means that there's nothing wrong in having code here that directly accesses the sales database or the sales cache. Being fully owned by sales, that is perfectly legit. So we have now kind of solved the composition problem. If you think about it, every single service will deploy a class like this one.

Marketing will do for name and description. Warehouse will do that for the inventory for a specific product ID. And shipping will do that for the delivery estimates and sales for prices, as we just saw. So we kind of solved that kind of problem of sending data to the UI, but the shopping cart allowed us to select quantities.

So the question should be, what about writes at this point? So we're now able to send data to the UI in order to display something that doesn't really exist in the backend from the shape of the data that we're displaying. But on a page like the Amazon one, when we choose add to cart, there is a drop down that allows us to select the quantity.

And in our design, the quantity is required by multiple services. So sales, warehouse, and shipping all need the quantity. And this is a very simple case. We might have a much more complex form, where some of the data on the form goes to one service, some other data goes to another service, and some other data goes to a third service.

So it can be much more complex than a very simple piece of data like the quantity. So in this case, we usually talk about ViewModel decomposition. So the opposite of composition. And it works like this. So we still have a browser that acts as the current client. And instead of getting something from the backend, that browser will try to post something to the backend.

So a POST request would look like this: a POST to /shopping-cart/ passing in the item ID and the quantity. And what we can do inside that composition gateway is basically the exact opposite of what we just did. That is, there will be handlers retrieving data from that incoming POST body and talking to their own backends, sending data back.

So everyone is now capable of extracting what they own from the incoming HTTP request, the POST one in this case, and everyone will talk to their own backend sending back data. So we're now able to compose data on the UI. We saw how to decompose data from the UI to the backends. And you are probably wondering, "The UI you just talked about is way simpler than the shopping cart," because the shopping cart is a list of items.
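
Decomposition can be sketched as the mirror image of composition. The Python below is an illustration under stated assumptions, not the talk's code: `Backend`, `AddToCartHandler`, and `decompose` are hypothetical names, the backends only record what they receive, and the body carries just the item ID and quantity mentioned in the talk.

```python
class Backend:
    """Stand-in for a service's own backend; it records what it receives."""

    def __init__(self, name):
        self.name = name
        self.received = []

    def add_to_cart(self, **data):
        self.received.append(data)


class AddToCartHandler:
    """Owned by one service; extracts only the fields that service needs."""

    def __init__(self, backend, fields):
        self.backend = backend
        self.fields = fields

    def matches(self, verb, path):
        return verb == "POST" and path.rstrip("/") == "/shopping-cart"

    def handle(self, body):
        owned = {name: body[name] for name in self.fields}
        self.backend.add_to_cart(**owned)


def decompose(handlers, verb, path, body):
    invoked = 0
    for handler in handlers:
        if handler.matches(verb, path):
            handler.handle(body)
            invoked += 1
    return invoked


backends = {n: Backend(n) for n in ("sales", "warehouse", "shipping")}
handlers = [AddToCartHandler(b, ("item_id", "quantity"))
            for b in backends.values()]

invoked = decompose(handlers, "POST", "/shopping-cart/",
                    {"item_id": "banana-protector", "quantity": 2})
```

With a richer form, each handler would be given a different `fields` tuple, so that every service picks out only the data it owns.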

And I used the product page as the sample. It was kind of intentional to skip that part because there's much more complexity in composing a list. So if we use the approach that I just used to compose a product page to compose a list, it's very easy to fall into the trap of a kind of select N+1 over HTTP.

There will be dozens of requests going from the client to the backend services to retrieve all the items and to compose all of them. List composition is a topic on its own. So if you want to understand more, I wrote 11 or 12 blog posts about it, with samples and demos, and you can head to that link to understand more. And obviously you can raise questions and we'll answer those with no problems.

So we now have data owned by services and we're respecting the single responsibility principle in order to make sure that services are autonomous. We saw how to compose data in order to satisfy the user information structure or, as UX designers like to call it, the user mental model. We understood that we can even decompose data in order to send it back to the services where it is owned.

And if you think about it, we haven't touched the problematic part at all. Because things might go wrong, and might even go very badly. So let's have a look again at this picture. So the communication when posting data from the front-end or from the client to services is a very fragile point. So think about what can happen. Let's imagine those three yellow arrows are HTTP requests.

So there's an incoming POST, HTTP, that goes to the composition gateway. And then there will be three HTTP requests going out in parallel from the composition gateway to sales, shipping, and warehouse. Now, shipping fails, blows up, and we have no way to fix the problem because there are no transactions over HTTP. Luckily, there are no transactions over HTTP.

But the problem is that we might be in a very bad situation that is something like we can talk to shipping, but the response times out. So now sales and warehouse, well, they are autonomous. So they might have no idea that we failed in talking to shipping. But even if they had, they have no way to do anything. Because what we didn't get back is the reply from shipping.

So we are in a situation where we don't know what to do. And we cannot simply say, hey, please roll back. It's even worse if we think that, for example, the entire composition gateway can blow up. So the infrastructure hosting the part that represents the composition gateway can fail in the middle of the request.

So there's an incoming POST. The first HTTP request luckily goes out, but all the others fail because the infrastructure collapses. Now, sales has some information related to the cart, but no one else has. And we have no way to recover from that scenario. So the definitive solution to that is to introduce messaging at this point. And to solve the infrastructure problem, that's the exact topic we're talking about in the next few slides.

So what we have is the incoming HTTP request /products/1. In this case, the POST to shopping cart is the thing we're talking about. Sorry about the typo in this slide. And what we do is that basically we convert the payload of the incoming HTTP request into a message, and we send the message to ourselves.

So now we have immediately handled the incoming HTTP request. And we have a message in the input queue of the composition gateway. When the message is handled, every single handler is invoked for a message in this case, no longer for an incoming HTTP request. And handlers will send out messages to their own backend in order to tell their own backend, hey, add this to the sales shopping cart. Add this to the shipping shopping cart. Add this to the warehouse shopping cart.
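
The flow from HTTP request to messages can be sketched with in-memory queues. This Python illustration is not how a real message broker or NServiceBus is used; `accept_post`, `make_handler`, `process_next`, the message shape, and the 202 status code are all assumptions made for the sake of the example.

```python
import queue
import uuid

gateway_queue = queue.Queue()  # input queue of the composition gateway
backend_queues = {name: queue.Queue()
                  for name in ("sales", "shipping", "warehouse")}


def accept_post(body):
    """Turn the HTTP payload into a message and send it to ourselves."""
    message = {"id": str(uuid.uuid4()), "type": "AddItemToCart",
               "body": dict(body)}
    gateway_queue.put(message)
    return 202, message["id"]  # the HTTP request is done at this point


def make_handler(service):
    """Build the handler a service deploys into the gateway."""
    def handler(message):
        backend_queues[service].put({
            "id": f'{message["id"]}/{service}',
            "type": "AddItemToCart",
            "body": message["body"],
        })
    return handler


def process_next(handlers):
    """Handle one message from the gateway queue by invoking every handler."""
    message = gateway_queue.get_nowait()
    for handler in handlers:
        handler(message)


status, message_id = accept_post({"item_id": "banana-protector",
                                  "quantity": 1})
process_next([make_handler(name) for name in backend_queues])
```

The HTTP side only enqueues; the fan-out to sales, shipping, and warehouse happens later, when the gateway processes its own message.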

Why messages? Because if at this point shipping blows up, what happens is that the incoming message, the original message that we sent to ourselves, will be rolled back and retried. So if shipping failed, the shipping message didn't go anywhere, but maybe the others already went to their own backends.

But the first message, the one we sent to ourselves, is retried. So everyone is invoked once again. But now it's much easier to deduplicate over messages because we can implement something called the outbox pattern. And for example, NServiceBus will do that for us. And make sure that subsequent messages identified by the same message ID will be deduplicated by the receivers and they will be handled just once.
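
The effect of retrying and deduplicating can be shown with a toy simulation. The Python below only illustrates receiver-side deduplication by message ID; it is not a full outbox implementation and it does not use the NServiceBus API. `Receiver`, `handle_add_to_cart`, `process_with_retry`, and the way outgoing message IDs are derived from the incoming one are assumptions.

```python
class Receiver:
    """A backend that handles each message ID at most once."""

    def __init__(self, name):
        self.name = name
        self.seen = set()
        self.cart = []

    def receive(self, message_id, payload):
        if message_id in self.seen:
            return False  # duplicate from a retry, ignore it
        self.seen.add(message_id)
        self.cart.append(payload)
        return True


def handle_add_to_cart(message_id, payload, receivers, fail_on=None):
    for receiver in receivers:
        if receiver.name == fail_on:
            raise RuntimeError(f"{receiver.name} is unavailable")
        # Deterministic ID, so a retry produces the same outgoing ID.
        receiver.receive(f"{message_id}/{receiver.name}", payload)


def process_with_retry(message_id, payload, receivers, failures):
    """Retry the whole incoming message until every handler succeeds."""
    attempts = 0
    while True:
        attempts += 1
        fail_on = failures.pop(0) if failures else None
        try:
            handle_add_to_cart(message_id, payload, receivers, fail_on)
            return attempts
        except RuntimeError:
            continue  # the incoming message goes back to the queue


receivers = [Receiver("sales"), Receiver("shipping"), Receiver("warehouse")]
payload = {"item_id": "banana-protector", "quantity": 1}

# Shipping blows up on the first attempt; the second attempt succeeds.
attempts = process_with_retry("msg-1", payload, receivers,
                              failures=["shipping"])
```

Sales sees its message twice but stores the item once, and shipping and warehouse receive it on the retry.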

So there is no risk of having corrupted information in the backend. Obviously what we're introducing now is eventual consistency. Because now we are using messages in order to talk to the backends. And we have no idea when everything is consistent. That's why, for example, Amazon is using that kind of intermediate page whenever you add something to the shopping cart.

So you add something to the shopping cart, and then you transition to a page that says, "Thank you. We added that to the shopping cart. People that put that thing into the shopping cart also looked at these other things." That captures your attention for enough time to make sure that the backends are now consistent and that, if you go to the shopping cart page, your shopping cart will look just fine.

So the thing that we just did enables us to augment our vertical slices by saying, well, now the vertical slices can own the UI part as well. So the data related to the UI are now owned by the vertical slice. So sales is responsible for how data going to the UI are composed, meaning that sales is even more autonomous now. Because if they need to change, they can influence also how things are displayed on the UI.

However, we're not done yet. If you think about it, services need to communicate with each other in some way. After a while, the business comes along again and comes with two new requirements. One is notify users if their carts are stale for a week in an attempt to convince them to buy. And if after a month nothing happened, just wipe the stale cart. Just two simple requirements.

So we now need some sort of communication channel that allows services to talk to each other in some way. So whenever an item is added to the shopping cart, we said that every service that deals with the shopping cart will receive that information through the composition mechanism. So sales will receive it, warehouse will receive it, and shipping will receive it. Marketing doesn't care, for the shopping cart purpose we're talking about.

Sales, we said, from the logical perspective, is the one owning the business cart. So we can say that the add item to cart action is owned by sales, conceptually. So whenever that happens, sales will set two timeouts, a one week timeout and a one month timeout. What's a timeout? Nothing more than a delayed delivery message. So basically sales will send to a queue a message saying to the infrastructure, hey, please send this message back to me in a week's time and this other one in a month's time.

And it can be in a few seconds, in six months, in 20 years, in five days. It doesn't really matter. It's up to the business requirement to decide how long a delay is. And whenever the first timeout expires, we have a kind of a cart got stale event that sales publishes, and it goes to a bus that is the communication infrastructure. And that information is interesting for marketing.
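A timeout as a delayed delivery message can be sketched like this, assuming an in-memory scheduler and an explicit clock passed in by the caller. The `DelayedDelivery` class and the message names are hypothetical, and a real transport would persist the pending messages.

```python
import heapq
from datetime import datetime, timedelta


class DelayedDelivery:
    """Toy infrastructure that hands messages back once their delay has elapsed."""

    def __init__(self):
        self._pending = []
        self._sequence = 0  # tie-breaker so message dicts are never compared

    def send_later(self, now, delay, message):
        heapq.heappush(self._pending, (now + delay, self._sequence, message))
        self._sequence += 1

    def due(self, now):
        """Return, in order, every message whose delivery time has passed."""
        delivered = []
        while self._pending and self._pending[0][0] <= now:
            delivered.append(heapq.heappop(self._pending)[2])
        return delivered


added_at = datetime(2020, 1, 1)
timeouts = DelayedDelivery()
# Sales sets both timeouts when the item is added to the cart.
timeouts.send_later(added_at, timedelta(weeks=1), {"type": "CartStaleTimeout", "cart_id": 7})
timeouts.send_later(added_at, timedelta(days=30), {"type": "CartExpiredTimeout", "cart_id": 7})

after_three_days = timeouts.due(added_at + timedelta(days=3))
after_eight_days = timeouts.due(added_at + timedelta(days=8))
after_31_days = timeouts.due(added_at + timedelta(days=31))
```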

Marketing had nothing to do with the cart when adding items, but is interested in knowing when a cart got stale. So what marketing does is they notify users through an email. And if you think about it, the email might contain details about the shopping cart. And that's UI composition again.

So the UI composition infrastructure kicks in to retrieve data about the stale shopping cart and provide a nice looking email to the user saying, "Hey, you have these items in the shopping cart. They've been there for a few days. Are you interested in buying them?" And a month later another message arrives, that is the second timeout, and sales publishes another event, the cart expired event, that is now interesting for two services, warehouse and shipping.

They just handle the event and they wipe their view of the shopping cart. So it's obvious that sales wiped its own view of the shopping cart when the timeout expired and published these two events. Since sales is the master, if sales removes the concept of a shopping cart from its own view, even if warehouse and shipping are eventually consistent and they'll handle these two events an hour later, that doesn't really matter.
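The publish and subscribe flow described here can be sketched with an in-memory bus. The `Bus` and `CartView` classes, the event name strings, and the sample cart contents are assumptions made for illustration, not the API of any real messaging library.

```python
from collections import defaultdict


class Bus:
    """In-memory stand-in for the publish/subscribe infrastructure."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, event_name, handler):
        self._subscribers[event_name].append(handler)

    def publish(self, event_name, cart_id):
        # The event carries nothing but the cart identifier.
        for handler in self._subscribers[event_name]:
            handler(cart_id)


class CartView:
    """A service's own view of shopping carts, keyed by cart ID."""

    def __init__(self, carts):
        self.carts = dict(carts)

    def wipe(self, cart_id):
        self.carts.pop(cart_id, None)


bus = Bus()
sales = CartView({7: ["product-1"]})
warehouse = CartView({7: {"product-1": "reserved"}})
shipping = CartView({7: {"product-1": "standard delivery"}})
emails_sent = []  # marketing records which carts it notified users about

bus.subscribe("CartGotStale", emails_sent.append)
bus.subscribe("CartExpired", warehouse.wipe)
bus.subscribe("CartExpired", shipping.wipe)

bus.publish("CartGotStale", 7)  # first timeout expired
sales.wipe(7)                   # second timeout: sales wipes its own view first
bus.publish("CartExpired", 7)   # then tells the others
```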

There's no way for users to retrieve information in warehousing and shipping anymore because the master data aren't there anymore. So eventual consistency at this point is not a problem at all. The interesting thing of identifying the clear boundaries is that events now can be as thin as an event name and a bunch of identifiers.

So there's no need to share data at all between services in order to change the state of the system, because data are already where they need to be. If we are uploading data to the system, through the composition we can already provide each service with the data that it owns. And now they just need to publish events saying order accepted, with the order ID and the list of product IDs.

Shopping cart wiped. Here is the ID of the shopping cart, or product went out of stock, another event. And that event just needs to contain the product ID, and sales can react and remove that from the shopping cart and move it to the save for later section of the shopping cart.
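As a rough sketch, thin events like these can be modeled as immutable records that carry only identifiers. The class and field names below are assumptions chosen for illustration, and the `SalesCart` reaction is a simplified version of the behavior just described.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class OrderAccepted:
    order_id: int
    product_ids: tuple


@dataclass(frozen=True)
class ShoppingCartWiped:
    cart_id: int


@dataclass(frozen=True)
class ProductWentOutOfStock:
    product_id: int


@dataclass
class SalesCart:
    """Sales' own view of one cart: product IDs only."""
    items: list = field(default_factory=list)
    saved_for_later: list = field(default_factory=list)

    def on_out_of_stock(self, event):
        # Sales reacts using data it already owns; the event adds only the ID.
        if event.product_id in self.items:
            self.items.remove(event.product_id)
            self.saved_for_later.append(event.product_id)


cart = SalesCart(items=[1, 2, 3])
cart.on_out_of_stock(ProductWentOutOfStock(product_id=2))
```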

There is no need anymore to share any data using messages or using any kind of infrastructure that can be HTTP request or a shared database across services. So what we basically did is that we reduced the coupling across the services as much as we can. It's not zero because zero coupling is impossible, but it's the very minimum amount of coupling that we needed.

The interesting thing is that now the final big picture looks like this. So we have services owning behavior or behavior and data. They can use a bus to talk to each other. They also own the ViewModel composition that has a foot into the UI shell. So they own the full vertical slice.

What they don't own yet, and it's unfortunately off topic for today, is the UI part. So how the UI is built, how the HTML will look, we don't know yet. So ViewModel composition strictly relates to data. Composing data for the UI. In this example, we can start from the assumption, for example, that the UI is monolithic, owned by a service that we can call, for example, branding.

So to summarize what we talked about today. Boundaries are the key to success. One of my rules of thumb is whenever I feel the need of bringing in more technology to solve something that doesn't sound like a technical problem, it's time to go back to the drawing board because maybe we got boundaries wrong. The cache sample is the right sample.

So that specific Read Model, ViewModel storage outside of service boundaries was a technical solution to a non technical problem. We got boundaries wrong. And then we were kind of forced to add that technical solution. The shopping cart was a very similar problem. The other thing that I learned over the years is that the user mental model can badly influence the design.

So users, they talk about the way they see data presented on the screens. So it's very easy for them to talk about the product. And it's very easy for us as engineers to fall into the trap of saying, "They're talking about the product." So we immediately transition to public class product with a nice-looking string name, string description, and decimal price.

And then we have coupling. We built a monolith based on their model. But their model is most of the time based on the way data are presented. Connected to that is do not name things prematurely. So the problem is that names stick. So as soon as they talk about shopping cart, as soon as they talk about order management system and so on, it's very easy to adhere to that terminology.

And then if we already defined the concept of a product and all of a sudden someone talks about a price, that price feeds into the product. Because from the business perspective, that works nicely. But that's not the thing we are designing. So my suggestion is use planet names, colors, magicians, whatever you want, Game of Thrones characters. It doesn't really matter.

Don't give them a business-meaningful name until you clearly know what they are. And the final thing is behaviors define how to aggregate data. The most important one is, here is the first one, group data that change together and that influence each other together. It's basically following the coupling.

There's no point in having the product name and description in the same place as the product price. They don't change together. There won't be any relationship or any policy or any business rule connected to the three of them all at the same time. There would be however a policy connected, for example, to the price and the shipping price.

So shipping price probably doesn't belong to shipping, but it belongs to sales because it goes into discount policies and stuff like that. So in the end it's just follow the coupling. And finally, since you've followed the coupling and you have data scattered around in the system, use composition techniques to present data.
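To make "follow the coupling, then compose" concrete, here is a small sketch. It assumes each service exposes a function that contributes its own fields for a product ID. The sample values, the function names, and the `compose_product_view` helper are all hypothetical.

```python
from decimal import Decimal

# Each service keeps only the data that change together (sample data, made up).
marketing_data = {1: {"name": "Banana protector", "description": "Keeps a banana safe in a backpack"}}
sales_data = {1: {"price": Decimal("7.99"), "shipping_price": Decimal("2.50")}}


def marketing_contribution(product_id):
    return marketing_data[product_id]


def sales_contribution(product_id):
    # Price and shipping price live together because discount policies
    # involve both of them.
    return sales_data[product_id]


def compose_product_view(product_id, contributors):
    """Build one view model by asking every service for its own slice."""
    view_model = {"id": product_id}
    for contribute in contributors:
        view_model.update(contribute(product_id))
    return view_model


product_view = compose_product_view(1, [marketing_contribution, sales_contribution])
```

No service stores the others' fields. The combined shape exists only in the composed view model.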

There's no need for complex projections. It's very important to think that there's no need for complex projections outside the boundaries or the service boundaries. That's the important bit. And finally, there is no such thing as orchestration. You should be able to design everything in a system without the need of orchestrating things.

Thin events should be enough to move the system forward from a state to a next consistent state. And this is it for today. So thanks, everyone. And I guess that there are a few minutes for a Q&A session.

We will take questions for about 10 or 15 minutes, but we will be sure to get all of you a response to your questions via email if we don't have time for it live. So Mauro, the first question comes from Yavutz. He asks, when an aggregate is retrieved from disk and loaded into memory, should it fully include everything it aggregates or is it okay to partially include them? He says he has been a proponent of the former, which pushed his selections for the aggregate in a different manner than his peers. What do you think?

I'd say that probably the best answer is that it depends, but I'm going through all the options now. So don't worry. So it depends why you're loading the aggregate. If you're loading the aggregate for writing purposes, so there's a command coming in, whatever command means. It's not necessarily a command on a queue or a message on a queue.

There's an incoming HTTP request that should change something in an aggregate. Since the aggregate is the transactional boundary, when loading the aggregate in that case, you probably need all the data, because you need all the data to guarantee the aggregate consistency or the component consistency before being able to save it and validate it, and eventually reject the incoming request due to rules validation.

When loading data for visualization purposes, then it makes a lot of sense to apply some sort of CQRS here. So there's no point loading everything in order to display just a couple of things. So the aggregate might contain several pieces of information and you just need a couple of them to display.

If you find yourself in a scenario in which you're loading a component's data or an aggregate in order to apply some business policy, so handling an incoming request, and you don't need all the data, there are probably wrong boundaries there. So it's probably too big. That's not always the case obviously.

But it's probably worth spending some time investigating if it's violating the single responsibility principle. So that's the case in which we might have put together too many things that are not really required to stay together. And we might be putting them together for display purposes.

If you're in a situation where you are not using composition techniques, obviously you don't want to go to multiple places, loading data here and there in order to display something. So you want to be able to load data with the minimum number of requests, in which case, you might tend to put stuff together that should not stay together. Hopefully it was a comprehensive answer. Otherwise, just ping me over email or Twitter and we can continue the conversation.

All right. The next question comes from Ashith. And I think this is a really good question. So I'm interested to hear your response, Mauro. How are data composition techniques different from GraphQL, or is it one of the methods to compose data from different services?

GraphQL is basically one of the methods to compose data from different services. There are a few caveats using GraphQL. So GraphQL was not designed originally with that intention. The original intention of GraphQL was a different one, that was to allow clients to select the subset of data they needed.

But GraphQL can be used to achieve the same results that I showed using sample code. And by the way, you will receive a follow-up email with the links to the webinar and the slides and the demos. There's one caveat with GraphQL. Not being designed for such a purpose, dealing with the GraphQL resolvers, that is the very low level part of GraphQL that allows you to tell GraphQL, "I'm going to handle this part of the query," might result in very complex queries.

Since it's very complex to understand what's going on overall in the request coming in, because the resolver is too low level, it might be very complex to understand what's happening and to optimize that kind of approach. But all in all, GraphQL is a solution to this. We can also say, for example, that NGINX is another kind of solution to this. It solves, again, a slightly different problem, but can be used for composition purposes. I know of systems using NGINX to achieve the exact same result.

All right. Next question is from Carsten. Why do you believe this is a better setup than each slice emitting events about their changes and a shopping cart aggregate doing all its updates itself based on those events?

It's mostly because of coupling. So the thing is, in a continuously evolving system, you want to have the least possible amount of coupling in order to allow components to freely change whenever they want. Let's say that, to some extent, there is nothing wrong in saying sales publishes an event containing a bunch of data. That event is consumed by shopping cart and then shopping cart will update itself.

Well, there's two problems. The first problem is that they are temporally decoupled. So there will be a delay between which the sales publishes the event and the shopping cart will handle the event and under heavy load that delay will stretch a bit. The other problem is that, now, if sales wants to change, for example, implementing the third business requirement, the price changing should be displayed in the shopping cart and the items should be moved up to the save for later.

Technically speaking, that change is a schema change from the shopping cart perspective because we need to alter the tables where the shopping cart is stored. So in doing that, we're breaking everyone else, basically, because the shopping cart acts like a source of contention between all the services, in this case, the design contention.

It's also probably a source of contention in the running system as well. And it's also deployment contention because now sales wants to change stuff. And when they need to deploy, they are deploying and breaking from the runtime perspective everyone else till deploy is done. If we wanted to achieve autonomous components, we might want to stay away from that design as much as we can.

So it really depends on the kind of SLAs that we have with our customer, the kind of distribution the system has. For a small system that is not the scale of Amazon, some of the things we talked about today might be overkill, to some extent.

All right, Mauro. Next question from Jacob, how would you implement a query to find, for example, all shopping carts that were shipped last month with the total value greater than $1,000, let's say?

I was waiting for the search kind of question. Thanks, Jacob. Searching in a distributed system is complex. That's the first thing to say. We need to be able to distinguish between the kind of searches and the user that is searching data. So if we look at Amazon, for example, they don't allow users to search for anything other than name and description.

Everything else is provided through facets, which are the things that appear on the left side of the search results page and allow people to narrow down the results. So for example, if you search for hard disks or SSDs, then you have a facet that says we have some of them from Samsung, some of them from Crucial, some of them from Intel. And you can click on the link and narrow the search for SSDs where the producer is Intel or Samsung or Crucial.

That kind of search is kind of easy. Because if you think about it, the search is handled completely by marketing because we allow people to search only for name and descriptions that are owned by marketing. Once marketing retrieves all the IDs, then all the other services can kick in and compose additional data, the facets in this case.

A completely different scenario is the one sampled by Jacob, where it's probably a kind of backend or back-office kind of search, using the shopping cart example, and that can be eventually consistent. So the constraints there are much more relaxed, in which case we can ship some data to a sort of Elasticsearch that will be a source of contention, no matter what.

So it's going to be a kind of a shared resource, and we're going to ship there the cart ID, the time of the purchase, the number of items in the cart, and the total price of the cart. And then we can allow items to be searched in there. The third option is to build a kind of search engine that allows services to participate in the search.
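The shared back-office index can be sketched as a plain list of small entries holding only the four fields just listed. The `CartSearchEntry` and `find_carts` names and the sample data are assumptions. A real deployment would ship these entries to a search engine rather than filter a list in memory.

```python
from dataclasses import dataclass
from datetime import date
from decimal import Decimal


@dataclass(frozen=True)
class CartSearchEntry:
    cart_id: int
    purchased_on: date
    item_count: int
    total_price: Decimal


def find_carts(index, start, end, min_total):
    """Return IDs of carts purchased in [start, end) with a total above min_total."""
    return [
        entry.cart_id
        for entry in index
        if start <= entry.purchased_on < end and entry.total_price > min_total
    ]


# Sample entries (made up) shipped to the shared, eventually consistent index.
index = [
    CartSearchEntry(1, date(2019, 10, 3), 2, Decimal("1250.00")),
    CartSearchEntry(2, date(2019, 10, 20), 1, Decimal("80.00")),
    CartSearchEntry(3, date(2019, 9, 28), 5, Decimal("3000.00")),
]

# Jacob's query: carts from last month with a total value greater than $1,000.
october_big_carts = find_carts(index, date(2019, 10, 1), date(2019, 11, 1), Decimal("1000"))
```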

It works, but it has very low throughput. So the problem is that we cannot have near-zero coupling and a very well-performing search engine at the same time. So we need to choose which price we want to pay. And probably in the case of a front-end, Amazon selected the option of saying, we won't allow people to search for all the data.

They cannot search for prices, but they can only search for name and descriptions. From the backend perspective, we allow our back office user to search for much more information, but probably they don't need the real-time data or very fast responses from the search engine. Hopefully I answered the question. If you want more information or if you want to discuss further, just ping me.

I actually have a follow-up on that, Mauro. I think you covered the search aspect pretty well, but it seemed like there was a bit of a reporting slant to that question, too. Would you like to say anything about business activity monitoring?

What do you mean by business activity monitoring?

I mean like sagas to watch for events and stuff like that.

Oh, yes. So that's a very good option. So probably most of the time, the business tends to go to technical people and teams, and instead of coming with requirements, they come with solutions. So they say, "I want to be able to search for this," or "I need a report that displays this." And our role should be to try to dive into the thing and understand the real requirement.

And for example, regarding what David was saying, what we can do is basically start listening to events that flow in the system and use those events to build up a representation of data that are used to then satisfy that specific query into the system. So these events probably won't satisfy customers' requirements on the front end, but will be more than enough to satisfy back-office user requirements on the backend.

So it's really important to dive into every single scenario and try to understand, do I really need to break my coupling or is there another solution to that? That's the most important question. So for example, talking about NServiceBus, we can set up a saga that listens to multiple events and acts like a kind of a buffer, listening to data and processing these data.

And if there's a kind of a pattern in the events coming in, then the saga can react accordingly and maybe publish another event or store some data so that they will be available for other purposes later on. Is this the thing you were mentioning, David?
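The idea of a saga that buffers events and reacts to a pattern can be sketched as a small state machine. This is not the NServiceBus saga API. The `ShippedOrdersReportSaga` class and the `OrderShipped` event name are assumptions made for illustration.

```python
class ShippedOrdersReportSaga:
    """Toy saga: records an order once both expected events have been seen."""

    EXPECTED = {"OrderAccepted", "OrderShipped"}

    def __init__(self):
        self._seen = {}   # order_id -> event names received so far
        self.report = []  # order IDs stored for back-office queries

    def handle(self, event_name, order_id):
        seen = self._seen.setdefault(order_id, set())
        seen.add(event_name)
        if seen == self.EXPECTED:
            # Pattern complete: store data for later use and drop the state.
            self.report.append(order_id)
            del self._seen[order_id]


saga = ShippedOrdersReportSaga()
saga.handle("OrderShipped", 1)   # events may arrive in any order
saga.handle("OrderAccepted", 2)
saga.handle("OrderAccepted", 1)  # order 1 has now seen both events
```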

Yep. I think that pretty much covers it. I think this might be the last question. But again, we will make sure all of your questions are answered via email. This one is from Manvel, it says, so is it safe to assume you're saying if commands are sent across service boundaries, that's a code smell that your service boundaries are wrong?

Yes. So I used to say commands are internal to the service boundary, events are across boundaries. So imagine an incoming HTTP request that goes to a web API. There's some validation going on. And then the command is sent from that web API end-point to a back-end service.

If that happens, all three parts, the ViewModel composition part that issued the HTTP request, the web API that handled and validated the incoming request and transformed that into a message, and the backend service, belong to the same vertical slice. When that back-end service completes its work, then it can publish an event that says, "I'm done ABC."

And that event is interesting for another service. But commands should never be cross services. There shouldn't be a case where sales tells the shopping cart, "Move this item to the save for later." That's a sort of coupling. That's a kind of coupling that we're creating between boundaries, and we should try to avoid that as much as we can.

All right. Well, our colleagues will be speaking next month at DevTernity in Latvia and at CloudBrew in Belgium. You'll also see us at DDD Europe in February. So check those out if you're nearby. And go to particular.net/events to find us at a conference near you. That's all the time we have for today. On behalf of Mauro Servienti, this is David Boike saying goodbye for now, and see you on the next Particular live webinar. Thank you.