Microservices architecture: is it the right choice to design long-living systems?

00:01 Mauro Servienti

Good day, everyone, and welcome to this webinar, which is part of Codemotion. Codemotion is the multichannel platform that helps developers in their professional growth. Codemotion is the biggest tech conference for developers in EMEA, open to all languages and technologies. Codemotion also organizes many events for developers, such as tech meetups, webinars, hackathons, and training courses.

Here are the two upcoming conferences: Codemotion Rome, at the end of March, March 22nd and 23rd to be precise, and Codemotion Amsterdam, at the beginning of April. Codemotion recently launched its Codemotion platform. And here's a sneak preview of the Codemotion platform, where you can find articles, videos of events, interviews, and webinars. Shall we get started then? All right.

One of the challenges that we face every time we approach the design of a new system is the choice of the architectural style. We could even say that the larger and more complex the system is, the more likely it is that we need to select a set of architectural styles, each one for a different part of the system, as a single architecture doesn't really fit all our requirements. Unless we're tasked with designing a disposable system, one of our main goals should be to select architectural styles that allow us to design long-living and maintainable systems.

Today's webinar will be focused on the microservices architectural style. We'll start by looking at what seems to be a natural choice until we try to fit in a few business requirements. We'll deeply analyze how to accommodate these requirements and observe how the architecture evolves. As usual, if you have any questions, do not hesitate to use the questions panel in the GoToWebinar app. I'll do my best to answer them at the end of the webinar; otherwise, I'll follow up later via email.

By the way, my name is Mauro Servienti, and as you may guess by my accent, I'm Italian. Let's get started then. I'm a 100% remote worker, and unfortunately the flat we live in is very small, but I have a small office as well. I enjoy going to the office by bike. And at the same time, I enjoy bananas as snacks. As you can imagine, bananas and bike riding at the same time, with the banana in the backpack, is not really a great idea.

After failing miserably a couple of times, and you can imagine what failing means in this case, I decided to fix the problem. I headed to my favorite online store and decided to buy a banana protector. That is the best way to protect my favorite snack while traveling to the office. I went to the webpage, selected the product I wished to buy, and added it to the shopping cart. As soon as I added it to the shopping cart, my immediate reaction was to listen to the architect in my mind, trying to understand how this kind of shopping cart could be designed.

And it's very easy, as a first guess, to transition from this image, or webpage if you will, to something like a shopping cart microservice. What if we design a shopping cart that is composed of a list of item IDs, prices, quantities, inventory, names, and descriptions? Basically the list of products we just added to the shopping cart. Sounds trivial, doesn't it?
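As a rough illustration of that first guess, the cart can be sketched as one data structure holding everything shown on the page. The class and field names here are assumptions made for the example, not taken from the talk's demo.

```python
from dataclasses import dataclass, field
from decimal import Decimal


@dataclass
class CartItem:
    # Everything the page shows about a product, copied into the cart.
    item_id: str
    name: str
    description: str
    price: Decimal
    quantity: int
    inventory: int


@dataclass
class ShoppingCart:
    cart_id: str
    items: list = field(default_factory=list)

    def total(self) -> Decimal:
        # Sum of price * quantity over all items in the cart.
        return sum((i.price * i.quantity for i in self.items), Decimal("0"))


cart = ShoppingCart("cart-1")
cart.items.append(
    CartItem("banana-protector", "Banana protector", "Protects one banana",
             Decimal("4.99"), 2, 11)
)
```

Note how the cart already carries name, description, and inventory, none of which it needs to compute a total.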

But let me try to introduce at least three different business requirements that we'll try to fit into our design, and see how the shopping cart evolves along the way. The first requirement is that we cannot allow the price of items in the shopping cart to unexpectedly change while the item is in the shopping cart itself. Basically, what we want to achieve is to be able to, in some way, decouple the price of the item in the shopping cart from the product's price.

It seems natural at this point to design something like a product microservice, where we have an item ID and the price. And every single time the price changes we copy the price to the shopping cart, or rather, every single time we add a product to the shopping cart, we end up copying the price of the item we just added into the shopping cart itself.

Let's have a look at the second business requirement then. The business wants to show the number of items left in stock for a specific product that we have available. Something like what's displayed in the image: only 11 left in stock. Again, since someone is telling us that the quantity of available products should be displayed in the shopping cart, it's very tempting to fall into designing this by adding the inventory quantity to the product microservice.

We end up with a products microservice composed of an item ID, a price, and an inventory, and still the shopping cart composed of the same data we identified earlier. And we end up copying the price and inventory. So every single time we add a product to the shopping cart, we end up copying two more pieces of information. The last requirement I want to investigate is a little bit more complex.

Due to legal reasons, we want to display to users if prices change while an item is in their shopping cart. And for very specific legal reasons, we cannot leave the item in the shopping cart. We want to move the item out of the shopping cart. Let's say in a kind of saved for later area in the shopping cart, so that we prevent users accidentally buying things where the price was changed and they were not aware of the price change.

Still looking at our products and shopping cart microservices, now suddenly the price definition that we added to the shopping cart at the very beginning is not enough anymore. So we need to transition from a price to a different concept. That is the current price and the last price, because that's the only way to understand from the UI perspective, if something changed along the way.

Now whenever a price update comes in that affects the price of a product, what happens is that we need to query all the shopping carts to see if an item identified by XYZ, the item ID, is in some shopping cart. And if it is, swap the current price to become the last price, and insert the new price in the current price attribute. And obviously move the item to saved for later.
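The flow just described can be sketched as follows, with the products side reaching into every cart. This is a minimal in-memory sketch of the problematic design, and all names are hypothetical.

```python
from dataclasses import dataclass, field
from decimal import Decimal
from typing import Optional


@dataclass
class CartLine:
    item_id: str
    current_price: Decimal
    last_price: Optional[Decimal] = None


@dataclass
class Cart:
    cart_id: str
    lines: list = field(default_factory=list)
    saved_for_later: list = field(default_factory=list)


def apply_price_update(carts, item_id, new_price):
    """Products querying and mutating every shopping cart (the design smell)."""
    moved = 0
    for cart in carts:
        for line in list(cart.lines):  # iterate over a copy while removing
            if line.item_id == item_id and line.current_price != new_price:
                line.last_price = line.current_price  # current becomes last
                line.current_price = new_price        # new price becomes current
                cart.lines.remove(line)
                cart.saved_for_later.append(line)     # move to saved for later
                moved += 1
    return moved


carts = [
    Cart("cart-1", [CartLine("banana-protector", Decimal("4.99"))]),
    Cart("cart-2", [CartLine("bike-bell", Decimal("7.50"))]),
]
moved = apply_price_update(carts, "banana-protector", Decimal("5.49"))
```

The function lives conceptually in the products service yet changes state that belongs to the shopping cart, which is the boundary violation discussed next.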

As you can imagine, we all start smelling now that there's something wrong with everything we've been doing so far. The first smell is that there's a clear service boundaries violation, and that violation is identified by the "move to saved for later" action. There's a service, or a microservice if you will, that is attempting to perform an action on another microservice. But that's not the only problem with these new requirements introduced into our, at least apparently, nicely designed architecture.

Price itself doesn't feel like it's in the right place. We might have many different reasons, or many different sources, that cause the price to change. Let's say there's marketing coming in saying, "Hey, I want to run a marketing campaign." Or the stock inventory changed, or there's a new supplier supplying the same kind of product at a different price, so the price at which we sell the product needs to change.

There are different sources of information, and what this is telling us is that ownership is lost. We're violating service boundaries in a way such that we are no longer owners of the information we store. We just own data, but we don't own behavior, and we've stopped owning data as well, because everyone can now ask us to change something that we should own. But that's not all.

There's another problem that we have. As soon as we introduce, for example, suppliers into the picture, we have a supplier that has a supplier ID, an item ID that might be the product ID, and the purchase price. We buy stuff from suppliers, and we sell these things on our online store. Now a new purchase price comes in, so the supplier changes the price at which we buy items from them, and then we suddenly need to query the products microservice, asking, "Are we selling this?" Because if we're selling this item, we need to update the price.

Now the suppliers microservice is dictating to the products one, "Hey, you need to update the price of the products you're selling." And at the same time, as soon as that happens, the products microservice needs to go to the shopping cart saying, "Are we actually currently selling any of these? Are there any of these in any of the shopping carts out there?" And if the answer is yes, swap and insert new prices, and move to saved for later.

What is this picture telling us? That autonomy is gone. Services, or microservices, should be designed to be autonomous, but as soon as we have all these arrows connecting services in a kind of command-and-query way, that is, issuing commands, telling other services "do this for me" or "let me read your data", we've lost autonomy.

And the other problem we might face, moving to a completely different level, is that since we read a few things on the web, we implemented all this cross-service communication using HTTP, because it's the microservices way. Using the last slide, we end up in a situation like the following. There's an incoming external HTTP POST, which turns into an HTTP query from suppliers to products, which turns into a POST from suppliers to products, and that results in an HTTP query from products to shopping cart, which results in a couple of HTTP POSTs from products to shopping cart to complete the process.

What's the problem with this? Let's look, for a second, at latency. If we have many of these steps, and we really have no control over how many of these steps there can be, and HTTP and the network bring latency to the table, we probably end up with the first incoming HTTP POST timing out because of the latency accumulating along the way.

But it's not only that. We can produce something called the snowball effect. Let's say that when we try to swap and insert prices in the shopping cart, something goes badly. Suddenly we have not the last one, but the one before the last one, an HTTP call that is failing. That causes the previous call to fail, because obviously it returns an HTTP 500, or some other error, which causes the products microservice to fail, which obviously causes the suppliers microservice to fail.
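The snowball effect can be imitated with plain function calls standing in for the synchronous HTTP calls. This is only a sketch under that assumption: no real HTTP is involved, and the function names and the `HttpError` type are made up for the example.

```python
class HttpError(Exception):
    """Stand-in for a failed HTTP call (for example a 500 response)."""


def shopping_cart_swap_prices(item_id, new_price, fail=False):
    if fail:
        raise HttpError("shopping cart returned 500")
    return "swapped"


def products_update_price(item_id, new_price, cart_fails=False):
    # Products cannot complete until the shopping cart has answered.
    shopping_cart_swap_prices(item_id, new_price, fail=cart_fails)
    return "sell price updated"


def suppliers_change_purchase_price(item_id, new_price, cart_fails=False):
    # Suppliers cannot complete until products has answered.
    products_update_price(item_id, new_price, cart_fails=cart_fails)
    return "purchase price changed"


def handle_external_post(item_id, new_price, cart_fails=False):
    """Entry point: returns the status code the external caller would see."""
    try:
        suppliers_change_purchase_price(item_id, new_price, cart_fails=cart_fails)
        return 200
    except HttpError:
        return 500


ok_status = handle_external_post("banana-protector", 549)
failed_status = handle_external_post("banana-protector", 549, cart_fails=True)
```

A failure in the innermost call surfaces unchanged at the outermost caller, even though suppliers and products did nothing wrong themselves.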

Now what this identifies is that we have another problem, which we just introduced by allowing services to call each other and, basically, share data. We introduced temporal coupling. Basically, the problem caused by one service is reflected into another service, and then into another one, and into another one, very similar to what a snowball effect is. But it's not only that. What if we have an HTTP request that succeeds, but the HTTP response that should come back times out, or is lost because of a network failure?

Now there's something like, for example, the products microservice that is trying to call the shopping cart in order to swap and insert prices. The operation succeeds, but the products microservice never knows that, because the response never comes back. If we use HTTP, the only way to solve the problem is to invent, in some way, some sort of distributed transactions over HTTP.

And as you can imagine, we started with the intention of designing a distributed system using the microservices architectural style. But then we ended up with a distributed monolith, because now we have temporally coupled services, or microservices, not owning data, not responsible for behaviors. They can call each other, and they can query data across each other. And the protocol they use to talk to each other causes temporal coupling and the infamous snowball effect.

What's the first thing we can do to alleviate the problem? The first step is to introduce messaging. In this case, messaging is meant to solve the temporal coupling and the distributed transactions problems. Let's take a look again at the snowball effect slide. We left our snowball effect here: we have the shopping cart microservice failing an operation, and that is causing the failure to be propagated to all the other services.

What if we change the approach? There can still be an HTTP request coming in from the outside, going to suppliers and saying, "Hey, can you change the purchase price for me, please?" "Yes, sure." And once the price has changed, a message is dispatched on a queue saying "purchase price changed", with the new price details. At this point, products can react, introduce some business logic, apply the new price to the product we sell, and publish another message, a kind of "sell price changed".

The price at which we sell something has now changed, again with new price details. This message can be handled by shopping cart, where business logic reads the new price details, swaps the prices, setting the new price as the current price and moving the old price to the previous price, and at the same time, in order to respect the rules, moves the item to saved for later. Now shopping cart is responsible for the entire business logic that needs to be applied whenever there's a price change.
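The message flow described here can be sketched with a tiny in-memory bus. The `Bus` class, the message names, the payload shapes, and the markup rule are all assumptions made for illustration; a real system would use an actual queueing infrastructure. Prices are integers in cents to keep the arithmetic simple.

```python
from collections import defaultdict, deque


class Bus:
    """Tiny in-memory stand-in for a message queue (hypothetical, not a real broker)."""

    def __init__(self):
        self.handlers = defaultdict(list)
        self.pending = deque()

    def subscribe(self, message_type, handler):
        self.handlers[message_type].append(handler)

    def publish(self, message_type, payload):
        # Publishing only enqueues; the publisher never waits for subscribers.
        self.pending.append((message_type, payload))

    def deliver_all(self):
        while self.pending:
            message_type, payload = self.pending.popleft()
            for handler in self.handlers[message_type]:
                handler(payload)


bus = Bus()
sell_prices = {}  # owned by products
cart = {
    "items": {"banana-protector": {"current": 499, "previous": None}},
    "saved_for_later": {},
}  # owned by shopping cart


def suppliers_change_purchase_price(item_id, purchase_price):
    # The external HTTP request ends here: publish and return.
    bus.publish("PurchasePriceChanged",
                {"item_id": item_id, "purchase_price": purchase_price})
    return "ok"


def products_on_purchase_price_changed(msg):
    sell_price = msg["purchase_price"] * 2  # hypothetical markup rule
    sell_prices[msg["item_id"]] = sell_price
    bus.publish("SellPriceChanged",
                {"item_id": msg["item_id"], "sell_price": sell_price})


def cart_on_sell_price_changed(msg):
    line = cart["items"].pop(msg["item_id"], None)
    if line is None:
        return  # the item is not in this cart
    # Swap prices, then move the item to saved for later.
    line["previous"], line["current"] = line["current"], msg["sell_price"]
    cart["saved_for_later"][msg["item_id"]] = line


bus.subscribe("PurchasePriceChanged", products_on_purchase_price_changed)
bus.subscribe("SellPriceChanged", cart_on_sell_price_changed)

response = suppliers_change_purchase_price("banana-protector", 300)
bus.deliver_all()
```

Each handler touches only the state of its own service, and the supplier-facing request returns before products or shopping cart have done any work.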

And if something goes badly, let's say when trying to move to saved for later, or when trying to swap prices, or whatever, that's not really important. The only thing that we need to do is just retry the message. There is no more propagation, the failure propagation effect. And one of the rules of thumb that we should apply whenever we design something we can call a distributed system is that problems in a specific service should not affect, in any way, any other service or the system itself.
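Retrying a message can be sketched as a small loop around the handler. This is a simplified assumption of how a messaging infrastructure behaves; the function names are hypothetical, and real infrastructures typically add delays between attempts and an error queue for messages that keep failing.

```python
def handle_with_retries(handler, message, max_attempts=3):
    """Call handler(message), retrying on failure; return the attempt that succeeded."""
    for attempt in range(1, max_attempts + 1):
        try:
            handler(message)
            return attempt
        except Exception:
            if attempt == max_attempts:
                # A real system would typically park the message in an error queue.
                raise


attempts_seen = []


def flaky_move_to_saved_for_later(message):
    # Hypothetical handler that fails twice before succeeding.
    attempts_seen.append(message["item_id"])
    if len(attempts_seen) < 3:
        raise RuntimeError("temporary failure")


succeeded_on = handle_with_retries(
    flaky_move_to_saved_for_later, {"item_id": "banana-protector"}
)
```

The failure stays inside the shopping cart: the services that published the message have already finished their own work and never see it.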

But if you think about it, messages are just a palliative. They alleviate the symptoms that we identified so far, and they solve some of them. We no longer need distributed transactions, and the snowball effect is gone. We could say that from the temporal coupling perspective, the distributed monolith is now gone, but is it really gone? Not really, because queries are still a thing.

Whenever we need to add an item to a cart, which is a user operation, if the price and inventory data are stored in the products microservice, then we end up copying data from one service to another. And we're not really solving the problem by sharing data using messages. We're still sharing data. It's a different way of sharing data, but we're still sharing data.

We're basically flooding the system, because every single update of the price, or update of the inventory, causes a message that is, again, a fat message with rich data going from one service to another over, and over, and over again. We haven't really solved the problem. The real problem is that we designed the system to begin with, using this trivial sample, because it's really a trivial sample, following the user mental model.

So that's the main issue: the user mental model in this case is misleading, because users, or domain experts if you will, talk in terms of a mental model. They talk in terms of suppliers, products, shopping cart. And whenever they say something like "there is a product that has a price, has an inventory, has a name, has a description", we as good engineers immediately react to that and say, "Well, public class Product, public string Name, public decimal Price, and so on."

We transition immediately from the user mental model described by the domain expert into code, or into a class. Let's focus on the shopping cart first, and just have a look at that. We'll focus on this single problem in what can be a very large system. Do we really need the shopping cart? That could be the question we should try to answer. And probably the answer is no: we could see the shopping cart from different perspectives.

There's a sales shopping cart: sales needs to know the item ID and the price, and for the cart the item ID, current price, last price, and quantity, because the quantity is required to calculate the total amount of the shopping cart. So we can remove this information from the shopping cart. At the same time, warehouse has a very similar problem: warehouse needs to show the inventory, that is, how many items are left in stock.

Warehouse owns the item ID, which is the same as the one shared (it's a primary key in the end), and the inventory. And similarly it has the cart ID, item ID, inventory, and quantity. Shipping basically does the same thing: shipping might be required to display delivery types when an item is in the shopping cart. So there's the item ID and delivery type in the shipping service, and we have a list of cart IDs, item IDs, quantities, and delivery estimates.

Marketing, which is the one that owns name and description (product catalog is probably a better name), doesn't really change. The only reason for marketing data to change is probably typos. There is no need to denormalize any kind of data into a shopping cart. At this point, do we really need the shopping cart at all? Probably not, so we can say that sales is the one that owns the shopping cart concept. Sales is the one containing the so-called master-detail relationship, where we have the cart identifier that is the master, and then a list of children which, from the point of view of sales, are just prices and quantities. That's it.
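The split described above can be sketched as one small model per service, all sharing only the cart ID and item ID. The class and field names are assumptions for illustration and are not taken from the demo.

```python
from dataclasses import dataclass, field
from decimal import Decimal
from typing import Optional


@dataclass
class SalesCartItem:
    # Sales: what is needed to compute totals and detect price changes.
    item_id: str
    current_price: Decimal
    quantity: int
    last_price: Optional[Decimal] = None


@dataclass
class SalesCart:
    # Sales owns the master-detail relationship: cart ID plus its children.
    cart_id: str
    items: list = field(default_factory=list)

    def total(self) -> Decimal:
        return sum((i.current_price * i.quantity for i in self.items), Decimal("0"))


@dataclass
class WarehouseCartItem:
    # Warehouse: what is needed to show "only N left in stock".
    cart_id: str
    item_id: str
    inventory: int
    quantity: int


@dataclass
class ShippingCartItem:
    # Shipping: what is needed to show delivery estimates.
    cart_id: str
    item_id: str
    quantity: int
    delivery_estimate: str


sales_cart = SalesCart("cart-1", [SalesCartItem("banana-protector", Decimal("4.99"), 2)])
warehouse_view = WarehouseCartItem("cart-1", "banana-protector", inventory=11, quantity=2)
shipping_view = ShippingCartItem("cart-1", "banana-protector", 2, "2 days")
```

Name and description appear nowhere in these models: they stay with marketing and are never copied into a cart.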

What we basically did is just follow the coupling. We observed the way data were exchanged before, and we basically ended up putting in the same boundary, or in the same service if you will, data that change together. Let's look at sales, for example. Whenever the price changes, it's a sales responsibility to look at the prices of the items in the shopping cart, make sure we respect the regulations, update prices, and move to saved for later the products whose price changed, based on some business rules.

And for warehouse it's the same: whenever an item is sold, it updates the inventory, looks in all the shopping carts that are actually active in the system, and updates the inventory. I doubt that's really happening, but it's just for the sake of the sample. What really is a microservice, or a service? It's the technical authority for a specific business capability; business rules (and data) reside within the service boundaries.

I put data in parentheses because I truly believe that business rules are much more important than data, where data end up being stored is just the side effect or a consequence of how business rules are run. So we should carefully look at who owns and who applies business rules. And then we end up discovering where we should design, and cut our boundaries to identify services.

In order to achieve autonomy, we want to be able to design full vertical slices, and the first step in order to have full vertical slices is to identify behavior and data, and identify boundaries. Once we have identified boundaries we can say, "Well, a vertical slice can own the UI part as well," and we'll talk a little bit about UI later on. And at the same time, we can look at the backend of these services and say, "Now, every single service can have its own technology, or framework, or dedicated kind of storage, and so on," in order to satisfy the specific needs of its behavior and the way data are stored in that service.

Once we achieve full vertical slices, we've achieved a nearly perfect autonomous group of services, where each service is now able to evolve independently from the others. We have no temporal coupling, and we basically don't have any cross-service communication at all. Obviously the time that we have today is very limited. I have a demo, a .NET and C# based demo, that demos everything we have talked about so far. Feel free to reach out to me if .NET and C# are not your favorite technologies, and we can try to understand how to accommodate your needs using different technologies.

The demo is available on GitHub. Feel free also to raise pull requests if you find any errors. Before concluding, a few takeaways. Boundaries are the key to success in designing a distributed system, and a distributed architecture for a long-living system. One of my key tenets is: do not bring in more technology to solve non-technical problems. I use this rule of thumb: whenever I feel the need to bring in new technology, I start looking around for mistakes I made in identifying service boundaries.

For example, whenever I feel the need to bring in a distributed cache, that's the kind of problem you don't want to solve. There's probably a boundaries problem to fix before the need for a distributed cache. The second important thing is that the user mental model can badly influence design. Users tend to think in terms of data presentation: they talk about the way they see data on screens, they talk about the way they see it on paper. Be careful about immediately translating the user mental model into service boundaries, because that's probably the worst thing you can do.

The third thing is: do not name things prematurely, because premature names stick and drive data aggregation. As soon as we start listening to a user talking about the product, and we say, "Well, that's a nice looking microservice, a product microservice," that's the beginning of the end. We'll start putting data into that service just because it's named product, or because it's named shopping cart. Use fantasy names like planets, colors, ABCD, whatever. Do not give it any meaningful business name, at least at the very beginning.

Behaviors define how to aggregate data. Group data that change together and that influence each other. Use anti-requirements techniques to validate data grouping. What does it mean? Let me use a sample for the first one, group data that change together and influence each other. Let's say that you have the concept of the product, with name, description, and price, and the concept of the user, with first name, last name, and a user type that can be standard user or gold customer.

And it's very easy to end up designing a users service and a products service. But at a certain point, there will be a business stakeholder coming in saying, "You know what? If it is a premium customer, we want to apply a 10% discount at purchase time." We're now trying to apply a business rule to data that are owned by two different services. That's a clear sign that service boundaries are wrong.

Price and user type change together, they affect each other. This business rule applies to both of them at the same time, and that's a good sign that they should stay together. This highlights another problem: it's impossible to give it a name. We cannot really say what it would be. Would it be some sort of user prices microservice, or price user microservice? It makes no sense. That's why naming is very hard, and why things should not be given names prematurely.

At the same time, use anti-requirements techniques to validate that grouping. What does it mean? Ask stupid questions such as, "Will it ever happen that if the last name of the user starts with an S, like my last name, there will be a discount on the price?" And obviously the answer is no. So you know that last name and price don't need to stay together, and can be owned by two different services. But someone might raise their hand and say, "Well, however, when it's a customer's birthday, we want to apply a discount." That probably means that the birth date should stay in the same place as price and user type, because there's a business rule that is applied to these three concepts in order to generate the price.
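A small sketch of why price, user type, and birth date end up in the same boundary: a single rule needs all three. The 10% premium discount comes from the example above, while the 5% birthday discount and the "take the larger discount" policy are assumptions invented for the illustration.

```python
from datetime import date
from decimal import Decimal

PREMIUM_DISCOUNT = Decimal("0.10")   # from the stakeholder's rule in the talk
BIRTHDAY_DISCOUNT = Decimal("0.05")  # hypothetical value, not from the talk


def price_for(base_price, user_type, birth_date, today):
    """Apply the single largest applicable discount (an assumed policy)."""
    discount = Decimal("0")
    if user_type == "gold":
        discount = max(discount, PREMIUM_DISCOUNT)
    if (birth_date.month, birth_date.day) == (today.month, today.day):
        discount = max(discount, BIRTHDAY_DISCOUNT)
    return base_price * (Decimal("1") - discount)


today = date(2024, 5, 17)
standard_price = price_for(Decimal("100"), "standard", date(1980, 1, 1), today)
gold_price = price_for(Decimal("100"), "gold", date(1980, 1, 1), today)
birthday_price = price_for(Decimal("100"), "standard", date(1980, 5, 17), today)
```

The last name never appears in the function signature, which is the anti-requirement check in code form: data the rule does not read has no reason to live in this boundary.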

In the end, follow the coupling: look at how data flow within the system, and group them together. Finally, use messaging to temporally decouple the services, so that a service won't be unavailable if another one is unavailable, the infamous snowball effect. I guess that in your mind there's a question now. You have all these nice looking services now, where data are divided up into different services, each one owning its own data.

But what about the poor user in front of the screen? We need to be able to show them a product, not prices, names, inventory, and shipping delivery estimates. There are View Model Composition techniques that are designed to solve this specific problem. View Model Composition techniques are off topic for this webinar. I'm describing View Model Composition techniques on my blog at milestone.topics.it. Please have a look there, and if you have any questions there's a contact form. Reach out to me, and we can discuss whatever problem you might have.

Before concluding, I'd like to leave you with one final suggestion, the first rule of distributed systems: you don't need a distributed system. What does it mean? Think more than twice before going down the route of microservices, or a service-oriented architecture. There's a lot of complexity that comes in as you start building a distributed system, and there must be good reasons to go down that route.

Think about the fact that large successful companies that are now using distributed architectures are, first of all, very large. Secondly, they all started as monoliths: they built a monolith, and then a monolith v2, and then a monolith v3. And then probably, at the fourth rewrite, they ended up with a distributed system. This is to say that before going on to implement a distributed system architecture, the first and most important thing is that your knowledge of the domain should be really deep. You need to deeply and intimately know the domain you're trying to design.

And at the same time, the domain must be very stable. In a constantly changing domain, for example, the startup one, it's very hard to apply microservices architecture styles, because if there's a constant change in the domain, it means that rules, policies, behaviors are kind of unreliable. They constantly change, and that means that service boundaries go on changing, and changing, and changing, making it very hard to come up with stable well-known service boundaries.

Thank you very much, and if you have any questions just use the questions panel. I can see that there's already one that asks me to show the link to the demo. Let me go back, so you can take a screenshot of this link if you will. One question is, what about the required communication that there must be across services? That's an interesting thing. We have, for sure, scenarios in which services need to talk to each other.

Let me use a sample that we used earlier in the webinar. There's a price change, and whenever that price change... Or let me use another sample, sorry. Let's say that a product is added to the cart, and marketing is interested in knowing whenever the product is added to the cart because of statistical reasons. Now the shopping cart can simply publish an event that is as light as item added to the cart, item ID, and marketing will subscribe to that event on a queue.

And given that marketing has the same concept of item ID connected to a name and description, it can start keeping track of statistics related to products that were added to the cart, removed from it, or had their quantity changed. So there will be cross-service communication for sure. The important bit is that whenever you design a cross-service, or cross-boundary, communication, you want to design that communication to be as thin as possible. That means that messages, or events to use a better word, should be just a name, what happened, "product added to cart", and a bunch of IDs. That's it.
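A thin event of this kind can be sketched as a name plus a couple of IDs, with the subscriber relying on data it already owns. The `ItemAddedToCart` type and the `MarketingStatistics` class are hypothetical names chosen for the example.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class ItemAddedToCart:
    # The event name says what happened; the payload is only identifiers.
    cart_id: str
    item_id: str


class MarketingStatistics:
    def __init__(self):
        # Marketing already owns names and descriptions for each item ID.
        self.catalog = {"banana-protector": "Banana protector"}
        self.times_added = Counter()

    def handle(self, event: ItemAddedToCart):
        self.times_added[event.item_id] += 1

    def report(self):
        # Resolve IDs to names locally; nothing is fetched from other services.
        return {
            self.catalog.get(item_id, item_id): count
            for item_id, count in self.times_added.items()
        }


marketing = MarketingStatistics()
marketing.handle(ItemAddedToCart("cart-1", "banana-protector"))
marketing.handle(ItemAddedToCart("cart-2", "banana-protector"))
report = marketing.report()
```

No price, name, or inventory travels in the event, so a change to any of those never forces a change to the message contract.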

No more data, because data are already in the correct place. Each service already owns the data required by that service to perform any business rule, any policy, any logic that is required. What about UI and microservices? Well, that's an interesting problem, and in the demo there's the solution to that. The webinar's time was obviously limited to 40 minutes, so we cannot really go that deep. Essentially, the microservices and UI problem is just a matter of composing data back into the UI. Let me do a very brief overview of what happens. I'll go back to this slide for a second. Okay, this one.

If you take a look, the item ID attribute is everywhere. The primary key that identifies what a product is, from the user mental model point of view, is shared across all the services. It's basically the primary key, a kind of foreign key without a real foreign key relationship type of thing. What the UI can now do is issue multiple requests in parallel to each service saying, "Hey, can you give me your details for the item ID XYZ?" And the UI will take care of composing these data into what we can now call a view model, which is then shipped to the framework you're using to build the UI, let's say React, Angular, whatever. It's not really that important.
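The composition step can be sketched with local functions standing in for the per-service requests. This is a minimal illustration assuming each service answers with a small dictionary keyed by the shared item ID; the function names and returned values are invented, and the demo's actual approach may differ.

```python
from concurrent.futures import ThreadPoolExecutor


# Stand-ins for requests to each service; a real UI would issue HTTP calls.
def marketing_details(item_id):
    return {"name": "Banana protector", "description": "Protects one banana"}


def sales_details(item_id):
    return {"price": "4.99"}


def warehouse_details(item_id):
    return {"inventory": 11}


def shipping_details(item_id):
    return {"delivery_estimate": "2 days"}


def compose_product_view_model(item_id):
    """Ask every service for its slice of the item, in parallel, and merge them."""
    services = [marketing_details, sales_details, warehouse_details, shipping_details]
    view_model = {"item_id": item_id}
    with ThreadPoolExecutor(max_workers=len(services)) as pool:
        for part in pool.map(lambda fetch: fetch(item_id), services):
            view_model.update(part)
    return view_model


view_model = compose_product_view_model("banana-protector")
```

The only thing the composer knows about every service is the shared item ID; each slice remains owned by the service that returned it.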

That receives a view model that represents the user mental model. In the demo all this logic is applied, and you can see multiple things happening, also some sort of decomposition, that is, what happens when we post data from the UI to the services. There's an interesting question from Mateo: what do you think about event sourcing and the idea of having a single source of truth on a commit log? Well, event sourcing is an interesting idea.

My rule of thumb is that event sourcing should not be a top-level architecture. What do I mean by that? We should not design a system starting from the assumption that it will be an event-sourced system. If we add, for example, accounting to this picture, accounting is probably a nice place where event sourcing is a winner. Accountants by definition don't use erasers; it's a kind of append-only data model. To some extent we can say that event sourcing is an implementation detail of a service. And if it makes sense for that specific service to use event sourcing, that's fine. I'm very happy for that service if it benefits from event sourcing.

Kafka, on the other hand, can be used as a transport mechanism for messages, but is not really designed to achieve the same goal. Kafka is designed to provide an ordered commit log, where all the events are ordered in the way they happened, and consumers can consume them from the beginning of history, or from a specific point in time, to the end. A messaging infrastructure is designed to achieve another goal, which is to share data across services, and once those data are delivered they are not available anymore in the transport.
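The difference between the two models can be sketched with two toy classes. These are conceptual stand-ins only: they are not Kafka's API or any real broker's API, and the class and method names are made up for the example.

```python
from collections import deque


class CommitLog:
    """Ordered, append-only log: reading never removes anything."""

    def __init__(self):
        self._events = []

    def append(self, event):
        self._events.append(event)
        return len(self._events) - 1  # offset of the appended event

    def read_from(self, offset=0):
        # Any consumer can replay from the beginning or from a chosen offset.
        return list(self._events[offset:])


class MessageQueue:
    """Transport: once a message is delivered, it is gone from the queue."""

    def __init__(self):
        self._messages = deque()

    def send(self, message):
        self._messages.append(message)

    def receive(self):
        return self._messages.popleft() if self._messages else None


log = CommitLog()
queue = MessageQueue()
for event in ["PriceChanged:1", "PriceChanged:2"]:
    log.append(event)
    queue.send(event)

first_replay = log.read_from(0)
second_replay = log.read_from(0)   # same history again
delivered = [queue.receive(), queue.receive()]
after_delivery = queue.receive()   # nothing left
```

Replaying the log twice yields the same events, while the queue is empty after each message has been received once.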

Again, I'm not saying that Kafka is not the solution, and I'm not saying that messaging is the solution. There might be parts of the system where messaging makes a lot of sense, and other parts of the system where Kafka makes a lot of sense. That's why the larger the system is, the more likely it is that you end up using multiple architectural styles and multiple technologies. And I'm happy that Mateo agrees.

Unfortunately the time is over, I'd like to thank all the people that joined this Codemotion webinar. And let me go back to the demo slide, so that if you still want to copy the link, there's an option. So I'd like to thank all the people that joined this Codemotion webinar, Codemotion for having me obviously, and all the people that will have the opportunity to watch the recording as soon as it's available. Thanks again. Bye now.
