Keynote - Event Sourcing @ DDD Europe
About this video
“I’m not old enough to say get off my lawn, but…”
In the keynote at DDD Europe 2020, Udi casts a critical eye over the cargo cult forming around event sourcing. Is event sourcing really the greatest thing since sliced bread? Or is it at the “peak of inflated expectations” in Gartner’s Hype Cycle?
It’s easy to assume that event sourcing is a silver bullet that will solve all the problems associated with building a complex system, but silver bullets don’t really exist. In the presentation, Udi covers the history of events and event-driven architecture, and points out where attempts at event sourcing commonly go wrong, and how to approach it in a better way.
Transcription
- 00:00:11 Udi Dahan
- Good morning. Welcome. Hello. It's great to be back at DDD Europe, specifically for this event sourcing talk. Just a quick question to sort of get a sense of the audience. Who's been to DDD Europe before? Ooh, okay, a lot of you. Who's this the first time for you? I'm guessing that a lot of you are raising your hands twice. Either that, I just don't know what happened here because it looked like 80% raised their hands for both questions. This usually doesn't happen when I'm in Europe, getting that many people that are raising their hands. What about event sourcing itself? I mean, who's been doing event sourcing on their projects already? Okay. And who's this the first time that you're hearing about event sourcing? Okay. Some hands coming up. All right.
- 00:01:02 Udi Dahan
- For those of you that this is new, I'm going to give you some of the background of what this event sourcing thing even is. For those of you that have been doing it, my guess is that you've heard varying opinions on what it is, how to do it, what's included, what's not included, how that relates to other patterns. This is sort of the natural order of things, okay? So for those of you just sort of coming in and feeling a little bit confused about what's been going on with event sourcing, the fact that this is our essentially first event sourcing conference is sort of a product of that.
- 00:01:36 Udi Dahan
- As an industry, we first start doing things in new ways, then we come up with names for it, then we start talking to each other about those names and then realizing that everybody's using the same names to mean different things. Then we have conferences. Then we get together. Then we sort shit out. And then roughly 10 years down the road, we kind of get to the point we're like, okay, now we finally know what we're talking about.
- 00:02:05 Udi Dahan
- For those of you who don't know me, I'm Udi Dahan. I speak at various events like this. I spent a number of years consulting, working with various companies. I started a company called Particular Software over there. We build service bus technology that enables building event driven systems. And I'm @UdiDahan on Twitter if you want to talk to me over there.
- 00:02:30 Udi Dahan
- I know I'm not really old enough to pull off the whole Clint Eastwood grizzled get off my lawn type of thing, but there will be a little bit of that in this talk. A lot of the things that sort of get bundled underneath event sourcing, they're not really that new and that's where it's going to help as we sort of go through this to put some order in these types of things. My concern, and it's not just about event sourcing but pretty much every new thing that comes along, there's this sort of early phase where everybody's like, "Oh my God, it's the greatest thing since sliced bread. I'm using event sourcing for everything." That's usually where a lot of the damage gets done, I'm afraid.
- 00:03:18 Udi Dahan
- So if you or your company is in this phase of event sourcing all of the things where everybody's all hyper excited about it, again, this is not the first time that we as an industry got excited about some new thing that's going to make everything magically better. Before that we had microservices, which still is kind of a big thing that everybody's kind of going on and on about, domain-driven design for a while was that brand new thing, service oriented architecture before that, and it goes on and on. So really every other technology out there has gone through the same type of thing and over time, yeah, they all kind of start to look the same. And as we go through a bunch of these patterns, you'll see the similarities and I'll try to articulate the differences between them.
- 00:04:16 Udi Dahan
- This idea of things getting sort of blown out of proportion, it was documented a number of years ago from the fellows at Gartner. It's known as the Gartner Hype Cycle. The very first conference on event sourcing means you are here, the peak of inflated expectations. This is where we are event sourcing all of the things. Some of you have already been through that and realized this is a lot harder than it looks and there are a lot of challenges along the way. You might be down there in the trough of disillusionment saying, you know what, let's actually rip all that out and just go all the way back to basics. Our objective with this event sourcing conference is to hopefully bring some of you up towards that slope of enlightenment of understanding what it is, what it's good for, what are the other bits that make it up and how to use them.
- 00:05:21 Udi Dahan
- So, that's kind of our story for the day. Obviously in this talk I'm not going to be able to give you all of the detailed information that will help bring you up over there, but we've got lots of great speakers and we won't always agree with each other. All right. So for those of you that this is the first time that you're at a DDD conference or any type of industry conference, you'll be hearing from lots of speakers and not everybody will agree with everybody else. And then that might leave you confused.
- 00:05:51 Udi Dahan
- You'd much rather have somebody say, "This is what it is. This is how you do it. Here's a cookie cutter approach, ABCD." And then you just follow that and you're good. We're not going to get that. All right. We're not mature enough yet with regards to event sourcing in industry to get to that point. But we will hopefully take all of this stuff that sort of falls in and around event sourcing, where we've got SOA and domain-driven design and bounded context and event-driven architecture and microservices and pub/sub and all of this mess, and hopefully get you to a point where at least you understand what those things are, what they're good for, and how to make use of them on various projects.
- 00:06:33 Udi Dahan
- So, let's start and bring some order to that chaos that we just talked about. What is event sourcing anyway? Let's start with this. Who thinks they know what event sourcing is? Raise your hands. That's a loaded question, isn't it? Huh? I see the heads are kind of like, hmm, let me think about that actually. You wouldn't have given this big prologue if it was that clear cut. Now, I remember when I asked who's doing event sourcing at the beginning, lots of hands were like, "Yup, I am." And then at the who thinks they know what event sourcing is, it's I think so. All right.
- 00:07:22 Udi Dahan
- Generally speaking, what there seems to be broad agreement on is that event sourcing refers to a collection of patterns. It's not just one thing. All right. A collection of patterns based on the idea that you persist the full history of a domain as a sequence of events rather than just persisting the current state. All right. And that's sort of the big distinction from the way the traditional domain models were built when using object-relational mapping technology of sorts where historically what we've done is we've taken a domain model in its current state and persisted that current state, and then made changes to that current state and persisted that. All right. So that's the main difference that event sourcing is bringing in on top of the core domain model pattern.
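The distinction between persisting current state and persisting history can be sketched in a few lines. This is a minimal illustration, not something shown in the talk: the account domain, the `Event` type, and the event names `Deposited` and `Withdrawn` are all assumptions chosen for brevity.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    name: str   # what happened, in the past tense (hypothetical names)
    data: dict  # the arguments that describe it


def apply(balance: int, event: Event) -> int:
    """Compute the next state from the previous state and one event."""
    if event.name == "Deposited":
        return balance + event.data["amount"]
    if event.name == "Withdrawn":
        return balance - event.data["amount"]
    return balance  # unknown events leave the state unchanged


def replay(events: list[Event]) -> int:
    """Rebuild the current state by folding over the full history."""
    balance = 0
    for event in events:
        balance = apply(balance, event)
    return balance


# Event-sourced persistence: the history itself is what gets stored.
history = [
    Event("Deposited", {"amount": 100}),
    Event("Withdrawn", {"amount": 30}),
    Event("Deposited", {"amount": 5}),
]

# Current-state persistence would store only this one number.
current_balance = replay(history)
```

Replaying a prefix of `history` gives the state at an earlier point in time, which a store that keeps only the current state cannot do.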
- 00:08:22 Udi Dahan
- Let's talk about domain model patterns for a second. Who's used the domain model pattern before? I think I see some hands. All right. Who has not? Who has not used domain model pattern? Okay, less hands. So just a refresher, even for those of you that have, what the domain model pattern is. And again, it's one of those things that you think you understand it until you kind of go into the details. Domain model is an object model of the domain that incorporates both behavior and data. Fairly generic. So my guess is that a lot of you have a domain model. Even those of you that might not have heard of the domain model pattern, you probably have some kind of object model of your domain.
- 00:09:10 Udi Dahan
- Now, seeing as we're part of the DDD, domain-driven design conference, it's worth including some more detailed information from the book that actually penned the domain model pattern. First thing, when do you use it? Use it when you have a complicated and ever-changing set of business rules. Now, like every "where do you use it", there's a "where do you not use it". In that same book, page 119, it says if you have some simple not-null checks and a couple of sums to calculate, a transaction script is a better bet. What that means in short order is that the domain model is not, I repeat, not a best practice.
- 00:10:08 Udi Dahan
- You do not use it for everything. There's going to be a bunch of complex stuff in a given system and some not so complex stuff. This idea that all of your business logic is supposed to be in one domain model, well, that causes some problems that you might've heard about in various movies. The one domain model to rule them all and in the darkness bind them.
- 00:10:45 Udi Dahan
- Now, if you've built a system using domain-driven design and you have a big complex domain model, this is usually the reason why it's just putting too much stuff in that one place. And when people are going to do event sourcing these days, they're doing the same kind of thing. They're taking what essentially is a big monolithic domain model, and then they're saying, awesome, now we're going to make that event sourced. The pain that we're feeling with our one domain model to rule them all, when you put event sourcing in there, that doesn't solve the problem, it makes it worse. And that's where people end up in that trough of disillusionment. It's the expectation that event sourcing is going to solve the pains of the one big domain model.
- 00:11:43 Udi Dahan
- So hopefully as an industry, and I've seen speakers talk about this already saying you don't want to be doing monolithic architectures with event sourcing. All right? So having event sourcing in one big place, not a good idea. But that's true for all patterns. All right. So, now that we've put that caveat aside without saying, again, event sourcing as compared to traditional domain models, maybe we should start talking about what is an event in the first place. I mean, it's one of those words, kind of like service, whether microservice or service oriented architecture or domain service, that we tend to throw around a lot. But when it comes to an event, what do we even mean by that?
- 00:12:35 Udi Dahan
- Hopefully you'll humor me a little bit for some computer history because it's important to understand where things came from in order to understand what they are and what to do with them today. In the beginning of computers, we had this thing called an interrupt. An interrupt was a way to signal to the processor that something needed immediate attention. In the early days, computer programs were essentially static void main. You start at the beginning and you go one, two, three, four, five, six, seven, eight, nine, 10, and then you're done. That's how code was written. Interrupts came in to say, wait a minute, something happened that's important, we need to stop and take action on that.
- 00:13:28 Udi Dahan
- So things like somebody typing on the keyboard or doing a mouse click or running out of space on the hard drive or any number of things like that were hardware interrupts that essentially told the processor, stop what you're doing and handle this. That was essentially the first concept of events in computers as we had them. From that point in time, graphical user interfaces came along and they created slightly larger concepts of things like a button that could be clicked. At this point, we moved from just a hardware level interrupt to a more abstract software level concern saying, hey, we had a button-clicked event. Maybe we should do something about that.
- 00:14:16 Udi Dahan
- And this is where we went from just sort of the hardware level to the software level where we created a different programming construct. It wasn't a class, it wasn't a method, it wasn't a pointer. Events were a new kind of thing that were built on top of all of those existing pieces in order to help us build in this type of reactive fashion saying something happened, we need to handle it. So, when talking about these events, we need to distinguish. There is the event itself, the button clicked, and then we have the code that handles that thing.
- 00:14:56 Udi Dahan
- This is where programming actually started going in an interesting new direction because you could have multiple callbacks for a given event. This was a fairly new thing. You got to remember in the early days of programming what we had was structured programming where essentially you had one code flow that you would go all the way through. You'd have if then else, but this idea that multiple pieces of code could be triggered and that the calling code wouldn't necessarily know what it was was a brand new thing at the time. Now, there were people at the time that said that would create an unmaintainable mess. That not knowing what code is going to run next is going to make just big messes, developers won't know how to handle that. But here we are decades later and it's fine. It really is.
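The idea that several pieces of code can be triggered by one event, without the raising code knowing what they are, might look like the sketch below. The `Button` class and the two callbacks are hypothetical and only illustrate the shape of the mechanism.

```python
from typing import Callable


class Button:
    """A hypothetical GUI control that raises a 'clicked' event."""

    def __init__(self) -> None:
        self._on_clicked: list[Callable[[], None]] = []

    def subscribe(self, callback: Callable[[], None]) -> None:
        self._on_clicked.append(callback)

    def click(self) -> None:
        # The button has no idea what these callbacks do; it only
        # knows that something asked to be told about the click.
        for callback in self._on_clicked:
            callback()


log: list[str] = []
button = Button()
button.subscribe(lambda: log.append("save document"))
button.subscribe(lambda: log.append("update status bar"))
button.click()
```

In this sketch the callbacks run in registration order, but nothing in `Button.click` depends on what they are or how many exist.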
- 00:15:50 Udi Dahan
- So you got callbacks, methods on the classes with arguments, and this is where things started to get a little bit complex, because when we're talking about these arguments that are giving us additional data about the event, that's where we start to get into this sort of in-between state of saying, well, what is the event and what are the arguments and what are the callbacks? So again, in GUI type territory, in a mouse down type of event, you'd have a whole bunch of arguments. You'd have the index if you're clicking on like a tool tip or tab strip type thing, you'd have which button was clicked, was the shift button pressed? What's the X, Y coordinates of the mouse down event? You'd have a lot of this stuff. And essentially these were considered arguments. This was not the event, the event was mouse down. These were all the arguments that were passed along with that.
- 00:16:49 Udi Dahan
- A number of years later, we as an industry came up with this thing called the parameter object pattern which essentially said, all of those arguments are ugly. It makes it difficult to maintain. You end up having difficulty versioning your API because you end up with all sorts of overloads between the arguments. So essentially we said, let's take all of those arguments and package them up together into a single object. You add a little bit of strong typing and an IDE, and you get something fairly modern that looks like this, where the MouseEventArgs is your parameter object and the mouse move thing that you're seeing over there, that is the event. The mouse event handler is a strongly typed delegate or callback type pointer that will tell you that your method to handle that should have that specific signature.
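The slide itself is not part of this transcript. As a rough stand-in, here is a Python sketch of the same three pieces: a parameter object, a typed handler signature, and the event. It borrows the names `MouseEventArgs` and `MouseEventHandler` mentioned in the talk, but it is an analogy, not the actual framework API; the fields and the `Panel` class are assumptions.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class MouseEventArgs:
    """Parameter object: bundles what used to be separate arguments."""
    button: str
    shift: bool
    x: int
    y: int


# Every handler shares one signature: (sender, args).
MouseEventHandler = Callable[[object, MouseEventArgs], None]


class Panel:
    """Hypothetical control exposing a mouse-move event."""

    def __init__(self) -> None:
        self.mouse_move: list[MouseEventHandler] = []

    def raise_mouse_move(self, args: MouseEventArgs) -> None:
        for handler in self.mouse_move:
            handler(self, args)


seen: list[tuple[int, int]] = []


def track_position(sender: object, e: MouseEventArgs) -> None:
    seen.append((e.x, e.y))


panel = Panel()
panel.mouse_move.append(track_position)
panel.raise_mouse_move(MouseEventArgs(button="none", shift=False, x=10, y=20))
```

Adding a field to `MouseEventArgs` later leaves the handler signature untouched, which is the versioning benefit described above.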
- 00:17:59 Udi Dahan
- Essentially, we started to create some structure around these concepts of events. So, that's on the let's call it programming language side and computer history side of how are we getting to event-based programming. But now I want to come at it from a slightly different angle because there was this parallel thing happening outside of the GUIs called component orientation that made a big difference. You see, when building more complex software, not just simple GUI type interactions, we realized that just having big structured code with methods calling other methods calling other methods doesn't work. Leads to an unmaintainable mess.
- 00:18:50 Udi Dahan
- So the idea was if we create loosely coupled and highly cohesive components that are separate from each other, that we can evolve independently. That we have data hiding, we have encapsulation, all of those ideas from object orientation, but slightly bigger, that that is going to do good things for us. Off of the back of component orientation and that level of decoupling, when we included the event-based patterns that we were just talking about a minute before of saying, you know what, if we want truly loosely coupled interactions, calling a method on an interface, that's not bad. I mean, it's better than calling a method on a concrete class. But event-based interactions are even more loosely coupled than the request response style interactions. They're more flexible.
- 00:19:47 Udi Dahan
- And that's essentially the path that we took that brought us to event-driven architecture. So you'll often hear these two things used interchangeably, event-driven architectures and event sourcing, but they're not the same thing at all. Event-driven architecture essentially is component orientation plus an event-based interaction model. Please note there is nothing about event-driven architecture that indicates that it's in any way distributed across the network. I'm going to say that again. There's nothing about event-driven architecture that indicates that it requires or is based on distributed systems having multiple processes running in multiple places.
- 00:20:44 Udi Dahan
- The idea was very much from what we nowadays call a monolith from a deployment style. How do you decouple that so that it is more maintainable over time? It wasn't about scaling. Not at all. Not from like a performance scaling perspective. It did have to do with how can we have multiple teams working in parallel on the same code base, and loosely coupled, highly cohesive components with an event-based interaction were a better way of doing that.
- 00:21:19 Udi Dahan
- Now, when we started talking about event-driven architecture, that's when we started to engage in a little bit more languaging around this thing of events where the components that created, produced those events, we called those event producers. The components that handled the events were called event consumers. And there was some kind of mechanism for getting this thing called an event, with its arguments, from a producer to a consumer that was called a channel.
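A minimal in-process sketch of those three roles follows. The `Channel`, `Registration`, and `WelcomeMailer` names and the `CustomerSignedUp` event are hypothetical, and nothing here crosses a network, in line with the point that event-driven architecture does not imply distribution.

```python
from typing import Callable

Handler = Callable[[dict], None]


class Channel:
    """In-process channel: carries named events from producers to consumers."""

    def __init__(self) -> None:
        self._subscribers: dict[str, list[Handler]] = {}

    def subscribe(self, event_name: str, handler: Handler) -> None:
        self._subscribers.setdefault(event_name, []).append(handler)

    def publish(self, event_name: str, args: dict) -> None:
        for handler in list(self._subscribers.get(event_name, [])):
            handler(args)


class Registration:
    """Event producer (hypothetical component)."""

    def __init__(self, channel: Channel) -> None:
        self._channel = channel

    def sign_up(self, email: str) -> None:
        self._channel.publish("CustomerSignedUp", {"email": email})


class WelcomeMailer:
    """Event consumer (hypothetical component)."""

    def __init__(self, channel: Channel) -> None:
        self.sent: list[str] = []
        channel.subscribe("CustomerSignedUp", self._on_signed_up)

    def _on_signed_up(self, args: dict) -> None:
        self.sent.append(args["email"])


channel = Channel()
mailer = WelcomeMailer(channel)
Registration(channel).sign_up("ada@example.com")
```

The producer and consumer never reference each other; both depend only on the channel and on the event name.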
- 00:21:58 Udi Dahan
- Now, when you have everything running in process, it's not really that big a deal to figure out how to have these channels that are communicating with each other. But in the early days of event-driven architecture, things like how do you create a thread safe event dispatching model, that was a tricky thing. It didn't just work. Nowadays we kind of take that for granted. Oh, absolutely, you just raise an event and lots of consumers get callbacks and it just works. But it didn't start there.
- 00:22:32 Udi Dahan
- Now, one of the things that I see happening a lot, unfortunately still, is that people get hung up on these terms and consider a component either an event producer or an event consumer, but not really both. It is common to have event producers also consume events and vice versa. It almost always happens that you'll have some code that receives an event, does something, and produces other events. So in that case, the language of event producers and event consumers is not really that useful architecturally speaking. Essentially in an event-driven architecture, everything consumes events and everything produces events. And this is hopefully where you're seeing where that's going to lead us over time towards an event sourcing mentality. But it started first from that element of loosely coupled, highly cohesive components.
- 00:23:43 Udi Dahan
- Now, once you start event-driven architecture based systems, that's one thing. When you have an organization that has a whole bunch of systems lying around and you need to integrate between them, that's where you get the distributed worlds and the event-driven worlds start getting shoved together. Integration in the early days was hard. You had applications on different platforms. This was before JSON, this was before XML. This is when we were for the most part using CSVs for everything unfortunately. Who's still got CSVs in their systems? Yeah. Technology, I got to tell you, it just lives and lives and lives. And you're like die already, die. I don't need this anymore. And it just doesn't. It refuses to go away. And that's part of the challenges of integration.
- 00:24:40 Udi Dahan
- Now, messaging technology, which was originally called message-oriented middleware, was designed to make that easier. To cross all those boundaries to simplify the integration between platforms. Now, when you had an event-driven system that was built as a bunch of loosely coupled components, integration with that thing ended up being a little bit simpler because those events were really good candidates for passing to other systems and saying, hey, something significant happened in system A. We know because an event was published in the event-driven architecture. Let's take that information, put that into a message by taking the name of the event and putting that in the header of the message, taking the arguments and putting that in the body of the message, and then route that to some other system that's going to be interested.
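The mapping described above, with the event name in the header and the arguments in the body, can be sketched as follows. The `event-type` header key and the JSON body encoding are assumptions made for readability; the talk notes that early integrations predate JSON, and real messaging products define their own envelope formats.

```python
import json


def to_message(event_name: str, args: dict) -> dict:
    """Wrap an in-process event as a message for another system."""
    return {
        "headers": {"event-type": event_name},     # name of the event
        "body": json.dumps(args, sort_keys=True),  # arguments as payload
    }


def from_message(message: dict) -> tuple[str, dict]:
    """Recover the event name and arguments on the receiving side."""
    return message["headers"]["event-type"], json.loads(message["body"])


message = to_message("OrderAccepted", {"order_id": "o-1", "total": 250})
name, args = from_message(message)
```

Once bundled this way, the name and the arguments travel together as one unit, which matches the more modern sense of the word "event".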
- 00:25:41 Udi Dahan
- Now, this is where things kind of shifted in terms of event-based programming to this more modern concept of what we mean by an event. Because for the most part, we distinguish between the event itself, mouse move, mouse down, whatever, and the arguments as a payload. With messaging, because we couldn't really do the sort of transparent, just different platforms publishing and subscribing in a transparent manner, we required a message to bundle those things up together more explicitly.
- 00:26:23 Udi Dahan
- Now, one other small piece of history. In event-driven architecture, as it's done in process, almost all of the routing that you see, all of the subscribing and registering to events, is done based on the type of the event, essentially the name of the event. You don't really see code that's saying, I want to subscribe to the mouse move event only when the X-Y coordinate is within this space, right? You can't really do that kind of GUI programming. But when people started to do messaging-based integration, that's when this concept of content-based routing started to creep in of saying, I'm not only interested in the type of the event, I want to actually deal with things that are in the payload itself.
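The difference between routing on the type of the event and routing on its content can be shown with two kinds of subscription predicate. This is an illustrative sketch; the `Router` class, the `OrderAccepted` event, and the `total` field are hypothetical.

```python
from typing import Callable

Event = dict  # shape assumed here: {"type": str, "payload": dict}
Predicate = Callable[[Event], bool]


def by_type(event_type: str) -> Predicate:
    """Type-based routing: only the name of the event is inspected."""
    return lambda event: event["type"] == event_type


def by_content(event_type: str, condition: Callable[[dict], bool]) -> Predicate:
    """Content-based routing: the payload is inspected as well."""
    return lambda event: event["type"] == event_type and condition(event["payload"])


class Router:
    def __init__(self) -> None:
        self._routes: list[tuple[Predicate, list]] = []

    def subscribe(self, predicate: Predicate, inbox: list) -> None:
        self._routes.append((predicate, inbox))

    def route(self, event: Event) -> None:
        for predicate, inbox in self._routes:
            if predicate(event):
                inbox.append(event)


all_orders: list = []
large_orders: list = []

router = Router()
router.subscribe(by_type("OrderAccepted"), all_orders)
router.subscribe(by_content("OrderAccepted", lambda p: p["total"] > 1000), large_orders)

router.route({"type": "OrderAccepted", "payload": {"total": 50}})
router.route({"type": "OrderAccepted", "payload": {"total": 5000}})
```

With content-based routing the router has to understand the payload, which is a tighter form of coupling than subscribing by name alone.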
- 00:27:20 Udi Dahan
- That was not originally part of event-driven architecture. Not at all. It's only later on when doing message-based integration that this concept of looking at the payload for where to route things started to become relevant. So it's important to realize how things happened, what the history was, in terms of where that leads us today. So, where are we today? When we say the term event, essentially what we're talking about is not just the type, but it's that name plus the arguments. Essentially it's a data structure that has some kind of name that tells you what happened, and a whole bunch of key values that will tell you this is the data of what that thing happened.
- 00:28:11 Udi Dahan
- Now, the important part is that it's about an action that's already occurred. In a GUI type setup, button click, mouse move, all that stuff is easy. It's when you move into loosely coupled highly cohesive components territory, when you're doing event-driven architecture at a business level, that talking about what happened becomes a little bit tricky. So, we'll talk more about this. What are these events? What are the business events? What is the distinction between business events and other types of events, because that's where most event sourcing goes wrong.
- 00:28:56 Udi Dahan
- It's essentially saying an event is an event is an event. Now, again, intuitively we understand the difference between a button clicked event and some sort of higher level business event like signed up a new customer or a customer bought an item. But in between there, there's a lot of gray that we're going to talk about. Almost in all cases you will have some kind of infrastructure that's going to move these events around. Historically that could have been a messaging-based type infrastructure for passing those things between systems, dealing with transactions and consistency and all that kind of stuff. Or in an event sourcing type way, the act of persisting a stream of events and bringing that stream of events back and doing snapshotting. There's going to be some kind of infrastructure that's helping you with that.
- 00:29:52 Udi Dahan
- But here, back to the where we are today, before using event sourcing on a complex system, stop a second. All right. Again, there's that whole hype thing of we're going to use event sourcing and it's going to solve all of our problems. Just stop right there. Remember what we said about that domain model pattern, that monolithic style of putting all of your logic in one place. If you see yourself heading in that direction, making that mistake, that's where you need to stop. Again, even before you're doing event sourcing, but also I'd say before you're doing DDD or domain models or that kind of thing.
- 00:30:36 Udi Dahan
- So, if you're building a complex system, my guess is that sort of at an abstract level you have this kind of set of layering in your mind. You have some kind of user interface invoking some kind of API, usually remote, which will invoke some kind of business logic, which will ultimately influence data that gets persisted in some database. Your thinking, and most of the thinking in the domain-driven design community, is that business logic layer right there is your domain model. I have news for you. It is not. Why? Because of that one ring fallacy. In order to comply with our principles of what actually a domain model is, event sourced or otherwise, you need to take your original idea of what a model of your domain is that has a bunch of entities like customers and orders and products and employees and those types of things, this entity relationship diagram, and you need to decompose that into something less monolithic.
- 00:31:52 Udi Dahan
- So, how do you do that? Well, it's hard but I'm going to give you some ideas. Remember what we mentioned about domain models. You use them when you have complicated and ever-changing rules, but not when you have just some simple not-null checks and sums to calculate. So let's look at let's say a given entity like customer. Customer has a first name. Is that a complicated and ever-changing rule? Answer is, well, not really. I mean, we get some logic that says the name can be 40 characters, it can be 50 characters. I think it changes a little bit for a while, but then it just sort of settles down and we never change it again. So complicated and ever-changing, not really. Customer has a last name. Also same category.
- 00:32:57 Udi Dahan
- What about the name of a product? The description of a product, the image of a product. A lot of the data that users see and interact with is not really that complicated. Its definitions don't change that much. By virtue of the definition of the domain model pattern, you don't need domain models for that data. So you start separating those things out. If we start looking at things like a product price or a customer status, those are things that over the lifetime of a system, we'll see that the business changes them quite a lot.
- 00:33:51 Udi Dahan
- It says, "Okay, we're going to start off with preferred customers and non-preferred customers." Great. And then a year into the system, the business says, "You know what? We're changing that. We're doing gold, silver, bronze customers. Gold customers get this, silver that," and you're like, okay, great, we're going to change that behavior over that. Then they say, "Okay, great. But we've got this other category of strategic customers." So strategic customers are one thing, and then you've got gold, silver, bronze, which is another thing. And you'll see that type of behavior and data changing repeatedly, never really settles down.
- 00:34:28 Udi Dahan
- Also things like how much of a discount does a certain status of customer get on a product? Well, if the product is this price, then they get that price, so they get this kind of discount. You start seeing a bunch of the logic changing together between the customer and the product, but it requires time to sit down with your domain experts and kind of sift through what things really change together versus what things change separately.
- 00:34:59 Udi Dahan
- So saying, hey, we're going to take the customer first name, last name, phone number, that kind of stuff, we're going to put that in one domain. We're going to take the product names, descriptions, images, we're going to put that somewhere else. We're going to take customer statuses and product prices and the logic around discounting, we're going to put that somewhere else. This is the first act of finding your boundaries. And from there we can start asking the question and saying, which areas are complicated and ever-changing? And we'd say, "Hey, that customer first name, last name thing, it really is just taking some data from the UI, shoving it into a database and showing it again."
- 00:35:40 Udi Dahan
- As it said in the patterns of enterprise application architecture, a transaction script is a better bet. You don't need a domain model for that. Domain model pattern wasn't designed for that. Product names and descriptions. Essentially you've got a product catalog. Again, you're taking a bunch of data from the UI in the back office, storing it, and then showing it to users in the front office. Data in, data out, it's what we'd call CRUD. Again, you don't need a domain model for that, event sourced or otherwise. Only when you uncover the sub domain that is that kind of complicated and ever changing, that is where you're saying, okay, that might be a place to use the domain model pattern.
- 00:36:29 Udi Dahan
- So essentially instead of that simple layered architecture that we saw before, we start subdividing that and saying, hey, it's not just one set of layers, essentially there are these slices, if you will, that kind of start at that domain level going down in the sense of, well, I don't actually need the customer's first name and last name in the same table as their status information because really those things vary independently of each other. From a single responsibility perspective, they have different reasons to change. So when we think about it, it's no longer just one system. Essentially it's these sets of boundaries where not only going down but also going up we can subdivide them.
- 00:37:26 Udi Dahan
- So this idea of what does a product look like? I mean, on the screen it looks something like this, and it has a bunch of information like the product name, the description, the image, the author, the price, the inventory, but essentially that gets subdivided as well. That the product catalog information: the name, the image, the description belongs in one boundary. The rating of that book belongs in another boundary. Essentially all that's shared between these two things is a product ID. In both cases we've got product ID and then a bunch of data. And then in another one we've got same product ID and a bunch of other data. And then that price, product ID and how did we come up with the price? Inventory and the product ID.
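One way to picture that split: each boundary keeps its own data keyed by the same product ID, and the screen is composed from all of them. The dictionaries, field names, and values below are placeholders, not data from the talk.

```python
# Each dict stands in for the private data store of one boundary.
# The only thing the boundaries share is the product ID used as the key.
catalog = {"book-1": {"name": "Example Book", "description": "Placeholder text", "image": "cover.png"}}
ratings = {"book-1": {"stars": 4}}
pricing = {"book-1": {"price_cents": 2500}}
inventory = {"book-1": {"in_stock": 12}}


def product_page(product_id: str) -> dict:
    """Compose what the user sees from every boundary's own data."""
    page = {"product_id": product_id}
    for boundary in (catalog, ratings, pricing, inventory):
        page.update(boundary[product_id])
    return page


page = product_page("book-1")
```

No single store holds "the product". Changing how prices are computed touches only `pricing`, and the catalog data is unaffected.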
- 00:38:20 Udi Dahan
- Essentially we end up with what nowadays is called microservices where each microservice has its own data. But notice that it's not like we have a product microservice that owns everything about a product, or we have a customer microservice that owns everything about a customer. When looking for boundaries, you'll usually find them going right down the middle of your entities where different parts end up in different boundaries. So essentially instead of thinking about this in terms of layers, what would happen if we took those vertical slices and said, you know what, those are our primary architectural concerns.
- 00:39:12 Udi Dahan
- Essentially we've got these boundaries: boundaries one, two, and three. Each of them can be built differently. They don't all need to have the same layers. One of them could be fairly simple, based on a document database, with a UI and some basic validation logic as separate components, but that's it. Another one, because of scaling reasons, dealing with a lot more data, needs to have a sharding-type database. So it has some different logic for physical scaling out. Yet a third one has more of an event sourcing nature to it, therefore we use an event store underneath it in order to support that.
- 00:39:59 Udi Dahan
- But here is where I want to come back to that concept of events. We have events passing between these higher level boundaries in a publish/subscribe type manner, and then we've got things happening inside the boundaries which may or may not be event oriented. Essentially it's a question of events on the outside versus events on the inside. Now, some of you might remember, probably about 15, 20 years ago, a paper was written called Data on the Inside Versus Data on the Outside. Really great paper. If you haven't read it, I recommend it strongly. This is the exact same idea. Boundaries matter. Things on the outside need to be treated differently from things on the inside.
- 00:40:54 Udi Dahan
- So when thinking about high level business events, what do we mean by that? Well, let's take an example of a kind of retail flow where we have these loosely coupled highly cohesive components interacting in an event-based manner, meaning we're doing event-driven architecture where one boundary publishes the order accepted event. Multiple subscribers listen to that. One of them takes action on it, publishes another event saying order billed. Shipping listens to both of those things and then decides to take some action on that. Here we're talking about stable business facts and not so much things at an internal level.
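
The retail flow above can be sketched with a tiny in-memory publish/subscribe bus. This is only an illustration under stated assumptions: the `Bus` class, the event names `OrderAccepted` and `OrderBilled`, and the rule that shipping acts once it has seen both events are hypothetical stand-ins. A real system would use messaging infrastructure rather than synchronous in-process calls.

```python
from collections import defaultdict


class Bus:
    """Minimal in-memory publish/subscribe bus (illustrative only)."""

    def __init__(self):
        self._subscribers = defaultdict(list)
        self.published = []  # record of everything that was published, in order

    def subscribe(self, event_type, handler):
        self._subscribers[event_type].append(handler)

    def publish(self, event_type, payload):
        self.published.append((event_type, payload))
        for handler in list(self._subscribers[event_type]):
            handler(payload)


bus = Bus()


def billing_on_order_accepted(event):
    # Billing reacts to the accepted order and publishes its own business fact.
    bus.publish("OrderBilled", {"order_id": event["order_id"]})


class Shipping:
    """Listens to both events and ships once it has seen both for an order."""

    def __init__(self):
        self.accepted, self.billed, self.shipped = set(), set(), []

    def on_accepted(self, event):
        self.accepted.add(event["order_id"])
        self._maybe_ship(event["order_id"])

    def on_billed(self, event):
        self.billed.add(event["order_id"])
        self._maybe_ship(event["order_id"])

    def _maybe_ship(self, order_id):
        ready = order_id in self.accepted and order_id in self.billed
        if ready and order_id not in self.shipped:
            self.shipped.append(order_id)


shipping = Shipping()
bus.subscribe("OrderAccepted", billing_on_order_accepted)
bus.subscribe("OrderAccepted", shipping.on_accepted)
bus.subscribe("OrderBilled", shipping.on_billed)

bus.publish("OrderAccepted", {"order_id": "o-1"})
```

Shipping does not depend on the order in which the two events arrive, which is what lets the publishers stay unaware of their subscribers.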
- 00:41:44 Udi Dahan
- So, if you're thinking of a domain that has a more create, read, update, delete relationship to its logic and data, so entity created, entity updated, entity updated, entity updated, entity deleted, those might be internal events that happen in one of those boundaries, but those are not the higher level business events that we're talking about. When thinking about this level of events, the concept is it's something that happened that is not planned to unhappen or to be overwritten. A stable business fact.
- 00:42:31 Udi Dahan
- Now, here I want to talk a little bit about terms or patterns that you've probably heard about in the context of event sourcing; auditing and replay. When talking about auditing events, essentially what we're doing is we're creating a copy of the events that were passed around and storing them somewhere. Almost all messaging infrastructure already provides this capability out of the box. So if you have this type of flow and you're saying, hey, in my business it's important to have a tracking of all of the events that happened, that can be done without doing event sourcing. All right?
- 00:43:23 Udi Dahan
- Event auditing is an independent pattern. A lot of times people, again, lump that together because they think of event sourcing as one thing. Remember our definition, event sourcing is a collection of patterns. It's not just one thing. Event auditing is one of those elements that allows us to essentially record stuff that's going on and it's very useful in a number of systems. Once you have all of these events audited, then you can replay those events if you want to.
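
Auditing as an independent pattern can be sketched without any event store. The snippet below is a hypothetical illustration: `AuditLog`, `publish`, and `replay` are names invented for the example, and in practice the messaging infrastructure would usually take the audit copy for you through configuration rather than through code like this.

```python
import json


class AuditLog:
    """Keeps a serialized copy of every event it is shown."""

    def __init__(self):
        self.records = []

    def record(self, event_type, payload):
        self.records.append(json.dumps({"type": event_type, "payload": payload}))


def publish(event_type, payload, handlers, audit):
    """Store an audit copy, then deliver the event to the normal handlers."""
    audit.record(event_type, payload)
    for handler in handlers:
        handler(payload)


def replay(audit, handler):
    """Feed the audited events, in their original order, to a handler."""
    for record in audit.records:
        event = json.loads(record)
        handler(event["type"], event["payload"])


audit = AuditLog()
delivered = []
publish("OrderAccepted", {"order_id": "o-1"}, [delivered.append], audit)
publish("OrderBilled", {"order_id": "o-1"}, [delivered.append], audit)

replayed = []
replay(audit, lambda event_type, payload: replayed.append((event_type, payload)))
```

Nothing here derives state from the events; the audit trail simply sits beside whatever persistence each boundary already uses.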
- 00:43:59 Udi Dahan
- Now, at the business event level, to be totally honest with you, there are so many challenges with event replay. I got to tell you, if you're integrating with another system that's not event sourced, you have to distinguish "now we're handling real-time events" from "now we're replaying historical events", and make sure that you don't pass those events along, but these ones you do. But if you're correcting an event, then pass that one, but make sure that you don't do that when you're in the replay of the correction mode. It's just too hard, right?
- 00:44:39 Udi Dahan
- People twist themselves up into knots saying if we're doing event sourcing, we must do event replay. And then, well, we're integrating with another system because we have to, how do we make that work, under the presumption that they must do event replay all the time. At the level of these sort of high level business events, really I'd say 99% of time it's just not worth it. There are some special edge cases where that might be interesting and some of you might have certain areas where that's happening. But again, keep those in a separate boundary than everything else. Events on the outside versus events on the inside.
- 00:45:29 Udi Dahan
- So, now that we've talked a bit about the events on the outside, I want to talk about the events on the inside because again, as an industry, we do a pretty bad job at naming things. We use events for everything. One of the places that events make a bunch of sense is when we want to notify a user, let's say, that their order was shipped. Now, you might be thinking, is that an event on the inside or an event on the outside? There might be relevance at a higher level business event to have order shipped, but definitely on the inside to say, hey, when something happens inside shipping, we're going to use a signal. We're going to be using Google Cloud Messaging, we're going to be using Azure Notification Services, whatever.
- 00:46:26 Udi Dahan
- In order to push a notification to our user, a lot of times developers use the word event when they mean notification. Try to be specific about that. Notifications are things that are most relevant on the inside. Within a specific bounded context of pushing data from one physical location to another, that's a notification. Now, that might be driven off of some higher level business occurrence, like the order was actually shipped, but distinguish the technical part of moving the notification between physical locations from the business event itself. So that's notifications.
- 00:47:19 Udi Dahan
- Scenario number two that often gets bundled in is related to integration between systems. Inside billing we've got our retail system, but the organization's got a data warehouse, it's got a big SAP ERP system. The guys in finance are using an Oracle finance type thing, and we need to get things that happened inside of our system into those other places. Again, this is a situation where a lot of times people say, "I'm going to use event sourcing for that." But remember what I said, one of the big challenges with event sourcing is the replay part when integrating with other systems.
- 00:48:05 Udi Dahan
- If a big part of your domain is integrating with other systems, there are other tools for that. Change data capture is an infrastructure capability that many databases have that will allow you to essentially stream out the changes from a given data source to other places. This is a kind of data synchronization technology, change data capture. Add some ETL on top of that for transforming the data from one format to those other formats. And this is just integration 101. We've been doing this forever. It works, again, 95 to 99% of the time. You don't need to build event-sourced domain models that replay themselves in history and all that kind of stuff in order to solve this problem. This is mostly a solved problem already.
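
As a rough sketch of the change-data-capture-plus-ETL idea, assume the database hands us a stream of row-level changes. The change format, the `transform` step, and the `warehouse` target below are all hypothetical and not tied to any particular CDC product.

```python
def transform(row):
    """ETL step: reshape a source row into the (assumed) warehouse format."""
    return {"invoice_id": row["id"], "total": row["total_cents"] / 100}


def sync(changes, target):
    """Apply a stream of captured row changes to a downstream copy."""
    for change in changes:
        row = change["row"]
        if change["op"] == "delete":
            target.pop(row["id"], None)
        else:  # inserts and updates are both treated as upserts
            target[row["id"]] = transform(row)
    return target


# Hypothetical change stream, in commit order.
changes = [
    {"op": "insert", "row": {"id": 1, "total_cents": 1250}},
    {"op": "insert", "row": {"id": 2, "total_cents": 900}},
    {"op": "update", "row": {"id": 1, "total_cents": 2000}},
    {"op": "delete", "row": {"id": 2}},
]

warehouse = sync(changes, {})
```

The downstream copy follows the source without the source's domain model knowing anything about events or replay.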
- 00:49:16 Udi Dahan
- One other case that often gets talked about saying, we need event sourcing to scale. Hammering the same database for both our reads and our writes, it just doesn't scale for us. We need to have event sourced projections, creating read models somewhere else so that then we can read off of those things fast. For example, building a product catalog, we need to be able to serve up lots and lots and lots of queries really fast. But as we mentioned before, the domain itself is not very complicated and ever-changing, specifically the part of showing images and names and descriptions of products on a screen. It's CRUD. And I don't say that in a way to sort of belittle it.
- 00:50:12 Udi Dahan
- This is fundamentally a data problem. And you know what, databases are pretty good at solving these types of data problems. They have this concept called replication. Pretty much every database out there has it where you write changes to a write master node and it asynchronously replicates those changes to read replicas, as many of them as you want. Essentially you have an event-driven model at the data level itself that already gives you asynchronously updated read models. It just isn't event sourcing, it's database level replication.
- 00:51:03 Udi Dahan
- It's infrastructure that happens behind the scenes. You just open up the connections via your database client and say, "Hey Postgres, I'd like to read something." It says, "Hey, I'm going to give you a connection to one of the read replicas." "Hey Postgres, I want to write something." "Here's a connection to the write master." And then you do SQL. It's really not hard. A lot of this stuff, when we say I'm going to use event sourcing for that, it's not because we have to, it's because we're just not familiar enough with the alternatives.
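
The routing idea can be sketched as follows. This is not a real Postgres or driver API: `ConnectionRouter` and the node names are assumptions made for illustration, and how reads actually reach replicas depends on your driver, proxy, or hosting setup.

```python
import itertools


class ConnectionRouter:
    """Hypothetical router: writes go to the primary, reads rotate over replicas."""

    def __init__(self, primary, replicas):
        self.primary = primary
        self._replicas = itertools.cycle(replicas)

    def for_write(self):
        return self.primary

    def for_read(self):
        return next(self._replicas)


router = ConnectionRouter("primary", ["replica-1", "replica-2"])
write_target = router.for_write()
read_targets = [router.for_read() for _ in range(3)]
```

The application code stays plain SQL against whichever node it is handed; the asynchronously updated read copies come from the database's replication, not from projections you write yourself.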
- 00:51:42 Udi Dahan
- Now again, event sourcing has its uses. We'll get to that. But it's important to understand that there's a whole bunch of these sub domains where there are simpler and better ways of tackling them. I'm going to use one other one that's also not really that event based, but a lot of times it gets modeled using event sourcing. For example, in the sales domain, customers who bought product A also bought products B, C, D, and E. Essentially we could describe that as a series of events where we're recording every purchase that a customer does and then we're building projections off of that to help create these types of recommendations per user to say, what should we recommend to user number one versus user number two versus user number three?
- 00:52:37 Udi Dahan
- Essentially what I've seen unfortunately a few too many times is people using event sourcing to build what is essentially a graph database. Because that question of saying customers who did X also did Y and Z and all of these relationships between them, that's what graph databases are designed to do. So questions like figuring out connections and strengths of connections and ordering them based on each other, this is like the whole space of graph databases. So a lot of the event sourcing stuff that you might think of doing in this domain is, in many cases, already inside the graph database. You don't need to build that yourself. The database technology provides those things for you.
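
To show why this is a graph-shaped question, here is a small in-memory sketch that counts co-purchase edges. The customers, products, and the `also_bought` helper are invented for the example; a graph database would store these relationships and answer the query for you.

```python
from collections import Counter, defaultdict
from itertools import permutations

# Hypothetical purchase data: customer -> set of products bought.
purchases = {
    "alice": {"A", "B", "C"},
    "bob": {"A", "B"},
    "carol": {"A", "D"},
}


def also_bought(purchases):
    """Build weighted edges: edges[x][y] = number of customers who bought both x and y."""
    edges = defaultdict(Counter)
    for basket in purchases.values():
        for x, y in permutations(basket, 2):
            edges[x][y] += 1
    return edges


edges = also_bought(purchases)
top_for_a = edges["A"].most_common(1)
```

The "strength of connection" is just the edge weight, which is exactly the kind of relationship a graph database is built to store and rank.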
- 00:53:33 Udi Dahan
- I'm going to give you one more example. Not so much a business domain, but a technical domain, monitoring. My guess is that a lot of you need to monitor your distributed systems and you've got lots and lots of events that are coming through. You need to do event logging, you need to do event tracing. Just the scale is enormous. You're getting more and more events and as you add more systems, just the sheer volume ends up getting to a place where regular databases don't handle it and you're saying, "Ha, now I'm going to do event sourcing because I've got events and I'm appending them and here I don't even necessarily need to do sort of update type of events." But essentially, I'm sorry.
- 00:54:20 Udi Dahan
- There is another category of database technology, it's called time series databases. That is essentially what they do. You put data into them that is timestamped in this type of manner. And there are a number of examples of that. InfluxDB is I'd say probably the better known one. Netflix has built their own because, well, that's what Netflix does, and they call theirs Atlas. And there's all sorts of differences between them. Atlas is a purely in-memory thing, Influx is replicated and it has some durability to it.
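
The core of what a time series store does can be sketched in a few lines. This toy `TimeSeries` class is an assumption for illustration only; it has none of the compression, retention, or replication that real products such as InfluxDB provide.

```python
import bisect


class TimeSeries:
    """Toy append-mostly store of (timestamp, value) points kept in time order."""

    def __init__(self):
        self._timestamps = []
        self._values = []

    def append(self, timestamp, value):
        # Insert in timestamp order so that range queries can use binary search.
        index = bisect.bisect_right(self._timestamps, timestamp)
        self._timestamps.insert(index, timestamp)
        self._values.insert(index, value)

    def range(self, start, end):
        """Return all points with start <= timestamp <= end."""
        lo = bisect.bisect_left(self._timestamps, start)
        hi = bisect.bisect_right(self._timestamps, end)
        return list(zip(self._timestamps[lo:hi], self._values[lo:hi]))


cpu = TimeSeries()
cpu.append(10, 0.5)
cpu.append(30, 0.9)
cpu.append(20, 0.7)  # a late-arriving point still lands in time order

window = cpu.range(15, 30)
```

If your monitoring code starts to look like this, that is a sign you are writing infrastructure rather than domain logic.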
- 00:54:58 Udi Dahan
- But essentially when I see people saying, I'm going to go build an event sourced monitoring type thing, what they're doing is building their own time series database without even realizing it and having that intermingled with the actual logic of the domain. So again, part of the challenges around people doing event sourcing is that they look at anything that looks remotely event-driven and saying, I'm going to event source that thing, without taking the time to do a deeper analysis of the nature of the domain. What part of it is complicated and ever-changing? What parts aren't? What kind of persistence technologies out there suit them?
- 00:55:48 Udi Dahan
- I got to tell you, I mean, some of you are going to be looking at this and saying, "Yes, but Influx, I tried that. It didn't scale for me. Atlas, well, it's not durable. So I have to build my own thing." Yes, at scale, everything becomes custom. When you're Netflix, yeah, you're going to write everything custom yourself. When you're Amazon, you're going to write everything custom yourself. The point is to distinguish when you are writing infrastructure versus when you are dealing with the domain itself. And hopefully through some of these examples, I've given you ideas of various types of infrastructure that are out there, the mapping to domain type of examples, and essentially that element of saying, you know what, just because something has events related to the domain doesn't mean I'm going to go do event sourcing for that.
- 00:56:51 Udi Dahan
- So here's the question. What is event sourcing actually for? We're having a whole conference on it, what's the point? There are specific cases, like I said, in the 1% to 5% of the cases where you're essentially dealing with a complicated ever-changing domain where there are sort of two main criteria. First is it's bi-temporal, meaning there is the time at which a thing happened as a transaction time. And then there is a validity time. If your domain experts are always talking in the language of this is valid as of or effective as of, that's an indication that you have some bi-temporality to your domain.
- 00:57:42 Udi Dahan
- Some domains have more points of time. There is a valid from, there's an effective from, there's an effective to, there is a corrected as of. If that's a core part of your domain, event sourcing could be a good fit for that. However, this is the important thing, code as data becomes an important part of that. Once your data is as of, effective of, and all those types of things, you also need to be able to run rules as they were at a point in time; because that's the challenge with long lived complex systems. The code that was running V1 is different from the code that's running V2, which is different from the code that's running V3, which is different than the code that's running V4.
- 00:58:32 Udi Dahan
- If you're going to be replaying history with all of this history of events, you can't replay that history with the new rules. Essentially you need to replay history most of the time as it happened, meaning being able to persist this was my logic as of a certain point of time, this was my logic at another point in time. So essentially your code is not just a whole bunch of if then else statements and loops that you're writing in a domain model. Your code is data as well. This is complex. This is not something that you should just sort of say, "Oh, absolutely. The domain expert used the word event, we're going to go do event sourcing." In order to apply event sourcing patterns to a domain, this is essentially the bar. And while you'll be doing that in maybe one sub domain, in most you won't need to.
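
A minimal sketch of "code as data" in a bi-temporal setting might look like this. Everything in it is hypothetical: the `Event` fields, the fee rules, and the choice to select the rule by the event's effective date rather than its recorded date are assumptions for illustration.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Event:
    amount_cents: int
    effective_on: date  # validity time: when the fact applies in the business
    recorded_on: date   # transaction time: when the system learned about it


# Rules stored as data, each with the date it took effect (sorted by date).
RULES = [
    (date(2019, 1, 1), lambda amount_cents: amount_cents * 10 // 100),  # V1: 10% fee
    (date(2020, 1, 1), lambda amount_cents: amount_cents * 15 // 100),  # V2: 15% fee
]


def rule_as_of(day):
    """Return the rule version that was in force on the given day."""
    applicable = [rule for starts_on, rule in RULES if starts_on <= day]
    if not applicable:
        raise LookupError(f"no rule in force on {day}")
    return applicable[-1]


def replay(events):
    """Replay history with the rules as they were, not with today's rules."""
    return [rule_as_of(event.effective_on)(event.amount_cents) for event in events]


events = [
    Event(10_000, effective_on=date(2019, 6, 1), recorded_on=date(2020, 2, 1)),
    Event(10_000, effective_on=date(2020, 3, 1), recorded_on=date(2020, 3, 2)),
]
fees = replay(events)
```

The first event was recorded after V2 shipped but is still evaluated under V1, because that was the rule in force on its effective date. Keeping every rule version addressable like this is the extra cost that replaying history imposes.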
- 00:59:39 Udi Dahan
- There is one other category where it's not really domain oriented where event sourcing is used a lot. And this is what we were talking about before. Event sourcing is for building infrastructure. The folks that are building InfluxDB, you bet they're doing event sourcing. A lot of the technical patterns for saying, hey, I want to bring up a new node in the cluster, I'm going to take the history of the transaction log and replay that in memory using snapshots to make the thing faster, are absolutely relevant when building a kind of database technology.
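
The snapshot-plus-replay technique mentioned here can be sketched as below. The log format (key/value writes), `apply_entry`, and `recover` are assumptions made for the example, not how any particular database implements recovery.

```python
def apply_entry(state, entry):
    """Apply one log entry (a key/value write) and return the new state."""
    key, value = entry
    new_state = dict(state)
    new_state[key] = value
    return new_state


def recover(snapshot, snapshot_index, log):
    """Rebuild state from a snapshot plus only the log entries written after it."""
    state = dict(snapshot)
    for entry in log[snapshot_index:]:
        state = apply_entry(state, entry)
    return state


log = [("a", 1), ("b", 2), ("a", 3), ("c", 4)]

# A full replay from an empty state...
full = recover({}, 0, log)

# ...gives the same result as starting from a snapshot taken after two entries.
snapshot = recover({}, 0, log[:2])
from_snapshot = recover(snapshot, 2, log)
```

A new node only needs the latest snapshot and the tail of the log, which is why the snapshot makes bringing it up faster.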
- 01:00:21 Udi Dahan
- The folks that are building Neo4j, a graph database, you bet are using event sourcing patterns left, right and center. But it is mostly in the domain of infrastructure that we do these types of things, much less in the actual business domains themselves. So, when do you do that? When you're at Netflix scale, when you're at Amazon scale, when you're at Google scale, when you're doing that type of thing where all of the existing technologies out there just don't cut it. Okay, great. When you have the time, when you have the resources, when you have the budgets, when you have people who've actually built things that are this technically complex at scale, then yes, absolutely that's where you do event sourcing.
- 01:01:14 Udi Dahan
- But for those of you that are not at Netflix, Amazon, Google scale, I want to just shine a light briefly on sort of the, what can you do? So, when you're not at that super-duper level of scale, Moore's law is your friend folks. I got to tell you, you can scale up far beyond anything that you've imagined. So, I looked this morning, the biggest Amazon EC2 instance that you can buy has 128 virtual CPUs and almost four terabytes of memory. It's a beast. You're thinking, oh my, that could probably run all of my system and then some, but it's probably too expensive. That's people's sort of general assumption about scaling up, too expensive.
- 01:02:20 Udi Dahan
- If you were to look at the price of a reserved instance for a three-year term saying, "This is an important system. I know I'm going to have to run it for at least three years. Amazon give me a price. How much does that cost?" My guess is those of you thinking, it's going to be an arm and a leg. It costs $6,250 a month. Nothing upfront. If you pay a little bit upfront, it costs a little bit less than that. $6,200 for just absolutely unbelievable amount of processing power. That is cheaper than most developers, by a long shot. It is absolutely cheaper than a team of developers doing event sourcing and domain models and CQRS.
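
For a sense of scale, the figures quoted in the talk (the $6,250 monthly price, the three-year term, and 128 virtual CPUs) work out as follows. The numbers are taken from the talk as stated and will have changed since.

```python
monthly_price = 6_250  # USD per month, as quoted in the talk
term_months = 36       # three-year reserved term
vcpus = 128            # virtual CPUs on the instance, as quoted in the talk

three_year_total = monthly_price * term_months
monthly_price_per_vcpu = monthly_price / vcpus
```

That comes to $225,000 over the full term, or a little under $49 per virtual CPU per month.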
- 01:03:22 Udi Dahan
- Understand that this is today. Three years from now, Amazon will have the yet other super-duper, quadruple, extra large data center enterprise edition on steroids that's going to give you 4X what that is for the same price or less. Monolithic deployments where everything just sort of runs together can scale up pretty darn well. Now this idea, again, I'm putting our planets here as an example saying, you're not building the sun. You're not building the supernova. If you're at sort of, let's call it, normal real world levels of scale, understand that, hey, there's a lot you can do. You don't have to go and event source all the things.
- 01:04:21 Udi Dahan
- And keep in mind that during this three year term, the business is going to change anyway, right? I mean, you built the system and then a year later they can say, we're going to want a whole bunch of changes. And then six months after that, they're going to want a whole bunch of changes. And then they're going to want some more changes. And then you're going to realize that all of the architecture that you've set up is not going to fit anyway. Who's ever had that happen to them? Two years into production, the rules change and you have to change the architecture. Yes. Yeah.
- 01:04:52 Udi Dahan
- We need to accept that. Incorporate that idea of re-architecting a system while it's live. That's just the way that things are going to be. Having the right boundaries in place can allow you to re-architect bits and pieces at a time and simplify that. But for God's sake, keep those pieces as simple as possible in the beginning, scaling them up as we mentioned before. And I know that all of us have a certain amount of professional pride which is why we come to these conferences to learn and see the patterns and see what's the best way to create maintainable clean code.
- 01:05:35 Udi Dahan
- But after you've been working on a code base for five, 10, 15 years, it doesn't look pretty, right? It's this kind of platypus. It doesn't really, it's not a DDD. It's not a layering. It's not a CQRS. It's not an event sourcing. It has a little bit of everything. The issue is we tend to view that as a failure, but it's not. This is just nature itself. This is what long lived systems end up looking like. It's not pretty, but it's functional. It allows the business to keep going and evolving.
- 01:06:17 Udi Dahan
- Now, hopefully I've scratched the surface, set the stage, given you some ideas about what's going on in and around event sourcing, event-driven programming, event-driven architecture. We have a lot more speakers coming today that will go much deeper into the details and examples of things that I've talked about at the beginning. If you are interested in more about things like how to divide up responsibilities, how to get your boundaries right, I have a bunch of videos online at this link go.particular.net/dddeu2020 where I go into a lot more depth of various domain examples, how to find those boundaries, the practices around that.
- 01:07:00 Udi Dahan
- And that'll sort of set the stage for ultimately building these types of more complex systems. Again, that URL, go.particular.net/dddeu2020. Now, for those of you that are interested in doing event-driven architecture and happen to be using .NET, my company, Particular Software, we build NServiceBus, technology that makes that easier. If you like, take a look at that. We've made some recent changes to it. From now on you'll be able to use it for free in development. So please give it a try, let us know what you think. And with that, I'd like to thank you very much. My name is Udi Dahan, I'm @UdiDahan on Twitter and have a great conference everybody. Thank you.