Channel 9 Live interview with Udi Dahan and Daniel Marbach

00:01 Seth Juarez

Hey there. This is Seth Juarez from Channel 9 coming to you from NDC Oslo in the beautiful country of Norway. We're in the city of Oslo for this wonderful conference. I always have a good time with the folks at NDC. I've never been to the one in Oslo though. First time. I have some real special guests. Why don't you introduce yourself? Tell us who you are.

00:18 Udi Dahan

Hi, my name is Udi Dahan, I'm the CEO and founder of NServiceBus.

00:23 Daniel Marbach

Yeah. Hi, I'm Daniel Marbach. I'm from Switzerland. I work for Particular Software and Udi is my boss.

00:28 Seth Juarez

Oh, fantastic so don't screw up.

Yeah I'm really under pressure here.

00:34 Seth Juarez

Dozens of people watching right now. By the way, if you do have any questions make sure to submit them. We'll make sure to take them as soon as we can. So why don't you tell us a little bit about what it is that Particular Software does?

00:45 Udi Dahan

So we do a bunch of stuff. The bottom line is, we build tools that enable developers to build better .NET systems, complex business systems, the kind of stuff that you read about on all of the big websites and the really complex logic. We don't really get involved in small-scale, little apps that are built for just a small team. The big complex stuff is what we do, and we solve the reliability problems, the high availability, the scalability, making sure that the code remains maintainable over time. And we provide the infrastructure that makes tens of thousands of developers able to do that without our help. That's the main idea.

01:22 Seth Juarez

So what is the difference between, I'm writing a cheesy app, a little cheesy app, and I've written lots of those because I make, I make demoware folks, that's what I do. When you go to something that's much larger and that needs to be much more reliable. What changes?

01:38 Udi Dahan

Oh, I guess the question is what doesn't change? It usually starts off with somebody who wrote a small app 10 years ago, and then business started booming and then they got two other developers to add features to it. And now they've got 10 times as many users on it. They're adding more functionality and all the time the code base grows and gets more complex. Eventually you kind of hit a point where the single database is starting to crumble and the UI is not responsive. At that point in time, that's when you need a different style of architecture to, for example, do things more asynchronously.

02:13 Udi Dahan

Now, ever since Microsoft introduced async/await, people are like, hey, this is going to be easy. I just async this and then I await that. The problem with all of that is what happens if there's a crash in the middle of that async: all of a sudden all of your state just sort of fizzles out and disappears. So we come in there with durable queues and transaction management and threading and all of that kind of infrastructure, pointing developers to how to structure their business logic, including publish/subscribe and event-driven architecture. From there the system gets more reliable, and you can have more developers working in parallel without stepping on each other's toes.
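Editor's note: the durable-queue idea mentioned here can be illustrated with a minimal sketch. It is not NServiceBus code; it assumes a SQLite table as the queue, and the names `open_queue`, `send`, and `process_next` are hypothetical. The point it shows is that a message is removed only after its handler has finished, so a crash in the middle of handling leaves the message in place for a retry.

```python
import sqlite3


def open_queue(path=":memory:"):
    """Open (or create) a queue table. Pass a file path to survive restarts."""
    db = sqlite3.connect(path)
    db.execute("CREATE TABLE IF NOT EXISTS queue (id INTEGER PRIMARY KEY, body TEXT)")
    db.commit()
    return db


def send(db, body):
    """Store the message durably before anyone tries to handle it."""
    with db:  # commits on success, rolls back on error
        db.execute("INSERT INTO queue (body) VALUES (?)", (body,))


def process_next(db, handler):
    """Handle the oldest message; delete it only if the handler succeeds."""
    row = db.execute("SELECT id, body FROM queue ORDER BY id LIMIT 1").fetchone()
    if row is None:
        return False
    msg_id, body = row
    try:
        handler(body)
    except Exception:
        return False  # the message stays in the queue and can be retried
    with db:
        db.execute("DELETE FROM queue WHERE id = ?", (msg_id,))
    return True


def pending(db):
    """Number of messages still waiting."""
    return db.execute("SELECT COUNT(*) FROM queue").fetchone()[0]
```

With a plain in-memory `await`, the failed attempt would simply be gone; here the failed attempt leaves `pending(db)` unchanged, and a later call to `process_next` picks the same message up again.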

02:50 Seth Juarez

Now, my sense is that, from what I'm hearing, do I have to rewrite everything? Or is it like you can chunk this up and solve it a step at a time?

03:01 Udi Dahan

Right, so that's exactly what we recommend most of our clients do. There is always that sense of developers wanting to rewrite everything. It's kind of, this time we're going to do it right. But the nice thing about our approach is we can say, you know, you can actually get into your existing database with a sproc (a stored procedure), and then surface events from there. So you don't actually have to touch anything. You can just start emitting events from your existing code base and then start expanding other logic around the edges. And then gradually, bit by bit, rewrite the parts that are emitting that data via sproc into a better way of doing things.

03:37 Seth Juarez

Now, with the advent of the cloud, and now these cloud infrastructures, moving over, is it the same kind of challenge that you're going to have? How do you do it in an agnostic way, for example, if you want to have stuff on-prem and stuff in the cloud? Is it the same kind of process?

03:52 Udi Dahan

I'm not sure I'd say it's the same kind of process, because so much of the storage ends up being different. There are also challenges between having Azure Service Bus and Azure Storage queues, and their APIs are slightly different from each other. And if you want to be on AWS, then they have SQS and SNS. So each cloud provider has something that's slightly different in terms of API. Most developers are trying to hedge their bets and create some kind of infrastructure layer on top of that, so that if they need to move from one cloud to another, or they need to take something from cloud to on-prem and back again, they try to create some kind of abstraction layer, and it turns out it's actually a lot harder to do that.

04:31 Seth Juarez

Well, yeah, and it might not even be a question of hedging your bet. You might have a customer that uses AWS infrastructure and you're an Azure person and you still need to communicate in a way that's agnostic, is that correct?

04:44 Udi Dahan

Right. So where we are coming in is to provide that higher-level API for developers to write their business logic, to write all of their pub/sub code and their long-running processes. And then we do that adaptation to SQS, SNS, Azure Service Bus, and Azure Storage queues, and as well, if you wanted to have everything run on premises, RabbitMQ, MSMQ, or just regular SQL tables like we talked about before.

05:10 Seth Juarez

So is the big architecture change more of one that you're changing to sort of a messaging protocol, is that the main crux of it?

05:17 Udi Dahan

Right. So I'd say that a big part of it is indeed making things more message driven, more asynchronous. But at the same time it's not just about making things message driven from a programming perspective. You got to have the underlying transaction management.

05:34 Seth Juarez

I see.

05:34 Udi Dahan

So without that, you could end up in cases where messages get lost or database ends up in an inconsistent state. So we make sure all that is handled.

05:42 Seth Juarez

Okay. So this sounds really good, but Daniel here is going to show us what this actually looks like in practice with a little bit of a story involved, right?

05:51 Udi Dahan

Take it away Daniel.

So I'm from Switzerland, like I said, and Switzerland is pretty well known for its delicious chocolates. I know Belgian people would say differently, but I think that Swiss chocolate is the best chocolate, and I actually brought you some. It's dark chocolate. From Switzerland.

06:06 Udi Dahan

Look at this, Dark Chocolate. That's actually my nickname.

Product placement here, live on Channel 9. So in this hypothetical situation, there is a Swiss chocolate manufacturing team and they have an existing legacy application on premises. And actually they want to make it scalable because they realized, especially around Easter, they lost a lot of orders. And of course, that's horrible, right? If you want to ship the chocolate worldwide and you can't deliver, it's horrible for every company.

So they thought about how could they actually scale the chocolate or the chocolate order management system and someone from the team stumbled over this Microsoft technology called Service Fabric, awesome. I think you have talked about it as well at Channel 9, right? And they decided they want to go for Service Fabric because it allows them to do some kind of lift and shift kind of architecture, where you start on premise and gradually move their microservices to the cloud.

But in order to be hyperscale, they actually realized it's not enough to just do the scaling by cloning things, like having multiple instances combined with microservices. They also realized that they need to do data partitioning. And now we're circling back to NServiceBus because, like Udi said, NServiceBus is built on top of queues, right? So you have a queue, like an order queue where your orders arrive, but how is that affected when you do data partitioning? So of course the partitioning itself also needs to apply to the queues.

07:35 Seth Juarez

And this is even more important when you're looking at the perspective of a microservice architecture, because data... Literally you have to pass around data. It's not a question of, Hey, let's just all call the same database, because that's not even how it works anymore.

Yes. Especially with Service Fabric where you essentially with stateful services, you move the state closer to the compute layer. What's going to happen is you will have multiple partitions and what it means is you have to route basically the request to these partitions and the same also applies for messaging.

So the team realized this challenge, and they came up with this architecture here. So they have an order microservice with a... Sorry for a second, with the stateless front-end, and the stateless front-end sends in chocolate order commands. Depending on the chocolate type, let's assume they partitioned by dark chocolate, brown chocolate, white chocolate. So basically, they have the stateful back-end service, which fetches messages from the queue, right? And then the whole business process is triggered. So in this example we say, okay, if someone orders a chocolate, let's say dark chocolate type, we also have a buyer's remorse period because you never know, right? Sometimes customers want to cancel their orders. And that's handled with what we call a process manager; in NServiceBus terms it's called a saga. A chocolate ordering saga. And this saga then manages the whole process of doing payments, gateway calls. And then at some point when the order needs to be shipped, it then basically sends a command to the shipping and the shipping then publishes events back to that saga.

09:16 Seth Juarez

And that's interesting, right? Because even in the case, let's just say your whole website explodes, with the durable messaging, those orders will all be there when everything turns back on and, like, it'll start peeling. And I like the idea of commands as nouns, right? Because now that you're sending commands to the broker as nouns, even those things that it's supposed to do remain stateful in the event of a catastrophic failure.

Correct. And I've outlined it here. So what happens is, because the chocolate order front-end, the composite UI part, knows the destination, which is the order receiving back-end, it will do sender-side distribution. So basically it knows the partitioning function, how to translate the orders into a given queue based on the chocolate type. And here, when we publish, for example, a chocolate order received event, this is then published to the shipping service. And the shipping service itself has its own partitioning scheme, right? The sender side cannot inflict any kind of partitioning on the shipping microservice, because this is a completely dedicated bounded context and domain. So sometimes with data partitioning, what can happen is that an event arrives on the wrong partition, and it needs to be redistributed internally in order to end up on the right partition. That's what we call smart routing, or here in that case, receiver-side distribution.
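Editor's note: the two distribution styles described here can be sketched in a few lines. This is a conceptual illustration, not the routing code from the demo. The queue naming (`orders-dark` and so on), the CRC-based shipping partition function, and all function names are assumptions made for the example.

```python
import zlib

CHOCOLATE_PARTITIONS = ("dark", "brown", "white")


def order_queue_for(chocolate_type):
    """Sender-side distribution: the sender knows the partitioning function
    and picks the destination queue from the chocolate type."""
    key = chocolate_type.lower()
    if key not in CHOCOLATE_PARTITIONS:
        raise ValueError(f"unknown chocolate type: {chocolate_type}")
    return f"orders-{key}"


def shipping_partition_for(order_id, partition_count):
    """Shipping has its own scheme (assumed here: a stable hash of the order ID)."""
    return zlib.crc32(order_id.encode("utf-8")) % partition_count


def receive_in_shipping(local_partition, order_id, partition_count, forward, handle):
    """Receiver-side distribution: if the event landed on the wrong partition,
    forward it internally to the owning partition instead of handling it."""
    owner = shipping_partition_for(order_id, partition_count)
    if owner != local_partition:
        forward(owner, order_id)
        return "forwarded"
    handle(order_id)
    return "handled"
```

The sender only needs `order_queue_for`; it never sees `shipping_partition_for`, which stays private to the shipping service. That separation is what lets each service choose its own partitioning without coordinating with publishers.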

10:44 Seth Juarez

So there's a lot of vocabulary here. So we heard about partitioning. We heard about message queues, and we heard about routing. Can you put this into... Are these programming structures that we're using?

10:56 Udi Dahan

Let's see how simple it can be in terms of code. It sounds really complex with all of these pieces.

Okay. So let's see briefly. So we have here this logic. So we have the front-end; it's a classical ASP.NET Core application with a controller, which allows you to send in chocolate orders. And as we can see, from the NServiceBus perspective, this is super simple: we just send in an order chocolate command, and that's it. That's everything we have to do. And on the receiving side, we have the order management system, which just declares a process-

11:36 Seth Juarez

And is that just like an API service or something?

11:40 Udi Dahan

It's even better. It's a message driven API.

11:44 Seth Juarez

Oh, I see. So it pulls messages off queues and does...

11:47 Udi Dahan

Exactly.

So what we then see here is, this is the order process manager. So whenever this thing comes in, I just declare basically that I want to handle this order chocolate command. I implement this handle method, which is async. And then I say, okay, buyer's remorse. Wake me up in one second. So after one second, the buyer's remorse period is over and then the timeout is triggered and then the process continues.

12:12 Seth Juarez

So let me go through this code, because this is super interesting to me. First of all, your interface names are super nice. I think you went way too fast. So scroll up a little bit. The names like 'IAmStartedByMessage', that's super nice. First of all, because I've seen some weird interfaces like 'I handled message async' and it's like, okay, that's not helpful. But this is saying, I am started by a message to order chocolate, and I handle messages that are payment response and order shipped, and these are the ones that handle the timeouts.

Correct.

12:44 Seth Juarez

And so I liked the naming. That's the first thing. The second thing I like is that when you receive a message, you also have logic that says wait, right? For a certain amount of time because your business logic might happen. And so that's really nice. And then the third thing that I liked that was really nice is that all of these interfaces are going to handle these events for you without having to do any other job.

Exactly. And it's automatically correlated based on the order ID as you can see. So what it means is we have an order process manager per order ID and all events and messages they're correlated automatically together on this order process manager, per order ID.
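Editor's note: the process manager described in this exchange (started by an order, woken by a buyer's remorse timeout, then driven by payment and shipping messages, all correlated by order ID) can be sketched as follows. This is a plain-Python illustration of the pattern, not the C# saga shown on screen. The class, method, and message names are assumptions, and the `handle_cancel` step is an assumed way to model a customer backing out during the remorse period.

```python
from dataclasses import dataclass


@dataclass
class OrderState:
    """Per-order state; one instance exists per order ID."""
    order_id: str
    chocolate: str
    paid: bool = False


class OrderProcessManager:
    def __init__(self):
        self.sagas = {}      # order_id -> OrderState (the correlation table)
        self.timeouts = []   # (due_at_seconds, order_id)
        self.outgoing = []   # commands this process manager has sent

    def handle_order_chocolate(self, order_id, chocolate, now):
        """Starts the process and asks to be woken after one second."""
        self.sagas.setdefault(order_id, OrderState(order_id, chocolate))
        self.timeouts.append((now + 1.0, order_id))

    def handle_cancel(self, order_id):
        """Buyer's remorse: dropping the state makes the timeout a no-op."""
        self.sagas.pop(order_id, None)

    def tick(self, now):
        """Deliver due timeouts; orders that still exist move on to payment."""
        due = [t for t in self.timeouts if t[0] <= now]
        self.timeouts = [t for t in self.timeouts if t[0] > now]
        for _, order_id in due:
            if order_id in self.sagas:
                self.outgoing.append(("RequestPayment", order_id))

    def handle_payment_response(self, order_id):
        state = self.sagas.get(order_id)
        if state is not None:
            state.paid = True
            self.outgoing.append(("ShipOrder", order_id))

    def handle_order_shipped(self, order_id):
        """The process is complete; its state can be discarded."""
        self.sagas.pop(order_id, None)
```

Every handler looks its state up by `order_id`, which is the hand-written equivalent of the automatic correlation Daniel describes. In the demo that lookup and the timeout scheduling are done by the framework rather than by application code.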

13:20 Seth Juarez

And then the last thing I saw was that you have an annotation here that this is supposed to work on top of Service Fabric.

Yes, exactly. So we are using here this Service Fabric persistence for NServiceBus, which then basically saves the state into reliable collections behind the scenes.

13:38 Seth Juarez

And see, and I've had Mark Russinovich on, talking about reliable collections. Those things are literally fault tolerant. They will not break. They have quorum numbers and so this automatically says, just use that.

Exactly. That's all you have to do. And the rest here in this example, the whole partition-aware routing, is all done behind the scenes by NServiceBus. And we can actually have a look here. I have it deployed here on my local cluster. It's using Azure Service Bus, and this is the application. Awesome UI, as you can see. So if we want to order a dark chocolate, we just hit this button, and when you go to the diagnostics event view, we can then see behind the scenes that an order has started. So the order chocolate event is received. Should I zoom in?

14:29 Seth Juarez

Yeah, let's do that.

Okay. As we can see here, the order chocolate is received; it's a dark chocolate type. It's processed on a process manager of that given chocolate type. And then it sends out a buyer's remorse timeout. And as you can see here, everything is internally correlated and all done behind the scenes so that it makes sure that you end up every time on the right partition where the state for this order is currently managed based on the chocolate type.

14:58 Seth Juarez

Explain to me, this partition part because this is the part that's maybe a little confusing.

Okay.

15:02 Seth Juarez

What do you mean by it ends up in the right partition? Is there routing happening with the messages to be in the right spot? Explain that to me.

Yes, exactly. So let me briefly go back here. So what we can see here is when we enter a new chocolate order of type dark chocolate, this part here knows that the chocolate type is... The destination is the dark chocolate queue and the dark chocolate partition. So with Service Fabric stateful services, we have per partition type, basically an end point or a service instance, running inside the cluster per partition type.

15:43 Udi Dahan

I just want to add that all of the things that Daniel is saying, it's not code that you have to write.

15:47 Seth Juarez

I'm looking at that. I just feel like if I don't understand, I'm going to write some code that's like, where did this even go? I mean, so that's the part that now that I understand where it's going and that it understands it. That's really cool.

And for example, what we can do as well, if you go back, and that's where the beauty really starts to shine: for example, when you're losing connection to the cloud, the cloud is currently down for a period of time. And in this example, the team still had a RabbitMQ cluster running. With NServiceBus, currently we're using the Azure Service Bus transport, as you can see here. That's the only thing you have to write: Azure Service Bus transport and the connection string, and that's it. And then it works. And now, if we want to switch over, for example, to RabbitMQ, then it's just a matter of basically going for NServiceBus.RabbitMQ, fetching the right NuGet package, installing it where it's needed, pulling down this NuGet package, and accepting the license.

16:59 Seth Juarez

And so you are hot swapping out the actual message persistence mechanisms.

Yes. And now I just type here, RabbitMQ transport. This is something that is specific to the Azure Service Bus transport. So I have to disable time, we're not going to talk about this. And here I have another one. I just do it in RabbitMQ transport. I also have to do here, delayed delivery. That's something that is required for Rabbit and for our transport. And then of course, I need to somehow tell the application where the broker lives, and in order to do that I need to basically override the connection string.
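Editor's note: the swap shown in the demo works because the business code talks to an abstraction and only the endpoint configuration names a concrete transport. The sketch below illustrates that design with two in-memory stand-ins. It is not the NServiceBus configuration API; `Transport`, `FakeCloudTransport`, `FakeLocalTransport`, `build_transport`, and the connection strings are all hypothetical.

```python
from abc import ABC, abstractmethod
from collections import defaultdict, deque


class Transport(ABC):
    """Everything the business code is allowed to know about the broker."""

    @abstractmethod
    def send(self, queue, message): ...

    @abstractmethod
    def receive(self, queue): ...


class FakeCloudTransport(Transport):
    """Stand-in for a cloud broker; keeps one deque per queue name."""

    def __init__(self, connection_string):
        self.connection_string = connection_string
        self._queues = defaultdict(deque)

    def send(self, queue, message):
        self._queues[queue].append(message)

    def receive(self, queue):
        q = self._queues[queue]
        return q.popleft() if q else None


class FakeLocalTransport(Transport):
    """Stand-in for a local broker; stores (queue, message) pairs in one list."""

    def __init__(self, connection_string):
        self.connection_string = connection_string
        self._log = []

    def send(self, queue, message):
        self._log.append((queue, message))

    def receive(self, queue):
        for i, (name, message) in enumerate(self._log):
            if name == queue:
                del self._log[i]
                return message
        return None


TRANSPORTS = {"cloud": FakeCloudTransport, "local": FakeLocalTransport}


def build_transport(kind, connection_string):
    """The only place that names a concrete transport (the 'configuration')."""
    return TRANSPORTS[kind](connection_string)


def order_chocolate(transport, chocolate_type):
    """Business code: identical no matter which transport is configured."""
    transport.send("orders", {"type": "OrderChocolate", "chocolate": chocolate_type})
```

Switching from `build_transport("cloud", ...)` to `build_transport("local", "host=localhost")` changes one line of setup and nothing in `order_chocolate`, which mirrors the claim that the order process manager, the client, and the message definitions stay untouched.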

17:42 Seth Juarez

And so what you're doing is you're effectively swapping out Azure Service Fabric for RabbitMQ, which I know is not a trivial thing. Right? Cause it's completely different. RabbitMQ is a local like message queue thing that works on top of MSMQ is that right still?

18:05 Udi Dahan

No, RabbitMQ is separate from MSMQ. Separate technology.

So now I just have to say here, and this is the specific connection string, and say localhost, right? If I can type... localhost. I save it. I have to compile it. And now I'm doing something, I deploy with Visual Studio; in production you would probably deploy not with Visual Studio, but with something like PowerShell. So I publish this application to the local cluster. I publish it. And this is going to take some time.

18:38 Seth Juarez

So as it's going through and doing this, I must say, you can show the boundaries of my understanding. MSMQ is the Microsoft message queue that works in the local. RabbitMQ is an open source one that does a similar thing. But you're effectively switching from something that's happening in Azure Service Fabric, has their own storage mechanisms, to something that's working in a local cluster.

19:03 Udi Dahan

Mm-hmm (affirmative).

19:03 Seth Juarez

Okay.

19:04 Udi Dahan

And did you see how fast that was?

19:05 Seth Juarez

And I literally was trying to like speak over it for a little bit and then it finished before I could finish.

Yeah. Well we have to wait, I think probably the longest part here is the deployment to my local cluster. It's going to take some time as soon as it's done...

19:20 Seth Juarez

But as it's doing that, you didn't have to really change any other code.

Just the connection string.

19:24 Udi Dahan

Did you notice no change to the order process management? No change to the client that's actually sending the messages.

19:30 Seth Juarez

Sure.

19:31 Udi Dahan

No change to the definition of the messages, right?

19:35 Udi Dahan

Yeah. I mean, it's very different specs too. So I mean, that's the thing that's really cool.

So as we can see now, we should see here connections on the RabbitMQ broker. As you see, multiple connections from different partitions. We should see now here that NServiceBus created all the queues necessary and all the bindings and everything that is necessary to run it. And now we can just order dark chocolate again. And as you can see behind the scenes in the diagnostics window, it's still processing. Let's clear it so that you actually believe it.

Let's do brown chocolate. As you can see. And let me show you that it's actually also working here with RabbitMQ. We see here, we have an audit queue, and let's fetch and see if there are actually messages in there. Let's get the last 10 messages from the RabbitMQ broker. And as we can see here, we have the full history of the order chocolate, the buyer's remorse period, and everything that is now running inside Service Fabric, partitioned, now with RabbitMQ.

20:43 Seth Juarez

That's actually really cool. So was the RabbitMQ something that was directly in Azure already? Did you have a RabbitMQ message queue that was up there in the cloud for you?

Currently I have it on my local machine. So previously I connected from the cluster on my local machine, up to the cloud to Azure Service Bus. Now I'm connecting from my local cluster to my local RabbitMQ broker.

21:04 Udi Dahan

But this could also work in the cloud. There is RabbitMQ available in its own cloud environment. So again, if you're saying Azure Service Bus goes down, I want to hot swap to a different RabbitMQ in the cloud. What was the name of that RabbitMQ provider that's available online? It's not IronMQ.

I don't remember.

21:23 Seth Juarez

There's, probably something, there's tons of queuing systems.

21:27 Udi Dahan

There's a bunch of queuing systems online and they all speak the same RabbitMQ protocol. So you can actually point to a different cloud queuing provider that supports RabbitMQ. And you're doing that hot swap live. When you think about it, it actually would have taken longer to have a meeting to decide which queuing environment we want to switch to than to actually do the switch.

21:50 Seth Juarez

Yeah. And here's the other thing I realize because there's different kinds of queuing that happens in Azure. Because there's Azure... In the storage area there's message queues there.

22:02 Udi Dahan

Right.

22:03 Seth Juarez

But then on Service Fabric there are different persistence mechanisms. So you might be thinking, I want to go to Azure Service Fabric; if you're using this technology, you can swap over pretty easily. That makes sense. All right. So that was really nice. Is this code available somewhere?

Yes, it is available. So you can find it here on this link. Let me briefly show it to you. So it's the content for my talk on Friday.

22:30 Seth Juarez

This is pretty cool. So let's get into, because we have about seven minutes left, where can people go to learn about this stuff and figure out this stuff?

22:39 Udi Dahan

So I'd say the first thing to do is just take the introduction to NServiceBus. The link's right there on the slide, and really you just Google for NServiceBus and you'll find this stuff. Sorry, you Bing for NServiceBus.

22:52 Seth Juarez

You Google with Bing.

22:54 Udi Dahan

Exactly. And then from there, it'll really take you by the hand and, you know, in 15 minutes or less, you can have a bunch of endpoints, pub/subbing, running on top of RabbitMQ, MSMQ, Azure Storage queues. It really is very straightforward to get started. And at that point in time, when people have the technology working, then the really interesting discussions start about, well, how do we turn our business process into a more event-driven one? How do we migrate our legacy code? But most of the underlying technology, all of the underlying pipes and all that kind of stuff, we got you covered.

23:33 Seth Juarez

So I remember that, because I followed this for many years, this has been around for a long time. And I went through the tutorial and played with it. It was really nice. It was open source back then, is it still open source today?

23:44 Udi Dahan

Right, so in terms of open source, yes, all of the source code is available on GitHub. But in terms of the actual use of it, we've found out that companies really want, when they're basing their Easter-time, mission-critical chocolate management processes on it, to know that if something goes wrong, if they uncover a bug, somebody is going to be on the phone fixing it for them in real time. So there are commercial licenses for all this kind of stuff. And the reality is, for most development environments, it's so much cheaper than actually having a developer struggle with this stuff themselves. So we'd be happy to talk licensing if anybody really wants to get into that.

24:26 Seth Juarez

I mean, look, between us girls kind of thing. If I don't know how a developer is eating and sleeping with the code they write. I never can be sure it's going to be around long enough to support my business.

24:39 Udi Dahan

Right.

24:40 Seth Juarez

And so I don't mind at all people charging for good code. So they can go here to find out how to use NServiceBus. Is there any other information that you'd like to get out there?

Not from my side.

24:51 Udi Dahan

Well, I guess I'd say that the next step from learning NServiceBus is actually moving into thinking about service-oriented architecture, microservices, how you manage your data. And that's where we've got lots of videos available online. People can learn more about the design of large-scale complex business systems on top of these types of message-driven patterns.

25:13 Seth Juarez

That's awesome. And now, especially in the microservices era, because I remember this has been out for many years, and I remember that it was... When I heard it first, it was, this is to make things more scalable, more friendly, but in an era of microservices, there really is no other way to do things. In my opinion. What do you say?

25:31 Udi Dahan

Well, I think that some vendors might take issue with saying NServiceBus is the only way to do microservices.

25:38 Seth Juarez

What I mean is that you have to use stateful messaging because you can't share a database anymore.

25:43 Udi Dahan

Right. So if you actually want to get the benefits of a microservice architecture, you really need to follow a lot of these kinds of patterns. So, I mean, yeah, you can kind of struggle through it yourself by saying, oh yeah, I'll just code something myself on top of the Azure Service Bus API. But there are really a lot of knobs and dials and infrastructure to write yourself. So you will want something, some layer that abstracts that away from you, but the patterns? Absolutely, you need those.

26:10 Seth Juarez

Yeah. Because you can't... Lifting and shifting into a containerized environment, you can do that, but then you'll quickly realize there's some limitations. And if you want scalability you need to be able to partition things out. And I've always been a fan of the messaging system, because it allows you to separate things out in a much richer way I think.

26:29 Udi Dahan

Yeah.

26:30 Seth Juarez

All right. Where can people go to find out more about you folks, if they want to get ahold of you?

26:34 Udi Dahan

You can Google or Bing NServiceBus, particular.net is the website. My name's Udi Dahan. I've got a blog available online. This is Daniel Marbach, he has his own blog as well, where he blogs about a lot of Azure Service Fabric, internals, and how to get all of the magic that we've been showing you today working. So take a look at that.

26:54 Seth Juarez

Awesome. Well, this has been very helpful to me. I'm definitely going to look at this. I'm really excited about the Azure Service Fabric stuff that you've put in. It's definitely something we should look at. Thanks so much for watching. We'll be right back with another guest right after this tiny break.
