Under the hood of Particular Software
About this video
Take a journey through custom tooling, testing the un-testable and massive cloud service bills in the name of reliability.
Transcription
- 00:00 Andreas Ohlund
- Andreas Ohlund, I'm a Swede that's why we played ABBA. I just tell that because that's why those funny little dots are above the O in my last name, that's a Swedish character. That's what preventing me for signing up for a lot of websites. That character screws it up. Anyway, this talk is titled Under The Hood Of Particular Software. I'm heading up the engineering department. So, what sort is that? I'm the one that makes sure he doesn't get too close to a keyboard. But I also got some brilliant guys and girls out there that, I guess, they're trying to get me away from my keyboard as well. So we'll see how long I'm still able to do a lot of coding, but I am coding as much as I really can. So, I really love that.
- 00:43 Andreas Ohlund
- It's a little bit boring under the hood. This talk is about under the hood but it's really more about... You heard a lot of awesome stuff, right? Danny showed you Matrix Insight and all that cool stuff. And John and Indu showed you the new pipeline. And when I was a consultant, I went to a lot of conferences. I went there and went, "Yay, cool stuff. Very cool." All that new things and I just went home and nay, we can't use it. We are on this ancient version of something and company policy tells us we're not allowed to use it. So I can like, "Ah, yeah, that was cool but I can't use it." Right?
- 01:25 Andreas Ohlund
- This talk is about what we are trying to do to help you avoid getting into this situation, right? Yeah, the pipeline is cool, but I'm on insert ancient version of NServiceBus here. So, we won't be using it for another two years. So, this talk is about what we do internally to make sure and help you stay up to date. Obviously, some part is on you as well. James mentioned yesterday about Continuous Delivery. I assume that's something that a lot of people is trying to achieve. Obviously, we were in the same position as you guys. If Martin Fowler puts his name on a book, obviously it's gotta be something good and that's a good book. It says on a book, "Reliable software releases through build test and deployment automation."
- 02:15 Andreas Ohlund
- That's really what we've been trying to do a lot for the last few years. When I got involved in NServiceBus in 2008, we were quite manual. Now 2014, we are much more automated and are able to churn out more releases and more high quality stuff to you, guys. A lot of stuff I'm going to mention in this talk relies on you guys being on NServiceBus 3 because that's where it all starts. If you just can get yourself up to NServiceBus 3, we got you covered in terms of backwards compatibility and so on. I've heard from a lot of clients that actually version 2.6 can actually talk pretty nicely to version three as well. So, we're not actively checking for that but try to get yourself up to NServiceBus 3 and from that on, we'll try to do our best to really enable you, guys, to keep updated with as low risk as possible.
- 03:17 Andreas Ohlund
- Obviously, we want to do continuous delivery, not only the big cool feature sort of so far but we really want to be able to churn out new code, get it in your hands because we want you to benefit from the new code. But also we want to get quicker feedback so that we can adjust APIs and so on. And obviously, God forbid, if there is hotfixes to be released, we want to have them in your hands as quickly as possible, right? Because we know that you really depend on us being able to fix those issues that are hampering your production systems. So, we are releasing more now. Back in 2011, I think we made six, sorry, eight releases. 2012, we were up to 81, 84 and we're now at 41. So, it looks like we're going to hit the same target of roughly 80 releases per year.
- 04:16 Andreas Ohlund
- And arguably this one could have been even higher, but as I mentioned yesterday, we're also trying to be more... We're splitting repos and doing more targeted releases. So, instead of releasing everything every time something changed, we only release the RabbitMQ transport when we actually make changes to it. We only release the artifact adapter when we make changes to it and so on. Obviously, that's going to lower the number of releases we have to do. But we're still going to do roughly 80 releases this year, which means roughly two releases per week. That includes the new products like ServiceInsight, ServiceMatrix and all the different add-ons to NServiceBus.
- 04:57 Andreas Ohlund
- So, we have improved at least since 2011. But that's all fine, right? I mean, yes, we can release a lot of stuff and I'll hopefully show you that we do everything we can to make sure that the quality is really top-notch, but that's really no good if you guys can't use it, right? Trust me, being a consultant in Sweden, working for governmental agencies, I know all about falling behind and the policies that say, "We can't use anything new and it has to be certified," blah, blah, blah. It's really painful to fall too far behind. We don't want to get into a situation where we keep on releasing stuff and you guys are falling more and more behind because that's no good for you guys.
- 05:53 Andreas Ohlund
- So, that's probably some governmental guidance in Sweden sitting there. Because what happens when you fall behind that leads to... Everyone has a Lego thing, so I had to put the Lego thing in there as well. That drives you guys to a big bang release, right? We are now on version 2.6. The NServiceBus guys release version five, so, yeah, it's a big thing. We're far behind. It's going to be plenty of changes and risks it goes up. Growing up with Lego, I could live with that big bang, but it's really this BigBang you don't want to have. This Korean boy band called BigBang, depending on the music tastes, I can tell this is not the big bang you want to do. I got three girls at home, so I might see those guys just because of that.
- 06:55 Andreas Ohlund
- The issue with big bang is that it drives the risk up and as risk goes up, obviously, upper level management will need to get involved. The higher the risk, the bigger the chance that you need sign-off from the CTO. And when they get involved, you will have to motivate, "Why do you need this new version?" "Yeah, because we're more productive." "Yeah, but we don't want to risk our..." You get into a tricky situation when you need to really motivate staying up to date. And the risk actually gets compounded by the number of years. So, we really want to avoid you guys sort of waiting too long and then being forced to do a big bang update. They're on Spotify if you want to listen, so.
- 07:46 Andreas Ohlund
- So, how do we solve this? Well, what do we do to do our part of this thing, right? What we do is that we make very, very sure that we're backwards compatible and we'll talk a lot more about different flavors of backwards compatibility, but the main one is backwards compatibility on the wire. Essentially, meaning that endpoints running different versions of NServiceBus should always be able to talk to each other, using the classical bus.Send, bus.Reply, bus.Return, pub/sub, and so on, right? Because as long as we can keep our compatibility that allows you guys to mix and match your NServiceBus versions.
- 08:32 Andreas Ohlund
- Because, obviously, we are a company that pushes you to build fine-grained solutions with autonomous endpoints talking to each other. If we're not keeping our part of the deal, there's no way you guys can be backwards compatible. We talk a lot about how you evolve your message contracts in a backwards compatible way but obviously, if we don't do our part, that's pretty useless, right? You wouldn't be able to talk to each other anyway. So essentially we want to get you to a place where we can give you freedom of choice, right? So, our sales endpoint is still at v3. No point in upgrading that one, but hey, we've got a new billing project starting next month. Why shouldn't we be using NServiceBus v5? A lot of companies I come to, they're kind of like, "Yeah, company policy, we are on version 4.1.2."
- 09:21 Andreas Ohlund
- This is a long journey and I don't know if we're there, but we do our very best to ensure that we are always backwards compatible, that the quality is high enough so that you guys are able to trust us and move to a more mixed environment. So, we want you to feel that every time you release something, you also upgrade NServiceBus because hey, it's backwards compatible, it will talk to those other endpoints. It's a lofty goal and I guess it's up to you guys to judge if we're able to get there. But as they say on Facebook, "Automated test or it didn't happen." So, that's a big thing we've been doing the last couple of years. Essentially not relying on ourselves to remember to test stuff; instead we're automating all this stuff so that we know for each check-in if we're breaking backwards compatibility with all the versions.
- 10:14 Andreas Ohlund
- So this is one of the screenshots of some of the wire compatibility tests we're running, making sure that the latest version, in this case version five, is able to talk to all the other ones. And we are testing all the permutations, right? So that is, is version five able to do a bus.Reply that's received by a version three endpoint? As I said, I've heard reports of 2.6 actually working as well, but we're not actively checking version 2.6. So, as long as you can get yourself up to version three, we will guarantee you that we are a hundred percent backwards compatible on the wire.
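To make "all the permutations" concrete, here is a minimal sketch of how such a test matrix could be enumerated. It is not the actual Particular test suite: the version labels, the operation names, and the `wire_compat_cases` helper are assumptions chosen for illustration, based only on the versions and bus operations named in the talk.

```python
from itertools import product

# Hypothetical labels: the talk names versions three to five and the
# operations bus.Send, bus.Reply, bus.Return and pub/sub.
VERSIONS = ["3", "4", "5"]
OPERATIONS = ["Send", "Reply", "Return", "Publish"]


def wire_compat_cases(versions, operations):
    """Yield every (sender_version, receiver_version, operation) combination."""
    for sender, receiver in product(versions, repeat=2):
        for operation in operations:
            yield (sender, receiver, operation)


cases = list(wire_compat_cases(VERSIONS, OPERATIONS))
print(len(cases))  # 3 senders * 3 receivers * 4 operations = 36
```

Each tuple would then drive one end-to-end scenario, for example a version five endpoint doing a Reply that a version three endpoint receives.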
- 10:54 Andreas Ohlund
- Of course, we are all humans. We make mistakes, you guys make mistakes and we definitely making mistakes. And if we make mistakes, we fix them. So, this is the comment, well, not five days ago. It's actually eight days ago. We are hotfixing NServiceBus version 3.3.8. So, there'll soon be a 3.3.9 out, which will enable a version three endpoints to talk JSON to version four and our endpoints because running our tests, we realized that if you're on the JSON serializer, there's a change in v4 that actually breaks compatibility with version 3.3.8. If we find issues with wire compatibility, we will fix it and we'll fix it all the way back. So in this case, it's only version three that was affected so we're fixing it.
- 11:49 Andreas Ohlund
- So far we've talked about wire compatibility and that obviously affects the entire ecosystem of endpoints, but let's zoom in a little bit because obviously backwards compatibility is also important on the endpoint level. Let's start off with the first sort of dimension and that's data. We all know how tricky it is with data, right? New version of the system and there are some new columns and I've done this before: you add customer age, new column, the column is null. And null defaults to zero if you read it using NHibernate as an int. Those new business rules, checking that only a customer above 18 is allowed to buy stuff, will now falsely kick in and no one will be able to buy stuff because they're all age zero, right? Data backwards compatibility is really tricky.
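The null-age problem is easy to reproduce in a few lines. The sketch below is an illustration in Python rather than NHibernate; the function names are hypothetical, and treating an unknown age as "allowed" in the safe variant is an assumption that a real system would replace with its own policy.

```python
from typing import Optional

MINIMUM_AGE = 18


def can_buy_naive(age_column: Optional[int]) -> bool:
    # Mimics mapping a nullable column to a plain int: NULL silently becomes 0.
    age = age_column or 0
    return age >= MINIMUM_AGE


def can_buy_safe(age_column: Optional[int]) -> bool:
    # Keeps "unknown" distinct from zero. Allowing unknown ages is an
    # assumption made for this sketch only.
    if age_column is None:
        return True
    return age_column >= MINIMUM_AGE


print(can_buy_naive(None))  # False: every pre-existing customer is blocked
print(can_buy_safe(None))   # True: old rows keep working
```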
- 12:43 Andreas Ohlund
- And by data here, I mean the data that we're storing for you. Timeouts, sagas, subscriptions, gateway deduplication. So we're also making sure that between major versions... So we leave it more relaxed, but we're saying between major versions of NServiceBus, we are backwards compatible in terms of data and that allows you to do side by side updates, so when you move from version four up to version five, you can actually run those endpoints side by side at the same time. Reading and writing the same data, which means that you should be able to upgrade NServiceBus without incurring downtime. And yes, that's tricky, you have to be careful so that you don't do breaking changes. But it's a must, otherwise you guys will be in a bad position because then upgrading to five is going to be tricky because all this data migration, all this stuff needs to be happening. So, we know how painful this is and we don't want to inflict that pain on you. That's data.
- 13:57 Andreas Ohlund
- Next one is, I guess, the one that comes to mind when you think about backwards compatibility, right? The API, the bus.Send, the Configure dot something, all that stuff. That's also on a per endpoint level so that's the thing that you guys get exposed to most. And we're very careful about that as well and we are following SemVer very strictly. Batman, that's probably Simon, and Robin's probably me. Git push, pull request, breaking change. Oh, I didn't think about that. So we're really, really careful about not breaking backwards compatibility within the same major version. So 4.0, 4.1, 4.2, 4.3, the programming API, that's what we're gunning for, it's gotta be backwards compatible. And so far we've been relying on code reviews, but starting with v5, we actually have written some code to make sure that if we change public types, it's gonna break the build.
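The talk does not show how the build check works, so the following is only a sketch of the general idea: record an approved list of public members and fail when the current surface differs. The `Bus` class, the `APPROVED` list, and both helper functions are hypothetical, and Python's `inspect` module stands in for whatever reflection the real tool uses.

```python
import inspect


class Bus:
    """Stand-in for a library type whose public surface must stay stable."""

    def send(self, message): ...

    def reply(self, message): ...

    def _dispatch(self, message): ...  # private: not part of the contract


def public_api(cls):
    """Return sorted 'name(signature)' strings for the public methods of cls."""
    return sorted(
        f"{name}{inspect.signature(member)}"
        for name, member in inspect.getmembers(cls, inspect.isfunction)
        if not name.startswith("_")
    )


def check_api(cls, approved):
    """Return (removed, added) members relative to the approved snapshot."""
    current = set(public_api(cls))
    return sorted(set(approved) - current), sorted(current - set(approved))


# The approved snapshot would normally live in a file under source control.
APPROVED = ["reply(self, message)", "send(self, message)"]

removed, added = check_api(Bus, APPROVED)
if removed or added:
    raise SystemExit(f"Public API changed: removed={removed}, added={added}")
```

A build step that runs a check like this turns an accidental public change into a red build instead of a code review comment.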
- 15:04 Andreas Ohlund
- So, that's the one thing we've learned: if it doesn't break the build, it doesn't happen. So starting with v5, we'll actually be able to guarantee you that we're not going to be touching any public API at all. Which means that I get slapped a little bit, then I can see the build server break. And instead of Simon telling me that it's a breaking change, the build server tells me it's a breaking change. So this obviously leads us to keep this sign around the office. Well, we don't have an office, but I guess I have it printed out next to my screen at home. And it's a bit funny, because I used to be standing on the barricades and blaming Microsoft for, "Hey, make everything public because I really want to override it."
- 15:47 Andreas Ohlund
- That was a good idea at the time I thought, but trust me, now we're really careful about public types. We really don't want to expose stuff to you that you shouldn't be using because if the type is public, SemVer has to apply. And we actually realized that this is a benefit for you guys because by not exposing public types, we have to think through what we're exposing to you. So instead of giving you everything and the kitchen sink, we actually give you a nice way to extend. A more controlled way where we can guide you better. So like John showed, give us your pipeline extension, we'll plug it in. It's also funny because we're turning on API docs and you might think that that's because we want to create API documentation. But it's mostly because when you have API docs on, if you make it public, it'll tell you to create XML docs in code. And that's boring and I don't want to do it.
- 16:49 Andreas Ohlund
- It happened last week, made a type public, shit, ReSharper is red, start typing some sort of, yeah, this is the constructor of the class. What is this? I don't want to write this stuff and I realize, "Shit, this type should not be public." So that's a nice side effect of on wanting SRO for the API comments. So that's the v5. We're not coming back and changing this. I don't think I do because it's going to break the build.
- 17:29 Andreas Ohlund
- The final piece of the puzzle, if you may. API is one thing, that's the easiest thing. Making sure that we don't change the signature of bus.Send, that we can even have the build server check for us. But verifying that we're not breaking behavior, that's really hard. I mean, the default isolation level for transactions actually changed: in version 2.6, it was serializable, and then in version three, it is read committed. So that's sort of a public API, but it's very hard to spot that we're actually changing that. So, the only way to really test that is to cover it with as many tests as you can. And I guess this also can be related back to you guys, verifying your business logic, that's what you do. You put a lot of tests on it because when you do refactoring and so on, you really want to see if something breaks. And that's something that we're spending a lot of effort on lately.
- 18:44 Andreas Ohlund
- But really, we have a little more of a problem there because essentially we're providing you with a communication platform that will give you async, all the benefits of async messaging. It's scalable, it's robust, all that stuff, but it's async, and as we all know, async stuff is really hard to test. So when we upgrade to a new version of RabbitMQ, does it still work? What about SQL Server 2014 that'll be released soon? Will all this stuff work there? It's essential for you guys, to be able to benefit from all that, that we make sure that things don't change as we change that permutation, right? But what about SQL Server 2014 on Windows, some other version of the OS, right?
- 19:33 Andreas Ohlund
- As you can see, the permutations sort of become endless almost. And the tricky thing with behaviors is that it's really hard to unit test, that's what we've seen so far. Instead we needed to go more to coarse-grained tests that sort of exercised our API because it was really tricky to test that behavior on the unit level. So we moved, shifted the balance quite a bit. Well, we used to be all unit tests and manual testing, but now we've sort of switched over more to automated acceptance tests, running more end-to-end. And we'll talk about that in a second.
- 20:25 Andreas Ohlund
- Well, and what we realized is that we have to be the best in terms of testing this stuff because that's really what we sell to you guys. We make sure that you can focus on your business logic and we'll take care of that nasty async testing with different versions and so on. Not claiming that we're the best but I'm just saying that we continuously invest in this and we really want to be the best when it comes to that. So you can feel comfortable that what we give you guys has been completely tested on that specific set of infrastructure. MSMQ plus SQL X integrating with Rabbit Y and the list just goes on and on and on.
- 21:10 Andreas Ohlund
- Not gonna go super deep, but here's an example of what we call acceptance tests. So, obviously, we are developers. So we wrote our own framework. I even heard of some clients actually using this to test their own stuff. But essentially what we did is we wrote a sort of small framework where we can define our tests and we run them end-to-end. We spin up real endpoints in separate app domains. And we essentially run scenarios end-to-end using the real databases, using the real transports and so on. The DTC and all that stuff in order to make sure that it actually works. So in this specific sample we're testing that bus.Defer is working. So, we have an endpoint, MyEndpoint, and you see that there's something called a context. Again, we're running in separate app domains and we need to have some context flowing across so we can actually check stuff. So, that's the context.
- 22:15 Andreas Ohlund
- And we're just doing our bus.Defer, sending in a TimeSpan, a delay of three seconds, and when we're done, we check in this case that the actual delay was at least three seconds. So exercising all the infrastructure. So we're running this, of course, against our in-memory storage, we're running it against the NHibernate storage, against the Raven storage and against whatever future MongoDB, Redis, we're running it against the Azure storages and so on. So we're running this type of test against all the permutations that I mentioned. And the good thing about this framework is that in those endpoint classes, we write real NServiceBus code. We code against our own API. So you can see here, I have a classical message handler. IHandleMessages of T, passing in that little context, a little trick we do so we can pass some state on, essentially just a key-value pair thing.
- 23:18 Andreas Ohlund
- And when the message comes in, you remember I do the bus.Defer. So when the message comes in, I record the actual time it came in, and I set a little flag to tell me that the test is done. So this allows us to test all our public APIs. And obviously if we break something, this is likely to break. And that has been happening a lot, where it's a big code base and we're constantly splitting it up. And as I said yesterday on the panel debate, we tend to discover our own little clever hacks that weren't so clever, right? So you move something here and then, "Okay, that broke an acceptance test over there. What a shock." "Oh, yeah, because there was some..." So in the same way as you guys are benefiting from your automated testing, we are doing the same.
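The shape of that Defer test can be sketched without NServiceBus at all. The code below is an analogy under stated assumptions, not the real acceptance test framework: `Context`, `defer`, and `run_defer_scenario` are hypothetical names, and a `threading.Timer` stands in for the real timeout infrastructure. The handler records when the message arrived and sets a "done" flag on the shared context, and the test then compares the measured delay with the requested one.

```python
import threading
import time


class Context:
    """Shared state between the test and the handler, as in the talk."""

    def __init__(self):
        self.done = threading.Event()
        self.received_at = None


def defer(delay_seconds, handler, message):
    """Deliver `message` to `handler` after `delay_seconds` (timer stand-in)."""
    timer = threading.Timer(delay_seconds, handler, args=(message,))
    timer.start()
    return timer


def run_defer_scenario(delay_seconds=0.2, timeout=5.0):
    """Send a deferred message and return the measured delay in seconds."""
    context = Context()

    def handle(message):
        context.received_at = time.monotonic()  # record the arrival time
        context.done.set()                      # flag that the test is done

    sent_at = time.monotonic()
    defer(delay_seconds, handle, "MyMessage")
    if not context.done.wait(timeout):
        raise TimeoutError("deferred message never arrived")
    return context.received_at - sent_at


elapsed = run_defer_scenario(delay_seconds=0.2)
print(f"message arrived after {elapsed:.3f}s")
```

In the real suite the same scenario would run once per storage and transport combination instead of once against an in-process timer.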
- 24:12 Andreas Ohlund
- And that saved our butt a lot of times. Another nice side effect is that we moved more from white box unit testing to black box testing like this. Because that's really allowed us to refactor the guts of NServiceBus, but still verify that it works. So now we can keep our unit tests focused on tricky algorithms. The CLI says, if you want to assume in somewhere, we go unit testing. And if we have some stuff to see the test, but most of the pipeline stuff is covered by, I think, we got around a hundred of those. Running stuff, five, six, seven endpoints end-to-end, seeing that sagas are loaded and concurrency works and all that stuff.
- 24:59 Andreas Ohlund
- And that's really allowed us to refactor freely. We can move stuff around without breaking unit tests all the time. So that's the lesson learned for us as well, that if you get the balance right between your black box and white box tests, it really becomes much easier to refactor your code base. So it's, yeah, that doesn't raise. Try to get those more coarse-grained tests going. It's really going to help you evolve your code base.
- 25:37 Andreas Ohlund
- As I said, right, SQL Server 2014, it hasn't been released by the way, it's in the works, right? It's released. Okay, cool. Greg, update the build agents. Cool. The permutations are brutal and it just keeps on growing, right? As soon as we add new infrastructure like, "Okay, let's support Mongo or let's support something else." There's a whole set of versions of Mongo that you guys could be on. Oh, there's a new version of the OS. There's a new version of the queuing system and so on. And that thing just explodes, right? We have this big test suite of a hundred fully automated end-to-end tests. If we were going to run those tests manually, setting up all that hardware, setting up virtual machines for all that ourselves, it's just going to take too long. So, we need minions.
- 26:34 Andreas Ohlund
- At least, my kids love the Despicable Me movies and our problem is that I talked to Gru and he wouldn't lend us his minions. So, we had to find our own. We needed to find a way to run this massive test suite against all the different versions of queuing systems. So we got some minions. We're using TeamCity as our build server and the good thing with TeamCity, apart from being a kick-ass build server, is that it actually supports running agent farms on the different cloud services out there. Essentially what we do every time you commit, we're kicking up a lot of build agents running on Amazon. Sometimes they fire up stuff on Azure because some of the installers and stuff like that, we need to test on Azure because there we can get client OSes running.
- 27:28 Andreas Ohlund
- So, we're leveraging the cloud infrastructure to be able to fire up a lot of machines to run all those tests because obviously we want as quick feedback as possible. If I break something that will cause sagas persisted in SQL Server 2014 on Windows Server X to fail, I want to know that as quickly as possible. But they do take time, right? They're black box, the entire test suite runs in around 20 minutes to go through all the hundred-ish tests. And obviously being a small company, of course we're fully distributed, no office. We're spending a significant amount of money on this. But without using the clouds, we couldn't afford it. We couldn't have all that hardware sitting around.
- 28:15 Andreas Ohlund
- So, when there's a commit, it fires up all the agents, runs all the tests, reports back, and when they go offline, they decommission themselves, I mean, essentially we're paying for the time they're actually running. So, for us, this has been a must. We couldn't have done all this without running on the clouds. And then the nice side effect is that we are all heavily automating stuff on those clouds, meaning that we can better help you guys when you have issues because have you guys had sort of looking a little bit on a moving stuff out, some cloud provider? I see it fired me down. So that's a nice side effect that we become experts on running stuff on those clouds.
- 29:09 Andreas Ohlund
- Yup. So, some lessons learned. I already mentioned quite a few, but I tried to compile sort of a list of stuff that I've realized and that we realized during this journey back in 2009, when we was fairly sort of fresh, in a sense. We are an open source project sitting at home at night, doing stuff until the stage where now, where we are able to sort of put more resources into this. The first it's not really a lesson learned, it's actually one thing I've really learned, this is not required if you want to do continuous integration. And that's the dreaded F5 plus WinZip combo.
- 29:53 Andreas Ohlund
- So, Udi and a few others hitting F5 on a few samples and me using WinZip to create a zip. And then remember Udi showing that green download button. That's how we released back then. "Go on, Scott. Have you run all the samples?" "Yeah. I hit F5 on all of them, there were no red printouts in there. It's probably working." If you're doing this, you need to stop. You shouldn't be relying on... If you have manual testing, try to focus that on doing more exploratory testing, where they can test education and stuff like that. For the known business requirements, there should be automation covering it. And this is not always easy, trust us. It's not easy to test all this stuff, but it's doable.
- 30:42 Andreas Ohlund
- So, the WinZip part like James talked about yesterday, create your build pipelines and automate away all that stuff. So you can focus on more fun stuff. So, this is really the big advice and that's one of our core values. Automate all the things, that's really the key here. Even if you can't do it right away, creating that list of steps. This is what we do when we release. Going to the bin folder on my machine, copy-paste, copy the DLL on Azure and then call some guy who unzip it. And hopefully it puts those assemblies in the correct spot. And so on, right?
- 31:25 Andreas Ohlund
- Yes, start with creating that list. And then that's a nice blueprint because that's the list of things that you need to automate. So, that's sort of really the driving thing here. We're trying to automate everything we do, then we don't have to rely on people remembering. And we're also less sensitive to people being away. We are not such a big company, so we don't want to have, "Yeah, but that guy is on vacation. So we can't release ServiceInsight." So, not relying on people and obviously... I'm super lazy. Everyone in our team is really lazy and this is driving us because we can't really be bothered with zipping stuff and copying things. We want to push stuff and want the build server to tell us if it's good. And then when we want to release, we want to go into HipChat and type "Hubot, release ServiceInsight".
- 32:23 Andreas Ohlund
- That's okay. Typing a command to a chat bot, that's... yeah, I could live with that. Creating release notes, that's boring, we want to automate that as well. And our tip is be really picky with the tools you use. Don't settle for some half okay thing. Make sure that you have tools that are the best tools for the job and can be sort of integrated and composed with your other tools to create that deployment pipeline. We use GitHub. We use TeamCity. We use Octopus Deploy internally. We chose those tools because we can stitch them together and create this fully automated pipeline, so we are able to sort of push stuff to you guys. Staging on MyGet, so we can do a final sanity check before we push it up to NuGet, then it'll become available to you. All the applications are pushed up to Chocolatey and then our platform installer will pull them down from Chocolatey.
- 33:31 Andreas Ohlund
- So, really be careful to select the tools that will help you do all this. And it's most likely not a single tool that can do all this. Most likely you'll have to combine stuff. And sometimes that's not even enough. That's what we have been doing a lot, especially since we moved to a model where we are using fine-grained repositories because we found that there wasn't really much tooling there. It's tricky to manage dependencies between them. And also when you have one repository, it's very easy to assign versions to stuff. Yes. NServiceBus 3.1, 3.2. It's very easy, but now we got like 2,500 repos. I don't even remember what versions we're on anymore. So we had to create our own tool, just as an example, it's called GitVersion. It's an open source product and use it if you like. Essentially it looks at your Git history and it stamps a version on your assemblies, which means that we no longer talk about version numbers.
- 34:35 Andreas Ohlund
- One thing we decide is, is this a hotfix? If it's a hotfix, we will follow... We follow the GitFlow branching model. Some products use GitHub flow. You can read up on those later, but GitVersion supports both those two branching models, GitHub flow and GitFlow. Well, GitHub flow is the simpler one. Essentially, we never talk about version numbers. We say, is this a hotfix? Do that. If it's a minor release, do that. And if it's a major release, we do that. So, the tool itself has sort of taken all that away. And we're now focusing on discussions about important stuff, right? Hotfix, that's a binary drop-in for you guys. Minor release, still backwards compatible and non-breaking. Major release, potential breaking changes in the API and so on. So don't be afraid. Don't be afraid to write code essentially.
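The mapping from "what kind of release is this?" to a version number follows directly from SemVer. The sketch below illustrates only that rule; it is not how GitVersion is implemented (the tool derives versions from Git history), and the `bump` function and its `release_type` values are hypothetical.

```python
def bump(version: str, release_type: str) -> str:
    """Return the next SemVer version for a hotfix, minor or major release."""
    major, minor, patch = (int(part) for part in version.split("."))
    if release_type == "hotfix":  # binary drop-in replacement
        return f"{major}.{minor}.{patch + 1}"
    if release_type == "minor":   # new features, still backwards compatible
        return f"{major}.{minor + 1}.0"
    if release_type == "major":   # potentially breaking API changes
        return f"{major + 1}.0.0"
    raise ValueError(f"unknown release type: {release_type}")


print(bump("3.3.8", "hotfix"))  # 3.3.9, the hotfix mentioned earlier in the talk
print(bump("4.3.0", "minor"))   # 4.4.0
print(bump("4.3.0", "major"))   # 5.0.0
```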
- 35:29 Andreas Ohlund
- We are all developers. We're really good at writing code. So, if there's some area of your process where there's friction, manual stuff going on, there might be some open source tools that can help you. I don't know. So let's help each other. I told you our core values were sort of automate all the things and be production ready by default, but I actually found a core value that I think is even more sort of to the point. And that's the core value that Netscape used back when they were alive, I guess. I don't know if that has anything to do with the core value, but I think it really rhymes with my thinking, that we must not ship crap, right? That's a nice thing to keep in mind, right?
- 36:24 Andreas Ohlund
- We don't want to ship crap. That should be a guiding star. Through Metasys that was sort of the core value of Netscape back then when they were sort of competing with Internet Explorer about who should be the best browser. Even though they're gone, I think that holds true today. We want to ship as quickly as possible. We know we're going to get that hotfix to you. We want to get those minors on a nice monthly cadence because that's what we're trying to do, at least every month there'll be a minor version. So that those features we're cooking should be in your hands, sort of in a monthly rhythm, but we don't want to ship crap. So if it's not ready, we're not going to ship it.
- 37:14 Andreas Ohlund
- So, we're really quality driven there. So, that's pretty much the final words. I was thinking I'm going to leave some time at the end for some Q and A, so, don't ship crap. And by the way, we're hiring. Let's do some Q and A. Questions? Do we have a mic or just raise your hand. I'll repeat the question. Question is how many agents we're actually using to run the tests. Greg, what is it? Is it seven or eight? Big instances, that's the max. We run eight of them, the biggest ones on Amazon, in parallel, but we actually need more. SQL Server 2014 has come out. So, our permutations just exploded.
- 38:21 Andreas Ohlund
- So we're probably looking at getting more. I think we need to go higher. And we're now at the stage where we're looking at randomized testing. I don't know if you heard about that. There was a good talk at NDC Oslo. Because we're now at the stage where we have so many permutations, we need to start thinking about perhaps randomizing, saying let's run 15 random tests for this commit and we'll pick Rabbit X running on Windows Y using storage Z.
- 38:53 Andreas Ohlund
- I guess, otherwise we won't have enough time to complete them. We're a very small company, right? So, it's really nice to have that value infrastructure: as long as you can pay, more images. And obviously, that's only on Amazon. We run stuff on Azure as well. More questions?
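The randomized approach described in this answer, picking 15 random combinations per commit instead of running the full matrix, could look roughly like the sketch below. It is only an illustration: the axis values are placeholders rather than the real build matrix, and `pick_permutations` is a hypothetical helper. Seeding with the commit id is an added assumption so that a failing combination can be reproduced.

```python
import itertools
import random

# Placeholder axes only; these are NOT the real transports, OS images or storages.
TRANSPORTS = ["rabbit-1", "rabbit-2", "rabbit-3"]
OPERATING_SYSTEMS = ["windows-1", "windows-2", "windows-3"]
STORAGES = ["storage-1", "storage-2", "storage-3", "storage-4"]


def pick_permutations(commit_id, count=15):
    """Return `count` distinct (transport, os, storage) combinations.

    Seeding the generator with the commit id (an assumption, not something
    stated in the talk) makes the selection repeatable for a given commit.
    """
    matrix = list(itertools.product(TRANSPORTS, OPERATING_SYSTEMS, STORAGES))
    rng = random.Random(commit_id)
    return rng.sample(matrix, min(count, len(matrix)))


selected = pick_permutations("abc123")
for transport, os_name, storage in selected:
    print(f"{transport} on {os_name} using {storage}")
```

Over many commits the random picks eventually cover the whole matrix, while each individual build stays within a fixed time budget.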
- 39:18 Speaker 1
- ...
- 39:21 Andreas Ohlund
- It's hard to test async stuff. So essentially you guys were looking at creating an environment where things would run sort of sequentially so that you can test your business logic, I guess, right?
- 39:32 Speaker 1
- ...
- 39:35 Andreas Ohlund
- Yeah, I mean, turn the threads down to, say, one, and you will do one message at a time. So I guess you can force it, at least on that level, to be sequential. But as I mentioned, right, it's async messaging, that's what we do. And it's really, really hard to test. Well, I guess that's one of the downsides. If you want to get all that scalability, robustness, all those nice benefits, let's not fool ourselves. That is tricky stuff. How do you monitor all that? And obviously the async nature will make it hard to test, right? And I wish I had sort of a magical wand that I can wave and say, "Turn the threads down to zero, be very sort of, unit test your handlers." And, obviously, we are trying to do our part to make sure that if your stuff works correctly, we make sure that when we interact with infrastructure everything is going to be working as it says on the box.
- 40:39 Andreas Ohlund
- So, I really don't have a better answer. I'm sorry. It's the asynchrony. And obviously we need to do as good a job as possible to help you put the unique attributes on your sagas. Think through your sagas: could the cancel order command arrive before the place order command? What if a user cancels the order quite quickly and the place order command gets stuck in the queue? If your sagas are not started by the cancel order command, that might come in, no saga available, and then someone replays the place order command from your queue and boom, you placed an order that the user has actually asked to cancel, right?
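The out-of-order scenario in this answer can be simulated in a few lines. The sketch below is plain Python, not the NServiceBus saga API; `OrderSaga`, `SagaStore` and `handle` are hypothetical names. It assumes the order id is the unique correlation property and lets both message types start the saga, so a cancel that arrives first is remembered.

```python
from dataclasses import dataclass, field


@dataclass
class OrderSaga:
    order_id: str
    placed: bool = False
    cancelled: bool = False


@dataclass
class SagaStore:
    # Keyed by order id, which plays the role of the unique correlation property.
    sagas: dict = field(default_factory=dict)

    def find_or_start(self, order_id):
        # Either message type may create the saga if none exists yet.
        return self.sagas.setdefault(order_id, OrderSaga(order_id))


def handle(store, message_type, order_id):
    """Process one message; arrival order must not change the final outcome."""
    saga = store.find_or_start(order_id)
    if message_type == "CancelOrder":
        saga.cancelled = True
        return "cancelled"
    if message_type == "PlaceOrder":
        if saga.cancelled:
            return "ignored"  # a cancel was already recorded for this order
        saga.placed = True
        return "placed"
    raise ValueError(f"unknown message type: {message_type}")


store = SagaStore()
# The cancel overtakes the place order command, which was stuck in the queue.
first = handle(store, "CancelOrder", "order-42")
second = handle(store, "PlaceOrder", "order-42")
print(first, second)
```

If only the place order message could start the saga, the early cancel would find nothing to act on and the late place order would go through, which is exactly the surprise described above.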
- 41:27 Andreas Ohlund
- So, there's a lot of sort of non-obvious stuff that happens when you're building systems like this. And it's very true. Don't fool yourself that it's a synchronous world, embrace the async and really try to test for it. Otherwise, there might be nasty surprises when you deploy. More questions? How much more time do we have? Two minutes, three minutes. So one or two more questions.
- 41:58 North
- How do you actually handle the version...
- 42:07 Andreas Ohlund
- Versioning? We use a tool called Ripple. It's essentially a tool written by the FubuMVC guys, Jeremy Miller and those guys. Essentially just wrapping NuGet and removing some of those poor design decisions made by the NuGet team. Removing the version number from the path, which is a very suboptimal design. So, that's how we do it. We use NuGet packages, so you build the core and you get a NuGet package, and then downstream the process will pick up that NuGet package. That said, we actually have a lot of friction with Ripple as well. So, that's the thing: constantly evaluate what tools you are using.
- 42:55 Andreas Ohlund
- So, we have issues with Ripple as well. So, we don't know if we're going to keep doing that. Some repos, we don't need it. So, we sort of went overboard and used it where we didn't need it. So we're backtracking on those repos. We're still using it on some, but we are using NuGet to do dependencies. And that part is working pretty nicely. You have services like MyGet and so on. And I used to wish NuGet would be more sort of opinionated and sort of, if you follow SemVer there's so much more that NuGet can do for you. But instead they have some crazy defaults. If you have a dependent package, they're pulling the oldest one. So if you have, say, NServiceBus RabbitMQ that depends on the core with a range between four and five, it's going to pull the oldest one.
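To make the "pulling the oldest one" default concrete, here is a small sketch of resolving a dependency range against a list of published versions. It is not NuGet's actual resolver: `resolve` is a hypothetical helper, the version numbers are made up, pre-release tags are ignored, and the "between four and five" range is assumed to exclude 5.0.0 itself.

```python
def parse(version):
    """Turn '4.10.0' into (4, 10, 0) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


def resolve(available, minimum, maximum_exclusive, prefer="lowest"):
    """Pick a version within [minimum, maximum_exclusive).

    prefer="lowest" mimics the default described in the talk;
    prefer="highest" is the alternative the speaker would rather have.
    """
    low, high = parse(minimum), parse(maximum_exclusive)
    candidates = sorted(v for v in map(parse, available) if low <= v < high)
    if not candidates:
        raise LookupError("no published version satisfies the range")
    chosen = candidates[0] if prefer == "lowest" else candidates[-1]
    return ".".join(str(part) for part in chosen)


# Made-up version list for a hypothetical core package.
available = ["3.3.8", "4.0.0", "4.2.1", "4.10.0", "5.0.0"]
lowest = resolve(available, "4.0.0", "5.0.0")
highest = resolve(available, "4.0.0", "5.0.0", prefer="highest")
print(lowest, highest)
```

With strict SemVer, everything inside the 4.x range is supposed to be backwards compatible, which is why picking the newest version in the range would be a safe default.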
- 43:45 Andreas Ohlund
- That's not a good default in my opinion, but I think the whole industry is getting there, right? So that's also actually a tip: follow SemVer. Then the tooling will start working for you. And it'll only be elongate pattern. So, try to follow SemVer. I think we're out of time, right? So, I want to thank you for coming and keep building good systems out there and don't ship crap.