Latency is zero
This article is an excerpt from the book Dr. Harvey and the 8 Fallacies of Distributed Computing which chronicles this misadventures of Dr. Harvey Fallacious, a 19th-century explorer and researcher, and also my great-great-great-great-great-grandfather. In it, he details his travels and explorations in remote, then-uncharted regions of South America, unknowingly dealing with the repercussions of the 8 Fallacies long before they were written down by Peter Deutsch, let alone the birth of the computer age. You can get your own free copy of the eBook here.
On the efficacy of messages in bottles
Upon the return from my visit with the newly-discovered Ossian society, it happened that my ship capsized and I was marooned for a time on a tiny deserted island somewhere in the Caribbean Sea.
One might think that the biggest problem with being marooned would be finding food for survival, but that was not the case on this island. Coconut and pineapple trees grew all over, and as luck would have it, I stumbled across a hidden cache of rations likely left behind by Spanish explorers or pirates. Among the supplies were dried and preserved meats and a collection of rum in glass bottles.
The biggest problem when stranded, after finding food and shelter, is keeping one's own mind occupied. One day, I took a discarded rum bottle, inserted a note, and threw it into the sea.
Imagine my surprise when, approximately six months later (to my count), I observed the selfsame rum bottle washing to shore. Pulling the stopper out of the bottle, I removed the note and was astonished to find that it was not the one I had originally written!
My excitement waned when I read its contents: "Received your message. Happy to help. Where are you?"
Furious with myself, I wrote another message, describing my island's position in relation to the moon and stars, and returned the bottle once again to the surf. I began drinking rum at a furious pace, emptying bottles so I could send a new message once per week.
Luckily, ocean currents in the region proved to be consistent, for six months later I observed white sails on the horizon. Soon thereafter, my rescuers made landfall. I thanked them profusely but found myself severely irritated that, had I only included my position in my first message, my rescue could have been hastened by six months.
Dr. Harvey Fallacious
August 3, 1835
Latency is not zero
The speed of light is actually quite slow. Light emitted from the sun this very instant will not reach us here on Earth for 8.3 minutes. It takes a full 5.5 hours for sunlight to reach Pluto and 4.24 years to reach our closest neighboring star, Proxima Centauri. And we cannot communicate at the speed of light; we must bounce data around between Ethernet switches, slowing things down considerably.
Meanwhile, human expectations of speed are pretty demanding. A 1993 Nielsen usability study found that, for web browsing, a 100 millisecond delay was perceived as reacting instantly, and a one second delay corresponded to uninterrupted flow. Anything more than that is considered a distraction.
What is latency?
Latency is the inherent delay present in any communication, regardless of the size of the transmission. The total time for a communication will be
TotalTime = Latency + (Size / Bandwidth)
While bandwidth is limited by infrastructure, latency is primarily bounded by geography and the speed of light. Geography we can control. The speed of light we cannot.
This latency occurs in every communication across a network. This includes, of course, users connecting to our web server, but it also governs the communications our web server must make to respond to that request. Web service calls, or even requests to the database, are all affected by latency. Frequently, this means that making many requests for small items can be drastically slower than requesting one item of the combined size.
Ignore at your own risk
Latency is something that should always be considered when developing applications. To neglect it can have disastrous consequences.
One of our developers once worked with a client that was building a system that dealt with car insurance. In the client's specific country, automobile insurance coverage was required by law. The software needed to check if drivers had car insurance already. If they did not, they were automatically dumped into an expensive government-provided insurance pool.
The team decoupled this dependence on an external insurance check by stubbing it behind a web service. During development, this web service simply returned true or false immediately. Unbeknownst to the team, the production system used a dial-up modem to talk to the government system. Clearly, this was not going to work at scale.
This is an extreme example, but it illustrates the point. The time to cross the network in one direction can be small for a LAN, but for a WAN or the Internet, it can be very large — many times slower than in-memory access.
Careful with objects
In earlier days of object-oriented programming, there was a brief period when remote objects were fashionable. In this programming style, a local in-memory reference would be a proxy object for the "real" version on a server somewhere else. This would sometimes mean that accessing a single property on the proxy object would result in a network round trip.
Now, of course, we use data transfer objects (DTOs) that pack all of an object's data into one container, shipping it across the network all at once and thus eliminating multiple round trips associated with remote objects. This way, the latency penalty only has to be paid once, rather than on each property access.
So, it's important to be careful of how your data objects are implemented, especially those generated by O/RM tools. Because of object-oriented abstraction, it's not always easy to know if a property will just access local data already in memory or if it will require a costly network round trip to retrieve.
SELECT N+1 queries are just one specific example of the second fallacy of distributed computing rearing its ugly head.
To combat latency, we can first optimize the geography so that the distance to be crossed is as short as possible. Once we have committed to making the round trip and paying the latency penalty, we can optimize to get as much out of it as possible.
Accounting for geography
For minimum latency, communicating servers should be as close together as possible. Shorter overall network round trips with fewer hops will reduce the effect of latency within our applications.
One strategy specifically for web content involves the use of a content delivery network (CDN) to bring resources closer to the client. This is especially useful with resources that don't often change, such as images, videos, and other high-bandwidth items.
Latency typically isn't much of a problem within an on-premises data center with gigabit Ethernet connections. But when designing for the cloud, special attention should be paid to having our cloud resources deployed to the same availability zone. If we are to fail over to a secondary availability zone, all of the related resources should fail over together so that we do not encounter a situation in which an application server in one zone is forced to communicate with a database in another.
Go big or don't go
After accounting for geography, the best strategy to avoid latency involves optimizing how and when those communications take place.
When you're forced to cross the network, it's advisable to take all the data you might need with you. Clearly this is a double-edged sword because downloading 100 MB of data when you need 5 KB isn't a good solution, either. At the very least, inter-object "chit-chat" should not need to cross the network. Ultimately, though, experience is required to analyze the use for the data. Access to that data can then be optimized to balance the need for small download size and minimal amount of round trips. This is related to the 3rd fallacy, bandwidth is infinite, which will be covered next.
However, a better strategy is to remove the need to cross the network in the first place. The latency delay of a request you don't have to make is zero.
One obvious strategy is to utilize in-memory caching so that the latency cost is paid by the first request. This will be to the benefit of all those that come afterward; they can share the same saved response. But of course, caching isn't always a possibility, and it introduces its own problems. As it has been said, "There are only two hard things in computer science: cache invalidation, naming things, and off-by-one errors." Knowing when to invalidate a cached item, at least without asking the source and negating the benefit of the cache, is a hard thing to do.
Harness the power of events
With asynchronous messaging, you can publish events about information updates immediately, when they happen. Other systems that are interested in this data can then subscribe to receive these events and cache the data locally. This ensures that when the data is needed, it can be provided without requiring a network round trip.
Not every scenario requires this additional complexity, but when used appropriately, it can be very powerful. Instead of having to wait to request updated data, always-up-to-date data can be served instantly.
We can use a variety of strategies to minimize the number of times we must cross the network. We can carefully architect our system to minimize differences in geography, and we can optimize our network communications so that we return all the information that might be needed, minimizing cross-network chit-chat. This requires experience in carefully analyzing system use patterns in order to optimize how information is delivered between systems.
We can also use a variety of caching strategies, including in-memory cache and content delivery networks, to minimize repeated requests for information. CDNs in particular are useful for moving content "closer" to the end user and minimizing the latency for those specific items.
Publishing events when data changes can be instrumental in managing a distributed infrastructure. Being notified of changes can help disconnected systems remain in sync.
The speed of light may be slow, but it doesn't have to keep you down.
About the author: David Boike is a developer at Particular Software who has never been on a deserted island but is glad to hear there's rum.