Cosmos DB Persistence — Questions & Answers

Written by Adam Ralph, Dan Kent, and David Boike on August 17, 2021

We recently released NServiceBus.Persistence.CosmosDB version 1.0, which provides saga and outbox storage for NServiceBus endpoints that is transactionally consistent with the business data you store in Cosmos DB.

This component was previously offered as a preview package. Now that it has reached version 1.0, our full support policy applies, including API stability and backporting of bugfixes.

Now that our Cosmos DB persistence has reached general availability, let’s answer some common questions about Cosmos DB and what it means to use Cosmos DB with NServiceBus.

What is Cosmos DB?
Is Cosmos DB serverless?
How does Cosmos DB pricing work?
What is Cosmos DB used for?
Which Cosmos DB API should I use?
How do I model data with Cosmos DB?
Why should I use NServiceBus Cosmos DB persistence?
Why use Cosmos DB for the NServiceBus outbox feature?
How do I move from Azure Tables to Cosmos DB?

🔗What is Cosmos DB?

Cosmos DB is the primary database service provided by Azure. Microsoft guarantees response times of less than 10ms and 99.999% availability, ¹ along with features such as global distribution and automatic indexing.
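To put the 99.999% figure in perspective, here is a minimal arithmetic sketch in Python. It assumes a year of 365.25 days; the function name is ours and purely illustrative.

```python
# Downtime allowed per year at a given availability level.
# Assumption: a year of 365.25 days (31,557,600 seconds).
from fractions import Fraction

SECONDS_PER_YEAR = Fraction(36525, 100) * 24 * 60 * 60


def downtime_seconds_per_year(availability_percent: str) -> Fraction:
    """Return how many seconds per year a service may be unavailable."""
    unavailable_fraction = 1 - Fraction(availability_percent) / 100
    return SECONDS_PER_YEAR * unavailable_fraction


seconds = downtime_seconds_per_year("99.999")
minutes, remainder = divmod(round(seconds), 60)
print(f"{minutes} min {remainder} s")  # prints "5 min 16 s"
```

Exact fractions are used instead of floats so that `1 - 0.99999` does not pick up rounding noise.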

Cosmos DB is a polyglot, NoSQL database that supports multiple APIs, but when people say “Cosmos DB,” they usually mean the Core (SQL) API. But what does it mean for a NoSQL database to have an SQL API? It isn’t SQL in the truest sense—you don’t get an IDbConnection, you don’t get INSERT, UPDATE, or DELETE, but you do get an SQL-like SELECT API for querying data.

Under the hood, each “record” in Cosmos DB is a JSON document with an ID and partition key that, together, define its globally unique ID. You can use the SQL-like query language to query over multiple documents, or if you have the ID and partition key, you can look up a single document with a point read—the cheapest query possible.
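The difference between a point read and a query can be sketched with a toy in-memory container. This is only an illustration of the idea, not the Cosmos DB SDK: `InMemoryContainer`, `point_read`, and `query` are hypothetical names, and the sample documents are made up.

```python
from typing import Callable, Dict, List, Optional, Tuple

Document = Dict[str, object]


class InMemoryContainer:
    """Toy stand-in for a container; not the Cosmos DB SDK."""

    def __init__(self, partition_key_field: str) -> None:
        self.partition_key_field = partition_key_field
        # The (partition key, id) pair is the unique identity of a document.
        self._documents: Dict[Tuple[str, str], Document] = {}

    def upsert(self, document: Document) -> None:
        key = (str(document[self.partition_key_field]), str(document["id"]))
        self._documents[key] = document

    def point_read(self, partition_key: str, document_id: str) -> Optional[Document]:
        # Direct lookup by full identity: no scanning involved.
        return self._documents.get((partition_key, document_id))

    def query(self, predicate: Callable[[Document], bool]) -> List[Document]:
        # Stand-in for a SQL-like SELECT: every document is examined.
        return [doc for doc in self._documents.values() if predicate(doc)]


orders = InMemoryContainer(partition_key_field="customerId")
orders.upsert({"id": "order-1", "customerId": "alice", "total": 40})
orders.upsert({"id": "order-1", "customerId": "bob", "total": 15})
orders.upsert({"id": "order-2", "customerId": "alice", "total": 75})

single = orders.point_read("alice", "order-1")
large = orders.query(lambda doc: doc["total"] > 30)
```

Note that the same `id` can appear under two different partition keys; only the pair has to be unique.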

🔗Is Cosmos DB serverless?

People ascribe a couple of different meanings to the word serverless, so let’s take a look at both.

Cosmos DB is a Platform-as-a-Service (PaaS) offering, which means thinking about servers isn’t something you have to do. You never need to patch your host machine or Cosmos DB instance. In that sense, it is serverless.

From a billing perspective, Cosmos DB offers two different modes. One of those is a serverless mode where you only pay for what you use, although you’ll get better performance from the provisioned throughput mode, where you pay to have a certain amount of capacity allocated to you.

🔗How does Cosmos DB pricing work?

Pricing for Cosmos DB is quite a bit different from other databases. All billing is normalized by Request Units (or RUs, for short). You don’t have to worry about managing CPU, IOPS, or memory with Cosmos DB—that’s Microsoft’s problem. Those physical concepts are abstracted away behind the RU, where a point read for a 1KB item is 1 RU. Other operations cost more RUs, as do operations that operate on more data or with more complexity. Still, the same operation over the same data will always cost the same number of RUs. Microsoft does have a Cosmos DB Capacity Calculator where you can estimate how much a workload will cost.

Serverless mode is ideal for spiky or unpredictable workloads that don’t have sustained traffic. It’s also great for non-production scenarios or for building a proof of concept, because it only costs money when it’s used.

There is also provisioned throughput mode, which is ideal for critical workloads. You pick a maximum number of RUs per second and get billed per hour. Provisioned throughput mode also has a free tier, where one Cosmos DB account on your Azure subscription gets the first 1,000 RU/sec and 25GB of storage for free—ideal for learning something new.
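As a rough sketch of how RU arithmetic works, the snippet below uses only the two figures quoted in this article: 1 RU for a point read of a 1KB item, and 1,000 RU/sec in the free tier. It deliberately covers point reads only, because the cost of other operations depends on the data and is not stated here. The function names are hypothetical.

```python
POINT_READ_RU_1KB = 1            # from the article: point read of a 1KB item
FREE_TIER_RU_PER_SECOND = 1_000  # from the article: free tier allowance


def required_ru_per_second(point_reads_per_second: int) -> int:
    """Throughput needed for a workload made up only of 1KB point reads."""
    return point_reads_per_second * POINT_READ_RU_1KB


def fits_free_tier(point_reads_per_second: int) -> bool:
    """True if the workload stays within the free tier's RU/sec allowance."""
    return required_ru_per_second(point_reads_per_second) <= FREE_TIER_RU_PER_SECOND


for reads in (250, 1_000, 1_500):
    print(reads, required_ru_per_second(reads), fits_free_tier(reads))
```

For a realistic mix of reads, writes, and queries, Microsoft's capacity calculator is the better tool.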

🔗What is Cosmos DB used for?

“All sorts of things” is the obvious answer. But where Cosmos DB really shines is for systems that require worldwide low latency with flexible pricing. You don’t have to allocate a gigantic SQL Server instance in one data center and then figure out how to make that scale. Instead, your data is effectively everywhere.

Cosmos DB is a great fit for building a service that offers a free tier and then later transitions customers to paid and premium tiers.

Customers start out on a free tier, with all of their data arranged with different partition keys on a shared container, all sharing RUs within that container. Arranging tenants this way in a multi-tenant solution is subject to the noisy neighbor problem, ² but for a free tier, that’s OK. This makes the marginal cost of adding another customer on the free tier essentially zero.

Later, when the customer upgrades to a paid plan, you can use Cosmos DB’s change feed feature to do a live migration of data to a different container with provisioned throughput, paid for according to the customer’s payment plan.
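The idea behind that migration can be sketched with a toy model in which the change feed is just an append-only list of writes. This is an assumption made for illustration: the real change feed is consumed through the Cosmos DB SDK, and `Container`, `change_log`, and `migrate_tenant` here are hypothetical names.

```python
from typing import Dict, List, Tuple

Document = Dict[str, object]


class Container:
    """Toy container that records every write in an append-only change log."""

    def __init__(self) -> None:
        self.documents: Dict[Tuple[str, str], Document] = {}
        self.change_log: List[Document] = []  # stand-in for the change feed

    def upsert(self, document: Document) -> None:
        key = (str(document["tenantId"]), str(document["id"]))
        self.documents[key] = document
        self.change_log.append(document)


def migrate_tenant(source: Container, target: Container, tenant_id: str, start: int = 0) -> int:
    """Copy one tenant's changes from `start` onward; return the next log position."""
    for change in source.change_log[start:]:
        if change["tenantId"] == tenant_id:
            target.upsert(change)
    return len(source.change_log)


shared = Container()
shared.upsert({"id": "1", "tenantId": "acme", "plan": "free"})
shared.upsert({"id": "1", "tenantId": "globex", "plan": "free"})

dedicated = Container()
position = migrate_tenant(shared, dedicated, "acme")

# A write that arrives during the migration is picked up by resuming from `position`.
shared.upsert({"id": "2", "tenantId": "acme", "plan": "free"})
position = migrate_tenant(shared, dedicated, "acme", start=position)
```

Because the log position is remembered, the tenant can keep writing while the copy catches up, which is what makes the migration "live".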

For a demonstration of how this live migration feature works, check out our webinar recording Building multi-tenant systems using NServiceBus and Cosmos DB.

🔗Which Cosmos DB API should I use?

When most people talk about using Cosmos DB, they’re talking about the Core (SQL) API. This is the API used by NServiceBus.Persistence.CosmosDB. We only expose the transaction when using the Core API, so if you want to use NServiceBus with Cosmos DB, the Core (SQL) API is the one to use.

But for completeness, Cosmos DB currently has five different APIs to choose from:

Core (SQL)
API for MongoDB
Cassandra
Azure Table
Gremlin (Graph)

The Core (SQL) API is the native Cosmos DB API, and it’s designed for building new stuff. MongoDB, Cassandra, and Azure Table storage APIs are designed to migrate existing data from those databases to Cosmos DB. Gremlin is intended for unique workloads focused on relationships between data. You might use this if you were designing a new competitor to Facebook.

🔗How do I model data with Cosmos DB?

It’s essential to understand how Cosmos DB’s data structure affects data modeling for a system.

Within a Cosmos DB account, you can have one or more databases. Within each database, you can have one or more containers, up to a limit of 25 when the containers share the database’s throughput. In addition, you can have many logical partitions within each container defined by a partition key, which is a string.

At the account level, you can configure geo-replication settings. At the database level, you can configure your throughput using a manual number of RUs per second, or you can set an entire database to autoscale. Finally, at the container level, you can also set throughput, as well as define a partition key. One other important limitation is that each logical partition is limited to 20GB of data.
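The hierarchy and the logical partition limit can be pictured with a few toy data classes. This is a sketch only: the class names mirror the concepts above, the 20GB figure is the one quoted in this article, and the sizes in the usage lines are made up.

```python
from dataclasses import dataclass, field
from typing import Dict

LOGICAL_PARTITION_LIMIT_GB = 20  # limit quoted in the article


@dataclass
class Container:
    partition_key_path: str
    # logical partition key value -> stored size in GB
    partition_sizes_gb: Dict[str, float] = field(default_factory=dict)

    def add_data(self, partition_key: str, size_gb: float) -> None:
        new_size = self.partition_sizes_gb.get(partition_key, 0.0) + size_gb
        if new_size > LOGICAL_PARTITION_LIMIT_GB:
            raise ValueError(
                f"logical partition {partition_key!r} would exceed "
                f"{LOGICAL_PARTITION_LIMIT_GB}GB"
            )
        self.partition_sizes_gb[partition_key] = new_size


@dataclass
class Database:
    containers: Dict[str, Container] = field(default_factory=dict)


@dataclass
class Account:
    databases: Dict[str, Database] = field(default_factory=dict)


account = Account({"sales": Database({"orders": Container("/tenantId")})})
orders = account.databases["sales"].containers["orders"]
orders.add_data("acme", 12.5)
orders.add_data("globex", 19.0)  # fine: the limit applies per logical partition
```

The point of the sketch is that the limit is per partition key value, so choosing a partition key that spreads data evenly matters more than the total size of the container.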

These settings and limitations introduce a unique set of tradeoffs. To get a better idea of how to approach data modeling in a Cosmos DB system, check out our webinar recording Building multi-tenant systems using NServiceBus and Cosmos DB.


🔗Why should I use NServiceBus Cosmos DB persistence?

The main reason to use NServiceBus Cosmos DB persistence is when your business data is going to be stored in Cosmos DB. In fact, that’s the main reason to choose any of our persistence packages.

To use the NServiceBus outbox feature, you need to store the outbox data in the same place as your business data using the same local database transaction. Our Cosmos DB persistence uses Cosmos DB transactional batch operations to store any changes your message handlers make to business data and outbox data in the same atomic transaction.

🔗Why use Cosmos DB for the NServiceBus outbox feature?

One challenge of developing for the cloud is that it’s hard to keep things consistent unless you make each and every operation idempotent, and idempotence is hard.

When you build a system with NServiceBus, each message represents a discrete operation. If it fails, it’s no big deal—it can be retried until it’s successful. All that this requires is that the messaging operations and data operations are kept consistent.

Even that can be a challenge, but not for NServiceBus with Cosmos DB persistence.

The NServiceBus outbox feature keeps whatever you’re doing with your database consistent with messages being sent and received, so that you don’t get zombie records ³ or ghost messages. ⁴

NServiceBus uses Cosmos DB to store information about the messages being sent from a message handler alongside your business data, in the same transactional batch, so that either everything succeeds or everything fails and can be safely retried.
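To make the all-or-nothing behavior concrete, here is a toy in-memory sketch of the outbox idea. It is not how NServiceBus or Cosmos DB implement it: `Store`, `commit_batch`, and `handle_message` are hypothetical names, and the "batch" is simply a staged copy that is swapped in only if every operation succeeds.

```python
from typing import Dict, List

Document = Dict[str, object]


class Store:
    """Toy document store whose batches are all-or-nothing."""

    def __init__(self) -> None:
        self.documents: Dict[str, Document] = {}

    def commit_batch(self, operations: List[Document]) -> None:
        staged = dict(self.documents)
        for document in operations:
            if "id" not in document:
                raise ValueError("every document needs an id")
            staged[str(document["id"])] = document
        # Swap in the staged copy only after every operation succeeded.
        self.documents = staged


def handle_message(store: Store, message_id: str, order: Document, outgoing: List[str]) -> List[str]:
    """Store business data and the outbox record together; return messages to dispatch."""
    outbox_id = f"outbox-{message_id}"
    existing = store.documents.get(outbox_id)
    if existing is not None:
        # Already processed: re-dispatch the recorded messages, skip the business logic.
        return list(existing["outgoing"])
    store.commit_batch([order, {"id": outbox_id, "outgoing": outgoing}])
    return outgoing


store = Store()
first = handle_message(store, "m1", {"id": "order-1", "status": "placed"}, ["OrderPlaced"])
# A retry of the same message finds the outbox record and does not write again.
retry = handle_message(store, "m1", {"id": "order-1", "status": "changed"}, ["Other"])
```

If the batch fails partway through, neither the business document nor the outbox record is stored, so a retry starts from a clean slate.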

🔗How do I move from Azure Tables to Cosmos DB?

It may be time to start thinking about migrating any data you have in Table storage to Cosmos DB. We have a tool to make that really easy for your NServiceBus data.

Cosmos DB is clearly positioned as the successor to Azure Table storage. For example, at the time this article was published, Microsoft’s Table storage page showed the following banner:

Use Azure Cosmos DB’s support for Tables API to take advantage of global distribution, automatic indexing and rich query, dedicated throughput, and single digit millisecond latencies.

This seems to indicate the direction that Microsoft is taking. It’s also worth noting that Microsoft has included the Table API as one of the Cosmos DB API options to make migrations from Table storage easier.

For customers using either of our Azure Table storage packages, ⁵ we’ve created an AzureTable export tool. It exports your saga data from Azure Table storage as JSON documents, which can then be imported into Cosmos DB using the Data Migration tool provided by Microsoft.

🔗Summary

Cosmos DB is a great data platform for building new systems designed for the cloud.

To learn more about Cosmos DB with NServiceBus, check out:


About the authors

Adam Ralph is a developer on a personal voyage to rid the world of whitespace errors.

Dan Kent is a hoopy frood who misplaced his towel and had to become a software developer instead.

David Boike is a developer who dreamed of exploring the Cosmos, but is a bit too tall for a space suit and mostly writes code in his basement.

¹ For reference, in one year, that's 5 minutes 16 seconds of downtime.
² A noisy neighbor problem occurs when one tenant in a multi-tenant hosting scenario monopolizes so many resources that performance suffers for the other tenants sharing them.
³ Data committed to the database but without an accompanying message sent out due to a failure between the database commit and message send operations. See What does idempotence mean? for more details.
⁴ A message sent before a failure occurs that can't be recalled, so the message refers to data that never got saved to the database. See What does idempotence mean? for more details.
⁵ The two package names are NServiceBus.Persistence.AzureStorage and NServiceBus.Persistence.AzureTable. They're the same thing; we just renamed the package between versions 2 and 3 to make it clearer what it does.