Things I wish I knew before I got started with CosmosDB

Lately I've been working on a project using Cosmos DB on Azure. I really like the simplicity of a document DB. However there were some things I learned while on the job which I wished somebody told me before starting with it.

Cosmos DB?

Cosmos DB is Microsoft's flavoured document DB, you could call it Microsoft's Mongo DB that runs on Azure. On contrary to a normal SQL database, a document DB stores documents in the database. There is no schema, so you don't need to define columns and data types. Instead you pass a document to the document database and it will store it in JSON for you. This also means that you can add and remove properties on a document without having to change other documents which store similar information. An example: if there are 2 documents in the database which represent a User, the first User can have an address and it's perfectly possible for the second User to not have an address.

First user:

{
	id: "41bb417e-c942-4df2-96f0-cd2c3e1b2f91",
	name: "User 1",
	street: "My street",
	number: 55,
	postalCode: 1234,
	city: "Antwerp"
}

Second user:

{
	id: "cc4f3a5b-5e1c-4cdf-8167-dbd363538e26",
	name: "User 2"
}

Now we got the basics, let's look into some operations.

Parameterized query

Let's start with the very basic. If you'd like to run a query using parameters with C#, the SqlQuerySpec class can be used and passed to the DocumentClient. Both classes can be found in the Microsoft.Azure.DocumentDB package.

var querySpec = new SqlQuerySpec {
	QueryText = "select * from c where c.id = @id",
	Parameters = new SqlParameterCollection {
		new SqlParameter { 
			Name = "@id",
			Value = userId
		}
	}
}

// These properties can be found in the Azure portal.
var documentDbUri = "cosmosdb-uri";
var databaseId = "cosmosdb-database";
var collectionId = "cosmosdb-collection";

var documentClient = new DocumentClient(new Uri(documentDbUri), documentDbKey);
var database = documentClient.CreateDatabaseQuery().FirstOrDefault(d => d.Id == databaseId);
var collection = documentClient.CreateDocumentCollectionQuery().FirstOrDefault(c => c.Id == collectionId);
var queryResult = documentClient.CreateQuery(collection.DocumentsLink, query);

ARRAY_CONTAINS

This function's name pretty much speaks for itself. If a document has a property which is an array, ARRAY_CONTAINS can be used in a query to check if the array contains a certain value.

An example, this is our document:

{
	id: "9476d2bb-6f19-483f-9a59-446ddc693050",
	name: "Jane Doe",
	roles: [
		"manager",
		"reporting"
	]
}

To query all the users that have the manager role:

select * from c where ARRAY_CONTAINS(c.roles, "manager")

If the property is an array of objects, you can also pass an object to the ARRAY_CONTAINS function. More info can be found in this Stackoverflow question

SELECT  *
FROM    food as f
WHERE   ARRAY_CONTAINS(f.servings, {"description":"bar"}, true)

Another great use of this function is using it as an alternative for the traditional SQL in keyword. If you would want to check if an ID is in an array that is passed as a parameter you would write something like this:

SELECT *
FROM Orders
WHERE CustomerId in (@customerIds)

@customerIds would then be an array of ids which is passed from C#.

To do a similar query on a CosmosDB instance, ARRAY_CONTAINS can be used. The array is passed as a parameter and ARRAY_CONTAINS can be used to check if the id is in the given parameter.

var customerIds = new List<int> { 5, 6, 7, 8 };

var querySpec = new SqlQuerySpec {
	QueryText = "select * from c where ARRAY_CONTAINS(@customerIds, c.customerId)",
	Parameters = new SqlParameterCollection {
		new SqlParameter { 
			Name = "@customerIds",
			Value = customerIds
		}
	}
}

Count

If you want to know how many records are in the result of a given query, there are two ways of getting the result. Using the LINQ Count method.

// These properties can be found in the Azure portal.
var documentDbUri = "cosmosdb-uri";
var databaseId = "cosmosdb-database";
var collectionId = "cosmosdb-collection";

var documentClient = new DocumentClient(new Uri(documentDbUri), documentDbKey);
var database = documentClient.CreateDatabaseQuery().FirstOrDefault(d => d.Id == databaseId);
var collection = documentClient.CreateDocumentCollectionQuery().FirstOrDefault(c => c.Id == collectionId);

var count = documentClient.CreateDocumentQuery<User>(collection.DocumentsLink)
						  .Where(u => u.city == "Antwerp")
						  .Count();

Or you can use the COUNT keyword in a SQL query, this query can be ran using a parameterized query.

SELECT VALUE COUNT(d) FROM d where city = 'Antwerp'

Read more about count in Cosmos DB

Client

Although the Azure portal allows to do any kind of operation on the document database, it's a very clunky way to work. I'm a big fan of using a dedicated client. Although the Microsoft Azure Storage Explorer still has some rough edges, the Cosmos DB support is still in preview, it works well. And besides managing documents in a Cosmos DB on Azure you can also manage your storage account.

Another option is the open source Cosmos DB explorer but it has less options than the Storage Explorer.

Like keyword

In traditional SQL the like keyword can be used to filter a string field on a certain value. CosmosDB foresees in equivalents for the like keyword. Let's go over the different options.

Show all rows where a specific column contains a certain value

SQL:

select * from Table where Column like '%value%'

CosmosDB:

select * from d where CONTAINS(d.Column, 'value')

Show all rows where a specific column starts with a certain value

SQL:

select * from Table where Column like 'value%'

CosmosDB:

select * from d where STARTSWITH(d.Column, 'value')

Show all rows where a specific column ends with a certain value

SQL:

select * from Table where Column like '%value'

CosmosDB:

select * from d where ENDSWITH(d.Column, 'value')

Conclusion

As you can see if you compare CosmosDB with a relational SQL database there are some things that work quite different. I hope this gives you a head start with Cosmos DB. Feel free to ping me if you have any other tips or remarks.