

How do you model one-to-many relationships in Cosmos DB?

Cosmos DB is a document database with no foreign key enforcement and no JOIN across documents or containers (its JOIN operates only within a single item), so modeling a one-to-many relationship requires deliberate design decisions based on your access patterns. There are three main strategies, each with different tradeoffs:

1. Embedding (denormalization): Store the "many" items directly inside the "one" parent document as an array. For example, a blog post with its comments embedded:

{
  "id": "post-101",
  "title": "Cosmos DB tips",
  "comments": [
    { "id": "c1", "text": "Great post!", "author": "alice" },
    { "id": "c2", "text": "Helpful!", "author": "bob" }
  ]
}

This is ideal when the "many" side is small, always accessed with the parent, and not queried independently: one point read fetches everything. Avoid it when the array can grow without bound. The document can approach the 2 MB per-item limit, and writes get more expensive because every comment addition rewrites the entire document.
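One way to keep an eye on the growth risk described above is to estimate the document size before appending a comment. The sketch below is illustrative only: `item_size_bytes`, `can_embed`, and the 50% `headroom` threshold are hypothetical names and values, and the size is approximated from the compact JSON encoding, so the size the service actually records (which includes system properties) will differ somewhat.

```python
import json

MAX_ITEM_BYTES = 2 * 1024 * 1024  # the 2 MB per-item limit mentioned above


def item_size_bytes(doc: dict) -> int:
    """Approximate the stored size as the UTF-8 length of the compact JSON form."""
    encoded = json.dumps(doc, separators=(",", ":"), ensure_ascii=False)
    return len(encoded.encode("utf-8"))


def can_embed(post: dict, comment: dict, headroom: float = 0.5) -> bool:
    """Return True if appending `comment` keeps the post under headroom * limit.

    `headroom` is an assumed safety margin, not a Cosmos DB setting.
    """
    candidate = {**post, "comments": post.get("comments", []) + [comment]}
    return item_size_bytes(candidate) <= MAX_ITEM_BYTES * headroom


post = {
    "id": "post-101",
    "title": "Cosmos DB tips",
    "comments": [
        {"id": "c1", "text": "Great post!", "author": "alice"},
        {"id": "c2", "text": "Helpful!", "author": "bob"},
    ],
}
small = {"id": "c3", "text": "Thanks!", "author": "carol"}
huge = {"id": "c4", "text": "x" * MAX_ITEM_BYTES, "author": "dave"}

print(can_embed(post, small))  # True
print(can_embed(post, huge))   # False
```

If `can_embed` starts returning False for ordinary comments, that is a signal the relationship has outgrown embedding and one of the other two strategies is a better fit.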

2. Referencing with cross-container lookup: Store comments in a separate container with postId as the partition key. The post document holds no comment data; each comment carries the postId of its parent. Query comments with WHERE c.postId = 'post-101', which stays within a single logical partition. This suits comments that are numerous, independently paginated, or subject to an independent TTL. The tradeoff: fetching a post with its comments takes two operations.
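The two-operation read can be sketched roughly as follows. This assumes the `azure-cosmos` Python SDK's `read_item` and `query_items` container methods, and it assumes the posts container is partitioned by `/id`, which the text does not state. The helper names `comments_query_args` and `fetch_post_with_comments` are hypothetical, and `posts` and `comments` stand for container clients created elsewhere.

```python
from typing import Any

# Parameterized form of the WHERE c.postId = 'post-101' query from the text.
COMMENTS_BY_POST = "SELECT * FROM c WHERE c.postId = @postId"


def comments_query_args(post_id: str, page_size: int = 20) -> dict[str, Any]:
    """Build keyword arguments for a single-partition, parameterized query."""
    return {
        "query": COMMENTS_BY_POST,
        "parameters": [{"name": "@postId", "value": post_id}],
        "partition_key": post_id,     # route the query to one logical partition
        "max_item_count": page_size,  # page size hint for pagination
    }


def fetch_post_with_comments(posts: Any, comments: Any, post_id: str) -> dict[str, Any]:
    """Two operations: a point read for the post, then a query for its comments."""
    # Assumption: the posts container uses /id as its partition key.
    post = posts.read_item(item=post_id, partition_key=post_id)
    found = list(comments.query_items(**comments_query_args(post_id)))
    return {"post": post, "comments": found}
```

Passing the partition key along with the query is what keeps the lookup efficient: the request is served by the one partition that holds the post's comments instead of fanning out across all of them.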

3. Denormalized buckets (time-bounded embedding): A hybrid that bounds the array size by grouping related items into "bucket" documents (e.g., one document per month of comments per post). Each bucket holds up to N comments, and a new bucket is created when the current one fills up. This is more complex, but it avoids both the unbounded growth problem and the overhead of a separate container for read-heavy data.
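The rollover logic can be sketched in plain Python. Everything here is an assumption made for illustration: the `postId:month:seq` id scheme, the `BUCKET_CAPACITY` value standing in for N, and the in-memory list that stands in for wherever the bucket documents are stored.

```python
from datetime import datetime, timezone

BUCKET_CAPACITY = 100  # hypothetical N; pick a value that keeps buckets small


def new_bucket(post_id: str, month: str, seq: int) -> dict:
    """Create an empty bucket document for one post, month, and sequence number."""
    return {
        "id": f"{post_id}:{month}:{seq}",
        "postId": post_id,
        "month": month,
        "seq": seq,
        "comments": [],
    }


def add_comment(buckets: list[dict], post_id: str, comment: dict, when: datetime) -> dict:
    """Append to the latest bucket for this post and month, rolling over when full."""
    month = when.strftime("%Y-%m")
    current = [b for b in buckets if b["postId"] == post_id and b["month"] == month]
    latest = max(current, key=lambda b: b["seq"], default=None)
    if latest is None or len(latest["comments"]) >= BUCKET_CAPACITY:
        next_seq = 0 if latest is None else latest["seq"] + 1
        latest = new_bucket(post_id, month, next_seq)
        buckets.append(latest)
    latest["comments"].append(comment)
    return latest


buckets: list[dict] = []
when = datetime(2024, 5, 17, tzinfo=timezone.utc)
for i in range(BUCKET_CAPACITY + 1):
    add_comment(buckets, "post-101", {"id": f"c{i}", "text": "hi"}, when)

print([b["id"] for b in buckets])  # ['post-101:2024-05:0', 'post-101:2024-05:1']
print(len(buckets[0]["comments"]), len(buckets[1]["comments"]))  # 100 1
```

Because every bucket id is derived from the post, the month, and a sequence number, a reader can fetch a known bucket directly and only needs a query when it does not know how many buckets exist.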

The right model depends almost entirely on three things: how often the two sides of the relationship are read together versus independently, how large the "many" side can grow, and which side drives the query access pattern.

What is the primary risk of embedding an unbounded array of child items inside a parent document in Cosmos DB?
If comments are stored in a separate container partitioned by postId, what query pattern retrieves all comments for a given post efficiently?


