CodeMash 2013 Precompiler Day One, Session Two


I am going to “live blog” my notes from CodeMash as it goes. I’ll write a post during each session. My commentary will be in italics.

Sridhar Nanjundeswaran is the dev lead for MongoDB on Azure, so that makes him a useful person to know, I think. 🙂

MongoDB is a scalable, high-performance, open source, document-oriented database. Data is stored in BSON, which is optimized for fast querying. MongoDB trades off some of the speed of a key-value store for a lot of the functionality of a relational store. The big mixing features are transactions and joins.

Data is kept in memory-mapped files with b-tree indexes. It is implemented in C++ and runs on pretty much every environment. There are drivers for most development environments/languages. There is a mongo shell that you can use to access it directly. There are drivers from third parties as well for other environments. The drivers translate the BSON to native types, which allows developers to use the idioms of their language.

Table Collection
Row(s) Document
Index Index
Partition Shard
Join Embedding/Linking
Fixed Schema Flexible/Implied Schema

There is a a transactional guarantee around single document saves, and when you consider that what would have been in a relational transaction across multiple tables is generally all in a single document. So, for the use cases where you would want to use MongoDB, you have sufficient transactional integrity.

MongoDB has extensions to JSON to facilitate some additional data types. BSON spec defines these. In general, MongoDB will take more space than a relational DB for the same data due to the schema being embedded in the document.

MongoDB is very light in it’s use of CPU, but demands lots of memory and disk (relative to a similar relational DB).

_id is the “primary key” for an element in collection. ObjectId is a 12 byte value that is like a Guid but smaller (Guids are 16 bytes). ObjectIds are guaranteed unique across a cluster.

The MongoDB Shell is sort of like SQL Server Query Analyzer except it uses JavaScript instead of SQL.

Counts in MongoDB can be slow – the b-tree index used is not optimized for counting, but the next version of MongoDB is addressing this.

The JavaScript looks very much like twisted SQL, but it allows you to use Map-Reduce as well. MongoDB supports Hadoop, too.

A document in MongoDB can be up to 16 MB. GridFS is a mapping system to allow you get get around this if you need to.

You can change a document’s data or schema at any time, but you cannot do a push to an existing field that is a scalar. You would need to rewrite the entire entity.

“Normalization” gets really tricky. Do you store the child entities in the document of their parent, or so you store some sort of link/reference and store the child as it’s own collection. The trade offs are how hard is it too query, storage, performance, and maintainability. So, how do you update the entity if it changes and is used in a lot of documents? Sometimes that is acceptable and sometimes not. You have to determine what makes sense for your case.

There is a DBRef, but it is really just syntactic sugar that generally does not add an value. It is NOT a form of foreign key constraint as people often try to use it.

Indexes are the single biggest tunable performance factor in MongoDB (just like in relational databases). Too many indexes has the same problems as in relational – writes become slow, heavy resource usage (memory in this case is the bigger deal). Indexes get built during the write operation, same as in relational. Generally, the same rules and practices apply to MongoDB indexes as relational.

You can actually create unique constraints via indexes. The only database constraint you can enforce in Mongo. You can recreate the index as well, but that is a blocking operation.

You can create an index that defines a time to live which will delete the document when the TTL expires. There is a process that does this work every minute.

A collection cannot have more than 64 indexes (which is way more than practical). The size of the key values in the index cannot exceed 1024 characters and the name cannot exceed 127 bytes. You can only use one index per query. You can not sort more than 32 MB of data without there being an index on it. MongoDB indexes are case sensitive.

Append .explain() to any JavaScript query to see it’s query plan. There is also a profiler you can turn on that will profile everything or any query that exceeds a threshold. Several query plans are tried, the winning one is cached. It remains cached for some period of time, but not forever.

Replicate you database to multiple other nodes. One node is the primary. If it dies, the remaining nodes vote on who should become the new primary. In the event of a tie, there won’t be a primary and we’ll have a read-only situation. To avoid that, make one non-primary node the arbitrator who will ensure that someone wins the election to primary. You can also set priorities for the nodes to help define failover order. You can define a node as “hidden” which will prevent it from ever becoming the primary. You might do this for a reporting node.

You can set replication delays. Replication is asynchronous and pull data from the primary. You might set a node to delay replication (slaveDelay) so that you get a bit of a backup that will still have things that might be dropped or other destructive events. You would want to mark this node as hidden as well.

Replication is eventually consistent, but if you require things to be always consistent you could query the primary always. There are times when you need that and times when you don’t. There are some very complex procedures around how to configure replication and primary voting – you can probably cover yourself for most scenarios you would need. You can replicate to a node that is physically not in the same data center, which allows for disaster recovery.

You can define how you read data – Primary Only, Primary preferred, Secondary only, Secondary preferred, or nearest. If you need consistent reads, use primary only or preferred.

In order to get disaster recovery, you need to have multiple data centers (more than 2) with more than one node per data center.

Heartbeat goes out every 2 seconds. It times out after 10 seconds. A missed heartbeat means a node is down and triggers the election if it is the primary.

There is an oplog that tells what has happened in replication. It is replayable. There is no multi-operation transactions, but every single operation is transactional – it succeeds or fails all on it’s own.

Bad things happen when the clocks of a replica set get out of sync. Keep them in sync.

The slides for these presentations are generally here, I think: