What is sharding in MongoDB?



To my understanding, sharding is a way to apply a distributed system to MongoDB. With sharding, you can distribute the workload of storing and processing collections of data across multiple machines.


Sharding helps with performance, throughput, and fault tolerance of your system.

MongoDB can be configured to use sharding, where the data is distributed across replica sets running on different physical machines. This makes for good "horizontal scaling" (as opposed to vertical scaling), where commodity hardware can be leveraged to spread the workload across machines.

Sharding is good for several reasons. Apart from obviously increasing the amount of data you can store, sharding also increases availability and throughput.


Sharding is a method for splitting a data set across different machines. Every shard holds a subset of the collective data. Each shard is typically deployed as a replica set, so its subset of the data is copied across several servers.
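To make the "every shard is a subset" idea concrete, here is a rough Python sketch of routing documents to shards by hashing a shard key. It is a toy model, not how MongoDB is implemented: the function names (shard_for, distribute), the user_id field, and the modulo step are all assumptions for illustration. As far as I know, MongoDB itself assigns ranges of shard-key values (chunks) to shards rather than using a simple modulo.

```python
import hashlib
from collections import defaultdict


def shard_for(key: str, num_shards: int) -> int:
    """Map a shard-key value to a shard index using a stable hash (toy model)."""
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    return int(digest, 16) % num_shards


def distribute(docs, shard_key, num_shards):
    """Group documents by the shard their shard-key value hashes to."""
    shards = defaultdict(list)
    for doc in docs:
        shards[shard_for(str(doc[shard_key]), num_shards)].append(doc)
    return shards


# Hypothetical sample data: 1000 small documents keyed by user_id.
docs = [{"user_id": i, "name": f"user{i}"} for i in range(1000)]
shards = distribute(docs, "user_id", 3)

for idx in sorted(shards):
    print(f"shard {idx}: {len(shards[idx])} documents")
```

Each document lands on exactly one shard, and the shards together hold the whole collection. The same key always maps to the same shard, which is what lets a router send a query for one user_id to a single machine.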


Sharding is the process of splitting data across different servers, with each shard deployed as a replica set. Each shard holds a portion of the sharded data. This helps with high availability and storage capacity. Sharding is fundamental to horizontal scaling.
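The availability part comes from each shard being a replica set. Below is a small Python sketch of that idea, under heavy simplifying assumptions: the ToyShard class and member names are made up, and the "election" just picks the first healthy member when a majority is still up. Real replica set elections also take things like member priority and how up to date each member is into account.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set


@dataclass
class ToyShard:
    """A shard modeled as a replica set of named members (illustrative only)."""
    name: str
    members: List[str]
    down: Set[str] = field(default_factory=set)

    def fail(self, member: str) -> None:
        """Mark a member as unavailable."""
        self.down.add(member)

    def primary(self) -> Optional[str]:
        """Return the acting primary, or None if no majority is healthy."""
        healthy = [m for m in self.members if m not in self.down]
        if len(healthy) * 2 <= len(self.members):
            return None  # no majority, so no primary can be chosen
        return healthy[0]


shard = ToyShard("shardA", ["a1", "a2", "a3"])
print(shard.primary())  # a1
shard.fail("a1")
print(shard.primary())  # a2 takes over, the shard stays writable
shard.fail("a2")
print(shard.primary())  # None: only 1 of 3 members is left
```

Losing one member out of three does not take the shard's data offline, which is why sharding combined with replica sets helps availability as well as capacity.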