skip to main content
Using the driver : MongoDB Sharding
  

Try DataDirect Drivers Now

MongoDB Sharding

MongoDB employs sharding as a horizontal scaling solution for supporting large sets of data. It is designed to provide scalable throughput and storage capacity. To accomplish this, sharding shares logical databases across multiple independent replica sets (clustered servers), or shards. By distributing data across servers, operations are delegated only to the servers that store data relevant to the task. This increases the availability of servers and CPU capacity, resulting in increased throughput. If operations exceed the available processing capacity of the clusters, additional servers can be added to the cluster, which reduces the number of operations performed by each server and can improve performance. A similar principle applies to storage capacity, where servers can be added to accommodate increased storage requirements.
Caution: Although the driver connects to a MongoDB sharded cluster transparently, it is critical that the primary key column have unique values for every row (or document) in the table. The values for the default primary key of _ID are generated by a MongoDB database; however, in a sharded cluster, these values are not guaranteed to be unique across shards unless specifically configured in the MongoDB cluster. If duplicate identifiers are mapped to a relational view, write operations can produce undesired results. You can prevent this behavior by confirming that the values of _ID key are configured to be unique across shards, or by designating a new primary key. See Designating a Primary Key for additional information.