Sharding recovery
In this section, we will explore different failure types and how we can recover in a sharded environment.
Mongos
mongos is a relatively lightweight process that holds no state. If the process fails, we can simply restart it or spin up a new process on a different server. It's recommended that mongos processes be co-located on the same servers as our application, so it makes sense for the application to connect using the full set of mongos servers co-located on our application servers; this keeps mongos highly available.
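As a minimal sketch of connecting through a set of mongos servers rather than a single one, the helper below builds a standard mongodb:// connection string that lists several mongos hosts as seeds. The function name build_mongos_uri, the hostnames, and the database name are hypothetical and used only for illustration; the sketch handles plain hostnames only, not IPv6 literals.

```python
def build_mongos_uri(hosts, database="", port=27017):
    """Return a mongodb:// URI that lists every mongos host as a seed.

    hosts    -- hostnames, optionally already suffixed with ":port"
    database -- default database placed after the host list
    port     -- port appended to any host that does not specify one
    """
    if not hosts:
        raise ValueError("at least one mongos host is required")
    # Append the default port only when the host does not already carry one.
    seeds = [h if ":" in h else f"{h}:{port}" for h in hosts]
    return "mongodb://" + ",".join(seeds) + "/" + database


# Hypothetical application servers, each running a co-located mongos.
uri = build_mongos_uri(["app1.example.net", "app2.example.net:27018"], "shop")
print(uri)
```

A driver such as PyMongo accepts a URI of this shape in MongoClient(uri). With several mongos hosts listed, the driver can typically route operations to another listed mongos if one becomes unreachable, which is the point of connecting with the whole set.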
Mongod process
A mongod process failing in a sharded environment is no different than it failing in a replica set. If it is a secondary, the primary and the other secondary (assuming three-node replica sets) will continue as usual.
If it is a mongod process acting as a primary, then an election round will start to elect a new primary in this shard (which is really a replica set).
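The two failure cases above can be told apart from a shard's replica set status. The sketch below assumes a document shaped like the output of the replSetGetStatus command (a members array whose entries carry name, health, and stateStr); the function name summarize_replica_set, the sample hostnames, and the returned keys are hypothetical. It is a rough illustration of what a monitoring check might look at, not a complete health check.

```python
def summarize_replica_set(status):
    """Summarize a replSetGetStatus-style document for one shard.

    Returns the name of the current primary (or None if no member reports
    PRIMARY, for example while an election is in progress) and the names
    of members whose health is not 1.
    """
    primary = None
    unhealthy = []
    for member in status.get("members", []):
        if member.get("stateStr") == "PRIMARY":
            primary = member["name"]
        # health is 1 for reachable members and 0 otherwise.
        if member.get("health") != 1:
            unhealthy.append(member["name"])
    return {"primary": primary, "unhealthy": unhealthy}


# Hand-written sample; in practice the document might come from
# client.admin.command("replSetGetStatus") in PyMongo (assumption).
sample_status = {
    "set": "shard1",
    "members": [
        {"name": "db1.example.net:27018", "health": 1, "stateStr": "PRIMARY"},
        {"name": "db2.example.net:27018", "health": 1, "stateStr": "SECONDARY"},
        {"name": "db3.example.net:27018", "health": 0,
         "stateStr": "(not reachable/healthy)"},
    ],
}
summary = summarize_replica_set(sample_status)
print(summary)
```

Here a failed secondary shows up in the unhealthy list while the primary is unchanged; a failed primary would instead leave primary as None until the remaining members elect a new one.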
In both cases, we should actively monitor and try...