Added more information about slave election in Redis Cluster alternative doc
parent 5bdb384ff0
commit 0ce7679849
@@ -278,4 +278,66 @@ to the same hash slot. In order to guarantee this, key tags can be used,
where, when a specific pattern is present in the key name, only that part is
hashed in order to obtain the hash index.

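As a rough illustration (not part of this document: the "{...}" tag pattern,
the hash function and the number of slots below are all assumptions), key tag
hashing could look like this:

    # Minimal sketch: hash only the {tag} part of the key, if present.
    import binascii

    NUM_SLOTS = 1024  # arbitrary assumption

    def hash_slot(key):
        # "user:{1000}:profile" and "user:{1000}:friends" hash the same
        # substring "1000", so they map to the same slot.
        start = key.find("{")
        if start != -1:
            end = key.find("}", start + 1)
            if end != -1 and end != start + 1:
                key = key[start + 1:end]
        return binascii.crc32(key.encode()) % NUM_SLOTS
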
Random remarks
==============

- It's still not clear how to perform an atomic election of a slave to master.
- In normal conditions (all the nodes working) this new design is just
  K clients talking to N nodes without intermediate layers, no routes:
  this means it is horizontally scalable with O(1) lookups.
- The cluster should optionally be able to work with manual failover
  for environments where it's desirable to do so. For instance it's possible
  to set up periodic checks on all the nodes and switch IPs when needed,
  or other advanced configurations that cannot be the default as they
  are too environment dependent.

A few ideas about client-side slave election
============================================

Detecting failures in a collaborative way
-----------------------------------------

In order to make node failure detection and slave election a distributed
effort, without any "control program" that is in some way a single point
of failure (the cluster will not stop when the control program stops, but
errors are not corrected while it is not running), it's possible to use a
few consensus-like algorithms.

For instance, all the nodes may keep a list of errors detected by clients.

If Client-1 detects some failure accessing Node-3, for instance a connection
refused error or a timeout, it logs what happened with LPUSH commands against
all the other nodes. These "error messages" will have a timestamp and the Node
id. Something like:

LPUSH __cluster__:errors 3:1272545939

So if the error is reported many times in a small amount of time, at some
point a client can have enough hints about the need of performing a
slave election.
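
As a rough sketch (the redis-py client, the time window and the threshold
below are assumptions, not part of this proposal), a client could report
failures and check whether enough recent reports accumulated like this:

    # Report a failure of `failed_node_id` to all the other reachable nodes,
    # then count how many reports for that node were seen recently.
    import time
    import redis

    ERRORS_KEY = "__cluster__:errors"
    WINDOW = 30        # seconds considered "a small amount of time" (assumption)
    THRESHOLD = 10     # reports needed before hinting an election (assumption)

    def report_failure(other_nodes, failed_node_id):
        entry = "%d:%d" % (failed_node_id, int(time.time()))
        for host, port in other_nodes:
            try:
                redis.Redis(host=host, port=port).lpush(ERRORS_KEY, entry)
            except redis.RedisError:
                pass  # an unreachable node simply misses this report

    def election_needed(node, failed_node_id):
        now = int(time.time())
        recent = 0
        for raw in node.lrange(ERRORS_KEY, 0, -1):
            node_id, timestamp = raw.decode().split(":")
            if int(node_id) == failed_node_id and now - int(timestamp) <= WINDOW:
                recent += 1
        return recent >= THRESHOLD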

Atomic slave election
---------------------

In order to avoid races when electing a slave to master (that is, to avoid
that some client can still contact the old master for that node within the
10 seconds timeframe), the client performing the election may write some
hint in the configuration, change the configuration SHA1 accordingly, and
wait for more than 10 seconds, in order to be sure all the clients will
refresh the configuration before a new access.

The config hint may be something like:

"we are switching to a new master, that is x.y.z.k:port, in a few seconds"

When a client updates the config and finds such a flag set, it starts to
continuously refresh the config until a change is noticed (this will take
at max 10-15 seconds).
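
A hypothetical sketch of that client-side behaviour (fetch_config() and the
config layout are assumptions, since this document does not specify how the
configuration is stored and retrieved):

    import time

    def wait_for_new_config(fetch_config, poll_interval=1.0, max_wait=15):
        # Remember the SHA1 of the config carrying the "switching" hint,
        # then keep refreshing until a config with a different SHA1 shows up.
        old_sha1 = fetch_config()["sha1"]
        deadline = time.time() + max_wait
        while time.time() < deadline:
            time.sleep(poll_interval)
            config = fetch_config()
            if config["sha1"] != old_sha1:
                return config  # the new master is now in the config
        return fetch_config()  # give up after the 10-15 seconds window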

The client performing the election will wait for that famous 10 seconds time
frame and finally will update the config in a definitive way, setting the new
slave as master. All the clients at this point are guaranteed to have the new
config, either because they refreshed or because on the next query their config
is already expired and they'll update the configuration.
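
A hypothetical sketch of the electing client's side (read_config(),
write_config() and the config layout are assumptions; only the "write a hint,
wait more than 10 seconds, then finalize" sequence comes from the text above):

    import hashlib
    import json
    import time

    def config_sha1(config):
        # SHA1 over the whole config except the sha1 field itself.
        payload = {k: v for k, v in config.items() if k != "sha1"}
        return hashlib.sha1(
            json.dumps(payload, sort_keys=True).encode()).hexdigest()

    def elect_slave(read_config, write_config, failed_node_id, slave_addr):
        # 1. Write the "we are switching" hint and change the SHA1, so every
        #    client refreshing the config notices the switch is in progress.
        config = read_config()
        config["switching"] = {"node": failed_node_id, "new_master": slave_addr}
        config["sha1"] = config_sha1(config)
        write_config(config)

        # 2. Wait for more than the 10 seconds timeframe: by then all clients
        #    either refreshed or consider their cached config expired.
        time.sleep(12)

        # 3. Update the config in a definitive way: the slave becomes the
        #    master for that node, the hint goes away, the SHA1 changes again.
        config["nodes"][failed_node_id] = slave_addr
        del config["switching"]
        config["sha1"] = config_sha1(config)
        write_config(config)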