632 Commits

Author SHA1 Message Date
antirez
a8c26a0397 Cluster: configdigest field no longer used. Removed. 2013-04-09 11:07:25 +02:00
antirez
9daa232d42 Cluster: properly send ping to nodes not pinged for too much time.
Commit de720e4 introduced the concept of sending a ping to
every node that has not received a ping for node_timeout/2 seconds.
However the code was located in a place that was not executed because of
a previous conditional causing the loop to re-iterate.

This caused false positives in nodes availability detection.

The current code is still not perfect, as a node may be detected to be in
PFAIL state even if it does not reply for just node_timeout/2 seconds,
which is not correct. There is a plan to improve this code ASAP.
2013-04-08 19:40:20 +02:00
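The control-flow issue described in 9daa232d42 can be sketched as below.
This is only an illustration under assumptions, not the actual Redis
code: the type example_ping_node_t and the functions
example_needs_extra_ping() and example_ping_stale_nodes() are
hypothetical names. The point is that the staleness check has to run
before any conditional that re-iterates the loop.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical, simplified node record (not the real cluster node struct). */
typedef struct {
    time_t ping_sent;     /* Time of the ping still awaiting a reply, 0 if none. */
    time_t pong_received; /* Time the last pong was received from this node. */
} example_ping_node_t;

/* True if no ping is in flight and the node has been silent for more
 * than node_timeout/2 seconds. */
bool example_needs_extra_ping(const example_ping_node_t *n, time_t now,
                              time_t node_timeout) {
    assert(node_timeout > 0);
    if (n->ping_sent != 0) return false;
    return (now - n->pong_received) > node_timeout / 2;
}

/* Walk all nodes and "send" an extra ping where needed. Returns the
 * number of pings sent. Sending is simulated by recording the time. */
size_t example_ping_stale_nodes(example_ping_node_t *nodes, size_t count,
                                time_t now, time_t node_timeout) {
    size_t sent = 0;
    for (size_t i = 0; i < count; i++) {
        example_ping_node_t *n = &nodes[i];

        /* The staleness check comes first, so it is always reached. */
        if (example_needs_extra_ping(n, now, node_timeout)) {
            n->ping_sent = now;
            sent++;
            continue;
        }

        /* If this early 'continue' preceded the check above, nodes with
         * no ping in flight would never get the extra ping. */
        if (n->ping_sent == 0) continue;

        /* Timeout handling for nodes with a pending ping would go here. */
    }
    return sent;
}
```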
antirez
0c0db1bc3d Cluster: node timeout is now configurable. 2013-04-04 12:29:10 +02:00
antirez
2e9c57f2aa Cluster: turn hardcoded node timeout multipliers into defines.
Most Redis Cluster time limits are expressed in terms of the configured
node timeout. Turn them into defines.
2013-04-04 12:04:11 +02:00
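A minimal sketch of the idea in 2e9c57f2aa, expressing time limits as
multiples of the configured node timeout. The macro names, their values
and the helper functions are hypothetical and chosen only for
illustration; they are not taken from the Redis source.

```c
#include <assert.h>

/* Hypothetical multipliers: each limit is node_timeout times a constant. */
#define EXAMPLE_FAIL_REPORT_VALIDITY_MULT 2
#define EXAMPLE_FAIL_UNDO_TIME_MULT 2

/* How long a failure report stays valid, given the node timeout. */
long example_fail_report_validity(long node_timeout) {
    assert(node_timeout > 0);
    return node_timeout * EXAMPLE_FAIL_REPORT_VALIDITY_MULT;
}

/* How long to wait before a FAIL state may be undone. */
long example_fail_undo_time(long node_timeout) {
    assert(node_timeout > 0);
    return node_timeout * EXAMPLE_FAIL_UNDO_TIME_MULT;
}
```

With the multipliers in one place, changing the configured node timeout
scales every derived limit consistently.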
antirez
3923eba8b4 Cluster: when slave changes master, remove it from the old master. 2013-03-25 15:01:25 +01:00
antirez
ed244901f0 Cluster: set node role on successful handshake. 2013-03-25 13:03:01 +01:00
antirez
970c1f0bd5 Cluster: comment no longer in sync with code removed. 2013-03-21 10:47:10 +01:00
antirez
6a7f26a1ae Cluster: clear the PROMOTED flag directly in clusterSetMaster().
This way we make sure every time a master is turned into a replica
the flag will be cleared.
2013-03-20 11:51:44 +01:00
antirez
c03921d62b Cluster: master node must clear its hash slots when turning into a slave.
When a master turns into a slave after a failover event, make sure to
clear the assigned slots before setting up the replication, as a slave
should never claim slots in an explicit way, but just take over the
master slots when replacing its master.
2013-03-20 11:32:35 +01:00
antirez
20998c9f35 Cluster: new flag PROMOTED introduced.
A slave node sets this flag for itself when, after receiving authorization
from the majority of nodes, it turns itself into a master.

At the same time now this flag is tested by nodes receiving a PING
message before reconfiguring after a failover event. This makes the
system more robust: even if currently there is no way to manually turn
a slave into a master it is possible that we'll have such a feature in
the future, or that simply because of misconfiguration a node joins the
cluster as master while others believe it's a slave. This alone is now
no longer enough to trigger reconfiguration as other nodes will check
for the PROMOTED flag.

The PROMOTED flag is cleared every time the node is turned back into a
replica of some other node.
2013-03-20 10:48:42 +01:00
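The PROMOTED check described in 20998c9f35 can be sketched roughly as
follows. All flag names, bit values and functions here are hypothetical
and exist only to illustrate the rule: a node we believe to be a slave
triggers reconfiguration only if it claims to be a master and also
carries the PROMOTED flag.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical flag bits, not the real Redis Cluster values. */
#define EXAMPLE_NODE_MASTER   (1 << 0)
#define EXAMPLE_NODE_SLAVE    (1 << 1)
#define EXAMPLE_NODE_PROMOTED (1 << 2)

/* A slave that obtained authorization turns itself into a master. */
int example_promote(int flags) {
    flags &= ~EXAMPLE_NODE_SLAVE;
    return flags | EXAMPLE_NODE_MASTER | EXAMPLE_NODE_PROMOTED;
}

/* Turning back into a replica always clears PROMOTED. */
int example_demote(int flags) {
    flags &= ~(EXAMPLE_NODE_MASTER | EXAMPLE_NODE_PROMOTED);
    return flags | EXAMPLE_NODE_SLAVE;
}

/* Decide whether a PING sender should trigger reconfiguration.
 * believed_slave: we currently think the sender is a slave. */
bool example_should_reconfigure(int sender_flags, bool believed_slave) {
    /* A node cannot be both master and slave at the same time. */
    assert((sender_flags & (EXAMPLE_NODE_MASTER | EXAMPLE_NODE_SLAVE)) !=
           (EXAMPLE_NODE_MASTER | EXAMPLE_NODE_SLAVE));
    return believed_slave &&
           (sender_flags & EXAMPLE_NODE_MASTER) &&
           (sender_flags & EXAMPLE_NODE_PROMOTED);
}
```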
antirez
27c1fe7c94 Cluster: add sender flags in cluster bus messages header.
Sender flags were not propagated for the sender, but only for nodes in
the gossip section. This is odd and in the next commits we'll need to
get updated flags for the sender node, so this commit adds a new field
in the cluster messages header.

The message header is the same size as we reused some free space that
was marked as 'unused' because of alignment concerns.
2013-03-20 10:32:00 +01:00
antirez
fcc1f71b1e Cluster: turn old master into a replica of node that failed over.
So when the failing master node is back in touch with the cluster,
instead of remaining unused it is converted into a replica of the
new master, ready to perform the failover if the new master node
fails at some point.

Note that as a side effect clients with stale configuration are now
not an issue as well, as the node converted into a slave will not
accept queries but will redirect clients accordingly.
2013-03-20 00:30:47 +01:00
antirez
44b4b45ae0 Cluster: node replication role change handle improved.
The code handling a master that turns into a slave or the contrary was
improved in order to avoid repeating the same operations. Also
the readability and conceptual simplicity were improved.
2013-03-19 16:01:30 +01:00
antirez
e7e092cede Cluster: new command CLUSTER FLUSHSLOTS.
It's just a simpler way to CLUSTER DELSLOTS with all the slots as
arguments, in order to obtain a node without assigned slots for
reconfiguration.
2013-03-19 09:58:05 +01:00
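As a rough sketch of what e7e092cede describes, flushing is the same as
deleting every slot one by one. The slot count used here (16384), the
type example_slot_node_t and all function names are assumptions made
for this illustration and do not reproduce the actual implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define EXAMPLE_SLOTS 16384 /* Assumed slot count, for illustration only. */

/* Hypothetical node record holding a slot bitmap. */
typedef struct {
    unsigned char slots[EXAMPLE_SLOTS / 8];
    int numslots;
} example_slot_node_t;

void example_init_node(example_slot_node_t *n) {
    memset(n, 0, sizeof(*n));
}

/* Assign a slot. Returns false if it was already assigned. */
bool example_add_slot(example_slot_node_t *n, int slot) {
    assert(slot >= 0 && slot < EXAMPLE_SLOTS);
    unsigned char mask = (unsigned char)(1u << (slot & 7));
    if (n->slots[slot >> 3] & mask) return false;
    n->slots[slot >> 3] |= mask;
    n->numslots++;
    return true;
}

/* Unassign a slot. Returns false if it was not assigned. */
bool example_del_slot(example_slot_node_t *n, int slot) {
    assert(slot >= 0 && slot < EXAMPLE_SLOTS);
    unsigned char mask = (unsigned char)(1u << (slot & 7));
    if (!(n->slots[slot >> 3] & mask)) return false;
    n->slots[slot >> 3] &= (unsigned char)~mask;
    n->numslots--;
    return true;
}

/* Equivalent to deleting all slots: leaves the node with none assigned. */
void example_flush_slots(example_slot_node_t *n) {
    for (int slot = 0; slot < EXAMPLE_SLOTS; slot++)
        example_del_slot(n, slot);
}
```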
antirez
27b04ed380 Cluster: when failing over claim master slots. 2013-03-15 16:53:41 +01:00
antirez
0e3ebe454c Cluster: log when a slave asks for failover authorization. 2013-03-15 16:44:08 +01:00
antirez
300c6c17aa Cluster: slaves start failover with a small delay.
Redis Cluster can cope with a minority of nodes not informed about the
failure of a master in time for some reason (netsplit or node not
functioning properly, blocked, ...). However, waiting a few seconds before
starting the failover makes most "normal" failovers simpler, as the
FAIL message will propagate before the slave election happens.
2013-03-15 16:39:49 +01:00
antirez
acddbd3000 Cluster: a bit more serious node role change handling. 2013-03-15 16:35:16 +01:00
antirez
e9f97a54c5 Cluster: remove node from master slaves when it turns into a master.
Also, a few nearby comments improved.
2013-03-15 16:16:19 +01:00
antirez
5a86ea09a8 Cluster: slave failover implemented. 2013-03-15 16:11:34 +01:00
antirez
fa4c42f230 Cluster: election -> promotion in two comments. 2013-03-15 15:44:49 +01:00
antirez
24625432b5 Cluster: added function to broadcast pings.
See the function top-comment for info on why this is sometimes useful.
2013-03-15 15:43:58 +01:00
antirez
b0497233f1 Cluster: don't broadcast messages to HANDSHAKE nodes.
Also don't check for NOADDR, as we check that node->link is not NULL,
which is enough.
2013-03-15 15:36:36 +01:00
antirez
e2662bdfcd Cluster: fix clusterHandleSlaveFailover() conditional: quorum is enough. 2013-03-15 13:20:34 +01:00
antirez
943e41223c Cluster: two lame bugs fixed in FAILOVER AUTH messages generation. 2013-03-14 21:27:12 +01:00
antirez
2160788260 Cluster: code to process messages moved in the right if-else chain. 2013-03-14 21:21:58 +01:00
antirez
63e3bc7cb3 Cluster: handle FAILOVER_AUTH_ACK messages.
That's trivial as we just need to increment the count of masters that
replied with an ACK.
2013-03-14 16:43:13 +01:00
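The ACK counting in 63e3bc7cb3, together with the quorum check mentioned
in nearby commits, might look like the sketch below. It assumes the
quorum is a strict majority of masters (masters/2 + 1); the struct and
function names are hypothetical and not taken from the Redis source.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical failover bookkeeping kept by a slave. */
typedef struct {
    int masters;   /* Number of masters in the cluster. */
    int auth_acks; /* FAILOVER_AUTH_ACK messages received so far. */
} example_failover_state_t;

/* Assumed quorum rule: strict majority of masters. */
int example_needed_quorum(int masters) {
    assert(masters > 0);
    return masters / 2 + 1;
}

/* Called once per ACK received from a master. */
void example_on_failover_auth_ack(example_failover_state_t *s) {
    s->auth_acks++;
}

/* True once enough masters have authorized the failover. */
bool example_has_quorum(const example_failover_state_t *s) {
    return s->auth_acks >= example_needed_quorum(s->masters);
}
```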
antirez
0ef025313a Cluster: request failover authorization, log if we have quorum.
However the failover is not actually performed yet.
2013-03-14 16:39:02 +01:00
antirez
6aec70fbc4 Cluster: clusterSendFailoverAuth() implementation. 2013-03-14 16:31:57 +01:00
antirez
8b90a5ebb2 Cluster: clusterSendFailoverAuthIfNeeded() work in progress. 2013-03-13 19:08:03 +01:00
antirez
3e23f6b2bf Cluster: handle FAILOVER_AUTH_REQUEST in clusterProcessPacket().
However currently the control is passed to a function doing nothing at
all.
2013-03-13 18:38:08 +01:00
antirez
9975fac7bb Cluster: sanity check FAILOVER_AUTH_REQUEST messages for proper length. 2013-03-13 17:31:26 +01:00
antirez
95f0799010 Cluster: use 'else if' for mutually exclusive conditionals. 2013-03-13 17:27:06 +01:00
antirez
444fd457d2 Cluster: FAILOVER_AUTH_REQUEST message type introduced.
This message is sent by a slave that is ready to failover its master to
other nodes to get the authorization from the majority of masters.
2013-03-13 17:21:20 +01:00
antirez
80158107bd Cluster: clusterHandleSlaveFailover() stub. 2013-03-13 13:10:49 +01:00
antirez
d1d1f6cad4 Cluster: call clusterHandleSlaveFailover() when our master is down. 2013-03-13 12:44:02 +01:00
antirez
bb4fef5a5a Cluster: update cluster state on PFAIL flag set/cleared on nodes. 2013-03-07 15:40:53 +01:00
antirez
75e7bb8fd5 Cluster: mark cluster state as fail if majority of masters is unreachable. 2013-03-07 15:36:59 +01:00
antirez
ca207a1fae Cluster: log global cluster state change. 2013-03-07 15:22:32 +01:00
antirez
311f9d5164 Cluster: clusterUpdateState() function simplified.
Also the NEEDHELP Cluster state was removed as it will no longer be
used by Redis Cluster.
2013-03-06 18:25:40 +01:00
antirez
72b16ffa96 Cluster: sdssplitargs_free() -> sdsfreesplitres(). 2013-03-06 12:38:06 +01:00
antirez
a0f2b39791 Cluster: connect to our master ASAP after startup if we are a slave node. 2013-03-05 16:12:08 +01:00
antirez
856a3160a9 Cluster: more robust FAIL flag cleanup.
If we have a master in FAIL state that's reachable again, and apparently
no one is going to serve its slots, clear the FAIL flag and let the
cluster continue with its operations again.
2013-03-05 15:05:32 +01:00
antirez
24aa7a566e Cluster: new node field fail_time.
This is the unix time at which we set the FAIL flag for the node.
It is only valid if FAIL is set.

The idea is to use it in order to make the cluster more robust, for
instance in order to revert a FAIL state if it is long-standing but
still slots are assigned to this node, that is, no one is going to fix
these slots apparently.
2013-03-05 13:15:05 +01:00
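One way the fail_time field from 24aa7a566e could be used to revert a
long-standing FAIL state is sketched here. The flag value, the struct,
the undo_after threshold and the reachable parameter are all
hypothetical; the actual conditions used by Redis Cluster may differ.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <time.h>

#define EXAMPLE_FLAG_FAIL (1 << 0) /* Hypothetical flag bit. */

/* Hypothetical, simplified node record. */
typedef struct {
    int flags;
    time_t fail_time; /* Unix time FAIL was set; valid only if FAIL is set. */
    int numslots;     /* Slots still assigned to this node. */
} example_fail_node_t;

/* Set the FAIL flag and remember when it happened. */
void example_set_fail(example_fail_node_t *n, time_t now) {
    assert(n != NULL);
    n->flags |= EXAMPLE_FLAG_FAIL;
    n->fail_time = now;
}

/* Clear FAIL if the node is reachable again, still owns slots (so no one
 * took them over), and the flag has been set for longer than undo_after
 * seconds. Returns true if the flag was cleared. */
bool example_maybe_clear_fail(example_fail_node_t *n, time_t now,
                              time_t undo_after, bool reachable) {
    assert(n != NULL);
    if (!(n->flags & EXAMPLE_FLAG_FAIL)) return false;
    if (reachable && n->numslots > 0 && (now - n->fail_time) > undo_after) {
        n->flags &= ~EXAMPLE_FLAG_FAIL;
        return true;
    }
    return false;
}
```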
antirez
a06092c576 Cluster: A comment updated in clusterCron(). 2013-03-05 12:17:30 +01:00
antirez
de720e4f0a Cluster: send a ping to every node we never contacted in timeout/2 seconds.
Usually we try to send just 1 ping every second, however when we detect
we are going to have unreliable failure detection because we can't ping
some node in time, send an additional ping.

This should only happen with very large clusters or when the node
timeout is set to a very low value.
2013-03-05 12:16:02 +01:00
antirez
401893bc38 Cluster: set node->slaveof correctly when a node state is updated. 2013-03-05 11:50:11 +01:00
antirez
8eb9c13dcf Cluster: don't perform startup slots sanity check for slaves.
If we are a cluster node the DB content will not match our configured
slots. Don't do the check at all.
2013-03-04 19:47:00 +01:00
antirez
63cac938b3 Cluster: fix maximum line length when loading config.
There are pathological cases where the line can be even longer: a single
node may contain all the slots in importing/migrating state.
2013-03-04 19:45:36 +01:00
antirez
9e34d02450 Cluster: actually setup replication in CLUSTER REPLICATE. 2013-03-04 15:27:58 +01:00