futriix

Author	SHA1	Message	Date
antirez	a8c26a0397	Cluster: configdigest field no longer used. Removed.	2013-04-09 11:07:25 +02:00
antirez	9daa232d42	Cluster: properly send ping to nodes not pinged for too much time. Commit de720e4 introduced the concept of sending a ping to every node not receiving a ping since node_timeout/2 seconds. However the code was located in a place that was not executed because of a previous conditional causing the loop to re-iterate. This caused false positives in nodes availability detection. The current code is still not perfect as a node may be detected to be in PFAIL state even if it does not reply for just node_timeout/2 seconds, which is not correct. There is a plan to improve this code ASAP.	2013-04-08 19:40:20 +02:00
antirez	0c0db1bc3d	Cluster: node timeout is now configurable.	2013-04-04 12:29:10 +02:00
antirez	2e9c57f2aa	Cluster: turn hardcoded node timeout multipliers into defines. Most Redis Cluster time limits are expressed in terms of the configured node timeout. Turn them into defines.	2013-04-04 12:04:11 +02:00
antirez	3923eba8b4	Cluster: when slave changes master, remove it from the old master.	2013-03-25 15:01:25 +01:00
antirez	ed244901f0	Cluster: set node role on successful handshake.	2013-03-25 13:03:01 +01:00
antirez	970c1f0bd5	Cluster: comment no longer in sync with code removed.	2013-03-21 10:47:10 +01:00
antirez	6a7f26a1ae	Cluster: clear the PROMOTED slave directly into clusterSetMaster(). This way we make sure every time a master is turned into a replica the flag will be cleared.	2013-03-20 11:51:44 +01:00
antirez	c03921d62b	Cluster: master node must clear its hash slots when turning into a slave. When a master turns into a slave after a failover event, make sure to clear the assigned slots before setting up the replication, as a slave should never claim slots in an explicit way, but just take over the master slots when replacing its master.	2013-03-20 11:32:35 +01:00
antirez	20998c9f35	Cluster: new flag PROMOTED introduced. A slave node sets this flag for itself when, after receiving authorization from the majority of nodes, it turns itself into a master. At the same time this flag is now tested by nodes receiving a PING message before reconfiguring after a failover event. This makes the system more robust: even if currently there is no way to manually turn a slave into a master, it is possible that we'll have such a feature in the future, or that simply because of misconfiguration a node joins the cluster as master while others believe it's a slave. This alone is no longer enough to trigger reconfiguration as other nodes will check for the PROMOTED flag. The PROMOTED flag is cleared every time the node is turned back into a replica of some other node.	2013-03-20 10:48:42 +01:00
antirez	27c1fe7c94	Cluster: add sender flags in cluster bus messages header. Sender flags were not propagated for the sender, but only for nodes in the gossip section. This is odd and in the next commits we'll need to get updated flags for the sender node, so this commit adds a new field in the cluster messages header. The message header is the same size as we reused some free space that was marked as 'unused' because of alignment concerns.	2013-03-20 10:32:00 +01:00
antirez	fcc1f71b1e	Cluster: turn old master into a replica of node that failed over. So when the failing master node is back in touch with the cluster, instead of remaining unused it is converted into a replica of the new master, ready to perform the fail over if the new master node will fail at some point. Note that as a side effect clients with stale configuration are now not an issue as well, as the node converted into a slave will not accept queries but will redirect clients accordingly.	2013-03-20 00:30:47 +01:00
antirez	44b4b45ae0	Cluster: node replication role change handle improved. The code handling a master that turns into a slave or the contrary was improved in order to avoid repeating the same operations. Also the readability and conceptual simplicity was improved.	2013-03-19 16:01:30 +01:00
antirez	e7e092cede	Cluster: new command CLUSTER FLUSHSLOTS. It's just a simpler way to CLUSTER DELSLOTS with all the slots as arguments, in order to obtain a node without assigned slots for reconfiguration.	2013-03-19 09:58:05 +01:00
antirez	27b04ed380	Cluster: when failing over claim master slots.	2013-03-15 16:53:41 +01:00
antirez	0e3ebe454c	Cluster: log when a slave asks for failover authorization.	2013-03-15 16:44:08 +01:00
antirez	300c6c17aa	Cluster: slaves start failover with a small delay. Redis Cluster can cope with a minority of nodes not informed about the failure of a master in time for some reason (netsplit or node not functioning properly, blocked, ...), however waiting a few seconds before starting the failover will make most "normal" failovers simpler as the FAIL message will propagate before the slave election happens.	2013-03-15 16:39:49 +01:00
antirez	acddbd3000	Cluster: a bit more serious node role change handling.	2013-03-15 16:35:16 +01:00
antirez	e9f97a54c5	Cluster: remove node from master slaves when it turns into a master. Also, a few nearby comments improved.	2013-03-15 16:16:19 +01:00
antirez	5a86ea09a8	Cluster: slave failover implemented.	2013-03-15 16:11:34 +01:00
antirez	fa4c42f230	Cluster: election -> promotion in two comments.	2013-03-15 15:44:49 +01:00
antirez	24625432b5	Cluster: added function to broadcast pings. See the function top-comment for info why this is useful sometimes.	2013-03-15 15:43:58 +01:00
antirez	b0497233f1	Cluster: don't broadcast messages to HANDSHAKE nodes. Also don't check for NOADDR as we check that node->link is not NULL that's enough.	2013-03-15 15:36:36 +01:00
antirez	e2662bdfcd	Cluster: fix clusterHandleSlaveFailover() conditional: quorum is enough.	2013-03-15 13:20:34 +01:00
antirez	943e41223c	Cluster: two lame bugs fixed in FAILOVER AUTH messages generation.	2013-03-14 21:27:12 +01:00
antirez	2160788260	Cluster: code to process messages moved in the right if-else chain.	2013-03-14 21:21:58 +01:00
antirez	63e3bc7cb3	Cluster: handle FAILOVER_AUTH_ACK messages. That's trivial as we just need to increment the count of masters that replied with an ACK.	2013-03-14 16:43:13 +01:00
antirez	0ef025313a	Cluster: request failover authorization, log if we have quorum. However the failover is not yet really performed.	2013-03-14 16:39:02 +01:00
antirez	6aec70fbc4	Cluster: clusterSendFailoverAuth() implementation.	2013-03-14 16:31:57 +01:00
antirez	8b90a5ebb2	Cluster: clusterSendFailoverAuthIfNeeded() work in progress.	2013-03-13 19:08:03 +01:00
antirez	3e23f6b2bf	Cluster: handle FAILOVER_AUTH_REQUEST in clusterProcessPacket(). However currently the control is passed to a function doing nothing at all.	2013-03-13 18:38:08 +01:00
antirez	9975fac7bb	Cluster: sanity check FAILOVER_AUTH_REQUEST messages for proper length.	2013-03-13 17:31:26 +01:00
antirez	95f0799010	Cluster: use 'else if' for mutually exclusive conditionals.	2013-03-13 17:27:06 +01:00
antirez	444fd457d2	Cluster: FAILOVER_AUTH_REQUEST message type introduced. This message is sent by a slave that is ready to failover its master to other nodes to get the authorization from the majority of masters.	2013-03-13 17:21:20 +01:00
antirez	80158107bd	Cluster: clusterHandleSlaveFailover() stub.	2013-03-13 13:10:49 +01:00
antirez	d1d1f6cad4	Cluster: call clusterHandleSlaveFailover() when our master is down.	2013-03-13 12:44:02 +01:00
antirez	bb4fef5a5a	Cluster: update cluster state on PFAIL flag set/cleared on nodes.	2013-03-07 15:40:53 +01:00
antirez	75e7bb8fd5	Cluster: mark cluster state as fail if majority of masters is unreachable.	2013-03-07 15:36:59 +01:00
antirez	ca207a1fae	Cluster: log global cluster state change.	2013-03-07 15:22:32 +01:00
antirez	311f9d5164	Cluster: clusterUpdateState() function simplified. Also the NEEDHELP Cluster state was removed as it will no longer be used by Redis Cluster.	2013-03-06 18:25:40 +01:00
antirez	72b16ffa96	Cluster: sdssplitargs_free() -> sdsfreesplitres().	2013-03-06 12:38:06 +01:00
antirez	a0f2b39791	Cluster: connect to our master ASAP after startup if we are a slave node.	2013-03-05 16:12:08 +01:00
antirez	856a3160a9	Cluster: more robust FAIL flag cleanup. If we have a master in FAIL state that's reachable again, and apparently no one is going to serve its slots, clear the FAIL flag and let the cluster continue with its operations again.	2013-03-05 15:05:32 +01:00
antirez	24aa7a566e	Cluster: new node field fail_time. This is the unix time at which we set the FAIL flag for the node. It is only valid if FAIL is set. The idea is to use it in order to make the cluster more robust, for instance in order to revert a FAIL state if it is long-standing but still slots are assigned to this node, that is, no one is going to fix these slots apparently.	2013-03-05 13:15:05 +01:00
antirez	a06092c576	Cluster: A comment updated in clusterCron().	2013-03-05 12:17:30 +01:00
antirez	de720e4f0a	Cluster: send a ping to every node we never contacted in timeout/2 seconds. Usually we try to send just 1 ping every second, however when we detect we are going to have unreliable failure detection because we can't ping some node in time, send an additional ping. This should only happen with very large clusters or when the node timeout is set to a very low value.	2013-03-05 12:16:02 +01:00
antirez	401893bc38	Cluster: set node->slaveof correctly when a node state is updated.	2013-03-05 11:50:11 +01:00
antirez	8eb9c13dcf	Cluster: don't perform startup slots sanity check for slaves. If we are a cluster node the DB content will not match our configured slots. Don't do the check at all.	2013-03-04 19:47:00 +01:00
antirez	63cac938b3	Cluster: fix maximum line length when loading config. There are pathological cases where the line can be even longer: a single node may contain all the slots in importing/migrating state.	2013-03-04 19:45:36 +01:00
antirez	9e34d02450	Cluster: actually setup replication in CLUSTER REPLICATE.	2013-03-04 15:27:58 +01:00

632 Commits