27398 Commits

Author SHA1 Message Date
antirez
fe8352540f AOF: don't abort on write errors unless fsync is 'always'.
A system similar to the RDB write error handling is used, in which when
we can't write to the AOF file, writes are no longer accepted until we
are able to write again.

For fsync == always we still abort on errors since there is currently no
easy way to avoid replying with success to the user otherwise, and this
would violate the contract with the user of only acknowledging data
already secured on disk.
2014-02-12 16:11:36 +01:00
antirez
283a633f98 Cluster: clusterDelNode(): remove node from master's slaves. 2014-02-11 10:34:25 +01:00
antirez
db6d628c3e Cluster: clusterDelNode(): remove node from master's slaves. 2014-02-11 10:34:25 +01:00
antirez
cfc5f8f67c Cluster: UPDATE messages are the norm and verbose.
Logging them at WARNING level was of little utility and of sure disturb.
2014-02-11 10:18:24 +01:00
antirez
5e0e03be41 Cluster: UPDATE messages are the norm and verbose.
Logging them at WARNING level was of little utility and of sure disturb.
2014-02-11 10:18:24 +01:00
antirez
234fafca84 Cluster: redis-trib fix: handling of another trivial case. 2014-02-11 10:13:18 +01:00
antirez
8251d2d150 Cluster: redis-trib fix: handling of another trivial case. 2014-02-11 10:13:18 +01:00
antirez
6d1d5542fc Cluster: configEpoch assignment in SETNODE improved.
Avoid to trash a configEpoch for every slot migrated if this node has
already the max configEpoch across the cluster.

Still work to do in this area but this avoids both ending with a very
high configEpoch without any reason and to flood the system with fsyncs.
2014-02-11 10:09:17 +01:00
antirez
4a64286c36 Cluster: configEpoch assignment in SETNODE improved.
Avoid to trash a configEpoch for every slot migrated if this node has
already the max configEpoch across the cluster.

Still work to do in this area but this avoids both ending with a very
high configEpoch without any reason and to flood the system with fsyncs.
2014-02-11 10:09:17 +01:00
antirez
9b8e0c972a Cluster: clusterSetStartupEpoch() made more generally useful.
The actual goal of the function was to get the max configEpoch found in
the cluster, so make it general by removing the assignment of the max
epoch to currentEpoch that is useful only at startup.
2014-02-11 10:00:14 +01:00
antirez
72f7abf6a2 Cluster: clusterSetStartupEpoch() made more generally useful.
The actual goal of the function was to get the max configEpoch found in
the cluster, so make it general by removing the assignment of the max
epoch to currentEpoch that is useful only at startup.
2014-02-11 10:00:14 +01:00
antirez
e200c6dd00 Cluster: always increment the configEpoch in SETNODE after import.
Removed a stale conditional preventing the configEpoch from incrementing
after the import in certain conditions. Since the master got a new slot
it should always claim a new configuration.
2014-02-11 09:50:37 +01:00
antirez
44f7afe28a Cluster: always increment the configEpoch in SETNODE after import.
Removed a stale conditional preventing the configEpoch from incrementing
after the import in certain conditions. Since the master got a new slot
it should always claim a new configuration.
2014-02-11 09:50:37 +01:00
antirez
b60d185126 Cluster: on resharding upgrade version of receiving node.
The node receiving the hash slot needs to have a version that wins over
the other versions in order to force the ownership of the slot.

However the current code is far from perfect since a failover can happen
during the manual resharding. The fix is a work in progress but the
bottom line is that the new version must either be voted as usually,
set by redis-trib manually after it makes sure can't be used by other
nodes, or reserved configEpochs could be used for manual operations (for
example odd versions could be never used by slaves and are always used
by CLUSTER SETSLOT NODE).
2014-02-11 00:36:05 +01:00
antirez
a1349728ea Cluster: on resharding upgrade version of receiving node.
The node receiving the hash slot needs to have a version that wins over
the other versions in order to force the ownership of the slot.

However the current code is far from perfect since a failover can happen
during the manual resharding. The fix is a work in progress but the
bottom line is that the new version must either be voted as usually,
set by redis-trib manually after it makes sure can't be used by other
nodes, or reserved configEpochs could be used for manual operations (for
example odd versions could be never used by slaves and are always used
by CLUSTER SETSLOT NODE).
2014-02-11 00:36:05 +01:00
antirez
a1d0249297 Cluster: fsync at every SETSLOT command puts too pressure on disks.
During slots migration redis-trib can send a number of SETSLOT commands.
Fsyncing every time is a bit too much in production as verified
empirically.

To make sure configs are fsynced on all nodes after a resharding
redis-trib may send something like CLUSTER CONFSYNC.

In this case fsyncs were not providing too much value since anyway
processes can crash in the middle of the resharding of an hash slot, and
redis-trib should be able to recover from this condition anyway.
2014-02-10 23:54:08 +01:00
antirez
6dc26795aa Cluster: fsync at every SETSLOT command puts too pressure on disks.
During slots migration redis-trib can send a number of SETSLOT commands.
Fsyncing every time is a bit too much in production as verified
empirically.

To make sure configs are fsynced on all nodes after a resharding
redis-trib may send something like CLUSTER CONFSYNC.

In this case fsyncs were not providing too much value since anyway
processes can crash in the middle of the resharding of an hash slot, and
redis-trib should be able to recover from this condition anyway.
2014-02-10 23:54:08 +01:00
antirez
435af98eb8 Cluster: conditions to clear "migrating" on slot for SETSLOT ... NODE changed.
If the slot is manually assigned to another node, clear the migrating
status regardless of the fact it was previously assigned to us or not,
as long as we no longer have keys for this slot.

This avoid a race during slots migration that may leave the slot in
migrating status in the source node, since it received an update message
from the destination node that is already claiming the slot.

This way we are sure that redis-trib at the end of the slot migration is
always able to close the slot correctly.
2014-02-10 23:51:47 +01:00
antirez
218358bbbd Cluster: conditions to clear "migrating" on slot for SETSLOT ... NODE changed.
If the slot is manually assigned to another node, clear the migrating
status regardless of the fact it was previously assigned to us or not,
as long as we no longer have keys for this slot.

This avoid a race during slots migration that may leave the slot in
migrating status in the source node, since it received an update message
from the destination node that is already claiming the slot.

This way we are sure that redis-trib at the end of the slot migration is
always able to close the slot correctly.
2014-02-10 23:51:47 +01:00
antirez
5a79453abf Cluster: remove debugging xputs from redis-trib. 2014-02-10 19:14:05 +01:00
antirez
3107e7ca60 Cluster: remove debugging xputs from redis-trib. 2014-02-10 19:14:05 +01:00
antirez
e4732138b0 Cluster: redis-trib fix: cover new case of open slot.
The case is the trivial one a single node claiming the slot as
migrating, without nodes claiming it as importing.
2014-02-10 19:10:23 +01:00
antirez
1ae50a9b1d Cluster: redis-trib fix: cover new case of open slot.
The case is the trivial one a single node claiming the slot as
migrating, without nodes claiming it as importing.
2014-02-10 19:10:23 +01:00
antirez
2411d8fd94 redis-trib: log event after we have reference to 'master'. 2014-02-10 18:48:40 +01:00
antirez
59e03a8f35 redis-trib: log event after we have reference to 'master'. 2014-02-10 18:48:40 +01:00
antirez
e4a6144fc5 Cluster: don't update slave's master if we don't know it.
There is no way we can update the slave's node->slaveof pointer if we
don't know the master (no node with such an ID in our tables).
2014-02-10 18:33:34 +01:00
antirez
bf670e0745 Cluster: don't update slave's master if we don't know it.
There is no way we can update the slave's node->slaveof pointer if we
don't know the master (no node with such an ID in our tables).
2014-02-10 18:33:34 +01:00
antirez
f31a53678a Cluster: ignore slot config changes if we are importing it. 2014-02-10 18:04:43 +01:00
antirez
a3755ae9ee Cluster: ignore slot config changes if we are importing it. 2014-02-10 18:04:43 +01:00
antirez
5c022633a2 Cluster: update configEpoch after manually messing with slots. 2014-02-10 18:01:58 +01:00
antirez
6fc53e16ad Cluster: update configEpoch after manually messing with slots. 2014-02-10 18:01:58 +01:00
antirez
a136867cc4 Cluster: redis-trib, more info about open slots error. 2014-02-10 17:44:16 +01:00
antirez
be0bb19fd3 Cluster: redis-trib, more info about open slots error. 2014-02-10 17:44:16 +01:00
antirez
36d8dcb5b7 Cluster: fixed inverted arguments in logging function call. 2014-02-10 17:21:10 +01:00
antirez
1a73c992a3 Cluster: fixed inverted arguments in logging function call. 2014-02-10 17:21:10 +01:00
antirez
8c577113ee Cluster: clear the FAIL status for masters without slots.
Masters without slots don't participate to the cluster but just do
redirections, no need to take them in FAIL state if they are back
reachable.
2014-02-10 17:18:27 +01:00
antirez
32563b4a5f Cluster: clear the FAIL status for masters without slots.
Masters without slots don't participate to the cluster but just do
redirections, no need to take them in FAIL state if they are back
reachable.
2014-02-10 17:18:27 +01:00
Matt Stancliff
33569106bc Auto-enter slaveMode when SYNC from redis-cli
If someone asks for SYNC or PSYNC from redis-cli,
automatically enter slaveMode (as if they ran
redis-cli --slave) and continue printing the replication
stream until either they Ctrl-C or the master gets disconnected.
2014-02-10 11:10:31 -05:00
Matt Stancliff
21648473aa Auto-enter slaveMode when SYNC from redis-cli
If someone asks for SYNC or PSYNC from redis-cli,
automatically enter slaveMode (as if they ran
redis-cli --slave) and continue printing the replication
stream until either they Ctrl-C or the master gets disconnected.
2014-02-10 11:10:31 -05:00
antirez
da9ae01802 Cluster: replica migration should only work for masters serving slots. 2014-02-10 17:08:37 +01:00
antirez
5b2082ead3 Cluster: replica migration should only work for masters serving slots. 2014-02-10 17:08:37 +01:00
antirez
aa408d80eb Cluster: redis-trib del-node variable typo fixed. 2014-02-10 16:59:09 +01:00
antirez
f106a79309 Cluster: redis-trib del-node variable typo fixed. 2014-02-10 16:59:09 +01:00
antirez
467ed194be Cluster: clusterReadHandler() fixed to work with new message header. 2014-02-10 16:27:37 +01:00
antirez
f885fa8bac Cluster: clusterReadHandler() fixed to work with new message header. 2014-02-10 16:27:37 +01:00
antirez
943f7c50ed Cluster: don't propagate PUBLISH two times.
PUBLISH both published messages via Cluster bus and replication when
cluster was enabled, resulting in duplicated message in the slave.
2014-02-10 16:00:27 +01:00
antirez
344a065d51 Cluster: don't propagate PUBLISH two times.
PUBLISH both published messages via Cluster bus and replication when
cluster was enabled, resulting in duplicated message in the slave.
2014-02-10 16:00:27 +01:00
antirez
e68a4656d3 Cluster: signature changed to "RCmb" (Redis Cluster message bus).
Sounds better after all.
2014-02-10 15:55:21 +01:00
antirez
7bf7b7350c Cluster: signature changed to "RCmb" (Redis Cluster message bus).
Sounds better after all.
2014-02-10 15:55:21 +01:00
antirez
99643c4d2e Cluster: discard bus messages with version != 0. 2014-02-10 15:54:22 +01:00