20983 Commits

Author SHA1 Message Date
FuGangqiang
239494db64 fix doc example 2015-04-20 21:46:48 +08:00
FuGangqiang
4e5b058ed0 fix typo 2015-04-19 23:42:27 +08:00
FuGangqiang
42b36c5ce9 fix typo 2015-04-19 23:42:27 +08:00
Glenn Nethercutt
d8390522cb uphold the smove contract to return 0 when the element is not a member of the source set, even if source=dest 2015-04-17 09:27:54 -04:00
Glenn Nethercutt
626b4f6907 uphold the smove contract to return 0 when the element is not a member of the source set, even if source=dest 2015-04-17 09:27:54 -04:00
antirez
2685af5aed Net: improve prepareClientToWrite() error handling and comments.
When we fail to set up the write handler it does not make sense to keep
the client around, since it is missing writes: whether it is a client or
a slave, the connection should be terminated ASAP.

Moreover what the function does exactly with its return value, and in
which case the write handler is installed on the socket, was not clear,
so the function's comments are improved to make the goals of the function
more obvious.

Also related to #2485.
2015-04-01 10:07:45 +02:00
antirez
6c60526db9 Net: improve prepareClientToWrite() error handling and comments.
When we fail to set up the write handler it does not make sense to keep
the client around, since it is missing writes: whether it is a client or
a slave, the connection should be terminated ASAP.

Moreover what the function does exactly with its return value, and in
which case the write handler is installed on the socket, was not clear,
so the function's comments are improved to make the goals of the function
more obvious.

Also related to #2485.
2015-04-01 10:07:45 +02:00
Oran Agra
cdd008856e fixes to diskless replication.
The master was closing the connection if the RDB transfer took a long time,
and it also sent PINGs to the slave before it got the initial ACK, in which case the slave wouldn't be able to find the EOF marker.
2015-03-31 23:42:08 +03:00
Oran Agra
159875b5a3 fixes to diskless replication.
The master was closing the connection if the RDB transfer took a long time,
and it also sent PINGs to the slave before it got the initial ACK, in which case the slave wouldn't be able to find the EOF marker.
2015-03-31 23:42:08 +03:00
antirez
f9811b205d Fix setTypeNext call assuming NULL can be passed.
Segfault introduced during a refactoring / warning suppression a few
commits ago. This particular call assumed that it is safe to pass NULL
to the object pointer argument when we are sure the set has a given
encoding. This can't be assumed and is now guaranteed to segfault
because of the new API of setTypeNext().
2015-03-31 15:26:35 +02:00
antirez
66f9393ee4 Fix setTypeNext call assuming NULL can be passed.
Segfault introduced during a refactoring / warning suppression a few
commits ago. This particular call assumed that it is safe to pass NULL
to the object pointer argument when we are sure the set has a given
encoding. This can't be assumed and is now guaranteed to segfault
because of the new API of setTypeNext().
2015-03-31 15:26:35 +02:00
antirez
0a18237a52 Set: setType*() API more defensive initializing both values.
This change fixes several warnings compiling at -O3 level with GCC
4.8.2, and at the same time, in case of misuse of the API, we have the
pointer initialized to NULL or the integer initialized to the value
-123456789 which is easy to spot with the naked eye.
2015-03-30 12:24:57 +02:00
antirez
7f330b16f9 Set: setType*() API more defensive initializing both values.
This change fixes several warnings compiling at -O3 level with GCC
4.8.2, and at the same time, in case of misuse of the API, we have the
pointer initialized to NULL or the integer initialized to the value
-123456789 which is easy to spot with the naked eye.
2015-03-30 12:24:57 +02:00
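As a rough illustration of the defensive initialization described in the message above, the following C sketch always writes both output parameters before filling the relevant one. The function toy_next() and the TOY_* constants are hypothetical names invented for this example; they are not the real setType*() signatures, which are not shown in this log. Only the NULL pointer and the -123456789 sentinel come from the commit message.

```c
#include <stddef.h>

#define TOY_ENC_STR 0                 /* hypothetical encoding tags */
#define TOY_ENC_INT 1
#define TOY_SENTINEL (-123456789LL)   /* value named in the commit message */

/* Hypothetical iterator step: returns the encoding of the element and
 * fills the matching output parameter. Both pointers must be non-NULL. */
int toy_next(int enc, const char *s, long long v,
             const char **strout, long long *intout) {
    /* Initialize both outputs up front, so a caller that reads the wrong
     * one sees NULL or the sentinel instead of an uninitialized value. */
    *strout = NULL;
    *intout = TOY_SENTINEL;
    if (enc == TOY_ENC_STR) *strout = s;
    else *intout = v;
    return enc;
}
```

Because this sketch writes through both pointers unconditionally, a caller passing NULL for either one would crash, which is the kind of misuse the later "Fix setTypeNext call assuming NULL can be passed" entry describes.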
antirez
7c82681825 Check bio.c job type at thread startup.
Another one just to avoid a warning. Slightly more defensive code
anyway.
2015-03-30 12:17:46 +02:00
antirez
34460dd6ee Check bio.c job type at thread startup.
Another one just to avoid a warning. Slightly more defensive code
anyway.
2015-03-30 12:17:46 +02:00
antirez
6f051aeca9 Ensure array index is in range in addReplyLongLongWithPrefix().
Change done in order to remove a warning and improve code robustness. No
actual bug here.
2015-03-30 11:54:49 +02:00
antirez
221d2932b5 Ensure array index is in range in addReplyLongLongWithPrefix().
Change done in order to remove a warning and improve code robustness. No
actual bug here.
2015-03-30 11:54:49 +02:00
antirez
df2a1e0901 dict.c: convert types to unsigned long where appropriate.
No semantic changes, since to make dict.c truly able to scale over the
32 bit table size limit, the hash function and other internals
related to hash function output should be 64 bit ready.
2015-03-27 10:14:52 +01:00
antirez
068d3c9737 dict.c: convert types to unsigned long where appropriate.
No semantic changes, since to make dict.c truly able to scale over the
32 bit table size limit, the hash function and other internals
related to hash function output should be 64 bit ready.
2015-03-27 10:14:52 +01:00
antirez
ce5d516e75 dict.c: add casting to avoid compilation warning.
rehashidx is always positive in the two code paths, since the only
negative value it could have is -1 when there is no rehashing in
progress, and the condition is explicitly checked.
2015-03-27 10:12:25 +01:00
antirez
9cd8333ed2 dict.c: add casting to avoid compilation warning.
rehashidx is always positive in the two code paths, since the only
negative value it could have is -1 when there is no rehashing in
progress, and the condition is explicitly checked.
2015-03-27 10:12:25 +01:00
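The reasoning in the message above (the index is only ever negative when it is -1, and that case is checked first) can be sketched in C as follows. The toy_dict struct and toy_rehash_index_valid() are hypothetical stand-ins written for this example, not code from dict.c.

```c
/* Hypothetical stand-in for the relevant dict fields. */
typedef struct {
    long rehashidx;       /* -1 when no rehashing is in progress */
    unsigned long size;   /* table size, unsigned */
} toy_dict;

/* Returns 1 if the rehash index points inside the table, 0 otherwise. */
int toy_rehash_index_valid(const toy_dict *d) {
    if (d->rehashidx == -1) return 0;  /* explicit "not rehashing" check */
    /* Past this point rehashidx is non-negative, so the cast does not
     * change its value and avoids a signed/unsigned comparison warning. */
    return (unsigned long)d->rehashidx < d->size;
}
```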
antirez
a67c20a067 Replication: disconnect blocked clients when switching to slave role.
Bug as old as Redis and blocking operations. It's hard to trigger since
it only happens on instance role switch, but the results are quite bad
since an inconsistency between master and slave is created.

How to trigger the bug is a good description of the bug itself.

1. Client does "BLPOP mylist 0" in master.
2. Master is turned into slave, that replicates from New-Master.
3. Client does "LPUSH mylist foo" in New-Master.
4. New-Master propagates write to slave.
5. Slave receives the LPUSH, the blocked client gets served.

Now Master "mylist" key has "foo", Slave "mylist" key is empty.

Highlights:

* At step "2" above, the client remains attached, basically escaping any
  check performed during command dispatch: read only slave, in that case.
* At step "5" the slave (that was the master), serves the blocked client
  consuming a list element, which is not consumed on the master side.

This scenario is technically likely to happen during failovers, however
since Redis Sentinel already disconnects clients using the CLIENT
command when changing the role of the instance, the bug is avoided in
Sentinel deployments.

Closes #2473.
2015-03-24 16:00:09 +01:00
antirez
c3ad70901f Replication: disconnect blocked clients when switching to slave role.
Bug as old as Redis and blocking operations. It's hard to trigger since
it only happens on instance role switch, but the results are quite bad
since an inconsistency between master and slave is created.

How to trigger the bug is a good description of the bug itself.

1. Client does "BLPOP mylist 0" in master.
2. Master is turned into slave, that replicates from New-Master.
3. Client does "LPUSH mylist foo" in New-Master.
4. New-Master propagates write to slave.
5. Slave receives the LPUSH, the blocked client gets served.

Now Master "mylist" key has "foo", Slave "mylist" key is empty.

Highlights:

* At step "2" above, the client remains attached, basically escaping any
  check performed during command dispatch: read only slave, in that case.
* At step "5" the slave (that was the master), serves the blocked client
  consuming a list element, which is not consumed on the master side.

This scenario is technically likely to happen during failovers, however
since Redis Sentinel already disconnects clients using the CLIENT
command when changing the role of the instance, the bug is avoided in
Sentinel deployments.

Closes #2473.
2015-03-24 16:00:09 +01:00
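The five steps in the message above can be replayed with a toy model in C. This is only a sketch under simplifying assumptions: an instance is reduced to a list length and a count of blocked clients, and every name here (toy_instance, toy_become_slave, toy_lpush, toy_divergence) is hypothetical and does not appear in the Redis source.

```c
/* Toy model of one instance holding the "mylist" key. */
typedef struct {
    int list_len;         /* number of elements in "mylist" */
    int blocked_clients;  /* clients waiting in BLPOP on "mylist" */
} toy_instance;

/* Role switch to slave. Passing disconnect_blocked = 1 models the fix:
 * blocked clients are dropped when the role changes. */
void toy_become_slave(toy_instance *inst, int disconnect_blocked) {
    if (disconnect_blocked) inst->blocked_clients = 0;
}

/* Apply an LPUSH. If a client is blocked on the list, the new element
 * is handed to it immediately and so does not stay in the list. */
void toy_lpush(toy_instance *inst) {
    inst->list_len++;
    if (inst->blocked_clients > 0) {
        inst->blocked_clients--;
        inst->list_len--;
    }
}

/* Runs steps 1-5 and returns the new master's list length minus the
 * slave's list length; 0 means the two instances agree. */
int toy_divergence(int disconnect_blocked) {
    toy_instance old_master = {0, 0}, new_master = {0, 0};
    old_master.blocked_clients = 1;                     /* step 1 */
    toy_become_slave(&old_master, disconnect_blocked);  /* step 2 */
    toy_lpush(&new_master);                             /* step 3 */
    toy_lpush(&old_master);                             /* steps 4-5 */
    return new_master.list_len - old_master.list_len;
}
```

In this model toy_divergence(0) leaves the new master with one element and the slave with none, while toy_divergence(1) leaves both with one element.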
antirez
64ae753eb0 Cluster: redirection refactoring + handling of blocked clients.
There was a bug in Redis Cluster caused by clients blocked in a blocking
list pop operation, for keys no longer handled by the instance, or
in a condition where the cluster became down after the client blocked.

A typical situation is:

1) BLPOP <somekey> 0
2) <somekey> hash slot is resharded to another master.

The client will block forever in this case.

A symmetrical non-cluster-specific bug happens when an instance is
turned from master to slave. In that case it is more serious since this
will desynchronize data between slaves and masters. This other bug was
discovered as a side effect of thinking about the bug explained and
fixed in this commit, but will be fixed in a separate commit.
2015-03-24 11:56:24 +01:00
antirez
9b7f8b1c9b Cluster: redirection refactoring + handling of blocked clients.
There was a bug in Redis Cluster caused by clients blocked in a blocking
list pop operation, for keys no longer handled by the instance, or
in a condition where the cluster became down after the client blocked.

A typical situation is:

1) BLPOP <somekey> 0
2) <somekey> hash slot is resharded to another master.

The client will block forever in this case.

A symmetrical non-cluster-specific bug happens when an instance is
turned from master to slave. In that case it is more serious since this
will desynchronize data between slaves and masters. This other bug was
discovered as a side effect of thinking about the bug explained and
fixed in this commit, but will be fixed in a separate commit.
2015-03-24 11:56:24 +01:00
antirez
1bbe955b8b Cluster: fix Lua scripts replication to slave nodes. 2015-03-22 22:24:08 +01:00
antirez
2f4240b9d9 Cluster: fix Lua scripts replication to slave nodes. 2015-03-22 22:24:08 +01:00
antirez
6a44ada2ac Two cluster.c comments improved. 2015-03-21 12:12:23 +01:00
antirez
94030fa4d7 Two cluster.c comments improved. 2015-03-21 12:12:23 +01:00
antirez
f9090fccdd Cluster: TAKEOVER option for manual failover. 2015-03-21 11:54:32 +01:00
antirez
2950824ab6 Cluster: TAKEOVER option for manual failover. 2015-03-21 11:54:32 +01:00
antirez
fdbe2d6086 Fix typo in beforeSleep() comment. 2015-03-21 09:19:08 +01:00
antirez
d544600aa5 Fix typo in beforeSleep() comment. 2015-03-21 09:19:08 +01:00
antirez
4a8ec4f2bd Net: processUnblockedClients() and clientsArePaused() minor changes.
1. No need to set btype in processUnblockedClients(), since clients
   flagged REDIS_UNBLOCKED should have it already cleared.
2. When putting clients in the unblocked clients list, clientsArePaused()
   should flag them with REDIS_UNBLOCKED. Not strictly needed with the
   current code but is more coherent.
2015-03-21 09:13:29 +01:00
antirez
2b278a3394 Net: processUnblockedClients() and clientsArePaused() minor changes.
1. No need to set btype in processUnblockedClients(), since clients
   flagged REDIS_UNBLOCKED should have it already cleared.
2. When putting clients in the unblocked clients list, clientsArePaused()
   should flag them with REDIS_UNBLOCKED. Not strictly needed with the
   current code but is more coherent.
2015-03-21 09:13:29 +01:00
antirez
01fd23026c Net: clientsArePaused() should not touch blocked clients.
When the list of unblocked clients was processed, btype was set to
blocking type none, but the client remained flagged with REDIS_BLOCKED.
When timeout is reached (or when the client disconnects), unblocking it
will trigger an assertion.

There is no need to process pending requests from blocked clients, so
now clientsArePaused() just avoids touching blocked clients.

Close #2467.
2015-03-21 09:04:38 +01:00
antirez
5fe4a23131 Net: clientsArePaused() should not touch blocked clients.
When the list of unblocked clients was processed, btype was set to
blocking type none, but the client remained flagged with REDIS_BLOCKED.
When timeout is reached (or when the client disconnects), unblocking it
will trigger an assertion.

There is no need to process pending requests from blocked clients, so
now clientsArePaused() just avoids touching blocked clients.

Close #2467.
2015-03-21 09:04:38 +01:00
antirez
2946fba29d Cluster: non-conditional steps of slave failover refactored into a function. 2015-03-20 17:56:21 +01:00
antirez
a7010ae208 Cluster: non-conditional steps of slave failover refactored into a function. 2015-03-20 17:56:21 +01:00
antirez
dbe9d75c48 Cluster: separate unknown master check from the rest.
In no case should we attempt a failover if myself->slaveof is
NULL.
2015-03-20 16:56:59 +01:00
antirez
230d141420 Cluster: separate unknown master check from the rest.
In no case should we attempt a failover if myself->slaveof is
NULL.
2015-03-20 16:56:59 +01:00
antirez
7a24091ef4 Cluster: refactoring around configEpoch handling.
This commit moves the process of generating a new config epoch without
consensus out of the clusterCommand() implementation, in order to make
it reusable for other reasons (current target is to have a CLUSTER
FAILOVER option forcing the failover when no master majority is
reachable).

Moreover the commit moves other functions which are similarly related to
config epochs in a new logical section of the cluster.c file, just for
clarity.
2015-03-20 16:42:52 +01:00
antirez
4f2555aa17 Cluster: refactoring around configEpoch handling.
This commit moves the process of generating a new config epoch without
consensus out of the clusterCommand() implementation, in order to make
it reusable for other reasons (current target is to have a CLUSTER
FAILOVER option forcing the failover when no master majority is
reachable).

Moreover the commit moves other functions which are similarly related to
config epochs in a new logical section of the cluster.c file, just for
clarity.
2015-03-20 16:42:52 +01:00
antirez
99e8cc230d Cluster: better cluster state transition handling.
Before we relied on the global cluster state to make sure all the hash
slots are linked to some node, when getNodeByQuery() is called. So
finding the hash slot unbound was checked with an assertion. However
this is fragile. The cluster state is often updated in the
clusterBeforeSleep() function, and not ASAP on state change, so it may
happen to process clients with a cluster state that is 'ok' but yet
certain hash slots set to NULL.

With this commit the condition is also checked in getNodeByQuery() and
reported with an identical error code of -CLUSTERDOWN but slightly
different error message so that we have more debugging clues in the
future.

Root cause of issue #2288.
2015-03-20 09:59:28 +01:00
antirez
25c0f5ac63 Cluster: better cluster state transition handling.
Before we relied on the global cluster state to make sure all the hash
slots are linked to some node, when getNodeByQuery() is called. So
finding the hash slot unbound was checked with an assertion. However
this is fragile. The cluster state is often updated in the
clusterBeforeSleep() function, and not ASAP on state change, so it may
happen to process clients with a cluster state that is 'ok' but yet
certain hash slots set to NULL.

With this commit the condition is also checked in getNodeByQuery() and
reported with an identical error code of -CLUSTERDOWN but slightly
different error message so that we have more debugging clues in the
future.

Root cause of issue #2288.
2015-03-20 09:59:28 +01:00
antirez
0175a164e0 Cluster: move clusterBeforeSleep() call before unblocked clients processing.
Related to issue #2288.
2015-03-20 09:47:54 +01:00
antirez
2ecb5edf34 Cluster: move clusterBeforeSleep() call before unblocked clients processing.
Related to issue #2288.
2015-03-20 09:47:54 +01:00
antirez
ad7956ce89 Cluster: more robust slave check in CLUSTER REPLICATE.
There are rare conditions where node->slaveof may be NULL even if the
node is a slave. To check by flag is much more robust.
2015-03-18 12:10:14 +01:00
antirez
438a1a84e8 Cluster: more robust slave check in CLUSTER REPLICATE.
There are rare conditions where node->slaveof may be NULL even if the
node is a slave. To check by flag is much more robust.
2015-03-18 12:10:14 +01:00
Salvatore Sanfilippo
4aa4365112 Merge pull request #2386 from inkel/sentinel-add-client-command
Support CLIENT commands in Redis Sentinel
2015-03-13 18:23:36 +01:00