20879 Commits

Author SHA1 Message Date
antirez
b221e13dac CONFIG REWRITE: don't wipe unknown options.
With this commit options not explicitly rewritten by CONFIG REWRITE are
not touched at all. These include new options that may not have support
for REWRITE, and other special cases like rename-command and include.
2013-12-19 15:25:45 +01:00
antirez
e48365e2c2 Cluster: set n->slaves to NULL in clusterNodeResetSlaves().
The value was otherwise undefined, so next time the node was promoted
again from slave to master, adding a slave to the list of slaves
would likely crash the server or result into undefined behavior.
2013-12-17 14:50:24 +01:00
antirez
7a666ac419 Cluster: set n->slaves to NULL in clusterNodeResetSlaves().
The value was otherwise undefined, so next time the node was promoted
again from slave to master, adding a slave to the list of slaves
would likely crash the server or result into undefined behavior.
2013-12-17 14:50:24 +01:00
antirez
7f51cf8b56 Cluster: check link is valid before sending UPDATE. 2013-12-17 12:28:37 +01:00
antirez
fda91dbde3 Cluster: check link is valid before sending UPDATE. 2013-12-17 12:28:37 +01:00
antirez
c17be18035 Cluster: initialize todo_before_sleep flags to 0. 2013-12-17 12:22:02 +01:00
antirez
f57bb36ce7 Cluster: initialize todo_before_sleep flags to 0. 2013-12-17 12:22:02 +01:00
antirez
195aab3345 Cluster: use proper type mstime_t for ping delay var. 2013-12-17 10:27:36 +01:00
antirez
c70c0c6db7 Cluster: use proper type mstime_t for ping delay var. 2013-12-17 10:27:36 +01:00
antirez
b1d3dd657d Cluster: use an hardcoded 60 sec timeout in redis-trib connections.
Later this should be configurable from the command line but at least now
we use something more appropriate for our use case compared to the
redis-rb default timeout.
2013-12-17 10:00:33 +01:00
antirez
7c1cbdceb2 Cluster: use an hardcoded 60 sec timeout in redis-trib connections.
Later this should be configurable from the command line but at least now
we use something more appropriate for our use case compared to the
redis-rb default timeout.
2013-12-17 10:00:33 +01:00
antirez
118d0fb533 Fixed clearNodeFailureIfNeeded() time type to mstime_t.
This prevented 32bit cluster instances from clearing the FAIL flag when
needed.
2013-12-17 09:45:52 +01:00
antirez
47815d38e0 Fixed clearNodeFailureIfNeeded() time type to mstime_t.
This prevented 32bit cluster instances from clearing the FAIL flag when
needed.
2013-12-17 09:45:52 +01:00
antirez
9180fb7931 Cluster: use long long for timestamps in clusterGenNodesDescription().
Ping sent and pong received fields need to be casted to long long to be
printed correctly into 32 bit systems.
2013-12-17 09:38:11 +01:00
antirez
e88e6a6334 Cluster: use long long for timestamps in clusterGenNodesDescription().
Ping sent and pong received fields need to be casted to long long to be
printed correctly into 32 bit systems.
2013-12-17 09:38:11 +01:00
antirez
229267abd1 Makefile.dep updated. 2013-12-13 13:10:05 +01:00
antirez
2dfc5e35a9 Makefile.dep updated. 2013-12-13 13:10:05 +01:00
antirez
69a3303c18 SDIFF iterator misuse fixed in diff algorithm #1.
The bug could be easily triggered by:

    SADD foo a b c 1 2 3 4 5 6
    SDIFF foo foo

When the key was the same in two sets, an unsafe iterator was used to
check existence of elements in the same set we were iterating.
Usually this would just result into a wrong output, however with the
dict.c API misuse protection we have in place, the result was actually
an assertion failed that was triggered by the CI test, while creating
random datasets for the "MASTER and SLAVE consistency" test.
2013-12-13 11:34:21 +01:00
antirez
c00453da1d SDIFF iterator misuse fixed in diff algorithm #1.
The bug could be easily triggered by:

    SADD foo a b c 1 2 3 4 5 6
    SDIFF foo foo

When the key was the same in two sets, an unsafe iterator was used to
check existence of elements in the same set we were iterating.
Usually this would just result into a wrong output, however with the
dict.c API misuse protection we have in place, the result was actually
an assertion failed that was triggered by the CI test, while creating
random datasets for the "MASTER and SLAVE consistency" test.
2013-12-13 11:34:21 +01:00
antirez
134b4e97e7 Sentinel: dead code removed. 2013-12-13 11:01:13 +01:00
antirez
5320148883 Sentinel: dead code removed. 2013-12-13 11:01:13 +01:00
antirez
34a9b7d656 Makefile: remove odd syntax not compatible with some make versions.
See issue #1448.
2013-12-12 15:19:39 +01:00
antirez
452dea30f6 Makefile: remove odd syntax not compatible with some make versions.
See issue #1448.
2013-12-12 15:19:39 +01:00
Salvatore Sanfilippo
a0c777c99a Merge pull request #1460 from codeeply/simplify2
comment mistake fixed
2013-12-12 02:23:44 -08:00
Salvatore Sanfilippo
a99c751d6c Merge pull request #1460 from codeeply/simplify2
comment mistake fixed
2013-12-12 02:23:44 -08:00
codeeply
45637a3a82 comment mistake fixed 2013-12-12 16:33:29 +08:00
codeeply
0f06f8df07 comment mistake fixed 2013-12-12 16:33:29 +08:00
antirez
982e2855b8 Replication: publish the slave_repl_offset when disconnected from master.
When a slave was disconnected from its master the replication offset was
reported as -1. Now it is reported as the replication offset of the
previous master, so that failover can be performed using this value in
order to try to select a slave with more processed data from a set of
slaves of the old master.
2013-12-11 15:23:15 +01:00
antirez
a5ec247f13 Replication: publish the slave_repl_offset when disconnected from master.
When a slave was disconnected from its master the replication offset was
reported as -1. Now it is reported as the replication offset of the
previous master, so that failover can be performed using this value in
order to try to select a slave with more processed data from a set of
slaves of the old master.
2013-12-11 15:23:15 +01:00
Salvatore Sanfilippo
c0ff9075b6 Merge pull request #1451 from yossigo/unbalanced-quotes-fix
Return proper error on requests with an unbalanced number of quotes.
2013-12-11 03:06:18 -08:00
Salvatore Sanfilippo
0a89d9a0b1 Merge pull request #1451 from yossigo/unbalanced-quotes-fix
Return proper error on requests with an unbalanced number of quotes.
2013-12-11 03:06:18 -08:00
Yossi Gottlieb
74d9f048fa Fix wrong repldboff type which causes dropped replication in rare cases. 2013-12-11 11:38:02 +01:00
Yossi Gottlieb
88a5cede88 Fix wrong repldboff type which causes dropped replication in rare cases. 2013-12-11 11:38:02 +01:00
antirez
ccd6ccc7dd Slaves heartbeats during sync improved.
The previous fix for false positive timeout detected by master was not
complete. There is another blocking stage while loading data for the
first synchronization with the master, that is, flushing away the
current data from the DB memory.

This commit uses the newly introduced dict.c callback in order to make
some incremental work (to send "\n" heartbeats to the master) while
flushing the old data from memory.

It is hard to write a regression test for this issue unfortunately. More
support for debugging in the Redis core would be needed in terms of
functionalities to simulate a slow DB loading / deletion.
2013-12-10 18:47:31 +01:00
antirez
11120689c4 Slaves heartbeats during sync improved.
The previous fix for false positive timeout detected by master was not
complete. There is another blocking stage while loading data for the
first synchronization with the master, that is, flushing away the
current data from the DB memory.

This commit uses the newly introduced dict.c callback in order to make
some incremental work (to send "\n" heartbeats to the master) while
flushing the old data from memory.

It is hard to write a regression test for this issue unfortunately. More
support for debugging in the Redis core would be needed in terms of
functionalities to simulate a slow DB loading / deletion.
2013-12-10 18:47:31 +01:00
antirez
247a311317 dict.c: added optional callback to dictEmpty().
Redis hash table implementation has many non-blocking features like
incremental rehashing, however while deleting a large hash table there
was no way to have a callback called to do some incremental work.

This commit adds this support, as an optiona callback argument to
dictEmpty() that is currently called at a fixed interval (one time every
65k deletions).
2013-12-10 18:46:24 +01:00
antirez
2eb781b35b dict.c: added optional callback to dictEmpty().
Redis hash table implementation has many non-blocking features like
incremental rehashing, however while deleting a large hash table there
was no way to have a callback called to do some incremental work.

This commit adds this support, as an optiona callback argument to
dictEmpty() that is currently called at a fixed interval (one time every
65k deletions).
2013-12-10 18:46:24 +01:00
antirez
2860b5e234 Log empty DB + Loading data into two separated messages. 2013-12-10 18:43:25 +01:00
antirez
2c4ab8a534 Log empty DB + Loading data into two separated messages. 2013-12-10 18:43:25 +01:00
antirez
18d92c0836 Don't send more than 1 newline/sec while loading RDB. 2013-12-10 18:43:19 +01:00
antirez
7c531eb5ad Don't send more than 1 newline/sec while loading RDB. 2013-12-10 18:43:19 +01:00
antirez
54a526687d Slaves heartbeat while loading RDB files.
Starting with Redis 2.8 masters are able to detect timed out slaves,
while before 2.8 only slaves were able to detect a timed out master.

Now that timeout detection is bi-directional the following problem
happens as described "in the field" by issue #1449:

1) Master and slave setup with big dataset.
2) Slave performs the first synchronization, or a full sync
   after a failed partial resync.
3) Master sends the RDB payload to the slave.
4) Slave loads this payload.
5) Master detects the slave as timed out since does not receive back the
   REPLCONF ACK acknowledges.

Here the problem is that the master has no way to know how much the
slave will take to load the RDB file in memory. The obvious solution is
to use a greater replication timeout setting, but this is a shame since
for the 0.1% of operation time we are forced to use a timeout that is
not what is suited for 99.9% of operation time.

This commit tries to fix this problem with a solution that is a bit of
an hack, but that modifies little of the replication internals, in order
to be back ported to 2.8 safely.

During the RDB loading time, we send the master newlines to avoid
being sensed as timed out. This is the same that the master already does
while saving the RDB file to still signal its presence to the slave.

The single newline is used because:

1) It can't desync the protocol, as it is only transmitted all or
nothing.
2) It can be safely sent while we don't have a client structure for the
master or in similar situations just with write(2).
2013-12-09 20:26:00 +01:00
antirez
27db38d069 Slaves heartbeat while loading RDB files.
Starting with Redis 2.8 masters are able to detect timed out slaves,
while before 2.8 only slaves were able to detect a timed out master.

Now that timeout detection is bi-directional the following problem
happens as described "in the field" by issue #1449:

1) Master and slave setup with big dataset.
2) Slave performs the first synchronization, or a full sync
   after a failed partial resync.
3) Master sends the RDB payload to the slave.
4) Slave loads this payload.
5) Master detects the slave as timed out since does not receive back the
   REPLCONF ACK acknowledges.

Here the problem is that the master has no way to know how much the
slave will take to load the RDB file in memory. The obvious solution is
to use a greater replication timeout setting, but this is a shame since
for the 0.1% of operation time we are forced to use a timeout that is
not what is suited for 99.9% of operation time.

This commit tries to fix this problem with a solution that is a bit of
an hack, but that modifies little of the replication internals, in order
to be back ported to 2.8 safely.

During the RDB loading time, we send the master newlines to avoid
being sensed as timed out. This is the same that the master already does
while saving the RDB file to still signal its presence to the slave.

The single newline is used because:

1) It can't desync the protocol, as it is only transmitted all or
nothing.
2) It can be safely sent while we don't have a client structure for the
master or in similar situations just with write(2).
2013-12-09 20:26:00 +01:00
antirez
ae81525d35 Handle inline requested terminated with just \n. 2013-12-09 13:28:39 +01:00
antirez
eaf1bfb88b Handle inline requested terminated with just \n. 2013-12-09 13:28:39 +01:00
Yossi Gottlieb
834a5f530d Return proper error on requests with an unbalanced number of quotes. 2013-12-08 12:58:12 +02:00
Yossi Gottlieb
6e70c01148 Return proper error on requests with an unbalanced number of quotes. 2013-12-08 12:58:12 +02:00
antirez
b6d79f34e8 Sentinel: fix reported role info sampling.
The way the role change was recoded was not sane and too much
convoluted, causing the role information to be not always updated.

This commit fixes issue #1445.
2013-12-06 12:46:56 +01:00
antirez
c590549e40 Sentinel: fix reported role info sampling.
The way the role change was recoded was not sane and too much
convoluted, causing the role information to be not always updated.

This commit fixes issue #1445.
2013-12-06 12:46:56 +01:00
antirez
33ea913329 Sentinel: fix reported role fields when master is reset.
When there is a master address switch, the reported role must be set to
master so that we have a chance to re-sample the INFO output to check if
the new address is reporting the right role.

Otherwise if the role was wrong, it will be sensed as wrong even after
the address switch, and for enough time according to the role change
time, for Sentinel consider the master SDOWN.

This fixes isue #1446, that describes the effects of this bug in
practice.
2013-12-06 11:37:46 +01:00