4306 Commits

Author SHA1 Message Date
Salvatore Sanfilippo
03b51677c2 Merge pull request #1788 from zionwu/unstable
fix issue 1787
2014-06-06 10:33:11 +02:00
antirez
3cbd1ac23e Don't process min-slaves-to-write for slaves.
Replication is totally broken when a slave has this option, since it
stops accepting updates from masters.

This fixes issue #1434.
2014-06-05 10:48:05 +02:00
antirez
781d3ea87e Tests for min-slaves-* feature. 2014-06-05 10:46:12 +02:00
antirez
93ac1e49e8 Fixed dbuf variable scope in luaRedisGenericCommand().
I'm not sure whether pointing to 'dbuf' is a problem when its visibility
is limited to the inner block; the stack var is probably
guaranteed to live until the function returns. However, obvious code is
better anyway.
2014-06-04 18:57:12 +02:00
antirez
b4385aeec0 Regression test for issue #1118. 2014-06-04 18:51:20 +02:00
antirez
6c9e224755 Scripting: better Lua number -> string conversion in luaRedisGenericCommand().
The lua_to*string() family of functions use a non optimal format
specifier when converting integers to strings. This has both the problem
of the number being converted in exponential notation, which we don't
use as a Redis return value when floating point numbers are involved,
and, moreover, there is a loss of precision since the default format
specifier is not able to represent numbers that must be represented
exactly in the IEEE 754 number mantissa.

The new code handles it as a special case using a saner conversion.

This fixes issue #1118.
2014-06-04 18:33:24 +02:00
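The precision problem described in the commit above can be reproduced outside Redis. This is a minimal sketch, not the code from the commit: "%.14g" is the default number format in stock Lua 5.1, while "%.17g" is only an assumed stand-in for the "saner conversion", chosen because 17 significant digits are enough to round-trip an IEEE 754 double. The function names are hypothetical.

```python
def format_default(num: float) -> str:
    # "%.14g" mirrors the default Lua 5.1 number format.
    return "%.14g" % num


def format_wide(num: float) -> str:
    # Assumption for illustration: 17 significant digits, which is enough
    # to round-trip any IEEE 754 double.
    return "%.17g" % num


# An integer that a double holds exactly, but that needs 16 digits.
value = float(2**53 - 1)

print(format_default(value))  # exponential notation, digits are lost
print(format_wide(value))     # plain integer digits, nothing is lost
```

Parsing the first string back yields a different number, while the second one round-trips to the original value.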
zionwu
5e7a5a6e3c fix issue 1787 2014-06-01 02:23:24 +08:00
antirez
9dd14ee1d1 More trailing spaces in sentinel.c removed. 2014-05-28 15:46:05 +02:00
antirez
69ae75ebd8 Cluster test: add tmp dir to Git repo. 2014-05-26 18:08:12 +02:00
Salvatore Sanfilippo
8bacb6e0c9 Merge pull request #1775 from mattsta/fix-test-against-new-PID-format
Fix test framework to detect proper server PID
2014-05-26 17:56:58 +02:00
Matt Stancliff
4674e14ec7 Disable recursive watchdog signal handler
If we are in the signal handler, we don't want to handle
the signal again.  In extreme cases, this can cause a stack overflow
and segfault Redis.

Fixes #1771
2014-05-26 17:53:33 +02:00
antirez
2f6c99fc22 Cluster: always allow ok -> fail switch in clusterUpdateState().
For a time defined by REDIS_CLUSTER_WRITABLE_DELAY after startup as a
master, the fail -> ok switch is not possible; however, the opposite
switch (ok -> fail) should always be possible.
2014-05-26 16:24:12 +02:00
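The rule stated in the commit above can be sketched as a small state function. This is a simplified model rather than the code of clusterUpdateState(): the function name, its parameters, and the delay value are all assumptions made for illustration.

```python
WRITABLE_DELAY_MS = 2000  # placeholder value, assumed for illustration


def next_state(current, computed, is_master, ms_since_start,
               delay_ms=WRITABLE_DELAY_MS):
    """Return the state to adopt, given the freshly computed one."""
    # ok -> fail is always allowed, regardless of the startup delay.
    if computed == "fail":
        return "fail"
    # fail -> ok is held back while a master is still inside the delay.
    if current == "fail" and is_master and ms_since_start < delay_ms:
        return "fail"
    return "ok"


print(next_state("ok", "fail", True, 10))    # switch to fail right away
print(next_state("fail", "ok", True, 10))    # still inside the delay
print(next_state("fail", "ok", True, 5000))  # delay elapsed
```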
antirez
fd14e51048 Cluster test: catch FLUSHALL errors on node reset.
FLUSHALL will fail on read-only slaves, but there the command is not
needed in order to reset the instance with CLUSTER RESET so errors can
be ignored.
2014-05-26 11:00:11 +02:00
antirez
9df261fa21 Sentinel example config: explain you don't need to specify slaves. 2014-05-26 10:17:12 +02:00
Matt Stancliff
f642944040 Fix test framework to detect proper server PID
Previously the PID format was:
[PID] Timestamp

But it recently changed to:
PID:X Timestamp

The tcl testing framework was grabbing the PID from \[\d+\], but
that's not valid anymore.

Now we grab the pid from "PID: <PID>" in the part of Redis startup
output to the right of the ASCII logo.
2014-05-23 13:54:29 -04:00
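The change of pattern described in the commit above can be illustrated as follows. The real test framework is written in Tcl; this Python sketch only mirrors the idea, and the sample lines and function names are made up for the example.

```python
import re

# Old approach: the PID appeared in square brackets at the start of a line.
OLD_LOG_PREFIX = re.compile(r"^\[(\d+)\]")
# New approach: read "PID: <PID>" from the startup output.
STARTUP_PID = re.compile(r"PID: (\d+)")


def pid_from_old_prefix(line):
    match = OLD_LOG_PREFIX.search(line)
    return int(match.group(1)) if match else None


def pid_from_startup_output(text):
    match = STARTUP_PID.search(text)
    return int(match.group(1)) if match else None


# Hypothetical sample lines, not copied from real Redis output.
old_line = "[1234] 23 May 13:54:29.000 * Server started"
new_line = "1234:M 23 May 13:54:29.000 * Server started"
startup = "    Port: 6379\n    PID: 1234\n"

print(pid_from_old_prefix(old_line))     # found with the old format
print(pid_from_old_prefix(new_line))     # not found with the new format
print(pid_from_startup_output(startup))  # found in the startup output
```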
antirez
a1970d090a Cluster test: basic failover unit added. 2014-05-23 11:47:47 +02:00
antirez
d6fa9a31c3 Cluster test: move basic read/write test into a procedure. 2014-05-23 11:41:50 +02:00
antirez
789b41ee46 Cluster test: more reliable 01-faildet unit.
Do things in a sequence that prevents failover during failure detection.
2014-05-23 11:40:34 +02:00
antirez
8d3491ad29 redisLogFromHandler() format changed to match new logs format. 2014-05-22 19:24:35 +02:00
antirez
fa975ee9fb Tag every log line with role.
Every log line contains, just after the pid, a single character that provides
information about the role of an instance:

S - Slave
M - Master
C - Writing child
X - Sentinel
2014-05-22 18:48:37 +02:00
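A log consumer could read the role tag listed in the commit above along these lines. This is a sketch under assumptions: the exact prefix layout ("pid:role" followed by a space) is inferred from the "PID:X Timestamp" description elsewhere in this log, and the sample line is invented.

```python
import re

# Role characters as listed in the commit message.
ROLES = {"S": "slave", "M": "master", "C": "writing child", "X": "sentinel"}

# Assumed prefix layout: "<pid>:<role> <rest of the line>".
PREFIX = re.compile(r"^(\d+):([SMCX]) ")


def parse_prefix(line):
    """Return (pid, role name), or None if the line has no role tag."""
    match = PREFIX.match(line)
    if match is None:
        return None
    return int(match.group(1)), ROLES[match.group(2)]


# Hypothetical sample line.
print(parse_prefix("4321:S 22 May 18:48:37.000 * Connecting to MASTER"))
```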
antirez
aec8e92316 Cluster: slave validity factor is now user configurable.
Check the commit changes in the example redis.conf for more information.
2014-05-22 16:57:54 +02:00
antirez
894b8f7b94 Test: AOF test false positive when running in slow hosts.
The bug was triggered by running the test with Valgrind (which is a lot
slower and more sensitive to timing issues) after the recent changes
that made Redis more promptly able to reply with the -LOADING error.
2014-05-22 16:05:03 +02:00
antirez
170a7dc749 Test: dump.tcl fixed for RESTORE new error msg. 2014-05-22 15:56:17 +02:00
antirez
d493e5e7e6 Fix an error in redis-trib where we always talk with the same node.
While iterating the list of nodes we want to set the slot as stable in
the current node, not always in the first node of the list.
2014-05-21 18:17:02 +02:00
antirez
09d1576a38 redis-trib fix improved: move keys from N nodes to owner. 2014-05-21 16:40:46 +02:00
antirez
f48c31ba9f redis-trib fix: use MIGRATE REPLACE when fixing slots.
This fixes issue #1765.
2014-05-21 12:15:06 +02:00
antirez
7cc1e6c7a8 Regression test for issue #1764. 2014-05-20 16:20:16 +02:00
antirez
b2774d2124 Merge branch 'unstable' of github.com:/antirez/redis into unstable 2014-05-20 16:15:13 +02:00
Salvatore Sanfilippo
2ed0209ec2 Merge pull request #1764 from michael-grunder/lua_cache_segfault
Fix LUA_OBJCACHE segfault.
2014-05-20 16:14:34 +02:00
antirez
4b00916396 Remove trailing spaces from scripting.c 2014-05-20 16:11:22 +02:00
antirez
8b7981725b Remove trailing spaces from sentinel.c. 2014-05-20 14:22:42 +02:00
michael-grunder
6124589d27 Fix LUA_OBJCACHE segfault.
When scanning the argument list inside of a redis.call() invocation
for pre-cached values, there was no check being done that the
argument we were on was in fact within the bounds of the cache size.

So if a redis.call() command was ever executed with more than 32
arguments (current cache size #define setting) redis-server could
segfault.
2014-05-19 13:18:13 -07:00
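The missing check described in the commit above amounts to a bounds test before touching the cache. The following is a simplified Python model, not the C code in scripting.c; the helper name and the cache contents are hypothetical, and an unchecked index in Python would raise IndexError rather than crash the process.

```python
CACHE_SIZE = 32  # the cache size quoted in the commit message


def take_cached(cache, j):
    """Return and clear the cached object for argument j, or None."""
    # The bounds check is the point of the fix: arguments past the end of
    # the cache must never index into it.
    if j < len(cache) and cache[j] is not None:
        obj, cache[j] = cache[j], None
        return obj
    return None


cache = [f"obj{i}" for i in range(CACHE_SIZE)]
args = list(range(40))  # more arguments than cache entries
reused = [take_cached(cache, j) for j in range(len(args))]
```

Arguments 0 to 31 get a cached object, while arguments 32 and above fall through to `None` and would be allocated normally.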
antirez
a1f9154dbf HyperLogLog regression test for issue #1762. 2014-05-19 15:44:04 +02:00
Salvatore Sanfilippo
dbfaa04eb0 Merge pull request #1762 from trink/unstable
Correct the HyperLogLog stale cache flag to prevent unnecessary computat...
2014-05-19 15:39:30 +02:00
antirez
43ca660e33 Cluster test: better failure detection test and framework improvements. 2014-05-19 15:26:19 +02:00
antirez
18a2f12bc5 Cluster test: failure detection initial tests. 2014-05-19 11:39:15 +02:00
antirez
4b708f6f0b Cluster test: proper initialization at unit startup. 2014-05-19 11:24:15 +02:00
Mike Trinkala
ac1a3c7340 Correct the HyperLogLog stale cache flag to prevent unnecessary computations.
Set the MSB as documented.
2014-05-18 07:26:26 -07:00
antirez
352f4fbbd5 Cluster: use clusterSetNodeAsMaster() during slave failover.
clusterHandleSlaveFailover() was reimplementing what
clusterSetNodeAsMaster() does without any good reason.
2014-05-15 17:03:28 +02:00
antirez
b1d19fd6e6 Cluster: clear todo_before_sleep flags when executing actions.
Thanks to this change, when there is some code like:

    clusterDoBeforeSleep(CLUSTER_TODO_UPDATE_STATE|...);
    ... and later before returning to the event loop ...
    clusterUpdateState();

The clusterUpdateState() function will clear the flag and will not be
repeated in the clusterBeforeSleep() function. This is especially important
for config save/fsync flags which are slow to execute and not a good
idea to repeat without a good reason.

This is implemented for all the CLUSTER_TODO flags.
2014-05-15 16:33:13 +02:00
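The flag handling described in the commit above can be modelled with a small bit mask. This is a sketch only: the class, the second flag name, and the call log are hypothetical and exist just to show that an action which already ran is not repeated before sleeping.

```python
TODO_UPDATE_STATE = 1 << 0
TODO_SAVE_CONFIG = 1 << 1  # hypothetical second flag for the example


class Node:
    def __init__(self):
        self.todo = 0
        self.calls = []  # records which actions actually ran

    def do_before_sleep(self, flags):
        self.todo |= flags

    def update_state(self):
        # Executing the action clears its pending flag.
        self.todo &= ~TODO_UPDATE_STATE
        self.calls.append("update_state")

    def save_config(self):
        self.todo &= ~TODO_SAVE_CONFIG
        self.calls.append("save_config")

    def before_sleep(self):
        # Only actions whose flag is still set are executed here.
        if self.todo & TODO_UPDATE_STATE:
            self.update_state()
        if self.todo & TODO_SAVE_CONFIG:
            self.save_config()


node = Node()
node.do_before_sleep(TODO_UPDATE_STATE | TODO_SAVE_CONFIG)
node.update_state()  # called explicitly before returning to the event loop
node.before_sleep()  # runs save_config only; update_state is not repeated
```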
antirez
73b3fcde40 Fixed typo in CLUSTER RESET implementation. 2014-05-15 12:33:57 +02:00
antirez
b5765f3a73 CLUSTER RESET implemented.
The new command is able to reset a cluster node so that it starts again
as a fresh node. By default the command performs a soft reset (the same
as calling it as CLUSTER RESET SOFT), and the following steps are
performed:

1) All slots are set as unassigned.
2) The list of known nodes is flushed.
3) Node is set as master if it is a slave.

When a hard reset is performed with CLUSTER RESET HARD the following
additional operations are performed:

4) A new Node ID is created at random.
5) Epochs are set to 0.

CLUSTER RESET is useful both when the sysadmin wants to reconfigure a
node with a different role (for example turning a slave into a master)
and for testing purposes.

It may also play a role in automatically provisioned Redis Clusters,
since it allows a node to be reset back to the initial state in order to
be reconfigured.
2014-05-15 11:43:06 +02:00
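The numbered steps in the commit above can be sketched as a state transition. This is an illustrative model, not the Redis implementation: the NodeState fields, the two epoch names, and the 40 hex character Node ID format are assumptions made for the example.

```python
import secrets
from dataclasses import dataclass, field


@dataclass
class NodeState:
    node_id: str
    is_slave: bool = False
    slots: set = field(default_factory=set)
    known_nodes: dict = field(default_factory=dict)
    current_epoch: int = 0  # assumed epoch fields
    config_epoch: int = 0


def cluster_reset(node, hard=False):
    node.slots.clear()        # 1) all slots are set as unassigned
    node.known_nodes.clear()  # 2) the list of known nodes is flushed
    node.is_slave = False     # 3) node is set as master if it is a slave
    if hard:
        node.node_id = secrets.token_hex(20)  # 4) new random Node ID
        node.current_epoch = 0                # 5) epochs are set to 0
        node.config_epoch = 0


node = NodeState(node_id="a" * 40, is_slave=True, slots={1, 2},
                 known_nodes={"b" * 40: "10.0.0.2:7001"},
                 current_epoch=7, config_epoch=5)
cluster_reset(node)             # soft reset keeps the ID and the epochs
soft_snapshot = (node.node_id, node.current_epoch, node.config_epoch)
cluster_reset(node, hard=True)  # hard reset also replaces them
```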
antirez
f48f8fda62 Remove trailing spaces from cluster.c file. 2014-05-15 10:18:36 +02:00
antirez
dcf52ec3d6 Cluster test: added function assert_cluster_state. 2014-05-14 15:21:57 +02:00
antirez
0ca22608a8 Cluster: don't accept cluster bus connections during startup. 2014-05-14 12:05:00 +02:00
antirez
716b729ab9 Cluster: better handling of stolen slots.
The previous code handling a lost slot (by another master with a higher
configuration for the slot) was defensive, considering it an error and
putting the cluster in an odd state requiring redis-cli fix.

This was changed, because actually this only happens either in a
legitimate way, with failovers, or when the admin messed with the config
in order to reconfigure the cluster. So the new code instead will try to
make sure that the keys stored match the new slots map, by removing all
the keys in the slots we lost ownership of.

The function that deletes the keys from the lost slots is called only
if the node does not lose all its slots (resulting in a reconfiguration
as a slave of the node that got ownership). This is an optimization
since the replication code will anyway flush all the instance data in
a faster way.
2014-05-14 10:46:37 +02:00
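The decision described in the commit above can be modelled as follows. This is a sketch under assumptions: the function name, the returned action labels, and the dictionary of keys per slot are hypothetical, and the real code works on the node's slot bitmap and keyspace.

```python
def handle_lost_slots(owned, lost, keys_by_slot):
    """Return (remaining slots, action taken) after losing some slots."""
    remaining = owned - lost
    if not remaining:
        # All slots are gone: the node becomes a slave of the new owner and
        # replication flushes the data anyway, so per-slot deletion is skipped.
        return remaining, "become_slave"
    # Some slots remain: drop the keys of the slots no longer owned so the
    # stored keys match the new slots map.
    for slot in owned & lost:
        keys_by_slot.pop(slot, None)
    return remaining, "deleted_keys"


keys = {1: ["a"], 2: ["b"], 3: ["c"]}
partial = handle_lost_slots({1, 2, 3}, {2}, keys)

other_keys = {7: ["x"]}
total = handle_lost_slots({7}, {7}, other_keys)
```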
antirez
f866ae1e91 cluster.tcl: fix redis links leak in refresh_nodes_map. 2014-05-14 09:10:03 +02:00
antirez
43c3cb9a57 cluster.tcl: saner error handling.
Better handling of connection errors in order to update the table and
recover; populate the startup nodes table after fetching the list of
nodes.

More work remains: it is still not as reliable as the
redis-rb-cluster implementation, which is the minimal reference
implementation for Redis Cluster clients.
2014-05-14 00:15:52 +02:00
antirez
3d4c02a555 redis.tcl: return I/O error message when peer closes connection. 2014-05-14 00:14:35 +02:00
antirez
e7b8d75ba3 Cluster: fixed data_age computation / check integer overflow. 2014-05-12 17:46:15 +02:00