27398 Commits

Author SHA1 Message Date
Salvatore Sanfilippo
9f63e75a07 Merge pull request #2943 from sunheehnus/issue2855
lua_struct.c/getnum: throw error if overflow happen
2015-12-14 17:57:43 +01:00
Sun He
4dd05ab55e lua_struct.c/getnum: throw error if overflow happen
Fix issue #2855
2015-12-13 13:47:22 +08:00
Sun He
3a47c8cfb8 lua_struct.c/getnum: throw error if overflow happen
Fix issue #2855
2015-12-13 13:47:22 +08:00
antirez
51a0a15d31 Cluster: redis-trib: use variadic MIGRATE.
We use the new variadic/pipelined MIGRATE for faster migration.
Testing is not easy because to see the time it takes for a slot to be
migrated requires a very large data set, but even with all the overhead
of migrating multiple slots and to setup them properly, what used to
take 4 seconds (1 million keys, 200 slots migrated) is now 1.6 which is
a good improvement. However the improvement can be a lot larger if:

1. We use large datasets where a single slot has many keys.
2. By moving more than 10 keys per iteration, making this configurable,
   which is planned.

Close #2710
Close #2711
2015-12-11 18:12:56 +01:00
antirez
f0b168e894 Cluster: redis-trib: use variadic MIGRATE.
We use the new variadic/pipelined MIGRATE for faster migration.
Testing is not easy because to see the time it takes for a slot to be
migrated requires a very large data set, but even with all the overhead
of migrating multiple slots and to setup them properly, what used to
take 4 seconds (1 million keys, 200 slots migrated) is now 1.6 which is
a good improvement. However the improvement can be a lot larger if:

1. We use large datasets where a single slot has many keys.
2. By moving more than 10 keys per iteration, making this configurable,
   which is planned.

Close #2710
Close #2711
2015-12-11 18:12:56 +01:00
antirez
47cb5c697b MIGRATE: Fix key extraction for new form. 2015-12-11 18:09:01 +01:00
antirez
4e252e4c09 MIGRATE: Fix key extraction for new form. 2015-12-11 18:09:01 +01:00
antirez
b5967cf2ab MIGRATE: test more corner cases. 2015-12-11 14:27:08 +01:00
antirez
82fd74a118 MIGRATE: test more corner cases. 2015-12-11 14:27:08 +01:00
antirez
5e55c3929a MIGRATE: Fix new argument rewriting refcount handling. 2015-12-11 14:26:41 +01:00
antirez
ac0a731057 MIGRATE: Fix new argument rewriting refcount handling. 2015-12-11 14:26:41 +01:00
antirez
c7500f497c MIGRATE: fix replies processing and argument rewriting.
We need to process replies after errors in order to delete keys
successfully transferred. Also argument rewriting was fixed since
it was broken in several ways. Now a fresh argument vector is created
and set if we are acknowledged of at least one key.
2015-12-11 14:04:47 +01:00
antirez
d85fc1e9cf MIGRATE: fix replies processing and argument rewriting.
We need to process replies after errors in order to delete keys
successfully transferred. Also argument rewriting was fixed since
it was broken in several ways. Now a fresh argument vector is created
and set if we are acknowledged of at least one key.
2015-12-11 14:04:47 +01:00
antirez
c72ee4c632 Test: pipelined MIGRATE tests added. 2015-12-11 13:41:58 +01:00
antirez
29d680ed5a Test: pipelined MIGRATE tests added. 2015-12-11 13:41:58 +01:00
antirez
e3bb88e4f7 Pipelined multiple keys MIGRATE. 2015-12-11 13:38:26 +01:00
antirez
9ebf7a6776 Pipelined multiple keys MIGRATE. 2015-12-11 13:38:26 +01:00
antirez
73ef5586e6 Cluster: redis-trib migrate default timeout set to 60 sec. 2015-12-11 11:00:27 +01:00
antirez
e7945cf839 Cluster: redis-trib migrate default timeout set to 60 sec. 2015-12-11 11:00:27 +01:00
daniele
dd93862a1d redis-trib.rb: --timeout XXXXX option added to fix and reshard commands. Defaults to 15000 milliseconds 2015-12-11 10:59:08 +01:00
daniele
3d254e05f4 redis-trib.rb: --timeout XXXXX option added to fix and reshard commands. Defaults to 15000 milliseconds 2015-12-11 10:59:08 +01:00
antirez
60daa127fd Cluster: replica migration with delay.
We wait a fixed amount of time (5 seconds currently) much greater than
the usual Cluster node to node communication latency, before migrating.
This way when a failover occurs, before detecting the new master as a
target for migration, we give the time to its natural slaves (the slaves
of the failed over master) to announce they switched to the new master,
preventing an useless migration operation.
2015-12-11 09:19:06 +01:00
antirez
adc2fe6993 Cluster: replica migration with delay.
We wait a fixed amount of time (5 seconds currently) much greater than
the usual Cluster node to node communication latency, before migrating.
This way when a failover occurs, before detecting the new master as a
target for migration, we give the time to its natural slaves (the slaves
of the failed over master) to announce they switched to the new master,
preventing an useless migration operation.
2015-12-11 09:19:06 +01:00
antirez
c112520b8a Cluster: more reliable migration tests.
The old version was modeled with two failovers, however after the first
it is possible that another slave will migrate to the new master, since
for some time the new master is not backed by any slave. Probably there
should be some pause after a failover, before the migration. Anyway the
test is simpler in this way, and depends less on timing.
2015-12-10 12:58:28 +01:00
antirez
41db54a557 Cluster: more reliable migration tests.
The old version was modeled with two failovers, however after the first
it is possible that another slave will migrate to the new master, since
for some time the new master is not backed by any slave. Probably there
should be some pause after a failover, before the migration. Anyway the
test is simpler in this way, and depends less on timing.
2015-12-10 12:58:28 +01:00
antirez
a5e2a4e3fb Cluster: more reliable replicas migration test. 2015-12-10 09:11:02 +01:00
antirez
b55affbc0c Cluster: more reliable replicas migration test. 2015-12-10 09:11:02 +01:00
antirez
6aec0c37e2 Remove debugging message left there for error. 2015-12-10 08:56:33 +01:00
antirez
4159055f83 Remove debugging message left there for error. 2015-12-10 08:56:33 +01:00
antirez
6b630dc34e unlinkClient(): clear flags according to ops performed. 2015-12-09 23:06:44 +01:00
antirez
69897f5f30 unlinkClient(): clear flags according to ops performed. 2015-12-09 23:06:44 +01:00
antirez
923098bbda Fix replicas migration by adding a new flag.
Some time ago I broken replicas migration (reported in #2924).
The idea was to prevent masters without replicas from getting replicas
because of replica migration, I remember it to create issues with tests,
but there is no clue in the commit message about why it was so
undesirable.

However my patch as a side effect totally ruined the concept of replicas
migration since we want it to work also for instances that, technically,
never had slaves in the past: promoted slaves.

So now instead the ability to be targeted by replicas migration, is a
new flag "migrate-to". It only applies to masters, and is set in the
following two cases:

1. When a master gets a slave, it is set.
2. When a slave turns into a master because of fail over, it is set.

This way replicas migration targets are only masters that used to have
slaves, and slaves of masters (that used to have slaves... obviously)
and are promoted.

The new flag is only internal, and is never exposed in the output nor
persisted in the nodes configuration, since all the information to
handle it are implicit in the cluster configuration we already have.
2015-12-09 23:03:18 +01:00
antirez
e0f22df995 Fix replicas migration by adding a new flag.
Some time ago I broken replicas migration (reported in #2924).
The idea was to prevent masters without replicas from getting replicas
because of replica migration, I remember it to create issues with tests,
but there is no clue in the commit message about why it was so
undesirable.

However my patch as a side effect totally ruined the concept of replicas
migration since we want it to work also for instances that, technically,
never had slaves in the past: promoted slaves.

So now instead the ability to be targeted by replicas migration, is a
new flag "migrate-to". It only applies to masters, and is set in the
following two cases:

1. When a master gets a slave, it is set.
2. When a slave turns into a master because of fail over, it is set.

This way replicas migration targets are only masters that used to have
slaves, and slaves of masters (that used to have slaves... obviously)
and are promoted.

The new flag is only internal, and is never exposed in the output nor
persisted in the nodes configuration, since all the information to
handle it are implicit in the cluster configuration we already have.
2015-12-09 23:03:18 +01:00
antirez
3e9f8ae87e Fix typo UNCOMMENT -> COMMENT in example redis.conf. 2015-12-03 11:03:20 +01:00
antirez
f1472252eb Fix typo UNCOMMENT -> COMMENT in example redis.conf. 2015-12-03 11:03:20 +01:00
antirez
4623c72b54 Centralize slave replication handshake aborting.
Now we have a single function to call in any state of the slave
handshake, instead of using different functions for different states
which is error prone. Change performed in the context of issue #2479 but
does not fix it, since should be functionally identical to the past.
Just an attempt to make replication.c simpler to follow.
2015-12-03 10:38:56 +01:00
antirez
acc2336fd1 Centralize slave replication handshake aborting.
Now we have a single function to call in any state of the slave
handshake, instead of using different functions for different states
which is error prone. Change performed in the context of issue #2479 but
does not fix it, since should be functionally identical to the past.
Just an attempt to make replication.c simpler to follow.
2015-12-03 10:38:56 +01:00
antirez
74f83badbf Test HINCRBYFLOAT rounding only in x86_64 and when valgrind is not in use.
64 bit double math is not enough to make the test passing, and rounding
to 1.2999999 instead of 1.23 is not an error in the implementation.
Valgrind and sometimes other archs are not able to work with 80 bit
doubles.
2015-11-28 09:28:37 +01:00
antirez
fceaa46dda Test HINCRBYFLOAT rounding only in x86_64 and when valgrind is not in use.
64 bit double math is not enough to make the test passing, and rounding
to 1.2999999 instead of 1.23 is not an error in the implementation.
Valgrind and sometimes other archs are not able to work with 80 bit
doubles.
2015-11-28 09:28:37 +01:00
antirez
6990638623 fix sprintf and snprintf format string
There are some cases of printing unsigned integer with %d conversion
specificator and vice versa (signed integer with %u specificator).

Patch by Sergey Polovko. Backported to Redis from Disque.
2015-11-28 09:05:41 +01:00
antirez
96628cc40d fix sprintf and snprintf format string
There are some cases of printing unsigned integer with %d conversion
specificator and vice versa (signed integer with %u specificator).

Patch by Sergey Polovko. Backported to Redis from Disque.
2015-11-28 09:05:41 +01:00
antirez
0169de0b25 Fix typo in prepareClientToWrite() comment. 2015-11-27 16:17:21 +01:00
antirez
e6a5117426 Fix typo in prepareClientToWrite() comment. 2015-11-27 16:17:21 +01:00
antirez
a8e014a466 Merge branch 'unstable' of github.com:/antirez/redis into unstable 2015-11-27 16:12:23 +01:00
antirez
c2c68c50ef Merge branch 'unstable' of github.com:/antirez/redis into unstable 2015-11-27 16:12:23 +01:00
antirez
e3c11c1198 Handle wait3() errors.
My guess was that wait3() with WNOHANG could never return -1 and an
error. However issue #2897 may possibly indicate that this could happen
under non clear conditions. While we try to understand this better,
better to handle a return value of -1 explicitly, otherwise in the
case a BGREWRITE is in progress but wait3() returns -1, the effect is to
match the first branch of the if/else block since server.rdb_child_pid
is -1, and call backgroundSaveDoneHandler() without a good reason, that
will, in turn, crash the Redis server with an assertion.
2015-11-27 16:11:05 +01:00
antirez
da82723858 Handle wait3() errors.
My guess was that wait3() with WNOHANG could never return -1 and an
error. However issue #2897 may possibly indicate that this could happen
under non clear conditions. While we try to understand this better,
better to handle a return value of -1 explicitly, otherwise in the
case a BGREWRITE is in progress but wait3() returns -1, the effect is to
match the first branch of the if/else block since server.rdb_child_pid
is -1, and call backgroundSaveDoneHandler() without a good reason, that
will, in turn, crash the Redis server with an assertion.
2015-11-27 16:11:05 +01:00
Salvatore Sanfilippo
1298e7e079 Merge pull request #2899 from itamarhaber/patch-1
Revert Lua's `redis.LOG_<level>` to original
2015-11-27 15:46:16 +01:00
Salvatore Sanfilippo
816441865b Merge pull request #2899 from itamarhaber/patch-1
Revert Lua's `redis.LOG_<level>` to original
2015-11-27 15:46:16 +01:00
Itamar Haber
ebaff08613 Revert Lua's redis.LOG_<level> to original
Fixes #2898
2015-11-27 15:55:38 +02:00