1069 Commits

Author SHA1 Message Date
antirez
ce1f9cf81d PSYNC2 test: check ability to resync after restart. 2016-11-29 11:15:16 +01:00
antirez
02b570290e PSYNC2 test: check ability to resync after restart. 2016-11-29 11:15:16 +01:00
antirez
93c5198c17 PSYNC2 test: 20 seconds are enough... 2016-11-29 10:27:53 +01:00
antirez
17c02cee7d PSYNC2 test: 20 seconds are enough... 2016-11-29 10:27:53 +01:00
antirez
f6e42f0e3f PSYNC2 test: test added to the default tests. 2016-11-29 10:25:42 +01:00
antirez
855b0c05e7 PSYNC2 test: test added to the default tests. 2016-11-29 10:25:42 +01:00
antirez
c8f0690255 PSYNC2 test: modify the test for production. 2016-11-29 10:22:40 +01:00
antirez
49af69dfdd PSYNC2 test: modify the test for production. 2016-11-29 10:22:40 +01:00
antirez
eab865a0a1 PSYNC2: stop sending newlines to sub-slaves when master is down.
This actually includes two changes:

1) No newlines to take the master-slave link up when the upstream master
is down. Doing this is dangerous because the sub-slave often is received
replication protocol for an half-command, so can't receive newlines
without desyncing the replication link, even with the code in order to
cancel out the bytes that PSYNC2 was using. Moreover this is probably
also not needed/sane, because anyway the slave can keep serving
requests, and because if it's configured to don't serve stale data, it's
a good idea, actually, to break the link.

2) When a +CONTINUE with a different ID is received, we now break
connection with the sub-slaves: they need to be notified as well. This
was part of the original specification but for some reason it was not
implemented in the code, and was alter found as a PSYNC2 bug in the
integration testing.
2016-11-28 17:54:04 +01:00
antirez
5639ca070b PSYNC2: stop sending newlines to sub-slaves when master is down.
This actually includes two changes:

1) No newlines to take the master-slave link up when the upstream master
is down. Doing this is dangerous because the sub-slave often is received
replication protocol for an half-command, so can't receive newlines
without desyncing the replication link, even with the code in order to
cancel out the bytes that PSYNC2 was using. Moreover this is probably
also not needed/sane, because anyway the slave can keep serving
requests, and because if it's configured to don't serve stale data, it's
a good idea, actually, to break the link.

2) When a +CONTINUE with a different ID is received, we now break
connection with the sub-slaves: they need to be notified as well. This
was part of the original specification but for some reason it was not
implemented in the code, and was alter found as a PSYNC2 bug in the
integration testing.
2016-11-28 17:54:04 +01:00
antirez
16559a02fc PSYNC2: Test (WIP).
This is the PSYNC2 test that helped find issues in the code, and that
still can show a protocol desync from time to time. Work is in progress
in order to find the issue. For now the test is not enabled in "make
test" and must be run manually.
2016-11-28 10:13:24 +01:00
antirez
893f757ebc PSYNC2: Test (WIP).
This is the PSYNC2 test that helped find issues in the code, and that
still can show a protocol desync from time to time. Work is in progress
in order to find the issue. For now the test is not enabled in "make
test" and must be run manually.
2016-11-28 10:13:24 +01:00
antirez
f115461f4e Test: WAIT tests added in wait.tcl unit. 2016-11-18 13:10:29 +01:00
antirez
eb5c80460e Test: WAIT tests added in wait.tcl unit. 2016-11-18 13:10:29 +01:00
antirez
9749e96f42 Test: regression test for #3564 added. 2016-10-31 15:46:58 +01:00
antirez
47396311df Test: regression test for #3564 added. 2016-10-31 15:46:58 +01:00
antirez
f633212073 Fix SELECT test, broken cause change in error msg. 2016-10-14 15:48:11 +02:00
antirez
96dad4da1f Fix SELECT test, broken cause change in error msg. 2016-10-14 15:48:11 +02:00
antirez
dacb69ed00 RDB AOF preamble: test it in the aofrw unit. 2016-08-24 15:39:39 +02:00
antirez
cf5154f72e RDB AOF preamble: test it in the aofrw unit. 2016-08-24 15:39:39 +02:00
antirez
356a6304ec Multiple GEORADIUS bugs fixed.
By grepping the continuous integration errors log a number of GEORADIUS
tests failures were detected.

Fortunately when a GEORADIUS failure happens, the test suite logs enough
information in order to reproduce the problem: the PRNG seed,
coordinates and radius of the query.

By reproducing the issues, three different bugs were discovered and
fixed in this commit. This commit also improves the already good
reporting of the fuzzer and adds the failure vectors as regression
tests.

The issues found:

1. We need larger squares around the poles in order to cover the area
requested by the user. There were already checks in order to use a
smaller step (larger squares) but the limit set (+/- 67 degrees) is not
enough in certain edge cases, so 66 is used now.

2. Even near the equator, when the search area center is very near the
edge of the square, the north, south, west or ovest square may not be
able to fully cover the specified radius. Now a test is performed at the
edge of the initial guessed search area, and larger squares are used in
case the test fails.

3. Because of rounding errors between Redis and Tcl, sometimes the test
signaled false positives. This is now addressed.

Whenever possible the original code was improved a bit in other ways. A
debugging example stanza was added in order to make the next debugging
session simpler when the next bug is found.
2016-07-27 11:34:25 +02:00
antirez
976c2425c4 Multiple GEORADIUS bugs fixed.
By grepping the continuous integration errors log a number of GEORADIUS
tests failures were detected.

Fortunately when a GEORADIUS failure happens, the test suite logs enough
information in order to reproduce the problem: the PRNG seed,
coordinates and radius of the query.

By reproducing the issues, three different bugs were discovered and
fixed in this commit. This commit also improves the already good
reporting of the fuzzer and adds the failure vectors as regression
tests.

The issues found:

1. We need larger squares around the poles in order to cover the area
requested by the user. There were already checks in order to use a
smaller step (larger squares) but the limit set (+/- 67 degrees) is not
enough in certain edge cases, so 66 is used now.

2. Even near the equator, when the search area center is very near the
edge of the square, the north, south, west or ovest square may not be
able to fully cover the specified radius. Now a test is performed at the
edge of the initial guessed search area, and larger squares are used in
case the test fails.

3. Because of rounding errors between Redis and Tcl, sometimes the test
signaled false positives. This is now addressed.

Whenever possible the original code was improved a bit in other ways. A
debugging example stanza was added in order to make the next debugging
session simpler when the next bug is found.
2016-07-27 11:34:25 +02:00
antirez
8b76d55f2e Sentinel: new test unit 07 that tests master down conditions. 2016-07-22 16:39:26 +02:00
antirez
7534ce9996 Sentinel: new test unit 07 that tests master down conditions. 2016-07-22 16:39:26 +02:00
antirez
3e9ce38b0a Sentinel: check Slave INFO state more often when disconnected.
During the initial handshake with the master a slave will report to have
a very high disconnection time from its master (since technically it was
disconnected since forever, so the current UNIX time in seconds is
reported).

However when the slave is connected again the Sentinel may re-scan the
INFO output again only after 10 seconds, which is a long time. During
this time Sentinels will consider this instance unable to failover, so
a useless delay is introduced.

Actaully this hardly happened in the practice because when a slave's
master is down, the INFO period for slaves changes to 1 second. However
when a manual failover is attempted immediately after adding slaves
(like in the case of the Sentinel unit test), this problem may happen.

This commit changes the INFO period to 1 second even in the case the
slave's master is not down, but the slave reported to be disconnected
from the master (by publishing, last time we checked, a master
disconnection time field in INFO).

This change is required as a result of an unrelated change in the
replication code that adds a small delay in the master-slave first
synchronization.
2016-07-22 10:51:25 +02:00
antirez
f844032233 Sentinel: check Slave INFO state more often when disconnected.
During the initial handshake with the master a slave will report to have
a very high disconnection time from its master (since technically it was
disconnected since forever, so the current UNIX time in seconds is
reported).

However when the slave is connected again the Sentinel may re-scan the
INFO output again only after 10 seconds, which is a long time. During
this time Sentinels will consider this instance unable to failover, so
a useless delay is introduced.

Actaully this hardly happened in the practice because when a slave's
master is down, the INFO period for slaves changes to 1 second. However
when a manual failover is attempted immediately after adding slaves
(like in the case of the Sentinel unit test), this problem may happen.

This commit changes the INFO period to 1 second even in the case the
slave's master is not down, but the slave reported to be disconnected
from the master (by publishing, last time we checked, a master
disconnection time field in INFO).

This change is required as a result of an unrelated change in the
replication code that adds a small delay in the master-slave first
synchronization.
2016-07-22 10:51:25 +02:00
antirez
abb3385e8d Regression test for issue #3333. 2016-07-06 11:50:20 +02:00
antirez
669db62ae1 Regression test for issue #3333. 2016-07-06 11:50:20 +02:00
antirez
a0dd0140f3 Fix test for new RDB checksum failure message. 2016-07-04 12:41:35 +02:00
antirez
1ce90bd39f Fix test for new RDB checksum failure message. 2016-07-04 12:41:35 +02:00
antirez
24bd9b19f6 Test: new randomized stress tester for #3343 alike bugs. 2016-06-28 09:42:20 +02:00
antirez
163f95978b Test: new randomized stress tester for #3343 alike bugs. 2016-06-28 09:42:20 +02:00
antirez
f983318e52 Stress tester WIP. 2016-06-28 09:33:36 +02:00
antirez
30e2f37ab7 Stress tester WIP. 2016-06-28 09:33:36 +02:00
antirez
49899866c8 Regression test for issue #3343 exact min crash sequence.
Note: it was verified that it can crash the test suite without the patch
applied.
2016-06-28 09:27:14 +02:00
antirez
2f3b3ae896 Regression test for issue #3343 exact min crash sequence.
Note: it was verified that it can crash the test suite without the patch
applied.
2016-06-28 09:27:14 +02:00
antirez
3bd20ea2f1 Test TOUCH and new TTL / TYPE behavior about object access time. 2016-06-15 17:15:51 +02:00
antirez
343b54cdbe Test TOUCH and new TTL / TYPE behavior about object access time. 2016-06-15 17:15:51 +02:00
antirez
212f157855 Regression test for #3282. 2016-06-15 11:49:49 +02:00
Pierre Chapuis
d88c3c77be make RPUSHX and LPUSHX variadic 2016-06-05 16:50:24 +02:00
antirez
b64fcbc74c Test: run GEO tests by default.
Thanks to @oranagra for noticing it was missing.
2016-05-31 16:43:51 +02:00
antirez
231c9db1b5 Now that SPOP can be called by scripts use BLPOP on 's' flag test. 2016-05-31 16:43:23 +02:00
antirez
078f46126c Test for BITFIELD regression #3221. 2016-05-18 14:53:30 +02:00
antirez
f9ee039a76 Scripting test: match new error message. 2016-05-06 09:12:56 +02:00
antirez
4c53bab17b Cluster test 12: reshard back just a few slots to speedup the test. 2016-05-05 11:57:49 +02:00
antirez
0bb787d3ad Quick fix to avoid false positive in replica migration test. 2016-05-05 09:45:31 +02:00
Salvatore Sanfilippo
b5352eea51 Merge pull request #3191 from oranagra/minor_fix
Minor fixes found during merge and code review
2016-05-04 09:11:36 +02:00
antirez
9c48f28e54 Cluster regression test for #3043.
The test works but is very slow so far, since it involves resharding
1/5 of all the cluster slots from master 0 to the other 4 masters and
back into the original master.
2016-05-02 18:37:37 +02:00
Oran Agra
5e3880a492 various cleanups and minor fixes 2016-04-25 16:49:57 +03:00
Damian Janowski
0b4bb502a2 Fix ZINCRBY return value. 2016-04-18 00:35:54 -03:00