207 Commits

Author SHA1 Message Date
John Sully
c179c98870 Fix issue where active replication doesn't replicate RDB data
Former-commit-id: 527b7eb0742567302e0343e3acbed9814c0cbb95
2020-11-23 02:01:40 +00:00
John Sully
e8753d1b4b Blocking clients should not crash if an active replica loads a remote RDB with a key in the blocklist
Former-commit-id: 1c525e20b10e0a47af687a0d46faf75229a1cbf5
2020-11-19 23:28:01 +00:00
John Sully
7db922f44b Additional test reliability fixes
Former-commit-id: dad5a902d394719ba722e487879fc283ca148786
2020-10-27 07:25:43 +00:00
John Sully
18c34bbfe2 Active replica test reliability enhancements
Former-commit-id: 444555d3e4ec6e9469dae847dc631f2be263fb5e
2020-10-27 06:23:14 +00:00
John Sully
4f18a247e3 Merge tag '6.0.8' into unstable
Former-commit-id: 4c7e4b91a6bb2034636856b608b8c386d07f5541
2020-09-30 19:47:55 +00:00
John Sully
3c0556093b Test RDB merge on load with active replication
Former-commit-id: 28183f4b66fc4c865048080b61e599eeb1d2293b
2020-09-29 03:26:06 +00:00
John Sully
4eecb1825f Implement replica-quorum config
Former-commit-id: ab4cdd2ed4d3ee9752737d20662178d73d16b4c2
2020-09-16 03:26:31 +00:00
Yossi Gottlieb
3c8b394511 Tests: clean up stale .cli files. (#7768)
(cherry picked from commit 918abd7276afcb994f2d3f8a86a0708993420e37)
2020-09-10 14:09:00 +03:00
Oran Agra
db6c763d8b test infra - wait_done_loading
reduce code duplication in aof.tcl.
move creation of clients into the test so that it can be skipped

(cherry picked from commit 1b7ba44e7917082ac6d5523666d3b4ab210dfbad)
2020-09-10 14:09:00 +03:00
Oran Agra
5b8de5b7f2 test infra - reduce disk space usage
this is important when running a test with --loop

(cherry picked from commit e3e69c25fd05b608f5ea8d612bc0e377922a6115)
2020-09-10 14:09:00 +03:00
Yossi Gottlieb
8d79702d8a Tests: fix redis-cli with remote hosts. (#7693)
(cherry picked from commit f80f3f492a0ca56e163899eeca7ad40d67d903be)
2020-09-01 09:27:58 +03:00
Oran Agra
916b215fc5 fix new rdb test failing on timing issues (#7604)
apparently on github actions sometimes 500ms is not enough

(cherry picked from commit 824bd2ac11472b7a3fce9fcf3189a8e6c6048115)
2020-09-01 09:27:58 +03:00
Oran Agra
67750ce3b3 Fix failing tests due to issues with wait_for_log_message (#7572)
- the test now waits for specific set of log messages rather than wait for
  timeout looking for just one message.
- we don't wanna sample the current length of the log after an action, due
  to a race, we need to start the search from the line number of the last
  message we were waiting for.
- when attempting to trigger a full sync, use multi-exec to avoid a race
  where the replica manages to re-connect before we completed the set of
  actions that should force a full sync.
- fix verify_log_message which was broken and unused

(cherry picked from commit 109b5ccdcd6e6b8cecdaeb13a246bc49ce7a61f4)
2020-09-01 09:27:58 +03:00
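The first two points of commit 67750ce3b3 (wait for a set of messages, and resume the search from the line of the last match) can be sketched as follows. This is a minimal Python illustration, not the Tcl helper from the test suite; the function name `wait_for_log_lines`, its parameters, and its return convention are assumptions made for this example.

```python
import fnmatch
import time


def wait_for_log_lines(path, patterns, from_line, retries=50, delay=0.1):
    """Poll a log file until a line after the first `from_line` lines
    matches any of the glob `patterns`.

    Returns (line_number, pattern), where line_number is 1-based. Passing
    line_number as `from_line` in the next call resumes the search right
    after the match, so the log length is never re-sampled after an action.
    """
    for _ in range(retries):
        with open(path) as f:
            lines = f.read().splitlines()
        # Skip everything that was already inspected by an earlier call.
        for number, line in enumerate(lines[from_line:], start=from_line + 1):
            for pattern in patterns:
                if fnmatch.fnmatchcase(line, pattern):
                    return number, pattern
        time.sleep(delay)
    raise TimeoutError(f"none of {patterns!r} found after line {from_line}")
```

A caller would chain the calls, feeding each returned line number into the next search, instead of counting the log lines again after each step.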
Oran Agra
6daa8b9adb Stabilize bgsave test that sometimes fails with valgrind (#7559)
on ci.redis.io the test fails a lot, reporting that bgsave didn't end.
increasing the timeout we wait for that bgsave to get aborted.
in addition to that, i also verify that it indeed got aborted by
checking that the save counter wasn't reset.

add another test to verify that a successful bgsave indeed resets the
change counter.

(cherry picked from commit 8a57969fd75db01b881d438200911d95bdead293)
2020-09-01 09:27:58 +03:00
Yossi Gottlieb
f1d5d5d28e Tests: drop TCL 8.6 dependency. (#7548)
This re-implements the redis-cli --pipe test so it no longer depends on a close feature available only in TCL 8.6.

Basically what this test does is run redis-cli --pipe, generate a bunch of commands, pipe them through redis-cli, and inspect the result in both Redis and the redis-cli output.

To do that, we need to close stdin for redis-cli to indicate we're done so it can flush its buffers and exit. TCL has bi-directional channels but only offers a way to "one-way close" a channel with TCL 8.6. To work around that, we now generate the commands into a file and feed that file to redis-cli directly.

As we're writing to an actual file, the number of commands is now reduced.

(cherry picked from commit f57e844b2edbb86a5df2f3436045814812c0a3ae)
2020-09-01 09:27:58 +03:00
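The workaround described in commit f1d5d5d28e (write the commands to a file, then give that file to redis-cli as stdin) might look roughly like this. It is a Python sketch rather than the Tcl test itself; `encode_command`, `write_commands`, `run_pipe`, and the `key:N` / `value:N` naming are hypothetical, and `run_pipe` assumes a `redis-cli` binary on the PATH and a reachable server.

```python
import subprocess


def encode_command(*args):
    """Encode one command in the Redis protocol: an array of bulk strings."""
    out = [b"*%d\r\n" % len(args)]
    for arg in args:
        data = str(arg).encode()
        out.append(b"$%d\r\n%s\r\n" % (len(data), data))
    return b"".join(out)


def write_commands(path, count):
    """Write `count` SET commands to a file instead of a live pipe."""
    with open(path, "wb") as f:
        for i in range(count):
            f.write(encode_command("SET", f"key:{i}", f"value:{i}"))


def run_pipe(path, cli="redis-cli"):
    """Feed the file to `redis-cli --pipe` on stdin.

    End-of-file on the regular file tells redis-cli that the input is done,
    so no half-close of a bi-directional channel is needed.
    """
    with open(path, "rb") as f:
        return subprocess.run([cli, "--pipe"], stdin=f,
                              capture_output=True, text=True)
```

The result of `run_pipe` carries the redis-cli output in `stdout`, which a test can inspect alongside the keys now present in the server.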
Oran Agra
05f8975d21 redis-cli tests, fix valgrind timing issue (#7519)
this test when run with valgrind on github actions takes 160 seconds

(cherry picked from commit 254c96255420e950bcad1a46bc4f8617b4373797)
2020-07-20 21:08:26 +03:00
Oran Agra
aea4db2f5a fix recently added time sensitive tests failing with valgrind (#7512)
interestingly the latency monitor test fails because valgrind is slow
enough so that the time inside PEXPIREAT command from the moment of
the first mstime() call to get the basetime until checkAlreadyExpired
calls mstime() again is more than 1ms, and that test was too sensitive.

using this opportunity to speed up the test (unrelated to the failure)
the fix is just the longer time passed to PEXPIRE.

(cherry picked from commit e5227aab899628653285478a9d1083e8e8f51b57)
2020-07-20 21:08:26 +03:00
Yossi Gottlieb
b057ff81ee TLS: Add missing redis-cli options. (#7456)
* Tests: fix and reintroduce redis-cli tests.

These tests have been broken and disabled for 10 years now!

* TLS: add remaining redis-cli support.

This adds support for the redis-cli --pipe, --rdb and --replica options
previously unsupported in --tls mode.

* Fix writeConn().

(cherry picked from commit d9f970d8d3f0b694f1e8915cab6d4eab9cfb2ef1)
2020-07-20 21:08:26 +03:00
Oran Agra
2b5f23197c stabilize tests that look for log lines (#7367)
tests were sensitive to additional log lines appearing in the log
causing the search to come up empty-handed.

instead of just looking for the n last log lines, capture the log lines
before performing the action, and then search from that offset.

(cherry picked from commit 8e76e13472b7d277af78691775c2cf845f68ab90)
2020-07-20 21:08:26 +03:00
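The idea in commit 2b5f23197c, recording the log length before the action and searching only from that offset, can be illustrated with a short Python sketch. The helper names `count_log_lines` and `find_in_log` are invented for this example and are not the names used in the test suite.

```python
def count_log_lines(path):
    """Number of lines currently in the log; call this BEFORE the action."""
    with open(path) as f:
        return sum(1 for _ in f)


def find_in_log(path, needle, from_line):
    """Return the 1-based number of the first line after `from_line`
    that contains `needle`, or None if there is no such line."""
    with open(path) as f:
        for number, line in enumerate(f, start=1):
            if number > from_line and needle in line:
                return number
    return None
```

Because the offset is taken before the action, unrelated lines logged afterwards cannot push the expected message out of the search window, and older occurrences of the same message are ignored.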
Oran Agra
1104113c07 tests/valgrind: don't use debug restart (#7404)
* tests/valgrind: don't use debug restart

DEBUG RESTART causes two issues:
1. it uses execve which replaces the original process and valgrind doesn't
   have a chance to check for errors, so leaks go unreported.
2. valgrind reports invalid calls to close() which we're unable to resolve.

So now the tests use restart_server mechanism in the tests, that terminates
the old server and starts a new one, new PID, but same stdout, stderr.

since the stderr can contain two or more valgrind reports, it is not enough
to just check for the absence of leaks, we also need to check for some known
errors, we do both, and fail if we either find an error, or can't find a
report saying there are no leaks.

other changes:
- when killing a server that was already terminated we check for leaks too.
- adding DEBUG LEAK which was used to test it.
- adding --trace-children to valgrind, although no longer needed.
- since the stdout contains two or more runs, we need slightly different way
  of checking if the new process is up (explicitly looking for the new PID)
- move the code that handles --wait-server to happen earlier (before
  watching the startup message in the log), and serve the restarted server too.

* squashme - CR fixes

(cherry picked from commit 69ade87325eedebdb44760af9a8c28e15381888e)
2020-07-20 21:08:26 +03:00
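The two-sided check described in commit 1104113c07 (fail on a known error, and also fail if no report says there are no leaks) could be sketched like this. This is an assumed Python rendering, not the suite's Tcl code; `check_valgrind_output` is a hypothetical name and the pattern lists are an illustrative subset of valgrind's messages, not the exact set the tests look for.

```python
import re

# Assumption: a small, illustrative subset of valgrind error messages.
ERROR_PATTERNS = [
    r"Invalid read of size",
    r"Invalid write of size",
    r"uninitialised value",
]

# Assumption: either of these lines is taken as "this report has no leaks".
NO_LEAK_PATTERNS = [
    r"definitely lost: 0 bytes",
    r"no leaks are possible",
]


def check_valgrind_output(stderr_text):
    """Return a list of problems found in valgrind's stderr; empty means clean."""
    problems = []
    # Stderr may hold several reports, so scan all of it for known errors.
    for pattern in ERROR_PATTERNS:
        if re.search(pattern, stderr_text):
            problems.append(f"known error found: {pattern}")
    # Absence of errors is not enough: a leak-free report must be present.
    if not any(re.search(p, stderr_text) for p in NO_LEAK_PATTERNS):
        problems.append("no report saying there are no leaks")
    return problems
```

A test harness would fail the run whenever the returned list is non-empty.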
John Sully
d4dd336834 Merge tag '6.0.5' into unstable
Redis 6.0.5


Former-commit-id: b736a95b0d23e4b73daa88c676b76d1d18e8bd17
2020-07-13 00:55:23 +00:00
John Sully
c5f6cb1ba5 Add multi-master-no-forward command to reduce bus traffic with multi-master
Former-commit-id: d99d06b1250a51ea4bc54f678f451acbb7901e33
2020-07-12 19:25:19 +00:00
John Sully
cd08792df7 Fix failure to merge databases on active replica sync, due to bad merge with Redis 6
Former-commit-id: cd9514f4c8624932df2ec60ae3c2244899844aa6
2020-07-12 01:13:22 +00:00
John Sully
2c560f27b8 replication test race
Former-commit-id: e1f3cd6ec3bf2319484de04c3796dcfa75e0479c
2020-06-07 01:14:57 -04:00
Oran Agra
f33de403ed fix pingoff test race
2020-06-06 11:44:21 +02:00
John Sully
9fb7552b63 PSYNC test shouldn't wait forever
Former-commit-id: 130613e16636923296a8d5b2c4bc623e62fef2f5
2020-06-01 16:13:58 -04:00
John Sully
2b08505fed PSYNC test reliability improvements (test only issue)
Former-commit-id: 50fd4fa7e62f3996f15f6a8c4dcd892022f111ec
2020-06-01 16:01:26 -04:00
John Sully
4f7102f46c Fix for issue #187: we need to properly handle the case where a key with a subkey expiry itself expires during load
Former-commit-id: e6a9a6b428b91b6108df24ae6285ea9b582b7b23
2020-06-01 15:33:19 -04:00
John Sully
df5b0f0be5 sendfile has high latency in some scenarios, don't use it
Former-commit-id: 1eb0e3c1c604e71c54423f1d11b8c709c847a516
2020-05-31 23:22:25 -04:00
John Sully
eddc1ad46a Don't start multimaster tests until all nodes are connected
Former-commit-id: 202b97eff76501e736a2f0969607e3297e9703a4
2020-05-31 22:50:30 -04:00
John Sully
2aed24d0a5 active replica tests on slow computers
Former-commit-id: c9920849dd6d6d0f6ecfe0d1002cb0edd7f7bfa9
2020-05-29 01:58:15 -04:00
John Sully
acde7c340e Fix test issue with TLS
Former-commit-id: 81b240f81d1c52fd331c4e0e89659913380229c4
2020-05-29 01:44:52 -04:00
John Sully
cfe9f8f3bc Merge tag '6.0.4' into unstable
Redis 6.0.4.


Former-commit-id: 9c31ac7925edba187e527f506e5e992946bd38a6
2020-05-29 00:57:07 -04:00
antirez
59cd4c9f65 Test: take PSYNC2 test master timeout high during switch.
This will likely avoid false positives due to trailing pings.
2020-05-28 10:56:14 +02:00
Oran Agra
ab2984b1e2 adjust revived meaningful offset tests
these tests create several edge cases that are otherwise uncovered (at
least not consistently) by the test suite, so although they're no longer
testing what they were meant to test, it's still a good idea to keep
them in hope that they'll expose some issue in the future.
2020-05-28 10:09:51 +02:00
Oran Agra
1ff5a222de revive meaningful offset tests
2020-05-28 10:09:51 +02:00
antirez
3f8d113f1b Another meaningful offset test removed.
2020-05-28 10:09:51 +02:00
antirez
d4541349dc Remove the PSYNC2 meaningful offset test.
2020-05-28 10:09:51 +02:00
antirez
8f10137227 Test: PSYNC2 test can now show server logs.
2020-05-28 10:09:51 +02:00
John Sully
2d783a3cbf Merge tag '6.0.2' into unstable
Redis 6.0.2


Former-commit-id: a010e4a4b2cc2bcad1cb14604b7ebc596c35b05e
2020-05-22 16:45:18 -04:00
John Sully
1eeb5de69f Merge commit 'c57d9146f41f4b661d9d2cb48b83b3abc757ba0e' into unstable
Former-commit-id: d74871da40dea11bd1a226fbecb0974ff5f8ec8c
2020-05-22 15:36:44 -04:00
Qu Chen
58fc456cbd Always disconnect chained replicas when the replica performs PSYNC with the master, to avoid replication offset mismatch between master and chained replicas.
2020-05-22 12:37:59 +02:00
Oran Agra
00d8b92b89 fix valgrind test failure in replication test
in b4416280c i added more keys to that test to make it run longer
but in valgrind this now means the test times out, give valgrind more
time.
2020-05-22 12:37:49 +02:00
Oran Agra
5e17e6276c add regression test for the race in #7205
with the original version of 6.0.0, this test detects an excessive full
sync.
with the fix in 1a7cd2c0e, this test detects memory corruption,
especially when using libc allocator with or without valgrind.
2020-05-22 12:37:49 +02:00
antirez
96e7c011e2 Improve the PSYNC2 test reliability.
2020-05-22 12:37:49 +02:00
John Sully
27eb239f1a Fix bad merge in CI.yml
Former-commit-id: 6311d709c39b3bacaeab77b18033010f1b548f81
2020-05-21 22:09:06 -04:00
Oran Agra
9da134cd88 fix redis 6.0 not freeing closed connections during loading.
This bug was introduced by a recent change in which readQueryFromClient
is using freeClientAsync, and despite the fact that now
freeClientsInAsyncFreeQueue is in beforeSleep, that's not enough since
it's not called during loading in processEventsWhileBlocked.
furthermore, afterSleep was called in that case but beforeSleep wasn't.

This bug also caused slowness since the level-triggered mode of epoll
kept signaling these connections as readable causing us to keep doing
connRead again and again for all of these, which keep accumulating.

now both before and after sleep are called, but not all of their actions
are performed during loading, some are only reserved for the main loop.

fixes issue #7215
2020-05-14 11:29:43 +02:00
Oran Agra
5c41802d55 fix unstable replication test
this test which has coverage for various flows of diskless master was
failing randomly from time to time.

the failure was:
[err]: diskless all replicas drop during rdb pipe in tests/integration/replication.tcl
log message of '*Diskless rdb transfer, last replica dropped, killing fork child*' not found

what seemed to have happened is that the master didn't detect that all
replicas dropped by the time the replication ended, it thought that one
replica is still connected.

now the test takes a few seconds longer but it seems stable.
2020-05-14 11:29:43 +02:00
John
181fadb708 more reliability fixes for multimaster
Former-commit-id: 3543a3c763de91a4d76bca89659fec9bf6b7a1c8
2020-05-11 05:38:21 -04:00
John
3d6f990104 Make multimaster tests more reliable
Former-commit-id: 3122912920973cb433d625a09b183c3f538e2523
2020-05-11 05:23:47 -04:00