313 Commits

Author SHA1 Message Date
Oran Agra
69ade87325
tests/valgrind: don't use debug restart (#7404)
* tests/valgrind: don't use debug restart

DEBUG REATART causes two issues:
1. it uses execve which replaces the original process and valgrind doesn't
   have a chance to check for errors, so leaks go unreported.
2. valgrind report invalid calls to close() which we're unable to resolve.

So now the tests use restart_server mechanism in the tests, that terminates
the old server and starts a new one, new PID, but same stdout, stderr.

since the stderr can contain two or more valgrind report, it is not enough
to just check for the absence of leaks, we also need to check for some known
errors, we do both, and fail if we either find an error, or can't find a
report saying there are no leaks.

other changes:
- when killing a server that was already terminated we check for leaks too.
- adding DEBUG LEAK which was used to test it.
- adding --trace-children to valgrind, although no longer needed.
- since the stdout contains two or more runs, we need slightly different way
  of checking if the new process is up (explicitly looking for the new PID)
- move the code that handles --wait-server to happen earlier (before
  watching the startup message in the log), and serve the restarted server too.

* squashme - CR fixes
2020-07-10 08:26:52 +03:00
John Sully
067382bb8c Merge branch 'unstable' into keydbpro
Former-commit-id: 461eea07260a31cd75753d5b7be691f5793a6f1b
2020-06-07 16:41:21 -04:00
John Sully
2c560f27b8 replication test race
Former-commit-id: e1f3cd6ec3bf2319484de04c3796dcfa75e0479c
2020-06-07 01:14:57 -04:00
Oran Agra
f33de403ed fix pingoff test race 2020-06-06 11:44:21 +02:00
John Sully
44c1e6d5e7 Merge branch 'unstable' into keydbpro
Former-commit-id: 08a36155e3db9918048e87c3d691b7317787c9ab
2020-06-01 17:41:37 -04:00
John Sully
9fb7552b63 PSYNC test shouldn't wait forever
Former-commit-id: 130613e16636923296a8d5b2c4bc623e62fef2f5
2020-06-01 16:13:58 -04:00
John Sully
2b08505fed PSYNC test reliability improvements (test only issue)
Former-commit-id: 50fd4fa7e62f3996f15f6a8c4dcd892022f111ec
2020-06-01 16:01:26 -04:00
John Sully
4f7102f46c Fix for issue #187 we need to properly handle the case where a key with a subkey expirey itself expires during load
Former-commit-id: e6a9a6b428b91b6108df24ae6285ea9b582b7b23
2020-06-01 15:33:19 -04:00
John Sully
df5b0f0be5 sendfile has high latency in some scenarios, don't use it
Former-commit-id: 1eb0e3c1c604e71c54423f1d11b8c709c847a516
2020-05-31 23:22:25 -04:00
John Sully
eddc1ad46a Don't start multimaster tests until all nodes are connected
Former-commit-id: 202b97eff76501e736a2f0969607e3297e9703a4
2020-05-31 22:50:30 -04:00
Oran Agra
c480af9007 fix pingoff test race 2020-05-31 15:51:52 +03:00
John Sully
2aed24d0a5 active replica tests on slow computers
Former-commit-id: c9920849dd6d6d0f6ecfe0d1002cb0edd7f7bfa9
2020-05-29 01:58:15 -04:00
John Sully
acde7c340e Fix test issue with TLS
Former-commit-id: 81b240f81d1c52fd331c4e0e89659913380229c4
2020-05-29 01:44:52 -04:00
John Sully
cfe9f8f3bc Merge tag '6.0.4' into unstable
Redis 6.0.4.


Former-commit-id: 9c31ac7925edba187e527f506e5e992946bd38a6
2020-05-29 00:57:07 -04:00
antirez
59cd4c9f65 Test: take PSYNC2 test master timeout high during switch.
This will likely avoid false positives due to trailing pings.
2020-05-28 10:56:14 +02:00
antirez
23f2b4d0a8 Test: take PSYNC2 test master timeout high during switch.
This will likely avoid false positives due to trailing pings.
2020-05-28 10:47:30 +02:00
Oran Agra
ab2984b1e2 adjust revived meaningful offset tests
these tests create several edge cases that are otherwise uncovered (at
least not consistently) by the test suite, so although they're no longer
testing what they were meant to test, it's still a good idea to keep
them in hope that they'll expose some issue in the future.
2020-05-28 10:09:51 +02:00
Oran Agra
1ff5a222de revive meaningful offset tests 2020-05-28 10:09:51 +02:00
antirez
3f8d113f1b Another meaningful offset test removed. 2020-05-28 10:09:51 +02:00
antirez
d4541349dc Remove the PSYNC2 meaningful offset test. 2020-05-28 10:09:51 +02:00
antirez
8f10137227 Test: PSYNC2 test can now show server logs. 2020-05-28 10:09:51 +02:00
Oran Agra
2a8af8e675 adjust revived meaningful offset tests
these tests create several edge cases that are otherwise uncovered (at
least not consistently) by the test suite, so although they're no longer
testing what they were meant to test, it's still a good idea to keep
them in hope that they'll expose some issue in the future.
2020-05-28 09:10:51 +03:00
Oran Agra
90f3856fd5 revive meaningful offset tests 2020-05-28 08:21:24 +03:00
antirez
484cfc3d76 Another meaningful offset test removed. 2020-05-27 12:50:02 +02:00
antirez
32d0df0c1f Remove the PSYNC2 meaningful offset test. 2020-05-27 12:47:34 +02:00
antirez
091fb64681 Test: PSYNC2 test can now show server logs. 2020-05-25 20:26:29 +02:00
John Sully
cece963cf3 Merge branch 'unstable' into keydbpro
Former-commit-id: a830cf85df236885558c5571c0bf23cfb23e3655
2020-05-24 14:41:53 -04:00
John Sully
2d783a3cbf Merge tag '6.0.2' into unstable
Redis 6.0.2


Former-commit-id: a010e4a4b2cc2bcad1cb14604b7ebc596c35b05e
2020-05-22 16:45:18 -04:00
John Sully
1eeb5de69f Merge commit 'c57d9146f41f4b661d9d2cb48b83b3abc757ba0e' into unstable
Former-commit-id: d74871da40dea11bd1a226fbecb0974ff5f8ec8c
2020-05-22 15:36:44 -04:00
Qu Chen
58fc456cbd Disconnect chained replicas when the replica performs PSYNC with the master always to avoid replication offset mismatch between master and chained replicas. 2020-05-22 12:37:59 +02:00
Oran Agra
00d8b92b89 fix valgrind test failure in replication test
in b4416280c i added more keys to that test to make it run longer
but in valgrind this now means the test times out, give valgrind more
time.
2020-05-22 12:37:49 +02:00
Oran Agra
5e17e6276c add regression test for the race in #7205
with the original version of 6.0.0, this test detects an excessive full
sync.
with the fix in 1a7cd2c0e, this test detects memory corruption,
especially when using libc allocator with or without valgrind.
2020-05-22 12:37:49 +02:00
antirez
96e7c011e2 Improve the PSYNC2 test reliability. 2020-05-22 12:37:49 +02:00
John Sully
27eb239f1a Fix bad merge in CI.yml
Former-commit-id: 6311d709c39b3bacaeab77b18033010f1b548f81
2020-05-21 22:09:06 -04:00
Qu Chen
42f5da5d2d Disconnect chained replicas when the replica performs PSYNC with the master always to avoid replication offset mismatch between master and chained replicas. 2020-05-21 18:42:10 -07:00
Oran Agra
75c11d7fec fix valgrind test failure in replication test
in b4416280c i added more keys to that test to make it run longer
but in valgrind this now means the test times out, give valgrind more
time.
2020-05-18 10:26:53 +03:00
antirez
0ca2f4f824 Merge branch 'unstable' of github.com:/antirez/redis into unstable 2020-05-17 18:24:48 +02:00
antirez
96bb0c9471 Improve the PSYNC2 test reliability. 2020-05-17 18:24:34 +02:00
Oran Agra
357aace895 add regression test for the race in #7205
with the original version of 6.0.0, this test detects an excessive full
sync.
with the fix in 1a7cd2c0e, this test detects memory corruption,
especially when using libc allocator with or without valgrind.
2020-05-17 18:26:02 +03:00
Oran Agra
9da134cd88 fix redis 6.0 not freeing closed connections during loading.
This bug was introduced by a recent change in which readQueryFromClient
is using freeClientAsync, and despite the fact that now
freeClientsInAsyncFreeQueue is in beforeSleep, that's not enough since
it's not called during loading in processEventsWhileBlocked.
furthermore, afterSleep was called in that case but beforeSleep wasn't.

This bug also caused slowness sine the level-triggered mode of epoll
kept signaling these connections as readable causing us to keep doing
connRead again and again for ll of these, which keep accumulating.

now both before and after sleep are called, but not all of their actions
are performed during loading, some are only reserved for the main loop.

fixes issue #7215
2020-05-14 11:29:43 +02:00
Oran Agra
5c41802d55 fix unstable replication test
this test which has coverage for varoius flows of diskless master was
failing randomly from time to time.

the failure was:
[err]: diskless all replicas drop during rdb pipe in tests/integration/replication.tcl
log message of '*Diskless rdb transfer, last replica dropped, killing fork child*' not found

what seemed to have happened is that the master didn't detect that all
replicas dropped by the time the replication ended, it thought that one
replica is still connected.

now the test takes a few seconds longer but it seems stable.
2020-05-14 11:29:43 +02:00
antirez
c38fd1f661 Merge branch 'free_clients_during_loading' into unstable 2020-05-14 11:28:08 +02:00
Oran Agra
b4416280cf fix unstable replication test
this test which has coverage for varoius flows of diskless master was
failing randomly from time to time.

the failure was:
[err]: diskless all replicas drop during rdb pipe in tests/integration/replication.tcl
log message of '*Diskless rdb transfer, last replica dropped, killing fork child*' not found

what seemed to have happened is that the master didn't detect that all
replicas dropped by the time the replication ended, it thought that one
replica is still connected.

now the test takes a few seconds longer but it seems stable.
2020-05-12 08:59:09 +03:00
John
181fadb708 more reliability fixes for multimaster
Former-commit-id: 3543a3c763de91a4d76bca89659fec9bf6b7a1c8
2020-05-11 05:38:21 -04:00
John
daaf82b673 more reliability fixes for multimaster
Former-commit-id: fd5b541260908423c35227ff9e42a83f96ace6c0
2020-05-11 09:37:42 +00:00
John
3d6f990104 Make multimaster tests more reliable
Former-commit-id: 3122912920973cb433d625a09b183c3f538e2523
2020-05-11 05:23:47 -04:00
John
0e024808c2 Make multimaster tests more reliable
Former-commit-id: 4fe59ba11b720864ea0124885b358cb72127cc2d
2020-05-11 09:22:27 +00:00
Oran Agra
905e28ee87 fix redis 6.0 not freeing closed connections during loading.
This bug was introduced by a recent change in which readQueryFromClient
is using freeClientAsync, and despite the fact that now
freeClientsInAsyncFreeQueue is in beforeSleep, that's not enough since
it's not called during loading in processEventsWhileBlocked.
furthermore, afterSleep was called in that case but beforeSleep wasn't.

This bug also caused slowness sine the level-triggered mode of epoll
kept signaling these connections as readable causing us to keep doing
connRead again and again for ll of these, which keep accumulating.

now both before and after sleep are called, but not all of their actions
are performed during loading, some are only reserved for the main loop.

fixes issue #7215
2020-05-11 11:33:46 +03:00
Oran Agra
3d3861dd88 add daily github actions with libc malloc and valgrind
* fix memlry leaks with diskless replica short read.
* fix a few timing issues with valgrind runs
* fix issue with valgrind and watchdog schedule signal

about the valgrind WD issue:
the stack trace test in logging.tcl, has issues with valgrind:
==28808== Can't extend stack to 0x1ffeffdb38 during signal delivery for thread 1:
==28808==   too small or bad protection modes

it seems to be some valgrind bug with SA_ONSTACK.
SA_ONSTACK seems unneeded since WD is not recursive (SA_NODEFER was removed),
also, not sure if it's even valid without a call to sigaltstack()
2020-05-08 10:37:35 +02:00
Oran Agra
deee2c1ef2 add daily github actions with libc malloc and valgrind
* fix memlry leaks with diskless replica short read.
* fix a few timing issues with valgrind runs
* fix issue with valgrind and watchdog schedule signal

about the valgrind WD issue:
the stack trace test in logging.tcl, has issues with valgrind:
==28808== Can't extend stack to 0x1ffeffdb38 during signal delivery for thread 1:
==28808==   too small or bad protection modes

it seems to be some valgrind bug with SA_ONSTACK.
SA_ONSTACK seems unneeded since WD is not recursive (SA_NODEFER was removed),
also, not sure if it's even valid without a call to sigaltstack()
2020-05-04 09:52:20 +03:00