27610 Commits

Author SHA1 Message Date
Oran Agra
8ea131fc80
Fix leak in new blockedclient module API test (#7780) 2020-09-10 10:22:16 +03:00
Oran Agra
95d7674cad Fix RESP3 response for HKEYS/HVALS on non-existing key (#7781) 2020-09-10 10:09:13 +03:00
Oran Agra
de8d320230
Fix RESP3 response for HKEYS/HVALS on non-existing key (#7781) 2020-09-10 10:09:13 +03:00
Yossi Gottlieb
1abc94155a Tests: fix oom-score-adj false positives. (#7772)
The key save delay is too short and on certain systems the child process
is gone before we have a chance to inspect it.
2020-09-09 18:58:06 +03:00
Yossi Gottlieb
b2a73c404b
Tests: fix oom-score-adj false positives. (#7772)
The key save delay is too short and on certain systems the child process
is gone before we have a chance to inspect it.
2020-09-09 18:58:06 +03:00
杨博东
ce14668316 Tests: Add aclfile load and save tests (#7765)
improves test coverage
2020-09-09 17:13:35 +03:00
杨博东
0666267d27
Tests: Add aclfile load and save tests (#7765)
improves test coverage
2020-09-09 17:13:35 +03:00
Roi Lipman
b1de173ec0 RM_ThreadSafeContextTryLock a non-blocking method for acquiring GIL (#7738)
Co-authored-by: Yossi Gottlieb <yossigo@gmail.com>
Co-authored-by: Oran Agra <oran@redislabs.com>
2020-09-09 16:01:16 +03:00
Roi Lipman
042189fd87
RM_ThreadSafeContextTryLock a non-blocking method for acquiring GIL (#7738)
Co-authored-by: Yossi Gottlieb <yossigo@gmail.com>
Co-authored-by: Oran Agra <oran@redislabs.com>
2020-09-09 16:01:16 +03:00
Yossi Gottlieb
995f1fc53f Tests: validate CONFIG REWRITE for all params. (#7764)
This is a catch-all test to confirm that rewrite produces a valid
output for all parameters and that this process does not introduce
undesired configuration changes.
2020-09-09 15:43:11 +03:00
Yossi Gottlieb
a8b7268911
Tests: validate CONFIG REWRITE for all params. (#7764)
This is a catch-all test to confirm that rewrite produces a valid
output for all parameters and that this process does not introduce
undesired configuration changes.
2020-09-09 15:43:11 +03:00
Oran Agra
73e0cd5a7d Change THP warning to use madvise rather than never (#7771)
completes 60097d361d4096d3826c7580acffd4053f8a4835
2020-09-09 15:39:57 +03:00
Oran Agra
1461f02deb
Change THP warning to use madvise rather than never (#7771)
completes b2419c31c166bd2d73f7af3d089859795c0e3506
2020-09-09 15:39:57 +03:00
天河
df70120c90 Fix comments of _quicklistSplitNode function. (#4341)
Comments about the behavior of the function were wrong (off by one)
Co-authored-by: Oran Agra <oran@redislabs.com>
2020-09-09 15:28:38 +03:00
天河
63730d9dd0
Fix comments of _quicklistSplitNode function. (#4341)
Comments about the behavior of the function were wrong (off by one)
Co-authored-by: Oran Agra <oran@redislabs.com>
2020-09-09 15:28:38 +03:00
Yossi Gottlieb
74bac9610e Fix default/explicit "save" parameter loading. (#7767)
Save parameters should either be the defaults or whatever is specified in the
config file. This fixes an issue introduced in #7092 which causes
configuration file settings to be applied on top of the defaults.
2020-09-09 15:12:57 +03:00
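The rule stated in this commit message (explicit `save` lines replace the defaults rather than being added to them) can be sketched as follows. This is a minimal illustration only, not the Redis implementation: `load_save_params` and `DEFAULT_SAVE` are hypothetical names, and the default values are passed in by the caller rather than taken from any particular Redis version.

```python
# Hypothetical sketch: explicit "save" lines replace the defaults entirely.
DEFAULT_SAVE = [(900, 1), (300, 10), (60, 10000)]  # illustrative values only


def load_save_params(config_lines, defaults=DEFAULT_SAVE):
    """Return the effective list of (seconds, changes) save points."""
    explicit = []
    seen_save = False
    for line in config_lines:
        parts = line.split()
        if not parts or parts[0].lower() != "save":
            continue
        seen_save = True
        if len(parts) == 2 and parts[1] == '""':
            # An empty argument clears every save point seen so far.
            explicit.clear()
        elif len(parts) == 3:
            explicit.append((int(parts[1]), int(parts[2])))
    # Defaults apply only when the config file never mentions "save".
    return explicit if seen_save else list(defaults)
```

With this shape, a config containing a single `save 60 1` line yields exactly one save point instead of that point stacked on top of the defaults.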
Yossi Gottlieb
818a746e32
Fix default/explicit "save" parameter loading. (#7767)
Save parameters should either be the defaults or whatever is specified in the
config file. This fixes an issue introduced in #7092 which causes
configuration file settings to be applied on top of the defaults.
2020-09-09 15:12:57 +03:00
Itamar Haber
c13fa0aa36 Documents RM_Call's fmt (#5448)
Improve RM_Call inline documentation about the fmt argument
so that we don't completely depend on the web docs.

Co-authored-by: Oran Agra <oran@redislabs.com>
2020-09-09 15:09:41 +03:00
Itamar Haber
ce15620dc1
Documents RM_Call's fmt (#5448)
Improve RM_Call inline documentation about the fmt argument
so that we don't completely depend on the web docs.

Co-authored-by: Oran Agra <oran@redislabs.com>
2020-09-09 15:09:41 +03:00
Jan-Erik Rediger
60097d361d Check that THP is not set to always (madvise is ok) (#4001)
THP can also be set to madvise, in which case it shouldn't cause
problems for Redis since redis (or the allocator) doesn't use madvise
to activate it.
2020-09-09 15:06:04 +03:00
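The check described in this commit message can be sketched as below. On Linux the THP setting is usually exposed in `/sys/kernel/mm/transparent_hugepage/enabled` as a line such as `always [madvise] never`, with the active mode in brackets. The helper names here are hypothetical, and the functions take the file contents as a string so the sketch does not depend on the host system.

```python
def thp_selected_mode(sysfs_text):
    """Return the bracketed mode from text like 'always [madvise] never'."""
    for token in sysfs_text.split():
        if token.startswith("[") and token.endswith("]"):
            return token[1:-1]
    return None  # no bracketed mode found


def thp_needs_warning(sysfs_text):
    """Warn only when THP is set to 'always'; 'madvise' and 'never' are fine."""
    return thp_selected_mode(sysfs_text) == "always"
```

Testing for the bracketed `always` token, rather than for the absence of `[never]`, is what lets `madvise` pass without a warning.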
Jan-Erik Rediger
b2419c31c1
Check that THP is not set to always (madvise is ok) (#4001)
THP can also be set to madvise, in which case it shouldn't cause
problems for Redis since redis (or the allocator) doesn't use madvise
to activate it.
2020-09-09 15:06:04 +03:00
Yossi Gottlieb
e5b1ad413b Tests: clean up stale .cli files. (#7768) 2020-09-09 12:30:43 +03:00
Yossi Gottlieb
918abd7276
Tests: clean up stale .cli files. (#7768) 2020-09-09 12:30:43 +03:00
Eran Liberty
7bee51bb5b Allow exec with read commands on readonly replica in cluster (#7766)
There was a bug. Although cluster replicas would allow read commands,
they would not allow a MULTI-EXEC that's composed solely of read commands.
Adds tests for coverage.

Co-authored-by: Oran Agra <oran@redislabs.com>
Co-authored-by: Eran Liberty <eranl@amazon.com>
2020-09-09 09:35:42 +03:00
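The behavior this commit describes can be sketched as a predicate over the queued transaction. This is only an illustration under stated assumptions: `READ_COMMANDS` is a small hand-picked subset, and `replica_allows_exec` is a hypothetical helper, not a function in the Redis source.

```python
# Illustrative subset of read-only commands; the real server uses command flags.
READ_COMMANDS = {"GET", "MGET", "EXISTS", "TTL", "HGET", "LRANGE"}


def replica_allows_exec(queued_commands, readonly_mode):
    """Decide whether a readonly cluster replica may run a queued MULTI block.

    queued_commands: list of argument lists, e.g. [["GET", "k"], ["TTL", "k"]]
    readonly_mode:   True once the client has opted in to reading from a replica
    """
    if not readonly_mode:
        return False
    # EXEC is acceptable only if every queued command is a read command.
    return all(cmd[0].upper() in READ_COMMANDS for cmd in queued_commands)
```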
Eran Liberty
b120366d48
Allow exec with read commands on readonly replica in cluster (#7766)
There was a bug. Although cluster replicas would allow read commands,
they would not allow a MULTI-EXEC that's composed solely of read commands.
Adds tests for coverage.

Co-authored-by: Oran Agra <oran@redislabs.com>
Co-authored-by: Eran Liberty <eranl@amazon.com>
2020-09-09 09:35:42 +03:00
Oran Agra
f10ef2eb77 Tests: Some fixes for macOS (#7757)
* Tests: Some fixes for macOS

1) cur_test: when restart_server, "no such variable" error occurs
  ./runtest --single integration/rdb
  test {client freed during loading}
      SET ::cur_test
      restart_server
        kill_server
          test "Check for memory leaks (pid $pid)"
          SET ::cur_test
          UNSET ::cur_test
      UNSET ::cur_test // This global variable has been unset.

2) `ps --ppid` not available on macOS platform, can be replaced with
`pgrep -P pid`.

* handle cur_test for nested tests

if there are nested tests and nested servers, we need to restore the
previous value of cur_test when a test exits.

example:
```
test{test 1} {
	start_server {
		test{test 1.1 - master only} {
		}
		start_server {
		    test{test 1.2 - with replication} {
            }
		}
	}
}
```
when `test 1.1 - master only` exits, we're still inside `test 1`

Co-authored-by: Oran Agra <oran@redislabs.com>
2020-09-08 16:28:58 +03:00
Oran Agra
340963b2d7
Tests: Some fixes for macOS (#7757)
* Tests: Some fixes for macOS

1) cur_test: when restart_server, "no such variable" error occurs
  ./runtest --single integration/rdb
  test {client freed during loading}
      SET ::cur_test
      restart_server
        kill_server
          test "Check for memory leaks (pid $pid)"
          SET ::cur_test
          UNSET ::cur_test
      UNSET ::cur_test // This global variable has been unset.

2) `ps --ppid` not available on macOS platform, can be replaced with
`pgrep -P pid`.

* handle cur_test for nested tests

if there are nested tests and nested servers, we need to restore the
previous value of cur_test when a test exits.

example:
```
test{test 1} {
	start_server {
		test{test 1.1 - master only} {
		}
		start_server {
		    test{test 1.2 - with replication} {
            }
		}
	}
}
```
when `test 1.1 - master only` exits, we're still inside `test 1`

Co-authored-by: Oran Agra <oran@redislabs.com>
2020-09-08 16:28:58 +03:00
Yossi Gottlieb
b3782098ae Fix CONFIG REWRITE of oom-score-adj-values. (#7761) 2020-09-08 16:00:20 +03:00
Yossi Gottlieb
750acf3a45
Fix CONFIG REWRITE of oom-score-adj-values. (#7761) 2020-09-08 16:00:20 +03:00
Oran Agra
610b4ff16a handle cur_test for nested tests
if there are nested tests and nested servers, we need to restore the
previous value of cur_test when a test exits.

example:
```
test{test 1} {
	start_server {
		test{test 1.1 - master only} {
		}
		start_server {
		    test{test 1.2 - with replication} {
            }
		}
	}
}
```
when `test 1.1 - master only` exits, we're still inside `test 1`
2020-09-08 14:12:03 +03:00
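The save-and-restore idea in this commit message can be sketched outside Tcl as well. The following is a minimal Python analogue, assuming a module-level `cur_test` variable standing in for the Tcl global `::cur_test`; `running_test` and `demo` are hypothetical names used only for illustration.

```python
from contextlib import contextmanager

cur_test = None  # stand-in for the Tcl global ::cur_test


@contextmanager
def running_test(name):
    """Set cur_test for the duration of a test, then restore the outer value."""
    global cur_test
    previous = cur_test      # remember the enclosing test, if any
    cur_test = name
    try:
        yield
    finally:
        cur_test = previous  # restore instead of unsetting


def demo():
    """Record cur_test inside a nested test, after it, and after the outer test."""
    seen = []
    with running_test("test 1"):
        with running_test("test 1.1 - master only"):
            seen.append(cur_test)
        seen.append(cur_test)  # still inside "test 1"
    seen.append(cur_test)
    return seen
```

Restoring the previous value, rather than clearing the variable, is what keeps the outer test's name available after a nested test finishes.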
Oran Agra
0a1e734193
handle cur_test for nested tests
if there are nested tests and nested servers, we need to restore the
previous value of cur_test when a test exits.

example:
```
test{test 1} {
	start_server {
		test{test 1.1 - master only} {
		}
		start_server {
		    test{test 1.2 - with replication} {
            }
		}
	}
}
```
when `test 1.1 - master only` exits, we're still inside `test 1`
2020-09-08 14:12:03 +03:00
Oran Agra
1701f23b1f Add daily CI for MacOS (#7759) 2020-09-08 10:59:25 +03:00
Oran Agra
5496b4a7cd
Add daily CI for MacOS (#7759) 2020-09-08 10:59:25 +03:00
bodong.ybd
e90385e223 Tests: Some fixes for macOS
1) cur_test: when restart_server, "no such variable" error occurs
  ./runtest --single integration/rdb
  test {client freed during loading}
      SET ::cur_test
      restart_server
        kill_server
          test "Check for memory leaks (pid $pid)"
          SET ::cur_test
          UNSET ::cur_test
      UNSET ::cur_test // This global variable has been unset.

2) `ps --ppid` not available on macOS platform, can be replaced with
`pgrep -P pid`.
2020-09-08 14:27:53 +08:00
bodong.ybd
f22fa9594d Tests: Some fixes for macOS
1) cur_test: when restart_server, "no such variable" error occurs
  ./runtest --single integration/rdb
  test {client freed during loading}
      SET ::cur_test
      restart_server
        kill_server
          test "Check for memory leaks (pid $pid)"
          SET ::cur_test
          UNSET ::cur_test
      UNSET ::cur_test // This global variable has been unset.

2) `ps --ppid` not available on macOS platform, can be replaced with
`pgrep -P pid`.
2020-09-08 14:27:53 +08:00
Oran Agra
541d2709a0 Fix cluster consistency-check test (#7754)
This test was failing from time to time (see discussion at the bottom of #7635).
This was probably due to timing, the DEBUG SLEEP executed by redis-cli
didn't sleep for enough time.

This commit changes:
1) use SET-ACTIVE-EXPIRE instead of DEBUG SLEEP
2) replace many `after` sleeps with retry loops to speed up the test.
3) add many comments explaining the different steps of the test and
   its purpose.
4) config appendonly before populating the volatile keys, so that they'll
   be part of the AOF command stream rather than the preamble RDB portion.

other complications: recently kill_instance switched from SIGKILL to
SIGTERM, and this would sometimes fail since there was an AOFRW running
in the background. now we wait for it to end before attempting the kill.
2020-09-07 18:06:25 +03:00
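The second change listed in this commit (polling in a retry loop instead of sleeping for a fixed time) follows a common pattern that can be sketched as below. This is a generic illustration, not the test suite's Tcl code; the function name and default parameters are assumptions.

```python
import time


def wait_for_condition(predicate, max_tries=50, delay=0.01):
    """Poll predicate until it returns True or the tries run out.

    Returns True as soon as the condition holds, so a fast system does not
    pay for a worst-case fixed sleep; returns False on timeout.
    """
    for _ in range(max_tries):
        if predicate():
            return True
        time.sleep(delay)
    return False
```

A caller would typically fail the test when the function returns `False`, which bounds the total wait at roughly `max_tries * delay` seconds.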
Oran Agra
b491d477c3
Fix cluster consistency-check test (#7754)
This test was failing from time to time (see discussion at the bottom of #7635).
This was probably due to timing, the DEBUG SLEEP executed by redis-cli
didn't sleep for enough time.

This commit changes:
1) use SET-ACTIVE-EXPIRE instead of DEBUG SLEEP
2) replace many `after` sleeps with retry loops to speed up the test.
3) add many comments explaining the different steps of the test and
   its purpose.
4) config appendonly before populating the volatile keys, so that they'll
   be part of the AOF command stream rather than the preamble RDB portion.

other complications: recently kill_instance switched from SIGKILL to
SIGTERM, and this would sometimes fail since there was an AOFRW running
in the background. now we wait for it to end before attempting the kill.
2020-09-07 18:06:25 +03:00
Yossi Gottlieb
871e85b8a7 Tests: fix unmonitored servers. (#7756)
There is an inherent race condition in port allocation for spawned
servers. If a server fails to start because a port is taken, a new port
is allocated. This fixes a problem where the logs are not truncated and
as a result a large number of unmonitored servers are started.
2020-09-07 17:30:36 +03:00
Yossi Gottlieb
2df4cb93ac
Tests: fix unmonitored servers. (#7756)
There is an inherent race condition in port allocation for spawned
servers. If a server fails to start because a port is taken, a new port
is allocated. This fixes a problem where the logs are not truncated and
as a result a large number of unmonitored servers are started.
2020-09-07 17:30:36 +03:00
Oran Agra
470de9a516 fix broken cluster/sentinel tests by recent commit (#7752)
da723a917 added a file for stderr to keep valgrind log but i forgot to
add a similar thing when valgrind isn't being used.
the result is that `glob */err.txt` fails.
2020-09-07 16:26:11 +03:00
Oran Agra
42ba7a1b75
fix broken cluster/sentinel tests by recent commit (#7752)
2b998de46 added a file for stderr to keep valgrind log but i forgot to
add a similar thing when valgrind isn't being used.
the result is that `glob */err.txt` fails.
2020-09-07 16:26:11 +03:00
John Sully
ac42f938e8 Fix whitespace
Former-commit-id: d47aeb1fc8a6804a44035253ad87478b817605cf
2020-09-07 03:35:46 +00:00
John Sully
855753ebb3 Fix whitespace
Former-commit-id: d47aeb1fc8a6804a44035253ad87478b817605cf
2020-09-07 03:35:46 +00:00
John Sully
1c1b114555 Dramatically improve perf by blocking commands
Former-commit-id: e47584b286c41cf0783fe014ac8b6ec187564ade
2020-09-07 00:49:53 +00:00
John Sully
b6d8a5938d Dramatically improve perf by blocking commands
Former-commit-id: e47584b286c41cf0783fe014ac8b6ec187564ade
2020-09-07 00:49:53 +00:00
Oran Agra
725616534e if diskless repl child is killed, make sure to reap the pid (#7742)
Starting redis 6.0 and the changes we made to the diskless master to be
suitable for TLS, I made the master avoid reaping (wait3) the pid of the
child until we know all replicas are done reading their rdb.

I did that in order to avoid a state where the rdb_child_pid is -1 but
we don't yet want to start another fork (still busy serving that data to
replicas).

It turns out that the solution used so far was problematic in case the
fork child was being killed (e.g. by the kernel OOM killer), in that
case there's a chance that we currently disabled the read event on the
rdb pipe, since we're waiting for a replica to become writable again.
In that scenario the master would never have realized the child
exited, and the replica would remain hung too.
Note that there's no mechanism to detect a hung replica while it's in
rdb transfer state.

The solution here is to add another pipe which is used by the parent to
tell the child it is safe to exit. This means that when the child exits,
for whatever reason, it is safe to reap it.

Besides that, i'm re-introducing an adjustment to REPLCONF ACK which was
part of #6271 (Accelerate diskless master connections) but was dropped
when that PR was rebased after the TLS fork/pipe changes (6fd5ff8).
Now that RdbPipeCleanup no longer calls checkChildrenDone, and the ACK
has a chance to detect that the child exited, it should be the one to call
it so that we don't have to wait for cron (server.hz) to do that.
2020-09-06 16:43:57 +03:00
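The "safe to exit" pipe described in this commit message can be sketched with plain POSIX primitives. This is a simplified illustration only, not the Redis code: `run_child_with_exit_pipe` is a hypothetical name, the payload stands in for the RDB stream, and the sketch assumes a POSIX system where `os.fork` is available.

```python
import os


def run_child_with_exit_pipe(payload):
    """Fork a child that streams payload, then waits for permission to exit."""
    data_r, data_w = os.pipe()  # child -> parent: the data being served
    exit_r, exit_w = os.pipe()  # parent -> child: "safe to exit" signal
    pid = os.fork()
    if pid == 0:
        # Child: write the payload, then block until the parent allows exit.
        os.close(data_r)
        os.close(exit_w)
        try:
            os.write(data_w, payload)
            os.close(data_w)
            os.read(exit_r, 1)  # returns on a byte or when the parent closes
        finally:
            os._exit(0)
    # Parent: drain the data pipe until EOF.
    os.close(data_w)
    os.close(exit_r)
    chunks = []
    while True:
        chunk = os.read(data_r, 4096)
        if not chunk:
            break
        chunks.append(chunk)
    os.close(data_r)
    # Only now tell the child it may exit, then reap it.
    os.write(exit_w, b"x")
    os.close(exit_w)
    _, status = os.waitpid(pid, 0)
    return b"".join(chunks), os.WEXITSTATUS(status)
```

Because the child never exits on its own before the signal, any exit the parent observes earlier means the child died abnormally, so reaping it immediately is always safe.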
Oran Agra
573246f73c
if diskless repl child is killed, make sure to reap the pid (#7742)
Starting redis 6.0 and the changes we made to the diskless master to be
suitable for TLS, I made the master avoid reaping (wait3) the pid of the
child until we know all replicas are done reading their rdb.

I did that in order to avoid a state where the rdb_child_pid is -1 but
we don't yet want to start another fork (still busy serving that data to
replicas).

It turns out that the solution used so far was problematic in case the
fork child was being killed (e.g. by the kernel OOM killer), in that
case there's a chance that we currently disabled the read event on the
rdb pipe, since we're waiting for a replica to become writable again.
In that scenario the master would never have realized the child
exited, and the replica would remain hung too.
Note that there's no mechanism to detect a hung replica while it's in
rdb transfer state.

The solution here is to add another pipe which is used by the parent to
tell the child it is safe to exit. This means that when the child exits,
for whatever reason, it is safe to reap it.

Besides that, i'm re-introducing an adjustment to REPLCONF ACK which was
part of #6271 (Accelerate diskless master connections) but was dropped
when that PR was rebased after the TLS fork/pipe changes (5a47794).
Now that RdbPipeCleanup no longer calls checkChildrenDone, and the ACK
has a chance to detect that the child exited, it should be the one to call
it so that we don't have to wait for cron (server.hz) to do that.
2020-09-06 16:43:57 +03:00
Oran Agra
da723a917d Improve valgrind support for cluster tests (#7725)
- redirect valgrind reports to a dedicated file rather than console
- try to avoid killing instances with SIGKILL so that we get the memory
  leak report (killing with SIGTERM before resorting to SIGKILL)
- search for valgrind reports when done, print them and fail the tests
- add --dont-clean option to keep the logs on exit
- fix exit error code when crash is found (would have exited with 0)

changes that affect the normal redis test suite:
- refactor check_valgrind_errors into two functions one to search and
  one to report
- move the search half into util.tcl to serve the cluster tests too
- ignore "address range perms" valgrind warnings which seem irrelevant.
2020-09-06 11:11:49 +03:00
Oran Agra
2b998de460
Improve valgrind support for cluster tests (#7725)
- redirect valgrind reports to a dedicated file rather than console
- try to avoid killing instances with SIGKILL so that we get the memory
  leak report (killing with SIGTERM before resorting to SIGKILL)
- search for valgrind reports when done, print them and fail the tests
- add --dont-clean option to keep the logs on exit
- fix exit error code when crash is found (would have exited with 0)

changes that affect the normal redis test suite:
- refactor check_valgrind_errors into two functions one to search and
  one to report
- move the search half into util.tcl to serve the cluster tests too
- ignore "address range perms" valgrind warnings which seem irrelevant.
2020-09-06 11:11:49 +03:00
Oran Agra
cf22e8eb91 test infra - add durable mode to work around test suite crashing
in some cases a command that returns an error possibly due to a timing
issue causes the tcl code to crash and thus prevents the rest of the
tests from running. this adds an option to make the test proceed despite
the crash.
maybe it should be the default mode some day.
2020-09-06 09:59:19 +03:00