1121 Commits

Author SHA1 Message Date
Yossi Gottlieb
d3945c636d Tests: fix oom-score-adj false positives. (#7772)
The key save delay is too short and on certain systems the child process
is gone before we have a chance to inspect it.

(cherry picked from commit 1abc94155a26356f7fcaf5d20b80f031a55a3e82)
2020-09-10 14:09:00 +03:00
杨博东
d12e141780 Tests: Add aclfile load and save tests (#7765)
improves test coverage

(cherry picked from commit ce1466831686b617f72ffbdc51dde137ce5cf9ff)
2020-09-10 14:09:00 +03:00
Roi Lipman
ee3e45ac6e RM_ThreadSafeContextTryLock a non-blocking method for acquiring GIL (#7738)
Co-authored-by: Yossi Gottlieb <yossigo@gmail.com>
Co-authored-by: Oran Agra <oran@redislabs.com>
(cherry picked from commit b1de173ec0f6a03d6083b87f1505fbf843708685)
2020-09-10 14:09:00 +03:00
Yossi Gottlieb
8fb8c23746 Tests: validate CONFIG REWRITE for all params. (#7764)
This is a catch-all test to confirm that that rewrite produces a valid
output for all parameters and that this process does not introduce
undesired configuration changes.

(cherry picked from commit 995f1fc53f7daf3d289d5d70d7b45cdd486dc6cc)
2020-09-10 14:09:00 +03:00
Yossi Gottlieb
82aee21c92 Tests: clean up stale .cli files. (#7768)
(cherry picked from commit e5b1ad413bdc05e6539dbaa23b5114e15103516e)
2020-09-10 14:09:00 +03:00
Eran Liberty
7fa69e6394 Allow exec with read commands on readonly replica in cluster (#7766)
There was a bug. Although cluster replicas would allow read commands,
they would not allow a MULTI-EXEC that's composed solely of read commands.
Adds tests for coverage.

Co-authored-by: Oran Agra <oran@redislabs.com>
Co-authored-by: Eran Liberty <eranl@amazon.com>
(cherry picked from commit 7bee51bb5b2cccbaae76f4721761880acf4d5a93)
2020-09-10 14:09:00 +03:00
Oran Agra
826b49bcb4 handle cur_test for nested tests
if there are nested tests and nested servers, we need to restore the
previous value of cur_test when a test exist.

example:
```
test{test 1} {
	start_server {
		test{test 1.1 - master only} {
		}
		start_server {
		    test{test 1.2 - with replication} {
            }
		}
	}
}
```
when `test 1.1 - master only exists`, we're still inside `test 1`

(cherry picked from commit 610b4ff16a62062338588c4508a73784fb962c0b)
2020-09-10 14:09:00 +03:00
bodong.ybd
cb4f96657b Tests: Some fixes for macOS
1) cur_test: when restart_server, "no such variable" error occurs
  ./runtest --single integration/rdb
  test {client freed during loading}
      SET ::cur_test
      restart_server
        kill_server
          test "Check for memory leaks (pid $pid)"
          SET ::cur_test
          UNSET ::cur_test
      UNSET ::cur_test // This global variable has been unset.

2) `ps --ppid` not available on macOS platform, can be replaced with
`pgrep -P pid`.

(cherry picked from commit e90385e2232d41fd7c40dc239279f9837e7bdf57)
2020-09-10 14:09:00 +03:00
Oran Agra
95966ceb24 Fix cluster consistency-check test (#7754)
This test was failing from time to time see discussion at the bottom of #7635
This was probably due to timing, the DEBUG SLEEP executed by redis-cli
didn't sleep for enough time.

This commit changes:
1) use SET-ACTIVE-EXPIRE instead of DEBUG SLEEP
2) reduce many `after` sleeps with retry loops to speed up the test.
3) add many comment explaining the different steps of the test and
   it's purpose.
4) config appendonly before populating the volatile keys, so that they'll
   be part of the AOF command stream rather than the preamble RDB portion.

other complications: recently kill_instance switched from SIGKILL to
SIGTERM, and this would sometimes fail since there was an AOFRW running
in the background. now we wait for it to end before attempting the kill.

(cherry picked from commit 541d2709a0bd1a7f88681afa001c714b19df5dc1)
2020-09-10 14:09:00 +03:00
Yossi Gottlieb
9275c8b990 Tests: fix unmonitored servers. (#7756)
There is an inherent race condition in port allocation for spawned
servers. If a server fails to start because a port is taken, a new port
is allocated. This fixes a problem where the logs are not truncated and
as a result a large number of unmonitored servers are started.

(cherry picked from commit 871e85b8a75a53f90044ac04b0f5a9ba415c3bfa)
2020-09-10 14:09:00 +03:00
Oran Agra
d37b034321 fix broken cluster/sentinel tests by recent commit (#7752)
da723a917 added a file for stderr to keep valgrind log but i forgot to
add a similar thing when valgrind isn't being used.
the result is that `glob */err.txt` fails.

(cherry picked from commit 470de9a516b0dcb92acb8cf2841ddac604bcbd3a)
2020-09-10 14:09:00 +03:00
Oran Agra
540841d6f7 Improve valgrind support for cluster tests (#7725)
- redirect valgrind reports to a dedicated file rather than console
- try to avoid killing instances with SIGKILL so that we get the memory
  leak report (killing with SIGTERM before resorting to SIGKILL)
- search for valgrind reports when done, print them and fail the tests
- add --dont-clean option to keep the logs on exit
- fix exit error code when crash is found (would have exited with 0)

changes that affect the normal redis test suite:
- refactor check_valgrind_errors into two functions one to search and
  one to report
- move the search half into util.tcl to serve the cluster tests too
- ignore "address range perms" valgrind warnings which seem non relevant.

(cherry picked from commit da723a917dec7f2514d821a615668e158bb4f60c)
2020-09-10 14:09:00 +03:00
Oran Agra
81476c0cf7 test infra - add durable mode to work around test suite crashing
in some cases a command that returns an error possibly due to a timing
issue causes the tcl code to crash and thus prevents the rest of the
tests from running. this adds an option to make the test proceed despite
the crash.
maybe it should be the default mode some day.

(cherry picked from commit cf22e8eb91c2c1a769fda4c4de9eba3163dd7f05)
2020-09-10 14:09:00 +03:00
Oran Agra
e001152825 test infra - wait_done_loading
reduce code duplication in aof.tcl.
move creation of clients into the test so that it can be skipped

(cherry picked from commit cc455a710cc68d0fd8243cd1f04c5ee7332e4fdb)
2020-09-10 14:09:00 +03:00
Oran Agra
f180326b65 test infra - flushall between tests in external mode
(cherry picked from commit 2468c17a3229ae37825466a18dce9a5272eeef30)
2020-09-10 14:09:00 +03:00
Oran Agra
575d07b7a8 test infra - improve test skipping ability
- skip full units
- skip a single test (not just a list of tests)
- when skipping tag, skip spinning up servers, not just the tests
- skip tags when running against an external server too
- allow using multiple tags (split them)

(cherry picked from commit 5c61f1a6ed876186b944e79f903354cd81077bb6)
2020-09-10 14:09:00 +03:00
Oran Agra
7d3cec9686 test infra - reduce disk space usage
this is important when running a test with --loop

(cherry picked from commit fc18f16260d15b3584d92f73cebafa3a552e2686)
2020-09-10 14:09:00 +03:00
Oran Agra
60bec0c20c test infra - write test name to logfile
(cherry picked from commit e783c03dd1828fbf67259ee037a4faf835c4700a)
2020-09-10 14:09:00 +03:00
Oran Agra
6041fc99b5 Reduce the probability of failure when start redis in runtest-cluster #7554 (#7635)
When runtest-cluster, at first, we need to create a cluster use spawn_instance,
a port which is not used is choosen, however sometimes we can't run server on
the port. possibley due to a race with another process taking it first.
such as redis/redis/runs/896537490. It may be due to the machine problem or
In order to reduce the probability of failure when start redis in
runtest-cluster, we attemp to use another port when find server do not start up.

Co-authored-by: Oran Agra <oran@redislabs.com>
Co-authored-by: yanhui13 <yanhui13@meituan.com>
(cherry picked from commit 1deaad884c38e92e5b691f36b253ef4ee2201ca4)
2020-09-01 09:27:58 +03:00
Yossi Gottlieb
ba1da77a3d Fix oom-score-adj on older distros. (#7724)
Don't assume `ps` handles `-h` to display output without headers and
manually trim headers line from output.

(cherry picked from commit ae8420298cacc2737e8e3ffa3c5acc038cd27849)
2020-09-01 09:27:58 +03:00
Yossi Gottlieb
f0e28abc07 Add oom-score-adj configuration option to control Linux OOM killer. (#1690)
Add Linux kernel OOM killer control option.

This adds the ability to control the Linux OOM killer oom_score_adj
parameter for all Redis processes, depending on the process role (i.e.
master, replica, background child).

A oom-score-adj global boolean flag control this feature. In addition,
specific values can be configured using oom-score-adj-values if
additional tuning is required.

(cherry picked from commit 70c823a64e800f22ac68f0172acdd1da82d7be32)
2020-09-01 09:27:58 +03:00
Meir Shpilraien (Spielrein)
63e3f1e449 see #7544, added RedisModule_HoldString api. (#7577)
Added RedisModule_HoldString that either returns a
shallow copy of the given String (by increasing
the String ref count) or a new deep copy of String
in case its not possible to get a shallow copy.

Co-authored-by: Itamar Haber <itamar@redislabs.com>
(cherry picked from commit 4f99b22118ca91e3a7fe9c1c68c19dd717dfdbb5)
2020-09-01 09:27:58 +03:00
Meir Shpilraien (Spielrein)
f63e428e5b This PR introduces a new loaded keyspace event (#7536)
Co-authored-by: Oran Agra <oran@redislabs.com>
Co-authored-by: Itamar Haber <itamar@redislabs.com>
(cherry picked from commit 73198c50194cbf0254afd4cc5245f9274a538d13)
2020-09-01 09:27:58 +03:00
valentinogeron
3c136a7777 EXEC with only read commands should not be rejected when OOM (#7696)
If the server gets MULTI command followed by only read
commands, and right before it gets the EXEC it reaches OOM,
the client will get OOM response.

So, from now on, it will get OOM response only if there was
at least one command that was tagged with `use-memory` flag

(cherry picked from commit 0292720ccb0a189d3ed49d7bf912602360a4ecdd)
2020-09-01 09:27:58 +03:00
Valentino Geron
34124fff88 Fix LPOS command when RANK is greater than matches
When calling to LPOS command when RANK is higher than matches,
the return value is non valid response. For example:
```
LPUSH l a
:1
LPOS l b RANK 5 COUNT 10
*-4
```
It may break client-side parser.

Now, we count how many replies were replied in the array.
```
LPUSH l a
:1
LPOS l b RANK 5 COUNT 10
*0
```

(cherry picked from commit 7a555da64f56a4fb2f300d84a35778bee8f471ca)
2020-09-01 09:27:58 +03:00
Yossi Gottlieb
c5675c66bc Tests: fix redis-cli with remote hosts. (#7693)
(cherry picked from commit 257f9f462f7782dcaecf7bbf35f4701b20b88a45)
2020-09-01 09:27:58 +03:00
杨博东
b42976bd56 Fix flock cluster config may cause failure to restart after kill -9 (#7674)
After fork, the child process(redis-aof-rewrite) will get the fd opened
by the parent process(redis), when redis killed by kill -9, it will not
graceful exit(call prepareForShutdown()), so redis-aof-rewrite thread may still
alive, the fd(lock) will still be held by redis-aof-rewrite thread, and
redis restart will fail to get lock, means fail to start.

This issue was causing failures in the cluster tests in github actions.

Co-authored-by: Oran Agra <oran@redislabs.com>
(cherry picked from commit 5e6212e087c4696abc682b64079202c9ade8666c)
2020-09-01 09:27:58 +03:00
Yossi Gottlieb
d13c44583c Module API: fix missing RM_CLIENTINFO_FLAG_SSL. (#7666)
The `REDISMODULE_CLIENTINFO_FLAG_SSL` flag was already a part of the `RedisModuleClientInfo` structure but was not implemented.

(cherry picked from commit 2ec11f941ae41188e517670fc3224b12c7666541)
2020-09-01 09:27:58 +03:00
Oran Agra
b4a6b4f28d fix new rdb test failing on timing issues (#7604)
apparenlty on github actions sometimes 500ms is not enough

(cherry picked from commit 191b1181023b0860ec60afde7a41bd4f03c55097)
2020-09-01 09:27:58 +03:00
Oran Agra
3a4ee4b6d6 module hook for master link up missing on successful psync (#7584)
besides, hooks test was time sensitive. when the replica managed to
reconnect quickly after the client kill, the test would fail

(cherry picked from commit c5d85c69c75438f98f84e549877c2999a2e450a8)
2020-09-01 09:27:58 +03:00
WuYunlong
37fba8f4d8 Fix running single test 14-consistency-check.tcl (#7587)
(cherry picked from commit be11e1b5eaf0d6ab5e68f86c1346570531eee766)
2020-09-01 09:27:58 +03:00
Yossi Gottlieb
2e6563cdcf Fix TLS cluster tests. (#7578)
Fix consistency test added in 0c9916d00 without considering TLS
redis-cli configuration.

(cherry picked from commit 675b00c7e0b7d68bafa11fcc7f66a394c3c3cd36)
2020-09-01 09:27:58 +03:00
Oran Agra
10a8407a4f Fix failing tests due to issues with wait_for_log_message (#7572)
- the test now waits for specific set of log messages rather than wait for
  timeout looking for just one message.
- we don't wanna sample the current length of the log after an action, due
  to a race, we need to start the search from the line number of the last
  message we where waiting for.
- when attempting to trigger a full sync, use multi-exec to avoid a race
  where the replica manages to re-connect before we completed the set of
  actions that should force a full sync.
- fix verify_log_message which was broken and unused

(cherry picked from commit 06aaeabaea9d9b248e8a790dde352cd14d66628a)
2020-09-01 09:27:58 +03:00
Jiayuan Chen
7b2af98316 Add optional tls verification (#7502)
Adds an `optional` value to the previously boolean `tls-auth-clients` configuration keyword.

Co-authored-by: Yossi Gottlieb <yossigo@gmail.com>
(cherry picked from commit 198770751fdc4c46eb4971ead9b5787fd6ce39fd)
2020-09-01 09:27:58 +03:00
Oran Agra
558a343b3c Stabilize bgsave test that sometimes fails with valgrind (#7559)
on ci.redis.io the test fails a lot, reporting that bgsave didn't end.
increaseing the timeout we wait for that bgsave to get aborted.
in addition to that, i also verify that it indeed got aborted by
checking that the save counter wasn't reset.

add another test to verify that a successful bgsave indeed resets the
change counter.

(cherry picked from commit 49d4aebce0a0b94cd2b302d276be95d1a1ce8610)
2020-09-01 09:27:58 +03:00
Oran Agra
2b45c88a6a testsuite may leave servers alive on error (#7549)
in cases where you have
test name {
  start_server {
    start_server {
      assert
    }
  }
}

the exception will be thrown to the test proc, and the servers are
supposed to be killed on the way out. but it seems there was always a
bug of not cleaning the server stack, and recently (#7404) we started
relying on that stack in order to kill them, so with that bug sometimes
we would have tried to kill the same server twice, and leave one alive.

luckly, in most cases the pattern is:
start_server {
  test name {
  }
}

(cherry picked from commit bb170fa06e5909dd816b6530121952d57c8209a0)
2020-09-01 09:27:58 +03:00
Yossi Gottlieb
6d80011e73 Tests: drop TCL 8.6 dependency. (#7548)
This re-implements the redis-cli --pipe test so it no longer depends on a close feature available only in TCL 8.6.

Basically what this test does is run redis-cli --pipe, generates a bunch of commands and pipes them through redis-cli, and inspects the result in both Redis and the redis-cli output.

To do that, we need to close stdin for redis-cli to indicate we're done so it can flush its buffers and exit. TCL has bi-directional channels can only offers a way to "one-way close" a channel with TCL 8.6. To work around that, we now generate the commands into a file and feed that file to redis-cli directly.

As we're writing to an actual file, the number of commands is now reduced.

(cherry picked from commit dbc0a64843ccd07515ac41ca80497a9e5ffd107a)
2020-09-01 09:27:58 +03:00
Remi Collet
443e57b08e Fix deprecated tail syntax in tests (#7543)
(cherry picked from commit 7853d8410b12c3ffac699c8a2e06f2a8e6df26b0)
2020-09-01 09:27:58 +03:00
WuYunlong
8b20802a09 Fix command help for unexpected options (#7476)
(cherry picked from commit e5166eccee3396a24dfd3a79d3211943e5a3d25e)
2020-07-20 21:08:26 +03:00
Oran Agra
6bdc5a4a08 redis-cli tests, fix valgrind timing issue (#7519)
this test when run with valgrind on github actions takes 160 seconds

(cherry picked from commit 8a14ce8634c49d992aa929cf0f98e96f03bccba4)
2020-07-20 21:08:26 +03:00
WuYunlong
b80f2ec8ac Fix out of update help info in tcl tests. (#7516)
Before this commit, the output of "./runtest-cluster --help" is incorrect.
After this commit, the format of the following 3 output is consistent:
./runtest --help
./runtest-cluster --help
./runtest-sentinel --help

(cherry picked from commit 24b6f62741483f3f695838c0ad091f1931c36df5)
2020-07-20 21:08:26 +03:00
Oran Agra
dde79afbf7 fix recently added time sensitive tests failing with valgrind (#7512)
interestingly the latency monitor test fails because valgrind is slow
enough so that the time inside PEXPIREAT command from the moment of
the first mstime() call to get the basetime until checkAlreadyExpired
calls mstime() again is more than 1ms, and that test was too sensitive.

using this opportunity to speed up the test (unrelated to the failure)
the fix is just the longer time passed to PEXPIRE.

(cherry picked from commit 663e637da87ee9385527fe3a37edb241a1f97cc6)
2020-07-20 21:08:26 +03:00
Oran Agra
905ffb72e9 runtest --stop pause stops before terminating the redis server (#7513)
in the majority of the cases (on this rarely used feature) we want to
stop and be able to connect to the shard with redis-cli.
since these are two different processes interracting with the tty we
need to stop both, and we'll have to hit enter twice, but it's not that
bad considering it is rarely used.

(cherry picked from commit 3351549c22434337dfa8a262dce678679a35d7da)
2020-07-20 21:08:26 +03:00
WuYunlong
b6fd631d04 Add missing latency-monitor tcl test to test_helper.tcl. (#6782)
(cherry picked from commit 136e5efeaba3296c79e67a6fe5dc796c0830b61e)
2020-07-20 21:08:26 +03:00
Yossi Gottlieb
27d44fbf73 TLS: Session caching configuration support. (#7420)
* TLS: Session caching configuration support.
* TLS: Remove redundant config initialization.

(cherry picked from commit c611a836f630ecf358b5cfb0d3c5e21c9f0bc105)
2020-07-20 21:08:26 +03:00
Yossi Gottlieb
6af3d57beb TLS: Add missing redis-cli options. (#7456)
* Tests: fix and reintroduce redis-cli tests.

These tests have been broken and disabled for 10 years now!

* TLS: add remaining redis-cli support.

This adds support for the redis-cli --pipe, --rdb and --replica options
previously unsupported in --tls mode.

* Fix writeConn().

(cherry picked from commit 99b920534f7710d544c38b870fd10c6053283d99)
2020-07-20 21:08:26 +03:00
Oran Agra
da96665c04 RESTORE ABSTTL won't store expired keys into the db (#7472)
Similarly to EXPIREAT with TTL in the past, which implicitly deletes the
key and return success, RESTORE should not store key that are already
expired into the db.
When used together with REPLACE it should emit a DEL to keyspace
notification and replication stream.

(cherry picked from commit 1c35f8741baa0def2f87eeab17898c79f0147d11)
2020-07-20 21:08:26 +03:00
Oran Agra
cf116e08cc skip a test that uses +inf on valgrind (#7440)
On some platforms strtold("+inf") with valgrind returns a non-inf result

[err]: INCRBYFLOAT does not allow NaN or Infinity in tests/unit/type/incr.tcl
Expected 'ERR*would produce*' to equal or match '1189731495357231765085759.....'

(cherry picked from commit 6b53c630d92c8fbeb9bea0c406c9454fb68b8467)
2020-07-20 21:08:26 +03:00
Oran Agra
c994e73c8e stabilize tests that look for log lines (#7367)
tests were sensitive to additional log lines appearing in the log
causing the search to come empty handed.

instead of just looking for the n last log lines, capture the log lines
before performing the action, and then search from that offset.

(cherry picked from commit efc4189b6227a17f26ed9bd6bbac62bf4bf7ab66)
2020-07-20 21:08:26 +03:00
Oran Agra
298e93c360 tests/valgrind: don't use debug restart (#7404)
* tests/valgrind: don't use debug restart

DEBUG REATART causes two issues:
1. it uses execve which replaces the original process and valgrind doesn't
   have a chance to check for errors, so leaks go unreported.
2. valgrind report invalid calls to close() which we're unable to resolve.

So now the tests use restart_server mechanism in the tests, that terminates
the old server and starts a new one, new PID, but same stdout, stderr.

since the stderr can contain two or more valgrind report, it is not enough
to just check for the absence of leaks, we also need to check for some known
errors, we do both, and fail if we either find an error, or can't find a
report saying there are no leaks.

other changes:
- when killing a server that was already terminated we check for leaks too.
- adding DEBUG LEAK which was used to test it.
- adding --trace-children to valgrind, although no longer needed.
- since the stdout contains two or more runs, we need slightly different way
  of checking if the new process is up (explicitly looking for the new PID)
- move the code that handles --wait-server to happen earlier (before
  watching the startup message in the log), and serve the restarted server too.

* squashme - CR fixes

(cherry picked from commit 8d4f055e43ab554adfce617c971f10c4b6423484)
2020-07-20 21:08:26 +03:00