futriix

Author	SHA1	Message	Date
John Sully	b12b48ab27	Initial implementation of snapshot fast replication. There are still a few TODOs in progress Former-commit-id: 0febdcdab8693af443f350968ed3d8c80106675d	2021-11-09 19:36:07 +00:00
John Sully	ea6a0f214b	Merge tag '6.2.2' into unstable Former-commit-id: 93ebb31b17adec5d406d2e30a5b9ea71c07fce5c	2021-05-21 05:54:39 +00:00
John Sully	f49d8f9adb	Merge tag '6.2.1' into unstable Former-commit-id: bfed57e3e0edaa724b9d060a6bb8edc5a6de65fa	2021-05-19 02:59:48 +00:00
Oran Agra	ca1a42e3e6	Improve testsuite print of log file (#8805) 1. the `dump_logs` option would have printed only logs of servers that were spawned before the test proc started, and not ones that the test proc started inside it. 2. when a server proc catches an exception it should normally forward the exception upwards, specifically when it's an assertion that should be caught by a test proc above. however, in `durable` mode, we caught all exceptions, printed them to stdout and let the code continue; this was wrong to do for assertions, which should have still been propagated to the test function. 3. don't bother to search for crash log to print if we printed the entire log anyway 4. if no crash log was found, no need to print anything (i.e. the fact it wasn't found) 5. rename warnings_from_file to crashlog_from_file	2021-04-18 11:55:54 +03:00
Huang Zhw	6e15162367	When tests exit normally, some processes may still be alive (#8647) In certain scenarios start_server may think it failed to start a redis server although it started successfully. In these cases, it will not terminate it, and it will remain running when the test is over. In start_server, if the config doesn't have bind (the minimal.conf in introspection.tcl), it will try to bind ipv4 and ipv6. One may succeed while the other fails. It will output "Could not create server TCP listening socket". wait_server_started uses this message to check whether the instance started successfully. So it will consider that it failed even though redis started successfully. Additionally, in some cases it wasn't clear to users why the server exited, since the warning message printed to the log could in some cases be harmless, and in some cases fatal. This PR makes a clear distinction between a warning log message and a fatal one, and changes the test suite to look for the fatal message.	2021-03-16 17:25:30 +02:00
Madelyn Olson	5427ba4476	Allow stopped redis processes to be killed in tests (#8552)	2021-02-24 14:26:16 -08:00
Yossi Gottlieb	b07ddb2c04	Add --dump-logs tests option. (#8459) Dump the entire server log if a test failed, to ease troubleshooting with no access to log files.	2021-02-07 12:37:24 +02:00
christianEQ	c068f2cd3d	Merge tag 'tags/6.0.10' into redismerge_2021-01-20 Former-commit-id: dadce055f897cee83946c2d3e5cbb76341b94230	2021-01-26 21:43:09 +00:00
Yossi Gottlieb	e4d0ba933d	Add io-thread daily CI tests. (#8232) This adds basic coverage to IO threads by running the cluster and a few selected Redis test suite tests with the IO threads enabled. Also provides some necessary additional improvements to the test suite: * Add --config to sentinel/cluster tests for arbitrary configuration. * Fix --tags whitelisting which was broken. * Add a `network` tag to some tests that are more network intensive. This is work in progress and more tests should be properly tagged in the future.	2021-01-17 15:48:48 +02:00
Yang Bodong	214e4189a3	Tests: fix the problem that Darwin memory leak detection may fail (#8213) Apparently the "leaks" tool reports a different error string about a process that's not found in each version of MacOS. This causes the test suite to fail on some OS versions, since some tests terminate the process before looking for leaks. Instead of looking at the error string, we now look at the (documented) exit code.	2020-12-23 16:28:17 +02:00
Yossi Gottlieb	11b9e4092e	TLS: Add different client cert support. (#8076) This adds new `tls-client-cert-file` and `tls-client-key-file` configuration directives which make it possible to use different certificates for the TLS-server and TLS-client functions of Redis. These directives are optional. If they are not specified the `tls-cert-file` and `tls-key-file` directives are used for TLS client functions as well. Also, `utils/gen-test-certs.sh` now creates additional server-only and client-only certs and will skip intensive operations if target files already exist.	2020-12-11 18:31:40 +02:00
Oran Agra	912e22b4f9	Sanitize dump payload: fuzz tester and fixes for segfaults and leaks it exposed The test creates keys with various encodings, DUMPs them, corrupts the payload and RESTOREs it. It utilizes the recently added use-exit-on-panic config to distinguish between asserts and segfaults. If the restore succeeds, it runs random commands on the key to attempt to trigger a crash. It runs in two modes, one with deep sanitation enabled and one without. In the first one we don't expect any assertions or segfaults, in the second one we expect assertions, but no segfaults. We also check for leaks and invalid reads using valgrind, and if we find them we print the commands that led to that issue. Changes in the code (other than the test): - Replace a few NPD (null pointer dereference) flows and division by zero with an assertion, so that it doesn't fail the test. (since we set the server to use `exit` rather than `abort` on assertion). - Fix quite a lot of flows in rdb.c that could have led to memory leaks in RESTORE command (since it now responds with an error rather than panic) - Add a DEBUG flag for SET-SKIP-CHECKSUM-VALIDATION so that the test doesn't need to bother with faking a valid checksum - Remove a pile of code in serverLogObjectDebugInfo which is actually unsafe to run in the crash report (see comments in the code) - fix a missing boundary check in lzf_decompress test suite infra improvements: - be able to run valgrind checks before the process terminates - rotate log files when restarting servers	2020-12-06 14:54:34 +02:00
Yossi Gottlieb	c9dcf82020	Fix tests failure on busybox systems. (#7916) (cherry picked from commit 6a035f5edff0be3b404f1bfb9f07e9373cd63934)	2020-10-27 09:12:01 +02:00
Yossi Gottlieb	6a035f5edf	Fix tests failure on busybox systems. (#7916)	2020-10-18 14:50:29 +03:00
John Sully	14daf6f909	Merge tag '6.0.8' into unstable Former-commit-id: 4c7e4b91a6bb2034636856b608b8c386d07f5541	2020-09-30 19:47:55 +00:00
Yossi Gottlieb	9275c8b990	Tests: fix unmonitored servers. (#7756) There is an inherent race condition in port allocation for spawned servers. If a server fails to start because a port is taken, a new port is allocated. This fixes a problem where the logs are not truncated and as a result a large number of unmonitored servers are started. (cherry picked from commit 871e85b8a75a53f90044ac04b0f5a9ba415c3bfa)	2020-09-10 14:09:00 +03:00
Oran Agra	540841d6f7	Improve valgrind support for cluster tests (#7725) - redirect valgrind reports to a dedicated file rather than console - try to avoid killing instances with SIGKILL so that we get the memory leak report (killing with SIGTERM before resorting to SIGKILL) - search for valgrind reports when done, print them and fail the tests - add --dont-clean option to keep the logs on exit - fix exit error code when crash is found (would have exited with 0) changes that affect the normal redis test suite: - refactor check_valgrind_errors into two functions one to search and one to report - move the search half into util.tcl to serve the cluster tests too - ignore "address range perms" valgrind warnings which seem non relevant. (cherry picked from commit da723a917dec7f2514d821a615668e158bb4f60c)	2020-09-10 14:09:00 +03:00
Oran Agra	81476c0cf7	test infra - add durable mode to work around test suite crashing. In some cases a command that returns an error, possibly due to a timing issue, causes the tcl code to crash and thus prevents the rest of the tests from running. This adds an option to make the test proceed despite the crash. Maybe it should be the default mode some day. (cherry picked from commit cf22e8eb91c2c1a769fda4c4de9eba3163dd7f05)	2020-09-10 14:09:00 +03:00
Oran Agra	f180326b65	test infra - flushall between tests in external mode (cherry picked from commit 2468c17a3229ae37825466a18dce9a5272eeef30)	2020-09-10 14:09:00 +03:00
Oran Agra	575d07b7a8	test infra - improve test skipping ability - skip full units - skip a single test (not just a list of tests) - when skipping tag, skip spinning up servers, not just the tests - skip tags when running against an external server too - allow using multiple tags (split them) (cherry picked from commit 5c61f1a6ed876186b944e79f903354cd81077bb6)	2020-09-10 14:09:00 +03:00
Oran Agra	7d3cec9686	test infra - reduce disk space usage this is important when running a test with --loop (cherry picked from commit fc18f16260d15b3584d92f73cebafa3a552e2686)	2020-09-10 14:09:00 +03:00
Oran Agra	60bec0c20c	test infra - write test name to logfile (cherry picked from commit e783c03dd1828fbf67259ee037a4faf835c4700a)	2020-09-10 14:09:00 +03:00
Yossi Gottlieb	871e85b8a7	Tests: fix unmonitored servers. (#7756) There is an inherent race condition in port allocation for spawned servers. If a server fails to start because a port is taken, a new port is allocated. This fixes a problem where the logs are not truncated and as a result a large number of unmonitored servers are started.	2020-09-07 17:30:36 +03:00
Oran Agra	da723a917d	Improve valgrind support for cluster tests (#7725) - redirect valgrind reports to a dedicated file rather than console - try to avoid killing instances with SIGKILL so that we get the memory leak report (killing with SIGTERM before resorting to SIGKILL) - search for valgrind reports when done, print them and fail the tests - add --dont-clean option to keep the logs on exit - fix exit error code when crash is found (would have exited with 0) changes that affect the normal redis test suite: - refactor check_valgrind_errors into two functions one to search and one to report - move the search half into util.tcl to serve the cluster tests too - ignore "address range perms" valgrind warnings which seem non relevant.	2020-09-06 11:11:49 +03:00
Oran Agra	cf22e8eb91	test infra - add durable mode to work around test suite crashing. In some cases a command that returns an error, possibly due to a timing issue, causes the tcl code to crash and thus prevents the rest of the tests from running. This adds an option to make the test proceed despite the crash. Maybe it should be the default mode some day.	2020-09-06 09:59:19 +03:00
Oran Agra	2468c17a32	test infra - flushall between tests in external mode	2020-09-06 09:59:19 +03:00
Oran Agra	5c61f1a6ed	test infra - improve test skipping ability - skip full units - skip a single test (not just a list of tests) - when skipping tag, skip spinning up servers, not just the tests - skip tags when running against an external server too - allow using multiple tags (split them)	2020-09-06 09:59:19 +03:00
Oran Agra	fc18f16260	test infra - reduce disk space usage this is important when running a test with --loop	2020-09-06 09:59:19 +03:00
Oran Agra	e783c03dd1	test infra - write test name to logfile	2020-09-06 09:59:19 +03:00
Oran Agra	2b45c88a6a	testsuite may leave servers alive on error (#7549) in cases where you have test name { start_server { start_server { assert } } } the exception will be thrown to the test proc, and the servers are supposed to be killed on the way out. but it seems there was always a bug of not cleaning the server stack, and recently (#7404) we started relying on that stack in order to kill them, so with that bug sometimes we would have tried to kill the same server twice, and leave one alive. luckily, in most cases the pattern is: start_server { test name { } } (cherry picked from commit bb170fa06e5909dd816b6530121952d57c8209a0)	2020-09-01 09:27:58 +03:00
Oran Agra	bb170fa06e	testsuite may leave servers alive on error (#7549) in cases where you have test name { start_server { start_server { assert } } } the exception will be thrown to the test proc, and the servers are supposed to be killed on the way out. but it seems there was always a bug of not cleaning the server stack, and recently (#7404) we started relying on that stack in order to kill them, so with that bug sometimes we would have tried to kill the same server twice, and leave one alive. luckily, in most cases the pattern is: start_server { test name { } }	2020-07-21 16:56:19 +03:00
Oran Agra	298e93c360	tests/valgrind: don't use debug restart (#7404) * tests/valgrind: don't use debug restart DEBUG RESTART causes two issues: 1. it uses execve which replaces the original process and valgrind doesn't have a chance to check for errors, so leaks go unreported. 2. valgrind reports invalid calls to close() which we're unable to resolve. So now the tests use restart_server mechanism in the tests, that terminates the old server and starts a new one, new PID, but same stdout, stderr. since the stderr can contain two or more valgrind reports, it is not enough to just check for the absence of leaks, we also need to check for some known errors, we do both, and fail if we either find an error, or can't find a report saying there are no leaks. other changes: - when killing a server that was already terminated we check for leaks too. - adding DEBUG LEAK which was used to test it. - adding --trace-children to valgrind, although no longer needed. - since the stdout contains two or more runs, we need slightly different way of checking if the new process is up (explicitly looking for the new PID) - move the code that handles --wait-server to happen earlier (before watching the startup message in the log), and serve the restarted server too. * squashme - CR fixes (cherry picked from commit 8d4f055e43ab554adfce617c971f10c4b6423484)	2020-07-20 21:08:26 +03:00
Oran Agra	8d4f055e43	tests/valgrind: don't use debug restart (#7404) * tests/valgrind: don't use debug restart DEBUG RESTART causes two issues: 1. it uses execve which replaces the original process and valgrind doesn't have a chance to check for errors, so leaks go unreported. 2. valgrind reports invalid calls to close() which we're unable to resolve. So now the tests use restart_server mechanism in the tests, that terminates the old server and starts a new one, new PID, but same stdout, stderr. since the stderr can contain two or more valgrind reports, it is not enough to just check for the absence of leaks, we also need to check for some known errors, we do both, and fail if we either find an error, or can't find a report saying there are no leaks. other changes: - when killing a server that was already terminated we check for leaks too. - adding DEBUG LEAK which was used to test it. - adding --trace-children to valgrind, although no longer needed. - since the stdout contains two or more runs, we need slightly different way of checking if the new process is up (explicitly looking for the new PID) - move the code that handles --wait-server to happen earlier (before watching the startup message in the log), and serve the restarted server too. * squashme - CR fixes	2020-07-10 08:26:52 +03:00
John Sully	ed2e0e66f6	Merge tag '6.0.4' into unstable Redis 6.0.4. Former-commit-id: 9c31ac7925edba187e527f506e5e992946bd38a6	2020-05-29 00:57:07 -04:00
Oran Agra	4653d796f0	tests: each test client works on a distinct port range apparently when running tests in parallel (the default of --clients 16), there's a chance for two tests to use the same port. specifically, one test might shutdown a master and still have the replica up, and then another test will re-use the port number of master for another master, and then that replica will connect to the master of the other test. this can cause a master to count too many full syncs and fail a test if we run the tests with --single integration/psync2 --loop --stop see Problem 2 in #7314	2020-05-28 10:09:51 +02:00
Oran Agra	8f0c339892	tests: each test client works on a distinct port range apparently when running tests in parallel (the default of --clients 16), there's a chance for two tests to use the same port. specifically, one test might shutdown a master and still have the replica up, and then another test will re-use the port number of master for another master, and then that replica will connect to the master of the other test. this can cause a master to count too many full syncs and fail a test if we run the tests with --single integration/psync2 --loop --stop see Problem 2 in #7314	2020-05-26 11:17:08 +03:00
John Sully	327d543f2c	Merge commit 'c5d805f87771581d3f6b29861ed2062c0ae2a688' into unstable Former-commit-id: 95cecb0229af0278cf614ffd746ba829ae7c897c	2020-05-21 17:45:15 -04:00
John Sully	d9c08a1db3	Run all KeyDB instances in testmode during tests Former-commit-id: cd306f1d23f4fbb900433edbf55d89099bbf903c	2020-04-15 22:27:04 -04:00
John Sully	0725491043	Merge commit 'c609bf3f2c7f0982f632f82623ee4802868b8ef1' into redis_6_merge Former-commit-id: 320bc3c0329ff9e5a980b79426b719addae381cf	2020-04-14 21:04:42 -04:00
John Sully	68c50ae876	Merge commit '6718d5d37517bd927635649708913affb98f67c9' into redis_6_merge Former-commit-id: ef1236b6009ebd7b00f6dd2f43df57ad95e51253	2020-04-14 20:19:48 -04:00
Oran Agra	dcd6726366	different fix for runtest --host --port	2020-04-07 16:52:28 +02:00
Oran Agra	98208c3f30	different fix for runtest --host --port	2020-04-06 09:41:14 +03:00
bodong.ybd	c609bf3f2c	Fix bug of tcl test using external server	2020-03-25 15:54:34 +01:00
bodong.ybd	53fd8f4d0d	Fix bug of tcl test using external server	2020-03-11 21:01:27 +08:00
antirez	2542ff9c25	Test engine: experimental change to avoid busy port problems.	2020-02-27 18:02:30 +01:00
antirez	22ad06eafd	Test engine: detect timeout when checking for Redis startup.	2020-02-27 18:02:30 +01:00
antirez	2d9a144515	Test engine: better tracking of what workers are doing.	2020-02-27 18:00:47 +01:00
antirez	ab71e1020c	Test engine: experimental change to avoid busy port problems.	2020-02-24 10:46:23 +01:00
antirez	49b80848e7	Test engine: detect timeout when checking for Redis startup.	2020-02-21 18:55:56 +01:00
antirez	55bd09593d	Test engine: better tracking of what workers are doing.	2020-02-21 17:08:45 +01:00

102 Commits