futriix

Author	SHA1	Message	Date
Oran Agra	73d286d523	Fix stream sanitization for non-int first value (#9553) This was recently broken in #9321 when we validated stream IDs to be integers but did that after stepping to the next record instead of before. (cherry picked from commit 5a4ab7c7d2da1773c5ed3dcfc6e367b5af03a33e)	2021-10-04 13:59:40 +03:00
sundb	a2e8a3a241	Sanitize dump payload: fix double free after insert dup nodekey to stream rax and returns 0 (#9399 ) (cherry picked from commit 492d8d09613cff88f15dcef98732392b8d509eb1)	2021-10-04 13:59:40 +03:00
sundb	09c63c45dd	Sanitize dump payload: handle remaining empty key when RDB loading and restore command (#9349) This commit mainly fixes empty keys due to RDB loading and restore command, which was omitted in #9297. 1) When loading quicklist, if all the ziplists in the quicklist are empty, NULL will be returned. If only some of the ziplists are empty, then we will skip the empty ziplists silently. 2) When loading hash zipmap, if zipmap is empty, sanitization check will fail. 3) When loading hash ziplist, if ziplist is empty, NULL will be returned. 4) Add RDB loading test with sanitize. (cherry picked from commit cbda492909cd2fff25263913cd2e1f00bc48a541)	2021-10-04 13:59:40 +03:00
Oran Agra	4b04ca0b18	Improvements to corrupt payload sanitization (#9321 ) Recently we found two issues in the fuzzer tester: #9302 #9285 After fixing them, more problems surfaced and this PR (as well as #9297) aims to fix them. Here's a list of the fixes - Prevent an overflow when allocating a dict hashtable - Prevent OOM when attempting to allocate a huge string - Prevent a few invalid accesses in listpack - Improve sanitization of listpack first entry - Validate integrity of stream consumer groups PEL - Validate integrity of stream listpack entry IDs - Validate ziplist tail followed by extra data which start with 0xff Co-authored-by: sundb <sundbcn@gmail.com> (cherry picked from commit 0c90370e6d71cc68e4d9cc79a0d8b1e768712a5b)	2021-10-04 13:59:40 +03:00
sundb	2f54107289	Sanitize dump payload: fix empty keys when RDB loading and restore command (#9297) When we load an RDB or run the restore command, if we encounter a length of 0, it will result in the creation of an empty key. This could either be a corrupt payload, or a result of a bug (see #8453). This PR mainly fixes the following: 1) The restore command will return a `Bad data format` error. 2) When loading RDB, we will silently discard the key. Co-authored-by: Oran Agra <oran@redislabs.com> (cherry picked from commit 8ea777a6a02cae22aeff95f054d810f30b7b69ad)	2021-10-04 13:59:40 +03:00
Viktor Söderqvist	77386ae011	redis-cli ASK redirect test: Add retry loop to fix timing issue (#9315 ) (cherry picked from commit 1c59567a7fe207997eef6197eefa7d508d7fbf9f)	2021-10-04 13:59:40 +03:00
Oran Agra	0c959294a8	Skip new redis-cli ASK test in TLS mode (#9312 ) (cherry picked from commit 52df350fe59d73e6a1a4a5fb3c2b91d5c62f5a76)	2021-10-04 13:59:40 +03:00
Huang Zhw	8892b5cf9e	When redis-cli received ASK, it didn't handle it (#8930) When redis-cli received ASK, it used string matching incorrectly and didn't handle it. When we access a slot which is in migrating state, the node may return ASK. After redirecting to the new node, we need to send an ASKING command before retrying the command. In this PR, after redis-cli receives ASK, we send an ASKING command before sending the original command after reconnecting. Other changes: * Make redis-cli -u and -c (unix socket and cluster mode) incompatible with one another. * When sending a command fails, we avoid the 2nd reconnect retry and just print the error info. Users will decide what to do next. See #9277. * Add a test faking two redis nodes in TCL to just send ASK and OK in redis protocol to test ASK behavior. Co-authored-by: Viktor Söderqvist <viktor.soderqvist@est.tech> Co-authored-by: Oran Agra <oran@redislabs.com> (cherry picked from commit cf61ad14cc45787e57d9af3f28f41462ac0f2aa2)	2021-10-04 13:59:40 +03:00
Oran Agra	bae0512c8a	longer timeout in replication test (#8963 ) the test normally passes. but we saw one failure in a valgrind run in github actions (cherry picked from commit 8458baf6a96fa6c6050bac24160f82d32a0b9ed4)	2021-07-21 21:06:49 +03:00
Mikhail Fesenko	8884971223	Direct redis-cli repl prints to stderr, because --rdb can print to stdout. fflush stdout after responses (#9136) 1. redis-cli can output --rdb data to stdout but redis-cli also writes some messages to stdout which will mess up the rdb. 2. Make redis-cli flush stdout when printing a reply. This was needed in order to fix a hang in a redis-cli test that uses --replica. Note that printf does flush when there's a newline, but fwrite does not. 3. fix the redis-cli --replica test which used to pass previously because it didn't really care what it read, and because redis-cli used printf to print these other things to stdout. 4. improve redis-cli --replica test to run with both diskless and disk-based. Co-authored-by: Oran Agra <oran@redislabs.com> Co-authored-by: Viktor Söderqvist <viktor@zuiderkwast.se> (cherry picked from commit 1eb4baa5b8e76adc337ae9fab49acc2585a0cdd0)	2021-07-21 21:06:49 +03:00
YaacovHazan	ff27217639	stabilize tests that involve load handlers (#8967) When a test stops a 'load handler' by killing the process that generates the load, some commands that are already in the input buffer might still be processed by the server. This may cause instability in tests that rely on no more commands being processed after we stop the 'load handler'. In this commit, a new proc 'wait_load_handlers_disconnected' is added to verify that no more commands from any 'load handler' are processed, by checking that the clients that generate the load are disconnected. Also, replacing the check of dbsize with wait_for_ofs_sync before comparing debug digest, as it would fail in case the last key the workload wrote was an overridden key (not a new one). Affected tests Race fix: - failover command to specific replica works - Connect multiple replicas at the same time (issue #141), master diskless=$mdl, replica diskless=$sdl - AOF rewrite during write load: RDB preamble=$rdbpre Cleanup and speedup: - Test replication with blocking lists and sorted sets operations - Test replication with parallel clients writing in different DBs - Test replication partial resync: $descr (diskless: $mdl, $sdl, reconnect: $reconnect) (cherry picked from commit 32a2584e079a1b3c2d1e6649e38239381a73a459)	2021-07-21 21:06:49 +03:00
YaacovHazan	5102c0da92	unregister AE_READABLE from the read pipe in backgroundSaveDoneHandlerSocket (#8991) In diskless replication, we create a read pipe for the RDB, between the child and the parent. When we close this pipe (fd), the read handler also needs to be removed from the event loop (if it is still registered). Otherwise, the next time we use the same fd, the registration will fail (panic), because we will use EPOLL_CTL_MOD (the fd is still registered in the event loop) on an fd that was already removed from epoll. (cherry picked from commit 501d7755831527b4237f9ed6050ec84203934e4d)	2021-06-01 17:03:36 +03:00
bugwz	0851705304	Print the number of abnormal line in AOF (#8823 ) When redis-check-aof finds an error, it prints the line number for faster troubleshooting. (cherry picked from commit 761d7d27711edfbf737def41ff28f5b325fb16c8)	2021-05-03 22:57:00 +03:00
Oran Agra	a9897b0084	Fix timing of new replication test (#8807 ) In github actions CI with valgrind, i saw that even the fast replica (one that wasn't paused), didn't get to complete the replication fast enough, and ended up getting disconnected by timeout. Additionally, due to a typo in uname, we didn't get to actually run the CPU efficiency part of the test.	2021-04-18 15:12:34 +03:00
guybe7	d63d02601f	Add a timeout mechanism for replicas stuck in fullsync (#8762 ) Starting redis 6.0 (part of the TLS feature), diskless master uses pipe from the fork child so that the parent is the one sending data to the replicas. This mechanism has an issue in which a hung replica will cause the master to wait for it to read the data sent to it forever, thus preventing the fork child from terminating and preventing the creations of any other forks. This PR adds a timeout mechanism, much like the ACK-based timeout, we disconnect replicas that aren't reading the RDB file fast enough.	2021-04-15 17:18:51 +03:00
Oran Agra	cd81dcf18b	solve race conditions in psync2-pingoff test (#8720 ) Another test race condition in the macos tests. the test was waiting for PINGs to be generated and put on the replication stream, but waiting for 1 or 2 seconds doesn't really guarantee that. then the test that expected 6 full syncs, found only 4	2021-03-30 11:41:06 +03:00
Qu Chen	7de6451818	Properly initialize variable to make valgrind happy in checkChildrenDone(). Removed usage for the obsolete wait3() and wait4() in favor of waitpid(), and properly check for the exit status code. (#8666 )	2021-03-24 08:41:05 -07:00
Oran Agra	f6e1a94e03	Corrupt stream key access to uninitialized memory (#8681) the corrupt-dump-fuzzer test found a case where an access to a corrupt stream would have caused access to uninitialized memory. now it'll panic instead. The issue was that there was a stream that says it has more than 0 records, but looking for the max ID came back empty-handed. p.s. when sanitize-dump-payload is used, this corruption is detected, and the RESTORE command is gracefully rejected.	2021-03-24 11:33:49 +02:00
Oran Agra	a7c02b19bf	Fix race in replication test (#8679 ) Since redis 6.2, redis immediately tries to connect to the master, not waiting for replication cron. in the slow freebsd CI, this test failed and master_link_status was already "up" when INFO was called.	2021-03-22 10:50:39 +02:00
Yossi Gottlieb	3c7d6a1853	Improve redis-cli non-binary safe string handling. (#8566 ) * The `redis-cli --scan` output should honor output mode (set explicitly or implicitly), and quote key names when not in raw mode. * Technically this is a breaking change, but it should be very minor since raw mode is by default on for non-tty output. * It should only affect TTY output (human users) or non-tty output if `--no-raw` is specified. * Added `--quoted-input` option to treat all arguments as potentially quoted strings. * Added `--quoted-pattern` option to accept a potentially quoted pattern. Unquoting is applied to potentially quoted input only if single or double quotes are used. Fixes #8561, #8563	2021-03-04 15:03:49 +02:00
Yossi Gottlieb	5d180d2834	Fix potential replication-4 test race condition. (#8583 ) Co-authored-by: Oran Agra <oran@redislabs.com>	2021-03-02 18:12:11 +02:00
Oran Agra	349ef3f6a0	fix stream deep sanitization with deleted records (#8568 ) When sanitizing the stream listpack, we need to count the deleted records too. otherwise the last line that checks the next pointer fails. Add test to cover that state in the stream tests.	2021-03-01 17:23:29 +02:00
Yossi Gottlieb	95ea74549c	Fix failed tests on Linux Alpine and add a CI job. (#8532 ) * Remove linux/version.h dependency. This introduces unnecessary dependencies, and generally not a good idea as the platform we build on may be different than the platform we run on. To determine if sync_file_range exists we can simply rely on header file hints. * Fix setproctitle() on libmusl. The previous ifdef checks were a bit too strict for no apparent reason. * Fix tests failure on Linux with no backtrace. * Add alpine daily CI job.	2021-02-23 12:57:45 +02:00
uriyage	fd052d2a86	Adds INFO fields to track fork child progress (#8414 ) * Adding current_save_keys_total and current_save_keys_processed info fields. Present in replication, BGSAVE and AOFRW. * Changing RM_SendChildCOWInfo() to RM_SendChildHeartbeat(double progress) * Adding new info field current_fork_perc. Present in Replication, BGSAVE, AOFRW, and module forks.	2021-02-16 16:06:51 +02:00
Yossi Gottlieb	141ac8df59	Escape unsafe field name characters in INFO. (#8492 ) Fixes #8489	2021-02-15 17:08:53 +02:00
Oran Agra	30775bc3e3	solve race in replication-2 test - again (#8491 ) this should make it timing independent and also faster in most cases	2021-02-15 12:50:23 +02:00
Oran Agra	02ab14cc2e	solve race in replication-2 test (#8461 ) use SIGSTOP instead of DEBUG SLEEP, reduces the test time by some 2 seconds and avoids failures on slow machines	2021-02-07 16:22:30 +02:00
Yossi Gottlieb	de6f3ad017	Fix FreeBSD tests and CI Daily issues. (#8438 ) * Add bash temporarily to allow sentinel fd leaks test to run. * Use vmactions-freebsd rdist sync to work around bind permission denied and slow execution issues. * Upgrade to tcl8.6 to be aligned with latest Ubuntu envs. * Concat all command executions to avoid ignoring failures. * Skip intensive fuzzer on FreeBSD. For some yet unknown reason, generate_fuzzy_traffic_on_key causes TCL to significantly bloat on FreeBSD resulting with out of memory.	2021-02-03 17:35:28 +02:00
Oran Agra	5a7eb9c881	Fix test issues from introduction of HRANDFIELD (#8424 ) * The corrupt dump fuzzer found a division by zero. * in some cases the random fields from the HRANDFIELD tests produced fields with newlines and other special chars (due to \ char), this caused the TCL tests to see a bulk response that has a newline in it and add {} around it, later it can think this is a nested list. in fact the `alpha` random string generator isn't using spaces and newlines, so it should not use `\` either.	2021-01-31 12:13:45 +02:00
Allen Farris	0d18a1e85f	implement FAILOVER command (#8315 ) Implement FAILOVER command, which coordinates failover between the server and one of its replicas.	2021-01-28 13:18:05 -08:00
Raghav Muddur	0367a80819	GETEX, GETDEL and SET PXAT/EXAT (#8327) This commit introduces two new commands and two options for an existing command. GETEX <key> [PERSIST][EX seconds][PX milliseconds] [EXAT seconds-timestamp] [PXAT milliseconds-timestamp] The getexCommand() function implements extended options and variants of the GET command. Unlike the GET command, this command is not read-only. Only one of the options can be used at a given time. 1. PERSIST removes any TTL associated with the key. 2. EX sets the expiry TTL in seconds. 3. PX sets the expiry TTL in milliseconds. 4. EXAT is the same as EX, but instead of specifying the number of seconds representing the TTL (time to live), it takes an absolute Unix timestamp. 5. PXAT is the same as PX, but instead of specifying the number of milliseconds representing the TTL (time to live), it takes an absolute Unix timestamp. The command returns either the bulk string, an error or nil. GETDEL <key> deletes the key after getting it. SET key value [NX] [XX] [KEEPTTL] [GET] [EX <seconds>] [PX <milliseconds>] [EXAT <seconds-timestamp>][PXAT <milliseconds-timestamp>] The two new options added here are EXAT and PXAT. Key implementation notes - `SET` with `PX/EX/EXAT/PXAT` is always translated to `PXAT` in `AOF`. When relative time is specified (`PX/EX`), replication will always use `PX`. - `setexCommand` and `psetexCommand` no longer need translation in `feedAppendOnlyFile` as they are modified to invoke `setGenericCommand` with appropriate flags which will take care of correct AOF translation. - `GETEX` without any optional argument behaves like `GET`. - The `GETEX` command is never propagated as is; it is propagated as either `PEXPIRE[AT]` or `PERSIST`. - The `GETDEL` command is propagated as `DEL`. - Combined the validation for `SET` and `GETEX` arguments. - Test cases to validate AOF/Replication propagation	2021-01-27 19:47:26 +02:00
Yossi Gottlieb	522d93607a	Add io-thread daily CI tests. (#8232 ) This adds basic coverage to IO threads by running the cluster and few selected Redis test suite tests with the IO threads enabled. Also provides some necessary additional improvements to the test suite: * Add --config to sentinel/cluster tests for arbitrary configuration. * Fix --tags whitelisting which was broken. * Add a `network` tag to some tests that are more network intensive. This is work in progress and more tests should be properly tagged in the future.	2021-01-17 15:48:48 +02:00
Oran Agra	8dd16caec8	Fix last COW INFO report, Skip test on non-linux platforms (#8301 ) - the last COW report wasn't always read from the pipe (receiveLastChildInfo wasn't used) - but in fact, there's no reason we won't always try to drain that pipe so i'm unifying receiveLastChildInfo with receiveChildInfo - adjust threshold of the COW test when run in accurate mode - add some prints in case this test fails again - fix indentation, page size, and PID! in MacOS proc info p.s. it seems that pri_pages_dirtied is always 0	2021-01-08 23:35:30 +02:00
YaacovHazan	ea930a352c	Report child copy-on-write info continuously Add INFO field, rdb_active_cow_size, to report COW of a live fork child while it's active. - once in 1024 keys check the time, and if there's more than one second since the last report send a report to the parent via the pipe. - refactor the child_info_data struct, it's an implementation detail that shouldn't be in the server struct, and not used to communicate data between caller and callee - remove the magic value from that struct (not sure what it was good for), and instead add handling of short reads. - add another value to the structure, cow_type, to indicate if the report is for the new rdb_active_cow_size field, or it's the last report of a successful operation - add new Module API to report the active COW - add more asserts variants to test.tcl	2021-01-07 16:14:29 +02:00
Oran Agra	cfb449cc80	Sanitize dump payload: excessive free on dup zset fields (#8189 )	2020-12-14 17:10:31 +02:00
Oran Agra	7ca00d694d	Sanitize dump payload: fail RESTORE if memory allocation fails When RDB input attempts to make a huge memory allocation that fails, RESTORE should fail gracefully rather than die with panic	2020-12-06 14:54:34 +02:00
Oran Agra	3716950cfc	Sanitize dump payload: validate no duplicate records in hash/zset/intset If RESTORE passes successfully with full sanitization, we can't afford to crash later on assertion due to duplicate records in a hash when converting it from ziplist to dict. This means that when doing full sanitization, we must make sure there are no duplicate records in any of the collections.	2020-12-06 14:54:34 +02:00
Oran Agra	c31055db61	Sanitize dump payload: fuzz tester and fixes for segfaults and leaks it exposed The test creates keys with various encodings, DUMPs them, corrupts the payload and RESTOREs it. It utilizes the recently added use-exit-on-panic config to distinguish between asserts and segfaults. If the restore succeeds, it runs random commands on the key to attempt to trigger a crash. It runs in two modes, one with deep sanitation enabled and one without. In the first one we don't expect any assertions or segfaults, in the second one we expect assertions, but no segfaults. We also check for leaks and invalid reads using valgrind, and if we find them we print the commands that led to that issue. Changes in the code (other than the test): - Replace a few NPD (null pointer dereference) flows and division by zero with an assertion, so that it doesn't fail the test. (since we set the server to use `exit` rather than `abort` on assertion). - Fix quite a lot of flows in rdb.c that could have led to memory leaks in RESTORE command (since it now responds with an error rather than panic) - Add a DEBUG flag for SET-SKIP-CHECKSUM-VALIDATION so that the test doesn't need to bother with faking a valid checksum - Remove a pile of code in serverLogObjectDebugInfo which is actually unsafe to run in the crash report (see comments in the code) - fix a missing boundary check in lzf_decompress test suite infra improvements: - be able to run valgrind checks before the process terminates - rotate log files when restarting servers	2020-12-06 14:54:34 +02:00
Oran Agra	01c13bddea	Sanitize dump payload: improve tests of ziplist and stream encodings - improve stream rdb encoding test to include more types of stream metadata - add test to cover various ziplist encoding entries (although it does look like the stress test above it is able to find some too) - add another test for ziplist encoding for hash with full sanitization - add similar ziplist encoding tests for list	2020-12-06 14:54:34 +02:00
Oran Agra	ca1c182567	Sanitize dump payload: ziplist, listpack, zipmap, intset, stream When loading an encoded payload we will at least do a shallow validation to check that the size that's encoded in the payload matches the size of the allocation. This lets us later use this encoded size to make sure the various offsets inside the encoded payload don't reach outside the allocation; if they do, we'll assert/panic, but at least we won't segfault or smear memory. We can also do 'deep' validation which runs on all the records of the encoded payload and validates that they don't contain invalid offsets. This lets us detect corruptions early and reject a RESTORE command rather than accepting it and asserting (crashing) later when accessing that payload via some command. configuration: - adding ACL flag skip-sanitize-payload - adding config sanitize-dump-payload [yes/no/clients] For now, we don't have a good way to ensure MIGRATE in cluster resharding isn't being slowed down by this sanitization, so i'm setting the default value to `no`, but later on it should be set to `clients` by default. changes: - changing rdbReportError not to `exit` in RESTORE command - adding a new stat to be able to later check if cluster MIGRATE isn't being slowed down by sanitization.	2020-12-06 14:54:34 +02:00
luhuachao	7885faf18b	Modify help msg PING_BULK to PING_MBULK in benchmark (#8109) As described in the redis-benchmark help message 'The test names are the same as the ones produced as output.', in redis-benchmark output we can only see PING_BULK, but the cmd `redis-benchmark -t ping_bulk` is not supported. We have to run it with ping_mbulk, which is not user friendly.	2020-12-02 13:17:25 +02:00
filipe oliveira	10b5006934	Enable specifying TLS ciphers(suites) in redis-cli/redis-benchmark (#8005 ) Enable specifying the preferred ciphers and/or ciphersuites for redis-cli/redis-benchmark. Co-authored-by: Yossi Gottlieb <yossigo@gmail.com>	2020-11-04 14:49:15 +02:00
Meir Shpilraien (Spielrein)	762be79f0b	Disable new SIGABRT test on valgrind (#8013 ) The crash reports cause false-positive warnings when run with valgrind.	2020-11-04 13:13:55 +02:00
Meir Shpilraien (Spielrein)	f210e197f3	Added crash report on SIGABRT (#8004) The reason that we want to get a full crash report on SIGABRT is that jemalloc, when detecting a corruption, calls abort(). This will cause Redis to exit silently without any report and without any way to analyze what happened.	2020-11-03 14:59:21 +02:00
Oran Agra	9122379abc	Propagate GETSET and SET-GET as SET (#7957 ) - Generates a more backwards compatible command stream - Slightly more efficient execution in replica/AOF - Add a test for coverage	2020-11-03 14:56:57 +02:00
Wang Yuan	dc899c4c88	Fix timing dependence in replication tcl tests (#7969 ) Remove 'fork child $pid' log in replication.tcl	2020-10-27 09:36:42 +02:00
filipe oliveira	01acfa71ca	redis-benchmark: add tests, --version, and minor bug fixes (#7947) - add test suite coverage for redis-benchmark - add --version (similar to what redis-cli has) - fix bug sending more requests than intended when pipeline > 1. - when done sending requests, avoid freeing client in the write handler, in theory before responses are received (probably dead code since the read handler will call clientDone first) Co-authored-by: Oran Agra <oran@redislabs.com>	2020-10-26 08:04:59 +02:00
Yossi Gottlieb	843a13e88f	Add a --no-latency tests flag. (#7939 ) Useful for running tests on systems which may be way slower than usual.	2020-10-22 11:10:53 +03:00
Felipe Machado	c3f9e01794	Adds new pop-push commands (LMOVE, BLMOVE) (#6929) Adding [B]LMOVE <src> <dst> RIGHT|LEFT RIGHT|LEFT. deprecating [B]RPOPLPUSH. Note that when receiving a BRPOPLPUSH we'll still propagate an RPOPLPUSH, but on BLMOVE RIGHT LEFT we'll propagate an LMOVE. improvement to existing tests - Replace "after 1000" with "wait_for_condition" when waiting for clients to block/unblock. - Add a pre-existing element to target list on basic tests so that we can check if the new element was added to the correct side of the list. - check command stats on the replica to make sure the right command was replicated Co-authored-by: Oran Agra <oran@redislabs.com>	2020-10-08 08:33:17 +03:00
Wang Yuan	1bb5794a1f	Kill disk-based fork child when all replicas drop and 'save' is not enabled (#7819 ) When all replicas waiting for a bgsave get disconnected (possibly due to output buffer limit), It may be good to kill the bgsave child. in diskless replication it already happens, but in disk-based, the child may still serve some purpose (for persistence). By killing the child, we prevent it from eating COW memory in vain, and we also allow a new child fork sooner for the next full synchronization or bgsave. We do that only if rdb persistence wasn't enabled in the configuration. Btw, now, rdbRemoveTempFile in killRDBChild won't block server, so we can killRDBChild safely.	2020-09-22 09:47:58 +03:00
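
Note on commit 0367a80819 (GETEX, GETDEL and SET PXAT/EXAT): the message states that `SET` with any of `EX`, `PX`, `EXAT` or `PXAT` is always translated to `PXAT` in the AOF. The following is only a rough Python sketch of that kind of normalization, not the Redis source. The function name `to_pxat` and the `now_ms` parameter are hypothetical and introduced here purely for illustration.

```python
def to_pxat(option: str, value: int, now_ms: int) -> int:
    """Convert an expiry option to an absolute Unix time in milliseconds.

    Illustrative only: `to_pxat` and `now_ms` are assumed names,
    not identifiers taken from the Redis code base.
    """
    opt = option.upper()
    if opt == "EX":      # relative TTL in seconds
        return now_ms + value * 1000
    if opt == "PX":      # relative TTL in milliseconds
        return now_ms + value
    if opt == "EXAT":    # absolute Unix timestamp in seconds
        return value * 1000
    if opt == "PXAT":    # already an absolute timestamp in milliseconds
        return value
    raise ValueError(f"unsupported expiry option: {option}")
```

Writing an absolute timestamp means that replaying the log later still yields the same expiry time, whereas a relative TTL would restart its countdown at replay time.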


215 Commits