The execution of the RPOPLPUSH command by the fuzzer created junk keys,
that were later being selected by RANDOMKEY and modified.
This also meant that lists were statistically tested more than other
files.
Fix the fuzzer not to pass junk key names to RPOPLPUSH, and add a check
that detects that new keys are not added by the fuzzer to detect future
similar issues.
(cherry picked from commit 3f3f678a4741e6af18230ee1862d9ced7af79faf)
this means the assertion that checks that when deep sanitization is enabled,
there are no crashes, was missing.
(cherry picked from commit 3db0f1a284e4fba703419b892b2d5b8d385afc06)
This was recently broken in #9321 when we validated stream IDs to be
integers but did that after to the stepping next record instead of before.
(cherry picked from commit 5a4ab7c7d2da1773c5ed3dcfc6e367b5af03a33e)
This commit mainly fixes empty keys due to RDB loading and restore command,
which was omitted in #9297.
1) When loading quicklsit, if all the ziplists in the quicklist are empty, NULL will be returned.
If only some of the ziplists are empty, then we will skip the empty ziplists silently.
2) When loading hash zipmap, if zipmap is empty, sanitization check will fail.
3) When loading hash ziplist, if ziplist is empty, NULL will be returned.
4) Add RDB loading test with sanitize.
(cherry picked from commit cbda492909cd2fff25263913cd2e1f00bc48a541)
Recently we found two issues in the fuzzer tester: #9302#9285
After fixing them, more problems surfaced and this PR (as well as #9297) aims to fix them.
Here's a list of the fixes
- Prevent an overflow when allocating a dict hashtable
- Prevent OOM when attempting to allocate a huge string
- Prevent a few invalid accesses in listpack
- Improve sanitization of listpack first entry
- Validate integrity of stream consumer groups PEL
- Validate integrity of stream listpack entry IDs
- Validate ziplist tail followed by extra data which start with 0xff
Co-authored-by: sundb <sundbcn@gmail.com>
(cherry picked from commit 0c90370e6d71cc68e4d9cc79a0d8b1e768712a5b)
When we load rdb or restore command, if we encounter a length of 0, it will result in the creation of an empty key.
This could either be a corrupt payload, or a result of a bug (see #8453 )
This PR mainly fixes the following:
1) When restore command will return `Bad data format` error.
2) When loading RDB, we will silently discard the key.
Co-authored-by: Oran Agra <oran@redislabs.com>
(cherry picked from commit 8ea777a6a02cae22aeff95f054d810f30b7b69ad)
When redis-cli received ASK, it used string matching wrong and didn't
handle it.
When we access a slot which is in migrating state, it maybe
return ASK. After redirect to the new node, we need send ASKING
command before retry the command. In this PR after redis-cli receives
ASK, we send a ASKING command before send the origin command
after reconnecting.
Other changes:
* Make redis-cli -u and -c (unix socket and cluster mode) incompatible
with one another.
* When send command fails, we avoid the 2nd reconnect retry and just
print the error info. Users will decide how to do next.
See #9277.
* Add a test faking two redis nodes in TCL to just send ASK and OK in
redis protocol to test ASK behavior.
Co-authored-by: Viktor Söderqvist <viktor.soderqvist@est.tech>
Co-authored-by: Oran Agra <oran@redislabs.com>
(cherry picked from commit cf61ad14cc45787e57d9af3f28f41462ac0f2aa2)
the test normally passes. but we saw one failure in a valgrind run in github actions
(cherry picked from commit 8458baf6a96fa6c6050bac24160f82d32a0b9ed4)
1. redis-cli can output --rdb data to stdout
but redis-cli also write some messages to stdout which will mess up the rdb.
2. Make redis-cli flush stdout when printing a reply
This was needed in order to fix a hung in redis-cli test that uses
--replica.
Note that printf does flush when there's a newline, but fwrite does not.
3. fix the redis-cli --replica test which used to pass previously
because it didn't really care what it read, and because redis-cli
used printf to print these other things to stdout.
4. improve redis-cli --replica test to run with both diskless and disk-based.
Co-authored-by: Oran Agra <oran@redislabs.com>
Co-authored-by: Viktor Söderqvist <viktor@zuiderkwast.se>
(cherry picked from commit 1eb4baa5b8e76adc337ae9fab49acc2585a0cdd0)
When test stop 'load handler' by killing the process that generating the load,
some commands that already in the input buffer, still might be processed by the server.
This may cause some instability in tests, that count on that no more commands
processed after we stop the `load handler'
In this commit, new proc 'wait_load_handlers_disconnected' added, to verify that no more
cammands from any 'load handler' prossesed, by checking that the clients who
genreate the load is disconnceted.
Also, replacing check of dbsize with wait_for_ofs_sync before comparing debug digest, as
it would fail in case the last key the workload wrote was an overridden key (not a new one).
Affected tests
Race fix:
- failover command to specific replica works
- Connect multiple replicas at the same time (issue #141), master diskless=$mdl, replica diskless=$sdl
- AOF rewrite during write load: RDB preamble=$rdbpre
Cleanup and speedup:
- Test replication with blocking lists and sorted sets operations
- Test replication with parallel clients writing in different DBs
- Test replication partial resync: $descr (diskless: $mdl, $sdl, reconnect: $reconnect
(cherry picked from commit 32a2584e079a1b3c2d1e6649e38239381a73a459)
In diskless replication, we create a read pipe for the RDB, between the child and the parent.
When we close this pipe (fd), the read handler also needs to be removed from the event loop (if it still registered).
Otherwise, next time we will use the same fd, the registration will be fail (panic), because
we will use EPOLL_CTL_MOD (the fd still register in the event loop), on fd that already removed from epoll_ctl
(cherry picked from commit 501d7755831527b4237f9ed6050ec84203934e4d)
When redis-check-aof finds an error, it prints the line number for faster troubleshooting.
(cherry picked from commit 761d7d27711edfbf737def41ff28f5b325fb16c8)
In github actions CI with valgrind, i saw that even the fast replica
(one that wasn't paused), didn't get to complete the replication fast
enough, and ended up getting disconnected by timeout.
Additionally, due to a typo in uname, we didn't get to actually run the
CPU efficiency part of the test.
Starting redis 6.0 (part of the TLS feature), diskless master uses pipe from the fork
child so that the parent is the one sending data to the replicas.
This mechanism has an issue in which a hung replica will cause the master to wait
for it to read the data sent to it forever, thus preventing the fork child from terminating
and preventing the creations of any other forks.
This PR adds a timeout mechanism, much like the ACK-based timeout,
we disconnect replicas that aren't reading the RDB file fast enough.
Another test race condition in the macos tests.
the test was waiting for PINGs to be generated and put on the replication stream,
but waiting for 1 or 2 seconds doesn't really guarantee that.
then the test that expected 6 full syncs, found only 4
the corrupt-dump-fuzzer test found a case where an access to a corrupt
stream would have caused accessing to uninitialized memory.
now it'll panic instead.
The issue was that there was a stream that says it has more than 0
records, but looking for the max ID came back empty handed.
p.s. when sanitize-dump-payload is used, this corruption is detected,
and the RESTORE command is gracefully rejected.
Since redis 6.2, redis immediately tries to connect to the master, not
waiting for replication cron.
in the slow freebsd CI, this test failed and master_link_status was
already "up" when INFO was called.
* The `redis-cli --scan` output should honor output mode (set explicitly or implicitly), and quote key names when not in raw mode.
* Technically this is a breaking change, but it should be very minor since raw mode is by default on for non-tty output.
* It should only affect TTY output (human users) or non-tty output if `--no-raw` is specified.
* Added `--quoted-input` option to treat all arguments as potentially quoted strings.
* Added `--quoted-pattern` option to accept a potentially quoted pattern.
Unquoting is applied to potentially quoted input only if single or double quotes are used.
Fixes#8561, #8563
When sanitizing the stream listpack, we need to count the deleted records too.
otherwise the last line that checks the next pointer fails.
Add test to cover that state in the stream tests.
* Remove linux/version.h dependency.
This introduces unnecessary dependencies, and generally not a good idea
as the platform we build on may be different than the platform we run
on.
To determine if sync_file_range exists we can simply rely on header file
hints.
* Fix setproctitle() on libmusl.
The previous ifdef checks were a bit too strict for no apparent
reason.
* Fix tests failure on Linux with no backtrace.
* Add alpine daily CI job.
* Adding current_save_keys_total and current_save_keys_processed info fields.
Present in replication, BGSAVE and AOFRW.
* Changing RM_SendChildCOWInfo() to RM_SendChildHeartbeat(double progress)
* Adding new info field current_fork_perc. Present in Replication, BGSAVE, AOFRW,
and module forks.
* Add bash temporarily to allow sentinel fd leaks test to run.
* Use vmactions-freebsd rdist sync to work around bind permission denied
and slow execution issues.
* Upgrade to tcl8.6 to be aligned with latest Ubuntu envs.
* Concat all command executions to avoid ignoring failures.
* Skip intensive fuzzer on FreeBSD. For some yet unknown reason, generate_fuzzy_traffic_on_key causes TCL to significantly bloat on FreeBSD resulting with out of memory.
* The corrupt dump fuzzer found a division by zero.
* in some cases the random fields from the HRANDFIELD tests produced
fields with newlines and other special chars (due to \ char), this caused
the TCL tests to see a bulk response that has a newline in it and add {}
around it, later it can think this is a nested list. in fact the `alpha` random
string generator isn't using spaces and newlines, so it should not use `\`
either.