This affects later distribution of clients among threads if there had been many connection attempts during loading phase.
Former-commit-id: 889bcd1bf1adeb246af04bbeb7f9e51c0c4eff1b
The execution of the RPOPLPUSH command by the fuzzer created junk keys,
that were later being selected by RANDOMKEY and modified.
This also meant that lists were statistically tested more than other
files.
Fix the fuzzer not to pass junk key names to RPOPLPUSH, and add a check
that detects that new keys are not added by the fuzzer to detect future
similar issues.
(cherry picked from commit 3f3f678a4741e6af18230ee1862d9ced7af79faf)
this means the assertion that checks that when deep sanitization is enabled,
there are no crashes, was missing.
(cherry picked from commit 3db0f1a284e4fba703419b892b2d5b8d385afc06)
This was recently broken in #9321 when we validated stream IDs to be
integers but did that after to the stepping next record instead of before.
(cherry picked from commit 5a4ab7c7d2da1773c5ed3dcfc6e367b5af03a33e)
A write request may be paused unexpectedly because `server.client_pause_end_time` is old.
**Recreate this:**
redis-cli -p 6379
127.0.0.1:6379> client pause 500000000 write
OK
127.0.0.1:6379> client unpause
OK
127.0.0.1:6379> client pause 10000 write
OK
127.0.0.1:6379> set key value
The write request `set key value` is paused util the timeout of 500000000 milliseconds was reached.
**Fix:**
reset `server.client_pause_end_time` = 0 in `unpauseClients`
(cherry picked from commit f560531d5b8a6e6d810b62114e69a5ffda7730f7)
When a replica paused, it would not apply any commands event the command comes from master, if we feed the non-applied command to replication stream, the replication offset would be wrong, and data would be lost after failover(since replica's `master_repl_offset` grows but command is not applied).
To fix it, here are the changes:
* Don't update replica's replication offset or propagate commands to sub-replicas when it's paused in `commandProcessed`.
* Show `slave_read_repl_offset` in info reply.
* Add an assert to make sure master client should never be blocked unless pause or module (some modules may use block way to do background (parallel) processing and forward original block module command to the replica, it's not a good way but it can work, so the assert excludes module now, but someday in future all modules should rewrite block command to propagate like what `BLPOP` does).
(cherry picked from commit 1b83353dc382959e218191f64d94edb9703552e3)
1. MIGRATE has a potnetial key arg in argv[3]. It should be reflected in the command table.
2. getKeysUsingCommandTable should never free getKeysResult, it is always freed by the caller)
The reason we never encountered this double-free bug is that almost always getKeysResult
uses the statis buffer and doesn't allocate a new one.
(cherry picked from commit 6aa2285e32a6bc16fe2938bfb40d833db7d3752d)
Normally we execute the read event first and then the write event.
When the barrier is set, we will do it reverse.
However, under `kqueue`, if an `fd` has both read and write events,
reading the event using `kevent` will generate two events, which will
result in uncontrolled read and write timing.
This also means that the guarantees of AOF `appendfsync` = `always` are
not met on MacOS without this fix.
The main change to this pr is to cache the events already obtained when reading
them, so that if the same `fd` occurs again, only the mask in the cache is updated,
rather than a new event is generated.
This was exposed by the following test failure on MacOS:
```
*** [err]: AOF fsync always barrier issue in tests/integration/aof.tcl
Expected 544 != 544 (context: type eval line 26 cmd {assert {$size1 != $size2}} proc ::test)
```
(cherry picked from commit 306a5ccd2d053ff653988b61a779e3cbce408874)
If we want to check `defined(SYNC_FILE_RANGE_WAIT_BEFORE)`, we should include fcntl.h.
otherwise, SYNC_FILE_RANGE_WAIT_BEFORE is not defined, and there is alway not `sync_file_range` system call.
Introduced by #8532
(cherry picked from commit 8edc3cd62c0d0508b68c887610ca53b632b8165b)
This commit mainly fixes empty keys due to RDB loading and restore command,
which was omitted in #9297.
1) When loading quicklsit, if all the ziplists in the quicklist are empty, NULL will be returned.
If only some of the ziplists are empty, then we will skip the empty ziplists silently.
2) When loading hash zipmap, if zipmap is empty, sanitization check will fail.
3) When loading hash ziplist, if ziplist is empty, NULL will be returned.
4) Add RDB loading test with sanitize.
(cherry picked from commit cbda492909cd2fff25263913cd2e1f00bc48a541)
Recently we found two issues in the fuzzer tester: #9302#9285
After fixing them, more problems surfaced and this PR (as well as #9297) aims to fix them.
Here's a list of the fixes
- Prevent an overflow when allocating a dict hashtable
- Prevent OOM when attempting to allocate a huge string
- Prevent a few invalid accesses in listpack
- Improve sanitization of listpack first entry
- Validate integrity of stream consumer groups PEL
- Validate integrity of stream listpack entry IDs
- Validate ziplist tail followed by extra data which start with 0xff
Co-authored-by: sundb <sundbcn@gmail.com>
(cherry picked from commit 0c90370e6d71cc68e4d9cc79a0d8b1e768712a5b)
When we load rdb or restore command, if we encounter a length of 0, it will result in the creation of an empty key.
This could either be a corrupt payload, or a result of a bug (see #8453 )
This PR mainly fixes the following:
1) When restore command will return `Bad data format` error.
2) When loading RDB, we will silently discard the key.
Co-authored-by: Oran Agra <oran@redislabs.com>
(cherry picked from commit 8ea777a6a02cae22aeff95f054d810f30b7b69ad)
Fix that there is no sample latency after the key expires via expireIfNeeded().
Some refactoring for shared code.
(cherry picked from commit ca559819f7dcd97ba9ef667bf38360a9527d62f6)
When redis-cli received ASK, it used string matching wrong and didn't
handle it.
When we access a slot which is in migrating state, it maybe
return ASK. After redirect to the new node, we need send ASKING
command before retry the command. In this PR after redis-cli receives
ASK, we send a ASKING command before send the origin command
after reconnecting.
Other changes:
* Make redis-cli -u and -c (unix socket and cluster mode) incompatible
with one another.
* When send command fails, we avoid the 2nd reconnect retry and just
print the error info. Users will decide how to do next.
See #9277.
* Add a test faking two redis nodes in TCL to just send ASK and OK in
redis protocol to test ASK behavior.
Co-authored-by: Viktor Söderqvist <viktor.soderqvist@est.tech>
Co-authored-by: Oran Agra <oran@redislabs.com>
(cherry picked from commit cf61ad14cc45787e57d9af3f28f41462ac0f2aa2)
There's an infinite loop when redis-cli fails to connect in cluster mode.
This commit adds a 1 second sleep to prevent flooding the console with errors.
It also adds a specific error print in a few places that could have error without printing anything.
Co-authored-by: Oran Agra <oran@redislabs.com>
(cherry picked from commit 8351a10b959364cff9fc026188ebc9c653ef230a)
when SELECT fails, we should reset dbnum to 0, so the prompt will not
display incorrectly.
Additionally when SELECT and HELLO fail, we output message to inform
it.
Add config.input_dbnum which means the dbnum about to select.
And config.dbnum means currently selected dbnum. When users succeed to
select db, config.dbnum and config.input_dbnum will be the same. When
users select db failed, config.input_dbnum will be kept. Next time if users
auth success, config.input_dbnum will be automatically selected.
When reconnect, we should select the origin dbnum.
Co-authored-by: Oran Agra <oran@redislabs.com>
(cherry picked from commit 6b475989984bb28499327e33cc79315d6264bc06)
With an empty src key, we need to deal with two situations:
1. non-STORE: We should return emptyarray.
2. STORE: Try to delete the store key and return 0.
This applies to both GEOSEARCHSTORE (new to v6.2), and
also GEORADIUS STORE (which was broken since forever)
This pr try to fix#9261. i.e. both STORE variants would have behaved
like the non-STORE variants when the source key was missing,
returning an empty array and not deleting the destination key,
instead of returning 0, and deleting the destination key.
Also add more tests for some commands.
- GEORADIUS: wrong type src key, non existing src key, empty search,
store with non existing src key, store with empty search
- GEORADIUSBYMEMBER: wrong type src key, non existing src key,
non existing member, store with non existing src key
- GEOSEARCH: wrong type src key, non existing src key, empty search,
frommember with non existing member
- GEOSEARCHSTORE: wrong type key, non existing src key,
fromlonlat with empty search, frommember with non existing member
Co-authored-by: Oran Agra <oran@redislabs.com>
(cherry picked from commit 86555ae0f7cc45abac7f758d72bf456e90793b46)
- fix possible heap corruption in ziplist and listpack resulting by trying to
allocate more than the maximum size of 4GB.
- prevent ziplist (hash and zset) from reaching size of above 1GB, will be
converted to HT encoding, that's not a useful size.
- prevent listpack (stream) from reaching size of above 1GB.
- XADD will start a new listpack if the new record may cause the previous
listpack to grow over 1GB.
- XADD will respond with an error if a single stream record is over 1GB
- List type (ziplist in quicklist) was truncating strings that were over 4GB,
now it'll respond with an error.
When LUA call our C code, by default, the LUA stack has room for 20
elements. In most cases, this is more than enough but sometimes it's not
and the caller must verify the LUA stack size before he pushes elements.
On 3 places in the code, there was no verification of the LUA stack size.
On specific inputs this missing verification could have lead to invalid
memory write:
1. On 'luaReplyToRedisReply', one might return a nested reply that will
explode the LUA stack.
2. On 'redisProtocolToLuaType', the Redis reply might be deep enough
to explode the LUA stack (notice that currently there is no such
command in Redis that returns such a nested reply, but modules might
do it)
3. On 'ldbRedis', one might give a command with enough arguments to
explode the LUA stack (all the arguments will be pushed to the LUA
stack)
This commit is solving all those 3 issues by calling 'lua_checkstack' and
verify that there is enough room in the LUA stack to push elements. In
case 'lua_checkstack' returns an error (there is not enough room in the
LUA stack and it's not possible to increase the stack), we will do the
following:
1. On 'luaReplyToRedisReply', we will return an error to the user.
2. On 'redisProtocolToLuaType' we will exit with panic (we assume this
scenario is rare because it can only happen with a module).
3. On 'ldbRedis', we return an error.
The protocol parsing on 'ldbReplParseCommand' (LUA debugging)
Assumed protocol correctness. This means that if the following
is given:
*1
$100
test
The parser will try to read additional 94 unallocated bytes after
the client buffer.
This commit fixes this issue by validating that there are actually enough
bytes to read. It also limits the amount of data that can be sent by
the debugger client to 1M so the client will not be able to explode
the memory.
This change sets a low limit for multibulk and bulk length in the
protocol for unauthenticated connections, so that they can't easily
cause redis to allocate massive amounts of memory by sending just a few
characters on the network.
The new limits are 10 arguments of 16kb each (instead of 1m of 512mb)
The redis-cli command line tool and redis-sentinel service may be vulnerable
to integer overflow when parsing specially crafted large multi-bulk network
replies. This is a result of a vulnerability in the underlying hiredis
library which does not perform an overflow check before calling the calloc()
heap allocation function.
This issue only impacts systems with heap allocators that do not perform their
own overflow checks. Most modern systems do and are therefore not likely to
be affected. Furthermore, by default redis-sentinel uses the jemalloc allocator
which is also not vulnerable.
The vulnerability involves changing the default set-max-intset-entries
configuration parameter to a very large value and constructing specially
crafted commands to manipulate sets