Part two of implementing #8702 (zset), after #8887.
## Description of the feature
Replaced all uses of ziplist with listpack in t_zset, and optimized some of the code to optimize performance.
## Rdb format changes
New `RDB_TYPE_ZSET_LISTPACK` rdb type.
## Rdb loading improvements:
1) Pre-expansion of dict for validation of duplicate data for listpack and ziplist.
2) Simplifying the release of empty key objects when RDB loading.
3) Unify ziplist and listpack data verify methods for zset and hash, and move code to rdb.c.
## Interface changes
1) New `zset-max-listpack-entries` config is an alias for `zset-max-ziplist-entries` (same with `zset-max-listpack-value`).
2) OBJECT ENCODING will return listpack instead of ziplist.
## Listpack improvements:
1) Add `lpDeleteRange` and `lpDeleteRangeWithEntry` functions to delete a range of entries from listpack.
2) Improve the performance of `lpCompare`, converting from string to integer is faster than converting from integer to string.
3) Replace `snprintf` with `ll2string` to improve performance in converting numbers to strings in `lpGet()`.
## Zset improvements:
1) Improve the performance of `zzlFind` method, use `lpFind` instead of `lpCompare` in a loop.
2) Use `lpDeleteRangeWithEntry` instead of `lpDelete` twice to delete a element of zset.
## Tests
1) Add some unittests for `lpDeleteRange` and `lpDeleteRangeWithEntry` function.
2) Add zset RDB loading test.
3) Add benchmark test for `lpCompare` and `ziplsitCompare`.
4) Add empty listpack zset corrupt dump test.
Part one of implementing #8702 (taking hashes first before other types)
## Description of the feature
1. Change ziplist encoded hash objects to listpack encoding.
2. Convert existing ziplists on RDB loading time. an O(n) operation.
## Rdb format changes
1. Add RDB_TYPE_HASH_LISTPACK rdb type.
2. Bump RDB_VERSION to 10
## Interface changes
1. New `hash-max-listpack-entries` config is an alias for `hash-max-ziplist-entries` (same with `hash-max-listpack-value`)
2. OBJECT ENCODING will return `listpack` instead of `ziplist`
## Listpack improvements:
1. Support direct insert, replace integer element (rather than convert back and forth from string)
3. Add more listpack capabilities to match the ziplist ones (like `lpFind`, `lpRandomPairs` and such)
4. Optimize element length fetching, avoid multiple calculations
5. Use inline to avoid function call overhead.
## Tests
1. Add a new test to the RDB load time conversion
2. Adding the listpack unit tests. (based on the one in ziplist.c)
3. Add a few "corrupt payload: fuzzer findings" tests, and slightly modify existing ones.
Co-authored-by: Oran Agra <oran@redislabs.com>
When we load rdb or restore command, if we encounter a length of 0, it will result in the creation of an empty key.
This could either be a corrupt payload, or a result of a bug (see #8453 )
This PR mainly fixes the following:
1) When restore command will return `Bad data format` error.
2) When loading RDB, we will silently discard the key.
Co-authored-by: Oran Agra <oran@redislabs.com>
Create new module type enhanced callbacks: mem_usage2, free_effort2, unlink2, copy2.
These will be given a context point from which the module can obtain the key name and database id.
In addition the digest and defrag context can now be used to obtain the key name and database id.
This PR adds a spell checker CI action that will fail future PRs if they introduce typos and spelling mistakes.
This spell checker is based on blacklist of common spelling mistakes, so it will not catch everything,
but at least it is also unlikely to cause false positives.
Besides that, the PR also fixes many spelling mistakes and types, not all are a result of the spell checker we use.
Here's a summary of other changes:
1. Scanned the entire source code and fixes all sorts of typos and spelling mistakes (including missing or extra spaces).
2. Outdated function / variable / argument names in comments
3. Fix outdated keyspace masks error log when we check `config.notify-keyspace-events` in loadServerConfigFromString.
4. Trim the white space at the end of line in `module.c`. Check: https://github.com/redis/redis/pull/7751
5. Some outdated https link URLs.
6. Fix some outdated comment. Such as:
- In README: about the rdb, we used to said create a `thread`, change to `process`
- dbRandomKey function coment (about the dictGetRandomKey, change to dictGetFairRandomKey)
- notifyKeyspaceEvent fucntion comment (add type arg)
- Some others minor fix in comment (Most of them are incorrectly quoted by variable names)
7. Modified the error log so that users can easily distinguish between TCP and TLS in `changeBindAddr`
When a database on a 64 bit build grows past 2^31 keys, the underlying hash table expands to 2^32 buckets. After this point, the algorithms for selecting random elements only return elements from half of the available buckets because they use random() which has a range of 0 to 2^31 - 1. This causes problems for eviction policies which use dictGetSomeKeys or dictGetRandomKey. Over time they cause the hash table to become unbalanced because, while new keys are spread out evenly across all buckets, evictions come from only half of the available buckets. Eventually this half of the table starts to run out of keys and it takes longer and longer to find candidates for eviction. This continues until no more evictions can happen.
This solution addresses this by using a 64 bit PRNG instead of libc random().
Co-authored-by: Greg Femec <gfemec@google.com>
When loading an encoded payload we will at least do a shallow validation to
check that the size that's encoded in the payload matches the size of the
allocation.
This let's us later use this encoded size to make sure the various offsets
inside encoded payload don't reach outside the allocation, if they do, we'll
assert/panic, but at least we won't segfault or smear memory.
We can also do 'deep' validation which runs on all the records of the encoded
payload and validates that they don't contain invalid offsets. This lets us
detect corruptions early and reject a RESTORE command rather than accepting
it and asserting (crashing) later when accessing that payload via some command.
configuration:
- adding ACL flag skip-sanitize-payload
- adding config sanitize-dump-payload [yes/no/clients]
For now, we don't have a good way to ensure MIGRATE in cluster resharding isn't
being slowed down by these sanitation, so i'm setting the default value to `no`,
but later on it should be set to `clients` by default.
changes:
- changing rdbReportError not to `exit` in RESTORE command
- adding a new stat to be able to later check if cluster MIGRATE isn't being
slowed down by sanitation.
The reason that we want to get a full crash report on SIGABRT
is that the jmalloc, when detecting a corruption, calls abort().
This will cause the Redis to exist silently without any report
and without any way to analyze what happened.
Reloading of the RDB generated by
DEBUG POPULATE 5000000
SAVE
is now 25% faster.
This commit also prepares the ability to have more flexibility when
loading stuff from the RDB, since we no longer use dbAdd() but can
control exactly how things are added in the database.
* replication hooks: role change, master link status, replica online/offline
* persistence hooks: saving, loading, loading progress
* misc hooks: cron loop, shutdown, module loaded/unloaded
* change the way hooks test work, and add tests for all of the above
startLoading() now gets flag indicating what is loaded.
stopLoading() now gets an indication of success or failure.
adding startSaving() and stopSaving() with similar args and role.
now that replica can read rdb directly from the socket, it should avoid exiting
on short read and instead try to re-sync.
this commit tries to have minimal effects on non-diskless rdb reading.
and includes a test that tries to trigger this scenario on various read cases.
The implementation of the diskless replication was currently diskless only on the master side.
The slave side was still storing the received rdb file to the disk before loading it back in and parsing it.
This commit adds two modes to load rdb directly from socket:
1) when-empty
2) using "swapdb"
the third mode of using diskless slave by flushdb is risky and currently not included.
other changes:
--------------
distinguish between aof configuration and state so that we can re-enable aof only when sync eventually
succeeds (and not when exiting from readSyncBulkPayload after a failed attempt)
also a CONFIG GET and INFO during rdb loading would have lied
When loading rdb from the network, don't kill the server on short read (that can be a network error)
Fix rdb check when performed on preamble AOF
tests:
run replication tests for diskless slave too
make replication test a bit more aggressive
Add test for diskless load swapdb
due to incorrect forward declaration, it didn't provide all arguments.
this lead to random value being read from the stack and return of incorrect time,
which in this case doesn't matter since no one uses it.
The rio structure is referenced in the global 'riostate' structure
in order for the logging functions to be always able to access the state
of the "pseudo-loading" of the RDB, needed for the check.
Courtesy of Valgrind.
So far we used an external program (later executed within Redis) and
parser in order to check RDB files for correctness. This forces, at each
RDB format update, to have two copies of the same format implementation
that are hard to keep in sync. Morover the former RDB checker only
checked the very high-level format of the file, without actually trying
to load things in memory. Certain corruptions can only be handled by
really loading key-value pairs.
This first commit attempts to unify the Redis RDB loadig code with the
task of checking the RDB file for correctness. More work is needed but
it looks like a sounding direction so far.
This also makes it backward compatible in the usage, but for the command
name. However the old command name was less obvious so it is worth to
break it probably.
With the new setup the program main can perform argument parsing and
everything else useful for an RDB check regardless of the Redis server
itself.
redis-check-rdb (previously redis-check-dump) had every RDB define
copy/pasted from rdb.h and some defines copied from redis.h. Since
the initial copy, some constants had changed in Redis headers and
check-dump was using incorrect values.
Since check-rdb is now a mode of Redis, the old check-dump code
is cleaned up to:
- replace all printf with redisLog (and remove \n from all strings)
- remove all copy/pasted defines to use defines from rdb.h and redis.h
- replace all malloc/free with zmalloc/zfree
- remove unnecessary include headers
redis-check-dump is now named redis-check-rdb and it runs
as a mode of redis-server instead of an independent binary.
You can now use 'redis-server redis.conf --check-rdb' to check
the RDB defined in redis.conf. Using argument --check-rdb
checks the RDB and exits. We could potentially also allow
the server to continue starting if the RDB check succeeds.
This change also enables us to use RDB checking programatically
from inside Redis for certain failure conditions.