449 Commits

Author SHA1 Message Date
Madelyn Olson
6faa48a358
Don't initialize the key buffer in getKeysResult (#631)
getKeysResults is typically initialized with 2kb of zeros (16 * 256),
which isn't strictly necessary since the only thing we have to
initialize is some of the metadata fields. The rest of the data can
remain junk as long as we don't access it. This was a bit of a
regression in 7.0 with the keyspecs, since we doubled the size of the
zeros, but hopefully this recovers a lot of the performance drop.

I saw a modest performance bump for deep pipeline of cluster mode (~8%).

I think we would see some comparable improvements in the other places
where we are using it such as tracking and ACLs.

---------

Signed-off-by: Madelyn Olson <matolson@amazon.com>
2024-06-14 08:42:00 -07:00
Ping Xie
54c9747935
Remove master and slave from source code (#591)
External facing interfaces are not affected.

---------

Signed-off-by: Ping Xie <pingxie@google.com>
2024-06-07 14:21:33 -07:00
Chen Tianjie
d16b4ec1b9
Unshare object to avoid LRU/LFU being messed up (#250)
When LRU/LFU enabled, Valkey does not allow using shared objects, as
value objects may be shared among many different keys and they can't
share LRU/LFU information.

However `maxmemory-policy` is modifiable at runtime. If LRU/LFU is not
enabled at start, but then enabled when some shared objects are already
used, there could be some confusion in LRU/LFU information.

For `set` command it is OK since it is going to create a new object when
LRU/LFU enabled, but `get` command will not unshare the object and just
update LRU/LFU information.

So we may duplicate the object in this case. It is a one-time task for
each key using shared objects, unless this is the case for so many keys,
there should be no serious performance degradation.

Still, LRU will be updated anyway, no matter LRU/LFU is enabled or not,
because `OBJECT IDLETIME` needs it, unless `maxmemory-policy` is set to
LFU. So idle time of a key may still be messed up.

---------

Signed-off-by: chentianjie.ctj <chentianjie.ctj@alibaba-inc.com>
Signed-off-by: Chen Tianjie <TJ_Chen@outlook.com>
2024-06-01 10:09:20 +02:00
Samuel Adetunji
5d0f4bc9f0
Require C11 atomics (#490)
- Replaces custom atomics logic with C11 default atomics logic.
- Drops  "atomicvar_api" field from server info

Closes #485

---------

Signed-off-by: adetunjii <adetunjithomas1@outlook.com>
Signed-off-by: Samuel Adetunji <adetunjithomas1@outlook.com>
Co-authored-by: teej4y <samuel.adetunji@prunny.com>
2024-05-26 18:41:11 +02:00
Ping Xie
c41dd77a3e
Add clang-format configs (#323)
I have validated that these settings closely match the existing coding
style with one major exception on `BreakBeforeBraces`, which will be
`Attach` going forward. The mixed `BreakBeforeBraces` styles in the
current codebase are hard to imitate and also very odd IMHO - see below

```
if (a == 1) { /*Attach */
}
```

```
if (a == 1 ||
    b == 2)
{ /* Why? */
}
```

Please do NOT merge just yet. Will add the github action next once the
style is reviewed/approved.

---------

Signed-off-by: Ping Xie <pingxie@google.com>
2024-05-22 23:24:12 -07:00
Viktor Söderqvist
efa8ba519b
Finish postponed SCAN changes (#501)
Commit 07ed0eafa98a66 introduced some SCAN improvements, but some
changes were postponed to a later version (8.0), which this PR finishes:

1. Prepare to move the TYPE filtering to the scan callback as well. this
was put on hold since it has side effects that can be considered a
breaking change, which is that we will not attempt to do lazy expire
(delete) a key that was filtered by not matching the TYPE (changing it
would mean TYPE filter starts behaving the same as MATCH filter already
does in that respect).
2. when the specified key TYPE filter is an unknown type, server will
reply a error immediately instead of doing a full scan that comes back
empty handed.

Fixes #235

Release notes:

> SCAN: Expired keys that don't match the TYPE argument for the SCAN are
no longer deleted by SCAN

Signed-off-by: Viktor Söderqvist <viktor.soderqvist@est.tech>
2024-05-17 13:35:31 +02:00
Shivshankar
1125bdbb80
Update serverpanic output based on 'extended-redis-compatibility' config. (#415)
Updated serverPanic output in db.c based on the
extended-redis-compatibility config. and also updated comments in other
files.

---------

Signed-off-by: Shivshankar-Reddy <shiva.sheri.github@gmail.com>
2024-05-08 12:17:32 -07:00
Chen Tianjie
ba9dd7b23a
Add noscores option to command ZSCAN. (#324)
Command syntax is now:
```
ZSCAN key cursor [MATCH pattern] [COUNT count] [NOSCORES]
```
Return format:
```
127.0.0.1:6379> zadd z 1 a 2 b 3 c
(integer) 3
127.0.0.1:6379> zscan z 0
1) "0"
2) 1) "a"
   2) "1"
   3) "b"
   4) "2"
   5) "c"
   6) "3"
127.0.0.1:6379> zscan z 0 noscores
1) "0"
2) 1) "a"
   2) "b"
   3) "c"
```
when NOSCORES is on, the command will only return members in the zset,
without scores.

For client side parsing the command return, I believe it is fine as long
as the command is backwards compatible. The return structure are still
lists, what has changed is the content. And clients can tell the
difference by the subcommand they use.

Since `novalues` option of `HSCAN` is already accepted
(redis/redis#12765), I think similar thing can be done to `ZSCAN`.

---------

Signed-off-by: Chen Tianjie <TJ_Chen@outlook.com>
2024-05-07 14:39:28 +08:00
Jacob Murphy
df5db0627f
Remove trademarked language in code comments (#223)
This includes comments used for module API documentation.

* Strategy for replacement: Regex search: `(//|/\*| \*|#).* ("|\()?(r|R)edis( |\.
  |'|\n|,|-|\)|")(?!nor the names of its contributors)(?!Ltd.)(?!Labs)(?!Contributors.)`
* Don't edit copyright comments
* Replace "Redis version X.X" -> "Redis OSS version X.X" to distinguish
from newly licensed repository
* Replace "Redis Object" -> "Object"
* Exclude markdown for now
* Don't edit Lua scripting comments referring to redis.X API
* Replace "Redis Protocol" -> "RESP"
* Replace redis-benchmark, -cli, -server, -check-aof/rdb with "valkey-"
prefix
* Most other places, I use best judgement to either remove "Redis", or
replace with "the server" or "server"

Fixes #148

---------

Signed-off-by: Jacob Murphy <jkmurphy@google.com>
Signed-off-by: Viktor Söderqvist <viktor.soderqvist@est.tech>
2024-04-09 10:24:03 +02:00
Ping Xie
aaec321213
Remove REDISMODULE_ prefixes and introduce compatibility header (#194)
Fix #146 

Removed REDISMODULE_ prefixes from the core source code to align with
the new SERVERMODULE_ naming convention. Added a new 'redismodule.h'
header file to ensure full backward compatibility with existing modules.
This compatibility layer maps all legacy REDISMODULE_ prefixed
identifiers to their new SERVERMODULE_ equivalents, allowing existing
Redis modules to function without modification.

---------

Signed-off-by: Ping Xie <pingxie@google.com>
2024-04-05 16:59:55 -07:00
0del
b19ebaf551
Rename redisCommand to serverCommand (#174)
Part of #144

---------

Signed-off-by: 0del <bany.y0599@gmail.com>
2024-04-03 18:54:33 +02:00
0del
a236fc8ef0
Rename redisCommandProc, redisGetKeysProc to server prefix (#173)
Part of #144

Signed-off-by: 0del <bany.y0599@gmail.com>
2024-04-03 18:33:33 +02:00
0del
717dfe8022
Rename redisDb to serverDb (#156)
A task of #144.

Signed-off-by: 0del <bany.y0599@gmail.com>
2024-04-03 11:02:43 +08:00
LiiNen
ba8bda9cff
Fix shutdown syntax hint, following intention (#116)
Change syntax hint from

    SHUTDOWN [NOSAVE | SAVE] [NOW] [FORCE] [ABORT]

to

    SHUTDOWN [[NOSAVE | SAVE] [NOW] [FORCE] | ABORT]

It's not that important for docs, but the latter is preferred for valkey-cli's automatic syntax hints.

Signed-off-by: LiiNen <kjeonghoon065@gmail.com>
2024-04-02 15:36:53 +02:00
Binbin
7b070423b8
Fix dictionary use-after-free in active expire and make kvstore iter to respect EMPTY flag (#13135)
After #13072, there is an use-after-free error. In expireScanCallback, we
will delete the dict, and then in dictScan we will continue to use the dict,
like we will doing `dictResumeRehashing(d)` in the end, this casued an error.

In this PR, in freeDictIfNeeded, if the dict's pauserehash is set, don't
delete the dict yet, and then when scan returns try to delete it again.

At the same time, we noticed that there will be similar problems in iterator.
We may also delete elements during the iteration process, causing the dict
to be deleted, so the part related to iter in the PR has also been modified.
dictResetIterator was also missing from the previous kvstoreIteratorNextDict,
we currently have no scenario that elements will be deleted in kvstoreIterator
process, deal with it together to avoid future problems. Added some simple
tests to verify the changes.

In addition, the modification in #13072 omitted initTempDb and emptyDbAsync,
and they were also added. This PR also remove the slow flag from the expire
test (consumes 1.3s) so that problems can be found in CI in the future.
2024-03-18 17:41:54 +02:00
Yanqi Lv
0a12f380e8
Optimize DEL on expired keys (#13080)
If we call `DEL` on expired keys, keys may be deleted in
`expireIfNeeded` and we don't need to call `dbSyncDelete` or
`dbAsyncDelete` after, which repeat the deletion process(i.e. find keys
in main db).

In this PR, I refine the return values of `expireIfNeeded` to indicate
whether we have deleted the expired key to avoid the potential redundant
deletion logic in `delGenericCommand`. Besides, because both KEY_EXPIRED
and KEY_DELETED are non-zero, this PR won't affect other functions
calling `expireIfNeeded`.

I also make a performance test. I first close active expiration by
`debug set-active-expire 0` and write 1 million keys with 1ms TTL. Then
I repeatedly delete 100 expired keys in one `DEL`. The results are as
follow, which shows that this PR can improve performance by about 10% in
this situation.
**unstable**
```
Summary:
  throughput summary: 10080.65 requests per second
  latency summary (msec):
          avg       min       p50       p95       p99       max
        0.953     0.136     0.959     1.215     1.335     2.247
```

**This PR**
```
Summary:			
  throughput summary: 11074.20 requests per second			
  latency summary (msec):			
          avg       min       p50       p95       p99       max			
        0.865     0.128     0.879     1.055     1.175     2.159			
```

---------

Co-authored-by: Viktor Söderqvist <viktor.soderqvist@est.tech>
Co-authored-by: Oran Agra <oran@redislabs.com>
2024-02-26 12:50:04 +02:00
Binbin
81666a6510
Fix heap-use-after-free when pubsubshard_channels became NULL (#13038)
After fix for #13033, address sanitizer reports this heap-use-after-free
error. When the pubsubshard_channels dict becomes empty, we will delete
the dict, and the dictReleaseIterator will call dictResetIterator, it
will use the dict so we will trigger the error.

This PR introduced a new struct kvstoreDictIterator to wrap
dictIterator.
Replace the original dict iterator with the new kvstore dict iterator.

---------

Co-authored-by: Oran Agra <oran@redislabs.com>
Co-authored-by: guybe7 <guy.benoish@redislabs.com>
2024-02-07 14:53:50 +02:00
guybe7
8cd62f82ca
Refactor the per-slot dict-array db.c into a new kvstore data structure (#12822)
# Description
Gather most of the scattered `redisDb`-related code from the per-slot
dict PR (#11695) and turn it to a new data structure, `kvstore`. i.e.
it's a class that represents an array of dictionaries.

# Motivation
The main motivation is code cleanliness, the idea of using an array of
dictionaries is very well-suited to becoming a self-contained data
structure.
This allowed cleaning some ugly code, among others: loops that run twice
on the main dict and expires dict, and duplicate code for allocating and
releasing this data structure.

# Notes
1. This PR reverts the part of https://github.com/redis/redis/pull/12848
where the `rehashing` list is global (handling rehashing `dict`s is
under the responsibility of `kvstore`, and should not be managed by the
server)
2. This PR also replaces the type of `server.pubsubshard_channels` from
`dict**` to `kvstore` (original PR:
https://github.com/redis/redis/pull/12804). After that was done,
server.pubsub_channels was also chosen to be a `kvstore` (with only one
`dict`, which seems odd) just to make the code cleaner by making it the
same type as `server.pubsubshard_channels`, see
`pubsubtype.serverPubSubChannels`
3. the keys and expires kvstores are currenlty configured to allocate
the individual dicts only when the first key is added (unlike before, in
which they allocated them in advance), but they won't release them when
the last key is deleted.

Worth mentioning that due to the recent change the reply of DEBUG
HTSTATS changed, in case no keys were ever added to the db.

before:
```
127.0.0.1:6379> DEBUG htstats 9
[Dictionary HT]
Hash table 0 stats (main hash table):
No stats available for empty dictionaries
[Expires HT]
Hash table 0 stats (main hash table):
No stats available for empty dictionaries
```

after:
```
127.0.0.1:6379> DEBUG htstats 9
[Dictionary HT]
[Expires HT]
```
2024-02-05 17:21:35 +02:00
Chen Tianjie
f469dd8ca6
Add novalues option to command HSCAN. (#12765)
Add a way to HSCAN a hash key, and get only the filed names.
Command syntax is now:
```
HSCAN key cursor [MATCH pattern] [COUNT count] [NOVALUES]
```
when `NOVALUES` is on, the command will only return keys in the hash.

---------

Co-authored-by: Viktor Söderqvist <viktor.soderqvist@est.tech>
2024-01-30 20:32:58 +02:00
Chen Tianjie
af7ceeb765
Optimize resizing hash table to resize not only non-empty dicts. (#12819)
The function `tryResizeHashTables` only attempts to shrink the dicts
that has keys (change from #11695), this was a serious problem until the
change in #12850 since it meant if all keys are deleted, we won't shrink
the dick.
But still, both dictShrink and dictExpand may be blocked by a fork child
process, therefore, the cron job needs to perform both dictShrink and
dictExpand, for not just non-empty dicts, but all dicts in DBs.

What this PR does:

1. Try to resize all dicts in DBs (not just non-empty ones, as it was
since #12850)
2. handle both shrink and expand (not just shrink, as it was since
forever)
3. Refactor some APIs about dict resizing (get rid of `htNeedsShrink`
`htNeedsShrink` `dictShrinkToFit`, and expose `dictShrinkIfNeeded`
`dictExpandIfNeeded` which already contains all the code of those
functions we get rid of, to make APIs more neat)
4. In the `Don't rehash if redis has child process` test, now that cron
would do resizing, we no longer need to write to DB after the child
process got killed, and can wait for the cron to expand the hash table.
2024-01-29 21:02:07 +02:00
zhaozhao.zz
8d0156eb18
Set the correct id for tempDb (#12947)
background: some modules need to know the `dbid` information, such as
the function used during RDB loading:

```
robj *rdbLoadObject(int rdbtype, rio *rdb, sds key, int dbid, int *error) {
....
        moduleInitIOContext(io,mt,rdb,&keyobj,dbid);
```

However, during replication, the "tempDb" created for diskless RDB
loading is not correctly set with the dbid. This leads to passing the
wrong dbid to the `rdbLoadObject` function (as tempDb uses zcalloc, all
ids are 0).

```
disklessLoadInitTempDb()->rdbLoadRioWithLoadingCtx()->
        /* Read value */
        val = rdbLoadObject(type,rdb,key,db->id,&error);
```

To fix it, set the correct ID (relative index) for the tempdb.
2024-01-22 11:47:51 +08:00
zhaozhao.zz
d8a21c5767
Unified db rehash method for both standalone and cluster (#12848)
After #11695, we added two functions `rehashingStarted` and
`rehashingCompleted` to the dict structure. We also registered two
handlers for the main database's dict and expire structures. This allows
the main database to record the dict in `rehashing` list when rehashing
starts. Later, in `serverCron`, the `incrementallyRehash` function is
continuously called to perform the rehashing operation. However,
currently, when rehashing is completed, `rehashingCompleted` does not
remove the dict from the `rehashing` list. This results in the
`rehashing` list containing many invalid dicts. Although subsequent cron
checks and removes dicts that don't require rehashing, it is still
inefficient.

This PR implements the functionality to remove the dict from the
`rehashing` list in `rehashingCompleted`. This is achieved by adding
`metadata` to the dict structure, which keeps track of its position in
the `rehashing` list, allowing for quick removal. This approach avoids
storing duplicate dicts in the `rehashing` list.

Additionally, there are other modifications:

1. Whether in standalone or cluster mode, the dict in database is
inserted into the rehashing linked list when rehashing starts. This
eliminates the need to distinguish between standalone and cluster mode
in `incrementallyRehash`. The function only needs to focus on the dicts
in the `rehashing` list that require rehashing.
2. `rehashing` list is moved from per-database to Redis server level.
This decouples `incrementallyRehash` from the database ID, and in
standalone mode, there is no need to iterate over all databases,
avoiding unnecessary access to databases that do not require rehashing.
In the future, even if unsharded-cluster mode supports multiple
databases, there will be no risk involved.
3. The insertion and removal operations of dict structures in the
`rehashing` list are decoupled from `activerehashing` config.
`activerehashing` only controls whether `incrementallyRehash` is
executed in serverCron. There is no need for additional steps when
modifying the `activerehashing` switch, as in #12705.
2023-12-15 10:42:53 +08:00
Binbin
62419c01db
Handle missing fields in dbSwapDatabases and swapMainDbWithTempDb (#12763)
The change in dbSwapDatabases seems harmless. Because in non-clustered
mode, dbBuckets calculations are strictly accurate and in cluster mode,
we only have one DB. Modify it for uniformity (just like resize_cursor).

The change in swapMainDbWithTempDb is needed in case we swap with the
temp db, otherwise the overhead memory usage of db can be miscalculated.

In addition we will swap all fields (including rehashing list), just for
completeness (and reduce the chance of surprises in the future).

Introduced in #12697.
2023-12-10 17:30:20 +02:00
Binbin
a3ae2ed37b
Remove dead code around should_expand_db (#12767)
when dbExpand is called from rdb.c with try_expand set to 0, it will
either panic panic on OOM, or be non-fatal (should not fail RDB loading)

At the same time, the log text has been slightly adjusted to make it
more unified.
2023-12-10 10:40:15 +02:00
Chen Tianjie
f2d59c4f91
Avoid unnecessary slot computing in KEYS command. (#12843)
If not in cluster mode, there is no need to compute slot.

A bit optimization for #12754
2023-12-07 19:48:15 +08:00
zhaozhao.zz
8e11f84ded
Fix replica node cannot expand dicts when loading legacy RDB (#12839)
When loading RDB on cluster nodes, it is necessary to consider the
scenario where a node is a replica.

For example, during a rolling upgrade, new version instances are often
mounted as replicas on old version instances. In this case, the full
synchronization legacy RDB does not contain slot information, and the
new version instance, acting as a replica, should be able to handle the
legacy RDB correctly for `dbExpand`.

Additionally, renaming `getMyClusterSlotCount` to `getMyShardSlotCount`
would be appropriate.

Introduced in #11695
2023-12-07 14:30:48 +08:00
zhaozhao.zz
9ee1cc33a3
Make the sampling logic in eviction clearer (#12781)
Additional optimizations for the eviction logic in #11695:

To make the eviction logic clearer and decouple the number of sampled
keys from the running mode (cluster or standalone).
* When sampling in each database, we only care about the number of keys
in the current database (not the dicts we sampled from).
* If there are a insufficient number of keys in the current database
(e.g. 10 times the value of `maxmemory_samples`), we can break out
sooner (to avoid looping on a sparse database).
* We'll never try to sample the db dicts more times than the number of
non-empty dicts in the db (max 1 in non-cluster mode).

And it also ensures that each database has a sufficient amount of
sampled keys, so even if unsharded-cluster supports multiple databases,
there won't be any issues.

other changes:
1. keep track of the number of non-empty dicts in each database.
2. move key_count tracking into cumulativeKeyCountAdd rather than all
it's callers

---------

Co-authored-by: Oran Agra <oran@redislabs.com>
2023-12-06 09:37:24 +08:00
Chen Tianjie
991aff1c0f
Optimize KEYS when pattern includes hashtag and implies a single slot. (#12754)
in #12536 we made a similar optimization for SCAN, now that hashtags in
patterns. When we can make sure all keys matching the pettern will be in
the same slot, we can limit the iteration to run only one one.
2023-12-05 16:21:50 +02:00
zhaozhao.zz
a1c5171c1d
Fix resize hash tables stuck on the last non-empty slot (#12802)
Introduced in #11695 .

The tryResizeHashTables function gets stuck on the last non-empty slot
while iterating through dictionaries. It does not restart from the
beginning. The reason for this issue is a problem with the usage of
dbIteratorNextDict:

/* Returns next dictionary from the iterator, or NULL if iteration is complete. */
dict *dbIteratorNextDict(dbIterator *dbit) {
    if (dbit->next_slot == -1) return NULL;
    dbit->slot = dbit->next_slot;
    dbit->next_slot = dbGetNextNonEmptySlot(dbit->db, dbit->slot, dbit->keyType);
    return dbGetDictFromIterator(dbit);
}

When iterating to the last non-empty slot, next_slot is set to -1,
causing it to loop indefinitely on that slot. We need to modify the code
to ensure that after iterating to the last non-empty slot, it returns to
the first non-empty slot.

BTW, function tryResizeHashTables is actually iterating over slots
that have keys. However, in its implementation, it leverages the
dbIterator (which is a key iterator) to obtain slot and dictionary
information. While this approach works fine, but it is not very
intuitive. This PR also improves readability by changing the iteration
to directly iterate over slots, thereby enhancing clarity.
2023-11-28 18:50:16 +08:00
zhaozhao.zz
095d05786f
clarify the comment of findSlotByKeyIndex function (#12811)
The current comment for `findSlotByKeyIndex` is a bit ambiguous and can
be misleading, as it may be misunderstood as getting the next slot
corresponding to target.
2023-11-27 18:45:40 -08:00
Binbin
dedbf99a80
Fix dbExpand not dividing by slots, resulting in consuming slots times the dictExpand (#12773)
We meant to divide it by the number of slots, otherwise it will do slots
times
dictExpand, bug was introduced in #11695.
2023-11-22 11:16:06 +02:00
Josh Hershberg
33ef6a3003 Cluster refactor: s/clusterNodeGetSlotBit/clusterNodeCoversSlot/
Simple rename, "GetSlotBit" is implementation specific

Signed-off-by: Josh Hershberg <yehoshua@redis.com>
2023-11-22 05:54:06 +02:00
Josh Hershberg
98a6c44b75 Cluster refactor: Make clusterState private
Move clusterState into cluster_legacy.h. In order to achieve
this some "accessor" methods needed to be added to the
cluster API and some other minor refactors.

Signed-off-by: Josh Hershberg <yehoshua@redis.com>
2023-11-22 05:44:10 +02:00
Binbin
9b6dded421
Fix empty rehashing list in swapdb mode (#12770)
In swapdb mode, the temp db does not init the rehashing list,
the change added in #12764 caused cluster ci to fail.
2023-11-16 11:18:25 +02:00
Binbin
4366bbaa61
Empty rehashing list in emptyDbStructure (#12764)
This is currently harmless, since we have already cleared the dict
before, it will reset the rehashidx to -1, and in incrementallyRehash
we will call dictIsRehashing to check.

It would be nice to empty the list to avoid meaningless attempts, and
the code is also unified to reduce misunderstandings.
2023-11-15 07:55:34 +02:00
Binbin
fe36306340
Fix DB iterator not resetting pauserehash causing dict being unable to rehash (#12757)
When using DB iterator, it will use dictInitSafeIterator to init a old safe
dict iterator. When dbIteratorNext is used, it will jump to the next slot db
dict when we are done a dict. During this process, we do not have any calls to
dictResumeRehashing, which causes the dict's pauserehash to always be > 0.

And at last, it will be returned directly in dictRehashMilliseconds, which causes
us to have slot dict in a state where rehash cannot be completed.

In the "expire scan should skip dictionaries with lot's of empty buckets" test,
adding a `keys *` can reproduce the problem stably. `keys *` will call dbIteratorNext
to trigger a traversal of all slot dicts.

Added dbReleaseIterator and dbIteratorInitNextSafeIterator methods to call dictResetIterator.
Issue was introduced in #11695.
2023-11-14 14:28:46 +02:00
Roshan Khatri
88e83e517b
Add DEBUG_ASSERTIONS option to custom assert (#12667)
This PR introduces a new macro, serverAssertWithInfoDebug, to do complex assertions only for debugging. The main intention is to allow running complex operations during tests without impacting runtime performance. This assertion is enabled when setting DEBUG_ASSERTIONS.

The DEBUG_ASSERTIONS flag is set for the daily and CI variants of `test-sanitizer-address`.
2023-11-11 20:31:34 -08:00
zhaozhao.zz
6258edebf0
reset bucket_count when empty db (#12750)
Introduced in #12697 , should reset bucket_count when empty db, or the overhead memory usage of db can be miscalculated.
2023-11-10 15:52:57 +02:00
Viktor Söderqvist
8878817d89
Optimize SCAN with MATCH when pattern implies cluster slot (#12536)
Optimize the performance of SCAN commands when a match pattern can only contain keys from a 
single slot in cluster mode. This can happen when the pattern contains a hash tag before any 
wildcard matchers or when the key contains no matchers.
2023-11-01 00:06:49 -07:00
Roshan Khatri
f7fa481156
Optimize finding the slot for a given key count in a fenwick tree (#12704)
This PR optimizes the time complexity of findSlotByKeyIndex from O(log^2(N)) to O(log(N)) by using the 
tree structure of binary index tree to find a slot in one search of the index.
2023-10-27 17:15:19 -07:00
Harkrishn Patro
4145d628b4
Reduce dbBuckets operation time complexity from O(N) to O(1) (#12697)
As part of #11695 independent dictionaries were introduced per slot.
Time complexity to discover total no. of buckets across all dictionaries
increased to O(N) with straightforward implementation of iterating over
all dictionaries and adding dictBuckets of each.

To optimize the time complexity, we could maintain a global counter at
db level to keep track of the count of buckets and update it on the start
and end of rehashing.

---------

Co-authored-by: Roshan Khatri <rvkhatri@amazon.com>
2023-10-27 22:05:40 +03:00
Binbin
372ea21875
Update comment around propagateDeletion (#12687)
Fix some outdated comments and add comment for moduleNotifyKeyspaceEvent
we added in #11084 since it seems a bit implicit.

---------

Co-authored-by: Oran Agra <oran@redislabs.com>
2023-10-24 13:10:03 +03:00
Vitaly
0270abda82
Replace cluster metadata with slot specific dictionaries (#11695)
This is an implementation of https://github.com/redis/redis/issues/10589 that eliminates 16 bytes per entry in cluster mode, that are currently used to create a linked list between entries in the same slot.  Main idea is splitting main dictionary into 16k smaller dictionaries (one per slot), so we can perform all slot specific operations, such as iteration, without any additional info in the `dictEntry`. For Redis cluster, the expectation is that there will be a larger number of keys, so the fixed overhead of 16k dictionaries will be The expire dictionary is also split up so that each slot is logically decoupled, so that in subsequent revisions we will be able to atomically flush a slot of data.

## Important changes
* Incremental rehashing - one big change here is that it's not one, but rather up to 16k dictionaries that can be rehashing at the same time, in order to keep track of them, we introduce a separate queue for dictionaries that are rehashing. Also instead of rehashing a single dictionary, cron job will now try to rehash as many as it can in 1ms.
* getRandomKey - now needs to not only select a random key, from the random bucket, but also needs to select a random dictionary. Fairness is a major concern here, as it's possible that keys can be unevenly distributed across the slots. In order to address this search we introduced binary index tree). With that data structure we are able to efficiently find a random slot using binary search in O(log^2(slot count)) time.
* Iteration efficiency - when iterating dictionary with a lot of empty slots, we want to skip them efficiently. We can do this using same binary index that is used for random key selection, this index allows us to find a slot for a specific key index. For example if there are 10 keys in the slot 0, then we can quickly find a slot that contains 11th key using binary search on top of the binary index tree.
* scan API - in order to perform a scan across the entire DB, the cursor now needs to not only save position within the dictionary but also the slot id. In this change we append slot id into LSB of the cursor so it can be passed around between client and the server. This has interesting side effect, now you'll be able to start scanning specific slot by simply providing slot id as a cursor value. The plan is to not document this as defined behavior, however. It's also worth nothing the SCAN API is now technically incompatible with previous versions, although practically we don't believe it's an issue.
* Checksum calculation optimizations - During command execution, we know that all of the keys are from the same slot (outside of a few notable exceptions such as cross slot scripts and modules). We don't want to compute the checksum multiple multiple times, hence we are relying on cached slot id in the client during the command executions. All operations that access random keys, either should pass in the known slot or recompute the slot. 
* Slot info in RDB - in order to resize individual dictionaries correctly, while loading RDB, it's not enough to know total number of keys (of course we could approximate number of keys per slot, but it won't be precise). To address this issue, we've added additional metadata into RDB that contains number of keys in each slot, which can be used as a hint during loading.
* DB size - besides `DBSIZE` API, we need to know size of the DB in many places want, in order to avoid scanning all dictionaries and summing up their sizes in a loop, we've introduced a new field into `redisDb` that keeps track of `key_count`. This way we can keep DBSIZE operation O(1). This is also kept for O(1) expires computation as well.

## Performance
This change improves SET performance in cluster mode by ~5%, most of the gains come from us not having to maintain linked lists for keys in slot, non-cluster mode has same performance. For workloads that rely on evictions, the performance is similar because of the extra overhead for finding keys to evict. 

RDB loading performance is slightly reduced, as the slot of each key needs to be computed during the load.

## Interface changes
* Removed `overhead.hashtable.slot-to-keys` to `MEMORY STATS`
* Scan API will now require 64 bits to store the cursor, even on 32 bit systems, as the slot information will be stored.
* New RDB version to support the new op code for SLOT information. 

---------

Co-authored-by: Vitaly Arbuzov <arvit@amazon.com>
Co-authored-by: Harkrishn Patro <harkrisp@amazon.com>
Co-authored-by: Roshan Khatri <rvkhatri@amazon.com>
Co-authored-by: Madelyn Olson <madelyneolson@gmail.com>
Co-authored-by: Oran Agra <oran@redislabs.com>
2023-10-14 23:58:26 -07:00
Ye Lin Aung
b705049a7a
Replace emptyDb() with new emptyData() (#12646)
The function was renamed, but the comments were outdated.
2023-10-12 15:34:08 +03:00
Binbin
96e9dec419
Bump codespell from 2.2.4 to 2.2.5 (#12557)
and adjustments.
2023-09-08 16:10:17 +03:00
bodong.ybd
b59f53efb3
Fix sort_ro get-keys function return wrong key number (#12522)
Before:
```
127.0.0.1:6379> command getkeys sort_ro key
(empty array)
127.0.0.1:6379>
```
After:
```
127.0.0.1:6379> command getkeys sort_ro key
1) "key"
127.0.0.1:6379>
```
2023-08-30 22:00:02 +03:00
zhaozhao.zz
aefdc57f1e
add an assertion to make sure numkeys is 0 in getKeysUsingKeySpecs (#12389)
This is an addition to #12380, to prevent potential bugs when collecting
keys from multiple commands in the future.
Note that this function also resets numkeys in some cases.
2023-07-06 08:17:25 +03:00
Lior Lahav
b7559d9f3b
Fix possible crash in command getkeys (#12380)
When getKeysUsingKeySpecs processes a command with more than one key-spec,
and called with a total of more than 256 keys, it'll call getKeysPrepareResult again,
but since numkeys isn't updated, getKeysPrepareResult will not bother to copy key
names from the old result (leaving these slots uninitialized). Furthermore, it did not
consider the keys it already found when allocating more space.

Co-authored-by: Oran Agra <oran@redislabs.com>
2023-07-03 12:45:18 +03:00
judeng
07ed0eafa9
improve performance for scan command when matching pattern or data type (#12209)
Optimized the performance of the SCAN command in a few ways:
1. Move the key filtering (by MATCH pattern) in the scan callback,
  so as to avoid collecting them for later filtering.
2. Reduce a many memory allocations and copying (use a reference
  to the original sds, instead of creating an robj, an excessive 2 mallocs
  and one string duplication)
3. Compare TYPE filter directly (as integers), instead of inefficient string
  compare per key.
4. fixed a small bug: when scan zset and hash types, maxiterations uses
  a more accurate number to avoid wrong double maxiterations.

Changes **postponed** for a later version (8.0):
1. Prepare to move the TYPE filtering to the scan callback as well. this was
  put on hold since it has side effects that can be considered a breaking
  change, which is that we will not attempt to do lazy expire (delete) a key
  that was filtered by not matching the TYPE (changing it would mean TYPE filter
  starts behaving the same as MATCH filter already does in that respect). 
2. when the specified key TYPE filter is an unknown type, server will reply a error
  immediately instead of doing a full scan that comes back empty handed. 

Benchmark result:
For different scenarios, we obtained about 30% or more performance improvement.

Co-authored-by: Oran Agra <oran@redislabs.com>
2023-06-27 16:43:46 +03:00
judeng
789c33bb3e
improve performance for keys with expiration time (#12177)
This change only affects keys with expiry time.
For SETEX, the average improvement is 5%, and for GET with
expiation key, we gain a improvement of 13%.

When keys have expiration time, Redis has an assertion to look
up the main dict every time when it touches the expires.
This comes with a performance const, especially during rehash.
the damage will be double.

It looks like that assert was added some ten years old, maybe out
of paranoia, and there's probably no reason to keep it at that cost.
2023-06-14 11:17:39 +03:00