992 Commits

Author SHA1 Message Date
Shivshankar
4be97ebcbe
update valkey in serverLog messeges in server.c file (#231)
Updated keyword "Redis" to "Valkey" in log messeges in server.c file

Signed-off-by: Shivshankar-Reddy <shiva.sheri.github@gmail.com>
2024-04-11 16:34:32 -04:00
Jacob Murphy
df5db0627f
Remove trademarked language in code comments (#223)
This includes comments used for module API documentation.

* Strategy for replacement: Regex search: `(//|/\*| \*|#).* ("|\()?(r|R)edis( |\.
  |'|\n|,|-|\)|")(?!nor the names of its contributors)(?!Ltd.)(?!Labs)(?!Contributors.)`
* Don't edit copyright comments
* Replace "Redis version X.X" -> "Redis OSS version X.X" to distinguish
from newly licensed repository
* Replace "Redis Object" -> "Object"
* Exclude markdown for now
* Don't edit Lua scripting comments referring to redis.X API
* Replace "Redis Protocol" -> "RESP"
* Replace redis-benchmark, -cli, -server, -check-aof/rdb with "valkey-"
prefix
* Most other places, I use best judgement to either remove "Redis", or
replace with "the server" or "server"

Fixes #148

---------

Signed-off-by: Jacob Murphy <jkmurphy@google.com>
Signed-off-by: Viktor Söderqvist <viktor.soderqvist@est.tech>
2024-04-09 10:24:03 +02:00
Viktor Söderqvist
d26d596b3e
Log branding (#252)
Small changes to the log messages printed during startup and shutdown,
for Valkey branding.

SERVER_NAME is replaced by verbatim "Valkey" in one place, because
SERVER_NAME expands to "valkey" in lowercase. (Should we introduce
another macro that expands to "Valkey"?)

Signed-off-by: Viktor Söderqvist <viktor.soderqvist@est.tech>
2024-04-07 17:28:02 -07:00
Vitah Lin
ba0c93cbdf
Add redis symlinks at the same place as the installed binaries (#193)
Adds a new make variable called `USE_REDIS_SYMLINKS`, with default value
`yes`. If yes, then `make install` creates additional symlinks to the
installed binaries:

* `valkey-server`
* `valkey-cli`
* `valkey-benchmark`
* `valkey-check-rdb`
* `valkey-check-aof`
* `valkey-sentinel`

The names of the symlinks are the legacy redis binary names
(`redis-server`, etc.). The purpose is to provide backward compatibility
for scripts expecting the these filenames. The symlinks are installed in
the same directory as the binaries (typically `/usr/local/bin/` or
similar).

Similarly, `make uninstall` removes these symlinks if
`USE_REDIS_SYMLINKS` is `yes`.

This is described in a note in README.md.

Fixes #147

---------

Signed-off-by: Vitah Lin <vitahlin@gmail.com>
Co-authored-by: Madelyn Olson <34459052+madolson@users.noreply.github.com>
2024-04-06 18:41:53 +02:00
Madelyn Olson
bc28fb4ac0
Update Server version to valkey version (#232)
This commit updates the following fields:
1. server_version -> valkey_version in server info. Since we would like
to advertise specific compatibility, we are making the version specific
to valkey. servername will remain as an optional indicator, and other
valkey compatible stores might choose to advertise something else.
1. We dropped redis-ver from the API. This isn't related to API
compatibility, but we didn't want to "fake" that valkey was creating an
rdb from a Redis version.
1. Renamed server-ver -> valkey_version in rdb info. Same as point one,
we want to explicitly indicate this was created by a valkey server.

---------

Signed-off-by: Madelyn Olson <madelyneolson@gmail.com>
2024-04-05 21:15:57 -07:00
Ping Xie
aaec321213
Remove REDISMODULE_ prefixes and introduce compatibility header (#194)
Fix #146 

Removed REDISMODULE_ prefixes from the core source code to align with
the new SERVERMODULE_ naming convention. Added a new 'redismodule.h'
header file to ensure full backward compatibility with existing modules.
This compatibility layer maps all legacy REDISMODULE_ prefixed
identifiers to their new SERVERMODULE_ equivalents, allowing existing
Redis modules to function without modification.

---------

Signed-off-by: Ping Xie <pingxie@google.com>
2024-04-05 16:59:55 -07:00
Wen Hui
bb1a3fffe7
Fix CI break issue due to serverTests merged issue (#218)
Here is the latest CI run result for this PR

https://github.com/hwware/placeholderkv/actions/runs/8561152261

Signed-off-by: hwware <wen.hui.ware@gmail.com>
2024-04-04 17:37:15 -04:00
Viktor Söderqvist
4646d0825e
Redis in HELP commands (#216)
Removes the word Redis in the output of COMMAND HELP and DEBUG HELP.

---------

Signed-off-by: Viktor Söderqvist <viktor.soderqvist@est.tech>
2024-04-04 23:18:37 +02:00
0del
6ea6a693e2
Rename 'redis' to 'server' functions missing (#203)
related: https://github.com/valkey-io/valkey/issues/144

---------

Signed-off-by: 0del <bany.y0599@gmail.com>
2024-04-04 18:21:11 +02:00
Lipeng Zhu
9c5e2bb226
Changes references to redis binaries in output of "--help", "--version" (#113)
Rename output from redis-* to valkey-* for binaries:

1. `valkey-benchmark`
2. `valkey-cli`
3. `valkey-server`
4. `valkey-sentinel`
5. `valkey-check-rdb`
6. `valkey-check-aof`

"--help" "--version" option.

Signed-off-by: Lipeng Zhu <lipeng.zhu@intel.com>
2024-04-04 10:46:17 +02:00
0del
e3e1f9a372
Rename 'redis' to 'server' and redisNodeFlags to clusterNodeFlags (#191)
Rename additional instances of redis to server, as well as redisNodeFlags to clusterNodeFlags.

---------

Signed-off-by: 0del <bany.y0599@gmail.com>
2024-04-03 18:45:23 -07:00
Madelyn Olson
39d0f457a2
Update versioning fields for compatibility (#47)
New info information to be used to determine the valkey versioning info.

Internally, introduce new define values for "SERVER_VERSION" which is
different from the Redis compatibility version, "REDIS_VERSION".

Add two new info fields:
`server_version`: The Valkey server version
`server_name`: Indicates that the server is valkey.

Add one new RDB field: `server_ver`, which indicates the valkey version
that produced the server.

Add 3 new LUA globals: `SERVER_VERSION_NUM`, `SERVER_VERSION`, and
`SERVER_NAME`. Which reflect the valkey version instead of the Redis
compatibility version.

Also clean up various places where Redis and configuration was being
used that is no longer necessary.

---------

Signed-off-by: Madelyn Olson <madelyneolson@gmail.com>
2024-04-03 14:52:36 -07:00
Daniel House
55de74e0dc
The usage (--help) message now refers to valkey (#189)
Fixing redis -> valkey in the output of valkey-server --help.

Signed-off-by: Daniel House <daniel.house@huawei.com>
Co-authored-by: Daniel House <daniel.house@huawei.com>
2024-04-03 23:23:34 +02:00
0del
125a2987af
rename git sha related (#184)
redisGitSHA1 -> serverGitSHA1  
redisGitDirty -> serverGitDirty 
redisBuildId -> serverBuildId 
redisBuildIdRaw -> serverBuildIdRaw 
redisBuildIdString -> serverBuildIdString

#144
#170

Signed-off-by: 0del <bany.y0599@gmail.com>
2024-04-03 20:46:23 +02:00
0del
598b951fb5
rename redisServer to valkeyServer (#183)
https://github.com/valkey-io/valkey/issues/144

Signed-off-by: 0del <bany.y0599@gmail.com>
2024-04-03 20:34:18 +02:00
0del
3a0ba0ad93
rename redisCommandArgType serverCommandArgType (#182)
redisCommandArgType -> serverCommandArgType
redisCommandArg -> serverCommandArg

https://github.com/valkey-io/valkey/issues/144

Signed-off-by: 0del <bany.y0599@gmail.com>
2024-04-03 20:33:38 +02:00
0del
edee864b34
rename redisOp to serverOp (#181)
https://github.com/valkey-io/valkey/issues/144

Signed-off-by: 0del <bany.y0599@gmail.com>
2024-04-03 20:30:30 +02:00
0del
f753db5141
rename redis functions in server.h (#179)
redisPopcount -> serverPopcount
redisSetProcTitle -> serverSetProcTitle
redisCommunicateSystemd -> serverCommunicateSystemd
redisSetCpuAffinity -> serverSetCpuAffinity
redisFork -> serverFork

#144

Signed-off-by: 0del <bany.y0599@gmail.com>
2024-04-03 20:26:33 +02:00
Harkrishn Patro
1736018aa9
Remove trademarked wording on configuration file and individual configs (#29)
Remove trademarked wording on configuration layer.

Following changes for release notes:

1. Rename redis.conf to valkey.conf
2. Pre-filled config in the template config file: Changing pidfile to `/var/run/valkey_6379.pid`

Signed-off-by: Harkrishn Patro <harkrisp@amazon.com>
2024-04-03 19:47:26 +02:00
0del
1629e28f86
Rename redisError to serverError (#177)
Part of #144

Signed-off-by: 0del <bany.y0599@gmail.com>
2024-04-03 19:12:34 +02:00
0del
b19ebaf551
Rename redisCommand to serverCommand (#174)
Part of #144

---------

Signed-off-by: 0del <bany.y0599@gmail.com>
2024-04-03 18:54:33 +02:00
0del
730174445b
Rename redisOpArray to serverOpArray (#157)
A task of #144

Signed-off-by: 0del <bany.y0599@gmail.com>
2024-04-03 12:26:20 +08:00
0del
717dfe8022
Rename redisDb to serverDb (#156)
A task of #144.

Signed-off-by: 0del <bany.y0599@gmail.com>
2024-04-03 11:02:43 +08:00
0del
98892bb5c3
Rename redisMemOverhead to serverMemOverhead (#159)
Part of #144.

Signed-off-by: 0del <bany.y0599@gmail.com>
2024-04-03 10:29:42 +08:00
Vitah Lin
e35d86f2a2
Fix rename REDIS_TEST to SERVER_TEST to pass the Daily workflow (#131)
The test flag `REDIS_TEST` has already be changed to `SERVER_TEST` in
`.github/workflows/daily.yml`, the name in the src directory need to be
changed as well.

```shell
 run: |
        sudo apt-get update && sudo apt-get install libc6-dev-i386
        make 32bit SERVER_CFLAGS='-Werror -DSERVER_TEST'
```

Signed-off-by: Vitah Lin <vitahlin@gmail.com>
2024-04-02 15:43:37 +08:00
John Vandenberg
253fe9dced
Fix typos and replace 'codespell' with 'typos' (#72)
Uses https://github.com/taiki-e/install-action to install
https://github.com/crate-ci/typos in CI

This finds many more/different typos than
https://github.com/codespell-project/codespell , while having very few
false positives.

Signed-off-by: John Vandenberg <jayvdb@gmail.com>
2024-03-31 12:38:22 -07:00
Madelyn Olson
57789d4d08
Update naming to to Valkey (#62)
Documentation references should use `Valkey` while server and cli
references are all under `valkey`.

---------

Signed-off-by: Madelyn Olson <madelyneolson@gmail.com>
2024-03-28 09:58:28 -07:00
Wen Hui
1c9710fca1
Fix the CI sentinel break issue (#30)
I run the CI in my branch, test cases for sentinel could pass


https://github.com/hwware/placeholderkv/actions/runs/8429003027/job/23082549380
2024-03-26 10:20:08 -04:00
Madelyn Olson
bd7387354a Fixed fork/exec problem 2024-03-21 19:36:11 -07:00
Binbin
e04d41d78d
Prevent lua error_reply abuse from causing errorstats to become larger (#13141)
Users who abuse lua error_reply will generate a new error object on each
error call, which can make server.errors get bigger and bigger. This
will
cause the server to block when calling INFO (we also return errorstats
by
default).

To prevent the damage it can cause, when a misuse is detected, we will
print a warning log and disable the errorstats to avoid adding more new
errors. It can be re-enabled via CONFIG RESETSTAT.

Because server.errors may be very large (it may be better now since we
have the limit), config resetstat may block for a while. So in
resetErrorTableStats, we will try to lazyfree server.errors.

See the related discussion at the end of #8217.
2024-03-19 08:18:22 +02:00
Binbin
7b070423b8
Fix dictionary use-after-free in active expire and make kvstore iter to respect EMPTY flag (#13135)
After #13072, there is an use-after-free error. In expireScanCallback, we
will delete the dict, and then in dictScan we will continue to use the dict,
like we will doing `dictResumeRehashing(d)` in the end, this casued an error.

In this PR, in freeDictIfNeeded, if the dict's pauserehash is set, don't
delete the dict yet, and then when scan returns try to delete it again.

At the same time, we noticed that there will be similar problems in iterator.
We may also delete elements during the iteration process, causing the dict
to be deleted, so the part related to iter in the PR has also been modified.
dictResetIterator was also missing from the previous kvstoreIteratorNextDict,
we currently have no scenario that elements will be deleted in kvstoreIterator
process, deal with it together to avoid future problems. Added some simple
tests to verify the changes.

In addition, the modification in #13072 omitted initTempDb and emptyDbAsync,
and they were also added. This PR also remove the slow flag from the expire
test (consumes 1.3s) so that problems can be found in CI in the future.
2024-03-18 17:41:54 +02:00
Binbin
3b3d16f748
Add KVSTORE_FREE_EMPTY_DICTS to cluster mode keys / expires kvstore (#13072)
Currently (following #11695, and #12822), keys kvstore and expires
kvstore both flag with ON_DEMAND, it means that a cluster node will
only allocate a dict when the slot is assigned to it and populated,
but on the other hand, when the slot is unassigned, the dict will
remain allocated.

We considered releasing the dict when the slot is unassigned, but it
causes complications on replicas. On the other hand, from benchmarks
we conducted, it looks like the performance impact of releasing the
dict when it becomes empty and re-allocate it when a key is added
again, isn't huge.

This PR add KVSTORE_FREE_EMPTY_DICTS to cluster mode keys / expires
kvstore.

The impact is about about 2% performance drop, for this hopefully
uncommon scenario.

---------

Co-authored-by: Oran Agra <oran@redislabs.com>
2024-03-13 08:30:20 +02:00
Binbin
ad28d222ed
Lua eval scripts first in first out LRU eviction (#13108)
In some cases, users will abuse lua eval. Each EVAL call generates
a new lua script, which is added to the lua interpreter and cached
to redis-server, consuming a large amount of memory over time.

Since EVAL is mostly the one that abuses the lua cache, and these
won't have pipeline issues (i.e. the script won't disappear
unexpectedly,
and cause errors like it would with SCRIPT LOAD and EVALSHA),
we implement a plain FIFO LRU eviction only for these (not for
scripts loaded with SCRIPT LOAD).

### Implementation notes:
When not abused we'll probably have less than 100 scripts, and when
abused we'll have many thousands. So we use a hard coded value of 500
scripts. And considering that we don't have many scripts, then unlike
keys, we don't need to worry about the memory usage of keeping a true
sorted LRU linked list. We compute the SHA of each script anyway,
and put the script in a dict, we can store a listNode there, and use
it for quick removal and re-insertion into an LRU list each time the
script is used.

### New interfaces:
At the same time, a new `evicted_scripts` field is added to
INFO, which represents the number of evicted eval scripts. Users
can check it to see if they are abusing EVAL.

### benchmark:
`./src/redis-benchmark -P 10 -n 1000000 -r 10000000000 eval "return
__rand_int__" 0`

The simple abuse of eval benchmark test that will create 1 million EVAL
scripts. The performance has been improved by 50%, and the max latency
has dropped from 500ms to 13ms (this may be caused by table expansion
inside Lua when the number of scripts is large). And in the INFO memory,
it used to consume 120MB (server cache) + 310MB (lua engine), but now
it only consumes 70KB (server cache) + 210KB (lua_engine) because of
the scripts eviction.

For non-abusive case of about 100 EVAL scripts, there's no noticeable
change in performance or memory usage.

### unlikely potentially breaking change:
in theory, a user can maybe load a
script with EVAL and then use EVALSHA to call it (by calculating the
SHA1 value on the client side), it could be that if we read the docs
carefully we'll realized it's a valid scenario, but we suppose it's
extremely rare. So it may happen that EVALSHA acts on a script created
by EVAL, and the script is evicted and EVALSHA returns a NOSCRIPT error.
that is if you have more than 500 scripts being used in the same
transaction / pipeline.

This solves the second point in #13102.
2024-03-13 08:27:41 +02:00
Chen Tianjie
4cae99e785
Add overhead of all DBs and rehashing dict count to info. (#12913)
Sometimes we need to make fast judgement about why Redis is suddenly
taking more memory. One of the reasons is main DB's dicts doing
rehashing.

We may use `MEMORY STATS` to monitor the overhead memory of each DB, but
there still lacks a total sum to show an overall trend. So this PR adds
the total overhead of all DBs to `INFO MEMORY` section, together with
the total count of rehashing DB dicts, providing some intuitive metrics
about main dicts rehashing.

This PR adds the following metrics to INFO MEMORY
* `mem_overhead_db_hashtable_rehashing` - only size of ht[0] in
dictionaries we're rehashing (i.e. the memory that's gonna get released
soon)

and a similar ones to MEMORY STATS:
* `overhead.db.hashtable.lut` (complements the existing
`overhead.hashtable.main` and `overhead.hashtable.expires` which also
counts the `dictEntry` structs too)
* `overhead.db.hashtable.rehashing` - temporary rehashing overhead.
* `db.dict.rehashing.count` - number of top level dictionaries being
rehashed.

---------

Co-authored-by: zhaozhao.zz <zhaozhao.zz@alibaba-inc.com>
Co-authored-by: Oran Agra <oran@redislabs.com>
2024-03-01 13:41:24 +08:00
debing.sun
f6785df663
Defragger improvements around large bins (#12996)
Implement #12963

## Changes
1. large bins don't have external fragmentation or are at least
non-defraggable, so we should ignore the effect of
large bins when measuring fragmentation, and only measure fragmentation
of small bins. this affects both the allocator_frag* metrics and also
the active-defrag trigger
2. Adding INFO metrics for `muzzy` memory, which is memory returned to
the OS but still shows as RSS until the OS reclaims it.

---------

Co-authored-by: Oran Agra <oran@redislabs.com>
2024-02-20 18:11:09 +02:00
zhaozhao.zz
8876d264ac
Calculate the incremental rehash time more precisely (#13063)
In the `databasesCron()`, the time consumed by
`kvstoreIncrementallyRehash()` is used to calculate the exit condition.
However, within `kvstoreIncrementallyRehash()`, the loop first checks
for timeout before performing rehashing. Therefore, the time for the
last rehash isn't accounted for, making the consumed time inaccurate. We
need to precisely calculate all the time spent on rehashing.
Additionally, the time allocated to `kvstoreIncrementallyRehash()`
should be the remaining time, which is
`INCREMENTAL_REHASHING_THRESHOLD_US` minus the already consumed
`elapsed_us`.
2024-02-19 14:29:54 +02:00
Binbin
9103ccc398
AOF_FSYNC_EVERYSEC higher resolution, change aof_last_fsync and aof_flush_postponed_start to use mstime (#13041)
Currently aof_last_fsync is using a low resolution unixtime is really
bad,
it checks if the absolute number of (full) seconds changed by one.
depending on which side of the second barrier it falls, we can get very
different results.

This PR change the resolution to use milliseconds instead of complete
seconds.

In cases where the event loop cycle duration is short and their rapid
(e.g. running
many fast commands with short pipeline, or a high `hz` config), this
change will not
make much difference, since in anyway, we'll be quick to detect that
we're on a "new
second", and it's likely that these fsync will always be executed close
to the second
switch barrier.

But in cases of rare or slow event loops cycles (e.g. either slow
commands, or very
low rate of traffic to redis, and low `hz`), it could easily be that
with the old code,
in some cases we'll have over 1.5 seconds between fsyncs, and in others
less than 0.5.

see discussion in #8612

This PR also handle aof_flush_postponed_start as well, the damage there
is smaller
since the threshold is 2 seconds, and not 1.

---------

Co-authored-by: Oran Agra <oran@redislabs.com>
2024-02-18 12:08:29 +02:00
zhaozhao.zz
50d6fe8c4b
Add metrics for WATCH (#12966)
Redis has some special commands that mark the client's state, such as
`subscribe` and `blpop`, which mark the client as `CLIENT_PUBSUB` or
`CLIENT_BLOCKED`, and we have metrics for the special use cases.

However, there are also other special commands, like `WATCH`, which
although do not have a specific flags, and should also be considered
stateful client types. For stateful clients, in many scenarios, the
connections cannot be shared in "connection pool", meaning connection
pool cannot be used. For example, whenever the `WATCH` command is
executed, a new connection is required to put the client into the "watch
state" because the watched keys are stored in the client.

If different business logic requires watching different keys, separate
connections must be used; otherwise, there will be contamination. This
also means that if a user's business heavily relies on the `WATCH`
command, a large number of connections will be required.

Recently we have encountered this situation in our platform, where some
users consume a significant number of connections when using Redis
because of `WATCH`.

I hope we can have a way to observe these special use cases and special
client connections. Here I add a few monitoring metrics:

1. `watching_clients` in `INFO` reply: The number of clients currently
in the "watching" state.
2. `total_watched_keys` in `INFO` reply: The total number of keys being
watched.
3. `watch` in `CLIENT LIST` reply: The number of keys each client is
currently watching.
2024-02-18 10:36:41 +02:00
Binbin
493e31e3ad
Add new DEBUG dict-resizing command to disable the dict resize (#13043)
The test fails here and there:
```
*** [err]: expire scan should skip dictionaries with lot's of empty buckets in tests/unit/expire.tcl
scan didn't handle slot skipping logic.
```

There are two case:
1. In the case of passing the test, we use child process to avoid the
dict resize, but it can not completely limit it, since in the dictDelete
we still have chance to trigger the resize (hit the force radio). The
reason why our test passed before is because the expire dict is still
in the rehashing process, so the dictDelete, the dictShrinkIfNeeded can
not trigger the resize.

2. In the case of failing the test, the expire dict finished the
rehashing,
so the last dictDelete, the dictShrinkIfNeeded trigger the dict resize
since it hit the force radio, so the skipping logic fail.

This PR add a new DEBUG command to disbale the dict resize.
2024-02-08 16:39:58 +02:00
Binbin
13bd3643c2
Re-compute active_defrag_running after adjusting defrag configurations (#13020)
Currently, once active defrag starts, we can not adjust
active_defrag_running
downwards. This is because active_defrag_running will be dynamically
compute
based on the fragmentation, we think we should not lower the effort when
the
fragmentation drops.

However, we need to note that active_defrag_running will also be
dynamically
computed based on configurations. In this case, we are not respecting
cycle-min
or cycle-max. Some people may realize halfway through that defrag
consumes a
lot and want to adjust it.

Previously we could only turn off activedefrag and then turn it on again
to
adjust active_defrag_running downwards. So in this PR, when a active
defrag
configuration change is made, we will re-compute it.

These configuration items are:
- active-defrag-cycle-min
- active-defrag-cycle-max
- active-defrag-threshold-upper
2024-02-06 13:39:07 +02:00
guybe7
8cd62f82ca
Refactor the per-slot dict-array db.c into a new kvstore data structure (#12822)
# Description
Gather most of the scattered `redisDb`-related code from the per-slot
dict PR (#11695) and turn it to a new data structure, `kvstore`. i.e.
it's a class that represents an array of dictionaries.

# Motivation
The main motivation is code cleanliness, the idea of using an array of
dictionaries is very well-suited to becoming a self-contained data
structure.
This allowed cleaning some ugly code, among others: loops that run twice
on the main dict and expires dict, and duplicate code for allocating and
releasing this data structure.

# Notes
1. This PR reverts the part of https://github.com/redis/redis/pull/12848
where the `rehashing` list is global (handling rehashing `dict`s is
under the responsibility of `kvstore`, and should not be managed by the
server)
2. This PR also replaces the type of `server.pubsubshard_channels` from
`dict**` to `kvstore` (original PR:
https://github.com/redis/redis/pull/12804). After that was done,
server.pubsub_channels was also chosen to be a `kvstore` (with only one
`dict`, which seems odd) just to make the code cleaner by making it the
same type as `server.pubsubshard_channels`, see
`pubsubtype.serverPubSubChannels`
3. the keys and expires kvstores are currenlty configured to allocate
the individual dicts only when the first key is added (unlike before, in
which they allocated them in advance), but they won't release them when
the last key is deleted.

Worth mentioning that due to the recent change the reply of DEBUG
HTSTATS changed, in case no keys were ever added to the db.

before:
```
127.0.0.1:6379> DEBUG htstats 9
[Dictionary HT]
Hash table 0 stats (main hash table):
No stats available for empty dictionaries
[Expires HT]
Hash table 0 stats (main hash table):
No stats available for empty dictionaries
```

after:
```
127.0.0.1:6379> DEBUG htstats 9
[Dictionary HT]
[Expires HT]
```
2024-02-05 17:21:35 +02:00
Binbin
492021db95
Fix blocking commands timeout is reset due to re-processing command (#13004)
In #11012, we will reprocess command when client is unblocked on keys,
in some blocking commands, for example, in the XREADGROUP BLOCK
scenario,
because of the re-processing command, we will recalculate the block
timeout,
causing the blocking time to be reset.

This commit add a new CLIENT_REPROCESSING_COMMAND clent flag, explicitly
let the command know that it is being re-processed, later in
blockForKeys
we will not reset the timeout.

Affected BLOCK cases: 
- list / zset / stream, added test cases for each.

Unaffected cases:
- module (never re-process the commands).
- WAIT / WAITAOF (never re-process the commands).

Fixes #12998.
2024-01-30 11:32:59 +02:00
Chen Tianjie
af7ceeb765
Optimize resizing hash table to resize not only non-empty dicts. (#12819)
The function `tryResizeHashTables` only attempts to shrink the dicts
that has keys (change from #11695), this was a serious problem until the
change in #12850 since it meant if all keys are deleted, we won't shrink
the dick.
But still, both dictShrink and dictExpand may be blocked by a fork child
process, therefore, the cron job needs to perform both dictShrink and
dictExpand, for not just non-empty dicts, but all dicts in DBs.

What this PR does:

1. Try to resize all dicts in DBs (not just non-empty ones, as it was
since #12850)
2. handle both shrink and expand (not just shrink, as it was since
forever)
3. Refactor some APIs about dict resizing (get rid of `htNeedsShrink`
`htNeedsShrink` `dictShrinkToFit`, and expose `dictShrinkIfNeeded`
`dictExpandIfNeeded` which already contains all the code of those
functions we get rid of, to make APIs more neat)
4. In the `Don't rehash if redis has child process` test, now that cron
would do resizing, we no longer need to write to DB after the child
process got killed, and can wait for the cron to expand the hash table.
2024-01-29 21:02:07 +02:00
zhaozhao.zz
85a834bfa2
Revert multi OOM limit and add multi buffer limit (#12961)
Fix #9926 , and introduce an alternative method to prevent abuse of
transactions:

1. revert #5454 (which was blocking read-only transactions in OOM
state), and break the tie of MULTI state memory usage and the server OOM
state. Meaning that we'll limit the total memory a single client can
queue, and do that unconditionally regardless of the server being OOM or
not.
2. to prevent abuse of transactions, we use the
`client-query-buffer-limit` to restrict the size of the transaction.
Because the commands cached in the MULTI/EXEC queue have not been
executed yet, so they are also considered a part of the "query buffer"
in a broader sense. In other words, the commands in the MULTI queue and
the `querybuf` of the client together constitute the "query buffer".
When they exceed the limit, the connection will be disconnected.

The reasoning is that it's sensible to sends a single command with a
huge (1GB) argument, and it's sensible to sends a transaction with many
small commands, but it's probably not common to sends a long transaction
with many huge arguments (will consume a lot of memory before even being
executed).

If anyone runs into that, they can simply increase the
`client-query-buffer-limit` config.

P.S. To prevent DDoS attacks, unauthenticated clients have a separate
hard limit. Their query buffer should not exceed a maximum of 1MB. In
other words, if the query buffer of an unauthenticated client exceeds
1MB or the `client-query-buffer-limit` (if it is set to a value smaller
than 1MB,), the connection will be disconnected.
2024-01-25 11:17:39 +02:00
Binbin
628c0dea1b
Some cleanups around function (#12940)
This PR did some cleanups around function:
- drop the comment about Libraries Ctx, since we do have comment
  in functionsLibCtx, no need to maintain multiple copies.
- remove outdated comment about the dropped Library description.
- remove unused desc and code vars in functionExtractLibMetaData.
- fix engines_nemory typo, changed it to engines_memory.
- remove outdated comment about FUNCTION CREATE and FUNCTION INFO,
  FUNCTION CREATE was renamed to FUNCTION LOAD.
- Check in initServer whether the return of functionsInit is OK.
2024-01-23 14:26:33 +02:00
Yanqi Lv
b07174afc2
Change the threshold of dict expand, shrink and rehash (#12948)
Before this change (most recently modified in
https://github.com/redis/redis/pull/12850#discussion_r1421406393), The
trigger for normal expand threshold was 100% utilization and the trigger
for normal shrink threshold was 10% (HASHTABLE_MIN_FILL).
While during fork (DICT_RESIZE_AVOID), when we want to avoid rehash, the
trigger thresholds were multiplied by 5 (`dict_force_resize_ratio`),
meaning 500% for expand and 2% (100/10/5) for shrink.

However, in `dictRehash` (the incremental rehashing), the rehashing
threshold for shrinking during fork (DICT_RESIZE_AVOID) was 20% by
mistake.
This meant that if a shrinking is triggered when `dict_can_resize` is
`DICT_RESIZE_ENABLE` which the threshold is 10%, the rehashing can
continue when `dict_can_resize` is `DICT_RESIZE_AVOID`.
This would cause unwanted CopyOnWrite damage.

It'll make sense to change the thresholds of the rehash trigger and the
thresholds of the incremental rehashing the same, however, in one we
compare the size of the hash table to the number of records, and in the
other we compare the size of ht[0] to the size of ht[1], so the formula
is not exactly the same.

to make things easier we change all the thresholds to powers of 2, so
the normal shrinking threshold is changed from 100/10 (i.e. 10%) to
100/8 (i.e. 12.5%), and we change the threshold during forks from 5 to
4, i.e. from 500% to 400% for expand, and from 2% (100/10/5) to 3.125%
(100/8/4)
2024-01-19 17:00:43 +02:00
debing.sun
d0640029dc
Fix race condition issues between the main thread and module threads (#12817)
Fix #12785 and other race condition issues.
See the following isolated comments.

The following report was obtained using SANITIZER thread.
```sh
make SANITIZER=thread
./runtest-moduleapi --config io-threads 4 --config io-threads-do-reads yes --accurate
```

1. Fixed thread-safe issue in RM_UnblockClient()
Related discussion:
https://github.com/redis/redis/pull/12817#issuecomment-1831181220
* When blocking a client in a module using `RM_BlockClientOnKeys()` or
`RM_BlockClientOnKeysWithFlags()`
with a timeout_callback, calling RM_UnblockClient() in module threads
can lead to race conditions
     in `updateStatsOnUnblock()`.

     - Introduced: 
        Version: 6.2
        PR: #7491

     - Touch:
`server.stat_numcommands`, `cmd->latency_histogram`, `server.slowlog`,
and `server.latency_events`
     
     - Harm Level: High
Potentially corrupts the memory data of `cmd->latency_histogram`,
`server.slowlog`, and `server.latency_events`

     - Solution:
Differentiate whether the call to moduleBlockedClientTimedOut() comes
from the module or the main thread.
Since we can't know if RM_UnblockClient() comes from module threads, we
always assume it does and
let `updateStatsOnUnblock()` asynchronously update the unblock status.
     
* When error reply is called in timeout_callback(), ctx is not
thread-safe, eventually lead to race conditions in `afterErrorReply`.

     - Introduced: 
        Version: 6.2
        PR: #8217

     - Touch
       `server.stat_total_error_replies`, `server.errors`, 

     - Harm Level: High
       Potentially corrupts the memory data of `server.errors`
   
      - Solution: 
Make the ctx in `timeout_callback()` with `REDISMODULE_CTX_THREAD_SAFE`,
and asynchronously reply errors to the client.

2. Made RM_Reply*() family API thread-safe
Related discussion:
https://github.com/redis/redis/pull/12817#discussion_r1408707239
Call chain: `RM_Reply*()` -> `_addReplyToBufferOrList()` -> touch
server.current_client

    - Introduced: 
       Version: 7.2.0
       PR: #12326

   - Harm Level: None
Since the module fake client won't have the `CLIENT_PUSHING` flag, even
if we touch server.current_client,
     we can still exit after `c->flags & CLIENT_PUSHING`.

   - Solution
      Checking `c->flags & CLIENT_PUSHING` earlier.

3. Made freeClient() thread-safe
    Fix #12785

    - Introduced: 
       Version: 4.0
Commit:
3fcf959e60

    - Harm Level: Moderate
       * Trigger assertion
It happens when the module thread calls freeClient while the io-thread
is in progress,
which just triggers an assertion, and doesn't make any race condiaions.

* Touch `server.current_client`, `server.stat_clients_type_memory`, and
`clientMemUsageBucket->clients`.
It happens between the main thread and the module threads, may cause
data corruption.
1. Error reset `server.current_client` to NULL, but theoretically this
won't happen,
because the module has already reset `server.current_client` to old
value before entering freeClient.
2. corrupts `clientMemUsageBucket->clients` in
updateClientMemUsageAndBucket().
3. Causes server.stat_clients_type_memory memory statistics to be
inaccurate.
    
    - Solution:
* No longer counts memory usage on fake clients, to avoid updating
`server.stat_clients_type_memory` in freeClient.
* No longer resetting `server.current_client` in unlinkClient, because
the fake client won't be evicted or disconnected in the mid of the
process.
* Judgment assertion `io_threads_op == IO_THREADS_OP_IDLE` only if c is
not a fake client.

4. Fixed free client args without GIL
Related discussion:
https://github.com/redis/redis/pull/12817#discussion_r1408706695
When freeing retained strings in the module thread (refcount decr), or
using them in some way (refcount incr), we should do so while holding
the GIL,
otherwise, they might be simultaneously freed while the main thread is
processing the unblock client state.

    - Introduced: 
       Version: 6.2.0
       PR: #8141

   - Harm Level: Low
     Trigger assertion or double free or memory leak. 

   - Solution:
Documenting that module API users need to ensure any access to these
retained strings is done with the GIL locked

5. Fix adding fake client to server.clients_pending_write
    It will incorrectly log the memory usage for the fake client.
Related discussion:
https://github.com/redis/redis/pull/12817#issuecomment-1851899163

    - Introduced: 
       Version: 4.0
Commit:
9b01b64430

    - Harm Level: None
      Only result in NOP

    - Solution:
       * Don't add fake client into server.clients_pending_write
* Add c->conn assertion for updateClientMemUsageAndBucket() and
updateClientMemoryUsage() to avoid same
         issue in the future.
So now it will be the responsibility of the caller of both of them to
avoid passing in fake client.

6. Fix calling RM_BlockedClientMeasureTimeStart() and
RM_BlockedClientMeasureTimeEnd() without GIL
    - Introduced: 
       Version: 6.2
       PR: #7491

   - Harm Level: Low
Causes inaccuracies in command latency histogram and slow logs, but does
not corrupt memory.

   - Solution:
Module API users, if know that non-thread-safe APIs will be used in
multi-threading, need to take responsibility for protecting them with
their own locks instead of the GIL, as using the GIL is too expensive.

### Other issue
1. RM_Yield is not thread-safe, fixed via #12905.

### Summarize
1. Fix thread-safe issues for `RM_UnblockClient()`, `freeClient()` and
`RM_Yield`, potentially preventing memory corruption, data disorder, or
assertion.
2. Updated docs and module test to clarify module API users'
responsibility for locking non-thread-safe APIs in multi-threading, such
as RM_BlockedClientMeasureTimeStart/End(), RM_FreeString(),
RM_RetainString(), and RM_HoldString().

### About backpot to 7.2
1. The implement of (1) is not too satisfying, would like to get more
eyes.
2. (2), (3) can be safely for backport
3. (4), (6) just modifying the module tests and updating the
documentation, no need for a backpot.
4. (5) is harmless, no need for a backpot.

---------

Co-authored-by: Oran Agra <oran@redislabs.com>
2024-01-19 15:12:49 +02:00
Binbin
14b1edfd99
Fix dict resize ratio checks, avoid precision loss from integer division (#12952)
In the past we used integers to compare ratios, let us assume that
we have the following data in expanding:
```
used / size > 5
`80 / 16 > 5` is false
`81 / 16 > 5` is false
`95 / 16 > 5` is false
`96 / 16 > 5` is true
```

Because the integer result is rounded, our resize breaks the ratio
constraint, this has existed since the beginning, which resulted in
us not strictly following the ratio (shrink also has the same issue).

This PR change it to multiplication to avoid floating point
calculations.
2024-01-18 11:16:50 +02:00
Yanqi Lv
e2b7932b34
Shrink dict when deleting dictEntry (#12850)
When we insert entries into dict, it may autonomously expand if needed.
However, when we delete entries from dict, it doesn't shrink to the
proper size. If there are few entries in a very large dict, it may cause
huge waste of memory and inefficiency when iterating.

The main keyspace dicts (keys and expires), are shrinked by cron
(`tryResizeHashTables` calls `htNeedsResize` and `dictResize`),
And some data structures such as zset and hash also do that (call
`htNeedsResize`) right after a loop of calls to `dictDelete`,
But many other dicts are completely missing that call (they can only
expand).

In this PR, we provide the ability to automatically shrink the dict when
deleting. The conditions triggering the shrinking is the same as
`htNeedsResize` used to have. i.e. we expand when we're over 100%
utilization, and shrink when we're below 10% utilization.

Additionally:
* Add `dictPauseAutoResize` so that flows that do mass deletions, will
only trigger shrinkage at the end.
* Rename `dictResize` to `dictShrinkToFit` (same logic as it used to
have, but better name describing it)
* Rename `_dictExpand` to `_dictResize` (same logic as it used to have,
but better name describing it)
 
related to discussion
https://github.com/redis/redis/pull/12819#discussion_r1409293878

---------

Co-authored-by: Oran Agra <oran@redislabs.com>
Co-authored-by: zhaozhao.zz <zhaozhao.zz@alibaba-inc.com>
2024-01-15 08:20:53 +02:00
Yanqi Lv
c452e414a8
Optimize performance when many clients [p|s]unsubscribe simultaneously (#12838)
I'm testing the performance of Pub/Sub command recently. I find if many
clients unsubscribe or are killed simultaneously, Redis needs a long
time to deal with it.

In my experiment, I set 5000 clients and each client subscribes 100
channels. Then I call `client kill type pubsub` to simulate the
situation where clients unsubscribe all channels at the same time and
calculate the execution time. The result shows that it takes about 23s.
I use the _perf_ and find that `listSearchKey` in
`pubsubUnsubscribeChannel` costs more than 90% cpu time. I think we can
optimize this situation.

In this PR, I replace list with dict to track the clients subscribing
the channel more efficiently. It changes O(N) to O(1) in the search
phase. Then I repeat the experiment as above. The results are as
follows.

|              | Execution Time(s) |used_memory(MB) |
| :---------------- | :------: | :----: |
| unstable(1bd0b54)        |   23.734   | 65.41 |
| optimize-pubsub           |   0.288   | 67.66 |

Thanks for #11595 , I use a no-value dict and the results shows that the
performance improves significantly but the memory usage only increases
slightly.

Notice:

- This PR will cause the performance degradation about 20% in
`[p|s]subscribe` command but won't freeze Redis.
2024-01-08 10:32:31 +02:00