1565 Commits

Author SHA1 Message Date
Guillaume Koenig
e8078b7315
Allow MEMORY MALLOC-STATS and MEMORY PURGE during loading phase (#1317)
- Enable investigation of memory issues during loading
- Previously, all memory commands were rejected with LOADING error
(except memory help)
- `MEMORY MALLOC-STATS` and `MEMORTY PURGE` are now allowed
as they don't depend on the dataset
- `MEMORY STATS` and `MEMORY USAGE KEY` remain disallowed

Fixes #1299

Signed-off-by: Guillaume Koenig <knggk@amazon.com>
Signed-off-by: Binbin <binloveplay1314@qq.com>
Co-authored-by: Binbin <binloveplay1314@qq.com>
2024-12-08 20:30:07 +08:00
Binbin
176fafcaf7
Add a note to conf about the dangers of modifying dir at runtime (#887)
We've had security issues in the past with it, which is why
we marked it as PROTECTED. But, modifying during runtime
is also a dangerous action. For example, when child processes
are running, persistent temp files and log files may have
unexpected effects.

A scenario for modifying dir at runtime is to migrate a disk
failure, such as using disk-based replication to migrate a node,
writing nodes.conf to save the cluster configuration.

We decided to leave it as is and add a note in the conf
about the dangers of modifying dir at runtime.

Signed-off-by: Binbin <binloveplay1314@qq.com>
2024-12-08 20:28:14 +08:00
Viktor Söderqvist
a2fe6af457
Fix Module Update Args test when other modules are loaded (#1403)
Fixes #1400

Signed-off-by: Viktor Söderqvist <viktor.soderqvist@est.tech>
2024-12-07 10:25:40 +01:00
Wen Hui
71560a2a4a
Add API UpdateRuntimeArgs for updating the module arguments during runtime (#1041)
Before Redis OSS 7, if we load a module with some arguments during
runtime,
and run the command "config rewrite", the module information will not be
saved into the
config file.

Since Redis OSS 7 and Valkey 7.2, if we load a module with some
arguments during runtime,
the module information (path, arguments number, and arguments value) can
be saved into the config file after config rewrite command is called.
Thus, the module will be loaded automatically when the server startup
next time.

Following is one example:

bind 172.25.0.58
port 7000
protected-mode no
enable-module-command yes

Generated by CONFIG REWRITE
latency-tracking-info-percentiles 50 99 99.9
dir "/home/ubuntu/valkey"
save 3600 1 300 100 60 10000
user default on nopass sanitize-payload ~* &* +https://github.com/ALL
loadmodule tests/modules/datatype.so 10 20

However, there is one problem.
If developers write a module, and update the running arguments by
someway, the updated arguments can not be saved into the config file
even "config rewrite" is called.
The reason comes from the following function
rewriteConfigLoadmoduleOption (src/config.c)

void rewriteConfigLoadmoduleOption(struct rewriteConfigState *state) {
..........
struct ValkeyModule *module = dictGetVal(de);
line = sdsnew("loadmodule ");
line = sdscatsds(line, module->loadmod->path);
for (int i = 0; i < module->loadmod->argc; i++) {
line = sdscatlen(line, " ", 1);
line = sdscatsds(line, module->loadmod->argv[i]->ptr);
}
rewriteConfigRewriteLine(state, "loadmodule", line, 1);
.......
}

The function only save the initial arguments information
(module->loadmod) into the configfile.

After core members discuss, ref
https://github.com/valkey-io/valkey/issues/1177


We decide add the following API to implement this feature:

Original proposal:

int VM_UpdateRunTimeArgs(ValkeyModuleCtx *ctx, int index, char *value);


Updated proposal:

ValkeyModuleString **values VM_GetRuntimeArgs(ValkeyModuleCtx *ctx);
**int VM_UpdateRuntimeArgs(ValkeyModuleCtx *ctx, int argc,
ValkeyModuleString **values);



Why we do not recommend the following way: 


MODULE UNLOAD
Update module args in the conf file
MODULE LOAD

I think there are the following disadvantages:

1. Some modules can not be unloaded. Such as the example module
datatype.so, which is tests/modules/datatype.so
2. it is not atomic operation for MODULE UNLOAD + MODULE LOAD
3. sometimes, if we just run the module unload, the client business
could be interrupted

---------

Signed-off-by: hwware <wen.hui.ware@gmail.com>
2024-12-05 11:58:24 -05:00
uriyage
9f8b174c2e
Optimize IO thread offload for modified argv (#1360)
### Improve expired commands performance with IO threads

#### Background
In our IO threads architecture, IO threads allocate client argv's and
later when we free it after processCommand we offload its free to the IO
threads.
With jemalloc, it's crucial that the same thread that allocates memory
also frees it.

For some commands we modify the client's argv in the main thread during
command processing (for example in `SET EX` command we rewrite the
command to use absolute time for replication propagation).

#### Current issues
1. When commands are rewritten (e.g., expire commands), we store the
original argv
   in `c->original_argv`. However, we're currently:
   - Freeing new argv (allocated by main thread) in IO threads
   - Freeing original argv (allocated by IO threads) in main thread
2. Currently, `c->original_argv` points to new array with old 
objects, while `c->argv` has old array with new objects, making memory
free management complicated.

#### Changes
1. Refactored argv modification handling code to ensure consistency -
both array and objects are now either all new or all old
2. Moved original_argv cleanup to happen in resetClient after argv
cleanup
3. Modified IO threads code to properly handle original argv cleanup
when argv are modified.

#### Performance Impact
Benchmark with `SET EX` commands (650 clients, 512 byte value, 8 IO
threads):
- New implementation: **729,548 ops/sec**
- Old implementation: **633,243 ops/sec**
Representing a **~15%** performance improvement due to more efficient
memory handling.

---------

Signed-off-by: Uri Yagelnik <uriy@amazon.com>
Signed-off-by: ranshid <88133677+ranshid@users.noreply.github.com>
Co-authored-by: ranshid <88133677+ranshid@users.noreply.github.com>
2024-12-03 19:20:31 +02:00
Jim Brunner
397201c48f
Refactor of ActiveDefrag to reduce latencies (#1242)
Refer to:  https://github.com/valkey-io/valkey/issues/1141

This update refactors the defrag code to:
* Make the overall code more readable and maintainable
* Reduce latencies incurred during defrag processing

With this update, the defrag cycle time is reduced to 500us, with more
frequent cycles. This results in much more predictable latencies, with a
dramatic reduction in tail latencies.

(See https://github.com/valkey-io/valkey/issues/1141 for more complete
details.)

This update is focused mostly on the high-level processing, and does NOT
address lower level functions which aren't currently timebound (e.g.
`activeDefragSdsDict()`, and `moduleDefragGlobals()`). These are out of
scope for this update and left for a future update.

I fixed `kvstoreDictLUTDefrag` because it was using up to 7ms on a CME
single shard. See original github issue for performance details.

---------

Signed-off-by: Jim Brunner <brunnerj@amazon.com>
Signed-off-by: Madelyn Olson <madelyneolson@gmail.com>
Co-authored-by: Madelyn Olson <madelyneolson@gmail.com>
2024-12-03 08:42:29 -08:00
Nugine
3df609ef06
Optimize PFCOUNT, PFMERGE command by SIMD acceleration (#1293)
This PR optimizes the performance of HyperLogLog commands (PFCOUNT,
PFMERGE) by adding AVX2 fast paths.

Two AVX2 functions are added for conversion between raw representation
and dense representation. They are 15 ~ 30 times faster than scalar
implementaion. Note that sparse representation is not accelerated.

AVX2 fast paths are enabled when the CPU supports AVX2 (checked at
runtime) and the hyperloglog configuration is default (HLL_REGISTERS ==
16384 && HLL_BITS == 6).

`PFDEBUG SIMD (ON|OFF)` subcommand is added for unit tests. A new TCL
unit test checks that the results produced by non-AVX2 and AVX2
implementations are exactly equal.

When merging 3 dense hll structures, the benchmark shows a 12x speedup
compared to the scalar version.

```
pfcount key1 key2 key3
pfmerge keyall key1 key2 key3
```

```
======================================================================================================
Type             Ops/sec    Avg. Latency     p50 Latency     p99 Latency   p99.9 Latency       KB/sec 
------------------------------------------------------------------------------------------------------
PFCOUNT-scalar    5665.56        35.29839        32.25500        63.99900        67.58300       608.60
PFCOUNT-avx2     72377.83         2.75834         2.67100         5.34300         6.81500      7774.96
------------------------------------------------------------------------------------------------------
PFMERGE-scalar    9851.29        20.28806        20.09500        36.86300        39.16700       615.71
PFMERGE-avx2    125621.89         1.59126         1.55100         3.11900         4.70300     15702.74
------------------------------------------------------------------------------------------------------

scalar: valkey:unstable  2df56d87c0ebe802f38e8922bb2ea1e4ca9cfa76
avx2:   Nugine:hll-simd  8f9adc34021080d96e60bd0abe06b043f3ed0275

CPU:    13th Gen Intel® Core™ i9-13900H × 20
Memory: 32.0 GiB
OS:     Ubuntu 22.04.5 LTS
```

Experiment repo: https://github.com/Nugine/redis-hyperloglog
Benchmark script:
https://github.com/Nugine/redis-hyperloglog/blob/main/scripts/memtier.sh
Algorithm:
https://github.com/Nugine/redis-hyperloglog/blob/main/cpp/bench.cpp

---------

Signed-off-by: Xuyang Wang <xuyangwang@link.cuhk.edu.cn>
2024-12-02 19:40:38 +01:00
Binbin
fbbfe5d3d3
Print logs when the cluster state changes to fail or the fail reason changes (#1188)
This log allows us to easily distinguish between full coverage and
minority partition when the cluster fails. Sometimes it is not easy
to see the minority partition in a healthy shards (both primary and
replicas).

And we decided not to add a cluster_fail_reason field to cluster info.
Given that there are only two reasons and both are well-known and if
we ended up adding more down the road we can add it in the furture.

Signed-off-by: Binbin <binloveplay1314@qq.com>
2024-12-02 15:55:24 +08:00
zhenwei pi
4695d118dd
RDMA builtin support (#1209)
There are several patches in this PR:

* Abstract set/rewrite config bind option: `bind` option is a special
config, `socket` and `tls` are using the same one. However RDMA uses the
similar style but different one. Use a bit abstract work to make it
flexible for both `socket` and `RDMA`. (Even for QUIC in the future.)
* Introduce closeListener for connection type: closing socket by a
simple syscall would be fine, RDMA has complex logic. Introduce
connection type specific close listener method.
* RDMA: Use valkey.conf style instead of module parameters: use
`--rdma-bind` and `--rdma-port` style instead of module parameters. The
module style config `rdma.bind` and `rdma.port` are removed.
* RDMA: Support builtin: support `make BUILD_RDMA=yes`. module style is
still kept for now.

Signed-off-by: zhenwei pi <pizhenwei@bytedance.com>
2024-11-29 11:13:34 +01:00
zvi-code
fd58f8d058
Disable lazy free in defrag test to fix 32bit daily failure (#1370)
Signed-off-by: Zvi Schneider <zvi.schneider22@gmail.com>
Co-authored-by: Zvi Schneider <zvi.schneider22@gmail.com>
2024-11-28 16:27:00 +01:00
Binbin
db7b7396ff
Make KEYS can visit expired key in import-source state (#1326)
After #1185, a client in import-source state can visit expired key
both in read commands and write commands, this commit handle
keyIsExpired function to handle import-source state as well, so
KEYS can visit the expired key.

This is not particularly important, but it ensures the definition,
also doing some cleanup around the test, verified that the client
can indeed visit the expired key.

Signed-off-by: Binbin <binloveplay1314@qq.com>
2024-11-28 00:16:55 +08:00
Binbin
b9d224097a
Brocast a PONG to all node in cluster when role changed (#1295)
When a node role changes, we should brocast the change to notify other nodes.
For example, one primary and one replica, after a failover, the replica became
a new primary, the primary became a new replica.

And then we trigger a second cluster failover for the new replica, the
new replica will send a MFSTART to its primary, ie, the new primary.

But the new primary may reject the MFSTART due to this logic:
```
    } else if (type == CLUSTERMSG_TYPE_MFSTART) {
        if (!sender || sender->replicaof != myself) return 1;
```

In the new primary views, sender is still a primary, and sender->replicaof
is NULL, so we will return. Then the manual failover timedout.

Another possibility is that other primaries refuse to vote after receiving
the FAILOVER_AUTH_REQUEST, since in their's views, sender is still a primary,
so it refuse to vote, and then manual failover timedout.
```
void clusterSendFailoverAuthIfNeeded(clusterNode *node, clusterMsg *request) {
    ...
        if (clusterNodeIsPrimary(node)) {
            serverLog(LL_WARNING, "Failover auth denied to...
```

The reason is that, currently, we only update the node->replicaof information
when we receive a PING/PONG from the sender. For details, see clusterProcessPacket.
Therefore, in some scenarios, such as clusters with many nodes and a large
cluster-ping-interval (that is, cluster-node-timeout), the role change of the node
will be very delayed.

Added a DEBUG DISABLE-CLUSTER-RANDOM-PING command, send cluster ping
to a random node every second (see clusterCron).

Signed-off-by: Binbin <binloveplay1314@qq.com>
2024-11-23 00:22:04 +08:00
Binbin
c4be326c32
Make manual failover reset the on-going election to promote failover (#1274)
If a manual failover got timed out, like the election don't get the
enough votes, since we have a auth_timeout and a auth_retry_time, a
new manual failover will not be able to proceed on the replica side.

Like if we initiate a new manual failover after a election timed out,
we will pause the primary, but on the replica side, due to retry_time,
replica does not trigger the new election and the manual failover will
eventually time out.

In this case, if we initiate manual failover again and there is an
ongoing election, we will reset it so that the replica can initiate
a new election at the manual failover's request.

Signed-off-by: Binbin <binloveplay1314@qq.com>
2024-11-22 10:28:59 +08:00
Yanqi Lv
4986310945
Import-mode: Avoid expiration and eviction during data syncing (#1185)
New config: `import-mode (yes|no)`

New command: `CLIENT IMPORT-SOURCE (ON|OFF)`

The config, when set to `yes`, disables eviction and deletion of expired
keys, except for commands coming from a client which has marked itself
as an import-source, the data source when importing data from another
node, using the CLIENT IMPORT-SOURCE command.

When we sync data from the source Valkey to the destination Valkey using
some sync tools like
[redis-shake](https://github.com/tair-opensource/RedisShake), the
destination Valkey can perform expiration and eviction, which may cause
data corruption. This problem has been discussed in
https://github.com/redis/redis/discussions/9760#discussioncomment-1681041
and Redis already have a solution. But in Valkey we haven't fixed it by
now.

E.g. we call `set key 1 ex 1` on the source server and transfer this
command to the destination server. Then we call `incr key` on the source
server before the key expired, we will have a key on the source server
with a value of 2. But when the command arrived at the destination
server, the key may be expired and has deleted. So we will have a key on
the destination server with a value of 1, which is inconsistent with the
source server.

In standalone mode, we can use writable replica to simplify the sync
process. However, in cluster mode, we still need a sync tool to help us
transfer the source data to the destination. The sync tool usually work
as a normal client and the destination works as a primary which keep
expiration and eviction.

In this PR, we add a new mode named 'import-mode'. In this mode, server
stop expiration and eviction just like a replica. Notice that this mode
exists only in sync state to avoid data inconsistency caused by
expiration and eviction. Import mode only takes effect on the primary.
Sync tools can mark their clients as an import source by `CLIENT
IMPORT-SOURCE`, which work like a client from primary and can visit
expired keys in `lookupkey`.

**Notice: during the migration, other clients, apart from the import
source, should not access the data imported by import source.**

---------

Signed-off-by: lvyanqi.lyq <lvyanqi.lyq@alibaba-inc.com>
Signed-off-by: Yanqi Lv <lvyanqi.lyq@alibaba-inc.com>
Co-authored-by: Madelyn Olson <madelyneolson@gmail.com>
2024-11-19 21:53:19 +01:00
Binbin
ee386c92ff
Manual failover vote is not limited by two times the node timeout (#1305)
This limit should not restrict manual failover, otherwise in some
scenarios, manual failover will time out.

For example, if some FAILOVER_AUTH_REQUESTs or some FAILOVER_AUTH_ACKs
are lost during a manual failover, it cannot vote in the second manual
failover. Or in a mixed scenario of plain failover and manual failover,
it cannot vote for the subsequent manual failover.

The problem with the manual failover retry is that the mf will pause
the client 5s in the primary side. So every retry every manual failover
timed out is a bad move.

---------

Signed-off-by: Binbin <binloveplay1314@qq.com>
Co-authored-by: Viktor Söderqvist <viktor.soderqvist@est.tech>
2024-11-19 11:17:20 -05:00
Binbin
aa2dd3ecb8
Stabilize replica migration test to make sure cluster config is consistent (#1311)
CI report this failure:
```
[exception]: Executing test client: MOVED 1 127.0.0.1:22128.
MOVED 1 127.0.0.1:22128
    while executing
"wait_for_condition 1000 50 {
            [R 3 get key_991803] == 1024 && [R 3 get key_977613] == 10240 &&
            [R 4 get key_991803] == 1024 && ..."
```

This may be because, even though the cluster state becomes OK,
The cluster still has inconsistent configuration for a short period
of time. We make sure to wait for the config to be consistent.

Signed-off-by: Binbin <binloveplay1314@qq.com>
2024-11-16 18:58:25 +08:00
Binbin
92181b6797
Fix primary crash when processing dirty slots during shutdown wait / failover wait / client pause (#1131)
We have an assert in propagateNow. If the primary node receives a
CLUSTER UPDATE such as dirty slots during SIGTERM waitting or during
a manual failover pausing or during a client pause, the delKeysInSlot
call will trigger this assert and cause primary crash.

In this case, we added a new server_del_keys_in_slot state just like
client_pause_in_transaction to track the state to avoid the assert
in propagateNow, the dirty slots will be deleted in the end without
affecting the data consistency.

Signed-off-by: Binbin <binloveplay1314@qq.com>
Co-authored-by: Viktor Söderqvist <viktor.soderqvist@est.tech>
2024-11-15 16:47:15 +08:00
Binbin
2df56d87c0
Fix empty primary may have dirty slots data due to bad migration (#1285)
If we become an empty primary for some reason, we still need to
check if we need to delete dirty slots, because we may have dirty
slots data left over from a bad migration. Like the target node forcibly
executes CLUSTER SETSLOT NODE to take over the slot without
performing key migration.

Signed-off-by: Binbin <binloveplay1314@qq.com>
2024-11-11 22:13:47 +08:00
Binbin
a2d22c63c0
Fix replica not able to initate election in time when epoch fails (#1009)
If multiple primary nodes go down at the same time, their replica nodes will
initiate the elections at the same time. There is a certain probability that
the replicas will initate the elections in the same epoch.

And obviously, in our current election mechanism, only one replica node can
eventually get the enough votes, and the other replica node will fail to win
due the the insufficient majority, and then its election will time out and
we will wait for the retry, which result in a long failure time.

If another node has been won the election in the failover epoch, we can assume
that my election has failed and we can retry as soom as possible.

Signed-off-by: Binbin <binloveplay1314@qq.com>
2024-11-11 22:12:49 +08:00
Wen Hui
3672f9b2c3
Revert "Decline unsubscribe related command in non-subscribed mode" (#1265)
This PR goal is to revert the changes on PR
https://github.com/valkey-io/valkey/pull/759

Recently, we got some reports that in Valkey 8.0 the PR
https://github.com/valkey-io/valkey/pull/759 (Decline unsubscribe
related command in non-subscribed mode) causes break change.
(https://github.com/valkey-io/valkey/issues/1228)

Although from my thought, call commands "unsubscribeCommand",
"sunsubscribeCommand", "punsubscribeCommand" in request-response mode
make no sense. This is why I created PR
https://github.com/valkey-io/valkey/pull/759

But breaking change is always no good, @valkey-io/core-team How do you
think we revert this PR code changes?

Signed-off-by: hwware <wen.hui.ware@gmail.com>
2024-11-07 20:05:16 -05:00
Binbin
22bc49c4a6
Try to stabilize the failover call in the slot migration test (#1078)
The CI report replica will return the error when performing CLUSTER
FAILOVER:
```
-ERR Master is down or failed, please use CLUSTER FAILOVER FORCE
```

This may because the primary state is fail or the cluster connection
is disconnected during the primary pause. In this PR, we added some
waits in wait_for_role, if the role is replica, we will wait for the
replication link and the cluster link to be ok.

Signed-off-by: Binbin <binloveplay1314@qq.com>
2024-11-07 13:42:20 +08:00
Binbin
a0b1cbad83
Change errno from EEXIST to EALREADY in serverFork if child process exists (#1258)
We set this to EEXIST in 568c2e039bac388003068cd8debb2f93619dd462,
it prints "File exists" which is not quite accurate,
change it to EALREADY, it will print "Operation already in progress".

Signed-off-by: Binbin <binloveplay1314@qq.com>
2024-11-07 12:13:00 +08:00
Binbin
12c5af03b8
Remove empty DB check branch in KEYS command (#1259)
We don't think we really care about optimizing for the empty DB case,
which should be uncommon. Adding branches hurts branch prediction.

Signed-off-by: Binbin <binloveplay1314@qq.com>
2024-11-06 10:32:00 +08:00
Binbin
a102852d5e
Fix timing issue in cluster-shards tests (#1243)
The cluster-node-timeout is 3000 in our tests, the timing test wasn't
succeeding, so extending the wait_for made them much more reliable.

Signed-off-by: Binbin <binloveplay1314@qq.com>
2024-11-02 19:51:14 +08:00
Binbin
2743b7e04b
Fix SORT GET to ignore special pattern # in cluster slot check (#1182)
This special pattern '#' is used to get the element itself,
it does not actually participate in the slot check.

In this case, passing `GET #` will cause '#' to participate
in the slot check, causing the command to get an
`pattern may be in different slots` error.

Signed-off-by: Binbin <binloveplay1314@qq.com>
2024-10-19 14:56:10 +08:00
Binbin
701ab72429
Remove the restriction that cli --cluster create requires at least 3 primary nodes (#1075)
There is no limitation in Valkey to create a cluster with 1 or 2 primaries,
only that it cannot do automatic failover. Remove this restriction and
add `are you sure` prompt to prompt the user.

This allow we use it to create a test cluster by cli or by
create-cluster.

Signed-off-by: Binbin <binloveplay1314@qq.com>
2024-10-17 13:33:44 +08:00
Nadav Levanoni
136d0fd212
Add 'WithDictIndex' expiry API and update RANDOMKEY command (#1155)
https://github.com/valkey-io/valkey/issues/1145

First part of a two-step effort to add `WithSlot` API for expiry. This
PR is to fix a crash that occurs when a RANDOMKEY uses a different slot
than the cached slot of a client during a multi-exec.

The next part will be to utilize the new API as an optimization to
prevent duplicate work when calculating the slot for a key.

---------

Signed-off-by: Nadav Levanoni <nadavl@amazon.com>
Signed-off-by: Madelyn Olson <madelyneolson@gmail.com>
Co-authored-by: Nadav Levanoni <nadavl@amazon.com>
Co-authored-by: Madelyn Olson <madelyneolson@gmail.com>
2024-10-16 17:40:11 -07:00
Binbin
247a8f23c5
Fix FUNCTION KILL error message being displayed as SCRIPT KILL (#1171)
The client that was killed by FUNCTION KILL received a reply of
SCRIPT KILL and the server log also showed SCRIPT KILL.

Signed-off-by: Binbin <binloveplay1314@qq.com>
2024-10-15 23:32:42 +08:00
Binbin
416defdc0e
Minor cleanups in acl-v2 tests (#1166)
1. Make sure to assert the ERR prefix.
2. Match "Syntax error*" in case of the message change.

Signed-off-by: Binbin <binloveplay1314@qq.com>
2024-10-15 10:30:03 +08:00
Binbin
e50f31ef3a
Fix aof race in shutdown nosave timedout script test (#1156)
Ci report this failure:
```
*** [err]: SHUTDOWN NOSAVE can kill a timedout script anyway in tests/unit/scripting.tcl
Expected 'BUSY Valkey is busy running a script. *' to match '*connection refused*' (context: type eval line 8 cmd {assert_match {*connection refused*} $e} proc ::test)
```

We can see the logs the shutdown got rejected because there is an AOFRW
pending:
```
Writing initial AOF, can't exit.
Errors trying to shut down the server. Check the logs for more information.
```

The reason is that the previous test enabled the aof.

Signed-off-by: Binbin <binloveplay1314@qq.com>
2024-10-13 22:06:28 +08:00
Shivshankar
079f18ad97
Add io-threads-do-reads config to deprecated config table to have no effect. (#1138)
this fixes: https://github.com/valkey-io/valkey/issues/1116

_Issue details from #1116 by @zuiderkwast_ 

> This config is undocumented since #758. The default was changed to
"yes" and it is quite useless to set it to "no". Yet, it can happen that
some user has an old config file where it is explicitly set to "no". The
result will be bad performace, since I/O threads will not do all the
I/O.
> 
> It's indeed confusing.
> 
> 1. Either remove the whole option from the code. And thus no need for
documentation. _OR:_
> 2. Introduce the option back in the configuration, just as a comment
is fine. And showing the default value "yes": `# io-threads-do-reads
yes` with additional text.
> 
> _Originally posted by @melroy89 in [#1019 (reply in
thread)](https://github.com/orgs/valkey-io/discussions/1019#discussioncomment-10824778)_

---------

Signed-off-by: Shivshankar-Reddy <shiva.sheri.github@gmail.com>
2024-10-10 17:46:09 +02:00
Roshan Khatri
9b8a06137c
Fix empty response for ACL CAT category subcommand for module defined categories (#1140)
The module commands which were added to acl categories were getting
skipped when `ACL CAT category` command was executed.

This PR fixes the bug.
Before:
```
127.0.0.1:6379> ACL CAT foocategory
(empty array)
```
After:
```
127.0.0.1:6379> ACL CAT foocategory
aclcheck.module.command.test.add.new.aclcategories
```

---------

Signed-off-by: Roshan Khatri <rvkhatri@amazon.com>
Co-authored-by: Harkrishn Patro <bunty.hari@gmail.com>
2024-10-09 21:20:47 -07:00
Binbin
1892f8a731
Add server log when module load fails with busy name (#1084)
Currently when module loading fails due to busy name, we
don't have a clean way to assist to troubleshooting.

Case 1: when loading the same module multiple times, we can
not detemine the cause of its failure without referring to
the module list or the earliest module load log. The log
may not exist and sometimes it is difficult for people
to associate module list.

Case 2: when multiple modules use the same module name,
we can not quickly associate the busy name without referring
to the module list and the earliest module load log.
Different people wrote modules with the same module name,
they don't easily associate module name.

So in this PR, when doing module onload, we will try to
print a busy name log if this happen. Currently we check
ctx.module since if it is NULL it means the Init call
failed, and Init currently only fails with busy name.

It's kind of ugly. It would have been nice if we could have had a
better way for onload to signal why the load failed.

Signed-off-by: Binbin <binloveplay1314@qq.com>
2024-10-09 16:10:29 +08:00
Madelyn Olson
150c197bdd
Apply CVE patches for CVE-2024-31449, CVE-2024-31227, CVE-2024-31228 (#1115)
Applying the CVEs against mainline.

(CVE-2024-31449) Lua library commands may lead to stack overflow and
potential RCE.
(CVE-2024-31227) Potential Denial-of-service due to malformed ACL
selectors.
(CVE-2024-31228) Potential Denial-of-service due to unbounded pattern
matching.

Signed-off-by: Madelyn Olson <madelyneolson@gmail.com>
2024-10-02 19:22:09 -04:00
Binbin
9827eef4d0
Avoid timing issue in diskless-load-swapdb test (#1077)
Since we paused the primary node earlier, the replica may enter
cluster down due to primary node pfail. Here set allow read to
prevent subsequent read errors.

Signed-off-by: Binbin <binloveplay1314@qq.com>
2024-10-01 13:14:30 -07:00
Viktor Söderqvist
69eddb4874
Speed up AOF rewrite test case (#1093)
These two test cases run in a loop:

* AOF rewrite during write load: RDB preamble=yes
* AOF rewrite during write load: RDB preamble=no

Both of the test cases build up a lot of data (3-4 million keys when I
run locally) so we should empty the data before the second test case.
Otherwise, the second test cases adds keys on top of the keys added in
the first test case, resulting in the double number of keys and takes
more time.

Before this commit:

    [ok]: AOF rewrite during write load: RDB preamble=yes (18225 ms)
    [ok]: AOF rewrite during write load: RDB preamble=no (37249 ms)

After:

    [ok]: AOF rewrite during write load: RDB preamble=yes (18777 ms)
    [ok]: AOF rewrite during write load: RDB preamble=no (19940 ms)

Signed-off-by: Viktor Söderqvist <viktor.soderqvist@est.tech>
2024-09-30 19:55:23 +02:00
Viktor Söderqvist
99865b197c
Fix bug for CLUSTER SLOTS from EVAL over TLS (#1072)
For fake clients like the ones used for Lua and modules, we don't
determine TLS in the right way, causing CLUSTER SLOTS from EVAL over TLS
to fail a debug-assert.

This error was introduced when the caching of CLUSTER SLOTS was
introduced, i.e. in 8.0.0.

Signed-off-by: Viktor Söderqvist <viktor.soderqvist@est.tech>
2024-09-25 03:55:53 -04:00
Binbin
80fcbd3fec
Fix module / script call CLUSTER SLOTS / SHARDS fake client check crash (#1063)
The reason is VM_Call will use a fake client without connection,
so we also need to check if c->conn is NULL.

This also affects scripts. If they are called in the script, the
server will crash. Injecting commands into AOF will also cause
startup failure.

Fixes #1054.

Signed-off-by: Binbin <binloveplay1314@qq.com>
2024-09-25 14:50:48 +08:00
Binbin
56fba564b6
Print an empty primary log when primary lost its last slot (#1064)
The one in CLUSTER SETSLOT help us keep track of state better,
of course it also can make the test case happy.

The one in gossip process fixes a problem that a replica can
print a log saying it is an empty primary.

Signed-off-by: Binbin <binloveplay1314@qq.com>
Co-authored-by: Ping Xie <pingxie@outlook.com>
2024-09-23 13:14:09 +08:00
Binbin
d9c41e9ef9
Fix timing issue in the new tot-net-out replica test (#1060)
Apparently there is a timing issue when using wait_for_ofs_sync:
```
[exception]: Executing test client: can't read "out_before": no such variable.
can't read "out_before": no such variable
```

The reason is that if the connection between the primary
and the replica is not established yet, the master_repl_offset
of the primary and replica in wait_for_ofs_sync is 0, and
the check fails, resulting in no replica client in the
client list below.

In this case, we need to make sure the replica is online
before proceeding.

Signed-off-by: Binbin <binloveplay1314@qq.com>
2024-09-20 14:25:05 +08:00
Shivshankar
56fd97733b
Move printver test to info-command file (#1056)
This fixes: #219

Signed-off-by: Shivshankar-Reddy <shiva.sheri.github@gmail.com>
2024-09-20 10:18:19 +08:00
Binbin
dcc7678fc4
Fix replica unable trigger migration when it received CLUSTER SETSLOT in advance (#981)
Fix timing issue in evaluating `cluster-allow-replica-migration` for replicas

There is a timing bug where the primary and replica have different 
`cluster-allow-replica-migration` settings. In issue #970, we found that if 
the replica receives `CLUSTER SETSLOT` before the gossip update, it remains 
in the original shard. This happens because we only process the 
`cluster-allow-replica-migration` flag for primaries during `CLUSTER SETSLOT`.

This commit fixes the issue by also evaluating this flag for replicas in the 
`CLUSTER SETSLOT` path, ensuring correct replica migration behavior.

Closes #970
---------

Signed-off-by: Binbin <binloveplay1314@qq.com>
Co-authored-by: Ping Xie <pingxie@outlook.com>
2024-09-13 15:32:20 -07:00
Ping Xie
3cc619f637
Disable flaky empty shard slot migration tests (#1027)
Will continue my investigation offline

Signed-off-by: Ping Xie <pingxie@google.com>
2024-09-13 00:02:39 -07:00
Binbin
f7c5b40183
Avoid false positive in election tests (#984)
The node may not be able to initiate an election in time due to
problems with cluster communication. If an election is initiated,
make sure its offset is 0.

Closes #967.

Signed-off-by: Binbin <binloveplay1314@qq.com>
2024-09-13 14:53:39 +08:00
Ping Xie
76a59788e6
Re-enable empty-shard slot migration tests (#1024)
Related to #734 and #858

Signed-off-by: Ping Xie <pingxie@google.com>
2024-09-11 23:19:32 -07:00
uriyage
8cca11ac54
Fix wrong count for replica's tot-net-out (#1013)
Fix duplicate calculation of replica's `net_output_bytes`

- Remove redundant calculation leftover from previous refactor
- Add test to prevent regression

Signed-off-by: Uri Yagelnik <uriy@amazon.com>
Signed-off-by: Binbin <binloveplay1314@qq.com>
Co-authored-by: Binbin <binloveplay1314@qq.com>
2024-09-12 10:36:40 +08:00
Madelyn Olson
2b207ee1b3
Improve stability of hostnames test (#1016)
Maybe partially resolves https://github.com/valkey-io/valkey/issues/952.

The hostnames test relies on an assumption that node zero and node six
don't communicate with each other to test a bunch of behavior in the
handshake stake. This was done by previously dropping all meet packets,
however it seems like there was some case where node zero was sending a
single pong message to node 6, which was partially initializing the
state.

I couldn't track down why this happened, but I adjusted the test to
simply pause node zero which also correctly emulates the state we want
to be in since we're just testing state on node 6, and removes the
chance of errant messages. The test was failing about 5% of the time
locally, and I wasn't able to reproduce a failure with this new
configuration.

---------

Signed-off-by: Madelyn Olson <madelyneolson@gmail.com>
2024-09-11 09:52:34 -07:00
Binbin
4033c99ef5
Fix module RdbLoad wrongly disable the AOF (#1001)
In RdbLoad, we disable AOF before emptyData and rdbLoad to prevent copy-on-write issues. After rdbLoad completes, AOF should be re-enabled, but the code incorrectly checks server.aof_state, which has been reset to AOF_OFF in stopAppendOnly. This leads to AOF not being re-enabled after being disabled.
---------

Signed-off-by: Binbin <binloveplay1314@qq.com>
2024-09-10 21:00:08 -07:00
Binbin
50c1fe59f7
Add missing moduleapi getchannels test and fix tests (#1002)
Signed-off-by: Binbin <binloveplay1314@qq.com>
2024-09-10 10:13:54 +08:00
Binbin
5693fe4664
Fix set expire test due to the new lazyfree configs changes (#980)
Test failed because these two PRs #865 and #913.

Signed-off-by: Binbin <binloveplay1314@qq.com>
2024-09-02 22:43:09 +08:00