This includes a way to run two versions of the server from the TCL test
framework. It's a preparation to add more cross-version tests. The
runtest script accepts a new parameter
./runtest --other-server-path path/to/valkey-server
and a new tag "needs:other-server" for test cases and start_server.
Tests with this tag are automatically skipped if `--other-server-path`
is not provided.
This PR adds it in a CI job with Valkey 7.2.7 by downloading a binary
release.
Fixes #76
---------
Signed-off-by: Viktor Söderqvist <viktor.soderqvist@est.tech>
The desync regression test was created as a regression test for the
following bug:
If a NULL terminator is embedded inside an inline or multi-bulk message,
strchr cannot be used to identify the newline (\n) or carriage return
(\r) in the client query buffer, because it stops at the first '\0'.
This can, for example, cause a replica reading the primary stream to
keep filling its query buffer endlessly, consuming more and more memory.
In order to handle the above risk, a check was added to verify that the
inline and multi-bulk sizes do not exceed 64K bytes in the query
buffer. A test was added to verify this.
This PR introduces the following fixes to the desync regression test:
1. Fix the sent payload to flush 1024-byte blocks of 'A's instead of
the literal 'payload', which was sent by mistake.
2. Make sure that the connection is correctly terminated on protocol
error by the server after exceeding the 64K and not over 64K.
3. Add another test case which also verifies the nested bulk with an
embedded NULL terminator (this was not verified before).
fixes https://github.com/valkey-io/valkey/issues/1583
NOTE: Although it is possible to replace strchr with a "safer" utility
(e.g. memchr) which does not stop scanning at the first occurrence of
'\0', we still want to protect against excessive usage of the query
buffer and also preserve the current behavior(?). We will look into
improving this in a follow-up issue.
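The difference between the two calls can be illustrated with a minimal sketch. The helper names below are hypothetical and are not taken from the Valkey source; the buffer simply imitates a query buffer with an embedded '\0'.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Offset of the first '\n' found by strchr, or -1 if none is found.
 * strchr treats buf as a C string, so it never looks past the first '\0'. */
static long newline_offset_strchr(const char *buf) {
    assert(buf != NULL);
    const char *p = strchr(buf, '\n');
    return p ? (long)(p - buf) : -1;
}

/* Offset of the first '\n' within the first len bytes, or -1 if none.
 * memchr scans raw bytes, so an embedded '\0' does not stop it. */
static long newline_offset_memchr(const char *buf, size_t len) {
    assert(buf != NULL);
    const char *p = memchr(buf, '\n', len);
    return p ? (long)(p - buf) : -1;
}
```

With a buffer such as `"PING\0junk\r\n"`, the strchr variant reports no newline at all, while the memchr variant finds it when given the real buffer length.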
---------
Signed-off-by: Ran Shidlansik <ranshid@amazon.com>
Signed-off-by: ranshid <88133677+ranshid@users.noreply.github.com>
Change the format of the dual channel replication logs so that it does
not conflict with existing log formats, such as those of modules.
Fixes: https://github.com/valkey-io/valkey/issues/1509
Signed-off-by: Ran Shidlansik <ranshid@amazon.com>
The test case checks for expire-cycle in LATENCY LATEST, but with the
new hash table, the expire-cycle is too fast to be logged by the latency
monitor. Lower the latency monitor threshold to make it more likely to
be logged.
Fixes #1580
---------
Signed-off-by: Viktor Söderqvist <viktor.soderqvist@est.tech>
When processing a cluster bus PING extension, there is a memory leak
when adding a new key to the `nodes_black_list` dict. We now make sure
to free the key `sds` if the dict did not take ownership of it.
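The ownership rule can be sketched as follows. This is a simplified stand-in, not the cluster code: `str_set` and `blacklist_add` are hypothetical names, and a small fixed array takes the place of the dict.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define SET_CAP 8

/* Hypothetical stand-in for a dict keyed by heap-allocated strings. */
typedef struct {
    char *keys[SET_CAP];
    size_t used;
} str_set;

/* Returns 1 and takes ownership of key if it was added. Returns 0 and
 * leaves ownership with the caller if the key exists or the set is full. */
static int str_set_add(str_set *s, char *key) {
    for (size_t i = 0; i < s->used; i++) {
        if (strcmp(s->keys[i], key) == 0) return 0;
    }
    if (s->used == SET_CAP) return 0;
    s->keys[s->used++] = key;
    return 1;
}

/* Copies name to the heap and tries to add it. The copy is freed when the
 * set did not take ownership of it. Returns 1 if the key was added. */
static int blacklist_add(str_set *s, const char *name) {
    size_t n = strlen(name) + 1;
    char *key = malloc(n);
    assert(key != NULL);
    memcpy(key, name, n);
    if (!str_set_add(s, key)) {
        free(key); /* without this, the rejected key would leak */
        return 0;
    }
    return 1;
}

/* Frees all keys owned by the set. */
static void str_set_clear(str_set *s) {
    for (size_t i = 0; i < s->used; i++) free(s->keys[i]);
    s->used = 0;
}
```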
Signed-off-by: Pierre Turin <pieturin@amazon.com>
This issue affected only two message types (CLUSTERMSG_TYPE_PUBLISH and CLUSTERMSG_TYPE_PUBLISHSHARD) because they used a light message header, which caused the CLUSTER INFO stats to miss sent/received message information for those types.
---------
Signed-off-by: Harkrishn Patro <harkrisp@amazon.com>
Signed-off-by: Harkrishn Patro <bunty.hari@gmail.com>
Co-authored-by: Binbin <binloveplay1314@qq.com>
This commit creates a new compilation unit for the scripting engine code
by extracting the existing code from the functions unit.
We're doing this refactor to prepare the code for running the `EVAL`
command using different scripting engines.
This PR has a module API change: we changed the type of error messages
returned by the callback
`ValkeyModuleScriptingEngineCreateFunctionsLibraryFunc` to be a
`ValkeyModuleString` (aka `robj`).
This PR also fixes #1470.
---------
Signed-off-by: Ricardo Dias <ricardo.dias@percona.com>
Some commands that use unix-time, such as `EXPIREAT` and `SET EXAT`, should include the deleted keys in the `expired_keys` statistics if the specified time has already passed, and notifications should be sent as for an expired key.
---------
Signed-off-by: Ray Cao <zisong.cw@alibaba-inc.com>
Just like the spell-check workflow, we should allow this one to be
triggered on push events, so that forks can notice formatting issues
well before a PR is submitted.
Signed-off-by: Binbin <binloveplay1314@qq.com>
Adds filter options to CLIENT LIST:
* USER <username>
Return clients authenticated by <username>.
* ADDR <ip:port>
Return clients connected from the specified address.
* LADDR <ip:port>
Return clients connected to the specified local address.
* SKIPME (YES|NO)
Exclude the current client from the list (default: no).
* MAXAGE <maxage>
Only list connections older than the specified age.
Modifies the ID filter to CLIENT KILL to allow multiple IDs
* ID <client-id> [<client-id>...]
Kill connections by client ids.
This makes CLIENT LIST and CLIENT KILL accept the same options.
For backward compatibility, the default value for SKIPME is NO for
CLIENT LIST and YES for CLIENT KILL.
The MAXAGE filter comes from CLIENT KILL, where it *keeps* clients with
the given max age and kills the older ones. This logic is odd for
CLIENT LIST, but is kept for similarity with CLIENT KILL, for the use
case of first testing manually using CLIENT LIST, and then running
CLIENT KILL with the same filters.
The `ID client-id [client-id ...]` no longer needs to be the last
filter. The parsing logic determines if an argument is an ID or not
based on whether it can be parsed as an integer or not.
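A rough sketch of that decision, assuming the check is "the whole argument is a base-10 integer". The function names are hypothetical and the real parser may be stricter (for example, `strtoll` also accepts leading whitespace and a sign).

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Returns 1 and stores the value in *id if arg is entirely a base-10
 * integer. Returns 0 otherwise, e.g. for a filter keyword such as "USER". */
static int parse_client_id(const char *arg, long long *id) {
    assert(arg != NULL && id != NULL);
    char *end;
    errno = 0;
    long long v = strtoll(arg, &end, 10);
    if (end == arg || *end != '\0' || errno == ERANGE) return 0;
    *id = v;
    return 1;
}

/* Counts how many consecutive arguments starting at index start are
 * client ids. The first non-integer argument ends the ID list. */
static int count_ids(const char **argv, int argc, int start) {
    int n = 0;
    long long id;
    while (start + n < argc && parse_client_id(argv[start + n], &id)) n++;
    return n;
}
```

For the arguments `ID 12 34 USER alice`, scanning from the argument after `ID` yields two ids and leaves `USER alice` for the next filter.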
Partly addresses: #668
---------
Signed-off-by: Sarthak Aggarwal <sarthagg@amazon.com>
Add `paused_actions` and `paused_timeout_milliseconds` to INFO Clients
to inform users whether clients are paused.
---------
Signed-off-by: zhaozhao.zz <zhaozhao.zz@alibaba-inc.com>
`sds` is a typedef of `char *`.
`const sds` means `char * const`, i.e. a const-pointer to non-const
content.
More often, you would want `const char *`, i.e. a pointer to
const-content. Until now, it's not possible to express that. This PR
adds `const_sds` which is a pointer to const-content sds.
To get a const-pointer to const-content sds, you can use `const
const_sds`.
In this PR, some uses of `const sds` are replaced by `const_sds`. We can
use it more later.
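A small illustration of the difference. The two typedefs follow the description above; the helper functions are hypothetical, and plain char arrays stand in for real sds strings.

```c
#include <assert.h>
#include <stddef.h>

typedef char *sds;             /* as described in the text */
typedef const char *const_sds; /* pointer to const content */

/* `const sds` only makes the pointer itself const. The content is still
 * writable, so this function can modify the string. */
static void upper_first(const sds s) {
    assert(s != NULL);
    if (s[0] >= 'a' && s[0] <= 'z') s[0] = (char)(s[0] - 'a' + 'A');
}

/* `const_sds` promises not to modify the content. Writing to s[0] here
 * would be a compile error, while advancing the pointer is still allowed. */
static size_t count_char(const_sds s, char c) {
    assert(s != NULL);
    size_t n = 0;
    for (; *s; s++) n += (*s == c);
    return n;
}
```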
Fixes #1542
---------
Signed-off-by: Viktor Söderqvist <viktor.soderqvist@est.tech>
In some cases unix groups could have whitespace and/or `\` in them.
One example is my workstation, a macOS machine in an Active Directory
domain. My user has the group `LD\Domain Users`.
Running `make test` on `unstable` and `8.0` branches fails with:
I'm not sure if we need to fix this in 8.0. But it seems that it should
be fixed in unstable.
Signed-off-by: secwall <secwall@yandex-team.ru>
This PR replaces dict with the new hashtable data structure in the HASH
datatype. There is a new struct for hashtable items which contains a
pointer to value sds string and the embedded key sds string. These
values were previously stored in dictEntry. This structure is kept
opaque so we can easily add small value embedding or other optimizations
in the future.
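One possible shape for such an item, as a sketch only. The struct layout and function names are hypothetical, and plain C strings are used instead of sds. The point is the embedded key plus accessor functions that keep the layout private.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical layout: a pointer to a separately allocated value,
 * followed by the key bytes embedded in the same allocation. */
typedef struct {
    char *value; /* owned, separately allocated */
    char key[];  /* embedded, NUL-terminated */
} hash_entry;

static char *dup_str(const char *s) {
    size_t n = strlen(s) + 1;
    char *copy = malloc(n);
    assert(copy != NULL);
    memcpy(copy, s, n);
    return copy;
}

/* One allocation holds the struct and the key; the value is a second one. */
static hash_entry *entry_create(const char *key, const char *value) {
    size_t klen = strlen(key) + 1;
    hash_entry *e = malloc(sizeof(*e) + klen);
    assert(e != NULL);
    memcpy(e->key, key, klen);
    e->value = dup_str(value);
    return e;
}

/* Callers go through accessors, so the layout can change later. */
static const char *entry_key(const hash_entry *e) { return e->key; }
static const char *entry_value(const hash_entry *e) { return e->value; }

static void entry_free(hash_entry *e) {
    free(e->value);
    free(e);
}
```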
closes #1095
---------
Signed-off-by: Rain Valentine <rsg000@gmail.com>
After #1545 disabled some tests for reply schema validation, we now have
another issue that ECHO is not covered.
```
WARNING! The following commands were not hit at all:
echo
ERROR! at least one command was not hit by the tests
```
This patch adds a test case for ECHO in the unit/other test suite. I
haven't checked if there are more commands that aren't covered.
Signed-off-by: Viktor Söderqvist <viktor.soderqvist@est.tech>
The commands used in valkey-cli tests are not important for the reply
schema validation. Skip them to avoid the problem of tests hanging. This
has failed lately in the daily job:
```
[TIMEOUT]: clients state report follows.
sock55fedcc19be0 => (IN PROGRESS) valkey-cli pubsub mode with single standard channel subscription
Killing still running Valkey server 33357
```
These test cases use the special valkey-cli command `:get pubsub`,
which is internal to valkey-cli rather than a Valkey server command.
This command hangs when compiled with logreqres enabled. The easy
solution is to skip the tests in this setup.
The test cases were introduced in #1432.
Signed-off-by: Viktor Söderqvist <viktor.soderqvist@est.tech>
When the cluster changes, we need to persist the cluster configuration,
and these file IO operations may cause latency.
Signed-off-by: Binbin <binloveplay1314@qq.com>
Imagine we have a cluster, for example a three-shard cluster.
If shard 1 does a CLUSTER RESET HARD, its node name changes,
and the other nodes then mark it as NOADDR since the node
name received in PONG has changed.
In the eyes of the other nodes, there is one working primary node
left but with no address. In this case, the address reported
in MOVED will be invalid and will confuse clients. At the
same time, the replica will not fail over since its primary
is not in the FAIL state, and the cluster looks OK to everyone.
This leaves a cluster that appears OK but has no coverage for
shard 1. Obviously, we should do something like CLUSTER FORGET
to remove the node and fix the cluster before using it.
The point here is that we can mark the NOADDR node as FAIL to
advance the cluster state. If a node is NOADDR, it does not
have a valid address, so we won't reconnect to it, send it
PING, or gossip about it. It therefore seems reasonable to
mark it as FAIL.
Signed-off-by: Binbin <binloveplay1314@qq.com>
When multiple primary nodes fail simultaneously, the cluster cannot recover
within the default effective time (data_age limit). The main reason is that
the voting has no ranking among the replicas of different failed primaries,
which causes too many epoch conflicts.
Therefore, we introduce a ranking based on the failed primary's shard-id.
A new failed_primary_rank variable holds the rank of this instance (myself)
within the list of all failed primaries. It is used during failover to send
the failover election packets in rank order, which effectively avoids
voting conflicts.
If a single primary is down, the behavior is the same as before. If multiple
primaries are down, their replica election initiation time will be delayed
by 500ms according to the ranking.
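A minimal sketch of how such a rank and delay could be computed. It assumes the rank is the number of failed primaries whose shard id sorts before ours and that the delay grows linearly with the rank. Both assumptions, along with the function names, are illustrative and not taken from the implementation.

```c
#include <assert.h>
#include <string.h>

#define RANK_DELAY_MS 500 /* per-rank delay named in the text */

/* Rank of my_shard among the failed primaries: the number of failed
 * shard ids that sort before it (ordering by strcmp is an assumption). */
static int failed_primary_rank(const char *my_shard, const char **failed, int n) {
    assert(my_shard != NULL);
    int rank = 0;
    for (int i = 0; i < n; i++) {
        if (strcmp(failed[i], my_shard) < 0) rank++;
    }
    return rank;
}

/* Extra election delay for a replica whose failed primary has this rank. */
static int election_delay_ms(int rank) { return rank * RANK_DELAY_MS; }
```

With a single failed primary the rank is 0, so no extra delay is added, which matches the unchanged single-failure behavior described above.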
Signed-off-by: Binbin <binloveplay1314@qq.com>
When latency-monitor-threshold is set to 0, the latency monitor is
disabled. In VM_LatencyAddSample, the condition was written
incorrectly, causing us to record latency even when the latency monitor
was turned off.
This bug has been present since the code was first introduced, see
e3b1d6d, which was merged in 2019.
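The intended condition can be sketched as a small predicate. The function name is hypothetical, and the `>=` comparison against the threshold is an assumption about how samples are filtered.

```c
#include <assert.h>

/* Returns 1 if a sample should be recorded. A threshold of 0 means the
 * latency monitor is disabled, so nothing is recorded in that case. */
static int should_record_latency(long long threshold_ms, long long latency_ms) {
    assert(threshold_ms >= 0);
    return threshold_ms != 0 && latency_ms >= threshold_ms;
}
```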
Signed-off-by: Binbin <binloveplay1314@qq.com>
In #1441, we found an assert and decided to remove it and instead
just free the newly created node and close the link, since if we cannot
get the IP from the link it probably means the connection was closed.
```
=== VALKEY BUG REPORT START: Cut & paste starting from here ===
17847:M 19 Dec 2024 00:15:58.021 # === ASSERTION FAILED ===
17847:M 19 Dec 2024 00:15:58.021 # ==> cluster_legacy.c:3252 'nodeIp2String(node->ip, link, hdr->myip) == C_OK' is not true
------ STACK TRACE ------
17847 valkey-server *
src/valkey-server 127.0.0.1:27131 [cluster](clusterProcessPacket+0x1304) [0x4e5634]
src/valkey-server 127.0.0.1:27131 [cluster](clusterReadHandler+0x11e) [0x4e59de]
/__w/valkey/valkey/src/valkey-tls.so(+0x2f1e) [0x7f083983ff1e]
src/valkey-server 127.0.0.1:27131 [cluster](aeMain+0x8a) [0x41afea]
src/valkey-server 127.0.0.1:27131 [cluster](main+0x4d7) [0x40f547]
/lib64/libc.so.6(+0x40c8) [0x7f083985a0c8]
/lib64/libc.so.6(__libc_start_main+0x8b) [0x7f083985a18b]
src/valkey-server 127.0.0.1:27131 [cluster](_start+0x25) [0x410ef5]
```
But it also introduces another assert. The reason is that this new node
is not added to the cluster nodes dict.
```
17128:M 08 Jan 2025 10:51:44.061 # === ASSERTION FAILED ===
17128:M 08 Jan 2025 10:51:44.061 # ==> cluster_legacy.c:1693 'dictDelete(server.cluster->nodes, nodename) == DICT_OK' is not true
------ STACK TRACE ------
17128 valkey-server *
src/valkey-server 127.0.0.1:28627 [cluster][0x4ebdc4]
src/valkey-server 127.0.0.1:28627 [cluster][0x4e81d2]
src/valkey-server 127.0.0.1:28627 [cluster](clusterReadHandler+0x268)[0x4e8618]
/__w/valkey/valkey/src/valkey-tls.so(+0xb278)[0x7f109480b278]
src/valkey-server 127.0.0.1:28627 [cluster](aeMain+0x89)[0x592b09]
src/valkey-server 127.0.0.1:28627 [cluster](main+0x4b3)[0x453e23]
/lib64/libc.so.6(__libc_start_main+0xe5)[0x7f10958bf7e5]
src/valkey-server 127.0.0.1:28627 [cluster](_start+0x2e)[0x454a5e]
```
This closes #1527.
Signed-off-by: Binbin <binloveplay1314@qq.com>
The fix that Redis gave us for the CVE-2024-46981 was freeing lctx.lua,
and I didn't merge it correctly. We made some changes so that we are
able to async free the lua context, so we need to free the passed in
context. This was applied correctly on the two released versions (8.0
and 7.2) just not on unstable.
Signed-off-by: Madelyn Olson <madelyneolson@gmail.com>
This PR removes the `SERVER_TEST` compiler flag from the cmake compile
definitions, as it is no longer required in the new unit test framework; see #428.
Signed-off-by: Karthick Ariyaratnam <karthyuom@gmail.com>
This PR introduces improvements to the hashtable iterator, implementing
the prefetching technique described in the blog post [Unlock One Million
RPS - Part 2](https://valkey.io/blog/unlock-one-million-rps-part2/). The
changes lay the groundwork for further enhancements in use cases
involving iterators. Future PRs will build upon this foundation to
improve performance and functionality in various iterator-dependent
operations.
In the pursuit of maximizing iterator performance, I conducted a
comprehensive series of experiments. My tests encompassed a wide range
of approaches, including processing multiple bucket indices in parallel,
prefetching the next bucket upon completion of the current one, and
several other timing and quantity variations. Surprisingly, after
rigorous testing and performance analysis, the simplest implementation
presented in this PR consistently outperformed all other more complex
strategies.
## Implementation
Each time we start iterating over a bucket, we prefetch data for future
iterations:
- We prefetch the entries of the next bucket (if it exists).
- We prefetch the structure (but not the entries) of the bucket after
the next.
This prefetching is done when we pick up a new bucket, increasing the
chance that the data will be in cache by the time we need it.
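The scheme can be sketched over a plain array of buckets. The `bucket` type, slot count, and function name below are hypothetical and much simpler than the real hashtable; `__builtin_prefetch` is the GCC/Clang builtin, with a no-op fallback for other compilers.

```c
#include <assert.h>
#include <stddef.h>

#define BUCKET_SLOTS 7

/* Hypothetical bucket: a few slots pointing at separately stored entries. */
typedef struct {
    int used;                          /* number of filled slots */
    const long *entries[BUCKET_SLOTS]; /* pointers to the entries */
} bucket;

#if defined(__GNUC__) || defined(__clang__)
#define PREFETCH(p) __builtin_prefetch(p)
#else
#define PREFETCH(p) ((void)(p))
#endif

/* Iterates over all entries and sums them, prefetching ahead as we go. */
static long sum_entries(const bucket *b, size_t nbuckets) {
    assert(b != NULL || nbuckets == 0);
    long total = 0;
    for (size_t i = 0; i < nbuckets; i++) {
        /* On picking up bucket i, prefetch the entries of bucket i+1 ... */
        if (i + 1 < nbuckets) {
            for (int s = 0; s < b[i + 1].used; s++) PREFETCH(b[i + 1].entries[s]);
        }
        /* ... and only the structure (not the entries) of bucket i+2. */
        if (i + 2 < nbuckets) PREFETCH(&b[i + 2]);
        for (int s = 0; s < b[i].used; s++) total += *b[i].entries[s];
    }
    return total;
}
```

Prefetching is only a hint, so the result is the same with or without it; only the memory access timing changes.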
## Performance
The data below was collected by running the KEYS command on a 64-core
Graviton 3 Amazon EC2 instance with 50 million keys of 100 bytes each.
The duration of the “keys *” command was taken from the “info all”
command.
```
+--------------------+------------------+-----------------------------+
| prefetching | Time (seconds) | Keys Processed per Second |
+--------------------+------------------+-----------------------------+
| No | 11.112279 | 4,499,529 |
| Yes | 3.141916 | 15,913,862 |
+--------------------+------------------+-----------------------------+
Improvement:
Comparing the iterator without prefetching to the one with prefetching,
we can see a speed improvement of 11.112279 / 3.141916 ≈ 3.54 times faster.
```
### Save command improvement
#### Setup:
- 64-core Graviton 3 Amazon EC2 instance.
- 50 million keys of 100 bytes each.
- Running valkey server over RAM file system.
- CRC checksum and compression off.
#### Results
```
+--------------------+------------------+-----------------------------+
| prefetching | Time (seconds) | Keys Processed per Second |
+--------------------+------------------+-----------------------------+
| No | 28 | 1,785,700 |
| Yes | 19.6 | 2,550,000 |
+--------------------+------------------+-----------------------------+
Improvement:
- Reduced SAVE time by 30% (8.4 seconds faster)
- Increased key processing rate by 42.8% (764,300 more keys/second)
```
Signed-off-by: NadavGigi <nadavgigi102@gmail.com>
Resolves an issue where valkey-cli did not automatically exit subscribed
mode when the number of pub/sub subscriptions reached zero (previously
filed on Redis):
https://github.com/redis/redis/issues/12592
---------
Signed-off-by: Nikhil Manglore <nmanglor@amazon.com>
This PR replaces dict with hashtable in the ZSET datatype. Instead of
mapping key to score as dict did, the hashtable maps key to a node in
the skiplist, which contains the score. This takes advantage of
hashtable performance improvements and saves 15 bytes per set item - 24
bytes overhead before, 9 bytes after.
Closes #1096
---------
Signed-off-by: Rain Valentine <rsg000@gmail.com>
Signed-off-by: Viktor Söderqvist <viktor.soderqvist@est.tech>
Co-authored-by: Viktor Söderqvist <viktor.soderqvist@est.tech>
# Refactor client structure to use modular data components
## Current State
The client structure allocates memory for replication / pubsub /
multi-keys / module / blocked data for every client, despite these
features being used by only a small subset of clients. In addition the
current field layout in the client struct is suboptimal, with poor
alignment and unnecessary padding between fields, leading to a larger
than necessary memory footprint of 896 bytes per client. Furthermore,
fields that are frequently accessed together during operations are
scattered throughout the struct, resulting in poor cache locality.
## This PR's Change
1. Lazy Initialization
- **Components are only allocated when first used:**
- PubSubData: Created on first SUBSCRIBE/PUBLISH operation
- ReplicationData: Initialized only for replica connections
- ModuleData: Allocated when module interaction begins
- BlockingState: Created when first blocking command is issued
- MultiState: Initialized on MULTI command
2. Memory Layout Optimization:
- Grouped related fields for better locality
- Moved rarely accessed fields (e.g., client->name) to struct end
- Optimized field alignment to eliminate padding
3. Additional changes:
- Moved watched_keys to be statically allocated in the `mstate` struct
- Relocated replication init logic to replication.c
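The lazy initialization pattern from item 1 can be sketched as follows. The struct and function names are hypothetical and only one component (pub/sub) is shown; the real client struct has many more fields.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical per-feature component, allocated only when needed. */
typedef struct {
    int subscriptions;
} pubsub_data;

typedef struct {
    int fd;
    pubsub_data *pubsub; /* NULL until the client first uses pub/sub */
} client;

/* Returns the pub/sub component, allocating it on first use. */
static pubsub_data *client_pubsub(client *c) {
    if (c->pubsub == NULL) {
        c->pubsub = calloc(1, sizeof(*c->pubsub));
        assert(c->pubsub != NULL);
    }
    return c->pubsub;
}

static void client_subscribe(client *c) { client_pubsub(c)->subscriptions++; }

/* Releases optional components; safe to call if none were allocated. */
static void client_free_components(client *c) {
    free(c->pubsub);
    c->pubsub = NULL;
}
```

A client that never subscribes pays only for the pointer, not for the component itself.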
### Key Benefits
- **Efficient Memory Usage:**
- About 41% smaller base client structure - Basic clients now use 528
bytes (down from 896).
- Better memory locality for related operations
- Performance improvement in high throughput scenarios. No performance
regressions in other cases.
### Performance Impact
Tested with 650 clients and 512 bytes values.
#### Single Thread Performance
| Operation | Dataset | New (ops/sec) | Old (ops/sec) | Change % |
|------------|---------|---------------|---------------|-----------|
| SET | 1 key | 261,799 | 258,261 | +1.37% |
| SET | 3M keys | 209,134 | ~209,000 | ~0% |
| GET | 1 key | 281,564 | 277,965 | +1.29% |
| GET | 3M keys | 231,158 | 228,410 | +1.20% |
#### 8 IO Threads Performance
| Operation | Dataset | New (ops/sec) | Old (ops/sec) | Change % |
|------------|---------|---------------|---------------|-----------|
| SET | 1 key | 1,331,578 | 1,331,626 | -0.00% |
| SET | 3M keys | 1,254,441 | 1,152,645 | +8.83% |
| GET | 1 key | 1,293,149 | 1,289,503 | +0.28% |
| GET | 3M keys | 1,152,898 | 1,101,791 | +4.64% |
#### Pipeline Performance (3M keys)
| Operation | Pipeline Size | New (ops/sec) | Old (ops/sec) | Change % |
|-----------|--------------|---------------|---------------|-----------|
| SET | 10 | 548,964 | 538,498 | +1.94% |
| SET | 20 | 606,148 | 594,872 | +1.89% |
| SET | 30 | 631,122 | 616,606 | +2.35% |
| GET | 10 | 628,482 | 624,166 | +0.69% |
| GET | 20 | 687,371 | 681,659 | +0.84% |
| GET | 30 | 725,855 | 721,102 | +0.66% |
### Observations:
1. Single-threaded operations show consistent improvements (1-1.4%)
2. Multi-threaded performance shows significant gains for large
datasets:
- SET with 3M keys: +8.83% improvement
- GET with 3M keys: +4.64% improvement
3. Pipeline operations show consistent improvements:
- SET operations: +1.89% to +2.35%
- GET operations: +0.66% to +0.84%
4. No performance regressions observed in any test scenario
Related issue: https://github.com/valkey-io/valkey/issues/761
---------
Signed-off-by: Uri Yagelnik <uriy@amazon.com>
Signed-off-by: uriyage <78144248+uriyage@users.noreply.github.com>
Co-authored-by: Viktor Söderqvist <viktor.soderqvist@est.tech>
This commit, https://github.com/valkey-io/valkey/pull/1504, moved the
wrong worker to ubuntu 22. We wanted to move codecov and not coverity.
Signed-off-by: Madelyn Olson <madelyneolson@gmail.com>
It's inconvenient for client implementations to extract the
`availability_zone` information from the `INFO` response. The `INFO`
response contains a lot of information that a client implementation
typically doesn't need.
This PR adds the availability zone to the `HELLO` response. Clients
usually already use the `HELLO` command for protocol negotiation and
also get the server `version` and `role` from its response. To keep the
`HELLO` response small, the field is only added if availability zone is
configured.
---------
Signed-off-by: Rueian <rueiancsie@gmail.com>
Reset the GC state before closing the Lua VM to prevent user data from
being wrongly freed while it might still be used in destructor callbacks.
Created and published by Redis in their OSS branch.
Signed-off-by: Madelyn Olson <madelyneolson@gmail.com>
Co-authored-by: YaacovHazan <yaacov.hazan@redis.com>
The explanation on the original commit was wrong. Key based access must
have a `~` in order to correctly configure which key prefixes to apply
the selector to. If this is missing, a server assert will be triggered
later.
Signed-off-by: Madelyn Olson <madelyneolson@gmail.com>
Co-authored-by: YaacovHazan <yaacov.hazan@redis.com>
This may speed up the transition to the fail state a bit.
Previously we would only check when we received a pfail/fail
report from others in gossip. If myself is the last vote,
we can switch to fail directly here without waiting for
the next gossip packet.
Signed-off-by: Binbin <binloveplay1314@qq.com>
When building with `CMake` (especially the targets `valkey-cli`,
`valkey-server` and `valkey-benchmark`) it is possible to have a
successful build while having warnings.
This PR fixes this - which is aligned with how the `Makefile` is working
today:
- Enable `-Wall` + `-Werror` for valkey targets
- Fixed warning in valkey-cli:jsonStringOutput method
Signed-off-by: Eran Ifrah <eifrah@amazon.com>
The issues in #1453 seem to have only shown up since we moved to
ubuntu 24, as part of the rolling `ubuntu-latest` migration from 22->24.
Closes #1453.
Signed-off-by: Madelyn Olson <madelyneolson@gmail.com>
When looking up a key in no-touch mode, `LOOKUP_NOTOUCH` is set to avoid
updating the last access time in `lookupKey`. An exception must be made
for the `TOUCH` command which must always update the key.
When called from a script, `server.executing_client` will point to the
`TOUCH` command, while `server.current_client` will point to e.g. an
`EVAL` command. So, we must use the former to find out the currently
executing command if defined.
This fix addresses the issue where TOUCH wasn't updating key access
times when called from scripts like EVAL.
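The client selection can be sketched like this. The types, the `no_touch_mode` parameter, and the string comparison on the command name are simplifications for illustration and do not mirror the real `lookupKey` code.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical, heavily reduced client and server state. */
typedef struct {
    const char *cmd_name;
} client;

typedef struct {
    client *current_client;   /* e.g. the client running EVAL */
    client *executing_client; /* e.g. the script's TOUCH call, if any */
} server_state;

/* Returns 1 if the key's last access time should be updated. */
static int should_update_access_time(const server_state *srv, int no_touch_mode) {
    assert(srv != NULL);
    if (!no_touch_mode) return 1;
    /* Prefer the executing client when defined, else the current client. */
    const client *c = srv->executing_client ? srv->executing_client : srv->current_client;
    return c != NULL && strcmp(c->cmd_name, "TOUCH") == 0;
}
```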
Fixes #1498
Signed-off-by: Simon Baatz <gmbnomis@gmail.com>
Co-authored-by: Binbin <binloveplay1314@qq.com>
In this commit we move all structure and function declarations related
to Valkey modules from `server.h` to the recently added `module.h` file.
This reorganization makes it easier for new contributors to find the
module-related code, and reduces compilation times when changes are
made to the modules code.
---------
Signed-off-by: Ricardo Dias <ricardo.dias@percona.com>