futriix

Author	SHA1	Message	Date
Binbin	92181b6797	Fix primary crash when processing dirty slots during shutdown wait / failover wait / client pause (#1131 ) We have an assert in propagateNow. If the primary node receives a CLUSTER UPDATE such as dirty slots during SIGTERM waitting or during a manual failover pausing or during a client pause, the delKeysInSlot call will trigger this assert and cause primary crash. In this case, we added a new server_del_keys_in_slot state just like client_pause_in_transaction to track the state to avoid the assert in propagateNow, the dirty slots will be deleted in the end without affecting the data consistency. Signed-off-by: Binbin <binloveplay1314@qq.com> Co-authored-by: Viktor Söderqvist <viktor.soderqvist@est.tech>	2024-11-15 16:47:15 +08:00
Binbin	4e2493e5c9	Kill diskless fork child asap when the last replica drop (#1227 ) We originally checked the replica connection to whether to kill the diskless child only when rdbPipeReadHandler is triggered. Actually we can check it when the replica is disconnected, so that we don't have to wait for rdbPipeReadHandler to be triggered and can kill the forkless child as soon as possible. In this way, when the child or rdbPipeReadHandler is stuck for some reason, we can kill the child faster and release the fork resources. Signed-off-by: Binbin <binloveplay1314@qq.com>	2024-11-15 16:34:32 +08:00
Binbin	d3f3b9cc3a	Fix daily valgrind build with unit tests (#1309 ) This was introduced in #515. Signed-off-by: Binbin <binloveplay1314@qq.com>	2024-11-15 14:27:28 +08:00
bentotten	b9994030e9	Log clusterbus handshake timeout failures (#1247 ) This adds a log when a handshake fails for a timeout. This can help troubleshoot cluster asymmetry issues caused by failed MEETs --------- Signed-off-by: Ben Totten <btotten@amazon.com> Signed-off-by: bentotten <59932872+bentotten@users.noreply.github.com> Co-authored-by: Ben Totten <btotten@amazon.com> Co-authored-by: Madelyn Olson <madelyneolson@gmail.com>	2024-11-14 20:48:48 -08:00
Qu Chen	32f7541fe3	Simplify dictType callbacks and move some macros from dict.h to dict.c (#1281 ) Remove the dict pointer argument to the `dictType` callbacks `keyDup`, `keyCompare`, `keyDestructor` and `valDestructor`. This argument was unused in all of the callback implementations. The macros `dictFreeKey()` and `dictFreeVal()` are made internal to dict and moved from dict.h to dict.c. They're also changed from macros to static inline functions. Signed-off-by: Qu Chen <quchen@amazon.com>	2024-11-14 09:45:47 +01:00
Parth	863d312803	Fix link-time optimization to work correctly for unit tests (i.e. -flto flag) (#1290 ) (#1296 ) * We compile various c files into object and package them into library (.a file) using ar to feed to unit tests. With new GCC versions, the objects inside such library don't participate in LTO process without additional flags. * Here is a direct quote from gcc documentation explaining this issue: "If you are not using a linker with plugin support and/or do not enable the linker plugin, then the objects inside libfoo.a are extracted and linked as usual, but they do not participate in the LTO optimization process. In order to make a static library suitable for both LTO optimization and usual linkage, compile its object files with -flto-ffat-lto-objects." * Read full documentation about -flto at https://gcc.gnu.org/onlinedocs/gcc/Optimize-Options.html * Without this additional flag, I get following errors while executing "make test-unit". With this change, those errors go away. ``` ARCHIVE libvalkey.a ar: threads_mngr.o: plugin needed to handle lto object ... .. . /tmp/ccDYbMXL.ltrans0.ltrans.o: In function `dictClear': /local/workplace/elasticache/valkey/src/unit/../dict.c:776: undefined reference to `valkey_free' /local/workplace/elasticache/valkey/src/unit/../dict.c:770: undefined reference to `valkey_free' /tmp/ccDYbMXL.ltrans0.ltrans.o: In function `dictGetVal': ``` Fixes #1290 --------- Signed-off-by: Parth Patel <661497+parthpatel@users.noreply.github.com>	2024-11-13 21:50:55 -08:00
skyfirelee	4a9864206f	Migrate quicklist unit test to new framework (#515 ) Migrate quicklist unit test to new unit test framework, and cleanup remaining references of SERVER_TEST, parent ticket #428. Closes #428. Signed-off-by: artikell <739609084@qq.com> Signed-off-by: Binbin <binloveplay1314@qq.com> Co-authored-by: Binbin <binloveplay1314@qq.com>	2024-11-14 10:37:44 +08:00
Binbin	6fba747c39	Fix log printing always shows the role as child under daemonize (#1301 ) In #1282, we init server.pid earlier to keep log message role consistent, but we forgot to consider daemonize. In daemonize mode, we will always print the child role. We need to reset server.pid after daemonize(), otherwise the log printing role will always be the child. It also causes a incorrect server.pid value, affecting the concatenation of some pid names. Signed-off-by: Binbin <binloveplay1314@qq.com>	2024-11-14 10:26:23 +08:00
Binbin	2df56d87c0	Fix empty primary may have dirty slots data due to bad migration (#1285 ) If we become an empty primary for some reason, we still need to check if we need to delete dirty slots, because we may have dirty slots data left over from a bad migration. Like the target node forcibly executes CLUSTER SETSLOT NODE to take over the slot without performing key migration. Signed-off-by: Binbin <binloveplay1314@qq.com>	2024-11-11 22:13:47 +08:00
Binbin	a2d22c63c0	Fix replica not able to initate election in time when epoch fails (#1009 ) If multiple primary nodes go down at the same time, their replica nodes will initiate the elections at the same time. There is a certain probability that the replicas will initate the elections in the same epoch. And obviously, in our current election mechanism, only one replica node can eventually get the enough votes, and the other replica node will fail to win due the the insufficient majority, and then its election will time out and we will wait for the retry, which result in a long failure time. If another node has been won the election in the failover epoch, we can assume that my election has failed and we can retry as soom as possible. Signed-off-by: Binbin <binloveplay1314@qq.com>	2024-11-11 22:12:49 +08:00
Binbin	167e8ab8de	Trigger the election immediately when doing a manual failover (#1081 ) Currently when a manual failover is triggeded, we will set a CLUSTER_TODO_HANDLE_FAILOVER to start the election as soon as possible in the next beforeSleep. But in fact, we won't delay the election in manual failover, waitting for the next beforeSleep to kick in will delay the election a some milliseconds. We can trigger the election immediately in this case in the same function call, without waitting for beforeSleep, which can save us some milliseconds. Signed-off-by: Binbin <binloveplay1314@qq.com>	2024-11-11 21:43:46 +08:00
Binbin	4aacffa32d	Stabilize dual replication test to avoid getting LOADING error (#1288 ) When doing `$replica replicaof no one`, we may get a LOADING error, this is because during the test execution, the replica may reconnect very quickly, and the full sync is initiated, and the replica has entered the LOADING state. In this commit, we make sure the primary is pasued after the fork, so the replica won't enter the LOADING state, and with this fix, this test seems more natural and predictable. Signed-off-by: Binbin <binloveplay1314@qq.com>	2024-11-11 21:42:34 +08:00
Qu Chen	9300a7ebc8	Set fields to NULL after free in freeClient() (#1279 ) Null out several references after freeing the object in `freeClient()`. This is just to make the code more safe, to protect against use-after-free for future changes. Signed-off-by: Qu Chen <quchen@amazon.com>	2024-11-11 10:39:48 +01:00
zixuan zhao	0b5b2c7484	Log as primary role (M) instead of child process (C) during startup (#1282 ) Init server.pid earlier to keep log message role consistent. Closes #1206. Before: ```text 24881:C 21 Oct 2024 21:10:57.165 * oO0OoO0OoO0Oo Valkey is starting oO0OoO0OoO0Oo 24881:C 21 Oct 2024 21:10:57.165 * Valkey version=255.255.255, bits=64, commit=814e0f55, modified=1, pid=24881, just started 24881:C 21 Oct 2024 21:10:57.165 * Configuration loaded 24881:M 21 Oct 2024 21:10:57.167 * Increased maximum number of open files to 10032 (it was originally set to 1024). ``` After: ```text 68560:M 08 Nov 2024 16:10:12.257 * oO0OoO0OoO0Oo Valkey is starting oO0OoO0OoO0Oo 68560:M 08 Nov 2024 16:10:12.257 * Valkey version=255.255.255, bits=64, commit=45d596e1, modified=1, pid=68560, just started 68560:M 08 Nov 2024 16:10:12.257 * Configuration loaded 68560:M 08 Nov 2024 16:10:12.258 * monotonic clock: POSIX clock_gettime ``` Signed-off-by: azuredream <zhaozixuan67@gmail.com>	2024-11-11 10:33:26 +01:00
zhenwei pi	45d596e121	RDMA: Use conn ref counter to prevent double close (#1250 ) RDMA: Use connection reference counter style The reference counter of connection is used to protect re-entry of closenmethod. Use this style instead the unsafe one. Signed-off-by: zhenwei pi <pizhenwei@bytedance.com>	2024-11-08 09:33:01 +01:00
Jacob Murphy	e972d56460	Make sure to copy null terminator byte in dual channel code (#1272 ) As @madolson pointed out, these do have proper null terminators. This cleans them up to follow the rest of the code which copies the last byte explicitly, which should help reduce cognitive load and make it more resilient should code refactors occur (e.g. non-static allocation of memory, changes to other functions). --------- Signed-off-by: Jacob Murphy <jkmurphy@google.com>	2024-11-07 18:25:43 -08:00
eifrah-aws	07b3e7ae7a	Add CMake build system for valkey (#1196 ) With this commit, users are able to build valkey using `CMake`. ## Example usage: Build `valkey-server` in Release mode with TLS enabled and using `jemalloc` as the allocator: ```bash mkdir build-release cd $_ cmake .. -DCMAKE_BUILD_TYPE=Release \ -DCMAKE_INSTALL_PREFIX=/tmp/valkey-install \ -DBUILD_MALLOC=jemalloc -DBUILD_TLS=1 make -j$(nproc) install # start valkey /tmp/valkey-install/bin/valkey-server ``` Build `valkey-unit-tests`: ```bash mkdir build-release-ut cd $_ cmake .. -DCMAKE_BUILD_TYPE=Release \ -DBUILD_MALLOC=jemalloc -DBUILD_UNIT_TESTS=1 make -j$(nproc) # Run the tests ./bin/valkey-unit-tests ``` Current features supported by this PR: - Building against different allocators: (`jemalloc`, `tcmalloc`, `tcmalloc_minimal` and `libc`), e.g. to enable `jemalloc` pass `-DBUILD_MALLOC=jemalloc` to `cmake` - OpenSSL builds (to enable TLS, pass `-DBUILD_TLS=1` to `cmake`) - Sanitizier: pass `-DBUILD_SANITIZER=<address\|thread\|undefined>` to `cmake` - Install target + redis symbolic links - Build `valkey-unit-tests` executable - Standard CMake variables are supported. e.g. to install `valkey` under `/home/you/root` pass `-DCMAKE_INSTALL_PREFIX=/home/you/root` Why using `CMake`? To list some of the advantages of using `CMake`: - Superior IDE integrations: cmake generates the file `compile_commands.json` which is required by `clangd` to get a compiler accuracy code completion (in other words: your VScode will thank you) - Out of the source build tree: with the current build system, object files are created all over the place polluting the build source tree, the best practice is to build the project on a separate folder - Multiple build types co-existing: with the current build system, it is often hard to have multiple build configurations. With cmake you can do it easily: - It is the de-facto standard for C/C++ project these days More build examples: ASAN build: ```bash mkdir build-asan cd $_ cmake .. -DBUILD_SANITIZER=address -DBUILD_MALLOC=libc make -j$(nproc) ``` ASAN with jemalloc: ```bash mkdir build-asan-jemalloc cd $_ cmake .. -DBUILD_SANITIZER=address -DBUILD_MALLOC=jemalloc make -j$(nproc) ``` As seen by the previous examples, any combination is allowed and co-exist on the same source tree. ## Valkey installation With this new `CMake`, it is possible to install the binary by running `make install` or creating a package `make package` (currently supported on Debian like distros) ### Example 1: build & install using `make install`: ```bash mkdir build-release cd $_ cmake .. -DCMAKE_INSTALL_PREFIX=$HOME/valkey-install -DCMAKE_BUILD_TYPE=Release make -j$(nproc) install # valkey is now installed under $HOME/valkey-install ``` ### Example 2: create a `.deb` installer: ```bash mkdir build-release cd $_ cmake .. -DCMAKE_BUILD_TYPE=Release make -j$(nproc) package # ... CPack deb generation output sudo gdebi -n ./valkey_8.1.0_amd64.deb # valkey is now installed under /opt/valkey ``` ### Example 3: create installer for non Debian systems (e.g. FreeBSD or macOS): ```bash mkdir build-release cd $_ cmake .. -DCMAKE_BUILD_TYPE=Release make -j$(nproc) package mkdir -p /opt/valkey && ./valkey-8.1.0-Darwin.sh --prefix=/opt/valkey --exclude-subdir # valkey-server is now installed under /opt/valkey ``` Signed-off-by: Eran Ifrah <eifrah@amazon.com>	2024-11-07 18:01:37 -08:00
Wen Hui	3672f9b2c3	Revert "Decline unsubscribe related command in non-subscribed mode" (#1265 ) This PR goal is to revert the changes on PR https://github.com/valkey-io/valkey/pull/759 Recently, we got some reports that in Valkey 8.0 the PR https://github.com/valkey-io/valkey/pull/759 (Decline unsubscribe related command in non-subscribed mode) causes break change. (https://github.com/valkey-io/valkey/issues/1228) Although from my thought, call commands "unsubscribeCommand", "sunsubscribeCommand", "punsubscribeCommand" in request-response mode make no sense. This is why I created PR https://github.com/valkey-io/valkey/pull/759 But breaking change is always no good, @valkey-io/core-team How do you think we revert this PR code changes? Signed-off-by: hwware <wen.hui.ware@gmail.com>	2024-11-07 20:05:16 -05:00
Binbin	1c18c80844	Fix incorrect cache_memory reset in functionsLibCtxClear (#1255 ) functionsLibCtxClear should clear the provided lib_ctx parameter, not the static variable curr_functions_lib_ctx, as this contradicts the function's intended purpose. The impact i guess is minor, like in some unhappy paths (diskless load fails, function restore fails?), we will mess up the functions_caches field, which is used in used_memory_functions / used_memory_scripts fileds in INFO. Signed-off-by: Binbin <binloveplay1314@qq.com>	2024-11-07 13:44:21 +08:00
Binbin	22bc49c4a6	Try to stabilize the failover call in the slot migration test (#1078 ) The CI report replica will return the error when performing CLUSTER FAILOVER: ``` -ERR Master is down or failed, please use CLUSTER FAILOVER FORCE ``` This may because the primary state is fail or the cluster connection is disconnected during the primary pause. In this PR, we added some waits in wait_for_role, if the role is replica, we will wait for the replication link and the cluster link to be ok. Signed-off-by: Binbin <binloveplay1314@qq.com>	2024-11-07 13:42:20 +08:00
Binbin	a0b1cbad83	Change errno from EEXIST to EALREADY in serverFork if child process exists (#1258 ) We set this to EEXIST in 568c2e039bac388003068cd8debb2f93619dd462, it prints "File exists" which is not quite accurate, change it to EALREADY, it will print "Operation already in progress". Signed-off-by: Binbin <binloveplay1314@qq.com>	2024-11-07 12:13:00 +08:00
Binbin	12c5af03b8	Remove empty DB check branch in KEYS command (#1259 ) We don't think we really care about optimizing for the empty DB case, which should be uncommon. Adding branches hurts branch prediction. Signed-off-by: Binbin <binloveplay1314@qq.com>	2024-11-06 10:32:00 +08:00
Amit Nagler	48ebe21ad1	fix: clean up refactoring leftovers (#1264 ) This commit addresses issues that were likely introduced during a rebase related to: `b0f23df165` Change dual channel replication state in main handler only Signed-off-by: naglera <anagler123@gmail.com>	2024-11-05 04:57:34 -08:00
Madelyn Olson	3c32ee1bda	Add a filter option to drop all cluster packets (#1252 ) A minor debugging change that helped in the investigation of https://github.com/valkey-io/valkey/issues/1251. Basically there are some edge cases where we want to fully isolate a note from receiving packets, but can't suspend the process because we need it to continue sending outbound traffic. So, added a filter for that. Signed-off-by: Madelyn Olson <madelyneolson@gmail.com>	2024-11-04 12:36:20 -08:00
Binbin	a102852d5e	Fix timing issue in cluster-shards tests (#1243 ) The cluster-node-timeout is 3000 in our tests, the timing test wasn't succeeding, so extending the wait_for made them much more reliable. Signed-off-by: Binbin <binloveplay1314@qq.com>	2024-11-02 19:51:14 +08:00
Jim Brunner	0d7b2344b2	correct type internal to kvstore (minor) (#1246 ) All of the internal variables related to number of dicts in the kvstore are type `int`. Not sure why these 2 items were declared as `long long`. Signed-off-by: Jim Brunner <brunnerj@amazon.com>	2024-11-01 15:16:18 -07:00
zhenwei pi	e985ead7f9	RDMA: Prevent IO for child process (#1244 ) RDMA MR (memory region) is not forkable, the VMA (virtual memory area) of a MR gets empty in a child process. Prevent IO for child process to avoid server crash. In the check for whether read and write is allowed in an RDMA connection, a check that if we're in a child process is added. If we are, the function returns an error, which will cause the RDMA client to be disconnected. Suggested-by: Viktor Söderqvist <viktor.soderqvist@est.tech> Signed-off-by: zhenwei pi <pizhenwei@bytedance.com>	2024-11-01 13:28:09 +01:00
Madelyn Olson	1c222f77ce	Improve performance of sdssplitargs (#1230 ) The current implementation of `sdssplitargs` does repeated `sdscatlen` to build the parsed arguments, which isn't very efficient because it does a lot of extra reallocations and moves through the sds code a lot. It also typically results in memory overhead, because `sdscatlen` over-allocates, which is usually not needed since args are usually not modified after being created. The new implementation of sdssplitargs does two passes, the first to parse the argument to figure out the final length and the second to actually copy the string. It's generally about 2x faster for larger strings (~100 bytes), and about 20% faster for small strings (~10 bytes). This is generally faster since as long as everything is in the CPU cache, it's going to be fast. There are a couple of sanity tests, none existed before, as well as some fuzzying which was used to find some bugs and also to do the benchmarking. The original benchmarking code can be seen `6576aeb86a`. ``` test_sdssplitargs_benchmark - unit/test_sds.c:530] Using random seed: 1729883235 [test_sdssplitargs_benchmark - unit/test_sds.c:577] Improvement: 56.44%, new:13039us, old:29930us [test_sdssplitargs_benchmark - unit/test_sds.c:577] Improvement: 56.58%, new:12057us, old:27771us [test_sdssplitargs_benchmark - unit/test_sds.c:577] Improvement: 59.18%, new:9048us, old:22165us [test_sdssplitargs_benchmark - unit/test_sds.c:577] Improvement: 54.61%, new:12381us, old:27278us [test_sdssplitargs_benchmark - unit/test_sds.c:577] Improvement: 51.17%, new:16012us, old:32793us [test_sdssplitargs_benchmark - unit/test_sds.c:577] Improvement: 49.18%, new:16041us, old:31563us [test_sdssplitargs_benchmark - unit/test_sds.c:577] Improvement: 58.40%, new:12450us, old:29930us [test_sdssplitargs_benchmark - unit/test_sds.c:577] Improvement: 56.49%, new:13066us, old:30031us [test_sdssplitargs_benchmark - unit/test_sds.c:577] Improvement: 58.75%, new:12744us, old:30894us [test_sdssplitargs_benchmark - unit/test_sds.c:577] Improvement: 52.44%, new:16885us, old:35504us [test_sdssplitargs_benchmark - unit/test_sds.c:577] Improvement: 62.57%, new:8107us, old:21659us [test_sdssplitargs_benchmark - unit/test_sds.c:577] Improvement: 62.12%, new:8320us, old:21966us [test_sdssplitargs_benchmark - unit/test_sds.c:577] Improvement: 45.23%, new:13960us, old:25487us [test_sdssplitargs_benchmark - unit/test_sds.c:577] Improvement: 57.95%, new:9188us, old:21849us ``` --------- Signed-off-by: Madelyn Olson <madelyneolson@gmail.com>	2024-10-31 11:37:53 -07:00
Masahiro Ide	91cbf77442	Eliminate snprintf usage at setDeferredAggregateLen (#1234 ) to align with how we encode the length at `_addReplyLongLongWithPrefix` Signed-off-by: Masahiro Ide <masahiro.ide@lycorp.co.jp> Co-authored-by: Masahiro Ide <masahiro.ide@lycorp.co.jp>	2024-10-31 11:30:05 -07:00
zhenwei pi	ab98f375db	RDMA: Delete keepalive timer on closing (#1237 ) Typically, RDMA connection gets closed by client side, the server side handles diconnected CM event, and delete keepalive timer correctly. However, the server side may close connection voluntarily, for example the maxium connections exceed. Handle this case to avoid invalid memory access. Signed-off-by: zhenwei pi <pizhenwei@bytedance.com>	2024-10-30 11:12:42 +01:00
Binbin	789a73b0d0	Minor fix to debug logging in replicationFeedStreamFromPrimaryStream (#1235 ) We should only print logs when hide-user-data-from-log is off. Signed-off-by: Binbin <binloveplay1314@qq.com>	2024-10-30 10:25:50 +08:00
Shivshankar	13f5f665f2	Update the argument of clusterNodeGetReplica declaration (#1239 ) clusterNodeGetReplica agrumnets are missed to migrate during the slave to replication migration so updated the argument slave to replica. Signed-off-by: Shivshankar-Reddy <shiva.sheri.github@gmail.com>	2024-10-30 00:19:56 +01:00
Madelyn Olson	5a4c0640ce	Mark main and serverAssert as weak symbols to be overridden (#1232 ) At some point unit tests stopped building on MacOS because of duplicate symbols. I had originally solved this problem by using a flag that overrides symbols, but the much better solution is to mark the duplicate symbols as weak and they can be overridden during linking. (Symbols by default are strong, strong symbols override weak symbols) I also added macos unit build to the CI, so that this doesn't silently break in the future again. --------- Signed-off-by: Madelyn Olson <madelyneolson@gmail.com> Co-authored-by: Viktor Söderqvist <viktor.soderqvist@est.tech>	2024-10-29 14:26:17 -07:00
zixuan zhao	8ee7a58025	Document log format configs in valkey.conf (#1233 ) Add config options for log format and timestamp format introduced by #1022 Related to #1225 This change adds two new configs into valkey.conf: log-format log-timestamp-format --------- Signed-off-by: azuredream <zhaozixuan67@gmail.com>	2024-10-29 11:13:30 +01:00
Lipeng Zhu	c21f1dc084	Increase the IO_THREADS_MAX_NUM. (#1220 ) ### Description This patch try to increase the max number of io-threads from 16(128) to 256 for below reasons: 1. The core number increases a lot in the modern server processors, for example, the [Sierra Forest](https://en.wikipedia.org/wiki/Sierra_Forest) processors are targeted towards with up to 288 cores. Due to limitation of _io-threads_ number (16 and 128 ), benchmark like https://openbenchmarking.org/test/pts/valkey even cannot run on a high core count server. 2. For some workloads, the bottleneck could be main thread, but for the other workloads, big key/value which caused heavy io, the bottleneck could be the io-threads, for example benchmark `memtier_benchmark -s 127.0.0.1 -p 9001 "--data-size" "20000" --ratio 1:0 --key-pattern P:P --key-minimum=1 --key-maximum 1000000 --test-time 180 -c 50 -t 16 --hide-histogram`. The QPS is still scalable after 16 io-threads. ![image](https://github.com/user-attachments/assets/e980f805-a162-44be-b03e-ab37a9c489cf) Fig 1. QPS Scale factor with io-threads number grows. Signed-off-by: Lipeng Zhu <lipeng.zhu@intel.com> Co-authored-by: Wangyang Guo <wangyang.guo@intel.com>	2024-10-27 22:43:23 -07:00
Binbin	5d2ff853a3	Fix minor repldbfd leak in updateReplicasWaitingBgsave if fstat fails (#1226 ) In the old code, if fstat fails, replica->repldbfd will hold the fd and we are doing a free client. And in freeClient, we check and close only if repl_state == REPLICA_STATE_SEND_BULK. So if fstat fails, we will leak the fd. We can also extend freeClient to handle REPLICA_STATE_WAIT_BGSAVE_END as well, but here seems to be a more friendly (and safer) way. Signed-off-by: Binbin <binloveplay1314@qq.com>	2024-10-27 15:23:00 +08:00
Shivshankar	4be09e434a	Fix typo in valkey.conf file's shutdown section (#1224 ) Found typo "exists" ==> "exits" in valkey.conf in shutdown section. Signed-off-by: Shivshankar-Reddy <shiva.sheri.github@gmail.com>	2024-10-25 14:03:59 +02:00
Lipeng Zhu	9c60fcdae2	Do security attack check only when command not found to reduce the critical path (#1212 ) When explored the cycles distribution for main thread with io-threads enabled. We found this security attack check takes significant time in main thread, ~3% cycles were used to do the commands security check in main thread. This patch try to completely avoid doing it in the hot path. We can do it only after we looked up the command and it wasn't found, just before we call commandCheckExistence. --------- Signed-off-by: Lipeng Zhu <lipeng.zhu@intel.com> Co-authored-by: Wangyang Guo <wangyang.guo@intel.com>	2024-10-25 11:13:28 +02:00
zixuan zhao	55bbbe09a3	Configurable log and timestamp formats (logfmt, ISO8601) (#1022 ) Add ability to configure log output format and timestamp format in the logs. This change adds two new configs: * `log-format`: Either legacy or logfmt (See https://brandur.org/logfmt) * `log-timestamp-format`: legacy, iso8601 or milliseconds (since the eppch). Related to #1006. Example: ``` $ ./valkey-server /home/zhaoz12/git/valkey/valkey/valkey.conf pid=109463 role=RDB/AOF timestamp="2024-09-10T20:37:25.738-04:00" level=warning message="WARNING Memory overcommit must be enabled! Without it, a background save or replication may fail under low memory condition. Being disabled, it can also cause failures without low memory condition, see https://github.com/jemalloc/jemalloc/issues/1328. To fix this issue add 'vm.overcommit_memory = 1' to /etc/sysctl.conf and then reboot or run the command 'sysctl vm.overcommit_memory=1' for this to take effect." pid=109463 role=RDB/AOF timestamp="2024-09-10T20:37:25.738-04:00" level=notice message="oO0OoO0OoO0Oo Valkey is starting oO0OoO0OoO0Oo" pid=109463 role=RDB/AOF timestamp="2024-09-10T20:37:25.738-04:00" level=notice message="Valkey version=255.255.255, bits=64, commit=affbea5d, modified=1, pid=109463, just started" pid=109463 role=RDB/AOF timestamp="2024-09-10T20:37:25.738-04:00" level=notice message="Configuration loaded" pid=109463 role=master timestamp="2024-09-10T20:37:25.738-04:00" level=notice message="monotonic clock: POSIX clock_gettime" pid=109463 role=master timestamp="2024-09-10T20:37:25.739-04:00" level=warning message="Failed to write PID file: Permission denied" ``` --------- Signed-off-by: azuredream <zhaozixuan67@gmail.com>	2024-10-25 00:36:32 +02:00
Binbin	2956367731	Maintain return value of rdbSaveDb after writing slot-info aux (#1222 ) All other places written in this function are maintained it, although the caller of rdbSaveDb does not reply on it, it is maintained to be consistent with other places, is its duty. Signed-off-by: Binbin <binloveplay1314@qq.com>	2024-10-24 09:53:05 -04:00
Binbin	a21fe718f4	Limit CLUSTER_CANT_FAILOVER_DATA_AGE log to 10 times period (#1189 ) If a replica is step into data_age too old stage, it can not trigger the failover and currently it can not be automatically recovered and we will print a log every CLUSTER_CANT_FAILOVER_RELOG_PERIOD, which is every second. If the primary has not recovered or there is no manual failover, this log will flood the log file. In this case, limit its frequency to 10 times period, which is 10 seconds in our code. Also in this data_age too old stage, the repeated logs also can stand for the progress of the failover. See also #780 for more details about it. Signed-off-by: Binbin <binloveplay1314@qq.com> Co-authored-by: Ping Xie <pingxie@outlook.com>	2024-10-24 16:38:47 +08:00
muelstefamzn	c419524c05	Trim free space from inline command argument strings to avoid excess memory usage (#1213 ) The command argument strings created while parsing inline commands (see `processInlineBuffer()`) can contain free capacity. Since some commands ,such as `SET`, store these strings in the database, that free capacity increases the memory usage. In the worst case, it could double the memory usage. This only occurs if the inline command format is used. The argument strings are built by appending character by character in `sdssplitargs()`. Regular RESP commands are not affected. This change trims the strings within `processInlineBuffer()`. ### Why `trimStringObjectIfNeeded()` within `object.c` is not solving this? When the command argument string is packed into an object, `trimStringObjectIfNeeded()` is called. This does only trim the string if it is larger than `PROTO_MBULK_BIG_ARG` (32kB), as only strings larger than this would ever need trimming if the command it sent using the bulk string format. We could modify this condition, but that would potentially have a performance impact on commands using the bulk format. Since those make up for the vast majority of executed commands, limiting this change to inline commands seems prudent. ### Experiment Results * 1 million `SET [key] [value]` commands * Random keys (16 bytes) * 600 bytes values Memory usage without this change: ``` used_memory:1089327888 used_memory_human:1.01G used_memory_rss:1131696128 used_memory_rss_human:1.05G used_memory_peak:1089348264 used_memory_peak_human:1.01G used_memory_peak_perc:100.00% used_memory_overhead:49302800 used_memory_startup:911808 used_memory_dataset:1040025088 used_memory_dataset_perc:95.55% ``` Memory usage with this change: ``` used_memory:705327888 used_memory_human:672.65M used_memory_rss:718802944 used_memory_rss_human:685.50M used_memory_peak:705348256 used_memory_peak_human:672.67M used_memory_peak_perc:100.00% used_memory_overhead:49302800 used_memory_startup:911808 used_memory_dataset:656025088 used_memory_dataset_perc:93.13% ``` If the same experiment is repeated using the normal RESP array of bulk string format (`*3\r\n$3\r\nSET\r\n...`) then the memory usage is 672MB with and without of this change. If a replica is attached, its memory usage is 672MB with and without this change, since the replication link never uses inline commands. Signed-off-by: Stefan Mueller <muelstef@amazon.com>	2024-10-23 16:56:32 -07:00
danish-mehmood	c176de4251	Clarify the wording from dually to the more common doubly (#1214 ) Clarify documentation is ziplist.c Signed-off-by: danish-mehmood <rdm355190@gmail.com>	2024-10-23 14:30:42 -07:00
Binbin	b803f7aeff	Cleaned up getSlotOrReply is return -1 instead of C_ERR (#1211 ) Minor cleanup since getSlotOrReply return -1 on error, not return C_ERR. Signed-off-by: Binbin <binloveplay1314@qq.com>	2024-10-23 17:11:42 +08:00
Binbin	5d70ccd70e	Make replica CLUSTER RESET flush async based on lazyfree-lazy-user-flush (#1190 ) Currently, if the replica has a lot of data, CLUSTER RESET will block for a while and report the slowlog, and it seems that there is no harm in making it async so external components can be easier when monitoring it. Signed-off-by: Binbin <binloveplay1314@qq.com> Co-authored-by: Ping Xie <pingxie@outlook.com>	2024-10-23 10:22:25 +08:00
Shivshankar	285064b114	fix typo (#1202 ) Signed-off-by: Shivshankar-Reddy <shiva.sheri.github@gmail.com>	2024-10-21 22:54:40 -04:00
Shivshankar	771918e4bf	Updating command.def by running the generate-command-code.py (#1203 ) Part of https://github.com/valkey-io/valkey/pull/1200 PR, since feild is changed. Looks like commands.def is missed to get genereated based on the changes so that is causing CI failure on unstable. Signed-off-by: Shivshankar-Reddy <shiva.sheri.github@gmail.com>	2024-10-21 13:48:29 -07:00
Viktor Söderqvist	5885dc56bd	Fix BGSAVE CANCEL since and history fields (#1200 ) Fixes wrong "since" and "history" introduced in #757. --------- Signed-off-by: Viktor Söderqvist <viktor.soderqvist@est.tech>	2024-10-21 16:04:47 +02:00
ranshid	29b83f1ac8	Introduce bgsave cancel (#757 ) In some cases bgsave child process can run for a long time exhausting system resources. Although it is possible to kill the bgsave child process from the system shell, sometimes it is not possible allowing OS level access. This PR adds a new subcommand to the BGSAVE command. When user will issue `BGSAVE CANCEL`, it will do one of the 2: 1. In case a bgsave child process is currently running, the child process would be immediately killed thus terminating any save/replication full sync process. 2. In case a bgsave child process is SCHEDULED to run, the scheduled execution will be cancelled. --------- Signed-off-by: ranshid <ranshid@amazon.com> Signed-off-by: ranshid <88133677+ranshid@users.noreply.github.com> Signed-off-by: Ran Shidlansik <ranshid@amazon.com> Signed-off-by: Binbin <binloveplay1314@qq.com> Co-authored-by: Binbin <binloveplay1314@qq.com> Co-authored-by: Viktor Söderqvist <viktor.soderqvist@est.tech>	2024-10-21 11:56:44 +02:00
zhenwei pi	71f8c34eed	RDMA: Fix listener priv opaque pointer (#1194 ) struct connListener.priv should be used by connection type specific data, static local listener data should not use this. A RDMA config structure is going to be introduced in the next step: ``` typedef struct serverRdmaContextConfig { char *bindaddr; int bindaddr_count; int port; int rx_size; int comp_vector; ... } serverRdmaContextConfig; ``` Then a builtin RDMA will be supported. Signed-off-by: zhenwei pi <pizhenwei@bytedance.com>	2024-10-21 10:11:27 +02:00

... 3 4 5 6 7 ...

12914 Commits