futriix

Author	SHA1	Message	Date
zhaozhao.zz	3f21705a6c	Feature COMMANDLOG to record slow execution and large request/reply (#1294 ) As discussed in PR #336. We have different types of resources like CPU, memory, network, etc. The `slowlog` can only record commands eat lots of CPU during the processing phase (doesn't include read/write network time), but can not record commands eat too many memory and network. For example: 1. run "SET key value(10 megabytes)" command would not be recored in slowlog, since when processing it the SET command only insert the value's pointer into db dict. But that command eats huge memory in query buffer and bandwidth from network. In this case, just 1000 tps can cause 10GB/s network flow. 2. run "GET key" command and the key's value length is 10 megabytes. The get command can eat huge memory in output buffer and bandwidth to network. This PR introduces a new command `COMMANDLOG`, to log commands that consume significant network bandwidth, including both input and output. Users can retrieve the results using `COMMANDLOG get <count> large-request` and `COMMANDLOG get <count> large-reply`, all subcommands for `COMMANDLOG` are: * `COMMANDLOG HELP` * `COMMANDLOG GET <count> <slow\|large-request\|large-reply>` * `COMMANDLOG LEN <slow\|large-request\|large-reply>` * `COMMANDLOG RESET <slow\|large-request\|large-reply>` And the slowlog is also incorporated into the commandlog. For each of these three types, additional configs have been added for control: * `commandlog-request-larger-than` and `commandlog-large-request-max-len` represent the threshold for large requests(the unit is Bytes) and the maximum number of commands that can be recorded. * `commandlog-reply-larger-than` and `commandlog-large-reply-max-len` represent the threshold for large replies(the unit is Bytes) and the maximum number of commands that can be recorded. * `commandlog-execution-slower-than` and `commandlog-slow-execution-max-len` represent the threshold for slow executions(the unit is microseconds) and the maximum number of commands that can be recorded. * Additionally, `slowlog-log-slower-than` and `slowlog-max-len` are now set as aliases for these two new configs. --------- Signed-off-by: zhaozhao.zz <zhaozhao.zz@alibaba-inc.com> Co-authored-by: Madelyn Olson <madelyneolson@gmail.com> Co-authored-by: Ping Xie <pingxie@outlook.com>	2025-01-24 11:41:40 +08:00
Nadav Gigi	f2510783f9	Accelerate hash table iterator with value prefetching (#1568 ) This PR builds upon the [previous entry prefetching optimization](https://github.com/valkey-io/valkey/pull/1501) to further enhance performance by implementing value prefetching for hashtable iterators. ## Implementation Modified `hashtableInitIterator` to accept a new flags parameter, allowing control over iterator behavior. Implemented conditional value prefetching within `hashtableNext` based on the new `HASHTABLE_ITER_PREFETCH_VALUES` flag. When the flag is set, hashtableNext now calls `prefetchBucketValues` at the start of each new bucket, preemptively loading the values of filled entries into the CPU cache. The actual prefetching of values is performed using type-specific callback functions implemented in `server.c`: - For `robj` the `hashtableObjectPrefetchValue` callback is used to prefetch the value if not embeded. This implementation is specifically focused on main database iterations at this stage. Applying it to hashtables that hold other object types should not be problematic, but its performance benefits for those cases will need to be proven through testing and benchmarking. ## Performance ### Setup: - 64cores Graviton 3 Amazon EC2 instance. - 50 mil keys with different value sizes. - Running valkey server over RAM file system. - crc checksum and comperssion off. ### Action - save command. ### Results The results regarding the duration of “save” command was taken from “info all” command. ``` +--------------------+------------------+------------------+ \| Prefetching \| Value size (byte)\| Time (seconds) \| +--------------------+------------------+------------------+ \| No \| 100 \| 20.112279 \| \| Yes \| 100 \| 12.758519 \| \| No \| 40 \| 16.945366 \| \| Yes \| 40 \| 10.902022 \| \| No \| 20 \| 9.817000 \| \| Yes \| 20 \| 9.626821 \| \| No \| 10 \| 9.71510 \| \| Yes \| 10 \| 9.510565 \| +--------------------+------------------+------------------+ ``` The results largely align with our expectations, showing significant improvements for larger values (100 bytes and 40 bytes) that are stored outside the robj. For smaller values (20 bytes and 10 bytes) that are embedded within the robj, we see almost no improvement, which is as expected. However, the small improvement observed even for these embedded values is somewhat surprising. Given that we are not actively prefetching these embedded values, this minor performance gain was not anticipated. perf record on save command without value prefetching: ``` --99.98%--rdbSaveDb \| \|--91.38%--rdbSaveKeyValuePair \| \| \| \|--42.72%--rdbSaveRawString \| \| \| \| \| \|--26.69%--rdbWriteRaw \| \| \| \| \| \| \| --25.75%--rioFileWrite.lto_priv.0 \| \| \| \| \| --15.41%--rdbSaveLen \| \| \| \| \| \|--7.58%--rdbWriteRaw \| \| \| \| \| \| \| --7.08%--rioFileWrite.lto_priv.0 \| \| \| \| \| \| \| --6.54%--_IO_fwrite \| \| \| \| \| \| \| \| --7.42%--rdbWriteRaw.constprop.1 \| \| \| \| \| --7.18%--rioFileWrite.lto_priv.0 \| \| \| \| \| --6.73%--_IO_fwrite \| \| \| \| \| \|--40.44%--rdbSaveStringObject \| \| \| --7.62%--rdbSaveObjectType \| \| \| --7.39%--rdbWriteRaw.constprop.1 \| \| \| --7.04%--rioFileWrite.lto_priv.0 \| \| \| --6.59%--_IO_fwrite \| \| --7.33%--hashtableNext.constprop.1 \| --6.28%--prefetchNextBucketEntries.lto_priv.0 ``` perf record on save command with value prefetching: ``` rdbSaveRio \| --99.93%--rdbSaveDb \| \|--79.81%--rdbSaveKeyValuePair \| \| \| \|--66.79%--rdbSaveRawString \| \| \| \| \| \|--42.31%--rdbWriteRaw \| \| \| \| \| \| \| --40.74%--rioFileWrite.lto_priv.0 \| \| \| \| \| --23.37%--rdbSaveLen \| \| \| \| \| \|--11.78%--rdbWriteRaw \| \| \| \| \| \| \| --11.03%--rioFileWrite.lto_priv.0 \| \| \| \| \| \| \| --10.30%--_IO_fwrite \| \| \| \| \| \| \| \| \| --10.98%--rdbWriteRaw.constprop.1 \| \| \| \| \| --10.44%--rioFileWrite.lto_priv.0 \| \| \| \| \| --9.74%--_IO_fwrite \| \| \| \| \| \| \|--11.33%--rdbSaveObjectType \| \| \| \| \| --10.96%--rdbWriteRaw.constprop.1 \| \| \| \| \| --10.51%--rioFileWrite.lto_priv.0 \| \| \| \| \| --9.75%--_IO_fwrite \| \| \| \| \| \| --0.77%--rdbSaveStringObject \| --18.39%--hashtableNext \| \|--10.04%--hashtableObjectPrefetchValue \| --6.06%--prefetchNextBucketEntries ``` Conclusions: The prefetching strategy appears to be working as intended, shifting the performance bottleneck from data access to I/O operations. The significant reduction in rdbSaveStringObject time suggests that string objects(which are the values) are being accessed more efficiently. Signed-off-by: NadavGigi <nadavgigi102@gmail.com>	2025-01-23 12:17:20 +01:00
Ricardo Dias	af71619c45	Extract the scripting engine code from the functions unit (#1312 ) This commit creates a new compilation unit for the scripting engine code by extracting the existing code from the functions unit. We're doing this refactor to prepare the code for running the `EVAL` command using different scripting engines. This PR has a module API change: we changed the type of error messages returned by the callback `ValkeyModuleScriptingEngineCreateFunctionsLibraryFunc` to be a `ValkeyModuleString` (aka `robj`); This PR also fixes #1470. --------- Signed-off-by: Ricardo Dias <ricardo.dias@percona.com>	2025-01-16 10:08:16 +01:00
Rain Valentine	d13aad45f4	Replace dict with new hashtable: hash datatype (#1502 ) This PR replaces dict with the new hashtable data structure in the HASH datatype. There is a new struct for hashtable items which contains a pointer to value sds string and the embedded key sds string. These values were previously stored in dictEntry. This structure is kept opaque so we can easily add small value embedding or other optimizations in the future. closes #1095 --------- Signed-off-by: Rain Valentine <rsg000@gmail.com>	2025-01-13 11:17:16 +01:00
Binbin	d6bdd9e7d7	Fix module LatencyAddSample still work when latency-monitor-threshold is 0 (#1541 ) When latency-monitor-threshold is set to 0, it means the latency monitor is disabled, and in VM_LatencyAddSample, we wrote the condition incorrectly, causing us to record latency when latency was turned off. This bug was introduced in the very first day, see e3b1d6d, it was merged in 2019. Signed-off-by: Binbin <binloveplay1314@qq.com>	2025-01-11 10:32:58 +08:00
Rain Valentine	ab627d6721	Replace dict with new hashtable: sorted set datatype (#1427 ) This PR replaces dict with hashtable in the ZSET datatype. Instead of mapping key to score as dict did, the hashtable maps key to a node in the skiplist, which contains the score. This takes advantage of hashtable performance improvements and saves 15 bytes per set item - 24 bytes overhead before, 9 bytes after. Closes #1096 --------- Signed-off-by: Rain Valentine <rsg000@gmail.com> Signed-off-by: Viktor Söderqvist <viktor.soderqvist@est.tech> Co-authored-by: Viktor Söderqvist <viktor.soderqvist@est.tech>	2025-01-08 18:34:02 +01:00
uriyage	6c09eea2bc	client struct: lazy init components and optimize struct layout (#1405 ) # Refactor client structure to use modular data components ## Current State The client structure allocates memory for replication / pubsub / multi-keys / module / blocked data for every client, despite these features being used by only a small subset of clients. In addition the current field layout in the client struct is suboptimal, with poor alignment and unnecessary padding between fields, leading to a larger than necessary memory footprint of 896 bytes per client. Furthermore, fields that are frequently accessed together during operations are scattered throughout the struct, resulting in poor cache locality. ## This PR's Change 1. Lazy Initialization - Components are only allocated when first used: - PubSubData: Created on first SUBSCRIBE/PUBLISH operation - ReplicationData: Initialized only for replica connections - ModuleData: Allocated when module interaction begins - BlockingState: Created when first blocking command is issued - MultiState: Initialized on MULTI command 2. Memory Layout Optimization: - Grouped related fields for better locality - Moved rarely accessed fields (e.g., client->name) to struct end - Optimized field alignment to eliminate padding 3. Additional changes: - Moved watched_keys to be static allocated in the `mstate` struct - Relocated replication init logic to replication.c ### Key Benefits - Efficient Memory Usage: - 45% smaller base client structure - Basic clients now use 528 bytes (down from 896). - Better memory locality for related operations - Performance improvement in high throughput scenarios. No performance regressions in other cases. ### Performance Impact Tested with 650 clients and 512 bytes values. #### Single Thread Performance \| Operation \| Dataset \| New (ops/sec) \| Old (ops/sec) \| Change % \| \|------------\|---------\|---------------\|---------------\|-----------\| \| SET \| 1 key \| 261,799 \| 258,261 \| +1.37% \| \| SET \| 3M keys \| 209,134 \| ~209,000 \| ~0% \| \| GET \| 1 key \| 281,564 \| 277,965 \| +1.29% \| \| GET \| 3M keys \| 231,158 \| 228,410 \| +1.20% \| #### 8 IO Threads Performance \| Operation \| Dataset \| New (ops/sec) \| Old (ops/sec) \| Change % \| \|------------\|---------\|---------------\|---------------\|-----------\| \| SET \| 1 key \| 1,331,578 \| 1,331,626 \| -0.00% \| \| SET \| 3M keys \| 1,254,441 \| 1,152,645 \| +8.83% \| \| GET \| 1 key \| 1,293,149 \| 1,289,503 \| +0.28% \| \| GET \| 3M keys \| 1,152,898 \| 1,101,791 \| +4.64% \| #### Pipeline Performance (3M keys) \| Operation \| Pipeline Size \| New (ops/sec) \| Old (ops/sec) \| Change % \| \|-----------\|--------------\|---------------\|---------------\|-----------\| \| SET \| 10 \| 548,964 \| 538,498 \| +1.94% \| \| SET \| 20 \| 606,148 \| 594,872 \| +1.89% \| \| SET \| 30 \| 631,122 \| 616,606 \| +2.35% \| \| GET \| 10 \| 628,482 \| 624,166 \| +0.69% \| \| GET \| 20 \| 687,371 \| 681,659 \| +0.84% \| \| GET \| 30 \| 725,855 \| 721,102 \| +0.66% \| ### Observations: 1. Single-threaded operations show consistent improvements (1-1.4%) 2. Multi-threaded performance shows significant gains for large datasets: - SET with 3M keys: +8.83% improvement - GET with 3M keys: +4.64% improvement 3. Pipeline operations show consistent improvements: - SET operations: +1.89% to +2.35% - GET operations: +0.66% to +0.84% 4. No performance regressions observed in any test scenario Related issue:https://github.com/valkey-io/valkey/issues/761 --------- Signed-off-by: Uri Yagelnik <uriy@amazon.com> Signed-off-by: uriyage <78144248+uriyage@users.noreply.github.com> Co-authored-by: Viktor Söderqvist <viktor.soderqvist@est.tech>	2025-01-08 10:28:54 +02:00
Ricardo Dias	8d764f27b3	Refactor: move all valkey modules related declarations to `module.h` (#1489 ) In this commit we move all structures and functions declarations related to Valkey modules from `server.h` to the recently added `module.h` file. This re-organization makes it easier for new contributors to find the valkey modules related code, as well as reducing the compilation times when changes are made to the modules code. --------- Signed-off-by: Ricardo Dias <ricardo.dias@percona.com>	2025-01-02 18:35:10 +01:00
ranshid	0f273bb648	Align rejected unblocked commands to update the correct error statistic (#577 ) Currently, in case a blocked command is unblocked externally (eg. due to the relevant slot being migrated or the CLIENT UNBLOCK command was issued, the command statistics will always update the failed_calls error statistic. This leads to missalignment with `90b9f08e5d` as well as some inconsistencies. For example when a key is migrated during cluster slot migration, clients blocked on XREADGROUP will be unblocked and update the rejected_calls stat, while clients blocked on BLPOP will get unblocked updating the failed_calls stat. In this PR we add explicit indication in updateStatsOnUnblock thet indicates if the command was rejected or failed. --------- Signed-off-by: ranshid <ranshid@amazon.com> Signed-off-by: Ran Shidlansik <ranshid@amazon.com>	2025-01-01 16:33:09 +02:00
Ricardo Dias	6adef8e2f9	Adds support for scripting engines as Valkey modules (#1277 ) This PR extends the module API to support the addition of different scripting engines to execute user defined functions. The scripting engine can be implemented as a Valkey module, and can be dynamically loaded with the `loadmodule` config directive, or with the `MODULE LOAD` command. This PR also adds an example of a dummy scripting engine module, to show how to use the new module API. The dummy module is implemented in `tests/modules/helloscripting.c`. The current module API support, only allows to load scripting engines to run functions using `FCALL` command. The additions to the module API are the following: ```c /* This struct represents a scripting engine function that results from the * compilation of a script by the engine implementation. / struct ValkeyModuleScriptingEngineCompiledFunction typedef ValkeyModuleScriptingEngineCompiledFunction (ValkeyModuleScriptingEngineCreateFunctionsLibraryFunc)( ValkeyModuleScriptingEngineCtx engine_ctx, const char code, size_t timeout, size_t out_num_compiled_functions, char err); typedef void (ValkeyModuleScriptingEngineCallFunctionFunc)( ValkeyModuleCtx module_ctx, ValkeyModuleScriptingEngineCtx engine_ctx, ValkeyModuleScriptingEngineFunctionCtx func_ctx, void compiled_function, ValkeyModuleString keys, size_t nkeys, ValkeyModuleString args, size_t nargs); typedef size_t (ValkeyModuleScriptingEngineGetUsedMemoryFunc)( ValkeyModuleScriptingEngineCtx engine_ctx); typedef size_t (ValkeyModuleScriptingEngineGetFunctionMemoryOverheadFunc)( void compiled_function); typedef size_t (ValkeyModuleScriptingEngineGetEngineMemoryOverheadFunc)( ValkeyModuleScriptingEngineCtx engine_ctx); typedef void (ValkeyModuleScriptingEngineFreeFunctionFunc)( ValkeyModuleScriptingEngineCtx engine_ctx, void compiled_function); / This struct stores the callback functions implemented by the scripting * engine to provide the functionality for the `FUNCTION ` commands. / typedef struct ValkeyModuleScriptingEngineMethodsV1 { uint64_t version; /* Version of this structure for ABI compat. / / Library create function callback. When a new script is loaded, this * callback will be called with the script code, and returns a list of * ValkeyModuleScriptingEngineCompiledFunc objects. / ValkeyModuleScriptingEngineCreateFunctionsLibraryFunc create_functions_library; / The callback function called when `FCALL` command is called on a function * registered in this engine. / ValkeyModuleScriptingEngineCallFunctionFunc call_function; / Function callback to get current used memory by the engine. / ValkeyModuleScriptingEngineGetUsedMemoryFunc get_used_memory; / Function callback to return memory overhead for a given function. / ValkeyModuleScriptingEngineGetFunctionMemoryOverheadFunc get_function_memory_overhead; / Function callback to return memory overhead of the engine. / ValkeyModuleScriptingEngineGetEngineMemoryOverheadFunc get_engine_memory_overhead; / Function callback to free the memory of a registered engine function. / ValkeyModuleScriptingEngineFreeFunctionFunc free_function; } ValkeyModuleScriptingEngineMethodsV1; / Registers a new scripting engine in the server. * * - `engine_name`: the name of the scripting engine. This name will match * against the engine name specified in the script header using a shebang. * * - `engine_ctx`: engine specific context pointer. * * - `engine_methods`: the struct with the scripting engine callback functions * pointers. / int ValkeyModule_RegisterScriptingEngine(ValkeyModuleCtx ctx, const char engine_name, void engine_ctx, ValkeyModuleScriptingEngineMethods engine_methods); /* Removes the scripting engine from the server. * * `engine_name` is the name of the scripting engine. * / int ValkeyModule_UnregisterScriptingEngine(ValkeyModuleCtx ctx, const char *engine_name); ``` --------- Signed-off-by: Ricardo Dias <ricardo.dias@percona.com>	2024-12-21 23:09:35 +01:00
Madelyn Olson	e203ca35b7	Fix undefined behavior defined by ASAN (#1451 ) Asan now supports making sure you are passing in the correct pointer type, which seems useful but we can't support it since we pass in an incorrect pointer in several places. This is most commonly done with generic free functions, where we simply cast it to the correct type. It's not a lot of code to clean up, so it seems appropriate to cleanup instead of disabling the check. --------- Signed-off-by: Madelyn Olson <madelyneolson@gmail.com> Co-authored-by: Viktor Söderqvist <viktor.soderqvist@est.tech>	2024-12-17 17:48:53 -08:00
Rain Valentine	88942c8e61	Replace dict with new hashtable for sets datatype (#1176 ) The new `hashtable` provides faster lookups and uses less memory than `dict`. A TCL test case "SRANDMEMBER with a dict containing long chain" is deleted because it's covered by a hashtable unit test "test_random_entry_with_long_chain", which is already present. This change also moves some logic from dismissMemory (object.c) to zmadvise_dontneed (zmalloc.c), so the hashtable implementation which needs the dismiss functionality doesn't need to depend on object.c and server.h. This PR follows #1186. --------- Signed-off-by: Rain Valentine <rsg000@gmail.com> Signed-off-by: Viktor Söderqvist <viktor.soderqvist@est.tech> Co-authored-by: Viktor Söderqvist <viktor.soderqvist@est.tech>	2024-12-14 20:53:48 +01:00
Viktor Söderqvist	3eb8314be6	Replace dict with hashtable for keys, expires and pubsub channels Instead of a dictEntry with pointers to key and value, the hashtable has a pointer directly to the value (robj) which can hold an embedded key and acts as a key-value in the hashtable. This minimizes the number of pointers to follow and thus the number of memory accesses to lookup a key-value pair. Keys robj hashtable +-------+ +-----------------------+ \| 0 \| \| type, encoding, LRU \| \| 1 ------->\| refcount, expire \| \| 2 \| \| ptr \| \| ... \| \| optional embedded key \| +-------+ \| optional embedded val \| +-----------------------+ The expire timestamp (TTL) is also stored in the robj, if any. The expire hash table points to the same robj. Overview of changes: * Replace dict with hashtable in kvstore (kvstore.c) * Add functions for embedding key and expire in robj (object.c) * When there's unused space, reserve an expire field to avoid realloting it later if expire is added. * Always reserve space for expire for large key names to avoid realloc if it's set later. * Update db functions (db.c) * dbAdd, setKey and setExpire reallocate the object when embedding a key * setKey does not increment the reference counter, since it would require duplicating the object. This responsibility is moved to the caller. * Remove logic for shared integer objects as values in the database. The keys are now embedded in the objects, so all objects in the database need to be unique. Thus, we can't use shared objects as values. Also delete test cases for shared integers. * Adjust various commands to the changes mentioned above. * Adjust defrag code * Improvement: Don't access the expires table before defrag has actually reallocated the object. * Adjust test cases that were using hard-coded sizes for dict when realloc would happen, and some other adjustments in test cases. * Adjust memory prefetch for new hash table implementation in IO-threading, using new `hashtableIncrementalFind` API * Adjust offloading of free() to IO threads: Object free to be done in main thread while keeping obj->ptr offloading in IO-thread since the DB object is now allocated by the main-thread and not by the IO-thread as it used to be. * Let expireIfNeeded take an optional value, to avoid looking up the expires table when possible. --------- Signed-off-by: Uri Yagelnik <uriy@amazon.com> Signed-off-by: uriyage <78144248+uriyage@users.noreply.github.com> Signed-off-by: Viktor Söderqvist <viktor.soderqvist@est.tech> Co-authored-by: Uri Yagelnik <uriy@amazon.com>	2024-12-10 21:30:56 +01:00
Rain Valentine	4efff42f04	Replace dict with hashtable in command tables (#1065 ) This changes the type of command tables from dict to hashtable. Command table lookup takes ~3% of overall CPU time in benchmarks, so it is a good candidate for optimization. My initial SET benchmark comparison suggests that hashtable is about 4.5 times faster than dict and this replacement reduced overall CPU time by 2.79% 🥳 --------- Signed-off-by: Rain Valentine <rainval@amazon.com> Signed-off-by: Rain Valentine <rsg000@gmail.com> Signed-off-by: Viktor Söderqvist <viktor.soderqvist@est.tech> Co-authored-by: Rain Valentine <rainval@amazon.com>	2024-12-10 21:30:56 +01:00
Wen Hui	71560a2a4a	Add API UpdateRuntimeArgs for updating the module arguments during runtime (#1041 ) Before Redis OSS 7, if we load a module with some arguments during runtime, and run the command "config rewrite", the module information will not be saved into the config file. Since Redis OSS 7 and Valkey 7.2, if we load a module with some arguments during runtime, the module information (path, arguments number, and arguments value) can be saved into the config file after config rewrite command is called. Thus, the module will be loaded automatically when the server startup next time. Following is one example: bind 172.25.0.58 port 7000 protected-mode no enable-module-command yes Generated by CONFIG REWRITE latency-tracking-info-percentiles 50 99 99.9 dir "/home/ubuntu/valkey" save 3600 1 300 100 60 10000 user default on nopass sanitize-payload ~* &* +https://github.com/ALL loadmodule tests/modules/datatype.so 10 20 However, there is one problem. If developers write a module, and update the running arguments by someway, the updated arguments can not be saved into the config file even "config rewrite" is called. The reason comes from the following function rewriteConfigLoadmoduleOption (src/config.c) void rewriteConfigLoadmoduleOption(struct rewriteConfigState state) { .......... struct ValkeyModule module = dictGetVal(de); line = sdsnew("loadmodule "); line = sdscatsds(line, module->loadmod->path); for (int i = 0; i < module->loadmod->argc; i++) { line = sdscatlen(line, " ", 1); line = sdscatsds(line, module->loadmod->argv[i]->ptr); } rewriteConfigRewriteLine(state, "loadmodule", line, 1); ....... } The function only save the initial arguments information (module->loadmod) into the configfile. After core members discuss, ref https://github.com/valkey-io/valkey/issues/1177 We decide add the following API to implement this feature: Original proposal: int VM_UpdateRunTimeArgs(ValkeyModuleCtx ctx, int index, char value); Updated proposal: ValkeyModuleString *values VM_GetRuntimeArgs(ValkeyModuleCtx ctx); *int VM_UpdateRuntimeArgs(ValkeyModuleCtx ctx, int argc, ValkeyModuleString **values); Why we do not recommend the following way: MODULE UNLOAD Update module args in the conf file MODULE LOAD I think there are the following disadvantages: 1. Some modules can not be unloaded. Such as the example module datatype.so, which is tests/modules/datatype.so 2. it is not atomic operation for MODULE UNLOAD + MODULE LOAD 3. sometimes, if we just run the module unload, the client business could be interrupted --------- Signed-off-by: hwware <wen.hui.ware@gmail.com>	2024-12-05 11:58:24 -05:00
Jim Brunner	349bc7547b	defrag: use monotime in module interface (#1388 ) The recent PR (https://github.com/valkey-io/valkey/pull/1242) converted Active Defrag to use `monotime`. In that change, a conversion was performed to continue to use `ustime()` as part of the module interface. Since this time is only used internally, and never actually exposed to the module, we can convert this to use `monotime` directly. Signed-off-by: Jim Brunner <brunnerj@amazon.com>	2024-12-03 11:19:53 -08:00
Binbin	5d08149e72	Use fake client flag to replace not conn check (#1198 ) The fake client flag was introduced in #1063, we want this to replace all !conn fake client checks. Signed-off-by: Binbin <binloveplay1314@qq.com>	2024-11-27 18:02:07 +08:00
Qu Chen	32f7541fe3	Simplify dictType callbacks and move some macros from dict.h to dict.c (#1281 ) Remove the dict pointer argument to the `dictType` callbacks `keyDup`, `keyCompare`, `keyDestructor` and `valDestructor`. This argument was unused in all of the callback implementations. The macros `dictFreeKey()` and `dictFreeVal()` are made internal to dict and moved from dict.h to dict.c. They're also changed from macros to static inline functions. Signed-off-by: Qu Chen <quchen@amazon.com>	2024-11-14 09:45:47 +01:00
Binbin	1892f8a731	Add server log when module load fails with busy name (#1084 ) Currently when module loading fails due to busy name, we don't have a clean way to assist to troubleshooting. Case 1: when loading the same module multiple times, we can not detemine the cause of its failure without referring to the module list or the earliest module load log. The log may not exist and sometimes it is difficult for people to associate module list. Case 2: when multiple modules use the same module name, we can not quickly associate the busy name without referring to the module list and the earliest module load log. Different people wrote modules with the same module name, they don't easily associate module name. So in this PR, when doing module onload, we will try to print a busy name log if this happen. Currently we check ctx.module since if it is NULL it means the Init call failed, and Init currently only fails with busy name. It's kind of ugly. It would have been nice if we could have had a better way for onload to signal why the load failed. Signed-off-by: Binbin <binloveplay1314@qq.com>	2024-10-09 16:10:29 +08:00
Guillaume Koenig	f85d8bfde9	Rax size tracking (#688 ) Introduce a `size_t` field into the rax struct to track allocation size. Update the allocation size on rax insert and deletes. Return the allocation size when `raxAllocSize` is called. This size tracking is now used in MEMORY USAGE and MEMORY STATS in place of the previous method based on sampling. The module API allows to create sorted dictionaries, which are backed by rax. Users now also get precise memory allocation for them (through `ValkeyModule_MallocSizeDict`). Fixes #677. For the release notes: * MEMORY USAGE and MEMORY STATS are now exact for streams, rather than based on sampling. --------- Signed-off-by: Guillaume Koenig <knggk@amazon.com> Signed-off-by: Guillaume Koenig <106696198+knggk@users.noreply.github.com> Co-authored-by: Joey <yzhaon@amazon.com> Co-authored-by: Viktor Söderqvist <viktor.soderqvist@est.tech>	2024-10-02 19:28:55 +02:00
Shivshankar	a37dee4b3a	Change return value of aeTimeProc callback function to long long. (#1057 ) moduleTimerHandler is aeTimeProc handler and event loop gets created with this. However, found that the function return type is int but actually returns "long long" value(i.e., next_period). and return value being assigned to int variable in processTimeEvents(where time events are processed), this might cause an overflow of the timer values. So changed the return type of the function to long long. And also updated other callback function return type to be consistent. I found this when I was checking functions reported in https://github.com/valkey-io/valkey/issues/1054 issue stacktrace. (FYI, this is just to update the return type to be consistent and it will not the fix for the issue reported) Signed-off-by: Shivshankar-Reddy <shiva.sheri.github@gmail.com>	2024-09-27 12:20:47 -07:00
Binbin	80fcbd3fec	Fix module / script call CLUSTER SLOTS / SHARDS fake client check crash (#1063 ) The reason is VM_Call will use a fake client without connection, so we also need to check if c->conn is NULL. This also affects scripts. If they are called in the script, the server will crash. Injecting commands into AOF will also cause startup failure. Fixes #1054. Signed-off-by: Binbin <binloveplay1314@qq.com>	2024-09-25 14:50:48 +08:00
Mikhail Koviazin	af811748e7	clang-format: set ColumnLimit to 0 and reformat (#1045 ) This commit hopefully improves the formatting of the codebase by setting ColumnLimit to 0 and hence stopping clang-format from trying to put as much stuff in one line as possible. This change enabled us to remove most of `clang-format off` directives and fixed a bunch of lines that looked like this: ```c #define KEY \ VALUE /* comment */ ``` Additionally, one pair of `clang-format off` / `clang-format on` had `clang-format off` as the second comment and hence didn't enable the formatting for the rest of the file. This commit addresses this issue as well. Please tell me if anything in the changes seem off. If everything is fine, I will add this commit to `.git-blame-ignore-revs` later. --------- Signed-off-by: Mikhail Koviazin <mikhail.koviazin@aiven.io>	2024-09-25 01:22:54 +02:00
Binbin	4033c99ef5	Fix module RdbLoad wrongly disable the AOF (#1001 ) In RdbLoad, we disable AOF before emptyData and rdbLoad to prevent copy-on-write issues. After rdbLoad completes, AOF should be re-enabled, but the code incorrectly checks server.aof_state, which has been reset to AOF_OFF in stopAppendOnly. This leads to AOF not being re-enabled after being disabled. --------- Signed-off-by: Binbin <binloveplay1314@qq.com>	2024-09-10 21:00:08 -07:00
Pieter Cailliau	4d284daefd	Copyright update to reflect IP transfer from salvatore to Redis (#740 ) Update references of copyright being assigned to Salvatore when it was transferred to Redis Ltd. as per https://github.com/valkey-io/valkey/issues/544. --------- Signed-off-by: Pieter Cailliau <pieter@redis.com>	2024-08-14 09:20:36 -07:00
uriyage	94bc15cb71	Io thread work offload (#763 ) ### IO-Threads Work Offloading This PR is the 2nd of 3 PRs intended to achieve the goal of 1M requests per second. (1st PR: https://github.com/valkey-io/valkey/pull/758) This PR offloads additional work to the I/O threads, beyond the current read-parse/write operations, to better utilize the I/O threads and reduce the load on the main thread. It contains the following 3 commits: ### Poll Offload Currently, the main thread is responsible for executing the poll-wait system call, while the IO threads wait for tasks from the main thread. The poll-wait operation is expensive and can consume up to 30% of the main thread's time. We could have let the IO threads do the poll-wait by themselves, with each thread listening to some of the clients and notifying the main thread when a client's command is ready to execute. However, the current approach, where the main thread listens for events from the network, has several benefits. The main thread remains in charge, allowing it to know the state of each client (idle/read/write/close) at any given time. Additionally, it makes the threads flexible, enabling us to drain an IO thread's job queue and stop a thread when the load is light without modifying the event loop and moving its clients to a different IO thread. Furthermore, with this approach, the IO threads don't need to wait for both messages from the network and from the main thread instead, the threads wait only for tasks from the main thread. To enjoy the benefits of both the main thread remaining in charge and the poll being offloaded, we propose offloading the poll-wait as a single-time, non-blocking job to one of the IO threads. The IO thread will perform a poll-wait non-blocking call while the main thread processes the client commands. Later, in `aeProcessEvents`, instead of sleeping on the poll, we check for the IO thread's poll-wait results. The poll-wait will be offloaded in `beforeSleep` only when there are ready events for the main thread to process. If no events are pending, the main thread will revert to the current behavior and sleep on the poll by itself. Implementation Details A new call back `custompoll` was added to the `aeEventLoop` when not set to `NULL` the ae will call the `custompoll` callback instead of the `aeApiPoll`. When the poll is offloaded we will set the `custompoll` to `getIOThreadPollResults` and send a poll-job to the thread. the thread will take a mutex, call a non-blocking (with timeout 0) to `aePoll` which will populate the fired events array. the IO thread will set the `server.io_fired_events` to the number of the returning `numevents`, later the main-thread in `custompoll` will return the `server.io_fired_events` and will set the `customPoll` back to `NULL`. To ensure thread safety when accessing server.el, all functions that modify the eventloop events were wrapped with a mutex to ensure mutual exclusion when modifying the events. ### Command Lookup Offload As the IO thread parses the command from the client's Querybuf, it can perform a command lookup in the commands dictionary, which can consume up to ~5% of the main-thread runtime. Implementation details The IO thread will store the looked-up command in the client's new field `io_parsed_cmd` field. We can't use `c->cmd` for that since we use `c->cmd `to check if a command was reprocessed or not. To ensure thread safety when accessing the command dictionary, we make sure the main thread isn't changing the dictionary while IO threads are accessing it. This is accomplished by introducing a new flag called `no_incremental_rehash` for the `dictType` commands. When performing `dictResize`, we will rehash the entire dictionary in place rather than deferring the process. ### Free Offload Since the command arguments are allocated by the I/O thread, it would be beneficial if they were also freed by the same thread. If the main thread frees objects allocated by the I/O thread, two issues arise: 1. During the freeing process, the main thread needs to access the SDS pointed to by the object to get its length. 2. With Jemalloc, each thread manages thread local pool (`tcache`) of buffers for quick reallocation without accessing the arena. If the main thread constantly frees objects allocated by other threads, those threads will have to frequently access the shared arena to obtain new memory allocations Implementation Details When freeing the client's argv, we will send the argv array to the thread that allocated it. The thread will be identified by the client ID. When freeing an object during `dbOverwrite`, we will offload the object free as well. We will extend this to offload the free during `dbDelete` in a future PR, as its effects on defrag/memory evictions need to be studied. --------- Signed-off-by: Uri Yagelnik <uriy@amazon.com>	2024-07-18 19:21:45 -07:00
Viktor Söderqvist	a323dce890	Dual stack and client-specific IPs in cluster (#736 ) New configs: * `cluster-announce-client-ipv4` * `cluster-announce-client-ipv6` New module API function: * `ValkeyModule_GetClusterNodeInfoForClient`, takes a client id and is otherwise just like its non-ForClient cousin. If configured, one of these IP addresses are reported to each client in CLUSTER SLOTS, CLUSTER SHARDS, CLUSTER NODES and redirects, replacing the IP (`custer-announce-ip` or the auto-detected IP) of each node. Which one is reported to the client depends on whether the client is connected over IPv4 or IPv6. Benefits: * This allows clients using IPv4 to get the IPv4 addresses of all cluster nodes and IPv6 clients to get the IPv6 clients. * This allows the IPs visible to clients to be different to the IPs used between the cluster nodes due to NAT'ing. The information is propagated in the cluster bus using new Ping extensions. (Old nodes without this feature ignore unknown Ping extensions.) This adds another dimension to CLUSTER SLOTS reply. It now depends on the client's use of TLS, the IP address family and RESP version. Refactoring: The cached connection type definition is moved from connection.h (it actually has nothing to do with the connection abstraction) to server.h and is changed to a bitmap, with one bit for each of TLS, IPv6 and RESP3. Fixes #337 --------- Signed-off-by: Viktor Söderqvist <viktor.soderqvist@est.tech>	2024-07-10 13:53:52 +02:00
skyfirelee	e4c1f6d45a	Replace client flags to bitfield (#614 )	2024-06-30 11:33:10 -07:00
Madelyn Olson	6faa48a358	Don't initialize the key buffer in getKeysResult (#631 ) getKeysResults is typically initialized with 2kb of zeros (16 * 256), which isn't strictly necessary since the only thing we have to initialize is some of the metadata fields. The rest of the data can remain junk as long as we don't access it. This was a bit of a regression in 7.0 with the keyspecs, since we doubled the size of the zeros, but hopefully this recovers a lot of the performance drop. I saw a modest performance bump for deep pipeline of cluster mode (~8%). I think we would see some comparable improvements in the other places where we are using it such as tracking and ACLs. --------- Signed-off-by: Madelyn Olson <matolson@amazon.com>	2024-06-14 08:42:00 -07:00
skyfirelee	09b5825b26	Moving client->authenticated to a flag instead of an int (#592 ) Moving client->authenticated to a flag Fix #589 Signed-off-by: artikell <739609084@qq.com>	2024-06-09 11:49:05 -07:00
Ping Xie	54c9747935	Remove `master` and `slave` from source code (#591 ) External facing interfaces are not affected. --------- Signed-off-by: Ping Xie <pingxie@google.com>	2024-06-07 14:21:33 -07:00
Eran Liberty	0700c441c6	Remove unused valDup (#443 ) Remove the unused value duplicate API from dict. It's unused in the codebase and introduces unnecessary overhead. --------- Signed-off-by: Eran Liberty <eran.liberty@gmail.com>	2024-06-03 12:22:06 -07:00
Samuel Adetunji	5d0f4bc9f0	Require C11 atomics (#490 ) - Replaces custom atomics logic with C11 default atomics logic. - Drops "atomicvar_api" field from server info Closes #485 --------- Signed-off-by: adetunjii <adetunjithomas1@outlook.com> Signed-off-by: Samuel Adetunji <adetunjithomas1@outlook.com> Co-authored-by: teej4y <samuel.adetunji@prunny.com>	2024-05-26 18:41:11 +02:00
Ping Xie	c41dd77a3e	Add clang-format configs (#323 ) I have validated that these settings closely match the existing coding style with one major exception on `BreakBeforeBraces`, which will be `Attach` going forward. The mixed `BreakBeforeBraces` styles in the current codebase are hard to imitate and also very odd IMHO - see below ``` if (a == 1) { /Attach / } ``` ``` if (a == 1 \|\| b == 2) { /* Why? */ } ``` Please do NOT merge just yet. Will add the github action next once the style is reviewed/approved. --------- Signed-off-by: Ping Xie <pingxie@google.com>	2024-05-22 23:24:12 -07:00
Madelyn Olson	546cef6684	Initial cleanup for cluster refactoring (#460 ) Cleaned up the minor cluster refactoring notes that were intended to be follow ups that never happened. Basically: 1. Minor style nitpicks 2. Generalized clusterNodeIsMyself so that it wasn't implementation dependent. 3. Removed getMyClusterId, and just make it an explicit call to myself's name, which seems more straightforward and removes unnecessary abstraction. 4. Remove clusterNodeGetSlaveof infavor of clusterNodeGetMaster. We already do a check if it's a replica, and if it wasn't working it would have been crashing. Signed-off-by: Madelyn Olson <madelyneolson@gmail.com>	2024-05-14 17:09:49 -07:00
Shivshankar	07367df981	Update rdb and module's child proc name to valkey accordingly (compatible with redis symlink) (#454 ) If `valkey-server` was started with the `redis-server` symlink, the old proc names are used, for backwards compatibility. --------- Signed-off-by: Shivshankar-Reddy <shiva.sheri.github@gmail.com>	2024-05-10 21:51:01 +02:00
Viktor Söderqvist	6af51f5092	Prevent clang-format in certain places (#468 ) This is a preparation for adding clang-format. These comments prevent automatic formatting in some places. With these exceptions, we will be able to run clang-format on the rest of the code. This is a preparation for #323. --------- Signed-off-by: Viktor Söderqvist <viktor.soderqvist@est.tech>	2024-05-08 20:58:53 +02:00
Shivshankar	315b7573c4	Update server function's name to valkey (#456 ) Updated valkey in follwing functions. genRedisInfoString -> genValkeyInfoString genRedisInfoStringCommandStats -> genValkeyInfoStringCommandStats genRedisInfoStringACLStats -> genValkeyInfoStringACLStats genRedisInfoStringLatencyStats -> genValkeyInfoStringLatencyStats Signed-off-by: Shivshankar-Reddy <shiva.sheri.github@gmail.com>	2024-05-08 09:44:05 -04:00
Shivshankar	1aca85e3de	Update module api and variable to valkey accordingly. (#455 ) Updated redis instances accordingly as follows. rediscmd -> serverCmd freeRedisModuleAsyncRMCallPromise -> freeValkeyModuleAsyncRMCallPromise MyCommand_RedisCommand -> MyCommand_ValkeyCommand RedisModuleString -> ValkeyModuleString flushRedisModuleIOBuffer -> flushValkeyModuleIOBuffer Signed-off-by: Shivshankar-Reddy <shiva.sheri.github@gmail.com>	2024-05-07 16:29:46 -07:00
Viktor Söderqvist	472c1ca26b	Update links in module API docs (generated from module.c) (#433 ) These are used in the docs and on the website. Signed-off-by: Viktor Söderqvist <viktor.soderqvist@est.tech>	2024-05-04 00:14:56 +02:00
Shivshankar	2e46046625	Rename macros in valkey-cli.c and redis_strlcpy to valkey (#284 ) Signed-off-by: Shivshankar-Reddy <shiva.sheri.github@gmail.com>	2024-04-10 22:50:52 +02:00
Jacob Murphy	df5db0627f	Remove trademarked language in code comments (#223 ) This includes comments used for module API documentation. * Strategy for replacement: Regex search: `(//\|/\\| \\|#).* ("\|\()?(r\|R)edis( \|\. \|'\|\n\|,\|-\|\)\|")(?!nor the names of its contributors)(?!Ltd.)(?!Labs)(?!Contributors.)` * Don't edit copyright comments * Replace "Redis version X.X" -> "Redis OSS version X.X" to distinguish from newly licensed repository * Replace "Redis Object" -> "Object" * Exclude markdown for now * Don't edit Lua scripting comments referring to redis.X API * Replace "Redis Protocol" -> "RESP" * Replace redis-benchmark, -cli, -server, -check-aof/rdb with "valkey-" prefix * Most other places, I use best judgement to either remove "Redis", or replace with "the server" or "server" Fixes #148 --------- Signed-off-by: Jacob Murphy <jkmurphy@google.com> Signed-off-by: Viktor Söderqvist <viktor.soderqvist@est.tech>	2024-04-09 10:24:03 +02:00
Bany	d92dc78cb8	Update ValkeyModuleEvent_MasterLinkChange to ValkeyModuleEvent_PrimaryLinkChange (#262 ) Update ValkeyModuleEvent_MasterLinkChange to ValkeyModuleEvent_PrimaryLinkChange Signed-off-by: 0del <bany.y0599@gmail.com>	2024-04-08 08:56:39 -07:00
0del	717ec1e144	Rename ValkeyModule_DefragModuleString to ValkeyModule_DefragValkeyModuleString (#243 ) fixes: #242 --------- Signed-off-by: 0del <bany.y0599@gmail.com> Co-authored-by: Viktor Söderqvist <viktor.soderqvist@est.tech>	2024-04-06 22:50:56 +02:00
Madelyn Olson	bc28fb4ac0	Update Server version to valkey version (#232 ) This commit updates the following fields: 1. server_version -> valkey_version in server info. Since we would like to advertise specific compatibility, we are making the version specific to valkey. servername will remain as an optional indicator, and other valkey compatible stores might choose to advertise something else. 1. We dropped redis-ver from the API. This isn't related to API compatibility, but we didn't want to "fake" that valkey was creating an rdb from a Redis version. 1. Renamed server-ver -> valkey_version in rdb info. Same as point one, we want to explicitly indicate this was created by a valkey server. --------- Signed-off-by: Madelyn Olson <madelyneolson@gmail.com>	2024-04-05 21:15:57 -07:00
Ping Xie	aaec321213	Remove REDISMODULE_ prefixes and introduce compatibility header (#194 ) Fix #146 Removed REDISMODULE_ prefixes from the core source code to align with the new SERVERMODULE_ naming convention. Added a new 'redismodule.h' header file to ensure full backward compatibility with existing modules. This compatibility layer maps all legacy REDISMODULE_ prefixed identifiers to their new SERVERMODULE_ equivalents, allowing existing Redis modules to function without modification. --------- Signed-off-by: Ping Xie <pingxie@google.com>	2024-04-05 16:59:55 -07:00
Madelyn Olson	39d0f457a2	Update versioning fields for compatibility (#47 ) New info information to be used to determine the valkey versioning info. Internally, introduce new define values for "SERVER_VERSION" which is different from the Redis compatibility version, "REDIS_VERSION". Add two new info fields: `server_version`: The Valkey server version `server_name`: Indicates that the server is valkey. Add one new RDB field: `server_ver`, which indicates the valkey version that produced the server. Add 3 new LUA globals: `SERVER_VERSION_NUM`, `SERVER_VERSION`, and `SERVER_NAME`. Which reflect the valkey version instead of the Redis compatibility version. Also clean up various places where Redis and configuration was being used that is no longer necessary. --------- Signed-off-by: Madelyn Olson <madelyneolson@gmail.com>	2024-04-03 14:52:36 -07:00
0del	3a0ba0ad93	rename redisCommandArgType serverCommandArgType (#182 ) redisCommandArgType -> serverCommandArgType redisCommandArg -> serverCommandArg https://github.com/valkey-io/valkey/issues/144 Signed-off-by: 0del <bany.y0599@gmail.com>	2024-04-03 20:33:38 +02:00
0del	f753db5141	rename redis functions in server.h (#179 ) redisPopcount -> serverPopcount redisSetProcTitle -> serverSetProcTitle redisCommunicateSystemd -> serverCommunicateSystemd redisSetCpuAffinity -> serverSetCpuAffinity redisFork -> serverFork #144 Signed-off-by: 0del <bany.y0599@gmail.com>	2024-04-03 20:26:33 +02:00
Harkrishn Patro	1736018aa9	Remove trademarked wording on configuration file and individual configs (#29 ) Remove trademarked wording on configuration layer. Following changes for release notes: 1. Rename redis.conf to valkey.conf 2. Pre-filled config in the template config file: Changing pidfile to `/var/run/valkey_6379.pid` Signed-off-by: Harkrishn Patro <harkrisp@amazon.com>	2024-04-03 19:47:26 +02:00

1 2 3 4 5 ...

786 Commits