futriix

Author	SHA1	Message	Date
Madelyn Olson	2b207ee1b3	Improve stability of hostnames test (#1016 ) Maybe partially resolves https://github.com/valkey-io/valkey/issues/952. The hostnames test relies on an assumption that node zero and node six don't communicate with each other to test a bunch of behavior in the handshake stake. This was done by previously dropping all meet packets, however it seems like there was some case where node zero was sending a single pong message to node 6, which was partially initializing the state. I couldn't track down why this happened, but I adjusted the test to simply pause node zero which also correctly emulates the state we want to be in since we're just testing state on node 6, and removes the chance of errant messages. The test was failing about 5% of the time locally, and I wasn't able to reproduce a failure with this new configuration. --------- Signed-off-by: Madelyn Olson <madelyneolson@gmail.com>	2024-09-11 09:52:34 -07:00
Binbin	4033c99ef5	Fix module RdbLoad wrongly disable the AOF (#1001 ) In RdbLoad, we disable AOF before emptyData and rdbLoad to prevent copy-on-write issues. After rdbLoad completes, AOF should be re-enabled, but the code incorrectly checks server.aof_state, which has been reset to AOF_OFF in stopAppendOnly. This leads to AOF not being re-enabled after being disabled. --------- Signed-off-by: Binbin <binloveplay1314@qq.com>	2024-09-10 21:00:08 -07:00
Amit Nagler	1b24168450	Dual Channel Replication - Verify Replica Local Buffer Limit Configuration (#989 ) Prior to comparing the replica buffer against the configured limit, we need to ensure that the limit configuration is enabled. If the limit is set to zero, it indicates that there is no limit, and we should skip the buffer limit check. --------- Signed-off-by: naglera <anagler123@gmail.com>	2024-09-10 17:26:28 -07:00
Binbin	50c1fe59f7	Add missing moduleapi getchannels test and fix tests (#1002 ) Signed-off-by: Binbin <binloveplay1314@qq.com>	2024-09-10 10:13:54 +08:00
Binbin	6478526597	Fix aof base suffix when modifying aof-use-rdb-preamble during rewrite (#886 ) If we modify aof-use-rdb-preamble in the middle of rewrite, we may get a wrong aof base suffix. This is because the suffix is concatenated by the main process afterwards, and it may be different from the beginning. We cache this value when we start the rewrite. Signed-off-by: Binbin <binloveplay1314@qq.com>	2024-09-07 23:27:59 +08:00
Amit Nagler	5fdb47c2e2	Add configuration hide-user-data-from-log to hide user data from server logs (#877 ) Implement data masking for user data in server logs and diagnostic output. This change prevents potential exposure of confidential information, such as PII, and enhances privacy protection. It masks all command arguments, client names, and client usernames. Added a new hide-user-data-from-log configuration item, default yes. --------- Signed-off-by: Amit Nagler <anagler123@gmail.com>	2024-09-02 09:50:36 -07:00
Binbin	5693fe4664	Fix set expire test due to the new lazyfree configs changes (#980 ) Test failed because these two PRs #865 and #913. Signed-off-by: Binbin <binloveplay1314@qq.com>	2024-09-02 22:43:09 +08:00
Binbin	70624ea63d	Change all the lazyfree configurations to yes by default (#913 ) ## Set replica-lazy-flush and lazyfree-lazy-user-flush to yes by default. There are many problems with running flush synchronously. Even in single CPU environments, the thread managers should balance between the freeing and serving incoming requests. ## Set lazy eviction, expire, server-del, user-del to yes by default We now have a del and a lazyfree del, we also have these configuration items to control: lazyfree-lazy-eviction, lazyfree-lazy-expire, lazyfree-lazy-server-del, lazyfree-lazy-user-del. In most cases lazyfree is better since it reduces the risk of blocking the main thread, and because we have lazyfreeGetFreeEffort, on those with high effor (currently 64) will use lazyfree. Part of #653. --------- Signed-off-by: Binbin <binloveplay1314@qq.com>	2024-09-02 07:07:17 -07:00
Binbin	e3af1a30e4	Fast path in SET if the expiration time is expired (#865 ) If the expiration time passed in SET is expired, for example, it has expired due to the machine time (DTS) or the expiration time passed in (wrong arg). In this case, we don't need to set the key and wait for the active expire scan before deleting the key. Compared with previous changes: 1. If the key does not exist, previously we would set the key and wait for the active expire to delete it, so it is a set + del from the perspective of propaganda. Now we will no set the key and return, so it a NOP. 2. If the key exists, previously we woule set the key and wait for the active expire to delete it, so it is a set + del From the perspective of propaganda. Now we will delete it and return, so it is a del. Adding a new deleteExpiredKeyFromOverwriteAndPropagate function to reduce the duplicate code. Signed-off-by: Binbin <binloveplay1314@qq.com> Co-authored-by: Madelyn Olson <madelyneolson@gmail.com>	2024-08-31 22:39:07 +08:00
Binbin	fea49bce2c	Fix timing issue in replica migration test (#968 ) The reason is the server 3 still have the server 7 as its replica due to a short wait, the wait is not enough, we should wait for server loss its replica. ``` *** [err]: valkey-cli make source node ignores NOREPLICAS error when doing the last CLUSTER SETSLOT Expected '{127.0.0.1 21497 267}' to be equal to '' (context: type eval line 34 cmd {assert_equal [lindex [R 3 role] 2] {}} proc ::test) ``` Signed-off-by: Binbin <binloveplay1314@qq.com>	2024-08-30 19:58:46 +08:00
zhaozhao.zz	743f5ac2ae	standalone -REDIRECT handles special case of MULTI context (#895 ) In standalone mode, when a `-REDIRECT` error occurs, special handling is required if the client is in the `MULTI` context. We have adopted the same handling method as the cluster mode: 1. If a command in the transaction encounters a `REDIRECT` at the time of queuing, the execution of `EXEC` will return an `EXECABORT` error (we expect the client to redirect and discard the transaction upon receiving a `REDIRECT`). That is: ``` MULTI ==> +OK SET x y ==> -REDIRECT EXEC ==> -EXECABORT ``` 2. If all commands are successfully queued (i.e., `QUEUED` results are received) but a redirect is detected during `EXEC` execution (such as a primary-replica switch), a `REDIRECT` is returned to instruct the client to perform a redirect. That is: ``` MULTI ==> +OK SET x y ==> +QUEUED failover EXEC ==> -REDIRECT ``` --------- Signed-off-by: zhaozhao.zz <zhaozhao.zz@alibaba-inc.com>	2024-08-30 10:17:53 +08:00
Binbin	ecbfb6a7ec	Fix reconfiguring sub-replica causing data loss when myself change shard_id (#944 ) When reconfiguring sub-replica, there may a case that the sub-replica will use the old offset and win the election and cause the data loss if the old primary went down. In this case, sender is myself's primary, when executing updateShardId, not only the sender's shard_id is updated, but also the shard_id of myself is updated, casuing the subsequent areInSameShard check, that is, the full_sync_required check to fail. As part of the recent fix of #885, the sub-replica needs to decide whether a full sync is required or not when switching shards. This shard membership check is supposed to be done against sub-replica's current shard_id, which however was lost in this code path. This then leads to sub-replica joining the other shard with a completely different and incorrect replication history. This is the only place where replicaof state can be updated on this path so the most natural fix would be to pull the chain replication reduction logic into this code block and before the updateShardId call. This one follow #885 and closes #942. Signed-off-by: Binbin <binloveplay1314@qq.com> Co-authored-by: Ping Xie <pingxie@outlook.com>	2024-08-29 22:39:53 +08:00
Ping Xie	ad0ede302c	Exclude '.' and ':' from `isValidAuxChar`'s banned charset (#963 ) Fix a bug in isValidAuxChar where valid characters '.' and ':' were incorrectly included in the banned charset. This issue affected the validation of auxiliary fields in the nodes.conf file used by Valkey in cluster mode, particularly when handling IPv4 and IPv6 addresses. The code now correctly allows '.' and ':' as valid characters, ensuring proper handling of these fields. Comments were added to clarify the use of the banned charset. Related to #736 --------- Signed-off-by: Ping Xie <pingxie@google.com>	2024-08-28 23:35:31 -07:00
Binbin	75b824052d	Revert make KEYS to be an exact match if there is no pattern (#964 ) In #792, the time complexity became ambiguous, fluctuating between O(1) and O(n), which is a significant difference. And we agree uncertainty can potentially bring disaster to the business, the right thing to do is to persuade users to use EXISTS instead of KEYS in this case, to do the right thing the right way, rather than accommodating this incorrect usage. This reverts commit d66a06e8183818c035bb78706f46fd62645db07e. This reverts #792. Signed-off-by: Binbin <binloveplay1314@qq.com>	2024-08-29 10:58:19 +08:00
Viktor Söderqvist	25dd943087	Delete TLS.md and update README.md about tests (#960 ) Most of the content of TLS.md has already been copied to README.md in #927. The description of how to run tests with TLS is moved to tests/README.md. Descriptions of the additional scripts runtest-cluster, runtest-sentinel and runtest-module are added in tests/README.md. Links to tests/README.md and src/unit/README.md are added in the top-level README.md along with a brief overview of the `make test-*` commands. Signed-off-by: Viktor Söderqvist <viktor.soderqvist@est.tech>	2024-08-28 21:17:04 +02:00
Binbin	4fe8320711	Add pause path coverage to replica migration tests (#937 ) In #885, we only add a shutdown path, there is another path is that the server might got hang by slowlog. This PR added the pause path coverage to cover it. Signed-off-by: Binbin <binloveplay1314@qq.com>	2024-08-28 11:08:27 +08:00
Binbin	6a84e06b05	Wait for the role change and fix the timing issue in the new test (#947 ) The test might be fast enough and then there is no change in the role causing the test to fail. Adding a wait to avoid the timing issue: ``` *** [err]: valkey-cli make source node ignores NOREPLICAS error when doing the last CLUSTER SETSLOT Expected '{127.0.0.1 23154 267}' to be equal to '' (context: type eval line 24 cmd {assert_equal [lindex [R 3 role] 2] {}} proc ::test) ``` Signed-off-by: Binbin <binloveplay1314@qq.com>	2024-08-28 09:51:10 +08:00
Amit Nagler	1ff2a3b6ae	Remove `dual-channel-replication` Feature Flag's Protection (#908 ) Currently, the `dual-channel-replication` feature flag is immutable if `enable-protected-configs` is enabled, which is the default behavior. This PR proposes to make the `dual-channel-replication` flag mutable, allowing it to be changed dynamically without restarting the cluster. Motivation: The ability to change the `dual-channel-replication` flag dynamically is essential for testing and validating the feature on real clusters running in production environments. By making the flag mutable, we can enable or disable the feature without disrupting the cluster's operations, facilitating easier testing and experimentation. Additionally, this change would provide more flexibility for users to enable or disable the feature based on their specific requirements or operational needs without requiring a cluster restart. --------- Signed-off-by: naglera <anagler123@gmail.com>	2024-08-27 10:18:48 -07:00
uriyage	04d76d8b02	Improve multithreaded performance with memory prefetching (#861 ) This PR utilizes the IO threads to execute commands in batches, allowing us to prefetch the dictionary data in advance. After making the IO threads asynchronous and offloading more work to them in the first 2 PRs, the `lookupKey` function becomes a main bottle-neck and it takes about 50% of the main-thread time (Tested with SET command). This is because the Valkey dictionary is a straightforward but inefficient chained hash implementation. While traversing the hash linked lists, every access to either a dictEntry structure, pointer to key, or a value object requires, with high probability, an expensive external memory access. ### Memory Access Amortization Memory Access Amortization (MAA) is a technique designed to optimize the performance of dynamic data structures by reducing the impact of memory access latency. It is applicable when multiple operations need to be executed concurrently. The principle behind it is that for certain dynamic data structures, executing operations in a batch is more efficient than executing each one separately. Rather than executing operations sequentially, this approach interleaves the execution of all operations. This is done in such a way that whenever a memory access is required during an operation, the program prefetches the necessary memory and transitions to another operation. This ensures that when one operation is blocked awaiting memory access, other memory accesses are executed in parallel, thereby reducing the average access latency. We applied this method in the development of `dictPrefetch`, which takes as parameters a vector of keys and dictionaries. It ensures that all memory addresses required to execute dictionary operations for these keys are loaded into the L1-L3 caches when executing commands. Essentially, `dictPrefetch` is an interleaved execution of dictFind for all the keys. Implementation details When the main thread iterates over the `clients-pending-io-read`, for clients with ready-to-execute commands (i.e., clients for which the IO thread has parsed the commands), a batch of up to 16 commands is created. Initially, the command's argv, which were allocated by the IO thread, is prefetched to the main thread's L1 cache. Subsequently, all the dict entries and values required for the commands are prefetched from the dictionary before the command execution. Only then will the commands be executed. --------- Signed-off-by: Uri Yagelnik <uriy@amazon.com>	2024-08-26 21:10:44 -07:00
Binbin	d66a06e818	Make KEYS to be an exact match if there is no pattern (#792 ) Although KEYS is a dangerous command and we recommend people to avoid using it, some people who are not familiar with it still using it, and even use KEYS with no pattern at all. Once KEYS is using with no pattern, we can convert it to an exact match to avoid iterating over all data. Signed-off-by: Binbin <binloveplay1314@qq.com>	2024-08-27 12:04:27 +08:00
Ayush Sharma	b48596a914	Add support for setting the group on a unix domain socket (#901 ) Add new optional, immutable string config called `unixsocketgroup`. Change the group of the unix socket to `unixsocketgroup` after `bind()` if specified. Adds tests to validate the behavior. Fixes #873. Signed-off-by: Ayush Sharma <mrayushs933@gmail.com>	2024-08-23 11:52:08 -07:00
Binbin	8045994972	valkey-cli make source node ignores NOREPLICAS when doing the last CLUSTER SETSLOT (#928 ) This fixes #899. In that issue, the primary is cluster-allow-replica-migration no and its replica is cluster-allow-replica-migration yes. And during the slot migration: 1. Primary calling blockClientForReplicaAck, waiting its replica. 2. Its replica reconfiguring itself as a replica of other shards due to replica migration and disconnect from the old primary. 3. The old primary never got the chance to receive the ack, so it got a timeout and got a NOREPLICAS error. In this case, the replicas might automatically migrate to another primary, resulting in the client being unblocked with the NOREPLICAS error. In this case, since the configuration will eventually propagate itself, we can safely ignore this error on the source node. Signed-off-by: Binbin <binloveplay1314@qq.com>	2024-08-23 16:22:30 +08:00
Binbin	5d97f5133c	Fix CLUSTER SETSLOT block and unblock error when all replicas are down (#879 ) In CLUSTER SETSLOT propagation logic, if the replicas are down, the client will get block during command processing and then unblock with `NOREPLICAS Not enough good replicas to write`. The reason is that all replicas are down (or some are down), but myself->num_replicas is including all replicas, so the client will get block and always get timeout. We should only wait for those online replicas, otherwise the waiting propagation will always timeout since there are not enough replicas. The admin can easily check if there are replicas that are down for an extended period of time. If they decide to move forward anyways, we should not block it. If a replica failed right before the replication and was not included in the replication, it would also unlikely win the election. Signed-off-by: Binbin <binloveplay1314@qq.com> Co-authored-by: Ping Xie <pingxie@google.com>	2024-08-23 16:21:53 +08:00
Madelyn Olson	b12668af7a	Revert repl backlog size back to 1mb for dual channel tests (#934 ) There is a test that assumes that the backlog will get overrun, but because of the recent changes to the default it no longer fails. It seems like it is a bit flakey now though, so resetting the value in the test back to 1mb. (This relates to the CoB of 1100k. So it should consistently work with a 1mb limit). Signed-off-by: Madelyn Olson <madelyneolson@gmail.com>	2024-08-22 15:35:28 -07:00
Wen Hui	959dd3485b	Decline unsubscribe related command in non-subscribed mode (#759 ) Now, when clients run the unsubscribe, sunsubscribe and punsubscribe commands in the non-subscribed mode, it returns 0. Indeed this is a bug, we should not allow client run these kind of commands here. Thus, this PR fixes this bug, but it is a break change for existing clients --------- Signed-off-by: hwware <wen.hui.ware@gmail.com>	2024-08-22 11:21:33 -04:00
Binbin	8d9b8c9d3d	Make runtest-cluster support --io-threads option (#933 ) In #764, we add a --io-threads mode in test, but forgot to handle runtest-cluster, they are different framework. Currently runtest-cluster does not support tags, and we don't have plan to support it. And currently cluster tests does not have any io-threads tests, so this PR just align --io-threads option with #764. Signed-off-by: Binbin <binloveplay1314@qq.com>	2024-08-22 11:21:06 -04:00
uriyage	39f8bcb91b	Skip tracking clients OOM test when I/O threads are enabled (#764 ) Fix feedback loop in key eviction with tracking clients when using I/O threads. Current issue: Evicting keys while tracking clients or key space-notification exist creates a feedback loop when using I/O threads: While evicting keys we send tracking async writes to I/O threads, preventing immediate release of tracking clients' COB memory consumption. Before the I/O thread finishes its write, we recheck used_memory, which now includes the tracking clients' COB and thus continue to evict more keys. Fix: We will skip the test for now while IO threads are active. We may consider avoiding sending writes in `processPendingWrites` to I/O threads for tracking clients when we are out of memory. --------- Signed-off-by: Uri Yagelnik <uriy@amazon.com> Signed-off-by: Madelyn Olson <madelyneolson@gmail.com> Co-authored-by: Madelyn Olson <madelyneolson@gmail.com>	2024-08-21 17:02:57 -07:00
Binbin	a1ac459ef1	Set repl-backlog-size from 1mb to 10mb by default (#911 ) The repl-backlog-size 1mb is too small in most cases, now network transmission and bandwidth performance have improved rapidly in more than ten years. The bigger the replication backlog, the longer the replica can endure the disconnect and later be able to perform a partial resynchronization. Part of #653. --------- Signed-off-by: Binbin <binloveplay1314@qq.com>	2024-08-21 11:59:02 -04:00
Wen Hui	b8dd4fbbf7	Fix Error in Daily CI -- reply-schemas-validator (#922 ) Just add one more test for command "sentinel IS-PRIMARY-DOWN-BY-ADDR" to make the reply-schemas-validator run successfully. Note: test result here https://github.com/hwware/valkey/actions/runs/10457516111 Signed-off-by: hwware <wen.hui.ware@gmail.com>	2024-08-21 09:36:02 -04:00
Binbin	e1b3629186	Fix data loss when replica do a failover with a old history repl offset (#885 ) Our current replica can initiate a failover without restriction when it detects that the primary node is offline. This is generally not a problem. However, consider the following scenarios: 1. In slot migration, a primary loses its last slot and then becomes a replica. When it is fully synchronized with the new primary, the new primary downs. 2. In CLUSTER REPLICATE command, a replica becomes a replica of another primary. When it is fully synchronized with the new primary, the new primary downs. In the above scenario, case 1 may cause the empty primary to be elected as the new primary, resulting in primary data loss. Case 2 may cause the non-empty replica to be elected as the new primary, resulting in data loss and confusion. The reason is that we have cached primary logic, which is used for psync. In the above scenario, when clusterSetPrimary is called, myself will cache server.primary in server.cached_primary for psync. In replicationGetReplicaOffset, we get server.cached_primary->reploff for offset, gossip it and rank it, which causes the replica to use the old historical offset to initiate failover, and it get a good rank, initiates election first, and then is elected as the new primary. The main problem here is that when the replica has not completed full sync, it may get the historical offset in replicationGetReplicaOffset. The fix is to clear cached_primary in these places where full sync is obviously needed, and let the replica use offset == 0 to participate in the election. In this way, this unhealthy replica has a worse rank and is not easy to be elected. Of course, it is possible that it will be elected with offset == 0. In the future, we may need to prohibit the replica with offset == 0 from having the right to initiate elections. Another point worth mentioning, in above cases: 1. In the ROLE command, the replica status will be handshake, and the offset will be -1. 2. Before this PR, in the CLUSTER SHARD command, the replica status will be online, and the offset will be the old cached value (which is wrong). 3. After this PR, in the CLUSTER SHARD, the replica status will be loading, and the offset will be 0. Signed-off-by: Binbin <binloveplay1314@qq.com>	2024-08-21 13:11:21 +08:00
gmbnomis	7795152fff	Fix valgrind timing issue failure in replica-redirect test (#917 ) Wait for the replica to become online before starting the actual test. Signed-off-by: Simon Baatz <gmbnomis@gmail.com>	2024-08-18 21:20:53 +08:00
Wen Hui	33c7ca41be	Add 4 commands for sentinel and update most test cases and json files (#789 ) Add 4 new commands for Sentinel (reference https://github.com/valkey-io/valkey/issues/36) Sentinel GET-PRIMARY-ADDR-BY-NAME Sentinel PRIMARY Sentinel PRIMARIES Sentinel IS-PRIMARY-DOWN-BY-ADDR and deprecate 4 old commands: Sentinel GET-MASTER-ADDR-BY-NAME Sentinel MASTER Sentinel MASTERS Sentinel IS-MASTER-DOWN-BY-ADDR and all sentinel tests pass here https://github.com/hwware/valkey/actions/runs/9962102363/job/27525124583 Note: 1. runtest-sentinel pass all test cases 2. I finished a sentinel rolling upgrade test: 1 primary 2 replicas 3 sentinel there are 4 steps in this test scenario: step 1: all 3 sentinel nodes run old sentinel, shutdown primary, and then new primary can be voted successfully. step 2: replace sentinel 1 with new sentinel bin file, and then shutdown primary, and then another new primary can be voted successfully step 3: replace sentinel 2 with new sentinel bin file, and then shutdown primary, and then another new primary can be voted successfully step 4: replace sentinel 3 with new sentinel bin file, and then shutdown primary, and then another new primary can be voted successfully We can see, even mixed version sentinel running, whole system still works. --------- Signed-off-by: hwware <wen.hui.ware@gmail.com>	2024-08-16 09:46:36 -04:00
Binbin	76ad8f7a76	Skip IPv6 tests when TCLSH version is < 8.6 (#910 ) In #786, we did skip it in the daily, but not for the others. When running ./runtest on MacOS, we will get the failure. ``` couldn't open socket: host is unreachable (nodename nor servname provided, or not known) ``` The reason is that TCL 8.5 doesn't support ipv6, so we skip tests tagged with ipv6. This also revert #786. Signed-off-by: Binbin <binloveplay1314@qq.com>	2024-08-15 15:11:38 +08:00
Pieter Cailliau	4d284daefd	Copyright update to reflect IP transfer from salvatore to Redis (#740 ) Update references of copyright being assigned to Salvatore when it was transferred to Redis Ltd. as per https://github.com/valkey-io/valkey/issues/544. --------- Signed-off-by: Pieter Cailliau <pieter@redis.com>	2024-08-14 09:20:36 -07:00
zhaozhao.zz	131857e80a	To avoid bouncing -REDIRECT during FAILOVER (#871 ) Fix #821 During the `FAILOVER` process, when conditions are met (such as when the force time is reached or the primary and replica offsets are consistent), the primary actively becomes the replica and transitions to the `FAILOVER_IN_PROGRESS` state. After the primary becomes the replica, and after handshaking and other operations, it will eventually send the `PSYNC FAILOVER` command to the replica, after which the replica will become the primary. This means that the upgrade of the replica to the primary is an asynchronous operation, which implies that during the `FAILOVER_IN_PROGRESS` state, there may be a period of time where both nodes are replicas. In this scenario, if a `-REDIRECT` is returned, the request will be redirected to the replica and then redirected back, causing back and forth redirection. To avoid this situation, during the `FAILOVER_IN_PROGRESS state`, we temporarily suspend the clients that need to be redirected until the replica truly becomes the primary, and then resume the execution. --------- Signed-off-by: zhaozhao.zz <zhaozhao.zz@alibaba-inc.com>	2024-08-14 14:04:29 +08:00
Amit Nagler	6cb86fff51	Fix dual-channel replication test under valgrind (#904 ) Test dual-channel-replication primary gets cob overrun during replica rdb load` fails during the Valgrind run. This is due to the load handlers disconnecting before the tests complete, resulting in a low primary COB. Increasing the handlers' timeout should resolve this issue. Failure: https://github.com/valkey-io/valkey/actions/runs/10361286333/job/28681321393 Server logs reveals that the load handler clients were disconnected before the test started Also the two previus test took about 20 seconds which is the handler timeout. --------- Signed-off-by: naglera <anagler123@gmail.com> Signed-off-by: Madelyn Olson <madelyneolson@gmail.com> Co-authored-by: Madelyn Olson <madelyneolson@gmail.com>	2024-08-13 10:40:19 -07:00
naglera	27fce29500	Fix dual-channel-replication related issues (#837 ) - Fix TLS bug where connection were shutdown by primary's main process while the child process was still writing- causing main process to be blocked. - TLS connection fix -file descriptors are set to blocking mode in the main thread, followed by a blocking write. This sets the file descriptors to non-blocking if TLS is used (see `connTLSSyncWrite()`) (@xbasel). - Improve the reliability of dual-channel tests. Modify the pause mechanism to verify process status directly, rather than relying on log. - Ensure that `server.repl_offset` and `server.replid` are updated correctly when dual channel synchronization completes successfully. Thist led to failures in replication tests that validate replication IDs or compare replication offsets. --------- Signed-off-by: naglera <anagler123@gmail.com> Signed-off-by: naglera <58042354+naglera@users.noreply.github.com> Signed-off-by: xbasel <103044017+xbasel@users.noreply.github.com> Signed-off-by: Madelyn Olson <madelyneolson@gmail.com> Signed-off-by: Binbin <binloveplay1314@qq.com> Co-authored-by: ranshid <88133677+ranshid@users.noreply.github.com> Co-authored-by: xbasel <103044017+xbasel@users.noreply.github.com> Co-authored-by: Madelyn Olson <madelyneolson@gmail.com> Co-authored-by: Binbin <binloveplay1314@qq.com>	2024-08-12 13:03:12 -07:00
Binbin	929283fc6f	Dual channel replication should not update lastbgsave_status when transfer error (#811 ) Currently lastbgsave_status is used in bgsave or disk-replication, and the target is the disk. In #60, we update it when transfer error, i think it is mainly used in tests, so we can use log to replace it. It changes lastbgsave_status to err in this case, but it is strange that it does not set ok or err in the above if and the following else. Also noted this will affect stop-writes-on-bgsave-error. Signed-off-by: Binbin <binloveplay1314@qq.com>	2024-08-12 11:25:55 -07:00
Binbin	5166d489da	Correctly recode client infomation to the slowlog when runing script (#805 ) Currently when we are runing a script, we will passing a fake client. So if the command executed in the script is recoded in the slowlog, the client's ip:port and name information will be empty. before: ``` 127.0.0.1:6379> client setname myclient OK 127.0.0.1:6379> config set slowlog-log-slower-than 0 OK 127.0.0.1:6379> eval "redis.call('ping')" 0 (nil) 127.0.0.1:6379> slowlog get 2 1) 1) (integer) 2 2) (integer) 1721314289 3) (integer) 96 4) 1) "eval" 2) "redis.call('ping')" 3) "0" 5) "127.0.0.1:61106" 6) "myclient" 2) 1) (integer) 1 2) (integer) 1721314289 3) (integer) 4 4) 1) "ping" 5) "" 6) "" ``` after: ``` 127.0.0.1:6379> client setname myclient OK 127.0.0.1:6379> config set slowlog-log-slower-than 0 OK 127.0.0.1:6379> eval "redis.call('ping')" 0 (nil) 127.0.0.1:6379> slowlog get 2 1) 1) (integer) 2 2) (integer) 1721314371 3) (integer) 53 4) 1) "eval" 2) "redis.call('ping')" 3) "0" 5) "127.0.0.1:51983" 6) "myclient" 2) 1) (integer) 1 2) (integer) 1721314371 3) (integer) 1 4) 1) "ping" 5) "127.0.0.1:51983" 6) "myclient" ``` Signed-off-by: Binbin <binloveplay1314@qq.com>	2024-08-10 23:46:56 +08:00
Madelyn Olson	b728e4170f	Disable empty shard slot migration test until test is de-flaked (#859 ) We have a number of test failures in the empty shard migration which seem to be related to race conditions in the failover, but could be more pervasive. For now disable the tests to prevent so many false negative test failures. Signed-off-by: Madelyn Olson <madelyneolson@gmail.com>	2024-07-31 16:52:20 -07:00
naglera	9211aed72e	Improve reliability of dual-channel-replication pause resume tests (#835 ) Update the dual channel-replication tests to wait for the pause to begin before attempting to unpause. --------- Signed-off-by: naglera <anagler123@gmail.com>	2024-07-28 11:14:56 -07:00
Binbin	e32518d655	Fix unexpected behavior when turning appendonly on and off within a transaction (#826 ) If we do `config set appendonly yes` and `config set appendonly no` in a multi, there are some unexpected behavior. When doing appendonly yes, we will schedule a AOFRW, and when we are doding appendonly no, we will call stopAppendOnly to stop it. In stopAppendOnly, the aof_fd is -1 since the aof is not start yet and the fsync and close will take the -1 and call it, so they will all fail with EBADF. And stopAppendOnly will emit a server log, the close(-1) should be no problem but it is still an undefined behavior. This PR also adds a log `Background append only file rewriting scheduled.` to bgrewriteaofCommand when it was scheduled. And adds a log in stopAppendOnly when a scheduled AOF is canceled, it will print `AOF was disabled but there is a scheduled AOF background, cancel it.` Signed-off-by: Binbin <binloveplay1314@qq.com> Co-authored-by: Viktor Söderqvist <viktor.soderqvist@est.tech>	2024-07-27 18:38:23 +08:00
Kyle Kim (kimkyle@)	e1d936b339	Add network-bytes-in and network-bytes-out metric support under CLUSTER SLOT-STATS command (#20 ) (#720 ) Adds two new metrics for per-slot statistics, network-bytes-in and network-bytes-out. The network bytes are inclusive of replication bytes but exclude other types of network traffic such as clusterbus traffic. #### network-bytes-in The metric tracks network ingress bytes under per-slot context, by reverse calculation of `c->argv_len_sum` and `c->argc`, stored under a newly introduced field `c->net_input_bytes_curr_cmd`. #### network-bytes-out The metric tracks network egress bytes under per-slot context, by hooking onto COB buffer mutations. #### sample response Both metrics are reported under the `CLUSTER SLOT-STATS` command. ``` 127.0.0.1:6379> cluster slot-stats slotsrange 0 0 1) 1) (integer) 0 2) 1) "key-count" 2) (integer) 0 3) "cpu-usec" 4) (integer) 0 5) "network-bytes-in" 6) (integer) 0 7) "network-bytes-out" 8) (integer) 0 ``` --------- Signed-off-by: Kyle Kim <kimkyle@amazon.com> Signed-off-by: Madelyn Olson <madelyneolson@gmail.com> Co-authored-by: Madelyn Olson <madelyneolson@gmail.com>	2024-07-26 16:06:16 -07:00
naglera	48ca2c9176	Improve dual channel replication stability and fix compatibility issues (#804 ) Introduce several improvements to improve the stability of dual-channel replication and fix compatibility issues. 1. Make dual-channel-replication tests more reliable: use pause instead of forced sleep. 2. Fix race conditions when freeing RDB client. 3. Check if sync was stopped during local buffer streaming. 4. Fix $ENDOFFSET reply format to work on 32-bit machines too. --------- Signed-off-by: naglera <anagler123@gmail.com> Signed-off-by: Madelyn Olson <madelyneolson@gmail.com> Co-authored-by: Madelyn Olson <madelyneolson@gmail.com>	2024-07-25 09:34:39 -07:00
Binbin	59aa00823c	Replicas with the same offset queue up for election (#762 ) In some cases, like read more than write scenario, the replication offset of the replicas are the same. When the primary fails, the replicas have the same rankings (rank == 0). They issue the election at the same time (although we have a random 500), the simultaneous elections may lead to the failure of the election due to quorum. In clusterGetReplicaRank, when we calculates the rank, if the offsets are the same, the one with the smaller node name will have a better rank to avoid this situation. --------- Signed-off-by: Binbin <binloveplay1314@qq.com>	2024-07-22 23:43:16 -07:00
Kyle Kim (kimkyle@)	5000c050b5	Add cpu-usec metric support under CLUSTER SLOT-STATS command (#20 ). (#712 ) The metric tracks cpu time in micro-seconds, sharing the same value as `INFO COMMANDSTATS`, aggregated under per-slot context. --------- Signed-off-by: Kyle Kim <kimkyle@amazon.com> Signed-off-by: Madelyn Olson <madelyneolson@gmail.com> Co-authored-by: Madelyn Olson <madelyneolson@gmail.com>	2024-07-22 18:03:28 -07:00
Binbin	14e09e981e	Fix the wrong woff when execute WAIT / WAITAOF in script (#776 ) When executing the script, the client passed in is a fake client, and its woff is always 0. This results in woff always being 0 when executing wait/waitaof in the script, and the command returns a wrong number. --------- Signed-off-by: Binbin <binloveplay1314@qq.com>	2024-07-22 10:33:10 +02:00
Binbin	15a8290231	Optimize failover time when the new primary node is down again (#782 ) We will not reset failover_auth_time after setting it, this is used to check auth_timeout and auth_retry_time, but we should at least reset it after a successful failover. Let's assume the following scenario: 1. Two replicas initiate an election. 2. Replica 1 is elected as the primary node, and replica 2 does not have enough votes. 3. Replica 1 is down, ie the new primary node down again in a short time. 4. Replica 2 know that the new primary node is down and wants to initiate a failover, but because the failover_auth_time of the previous round has not been reset, it needs to wait for it to time out and then wait for the next retry time, which will take cluster-node-timeout * 4 times, this adds a lot of delay. There is another problem. Like we will set additional random time for failover_auth_time, such as random 500ms and replicas ranking 1s. If replica 2 receives PONG from the new primary node before sending the FAILOVER_AUTH_REQUEST, that is, before the failover_auth_time, it will change itself to a replica. If the new primary node goes down again at this time, replica 2 will use the previous failover_auth_time to initiate an election instead of going through the logic of random 500ms and replicas ranking 1s again, which may lead to unexpected consequences (for example, a low-ranking replica initiates an election and becomes the new primary node). That is, we need to reset failover_auth_time at the appropriate time. When the replica switches to a new primary, we reset it, because the existing failover_auth_time is already out of date in this case. --------- Signed-off-by: Binbin <binloveplay1314@qq.com>	2024-07-19 15:27:49 -04:00
Harkrishn Patro	816accea76	Generate correct slot information in cluster shards command on primary failure (#790 ) Fix #784 Prior to the change, `CLUSTER SHARDS` command processing might pick a failed primary node which won't have the slot coverage information and the slots `output` in turn would be empty. This change finds an appropriate node which has the slot coverage information served by a given shard and correctly displays it as part of `CLUSTER SHARDS` output. Before: ``` 1) 1) "slots" 2) (empty array) 3) "nodes" 4) 1) 1) "id" 2) "2936f22a490095a0a851b7956b0a88f2b67a5d44" ... 9) "role" 10) "master" ... 13) "health" 14) "fail" ``` After: ``` 1) 1) "slots" 2) 1) 0 2) 5461 3) "nodes" 4) 1) 1) "id" 2) "2936f22a490095a0a851b7956b0a88f2b67a5d44" ... 9) "role" 10) "master" ... 13) "health" 14) "fail" ``` --------- Signed-off-by: Harkrishn Patro <harkrisp@amazon.com>	2024-07-19 09:32:39 -07:00
Binbin	35a1888333	Fix incorrect usage of process_is_paused in tests (#783 ) It was introduced wrong in #442. Signed-off-by: Binbin <binloveplay1314@qq.com>	2024-07-19 11:25:58 +08:00

1 2 3 4 5 ...

2369 Commits