76 Commits

Author SHA1 Message Date
Binbin
380f700816
Improve cluster cant failover log conditions (#780)
This PR adjusts the logging conditions of clusterLogCantFailover
in this two ways.

1. For the same cant_failover_reason, we will print the log once
in CLUSTER_CANT_FAILOVER_RELOG_PERIOD, but its value is 10s, which
is a bit long, shorten it to 1s, so we can better track its state.
We get to see the system making progress by watching the message.
Using 1s also covers pretty much all cases as i don't see a reason
for using a <1s node timeout, test or prod.

2. We will not print logs before the nolog_fail_time, its value
is cluster-node-timeout+5000. This may casue us to lose some logs,
for example, if cluster-node-timeout is small, auth_timeout will
be 2000, and auth_retry_time will be 4000. In this case, we will
lose all the reasons during the election if the failover is timedout.
So remove the nolog_fail_time logic, since we still do have the
CLUSTER_CANT_FAILOVER_RELOG_PERIOD logic, we won't print too many
logs.

Signed-off-by: Binbin <binloveplay1314@qq.com>
2024-08-06 21:14:18 +08:00
Madelyn Olson
4b8de6b1be
Update replica version comparison to handle version 8 RC candidates (#851)
Release candidates have a version that is lower than 8.0.0 to allow for
8.0.0 to have 0x080000 as a release number. However, we did an explicit
check to make sure a version was 8.0 or greater to validate a replica
supports a feature. Now we are using the highest patch version of latest
minor to do the comparison to accommodate future versions.

Signed-off-by: Madelyn Olson <madelyneolson@gmail.com>
2024-07-31 10:01:48 -07:00
Madelyn Olson
b4d96caa78
Remove static to avoid compiler warning (#836) 2024-07-28 13:08:09 -07:00
Roshan Khatri
e745e9c240
Adds Light-weight cluster bus header for pubsub message. (#654)
Adds light-weight cluster bus header for pubsub message. Closes #557.

This also supports sending to and receiving non-light messages from
older versions of the engine.

The light-weight cluster bus message supports multiple pubsub messages
(payloads) for one pubsub channel. Receiving messages with multiple
payloads is supported but we're not yet sending such multi-payload
messages to other nodes.

---------

Signed-off-by: Roshan Khatri <rvkhatri@amazon.com>
2024-07-26 10:49:18 -07:00
Binbin
f00c8f6214
Modify clusterSaveConfig function call to use C_OK / C_ERR return value (#818)
Minor cleanups.

Signed-off-by: Binbin <binloveplay1314@qq.com>
2024-07-24 09:58:44 +08:00
Binbin
59aa00823c
Replicas with the same offset queue up for election (#762)
In some cases, like read more than write scenario, the replication
offset of the replicas are the same. When the primary fails, the
replicas have the same rankings (rank == 0). They issue the election
at the same time (although we have a random 500), the simultaneous
elections may lead to the failure of the election due to quorum.

In clusterGetReplicaRank, when we calculates the rank, if the offsets
are the same, the one with the smaller node name will have a better
rank to avoid this situation.

---------

Signed-off-by: Binbin <binloveplay1314@qq.com>
2024-07-22 23:43:16 -07:00
Kyle Kim (kimkyle@)
5000c050b5
Add cpu-usec metric support under CLUSTER SLOT-STATS command (#20). (#712)
The metric tracks cpu time in micro-seconds, sharing the same value as
`INFO COMMANDSTATS`, aggregated under per-slot context.

---------

Signed-off-by: Kyle Kim <kimkyle@amazon.com>
Signed-off-by: Madelyn Olson <madelyneolson@gmail.com>
Co-authored-by: Madelyn Olson <madelyneolson@gmail.com>
2024-07-22 18:03:28 -07:00
Binbin
15a8290231
Optimize failover time when the new primary node is down again (#782)
We will not reset failover_auth_time after setting it, this is used
to check auth_timeout and auth_retry_time, but we should at least
reset it after a successful failover.

Let's assume the following scenario:
1. Two replicas initiate an election.
2. Replica 1 is elected as the primary node, and replica 2 does not have
   enough votes.
3. Replica 1 is down, ie the new primary node down again in a short
time.
4. Replica 2 know that the new primary node is down and wants to
initiate
   a failover, but because the failover_auth_time of the previous round
   has not been reset, it needs to wait for it to time out and then wait
for the next retry time, which will take cluster-node-timeout * 4 times,
   this adds a lot of delay.

There is another problem. Like we will set additional random time for
failover_auth_time, such as random 500ms and replicas ranking 1s. If
replica 2 receives PONG from the new primary node before sending the
FAILOVER_AUTH_REQUEST, that is, before the failover_auth_time, it will
change itself to a replica. If the new primary node goes down again at
this time, replica 2 will use the previous failover_auth_time to
initiate
an election instead of going through the logic of random 500ms and
replicas ranking 1s again, which may lead to unexpected consequences
(for example, a low-ranking replica initiates an election and becomes
the new primary node).

That is, we need to reset failover_auth_time at the appropriate time.
When the replica switches to a new primary, we reset it, because the
existing failover_auth_time is already out of date in this case.

---------

Signed-off-by: Binbin <binloveplay1314@qq.com>
2024-07-19 15:27:49 -04:00
Harkrishn Patro
816accea76
Generate correct slot information in cluster shards command on primary failure (#790)
Fix #784 

Prior to the change, `CLUSTER SHARDS` command processing might pick a
failed primary node which won't have the slot coverage information and
the slots `output` in turn would be empty. This change finds an
appropriate node which has the slot coverage information served by a
given shard and correctly displays it as part of `CLUSTER SHARDS`
output.

 Before:
 
 ```
 1) 1) "slots"
   2)  (empty array)
   3) "nodes"
   4) 1)  1) "id"
          2) "2936f22a490095a0a851b7956b0a88f2b67a5d44"
          ...
          9) "role"
         10) "master"
         ...
         13) "health"
         14) "fail"
 ```
 
 After:
 

 ```
 1) 1) "slots"
   2) 1) 0
       2) 5461
   3) "nodes"
   4) 1)  1) "id"
          2) "2936f22a490095a0a851b7956b0a88f2b67a5d44"
          ...
          9) "role"
         10) "master"
         ...
         13) "health"
         14) "fail"
 ```

---------

Signed-off-by: Harkrishn Patro <harkrisp@amazon.com>
2024-07-19 09:32:39 -07:00
Ping Xie
66d0f7d9a1
Ensure only primary sender drives slot ownership updates (#754)
Fixes a regression introduced in PR #445, which allowed a message from a
replica
to update the slot ownership of its primary. The regression results in a
`replicaof` cycle, causing server crashes due to the cycle detection
assert. The
fix restores the previous behavior where only primary senders can
trigger
`clusterUpdateSlotsConfigWith`.

Additional changes:

* Handling of primaries without slots is obsoleted by new handling of
when a
  sender that was a replica announces that it is now a primary.
* Replication loop detection code is unchanged but shifted downwards.
* Some variables are renamed for better readability and some are
introduced to
  avoid repeated memcmp() calls.

Fixes #753.

---------

Signed-off-by: Ping Xie <pingxie@google.com>
2024-07-16 13:05:49 -07:00
Brennan
34649bd034
Configurable cluster blacklist TTL (#738)
Allows cluster admins to configure the blacklist TTL as needed to allow
sufficient time for `CLUSTER FORGET` to be executed on every node in the
cluster.

Config name `cluster-blacklist-ttl`; unit seconds; deault 60.

---------

Signed-off-by: Brennan Cathcart <brennancathcart@gmail.com>
2024-07-13 20:38:25 +02:00
Viktor Söderqvist
a323dce890
Dual stack and client-specific IPs in cluster (#736)
New configs:

* `cluster-announce-client-ipv4`
* `cluster-announce-client-ipv6`

New module API function:

* `ValkeyModule_GetClusterNodeInfoForClient`, takes a client id and is
otherwise just like its non-ForClient cousin.

If configured, one of these IP addresses are reported to each client in
CLUSTER SLOTS, CLUSTER SHARDS, CLUSTER NODES and redirects, replacing
the IP (`custer-announce-ip` or the auto-detected IP) of each node.
Which one is reported to the client depends on whether the client is
connected over IPv4 or IPv6.

Benefits:

* This allows clients using IPv4 to get the IPv4 addresses of all
cluster nodes and IPv6 clients to get the IPv6 clients.
* This allows the IPs visible to clients to be different to the IPs used
between the cluster nodes due to NAT'ing.

The information is propagated in the cluster bus using new Ping
extensions. (Old nodes without this feature ignore unknown Ping
extensions.)

This adds another dimension to CLUSTER SLOTS reply. It now depends on
the client's use of TLS, the IP address family and RESP version.
Refactoring: The cached connection type definition is moved from
connection.h (it actually has nothing to do with the connection
abstraction) to server.h and is changed to a bitmap, with one bit for
each of TLS, IPv6 and RESP3.

Fixes #337

---------

Signed-off-by: Viktor Söderqvist <viktor.soderqvist@est.tech>
2024-07-10 13:53:52 +02:00
bentotten
f2bbd1ff0f
Fix minor memory leak in clusterLoadConfig (#741)
We forgot to call sdsfreesplitres in the error path during a
nodes.conf corruption check, this function exits on the error
paths so this is just a cleanup.

Signed-off-by: bentotten <59932872+bentotten@users.noreply.github.com>
2024-07-04 16:55:55 -07:00
Binbin
2d6791bb11
Use clusterNodeIsVotingPrimary function to check the right (#735)
Minor cleanups.

---------

Signed-off-by: Binbin <binloveplay1314@qq.com>
2024-07-03 20:42:25 +02:00
skyfirelee
e4c1f6d45a
Replace client flags to bitfield (#614) 2024-06-30 11:33:10 -07:00
zhaozhao.zz
4fbe31ab87
Fix the TLS and REPS issues about CLUSTER SLOTS cache (#581)
PR #53 introduced the cache of CLUSTER SLOTS response, but the cache has
some problems for different types of clients:

1. the RESP version is wrongly ignored:

    ```
    $./valkey-cli
    127.0.0.1:6379> cluster slots
    1) 1) (integer) 0
       2) (integer) 16383
       3) 1) ""
          2) (integer) 6379
          3) "f1aeceb352401ce57acd432c68c60b359c00ef85"
          4) (empty array)
    127.0.0.1:6379> hello 3
    1# "server" => "valkey"
    2# "version" => "255.255.255"
    3# "proto" => (integer) 3
    4# "id" => (integer) 3
    5# "mode" => "cluster"
    6# "role" => "master"
    7# "modules" => (empty array)
    127.0.0.1:6379> cluster slots
    1) 1) (integer) 0
       2) (integer) 16383
       3) 1) ""
          2) (integer) 6379
          3) "f1aeceb352401ce57acd432c68c60b359c00ef85"
          4) (empty array)
    ```

    RESP3 should get "empty hash" but get RESP2's "empty array"

3. we should use the original client's connect type, or lua/function and
module would get wrong port:

    ```
    $./valkey-cli --tls --insecure -p 6789
    127.0.0.1:6789> config get port tls-port
    1) "tls-port"
    2) "6789"
    3) "port"
    4) "6379"
    127.0.0.1:6789> cluster slots
    1) 1) (integer) 0
       2) (integer) 16383
       3) 1) ""
          2) (integer) 6789
          3) "f1aeceb352401ce57acd432c68c60b359c00ef85"
          4) (empty array)
    127.0.0.1:6789> eval "return redis.call('cluster','slots')" 0
    1) 1) (integer) 0
       2) (integer) 16383
       3) 1) ""
          2) (integer) 6379
          3) "f1aeceb352401ce57acd432c68c60b359c00ef85"
          4) (empty array)
        ```

---------

Signed-off-by: zhaozhao.zz <zhaozhao.zz@alibaba-inc.com>
2024-06-28 14:56:13 +08:00
Pierre
495c35d918
Add check in CLUSTERLINK KILL cmd to avoid freeing links to myself (#689)
Add check in CLUSTERLINK KILL cmd to avoid freeing cluster bus links to
myself. Also add an assert in `freeClusterLink()`.

Testing:
```
127.0.0.1:6379> debug clusterlink kill all c0404ee68574c6aa1048aaebfe90283afe51d2fc
(error) ERR Cannot free cluster link(s) to myself
```

Signed-off-by: Pierre Turin <pieturin@amazon.com>
2024-06-25 15:18:30 -07:00
Ping Xie
32ca6e5b38
Improve CLUSTER SETSLOT replication handling to support older replica versions. (#686) 2024-06-23 22:08:52 -07:00
Ping Xie
4135894a5d
Update remaining master references to primary (#660)
Signed-off-by: Ping Xie <pingxie@google.com>
2024-06-17 20:31:15 -07:00
Binbin
495a121f19
Adjust the log level of some logs in the cluster (#633)
I think the log of pfail status changes will be very useful.
The other parts were scanned and found that it can be modified.

Changes:
1. Changing pfail status releated logs from VERBOSE to NOTICE.
2. Changing configEpoch collision log from VERBOSE(warning) to NOTICE.
3. Changing some logs from DEBUG to NOTICE.

Signed-off-by: Binbin <binloveplay1314@qq.com>
Co-authored-by: Madelyn Olson <madelyneolson@gmail.com>
2024-06-18 10:46:56 +08:00
Binbin
db6d3c1138
Only primary with slots has the right to mark a node as failed (#634)
In markNodeAsFailingIfNeeded we will count needed_quorum and failures,
needed_quorum is the half the cluster->size and plus one, and
cluster-size
is the size of primary node which contain slots, but when counting
failures, we dit not check if primary has slots.

Only the primary has slots that has the rights to vote, adding a new
clusterNodeIsVotingPrimary to formalize this concept.

Release notes:

bugfix where nodes not in the quorum group might spuriously mark nodes
as failed

---------

Signed-off-by: Binbin <binloveplay1314@qq.com>
Co-authored-by: Ping Xie <pingxie@outlook.com>
2024-06-16 20:46:08 -07:00
Sankar
a81c32079c
Make cluster meet reliable under link failures (#461)
When there is a link failure while an ongoing MEET request is sent the
sending node stops sending anymore MEET and starts sending PINGs. Since
every node responds to PINGs from unknown nodes with a PONG, the
receiving node never adds the sending node. But the sending node adds
the receiving node when it sees a PONG. This can lead to asymmetry in
cluster membership. This changes makes the sender keep sending MEET
until it sees a PONG, avoiding the asymmetry.

---------

Signed-off-by: Sankar <1890648+srgsanky@users.noreply.github.com>
2024-06-16 20:37:09 -07:00
Ping Xie
8a776c3509
Fix potential infinite loop in clusterNodeGetPrimary (#651) 2024-06-13 23:43:36 -07:00
Madelyn Olson
a3f1535b57
Fix misuse of safe iterators (#612)
Safe iterators must call resetIterators when they are done being used.
Fix one issue where a safe iterator was not correctly calling reset
during cluster slot caching and fixed a second issue where reset
iterator was being called twice. For the double release case,
kvstoreIteratorNextDict is responsible for patching up the iterator, but
we were calling it a second time in kvstoreIteratorNext.

In addition, I added some documentation around initializing iterators,
added an assert to prevent double initialization, and remove a function
from the public interface which isn't needed and might lead to incorrect
usage of the safe iterators.

Bumping srgsanky for finding it here:
c4782066e7 (r142867004).

---------

Signed-off-by: Madelyn Olson <madelyneolson@gmail.com>
2024-06-10 12:30:57 -07:00
flowerysong
d28ae52004
Remove redundant function nextPingExt() (#613)
Functionally identical to the older, documented `getNextPingExt()`.

Fixes #610.

Signed-off-by: Paul Arthur <paul.arthur@flowerysong.com>
2024-06-08 21:55:58 -07:00
Ping Xie
aad6769a80
Replicate slot migration states via RDB aux fields (#586) 2024-06-07 20:32:27 -07:00
Ping Xie
54c9747935
Remove master and slave from source code (#591)
External facing interfaces are not affected.

---------

Signed-off-by: Ping Xie <pingxie@google.com>
2024-06-07 14:21:33 -07:00
Viktor Söderqvist
ad5fd5b95c
More rebranding (#606)
More rebranding of

* Log messages (#252)
* The DENIED error reply
* Internal function names and comments, mainly Lua API

---------

Signed-off-by: Viktor Söderqvist <viktor.soderqvist@est.tech>
2024-06-07 01:40:55 +02:00
Shivshankar
9319f7aeca
Replace valkey in log and panic messages (#550)
Part of #207

---------

Signed-off-by: Shivshankar-Reddy <shiva.sheri.github@gmail.com>
2024-06-04 20:46:59 +02:00
Eran Liberty
0700c441c6
Remove unused valDup (#443)
Remove the unused value duplicate API from dict. It's unused in the codebase and introduces unnecessary overhead. 

---------

Signed-off-by: Eran Liberty <eran.liberty@gmail.com>
2024-06-03 12:22:06 -07:00
Ping Xie
f927565d28
Consolidate various BLOCKED_WAIT* states (#562)
There are currently three block types: BLOCKED_WAIT, BLOCKED_WAITAOF,
and BLOCKED_WAIT_PREREPL, used to block clients executing `WAIT`,
`WAITAOF`, and `CLUSTER SETSLOT`, respectively. They share the same
workflow: the client is blocked until replication to the expected number
of replicas completes. However, they provide different responses
depending on the commands involved. Using distinct block types leads to
code duplication and reduced readability. This PR consolidates the three
types into a single WAIT type, differentiating them using the pending
command to ensure the appropriate response is returned.


Fix #427

---------

Signed-off-by: Ping Xie <pingxie@google.com>
2024-05-30 23:45:47 -07:00
Binbin
6bab2d7968
Make sure clear the CLUSTER SLOTS cache on time when updating hostname (#564)
In #53, we will cache the CLUSTER SLOTS response to improve the
throughput and reduct the latency.

In the code snippet below, the second cluster slots will use the
old hostname:
```
config set cluster-preferred-endpoint-type hostname
config set cluster-announce-hostname old-hostname.com
multi
cluster slots
config set cluster-announce-hostname new-hostname.com
cluster slots
exec
```

When updating the hostname, in updateAnnouncedHostname, we will set
CLUSTER_TODO_SAVE_CONFIG and we will do a clearCachedClusterSlotsResponse
in clusterSaveConfigOrDie, so harmless in most cases.

Move the clearCachedClusterSlotsResponse call to clusterDoBeforeSleep
instead of scheduling it to be called in clusterSaveConfigOrDie.

Signed-off-by: Binbin <binloveplay1314@qq.com>
2024-05-30 10:44:12 +08:00
Ping Xie
e4ead9442b
Make CLUSTER SETSLOT with TIMEOUT 0 block indefinitely (#556)
This aligns the behaviour with established Valkey commands with a
TIMEOUT argument, such as BLPOP.

Fix #422

Signed-off-by: Ping Xie <pingxie@google.com>
2024-05-27 07:11:24 -07:00
Ping Xie
c41dd77a3e
Add clang-format configs (#323)
I have validated that these settings closely match the existing coding
style with one major exception on `BreakBeforeBraces`, which will be
`Attach` going forward. The mixed `BreakBeforeBraces` styles in the
current codebase are hard to imitate and also very odd IMHO - see below

```
if (a == 1) { /*Attach */
}
```

```
if (a == 1 ||
    b == 2)
{ /* Why? */
}
```

Please do NOT merge just yet. Will add the github action next once the
style is reviewed/approved.

---------

Signed-off-by: Ping Xie <pingxie@google.com>
2024-05-22 23:24:12 -07:00
Roshan Khatri
c4782066e7
Cache CLUSTER SLOTS response for improving throughput and reduced latency. (#53)
This commit adds a logic to cache `CLUSTER SLOTS` response for reduced
latency and also updates the cache when a change in the cluster is
detected.

Historically, `CLUSTER SLOTS` command was deprecated, however all the
server clients have been using `CLUSTER SLOTS` and have not migrated to
`CLUSTER SHARDS`. In future this logic can be added to any other
commands to improve the performance of the engine.

---------

Signed-off-by: Roshan Khatri <rvkhatri@amazon.com>
2024-05-22 14:21:41 -07:00
Madelyn Olson
546cef6684
Initial cleanup for cluster refactoring (#460)
Cleaned up the minor cluster refactoring notes that were intended to be
follow ups that never happened. Basically:
1. Minor style nitpicks
2. Generalized clusterNodeIsMyself so that it wasn't implementation
dependent.
3. Removed getMyClusterId, and just make it an explicit call to myself's
name, which seems more straightforward and removes unnecessary
abstraction.
4. Remove clusterNodeGetSlaveof infavor of clusterNodeGetMaster. We
already do a check if it's a replica, and if it wasn't working it would
have been crashing.

Signed-off-by: Madelyn Olson <madelyneolson@gmail.com>
2024-05-14 17:09:49 -07:00
Ping Xie
138a7d9846
Handle role change error in cluster setslot when migrating the last slot away with allow-replica-migration enabled (#466) 2024-05-09 18:12:55 -07:00
Ping Xie
6e7af9471c
Slot migration improvement (#445) 2024-05-06 21:40:28 -07:00
Shivshankar
8abeb79f52
Rename redis in aof logs and proc title redis-aof-rewrite to valkey-aof-rewrite (#393)
Renamed redis to valkey/server in aof.c serverlogs.

The AOF rewrite child process title is set to "redis-aof-rewrite" if
Valkey was started from a redis-server symlink, otherwise to
"valkey-aof-rewrite".

This is a breaking changes since logs are changed.

Part of #207.

---------

Signed-off-by: Shivshankar-Reddy <shiva.sheri.github@gmail.com>
2024-05-01 18:15:19 +02:00
Lipeng Zhu
393c8fde29
Rename macros in config.h (#257)
This patch try to do following things:

1. Rename `redis_*` and `REDIS_*` macros defined in config.h to
`valkey_*`, `VALKEY_*` and update associated used files. (`redis_fstat`,
`redis_fsync`, `REDIS_THREAD_STACK_SIZE`, etc.)
2. Remove the leading double underscore for guard macro in config.h.

---------

Signed-off-by: Lipeng Zhu <lipeng.zhu@intel.com>
2024-04-23 14:20:35 +02:00
Bany
96d14fe263
Change Redis to Valkey in log messages (#226)
Log messages containing "Redis" in some files are changed.

Add macro SERVER_TITLE defined to "Valkey" (uppercase V) is introduced
and used in log messages, so at least it will be easy to patch this
definition to get Redis or any other name in the logs instead of Valkey.

Change "Redis" in some log messages to use %s and SERVER_TITLE.

This is a partial implementation of
https://github.com/valkey-io/valkey/issues/207

---------

Signed-off-by: 0del <bany.y0599@gmail.com>
2024-04-17 14:38:21 +02:00
Shivshankar
2e46046625
Rename macros in valkey-cli.c and redis_strlcpy to valkey (#284)
Signed-off-by: Shivshankar-Reddy <shiva.sheri.github@gmail.com>
2024-04-10 22:50:52 +02:00
Jacob Murphy
df5db0627f
Remove trademarked language in code comments (#223)
This includes comments used for module API documentation.

* Strategy for replacement: Regex search: `(//|/\*| \*|#).* ("|\()?(r|R)edis( |\.
  |'|\n|,|-|\)|")(?!nor the names of its contributors)(?!Ltd.)(?!Labs)(?!Contributors.)`
* Don't edit copyright comments
* Replace "Redis version X.X" -> "Redis OSS version X.X" to distinguish
from newly licensed repository
* Replace "Redis Object" -> "Object"
* Exclude markdown for now
* Don't edit Lua scripting comments referring to redis.X API
* Replace "Redis Protocol" -> "RESP"
* Replace redis-benchmark, -cli, -server, -check-aof/rdb with "valkey-"
prefix
* Most other places, I use best judgement to either remove "Redis", or
replace with "the server" or "server"

Fixes #148

---------

Signed-off-by: Jacob Murphy <jkmurphy@google.com>
Signed-off-by: Viktor Söderqvist <viktor.soderqvist@est.tech>
2024-04-09 10:24:03 +02:00
Harkrishn Patro
e59dd41e42
Maintain determinstic ordering of replica(s) in CLUSTER SLOTS response (#265)
Sort `clusterNode.slaves` while adding a new replica to the cluster on
basis of `name`. This will enable deterministic ordering of replica(s)
information in `CLUSTER SLOTS` response.

Before this change:

```
127.0.0.1:6380> CLUSTER SLOTS
1) 1) (integer) 0
   2) (integer) 16383
   3) 1) "127.0.0.1"
      2) (integer) 6379
      3) "fc72609a620c62d073a31eed9ddde5104c1fa302"
      4) (empty array)
   4) 1) "127.0.0.1"
      2) (integer) 6381
      3) "fac84bbf2ffc5cfcdebc92c06b8ead9c3cba4051"
      4) (empty array)
   5) 1) "127.0.0.1"
      2) (integer) 6380
      3) "b1249d394326f1485df0b895f2fd38e141aa5056"
      4) (empty array)
```

After this change:

```
127.0.0.1:6380> CLUSTER SLOTS
1) 1) (integer) 0
   2) (integer) 16383
   3) 1) "127.0.0.1"
      2) (integer) 6379
      3) "fc72609a620c62d073a31eed9ddde5104c1fa302"
      4) (empty array)
   4) 1) "127.0.0.1"
      2) (integer) 6380
      3) "b1249d394326f1485df0b895f2fd38e141aa5056"
      4) (empty array)
   5) 1) "127.0.0.1"
      2) (integer) 6381
      3) "fac84bbf2ffc5cfcdebc92c06b8ead9c3cba4051"
      4) (empty array)
```

Signed-off-by: Harkrishn Patro <harkrisp@amazon.com>
2024-04-08 19:58:23 -07:00
Harkrishn Patro
ebfb440629
Pass extensions to node if extension processing is handled by it (#52)
Ref: https://github.com/redis/redis/pull/12760

### Description

#### Fixes compatibilty of PlaceholderKV cluster (7.2 - extensions
enabled by default) with older Redis cluster (< 7.0 - extensions not
handled) .

With some of the extensions enabled by default in 7.2 version, new nodes
running 7.2 and above start sending out larger clusterbus message
payload including the ping extensions. This caused an incompatibility
with node running engine versions < 7.0. Old nodes (< 7.0) would receive
the payload from new nodes (> 7.2) would observe a payload length
(totlen) > (estlen) and would perform an early exit and won't process
the message.

This fix introduces a flag `extensions_supported` on the clusterMsg
indicating the sender node can handle extensions parsing. Once, a
receiver nodes receives a message with this flag set to 1, it would
update clusterNode new field extensions_supported and start sending out
extensions if it has any.


This PR also introduces a DEBUG sub command to enable/disable cluster
message extensions `process-clustermsg-extensions` feature.

Note: A successful `PING`/`PONG` is required as a sender for a given
node to be marked as `extensions_supported` and then extensions message
will be sent to it. This could cause a slight delay in receiving the
extensions message(s).

### Testing

TCL test verifying the cluster state is healthy irrespective of
enabling/disabling cluster message extensions feature.

---------

Signed-off-by: Harkrishn Patro <harkrisp@amazon.com>
2024-04-08 09:01:30 -07:00
0del
e3e1f9a372
Rename 'redis' to 'server' and redisNodeFlags to clusterNodeFlags (#191)
Rename additional instances of redis to server, as well as redisNodeFlags to clusterNodeFlags.

---------

Signed-off-by: 0del <bany.y0599@gmail.com>
2024-04-03 18:45:23 -07:00
0del
f753db5141
rename redis functions in server.h (#179)
redisPopcount -> serverPopcount
redisSetProcTitle -> serverSetProcTitle
redisCommunicateSystemd -> serverCommunicateSystemd
redisSetCpuAffinity -> serverSetCpuAffinity
redisFork -> serverFork

#144

Signed-off-by: 0del <bany.y0599@gmail.com>
2024-04-03 20:26:33 +02:00
Ping Xie
28976a9003
Fix PONG message processing for primary-ship tracking during failovers (#13055)
This commit updates the processing of PONG gossip messages in the
cluster. When a node (B) becomes a replica due to a failover, its PONG
messages include its new primary node's (A) information and B's
configuration epoch is aligned with A's. This allows observer nodes to
identify changes in primary-ship, addressing issues of intermediate
states and enhancing cluster state consistency during topology changes.

Fix #13018
2024-03-04 17:32:25 -08:00
Sankar
c1d2ac2a73
Do not include gossip about receiver in cluster messages (#13046)
The receiver does not update any of its cluster state based on gossip
about itself. This commit explicitly avoids sending or processing gossip
about the receiver.

Currently cluster bus gossips include 10% of nodes in the cluster with a
minimum of 3 nodes. For up to 30 node clusters, this commit makes sure
that 1/3 of the gossip (1 out of 3 gossips) is never discarded. This
should help with relatively faster convergence of cluster state in
general.
2024-02-13 16:38:37 -08:00
Binbin
81666a6510
Fix heap-use-after-free when pubsubshard_channels became NULL (#13038)
After fix for #13033, address sanitizer reports this heap-use-after-free
error. When the pubsubshard_channels dict becomes empty, we will delete
the dict, and the dictReleaseIterator will call dictResetIterator, it
will use the dict so we will trigger the error.

This PR introduced a new struct kvstoreDictIterator to wrap
dictIterator.
Replace the original dict iterator with the new kvstore dict iterator.

---------

Co-authored-by: Oran Agra <oran@redislabs.com>
Co-authored-by: guybe7 <guy.benoish@redislabs.com>
2024-02-07 14:53:50 +02:00