57 Commits

Author SHA1 Message Date
Binbin
495a121f19
Adjust the log level of some logs in the cluster (#633)
I think the log of pfail status changes will be very useful.
The other parts were scanned and found that it can be modified.

Changes:
1. Changing pfail status releated logs from VERBOSE to NOTICE.
2. Changing configEpoch collision log from VERBOSE(warning) to NOTICE.
3. Changing some logs from DEBUG to NOTICE.

Signed-off-by: Binbin <binloveplay1314@qq.com>
Co-authored-by: Madelyn Olson <madelyneolson@gmail.com>
2024-06-18 10:46:56 +08:00
Binbin
db6d3c1138
Only primary with slots has the right to mark a node as failed (#634)
In markNodeAsFailingIfNeeded we will count needed_quorum and failures,
needed_quorum is the half the cluster->size and plus one, and
cluster-size
is the size of primary node which contain slots, but when counting
failures, we dit not check if primary has slots.

Only the primary has slots that has the rights to vote, adding a new
clusterNodeIsVotingPrimary to formalize this concept.

Release notes:

bugfix where nodes not in the quorum group might spuriously mark nodes
as failed

---------

Signed-off-by: Binbin <binloveplay1314@qq.com>
Co-authored-by: Ping Xie <pingxie@outlook.com>
2024-06-16 20:46:08 -07:00
Sankar
a81c32079c
Make cluster meet reliable under link failures (#461)
When there is a link failure while an ongoing MEET request is sent the
sending node stops sending anymore MEET and starts sending PINGs. Since
every node responds to PINGs from unknown nodes with a PONG, the
receiving node never adds the sending node. But the sending node adds
the receiving node when it sees a PONG. This can lead to asymmetry in
cluster membership. This changes makes the sender keep sending MEET
until it sees a PONG, avoiding the asymmetry.

---------

Signed-off-by: Sankar <1890648+srgsanky@users.noreply.github.com>
2024-06-16 20:37:09 -07:00
Ping Xie
8a776c3509
Fix potential infinite loop in clusterNodeGetPrimary (#651) 2024-06-13 23:43:36 -07:00
Madelyn Olson
a3f1535b57
Fix misuse of safe iterators (#612)
Safe iterators must call resetIterators when they are done being used.
Fix one issue where a safe iterator was not correctly calling reset
during cluster slot caching and fixed a second issue where reset
iterator was being called twice. For the double release case,
kvstoreIteratorNextDict is responsible for patching up the iterator, but
we were calling it a second time in kvstoreIteratorNext.

In addition, I added some documentation around initializing iterators,
added an assert to prevent double initialization, and remove a function
from the public interface which isn't needed and might lead to incorrect
usage of the safe iterators.

Bumping srgsanky for finding it here:
c4782066e7 (r142867004).

---------

Signed-off-by: Madelyn Olson <madelyneolson@gmail.com>
2024-06-10 12:30:57 -07:00
flowerysong
d28ae52004
Remove redundant function nextPingExt() (#613)
Functionally identical to the older, documented `getNextPingExt()`.

Fixes #610.

Signed-off-by: Paul Arthur <paul.arthur@flowerysong.com>
2024-06-08 21:55:58 -07:00
Ping Xie
aad6769a80
Replicate slot migration states via RDB aux fields (#586) 2024-06-07 20:32:27 -07:00
Ping Xie
54c9747935
Remove master and slave from source code (#591)
External facing interfaces are not affected.

---------

Signed-off-by: Ping Xie <pingxie@google.com>
2024-06-07 14:21:33 -07:00
Viktor Söderqvist
ad5fd5b95c
More rebranding (#606)
More rebranding of

* Log messages (#252)
* The DENIED error reply
* Internal function names and comments, mainly Lua API

---------

Signed-off-by: Viktor Söderqvist <viktor.soderqvist@est.tech>
2024-06-07 01:40:55 +02:00
Shivshankar
9319f7aeca
Replace valkey in log and panic messages (#550)
Part of #207

---------

Signed-off-by: Shivshankar-Reddy <shiva.sheri.github@gmail.com>
2024-06-04 20:46:59 +02:00
Eran Liberty
0700c441c6
Remove unused valDup (#443)
Remove the unused value duplicate API from dict. It's unused in the codebase and introduces unnecessary overhead. 

---------

Signed-off-by: Eran Liberty <eran.liberty@gmail.com>
2024-06-03 12:22:06 -07:00
Ping Xie
f927565d28
Consolidate various BLOCKED_WAIT* states (#562)
There are currently three block types: BLOCKED_WAIT, BLOCKED_WAITAOF,
and BLOCKED_WAIT_PREREPL, used to block clients executing `WAIT`,
`WAITAOF`, and `CLUSTER SETSLOT`, respectively. They share the same
workflow: the client is blocked until replication to the expected number
of replicas completes. However, they provide different responses
depending on the commands involved. Using distinct block types leads to
code duplication and reduced readability. This PR consolidates the three
types into a single WAIT type, differentiating them using the pending
command to ensure the appropriate response is returned.


Fix #427

---------

Signed-off-by: Ping Xie <pingxie@google.com>
2024-05-30 23:45:47 -07:00
Binbin
6bab2d7968
Make sure clear the CLUSTER SLOTS cache on time when updating hostname (#564)
In #53, we will cache the CLUSTER SLOTS response to improve the
throughput and reduct the latency.

In the code snippet below, the second cluster slots will use the
old hostname:
```
config set cluster-preferred-endpoint-type hostname
config set cluster-announce-hostname old-hostname.com
multi
cluster slots
config set cluster-announce-hostname new-hostname.com
cluster slots
exec
```

When updating the hostname, in updateAnnouncedHostname, we will set
CLUSTER_TODO_SAVE_CONFIG and we will do a clearCachedClusterSlotsResponse
in clusterSaveConfigOrDie, so harmless in most cases.

Move the clearCachedClusterSlotsResponse call to clusterDoBeforeSleep
instead of scheduling it to be called in clusterSaveConfigOrDie.

Signed-off-by: Binbin <binloveplay1314@qq.com>
2024-05-30 10:44:12 +08:00
Ping Xie
e4ead9442b
Make CLUSTER SETSLOT with TIMEOUT 0 block indefinitely (#556)
This aligns the behaviour with established Valkey commands with a
TIMEOUT argument, such as BLPOP.

Fix #422

Signed-off-by: Ping Xie <pingxie@google.com>
2024-05-27 07:11:24 -07:00
Ping Xie
c41dd77a3e
Add clang-format configs (#323)
I have validated that these settings closely match the existing coding
style with one major exception on `BreakBeforeBraces`, which will be
`Attach` going forward. The mixed `BreakBeforeBraces` styles in the
current codebase are hard to imitate and also very odd IMHO - see below

```
if (a == 1) { /*Attach */
}
```

```
if (a == 1 ||
    b == 2)
{ /* Why? */
}
```

Please do NOT merge just yet. Will add the github action next once the
style is reviewed/approved.

---------

Signed-off-by: Ping Xie <pingxie@google.com>
2024-05-22 23:24:12 -07:00
Roshan Khatri
c4782066e7
Cache CLUSTER SLOTS response for improving throughput and reduced latency. (#53)
This commit adds a logic to cache `CLUSTER SLOTS` response for reduced
latency and also updates the cache when a change in the cluster is
detected.

Historically, `CLUSTER SLOTS` command was deprecated, however all the
server clients have been using `CLUSTER SLOTS` and have not migrated to
`CLUSTER SHARDS`. In future this logic can be added to any other
commands to improve the performance of the engine.

---------

Signed-off-by: Roshan Khatri <rvkhatri@amazon.com>
2024-05-22 14:21:41 -07:00
Madelyn Olson
546cef6684
Initial cleanup for cluster refactoring (#460)
Cleaned up the minor cluster refactoring notes that were intended to be
follow ups that never happened. Basically:
1. Minor style nitpicks
2. Generalized clusterNodeIsMyself so that it wasn't implementation
dependent.
3. Removed getMyClusterId, and just make it an explicit call to myself's
name, which seems more straightforward and removes unnecessary
abstraction.
4. Remove clusterNodeGetSlaveof infavor of clusterNodeGetMaster. We
already do a check if it's a replica, and if it wasn't working it would
have been crashing.

Signed-off-by: Madelyn Olson <madelyneolson@gmail.com>
2024-05-14 17:09:49 -07:00
Ping Xie
138a7d9846
Handle role change error in cluster setslot when migrating the last slot away with allow-replica-migration enabled (#466) 2024-05-09 18:12:55 -07:00
Ping Xie
6e7af9471c
Slot migration improvement (#445) 2024-05-06 21:40:28 -07:00
Shivshankar
8abeb79f52
Rename redis in aof logs and proc title redis-aof-rewrite to valkey-aof-rewrite (#393)
Renamed redis to valkey/server in aof.c serverlogs.

The AOF rewrite child process title is set to "redis-aof-rewrite" if
Valkey was started from a redis-server symlink, otherwise to
"valkey-aof-rewrite".

This is a breaking changes since logs are changed.

Part of #207.

---------

Signed-off-by: Shivshankar-Reddy <shiva.sheri.github@gmail.com>
2024-05-01 18:15:19 +02:00
Lipeng Zhu
393c8fde29
Rename macros in config.h (#257)
This patch try to do following things:

1. Rename `redis_*` and `REDIS_*` macros defined in config.h to
`valkey_*`, `VALKEY_*` and update associated used files. (`redis_fstat`,
`redis_fsync`, `REDIS_THREAD_STACK_SIZE`, etc.)
2. Remove the leading double underscore for guard macro in config.h.

---------

Signed-off-by: Lipeng Zhu <lipeng.zhu@intel.com>
2024-04-23 14:20:35 +02:00
Bany
96d14fe263
Change Redis to Valkey in log messages (#226)
Log messages containing "Redis" in some files are changed.

Add macro SERVER_TITLE defined to "Valkey" (uppercase V) is introduced
and used in log messages, so at least it will be easy to patch this
definition to get Redis or any other name in the logs instead of Valkey.

Change "Redis" in some log messages to use %s and SERVER_TITLE.

This is a partial implementation of
https://github.com/valkey-io/valkey/issues/207

---------

Signed-off-by: 0del <bany.y0599@gmail.com>
2024-04-17 14:38:21 +02:00
Shivshankar
2e46046625
Rename macros in valkey-cli.c and redis_strlcpy to valkey (#284)
Signed-off-by: Shivshankar-Reddy <shiva.sheri.github@gmail.com>
2024-04-10 22:50:52 +02:00
Jacob Murphy
df5db0627f
Remove trademarked language in code comments (#223)
This includes comments used for module API documentation.

* Strategy for replacement: Regex search: `(//|/\*| \*|#).* ("|\()?(r|R)edis( |\.
  |'|\n|,|-|\)|")(?!nor the names of its contributors)(?!Ltd.)(?!Labs)(?!Contributors.)`
* Don't edit copyright comments
* Replace "Redis version X.X" -> "Redis OSS version X.X" to distinguish
from newly licensed repository
* Replace "Redis Object" -> "Object"
* Exclude markdown for now
* Don't edit Lua scripting comments referring to redis.X API
* Replace "Redis Protocol" -> "RESP"
* Replace redis-benchmark, -cli, -server, -check-aof/rdb with "valkey-"
prefix
* Most other places, I use best judgement to either remove "Redis", or
replace with "the server" or "server"

Fixes #148

---------

Signed-off-by: Jacob Murphy <jkmurphy@google.com>
Signed-off-by: Viktor Söderqvist <viktor.soderqvist@est.tech>
2024-04-09 10:24:03 +02:00
Harkrishn Patro
e59dd41e42
Maintain determinstic ordering of replica(s) in CLUSTER SLOTS response (#265)
Sort `clusterNode.slaves` while adding a new replica to the cluster on
basis of `name`. This will enable deterministic ordering of replica(s)
information in `CLUSTER SLOTS` response.

Before this change:

```
127.0.0.1:6380> CLUSTER SLOTS
1) 1) (integer) 0
   2) (integer) 16383
   3) 1) "127.0.0.1"
      2) (integer) 6379
      3) "fc72609a620c62d073a31eed9ddde5104c1fa302"
      4) (empty array)
   4) 1) "127.0.0.1"
      2) (integer) 6381
      3) "fac84bbf2ffc5cfcdebc92c06b8ead9c3cba4051"
      4) (empty array)
   5) 1) "127.0.0.1"
      2) (integer) 6380
      3) "b1249d394326f1485df0b895f2fd38e141aa5056"
      4) (empty array)
```

After this change:

```
127.0.0.1:6380> CLUSTER SLOTS
1) 1) (integer) 0
   2) (integer) 16383
   3) 1) "127.0.0.1"
      2) (integer) 6379
      3) "fc72609a620c62d073a31eed9ddde5104c1fa302"
      4) (empty array)
   4) 1) "127.0.0.1"
      2) (integer) 6380
      3) "b1249d394326f1485df0b895f2fd38e141aa5056"
      4) (empty array)
   5) 1) "127.0.0.1"
      2) (integer) 6381
      3) "fac84bbf2ffc5cfcdebc92c06b8ead9c3cba4051"
      4) (empty array)
```

Signed-off-by: Harkrishn Patro <harkrisp@amazon.com>
2024-04-08 19:58:23 -07:00
Harkrishn Patro
ebfb440629
Pass extensions to node if extension processing is handled by it (#52)
Ref: https://github.com/redis/redis/pull/12760

### Description

#### Fixes compatibilty of PlaceholderKV cluster (7.2 - extensions
enabled by default) with older Redis cluster (< 7.0 - extensions not
handled) .

With some of the extensions enabled by default in 7.2 version, new nodes
running 7.2 and above start sending out larger clusterbus message
payload including the ping extensions. This caused an incompatibility
with node running engine versions < 7.0. Old nodes (< 7.0) would receive
the payload from new nodes (> 7.2) would observe a payload length
(totlen) > (estlen) and would perform an early exit and won't process
the message.

This fix introduces a flag `extensions_supported` on the clusterMsg
indicating the sender node can handle extensions parsing. Once, a
receiver nodes receives a message with this flag set to 1, it would
update clusterNode new field extensions_supported and start sending out
extensions if it has any.


This PR also introduces a DEBUG sub command to enable/disable cluster
message extensions `process-clustermsg-extensions` feature.

Note: A successful `PING`/`PONG` is required as a sender for a given
node to be marked as `extensions_supported` and then extensions message
will be sent to it. This could cause a slight delay in receiving the
extensions message(s).

### Testing

TCL test verifying the cluster state is healthy irrespective of
enabling/disabling cluster message extensions feature.

---------

Signed-off-by: Harkrishn Patro <harkrisp@amazon.com>
2024-04-08 09:01:30 -07:00
0del
e3e1f9a372
Rename 'redis' to 'server' and redisNodeFlags to clusterNodeFlags (#191)
Rename additional instances of redis to server, as well as redisNodeFlags to clusterNodeFlags.

---------

Signed-off-by: 0del <bany.y0599@gmail.com>
2024-04-03 18:45:23 -07:00
0del
f753db5141
rename redis functions in server.h (#179)
redisPopcount -> serverPopcount
redisSetProcTitle -> serverSetProcTitle
redisCommunicateSystemd -> serverCommunicateSystemd
redisSetCpuAffinity -> serverSetCpuAffinity
redisFork -> serverFork

#144

Signed-off-by: 0del <bany.y0599@gmail.com>
2024-04-03 20:26:33 +02:00
Ping Xie
28976a9003
Fix PONG message processing for primary-ship tracking during failovers (#13055)
This commit updates the processing of PONG gossip messages in the
cluster. When a node (B) becomes a replica due to a failover, its PONG
messages include its new primary node's (A) information and B's
configuration epoch is aligned with A's. This allows observer nodes to
identify changes in primary-ship, addressing issues of intermediate
states and enhancing cluster state consistency during topology changes.

Fix #13018
2024-03-04 17:32:25 -08:00
Sankar
c1d2ac2a73
Do not include gossip about receiver in cluster messages (#13046)
The receiver does not update any of its cluster state based on gossip
about itself. This commit explicitly avoids sending or processing gossip
about the receiver.

Currently cluster bus gossips include 10% of nodes in the cluster with a
minimum of 3 nodes. For up to 30 node clusters, this commit makes sure
that 1/3 of the gossip (1 out of 3 gossips) is never discarded. This
should help with relatively faster convergence of cluster state in
general.
2024-02-13 16:38:37 -08:00
Binbin
81666a6510
Fix heap-use-after-free when pubsubshard_channels became NULL (#13038)
After fix for #13033, address sanitizer reports this heap-use-after-free
error. When the pubsubshard_channels dict becomes empty, we will delete
the dict, and the dictReleaseIterator will call dictResetIterator, it
will use the dict so we will trigger the error.

This PR introduced a new struct kvstoreDictIterator to wrap
dictIterator.
Replace the original dict iterator with the new kvstore dict iterator.

---------

Co-authored-by: Oran Agra <oran@redislabs.com>
Co-authored-by: guybe7 <guy.benoish@redislabs.com>
2024-02-07 14:53:50 +02:00
guybe7
8cd62f82ca
Refactor the per-slot dict-array db.c into a new kvstore data structure (#12822)
# Description
Gather most of the scattered `redisDb`-related code from the per-slot
dict PR (#11695) and turn it to a new data structure, `kvstore`. i.e.
it's a class that represents an array of dictionaries.

# Motivation
The main motivation is code cleanliness, the idea of using an array of
dictionaries is very well-suited to becoming a self-contained data
structure.
This allowed cleaning some ugly code, among others: loops that run twice
on the main dict and expires dict, and duplicate code for allocating and
releasing this data structure.

# Notes
1. This PR reverts the part of https://github.com/redis/redis/pull/12848
where the `rehashing` list is global (handling rehashing `dict`s is
under the responsibility of `kvstore`, and should not be managed by the
server)
2. This PR also replaces the type of `server.pubsubshard_channels` from
`dict**` to `kvstore` (original PR:
https://github.com/redis/redis/pull/12804). After that was done,
server.pubsub_channels was also chosen to be a `kvstore` (with only one
`dict`, which seems odd) just to make the code cleaner by making it the
same type as `server.pubsubshard_channels`, see
`pubsubtype.serverPubSubChannels`
3. the keys and expires kvstores are currenlty configured to allocate
the individual dicts only when the first key is added (unlike before, in
which they allocated them in advance), but they won't release them when
the last key is deleted.

Worth mentioning that due to the recent change the reply of DEBUG
HTSTATS changed, in case no keys were ever added to the db.

before:
```
127.0.0.1:6379> DEBUG htstats 9
[Dictionary HT]
Hash table 0 stats (main hash table):
No stats available for empty dictionaries
[Expires HT]
Hash table 0 stats (main hash table):
No stats available for empty dictionaries
```

after:
```
127.0.0.1:6379> DEBUG htstats 9
[Dictionary HT]
[Expires HT]
```
2024-02-05 17:21:35 +02:00
Binbin
07b292af5e
Add sender NULL check in clusterProcessGossipSection invalid_ids case (#12980)
In the following case sender may be unknown, so we need to set up a
NULL check for sender:
```
/* If this is a MEET packet from an unknown node, we still process
 * the gossip section here since we have to trust the sender because
 * of the message type. */
if (!sender && type == CLUSTERMSG_TYPE_MEET)
    clusterProcessGossipSection(hdr,link);
```
2024-01-23 09:45:02 -08:00
Brennan
e12f2decc1
Prevent nodes with invalid IDs from being propagated through gossip (#12921)
There have been occasional instances of memory corruption (though code bugs or bit flips) leading to invalid node information being gossiped around. To prevent this invalid information spreading, we verify the node IDs in received gossip are in an acceptable format, and disregard any gossiped nodes with invalid IDs. This PR uses the existing verifyClusterNodeId function to check the validity of the gossiped node IDs and if an invalid one is encountered, logs raw byte information to help debug the corruption.

---------

Co-authored-by: Madelyn Olson <madelyneolson@gmail.com>
2024-01-22 11:25:43 -08:00
Harkrishn Patro
964f4a4576
Avoid double free of cluster link (#12930)
Avoid crash while performing `DEBUG CLUSTERLINK KILL` mutliple times
(cluster link might not be created/valid).
2024-01-11 15:59:22 -08:00
bentotten
b3aaa0a136
When one shard, sole primary node marks potentially failed replica as FAIL instead of PFAIL (#12824)
Fixes issue where a single primary cannot mark a replica as failed in a
single-shard cluster.
2024-01-11 15:48:19 -08:00
Binbin
5b0c6a8255
Fix CLUSTER SHARDS crash in 7.0/7.2 mixed clusters where shard ids are not sync (#12832)
Crash reported in #12695. In the process of upgrading the cluster from
7.0 to 7.2, because the 7.0 nodes will not gossip shard id, in 7.2 we
will rely on shard id to build the server.cluster->shards dict.

In some cases, for example, the 7.0 master node and the 7.2 replica node.
From the view of 7.2 replica node, the cluster->shards dictionary does not
have its master node. In this case calling CLUSTER SHARDS on the 7.2 replica
node may crash.

We should fix the underlying assumption of updateShardId, which is that the
shard dict should be always in sync with the node's shard_id. The fix was
suggested by PingXie, see more details in #12695.
2024-01-07 20:54:41 -08:00
Binbin
4cae66f5e8
Use shard-id of the master if the replica does not support shard-id (#12805)
If there are nodes in the cluster that do not support shard-id, they
will gossip shard-id. From the perspective of nodes that support shard-id,
their shard-id is meaningless (since shard-id is randomly generated when
we create a node.)

Nodes that support shard-id will save the shard-id information in nodes.conf.
If the node is restarted according to nodes.conf, the server will report a
corrupted cluster config file error. Because auxShardIdSetter will reject
configurations with inconsistent master-replica shard-ids.

A cluster-wide consensus for the node's shard_id is not necessary. The key
is maintaining consistency of the shard_id on each individual 7.2 node.
As the cluster progressively upgrades to version 7.2, we can expect the
shard_ids across all nodes to naturally converge and align.

In this PR, when processing the gossip, if sender is a replica and does not
support shard-id, set the shard_id to the shard_id of its master.
2024-01-06 20:24:41 -08:00
Chen Tianjie
8527959598
Replace slots_to_channels radix tree with slot specific dictionaries for shard channels. (#12804)
We have achieved replacing `slots_to_keys` radix tree with key->slot
linked list (#9356), and then replacing the list with slot specific
dictionaries for keys (#11695).

Shard channels behave just like keys in many ways, and we also need a
slots->channels mapping. Currently this is still done by using a radix
tree. So we should split `server.pubsubshard_channels` into 16384 dicts
and drop the radix tree, just like what we did to DBs.

Some benefits (basically the benefits of what we've done to DBs):
1. Optimize counting channels in a slot. This is currently used only in
removing channels in a slot. But this is potentially more useful:
sometimes we need to know how many channels there are in a specific slot
when doing slot migration. Counting is now implemented by traversing the
radix tree, and with this PR it will be as simple as calling `dictSize`,
from O(n) to O(1).
2. The radix tree in the cluster has been removed. The shard channel
names no longer require additional storage, which can save memory.
3. Potentially useful in slot migration, as shard channels are logically
split by slots, thus making it easier to migrate, remove or add as a
whole.
4. Avoid rehashing a big dict when there is a large number of channels.

Drawbacks:
1. Takes more memory than using radix tree when there are relatively few
shard channels.

What this PR does:
1. in cluster mode, split `server.pubsubshard_channels` into 16384
dicts, in standalone mode, still use only one dict.
2. drop the `slots_to_channels` radix tree.
3. to save memory (to solve the drawback above), all 16384 dicts are
created lazily, which means only when a channel is about to be inserted
to the dict will the dict be initialized, and when all channels are
deleted, the dict would delete itself.
5. use `server.shard_channel_count` to keep track of the number of all
shard channels.

---------

Co-authored-by: Viktor Söderqvist <viktor.soderqvist@est.tech>
2023-12-27 17:40:45 +08:00
Binbin
c85a9b7896
Fix delKeysInSlot server events are not executed inside an execution unit (#12745)
This is a follow-up fix to #12733. We need to apply the same changes to
delKeysInSlot. Refer to #12733 for more details.

This PR contains some other minor cleanups / improvements to the test
suite and docs.
It uses the postnotifications test module in a cluster mode test which
revealed a leak in the test module (fixed).
2023-12-11 20:15:19 +02:00
zhaozhao.zz
8e11f84ded
Fix replica node cannot expand dicts when loading legacy RDB (#12839)
When loading RDB on cluster nodes, it is necessary to consider the
scenario where a node is a replica.

For example, during a rolling upgrade, new version instances are often
mounted as replicas on old version instances. In this case, the full
synchronization legacy RDB does not contain slot information, and the
new version instance, acting as a replica, should be able to handle the
legacy RDB correctly for `dbExpand`.

Additionally, renaming `getMyClusterSlotCount` to `getMyShardSlotCount`
would be appropriate.

Introduced in #11695
2023-12-07 14:30:48 +08:00
Binbin
8a4ccb01b3
Fix clusterLoadConfig aux_argv minor memory leak (#12726)
We forgot to call sdsfreesplitres. This is just a cleanup since it will
only be leaked in the error paths, and we will exit on the error paths.
2023-12-03 11:00:53 +02:00
Yehoshua (Josh) Hershberg
ef1ca6c882
add some file level comments and copyright (#12793)
A followup PR for #12742
Add some brief comments explaining the purpose of the file to the head
of cluster_legacy.c and cluster.c.
Add copyright notice to cluster.c

Signed-off-by: Josh Hershberg <yehoshua@redis.com>
Co-authored-by: Josh Hershberg <yehoshua@redis.com>
2023-11-22 11:32:23 +02:00
Binbin
dedbf99a80
Fix dbExpand not dividing by slots, resulting in consuming slots times the dictExpand (#12773)
We meant to divide it by the number of slots, otherwise it will do slots
times
dictExpand, bug was introduced in #11695.
2023-11-22 11:16:06 +02:00
Josh Hershberg
eebb025826 Cluster refactor: Some code convention fixes
Signed-off-by: Josh Hershberg <yehoshua@redis.com>
2023-11-22 05:54:48 +02:00
Josh Hershberg
290f376429 Cluster refactor: fn renames + small compilation issue on ubuntu
Signed-off-by: Josh Hershberg <yehoshua@redis.com>
2023-11-22 05:54:06 +02:00
Josh Hershberg
2e5181ef28 Cluster refactor: Add failover cmd support to cluster api
The failover command is up until now not supported
in cluster mode. This commit allows a cluster
implementation to support the command. The legacy
clustering implementation still does not support
this command.

Signed-off-by: Josh Hershberg <yehoshua@redis.com>
2023-11-22 05:54:06 +02:00
Josh Hershberg
c6157b3510 Cluster refactor: Make clustering functions common
Move primary functions used to implement datapath
clustering into cluster.c, making them shared. This
required adding "accessor" and other functions to
abstract access to node details and cluster state.

Signed-off-by: Josh Hershberg <yehoshua@redis.com>
2023-11-22 05:54:06 +02:00
Josh Hershberg
4afc54ad9b Cluster refactor: break up clusterCommand
Divide up clusterCommand into clusterCommand for shared
sub-commands and clusterCommandSpecial for implementation
specific sub-commands. So to, the cluster command help
sub-command has been divided into two implementations,
clusterCommandHelp and clusterCommandHelpSpecial. Some
common sub-subcommand implementations have been extracted
and their implemenations either made shared or else
implementation specific.

Signed-off-by: Josh Hershberg <yehoshua@redis.com>
2023-11-22 05:54:06 +02:00
Josh Hershberg
33ef6a3003 Cluster refactor: s/clusterNodeGetSlotBit/clusterNodeCoversSlot/
Simple rename, "GetSlotBit" is implementation specific

Signed-off-by: Josh Hershberg <yehoshua@redis.com>
2023-11-22 05:54:06 +02:00