3 Commits

Author SHA1 Message Date
Pierre
5f7fe9ef21
Send MEET packet to node if there is no inbound link to fix inconsistency when handshake times out (#1307)
In some cases, when meeting a new node, if the handshake times out, we
can end up with an inconsistent view of the cluster where the new node
knows about all the nodes in the cluster, but the cluster does not know
about this new node (or vice versa).
To detect this inconsistency, we now check whether a node has an outbound
link but no inbound link, which probably means that the node does not
know us. In that case we (re-)send a MEET packet to the node to start
a new handshake with it.
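
The detection rule above can be sketched in C roughly as follows. This is a minimal illustration, not the code from the commit: `Link`, `Node`, `meet_pending`, `node_needs_meet` and `flag_nodes_for_meet` are hypothetical names chosen for this example.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins; not the real Valkey structures. */
typedef struct Link { int fd; } Link;

typedef struct Node {
    Link *outbound;    /* link we opened to the peer */
    Link *inbound;     /* link the peer opened to us */
    bool meet_pending; /* set when a MEET should be (re-)sent */
} Node;

/* True when the peer probably does not know us:
 * we have a link to it, but it never connected back. */
static bool node_needs_meet(const Node *n) {
    return n->outbound != NULL && n->inbound == NULL;
}

/* Periodic check: flag every such node so that a MEET is sent to it.
 * Returns the number of nodes newly flagged. */
static size_t flag_nodes_for_meet(Node *nodes, size_t count) {
    size_t flagged = 0;
    for (size_t i = 0; i < count; i++) {
        if (node_needs_meet(&nodes[i]) && !nodes[i].meet_pending) {
            nodes[i].meet_pending = true;
            flagged++;
        }
    }
    return flagged;
}
```

A node with both links, or with no outbound link at all, is left alone. Only the asymmetric case triggers a new handshake.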
If we receive a MEET packet from a known node, we disconnect the
outbound link to force a reconnect and the sending of a PING packet, so
that the other node recognizes the link as belonging to us. This prevents
cases where a node could send MEET packets in a loop because it thinks
the other node does not have an inbound link.
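
The receiving side can be sketched in the same spirit. Again this is only an illustration under assumptions: `PeerLink`, `Peer` and `on_meet_received` are hypothetical names, and the reconnect that follows the dropped link is left out.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical, simplified stand-ins; not the real Valkey structures. */
typedef struct PeerLink { int fd; } PeerLink;

typedef struct Peer {
    bool known;         /* already part of our view of the cluster */
    PeerLink *outbound; /* link we opened to the peer, or NULL */
} Peer;

/* Handle an incoming MEET. Returns true if the outbound link was dropped,
 * in which case the caller is expected to reconnect and send a PING. */
static bool on_meet_received(Peer *p) {
    if (!p->known) {
        p->known = true; /* normal handshake path for a new node */
        return false;
    }
    if (p->outbound != NULL) {
        free(p->outbound); /* known node: force a reconnect */
        p->outbound = NULL;
        return true;
    }
    return false;
}
```

Dropping the link instead of answering on it is what breaks the potential MEET loop: after the reconnect, the PING lets the peer associate the new inbound link with us.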

This fixes the bug described in #1251.

---------

Signed-off-by: Pierre Turin <pieturin@amazon.com>
2024-12-11 17:26:06 -08:00
Sankar
eff45f5467
Fix flakiness of cluster-multiple-meets and cluster-reliable-meet (#728)
Tests in cluster-multiple-meets were flaky, as reported by @madolson:

* https://github.com/valkey-io/valkey/actions/runs/9688455588/job/26776953320
* https://github.com/valkey-io/valkey/actions/runs/9688455588/job/26776953585

I wasn't able to reproduce this locally, but I suspect that the
flakiness is coming from the fact that nodes are reported as "connected"
as long as there is an outgoing link. An outgoing link is created before
MEET is sent out.

Signed-off-by: Sankar <1890648+srgsanky@users.noreply.github.com>
2024-07-01 22:27:38 -07:00
Sankar
a81c32079c
Make cluster meet reliable under link failures (#461)
If a link fails while a MEET request is in flight, the sending node
stops sending MEET packets and starts sending PINGs. Since every node
responds to PINGs from unknown nodes with a PONG, the receiving node
never adds the sending node. But the sending node adds the receiving
node when it sees a PONG. This can lead to asymmetry in cluster
membership. This change makes the sender keep sending MEET until it
sees a PONG, avoiding the asymmetry.
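
The sender-side rule can be sketched as a small state machine. This is a hedged illustration rather than the actual patch: `PacketType`, `Handshake`, `meet_acked`, `next_packet`, `on_link_reset` and `on_pong` are hypothetical names.

```c
#include <stdbool.h>

/* Hypothetical names for illustration; not the real Valkey definitions. */
typedef enum { PKT_MEET, PKT_PING } PacketType;

typedef struct Handshake {
    bool meet_acked; /* true once a PONG has been seen from the peer */
} Handshake;

/* Choose the packet to send when a link to the peer is (re-)established. */
static PacketType next_packet(const Handshake *h) {
    return h->meet_acked ? PKT_PING : PKT_MEET;
}

/* A link failure does not change the handshake state,
 * so the next packet is still a MEET until a PONG arrives. */
static void on_link_reset(Handshake *h) {
    (void)h;
}

/* Only a PONG completes the handshake and switches the sender to PINGs. */
static void on_pong(Handshake *h) {
    h->meet_acked = true;
}
```

Keeping the state on the handshake rather than on the link is the design point: the link can be torn down and recreated any number of times without the sender falling back to PING too early.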

---------

Signed-off-by: Sankar <1890648+srgsanky@users.noreply.github.com>
2024-06-16 20:37:09 -07:00