From ffb691f6f1b0f176b1f03c474f1dfd854ee6bf78 Mon Sep 17 00:00:00 2001
From: Binbin <binloveplay1314@qq.com>
Date: Wed, 1 Feb 2023 20:48:16 +0800
Subject: [PATCH] Fix handshake timeout replication test race (#11773)

The test fails on x86 + TLS with this error:
```
*** [err]: Slave is able to detect timeout during handshake in tests/integration/replication.tcl
Replica is not able to detect timeout
```

The replica log is:
```
 ### Starting test Slave is able to detect timeout during handshake in tests/integration/replication.tcl
7681:S 05 Jan 2023 00:21:56.635 * Non blocking connect for SYNC fired the event.
7681:S 05 Jan 2023 00:21:56.638 * Master replied to PING, replication can continue...
7681:S 05 Jan 2023 00:21:56.638 * Trying a partial resynchronization (request ef70638885500aad12dd673c68ca1541116a59fe:1).
7681:S 05 Jan 2023 00:22:56.894 # Failed to read response from the server: error:0A000126:SSL routines::unexpected eof while reading
7681:S 05 Jan 2023 00:22:56.894 # Master did not reply to PSYNC, will try later
```

This is another issue that appeared after #11640 was merged. This PR tries to fix it.
The idea is to wait until the replica is stable in the `wait_bgsave` state. For example,
the test may have to wait for the next PSYNC retry in the following situation:
`Master did not reply to PSYNC, will try later`

Other than that, the change makes the test more consistent and predictable, since
the master is always frozen in the desired state (waiting for repl-diskless-sync-delay
to elapse, rather than in earlier stages of the handshake).
---
 tests/integration/replication.tcl | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/tests/integration/replication.tcl b/tests/integration/replication.tcl
index 319462695..e23edad9d 100644
--- a/tests/integration/replication.tcl
+++ b/tests/integration/replication.tcl
@@ -31,6 +31,14 @@ start_server {tags {"repl network external:skip"}} {
             }
         }
 
+        test {Slave enters wait_bgsave} {
+            wait_for_condition 50 1000 {
+                [string match *state=wait_bgsave* [$master info replication]]
+            } else {
+                fail "Replica does not enter wait_bgsave state"
+            }
+        }
+
         # Use a short replication timeout on the slave, so that if there
         # are no bugs the timeout is triggered in a reasonable amount
         # of time.