100 Commits

Author SHA1 Message Date
antirez
a7ebb0c7bf WAIT command: synchronous replication for Redis. 2013-12-04 16:20:03 +01:00
antirez
c46f655c90 Log to what master a slave is going to connect to. 2013-11-11 09:25:36 +01:00
antirez
8432ddcedb Replication: install the write handler when reusing a cached master.
Sometimes when we resurrect a cached master after a successful partial
resynchronization attempt, there is pending data in the output buffers
of the client structure representing the master (likely REPLCONF ACK
commands).

If we don't reinstall the write handler, it will never be installed
again by addReply*() family functions as they'll assume that if there is
already data pending, the write handler is already installed.

This bug caused some slaves, after a successful partial sync, to never
send REPLCONF ACK and to be continuously detected as timing out by the
master, resulting in a disconnection / reconnection loop.
2013-10-04 16:12:25 +02:00
antirez
cd73a69c18 PSYNC: safer handling of PSYNC requests.
There was a bug that overestimated the amount of backlog available;
however, this could only happen when a slave was asking for an offset
that was in the "future" compared to the master replication backlog.

Now this case is handled well and logged as an incident in the master
log file.
2013-10-04 12:25:09 +02:00
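The range check described in the commit above can be sketched as follows. This is a minimal Python illustration, not the Redis C code: the function name and its parameters (`requested_offset`, `backlog_start`, `backlog_len`) are hypothetical, and it assumes the backlog is described by the offset of its first byte plus its length.

```python
def can_partial_resync(requested_offset: int, backlog_start: int, backlog_len: int) -> bool:
    """Return True when the requested offset can be served from the backlog.

    All names here are hypothetical; the backlog is assumed to cover the
    offsets backlog_start .. backlog_start + backlog_len.
    """
    # Offset of the next byte the master will append to the backlog.
    backlog_end = backlog_start + backlog_len
    if requested_offset < backlog_start:
        # The data was already discarded from the backlog.
        return False
    if requested_offset > backlog_end:
        # The slave asks for an offset in the "future": refuse (and log it).
        return False
    return True
```

With a backlog starting at offset 100 and holding 100 bytes, a request for offset 150 can continue, while a request for offset 201 is rejected as being ahead of the master.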
antirez
ec3bd0695b Make clear that runids are not cluster node IDs. 2013-09-30 11:48:09 +02:00
Maxim Zakharov
1885c6bada A mistype fixed 2013-09-03 15:15:48 +02:00
antirez
a33c9fb250 replicationFeedSlaves() func name typo: feedReplicationBacklogWithObject -> feedReplicationBacklog. 2013-08-12 12:50:45 +02:00
antirez
6268dbdd94 replicationFeedSlave() reworked for correctness and speed.
The previous code using a static buffer as an optimization was lame:

1) Premature optimization, actually it was *slower* than naive code
   because it resulted in the creation / destruction of the object
   encapsulating the output buffer.
2) The code was very hard to test, since it required specific
   tests for command lines exceeding the size of the static buffer.
3) As a result of "2" the code was buggy, as the current tests were not
   able to stress specific corner cases.

It was replaced with easy to understand code that is safer and faster.
2013-08-12 12:50:29 +02:00
antirez
21cde6ecb7 Fix a PSYNC bug caused by a variable name typo. 2013-08-12 11:51:35 +02:00
antirez
4b8b7cb964 Replication: better way to send a preamble before RDB payload.
During the replication full resynchronization process, the RDB file is
transferred from the master to the slave. However there is a short
preamble to send, that is currently just the bulk payload length of the
file in the usual Redis form $..length..<CR><LF>.

This preamble used to be sent with a direct write call, assuming that
there was always room in the socket output buffer to hold the few bytes
needed. However this does not scale in case we'll need to send more
stuff, and is not very robust code in general.

This commit introduces a more general mechanism to send a preamble up to
2GB in size (the max length of an sds string) in a non blocking way.
2013-08-12 10:29:14 +02:00
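As a rough illustration of the preamble handling described above, the sketch below builds the `$<length>\r\n` preamble and sends it without assuming the whole buffer fits in one write. It is Python rather than the Redis C code, and `rdb_preamble`, `send_pending`, and the `write` callback are hypothetical names; `write` is assumed to behave like a non-blocking socket write that returns the number of bytes accepted.

```python
from typing import Callable


def rdb_preamble(rdb_size: int) -> bytes:
    """Build the bulk-length preamble sent before the RDB payload."""
    return b"$%d\r\n" % rdb_size


def send_pending(pending: bytes, write: Callable[[bytes], int]) -> bytes:
    """Attempt one write and return whatever is still unsent.

    `write` is a hypothetical callback that may accept only part of the
    buffer, as a non-blocking socket would.
    """
    written = write(pending)
    return pending[written:]
```

The caller keeps the returned remainder and retries when the socket becomes writable again, so a preamble of any size is handled the same way as a few bytes.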
antirez
9efbe0dca0 Fix replicationFeedSlaves() off-by-one bug.
This fixes issue #1221.
2013-07-28 12:49:34 +02:00
antirez
06a0f621d6 Fix replicationFeedSlaves() to use sdsEncodedObject() macro. 2013-07-22 10:36:27 +02:00
Ted Nyman
efaa9d0bc4 Make sure the log standardizes on 'timeout' 2013-07-12 14:06:27 -07:00
antirez
80993d9892 Use getClientPeerId() for MONITOR implementation. 2013-07-09 16:21:21 +02:00
antirez
2fa66d5e76 Fix old anetPeerToString() API call in replication.c 2013-07-08 16:11:52 +02:00
Geoff Garside
dc7e8ec27f Update calls to anetPeerToString to include ip_len. 2013-07-08 15:57:22 +02:00
antirez
cdaacf03aa Don't disconnect pre PSYNC replication clients for timeout.
Clients using SYNC to replicate are older implementations, such as
redis-cli --slave, and are not designed to acknowledge the master with
REPLCONF ACK commands, so we don't have any feedback and should not
disconnect them on timeout.
2013-06-26 10:11:20 +02:00
antirez
eaebabe564 Use the RSC to replicate EVALSHA unmodified.
This commit uses the Replication Script Cache in order to avoid
translating EVALSHA into EVAL whenever possible for both the AOF and
slaves.
2013-06-24 18:57:31 +02:00
antirez
a9e1c46f40 Replication of scripts as EVALSHA: sha1 caching implemented.
This code is only responsible for maintaining a fixed-length, LRU-evicted
cache of the SHA1s that we are sure all the slaves have received.

In this commit only the implementation is provided, but the Redis core
does not use it to actually send EVALSHA to slaves when possible.
2013-06-24 10:26:04 +02:00
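The two script-cache commits above can be pictured with the following sketch. It is a Python illustration under assumptions, not the Redis implementation: `ScriptCache`, `command_to_propagate`, and the eviction details are hypothetical, chosen only to show a fixed-length cache of SHA1s deciding whether EVALSHA can be propagated unmodified.

```python
from collections import OrderedDict


class ScriptCache:
    """Fixed-length cache of script SHA1s believed known to all slaves."""

    def __init__(self, max_entries: int) -> None:
        self.max_entries = max_entries
        self._entries: "OrderedDict[str, bool]" = OrderedDict()

    def add(self, sha1: str) -> None:
        if sha1 in self._entries:
            # Already cached: mark it as most recently used.
            self._entries.move_to_end(sha1)
            return
        if len(self._entries) >= self.max_entries:
            # Cache full: evict the least recently used entry.
            self._entries.popitem(last=False)
        self._entries[sha1] = True

    def __contains__(self, sha1: str) -> bool:
        return sha1 in self._entries


def command_to_propagate(cache: ScriptCache, sha1: str, body: str, args: list) -> list:
    """Pick the command to send to slaves (and the AOF) for an EVALSHA call."""
    if sha1 in cache:
        cache.add(sha1)  # refresh recency
        return ["EVALSHA", sha1, *args]
    # Slaves may not know the script yet: send the full body once.
    cache.add(sha1)
    return ["EVAL", body, *args]
```

The first call for a given script is translated to EVAL; later calls go out as EVALSHA until the entry is evicted.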
antirez
e802b22dfb Refresh good slaves count when setting slave state as online. 2013-05-30 12:13:25 +02:00
antirez
cb76f29230 min-slaves-to-write: don't accept writes with less than N replicas.
This feature allows the user to specify the minimum number of
connected replicas, each with a lag less than or equal to the specified
number of seconds, required for writes to be accepted.
2013-05-30 11:30:04 +02:00
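The rule above can be sketched in a few lines. This is a Python illustration with hypothetical names; it assumes that a replica's lag is measured as the time elapsed since its last acknowledgement, and that a minimum of zero disables the check.

```python
def count_good_slaves(last_ack_times: list, now: float, max_lag: float) -> int:
    """Count replicas whose last acknowledgement is at most max_lag seconds old."""
    return sum(1 for ack_time in last_ack_times if now - ack_time <= max_lag)


def writes_allowed(last_ack_times: list, now: float, min_slaves: int, max_lag: float) -> bool:
    """Accept writes only when enough replicas are within the allowed lag."""
    if min_slaves == 0:
        # Assumption: a minimum of zero means the feature is disabled.
        return True
    return count_good_slaves(last_ack_times, now, max_lag) >= min_slaves
```

For example, with acknowledgements received at times 95, 99, and 80, a current time of 100, and a maximum lag of 10 seconds, two replicas count as good, so a minimum of two is satisfied and a minimum of three is not.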
antirez
8752adc059 Close connection with timed out slaves.
Now masters, using the time at which the last REPLCONF ACK was received,
are able to explicitly disconnect slaves that are no longer responding.

Previously the only chance was to see a very long output buffer, which
was highly suboptimal.
2013-05-27 11:42:42 +02:00
antirez
8304ed9a06 Send ACK to master once every second.
ACKs can also be used as a base for synchronous replication. However in
that case they'll be explicitly requested by the master when the client
sends a request that needs to be replicated synchronously.
2013-05-27 11:42:38 +02:00
antirez
c9af89d8cd Don't ACK the master after every command.
Sending an ACK is now moved into the replicationSendAck() function.
2013-05-27 11:42:35 +02:00
antirez
dfc2575703 Make sure that REPLCONF ACK really has no return value. 2013-05-27 11:42:30 +02:00
antirez
6d2b8f5845 REPLCONF ACK command.
This special command is used by the slave to inform the master of the
amount of replication stream it has consumed so far.

It does not return anything, so that we do not need to consume the
additional bandwidth the master would need to reply with something.

The master can do a number of things knowing the amount of stream
processed, such as understanding the "lag" in bytes of the slave, verify
if a given command was already processed by the slave, and so forth.
2013-05-27 11:42:17 +02:00
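The uses of the acknowledged offset mentioned above reduce to simple arithmetic on stream offsets. The sketch below is a Python illustration with hypothetical function and parameter names, assuming both sides count bytes of the same replication stream.

```python
def lag_in_bytes(master_offset: int, acked_offset: int) -> int:
    """Bytes of replication stream the slave has not yet acknowledged."""
    return master_offset - acked_offset


def slave_has_processed(command_end_offset: int, acked_offset: int) -> bool:
    """True when the slave acknowledged the stream up to the end of a command.

    command_end_offset is the (hypothetical) master offset right after the
    command was appended to the replication stream.
    """
    return acked_offset >= command_end_offset
```

A master at offset 1500 whose slave acknowledged offset 1200 sees a lag of 300 bytes, and a command ending at offset 1300 is not yet known to be processed by that slave.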
antirez
39c5f8a615 Cluster: SLAVEOF command not allowed in cluster mode. 2013-03-05 12:39:41 +01:00
antirez
7cf96d66ef Make sure replicationSetMaster() works when ip argument is not an sds. 2013-03-04 15:39:55 +01:00
antirez
06fa5f82d7 SLAVEOF command refactored into a proper API.
We now have replicationSetMaster() and replicationUnsetMaster() that can
be called in other contexts (for instance Redis Cluster).
2013-03-04 13:22:21 +01:00
antirez
646785ae48 Use GCC printf format attribute for redisLog().
This commit also fixes redisLog() statements producing warnings.
2013-02-27 12:27:15 +01:00
antirez
c72be04d12 PSYNC: another change to unexpected reply from PSYNC. 2013-02-13 18:43:40 +01:00
antirez
67ef554e2e PSYNC: More robust handling of unexpected reply to PSYNC. 2013-02-13 18:33:33 +01:00
antirez
f8e3cd19ad Replication: more strict error checking for master PING reply. 2013-02-12 16:53:27 +01:00
antirez
12a3bf6245 Replication: added new stats counting full and partial resynchronizations. 2013-02-12 15:33:54 +01:00
antirez
d5bed58b08 PSYNC: debugging printf() calls are now logs at DEBUG level. 2013-02-12 12:52:22 +01:00
antirez
33a7ca234d Remove harmless warning in slaveTryPartialResynchronization(). 2013-02-12 12:52:21 +01:00
antirez
10756f5c4a PSYNC: don't use the client buffer to send +CONTINUE and +FULLRESYNC.
When we are preparing a handshake with the slave we can't touch the
connection buffer as it'll be used to accumulate differences between
the sent RDB file and what arrives next from clients.

So in short we can't use addReply() family functions.

However we just use write(2) because we know that the socket buffer is
empty, since a prerequisite for SYNC to work is that the static buffer
and the output list are empty, and in general it is not expected that a
client SYNCs after doing some heavy I/O with the master.

However a short write connection is explicitly handled to avoid
fragility (we simply close the connection and the slave will retry).
2013-02-12 12:52:21 +01:00
antirez
b5ddb829b5 SYNC not allowed with pending data on the static output buffer. 2013-02-12 12:52:21 +01:00
antirez
65c0a0eb2b Log the unexpected string received in place of the SYNC payload length. 2013-02-12 12:52:21 +01:00
antirez
16b114d7db After SLAVEOF <newslave> don't allow chained slaves to PSYNC. 2013-02-12 12:52:21 +01:00
antirez
75512d94d9 PSYNC: work in progress, preview #2, rebased to unstable. 2013-02-12 12:52:21 +01:00
antirez
4860dda15d Use the new unified protocol to send SELECT to slaves.
SELECT was still transmitted to slaves using the inline protocol, that
is conceived mostly for humans to type into telnet sessions, and is
notably not understood by redis-cli --slave.

Now the new protocol is used instead.
2013-02-12 12:50:28 +01:00
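To make the difference between the two protocols concrete, the sketch below encodes a command in the unified (multi-bulk) form. It is a Python illustration; `encode_command` is a hypothetical helper, not a Redis function.

```python
def encode_command(*args: str) -> bytes:
    """Encode a command as a multi-bulk request in the unified protocol."""
    parts = [b"*%d\r\n" % len(args)]  # number of arguments
    for arg in args:
        data = arg.encode()
        # Each argument is a bulk string: $<length>, CRLF, payload, CRLF.
        parts.append(b"$%d\r\n%s\r\n" % (len(data), data))
    return b"".join(parts)


# The inline form of the same command, as a human would type it in telnet.
INLINE_SELECT_0 = b"SELECT 0\r\n"
```

Here `encode_command("SELECT", "0")` yields `*2\r\n$6\r\nSELECT\r\n$1\r\n0\r\n`, a length-prefixed form that a protocol parser can read without guessing where arguments end.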
antirez
96c5190ed9 Use replicationFeedSlaves() to send PING to slaves.
A Redis master sends PING commands to slaves from time to time: doing
this ensures that even in the absence of writes, the master->slave channel
remains active and the slave can feel the master presence, instead of
closing the connection for timeout.

This commit changes the way PINGs are sent to slaves in order to use the
standard interface used to replicate all the other commands, that is,
the function replicationFeedSlaves().

With this change the stream of commands sent to every slave is exactly
the same regardless of their exact state (Transferring RDB for first
synchronization or slave already online). With the previous
implementation the PING was only sent to online slaves, with the result
that the output stream from master to slaves was not identical for all
the slaves: this is a problem if we want to implement partial resyncs in
the future using a global replication stream offset.

TL;DR: this commit should not change the behaviour in practical terms,
but is just something in preparation for partial resynchronization
support.
2013-02-12 12:50:28 +01:00
antirez
08d5aa6a20 Emit SELECT to slaves in a centralized way.
Before this commit every Redis slave had its own selected database ID
state. This was not actually useful as the emitted stream of commands
is identical for all the slaves.

Now the currently selected database is a global state that is set to
-1 when a new slave is attached, in order to force the SELECT command to
be re-emitted for all the slaves.

This change is useful in order to implement replication partial
resynchronization in the future, as it makes sure that the stream of
commands received by slaves, including SELECT commands, is exactly the
same for every slave connected, at any time.

In this way we could have a global offset that can identify a specific
piece of the master -> slaves stream of commands.
2013-02-12 12:50:28 +01:00
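The centralized SELECT logic described above can be sketched as a single shared stream. This is a Python illustration, not the Redis code: `ReplicationStream`, `attach_slave`, and `feed` are hypothetical names, and the stream is modeled as a plain list of commands.

```python
class ReplicationStream:
    """One stream of commands shared by all slaves (hypothetical model)."""

    def __init__(self) -> None:
        self.selected_db = -1  # -1 means "no SELECT emitted yet"
        self.stream = []       # commands emitted so far, identical for every slave

    def attach_slave(self) -> None:
        # Force SELECT to be re-emitted before the next command.
        self.selected_db = -1

    def feed(self, db: int, command: tuple) -> None:
        if db != self.selected_db:
            self.stream.append(("SELECT", str(db)))
            self.selected_db = db
        self.stream.append(command)
```

Because the state is global rather than per slave, every position in `stream` means the same thing for every slave, which is what a global replication offset relies on.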
antirez
c3ce83fac0 Make all WATCHers dirty when the slave reloads the DB. 2013-02-08 10:26:19 +01:00
antirez
d698a264d2 TCP_NODELAY after SYNC: changes to the implementation. 2013-02-05 12:04:30 +01:00
charsyam
6c7473623e Turn off TCP_NODELAY on the slave socket after SYNC.
Further details from @antirez:

It was reported by @StopForumSpam on Twitter that the Redis replication
link was strangely using multiple TCP packets for multiple commands.
This wastes a lot of bandwidth and is due to the TCP_NODELAY option we
enable on the socket after accepting a new connection.

However the master -> slave channel is a one-way channel since Redis
replication is asynchronous, so there is no point in trying to reduce
the latency, we should aim to reduce the bandwidth. For this reason this
commit introduces the ability to turn TCP_NODELAY off (that is, to
re-enable the Nagle algorithm) on the socket after a successful SYNC.

This feature is off by default because the delay can be up to 40
milliseconds with normally configured Linux kernels.
2013-02-05 12:04:25 +01:00
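For readers unfamiliar with the socket option involved, the sketch below toggles TCP_NODELAY using Python's standard `socket` module. It only illustrates the option itself; the helper names are hypothetical and this is not how Redis exposes the setting.

```python
import socket


def set_tcp_nodelay(sock: socket.socket, enabled: bool) -> None:
    """Turn TCP_NODELAY on or off for a TCP socket.

    With the option off, the kernel's Nagle algorithm may coalesce small
    writes into fewer packets, trading some latency for bandwidth.
    """
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1 if enabled else 0)


def tcp_nodelay_enabled(sock: socket.socket) -> bool:
    """Report whether TCP_NODELAY is currently set on the socket."""
    return sock.getsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY) != 0
```

A replication link that favors bandwidth would call `set_tcp_nodelay(sock, False)` once the initial synchronization has completed.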
guiquanz
df7a5b7157 Fixed many typos. 2013-01-19 10:59:44 +01:00
antirez
e84a6cd10c Undo slave-master handshake when SLAVEOF sets a new slave.
Issue #828 shows how Redis was not correctly undoing a non-blocking
connection attempt with the previous master when the master was set to a
new address using the SLAVEOF command.

This was also a result of lack of refactoring, so now there is a
function to cancel the non blocking handshake with the master.
The new function is now used when SLAVEOF NO ONE is called or when
SLAVEOF is used to set the master to a different address.
2013-01-15 13:33:24 +01:00
antirez
0fc5457a7f Better error reporting when fd event creation fails. 2013-01-03 14:29:34 +01:00