3921 Commits

Author SHA1 Message Date
antirez
21b4d6c23e SENTINEL SET master quorum implemented. 2014-01-14 09:23:26 +01:00
antirez
b279e578fa SENTINEL SET: error on bad option name + flush config on error. 2014-01-13 11:55:57 +01:00
antirez
74f84e3a3d SENTINEL SET implemented.
The new command allows changing master-specific configurations
at runtime. All the settable parameters can be retrieved via the
SENTINEL MASTER command, so there is no equivalent "GET" command.
2014-01-13 11:53:29 +01:00
antirez
1642da19bb Sentinel: fix wrong arity error message. 2014-01-13 11:05:13 +01:00
antirez
2c6c1b1271 Sentinel: SENTINEL REMOVE command added.
The command totally removes a monitored master.
2014-01-10 15:39:36 +01:00
antirez
2306608167 Sentinel: releaseSentinelRedisInstance() top comment fixed.
The claim about unlinking the instance from the connected hash tables
was the opposite of the reality. Also the current actual behavior is
safer in most cases, so it is better to manually unlink when needed.
2014-01-10 15:33:42 +01:00
antirez
282b2b4660 Sentinel: flush config on disk when new master is added. 2014-01-10 15:22:06 +01:00
antirez
7dae2c3681 anetResolveIP() prototype added to anet.h. 2014-01-10 15:18:41 +01:00
antirez
61302ba560 Sentinel: SENTINEL MONITOR command implemented.
It allows adding new masters to monitor at runtime.
2014-01-10 15:18:24 +01:00
antirez
057392f876 anetResolveIP() added to anet.c.
The new function is used when we want to normalize an IP address without
performing a DNS lookup if the string to resolve is not a valid IP.

This is useful every time only IPs are valid inputs or when we want to
skip DNS resolution that is slow during runtime operations if we are
required to block.
2014-01-10 15:02:39 +01:00
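A minimal sketch of the idea described above, not the code in anet.c: it asks `getaddrinfo()` for numeric-only parsing via `AI_NUMERICHOST`, so a string that is not a valid IP fails immediately instead of triggering a DNS lookup. The function name `resolve_ip_only` and its signature are hypothetical and used only for illustration.

```c
#include <assert.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <netdb.h>

/* Hypothetical helper: normalize `host` into `ipbuf` only if it is already
 * a numeric IPv4/IPv6 address. Returns 0 on success, -1 otherwise.
 * No DNS query is performed because of AI_NUMERICHOST. */
int resolve_ip_only(const char *host, char *ipbuf, size_t ipbuf_len) {
    struct addrinfo hints, *info;
    const void *addr;
    const char *ok;

    assert(host != NULL && ipbuf != NULL);
    memset(&hints, 0, sizeof(hints));
    hints.ai_flags = AI_NUMERICHOST;   /* parse only, never resolve names */
    hints.ai_family = AF_UNSPEC;       /* accept both IPv4 and IPv6 */
    hints.ai_socktype = SOCK_STREAM;

    if (getaddrinfo(host, NULL, &hints, &info) != 0) return -1;

    if (info->ai_family == AF_INET)
        addr = &((struct sockaddr_in *)info->ai_addr)->sin_addr;
    else
        addr = &((struct sockaddr_in6 *)info->ai_addr)->sin6_addr;

    /* Write the canonical textual form of the address into ipbuf. */
    ok = inet_ntop(info->ai_family, addr, ipbuf, (socklen_t)ipbuf_len);
    freeaddrinfo(info);
    return ok ? 0 : -1;
}
```

With this sketch a caller would pass a buffer of at least `INET6_ADDRSTRLEN` bytes; a hostname such as `example.invalid` is rejected without blocking on the resolver.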
antirez
7d7e3f00e0 Sentinel: added SENTINEL MASTER <name> command.
With SENTINEL MASTERS it was already possible to list all the configured
masters, but not a specific one.
2014-01-10 14:41:52 +01:00
antirez
46429f36a7 Add all the configurable fields to addReplySentinelRedisInstance().
Note: the auth password with the master is voluntarily not exposed.
2014-01-10 14:31:41 +01:00
antirez
1f73921d24 Trim comment to 80 cols in SentinelCommand(). 2014-01-10 14:13:04 +01:00
antirez
e9786a3255 Test: regression for issue #1483. 2014-01-09 11:19:03 +01:00
antirez
ed3c6c0124 Fix RESTORE ttl handling in 32 bit archs.
long was used where long long is needed in order to handle a 64 bit
resolution millisecond timestamp.

This fixes issue #1483.
2014-01-09 11:09:23 +01:00
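The general hazard behind this fix can be sketched as follows. This is not the RESTORE code; it only illustrates, under the assumption that `long` is 32 bits wide on the affected architectures, why millisecond values need `long long`. The helper names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Returns 1 if v would fit in a 32-bit signed long, 0 otherwise.
 * INT32_MIN/INT32_MAX stand in for LONG_MIN/LONG_MAX on a 32-bit arch. */
int fits_in_32bit_long(long long v) {
    return v >= INT32_MIN && v <= INT32_MAX;
}

/* Converts a duration in days to milliseconds using 64-bit arithmetic. */
long long days_to_ms(long long days) {
    assert(days >= 0);
    return days * 24 * 60 * 60 * 1000;
}
```

For example, a 30 day TTL is 2592000000 ms, which is above the 32-bit signed maximum of 2147483647, and a Unix time in milliseconds from 2014 (around 1389262163000) is far above it.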
antirez
937732d50a Fix keyspace events flags-to-string conversion.
Fixes issue #1491 on Github.
2014-01-08 17:18:34 +01:00
antirez
088a617c61 Test: stress events flags to/from string conversion. 2014-01-08 17:18:30 +01:00
antirez
c0cdcaf373 Don't send REPLCONF ACK to old masters.
Masters not understanding REPLCONF ACK will reply with errors to our
requests causing a number of possible issues.

This commit detects a global replication offset set to -1 at the end of
the replication, and marks the client representing the master with the
REDIS_PRE_PSYNC flag.

Note that this flag was called REDIS_PRE_PSYNC_SLAVE but now it is just
REDIS_PRE_PSYNC as it is used for both slaves and masters starting with
this commit.

This commit fixes issue #1488.
2014-01-08 14:28:16 +01:00
antirez
c1a042fda9 Clarify a comment in slaveTryPartialResynchronization(). 2014-01-08 14:28:13 +01:00
antirez
c0b9515805 Log disconnection with slave only when ip:port is available. 2013-12-25 18:41:53 +01:00
antirez
87b56174b9 anetPeerToString / SockName: port can be NULL on errors too. 2013-12-25 18:41:49 +01:00
antirez
9a1cfab59b anetTcpGenericConnect() bug introduced in 9d19977 fixed.
During a refactoring I misspelled _port for port.
This is one of the reasons I never used _varname myself.
2013-12-25 18:41:45 +01:00
antirez
cf71d130a1 Remove useless goto from anetTcpGenericConnect(). 2013-12-25 18:41:41 +01:00
antirez
b4bee62561 anetTcpGenericConnect() code improved + 1 bug fix.
Now the socket is closed if anetNonBlock() fails, and in general the
code structure makes it harder to introduce this kind of bug in the
future.

Reference: pull request #1059.
2013-12-25 18:15:28 +01:00
antirez
98901950f9 Cluster: clusterProcessPacket() was not 80 cols friendly.
The function actually needs to be split into sub-functions at some
point in the future.
2013-12-25 17:57:36 +01:00
antirez
c571290943 Fix CONFIG REWRITE handling of unknown options.
There were two problems with the implementation.

1) "save" was not correctly processed when no save point was configured,
   as reported in issue #1416.
2) The way the code checked if an option existed in the "processed"
   dictionary was wrong, as we add the element as a key associated
   with a NULL value, so dictFetchValue() can't be used to check for
   existence, but dictFind() must be used instead, which returns NULL
   only if the entry does not exist at all.
2013-12-23 12:50:27 +01:00
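The distinction in point 2 can be sketched with a toy dictionary. This is not Redis's dict.c; `toyDict`, `toyFind` and `toyFetchValue` are hypothetical stand-ins that mimic the two lookup styles: one returns the entry, the other returns the stored value.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy singly linked dictionary used only for illustration. */
typedef struct toyEntry {
    const char *key;
    void *val;
    struct toyEntry *next;
} toyEntry;

typedef struct toyDict {
    toyEntry *head;
} toyDict;

/* Links a caller-owned entry at the head of the dictionary. */
void toyAdd(toyDict *d, toyEntry *e) {
    assert(d != NULL && e != NULL);
    e->next = d->head;
    d->head = e;
}

/* Entry-style lookup: NULL only when the key is not present at all. */
toyEntry *toyFind(toyDict *d, const char *key) {
    toyEntry *e;
    for (e = d->head; e != NULL; e = e->next)
        if (strcmp(e->key, key) == 0) return e;
    return NULL;
}

/* Value-style lookup: NULL when the key is missing OR its value is NULL,
 * so it cannot be used as an existence test for keys stored with NULL. */
void *toyFetchValue(toyDict *d, const char *key) {
    toyEntry *e = toyFind(d, key);
    return e ? e->val : NULL;
}
```

A key added with a NULL value (as a "set membership" marker) looks absent through the value-style lookup, while the entry-style lookup still finds it.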
antirez
ab00366504 Configuring port to 0 disables IP socket as specified.
This was no longer the case with 2.8 because of a bug introduced with
the IPv6 support. Now it is fixed.

This fixes issues #1287 and #1477.
2013-12-23 11:31:35 +01:00
antirez
c123005f8c Make new masters inherit replication offsets.
Currently replication offsets could be used in a limited way in order
to understand, out of a set of slaves, which is the one with the most
updated data. For example this comparison is possible if N slaves
were all replicating from the same master.

However the replication offset was not transferred from master to slaves
(that are later promoted as masters) in any way, so for instance if
there were three instances A, B, C, with A master and B and C
replicating from A, the following could happen:

C disconnects from A.
B is turned into master.
A is switched to slave of B.
B receives some write.

In this context there was no way to compare the offset of A and C,
because B would use its own local master replication offset as
replication offset to initialize the replication with A.

With this commit what happens is that when B is turned into master it
inherits the replication offset from A, making A and C comparable.
In the above case assuming no inconsistencies are created during the
disconnection and failover process, A will show a replication
offset greater than C.

Note that this does not mean offsets are always comparable to understand
which is, in a set of instances, the most updated one, since in more
complex examples the replica with the higher replication offset could be
partitioned away when picking the instance to elect as new master.
However this in general improves the ability of a system to try to pick
a good replica to promote to master.
2013-12-22 11:43:25 +01:00
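The A, B, C scenario above can be sketched as a toy model. This is not replication code: the starting offset of 1000, the 50 byte write, and the assumption that a non-inheriting B would restart from offset 0 are all made up for illustration, as is the function name.

```c
#include <assert.h>

/* Toy model of an instance: only the replication offset matters here. */
typedef struct {
    long long repl_offset;
} instance;

/* Runs the A/B/C scenario from the commit message.
 * inherit != 0: B keeps the offset it reached while replicating from A.
 * inherit == 0: B restarts from its own local offset (assumed 0 here).
 * Returns 1 if A ends up with a greater offset than C, 0 otherwise. */
int a_is_ahead_of_c(int inherit) {
    instance a = {1000}, b = {1000}, c = {1000}; /* all in sync with A */
    const long long write_size = 50;

    /* C disconnects from A: its offset stays frozen at 1000. */
    /* B is turned into master. */
    if (!inherit) b.repl_offset = 0;
    /* A becomes a replica of B and starts from B's offset. */
    a.repl_offset = b.repl_offset;
    /* B receives a write and propagates it to A. */
    b.repl_offset += write_size;
    a.repl_offset += write_size;

    assert(a.repl_offset == b.repl_offset);
    return a.repl_offset > c.repl_offset;
}
```

With inheritance A ends at 1050 against C's 1000, so the comparison reflects that A has newer data. Without it A ends at 50 and would wrongly look older than C.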
antirez
0bfe6badf5 Slave disconnection is an event worth logging. 2013-12-22 10:15:35 +01:00
antirez
a75b334bdf Redis Cluster: add repl_ping_slave_period to slave data validity time.
When the configured node timeout is very small, the data validity time
(maximum data age for a slave to try a failover) is too little (ten
times the configured node timeout) when the replication link with the
master is mostly idle. In this case we'll receive some data from the
master only every server.repl_ping_slave_period to refresh the last
interaction with the master.

This commit adds to the max data validity time the slave ping period to
avoid this problem of slaves sensing too old data without a good reason.
However this max data validity time is likely a setting that should be
configurable by the Redis Cluster user in a way completely independent
from the node timeout.
2013-12-22 10:05:16 +01:00
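The arithmetic described above can be sketched as follows. This is an illustration, not the cluster code: the function names are hypothetical, and the units (node timeout in milliseconds, ping period in seconds) are assumptions.

```c
#include <assert.h>

#define VALIDITY_MULT 10 /* "ten times the configured node timeout" */

/* Maximum data age (ms) for a slave to try a failover.
 * add_ping_period == 0 models the old limit, != 0 the limit after
 * adding the slave ping period. */
long long max_data_age_ms(long long node_timeout_ms,
                          long long ping_period_sec,
                          int add_ping_period) {
    long long limit;
    assert(node_timeout_ms >= 0 && ping_period_sec >= 0);
    limit = node_timeout_ms * VALIDITY_MULT;
    if (add_ping_period) limit += ping_period_sec * 1000;
    return limit;
}

/* Returns 1 if data of the given age is still within the limit. */
int data_is_fresh_enough(long long data_age_ms, long long limit_ms) {
    return data_age_ms <= limit_ms;
}
```

With a 100 ms node timeout and a 10 second ping period, the old limit is 1000 ms and the new one is 11000 ms. On an idle link, where the last interaction with the master may be up to about one ping period old, a data age of 9000 ms fails the old check and passes the new one.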
antirez
74da5ee594 Log when a slave loses the connection with its master. 2013-12-21 00:23:37 +01:00
antirez
db016acb7f Redis Cluster: move node failure reports logging from VERBOSE to NOTICE level. 2013-12-21 00:04:53 +01:00
antirez
8527ba1eea Redis Cluster: remove no longer relevant comment. 2013-12-20 14:40:11 +01:00
antirez
dd10efb31a Redis Cluster: reconfigure replication when master changes address. 2013-12-20 12:47:22 +01:00
antirez
4d11d4c86c Redis Cluster: handshake code refactoring + Gossip IP switch detection.
This commit makes it simple to start a handshake with a specific node
address, and uses this in order to detect a node IP change and start a
new handshake in order to fix the IP if possible.
2013-12-20 12:38:03 +01:00
antirez
f42e0277ab Redis Cluster: delay state change when in the majority again.
As specified in the Redis Cluster specification, when a node can reach
the majority again after a period in which it was partitioned away with
the minority of masters, wait some time before accepting queries, to
provide a reasonable amount of time for other nodes to update their
configuration.

This lowers the probability that both a client and a master with a
stale configuration rejoin the cluster at the same time, with the
stale master accepting writes.
2013-12-20 09:56:18 +01:00
antirez
e76443455f Clarify include directive behavior in example redis.conf. 2013-12-19 16:02:31 +01:00
antirez
9dc5817de7 CONFIG REWRITE: no special handling of include and rename-command.
CONFIG REWRITE is now wiser and does not touch what it does not
understand inside redis.conf.
2013-12-19 15:57:11 +01:00
Yubao Liu
9846af124d CONFIG REWRITE: don't drop some options on config rewrite
These options would be dropped without this patch:
  include, rename-command, min-slaves-to-write, min-slaves-max-lag,
appendfilename.
2013-12-19 15:56:48 +01:00
antirez
5131d7da74 CONFIG REWRITE: old development comments removed. 2013-12-19 15:30:06 +01:00
antirez
f075607239 CONFIG REWRITE: don't wipe unknown options.
With this commit options not explicitly rewritten by CONFIG REWRITE are
not touched at all. These include new options that may not have support
for REWRITE, and other special cases like rename-command and include.
2013-12-19 15:25:45 +01:00
antirez
4b44b03cb9 Example redis.conf formatted to better show appendfilename option. 2013-12-19 10:18:45 +01:00
antirez
e48365e2c2 Cluster: set n->slaves to NULL in clusterNodeResetSlaves().
The value was otherwise undefined, so next time the node was promoted
again from slave to master, adding a slave to the list of slaves
would likely crash the server or result in undefined behavior.
2013-12-17 14:50:24 +01:00
antirez
7f51cf8b56 Cluster: check link is valid before sending UPDATE. 2013-12-17 12:28:37 +01:00
antirez
c17be18035 Cluster: initialize todo_before_sleep flags to 0. 2013-12-17 12:22:02 +01:00
antirez
195aab3345 Cluster: use proper type mstime_t for ping delay var. 2013-12-17 10:27:36 +01:00
antirez
b1d3dd657d Cluster: use a hardcoded 60 sec timeout in redis-trib connections.
Later this should be configurable from the command line but at least now
we use something more appropriate for our use case compared to the
redis-rb default timeout.
2013-12-17 10:00:33 +01:00
antirez
118d0fb533 Fixed clearNodeFailureIfNeeded() time type to mstime_t.
This prevented 32bit cluster instances from clearing the FAIL flag when
needed.
2013-12-17 09:45:52 +01:00
antirez
9180fb7931 Cluster: use long long for timestamps in clusterGenNodesDescription().
Ping sent and pong received fields need to be cast to long long to be
printed correctly on 32 bit systems.
2013-12-17 09:38:11 +01:00
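A minimal sketch of the pattern this commit describes, not the body of clusterGenNodesDescription(): when a timestamp is printed with `%lld`, the argument is explicitly cast to `long long` so the format and the argument width agree regardless of the platform. The type `ts_t` and the function name are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical timestamp type; its underlying width may vary by platform. */
typedef int64_t ts_t;

/* Formats "ping_sent pong_received" into buf. The explicit casts make the
 * arguments match %lld whatever the underlying type of ts_t is.
 * Returns the number of characters snprintf() would have written. */
int format_ping_pong(char *buf, size_t len, ts_t ping_sent, ts_t pong_received) {
    assert(buf != NULL && len > 0);
    return snprintf(buf, len, "%lld %lld",
                    (long long)ping_sent, (long long)pong_received);
}
```

Passing a narrower or differently sized integer to `%lld` without the cast is undefined behavior in C, which is why the mismatch only shows up as garbage on some architectures.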
antirez
229267abd1 Makefile.dep updated. 2013-12-13 13:10:05 +01:00