226 Commits

Author SHA1 Message Date
antirez
1a88341fb6 Sentinel: allow SHUTDOWN command in Sentinel mode. 2014-02-07 11:22:24 +01:00
antirez
b9c51f518b Sentinel: check arity for SENTINEL MASTER command.
This fixes issue #1530.
2014-01-31 10:13:38 +01:00
antirez
21b4d6c23e SENTINEL SET master quorum implemented. 2014-01-14 09:23:26 +01:00
antirez
b279e578fa SENTINEL SET: error on bad option name + flush config on error. 2014-01-13 11:55:57 +01:00
antirez
74f84e3a3d SENTINEL SET implemented.
The new command allows changing master-specific configurations
at runtime. All the settable parameters can be retrieved via the
SENTINEL MASTER command, so there is no equivalent "GET" command.
2014-01-13 11:53:29 +01:00
antirez
1642da19bb Sentinel: fix wrong arity error message. 2014-01-13 11:05:13 +01:00
antirez
2c6c1b1271 Sentinel: SENTINEL REMOVE command added.
The command totally removes a monitored master.
2014-01-10 15:39:36 +01:00
antirez
2306608167 Sentinel: releaseSentinelRedisInstance() top comment fixed.
The claim about unlinking the instance from the connected hash tables
was the opposite of the reality. Also the current actual behavior is
safer in most cases, so it is better to manually unlink when needed.
2014-01-10 15:33:42 +01:00
antirez
282b2b4660 Sentinel: flush config on disk when new master is added. 2014-01-10 15:22:06 +01:00
antirez
61302ba560 Sentinel: SENTINEL MONITOR command implemented.
It allows adding new masters to monitor at runtime.
2014-01-10 15:18:24 +01:00
antirez
7d7e3f00e0 Sentinel: added SENTINEL MASTER <name> command.
With SENTINEL MASTERS it was already possible to list all the configured
masters, but not a specific one.
2014-01-10 14:41:52 +01:00
antirez
46429f36a7 Add all the configurable fields to addReplySentinelRedisInstance().
Note: the auth password with the master is intentionally not exposed.
2014-01-10 14:31:41 +01:00
antirez
1f73921d24 Trim comment to 80 cols in SentinelCommand(). 2014-01-10 14:13:04 +01:00
antirez
134b4e97e7 Sentinel: dead code removed. 2013-12-13 11:01:13 +01:00
antirez
247a311317 dict.c: added optional callback to dictEmpty().
Redis hash table implementation has many non-blocking features like
incremental rehashing, however while deleting a large hash table there
was no way to have a callback called to do some incremental work.

This commit adds this support, as an optional callback argument to
dictEmpty() that is currently called at a fixed interval (one time every
65k deletions).
2013-12-10 18:46:24 +01:00
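The callback idea described in the dictEmpty() commit above can be sketched roughly as follows. This is a toy illustration, not the dict.c code: toy_table, toy_table_create(), toy_table_empty() and count_ticks() are hypothetical names, and the table is a flat array rather than a real hash table. The only detail taken from the commit message is that the optional callback fires at a fixed interval of about 65k deletions.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical toy table: a flat array of heap-allocated entries. */
typedef struct toy_table {
    void **slots;
    size_t size;
} toy_table;

/* Callback type: receives caller-supplied private data. */
typedef void (toy_empty_callback)(void *privdata);

/* Build a table with `size` one-byte entries (illustration only). */
toy_table toy_table_create(size_t size) {
    toy_table t;
    t.size = size;
    t.slots = malloc(size * sizeof(void *));
    if (t.slots == NULL) {
        t.size = 0;
        return t;
    }
    for (size_t i = 0; i < size; i++) t.slots[i] = malloc(1);
    return t;
}

/* Free every entry. If `callback` is not NULL it is invoked once every
 * 65536 slots, so the caller can do some incremental work while a large
 * table is being deleted. Returns the number of callback invocations. */
size_t toy_table_empty(toy_table *t, toy_empty_callback *callback,
                       void *privdata) {
    size_t calls = 0;
    for (size_t i = 0; i < t->size; i++) {
        if (callback && (i & 65535) == 0) {
            callback(privdata);
            calls++;
        }
        free(t->slots[i]);
        t->slots[i] = NULL;
    }
    return calls;
}

/* Example callback: increments a counter passed as private data. */
void count_ticks(void *privdata) {
    (*(size_t *)privdata)++;
}
```

A caller would pass something like count_ticks (or NULL when no incremental work is needed) and free the slots array afterwards.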
antirez
b6d79f34e8 Sentinel: fix reported role info sampling.
The way the role change was recorded was not sane and too
convoluted, causing the role information not to be always updated.

This commit fixes issue #1445.
2013-12-06 12:46:56 +01:00
antirez
33ea913329 Sentinel: fix reported role fields when master is reset.
When there is a master address switch, the reported role must be set to
master so that we have a chance to re-sample the INFO output to check if
the new address is reporting the right role.

Otherwise, if the role was wrong, it will be sensed as wrong even after
the address switch, and for long enough, according to the role change
time, for Sentinel to consider the master SDOWN.

This fixes issue #1446, which describes the effects of this bug in
practice.
2013-12-06 11:37:46 +01:00
antirez
7a5a646df9 Fixed grammar: before H the article is a, not an. 2013-12-05 16:35:32 +01:00
antirez
6fc6c6bda9 Sentinel: don't write HZ when flushing config.
See issue #1419.
2013-12-02 15:56:10 +01:00
antirez
4df452caf6 Sentinel: better time desynchronization.
Sentinels are now desynchronized in a better way by changing the time
handler frequency between 10 and 20 HZ. This way on average a
desynchronization of 25 milliseconds is produced that should be large
enough compared to network latency, avoiding most split-brain conditions
during the vote.

Now that the clocks are desynchronized, larger random delays when
performing operations can be easily achieved in the following way.
Take as an example the function that starts the failover, which is
called with a frequency between 10 and 20 HZ and will start the
failover every time the conditions are met. By just adding as an
additional condition something like rand()%4 == 0, we can amplify the
desynchronization between Sentinel instances easily.

See issue #1419.
2013-12-02 12:29:42 +01:00
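A minimal sketch of the desynchronization idea from the commit above, under stated assumptions: pick_timer_hz(), timer_period_ms() and should_start_failover() are hypothetical helpers, not functions from sentinel.c, and the range here is 10 to 19 HZ because it is computed as a base of 10 plus rand() % 10. Only the "between 10 and 20 HZ" range and the rand()%4 == 0 condition come from the commit message.

```c
#include <stdlib.h>

#define BASE_HZ 10 /* assumed base frequency of the time handler */

/* Pick a timer frequency in the range [10, 19] HZ, so that different
 * Sentinel processes are unlikely to fire their timers in lockstep. */
int pick_timer_hz(void) {
    return BASE_HZ + rand() % BASE_HZ;
}

/* Period of the timer in milliseconds for a given frequency. */
int timer_period_ms(int hz) {
    return 1000 / hz;
}

/* Amplify the desynchronization: even when the failover conditions are
 * met, act only on roughly one timer call out of four. */
int should_start_failover(int conditions_met) {
    return conditions_met && (rand() % 4 == 0);
}
```

With these frequencies the timer period falls between roughly 52 and 100 milliseconds, and the extra random condition spreads the actual start of an operation over several timer calls.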
antirez
394bccd137 Sentinel: log vote received from other Sentinels. 2013-11-28 15:23:46 +01:00
huangz1990
a1979d9d55 fix a bug in sentinel.c about pub/sub link 2013-11-26 19:55:51 +08:00
antirez
2995302165 Sentinel: fixes inverted strcmp() test preventing config updates.
The result of this one-char bug was pretty serious: if the new master
had the same port as the previous master, but just a different IP
address, non-leader Sentinels would not be able to recognize the
configuration change.

This commit fixes issue #1394.

Many thanks to @shanemadden who reported the bug and helped
investigate it.
2013-11-25 10:59:53 +01:00
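The kind of one-character mistake described in the commit above can be illustrated as follows. This is an assumed reconstruction for illustration, not the actual sentinel.c code: both function names and their parameters are hypothetical. strcmp() returns 0 when the strings are equal, so negating its result tests for equality rather than for a difference.

```c
#include <string.h>

/* Intended test: the address changed if the port differs OR the IP
 * string differs (strcmp() returns non-zero for different strings). */
int master_addr_changed(const char *cur_ip, int cur_port,
                        const char *new_ip, int new_port) {
    return cur_port != new_port || strcmp(cur_ip, new_ip) != 0;
}

/* Inverted variant, shown only for contrast: "!strcmp()" is true when
 * the IPs are EQUAL, so a change of IP with the same port goes
 * unnoticed. */
int master_addr_changed_inverted(const char *cur_ip, int cur_port,
                                 const char *new_ip, int new_port) {
    return cur_port != new_port || !strcmp(cur_ip, new_ip);
}
```

With the same port and a different IP, the first function reports a change while the inverted one does not, which matches the symptom described in the commit message.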
antirez
90bacd032e Sentinel: fix type specifier for Hello msg generation.
This fixes issue #1395.
2013-11-25 10:24:34 +01:00
antirez
e8b13dc679 Sentinel: different comments updated to new implementation. 2013-11-21 16:22:59 +01:00
antirez
6feb6cfdf8 Sentinel: cleanup around SENTINEL_INFO_VALIDITY_TIME. 2013-11-21 16:05:41 +01:00
antirez
0fa5d0e537 Sentinel: removed mem leak and useless code. 2013-11-21 15:43:55 +01:00
antirez
166b380011 Sentinel: manual failover works again. 2013-11-21 12:39:47 +01:00
antirez
adbba45d5d Sentinel: test for writable config file.
This commit introduces a function called when Sentinel is ready for
normal operations to avoid putting Sentinel specific stuff in redis.c.
2013-11-21 12:28:15 +01:00
antirez
98f08fa3ab Sentinel: check for disconnected links in sentinelSendHello().
Does not fix any bug as the test is performed by the caller, but better
to have the check.
2013-11-21 11:35:50 +01:00
antirez
221d4d48f4 Sentinel: Hello message sending code refactored. 2013-11-21 11:31:06 +01:00
antirez
3f92ee09ae Sentinel: select slave with best (greater) replication offset. 2013-11-20 16:05:36 +01:00
antirez
3ea52291d9 Sentinel: take the replication offset in slaves state. 2013-11-20 15:53:21 +01:00
antirez
5d77fe69c7 Sentinel: distinguish between is-master-down-by-addr requests.
Some are just to know if the master is down, and in this case the runid
in the request is set to "*"; others are actually sent in order to seek a
vote and get elected. In the latter case the runid is set to the runid
of the instance seeking the vote.
2013-11-19 16:50:04 +01:00
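The distinction described in the commit above could be sketched like this. The enum and function names are hypothetical and the sketch only covers the runid check mentioned in the message: a runid of "*" means a plain down-state query, anything else is treated as the runid of a Sentinel asking for a vote.

```c
#include <string.h>

/* Hypothetical classification of is-master-down-by-addr requests. */
typedef enum {
    REQ_QUERY_DOWN_STATE, /* runid is "*": just asking if master is down */
    REQ_ASK_VOTE          /* runid is the requester's runid: wants a vote */
} req_kind;

/* Classify a request by looking at the runid argument only. */
req_kind classify_request(const char *runid) {
    return strcmp(runid, "*") == 0 ? REQ_QUERY_DOWN_STATE : REQ_ASK_VOTE;
}
```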
antirez
9bae762af3 Sentinel: various fixes to leader election implementation. 2013-11-19 16:20:42 +01:00
antirez
101f583689 Sentinel: failover script execution fixed. 2013-11-19 12:34:46 +01:00
antirez
02b42dc7c7 Sentinel: no longer used defines removed. 2013-11-19 11:24:36 +01:00
antirez
934e4d103f Sentinel: when writing config on disk, remember sentinels runid. 2013-11-19 11:11:43 +01:00
antirez
8ca008692f Sentinel: arity of known-sentinel/slave is 4 not 3. 2013-11-19 11:03:47 +01:00
antirez
8b7b010580 Sentinel: rewriteConfigSentinelOption() sub-iterators var typo fixed. 2013-11-19 10:59:50 +01:00
antirez
88b2f6525e Sentinel: call sentinelFlushConfig() to persist state when needed.
Also the sentinel configuration rewriting was modified in order to
account for a failover in progress, where we need to provide the promoted
slave address as the master address, and the old master address as one of
the slave addresses.
2013-11-19 10:55:43 +01:00
antirez
d345a59943 Sentinel: sentinelFlushConfig() to CONFIG REWRITE + fsync. 2013-11-19 10:13:04 +01:00
antirez
45666c4c22 Sentinel: CONFIG REWRITE support for Sentinel config. 2013-11-19 09:48:12 +01:00
antirez
2c5afa88c7 Sentinel: can-failover option removed, many comments fixed. 2013-11-19 09:28:47 +01:00
antirez
a6c9d2d796 Sentinel: added config options useful to take state on config rewrite.
We'll use CONFIG REWRITE (internally) in order to store the new
configuration of a Sentinel after the internal state changes. In order
to do so, we need configuration options (that usually the user will not
touch at all) about config epoch of the master, Sentinels and Slaves
known for this master, and so forth.
2013-11-18 16:03:03 +01:00
antirez
9cc4330b06 Sentinel: failover abort function simplified. 2013-11-18 11:43:35 +01:00
antirez
5196131fb9 Sentinel: slaves reconfig delay modified.
The time Sentinel waits after the slave is detected to be configured to
the wrong master, before reconfiguring it, is now the failover_timeout
time as this makes more sense in order to give the Sentinel performing
the failover enough time to reconfigure the slaves slowly (if required
by the configuration).

Also we now PUBLISH the new configuration more frequently as this allows
the reappearing master to be switched back to slave faster.
2013-11-18 11:37:24 +01:00
antirez
16bc1ae5f4 Sentinel: failover restart time is now multiple of failover timeout.
Also the default failover timeout changed to 3 minutes as the failover is a
fairly fast procedure most of the time, unless there is a very large
number of slaves and the user chose to configure them sequentially (in
that case the user should change the failover timeout accordingly).
2013-11-18 11:30:08 +01:00
antirez
7b7763ff3e Sentinel: state machine and timeouts simplified. 2013-11-18 11:12:58 +01:00
antirez
00cad98228 Sentinel: election timeout define. 2013-11-18 10:08:06 +01:00