27431 Commits

Author SHA1 Message Date
antirez
adbba45d5d Sentinel: test for writable config file.
This commit introduces a funciton called when Sentinel is ready for
normal operations to avoid putting Sentinel specific stuff in redis.c.
2013-11-21 12:28:15 +01:00
antirez
297de1ab26 Sentinel: test for writable config file.
This commit introduces a funciton called when Sentinel is ready for
normal operations to avoid putting Sentinel specific stuff in redis.c.
2013-11-21 12:28:15 +01:00
antirez
98f08fa3ab Sentinel: check for disconnected links in sentinelSendHello().
Does not fix any bug as the test is performed by the caller, but better
to have the check.
2013-11-21 11:35:50 +01:00
antirez
d920177f8d Sentinel: check for disconnected links in sentinelSendHello().
Does not fix any bug as the test is performed by the caller, but better
to have the check.
2013-11-21 11:35:50 +01:00
antirez
221d4d48f4 Sentinel: Hello message sending code refactored. 2013-11-21 11:31:06 +01:00
antirez
8810167d13 Sentinel: Hello message sending code refactored. 2013-11-21 11:31:06 +01:00
antirez
3f92ee09ae Sentinel: select slave with best (greater) replication offset. 2013-11-20 16:05:36 +01:00
antirez
0101c2bcfe Sentinel: select slave with best (greater) replication offset. 2013-11-20 16:05:36 +01:00
antirez
3ea52291d9 Sentinel: take the replication offset in slaves state. 2013-11-20 15:53:21 +01:00
antirez
a6ebd910d8 Sentinel: take the replication offset in slaves state. 2013-11-20 15:53:21 +01:00
antirez
19f625ed5c CONFIG REWRITE: don't add the signature if it already exists.
At the end of the file, CONFIG REWRITE adds a comment line that:

    # Generated by CONFIG REWRITE

Followed by the additional config options required. However this was
added again and again at every rewrite in praticular conditions (when a
given set of options change in a given time during the time).

Now if it was alrady encountered, it is not added a second time.

This is especially important for Sentinel that rewrites the config at
every state change.
2013-11-19 17:58:11 +01:00
antirez
b1f5a0b3ec CONFIG REWRITE: don't add the signature if it already exists.
At the end of the file, CONFIG REWRITE adds a comment line that:

    # Generated by CONFIG REWRITE

Followed by the additional config options required. However this was
added again and again at every rewrite in praticular conditions (when a
given set of options change in a given time during the time).

Now if it was alrady encountered, it is not added a second time.

This is especially important for Sentinel that rewrites the config at
every state change.
2013-11-19 17:58:11 +01:00
antirez
5d77fe69c7 Sentinel: distinguish between is-master-down-by-addr requests.
Some are just to know if the master is down, and in this case the runid
in the request is set to "*", others are actually in order to seek for a
vote and get elected. In the latter case the runid is set to the runid
of the instance seeking for the vote.
2013-11-19 16:50:04 +01:00
antirez
37a51a2568 Sentinel: distinguish between is-master-down-by-addr requests.
Some are just to know if the master is down, and in this case the runid
in the request is set to "*", others are actually in order to seek for a
vote and get elected. In the latter case the runid is set to the runid
of the instance seeking for the vote.
2013-11-19 16:50:04 +01:00
antirez
9bae762af3 Sentinel: various fixes to leader election implementation. 2013-11-19 16:20:42 +01:00
antirez
b22d1beea0 Sentinel: various fixes to leader election implementation. 2013-11-19 16:20:42 +01:00
antirez
101f583689 Sentinel: failover script execution fixed. 2013-11-19 12:34:46 +01:00
antirez
1f9728cb20 Sentinel: failover script execution fixed. 2013-11-19 12:34:46 +01:00
antirez
02b42dc7c7 Sentinel: no longer used defines removed. 2013-11-19 11:24:36 +01:00
antirez
90635488ce Sentinel: no longer used defines removed. 2013-11-19 11:24:36 +01:00
antirez
934e4d103f Sentinel: when writing config on disk, remember sentinels runid. 2013-11-19 11:11:43 +01:00
antirez
0a35f65301 Sentinel: when writing config on disk, remember sentinels runid. 2013-11-19 11:11:43 +01:00
antirez
8ca008692f Sentinel: arity of known-sentinel/slave is 4 not 3. 2013-11-19 11:03:47 +01:00
antirez
5450833d02 Sentinel: arity of known-sentinel/slave is 4 not 3. 2013-11-19 11:03:47 +01:00
antirez
8b7b010580 Sentinel: rewriteConfigSentinelOption() sub-iterators var typo fixed. 2013-11-19 10:59:50 +01:00
antirez
b8a94463b7 Sentinel: rewriteConfigSentinelOption() sub-iterators var typo fixed. 2013-11-19 10:59:50 +01:00
antirez
88b2f6525e Sentinel: call sentinelFlushConfig() to persist state when needed.
Also the sentinel configuration rewriting was modified in order to
account for failover in progress, where we need to provide the promoted
slave address as master address, and the old master address as one of
the slaves address.
2013-11-19 10:55:43 +01:00
antirez
16237d78c8 Sentinel: call sentinelFlushConfig() to persist state when needed.
Also the sentinel configuration rewriting was modified in order to
account for failover in progress, where we need to provide the promoted
slave address as master address, and the old master address as one of
the slaves address.
2013-11-19 10:55:43 +01:00
antirez
d345a59943 Sentinel: sentinelFlushConfig() to CONFIG REWRITE + fsync. 2013-11-19 10:13:04 +01:00
antirez
e257ab2bfe Sentinel: sentinelFlushConfig() to CONFIG REWRITE + fsync. 2013-11-19 10:13:04 +01:00
antirez
45666c4c22 Sentinel: CONFIG REWRITE support for Sentinel config. 2013-11-19 09:48:12 +01:00
antirez
5998769c28 Sentinel: CONFIG REWRITE support for Sentinel config. 2013-11-19 09:48:12 +01:00
antirez
2c5afa88c7 Sentinel: can-failover option removed, many comments fixed. 2013-11-19 09:28:47 +01:00
antirez
47df12d5d9 Sentinel: can-failover option removed, many comments fixed. 2013-11-19 09:28:47 +01:00
antirez
33ccfbb624 Fix typo 'configuraiton' in rewriteConfigRewriteLine() comment. 2013-11-18 18:18:10 +01:00
antirez
cd4ff9992b Fix typo 'configuraiton' in rewriteConfigRewriteLine() comment. 2013-11-18 18:18:10 +01:00
antirez
a6c9d2d796 Sentinel: added config options useful to take state on config rewrite.
We'll use CONFIG REWRITE (internally) in order to store the new
configuration of a Sentinel after the internal state changes. In order
to do so, we need configuration options (that usually the user will not
touch at all) about config epoch of the master, Sentinels and Slaves
known for this master, and so forth.
2013-11-18 16:03:03 +01:00
antirez
232cdb95ab Sentinel: added config options useful to take state on config rewrite.
We'll use CONFIG REWRITE (internally) in order to store the new
configuration of a Sentinel after the internal state changes. In order
to do so, we need configuration options (that usually the user will not
touch at all) about config epoch of the master, Sentinels and Slaves
known for this master, and so forth.
2013-11-18 16:03:03 +01:00
antirez
9cc4330b06 Sentinel: failover abort function simplified. 2013-11-18 11:43:35 +01:00
antirez
3a374b0511 Sentinel: failover abort function simplified. 2013-11-18 11:43:35 +01:00
antirez
5196131fb9 Sentinel: slaves reconfig delay modified.
The time Sentinel waits since the slave is detected to be configured to
the wrong master, before reconfiguring it, is now the failover_timeout
time as this makes more sense in order to give the Sentinel performing
the failover enoung time to reconfigure the slaves slowly (if required
by the configuration).

Also we now PUBLISH more frequently the new configuraiton as this allows
to switch the reapprearing master back to slave faster.
2013-11-18 11:37:24 +01:00
antirez
e0750acf11 Sentinel: slaves reconfig delay modified.
The time Sentinel waits since the slave is detected to be configured to
the wrong master, before reconfiguring it, is now the failover_timeout
time as this makes more sense in order to give the Sentinel performing
the failover enoung time to reconfigure the slaves slowly (if required
by the configuration).

Also we now PUBLISH more frequently the new configuraiton as this allows
to switch the reapprearing master back to slave faster.
2013-11-18 11:37:24 +01:00
antirez
16bc1ae5f4 Sentinel: failover restart time is now multiple of failover timeout.
Also defaulf failover timeout changed to 3 minutes as the failover is a
fairly fast procedure most of the times, unless there are a very big
number of slaves and the user picked to configure them sequentially (in
that case the user should change the failover timeout accordingly).
2013-11-18 11:30:08 +01:00
antirez
83316f515c Sentinel: failover restart time is now multiple of failover timeout.
Also defaulf failover timeout changed to 3 minutes as the failover is a
fairly fast procedure most of the times, unless there are a very big
number of slaves and the user picked to configure them sequentially (in
that case the user should change the failover timeout accordingly).
2013-11-18 11:30:08 +01:00
antirez
7b7763ff3e Sentinel: state machine and timeouts simplified. 2013-11-18 11:12:58 +01:00
antirez
3a56013acb Sentinel: state machine and timeouts simplified. 2013-11-18 11:12:58 +01:00
antirez
00cad98228 Sentinel: election timeout define. 2013-11-18 10:08:06 +01:00
antirez
4be53b1c5d Sentinel: election timeout define. 2013-11-18 10:08:06 +01:00
antirez
297df0e910 Sentinel: fix address of master in Hello messages.
Once we switched configuration during a failover, we should advertise
the new address.

This was a serious race condition as the Sentinel performing the
failover for a moment advertised the old address with the new
configuration epoch: once trasmitted to the other Sentinels the broken
configuration would remain there forever, until the next failover
(because a greater configuration epoch is required to overwrite an older
one).
2013-11-14 10:25:55 +01:00
antirez
69d826a354 Sentinel: fix address of master in Hello messages.
Once we switched configuration during a failover, we should advertise
the new address.

This was a serious race condition as the Sentinel performing the
failover for a moment advertised the old address with the new
configuration epoch: once trasmitted to the other Sentinels the broken
configuration would remain there forever, until the next failover
(because a greater configuration epoch is required to overwrite an older
one).
2013-11-14 10:25:55 +01:00