200 Commits

Author SHA1 Message Date
antirez
1a7c6f5e04 Sentinel: PING trigger improved
It's ok to ping as soon as the ping period has elapsed since we received
the last PONG, but it's not good that we ping again if there is a
pending ping... With this change we'll send a new ping if there is one
pending only if two times the ping period elapsed since the ping which
is still pending was sent.
2015-05-12 17:03:53 +02:00
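
The trigger rule described in this commit message can be sketched as follows. This is a minimal illustration, not the code from the commit: the `ping_state` struct, its field names, and `should_send_ping()` are hypothetical, and the strict `>` comparison is an assumption.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified view of a link's ping state (times in ms). */
typedef struct {
    int64_t last_pong_time;    /* when the last PONG was received */
    int64_t pending_ping_time; /* when the unanswered PING was sent, 0 if none */
} ping_state;

/* Decide whether a new PING should be sent at time `now`. */
static bool should_send_ping(const ping_state *s, int64_t now,
                             int64_t ping_period) {
    if (s->pending_ping_time != 0) {
        /* A ping is still pending: wait two ping periods from the moment
         * it was sent before pinging again. */
        return (now - s->pending_ping_time) > 2 * ping_period;
    }
    /* No pending ping: one ping period since the last PONG is enough. */
    return (now - s->last_pong_time) > ping_period;
}
```

With a 1000 ms period, a link whose last PONG arrived at t=1000 is pinged again shortly after t=2000; if that ping stays unanswered, the next one is held back until more than 2000 ms have passed since it was sent.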
antirez
f54299a9e3 Sentinel: same-Sentinel link sharing across masters 2015-05-12 17:03:00 +02:00
antirez
67d19e865e Sentinel: add sentinelGetInstanceTypeString() function
This is useful for debugging and logging activities: given a
sentinelRedisInstance object returns a C string representing the
instance type: master, slave, sentinel.
2015-05-12 12:12:25 +02:00
antirez
7346c06fb0 Sentinel: add link refcount to instance description 2015-05-11 23:49:19 +02:00
antirez
c257c0b1c1 Sentinel: connection sharing WIP #1 2015-05-11 13:15:26 +02:00
antirez
fa9695e2d9 Sentinel: suppress warnings for not used args. 2015-05-08 17:17:59 +02:00
antirez
3eb45318d6 Sentinel: generate +sentinel again, removed in prev commit. 2015-05-08 17:16:48 +02:00
antirez
3cc4c341e7 Sentinel: Use privdata instead of c->data in sentinelReceiveHelloMessages()
This way we may later share the hiredis link "c" among the same Sentinel
instance referenced multiple times for multiple masters.
2015-05-08 17:16:39 +02:00
antirez
8b0615a04f Sentinel: clarify arguments of SENTINEL IS-MASTER-DOWN-BY-ADDR 2015-05-08 17:16:00 +02:00
antirez
038a39c0f8 Sentinel: don't detect duplicated Sentinels, just address switch
Since with a previous commit Sentinels now persist their unique ID, we
no longer need to detect duplicated Sentinels and re-add them. We remove
and re-add back using different events only in the case of address
switch of the same Sentinel, without generating a new +sentinel event.
2015-05-07 10:07:47 +02:00
antirez
686b84abde Sentinel: persist its unique ID across restarts.
Previously Sentinels always changed unique ID across restarts, relying
on the server.runid field. This is not a good idea, and forced Sentinel
to rely on detection of duplicated Sentinels and a potentially dangerous
clean-up and re-add operation of the Sentinel instance that was
rebooted.

Now the ID is generated at the first start and persisted in the
configuration file, so that a given Sentinel will have its unique
ID forever (unless the configuration is manually deleted or there is a
filesystem corruption).
2015-05-06 16:19:14 +02:00
therealbill
b04184f34c Making sentinel flush config on +slave
Originally, only the +slave event which occurs when a slave is
reconfigured during sentinelResetMasterAndChangeAddress triggers a flush
of the config to disk.  However, newly discovered slaves don't
apparently trigger this flush but do trigger the +slave event issuance.

So if you start up a sentinel, add a master, then add a slave to the
master (as a way to reproduce it) you'll see the +slave event issued,
but the sentinel config won't be updated with the known-slave entry.

This change makes sentinel do the flush of the config if a new slave is
detected in sentinelRefreshInstanceInfo.
2015-05-04 12:54:13 +02:00
antirez
b2d4dddaf9 Sentinel: remove useless sentinelFlushConfig() call
To rewrite the config in the loop that adds slaves back after a master
reset, in order to handle switching to another master, is useless: it
just adds latency since there is an fsync call in the inner loop,
without providing any additional guarantee, but the contrary, since if
after the first loop iteration the server crashes we end with just a
single slave entry losing all the other information.

It is wiser to rewrite the config at the end when the full new
state is configured.
2015-05-04 12:50:44 +02:00
clark.kang
0040a15d13 fix sentinel memory leak 2015-04-29 00:05:26 +09:00
Salvatore Sanfilippo
4aa4365112 Merge pull request #2386 from inkel/sentinel-add-client-command
Support CLIENT commands in Redis Sentinel
2015-03-13 18:23:36 +01:00
Salvatore Sanfilippo
b0eb128ac7 Merge pull request #2054 from mattsta/fix-set-sentinel-quorum
Sentinel: Add initial quorum bounds check
2015-02-25 10:09:40 +01:00
Salvatore Sanfilippo
b2ffd67c91 Merge pull request #1966 from mattsta/fix-sentinel-info
Sentinel: Improve INFO command behavior
2015-02-24 17:20:09 +01:00
Leandro López (inkel)
b5e5db9b86 Support CLIENT commands in Redis Sentinel
When trying to debug sentinel connections or max connections errors it
would be very useful to have the ability to see the list of connected
clients to a running sentinel. At the same time it would be very helpful
to be able to name each sentinel connection or kill offending clients.

This commit adds the already defined CLIENT commands back to Redis
Sentinel.
2015-02-02 18:16:18 -03:00
Matt Stancliff
01b7155ff5 Fix three simple clang analyzer warnings 2014-12-23 09:31:04 -05:00
Matt Stancliff
87d6324607 Add addReplyBulkSds() function
Refactor a common pattern into one function so we don't
end up with copy/paste programming.
2014-12-23 09:31:02 -05:00
Matt Stancliff
1ad036ab8f Add 'age' value to SENTINEL INFO-CACHE 2014-12-22 21:17:04 -05:00
antirez
070ec599ba sdsformatip() removed.
Specialized single-use function. Not the best match for sds.c btw.
Also genClientPeerId() is no longer static: we need symbols.
2014-12-11 18:29:04 +01:00
antirez
3d476bf2b6 AnetFormatIP(): renamed, commented, now sticks to IP:port format.
A few code style changes + consistent format: not nice for humans but
better for parsers.
2014-12-11 18:20:30 +01:00
Matt Stancliff
aca61af174 Sentinel: Improve INFO command behavior
Improvements:
  - Return empty string if asking for non-existing section (INFO foo)
  - Fix potential memory leak (caused by sdsempty() then returned if >2 args)
  - Clean up argument parsing
  - Allow "all" as valid section (same as "default" or zero args currently)
  - Move strcasecmp to end of evaluation chain in conditionals

Also, since we're C99, I moved some variable declarations to be closer
to where they are actually used (saves us from needing to free an empty info
if we detect argument errors up front).

Closes #1915
Closes #1966
2014-12-11 10:49:16 -05:00
Matt Stancliff
f7a98bdf4d Cleanup all IP formatting code
Instead of manually checking for strchr(n,':') everywhere,
we can use our new centralized IP formatting functions.
2014-12-11 10:12:18 -05:00
antirez
8a03ffa160 Sentinel: INFO-CACHE comments reworked a bit.
Changed in order to make them more review friendly, based on the
experience of reviewing the code myself.
2014-12-10 11:15:13 +01:00
antirez
3f2975ad12 Sentinel: INFO-CACHE GCC minor code cleanup.
I guess the initial goal of the initialization was to suppress GCC
warning, but if we have to initialize, we can do it with the base-case
value instead of NULL which is never retained.
2014-12-10 11:12:26 +01:00
antirez
043ae412ca Sentinel: removed useless flag var from INFO-CACHE. 2014-12-10 11:05:37 +01:00
antirez
71e4635ae9 Sentinel: INFO-CACHE reply format command shortened. 2014-12-10 11:04:24 +01:00
Matt Stancliff
c25d7ceaee Add SENTINEL INFO-CACHE [masters...]
Sentinel queries the INFO from every master and from every replica of
every master.

We can cache the INFO results in Sentinel so Sentinel can be a single
place to quickly get all INFO output for an entire Sentinel monitoring
group.

This commit gives us SENTINEL INFO-CACHE in two forms:
  - SENTINEL INFO-CACHE — returns all masters and all replicas
  - SENTINEL INFO-CACHE master0 master1 ... masterN — vararg specify masters

Results are returned as a multibulk reply with two top-level entries
for each master.  The first entry for each master is the name of the master.
The second entry is a nested multibulk reply with the contents of INFO,
first for the master, then an additional entry for each of the
replicas.
2014-11-20 16:56:30 -05:00
Matt Stancliff
2da508abce Sentinel: Add initial quorum bounds check
Fixes #2054
2014-11-20 16:30:17 -05:00
Matt Stancliff
299c667adb Clean up text throughout project
- Remove trailing newlines from redis.conf
  - Fix comment misspelling
  - Clarifies zipEncodeLength usage and a C API mention (#1243, #1242)
  - Fix cluster typos (inspired by @papanikge #1507)
  - Fix rewite -> rewrite in a few places (inspired by #682)

Closes #1243, #1242, #1507
2014-09-29 06:49:07 -04:00
antirez
5142037af2 Sentinel sentinelGetLeader() top comment improved. 2014-09-11 19:27:45 +02:00
antirez
bdf2ab1891 Sentinel: fix computation of total number of votes.
The code to check the number of voters was never updated to follow the new
Sentinel specification, so the number of voters was computed using only
the set of Sentinels that provided a vote.

This means that there is a changing majority on partitions, even if
usually the issue is not triggered because of the configured quorum
check (what was broken was the other implicit check that requires anyway
half of the known sentinels to agree in order to start a failover).
2014-09-11 18:53:31 +02:00
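
The two checks mentioned in this commit message can be sketched as below. This is an illustration under stated assumptions, not the code from the commit: `has_enough_votes()` and its parameters are hypothetical names, and "half of the known sentinels" is interpreted here as a strict majority that also counts the Sentinel doing the computation.

```c
#include <stdbool.h>

/* known_sentinels: the other Sentinels known for this master, whether or
 * not they cast a vote (hypothetical parameter). */
static bool has_enough_votes(int votes, int known_sentinels, int quorum) {
    int voters = known_sentinels + 1; /* count ourselves as well */
    int majority = voters / 2 + 1;    /* majority of all known voters */

    /* Both the implicit majority check and the configured quorum must
     * be satisfied. */
    return votes >= majority && votes >= quorum;
}
```

Because `voters` is derived from every known Sentinel rather than only from those that voted, the required majority stays the same on both sides of a partition.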
antirez
da1b1e246a Sentinel: don't set announce-ip if is empty. 2014-09-04 11:45:58 +02:00
antirez
6b132288b6 Sentinel: announce ip/port changes + rewrite.
The original implementation was modified in order to allow to
selectively announce a different IP or port, and to rewrite the two
options in the config file after a rewrite.
2014-09-04 11:23:31 +02:00
Dara Kong
2690dc2474 sentinel: Decouple bind address from address sent to other sentinels
There are instances such as EC2 where the bind address is private
(behind a NAT) and cannot be accessible from WAN.

https://groups.google.com/d/msg/redis-db/PVVvjO4nMd0/P3oWC036v3cJ
2014-09-04 10:54:21 +02:00
Matt Stancliff
784f4d7a5a Sentinel: Abort Hello quicker if not connected
We can save a little work by aborting when we enter the function
if we're disconnected.
2014-09-01 16:34:06 +02:00
Matt Stancliff
6b4823782d Rename two 'buf' vars to 'ip' for better clarity
Clearly ip[32] is wrong, but it's less clear that buf[32] was wrong
without further reading.
2014-08-25 10:16:20 +02:00
Eiichi Sato
090afef1c4 Sentinel: fix bufsize to support IPv6 address
Closes #1914
2014-08-25 10:15:43 +02:00
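
A small sketch of why a 32-byte buffer is too short for IPv6 text. The helper `ip_fits()` and both constants are hypothetical names for illustration; the value 46 is the same as POSIX `INET6_ADDRSTRLEN`, and whether the commit uses exactly that size is an assumption.

```c
#include <stdbool.h>
#include <string.h>

#define OLD_IP_BUF_LEN 32 /* the size the commit messages call wrong */
#define IP_STR_LEN 46     /* assumed replacement, same value as INET6_ADDRSTRLEN */

/* True if `ip` plus its terminating NUL fits in a buffer of `buflen` bytes. */
static bool ip_fits(const char *ip, size_t buflen) {
    return strlen(ip) < buflen;
}
```

A fully written IPv6 address such as `2001:0db8:85a3:0000:0000:8a2e:0370:7334` is 39 characters long, so it does not fit in 32 bytes but does fit in 46.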
antirez
2e94ffb1d1 Remove warnings and improve integer sign correctness. 2014-08-13 11:44:38 +02:00
antirez
d7c72531d4 Sentinel implementation of ROLE. 2014-06-23 12:07:41 +02:00
Matt Stancliff
cac13411b8 Sentinel: bind source address
Some deployments need traffic sent from a specific address.  This
change uses the same policy as Cluster where the first listed bindaddr
becomes the source address for outgoing Sentinel communication.

Fixes #1667
2014-06-23 11:44:35 +02:00
antirez
f3b16dde10 Sentinel: send hello messages ASAP after config change.
Eventual configuration convergence is guaranteed by our periodic hello
messages to all the instances, however when there are important notices
to share, better make a phone call. With this commit we force a hello
message to other Sentinel and Redis instances within the next 100
milliseconds of a config update, which is practically better than
waiting a few seconds.
2014-06-19 15:17:06 +02:00
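
One way to get this effect is to backdate the time of the last hello so that the next periodic check treats it as overdue. The sketch below only illustrates that idea and is not taken from the commit: `hello_state`, `hello_due()`, `force_hello_soon()`, and the 2000 ms period are all assumptions.

```c
#include <stdbool.h>
#include <stdint.h>

#define HELLO_PERIOD_MS 2000 /* assumed regular hello period */

/* Hypothetical per-instance state (times in ms). */
typedef struct {
    int64_t last_hello_time; /* when the last hello message was sent */
} hello_state;

/* Checked by the periodic timer: is a new hello message due? */
static bool hello_due(const hello_state *s, int64_t now) {
    return (now - s->last_hello_time) > HELLO_PERIOD_MS;
}

/* Called after a config update: backdate the last hello so that the very
 * next timer tick sends one instead of waiting for the full period. */
static void force_hello_soon(hello_state *s) {
    if (s->last_hello_time >= HELLO_PERIOD_MS + 1)
        s->last_hello_time -= HELLO_PERIOD_MS + 1;
}
```

With this approach the actual delay is bounded by how often the timer runs, not by the hello period.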
antirez
f254e2077a Sentinel: handle SRI_PROMOTED flag correctly.
Lack of check of the SRI_PROMOTED flag caused Sentinel to act with the
promoted slave turned into a master during failover like if it was a
normal instance.

Normally this problem was not apparent because during real failovers the
old master is down so the bugged code path was not entered, however with
manual failovers via the SENTINEL FAILOVER command, the problem was
easily triggered.

This commit prevents promoted slaves from getting reconfigured, moreover
we now explicitly check that during a failover the slave turning into a
master is the one we selected for promotion and not a different one.
2014-06-19 10:28:27 +02:00
antirez
faf07a72d5 Sentinel: send SLAVEOF with MULTI, CLIENT KILL, CONFIG REWRITE.
This implements the new Sentinel-Client protocol for the Sentinel part:
now instances are reconfigured using a transaction that ensures that the
config is rewritten in the target instance, and that clients lose the
connection with the instance, in order to be forced to: ask Sentinel,
reconnect to the instance, and verify the instance role with the new
ROLE command.
2014-06-17 11:03:21 +02:00
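
The transaction described above might look like the command sequence produced by this sketch. The helper `format_reconfig_transaction()` is hypothetical, and the command order and the `TYPE normal` argument to CLIENT KILL are assumptions; the commit message only names the commands involved.

```c
#include <stdio.h>

/* Write the reconfiguration transaction, one command per line, into `buf`.
 * Returns the value of snprintf(): the number of characters that the full
 * text needs, excluding the terminating NUL. */
static int format_reconfig_transaction(char *buf, size_t len,
                                       const char *host, int port) {
    return snprintf(buf, len,
                    "MULTI\n"
                    "SLAVEOF %s %d\n"           /* point the instance at the new master */
                    "CONFIG REWRITE\n"          /* persist the new role in its config */
                    "CLIENT KILL TYPE normal\n" /* assumed form: drop regular clients */
                    "EXEC\n",
                    host, port);
}
```

Wrapping the commands in MULTI/EXEC means the role change, the config rewrite, and the client disconnection are applied together on the target instance.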
antirez
9dd14ee1d1 More trailing spaces in sentinel.c removed. 2014-05-28 15:46:05 +02:00
antirez
8b7981725b Remove trailing spaces from sentinel.c. 2014-05-20 14:22:42 +02:00
antirez
1a32a0f9a0 Sentinel: log when a failover will be attempted again.
When a Sentinel performs a failover (successful or not), or when a
Sentinel votes for a different Sentinel trying to start a failover, it
sets a min delay before it will try to get elected for a failover.

While not strictly needed, because if multiple Sentinels will try
to failover the same master at the same time, only one configuration
will eventually win, this serialization is practically very useful.
Normal failovers are cleaner: one Sentinel starts to failover, the
others update their config when the Sentinel performing the failover
is able to get the selected slave to move from the role of slave to the
one of master.

However currently this timeout was implicit, so users could see
Sentinels not reacting, after a failed failover, for some time, without
giving any feedback in the logs to the poor sysadmin waiting for clues.

This commit makes Sentinels more verbose about the delay: when a master
is down and a failover attempt is not performed because the delay has
still not elapsed, something like that will be logged:

    Next failover delay: I will not start a failover
    before Thu May  8 16:48:59 2014
2014-05-08 16:38:53 +02:00
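
A log line in the shape shown above could be produced along these lines. This is a sketch only: `format_failover_delay_msg()` and the `not_before` parameter are hypothetical, how the deadline is computed is not stated in the commit message, and UTC is used here purely to keep the example deterministic (the sample in the message looks like local time).

```c
#include <stdio.h>
#include <time.h>

/* Format the "next failover delay" message for the given deadline.
 * Returns the snprintf() result, or -1 if the time cannot be formatted. */
static int format_failover_delay_msg(char *buf, size_t len, time_t not_before) {
    char when[32];
    struct tm *tm = gmtime(&not_before); /* UTC: an assumption of this sketch */

    if (tm == NULL) return -1;
    /* Same layout as ctime(), without the trailing newline. */
    if (strftime(when, sizeof when, "%a %b %e %H:%M:%S %Y", tm) == 0)
        return -1;
    return snprintf(buf, len,
                    "Next failover delay: I will not start a failover before %s",
                    when);
}
```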
antirez
7ef2b30677 Sentinel: generate +config-update-from event when a new config is received.
This event makes clear, before the switch-master event is generated,
that a Sentinel received a configuration update from another Sentinel.
2014-05-08 15:59:34 +02:00