48 Commits

Author SHA1 Message Date
Yossi Gottlieb
10ffeb03e4 TLS: Connections refactoring and TLS support.
* Introduce a connection abstraction layer for all socket operations and
integrate it across the code base.
* Provide an optional TLS connections implementation based on OpenSSL.
* Pull a newer version of hiredis with TLS support.
* Tests, redis-cli updates for TLS support.
2019-10-07 21:06:13 +03:00
antirez
ebf0968a38 Module cluster flags: add hooks for NO_REDIRECTION flag. 2018-09-19 11:31:22 +02:00
antirez
96ceafbe4d Module cluster flags: initial vars / defines added. 2018-09-19 11:20:52 +02:00
Jack Drogon
bae1d36e5d Fix typo 2018-07-03 18:19:46 +02:00
antirez
72f11ded18 Modules Cluster API: message bus implementation. 2018-03-29 15:13:31 +02:00
antirez
b94379e29a Cluster: ability to prevent slaves from failing over their masters.
This commit, in some parts derived from PR #3041 which is no longer
possible to merge (because the user deleted the original branch),
implements the ability of slaves to have a special configuration
preventing that they try to start a failover when the master is failing.

There are multiple reasons for wanting this, and the feautre was
requested in issue #3021 time ago.

The differences between this patch and the original PR are the
following:

1. The flag is saved/loaded on the nodes configuration.
2. The 'myself' node is now flag-aware, the flag is updated as needed
   when the configuration is changed via CONFIG SET.
3. The flag name uses NOFAILOVER instead of NO_FAILOVER to be consistent
   with existing NOADDR.
4. The redis.conf documentation was rewritten.

Thanks to @deep011 for the original patch.
2018-03-14 14:01:38 +01:00
Antonio Mallia
c8ef3bc7d1 Fixed comment in clusterMsg version field 2017-06-04 15:09:05 +01:00
antirez
a3c3e86851 Cluster: always add PFAIL nodes at end of gossip section.
To rely on the fact that nodes in PFAIL state will be shared around by
randomly adding them in the gossip section is a weak assumption,
especially after changes related to sending less ping/pong packets.

We want to always include gossip entries for all the nodes that are in
PFAIL state, so that the PFAIL -> FAIL state promotion can happen much
faster and reliably.

Related to #3929.
2017-04-14 13:39:49 +02:00
antirez
6eaf3ffe4c Cluster: collect more specific bus messages stats.
First step in order to change Cluster in order to use less messages.
Related to issue #3929.
2017-04-13 19:22:35 +02:00
antirez
7ff9124636 Cluster: hash slots tracking using a radix tree. 2017-03-27 16:37:22 +02:00
antirez
51a16c9ca0 Cluster announce ip / port initial implementation. 2016-01-29 09:06:37 +01:00
antirez
bb2f057d6f Cluster: add announce ip field in messages header. 2016-01-29 09:06:36 +01:00
antirez
5318c14c3c Cluster: clarify node->slave may be NULL. 2016-01-14 11:58:31 +01:00
antirez
60daa127fd Cluster: replica migration with delay.
We wait a fixed amount of time (5 seconds currently) much greater than
the usual Cluster node to node communication latency, before migrating.
This way when a failover occurs, before detecting the new master as a
target for migration, we give the time to its natural slaves (the slaves
of the failed over master) to announce they switched to the new master,
preventing an useless migration operation.
2015-12-11 09:19:06 +01:00
antirez
923098bbda Fix replicas migration by adding a new flag.
Some time ago I broken replicas migration (reported in #2924).
The idea was to prevent masters without replicas from getting replicas
because of replica migration, I remember it to create issues with tests,
but there is no clue in the commit message about why it was so
undesirable.

However my patch as a side effect totally ruined the concept of replicas
migration since we want it to work also for instances that, technically,
never had slaves in the past: promoted slaves.

So now instead the ability to be targeted by replicas migration, is a
new flag "migrate-to". It only applies to masters, and is set in the
following two cases:

1. When a master gets a slave, it is set.
2. When a slave turns into a master because of fail over, it is set.

This way replicas migration targets are only masters that used to have
slaves, and slaves of masters (that used to have slaves... obviously)
and are promoted.

The new flag is only internal, and is never exposed in the output nor
persisted in the nodes configuration, since all the information to
handle it are implicit in the cluster configuration we already have.
2015-12-09 23:03:18 +01:00
antirez
3064487047 RDMF: more names updated. 2015-07-27 15:03:10 +02:00
antirez
c15cac0d77 RDMF: More consistent define names. 2015-07-27 14:37:58 +02:00
antirez
fa26d3dd63 RDMF: use client instead of redisClient, like Disque. 2015-07-26 15:20:52 +02:00
antirez
64ae753eb0 Cluster: redirection refactoring + handling of blocked clients.
There was a bug in Redis Cluster caused by clients blocked in a blocking
list pop operation, for keys no longer handled by the instance, or
in a condition where the cluster became down after the client blocked.

A typical situation is:

1) BLPOP <somekey> 0
2) <somekey> hash slot is resharded to another master.

The client will block forever int this case.

A symmentrical non-cluster-specific bug happens when an instance is
turned from master to slave. In that case it is more serious since this
will desynchronize data between slaves and masters. This other bug was
discovered as a side effect of thinking about the bug explained and
fixed in this commit, but will be fixed in a separated commit.
2015-03-24 11:56:24 +01:00
antirez
99e8cc230d Cluster: better cluster state transiction handling.
Before we relied on the global cluster state to make sure all the hash
slots are linked to some node, when getNodeByQuery() is called. So
finding the hash slot unbound was checked with an assertion. However
this is fragile. The cluster state is often updated in the
clusterBeforeSleep() function, and not ASAP on state change, so it may
happen to process clients with a cluster state that is 'ok' but yet
certain hash slots set to NULL.

With this commit the condition is also checked in getNodeByQuery() and
reported with a identical error code of -CLUSTERDOWN but slightly
different error message so that we have more debugging clue in the
future.

Root cause of issue #2288.
2015-03-20 09:59:28 +01:00
antirez
7e4233e3f7 Cluster: clusterMsgDataGossip structure, explict padding + minor stuff.
Also explicitly set version to 0, add a protocol version define, improve
comments in the gossip structure.

Note that the structure layout is the same after the change, we are just
making the padding explicit with an additional not used 16 bits field.
So this commit is still able to talk with the previous versions of
cluster nodes.
2015-01-13 10:40:09 +01:00
antirez
0186fa3aa5 Cluster PUBLISH message: fix totlen count.
bulk_data field size was not removed from the count. It is not possible
to declare it simply as 'char bulk_data[]' since the structure is nested
into another structure.
2014-11-28 10:21:47 +01:00
antirez
1730b49ad7 Cluster: more chatty slaves when failover is stalled. 2014-10-07 09:51:55 +02:00
Matt Stancliff
299c667adb Clean up text throughout project
- Remove trailing newlines from redis.conf
  - Fix comment misspelling
  - Clarifies zipEncodeLength usage and a C API mention (#1243, #1242)
  - Fix cluster typos (inspired by @papanikge #1507)
  - Fix rewite -> rewrite in a few places (inspired by #682)

Closes #1243, #1242, #1507
2014-09-29 06:49:07 -04:00
antirez
aea347d60c Cluster: new option to work with partial slots coverage. 2014-09-17 11:10:09 +02:00
antirez
3ef46fb17b Use modern typedef form in cluster.h. 2014-08-25 10:42:18 +02:00
antirez
aec8e92316 Cluster: slave validity factor is now user configurable.
Check the commit changes in the example redis.conf for more information.
2014-05-22 16:57:54 +02:00
antirez
0f816d9867 Cluster: last_vote_epoch -> lastVoteEpoch.
Use cammel case for epochs that are persisted on disk.
2014-03-27 15:01:24 +01:00
antirez
964ee1343f Cluster: better timeout and retry time for failover.
When node-timeout is too small, in the order of a few milliseconds,
there is no way the voting process can terminate during that time, so we
set a lower limit for the failover timeout of two seconds.

The retry time is set to two times the failover timeout time, so it is
at least 4 seconds.
2014-03-10 09:57:52 +01:00
antirez
e7022731e0 Redis Cluster: support for multi-key operations. 2014-03-07 13:19:09 +01:00
Matt Stancliff
6aafad9d8a Remove redundant IP length definition
REDIS_CLUSTER_IPLEN had the same value as
REDIS_IP_STR_LEN.  They were both #define'd
to the same INET6_ADDRSTRLEN.
2014-03-06 17:55:43 +01:00
Matt Stancliff
7b00620e1f Fix IP representation in clusterMsgDataGossip 2014-02-25 16:02:28 -05:00
antirez
467ed194be Cluster: clusterReadHandler() fixed to work with new message header. 2014-02-10 16:27:37 +01:00
antirez
e68a4656d3 Cluster: signature changed to "RCmb" (Redis Cluster message bus).
Sounds better after all.
2014-02-10 15:55:21 +01:00
antirez
39c37c7515 Cluster: added signature + version in bus packets. 2014-02-10 15:53:09 +01:00
antirez
6edbc88416 Cluster: force AUTH ACK on manual failover.
When a slave requests masters vote for a manual failover, the
REQUEST_AUTH message is flagged in a special way in order to force the
masters to give the authorization even if the master is not marked as
failing.
2014-02-05 13:10:03 +01:00
antirez
45a900e448 Cluster: manual failover initial implementation. 2014-02-05 13:01:24 +01:00
antirez
bc300b22af Cluster: configurable replicas migration barrier.
It is possible to configure the min number of additional working slaves
a master should be left with, for a slave to migrate to an orphaned
master.
2014-01-31 11:26:36 +01:00
antirez
f5c5e2707c Cluster: added progressive election delay according to slave rank.
Note that when we compute the initial delay, there are probably still
more up to date information to receive from slaves with new offsets, so
the delay is recomputed when new data is available.
2014-01-29 16:53:45 +01:00
antirez
0c36b17916 Cluster: refactoring: new macros to check node flags. 2014-01-29 12:17:16 +01:00
antirez
b8a0220498 Cluster: broadcast master/slave replication offset in bus header. 2014-01-28 16:51:50 +01:00
antirez
0ae1d07b01 Cluster: limit cluster.h to 80 cols. 2014-01-28 16:34:23 +01:00
antirez
91afff8fee Cluster: introduced repl_offset fields in clusterNode.
The two fields are used in order to remember the latest known
replication offset and the time we received it from other slave nodes.

This will be used by slaves in order to start the election procedure
with a delay that is proportional to the rank of the slave among the
other slaves for this master, when sorted for replication offset.

Usually this allows the slave with the most updated offset to win the
election and replace the failing master in the cluster.
2014-01-28 16:28:07 +01:00
antirez
5502face59 Cluster: basic data structures for nodes black list. 2013-11-29 17:37:06 +01:00
antirez
a829c85988 Cluster: some code about clusterHandleSlaveFailover() marginally improved.
80 cols friendly, some minor change to the code to make it simpler.
2013-11-29 16:17:05 +01:00
antirez
a19147c2fa Cluster: UPDATE msg data structure and sending function. 2013-11-08 16:26:50 +01:00
antirez
e4b341a335 Cluster: time switched from seconds to milliseconds.
All the internal state of cluster involving time is now using mstime_t
and mstime() in order to use milliseconds resolution.

Also the clusterCron() function is called with a 10 hz frequency instead
of 1 hz.

The cluster node_timeout must be also configured in milliseconds by the
user in redis.conf.
2013-10-09 16:19:26 +02:00
antirez
1560b70889 Cluster: cluster stuff moved from redis.h to cluster.h. 2013-10-09 15:38:05 +02:00