futriix

Author	SHA1	Message	Date
antirez	adc613f456	Cluster: ignore empty lines in nodes.conf. Even without the user messing manually with the file, it is still possible to have blank lines (just a single "\n" per line) because of how the nodes.conf update/write process works.	2014-01-15 11:23:41 +01:00
antirez	e4a1d6bb5d	Cluster: atomic update of nodes.conf file. The way the file was generated was unsafe and led to nodes.conf file corruption (zero-length file) on server stop/crash during the creation of the file. The previous file update method was as simple as open with O_TRUNC followed by the write call. While the write call was a single one with the full payload, ensuring no half-written files under POSIX semantics, stopping the server just after the open call resulted in a zero-length file (all the nodes information lost!).	2014-01-15 10:31:20 +01:00
antirez	fdab41fe65	Cluster: support to read from slave nodes. A client can enter a special cluster read-only mode using the READONLY command: if the client reads from a slave instance after this command, for slots that are actually served by the instance's master, the queries will be processed without redirection, allowing clients to read from slaves (but without any kind of read-after-write guarantee). The READWRITE command can be used in order to exit the readonly state.	2014-01-14 16:33:16 +01:00
antirez	ed3c6c0124	Fix RESTORE ttl handling in 32 bit archs. long was used instead of long long in order to handle a 64 bit resolution millisecond timestamp. This fixes issue #1483.	2014-01-09 11:09:23 +01:00
antirez	98901950f9	Cluster: clusterProcessPacket() was not 80 cols friendly. The function actually needs to be split into sub-functions at some point in the future.	2013-12-25 17:57:36 +01:00
antirez	a75b334bdf	Redis Cluster: add repl_ping_slave_period to slave data validity time. When the configured node timeout is very small, the data validity time (maximum data age for a slave to try a failover) is too little (ten times the configured node timeout) when the replication link with the master is mostly idle. In this case we'll receive some data from the master only every server.repl_ping_slave_period to refresh the last interaction with the master. This commit adds to the max data validity time the slave ping period to avoid this problem of slaves sensing too old data without a good reason. However this max data validity time is likely a setting that should be configurable by the Redis Cluster user in a way completely independent from the node timeout.	2013-12-22 10:05:16 +01:00
antirez	db016acb7f	Redis Cluster: move node failure reports logging from VERBOSE to NOTICE level.	2013-12-21 00:04:53 +01:00
antirez	8527ba1eea	Redis Cluster: remove no longer relevant comment.	2013-12-20 14:40:11 +01:00
antirez	dd10efb31a	Redis Cluster: reconfigure replication when master changes address.	2013-12-20 12:47:22 +01:00
antirez	4d11d4c86c	Redis Cluster: handshake code refactoring + Gossip IP switch detection. This commit makes it simple to start a handshake with a specific node address, and uses this in order to detect a node IP change and start a new handshake in order to fix the IP if possible.	2013-12-20 12:38:03 +01:00
antirez	f42e0277ab	Redis Cluster: delay state change when in the majority again. As specified in the Redis Cluster specification, when a node can reach the majority again after a period in which it was partitioned away with the minority of masters, it waits some time before accepting queries, to provide a reasonable amount of time for other nodes to update its configuration. This lowers the probability that both a client and a master with outdated configuration rejoin the cluster at the same time, with a stale master accepting writes.	2013-12-20 09:56:18 +01:00
antirez	e48365e2c2	Cluster: set n->slaves to NULL in clusterNodeResetSlaves(). The value was otherwise undefined, so next time the node was promoted again from slave to master, adding a slave to the list of slaves would likely crash the server or result in undefined behavior.	2013-12-17 14:50:24 +01:00
antirez	7f51cf8b56	Cluster: check link is valid before sending UPDATE.	2013-12-17 12:28:37 +01:00
antirez	c17be18035	Cluster: initialize todo_before_sleep flags to 0.	2013-12-17 12:22:02 +01:00
antirez	195aab3345	Cluster: use proper type mstime_t for ping delay var.	2013-12-17 10:27:36 +01:00
antirez	118d0fb533	Fixed clearNodeFailureIfNeeded() time type to mstime_t. This prevented 32bit cluster instances from clearing the FAIL flag when needed.	2013-12-17 09:45:52 +01:00
antirez	9180fb7931	Cluster: use long long for timestamps in clusterGenNodesDescription(). Ping sent and pong received fields need to be cast to long long to be printed correctly on 32 bit systems.	2013-12-17 09:38:11 +01:00
antirez	7a5a646df9	Fixed grammar: before H the article is a, not an.	2013-12-05 16:35:32 +01:00
antirez	b7c955046d	Cluster: nodes re-addition blacklist API.	2013-12-02 11:12:23 +01:00
antirez	5502face59	Cluster: basic data structures for nodes black list.	2013-11-29 17:37:06 +01:00
antirez	a829c85988	Cluster: some code about clusterHandleSlaveFailover() marginally improved. 80 cols friendly, some minor change to the code to make it simpler.	2013-11-29 16:17:05 +01:00
antirez	e159239f9c	Cluster: removed not needed newline at end of redisLog() msg.	2013-11-08 17:28:02 +01:00
antirez	a67935e5e3	Cluster: send a single UPDATE packet for now.	2013-11-08 17:25:49 +01:00
antirez	a146482c83	Cluster: replace hardcoded 4096 for bus msg len with sizeof().	2013-11-08 17:19:19 +01:00
antirez	36db83ac50	Cluster: slots update refactored + UPDATE msg processing. Now there is a function that handles the update of the local slot configuration every time we have some new info about a node and its set of served slots and configEpoch. Moreover the UPDATE packets are now processed when received (it was a work in progress in the previous commit).	2013-11-08 17:02:10 +01:00
antirez	a19147c2fa	Cluster: UPDATE msg data structure and sending function.	2013-11-08 16:26:50 +01:00
antirez	4666966c9f	Cluster: refactoring of slots update code and more. The commit also introduces detection of nodes publishing not updated configuration. More work in progress to send an UPDATE packet to inform of the config change.	2013-11-08 10:32:16 +01:00
antirez	f6738923a6	Cluster: initialize senderConfigEpoch and senderCurrentEpoch for warnings suppression.	2013-11-05 12:01:07 +01:00
antirez	e45d9420e0	Cluster: there is a lower limit for the handshake timeout.	2013-10-11 10:34:32 +02:00
antirez	39c90945e0	Cluster: data_age conversion to milliseconds fixed.	2013-10-09 16:36:06 +02:00
antirez	aa0e7dbcf3	Cluster: clusterCron() freq is now 10 Hz. Still ping 1 node every sec. After the change in clusterCron() frequency of call, we still want to ping just one random node every second.	2013-10-09 16:29:17 +02:00
antirez	e4b341a335	Cluster: time switched from seconds to milliseconds. All the internal state of cluster involving time is now using mstime_t and mstime() in order to use milliseconds resolution. Also the clusterCron() function is called with a 10 hz frequency instead of 1 hz. The cluster node_timeout must be also configured in milliseconds by the user in redis.conf.	2013-10-09 16:19:26 +02:00
antirez	1560b70889	Cluster: cluster stuff moved from redis.h to cluster.h.	2013-10-09 15:38:05 +02:00
antirez	0f079966c7	Cluster: masters don't vote for a slave with stale config. When a slave requests our vote, the configEpoch it claims for its master and the set of served slots must be greater than or equal to the configEpoch of the nodes serving these slots in the current configuration of the master granting its vote. In other terms, masters don't vote for slaves having a stale configuration for the slots they want to serve.	2013-10-08 12:45:35 +02:00
antirez	26ea55b7f5	Cluster: fix slave data age computation when master is still connected.	2013-10-07 16:07:13 +02:00
antirez	acd9ec222e	Cluster: log message improved when FAIL is cleared from a slave node.	2013-10-07 15:44:58 +02:00
antirez	e9b8b30c81	Cluster: slave nodes advertise master slots bitmap and configEpoch.	2013-10-07 11:31:12 +02:00
antirez	dbf6c85d5e	Cluster: new clusterDoBeforeSleep() API. The new API is able to remember operations to perform before returning to the event loop, such as checking if there is the failover quorum for a slave, saving and fsyncing the configuration file, and so forth. Because these operations are performed before returning to the event loop we are sure that messages that are sent in the same event loop run will be delivered after the configuration is already saved, which is sometimes a requirement. For instance we want to publish a new epoch only when it is already stored in nodes.conf in order to avoid going back in the logical clock when a node is restarted. This new API provides a big performance advantage compared to saving and possibly fsyncing the configuration file multiple times in the same event loop run, especially in the case of big clusters with tens or hundreds of nodes.	2013-10-03 09:58:06 +02:00
antirez	43f3df99c8	Cluster: update cluster config when slave changes master.	2013-10-02 12:27:12 +02:00
antirez	5cbb913994	Cluster: bus messages stats in CLUSTER info.	2013-10-02 10:10:08 +02:00
antirez	90b06ab7b5	Cluster: FAIL messages from unknown senders are handled better. Previously the event was not logged but instead the node reported an unknown packet type received.	2013-10-02 09:42:45 +02:00
antirez	3be5010adb	Cluster: senderCurrentEpoch == node currentEpoch was too strict. We can accept a vote as long as its epoch is >= the epoch at which we started the voting process. There is no need for it to be exactly the same.	2013-10-01 17:21:28 +02:00
antirez	0000cfbf38	Cluster: fix typo in clusterProcessPacket() comment.	2013-10-01 15:40:20 +02:00
antirez	6ed0dee927	Cluster: time field removed from cluster messages header. The new algorithm does not check replies time as checking for the currentEpoch in the reply ensures that the reply is about the current election process.	2013-09-30 16:19:44 +02:00
antirez	60d4ae49be	Cluster: log message shortened.	2013-09-30 11:51:58 +02:00
antirez	1239f49065	Cluster: detect cluster reconfiguration when master slots drop to 0. The old algorithm used a PROMOTED flag and explicitly checked for slave->master conversions. With the new cluster meta-data propagation algorithm we just look at the configEpoch to check if we need to reconfigure slots, then: 1) If a node is a master but it reaches zero served slots because of reconfiguration. 2) If a node is a slave but the master reaches zero served slots because of a reconfiguration. We switch as a replica of the new slots owner.	2013-09-30 11:45:26 +02:00
antirez	2a391b8bac	Cluster: re-order failover operations to make it safer. We need to: 1) Increment the configEpoch. 2) Save it to disk and fsync the file. 3) Broadcast the PONG with the new configuration. If other nodes will receive the updated configuration we need to be sure to restart with this new config in the event of a crash.	2013-09-30 10:16:48 +02:00
antirez	0b63dc2841	Cluster: when updating the configEpoch for a node, save config on disk ASAP.	2013-09-30 10:16:25 +02:00
antirez	5d393adeac	Cluster: fsync data when saving the cluster config.	2013-09-30 10:13:07 +02:00
antirez	8fa4e7817a	Cluster: update the node configEpoch when newer is detected.	2013-09-27 09:55:41 +02:00
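
Commit adc613f456 in the log above says that blank lines (a single "\n") in nodes.conf are now ignored. The following is only a minimal sketch of that idea, not the Redis loader: `count_node_lines` is a hypothetical helper that walks an in-memory copy of the file and counts the lines that would be handed to a parser.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not a Redis function): counts the lines in `text`
 * that carry a node description, skipping lines that are just "\n". */
int count_node_lines(const char *text) {
    int count = 0;

    assert(text != NULL);
    while (*text != '\0') {
        const char *eol = strchr(text, '\n');
        size_t len = eol ? (size_t)(eol - text) : strlen(text);

        if (len > 0) count++;          /* a zero-length line is blank: skip */
        text += len + (eol ? 1 : 0);   /* step past the line and its "\n" */
    }
    return count;
}
```

For example, a buffer holding two node lines separated by two blank lines yields a count of 2.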
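
Commit e4a1d6bb5d in the log above explains why open with O_TRUNC followed by write can leave a zero-length nodes.conf. The message as shown here does not say which technique the fix uses, so the sketch below swaps in a common POSIX pattern instead: write a temporary file, fsync it, then rename() it over the destination. The function name `save_config_atomically` and the ".tmp" suffix are assumptions made for illustration.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper (not the Redis implementation): replaces `path` with
 * `data` so that a crash leaves either the old file or the new one, never
 * an empty file. Returns 0 on success, -1 on error. */
int save_config_atomically(const char *path, const char *data) {
    char tmp[1024];
    size_t len;
    int fd;

    assert(path != NULL && data != NULL);
    len = strlen(data);
    if (snprintf(tmp, sizeof(tmp), "%s.tmp", path) >= (int)sizeof(tmp))
        return -1;                      /* path too long for the buffer */

    fd = open(tmp, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd == -1) return -1;

    /* Single write() with the full payload, then flush it to disk. */
    if (write(fd, data, len) != (ssize_t)len || fsync(fd) == -1) {
        close(fd);
        unlink(tmp);
        return -1;
    }
    if (close(fd) == -1) { unlink(tmp); return -1; }

    /* rename() replaces the destination atomically on POSIX systems. */
    if (rename(tmp, path) == -1) { unlink(tmp); return -1; }
    return 0;
}
```

The truncation now happens on the temporary file only, so stopping the process right after open() cannot empty the real configuration file.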
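
Commits ed3c6c0124, 118d0fb533 and 9180fb7931 in the log above all deal with millisecond timestamps on 32 bit architectures. A small sketch of why `long` is not enough there: on typical 32 bit targets `long` is 32 bits wide, and a millisecond Unix timestamp does not fit in that range. `fits_in_32bit_long` is a hypothetical helper used only to make the range check explicit.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: returns 1 if `value` is representable in a 32 bit
 * signed integer, the usual width of `long` on 32 bit targets. */
int fits_in_32bit_long(long long value) {
    assert(sizeof(long long) >= 8);   /* long long is at least 64 bits */
    return value >= INT32_MIN && value <= INT32_MAX;
}
```

A timestamp in seconds from January 2014 (about 1389262163) still fits, while the same instant in milliseconds (about 1389262163000) does not, which is why a 64 bit type is needed.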
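
Commit a75b334bdf in the log above describes the maximum data age for a slave as ten times the node timeout, plus the slave ping period. The arithmetic can be sketched as follows. The units are an assumption (node timeout in milliseconds, ping period in seconds), and `max_data_age_ms` and `slave_data_too_old` are illustrative names, not Redis functions.

```c
#include <assert.h>

typedef long long mstime_t;      /* millisecond time, as named in the log */

#define VALIDITY_MULT 10         /* "ten times the configured node timeout" */

/* Hypothetical helper: maximum data age, in milliseconds, at which a slave
 * may still try a failover. */
mstime_t max_data_age_ms(mstime_t node_timeout_ms, int ping_period_sec) {
    assert(node_timeout_ms >= 0 && ping_period_sec >= 0);
    return node_timeout_ms * VALIDITY_MULT + (mstime_t)ping_period_sec * 1000;
}

/* Returns 1 when the slave's data is older than the allowed maximum. */
int slave_data_too_old(mstime_t data_age_ms, mstime_t node_timeout_ms,
                       int ping_period_sec) {
    return data_age_ms > max_data_age_ms(node_timeout_ms, ping_period_sec);
}
```

With a 200 ms node timeout and a 10 second ping period, the limit becomes 12000 ms rather than 2000 ms, so a slave on an idle replication link whose last data is 9 seconds old is not rejected.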
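
Commit 0f079966c7 in the log above states that a master does not vote for a slave whose claimed configEpoch is lower than the configEpoch of the nodes currently serving the claimed slots. A minimal sketch of that rule is shown below. The tiny slot count, the array layout, and the name `may_grant_vote` are assumptions for illustration; the real data structures differ.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SLOTS 8   /* tiny slot space for illustration only */

/* Hypothetical helper: returns 1 if a vote may be granted.
 * claimed_slots[j] is nonzero when the requester claims slot j.
 * owner_epoch[j] is the configEpoch of the node that, in the voter's
 * current view, serves slot j. */
int may_grant_vote(uint64_t claimed_epoch,
                   const unsigned char *claimed_slots,
                   const uint64_t *owner_epoch) {
    assert(claimed_slots != NULL && owner_epoch != NULL);
    for (int j = 0; j < SLOTS; j++) {
        if (!claimed_slots[j]) continue;
        /* Stale configuration: the current owner has a newer epoch. */
        if (owner_epoch[j] > claimed_epoch) return 0;
    }
    return 1;
}
```

Slots that the requester does not claim are ignored, so a newer epoch elsewhere in the cluster does not block the vote.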
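
Commit dbf6c85d5e in the log above introduces clusterDoBeforeSleep(), which remembers operations to run once before returning to the event loop. The sketch below shows the general flag-accumulation pattern under stated assumptions: the flag names, `do_before_sleep`, `before_sleep` and the `saves_done` counter are all hypothetical, and the real save and fsync work is replaced by a counter.

```c
#include <assert.h>

/* Hypothetical flag names for the deferred operations. */
#define TODO_HANDLE_FAILOVER (1 << 0)
#define TODO_SAVE_CONFIG     (1 << 1)
#define TODO_FSYNC_CONFIG    (1 << 2)
#define TODO_ALL (TODO_HANDLE_FAILOVER | TODO_SAVE_CONFIG | TODO_FSYNC_CONFIG)

int todo_before_sleep = 0;   /* pending operations, initialized to 0 */
int saves_done = 0;          /* stands in for the real config save */

/* Remember an operation instead of performing it immediately. */
void do_before_sleep(int flags) {
    assert((flags & ~TODO_ALL) == 0);   /* only known flags are accepted */
    todo_before_sleep |= flags;
}

/* Called once per event loop iteration, before going back to sleep. */
void before_sleep(void) {
    if (todo_before_sleep & TODO_SAVE_CONFIG) saves_done++;
    todo_before_sleep = 0;              /* everything pending is handled */
}
```

Requesting a save three times within one event loop run results in a single save, which is the performance advantage the commit message describes.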


632 Commits