futriix

Author	SHA1	Message	Date
antirez	67133d2f48	Cluster: use clusterSetNodeAsMaster() during slave failover. clusterHandleSlaveFailover() was reimplementing what clusterSetNodeAsMaster() without any good reason.	2014-05-15 17:03:28 +02:00
antirez	b1d19fd6e6	Cluster: clear todo_before_sleep flags when executing actions. Thanks to this change, when there is some code like: clusterDoBeforeSleep(CLUSTER_TODO_UPDATE_STATE\|...); ... and later before returning to the event loop ... clusterUpdateState(); The clusterUpdateState() function will clar the flag and will not be repeated in the clusterBeforeSleep() function. This especially important for config save/fsync flags which are slow to execute and not a good idea to repeat without a good reason. This is implemented for all the CLUSTER_TODO flags.	2014-05-15 16:33:13 +02:00
antirez	8c6e92c3bc	Cluster: clear todo_before_sleep flags when executing actions. Thanks to this change, when there is some code like: clusterDoBeforeSleep(CLUSTER_TODO_UPDATE_STATE\|...); ... and later before returning to the event loop ... clusterUpdateState(); The clusterUpdateState() function will clar the flag and will not be repeated in the clusterBeforeSleep() function. This especially important for config save/fsync flags which are slow to execute and not a good idea to repeat without a good reason. This is implemented for all the CLUSTER_TODO flags.	2014-05-15 16:33:13 +02:00
antirez	73b3fcde40	Fixed typo in CLUSTER RESET implementation.	2014-05-15 12:33:57 +02:00
antirez	7b87cda70e	Fixed typo in CLUSTER RESET implementation.	2014-05-15 12:33:57 +02:00
antirez	b5765f3a73	CLUSTER RESET implemented. The new command is able to reset a cluster node so that it starts again as a fresh node. By default the command performs a soft reset (the same as calling it as CLUSTER RESET SOFT), and the following steps are performed: 1) All slots are set as unassigned. 2) The list of known nodes is flushed. 3) Node is set as master if it is a slave. When an hard reset is performed with CLUSTER RESET HARD the following additional operations are performed: 4) A new Node ID is created at random. 5) Epochs are set to 0. CLUSTER RESET is useful both when the sysadmin wants to reconfigure a node with a different role (for example turning a slave into a master) and for testing purposes. It also may play a role in automatically provisioned Redis Clusters, since it allows to reset a node back to the initial state in order to be reconfigured.	2014-05-15 11:43:06 +02:00
antirez	796f4ae9f7	CLUSTER RESET implemented. The new command is able to reset a cluster node so that it starts again as a fresh node. By default the command performs a soft reset (the same as calling it as CLUSTER RESET SOFT), and the following steps are performed: 1) All slots are set as unassigned. 2) The list of known nodes is flushed. 3) Node is set as master if it is a slave. When an hard reset is performed with CLUSTER RESET HARD the following additional operations are performed: 4) A new Node ID is created at random. 5) Epochs are set to 0. CLUSTER RESET is useful both when the sysadmin wants to reconfigure a node with a different role (for example turning a slave into a master) and for testing purposes. It also may play a role in automatically provisioned Redis Clusters, since it allows to reset a node back to the initial state in order to be reconfigured.	2014-05-15 11:43:06 +02:00
antirez	f48f8fda62	Remove trailing spaces from cluster.c file.	2014-05-15 10:18:36 +02:00
antirez	8b9d5ecbd1	Remove trailing spaces from cluster.c file.	2014-05-15 10:18:36 +02:00
antirez	0ca22608a8	Cluster: don't accept cluster bus connections during startup.	2014-05-14 12:05:00 +02:00
antirez	60e5d1724c	Cluster: don't accept cluster bus connections during startup.	2014-05-14 12:05:00 +02:00
antirez	716b729ab9	Cluster: better handling of stolen slots. The previous code handling a lost slot (by another master with an higher configuration for the slot) was defensive, considering it an error and putting the cluster in an odd state requiring redis-cli fix. This was changed, because actually this only happens either in a legitimate way, with failovers, or when the admin messed with the config in order to reconfigure the cluster. So the new code instead will try to make sure that the keys stored match the new slots map, by removing all the keys in the slots we lost ownership from. The function that deletes the keys from the lost slots is called only if the node does not lose all its slots (resulting in a reconfiguration as a slave of the node that got ownership). This is an optimization since the replication code will anyway flush all the instance data in a faster way.	2014-05-14 10:46:37 +02:00
antirez	6baac558d8	Cluster: better handling of stolen slots. The previous code handling a lost slot (by another master with an higher configuration for the slot) was defensive, considering it an error and putting the cluster in an odd state requiring redis-cli fix. This was changed, because actually this only happens either in a legitimate way, with failovers, or when the admin messed with the config in order to reconfigure the cluster. So the new code instead will try to make sure that the keys stored match the new slots map, by removing all the keys in the slots we lost ownership from. The function that deletes the keys from the lost slots is called only if the node does not lose all its slots (resulting in a reconfiguration as a slave of the node that got ownership). This is an optimization since the replication code will anyway flush all the instance data in a faster way.	2014-05-14 10:46:37 +02:00
antirez	e7b8d75ba3	Cluster: fixed data_age computation / check integer overflow.	2014-05-12 17:46:15 +02:00
antirez	832a298005	Cluster: fixed data_age computation / check integer overflow.	2014-05-12 17:46:15 +02:00
Matt Stancliff	2548b2d254	Fix lack of strtold under Cygwin Renaming strtold to strtod then casting the result is the standard way of dealing with no strtold in Cygwin.	2014-05-12 11:11:09 -04:00
Matt Stancliff	7c4decb101	Fix lack of strtold under Cygwin Renaming strtold to strtod then casting the result is the standard way of dealing with no strtold in Cygwin.	2014-05-12 11:11:09 -04:00
Matt Stancliff	9bcd2e53a5	Fix lack of SA_ONSTACK under Cygwin Fixes #232	2014-05-12 11:10:24 -04:00
Matt Stancliff	3e0e51dd9f	Fix lack of SA_ONSTACK under Cygwin Fixes #232	2014-05-12 11:10:24 -04:00
antirez	28eb9c3f39	Cluster: forced failover implemented. Using CLUSTER FAILOVER FORCE it is now possible to failover a master in a forced way, which means: 1) No check to understand if the master is up is performed. 2) No data age of the slave is checked. Evan a slave with very old data can manually failover a master in this way. 3) No chat with the master is attempted to reach its replication offset: the master can just be down.	2014-05-12 16:34:20 +02:00
antirez	2692339138	Cluster: forced failover implemented. Using CLUSTER FAILOVER FORCE it is now possible to failover a master in a forced way, which means: 1) No check to understand if the master is up is performed. 2) No data age of the slave is checked. Evan a slave with very old data can manually failover a master in this way. 3) No chat with the master is attempted to reach its replication offset: the master can just be down.	2014-05-12 16:34:20 +02:00
antirez	b70dc36c8a	Cluster: bypass data_age check for manual failovers. Automatic failovers only happen in Redis Cluster if the slave trying to be elected was disconnected from its master for no more than 10 times the node-timeout value. However there should be no such a check for manual failovers, since these are initiated by the sysadmin that, in theory, knows what she is doing when a slave is selected to be promoted.	2014-05-12 16:12:12 +02:00
antirez	005f564eb3	Cluster: bypass data_age check for manual failovers. Automatic failovers only happen in Redis Cluster if the slave trying to be elected was disconnected from its master for no more than 10 times the node-timeout value. However there should be no such a check for manual failovers, since these are initiated by the sysadmin that, in theory, knows what she is doing when a slave is selected to be promoted.	2014-05-12 16:12:12 +02:00
Akos Vandra	52ede31ac4	Fixed possible buffer overflow bug if RDB file is corrupted. (Note: commit message modified by @antirez for clarity).	2014-05-12 11:48:14 +02:00
Akos Vandra	b252fab06c	Fixed possible buffer overflow bug if RDB file is corrupted. (Note: commit message modified by @antirez for clarity).	2014-05-12 11:48:14 +02:00
Akos Vandra	367c237194	fixed possible buffer overflow error	2014-05-12 11:19:07 +02:00
Akos Vandra	433e835d3e	fixed possible buffer overflow error	2014-05-12 11:19:07 +02:00
antirez	603a06086f	redis-trib create: use CONFIG SET-CONFIG-EPOCH before joining the cluster. This way there is no need for the conflict resolution algo to be used in order to start with a cluster where each node has a different configEpoch.	2014-05-12 11:06:37 +02:00
antirez	658ad301cc	redis-trib create: use CONFIG SET-CONFIG-EPOCH before joining the cluster. This way there is no need for the conflict resolution algo to be used in order to start with a cluster where each node has a different configEpoch.	2014-05-12 11:06:37 +02:00
antirez	ba310a9ed2	redis-trib import: trap MIGRATE errors.	2014-05-12 10:36:33 +02:00
antirez	715a6d3a78	redis-trib import: trap MIGRATE errors.	2014-05-12 10:36:33 +02:00
antirez	223e29147f	redis-trib.rb: MIGRATE hardcoded timeout set to 15 sec. Will be configurable / adaptive at some point but let's start with a saner value compared to 1 sec which is not a good idea for big data structures stored into a single key.	2014-05-12 10:22:24 +02:00
antirez	939c586ef7	redis-trib.rb: MIGRATE hardcoded timeout set to 15 sec. Will be configurable / adaptive at some point but let's start with a saner value compared to 1 sec which is not a good idea for big data structures stored into a single key.	2014-05-12 10:22:24 +02:00
antirez	9cbccf9a6b	RESTORE: reply with -BUSYKEY special error code. The error when the target key is busy was a generic one, while it makes sense to be able to distinguish between the target key busy error and the others easily.	2014-05-12 10:01:59 +02:00
antirez	5c78f87666	RESTORE: reply with -BUSYKEY special error code. The error when the target key is busy was a generic one, while it makes sense to be able to distinguish between the target key busy error and the others easily.	2014-05-12 10:01:59 +02:00
antirez	dacd56d597	Cluster: initial ability to import data from standalone instance.	2014-05-10 17:59:31 +02:00
antirez	2a48bd4a37	Cluster: initial ability to import data from standalone instance.	2014-05-10 17:59:31 +02:00
antirez	89afa27bb7	CLUSTER MEET: better error messages when address is invalid. Fixes issue #1734.	2014-05-09 16:36:59 +02:00
antirez	71d0e7e0ea	CLUSTER MEET: better error messages when address is invalid. Fixes issue #1734.	2014-05-09 16:36:59 +02:00
antirez	630a447449	redis-trib: allow support for mandatory options.	2014-05-09 16:11:11 +02:00
antirez	74435aba47	redis-trib: allow support for mandatory options.	2014-05-09 16:11:11 +02:00
antirez	9b459881c1	DEBUG POPULATE: call dictExpand() to avoid useless rehashing.	2014-05-09 15:02:29 +02:00
antirez	72ff03346f	DEBUG POPULATE: call dictExpand() to avoid useless rehashing.	2014-05-09 15:02:29 +02:00
antirez	ab05314244	Cluster: bulk-accept new nodes connections. The same change was operated for normal client connections. This is important for Cluster as well, since when a node rejoins the cluster, when a partition heals or after a restart, it gets flooded with new connection attempts by all the other nodes trying to form a full mesh again.	2014-05-09 11:52:59 +02:00
antirez	8a170c817d	Cluster: bulk-accept new nodes connections. The same change was operated for normal client connections. This is important for Cluster as well, since when a node rejoins the cluster, when a partition heals or after a restart, it gets flooded with new connection attempts by all the other nodes trying to form a full mesh again.	2014-05-09 11:52:59 +02:00
antirez	0478d73098	Cluster: clusterAcceptHandler() comments updated to match the code.	2014-05-09 11:44:46 +02:00
antirez	3625b52791	Cluster: clusterAcceptHandler() comments updated to match the code.	2014-05-09 11:44:46 +02:00
antirez	1a32a0f9a0	Sentinel: log when a failover will be attempted again. When a Sentinel performs a failover (successful or not), or when a Sentinel votes for a different Sentinel trying to start a failover, it sets a min delay before it will try to get elected for a failover. While not strictly needed, because if multiple Sentinels will try to failover the same master at the same time, only one configuration will eventually win, this serialization is practically very useful. Normal failovers are cleaner: one Sentinel starts to failover, the others update their config when the Sentinel performing the failover is able to get the selected slave to move from the role of slave to the one of master. However currently this timeout was implicit, so users could see Sentinels not reacting, after a failed failover, for some time, without giving any feedback in the logs to the poor sysadmin waiting for clues. This commit makes Sentinels more verbose about the delay: when a master is down and a failover attempt is not performed because the delay has still not elaped, something like that will be logged: Next failover delay: I will not start a failover before Thu May 8 16:48:59 2014	2014-05-08 16:38:53 +02:00
antirez	2102778606	Sentinel: log when a failover will be attempted again. When a Sentinel performs a failover (successful or not), or when a Sentinel votes for a different Sentinel trying to start a failover, it sets a min delay before it will try to get elected for a failover. While not strictly needed, because if multiple Sentinels will try to failover the same master at the same time, only one configuration will eventually win, this serialization is practically very useful. Normal failovers are cleaner: one Sentinel starts to failover, the others update their config when the Sentinel performing the failover is able to get the selected slave to move from the role of slave to the one of master. However currently this timeout was implicit, so users could see Sentinels not reacting, after a failed failover, for some time, without giving any feedback in the logs to the poor sysadmin waiting for clues. This commit makes Sentinels more verbose about the delay: when a master is down and a failover attempt is not performed because the delay has still not elaped, something like that will be logged: Next failover delay: I will not start a failover before Thu May 8 16:48:59 2014	2014-05-08 16:38:53 +02:00
antirez	7ef2b30677	Sentinel: generate +config-update-from event when a new config is received. This event makes clear, before the switch-master event is generated, that a Sentinel received a configuration update from another Sentinel.	2014-05-08 15:59:34 +02:00

... 318 319 320 321 322 ...

20983 Commits