2882 Commits

Author SHA1 Message Date
antirez
8b7981725b Remove trailing spaces from sentinel.c. 2014-05-20 14:22:42 +02:00
michael-grunder
6124589d27 Fix LUA_OBJCACHE segfault.
When scanning the argument list inside of a redis.call() invocation
for pre-cached values, there was no check being done that the
argument we were on was in fact within the bounds of the cache size.

So if a redis.call() command was ever executed with more than 32
arguments (current cache size #define setting) redis-server could
segfault.
2014-05-19 13:18:13 -07:00
Mike Trinkala
ac1a3c7340 Correct the HyperLogLog stale cache flag to prevent unnecessary computations.
Set the MSB as documented.
2014-05-18 07:26:26 -07:00
antirez
352f4fbbd5 Cluster: use clusterSetNodeAsMaster() during slave failover.
clusterHandleSlaveFailover() was reimplementing what
clusterSetNodeAsMaster() without any good reason.
2014-05-15 17:03:28 +02:00
antirez
b1d19fd6e6 Cluster: clear todo_before_sleep flags when executing actions.
Thanks to this change, when there is some code like:

    clusterDoBeforeSleep(CLUSTER_TODO_UPDATE_STATE|...);
    ... and later before returning to the event loop ...
    clusterUpdateState();

The clusterUpdateState() function will clar the flag and will not be
repeated in the clusterBeforeSleep() function. This especially important
for config save/fsync flags which are slow to execute and not a good
idea to repeat without a good reason.

This is implemented for all the CLUSTER_TODO flags.
2014-05-15 16:33:13 +02:00
antirez
73b3fcde40 Fixed typo in CLUSTER RESET implementation. 2014-05-15 12:33:57 +02:00
antirez
b5765f3a73 CLUSTER RESET implemented.
The new command is able to reset a cluster node so that it starts again
as a fresh node. By default the command performs a soft reset (the same
as calling it as CLUSTER RESET SOFT), and the following steps are
performed:

1) All slots are set as unassigned.
2) The list of known nodes is flushed.
3) Node is set as master if it is a slave.

When an hard reset is performed with CLUSTER RESET HARD the following
additional operations are performed:

4) A new Node ID is created at random.
5) Epochs are set to 0.

CLUSTER RESET is useful both when the sysadmin wants to reconfigure a
node with a different role (for example turning a slave into a master)
and for testing purposes.

It also may play a role in automatically provisioned Redis Clusters,
since it allows to reset a node back to the initial state in order to be
reconfigured.
2014-05-15 11:43:06 +02:00
antirez
f48f8fda62 Remove trailing spaces from cluster.c file. 2014-05-15 10:18:36 +02:00
antirez
0ca22608a8 Cluster: don't accept cluster bus connections during startup. 2014-05-14 12:05:00 +02:00
antirez
716b729ab9 Cluster: better handling of stolen slots.
The previous code handling a lost slot (by another master with an higher
configuration for the slot) was defensive, considering it an error and
putting the cluster in an odd state requiring redis-cli fix.

This was changed, because actually this only happens either in a
legitimate way, with failovers, or when the admin messed with the config
in order to reconfigure the cluster. So the new code instead will try to
make sure that the keys stored match the new slots map, by removing all
the keys in the slots we lost ownership from.

The function that deletes the keys from the lost slots is called only
if the node does not lose all its slots (resulting in a reconfiguration
as a slave of the node that got ownership). This is an optimization
since the replication code will anyway flush all the instance data in
a faster way.
2014-05-14 10:46:37 +02:00
antirez
e7b8d75ba3 Cluster: fixed data_age computation / check integer overflow. 2014-05-12 17:46:15 +02:00
Matt Stancliff
2548b2d254 Fix lack of strtold under Cygwin
Renaming strtold to strtod then casting
the result is the standard way of dealing with
no strtold in Cygwin.
2014-05-12 11:11:09 -04:00
Matt Stancliff
9bcd2e53a5 Fix lack of SA_ONSTACK under Cygwin
Fixes #232
2014-05-12 11:10:24 -04:00
antirez
28eb9c3f39 Cluster: forced failover implemented.
Using CLUSTER FAILOVER FORCE it is now possible to failover a master in
a forced way, which means:

1) No check to understand if the master is up is performed.
2) No data age of the slave is checked. Evan a slave with very old data
   can manually failover a master in this way.
3) No chat with the master is attempted to reach its replication offset:
   the master can just be down.
2014-05-12 16:34:20 +02:00
antirez
b70dc36c8a Cluster: bypass data_age check for manual failovers.
Automatic failovers only happen in Redis Cluster if the slave trying to
be elected was disconnected from its master for no more than 10 times
the node-timeout value. However there should be no such a check for
manual failovers, since these are initiated by the sysadmin that, in
theory, knows what she is doing when a slave is selected to be promoted.
2014-05-12 16:12:12 +02:00
Akos Vandra
52ede31ac4 Fixed possible buffer overflow bug if RDB file is corrupted.
(Note: commit message modified by @antirez for clarity).
2014-05-12 11:48:14 +02:00
Akos Vandra
367c237194 fixed possible buffer overflow error 2014-05-12 11:19:07 +02:00
antirez
603a06086f redis-trib create: use CONFIG SET-CONFIG-EPOCH before joining the cluster.
This way there is no need for the conflict resolution algo to be used in
order to start with a cluster where each node has a different
configEpoch.
2014-05-12 11:06:37 +02:00
antirez
ba310a9ed2 redis-trib import: trap MIGRATE errors. 2014-05-12 10:36:33 +02:00
antirez
223e29147f redis-trib.rb: MIGRATE hardcoded timeout set to 15 sec.
Will be configurable / adaptive at some point but let's start with a
saner value compared to 1 sec which is not a good idea for big data
structures stored into a single key.
2014-05-12 10:22:24 +02:00
antirez
9cbccf9a6b RESTORE: reply with -BUSYKEY special error code.
The error when the target key is busy was a generic one, while it makes
sense to be able to distinguish between the target key busy error and
the others easily.
2014-05-12 10:01:59 +02:00
antirez
dacd56d597 Cluster: initial ability to import data from standalone instance. 2014-05-10 17:59:31 +02:00
antirez
89afa27bb7 CLUSTER MEET: better error messages when address is invalid.
Fixes issue #1734.
2014-05-09 16:36:59 +02:00
antirez
630a447449 redis-trib: allow support for mandatory options. 2014-05-09 16:11:11 +02:00
antirez
9b459881c1 DEBUG POPULATE: call dictExpand() to avoid useless rehashing. 2014-05-09 15:02:29 +02:00
antirez
ab05314244 Cluster: bulk-accept new nodes connections.
The same change was operated for normal client connections. This is
important for Cluster as well, since when a node rejoins the cluster,
when a partition heals or after a restart, it gets flooded with new
connection attempts by all the other nodes trying to form a full
mesh again.
2014-05-09 11:52:59 +02:00
antirez
0478d73098 Cluster: clusterAcceptHandler() comments updated to match the code. 2014-05-09 11:44:46 +02:00
antirez
1a32a0f9a0 Sentinel: log when a failover will be attempted again.
When a Sentinel performs a failover (successful or not), or when a
Sentinel votes for a different Sentinel trying to start a failover, it
sets a min delay before it will try to get elected for a failover.

While not strictly needed, because if multiple Sentinels will try
to failover the same master at the same time, only one configuration
will eventually win, this serialization is practically very useful.
Normal failovers are cleaner: one Sentinel starts to failover, the
others update their config when the Sentinel performing the failover
is able to get the selected slave to move from the role of slave to the
one of master.

However currently this timeout was implicit, so users could see
Sentinels not reacting, after a failed failover, for some time, without
giving any feedback in the logs to the poor sysadmin waiting for clues.

This commit makes Sentinels more verbose about the delay: when a master
is down and a failover attempt is not performed because the delay has
still not elaped, something like that will be logged:

    Next failover delay: I will not start a failover
    before Thu May  8 16:48:59 2014
2014-05-08 16:38:53 +02:00
antirez
7ef2b30677 Sentinel: generate +config-update-from event when a new config is received.
This event makes clear, before the switch-master event is generated,
that a Sentinel received a configuration update from another Sentinel.
2014-05-08 15:59:34 +02:00
antirez
24cce99135 REDIS_ENCODING_EMBSTR_SIZE_LIMIT set to 39.
The new value is the limit for the robj + SDS header + string +
null-term to stay inside the 64 bytes Jemalloc arena in 64 bits
systems.
2014-05-07 17:05:09 +02:00
antirez
c40d92c702 Scripting: objects caching for Lua c->argv creation.
Reusing small objects when possible is a major speedup under certain
conditions, since it is able to avoid the malloc/free pattern that
otherwise is performed for every argument in the client command vector.
2014-05-07 16:12:32 +02:00
antirez
a857519e46 Scripting: Use faster API for Lua client c->argv creation.
Replace the three calls to Lua API lua_tostring, lua_lua_strlen,
and lua_isstring, with a single call to lua_tolstring.

~ 5% consistent speed gain measured.
2014-05-07 16:12:32 +02:00
antirez
2517e56f6d Scripting: don't call lua_gc() after Lua script run.
Calling lua_gc() after every script execution is too expensive, and
apparently does not make the execution smoother: the same peak latency
was measured before and after the commit.

This change accounts for scripts execution speedup in the order of 10%.
2014-05-07 16:12:32 +02:00
antirez
a9b5eea0b5 Scripting: cache argv in luaRedisGenericCommand().
~ 4% consistently measured speed improvement.
2014-05-07 16:12:32 +02:00
antirez
c11d3ef4f2 Fixed missing c->bufpos reset in luaRedisGenericCommand().
Bug introduced when adding a fast path to avoid copying the reply buffer
for small replies that fit into the client static buffer.
2014-05-07 16:12:32 +02:00
antirez
477ff4cd32 Scripting: replace tolower() with faster code in evalGenericCommand().
The function showed up consuming a non trivial amount of time in the
profiler output. After this change benchmarking gives a 6% speed
improvement that can be consistently measured.
2014-05-07 16:12:32 +02:00
antirez
1420cc4970 Scripting: luaRedisGenericCommand() fast path for buffer-only replies.
When the reply is only contained in the client static output buffer, use
a fast path avoiding the dynamic allocation of an SDS string to
concatenate the client reply objects.
2014-05-07 16:12:32 +02:00
antirez
5f0d8a97bc Define HAVE_ATOMIC for clang. 2014-05-07 16:12:32 +02:00
antirez
61ba83b75b Scripting: simpler reply buffer creation in luaRedisGenericCommand().
It if faster to just create the string with a single sdsnewlen() call.
If c->bufpos is zero, the call will simply be like sdsemtpy().
2014-05-07 16:12:32 +02:00
antirez
835af28f4e CLUSTER SET-CONFIG-EPOCH implemented.
Initially Redis Cluster accepted that after cluster creation all the
nodes were at configEpoch 0, evolving from zero as failovers happen.

However later the semantic was made more strict in order to make sure a
cluster has always all the master nodes with a different configEpoch,
which is more robust in some corner case (especially resulting from
errors by the system administrator).

To assign different configEpochs to different nodes at startup was a
task performed naturally by the config conflicts resolution algorithm
(see the Cluster specification). However this works well only for small
clusters or when there are actually just a few collisions, since it is
designed for exceptional cases.

When a large cluster is created hundred of nodes can be at epoch 0, so
the conflict resolution code is slow to provide an unique config to each
node. For this reason this new command was introduced. It can be called
only when a node is totally fresh: no other nodes known, and configEpoch
set to zero, so it is safe even against misuses.

redis-trib will use the new command in order to start the cluster
already setting an incremental unique config to every node.
2014-04-29 19:15:16 +02:00
antirez
d4a180bbc1 CLIENT LIST speedup via peerid caching + smart allocation.
This commit adds peer ID caching in the client structure plus an API
change and the use of sdsMakeRoomFor() in order to improve the
reallocation pattern to generate the CLIENT LIST output.

Both the changes account for a very significant speedup.
2014-04-28 17:36:57 +02:00
antirez
49c543415b Use sdscatfmt() in getClientInfoString() to make it faster. 2014-04-28 16:55:43 +02:00
antirez
de11c325ae Added new sdscatfmt() %u and %U format specifiers.
This commit also fixes a bug in the implementation of sdscatfmt()
resulting from stale references to the SDS string header after
sdsMakeRoomFor() calls.
2014-04-28 16:38:17 +02:00
antirez
8e7e7cc5eb sdscatfmt() added to SDS library.
sdscatprintf() relies on printf() family libc functions and is sometimes
too slow in critical code paths. sdscatfmt() is an alternative which is:

1) Far less capable.
2) Format specifier uncompatible.
3) Faster.

It is suitable to be used in those speed critical code paths such as
CLIENT LIST output generation.
2014-04-28 16:23:17 +02:00
antirez
4912f873b4 Process events with processEventsWhileBlocked() when blocked.
When we are blocked and a few events a processed from time to time, it
is smarter to call the event handler a few times in order to handle the
accept, read, write, close cycle of a client in a single pass, otherwise
there is too much latency added for clients to receive a reply while the
server is busy in some way (for example during the DB loading).
2014-04-24 21:44:32 +02:00
antirez
9cd2dfec0b Accept multiple clients per iteration.
When the listening sockets readable event is fired, we have the chance
to accept multiple clients instead of accepting a single one. This makes
Redis more responsive when there is a mass-connect event (for example
after the server startup), and in workloads where a connect-disconnect
pattern is used often, so that multiple clients are waiting to be
accepted continuously.

As a side effect, this commit makes the LOADING, BUSY, and similar
errors much faster to deliver to the client, making Redis more
responsive when there is to return errors to inform the clients that the
server is blocked in an not interruptible operation.
2014-04-24 21:44:32 +02:00
antirez
c07b94e7f4 AE_ERR -> ANET_ERR in acceptUnixHandler().
No actual changes since the value is the same.
2014-04-24 21:43:22 +02:00
antirez
845945cad2 While ANET_ERR is -1, check syscall retval for -1 itself. 2014-04-24 17:03:07 +02:00
antirez
fe8ce2b064 clusterLoadConfig() REDIS_ERR retval semantics refined.
We should return REDIS_ERR to signal we can't read the configuration
because there is no config file only after checking errno, othewise
we risk to rewrite an existing file that was not accessible for some
other reason.
2014-04-24 16:23:03 +02:00
antirez
52668c900f Lock nodes.conf to avoid multiple processes using the same file.
This was a common source of problems among users.
The solution adopted is not bullet-proof as if the user deletes the
nodes.conf file manually, and starts a new instance with the same
nodes.conf file path, two instances will use the same file. However
following this reasoning the user may drop a nuclear bomb into the
datacenter as well.
2014-04-24 16:04:10 +02:00