1912 Commits

Author SHA1 Message Date
Salvatore Sanfilippo
cff7e102bb Merge pull request #895 from badboy/catch_con_error
redis-cli: always exit if connection fails.
2013-01-19 01:27:56 -08:00
bitterb
d5424943de Fix an error reply for CLIENT command 2013-01-19 14:11:33 +09:00
Jan-Erik Rediger
2d31425356 Always exit if connection fails.
This avoids unnecessary core dumps. Fixes antirez/redis#894
2013-01-18 10:13:10 +01:00
Nathan Parry
49ea5589c0 redis-cli --rdb fails if server sends a ping
Redis pings slaves in "pre-synchronization stage" with newlines. (See
https://github.com/antirez/redis/blob/2.6.9/src/replication.c#L814)
However, redis-cli does not expect this - it sees the newline as the end
of the bulk length line, and ends up returning 0 as the bulk length.
This manifests as the following when running redis-cli:

    $ ./src/redis-cli --rdb some_file
    SYNC sent to master, writing 0 bytes to 'some_file'
    Transfer finished with success.

With this commit, we just ignore leading newlines while reading the bulk
length line.

To reproduce the problem, load enough data into Redis so that the
preparation of the RDB snapshot takes long enough for a ping to occur
while redis-cli is waiting for the data.
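
The fix amounts to skipping '\n' characters before parsing the "$<len>"
header. A minimal sketch of the idea (read_bulk_len() and its direct
read(2) loop are illustrative helpers, not the actual redis-cli code):

    #include <unistd.h>
    #include <stdlib.h>

    /* Read the bulk length line of the SYNC reply from fd, ignoring any
     * leading newlines the master may send as pre-synchronization pings. */
    static long long read_bulk_len(int fd) {
        char buf[64], c;
        unsigned int i = 0;

        do {
            if (read(fd, &c, 1) != 1) return -1;
        } while (c == '\n');                  /* skip the ping newlines */
        if (c != '$') return -1;              /* expect a bulk reply */
        while (i < sizeof(buf)-1) {
            if (read(fd, &c, 1) != 1) return -1;
            if (c == '\r') break;
            buf[i++] = c;
        }
        buf[i] = '\0';
        if (read(fd, &c, 1) != 1) return -1;  /* consume the final '\n' */
        return strtoll(buf, NULL, 10);
    }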
2013-01-18 00:10:58 -05:00
charsyam
5596ba6ede redis-cli prompt bug fix 2013-01-16 17:20:54 -08:00
antirez
64c7380fa4 redis-cli: save an RDB dump from remote server to local file. 2013-01-16 19:44:37 +01:00
antirez
7d11268b51 Typo fixed, ASCI -> ASCII. 2013-01-15 13:34:13 +01:00
antirez
1472cae960 CLIENT GETNAME and CLIENT SETNAME introduced.
Sometimes it is much simpler to debug complex Redis installations if it
is possible to assign clients a name that is displayed in the CLIENT
LIST output.

This is the case, for example, for "leaked" connections. The ability to
give a client a name makes it quite trivial to understand which part of
the code implementing the client is not releasing resources
appropriately.

Behavior:

    CLIENT SETNAME: set a name for the client, or remove the current
                    name if an empty name is set.
    CLIENT GETNAME: get the current name, or a nil.
    CLIENT LIST: now displays the client name if any.

Thanks to Mark Gravell for pushing this idea forward.
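
For example, from redis-cli (session sketch, CLIENT LIST output
abbreviated):

    redis 127.0.0.1:6379> CLIENT SETNAME queue-worker
    OK
    redis 127.0.0.1:6379> CLIENT GETNAME
    "queue-worker"
    redis 127.0.0.1:6379> CLIENT LIST
    addr=127.0.0.1:52555 fd=5 name=queue-worker ...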
2013-01-15 13:34:10 +01:00
antirez
e84a6cd10c Undo slave-master handshake when SLAVEOF sets a new slave.
Issue #828 shows how Redis was not correctly undoing a non-blocking
connection attempt with the previous master when the master was set to a
new address using the SLAVEOF command.

This was also a result of lack of refactoring, so now there is a
function to cancel the non-blocking handshake with the master.
The new function is now used when SLAVEOF NO ONE is called or when
SLAVEOF is used to set the master to a different address.
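
Roughly, the cancellation boils down to something like the sketch below
(the helper name and the two state variables are illustrative stand-ins
for the real replication state kept in replication.c):

    #include <unistd.h>

    static int master_fd = -1;            /* socket of the in-progress handshake */
    static int handshake_in_progress = 0; /* non-blocking connect/handshake active */

    /* Abort a non-blocking connect / handshake with the master, so that
     * SLAVEOF NO ONE or SLAVEOF host port can safely proceed. */
    static void cancelMasterHandshake(void) {
        if (handshake_in_progress) {
            close(master_fd);             /* drop the half-established link */
            master_fd = -1;
            handshake_in_progress = 0;    /* back to the "need to connect" state */
        }
    }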
2013-01-15 13:33:24 +01:00
antirez
affbd1a90a Makefile.dep updated. 2013-01-11 23:50:32 +01:00
antirez
cdc8fd2d9e Comment in the call() function clarified a bit. 2013-01-10 11:19:40 +01:00
antirez
6130e0373d Multiple fixes for EVAL (issue #872).
1) The event handler was not restored after a timeout condition if the
   command was eventually executed successfully.
2) The command was not converted to EVAL in case of errors in the middle
   of the execution.
3) Terrible duplication of code without any apparent reason.
2013-01-10 10:46:05 +01:00
Bilal Husain
5e46ca0f2d s/adiacent/adjacent/
fixed typo in a comment (step 2 memcheck)
2013-01-09 21:46:58 +05:30
antirez
0fc5457a7f Better error reporting when fd event creation fails. 2013-01-03 14:29:34 +01:00
antirez
a74c0a06b5 ae.c: set errno when error is not a failing syscall.
In this way the caller is able to perform better error checking or to
use strerror() without the risk of meaningless error messages being
displayed.
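
The pattern is simply to set errno by hand on error paths where no
syscall failed. A self-contained sketch (create_event() and the choice
of ERANGE are illustrative, not quotes from ae.c):

    #include <errno.h>

    #define AE_OK 0
    #define AE_ERR -1

    /* Reject an fd outside the configured set size: no syscall failed,
     * so set errno explicitly to keep strerror(errno) meaningful. */
    static int create_event(int fd, int setsize) {
        if (fd >= setsize) {
            errno = ERANGE;
            return AE_ERR;
        }
        /* ... register the event as usual ... */
        return AE_OK;
    }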
2013-01-03 14:29:20 +01:00
antirez
1d235fa5ad Fix overflow in mstime() in redis-cli and benchmark.
The problem does not exist in the Redis server implementation of mstime()
but is only limited to redis-cli and redis-benchmark.
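
The overflow happens because tv_sec*1000 is computed in (possibly
32-bit) long arithmetic. A sketch of the corrected helper:

    #include <sys/time.h>

    /* Return the UNIX time in milliseconds. The cast forces the
     * multiplication to happen in 64 bits, avoiding the overflow. */
    static long long mstime(void) {
        struct timeval tv;
        long long mst;

        gettimeofday(&tv, NULL);
        mst = ((long long)tv.tv_sec)*1000;
        mst += tv.tv_usec/1000;
        return mst;
    }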

This fixes issue #839.
2012-12-20 15:20:55 +01:00
antirez
1d2eaf4cb4 serverCron() frequency is now a runtime parameter (was REDIS_HZ).
REDIS_HZ is the frequency our serverCron() function is called with.
A more frequent call to this function results in lower latency when the
server is trying to handle very expensive background operations like
mass expiration of a lot of keys at the same time.

Redis 2.4 used to have an HZ of 10. This was good enough with almost
every setup, but the incremental key expiration algorithm was working a
bit better under *extreme* pressure when HZ was set to 100 for Redis
2.6.

However for most users a latency spike of 30 milliseconds when millions
of keys are expiring at the same time is acceptable; on the other hand a
default HZ of 100 in Redis 2.6 was causing idle instances to use some
CPU time compared to Redis 2.4. The CPU usage was in the order of 0.3%
for an idle instance, which is still a shame: it means more energy
consumed by the server, even if no important resources are wasted.

This commit introduces HZ as a runtime parameter, that can be queried by
INFO or CONFIG GET, and can be modified with CONFIG SET. At the same
time the default frequency is set back to 10.

In this way we default to a sane value of 10, but allow users to
easily switch to values up to 500 for near real-time applications if
needed and if they are willing to pay this small CPU usage penalty.
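
For example (session sketch, assuming the parameter is exposed under the
name "hz"):

    redis 127.0.0.1:6379> CONFIG GET hz
    1) "hz"
    2) "10"
    redis 127.0.0.1:6379> CONFIG SET hz 100
    OK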
2012-12-14 17:10:40 +01:00
Salvatore Sanfilippo
f876e2776d Merge pull request #824 from ptjm/unstable
Define _XOPEN_SOURCE appropriately on NetBSD.
2012-12-12 09:34:41 -08:00
Patrick TJ McPhee
0b58a57d78 Define _XOPEN_SOURCE appropriately on NetBSD. 2012-12-12 10:49:12 -05:00
antirez
9dc3b14872 Fix config.h endianness detection to work on Linux / PPC64.
config.h performs endianness detection by including OS-specific headers
that define the endianness macros or, when this is not possible, by
checking the processor type via ifdefs.

Sometimes when the OS-specific header is included only __BYTE_ORDER is
defined, while BYTE_ORDER remains undefined. There is code at the end of
the config.h endianness detection to define the macros without the
underscore, but it was not working correctly.

This commit fixes endianness detection, fixing Redis on Linux / PPC64 and
possibly other systems.
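
The fallback boils down to something like the following (sketch of the
idea, not the literal config.h block):

    /* If the OS headers only defined the underscore variants, derive the
     * plain macros from them so the rest of the code can use BYTE_ORDER. */
    #if !defined(BYTE_ORDER) && defined(__BYTE_ORDER) && \
        defined(__LITTLE_ENDIAN) && defined(__BIG_ENDIAN)
    #define LITTLE_ENDIAN __LITTLE_ENDIAN
    #define BIG_ENDIAN __BIG_ENDIAN
    #define BYTE_ORDER __BYTE_ORDER
    #endif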
2012-12-11 17:01:00 +01:00
antirez
0200e4f6ef Memory leak fixed: release client's bpop->keys dictionary.
Refactoring performed after issue #801 resolution (see commit
b775d5c3e9b4184c931fe2250a08ff1c3fb3b080) introduced a memory leak that
is fixed by this commit.

I simply forgot to free the newly allocated dictionary in the client
structure, trusting the output of "make test" on OSX.

However due to changes in the "leaks" utility the test was no longer
testing memory leaks. This problem was also fixed.

Fortunately the CI test running at ci.redis.io spotted the bug in the
valgrind run.

The leak never made it into a stable release.
2012-12-03 12:12:53 +01:00
antirez
b775d5c3e9 Blocking POP: use a dictionary to store keys client side.
To store the keys we block on during a blocking POP operation, when the
client is blocked waiting for more data to arrive, we used a simple
linear array of redis objects in the blockingState structure:

    robj **keys;
    int count;

However in order to fix issue #801 we also used a dictionary, to avoid
ending up in the blocked clients queue for the same key multiple times
with the same client.

The dictionary was only temporary, just to avoid duplicates, but since
we create / destroy it anyway there is no point in doing this duplicated
work, so this commit simply uses a dictionary as the main structure to
store the keys we are blocked on. So instead of the previous fields we now
just have:

    dict *keys;

This simplifies the code and reduces the work done by the server during
a blocking POP operation.
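
The blocked-keys part of blockingState therefore ends up looking roughly
like this (sketch; the surrounding fields are shown only for context):

    typedef struct blockingState {
        dict *keys;         /* Keys we are blocked on; the dict also acts as
                             * the duplicate filter (values are unused). */
        time_t timeout;     /* When the blocking operation times out. */
        robj *target;       /* Destination key for BRPOPLPUSH, or NULL. */
    } blockingState;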
2012-12-02 20:43:15 +01:00
antirez
5f622803ec Client should not block multiple times on the same key.
Sending a command like:

BLPOP foo foo foo foo 0

This resulted in a crash before this commit, since the client ended up
being inserted into the waiting list for this key multiple times.
The function handleClientsBlockedOnLists() then failed because we have
code like this:

    if (de) {
        list *clients = dictGetVal(de);
        int numclients = listLength(clients);

        while(numclients--) {
            listNode *clientnode = listFirst(clients);

            /* serve clients here... */
        }
    }

The code that serves clients removes the served client from the waiting
list, so if a client is blocked multiple times, eventually the call to
listFirst() will return NULL or, worse, will access random memory, since
the list may no longer exist: it is removed by the function
unblockClientWaitingData() once there are no more clients waiting for
this list.

To avoid making the rest of the implementation more complex, this commit
modifies blockForKeys() so that a client will be put just a single time
into the waiting list for a given key.
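
With a dictionary used as a duplicate filter, this is essentially a
one-liner inside blockForKeys(); a sketch of the idea (the surrounding
loop and field names are illustrative):

    for (j = 0; j < numkeys; j++) {
        /* If the key is already in the client's bpop dictionary the add
         * fails and we simply skip it, so the client is queued only once
         * in the per-key list of waiting clients. */
        if (dictAdd(c->bpop.keys, keys[j], NULL) != DICT_OK) continue;
        incrRefCount(keys[j]);
        /* ... append the client to the list of clients blocked on keys[j] ... */
    }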

Since it is Saturday, I hope this fixes issue #801.
2012-12-02 20:43:07 +01:00
antirez
68e2d2ce07 SDIFF is now able to select between two algorithms for speed.
SDIFF used an algorithm that was O(N) where N is the total number
of elements of all the sets involved in the operation.

The algorithm worked like that:

ALGORITHM 1:

1) For the first set, add all the members to an auxiliary set.
2) For all the other sets, remove all the members of the set from the
auxiliary set.

So it is an O(N) algorithm where N is the total number of elements in
all the sets involved in the diff operation.

Cristobal Viedma suggested to modify the algorithm to the following:

ALGORITHM 2:

1) Iterate all the elements of the first set.
2) For every element, check if the element also exists in all the other
remaining sets.
3) Add the element to the auxiliary set only if it does not exist in any
of the other sets.

The complexity of this algorithm on the worst case is O(N*M) where N is
the size of the first set and M the total number of sets involved in the
operation.

However when there are elements in common, with this algorithm we can
stop the computation for a given element as soon as we find it in
another set.

I (antirez) added an additional step to algorithm 2 to make it faster:
the sets to subtract are sorted from the biggest to the smallest, so
that it is more likely to find a duplicate in the larger sets, which are
checked before the smaller ones.

WHAT IS BETTER?

Neither, of course: for instance if the first set is much larger than
the other sets, the second algorithm does a lot more work compared to
the first algorithm.

Similarly, if the first set is much smaller than the other sets, the
original algorithm will do a lot more work than the second one.

So this commit makes Redis able to guess the number of operations
required by each algorithm, and select the best at runtime according
to the input received.

However, since the second algorithm has better constant times and can do
less work if there are duplicated elements, an advantage is given to the
second algorithm.
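
A sketch of the runtime selection, using the algorithm numbering of this
message (setTypeSize() returns a set's cardinality; halving the estimate
of algorithm 2 implements the "advantage" mentioned above, and the exact
weighting is illustrative):

    long long algo_one_work = 0, algo_two_work = 0;
    int j, diff_algo;

    for (j = 0; j < setnum; j++) {
        if (sets[j] == NULL) continue;
        algo_one_work += setTypeSize(sets[j]);  /* sum of all cardinalities */
        algo_two_work += setTypeSize(sets[0]);  /* first set, once per set  */
    }
    algo_two_work /= 2;                         /* favor algorithm 2 */
    diff_algo = (algo_two_work <= algo_one_work) ? 2 : 1;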
2012-11-30 16:36:42 +01:00
antirez
95cf003ba5 redis-benchmark: seed the PRNG with time() at startup. 2012-11-30 15:41:09 +01:00
antirez
a5af538d98 Introduced the Build ID in INFO and --version output.
The idea is to be able to identify a build in a unique way, so for
instance after a bug report we can recognize that the build is the one
of a popular Linux distribution and perform the debugging in the same
environment.
2012-11-29 14:20:08 +01:00
antirez
8f5635d339 On-crash memory test rewritten so that it actually works.
1) We no longer test location by location, otherwise the CPU write cache
completely defeats the purpose of the test.
2) We still need a memory test that operates in steps from the first to
the last location in order to never hit the cache, but that is still
able to retain the memory content.

This was tested using a Linux box containing a bad memory module with a
single bit error (always zero).

So the final solution has an error propagation step that is:

1) Invert bits at every location.
2) Swap adjacent locations.
3) Swap adjacent locations again.
4) Invert bits at every location.
5) Swap adjacent locations.
6) Swap adjacent locations again.

Before and after these steps, and after step 4, a CRC64 checksum is computed.
If the three CRC64 checksums don't match, a memory error was detected.
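
The two primitives used by the steps above look roughly like this
(illustrative sketch operating word by word on a [start,end) region, not
the exact debug.c code):

    static void memtest_invert(unsigned long *start, unsigned long *end) {
        unsigned long *p;
        for (p = start; p < end; p++) *p = ~*p;
    }

    static void memtest_swap_pairs(unsigned long *start, unsigned long *end) {
        unsigned long *p, tmp;
        for (p = start; p+1 < end; p += 2) {
            tmp = p[0]; p[0] = p[1]; p[1] = tmp;
        }
    }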
2012-11-29 10:24:35 +01:00
charsyam
08402aee73 Remove unnecessary condition in _dictExpandIfNeeded (dict.c) 2012-11-28 11:44:39 +01:00
antirez
d6e0bcf436 Merge remote-tracking branch 'origin/unstable' into unstable 2012-11-28 11:41:27 +01:00
Matt Arsenault
e1f5d3ca6f It's a watchdog, not a watchdong. 2012-11-28 11:35:19 +01:00
charsyam
edfe8811aa remove compile warning bioKillThreads 2012-11-23 05:52:39 +08:00
antirez
ed54464034 EVALSHA is now case insensitive.
EVALSHA used to crash if the SHA1 was not lowercase (Issue #783).
Fixed using a case insensitive dictionary type for the sha -> script
map used for replication of scripts.
2012-11-22 15:50:00 +01:00
antirez
35c7312f59 Fix integer overflow in zunionInterGenericCommand().
This fixes issue #761.
2012-11-22 15:28:28 +01:00
antirez
0d4ca9a874 Safer handling of MULTI/EXEC on errors.
After the transaction starts with a MULTI, the previous behavior was to
return an error on problems such as the maxmemory limit being reached,
but still to execute the transaction, with the subset of successfully
queued commands, on EXEC.

While it is true that the client was able to check for errors by
distinguishing QUEUED from an error reply, MULTI/EXEC in most client
implementations uses pipelining for speed, so all the commands and the
EXEC are sent without checking the replies.

With this change:

1) EXEC fails if at least one command was not queued because of an
error. The EXECABORT error is used.
2) A generic error is always reported on EXEC.
3) The client DISCARDs the MULTI state after a failed EXEC, otherwise
pipelining multiple transactions would be basically impossible:
After a failed EXEC the next transaction would be simply queued as
the tail of the previous transaction.
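
From the client's point of view this looks like the following session
(sketch; exact error wording may differ):

    redis 127.0.0.1:6379> MULTI
    OK
    redis 127.0.0.1:6379> NOTACOMMAND foo
    (error) ERR unknown command 'NOTACOMMAND'
    redis 127.0.0.1:6379> SET key value
    QUEUED
    redis 127.0.0.1:6379> EXEC
    (error) EXECABORT Transaction discarded because of previous errors.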
2012-11-22 10:32:07 +01:00
antirez
a68a4463f7 Make bio.c threads killable ASAP if needed.
We use this new bio.c feature in order to stop our I/O threads if there
is a memory test to do on crash. In this case we don't want anything
other than the main thread to run, otherwise the other threads may mess
with the heap and the memory test would report a false positive.
2012-11-22 10:12:11 +01:00
antirez
022bd293a6 Fast memory test on Redis crash. 2012-11-21 13:24:44 +01:00
antirez
49edc56291 Use more fine grained HAVE macros instead of HAVE_PROCFS. 2012-11-21 13:17:38 +01:00
charsyam
fad954fd74 fix randstring bug 2012-11-20 02:50:31 +08:00
antirez
af6b9a87d7 Children creating AOF or RDB files now report memory used by COW.
Finally Redis is able to report the amount of memory used by
copy-on-write while saving an RDB or writing an AOF file in background.

Note that this information is currently only logged (at NOTICE level)
and not shown in INFO because this is less trivial (but surely doable
with some minor form of interprocess communication).

The reason we can't capture this information in the parent before we
call wait3() is that the Linux kernel will release the child's memory
ASAP and only retain the minimal state that is needed to report the
child's termination to the parent.

The COW size is obtained by summing all the Private_Dirty fields found
in the "smaps" file inside the proc filesystem for the process.

All this is Linux specific and is not available on other systems.
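
A minimal sketch of how those fields can be summed (helper name and
parsing details are illustrative; the real helper is added in the
zmalloc_get_private_dirty() commit below):

    #include <stdio.h>
    #include <string.h>
    #include <stdlib.h>

    /* Sum the Private_Dirty: fields (reported in kB) of /proc/<pid>/smaps
     * and return the total in bytes. Returns 0 on error. */
    static size_t private_dirty_bytes(long pid) {
        char line[1024], filename[64];
        size_t total = 0;
        FILE *fp;

        snprintf(filename, sizeof(filename), "/proc/%ld/smaps", pid);
        if ((fp = fopen(filename, "r")) == NULL) return 0;
        while (fgets(line, sizeof(line), fp) != NULL) {
            if (strncmp(line, "Private_Dirty:", 14) == 0)
                total += strtol(line+14, NULL, 10) * 1024;
        }
        fclose(fp);
        return total;
    }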
2012-11-19 12:02:08 +01:00
antirez
be336f7764 zmalloc_get_private_dirty() function added (Linux only).
For non-Linux systems it just returns 0.

This function is useful to estimate the copy-on-write overhead caused
by children saving data on disk.
2012-11-19 11:47:35 +01:00
antirez
e4b176ec57 zmalloc: kill unused __size parameter in update_zmalloc_stat_alloc() macro. 2012-11-14 12:52:38 +01:00
antirez
d9b02a38e6 Merge branch 'migrate-cache' into unstable 2012-11-14 12:21:23 +01:00
antirez
ee9ab14628 MIGRATE: retry one time on I/O error.
Now that we cache connections, a retry attempt makes sure that the
operation doesn't fail just because there is an existing connection error
on the socket, like the other end closing the connection.

Unfortunately this condition is not detectable using
getsockopt(SO_ERROR), so the only option left is to retry.

We don't retry on timeouts.
2012-11-14 11:30:24 +01:00
antirez
547c83bf5d TTL API change: TTL returns -2 for non existing keys.
The previous behavior was to return -1 if:

1) Existing key but without an expire set.
2) Non existing key.

Now the second case is handled differently, and TTL will return -2
if the key does not exist at all.

PTTL follows the same behavior as well.
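
For example (session sketch):

    redis 127.0.0.1:6379> SET foo bar
    OK
    redis 127.0.0.1:6379> TTL foo
    (integer) -1
    redis 127.0.0.1:6379> TTL nosuchkey
    (integer) -2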
2012-11-12 23:04:36 +01:00
antirez
a2f07e9b00 MIGRATE: fix default timeout to 1000 milliseconds.
When a timeout <= 0 is provided we set a default timeout of 1 second.
It was erroneously set to 1 millisecond by a recent change.
2012-11-12 18:54:35 +01:00
antirez
0b2596147a MIGRATE count of cached sockets in INFO output. 2012-11-12 14:01:56 +01:00
antirez
0b98eb03cc MIGRATE timeout should be in milliseconds.
While it is documented that the MIGRATE timeout is in milliseconds, it
was in seconds instead. This commit fixes the problem.
2012-11-12 14:01:02 +01:00
antirez
6561ce7666 MIGRATE TCP connections caching.
By caching TCP connections used by MIGRATE to chat with other Redis
instances a 5x performance improvement was measured with
redis-benchmark against small keys.

This can dramatically speed up cluster resharding and other processes
where a high load of MIGRATE commands is used.
2012-11-12 00:47:24 +01:00
antirez
a32d1ddff6 BSD license added to every C source and header file. 2012-11-08 18:31:32 +01:00
antirez
7736dc2e0f COPY and REPLACE options for MIGRATE.
With COPY now MIGRATE does not remove the key from the source instance.
With REPLACE it uses RESTORE REPLACE on the target host so that even if
the key already exists in the target instance it will be overwritten.

The options can be used together.
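
Example invocation (session sketch; host, port, key and timeout are
arbitrary):

    redis 127.0.0.1:6379> MIGRATE 192.168.1.2 6380 mykey 0 5000 COPY REPLACE
    OK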
2012-11-07 15:32:27 +01:00