5190 Commits

Author SHA1 Message Date
Salvatore Sanfilippo
c12de2c04a Merge pull request #1744 from mattsta/better-RDB-failure-error
Improve Loading RDB Failure Action
2015-01-28 17:30:27 +01:00
Matt Stancliff
8ada516fb6 Improve RDB error-on-load handling
Previously if we loaded a corrupt RDB, Redis printed an error report
with a big "REPORT ON GITHUB" message at the bottom.  But, we know
RDB load failures are corrupt data, not corrupt code.

Now when RDB failure is detected (duplicate keys or unknown data
types in the file), we run check-rdb against the RDB then exit.  The
automatic check-rdb hopefully gives the user instant feedback
about what is wrong instead of providing a mysterious stack
trace.
2015-01-28 11:19:00 -05:00
Matt Stancliff
2902fe7597 Remove code duplication from check-rdb
redis-check-rdb (previously redis-check-dump) had every RDB define
copy/pasted from rdb.h and some defines copied from redis.h.  Since
the initial copy, some constants had changed in Redis headers and
check-dump was using incorrect values.

Since check-rdb is now a mode of Redis, the old check-dump code
is cleaned up to:
  - replace all printf with redisLog (and remove \n from all strings)
  - remove all copy/pasted defines to use defines from rdb.h and redis.h
  - replace all malloc/free with zmalloc/zfree
  - remove unnecessary include headers
2015-01-28 11:18:18 -05:00
Matt Stancliff
302a22ad8c Convert check-dump to Redis check-rdb mode
redis-check-dump is now named redis-check-rdb and it runs
as a mode of redis-server instead of an independent binary.

You can now use 'redis-server redis.conf --check-rdb' to check
the RDB defined in redis.conf.  Using argument --check-rdb
checks the RDB and exits.  We could potentially also allow
the server to continue starting if the RDB check succeeds.

This change also enables us to use RDB checking programmatically
from inside Redis for certain failure conditions.
2015-01-28 11:18:16 -05:00
mattcollier
3db601e064 Update redis-cli.c
Code was adding '\n' (line 521) to the end of NIL values exclusively, making CSV output inconsistent. Removed '\n'.
2015-01-25 14:01:39 -05:00
antirez
f46c66ec9a Cluster: initialize unused fields in gossip section.
Otherwise we risk sending uninitialized data, which may contain
anything, to other nodes. In practice this could not happen, but only
because the memset() that initializes the buffer where the cluster
packet header is created covered more than the 3 gossip sections we
use, so that memory was already filled with zeroes.
2015-01-24 07:52:24 +01:00
antirez
848a382619 dict.c: make chaining strategy more clear in dictAddRaw(). 2015-01-23 18:11:05 +01:00
antirez
eed24343df DEBUG structsize
Show sizes of a few important data structures in Redis. More missing.
2015-01-23 18:10:14 +01:00
antirez
a301cd0de4 Avoid duplicated instance execution code in Cluster test. 2015-01-22 18:59:39 +01:00
antirez
527961f64a Merge branch 'clusterfixes' into unstable 2015-01-22 16:31:14 +01:00
antirez
46670cd1ff Cluster test: when valgrind is enabled, use a larger node-timeout.
Removes some percentage of timing related failures.
2015-01-22 16:08:21 +01:00
antirez
453ddb1c20 The seed must be static in getRandomHexChars(). 2015-01-22 11:10:50 +01:00
antirez
bf77170352 The seed must be static in getRandomHexChars(). 2015-01-22 11:10:43 +01:00
antirez
9cdd8c5599 counter must be static in getRandomHexChars(). 2015-01-22 11:00:26 +01:00
antirez
0d76782d97 getRandomHexChars(): use /dev/urandom just to seed.
On Darwin /dev/urandom depletes terribly fast. This is not an issue
normally, but with Redis Cluster we generate a lot of unique IDs, for
example during node handshakes. Our IDs just need to be unique, without
other strong crypto requirements, so this commit turns the function into
something that gets a 20-byte seed from /dev/urandom and produces the
rest of the output using just SHA1 in counter mode.
2015-01-21 23:21:55 +01:00
antirez
106c183300 Merge branch 'clusterfixes' into unstable 2015-01-21 19:30:22 +01:00
antirez
ebe1f52ddb Cluster test initialization: use transaction for reset + set-config-epoch.
Otherwise between the two commands other nodes may contact us making the
next SET-CONFIG-EPOCH call impossible.
2015-01-21 18:48:08 +01:00
Matt Stancliff
ccc4753ae6 Fix cluster migrate memory leak
Fixes valgrind error:
48 bytes in 1 blocks are definitely lost in loss record 196 of 373
   at 0x4910D3: je_malloc (jemalloc.c:944)
   by 0x42807D: zmalloc (zmalloc.c:125)
   by 0x41FA0D: dictGetIterator (dict.c:543)
   by 0x41FA48: dictGetSafeIterator (dict.c:555)
   by 0x459B73: clusterHandleSlaveMigration (cluster.c:2776)
   by 0x45BF27: clusterCron (cluster.c:3123)
   by 0x423344: serverCron (redis.c:1239)
   by 0x41D6CD: aeProcessEvents (ae.c:311)
   by 0x41D8EA: aeMain (ae.c:455)
   by 0x41A84B: main (redis.c:3832)
2015-01-21 18:47:16 +01:00
Matt Stancliff
f41f3403c6 Fix potential invalid read past end of array
If array has N elements, we can't read +1 if we are already at N.

Also, we need to move elements by their storage size in the array,
not just by individual bytes.
2015-01-21 18:01:03 +01:00
Matt Stancliff
ae87d2454b Fix cluster reset memory leak
[maybe] Fixes valgrind errors:
32 bytes in 4 blocks are definitely lost in loss record 107 of 228
   at 0x80EA447: je_malloc (jemalloc.c:944)
   by 0x806E59C: zrealloc (zmalloc.c:125)
   by 0x80A9AFC: clusterSetMaster (cluster.c:801)
   by 0x80AEDC9: clusterCommand (cluster.c:3994)
   by 0x80682A5: call (redis.c:2049)
   by 0x8068A20: processCommand (redis.c:2309)
   by 0x8076497: processInputBuffer (networking.c:1143)
   by 0x8073BAF: readQueryFromClient (networking.c:1208)
   by 0x8060E98: aeProcessEvents (ae.c:412)
   by 0x806123B: aeMain (ae.c:455)
   by 0x806C3DB: main (redis.c:3832)

64 bytes in 8 blocks are definitely lost in loss record 143 of 228
   at 0x80EA447: je_malloc (jemalloc.c:944)
   by 0x806E59C: zrealloc (zmalloc.c:125)
   by 0x80AAB40: clusterProcessPacket (cluster.c:801)
   by 0x80A847F: clusterReadHandler (cluster.c:1975)
   by 0x30000FF: ???

80 bytes in 10 blocks are definitely lost in loss record 148 of 228
   at 0x80EA447: je_malloc (jemalloc.c:944)
   by 0x806E59C: zrealloc (zmalloc.c:125)
   by 0x80AAB40: clusterProcessPacket (cluster.c:801)
   by 0x80A847F: clusterReadHandler (cluster.c:1975)
   by 0x2FFFFFF: ???
2015-01-21 17:51:57 +01:00
Matt Stancliff
7167c2dc8e Fix sending uninitialized bytes
Fixes valgrind error:
Syscall param write(buf) points to uninitialised byte(s)
   at 0x514C35D: ??? (syscall-template.S:81)
   by 0x456B81: clusterWriteHandler (cluster.c:1907)
   by 0x41D596: aeProcessEvents (ae.c:416)
   by 0x41D8EA: aeMain (ae.c:455)
   by 0x41A84B: main (redis.c:3832)
 Address 0x5f268e2 is 2,274 bytes inside a block of size 8,192 alloc'd
   at 0x4932D1: je_realloc (jemalloc.c:1297)
   by 0x428185: zrealloc (zmalloc.c:162)
   by 0x4269E0: sdsMakeRoomFor.part.0 (sds.c:142)
   by 0x426CD7: sdscatlen (sds.c:251)
   by 0x4579E7: clusterSendMessage (cluster.c:1995)
   by 0x45805A: clusterSendPing (cluster.c:2140)
   by 0x45BB03: clusterCron (cluster.c:2944)
   by 0x423344: serverCron (redis.c:1239)
   by 0x41D6CD: aeProcessEvents (ae.c:311)
   by 0x41D8EA: aeMain (ae.c:455)
   by 0x41A84B: main (redis.c:3832)
 Uninitialised value was created by a stack allocation
   at 0x457810: nodeUpdateAddressIfNeeded (cluster.c:1236)
2015-01-21 17:50:17 +01:00
antirez
3f759958ac Cluster test: wait for port to be unbound in kill_instance.
Otherwise kill_instance + restart_instance in short succession will
still find the port busy and will fail.
2015-01-21 16:47:36 +01:00
antirez
9a540bf3cb AOF rewrite: set iterator var to NULL when freed.
The cleanup code expects that if 'di' is not NULL, it is a valid
iterator that should be freed.

The result of this bug was a crash of the AOF rewriting process if an
error occurred after the DBs data are written and the iterator is no
longer valid.
2015-01-21 16:42:08 +01:00
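The pattern behind the AOF rewrite fix above, sketched in Python under stated assumptions: `Iterator` and `rewrite` are hypothetical stand-ins, not Redis functions. The point is only that a shared cleanup path treats "not None" as "still valid", so the variable must be cleared at the moment the iterator is released.

```python
class Iterator:
    """Stand-in for a dict iterator that must be released exactly once."""

    def __init__(self):
        self.released = False

    def release(self):
        if self.released:
            raise RuntimeError("iterator released twice")
        self.released = True


def rewrite(fail_after_release):
    """Return True on success, False if the error path was taken."""
    di = Iterator()
    try:
        # ... write the dataset using `di` ...
        di.release()
        di = None  # the fix: mark the iterator as no longer valid
        if fail_after_release:
            raise IOError("write error after the DB data was written")
        return True
    except IOError:
        # Shared cleanup path: only release an iterator that is still valid.
        if di is not None:
            di.release()
        return False
```

Without the `di = None` line, the error path would release the iterator a second time, which is the Python analogue of the crash described in the commit message.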
antirez
b71060f360 Cluster/Sentinel test: also pause on abort_sentinel_test call. 2015-01-21 16:18:34 +01:00
antirez
507ca219b4 Cluster/Sentinel test: pause on exceptions as well. 2015-01-21 16:13:30 +01:00
antirez
735adaa62d Cluster: node deletion cleanup / centralization. 2015-01-21 16:03:43 +01:00
antirez
c13e0820ad Cluster: set the slaves->slaveof field to NULL when master is freed.
Related to issue #2289.
2015-01-21 15:55:53 +01:00
Matt Stancliff
e240d16cc9 Add --track-origins=yes to valgrind 2015-01-21 15:48:19 +01:00
Matt Stancliff
52aa050a79 Tell sentinel/cluster tests to allow valgrind 2015-01-21 15:04:12 +01:00
antirez
e63ad12b8f Fix gcc warning for lack of casting to char pointer. 2015-01-21 14:51:42 +01:00
antirez
2d7e7141a5 luaRedisGenericCommand(): log error at WARNING level when re-entered.
Rationale is that when re-entering, it is likely due to Lua debugging
hooks. Returning an error will be ignored in most cases, going totally
unnoticed. With the log at least we leave a trace.

Related to issue #2302.
2015-01-20 23:21:21 +01:00
antirez
10cb7e83d9 luaRedisGenericCommand() recursion: just return an error.
Instead of calling redisPanic() to abort the server.

Related to issue #2302.
2015-01-20 23:16:19 +01:00
antirez
2249f0d386 Panic on recursive calls to luaRedisGenericCommand().
Related to issue #2302.
2015-01-20 18:02:26 +01:00
Matt Stancliff
8958c39e71 Improve networking type correctness
read() and write() return ssize_t (signed long), not int.

For other offsets, we can use the unsigned size_t type instead
of a signed offset (since our replication offsets and buffer
positions are never negative).
2015-01-19 14:10:12 -05:00
Matt Stancliff
0c611363e5 Improve RDB type correctness
It's possible large objects could be larger than 'int', so let's
upgrade all size counters to ssize_t.

This also fixes rdbSaveObject serialized bytes calculation.
Since entire serializations of data structures can be large,
we don't want to limit their calculated size to a 32-bit signed max.

This commit increases object size calculation and
cascades the change back up to serializedlength printing.

Before:
127.0.0.1:6379> debug object hihihi
... encoding:quicklist serializedlength:-2147483559 ...

After:
127.0.0.1:6379> debug object hihihi
... encoding:quicklist serializedlength:2147483737 ...
2015-01-19 14:10:12 -05:00
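The Before/After values in the commit above are the same number seen through different integer widths. A short Python check, using `ctypes.c_int32` only as a convenient way to reinterpret the low 32 bits as signed (the helper name `as_int32` is made up for this sketch):

```python
import ctypes


def as_int32(n):
    """Reinterpret the low 32 bits of n as a signed 32-bit integer."""
    return ctypes.c_int32(n).value


serialized_length = 2147483737      # the "After" value from the commit message
print(as_int32(serialized_length))  # -2147483559, the "Before" value
```

2147483737 exceeds the 32-bit signed maximum (2147483647) by 90, so a 32-bit `int` wraps it to 2147483737 - 2^32 = -2147483559.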
antirez
1b004d62a0 Cluster: fetch my IP even if msg is not MEET for the first time.
In order to avoid that misconfigured cluster nodes at some time may
force an IP update on other nodes, it is required that nodes update
their own address only on MEET messages. However it does not make sense
to do this the first time a node is contacted and yet does not have an
IP, we just risk that myself->ip remains not assigned if there are
messages lost or cluster creation procedures that don't make sure
everybody is targeted by at least one incoming MEET message.

Also fix the logging of the IP switch avoiding the :-1 tail.
2015-01-13 10:50:34 +01:00
antirez
7e4233e3f7 Cluster: clusterMsgDataGossip structure, explicit padding + minor stuff.
Also explicitly set version to 0, add a protocol version define, improve
comments in the gossip structure.

Note that the structure layout is the same after the change, we are just
making the padding explicit with an additional not used 16 bits field.
So this commit is still able to talk with the previous versions of
cluster nodes.
2015-01-13 10:40:09 +01:00
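Why naming the padding leaves the layout unchanged can be checked with Python's `ctypes`, which applies the platform's native C alignment rules. The field list below is hypothetical and simplified; it is not the real clusterMsgDataGossip definition. It only reproduces the situation of a 32-bit field following fields that end on a 2-byte boundary.

```python
import ctypes


class GossipImplicit(ctypes.Structure):
    # Hypothetical fields, for illustration only.
    _fields_ = [
        ("ip", ctypes.c_char * 46),
        ("port", ctypes.c_uint16),
        ("flags", ctypes.c_uint16),
        # The compiler silently inserts 2 padding bytes here so that the
        # next 32-bit field starts on a 4-byte boundary.
        ("notused2", ctypes.c_uint32),
    ]


class GossipExplicit(ctypes.Structure):
    _fields_ = [
        ("ip", ctypes.c_char * 46),
        ("port", ctypes.c_uint16),
        ("flags", ctypes.c_uint16),
        ("notused1", ctypes.c_uint16),  # the same 2 bytes, now named
        ("notused2", ctypes.c_uint32),
    ]


print(ctypes.sizeof(GossipImplicit), ctypes.sizeof(GossipExplicit))
print(GossipImplicit.notused2.offset, GossipExplicit.notused2.offset)
```

On platforms where a 32-bit integer is 4-byte aligned, both structures have the same size and the same field offsets, which is why nodes built before and after such a change can still exchange packets. The explicit field also gives the code a name through which those bytes can be initialized.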
antirez
a35c89f3d3 Suppress valgrind error about write sending uninitialized data.
Valgrind checks that the buffers we transfer via syscalls are all
composed of bytes actually initialized. This is useful: it lets us
avoid leaking information in uninitialized parts of messages
transferred to other hosts. This commit fixes one such issue.
2015-01-13 09:31:37 +01:00
antirez
b4c42569ea Revert "Use REDIS_SUPERVISED_NONE instead of 0."
This reverts commit 25cdb725c97184eb116639c034d0c48c8d0b7e83.

Nevermind.
2015-01-12 15:58:23 +01:00
antirez
25cdb725c9 Use REDIS_SUPERVISED_NONE instead of 0. 2015-01-12 15:57:50 +01:00
antirez
59f380ea81 Merge branch 'unstable' of github.com:/antirez/redis into unstable 2015-01-12 15:56:46 +01:00
Salvatore Sanfilippo
d2c8d88c0b Merge pull request #2266 from mattsta/improve/supervised/startup
Three fixes: explicit supervise, pidfile create, remove memory leaks.
2015-01-12 15:56:36 +01:00
antirez
a4e2c9fe3d Cluster: initialize mf_end.
It can't be initialized by resetManualFailover(), since it is actual state
that the function uses, so we need to initialize it at startup time. Not
really a bug in practical terms, but it showed up in valgrind and is not
technically correct anyway.
2015-01-12 15:55:00 +01:00
Matt Stancliff
ffb8ce1e3b Add maxmemory limit to INFO MEMORY
Since we have the eviction policy, we should have the memory limit too.
2015-01-09 17:18:37 -05:00
Matt Stancliff
6c8a6df3fe Improve consistency of INFO MEMORY fields
Adds used_memory_rss_human and used_memory_lua_human to match
all the other fields reporting human-readable memory too.
2015-01-09 17:18:37 -05:00
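For readers unfamiliar with the `_human` fields mentioned above, this is roughly what a bytes-to-human conversion does. The Python sketch below is an assumption about the general technique (binary units, two decimals); the exact thresholds and format Redis prints may differ, and `bytes_to_human` is a name chosen here for illustration.

```python
def bytes_to_human(n):
    """Format a byte count with a binary-unit suffix (B, K, M, G)."""
    units = [("G", 1024 ** 3), ("M", 1024 ** 2), ("K", 1024)]
    for suffix, size in units:
        if n >= size:
            return "%.2f%s" % (n / size, suffix)
    return "%dB" % n


print(bytes_to_human(512))        # 512B
print(bytes_to_human(1536))       # 1.50K
print(bytes_to_human(1024 ** 2))  # 1.00M
```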
Matt Stancliff
c0b0e23100 Remove RDB AUX memory leaks 2015-01-09 15:19:18 -05:00
Matt Stancliff
02b75aac5a Supervise redis processes only if configured
Adds configuration option 'supervised [no | upstart | systemd | auto]'

Also removed 'bzero' from the previous implementation because it's 2015.
(We could actually statically initialize those structs, but clang
throws an invalid warning when we try, so it looks bad even though it
isn't bad.)

Fixes #2264
2015-01-09 15:19:18 -05:00
Matt Stancliff
33bcd9e2d6 Define default pidfile when creating pid
We want pidfile to be NULL on startup so we can detect if the user
set an explicit value versus only using the default value.

Closes #1967
Fixes #2076
2015-01-09 15:19:18 -05:00
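One way to read the pidfile change above, sketched in Python: the configured value starts as None, and the default path is substituted only at the moment the file is written, so the rest of the program can still tell "user set a pidfile" from "nothing was set". The function name and the default path are hypothetical and are not the values Redis uses.

```python
import os
import tempfile

# Hypothetical default location, chosen only so the sketch can run anywhere.
DEFAULT_PIDFILE = os.path.join(tempfile.gettempdir(), "sketch-redis.pid")


def create_pid_file(configured_pidfile=None):
    """Write the current PID and return the path that was used."""
    # None means "the user did not configure a pidfile": fall back to the
    # default here, at creation time, rather than at startup.
    path = configured_pidfile if configured_pidfile is not None else DEFAULT_PIDFILE
    with open(path, "w") as f:
        f.write("%d\n" % os.getpid())
    return path
```

Keeping the unset state distinct from the default lets the caller decide, for example, whether to create a pid file at all when running in the foreground.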
rebx
fb538fb98e Create PID file even if in foreground
Previously, Redis only wrote the pid file if
it was daemonizing, but many times it's useful to have
the pid written out even if you're in the foreground.

Some background for this is:
I usually run redis via daemontools. That entails running
redis-server in the foreground. Given that, I'd also want
redis-server to create a pidfile so other processes (e.g. nagios)
can run checks for that.

Closes #463
2015-01-09 15:19:18 -05:00
antirez
3c7daa5bec Add "-lrt" in Makefile for Solaris.
This fix is from @NanXiao, however I was not able to retain authorship
because the Pull Request original repository was removed.
2015-01-09 11:53:51 +01:00