3644 Commits

Author SHA1 Message Date
antirez
9a540bf3cb AOF rewrite: set iterator var to NULL when freed.
The cleanup code expects that if 'di' is not NULL, it is a valid
iterator that should be freed.

The result of this bug was a crash of the AOF rewriting process if an
error occurred after the DBs' data had been written and the iterator
was no longer valid.
2015-01-21 16:42:08 +01:00
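The pattern described in 9a540bf3cb can be sketched in isolation. This is a minimal illustration, not the Redis code: the `iter` type, `iter_new()`, `iter_release()`, and `rewrite_sketch()` are hypothetical names invented for the example.

```c
#include <stdlib.h>

/* Hypothetical stand-in for an iterator; not the Redis dictIterator type. */
typedef struct iter { int pos; } iter;

static int live_iters = 0; /* iterators allocated and not yet released */

static iter *iter_new(void) {
    iter *it = malloc(sizeof(*it));
    if (it) { it->pos = 0; live_iters++; }
    return it;
}

static void iter_release(iter *it) { free(it); live_iters--; }

/* Returns 0 on success, -1 on error. 'fail_late' simulates an error that
 * occurs after the iterator has already been released. */
int rewrite_sketch(int fail_late) {
    iter *di = iter_new();
    if (di == NULL) return -1;

    /* ... walk the data set here ... */

    iter_release(di);
    di = NULL; /* Without this line the cleanup path would release it twice. */

    if (fail_late) goto werr;
    return 0;

werr:
    if (di) iter_release(di); /* Only runs while 'di' is still valid. */
    return -1;
}

/* Exposes the counter so callers can check that nothing leaked. */
int live_iterators(void) { return live_iters; }
```

Calling `rewrite_sketch(1)` takes the error path after the release; because `di` was reset to NULL, the cleanup label does not touch freed memory.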
antirez
735adaa62d Cluster: node deletion cleanup / centralization. 2015-01-21 16:03:43 +01:00
antirez
c13e0820ad Cluster: set the slaves->slaveof field to NULL when master is freed.
Related to issue #2289.
2015-01-21 15:55:53 +01:00
antirez
e63ad12b8f Fix gcc warning for lack of casting to char pointer. 2015-01-21 14:51:42 +01:00
antirez
2d7e7141a5 luaRedisGenericCommand(): log error at WARNING level when re-entered.
Rationale is that when re-entering, it is likely due to Lua debugging
hooks. Returning an error will be ignored in most cases, going totally
unnoticed. With the log at least we leave a trace.

Related to issue #2302.
2015-01-20 23:21:21 +01:00
antirez
10cb7e83d9 luaRedisGenericCommand() recursion: just return an error.
Instead of calling redisPanic() to abort the server.

Related to issue #2302.
2015-01-20 23:16:19 +01:00
antirez
2249f0d386 Panic on recursive calls to luaRedisGenericCommand().
Related to issue #2302.
2015-01-20 18:02:26 +01:00
Matt Stancliff
8958c39e71 Improve networking type correctness
read() and write() return ssize_t (signed long), not int.

For other offsets, we can use the unsigned size_t type instead
of a signed offset (since our replication offsets and buffer
positions are never negative).
2015-01-19 14:10:12 -05:00
Matt Stancliff
0c611363e5 Improve RDB type correctness
It's possible large objects could be larger than 'int', so let's
upgrade all size counters to ssize_t.

This also fixes rdbSaveObject serialized bytes calculation.
Entire serializations of data structures can be large,
so we don't want to limit their calculated size to a 32 bit signed max.

This commit increases object size calculation and
cascades the change back up to serializedlength printing.

Before:
127.0.0.1:6379> debug object hihihi
... encoding:quicklist serializedlength:-2147483559 ...

After:
127.0.0.1:6379> debug object hihihi
... encoding:quicklist serializedlength:2147483737 ...
2015-01-19 14:10:12 -05:00
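The Before/After values in 0c611363e5 are the same bit pattern read two ways. The sketch below, using a hypothetical helper `as_int32()`, shows what happens when a length is squeezed into a signed 32-bit counter; it is an illustration of the arithmetic, not code from rdb.c.

```c
#include <stdint.h>
#include <string.h>

/* Keep only the low 32 bits of a length and reinterpret them as a signed
 * 32-bit value, which is the effect of storing the length in a 32-bit int. */
int32_t as_int32(uint64_t len) {
    uint32_t low = (uint32_t)len; /* well-defined modulo 2^32 conversion */
    int32_t out;
    memcpy(&out, &low, sizeof(out)); /* reinterpret bits, no UB */
    return out;
}
```

With the "After" value from the message, `as_int32(2147483737)` yields -2147483559, the "Before" value, since 2147483737 - 2^32 = -2147483559.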
antirez
1b004d62a0 Cluster: fetch my IP even if msg is not MEET for the first time.
In order to avoid that misconfigured cluster nodes at some time may
force an IP update on other nodes, it is required that nodes update
their own address only on MEET messages. However it does not make sense
to do this the first time a node is contacted and yet does not have an
IP, we just risk that myself->ip remains not assigned if there are
messages lost or cluster creation procedures that don't make sure
everybody is targeted by at least one incoming MEET message.

Also fix the logging of the IP switch avoiding the :-1 tail.
2015-01-13 10:50:34 +01:00
antirez
7e4233e3f7 Cluster: clusterMsgDataGossip structure, explicit padding + minor stuff.
Also explicitly set version to 0, add a protocol version define, improve
comments in the gossip structure.

Note that the structure layout is the same after the change; we are just
making the padding explicit with an additional unused 16-bit field.
So this commit is still able to talk with the previous versions of
cluster nodes.
2015-01-13 10:40:09 +01:00
antirez
a35c89f3d3 Suppress valgrind error about write sending uninitialized data.
Valgrind checks that the buffers we transfer via syscalls are all
composed of bytes actually initialized. This is useful because it lets us
avoid leaking information in uninitialized parts of messages
transferred to other hosts. This commit fixes one such issue.
2015-01-13 09:31:37 +01:00
antirez
b4c42569ea Revert "Use REDIS_SUPERVISED_NONE instead of 0."
This reverts commit 25cdb725c97184eb116639c034d0c48c8d0b7e83.

Nevermind.
2015-01-12 15:58:23 +01:00
antirez
25cdb725c9 Use REDIS_SUPERVISED_NONE instead of 0. 2015-01-12 15:57:50 +01:00
antirez
59f380ea81 Merge branch 'unstable' of github.com:/antirez/redis into unstable 2015-01-12 15:56:46 +01:00
antirez
a4e2c9fe3d Cluster: initialize mf_end.
Can't be initialized by resetManualFailover() since it is actual state
that the function uses, so we need to initialize it at startup time. Not
really a bug in practical terms, but it showed up in valgrind and is not
technically correct anyway.
2015-01-12 15:55:00 +01:00
Matt Stancliff
ffb8ce1e3b Add maxmemory limit to INFO MEMORY
Since we have the eviction policy, we should have the memory limit too.
2015-01-09 17:18:37 -05:00
Matt Stancliff
6c8a6df3fe Improve consistency of INFO MEMORY fields
Adds used_memory_rss_human and used_memory_lua_human to match
all the other fields reporting human-readable memory too.
2015-01-09 17:18:37 -05:00
Matt Stancliff
c0b0e23100 Remove RDB AUX memory leaks 2015-01-09 15:19:18 -05:00
Matt Stancliff
02b75aac5a Supervise redis processes only if configured
Adds configuration option 'supervised [no | upstart | systemd | auto]'

Also removed 'bzero' from the previous implementation because it's 2015.
(We could actually statically initialize those structs, but clang
throws an invalid warning when we try, so it looks bad even though it
isn't bad.)

Fixes #2264
2015-01-09 15:19:18 -05:00
Matt Stancliff
33bcd9e2d6 Define default pidfile when creating pid
We want pidfile to be NULL on startup so we can detect if the user
set an explicit value versus only using the default value.

Closes #1967
Fixes #2076
2015-01-09 15:19:18 -05:00
rebx
fb538fb98e Create PID file even if in foreground
Previously, Redis only wrote the pid file if
it was daemonizing, but many times it's useful to have
the pid written out even if you're in the foreground.

Some background for this is:
I usually run redis via daemontools. That entails running
redis-server on the foreground. Given that, I'd also want
redis-server to create a pidfile so other processes (e.g. nagios)
can run checks for that.

Closes #463
2015-01-09 15:19:18 -05:00
antirez
3c7daa5bec Add "-lrt" in Makefile for Solaris.
This fix is from @NanXiao, however I was not able to retain authorship
because the Pull Request original repository was removed.
2015-01-09 11:53:51 +01:00
antirez
230d40b89d Check for __sun macro in solarisfixes.h, not in includers. 2015-01-09 11:23:22 +01:00
antirez
dc1f63d909 Prevent Lua scripts from violating Redis Cluster keyspace access rules.
Before this commit scripts were able to access / create keys outside the
set of hash slots served by the local node.
2015-01-09 11:23:22 +01:00
Matt Stancliff
80560dd661 Remove end of line whitespace from redis-trib 2015-01-08 13:31:03 -05:00
Matt Stancliff
b24efbf8b5 Fix redis-trib cluster create
Under certain conditions the node list wasn't being fully populated
and 'create' would fail trying to call methods on nil objects.
2015-01-08 13:28:35 -05:00
antirez
83c56336e0 Typo fixed: fiels -> fields in rdbSaveInfoAuxFields().
Thx to @badboy.
2015-01-08 12:06:22 +01:00
antirez
5de189fd79 A few more AUX info fields added to RDB. 2015-01-08 09:52:59 +01:00
antirez
d93e29bea0 RDB AUX fields support.
This commit introduces a new RDB data type called 'aux'. It is used in
order to insert inside an RDB file key-value pairs that may serve
different needs, without breaking backward compatibility when new
information is embedded inside an RDB file. The contract between Redis
versions is to ignore unknown aux fields when encountered.

Aux fields can be used in order to:

1. Augment the RDB file with info like version of Redis that created the
RDB file, creation time, used memory while the RDB was created, and so
forth.
2. Add state about Redis inside the RDB file that we need to reload
later: replication offset, previous master run ID, in order to improve
failover safety and allow partial resynchronization after a slave
restart.
3. Anything that we may want to add to RDB files without breaking the
ability of past versions of Redis to load the file.
2015-01-08 09:52:55 +01:00
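The "ignore unknown aux fields" contract from d93e29bea0 can be sketched as a loader loop. Everything here is an assumption made for illustration: the `aux_field` and `load_info` types, `apply_aux_fields()`, and the key names `redis-ver` and `ctime` are not taken from the message.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical in-memory form of one aux key-value pair. */
typedef struct aux_field { const char *key; const char *value; } aux_field;

/* Hypothetical summary of what the loader understood. */
typedef struct load_info {
    const char *version; /* value of the assumed "redis-ver" key */
    const char *ctime;   /* value of the assumed "ctime" key */
    int ignored;         /* number of unknown fields skipped */
} load_info;

/* Apply known fields and silently skip unknown ones, so a file written by
 * a newer version still loads. */
void apply_aux_fields(const aux_field *fields, size_t n, load_info *info) {
    for (size_t i = 0; i < n; i++) {
        if (strcmp(fields[i].key, "redis-ver") == 0) {
            info->version = fields[i].value;
        } else if (strcmp(fields[i].key, "ctime") == 0) {
            info->ctime = fields[i].value;
        } else {
            info->ignored++; /* unknown key: not an error */
        }
    }
}
```

The design point is that an unrecognized key increments a counter instead of aborting the load, which is what lets future versions add fields freely.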
antirez
e2308cf791 rdbLoad() refactoring to make it simpler to follow. 2015-01-08 09:52:51 +01:00
antirez
4a56ebe7dd New RDB v7 opcode: RESIZEDB.
The new opcode is a hint about the size of the dataset (keys and number
of expires) we are going to load for a given Redis database inside the
RDB file. Since hash tables are resized accordingly ASAP, useless
rehashing is avoided, speeding up load times significantly, in the order
of ~ 20% or more for larger data sets.

Related issue: #1719
2015-01-08 09:52:47 +01:00
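Why a size hint helps, as in 4a56ebe7dd, can be shown with a simplified model. The model assumes a hash table that starts at 4 slots and doubles whenever it is full; that growth policy and the names `table_size_for()` and `doublings_without_hint()` are assumptions for this sketch, not details stated in the message.

```c
#include <limits.h>

#define INITIAL_SLOTS 4UL /* assumed starting table size */

/* Smallest power-of-two table (at least INITIAL_SLOTS) that holds 'keys'. */
unsigned long table_size_for(unsigned long keys) {
    unsigned long size = INITIAL_SLOTS;
    while (size < keys && size <= ULONG_MAX / 2) size *= 2;
    return size;
}

/* Number of grow-and-rehash steps needed to reach that size when the
 * table is grown incrementally instead of being sized up front. */
unsigned int doublings_without_hint(unsigned long keys) {
    unsigned long size = INITIAL_SLOTS;
    unsigned int steps = 0;
    while (size < keys && size <= ULONG_MAX / 2) {
        size *= 2;
        steps++;
    }
    return steps;
}
```

Under this model, loading one million keys without a hint passes through 18 doublings, while a loader that knows the count can allocate `table_size_for(1000000)` slots once.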
antirez
28e8878baa sdsnative() removed: New rdb.c API can load native strings. 2015-01-08 09:52:44 +01:00
antirez
30041299ed Use RDB_LOAD_PLAIN to load quicklists and encoded types.
Before we needed to create a string object with an embedded SDS, and
basically duplicate the SDS part into a plain zmalloc() allocation.
2015-01-08 09:52:40 +01:00
antirez
a07f5e0b14 RDB refactored to load plain strings from RDB. 2015-01-08 09:52:36 +01:00
Matt Stancliff
c41e99b806 Upgrade LZF to 3.6 (2011) from 3.5 (2009)
This is lzf_c and lzf_d from
http://dist.schmorp.de/liblzf/liblzf-3.6.tar.gz
2015-01-02 11:16:10 -05:00
Matt Stancliff
cde821759e Set optional 'static' for Quicklist+Redis
This also defines REDIS_STATIC='' for building everything
inside src/ and everything inside deps/lua/.
2015-01-02 11:16:10 -05:00
Matt Stancliff
f682a941cb Add more quicklist info to DEBUG OBJECT
Adds: ql_compressed (boolean, 1 if compression enabled for list, 0
otherwise)
Adds: ql_uncompressed_size (actual uncompressed size of all quicklistNodes)
Adds: ql_ziplist_max (quicklist max ziplist fill factor)

Compression ratio of the list is then ql_uncompressed_size / serializedlength

We report ql_uncompressed_size for all quicklists because serializedlength
is a _compressed_ representation anyway.

Sample output from a large list:
127.0.0.1:6379> llen abc
(integer) 38370061
127.0.0.1:6379> debug object abc
Value at:0x7ff97b51d140 refcount:1 encoding:quicklist serializedlength:19878335 lru:9718164 lru_seconds_idle:5 ql_nodes:21945 ql_avg_node:1748.46 ql_ziplist_max:-2 ql_compressed:0 ql_uncompressed_size:1643187761
(1.36s)

The 1.36s result time is because rdbSavedObjectLen() is serializing the
object, not because of any new stats reporting.

If we run DEBUG OBJECT on a compressed list, DEBUG OBJECT takes almost *zero*
time because rdbSavedObjectLen() reuses already-compressed ziplists:
127.0.0.1:6379> debug object abc
Value at:0x7fe5c5800040 refcount:1 encoding:quicklist serializedlength:19878335 lru:9718109 lru_seconds_idle:5 ql_nodes:21945 ql_avg_node:1748.46 ql_ziplist_max:-2 ql_compressed:1 ql_uncompressed_size:1643187761
2015-01-02 11:16:10 -05:00
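The derived figures in f682a941cb can be checked from the sample output. The helper below is a trivial sketch; `ratio()` is a name chosen for this example.

```c
/* Returns numerator / denominator, or 0.0 when the denominator is 0. */
double ratio(double numerator, double denominator) {
    return denominator == 0.0 ? 0.0 : numerator / denominator;
}
```

For the sample list, `ratio(1643187761.0, 19878335.0)` (ql_uncompressed_size over serializedlength) comes to roughly 82.66, and `ratio(38370061.0, 21945.0)` (LLEN over ql_nodes) comes to roughly 1748.46, which agrees with the ql_avg_node value shown.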
Matt Stancliff
16cda6f076 Config: Add quicklist, remove old list options
This removes:
  - list-max-ziplist-entries
  - list-max-ziplist-value

This adds:
  - list-max-ziplist-size
  - list-compress-depth

Also updates config file with new sections and updates
tests to use quicklist settings instead of old list settings.
2015-01-02 11:16:10 -05:00
Matt Stancliff
b1a66a1968 Add branch prediction hints to quicklist
Actually makes a noticeable difference.

Branch hints were selected based on profiler hotspots.
2015-01-02 11:16:10 -05:00
Matt Stancliff
2ab0ed547b Cleanup quicklist style
Small fixes due to a new version of clang-format (it's less
crazy than the older version).
2015-01-02 11:16:09 -05:00
Matt Stancliff
1120f6b855 Allow compression of interior quicklist nodes
Let user set how many nodes to *not* compress.

We can specify a compression "depth" of how many nodes
to leave uncompressed on each end of the quicklist.

Depth 0 = disable compression.
Depth 1 = only leave head/tail uncompressed.
  - (read as: "skip 1 node on each end of the list before compressing")
Depth 2 = leave head, head->next, tail->prev, tail uncompressed.
  - ("skip 2 nodes on each end of the list before compressing")
Depth 3 = Depth 2 + head->next->next + tail->prev->prev
  - ("skip 3 nodes...")
etc.

This also:
  - updates RDB storage to use native quicklist compression (if node is
    already compressed) instead of uncompressing, generating the RDB string,
    then re-compressing the quicklist node.
  - internalizes the "fill" parameter for the quicklist so we don't
    need to pass it to _every_ function.  Now it's just a property of
    the list.
  - allows a runtime-configurable compression option, so we can
    expose a compression parameter in the configuration file if people
    want to trade slight request-per-second performance for up to 90%+
    memory savings in some situations.
  - updates the quicklist tests to do multiple passes: 200k+ tests now.
2015-01-02 11:16:09 -05:00
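The depth rule listed in 1120f6b855 can be expressed as a predicate on a node's position. This is one reading of the message, not code from quicklist.c; `node_is_compressed()` and its zero-based indexing from the head are assumptions.

```c
#include <stddef.h>

/* Returns 1 if the node at 'index' (0 = head) in a list of 'count' nodes
 * would be compressed at the given depth, 0 otherwise. Depth 0 disables
 * compression; depth N leaves N nodes uncompressed at each end. */
int node_is_compressed(size_t index, size_t count, size_t depth) {
    if (depth == 0) return 0;       /* compression disabled */
    if (index >= count) return 0;   /* not a valid node */
    size_t from_tail = count - 1 - index;
    return index >= depth && from_tail >= depth;
}
```

With 10 nodes and depth 2, indexes 0, 1, 8, and 9 stay uncompressed and indexes 2 through 7 are compressed, matching "leave head, head->next, tail->prev, tail uncompressed".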
Matt Stancliff
81a0be2282 Add quicklist info to DEBUG OBJECT
Added fields 'ql_nodes' and 'ql_avg_per_node'.

ql_nodes is the number of quicklist nodes in the quicklist.
ql_avg_per_node is the average fill level in each quicklist node. (LLEN / QL_NODES)

Sample output:
127.0.0.1:6379> DEBUG object b
Value at:0x7fa42bf2fed0 refcount:1 encoding:quicklist serializedlength:18489 lru:8983768 lru_seconds_idle:3 ql_nodes:430 ql_avg_per_node:511.73
127.0.0.1:6379> llen b
(integer) 220044
2015-01-02 11:16:09 -05:00
Matt Stancliff
6780b97edb Remove malloc failure checks
We trust zmalloc to kill the whole process on memory failure
2015-01-02 11:16:09 -05:00
Matt Stancliff
1dfe1cea49 Convert quicklist RDB to store ziplist nodes
Turns out it's a huge improvement during save/reload/migrate/restore
because, with compression enabled, we're compressing 4k or 8k
chunks of data consisting of multiple elements in one ziplist
instead of compressing series of smaller individual elements.
2015-01-02 11:16:09 -05:00
Matt Stancliff
5257f91390 Convert RDB ziplist loading to sdsnative()
This saves us an unnecessary zmalloc, memcpy, and two frees.
2015-01-02 11:16:09 -05:00
Matt Stancliff
2133073bb8 Add sdsnative()
Use the existing memory space for an SDS to convert it to a regular
character buffer so we don't need to allocate duplicate space just
to extract a usable buffer for native operations.
2015-01-02 11:16:08 -05:00
Matt Stancliff
69d46d25f3 Add adaptive quicklist fill factor
Fill factor now has two options:
  - negative (-1 to -5) for size-based ziplist filling
  - positive for length-based ziplist filling with implicit size cap.

Negative offsets define ziplist size limits of:
  -1: 4k
  -2: 8k
  -3: 16k
  -4: 32k
  -5: 64k

Positive offsets now automatically limit their max size to 8k.  Any
elements larger than 8k will be in individual nodes.

Positive ziplist fill factors will keep adding elements
to a ziplist until one of:
  - ziplist has FILL number of elements
    - or -
  - ziplist grows above our ziplist max size (currently 8k)

When using positive fill factors, if you insert a large
element (over 8k), that element will automatically allocate
an individual quicklist node with one element and no other elements will be
in the same ziplist inside that quicklist node.

When using negative fill factors, elements up to the size
limit can be added to one quicklist node.  If an element
is added larger than the max ziplist size, that element
will be allocated an individual ziplist in a new quicklist node.

Tests also updated to start testing at fill factor -5.
2015-01-02 11:16:08 -05:00
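The fill-factor rules in 69d46d25f3 can be sketched as two small functions. This is a simplified reading of the message, not the quicklist implementation: it assumes "4k" means 4096 bytes, ignores any per-node overhead, and uses hypothetical names (`size_limit_for_fill()`, `node_accepts()`).

```c
#include <stddef.h>

#define SIZE_CAP_POSITIVE_FILL 8192 /* "currently 8k"; bytes assumed */

/* Maps fill factors -1..-5 to byte limits; returns 0 for other values. */
size_t size_limit_for_fill(int fill) {
    static const size_t limits[] = {4096, 8192, 16384, 32768, 65536};
    if (fill >= -5 && fill <= -1) return limits[-fill - 1];
    return 0;
}

/* Returns 1 if a node holding 'count' elements in 'bytes' bytes may take
 * one more element of 'elem_bytes' bytes under the given fill factor. */
int node_accepts(int fill, size_t count, size_t bytes, size_t elem_bytes) {
    size_t new_bytes = bytes + elem_bytes;
    if (fill < 0) {
        /* Size-based filling: only the byte limit matters. */
        size_t limit = size_limit_for_fill(fill);
        return limit != 0 && new_bytes <= limit;
    }
    /* Length-based filling with the implicit size cap. */
    if (new_bytes > SIZE_CAP_POSITIVE_FILL) return 0;
    return count < (size_t)fill;
}
```

When `node_accepts()` returns 0 for an oversized element, the caller would place that element alone in a new node, as the message describes.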
Matt Stancliff
153f919b4d redis-benchmark: Add RPUSH and RPOP tests 2015-01-02 11:16:08 -05:00
Matt Stancliff
7b0b296882 Free ziplist test lists during tests
Freeing our test lists helps keep valgrind output clean
2015-01-02 11:16:08 -05:00