10569 Commits

Author SHA1 Message Date
antirez
43aaf96163 Markdown generation of Redis Modules API reference improved. 2017-07-14 11:29:31 +02:00
antirez
f6d871f4c4 Markdown generation of Redis Modules API reference improved. 2017-07-14 11:29:31 +02:00
antirez
e74f0aa6d1 Fix replication of SLAVEOF inside transaction.
In Redis 4.0 replication, with the introduction of PSYNC2, masters and
slaves replicate commands to cascading slaves and to the replication
backlog itself in a different way compared to the past.

Masters actually replicate the effects of client commands.
Slaves just propagate what they receive from masters.

This mechanism can cause problems when the configuration of an instance
is changed from master to slave inside a transaction. For instance
we could send to a master instance the following sequence:

    MULTI
    SLAVEOF 127.0.0.1 0
    EXEC
    SLAVEOF NO ONE

Before the fixes in this commit, the MULTI command used to be propagated
into the replication backlog, however after the SLAVEOF command the
instance is a slave, so the EXEC implementation failed to also propagate
the EXEC command. When the slaves of the above instance reconnected,
they were incrementally synchronized just sending a "MULTI". This put
the master client (in the slaves) into MULTI state, breaking the
replication.

Notably even Redis Sentinel uses the above approach in order to guarantee
that configuration changes are always performed together with rewrites
of the configuration and with clients disconnection. Sentiel does:

    MULTI
    SLAVEOF ...
    CONFIG REWRITE
    CLIENT KILL TYPE normal
    EXEC

So this was a really problematic issue. However even with the fix in
this commit, that will add the final EXEC to the replication stream in
case the instance was switched from master to slave during the
transaction, the result would be to increment the slave replication
offset, so a successive reconnection with the new master, will not
permit a successful partial resynchronization: no way the new master can
provide us with the backlog needed, we incremented our offset to a value
that the new master cannot have.

However the EXEC implementation waits to emit the MULTI, so that if the
commands inside the transaction actually do not need to be replicated,
no commands propagation happens at all. From multi.c:

    if (!must_propagate && !(c->cmd->flags & (CMD_READONLY|CMD_ADMIN))) {
	execCommandPropagateMulti(c);
	must_propagate = 1;
    }

The above code is already modified by this commit you are reading.
Now also ADMIN commands do not trigger the emission of MULTI. It is actually
not clear why we do not just check for CMD_WRITE... Probably I wrote it this
way in order to make the code more reliable: better to over-emit MULTI
than not emitting it in time.

So this commit should indeed fix issue #3836 (verified), however it looks
like some reconsideration of this code path is needed in the long term.

BONUS POINT: The reverse bug.

Even in a read only slave "B", in a replication setup like:

	A -> B -> C

There are commands without the READONLY nor the ADMIN flag, that are also
not flagged as WRITE commands. An example is just the PING command.

So if we send B the following sequence:

    MULTI
    PING
    SLAVEOF NO ONE
    EXEC

The result will be the reverse bug, where only EXEC is emitted, but not the
previous MULTI. However this apparently does not create problems in practice
but it is yet another acknowledge of the fact some work is needed here
in order to make this code path less surprising.

Note that there are many different approaches we could follow. For instance
MULTI/EXEC blocks containing administrative commands may be allowed ONLY
if all the commands are administrative ones, otherwise they could be
denined. When allowed, the commands could simply never be replicated at all.
2017-07-12 11:07:28 +02:00
antirez
66c47a4d06 Fix replication of SLAVEOF inside transaction.
In Redis 4.0 replication, with the introduction of PSYNC2, masters and
slaves replicate commands to cascading slaves and to the replication
backlog itself in a different way compared to the past.

Masters actually replicate the effects of client commands.
Slaves just propagate what they receive from masters.

This mechanism can cause problems when the configuration of an instance
is changed from master to slave inside a transaction. For instance
we could send to a master instance the following sequence:

    MULTI
    SLAVEOF 127.0.0.1 0
    EXEC
    SLAVEOF NO ONE

Before the fixes in this commit, the MULTI command used to be propagated
into the replication backlog, however after the SLAVEOF command the
instance is a slave, so the EXEC implementation failed to also propagate
the EXEC command. When the slaves of the above instance reconnected,
they were incrementally synchronized just sending a "MULTI". This put
the master client (in the slaves) into MULTI state, breaking the
replication.

Notably even Redis Sentinel uses the above approach in order to guarantee
that configuration changes are always performed together with rewrites
of the configuration and with clients disconnection. Sentiel does:

    MULTI
    SLAVEOF ...
    CONFIG REWRITE
    CLIENT KILL TYPE normal
    EXEC

So this was a really problematic issue. However even with the fix in
this commit, that will add the final EXEC to the replication stream in
case the instance was switched from master to slave during the
transaction, the result would be to increment the slave replication
offset, so a successive reconnection with the new master, will not
permit a successful partial resynchronization: no way the new master can
provide us with the backlog needed, we incremented our offset to a value
that the new master cannot have.

However the EXEC implementation waits to emit the MULTI, so that if the
commands inside the transaction actually do not need to be replicated,
no commands propagation happens at all. From multi.c:

    if (!must_propagate && !(c->cmd->flags & (CMD_READONLY|CMD_ADMIN))) {
	execCommandPropagateMulti(c);
	must_propagate = 1;
    }

The above code is already modified by this commit you are reading.
Now also ADMIN commands do not trigger the emission of MULTI. It is actually
not clear why we do not just check for CMD_WRITE... Probably I wrote it this
way in order to make the code more reliable: better to over-emit MULTI
than not emitting it in time.

So this commit should indeed fix issue #3836 (verified), however it looks
like some reconsideration of this code path is needed in the long term.

BONUS POINT: The reverse bug.

Even in a read only slave "B", in a replication setup like:

	A -> B -> C

There are commands without the READONLY nor the ADMIN flag, that are also
not flagged as WRITE commands. An example is just the PING command.

So if we send B the following sequence:

    MULTI
    PING
    SLAVEOF NO ONE
    EXEC

The result will be the reverse bug, where only EXEC is emitted, but not the
previous MULTI. However this apparently does not create problems in practice
but it is yet another acknowledge of the fact some work is needed here
in order to make this code path less surprising.

Note that there are many different approaches we could follow. For instance
MULTI/EXEC blocks containing administrative commands may be allowed ONLY
if all the commands are administrative ones, otherwise they could be
denined. When allowed, the commands could simply never be replicated at all.
2017-07-12 11:07:28 +02:00
antirez
e1b8b4b6da CLUSTER GETKEYSINSLOT: avoid overallocating.
Close #3911.
2017-07-11 15:49:09 +02:00
antirez
e1b9781bda CLUSTER GETKEYSINSLOT: avoid overallocating.
Close #3911.
2017-07-11 15:49:09 +02:00
antirez
5bd46d33db Fix isHLLObjectOrReply() to handle integer encoded strings.
Close #3766.
2017-07-11 12:44:59 +02:00
antirez
647406c1c1 Fix isHLLObjectOrReply() to handle integer encoded strings.
Close #3766.
2017-07-11 12:44:59 +02:00
antirez
e203a46cf3 Clients blocked in modules: free argv/argc later.
See issue #3844 for more information.
2017-07-11 12:33:01 +02:00
antirez
89508a4fd4 Clients blocked in modules: free argv/argc later.
See issue #3844 for more information.
2017-07-11 12:33:01 +02:00
antirez
14c32c3569 Merge branch 'unstable' of github.com:/antirez/redis into unstable 2017-07-11 09:46:58 +02:00
antirez
f1308fcb08 Merge branch 'unstable' of github.com:/antirez/redis into unstable 2017-07-11 09:46:58 +02:00
antirez
54e4bbeabd Event loop: call after sleep() only from top level.
In general we do not want before/after sleep() callbacks to be called
when we re-enter the event loop, since those calls are only designed in
order to perform operations every main iteration of the event loop, and
re-entering is often just a way to incrementally serve clietns with
error messages or other auxiliary operations. However, if we call the
callbacks, we are then forced to think at before/after sleep callbacks
as re-entrant, which is much harder without any good need.

However here there was also a clear bug: beforeSleep() was actually
never called when re-entering the event loop. But the new afterSleep()
callback was. This is broken and in this instance re-entering
afterSleep() caused a modules GIL dead lock.
2017-07-11 00:13:52 +02:00
antirez
ff1b4ccbca Event loop: call after sleep() only from top level.
In general we do not want before/after sleep() callbacks to be called
when we re-enter the event loop, since those calls are only designed in
order to perform operations every main iteration of the event loop, and
re-entering is often just a way to incrementally serve clietns with
error messages or other auxiliary operations. However, if we call the
callbacks, we are then forced to think at before/after sleep callbacks
as re-entrant, which is much harder without any good need.

However here there was also a clear bug: beforeSleep() was actually
never called when re-entering the event loop. But the new afterSleep()
callback was. This is broken and in this instance re-entering
afterSleep() caused a modules GIL dead lock.
2017-07-11 00:13:52 +02:00
Salvatore Sanfilippo
58104d8327 Merge pull request #4113 from guybe7/module_io_bytes
Modules: Fix io->bytes calculation in RDB save
2017-07-10 19:14:34 +02:00
Salvatore Sanfilippo
cefcc33c41 Merge pull request #4113 from guybe7/module_io_bytes
Modules: Fix io->bytes calculation in RDB save
2017-07-10 19:14:34 +02:00
antirez
11182a1a58 redis-check-aof: tell users there is a --fix option. 2017-07-10 16:41:25 +02:00
antirez
a5cb21177a redis-check-aof: tell users there is a --fix option. 2017-07-10 16:41:25 +02:00
Guy Benoish
dfb68cd235 Modules: Fix io->bytes calculation in RDB save 2017-07-10 14:41:57 +03:00
Guy Benoish
cd3b6c9d5c Modules: Fix io->bytes calculation in RDB save 2017-07-10 14:41:57 +03:00
antirez
fc7ecd8d35 AOF check utility: ability to check files with RDB preamble. 2017-07-10 13:38:23 +02:00
antirez
63ec3e0170 AOF check utility: ability to check files with RDB preamble. 2017-07-10 13:38:23 +02:00
Salvatore Sanfilippo
6b0670daad Merge pull request #3853 from itamarhaber/issue-3851
Sets up fake client to select current db in RM_Call()
2017-07-06 15:02:11 +02:00
Salvatore Sanfilippo
ed5d0632b3 Merge pull request #3853 from itamarhaber/issue-3851
Sets up fake client to select current db in RM_Call()
2017-07-06 15:02:11 +02:00
Salvatore Sanfilippo
38dd30af42 Merge pull request #4105 from spinlock/unstable-networking
Optimize addReplyBulkSds for better performance
2017-07-06 14:31:08 +02:00
Salvatore Sanfilippo
873fff969e Merge pull request #4105 from spinlock/unstable-networking
Optimize addReplyBulkSds for better performance
2017-07-06 14:31:08 +02:00
Salvatore Sanfilippo
2d5aa00959 Merge pull request #4106 from petersunbag/unstable
minor fix in listJoin().
2017-07-06 14:29:37 +02:00
Salvatore Sanfilippo
8b7342cc67 Merge pull request #4106 from petersunbag/unstable
minor fix in listJoin().
2017-07-06 14:29:37 +02:00
sunweinan
87f771bff1 minor fix in listJoin(). 2017-07-06 19:47:21 +08:00
sunweinan
16b407a1ff minor fix in listJoin(). 2017-07-06 19:47:21 +08:00
antirez
2b36950e9b Free IO context if any in RDB loading code.
Thanks to @oranagra for spotting this bug.
2017-07-06 11:20:49 +02:00
antirez
cb3790a209 Free IO context if any in RDB loading code.
Thanks to @oranagra for spotting this bug.
2017-07-06 11:20:49 +02:00
antirez
51ffd062d3 Modules: DEBUG DIGEST interface. 2017-07-06 11:04:46 +02:00
antirez
ed93fb8a29 Modules: DEBUG DIGEST interface. 2017-07-06 11:04:46 +02:00
spinlock
10db81af71 update Makefile for test-sds 2017-07-05 14:32:09 +00:00
spinlock
db56f485a8 update Makefile for test-sds 2017-07-05 14:32:09 +00:00
spinlock
ea31a4eae3 Optimize addReplyBulkSds for better performance 2017-07-05 14:25:05 +00:00
spinlock
b7b3e80a73 Optimize addReplyBulkSds for better performance 2017-07-05 14:25:05 +00:00
antirez
f9fac7f777 Avoid closing invalid FDs to make Valgrind happier. 2017-07-05 15:40:25 +02:00
antirez
ed7cbd5a4b Avoid closing invalid FDs to make Valgrind happier. 2017-07-05 15:40:25 +02:00
antirez
413c2bc180 Modules: no MULTI/EXEC for commands replicated from async contexts.
They are technically like commands executed from external clients one
after the other, and do not constitute a single atomic entity.
2017-07-05 10:10:20 +02:00
antirez
fe48716c0c Modules: no MULTI/EXEC for commands replicated from async contexts.
They are technically like commands executed from external clients one
after the other, and do not constitute a single atomic entity.
2017-07-05 10:10:20 +02:00
Salvatore Sanfilippo
09dd7b5ff0 Merge pull request #4101 from dvirsky/fix_modules_reply_len
Proposed fix to #4100
2017-07-04 12:01:51 +02:00
Salvatore Sanfilippo
92863ae784 Merge pull request #4101 from dvirsky/fix_modules_reply_len
Proposed fix to #4100
2017-07-04 12:01:51 +02:00
antirez
eddd8d34c4 Add symmetrical assertion to track c->reply_buffer infinite growth.
Redis clients need to have an instantaneous idea of the amount of memory
they are consuming (if the number is not exact should at least be
proportional to the actual memory usage). We do that adding and
subtracting the SDS length when pushing / popping from the client->reply
list. However it is quite simple to add bugs in such a setup, by not
taking the objects in the list and the count in sync. For such reason,
Redis has an assertion to track counts near 2^64: those are always the
result of the counter wrapping around because we subtract more than we
add. This commit adds the symmetrical assertion: when the list is empty
since we sent everything, the reply_bytes count should be zero. Thanks
to the new assertion it should be simple to also detect the other
problem, where the count slowly increases because of over-counting.
The assertion adds a conditional in the code that sends the buffer to
the socket but should not create any measurable performance slowdown,
listLength() just accesses a structure field, and this code path is
totally dominated by write(2).

Related to #4100.
2017-07-04 11:55:05 +02:00
antirez
80f2d39f64 Add symmetrical assertion to track c->reply_buffer infinite growth.
Redis clients need to have an instantaneous idea of the amount of memory
they are consuming (if the number is not exact should at least be
proportional to the actual memory usage). We do that adding and
subtracting the SDS length when pushing / popping from the client->reply
list. However it is quite simple to add bugs in such a setup, by not
taking the objects in the list and the count in sync. For such reason,
Redis has an assertion to track counts near 2^64: those are always the
result of the counter wrapping around because we subtract more than we
add. This commit adds the symmetrical assertion: when the list is empty
since we sent everything, the reply_bytes count should be zero. Thanks
to the new assertion it should be simple to also detect the other
problem, where the count slowly increases because of over-counting.
The assertion adds a conditional in the code that sends the buffer to
the socket but should not create any measurable performance slowdown,
listLength() just accesses a structure field, and this code path is
totally dominated by write(2).

Related to #4100.
2017-07-04 11:55:05 +02:00
Dvir Volk
86e564e9ff fixed #4100 2017-07-04 00:02:19 +03:00
Dvir Volk
4291a39afe fixed #4100 2017-07-04 00:02:19 +03:00
antirez
b2cd9fcab6 Fix GEORADIUS edge case with huge radius.
This commit closes issue #3698, at least for now, since the root cause
was not fixed: the bounding box function, for huge radiuses, does not
return a correct bounding box, there are points still within the radius
that are left outside.

So when using GEORADIUS queries with radiuses in the order of 5000 km or
more, it was possible to see, at the edge of the area, certain points
not correctly reported.

Because the bounding box for now was used just as an optimization, and
such huge radiuses are not common, for now the optimization is just
switched off when the radius is near such magnitude.

Three test cases found by the Continuous Integration test were added, so
that we can easily trigger the bug again, both for regression testing
and in order to properly fix it as some point in the future.
2017-07-03 19:38:31 +02:00
antirez
b525305f9d Fix GEORADIUS edge case with huge radius.
This commit closes issue #3698, at least for now, since the root cause
was not fixed: the bounding box function, for huge radiuses, does not
return a correct bounding box, there are points still within the radius
that are left outside.

So when using GEORADIUS queries with radiuses in the order of 5000 km or
more, it was possible to see, at the edge of the area, certain points
not correctly reported.

Because the bounding box for now was used just as an optimization, and
such huge radiuses are not common, for now the optimization is just
switched off when the radius is near such magnitude.

Three test cases found by the Continuous Integration test were added, so
that we can easily trigger the bug again, both for regression testing
and in order to properly fix it as some point in the future.
2017-07-03 19:38:31 +02:00