6560 Commits

Author SHA1 Message Date
antirez
86981fce60 Replication: fix the infamous key leakage of writable slaves + EXPIRE.
BACKGROUND AND USE CASEj

Redis slaves are normally write only, however the supprot a "writable"
mode which is very handy when scaling reads on slaves, that actually
need write operations in order to access data. For instance imagine
having slaves replicating certain Sets keys from the master. When
accessing the data on the slave, we want to peform intersections between
such Sets values. However we don't want to intersect each time: to cache
the intersection for some time often is a good idea.

To do so, it is possible to setup a slave as a writable slave, and
perform the intersection on the slave side, perhaps setting a TTL on the
resulting key so that it will expire after some time.

THE BUG

Problem: in order to have a consistent replication, expiring of keys in
Redis replication is up to the master, that synthesize DEL operations to
send in the replication stream. However slaves logically expire keys
by hiding them from read attempts from clients so that if the master did
not promptly sent a DEL, the client still see logically expired keys
as non existing.

Because slaves don't actively expire keys by actually evicting them but
just masking from the POV of read operations, if a key is created in a
writable slave, and an expire is set, the key will be leaked forever:

1. No DEL will be received from the master, which does not know about
such a key at all.

2. No eviction will be performed by the slave, since it needs to disable
eviction because it's up to masters, otherwise consistency of data is
lost.

THE FIX

In order to fix the problem, the slave should be able to tag keys that
were created in the slave side and have an expire set in some way.

My solution involved using an unique additional dictionary created by
the writable slave only if needed. The dictionary is obviously keyed by
the key name that we need to track: all the keys that are set with an
expire directly by a client writing to the slave are tracked.

The value in the dictionary is a bitmap of all the DBs where such a key
name need to be tracked, so that we can use a single dictionary to track
keys in all the DBs used by the slave (actually this limits the solution
to the first 64 DBs, but the default with Redis is to use 16 DBs).

This solution allows to pay both a small complexity and CPU penalty,
which is zero when the feature is not used, actually. The slave-side
eviction is encapsulated in code which is not coupled with the rest of
the Redis core, if not for the hook to track the keys.

TODO

I'm doing the first smoke tests to see if the feature works as expected:
so far so good. Unit tests should be added before merging into the
4.0 branch.
2016-12-13 10:59:54 +01:00
Salvatore Sanfilippo
2664d4d82a Merge pull request #3680 from yossigo/fix_rediscli_command_crash
Fix redis-cli rare crash.
2016-12-12 19:36:15 +01:00
Yossi Gottlieb
24934e43d7 Fix redis-cli rare crash.
This happens if the server (mysteriously) returns an unexpected response
to the COMMAND command.
2016-12-12 20:18:40 +02:00
Jan-Erik Rediger
754150cf20 Reset the ttl for additional keys
Before, if a previous key had a TTL set but the current one didn't, the
TTL was reused and thus resulted in wrong expirations set.

This behaviour was experienced, when `MigrateDefaultPipeline` in
redis-trib was set to >1

Fixes #3655
2016-12-08 14:27:21 +01:00
Salvatore Sanfilippo
d6c49a09f3 Merge pull request #3663 from wshn13/add-LF-character-in-memory-doctor-output-message
Add '\n' to MEMORY DOCTOR command output message
2016-12-06 09:20:36 +01:00
wangshaonan
99a3e17fae Add '\n' to MEMORY DOCTOR command output message when num_reports
is 0 or empty is 1
2016-12-06 03:11:27 +00:00
itamar
f9404ce002 Corrects a couple of omissions in the modules docs 2016-12-05 18:34:38 +02:00
antirez
71529a8592 Modules: types doc updated to new API. 2016-12-05 14:40:51 +01:00
antirez
d77b1d713c Modules: API doc updated (auto generated). 2016-12-05 14:40:43 +01:00
antirez
9aed64dc12 Merge branch 'unstable' of github.com:/antirez/redis into unstable 2016-12-05 14:17:11 +01:00
antirez
e4ee2ac1b7 Geo: improve fuzz test.
The test now uses more diverse radius sizes, especially sizes near or
greater the whole earth surface are used, that are known to trigger edge
cases. Moreover the PRNG seeding was probably resulting into the same
sequence tested over and over again, now seeding unsing the current unix
time in milliseconds.

Related to #3631.
2016-12-05 14:16:46 +01:00
antirez
5375b0f2ec Geo: fix computation of bounding box.
A bug was reported in the context in issue #3631. The root cause of the
bug was that certain neighbor boxes were zeroed after the "inside the
bounding box or not" check, simply because the bounding box computation
function was wrong.

A few debugging infos where enhanced and moved in other parts of the
code. A check to avoid steps=0 was added, but is unrelated to this
issue and I did not verified it was an actual bug in practice.
2016-12-05 14:02:32 +01:00
cbgbt
9d7756db96 cli: Only print elapsed time on OUTPUT_STANDARD 2016-12-02 20:59:33 -08:00
Itamar Haber
72fbdb45b7 Verify pairs are provided after subcommands
Fixes https://github.com/antirez/redis/issues/3639
2016-12-02 18:19:36 +02:00
antirez
028036009b PSYNC2: Do not accept WAIT in slave instances.
No longer makes sense since writable slaves only do local writes now:
writes are no longer passed to sub-slaves in the stream.
2016-12-02 10:21:20 +01:00
Chris Lamb
fd6546e08d src/rdb.c: Correct "whenver" -> "whenever" typo. 2016-12-01 13:16:30 +01:00
Salvatore Sanfilippo
1a0d53318b Merge pull request #3651 from yossigo/datatype_methods_typo
Fix typo in RedisModuleTypeMethods declaration.
2016-12-01 09:09:37 +01:00
Yossi Gottlieb
1f1955766d Fix typo in RedisModuleTypeMethods declaration. 2016-11-30 22:05:59 +02:00
Salvatore Sanfilippo
61b22aa6df Merge pull request #3648 from dvirsky/fix_reply_crash
fix memory corruption on RM_FreeCallReply
2016-11-30 11:21:10 +01:00
antirez
4d71791b17 Modules: change type registration API to use a struct of methods. 2016-11-30 11:14:01 +01:00
Dvir Volk
2054df2e1c fix memory corruption on RM_FreeCallReply 2016-11-30 11:49:49 +02:00
antirez
e4c6e8a710 PSYNC2 test: check ability to resync after restart. 2016-11-29 11:15:16 +01:00
antirez
bda1dd05b9 PSYNC2 test: 20 seconds are enough... 2016-11-29 10:27:53 +01:00
antirez
6c40182042 PSYNC2 test: test added to the default tests. 2016-11-29 10:25:42 +01:00
antirez
172852a69e PSYNC2: Minor memory leak reading -NOMASTERLINK master reply fixed. 2016-11-29 10:25:00 +01:00
antirez
4d8362506b PSYNC2 test: modify the test for production. 2016-11-29 10:22:40 +01:00
andyli
5c516082c2 Modify MIN->MAX 2016-11-29 16:34:41 +08:00
antirez
c90c479f9e PSYNC2: stop sending newlines to sub-slaves when master is down.
This actually includes two changes:

1) No newlines to take the master-slave link up when the upstream master
is down. Doing this is dangerous because the sub-slave often is received
replication protocol for an half-command, so can't receive newlines
without desyncing the replication link, even with the code in order to
cancel out the bytes that PSYNC2 was using. Moreover this is probably
also not needed/sane, because anyway the slave can keep serving
requests, and because if it's configured to don't serve stale data, it's
a good idea, actually, to break the link.

2) When a +CONTINUE with a different ID is received, we now break
connection with the sub-slaves: they need to be notified as well. This
was part of the original specification but for some reason it was not
implemented in the code, and was alter found as a PSYNC2 bug in the
integration testing.
2016-11-28 17:54:04 +01:00
antirez
a6561f8e30 PSYNC2: Test (WIP).
This is the PSYNC2 test that helped find issues in the code, and that
still can show a protocol desync from time to time. Work is in progress
in order to find the issue. For now the test is not enabled in "make
test" and must be run manually.
2016-11-28 10:13:24 +01:00
antirez
868bf0ad68 Better protocol errors logging. 2016-11-25 10:55:16 +01:00
antirez
d21c7a08be PSYNC2: on transient error jump to error, not write_error. 2016-11-24 15:48:18 +01:00
antirez
d74a86c17e Modules: fix client blocking calls access to invalid struct field.
We already have reference to the client pointer, no need to access the
already freed structure.

Close #3634.
2016-11-24 11:05:19 +01:00
antirez
8046a3cf11 PSYNC2: bugfixing pre release.
1. Master replication offset was cleared after switching configuration
to some other slave, since it was assumed you can't PSYNC after a
switch. Note the case anymore and when we successfully PSYNC we need to
have our offset untouched.

2. Secondary replication ID was not reset to "000..." pattern at
startup.

3. Master in error state replying -LOADING or other transient errors
forced the slave to discard the cached master and full resync. This is
now fixed.

4. Better logging of what's happening on failed PSYNCs.
2016-11-23 17:36:45 +01:00
antirez
17a3b79925 Merge branch 'unstable' of github.com:/antirez/redis into unstable 2016-11-18 13:10:57 +01:00
antirez
14c11b2540 Test: WAIT tests added in wait.tcl unit. 2016-11-18 13:10:29 +01:00
Salvatore Sanfilippo
b6ef607a55 Merge pull request #3612 from deep011/unstable
fix a possible bug for 'replconf getack'
2016-11-18 10:45:09 +01:00
antirez
7e4353c4d7 Merge branch 'psync2' into unstable 2016-11-17 09:37:03 +01:00
oranagra
50ec85fea8 when a slave loads an RDB, stop an AOFRW fork before flusing db and parsing rdb file, to avoid a CoW disaster. 2016-11-16 21:30:59 +02:00
antirez
cea27e9025 Merge branch 'unstable' of github.com:/antirez/redis into unstable 2016-11-16 14:13:37 +01:00
antirez
f37134682b Cluster: handle zero bytes at the end of nodes.conf. 2016-11-16 14:13:18 +01:00
deep011
a60e88005c fix a possible bug for 'replconf getack' 2016-11-16 11:04:33 +08:00
hylepo
f6d3494ece Merge pull request #1 from hylepo/Fixing-typo-in-the-usage-of-redis-benchmark
Fixing typo in the usage of redis-benchmark
2016-11-11 10:34:37 +08:00
hylepo
cc1a950227 Update redis-benchmark.c
Fixing typo in the usage of redis-benchmark
2016-11-11 10:33:48 +08:00
oranagra
03eac87d6c fix unsigned int overflow in adjustOpenFilesLimit 2016-11-10 16:59:52 +02:00
antirez
996bc7b3e1 PSYNC2: Save replication ID/offset on RDB file.
This means that stopping a slave and restarting it will still make it
able to PSYNC with the master. Moreover the master itself will retain
its ID/offset, in case it gets turned into a slave, or if a slave will
try to PSYNC with it with an exactly updated offset (otherwise there is
no backlog).

This change was possible thanks to PSYNC v2 that makes saving the current
replication state much simpler.
2016-11-10 12:35:29 +01:00
antirez
6ee2313d15 PSYNC2: Wrap debugging code with if(0) 2016-11-09 15:37:15 +01:00
antirez
f53ed7a969 PSYNC2: different improvements to Redis replication.
The gist of the changes is that now, partial resynchronizations between
slaves and masters (without the need of a full resync with RDB transfer
and so forth), work in a number of cases when it was impossible
in the past. For instance:

1. When a slave is promoted to mastrer, the slaves of the old master can
partially resynchronize with the new master.

2. Chained slalves (slaves of slaves) can be moved to replicate to other
slaves or the master itsef, without requiring a full resync.

3. The master itself, after being turned into a slave, is able to
partially resynchronize with the new master, when it joins replication
again.

In order to obtain this, the following main changes were operated:

* Slaves also take a replication backlog, not just masters.

* Same stream replication for all the slaves and sub slaves. The
replication stream is identical from the top level master to its slaves
and is also the same from the slaves to their sub-slaves and so forth.
This means that if a slave is later promoted to master, it has the
same replication backlong, and can partially resynchronize with its
slaves (that were previously slaves of the old master).

* A given replication history is no longer identified by the `runid` of
a Redis node. There is instead a `replication ID` which changes every
time the instance has a new history no longer coherent with the past
one. So, for example, slaves publish the same replication history of
their master, however when they are turned into masters, they publish
a new replication ID, but still remember the old ID, so that they are
able to partially resynchronize with slaves of the old master (up to a
given offset).

* The replication protocol was slightly modified so that a new extended
+CONTINUE reply from the master is able to inform the slave of a
replication ID change.

* REPLCONF CAPA is used in order to notify masters that a slave is able
to understand the new +CONTINUE reply.

* The RDB file was extended with an auxiliary field that is able to
select a given DB after loading in the slave, so that the slave can
continue receiving the replication stream from the point it was
disconnected without requiring the master to insert "SELECT" statements.
This is useful in order to guarantee the "same stream" property, because
the slave must be able to accumulate an identical backlog.

* Slave pings to sub-slaves are now sent in a special form, when the
top-level master is disconnected, in order to don't interfer with the
replication stream. We just use out of band "\n" bytes as in other parts
of the Redis protocol.

An old design document is available here:

https://gist.github.com/antirez/ae068f95c0d084891305

However the implementation is not identical to the description because
during the work to implement it, different changes were needed in order
to make things working well.
2016-11-09 15:37:15 +01:00
Salvatore Sanfilippo
f344d9564a Merge pull request #3568 from MichaelTSS/patch-1
Typo
2016-11-02 15:18:44 +01:00
antirez
0a82c7c275 redis-cli typo fixed: perferences -> preferences.
Thanks to @qiaodaimadelaowang for signaling the issue.
Close #3585.
2016-11-02 15:15:49 +01:00
Salvatore Sanfilippo
a93812acdc Merge pull request #3514 from charsyam/feature/simple-refactoring
Simple change just using slaves instead of server.slaves
2016-11-02 11:04:52 +01:00