21057 Commits

Author SHA1 Message Date
antirez
435ae67ecb Fix comment over 80 cols. 2016-08-03 10:56:26 +02:00
antirez
e3cb70f2e4 Modules: initial draft for a testing module. 2016-08-03 10:23:03 +02:00
antirez
f7a8b65f3d Modules: StringAppendBuffer() and ability to retain strings.
RedisModule_StringRetain() allows, when automatic memory management is
on, keeping string objects alive after the callback returns. It can also
be used to take advantage of Redis reference counting of objects inside
modules.

This is useful because, when implementing new data types, we sometimes
want to reference RedisModuleString objects inside the module's private
data structures, so those string objects must remain valid after the
callback returns even if they are not referenced inside the Redis key
space.
2016-08-02 15:29:04 +02:00
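A minimal sketch of how a module might use the two new calls (not part of the
commit; the command name, module name and the saved pointer are illustrative,
only the RedisModule_* calls come from the module API):

    /* keepdemo.c - illustrative use of RedisModule_StringRetain() and
     * RedisModule_StringAppendBuffer(). */
    #include "redismodule.h"

    static RedisModuleString *saved = NULL; /* module-private reference */

    int Keep_RedisCommand(RedisModuleCtx *ctx, RedisModuleString **argv, int argc) {
        if (argc != 2) return RedisModule_WrongArity(ctx);
        RedisModule_AutoMemory(ctx);             /* automatic memory management on */
        RedisModule_StringRetain(ctx, argv[1]);  /* argv[1] now survives the return */
        saved = argv[1];

        /* Append raw bytes to a string created by the module itself. */
        RedisModuleString *s = RedisModule_CreateString(ctx, "foo", 3);
        RedisModule_StringAppendBuffer(ctx, s, "bar", 3);
        return RedisModule_ReplyWithString(ctx, s);
    }

    int RedisModule_OnLoad(RedisModuleCtx *ctx, RedisModuleString **argv, int argc) {
        REDISMODULE_NOT_USED(argv);
        REDISMODULE_NOT_USED(argc);
        if (RedisModule_Init(ctx, "keepdemo", 1, REDISMODULE_APIVER_1) == REDISMODULE_ERR)
            return REDISMODULE_ERR;
        return RedisModule_CreateCommand(ctx, "keepdemo.keep", Keep_RedisCommand,
                                         "", 0, 0, 0);
    }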
Qu Chen
a03e36addd Fix a bug to delay bgsave while AOF rewrite in progress for replication 2016-08-02 10:44:33 +02:00
antirez
ab5f761493 Remove extra "-" from ASCII horizontal bar in comment. 2016-08-02 10:32:44 +02:00
antirez
43b5a64a8f Ability of slave to announce arbitrary ip/port to master.
This feature is useful especially in deployments using Sentinel to set
up Redis HA, where the slave runs behind NAT or port forwarding, so that
the auto-detected ip/port addresses, as listed in the "INFO replication"
output of the master or as provided by the "ROLE" command, don't match
the real addresses at which the slave is reachable for connections.
2016-07-27 17:32:15 +02:00
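As a usage sketch (assuming the redis.conf directives introduced for this are
slave-announce-ip and slave-announce-port; the address below is an example),
a slave behind NAT can advertise the address at which it is really reachable:

    # redis.conf on the slave side
    slave-announce-ip 203.0.113.10
    slave-announce-port 16379

The master then lists the announced address in "INFO replication" and returns
it from "ROLE", so Sentinel can reach the slave there.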
antirez
23bdeec122 Multiple GEORADIUS bugs fixed.
By grepping the continuous integration error logs, a number of GEORADIUS
test failures were detected.

Fortunately when a GEORADIUS failure happens, the test suite logs enough
information in order to reproduce the problem: the PRNG seed,
coordinates and radius of the query.

By reproducing the issues, three different bugs were discovered and
fixed in this commit. This commit also improves the already good
reporting of the fuzzer and adds the failure vectors as regression
tests.

The issues found:

1. We need larger squares around the poles in order to cover the area
requested by the user. There were already checks to use a smaller step
(larger squares), but the limit set (+/- 67 degrees) was not enough in
certain edge cases, so 66 is used now.

2. Even near the equator, when the search area center is very near the
edge of the square, the north, south, west or east square may not be
able to fully cover the specified radius. Now a test is performed at the
edge of the initially guessed search area, and larger squares are used
if the test fails.

3. Because of rounding errors between Redis and Tcl, sometimes the test
signaled false positives. This is now addressed.

Whenever possible the original code was improved a bit in other ways. A
debugging example stanza was added in order to make the next debugging
session simpler when the next bug is found.
2016-07-27 11:34:25 +02:00
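A self-contained sketch of the idea behind fix "1" (illustrative only, not the
actual geohash helper code): above 66 degrees of absolute latitude the geohash
step is reduced by one, which doubles the side of the squares used to cover
the search area.

    #include <stdint.h>

    /* Illustrative: with a smaller step each geohash cell is larger, so
     * the squares around the search center still cover the requested
     * radius near the poles. The threshold was lowered from 67 to 66. */
    static uint8_t adjust_step_for_latitude(uint8_t step, double lat) {
        if ((lat > 66.0 || lat < -66.0) && step > 1) step--;
        return step;
    }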
antirez
d4b975f0b8 Replication: when possible start RDB saving ASAP.
In a previous commit the replication code was changed in order to
centralize the BGSAVE for replication trigger in replicationCron();
however, after further testing, the 1 second delay imposed by this
change is not acceptable.

So now the BGSAVE is only delayed if the AOF rewriting process is
active. However past commits made sure that replicationCron() is always
able to trigger the BGSAVE when needed, making the code generally more
robust.

The new code is more similar to the initial @oranagra patch where the
BGSAVE was delayed only if an AOF rewrite was in progress.

Trivia: delaying the BGSAVE uncovered a minor Sentinel issue that is now
fixed.
2016-07-22 17:03:18 +02:00
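A minimal sketch of the decision described above, with simplified stand-ins
for the real server state (the actual logic lives in syncCommand() and
replicationCron()):

    #include <stdbool.h>

    struct repl_state {
        int rdb_child_pid;          /* -1 when no RDB saving child exists */
        int aof_child_pid;          /* -1 when no AOF rewrite child exists */
        int slaves_waiting_bgsave;  /* slaves waiting for a BGSAVE to start */
    };

    /* True when the BGSAVE for replication may start right away; otherwise
     * it is postponed and replicationCron() retries about once per second. */
    bool can_start_bgsave_for_replication(const struct repl_state *s) {
        if (s->rdb_child_pid != -1) return false;  /* a save is already running */
        if (s->aof_child_pid != -1) return false;  /* the only reason to delay */
        return s->slaves_waiting_bgsave > 0;
    }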
antirez
d6332b5034 Sentinel: check Slave INFO state more often when disconnected.
During the initial handshake with the master a slave will report a very
high disconnection time from its master (since technically it has been
disconnected forever, so the current UNIX time in seconds is
reported).

However when the slave is connected again, Sentinel may scan the INFO
output again only after 10 seconds, which is a long time. During this
time Sentinels will consider this instance unable to failover, so a
useless delay is introduced.

Actually this hardly happened in practice, because when a slave's
master is down the INFO period for slaves changes to 1 second. However
when a manual failover is attempted immediately after adding slaves
(as in the case of the Sentinel unit test), this problem may happen.

This commit changes the INFO period to 1 second even when the slave's
master is not down, but the slave reported being disconnected from the
master (by publishing, the last time we checked, a master disconnection
time field in INFO).

This change is required as a result of an unrelated change in the
replication code that adds a small delay in the first master-slave
synchronization.
2016-07-22 10:51:25 +02:00
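A sketch of the new period selection, in milliseconds, with simplified names
(not the actual Sentinel fields or constants):

    /* How often Sentinel should send INFO to a slave. Before this commit
     * only a down master triggered the fast period; now a non-zero master
     * link down time reported by the slave does too. */
    #define INFO_PERIOD_NORMAL 10000   /* default: every 10 seconds */
    #define INFO_PERIOD_FAST    1000   /* every second when attention is needed */

    static int slave_info_period(int master_is_down,
                                 long long master_link_down_time) {
        if (master_is_down || master_link_down_time != 0)
            return INFO_PERIOD_FAST;
        return INFO_PERIOD_NORMAL;
    }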
antirez
276dfeb3d0 Avoid simultaneous RDB and AOF child process.
This patch, written in collaboration with Oran Agra (@oranagra), is a companion
to 03a4432. Together the two patches should prevent the AOF and RDB saving
processes from being spawned at the same time. Previously the conditions that
could lead to two saving processes running at the same time were:

1. When AOF is enabled via CONFIG SET and an RDB saving process is
   already active.

2. When the SYNC command decides to start an RDB saving process ASAP in
   order to serve a new slave that cannot partially resynchronize (but
   only if we have a disk target for replication; for diskless
   replication there is no such problem).

Condition "1" is not very severe but "2" can happen often and is
definitely good at degrading Redis performances in an unexpected way.

The two commits have the effect of always spawning RDB saves for
replication in replicationCron() instead of attempting to start an RDB
save synchronously. Moreover, when a BGSAVE or AOF rewrite must be
performed, they are just postponed using flags so that such operations
are performed as soon as possible.

Finally the BGSAVE command was modified to accept a SCHEDULE option, so
that if an AOF rewrite is in progress when this option is given, the
command no longer returns an error but instead schedules an RDB save
for when it will be possible to start it.
2016-07-21 18:35:01 +02:00
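A sketch of the flag-based postponement, with simplified stand-ins for the
real server fields: BGSAVE SCHEDULE sets a flag instead of failing, and the
scheduled save starts once no child process is active.

    struct save_state {
        int aof_child_pid;         /* -1 when no AOF rewrite child exists */
        int rdb_child_pid;         /* -1 when no RDB saving child exists */
        int rdb_bgsave_scheduled;  /* set by BGSAVE SCHEDULE during a rewrite */
    };

    /* Checked periodically (e.g. from the server cron): start the scheduled
     * background save only when neither child process is running. */
    static int should_start_scheduled_bgsave(const struct save_state *s) {
        return s->rdb_bgsave_scheduled &&
               s->aof_child_pid == -1 &&
               s->rdb_child_pid == -1;
    }

From a client's perspective, a plain BGSAVE issued during an AOF rewrite still
returns an error, while BGSAVE SCHEDULE replies that the save has been
scheduled.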
antirez
03a443292e Replication: start BGSAVE for replication always in replicationCron().
This makes the replication code conceptually simpler by removing the
synchronous BGSAVE trigger in syncCommand(). This also means that
socket and disk BGSAVE targets are handled by the same code.
2016-07-21 12:10:56 +02:00
antirez
ef404d604b Fix maxmemory shared integer check bug introduced with LFU. 2016-07-21 11:14:18 +02:00
antirez
36230c7b82 Volatile-ttl eviction policy implemented in terms of the pool.
Precision of the eviction improved noticeably. This also allows us to
have a single code path for most eviction types.
2016-07-20 19:54:12 +02:00
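A sketch of the ranking trick this implies (illustrative, simplified from the
eviction pool code): with volatile-ttl the pool is filled using the key's
expire time, converted so that keys closer to expiring look like the best
candidates, exactly as higher idle times do for LRU.

    #include <limits.h>

    /* Map an absolute expire time (UNIX time in ms) onto the pool's
     * "bigger is better to evict" scale: the sooner a key expires, the
     * larger the returned score. */
    static unsigned long long ttl_pool_score(unsigned long long expire_ms) {
        return ULLONG_MAX - expire_ms;
    }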
antirez
73df375139 LFU: make counter log factor and decay time configurable. 2016-07-20 15:00:35 +02:00
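A redis.conf sketch, assuming the directives added here are lfu-log-factor and
lfu-decay-time (the values shown are illustrative defaults):

    # Higher log factor: the frequency counter saturates more slowly,
    # distinguishing keys with very different access rates.
    lfu-log-factor 10
    # Decay time in minutes before an idle key's counter is reduced.
    lfu-decay-time 1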
antirez
0020a8b5fa LFU: Use the LRU pool for the LFU algorithm.
Verified to have better real-world performance with power-law access
patterns because of the data accumulated across calls.
2016-07-18 18:17:59 +02:00
antirez
1cbf6f06b1 LFU: Fix bugs in frequency decay code. 2016-07-18 14:19:38 +02:00
antirez
368aa4da99 LFU: Initial naive eviction cycle.
It is possible to get better results by using the pool as in the LRU
case. Also, from tests this morning, I believe the current
implementation has issues in the frequency decay function, which should
decrease the counter at periodic intervals.
2016-07-18 13:50:19 +02:00