Technically as soon as Redis 64 bit gets proper support for loading
collections and/or DBs with more than 2^32 elements, the 32 bit version
should be modified in order to check if what we read from rdbLoadLen()
overflows. This would only apply to huge RDB files created with a 64 bit
instance and later loaded into a 32 bit instance.
Technically as soon as Redis 64 bit gets proper support for loading
collections and/or DBs with more than 2^32 elements, the 32 bit version
should be modified in order to check if what we read from rdbLoadLen()
overflows. This would only apply to huge RDB files created with a 64 bit
instance and later loaded into a 32 bit instance.
This patch, written in collaboration with Oran Agra (@oranagra) is a companion
to 780a8b1. Together the two patches should avoid that the AOF and RDB saving
processes can be spawned at the same time. Previously conditions that
could lead to two saving processes at the same time were:
1. When AOF is enabled via CONFIG SET and an RDB saving process is
already active.
2. When the SYNC command decides to start an RDB saving process ASAP in
order to serve a new slave that cannot partially resynchronize (but
only if we have a disk target for replication, for diskless
replication there is not such a problem).
Condition "1" is not very severe but "2" can happen often and is
definitely good at degrading Redis performances in an unexpected way.
The two commits have the effect of always spawning RDB savings for
replication in replicationCron() instead of attempting to start an RDB
save synchronously. Moreover when a BGSAVE or AOF rewrite must be
performed, they are instead just postponed using flags that will try to
perform such operations ASAP.
Finally the BGSAVE command was modified in order to accept a SCHEDULE
option so that if an AOF rewrite is in progress, when this option is
given, the command no longer returns an error, but instead schedules an
RDB rewrite operation for when it will be possible to start it.
This patch, written in collaboration with Oran Agra (@oranagra) is a companion
to 780a8b1. Together the two patches should avoid that the AOF and RDB saving
processes can be spawned at the same time. Previously conditions that
could lead to two saving processes at the same time were:
1. When AOF is enabled via CONFIG SET and an RDB saving process is
already active.
2. When the SYNC command decides to start an RDB saving process ASAP in
order to serve a new slave that cannot partially resynchronize (but
only if we have a disk target for replication, for diskless
replication there is not such a problem).
Condition "1" is not very severe but "2" can happen often and is
definitely good at degrading Redis performances in an unexpected way.
The two commits have the effect of always spawning RDB savings for
replication in replicationCron() instead of attempting to start an RDB
save synchronously. Moreover when a BGSAVE or AOF rewrite must be
performed, they are instead just postponed using flags that will try to
perform such operations ASAP.
Finally the BGSAVE command was modified in order to accept a SCHEDULE
option so that if an AOF rewrite is in progress, when this option is
given, the command no longer returns an error, but instead schedules an
RDB rewrite operation for when it will be possible to start it.
As Oran Agra suggested, in startBgsaveForReplication() when the BGSAVE
attempt returns an error, we scan the list of slaves in order to remove
them since there is no way to serve them currently.
However we check for the replication state BGSAVE_START, which was
modified by rdbSaveToSlaveSockets() before forking(). So when fork fails
the state of slaves remain BGSAVE_END and no cleanup is performed.
This commit fixes the problem by making rdbSaveToSlavesSockets() able to
undo the state change on fork failure.
In previous commits we moved the FULLRESYNC to the moment we start the
BGSAVE, so that the offset we provide is the right one. However this
also means that we need to re-emit the SELECT statement every time a new
slave starts to accumulate the changes.
To obtian this effect in a more clean way, the function that sends the
FULLRESYNC reply was overloaded with a more important role of also doing
this and chanigng the slave state. So it was renamed to
replicationSetupSlaveForFullResync() to better reflect what it does now.
This commit attempts to fix a bug involving PSYNC and diskless
replication (currently experimental) found by Yuval Inbar from Redis Labs
and that was later found to have even more far reaching effects (the bug also
exists when diskstore is off).
The gist of the bug is that, a Redis master replies with +FULLRESYNC to
a PSYNC attempt that fails and requires a full resynchronization.
However, the baseline offset sent along with FULLRESYNC was always the
current master replication offset. This is not ok, because there are
many reasosn that may delay the RDB file creation. And... guess what,
the master offset we communicate must be the one of the time the RDB
was created. So for example:
1) When the BGSAVE for replication is delayed since there is one
already but is not good for replication.
2) When the BGSAVE is not needed as we attach one currently ongoing.
3) When because of diskless replication the BGSAVE is delayed.
In all the above cases the PSYNC reply is wrong and the slave may
reconnect later claiming to need a wrong offset: this may cause
data curruption later.
Previouly if we loaded a corrupt RDB, Redis printed an error report
with a big "REPORT ON GITHUB" message at the bottom. But, we know
RDB load failures are corrupt data, not corrupt code.
Now when RDB failure is detected (duplicate keys or unknown data
types in the file), we run check-rdb against the RDB then exit. The
automatic check-rdb hopefully gives the user instant feedback
about what is wrong instead of providing a mysterious stack
trace.
It's possible large objects could be larger than 'int', so let's
upgrade all size counters to ssize_t.
This also fixes rdbSaveObject serialized bytes calculation.
Since entire serializations of data structures can be large,
so we don't want to limit their calculated size to a 32 bit signed max.
This commit increases object size calculation and
cascades the change back up to serializedlength printing.
Before:
127.0.0.1:6379> debug object hihihi
... encoding:quicklist serializedlength:-2147483559 ...
After:
127.0.0.1:6379> debug object hihihi
... encoding:quicklist serializedlength:2147483737 ...