This feature was implemented in the initial days of the Redis Cluster
implementaiton but is not a good idea at all.
1) It depends on clocks to be synchronized, that is already very bad.
2) Moreover it adds a bug where the pong time is updated via gossip so
no new PING is ever sent by the current node, with the effect of no PONG
received, no update of tables, no clearing of PFAIL flag.
In general to trust other nodes about the reachability of other nodes is
a broken distributed programming model.
This feature was implemented in the initial days of the Redis Cluster
implementaiton but is not a good idea at all.
1) It depends on clocks to be synchronized, that is already very bad.
2) Moreover it adds a bug where the pong time is updated via gossip so
no new PING is ever sent by the current node, with the effect of no PONG
received, no update of tables, no clearing of PFAIL flag.
In general to trust other nodes about the reachability of other nodes is
a broken distributed programming model.
The previous hashing used the trivial algorithm of xoring the integers
together. This is not optimal as it is very likely that different
hash table setups will hash the same, for instance an hash table at the
start of the rehashing process, and at the end, will have the same
fingerprint.
Now we hash N integers in a smarter way, by summing every integer to the
previous hash, and taking the integer hashing again (see the code for
further details). This way it is a lot less likely that we get a
collision. Moreover this way of hashing explicitly protects from the
same set of integers in a different order to hash to the same number.
This commit is related to issue #1240.
The previous hashing used the trivial algorithm of xoring the integers
together. This is not optimal as it is very likely that different
hash table setups will hash the same, for instance an hash table at the
start of the rehashing process, and at the end, will have the same
fingerprint.
Now we hash N integers in a smarter way, by summing every integer to the
previous hash, and taking the integer hashing again (see the code for
further details). This way it is a lot less likely that we get a
collision. Moreover this way of hashing explicitly protects from the
same set of integers in a different order to hash to the same number.
This commit is related to issue #1240.
This commit does mainly two things:
1) It fixes zunionInterGenericCommand() by removing mass-initialization
of all the iterators used, so that we don't violate the unsafe iterator
API of dictionaries. This fixes issue #1240.
2) Since the zui* APIs required the allocator to be initialized in the
zsetopsrc structure in order to use non-iterator related APIs, this
commit fixes this strict requirement by accessing objects directly via
the op->subject->ptr pointer we have to the object.
This commit does mainly two things:
1) It fixes zunionInterGenericCommand() by removing mass-initialization
of all the iterators used, so that we don't violate the unsafe iterator
API of dictionaries. This fixes issue #1240.
2) Since the zui* APIs required the allocator to be initialized in the
zsetopsrc structure in order to use non-iterator related APIs, this
commit fixes this strict requirement by accessing objects directly via
the op->subject->ptr pointer we have to the object.
dict.c allows the user to create unsafe iterators, that are iterators
that will not touch the dictionary data structure in any way, preventing
copy on write, but at the same time are limited in their usage.
The limitation is that when itearting with an unsafe iterator, no call
to other dictionary functions must be done inside the iteration loop,
otherwise the dictionary may be incrementally rehashed resulting into
missing elements in the set of the elements returned by the iterator.
However after introducing this kind of iterators a number of bugs were
found due to misuses of the API, and we are still finding
bugs about this issue. The bugs are not trivial to track because the
effect is just missing elements during the iteartion.
This commit introduces auto-detection of the API misuse. The idea is
that an unsafe iterator has a contract: from initialization to the
release of the iterator the dictionary should not change.
So we take a fingerprint of the dictionary state, xoring a few important
dict properties when the unsafe iteartor is initialized. We later check
when the iterator is released if the fingerprint is still the same. If it
is not, we found a misuse of the iterator, as not allowed API calls
changed the internal state of the dictionary.
This code was checked against a real bug, issue #1240.
This is what Redis prints (aborting) when a misuse is detected:
Assertion failed: (iter->fingerprint == dictFingerprint(iter->d)),
function dictReleaseIterator, file dict.c, line 587.
dict.c allows the user to create unsafe iterators, that are iterators
that will not touch the dictionary data structure in any way, preventing
copy on write, but at the same time are limited in their usage.
The limitation is that when itearting with an unsafe iterator, no call
to other dictionary functions must be done inside the iteration loop,
otherwise the dictionary may be incrementally rehashed resulting into
missing elements in the set of the elements returned by the iterator.
However after introducing this kind of iterators a number of bugs were
found due to misuses of the API, and we are still finding
bugs about this issue. The bugs are not trivial to track because the
effect is just missing elements during the iteartion.
This commit introduces auto-detection of the API misuse. The idea is
that an unsafe iterator has a contract: from initialization to the
release of the iterator the dictionary should not change.
So we take a fingerprint of the dictionary state, xoring a few important
dict properties when the unsafe iteartor is initialized. We later check
when the iterator is released if the fingerprint is still the same. If it
is not, we found a misuse of the iterator, as not allowed API calls
changed the internal state of the dictionary.
This code was checked against a real bug, issue #1240.
This is what Redis prints (aborting) when a misuse is detected:
Assertion failed: (iter->fingerprint == dictFingerprint(iter->d)),
function dictReleaseIterator, file dict.c, line 587.
The previous code using a static buffer as an optimization was lame:
1) Premature optimization, actually it was *slower* than naive code
because resulted into the creation / destruction of the object
encapsulating the output buffer.
2) The code was very hard to test, since it was needed to have specific
tests for command lines exceeding the size of the static buffer.
3) As a result of "2" the code was bugged as the current tests were not
able to stress specific corner cases.
It was replaced with easy to understand code that is safer and faster.