98 Commits

Author SHA1 Message Date
Muhammad Zahalqa
6bbfb97a89 Fix compiler warnings on function rev(unsigned long) 2020-05-08 10:37:36 +02:00
OMG-By
99427a1183 fix: dict.c->dictResize()->minimal type 2020-03-31 16:57:20 +02:00
Oran Agra
0693ac1961 RM_Scan disable dict rehashing
The callback approach we took is very efficient, the module can do any
filtering of keys without building any list and cloning strings, it can
also read data from the key's value. but if the user tries to re-open
the key, or any other key, this can cause dict re-hashing (dictFind does
that), and that's very bad to do from inside dictScan.

this commit protects the dict from doing any rehashing during scan, but
also warns the user not to attempt any writes or command calls from
within the callback, for fear of unexpected side effects and crashes.
2020-03-05 12:51:14 +01:00
antirez
0ad63de59d Set dictGetFairRandomKey() samples to 20 for final version.
Distribution improves dramatically: tests show it clearly. Better to
have a slower implementation than a wrong one, because random member
extraction should be correct or tends to be useless for a number of
tasks.
2019-02-19 17:27:42 +01:00
antirez
34982b0d0e Limit sampling size in dictGetFairRandomKey().
This way the implementation is almost as fast as the original one, but
the distribution is not too bad.
2019-02-18 18:38:40 +01:00
antirez
aae7e1bff0 Better distribution for set get-random-element operations. 2019-02-18 18:27:18 +01:00
antirez
44d33611d0 dict.c: remove a few trailing spaces. 2018-07-17 10:39:47 +02:00
Salvatore Sanfilippo
800c9de8c2 Merge pull request #5128 from kingpeterpaule/remove-one-loop-in-freeMemoryIfNeeded
remove ineffective loop in dictGetSomeKeys.
2018-07-17 10:38:55 +02:00
paule
ee1efee3cb Update dict.c
change coding style.
2018-07-16 14:29:59 +08:00
peterpaule
c05fa8c4e7 remove one ineffective loop in dictGetSomeKeys. 2018-07-16 11:28:22 +08:00
Jack Drogon
bae1d36e5d Fix typo 2018-07-03 18:19:46 +02:00
Salvatore Sanfilippo
add9a38835 Merge pull request #4800 from soloestoy/dict-expand
adjust position of _dictNextPower in dictExpand
2018-06-08 12:32:18 +02:00
赵磊
0b8ca6cffe Fix dictScan(): It can't scan all buckets when dict is shrinking. 2018-05-08 15:30:11 +08:00
zhaozhao.zz
76ca9f436f adjust position of _dictNextPower in dictExpand 2018-03-29 17:36:15 +08:00
zhaozhao.zz
3ab4a9020d dict: fix the int problem for defrag 2017-12-05 15:38:03 +01:00
zhaozhao.zz
5b651f2f0a dict: fix the int problem 2017-12-05 15:37:59 +01:00
Salvatore Sanfilippo
4278b57cd3 Merge pull request #2741 from kmiku7/unstable
fix boundary case for _dictNextPower
2017-11-08 17:06:09 +01:00
antirez
93b5ed037c Use locale agnostic tolower() in dict.c hash function. 2017-02-20 17:39:44 +01:00
antirez
7a20dc7919 Use SipHash hash function to mitigate HashDos attempts.
This change attempts to switch to an hash function which mitigates
the effects of the HashDoS attack (denial of service attack trying
to force data structures to worst case behavior) while at the same time
providing Redis with an hash function that does not expect the input
data to be word aligned, a condition no longer true now that sds.c
strings have a varialbe length header.

Note that it is possible sometimes that even using an hash function
for which collisions cannot be generated without knowing the seed,
special implementation details or the exposure of the seed in an
indirect way (for example the ability to add elements to a Set and
check the return in which Redis returns them with SMEMBERS) may
make the attacker's life simpler in the process of trying to guess
the correct seed, however the next step would be to switch to a
log(N) data structure when too many items in a single bucket are
detected: this seems like an overkill in the case of Redis.

SPEED REGRESION TESTS:

In order to verify that switching from MurmurHash to SipHash had
no impact on speed, a set of benchmarks involving fast insertion
of 5 million of keys were performed.

The result shows Redis with SipHash in high pipelining conditions
to be about 4% slower compared to using the previous hash function.
However this could partially be related to the fact that the current
implementation does not attempt to hash whole words at a time but
reads single bytes, in order to have an output which is endian-netural
and at the same time working on systems where unaligned memory accesses
are a problem.

Further X86 specific optimizations should be tested, the function
may easily get at the same level of MurMurHash2 if a few optimizations
are performed.
2017-02-20 17:29:17 +01:00
oranagra
b260c7ef74 active defrag improvements 2017-01-02 09:42:32 +02:00
oranagra
c053025a0a active memory defragmentation 2016-12-30 03:37:52 +02:00
antirez
e396deda94 dict.c: fix dictGenericDelete() return ASAP condition.
Recently we moved the "return ASAP" condition for the Delete() function
from checking .size to checking .used, which is smarter, however while
testing the first table alone always works to ensure the dict is totally
emtpy, when we test the .size field, testing .used requires testing both
T0 and T1, since a rehashing could be in progress.
2016-09-20 17:22:30 +02:00
antirez
e2106f0281 dict.c: dictReplaceRaw() -> dictAddOrFind().
What they say about "naming things" in programming?
2016-09-14 16:43:38 +02:00
antirez
7f608a61de Apply the new dictUnlink() where possible.
Optimizations suggested and originally implemented by @oranagra.
Re-applied by @antirez using the modified API.
2016-09-14 16:37:53 +02:00
oranagra
90af5e0825 dict.c: introduce dictUnlink().
Notes by @antirez:

This patch was picked from a larger commit by Oran and adapted to change
the API a bit. The basic idea is to avoid double lookups when there is
to use the value of the deleted entry.

BEFORE:

    entry = dictFind( ... ); /* 1st lookup. */
    /* Do somethjing with the entry. */
    dictDelete(...);         /* 2nd lookup. */

AFTER:

    entry = dictUnlink( ... ); /* 1st lookup. */
    /* Do somethjing with the entry. */
    dictFreeUnlinkedEntry(entry); /* No lookups!. */
2016-09-14 12:18:59 +02:00
oranagra
4448185b52 Optimize repeated keyname hashing.
(Change cherry-picked and modified by @antirez from a larger commit
provided by @oranagra in PR #3223).
2016-09-12 13:19:05 +02:00
antirez
c9e62f0056 dict.c benchmark minor improvements. 2016-09-07 15:28:40 +02:00
antirez
a66a31feb6 dict.c benchmark: mixed del/insert benchmark. 2016-09-07 12:34:53 +02:00
antirez
78c178dfa3 dict.c benchmark: finish rehashing before testing lookups. 2016-09-07 11:06:03 +02:00
antirez
f3742ec364 dict.c benchmark improvements. 2016-09-07 10:53:47 +02:00
antirez
7d22d0328d dict.c benchmark: take optional count argument. 2016-09-07 10:44:29 +02:00
antirez
2a29f2334e dict.c benchmark. 2016-09-07 10:33:15 +02:00
Oran Agra
0f82ba8236 dict.c minor optimization 2016-04-25 16:48:25 +03:00
antirez
ca8d13bd90 Lazyfree: a first implementation of non blocking DEL. 2015-10-01 13:00:19 +02:00
kmiku7
b42dceab12 fix boundary case for _dictNextPower 2015-08-23 16:47:42 +08:00
antirez
1b16400551 DEBUG HTSTATS <dbid> added.
The command reports information about the hash table internal state
representing the specified database ID.

This can be used in order to investigate rehashings, memory usage issues
and for other debugging purposes.
2015-07-14 17:15:37 +02:00
antirez
df2a1e0901 dict.c: convert types to unsigned long where appropriate.
No semantical changes since to make dict.c truly able to scale over the
32 bit table size limit, the hash function shoulds and other internals
related to hash function output should be 64 bit ready.
2015-03-27 10:14:52 +01:00
antirez
ce5d516e75 dict.c: add casting to avoid compilation warning.
rehashidx is always positive in the two code paths, since the only
negative value it could have is -1 when there is no rehashing in
progress, and the condition is explicitly checked.
2015-03-27 10:12:25 +01:00
antirez
87a6696d89 SPOP: reimplemented for speed and better distribution.
The old version of SPOP with "count" argument used an API call of dict.c
which was actually designed for a different goal, and was not capable of
good distribution. We follow a different three-cases approach optimized
for different ratiion between sets and requested number of elements.

The implementation is simpler and allowed the removal of a large amount
of code.
2015-02-11 10:52:28 +01:00
antirez
e1da371706 dict.c: reset emptylen when bucket is not empty.
Fixed by @oranagra, thank you.
2015-02-11 10:52:27 +01:00
antirez
2bb647770d dict.c: add dictGetSomeKeys(), specialized for eviction. 2015-02-11 10:52:27 +01:00
antirez
6086a3656a dict.c: avoid code repetition in dictRehash().
Avoid code repetition introduced with PR #2367, also fixes the return
value to always return 0 if there is nothing more to rehash.
2015-02-11 10:52:27 +01:00
Sun He
9c047bbfec dict.c/dictRehash: check again to update 2015-02-11 10:52:26 +01:00
antirez
15940c1988 dict.c: don't try buckets that are empty for sure in dictGetRandomKey().
This is very similar to the optimization applied to dictGetRandomKeys,
but applied to the single key variant.

Related to issue #2306.
2015-02-11 10:52:26 +01:00
antirez
5d106de0c9 dict.c: dictGetRandomKeys() optimization for big->small table case.
Related to issue #2306.
2015-02-11 10:52:26 +01:00
antirez
71b990a7ab dict.c: dictGetRandomKeys() visit pattern optimization.
We use the invariant that the original table ht[0] is never populated up
to the index before the current rehashing index.

Related to issue #2306.
2015-02-11 10:52:26 +01:00
antirez
19d51b181b dict.c: put a bound to max work dictRehash() call can do.
Related to issue #2306.
2015-02-11 10:52:26 +01:00
antirez
1115984e9f dict.c: prevent useless resize to same size.
Related to issue #2306.
2015-02-11 10:52:26 +01:00
antirez
4c149c5ab9 Less blocking dictGetRandomKeys().
Related to issue #2306.
2015-02-11 10:52:26 +01:00
antirez
848a382619 dict.c: make chaining strategy more clear in dictAddRaw(). 2015-01-23 18:11:05 +01:00