98 Commits

Author SHA1 Message Date
Matt Stancliff
0cd666c85e Cleanup wording of dictScan() comment
Some language in the comment was difficult
to understand, so this commit: clarifies wording, removes
unnecessary words, and relocates some dependent clauses
closer to what they actually describe.

I also tried to break up longer chains of thought
(if X, then Y, and Q, and also F, so obviously M)
into more manageable chunks for ease of understanding.
2014-09-29 06:49:08 -04:00
Michael Parker
cb4ce4ab18 Fix hash table size in comment for dictScan
Closes #1351
2014-09-29 06:49:07 -04:00
antirez
4db8a7396c Fix dictRehash assert casting type.
Also related to #1929.
2014-08-26 10:32:44 +02:00
antirez
4937702744 Cast to right type in dictNext().
This closes issue #1929, the other part was fixed in the context of issue
2014-08-26 10:26:36 +02:00
Cong Ding
ca13fbca08 Remove unused function
Closes #878
2014-08-18 11:12:26 +02:00
antirez
2e94ffb1d1 Remove warnings and improve integer sign correctness. 2014-08-13 11:44:38 +02:00
antirez
28573f8fe8 Added dictGetRandomKeys() to dict.c: mass get random entries.
This new function is useful to get a number of random entries from a
hash table when we just need to do some sampling without particularly
good distribution.

It just jumps at a random place of the hash table and returns the first
N items encountered by scanning linearly.

The main usefulness of this function is to speed up Redis internal
sampling of the key space, for example for key eviction or expiry.
2014-03-20 15:50:46 +01:00
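
The sampling strategy described in commit 28573f8fe8 (jump to a random bucket, then return the first N entries found by scanning linearly) can be sketched as follows. This is a minimal illustration, not the code in dict.c: the `entry` and `table` types and the `sample_entries` name are hypothetical, and incremental rehashing is ignored.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical minimal chained hash table; not the real dict.c layout. */
typedef struct entry {
    int key;
    struct entry *next;
} entry;

typedef struct table {
    entry **buckets;
    size_t size; /* number of buckets */
    size_t used; /* number of stored entries */
} table;

/* Store up to `count` entries in `out`, starting at a random bucket and
 * walking the buckets linearly (wrapping around at the end).
 * Returns the number of entries stored. */
size_t sample_entries(const table *t, entry **out, size_t count) {
    if (t->size == 0 || t->used == 0) return 0;
    if (count > t->used) count = t->used;

    size_t stored = 0;
    size_t start = (size_t)rand() % t->size;
    for (size_t visited = 0; visited < t->size && stored < count; visited++) {
        size_t i = (start + visited) % t->size;
        for (entry *e = t->buckets[i]; e != NULL && stored < count; e = e->next)
            out[stored++] = e;
    }
    return stored;
}
```

Because neighbouring buckets are returned together, the sample is cheap to obtain but not uniformly distributed, which matches the trade-off the commit message describes.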
zhanghailei
37bd3f04bb FIXED a typo more thank should be more than 2014-03-04 11:21:34 +08:00
zhanghailei
1ec4a421e3 According to context,the size should be 16 rather than 64 2014-03-04 11:21:34 +08:00
antirez
247a311317 dict.c: added optional callback to dictEmpty().
Redis hash table implementation has many non-blocking features like
incremental rehashing, however while deleting a large hash table there
was no way to have a callback called to do some incremental work.

This commit adds this support, as an optional callback argument to
dictEmpty() that is currently called at a fixed interval (one time every
65k deletions).
2013-12-10 18:46:24 +01:00
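
The idea in commit 247a311317, an optional callback invoked at a fixed interval while a large structure is being freed, can be sketched like this. The example is an assumption-laden illustration: it frees a plain linked list rather than a hash table, and `node`, `progress_cb`, `free_all`, `build_list`, and `count_calls` are hypothetical names, not the dict.c API.

```c
#include <stddef.h>
#include <stdlib.h>

typedef struct node {
    struct node *next;
} node;

/* Callback type: receives caller-supplied private data. */
typedef void progress_cb(void *privdata);

/* Free every node in the list. If `cb` is not NULL, call it once every
 * 65536 deletions so the caller can do some incremental work.
 * Returns the number of nodes freed. */
size_t free_all(node *head, progress_cb *cb, void *privdata) {
    size_t deleted = 0;
    while (head != NULL) {
        node *next = head->next;
        free(head);
        deleted++;
        if (cb != NULL && (deleted & 65535) == 0) cb(privdata);
        head = next;
    }
    return deleted;
}

/* Helper for the usage example: build a list of n nodes. */
node *build_list(size_t n) {
    node *head = NULL;
    for (size_t i = 0; i < n; i++) {
        node *e = malloc(sizeof(*e));
        if (e == NULL) abort();
        e->next = head;
        head = e;
    }
    return head;
}

/* Example callback: count how many times it was invoked. */
void count_calls(void *privdata) {
    (*(size_t *)privdata)++;
}
```

Freeing a list of 200000 nodes with `count_calls` invokes the callback three times (after 65536, 131072, and 196608 deletions). Passing NULL as the callback keeps the original non-incremental behaviour.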
antirez
7a5a646df9 Fixed grammar: before H the article is a, not an. 2013-12-05 16:35:32 +01:00
antirez
865d3b0f33 removed not used vars in dictScan(). 2013-11-05 11:56:11 +01:00
antirez
99efa37a6b dictScan(): empty hash table requires special handling. 2013-10-28 11:17:18 +01:00
antirez
f86c07df30 Fixed typos in dictScan() comment. 2013-10-25 17:05:55 +02:00
antirez
817e6766aa dictScan() algorithm documented. 2013-10-25 17:01:30 +02:00
Pieter Noordhuis
f18269d1ef Fix error in scan algorithm
The irrelevant bits shouldn't be masked to 1. This can result in slots being
skipped when the hash table is resized between calls to the iterator.
2013-10-25 10:50:03 +02:00
Pieter Noordhuis
956c0ed927 Add SCAN command 2013-10-25 10:49:48 +02:00
antirez
13c59cfdc8 dictFingerprint(): cast pointers to integer of same size. 2013-08-20 11:49:55 +02:00
antirez
5824ab54dd Revert "Fixed type in dict.c comment: 265 -> 256."
This reverts commit d22d557e41151c1d716045e0059550e197d6e526.
2013-08-19 17:25:48 +02:00
antirez
d22d557e41 Fixed type in dict.c comment: 265 -> 256. 2013-08-19 15:10:37 +02:00
antirez
ded611636f assert.h replaced with redisassert.h when appropriate.
Also a warning was suppressed by including unistd.h in redisassert.h
(needed for _exit()).
2013-08-19 15:01:21 +02:00
antirez
ae1bb62f62 dictFingerprint() fingerprinting made more robust.
The previous hashing used the trivial algorithm of xoring the integers
together. This is not optimal as it is very likely that different
hash table setups will hash the same, for instance a hash table at the
start of the rehashing process, and at the end, will have the same
fingerprint.

Now we hash N integers in a smarter way, by summing every integer to the
previous hash, and taking the integer hashing again (see the code for
further details). This way it is a lot less likely that we get a
collision. Moreover this way of hashing explicitly protects from the
same set of integers in a different order to hash to the same number.

This commit is related to issue #1240.
2013-08-19 15:01:12 +02:00
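
The difference between the two fingerprinting schemes in commit ae1bb62f62 can be sketched as below. The commit message only says that each integer is summed into the previous hash and the result is hashed again; the specific 64-bit mixing steps in `mix64` are an assumption made for illustration (a Thomas Wang style integer hash), and all function names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed 64-bit integer mixing function; every step is invertible,
 * so distinct inputs always give distinct outputs. */
uint64_t mix64(uint64_t h) {
    h = (~h) + (h << 21);
    h ^= h >> 24;
    h = (h + (h << 3)) + (h << 8);
    h ^= h >> 14;
    h = (h + (h << 2)) + (h << 4);
    h ^= h >> 28;
    h += h << 31;
    return h;
}

/* Old scheme: XOR all integers together. Order is lost. */
uint64_t fingerprint_xor(const uint64_t *v, size_t n) {
    uint64_t hash = 0;
    for (size_t i = 0; i < n; i++) hash ^= v[i];
    return hash;
}

/* New scheme: add each integer to the running hash, then mix again,
 * so the result depends on the order of the integers. */
uint64_t fingerprint_chained(const uint64_t *v, size_t n) {
    uint64_t hash = 0;
    for (size_t i = 0; i < n; i++) {
        hash += v[i];
        hash = mix64(hash);
    }
    return hash;
}
```

With XOR, the inputs `{1, 2}` and `{2, 1}` produce the same value, which is the start-of-rehash versus end-of-rehash collision the commit mentions. The chained version is designed so that reordering changes the result.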
antirez
bfaadb0df2 dict.c iterator API misuse protection.
dict.c allows the user to create unsafe iterators, that are iterators
that will not touch the dictionary data structure in any way, preventing
copy on write, but at the same time are limited in their usage.

The limitation is that when iterating with an unsafe iterator, no call
to other dictionary functions must be done inside the iteration loop,
otherwise the dictionary may be incrementally rehashed, resulting in
missing elements in the set of the elements returned by the iterator.

However after introducing this kind of iterators a number of bugs were
found due to misuses of the API, and we are still finding
bugs about this issue. The bugs are not trivial to track because the
effect is just missing elements during the iteration.

This commit introduces auto-detection of the API misuse. The idea is
that an unsafe iterator has a contract: from initialization to the
release of the iterator the dictionary should not change.

So we take a fingerprint of the dictionary state, xoring a few important
dict properties when the unsafe iterator is initialized. We later check
when the iterator is released if the fingerprint is still the same. If it
is not, we found a misuse of the iterator, as not allowed API calls
changed the internal state of the dictionary.

This code was checked against a real bug, issue #1240.

This is what Redis prints (aborting) when a misuse is detected:

Assertion failed: (iter->fingerprint == dictFingerprint(iter->d)),
function dictReleaseIterator, file dict.c, line 587.
2013-08-19 15:00:57 +02:00
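
The misuse detection described in commit bfaadb0df2 can be sketched as follows: record a fingerprint when the unsafe iterator is created and compare it when the iterator is released. This is a simplified illustration under stated assumptions: the `table` and `unsafe_iter` types and all function names are hypothetical, only three properties are XORed, and the release function returns a flag where the commit describes an assertion that aborts.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical table: only the fields that feed the fingerprint. */
typedef struct table {
    void *buckets;
    size_t size;
    size_t used;
} table;

typedef struct unsafe_iter {
    const table *t;
    uint64_t fingerprint; /* table state when the iterator was created */
} unsafe_iter;

/* XOR a few important properties of the table together. */
uint64_t table_fingerprint(const table *t) {
    return (uint64_t)(uintptr_t)t->buckets ^ (uint64_t)t->size ^ (uint64_t)t->used;
}

unsafe_iter iter_init(const table *t) {
    unsafe_iter it = { t, table_fingerprint(t) };
    return it;
}

/* Returns 1 if the table is unchanged since iter_init(), 0 on misuse.
 * An implementation that aborts would assert on this condition. */
int iter_release_ok(const unsafe_iter *it) {
    return it->fingerprint == table_fingerprint(it->t);
}
```

If the caller adds an element between `iter_init` and `iter_release_ok` (for example, `used` goes from 1 to 2), the fingerprints differ and the misuse is reported.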
guiquanz
df7a5b7157 Fixed many typos. 2013-01-19 10:59:44 +01:00
charsyam
08402aee73 Remove unnecessary condition in _dictExpandIfNeeded (dict.c) 2012-11-28 11:44:39 +01:00
antirez
a32d1ddff6 BSD license added to every C source and header file. 2012-11-08 18:31:32 +01:00
antirez
0c7d3bef67 Hash function switched to murmurhash2.
The previously used hash function, djbhash, is not secure against
collision attacks even when the seed is randomized as there are simple
ways to find seed-independent collisions.

The new hash function appears to be safe (or much harder to exploit at
least) in this case, and has better distribution.

Better distribution does not always mean better speed. For instance in
a fast benchmark with "DEBUG POPULATE 1000000" I obtained the following
results:

    1.6 seconds with djbhash
    2.0 seconds with murmurhash2

This is due to the fact that djbhash will hash objects that follow the
pattern `prefix:<id>` and where the id is numerically near, to near
buckets. This improves the locality.

However in other access patterns with keys that have no relation
murmurhash2 has some (apparently minimal) speed advantage.

On the other hand a better distribution should significantly
improve the quality of the distribution of elements returned with
dictGetRandomKey() that is used in SPOP, SRANDMEMBER, RANDOMKEY, and
other commands.

Everything considered, and under the suspicion that this commit fixes a
security issue in Redis, we are switching to the new hash function.
If some serious speed regression is found in the future we'll be able
to step back easily.

This commit fixes issue #663.
2012-10-05 11:20:13 +02:00
antirez
0f3705f3c8 Even inside #if 0 comments are comments. 2012-04-21 21:49:21 +02:00
Salvatore Sanfilippo
cc080d1fc2 Merge pull request #440 from ErikDubbelboer/spelling
Fixed some spelling errors in comments
2012-04-21 03:31:06 -07:00
antirez
8fc5a95344 Currently not used code in dict.c commented out. 2012-04-18 23:56:07 +02:00
Erik Dubbelboer
358745fcc2 Update src/dict.c 2012-04-07 15:45:53 +03:00
Erik Dubbelboer
1c82a561f1 Fixed some spelling errors in the comments 2012-04-07 14:40:29 +02:00
huangz1990
f7192a2ce1 fix typo 2012-03-15 14:27:14 +08:00
antirez
36d5d67ed7 fixed typo in hash function seed default value. It is no longer used but fixed to retain the old constant as default anyway. 2012-01-22 01:40:23 +01:00
antirez
fff238e507 Fix for hash table collision attack. We simply randomize hash table initialization value at startup time. 2012-01-21 23:30:13 +01:00
antirez
cd3dc80e8c dict.c: added macros in dict.h to set signed and unsigned 64 bit values directly inside the hash entry without using additional memory. 2011-11-08 19:41:29 +01:00
antirez
d6c3b3004e dict.c API names modified to be more concise and consistent. 2011-11-08 17:07:55 +01:00
antirez
e1ebf77694 dict.c: added two lower level methods for directly manipulating hash entries. This is useful in order to set 64 bit integers as values directly inside the hash entry (in order to save memory), without casting, and even in 32 bit builds. 2011-11-08 16:57:20 +01:00
antirez
be38c7b77b added a union in the dict.h structure to store 64 bit integers directly into hash table entries. 2011-11-02 15:28:45 +01:00
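
The three commits above (be38c7b77b, e1ebf77694, cd3dc80e8c) describe storing 64 bit integers directly inside the hash entry through a union, with macros to set them. A minimal sketch of that layout follows. The `entry` type and the `ENTRY_*` macro names are hypothetical, chosen for illustration, and are not the names used in dict.h.

```c
#include <stdint.h>

/* Hypothetical hash entry: the value slot is a union, so a 64 bit
 * integer can live in the entry itself instead of behind a pointer. */
typedef struct entry {
    void *key;
    union {
        void *val;
        uint64_t u64;
        int64_t s64;
    } v;
    struct entry *next;
} entry;

/* Set and get integer values without casting through void *. */
#define ENTRY_SET_SIGNED(e, x)   do { (e)->v.s64 = (x); } while (0)
#define ENTRY_SET_UNSIGNED(e, x) do { (e)->v.u64 = (x); } while (0)
#define ENTRY_GET_SIGNED(e)      ((e)->v.s64)
#define ENTRY_GET_UNSIGNED(e)    ((e)->v.u64)
```

Because the union is as wide as its largest member, the full 64 bit range is available even on 32 bit builds where `void *` is only 4 bytes, and no separate allocation is needed for the integer.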
antirez
d4e65ce0c2 Introduced a safe iterator interface that can be used to iterate while accessing the dictionary at the same time. Now the default interface is considered unsafe and should be used only with dictNext(). 2011-05-10 10:15:50 +02:00
antirez
fd16bf40d9 fixed two diskstore issues, a quasi-deadlock creating problems with I/O speed and a race condition among threads 2011-02-11 11:16:15 +01:00
antirez
5d33a8862a command lookup process turned into a much more flexible and probably faster hash table 2010-11-03 11:23:59 +01:00
antirez
7d0534c0df This should fix Issue 332: when there is a background process saving we still allow the hash tables to grow, but only when a critical threshold is reached. Formerly we prevented the resize entirely, triggering pathological O(N) behavior. Also there is a fix for the statistics in INFO about the number of keys expired. 2010-09-15 14:09:41 +02:00
antirez
e0be2289e9 hash table example commented out in dict.c 2010-07-27 10:00:38 +02:00
Benjamin Kramer
399f2f401c Add zcalloc and use it where appropriate
calloc is more efficient than malloc+memset when the system uses mmap to
allocate memory. mmap always returns zeroed memory so the memset can be
avoided.  The threshold to use mmap is 16k in osx libc and 128k in bsd
libc and glibc. The kernel can lazily allocate the pages, this reduces
memory usage when we have a page table or hash table that is mostly
empty.

This change is most visible when you start a new redis instance with vm
enabled.  You'll see no increased memory usage no matter how big your
page table is.
2010-07-25 00:11:20 +02:00
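
The two ways of obtaining zeroed memory that commit 399f2f401c compares can be sketched as below. Both helper names are hypothetical, and the sketch calls the C library directly rather than reproducing the zcalloc wrapper the commit adds.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* malloc + memset: always writes to every byte, which forces the
 * kernel to back the whole allocation with real pages. */
void *zeroed_with_memset(size_t n) {
    void *p = malloc(n);
    if (p != NULL) memset(p, 0, n);
    return p;
}

/* calloc: for large requests served by mmap, the allocator can skip
 * the explicit clearing because fresh mappings are already zeroed. */
void *zeroed_with_calloc(size_t n) {
    return calloc(1, n);
}
```

Both functions return memory that reads as zero. The difference is only in how much work and resident memory the allocation costs when most of the block is never written.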
Benjamin Kramer
d9dd352b36 Remove _dictAlloc and friends
zmalloc calls abort() so _dictPanic will never be called.
2010-07-24 23:10:42 +02:00
Benjamin Kramer
b1e0bd4b9b Reduce code duplication 2010-07-24 22:37:01 +02:00
antirez
e2641e09cc redis.c split into many different C files.
networking related stuff moved into networking.c

moved more code

more work on layout of source code

SDS instantaneous memory saving. By Pieter and Salvatore at VMware ;)

cleanly compiling again after the first split, now splitting it in more C files

moving more things around... work in progress

split replication code

splitting more

Sets split

Hash split

replication split

even more splitting

more splitting

minor change
2010-07-01 14:38:51 +02:00