futriix

Author	SHA1	Message	Date
jsully	763b349bd2	Merge branch 'multithread_load' into 'keydbpro' Multithread load See merge request external-collab/keydb-pro-6!5 Former-commit-id: 20e712244071028b0f75ccad477308efd139261f	2021-10-08 17:55:55 +00:00
Malavan Sotheeswaran	27bf624bae	Merge fix to dict resize during rdb load Former-commit-id: c398d5f8a027c67acac64bdbfbd01486dde555eb	2021-09-17 16:10:48 +00:00
malavan	86784fe9ba	improve overwrite key performance Former-commit-id: 56f9d5528385ea78074a308c6d3987b920d6cc35	2021-09-14 17:06:04 +00:00
John Sully	f4151f0d6b	Merge branch 'unstable' into keydbpro Former-commit-id: 205d8f18d2bb8df5253bab40578b006b7aa73fd5	2021-05-28 23:32:46 +00:00
John Sully	ea6a0f214b	Merge tag '6.2.2' into unstable Former-commit-id: 93ebb31b17adec5d406d2e30a5b9ea71c07fce5c	2021-05-21 05:54:39 +00:00
John Sully	f49d8f9adb	Merge tag '6.2.1' into unstable Former-commit-id: bfed57e3e0edaa724b9d060a6bb8edc5a6de65fa	2021-05-19 02:59:48 +00:00
John Sully	42f98c27e3	Fix test hang Former-commit-id: 23647390e628de07759f8e7d8768a7f638edf01d	2021-05-07 00:28:10 +00:00
John Sully	131a8c9e35	Fix bug where we skip valid dict elements in dictGetRandomKey Former-commit-id: 626b56b00824573660af0c47b210fd1e8d2cfeb2	2021-03-24 20:26:33 +00:00
John Sully	44603c8227	Make async rehash behave with snapshots (thread safety issues) Former-commit-id: 372adf39a80252b8035e3c948fcaf7d5ef6f928f	2021-03-16 02:38:41 +00:00
sundb	364b7c376d	Add run all test support with define REDIS_TEST (#8570) 1. Add `redis-server test all` support to run all tests. 2. Add redis test to daily ci. 3. Add `--accurate` option to run slow tests for more iterations (so that by default we run less cycles (shorter time, and less prints). 4. Move dict benchmark to REDIS_TEST. 5. fix some leaks in tests 6. make quicklist tests run on a specific fill set of options rather than huge ranges 7. move some prints in quicklist test outside their loops to reduce prints 8. removing sds.h from dict.c since it is now used in both redis-server and redis-cli (uses hiredis sds)	2021-03-10 09:13:11 +02:00
John Sully	ca13fda90f	Fix issue where finding random keys is slow due to not shrinking the hash table. Former-commit-id: fd05010cdcf9d6a6187ca2e18bc55adbaa680a02	2021-02-22 09:14:24 +00:00
Jim Brunner	d8b623ddde	dict: pause rehash, minor readability refactor (#8515) The `dict` field `iterators` is misleading and incorrect. This variable is used for 1 purpose - to pause rehashing. The current `iterators` field doesn't actually count "iterators". It counts "safe iterators". But - it doesn't actually count safe iterators either. For one, it's only incremented once the safe iterator begins to iterate, not when it's created. It's also incremented in `dictScan` to prevent rehashing (and commented to make it clear why `iterators` is being incremented during a scan). This update renames the field as `pauserehash` and creates 2 helper macros `dictPauseRehashing(d)` and `dictResumeRehashing(d)`	2021-02-20 12:56:30 +02:00
John Sully	078abba316	Merge branch 'unstable' into keydbpro Former-commit-id: e2140793f2bf565972ada799af73bf4457e2718d	2021-02-08 18:17:09 +00:00
John Sully	ba006abe02	Ensure rehash completes even when we're in a long running task Former-commit-id: f107746e90f7a8ff3c7094145ee1ad438911e8c2	2021-02-07 19:11:05 -05:00
John Sully	495dff1e8c	Implement rehash during spinlock Former-commit-id: f68a26381a35b27948046d46c2c7bcfbdc21143d	2021-02-07 19:11:05 -05:00
John Sully	071ecb801a	Allow multiple threads to rehash simultaneously Former-commit-id: 5a2cc90786dfd1bfd341dbf5713bcde01f0cfff3	2021-02-07 19:11:05 -05:00
John Sully	1c0b603def	Initial implementation Former-commit-id: 958f2c00c8efc15dc91fdeec2ff2e2ae2016c124	2021-02-07 19:11:05 -05:00
Greg Femec	c2eaae3653	Fix random element selection for large hash tables. (#8133) When a database on a 64 bit build grows past 2^31 keys, the underlying hash table expands to 2^32 buckets. After this point, the algorithms for selecting random elements only return elements from half of the available buckets because they use random() which has a range of 0 to 2^31 - 1. This causes problems for eviction policies which use dictGetSomeKeys or dictGetRandomKey. Over time they cause the hash table to become unbalanced because, while new keys are spread out evenly across all buckets, evictions come from only half of the available buckets. Eventually this half of the table starts to run out of keys and it takes longer and longer to find candidates for eviction. This continues until no more evictions can happen. This solution addresses this by using a 64 bit PRNG instead of libc random(). Co-authored-by: Greg Femec <gfemec@google.com>	2020-12-23 15:52:07 +02:00
Oran Agra	595cff7fd4	Sanitize dump payload: fail RESTORE if memory allocation fails When RDB input attempts to make a huge memory allocation that fails, RESTORE should fail gracefully rather than die with panic	2020-12-06 14:54:34 +02:00
Wang Yuan	5aa078afb0	Limit the main db and expires dictionaries to expand (#7954 ) As we know, redis may reject user's requests or evict some keys if used memory is over maxmemory. Dictionaries expanding may make things worse, some big dictionaries, such as main db and expires dict, may eat huge memory at once for allocating a new big hash table and be far more than maxmemory after expanding. There are related issues: #4213 #4583 More details, when expand dict in redis, we will allocate a new big ht[1] that generally is double of ht[0], The size of ht[1] will be very big if ht[0] already is big. For db dict, if we have more than 64 million keys, we need to cost 1GB for ht[1] when dict expands. If the sum of used memory and new hash table of dict needed exceeds maxmemory, we shouldn't allow the dict to expand. Because, if we enable keys eviction, we still couldn't add much more keys after eviction and rehashing, what's worse, redis will keep less keys when redis only remains a little memory for storing new hash table instead of users' data. Moreover users can't write data in redis if disable keys eviction. What this commit changed ? Add a new member function expandAllowed for dict type, it provide a way for caller to allow expand or not. We expose two parameters for this function: more memory needed for expanding and dict current load factor, users can implement a function to make a decision by them. For main db dict and expires dict type, these dictionaries may be very big and cost huge memory for expanding, so we implement a judgement function: we can stop dict to expand provisionally if used memory will be over maxmemory after dict expands, but to guarantee the performance of redis, we still allow dict to expand if dict load factor exceeds the safe load factor. Add test cases to verify we don't allow main db to expand when left memory is not enough, so that avoid keys eviction. Other changes: For new hash table size when expand. 
Before this commit, the size is that double used of dict and later _dictNextPower. Actually we aim to control a dict load factor between 0.5 and 1.0. Now we replace *2 with +1, since the first check is that used >= size, the outcome of before will usually be the same as _dictNextPower(used+1). The only case where it'll differ is when dict_can_resize is false during fork, so that later the _dictNextPower(used*2) will cause the dict to jump to *4 (i.e. _dictNextPower(1025*2) will return 4096). Fix rehash test cases due to changing algorithm of new hash table size when expand.	2020-12-06 11:53:04 +02:00
John Sully	ce69a765f8	Remove unnecessary key comparisons in perf critical snapshot paths Former-commit-id: 40f8a8d102fdca9443399ef03a47df609b146d58	2020-08-15 23:25:58 +00:00
John Sully	6b8e979434	Prehash the tombstone for cleanup Former-commit-id: c9d97a7c7448fc769486175bea1648589487c87c	2020-08-14 16:05:39 +00:00
John Sully	8888498bfd	Make snapshot completion faster and add latency monitor Former-commit-id: 8063be6ee70a652c22c3263dccf318366e208891	2020-06-04 01:07:14 -04:00
John Sully	d715bc15e1	Add new faster dictionary merging for use by snapshotting code Former-commit-id: b6f120b3d401c92ef5cf1cc6f5e77da139e33a97	2020-02-01 20:17:40 -05:00
John Sully	da4adb261b	Fix multithreading data races Former-commit-id: 80f6e5818fd575cb08a5f620c35eed1cd862eb57	2019-11-24 13:44:43 -05:00
John Sully	68bec6f239	Move remaining files dependent on server.h over to C++ Former-commit-id: 8c133b605c65212b023d35b3cb71e63b6a4c443a	2019-04-08 01:00:48 -04:00
John Sully	8cd2cdca3d	Merge branch 'unstable' of https://github.com/antirez/redis into Multithread	2019-02-21 18:17:12 -05:00
antirez	aae7e1bff0	Better distribution for set get-random-element operations.	2019-02-18 18:27:18 +01:00
John Sully	90c6c37628	make headers C++ safe	2019-02-15 16:55:40 -05:00
zhaozhao.zz	3ab4a9020d	dict: fix the int problem for defrag	2017-12-05 15:38:03 +01:00
antirez	7a20dc7919	Use SipHash hash function to mitigate HashDoS attempts. This change attempts to switch to a hash function which mitigates the effects of the HashDoS attack (denial of service attack trying to force data structures to worst case behavior) while at the same time providing Redis with a hash function that does not expect the input data to be word aligned, a condition no longer true now that sds.c strings have a variable length header. Note that it is possible sometimes that even using a hash function for which collisions cannot be generated without knowing the seed, special implementation details or the exposure of the seed in an indirect way (for example the ability to add elements to a Set and check the return in which Redis returns them with SMEMBERS) may make the attacker's life simpler in the process of trying to guess the correct seed, however the next step would be to switch to a log(N) data structure when too many items in a single bucket are detected: this seems like an overkill in the case of Redis. SPEED REGRESSION TESTS: In order to verify that switching from MurmurHash to SipHash had no impact on speed, a set of benchmarks involving fast insertion of 5 million keys were performed. The result shows Redis with SipHash in high pipelining conditions to be about 4% slower compared to using the previous hash function. However this could partially be related to the fact that the current implementation does not attempt to hash whole words at a time but reads single bytes, in order to have an output which is endian-neutral and at the same time working on systems where unaligned memory accesses are a problem. Further X86 specific optimizations should be tested, the function may easily get at the same level of MurmurHash2 if a few optimizations are performed.	2017-02-20 17:29:17 +01:00
oranagra	b260c7ef74	active defrag improvements	2017-01-02 09:42:32 +02:00
oranagra	c053025a0a	active memory defragmentation	2016-12-30 03:37:52 +02:00
antirez	e2106f0281	dict.c: dictReplaceRaw() -> dictAddOrFind(). What they say about "naming things" in programming?	2016-09-14 16:43:38 +02:00
oranagra	90af5e0825	dict.c: introduce dictUnlink(). Notes by @antirez: This patch was picked from a larger commit by Oran and adapted to change the API a bit. The basic idea is to avoid double lookups when the value of the deleted entry must be used. BEFORE: entry = dictFind( ... ); /* 1st lookup. */ /* Do something with the entry. */ dictDelete(...); /* 2nd lookup. */ AFTER: entry = dictUnlink( ... ); /* 1st lookup. */ /* Do something with the entry. */ dictFreeUnlinkedEntry(entry); /* No lookups!. */	2016-09-14 12:18:59 +02:00
oranagra	4448185b52	Optimize repeated keyname hashing. (Change cherry-picked and modified by @antirez from a larger commit provided by @oranagra in PR #3223).	2016-09-12 13:19:05 +02:00
antirez	ca8d13bd90	Lazyfree: a first implementation of non blocking DEL.	2015-10-01 13:00:19 +02:00
antirez	1b16400551	DEBUG HTSTATS <dbid> added. The command reports information about the hash table internal state representing the specified database ID. This can be used in order to investigate rehashings, memory usage issues and for other debugging purposes.	2015-07-14 17:15:37 +02:00
antirez	87a6696d89	SPOP: reimplemented for speed and better distribution. The old version of SPOP with "count" argument used an API call of dict.c which was actually designed for a different goal, and was not capable of good distribution. We follow a different three-cases approach optimized for different ratios between sets and requested number of elements. The implementation is simpler and allowed the removal of a large amount of code.	2015-02-11 10:52:28 +01:00
antirez	2bb647770d	dict.c: add dictGetSomeKeys(), specialized for eviction.	2015-02-11 10:52:27 +01:00
antirez	4fc51067c1	Use long for rehash and iterator index in dict.h. This allows to support datasets with more than 2 billion of keys (possible in very large memory instances, this bug was actually reported). Closes issue #1814.	2014-08-26 10:18:56 +02:00
xiaoyu	1c63ccb119	Clarify argument to dict macro d is more clear because the type of argument is dict not dictht Closes #513	2014-08-18 10:59:01 +02:00
antirez	2e94ffb1d1	Remove warnings and improve integer sign correctness.	2014-08-13 11:44:38 +02:00
antirez	fc62a2283f	Add double field in dict.c entry value union.	2014-07-22 17:38:22 +02:00
antirez	28573f8fe8	Added dictGetRandomKeys() to dict.c: mass get random entries. This new function is useful to get a number of random entries from an hash table when we just need to do some sampling without particularly good distribution. It just jumps at a random place of the hash table and returns the first N items encountered by scanning linearly. The main usefulness of this function is to speedup Redis internal sampling of the key space, for example for key eviction or expiry.	2014-03-20 15:50:46 +01:00
antirez	247a311317	dict.c: added optional callback to dictEmpty(). Redis hash table implementation has many non-blocking features like incremental rehashing, however while deleting a large hash table there was no way to have a callback called to do some incremental work. This commit adds this support, as an optional callback argument to dictEmpty() that is currently called at a fixed interval (one time every 65k deletions).	2013-12-10 18:46:24 +01:00
Pieter Noordhuis	956c0ed927	Add SCAN command	2013-10-25 10:49:48 +02:00
antirez	bfaadb0df2	dict.c iterator API misuse protection. dict.c allows the user to create unsafe iterators, that are iterators that will not touch the dictionary data structure in any way, preventing copy on write, but at the same time are limited in their usage. The limitation is that when iterating with an unsafe iterator, no call to other dictionary functions must be done inside the iteration loop, otherwise the dictionary may be incrementally rehashed resulting into missing elements in the set of the elements returned by the iterator. However after introducing this kind of iterators a number of bugs were found due to misuses of the API, and we are still finding bugs about this issue. The bugs are not trivial to track because the effect is just missing elements during the iteration. This commit introduces auto-detection of the API misuse. The idea is that an unsafe iterator has a contract: from initialization to the release of the iterator the dictionary should not change. So we take a fingerprint of the dictionary state, xoring a few important dict properties when the unsafe iterator is initialized. We later check when the iterator is released if the fingerprint is still the same. If it is not, we found a misuse of the iterator, as not allowed API calls changed the internal state of the dictionary. This code was checked against a real bug, issue #1240. This is what Redis prints (aborting) when a misuse is detected: Assertion failed: (iter->fingerprint == dictFingerprint(iter->d)), function dictReleaseIterator, file dict.c, line 587.	2013-08-19 15:00:57 +02:00
Salvatore Sanfilippo	c2fdfd04a8	Merge pull request #693 from ghurrell/dict-h-typos Fix (cosmetic) typos in dict.h	2012-10-22 02:55:23 -07:00
antirez	0c7d3bef67	Hash function switched to murmurhash2. The previously used hash function, djbhash, is not secure against collision attacks even when the seed is randomized as there are simple ways to find seed-independent collisions. The new hash function appears to be safe (or much harder to exploit at least) in this case, and has better distribution. Better distribution does not always mean that's better. For instance in a fast benchmark with "DEBUG POPULATE 1000000" I obtained the following results: 1.6 seconds with djbhash 2.0 seconds with murmurhash2 This is due to the fact that djbhash will hash objects that follow the pattern `prefix:<id>` and where the id is numerically near, to near buckets. This improves the locality. However in other access patterns with keys that have no relation murmurhash2 has some (apparently minimal) speed advantage. On the other hand a better distribution should significantly improve the quality of the distribution of elements returned with dictGetRandomKey() that is used in SPOP, SRANDMEMBER, RANDOMKEY, and other commands. Everything considered, and under the suspicion that this commit fixes a security issue in Redis, we are switching to the new hash function. If some serious speed regression will be found in the future we'll be able to step back easily. This commit fixes issue #663.	2012-10-05 11:20:13 +02:00
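
The sizing change described in commit 5aa078afb0 (2020-12-06) can be checked with a little arithmetic. The sketch below is a rough illustration, not the code from dict.c: the names `sketch_next_power`, `SKETCH_HT_INITIAL_SIZE` and `sketch_self_check` are hypothetical, the initial size of 4 is an assumption, and the real `_dictNextPower` may include overflow handling omitted here. It rounds a requested size up to the next power of two and compares the old `used*2` request with the new `used+1` request for the commit's example of 1025 used entries.

```c
#include <assert.h>

/* Assumed initial table size for this sketch (hypothetical constant). */
#define SKETCH_HT_INITIAL_SIZE 4UL

/* Smallest power of two that is >= size, never below the initial size.
 * Hypothetical stand-in for _dictNextPower; no overflow handling. */
static unsigned long sketch_next_power(unsigned long size) {
    unsigned long i = SKETCH_HT_INITIAL_SIZE;
    while (i < size) i *= 2;
    return i;
}

/* Reproduces the example from the commit message with used = 1025. */
static void sketch_self_check(void) {
    unsigned long used = 1025;
    /* Old request: used*2 = 2050, which rounds up to 4096. */
    assert(sketch_next_power(used * 2) == 4096);
    /* New request: used+1 = 1026, which rounds up to 2048. */
    assert(sketch_next_power(used + 1) == 2048);
}
```

With 1025 entries the old request yields a table about four times the entry count, while the new one stays at about twice, which matches the commit's stated aim of keeping the load factor between 0.5 and 1.0.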


61 Commits