futriix

Author	SHA1	Message	Date
zvi-code	b56eed2479	Remove valkey specific changes in jemalloc source code (#1266) ### Summary of the change This is a base PR for refactoring defrag. It moves the defrag logic to rely on the jemalloc [native api](https://github.com/jemalloc/jemalloc/pull/1463#issuecomment-479706489) instead of relying on custom code changes made by valkey in the jemalloc library ([je_defrag_hint](`9f8185f5c8/deps/jemalloc/include/jemalloc/internal/jemalloc_internal_inlines_c.h (L382)`)). This enables valkey to use the latest vanilla jemalloc without the need to maintain code changes across jemalloc versions. This change requires some modifications because the new api provides only the information, not a yes/no defrag decision. The logic needs to be implemented in valkey code. Additionally, the api does not provide, within a single call, all the information needed to make a decision; this information is available through an additional api call. To reduce the calls to jemalloc, in this PR the required information is collected during `computeDefragCycles` and not for every single ptr, so we avoid the additional api call. Follow-up work will utilize the new options that are now open and will further improve the defrag decision and process. ### Added files: `allocator_defrag.c` / `allocator_defrag.h` - These files implement the allocator-specific knowledge for making defrag decisions. The knowledge about slabs, allocation logic and so on all goes into these files. This improves the separation between jemalloc-specific code and other possible implementations. ### Moved functions: [`zmalloc_no_tcache` , `zfree_no_tcache` ](`4593dc2f05/src/zmalloc.c (L215)`) - these are very jemalloc specific logic assumptions, and are very specific to how we defrag with jemalloc. 
This is also with the vision that, from a performance perspective, we should consider using tcache; we only need to make sure we don't recycle entries without going through the arena [for example: we can use private tcache, one for free and one for alloc]. `frag_smallbins_bytes` - the logic and implementation moved to the new file ### Existing API: * [once a second + when completed full cycle] [`computeDefragCycles`](`4593dc2f05/src/defrag.c (L916)`) * `zmalloc_get_allocator_info` : gets from jemalloc _allocated, active, resident, retained, muzzy_, `frag_smallbins_bytes` * [`frag_smallbins_bytes`](`4593dc2f05/src/zmalloc.c (L690)`) : for each bin; gets from jemalloc bin_info, `curr_regs`, `cur_slabs` * [during defrag, for each pointer] * `je_defrag_hint` takes a memory pointer and returns {0,1}. [Internally it uses](`4593dc2f05/deps/jemalloc/include/jemalloc/internal/jemalloc_internal_inlines_c.h (L368)`) these data points: * #`nonfull_slabs` * #`total_slabs` * #free regs in the ptr slab ## Jemalloc API (via ctl interface) [BATCH][`experimental_utilization_batch_query_ctl`](`4593dc2f05/deps/jemalloc/src/ctl.c (L4114)`) : gets an array of pointers, returns for each pointer 3 values, * number of free regions in the extent * number of regions in the extent * size of the extent in terms of bytes [EXTENDED][`experimental_utilization_query_ctl`](`4593dc2f05/deps/jemalloc/src/ctl.c (L3989)`) : * memory address of the extent a potential reallocation would go into * number of free regions in the extent * number of regions in the extent * size of the extent in terms of bytes * [stats-enabled]total number of free regions in the bin the extent belongs to * [stats-enabled]total number of regions in the bin the extent belongs to ### `experimental_utilization_batch_query_ctl` vs valkey `je_defrag_hint`? 
[good] - We can query pointers in a batch, reducing the overall overhead - The per ptr decision algorithm is not within the jemalloc api; jemalloc only provides information, so valkey can tune/configure/optimize easily [bad] - In the batch API we only know the utilization of the slab (of that memory ptr), we don’t get the data about #`nonfull_slabs` and total allocated regs. ## New functions: 1. `defrag_jemalloc_init`: Reducing the cost of calls to je_ctl: use the [MIB interface](https://jemalloc.net/jemalloc.3.html) to get faster calls. See this quote from the jemalloc documentation: The mallctlnametomib() function provides a way to avoid repeated name lookups for applications that repeatedly query the same portion of the namespace, by translating a name to a “Management Information Base” (MIB) that can be passed repeatedly to mallctlbymib(). 6. `jemalloc_sz2binind_lgq` : this api supports a reverse map between a bin size and its info without lookup. This mapping depends on the number of size classes we have, which are derived from [`lg_quantum`](`4593dc2f05/deps/Makefile (L115)`) 7. `defrag_jemalloc_get_frag_smallbins` : This function replaces `frag_smallbins_bytes`; the logic moved to the new file allocator_defrag. `defrag_jemalloc_should_defrag_multi` → `handle_results` - unpacks the results 8. `should_defrag` : implements the same logic as the existing implementation [inside](`9f8185f5c8/deps/jemalloc/include/jemalloc/internal/jemalloc_internal_inlines_c.h (L382)`) je_defrag_hint 9. `defrag_jemalloc_should_defrag_multi` : implements the hint for an array of pointers, utilizing the new batch api. Currently only 1 pointer is passed. ### Logical differences: In order to get the information about #`nonfull_slabs` and #`regs`, we use the query cycle to collect the information per size class. In order to find the index of bin information given bin size, in O(1), we use `jemalloc_sz2binind_lgq` . ## Testing This is the first draft. 
I did some initial testing that basically creates fragmentation by reducing max memory and then waits for defrag to reach the desired level. The test only serves as a sanity check that defrag eventually succeeds; no data is provided here regarding efficiency and performance. ### Test: 1. disable `activedefrag` 2. run valkey benchmark on overlapping address ranges with different block sizes 3. wait until `used_memory` reaches 10GB 4. set `maxmemory` to 5GB and `maxmemory-policy` to `allkeys-lru` 5. stop load 6. wait for `mem_fragmentation_ratio` to reach 2 7. enable `activedefrag` - start test timer 8. wait until `mem_fragmentation_ratio` reaches 1.1 #### Results: (With this PR)Test results: `56 sec` (Without this PR)Test results: `67 sec` Both runs perform the same "work" (number of buffers moved) to reach the fragmentation target. Next benchmarking is to compare to: - DONE // existing `je_get_defrag_hint` - compare with naive defrag all: `int defrag_hint() {return 1;}` --------- Signed-off-by: Zvi Schneider <ezvisch@amazon.com> Signed-off-by: Zvi Schneider <zvi.schneider22@gmail.com> Signed-off-by: zvi-code <54795925+zvi-code@users.noreply.github.com> Co-authored-by: Zvi Schneider <ezvisch@amazon.com> Co-authored-by: Zvi Schneider <zvi.schneider22@gmail.com> Co-authored-by: Madelyn Olson <madelyneolson@gmail.com>	2024-11-21 16:29:21 -08:00
Oran Agra	0897c8afed	Upgrade to jemalloc 5.3.0 * Regenerate configure script according to deps/README * update iget_defrag_hint by following changes to arena_dalloc_no_tcache	2023-05-01 17:31:31 +03:00
Oran Agra	b8beda3cf8	Merge commit jemalloc 5.3.0	2023-05-01 15:38:08 +03:00
Oran Agra	d4e7ffb38c	Improve active defrag in jemalloc 5.2 (#9778) Background: Following the upgrade to jemalloc 5.2, there was a test that used to be flaky and started failing consistently (on 32bit), so we disabled it (see #9645). This is a test that i introduced in #7289 when i attempted to solve a rare stagnation problem, and it later turned out i failed to solve it, and what's more i added a test that caused it to be not so rare, and as i mentioned, now in jemalloc 5.2 it became consistent on 32bit. Stagnation can happen when all the slabs of the bin are equally utilized, so the decision to move an allocation from a relatively empty slab to a relatively full one will never happen, and in that test all the slabs are at 50% utilization, so the defragger could just keep scanning the keyspace and not move anything. What this PR changes: * First, finally in jemalloc 5.2 we have the count of non-full slabs, so when we compare the utilization of the current slab, we can compare it to the average utilization of the non-full slabs in our bin, instead of the total average of our bin. this takes the full slabs out of the game, since they're not candidates for migration (neither source nor target). * Secondly, we add some 12% (100/8) to the decision to defrag an allocation, this is the part that aims to avoid stagnation, and it's especially important since the above mentioned change can get us closer to stagnation. * Thirdly, since jemalloc 5.2 adds sharded bins, we take into account all shards (something that's missing from the original PR that merged it), this isn't expected to make any difference since anyway there should be just one shard. How this was benchmarked: What i did was run the memefficiency test unit with `--verbose` and compare the defragger hits and misses the tests reported. 
At first, when i took into consideration only the non-full slabs, it got a lot worse (i got into stagnation, or just got a lot of misses and a lot of hits), but when i added the 12% i got back to results that were slightly better than the ones of the jemalloc 5.1 branch. i.e. full defragmentation was achieved with fewer hits (relocations), and fewer misses (keyspace scans).	2021-11-21 13:35:39 +02:00
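The decision rule described in the commit above (compare a slab's utilization with the average utilization of the non-full slabs in its bin, plus a margin of one eighth) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code from the commit: the `bin_snapshot` struct, its field names, and `should_move` are hypothetical, and the integer cross-multiplication is just one way to avoid division.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-bin snapshot; the field names are illustrative only. */
typedef struct {
    size_t nregs;         /* regions per slab for this size class */
    size_t curregs;       /* allocated regions across all slabs of the bin */
    size_t curslabs;      /* total number of slabs in the bin */
    size_t nonfull_slabs; /* slabs with at least one free region */
} bin_snapshot;

/* Returns true when an allocation living in a slab with `free_in_slab`
 * free regions looks worth relocating. */
static bool should_move(const bin_snapshot *bin, size_t free_in_slab) {
    assert(bin->nregs > 0 && free_in_slab <= bin->nregs);
    assert(bin->nonfull_slabs <= bin->curslabs);

    /* Full slabs are neither a source nor a target for migration. */
    if (free_in_slab == 0 || bin->nonfull_slabs == 0) return false;

    size_t full_slabs = bin->curslabs - bin->nonfull_slabs;
    assert(bin->curregs >= full_slabs * bin->nregs);

    /* Regions in use inside the non-full slabs only. */
    size_t used_nonfull = bin->curregs - full_slabs * bin->nregs;
    size_t used_here = bin->nregs - free_in_slab;

    /* used_here / nregs <= 1.125 * used_nonfull / (nonfull_slabs * nregs),
     * cross-multiplied so that no division of the ratio is needed. */
    return used_here * bin->nonfull_slabs <= used_nonfull + used_nonfull / 8;
}
```

With four slabs of ten regions that are all 50% utilized, `should_move` still returns true for a slab with five free regions, which is the stagnation case the one-eighth margin is meant to break. The sketch ignores sharded bins and the slab currently used for new allocations.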
Oran Agra	ed92a3e8ed	Resolve nonsense static analysis warnings	2021-10-12 12:55:35 +03:00
Oran Agra	c6a26519a1	fix a rare active defrag edge case bug leading to stagnation There's a rare case which leads to stagnation in the defragger, causing it to keep scanning the keyspace and do nothing (not moving any allocation), this happens when all the allocator slabs of a certain bin have the same % utilization, but the slab from which new allocations are made has a lower utilization. this commit fixes it by removing the current slab from the overall average utilization of the bin, and also eliminates any precision loss in the utilization calculation and moves the decision about the defrag to reside inside jemalloc. and also adds a test that consistently reproduces this issue.	2021-10-12 12:55:35 +03:00
Yoav Steinberg	908d3bdad9	Fix defrag to support sharded bins in arena (added in v5.2.1) See `37b8913925`	2021-10-10 18:29:13 +03:00
Oran Agra	91bc78a8b8	Active defrag fixes for 32bit builds (again) * overflow in jemalloc fragmentation hint to the defragger	2021-10-10 18:29:13 +03:00
Oran Agra	29d7f97c96	add defrag hint support into jemalloc 5	2021-10-10 18:29:13 +03:00
Yoav Steinberg	4d5911b4e4	Merge commit '220a0f0880419450c9409202aac1fab4b8be0719' as 'deps/jemalloc'	2021-10-10 18:26:48 +03:00
Yoav Steinberg	4a884343f5	Delete old jemalloc before pulling in subtree.	2021-10-10 18:03:38 +03:00
Oran Agra	fd7d51c353	Resolve nonsense static analysis warnings	2021-05-03 18:59:47 +03:00
Oran Agra	88d71f4793	fix a rare active defrag edge case bug leading to stagnation There's a rare case which leads to stagnation in the defragger, causing it to keep scanning the keyspace and do nothing (not moving any allocation), this happens when all the allocator slabs of a certain bin have the same % utilization, but the slab from which new allocations are made has a lower utilization. this commit fixes it by removing the current slab from the overall average utilization of the bin, and also eliminates any precision loss in the utilization calculation and moves the decision about the defrag to reside inside jemalloc. and also adds a test that consistently reproduces this issue.	2020-05-20 16:04:42 +03:00
Oran Agra	920158ec81	Active defrag fixes for 32bit builds (again) * overflow in jemalloc fragmentation hint to the defragger	2018-07-11 16:09:00 +03:00
Oran Agra	e8099cabd1	add defrag hint support into jemalloc 5	2018-06-27 10:52:39 +03:00
antirez	08e1c8e820	Jemalloc upgraded to version 5.0.1.	2018-05-24 17:17:37 +02:00
antirez	e3b8492e83	Revert "Jemalloc updated to 4.4.0." This reverts commit 36c1acc222d29e6e2dc9fc25362e4faa471111bd.	2017-04-22 13:17:07 +02:00
antirez	27e29f4fe6	Jemalloc updated to 4.4.0. The original jemalloc source tree was modified to: 1. Remove the configure error that prevents nested builds. 2. Insert the Redis private Jemalloc API in order to allow the Redis fragmentation function to work.	2017-01-30 09:58:34 +01:00
antirez	173d692bc2	Defrag: activate it only if running modified version of Jemalloc. This commit also includes minor aesthetic changes like removal of trailing spaces.	2017-01-10 11:25:39 +01:00
antirez	a9951b1b6a	Jemalloc updated to 4.0.3.	2015-10-06 16:55:37 +02:00
antirez	6b836b6b41	Jemalloc: use LG_QUANTUM of 3 for AMD64 and I386. This gives us a 24 bytes size class which is dict.c dictEntry size, thus improving the memory efficiency of Redis significantly. Moreover other non 16 bytes aligned tiny classes are added that further reduce the fragmentation of the allocator. Technically speaking LG_QUANTUM should be 4 on i386 / AMD64 because of SSE types and other 16 bytes types, however we don't use those, and our jemalloc only targets Redis. New versions of Jemalloc will have an explicit configure switch in order to specify the quantum value for a platform without requiring any change to the Jemalloc source code: we'll switch to this system when available. This change was originally proposed by Oran Agra (@oranagra) as a change to the Jemalloc script to generate the size classes define. We ended up doing it differently by changing LG_QUANTUM since it is apparently the supported Jemalloc method to obtain a 24 bytes size class, moreover it also provides us other potentially useful size classes. Related to issue #2510.	2015-07-24 10:20:02 +02:00
antirez	fceef8e0dd	Jemalloc updated to 3.6.0. Not a single bug in about 3 months, and our previous version was too old (3.2.0).	2014-06-20 14:59:20 +02:00
antirez	7383c3b129	Jemalloc updated to version 3.2.0.	2012-11-28 18:39:35 +01:00
antirez	ad4c0b4117	Jemalloc updated to 3.0.0. Full changelog here: http://www.canonware.com/cgi-bin/gitweb.cgi?p=jemalloc.git;a=blob_plain;f=ChangeLog;hb=master Notable improvements from the point of view of Redis: 1) Bugfixing. 2) Support for Valgrind. 3) Support for OSX Lion, FreeBSD.	2012-05-16 11:09:45 +02:00
jbergstroem	1d03c1c98a	Update to jemalloc 2.2.5	2011-11-23 21:36:25 +01:00
antirez	a78e148b7d	jemalloc source added	2011-06-20 11:30:06 +02:00

26 Commits