zvi-code b56eed2479
Remove valkey specific changes in jemalloc source code (#1266)
### Summary of the change

This is a base PR for refactoring defrag. It moves the defrag logic to
rely on jemalloc [native
api](https://github.com/jemalloc/jemalloc/pull/1463#issuecomment-479706489)
instead of relying on custom code changes made by valkey in the jemalloc
([je_defrag_hint](9f8185f5c8/deps/jemalloc/include/jemalloc/internal/jemalloc_internal_inlines_c.h (L382)))
library. This enables valkey to use latest vanila jemalloc without the
need to maintain code changes cross jemalloc versions.

This change requires some modifications because the new api is providing
only the information, not a yes\no defrag. The logic needs to be
implemented at valkey code. Additionally, the api does not provide,
within single call, all the information needed to make a decision, this
information is available through additional api call. To reduce the
calls to jemalloc, in this PR the required information is collected
during the `computeDefragCycles` and not for every single ptr, this way
we are avoiding the additional api call.
Followup work will utilize the new options that are now open and will
further improve the defrag decision and process.

### Added files: 

`allocator_defrag.c` / `allocator_defrag.h` - This files implement the
allocator specific knowledge for making defrag decision. The knowledge
about slabs and allocation logic and so on, all goes into this file.
This improves the separation between jemalloc specific code and other
possible implementation.


### Moved functions: 

[`zmalloc_no_tcache` , `zfree_no_tcache`
](4593dc2f05/src/zmalloc.c (L215))
- these are very jemalloc specific logic assumptions, and are very
specific to how we defrag with jemalloc. This is also with the vision
that from performance perspective we should consider using tcache, we
only need to make sure we don't recycle entries without going through
the arena [for example: we can use private tcache, one for free and one
for alloc].
`frag_smallbins_bytes` - the logic and implementation moved to the new
file

### Existing API:

* [once a second + when completed full cycle]
[`computeDefragCycles`](4593dc2f05/src/defrag.c (L916))
* `zmalloc_get_allocator_info` : gets from jemalloc _allocated, active,
resident, retained, muzzy_, `frag_smallbins_bytes`
*
[`frag_smallbins_bytes`](4593dc2f05/src/zmalloc.c (L690))
: for each bin; gets from jemalloc bin_info, `curr_regs`, `cur_slabs`
* [during defrag, for each pointer]
* `je_defrag_hint` is getting a memory pointer and returns {0,1} .
[Internally it
uses](4593dc2f05/deps/jemalloc/include/jemalloc/internal/jemalloc_internal_inlines_c.h (L368))
this information points:
        * #`nonfull_slabs`
        * #`total_slabs`
        * #free regs in the ptr slab

## Jemalloc API (via ctl interface)


[BATCH][`experimental_utilization_batch_query_ctl`](4593dc2f05/deps/jemalloc/src/ctl.c (L4114))
: gets an array of pointers, returns for each pointer 3 values,

* number of free regions in the extent
* number of regions in the extent
* size of the extent in terms of bytes


[EXTENDED][`experimental_utilization_query_ctl`](4593dc2f05/deps/jemalloc/src/ctl.c (L3989))
:

* memory address of the extent a potential reallocation would go into
* number of free regions in the extent
* number of regions in the extent
* size of the extent in terms of bytes
* [stats-enabled]total number of free regions in the bin the extent
belongs to
* [stats-enabled]total number of regions in the bin the extent belongs
to

### `experimental_utilization_batch_query_ctl` vs valkey
`je_defrag_hint`?
[good]
   - We can query pointers in a batch, reduce the overall overhead
- The per ptr decision algorithm is not within jemalloc api, jemalloc
only provides information, valkey can tune\configure\optimize easily

 
[bad]
- In the batch API we only know the utilization of the slab (of that
memory ptr), we don’t get the data about #`nonfull_slabs` and total
allocated regs.


## New functions:
1. `defrag_jemalloc_init`: Reducing the cost of call to je_ctl: use the
[MIB interface](https://jemalloc.net/jemalloc.3.html) to get a faster
calls. See this quote from the jemalloc documentation:
    
The mallctlnametomib() function provides a way to avoid repeated name
lookups for
applications that repeatedly query the same portion of the namespace,by
translating
a name to a “Management Information Base” (MIB) that can be passed
repeatedly to
    mallctlbymib().

6. `jemalloc_sz2binind_lgq*` : this api is to support reverse map
between bin size and it’s info without lookup. This mapping depends on
the number of size classes we have that are derived from
[`lg_quantum`](4593dc2f05/deps/Makefile (L115))
7. `defrag_jemalloc_get_frag_smallbins` : This function replaces
`frag_smallbins_bytes` the logic moved to the new file allocator_defrag
`defrag_jemalloc_should_defrag_multi` → `handle_results` - unpacks the
results
8. `should_defrag` : implements the same logic as the existing
implementation
[inside](9f8185f5c8/deps/jemalloc/include/jemalloc/internal/jemalloc_internal_inlines_c.h (L382))
je_defrag_hint
9. `defrag_jemalloc_should_defrag_multi` : implements the hint for an
array of pointers, utilizing the new batch api. currently only 1 pointer
is passed.


### Logical differences:

In order to get the information about #`nonfull_slabs` and #`regs`, we
use the query cycle to collect the information per size class. In order
to find the index of bin information given bin size, in o(1), we use
`jemalloc_sz2binind_lgq*` .


## Testing
This is the first draft. I did some initial testing that basically
fragmentation by reducing max memory and than waiting for defrag to
reach desired level. The test only serves as sanity that defrag is
succeeding eventually, no data provided here regarding efficiency and
performance.

### Test: 
1. disable `activedefrag`
2. run valkey benchmark on overlapping address ranges with different
block sizes
3. wait untill `used_memory` reaches 10GB
4. set `maxmemory` to 5GB and `maxmemory-policy` to `allkeys-lru`
5. stop load
6. wait for `mem_fragmentation_ratio` to reach 2
7. enable `activedefrag` - start test timer
8. wait until reach `mem_fragmentation_ratio` = 1.1

#### Results*:
(With this PR)Test results: ` 56 sec`
(Without this PR)Test results: `67 sec`

*both runs perform same "work" number of buffers moved to reach
fragmentation target

Next benchmarking is to compare to:
- DONE // existing `je_get_defrag_hint` 
- compare with naive defrag all: `int defrag_hint() {return 1;}`

---------

Signed-off-by: Zvi Schneider <ezvisch@amazon.com>
Signed-off-by: Zvi Schneider <zvi.schneider22@gmail.com>
Signed-off-by: zvi-code <54795925+zvi-code@users.noreply.github.com>
Co-authored-by: Zvi Schneider <ezvisch@amazon.com>
Co-authored-by: Zvi Schneider <zvi.schneider22@gmail.com>
Co-authored-by: Madelyn Olson <madelyneolson@gmail.com>
2024-11-21 16:29:21 -08:00
2024-10-02 19:28:55 +02:00
2024-06-14 13:40:06 -07:00
2024-03-30 14:23:50 +01:00
2024-04-12 11:05:39 -04:00

codecov

This project was forked from the open source Redis project right before the transition to their new source available licenses.

This README is just a fast quick start document. More details can be found under valkey.io

What is Valkey?

Valkey is a high-performance data structure server that primarily serves key/value workloads. It supports a wide range of native structures and an extensible plugin system for adding new data structures and access patterns.

Building Valkey using Makefile

Valkey can be compiled and used on Linux, OSX, OpenBSD, NetBSD, FreeBSD. We support big endian and little endian architectures, and both 32 bit and 64 bit systems.

It may compile on Solaris derived systems (for instance SmartOS) but our support for this platform is best effort and Valkey is not guaranteed to work as well as in Linux, OSX, and *BSD.

It is as simple as:

% make

To build with TLS support, you'll need OpenSSL development libraries (e.g. libssl-dev on Debian/Ubuntu).

To build TLS support as Valkey built-in:

% make BUILD_TLS=yes

To build TLS as Valkey module:

% make BUILD_TLS=module

Note that sentinel mode does not support TLS module.

To build with experimental RDMA support you'll need RDMA development libraries (e.g. librdmacm-dev and libibverbs-dev on Debian/Ubuntu). For now, Valkey only supports RDMA as connection module mode. Run:

% make BUILD_RDMA=module

To build with systemd support, you'll need systemd development libraries (such as libsystemd-dev on Debian/Ubuntu or systemd-devel on CentOS) and run:

% make USE_SYSTEMD=yes

To append a suffix to Valkey program names, use:

% make PROG_SUFFIX="-alt"

You can build a 32 bit Valkey binary using:

% make 32bit

After building Valkey, it is a good idea to test it using:

% make test

The above runs the main integration tests. Additional tests are started using:

% make test-unit     # Unit tests
% make test-modules  # Tests of the module API
% make test-sentinel # Valkey Sentinel integration tests
% make test-cluster  # Valkey Cluster integration tests

More about running the integration tests can be found in tests/README.md and for unit tests, see src/unit/README.md.

Fixing build problems with dependencies or cached build options

Valkey has some dependencies which are included in the deps directory. make does not automatically rebuild dependencies even if something in the source code of dependencies changes.

When you update the source code with git pull or when code inside the dependencies tree is modified in any other way, make sure to use the following command in order to really clean everything and rebuild from scratch:

% make distclean

This will clean: jemalloc, lua, hiredis, linenoise and other dependencies.

Also if you force certain build options like 32bit target, no C compiler optimizations (for debugging purposes), and other similar build time options, those options are cached indefinitely until you issue a make distclean command.

Fixing problems building 32 bit binaries

If after building Valkey with a 32 bit target you need to rebuild it with a 64 bit target, or the other way around, you need to perform a make distclean in the root directory of the Valkey distribution.

In case of build errors when trying to build a 32 bit binary of Valkey, try the following steps:

  • Install the package libc6-dev-i386 (also try g++-multilib).
  • Try using the following command line instead of make 32bit: make CFLAGS="-m32 -march=native" LDFLAGS="-m32"

Allocator

Selecting a non-default memory allocator when building Valkey is done by setting the MALLOC environment variable. Valkey is compiled and linked against libc malloc by default, with the exception of jemalloc being the default on Linux systems. This default was picked because jemalloc has proven to have fewer fragmentation problems than libc malloc.

To force compiling against libc malloc, use:

% make MALLOC=libc

To compile against jemalloc on Mac OS X systems, use:

% make MALLOC=jemalloc

Monotonic clock

By default, Valkey will build using the POSIX clock_gettime function as the monotonic clock source. On most modern systems, the internal processor clock can be used to improve performance. Cautions can be found here: http://oliveryang.net/2015/09/pitfalls-of-TSC-usage/

To build with support for the processor's internal instruction clock, use:

% make CFLAGS="-DUSE_PROCESSOR_CLOCK"

Verbose build

Valkey will build with a user-friendly colorized output by default. If you want to see a more verbose output, use the following:

% make V=1

Running Valkey

To run Valkey with the default configuration, just type:

% cd src
% ./valkey-server

If you want to provide your valkey.conf, you have to run it using an additional parameter (the path of the configuration file):

% cd src
% ./valkey-server /path/to/valkey.conf

It is possible to alter the Valkey configuration by passing parameters directly as options using the command line. Examples:

% ./valkey-server --port 9999 --replicaof 127.0.0.1 6379
% ./valkey-server /etc/valkey/6379.conf --loglevel debug

All the options in valkey.conf are also supported as options using the command line, with exactly the same name.

Running Valkey with TLS:

Running manually

To manually run a Valkey server with TLS mode (assuming ./gen-test-certs.sh was invoked so sample certificates/keys are available):

  • TLS built-in mode:

    ./src/valkey-server --tls-port 6379 --port 0 \
        --tls-cert-file ./tests/tls/valkey.crt \
        --tls-key-file ./tests/tls/valkey.key \
        --tls-ca-cert-file ./tests/tls/ca.crt
    
  • TLS module mode:

    ./src/valkey-server --tls-port 6379 --port 0 \
        --tls-cert-file ./tests/tls/valkey.crt \
        --tls-key-file ./tests/tls/valkey.key \
        --tls-ca-cert-file ./tests/tls/ca.crt \
        --loadmodule src/valkey-tls.so
    

Note that you can disable TCP by specifying --port 0 explicitly. It's also possible to have both TCP and TLS available at the same time, but you'll have to assign different ports.

Use valkey-cli to connect to the Valkey server:

./src/valkey-cli --tls \
    --cert ./tests/tls/valkey.crt \
    --key ./tests/tls/valkey.key \
    --cacert ./tests/tls/ca.crt

Specifying --tls-replication yes makes a replica connect to the primary.

Using --tls-cluster yes makes Valkey Cluster use TLS across nodes.

Running Valkey with RDMA:

Note that Valkey Over RDMA is an experimental feature. It may be changed or removed in any minor or major version. Currently, it is only supported on Linux.

To manually run a Valkey server with RDMA mode:

% ./src/valkey-server --protected-mode no \
     --loadmodule src/valkey-rdma.so bind=192.168.122.100 port=6379

It's possible to change bind address/port of RDMA by runtime command:

192.168.122.100:6379> CONFIG SET rdma.port 6380

It's also possible to have both RDMA and TCP available, and there is no conflict of TCP(6379) and RDMA(6379), Ex:

% ./src/valkey-server --protected-mode no \
     --loadmodule src/valkey-rdma.so bind=192.168.122.100 port=6379 \
     --port 6379

Note that the network card (192.168.122.100 of this example) should support RDMA. To test a server supports RDMA or not:

% rdma res show (a new version iproute2 package)

Or:

% ibv_devices

Playing with Valkey

You can use valkey-cli to play with Valkey. Start a valkey-server instance, then in another terminal try the following:

% cd src
% ./valkey-cli
valkey> ping
PONG
valkey> set foo bar
OK
valkey> get foo
"bar"
valkey> incr mycounter
(integer) 1
valkey> incr mycounter
(integer) 2
valkey>

Installing Valkey

In order to install Valkey binaries into /usr/local/bin, just use:

% make install

You can use make PREFIX=/some/other/directory install if you wish to use a different destination.

Note: For compatibility with Redis, we create symlinks from the Redis names (redis-server, redis-cli, etc.) to the Valkey binaries installed by make install. The symlinks are created in same directory as the Valkey binaries. The symlinks are removed when using make uninstall. The creation of the symlinks can be skipped by setting the makefile variable USE_REDIS_SYMLINKS=no.

make install will just install binaries in your system, but will not configure init scripts and configuration files in the appropriate place. This is not needed if you just want to play a bit with Valkey, but if you are installing it the proper way for a production system, we have a script that does this for Ubuntu and Debian systems:

% cd utils
% ./install_server.sh

Note: install_server.sh will not work on Mac OSX; it is built for Linux only.

The script will ask you a few questions and will setup everything you need to run Valkey properly as a background daemon that will start again on system reboots.

You'll be able to stop and start Valkey using the script named /etc/init.d/valkey_<portnumber>, for instance /etc/init.d/valkey_6379.

Building using CMake

In addition to the traditional Makefile build, Valkey supports an alternative, experimental, build system using CMake.

To build and install Valkey, in Release mode (an optimized build), type this into your terminal:

mkdir build-release
cd $_
cmake .. -DCMAKE_BUILD_TYPE=Release -DCMAKE_INSTALL_PREFIX=/opt/valkey
sudo make install
# Valkey is now installed under /opt/valkey

Other options supported by Valkey's CMake build system:

Special build flags

  • -DBUILD_TLS=<on|off|module> enable TLS build for Valkey
  • -DBUILD_RDMA=<off|module> enable RDMA module build (only module mode supported)
  • -DBUILD_MALLOC=<libc|jemalloc|tcmalloc|tcmalloc_minimal> choose the allocator to use. Default on Linux: jemalloc, for other OS: libc
  • -DBUILD_SANITIZER=<address|thread|undefined> build with address sanitizer enabled
  • -DBUILD_UNIT_TESTS=[1|0] when set, the build will produce the executable valkey-unit-tests
  • -DBUILD_TEST_MODULES=[1|0] when set, the build will include the modules located under the tests/modules folder
  • -DBUILD_EXAMPLE_MODULES=[1|0] when set, the build will include the example modules located under the src/modules folder

Common flags

  • -DCMAKE_BUILD_TYPE=<Debug|Release...> define the build type, see CMake manual for more details
  • -DCMAKE_INSTALL_PREFIX=/installation/path override this value to define a custom install prefix. Default: /usr/local
  • -G<Generator Name> generate build files for "Generator Name". By default, CMake will generate Makefiles.

Verbose build

CMake generates a user-friendly colorized output by default. If you want to see a more verbose output, use the following:

make VERBOSE=1

Troubleshooting

During the CMake stage, CMake caches variables in a local file named CMakeCache.txt. All variables generated by Valkey are removed from the cache once consumed (this is done by calling to unset(VAR-NAME CACHE)). However, some variables, like the compiler path, are kept in cache. To start a fresh build either remove the cache file CMakeCache.txt from the build folder, or delete the build folder completely.

It is important to re-run CMake when adding new source files.

Integration with IDE

During the CMake stage of the build, CMake generates a JSON file named compile_commands.json and places it under the build folder. This file is used by many IDEs and text editors for providing code completion (via clangd).

A small caveat is that these tools will look for compile_commands.json under the Valkey's top folder. A common workaround is to create a symbolic link to it:

cd /path/to/valkey/
# We assume here that your build folder is `build-release`
ln -sf $(pwd)/build-release/compile_commands.json $(pwd)/compile_commands.json

Restart your IDE and voila

Code contributions

Please see the CONTRIBUTING.md. For security bugs and vulnerabilities, please see SECURITY.md.

Valkey is an open community project under LF Projects

Valkey a Series of LF Projects, LLC 2810 N Church St, PMB 57274 Wilmington, Delaware 19802-4447

Description
Futriix-Распределённая СУБД на языке С с поддержкой модулей на базе Искусственного интеллекта и плагинов на языке Golang.
https://futriix.ru Readme 151 MiB
Languages
C 73.6%
Tcl 24.6%
Python 0.5%
CMake 0.4%
Ruby 0.3%
Other 0.5%