122 Commits

Author SHA1 Message Date
ranshid
398b8b1afa change the container image to ubuntu:plucky (#1359)
Our fortify workflow runs in an Ubuntu lunar container, which has been EOL
since [January 25, 2024](https://lists.ubuntu.com/archives/ubuntu-announce/2024-January/000298.html).
This causes the workflow to fail during update actions like:
```
apt-get update && apt-get install -y make gcc-13
  update-alternatives --install /usr/bin/gcc gcc /usr/bin/gcc-13 100
  make all-with-unit-tests CC=gcc OPT=-O3 SERVER_CFLAGS='-Werror -U_FORTIFY_SOURCE -D_FORTIFY_SOURCE=3'
  shell: sh -e {0}
Ign:1 http://security.ubuntu.com/ubuntu lunar-security InRelease
Err:2 http://security.ubuntu.com/ubuntu lunar-security Release
  404  Not Found [IP: 91.189.91.82 80]
Ign:3 http://archive.ubuntu.com/ubuntu lunar InRelease
Ign:4 http://archive.ubuntu.com/ubuntu lunar-updates InRelease
Ign:5 http://archive.ubuntu.com/ubuntu lunar-backports InRelease
Err:6 http://archive.ubuntu.com/ubuntu lunar Release
  404  Not Found [IP: 185.125.190.81 80]
Err:7 http://archive.ubuntu.com/ubuntu lunar-updates Release
  404  Not Found [IP: 185.125.190.81 80]
Err:8 http://archive.ubuntu.com/ubuntu lunar-backports Release
  404  Not Found [IP: 185.125.190.81 80]
Reading package lists...
E: The repository 'http://security.ubuntu.com/ubuntu lunar-security Release' does not have a Release file.
E: The repository 'http://archive.ubuntu.com/ubuntu lunar Release' does not have a Release file.
E: The repository 'http://archive.ubuntu.com/ubuntu lunar-updates Release' does not have a Release file.
E: The repository 'http://archive.ubuntu.com/ubuntu lunar-backports Release' does not have a Release file.
update-alternatives: error: alternative path /usr/bin/gcc-13 doesn't exist
Error: Process completed with exit code 2.
```

example:
https://github.com/valkey-io/valkey/actions/runs/12021130026/job/33547460209

This PR uses the latest stable Ubuntu image release,
[plucky](https://hub.docker.com/layers/library/ubuntu/plucky/images/sha256-dc4565c7636f006c26d54c988faae576465e825ea349fef6fd3af6bf5100e8b6?context=explore)

Signed-off-by: Ran Shidlansik <ranshid@amazon.com>
2025-01-08 11:35:54 -08:00
Roshan Khatri
fe2ef2616c Workflow changes to fix old release binaries (#1461)
- Moves `build-config.json` to workflow dir to build old versions with
new configs.
- Enables contributors to test the release workflow on a private repo by adding
`github.event_name == 'workflow_dispatch' ||`

---------

Signed-off-by: Roshan Khatri <rvkhatri@amazon.com>
2025-01-08 11:35:54 -08:00
Binbin
14fb6d3487 Fix wrong file name in build-release-packages.yml (#1437)
Introduced in #1363, the file name does not match.

Signed-off-by: Binbin <binloveplay1314@qq.com>
2025-01-08 11:35:54 -08:00
Roshan Khatri
8b17e6a3d9 Fix the secrete for test bucket. (#1447)
We have set the secret as `AWS_S3_TEST_BUCKET` for test bucket and I
missed it in the initial review.

Signed-off-by: Roshan Khatri <rvkhatri@amazon.com>
2025-01-08 11:35:54 -08:00
Vu Diep
8786fcb762 Use configure-aws-credentials workflow instead of passing secret_access_key (#1363)
This PR fixes #1346 where we can get rid of the long term credentials by
using OpenID Connect. OpenID Connect (OIDC) allows your GitHub Actions
workflows to access resources in Amazon Web Services (AWS), without
needing to store the AWS credentials as long-lived GitHub secrets.

---------

Signed-off-by: vudiep411 <vdiep@amazon.com>
2025-01-08 11:35:54 -08:00
Binbin
d6cd90bc8e Skip build-release-packages CI job in forks (#1438)
The CI job was introduced in #1363, we should skip it in forks.

Signed-off-by: Binbin <binloveplay1314@qq.com>
2025-01-08 11:35:54 -08:00
Binbin
e59525f037 Skip IPv6 tests when TCLSH version is < 8.6 (#910)
In #786, we did skip it in the daily, but not for the others.
When running ./runtest on MacOS, we will get the failure.
```
couldn't open socket: host is unreachable (nodename nor servname provided, or not known)
```

The reason is that TCL 8.5 doesn't support IPv6, so we skip tests
tagged with ipv6. This also reverts #786.

Signed-off-by: Binbin <binloveplay1314@qq.com>
2025-01-08 11:35:54 -08:00
Yury-Fridlyand
9fbd8ea344 Fix CI concurrency (#849)
A few CI improvements which will reduce CI queue occupancy and eliminate
stale runs.

1. Kill CI jobs on PRs once the PR branch gets a new push. This prevents
the situation that happened today: a huge job was triggered twice in less
than an hour and occupied the runner queue of the whole **org** (for all
repositories) for the rest of the day (see pic). This completely blocked
the valkey-glide team.
2. Spread the nightly cron jobs out in time to prevent them from running
together. Keep in mind that cron's TZ is UTC, so midnight tasks affect
developers located in other timezones.

This must be backported to all release branches (`valkey-x.y` and `x.y`)

![image](https://github.com/user-attachments/assets/923d8237-3cb7-42f5-80c8-5322b3f5187d)

---------

Signed-off-by: Yury-Fridlyand <yury.fridlyand@improving.com>
2025-01-08 11:35:54 -08:00
Viktor Söderqvist
9308ed4ecb Skip IPv6 tests on MacOS (daily) (#786)
Signed-off-by: Viktor Söderqvist <viktor.soderqvist@est.tech>
2025-01-08 11:35:54 -08:00
Jonathan Wright
763b6f28ca Replace centos 7 with alternative versions (#543)
replace centos 7 with almalinux 8, add almalinux 9, centos stream 9, fedora stable, rawhide

Fixes #527

---------

Signed-off-by: Jonathan Wright <jonathan@almalinux.org>
Signed-off-by: Madelyn Olson <madelyneolson@gmail.com>
Co-authored-by: Madelyn Olson <madelyneolson@gmail.com>
2025-01-08 11:35:54 -08:00
Siddhartha Sankar Mondal
f5d106a90d Deprecate MacOS 11 build target (#524)
Deprecate MacOS 11 build target. End of life June 2024.  Fixes #523

---------

Signed-off-by: Siddhartha Mondal <siddharthmondal@gmail.com>
Co-authored-by: Madelyn Olson <madelyneolson@gmail.com>
Co-authored-by: Roshan Khatri <117414976+roshkhatri@users.noreply.github.com>
2025-01-08 11:35:54 -08:00
Madelyn Olson
3e0c587c08 Automatically notify the slack channel when tests fail (#509)
Adds a job that will automatically run at the end of the daily, which
will collect all the failed tests and send them to the developer slack.
It will include a link to the job as well.

Example job that ran on my private repo:
https://github.com/madolson/valkey/actions/runs/9123245899/job/25085418567

Example notification:
<img width="662" alt="image"
src="https://github.com/valkey-io/valkey/assets/34459052/69127db4-e416-4321-bc06-eefcecab1130">
(Note: I removed the sassy text at the bottom from the PR)

Signed-off-by: Madelyn Olson <madelyneolson@gmail.com>
2025-01-08 11:35:54 -08:00
Björn Svensson
d241d77f6c Pin versions of Github Actions in CI (#221)
Pin the Github Action dependencies to the hash according to secure
software development best practices
recommended by the Open Source Security Foundation (OpenSSF).

When developing a CI workflow, it's common to version-pin dependencies
(i.e. actions/checkout@v4). However, version tags are mutable, so a
malicious attacker could overwrite a version tag to point to a malicious
or vulnerable commit instead.
Pinning workflow dependencies by hash ensures the dependency is
immutable and its behavior is guaranteed.
See
https://github.com/ossf/scorecard/blob/main/docs/checks.md#pinned-dependencies
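
As a rough illustration of the difference between tag pinning and hash pinning, the following sketch flags `uses:` references that are not pinned to a full 40-character commit hash. The helper name `unpinned_actions` and the sample workflow text are hypothetical, and the 40-hex value is a placeholder rather than a real commit; this is not how Scorecard or dependabot perform their checks.

```python
import re

# Matches "uses: owner/repo@ref" (optionally with a sub-path) in workflow text.
USES_RE = re.compile(r"uses:\s*([\w.-]+/[\w./-]+)@(\S+)")
# A full-length commit SHA-1 is 40 lowercase hex characters.
SHA_RE = re.compile(r"[0-9a-f]{40}")


def unpinned_actions(workflow_text):
    """Return (action, ref) pairs whose ref is not a full commit hash."""
    found = []
    for action, ref in USES_RE.findall(workflow_text):
        if not SHA_RE.fullmatch(ref):
            found.append((action, ref))
    return found


# Hypothetical workflow snippet; the hash below is a placeholder, not a real commit.
SAMPLE = """
steps:
  - uses: actions/checkout@v4
  - uses: actions/upload-artifact@0123456789abcdef0123456789abcdef01234567 # v4.x
"""

if __name__ == "__main__":
    for action, ref in unpinned_actions(SAMPLE):
        print(f"{action} is pinned to mutable ref {ref!r}")
```

A tag such as `v4` is reported because it can be moved to another commit, while the hash-pinned entry with its trailing version comment is accepted.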

Dependabot supports updating both the hash and the version comment, so its
updates will continue to work as before.

Links to the actions used and their tag/hash for review/validation:
https://github.com/actions/checkout/tags    (v4.1.2 was rolled back)
https://github.com/github/codeql-action/tags
https://github.com/maxim-lobanov/setup-xcode/tags
https://github.com/cross-platform-actions/action/releases/tag/v0.22.0
https://github.com/py-actions/py-dependency-install/tags
https://github.com/actions/upload-artifact/tags
https://github.com/actions/setup-node/tags
https://github.com/taiki-e/install-action/releases/tag/v2.32.2

This PR is part of #211.

Signed-off-by: Björn Svensson <bjorn.a.svensson@est.tech>
2025-01-08 11:35:54 -08:00
Vitah Lin
ae6c6495bf Add Codecov for Automated Code Coverage (#316)
This PR introduces Codecov to automate code coverage tracking for our
project's tests.

For more information about the Codecov platform, please refer to
https://docs.codecov.com/docs/quick-start

---------

Signed-off-by: Vitah Lin <vitahlin@gmail.com>
Co-authored-by: Madelyn Olson <madelyneolson@gmail.com>
2025-01-08 11:35:54 -08:00
Björn Svensson
4b2edc68ca Set permissions for Github Actions in CI (#312)
This sets the default permission for current CI workflows to only be
able to read from the repository (scope: "contents").
When a Github Action we use requires additional permissions (like CodeQL),
we grant that permission at job level instead.

This means that a compromised action will not be able to modify the repo
or even steal secrets, since all other permission scopes are implicitly set
to "none", i.e. not permitted. This is recommended by
[OpenSSF](https://github.com/ossf/scorecard/blob/main/docs/checks.md#token-permissions).

This PR includes a small fix for the possibility of missing server logs
artifacts, found while verifying the permission.
The `upload-artifact@v3` action will replace artifacts which already
exist. Since both CI jobs `test-external-standalone` and
`test-external-nodebug` use the same artifact name, when both jobs
fail, we only get logs from the last finished job. This can be avoided
by using unique artifact names.

This PR is part of #211

More about permissions and scope can be found here:

https://docs.github.com/en/actions/using-workflows/workflow-syntax-for-github-actions#permissions

---------

Signed-off-by: Björn Svensson <bjorn.a.svensson@est.tech>
2025-01-08 11:35:54 -08:00
Melroy van den Berg
44a7c35d31 Build binary releases with systemd support (#1107)
- Add systemd support to the build artifact tarballs, so people can use
it under systemd compatible distros. As discussed here:
https://github.com/orgs/valkey-io/discussions/1103#discussioncomment-10815549.
This adds `libsystemd-dev` to the installed packages and `USE_SYSTEMD=yes` to the
build.
- Cleanup & bring the arm & x86 workflow files in-sync. It was a bit of
a mess ;) (removing `jq wget awscli` from the 'Tarball' step)

Signed-off-by: Melroy van den Berg <melroy@melroy.org>
2024-10-02 20:11:12 +02:00
Melroy van den Berg
c789c3f1d1 Avoid .c, .d and .o files from being copied to the binary tar.gz releases (#1106)
As discussed here:
https://github.com/orgs/valkey-io/discussions/1103#discussioncomment-10814006

`cp` can't be used anymore; `rsync` is more powerful and allows
excluding files.

Alternatively:

1. Remove the c, d and o files. Which isn't ideal either.
2. Improve the build. Eg. by building inside a `build` directory instead
of in the src folder.

Ps. I know these workflows aren't triggered in this PR, only via the "Build
Release Packages" workflow action:
https://github.com/valkey-io/valkey/actions/workflows/build-release-packages.yml.
So I can't fully test in this PR. But it should work ^^

Ps. ps. I did test `rsync -av --exclude='*.c' --exclude='*.d'
--exclude='*.o' src/valkey-*` command in isolation and that works as
expected!
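
To illustrate what the three exclude patterns select, here is a small Python sketch that applies the same globs to a made-up file list. It only approximates rsync's matching for these simple suffix patterns; the helper name and the file names are hypothetical.

```python
from fnmatch import fnmatchcase

# The same patterns passed to rsync via --exclude.
EXCLUDES = ["*.c", "*.d", "*.o"]


def kept_files(names, excludes=EXCLUDES):
    """Return the names that match none of the exclude patterns."""
    return [n for n in names if not any(fnmatchcase(n, pat) for pat in excludes)]


if __name__ == "__main__":
    # Hypothetical entries matching src/valkey-*
    names = ["valkey-server", "valkey-cli", "valkey-cli.c", "valkey-cli.d", "valkey-cli.o"]
    print(kept_files(names))
```

Only the binaries survive the filter; the source, dependency, and object files are skipped.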

---------

Signed-off-by: Melroy van den Berg <melroy@melroy.org>
2024-10-02 20:11:12 +02:00
Roshan Khatri
50eefd647d
[Cherry-Pick]Adds workflows to build release binaries and push to S3 (#315) (#857)
[related to](https://github.com/valkey-io/valkey/issues/230)

Adds workflows to build Valkey binaries and push to S3 to make it
available to download from the website

The workflows can be triggered either by pushing a release to the repo or
manually by one of the maintainers.

Once the workflow triggers, it generates a matrix of jobs for the
platforms we need to build from `utils/releasetools/build-config.json`
and then triggers the respective jobs. These jobs build Valkey for the
platforms we want to release binaries for and push the results to a
private S3 bucket.

---------

Signed-off-by: Roshan Khatri <rvkhatri@amazon.com>
2024-09-30 14:21:33 -07:00
Yossi Gottlieb
6df023fb98 Reduce FreeBSD daily scope. (#12758)
The full test is very flaky running on a VM inside GitHub worker, so we
have to settle for only building and running a small smoke test.

Signed-off-by: Ping Xie <pingxie@google.com>
2024-07-15 14:29:28 -07:00
Jonathan Wright
7cb3426a4b Replace centos 7 with alternative versions (#543)
replace centos 7 with almalinux 8, add almalinux 9, centos stream 9, fedora stable, rawhide

Fixes #527

---------

Signed-off-by: Jonathan Wright <jonathan@almalinux.org>
Signed-off-by: Madelyn Olson <madelyneolson@gmail.com>
Co-authored-by: Madelyn Olson <madelyneolson@gmail.com>
Signed-off-by: Ping Xie <pingxie@google.com>
2024-07-12 15:01:57 -07:00
Madelyn Olson
7ed24abe49 Squash changes for creating valkey-server
Signed-off-by: Madelyn Olson <madelyneolson@gmail.com>
2024-04-06 20:33:09 -07:00
dependabot[bot]
1634a0f271
Bump vmactions/freebsd-vm from 0.3.0 to 0.3.1 (#12352)
Bumps [vmactions/freebsd-vm](https://github.com/vmactions/freebsd-vm) from 0.3.0 to 0.3.1.
- [Release notes](https://github.com/vmactions/freebsd-vm/releases)
- [Commits](https://github.com/vmactions/freebsd-vm/compare/v0.3.0...v0.3.1)

---
updated-dependencies:
- dependency-name: vmactions/freebsd-vm
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-06-27 09:17:34 +03:00
Oran Agra
c871db24c4
CI to validate commands.def is up to date (#12227)
and update recent SENTINEL CONFIG changes.
2023-05-24 16:21:18 +03:00
Yossi Gottlieb
5ddc0af33e
Update old Debian CI. (#12104)
We were using `oldstable` Debian as a CI with an older toolchain, but that image is now offline so move to debian:buster

```
E: Failed to fetch http://security.debian.org/debian-security/dists/oldoldstable/updates/main/binary-amd64/Packages 404 Not Found [IP: 151.101.2.132 80]
```
2023-04-25 14:09:37 +03:00
sundb
42c8c61813
Fix some compile warnings and errors when building with gcc-12 or clang (#12035)
This PR fixes the compilation warnings and errors generated by the latest
compiler toolchain, and adds a new runner with the latest toolchain for daily CI.

## Fix various compilation warnings and errors

1) jemalloc.c

COMPILER: clang-14 with FORTIFY_SOURCE

WARNING:
```
src/jemalloc.c:1028:7: warning: suspicious concatenation of string literals in an array initialization; did you mean to separate the elements with a comma? [-Wstring-concatenation]
                    "/etc/malloc.conf",
                    ^
src/jemalloc.c:1027:3: note: place parentheses around the string literal to silence warning
                "\"name\" of the file referenced by the symbolic link named "
                ^
```

REASON: the compiler alerts developers to potential issues with string concatenation
that may be missing a comma,
just like #9534, which was missing a comma.

SOLUTION: use `()` to tell the compiler that these two string lines are one continuous string.

2) config.h

COMPILER: clang-14 with FORTIFY_SOURCE

WARNING:
```
In file included from quicklist.c:36:
./config.h:319:76: warning: attribute declaration must precede definition [-Wignored-attributes]
char *strcat(char *restrict dest, const char *restrict src) __attribute__((deprecated("please avoid use of unsafe C functions. prefer use of redis_strlcat instead")));
```

REASON: Enabling _FORTIFY_SOURCE will cause the compiler to use `strcpy()` with check,
it results in a deprecated attribute declaration after including <features.h>.

SOLUTION: move the deprecated attribute declaration from config.h to fmacro.h before "#include <features.h>".

3) networking.c

COMPILER: GCC-12

WARNING: 
```
networking.c: In function ‘addReplyDouble.part.0’:
networking.c:876:21: warning: writing 1 byte into a region of size 0 [-Wstringop-overflow=]
  876 |         dbuf[start] = '$';
      |                     ^
networking.c:868:14: note: at offset -5 into destination object ‘dbuf’ of size 5152
  868 |         char dbuf[MAX_LONG_DOUBLE_CHARS+32];
      |              ^
networking.c:876:21: warning: writing 1 byte into a region of size 0 [-Wstringop-overflow=]
  876 |         dbuf[start] = '$';
      |                     ^
networking.c:868:14: note: at offset -6 into destination object ‘dbuf’ of size 5152
  868 |         char dbuf[MAX_LONG_DOUBLE_CHARS+32];
```

REASON: GCC-12 predicts that digits10() may return 9 or 10 through `return 9 + (v >= 1000000000UL)`.

SOLUTION: add an assert to let the compiler know the possible length;

4) redis-cli.c & redis-benchmark.c

COMPILER: clang-14 with FORTIFY_SOURCE

WARNING:
```
redis-benchmark.c:1621:2: warning: embedding a directive within macro arguments has undefined behavior [-Wembedded-directive] #ifdef USE_OPENSSL
redis-cli.c:3015:2: warning: embedding a directive within macro arguments has undefined behavior [-Wembedded-directive] #ifdef USE_OPENSSL
```

REASON: when _FORTIFY_SOURCE is enabled, the compiler will use the print() with
check, which is a macro. this may result in the use of directives within the macro, which
is undefined behavior.

SOLUTION: move the directives-related code out of `print()`.

5) server.c

COMPILER: gcc-13 with FORTIFY_SOURCE

WARNING:
```
In function 'lookupCommandLogic',
    inlined from 'lookupCommandBySdsLogic' at server.c:3139:32:
server.c:3102:66: error: '*(robj **)argv' may be used uninitialized [-Werror=maybe-uninitialized]
 3102 |     struct redisCommand *base_cmd = dictFetchValue(commands, argv[0]->ptr);
      |                                                              ~~~~^~~
```

REASON: The compiler thinks that the `argc` returned by `sdssplitlen()` could be 0,
resulting in an empty array of size 0 being passed to lookupCommandLogic.
this should be a false positive, `argc` can't be 0 when strings are not NULL.

SOLUTION: add an assert to let the compiler know that `argc` is positive.

6) sha1.c

COMPILER: gcc-12

WARNING:
```
In function ‘SHA1Update’,
    inlined from ‘SHA1Final’ at sha1.c:195:5:
sha1.c:152:13: warning: ‘SHA1Transform’ reading 64 bytes from a region of size 0 [-Wstringop-overread]
  152 |             SHA1Transform(context->state, &data[i]);
      |             ^
sha1.c:152:13: note: referencing argument 2 of type ‘const unsigned char[64]’
sha1.c: In function ‘SHA1Final’:
sha1.c:56:6: note: in a call to function ‘SHA1Transform’
   56 | void SHA1Transform(uint32_t state[5], const unsigned char buffer[64])
      |      ^
In function ‘SHA1Update’,
    inlined from ‘SHA1Final’ at sha1.c:198:9:
sha1.c:152:13: warning: ‘SHA1Transform’ reading 64 bytes from a region of size 0 [-Wstringop-overread]
  152 |             SHA1Transform(context->state, &data[i]);
      |             ^
sha1.c:152:13: note: referencing argument 2 of type ‘const unsigned char[64]’
sha1.c: In function ‘SHA1Final’:
sha1.c:56:6: note: in a call to function ‘SHA1Transform’
   56 | void SHA1Transform(uint32_t state[5], const unsigned char buffer[64])
```

REASON: due to the bug [https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80922], when
LTO is enabled, gcc-12 will not see `diagnostic ignored "-Wstringop-overread"`, resulting in a warning.

SOLUTION: temporarily set SHA1Update to noinline to avoid compiler warnings due
to LTO being enabled until the above gcc bug is fixed.

7) zmalloc.h

COMPILER: GCC-12

WARNING: 
```
In function ‘memset’,
    inlined from ‘moduleCreateContext’ at module.c:877:5,
    inlined from ‘RM_GetDetachedThreadSafeContext’ at module.c:8410:5:
/usr/include/x86_64-linux-gnu/bits/string_fortified.h:59:10: warning: ‘__builtin_memset’ writing 104 bytes into a region of size 0 overflows the destination [-Wstringop-overflow=]
   59 |   return __builtin___memset_chk (__dest, __ch, __len,
```

REASON: due to the GCC-12 bug [https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96503],
GCC-12 cannot see alloc_size, which causes GCC to think that the actual size of memory
is 0 when checking with __glibc_objsize0().

SOLUTION: temporarily set malloc-related interfaces to `noinline` to avoid compiler warnings
due to LTO being enabled until the above gcc bug is fixed.

## Other changes
1) Fixed the `replication child dies when parent is killed - diskless` test, which failed
  because `ps -p [pid]` doesn't output `<defunct>` when using procps 4.x.
2) Add a new fortify CI with GCC-13 and ubuntu-lunar docker image.
2023-04-18 09:53:51 +03:00
Binbin
810ea67b5b
Don't pass --fail-commands-not-all-hit to validator if we don't run the full testsuite (#12023)
In daily.yml, if the input suggests we don't run the full testsuite,
do not pass --fail-commands-not-all-hit to the validator.

This fixes the first point in #11954. Credit goes to the comment
on the open issue for GH actions: actions/runner#409

Also improve prints to show the dispatch arguments in every job.
2023-04-12 12:23:50 +03:00
Oran Agra
997fa41e99
Attempt to solve MacOS CI issues in GH Actions (#12013)
The MacOS CI in github actions often hangs without any logs. GH argues that
it's due to resource utilization, either running out of disk space, memory, or CPU
starvation, and thus the runner is terminated.

This PR contains multiple attempts to resolve this:
1. introducing pause_process instead of SIGSTOP, which waits for the process
  to stop before resuming the test, possibly resolving race conditions in some tests.
  This was a suspect since there was one test that could result in an infinite loop in that
  case. In practice this didn't help, but it's still a good idea to keep.
2. disable the `save` config in many tests that don't need it, specifically ones that use
  heavy writes and could create large files.
3. change the `populate` proc to use short pipeline rather than an infinite one.
4. use `--clients 1` in the macos CI so that we don't risk running multiple resource
  demanding tests in parallel.
5. enable `--verbose` to be repeated to elevate verbosity and print more info to stdout
  when a test or a server starts.
2023-04-12 09:19:21 +03:00
Oran Agra
f263b6daf3
Increase threshold for flaky cache reclaim test (#12004)
This test produces 1GB of data and moves it around, and was expecting less
than 500kb to be present in the system page cache.
It sometimes fails with up to some 6mb in the page cache (0 in the actual RDB files),
so this change increases the threshold. It looks like some background tasks in the container are
occupying the page cache.

It is safe to ignore the above since we also explicitly check the pages of our dump.rdb
are not cached (matching `vmtouch -v` to `0%`).
An additional fix is to match ` 0%` (add space), so that we don't successfully match `10%`.
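
The need for the leading space can be seen in a short sketch. The sample lines below are made up to resemble a percentage report and are not captured `vmtouch` output; the function names are hypothetical.

```python
def naive_match(output):
    # Substring test without the space: also true for "10%", "20%", ...
    return "0%" in output


def strict_match(output):
    # Requiring a leading space rules out "10%" while still accepting " 0%".
    return " 0%" in output


if __name__ == "__main__":
    cached = "Resident Pages: 10/100  10%"    # hypothetical output
    not_cached = "Resident Pages: 0/100  0%"  # hypothetical output
    print(naive_match(cached), strict_match(cached))
    print(naive_match(not_cached), strict_match(not_cached))
```

The naive check accepts both lines, whereas the check with the space only accepts the line that really reports zero percent.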

details in https://github.com/redis/redis/pull/11818
2023-04-05 14:45:42 +03:00
Oran Agra
9e15b42fda
ignore latency errors in the schema validation CI (#11958)
these latency threshold errors prevent the schema validation from running.
2023-03-23 10:49:09 +02:00
guybe7
4ba47d2d21
Add reply_schema to command json files (internal for now) (#10273)
Work in progress towards implementing a reply schema as part of COMMAND DOCS, see #9845
Since ironing the details of the reply schema of each and every command can take a long time, we
would like to merge this PR when the infrastructure is ready, and let this mature in the unstable branch.
Meanwhile the changes of this PR are internal, they are part of the repo, but do not affect the produced build.

### Background
In #9656 we add a lot of information about Redis commands, but we are missing information about the replies

### Motivation
1. Documentation. This is the primary goal.
2. It should be possible, based on the output of COMMAND, to be able to generate client code in typed
  languages. In order to do that, we need Redis to tell us, in detail, what each reply looks like.
3. We would like to build a fuzzer that verifies the reply structure (for now we use the existing
  testsuite, see the "Testing" section)

### Schema
The idea is to supply some sort of schema for the various replies of each command.
The schema will describe the conceptual structure of the reply (for generated clients), as defined in RESP3.
Note that the reply structure itself may change, depending on the arguments (e.g. `XINFO STREAM`, with
and without the `FULL` modifier)
We decided to use the standard json-schema (see https://json-schema.org/) as the reply-schema.

Example for `BZPOPMIN`:
```
"reply_schema": {
    "oneOf": [
        {
            "description": "Timeout reached and no elements were popped.",
            "type": "null"
        },
        {
            "description": "The keyname, popped member, and its score.",
            "type": "array",
            "minItems": 3,
            "maxItems": 3,
            "items": [
                {
                    "description": "Keyname",
                    "type": "string"
                },
                {
                    "description": "Member",
                    "type": "string"
                },
                {
                    "description": "Score",
                    "type": "number"
                }
            ]
        }
    ]
}
```

#### Notes
1.  It is ok that some commands' reply structure depends on the arguments and it's the caller's responsibility
  to know which is the relevant one. this comes after looking at other request-reply systems like OpenAPI,
  where the reply schema can also be oneOf and the caller is responsible to know which schema is the relevant one.
2. The reply schemas will describe RESP3 replies only. even though RESP3 is structured, we want to use reply
  schema for documentation (and possibly to create a fuzzer that validates the replies)
3. For documentation, the description field will include an explanation of the scenario in which the reply is sent,
  including any relation to arguments. for example, for `ZRANGE`'s two schemas we will need to state that one
  is with `WITHSCORES` and the other is without.
4. For documentation, there will be another optional field "notes" in which we will add a short description of
  the representation in RESP2, in case it's not trivial (RESP3's `ZRANGE`'s nested array vs. RESP2's flat
  array, for example)

Given the above:
1. We can generate the "return" section of all commands in [redis-doc](https://redis.io/commands/)
  (given that "description" and "notes" are comprehensive enough)
2. We can generate a client in a strongly typed language (but the return type could be a conceptual
  `union` and the caller needs to know which schema is relevant). see the section below for RESP2 support.
3. We can create a fuzzer for RESP3.

### Limitations (because we are using the standard json-schema)
The problem is that Redis' replies are more diverse than what the json format allows. This means that,
when we convert the reply to a json (in order to validate the schema against it), we lose information (see
the "Testing" section below).
The other option would have been to extend the standard json-schema (and json format) to include stuff
like sets, bulk-strings, error-string, etc. but that would mean also extending the schema-validator - and that
seemed like too much work, so we decided to compromise.

Examples:
1. We cannot tell the difference between an "array" and a "set"
2. We cannot tell the difference between simple-string and bulk-string
3. we cannot verify true uniqueness of items in commands like ZRANGE: json-schema doesn't cover the
  case of two identical members with different scores (e.g. `[["m1",6],["m1",7]]`) because `uniqueItems`
  compares (member,score) tuples and not just the member name. 
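
The third limitation can be reproduced with a hand-written check that mimics how `uniqueItems` compares whole items. This is a simplified stand-in written for illustration (the function names are hypothetical, and equality is approximated by comparing serialized values), not the json-schema validator used by the test tooling.

```python
import json


def unique_items(items):
    """Mimic json-schema `uniqueItems`: no two items may be equal as whole values."""
    seen = set()
    for item in items:
        key = json.dumps(item, sort_keys=True)  # serialized form of the whole item
        if key in seen:
            return False
        seen.add(key)
    return True


def unique_members(pairs):
    """The stricter check we would actually want: member names must not repeat."""
    members = [member for member, _score in pairs]
    return len(members) == len(set(members))


if __name__ == "__main__":
    reply = [["m1", 6], ["m1", 7]]
    print(unique_items(reply))    # whole (member, score) items are compared
    print(unique_members(reply))  # only member names are compared
```

Because the two items differ in their score, the item-level check passes even though the member `m1` is repeated.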

### Testing
This commit includes some changes inside Redis in order to verify the schemas (existing and future ones)
are indeed correct (i.e. describe the actual response of Redis).
To do that, we added a debugging feature to Redis that causes it to produce a log of all the commands
it executed and their replies.
For that, Redis needs to be compiled with `-DLOG_REQ_RES` and run with
`--req-res-logfile <file> --client-default-resp 3` (the testsuite already does that if you run it with
`--log-req-res --force-resp3`).
You should run the testsuite with the above args (and `--dont-clean`) in order to make Redis generate
`.reqres` files (same dir as the `stdout` files) which contain request-response pairs.
These files are later on processed by `./utils/req-res-log-validator.py` which does:
1. Goes over req-res files, generated by redis-servers, spawned by the testsuite (see logreqres.c)
2. For each request-response pair, it validates the response against the request's reply_schema
  (obtained from the extended COMMAND DOCS)

In order to get good coverage of the Redis commands, and all their different replies, we chose to use
the existing redis test suite, rather than attempt to write a fuzzer.

#### Notes about RESP2
1. We will not be able to use the testing tool to verify RESP2 replies (we are ok with that, it's time to
  accept RESP3 as the future RESP)
2. Since the majority of the test suite is using RESP2, and we want the server to reply with RESP3
  so that we can validate it, we will need to know how to convert the actual reply to the one expected.
   - number and boolean are always strings in RESP2 so the conversion is easy
   - objects (maps) are always a flat array in RESP2
   - others (nested array in RESP3's `ZRANGE` and others) will need some special per-command
     handling (so the client will not be totally auto-generated)
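
The map and nested-array cases can be sketched as plain list manipulation, converting a RESP3-shaped reply into the flat shape a RESP2-based test expects. The helper names and sample replies are hypothetical, the score formatting via `str()` is an assumption (a real server formats doubles its own way), and the actual response transformers in the test suite are not shown in this message.

```python
def map_to_flat(reply_map):
    """RESP3 map -> RESP2-style flat array [k1, v1, k2, v2, ...]."""
    flat = []
    for key, value in reply_map.items():
        flat.extend([key, value])
    return flat


def pairs_to_flat(pairs):
    """RESP3 nested [[member, score], ...] -> RESP2-style flat [member, score, ...].

    Scores become strings, since numbers arrive as strings in RESP2.
    Using str() for the formatting is an assumption made for this sketch.
    """
    flat = []
    for member, score in pairs:
        flat.extend([member, str(score)])
    return flat


if __name__ == "__main__":
    print(map_to_flat({"length": 2, "groups": 1}))
    print(pairs_to_flat([["m1", 6], ["m2", 7]]))
```

The map case is mechanical, while the nested case needs to know that a given command returns pairs, which is why per-command handling is mentioned above.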

Example for ZRANGE:
```
"reply_schema": {
    "anyOf": [
        {
            "description": "A list of member elements",
            "type": "array",
            "uniqueItems": true,
            "items": {
                "type": "string"
            }
        },
        {
            "description": "Members and their scores. Returned in case `WITHSCORES` was used.",
            "notes": "In RESP2 this is returned as a flat array",
            "type": "array",
            "uniqueItems": true,
            "items": {
                "type": "array",
                "minItems": 2,
                "maxItems": 2,
                "items": [
                    {
                        "description": "Member",
                        "type": "string"
                    },
                    {
                        "description": "Score",
                        "type": "number"
                    }
                ]
            }
        }
    ]
}
```

### Other changes
1. Some tests that behave differently depending on the RESP are now being tested for both RESP,
  regardless of the special log-req-res mode ("Pub/Sub PING" for example)
2. Update the history field of CLIENT LIST
3. Added basic tests for commands that were not covered at all by the testsuite

### TODO

- [x] (maybe a different PR) add a "condition" field to anyOf/oneOf schemas that refers to args. e.g.
  when `SET` return NULL, the condition is `arguments.get||arguments.condition`, for `OK` the condition
  is `!arguments.get`, and for `string` the condition is `arguments.get` - https://github.com/redis/redis/issues/11896
- [x] (maybe a different PR) also run `runtest-cluster` in the req-res logging mode
- [x] add the new tests to GH actions (i.e. compile with `-DLOG_REQ_RES`, run the tests, and run the validator)
- [x] (maybe a different PR) figure out a way to warn about (sub)schemas that are uncovered by the output
  of the tests - https://github.com/redis/redis/issues/11897
- [x] (probably a separate PR) add all missing schemas
- [x] check why "SDOWN is triggered by misconfigured instance replying with errors" fails with --log-req-res
- [x] move the response transformers to their own file (run both regular, cluster, and sentinel tests - need to
  fight with the tcl including mechanism a bit)
- [x] issue: module API - https://github.com/redis/redis/issues/11898
- [x] (probably a separate PR): improve schemas: add `required` to `object`s - https://github.com/redis/redis/issues/11899

Co-authored-by: Ozan Tezcan <ozantezcan@gmail.com>
Co-authored-by: Hanna Fadida <hanna.fadida@redislabs.com>
Co-authored-by: Oran Agra <oran@redislabs.com>
Co-authored-by: Shaya Potter <shaya@redislabs.com>
2023-03-11 10:14:16 +02:00
Oran Agra
3ac835777c
Stabilize page reclaim CI test (#11818)
stabilize the test introduced in #11248
* remove random aspect of the test by using DEBUG POPULATE instead of redis-benchmark
* disable rdbcompression, so that the rdb file is always about 1GB.

with fadvise disabled, i get about 1GB in the page cache;
with it enabled i get less than 200KB,
so for now, i'll keep the 500KB threshold.
2023-02-19 18:38:07 +02:00
Oran Agra
5b61b0dc6d
skip new page cache reclaim unit test when running in valgrind (#11808)
the new test is incompatible with valgrind.
added a new `--valgrind` argument to `redis-server tests` mode,
which will cause that test to be skipped.
2023-02-16 10:50:58 +02:00
Tian
7dae142a2e
Reclaim page cache of RDB file (#11248)
# Background
The RDB file is usually generated and used once and seldom used again, but its content stays in the page cache until the OS evicts it. A potential problem is that once free memory is exhausted, the OS has to reclaim memory from the page cache or swap out anonymous pages, which may cause jitter in the Redis service.

Consider a concrete scenario: a high-capacity machine hosts many Redis instances, and we upgrade them all together. The page cache on the host grows as RDBs are generated. Once free memory drops to the low watermark (which is more likely on older Linux kernels like 3.10; before [watermark_scale_factor](https://lore.kernel.org/lkml/1455813719-2395-1-git-send-email-hannes@cmpxchg.org/) was introduced, the `low watermark` was linear in the `min watermark`, leaving little buffer space for `kswapd` to wake up and reclaim memory), a `direct reclaim` happens, which means the process stalls waiting for memory allocation.

# What the PR does
The PR introduces the ability to reclaim the page cache when an RDB file is read or written. For reads, incremental reclaim is a little messy, so the reclaim is done in one go in the background after the load finishes, to avoid blocking the work thread. For writes, incremental reclaim amortizes the work, so there is no need to move it to the background, and the peak cache watermark is reduced this way.
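The incremental reclaim on the write path can be pictured with a small sketch. This is not the PR's code: the helper name `write_with_incremental_reclaim` and the `chunk` parameter are hypothetical, and the sketch assumes a platform that provides `posix_fadvise` (e.g. Linux) and a file descriptor positioned at offset 0 of a fresh file.

```c
#define _POSIX_C_SOURCE 200809L /* for posix_fadvise() and fdatasync() */
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper (not from the PR): write `len` bytes to `fd` and, each
 * time at least `chunk` bytes have accumulated since the last reclaim, flush
 * them and hint the kernel to drop them from the page cache.
 * Assumes `fd` starts at offset 0, so byte counts double as file offsets.
 * Returns 0 on success, -1 on a write or flush error. */
int write_with_incremental_reclaim(int fd, const char *buf, size_t len, size_t chunk) {
    assert(chunk > 0);
    off_t written = 0;   /* bytes written so far */
    off_t reclaimed = 0; /* bytes already flushed and dropped */

    while ((size_t)written < len) {
        size_t want = len - (size_t)written;
        if (want > chunk) want = chunk;

        ssize_t n = write(fd, buf + written, want);
        if (n < 0) {
            if (errno == EINTR) continue; /* interrupted, retry */
            return -1;
        }
        written += n;

        if ((size_t)(written - reclaimed) >= chunk) {
            /* Dirty pages are not dropped, so flush them first. */
            if (fdatasync(fd) != 0) return -1;
            /* The advice is only a hint, so its result is ignored here. */
            (void)posix_fadvise(fd, reclaimed, written - reclaimed, POSIX_FADV_DONTNEED);
            reclaimed = written;
        }
    }

    /* Reclaim whatever tail is left after the last full chunk. */
    if (written > reclaimed) {
        if (fdatasync(fd) != 0) return -1;
        (void)posix_fadvise(fd, reclaimed, written - reclaimed, POSIX_FADV_DONTNEED);
    }
    return 0;
}
```

Because each reclaim covers only the bytes written since the previous one, the amount of cached data stays bounded by roughly one chunk instead of growing to the size of the whole file.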

Two cases are handled specially: replication and restart. In both, the cache is leveraged to speed up processing, so the reclaim is postponed to a suitable time. To do this, a flag is added to `rdbSave` and `rdbLoad` to control whether the cache needs to be kept, with a default value of false.

# Things worth noting
1. Though `posix_fadvise` is part of the POSIX standard, only a few platforms support it, e.g. Linux and FreeBSD 10.0.
2. On Linux, `posix_fadvise` only takes effect on pages that have already been written back, so a `sync` (or `fsync`, `fdatasync`) is needed to flush the dirty pages before calling `posix_fadvise` when reclaiming the write cache.
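Both notes can be illustrated with a short flush-then-advise sketch. The helper name `drop_file_cache` is hypothetical, and guarding on the `POSIX_FADV_DONTNEED` macro is an assumption made for this example; the PR may detect platform support differently.

```c
#define _POSIX_C_SOURCE 200809L /* for posix_fadvise() and fdatasync() */
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper (not from the PR): drop the cached pages of a whole
 * file. Returns 0 on success, or an errno-style code on failure. */
int drop_file_cache(int fd) {
#if defined(POSIX_FADV_DONTNEED)
    /* Dirty pages are skipped by the advice, so write them back first. */
    if (fdatasync(fd) != 0) return errno;
    /* offset 0 with len 0 covers the whole file. posix_fadvise() returns
     * the error number directly instead of setting errno. */
    return posix_fadvise(fd, 0, 0, POSIX_FADV_DONTNEED);
#else
    /* Platforms without posix_fadvise(): report that nothing was done. */
    (void)fd;
    return ENOTSUP;
#endif
}
```

A caller would typically treat a non-zero result as non-fatal, since the advice is only an optimization and the file contents are unaffected either way.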

# About the tests
A unit test is added to verify the effect of `posix_fadvise`.
In the integration tests, the overall cache increase is checked, and the cache backed by the RDB is checked by a dedicated TCL test that runs in an isolated GitHub Actions job.
2023-02-12 09:23:29 +02:00
Binbin
0f85713174
Fix sentinel update loglevel tls test (#11528)
Apparently we used to set `loglevel debug` for tls in spawn_instance.
I.e. cluster and sentinel tests used to run with debug logging, only when tls mode was enabled.
this was probably a leftover from when the tls mode tests were created.
it caused a new test created for #11214 to fail in tls mode.

At the same time, in order to better distinguish the tests, change the
name of `test-centos7-tls` to `test-centos7-tls-module`, change the name
of `test-centos7-tls-no-tls` to `test-centos7-tls-module-no-tls`.

Note that in `test-centos7-tls-module`, we did not pass `--tls-module`
in sentinel test because it is not supported, see 4faddf1, added in #9320.
So only `test-ubuntu-tls` fails in daily CI.

Co-authored-by: Oran Agra <oran@redislabs.com>
2022-11-21 22:53:13 +02:00
Binbin
5246bf4544
Bump vmactions/freebsd-vm to 0.3.0 to fix FreeBSD daily (#11476)
Our FreeBSD daily has been failing recently:
```
  Config file: freebsd-13.1.conf
  cd: /Users/runner/work/redis/redis: No such file or directory
  gmake: *** No targets specified and no makefile found.  Stop.
```

Upgrading vmactions/freebsd-vm to the latest version (0.3.0) fixes it.
I've tested it; I don't know why it works, but let's fix the CI first.
2022-11-04 20:28:27 +02:00
dependabot[bot]
c66eaf4e4a
Bump vmactions/freebsd-vm from 0.2.3 to 0.2.4 (#11203)
Bumps [vmactions/freebsd-vm](https://github.com/vmactions/freebsd-vm) from 0.2.3 to 0.2.4.
- [Release notes](https://github.com/vmactions/freebsd-vm/releases)
- [Commits](https://github.com/vmactions/freebsd-vm/compare/v0.2.3...v0.2.4)

---
updated-dependencies:
- dependency-name: vmactions/freebsd-vm
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-08-31 10:13:48 +03:00
Oran Agra
4faddf18ca Build TLS as a loadable module
* Support BUILD_TLS=module to be loaded as a module via config file or
  command line. e.g. redis-server --loadmodule redis-tls.so
* Updates to redismodule.h to allow it to be used side by side with
  server.h by defining REDISMODULE_CORE_MODULE
* Changes to server.h, redismodule.h and module.c to avoid repeated
  type declarations (gcc 4.8 doesn't like these)
* Add a mechanism for non-ABI neutral modules (ones that include
  server.h) to refuse loading if they detect not being built together with
  redis (release.c)
* Fix wrong signature of RedisModuleDefragFunc, this could break
  compilation of a module, but not the ABI
* Move initialization of listeners in server.c to be after loading
  the modules
* Config TLS after initialization of listeners
* Init cluster after initialization of listeners
* Add TLS module to CI
* Fix a test suite race condition:
  Now that the listeners are initialized later, it's not sufficient to
  wait for the PID message in the log, we need to wait for the "Server
  Initialized" message.
* Fix issues with moduleconfigs test as a result of start_server
  waiting for "Server Initialized"
* Fix issues with modules/infra test as a result of an additional module
  present
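
The load-time refusal mentioned in the list above (for modules that include server.h) could be approximated by comparing build identifiers baked into each binary. This is a minimal sketch under that assumption; `MODULE_BUILD_ID` and `check_same_build` are hypothetical names for illustration and are not taken from release.c.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical: an identifier generated at build time and compiled into the
 * module. The server would expose its own identifier for comparison. */
#define MODULE_BUILD_ID "a1b2c3d4"

/* Returns 0 when both sides come from the same build, -1 otherwise.
 * A module that depends on server internals would refuse to load on -1. */
int check_same_build(const char *module_build_id, const char *server_build_id) {
    if (strcmp(module_build_id, server_build_id) != 0) {
        fprintf(stderr, "module build %s does not match server build %s, refusing to load\n",
                module_build_id, server_build_id);
        return -1;
    }
    return 0;
}
```

In an `OnLoad` function the check would run before anything else, for example `if (check_same_build(MODULE_BUILD_ID, server_id) != 0) return REDISMODULE_ERR;`, where `server_id` stands for whatever identifier the server exposes.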

Notes about Sentinel:
Sentinel can't really rely on the tls module, since it uses hiredis to
initiate connections and depends on OpenSSL (won't be able to use any
other connection modules for that), so it was decided that when TLS is
built as a module, sentinel does not support TLS at all.
This means that it keeps using redis_tls_ctx and redis_tls_client_ctx directly.

Example config code for redis-tls.so (may be used in the future):
```c
RedisModuleString *tls_cfg = NULL;

void tlsInfo(RedisModuleInfoCtx *ctx, int for_crash_report) {
    UNUSED(for_crash_report);
    RedisModule_InfoAddSection(ctx, "");
    RedisModule_InfoAddFieldLongLong(ctx, "var", 42);
}

int tlsCommand(RedisModuleCtx *ctx, RedisModuleString **argv, int argc)
{
    if (argc != 2) return RedisModule_WrongArity(ctx);
    return RedisModule_ReplyWithString(ctx, argv[1]);
}

RedisModuleString *getStringConfigCommand(const char *name, void *privdata) {
    REDISMODULE_NOT_USED(name);
    REDISMODULE_NOT_USED(privdata);
    return tls_cfg;
}

int setStringConfigCommand(const char *name, RedisModuleString *new, void *privdata, RedisModuleString **err) {
    REDISMODULE_NOT_USED(name);
    REDISMODULE_NOT_USED(err);
    REDISMODULE_NOT_USED(privdata);
    if (tls_cfg) RedisModule_FreeString(NULL, tls_cfg);
    RedisModule_RetainString(NULL, new);
    tls_cfg = new;
    return REDISMODULE_OK;
}

int RedisModule_OnLoad(void *ctx, RedisModuleString **argv, int argc)
{
    ....
    if (RedisModule_CreateCommand(ctx,"tls",tlsCommand,"",0,0,0) == REDISMODULE_ERR)
        return REDISMODULE_ERR;

    if (RedisModule_RegisterStringConfig(ctx, "cfg", "", REDISMODULE_CONFIG_DEFAULT, getStringConfigCommand, setStringConfigCommand, NULL, NULL) == REDISMODULE_ERR)
        return REDISMODULE_ERR;

    if (RedisModule_LoadConfigs(ctx) == REDISMODULE_ERR) {
        if (tls_cfg) {
            RedisModule_FreeString(ctx, tls_cfg);
            tls_cfg = NULL;
        }
        return REDISMODULE_ERR;
    }
    ...
}
```

Co-authored-by: zhenwei pi <pizhenwei@bytedance.com>
Signed-off-by: zhenwei pi <pizhenwei@bytedance.com>
2022-08-23 12:37:56 +03:00
dependabot[bot]
4fe9242a7f
Bump vmactions/freebsd-vm from 0.2.0 to 0.2.3 (#11072)
Bumps [vmactions/freebsd-vm](https://github.com/vmactions/freebsd-vm) from 0.2.0 to 0.2.3.
- [Release notes](https://github.com/vmactions/freebsd-vm/releases)
- [Commits](https://github.com/vmactions/freebsd-vm/compare/v0.2.0...v0.2.3)

---
updated-dependencies:
- dependency-name: vmactions/freebsd-vm
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-08-02 21:56:48 +03:00
Yossi Gottlieb
b550a55cbf
CI: Update vmaction. (#11013) 2022-07-19 15:30:06 +03:00
Oran Agra
475563e2e9
crash report instructions (#10816)
Trying to avoid people opening crash report issues about module crashes and ARM QEMU bugs.
2022-06-06 11:39:23 +03:00
dependabot[bot]
ff3a3577f2
Bump github/codeql-action from 1 to 2 (#10635)
* Bump github/codeql-action from 1 to 2

Bumps [github/codeql-action](https://github.com/github/codeql-action) from 1 to 2.
- [Release notes](https://github.com/github/codeql-action/releases)
- [Changelog](https://github.com/github/codeql-action/blob/main/CHANGELOG.md)
- [Commits](https://github.com/github/codeql-action/compare/v1...v2)

---
updated-dependencies:
- dependency-name: github/codeql-action
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>

* Avoid CodeQL on push error.

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-05-04 11:40:08 +03:00
dependabot[bot]
6b403f56a5
Bump actions/upload-artifact from 2 to 3 (#10566)
Bumps [actions/upload-artifact](https://github.com/actions/upload-artifact) from 2 to 3.
- [Release notes](https://github.com/actions/upload-artifact/releases)
- [Commits](https://github.com/actions/upload-artifact/compare/v2...v3)

---
updated-dependencies:
- dependency-name: actions/upload-artifact
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-04-12 13:23:41 +03:00
dependabot[bot]
4e55d557eb
Bump actions/checkout from 2 to 3 (#10390)
Bumps [actions/checkout](https://github.com/actions/checkout) from 2 to 3.
- [Release notes](https://github.com/actions/checkout/releases)
- [Changelog](https://github.com/actions/checkout/blob/main/CHANGELOG.md)
- [Commits](https://github.com/actions/checkout/compare/v2...v3)

---
updated-dependencies:
- dependency-name: actions/checkout
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-03-30 16:18:03 +03:00
dependabot[bot]
8df37363db
Bump actions/cache from 2 to 3 (#10463)
Bumps [actions/cache](https://github.com/actions/cache) from 2 to 3.
- [Release notes](https://github.com/actions/cache/releases)
- [Commits](https://github.com/actions/cache/compare/v2...v3)

---
updated-dependencies:
- dependency-name: actions/cache
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-03-30 16:16:21 +03:00
Oran Agra
16d206ee36
fix daily.yaml skip filters (#10490)
* missing parenthesis meant that the ubuntu and centos jobs were not
  skipped
* the recently divided freebsd, macos, and valgrind jobs, which are now
  split into distinct jobs for redis, modules, sentinel, and cluster, were
  all executed, producing a build but not running anything.
  now they're filtered at the job level
* iothreads was missing from the skip list defaults, so was not skipped
2022-03-29 18:35:17 +03:00
Oran Agra
1a57af629c
Split daily CI into smaller chunks (#10469)
this should help find the CI issues with freebsd and macos runs, and also
get faster results from valgrind and tls
2022-03-22 17:38:01 +02:00
蔡相跃
24da71e507
Fix typo "the the" (#10399) 2022-03-09 13:55:17 +02:00
Oran Agra
9478d5a134
enable daily CI on release branches (#10357) 2022-02-28 13:17:56 +02:00
Oran Agra
1193e96d02
Add workflow_dispatch filters for daily CI. (#10289)
sometimes you just wanna run one test on one system (e.g. memefficiency
on macos), so you want all other tests to be skipped
2022-02-13 17:43:19 +02:00
dependabot[bot]
edc050cc57
Bump vmactions/freebsd-vm from 0.1.5 to 0.1.6 (#10219) 2022-02-02 10:39:34 +02:00