11946 Commits

Author SHA1 Message Date
Roshan Khatri
fe2ef2616c Workflow changes to fix old release binaries (#1461)
- Moves `build-config.json` to the workflow dir to build old versions with
new configs.
- Enables contributors to test the release workflow on a private repo by
adding `github.event_name == 'workflow_dispatch' ||`

---------

Signed-off-by: Roshan Khatri <rvkhatri@amazon.com>
2025-01-08 11:35:54 -08:00
Binbin
14fb6d3487 Fix wrong file name in build-release-packages.yml (#1437)
Introduced in #1363, the file name does not match.

Signed-off-by: Binbin <binloveplay1314@qq.com>
2025-01-08 11:35:54 -08:00
Roshan Khatri
8b17e6a3d9 Fix the secret for the test bucket. (#1447)
We have set the secret as `AWS_S3_TEST_BUCKET` for the test bucket, and I
missed it in the initial review.

Signed-off-by: Roshan Khatri <rvkhatri@amazon.com>
2025-01-08 11:35:54 -08:00
Vu Diep
8786fcb762 Use configure-aws-credentials workflow instead of passing secret_access_key (#1363)
This PR fixes #1346: we get rid of long-term credentials by using
OpenID Connect. OpenID Connect (OIDC) allows your GitHub Actions
workflows to access resources in Amazon Web Services (AWS) without
needing to store the AWS credentials as long-lived GitHub secrets.

---------

Signed-off-by: vudiep411 <vdiep@amazon.com>
2025-01-08 11:35:54 -08:00
Binbin
d6cd90bc8e Skip build-release-packages CI job in forks (#1438)
The CI job was introduced in #1363, we should skip it in forks.

Signed-off-by: Binbin <binloveplay1314@qq.com>
2025-01-08 11:35:54 -08:00
Binbin
e59525f037 Skip IPv6 tests when TCLSH version is < 8.6 (#910)
In #786 we skipped it in the daily job, but not for the others.
When running ./runtest on MacOS, we get the following failure:
```
couldn't open socket: host is unreachable (nodename nor servname provided, or not known)
```

The reason is that TCL 8.5 doesn't support IPv6, so we skip tests
tagged with ipv6. This also reverts #786.

Signed-off-by: Binbin <binloveplay1314@qq.com>
2025-01-08 11:35:54 -08:00
Yury-Fridlyand
9fbd8ea344 Fix CI concurrency (#849)
A few CI improvements which will reduce CI queue occupation and eliminate
stale runs.

1. Kill CI jobs on PRs once the PR branch gets a new push. This will prevent
the situation that happened today - a huge job was triggered twice in less
than an hour and occupied the whole **org** runner queue (for all
repositories) for the rest of the day (see pic). This completely blocked the
valkey-glide team.
2. Spread the nightly cron jobs over time to prevent them from running
together. Keep in mind that cron's TZ is UTC, so midnight tasks hit
developers located in other timezones.

This must be backported to all release branches (`valkey-x.y` and `x.y`)

![image](https://github.com/user-attachments/assets/923d8237-3cb7-42f5-80c8-5322b3f5187d)

---------

Signed-off-by: Yury-Fridlyand <yury.fridlyand@improving.com>
2025-01-08 11:35:54 -08:00
Viktor Söderqvist
9308ed4ecb Skip IPv6 tests on MacOS (daily) (#786)
Signed-off-by: Viktor Söderqvist <viktor.soderqvist@est.tech>
2025-01-08 11:35:54 -08:00
Jonathan Wright
763b6f28ca Replace centos 7 with alternative versions (#543)
Replace CentOS 7 with AlmaLinux 8; add AlmaLinux 9, CentOS Stream 9, Fedora stable, and Fedora Rawhide.

Fixes #527

---------

Signed-off-by: Jonathan Wright <jonathan@almalinux.org>
Signed-off-by: Madelyn Olson <madelyneolson@gmail.com>
Co-authored-by: Madelyn Olson <madelyneolson@gmail.com>
2025-01-08 11:35:54 -08:00
Siddhartha Sankar Mondal
f5d106a90d Deprecate MacOS 11 build target (#524)
Deprecate MacOS 11 build target. End of life June 2024.  Fixes #523

---------

Signed-off-by: Siddhartha Mondal <siddharthmondal@gmail.com>
Co-authored-by: Madelyn Olson <madelyneolson@gmail.com>
Co-authored-by: Roshan Khatri <117414976+roshkhatri@users.noreply.github.com>
2025-01-08 11:35:54 -08:00
Madelyn Olson
3e0c587c08 Automatically notify the slack channel when tests fail (#509)
Adds a job that will automatically run at the end of the daily, which
will collect all the failed tests and send them to the developer slack.
It will include a link to the job as well.

Example job that ran on my private repo:
https://github.com/madolson/valkey/actions/runs/9123245899/job/25085418567

Example notification:
<img width="662" alt="image"
src="https://github.com/valkey-io/valkey/assets/34459052/69127db4-e416-4321-bc06-eefcecab1130">
(Note: I removed the sassy text at the bottom from the PR)

Signed-off-by: Madelyn Olson <madelyneolson@gmail.com>
2025-01-08 11:35:54 -08:00
Björn Svensson
d241d77f6c Pin versions of Github Actions in CI (#221)
Pin the GitHub Actions dependencies to a hash, following the secure
software development best practices recommended by the Open Source
Security Foundation (OpenSSF).

When developing a CI workflow, it's common to version-pin dependencies
(e.g. actions/checkout@v4). However, version tags are mutable, so a
malicious attacker could overwrite a version tag to point to a malicious
or vulnerable commit instead.
Pinning workflow dependencies by hash ensures the dependency is
immutable and its behavior is guaranteed.
See
https://github.com/ossf/scorecard/blob/main/docs/checks.md#pinned-dependencies

Dependabot supports updating a hash and the version comment, so its
updates will continue to work as before.

Links to the used actions and their tag/hash for review/validation:
https://github.com/actions/checkout/tags    (v4.1.2 was rolled back)
https://github.com/github/codeql-action/tags
https://github.com/maxim-lobanov/setup-xcode/tags
https://github.com/cross-platform-actions/action/releases/tag/v0.22.0
https://github.com/py-actions/py-dependency-install/tags
https://github.com/actions/upload-artifact/tags
https://github.com/actions/setup-node/tags
https://github.com/taiki-e/install-action/releases/tag/v2.32.2

This PR is part of #211.

Signed-off-by: Björn Svensson <bjorn.a.svensson@est.tech>
2025-01-08 11:35:54 -08:00
Vitah Lin
ae6c6495bf Add Codecov for Automated Code Coverage (#316)
This PR introduces Codecov to automate code coverage tracking for our
project's tests.

For more information about the Codecov platform, please refer to
https://docs.codecov.com/docs/quick-start

---------

Signed-off-by: Vitah Lin <vitahlin@gmail.com>
Co-authored-by: Madelyn Olson <madelyneolson@gmail.com>
2025-01-08 11:35:54 -08:00
Björn Svensson
4b2edc68ca Set permissions for Github Actions in CI (#312)
This sets the default permission for the current CI workflows to only be
able to read from the repository (scope: "contents").
When a GitHub Action requires additional permissions (like CodeQL does),
we grant that permission at job level instead.

This means that a compromised action will not be able to modify the repo
or even steal secrets, since all other permission scopes are implicitly
set to "none", i.e. not permitted. This is recommended by
[OpenSSF](https://github.com/ossf/scorecard/blob/main/docs/checks.md#token-permissions).

This PR includes a small fix for the possibility of missing server-log
artifacts, found while verifying the permissions.
The `upload-artifact@v3` action will replace artifacts which already
exist. Since both CI jobs `test-external-standalone` and
`test-external-nodebug` use the same artifact name, when both jobs
fail, we only get logs from the last finished job. This can be avoided
by using unique artifact names.

This PR is part of #211

More about permissions and scope can be found here:

https://docs.github.com/en/actions/using-workflows/workflow-syntax-for-github-actions#permissions

---------

Signed-off-by: Björn Svensson <bjorn.a.svensson@est.tech>
2025-01-08 11:35:54 -08:00
Harkrishn Patro
62b42707ea Add release notes for 7.2.8
2025-01-08 11:35:54 -08:00
Madelyn Olson
e04acb377e Fix LUA garbage collector (CVE-2024-46981) (#1513)
Reset the GC state before closing the Lua VM to prevent user data from
being wrongly freed while it might still be used in destructor callbacks.

Created and published by Redis in their OSS branch.
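The shape of the fix, as a minimal sketch against the Lua 5.1 C API (the real change lives in the server's scripting engine, so take this as illustrative only):

```c
#include <lua.h>

/* Make sure the collector is back in a normal running state before
 * closing the VM, so __gc destructors that run during lua_close() don't
 * observe a GC state in which their data was already freed. */
static void close_lua_vm(lua_State *lua) {
    lua_gc(lua, LUA_GCRESTART, 0); /* reset/resume the collector */
    lua_close(lua);
}
```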

Signed-off-by: Madelyn Olson <madelyneolson@gmail.com>
Co-authored-by: YaacovHazan <yaacov.hazan@redis.com>
2025-01-08 11:35:54 -08:00
Madelyn Olson
bc1680d7e6 Fix Read/Write key pattern selector (CVE-2024-51741) (#1514)
The explanation on the original commit was wrong. Key-based access must
have a `~` in order to correctly configure which key prefixes to apply
the selector to. If this is missing, a server assert will be triggered
later.

Signed-off-by: Madelyn Olson <madelyneolson@gmail.com>
Co-authored-by: YaacovHazan <yaacov.hazan@redis.com>
2025-01-08 11:35:54 -08:00
gmbnomis
6101248fb0 Use the correct command proc for the LOOKUP_NOTOUCH exception in lookupKey (#1499)
When looking up a key in no-touch mode, `LOOKUP_NOTOUCH` is set to avoid
updating the last access time in `lookupKey`. An exception must be made
for the `TOUCH` command which must always update the key.

When called from a script, `server.executing_client` will point to the
`TOUCH` command, while `server.current_client` will point to e.g. an
`EVAL` command. So, we must use the former to find out the currently
executing command if defined.
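A minimal sketch of that selection (`server.executing_client`, `server.current_client` and `LOOKUP_NOTOUCH` come from the text above; the surrounding check is illustrative):

```c
/* Prefer the innermost executing command (e.g. TOUCH inside EVAL) over
 * the outer client's command when honoring LOOKUP_NOTOUCH. */
client *c = server.executing_client ? server.executing_client
                                    : server.current_client;
if (c && c->cmd && c->cmd->proc == touchCommand) {
    /* TOUCH must always update the key's last access time. */
}
```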

This fix addresses the issue where TOUCH wasn't updating key access
times when called from scripts like EVAL.

Fixes #1498

Signed-off-by: Simon Baatz <gmbnomis@gmail.com>
Co-authored-by: Binbin <binloveplay1314@qq.com>
2025-01-08 11:35:54 -08:00
muelstefamzn
51fcd6e4ca Trim free space from inline command argument strings to avoid excess memory usage (#1213)
The command argument strings created while parsing inline commands (see
`processInlineBuffer()`) can contain free capacity. Since some commands,
such as `SET`, store these strings in the database, that free capacity
increases the memory usage. In the worst case, it could double the
memory usage.

This only occurs if the inline command format is used. The argument
strings are built by appending character by character in
`sdssplitargs()`. Regular RESP commands are not affected.

This change trims the strings within `processInlineBuffer()`.
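A minimal sketch of the idea, assuming the one-argument form of `sdsRemoveFreeSpace()` from the bundled sds library (the actual patch may differ):

```c
/* After sdssplitargs() builds the inline argv, trim each sds string so a
 * value stored in the database doesn't drag the allocator's spare
 * capacity along with it. */
int argc;
sds *argv = sdssplitargs(line, &argc);
for (int i = 0; i < argc; i++)
    argv[i] = sdsRemoveFreeSpace(argv[i]);
```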

When the command argument string is packed into an object,
`trimStringObjectIfNeeded()` is called, but it only trims the string if
it is larger than `PROTO_MBULK_BIG_ARG` (32kB), as only strings larger
than this would ever need trimming if the command is sent using the bulk
string format.

We could modify this condition, but that would potentially have a
performance impact on commands using the bulk format. Since those make
up the vast majority of executed commands, limiting this change to
inline commands seems prudent.

To measure the impact, the following experiment was run:

* 1 million `SET [key] [value]` commands
* Random keys (16 bytes)
* 600-byte values

Memory usage without this change:

```
used_memory:1089327888
used_memory_human:1.01G
used_memory_rss:1131696128
used_memory_rss_human:1.05G
used_memory_peak:1089348264
used_memory_peak_human:1.01G
used_memory_peak_perc:100.00%
used_memory_overhead:49302800
used_memory_startup:911808
used_memory_dataset:1040025088
used_memory_dataset_perc:95.55%
```

Memory usage with this change:
```
used_memory:705327888
used_memory_human:672.65M
used_memory_rss:718802944
used_memory_rss_human:685.50M
used_memory_peak:705348256
used_memory_peak_human:672.67M
used_memory_peak_perc:100.00%
used_memory_overhead:49302800
used_memory_startup:911808
used_memory_dataset:656025088
used_memory_dataset_perc:93.13%
```

If the same experiment is repeated using the normal RESP array-of-bulk-strings
format (`*3\r\n$3\r\nSET\r\n...`), then the memory usage is 672MB
both with and without this change.

If a replica is attached, its memory usage is 672MB with and without
this change, since the replication link never uses inline commands.

Signed-off-by: Stefan Mueller <muelstef@amazon.com>
2025-01-08 11:35:54 -08:00
Binbin
ccc8acec9d Fix FUNCTION KILL error message being displayed as SCRIPT KILL (#1171)
The client that was killed by FUNCTION KILL received a reply of
SCRIPT KILL and the server log also showed SCRIPT KILL.

Signed-off-by: Binbin <binloveplay1314@qq.com>
2025-01-08 11:35:54 -08:00
Madelyn Olson
357191acb9 Apply security fixes for CVEs (#1113)
Apply the security fixes for the release.

(CVE-2024-31449) Lua library commands may lead to stack overflow and
potential RCE.
(CVE-2024-31227) Potential Denial-of-service due to malformed ACL
selectors.
(CVE-2024-31228) Potential Denial-of-service due to unbounded pattern
matching.

---------

Signed-off-by: Madelyn Olson <madelyneolson@gmail.com>
7.2.7
2024-10-02 13:11:08 -07:00
Melroy van den Berg
44a7c35d31 Build binary releases with systemd support (#1107)
- Add systemd support to the build artifact tarballs, so people can use
them under systemd-compatible distros. As discussed here:
https://github.com/orgs/valkey-io/discussions/1103#discussioncomment-10815549.
Add `libsystemd-dev` to the install step and `USE_SYSTEMD=yes` to the
build.
- Cleanup & bring the arm & x86 workflow files in sync. It was a bit of
a mess ;) (removing `jq wget awscli` from the 'Tarball' step)

Signed-off-by: Melroy van den Berg <melroy@melroy.org>
2024-10-02 20:11:12 +02:00
Melroy van den Berg
c789c3f1d1 Avoid .c, .d and .o files from being copied to the binary tar.gz releases (#1106)
As discussed here:
https://github.com/orgs/valkey-io/discussions/1103#discussioncomment-10814006

`cp` can't be used anymore; `rsync` is more powerful and allows
excluding files.

Alternatively:

1. Remove the .c, .d and .o files, which isn't ideal either.
2. Improve the build, e.g. by building inside a `build` directory instead
of in the src folder.

P.S. I know these workflows aren't triggered by this PR, only via the
"Build Release Packages" workflow action:
https://github.com/valkey-io/valkey/actions/workflows/build-release-packages.yml
So I can't fully test them in this PR. But it should work ^^

P.S. P.S. I did test the `rsync -av --exclude='*.c' --exclude='*.d'
--exclude='*.o' src/valkey-*` command in isolation and it works as
expected!

---------

Signed-off-by: Melroy van den Berg <melroy@melroy.org>
2024-10-02 20:11:12 +02:00
Madelyn Olson
ef56d713d5 Prepare for 7.2.7 release without security fixes
Signed-off-by: Madelyn Olson <madelyneolson@gmail.com>
2024-09-30 15:25:39 -07:00
Binbin
6d03409e01 Fix module RdbLoad wrongly disable the AOF (#1001)
In RdbLoad, we disable AOF before emptyData and rdbLoad to prevent copy-on-write issues. After rdbLoad completes, AOF should be re-enabled, but the code incorrectly checks server.aof_state, which has been reset to AOF_OFF in stopAppendOnly. This leads to AOF not being re-enabled after being disabled.
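A hedged sketch of the corrected flow (`stopAppendOnly`/`startAppendOnly` follow the server's AOF API; the structure is illustrative):

```c
/* Remember the AOF state *before* disabling it: stopAppendOnly() resets
 * server.aof_state to AOF_OFF, so checking it afterwards always reads
 * "off" and AOF would never be re-enabled. */
int aof_was_enabled = (server.aof_state != AOF_OFF);
if (aof_was_enabled) stopAppendOnly();

/* ... emptyData() and rdbLoad() run here without AOF copy-on-write cost ... */

if (aof_was_enabled) startAppendOnly(); /* re-enable from the saved flag */
```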
---------

Signed-off-by: Binbin <binloveplay1314@qq.com>
2024-09-30 15:17:13 -07:00
Binbin
45ae39af04 Fix missing replication link re-connection when primary's IP/port is updated in clusterProcessGossipSection (#965)
`clusterProcessGossipSection` currently doesn't trigger a check and call `replicationSetPrimary` when `myself`'s primary node’s IP/port is updated. This fix ensures that after every node address update, `replicationSetPrimary` is called if the updated node is `myself`'s primary. This prevents missed updates and ensures that replicas reconnect properly to maintain their replication link with the primary.
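A hedged sketch of the check (`myself` and `replicationSetPrimary` come from the message above; the field names and call shape are illustrative):

```c
/* After updating `node`'s IP/port from gossip: if that node is myself's
 * primary, re-point the replication link at the new address. */
if (nodeIsReplica(myself) && myself->replicaof == node) {
    replicationSetPrimary(node->ip, node->port);
}
```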

Signed-off-by: Madelyn Olson <madelyneolson@gmail.com>
2024-09-30 15:07:33 -07:00
Ted Lyngmo
2be94e68e8 Log the real reason for why posix_fadvise failed (#430)
`reclaimFilePageCache` did not set `errno`, but `rdbSaveInternal`, which
logs the error, assumed it did. This makes sure `errno` is set.
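A self-contained sketch of the pattern (the function name follows the commit; the body is illustrative): `posix_fadvise()` reports failure through its return value rather than `errno`, so the caller has to set `errno` itself.

```c
#include <errno.h>
#include <fcntl.h>

static int reclaimFilePageCache(int fd, off_t offset, off_t length) {
    int ret = posix_fadvise(fd, offset, length, POSIX_FADV_DONTNEED);
    if (ret != 0) {
        errno = ret; /* make strerror(errno) meaningful in the caller's log */
        return -1;
    }
    return 0;
}
```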

Signed-off-by: Ted Lyngmo <ted@lyncon.se>
2024-09-30 14:58:31 -07:00
Roshan Khatri
50eefd647d [Cherry-Pick] Adds workflows to build release binaries and push to S3 (#315) (#857)
Related to https://github.com/valkey-io/valkey/issues/230

Adds workflows to build Valkey binaries and push them to S3 to make them
available to download from the website.

The workflows can be triggered by pushing a release to the repo, or
manually by one of the maintainers.

Once the workflow triggers, it generates a matrix of jobs for the
platforms we need to build from `utils/releasetools/build-config.json`,
and then the respective jobs are triggered. These jobs build the Valkey
binaries for each platform we want to release and push them to a
private S3 bucket.

---------

Signed-off-by: Roshan Khatri <rvkhatri@amazon.com>
2024-09-30 14:21:33 -07:00
Pierre
68d887636e Fix crash in active defrag (#883)
This crash would happen if we disable active defrag while it is running,
then re-enable active defrag. In this case the `expires_cursor` variable
in `activeDefragCycle()` would not be reset to 0 when disabling active
defrag, so when re-enabling it, this variable can still be non-zero,
which causes the `db` and other variables to not be initialized before
starting the defrag process.
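A hedged sketch of the shape of the fix (the commit names `expires_cursor`; the other state variables here are illustrative):

```c
/* activeDefragCycle()-style scan state kept across calls. */
static int current_db = -1;
static unsigned long cursor = 0;
static unsigned long expires_cursor = 0;

if (!server.active_defrag_enabled) {
    /* Defrag disabled mid-scan: reset *all* cursors, not just `cursor`,
     * so a later re-enable starts from fully initialized state. */
    current_db = -1;
    cursor = 0;
    expires_cursor = 0;
    return;
}
```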

Signed-off-by: Pierre Turin <pieturin@amazon.com>
2024-08-12 13:38:20 -07:00
Ping Xie
579cca5f00 Valkey 7.2.6 Patch Release (#842)
Signed-off-by: Ping Xie <pingxie@google.com>
Signed-off-by: Ping Xie <pingxie@outlook.com>
Co-authored-by: Madelyn Olson <madelyneolson@gmail.com>
7.2.6
2024-07-30 16:40:02 -07:00
Harkrishn Patro
3aaa129ce4 Generate correct slot information in cluster shards command on primary failure (#790)
Fix #784

Prior to the change, `CLUSTER SHARDS` command processing might pick a
failed primary node which won't have the slot coverage information, so
the slots output in turn would be empty. This change finds an
appropriate node which has the slot coverage information for a given
shard and correctly displays it as part of the `CLUSTER SHARDS` output.

 Before:

 ```
 1) 1) "slots"
   2)  (empty array)
   3) "nodes"
   4) 1)  1) "id"
          2) "2936f22a490095a0a851b7956b0a88f2b67a5d44"
          ...
          9) "role"
         10) "master"
         ...
         13) "health"
         14) "fail"
 ```

 After:

 ```
 1) 1) "slots"
   2) 1) 0
       2) 5461
   3) "nodes"
   4) 1)  1) "id"
          2) "2936f22a490095a0a851b7956b0a88f2b67a5d44"
          ...
          9) "role"
         10) "master"
         ...
         13) "health"
         14) "fail"
 ```

---------

Signed-off-by: Harkrishn Patro <harkrisp@amazon.com>
2024-07-22 23:48:00 -07:00
Yossi Gottlieb
6df023fb98 Reduce FreeBSD daily scope. (#12758)
The full test is very flaky running on a VM inside a GitHub worker, so we
have to settle for only building and running a small smoke test.

Signed-off-by: Ping Xie <pingxie@google.com>
2024-07-15 14:29:28 -07:00
Jonathan Wright
7cb3426a4b Replace centos 7 with alternative versions (#543)
Replace CentOS 7 with AlmaLinux 8; add AlmaLinux 9, CentOS Stream 9, Fedora stable, and Fedora Rawhide.

Fixes #527

---------

Signed-off-by: Jonathan Wright <jonathan@almalinux.org>
Signed-off-by: Madelyn Olson <madelyneolson@gmail.com>
Co-authored-by: Madelyn Olson <madelyneolson@gmail.com>
Signed-off-by: Ping Xie <pingxie@google.com>
2024-07-12 15:01:57 -07:00
Madelyn Olson
5ab3d1b981 Skip tls for xgroup read regression since it doesn't matter
Signed-off-by: Madelyn Olson <madelyneolson@gmail.com>
2024-07-11 20:02:30 -07:00
Ping Xie
ad0a24c742 Add missing test helper function
Signed-off-by: Ping Xie <pingxie@google.com>
2024-07-09 19:39:40 -07:00
Binbin
010609e8a0 Make valkey compatible with redis-sentinel to start sentinel (#731)
We already have similar changes for check-rdb / check-aof; apply
this change to sentinel.

Fixes #719.

Signed-off-by: Binbin <binloveplay1314@qq.com>
Signed-off-by: Ping Xie <pingxie@google.com>
2024-07-09 19:39:40 -07:00
KarthikSubbarao
0210b642ce Allow Module authentication to succeed when cluster is down (#693)
Module authentication using a blocking implementation currently gets
rejected with "cluster is down" by the client timeout cron job
(`clientsCronHandleTimeout`).

This PR exempts clients blocked on module authentication from being
rejected there.

---------

Signed-off-by: KarthikSubbarao <karthikrs2021@gmail.com>
Signed-off-by: Ping Xie <pingxie@google.com>
2024-07-09 19:39:40 -07:00
Binbin
d746556e4f Only primary with slots has the right to mark a node as failed (#634)
In markNodeAsFailingIfNeeded we count needed_quorum and failures.
needed_quorum is half of cluster->size plus one, and cluster->size is
the number of primary nodes that serve slots, but when counting
failures we did not check whether the primary has slots.

Only a primary that serves slots has the right to vote; this adds a new
clusterNodeIsVotingPrimary helper to formalize the concept, as sketched
below.
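A sketch of that helper (the name comes from the commit; the exact checks are illustrative):

```c
/* Only a primary that serves at least one slot may vote in failure
 * detection, so only such nodes count toward the quorum. */
static inline int clusterNodeIsVotingPrimary(clusterNode *n) {
    return clusterNodeIsPrimary(n) && n->numslots > 0;
}
```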

Release notes:

Bugfix: nodes not in the quorum group might spuriously mark nodes
as failed.

---------

Signed-off-by: Binbin <binloveplay1314@qq.com>
Co-authored-by: Ping Xie <pingxie@outlook.com>
Signed-off-by: Ping Xie <pingxie@google.com>
2024-07-09 19:39:40 -07:00
Sankar
c4244b502e Make cluster meet reliable under link failures (#461)
When there is a link failure while an ongoing MEET request is in flight,
the sending node stops sending further MEETs and starts sending PINGs.
Since every node responds to PINGs from unknown nodes with a PONG, the
receiving node never adds the sending node, but the sending node adds
the receiving node when it sees a PONG. This can lead to asymmetry in
cluster membership. This change makes the sender keep sending MEET
until it sees a PONG, avoiding the asymmetry.
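A minimal sketch of the idea (the message-type constants follow the cluster bus naming; the surrounding logic is illustrative):

```c
/* Keep sending MEET until the peer's PONG confirms it has added us,
 * instead of clearing the MEET flag after the first send. */
int type = (node->flags & CLUSTER_NODE_MEET) ? CLUSTERMSG_TYPE_MEET
                                             : CLUSTERMSG_TYPE_PING;
clusterSendPing(node->link, type);
```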

---------

Signed-off-by: Sankar <1890648+srgsanky@users.noreply.github.com>
Signed-off-by: Ping Xie <pingxie@google.com>
2024-07-09 19:39:40 -07:00
nitaicaro
17b7b8a733 Fix crash where command duration is not reset when client is blocked … (#526)
In #11012, we changed the way command durations were computed to handle
the same command being executed multiple times. In #11970, we added an
assert if the duration is not properly reset, potentially indicating
that a call to report statistics was missed.

I found an edge case where this happens - easily reproduced by blocking
a client on `XREADGROUP` and migrating the stream's slot. This causes
the engine to process the `XREADGROUP` command twice:

1. First time, we are blocked on the stream, so we wait for the unblock
to come back to it a second time. In most cases, when we come back to
process the command a second time after unblock, we process the command
normally, which includes recording the duration and then resetting it.
2. After unblocking we come back to process the command, and this is
where we hit the edge case - at this point, we had already migrated the
slot to another node, so we return a `MOVED` response. But when we do
that, we don’t reset the duration field.

Fix: also reset the duration when returning a `MOVED` response. I think
this is right, because the client should redirect the command to the
right node, which in turn will calculate the execution duration.

Also wrote a test which reproduces this; it fails without the fix and
passes with it.
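A hedged sketch of the fix's shape (the `duration` field follows the PRs referenced above; the reply is illustrative):

```c
/* When redirecting instead of executing a previously blocked command,
 * also reset the per-command duration so reprocessing starts clean and
 * the duration-reset assert cannot fire. */
c->duration = 0;
addReplyError(c, "MOVED 1234 127.0.0.1:6379"); /* illustrative MOVED reply */
```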

---------

Signed-off-by: Nitai Caro <caronita@amazon.com>
Co-authored-by: Nitai Caro <caronita@amazon.com>
Signed-off-by: Ping Xie <pingxie@google.com>
2024-07-09 19:39:40 -07:00
Viktor Söderqvist
ee6b93c899 Don't let install flags affect build (#382)
Don't let the Make variable `USE_REDIS_SYMLINKS` affect the build.
If it does, it causes the second line in the example below (`make
install`) to recompile what was already compiled on the line above, and
this time it's built without BUILD_TLS=yes USE_SYSTEMD=yes.

    make BUILD_TLS=yes USE_SYSTEMD=yes
    make PREFIX=custom/usr USE_REDIS_SYMLINKS=no install

Fixes #377

Signed-off-by: Viktor Söderqvist <viktor.soderqvist@est.tech>
Signed-off-by: Ping Xie <pingxie@google.com>
2024-07-09 19:39:40 -07:00
Yanqi Lv
2342db3927 fix wrong data type conversion in zrangeResultBeginStore (#13148)
In `beginResultEmission`, -1 means the result length is not known in
advance. But after #12185, if we pass -1 to `zrangeResultBeginStore`, it
will be converted to SIZE_MAX in `zsetTypeCreate` and trigger a
`dictExpand`. Although `dictExpand` won't succeed because the size
overflows, we'd better avoid this wrong conversion.

This bug can be triggered when the source of `zrangestore` doesn't exist
or we use `zrangestore` command with `byscore` or `bylex`.
The impact is that dst keys will be converted to use skiplist instead of
listpack.
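A self-contained sketch of the conversion pitfall (generic names, not the server's):

```c
#include <stdio.h>

int main(void) {
    long length = -1;                /* sentinel: result length unknown */
    size_t wrong = (size_t)length;   /* wraps to SIZE_MAX: a huge dictExpand request */
    size_t safe = length > 0 ? (size_t)length : 0; /* guard before converting */
    printf("wrong=%zu safe=%zu\n", wrong, safe);
    return 0;
}
```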

Signed-off-by: Ping Xie <pingxie@google.com>
2024-07-02 00:24:19 -07:00
Binbin
306ce81c10 Fix redis-check-aof incorrectly considering data in manifest format as MP-AOF (#12958)
The check in fileIsManifest misjudged the manifest file. For example,
if a RESP AOF contains "file", it will be considered a manifest file and
the check will fail:
```
*3
$3
set
$4
file
$4
file
```

In #12951, if the preamble AOF also contains it, it will also fail.
Fixes #12951.

The bug was happening if the word "file" was mentioned in the first
1024 lines of the AOF. Now, as soon as it finds a non-comment line it
will stop scanning (whether it contains "file" or not).

Signed-off-by: Ping Xie <pingxie@google.com>
2024-07-02 00:24:19 -07:00
Matthew Douglass
d9e20f2964 Fix conversion of numbers in lua args to redis args (#13115)
Since lua_Number is not explicitly an integer or a double, we need to
make an effort to convert it to an integer when that's possible, since
the string could later be used in a context that doesn't support
scientific notation (e.g. 1e9 instead of 1000000000).

Since fpconv_dtoa converts numbers with the equivalent of `%f` or `%e`,
whichever is shorter, this would break if we try to pass a long integer
to a command that takes an integer: we'll get an implicit conversion to
string in Lua, and then the parsing in getLongLongFromObjectOrReply
will fail.

```
> eval "redis.call('hincrby', 'key', 'field', '1000000000')" 0
(nil)
> eval "redis.call('hincrby', 'key', 'field', tonumber('1000000000'))" 0
(error) ERR value is not an integer or out of range script: ac99c32e4daf7e300d593085b611de261954a946, on @user_script:1.
```

Switch to using ll2string if the number can be safely represented as a
long long.

The problem was introduced in #10587 (Redis 7.2).
Closes #13113.
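A hedged sketch of the approach, using standard C in place of the server's `ll2string`/`fpconv_dtoa` helpers:

```c
#include <stdio.h>

typedef double lua_Number; /* Lua 5.1's default number type */

/* Format as an integer when the value is exactly representable as a
 * long long, otherwise fall back to double formatting. A real
 * implementation would also range-check num before the cast. */
static int number_to_string(lua_Number num, char *buf, size_t len) {
    long long ll = (long long)num;
    if ((lua_Number)ll == num) return snprintf(buf, len, "%lld", ll);
    return snprintf(buf, len, "%.17g", num);
}
```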

---------

Co-authored-by: Binbin <binloveplay1314@qq.com>
Co-authored-by: debing.sun <debing.sun@redis.com>
Co-authored-by: Oran Agra <oran@redislabs.com>
Signed-off-by: Ping Xie <pingxie@google.com>
2024-07-02 00:24:19 -07:00
LiiNen
f91dd1d498 Fix redis-cli --count (for --scan, --bigkeys, etc) was ignored unless --pattern was also used (#13092)
The --count option for redis-cli was released in redis 7.2:
https://github.com/redis/redis/pull/12042
But I found in the code that some logic was missing for this 'count'
option:

```
static redisReply *sendScan(unsigned long long *it) {
    redisReply *reply;

    if (config.pattern)
        reply = redisCommand(context, "SCAN %llu MATCH %b COUNT %d",
            *it, config.pattern, sdslen(config.pattern), config.count);
    else
        reply = redisCommand(context,"SCAN %llu",*it);
```

The intention was to be able to use the scan count, but --count was
only applied when 'pattern' was declared. So I fixed it, simply, to work
properly even if the --pattern option is not used. The fix is sketched
below.
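A hedged sketch of the fixed function (assuming the patch mirrors the MATCH branch; the actual diff may differ slightly):

```c
static redisReply *sendScan(unsigned long long *it) {
    redisReply *reply;

    if (config.pattern)
        reply = redisCommand(context, "SCAN %llu MATCH %b COUNT %d",
            *it, config.pattern, sdslen(config.pattern), config.count);
    else
        /* Apply COUNT here too, so --count works without --pattern. */
        reply = redisCommand(context, "SCAN %llu COUNT %d",
            *it, config.count);
    return reply;
}
```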

I tested it simply with the time command several times, and I could see
it works as intended with this commit. Example test results are below:
```
# unstable build

time(./redis-cli -a $AUTH -p $PORT -h $HOST --scan >/dev/null 2>/dev/null)

real    0m1.287s
user    0m0.011s
sys     0m0.022s

# count is not applied
time(./redis-cli -a $AUTH -p $PORT -h $HOST --scan --count 1000 >/dev/null 2>/dev/null)

real    0m1.117s
user    0m0.011s
sys     0m0.020s

# count is applied with --pattern

time(./redis-cli -a $AUTH -p $PORT -h $HOST --scan --count 1000 --pattern "hash:*" >/dev/null 2>/dev/null)

real    0m0.045s
user    0m0.002s
sys     0m0.002s
```

```
# fix-redis-cli-scan-count build
time(./redis-cli -a $AUTH -p $PORT -h $HOST --scan >/dev/null 2>/dev/null)

real    0m1.084s
user    0m0.008s
sys     0m0.024s

# count is applied even if --pattern is not declared
time(./redis-cli -a $AUTH -p $PORT -h $HOST --scan --count 1000 >/dev/null 2>/dev/null)

real    0m0.043s
user    0m0.000s
sys     0m0.004s

# of course this also applied
time(./redis-cli -a $AUTH -p $PORT -h $HOST --scan --count 1000 --pattern "hash:*" >/dev/null 2>/dev/null)

real    0m0.031s
user    0m0.002s
sys     0m0.002s
```

Thanks a lot.

Signed-off-by: Ping Xie <pingxie@google.com>
2024-07-02 00:24:19 -07:00
Binbin
fa1bba8619 Increase tolerance range to block reprocess tests to avoid timing issues (#13053)
These tests have all failed in daily CI:
```
*** [err]: Blocking XREADGROUP for stream key that has clients blocked on stream - reprocessing command in tests/unit/type/stream-cgroups.tcl
Expected '1101' to be between to '1000' and '1100' (context: type eval line 23 cmd {assert_range [expr $end-$start] 1000 1100} proc ::test)

*** [err]: BLPOP unblock but the key is expired and then block again - reprocessing command in tests/unit/type/list.tcl
Expected '1101' to be between to '1000' and '1100' (context: type eval line 23 cmd {assert_range [expr $end-$start] 1000 1100} proc ::test)

*** [err]: BZPOPMIN unblock but the key is expired and then block again - reprocessing command in tests/unit/type/zset.tcl
Expected '1103' to be between to '1000' and '1100' (context: type eval line 23 cmd {assert_range [expr $end-$start] 1000 1100} proc ::test)
```

Increase the range to avoid failures, and improve the comment to be
clearer. The tests were introduced in #13004.

Signed-off-by: Ping Xie <pingxie@google.com>
2024-07-02 00:24:19 -07:00
debing.sun
01473fbd5c Fix crash due to merge of quicklist node introduced by #12955 (#13040)
Fix two crashes introduced by #12955.

When a quicklist node can't be inserted and split, we eventually merge
the current node with its neighboring nodes after inserting, and
compress the current node and its siblings.

1. When the current node is merged with another node, the current node
may become invalid and can no longer be used.

   Solution: let `_quicklistMergeNodes()` return the merged nodes.

2. If the current node is an LZF quicklist node, its recompress will be
1. If the split node can be merged with a sibling node to become head or
tail, recompress may cause the head and tail to be compressed, which is
not allowed.

    Solution: always reset recompress to 0 after merging, as sketched below.
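A hedged sketch of the two fixes together (function and field names follow the commit message; the surrounding code is illustrative):

```c
/* 1) Merging may free the node we were working on, so continue with the
 *    node _quicklistMergeNodes() reports as the survivor.
 * 2) Clear recompress afterwards so a later recompress pass cannot
 *    compress a node that has become head or tail. */
node = _quicklistMergeNodes(quicklist, node);
node->recompress = 0;
```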

Signed-off-by: Ping Xie <pingxie@google.com>
2024-07-02 00:24:19 -07:00
debing.sun
b685fadb77 Prevent LSET command from causing quicklist plain node size to exceed 4GB (#12955)
Fix #12864

The main reason for this crash is that when replacing an element of a
quicklist packed node with the lpReplace() method, if the final size is
larger than 4GB, lpReplace() fails and returns NULL, causing
`node->entry` to be incorrectly set to NULL.

Since the inserted data is not a large element, we can't just replace it
like a large element (first quicklistInsertAfter() and then
quicklistDelIndex()), because the current node may be merged and
invalidated in quicklistInsertAfter().

The solution of this PR:
When replacing a node fails (listpack exceeds 4GB), split the current
node, create a new node to put in the middle, and try to merge them.
This is the same as inserting a large element.
In the worst case, its size will not exceed 4GB.

Signed-off-by: Ping Xie <pingxie@google.com>
2024-07-02 00:24:19 -07:00
Oran Agra
e4b88bb10f update redis-check-rdb types (#12969)
It seems that we forgot to update the types array in redis-check-rdb.

Signed-off-by: Ping Xie <pingxie@google.com>
2024-07-02 00:24:19 -07:00
Binbin
900ae7aed6 Fix timeout not being set in module blockClient case (#13011)
This was introduced in #13004, which missed this assignment. It causes
timeout to be a random value (possibly less than now); then in the
`Unblock by timer` test, the client is unblocked and calls
timeout_callback, and since the callback is NULL, the server will crash.

The crash stack is:
```
beforesleep
handleBlockedClientsTimeout
checkBlockedClientTimeout
unblockClientOnTimeout
replyToBlockedClientTimedOut
moduleBlockedClientTimedOut
-- the timeout_callback is NULL, invalidFunctionWasCalled
bc->timeout_callback(&ctx,(void**)c->argv,c->argc);
```
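A hedged sketch of the missing assignment (field paths are illustrative; the commit only says the timeout assignment was missed):

```c
/* Compute the absolute timeout when blocking the client on the module,
 * instead of leaving the field with a stale or random value. */
mstime_t timeout = 0;
if (timeout_ms) timeout = mstime() + timeout_ms;
c->bstate.timeout = timeout;
blockClient(c, BLOCKED_MODULE);
```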

Signed-off-by: Ping Xie <pingxie@google.com>
2024-07-02 00:24:19 -07:00