The important part is that read-only scripts (not just EVAL_RO
and FCALL_RO, but also ones with `no-writes` executed by normal EVAL or
FCALL), will now be permitted to run during CLIENT PAUSE WRITE (unlike
before where only the _RO commands would be processed).
Other than that, some errors like OOM, READONLY, MASTERDOWN are now
handled by processCommand, rather than the command itself affects the
error string (and even error code in some cases), and command stats.
Besides that, now the `may-replicate` commands, PFCOUNT and PUBLISH, will
be considered `write` commands in scripts and will be blocked in all
read-only scripts just like other write commands.
They'll also be blocked in EVAL_RO (i.e. even for scripts without the
`no-writes` shebang flag.
This commit also hides the `may_replicate` flag from the COMMAND command
output. this is a **breaking change**.
background about may_replicate:
We don't want to expose a no-may-replicate flag or alike to scripts, since we
consider the may-replicate thing an internal concern of redis, that we may
some day get rid of.
In fact, the may-replicate flag was initially introduced to flag EVAL: since
we didn't know what it's gonna do ahead of execution, before function-flags
existed). PUBLISH and PFCOUNT, both of which because they have side effects
which may some day be fixed differently.
code changes:
The changes in eval.c are mostly code re-ordering:
- evalCalcFunctionName is extracted out of evalGenericCommand
- evalExtractShebangFlags is extracted luaCreateFunction
- evalGetCommandFlags is new code
* Fix broken protocol when redis can't persist to RDB (general commands, not
modules), excessive newline. regression of #10372 (7.0 RC3)
* Fix broken protocol when Redis can't persist to AOF (modules and
scripts), missing newline.
* Fix bug in OOM check of EVAL scripts called from RM_Call.
set the cached OOM state for scripts before executing module commands too,
so that it can serve scripts that are executed by modules.
i.e. in the past EVAL executed by RM_Call could have either falsely
fail or falsely succeeded because of a wrong cached OOM state flag.
* Fix bugs with RM_Yield:
1. SHUTDOWN should only accept the NOSAVE mode
2. Avoid eviction during yield command processing.
3. Avoid processing master client commands while yielding from another client
* Add new two more checks to RM_Call script mode.
1. READONLY You can't write against a read only replica
2. MASTERDOWN Link with MASTER is down and `replica-serve-stale-data` is set to `no`
* Add new RM_Call flag to let redis automatically refuse `deny-oom` commands
while over the memory limit.
* Add tests to cover various errors from Scripts, Modules, Modules
calling scripts, and Modules calling commands in script mode.
Add tests:
* Looks like the MISCONF error was completely uncovered by the tests,
add tests for it, including from scripts, and modules
* Add tests for NOREPLICAS from scripts
* Add tests for the various errors in module RM_Call, including RM_Call that
calls EVAL, and RM_call in "eval mode". that includes:
NOREPLICAS, READONLY, MASTERDOWN, MISCONF
The amount of `server.stat_net_output_bytes/server.stat_net_input_bytes`
is actually the sum of replication flow and users' data flow.
It may cause confusions like this:
"Why does my server get such a large output_bytes while I am doing nothing? ".
After discussions and revisions, now here is the change about what this
PR brings (final version before merge):
- 2 server variables to count the network bytes during replication,
including fullsync and propagate bytes.
- `server.stat_net_repl_output_bytes`/`server.stat_net_repl_input_bytes`
- 3 info fields to print the input and output of repl bytes and instantaneous
value of total repl bytes.
- `total_net_repl_input_bytes` / `total_net_repl_output_bytes`
- `instantaneous_repl_total_kbps`
- 1 new API `rioCheckType()` to check the type of rio. So we can use this
to distinguish between diskless and diskbased replication
- 2 new counting items to keep network statistics consistent between master
and slave
- rdb portion during diskless replica. in `rdbLoadProgressCallback()`
- first line of the full sync payload. in `readSyncBulkPayload()`
Co-authored-by: Oran Agra <oran@redislabs.com>
To easily distinguish between sharded channel message and a global
channel message, introducing `smessage` (instead of `message`) as
message bulk for sharded channel publish message.
This is gonna be a breaking change in 7.0.1!
Background:
Sharded pubsub introduced in redis 7.0, but after the release we quickly
realized that the fact that it's problematic that the client can't distinguish
between normal (global) pubsub messages and sharded ones.
This is important because the same connection can subscribe to both,
but messages sent to one pubsub system are not propagated to the
other (they're completely separate), so if one connection is used to
subscribe to both, we need to assist the client library to know which
message it got so it can forward it to the correct callback.
instead of printing a log when a folder or a manifest is missing (level reduced), we print:
total time it took to load all the aof files
when creating a new base or incr file
starting to write to an existing incr file on startup
This PR does 2 main things:
1) Add warning for suspected slow system clocksource setting. This is Linux specific.
2) Add a `--check-system` argument to redis which runs all system checks and prints a report.
## System checks
Add a command line option `--check-system` which runs all known system checks and provides
a report to stdout of which systems checks have failed with details on how to reconfigure the
system for optimized redis performance.
The `--system-check` mode exists with an appropriate error code after running all the checks.
## Slow clocksource details
We check the system's clocksource performance by running `clock_gettime()` in a loop and then
checking how much time was spent in a system call (via `getrusage()`). If we spend more than
10% of the time in the kernel then we print a warning. I verified that using the slow clock sources:
`acpi_pm` (~90% in the kernel on my laptop) and `xen` (~30% in the kernel on an ec2 `m4.large`)
we get this warning.
The check runs 5 system ticks so we can detect time spent in kernel at 20% jumps (0%,20%,40%...).
Anything more accurate will require the test to run longer. Typically 5 ticks are 50ms. This means
running the test on startup will delay startup by 50ms. To avoid this we make sure the test is only
executed in the `--check-system` mode.
For a quick startup check, we specifically warn if the we see the system is using the `xen` clocksource
which we know has bad performance and isn't recommended (at least on ec2). In such a case the
user should manually rung redis with `--check-system` to force the slower clocksource test described
above.
## Other changes in the PR
* All the system checks are now implemented as functions in _syscheck.c_.
They are implemented using a standard interface (see details in _syscheck.c_).
To do this I moved the checking functions `linuxOvercommitMemoryValue()`,
`THPIsEnabled()`, `linuxMadvFreeForkBugCheck()` out of _server.c_ and _latency.c_
and into the new _syscheck.c_. When moving these functions I made sure they don't
depend on other functionality provided in _server.c_ and made them use a standard
"check functions" interface. Specifically:
* I removed all logging out of `linuxMadvFreeForkBugCheck()`. In case there's some
unexpected error during the check aborts as before, but without any logging.
It returns an error code 0 meaning the check didn't not complete.
* All these functions now return 1 on success, -1 on failure, 0 in case the check itself
cannot be completed.
* The `linuxMadvFreeForkBugCheck()` function now internally calls `exit()` and not
`exitFromChild()` because the latter is only available in _server.c_ and I wanted to
remove that dependency. This isn't an because we don't need to worry about the
child process created by the test doing anything related to the rdb/aof files which
is why `exitFromChild()` was created.
* This also fixes parsing of other /proc/\<pid\>/stat fields to correctly handle spaces
in the process name and be more robust in general. Not that before this fix the rss
info in `INFO memory` was corrupt in case of spaces in the process name. To
recreate just rename `redis-server` to `redis server`, start it, and run `INFO memory`.
Scripts that have the `no-writes` flag, cannot execute write commands,
and since all `deny-oom` commands are write commands, we now act
as if the `allow-oom` flag is implicitly set for scripts that set the `no-writes` flag.
this also implicitly means that the EVAL*_RO and FCALL_RO commands can
never fails with OOM error.
Note about a bug that's no longer relevant:
There was an issue with EVAL*_RO using shebang not being blocked correctly
in OOM state:
When an EVAL script declares a shebang, it was by default not allowed to run in
OOM state.
but this depends on a flag that is updated before the command is executed, which
was not updated in case of the `_RO` variants.
the result is that if the previous cached state was outdated (either true or false),
the script will either unjustly fail with OOM, or unjustly allowed to run despite
the OOM state.
It doesn't affect scripts without a shebang since these depend on the actual
commands they run, and since these are only read commands, they don't care
for that cached oom state flag.
it did affect scripts with shebang and no allow-oom flag, bug after the change in
this PR, scripts that are run with eval_ro would implicitly have that flag so again
the cached state doesn't matter.
p.s. this isn't a breaking change since all it does is allow scripts to run when they
should / could rather than blocking them.
sometimes it is using `scriptIsRunning()` and other times it is using `server.in_script`.
We should use the `scriptIsRunning()` method consistently throughout the code base.
Removed server.in_script sine it's no longer used / needed.
Updated the comments for:
info command
lmpopCommand and blmpopCommand
sinterGenericCommand
Fix the missing "key" words in the srandmemberCommand function
For LPOS command, when rank is 0, prompt user that rank could be
positive number or negative number, and add a test for it
## Take one bulk string with spaces for MULTI_ARG configs parsing
Currently redis-server looks for arguments that start with `--`,
and anything in between them is considered arguments for the config.
like: `src/redis-server --shutdown-on-sigint nosave force now --port 6380`
MULTI_ARG configs behave differently for CONFIG command, vs the command
line argument for redis-server.
i.e. CONFIG command takes one bulk string with spaces in it, while the
command line takes an argv array with multiple values.
In this PR, in config.c, if `argc > 1` we can take them as is,
and if the config is a `MULTI_ARG` and `argc == 1`, we will split it by spaces.
So both of these will be the same:
```
redis-server --shutdown-on-sigint nosave force now --shutdown-on-sigterm nosave force
redis-server --shutdown-on-sigint nosave "force now" --shutdown-on-sigterm nosave force
redis-server --shutdown-on-sigint nosave "force now" --shutdown-on-sigterm "nosave force"
```
## Allow options value to use the `--` prefix
Currently it decides to switch to the next config, as soon as it sees `--`,
even if there was not a single value provided yet to the last config,
this makes it impossible to define a config value that has `--` prefix in it.
For instance, if we want to set the logfile to `--my--log--file`,
like `redis-server --logfile --my--log--file --loglevel verbose`,
current code will handle that incorrectly.
In this PR, now we allow a config value that has `--` prefix in it.
**But note that** something like `redis-server --some-config --config-value1 --config-value2 --loglevel debug`
would not work, because if you want to pass a value to a config starting with `--`, it can only be a single value.
like: `redis-server --some-config "--config-value1 --config-value2" --loglevel debug`
An example (using `--` prefix config value):
```
redis-server --logfile --my--log--file --loglevel verbose
redis-cli config get logfile loglevel
1) "loglevel"
2) "verbose"
3) "logfile"
4) "--my--log--file"
```
### Potentially breaking change
`redis-server --save --loglevel verbose` used to work the same as `redis-server --save "" --loglevel verbose`
now, it'll error!
Fix#10552
We no longer piggyback getkeys_proc to hold the RedisModuleCommand struct, when exists
Others:
Use `doesCommandHaveKeys` in `RM_GetCommandKeysWithFlags` and `getKeysSubcommandImpl`.
It causes a very minor behavioral change in commands that don't have actual keys, but have a spec
with `CMD_KEY_NOT_KEY`.
For example, before this command `COMMAND GETKEYS SPUBLISH` would return
`Invalid arguments specified for command` but not it returns `The command has no key arguments`
Changes:
- When AOF is enabled **after** startup, the data accumulated during `AOF_WAIT_REWRITE`
will only be stored in a temp INCR AOF file. Only after the first AOFRW is successful, we will
add it to manifest file.
Before this fix, the manifest referred to the temp file which could cause a restart during that
time to load it without it's base.
- Add `aof_rewrites_consecutive_failures` info field for aofrw limiting implementation.
Now we can guarantee that these behaviors of MP-AOF are the same as before (past redis releases):
- When AOF is enabled after startup, the data accumulated during `AOF_WAIT_REWRITE` will only
be stored in a visible place. Only after the first AOFRW is successful, we will add it to manifest file.
- When disable AOF, we did not delete the AOF file in the past so there's no need to change that
behavior now (yet).
- When toggling AOF off and then on (could be as part of a full-sync), a crash or restart before the
first rewrite is completed, would result with the previous version being loaded (might not be right thing,
but that's what we always had).
The SHUTDOWN command has various flags to change it's default behavior,
but in some cases establishing a connection to redis is complicated and it's easier
for the management software to use signals. however, so far the signals could only
trigger the default shutdown behavior.
Here we introduce the option to control shutdown arguments for SIGTERM and SIGINT.
New config options:
`shutdown-on-sigint [nosave | save] [now] [force]`
`shutdown-on-sigterm [nosave | save] [now] [force]`
Implementation:
Support MULTI_ARG_CONFIG on createEnumConfig to support multiple enums to be applied as bit flags.
Co-authored-by: Oran Agra <oran@redislabs.com>
* Till now, replicas that were unable to persist, would still execute the commands
they got from the master, now they'll panic by default, and we add a new
`replica-ignore-disk-errors` config to change that.
* Till now, when a command failed on a replica or AOF-loading, it only logged a
warning and a stat, we add a new `propagation-error-behavior` config to allow
panicking in that state (may become the default one day)
Note that commands that fail on the replica can either indicate a bug that could
cause data inconsistency between the replica and the master, or they could be
in some cases (specifically in previous versions), a result of a command (e.g. EVAL)
that failed on the master, but still had to be propagated to fail on the replica as well.
Adds the `allow-cross-slot-keys` flag to Eval scripts and Functions to allow
scripts to access keys from multiple slots.
The default behavior is now that they are not allowed to do that (unlike before).
This is a breaking change for 7.0 release candidates (to be part of 7.0.0), but
not for previous redis releases since EVAL without shebang isn't doing this check.
Note that the check is done on both the keys declared by the EVAL / FCALL command
arguments, and also the ones used by the script when making a `redis.call`.
A note about the implementation, there seems to have been some confusion
about allowing access to non local keys. I thought I missed something in our
wider conversation, but Redis scripts do block access to non-local keys.
So the issue was just about cross slots being accessed.
A change in #10612 introduced a regression.
when replying with garbage bytes to the caller, we must make sure it
doesn't include any newlines.
in the past it called rejectCommandFormat which did that trick.
but now it calls rejectCommandSds, which doesn't, so we need to make sure
to sanitize the sds.
1. Disk error and slave count checks didn't flag the transactions or counted correctly in command stats (regression from #10372 , 7.0 RC3)
2. RM_Call will reply the same way Redis does, in case of non-exisitng command or arity error
3. RM_WrongArtiy will consider the full command name
4. Use lowercase 'u' in "unknonw subcommand" (to align with "unknown command")
Followup work of #10127
In #7491 (part of redis 6.2), we started using the monotonic timer instead of mstime to measure
command execution time for stats, apparently this meant sampling the clock 3 times per command
rather than two (wince we also need the wall-clock time).
In some cases this causes a significant overhead.
This PR fixes that by avoiding the use of monotonic timer, except for the cases were we know it
should be extremely fast.
This PR also adds a new INFO field called `monotonic_clock` that shows which clock redis is using.
Co-authored-by: Oran Agra <oran@redislabs.com>
This PR unifies all the places that test if the current client is the
master client or AOF client, and uses a method to test that on all of
these.
Other than some refactoring, these are the actual implications:
- Replicas **don't** ignore disk error when processing commands not
coming from their master.
**This is important for PING to be used for health check of replicas**
- SETRANGE, APPEND, SETBIT, BITFIELD don't do proto_max_bulk_len check for AOF
- RM_Call in SCRIPT_MODE ignores disk error when coming from master /
AOF
- RM_Call in cluster mode ignores slot check when processing AOF
- Scripts ignore disk error when processing AOF
- Scripts **don't** ignore disk error on a replica, if the command comes
from clients other than the master
- SCRIPT KILL won't kill script coming from AOF
- Scripts **don't** skip OOM check on replica if the command comes from
clients other than the master
Note that Script, AOF, and module clients don't reach processCommand,
which is why some of the changes don't actually have any implications.
Note, reverting the change done to processCommand in 2f4240b9d9
should be dead code due to the above mentioned fact.
`hdr_value_at_percentile()` is part of the Hdr_Histogram library
used when generating `latencystats` report.
There's a pending optimization for this function which greatly
affects the performance of `info latencystats`.
https://github.com/HdrHistogram/HdrHistogram_c/pull/107
This PR:
1. Upgrades the sources in _deps/hdr_histogram_ to the latest Hdr_Histogram
version 0.11.5
2. Applies the referenced optimization.
3. Adds minor documentation about the hdr_histogram dependency which was
missing under _deps/README.md_.
benchmark on my machine:
running: `redis-benchmark -n 100000 info latencystats` on a clean build with no data.
| benchmark | RPS |
| ---- | ---- |
| before upgrade to v0.11.05 | 7,681 |
| before optimization | 12,474 |
| after optimization | 52,606 |
Co-authored-by: filipe oliveira <filipecosta.90@gmail.com>
Add a configuration option to attach an operating system-specific identifier to Redis sockets, supporting advanced network configurations using iptables (Linux) or ipfw (FreeBSD).
Changes:
1. Check the failed rewrite time threshold only when we actually consider triggering a rewrite.
i.e. this should be the last condition tested, since the test has side effects (increasing time threshold)
Could have happened in some rare scenarios
2. no limit in startup state (e.g. after restarting redis that previously failed and had many incr files)
3. the “triggered the limit” log would be recorded only when the limit status is returned
4. remove failure count in log (could be misleading in some cases)
Co-authored-by: chenyang8094 <chenyang8094@users.noreply.github.com>
Co-authored-by: Oran Agra <oran@redislabs.com>
Apparently, some modules can afford deprecating command arguments
(something that was never done in Redis, AFAIK), so in order to represent
this piece of information, we added the `deprecated_since` field to redisCommandArg
(in symmetry to the already existing `since` field).
This commit adds `const char *deprecated_since` to `RedisModuleCommandArg`,
which is technically a breaking change, but since 7.0 was not released yet, we decided to let it slide
Durability of database is a big and old topic, in this regard Redis use AOF to
support it, and `appendfsync=alwasys` policy is the most strict level, guarantee
all data is both written and synced on disk before reply success to client.
But there are some cases have been overlooked, and could lead to durability broken.
1. The most clear one is about threaded-io mode
we should also set client's write handler with `ae_barrier` in
`handleClientsWithPendingWritesUsingThreads`, or the write handler would be
called after read handler in the next event loop, it means the write command result
could be replied to client before flush to AOF.
2. About blocked client (mostly by module)
in `beforeSleep()`, `handleClientsBlockedOnKeys()` should be called before
`flushAppendOnlyFile()`, in case the unblocked clients modify data without persistence
but send reply.
3. When handling `ProcessingEventsWhileBlocked`
normally it takes place when lua/function/module timeout, and we give a chance to users
to kill the slow operation, but we should call `flushAppendOnlyFile()` before
`handleClientsWithPendingWrites()`, in case the other clients in the last event loop get
acknowledge before data persistence.
for a instance:
```
in the same event loop
client A executes set foo bar
client B executes eval "for var=1,10000000,1 do end" 0
```
after the script timeout, client A will get `OK` but lose data after restart (kill redis when
timeout) if we don't flush the write command to AOF.
4. A more complex case about `ProcessingEventsWhileBlocked`
it is lua timeout in transaction, for example
`MULTI; set foo bar; eval "for var=1,10000000,1 do end" 0; EXEC`, then client will get set
command's result before the whole transaction done, that breaks atomicity too.
fortunately, it's already fixed by #5428 (although it's not the original purpose just a side
effect : )), but module timeout should be fixed too.
case 1, 2, 3 are fixed in this commit, the module issue in case 4 needs a followup PR.
Add field to COMMAND DOCS response to denote the name of the module
that added that command.
COMMAND LIST can filter by module, but if you get the full commands list,
you may still wanna know which command belongs to which module.
The alternative would be to do MODULE LIST, and then multiple calls to COMMAND LIST
To remove `pending_querybuf`, the key point is reusing `querybuf`, it means master client's `querybuf` is not only used to parse command, but also proxy to sub-replicas.
1. add a new variable `repl_applied` for master client to record how many data applied (propagated via `replicationFeedStreamFromMasterStream()`) but not trimmed in `querybuf`.
2. don't sdsrange `querybuf` in `commandProcessed()`, we trim it to `repl_applied` after the whole replication pipeline processed to avoid fragmented `sdsrange`. And here are some scenarios we cannot trim to `qb_pos`:
* we don't receive complete command from master
* master client blocked because of client pause
* IO threads operate read, master client flagged with CLIENT_PENDING_COMMAND
In these scenarios, `qb_pos` points to the part of the current command or the beginning of next command, and the current command is not applied yet, so the `repl_applied` is not equal to `qb_pos`.
Some other notes:
* Do not do big arg optimization on master client, since we can only sdsrange `querybuf` after data sent to replicas.
* Set `qb_pos` and `repl_applied` to 0 when `freeClient` in `replicationCacheMaster`.
* Rewrite `processPendingCommandsAndResetClient` to `processPendingCommandAndInputBuffer`, let `processInputBuffer` to be called successively after `processCommandAndResetClient`.
The PR extends RM_Call with 3 new capabilities using new flags that
are given to RM_Call as part of the `fmt` argument.
It aims to assist modules that are getting a list of commands to be
executed from the user (not hard coded as part of the module logic),
think of a module that implements a new scripting language...
* `S` - Run the command in a script mode, this means that it will raise an
error if a command which are not allowed inside a script (flaged with the
`deny-script` flag) is invoked (like SHUTDOWN). In addition, on script mode,
write commands are not allowed if there is not enough good replicas (as
configured with `min-replicas-to-write`) and/or a disk error happened.
* `W` - no writes mode, Redis will reject any command that is marked with `write`
flag. Again can be useful to modules that implement a new scripting language
and wants to prevent any write commands.
* `E` - Return errors as RedisModuleCallReply. Today the errors that happened
before the command was invoked (like unknown commands or acl error) return
a NULL reply and set errno. This might be missing important information about
the failure and it is also impossible to just pass the error to the user using
RM_ReplyWithCallReply. This new flag allows you to get a RedisModuleCallReply
object with the relevant error message and treat it as if it was an error that was
raised by the command invocation.
Tests were added to verify the new code paths.
In addition small refactoring was done to share some code between modules,
scripts, and `processCommand` function:
1. `getAclErrorMessage` was added to `acl.c` to unified to log message extraction
from the acl result
2. `checkGoodReplicasStatus` was added to `replication.c` to check the status of
good replicas. It is used on `scriptVerifyWriteCommandAllow`, `RM_Call`, and
`processCommand`.
3. `writeCommandsGetDiskErrorMessage` was added to `server.c` to get the error
message on persistence failure. Again it is used on `scriptVerifyWriteCommandAllow`,
`RM_Call`, and `processCommand`.
In a benchmark we noticed we spend a relatively long time updating the client
memory usage leading to performance degradation.
Before #8687 this was performed in the client's cron and didn't affect performance.
But since introducing client eviction we need to perform this after filling the input
buffers and after processing commands. This also lead me to write this code to be
thread safe and perform it in the i/o threads.
It turns out that the main performance issue here is related to atomic operations
being performed while updating the total clients memory usage stats used for client
eviction (`server.stat_clients_type_memory[]`). This update needed to be atomic
because `updateClientMemUsage()` was called from the IO threads.
In this commit I make sure to call `updateClientMemUsage()` only from the main thread.
In case of threaded IO I call it for each client during the "fan-in" phase of the read/write
operation. This also means I could chuck the `updateClientMemUsageBucket()` function
which was called during this phase and embed it into `updateClientMemUsage()`.
Profiling shows this makes `updateClientMemUsage()` (on my x86_64 linux) roughly x4 faster.
In order to resolve some flaky tests which hard rely on examine memory footprint.
we introduce the following fixes:
# Fix in client-eviction test - by @yoav-steinberg
Sometime the libc allocator can use different size client struct allocations.
this may cause unexpected memory calculations to fail the test.
# Introduce new DEBUG command for disabling reply buffer resizing
In order to eliminate reply buffer resizing during specific tests.
we introduced the ability to disable (and enable) the resizing cron job
Co-authored-by: yoav-steinberg yoav@redislabs.com
* Moved configuration storage from a list to a hash table
* Configs are returned in a non-deterministic order. It's possible that a client was relying on order (hopefully not).
* Fixed an esoteric bug where if you did a set with an alias with an error, it would throw an error indicating a bug with the preferred name for that config.
This PR fix 2 issues on Lua scripting:
* Server error reply statistics (some errors were counted twice).
* Error code and error strings returning from scripts (error code was missing / misplaced).
## Statistics
a Lua script user is considered part of the user application, a sophisticated transaction,
so we want to count an error even if handled silently by the script, but when it is
propagated outwards from the script we don't wanna count it twice. on the other hand,
if the script decides to throw an error on its own (using `redis.error_reply`), we wanna
count that too.
Besides, we do count the `calls` in command statistics for the commands the script calls,
we we should certainly also count `failed_calls`.
So when a simple `eval "return redis.call('set','x','y')" 0` fails, it should count the failed call
to both SET and EVAL, but the `errorstats` and `total_error_replies` should be counted only once.
The PR changes the error object that is raised on errors. Instead of raising a simple Lua
string, Redis will raise a Lua table in the following format:
```
{
err='<error message (including error code)>',
source='<User source file name>',
line='<line where the error happned>',
ignore_error_stats_update=true/false,
}
```
The `luaPushError` function was modified to construct the new error table as describe above.
The `luaRaiseError` was renamed to `luaError` and is now simply called `lua_error` to raise
the table on the top of the Lua stack as the error object.
The reason is that since its functionality is changed, in case some Redis branch / fork uses it,
it's better to have a compilation error than a bug.
The `source` and `line` fields are enriched by the error handler (if possible) and the
`ignore_error_stats_update` is optional and if its not present then the default value is `false`.
If `ignore_error_stats_update` is true, the error will not be counted on the error stats.
When parsing Redis call reply, each error is translated to a Lua table on the format describe
above and the `ignore_error_stats_update` field is set to `true` so we will not count errors
twice (we counted this error when we invoke the command).
The changes in this PR might have been considered as a breaking change for users that used
Lua `pcall` function. Before, the error was a string and now its a table. To keep backward
comparability the PR override the `pcall` implementation and extract the error message from
the error table and return it.
Example of the error stats update:
```
127.0.0.1:6379> lpush l 1
(integer) 2
127.0.0.1:6379> eval "return redis.call('get', 'l')" 0
(error) WRONGTYPE Operation against a key holding the wrong kind of value. script: e471b73f1ef44774987ab00bdf51f21fd9f7974a, on @user_script:1.
127.0.0.1:6379> info Errorstats
# Errorstats
errorstat_WRONGTYPE:count=1
127.0.0.1:6379> info commandstats
# Commandstats
cmdstat_eval:calls=1,usec=341,usec_per_call=341.00,rejected_calls=0,failed_calls=1
cmdstat_info:calls=1,usec=35,usec_per_call=35.00,rejected_calls=0,failed_calls=0
cmdstat_lpush:calls=1,usec=14,usec_per_call=14.00,rejected_calls=0,failed_calls=0
cmdstat_get:calls=1,usec=10,usec_per_call=10.00,rejected_calls=0,failed_calls=1
```
## error message
We can now construct the error message (sent as a reply to the user) from the error table,
so this solves issues where the error message was malformed and the error code appeared
in the middle of the error message:
```diff
127.0.0.1:6379> eval "return redis.call('set','x','y')" 0
-(error) ERR Error running script (call to 71e6319f97b0fe8bdfa1c5df3ce4489946dda479): @user_script:1: OOM command not allowed when used memory > 'maxmemory'.
+(error) OOM command not allowed when used memory > 'maxmemory' @user_script:1. Error running script (call to 71e6319f97b0fe8bdfa1c5df3ce4489946dda479)
```
```diff
127.0.0.1:6379> eval "redis.call('get', 'l')" 0
-(error) ERR Error running script (call to f_8a705cfb9fb09515bfe57ca2bd84a5caee2cbbd1): @user_script:1: WRONGTYPE Operation against a key holding the wrong kind of value
+(error) WRONGTYPE Operation against a key holding the wrong kind of value script: 8a705cfb9fb09515bfe57ca2bd84a5caee2cbbd1, on @user_script:1.
```
Notica that `redis.pcall` was not change:
```
127.0.0.1:6379> eval "return redis.pcall('get', 'l')" 0
(error) WRONGTYPE Operation against a key holding the wrong kind of value
```
## other notes
Notice that Some commands (like GEOADD) changes the cmd variable on the client stats so we
can not count on it to update the command stats. In order to be able to update those stats correctly
we needed to promote `realcmd` variable to be located on the client struct.
Tests was added and modified to verify the changes.
Related PR's: #10279, #10218, #10278, #10309
Co-authored-by: Oran Agra <oran@redislabs.com>
Adds the ability to track the lag of a consumer group (CG), that is, the number
of entries yet-to-be-delivered from the stream.
The proposed constant-time solution is in the spirit of "best-effort."
Partially addresses #8737.
## Description of approach
We add a new "entries_added" property to the stream. This starts at 0 for a new
stream and is incremented by 1 with every `XADD`. It is essentially an all-time
counter of the entries added to the stream.
Given the stream's length and this counter value, we can trivially find the logical
"entries_added" counter of the first ID if and only if the stream is contiguous.
A fragmented stream contains one or more tombstones generated by `XDEL`s.
The new "xdel_max_id" stream property tracks the latest tombstone.
The CG also tracks its last delivered ID's as an "entries_read" counter and
increments it independently when delivering new messages, unless the this
read counter is invalid (-1 means invalid offset). When the CG's counter is
available, the reported lag is the difference between added and read counters.
Lastly, this also adds a "first_id" field to the stream structure in order to make
looking it up cheaper in most cases.
## Limitations
There are two cases in which the mechanism isn't able to track the lag.
In these cases, `XINFO` replies with `null` in the "lag" field.
The first case is when a CG is created with an arbitrary last delivered ID,
that isn't "0-0", nor the first or the last entries of the stream. In this case,
it is impossible to obtain a valid read counter (short of an O(N) operation).
The second case is when there are one or more tombstones fragmenting
the stream's entries range.
In both cases, given enough time and assuming that the consumers are
active (reading and lacking) and advancing, the CG should be able to
catch up with the tip of the stream and report zero lag.
Once that's achieved, lag tracking would resume as normal (until the
next tombstone is set).
## API changes
* `XGROUP CREATE` added with the optional named argument `[ENTRIESREAD entries-read]`
for explicitly specifying the new CG's counter.
* `XGROUP SETID` added with an optional positional argument `[ENTRIESREAD entries-read]`
for specifying the CG's counter.
* `XINFO` reports the maximal tombstone ID, the recorded first entry ID, and total
number of entries added to the stream.
* `XINFO` reports the current lag and logical read counter of CGs.
* `XSETID` is an internal command that's used in replication/aof. It has been added with
the optional positional arguments `[ENTRIESADDED entries-added] [MAXDELETEDID max-deleted-entry-id]`
for propagating the CG's offset and maximal tombstone ID of the stream.
## The generic unsolved problem
The current stream implementation doesn't provide an efficient way to obtain the
approximate/exact size of a range of entries. While it could've been nice to have
that ability (#5813) in general, let alone specifically in the context of CGs, the risk
and complexities involved in such implementation are in all likelihood prohibitive.
## A refactoring note
The `streamGetEdgeID` has been refactored to accommodate both the existing seek
of any entry as well as seeking non-deleted entries (the addition of the `skip_tombstones`
argument). Furthermore, this refactoring also migrated the seek logic to use the
`streamIterator` (rather than `raxIterator`) that was, in turn, extended with the
`skip_tombstones` Boolean struct field to control the emission of these.
Co-authored-by: Guy Benoish <guy.benoish@redislabs.com>
Co-authored-by: Oran Agra <oran@redislabs.com>
Avoid sprintf/ll2string on setDeferredAggregateLen()/addReplyLongLongWithPrefix() when we can used shared objects.
In some pipelined workloads this achieves about 10% improvement.
Co-authored-by: Oran Agra <oran@redislabs.com>
Current implementation simple idle client which serves no traffic still
use ~17Kb of memory. this is mainly due to a fixed size reply buffer
currently set to 16kb.
We have encountered some cases in which the server operates in a low memory environments.
In such cases a user who wishes to create large connection pools to support potential burst period,
will exhaust a large amount of memory to maintain connected Idle clients.
Some users may choose to "sacrifice" performance in order to save memory.
This commit introduce a dynamic mechanism to shrink and expend the client reply buffer based on
periodic observed peak.
the algorithm works as follows:
1. each time a client reply buffer has been fully written, the last recorded peak is updated:
new peak = MAX( last peak, current written size)
2. during clients cron we check for each client if the last observed peak was:
a. matching the current buffer size - in which case we expend (resize) the buffer size by 100%
b. less than half the buffer size - in which case we shrink the buffer size by 50%
3. In any case we will **not** resize the buffer in case:
a. the current buffer peak is less then the current buffer usable size and higher than 1/2 the
current buffer usable size
b. the value of (current buffer usable size/2) is less than 1Kib
c. the value of (current buffer usable size*2) is larger than 16Kib
4. the peak value is reset to the current buffer position once every **5** seconds. we maintain a new
field in the client structure (buf_peak_last_reset_time) which is used to keep track of how long it
passed since the last buffer peak reset.
### **Interface changes:**
**CIENT LIST** - now contains 2 new extra fields:
rbs= < the current size in bytes of the client reply buffer >
rbp=< the current value in bytes of the last observed buffer peak position >
**INFO STATS** - now contains 2 new statistics:
reply_buffer_shrinks = < total number of buffer shrinks performed >
reply_buffer_expends = < total number of buffer expends performed >
Co-authored-by: Oran Agra <oran@redislabs.com>
Co-authored-by: Yoav Steinberg <yoav@redislabs.com>
This implements the following main pieces of functionality:
* Renames key spec "CHANNEL" to be "NOT_KEY", and update the documentation to
indicate it's for cluster routing and not for any other key related purpose.
* Add the getchannels-api, so that modules can now define commands that are subject to
ACL channel permission checks.
* Add 4 new flags that describe how a module interacts with a command (SUBSCRIBE, PUBLISH,
UNSUBSCRIBE, and PATTERN). They are all technically composable, however not sure how a
command could both subscribe and unsubscribe from a command at once, but didn't see
a reason to add explicit validation there.
* Add two new module apis RM_ChannelAtPosWithFlags and RM_IsChannelsPositionRequest to
duplicate the functionality provided by the keys position APIs.
* The RM_ACLCheckChannelPermissions (only released in 7.0 RC1) was changed to take flags
rather than a boolean literal.
* The RM_ACLCheckKeyPermissions (only released in 7.0 RC1) was changed to take flags
corresponding to keyspecs instead of custom permission flags. These keyspec flags mimic
the flags for ACLCheckChannelPermissions.
This is a followup work for #10278, and a discussion about #10279
The changes:
- fix failed_calls in command stats for blocked clients that got error.
including CLIENT UNBLOCK, and module replying an error from a thread.
- fix latency stats for XREADGROUP that filed with -NOGROUP
Theory behind which errors should be counted:
- error stats represents errors returned to the user, so an error handled by a
module should not be counted.
- total error counter should be the same.
- command stats represents execution of commands (even with RM_Call, and if
they fail or get rejected it counts these calls in commandstats, so it should
also count failed_calls)
Some thoughts about Scripts:
for scripts it could be different since they're part of user code, not the infra (not an extension to redis)
we certainly want commandstats to contain all calls and errors
a simple script is like mult-exec transaction so an error inside it should be counted in error stats
a script that replies with an error to the user (using redis.error_reply) should also be counted in error stats
but then the problem is that a plain `return redis.call("SET")` should not be counted twice (once for the SET
and once for EVAL)
so that's something left to be resolved in #10279
Add aof_rewrites and rdb_snapshots counters to info.
This is useful to figure our if a rewrite or snapshot happened since last check.
This was part of the (ongoing) effort to provide a safe backup solution for multipart-aof backups.
This is an enhancement for INFO command, previously INFO only support one argument
for different info section , if user want to get more categories information, either perform
INFO all / default or calling INFO for multiple times.
**Description of the feature**
The goal of adding this feature is to let the user retrieve multiple categories via the INFO
command, and still avoid emitting the same section twice.
A use case for this is like Redis Sentinel, which periodically calling INFO command to refresh
info from monitored Master/Slaves, only Server and Replication part categories are used for
parsing information. If the INFO command can return just enough categories that client side
needs, it can save a lot of time for client side parsing it as well as network bandwidth.
**Implementation**
To share code between redis, sentinel, and other users of INFO (DEBUG and modules),
we have a new `genInfoSectionDict` function that returns a dict and some boolean flags
(e.g. `all`) to the caller (built from user input).
Sentinel is later purging unwanted sections from that, and then it is forwarded to the info `genRedisInfoString`.
**Usage Examples**
INFO Server Replication
INFO CPU Memory
INFO default commandstats
Co-authored-by: Oran Agra <oran@redislabs.com>
- add COMMAND GETKEYSANDFLAGS sub-command
- add RM_KeyAtPosWithFlags and GetCommandKeysWithFlags
- RM_KeyAtPos and RM_CreateCommand set flags requiring full access for keys
- RM_CreateCommand set VARIABLE_FLAGS
- expose `variable_flags` flag in COMMAND INFO key-specs
- getKeysFromCommandWithSpecs prefers key-specs over getkeys-api
- add tests for all of these
If summary or since is empty, we used to return NULL in
COMMAND DOCS. Currently all redis commands will have these
two fields.
But not for module command, summary and since are optional
for RM_SetCommandInfo. With the change in #10043, if a module
command doesn't have the summary or since, redis-cli will
crash (see #10250).
In this commit, COMMAND DOCS avoid adding summary or since
when they are missing.