This commit does not remove redis.call/pcall just yet. It also does not
rename Redis in error messages such as "Please specify at least one
argument for this redis lib call". This allows users to maintain full
backwards compatibility while introducing an option to use server.call
for new code.
I verified that the unit tests pass. Also manually verified that the
redis-server responds to server.call invocations within lua scripting.
Also verified that function registration works as expected.
```
[ok]: EVAL - is Lua able to call Redis API? (0 ms)
[ok]: EVAL - is Lua able to call Server API? (1 ms)
[ok]: EVAL - No arguments to redis.call/pcall is considered an error (0 ms)
[ok]: EVAL - No arguments to server.call/pcall is considered an error (1 ms)
```
---------
Signed-off-by: Parth Patel <661497+parthpatel@users.noreply.github.com>
Signed-off-by: Madelyn Olson <madelyneolson@gmail.com>
Co-authored-by: Madelyn Olson <madelyneolson@gmail.com>
When loading a function from either RDB/AOF or a replica, it is essential not to
fail on timeout errors. The loading time may vary due to various factors, such as
hardware specifications or the system's workload during the loading process.
Once a function has been successfully loaded, it should be allowed to load from
persistence or on replicas without encountering a timeout failure.
To maintain a clear separation between the engine and Redis internals, the
implementation refrains from directly checking the state of Redis within the
engine itself. Instead, the engine receives the desired timeout as part of the
library creation and duly respects this timeout value. If Redis wishes to disable
any timeout, it can simply send a value of 0.
Technically declaring a prototype with an empty declaration has been deprecated since the early days of C, but we never got a warning for it. C2x will apparently be introducing a breaking change if you are using this type of declarator, so Clang 15 has started issuing a warning with -pedantic. Although not apparently a problem for any of the compiler we build on, if feels like the right thing is to properly adhere to the C standard and use (void).
This pr mainly has the following four changes:
1. Add missing lua_pop in `luaGetFromRegistry`.
This bug affects `redis.register_function`, where `luaGetFromRegistry` in
`luaRegisterFunction` will return null when we call `redis.register_function` nested.
.e.g
```
FUNCTION LOAD "#!lua name=mylib \n local lib=redis \n lib.register_function('f2', function(keys, args) lib.register_function('f1', function () end) end)"
fcall f2 0
````
But since we exit when luaGetFromRegistry returns null, it does not cause the stack to grow indefinitely.
3. When getting `REGISTRY_RUN_CTX_NAME` from the registry, use `serverAssert`
instead of error return. Since none of these lua functions are registered at the time
of function load, scriptRunCtx will never be NULL.
4. Add `serverAssert` for `luaLdbLineHook`, `luaEngineLoadHook`.
5. Remove `luaGetFromRegistry` from `redis_math_random` and
`redis_math_randomseed`, it looks like they are redundant.
The white list is done by setting a metatable on the global table before initializing
any library. The metatable set the `__newindex` field to a function that check
the white list before adding the field to the table. Fields which is not on the
white list are simply ignored.
After initialization phase is done we protect the global table and each table
that might be reachable from the global table. For each table we also protect
the table metatable if exists.
Use the new `lua_enablereadonlytable` Lua API to protect the global tables of
both evals scripts and functions. For eval scripts, the implemetation is easy,
We simply call `lua_enablereadonlytable` on the global table to turn it into
a readonly table.
On functions its more complecated, we want to be able to switch globals between
load run and function run. To achieve this, we create a new empty table that
acts as the globals table for function, we control the actual globals using metatable
manipulation. Notice that even if the user gets a pointer to the original tables, all
the tables are set to be readonly (using `lua_enablereadonlytable` Lua API) so he can
not change them. The following inlustration better explain the solution:
```
Global table {} <- global table metatable {.__index = __real_globals__}
```
The `__real_globals__` is set depends on the run context (function load or function call).
Why this solution is needed and its not enough to simply switch globals?
When we run in the context of function load and create our functions, our function gets
the current globals that was set when they were created. Replacing the globals after
the creation will not effect them. This is why this trick it mandatory.
By the convention of errors, there is supposed to be a space between the code and the name.
While looking at some lua stuff I noticed that interpreter errors were not adding the space,
so some clients will try to map the detailed error message into the error.
We have tests that hit this condition, but they were just checking that the string "starts" with ERR.
I updated some other tests with similar incorrect string checking. This isn't complete though, as
there are other ways we check for ERR I didn't fix.
Produces some fun output like:
```
# Errorstats
errorstat_ERR:count=1
errorstat_ERRuser_script_1_:count=1
```
This PR fix 2 issues on Lua scripting:
* Server error reply statistics (some errors were counted twice).
* Error code and error strings returning from scripts (error code was missing / misplaced).
## Statistics
a Lua script user is considered part of the user application, a sophisticated transaction,
so we want to count an error even if handled silently by the script, but when it is
propagated outwards from the script we don't wanna count it twice. on the other hand,
if the script decides to throw an error on its own (using `redis.error_reply`), we wanna
count that too.
Besides, we do count the `calls` in command statistics for the commands the script calls,
we we should certainly also count `failed_calls`.
So when a simple `eval "return redis.call('set','x','y')" 0` fails, it should count the failed call
to both SET and EVAL, but the `errorstats` and `total_error_replies` should be counted only once.
The PR changes the error object that is raised on errors. Instead of raising a simple Lua
string, Redis will raise a Lua table in the following format:
```
{
err='<error message (including error code)>',
source='<User source file name>',
line='<line where the error happned>',
ignore_error_stats_update=true/false,
}
```
The `luaPushError` function was modified to construct the new error table as describe above.
The `luaRaiseError` was renamed to `luaError` and is now simply called `lua_error` to raise
the table on the top of the Lua stack as the error object.
The reason is that since its functionality is changed, in case some Redis branch / fork uses it,
it's better to have a compilation error than a bug.
The `source` and `line` fields are enriched by the error handler (if possible) and the
`ignore_error_stats_update` is optional and if its not present then the default value is `false`.
If `ignore_error_stats_update` is true, the error will not be counted on the error stats.
When parsing Redis call reply, each error is translated to a Lua table on the format describe
above and the `ignore_error_stats_update` field is set to `true` so we will not count errors
twice (we counted this error when we invoke the command).
The changes in this PR might have been considered as a breaking change for users that used
Lua `pcall` function. Before, the error was a string and now its a table. To keep backward
comparability the PR override the `pcall` implementation and extract the error message from
the error table and return it.
Example of the error stats update:
```
127.0.0.1:6379> lpush l 1
(integer) 2
127.0.0.1:6379> eval "return redis.call('get', 'l')" 0
(error) WRONGTYPE Operation against a key holding the wrong kind of value. script: e471b73f1ef44774987ab00bdf51f21fd9f7974a, on @user_script:1.
127.0.0.1:6379> info Errorstats
# Errorstats
errorstat_WRONGTYPE:count=1
127.0.0.1:6379> info commandstats
# Commandstats
cmdstat_eval:calls=1,usec=341,usec_per_call=341.00,rejected_calls=0,failed_calls=1
cmdstat_info:calls=1,usec=35,usec_per_call=35.00,rejected_calls=0,failed_calls=0
cmdstat_lpush:calls=1,usec=14,usec_per_call=14.00,rejected_calls=0,failed_calls=0
cmdstat_get:calls=1,usec=10,usec_per_call=10.00,rejected_calls=0,failed_calls=1
```
## error message
We can now construct the error message (sent as a reply to the user) from the error table,
so this solves issues where the error message was malformed and the error code appeared
in the middle of the error message:
```diff
127.0.0.1:6379> eval "return redis.call('set','x','y')" 0
-(error) ERR Error running script (call to 71e6319f97b0fe8bdfa1c5df3ce4489946dda479): @user_script:1: OOM command not allowed when used memory > 'maxmemory'.
+(error) OOM command not allowed when used memory > 'maxmemory' @user_script:1. Error running script (call to 71e6319f97b0fe8bdfa1c5df3ce4489946dda479)
```
```diff
127.0.0.1:6379> eval "redis.call('get', 'l')" 0
-(error) ERR Error running script (call to f_8a705cfb9fb09515bfe57ca2bd84a5caee2cbbd1): @user_script:1: WRONGTYPE Operation against a key holding the wrong kind of value
+(error) WRONGTYPE Operation against a key holding the wrong kind of value script: 8a705cfb9fb09515bfe57ca2bd84a5caee2cbbd1, on @user_script:1.
```
Notica that `redis.pcall` was not change:
```
127.0.0.1:6379> eval "return redis.pcall('get', 'l')" 0
(error) WRONGTYPE Operation against a key holding the wrong kind of value
```
## other notes
Notice that Some commands (like GEOADD) changes the cmd variable on the client stats so we
can not count on it to update the command stats. In order to be able to update those stats correctly
we needed to promote `realcmd` variable to be located on the client struct.
Tests was added and modified to verify the changes.
Related PR's: #10279, #10218, #10278, #10309
Co-authored-by: Oran Agra <oran@redislabs.com>
# Redis Functions Flags
Following the discussion on #10025 Added Functions Flags support.
The PR is divided to 2 sections:
* Add named argument support to `redis.register_function` API.
* Add support for function flags
## `redis.register_function` named argument support
The first part of the PR adds support for named argument on `redis.register_function`, example:
```
redis.register_function{
function_name='f1',
callback=function()
return 'hello'
end,
description='some desc'
}
```
The positional arguments is also kept, which means that it still possible to write:
```
redis.register_function('f1', function() return 'hello' end)
```
But notice that it is no longer possible to pass the optional description argument on the positional
argument version. Positional argument was change to allow passing only the mandatory arguments
(function name and callback). To pass more arguments the user must use the named argument version.
As with positional arguments, the `function_name` and `callback` is mandatory and an error will be
raise if those are missing. Also, an error will be raise if an unknown argument name is given or the
arguments type is wrong.
Tests was added to verify the new syntax.
## Functions Flags
The second part of the PR is adding functions flags support.
Flags are given to Redis when the engine calls `functionLibCreateFunction`, supported flags are:
* `no-writes` - indicating the function perform no writes which means that it is OK to run it on:
* read-only replica
* Using FCALL_RO
* If disk error detected
It will not be possible to run a function in those situations unless the function turns on the `no-writes` flag
* `allow-oom` - indicate that its OK to run the function even if Redis is in OOM state, if the function will
not turn on this flag it will not be possible to run it if OOM reached (even if the function declares `no-writes`
and even if `fcall_ro` is used). If this flag is set, any command will be allow on OOM (even those that is
marked with CMD_DENYOOM). The assumption is that this flag is for advance users that knows its
meaning and understand what they are doing, and Redis trust them to not increase the memory usage.
(e.g. it could be an INCR or a modification on an existing key, or a DEL command)
* `allow-state` - indicate that its OK to run the function on stale replica, in this case we will also make
sure the function is only perform `stale` commands and raise an error if not.
* `no-cluster` - indicate to disallow running the function if cluster is enabled.
Default behaviure of functions (if no flags is given):
1. Allow functions to read and write
2. Do not run functions on OOM
3. Do not run functions on stale replica
4. Allow functions on cluster
### Lua API for functions flags
On Lua engine, it is possible to give functions flags as `flags` named argument:
```
redis.register_function{function_name='f1', callback=function() return 1 end, flags={'no-writes', 'allow-oom'}, description='description'}
```
The function flags argument must be a Lua table that contains all the requested flags, The following
will result in an error:
* Unknown flag
* Wrong flag type
Default behaviour is the same as if no flags are used.
Tests were added to verify all flags functionality
## Additional changes
* mark FCALL and FCALL_RO with CMD_STALE flag (unlike EVAL), so that they can run if the function was
registered with the `allow-stale` flag.
* Verify `CMD_STALE` on `scriptCall` (`redis.call`), so it will not be possible to call commands from script while
stale unless the command is marked with the `CMD_STALE` flags. so that even if the function is allowed while
stale we do not allow it to bypass the `CMD_STALE` flag of commands.
* Flags section was added to `FUNCTION LIST` command to provide the set of flags for each function:
```
> FUNCTION list withcode
1) 1) "library_name"
2) "test"
3) "engine"
4) "LUA"
5) "description"
6) (nil)
7) "functions"
8) 1) 1) "name"
2) "f1"
3) "description"
4) (nil)
5) "flags"
6) (empty array)
9) "library_code"
10) "redis.register_function{function_name='f1', callback=function() return 1 end}"
```
* Added API to get Redis version from within a script, The redis version can be provided using:
1. `redis.REDIS_VERSION` - string representation of the redis version in the format of MAJOR.MINOR.PATH
2. `redis.REDIS_VERSION_NUM` - number representation of the redis version in the format of `0x00MMmmpp`
(`MM` - major, `mm` - minor, `pp` - patch). The number version can be used to check if version is greater or less
another version. The string version can be used to return to the user or print as logs.
This new API is provided to eval scripts and functions, it also possible to use this API during functions loading phase.
# Redis Function Libraries
This PR implements Redis Functions Libraries as describe on: https://github.com/redis/redis/issues/9906.
Libraries purpose is to provide a better code sharing between functions by allowing to create multiple
functions in a single command. Functions that were created together can safely share code between
each other without worrying about compatibility issues and versioning.
Creating a new library is done using 'FUNCTION LOAD' command (full API is described below)
This PR introduces a new struct called libraryInfo, libraryInfo holds information about a library:
* name - name of the library
* engine - engine used to create the library
* code - library code
* description - library description
* functions - the functions exposed by the library
When Redis gets the `FUNCTION LOAD` command it creates a new empty libraryInfo.
Redis passes the `CODE` to the relevant engine alongside the empty libraryInfo.
As a result, the engine will create one or more functions by calling 'libraryCreateFunction'.
The new funcion will be added to the newly created libraryInfo. So far Everything is happening
locally on the libraryInfo so it is easy to abort the operation (in case of an error) by simply
freeing the libraryInfo. After the library info is fully constructed we start the joining phase by
which we will join the new library to the other libraries currently exist on Redis.
The joining phase make sure there is no function collision and add the library to the
librariesCtx (renamed from functionCtx). LibrariesCtx is used all around the code in the exact
same way as functionCtx was used (with respect to RDB loading, replicatio, ...).
The only difference is that apart from function dictionary (maps function name to functionInfo
object), the librariesCtx contains also a libraries dictionary that maps library name to libraryInfo object.
## New API
### FUNCTION LOAD
`FUNCTION LOAD <ENGINE> <LIBRARY NAME> [REPLACE] [DESCRIPTION <DESCRIPTION>] <CODE>`
Create a new library with the given parameters:
* ENGINE - REPLACE Engine name to use to create the library.
* LIBRARY NAME - The new library name.
* REPLACE - If the library already exists, replace it.
* DESCRIPTION - Library description.
* CODE - Library code.
Return "OK" on success, or error on the following cases:
* Library name already taken and REPLACE was not used
* Name collision with another existing library (even if replace was uses)
* Library registration failed by the engine (usually compilation error)
## Changed API
### FUNCTION LIST
`FUNCTION LIST [LIBRARYNAME <LIBRARY NAME PATTERN>] [WITHCODE]`
Command was modified to also allow getting libraries code (so `FUNCTION INFO` command is no longer
needed and removed). In addition the command gets an option argument, `LIBRARYNAME` allows you to
only get libraries that match the given `LIBRARYNAME` pattern. By default, it returns all libraries.
### INFO MEMORY
Added number of libraries to `INFO MEMORY`
### Commands flags
`DENYOOM` flag was set on `FUNCTION LOAD` and `FUNCTION RESTORE`. We consider those commands
as commands that add new data to the dateset (functions are data) and so we want to disallows
to run those commands on OOM.
## Removed API
* FUNCTION CREATE - Decided on https://github.com/redis/redis/issues/9906
* FUNCTION INFO - Decided on https://github.com/redis/redis/issues/9899
## Lua engine changes
When the Lua engine gets the code given on `FUNCTION LOAD` command, it immediately runs it, we call
this run the loading run. Loading run is not a usual script run, it is not possible to invoke any
Redis command from within the load run.
Instead there is a new API provided by `library` object. The new API's:
* `redis.log` - behave the same as `redis.log`
* `redis.register_function` - register a new function to the library
The loading run purpose is to register functions using the new `redis.register_function` API.
Any attempt to use any other API will result in an error. In addition, the load run is has a time
limit of 500ms, error is raise on timeout and the entire operation is aborted.
### `redis.register_function`
`redis.register_function(<function_name>, <callback>, [<description>])`
This new API allows users to register a new function that will be linked to the newly created library.
This API can only be called during the load run (see definition above). Any attempt to use it outside
of the load run will result in an error.
The parameters pass to the API are:
* function_name - Function name (must be a Lua string)
* callback - Lua function object that will be called when the function is invokes using fcall/fcall_ro
* description - Function description, optional (must be a Lua string).
### Example
The following example creates a library called `lib` with 2 functions, `f1` and `f1`, returns 1 and 2 respectively:
```
local function f1(keys, args)
return 1
end
local function f2(keys, args)
return 2
end
redis.register_function('f1', f1)
redis.register_function('f2', f2)
```
Notice: Unlike `eval`, functions inside a library get the KEYS and ARGV as arguments to the
functions and not as global.
### Technical Details
On the load run we only want the user to be able to call a white list on API's. This way, in
the future, if new API's will be added, the new API's will not be available to the load run
unless specifically added to this white list. We put the while list on the `library` object and
make sure the `library` object is only available to the load run by using [lua_setfenv](https://www.lua.org/manual/5.1/manual.html#lua_setfenv) API. This API allows us to set
the `globals` of a function (and all the function it creates). Before starting the load run we
create a new fresh Lua table (call it `g`) that only contains the `library` API (we make sure
to set global protection on this table just like the general global protection already exists
today), then we use [lua_setfenv](https://www.lua.org/manual/5.1/manual.html#lua_setfenv)
to set `g` as the global table of the load run. After the load run finished we update `g`
metatable and set `__index` and `__newindex` functions to be `_G` (Lua default globals),
we also pop out the `library` object as we do not need it anymore.
This way, any function that was created on the load run (and will be invoke using `fcall`) will
see the default globals as it expected to see them and will not have the `library` API anymore.
An important outcome of this new approach is that now we can achieve a distinct global table
for each library (it is not yet like that but it is very easy to achieve it now). In the future we can
decide to remove global protection because global on different libraries will not collide or we
can chose to give different API to different libraries base on some configuration or input.
Notice that this technique was meant to prevent errors and was not meant to prevent malicious
user from exploit it. For example, the load run can still save the `library` object on some local
variable and then using in `fcall` context. To prevent such a malicious use, the C code also make
sure it is running in the right context and if not raise an error.
Redis function unit is located inside functions.c
and contains Redis Function implementation:
1. FUNCTION commands:
* FUNCTION CREATE
* FCALL
* FCALL_RO
* FUNCTION DELETE
* FUNCTION KILL
* FUNCTION INFO
2. Register engine
In addition, this commit introduce the first engine
that uses the Redis Function capabilities, the
Lua engine.