8606 Commits

Author SHA1 Message Date
Binbin
c98a28a848
Fix LREM count LONG_MIN overflow minor issue (#12465)
Limit the range of LREM count to -LONG_MAX ~ LONG_MAX.
Before the fix, passing -LONG_MAX would cause an overflow
and would effectively be the same as passing 0. (Because
this condition `toremove && removed == toremove `can never
be satisfied).

This is a minor fix as it shouldn't really affect users,
more like a cleanup.
2023-08-21 12:50:41 +03:00
Yves LeBras
16988208bd
config.memkeys init for consistency (#12505)
Initializing `memkeys` to 0 for consistency and clarity.
the config struct is anyway zeroed, but other fields are explicitly initialized.
2023-08-21 08:17:07 +03:00
meiravgri
fe47c2027b
Signal handler attributes (#12426)
This PR purpose is to make the crash report process thread safe.
main changes include:

1. `setupSigSegvHandler()` is introduced to initialize the signal handler.
This function first initializes the signal handler mutex (if not initialized yet)
and then registers the process to the signal handler. 

2. **sigsegvHandler** flags :
SA_NODEFER - don't add the signal to the process signal mask. We use this
flag because we want to be able to handle a second call to the signal manually.
removed SA_RESETHAND: this flag resets the signal handler function upon the first
entrance to the registered function. The reason to use this flag is to protect from
recursively entering the signal handler by the same thread. But, it also means
that if a second thread crashes while handling a signal, the process will be
terminated immediately and we won't get the crash report.
In this PR we discard this flag. The signal handler guard described below purpose
is to solve the above issues.

3. Add a **signal handler lock** with ERRORCHECK attributes. 
The lock's purpose is to ensure that only one thread generates a crash report.
Once a second thread enters the signal handler it will be blocked.
We use the ERRORCHECK lock in order to protect from possible deadlock in
case the thread handling the crash gets a signal. In the latest scenario, we log
what we have collected until the handler crashed.

At the end of the crash report we reset the signal handler SIG_DFL, with no flags, and
rethrow the signal to generate a core dump (if enabled) and exit the process.

During the work on this PR we wanted to understand the historical reasons for
how crash is handled.
With respect to the choice of the flag, we believe the **SA_RESETHAND** was not
added for any specific purpose.
**SA_ONSTACK** which is removed here from bugReportEnd(), was originally also
set in the initial registration to signal handler, but removed in 3ada43e73. In addition,
it was removed from another location in deee2c1ef with the following description,
which is also relevant to why it should be removed from bugReportEnd:

> it seems to be some valgrind bug with SA_ONSTACK.
> SA_ONSTACK seems unneeded since WD is not recursive (SA_NODEFER was removed),
> also, not sure if it's even valid without a call to sigaltstack()
2023-08-20 19:16:45 +03:00
Binbin
44cc0fcb9d
redis-cli --stat take dbnum value from CONFIG GET to output total keys (#12279)
In the past we hardcoded it to 20, causing it to not count keys
for more databases.
2023-08-16 10:54:37 +03:00
Tyler Bream (Event pipeline)
ac6bc5d1a8
redis-cli: Fix print of keys per cluster host when over int max (#11698)
When running cluster info, and the number of keys overflows the integer
value, the summary no longer makes sense. This fixes by using an appropriate
type to handle values over the max int value.
2023-08-16 10:48:49 +03:00
WangYu
17904780ae
skip the rehashed entries in dictNext (#12386)
If dict is rehashing, the  entries in the head of table[0] is moved to table[1]
and all entries in `table[0][0:rehashidx]` is NULL.

`dictNext` start looking for non-NULL entry from table 0 index 0, and the first call
of `dictNext` on a rehashing dict will Iterate many times to skip those NULL entries.
We can easily skip those entries by setting `iter->index` as `iter->d->rehashidx` when
dict is rehashing and it's the first call of dictNext (`iter->index == -1 && iter->table == 0`).

Co-authored-by: sundb <sundbcn@gmail.com>
2023-08-16 10:45:26 +03:00
Wen Hui
965dc90b72
change return type to be consistant (#12479)
Currently rdbSaveMillisecondTime, rdbSaveDoubleValue api's return type is
int but they return the value directly from rdbWriteRaw function which has the
return type of ssize_t. As this may cause overflow to int so changed to ssize_t.
2023-08-16 10:38:59 +03:00
Binbin
f4549d1cf4
Fix CLUSTER REPLICAS time complexity, should be O(N) (#12477)
We iterate over all replicas to get the result, the time complexity
should be O(N), like CLUSTER NODES complexity is O(N).
2023-08-14 20:57:55 -07:00
Madelyn Olson
7c179f9bf4
Fixed a bug where sequential matching ACL rules weren't compressed (#12472)
When adding a new ACL rule was added, an attempt was made to remove
any "overlapping" rules. However, there when a match was found, the search
was not resumed at the right location, but instead after the original position of
the original command.

For example, if the current rules were `-config +config|get` and a rule `+config`
was added. It would identify that `-config` was matched, but it would skip over
`+config|get`, leaving the compacted rule `-config +config`. This would be evaluated
safely, but looks weird.

This bug can only be triggered with subcommands, since that is the only way to
have sequential matching rules. Resolves #12470. This is also only present in 7.2.
I think there was also a minor risk of removing another valid rule, since it would start
the search of the next command at an arbitrary point. I couldn't find a valid offset that
would have cause a match using any of the existing commands that have subcommands
with another command.
2023-08-10 09:58:53 +03:00
zhaozhao.zz
1b6bdff48d
optimize the check of kill pubsub clients after modifying ACL rules (#12457)
if there are no subscribers, we can ignore the operation
2023-08-05 10:00:54 +03:00
zhaozhao.zz
8226f39fb2
do not call handleClientsBlockedOnKeys inside yielding command (#12459)
Fix the assertion when a busy script (timeout) signal ready keys (like LPUSH),
and then an arbitrary client's `allow-busy` command steps into `handleClientsBlockedOnKeys`
try wake up clients blocked on keys (like BLPOP).

Reproduction process:
1. start a redis with aof
    `./redis-server --appendonly yes`
2. exec blpop
    `127.0.0.1:6379> blpop a 0`
3. use another client call a busy script and this script push the blocked key
    `127.0.0.1:6379> eval "redis.call('lpush','a','b') while(1) do end" 0`
4. user a new client call an allow-busy command like auth
    `127.0.0.1:6379> auth a`

BTW, this issue also break the atomicity of script.

This bug has been around for many years, the old versions only have the
atomic problem, only 7.0/7.2 has the assertion problem.

Co-authored-by: Oran Agra <oran@redislabs.com>
2023-08-05 09:52:03 +03:00
Binbin
7af9f4b36e
Fix GEOHASH / GEODIST / GEOPOS time complexity, should be O(1) (#12445)
GEOHASH / GEODIST / GEOPOS use zsetScore to get the score, in skiplist encoding,
we use dictFind to get the score, which is O(1), same as ZSCORE command.
It is not clear why these commands had O(Log(N)), and O(N) until now.
2023-08-05 07:29:24 +03:00
Meir Shpilraien (Spielrein)
2ee1bbb53b
Ensure that the function load timeout is disabled during loading from RDB/AOF and on replicas. (#12451)
When loading a function from either RDB/AOF or a replica, it is essential not to
fail on timeout errors. The loading time may vary due to various factors, such as
hardware specifications or the system's workload during the loading process.
Once a function has been successfully loaded, it should be allowed to load from
persistence or on replicas without encountering a timeout failure.

To maintain a clear separation between the engine and Redis internals, the
implementation refrains from directly checking the state of Redis within the
engine itself. Instead, the engine receives the desired timeout as part of the
library creation and duly respects this timeout value. If Redis wishes to disable
any timeout, it can simply send a value of 0.
2023-08-02 11:43:31 +03:00
zhaozhao.zz
90ab91f00b
fix false success and a memory leak for ACL selector with bad parenthesis combination (#12452)
When doing merge selector, we should check whether the merge
has started (i.e., whether open_bracket_start is -1) every time.
Otherwise, encountering an illegal selector pattern could succeed
and also cause memory leaks, for example:

```
acl setuser test1 (+PING (+SELECT (+DEL )
```

The above would leak memory and succeed with only DEL being applied,
and would now error after the fix.

Co-authored-by: Oran Agra <oran@redislabs.com>
2023-08-02 10:46:06 +03:00
zhaozhao.zz
01eb939a06
update monitor client's memory and evict correctly (#12420)
A bug introduced in #11657 (7.2 RC1), causes client-eviction (#8687)
and INFO to have inaccurate memory usage metrics of MONITOR clients.

Because the type in `c->type` and the type in `getClientType()` are confusing
(in the later, `CLIENT_TYPE_NORMAL` not `CLIENT_TYPE_SLAVE`), the comment
we wrote in `updateClientMemUsageAndBucket` was wrong, and in fact that function
didn't skip monitor clients.
And since it doesn't skip monitor clients, it was wrong to delete the call for it from
`replicationFeedMonitors` (it wasn't a NOP).
That deletion could mean that the monitor client memory usage is not always up to
date (updated less frequently, but still a candidate for client eviction).
2023-07-25 16:10:38 +03:00
nihohit
9f512017aa
Update request/response policies. (#12417)
changing the response and request policy of a few commands,
see https://redis.io/docs/reference/command-tips

1. RANDOMKEY used to have no response policy, which means
  that when sent to multiple shards, the responses should be aggregated.
  this normally applies to commands that return arrays, but since RANDOMKEY
  replies with a simple string, it actually requires a SPECIAL response policy
  (for the client to select just one)
2. SCAN used to have no response policy, but although the key names part of
  the response can be aggregated, the cursor part certainly can't.
3. MSETNX had a request policy of MULTI_SHARD and response policy of AGG_MIN,
  but in fact the contract with MSETNX is that when one key exists, it returns 0
  and doesn't set any key, routing it to multiple shards would mean that if one failed
  and another succeeded, it's atomicity is broken and it's impossible to return a valid
  response to the caller.

Co-authored-by: Shachar Langbeheim <shachlan@amazon.com>
Co-authored-by: Oran Agra <oran@redislabs.com>
2023-07-25 10:21:23 +03:00
Makdon
2495b90a64
redis-cli: use previous hostip when not provided by redis cluster server (#12273)
When the redis server cluster running on cluster-preferred-endpoint-type unknown-endpoint mode, and receive a request that should be redirected to another redis server node, it does not reply the hostip, but a empty host like MOVED 3999 :6381.

The redis-cli would try to connect to an address without a host, which cause the issue:
```
127.0.0.1:7002> set bar bar
-> Redirected to slot [5061] located at :7000
Could not connect to Redis at :7000: No address associated with hostname
Could not connect to Redis at :7000: No address associated with hostname
not connected> exit
```

In this case, the redis-cli should use the previous hostip when there's no host provided by the server.

---------

Co-authored-by: Viktor Söderqvist <viktor.soderqvist@est.tech>
Co-authored-by: Madelyn Olson <madelynolson@gmail.com>
2023-07-20 15:31:06 -07:00
George Guimares
9a3b99cbd1 Replaced comment with excessive warning.
The data structures in the comment are not in sync and don't need to be.
Referring to function that handles conversion.
2023-07-16 17:04:15 -05:00
Chen Tianjie
91011100ba
Hide the comma after cport when there is no hostname. (#12411)
According to the format shown in https://redis.io/commands/cluster-nodes/
```
<ip:port@cport[,hostname[,auxiliary_field=value]*]>
```
when there is no hostname, and the auxiliary fields are hidden, the cluster topology should be
```
<ip:port@cport>
```
However in the code we always print the hostname even when it is an empty string, leaving an unnecessary comma tailing after cport, which is weird and conflicts with the doc.
```
94ca2f6cf85228a49fde7b738ee1209de7bee325 127.0.0.1:6379@16379, myself,master - 0 0 0 connected 0-16383
```
2023-07-15 20:31:42 -07:00
Binbin
14f802b360
Initialize cluster owner_not_claiming_slot to avoid warning (#12391)
valgrind report a Uninitialised warning:
```
==25508==  Uninitialised value was created by a heap allocation
==25508==    at 0x4848899: malloc (in
/usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so)
==25508==    by 0x1A35A1: ztrymalloc_usable_internal (zmalloc.c:117)
==25508==    by 0x1A368D: zmalloc (zmalloc.c:145)
==25508==    by 0x21FDEA: clusterInit (cluster.c:973)
==25508==    by 0x19DC09: main (server.c:7306)
```

Introduced in #12344
2023-07-07 06:40:44 -07:00
zhaozhao.zz
aefdc57f1e
add an assertion to make sure numkeys is 0 in getKeysUsingKeySpecs (#12389)
This is an addition to #12380, to prevent potential bugs when collecting
keys from multiple commands in the future.
Note that this function also resets numkeys in some cases.
2023-07-06 08:17:25 +03:00
Sankar
1190f25ca7
Process loss of slot ownership in cluster bus (#12344)
Process loss of slot ownership in cluster bus

When a node no longer owns a slot, it clears the bit corresponding
to the slot in the cluster bus messages. The receiving nodes
currently don't record the fact that the sender stopped claiming
a slot until some other node in the cluster starts claiming the slot.
This can cause a slot to go missing during slot migration when subjected
to inopportune race with addition of new shards or a failover.
This fix forces the receiving nodes to process the loss of ownership
to avoid spreading wrong information.
2023-07-05 17:46:23 -07:00
Job Henandez Lara
b35e20d1de
redis-benchmark: add documentation for the amount of clients needed (#12372)
In cluster mode, we need more clients than the number of nodes.
2023-07-05 10:27:14 +03:00
Lior Lahav
b7559d9f3b
Fix possible crash in command getkeys (#12380)
When getKeysUsingKeySpecs processes a command with more than one key-spec,
and called with a total of more than 256 keys, it'll call getKeysPrepareResult again,
but since numkeys isn't updated, getKeysPrepareResult will not bother to copy key
names from the old result (leaving these slots uninitialized). Furthermore, it did not
consider the keys it already found when allocating more space.

Co-authored-by: Oran Agra <oran@redislabs.com>
2023-07-03 12:45:18 +03:00
Binbin
6b57241fa8
Revert zrangeGenericCommand negative offset check (#12377)
The negative offset check was added in #9052, we realized
that this is a non-mandatory breaking change and we would
like to add it only in 8.0.

This reverts PR #9052, will be re-introduced later in 8.0.
2023-07-02 20:38:30 +03:00
Binbin
26174123ee
Avoid DEBUG POPULATE crash at dictExpand OOM (#12363)
Change to use dictTryExpand, return error on OOM.
2023-07-01 07:35:35 -07:00
DevineLiu
6bf9b144ef
redis-cli: Support URIs with IPv6 (#11834)
Co-authored-by: hrliu <hrliu@alauda.io>
2023-06-29 19:32:01 +03:00
judeng
07ed0eafa9
improve performance for scan command when matching pattern or data type (#12209)
Optimized the performance of the SCAN command in a few ways:
1. Move the key filtering (by MATCH pattern) in the scan callback,
  so as to avoid collecting them for later filtering.
2. Reduce a many memory allocations and copying (use a reference
  to the original sds, instead of creating an robj, an excessive 2 mallocs
  and one string duplication)
3. Compare TYPE filter directly (as integers), instead of inefficient string
  compare per key.
4. fixed a small bug: when scan zset and hash types, maxiterations uses
  a more accurate number to avoid wrong double maxiterations.

Changes **postponed** for a later version (8.0):
1. Prepare to move the TYPE filtering to the scan callback as well. this was
  put on hold since it has side effects that can be considered a breaking
  change, which is that we will not attempt to do lazy expire (delete) a key
  that was filtered by not matching the TYPE (changing it would mean TYPE filter
  starts behaving the same as MATCH filter already does in that respect). 
2. when the specified key TYPE filter is an unknown type, server will reply a error
  immediately instead of doing a full scan that comes back empty handed. 

Benchmark result:
For different scenarios, we obtained about 30% or more performance improvement.

Co-authored-by: Oran Agra <oran@redislabs.com>
2023-06-27 16:43:46 +03:00
Maria Markova
b2cdf6bcc3
Adding of O3 on linking stage for OPTIMIZATION=-O3 cases (#12339)
Added missing O3 flag to linking stage in default option "-O3 -flto". 
Flags doesn't lead to significant changes in performance:
- +0.21% in geomean for all benchmarks on ICX bare-metal (256 cpus)
- +0.33% in geomean for all benchmarks on m6i.2xlarge (16 cpus) 
Checked on redis from Mar'30 (commit 1f76bb17ddcb2adc484bf82f1b839c45e264524f ). Comparison file is attached.
2023-06-27 11:54:17 +03:00
michalbiesek
ef4bb4e374
Add RISC-V debug support (#12349)
- Add support for `getAndSetMcontextEip`
- Add support for `logRegisters`
2023-06-27 11:53:42 +03:00
Binbin
f58fd9e6d2
Set HIDDEN_CONFIG flag on aof-disable-auto-gc (#12355)
aof-disable-auto-gc was created for testing purposes,
to check if certain AOF files were actually generated
and if they were deletedcorrectly during testing.

So hiding it, see #12249 for more discussion.
2023-06-27 10:21:58 +03:00
Chen Tianjie
22a29935ff
Support TLS service when "tls-cluster" is not enabled and persist both plain and TLS port in nodes.conf (#12233)
Originally, when "tls-cluster" is enabled, `port` is set to TLS port. In order to support non-TLS clients, `pport` is used to propagate TCP port across cluster nodes. However when "tls-cluster" is disabled, `port` is set to TCP port, and `pport` is not used, which means the cluster cannot provide TLS service unless "tls-cluster" is on.
```
typedef struct {
    // ...
    uint16_t port;  /* Latest known clients port (TLS or plain). */
    uint16_t pport; /* Latest known clients plaintext port. Only used if the main clients port is for TLS. */
    // ...
} clusterNode;
```
```
typedef struct {
    // ...
    uint16_t port;   /* TCP base port number. */
    uint16_t pport;  /* Sender TCP plaintext port, if base port is TLS */
    // ...
} clusterMsg;
```
This PR renames `port` and `pport` in `clusterNode` to `tcp_port` and `tls_port`, to record both ports no matter "tls-cluster" is enabled or disabled.

This allows to provide TLS service to clients when "tls-cluster" is disabled: when displaying cluster topology, or giving `MOVED` error, server can provide TLS or TCP port according to client's connection type, no matter what type of connection cluster bus is using.

For backwards compatibility, `port` and `pport` in `clusterMsg` are preserved, when "tls-cluster" is enabled, `port` is set to TLS port and `pport` is set to TCP port, when "tls-cluster" is disabled, `port` is set to TCP port and `pport` is set to TLS port (instead of 0).

Also, in the nodes.conf file, a new aux field displaying an extra port is added to complete the persisted info. We may have `tls_port=xxxxx` or `tcp_port=xxxxx` in the aux field, to complete the cluster topology, while the other port is stored in the normal `<ip>:<port>` field. The format is shown below.
```
<node-id> <ip>:<tcp_port>@<cport>,<hostname>,shard-id=...,tls-port=6379 myself,master - 0 0 0 connected 0-1000
```
Or we can switch the position of two ports, both can be correctly resolved.
```
<node-id> <ip>:<tls_port>@<cport>,<hostname>,shard-id=...,tcp-port=6379 myself,master - 0 0 0 connected 0-1000
```
2023-06-26 07:43:38 -07:00
Binbin
9600553ef2
Move clusterBeforeSleep before blockedBeforeSleep (#12343)
The new blockedBeforeSleep was added in #12337, it breaks the order in 2ecb5ed.

This may be related to #2288, quoted from comment in #2288:
```
Moreover the clusterBeforeSleep() call was misplaced at the end of the chain of the
beforeSleep() call in redis.c. It should be at the top, before processing un blocking
clients. This is exactly the reason in the specific instance of the bug as reported,
why the state was not updated in time before clients served.
```
2023-06-25 14:21:03 +03:00
Meir Shpilraien (Spielrein)
153f8f082e
Fix use after free on blocking RM_Call. (#12342)
blocking RM_Call was introduced on: #11568, It allows a module to perform
blocking commands and get the reply asynchronously.If the command gets
block, a special promise CallReply is returned that allow to set the unblock
handler. The unblock handler will be called when the command invocation
finish and it gets, as input, the command real reply.

The issue was that the real CallReply was created using a stack allocated
RedisModuleCtx which is no longer available after the unblock handler finishes.
So if the module keeps the CallReply after the unblock handler finished, the
CallReply holds a pointer to invalid memory and will try to access it when the
CallReply will be released.

The solution is to create the CallReply with a NULL context to make it totally
detached and can be freed freely when the module wants.

Test was added to cover this case, running the test with valgrind before the
fix shows the use after free error. With the fix, there are no valgrind errors.

unrelated: adding a missing `$rd close` in many tests in that file.
2023-06-25 14:12:27 +03:00
guybe7
3230199920
Modules: Unblock from within a timer coverage (#12337)
Apart from adding the missing coverage, this PR also adds `blockedBeforeSleep`
that gathers all block-related functions from `beforeSleep`

The order inside `blockedBeforeSleep` is different: now `handleClientsBlockedOnKeys`
(which may unblock clients) is called before `processUnblockedClients` (which handles
unblocked clients).
It makes sense to have this order.

There are no visible effects of the wrong ordering, except some cleanups of the now-unblocked
client would have  happen in the next `beforeSleep` (will now happen in the current one)

The reason we even got into it is because i triggers an assertion in logresreq.c (breaking
the assumption that `unblockClient` is called **before** actually flushing the reply to the socket):
`handleClientsBlockedOnKeys` is called, then it calls `moduleUnblockClientOnKey`, which calls
`moduleUnblockClient`, which adds the client to `moduleUnblockedClients` back to `beforeSleep`,
we call `handleClientsWithPendingWritesUsingThreads`, it writes the data of buf to the client, so
`client->bufpos` became 0
On the next `beforeSleep`, we call `moduleHandleBlockedClients`, which calls `unblockClient`,
which calls `reqresAppendResponse`, triggering the assert. (because the `bufpos` is 0) - see https://github.com/redis/redis/pull/12301#discussion_r1226386716
2023-06-22 23:15:16 +03:00
Gabi Ganam
9e5f45f230
Fix typos in comments (#12338) 2023-06-22 08:10:42 -07:00
Binbin
47f32bc998
Print strerror when bio initialization fails (#12333)
Now we can see something like this:
```
Fatal: Can't initialize Background Jobs. Error message: Cannot allocate memory
```
2023-06-21 17:57:11 +03:00
guybe7
d46ef88671
Improve moduleBlockClient timeout overflow handling (#12174)
Continuation of https://github.com/redis/redis/pull/11338
avoid overflow adding input timeout to "now" in moduleBlockClient.
2023-06-21 12:48:13 +03:00
guybe7
20fa156067
Align RM_ReplyWithErrorFormat with RM_ReplyWithError (#12321)
Introduced by https://github.com/redis/redis/pull/11923 (Redis 7.2 RC2)

It's very weird and counterintuitive that `RM_ReplyWithError` requires the error-code
**without** a hyphen while `RM_ReplyWithErrorFormat` requires either the error-code
**with** a hyphen or no error-code at all
```
RedisModule_ReplyWithError(ctx, "BLA bla bla");
```
vs.
```
RedisModule_ReplyWithErrorFormat(ctx, "-BLA %s", "bla bla");
```

This commit aligns RM_ReplyWithErrorFormat to behvae like RM_ReplyWithError.
it's a breaking changes but it's done before 7.2 goes GA.
2023-06-20 20:44:43 +03:00
Oran Agra
8ad8f0f9d8
Fix broken protocol when PUBLISH emits local push inside MULTI (#12326)
When a connection that's subscribe to a channel emits PUBLISH inside MULTI-EXEC,
the push notification messes up the EXEC response.

e.g. MULTI, PING, PUSH foo bar, PING, EXEC
the EXEC's response will contain: PONG, {message foo bar}, 1. and the second PONG
will be delivered outside the EXEC's response.

Additionally, this PR changes the order of responses in case of a plain PUBLISH (when
the current client also subscribed to it), by delivering the push after the command's
response instead of before it.
This also affects modules calling RM_PublishMessage in a similar way, so that we don't
run the risk of getting that push mixed together with the module command's response.
2023-06-20 20:41:41 +03:00
judeng
93708c7f6a
use embedded string object and more efficient ll2string for long long value convert to string (#12250)
A value of type long long is always less than 21 bytes when convert to a
string, so always meets the conditions for using embedded string object
which can always get memory reduction and performance gain (less calls
to the heap allocator).
Additionally, for the conversion of longlong type to sds, we also use a faster
algorithm (the one in util.c instead of the one that used to be in sds.c). 

For the DECR command on 32-bit Redis, we get about a 5.7% performance
improvement. There will also be some performance gains for some commands
that heavily use sdscatfmt to convert numbers, such as INFO.

Co-authored-by: Oran Agra <oran@redislabs.com>
2023-06-20 15:14:44 +03:00
Binbin
d9c2ef8a72
zrangeGenericCommand add check for negative offset (#9052)
Now we will check the offset in zrangeGenericCommand.
With a negative offset, we will throw an error and return.

This also resolve the issue of zeroing the destination key
in case of the "store" variant when we input a negative offset.
```
127.0.0.1:6379> set key value
OK
127.0.0.1:6379> zrangestore key myzset 0 10 byscore limit -1 10
(integer) 0
127.0.0.1:6379> exists key
(integer) 0
```

This change affects the following commands:
- ZRANGE / ZRANGESTORE / ZRANGEBYLEX / ZRANGEBYSCORE
- ZREVRANGE / ZREVRANGEBYSCORE / ZREVRANGEBYLEX
2023-06-20 14:33:17 +03:00
Binbin
d306d86146
Fix ZRANK/ZREVRANK reply_schema description (#12331)
The parameter name is WITHSCORE instead of WITHSCORES.
2023-06-20 11:15:40 +03:00
mstmdev
13e17e94d8
Doc indentation fix for the --functions-rdb option (#12328) 2023-06-20 11:15:11 +03:00
Wen Hui
813924b4c3
Sanitizer reported memory leak for '--invalid' option or port number is missed cases to redis-server. (#12322)
Observed that the sanitizer reported memory leak as clean up is not done
before the process termination in negative/following cases:

**- when we passed '--invalid' as option to redis-server.**

```
 -vm:~/mem-leak-issue/redis$ ./src/redis-server --invalid

*** FATAL CONFIG FILE ERROR (Redis 255.255.255) ***
Reading the configuration file, at line 2
>>> 'invalid'
Bad directive or wrong number of arguments

=================================================================
==865778==ERROR: LeakSanitizer: detected memory leaks

Direct leak of 8 byte(s) in 1 object(s) allocated from:
    #0 0x7f0985f65867 in __interceptor_malloc ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:145
    #1 0x558ec86686ec in ztrymalloc_usable_internal /home/ubuntu/mem-leak-issue/redis/src/zmalloc.c:117
    #2 0x558ec86686ec in ztrymalloc_usable /home/ubuntu/mem-leak-issue/redis/src/zmalloc.c:135
    #3 0x558ec86686ec in ztryrealloc_usable_internal /home/ubuntu/mem-leak-issue/redis/src/zmalloc.c:276
    #4 0x558ec86686ec in zrealloc /home/ubuntu/mem-leak-issue/redis/src/zmalloc.c:327
    #5 0x558ec865dd7e in sdssplitargs /home/ubuntu/mem-leak-issue/redis/src/sds.c:1172
    #6 0x558ec87a1be7 in loadServerConfigFromString /home/ubuntu/mem-leak-issue/redis/src/config.c:472
    #7 0x558ec87a13b3 in loadServerConfig /home/ubuntu/mem-leak-issue/redis/src/config.c:718
    #8 0x558ec85e6f15 in main /home/ubuntu/mem-leak-issue/redis/src/server.c:7258
    #9 0x7f09856e5d8f in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58

SUMMARY: AddressSanitizer: 8 byte(s) leaked in 1 allocation(s).

```

**- when we pass '--port' as option and missed to add port number to redis-server.**

```
vm:~/mem-leak-issue/redis$ ./src/redis-server --port

*** FATAL CONFIG FILE ERROR (Redis 255.255.255) ***
Reading the configuration file, at line 2
>>> 'port'
wrong number of arguments

=================================================================
==865846==ERROR: LeakSanitizer: detected memory leaks

Direct leak of 8 byte(s) in 1 object(s) allocated from:
    #0 0x7fdcdbb1f867 in __interceptor_malloc ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:145
    #1 0x557e8b04f6ec in ztrymalloc_usable_internal /home/ubuntu/mem-leak-issue/redis/src/zmalloc.c:117
    #2 0x557e8b04f6ec in ztrymalloc_usable /home/ubuntu/mem-leak-issue/redis/src/zmalloc.c:135
    #3 0x557e8b04f6ec in ztryrealloc_usable_internal /home/ubuntu/mem-leak-issue/redis/src/zmalloc.c:276
    #4 0x557e8b04f6ec in zrealloc /home/ubuntu/mem-leak-issue/redis/src/zmalloc.c:327
    #5 0x557e8b044d7e in sdssplitargs /home/ubuntu/mem-leak-issue/redis/src/sds.c:1172
    #6 0x557e8b188be7 in loadServerConfigFromString /home/ubuntu/mem-leak-issue/redis/src/config.c:472
    #7 0x557e8b1883b3 in loadServerConfig /home/ubuntu/mem-leak-issue/redis/src/config.c:718
    #8 0x557e8afcdf15 in main /home/ubuntu/mem-leak-issue/redis/src/server.c:7258
    #9 0x7fdcdb29fd8f in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58

Indirect leak of 10 byte(s) in 1 object(s) allocated from:
    #0 0x7fdcdbb1fc18 in __interceptor_realloc ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:164
    #1 0x557e8b04f9aa in ztryrealloc_usable_internal /home/ubuntu/mem-leak-issue/redis/src/zmalloc.c:287
    #2 0x557e8b04f9aa in ztryrealloc_usable /home/ubuntu/mem-leak-issue/redis/src/zmalloc.c:317
    #3 0x557e8b04f9aa in zrealloc_usable /home/ubuntu/mem-leak-issue/redis/src/zmalloc.c:342
    #4 0x557e8b033f90 in _sdsMakeRoomFor /home/ubuntu/mem-leak-issue/redis/src/sds.c:271
    #5 0x557e8b033f90 in sdsMakeRoomFor /home/ubuntu/mem-leak-issue/redis/src/sds.c:295
    #6 0x557e8b033f90 in sdscatlen /home/ubuntu/mem-leak-issue/redis/src/sds.c:486
    #7 0x557e8b044e1f in sdssplitargs /home/ubuntu/mem-leak-issue/redis/src/sds.c:1165
    #8 0x557e8b188be7 in loadServerConfigFromString /home/ubuntu/mem-leak-issue/redis/src/config.c:472
    #9 0x557e8b1883b3 in loadServerConfig /home/ubuntu/mem-leak-issue/redis/src/config.c:718
    #10 0x557e8afcdf15 in main /home/ubuntu/mem-leak-issue/redis/src/server.c:7258
    #11 0x7fdcdb29fd8f in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58

SUMMARY: AddressSanitizer: 18 byte(s) leaked in 2 allocation(s).

```

As part analysis found that the sdsfreesplitres is not called when this condition checks are being hit.

Output after the fix:


```
vm:~/mem-leak-issue/redis$ ./src/redis-server --invalid

*** FATAL CONFIG FILE ERROR (Redis 255.255.255) ***
Reading the configuration file, at line 2
>>> 'invalid'
Bad directive or wrong number of arguments
vm:~/mem-leak-issue/redis$

===========================================
vm:~/mem-leak-issue/redis$ ./src/redis-server --jdhg

*** FATAL CONFIG FILE ERROR (Redis 255.255.255) ***
Reading the configuration file, at line 2
>>> 'jdhg'
Bad directive or wrong number of arguments

---------------------------------------------------------------------------
vm:~/mem-leak-issue/redis$ ./src/redis-server --port

*** FATAL CONFIG FILE ERROR (Redis 255.255.255) ***
Reading the configuration file, at line 2
>>> 'port'
wrong number of arguments
```

Co-authored-by: Oran Agra <oran@redislabs.com>
2023-06-20 10:07:29 +03:00
Shaya Potter
07316f16f0
Add ability for modules to know which client is being cmd filtered (#12219)
Adds API
- RedisModule_CommandFilterGetClientId()

Includes addition to commandfilter test module to validate that it works
by performing the same command from 2 different clients
2023-06-20 09:03:52 +03:00
Binbin
cd4f3e20c1
Fix cluster human_nodename Getter data loss in nodes.conf (#12325)
auxHumanNodenameGetter limited to %.40s, since we did not limit the
length of config cluster-announce-human-nodename, %.40s will cause
nodename data loss (we will persist it in nodes.conf).

Additional modified auxHumanNodenamePresent to use sdslen.
2023-06-19 17:13:18 -07:00
Binbin
b510624978
Optimize PSUBSCRIBE and PUNSUBSCRIBE from O(N*M) to O(N) (#12298)
In the original implementation, the time complexity of the commands
is actually O(N*M), where N is the number of patterns the client is
already subscribed and M is the number of patterns to subscribe to.
The docs are all wrong about this.

Specifically, because the original client->pubsub_patterns is a list,
so we need to do listSearchKey which is O(N). In this PR, we change it
to a dict, so the search becomes O(1).

At the same time, both pubsub_channels and pubsubshard_channels are dicts.
Changing pubsub_patterns to a dictionary improves the readability and
maintainability of the code.
2023-06-19 16:31:18 +03:00
Wen Hui
070453eef3
Cluster human readable nodename feature (#9564)
This PR adds a human readable name to a node in clusters that are visible as part of error logs. This is useful so that admins and operators of Redis cluster have better visibility into failures without having to cross-reference the generated ID with some logical identifier (such as pod-ID or EC2 instance ID). This is mentioned in #8948. Specific nodenames can be set by using the variable cluster-announce-human-nodename. The nodename is gossiped using the clusterbus extension in #9530.

Co-authored-by: Madelyn Olson <madelyneolson@gmail.com>
2023-06-17 21:16:51 -07:00
sundb
b00a235186
Use Reservoir Sampling for random sampling of dict, and fix hang during fork (#12276)
## Issue:
When a dict has a long chain or the length of the chain is longer than
the number of samples, we will never be able to sample the elements
at the end of the chain using dictGetSomeKeys().
This could mean that SRANDMEMBER can be hang in and endless loop.
The most severe case, is the pathological case of when someone uses SCAN+DEL
or SSCAN+SREM creating an unevenly distributed dict.
This was amplified by the recent change in #11692 which prevented a
down-sizing rehashing while there is a fork.

## Solution
1. Before, we will stop sampling when we reach the maximum number
  of samples, even if there is more data after the current chain.
  Now when we reach the maximum we use the Reservoir Sampling
  algorithm to fairly sample the end of the chain that cannot be sampled
2. Fix the rehashing code, so that the same as it allows rehashing for up-sizing
  during fork when the ratio is extreme, it will allow it for down-sizing as well.

Issue was introduced (or became more severe) by #11692

Co-authored-by: Oran Agra <oran@redislabs.com>
2023-06-16 19:13:07 +03:00