890 Commits

Author SHA1 Message Date
VivekSainiEQ
a7e7540284 Resolved merge conflicts in prior commit
Former-commit-id: b88f06b16f3d9e58ec884c61d2d074d7a489775e
2021-10-21 22:35:15 +00:00
VivekSainiEQ
b84ee93b5d Merge tag '6.2.6' into Redis_626_Merge
Former-commit-id: e6d7e01be6965110d487e12f40511fe0b3497695
2021-10-21 22:33:55 +00:00
DarrenJiang13
94bf9b175e [BUGFIX] Add some missed error statistics (#9328)
add error counting for some missed behaviors.

(cherry picked from commit 43eb0ce3bf76a5d287b93a767bead9ad6230a1ad)
2021-10-04 13:59:40 +03:00
Oran Agra
1334180f82 Improvements to corrupt payload sanitization (#9321)
Recently we found two issues in the fuzzer tester: #9302 #9285
After fixing them, more problems surfaced and this PR (as well as #9297) aims to fix them.

Here's a list of the fixes
- Prevent an overflow when allocating a dict hashtable
- Prevent OOM when attempting to allocate a huge string
- Prevent a few invalid accesses in listpack
- Improve sanitization of listpack first entry
- Validate integrity of stream consumer groups PEL
- Validate integrity of stream listpack entry IDs
- Validate ziplist tail followed by extra data which start with 0xff

Co-authored-by: sundb <sundbcn@gmail.com>
(cherry picked from commit 0c90370e6d71cc68e4d9cc79a0d8b1e768712a5b)
2021-10-04 13:59:40 +03:00
menwen
44b3a6df77 Add latency monitor sample when key is deleted via lazy expire (#9317)
Fix that there is no sample latency after the key expires via expireIfNeeded().
Some refactoring for shared code.

(cherry picked from commit ca559819f7dcd97ba9ef667bf38360a9527d62f6)
2021-10-04 13:59:40 +03:00
Oran Agra
24977cdff6 Fix ziplist and listpack overflows and truncations (CVE-2021-32627, CVE-2021-32628)
- fix possible heap corruption in ziplist and listpack resulting by trying to
  allocate more than the maximum size of 4GB.
- prevent ziplist (hash and zset) from reaching size of above 1GB, will be
  converted to HT encoding, that's not a useful size.
- prevent listpack (stream) from reaching size of above 1GB.
- XADD will start a new listpack if the new record may cause the previous
  listpack to grow over 1GB.
- XADD will respond with an error if a single stream record is over 1GB
- List type (ziplist in quicklist) was truncating strings that were over 4GB,
  now it'll respond with an error.
2021-10-04 13:59:40 +03:00
Oran Agra
c34f970be5 Prevent unauthenticated client from easily consuming lots of memory (CVE-2021-32675)
This change sets a low limit for multibulk and bulk length in the
protocol for unauthenticated connections, so that they can't easily
cause redis to allocate massive amounts of memory by sending just a few
characters on the network.
The new limits are 10 arguments of 16kb each (instead of 1m of 512mb)
2021-10-04 13:59:40 +03:00
MalavanEQAlpha
fd5f0b8ebe Merge pull request #313 from MalavanEQAlpha/fixissue295
Resolve Issue #295 by speeding up keyIsExpired and setting timeout on subkey expires.

Former-commit-id: 9e5e6c2f86548b18ae27b4e1ac20c72517392b07
2021-08-18 17:34:18 -04:00
christianEQ
cbf83d5eff fixed overly strict assert for 32bit
Former-commit-id: ce4891b33d65038bb0543eb2d3526c5310fee59b
2021-08-13 11:05:12 -04:00
VivekSainiEQ
72e02159f4 Prevent invalid mvcc timestamps from causing critical errors
Former-commit-id: 6f2dbb00119b1d0a1f5a2543d2c6af05f83ef5de
2021-08-11 15:06:14 -04:00
Christian Legge
7015a2abd2 Add REPLPING command for use during replication (#329)
* added replping command for initiating replication

* backwards compatibility for replping (retry if not recognized)

* don't allow ping during loading (load balancer fix)

* changed replping warning to notice

Former-commit-id: d7f6bc16145206e96ffeb9941398d564c3dba6a9
2021-07-29 15:50:30 -04:00
Huang Zhw
ad8562e5a1 On 32 bit platform, the bit position of GETBIT/SETBIT/BITFIELD/BITCOUNT,BITPOS may overflow (see CVE-2021-32761) (#9191)
GETBIT, SETBIT may access wrong address because of wrap.
BITCOUNT and BITPOS may return wrapped results.
BITFIELD may access the wrong address but also allocate insufficient memory and segfault (see CVE-2021-32761).

This commit uses `uint64_t` or `long long` instead of `size_t`.
related https://github.com/redis/redis/pull/8096

At 32bit platform:
> setbit bit 4294967295 1
(integer) 0
> config set proto-max-bulk-len 536870913
OK
> append bit "\xFF"
(integer) 536870913
> getbit bit 4294967296
(integer) 0

When the bit index is larger than 4294967295, size_t can't hold bit index. In the past,  `proto-max-bulk-len` is limit to 536870912, so there is no problem.

After this commit, bit position is stored in `uint64_t` or `long long`. So when `proto-max-bulk-len > 536870912`, 32bit platforms can still be correct.

For 64bit platform, this problem still exists. The major reason is bit pos 8 times of byte pos. When proto-max-bulk-len is very larger, bit pos may overflow.
But at 64bit platform, we don't have so long string. So this bug may never happen.

Additionally this commit add a test cost `512MB` memory which is tag as `large-memory`. Make freebsd ci and valgrind ci ignore this test.

(cherry picked from commit 71d452876ebf8456afaadd6b3c27988abadd1148)
2021-07-21 21:06:49 +03:00
Yossi Gottlieb
ffe1e9107c Fix CLIENT UNBLOCK crashing modules. (#9167)
Modules that use background threads with thread safe contexts are likely
to use RM_BlockClient() without a timeout function, because they do not
set up a timeout.

Before this commit, `CLIENT UNBLOCK` would result with a crash as the
`NULL` timeout callback is called. Beyond just crashing, this is also
logically wrong as it may throw the module into an unexpected client
state.

This commits makes `CLIENT UNBLOCK` on such clients behave the same as
any other client that is not in a blocked state and therefore cannot be
unblocked.

(cherry picked from commit aa139e2f02292d668370afde8c91575363c2d611)
2021-07-21 21:06:49 +03:00
Oran Agra
8149476f66 Test infra, handle RESP3 attributes and big-numbers and bools (#9235)
- promote the code in DEBUG PROTOCOL to addReplyBigNum
- DEBUG PROTOCOL ATTRIB skips the attribute when client is RESP2
- networking.c addReply for push and attributes generate assertion when
  called on a RESP2 client, anything else would produce a broken
  protocol that clients can't handle.

(cherry picked from commit 6a5bac309e868deef749c36949723b415de2496f)
2021-07-21 21:06:49 +03:00
perryitay
6ae3673b5e Fail EXEC command in case a watched key is expired (#9194)
There are two issues fixed in this commit: 
1. we want to fail the EXEC command in case there is a watched key that's logically
   expired but not yet deleted by active expire or lazy expire.
2. we saw that currently cache time is update in every `call()` (including nested calls),
   this time is being also being use for the isKeyExpired comparison, we want to update
   the cache time only in the first call (execCommand)

Co-authored-by: Oran Agra <oran@redislabs.com>
(cherry picked from commit ac8b1df8850cc80fbf9ce8c2fbde0c1d3a1b4e91)
2021-07-21 21:06:49 +03:00
John Sully
c0160a4415 Merge tag '6.2.3' into unstable
Former-commit-id: 1895dbb7680fa9aadf6040912e89c733abc8c706
2021-07-09 04:40:31 +00:00
Madelyn Olson
f8938d868b Hide migrate command from slowlog if they include auth (#8859)
Redact commands that include sensitive data from slowlog and monitor

(cherry picked from commit a59e75a475782d86d7ce2b5b9c6f5bb4a5ef0bd6)
2021-06-01 17:03:36 +03:00
yoav-steinberg
7059cfb432 Enforce client output buffer soft limit when no traffic. (#8833)
When client breached the output buffer soft limit but then went idle,
we didn't disconnect on soft limit timeout, now we do.
Note this also resolves some sporadic test failures in due to Linux
buffering data which caused tests to fail if during the test we went
back under the soft COB limit.

Co-authored-by: Oran Agra <oran@redislabs.com>
Co-authored-by: sundb <sundbcn@gmail.com>
(cherry picked from commit 152fce5e2cbf947a389da414a431f7331981a374)
2021-06-01 17:03:36 +03:00
John Sully
4e70e85ab8 Fix failover command test failures
Former-commit-id: d3c37c7159a92319759a33851669862a82cf1b28
2021-05-29 01:19:12 +00:00
John Sully
87f706f8e7 Fix TLS Build Errors
Former-commit-id: aea86c5977c01debb4f4a4340c866aa5c5f20f04
2021-05-25 16:55:59 +00:00
John Sully
ea6a0f214b Merge tag '6.2.2' into unstable
Former-commit-id: 93ebb31b17adec5d406d2e30a5b9ea71c07fce5c
2021-05-21 05:54:39 +00:00
John Sully
f49d8f9adb Merge tag '6.2.1' into unstable
Former-commit-id: bfed57e3e0edaa724b9d060a6bb8edc5a6de65fa
2021-05-19 02:59:48 +00:00
Malavan
7859cbdf59 Merge remote-tracking branch 'upstream/unstable' into fixissue295
Former-commit-id: 9110747194f1a3627adc1a672600a1c6445b8de4
2021-05-17 21:35:43 +00:00
Malavan
bafd03f53c Speedup keyIsExpired by removing subkey search
Former-commit-id: a01564158e40300ab4a0338c0a6e924972385407
2021-05-07 09:40:23 +00:00
John Sully
42f98c27e3 Fix test hang
Former-commit-id: 23647390e628de07759f8e7d8768a7f638edf01d
2021-05-07 00:28:10 +00:00
zyxwvu Shi
7ac26c3497 Use monotonic clock to check for Lua script timeout. (#8812)
This prevents a case where NTP moves the system clock
forward resulting in a false detection of a busy script.

Signed-off-by: zyxwvu Shi <i@shiyc.cn>
(cherry picked from commit f61c37cec900ba391541f20f7655aad44a26bafc)
2021-05-03 22:57:00 +03:00
Madelyn Olson
3e97074649 Fix memory leak when doing lazyfreeing client tracking table (#8822)
Interior rax pointers were not being freed

(cherry picked from commit c73b4ddfd96d00ed0d0fde17953ce63d78bc3777)
2021-05-03 22:57:00 +03:00
Hanna Fadida
6dc47733d0 Modules: adding a module type for key space notification (#8759)
Adding a new type mask ​for key space notification, REDISMODULE_NOTIFY_MODULE, to enable unique notifications from commands on REDISMODULE_KEYTYPE_MODULE type keys (which is currently unsupported).

Modules can subscribe to a module key keyspace notification by RM_SubscribeToKeyspaceEvents,
and clients by notify-keyspace-events of redis.conf or via the CONFIG SET, with the characters 'd' or 'A' 
(REDISMODULE_NOTIFY_MODULE type mask is part of the '**A**ll' notation for key space notifications).

Refactor: move some pubsub test infra from pubsub.tcl to util.tcl to be re-used by other tests.
2021-04-19 21:33:26 +03:00
guybe7
0bf6c205db Add a timeout mechanism for replicas stuck in fullsync (#8762)
Starting redis 6.0 (part of the TLS feature), diskless master uses pipe from the fork
child so that the parent is the one sending data to the replicas.
This mechanism has an issue in which a hung replica will cause the master to wait
for it to read the data sent to it forever, thus preventing the fork child from terminating
and preventing the creations of any other forks.

This PR adds a timeout mechanism, much like the ACK-based timeout,
we disconnect replicas that aren't reading the RDB file fast enough.
2021-04-15 17:18:51 +03:00
Wang Yuan
8c5634f4eb Fix wrong check for aof fsync and handle aof fsync errno (#8751)
The bio aof fsync fd may be closed by main thread (AOFRW done handler)
and even possibly reused for another socket, pipe, or file.
This can can an EBADF or EINVAL fsync error, which will lead to -MISCONF errors failing all writes.
We just ignore these errno because aof fsync did not really fail.

We handle errno when fsyncing aof in bio, so we could know the real reason
when users get -MISCONF Errors writing to the AOF file error

Issue created with #8419
2021-04-11 08:14:31 +03:00
Huang Zhw
f5cc88e65d Fix "default" and overwritten / reset users will not have pubsub channels permissions by default. (#8723)
Background:
Redis 6.2 added ACL control for pubsub channels (#7993), which were supposed
to be permissive by default to retain compatibility with redis 6.0 ACL. 
But due to a bug, only newly created users got this `acl-pubsub-default` applied,
while overwritten (updated) users got reset to `resetchannels` (denied).

Since the "default" user exists before loading the config file,
any ACL change to it, results in an update / overwrite.

So when a "default" user is loaded from config file or include ACL
file with no channels related rules, the user will not have any
permissions to any channels. But other users will have default
permissions to any channels.

When upgraded from 6.0 with config rewrite, this will lead to
"default" user channels permissions lost.
When users are loaded from include file, then call "acl load", users
will also lost channels permissions.

Similarly, the `reset` ACL rule, would have reset the user to be denied
access to any channels, ignoring `acl-pubsub-default` and breaking
compatibility with redis 6.0.

The implication of this fix is that it regains compatibility with redis 6.0,
but breaks compatibility with redis 6.2.0 and 2.0.1. e.g. after the upgrade,
the default user will regain access to pubsub channels.

Other changes:
Additionally this commit rename server.acl_pubusub_default to
server.acl_pubsub_default and fix typo in acl tests.
2021-04-05 23:13:20 +03:00
Sokolov Yura
0d7a17b252 Add cluster-allow-replica-migration option. (#5285)
Previously (and by default after commit) when master loose its last slot
(due to migration, for example), its replicas will migrate to new last slot
holder.

There are cases where this is not desired:
* Consolidation that results with removed nodes (including the replica, eventually).
* Manually configured cluster topologies, which the admin wishes to preserve.

Needlessly migrating a replica triggers a full synchronization and can have a negative impact, so
we prefer to be able to avoid it where possible.

This commit adds 'cluster-allow-replica-migration' configuration option that is
enabled by default to preserve existed behavior. When disabled, replicas will
not be auto-migrated.

Fixes #4896

Co-authored-by: Oran Agra <oran@redislabs.com>
2021-04-04 09:43:24 +03:00
Wang Yuan
689b9ce067 Handle remaining fsync errors (#8419)
In `aof.c`, we call fsync when stop aof, and now print a log to let user know that if fail.
In `cluster.c`, we now return error, the calling function already handles these write errors.
In `redis-cli.c`, users hope to save rdb, we now print a message if fsync failed.
In `rio.c`, we now treat fsync errors like we do for write errors. 
In `server.c`, we try to fsync aof file when shutdown redis, we only can print one log if fail.
In `bio.c`, if failing to fsync aof file, we will set `aof_bio_fsync_status` to error , and reject writing just like last writing aof error,  moreover also set INFO command field `aof_last_write_status` to error.
2021-04-01 12:45:15 +03:00
Wen Hui
b37ac422e7 generalize config file check for sentinel (#8730)
The implications of this change is just that in the past when a config file was missing,
in some cases it was exiting before printing the sever startup prints and sometimes after,
and now it'll always exit before printing them.
2021-04-01 09:01:05 +03:00
John Sully
577ba39c70 Fix issue #300
Former-commit-id: e9551c9e8d196f37e3742dfc7df824e164181d60
2021-03-30 23:30:41 +00:00
Jérôme Loyet
1ac84a68c7 Add replica-announced config option (#8653)
The 'sentinel replicas <master>' command will ignore replicas with
`replica-announced` set to no.

The goal of disabling the config setting replica-announced is to allow ghost
replicas. The replica is in the cluster, synchronize with its master, can be
promoted to master and is not exposed to sentinel clients. This way, it is
acting as a live backup or living ghost.

In addition, to prevent the replica to be promoted as master, set
replica-priority to 0.
2021-03-30 23:40:22 +03:00
Viktor Söderqvist
371d7f120f Add support for plaintext clients in TLS cluster (#8587)
The cluster bus is established over TLS or non-TLS depending on the configuration tls-cluster. The client ports distributed in the cluster and sent to clients are assumed to be TLS or non-TLS also depending on tls-cluster.

The cluster bus is now extended to also contain the non-TLS port of clients in a TLS cluster, when available. The non-TLS port of a cluster node, when available, is sent to clients connected without TLS in responses to CLUSTER SLOTS, CLUSTER NODES, CLUSTER SLAVES and MOVED and ASK redirects, instead of the TLS port.

The user was able to override the client port by defining cluster-announce-port. Now cluster-announce-tls-port is added, so the user can define an alternative announce port for both TLS and non-TLS clients.

Fixes #8134
2021-03-30 23:11:32 +03:00
VivekSainiEQ
239bb9cf65 Added logic back to only acquire/release GIL if modules are enabled, without causing deadlocks
Former-commit-id: 9ab36ddc36e1d12e41d2eca917ee24a44a82df52
2021-03-30 13:46:03 -04:00
VivekSainiEQ
cb43eda420 Created and initialized seperate thread variables for modules
Former-commit-id: 3bb6b16c4a8f692b46040b72a51bef57fa03f1e6
2021-03-30 13:46:03 -04:00
John Sully
c0d7601aa4 Merge branch 'unstable' of https://github.com/JohnSully/KeyDB into unstable
Former-commit-id: 2910cee32ba6e4ef4b79b83bec2980c582a9310c
2021-03-29 00:49:34 +00:00
Huang Zhw
c25ccc3208 make processCommand check publish channel permissions. (#8534)
Add publish channel permissions check in processCommand.

processCommand didn't check publish channel permissions, so we can
queue a publish command in a transaction. But when exec the transaction,
it will fail with -NOPERM.

We also union keys/commands/channels permissions check togegher in
ACLCheckAllPerm. Remove pubsubCheckACLPermissionsOrReply in 
publishCommand/subscribeCommand/psubscribeCommand. Always 
check permissions in processCommand/execCommand/
luaRedisGenericCommand.
2021-03-26 14:10:01 +03:00
Oran Agra
28862376dd Fix SLOWLOG for blocked commands (#8632)
* SLOWLOG didn't record anything for blocked commands because the client
  was reset and argv was already empty. there was a fix for this issue
  specifically for modules, now it works for all blocked clients.
* The original command argv (before being re-written) was also reset
  before adding the slowlog on behalf of the blocked command.
* Latency monitor is now updated regardless of the slowlog flags of the
  command or its execution (their purpose is to hide sensitive info from
  the slowlog, not hide the fact the latency happened).
* Latency monitor now uses real_cmd rather than c->cmd (which may be
  different if the command got re-written, e.g. GEOADD)

Changes:
* Unify shared code between slowlog insertion in call() and
  updateStatsOnUnblock(), hopefully prevent future bugs from happening
  due to the later being overlooked.
* Reset CLIENT_PREVENT_LOGGING in resetClient rather than after command
  processing.
* Add a test for SLOWLOG and BLPOP

Notes:
- real_cmd == c->lastcmd, except inside MULTI and Lua.
- blocked commands never happen in these cases (MULTI / Lua)
- real_cmd == c->cmd, except for when the command is rewritten (e.g.
  GEOADD)
- blocked commands (currently) are never rewritten
- other than the command's CLIENT_PREVENT_LOGGING, and the
  execution flag CLIENT_PREVENT_LOGGING, other cases that we want to
  avoid slowlog are on AOF loading (specifically CMD_CALL_SLOWLOG will
  be off when executed from execCommand that runs from an AOF)
2021-03-25 10:20:27 +02:00
yoav-steinberg
895fb348f4 Avoid evaluating log arguments when log filtered by level. (#8685) 2021-03-24 08:22:12 +02:00
Yossi Gottlieb
bd7e2b6e34 Add support for reading encrypted keyfiles. (#8644) 2021-03-22 13:27:46 +02:00
Yossi Gottlieb
0cd485db5e Fix slowdown due to child reporting CoW. (#8645)
Reading CoW from /proc/<pid>/smaps can be slow with large processes on
some platforms.

This measures the time it takes to read CoW info and limits the duty
cycle of future updates to roughly 1/100.

As current_cow_size no longer represnets a current, fixed interval value
there is also a new current_cow_size_age field that provides information
about the age of the size value, in seconds.
2021-03-22 13:25:58 +02:00
VivekSainiEQ
f4dbaab531 Removed hasModuleGIL boolean and added fix from PR #292
Former-commit-id: 68d213f4c9c1c3161929a5e20ca4f2b27665c8fd
2021-03-19 20:10:24 +00:00
VivekSainiEQ
932fd2e79a added lock releasing w/ hasModuleGIL, changed module serverTL, and moved module_blocking_pipe to global scope to fix issue #276
Former-commit-id: 7d9a2ce827a2f8d48e4682b3cc95460cc82f9778
2021-03-19 20:08:28 +00:00
Madelyn Olson
139181e9eb Redact slowlog entries for config with sensitive data. (#8584)
Redact config set requirepass/masterauth/masteruser from slowlog in addition to showing ACL commands without sensitive values.
2021-03-15 22:00:29 -07:00
Huang Zhw
bbb0393c60 Fix typo and outdated comments. (#8640) 2021-03-14 09:41:43 +02:00
guybe7
5f91161823 Fix some issues with modules and MULTI/EXEC (#8617)
Bug 1:
When a module ctx is freed moduleHandlePropagationAfterCommandCallback
is called and handles propagation. We want to prevent it from propagating
commands that were not replicated by the same context. Example:
1. module1.foo does: RM_Replicate(cmd1); RM_Call(cmd2); RM_Replicate(cmd3)
2. RM_Replicate(cmd1) propagates MULTI and adds cmd1 to also_propagagte
3. RM_Call(cmd2) create a new ctx, calls call() and destroys the ctx.
4. moduleHandlePropagationAfterCommandCallback is called, calling
   alsoPropagates EXEC (Note: EXEC is still not written to socket),
   setting server.in_trnsaction = 0
5. RM_Replicate(cmd3) is called, propagagting yet another MULTI (now
   we have nested MULTI calls, which is no good) and then cmd3

We must prevent RM_Call(cmd2) from resetting server.in_transaction.
REDISMODULE_CTX_MULTI_EMITTED was revived for that purpose.

Bug 2:
Fix issues with nested RM_Call where some have '!' and some don't.
Example:
1. module1.foo does RM_Call of module2.bar without replication (i.e. no '!')
2. module2.bar internally calls RM_Call of INCR with '!'
3. at the end of module1.foo we call RM_ReplicateVerbatim

We want the replica/AOF to see only module1.foo and not the INCR from module2.bar

Introduced a global replication_allowed flag inside RM_Call to determine
whether we need to replicate or not (even if '!' was specified)

Other changes:
Split beforePropagateMultiOrExec to beforePropagateMulti afterPropagateExec
just for better readability
2021-03-10 18:02:17 +02:00