There are two issues fixed in this commit:
1. we want to fail the EXEC command in case there is a watched key that's logically
expired but not yet deleted by active expire or lazy expire.
2. we saw that currently cache time is update in every `call()` (including nested calls),
this time is being also being use for the isKeyExpired comparison, we want to update
the cache time only in the first call (execCommand)
Co-authored-by: Oran Agra <oran@redislabs.com>
(cherry picked from commit ac8b1df8850cc80fbf9ce8c2fbde0c1d3a1b4e91)
The `Tracking gets notification of expired keys` test in tracking.tcl
used to hung in valgrind CI quite a lot.
It turns out the reason is that with valgrind and a busy machine, the
server cron active expire cycle could easily run in the same event loop
as the command that created `mykey`, so that when they key got expired,
there were two change events to broadcast, one that set the key and one
that expired it, but since we used raxTryInsert, the client that was
associated with the "last" change was the one that created the key, so
the NOLOOP filtered that event.
This commit adds a test that reproduces the problem by using lazy expire
in a multi-exec which makes sure the key expires in the same event loop
as the one that added it.
(cherry picked from commit 9b564b525d8ce88295ec14ffdc3bede7e5f5c33e)
In diskless replication, we create a read pipe for the RDB, between the child and the parent.
When we close this pipe (fd), the read handler also needs to be removed from the event loop (if it still registered).
Otherwise, next time we will use the same fd, the registration will be fail (panic), because
we will use EPOLL_CTL_MOD (the fd still register in the event loop), on fd that already removed from epoll_ctl
(cherry picked from commit 501d7755831527b4237f9ed6050ec84203934e4d)
When client breached the output buffer soft limit but then went idle,
we didn't disconnect on soft limit timeout, now we do.
Note this also resolves some sporadic test failures in due to Linux
buffering data which caused tests to fail if during the test we went
back under the soft COB limit.
Co-authored-by: Oran Agra <oran@redislabs.com>
Co-authored-by: sundb <sundbcn@gmail.com>
(cherry picked from commit 152fce5e2cbf947a389da414a431f7331981a374)
Use an invalid IP address to trigger CONFIG SET bind failure, instead of DNS which is not guaranteed to always fail.
(cherry picked from commit 2b22fffc787e91df789dabf23ddcf19ecf34cf6f)
When redis-check-aof finds an error, it prints the line number for faster troubleshooting.
(cherry picked from commit 761d7d27711edfbf737def41ff28f5b325fb16c8)
Adding a new type mask for key space notification, REDISMODULE_NOTIFY_MODULE, to enable unique notifications from commands on REDISMODULE_KEYTYPE_MODULE type keys (which is currently unsupported).
Modules can subscribe to a module key keyspace notification by RM_SubscribeToKeyspaceEvents,
and clients by notify-keyspace-events of redis.conf or via the CONFIG SET, with the characters 'd' or 'A'
(REDISMODULE_NOTIFY_MODULE type mask is part of the '**A**ll' notation for key space notifications).
Refactor: move some pubsub test infra from pubsub.tcl to util.tcl to be re-used by other tests.
Before this commit using RM_Call without "!" could cause the master
to lazy-expire a key (delete it) but without replicating to replicas.
This could cause the replica's memory usage to gradually grow and
could also cause consistency issues if the master and replica have
a clock diff.
This bug was introduced in #8617
Added a test which demonstrates that scenario.
In the initial release of Redis 6.2 setting a user to only allow pubsub access to
a specific channel, and doing ACL SAVE, resulted in an assertion when
ACL LOAD was used. This was later changed by #8723 (not yet released),
but still not properly resolved (now it errors instead of crash).
The problem is that the server that generates an ACL file, doesn't know what
would be the setting of the acl-pubsub-default config in the server that will load it.
so ACL SAVE needs to always start with resetchannels directive.
This should still be compatible with old acl files (from redis 6.0), and ones from earlier
versions of 6.2 that didn't mess with channels.
Co-authored-by: Harkrishn Patro <harkrisp@amazon.com>
Co-authored-by: Oran Agra <oran@redislabs.com>
The tail size of c->reply is 16kb, but in the test only publish a
few chars each time, due to a change in #8699, the obuf limit
is now checked a new memory allocation is made, so this test
would have sometimes failed to trigger a soft limit disconnection
in time.
The solution is to write bigger payloads to the output buffer, but
still limit their rate (not more than 100k/s).
In github actions CI with valgrind, i saw that even the fast replica
(one that wasn't paused), didn't get to complete the replication fast
enough, and ended up getting disconnected by timeout.
Additionally, due to a typo in uname, we didn't get to actually run the
CPU efficiency part of the test.
1. the `dump_logs` option would have printed only logs of servers that were
spawn before the test proc started, and not ones that the test proc
started inside it.
2. when a server proc catches an exception it should normally forward the
exception upwards, specifically when it's an assertion that should be
caught by a test proc above. however, in `durable` mode, we caught all
exceptions printed them to stdout and let the code continue,
this was wrong to do for assertions, which should have still been
propagated to the test function.
3. don't bother to search for crash log to print if we printed the the
entire log anyway
4. if no crash log was found, no need to print anything (i.e. the fact it
wasn't found)
5. rename warnings_from_file to crashlog_from_file
Starting redis 6.0 (part of the TLS feature), diskless master uses pipe from the fork
child so that the parent is the one sending data to the replicas.
This mechanism has an issue in which a hung replica will cause the master to wait
for it to read the data sent to it forever, thus preventing the fork child from terminating
and preventing the creations of any other forks.
This PR adds a timeout mechanism, much like the ACK-based timeout,
we disconnect replicas that aren't reading the RDB file fast enough.