Before this commit, we would have continued to add replies to the reply buffer even if client
output buffer limit is reached, so the used memory would keep increasing over the configured limit.
What's more, we shouldn’t write any reply to the client if it is set 'CLIENT_CLOSE_ASAP' flag
because that doesn't conform to its definition and we will close all clients flagged with
'CLIENT_CLOSE_ASAP' in ‘beforeSleep’.
Because of code execution order, before this, we may firstly write to part of the replies to
the socket before disconnecting it, but in fact, we may can’t send the full replies to clients
since OS socket buffer is limited. But this unexpected behavior makes some commands work well,
for instance ACL DELUSER, if the client deletes the current user, we need to send reply to client
and close the connection, but before, we close the client firstly and write the reply to reply
buffer. secondly, we shouldn't do this despite the fact it works well in most cases.
We add a flag 'CLIENT_CLOSE_AFTER_COMMAND' to mark clients, this flag means we will close the
client after executing commands and send all entire replies, so that we can write replies to
reply buffer during executing commands, send replies to clients, and close them later.
We also fix some implicit problems. If client output buffer limit is enforced in 'multi/exec',
all commands will be executed completely in redis and clients will not read any reply instead of
partial replies. Even more, if the client executes 'ACL deluser' the using user in 'multi/exec',
it will not read the replies after 'ACL deluser' just like before executing 'client kill' itself
in 'multi/exec'.
We added some tests for output buffer limit breach during multi-exec and using a pipeline of
many small commands rather than one with big response.
Co-authored-by: Oran Agra <oran@redislabs.com>
(cherry picked from commit 57709c4bc663ddcb9313777c551a92dabf07095e)
This happens only on diskless replicas when attempting to reconnect after
failing to load an RDB file. It is more likely to occur with larger datasets.
After reconnection is initiated, replicationEmptyDbCallback() may get called
and try to write to an unconnected socket. This triggered another issue where
the connection is put into an error state and the connect handler never gets
called. The problem is a regression introduced by commit cad93ed.
(cherry picked from commit ecd86283ec292c1062f377f5707be57a8a77adb4)
This happens only on diskless replicas when attempting to reconnect after
failing to load an RDB file. It is more likely to occur with larger datasets.
After reconnection is initiated, replicationEmptyDbCallback() may get called
and try to write to an unconnected socket. This triggered another issue where
the connection is put into an error state and the connect handler never gets
called. The problem is a regression introduced by commit c17e597.
(cherry picked from commit 1980f639b161f46da2944d60f1c2facaf547dc1a)
redis-check-rdb was unable to parse rdb files containing module aux data.
Co-authored-by: Oran Agra <oran@redislabs.com>
(cherry picked from commit b914d4fc4825cc20cebca43431af5029ee077d09)
redis-check-rdb was unable to parse rdb files containing module aux data.
Co-authored-by: Oran Agra <oran@redislabs.com>
(cherry picked from commit 63a05dde462c1be4bd74c32630eca6e794ae440a)
This commit adds streamIteratorStop call in rewriteStreamObject function in some of the return statement. Although currently this will not cause memory leak since stream id is only 16 bytes long.
(cherry picked from commit 7934f163b4b6c1c0c0fc55710d3c7e49f56281f1)
This commit adds streamIteratorStop call in rewriteStreamObject function in some of the return statement. Although currently this will not cause memory leak since stream id is only 16 bytes long.
(cherry picked from commit 23b50bccccf4efac4a1672a332e470a82b7bd514)
Refine comment of makeThreadKillable().
This commit can be backported to 5.0, only if we also backport cf8a6e3.
Co-authored-by: Oran Agra <oran@redislabs.com>
(cherry picked from commit d2291627305d606a5d3b1e3b3bfa17ab10a3ef32)
Refine comment of makeThreadKillable().
This commit can be backported to 5.0, only if we also backport 8b70cb0.
Co-authored-by: Oran Agra <oran@redislabs.com>
(cherry picked from commit 647cac5bb469171a5b97eade276335ab9c552edc)
We're already using bg_unlink in several places to delete the rdb file in the background,
and avoid paying the cost of the deletion from our main thread.
This commit uses bg_unlink to remove the temporary rdb file in the background too.
However, in case we delete that rdb file just before exiting, we don't actually wait for the
background thread or the main thread to delete it, and just let the OS clean up after us.
i.e. we open the file, unlink it and exit with the fd still open.
Furthermore, rdbRemoveTempFile can be called from a thread and was using snprintf which is
not async-signal-safe, we now use ll2string instead.
(cherry picked from commit 6638f6129553d0f19c60944e70fe619a4217658c)
We're already using bg_unlink in several places to delete the rdb file in the background,
and avoid paying the cost of the deletion from our main thread.
This commit uses bg_unlink to remove the temporary rdb file in the background too.
However, in case we delete that rdb file just before exiting, we don't actually wait for the
background thread or the main thread to delete it, and just let the OS clean up after us.
i.e. we open the file, unlink it and exit with the fd still open.
Furthermore, rdbRemoveTempFile can be called from a thread and was using snprintf which is
not async-signal-safe, we now use ll2string instead.
(cherry picked from commit b002d2b4f1415f4db805081bc8f5b85d00f30e33)
If one thread got SIGSEGV, function sigsegvHandler() would be triggered,
it would call bioKillThreads(). But call pthread_cancel() to cancel itself
would make it block. Also note that if SIGSEGV is caught by bio thread, it
should kill the main thread in order to give a positive report.
(cherry picked from commit cf8a6e3c7a0448851f0c00ff1a726701a2be9f1a)
If one thread got SIGSEGV, function sigsegvHandler() would be triggered,
it would call bioKillThreads(). But call pthread_cancel() to cancel itself
would make it block. Also note that if SIGSEGV is caught by bio thread, it
should kill the main thread in order to give a positive report.
(cherry picked from commit 8b70cb0ef8e654d09d0d2974ad72388f46be9fde)
This commit makes stream object returning "stream" as encoding type in OBJECT ENCODING subcommand and DEBUG OBJECT command.
Till now, it would return "unknown"
(cherry picked from commit 2a8803f534728a6fd1b7c29a2d7e195f6a928f50)
This commit makes stream object returning "stream" as encoding type in OBJECT ENCODING subcommand and DEBUG OBJECT command.
Till now, it would return "unknown"
(cherry picked from commit 6ff741b5cf4d1cf7e23610aa9eef7ed485bd3113)
Before this commit, following command did not show --tls option:
./runtest-cluster --help
./runtest-sentinel --help
(cherry picked from commit e4a1280a0e6c33d03ec6b622b8159b2b26f0f9c3)
Before this commit, following command did not show --tls option:
./runtest-cluster --help
./runtest-sentinel --help
(cherry picked from commit 98c8ac0d705608e866a8c0055b09d4a40655f492)
These tests started failing every day on http 404 (not being able to
install valgrind)
(cherry picked from commit 9428c1a591472fc87775781e5955aa527c6f1ff0)
These tests started failing every day on http 404 (not being able to
install valgrind)
(cherry picked from commit 78a6e5eb2b19adb40b5ec8fc867345b3c5479586)
This test was nearly always failing on MacOS github actions.
This is because of bugs in the test that caused it to nearly always run
all 3 attempts and just look at the last one as the pass/fail creteria.
i.e. the test was nearly always running all 3 attempts and still sometimes
succeed. this is because the break condition was different than the test
completion condition.
The reason the test succeeded is because the break condition tested the
results of all 3 tests (PSETEX/PEXPIRE/PEXPIREAT), but the success check
at the end was only testing the result of PSETEX.
The reason the PEXPIREAT test nearly always failed is because it was
getting the current time wrong: getting the current second and loosing
the sub-section time, so the only chance for it to succeed is if it run
right when a certain second started.
Because i now get the time from redis, adding another round trip, i
added another 100ms to the PEXPIRE test to make it less fragile, and
also added many more attempts.
Adding many more attempts before failure to account for slow platforms,
github actions and valgrind
(cherry picked from commit 1fd56bb75a9afa5469b3ecb70d394b2adaf9baac)
This test was nearly always failing on MacOS github actions.
This is because of bugs in the test that caused it to nearly always run
all 3 attempts and just look at the last one as the pass/fail creteria.
i.e. the test was nearly always running all 3 attempts and still sometimes
succeed. this is because the break condition was different than the test
completion condition.
The reason the test succeeded is because the break condition tested the
results of all 3 tests (PSETEX/PEXPIRE/PEXPIREAT), but the success check
at the end was only testing the result of PSETEX.
The reason the PEXPIREAT test nearly always failed is because it was
getting the current time wrong: getting the current second and loosing
the sub-section time, so the only chance for it to succeed is if it run
right when a certain second started.
Because i now get the time from redis, adding another round trip, i
added another 100ms to the PEXPIRE test to make it less fragile, and
also added many more attempts.
Adding many more attempts before failure to account for slow platforms,
github actions and valgrind
(cherry picked from commit ed9bfe226289759756126f35583394e5db93048f)