27369 Commits

Author SHA1 Message Date
John Sully
ac42f938e8 Fix whitespace
Former-commit-id: d47aeb1fc8a6804a44035253ad87478b817605cf
2020-09-07 03:35:46 +00:00
John Sully
855753ebb3 Fix whitespace
Former-commit-id: d47aeb1fc8a6804a44035253ad87478b817605cf
2020-09-07 03:35:46 +00:00
John Sully
1c1b114555 Dramatically improve perf by blocking commands
Former-commit-id: e47584b286c41cf0783fe014ac8b6ec187564ade
2020-09-07 00:49:53 +00:00
John Sully
b6d8a5938d Dramatically improve perf by blocking commands
Former-commit-id: e47584b286c41cf0783fe014ac8b6ec187564ade
2020-09-07 00:49:53 +00:00
Oran Agra
725616534e if diskless repl child is killed, make sure to reap the pid (#7742)
Starting with redis 6.0 and the changes we made to the diskless master to be
suitable for TLS, I made the master avoid reaping (wait3) the pid of the
child until we know all replicas are done reading their rdb.

I did that in order to avoid a state where the rdb_child_pid is -1 but
we don't yet want to start another fork (still busy serving that data to
replicas).

It turns out that the solution used so far was problematic when the
fork child was killed (e.g. by the kernel OOM killer). In that
case there's a chance that we have currently disabled the read event on the
rdb pipe, since we're waiting for a replica to become writable again.
In that scenario the master would never realize the child
exited, and the replica would remain hung too.
Note that there's no mechanism to detect a hung replica while it's in
rdb transfer state.

The solution here is to add another pipe which is used by the parent to
tell the child it is safe to exit. This means that when the child exits,
for whatever reason, it is safe to reap it.

Besides that, I'm re-introducing an adjustment to REPLCONF ACK which was
part of #6271 (Accelerate diskless master connections) but was dropped
when that PR was rebased after the TLS fork/pipe changes (6fd5ff8).
Now that RdbPipeCleanup no longer calls checkChildrenDone, and the ACK
has a chance to detect that the child exited, it should be the one to call
it so that we don't have to wait for cron (server.hz) to do that.
2020-09-06 16:43:57 +03:00
Oran Agra
573246f73c if diskless repl child is killed, make sure to reap the pid (#7742)
Starting with redis 6.0 and the changes we made to the diskless master to be
suitable for TLS, I made the master avoid reaping (wait3) the pid of the
child until we know all replicas are done reading their rdb.

I did that in order to avoid a state where the rdb_child_pid is -1 but
we don't yet want to start another fork (still busy serving that data to
replicas).

It turns out that the solution used so far was problematic when the
fork child was killed (e.g. by the kernel OOM killer). In that
case there's a chance that we have currently disabled the read event on the
rdb pipe, since we're waiting for a replica to become writable again.
In that scenario the master would never realize the child
exited, and the replica would remain hung too.
Note that there's no mechanism to detect a hung replica while it's in
rdb transfer state.

The solution here is to add another pipe which is used by the parent to
tell the child it is safe to exit. This means that when the child exits,
for whatever reason, it is safe to reap it.

Besides that, I'm re-introducing an adjustment to REPLCONF ACK which was
part of #6271 (Accelerate diskless master connections) but was dropped
when that PR was rebased after the TLS fork/pipe changes (5a47794).
Now that RdbPipeCleanup no longer calls checkChildrenDone, and the ACK
has a chance to detect that the child exited, it should be the one to call
it so that we don't have to wait for cron (server.hz) to do that.
2020-09-06 16:43:57 +03:00
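The exit-pipe handshake described in #7742 can be illustrated with a minimal Python sketch. It is not the Redis implementation (which is C and forwards the rdb data to replicas); the function name and the single "payload" write are hypothetical. It only shows the idea that the child blocks on a second pipe until the parent signals that exiting is safe, so the parent can reap the child whenever it exits.

```python
import os


def run_child_until_released():
    """Fork a child that waits for the parent's 'safe to exit' signal."""
    # Data pipe, child -> parent (stands in for the rdb pipe).
    data_r, data_w = os.pipe()
    # Exit pipe, parent -> child (the extra pipe the commit describes).
    exit_r, exit_w = os.pipe()

    pid = os.fork()
    if pid == 0:
        # Child: send the payload, then block until the parent releases us.
        os.close(data_r)
        os.close(exit_w)
        os.write(data_w, b"payload")
        os.close(data_w)
        os.read(exit_r, 1)  # returns on a byte, or on EOF once the parent closes
        os._exit(0)

    # Parent: drain the data pipe until the child closes its write end.
    os.close(data_w)
    os.close(exit_r)
    chunks = []
    while True:
        chunk = os.read(data_r, 4096)
        if not chunk:
            break
        chunks.append(chunk)
    os.close(data_r)

    # All consumers are done: tell the child it may exit, then reap it.
    os.close(exit_w)
    reaped_pid, status = os.waitpid(pid, 0)
    return b"".join(chunks), reaped_pid == pid, os.WEXITSTATUS(status)
```

If the child is killed before the parent releases it, the same `waitpid` call would still succeed, which is the property the commit relies on.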
Oran Agra
da723a917d Improve valgrind support for cluster tests (#7725)
- redirect valgrind reports to a dedicated file rather than console
- try to avoid killing instances with SIGKILL so that we get the memory
  leak report (killing with SIGTERM before resorting to SIGKILL)
- search for valgrind reports when done, print them and fail the tests
- add --dont-clean option to keep the logs on exit
- fix exit error code when crash is found (would have exited with 0)

changes that affect the normal redis test suite:
- refactor check_valgrind_errors into two functions one to search and
  one to report
- move the search half into util.tcl to serve the cluster tests too
- ignore "address range perms" valgrind warnings which seem irrelevant.
2020-09-06 11:11:49 +03:00
Oran Agra
2b998de460 Improve valgrind support for cluster tests (#7725)
- redirect valgrind reports to a dedicated file rather than console
- try to avoid killing instances with SIGKILL so that we get the memory
  leak report (killing with SIGTERM before resorting to SIGKILL)
- search for valgrind reports when done, print them and fail the tests
- add --dont-clean option to keep the logs on exit
- fix exit error code when crash is found (would have exited with 0)

changes that affect the normal redis test suite:
- refactor check_valgrind_errors into two functions one to search and
  one to report
- move the search half into util.tcl to serve the cluster tests too
- ignore "address range perms" valgrind warnings which seem irrelevant.
2020-09-06 11:11:49 +03:00
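The "SIGTERM before resorting to SIGKILL" step from #7725 might look roughly like the sketch below. The real test suite is written in Tcl; `stop_instance` and the grace period are assumptions made for illustration only.

```python
import signal
import subprocess
import sys


def stop_instance(proc, grace_seconds=5.0):
    """Ask a process to exit with SIGTERM; fall back to SIGKILL on timeout."""
    proc.terminate()  # sends SIGTERM on POSIX, so the process can clean up
    try:
        proc.wait(timeout=grace_seconds)
        return "terminated"
    except subprocess.TimeoutExpired:
        proc.kill()  # sends SIGKILL on POSIX; no leak report is possible now
        proc.wait()
        return "killed"
```

A graceful stop gives a process running under valgrind the chance to print its memory leak report, which an immediate SIGKILL would prevent.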
Oran Agra
cf22e8eb91 test infra - add durable mode to work around test suite crashing
in some cases a command that returns an error possibly due to a timing
issue causes the tcl code to crash and thus prevents the rest of the
tests from running. this adds an option to make the test proceed despite
the crash.
maybe it should be the default mode some day.
2020-09-06 09:59:19 +03:00
Oran Agra
fe5da2e60d test infra - add durable mode to work around test suite crashing
in some cases a command that returns an error possibly due to a timing
issue causes the tcl code to crash and thus prevents the rest of the
tests from running. this adds an option to make the test proceed despite
the crash.
maybe it should be the default mode some day.
2020-09-06 09:59:19 +03:00
Oran Agra
cc455a710c test infra - wait_done_loading
reduce code duplication in aof.tcl.
move creation of clients into the test so that it can be skipped
2020-09-06 09:59:19 +03:00
Oran Agra
1b7ba44e79 test infra - wait_done_loading
reduce code duplication in aof.tcl.
move creation of clients into the test so that it can be skipped
2020-09-06 09:59:19 +03:00
Oran Agra
2468c17a32 test infra - flushall between tests in external mode 2020-09-06 09:59:19 +03:00
Oran Agra
b65e5aca86 test infra - flushall between tests in external mode 2020-09-06 09:59:19 +03:00
Oran Agra
5c61f1a6ed test infra - improve test skipping ability
- skip full units
- skip a single test (not just a list of tests)
- when skipping tag, skip spinning up servers, not just the tests
- skip tags when running against an external server too
- allow using multiple tags (split them)
2020-09-06 09:59:19 +03:00
Oran Agra
677d14c213 test infra - improve test skipping ability
- skip full units
- skip a single test (not just a list of tests)
- when skipping tag, skip spinning up servers, not just the tests
- skip tags when running against an external server too
- allow using multiple tags (split them)
2020-09-06 09:59:19 +03:00
Oran Agra
fc18f16260 test infra - reduce disk space usage
this is important when running a test with --loop
2020-09-06 09:59:19 +03:00
Oran Agra
e3e69c25fd test infra - reduce disk space usage
this is important when running a test with --loop
2020-09-06 09:59:19 +03:00
Oran Agra
e783c03dd1 test infra - write test name to logfile 2020-09-06 09:59:19 +03:00
Oran Agra
9d527d076b test infra - write test name to logfile 2020-09-06 09:59:19 +03:00
Yossi Gottlieb
94cd74e5de redis-cli: fix writeConn() buffer handling. (#7749)
Fix issues with writeConn() which resulted in corruption of the stream by leaving an extra byte in the buffer. The trigger for this is partial writes or write errors which were not experienced on Linux but reported on macOS.
2020-09-03 18:15:48 +03:00
Yossi Gottlieb
58e5feb3f4 redis-cli: fix writeConn() buffer handling. (#7749)
Fix issues with writeConn() which resulted in corruption of the stream by leaving an extra byte in the buffer. The trigger for this is partial writes or write errors which were not experienced on Linux but reported on macOS.
2020-09-03 18:15:48 +03:00
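The class of bug fixed in #7749 (bytes left over in an output buffer after a partial write) can be illustrated with a generic sketch. This is not the redis-cli code; `flush_pending` and the `write` callable are hypothetical. The point is that exactly the accepted bytes are dropped from the buffer, whether the write was complete, partial, or failed.

```python
def flush_pending(write, pending):
    """Write as much of `pending` as possible; return the unwritten tail.

    `write` is a hypothetical callable that returns how many bytes it
    accepted (possibly fewer than offered) or raises BlockingIOError.
    """
    sent = 0
    while sent < len(pending):
        try:
            n = write(pending[sent:])
        except BlockingIOError:
            break  # try again later, keeping the unwritten tail
        if n <= 0:
            break
        sent += n
    # Drop exactly the bytes that were accepted, no more and no fewer.
    return pending[sent:]
```

The caller keeps the returned tail and passes it back in on the next attempt, so the stream sent to the peer is never duplicated or truncated.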
WuYunlong
44cc2e282f fix wrong comments in redis.conf, change default always-show-logo (#5695)
1. default value of always-show-logo was not consistent with the default in the code
2. comment about cluster-replica-no-failover is wrong since a manual failover can only be performed on replicas
3. improve description about always-show-logo
2020-09-03 10:31:18 +03:00
WuYunlong
12f798dc18 fix wrong comments in redis.conf, change default always-show-logo (#5695)
1. default value of always-show-logo was not consistent with the default in the code
2. comment about cluster-replica-no-failover is wrong since a manual failover can only be performed on replicas
3. improve description about always-show-logo
2020-09-03 10:31:18 +03:00
Oran Agra
eca1014fe3 Run active defrag while blocked / loading (#7726)
During long running scripts or loading RDB/AOF, we may need to do some
defragging. Since processEventsWhileBlocked is called periodically at
unknown intervals, and many cron jobs either depend on run_with_period
(including active defrag), or rely on being called at server.hz rate
(i.e. active defrag knows how much time to run by looking at server.hz),
the whileBlockedCron may have to run a loop triggering the cron jobs in it
(currently only active defrag) several times.

Other changes:
- Adding a test for defrag during aof loading.
- Changing key-load-delay config to take negative values for fractions
  of a microsecond sleep
2020-09-03 08:47:29 +03:00
Oran Agra
9ef8d2f671 Run active defrag while blocked / loading (#7726)
During long running scripts or loading RDB/AOF, we may need to do some
defragging. Since processEventsWhileBlocked is called periodically at
unknown intervals, and many cron jobs either depend on run_with_period
(including active defrag), or rely on being called at server.hz rate
(i.e. active defrag knows how much time to run by looking at server.hz),
the whileBlockedCron may have to run a loop triggering the cron jobs in it
(currently only active defrag) several times.

Other changes:
- Adding a test for defrag during aof loading.
- Changing key-load-delay config to take negative values for fractions
  of a microsecond sleep
2020-09-03 08:47:29 +03:00
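The catch-up loop that #7726 describes for whileBlockedCron can be sketched as follows. This is an illustration under assumptions, not the server code: `catch_up_cron`, its parameters, and the millisecond bookkeeping are hypothetical. It shows one way a job that expects to run at server.hz rate could be triggered several times when the caller is invoked at unknown intervals.

```python
def catch_up_cron(last_cron_ms, now_ms, hz, job):
    """Run `job` once per missed hz period; return (new_last_cron_ms, runs)."""
    period_ms = 1000 // hz  # how often the job expects to be called
    runs = 0
    while last_cron_ms < now_ms:
        job()  # e.g. one active defrag cycle
        last_cron_ms += period_ms
        runs += 1
    return last_cron_ms, runs
```

For example, with `hz=10` the period is 100 ms, so a call that arrives 250 ms after the last one runs the job three times.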
Pierre Jambet
42dc0e98aa Fix error message for the DEBUG ZIPLIST command (#7745)
DEBUG ZIPLIST <key> currently returns the following error string if the
key is not a ziplist: "ERR Not an sds encoded string.". This looks like
an accidental copy/paste error from the error returned in the else if
branch above where this string is returned if the key is not an sds
string. The command was added in
f898429fe149f476d61270ed4299dd1f8f75ae50 and looking at the commit,
nothing indicates that it is not an accidental typo.

The error string now returns a correct error: "Not a ziplist encoded
object", which accurately describes the error.
2020-09-02 23:27:48 +03:00
Pierre Jambet
d52ce4ea1a Fix error message for the DEBUG ZIPLIST command (#7745)
DEBUG ZIPLIST <key> currently returns the following error string if the
key is not a ziplist: "ERR Not an sds encoded string.". This looks like
an accidental copy/paste error from the error returned in the else if
branch above where this string is returned if the key is not an sds
string. The command was added in
ac61f9062583d67dd43f7d698824464d1e30d84b and looking at the commit,
nothing indicates that it is not an accidental typo.

The error string now returns a correct error: "Not a ziplist encoded
object", which accurately describes the error.
2020-09-02 23:27:48 +03:00
Oran Agra
0db61f5649 Print server startup messages after daemonization (#7743)
When redis isn't configured to have a log file, having these prints
before daemonization puts them in the calling process stdout rather than
/dev/null
2020-09-02 17:18:09 +03:00
Oran Agra
8b0747d657 Print server startup messages after daemonization (#7743)
When redis isn't configured to have a log file, having these prints
before daemonization puts them in the calling process stdout rather than
/dev/null
2020-09-02 17:18:09 +03:00
Thandayuthapani
5352220639 Add masters/replicas options to redis-cli --cluster call command (#6491)
* Add master/slave option in --cluster call command

* Update src/redis-cli.c

* Update src/redis-cli.c

Co-authored-by: Itamar Haber <itamar@redislabs.com>
2020-09-02 16:23:49 +03:00
Thandayuthapani
f22f64f0db Add masters/replicas options to redis-cli --cluster call command (#6491)
* Add master/slave option in --cluster call command

* Update src/redis-cli.c

* Update src/redis-cli.c

Co-authored-by: Itamar Haber <itamar@redislabs.com>
2020-09-02 16:23:49 +03:00
Oran Agra
9b61917d7f fix README about BUILD_WITH_SYSTEMD usage (#7739)
BUILD_WITH_SYSTEMD is an internal variable. Users should use USE_SYSTEMD=yes.
2020-09-01 21:31:37 +03:00
Oran Agra
747b4004ea fix README about BUILD_WITH_SYSTEMD usage (#7739)
BUILD_WITH_SYSTEMD is an internal variable. Users should use USE_SYSTEMD=yes.
2020-09-01 21:31:37 +03:00
Yossi Gottlieb
d377b116ba Fix double-make issue with make && make install. (#7734)
All user-supplied variables that affect the build should be explicitly
persisted.

Fixes #7254
2020-09-01 10:02:14 +03:00
Yossi Gottlieb
b35d6e5cff Fix double-make issue with make && make install. (#7734)
All user-supplied variables that affect the build should be explicitly
persisted.

Fixes #7254
2020-09-01 10:02:14 +03:00
Oran Agra
6e7733c276 Redis 6.0.7 2020-09-01 09:27:58 +03:00
Oran Agra
dbea5f7a8d Redis 6.0.7 2020-09-01 09:27:58 +03:00
Oran Agra
6041fc99b5 Reduce the probability of failure when start redis in runtest-cluster #7554 (#7635)
When running runtest-cluster, we first need to create a cluster using spawn_instance.
An unused port is chosen, but sometimes we can't run the server on that
port, possibly due to a race with another process taking it first,
such as redis/redis/runs/896537490. It may be due to the machine problem or
In order to reduce the probability of failure when starting redis in
runtest-cluster, we attempt to use another port when the server does not start up.

Co-authored-by: Oran Agra <oran@redislabs.com>
Co-authored-by: yanhui13 <yanhui13@meituan.com>
(cherry picked from commit 1deaad884c38e92e5b691f36b253ef4ee2201ca4)
2020-09-01 09:27:58 +03:00
Oran Agra
4bb40a9688 Reduce the probability of failure when start redis in runtest-cluster #7554 (#7635)
When running runtest-cluster, we first need to create a cluster using spawn_instance.
An unused port is chosen, but sometimes we can't run the server on that
port, possibly due to a race with another process taking it first,
such as redis/redis/runs/896537490. It may be due to the machine problem or
In order to reduce the probability of failure when starting redis in
runtest-cluster, we attempt to use another port when the server does not start up.

Co-authored-by: Oran Agra <oran@redislabs.com>
Co-authored-by: yanhui13 <yanhui13@meituan.com>
(cherry picked from commit e2d64485b8262971776fb1be803c7296c98d1572)
2020-09-01 09:27:58 +03:00
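The "try another port when the server does not start" idea from #7635 can be sketched in Python. The actual change lives in the Tcl test helpers; `bind_with_retry` and its parameters are hypothetical, and binding a socket stands in here for starting a server on the port.

```python
import socket


def bind_with_retry(start_port, attempts=10, host="127.0.0.1"):
    """Try consecutive ports until one can be bound; return (socket, port)."""
    for port in range(start_port, start_port + attempts):
        sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        try:
            sock.bind((host, port))
        except (OSError, OverflowError):
            # Port was taken (possibly by a racing process) or out of range.
            sock.close()
            continue
        sock.listen(1)
        return sock, port
    raise RuntimeError("no free port found")
```

Checking that a port is free and then starting a server on it is inherently racy, so retrying on failure is more robust than trusting the earlier check.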
Yossi Gottlieb
f2ab7ac5d7 Backport Lua 5.2.2 stack overflow fix. (#7733)
This fixes the issue described in CVE-2014-5461. At this time we cannot
confirm that the original issue has a real impact on Redis, but it is
included as an extra safety measure.

(cherry picked from commit 374270d3a04e8b224a12655518c815497aeb497d)
2020-09-01 09:27:58 +03:00
Yossi Gottlieb
d2532d1335 Backport Lua 5.2.2 stack overflow fix. (#7733)
This fixes the issue described in CVE-2014-5461. At this time we cannot
confirm that the original issue has a real impact on Redis, but it is
included as an extra safety measure.

(cherry picked from commit d75ad774a92bd7de0b9448be3d622d7a13b7af27)
2020-09-01 09:27:58 +03:00
Leoš Literák
6c68ac1d4c Update README.md with instructions how to build with systemd support (#7730)
#7728 - update instructions for systemd support

(cherry picked from commit 635d6ca6390ebab09bca3214777253910cb46547)
2020-09-01 09:27:58 +03:00
Leoš Literák
00d0d870d2 Update README.md with instructions how to build with systemd support (#7730)
#7728 - update instructions for systemd support

(cherry picked from commit 571571ca192ec0b7cc66ca61cd6794dcb6a9d8bc)
2020-09-01 09:27:58 +03:00
Yossi Gottlieb
ba1da77a3d Fix oom-score-adj on older distros. (#7724)
Don't assume `ps` handles `-h` to display output without headers and
manually trim headers line from output.

(cherry picked from commit ae8420298cacc2737e8e3ffa3c5acc038cd27849)
2020-09-01 09:27:58 +03:00
Yossi Gottlieb
f38e2802b6 Fix oom-score-adj on older distros. (#7724)
Don't assume `ps` handles `-h` to display output without headers and
manually trim headers line from output.

(cherry picked from commit b61b663895f16d9f559a14c408c225062254a57b)
2020-09-01 09:27:58 +03:00
Wang Yuan
a399ca9bf7 Fix wrong format specifiers of 'sdscatfmt' for the INFO command (#7706)
unlike printf, sdscatfmt doesn't take %d

(cherry picked from commit 48a00e6b99430d493ae8e4daa169f4a9ee9a8fa6)
2020-09-01 09:27:58 +03:00
Wang Yuan
1b100a167a Fix wrong format specifiers of 'sdscatfmt' for the INFO command (#7706)
unlike printf, sdscatfmt doesn't take %d

(cherry picked from commit 43af28f5b487370bd3d65d00be93c4a23ee42fa7)
2020-09-01 09:27:58 +03:00
Wen Hui
edcc2032e4 fix make warnings (#7692)
(cherry picked from commit 7386b998e80affe8696b89b750ba86c9d8b9f453)
2020-09-01 09:27:58 +03:00
Wen Hui
d4e1c88052 fix make warnings (#7692)
(cherry picked from commit e61adc0d897074d8c2ca8e0f7bf08fa2985d9b01)
2020-09-01 09:27:58 +03:00