317 Commits

Author SHA1 Message Date
christianEQ
0dacd68163 added tests for flash psync
Former-commit-id: 79f7376a4fea2de9bd06c06d96f3f98ee7308874
2021-11-11 13:43:00 +00:00
malavan
d443babbea ignore rdb specific tests when running in --flash mode
Former-commit-id: b44ba3efb21c390c3d199ea1cdd238a57611fd85
2021-11-10 20:46:09 +00:00
John Sully
b12b48ab27 Initial implementation of snapshot fast replication. There are still a few TODOs in progress
Former-commit-id: 0febdcdab8693af443f350968ed3d8c80106675d
2021-11-09 19:36:07 +00:00
John Sully
c77ce968c5 Merge branch 'keydbpro_collab' into multithread_load
Former-commit-id: 8016c20f1f9a648e658c816e2f6777c5718d5e19
2021-08-09 20:20:34 +00:00
John Sully
d3d7c3a865 Merge branch 'keydbpro' into keydbpro_collab
Former-commit-id: 8eec3e948ffd204bb2d6170ad3ca42fa8a2c6d8b
2021-07-09 05:25:04 +00:00
John Sully
2616ed75cc Merge branch 'unstable' into keydbpro
Former-commit-id: 81ded8a35daa5100cac7299a7d0b5f43ee7ac74f
2021-07-09 04:41:47 +00:00
John Sully
c0160a4415 Merge tag '6.2.3' into unstable
Former-commit-id: 1895dbb7680fa9aadf6040912e89c733abc8c706
2021-07-09 04:40:31 +00:00
John Sully
420b07960c Prevent test code crash due to no log data
Former-commit-id: 0a56a73bd98d4e692ae77683fdb9dd644ecfc2eb
2021-06-14 22:06:36 +00:00
John Sully
eef508736d Merge branch 'keydbpro_collab' into multithread_load
Former-commit-id: b580a5561220bc6dc7bc073135f3045ba0cdda51
2021-06-02 04:24:49 +00:00
John Sully
833faf96c5 Merge branch 'merge_6.2.2' into keydbpro_collab
Former-commit-id: 54fe037e4e215b1e5bdb42b567c6df04a69ba150
2021-06-02 02:05:44 +00:00
John Sully
992d515b05 Merge remote-tracking branch 'collab/keydbpro' into multithread_load
Former-commit-id: a09454b3a4b295b2d04bdb7d742db7f9a3e17be7
2021-05-31 01:16:01 +00:00
John Sully
989799df85 Merge branch 'unstable' into keydbpro
Former-commit-id: 14c24ce161ddcbddb701a20062659261397cb0e4
2021-05-29 06:49:05 +00:00
John Sully
ef5bf1fdd7 Prevent partial sync in test that requires only full syncs
Former-commit-id: 1b9fea066914d7f23d6bec220f26b8c0112d7f8b
2021-05-29 02:22:20 +00:00
John Sully
4e70e85ab8 Fix failover command test failures
Former-commit-id: d3c37c7159a92319759a33851669862a82cf1b28
2021-05-29 01:19:12 +00:00
John Sully
f4151f0d6b Merge branch 'unstable' into keydbpro
Former-commit-id: 205d8f18d2bb8df5253bab40578b006b7aa73fd5
2021-05-28 23:32:46 +00:00
John Sully
ea6a0f214b Merge tag '6.2.2' into unstable
Former-commit-id: 93ebb31b17adec5d406d2e30a5b9ea71c07fce5c
2021-05-21 05:54:39 +00:00
John Sully
f49d8f9adb Merge tag '6.2.1' into unstable
Former-commit-id: bfed57e3e0edaa724b9d060a6bb8edc5a6de65fa
2021-05-19 02:59:48 +00:00
John Sully
40fdb3ce05 Add endurance testing to better detect threading bugs
Former-commit-id: 945e428aa110968479fdcdfc2d5c5308a99eadc3
2021-05-06 00:09:07 +00:00
bugwz
d7221e0135 Print the number of abnormal line in AOF (#8823)
When redis-check-aof finds an error, it prints the line number for faster troubleshooting. 

(cherry picked from commit 761d7d27711edfbf737def41ff28f5b325fb16c8)
2021-05-03 22:57:00 +03:00
John Sully
c58739bbcb Respect replica output buffer limits when adding large commands to the ring buffer
Former-commit-id: 37ec01cfd8a8da1e895c7cdc358d382d35ad59dd
2021-05-03 16:33:16 +00:00
Oran Agra
442bd29612 Fix timing of new replication test (#8807)
In github actions CI with valgrind, i saw that even the fast replica
(one that wasn't paused), didn't get to complete the replication fast
enough, and ended up getting disconnected by timeout.

Additionally, due to a typo in uname, we didn't get to actually run the
CPU efficiency part of the test.
2021-04-18 15:12:34 +03:00
guybe7
0bf6c205db Add a timeout mechanism for replicas stuck in fullsync (#8762)
Starting redis 6.0 (part of the TLS feature), diskless master uses pipe from the fork
child so that the parent is the one sending data to the replicas.
This mechanism has an issue in which a hung replica will cause the master to wait
for it to read the data sent to it forever, thus preventing the fork child from terminating
and preventing the creations of any other forks.

This PR adds a timeout mechanism, much like the ACK-based timeout,
we disconnect replicas that aren't reading the RDB file fast enough.
2021-04-15 17:18:51 +03:00
christianEQ
ec11cc177e check for musl in logging.tcl as backtrace() is not available with it
Former-commit-id: 2c8239f9cb30aa32de936be09522e6429aa40326
2021-04-05 11:09:49 -04:00
Oran Agra
259eb03afb solve race conditions in psync2-pingoff test (#8720)
Another test race condition in the macos tests.
the test was waiting for PINGs to be generated and put on the replication stream,
but waiting for 1 or 2 seconds doesn't really guarantee that.
then the test that expected 6 full syncs, found only 4
2021-03-30 11:41:06 +03:00
Qu Chen
1372210a03 Properly initialize variable to make valgrind happy in checkChildrenDone(). Removed usage for the obsolete wait3() and wait4() in favor of waitpid(), and properly check for the exit status code. (#8666) 2021-03-24 08:41:05 -07:00
Oran Agra
5653935571 Corrupt stream key access to uninitialized memory (#8681)
the corrupt-dump-fuzzer test found a case where an access to a corrupt
stream would have caused accessing to uninitialized memory.
now it'll panic instead.

The issue was that there was a stream that says it has more than 0
records, but looking for the max ID came back empty handed.

p.s. when sanitize-dump-payload is used, this corruption is detected,
and the RESTORE command is gracefully rejected.
2021-03-24 11:33:49 +02:00
Oran Agra
10929ace19 Fix race in replication test (#8679)
Since redis 6.2, redis immediately tries to connect to the master, not
waiting for replication cron.

in the slow freebsd CI, this test failed and master_link_status was
already "up" when INFO was called.
2021-03-22 10:50:39 +02:00
Yossi Gottlieb
ed23d94107 Improve redis-cli non-binary safe string handling. (#8566)
* The `redis-cli --scan` output should honor output mode (set explicitly or implicitly), and quote key names when not in raw mode.
  * Technically this is a breaking change, but it should be very minor since raw mode is by default on for non-tty output.
  * It should only affect  TTY output (human users) or non-tty output if `--no-raw` is specified.

* Added `--quoted-input` option to treat all arguments as potentially quoted strings.
* Added `--quoted-pattern` option to accept a potentially quoted pattern.

Unquoting is applied to potentially quoted input only if single or double quotes are used. 

Fixes #8561, #8563
2021-03-04 15:03:49 +02:00
Yossi Gottlieb
a7f552ef3a Fix potential replication-4 test race condition. (#8583)
Co-authored-by: Oran Agra <oran@redislabs.com>
2021-03-02 18:12:11 +02:00
Oran Agra
e978fc37ed fix stream deep sanitization with deleted records (#8568)
When sanitizing the stream listpack, we need to count the deleted records too.
otherwise the last line that checks the next pointer fails.

Add test to cover that state in the stream tests.
2021-03-01 17:23:29 +02:00
Yossi Gottlieb
665da8a056 Fix failed tests on Linux Alpine and add a CI job. (#8532)
* Remove linux/version.h dependency.

This introduces unnecessary dependencies, and generally not a good idea
as the platform we build on may be different than the platform we run
on.

To determine if sync_file_range exists we can simply rely on header file
hints.

* Fix setproctitle() on libmusl.

The previous ifdef checks were a bit too strict for no apparent
reason.

* Fix tests failure on Linux with no backtrace.

* Add alpine daily CI job.
2021-02-23 12:57:45 +02:00
uriyage
a8d6c6493a Adds INFO fields to track fork child progress (#8414)
* Adding current_save_keys_total and current_save_keys_processed info fields.
  Present in replication, BGSAVE and AOFRW.
* Changing RM_SendChildCOWInfo() to RM_SendChildHeartbeat(double progress)
* Adding new info field current_fork_perc. Present in Replication, BGSAVE, AOFRW,
  and module forks.
2021-02-16 16:06:51 +02:00
Yossi Gottlieb
064d807db3 Escape unsafe field name characters in INFO. (#8492)
Fixes #8489
2021-02-15 17:08:53 +02:00
Oran Agra
dffb64f5cc solve race in replication-2 test - again (#8491)
this should make it timing independent and also faster in most cases
2021-02-15 12:50:23 +02:00
Oran Agra
8bfac022f8 solve race in replication-2 test (#8461)
use SIGSTOP instead of DEBUG SLEEP, reduces the test
time by some 2 seconds and avoids failures on slow machines
2021-02-07 16:22:30 +02:00
christianEQ
c69b2845fd Merge remote-tracking branch 'opensource/unstable' into keydbpro
Former-commit-id: 5bad058733de2c217340bb9ee48f02b07d754808
2021-02-03 18:10:27 +00:00
Yossi Gottlieb
9b564e6bc0 Fix FreeBSD tests and CI Daily issues. (#8438)
* Add bash temporarily to allow sentinel fd leaks test to run.
* Use vmactions-freebsd rdist sync to work around bind permission denied
  and slow execution issues.
* Upgrade to tcl8.6 to be aligned with latest Ubuntu envs.
* Concat all command executions to avoid ignoring failures.
* Skip intensive fuzzer on FreeBSD. For some yet unknown reason, generate_fuzzy_traffic_on_key causes TCL to significantly bloat on FreeBSD resulting with out of memory.
2021-02-03 17:35:28 +02:00
Oran Agra
fcb5309700 Fix test issues from introduction of HRANDFIELD (#8424)
* The corrupt dump fuzzer found a division by zero.
* in some cases the random fields from the HRANDFIELD tests produced
  fields with newlines and other special chars (due to \ char), this caused
  the TCL tests to see a bulk response that has a newline in it and add {}
  around it, later it can think this is a nested list. in fact the `alpha` random
  string generator isn't using spaces and newlines, so it should not use `\`
  either.
2021-01-31 12:13:45 +02:00
Allen Farris
264f1b7f65 implement FAILOVER command (#8315)
Implement FAILOVER command, which coordinates failover
between the server and one of its replicas.
2021-01-28 13:18:05 -08:00
christianEQ
9d62e397bd fixed replicaof no one config, added test case
Former-commit-id: e2615ccf88ddb2a93536b62318983780890c4819
2021-01-28 19:49:22 +00:00
Raghav Muddur
c1eea9e6a2 GETEX, GETDEL and SET PXAT/EXAT (#8327)
This commit introduces two new command and two options for an existing command

GETEX <key> [PERSIST][EX seconds][PX milliseconds] [EXAT seconds-timestamp]
[PXAT milliseconds-timestamp]

The getexCommand() function implements extended options and variants of the GET
command. Unlike GET command this command is not read-only. Only one of the options
can be used at a given time.

1. PERSIST removes any TTL associated with the key.
2. EX Set expiry TTL in seconds.
3. PX Set expiry TTL in milliseconds.
4. EXAT Same like EX instead of specifying the number of seconds representing the
    TTL (time to live), it takes an absolute Unix timestamp
5. PXAT Same like PX instead of specifying the number of milliseconds representing the
    TTL (time to live), it takes an absolute Unix timestamp

Command would return either the bulk string, error or nil.

GETDEL <key>
Would delete the key after getting.

SET key value [NX] [XX] [KEEPTTL] [GET] [EX <seconds>] [PX <milliseconds>]
[EXAT <seconds-timestamp>][PXAT <milliseconds-timestamp>]

Two new options added here are EXAT and PXAT

Key implementation notes
- `SET` with `PX/EX/EXAT/PXAT` is always translated to `PXAT` in `AOF`. When relative time is
  specified (`PX/EX`), replication will always use `PX`.
- `setexCommand` and `psetexCommand` would no longer need translation in `feedAppendOnlyFile`
  as they are modified to invoke `setGenericCommand ` with appropriate flags which will take care of
  correct AOF translation.
- `GETEX` without any optional argument behaves like `GET`.
- `GETEX` command is never propagated, It is either propagated as `PEXPIRE[AT], or PERSIST`.
- `GETDEL` command is propagated as `DEL`
- Combined the validation for `SET` and `GETEX` arguments. 
- Test cases to validate AOF/Replication propagation
2021-01-27 19:47:26 +02:00
christianEQ
c068f2cd3d Merge tag 'tags/6.0.10' into redismerge_2021-01-20
Former-commit-id: dadce055f897cee83946c2d3e5cbb76341b94230
2021-01-26 21:43:09 +00:00
Yossi Gottlieb
e4d0ba933d Add io-thread daily CI tests. (#8232)
This adds basic coverage to IO threads by running the cluster and few selected Redis test suite tests with the IO threads enabled.

Also provides some necessary additional improvements to the test suite:

* Add --config to sentinel/cluster tests for arbitrary configuration.
* Fix --tags whitelisting which was broken.
* Add a `network` tag to some tests that are more network intensive. This is work in progress and more tests should be properly tagged in the future.
2021-01-17 15:48:48 +02:00
Oran Agra
46729a7404 Fix last COW INFO report, Skip test on non-linux platforms (#8301)
- the last COW report wasn't always read from the pipe
  (receiveLastChildInfo wasn't used)
- but in fact, there's no reason we won't always try to drain that pipe
  so i'm unifying receiveLastChildInfo with receiveChildInfo
- adjust threshold of the COW test when run in accurate mode
- add some prints in case this test fails again
- fix indentation, page size, and PID! in MacOS proc info

p.s. it seems that pri_pages_dirtied is always 0
2021-01-08 23:35:30 +02:00
YaacovHazan
4dc4343d18 Report child copy-on-write info continuously
Add INFO field, rdb_active_cow_size, to report COW of a live fork child while
it's active.
- once in 1024 keys check the time, and if there's more than one second since
  the last report send a report to the parent via the pipe.
- refactor the child_info_data struct, it's an implementation detail that
  shouldn't be in the server struct, and not used to communicate data between
  caller and callee
- remove the magic value from that struct (not sure what it was good for), and
  instead add handling of short reads.
- add another value to the structure, cow_type, to indicate if the report is
  for the new rdb_active_cow_size field, or it's the last report of a
  successful operation
- add new Module API to report the active COW
- add more asserts variants to test.tcl
2021-01-07 16:14:29 +02:00
Oran Agra
c9a843c9bb Sanitize dump payload: excessive free on dup zset fields (#8189) 2020-12-14 17:10:31 +02:00
John Sully
5c832c5ad0 Merge branch 'unstable' into keydbpro
Former-commit-id: b7a1e16c0f04e8aeb3764e3681fa14fc0f97f6a3
2020-12-10 02:37:28 +00:00
Oran Agra
595cff7fd4 Sanitize dump payload: fail RESTORE if memory allocation fails
When RDB input attempts to make a huge memory allocation that fails,
RESTORE should fail gracefully rather than die with panic
2020-12-06 14:54:34 +02:00
Oran Agra
7006f02599 Sanitize dump payload: validate no duplicate records in hash/zset/intset
If RESTORE passes successfully with full sanitization, we can't affort
to crash later on assertion due to duplicate records in a hash when
converting it form ziplist to dict.
This means that when doing full sanitization, we must make sure there
are no duplicate records in any of the collections.
2020-12-06 14:54:34 +02:00
Oran Agra
912e22b4f9 Sanitize dump payload: fuzz tester and fixes for segfaults and leaks it exposed
The test creates keys with various encodings, DUMP them, corrupt the payload
and RESTORES it.
It utilizes the recently added use-exit-on-panic config to distinguish between
 asserts and segfaults.
If the restore succeeds, it runs random commands on the key to attempt to
trigger a crash.

It runs in two modes, one with deep sanitation enabled and one without.
In the first one we don't expect any assertions or segfaults, in the second one
we expect assertions, but no segfaults.
We also check for leaks and invalid reads using valgrind, and if we find them
we print the commands that lead to that issue.

Changes in the code (other than the test):
- Replace a few NPD (null pointer deference) flows and division by zero with an
  assertion, so that it doesn't fail the test. (since we set the server to use
  `exit` rather than `abort` on assertion).
- Fix quite a lot of flows in rdb.c that could have lead to memory leaks in
  RESTORE command (since it now responds with an error rather than panic)
- Add a DEBUG flag for SET-SKIP-CHECKSUM-VALIDATION so that the test don't need
  to bother with faking a valid checksum
- Remove a pile of code in serverLogObjectDebugInfo which is actually unsafe to
  run in the crash report (see comments in the code)
- fix a missing boundary check in lzf_decompress

test suite infra improvements:
- be able to run valgrind checks before the process terminates
- rotate log files when restarting servers
2020-12-06 14:54:34 +02:00