11 Commits

Author SHA1 Message Date
sundb
b69f619f17 Sanitize dump payload: fix double free after insert dup nodekey to stream rax and returns 0 (#9399)
(cherry picked from commit 492d8d09613cff88f15dcef98732392b8d509eb1)
2021-10-04 13:59:40 +03:00
sundb
b2f4ca8c64 Sanitize dump payload: handle remaining empty key when RDB loading and restore command (#9349)
This commit mainly fixes empty keys due to RDB loading and restore command,
which was omitted in #9297.

1) When loading quicklsit, if all the ziplists in the quicklist are empty, NULL will be returned.
    If only some of the ziplists are empty, then we will skip the empty ziplists silently.
2) When loading hash zipmap, if zipmap is empty, sanitization check will fail.
3) When loading hash ziplist, if ziplist is empty, NULL will be returned.
4) Add RDB loading test with sanitize.

(cherry picked from commit cbda492909cd2fff25263913cd2e1f00bc48a541)
2021-10-04 13:59:40 +03:00
Oran Agra
1334180f82 Improvements to corrupt payload sanitization (#9321)
Recently we found two issues in the fuzzer tester: #9302 #9285
After fixing them, more problems surfaced and this PR (as well as #9297) aims to fix them.

Here's a list of the fixes
- Prevent an overflow when allocating a dict hashtable
- Prevent OOM when attempting to allocate a huge string
- Prevent a few invalid accesses in listpack
- Improve sanitization of listpack first entry
- Validate integrity of stream consumer groups PEL
- Validate integrity of stream listpack entry IDs
- Validate ziplist tail followed by extra data which start with 0xff

Co-authored-by: sundb <sundbcn@gmail.com>
(cherry picked from commit 0c90370e6d71cc68e4d9cc79a0d8b1e768712a5b)
2021-10-04 13:59:40 +03:00
sundb
a85fb0283c Sanitize dump payload: fix empty keys when RDB loading and restore command (#9297)
When we load rdb or restore command, if we encounter a length of 0, it will result in the creation of an empty key.
This could either be a corrupt payload, or a result of a bug (see #8453 )

This PR mainly fixes the following:
1) When restore command will return `Bad data format` error.
2) When loading RDB, we will silently discard the key.

Co-authored-by: Oran Agra <oran@redislabs.com>
(cherry picked from commit 8ea777a6a02cae22aeff95f054d810f30b7b69ad)
2021-10-04 13:59:40 +03:00
Oran Agra
5653935571 Corrupt stream key access to uninitialized memory (#8681)
the corrupt-dump-fuzzer test found a case where an access to a corrupt
stream would have caused accessing to uninitialized memory.
now it'll panic instead.

The issue was that there was a stream that says it has more than 0
records, but looking for the max ID came back empty handed.

p.s. when sanitize-dump-payload is used, this corruption is detected,
and the RESTORE command is gracefully rejected.
2021-03-24 11:33:49 +02:00
Oran Agra
fcb5309700 Fix test issues from introduction of HRANDFIELD (#8424)
* The corrupt dump fuzzer found a division by zero.
* in some cases the random fields from the HRANDFIELD tests produced
  fields with newlines and other special chars (due to \ char), this caused
  the TCL tests to see a bulk response that has a newline in it and add {}
  around it, later it can think this is a nested list. in fact the `alpha` random
  string generator isn't using spaces and newlines, so it should not use `\`
  either.
2021-01-31 12:13:45 +02:00
Oran Agra
c9a843c9bb Sanitize dump payload: excessive free on dup zset fields (#8189) 2020-12-14 17:10:31 +02:00
Oran Agra
595cff7fd4 Sanitize dump payload: fail RESTORE if memory allocation fails
When RDB input attempts to make a huge memory allocation that fails,
RESTORE should fail gracefully rather than die with panic
2020-12-06 14:54:34 +02:00
Oran Agra
7006f02599 Sanitize dump payload: validate no duplicate records in hash/zset/intset
If RESTORE passes successfully with full sanitization, we can't affort
to crash later on assertion due to duplicate records in a hash when
converting it form ziplist to dict.
This means that when doing full sanitization, we must make sure there
are no duplicate records in any of the collections.
2020-12-06 14:54:34 +02:00
Oran Agra
912e22b4f9 Sanitize dump payload: fuzz tester and fixes for segfaults and leaks it exposed
The test creates keys with various encodings, DUMP them, corrupt the payload
and RESTORES it.
It utilizes the recently added use-exit-on-panic config to distinguish between
 asserts and segfaults.
If the restore succeeds, it runs random commands on the key to attempt to
trigger a crash.

It runs in two modes, one with deep sanitation enabled and one without.
In the first one we don't expect any assertions or segfaults, in the second one
we expect assertions, but no segfaults.
We also check for leaks and invalid reads using valgrind, and if we find them
we print the commands that lead to that issue.

Changes in the code (other than the test):
- Replace a few NPD (null pointer deference) flows and division by zero with an
  assertion, so that it doesn't fail the test. (since we set the server to use
  `exit` rather than `abort` on assertion).
- Fix quite a lot of flows in rdb.c that could have lead to memory leaks in
  RESTORE command (since it now responds with an error rather than panic)
- Add a DEBUG flag for SET-SKIP-CHECKSUM-VALIDATION so that the test don't need
  to bother with faking a valid checksum
- Remove a pile of code in serverLogObjectDebugInfo which is actually unsafe to
  run in the crash report (see comments in the code)
- fix a missing boundary check in lzf_decompress

test suite infra improvements:
- be able to run valgrind checks before the process terminates
- rotate log files when restarting servers
2020-12-06 14:54:34 +02:00
Oran Agra
137434d50a Sanitize dump payload: ziplist, listpack, zipmap, intset, stream
When loading an encoded payload we will at least do a shallow validation to
check that the size that's encoded in the payload matches the size of the
allocation.
This let's us later use this encoded size to make sure the various offsets
inside encoded payload don't reach outside the allocation, if they do, we'll
assert/panic, but at least we won't segfault or smear memory.

We can also do 'deep' validation which runs on all the records of the encoded
payload and validates that they don't contain invalid offsets. This lets us
detect corruptions early and reject a RESTORE command rather than accepting
it and asserting (crashing) later when accessing that payload via some command.

configuration:
- adding ACL flag skip-sanitize-payload
- adding config sanitize-dump-payload [yes/no/clients]

For now, we don't have a good way to ensure MIGRATE in cluster resharding isn't
being slowed down by these sanitation, so i'm setting the default value to `no`,
but later on it should be set to `clients` by default.

changes:
- changing rdbReportError not to `exit` in RESTORE command
- adding a new stat to be able to later check if cluster MIGRATE isn't being
  slowed down by sanitation.
2020-12-06 14:54:34 +02:00