9206 Commits

Author SHA1 Message Date
Oran Agra
640a4479b0 Don't queue commands in an already aborted MULTI state 2020-06-09 11:53:01 +02:00
Oran Agra
881d2c4663 Avoid rejecting WATCH / UNWATCH, like MULTI/EXEC/DISCARD
Much like MULTI/EXEC/DISCARD, the WATCH and UNWATCH are not actually
operating on the database or server state, but instead operate on the
client state. the client may send them all in one long pipeline and check
all the responses only at the end, so failing them may lead to a
mismatch between the client state on the server and the one on the
client end, and execute the wrong commands (ones that were meant to be
discarded)

the watched keys are not actually stored in the client struct, but they
are in fact part of the client state. for instance, they're not cleared
or moved in SWAPDB or FLUSHDB.
2020-06-09 11:53:01 +02:00
zhaozhao.zz
d496ce7271 AOF: append origin SET if no expire option 2020-06-09 11:53:01 +02:00
Oran Agra
216b5a1aae fix disconnectSlaves, to try to free each slave.
the recent change in that loop (iteration rather than waiting for it to
be empty) was intended to avoid an endless loop in case some slave would
refuse to be freed.

but the lookup of the first client remained, which would have caused it
to try the first one again and again instead of moving on.
2020-06-09 11:53:01 +02:00
zhaozhao.zz
2a8ee55176 donot free protected client in freeClientsInAsyncFreeQueue
related #7234
2020-06-09 11:53:01 +02:00
Oran Agra
fed743b2e1 fix pingoff test race 2020-06-06 11:44:21 +02:00
Kevin Fwu
ce014c2a3b Fix TLS certificate loading for chained certificates.
This impacts client verification for chained certificates (such as Lets
Encrypt certificates). Client Verify requires the full chain in order to
properly verify the certificate.
2020-06-06 11:44:21 +02:00
antirez
bec1ac9899 Revert "Implements sendfile for redis."
This reverts commit 5675053269b0cbc2cf525c99321c96b7c2b39abe.
2020-06-06 11:43:29 +02:00
antirez
8bfdba1e08 Revert "avoid using sendfile if tls-replication is enabled"
This reverts commit 13bbd165e87923558952203d310e9ad053d4d7c0.
2020-06-06 11:43:29 +02:00
Liu Zhen
123d94ffc0 fix clusters mixing accidentally by gossip
`clusterStartHandshake` will start hand handshake
and eventually send CLUSTER MEET message, which is strictly prohibited
in the REDIS CLUSTER SPEC.
Only system administrator can initiate CLUSTER MEET message.
Futher, according to the SPEC, rather than IP/PORT pairs, only nodeid
can be trusted.
2020-06-06 11:43:29 +02:00
antirez
a5f0fa7f7a Fix handling of special chars in ACL LOAD.
Now it is also possible for ACL SETUSER to accept empty strings
as valid operations (doing nothing), so for instance

    ACL SETUSER myuser ""

Will have just the effect of creating a user in the default state.

This should fix #7329.
2020-06-06 11:43:29 +02:00
antirez
c512f64428 Redis 6.0.4. 2020-05-28 12:18:38 +02:00
antirez
41bb699867 Test: take PSYNC2 test master timeout high during switch.
This will likely avoid false positives due to trailing pings.
2020-05-28 10:56:14 +02:00
antirez
40578433c7 Test: add the tracking unit as default. 2020-05-28 10:22:29 +02:00
Oran Agra
571b03021a tests: find_available_port start search from next port
i.e. don't start the search from scratch hitting the used ones again.
this will also reduce the likelihood of collisions (if there are any
left) by increasing the time until we re-use a port we did use in the
past.
2020-05-28 10:09:51 +02:00
Oran Agra
4653d796f0 tests: each test client work on a distinct port range
apparently when running tests in parallel (the default of --clients 16),
there's a chance for two tests to use the same port.
specifically, one test might shutdown a master and still have the
replica up, and then another test will re-use the port number of master
for another master, and then that replica will connect to the master of
the other test.

this can cause a master to count too many full syncs and fail a test if
we run the tests with --single integration/psync2 --loop --stop

see Probmem 2 in #7314
2020-05-28 10:09:51 +02:00
Oran Agra
31bd963557 32bit CI needs to build modules correctly 2020-05-28 10:09:51 +02:00
Oran Agra
01039e5964 adjust revived meaningful offset tests
these tests create several edge cases that are otherwise uncovered (at
least not consistently) by the test suite, so although they're no longer
testing what they were meant to test, it's still a good idea to keep
them in hope that they'll expose some issue in the future.
2020-05-28 10:09:51 +02:00
Oran Agra
98e6f2cd5b revive meaningful offset tests 2020-05-28 10:09:51 +02:00
antirez
84117d13b7 Replication: showLatestBacklog() refactored out. 2020-05-28 10:09:51 +02:00
antirez
14d99c183f Drop useless line from replicationCacheMaster(). 2020-05-28 10:09:51 +02:00
antirez
0163e4e495 Another meaningful offset test removed. 2020-05-28 10:09:51 +02:00
antirez
24a0f7bf55 Remove the PSYNC2 meaningful offset test. 2020-05-28 10:09:51 +02:00
antirez
911c579b68 Remove the meaningful offset feature.
After a closer look, the Redis core devleopers all believe that this was
too fragile, caused many bugs that we didn't expect and that were very
hard to track. Better to find an alternative solution that is simpler.
2020-05-28 10:09:51 +02:00
antirez
7e55485b21 Set a protocol error if master use the inline protocol.
We want to react a bit more aggressively if we sense that the master is
sending us some corrupted stream. By setting the protocol error we both
ensure that the replica will disconnect, and avoid caching the master so
that a full SYNC will be required. This is protective against
replication bugs.
2020-05-28 10:09:51 +02:00
Oran Agra
abb9dcd975 daily CI test with tls 2020-05-28 10:09:51 +02:00
Oran Agra
0705a29959 avoid using sendfile if tls-replication is enabled
this obviously broke the tests, but went unnoticed so far since tls
wasn't often tested.
2020-05-28 10:09:51 +02:00
antirez
fee0c76304 Replication: log backlog creation event. 2020-05-28 10:09:51 +02:00
antirez
2411e4e33f Test: PSYNC2 test can now show server logs. 2020-05-28 10:09:51 +02:00
antirez
7a32a8485e Clarify what is happening in PR #7320. 2020-05-25 12:08:01 +02:00
zhaozhao.zz
d089cc8963 PSYNC2: second_replid_offset should be real meaningful offset
After adjustMeaningfulReplOffset(), all the other related variable
should be updated, including server.second_replid_offset.

Or the old version redis like 5.0 may receive wrong data from
replication stream, cause redis 5.0 can sync with redis 6.0,
but doesn't know meaningful offset.
2020-05-25 12:08:01 +02:00
Oran Agra
1f5163f454 add CI for 32bit build 2020-05-25 12:08:01 +02:00
antirez
8a4e01f2bc Make disconnectSlaves() synchronous in the base case.
Otherwise we run into that:

Backtrace:
src/redis-server 127.0.0.1:21322(logStackTrace+0x45)[0x479035]
src/redis-server 127.0.0.1:21322(sigsegvHandler+0xb9)[0x4797f9]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x11390)[0x7fd373c5e390]
src/redis-server 127.0.0.1:21322(_serverAssert+0x6a)[0x47660a]
src/redis-server 127.0.0.1:21322(freeReplicationBacklog+0x42)[0x451282]
src/redis-server 127.0.0.1:21322[0x4552d4]
src/redis-server 127.0.0.1:21322[0x4c5593]
src/redis-server 127.0.0.1:21322(aeProcessEvents+0x2e6)[0x42e786]
src/redis-server 127.0.0.1:21322(aeMain+0x1d)[0x42eb0d]
src/redis-server 127.0.0.1:21322(main+0x4c5)[0x42b145]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf0)[0x7fd3738a3830]
src/redis-server 127.0.0.1:21322(_start+0x29)[0x42b409]

Since we disconnect all the replicas and free the replication backlog in
certain replication paths, and the code that will free the replication
backlog expects that no replica is connected.

However we still need to free the replicas asynchronously in certain
cases, as documented in the top comment of disconnectSlaves().
2020-05-25 12:08:01 +02:00
ShooterIT
fc18f9a798 Implements sendfile for redis. 2020-05-25 12:08:01 +02:00
antirez
ed3fd6c524 Fix #7306 less aggressively.
Citing from the issue:

btw I suggest we change this fix to something else:
* We revert the fix.
* We add a call that disconnects chained replicas in the place where we trim the replica (that is a master i this case) offset.
This way we can avoid disconnections when there is no trimming of the backlog.

Note that we now want to disconnect replicas asynchronously in
disconnectSlaves(), because it's in general safer now that we can call
it from freeClient(). Otherwise for instance the command:

    CLIENT KILL TYPE master

May crash: clientCommand() starts running the linked of of clients,
looking for clients to kill. However it finds the master, kills it
calling freeClient(), but this in turn calls replicationCacheMaster()
that may also call disconnectSlaves() now. So the linked list iterator
of the clientCommand() will no longer be valid.
2020-05-25 12:08:01 +02:00
Madelyn Olson
8802cbbcde EAGAIN for tls during diskless load 2020-05-22 12:37:59 +02:00
Qu Chen
5d59bbb6d9 Disconnect chained replicas when the replica performs PSYNC with the master always to avoid replication offset mismatch between master and chained replicas. 2020-05-22 12:37:59 +02:00
hwware
08c70f7dce using moreargs variable 2020-05-22 12:37:59 +02:00
hwware
5c70dab9fc fix server crash for STRALGO command 2020-05-22 12:37:49 +02:00
ShooterIT
e60f63a718 Replace addDeferredMultiBulkLength with addReplyDeferredLen in comment 2020-05-22 12:37:49 +02:00
Yossi Gottlieb
26143c2936 TLS: Improve tls-protocols clarity in redis.conf. 2020-05-22 12:37:49 +02:00
ShooterIT
9d1265240f Fix reply bytes calculation error
Fix #7275.
2020-05-22 12:37:49 +02:00
zhaozhao.zz
830c674845 Tracking: flag CLIENT_TRACKING_BROKEN_REDIR when redir broken 2020-05-22 12:37:49 +02:00
Oran Agra
56d63f4d7d fix a rare active defrag edge case bug leading to stagnation
There's a rare case which leads to stagnation in the defragger, causing
it to keep scanning the keyspace and do nothing (not moving any
allocation), this happens when all the allocator slabs of a certain bin
have the same % utilization, but the slab from which new allocations are
made have a lower utilization.

this commit fixes it by removing the current slab from the overall
average utilization of the bin, and also eliminate any precision loss in
the utilization calculation and move the decision about the defrag to
reside inside jemalloc.

and also add a test that consistently reproduce this issue.
2020-05-22 12:37:49 +02:00
Oran Agra
4f8e19f877 improve DEBUG MALLCTL to be able to write to write only fields.
also support:
  debug mallctl-str thread.tcache.flush VOID
2020-05-22 12:37:49 +02:00
hujie
6bb5b6d942 fix clear USER_FLAG_ALLCOMMANDS flag in acl
in ACLSetUserCommandBit, when the command bit overflows, no operation
is performed, so no need clear the USER_FLAG_ALLCOMMANDS flag.

in ACLSetUser, when adding subcommand, we don't need to call
ACLGetCommandID ahead since subcommand may be empty.
2020-05-22 12:37:49 +02:00
ShooterIT
2478a83ce7 Redis Benchmark: generate random test data
The function of generating random data is designed by antirez. See #7196.
2020-05-22 12:37:49 +02:00
hwware
3ffb3ac7ea Redis-Benchmark: avoid potentical memmory leaking 2020-05-22 12:37:49 +02:00
WuYunlong
65de5a1a7d Handle keys with hash tag when computing hash slot using tcl cluster client. 2020-05-22 12:37:49 +02:00
WuYunlong
71036d4cdb Add a test to prove current tcl cluster client can not handle keys with hash tag. 2020-05-22 12:37:49 +02:00