9263 Commits

Author SHA1 Message Date
zhenwei pi
1a632b6964 Support setcpuaffinity on linux/bsd
Currently, there are several types of threads/child processes of a
redis server. Sometimes we need to deeply optimise the performance of
redis, so we would like to isolate threads/processes.

There was some discussion about cpu affinity cases in the issue:
https://github.com/antirez/redis/issues/2863

So this patch implements cpu affinity settings in redis.conf, so that
we can configure server_cpulist/bio_cpulist/aof_rewrite_cpulist/
bgsave_cpulist with a cpu list.

Examples of cpulist in redis.conf:
server_cpulist 0-7:2      means cpu affinity 0,2,4,6
bio_cpulist 1,3           means cpu affinity 1,3
aof_rewrite_cpulist 8-11  means cpu affinity 8,9,10,11
bgsave_cpulist 1,10-11    means cpu affinity 1,10,11
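
The cpulist syntax in the examples above (comma-separated entries, optional `start-end` ranges, optional `:stride`) can be expanded as in the following minimal sketch. It is an illustration only, not the parser from this patch; the function name `parse_cpulist` is hypothetical and the grammar is inferred from the four examples.

```python
def parse_cpulist(spec):
    """Expand a cpulist string such as "0-7:2" or "1,10-11" into a sorted list of CPU ids.

    Grammar assumed from the examples in the commit message:
      entry := N | N-M | N-M:STRIDE, entries separated by commas.
    """
    cpus = set()
    for part in spec.split(","):
        part = part.strip()
        # Split off an optional ":stride" suffix (default stride is 1).
        rng, _, stride_text = part.partition(":")
        stride = int(stride_text) if stride_text else 1
        # Split an optional "start-end" range; a bare number is a one-element range.
        start_text, _, end_text = rng.partition("-")
        start = int(start_text)
        end = int(end_text) if end_text else start
        if stride < 1 or end < start:
            raise ValueError("invalid cpulist entry: %r" % part)
        cpus.update(range(start, end + 1, stride))
    return sorted(cpus)
```

On Linux, a list produced this way could then be handed to `os.sched_setaffinity(0, cpus)` to pin the current process.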

Tested on linux/freebsd, both work fine.

Signed-off-by: zhenwei pi <pizhenwei@bytedance.com>
2020-05-08 10:37:35 +02:00
Guy Benoish
3843eb300d XPENDING should not update consumer's seen-time
Same goes for XGROUP DELCONSUMER (But in this case, it doesn't
have any visible effect)
2020-05-08 10:37:35 +02:00
Oran Agra
8b40f686fb optimize memory usage of deferred replies - fixed
When a deferred reply is added the previous reply node cannot be used so
all the extra space we allocated in it is wasted. in case someone uses
deferred replies in a loop, each time adding a small reply, each of
these reply nodes (the small string reply) would have consumed a 16k
block.
now when we add another deferred reply node, we trim the unused portion
of the previous reply block.

see #7123

cherry picked from commit 4ed5b7cb74caf5bef6606909603e371af0da4f9b
with a fix to handle a crash with the LIBC allocator, which apparently can
return the same pointer despite changing its size.
i.e. shrinking an allocation of 16k into 56 bytes without changing the
pointer.
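
The memory effect of the trimming can be modelled roughly as below. This is a toy accounting model, not the Redis reply list code: `ReplyBlock`, `total_reply_memory` and the 16 KiB constant are assumptions taken from the "16k block" wording of this message.

```python
BLOCK_SIZE = 16 * 1024  # assumed reply block size, from the "16k" figure above


class ReplyBlock:
    """Hypothetical stand-in for one reply node: allocated capacity vs. bytes used."""

    def __init__(self, payload_len):
        self.used = payload_len
        self.capacity = max(BLOCK_SIZE, payload_len)

    def trim(self):
        # Give back the unused tail, as done when a new deferred node follows.
        self.capacity = self.used


def total_reply_memory(payload_lens, trim_on_defer):
    """Sum allocated bytes when every small reply is followed by a deferred node."""
    blocks = []
    for n in payload_lens:
        if trim_on_defer and blocks:
            blocks[-1].trim()  # previous block can no longer be appended to
        blocks.append(ReplyBlock(n))
    return sum(b.capacity for b in blocks)
```

With 100 replies of 56 bytes each, the untrimmed model allocates 100 full blocks, while the trimmed one keeps only the last block at full size.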
2020-05-08 10:37:35 +02:00
Deliang Yang
5133e970cc reformat code 2020-05-08 10:37:35 +02:00
Oran Agra
eb9d28903d add daily github actions with libc malloc and valgrind
* fix memory leaks with diskless replica short read.
* fix a few timing issues with valgrind runs
* fix issue with valgrind and watchdog schedule signal

about the valgrind WD issue:
the stack trace test in logging.tcl, has issues with valgrind:
==28808== Can't extend stack to 0x1ffeffdb38 during signal delivery for thread 1:
==28808==   too small or bad protection modes

it seems to be some valgrind bug with SA_ONSTACK.
SA_ONSTACK seems unneeded since WD is not recursive (SA_NODEFER was removed),
also, not sure if it's even valid without a call to sigaltstack()
2020-05-08 10:37:35 +02:00
antirez
d36292e2b8 Redis 6.0.1. 2020-05-02 00:10:20 +02:00
antirez
1812dc1644 Cast printf() argument to the format specifier.
We could use uint64_t specific macros, but after all it's simpler to
just use an obvious equivalent type plus casting: this will be a no op
and is simpler than fixed size types printf macros.
2020-05-02 00:04:19 +02:00
antirez
410e48ffb4 Revert "optimize memory usage of deferred replies"
This reverts commit 4ed5b7cb74caf5bef6606909603e371af0da4f9b.
2020-05-02 00:04:19 +02:00
antirez
ef0c2a9268 Save a call to stopThreadedIOIfNeeded() for the base case.
Probably no performance changes, but the code should be trivial to
read as in "No threading? Use the normal function and return".
2020-05-02 00:04:19 +02:00
antirez
1367cb7b6d Redis 6.0.0 GA. 2020-04-30 15:04:41 +02:00
antirez
c7e3501d25 Update help.h again before Redis 6 GA. 2020-04-30 13:43:58 +02:00
antirez
9d5ba2e7be redis-cli: fix hints with subcommands. 2020-04-30 13:43:58 +02:00
antirez
6ffa739c13 redis-cli command help updated. 2020-04-30 13:02:59 +02:00
zhaozhao.zz
b5dc95d4cc lazyfree & eviction: record latency generated by lazyfree eviction
1. add eviction-lazyfree monitor
2. put eviction-del & eviction-lazyfree into eviction-cycle
   that means eviction-cycle contains all the latency in
   the eviction cycle including del and lazyfree
3. use getMaxmemoryState to check if we can break in lazyfree-evict
2020-04-30 13:02:59 +02:00
antirez
c3aa13a46a MIGRATE AUTH2 for ACL support. 2020-04-30 13:02:59 +02:00
antirez
4f61650c3c CLIENT KILL USER <username>. 2020-04-30 13:02:59 +02:00
antirez
aec7e4a836 Fix tracking table max keys option in redis.conf. 2020-04-30 13:02:58 +02:00
antirez
99569af4aa redis-cli: safer cluster fix with unreachable masters. 2020-04-30 13:02:58 +02:00
antirez
9a9953d331 redis-cli: simplify cluster nodes coverage display. 2020-04-30 13:02:58 +02:00
antirez
dbf803bf9c redis-cli: try to make clusterManagerFixOpenSlot() more readable.
Also improve the message to make clear that there is no *clear* owner,
not that there is no owner at all.
2020-04-30 13:02:58 +02:00
Guy Benoish
20a9fe531c XINFO STREAM FULL should have a default COUNT of 10 2020-04-30 13:02:58 +02:00
antirez
78b9c097c9 Comment clearly why we moved some code in #6623. 2020-04-30 13:02:58 +02:00
srzhao
e9811c3b12 fix pipelined WAIT performance issue.
If a client gets blocked again in `processUnblockedClients`, redis will not send
`REPLCONF GETACK *` to slaves until the next eventloop, so the client will be
blocked for 100ms by default (10hz) if no other file event fires.

move the server.get_ack_from_slaves snippet after `processUnblockedClients`, so
that both the first WAIT command that puts the client in blocked context and the
following WAIT command processed in processUnblockedClients trigger
redis-server to send `REPLCONF GETACK *`, so that the eventloop gets
`REPLCONF ACK <reploffset>` from slaves and the client is unblocked ASAP.
2020-04-30 13:02:58 +02:00
antirez
d66ac30fd4 Fix create-cluster BIN_PATH. 2020-04-30 13:02:58 +02:00
Guy Benoish
dc3d865edc Extend XINFO STREAM output
Introducing XINFO STREAM <key> FULL
2020-04-30 13:02:58 +02:00
hwware
12bb6b0f08 Fix unused macro in cluster.c 2020-04-30 13:02:58 +02:00
Itamar Haber
1541e3e522 Update create-cluster 2020-04-30 13:02:58 +02:00
Itamar Haber
c028751ef0 Adds BIN_PATH to create-cluster
Allows for setting the binaries path if used outside the upstream repo.

Also documents `call` in usage clause (TODO: port to
`redis-cli --cluster call` or just deprecate it).
2020-04-30 13:02:58 +02:00
Oran Agra
fb0a0c6451 hiccup, re-fix dictEncObjKeyCompare
come to think of it, in theory (not in practice), getDecodedObject can
return the same original object with refcount incremented, so the
pointer comparison in the previous commit was invalid.
so now instead of checking the encoding, we explicitly check the
refcount.
2020-04-28 11:20:15 +02:00
Oran Agra
a8995ce3c9 fix loading race in psync2 tests 2020-04-28 11:20:15 +02:00
antirez
ffbe6543ab Rework comment in dictEncObjKeyCompare(). 2020-04-28 11:20:15 +02:00
Oran Agra
d92f14e825 allow dictFind using static robj
since the recent addition of OBJ_STATIC_REFCOUNT and the assertion in
incrRefCount it is now impossible to use dictFind using a static robj,
because dictEncObjKeyCompare will call getDecodedObject which tries to
increment the refcount just in order to decrement it later.
2020-04-28 11:20:15 +02:00
Madelyn Olson
e853b8f137 Added crcspeed library 2020-04-28 11:20:15 +02:00
Madelyn Olson
e49a60d9df Made crc64 test consistent 2020-04-28 11:20:15 +02:00
Madelyn Olson
1652f7b897 Implemented CRC64 based on slice by 4 2020-04-28 11:20:15 +02:00
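
For readers unfamiliar with the technique named in this commit, the sketch below shows slice-by-4 for a reflected 64-bit CRC next to a plain bit-by-bit reference. It is an illustration, not the C code added by the commit. It assumes the Jones polynomial 0xad93d23594c935a9 with zero initial value and no final XOR, and all function names are hypothetical.

```python
# Bit-reversed form of the Jones polynomial 0xad93d23594c935a9 (assumed parameters).
POLY_REFLECTED = 0x95AC9329AC4BC9B5


def crc64_bitwise(data, crc=0):
    """Reference implementation: one bit at a time, reflected, init 0, no final xor."""
    for byte in data:
        crc ^= byte
        for _ in range(8):
            crc = (crc >> 1) ^ POLY_REFLECTED if crc & 1 else crc >> 1
    return crc


def build_tables(slices=4):
    """tables[k][b] is the CRC state after feeding byte b followed by k zero bytes."""
    base = [crc64_bitwise(bytes([b])) for b in range(256)]
    tables = [base]
    for _ in range(1, slices):
        prev = tables[-1]
        tables.append([(v >> 8) ^ base[v & 0xFF] for v in prev])
    return tables


TABLES = build_tables()


def crc64_slice4(data, crc=0):
    """Consume four input bytes per step using four lookup tables."""
    t0, t1, t2, t3 = TABLES
    i, n = 0, len(data)
    while n - i >= 4:
        # Fold the next 4 bytes (little-endian) into the low 32 bits of the state.
        crc ^= int.from_bytes(data[i:i + 4], "little")
        crc = (t3[crc & 0xFF] ^ t2[(crc >> 8) & 0xFF] ^
               t1[(crc >> 16) & 0xFF] ^ t0[(crc >> 24) & 0xFF] ^ (crc >> 32))
        i += 4
    # Leftover bytes fall back to the ordinary byte-at-a-time table step.
    for byte in data[i:]:
        crc = t0[(crc ^ byte) & 0xFF] ^ (crc >> 8)
    return crc
```

Both functions should agree on any input; the sliced version simply trades table memory for fewer loop iterations.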
Oran Agra
f1cd7f5880 optimize memory usage of deferred replies
When a deferred reply is added the previous reply node cannot be used so
all the extra space we allocated in it is wasted. in case someone uses
deferred replies in a loop, each time adding a small reply, each of
these reply nodes (the small string reply) would have consumed a 16k
block.
now when we add another deferred reply node, we trim the unused portion
of the previous reply block.

see #7123
2020-04-27 16:46:14 +02:00
Oran Agra
58619c1286 Keep track of meaningful replication offset in replicas too
Now both master and replicas keep track of the last replication offset
that contains meaningful data (ignoring the trailing pings), and both
trim that tail from the replication backlog, and from the offset
they use when attempting psync.

the implication is that if someone missed some pings, or even have
excessive pings that the promoted replica has, it'll still be able to
psync (avoid full sync).

the downside (which was already committed) is that replicas running old
code may fail to psync, since the promoted replica trims pings from its
backlog.

This commit adds a test that reproduces several cases of promotions and
demotions with stale and non-stale pings

Background:
The meaningful offset on the master was added recently to solve a problem where
the master is left all alone, injecting PINGs into its backlog when no one is
listening and then gets demoted and tries to replicate from a replica that didn't
have any of the PINGs (or at least not the last ones).

however, consider this case:
master A has two replicas (B and C) replicating directly from it.
there's no traffic at all, and also no network issues, just many pings in the
tail of the backlog. now B gets promoted, A becomes a replica of B, and C
remains a replica of A. when A gets demoted, it trims the pings from its
backlog, and successfully replicates from B. however, C is still aware of
these PINGs, so when it disconnects and re-connects to A, it'll ask for something
that's not in the backlog anymore (since A trimmed the tail of its backlog),
and be forced to do a full sync (something it didn't have to do before the
meaningful offset fix).

Besides that, the psync2 test was always failing randomly here and there, and it
turns out the reason was PINGs. Investigating it shows the following scenario:

cycle 1: redis #1 is master, and all the rest are direct replicas of #1
cycle 2: redis #2 is promoted to master, #1 is a replica of #2 and #3 is replica of #1
now we see that when #1 is demoted it prints:
17339:S 21 Apr 2020 11:16:38.523 * Using the meaningful offset 3929963 instead of 3929977 to exclude the final PINGs (14 bytes difference)
17339:S 21 Apr 2020 11:16:39.391 * Trying a partial resynchronization (request e2b3f8817735fdfe5fa4626766daa938b61419e5:3929964).
17339:S 21 Apr 2020 11:16:39.392 * Successful partial resynchronization with master.
and when #3 connects to the demoted #2, #2 says:
17339:S 21 Apr 2020 11:16:40.084 * Partial resynchronization not accepted: Requested offset for secondary ID was 3929978, but I can reply up to 3929964

so the issue here is that the meaningful offset feature saved the day for the
demoted master (since it needs to sync from a replica that didn't get the last
ping), but it didn't help one of the other replicas which did get the last ping.
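
The offsets in the log lines above can be reproduced with a little arithmetic. This is a sketch, not the replication code: it assumes the trailing PING is stored in the backlog in its RESP encoding and that a partial resync is accepted only up to one byte past the master's last offset. The helper names are hypothetical.

```python
# RESP encoding of a PING command as it would appear in the replication stream.
PING = b"*1\r\n$4\r\nPING\r\n"


def meaningful_offset(repl_offset, trailing_pings):
    """Offset of the last meaningful byte once trailing PINGs are excluded."""
    return repl_offset - trailing_pings * len(PING)


def psync_request_offset(offset):
    """A replica asks for the first byte it does not have yet."""
    return offset + 1


def partial_resync_accepted(requested, master_offset):
    """Simplified acceptance check: the request must not point past the backlog end."""
    return requested <= master_offset + 1
```

With the numbers from the log, the demoted master requests 3929964 and succeeds, while the replica that did receive the last PING requests 3929978 and is rejected.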
2020-04-27 15:52:49 +02:00
antirez
a0c54d5622 Fix STRALGO command flags. 2020-04-27 15:52:49 +02:00
Dave-in-lafayette
1713328723 fix for unintended crash during panic response
If redis crashes early, before lua is set up (like, if File Descriptor 0 is closed before exec), it will crash again trying to print memory statistics.
2020-04-27 15:52:49 +02:00
Guy Benoish
aafc91fcc9 Add the stream tag to XSETID tests 2020-04-27 15:52:49 +02:00
Dave-in-lafayette
4c30d6d732 fix for crash during panic before all threads are up
If there's a panic before all threads have been started (say, if file descriptor 0 is closed at exec), the panic response will crash here again.
2020-04-27 15:52:49 +02:00
antirez
262da0ba78 LCS -> STRALGO LCS.
STRALGO should be a container for mostly read-only string
algorithms in Redis. The algorithms should have two main
characteristics:

1. They should be non trivial to compute, and often not part of
programming language standard libraries.
2. They should be fast enough that it is a good idea to have optimized C
implementations.

Next thing I would love to see? A small strings compression algorithm.
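
As an illustration of the kind of algorithm meant here, the longest common subsequence of two strings can be computed with the textbook dynamic-programming table. The sketch below is that textbook method in Python, not the optimized C implementation behind STRALGO LCS; the function name `lcs` is chosen for this example.

```python
def lcs(a, b):
    """Return one longest common subsequence of strings a and b (O(len(a)*len(b)))."""
    m, n = len(a), len(b)
    # dp[i][j] = LCS length of a[:i] and b[:j]
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if a[i - 1] == b[j - 1]:
                dp[i][j] = dp[i - 1][j - 1] + 1
            else:
                dp[i][j] = max(dp[i - 1][j], dp[i][j - 1])
    # Walk back from the bottom-right corner to recover the subsequence itself.
    out = []
    i, j = m, n
    while i > 0 and j > 0:
        if a[i - 1] == b[j - 1]:
            out.append(a[i - 1])
            i -= 1
            j -= 1
        elif dp[i - 1][j] >= dp[i][j - 1]:
            i -= 1
        else:
            j -= 1
    return "".join(reversed(out))
```

The quadratic table is what makes the computation non trivial for large strings, which is the argument for doing it in optimized C.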
2020-04-24 16:49:27 +02:00
antirez
e772b55a9c Also use propagate() in streamPropagateGroupID(). 2020-04-24 10:15:04 +02:00
yanhui13
e0add7e0f1 add tcl test for cluster slots 2020-04-24 10:15:04 +02:00
yanhui13
782d9f2ff9 optimize the output of cluster slots 2020-04-24 10:15:04 +02:00
antirez
6d3bd2ed5a Minor aesthetic changes to #7135. 2020-04-24 10:15:04 +02:00
Valentino Geron
a2a5b1d6ae XREADGROUP with NOACK should propagate only one XGROUP SETID command 2020-04-24 10:15:04 +02:00
antirez
408d4fb35d ACL: re-enable command execution of disabled users.
After all I changed my mind again: enabled/disabled should have a
clearer meaning, and it only means: you can't authenticate as such a user
on new connections, however old connections continue to work as
expected.
2020-04-24 10:15:04 +02:00
antirez
76aa8a43ab getRandomBytes(): use HMAC-SHA256.
Now that we have an interface to use this API directly, via ACL GENPASS,
we are no longer sure what people could do with it. So why not make it
a strong primitive exported by Redis in order to create unique IDs and
so forth?

The implementation was tested against the test vectors that can
be found in RFC4231.
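
A minimal sketch of the idea, using Python's standard `hmac` module: pseudo-random output is derived by applying HMAC-SHA256 to an incrementing counter under a secret seed. The counter layout, the seed size and the helper names are assumptions made for this illustration, not details confirmed by the commit. The last helper reproduces test case 1 of RFC 4231 as a sanity check of the primitive.

```python
import hashlib
import hmac
import os


def make_random_bytes_generator(seed=None):
    """Return a get_random_bytes(n) function keyed by a secret seed (assumed design)."""
    key = seed if seed is not None else os.urandom(64)
    counter = 0

    def get_random_bytes(n):
        nonlocal counter
        out = bytearray()
        while len(out) < n:
            # Each 32-byte chunk is HMAC-SHA256(key, counter); the counter never repeats.
            block = hmac.new(key, counter.to_bytes(8, "little"), hashlib.sha256).digest()
            out += block
            counter += 1
        return bytes(out[:n])

    return get_random_bytes


def rfc4231_case1():
    """HMAC-SHA256 of "Hi There" under a key of twenty 0x0b bytes (RFC 4231, test case 1)."""
    return hmac.new(b"\x0b" * 20, b"Hi There", hashlib.sha256).hexdigest()
```

Passing a fixed seed makes the output reproducible, which is useful only for testing; real use would rely on the random default.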
2020-04-24 10:14:48 +02:00
antirez
32c6699847 ACL GENPASS: take number of bits as argument. 2020-04-24 10:14:48 +02:00