494 Commits

Author SHA1 Message Date
Madelyn Olson
32d225c644 Properly reset errno for rdbLoad (#7542)
(cherry picked from commit 9615c7480de53c920baadf6279b527b60de8f0d4)
2020-09-01 09:27:58 +03:00
dmurnane
81d36bc6c8 Notify systemd on sentinel startup (#7168)
Co-authored-by: Daniel Murnane <dmurnane@eitccorp.com>
(cherry picked from commit c292d43fec492f555e6889b6f53b7f380f54bccc)
2020-07-20 21:08:26 +03:00
Oran Agra
8510588213 EXEC always fails with EXECABORT and multi-state is cleared
In order to support the use of multi-exec in pipeline, it is important that
MULTI and EXEC are never rejected and it is easy for the client to know if the
connection is still in multi state.

It was easy to make sure MULTI and DISCARD never fail (done by previous
commits) since these only change the client state and don't do any actual
change in the server, but EXEC is a different story.

Since in the past, it was possible for clients to handle some EXEC errors and
retry the EXEC, we now can't affort to return any error on EXEC other than
EXECABORT, which now carries with it the real reason for the abort too.

Other fixes in this commit:
- Some checks that where performed at the time of queuing need to be re-
  validated when EXEC runs, for instance if the transaction contains writes
  commands, it needs to be aborted. there was one check that was already done
  in execCommand (-READONLY), but other checks where missing: -OOM, -MISCONF,
  -NOREPLICAS, -MASTERDOWN
- When a command is rejected by processCommand it was rejected with addReply,
  which was not recognized as an error in case the bad command came from the
  master. this will enable to count or MONITOR these errors in the future.
- make it easier for tests to create additional (non deferred) clients.
- add tests for the fixes of this commit.

(cherry picked from commit fe8d6fe74920798c146a587810ee91ff049a9093)
2020-07-20 21:08:26 +03:00
Tomasz Poradowski
f13381da22 ensure SHUTDOWN_NOSAVE in Sentinel mode
- enforcing of SHUTDOWN_NOSAVE flag in one place to make it consitent
  when running in Sentinel mode

(cherry picked from commit 6782243054d7f8b5b861b5ae367db22b0f47a3b9)
2020-07-20 21:08:26 +03:00
antirez
00e400ed66 LPOS: implement the final design. 2020-06-12 12:08:06 +02:00
Paul Spooren
795c807fed LRANK: Add command (the command will be renamed LPOS).
The `LRANK` command returns the index (position) of a given element
within a list. Using the `direction` argument it is possible to specify
going from head to tail (acending, 1) or from tail to head (decending,
-1). Only the first found index is returend. The complexity is O(N).

When using lists as a queue it can be of interest at what position a
given element is, for instance to monitor a job processing through a
work queue. This came up within the Python `rq` project which is based
on Redis[0].

[0]: https://github.com/rq/rq/issues/1197

Signed-off-by: Paul Spooren <mail@aparcar.org>
2020-06-12 12:08:06 +02:00
Oran Agra
881d2c4663 Avoid rejecting WATCH / UNWATCH, like MULTI/EXEC/DISCARD
Much like MULTI/EXEC/DISCARD, the WATCH and UNWATCH are not actually
operating on the database or server state, but instead operate on the
client state. the client may send them all in one long pipeline and check
all the responses only at the end, so failing them may lead to a
mismatch between the client state on the server and the one on the
client end, and execute the wrong commands (ones that were meant to be
discarded)

the watched keys are not actually stored in the client struct, but they
are in fact part of the client state. for instance, they're not cleared
or moved in SWAPDB or FLUSHDB.
2020-06-09 11:53:01 +02:00
antirez
911c579b68 Remove the meaningful offset feature.
After a closer look, the Redis core devleopers all believe that this was
too fragile, caused many bugs that we didn't expect and that were very
hard to track. Better to find an alternative solution that is simpler.
2020-05-28 10:09:51 +02:00
antirez
9d7e2fef3d Track events processed while blocked globally.
Related to #7234.
2020-05-14 11:29:43 +02:00
antirez
e1b134e972 Some rework of #7234. 2020-05-14 11:29:43 +02:00
Oran Agra
a3dd04410d fix redis 6.0 not freeing closed connections during loading.
This bug was introduced by a recent change in which readQueryFromClient
is using freeClientAsync, and despite the fact that now
freeClientsInAsyncFreeQueue is in beforeSleep, that's not enough since
it's not called during loading in processEventsWhileBlocked.
furthermore, afterSleep was called in that case but beforeSleep wasn't.

This bug also caused slowness sine the level-triggered mode of epoll
kept signaling these connections as readable causing us to keep doing
connRead again and again for ll of these, which keep accumulating.

now both before and after sleep are called, but not all of their actions
are performed during loading, some are only reserved for the main loop.

fixes issue #7215
2020-05-14 11:29:43 +02:00
antirez
0b21426a45 Tracking: send eviction messages when evicting entries.
A fix for #7249.
2020-05-14 11:29:43 +02:00
antirez
24075946c7 Don't propagate spurious MULTI on DEBUG LOADAOF. 2020-05-08 10:37:36 +02:00
antirez
6b8fdaa8d6 Move CRC64 initialization in main(). 2020-05-08 10:37:35 +02:00
hwware
d45fd94b8a Client Side Caching: Add Tracking Prefix Number Stats in Server Info 2020-05-08 10:37:35 +02:00
zhenwei pi
1a632b6964 Support setcpuaffinity on linux/bsd
Currently, there are several types of threads/child processes of a
redis server. Sometimes we need deeply optimise the performance of
redis, so we would like to isolate threads/processes.

There were some discussion about cpu affinity cases in the issue:
https://github.com/antirez/redis/issues/2863

So implement cpu affinity setting by redis.conf in this patch, then
we can config server_cpulist/bio_cpulist/aof_rewrite_cpulist/
bgsave_cpulist by cpu list.

Examples of cpulist in redis.conf:
server_cpulist 0-7:2      means cpu affinity 0,2,4,6
bio_cpulist 1,3           means cpu affinity 1,3
aof_rewrite_cpulist 8-11  means cpu affinity 8,9,10,11
bgsave_cpulist 1,10-11    means cpu affinity 1,10,11

Test on linux/freebsd, both work fine.

Signed-off-by: zhenwei pi <pizhenwei@bytedance.com>
2020-05-08 10:37:35 +02:00
antirez
78b9c097c9 Comment clearly why we moved some code in #6623. 2020-04-30 13:02:58 +02:00
srzhao
e9811c3b12 fix pipelined WAIT performance issue.
If client gets blocked again in `processUnblockedClients`, redis will not send
`REPLCONF GETACK *` to slaves untill next eventloop, so the client will be
blocked for 100ms by default(10hz) if no other file event fired.

move server.get_ack_from_slaves sinppet after `processUnblockedClients`, so
that both the first WAIT command that puts client in blocked context and the
following WAIT command processed in processUnblockedClients would trigger
redis-sever to send `REPLCONF GETACK *`, so that the eventloop would get
`REPLCONG ACK <reploffset>` from slaves and unblocked ASAP.
2020-04-30 13:02:58 +02:00
Oran Agra
fb0a0c6451 hickup, re-fix dictEncObjKeyCompare
come to think of it, in theory (not in practice), getDecodedObject can
return the same original object with refcount incremented, so the
pointer comparision in the previous commit was invalid.
so now instead of checking the encoding, we explicitly check the
refcount.
2020-04-28 11:20:15 +02:00
antirez
ffbe6543ab Rework comment in dictEncObjKeyCompare(). 2020-04-28 11:20:15 +02:00
Oran Agra
d92f14e825 allow dictFind using static robj
since the recent addition of OBJ_STATIC_REFCOUNT and the assertion in
incrRefCount it is now impossible to use dictFind using a static robj,
because dictEncObjKeyCompare will call getDecodedObject which tries to
increment the refcount just in order to decrement it later.
2020-04-28 11:20:15 +02:00
Madelyn Olson
1652f7b897 Implemented CRC64 based on slice by 4 2020-04-28 11:20:15 +02:00
antirez
a0c54d5622 Fix STRALGO command flags. 2020-04-27 15:52:49 +02:00
Dave-in-lafayette
1713328723 fix for unintended crash during panic response
If redis crashes early, before lua is set up (like, if File Descriptor 0 is closed before exec), it will crash again trying to print memory statistics.
2020-04-27 15:52:49 +02:00
antirez
262da0ba78 LCS -> STRALGO LCS.
STRALGO should be a container for mostly read-only string
algorithms in Redis. The algorithms should have two main
characteristics:

1. They should be non trivial to compute, and often not part of
programming language standard libraries.
2. They should be fast enough that it is a good idea to have optimized C
implementations.

Next thing I would love to see? A small strings compression algorithm.
2020-04-24 16:49:27 +02:00
antirez
0b1b56b6fc Fix XCLAIM propagation in AOF/replicas for blocking XREADGROUP.
See issue #7105.
2020-04-17 12:40:54 +02:00
antirez
3b373fbb0b Speedup: unblock clients on keys in O(1).
See #7071.
2020-04-15 16:03:16 +02:00
antirez
fdd6da647d Speedup INFO by counting client memory incrementally.
Related to #5145.

Design note: clients may change type when they turn into replicas or are
moved into the Pub/Sub category and so forth. Moreover the recomputation
of the bytes used is problematic for obvious reasons: it changes
continuously, so as a conservative way to avoid accumulating errors,
each client remembers the contribution it gave to the sum, and removes
it when it is freed or before updating it with the new memory usage.
2020-04-07 16:53:13 +02:00
antirez
087ed099d1 LCS: initial functionality implemented. 2020-04-07 16:52:57 +02:00
srzhao
c5d805f877 Check OOM at script start to get stable lua OOM state.
Checking OOM by `getMaxMemoryState` inside script might get different result
with `freeMemoryIfNeededAndSafe` at script start, because lua stack and
arguments also consume memory.

This leads to memory `borderline` when memory grows near server.maxmemory:

- `freeMemoryIfNeededAndSafe` at script start detects no OOM, no memory freed
- `getMaxMemoryState` inside script detects OOM, script aborted

We solve this 'borderline' issue by saving OOM state at script start to get
stable lua OOM state.

related to issue #6565 and #5250.
2020-04-07 16:52:28 +02:00
Valentino Geron
dc38619afb XREAD and XREADGROUP should not be allowed from scripts when BLOCK option is being used 2020-04-07 16:52:04 +02:00
Guy Benoish
115fd1136e Stale replica should allow MULTI/EXEC
Example: Client uses a pipe to send the following to a
stale replica:

MULTI
.. do something ...
DISCARD

The replica will reply the MUTLI with -MASTERDOWN and
execute the rest of the commands... A client using a
pipe might not be aware that MULTI failed until it's
too late.

I can't think of a reason why MULTI/EXEC/DISCARD should
not be executed on stale replicas...

Also, enable MULTI/EXEC/DISCARD during loading
2020-04-07 16:52:04 +02:00
antirez
81bf978d8c cast raxSize() to avoid warning with format spec. 2020-03-31 17:41:23 +02:00
antirez
c202c54ff7 Minor changes to #7037. 2020-03-31 17:12:19 +02:00
Guy Benoish
5c23cd55d4 Modules: Test MULTI/EXEC replication of RM_Replicate
Makse sure call() doesn't wrap replicated commands with
a redundant MULTI/EXEC

Other, unrelated changes:
1. Formatting compiler warning in INFO CLIENTS
2. Use CLIENT_ID_AOF instead of UINT64_MAX
2020-03-31 17:12:19 +02:00
antirez
514ca204bb Fix module commands propagation double MULTI bug.
b512cb40 introduced automatic wrapping of MULTI/EXEC for the
alsoPropagate API. However this collides with the built-in mechanism
already present in module.c. To avoid complex changes near Redis 6 GA
this commit introduces the ability to exclude call() MUTLI/EXEC wrapping
for also propagate in order to continue to use the old code paths in
module.c.
2020-03-31 16:57:20 +02:00
antirez
2a820251c8 timeout.c created: move client timeouts code there. 2020-03-31 16:57:20 +02:00
antirez
c881ba1680 Precise timeouts: cleaup the table on unblock.
Now that this mechanism is the sole one used for blocked clients
timeouts, it is more wise to cleanup the table when the client unblocks
for any reason. We use a flag: CLIENT_IN_TO_TABLE, in order to avoid a
radix tree lookup when the client was already removed from the table
because we processed it by scanning the radix tree.
2020-03-31 16:56:46 +02:00
antirez
0e5723ff18 Precise timeouts: fix comments after functional change. 2020-03-31 16:56:46 +02:00
antirez
444089c55e Precise timeouts: use only radix tree for timeouts. 2020-03-31 16:56:46 +02:00
antirez
c49ff92431 Precise timeouts: fast exit for clientsHandleShortTimeout(). 2020-03-31 16:56:46 +02:00
antirez
e512707aed Precise timeouts: fix bugs in initial implementation. 2020-03-31 16:56:46 +02:00
antirez
cedeec01f2 Precise timeouts: working initial implementation. 2020-03-31 16:56:46 +02:00
antirez
b636856df0 Precise timeouts: refactor unblocking on timeout. 2020-03-31 16:56:46 +02:00
伯成
2a13f59a18 Boost up performance for redis PUB-SUB patterns matching
If lots of clients PSUBSCRIBE to same patterns, multiple pattens matching will take place. This commit change it into just one single pattern matching by using a `dict *` to store the unique pattern and which clients subscribe to it.
2020-03-31 12:47:14 +02:00
antirez
ce73158a9c PSYNC2: meaningful offset implemented.
A very commonly signaled operational problem with Redis master-replicas
sets is that, once the master becomes unavailable for some reason,
especially because of network problems, many times it wont be able to
perform a partial resynchronization with the new master, once it rejoins
the partition, for the following reason:

1. The master becomes isolated, however it keeps sending PINGs to the
replicas. Such PINGs will never be received since the link connection is
actually already severed.
2. On the other side, one of the replicas will turn into the new master,
setting its secondary replication ID offset to the one of the last
command received from the old master: this offset will not include the
PINGs sent by the master once the link was already disconnected.
3. When the master rejoins the partion and is turned into a replica, its
offset will be too advanced because of the PINGs, so a PSYNC will fail,
and a full synchronization will be required.

Related to issue #7002 and other discussion we had in the past around
this problem.
2020-03-25 15:55:24 +01:00
antirez
45ba72ad01 Explain why we allow transactions in -BUSY state.
Related to #7022.
2020-03-25 15:55:24 +01:00
Oran Agra
e1e2f91589 MULTI/EXEC during LUA script timeout are messed up
Redis refusing to run MULTI or EXEC during script timeout may cause partial
transactions to run.

1) if the client sends MULTI+commands+EXEC in pipeline without waiting for
response, but these arrive to the shards partially while there's a busy script,
and partially after it eventually finishes: we'll end up running only part of
the transaction (since multi was ignored, and exec would fail).

2) similar to the above if EXEC arrives during busy script, it'll be ignored and
the client state remains in a transaction.

the 3rd test which i added for a case where MULTI and EXEC are ok, and
only the body arrives during busy script was already handled correctly
since processCommand calls flagTransaction
2020-03-25 15:55:24 +01:00
antirez
6a1a5cb2a1 Abort transactions after -READONLY error. Fix #7014. 2020-03-25 15:55:24 +01:00
bodong.ybd
015d1cb2ff Added BITFIELD_RO variants for read-only operations. 2020-03-25 15:55:24 +01:00