/* A simple event-driven programming library. Originally I wrote this code
 * for Jim's event-loop (Jim is a Tcl interpreter) but later translated
 * it into a library for easy reuse.
 *
 * Copyright (c) 2006-2010, Redis Ltd.
 * All rights reserved.
 *
 * Redistribution and use in source and binary forms, with or without
 * modification, are permitted provided that the following conditions are met:
 *
 *   * Redistributions of source code must retain the above copyright notice,
 *     this list of conditions and the following disclaimer.
 *   * Redistributions in binary form must reproduce the above copyright
 *     notice, this list of conditions and the following disclaimer in the
 *     documentation and/or other materials provided with the distribution.
 *   * Neither the name of Redis nor the names of its contributors may be used
 *     to endorse or promote products derived from this software without
 *     specific prior written permission.
 *
 * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
 * AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
 * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
 * ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE
 * LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
 * CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
 * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
 * INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
 * CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
 * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
 * POSSIBILITY OF SUCH DAMAGE.
*/
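
/* A minimal usage sketch of the API in this file; `listen_fd` and the
 * `readable` callback are placeholder names, everything else is the public
 * ae API declared in ae.h:
 *
 *   void readable(aeEventLoop *el, int fd, void *clientData, int mask) {
 *       // consume whatever became readable on fd
 *   }
 *
 *   aeEventLoop *el = aeCreateEventLoop(1024);
 *   if (el == NULL) return;
 *   aeCreateFileEvent(el, listen_fd, AE_READABLE, readable, NULL);
 *   aeMain(el);              // blocks until aeStop(el) is called
 *   aeDeleteEventLoop(el);
 */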

#include "ae.h"
#include "anet.h"
#include "serverassert.h"

#include <stdio.h>
#include <sys/time.h>
#include <sys/types.h>
#include <unistd.h>
#include <stdlib.h>
#include <poll.h>
#include <string.h>
#include <time.h>
#include <errno.h>

#include "zmalloc.h"
#include "config.h"

/* Include the best multiplexing layer supported by this system.
 * The following should be ordered by performance, descending. */
#ifdef HAVE_EVPORT
#include "ae_evport.c"
#else
#ifdef HAVE_EPOLL
#include "ae_epoll.c"
#else
#ifdef HAVE_KQUEUE
#include "ae_kqueue.c"
#else
#include "ae_select.c"
#endif
#endif
#endif
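
/* Which macro is defined comes from config.h: typically HAVE_EPOLL on Linux,
 * HAVE_KQUEUE on the BSDs and macOS, HAVE_EVPORT on Solaris, and plain
 * select() as the portable fallback everywhere else. */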

#define AE_LOCK(eventLoop) \
    if ((eventLoop)->flags & AE_PROTECT_POLL) { \
        assert(pthread_mutex_lock(&(eventLoop)->poll_mutex) == 0); \
    }

#define AE_UNLOCK(eventLoop) \
    if ((eventLoop)->flags & AE_PROTECT_POLL) { \
        assert(pthread_mutex_unlock(&(eventLoop)->poll_mutex) == 0); \
}
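
/* These guards only do work when AE_PROTECT_POLL is set, i.e. when the
 * poll-wait has been offloaded to an I/O thread and the loop's custompoll
 * callback is used instead of aeApiPoll. Every function below that modifies
 * the registered events takes poll_mutex, so the event set cannot change
 * underneath the offloaded, non-blocking poll. A rough sketch of the
 * offloading side (illustrative only; the real job lives in the I/O thread
 * code, not in this file):
 *
 *   pthread_mutex_lock(&el->poll_mutex);
 *   // a non-blocking poll with a zero timeout populates el->fired[]
 *   pthread_mutex_unlock(&el->poll_mutex);
 *   // the main thread later collects the result through el->custompoll
 */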

aeEventLoop *aeCreateEventLoop(int setsize) {
    aeEventLoop *eventLoop;
    int i;

    monotonicInit(); /* just in case the calling app didn't initialize */

    if ((eventLoop = zmalloc(sizeof(*eventLoop))) == NULL) goto err;
    eventLoop->events = zmalloc(sizeof(aeFileEvent) * setsize);
    eventLoop->fired = zmalloc(sizeof(aeFiredEvent) * setsize);
    if (eventLoop->events == NULL || eventLoop->fired == NULL) goto err;
    eventLoop->setsize = setsize;
    eventLoop->timeEventHead = NULL;
    eventLoop->timeEventNextId = 0;
    eventLoop->stop = 0;
    eventLoop->maxfd = -1;
    eventLoop->beforesleep = NULL;
    eventLoop->aftersleep = NULL;
    eventLoop->custompoll = NULL;
    eventLoop->flags = 0;

    /* Initialize the eventloop mutex with PTHREAD_MUTEX_ERRORCHECK type */
    pthread_mutexattr_t attr;
    pthread_mutexattr_init(&attr);
    pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ERRORCHECK);
    if (pthread_mutex_init(&eventLoop->poll_mutex, &attr) != 0) goto err;

    if (aeApiCreate(eventLoop) == -1) goto err;
    /* Events with mask == AE_NONE are not set. So let's initialize the
     * vector with it. */
    for (i = 0; i < setsize; i++) eventLoop->events[i].mask = AE_NONE;
    return eventLoop;

err:
    if (eventLoop) {
        zfree(eventLoop->events);
        zfree(eventLoop->fired);
        zfree(eventLoop);
    }
    return NULL;
}
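
/* Sketch of the creation contract as implemented above: NULL is returned on
 * any allocation failure, on mutex-init failure or when aeApiCreate() fails,
 * so callers should check for it (setsize 1024 is just an example value):
 *
 *   aeEventLoop *el = aeCreateEventLoop(1024);
 *   if (el == NULL) {
 *       // out of memory or the multiplexing backend could not be set up
 *       exit(1);
 *   }
 */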

/* Return the current set size. */
int aeGetSetSize(aeEventLoop *eventLoop) {
    return eventLoop->setsize;
}

/*
 * Tell the event processing to change the wait timeout as soon as possible.
 *
 * Note: it just means you turn on/off the global AE_DONT_WAIT.
 */
void aeSetDontWait(aeEventLoop *eventLoop, int noWait) {
    if (noWait)
        eventLoop->flags |= AE_DONT_WAIT;
    else
        eventLoop->flags &= ~AE_DONT_WAIT;
}
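
/* Sketch: a caller that already has queued work and does not want the next
 * event-processing call to block on the poll can toggle the flag around it:
 *
 *   aeSetDontWait(el, 1);   // next wait uses a zero timeout
 *   ...
 *   aeSetDontWait(el, 0);   // restore normal blocking behaviour
 */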

/* Resize the maximum set size of the event loop.
 * If the requested set size is smaller than the current set size, but
 * there is already a file descriptor in use that is >= the requested
 * set size, AE_ERR is returned and the operation is not performed at all.
 *
 * Otherwise AE_OK is returned and the operation is successful. */
int aeResizeSetSize(aeEventLoop *eventLoop, int setsize) {
    AE_LOCK(eventLoop);
    int ret = AE_OK;
    int i;

    if (setsize == eventLoop->setsize) goto done;
    if (eventLoop->maxfd >= setsize) goto err;
    if (aeApiResize(eventLoop, setsize) == -1) goto err;

    eventLoop->events = zrealloc(eventLoop->events, sizeof(aeFileEvent) * setsize);
    eventLoop->fired = zrealloc(eventLoop->fired, sizeof(aeFiredEvent) * setsize);
    eventLoop->setsize = setsize;

    /* Make sure that if we created new slots, they are initialized with
     * an AE_NONE mask. */
    for (i = eventLoop->maxfd + 1; i < setsize; i++) eventLoop->events[i].mask = AE_NONE;
    goto done;

err:
    ret = AE_ERR;
done:
    AE_UNLOCK(eventLoop);
    return ret;
}
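
/* Sketch: shrinking can fail, so check the return value; the new size must
 * still be larger than the highest fd currently registered:
 *
 *   if (aeResizeSetSize(el, new_size) == AE_ERR) {
 *       // an fd >= new_size is still registered, or the backend resize failed
 *   }
 */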
|
|
|
|
|
2009-03-22 10:30:00 +01:00
|
|
|
void aeDeleteEventLoop(aeEventLoop *eventLoop) {
|
2009-11-23 18:50:39 +01:00
|
|
|
aeApiFree(eventLoop);
|
2011-12-16 09:55:01 +01:00
|
|
|
zfree(eventLoop->events);
|
|
|
|
zfree(eventLoop->fired);
|
2020-02-27 17:41:48 +01:00
|
|
|
|
|
|
|
/* Free the time events list. */
|
2019-12-31 19:53:00 +08:00
|
|
|
aeTimeEvent *next_te, *te = eventLoop->timeEventHead;
|
|
|
|
while (te) {
|
|
|
|
next_te = te->next;
|
2024-05-22 23:24:12 -07:00
|
|
|
if (te->finalizerProc) te->finalizerProc(eventLoop, te->clientData);
|
2019-12-31 19:53:00 +08:00
|
|
|
zfree(te);
|
|
|
|
te = next_te;
|
|
|
|
}
|
2009-03-22 10:30:00 +01:00
|
|
|
zfree(eventLoop);
|
|
|
|
}
|
|
|
|
|
|
|
|
void aeStop(aeEventLoop *eventLoop) {
|
|
|
|
eventLoop->stop = 1;
|
|
|
|
}
|
|
|
|
|
2024-05-22 23:24:12 -07:00
|
|
|
int aeCreateFileEvent(aeEventLoop *eventLoop, int fd, int mask, aeFileProc *proc, void *clientData) {
|
Io thread work offload (#763)
### IO-Threads Work Offloading
This PR is the 2nd of 3 PRs intended to achieve the goal of 1M requests
per second.
(1st PR: https://github.com/valkey-io/valkey/pull/758)
This PR offloads additional work to the I/O threads, beyond the current
read-parse/write operations, to better utilize the I/O threads and
reduce the load on the main thread.
It contains the following 3 commits:
### Poll Offload
Currently, the main thread is responsible for executing the poll-wait
system call, while the IO threads wait for tasks from the main thread.
The poll-wait operation is expensive and can consume up to 30% of the
main thread's time. We could have let the IO threads do the poll-wait by
themselves, with each thread listening to some of the clients and
notifying the main thread when a client's command is ready to execute.
However, the current approach, where the main thread listens for events
from the network, has several benefits. The main thread remains in
charge, allowing it to know the state of each client
(idle/read/write/close) at any given time. Additionally, it makes the
threads flexible, enabling us to drain an IO thread's job queue and stop
a thread when the load is light without modifying the event loop and
moving its clients to a different IO thread. Furthermore, with this
approach, the IO threads don't need to wait for both messages from the
network and from the main thread instead, the threads wait only for
tasks from the main thread.
To enjoy the benefits of both the main thread remaining in charge and
the poll being offloaded, we propose offloading the poll-wait as a
single-time, non-blocking job to one of the IO threads. The IO thread
will perform a poll-wait non-blocking call while the main thread
processes the client commands. Later, in `aeProcessEvents`, instead of
sleeping on the poll, we check for the IO thread's poll-wait results.
The poll-wait will be offloaded in `beforeSleep` only when there are
ready events for the main thread to process. If no events are pending,
the main thread will revert to the current behavior and sleep on the
poll by itself.
**Implementation Details**
A new call back `custompoll` was added to the `aeEventLoop` when not set
to `NULL` the ae will call the `custompoll` callback instead of the
`aeApiPoll`.
When the poll is offloaded we will set the `custompoll` to
`getIOThreadPollResults` and send a poll-job to the thread. the thread
will take a mutex, call a non-blocking (with timeout 0) to `aePoll`
which will populate the fired events array. the IO thread will set the
`server.io_fired_events` to the number of the returning `numevents`,
later the main-thread in `custompoll` will return the
`server.io_fired_events` and will set the `customPoll` back to `NULL`.
To ensure thread safety when accessing server.el, all functions that
modify the eventloop events were wrapped with a mutex to ensure mutual
exclusion when modifying the events.
### Command Lookup Offload
As the IO thread parses the command from the client's Querybuf, it can
perform a command lookup in the commands dictionary, which can consume
up to ~5% of the main-thread runtime.
**Implementation details**
The IO thread will store the looked-up command in the client's new field
`io_parsed_cmd` field. We can't use `c->cmd` for that since we use
`c->cmd `to check if a command was reprocessed or not.
To ensure thread safety when accessing the command dictionary, we make
sure the main thread isn't changing the dictionary while IO threads are
accessing it. This is accomplished by introducing a new flag called
`no_incremental_rehash` for the `dictType` commands. When performing
`dictResize`, we will rehash the entire dictionary in place rather than
deferring the process.
### Free Offload
Since the command arguments are allocated by the I/O thread, it would be
beneficial if they were also freed by the same thread. If the main
thread frees objects allocated by the I/O thread, two issues arise:
1. During the freeing process, the main thread needs to access the SDS
pointed to by the object to get its length.
2. With Jemalloc, each thread manages thread local pool (`tcache`) of
buffers for quick reallocation without accessing the arena. If the main
thread constantly frees objects allocated by other threads, those
threads will have to frequently access the shared arena to obtain new
memory allocations
**Implementation Details**
When freeing the client's argv, we will send the argv array to the
thread that allocated it. The thread will be identified by the client
ID. When freeing an object during `dbOverwrite`, we will offload the
object free as well. We will extend this to offload the free during
`dbDelete` in a future PR, as its effects on defrag/memory evictions
need to be studied.
---------
Signed-off-by: Uri Yagelnik <uriy@amazon.com>
2024-07-19 05:21:45 +03:00
|
|
|
AE_LOCK(eventLoop);
|
|
|
|
int ret = AE_ERR;
|
|
|
|
|
2013-01-03 14:18:03 +01:00
|
|
|
if (fd >= eventLoop->setsize) {
|
|
|
|
errno = ERANGE;
|
Io thread work offload (#763)
### IO-Threads Work Offloading
This PR is the 2nd of 3 PRs intended to achieve the goal of 1M requests
per second.
(1st PR: https://github.com/valkey-io/valkey/pull/758)
This PR offloads additional work to the I/O threads, beyond the current
read-parse/write operations, to better utilize the I/O threads and
reduce the load on the main thread.
It contains the following 3 commits:
### Poll Offload
Currently, the main thread is responsible for executing the poll-wait
system call, while the IO threads wait for tasks from the main thread.
The poll-wait operation is expensive and can consume up to 30% of the
main thread's time. We could have let the IO threads do the poll-wait by
themselves, with each thread listening to some of the clients and
notifying the main thread when a client's command is ready to execute.
However, the current approach, where the main thread listens for events
from the network, has several benefits. The main thread remains in
charge, allowing it to know the state of each client
(idle/read/write/close) at any given time. Additionally, it makes the
threads flexible, enabling us to drain an IO thread's job queue and stop
a thread when the load is light without modifying the event loop and
moving its clients to a different IO thread. Furthermore, with this
approach, the IO threads don't need to wait for both messages from the
network and from the main thread instead, the threads wait only for
tasks from the main thread.
To enjoy the benefits of both the main thread remaining in charge and
the poll being offloaded, we propose offloading the poll-wait as a
single-time, non-blocking job to one of the IO threads. The IO thread
will perform a poll-wait non-blocking call while the main thread
processes the client commands. Later, in `aeProcessEvents`, instead of
sleeping on the poll, we check for the IO thread's poll-wait results.
The poll-wait will be offloaded in `beforeSleep` only when there are
ready events for the main thread to process. If no events are pending,
the main thread will revert to the current behavior and sleep on the
poll by itself.
**Implementation Details**
A new call back `custompoll` was added to the `aeEventLoop` when not set
to `NULL` the ae will call the `custompoll` callback instead of the
`aeApiPoll`.
When the poll is offloaded we will set the `custompoll` to
`getIOThreadPollResults` and send a poll-job to the thread. the thread
will take a mutex, call a non-blocking (with timeout 0) to `aePoll`
which will populate the fired events array. the IO thread will set the
`server.io_fired_events` to the number of the returning `numevents`,
later the main-thread in `custompoll` will return the
`server.io_fired_events` and will set the `customPoll` back to `NULL`.
To ensure thread safety when accessing server.el, all functions that
modify the eventloop events were wrapped with a mutex to ensure mutual
exclusion when modifying the events.
### Command Lookup Offload
As the IO thread parses the command from the client's Querybuf, it can
perform a command lookup in the commands dictionary, which can consume
up to ~5% of the main-thread runtime.
**Implementation details**
The IO thread will store the looked-up command in the client's new field
`io_parsed_cmd` field. We can't use `c->cmd` for that since we use
`c->cmd `to check if a command was reprocessed or not.
To ensure thread safety when accessing the command dictionary, we make
sure the main thread isn't changing the dictionary while IO threads are
accessing it. This is accomplished by introducing a new flag called
`no_incremental_rehash` for the `dictType` commands. When performing
`dictResize`, we will rehash the entire dictionary in place rather than
deferring the process.
### Free Offload
Since the command arguments are allocated by the I/O thread, it would be
beneficial if they were also freed by the same thread. If the main
thread frees objects allocated by the I/O thread, two issues arise:
1. During the freeing process, the main thread needs to access the SDS
pointed to by the object to get its length.
2. With Jemalloc, each thread manages thread local pool (`tcache`) of
buffers for quick reallocation without accessing the arena. If the main
thread constantly frees objects allocated by other threads, those
threads will have to frequently access the shared arena to obtain new
memory allocations
**Implementation Details**
When freeing the client's argv, we will send the argv array to the
thread that allocated it. The thread will be identified by the client
ID. When freeing an object during `dbOverwrite`, we will offload the
object free as well. We will extend this to offload the free during
`dbDelete` in a future PR, as its effects on defrag/memory evictions
need to be studied.
---------
Signed-off-by: Uri Yagelnik <uriy@amazon.com>
2024-07-19 05:21:45 +03:00
|
|
|
goto done;
|
2013-01-03 14:18:03 +01:00
|
|
|
}
|
2009-11-23 18:50:39 +01:00
|
|
|
aeFileEvent *fe = &eventLoop->events[fd];
|
|
|
|
|
Io thread work offload (#763)
### IO-Threads Work Offloading
This PR is the 2nd of 3 PRs intended to achieve the goal of 1M requests
per second.
(1st PR: https://github.com/valkey-io/valkey/pull/758)
This PR offloads additional work to the I/O threads, beyond the current
read-parse/write operations, to better utilize the I/O threads and
reduce the load on the main thread.
It contains the following 3 commits:
### Poll Offload
Currently, the main thread is responsible for executing the poll-wait
system call, while the IO threads wait for tasks from the main thread.
The poll-wait operation is expensive and can consume up to 30% of the
main thread's time. We could have let the IO threads do the poll-wait by
themselves, with each thread listening to some of the clients and
notifying the main thread when a client's command is ready to execute.
However, the current approach, where the main thread listens for events
from the network, has several benefits. The main thread remains in
charge, allowing it to know the state of each client
(idle/read/write/close) at any given time. Additionally, it makes the
threads flexible, enabling us to drain an IO thread's job queue and stop
a thread when the load is light without modifying the event loop and
moving its clients to a different IO thread. Furthermore, with this
approach, the IO threads don't need to wait for both messages from the
network and from the main thread instead, the threads wait only for
tasks from the main thread.
To enjoy the benefits of both the main thread remaining in charge and
the poll being offloaded, we propose offloading the poll-wait as a
single-time, non-blocking job to one of the IO threads. The IO thread
will perform a poll-wait non-blocking call while the main thread
processes the client commands. Later, in `aeProcessEvents`, instead of
sleeping on the poll, we check for the IO thread's poll-wait results.
The poll-wait will be offloaded in `beforeSleep` only when there are
ready events for the main thread to process. If no events are pending,
the main thread will revert to the current behavior and sleep on the
poll by itself.
**Implementation Details**
A new call back `custompoll` was added to the `aeEventLoop` when not set
to `NULL` the ae will call the `custompoll` callback instead of the
`aeApiPoll`.
When the poll is offloaded we will set the `custompoll` to
`getIOThreadPollResults` and send a poll-job to the thread. the thread
will take a mutex, call a non-blocking (with timeout 0) to `aePoll`
which will populate the fired events array. the IO thread will set the
`server.io_fired_events` to the number of the returning `numevents`,
later the main-thread in `custompoll` will return the
`server.io_fired_events` and will set the `customPoll` back to `NULL`.
To ensure thread safety when accessing server.el, all functions that
modify the eventloop events were wrapped with a mutex to ensure mutual
exclusion when modifying the events.
### Command Lookup Offload
As the IO thread parses the command from the client's Querybuf, it can
perform a command lookup in the commands dictionary, which can consume
up to ~5% of the main-thread runtime.
**Implementation details**
The IO thread will store the looked-up command in the client's new field
`io_parsed_cmd` field. We can't use `c->cmd` for that since we use
`c->cmd `to check if a command was reprocessed or not.
To ensure thread safety when accessing the command dictionary, we make
sure the main thread isn't changing the dictionary while IO threads are
accessing it. This is accomplished by introducing a new flag called
`no_incremental_rehash` for the `dictType` commands. When performing
`dictResize`, we will rehash the entire dictionary in place rather than
deferring the process.
### Free Offload
Since the command arguments are allocated by the I/O thread, it would be
beneficial if they were also freed by the same thread. If the main
thread frees objects allocated by the I/O thread, two issues arise:
1. During the freeing process, the main thread needs to access the SDS
pointed to by the object to get its length.
2. With Jemalloc, each thread manages thread local pool (`tcache`) of
buffers for quick reallocation without accessing the arena. If the main
thread constantly frees objects allocated by other threads, those
threads will have to frequently access the shared arena to obtain new
memory allocations
**Implementation Details**
When freeing the client's argv, we will send the argv array to the
thread that allocated it. The thread will be identified by the client
ID. When freeing an object during `dbOverwrite`, we will offload the
object free as well. We will extend this to offload the free during
`dbDelete` in a future PR, as its effects on defrag/memory evictions
need to be studied.
---------
Signed-off-by: Uri Yagelnik <uriy@amazon.com>
2024-07-19 05:21:45 +03:00
|
|
|
if (aeApiAddEvent(eventLoop, fd, mask) == -1) goto done;
|
2009-11-23 18:50:39 +01:00
|
|
|
fe->mask |= mask;
|
|
|
|
if (mask & AE_READABLE) fe->rfileProc = proc;
|
|
|
|
if (mask & AE_WRITABLE) fe->wfileProc = proc;
|
2009-03-22 10:30:00 +01:00
|
|
|
fe->clientData = clientData;
|
2024-05-22 23:24:12 -07:00
|
|
|
if (fd > eventLoop->maxfd) eventLoop->maxfd = fd;
|
Io thread work offload (#763)
### IO-Threads Work Offloading
This PR is the 2nd of 3 PRs intended to achieve the goal of 1M requests
per second.
(1st PR: https://github.com/valkey-io/valkey/pull/758)
This PR offloads additional work to the I/O threads, beyond the current
read-parse/write operations, to better utilize the I/O threads and
reduce the load on the main thread.
It contains the following 3 commits:
### Poll Offload
Currently, the main thread is responsible for executing the poll-wait
system call, while the IO threads wait for tasks from the main thread.
The poll-wait operation is expensive and can consume up to 30% of the
main thread's time. We could have let the IO threads do the poll-wait by
themselves, with each thread listening to some of the clients and
notifying the main thread when a client's command is ready to execute.
However, the current approach, where the main thread listens for events
from the network, has several benefits. The main thread remains in
charge, allowing it to know the state of each client
(idle/read/write/close) at any given time. Additionally, it makes the
threads flexible, enabling us to drain an IO thread's job queue and stop
a thread when the load is light without modifying the event loop and
moving its clients to a different IO thread. Furthermore, with this
approach, the IO threads don't need to wait for both messages from the
network and from the main thread instead, the threads wait only for
tasks from the main thread.
To enjoy the benefits of both the main thread remaining in charge and
the poll being offloaded, we propose offloading the poll-wait as a
single-time, non-blocking job to one of the IO threads. The IO thread
will perform a poll-wait non-blocking call while the main thread
processes the client commands. Later, in `aeProcessEvents`, instead of
sleeping on the poll, we check for the IO thread's poll-wait results.
The poll-wait will be offloaded in `beforeSleep` only when there are
ready events for the main thread to process. If no events are pending,
the main thread will revert to the current behavior and sleep on the
poll by itself.
**Implementation Details**
A new call back `custompoll` was added to the `aeEventLoop` when not set
to `NULL` the ae will call the `custompoll` callback instead of the
`aeApiPoll`.
When the poll is offloaded we will set the `custompoll` to
`getIOThreadPollResults` and send a poll-job to the thread. the thread
will take a mutex, call a non-blocking (with timeout 0) to `aePoll`
which will populate the fired events array. the IO thread will set the
`server.io_fired_events` to the number of the returning `numevents`,
later the main-thread in `custompoll` will return the
`server.io_fired_events` and will set the `customPoll` back to `NULL`.
To ensure thread safety when accessing server.el, all functions that
modify the eventloop events were wrapped with a mutex to ensure mutual
exclusion when modifying the events.
### Command Lookup Offload
As the IO thread parses the command from the client's Querybuf, it can
perform a command lookup in the commands dictionary, which can consume
up to ~5% of the main-thread runtime.
**Implementation details**
The IO thread will store the looked-up command in the client's new field
`io_parsed_cmd` field. We can't use `c->cmd` for that since we use
`c->cmd `to check if a command was reprocessed or not.
To ensure thread safety when accessing the command dictionary, we make
sure the main thread isn't changing the dictionary while IO threads are
accessing it. This is accomplished by introducing a new flag called
`no_incremental_rehash` for the `dictType` commands. When performing
`dictResize`, we will rehash the entire dictionary in place rather than
deferring the process.
### Free Offload
Since the command arguments are allocated by the I/O thread, it would be
beneficial if they were also freed by the same thread. If the main
thread frees objects allocated by the I/O thread, two issues arise:
1. During the freeing process, the main thread needs to access the SDS
pointed to by the object to get its length.
2. With Jemalloc, each thread manages a thread-local pool (`tcache`) of
buffers for quick reallocation without accessing the arena. If the main
thread constantly frees objects allocated by other threads, those
threads will have to frequently access the shared arena to obtain new
memory allocations.
**Implementation Details**
When freeing the client's argv, we will send the argv array to the
thread that allocated it. The thread will be identified by the client
ID. When freeing an object during `dbOverwrite`, we will offload the
object free as well. We will extend this to offload the free during
`dbDelete` in a future PR, as its effects on defrag/memory evictions
need to be studied.
---------
Signed-off-by: Uri Yagelnik <uriy@amazon.com>
2024-07-19 05:21:45 +03:00
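Below is a minimal sketch of the poll-offload handshake described above. The `io_poll_mutex` and `io_fired_events` globals are hypothetical stand-ins for the server-level fields mentioned in the description, and the function bodies are an illustration only, not the actual Valkey implementation.
```c
#include <sys/time.h>
#include <pthread.h>
#include "ae.h"

static pthread_mutex_t io_poll_mutex = PTHREAD_MUTEX_INITIALIZER;
static int io_fired_events = 0; /* stand-in for server.io_fired_events */

/* Runs on an IO thread: a single non-blocking poll (timeout 0) that
 * populates the event loop's fired array while the main thread keeps
 * processing client commands. */
static void ioThreadPollJob(aeEventLoop *eventLoop) {
    struct timeval tv = {0, 0};
    pthread_mutex_lock(&io_poll_mutex);
    io_fired_events = aePoll(eventLoop, &tv);
    pthread_mutex_unlock(&io_poll_mutex);
}

/* Installed as eventLoop->custompoll: instead of sleeping in aeApiPoll,
 * the main thread just collects the IO thread's results. */
static int getIOThreadPollResults(aeEventLoop *eventLoop) {
    int numevents;
    pthread_mutex_lock(&io_poll_mutex);
    numevents = io_fired_events;
    pthread_mutex_unlock(&io_poll_mutex);
    eventLoop->custompoll = NULL; /* revert to aeApiPoll on the next cycle */
    return numevents;
}
```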
|
|
|
|
|
|
|
ret = AE_OK;
|
|
|
|
|
|
|
|
done:
|
|
|
|
AE_UNLOCK(eventLoop);
|
|
|
|
return ret;
|
2009-03-22 10:30:00 +01:00
|
|
|
}
|
|
|
|
|
2024-05-22 23:24:12 -07:00
|
|
|
void aeDeleteFileEvent(aeEventLoop *eventLoop, int fd, int mask) {
|
Io thread work offload (#763)
2024-07-19 05:21:45 +03:00
|
|
|
AE_LOCK(eventLoop);
|
|
|
|
if (fd >= eventLoop->setsize) goto done;
|
|
|
|
|
2009-11-23 18:50:39 +01:00
|
|
|
aeFileEvent *fe = &eventLoop->events[fd];
|
Io thread work offload (#763)
2024-07-19 05:21:45 +03:00
|
|
|
if (fe->mask == AE_NONE) goto done;
|
2013-12-06 22:27:00 -05:00
|
|
|
|
ae.c: introduce the concept of read->write barrier.
AOF fsync=always, and certain Redis Cluster bus operations, require
fsyncing data to disk before replying with an acknowledgement.
In such a case, in order to implement Group Commits, we want to be sure
that queries read in a given cycle of the event loop are never
served to clients in the same event loop iteration. This way, by using
the event loop "before sleep" callback, we can fsync the information
just one time before returning into the event loop for the next cycle.
This is much more efficient compared to calling fsync() multiple times.
Unfortunately, because of a bug, this was not always guaranteed: the
way the events happened to be installed was the only thing controlling
the ordering. Normally this problem is hard to trigger when AOF is enabled
with fsync=always, because we try to flush the output buffers to the
socket directly in the beforeSleep() function of Redis. However, if the
output buffers are full, we actually install a write event, and in such
a case this bug could happen.
This change to ae.c modifies the event loop implementation to make this
concept explicit. Write events that are registered with:
AE_WRITABLE|AE_BARRIER
are guaranteed to never fire after the readable event has fired for the
same file descriptor. In this way we are sure that data is persisted to
disk before the client performing the operation receives an
acknowledgement.
However, note that these semantics do not provide all the guarantees
that one may believe are automatically provided. Take the example of
blocking list operations in Redis.
With AOF and fsync=always we could have:
Client A doing: BLPOP myqueue 0
Client B doing: RPUSH myqueue a b c
In this scenario, Client A will get the "a" element immediately after
Client B's RPUSH is executed, even before the operation is persisted.
However, when Client B gets the acknowledgement, it can be sure that
"b,c" are already safe on disk inside the list.
What to note here is that Client A receiving the element is not a
guarantee that the operation succeeded from the point of view of
Client B.
This is due to the fact that the barrier exists within the same socket,
and not between different sockets. However, in the case above, the
element "a" was not going to be persisted regardless, so it is a pretty
synthetic argument.
2018-02-23 17:42:24 +01:00
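A minimal sketch of how an application registers such a barriered write handler; `writeReplyHandler` and `installBarrieredWriter` are hypothetical names, shown only to illustrate the AE_WRITABLE|AE_BARRIER usage described above.
```c
#include "ae.h"

/* Placeholder write handler: a real application would flush its pending
 * reply to `fd` here. */
static void writeReplyHandler(aeEventLoop *el, int fd, void *clientData, int mask) {
    (void)el; (void)fd; (void)clientData; (void)mask;
}

/* With AE_BARRIER the handler never fires after the readable event for
 * the same fd, so beforeSleep() gets a chance to fsync before the reply
 * is sent. */
static int installBarrieredWriter(aeEventLoop *el, int fd, void *clientData) {
    return aeCreateFileEvent(el, fd, AE_WRITABLE | AE_BARRIER,
                             writeReplyHandler, clientData);
}
```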
|
|
|
/* We want to always remove AE_BARRIER if set when AE_WRITABLE
|
|
|
|
* is removed. */
|
|
|
|
if (mask & AE_WRITABLE) mask |= AE_BARRIER;
|
|
|
|
|
2024-07-09 11:29:44 +08:00
|
|
|
/* Only remove attached events */
|
|
|
|
mask = mask & fe->mask;
|
|
|
|
|
2009-11-23 18:50:39 +01:00
|
|
|
fe->mask = fe->mask & (~mask);
|
|
|
|
if (fd == eventLoop->maxfd && fe->mask == AE_NONE) {
|
|
|
|
/* Update the max fd */
|
|
|
|
int j;
|
|
|
|
|
2024-05-22 23:24:12 -07:00
|
|
|
for (j = eventLoop->maxfd - 1; j >= 0; j--)
|
2009-11-23 18:50:39 +01:00
|
|
|
if (eventLoop->events[j].mask != AE_NONE) break;
|
|
|
|
eventLoop->maxfd = j;
|
2009-03-22 10:30:00 +01:00
|
|
|
}
|
2024-07-09 11:29:44 +08:00
|
|
|
|
|
|
|
/* Check whether there are events to be removed.
|
|
|
|
* Note: user may remove the AE_BARRIER without
|
|
|
|
* touching the actual events. */
|
|
|
|
if (mask & (AE_READABLE | AE_WRITABLE)) {
|
|
|
|
/* Must be invoked after the eventLoop mask is modified,
|
|
|
|
* which is required by evport and epoll */
|
|
|
|
aeApiDelEvent(eventLoop, fd, mask);
|
|
|
|
}
|
Io thread work offload (#763)
2024-07-19 05:21:45 +03:00
|
|
|
|
|
|
|
done:
|
|
|
|
AE_UNLOCK(eventLoop);
|
2009-03-22 10:30:00 +01:00
|
|
|
}
|
|
|
|
|
Add event loop support to the module API (#10001)
Modules can now register sockets/pipes with the Redis main thread event loop and do network operations asynchronously. Previously, modules had to maintain an event loop and another thread for asynchronous network operations.
Also, if a module called API functions after doing some network operations, it had to synchronize its event loop thread's access with the Redis main thread by locking the GIL, causing contention on the lock. After this commit, no synchronization is needed, as the module can operate in the Redis main thread context. So, this commit may improve performance for some use cases.
Added three functions to the module API:
* RedisModule_EventLoopAdd(int fd, int mask, RedisModuleEventLoopFunc func, void *user_data)
* RedisModule_EventLoopDel(int fd, int mask)
* RedisModule_EventLoopAddOneShot(RedisModuleEventLoopOneShotFunc func, void *user_data) - This function can be called from other threads to trigger a callback on the Redis main thread. The callback will be triggered only once. If the Redis main thread is sleeping, this call will wake it up.
Event loop callbacks are called by the Redis main thread after locking the GIL. Inside callbacks, modules can operate as if they are holding the GIL.
Added REDISMODULE_EVENT_EVENTLOOP event with two subevents:
* REDISMODULE_SUBEVENT_EVENTLOOP_BEFORE_SLEEP
* REDISMODULE_SUBEVENT_EVENTLOOP_AFTER_SLEEP
These events are for modules that want to participate in the before- and after-sleep actions. For example, it might be useful to implement batching: read data from the network, then write it all to a file in one go on the BEFORE_SLEEP event.
2022-01-18 14:10:07 +03:00
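A hedged sketch of module-side usage of the API above. The callback signature and the REDISMODULE_EVENTLOOP_READABLE constant are assumptions about redismodule.h (only the three entry points are named in the message), so treat this as an illustration rather than a verified example.
```c
#include "redismodule.h"

/* Assumed callback shape: invoked on the Redis main thread with the GIL
 * held, so module APIs can be called directly from here. */
static void onSocketReadable(int fd, void *user_data, int mask) {
    REDISMODULE_NOT_USED(fd);
    REDISMODULE_NOT_USED(user_data);
    REDISMODULE_NOT_USED(mask);
    /* Read from the module-owned socket and react to the data here. */
}

/* Register a module-owned fd with the main-thread event loop. */
static int watchModuleSocket(int fd) {
    return RedisModule_EventLoopAdd(fd, REDISMODULE_EVENTLOOP_READABLE,
                                    onSocketReadable, NULL);
}
```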
|
|
|
void *aeGetFileClientData(aeEventLoop *eventLoop, int fd) {
|
|
|
|
if (fd >= eventLoop->setsize) return NULL;
|
|
|
|
aeFileEvent *fe = &eventLoop->events[fd];
|
|
|
|
if (fe->mask == AE_NONE) return NULL;
|
|
|
|
|
|
|
|
return fe->clientData;
|
|
|
|
}
|
|
|
|
|
2011-11-21 16:05:29 +01:00
|
|
|
int aeGetFileEvents(aeEventLoop *eventLoop, int fd) {
|
2011-12-15 11:42:40 +01:00
|
|
|
if (fd >= eventLoop->setsize) return 0;
|
2011-11-21 16:05:29 +01:00
|
|
|
aeFileEvent *fe = &eventLoop->events[fd];
|
|
|
|
|
|
|
|
return fe->mask;
|
|
|
|
}
|
|
|
|
|
2024-05-22 23:24:12 -07:00
|
|
|
long long aeCreateTimeEvent(aeEventLoop *eventLoop,
|
|
|
|
long long milliseconds,
|
|
|
|
aeTimeProc *proc,
|
|
|
|
void *clientData,
|
|
|
|
aeEventFinalizerProc *finalizerProc) {
|
2009-03-22 10:30:00 +01:00
|
|
|
long long id = eventLoop->timeEventNextId++;
|
|
|
|
aeTimeEvent *te;
|
|
|
|
|
|
|
|
te = zmalloc(sizeof(*te));
|
|
|
|
if (te == NULL) return AE_ERR;
|
|
|
|
te->id = id;
|
2020-08-28 01:54:10 -07:00
|
|
|
te->when = getMonotonicUs() + milliseconds * 1000;
|
2009-03-22 10:30:00 +01:00
|
|
|
te->timeProc = proc;
|
|
|
|
te->finalizerProc = finalizerProc;
|
|
|
|
te->clientData = clientData;
|
Fix ae.c when a timer finalizerProc adds an event.
While this feature is not used by Redis, ae.c implements the ability for
a timer to call a finalizer callback when a timer event is deleted.
This feature was buggy since the start, and because it was never used
we never noticed a problem. However, Anthony LaTorre was using the same
library in order to implement a different system: he found a bug that he
describes as follows, and which he fixed with the patch in this commit,
sent me by private email:
--- Anthony email ---
I've found one bug in the current implementation of the timed events.
It's possible to lose track of a timed event if an event is added in
the finalizerProc of another event.
For example, suppose you start off with three timed events 1, 2, and
3. Then the linked list looks like:
3 -> 2 -> 1
Then, you run processTimeEvents and events 2 and 3 finish, so now the
list looks like:
-1 -> -1 -> 2
Now, on the next iteration of processTimeEvents it starts by deleting
the first event, and suppose this finalizerProc creates a new event,
so that the list looks like this:
4 -> -1 -> 2
On the next iteration of the while loop, when it gets to the second
event, the variable prev is still set to NULL, so that the head of the
event loop after the next event will be set to 2, i.e. after deleting
the next event the event loop will look like:
2
and the event with id 4 will be lost.
I've attached an example program to illustrate the issue. If you run
it you will see that it prints:
```
foo id = 0
spam!
```
But if you uncomment line 29 and run it again it won't print "spam!".
--- End of email ---
Test.c source code is as follows:
#include "ae.h"
#include <stdio.h>
aeEventLoop *el;
int foo(struct aeEventLoop *el, long long id, void *data)
{
    printf("foo id = %lld\n", id);
    return AE_NOMORE;
}
int spam(struct aeEventLoop *el, long long id, void *data)
{
    printf("spam!\n");
    return AE_NOMORE;
}
void bar(struct aeEventLoop *el, void *data)
{
    aeCreateTimeEvent(el, 0, spam, NULL, NULL);
}
int main(int argc, char **argv)
{
    el = aeCreateEventLoop(100);
    //aeCreateTimeEvent(el, 0, foo, NULL, NULL);
    aeCreateTimeEvent(el, 0, foo, NULL, bar);
    aeMain(el);
    return 0;
}
Anthony fixed the problem by using a linked list for the list of timers, and
sent me back this patch after he tested the code in production for some time.
The code looks sane to me, so committing it to Redis.
2018-03-28 14:06:08 +02:00
|
|
|
te->prev = NULL;
|
2009-03-22 10:30:00 +01:00
|
|
|
te->next = eventLoop->timeEventHead;
|
2020-05-14 08:37:24 -07:00
|
|
|
te->refcount = 0;
|
2024-05-22 23:24:12 -07:00
|
|
|
if (te->next) te->next->prev = te;
|
2009-03-22 10:30:00 +01:00
|
|
|
eventLoop->timeEventHead = te;
|
|
|
|
return id;
|
|
|
|
}
|
|
|
|
|
2024-05-22 23:24:12 -07:00
|
|
|
int aeDeleteTimeEvent(aeEventLoop *eventLoop, long long id) {
|
2016-01-08 15:05:14 +01:00
|
|
|
aeTimeEvent *te = eventLoop->timeEventHead;
|
2024-05-22 23:24:12 -07:00
|
|
|
while (te) {
|
2009-03-22 10:30:00 +01:00
|
|
|
if (te->id == id) {
|
2016-01-08 15:05:14 +01:00
|
|
|
te->id = AE_DELETED_EVENT_ID;
|
2009-03-22 10:30:00 +01:00
|
|
|
return AE_OK;
|
|
|
|
}
|
|
|
|
te = te->next;
|
|
|
|
}
|
|
|
|
return AE_ERR; /* NO event with the specified ID found */
|
|
|
|
}
|
|
|
|
|
Fix busy loop in ae.c when timer event is about to fire (#8764)
The code used to decide the next time to wake for a timer with
microsecond accuracy, but when deciding to go to sleep it used
millisecond accuracy (with truncation). This means it would wake
up too early, see that there's no timer to process, and go to sleep
again for 0ms, again and again, until the right microsecond arrived.
i.e. a timer for 100ms would sleep for 99ms, but then busy-loop
through the kernel in the last millisecond, triggering many calls to
beforeSleep.
The fix is to change all the logic in ae.c to work with microseconds,
which is good since most of the ae backends support micro (or even nano)
seconds. However, the epoll backend doesn't support microseconds, so to
avoid this problem it needs to round upwards rather than truncate.
Issue created by the monotonic timer PR #7644 (redis 6.2)
Before that, all the timers in ae.c were in milliseconds (using
mstime), so when it requested the backend to sleep till the next timer
event, it would have worked ok.
2021-04-13 07:35:03 +03:00
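A small sketch of the round-up conversion this fix calls for; the helper name is hypothetical and only illustrates turning ae's microsecond timeouts into the millisecond timeout epoll_wait() expects.
```c
#include <sys/time.h>

/* epoll_wait() takes a millisecond timeout, while ae tracks timers in
 * microseconds. Rounding up (instead of truncating) avoids waking just
 * before the timer fires and then busy-looping until the exact
 * microsecond arrives. */
static int timevalToEpollTimeout(const struct timeval *tvp) {
    if (tvp == NULL) return -1; /* NULL means wait forever */
    return (int)(tvp->tv_sec * 1000 + (tvp->tv_usec + 999) / 1000);
}
```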
|
|
|
/* How many microseconds until the first timer should fire.
|
2020-08-28 01:54:10 -07:00
|
|
|
* If there are no timers, -1 is returned.
|
2009-03-22 10:30:00 +01:00
|
|
|
*
|
2009-11-23 12:00:23 +01:00
|
|
|
* Note that's O(N) since time events are unsorted.
|
2024-04-09 01:24:03 -07:00
|
|
|
* Possible optimizations (not needed so far, but...):
|
2009-11-23 12:00:23 +01:00
|
|
|
* 1) Insert the event in order, so that the nearest is just the head.
|
|
|
|
* Much better but still insertion or deletion of timers is O(N).
|
|
|
|
* 2) Use a skiplist to have this operation as O(1) and insertion as O(log(N)).
|
|
|
|
*/
|
Fix busy loop in ae.c when timer event is about to fire (#8764)
2021-04-13 07:35:03 +03:00
|
|
|
static int64_t usUntilEarliestTimer(aeEventLoop *eventLoop) {
|
2009-03-22 10:30:00 +01:00
|
|
|
aeTimeEvent *te = eventLoop->timeEventHead;
|
2020-08-28 01:54:10 -07:00
|
|
|
if (te == NULL) return -1;
|
2009-03-22 10:30:00 +01:00
|
|
|
|
2020-08-28 01:54:10 -07:00
|
|
|
aeTimeEvent *earliest = NULL;
|
|
|
|
while (te) {
|
2024-05-22 23:24:12 -07:00
|
|
|
if ((!earliest || te->when < earliest->when) && te->id != AE_DELETED_EVENT_ID) earliest = te;
|
2009-03-22 10:30:00 +01:00
|
|
|
te = te->next;
|
|
|
|
}
|
2020-08-28 01:54:10 -07:00
|
|
|
|
|
|
|
monotime now = getMonotonicUs();
|
Fix busy loop in ae.c when timer event is about to fire (#8764)
2021-04-13 07:35:03 +03:00
|
|
|
return (now >= earliest->when) ? 0 : earliest->when - now;
|
2009-03-22 10:30:00 +01:00
|
|
|
}
|
|
|
|
|
2009-11-23 12:00:23 +01:00
|
|
|
/* Process time events */
|
|
|
|
static int processTimeEvents(aeEventLoop *eventLoop) {
|
|
|
|
int processed = 0;
|
Fix ae.c when a timer finalizerProc adds an event.
2018-03-28 14:06:08 +02:00
|
|
|
aeTimeEvent *te;
|
2009-11-23 12:00:23 +01:00
|
|
|
long long maxId;
|
|
|
|
|
|
|
|
te = eventLoop->timeEventHead;
|
2024-05-22 23:24:12 -07:00
|
|
|
maxId = eventLoop->timeEventNextId - 1;
|
2020-08-28 01:54:10 -07:00
|
|
|
monotime now = getMonotonicUs();
|
2024-05-22 23:24:12 -07:00
|
|
|
while (te) {
|
2009-11-23 12:00:23 +01:00
|
|
|
long long id;
|
|
|
|
|
2016-01-08 15:05:14 +01:00
|
|
|
/* Remove events scheduled for deletion. */
|
|
|
|
if (te->id == AE_DELETED_EVENT_ID) {
|
|
|
|
aeTimeEvent *next = te->next;
|
2020-05-14 08:37:24 -07:00
|
|
|
/* If a reference exists for this timer event,
|
|
|
|
* don't free it. This is currently incremented
|
|
|
|
* for recursive timerProc calls */
|
|
|
|
if (te->refcount) {
|
|
|
|
te = next;
|
|
|
|
continue;
|
|
|
|
}
|
Fix ae.c when a timer finalizerProc adds an event.
2018-03-28 14:06:08 +02:00
|
|
|
if (te->prev)
|
|
|
|
te->prev->next = te->next;
|
2016-01-08 15:05:14 +01:00
|
|
|
else
|
Fix ae.c when a timer finalizerProc adds an event.
2018-03-28 14:06:08 +02:00
|
|
|
eventLoop->timeEventHead = te->next;
|
2024-05-22 23:24:12 -07:00
|
|
|
if (te->next) te->next->prev = te->prev;
|
2020-08-28 01:54:10 -07:00
|
|
|
if (te->finalizerProc) {
|
2016-01-08 15:05:14 +01:00
|
|
|
te->finalizerProc(eventLoop, te->clientData);
|
2020-08-28 01:54:10 -07:00
|
|
|
now = getMonotonicUs();
|
|
|
|
}
|
2016-01-08 15:05:14 +01:00
|
|
|
zfree(te);
|
|
|
|
te = next;
|
|
|
|
continue;
|
|
|
|
}
|
|
|
|
|
2016-04-04 12:23:10 +02:00
|
|
|
/* Make sure we don't process time events created by time events in
|
|
|
|
* this iteration. Note that this check is currently useless: we always
|
|
|
|
* add new timers on the head, however if we change the implementation
|
|
|
|
* detail, this check may be useful again: we keep it here for future
|
|
|
|
* defense. */
|
2009-11-23 12:00:23 +01:00
|
|
|
if (te->id > maxId) {
|
|
|
|
te = te->next;
|
|
|
|
continue;
|
|
|
|
}
|
2020-08-28 01:54:10 -07:00
|
|
|
|
|
|
|
if (te->when <= now) {
|
2009-11-23 12:00:23 +01:00
|
|
|
int retval;
|
|
|
|
|
|
|
|
id = te->id;
|
2020-05-14 08:37:24 -07:00
|
|
|
te->refcount++;
|
2009-11-23 12:00:23 +01:00
|
|
|
retval = te->timeProc(eventLoop, id, te->clientData);
|
2020-05-14 08:37:24 -07:00
|
|
|
te->refcount--;
|
2009-11-23 12:00:23 +01:00
|
|
|
processed++;
|
2020-08-28 01:54:10 -07:00
|
|
|
now = getMonotonicUs();
|
2009-11-23 12:00:23 +01:00
|
|
|
if (retval != AE_NOMORE) {
|
2023-09-24 06:31:12 -04:00
|
|
|
te->when = now + (monotime)retval * 1000;
|
2009-11-23 12:00:23 +01:00
|
|
|
} else {
|
2016-01-08 15:05:14 +01:00
|
|
|
te->id = AE_DELETED_EVENT_ID;
|
2009-11-23 12:00:23 +01:00
|
|
|
}
|
|
|
|
}
|
2016-01-08 15:05:14 +01:00
|
|
|
te = te->next;
|
2009-11-23 12:00:23 +01:00
|
|
|
}
|
|
|
|
return processed;
|
|
|
|
}
|
|
|
|
|
Io thread work offload (#763)
2024-07-19 05:21:45 +03:00
|
|
|
/* This function provides direct access to the aeApiPoll call.
|
|
|
|
* It is intended to be called from a custom poll function. */
|
|
|
|
int aePoll(aeEventLoop *eventLoop, struct timeval *tvp) {
|
|
|
|
AE_LOCK(eventLoop);
|
|
|
|
|
|
|
|
int ret = aeApiPoll(eventLoop, tvp);
|
|
|
|
|
|
|
|
AE_UNLOCK(eventLoop);
|
|
|
|
return ret;
|
|
|
|
}
|
|
|
|
|
2023-12-26 10:58:14 +08:00
|
|
|
/* Process every pending file event, then every pending time event
|
|
|
|
* (that may be registered by file event callbacks just processed).
|
2009-03-22 10:30:00 +01:00
|
|
|
* Without special flags the function sleeps until some file event
|
2013-01-17 01:00:20 +08:00
|
|
|
* fires, or when the next time event occurs (if any).
|
2009-03-22 10:30:00 +01:00
|
|
|
*
|
|
|
|
* If flags is 0, the function does nothing and returns.
|
|
|
|
* if flags has AE_ALL_EVENTS set, all kinds of events are processed.
|
|
|
|
* if flags has AE_FILE_EVENTS set, file events are processed.
|
|
|
|
* if flags has AE_TIME_EVENTS set, time events are processed.
|
2021-06-29 19:37:02 +08:00
|
|
|
* if flags has AE_DONT_WAIT set, the function returns ASAP once all
|
|
|
|
* the events that can be handled without a wait are processed.
|
2019-01-06 15:01:25 +08:00
|
|
|
* if flags has AE_CALL_AFTER_SLEEP set, the aftersleep callback is called.
|
2020-05-12 13:07:44 +02:00
|
|
|
* if flags has AE_CALL_BEFORE_SLEEP set, the beforesleep callback is called.
|
2009-03-22 10:30:00 +01:00
|
|
|
*
|
|
|
|
* The function returns the number of events processed. */
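As a usage illustration only, here is a minimal driver loop in the spirit of aeMain(), calling aeProcessEvents with the flags documented above; `runEventLoop` is a hypothetical name.
```c
#include "ae.h"

/* Process file and time events until someone calls aeStop(), invoking the
 * beforesleep/aftersleep callbacks on every cycle. */
void runEventLoop(aeEventLoop *eventLoop) {
    eventLoop->stop = 0;
    while (!eventLoop->stop)
        aeProcessEvents(eventLoop, AE_ALL_EVENTS | AE_CALL_BEFORE_SLEEP | AE_CALL_AFTER_SLEEP);
}
```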
|
2024-05-22 23:24:12 -07:00
|
|
|
int aeProcessEvents(aeEventLoop *eventLoop, int flags) {
|
2009-11-23 18:50:39 +01:00
|
|
|
int processed = 0, numevents;
|
2009-03-22 10:30:00 +01:00
|
|
|
|
|
|
|
/* Nothing to do? return ASAP */
|
|
|
|
if (!(flags & AE_TIME_EVENTS) && !(flags & AE_FILE_EVENTS)) return 0;
|
|
|
|
|
2023-04-23 22:02:09 +08:00
|
|
|
/* Note that we want to call aeApiPoll() even if there are no
|
2009-03-22 10:30:00 +01:00
|
|
|
* file events to process as long as we want to process time
|
|
|
|
* events, in order to sleep until the next time event is ready
|
|
|
|
* to fire. */
|
2024-05-22 23:24:12 -07:00
|
|
|
if (eventLoop->maxfd != -1 || ((flags & AE_TIME_EVENTS) && !(flags & AE_DONT_WAIT))) {
|
2009-11-23 18:50:39 +01:00
|
|
|
int j;
|
2023-04-23 22:02:09 +08:00
|
|
|
struct timeval tv, *tvp = NULL; /* NULL means infinite wait. */
|
|
|
|
int64_t usUntilTimer;
|
2009-03-22 10:30:00 +01:00
|
|
|
|
2024-05-22 23:24:12 -07:00
|
|
|
if (eventLoop->beforesleep != NULL && (flags & AE_CALL_BEFORE_SLEEP)) eventLoop->beforesleep(eventLoop);
|
2009-03-22 10:30:00 +01:00
|
|
|
|
Io thread work offload (#763)
2024-07-19 05:21:45 +03:00
|
|
|
if (eventLoop->custompoll != NULL) {
|
|
|
|
numevents = eventLoop->custompoll(eventLoop);
|
|
|
|
} else {
|
|
|
|
/* The eventLoop->flags may be changed inside beforesleep.
|
|
|
|
* So we should check it after beforesleep is called. At the same time,
|
|
|
|
* the parameter flags always should have the highest priority.
|
|
|
|
* That is to say, once the parameter flag is set to AE_DONT_WAIT,
|
|
|
|
* no matter what value eventLoop->flags is set to, we should ignore it. */
|
|
|
|
if ((flags & AE_DONT_WAIT) || (eventLoop->flags & AE_DONT_WAIT)) {
|
|
|
|
tv.tv_sec = tv.tv_usec = 0;
|
2009-03-22 10:30:00 +01:00
|
|
|
tvp = &tv;
|
Io thread work offload (#763)
2024-07-19 05:21:45 +03:00
        } else if (flags & AE_TIME_EVENTS) {
            usUntilTimer = usUntilEarliestTimer(eventLoop);
            if (usUntilTimer >= 0) {
                tv.tv_sec = usUntilTimer / 1000000;
                tv.tv_usec = usUntilTimer % 1000000;
                tvp = &tv;
            }
        }
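
        /* Illustrative example (editor's sketch, not part of the original
         * code): with the split above, a nearest timer due in 2500000
         * microseconds yields tv.tv_sec == 2 and tv.tv_usec == 500000, so the
         * poll below sleeps at most 2.5 seconds. A caller that must never
         * block can request a zero timeout explicitly, for example:
         *
         *     aeProcessEvents(eventLoop, AE_ALL_EVENTS | AE_DONT_WAIT);
         *
         * which takes the first branch above regardless of eventLoop->flags. */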

            /* Call the multiplexing API; it will return only on a timeout or
             * when some event fires. */
            numevents = aeApiPoll(eventLoop, tvp);
        }
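
        /* Illustrative note (assumption based on the standard ae backends,
         * not part of the original file): aeApiPoll() forwards tvp to the
         * underlying epoll/kqueue/select/evport wrapper, so a NULL tvp blocks
         * until some event arrives, while a zeroed timeval (the AE_DONT_WAIT
         * path above) makes it return immediately, e.g.
         *
         *     struct timeval zero = {0, 0};
         *     int n = aeApiPoll(eventLoop, &zero);  // returns without sleeping
         */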

        /* Don't process file events if not requested. */
        if (!(flags & AE_FILE_EVENTS)) {
            numevents = 0;
        }
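
        /* Illustrative note (editor's sketch, not part of the original file):
         * this keeps the poll call usable as a pure sleep mechanism. A caller
         * interested only in timers, e.g.
         *
         *     aeProcessEvents(eventLoop, AE_TIME_EVENTS);
         *
         * still lets aeApiPoll() wait until the next timer deadline, but any
         * file events it happens to report are discarded here. */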

        /* After sleep callback. */
        if (eventLoop->aftersleep != NULL && flags & AE_CALL_AFTER_SLEEP) eventLoop->aftersleep(eventLoop, numevents);
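
        /* Illustrative note (assumption, not part of the original file): the
         * aftersleep hook is installed by the embedding application,
         * typically with something like
         *
         *     aeSetAfterSleepProc(eventLoop, afterSleep);
         *
         * (afterSleep being a placeholder name), and it only runs here when
         * the caller passed AE_CALL_AFTER_SLEEP, receiving the event count
         * returned by the poll step above. */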

        for (j = 0; j < numevents; j++) {
            int fd = eventLoop->fired[j].fd;
            aeFileEvent *fe = &eventLoop->events[fd];
            int mask = eventLoop->fired[j].mask;
            int fired = 0; /* Number of events fired for current fd. */
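
            /* Illustrative note (editor's sketch, not part of the original
             * file): fired[j] was filled in by the poll step and only records
             * an fd plus the mask of events that became ready; the handler
             * pointers and clientData live in events[fd], populated when the
             * application registered the descriptor, e.g.
             *
             *     aeCreateFileEvent(eventLoop, fd, AE_READABLE, readHandler, privdata);
             *
             * (readHandler and privdata are placeholder names). */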

            /* Normally we execute the readable event first, and the writable
             * event later. This is useful as sometimes we may be able
             * to serve the reply of a query immediately after processing the
             * query.
             *
             * However, if AE_BARRIER is set in the mask, our application is
             * asking us to do the reverse: never fire the writable event
             * after the readable. In such a case, we invert the calls.
             * This is useful when, for instance, we want to do things
Squashed commit of the following:
commit acb8773
Author: ctd1500 <ctd1500@gmail.com>
Date: Fri May 8 01:52:48 2015 -0700
Typo and grammar fixes in readme
commit 2eb75b6
Author: ctd1500 <ctd1500@gmail.com>
Date: Fri May 8 01:36:18 2015 -0700
fixed redis.conf comment
Squashed commit of the following:
commit a8249a2
Author: Masahiko Sawada <sawada.mshk@gmail.com>
Date: Fri Dec 11 11:39:52 2015 +0530
Revise correction of typos.
Squashed commit of the following:
commit 3c02028
Author: zhaojun11 <zhaojun11@jd.com>
Date: Wed Jan 17 19:05:28 2018 +0800
Fix typos include two code typos in cluster.c and latency.c
Squashed commit of the following:
commit 9dba47c
Author: q191201771 <191201771@qq.com>
Date: Sat Jan 4 11:31:04 2020 +0800
fix function listCreate comment in adlist.c
Update src/server.c
commit 2c7c2cb536e78dd211b1ac6f7bda00f0f54faaeb
Author: charpty <charpty@gmail.com>
Date: Tue May 1 23:16:59 2018 +0800
server.c typo: modules system dictionary type comment
Signed-off-by: charpty <charpty@gmail.com>
commit a8395323fb63cb59cb3591cb0f0c8edb7c29a680
Author: Itamar Haber <itamar@redislabs.com>
Date: Sun May 6 00:25:18 2018 +0300
Updates test_helper.tcl's help with undocumented options
Specifically:
* Host
* Port
* Client
commit bde6f9ced15755cd6407b4af7d601b030f36d60b
Author: wxisme <850885154@qq.com>
Date: Wed Aug 8 15:19:19 2018 +0800
fix comments in deps files
commit 3172474ba991532ab799ee1873439f3402412331
Author: wxisme <850885154@qq.com>
Date: Wed Aug 8 14:33:49 2018 +0800
fix some comments
commit 01b6f2b6858b5cf2ce4ad5092d2c746e755f53f0
Author: Thor Juhasz <thor@juhasz.pro>
Date: Sun Nov 18 14:37:41 2018 +0100
Minor fixes to comments
Found some parts a little unclear on a first read, which prompted me to have a better look at the file and fix some minor things I noticed.
Fixing minor typos and grammar. There are no changes to configuration options.
These changes are only meant to help the user better understand the explanations to the various configuration options.
2020-09-10 13:43:38 +03:00
|
|
|
* in the beforeSleep() hook, like fsyncing a file to disk,
|
2018-02-27 12:16:26 +01:00
|
|
|
* before replying to a client. */
|
|
|
|
int invert = fe->mask & AE_BARRIER;
|
|
|
|
|
2018-07-04 20:04:06 +08:00
|
|
|
/* Note the "fe->mask & mask & ..." code: maybe an already
|
2018-02-27 12:16:26 +01:00
|
|
|
* processed event removed an element that fired and we still
|
|
|
|
* have not processed, so we check if the event is still valid.
|
|
|
|
*
|
|
|
|
* Fire the readable event if the call sequence is not
|
|
|
|
* inverted. */
|
|
|
|
if (!invert && fe->mask & mask & AE_READABLE) {
|
2024-05-22 23:24:12 -07:00
|
|
|
fe->rfileProc(eventLoop, fd, fe->clientData, mask);
|
2018-02-27 12:16:26 +01:00
|
|
|
fired++;
|
2020-03-12 12:59:44 +01:00
|
|
|
fe = &eventLoop->events[fd]; /* Refresh in case of resize. */
|
2010-01-20 13:38:59 -05:00
|
|
|
}
|
2018-02-27 12:16:26 +01:00
|
|
|
|
|
|
|
/* Fire the writable event. */
|
2010-01-20 13:38:59 -05:00
|
|
|
if (fe->mask & mask & AE_WRITABLE) {
|
2018-02-27 12:16:26 +01:00
|
|
|
if (!fired || fe->wfileProc != fe->rfileProc) {
|
2024-05-22 23:24:12 -07:00
|
|
|
fe->wfileProc(eventLoop, fd, fe->clientData, mask);
|
2018-02-27 12:16:26 +01:00
|
|
|
fired++;
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
|
|
|
/* If we have to invert the call, fire the readable event now
|
|
|
|
* after the writable one. */
|
2020-03-12 12:59:44 +01:00
|
|
|
if (invert) {
|
|
|
|
fe = &eventLoop->events[fd]; /* Refresh in case of resize. */
|
2024-05-22 23:24:12 -07:00
|
|
|
if ((fe->mask & mask & AE_READABLE) && (!fired || fe->wfileProc != fe->rfileProc)) {
|
|
|
|
fe->rfileProc(eventLoop, fd, fe->clientData, mask);
|
2018-02-27 12:16:26 +01:00
|
|
|
fired++;
|
ae.c: introduce the concept of read->write barrier.
AOF fsync=always, and certain Redis Cluster bus operations, require
fsyncing data to disk before replying with an acknowledgment.
In such case, in order to implement Group Commits, we want to be sure
that queries that are read in a given cycle of the event loop, are never
served to clients in the same event loop iteration. This way, by using
the event loop "before sleep" callback, we can fsync the information
just one time before returning into the event loop for the next cycle.
This is much more efficient compared to calling fsync() multiple times.
Unfortunately, because of a bug, this was not always guaranteed: the
only thing controlling it was the actual way the events were installed.
Normally this problem is hard to trigger when AOF is enabled
with fsync=always, because we try to flush the output buffers to the
socket directly in the beforeSleep() function of Redis. However, if the
output buffers are full, we actually install a write event, and in such
a case, this bug could happen.
This change to ae.c modifies the event loop implementation to make this
concept explicit. Write events that are registered with:
AE_WRITABLE|AE_BARRIER
Are guaranteed to never fire after the readable event was fired for the
same file descriptor. In this way we are sure that data is persisted to
disk before the client performing the operation receives the
acknowledgment.
However, note that these semantics do not provide all the guarantees
that one may believe are automatically provided. Take the example of the
blocking list operations in Redis.
With AOF and fsync=always we could have:
Client A doing: BLPOP myqueue 0
Client B doing: RPUSH myqueue a b c
In this scenario, Client A will get the "a" element immediately after
Client B's RPUSH is executed, even before the operation is persisted.
However, when Client B gets the acknowledgment, it can be sure that
"b,c" are already safe on disk inside the list.
What to note here is that Client A receiving
the element is not a guarantee that the operation succeeded from the point
of view of Client B.
This is due to the fact that the barrier exists within the same socket,
and not between different sockets. However in the case above, the
element "a" was not going to be persisted regardless, so it is a pretty
synthetic argument.
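As an illustration of the registration described above, here is a minimal
sketch (not the actual Redis code; the handler and function names are made
up) of a caller installing a write handler with the barrier flag:

    #include "ae.h"

    /* Hypothetical handlers standing in for the real client read/write procs. */
    static void readHandler(aeEventLoop *el, int fd, void *clientData, int mask) {
        (void)el; (void)fd; (void)clientData; (void)mask; /* read the query, only queue the reply */
    }
    static void writeHandler(aeEventLoop *el, int fd, void *clientData, int mask) {
        (void)el; (void)fd; (void)clientData; (void)mask; /* reply: beforeSleep() already fsync()ed */
    }

    static void installClientHandlers(aeEventLoop *el, int fd, void *clientData) {
        aeCreateFileEvent(el, fd, AE_READABLE, readHandler, clientData);
        /* AE_BARRIER: never fire this writable handler after the readable
         * handler in the same event loop iteration. */
        aeCreateFileEvent(el, fd, AE_WRITABLE|AE_BARRIER, writeHandler, clientData);
    }

With this setup, a reply to a query read in a given iteration cannot be
written before the next iteration's beforeSleep() hook has run, which is
where the single fsync() can happen.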
2018-02-23 17:42:24 +01:00
|
|
|
}
|
2010-01-20 13:38:59 -05:00
|
|
|
}
|
2018-02-27 12:16:26 +01:00
|
|
|
|
2009-11-23 18:50:39 +01:00
|
|
|
processed++;
|
2009-03-22 10:30:00 +01:00
|
|
|
}
|
|
|
|
}
|
|
|
|
/* Check time events */
|
2024-05-22 23:24:12 -07:00
|
|
|
if (flags & AE_TIME_EVENTS) processed += processTimeEvents(eventLoop);
|
2009-03-22 10:30:00 +01:00
|
|
|
|
|
|
|
return processed; /* return the number of processed file/time events */
|
|
|
|
}
|
|
|
|
|
2013-01-17 01:00:20 +08:00
|
|
|
/* Wait for milliseconds until the given file descriptor becomes
|
2009-03-22 10:30:00 +01:00
|
|
|
* writable/readable/exception */
|
|
|
|
int aeWait(int fd, int mask, long long milliseconds) {
|
2012-01-06 18:56:07 +08:00
|
|
|
struct pollfd pfd;
|
2009-03-22 10:30:00 +01:00
|
|
|
int retmask = 0, retval;
|
|
|
|
|
2012-01-06 18:56:07 +08:00
|
|
|
memset(&pfd, 0, sizeof(pfd));
|
|
|
|
pfd.fd = fd;
|
|
|
|
if (mask & AE_READABLE) pfd.events |= POLLIN;
|
|
|
|
if (mask & AE_WRITABLE) pfd.events |= POLLOUT;
|
|
|
|
|
2024-05-22 23:24:12 -07:00
|
|
|
if ((retval = poll(&pfd, 1, milliseconds)) == 1) {
|
2012-01-06 18:56:07 +08:00
|
|
|
if (pfd.revents & POLLIN) retmask |= AE_READABLE;
|
|
|
|
if (pfd.revents & POLLOUT) retmask |= AE_WRITABLE;
|
2018-07-04 20:04:06 +08:00
|
|
|
if (pfd.revents & POLLERR) retmask |= AE_WRITABLE;
|
2012-05-23 17:19:49 +08:00
|
|
|
if (pfd.revents & POLLHUP) retmask |= AE_WRITABLE;
|
2009-03-22 10:30:00 +01:00
|
|
|
return retmask;
|
|
|
|
} else {
|
|
|
|
return retval;
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
2009-11-28 17:06:28 +01:00
|
|
|
void aeMain(aeEventLoop *eventLoop) {
|
2009-03-22 10:30:00 +01:00
|
|
|
eventLoop->stop = 0;
|
2010-01-28 10:12:04 -05:00
|
|
|
while (!eventLoop->stop) {
|
2024-05-22 23:24:12 -07:00
|
|
|
aeProcessEvents(eventLoop, AE_ALL_EVENTS | AE_CALL_BEFORE_SLEEP | AE_CALL_AFTER_SLEEP);
|
2010-01-28 10:12:04 -05:00
|
|
|
}
|
2009-03-22 10:30:00 +01:00
|
|
|
}
|
2009-11-28 17:06:28 +01:00
|
|
|
|
|
|
|
char *aeGetApiName(void) {
|
|
|
|
return aeApiName();
|
|
|
|
}
|
2010-01-28 10:12:04 -05:00
|
|
|
|
|
|
|
void aeSetBeforeSleepProc(aeEventLoop *eventLoop, aeBeforeSleepProc *beforesleep) {
|
|
|
|
eventLoop->beforesleep = beforesleep;
|
|
|
|
}
|
2017-05-03 11:26:21 +02:00
|
|
|
|
Async IO threads (#758)
This PR is 1 of 3 PRs intended to achieve the goal of 1 million requests
per second, as detailed by [dan touitou](https://github.com/touitou-dan)
in https://github.com/valkey-io/valkey/issues/22. This PR modifies the
IO threads to be fully asynchronous, which is a first and necessary step
to allow more work offloading and better utilization of the IO threads.
### Current IO threads state:
Valkey IO threads were introduced in Redis 6.0 to allow better
utilization of multi-core machines. Before this, Redis was
single-threaded and could only use one CPU core for network and command
processing. The introduction of IO threads helps in offloading the IO
operations to multiple threads.
**Current IO Threads flow:**
1. Initialization: When Redis starts, it initializes a specified number
of IO threads. These threads are in addition to the main thread, each
thread starts with an empty list, the main thread will populate that
list in each event-loop with pending-read-clients or
pending-write-clients.
2. Read Phase: The main thread accepts incoming connections and reads
requests from clients. The reading of requests is offloaded to IO
threads. The main thread puts the ready-to-read clients in a list and
sets the global io_threads_op to IO_THREADS_OP_READ; the IO threads pick
the clients up, perform the read operation, and parse the first incoming
command.
3. Command Processing: After reading the requests, command processing is
still single-threaded and handled by the main thread.
4. Write Phase: Similar to the read phase, the write phase is also
offloaded to IO threads. The main thread prepares the responses in the
clients’ output buffers, then puts the clients in a list
and sets the global io_threads_op to IO_THREADS_OP_WRITE. The IO
threads then pick the clients up and perform the write operation to send
the responses back to clients.
5. Synchronization: The main thread communicates with the IO threads about how
many jobs are left per thread through an atomic counter. The main thread
doesn’t access the clients while they are being handled by the IO threads.
**Issues with current implementation:**
* Underutilized Cores: The current implementation of IO-threads leads to
the underutilization of CPU cores.
* The main thread remains responsible for a significant portion of
IO-related tasks that could be offloaded to IO-threads.
* When the main thread is processing clients’ commands, the IO threads
are idle for a considerable amount of time.
* Notably, the main thread's performance during the IO-related tasks is
constrained by the speed of the slowest IO-thread.
* Limited Offloading: Currently, since the main thread waits
synchronously for the IO threads, the threads perform only read-parse
and write operations, with parsing done only for the first command. If
the threads can work asynchronously, we may offload more work to the
threads, reducing the load on the main thread.
* TLS: Currently, we don't support IO threads with TLS (where offloading
IO would be more beneficial) since TLS read/write operations are not
thread-safe with the current implementation.
### Suggested change
Non-blocking main thread - The main thread and IO threads will operate
in parallel to maximize efficiency. The main thread will not be blocked
by IO operations. It will continue to process commands independently of
the IO thread's activities.
**Implementation details**
**Inter-thread communication.**
* We use a static, lock-free ring buffer of fixed size (2048 jobs) for
the main thread to send jobs and for the IO threads to receive them. If the ring
buffer fills up, the main thread will handle the task itself, acting as
back pressure (in case IO operations are more expensive than command
processing). A static ring buffer is a better candidate than a dynamic
job queue as it eliminates the need for allocation/freeing per job.
* An IO job will be in the format ` [void* function-call-back | void
*data] `, where data is typically a client to read from or write to and the
function pointer is the function to be called with that data (for example
readQueryFromClient). Using this format we can later offload
other types of work to the IO threads as well (see the sketch after this list).
* The ring buffer is one-way, from the main thread to the IO thread. Upon a
read/write event the main thread will send a read/write job; then, in
beforeSleep, it will iterate over the pending read/write clients,
checking for each client whether the IO thread has already finished handling
it. The IO thread signals that it has finished handling a client read/write
by toggling an atomic flag, read_state / write_state, on the client
struct.
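As a rough illustration only (the struct layout, names, and memory-ordering
details here are assumptions based on the description above, not the actual
implementation), the job format and the back-pressure fallback could look
like this:

```c
#include <stdatomic.h>
#include <stddef.h>

/* An IO job: a callback plus the data it operates on (typically a client). */
typedef struct iojob {
    void (*callback)(void *data);
    void *data;
} iojob;

#define IO_RING_SIZE 2048 /* fixed size, as described above */

/* Single-producer (main thread) / single-consumer (IO thread) ring. */
typedef struct ioring {
    iojob jobs[IO_RING_SIZE];
    _Atomic size_t head; /* advanced by the main thread */
    _Atomic size_t tail; /* advanced by the IO thread */
} ioring;

/* Main thread: enqueue a job, or run it inline as back pressure when full. */
static void submitOrRunInline(ioring *ring, void (*cb)(void *), void *data) {
    size_t head = atomic_load_explicit(&ring->head, memory_order_relaxed);
    size_t tail = atomic_load_explicit(&ring->tail, memory_order_acquire);
    if (head - tail == IO_RING_SIZE) {
        cb(data); /* ring is full: the main thread handles the task itself */
        return;
    }
    ring->jobs[head % IO_RING_SIZE] = (iojob){cb, data};
    atomic_store_explicit(&ring->head, head + 1, memory_order_release);
}
```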
**Thread Safety**
As suggested in this solution, the IO threads are reading from and
writing to the clients' buffers while the main thread may access those
clients.
We must ensure no race conditions or unsafe access occurs while keeping
the Valkey code simple and lock free.
Minimal Action in the IO Threads
The main change is to limit the IO thread operations to the bare
minimum. The IO thread will access only the client's struct and only the
necessary fields in this struct.
The IO threads will be responsible for the following:
* Read Operation: The IO thread will only read and parse a single
command. It will not update the server stats, handle read errors, or
parsing errors. These tasks will be taken care of by the main thread.
* Write Operation: The IO thread will only write the available data. It
will not free the client's replies, handle write errors, or update the
server statistics.
To achieve this without code duplication, the read/write code has been
refactored into smaller, independent components:
* Functions that perform only the read/parse/write calls.
* Functions that handle the read/parse/write results.
This refactor accounts for the majority of the modifications in this PR.
**Client Struct Safe Access**
As we ensure that the IO threads access memory only within the client
struct, we need to ensure thread safety only for the client's struct's
shared fields.
* Query Buffer
* Command parsing - The main thread will not try to parse a command from
the query buffer when a client is offloaded to the IO thread.
* Client's memory checks in client-cron - The main thread will not
access the client query buffer if it is offloaded and will handle the
querybuf grow/shrink when the client is back.
* CLIENT LIST command - The main thread will busy-wait for the IO thread
to finish handling the client, falling back to the current behavior
where the main thread waits for the IO thread to finish its
processing.
* Output Buffer
* The IO thread will not change the client's bufpos and won't free the
client's reply lists. These actions will be done by the main thread on
the client's return from the IO thread.
* bufpos / block→used: As the main thread may change the bufpos, the
reply-block→used, or add/delete blocks to the reply list while the IO
thread writes, we add two fields to the client struct: io_last_bufpos
and io_last_reply_block. The IO thread will write up to
io_last_bufpos, which was set by the main thread before sending the
client to the IO thread. If more data has been added to the cob in
between, it will be written in the next write job. In addition, the main
thread will not trim or merge reply blocks while the client is
offloaded (see the field sketch after this list).
* Parsing Fields
* Client's cmd, argc, argv, reqtype, etc., are set during parsing.
* The main thread will indicate to the IO thread not to parse a cmd if
the client is not reset. In this case, the IO thread will only read from
the network and won't attempt to parse a new command.
* The main thread won't access c→cmd/c→argv in the CLIENT LIST
command; as stated before, it will busy-wait for the IO threads.
* Client Flags
* c→flags, which may be changed by the main thread in multiple places,
won't be accessed by the IO thread. Instead, the main thread will set
the c→io_flags with the information necessary for the IO thread to know
the client's state.
* Client Close
* On freeClient, the main thread will busy wait for the IO thread to
finish processing the client's read/write before proceeding to free the
client.
* Client's Memory Limits
* The IO thread won't handle the qb/cob limits. In case a client crosses
the qb limit, the IO thread will stop reading for it, letting the main
thread know that the client crossed the limit.
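Put together, a highly simplified view of the shared client fields described
above might look like this (only the fields named in this description;
everything else about the real client struct is omitted, and the types are
assumptions):

```c
#include <stdatomic.h>
#include <stddef.h>

/* Sketch of the fields the IO threads are allowed to touch; this is not
 * the real client struct. */
typedef struct clientIoView {
    /* Set by the main thread before offloading, read by the IO thread. */
    size_t io_last_bufpos;     /* the IO thread writes up to here, no further */
    void *io_last_reply_block; /* last reply block the IO thread may drain */
    int io_flags;              /* e.g. "client not reset: read only, don't parse" */

    /* Toggled by the IO thread when a read/write job is done; polled by the
     * main thread (busy-waited on for freeClient / CLIENT LIST). */
    _Atomic int read_state;
    _Atomic int write_state;
} clientIoView;
```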
**TLS**
TLS is currently not supported with IO threads for the following
reasons:
1. Pending reads - If SSL has pending data that has already been read
from the socket, there is a risk of not calling the read handler again.
To handle this, a list is used to hold the pending clients. With IO
threads, multiple threads can access the list concurrently.
2. Event loop modification - Currently, the TLS code
registers/unregisters the file descriptor from the event loop depending
on the read/write results. With IO threads, multiple threads can modify
the event loop struct simultaneously.
3. The same client can be sent to 2 different threads concurrently
(https://github.com/redis/redis/issues/12540).
Those issues were handled in the current PR:
1. The IO thread only performs the read operation. The main thread will
check for pending reads after the client returns from the IO thread and
will be the only one to access the pending list.
2. The registering/unregistering of events will be similarly postponed
and handled by the main thread only.
3. Each client is being sent to the same dedicated thread (c→id %
num_of_threads).
**Sending Replies Immediately with IO threads.**
Currently, after processing a command, we add the client to the
pending_writes_list. Only after processing all the clients do we send
all the replies. Since the IO threads are now working asynchronously, we
can send the reply immediately after processing the client’s requests,
reducing the command latency. However, if we are using AOF=always, we
must wait for the AOF buffer to be written, in which case we revert to
the current behavior.
**IO threads dynamic adjustment**
Currently, we use an all-or-nothing approach when activating the IO
threads. The current logic is as follows: if the number of pending write
clients is greater than twice the number of threads (including the main
thread), we enable all threads; otherwise, we enable none. For example,
if 8 IO threads are defined, we enable all 8 threads if there are 16
pending clients; else, we enable none.
It makes more sense to enable partial activation of the IO threads. If
we have 10 pending clients, we will enable 5 threads, and so on. This
approach allows for a more granular and efficient allocation of
resources based on the current workload.
In addition, the user will now be able to change the number of I/O
threads at runtime. For example, when decreasing the number of threads
from 4 to 2, threads 3 and 4 will be closed after flushing their job
queues.
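As a sketch of the proportional rule described above (the function name and
the exact rounding are assumptions):

```c
/* Enable roughly one IO thread per two pending write clients, capped at
 * the configured number of IO threads (e.g. 10 pending -> 5 threads). */
static int ioThreadsToActivate(int pending_write_clients, int configured_io_threads) {
    int wanted = pending_write_clients / 2;
    if (wanted > configured_io_threads) wanted = configured_io_threads;
    return wanted;
}
```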
**Tests**
Currently, we run the io-threads tests with 4 IO threads
(https://github.com/valkey-io/valkey/blob/443d80f1686377ad42cbf92d98ecc6d240325ee1/.github/workflows/daily.yml#L353).
This means that we will not activate the IO threads unless there are 8
(threads * 2) pending write clients per single loop, which is unlikely
to happen in most tests, meaning the IO threads are not currently
being tested.
To make the main thread always offload work to the IO threads,
regardless of the number of pending events, we add an
events-per-io-thread configuration with a default value of 2. When set
to 0, this configuration will force the main thread to always offload
work to the IO threads.
When we offload every single read/write operation to the IO threads, the
IO threads run at 100% CPU. When running multiple tests
concurrently, some tests fail as a result of larger-than-expected command
latencies. To address this issue, we have to add some after or wait_for
calls to some of the tests to ensure they pass with IO threads as well.
Signed-off-by: Uri Yagelnik <uriy@amazon.com>
2024-07-09 06:01:39 +03:00
|
|
|
void aeSetAfterSleepProc(aeEventLoop *eventLoop, aeAfterSleepProc *aftersleep) {
|
2017-05-03 11:26:21 +02:00
|
|
|
eventLoop->aftersleep = aftersleep;
|
|
|
|
}
|
Io thread work offload (#763)
### IO-Threads Work Offloading
This PR is the 2nd of 3 PRs intended to achieve the goal of 1M requests
per second.
(1st PR: https://github.com/valkey-io/valkey/pull/758)
This PR offloads additional work to the I/O threads, beyond the current
read-parse/write operations, to better utilize the I/O threads and
reduce the load on the main thread.
It contains the following 3 commits:
### Poll Offload
Currently, the main thread is responsible for executing the poll-wait
system call, while the IO threads wait for tasks from the main thread.
The poll-wait operation is expensive and can consume up to 30% of the
main thread's time. We could have let the IO threads do the poll-wait by
themselves, with each thread listening to some of the clients and
notifying the main thread when a client's command is ready to execute.
However, the current approach, where the main thread listens for events
from the network, has several benefits. The main thread remains in
charge, allowing it to know the state of each client
(idle/read/write/close) at any given time. Additionally, it makes the
threads flexible, enabling us to drain an IO thread's job queue and stop
a thread when the load is light without modifying the event loop and
moving its clients to a different IO thread. Furthermore, with this
approach, the IO threads don't need to wait for both messages from the
network and from the main thread; instead, the threads wait only for
tasks from the main thread.
To enjoy the benefits of both the main thread remaining in charge and
the poll being offloaded, we propose offloading the poll-wait as a
single-time, non-blocking job to one of the IO threads. The IO thread
will perform a poll-wait non-blocking call while the main thread
processes the client commands. Later, in `aeProcessEvents`, instead of
sleeping on the poll, we check for the IO thread's poll-wait results.
The poll-wait will be offloaded in `beforeSleep` only when there are
ready events for the main thread to process. If no events are pending,
the main thread will revert to the current behavior and sleep on the
poll by itself.
**Implementation Details**
A new callback, `custompoll`, was added to the `aeEventLoop`; when it is not set
to `NULL`, ae will call the `custompoll` callback instead of
`aeApiPoll`.
When the poll is offloaded, we will set `custompoll` to
`getIOThreadPollResults` and send a poll job to the thread. The thread
will take a mutex and make a non-blocking call (with timeout 0) to `aePoll`,
which will populate the fired-events array. The IO thread will set
`server.io_fired_events` to the returned `numevents`;
later, the main thread, in `custompoll`, will return
`server.io_fired_events` and set `custompoll` back to `NULL`.
To ensure thread safety when accessing server.el, all functions that
modify the eventloop events were wrapped with a mutex to ensure mutual
exclusion when modifying the events.
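Under those assumptions, the main-thread side might look roughly like the
sketch below; the atomic counter is a simplified stand-in for what the
description above calls `server.io_fired_events`, and the callback is assumed
to take the event loop and return the number of fired events, as the
description implies:

```c
#include <stdatomic.h>
#include "ae.h"

/* Stand-in for server.io_fired_events: written by the IO thread after it
 * runs a timeout-0 poll, read back by the main thread. */
static _Atomic int io_fired_events;

/* Custom poll callback: instead of sleeping in aeApiPoll, return the number
 * of events the IO thread already collected, then revert to the default. */
static int getIOThreadPollResults(aeEventLoop *eventLoop) {
    int numevents = atomic_load(&io_fired_events);
    aeSetCustomPollProc(eventLoop, NULL); /* one-shot, as described above */
    return numevents;
}
```

The main thread would install this with `aeSetCustomPollProc(el,
getIOThreadPollResults)` in `beforeSleep`, but only when there are already
events ready for it to process, as described above.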
### Command Lookup Offload
As the IO thread parses the command from the client's Querybuf, it can
perform a command lookup in the commands dictionary, which can consume
up to ~5% of the main-thread runtime.
**Implementation details**
The IO thread will store the looked-up command in the client's new
`io_parsed_cmd` field. We can't use `c->cmd` for that since we use
`c->cmd` to check if a command was reprocessed or not.
To ensure thread safety when accessing the command dictionary, we make
sure the main thread isn't changing the dictionary while IO threads are
accessing it. This is accomplished by introducing a new flag called
`no_incremental_rehash` for the `dictType` commands. When performing
`dictResize`, we will rehash the entire dictionary in place rather than
deferring the process.
### Free Offload
Since the command arguments are allocated by the I/O thread, it would be
beneficial if they were also freed by the same thread. If the main
thread frees objects allocated by the I/O thread, two issues arise:
1. During the freeing process, the main thread needs to access the SDS
pointed to by the object to get its length.
2. With Jemalloc, each thread manages a thread-local pool (`tcache`) of
buffers for quick reallocation without accessing the arena. If the main
thread constantly frees objects allocated by other threads, those
threads will have to frequently access the shared arena to obtain new
memory allocations.
**Implementation Details**
When freeing the client's argv, we will send the argv array to the
thread that allocated it. The thread will be identified by the client
ID. When freeing an object during `dbOverwrite`, we will offload the
object free as well. We will extend this to offload the free during
`dbDelete` in a future PR, as its effects on defrag/memory evictions
need to be studied.
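A minimal sketch of the dispatch rule (the `submitFreeJob` helper is
hypothetical; picking the thread by client ID follows the
`c→id % num_of_threads` rule mentioned in the TLS section above):

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stub: in the real system this would enqueue a job on IO
 * thread `tid`'s ring buffer; here it just frees the array in place. */
static void submitFreeJob(int tid, void **argv, int argc) {
    (void)tid;
    for (int i = 0; i < argc; i++) free(argv[i]);
    free(argv);
}

/* Free the client's argv on the IO thread that allocated it, selected by
 * client ID exactly like read/write jobs. */
static void offloadArgvFree(uint64_t client_id, void **argv, int argc, int num_io_threads) {
    int tid = (int)(client_id % (uint64_t)num_io_threads);
    submitFreeJob(tid, argv, argc);
}
```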
---------
Signed-off-by: Uri Yagelnik <uriy@amazon.com>
2024-07-19 05:21:45 +03:00
|
|
|
|
|
|
|
/* This function allows setting a custom poll procedure to be used by the event loop.
|
|
|
|
* The custom poll procedure, if set, will be called instead of the default aeApiPoll */
|
|
|
|
void aeSetCustomPollProc(aeEventLoop *eventLoop, aeCustomPollProc *custompoll) {
|
|
|
|
eventLoop->custompoll = custompoll;
|
|
|
|
}
|
|
|
|
|
|
|
|
void aeSetPollProtect(aeEventLoop *eventLoop, int protect) {
|
|
|
|
if (protect) {
|
|
|
|
eventLoop->flags |= AE_PROTECT_POLL;
|
|
|
|
} else {
|
|
|
|
eventLoop->flags &= ~AE_PROTECT_POLL;
|
|
|
|
}
|
|
|
|
}
|