futriix/src/tls.c
uriyage bbfd041895
Async IO threads (#758)
This PR is 1 of 3 PRs intended to achieve the goal of 1 million requests
per second, as detailed by [dan touitou](https://github.com/touitou-dan)
in https://github.com/valkey-io/valkey/issues/22. This PR modifies the
IO threads to be fully asynchronous, which is a first and necessary step
to allow more work offloading and better utilization of the IO threads.

### Current IO threads state:

Valkey IO threads were introduced in Redis 6.0 to allow better
utilization of multi-core machines. Before this, Redis was
single-threaded and could only use one CPU core for network and command
processing. The introduction of IO threads helps in offloading the IO
operations to multiple threads.

**Current IO Threads flow:**

1. Initialization: When Redis starts, it initializes a specified number
of IO threads in addition to the main thread. Each thread starts with an
empty list, which the main thread populates on every event-loop
iteration with pending-read or pending-write clients.
2. Read Phase: The main thread accepts incoming connections, while
reading the requests is offloaded to IO threads. The main thread puts
the ready-to-read clients in a list and sets the global io_threads_op to
IO_THREADS_OP_READ; the IO threads then pick the clients up, perform the
read operation, and parse the first incoming command.
3. Command Processing: After reading the requests, command processing is
still single-threaded and handled by the main thread.
4. Write Phase: Similar to the read phase, the write phase is also
offloaded to IO threads. The main thread prepares the response in the
client's output buffer, puts the client in the list, and sets the
global io_threads_op to IO_THREADS_OP_WRITE. The IO
threads then pick the clients up and perform the write operation to send
the responses back to clients.
5. Synchronization: The main thread uses an atomic counter to tell each
IO thread how many jobs it has left. The main thread does not access
the clients while they are being handled by the IO threads.

**Issues with current implementation:**

* Underutilized Cores: The current implementation of IO threads leads to
the underutilization of CPU cores.
    * The main thread remains responsible for a significant portion of
      IO-related tasks that could be offloaded to IO threads.
    * When the main thread is processing clients' commands, the IO
      threads are idle for a considerable amount of time.
    * Notably, the main thread's performance during the IO-related tasks
      is constrained by the speed of the slowest IO thread.
* Limited Offloading: Since the main thread waits synchronously for the
IO threads, the threads perform only read-parse and write operations,
with parsing done only for the first command. If the threads can work
asynchronously, we may offload more work to them, reducing the load on
the main thread.
* TLS: Currently, we don't support IO threads with TLS (where offloading
IO would be more beneficial) since TLS read/write operations are not
thread-safe with the current implementation.

### Suggested change

Non-blocking main thread - The main thread and IO threads will operate
in parallel to maximize efficiency. The main thread will not be blocked
by IO operations. It will continue to process commands independently of
the IO threads' activities.

**Implementation details**

**Inter-thread communication.**

* We use a static, lock-free ring buffer of fixed size (2048 jobs) for
the main thread to send jobs and for the IO threads to receive them. If
the ring buffer fills up, the main thread handles the task itself, which
acts as back pressure (in case IO operations are more expensive than
command processing). A static ring buffer is a better candidate than a
dynamic job queue as it eliminates the need for allocation/freeing per
job.
* An IO job has the format `[void *function-call-back | void *data]`,
where data is typically the client to read from or write to and the
function pointer is the function to be called with that data, for
example readQueryFromClient. This format lets us later offload other
types of work to the IO threads.
* The ring buffer is one-way, from the main thread to the IO thread.
Upon a read/write event the main thread sends a read/write job; then, in
beforeSleep, it iterates over the pending read/write clients, checking
for each client whether the IO thread has already finished handling it.
The IO thread signals that it has finished handling a client's
read/write by toggling an atomic flag, read_state / write_state, on the
client struct.
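
The queue described above can be pictured as a single-producer,
single-consumer ring buffer. The following is only a minimal sketch
using C11 atomics, not the code from this PR: the names `io_job`,
`job_queue`, `jq_push`, `jq_pop`, and `jq_count_job` are hypothetical,
and only the 2048-job capacity and the `[callback | data]` layout are
taken from the description.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

#define JQ_SIZE 2048 /* fixed capacity, as stated in the description */

typedef void (*job_handler)(void *data);

/* One job: a callback plus the data it operates on (hypothetical layout). */
typedef struct io_job {
    job_handler handler;
    void *data;
} io_job;

typedef struct job_queue {
    io_job ring[JQ_SIZE];
    _Atomic size_t head; /* next slot to fill; advanced only by the producer */
    _Atomic size_t tail; /* next slot to drain; advanced only by the consumer */
} job_queue;

/* Producer side (main thread). Returns 1 on success, 0 if the queue is
 * full, in which case the caller would run the job itself. */
static int jq_push(job_queue *q, job_handler handler, void *data) {
    size_t head = atomic_load_explicit(&q->head, memory_order_relaxed);
    size_t tail = atomic_load_explicit(&q->tail, memory_order_acquire);
    if (head - tail == JQ_SIZE) return 0; /* full */
    q->ring[head % JQ_SIZE] = (io_job){handler, data};
    /* Publish the slot only after it has been fully written. */
    atomic_store_explicit(&q->head, head + 1, memory_order_release);
    return 1;
}

/* Consumer side (IO thread). Returns 1 and fills *out, or 0 if empty. */
static int jq_pop(job_queue *q, io_job *out) {
    size_t tail = atomic_load_explicit(&q->tail, memory_order_relaxed);
    size_t head = atomic_load_explicit(&q->head, memory_order_acquire);
    if (head == tail) return 0; /* empty */
    *out = q->ring[tail % JQ_SIZE];
    /* Release the slot so the producer may reuse it. */
    atomic_store_explicit(&q->tail, tail + 1, memory_order_release);
    return 1;
}

/* Example job used for illustration: increments an int counter. */
static void jq_count_job(void *data) {
    assert(data != NULL);
    ++*(int *)data;
}
```

Because each index is written by exactly one thread, no locks or
compare-and-swap loops are needed; a failed `jq_push` is the point where
the main thread would fall back to doing the work itself.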

**Thread Safety**

As suggested in this solution, the IO threads are reading from and
writing to the clients' buffers while the main thread may access those
clients.
We must ensure no race conditions or unsafe access occurs while keeping
the Valkey code simple and lock free.

**Minimal Action in the IO Threads**

The main change is to limit the IO thread operations to the bare
minimum. The IO thread will access only the client's struct and only the
necessary fields in this struct.
The IO threads will be responsible for the following:

* Read Operation: The IO thread will only read and parse a single
command. It will not update the server stats or handle read or parsing
errors. These tasks will be taken care of by the main thread.
* Write Operation: The IO thread will only write the available data. It
will not free the client's replies, handle write errors, or update the
server statistics.


To achieve this without code duplication, the read/write code has been
refactored into smaller, independent components:

* Functions that perform only the read/parse/write calls.
* Functions that handle the read/parse/write results.

This refactor accounts for the majority of the modifications in this PR.

**Client Struct Safe Access**

As we ensure that the IO threads access memory only within the client
struct, we need to ensure thread safety only for the client's struct's
shared fields.

* Query Buffer
    * Command parsing - The main thread will not try to parse a command
      from the query buffer when a client is offloaded to the IO thread.
    * Client's memory checks in client-cron - The main thread will not
      access the client query buffer if it is offloaded and will handle
      the querybuf grow/shrink when the client is back.
    * CLIENT LIST command - The main thread will busy-wait for the IO
      thread to finish handling the client, falling back to the current
      behavior where the main thread waits for the IO threads to finish
      their processing.
* Output Buffer
    * The IO thread will not change the client's bufpos and won't free
      the client's reply lists. These actions will be done by the main
      thread on the client's return from the IO thread.
    * bufpos / block->used: As the main thread may change the bufpos or
      the reply block's used field, or add/delete blocks in the reply
      list while the IO thread writes, we add two fields to the client
      struct: io_last_bufpos and io_last_reply_block. The IO thread will
      write only up to io_last_bufpos, which is set by the main thread
      before sending the client to the IO thread. If more data has been
      added to the client output buffer (cob) in between, it will be
      written in the next write job. In addition, the main thread will
      not trim or merge reply blocks while the client is offloaded.
* Parsing Fields
    * Client's cmd, argc, argv, reqtype, etc., are set during parsing.
    * The main thread will indicate to the IO thread not to parse a cmd
      if the client is not reset. In this case, the IO thread will only
      read from the network and won't attempt to parse a new command.
    * The main thread won't access c->cmd / c->argv in the CLIENT LIST
      command; as stated before, it will busy-wait for the IO threads.
* Client Flags
    * c->flags, which may be changed by the main thread in multiple
      places, won't be accessed by the IO thread. Instead, the main
      thread will set c->io_flags with the information necessary for the
      IO thread to know the client's state.
* Client Close
    * On freeClient, the main thread will busy-wait for the IO thread to
      finish processing the client's read/write before proceeding to
      free the client.
* Client's Memory Limits
    * The IO thread won't handle the qb/cob limits. If a client crosses
      the query buffer (qb) limit, the IO thread will stop reading for
      it, letting the main thread know that the client crossed the
      limit.
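
The output-buffer rule above (write only up to a position frozen by the
main thread) can be sketched as follows. This is an illustration under
simplifying assumptions, not the PR's code: `sketch_client`,
`add_reply`, `prepare_write_job`, `io_write`, and the `sentlen`
bookkeeping are hypothetical, the reply list is omitted, and a memory
buffer stands in for the socket. Only the names `bufpos` and
`io_last_bufpos` come from the description.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef struct sketch_client {
    char buf[64];          /* static reply buffer */
    size_t bufpos;         /* bytes appended so far; owned by the main thread */
    size_t io_last_bufpos; /* write limit frozen when the job is created */
    size_t sentlen;        /* bytes already written out (hypothetical field) */
} sketch_client;

/* Main thread: append a reply. This may happen while a write job is in
 * flight, because the IO thread never looks at bufpos. */
static void add_reply(sketch_client *c, const char *s) {
    size_t len = strlen(s);
    assert(c->bufpos + len <= sizeof(c->buf));
    memcpy(c->buf + c->bufpos, s, len);
    c->bufpos += len;
}

/* Main thread: freeze the write limit just before offloading the client. */
static void prepare_write_job(sketch_client *c) {
    c->io_last_bufpos = c->bufpos;
}

/* IO thread: write only the bytes below the frozen limit.
 * Returns the number of bytes copied into out. */
static size_t io_write(sketch_client *c, char *out, size_t outlen) {
    size_t n = c->io_last_bufpos - c->sentlen;
    if (n > outlen) n = outlen;
    memcpy(out, c->buf + c->sentlen, n);
    c->sentlen += n;
    return n;
}
```

Data appended after `prepare_write_job` is simply left for the next
write job, which is the behavior the bullet describes.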

**TLS**

TLS is currently not supported with IO threads for the following
reasons:

1. Pending reads - If SSL has pending data that has already been read
from the socket, there is a risk of not calling the read handler again.
To handle this, a list is used to hold the pending clients. With IO
threads, multiple threads can access the list concurrently.
2. Event loop modification - Currently, the TLS code
registers/unregisters the file descriptor from the event loop depending
on the read/write results. With IO threads, multiple threads can modify
the event loop struct simultaneously.
3. The same client can be sent to 2 different threads concurrently
(https://github.com/redis/redis/issues/12540).

Those issues were handled in the current PR:

1. The IO thread only performs the read operation. The main thread will
check for pending reads after the client returns from the IO thread and
will be the only one to access the pending list.
2. The registering/unregistering of events will be similarly postponed
and handled by the main thread only.
3. Each client is always sent to the same dedicated thread (c->id %
num_of_threads).
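
The three fixes above can be illustrated with a small model. It mirrors
the postpone flag that appears in tls.c
(TLS_CONN_FLAG_POSTPONE_UPDATE_STATE), but everything named `sk_*` is
hypothetical, and plain integers stand in for SSL_pending() and for
membership in the shared pending list.

```c
#include <assert.h>
#include <stdint.h>

#define SK_FLAG_POSTPONE_UPDATE (1 << 0)

typedef struct sk_conn {
    int flags;
    int pending_bytes;   /* stand-in for SSL_pending() */
    int in_pending_list; /* stand-in for membership in the shared list */
} sk_conn;

/* Touches shared state, so it is a no-op while the update is postponed. */
static void sk_update_pending(sk_conn *conn) {
    if (conn->flags & SK_FLAG_POSTPONE_UPDATE) return;
    conn->in_pending_list = conn->pending_bytes > 0;
}

/* IO thread: performs the read only; shared state is left untouched. */
static void sk_io_thread_read(sk_conn *conn, int leftover_bytes) {
    conn->flags |= SK_FLAG_POSTPONE_UPDATE;
    conn->pending_bytes = leftover_bytes;
    sk_update_pending(conn); /* returns early because of the flag */
}

/* Main thread: runs the deferred update once the client is back. */
static void sk_main_thread_resume(sk_conn *conn) {
    conn->flags &= ~SK_FLAG_POSTPONE_UPDATE;
    sk_update_pending(conn);
}

/* A client id always maps to the same IO thread, so one client is never
 * handled by two threads at once. */
static int sk_thread_for_client(uint64_t client_id, int num_threads) {
    assert(num_threads > 0);
    return (int)(client_id % (uint64_t)num_threads);
}
```

The same early-return pattern would cover event registration: the IO
thread records what it needs, and only the main thread modifies the
event loop.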


**Sending Replies Immediately with IO threads.**

Currently, after processing a command, we add the client to the
pending_writes_list. Only after processing all the clients do we send
all the replies. Since the IO threads are now working asynchronously, we
can send the reply immediately after processing the client’s requests,
reducing the command latency. However, if we are using AOF=always, we
must wait for the AOF buffer to be written, in which case we revert to
the current behavior.
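
As a rough sketch of that decision, assuming "AOF=always" refers to an
always-fsync AOF policy: the enum and function below are hypothetical
names used only for illustration.

```c
#include <assert.h>

typedef enum {
    REPLY_SEND_NOW,             /* hand the write to an IO thread right away */
    REPLY_DEFER_TO_PENDING_LIST /* previous behavior: flush after all clients */
} reply_action;

/* Decide what to do with a reply once a command has been processed. */
static reply_action choose_reply_action(int io_threads_active, int aof_always) {
    assert(io_threads_active == 0 || io_threads_active == 1);
    /* With an always-fsync AOF, the reply must wait for the AOF write. */
    if (io_threads_active && !aof_always) return REPLY_SEND_NOW;
    return REPLY_DEFER_TO_PENDING_LIST;
}
```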

**IO threads dynamic adjustment**

Currently, we use an all-or-nothing approach when activating the IO
threads. The current logic is as follows: if the number of pending write
clients is at least twice the number of threads (including the main
thread), we enable all threads; otherwise, we enable none. For example,
if 8 IO threads are defined, we enable all 8 threads if there are at
least 16 pending clients; otherwise, we enable none.
It makes more sense to enable partial activation of the IO threads. If
we have 10 pending clients, we will enable 5 threads, and so on. This
approach allows for a more granular and efficient allocation of
resources based on the current workload.
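
The difference between the two policies can be sketched with two small
functions. This is an illustration, not the PR's code: the function
names are hypothetical, the old threshold is treated as inclusive (which
matches the 16-client example), and the divisor of 2 events per thread
is inferred from the "10 pending clients, 5 threads" example.

```c
#include <assert.h>

/* Old policy: all threads or none. */
static int old_active_threads(int pending, int num_threads) {
    return pending >= 2 * num_threads ? num_threads : 0;
}

/* New policy: one thread per events_per_thread pending clients,
 * capped at the configured number of threads. */
static int new_active_threads(int pending, int num_threads, int events_per_thread) {
    assert(events_per_thread > 0);
    int wanted = pending / events_per_thread;
    return wanted < num_threads ? wanted : num_threads;
}
```

With 8 configured threads and 10 pending clients, the old policy
activates no threads while the new one activates 5.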

In addition, the user will now be able to change the number of I/O
threads at runtime. For example, when decreasing the number of threads
from 4 to 2, threads 3 and 4 will be closed after flushing their job
queues.

**Tests**

Currently, we run the io-threads tests with 4 IO threads
(443d80f168/.github/workflows/daily.yml (L353)).
This means that we will not activate the IO threads unless there are 8
(threads * 2) pending write clients per single loop, which is unlikely
to happen in most tests, meaning the IO threads are not currently
being tested.

To force the main thread to always offload work to the IO threads,
regardless of the number of pending events, we add an
events-per-io-thread configuration with a default value of 2. When set
to 0, this configuration will force the main thread to always offload
work to the IO threads.

When we offload every single read/write operation to the IO threads, the
IO threads run at 100% CPU. When multiple tests run concurrently, some
tests fail as a result of larger-than-expected command latencies. To
address this issue, we add some after or wait_for calls to some of the
tests to ensure they pass with IO threads as well.

Signed-off-by: Uri Yagelnik <uriy@amazon.com>
2024-07-08 20:01:39 -07:00


/*
* Copyright (c) 2019, Redis Labs
* All rights reserved.
*
* Redistribution and use in source and binary forms, with or without
* modification, are permitted provided that the following conditions are met:
*
* * Redistributions of source code must retain the above copyright notice,
* this list of conditions and the following disclaimer.
* * Redistributions in binary form must reproduce the above copyright
* notice, this list of conditions and the following disclaimer in the
* documentation and/or other materials provided with the distribution.
* * Neither the name of Redis nor the names of its contributors may be used
* to endorse or promote products derived from this software without
* specific prior written permission.
*
* THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
* AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
* IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
* ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE
* LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
* CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
* SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
* INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
* CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
* ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
* POSSIBILITY OF SUCH DAMAGE.
*/
#define VALKEYMODULE_CORE_MODULE /* A module that's part of the server core, uses server.h too. */
#include "server.h"
#include "connhelpers.h"
#include "adlist.h"
#if (USE_OPENSSL == 1 /* BUILD_YES */) || ((USE_OPENSSL == 2 /* BUILD_MODULE */) && (BUILD_TLS_MODULE == 2))
#include <openssl/conf.h>
#include <openssl/ssl.h>
#include <openssl/err.h>
#include <openssl/rand.h>
#include <openssl/pem.h>
#if OPENSSL_VERSION_NUMBER >= 0x30000000L
#include <openssl/decoder.h>
#endif
#include <sys/uio.h>
#include <arpa/inet.h>
#define REDIS_TLS_PROTO_TLSv1 (1 << 0)
#define REDIS_TLS_PROTO_TLSv1_1 (1 << 1)
#define REDIS_TLS_PROTO_TLSv1_2 (1 << 2)
#define REDIS_TLS_PROTO_TLSv1_3 (1 << 3)
/* Use safe defaults */
#ifdef TLS1_3_VERSION
#define REDIS_TLS_PROTO_DEFAULT (REDIS_TLS_PROTO_TLSv1_2 | REDIS_TLS_PROTO_TLSv1_3)
#else
#define REDIS_TLS_PROTO_DEFAULT (REDIS_TLS_PROTO_TLSv1_2)
#endif
SSL_CTX *valkey_tls_ctx = NULL;
SSL_CTX *valkey_tls_client_ctx = NULL;
static int parseProtocolsConfig(const char *str) {
int i, count = 0;
int protocols = 0;
if (!str) return REDIS_TLS_PROTO_DEFAULT;
sds *tokens = sdssplitlen(str, strlen(str), " ", 1, &count);
if (!tokens) {
serverLog(LL_WARNING, "Invalid tls-protocols configuration string");
return -1;
}
for (i = 0; i < count; i++) {
if (!strcasecmp(tokens[i], "tlsv1"))
protocols |= REDIS_TLS_PROTO_TLSv1;
else if (!strcasecmp(tokens[i], "tlsv1.1"))
protocols |= REDIS_TLS_PROTO_TLSv1_1;
else if (!strcasecmp(tokens[i], "tlsv1.2"))
protocols |= REDIS_TLS_PROTO_TLSv1_2;
else if (!strcasecmp(tokens[i], "tlsv1.3")) {
#ifdef TLS1_3_VERSION
protocols |= REDIS_TLS_PROTO_TLSv1_3;
#else
serverLog(LL_WARNING, "TLSv1.3 is specified in tls-protocols but not supported by OpenSSL.");
protocols = -1;
break;
#endif
} else {
serverLog(LL_WARNING, "Invalid tls-protocols specified. "
"Use a combination of 'TLSv1', 'TLSv1.1', 'TLSv1.2' and 'TLSv1.3'.");
protocols = -1;
break;
}
}
sdsfreesplitres(tokens, count);
return protocols;
}
/* list of connections with pending data already read from the socket, but not
* served to the reader yet. */
static list *pending_list = NULL;
/**
* OpenSSL global initialization and locking handling callbacks.
* Note that this is only required for OpenSSL < 1.1.0.
*/
#if OPENSSL_VERSION_NUMBER < 0x10100000L
#define USE_CRYPTO_LOCKS
#endif
#ifdef USE_CRYPTO_LOCKS
static pthread_mutex_t *openssl_locks;
static void sslLockingCallback(int mode, int lock_id, const char *f, int line) {
pthread_mutex_t *mt = openssl_locks + lock_id;
if (mode & CRYPTO_LOCK) {
pthread_mutex_lock(mt);
} else {
pthread_mutex_unlock(mt);
}
(void)f;
(void)line;
}
static void initCryptoLocks(void) {
unsigned i, nlocks;
if (CRYPTO_get_locking_callback() != NULL) {
/* Someone already set the callback before us. Don't destroy it! */
return;
}
nlocks = CRYPTO_num_locks();
openssl_locks = zmalloc(sizeof(*openssl_locks) * nlocks);
for (i = 0; i < nlocks; i++) {
pthread_mutex_init(openssl_locks + i, NULL);
}
CRYPTO_set_locking_callback(sslLockingCallback);
}
#endif /* USE_CRYPTO_LOCKS */
static void tlsInit(void) {
/* Enable configuring OpenSSL using the standard openssl.cnf
* OPENSSL_config()/OPENSSL_init_crypto() should be the first
* call to the OpenSSL* library.
* - OPENSSL_config() should be used for OpenSSL versions < 1.1.0
* - OPENSSL_init_crypto() should be used for OpenSSL versions >= 1.1.0
*/
#if OPENSSL_VERSION_NUMBER < 0x10100000L
OPENSSL_config(NULL);
SSL_load_error_strings();
SSL_library_init();
#elif OPENSSL_VERSION_NUMBER < 0x10101000L
OPENSSL_init_crypto(OPENSSL_INIT_LOAD_CONFIG, NULL);
#else
OPENSSL_init_crypto(OPENSSL_INIT_LOAD_CONFIG | OPENSSL_INIT_ATFORK, NULL);
#endif
#ifdef USE_CRYPTO_LOCKS
initCryptoLocks();
#endif
if (!RAND_poll()) {
serverLog(LL_WARNING, "OpenSSL: Failed to seed random number generator.");
}
pending_list = listCreate();
}
static void tlsCleanup(void) {
if (valkey_tls_ctx) {
SSL_CTX_free(valkey_tls_ctx);
valkey_tls_ctx = NULL;
}
if (valkey_tls_client_ctx) {
SSL_CTX_free(valkey_tls_client_ctx);
valkey_tls_client_ctx = NULL;
}
#if OPENSSL_VERSION_NUMBER >= 0x10100000L && !defined(LIBRESSL_VERSION_NUMBER)
// unavailable on LibreSSL
OPENSSL_cleanup();
#endif
}
/* Callback for passing a keyfile password stored as an sds to OpenSSL */
static int tlsPasswordCallback(char *buf, int size, int rwflag, void *u) {
UNUSED(rwflag);
const char *pass = u;
size_t pass_len;
if (!pass) return -1;
pass_len = strlen(pass);
if (pass_len > (size_t)size) return -1;
memcpy(buf, pass, pass_len);
return (int)pass_len;
}
/* Create a *base* SSL_CTX using the SSL configuration provided. The base context
* includes everything that's common for both client-side and server-side connections.
*/
static SSL_CTX *createSSLContext(serverTLSContextConfig *ctx_config, int protocols, int client) {
const char *cert_file = client ? ctx_config->client_cert_file : ctx_config->cert_file;
const char *key_file = client ? ctx_config->client_key_file : ctx_config->key_file;
const char *key_file_pass = client ? ctx_config->client_key_file_pass : ctx_config->key_file_pass;
char errbuf[256];
SSL_CTX *ctx = NULL;
ctx = SSL_CTX_new(SSLv23_method());
if (!ctx) goto error;
SSL_CTX_set_options(ctx, SSL_OP_NO_SSLv2 | SSL_OP_NO_SSLv3);
#ifdef SSL_OP_DONT_INSERT_EMPTY_FRAGMENTS
SSL_CTX_set_options(ctx, SSL_OP_DONT_INSERT_EMPTY_FRAGMENTS);
#endif
if (!(protocols & REDIS_TLS_PROTO_TLSv1)) SSL_CTX_set_options(ctx, SSL_OP_NO_TLSv1);
if (!(protocols & REDIS_TLS_PROTO_TLSv1_1)) SSL_CTX_set_options(ctx, SSL_OP_NO_TLSv1_1);
#ifdef SSL_OP_NO_TLSv1_2
if (!(protocols & REDIS_TLS_PROTO_TLSv1_2)) SSL_CTX_set_options(ctx, SSL_OP_NO_TLSv1_2);
#endif
#ifdef SSL_OP_NO_TLSv1_3
if (!(protocols & REDIS_TLS_PROTO_TLSv1_3)) SSL_CTX_set_options(ctx, SSL_OP_NO_TLSv1_3);
#endif
#ifdef SSL_OP_NO_COMPRESSION
SSL_CTX_set_options(ctx, SSL_OP_NO_COMPRESSION);
#endif
SSL_CTX_set_mode(ctx, SSL_MODE_ENABLE_PARTIAL_WRITE | SSL_MODE_ACCEPT_MOVING_WRITE_BUFFER);
SSL_CTX_set_verify(ctx, SSL_VERIFY_PEER | SSL_VERIFY_FAIL_IF_NO_PEER_CERT, NULL);
SSL_CTX_set_default_passwd_cb(ctx, tlsPasswordCallback);
SSL_CTX_set_default_passwd_cb_userdata(ctx, (void *)key_file_pass);
if (SSL_CTX_use_certificate_chain_file(ctx, cert_file) <= 0) {
ERR_error_string_n(ERR_get_error(), errbuf, sizeof(errbuf));
serverLog(LL_WARNING, "Failed to load certificate: %s: %s", cert_file, errbuf);
goto error;
}
if (SSL_CTX_use_PrivateKey_file(ctx, key_file, SSL_FILETYPE_PEM) <= 0) {
ERR_error_string_n(ERR_get_error(), errbuf, sizeof(errbuf));
serverLog(LL_WARNING, "Failed to load private key: %s: %s", key_file, errbuf);
goto error;
}
if ((ctx_config->ca_cert_file || ctx_config->ca_cert_dir) &&
SSL_CTX_load_verify_locations(ctx, ctx_config->ca_cert_file, ctx_config->ca_cert_dir) <= 0) {
ERR_error_string_n(ERR_get_error(), errbuf, sizeof(errbuf));
serverLog(LL_WARNING, "Failed to configure CA certificate(s) file/directory: %s", errbuf);
goto error;
}
if (ctx_config->ciphers && !SSL_CTX_set_cipher_list(ctx, ctx_config->ciphers)) {
serverLog(LL_WARNING, "Failed to configure ciphers: %s", ctx_config->ciphers);
goto error;
}
#ifdef TLS1_3_VERSION
if (ctx_config->ciphersuites && !SSL_CTX_set_ciphersuites(ctx, ctx_config->ciphersuites)) {
serverLog(LL_WARNING, "Failed to configure ciphersuites: %s", ctx_config->ciphersuites);
goto error;
}
#endif
return ctx;
error:
if (ctx) SSL_CTX_free(ctx);
return NULL;
}
/* Attempt to configure/reconfigure TLS. This operation is atomic and will
* leave the SSL_CTX unchanged if it fails.
* @priv: config of serverTLSContextConfig.
* @reconfigure: if true, ignore the previous configure; if false, only
* configure from @ctx_config if valkey_tls_ctx is NULL.
*/
static int tlsConfigure(void *priv, int reconfigure) {
serverTLSContextConfig *ctx_config = (serverTLSContextConfig *)priv;
char errbuf[256];
SSL_CTX *ctx = NULL;
SSL_CTX *client_ctx = NULL;
if (!reconfigure && valkey_tls_ctx) {
return C_OK;
}
if (!ctx_config->cert_file) {
serverLog(LL_WARNING, "No tls-cert-file configured!");
goto error;
}
if (!ctx_config->key_file) {
serverLog(LL_WARNING, "No tls-key-file configured!");
goto error;
}
if (((server.tls_auth_clients != TLS_CLIENT_AUTH_NO) || server.tls_cluster || server.tls_replication) &&
!ctx_config->ca_cert_file && !ctx_config->ca_cert_dir) {
serverLog(LL_WARNING, "Either tls-ca-cert-file or tls-ca-cert-dir must be specified when tls-cluster, "
"tls-replication or tls-auth-clients are enabled!");
goto error;
}
int protocols = parseProtocolsConfig(ctx_config->protocols);
if (protocols == -1) goto error;
/* Create server side/general context */
ctx = createSSLContext(ctx_config, protocols, 0);
if (!ctx) goto error;
if (ctx_config->session_caching) {
SSL_CTX_set_session_cache_mode(ctx, SSL_SESS_CACHE_SERVER);
SSL_CTX_sess_set_cache_size(ctx, ctx_config->session_cache_size);
SSL_CTX_set_timeout(ctx, ctx_config->session_cache_timeout);
SSL_CTX_set_session_id_context(ctx, (void *)"redis", 5);
} else {
SSL_CTX_set_session_cache_mode(ctx, SSL_SESS_CACHE_OFF);
}
#ifdef SSL_OP_NO_CLIENT_RENEGOTIATION
SSL_CTX_set_options(ctx, SSL_OP_NO_CLIENT_RENEGOTIATION);
#endif
if (ctx_config->prefer_server_ciphers) SSL_CTX_set_options(ctx, SSL_OP_CIPHER_SERVER_PREFERENCE);
#if ((OPENSSL_VERSION_NUMBER < 0x30000000L) && defined(SSL_CTX_set_ecdh_auto))
SSL_CTX_set_ecdh_auto(ctx, 1);
#endif
SSL_CTX_set_options(ctx, SSL_OP_SINGLE_DH_USE);
if (ctx_config->dh_params_file) {
FILE *dhfile = fopen(ctx_config->dh_params_file, "r");
if (!dhfile) {
serverLog(LL_WARNING, "Failed to load %s: %s", ctx_config->dh_params_file, strerror(errno));
goto error;
}
#if (OPENSSL_VERSION_NUMBER >= 0x30000000L)
EVP_PKEY *pkey = NULL;
OSSL_DECODER_CTX *dctx =
OSSL_DECODER_CTX_new_for_pkey(&pkey, "PEM", NULL, "DH", OSSL_KEYMGMT_SELECT_DOMAIN_PARAMETERS, NULL, NULL);
if (!dctx) {
serverLog(LL_WARNING, "No decoder for DH params.");
fclose(dhfile);
goto error;
}
if (!OSSL_DECODER_from_fp(dctx, dhfile)) {
serverLog(LL_WARNING, "%s: failed to read DH params.", ctx_config->dh_params_file);
OSSL_DECODER_CTX_free(dctx);
fclose(dhfile);
goto error;
}
OSSL_DECODER_CTX_free(dctx);
fclose(dhfile);
if (SSL_CTX_set0_tmp_dh_pkey(ctx, pkey) <= 0) {
ERR_error_string_n(ERR_get_error(), errbuf, sizeof(errbuf));
serverLog(LL_WARNING, "Failed to load DH params file: %s: %s", ctx_config->dh_params_file, errbuf);
EVP_PKEY_free(pkey);
goto error;
}
/* Not freeing pkey, it is owned by OpenSSL now */
#else
DH *dh = PEM_read_DHparams(dhfile, NULL, NULL, NULL);
fclose(dhfile);
if (!dh) {
serverLog(LL_WARNING, "%s: failed to read DH params.", ctx_config->dh_params_file);
goto error;
}
if (SSL_CTX_set_tmp_dh(ctx, dh) <= 0) {
ERR_error_string_n(ERR_get_error(), errbuf, sizeof(errbuf));
serverLog(LL_WARNING, "Failed to load DH params file: %s: %s", ctx_config->dh_params_file, errbuf);
DH_free(dh);
goto error;
}
DH_free(dh);
#endif
} else {
#if (OPENSSL_VERSION_NUMBER >= 0x30000000L)
SSL_CTX_set_dh_auto(ctx, 1);
#endif
}
/* If a client-side certificate is configured, create an explicit client context */
if (ctx_config->client_cert_file && ctx_config->client_key_file) {
client_ctx = createSSLContext(ctx_config, protocols, 1);
if (!client_ctx) goto error;
}
SSL_CTX_free(valkey_tls_ctx);
SSL_CTX_free(valkey_tls_client_ctx);
valkey_tls_ctx = ctx;
valkey_tls_client_ctx = client_ctx;
return C_OK;
error:
if (ctx) SSL_CTX_free(ctx);
if (client_ctx) SSL_CTX_free(client_ctx);
return C_ERR;
}
#ifdef TLS_DEBUGGING
#define TLSCONN_DEBUG(fmt, ...) serverLog(LL_DEBUG, "TLSCONN: " fmt, __VA_ARGS__)
#else
#define TLSCONN_DEBUG(fmt, ...)
#endif
static ConnectionType CT_TLS;
/* Normal socket connections have a simple events/handler correlation.
*
* With TLS connections we need to handle cases where during a logical read
* or write operation, the SSL library asks to block for the opposite
* socket operation.
*
* When this happens, we need to do two things:
* 1. Make sure we register for the event.
* 2. Make sure we know which handler needs to execute when the
* event fires. That is, if we notify the caller of a write operation
* that it blocks, and SSL asks for a read, we need to trigger the
* write handler again on the next read event.
*
*/
typedef enum { WANT_READ = 1, WANT_WRITE } WantIOType;
#define TLS_CONN_FLAG_READ_WANT_WRITE (1 << 0)
#define TLS_CONN_FLAG_WRITE_WANT_READ (1 << 1)
#define TLS_CONN_FLAG_FD_SET (1 << 2)
#define TLS_CONN_FLAG_POSTPONE_UPDATE_STATE (1 << 3)
typedef struct tls_connection {
connection c;
int flags;
SSL *ssl;
char *ssl_error;
listNode *pending_list_node;
} tls_connection;
static connection *createTLSConnection(int client_side) {
SSL_CTX *ctx = valkey_tls_ctx;
if (client_side && valkey_tls_client_ctx) ctx = valkey_tls_client_ctx;
tls_connection *conn = zcalloc(sizeof(tls_connection));
conn->c.type = &CT_TLS;
conn->c.fd = -1;
conn->c.iovcnt = IOV_MAX;
conn->ssl = SSL_new(ctx);
return (connection *)conn;
}
static connection *connCreateTLS(void) {
return createTLSConnection(1);
}
/* Fetch the latest OpenSSL error and store it in the connection */
static void updateTLSError(tls_connection *conn) {
conn->c.last_errno = 0;
if (conn->ssl_error) zfree(conn->ssl_error);
conn->ssl_error = zmalloc(512);
ERR_error_string_n(ERR_get_error(), conn->ssl_error, 512);
}
/* Create a new TLS connection that is already associated with
* an accepted underlying file descriptor.
*
* The socket is not ready for I/O until connAccept() was called and
* invoked the connection-level accept handler.
*
* Callers should use connGetState() and verify the created connection
* is not in an error state.
*/
static connection *connCreateAcceptedTLS(int fd, void *priv) {
int require_auth = *(int *)priv;
tls_connection *conn = (tls_connection *)createTLSConnection(0);
conn->c.fd = fd;
conn->c.state = CONN_STATE_ACCEPTING;
if (!conn->ssl) {
updateTLSError(conn);
conn->c.state = CONN_STATE_ERROR;
return (connection *)conn;
}
switch (require_auth) {
case TLS_CLIENT_AUTH_NO: SSL_set_verify(conn->ssl, SSL_VERIFY_NONE, NULL); break;
case TLS_CLIENT_AUTH_OPTIONAL: SSL_set_verify(conn->ssl, SSL_VERIFY_PEER, NULL); break;
default: /* TLS_CLIENT_AUTH_YES, also fail-secure */
SSL_set_verify(conn->ssl, SSL_VERIFY_PEER | SSL_VERIFY_FAIL_IF_NO_PEER_CERT, NULL);
break;
}
SSL_set_fd(conn->ssl, conn->c.fd);
SSL_set_accept_state(conn->ssl);
return (connection *)conn;
}
static void tlsEventHandler(struct aeEventLoop *el, int fd, void *clientData, int mask);
static void updateSSLEvent(tls_connection *conn);
/* Process the return code received from OpenSSL:
* Update the want parameter with expected I/O.
* Update the connection's error state if a real error has occurred.
* Returns an SSL error code, or 0 if no further handling is required.
*/
static int handleSSLReturnCode(tls_connection *conn, int ret_value, WantIOType *want) {
if (ret_value <= 0) {
int ssl_err = SSL_get_error(conn->ssl, ret_value);
switch (ssl_err) {
case SSL_ERROR_WANT_WRITE: *want = WANT_WRITE; return 0;
case SSL_ERROR_WANT_READ: *want = WANT_READ; return 0;
case SSL_ERROR_SYSCALL:
conn->c.last_errno = errno;
if (conn->ssl_error) zfree(conn->ssl_error);
conn->ssl_error = errno ? zstrdup(strerror(errno)) : NULL;
break;
default:
/* Error! */
updateTLSError(conn);
break;
}
return ssl_err;
}
return 0;
}
/* Handle OpenSSL return code following SSL_write() or SSL_read():
*
* - Updates conn state and last_errno.
* - If update_event is nonzero, calls updateSSLEvent() when necessary.
*
* Returns ret_value, or -1 on error or dropped connection.
*/
static int updateStateAfterSSLIO(tls_connection *conn, int ret_value, int update_event) {
/* If system call was interrupted, there's no need to go through the full
* OpenSSL error handling and just report this for the caller to retry the
* operation.
*/
if (errno == EINTR) {
conn->c.last_errno = EINTR;
return -1;
}
if (ret_value <= 0) {
WantIOType want = 0;
int ssl_err;
if (!(ssl_err = handleSSLReturnCode(conn, ret_value, &want))) {
if (want == WANT_READ) conn->flags |= TLS_CONN_FLAG_WRITE_WANT_READ;
if (want == WANT_WRITE) conn->flags |= TLS_CONN_FLAG_READ_WANT_WRITE;
if (update_event) updateSSLEvent(conn);
errno = EAGAIN;
return -1;
} else {
if (ssl_err == SSL_ERROR_ZERO_RETURN || ((ssl_err == SSL_ERROR_SYSCALL && !errno))) {
conn->c.state = CONN_STATE_CLOSED;
return -1;
} else {
conn->c.state = CONN_STATE_ERROR;
return -1;
}
}
}
return ret_value;
}
static void registerSSLEvent(tls_connection *conn, WantIOType want) {
int mask = aeGetFileEvents(server.el, conn->c.fd);
switch (want) {
case WANT_READ:
if (mask & AE_WRITABLE) aeDeleteFileEvent(server.el, conn->c.fd, AE_WRITABLE);
if (!(mask & AE_READABLE)) aeCreateFileEvent(server.el, conn->c.fd, AE_READABLE, tlsEventHandler, conn);
break;
case WANT_WRITE:
if (mask & AE_READABLE) aeDeleteFileEvent(server.el, conn->c.fd, AE_READABLE);
if (!(mask & AE_WRITABLE)) aeCreateFileEvent(server.el, conn->c.fd, AE_WRITABLE, tlsEventHandler, conn);
break;
default: serverAssert(0); break;
}
}
static void postPoneUpdateSSLState(connection *conn_, int postpone) {
tls_connection *conn = (tls_connection *)conn_;
if (postpone) {
conn->flags |= TLS_CONN_FLAG_POSTPONE_UPDATE_STATE;
} else {
conn->flags &= ~TLS_CONN_FLAG_POSTPONE_UPDATE_STATE;
}
}
static void updatePendingData(tls_connection *conn) {
if (conn->flags & TLS_CONN_FLAG_POSTPONE_UPDATE_STATE) return;
/* If SSL has pending data that was already read from the socket, we risk never calling the read handler again.
* Make sure to add the connection to a list of pending connections that should be handled anyway. */
if (SSL_pending(conn->ssl) > 0) {
if (!conn->pending_list_node) {
listAddNodeTail(pending_list, conn);
conn->pending_list_node = listLast(pending_list);
}
} else if (conn->pending_list_node) {
listDelNode(pending_list, conn->pending_list_node);
conn->pending_list_node = NULL;
}
}
static void updateSSLEvent(tls_connection *conn) {
if (conn->flags & TLS_CONN_FLAG_POSTPONE_UPDATE_STATE) return;
int mask = aeGetFileEvents(server.el, conn->c.fd);
int need_read = conn->c.read_handler || (conn->flags & TLS_CONN_FLAG_WRITE_WANT_READ);
int need_write = conn->c.write_handler || (conn->flags & TLS_CONN_FLAG_READ_WANT_WRITE);
if (need_read && !(mask & AE_READABLE))
aeCreateFileEvent(server.el, conn->c.fd, AE_READABLE, tlsEventHandler, conn);
if (!need_read && (mask & AE_READABLE)) aeDeleteFileEvent(server.el, conn->c.fd, AE_READABLE);
if (need_write && !(mask & AE_WRITABLE))
aeCreateFileEvent(server.el, conn->c.fd, AE_WRITABLE, tlsEventHandler, conn);
if (!need_write && (mask & AE_WRITABLE)) aeDeleteFileEvent(server.el, conn->c.fd, AE_WRITABLE);
}
static void updateSSLState(connection *conn_) {
tls_connection *conn = (tls_connection *)conn_;
updateSSLEvent(conn);
updatePendingData(conn);
}
static void tlsHandleEvent(tls_connection *conn, int mask) {
int ret, conn_error;
TLSCONN_DEBUG("tlsEventHandler(): fd=%d, state=%d, mask=%d, r=%d, w=%d, flags=%d", conn->c.fd, conn->c.state, mask,
conn->c.read_handler != NULL, conn->c.write_handler != NULL, conn->flags);
ERR_clear_error();
switch (conn->c.state) {
case CONN_STATE_CONNECTING:
conn_error = anetGetError(conn->c.fd);
if (conn_error) {
conn->c.last_errno = conn_error;
conn->c.state = CONN_STATE_ERROR;
} else {
if (!(conn->flags & TLS_CONN_FLAG_FD_SET)) {
SSL_set_fd(conn->ssl, conn->c.fd);
conn->flags |= TLS_CONN_FLAG_FD_SET;
}
ret = SSL_connect(conn->ssl);
if (ret <= 0) {
WantIOType want = 0;
if (!handleSSLReturnCode(conn, ret, &want)) {
registerSSLEvent(conn, want);
/* Avoid hitting updateSSLEvent(), which knows nothing
* of what SSL_connect() wants and instead looks at our
* R/W handlers.
*/
return;
}
/* If not handled, it's an error */
conn->c.state = CONN_STATE_ERROR;
} else {
conn->c.state = CONN_STATE_CONNECTED;
}
}
if (!callHandler((connection *)conn, conn->c.conn_handler)) return;
conn->c.conn_handler = NULL;
break;
case CONN_STATE_ACCEPTING:
ret = SSL_accept(conn->ssl);
if (ret <= 0) {
WantIOType want = 0;
if (!handleSSLReturnCode(conn, ret, &want)) {
/* Avoid hitting updateSSLEvent(), which knows nothing
* of what SSL_accept() wants and instead looks at our
* R/W handlers.
*/
registerSSLEvent(conn, want);
return;
}
/* If not handled, it's an error */
conn->c.state = CONN_STATE_ERROR;
} else {
conn->c.state = CONN_STATE_CONNECTED;
}
if (!callHandler((connection *)conn, conn->c.conn_handler)) return;
conn->c.conn_handler = NULL;
break;
case CONN_STATE_CONNECTED: {
int call_read = ((mask & AE_READABLE) && conn->c.read_handler) ||
((mask & AE_WRITABLE) && (conn->flags & TLS_CONN_FLAG_READ_WANT_WRITE));
int call_write = ((mask & AE_WRITABLE) && conn->c.write_handler) ||
((mask & AE_READABLE) && (conn->flags & TLS_CONN_FLAG_WRITE_WANT_READ));
/* Normally we execute the readable event first, and the writable
* event later. This is useful as sometimes we may be able
* to serve the reply of a query immediately after processing the
* query.
*
* However if WRITE_BARRIER is set in the mask, our application is
* asking us to do the reverse: never fire the writable event
* after the readable. In such a case, we invert the calls.
* This is useful when, for instance, we want to do things
* in the beforeSleep() hook, like fsyncing a file to disk,
* before replying to a client. */
int invert = conn->c.flags & CONN_FLAG_WRITE_BARRIER;
if (!invert && call_read) {
conn->flags &= ~TLS_CONN_FLAG_READ_WANT_WRITE;
if (!callHandler((connection *)conn, conn->c.read_handler)) return;
}
/* Fire the writable event. */
if (call_write) {
conn->flags &= ~TLS_CONN_FLAG_WRITE_WANT_READ;
if (!callHandler((connection *)conn, conn->c.write_handler)) return;
}
/* If we have to invert the call, fire the readable event now
* after the writable one. */
if (invert && call_read) {
conn->flags &= ~TLS_CONN_FLAG_READ_WANT_WRITE;
if (!callHandler((connection *)conn, conn->c.read_handler)) return;
}
if (mask & AE_READABLE) {
updatePendingData(conn);
}
break;
}
default: break;
}
updateSSLEvent(conn);
}
static void tlsEventHandler(struct aeEventLoop *el, int fd, void *clientData, int mask) {
UNUSED(el);
UNUSED(fd);
tls_connection *conn = clientData;
tlsHandleEvent(conn, mask);
}
static void tlsAcceptHandler(aeEventLoop *el, int fd, void *privdata, int mask) {
int cport, cfd;
int max = server.max_new_tls_conns_per_cycle;
char cip[NET_IP_STR_LEN];
struct ClientFlags flags = {0};
UNUSED(el);
UNUSED(mask);
UNUSED(privdata);
while (max--) {
cfd = anetTcpAccept(server.neterr, fd, cip, sizeof(cip), &cport);
if (cfd == ANET_ERR) {
if (errno != EWOULDBLOCK) serverLog(LL_WARNING, "Accepting client connection: %s", server.neterr);
return;
}
serverLog(LL_VERBOSE, "Accepted %s:%d", cip, cport);
acceptCommonHandler(connCreateAcceptedTLS(cfd, &server.tls_auth_clients), flags, cip);
}
}
static int connTLSAddr(connection *conn, char *ip, size_t ip_len, int *port, int remote) {
return anetFdToString(conn->fd, ip, ip_len, port, remote);
}
static int connTLSIsLocal(connection *conn) {
return connectionTypeTcp()->is_local(conn);
}
static int connTLSListen(connListener *listener) {
return listenToPort(listener);
}
static void connTLSShutdown(connection *conn_) {
tls_connection *conn = (tls_connection *)conn_;
if (conn->ssl) {
if (conn->c.state == CONN_STATE_CONNECTED) SSL_shutdown(conn->ssl);
SSL_free(conn->ssl);
conn->ssl = NULL;
}
connectionTypeTcp()->shutdown(conn_);
}
static void connTLSClose(connection *conn_) {
tls_connection *conn = (tls_connection *)conn_;
if (conn->ssl) {
if (conn->c.state == CONN_STATE_CONNECTED) SSL_shutdown(conn->ssl);
SSL_free(conn->ssl);
conn->ssl = NULL;
}
if (conn->ssl_error) {
zfree(conn->ssl_error);
conn->ssl_error = NULL;
}
if (conn->pending_list_node) {
listDelNode(pending_list, conn->pending_list_node);
conn->pending_list_node = NULL;
}
connectionTypeTcp()->close(conn_);
}
static int connTLSAccept(connection *_conn, ConnectionCallbackFunc accept_handler) {
tls_connection *conn = (tls_connection *)_conn;
int ret;
if (conn->c.state != CONN_STATE_ACCEPTING) return C_ERR;
ERR_clear_error();
/* Try to accept */
conn->c.conn_handler = accept_handler;
ret = SSL_accept(conn->ssl);
if (ret <= 0) {
WantIOType want = 0;
if (!handleSSLReturnCode(conn, ret, &want)) {
registerSSLEvent(conn, want); /* We'll fire back */
return C_OK;
} else {
conn->c.state = CONN_STATE_ERROR;
return C_ERR;
}
}
conn->c.state = CONN_STATE_CONNECTED;
if (!callHandler((connection *)conn, conn->c.conn_handler)) return C_OK;
conn->c.conn_handler = NULL;
return C_OK;
}
static int connTLSConnect(connection *conn_,
const char *addr,
int port,
const char *src_addr,
ConnectionCallbackFunc connect_handler) {
tls_connection *conn = (tls_connection *)conn_;
unsigned char addr_buf[sizeof(struct in6_addr)];
if (conn->c.state != CONN_STATE_NONE) return C_ERR;
ERR_clear_error();
/* Check whether addr is an IP address. If it is not, use the value for Server Name Indication. */
if (inet_pton(AF_INET, addr, addr_buf) != 1 && inet_pton(AF_INET6, addr, addr_buf) != 1) {
SSL_set_tlsext_host_name(conn->ssl, addr);
}
/* Initiate Socket connection first */
if (connectionTypeTcp()->connect(conn_, addr, port, src_addr, connect_handler) == C_ERR) return C_ERR;
/* Return now, once the socket is connected we'll initiate
* TLS connection from the event handler.
*/
return C_OK;
}
static int connTLSWrite(connection *conn_, const void *data, size_t data_len) {
tls_connection *conn = (tls_connection *)conn_;
int ret;
if (conn->c.state != CONN_STATE_CONNECTED) return -1;
ERR_clear_error();
ret = SSL_write(conn->ssl, data, data_len);
return updateStateAfterSSLIO(conn, ret, 1);
}
static int connTLSWritev(connection *conn_, const struct iovec *iov, int iovcnt) {
if (iovcnt == 1) return connTLSWrite(conn_, iov[0].iov_base, iov[0].iov_len);
/* Accumulate the total number of bytes across the buffers and check whether it exceeds NET_MAX_WRITES_PER_EVENT. */
size_t iov_bytes_len = 0;
for (int i = 0; i < iovcnt; i++) {
iov_bytes_len += iov[i].iov_len;
if (iov_bytes_len > NET_MAX_WRITES_PER_EVENT) break;
}
/* The total size of all buffers exceeds NET_MAX_WRITES_PER_EVENT, so copying
* that much memory just to reduce system calls is not worthwhile. Invoke
* connTLSWrite() once per buffer instead to avoid the memory copies. */
if (iov_bytes_len > NET_MAX_WRITES_PER_EVENT) {
ssize_t tot_sent = 0;
for (int i = 0; i < iovcnt; i++) {
ssize_t sent = connTLSWrite(conn_, iov[i].iov_base, iov[i].iov_len);
if (sent <= 0) return tot_sent > 0 ? tot_sent : sent;
tot_sent += sent;
if ((size_t)sent != iov[i].iov_len) break;
}
return tot_sent;
}
/* The total size of all buffers does not exceed NET_MAX_WRITES_PER_EVENT, so
* a few extra memory copies are worth the reduction in system calls:
* concatenate the scattered buffers into one contiguous piece of memory
* and send it with a single call to connTLSWrite(). */
char buf[iov_bytes_len];
size_t offset = 0;
for (int i = 0; i < iovcnt; i++) {
memcpy(buf + offset, iov[i].iov_base, iov[i].iov_len);
offset += iov[i].iov_len;
}
return connTLSWrite(conn_, buf, iov_bytes_len);
}
static int connTLSRead(connection *conn_, void *buf, size_t buf_len) {
tls_connection *conn = (tls_connection *)conn_;
int ret;
if (conn->c.state != CONN_STATE_CONNECTED) return -1;
ERR_clear_error();
ret = SSL_read(conn->ssl, buf, buf_len);
return updateStateAfterSSLIO(conn, ret, 1);
}
static const char *connTLSGetLastError(connection *conn_) {
tls_connection *conn = (tls_connection *)conn_;
if (conn->ssl_error) return conn->ssl_error;
return NULL;
}
static int connTLSSetWriteHandler(connection *conn, ConnectionCallbackFunc func, int barrier) {
conn->write_handler = func;
if (barrier)
conn->flags |= CONN_FLAG_WRITE_BARRIER;
else
conn->flags &= ~CONN_FLAG_WRITE_BARRIER;
updateSSLEvent((tls_connection *)conn);
return C_OK;
}
static int connTLSSetReadHandler(connection *conn, ConnectionCallbackFunc func) {
conn->read_handler = func;
updateSSLEvent((tls_connection *)conn);
return C_OK;
}
static void setBlockingTimeout(tls_connection *conn, long long timeout) {
anetBlock(NULL, conn->c.fd);
anetSendTimeout(NULL, conn->c.fd, timeout);
anetRecvTimeout(NULL, conn->c.fd, timeout);
}
static void unsetBlockingTimeout(tls_connection *conn) {
anetNonBlock(NULL, conn->c.fd);
anetSendTimeout(NULL, conn->c.fd, 0);
anetRecvTimeout(NULL, conn->c.fd, 0);
}
static int connTLSBlockingConnect(connection *conn_, const char *addr, int port, long long timeout) {
tls_connection *conn = (tls_connection *)conn_;
int ret;
if (conn->c.state != CONN_STATE_NONE) return C_ERR;
/* Initiate socket blocking connect first */
if (connectionTypeTcp()->blocking_connect(conn_, addr, port, timeout) == C_ERR) return C_ERR;
/* Initiate TLS connection now. We set up a send/recv timeout on the socket,
* which means the specified timeout will not be enforced accurately. */
SSL_set_fd(conn->ssl, conn->c.fd);
setBlockingTimeout(conn, timeout);
if ((ret = SSL_connect(conn->ssl)) <= 0) {
conn->c.state = CONN_STATE_ERROR;
return C_ERR;
}
unsetBlockingTimeout(conn);
conn->c.state = CONN_STATE_CONNECTED;
return C_OK;
}
static ssize_t connTLSSyncWrite(connection *conn_, char *ptr, ssize_t size, long long timeout) {
tls_connection *conn = (tls_connection *)conn_;
setBlockingTimeout(conn, timeout);
SSL_clear_mode(conn->ssl, SSL_MODE_ENABLE_PARTIAL_WRITE);
ERR_clear_error();
int ret = SSL_write(conn->ssl, ptr, size);
ret = updateStateAfterSSLIO(conn, ret, 0);
SSL_set_mode(conn->ssl, SSL_MODE_ENABLE_PARTIAL_WRITE);
unsetBlockingTimeout(conn);
return ret;
}
static ssize_t connTLSSyncRead(connection *conn_, char *ptr, ssize_t size, long long timeout) {
tls_connection *conn = (tls_connection *)conn_;
setBlockingTimeout(conn, timeout);
ERR_clear_error();
int ret = SSL_read(conn->ssl, ptr, size);
ret = updateStateAfterSSLIO(conn, ret, 0);
unsetBlockingTimeout(conn);
return ret;
}
static ssize_t connTLSSyncReadLine(connection *conn_, char *ptr, ssize_t size, long long timeout) {
tls_connection *conn = (tls_connection *)conn_;
ssize_t nread = 0;
setBlockingTimeout(conn, timeout);
size--;
while (size) {
char c;
ERR_clear_error();
int ret = SSL_read(conn->ssl, &c, 1);
ret = updateStateAfterSSLIO(conn, ret, 0);
if (ret <= 0) {
nread = -1;
goto exit;
}
if (c == '\n') {
*ptr = '\0';
if (nread && *(ptr - 1) == '\r') *(ptr - 1) = '\0';
goto exit;
} else {
*ptr++ = c;
*ptr = '\0';
nread++;
}
size--;
}
exit:
unsetBlockingTimeout(conn);
return nread;
}
static const char *connTLSGetType(connection *conn_) {
(void)conn_;
return CONN_TYPE_TLS;
}
static int tlsHasPendingData(void) {
if (!pending_list) return 0;
return listLength(pending_list) > 0;
}
static int tlsProcessPendingData(void) {
listIter li;
listNode *ln;
int processed = 0;
listRewind(pending_list, &li);
while ((ln = listNext(&li))) {
tls_connection *conn = listNodeValue(ln);
if (conn->flags & TLS_CONN_FLAG_POSTPONE_UPDATE_STATE) continue;
tlsHandleEvent(conn, AE_READABLE);
processed++;
}
return processed;
}
/* Fetch the peer certificate used for authentication on the specified
* connection and return it as a PEM-encoded sds.
*/
static sds connTLSGetPeerCert(connection *conn_) {
tls_connection *conn = (tls_connection *)conn_;
if ((conn_->type != connectionTypeTls()) || !conn->ssl) return NULL;
X509 *cert = SSL_get_peer_certificate(conn->ssl);
if (!cert) return NULL;
BIO *bio = BIO_new(BIO_s_mem());
if (bio == NULL || !PEM_write_bio_X509(bio, cert)) {
if (bio != NULL) BIO_free(bio);
X509_free(cert);
return NULL;
}
/* SSL_get_peer_certificate() returns a new reference that the caller must release. */
X509_free(cert);
const char *bio_ptr;
long long bio_len = BIO_get_mem_data(bio, &bio_ptr);
sds cert_pem = sdsnewlen(bio_ptr, bio_len);
BIO_free(bio);
return cert_pem;
}
static ConnectionType CT_TLS = {
/* connection type */
.get_type = connTLSGetType,
/* connection type initialize & finalize & configure */
.init = tlsInit,
.cleanup = tlsCleanup,
.configure = tlsConfigure,
/* ae & accept & listen & error & address handler */
.ae_handler = tlsEventHandler,
.accept_handler = tlsAcceptHandler,
.addr = connTLSAddr,
.is_local = connTLSIsLocal,
.listen = connTLSListen,
/* create/shutdown/close connection */
.conn_create = connCreateTLS,
.conn_create_accepted = connCreateAcceptedTLS,
.shutdown = connTLSShutdown,
.close = connTLSClose,
/* connect & accept */
.connect = connTLSConnect,
.blocking_connect = connTLSBlockingConnect,
.accept = connTLSAccept,
/* IO */
.read = connTLSRead,
.write = connTLSWrite,
.writev = connTLSWritev,
.set_write_handler = connTLSSetWriteHandler,
.set_read_handler = connTLSSetReadHandler,
.get_last_error = connTLSGetLastError,
.sync_write = connTLSSyncWrite,
.sync_read = connTLSSyncRead,
.sync_readline = connTLSSyncReadLine,
/* pending data */
.has_pending_data = tlsHasPendingData,
.process_pending_data = tlsProcessPendingData,
.postpone_update_state = postPoneUpdateSSLState,
.update_state = updateSSLState,
/* TLS specific methods */
.get_peer_cert = connTLSGetPeerCert,
};
int RedisRegisterConnectionTypeTLS(void) {
return connTypeRegister(&CT_TLS);
}
#else /* USE_OPENSSL */
int RedisRegisterConnectionTypeTLS(void) {
serverLog(LL_VERBOSE, "Connection type %s not builtin", CONN_TYPE_TLS);
return C_ERR;
}
#endif
#if BUILD_TLS_MODULE == 2 /* BUILD_MODULE */
#include "release.h"
int ValkeyModule_OnLoad(void *ctx, ValkeyModuleString **argv, int argc) {
UNUSED(argv);
UNUSED(argc);
/* Connection modules must be part of the same build as the server. */
if (strcmp(REDIS_BUILD_ID_RAW, serverBuildIdRaw())) {
serverLog(LL_NOTICE, "Connection type %s was not built together with the valkey-server used.", CONN_TYPE_TLS);
return VALKEYMODULE_ERR;
}
if (ValkeyModule_Init(ctx, "tls", 1, VALKEYMODULE_APIVER_1) == VALKEYMODULE_ERR) return VALKEYMODULE_ERR;
/* Connection modules can be loaded only during bootup. */
if ((ValkeyModule_GetContextFlags(ctx) & VALKEYMODULE_CTX_FLAGS_SERVER_STARTUP) == 0) {
serverLog(LL_NOTICE, "Connection type %s can be loaded only during bootup", CONN_TYPE_TLS);
return VALKEYMODULE_ERR;
}
ValkeyModule_SetModuleOptions(ctx, VALKEYMODULE_OPTIONS_HANDLE_REPL_ASYNC_LOAD);
if (connTypeRegister(&CT_TLS) != C_OK) return VALKEYMODULE_ERR;
return VALKEYMODULE_OK;
}
int ValkeyModule_OnUnload(void *arg) {
UNUSED(arg);
serverLog(LL_NOTICE, "Connection type %s can not be unloaded", CONN_TYPE_TLS);
return VALKEYMODULE_ERR;
}
#endif