diff --git a/README.md b/README.md index 8dbad7dbf..2b4eeb19b 100644 --- a/README.md +++ b/README.md @@ -119,7 +119,7 @@ parameter (the path of the configuration file): It is possible to alter the Redis configuration by passing parameters directly as options using the command line. Examples: - % ./redis-server --port 9999 --slaveof 127.0.0.1 6379 + % ./redis-server --port 9999 --replicaof 127.0.0.1 6379 % ./redis-server /etc/redis/6379.conf --loglevel debug All the options in redis.conf are also supported as options using the command @@ -216,7 +216,7 @@ Inside the root are the following important directories: * `src`: contains the Redis implementation, written in C. * `tests`: contains the unit tests, implemented in Tcl. -* `deps`: contains libraries Redis uses. Everything needed to compile Redis is inside this directory; your system just needs to provide `libc`, a POSIX compatible interface and a C compiler. Notably `deps` contains a copy of `jemalloc`, which is the default allocator of Redis under Linux. Note that under `deps` there are also things which started with the Redis project, but for which the main repository is not `anitrez/redis`. An exception to this rule is `deps/geohash-int` which is the low level geocoding library used by Redis: it originated from a different project, but at this point it diverged so much that it is developed as a separated entity directly inside the Redis repository. +* `deps`: contains libraries Redis uses. Everything needed to compile Redis is inside this directory; your system just needs to provide `libc`, a POSIX compatible interface and a C compiler. Notably `deps` contains a copy of `jemalloc`, which is the default allocator of Redis under Linux. Note that under `deps` there are also things which started with the Redis project, but for which the main repository is not `antirez/redis`. There are a few more directories but they are not very important for our goals here. 
We'll focus mostly on `src`, where the Redis implementation is contained, @@ -227,7 +227,7 @@ of complexity incrementally. Note: lately Redis was refactored quite a bit. Function names and file names have been changed, so you may find that this documentation reflects the `unstable` branch more closely. For instance in Redis 3.0 the `server.c` -and `server.h` files were named to `redis.c` and `redis.h`. However the overall +and `server.h` files were named `redis.c` and `redis.h`. However the overall structure is the same. Keep in mind that all the new developments and pull requests should be performed against the `unstable` branch. @@ -245,7 +245,7 @@ A few important fields in this structure are: * `server.db` is an array of Redis databases, where data is stored. * `server.commands` is the command table. * `server.clients` is a linked list of clients connected to the server. -* `server.master` is a special client, the master, if the instance is a slave. +* `server.master` is a special client, the master, if the instance is a replica. There are tons of other fields. Most fields are commented directly inside the structure definition. @@ -323,7 +323,7 @@ Inside server.c you can find code that handles other vital things of the Redis s networking.c --- -This file defines all the I/O functions with clients, masters and slaves +This file defines all the I/O functions with clients, masters and replicas (which in Redis are just special clients): * `createClient()` allocates and initializes a new client. @@ -390,16 +390,16 @@ replication.c This is one of the most complex files inside Redis, it is recommended to approach it only after getting a bit familiar with the rest of the code base. -In this file there is the implementation of both the master and slave role +In this file there is the implementation of both the master and replica role of Redis. 
-One of the most important functions inside this file is `replicationFeedSlaves()` that writes commands to the clients representing slave instances connected -to our master, so that the slaves can get the writes performed by the clients: +One of the most important functions inside this file is `replicationFeedSlaves()` that writes commands to the clients representing replica instances connected +to our master, so that the replicas can get the writes performed by the clients: this way their data set will remain synchronized with the one in the master. This file also implements both the `SYNC` and `PSYNC` commands that are used in order to perform the first synchronization between masters and -slaves, or to continue the replication after a disconnection. +replicas, or to continue the replication after a disconnection. Other C files --- diff --git a/deps/README.md b/deps/README.md index 367ee1627..685dbb40d 100644 --- a/deps/README.md +++ b/deps/README.md @@ -2,7 +2,6 @@ This directory contains all Redis dependencies, except for the libc that should be provided by the operating system. * **Jemalloc** is our memory allocator, used as replacement for libc malloc on Linux by default. It has good performances and excellent fragmentation behavior. This component is upgraded from time to time. -* **geohash-int** is inside the dependencies directory but is actually part of the Redis project, since it is our private fork (heavily modified) of a library initially developed for Ardb, which is in turn a fork of Redis. * **hiredis** is the official C client library for Redis. It is used by redis-cli, redis-benchmark and Redis Sentinel. It is part of the Redis official ecosystem but is developed externally from the Redis repository, so we just upgrade it as needed. * **linenoise** is a readline replacement. It is developed by the same authors of Redis but is managed as a separated project and updated as needed. 
* **lua** is Lua 5.1 with minor changes for security and additional libraries. @@ -42,11 +41,6 @@ the following additional steps: changed, otherwise you could just copy the old implementation if you are upgrading just to a similar version of Jemalloc. -Geohash ---- - -This is never upgraded since it's part of the Redis project. If there are changes to merge from Ardb there is the need to manually check differences, but at this point the source code is pretty different. - Hiredis --- diff --git a/deps/hiredis/.travis.yml b/deps/hiredis/.travis.yml index ad08076d8..faf2ce684 100644 --- a/deps/hiredis/.travis.yml +++ b/deps/hiredis/.travis.yml @@ -8,6 +8,12 @@ os: - linux - osx +branches: + only: + - staging + - trying + - master + before_script: - if [ "$TRAVIS_OS_NAME" == "osx" ] ; then brew update; brew install redis; fi diff --git a/deps/hiredis/CHANGELOG.md b/deps/hiredis/CHANGELOG.md index f92bcb3c9..a7fe3ac11 100644 --- a/deps/hiredis/CHANGELOG.md +++ b/deps/hiredis/CHANGELOG.md @@ -1,7 +1,51 @@ ### 1.0.0 (unreleased) -**Fixes**: +**BREAKING CHANGES**: +* Bulk and multi-bulk lengths less than -1 or greater than `LLONG_MAX` are now + protocol errors. This is consistent with the RESP specification. On 32-bit + platforms, the upper bound is lowered to `SIZE_MAX`. + +* Change `redisReply.len` to `size_t`, as it denotes the the size of a string + + User code should compare this to `size_t` values as well. If it was used to + compare to other values, casting might be necessary or can be removed, if + casting was applied before. 
+ +### 0.14.0 (2018-09-25) + +* Make string2ll static to fix conflict with Redis (Tom Lee [c3188b]) +* Use -dynamiclib instead of -shared for OSX (Ryan Schmidt [a65537]) +* Use string2ll from Redis w/added tests (Michael Grunder [7bef04, 60f622]) +* Makefile - OSX compilation fixes (Ryan Schmidt [881fcb, 0e9af8]) +* Remove redundant NULL checks (Justin Brewer [54acc8, 58e6b8]) +* Fix bulk and multi-bulk length truncation (Justin Brewer [109197]) +* Fix SIGSEGV in OpenBSD by checking for NULL before calling freeaddrinfo (Justin Brewer [546d94]) +* Several POSIX compatibility fixes (Justin Brewer [bbeab8, 49bbaa, d1c1b6]) +* Makefile - Compatibility fixes (Dimitri Vorobiev [3238cf, 12a9d1]) +* Makefile - Fix make install on FreeBSD (Zach Shipko [a2ef2b]) +* Makefile - don't assume $(INSTALL) is cp (Igor Gnatenko [725a96]) +* Separate side-effect causing function from assert and small cleanup (amallia [b46413, 3c3234]) +* Don't send negative values to `__redisAsyncCommand` (Frederik Deweerdt [706129]) +* Fix leak if setsockopt fails (Frederik Deweerdt [e21c9c]) +* Fix libevent leak (zfz [515228]) +* Clean up GCC warning (Ichito Nagata [2ec774]) +* Keep track of errno in `__redisSetErrorFromErrno()` as snprintf may use it (Jin Qing [25cd88]) +* Solaris compilation fix (Donald Whyte [41b07d]) +* Reorder linker arguments when building examples (Tustfarm-heart [06eedd]) +* Keep track of subscriptions in case of rapid subscribe/unsubscribe (Hyungjin Kim [073dc8, be76c5, d46999]) +* libuv use after free fix (Paul Scott [cbb956]) +* Properly close socket fd on reconnect attempt (WSL [64d1ec]) +* Skip valgrind in OSX tests (Jan-Erik Rediger [9deb78]) +* Various updates for Travis testing OSX (Ted Nyman [fa3774, 16a459, bc0ea5]) +* Update libevent (Chris Xin [386802]) +* Change sds.h for building in C++ projects (Ali Volkan ATLI [f5b32e]) +* Use proper format specifier in redisFormatSdsCommandArgv (Paulino Huerta, Jan-Erik Rediger [360a06, 8655a6]) +* Better handling of NULL 
reply in example code (Jan-Erik Rediger [1b8ed3]) +* Prevent overflow when formatting an error (Jan-Erik Rediger [0335cb]) +* Compatibility fix for strerror_r (Tom Lee [bb1747]) +* Properly detect integer parse/overflow errors (Justin Brewer [93421f]) +* Adds CI for Windows and cygwin fixes (owent, [6c53d6, 6c3e40]) * Catch a buffer overflow when formatting the error message * Import latest upstream sds. This breaks applications that are linked against the old hiredis v0.13 * Fix warnings, when compiled with -Wshadow @@ -9,11 +53,6 @@ **BREAKING CHANGES**: -* Change `redisReply.len` to `size_t`, as it denotes the the size of a string - -User code should compare this to `size_t` values as well. -If it was used to compare to other values, casting might be necessary or can be removed, if casting was applied before. - * Remove backwards compatibility macro's This removes the following old function aliases, use the new name now: @@ -94,7 +133,7 @@ The parser, standalone since v0.12.0, can now be compiled on Windows * Add IPv6 support -* Remove possiblity of multiple close on same fd +* Remove possibility of multiple close on same fd * Add ability to bind source address on connect diff --git a/deps/hiredis/Makefile b/deps/hiredis/Makefile index 9a4de8360..06ca99468 100644 --- a/deps/hiredis/Makefile +++ b/deps/hiredis/Makefile @@ -36,13 +36,13 @@ endef export REDIS_TEST_CONFIG # Fallback to gcc when $CC is not in $PATH. 
-CC:=$(shell sh -c 'type $(CC) >/dev/null 2>/dev/null && echo $(CC) || echo gcc') -CXX:=$(shell sh -c 'type $(CXX) >/dev/null 2>/dev/null && echo $(CXX) || echo g++') +CC:=$(shell sh -c 'type $${CC%% *} >/dev/null 2>/dev/null && echo $(CC) || echo gcc') +CXX:=$(shell sh -c 'type $${CXX%% *} >/dev/null 2>/dev/null && echo $(CXX) || echo g++') OPTIMIZATION?=-O3 WARNINGS=-Wall -W -Wstrict-prototypes -Wwrite-strings DEBUG_FLAGS?= -g -ggdb -REAL_CFLAGS=$(OPTIMIZATION) -fPIC $(CFLAGS) $(WARNINGS) $(DEBUG_FLAGS) $(ARCH) -REAL_LDFLAGS=$(LDFLAGS) $(ARCH) +REAL_CFLAGS=$(OPTIMIZATION) -fPIC $(CPPFLAGS) $(CFLAGS) $(WARNINGS) $(DEBUG_FLAGS) +REAL_LDFLAGS=$(LDFLAGS) DYLIBSUFFIX=so STLIBSUFFIX=a @@ -58,12 +58,11 @@ uname_S := $(shell sh -c 'uname -s 2>/dev/null || echo not') ifeq ($(uname_S),SunOS) REAL_LDFLAGS+= -ldl -lnsl -lsocket DYLIB_MAKE_CMD=$(CC) -G -o $(DYLIBNAME) -h $(DYLIB_MINOR_NAME) $(LDFLAGS) - INSTALL= cp -r endif ifeq ($(uname_S),Darwin) DYLIBSUFFIX=dylib DYLIB_MINOR_NAME=$(LIBNAME).$(HIREDIS_SONAME).$(DYLIBSUFFIX) - DYLIB_MAKE_CMD=$(CC) -shared -Wl,-install_name,$(DYLIB_MINOR_NAME) -o $(DYLIBNAME) $(LDFLAGS) + DYLIB_MAKE_CMD=$(CC) -dynamiclib -Wl,-install_name,$(PREFIX)/$(LIBRARY_PATH)/$(DYLIB_MINOR_NAME) -o $(DYLIBNAME) $(LDFLAGS) endif all: $(DYLIBNAME) $(STLIBNAME) hiredis-test $(PKGCONFNAME) @@ -94,7 +93,7 @@ hiredis-example-libev: examples/example-libev.c adapters/libev.h $(STLIBNAME) $(CC) -o examples/$@ $(REAL_CFLAGS) $(REAL_LDFLAGS) -I. $< -lev $(STLIBNAME) hiredis-example-glib: examples/example-glib.c adapters/glib.h $(STLIBNAME) - $(CC) -o examples/$@ $(REAL_CFLAGS) $(REAL_LDFLAGS) $(shell pkg-config --cflags --libs glib-2.0) -I. $< $(STLIBNAME) + $(CC) -o examples/$@ $(REAL_CFLAGS) $(REAL_LDFLAGS) -I. $< $(shell pkg-config --cflags --libs glib-2.0) $(STLIBNAME) hiredis-example-ivykis: examples/example-ivykis.c adapters/ivykis.h $(STLIBNAME) $(CC) -o examples/$@ $(REAL_CFLAGS) $(REAL_LDFLAGS) -I. 
$< -livykis $(STLIBNAME) @@ -161,11 +160,7 @@ clean: dep: $(CC) -MM *.c -ifeq ($(uname_S),SunOS) - INSTALL?= cp -r -endif - -INSTALL?= cp -a +INSTALL?= cp -pPR $(PKGCONFNAME): hiredis.h @echo "Generating $@ for pkgconfig..." @@ -181,8 +176,9 @@ $(PKGCONFNAME): hiredis.h @echo Cflags: -I\$${includedir} -D_FILE_OFFSET_BITS=64 >> $@ install: $(DYLIBNAME) $(STLIBNAME) $(PKGCONFNAME) - mkdir -p $(INSTALL_INCLUDE_PATH) $(INSTALL_LIBRARY_PATH) - $(INSTALL) hiredis.h async.h read.h sds.h adapters $(INSTALL_INCLUDE_PATH) + mkdir -p $(INSTALL_INCLUDE_PATH) $(INSTALL_INCLUDE_PATH)/adapters $(INSTALL_LIBRARY_PATH) + $(INSTALL) hiredis.h async.h read.h sds.h $(INSTALL_INCLUDE_PATH) + $(INSTALL) adapters/*.h $(INSTALL_INCLUDE_PATH)/adapters $(INSTALL) $(DYLIBNAME) $(INSTALL_LIBRARY_PATH)/$(DYLIB_MINOR_NAME) cd $(INSTALL_LIBRARY_PATH) && ln -sf $(DYLIB_MINOR_NAME) $(DYLIBNAME) $(INSTALL) $(STLIBNAME) $(INSTALL_LIBRARY_PATH) diff --git a/deps/hiredis/adapters/libevent.h b/deps/hiredis/adapters/libevent.h index 273d8b2dd..7d2bef180 100644 --- a/deps/hiredis/adapters/libevent.h +++ b/deps/hiredis/adapters/libevent.h @@ -73,8 +73,8 @@ static void redisLibeventDelWrite(void *privdata) { static void redisLibeventCleanup(void *privdata) { redisLibeventEvents *e = (redisLibeventEvents*)privdata; - event_del(e->rev); - event_del(e->wev); + event_free(e->rev); + event_free(e->wev); free(e); } diff --git a/deps/hiredis/adapters/libuv.h b/deps/hiredis/adapters/libuv.h index ff08c25e1..39ef7cf5e 100644 --- a/deps/hiredis/adapters/libuv.h +++ b/deps/hiredis/adapters/libuv.h @@ -15,15 +15,12 @@ typedef struct redisLibuvEvents { static void redisLibuvPoll(uv_poll_t* handle, int status, int events) { redisLibuvEvents* p = (redisLibuvEvents*)handle->data; + int ev = (status ? 
p->events : events); - if (status != 0) { - return; - } - - if (p->context != NULL && (events & UV_READABLE)) { + if (p->context != NULL && (ev & UV_READABLE)) { redisAsyncHandleRead(p->context); } - if (p->context != NULL && (events & UV_WRITABLE)) { + if (p->context != NULL && (ev & UV_WRITABLE)) { redisAsyncHandleWrite(p->context); } } diff --git a/deps/hiredis/appveyor.yml b/deps/hiredis/appveyor.yml index 06bbef117..819efbd58 100644 --- a/deps/hiredis/appveyor.yml +++ b/deps/hiredis/appveyor.yml @@ -1,24 +1,13 @@ # Appveyor configuration file for CI build of hiredis on Windows (under Cygwin) environment: matrix: - - CYG_ROOT: C:\cygwin64 - CYG_SETUP: setup-x86_64.exe - CYG_MIRROR: http://cygwin.mirror.constant.com - CYG_CACHE: C:\cygwin64\var\cache\setup - CYG_BASH: C:\cygwin64\bin\bash + - CYG_BASH: C:\cygwin64\bin\bash CC: gcc - - CYG_ROOT: C:\cygwin - CYG_SETUP: setup-x86.exe - CYG_MIRROR: http://cygwin.mirror.constant.com - CYG_CACHE: C:\cygwin\var\cache\setup - CYG_BASH: C:\cygwin\bin\bash + - CYG_BASH: C:\cygwin\bin\bash CC: gcc TARGET: 32bit TARGET_VARS: 32bit-vars -# Cache Cygwin files to speed up build -cache: - - '%CYG_CACHE%' clone_depth: 1 # Attempt to ensure we don't try to convert line endings to Win32 CRLF as this will cause build to fail @@ -27,8 +16,6 @@ init: # Install needed build dependencies install: - - ps: 'Start-FileDownload "http://cygwin.com/$env:CYG_SETUP" -FileName "$env:CYG_SETUP"' - - '%CYG_SETUP% --quiet-mode --no-shortcuts --only-site --root "%CYG_ROOT%" --site "%CYG_MIRROR%" --local-package-dir "%CYG_CACHE%" --packages automake,bison,gcc-core,libtool,make,gettext-devel,gettext,intltool,pkg-config,clang,llvm > NUL 2>&1' - '%CYG_BASH% -lc "cygcheck -dc cygwin"' build_script: diff --git a/deps/hiredis/async.c b/deps/hiredis/async.c index d955203f8..0cecd30d9 100644 --- a/deps/hiredis/async.c +++ b/deps/hiredis/async.c @@ -336,7 +336,8 @@ static void __redisAsyncDisconnect(redisAsyncContext *ac) { if (ac->err == 0) { /* For clean 
disconnects, there should be no pending callbacks. */ - assert(__redisShiftCallback(&ac->replies,NULL) == REDIS_ERR); + int ret = __redisShiftCallback(&ac->replies,NULL); + assert(ret == REDIS_ERR); } else { /* Disconnection is caused by an error, make sure that pending * callbacks cannot call new commands. */ @@ -364,6 +365,7 @@ void redisAsyncDisconnect(redisAsyncContext *ac) { static int __redisGetSubscribeCallback(redisAsyncContext *ac, redisReply *reply, redisCallback *dstcb) { redisContext *c = &(ac->c); dict *callbacks; + redisCallback *cb; dictEntry *de; int pvariant; char *stype; @@ -387,16 +389,28 @@ static int __redisGetSubscribeCallback(redisAsyncContext *ac, redisReply *reply, sname = sdsnewlen(reply->element[1]->str,reply->element[1]->len); de = dictFind(callbacks,sname); if (de != NULL) { - memcpy(dstcb,dictGetEntryVal(de),sizeof(*dstcb)); + cb = dictGetEntryVal(de); + + /* If this is an subscribe reply decrease pending counter. */ + if (strcasecmp(stype+pvariant,"subscribe") == 0) { + cb->pending_subs -= 1; + } + + memcpy(dstcb,cb,sizeof(*dstcb)); /* If this is an unsubscribe message, remove it. */ if (strcasecmp(stype+pvariant,"unsubscribe") == 0) { - dictDelete(callbacks,sname); + if (cb->pending_subs == 0) + dictDelete(callbacks,sname); /* If this was the last unsubscribe message, revert to * non-subscribe mode. */ assert(reply->element[2]->type == REDIS_REPLY_INTEGER); - if (reply->element[2]->integer == 0) + + /* Unset subscribed flag only when no pipelined pending subscribe. 
*/ + if (reply->element[2]->integer == 0 + && dictSize(ac->sub.channels) == 0 + && dictSize(ac->sub.patterns) == 0) c->flags &= ~REDIS_SUBSCRIBED; } } @@ -410,7 +424,7 @@ static int __redisGetSubscribeCallback(redisAsyncContext *ac, redisReply *reply, void redisProcessCallbacks(redisAsyncContext *ac) { redisContext *c = &(ac->c); - redisCallback cb = {NULL, NULL, NULL}; + redisCallback cb = {NULL, NULL, 0, NULL}; void *reply = NULL; int status; @@ -492,22 +506,22 @@ void redisProcessCallbacks(redisAsyncContext *ac) { * write event fires. When connecting was not successful, the connect callback * is called with a REDIS_ERR status and the context is free'd. */ static int __redisAsyncHandleConnect(redisAsyncContext *ac) { + int completed = 0; redisContext *c = &(ac->c); - - if (redisCheckSocketError(c) == REDIS_ERR) { - /* Try again later when connect(2) is still in progress. */ - if (errno == EINPROGRESS) - return REDIS_OK; - - if (ac->onConnect) ac->onConnect(ac,REDIS_ERR); + if (redisCheckConnectDone(c, &completed) == REDIS_ERR) { + /* Error! */ + redisCheckSocketError(c); + if (ac->onConnect) ac->onConnect(ac, REDIS_ERR); __redisAsyncDisconnect(ac); return REDIS_ERR; + } else if (completed == 1) { + /* connected! */ + if (ac->onConnect) ac->onConnect(ac, REDIS_OK); + c->flags |= REDIS_CONNECTED; + return REDIS_OK; + } else { + return REDIS_OK; } - - /* Mark context as connected. */ - c->flags |= REDIS_CONNECTED; - if (ac->onConnect) ac->onConnect(ac,REDIS_OK); - return REDIS_OK; } /* This function should be called when the socket is readable. 
@@ -583,6 +597,9 @@ static const char *nextArgument(const char *start, const char **str, size_t *len static int __redisAsyncCommand(redisAsyncContext *ac, redisCallbackFn *fn, void *privdata, const char *cmd, size_t len) { redisContext *c = &(ac->c); redisCallback cb; + struct dict *cbdict; + dictEntry *de; + redisCallback *existcb; int pvariant, hasnext; const char *cstr, *astr; size_t clen, alen; @@ -596,6 +613,7 @@ static int __redisAsyncCommand(redisAsyncContext *ac, redisCallbackFn *fn, void /* Setup callback */ cb.fn = fn; cb.privdata = privdata; + cb.pending_subs = 1; /* Find out which command will be appended. */ p = nextArgument(cmd,&cstr,&clen); @@ -612,9 +630,18 @@ static int __redisAsyncCommand(redisAsyncContext *ac, redisCallbackFn *fn, void while ((p = nextArgument(p,&astr,&alen)) != NULL) { sname = sdsnewlen(astr,alen); if (pvariant) - ret = dictReplace(ac->sub.patterns,sname,&cb); + cbdict = ac->sub.patterns; else - ret = dictReplace(ac->sub.channels,sname,&cb); + cbdict = ac->sub.channels; + + de = dictFind(cbdict,sname); + + if (de != NULL) { + existcb = dictGetEntryVal(de); + cb.pending_subs = existcb->pending_subs + 1; + } + + ret = dictReplace(cbdict,sname,&cb); if (ret == 0) sdsfree(sname); } @@ -676,6 +703,8 @@ int redisAsyncCommandArgv(redisAsyncContext *ac, redisCallbackFn *fn, void *priv int len; int status; len = redisFormatSdsCommandArgv(&cmd,argc,argv,argvlen); + if (len < 0) + return REDIS_ERR; status = __redisAsyncCommand(ac,fn,privdata,cmd,len); sdsfree(cmd); return status; diff --git a/deps/hiredis/async.h b/deps/hiredis/async.h index 59cbf469b..740555c24 100644 --- a/deps/hiredis/async.h +++ b/deps/hiredis/async.h @@ -45,6 +45,7 @@ typedef void (redisCallbackFn)(struct redisAsyncContext*, void*, void*); typedef struct redisCallback { struct redisCallback *next; /* simple singly linked list */ redisCallbackFn *fn; + int pending_subs; void *privdata; } redisCallback; @@ -92,6 +93,10 @@ typedef struct redisAsyncContext { /* Regular 
command callbacks */ redisCallbackList replies; + /* Address used for connect() */ + struct sockaddr *saddr; + size_t addrlen; + /* Subscription callbacks */ struct { redisCallbackList invalid; diff --git a/deps/hiredis/fmacros.h b/deps/hiredis/fmacros.h index 9a56643df..3227faafd 100644 --- a/deps/hiredis/fmacros.h +++ b/deps/hiredis/fmacros.h @@ -1,25 +1,12 @@ #ifndef __HIREDIS_FMACRO_H #define __HIREDIS_FMACRO_H -#if defined(__linux__) -#define _BSD_SOURCE -#define _DEFAULT_SOURCE -#endif - -#if defined(__CYGWIN__) -#include -#endif - -#if defined(__sun__) -#define _POSIX_C_SOURCE 200112L -#else -#if !(defined(__APPLE__) && defined(__MACH__)) && !(defined(__FreeBSD__)) #define _XOPEN_SOURCE 600 -#endif -#endif +#define _POSIX_C_SOURCE 200112L #if defined(__APPLE__) && defined(__MACH__) -#define _OSX +/* Enable TCP_KEEPALIVE */ +#define _DARWIN_C_SOURCE #endif #endif diff --git a/deps/hiredis/hiredis.c b/deps/hiredis/hiredis.c index 18bdfc99c..0947d1ed7 100644 --- a/deps/hiredis/hiredis.c +++ b/deps/hiredis/hiredis.c @@ -47,7 +47,9 @@ static redisReply *createReplyObject(int type); static void *createStringObject(const redisReadTask *task, char *str, size_t len); static void *createArrayObject(const redisReadTask *task, int elements); static void *createIntegerObject(const redisReadTask *task, long long value); +static void *createDoubleObject(const redisReadTask *task, double value, char *str, size_t len); static void *createNilObject(const redisReadTask *task); +static void *createBoolObject(const redisReadTask *task, int bval); /* Default set of functions to build the reply. Keep in mind that such a * function returning NULL is interpreted as OOM. 
*/ @@ -55,7 +57,9 @@ static redisReplyObjectFunctions defaultFunctions = { createStringObject, createArrayObject, createIntegerObject, + createDoubleObject, createNilObject, + createBoolObject, freeReplyObject }; @@ -82,18 +86,19 @@ void freeReplyObject(void *reply) { case REDIS_REPLY_INTEGER: break; /* Nothing to free */ case REDIS_REPLY_ARRAY: + case REDIS_REPLY_MAP: + case REDIS_REPLY_SET: if (r->element != NULL) { for (j = 0; j < r->elements; j++) - if (r->element[j] != NULL) - freeReplyObject(r->element[j]); + freeReplyObject(r->element[j]); free(r->element); } break; case REDIS_REPLY_ERROR: case REDIS_REPLY_STATUS: case REDIS_REPLY_STRING: - if (r->str != NULL) - free(r->str); + case REDIS_REPLY_DOUBLE: + free(r->str); break; } free(r); @@ -125,7 +130,9 @@ static void *createStringObject(const redisReadTask *task, char *str, size_t len if (task->parent) { parent = task->parent->obj; - assert(parent->type == REDIS_REPLY_ARRAY); + assert(parent->type == REDIS_REPLY_ARRAY || + parent->type == REDIS_REPLY_MAP || + parent->type == REDIS_REPLY_SET); parent->element[task->idx] = r; } return r; @@ -134,7 +141,7 @@ static void *createStringObject(const redisReadTask *task, char *str, size_t len static void *createArrayObject(const redisReadTask *task, int elements) { redisReply *r, *parent; - r = createReplyObject(REDIS_REPLY_ARRAY); + r = createReplyObject(task->type); if (r == NULL) return NULL; @@ -150,7 +157,9 @@ static void *createArrayObject(const redisReadTask *task, int elements) { if (task->parent) { parent = task->parent->obj; - assert(parent->type == REDIS_REPLY_ARRAY); + assert(parent->type == REDIS_REPLY_ARRAY || + parent->type == REDIS_REPLY_MAP || + parent->type == REDIS_REPLY_SET); parent->element[task->idx] = r; } return r; @@ -167,7 +176,41 @@ static void *createIntegerObject(const redisReadTask *task, long long value) { if (task->parent) { parent = task->parent->obj; - assert(parent->type == REDIS_REPLY_ARRAY); + assert(parent->type == 
REDIS_REPLY_ARRAY || + parent->type == REDIS_REPLY_MAP || + parent->type == REDIS_REPLY_SET); + parent->element[task->idx] = r; + } + return r; +} + +static void *createDoubleObject(const redisReadTask *task, double value, char *str, size_t len) { + redisReply *r, *parent; + + r = createReplyObject(REDIS_REPLY_DOUBLE); + if (r == NULL) + return NULL; + + r->dval = value; + r->str = malloc(len+1); + if (r->str == NULL) { + freeReplyObject(r); + return NULL; + } + + /* The double reply also has the original protocol string representing a + * double as a null terminated string. This way the caller does not need + * to format back for string conversion, especially since Redis does efforts + * to make the string more human readable avoiding the calssical double + * decimal string conversion artifacts. */ + memcpy(r->str, str, len); + r->str[len] = '\0'; + + if (task->parent) { + parent = task->parent->obj; + assert(parent->type == REDIS_REPLY_ARRAY || + parent->type == REDIS_REPLY_MAP || + parent->type == REDIS_REPLY_SET); parent->element[task->idx] = r; } return r; @@ -182,7 +225,28 @@ static void *createNilObject(const redisReadTask *task) { if (task->parent) { parent = task->parent->obj; - assert(parent->type == REDIS_REPLY_ARRAY); + assert(parent->type == REDIS_REPLY_ARRAY || + parent->type == REDIS_REPLY_MAP || + parent->type == REDIS_REPLY_SET); + parent->element[task->idx] = r; + } + return r; +} + +static void *createBoolObject(const redisReadTask *task, int bval) { + redisReply *r, *parent; + + r = createReplyObject(REDIS_REPLY_BOOL); + if (r == NULL) + return NULL; + + r->integer = bval != 0; + + if (task->parent) { + parent = task->parent->obj; + assert(parent->type == REDIS_REPLY_ARRAY || + parent->type == REDIS_REPLY_MAP || + parent->type == REDIS_REPLY_SET); parent->element[task->idx] = r; } return r; @@ -432,11 +496,7 @@ cleanup: } sdsfree(curarg); - - /* No need to check cmd since it is the last statement that can fail, - * but do it anyway to be as 
defensive as possible. */ - if (cmd != NULL) - free(cmd); + free(cmd); return error_type; } @@ -581,7 +641,7 @@ void __redisSetError(redisContext *c, int type, const char *str) { } else { /* Only REDIS_ERR_IO may lack a description! */ assert(type == REDIS_ERR_IO); - __redis_strerror_r(errno, c->errstr, sizeof(c->errstr)); + strerror_r(errno, c->errstr, sizeof(c->errstr)); } } @@ -596,14 +656,8 @@ static redisContext *redisContextInit(void) { if (c == NULL) return NULL; - c->err = 0; - c->errstr[0] = '\0'; c->obuf = sdsempty(); c->reader = redisReaderCreate(); - c->tcp.host = NULL; - c->tcp.source_addr = NULL; - c->unix_sock.path = NULL; - c->timeout = NULL; if (c->obuf == NULL || c->reader == NULL) { redisFree(c); @@ -618,18 +672,14 @@ void redisFree(redisContext *c) { return; if (c->fd > 0) close(c->fd); - if (c->obuf != NULL) - sdsfree(c->obuf); - if (c->reader != NULL) - redisReaderFree(c->reader); - if (c->tcp.host) - free(c->tcp.host); - if (c->tcp.source_addr) - free(c->tcp.source_addr); - if (c->unix_sock.path) - free(c->unix_sock.path); - if (c->timeout) - free(c->timeout); + + sdsfree(c->obuf); + redisReaderFree(c->reader); + free(c->tcp.host); + free(c->tcp.source_addr); + free(c->unix_sock.path); + free(c->timeout); + free(c->saddr); free(c); } @@ -710,6 +760,8 @@ redisContext *redisConnectNonBlock(const char *ip, int port) { redisContext *redisConnectBindNonBlock(const char *ip, int port, const char *source_addr) { redisContext *c = redisContextInit(); + if (c == NULL) + return NULL; c->flags &= ~REDIS_BLOCK; redisContextConnectBindTcp(c,ip,port,NULL,source_addr); return c; @@ -718,6 +770,8 @@ redisContext *redisConnectBindNonBlock(const char *ip, int port, redisContext *redisConnectBindNonBlockWithReuse(const char *ip, int port, const char *source_addr) { redisContext *c = redisContextInit(); + if (c == NULL) + return NULL; c->flags &= ~REDIS_BLOCK; c->flags |= REDIS_REUSEADDR; redisContextConnectBindTcp(c,ip,port,NULL,source_addr); @@ -789,7 +843,7 
@@ int redisEnableKeepAlive(redisContext *c) { /* Use this function to handle a read event on the descriptor. It will try * and read some bytes from the socket and feed them to the reply parser. * - * After this function is called, you may use redisContextReadReply to + * After this function is called, you may use redisGetReplyFromReader to * see if there is a reply available. */ int redisBufferRead(redisContext *c) { char buf[1024*16]; @@ -1007,9 +1061,8 @@ void *redisvCommand(redisContext *c, const char *format, va_list ap) { void *redisCommand(redisContext *c, const char *format, ...) { va_list ap; - void *reply = NULL; va_start(ap,format); - reply = redisvCommand(c,format,ap); + void *reply = redisvCommand(c,format,ap); va_end(ap); return reply; } diff --git a/deps/hiredis/hiredis.h b/deps/hiredis/hiredis.h index 423d5e504..47d7982e9 100644 --- a/deps/hiredis/hiredis.h +++ b/deps/hiredis/hiredis.h @@ -40,9 +40,9 @@ #include "sds.h" /* for sds */ #define HIREDIS_MAJOR 0 -#define HIREDIS_MINOR 13 -#define HIREDIS_PATCH 3 -#define HIREDIS_SONAME 0.13 +#define HIREDIS_MINOR 14 +#define HIREDIS_PATCH 0 +#define HIREDIS_SONAME 0.14 /* Connection type can be blocking or non-blocking and is set in the * least significant bit of the flags field in redisContext. */ @@ -80,30 +80,6 @@ * SO_REUSEADDR is being used. */ #define REDIS_CONNECT_RETRIES 10 -/* strerror_r has two completely different prototypes and behaviors - * depending on system issues, so we need to operate on the error buffer - * differently depending on which strerror_r we're using. */ -#ifndef _GNU_SOURCE -/* "regular" POSIX strerror_r that does the right thing. */ -#define __redis_strerror_r(errno, buf, len) \ - do { \ - strerror_r((errno), (buf), (len)); \ - } while (0) -#else -/* "bad" GNU strerror_r we need to clean up after. 
*/ -#define __redis_strerror_r(errno, buf, len) \ - do { \ - char *err_str = strerror_r((errno), (buf), (len)); \ - /* If return value _isn't_ the start of the buffer we passed in, \ - * then GNU strerror_r returned an internal static buffer and we \ - * need to copy the result into our private buffer. */ \ - if (err_str != (buf)) { \ - strncpy((buf), err_str, ((len) - 1)); \ - buf[(len)-1] = '\0'; \ - } \ - } while (0) -#endif - #ifdef __cplusplus extern "C" { #endif @@ -112,8 +88,10 @@ extern "C" { typedef struct redisReply { int type; /* REDIS_REPLY_* */ long long integer; /* The integer when type is REDIS_REPLY_INTEGER */ + double dval; /* The double when type is REDIS_REPLY_DOUBLE */ size_t len; /* Length of string */ - char *str; /* Used for both REDIS_REPLY_ERROR and REDIS_REPLY_STRING */ + char *str; /* Used for REDIS_REPLY_ERROR, REDIS_REPLY_STRING + and REDIS_REPLY_DOUBLE (in additionl to dval). */ size_t elements; /* number of elements, for REDIS_REPLY_ARRAY */ struct redisReply **element; /* elements vector for REDIS_REPLY_ARRAY */ } redisReply; @@ -158,6 +136,9 @@ typedef struct redisContext { char *path; } unix_sock; + /* For non-blocking connect */ + struct sockadr *saddr; + size_t addrlen; } redisContext; redisContext *redisConnect(const char *ip, int port); diff --git a/deps/hiredis/net.c b/deps/hiredis/net.c index 7d4120985..a4b3abc6d 100644 --- a/deps/hiredis/net.c +++ b/deps/hiredis/net.c @@ -65,12 +65,13 @@ static void redisContextCloseFd(redisContext *c) { } static void __redisSetErrorFromErrno(redisContext *c, int type, const char *prefix) { + int errorno = errno; /* snprintf() may change errno */ char buf[128] = { 0 }; size_t len = 0; if (prefix != NULL) len = snprintf(buf,sizeof(buf),"%s: ",prefix); - __redis_strerror_r(errno, (char *)(buf + len), sizeof(buf) - len); + strerror_r(errorno, (char *)(buf + len), sizeof(buf) - len); __redisSetError(c,type,buf); } @@ -135,14 +136,13 @@ int redisKeepAlive(redisContext *c, int interval) { val = 
interval; -#ifdef _OSX +#if defined(__APPLE__) && defined(__MACH__) if (setsockopt(fd, IPPROTO_TCP, TCP_KEEPALIVE, &val, sizeof(val)) < 0) { __redisSetError(c,REDIS_ERR_OTHER,strerror(errno)); return REDIS_ERR; } #else #if defined(__GLIBC__) && !defined(__FreeBSD_kernel__) - val = interval; if (setsockopt(fd, IPPROTO_TCP, TCP_KEEPIDLE, &val, sizeof(val)) < 0) { __redisSetError(c,REDIS_ERR_OTHER,strerror(errno)); return REDIS_ERR; @@ -221,8 +221,10 @@ static int redisContextWaitReady(redisContext *c, long msec) { return REDIS_ERR; } - if (redisCheckSocketError(c) != REDIS_OK) + if (redisCheckConnectDone(c, &res) != REDIS_OK || res == 0) { + redisCheckSocketError(c); return REDIS_ERR; + } return REDIS_OK; } @@ -232,8 +234,28 @@ static int redisContextWaitReady(redisContext *c, long msec) { return REDIS_ERR; } +int redisCheckConnectDone(redisContext *c, int *completed) { + int rc = connect(c->fd, (const struct sockaddr *)c->saddr, c->addrlen); + if (rc == 0) { + *completed = 1; + return REDIS_OK; + } + switch (errno) { + case EISCONN: + *completed = 1; + return REDIS_OK; + case EALREADY: + case EINPROGRESS: + case EWOULDBLOCK: + *completed = 0; + return REDIS_OK; + default: + return REDIS_ERR; + } +} + int redisCheckSocketError(redisContext *c) { - int err = 0; + int err = 0, errno_saved = errno; socklen_t errlen = sizeof(err); if (getsockopt(c->fd, SOL_SOCKET, SO_ERROR, &err, &errlen) == -1) { @@ -241,6 +263,10 @@ int redisCheckSocketError(redisContext *c) { return REDIS_ERR; } + if (err == 0) { + err = errno_saved; + } + if (err) { errno = err; __redisSetErrorFromErrno(c,REDIS_ERR_IO,NULL); @@ -285,8 +311,7 @@ static int _redisContextConnectTcp(redisContext *c, const char *addr, int port, * This is a bit ugly, but atleast it works and doesn't leak memory. 
**/ if (c->tcp.host != addr) { - if (c->tcp.host) - free(c->tcp.host); + free(c->tcp.host); c->tcp.host = strdup(addr); } @@ -299,8 +324,7 @@ static int _redisContextConnectTcp(redisContext *c, const char *addr, int port, memcpy(c->timeout, timeout, sizeof(struct timeval)); } } else { - if (c->timeout) - free(c->timeout); + free(c->timeout); c->timeout = NULL; } @@ -356,6 +380,7 @@ addrretry: n = 1; if (setsockopt(s, SOL_SOCKET, SO_REUSEADDR, (char*) &n, sizeof(n)) < 0) { + freeaddrinfo(bservinfo); goto error; } } @@ -374,12 +399,27 @@ addrretry: goto error; } } + + /* For repeat connection */ + if (c->saddr) { + free(c->saddr); + } + c->saddr = malloc(p->ai_addrlen); + memcpy(c->saddr, p->ai_addr, p->ai_addrlen); + c->addrlen = p->ai_addrlen; + if (connect(s,p->ai_addr,p->ai_addrlen) == -1) { if (errno == EHOSTUNREACH) { redisContextCloseFd(c); continue; - } else if (errno == EINPROGRESS && !blocking) { - /* This is ok. */ + } else if (errno == EINPROGRESS) { + if (blocking) { + goto wait_for_ready; + } + /* This is ok. 
+ * Note that even when it's in blocking mode, we unset blocking + * for `connect()` + */ } else if (errno == EADDRNOTAVAIL && reuseaddr) { if (++reuses >= REDIS_CONNECT_RETRIES) { goto error; @@ -388,6 +428,7 @@ addrretry: goto addrretry; } } else { + wait_for_ready: if (redisContextWaitReady(c,timeout_msec) != REDIS_OK) goto error; } @@ -411,7 +452,10 @@ addrretry: error: rv = REDIS_ERR; end: - freeaddrinfo(servinfo); + if(servinfo) { + freeaddrinfo(servinfo); + } + return rv; // Need to return REDIS_OK if alright } @@ -431,7 +475,7 @@ int redisContextConnectUnix(redisContext *c, const char *path, const struct time struct sockaddr_un sa; long timeout_msec = -1; - if (redisCreateSocket(c,AF_LOCAL) < 0) + if (redisCreateSocket(c,AF_UNIX) < 0) return REDIS_ERR; if (redisSetBlocking(c,0) != REDIS_OK) return REDIS_ERR; @@ -448,15 +492,14 @@ int redisContextConnectUnix(redisContext *c, const char *path, const struct time memcpy(c->timeout, timeout, sizeof(struct timeval)); } } else { - if (c->timeout) - free(c->timeout); + free(c->timeout); c->timeout = NULL; } if (redisContextTimeoutMsec(c,&timeout_msec) != REDIS_OK) return REDIS_ERR; - sa.sun_family = AF_LOCAL; + sa.sun_family = AF_UNIX; strncpy(sa.sun_path,path,sizeof(sa.sun_path)-1); if (connect(c->fd, (struct sockaddr*)&sa, sizeof(sa)) == -1) { if (errno == EINPROGRESS && !blocking) { diff --git a/deps/hiredis/net.h b/deps/hiredis/net.h index 2f1a0bf85..a11594e68 100644 --- a/deps/hiredis/net.h +++ b/deps/hiredis/net.h @@ -37,10 +37,6 @@ #include "hiredis.h" -#if defined(__sun) -#define AF_LOCAL AF_UNIX -#endif - int redisCheckSocketError(redisContext *c); int redisContextSetTimeout(redisContext *c, const struct timeval tv); int redisContextConnectTcp(redisContext *c, const char *addr, int port, const struct timeval *timeout); @@ -49,5 +45,6 @@ int redisContextConnectBindTcp(redisContext *c, const char *addr, int port, const char *source_addr); int redisContextConnectUnix(redisContext *c, const char *path, const 
struct timeval *timeout); int redisKeepAlive(redisContext *c, int interval); +int redisCheckConnectDone(redisContext *c, int *completed); #endif diff --git a/deps/hiredis/read.c b/deps/hiredis/read.c index 50333b534..c75c3435f 100644 --- a/deps/hiredis/read.c +++ b/deps/hiredis/read.c @@ -29,7 +29,6 @@ * POSSIBILITY OF SUCH DAMAGE. */ - #include "fmacros.h" #include #include @@ -39,6 +38,8 @@ #include #include #include +#include +#include #include "read.h" #include "sds.h" @@ -52,11 +53,9 @@ static void __redisReaderSetError(redisReader *r, int type, const char *str) { } /* Clear input buffer on errors. */ - if (r->buf != NULL) { - sdsfree(r->buf); - r->buf = NULL; - r->pos = r->len = 0; - } + sdsfree(r->buf); + r->buf = NULL; + r->pos = r->len = 0; /* Reset task stack. */ r->ridx = -1; @@ -143,33 +142,79 @@ static char *seekNewline(char *s, size_t len) { return NULL; } -/* Read a long long value starting at *s, under the assumption that it will be - * terminated by \r\n. Ambiguously returns -1 for unexpected input. */ -static long long readLongLong(char *s) { - long long v = 0; - int dec, mult = 1; - char c; +/* Convert a string into a long long. Returns REDIS_OK if the string could be + * parsed into a (non-overflowing) long long, REDIS_ERR otherwise. The value + * will be set to the parsed value when appropriate. + * + * Note that this function demands that the string strictly represents + * a long long: no spaces or other characters before or after the string + * representing the number are accepted, nor zeroes at the start if not + * for the string "0" representing the zero number. + * + * Because of its strictness, it is safe to use this function to check if + * you can convert a string into a long long, and obtain back the string + * from the number without any loss in the string representation. 
*/ +static int string2ll(const char *s, size_t slen, long long *value) { + const char *p = s; + size_t plen = 0; + int negative = 0; + unsigned long long v; - if (*s == '-') { - mult = -1; - s++; - } else if (*s == '+') { - mult = 1; - s++; + if (plen == slen) + return REDIS_ERR; + + /* Special case: first and only digit is 0. */ + if (slen == 1 && p[0] == '0') { + if (value != NULL) *value = 0; + return REDIS_OK; } - while ((c = *(s++)) != '\r') { - dec = c - '0'; - if (dec >= 0 && dec < 10) { - v *= 10; - v += dec; - } else { - /* Should not happen... */ - return -1; - } + if (p[0] == '-') { + negative = 1; + p++; plen++; + + /* Abort on only a negative sign. */ + if (plen == slen) + return REDIS_ERR; } - return mult*v; + /* First digit should be 1-9, otherwise the string should just be 0. */ + if (p[0] >= '1' && p[0] <= '9') { + v = p[0]-'0'; + p++; plen++; + } else if (p[0] == '0' && slen == 1) { + *value = 0; + return REDIS_OK; + } else { + return REDIS_ERR; + } + + while (plen < slen && p[0] >= '0' && p[0] <= '9') { + if (v > (ULLONG_MAX / 10)) /* Overflow. */ + return REDIS_ERR; + v *= 10; + + if (v > (ULLONG_MAX - (p[0]-'0'))) /* Overflow. */ + return REDIS_ERR; + v += p[0]-'0'; + + p++; plen++; + } + + /* Return if not all bytes were used. */ + if (plen < slen) + return REDIS_ERR; + + if (negative) { + if (v > ((unsigned long long)(-(LLONG_MIN+1))+1)) /* Overflow. */ + return REDIS_ERR; + if (value != NULL) *value = -v; + } else { + if (v > LLONG_MAX) /* Overflow. 
*/ + return REDIS_ERR; + if (value != NULL) *value = v; + } + return REDIS_OK; } static char *readLine(redisReader *r, int *_len) { @@ -198,7 +243,9 @@ static void moveToNextTask(redisReader *r) { cur = &(r->rstack[r->ridx]); prv = &(r->rstack[r->ridx-1]); - assert(prv->type == REDIS_REPLY_ARRAY); + assert(prv->type == REDIS_REPLY_ARRAY || + prv->type == REDIS_REPLY_MAP || + prv->type == REDIS_REPLY_SET); if (cur->idx == prv->elements-1) { r->ridx--; } else { @@ -220,10 +267,58 @@ static int processLineItem(redisReader *r) { if ((p = readLine(r,&len)) != NULL) { if (cur->type == REDIS_REPLY_INTEGER) { - if (r->fn && r->fn->createInteger) - obj = r->fn->createInteger(cur,readLongLong(p)); - else + if (r->fn && r->fn->createInteger) { + long long v; + if (string2ll(p, len, &v) == REDIS_ERR) { + __redisReaderSetError(r,REDIS_ERR_PROTOCOL, + "Bad integer value"); + return REDIS_ERR; + } + obj = r->fn->createInteger(cur,v); + } else { obj = (void*)REDIS_REPLY_INTEGER; + } + } else if (cur->type == REDIS_REPLY_DOUBLE) { + if (r->fn && r->fn->createDouble) { + char buf[326], *eptr; + double d; + + if ((size_t)len >= sizeof(buf)) { + __redisReaderSetError(r,REDIS_ERR_PROTOCOL, + "Double value is too large"); + return REDIS_ERR; + } + + memcpy(buf,p,len); + buf[len] = '\0'; + + if (strcasecmp(buf,",inf") == 0) { + d = 1.0/0.0; /* Positive infinite. */ + } else if (strcasecmp(buf,",-inf") == 0) { + d = -1.0/0.0; /* Nevative infinite. 
*/ + } else { + d = strtod((char*)buf,&eptr); + if (buf[0] == '\0' || eptr[0] != '\0' || isnan(d)) { + __redisReaderSetError(r,REDIS_ERR_PROTOCOL, + "Bad double value"); + return REDIS_ERR; + } + } + obj = r->fn->createDouble(cur,d,buf,len); + } else { + obj = (void*)REDIS_REPLY_DOUBLE; + } + } else if (cur->type == REDIS_REPLY_NIL) { + if (r->fn && r->fn->createNil) + obj = r->fn->createNil(cur); + else + obj = (void*)REDIS_REPLY_NIL; + } else if (cur->type == REDIS_REPLY_BOOL) { + int bval = p[0] == 't' || p[0] == 'T'; + if (r->fn && r->fn->createBool) + obj = r->fn->createBool(cur,bval); + else + obj = (void*)REDIS_REPLY_BOOL; } else { /* Type will be error or status. */ if (r->fn && r->fn->createString) @@ -250,7 +345,7 @@ static int processBulkItem(redisReader *r) { redisReadTask *cur = &(r->rstack[r->ridx]); void *obj = NULL; char *p, *s; - long len; + long long len; unsigned long bytelen; int success = 0; @@ -259,9 +354,20 @@ static int processBulkItem(redisReader *r) { if (s != NULL) { p = r->buf+r->pos; bytelen = s-(r->buf+r->pos)+2; /* include \r\n */ - len = readLongLong(p); - if (len < 0) { + if (string2ll(p, bytelen - 2, &len) == REDIS_ERR) { + __redisReaderSetError(r,REDIS_ERR_PROTOCOL, + "Bad bulk string length"); + return REDIS_ERR; + } + + if (len < -1 || (LLONG_MAX > SIZE_MAX && len > (long long)SIZE_MAX)) { + __redisReaderSetError(r,REDIS_ERR_PROTOCOL, + "Bulk string length out of range"); + return REDIS_ERR; + } + + if (len == -1) { /* The nil object can always be created. */ if (r->fn && r->fn->createNil) obj = r->fn->createNil(cur); @@ -299,12 +405,13 @@ static int processBulkItem(redisReader *r) { return REDIS_ERR; } -static int processMultiBulkItem(redisReader *r) { +/* Process the array, map and set types. 
*/ +static int processAggregateItem(redisReader *r) { redisReadTask *cur = &(r->rstack[r->ridx]); void *obj; char *p; - long elements; - int root = 0; + long long elements; + int root = 0, len; /* Set error for nested multi bulks with depth > 7 */ if (r->ridx == 8) { @@ -313,10 +420,21 @@ static int processMultiBulkItem(redisReader *r) { return REDIS_ERR; } - if ((p = readLine(r,NULL)) != NULL) { - elements = readLongLong(p); + if ((p = readLine(r,&len)) != NULL) { + if (string2ll(p, len, &elements) == REDIS_ERR) { + __redisReaderSetError(r,REDIS_ERR_PROTOCOL, + "Bad multi-bulk length"); + return REDIS_ERR; + } + root = (r->ridx == 0); + if (elements < -1 || elements > INT_MAX) { + __redisReaderSetError(r,REDIS_ERR_PROTOCOL, + "Multi-bulk length out of range"); + return REDIS_ERR; + } + if (elements == -1) { if (r->fn && r->fn->createNil) obj = r->fn->createNil(cur); @@ -330,10 +448,12 @@ static int processMultiBulkItem(redisReader *r) { moveToNextTask(r); } else { + if (cur->type == REDIS_REPLY_MAP) elements *= 2; + if (r->fn && r->fn->createArray) obj = r->fn->createArray(cur,elements); else - obj = (void*)REDIS_REPLY_ARRAY; + obj = (void*)(long)cur->type; if (obj == NULL) { __redisReaderSetErrorOOM(r); @@ -381,12 +501,27 @@ static int processItem(redisReader *r) { case ':': cur->type = REDIS_REPLY_INTEGER; break; + case ',': + cur->type = REDIS_REPLY_DOUBLE; + break; + case '_': + cur->type = REDIS_REPLY_NIL; + break; case '$': cur->type = REDIS_REPLY_STRING; break; case '*': cur->type = REDIS_REPLY_ARRAY; break; + case '%': + cur->type = REDIS_REPLY_MAP; + break; + case '~': + cur->type = REDIS_REPLY_SET; + break; + case '#': + cur->type = REDIS_REPLY_BOOL; + break; default: __redisReaderSetErrorProtocolByte(r,*p); return REDIS_ERR; @@ -402,11 +537,16 @@ static int processItem(redisReader *r) { case REDIS_REPLY_ERROR: case REDIS_REPLY_STATUS: case REDIS_REPLY_INTEGER: + case REDIS_REPLY_DOUBLE: + case REDIS_REPLY_NIL: + case REDIS_REPLY_BOOL: return 
processLineItem(r); case REDIS_REPLY_STRING: return processBulkItem(r); case REDIS_REPLY_ARRAY: - return processMultiBulkItem(r); + case REDIS_REPLY_MAP: + case REDIS_REPLY_SET: + return processAggregateItem(r); default: assert(NULL); return REDIS_ERR; /* Avoid warning. */ @@ -416,12 +556,10 @@ static int processItem(redisReader *r) { redisReader *redisReaderCreateWithFunctions(redisReplyObjectFunctions *fn) { redisReader *r; - r = calloc(sizeof(redisReader),1); + r = calloc(1,sizeof(redisReader)); if (r == NULL) return NULL; - r->err = 0; - r->errstr[0] = '\0'; r->fn = fn; r->buf = sdsempty(); r->maxbuf = REDIS_READER_MAX_BUF; @@ -435,10 +573,11 @@ redisReader *redisReaderCreateWithFunctions(redisReplyObjectFunctions *fn) { } void redisReaderFree(redisReader *r) { + if (r == NULL) + return; if (r->reply != NULL && r->fn && r->fn->freeObject) r->fn->freeObject(r->reply); - if (r->buf != NULL) - sdsfree(r->buf); + sdsfree(r->buf); free(r); } diff --git a/deps/hiredis/read.h b/deps/hiredis/read.h index 2988aa453..f3d075843 100644 --- a/deps/hiredis/read.h +++ b/deps/hiredis/read.h @@ -53,6 +53,14 @@ #define REDIS_REPLY_NIL 4 #define REDIS_REPLY_STATUS 5 #define REDIS_REPLY_ERROR 6 +#define REDIS_REPLY_DOUBLE 7 +#define REDIS_REPLY_BOOL 8 +#define REDIS_REPLY_VERB 9 +#define REDIS_REPLY_MAP 9 +#define REDIS_REPLY_SET 10 +#define REDIS_REPLY_ATTR 11 +#define REDIS_REPLY_PUSH 12 +#define REDIS_REPLY_BIGNUM 13 #define REDIS_READER_MAX_BUF (1024*16) /* Default max unused reader buffer. 
*/ @@ -73,7 +81,9 @@ typedef struct redisReplyObjectFunctions { void *(*createString)(const redisReadTask*, char*, size_t); void *(*createArray)(const redisReadTask*, int); void *(*createInteger)(const redisReadTask*, long long); + void *(*createDouble)(const redisReadTask*, double, char*, size_t); void *(*createNil)(const redisReadTask*); + void *(*createBool)(const redisReadTask*, int); void (*freeObject)(void*); } redisReplyObjectFunctions; diff --git a/deps/hiredis/sds.c b/deps/hiredis/sds.c index 923ffd82f..44777b10c 100644 --- a/deps/hiredis/sds.c +++ b/deps/hiredis/sds.c @@ -219,7 +219,10 @@ sds sdsMakeRoomFor(sds s, size_t addlen) { hdrlen = sdsHdrSize(type); if (oldtype==type) { newsh = s_realloc(sh, hdrlen+newlen+1); - if (newsh == NULL) return NULL; + if (newsh == NULL) { + s_free(sh); + return NULL; + } s = (char*)newsh+hdrlen; } else { /* Since the header size changes, need to move the string forward, @@ -592,6 +595,7 @@ sds sdscatfmt(sds s, char const *fmt, ...) { /* Make sure there is always space for at least 1 char. */ if (sdsavail(s)==0) { s = sdsMakeRoomFor(s,1); + if (s == NULL) goto fmt_error; } switch(*f) { @@ -605,6 +609,7 @@ sds sdscatfmt(sds s, char const *fmt, ...) { l = (next == 's') ? strlen(str) : sdslen(str); if (sdsavail(s) < l) { s = sdsMakeRoomFor(s,l); + if (s == NULL) goto fmt_error; } memcpy(s+i,str,l); sdsinclen(s,l); @@ -621,6 +626,7 @@ sds sdscatfmt(sds s, char const *fmt, ...) { l = sdsll2str(buf,num); if (sdsavail(s) < l) { s = sdsMakeRoomFor(s,l); + if (s == NULL) goto fmt_error; } memcpy(s+i,buf,l); sdsinclen(s,l); @@ -638,6 +644,7 @@ sds sdscatfmt(sds s, char const *fmt, ...) { l = sdsull2str(buf,unum); if (sdsavail(s) < l) { s = sdsMakeRoomFor(s,l); + if (s == NULL) goto fmt_error; } memcpy(s+i,buf,l); sdsinclen(s,l); @@ -662,6 +669,10 @@ sds sdscatfmt(sds s, char const *fmt, ...) 
{ /* Add null-term */ s[i] = '\0'; return s; + +fmt_error: + va_end(ap); + return NULL; } /* Remove the part of the string from left and from right composed just of @@ -1018,10 +1029,18 @@ sds *sdssplitargs(const char *line, int *argc) { if (*p) p++; } /* add the token to the vector */ - vector = s_realloc(vector,((*argc)+1)*sizeof(char*)); - vector[*argc] = current; - (*argc)++; - current = NULL; + { + char **new_vector = s_realloc(vector,((*argc)+1)*sizeof(char*)); + if (new_vector == NULL) { + s_free(vector); + return NULL; + } + + vector = new_vector; + vector[*argc] = current; + (*argc)++; + current = NULL; + } } else { /* Even on empty input string return something not NULL. */ if (vector == NULL) vector = s_malloc(sizeof(void*)); diff --git a/deps/hiredis/test.c b/deps/hiredis/test.c index a23d60676..79cff4308 100644 --- a/deps/hiredis/test.c +++ b/deps/hiredis/test.c @@ -3,7 +3,9 @@ #include #include #include +#include #include +#include #include #include #include @@ -91,7 +93,7 @@ static int disconnect(redisContext *c, int keep_fd) { return -1; } -static redisContext *connect(struct config config) { +static redisContext *do_connect(struct config config) { redisContext *c = NULL; if (config.type == CONN_TCP) { @@ -248,7 +250,7 @@ static void test_append_formatted_commands(struct config config) { char *cmd; int len; - c = connect(config); + c = do_connect(config); test("Append format command: "); @@ -302,6 +304,82 @@ static void test_reply_reader(void) { strncasecmp(reader->errstr,"No support for",14) == 0); redisReaderFree(reader); + test("Correctly parses LLONG_MAX: "); + reader = redisReaderCreate(); + redisReaderFeed(reader, ":9223372036854775807\r\n",22); + ret = redisReaderGetReply(reader,&reply); + test_cond(ret == REDIS_OK && + ((redisReply*)reply)->type == REDIS_REPLY_INTEGER && + ((redisReply*)reply)->integer == LLONG_MAX); + freeReplyObject(reply); + redisReaderFree(reader); + + test("Set error when > LLONG_MAX: "); + reader = redisReaderCreate(); 
+ redisReaderFeed(reader, ":9223372036854775808\r\n",22); + ret = redisReaderGetReply(reader,&reply); + test_cond(ret == REDIS_ERR && + strcasecmp(reader->errstr,"Bad integer value") == 0); + freeReplyObject(reply); + redisReaderFree(reader); + + test("Correctly parses LLONG_MIN: "); + reader = redisReaderCreate(); + redisReaderFeed(reader, ":-9223372036854775808\r\n",23); + ret = redisReaderGetReply(reader,&reply); + test_cond(ret == REDIS_OK && + ((redisReply*)reply)->type == REDIS_REPLY_INTEGER && + ((redisReply*)reply)->integer == LLONG_MIN); + freeReplyObject(reply); + redisReaderFree(reader); + + test("Set error when < LLONG_MIN: "); + reader = redisReaderCreate(); + redisReaderFeed(reader, ":-9223372036854775809\r\n",23); + ret = redisReaderGetReply(reader,&reply); + test_cond(ret == REDIS_ERR && + strcasecmp(reader->errstr,"Bad integer value") == 0); + freeReplyObject(reply); + redisReaderFree(reader); + + test("Set error when array < -1: "); + reader = redisReaderCreate(); + redisReaderFeed(reader, "*-2\r\n+asdf\r\n",12); + ret = redisReaderGetReply(reader,&reply); + test_cond(ret == REDIS_ERR && + strcasecmp(reader->errstr,"Multi-bulk length out of range") == 0); + freeReplyObject(reply); + redisReaderFree(reader); + + test("Set error when bulk < -1: "); + reader = redisReaderCreate(); + redisReaderFeed(reader, "$-2\r\nasdf\r\n",11); + ret = redisReaderGetReply(reader,&reply); + test_cond(ret == REDIS_ERR && + strcasecmp(reader->errstr,"Bulk string length out of range") == 0); + freeReplyObject(reply); + redisReaderFree(reader); + + test("Set error when array > INT_MAX: "); + reader = redisReaderCreate(); + redisReaderFeed(reader, "*9223372036854775807\r\n+asdf\r\n",29); + ret = redisReaderGetReply(reader,&reply); + test_cond(ret == REDIS_ERR && + strcasecmp(reader->errstr,"Multi-bulk length out of range") == 0); + freeReplyObject(reply); + redisReaderFree(reader); + +#if LLONG_MAX > SIZE_MAX + test("Set error when bulk > SIZE_MAX: "); + reader = 
redisReaderCreate(); + redisReaderFeed(reader, "$9223372036854775807\r\nasdf\r\n",28); + ret = redisReaderGetReply(reader,&reply); + test_cond(ret == REDIS_ERR && + strcasecmp(reader->errstr,"Bulk string length out of range") == 0); + freeReplyObject(reply); + redisReaderFree(reader); +#endif + test("Works with NULL functions for reply: "); reader = redisReaderCreate(); reader->fn = NULL; @@ -358,18 +436,32 @@ static void test_free_null(void) { static void test_blocking_connection_errors(void) { redisContext *c; + struct addrinfo hints = {.ai_family = AF_INET}; + struct addrinfo *ai_tmp = NULL; + const char *bad_domain = "idontexist.com"; - test("Returns error when host cannot be resolved: "); - c = redisConnect((char*)"idontexist.test", 6379); - test_cond(c->err == REDIS_ERR_OTHER && - (strcmp(c->errstr,"Name or service not known") == 0 || - strcmp(c->errstr,"Can't resolve: idontexist.test") == 0 || - strcmp(c->errstr,"nodename nor servname provided, or not known") == 0 || - strcmp(c->errstr,"No address associated with hostname") == 0 || - strcmp(c->errstr,"Temporary failure in name resolution") == 0 || - strcmp(c->errstr,"hostname nor servname provided, or not known") == 0 || - strcmp(c->errstr,"no address associated with name") == 0)); - redisFree(c); + int rv = getaddrinfo(bad_domain, "6379", &hints, &ai_tmp); + if (rv != 0) { + // Address does *not* exist + test("Returns error when host cannot be resolved: "); + // First see if this domain name *actually* resolves to NXDOMAIN + c = redisConnect("dontexist.com", 6379); + test_cond( + c->err == REDIS_ERR_OTHER && + (strcmp(c->errstr, "Name or service not known") == 0 || + strcmp(c->errstr, "Can't resolve: sadkfjaskfjsa.com") == 0 || + strcmp(c->errstr, + "nodename nor servname provided, or not known") == 0 || + strcmp(c->errstr, "No address associated with hostname") == 0 || + strcmp(c->errstr, "Temporary failure in name resolution") == 0 || + strcmp(c->errstr, + "hostname nor servname provided, or not known") 
== 0 || + strcmp(c->errstr, "no address associated with name") == 0)); + redisFree(c); + } else { + printf("Skipping NXDOMAIN test. Found evil ISP!\n"); + freeaddrinfo(ai_tmp); + } test("Returns error when the port is not open: "); c = redisConnect((char*)"localhost", 1); @@ -387,7 +479,7 @@ static void test_blocking_connection(struct config config) { redisContext *c; redisReply *reply; - c = connect(config); + c = do_connect(config); test("Is able to deliver commands: "); reply = redisCommand(c,"PING"); @@ -468,7 +560,7 @@ static void test_blocking_connection_timeouts(struct config config) { const char *cmd = "DEBUG SLEEP 3\r\n"; struct timeval tv; - c = connect(config); + c = do_connect(config); test("Successfully completes a command when the timeout is not exceeded: "); reply = redisCommand(c,"SET foo fast"); freeReplyObject(reply); @@ -480,7 +572,7 @@ static void test_blocking_connection_timeouts(struct config config) { freeReplyObject(reply); disconnect(c, 0); - c = connect(config); + c = do_connect(config); test("Does not return a reply when the command times out: "); s = write(c->fd, cmd, strlen(cmd)); tv.tv_sec = 0; @@ -514,7 +606,7 @@ static void test_blocking_io_errors(struct config config) { int major, minor; /* Connect to target given by config. 
*/ - c = connect(config); + c = do_connect(config); { /* Find out Redis version to determine the path for the next test */ const char *field = "redis_version:"; @@ -549,7 +641,7 @@ static void test_blocking_io_errors(struct config config) { strcmp(c->errstr,"Server closed the connection") == 0); redisFree(c); - c = connect(config); + c = do_connect(config); test("Returns I/O error on socket timeout: "); struct timeval tv = { 0, 1000 }; assert(redisSetTimeout(c,tv) == REDIS_OK); @@ -583,7 +675,7 @@ static void test_invalid_timeout_errors(struct config config) { } static void test_throughput(struct config config) { - redisContext *c = connect(config); + redisContext *c = do_connect(config); redisReply **replies; int i, num; long long t1, t2; @@ -616,6 +708,17 @@ static void test_throughput(struct config config) { free(replies); printf("\t(%dx LRANGE with 500 elements: %.3fs)\n", num, (t2-t1)/1000000.0); + replies = malloc(sizeof(redisReply*)*num); + t1 = usec(); + for (i = 0; i < num; i++) { + replies[i] = redisCommand(c, "INCRBY incrkey %d", 1000000); + assert(replies[i] != NULL && replies[i]->type == REDIS_REPLY_INTEGER); + } + t2 = usec(); + for (i = 0; i < num; i++) freeReplyObject(replies[i]); + free(replies); + printf("\t(%dx INCRBY: %.3fs)\n", num, (t2-t1)/1000000.0); + num = 10000; replies = malloc(sizeof(redisReply*)*num); for (i = 0; i < num; i++) @@ -644,6 +747,19 @@ static void test_throughput(struct config config) { free(replies); printf("\t(%dx LRANGE with 500 elements (pipelined): %.3fs)\n", num, (t2-t1)/1000000.0); + replies = malloc(sizeof(redisReply*)*num); + for (i = 0; i < num; i++) + redisAppendCommand(c,"INCRBY incrkey %d", 1000000); + t1 = usec(); + for (i = 0; i < num; i++) { + assert(redisGetReply(c, (void*)&replies[i]) == REDIS_OK); + assert(replies[i] != NULL && replies[i]->type == REDIS_REPLY_INTEGER); + } + t2 = usec(); + for (i = 0; i < num; i++) freeReplyObject(replies[i]); + free(replies); + printf("\t(%dx INCRBY (pipelined): 
%.3fs)\n", num, (t2-t1)/1000000.0); + disconnect(c, 0); } diff --git a/redis.conf b/redis.conf index 42d24f26e..5ea915905 100644 --- a/redis.conf +++ b/redis.conf @@ -264,59 +264,75 @@ dir ./ ################################# REPLICATION ################################# -# Master-Slave replication. Use slaveof to make a Redis instance a copy of +# Master-Replica replication. Use replicaof to make a Redis instance a copy of # another Redis server. A few things to understand ASAP about Redis replication. # +# +------------------+ +---------------+ +# | Master | ---> | Replica | +# | (receive writes) | | (exact copy) | +# +------------------+ +---------------+ +# # 1) Redis replication is asynchronous, but you can configure a master to # stop accepting writes if it appears to be not connected with at least -# a given number of slaves. -# 2) Redis slaves are able to perform a partial resynchronization with the +# a given number of replicas. +# 2) Redis replicas are able to perform a partial resynchronization with the # master if the replication link is lost for a relatively small amount of # time. You may want to configure the replication backlog size (see the next # sections of this file) with a sensible value depending on your needs. # 3) Replication is automatic and does not need user intervention. After a -# network partition slaves automatically try to reconnect to masters +# network partition replicas automatically try to reconnect to masters # and resynchronize with them. # -# slaveof +# replicaof # If the master is password protected (using the "requirepass" configuration -# directive below) it is possible to tell the slave to authenticate before +# directive below) it is possible to tell the replica to authenticate before # starting the replication synchronization process, otherwise the master will -# refuse the slave request. +# refuse the replica request. 
# # masterauth - -# When a slave loses its connection with the master, or when the replication -# is still in progress, the slave can act in two different ways: # -# 1) if slave-serve-stale-data is set to 'yes' (the default) the slave will +# However this is not enough if you are using Redis ACLs (for Redis version +# 6 or greater), and the default user is not capable of running the PSYNC +# command and/or other commands needed for replication. In this case it's +# better to configure a special user to use with replication, and specify the +# masteruser configuration as such: +# +# masteruser +# +# When masteruser is specified, the replica will authenticate against its +# master using the new AUTH form: AUTH . + +# When a replica loses its connection with the master, or when the replication +# is still in progress, the replica can act in two different ways: +# +# 1) if replica-serve-stale-data is set to 'yes' (the default) the replica will # still reply to client requests, possibly with out of date data, or the # data set may just be empty if this is the first synchronization. # -# 2) if slave-serve-stale-data is set to 'no' the slave will reply with +# 2) if replica-serve-stale-data is set to 'no' the replica will reply with # an error "SYNC with master in progress" to all the kind of commands -# but to INFO, SLAVEOF, AUTH, PING, SHUTDOWN, REPLCONF, ROLE, CONFIG, +# but to INFO, replicaOF, AUTH, PING, SHUTDOWN, REPLCONF, ROLE, CONFIG, # SUBSCRIBE, UNSUBSCRIBE, PSUBSCRIBE, PUNSUBSCRIBE, PUBLISH, PUBSUB, # COMMAND, POST, HOST: and LATENCY. # -slave-serve-stale-data yes +replica-serve-stale-data yes -# You can configure a slave instance to accept writes or not. Writing against -# a slave instance may be useful to store some ephemeral data (because data -# written on a slave will be easily deleted after resync with the master) but +# You can configure a replica instance to accept writes or not. 
Writing against +# a replica instance may be useful to store some ephemeral data (because data +# written on a replica will be easily deleted after resync with the master) but # may also cause problems if clients are writing to it because of a # misconfiguration. # -# Since Redis 2.6 by default slaves are read-only. +# Since Redis 2.6 by default replicas are read-only. # -# Note: read only slaves are not designed to be exposed to untrusted clients +# Note: read only replicas are not designed to be exposed to untrusted clients # on the internet. It's just a protection layer against misuse of the instance. -# Still a read only slave exports by default all the administrative commands +# Still a read only replica exports by default all the administrative commands # such as CONFIG, DEBUG, and so forth. To a limited extent you can improve -# security of read only slaves using 'rename-command' to shadow all the +# security of read only replicas using 'rename-command' to shadow all the # administrative / dangerous commands. -slave-read-only yes +replica-read-only yes # Replication SYNC strategy: disk or socket. # @@ -324,25 +340,25 @@ slave-read-only yes # WARNING: DISKLESS REPLICATION IS EXPERIMENTAL CURRENTLY # ------------------------------------------------------- # -# New slaves and reconnecting slaves that are not able to continue the replication +# New replicas and reconnecting replicas that are not able to continue the replication # process just receiving differences, need to do what is called a "full -# synchronization". An RDB file is transmitted from the master to the slaves. +# synchronization". An RDB file is transmitted from the master to the replicas. # The transmission can happen in two different ways: # # 1) Disk-backed: The Redis master creates a new process that writes the RDB # file on disk. Later the file is transferred by the parent -# process to the slaves incrementally. +# process to the replicas incrementally. 
# 2) Diskless: The Redis master creates a new process that directly writes the -# RDB file to slave sockets, without touching the disk at all. +# RDB file to replica sockets, without touching the disk at all. # -# With disk-backed replication, while the RDB file is generated, more slaves +# With disk-backed replication, while the RDB file is generated, more replicas # can be queued and served with the RDB file as soon as the current child producing # the RDB file finishes its work. With diskless replication instead once -# the transfer starts, new slaves arriving will be queued and a new transfer +# the transfer starts, new replicas arriving will be queued and a new transfer # will start when the current one terminates. # # When diskless replication is used, the master waits a configurable amount of -# time (in seconds) before starting the transfer in the hope that multiple slaves +# time (in seconds) before starting the transfer in the hope that multiple replicas # will arrive and the transfer can be parallelized. # # With slow disks and fast (large bandwidth) networks, diskless replication @@ -351,157 +367,264 @@ repl-diskless-sync no # When diskless replication is enabled, it is possible to configure the delay # the server waits in order to spawn the child that transfers the RDB via socket -# to the slaves. +# to the replicas. # # This is important since once the transfer starts, it is not possible to serve -# new slaves arriving, that will be queued for the next RDB transfer, so the server -# waits a delay in order to let more slaves arrive. +# new replicas arriving, that will be queued for the next RDB transfer, so the server +# waits a delay in order to let more replicas arrive. # # The delay is specified in seconds, and by default is 5 seconds. To disable # it entirely just set it to 0 seconds and the transfer will start ASAP. repl-diskless-sync-delay 5 -# Slaves send PINGs to server in a predefined interval. 
It's possible to change -# this interval with the repl_ping_slave_period option. The default value is 10 +# Replicas send PINGs to server in a predefined interval. It's possible to change +# this interval with the repl-ping-replica-period option. The default value is 10 # seconds. # -# repl-ping-slave-period 10 +# repl-ping-replica-period 10 # The following option sets the replication timeout for: # -# 1) Bulk transfer I/O during SYNC, from the point of view of slave. -# 2) Master timeout from the point of view of slaves (data, pings). -# 3) Slave timeout from the point of view of masters (REPLCONF ACK pings). +# 1) Bulk transfer I/O during SYNC, from the point of view of replica. +# 2) Master timeout from the point of view of replicas (data, pings). +# 3) Replica timeout from the point of view of masters (REPLCONF ACK pings). # # It is important to make sure that this value is greater than the value -# specified for repl-ping-slave-period otherwise a timeout will be detected -# every time there is low traffic between the master and the slave. +# specified for repl-ping-replica-period otherwise a timeout will be detected +# every time there is low traffic between the master and the replica. # # repl-timeout 60 -# Disable TCP_NODELAY on the slave socket after SYNC? +# Disable TCP_NODELAY on the replica socket after SYNC? # # If you select "yes" Redis will use a smaller number of TCP packets and -# less bandwidth to send data to slaves. But this can add a delay for -# the data to appear on the slave side, up to 40 milliseconds with +# less bandwidth to send data to replicas. But this can add a delay for +# the data to appear on the replica side, up to 40 milliseconds with # Linux kernels using a default configuration. # -# If you select "no" the delay for data to appear on the slave side will +# If you select "no" the delay for data to appear on the replica side will # be reduced but more bandwidth will be used for replication.
# # By default we optimize for low latency, but in very high traffic conditions -# or when the master and slaves are many hops away, turning this to "yes" may +# or when the master and replicas are many hops away, turning this to "yes" may # be a good idea. repl-disable-tcp-nodelay no # Set the replication backlog size. The backlog is a buffer that accumulates -# slave data when slaves are disconnected for some time, so that when a slave +# replica data when replicas are disconnected for some time, so that when a replica # wants to reconnect again, often a full resync is not needed, but a partial -# resync is enough, just passing the portion of data the slave missed while +# resync is enough, just passing the portion of data the replica missed while # disconnected. # -# The bigger the replication backlog, the longer the time the slave can be +# The bigger the replication backlog, the longer the time the replica can be # disconnected and later be able to perform a partial resynchronization. # -# The backlog is only allocated once there is at least a slave connected. +# The backlog is only allocated once there is at least a replica connected. # # repl-backlog-size 1mb -# After a master has no longer connected slaves for some time, the backlog +# After a master has no longer connected replicas for some time, the backlog # will be freed. The following option configures the amount of seconds that -# need to elapse, starting from the time the last slave disconnected, for +# need to elapse, starting from the time the last replica disconnected, for # the backlog buffer to be freed. # -# Note that slaves never free the backlog for timeout, since they may be +# Note that replicas never free the backlog for timeout, since they may be # promoted to masters later, and should be able to correctly "partially -# resynchronize" with the slaves: hence they should always accumulate backlog. +# resynchronize" with the replicas: hence they should always accumulate backlog. 
# # A value of 0 means to never release the backlog. # # repl-backlog-ttl 3600 -# The slave priority is an integer number published by Redis in the INFO output. -# It is used by Redis Sentinel in order to select a slave to promote into a +# The replica priority is an integer number published by Redis in the INFO output. +# It is used by Redis Sentinel in order to select a replica to promote into a # master if the master is no longer working correctly. # -# A slave with a low priority number is considered better for promotion, so -# for instance if there are three slaves with priority 10, 100, 25 Sentinel will +# A replica with a low priority number is considered better for promotion, so +# for instance if there are three replicas with priority 10, 100, 25 Sentinel will # pick the one with priority 10, that is the lowest. # -# However a special priority of 0 marks the slave as not able to perform the -# role of master, so a slave with priority of 0 will never be selected by +# However a special priority of 0 marks the replica as not able to perform the +# role of master, so a replica with priority of 0 will never be selected by # Redis Sentinel for promotion. # # By default the priority is 100. -slave-priority 100 +replica-priority 100 # It is possible for a master to stop accepting writes if there are less than -# N slaves connected, having a lag less or equal than M seconds. +# N replicas connected, having a lag less or equal than M seconds. # -# The N slaves need to be in "online" state. +# The N replicas need to be in "online" state. # # The lag in seconds, that must be <= the specified value, is calculated from -# the last ping received from the slave, that is usually sent every second. +# the last ping received from the replica, that is usually sent every second. 
# # This option does not GUARANTEE that N replicas will accept the write, but -# will limit the window of exposure for lost writes in case not enough slaves +# will limit the window of exposure for lost writes in case not enough replicas # are available, to the specified number of seconds. # -# For example to require at least 3 slaves with a lag <= 10 seconds use: +# For example to require at least 3 replicas with a lag <= 10 seconds use: # -# min-slaves-to-write 3 -# min-slaves-max-lag 10 +# min-replicas-to-write 3 +# min-replicas-max-lag 10 # # Setting one or the other to 0 disables the feature. # -# By default min-slaves-to-write is set to 0 (feature disabled) and -# min-slaves-max-lag is set to 10. +# By default min-replicas-to-write is set to 0 (feature disabled) and +# min-replicas-max-lag is set to 10. # A Redis master is able to list the address and port of the attached -# slaves in different ways. For example the "INFO replication" section +# replicas in different ways. For example the "INFO replication" section # offers this information, which is used, among other tools, by -# Redis Sentinel in order to discover slave instances. +# Redis Sentinel in order to discover replica instances. # Another place where this info is available is in the output of the # "ROLE" command of a master. # -# The listed IP and address normally reported by a slave is obtained +# The listed IP and address normally reported by a replica is obtained # in the following way: # # IP: The address is auto detected by checking the peer address -# of the socket used by the slave to connect with the master. +# of the socket used by the replica to connect with the master. # -# Port: The port is communicated by the slave during the replication -# handshake, and is normally the port that the slave is using to -# list for connections. 
+# Port: The port is communicated by the replica during the replication +# handshake, and is normally the port that the replica is using to +# listen for connections. # # However when port forwarding or Network Address Translation (NAT) is -# used, the slave may be actually reachable via different IP and port -# pairs. The following two options can be used by a slave in order to +# used, the replica may be actually reachable via different IP and port +# pairs. The following two options can be used by a replica in order to # report to its master a specific set of IP and port, so that both INFO # and ROLE will report those values. # # There is no need to use both the options if you need to override just # the port or the IP address. # -# slave-announce-ip 5.5.5.5 -# slave-announce-port 1234 +# replica-announce-ip 5.5.5.5 +# replica-announce-port 1234 ################################## SECURITY ################################### -# Require clients to issue AUTH before processing any other -# commands. This might be useful in environments in which you do not trust -# others with access to the host running redis-server. -# -# This should stay commented out for backward compatibility and because most -# people do not need auth (e.g. they run their own servers). -# # Warning: since Redis is pretty fast an outside user can try up to -# 150k passwords per second against a good box. This means that you should -# use a very strong password otherwise it will be very easy to break. +# 1 million passwords per second against a modern box. This means that you +# should use very strong passwords, otherwise they will be very easy to break. +# Note that because the password is really a shared secret between the client +# and the server, and should not be memorized by any human, the password +# can be easily a long string from /dev/urandom or whatever, so by using a +# long and unguessable password no brute force attack will be possible. 
+ +# Redis ACL users are defined in the following format: +# +# user <username> ... acl rules ... +# +# For example: +# +# user worker +@list +@connection ~jobs:* on >ffa9203c493aa99 +# +# The special username "default" is used for new connections. If this user +# has the "nopass" rule, then new connections will be immediately authenticated +# as the "default" user without the need of any password provided via the +# AUTH command. Otherwise if the "default" user is not flagged with "nopass" +# the connections will start in not authenticated state, and will require +# AUTH (or the HELLO command AUTH option) in order to be authenticated and +# start to work. +# +# The ACL rules that describe what a user can do are the following: +# +# on Enable the user: it is possible to authenticate as this user. +# off Disable the user: it's no longer possible to authenticate +# with this user, however the already authenticated connections +# will still work. +# +<command> Allow the execution of that command +# -<command> Disallow the execution of that command +# +@<category> Allow the execution of all the commands in such category. +# Valid categories are like @admin, @set, @sortedset, ... +# and so forth, see the full list in the server.c file where +# the Redis command table is described and defined. +# The special category @all means all the commands, both those currently +# present in the server and those that will be loaded in the future +# via modules. +# +<command>|subcommand Allow a specific subcommand of an otherwise +# disabled command. Note that this form is not +# allowed as negative like -DEBUG|SEGFAULT, but +# only additive starting with "+". +# allcommands Alias for +@all. Note that it implies the ability to execute +# all the future commands loaded via the modules system. +# nocommands Alias for -@all. +# ~<pattern> Add a pattern of keys that can be mentioned as part of +# commands. For instance ~* allows all the keys. The pattern +# is a glob-style pattern like the one of KEYS. +# It is possible to specify multiple patterns.
+# allkeys Alias for ~* +# resetkeys Flush the list of allowed key patterns. +# ><password> Add this password to the list of valid passwords for the user. +# For example >mypass will add "mypass" to the list. +# This directive clears the "nopass" flag (see later). +# <<password> Remove this password from the list of valid passwords. +# nopass All the set passwords of the user are removed, and the user +# is flagged as requiring no password: it means that every +# password will work against this user. If this directive is +# used for the default user, every new connection will be +# immediately authenticated with the default user without +# any explicit AUTH command required. Note that the "resetpass" +# directive will clear this condition. +# resetpass Flush the list of allowed passwords. Moreover removes the +# "nopass" status. After "resetpass" the user has no associated +# passwords and there is no way to authenticate without adding +# some password (or setting it as "nopass" later). +# reset Performs the following actions: resetpass, resetkeys, off, +# -@all. The user returns to the same state it has immediately +# after its creation. +# +# ACL rules can be specified in any order: for instance you can start with +# passwords, then flags, or key patterns. However note that the additive +# and subtractive rules will CHANGE MEANING depending on the ordering. +# For instance see the following example: +# +# user alice on +@all -DEBUG ~* >somepassword +# +# This will allow "alice" to use all the commands with the exception of the +# DEBUG command, since +@all added all the commands to the set of the commands +# alice can use, and later DEBUG was removed. However if we invert the order +# of two ACL rules the result will be different: +# +# user alice on -DEBUG +@all ~* >somepassword +# +# Now DEBUG was removed when alice had no commands yet in the set of allowed +# commands, later all the commands are added, so the user will be able to +# execute everything.
+# +# Basically ACL rules are processed left-to-right. +# +# For more information about ACL configuration please refer to +# the Redis web site at https://redis.io/topics/acl + +# Using an external ACL file +# +# Instead of configuring users here in this file, it is possible to use +# a stand-alone file just listing users. The two methods cannot be mixed: +# if you configure users here and at the same time you activate the external +# ACL file, the server will refuse to start. +# +# The format of the external ACL user file is exactly the same as the +# format that is used inside redis.conf to describe users. +# +# aclfile /etc/redis/users.acl + +# IMPORTANT NOTE: starting with Redis 6 "requirepass" is just a compatibility +# layer on top of the new ACL system. The effect of the option is just to set +# the password for the default user. Clients will still authenticate using +# AUTH <password> as usual, or more explicitly with AUTH default <password> +# if they follow the new protocol: both will work. # # requirepass foobared -# Command renaming. +# Command renaming (DEPRECATED). +# +# ------------------------------------------------------------------------ +# WARNING: avoid using this option if possible. Instead use ACLs to remove +# commands from the default user, and put them only in some admin user you +# create for administrative purposes. +# ------------------------------------------------------------------------ # # It is possible to change the name of dangerous commands in a shared # environment. For instance the CONFIG command may be renamed into something @@ -518,7 +641,7 @@ slave-priority 100 # rename-command CONFIG "" # # Please note that changing the name of commands that are logged into the -# AOF file or transmitted to slaves may cause problems. +# AOF file or transmitted to replicas may cause problems.
################################### CLIENTS #################################### @@ -547,15 +670,15 @@ slave-priority 100 # This option is usually useful when using Redis as an LRU or LFU cache, or to # set a hard memory limit for an instance (using the 'noeviction' policy). # -# WARNING: If you have slaves attached to an instance with maxmemory on, -# the size of the output buffers needed to feed the slaves are subtracted +# WARNING: If you have replicas attached to an instance with maxmemory on, +# the size of the output buffers needed to feed the replicas are subtracted # from the used memory count, so that network problems / resyncs will # not trigger a loop where keys are evicted, and in turn the output -# buffer of slaves is full with DELs of keys evicted triggering the deletion +# buffer of replicas is full with DELs of keys evicted triggering the deletion # of more keys, and so forth until the database is completely emptied. # -# In short... if you have slaves attached it is suggested that you set a lower -# limit for maxmemory so that there is some free RAM on the system for slave +# In short... if you have replicas attached it is suggested that you set a lower +# limit for maxmemory so that there is some free RAM on the system for replica # output buffers (but this is not needed if the policy is 'noeviction'). # # maxmemory @@ -602,6 +725,26 @@ slave-priority 100 # # maxmemory-samples 5 +# Starting from Redis 5, by default a replica will ignore its maxmemory setting +# (unless it is promoted to master after a failover or manually). It means +# that the eviction of keys will be just handled by the master, sending the +# DEL commands to the replica as keys evict in the master side. 
+# +# This behavior ensures that masters and replicas stay consistent, and is usually +# what you want, however if your replica is writable, or you want the replica to have +# a different memory setting, and you are sure all the writes performed to the +# replica are idempotent, then you may change this default (but be sure to understand +# what you are doing). +# +# Note that since the replica by default does not evict, it may end using more +# memory than the one set via maxmemory (there are certain buffers that may +# be larger on the replica, or data structures may sometimes take more memory and so +# forth). So make sure you monitor your replicas and make sure they have enough +# memory to never hit a real out-of-memory condition before the master hits +# the configured maxmemory setting. +# +# replica-ignore-maxmemory yes + ############################# LAZY FREEING #################################### # Redis has two primitives to delete keys. One is called DEL and is a blocking @@ -637,7 +780,7 @@ slave-priority 100 # or SORT with STORE option may delete existing keys. The SET command # itself removes any old content of the specified key in order to replace # it with the specified string. -# 4) During replication, when a slave performs a full resynchronization with +# 4) During replication, when a replica performs a full resynchronization with # its master, the content of the whole database is removed in order to # load the RDB file just transferred. # @@ -649,7 +792,7 @@ slave-priority 100 lazyfree-lazy-eviction no lazyfree-lazy-expire no lazyfree-lazy-server-del no -slave-lazy-flush no +replica-lazy-flush no ############################## APPEND ONLY MODE ############################### @@ -826,42 +969,42 @@ lua-time-limit 5000 # # cluster-node-timeout 15000 -# A slave of a failing master will avoid to start a failover if its data +# A replica of a failing master will avoid to start a failover if its data # looks too old. 
# -# There is no simple way for a slave to actually have an exact measure of +# There is no simple way for a replica to actually have an exact measure of # its "data age", so the following two checks are performed: # -# 1) If there are multiple slaves able to failover, they exchange messages -# in order to try to give an advantage to the slave with the best +# 1) If there are multiple replicas able to failover, they exchange messages +# in order to try to give an advantage to the replica with the best # replication offset (more data from the master processed). -# Slaves will try to get their rank by offset, and apply to the start +# Replicas will try to get their rank by offset, and apply to the start # of the failover a delay proportional to their rank. # -# 2) Every single slave computes the time of the last interaction with +# 2) Every single replica computes the time of the last interaction with # its master. This can be the last ping or command received (if the master # is still in the "connected" state), or the time that elapsed since the # disconnection with the master (if the replication link is currently down). -# If the last interaction is too old, the slave will not try to failover +# If the last interaction is too old, the replica will not try to failover # at all. # -# The point "2" can be tuned by user. Specifically a slave will not perform +# The point "2" can be tuned by user. 
Specifically a replica will not perform # the failover if, since the last interaction with the master, the time # elapsed is greater than: # -# (node-timeout * slave-validity-factor) + repl-ping-slave-period +# (node-timeout * replica-validity-factor) + repl-ping-replica-period # -# So for example if node-timeout is 30 seconds, and the slave-validity-factor -# is 10, and assuming a default repl-ping-slave-period of 10 seconds, the -# slave will not try to failover if it was not able to talk with the master +# So for example if node-timeout is 30 seconds, and the replica-validity-factor +# is 10, and assuming a default repl-ping-replica-period of 10 seconds, the +# replica will not try to failover if it was not able to talk with the master # for longer than 310 seconds. # -# A large slave-validity-factor may allow slaves with too old data to failover +# A large replica-validity-factor may allow replicas with too old data to failover # a master, while a too small value may prevent the cluster from being able to -# elect a slave at all. +# elect a replica at all. # -# For maximum availability, it is possible to set the slave-validity-factor -# to a value of 0, which means, that slaves will always try to failover the +# For maximum availability, it is possible to set the replica-validity-factor +# to a value of 0, which means, that replicas will always try to failover the # master regardless of the last time they interacted with the master. # (However they'll always try to apply a delay proportional to their # offset rank). @@ -869,22 +1012,22 @@ lua-time-limit 5000 # Zero is the only value able to guarantee that when all the partitions heal # the cluster will always be able to continue. # -# cluster-slave-validity-factor 10 +# cluster-replica-validity-factor 10 -# Cluster slaves are able to migrate to orphaned masters, that are masters -# that are left without working slaves. 
This improves the cluster ability +# Cluster replicas are able to migrate to orphaned masters, that are masters +# that are left without working replicas. This improves the cluster ability # to resist to failures as otherwise an orphaned master can't be failed over -# in case of failure if it has no working slaves. +# in case of failure if it has no working replicas. # -# Slaves migrate to orphaned masters only if there are still at least a -# given number of other working slaves for their old master. This number -# is the "migration barrier". A migration barrier of 1 means that a slave -# will migrate only if there is at least 1 other working slave for its master -# and so forth. It usually reflects the number of slaves you want for every +# Replicas migrate to orphaned masters only if there are still at least a +# given number of other working replicas for their old master. This number +# is the "migration barrier". A migration barrier of 1 means that a replica +# will migrate only if there is at least 1 other working replica for its master +# and so forth. It usually reflects the number of replicas you want for every # master in your cluster. # -# Default is 1 (slaves migrate only if their masters remain with at least -# one slave). To disable migration just set it to a very large value. +# Default is 1 (replicas migrate only if their masters remain with at least +# one replica). To disable migration just set it to a very large value. # A value of 0 can be set but is useful only for debugging and dangerous # in production. # @@ -903,7 +1046,7 @@ lua-time-limit 5000 # # cluster-require-full-coverage yes -# This option, when set to yes, prevents slaves from trying to failover its +# This option, when set to yes, prevents replicas from trying to failover its # master during master failures. However the master can still perform a # manual failover, if forced to do so. 
# @@ -911,7 +1054,7 @@ lua-time-limit 5000 # data center operations, where we want one side to never be promoted if not # in the case of a total DC failure. # -# cluster-slave-no-failover no +# cluster-replica-no-failover no # In order to setup your cluster make sure to read the documentation # available at http://redis.io web site. @@ -1040,6 +1183,61 @@ latency-monitor-threshold 0 # specify at least one of K or E, no events will be delivered. notify-keyspace-events "" +############################### GOPHER SERVER ################################# + +# Redis contains an implementation of the Gopher protocol, as specified in +# the RFC 1436 (https://www.ietf.org/rfc/rfc1436.txt). +# +# The Gopher protocol was very popular in the late '90s. It is an alternative +# to the web, and the implementation both server and client side is so simple +# that the Redis server has just 100 lines of code in order to implement this +# support. +# +# What do you do with Gopher nowadays? Well Gopher never *really* died, and +# lately there is a movement to resurrect Gopher's more hierarchical content, +# composed of just plain text documents. Some want a simpler +# internet, others believe that the mainstream internet became too +# controlled, and it's cool to create an alternative space for people that +# want a bit of fresh air. +# +# Anyway for the 10th birthday of Redis, we gave it the Gopher protocol +# as a gift. +# +# --- HOW IT WORKS? --- +# +# The Redis Gopher support uses the inline protocol of Redis, and specifically +# two kinds of inline requests that were anyway illegal: an empty request +# or any request that starts with "/" (there are no Redis commands starting +# with such a slash). Normal RESP2/RESP3 requests are completely out of the +# path of the Gopher protocol implementation and are served as usual.
+# +# If you open a connection to Redis when Gopher is enabled and send it +# a string like "/foo", if there is a key named "/foo" it is served via the +# Gopher protocol. +# +# In order to create a real Gopher "hole" (the name of a Gopher site in Gopher +# parlance), you likely need a script like the following: +# +# https://github.com/antirez/gopher2redis +# +# --- SECURITY WARNING --- +# +# If you plan to put Redis on the internet in a publicly accessible address +# to serve Gopher pages MAKE SURE TO SET A PASSWORD to the instance. +# Once a password is set: +# +# 1. The Gopher server (when enabled, not by default) will still serve +# content via Gopher. +# 2. However other commands cannot be called before the client +# authenticates. +# +# So use the 'requirepass' option to protect your instance. +# +# To enable Gopher support uncomment the following line and set +# the option from no (the default) to yes. +# +# gopher-enabled no + ############################### ADVANCED CONFIG ############################### # Hashes are encoded using a memory efficient data structure when they have a @@ -1145,7 +1343,7 @@ activerehashing yes # The limit can be set differently for the three different classes of clients: # # normal -> normal clients including MONITOR clients -# slave -> slave clients +# replica -> replica clients # pubsub -> clients subscribed to at least one pubsub channel or pattern # # The syntax of every client-output-buffer-limit directive is the following: @@ -1166,12 +1364,12 @@ activerehashing yes # asynchronous clients may create a scenario where data is requested faster # than it can read. # -# Instead there is a default limit for pubsub and slave clients, since -# subscribers and slaves receive data in a push fashion. +# Instead there is a default limit for pubsub and replica clients, since +# subscribers and replicas receive data in a push fashion. # # Both the hard or the soft limit can be disabled by setting them to zero.
client-output-buffer-limit normal 0 0 0 -client-output-buffer-limit slave 256mb 64mb 60 +client-output-buffer-limit replica 256mb 64mb 60 client-output-buffer-limit pubsub 32mb 8mb 60 # Client query buffers accumulate new commands. They are limited to a fixed @@ -1205,6 +1403,22 @@ client-output-buffer-limit pubsub 32mb 8mb 60 # 100 only in environments where very low latency is required. hz 10 +# Normally it is useful to have an HZ value which is proportional to the +# number of clients connected. This is useful in order, for instance, to +# avoid processing too many clients for each background task invocation +# in order to avoid latency spikes. +# +# Since the HZ value is by default conservatively set to 10, Redis +# offers, and enables by default, the ability to use an adaptive HZ value +# which will temporarily rise when there are many connected clients. +# +# When dynamic HZ is enabled, the actual configured HZ will be used +# as a baseline, but multiples of the configured HZ value will be actually +# used as needed once more clients are connected. In this way an idle +# instance will use very little CPU time while a busy instance will be +# more responsive. +dynamic-hz yes + # When a child rewrites the AOF file, if the following option is enabled # the file will be fsync-ed every 32 MB of data generated. This is useful # in order to commit the file to the disk more incrementally and avoid diff --git a/runtest b/runtest index d8451df57..ade1bd09a 100755 --- a/runtest +++ b/runtest @@ -11,4 +11,4 @@ then echo "You need tcl 8.5 or newer in order to run the Redis test" exit 1 fi -$TCLSH tests/test_helper.tcl $* +$TCLSH tests/test_helper.tcl "${@}" diff --git a/sentinel.conf b/sentinel.conf index 3703c7394..bc9a705ac 100644 --- a/sentinel.conf +++ b/sentinel.conf @@ -20,6 +20,21 @@ # The port that this sentinel instance will run on port 26379 +# By default Redis Sentinel does not run as a daemon. Use 'yes' if you need it.
+# Note that Redis will write a pid file in /var/run/redis-sentinel.pid when +# daemonized. +daemonize no + +# When running daemonized, Redis Sentinel writes a pid file in +# /var/run/redis-sentinel.pid by default. You can specify a custom pid file +# location here. +pidfile /var/run/redis-sentinel.pid + +# Specify the log file name. Also the empty string can be used to force +# Sentinel to log on the standard output. Note that if you use standard +# output for logging but daemonize, logs will be sent to /dev/null +logfile "" + # sentinel announce-ip # sentinel announce-port # @@ -58,11 +73,11 @@ dir /tmp # be elected by the majority of the known Sentinels in order to # start a failover, so no failover can be performed in minority. # -# Slaves are auto-discovered, so you don't need to specify slaves in +# Replicas are auto-discovered, so you don't need to specify replicas in # any way. Sentinel itself will rewrite this configuration file adding -# the slaves using additional configuration options. +# the replicas using additional configuration options. # Also note that the configuration file is rewritten when a -# slave is promoted to master. +# replica is promoted to master. # # Note: master name should not include special characters or spaces. # The valid charset is A-z 0-9 and the three characters ".-_". @@ -70,11 +85,11 @@ sentinel monitor mymaster 127.0.0.1 6379 2 # sentinel auth-pass # -# Set the password to use to authenticate with the master and slaves. +# Set the password to use to authenticate with the master and replicas. # Useful if there is a password set in the Redis instances to monitor. # -# Note that the master password is also used for slaves, so it is not -# possible to set a different password in masters and slaves instances +# Note that the master password is also used for replicas, so it is not +# possible to set a different password in masters and replicas instances # if you want to be able to monitor these instances with Sentinel. 
# # However you can have Redis instances without the authentication enabled @@ -89,7 +104,7 @@ sentinel monitor mymaster 127.0.0.1 6379 2 # sentinel down-after-milliseconds # -# Number of milliseconds the master (or any attached slave or sentinel) should +# Number of milliseconds the master (or any attached replica or sentinel) should # be unreachable (as in, not acceptable reply to PING, continuously, for the # specified period) in order to consider it in S_DOWN state (Subjectively # Down). @@ -97,11 +112,11 @@ sentinel monitor mymaster 127.0.0.1 6379 2 # Default is 30 seconds. sentinel down-after-milliseconds mymaster 30000 -# sentinel parallel-syncs +# sentinel parallel-syncs # -# How many slaves we can reconfigure to point to the new slave simultaneously -# during the failover. Use a low number if you use the slaves to serve query -# to avoid that all the slaves will be unreachable at about the same +# How many replicas we can reconfigure to point to the new replica simultaneously +# during the failover. Use a low number if you use the replicas to serve query +# to avoid that all the replicas will be unreachable at about the same # time while performing the synchronization with the master. sentinel parallel-syncs mymaster 1 @@ -113,18 +128,18 @@ sentinel parallel-syncs mymaster 1 # already tried against the same master by a given Sentinel, is two # times the failover timeout. # -# - The time needed for a slave replicating to a wrong master according +# - The time needed for a replica replicating to a wrong master according # to a Sentinel current configuration, to be forced to replicate # with the right master, is exactly the failover timeout (counting since # the moment a Sentinel detected the misconfiguration). # # - The time needed to cancel a failover that is already in progress but # did not produced any configuration change (SLAVEOF NO ONE yet not -# acknowledged by the promoted slave). +# acknowledged by the promoted replica). 
# -# - The maximum time a failover in progress waits for all the slaves to be -# reconfigured as slaves of the new master. However even after this time -# the slaves will be reconfigured by the Sentinels anyway, but not with +# - The maximum time a failover in progress waits for all the replicas to be +# reconfigured as replicas of the new master. However even after this time +# the replicas will be reconfigured by the Sentinels anyway, but not with # the exact parallel-syncs progression as specified. # # Default is 3 minutes. @@ -185,7 +200,7 @@ sentinel failover-timeout mymaster 180000 # is either "leader" or "observer" # # The arguments from-ip, from-port, to-ip, to-port are used to communicate -# the old address of the master and the new address of the elected slave +# the old address of the master and the new address of the elected replica # (now a master). # # This script should be resistant to multiple invocations. @@ -213,12 +228,17 @@ sentinel deny-scripts-reconfig yes # # In such case it is possible to tell Sentinel to use different command names # instead of the normal ones. For example if the master "mymaster", and the -# associated slaves, have "CONFIG" all renamed to "GUESSME", I could use: +# associated replicas, have "CONFIG" all renamed to "GUESSME", I could use: # -# sentinel rename-command mymaster CONFIG GUESSME +# SENTINEL rename-command mymaster CONFIG GUESSME # # After such configuration is set, every time Sentinel would use CONFIG it will # use GUESSME instead. Note that there is no actual need to respect the command # case, so writing "config guessme" is the same in the example above. # # SENTINEL SET can also be used in order to perform this configuration at runtime. 
+# +# In order to set a command back to its original name (undo the renaming), it +# is possible to just rename a command to itself: +# +# SENTINEL rename-command mymaster CONFIG CONFIG diff --git a/src/Makefile b/src/Makefile index f5525bd6d..d4874f7cf 100644 --- a/src/Makefile +++ b/src/Makefile @@ -21,6 +21,11 @@ NODEPS:=clean distclean # Default settings STD=-std=c99 -pedantic -DREDIS_STATIC='' +ifneq (,$(findstring clang,$(CC))) +ifneq (,$(findstring FreeBSD,$(uname_S))) + STD+=-Wno-c11-extensions +endif +endif WARN=-Wall -W -Wno-missing-field-initializers OPT=$(OPTIMIZATION) @@ -41,6 +46,10 @@ endif # To get ARM stack traces if Redis crashes we need a special C flag. ifneq (,$(filter aarch64 armv,$(uname_M))) CFLAGS+=-funwind-tables +else +ifneq (,$(findstring armv,$(uname_M))) + CFLAGS+=-funwind-tables +endif endif # Backwards compatibility for selecting an allocator @@ -93,10 +102,20 @@ else ifeq ($(uname_S),OpenBSD) # OpenBSD FINAL_LIBS+= -lpthread + ifeq ($(USE_BACKTRACE),yes) + FINAL_CFLAGS+= -DUSE_BACKTRACE -I/usr/local/include + FINAL_LDFLAGS+= -L/usr/local/lib + FINAL_LIBS+= -lexecinfo + endif + else ifeq ($(uname_S),FreeBSD) # FreeBSD - FINAL_LIBS+= -lpthread + FINAL_LIBS+= -lpthread -lexecinfo +else +ifeq ($(uname_S),DragonFly) + # FreeBSD + FINAL_LIBS+= -lpthread -lexecinfo else # All the other OSes (notably Linux) FINAL_LDFLAGS+= -rdynamic @@ -106,6 +125,7 @@ endif endif endif endif +endif # Include paths to dependencies FINAL_CFLAGS+= -I../deps/hiredis -I../deps/linenoise -I../deps/lua/src @@ -144,7 +164,7 @@ endif REDIS_SERVER_NAME=redis-server REDIS_SENTINEL_NAME=redis-sentinel -REDIS_SERVER_OBJ=adlist.o quicklist.o ae.o anet.o dict.o server.o sds.o zmalloc.o lzf_c.o lzf_d.o pqsort.o zipmap.o sha1.o ziplist.o release.o networking.o util.o object.o db.o replication.o rdb.o t_string.o t_list.o t_set.o t_zset.o t_hash.o config.o aof.o pubsub.o multi.o debug.o sort.o intset.o syncio.o cluster.o crc16.o endianconv.o slowlog.o scripting.o bio.o rio.o
rand.o memtest.o crc64.o bitops.o sentinel.o notify.o setproctitle.o blocked.o hyperloglog.o latency.o sparkline.o redis-check-rdb.o redis-check-aof.o geo.o lazyfree.o module.o evict.o expire.o geohash.o geohash_helper.o childinfo.o defrag.o siphash.o rax.o t_stream.o listpack.o localtime.o +REDIS_SERVER_OBJ=adlist.o quicklist.o ae.o anet.o dict.o server.o sds.o zmalloc.o lzf_c.o lzf_d.o pqsort.o zipmap.o sha1.o ziplist.o release.o networking.o util.o object.o db.o replication.o rdb.o t_string.o t_list.o t_set.o t_zset.o t_hash.o config.o aof.o pubsub.o multi.o debug.o sort.o intset.o syncio.o cluster.o crc16.o endianconv.o slowlog.o scripting.o bio.o rio.o rand.o memtest.o crc64.o bitops.o sentinel.o notify.o setproctitle.o blocked.o hyperloglog.o latency.o sparkline.o redis-check-rdb.o redis-check-aof.o geo.o lazyfree.o module.o evict.o expire.o geohash.o geohash_helper.o childinfo.o defrag.o siphash.o rax.o t_stream.o listpack.o localtime.o lolwut.o lolwut5.o acl.o gopher.o REDIS_CLI_NAME=redis-cli REDIS_CLI_OBJ=anet.o adlist.o dict.o redis-cli.o zmalloc.o release.o anet.o ae.o crc64.o siphash.o crc16.o REDIS_BENCHMARK_NAME=redis-benchmark diff --git a/src/acl.c b/src/acl.c new file mode 100644 index 000000000..3cca50027 --- /dev/null +++ b/src/acl.c @@ -0,0 +1,1647 @@ +/* + * Copyright (c) 2018, Salvatore Sanfilippo + * All rights reserved. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions are met: + * + * * Redistributions of source code must retain the above copyright notice, + * this list of conditions and the following disclaimer. + * * Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. 
+ * * Neither the name of Redis nor the names of its contributors may be used + * to endorse or promote products derived from this software without + * specific prior written permission. + * + * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" + * AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE + * ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE + * LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR + * CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF + * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS + * INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN + * CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) + * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE + * POSSIBILITY OF SUCH DAMAGE. + */ + +#include "server.h" +#include + +/* ============================================================================= + * Global state for ACLs + * ==========================================================================*/ + +rax *Users; /* Table mapping usernames to user structures. */ + +user *DefaultUser; /* Global reference to the default user. + Every new connection is associated to it, if no + AUTH or HELLO is used to authenticate with a + different user. */ + +list *UsersToLoad; /* This is a list of users found in the configuration file + that we'll need to load in the final stage of Redis + initialization, after all the modules are already + loaded. Every list element is a NULL terminated + array of SDS pointers: the first is the user name, + all the remaining pointers are ACL rules in the same + format as ACLSetUser(). 
*/ + +struct ACLCategoryItem { + const char *name; + uint64_t flag; +} ACLCommandCategories[] = { + {"keyspace", CMD_CATEGORY_KEYSPACE}, + {"read", CMD_CATEGORY_READ}, + {"write", CMD_CATEGORY_WRITE}, + {"set", CMD_CATEGORY_SET}, + {"sortedset", CMD_CATEGORY_SORTEDSET}, + {"list", CMD_CATEGORY_LIST}, + {"hash", CMD_CATEGORY_HASH}, + {"string", CMD_CATEGORY_STRING}, + {"bitmap", CMD_CATEGORY_BITMAP}, + {"hyperloglog", CMD_CATEGORY_HYPERLOGLOG}, + {"geo", CMD_CATEGORY_GEO}, + {"stream", CMD_CATEGORY_STREAM}, + {"pubsub", CMD_CATEGORY_PUBSUB}, + {"admin", CMD_CATEGORY_ADMIN}, + {"fast", CMD_CATEGORY_FAST}, + {"slow", CMD_CATEGORY_SLOW}, + {"blocking", CMD_CATEGORY_BLOCKING}, + {"dangerous", CMD_CATEGORY_DANGEROUS}, + {"connection", CMD_CATEGORY_CONNECTION}, + {"transaction", CMD_CATEGORY_TRANSACTION}, + {"scripting", CMD_CATEGORY_SCRIPTING}, + {NULL,0} /* Terminator. */ +}; + +struct ACLUserFlag { + const char *name; + uint64_t flag; +} ACLUserFlags[] = { + {"on", USER_FLAG_ENABLED}, + {"off", USER_FLAG_DISABLED}, + {"allkeys", USER_FLAG_ALLKEYS}, + {"allcommands", USER_FLAG_ALLCOMMANDS}, + {"nopass", USER_FLAG_NOPASS}, + {NULL,0} /* Terminator. */ +}; + +void ACLResetSubcommandsForCommand(user *u, unsigned long id); +void ACLResetSubcommands(user *u); +void ACLAddAllowedSubcommand(user *u, unsigned long id, const char *sub); + +/* ============================================================================= + * Helper functions for the rest of the ACL implementation + * ==========================================================================*/ + +/* Return zero if strings are the same, non-zero if they are not. + * The comparison is performed in a way that prevents an attacker to obtain + * information about the nature of the strings just monitoring the execution + * time of the function. 
+ * + * Note that by limiting the comparison length to strings up to 512 bytes we + * can avoid leaking any information about the password length and any + * possible branch misprediction related leak. + */ +int time_independent_strcmp(char *a, char *b) { + char bufa[CONFIG_AUTHPASS_MAX_LEN], bufb[CONFIG_AUTHPASS_MAX_LEN]; + /* The following two strlen() calls perform len(a) + len(b) operations where + * either a or b has a fixed (our password) length, and the difference is only + * relative to the length of the user provided string, so no information + * leak is possible in the following two lines of code. */ + unsigned int alen = strlen(a); + unsigned int blen = strlen(b); + unsigned int j; + int diff = 0; + + /* We can't compare strings longer than our static buffers. + * Note that this will never pass the first test in practical circumstances + * so there is no info leak. */ + if (alen > sizeof(bufa) || blen > sizeof(bufb)) return 1; + + memset(bufa,0,sizeof(bufa)); /* Constant time. */ + memset(bufb,0,sizeof(bufb)); /* Constant time. */ + /* Again the time of the following two copies is proportional to + * len(a) + len(b) so no info is leaked. */ + memcpy(bufa,a,alen); + memcpy(bufb,b,blen); + + /* Always compare all the chars in the two buffers without + * conditional expressions. */ + for (j = 0; j < sizeof(bufa); j++) { + diff |= (bufa[j] ^ bufb[j]); + } + /* Length must be equal as well. */ + diff |= alen ^ blen; + return diff; /* If zero strings are the same. */ +} + +/* ============================================================================= + * Low level ACL API + * ==========================================================================*/ + +/* Given the category name the command returns the corresponding flag, or + * zero if there is no match.
*/ +uint64_t ACLGetCommandCategoryFlagByName(const char *name) { + for (int j = 0; ACLCommandCategories[j].flag != 0; j++) { + if (!strcasecmp(name,ACLCommandCategories[j].name)) { + return ACLCommandCategories[j].flag; + } + } + return 0; /* No match. */ +} + +/* Method for passwords/patterns comparison used for the user->passwords and + * user->patterns lists so that we can search for items with listSearchKey(). */ +int ACLListMatchSds(void *a, void *b) { + return sdscmp(a,b) == 0; +} + +/* Method to free list elements from ACL users password/patterns lists. */ +void ACLListFreeSds(void *item) { + sdsfree(item); +} + +/* Method to duplicate list elements from ACL users password/patterns lists. */ +void *ACLListDupSds(void *item) { + return sdsdup(item); +} + +/* Create a new user with the specified name, store it in the list + * of users (the Users global radix tree), and return a reference to + * the structure representing the user. + * + * If a user with that name already exists, NULL is returned. */ +user *ACLCreateUser(const char *name, size_t namelen) { + if (raxFind(Users,(unsigned char*)name,namelen) != raxNotFound) return NULL; + user *u = zmalloc(sizeof(*u)); + u->name = sdsnewlen(name,namelen); + u->flags = USER_FLAG_DISABLED; + u->allowed_subcommands = NULL; + u->passwords = listCreate(); + u->patterns = listCreate(); + listSetMatchMethod(u->passwords,ACLListMatchSds); + listSetFreeMethod(u->passwords,ACLListFreeSds); + listSetDupMethod(u->passwords,ACLListDupSds); + listSetMatchMethod(u->patterns,ACLListMatchSds); + listSetFreeMethod(u->patterns,ACLListFreeSds); + listSetDupMethod(u->patterns,ACLListDupSds); + memset(u->allowed_commands,0,sizeof(u->allowed_commands)); + raxInsert(Users,(unsigned char*)name,namelen,u,NULL); + return u; +} + +/* This function should be called when we need an unlinked "fake" user + * we can use in order to validate ACL rules or for other similar reasons. + * The user will not get linked to the Users radix tree.
The returned + * user should be released with ACLFreeUser() as usual. */ +user *ACLCreateUnlinkedUser(void) { + char username[64]; + for (int j = 0; ; j++) { + snprintf(username,sizeof(username),"__fakeuser:%d__",j); + user *fakeuser = ACLCreateUser(username,strlen(username)); + if (fakeuser == NULL) continue; + int retval = raxRemove(Users,(unsigned char*) username, + strlen(username),NULL); + serverAssert(retval != 0); + return fakeuser; + } +} + +/* Release the memory used by the user structure. Note that this function + * will not remove the user from the Users global radix tree. */ +void ACLFreeUser(user *u) { + sdsfree(u->name); + listRelease(u->passwords); + listRelease(u->patterns); + ACLResetSubcommands(u); + zfree(u); +} + +/* When a user is deleted we need to cycle the active + * connections in order to kill all the pending ones that + * are authenticated with such user. */ +void ACLFreeUserAndKillClients(user *u) { + listIter li; + listNode *ln; + listRewind(server.clients,&li); + while ((ln = listNext(&li)) != NULL) { + client *c = listNodeValue(ln); + if (c->user == u) { + /* We'll free the connection asynchronously, so + * in theory setting a different user is not needed. + * However if there are bugs in Redis, sooner or later + * this may result in some security hole: it's much + * more defensive to set the default user and put + * it in non authenticated mode. */ + c->user = DefaultUser; + c->authenticated = 0; + freeClientAsync(c); + } + } + ACLFreeUser(u); +} + +/* Copy the user ACL rules from the source user 'src' to the destination + * user 'dst' so that at the end of the process they'll have exactly the + * same rules (but the names will continue to be the original ones).
*/ +void ACLCopyUser(user *dst, user *src) { + listRelease(dst->passwords); + listRelease(dst->patterns); + dst->passwords = listDup(src->passwords); + dst->patterns = listDup(src->patterns); + memcpy(dst->allowed_commands,src->allowed_commands, + sizeof(dst->allowed_commands)); + dst->flags = src->flags; + ACLResetSubcommands(dst); + /* Copy the allowed subcommands array of arrays of SDS strings. */ + if (src->allowed_subcommands) { + for (int j = 0; j < USER_COMMAND_BITS_COUNT; j++) { + if (src->allowed_subcommands[j]) { + for (int i = 0; src->allowed_subcommands[j][i]; i++) + { + ACLAddAllowedSubcommand(dst, j, + src->allowed_subcommands[j][i]); + } + } + } + } +} + +/* Free all the users registered in the radix tree 'users' and free the + * radix tree itself. */ +void ACLFreeUsersSet(rax *users) { + raxFreeWithCallback(users,(void(*)(void*))ACLFreeUserAndKillClients); +} + +/* Given a command ID, this function sets by reference 'word' and 'bit' + * so that user->allowed_commands[word] will address the right word + * where the corresponding bit for the provided ID is stored, and + * so that user->allowed_commands[word]&bit will identify that specific + * bit. The function returns C_ERR in case the specified ID overflows + * the bitmap in the user representation. */ +int ACLGetCommandBitCoordinates(uint64_t id, uint64_t *word, uint64_t *bit) { + if (id >= USER_COMMAND_BITS_COUNT) return C_ERR; + *word = id / sizeof(uint64_t) / 8; + *bit = 1ULL << (id % (sizeof(uint64_t) * 8)); + return C_OK; +} + +/* Check if the specified command bit is set for the specified user. + * The function returns 1 if the bit is set or 0 if it is not. + * Note that this function does not check the ALLCOMMANDS flag of the user + * but just the low-level bitmask. + * + * If the bit overflows the user internal representation, zero is returned + * in order to disallow the execution of the command in such edge case.
*/ +int ACLGetUserCommandBit(user *u, unsigned long id) { + uint64_t word, bit; + if (ACLGetCommandBitCoordinates(id,&word,&bit) == C_ERR) return 0; + return (u->allowed_commands[word] & bit) != 0; +} + +/* When +@all or allcommands is given, we set a reserved bit as well that we + * can later test, to see if the user has the right to execute "future commands", + * that is, commands loaded later via modules. */ +int ACLUserCanExecuteFutureCommands(user *u) { + return ACLGetUserCommandBit(u,USER_COMMAND_BITS_COUNT-1); +} + +/* Set the specified command bit for the specified user to 'value' (0 or 1). + * If the bit overflows the user internal representation, no operation + * is performed. As a side effect of calling this function with a value of + * zero, the user flag ALLCOMMANDS is cleared since it is no longer possible + * to skip the explicit command bit test. */ +void ACLSetUserCommandBit(user *u, unsigned long id, int value) { + uint64_t word, bit; + if (value == 0) u->flags &= ~USER_FLAG_ALLCOMMANDS; + if (ACLGetCommandBitCoordinates(id,&word,&bit) == C_ERR) return; + if (value) + u->allowed_commands[word] |= bit; + else + u->allowed_commands[word] &= ~bit; +} + +/* This is like ACLSetUserCommandBit(), but instead of setting the specified + * ID, it will check all the commands in the category specified as argument, + * and will set all the bits corresponding to such commands to the specified + * value. Since the category passed by the user may be non existing, the + * function returns C_ERR if the category was not found, or C_OK if it was + * found and the operation was performed.
*/ +int ACLSetUserCommandBitsForCategory(user *u, const char *category, int value) { + uint64_t cflag = ACLGetCommandCategoryFlagByName(category); + if (!cflag) return C_ERR; + dictIterator *di = dictGetIterator(server.orig_commands); + dictEntry *de; + while ((de = dictNext(di)) != NULL) { + struct redisCommand *cmd = dictGetVal(de); + if (cmd->flags & CMD_MODULE) continue; /* Ignore module commands. */ + if (cmd->flags & cflag) { + ACLSetUserCommandBit(u,cmd->id,value); + ACLResetSubcommandsForCommand(u,cmd->id); + } + } + dictReleaseIterator(di); + return C_OK; +} + +/* Return the number of commands allowed (on) and denied (off) for the user 'u' + * in the subset of commands flagged with the specified category name. + * If the category name is not valid, C_ERR is returned, otherwise C_OK is + * returned and on and off are populated by reference. */ +int ACLCountCategoryBitsForUser(user *u, unsigned long *on, unsigned long *off, + const char *category) +{ + uint64_t cflag = ACLGetCommandCategoryFlagByName(category); + if (!cflag) return C_ERR; + + *on = *off = 0; + dictIterator *di = dictGetIterator(server.orig_commands); + dictEntry *de; + while ((de = dictNext(di)) != NULL) { + struct redisCommand *cmd = dictGetVal(de); + if (cmd->flags & cflag) { + if (ACLGetUserCommandBit(u,cmd->id)) + (*on)++; + else + (*off)++; + } + } + dictReleaseIterator(di); + return C_OK; +} + +/* This function returns an SDS string representing the specified user ACL + * rules related to command execution, in the same format you could set them + * back using ACL SETUSER. The function will return just the set of rules needed + * to recreate the user commands bitmap, without including other user flags such + * as on/off, passwords and so forth. The returned string always starts with + * the +@all or -@all rule, depending on the user bitmap, and is followed, if + * needed, by the other rules needed to narrow or extend what the user can do.
*/ +sds ACLDescribeUserCommandRules(user *u) { + sds rules = sdsempty(); + int additive; /* If true we start from -@all and add, otherwise if + false we start from +@all and remove. */ + + /* This code is based on a trick: as we generate the rules, we apply + * them to a fake user, so that as we go we still know what are the + * bit differences we should try to address by emitting more rules. */ + user fu = {0}; + user *fakeuser = &fu; + + /* Here we want to understand if we should start with +@all and remove + * the commands corresponding to the bits that are not set in the user + * commands bitmap, or the contrary. Note that semantically the two are + * different. For instance starting with +@all and subtracting, the user + * will be able to execute future commands, while -@all and adding will just + * allow the user to run the selected commands and/or categories. + * How do we test for that? We use the trick of a reserved command ID bit + * that is set only by +@all (and its alias "allcommands"). */ + if (ACLUserCanExecuteFutureCommands(u)) { + additive = 0; + rules = sdscat(rules,"+@all "); + ACLSetUser(fakeuser,"+@all",-1); + } else { + additive = 1; + rules = sdscat(rules,"-@all "); + ACLSetUser(fakeuser,"-@all",-1); + } + + /* Try to add or subtract each category one after the other. Often a + * single category will not perfectly match the set of commands in + * it, so at the end we do a final pass adding/removing the single commands + * needed to make the bitmap exactly match. */ + for (int j = 0; ACLCommandCategories[j].flag != 0; j++) { + unsigned long on, off; + ACLCountCategoryBitsForUser(u,&on,&off,ACLCommandCategories[j].name); + if ((additive && on > off) || (!additive && off > on)) { + sds op = sdsnewlen(additive ?
"+@" : "-@", 2); + op = sdscat(op,ACLCommandCategories[j].name); + ACLSetUser(fakeuser,op,-1); + rules = sdscatsds(rules,op); + rules = sdscatlen(rules," ",1); + sdsfree(op); + } + } + + /* Fix the final ACLs with single commands differences. */ + dictIterator *di = dictGetIterator(server.orig_commands); + dictEntry *de; + while ((de = dictNext(di)) != NULL) { + struct redisCommand *cmd = dictGetVal(de); + int userbit = ACLGetUserCommandBit(u,cmd->id); + int fakebit = ACLGetUserCommandBit(fakeuser,cmd->id); + if (userbit != fakebit) { + rules = sdscatlen(rules, userbit ? "+" : "-", 1); + rules = sdscat(rules,cmd->name); + rules = sdscatlen(rules," ",1); + ACLSetUserCommandBit(fakeuser,cmd->id,userbit); + } + + /* Emit the subcommands if there are any. */ + if (userbit == 0 && u->allowed_subcommands && + u->allowed_subcommands[cmd->id]) + { + for (int j = 0; u->allowed_subcommands[cmd->id][j]; j++) { + rules = sdscatlen(rules,"+",1); + rules = sdscat(rules,cmd->name); + rules = sdscatlen(rules,"|",1); + rules = sdscatsds(rules,u->allowed_subcommands[cmd->id][j]); + rules = sdscatlen(rules," ",1); + } + } + } + dictReleaseIterator(di); + + /* Trim the final useless space. */ + sdsrange(rules,0,-2); + + /* This is technically not needed, but we want to verify that now the + * predicted bitmap is exactly the same as the user bitmap, and abort + * otherwise, because aborting is better than a security risk in this + * code path. 
*/ + if (memcmp(fakeuser->allowed_commands, + u->allowed_commands, + sizeof(u->allowed_commands)) != 0) + { + serverLog(LL_WARNING, + "CRITICAL ERROR: User ACLs don't match final bitmap: '%s'", + rules); + serverPanic("No bitmap match in ACLDescribeUserCommandRules()"); + } + return rules; +} + +/* This is similar to ACLDescribeUserCommandRules(), however instead of + * describing just the user command rules, everything is described: user + * flags, keys, passwords and finally the command rules obtained via + * the ACLDescribeUserCommandRules() function. This is the function we call + * when we want to rewrite the configuration files describing ACLs and + * in order to show users with ACL LIST. */ +sds ACLDescribeUser(user *u) { + sds res = sdsempty(); + + /* Flags. */ + for (int j = 0; ACLUserFlags[j].flag; j++) { + /* Skip the allcommands and allkeys flags because they'll be emitted + * later as ~* and +@all. */ + if (ACLUserFlags[j].flag == USER_FLAG_ALLKEYS || + ACLUserFlags[j].flag == USER_FLAG_ALLCOMMANDS) continue; + if (u->flags & ACLUserFlags[j].flag) { + res = sdscat(res,ACLUserFlags[j].name); + res = sdscatlen(res," ",1); + } + } + + /* Passwords. */ + listIter li; + listNode *ln; + listRewind(u->passwords,&li); + while((ln = listNext(&li))) { + sds thispass = listNodeValue(ln); + res = sdscatlen(res,">",1); + res = sdscatsds(res,thispass); + res = sdscatlen(res," ",1); + } + + /* Key patterns. */ + if (u->flags & USER_FLAG_ALLKEYS) { + res = sdscatlen(res,"~* ",3); + } else { + listRewind(u->patterns,&li); + while((ln = listNext(&li))) { + sds thispat = listNodeValue(ln); + res = sdscatlen(res,"~",1); + res = sdscatsds(res,thispat); + res = sdscatlen(res," ",1); + } + } + + /* Command rules. 
*/ + sds rules = ACLDescribeUserCommandRules(u); + res = sdscatsds(res,rules); + sdsfree(rules); + return res; +} + +/* Get a command from the original command table, that is not affected + * by the command renaming operations: we base all the ACL work from that + * table, so that ACLs are valid regardless of command renaming. */ +struct redisCommand *ACLLookupCommand(const char *name) { + struct redisCommand *cmd; + sds sdsname = sdsnew(name); + cmd = dictFetchValue(server.orig_commands, sdsname); + sdsfree(sdsname); + return cmd; +} + +/* Flush the array of allowed subcommands for the specified user + * and command ID. */ +void ACLResetSubcommandsForCommand(user *u, unsigned long id) { + if (u->allowed_subcommands && u->allowed_subcommands[id]) { + zfree(u->allowed_subcommands[id]); + u->allowed_subcommands[id] = NULL; + } +} + +/* Flush the entire table of subcommands. This is useful on +@all, -@all + * or similar to return back to the minimal memory usage (and checks to do) + * for the user. */ +void ACLResetSubcommands(user *u) { + if (u->allowed_subcommands == NULL) return; + for (int j = 0; j < USER_COMMAND_BITS_COUNT; j++) { + if (u->allowed_subcommands[j]) { + for (int i = 0; u->allowed_subcommands[j][i]; i++) + sdsfree(u->allowed_subcommands[j][i]); + zfree(u->allowed_subcommands[j]); + } + } + zfree(u->allowed_subcommands); + u->allowed_subcommands = NULL; +} + + +/* Add a subcommand to the list of subcommands for the user 'u' and + * the command id specified. */ +void ACLAddAllowedSubcommand(user *u, unsigned long id, const char *sub) { + /* If this is the first subcommand to be configured for + * this user, we have to allocate the subcommands array. */ + if (u->allowed_subcommands == NULL) { + u->allowed_subcommands = zcalloc(USER_COMMAND_BITS_COUNT * + sizeof(sds*)); + } + + /* We also need to enlarge the allocation pointing to the + * null terminated SDS array, to make space for this one. 
+ * To start check the current size, and while we are here + * make sure the subcommand is not already specified inside. */ + long items = 0; + if (u->allowed_subcommands[id]) { + while(u->allowed_subcommands[id][items]) { + /* If it's already here do not add it again. */ + if (!strcasecmp(u->allowed_subcommands[id][items],sub)) return; + items++; + } + } + + /* Now we can make space for the new item (and the null term). */ + items += 2; + u->allowed_subcommands[id] = zrealloc(u->allowed_subcommands[id], + sizeof(sds)*items); + u->allowed_subcommands[id][items-2] = sdsnew(sub); + u->allowed_subcommands[id][items-1] = NULL; +} + +/* Set user properties according to the string "op". The following + * is a description of what different strings will do: + * + * on Enable the user: it is possible to authenticate as this user. + * off Disable the user: it's no longer possible to authenticate + * with this user, however the already authenticated connections + * will still work. + * + Allow the execution of that command + * - Disallow the execution of that command + * +@ Allow the execution of all the commands in such category. + * Valid categories are @admin, @set, @sortedset, ... + * and so forth, see the full list in the server.c file where + * the Redis command table is described and defined. + * The special category @all means all the commands, both those + * currently present in the server and those that will be loaded + * in the future via modules. + * +|subcommand Allow a specific subcommand of an otherwise + * disabled command. Note that this form is not + * allowed as negative like -DEBUG|SEGFAULT, but + * only additive starting with "+". + * allcommands Alias for +@all. Note that it implies the ability to execute + * all the future commands loaded via the modules system. + * nocommands Alias for -@all. + * ~ Add a pattern of keys that can be mentioned as part of + * commands. For instance ~* allows all the keys.
The pattern + * is a glob-style pattern like the one of KEYS. + * It is possible to specify multiple patterns. + * allkeys Alias for ~* + * resetkeys Flush the list of allowed keys patterns. + * > Add this password to the list of valid passwords for the user. + * For example >mypass will add "mypass" to the list. + * This directive clears the "nopass" flag (see later). + * < Remove this password from the list of valid passwords. + * nopass All the set passwords of the user are removed, and the user + * is flagged as requiring no password: it means that every + * password will work against this user. If this directive is + * used for the default user, every new connection will be + * immediately authenticated with the default user without + * any explicit AUTH command required. Note that the "resetpass" + * directive will clear this condition. + * resetpass Flush the list of allowed passwords. Moreover removes the + * "nopass" status. After "resetpass" the user has no associated + * passwords and there is no way to authenticate without adding + * some password (or setting it as "nopass" later). + * reset Performs the following actions: resetpass, resetkeys, off, + * -@all. The user returns to the same state it has immediately + * after its creation. + * + * The 'op' string must be null terminated. The 'oplen' argument should + * specify the length of the 'op' string in case the caller requires to pass + * binary data (for instance the >password form may use a binary password). + * Otherwise the field can be set to -1 and the function will use strlen() + * to determine the length. + * + * The function returns C_OK if the action to perform was understood because + * the 'op' string made sense. Otherwise C_ERR is returned if the operation + * is unknown or has some syntax error. + * + * When an error is returned, errno is set to the following values: + * + * EINVAL: The specified opcode is not understood.
+ * ENOENT: The command name or command category provided with + or - is not + * known. + * EBUSY: The subcommand you want to add is about a command that is currently + * fully added. + * EEXIST: You are adding a key pattern after "*" was already added. This is + * almost surely an error on the user side. + * ENODEV: The password you are trying to remove from the user does not exist. + */ +int ACLSetUser(user *u, const char *op, ssize_t oplen) { + if (oplen == -1) oplen = strlen(op); + if (!strcasecmp(op,"on")) { + u->flags |= USER_FLAG_ENABLED; + u->flags &= ~USER_FLAG_DISABLED; + } else if (!strcasecmp(op,"off")) { + u->flags |= USER_FLAG_DISABLED; + u->flags &= ~USER_FLAG_ENABLED; + } else if (!strcasecmp(op,"allkeys") || + !strcasecmp(op,"~*")) + { + u->flags |= USER_FLAG_ALLKEYS; + listEmpty(u->patterns); + } else if (!strcasecmp(op,"resetkeys")) { + u->flags &= ~USER_FLAG_ALLKEYS; + listEmpty(u->patterns); + } else if (!strcasecmp(op,"allcommands") || + !strcasecmp(op,"+@all")) + { + memset(u->allowed_commands,255,sizeof(u->allowed_commands)); + u->flags |= USER_FLAG_ALLCOMMANDS; + ACLResetSubcommands(u); + } else if (!strcasecmp(op,"nocommands") || + !strcasecmp(op,"-@all")) + { + memset(u->allowed_commands,0,sizeof(u->allowed_commands)); + u->flags &= ~USER_FLAG_ALLCOMMANDS; + ACLResetSubcommands(u); + } else if (!strcasecmp(op,"nopass")) { + u->flags |= USER_FLAG_NOPASS; + listEmpty(u->passwords); + } else if (!strcasecmp(op,"resetpass")) { + u->flags &= ~USER_FLAG_NOPASS; + listEmpty(u->passwords); + } else if (op[0] == '>') { + sds newpass = sdsnewlen(op+1,oplen-1); + listNode *ln = listSearchKey(u->passwords,newpass); + /* Avoid re-adding the same password multiple times. 
*/ + if (ln == NULL) listAddNodeTail(u->passwords,newpass); + u->flags &= ~USER_FLAG_NOPASS; + } else if (op[0] == '<') { + sds delpass = sdsnewlen(op+1,oplen-1); + listNode *ln = listSearchKey(u->passwords,delpass); + sdsfree(delpass); + if (ln) { + listDelNode(u->passwords,ln); + } else { + errno = ENODEV; + return C_ERR; + } + } else if (op[0] == '~') { + if (u->flags & USER_FLAG_ALLKEYS) { + errno = EEXIST; + return C_ERR; + } + sds newpat = sdsnewlen(op+1,oplen-1); + listNode *ln = listSearchKey(u->patterns,newpat); + /* Avoid re-adding the same pattern multiple times. */ + if (ln == NULL) listAddNodeTail(u->patterns,newpat); + u->flags &= ~USER_FLAG_ALLKEYS; + } else if (op[0] == '+' && op[1] != '@') { + if (strchr(op,'|') == NULL) { + if (ACLLookupCommand(op+1) == NULL) { + errno = ENOENT; + return C_ERR; + } + unsigned long id = ACLGetCommandID(op+1); + ACLSetUserCommandBit(u,id,1); + ACLResetSubcommandsForCommand(u,id); + } else { + /* Split the command and subcommand parts. */ + char *copy = zstrdup(op+1); + char *sub = strchr(copy,'|'); + sub[0] = '\0'; + sub++; + + /* Check if the command exists. We can't check the + * subcommand to see if it is valid. */ + if (ACLLookupCommand(copy) == NULL) { + zfree(copy); + errno = ENOENT; + return C_ERR; + } + unsigned long id = ACLGetCommandID(copy); + + /* The subcommand cannot be empty, so things like DEBUG| + * are syntax errors of course. */ + if (strlen(sub) == 0) { + zfree(copy); + errno = EINVAL; + return C_ERR; + } + + /* The command should not be set right now in the command + * bitmap, because adding a subcommand of a fully added + * command is probably an error on the user side. */ + if (ACLGetUserCommandBit(u,id) == 1) { + zfree(copy); + errno = EBUSY; + return C_ERR; + } + + /* Add the subcommand to the list of valid ones. */ + ACLAddAllowedSubcommand(u,id,sub); + + /* We have to clear the command bit so that we force the + * subcommand check. 
*/ + ACLSetUserCommandBit(u,id,0); + zfree(copy); + } + } else if (op[0] == '-' && op[1] != '@') { + if (ACLLookupCommand(op+1) == NULL) { + errno = ENOENT; + return C_ERR; + } + unsigned long id = ACLGetCommandID(op+1); + ACLSetUserCommandBit(u,id,0); + ACLResetSubcommandsForCommand(u,id); + } else if ((op[0] == '+' || op[0] == '-') && op[1] == '@') { + int bitval = op[0] == '+' ? 1 : 0; + if (ACLSetUserCommandBitsForCategory(u,op+2,bitval) == C_ERR) { + errno = ENOENT; + return C_ERR; + } + } else if (!strcasecmp(op,"reset")) { + serverAssert(ACLSetUser(u,"resetpass",-1) == C_OK); + serverAssert(ACLSetUser(u,"resetkeys",-1) == C_OK); + serverAssert(ACLSetUser(u,"off",-1) == C_OK); + serverAssert(ACLSetUser(u,"-@all",-1) == C_OK); + } else { + errno = EINVAL; + return C_ERR; + } + return C_OK; +} + +/* Return a description of the error that occurred in ACLSetUser() according to + * the errno value set by the function on error. */ +char *ACLSetUserStringError(void) { + char *errmsg = "Wrong format"; + if (errno == ENOENT) + errmsg = "Unknown command or category name in ACL"; + else if (errno == EINVAL) + errmsg = "Syntax error"; + else if (errno == EBUSY) + errmsg = "Adding a subcommand of a command already fully " + "added is not allowed. Remove the command to start. " + "Example: -DEBUG +DEBUG|DIGEST"; + else if (errno == EEXIST) + errmsg = "Adding a pattern after the * pattern (or the " + "'allkeys' flag) is not valid and does not have any " + "effect. Try 'resetkeys' to start with an empty " + "list of patterns"; + else if (errno == ENODEV) + errmsg = "The password you are trying to remove from the user does " + "not exist"; + return errmsg; +} + +/* Return the first password of the default user or NULL. + * This function is needed for backward compatibility with the old + * directive "requirepass" when Redis supported a single global + * password. 
*/ +sds ACLDefaultUserFirstPassword(void) { + if (listLength(DefaultUser->passwords) == 0) return NULL; + listNode *first = listFirst(DefaultUser->passwords); + return listNodeValue(first); +} + +/* Initialize the default user, that will always exist for all the process + * lifetime. */ +void ACLInitDefaultUser(void) { + DefaultUser = ACLCreateUser("default",7); + ACLSetUser(DefaultUser,"+@all",-1); + ACLSetUser(DefaultUser,"~*",-1); + ACLSetUser(DefaultUser,"on",-1); + ACLSetUser(DefaultUser,"nopass",-1); +} + +/* Initialization of the ACL subsystem. */ +void ACLInit(void) { + Users = raxNew(); + UsersToLoad = listCreate(); + ACLInitDefaultUser(); +} + +/* Check the username and password pair and return C_OK if they are valid, + * otherwise C_ERR is returned and errno is set to: + * + * EINVAL: if the username-password pair does not match. + * ENOENT: if the specified user does not exist at all. + */ +int ACLCheckUserCredentials(robj *username, robj *password) { + user *u = ACLGetUserByName(username->ptr,sdslen(username->ptr)); + if (u == NULL) { + errno = ENOENT; + return C_ERR; + } + + /* Disabled users can't login. */ + if (u->flags & USER_FLAG_DISABLED) { + errno = EINVAL; + return C_ERR; + } + + /* If the user is configured to not require any password, we + * are already fine here. */ + if (u->flags & USER_FLAG_NOPASS) return C_OK; + + /* Check all the user passwords for at least one to match. */ + listIter li; + listNode *ln; + listRewind(u->passwords,&li); + while((ln = listNext(&li))) { + sds thispass = listNodeValue(ln); + if (!time_independent_strcmp(password->ptr, thispass)) + return C_OK; + } + + /* If we reached this point, no password matched. */ + errno = EINVAL; + return C_ERR; +} + +/* This is like ACLCheckUserCredentials(), however if the user/pass + * are correct, the connection is put in authenticated state and the + * connection user reference is populated.
+ * + * The return value is C_OK or C_ERR with the same meaning as + * ACLCheckUserCredentials(). */ +int ACLAuthenticateUser(client *c, robj *username, robj *password) { + if (ACLCheckUserCredentials(username,password) == C_OK) { + c->authenticated = 1; + c->user = ACLGetUserByName(username->ptr,sdslen(username->ptr)); + return C_OK; + } else { + return C_ERR; + } +} + +/* For ACL purposes, every user has a bitmap with the commands that such + * user is allowed to execute. In order to populate the bitmap, every command + * should have an assigned ID (that is used to index the bitmap). This function + * creates such an ID: it uses sequential IDs, reusing the same ID for the same + * command name, so that a command retains the same ID in case of modules that + * are unloaded and later reloaded. */ +unsigned long ACLGetCommandID(const char *cmdname) { + static rax *map = NULL; + static unsigned long nextid = 0; + + sds lowername = sdsnew(cmdname); + sdstolower(lowername); + if (map == NULL) map = raxNew(); + void *id = raxFind(map,(unsigned char*)lowername,sdslen(lowername)); + if (id != raxNotFound) { + sdsfree(lowername); + return (unsigned long)id; + } + raxInsert(map,(unsigned char*)lowername,strlen(lowername), + (void*)nextid,NULL); + sdsfree(lowername); + unsigned long thisid = nextid; + nextid++; + + /* We never assign the last bit in the user commands bitmap structure, + * this way we can later check if this bit is set, understanding if the + * current ACL for the user was created starting with a +@all to add all + * the possible commands and just subtracting other single commands or + * categories, or if, instead, the ACL was created just adding commands + * and command categories from scratch, not allowing future commands by + * default (loaded via modules). This is useful when rewriting the ACLs + * with ACL SAVE. 
*/ + if (nextid == USER_COMMAND_BITS_COUNT-1) nextid++; + return thisid; +} + +/* Return a user by its name, or NULL if the user does not exist. */ +user *ACLGetUserByName(const char *name, size_t namelen) { + void *myuser = raxFind(Users,(unsigned char*)name,namelen); + if (myuser == raxNotFound) return NULL; + return myuser; +} + +/* Check if the command ready to be executed in the client 'c', and already + * referenced by c->cmd, can be executed by this client according to the + * ACLs associated to the client user c->user. + * + * If the user can execute the command ACL_OK is returned, otherwise + * ACL_DENIED_CMD or ACL_DENIED_KEY is returned: the first in case the + * command cannot be executed because the user is not allowed to run such + * command, the second if the command is denied because the user is trying + * to access keys that are not among the specified patterns. */ +int ACLCheckCommandPerm(client *c) { + user *u = c->user; + uint64_t id = c->cmd->id; + + /* If there is no associated user, the connection can run anything. */ + if (u == NULL) return ACL_OK; + + /* Check if the user can execute this command. */ + if (!(u->flags & USER_FLAG_ALLCOMMANDS) && + c->cmd->proc != authCommand) + { + /* If the bit is not set we have to check further, in case the + * command is allowed just with that specific subcommand. */ + if (ACLGetUserCommandBit(u,id) == 0) { + /* Check if the subcommand matches. */ + if (c->argc < 2 || + u->allowed_subcommands == NULL || + u->allowed_subcommands[id] == NULL) + { + return ACL_DENIED_CMD; + } + + long subid = 0; + while (1) { + if (u->allowed_subcommands[id][subid] == NULL) + return ACL_DENIED_CMD; + if (!strcasecmp(c->argv[1]->ptr, + u->allowed_subcommands[id][subid])) + break; /* Subcommand match found. Stop here. */ + subid++; + } + } + } + + /* Check if the user can execute commands explicitly touching the keys + * mentioned in the command arguments.
*/ + if (!(c->user->flags & USER_FLAG_ALLKEYS) && + (c->cmd->getkeys_proc || c->cmd->firstkey)) + { + int numkeys; + int *keyidx = getKeysFromCommand(c->cmd,c->argv,c->argc,&numkeys); + for (int j = 0; j < numkeys; j++) { + listIter li; + listNode *ln; + listRewind(u->patterns,&li); + + /* Test this key against every pattern. */ + int match = 0; + while((ln = listNext(&li))) { + sds pattern = listNodeValue(ln); + size_t plen = sdslen(pattern); + int idx = keyidx[j]; + if (stringmatchlen(pattern,plen,c->argv[idx]->ptr, + sdslen(c->argv[idx]->ptr),0)) + { + match = 1; + break; + } + } + if (!match) { + getKeysFreeResult(keyidx); + return ACL_DENIED_KEY; + } + } + getKeysFreeResult(keyidx); + } + + /* If we survived all the above checks, the user can execute the + * command. */ + return ACL_OK; +} + +/* ============================================================================= + * ACL loading / saving functions + * ==========================================================================*/ + +/* Given an argument vector describing a user in the form: + * + * user <username> ... ACL rules and flags ... + * + * this function validates, and if the syntax is valid, appends + * the user definition to a list for later loading. + * + * The rules are tested for validity and if there are obvious syntax errors + * the function returns C_ERR and does nothing, otherwise C_OK is returned + * and the user is appended to the list. + * + * Note that this function cannot stop in case of commands that are not found + * and, in that case, the error will be emitted later, because certain + * commands may be defined later once modules are loaded. + * + * When an error is detected and C_ERR is returned, the function populates + * by reference (if not set to NULL) the argc_err argument with the index + * of the argv vector that caused the error.
*/ +int ACLAppendUserForLoading(sds *argv, int argc, int *argc_err) { + if (argc < 2 || strcasecmp(argv[0],"user")) { + if (argc_err) *argc_err = 0; + return C_ERR; + } + + /* Try to apply the user rules in a fake user to see if they + * are actually valid. */ + user *fakeuser = ACLCreateUnlinkedUser(); + + for (int j = 2; j < argc; j++) { + if (ACLSetUser(fakeuser,argv[j],sdslen(argv[j])) == C_ERR) { + if (errno != ENOENT) { + ACLFreeUser(fakeuser); + if (argc_err) *argc_err = j; + return C_ERR; + } + } + } + + /* Rules look valid, let's append the user to the list. */ + sds *copy = zmalloc(sizeof(sds)*argc); + for (int j = 1; j < argc; j++) copy[j-1] = sdsdup(argv[j]); + copy[argc-1] = NULL; + listAddNodeTail(UsersToLoad,copy); + ACLFreeUser(fakeuser); + return C_OK; +} + +/* This function will load the configured users appended to the server + * configuration via ACLAppendUserForLoading(). On loading errors it will + * log an error and return C_ERR, otherwise C_OK will be returned. */ +int ACLLoadConfiguredUsers(void) { + listIter li; + listNode *ln; + listRewind(UsersToLoad,&li); + while ((ln = listNext(&li)) != NULL) { + sds *aclrules = listNodeValue(ln); + sds username = aclrules[0]; + user *u = ACLCreateUser(username,sdslen(username)); + if (!u) { + u = ACLGetUserByName(username,sdslen(username)); + serverAssert(u != NULL); + ACLSetUser(u,"reset",-1); + } + + /* Load every rule defined for this user. */ + for (int j = 1; aclrules[j]; j++) { + if (ACLSetUser(u,aclrules[j],sdslen(aclrules[j])) != C_OK) { + char *errmsg = ACLSetUserStringError(); + serverLog(LL_WARNING,"Error loading ACL rule '%s' for " + "the user named '%s': %s", + aclrules[j],aclrules[0],errmsg); + return C_ERR; + } + } + + /* Having a disabled user in the configuration may be an error, + * warn about it without returning any error to the caller. 
*/ + if (u->flags & USER_FLAG_DISABLED) { + serverLog(LL_NOTICE, "The user '%s' is disabled (there is no " + "'on' modifier in the user description). Make " + "sure this is not a configuration error.", + aclrules[0]); + } + } + return C_OK; +} + +/* This function loads the ACL from the specified filename: every line + * is validated and should be either empty or in the format used to specify + * users in the redis.conf configuration or in the ACL file, that is: + * + * user <username> ... rules ... + * + * Note that this function considers comments starting with '#' as errors + * because the ACL file is meant to be rewritten, and comments would be + * lost after the rewrite. Yet empty lines are allowed to avoid being too + * strict. + * + * One important part of implementing ACL LOAD, that uses this function, is + * to avoid ending with broken rules if the ACL file is invalid for some + * reason, so the function will attempt to validate the rules before loading + * each user. For every line that will be found broken the function will + * collect an error message. + * + * IMPORTANT: If there is at least a single error, nothing will be loaded + * and the rules will remain exactly as they were. + * + * At the end of the process, if no errors were found in the whole file then + * NULL is returned. Otherwise an SDS string describing in a single line + * a description of all the issues found is returned. */ +sds ACLLoadFromFile(const char *filename) { + FILE *fp; + char buf[1024]; + + /* Open the ACL file. */ + if ((fp = fopen(filename,"r")) == NULL) { + sds errors = sdscatprintf(sdsempty(), + "Error loading ACLs, opening file '%s': %s", + filename, strerror(errno)); + return errors; + } + + /* Load the whole file as a single string in memory. */ + sds acls = sdsempty(); + while(fgets(buf,sizeof(buf),fp) != NULL) + acls = sdscat(acls,buf); + fclose(fp); + + /* Split the file into lines and attempt to load each line.
*/ + int totlines; + sds *lines, errors = sdsempty(); + lines = sdssplitlen(acls,strlen(acls),"\n",1,&totlines); + sdsfree(acls); + + /* We need a fake user to validate the rules before making changes + * to the real user mentioned in the ACL line. */ + user *fakeuser = ACLCreateUnlinkedUser(); + + /* We do all the loading in a fresh instance of the Users radix tree, + * so if there are errors loading the ACL file we can roll back to the + * old version. */ + rax *old_users = Users; + user *old_default_user = DefaultUser; + Users = raxNew(); + ACLInitDefaultUser(); + + /* Load each line of the file. */ + for (int i = 0; i < totlines; i++) { + sds *argv; + int argc; + int linenum = i+1; + + lines[i] = sdstrim(lines[i]," \t\r\n"); + + /* Skip blank lines */ + if (lines[i][0] == '\0') continue; + + /* Split into arguments */ + argv = sdssplitargs(lines[i],&argc); + if (argv == NULL) { + errors = sdscatprintf(errors, + "%s:%d: unbalanced quotes in acl line. ", + server.acl_filename, linenum); + continue; + } + + /* Skip this line if the resulting command vector is empty. */ + if (argc == 0) { + sdsfreesplitres(argv,argc); + continue; + } + + /* The line should start with the "user" keyword. */ + if (strcmp(argv[0],"user") || argc < 2) { + errors = sdscatprintf(errors, + "%s:%d should start with user keyword followed " + "by the username. ", server.acl_filename, + linenum); + sdsfreesplitres(argv,argc); + continue; + } + + /* Try to process the line using the fake user to validate if + * the rules are able to apply cleanly. */ + ACLSetUser(fakeuser,"reset",-1); + int j; + for (j = 2; j < argc; j++) { + if (ACLSetUser(fakeuser,argv[j],sdslen(argv[j])) != C_OK) { + char *errmsg = ACLSetUserStringError(); + errors = sdscatprintf(errors, + "%s:%d: %s.
", + server.acl_filename, linenum, errmsg); + continue; + } + } + + /* Apply the rule to the new users set only if so far there + * are no errors, otherwise it's useless since we are going + * to discard the new users set anyway. */ + if (sdslen(errors) != 0) { + sdsfreesplitres(argv,argc); + continue; + } + + /* We can finally lookup the user and apply the rule. If the + * user already exists we always reset it to start. */ + user *u = ACLCreateUser(argv[1],sdslen(argv[1])); + if (!u) { + u = ACLGetUserByName(argv[1],sdslen(argv[1])); + serverAssert(u != NULL); + ACLSetUser(u,"reset",-1); + } + + /* Note that the same rules already applied to the fake user, so + * we just assert that everything goess well: it should. */ + for (j = 2; j < argc; j++) + serverAssert(ACLSetUser(u,argv[j],sdslen(argv[j])) == C_OK); + + sdsfreesplitres(argv,argc); + } + + ACLFreeUser(fakeuser); + sdsfreesplitres(lines,totlines); + DefaultUser = old_default_user; /* This pointer must never change. */ + + /* Check if we found errors and react accordingly. */ + if (sdslen(errors) == 0) { + /* The default user pointer is referenced in different places: instead + * of replacing such occurrences it is much simpler to copy the new + * default user configuration in the old one. */ + user *new = ACLGetUserByName("default",7); + serverAssert(new != NULL); + ACLCopyUser(DefaultUser,new); + ACLFreeUser(new); + raxInsert(Users,(unsigned char*)"default",7,DefaultUser,NULL); + raxRemove(old_users,(unsigned char*)"default",7,NULL); + ACLFreeUsersSet(old_users); + sdsfree(errors); + return NULL; + } else { + ACLFreeUsersSet(Users); + Users = old_users; + errors = sdscat(errors,"WARNING: ACL errors detected, no change to the previously active ACL rules was performed"); + return errors; + } +} + +/* Generate a copy of the ACLs currently in memory in the specified filename. + * Returns C_OK on success or C_ERR if there was an error during the I/O. 
+ * When C_ERR is returned a log is produced with hints about the issue. */ +int ACLSaveToFile(const char *filename) { + sds acl = sdsempty(); + int fd = -1; + sds tmpfilename = NULL; + int retval = C_ERR; + + /* Let's generate an SDS string containing the new version of the + * ACL file. */ + raxIterator ri; + raxStart(&ri,Users); + raxSeek(&ri,"^",NULL,0); + while(raxNext(&ri)) { + user *u = ri.data; + /* Return information in the configuration file format. */ + sds user = sdsnew("user "); + user = sdscatsds(user,u->name); + user = sdscatlen(user," ",1); + sds descr = ACLDescribeUser(u); + user = sdscatsds(user,descr); + sdsfree(descr); + acl = sdscatsds(acl,user); + acl = sdscatlen(acl,"\n",1); + sdsfree(user); + } + raxStop(&ri); + + /* Create a temp file with the new content. */ + tmpfilename = sdsnew(filename); + tmpfilename = sdscatfmt(tmpfilename,".tmp-%i-%I", + (int)getpid(),(int)mstime()); + if ((fd = open(tmpfilename,O_WRONLY|O_CREAT,0644)) == -1) { + serverLog(LL_WARNING,"Opening temp ACL file for ACL SAVE: %s", + strerror(errno)); + goto cleanup; + } + + /* Write it. */ + if (write(fd,acl,sdslen(acl)) != (ssize_t)sdslen(acl)) { + serverLog(LL_WARNING,"Writing ACL file for ACL SAVE: %s", + strerror(errno)); + goto cleanup; + } + close(fd); fd = -1; + + /* Let's replace the new file with the old one. */ + if (rename(tmpfilename,filename) == -1) { + serverLog(LL_WARNING,"Renaming ACL file for ACL SAVE: %s", + strerror(errno)); + goto cleanup; + } + sdsfree(tmpfilename); tmpfilename = NULL; + retval = C_OK; /* If we reached this point, everything is fine. */ + +cleanup: + if (fd != -1) close(fd); + if (tmpfilename) unlink(tmpfilename); + sdsfree(tmpfilename); + sdsfree(acl); + return retval; +} + +/* This function is called once the server is already running, modules are + * loaded, and we are ready to start, in order to load the ACLs either from + * the pending list of users defined in redis.conf, or from the ACL file. 
+ * The function will just exit with an error if the user is trying to mix + * both the loading methods. */ +void ACLLoadUsersAtStartup(void) { + if (server.acl_filename[0] != '\0' && listLength(UsersToLoad) != 0) { + serverLog(LL_WARNING, + "Configuring Redis with users defined in redis.conf and at " + "the same time setting an ACL file path is invalid. This setup " + "is very likely to lead to configuration errors and security " + "holes, please define either an ACL file or declare users " + "directly in your redis.conf, but not both."); + exit(1); + } + + if (ACLLoadConfiguredUsers() == C_ERR) { + serverLog(LL_WARNING, + "Critical error while loading ACLs. Exiting."); + exit(1); + } + + if (server.acl_filename[0] != '\0') { + sds errors = ACLLoadFromFile(server.acl_filename); + if (errors) { + serverLog(LL_WARNING, + "Aborting Redis startup because of ACL errors: %s", errors); + sdsfree(errors); + exit(1); + } + } +} + +/* ============================================================================= + * ACL related commands + * ==========================================================================*/ + +/* ACL -- show and modify the configuration of ACL users. + * ACL HELP + * ACL LOAD + * ACL LIST + * ACL USERS + * ACL CAT [<category>] + * ACL SETUSER <username> ... acl rules ... + * ACL DELUSER <username> [...] + * ACL GETUSER <username> + */ +void aclCommand(client *c) { + char *sub = c->argv[1]->ptr; + if (!strcasecmp(sub,"setuser") && c->argc >= 3) { + sds username = c->argv[2]->ptr; + /* Create a temporary user to validate and stage all changes against + * before applying to an existing user or creating a new user. If all + * arguments are valid the user parameters will all be applied together. + * If there are any errors then none of the changes will be applied.
*/ + user *tempu = ACLCreateUnlinkedUser(); + user *u = ACLGetUserByName(username,sdslen(username)); + if (u) ACLCopyUser(tempu, u); + + for (int j = 3; j < c->argc; j++) { + if (ACLSetUser(tempu,c->argv[j]->ptr,sdslen(c->argv[j]->ptr)) != C_OK) { + char *errmsg = ACLSetUserStringError(); + addReplyErrorFormat(c, + "Error in ACL SETUSER modifier '%s': %s", + (char*)c->argv[j]->ptr, errmsg); + + ACLFreeUser(tempu); + return; + } + } + + /* Overwrite the user with the temporary user we modified above. */ + if (!u) u = ACLCreateUser(username,sdslen(username)); + serverAssert(u != NULL); + ACLCopyUser(u, tempu); + ACLFreeUser(tempu); + addReply(c,shared.ok); + } else if (!strcasecmp(sub,"deluser") && c->argc >= 3) { + int deleted = 0; + for (int j = 2; j < c->argc; j++) { + sds username = c->argv[j]->ptr; + if (!strcmp(username,"default")) { + addReplyError(c,"The 'default' user cannot be removed"); + return; + } + } + + for (int j = 2; j < c->argc; j++) { + sds username = c->argv[j]->ptr; + user *u; + if (raxRemove(Users,(unsigned char*)username, + sdslen(username), + (void**)&u)) + { + ACLFreeUserAndKillClients(u); + deleted++; + } + } + addReplyLongLong(c,deleted); + } else if (!strcasecmp(sub,"getuser") && c->argc == 3) { + user *u = ACLGetUserByName(c->argv[2]->ptr,sdslen(c->argv[2]->ptr)); + if (u == NULL) { + addReplyNull(c); + return; + } + + addReplyMapLen(c,4); + + /* Flags */ + addReplyBulkCString(c,"flags"); + void *deflen = addReplyDeferredLen(c); + int numflags = 0; + for (int j = 0; ACLUserFlags[j].flag; j++) { + if (u->flags & ACLUserFlags[j].flag) { + addReplyBulkCString(c,ACLUserFlags[j].name); + numflags++; + } + } + setDeferredSetLen(c,deflen,numflags); + + /* Passwords */ + addReplyBulkCString(c,"passwords"); + addReplyArrayLen(c,listLength(u->passwords)); + listIter li; + listNode *ln; + listRewind(u->passwords,&li); + while((ln = listNext(&li))) { + sds thispass = listNodeValue(ln); + addReplyBulkCBuffer(c,thispass,sdslen(thispass)); + } + + /* 
Commands */ + addReplyBulkCString(c,"commands"); + sds cmddescr = ACLDescribeUserCommandRules(u); + addReplyBulkSds(c,cmddescr); + + /* Key patterns */ + addReplyBulkCString(c,"keys"); + if (u->flags & USER_FLAG_ALLKEYS) { + addReplyArrayLen(c,1); + addReplyBulkCBuffer(c,"*",1); + } else { + addReplyArrayLen(c,listLength(u->patterns)); + listIter li; + listNode *ln; + listRewind(u->patterns,&li); + while((ln = listNext(&li))) { + sds thispat = listNodeValue(ln); + addReplyBulkCBuffer(c,thispat,sdslen(thispat)); + } + } + } else if ((!strcasecmp(sub,"list") || !strcasecmp(sub,"users")) && + c->argc == 2) + { + int justnames = !strcasecmp(sub,"users"); + addReplyArrayLen(c,raxSize(Users)); + raxIterator ri; + raxStart(&ri,Users); + raxSeek(&ri,"^",NULL,0); + while(raxNext(&ri)) { + user *u = ri.data; + if (justnames) { + addReplyBulkCBuffer(c,u->name,sdslen(u->name)); + } else { + /* Return information in the configuration file format. */ + sds config = sdsnew("user "); + config = sdscatsds(config,u->name); + config = sdscatlen(config," ",1); + sds descr = ACLDescribeUser(u); + config = sdscatsds(config,descr); + sdsfree(descr); + addReplyBulkSds(c,config); + } + } + raxStop(&ri); + } else if (!strcasecmp(sub,"whoami") && c->argc == 2) { + if (c->user != NULL) { + addReplyBulkCBuffer(c,c->user->name,sdslen(c->user->name)); + } else { + addReplyNull(c); + } + } else if (server.acl_filename[0] == '\0' && + (!strcasecmp(sub,"load") || !strcasecmp(sub,"save"))) + { + addReplyError(c,"This Redis instance is not configured to use an ACL file. 
You may want to specify users via the ACL SETUSER command and then issue a CONFIG REWRITE (assuming you have a Redis configuration file set) in order to store users in the Redis configuration."); + return; + } else if (!strcasecmp(sub,"load") && c->argc == 2) { + sds errors = ACLLoadFromFile(server.acl_filename); + if (errors == NULL) { + addReply(c,shared.ok); + } else { + addReplyError(c,errors); + sdsfree(errors); + } + } else if (!strcasecmp(sub,"save") && c->argc == 2) { + if (ACLSaveToFile(server.acl_filename) == C_OK) { + addReply(c,shared.ok); + } else { + addReplyError(c,"There was an error trying to save the ACLs. " + "Please check the server logs for more " + "information"); + } + } else if (!strcasecmp(sub,"cat") && c->argc == 2) { + void *dl = addReplyDeferredLen(c); + int j; + for (j = 0; ACLCommandCategories[j].flag != 0; j++) + addReplyBulkCString(c,ACLCommandCategories[j].name); + setDeferredArrayLen(c,dl,j); + } else if (!strcasecmp(sub,"cat") && c->argc == 3) { + uint64_t cflag = ACLGetCommandCategoryFlagByName(c->argv[2]->ptr); + if (cflag == 0) { + addReplyErrorFormat(c, "Unknown category '%s'", (char*)c->argv[2]->ptr); + return; + } + int arraylen = 0; + void *dl = addReplyDeferredLen(c); + dictIterator *di = dictGetIterator(server.orig_commands); + dictEntry *de; + while ((de = dictNext(di)) != NULL) { + struct redisCommand *cmd = dictGetVal(de); + if (cmd->flags & CMD_MODULE) continue; + if (cmd->flags & cflag) { + addReplyBulkCString(c,cmd->name); + arraylen++; + } + } + dictReleaseIterator(di); + setDeferredArrayLen(c,dl,arraylen); + } else if (!strcasecmp(sub,"help")) { + const char *help[] = { +"LOAD -- Reload users from the ACL file.", +"LIST -- Show user details in config file format.", +"USERS -- List all the registered usernames.", +"SETUSER [attribs ...] -- Create or modify a user.", +"GETUSER -- Get the user details.", +"DELUSER [...] 
-- Delete a list of users.", +"CAT -- List available categories.", +"CAT <category> -- List commands inside category.", +"WHOAMI -- Return the current connection username.", +NULL + }; + addReplyHelp(c,help); + } else { + addReplySubcommandSyntaxError(c); + } +} + +void addReplyCommandCategories(client *c, struct redisCommand *cmd) { + int flagcount = 0; + void *flaglen = addReplyDeferredLen(c); + for (int j = 0; ACLCommandCategories[j].flag != 0; j++) { + if (cmd->flags & ACLCommandCategories[j].flag) { + addReplyStatusFormat(c, "@%s", ACLCommandCategories[j].name); + flagcount++; + } + } + setDeferredSetLen(c, flaglen, flagcount); +} + +/* AUTH <password> + * AUTH <username> <password> (Redis >= 6.0 form) + * + * When the user is omitted it means that we are trying to authenticate + * against the default user. */ +void authCommand(client *c) { + /* Only two or three argument forms are allowed. */ + if (c->argc > 3) { + addReply(c,shared.syntaxerr); + return; + } + + /* Handle the two different forms here. The form with two arguments + * will just use "default" as username. */ + robj *username, *password; + if (c->argc == 2) { + /* Mimic the old behavior of giving an error for the two + * arguments form if no password is configured. */ + if (DefaultUser->flags & USER_FLAG_NOPASS) { + addReplyError(c,"AUTH called without any password " + "configured for the default user. Are you sure " + "your configuration is correct?"); + return; + } + + username = createStringObject("default",7); + password = c->argv[1]; + } else { + username = c->argv[1]; + password = c->argv[2]; + } + + if (ACLAuthenticateUser(c,username,password) == C_OK) { + addReply(c,shared.ok); + } else { + addReplyError(c,"-WRONGPASS invalid username-password pair"); + } + + /* Free the "default" string object we created for the two + * arguments form.
*/ + if (c->argc == 2) decrRefCount(username); +} + diff --git a/src/ae.c b/src/ae.c index 1ea671569..53629ef77 100644 --- a/src/ae.c +++ b/src/ae.c @@ -351,8 +351,8 @@ static int processTimeEvents(aeEventLoop *eventLoop) { * if flags has AE_FILE_EVENTS set, file events are processed. * if flags has AE_TIME_EVENTS set, time events are processed. * if flags has AE_DONT_WAIT set the function returns ASAP until all - * if flags has AE_CALL_AFTER_SLEEP set, the aftersleep callback is called. * the events that's possible to process without to wait are processed. + * if flags has AE_CALL_AFTER_SLEEP set, the aftersleep callback is called. * * The function returns the number of events processed. */ int aeProcessEvents(aeEventLoop *eventLoop, int flags) diff --git a/src/aof.c b/src/aof.c index be416ec4e..46ae58324 100644 --- a/src/aof.c +++ b/src/aof.c @@ -204,7 +204,7 @@ void aof_background_fsync(int fd) { } /* Kills an AOFRW child process if exists */ -static void killAppendOnlyChild(void) { +void killAppendOnlyChild(void) { int statloc; /* No AOFRW child? return. */ if (server.aof_child_pid == -1) return; @@ -221,6 +221,8 @@ static void killAppendOnlyChild(void) { server.aof_rewrite_time_start = -1; /* Close pipes used for IPC between the two processes. */ aofClosePipes(); + closeChildInfoPipe(); + updateDictResizePolicy(); } /* Called when the user switches from "appendonly yes" to "appendonly no" @@ -645,6 +647,8 @@ struct client *createFakeClient(void) { c->obuf_soft_limit_reached_time = 0; c->watched_keys = listCreate(); c->peerid = NULL; + c->resp = 2; + c->user = NULL; listSetFreeMethod(c->reply,freeClientReplyValue); listSetDupMethod(c->reply,dupClientReplyValue); initClientMultiState(c); @@ -677,6 +681,7 @@ int loadAppendOnlyFile(char *filename) { int old_aof_state = server.aof_state; long loops = 0; off_t valid_up_to = 0; /* Offset of latest well-formed command loaded. */ + off_t valid_before_multi = 0; /* Offset before MULTI command loaded. 
*/ if (fp == NULL) { serverLog(LL_WARNING,"Fatal error: can't open the append log file for reading: %s",strerror(errno)); @@ -777,16 +782,28 @@ int loadAppendOnlyFile(char *filename) { /* Command lookup */ cmd = lookupCommand(argv[0]->ptr); if (!cmd) { - serverLog(LL_WARNING,"Unknown command '%s' reading the append only file", (char*)argv[0]->ptr); + serverLog(LL_WARNING, + "Unknown command '%s' reading the append only file", + (char*)argv[0]->ptr); exit(1); } + if (cmd == server.multiCommand) valid_before_multi = valid_up_to; + /* Run the command in the context of a fake client */ fakeClient->cmd = cmd; - cmd->proc(fakeClient); + if (fakeClient->flags & CLIENT_MULTI && + fakeClient->cmd->proc != execCommand) + { + queueMultiCommand(fakeClient); + } else { + cmd->proc(fakeClient); + } /* The fake client should not have a reply */ - serverAssert(fakeClient->bufpos == 0 && listLength(fakeClient->reply) == 0); + serverAssert(fakeClient->bufpos == 0 && + listLength(fakeClient->reply) == 0); + /* The fake client should never get blocked */ serverAssert((fakeClient->flags & CLIENT_BLOCKED) == 0); @@ -798,8 +815,15 @@ int loadAppendOnlyFile(char *filename) { } /* This point can only be reached when EOF is reached without errors. - * If the client is in the middle of a MULTI/EXEC, log error and quit. */ - if (fakeClient->flags & CLIENT_MULTI) goto uxeof; + * If the client is in the middle of a MULTI/EXEC, handle it as it was + * a short read, even if technically the protocol is correct: we want + * to remove the unprocessed tail and continue. */ + if (fakeClient->flags & CLIENT_MULTI) { + serverLog(LL_WARNING, + "Revert incomplete MULTI/EXEC transaction in AOF file"); + valid_up_to = valid_before_multi; + goto uxeof; + } loaded_ok: /* DB loaded, cleanup and return C_OK to the caller. */ fclose(fp); @@ -1119,25 +1143,47 @@ int rewriteStreamObject(rio *r, robj *key, robj *o) { streamID id; int64_t numfields; - /* Reconstruct the stream data using XADD commands. 
*/ - while(streamIteratorGetID(&si,&id,&numfields)) { - /* Emit a two elements array for each item. The first is - * the ID, the second is an array of field-value pairs. */ + if (s->length) { + /* Reconstruct the stream data using XADD commands. */ + while(streamIteratorGetID(&si,&id,&numfields)) { + /* Emit a two elements array for each item. The first is + * the ID, the second is an array of field-value pairs. */ - /* Emit the XADD ...fields... command. */ - if (rioWriteBulkCount(r,'*',3+numfields*2) == 0) return 0; + /* Emit the XADD ...fields... command. */ + if (rioWriteBulkCount(r,'*',3+numfields*2) == 0) return 0; + if (rioWriteBulkString(r,"XADD",4) == 0) return 0; + if (rioWriteBulkObject(r,key) == 0) return 0; + if (rioWriteBulkStreamID(r,&id) == 0) return 0; + while(numfields--) { + unsigned char *field, *value; + int64_t field_len, value_len; + streamIteratorGetField(&si,&field,&value,&field_len,&value_len); + if (rioWriteBulkString(r,(char*)field,field_len) == 0) return 0; + if (rioWriteBulkString(r,(char*)value,value_len) == 0) return 0; + } + } + } else { + /* Use the XADD MAXLEN 0 trick to generate an empty stream if + * the key we are serializing is an empty stream, which is possible + * for the Stream type. 
*/ + if (rioWriteBulkCount(r,'*',7) == 0) return 0; if (rioWriteBulkString(r,"XADD",4) == 0) return 0; if (rioWriteBulkObject(r,key) == 0) return 0; - if (rioWriteBulkStreamID(r,&id) == 0) return 0; - while(numfields--) { - unsigned char *field, *value; - int64_t field_len, value_len; - streamIteratorGetField(&si,&field,&value,&field_len,&value_len); - if (rioWriteBulkString(r,(char*)field,field_len) == 0) return 0; - if (rioWriteBulkString(r,(char*)value,value_len) == 0) return 0; - } + if (rioWriteBulkString(r,"MAXLEN",6) == 0) return 0; + if (rioWriteBulkString(r,"0",1) == 0) return 0; + if (rioWriteBulkStreamID(r,&s->last_id) == 0) return 0; + if (rioWriteBulkString(r,"x",1) == 0) return 0; + if (rioWriteBulkString(r,"y",1) == 0) return 0; } + /* Append XSETID after XADD, make sure lastid is correct, + * in case of XDEL lastid. */ + if (rioWriteBulkCount(r,'*',3) == 0) return 0; + if (rioWriteBulkString(r,"XSETID",6) == 0) return 0; + if (rioWriteBulkObject(r,key) == 0) return 0; + if (rioWriteBulkStreamID(r,&s->last_id) == 0) return 0; + + /* Create all the stream consumer groups. */ if (s->cgroups) { raxIterator ri; diff --git a/src/atomicvar.h b/src/atomicvar.h index 173b045fc..160056cd7 100644 --- a/src/atomicvar.h +++ b/src/atomicvar.h @@ -1,7 +1,7 @@ /* This file implements atomic counters using __atomic or __sync macros if * available, otherwise synchronizing different threads using a mutex. * - * The exported interaface is composed of three macros: + * The exported interface is composed of three macros: * * atomicIncr(var,count) -- Increment the atomic counter * atomicGetIncr(var,oldvalue_var,count) -- Get and increment the atomic counter diff --git a/src/bio.c b/src/bio.c index 0c92d053b..2af684570 100644 --- a/src/bio.c +++ b/src/bio.c @@ -17,7 +17,7 @@ * * The design is trivial, we have a structure representing a job to perform * and a different thread and job queue for every job type. 
- * Every thread wait for new jobs in its queue, and process every job + * Every thread waits for new jobs in its queue, and processes every job * sequentially. * * Jobs of the same type are guaranteed to be processed from the least @@ -204,14 +204,14 @@ void *bioProcessBackgroundJobs(void *arg) { } zfree(job); - /* Unblock threads blocked on bioWaitStepOfType() if any. */ - pthread_cond_broadcast(&bio_step_cond[type]); - /* Lock again before reiterating the loop, if there are no longer * jobs to process we'll block again in pthread_cond_wait(). */ pthread_mutex_lock(&bio_mutex[type]); listDelNode(bio_jobs[type],ln); bio_pending[type]--; + + /* Unblock threads blocked on bioWaitStepOfType() if any. */ + pthread_cond_broadcast(&bio_step_cond[type]); } } diff --git a/src/bitops.c b/src/bitops.c index 23f2266a7..8d03a7699 100644 --- a/src/bitops.c +++ b/src/bitops.c @@ -1002,7 +1002,7 @@ void bitfieldCommand(client *c) { highest_write_offset)) == NULL) return; } - addReplyMultiBulkLen(c,numops); + addReplyArrayLen(c,numops); /* Actually process the operations. */ for (j = 0; j < numops; j++) { @@ -1047,7 +1047,7 @@ void bitfieldCommand(client *c) { setSignedBitfield(o->ptr,thisop->offset, thisop->bits,newval); } else { - addReply(c,shared.nullbulk); + addReplyNull(c); } } else { uint64_t oldval, newval, wrapped, retval; @@ -1076,7 +1076,7 @@ void bitfieldCommand(client *c) { setUnsignedBitfield(o->ptr,thisop->offset, thisop->bits,newval); } else { - addReply(c,shared.nullbulk); + addReplyNull(c); } } changes++; diff --git a/src/blocked.c b/src/blocked.c index 4a667501f..f9e196626 100644 --- a/src/blocked.c +++ b/src/blocked.c @@ -126,12 +126,37 @@ void processUnblockedClients(void) { * the code is conceptually more correct this way. */ if (!(c->flags & CLIENT_BLOCKED)) { if (c->querybuf && sdslen(c->querybuf) > 0) { - processInputBuffer(c); + processInputBufferAndReplicate(c); } } } } +/* This function will schedule the client for reprocessing at a safe time. 
+ * + * This is useful when a client was blocked for some reason (blocking operation, + * CLIENT PAUSE, or whatever), because it may end with some accumulated query + * buffer that needs to be processed ASAP: + * + * 1. When a client is blocked, its readable handler is still active. + * 2. However in this case it only gets data into the query buffer, but the + * query is not parsed or executed once there is enough to proceed as + * usual (because the client is blocked... so we can't execute commands). + * 3. When the client is unblocked, without this function, the client would + * have to write some query in order for the readable handler to finally + * call processQueryBuffer*() on it. + * 4. With this function instead we can put the client in a queue that will + * process it for queries ready to be executed at a safe time. + */ +void queueClientForReprocessing(client *c) { + /* The client may already be into the unblocked list because of a previous + * blocking operation, don't add it back into the list multiple times. */ + if (!(c->flags & CLIENT_UNBLOCKED)) { + c->flags |= CLIENT_UNBLOCKED; + listAddNodeTail(server.unblocked_clients,c); + } +} + /* Unblock a client calling the right function depending on the kind * of operation the client is blocking for. */ void unblockClient(client *c) { @@ -152,12 +177,7 @@ void unblockClient(client *c) { server.blocked_clients_by_type[c->btype]--; c->flags &= ~CLIENT_BLOCKED; c->btype = BLOCKED_NONE; - /* The client may already be into the unblocked list because of a previous - * blocking operation, don't add back it into the list multiple times. 
*/ - if (!(c->flags & CLIENT_UNBLOCKED)) { - c->flags |= CLIENT_UNBLOCKED; - listAddNodeTail(server.unblocked_clients,c); - } + queueClientForReprocessing(c); } /* This function gets called when a blocked client timed out in order to @@ -167,7 +187,7 @@ void replyToBlockedClientTimedOut(client *c) { if (c->btype == BLOCKED_LIST || c->btype == BLOCKED_ZSET || c->btype == BLOCKED_STREAM) { - addReply(c,shared.nullmultibulk); + addReplyNullArray(c); } else if (c->btype == BLOCKED_WAIT) { addReplyLongLong(c,replicationCountAcksByOffset(c->bpop.reploffset)); } else if (c->btype == BLOCKED_MODULE) { @@ -195,7 +215,7 @@ void disconnectAllBlockedClients(void) { if (c->flags & CLIENT_BLOCKED) { addReplySds(c,sdsnew( "-UNBLOCKED force unblock from blocking operation, " - "instance state changed (master -> slave?)\r\n")); + "instance state changed (master -> replica?)\r\n")); unblockClient(c); c->flags |= CLIENT_CLOSE_AFTER_REPLY; } @@ -269,7 +289,7 @@ void handleClientsBlockedOnKeys(void) { robj *dstkey = receiver->bpop.target; int where = (receiver->lastcmd && receiver->lastcmd->proc == blpopCommand) ? - LIST_HEAD : LIST_TAIL; + LIST_HEAD : LIST_TAIL; robj *value = listTypePop(o,where); if (value) { @@ -285,7 +305,7 @@ void handleClientsBlockedOnKeys(void) { { /* If we failed serving the client we need * to also undo the POP operation. */ - listTypePush(o,value,where); + listTypePush(o,value,where); } if (dstkey) decrRefCount(dstkey); @@ -416,8 +436,12 @@ void handleClientsBlockedOnKeys(void) { * the name of the stream and the data we * extracted from it. Wrapped in a single-item * array, since we have just one key. 
*/ - addReplyMultiBulkLen(receiver,1); - addReplyMultiBulkLen(receiver,2); + if (receiver->resp == 2) { + addReplyArrayLen(receiver,1); + addReplyArrayLen(receiver,2); + } else { + addReplyMapLen(receiver,1); + } addReplyBulk(receiver,rl->key); streamPropInfo pi = { diff --git a/src/cluster.c b/src/cluster.c index e568f68a6..1a3a348b5 100644 --- a/src/cluster.c +++ b/src/cluster.c @@ -1230,7 +1230,7 @@ void clearNodeFailureIfNeeded(clusterNode *node) { serverLog(LL_NOTICE, "Clear FAIL state for node %.40s: %s is reachable again.", node->name, - nodeIsSlave(node) ? "slave" : "master without slots"); + nodeIsSlave(node) ? "replica" : "master without slots"); node->flags &= ~CLUSTER_NODE_FAIL; clusterDoBeforeSleep(CLUSTER_TODO_UPDATE_STATE|CLUSTER_TODO_SAVE_CONFIG); } @@ -1589,6 +1589,12 @@ void clusterUpdateSlotsConfigWith(clusterNode *sender, uint64_t senderConfigEpoc } } + /* After updating the slots configuration, don't do any actual change + * in the state of the server if a module disabled Redis Cluster + * keys redirections. */ + if (server.cluster_module_flags & CLUSTER_MODULE_FLAG_NO_REDIRECTION) + return; + /* If at least one slot was reassigned from a node to another node * with a greater configEpoch, it is possible that: * 1) We are a master left without slots. This means that we were @@ -2059,7 +2065,7 @@ int clusterProcessPacket(clusterLink *link) { server.cluster->mf_end = mstime() + CLUSTER_MF_TIMEOUT; server.cluster->mf_slave = sender; pauseClients(mstime()+(CLUSTER_MF_TIMEOUT*2)); - serverLog(LL_WARNING,"Manual failover requested by slave %.40s.", + serverLog(LL_WARNING,"Manual failover requested by replica %.40s.", sender->name); } else if (type == CLUSTERMSG_TYPE_UPDATE) { clusterNode *n; /* The node the update is about. */ @@ -2873,7 +2879,7 @@ void clusterLogCantFailover(int reason) { switch(reason) { case CLUSTER_CANT_FAILOVER_DATA_AGE: msg = "Disconnected from master for longer than allowed. 
" - "Please check the 'cluster-slave-validity-factor' configuration " + "Please check the 'cluster-replica-validity-factor' configuration " "option."; break; case CLUSTER_CANT_FAILOVER_WAITING_DELAY: @@ -3054,7 +3060,7 @@ void clusterHandleSlaveFailover(void) { server.cluster->failover_auth_time += added_delay; server.cluster->failover_auth_rank = newrank; serverLog(LL_WARNING, - "Slave rank updated to #%d, added %lld milliseconds of delay.", + "Replica rank updated to #%d, added %lld milliseconds of delay.", newrank, added_delay); } } @@ -3210,7 +3216,8 @@ void clusterHandleSlaveMigration(int max_slaves) { * the natural slaves of this instance to advertise their switch from * the old master to the new one. */ if (target && candidate == myself && - (mstime()-target->orphaned_time) > CLUSTER_SLAVE_MIGRATION_DELAY) + (mstime()-target->orphaned_time) > CLUSTER_SLAVE_MIGRATION_DELAY && + !(server.cluster_module_flags & CLUSTER_MODULE_FLAG_NO_FAILOVER)) { serverLog(LL_WARNING,"Migrating to orphaned master %.40s", target->name); @@ -3321,14 +3328,18 @@ void clusterCron(void) { int changed = 0; if (prev_ip == NULL && curr_ip != NULL) changed = 1; - if (prev_ip != NULL && curr_ip == NULL) changed = 1; - if (prev_ip && curr_ip && strcmp(prev_ip,curr_ip)) changed = 1; + else if (prev_ip != NULL && curr_ip == NULL) changed = 1; + else if (prev_ip && curr_ip && strcmp(prev_ip,curr_ip)) changed = 1; if (changed) { + if (prev_ip) zfree(prev_ip); prev_ip = curr_ip; - if (prev_ip) prev_ip = zstrdup(prev_ip); if (curr_ip) { + /* We always take a copy of the previous IP address, by + * duplicating the string. This way later we can check if + * the address really changed. 
*/ + prev_ip = zstrdup(prev_ip); strncpy(myself->ip,server.cluster_announce_ip,NET_IP_STR_LEN); myself->ip[NET_IP_STR_LEN-1] = '\0'; } else { @@ -3559,7 +3570,8 @@ void clusterCron(void) { if (nodeIsSlave(myself)) { clusterHandleManualFailover(); - clusterHandleSlaveFailover(); + if (!(server.cluster_module_flags & CLUSTER_MODULE_FLAG_NO_FAILOVER)) + clusterHandleSlaveFailover(); /* If there are orphaned slaves, and we are a slave among the masters * with the max number of non-failing slaves, consider migrating to * the orphaned masters. Note that it does not make sense to try @@ -3865,6 +3877,11 @@ int verifyClusterConfigWithData(void) { int j; int update_config = 0; + /* Return ASAP if a module disabled cluster redirections. In that case + * every master can store keys about every possible hash slot. */ + if (server.cluster_module_flags & CLUSTER_MODULE_FLAG_NO_REDIRECTION) + return C_OK; + /* If this node is a slave, don't perform the check at all as we * completely depend on the replication stream. */ if (nodeIsSlave(myself)) return C_OK; @@ -4109,7 +4126,7 @@ void clusterReplyMultiBulkSlots(client *c) { */ int num_masters = 0; - void *slot_replylen = addDeferredMultiBulkLength(c); + void *slot_replylen = addReplyDeferredLen(c); dictEntry *de; dictIterator *di = dictGetSafeIterator(server.cluster->nodes); @@ -4129,7 +4146,7 @@ void clusterReplyMultiBulkSlots(client *c) { } if (start != -1 && (!bit || j == CLUSTER_SLOTS-1)) { int nested_elements = 3; /* slots (2) + master addr (1). 
*/ - void *nested_replylen = addDeferredMultiBulkLength(c); + void *nested_replylen = addReplyDeferredLen(c); if (bit && j == CLUSTER_SLOTS-1) j++; @@ -4145,7 +4162,7 @@ void clusterReplyMultiBulkSlots(client *c) { start = -1; /* First node reply position is always the master */ - addReplyMultiBulkLen(c, 3); + addReplyArrayLen(c, 3); addReplyBulkCString(c, node->ip); addReplyLongLong(c, node->port); addReplyBulkCBuffer(c, node->name, CLUSTER_NAMELEN); @@ -4155,19 +4172,19 @@ void clusterReplyMultiBulkSlots(client *c) { /* This loop is copy/pasted from clusterGenNodeDescription() * with modifications for per-slot node aggregation */ if (nodeFailed(node->slaves[i])) continue; - addReplyMultiBulkLen(c, 3); + addReplyArrayLen(c, 3); addReplyBulkCString(c, node->slaves[i]->ip); addReplyLongLong(c, node->slaves[i]->port); addReplyBulkCBuffer(c, node->slaves[i]->name, CLUSTER_NAMELEN); nested_elements++; } - setDeferredMultiBulkLength(c, nested_replylen, nested_elements); + setDeferredArrayLen(c, nested_replylen, nested_elements); num_masters++; } } } dictReleaseIterator(di); - setDeferredMultiBulkLength(c, slot_replylen, num_masters); + setDeferredArrayLen(c, slot_replylen, num_masters); } void clusterCommand(client *c) { @@ -4183,7 +4200,7 @@ void clusterCommand(client *c) { "COUNT-failure-reports -- Return number of failure reports for .", "COUNTKEYSINSLOT - Return the number of keys in .", "DELSLOTS [slot ...] -- Delete slots information from current node.", -"FAILOVER [force|takeover] -- Promote current slave node to being a master.", +"FAILOVER [force|takeover] -- Promote current replica node to being a master.", "FORGET -- Remove a node from the cluster.", "GETKEYSINSLOT -- Return key names stored by current node in a slot.", "FLUSHSLOTS -- Delete current node own slots information.", @@ -4193,11 +4210,11 @@ void clusterCommand(client *c) { "MYID -- Return the node id.", "NODES -- Return cluster configuration seen by node. Output format:", " ... 
", -"REPLICATE -- Configure current node as slave to .", +"REPLICATE -- Configure current node as replica to .", "RESET [hard|soft] -- Reset current node (default: soft).", "SET-config-epoch - Set config epoch of current node.", "SETSLOT (importing|migrating|stable|node ) -- Set slot state.", -"SLAVES -- Return slaves.", +"REPLICAS -- Return replicas.", "SLOTS -- Return information about slots range mappings. Each range is made of:", " start, end, master and replicas IP addresses, ports and ids", NULL @@ -4531,7 +4548,7 @@ NULL keys = zmalloc(sizeof(robj*)*maxkeys); numkeys = getKeysInSlot(slot, keys, maxkeys); - addReplyMultiBulkLen(c,numkeys); + addReplyArrayLen(c,numkeys); for (j = 0; j < numkeys; j++) { addReplyBulk(c,keys[j]); decrRefCount(keys[j]); @@ -4574,7 +4591,7 @@ NULL /* Can't replicate a slave. */ if (nodeIsSlave(n)) { - addReplyError(c,"I can only replicate a master, not a slave."); + addReplyError(c,"I can only replicate a master, not a replica."); return; } @@ -4593,7 +4610,8 @@ NULL clusterSetMaster(n); clusterDoBeforeSleep(CLUSTER_TODO_UPDATE_STATE|CLUSTER_TODO_SAVE_CONFIG); addReply(c,shared.ok); - } else if (!strcasecmp(c->argv[1]->ptr,"slaves") && c->argc == 3) { + } else if ((!strcasecmp(c->argv[1]->ptr,"slaves") || + !strcasecmp(c->argv[1]->ptr,"replicas")) && c->argc == 3) { /* CLUSTER SLAVES */ clusterNode *n = clusterLookupNode(c->argv[2]->ptr); int j; @@ -4609,7 +4627,7 @@ NULL return; } - addReplyMultiBulkLen(c,n->numslaves); + addReplyArrayLen(c,n->numslaves); for (j = 0; j < n->numslaves; j++) { sds ni = clusterGenNodeDescription(n->slaves[j]); addReplyBulkCString(c,ni); @@ -4647,10 +4665,10 @@ NULL /* Check preconditions. 
*/ if (nodeIsMaster(myself)) { - addReplyError(c,"You should send CLUSTER FAILOVER to a slave"); + addReplyError(c,"You should send CLUSTER FAILOVER to a replica"); return; } else if (myself->slaveof == NULL) { - addReplyError(c,"I'm a slave but my master is unknown to me"); + addReplyError(c,"I'm a replica but my master is unknown to me"); return; } else if (!force && (nodeFailed(myself->slaveof) || @@ -4818,7 +4836,7 @@ void dumpCommand(client *c) { /* Check if the key is here. */ if ((o = lookupKeyRead(c->db,c->argv[1])) == NULL) { - addReply(c,shared.nullbulk); + addReplyNull(c); return; } @@ -5146,6 +5164,11 @@ try_again: serverAssertWithInfo(c,NULL,rioWriteBulkLongLong(&cmd,dbid)); } + int non_expired = 0; /* Number of keys that we'll find non expired. + Note that serializing large keys may take some time + so certain keys that were found non expired by the + lookupKey() function, may be expired later. */ + /* Create RESTORE payload and generate the protocol to call the command. */ for (j = 0; j < num_keys; j++) { long long ttl = 0; @@ -5153,8 +5176,17 @@ try_again: if (expireat != -1) { ttl = expireat-mstime(); + if (ttl < 0) { + continue; + } if (ttl < 1) ttl = 1; } + + /* Relocate valid (non expired) keys into the array in successive + * positions to remove holes created by the keys that were present + * in the first lookup but are now expired after the second lookup. */ + kv[non_expired++] = kv[j]; + serverAssertWithInfo(c,NULL, rioWriteBulkCount(&cmd,'*',replace ? 5 : 4)); @@ -5182,6 +5214,9 @@ try_again: serverAssertWithInfo(c,NULL,rioWriteBulkString(&cmd,"REPLACE",7)); } + /* Fix the actual number of keys we are migrating. */ + num_keys = non_expired; + /* Transfer the query to the other node in 64K chunks. */ errno = 0; { @@ -5217,6 +5252,10 @@ try_again: int socket_error = 0; int del_idx = 1; /* Index of the key argument for the replicated DEL op. 
*/ + /* Allocate the new argument vector that will replace the current command, + * to propagate the MIGRATE as a DEL command (if no COPY option was given). + * We allocate num_keys+1 because the additional argument is for "DEL" + * command name itself. */ if (!copy) newargv = zmalloc(sizeof(robj*)*(num_keys+1)); for (j = 0; j < num_keys; j++) { @@ -5417,9 +5456,17 @@ clusterNode *getNodeByQuery(client *c, struct redisCommand *cmd, robj **argv, in multiCmd mc; int i, slot = 0, migrating_slot = 0, importing_slot = 0, missing_keys = 0; + /* Allow any key to be set if a module disabled cluster redirections. */ + if (server.cluster_module_flags & CLUSTER_MODULE_FLAG_NO_REDIRECTION) + return myself; + /* Set error code optimistically for the base case. */ if (error_code) *error_code = CLUSTER_REDIR_NONE; + /* Modules can turn off Redis Cluster redirection: this is useful + * when writing a module that implements a completely different + * distributed system. */ + /* We handle all the cases as if they were EXEC commands, so we have * a common code path for everything */ if (cmd->proc == execCommand) { diff --git a/src/cluster.h b/src/cluster.h index 6f9954d24..571b9c543 100644 --- a/src/cluster.h +++ b/src/cluster.h @@ -100,6 +100,13 @@ typedef struct clusterLink { #define CLUSTERMSG_TYPE_MODULE 9 /* Module cluster API message. */ #define CLUSTERMSG_TYPE_COUNT 10 /* Total number of message types. */ +/* Flags that a module can set in order to prevent certain Redis Cluster + * features to be enabled. Useful when implementing a different distributed + * system on top of Redis Cluster message bus, using modules. */ +#define CLUSTER_MODULE_FLAG_NONE 0 +#define CLUSTER_MODULE_FLAG_NO_FAILOVER (1<<1) +#define CLUSTER_MODULE_FLAG_NO_REDIRECTION (1<<2) + /* This structure represent elements of node->fail_reports. */ typedef struct clusterNodeFailReport { struct clusterNode *node; /* Node reporting the failure condition. 
*/ diff --git a/src/config.c b/src/config.c index 54494c8e1..8fe5cdbb7 100644 --- a/src/config.c +++ b/src/config.c @@ -120,7 +120,7 @@ const char *configEnumGetName(configEnum *ce, int val) { return NULL; } -/* Wrapper for configEnumGetName() returning "unknown" insetad of NULL if +/* Wrapper for configEnumGetName() returning "unknown" instead of NULL if * there is no match. */ const char *configEnumGetNameOrUnknown(configEnum *ce, int val) { const char *name = configEnumGetName(ce,val); @@ -216,6 +216,10 @@ void loadServerConfigFromString(char *config) { if ((server.protected_mode = yesnotoi(argv[1])) == -1) { err = "argument must be 'yes' or 'no'"; goto loaderr; } + } else if (!strcasecmp(argv[0],"gopher-enabled") && argc == 2) { + if ((server.gopher_enabled = yesnotoi(argv[1])) == -1) { + err = "argument must be 'yes' or 'no'"; goto loaderr; + } } else if (!strcasecmp(argv[0],"port") && argc == 2) { server.port = atoi(argv[1]); if (server.port < 0 || server.port > 65535) { @@ -283,6 +287,9 @@ void loadServerConfigFromString(char *config) { } fclose(logfp); } + } else if (!strcasecmp(argv[0],"aclfile") && argc == 2) { + zfree(server.acl_filename); + server.acl_filename = zstrdup(argv[1]); } else if (!strcasecmp(argv[0],"always-show-logo") && argc == 2) { if ((server.always_show_logo = yesnotoi(argv[1])) == -1) { err = "argument must be 'yes' or 'no'"; goto loaderr; @@ -344,15 +351,19 @@ void loadServerConfigFromString(char *config) { err = "lfu-decay-time must be 0 or greater"; goto loaderr; } - } else if (!strcasecmp(argv[0],"slaveof") && argc == 3) { + } else if ((!strcasecmp(argv[0],"slaveof") || + !strcasecmp(argv[0],"replicaof")) && argc == 3) { slaveof_linenum = linenum; server.masterhost = sdsnew(argv[1]); server.masterport = atoi(argv[2]); server.repl_state = REPL_STATE_CONNECT; - } else if (!strcasecmp(argv[0],"repl-ping-slave-period") && argc == 2) { + } else if ((!strcasecmp(argv[0],"repl-ping-slave-period") || + 
!strcasecmp(argv[0],"repl-ping-replica-period")) && + argc == 2) + { server.repl_ping_slave_period = atoi(argv[1]); if (server.repl_ping_slave_period <= 0) { - err = "repl-ping-slave-period must be 1 or greater"; + err = "repl-ping-replica-period must be 1 or greater"; goto loaderr; } } else if (!strcasecmp(argv[0],"repl-timeout") && argc == 2) { @@ -388,17 +399,33 @@ void loadServerConfigFromString(char *config) { err = "repl-backlog-ttl can't be negative "; goto loaderr; } + } else if (!strcasecmp(argv[0],"masteruser") && argc == 2) { + zfree(server.masteruser); + server.masteruser = argv[1][0] ? zstrdup(argv[1]) : NULL; } else if (!strcasecmp(argv[0],"masterauth") && argc == 2) { zfree(server.masterauth); server.masterauth = argv[1][0] ? zstrdup(argv[1]) : NULL; - } else if (!strcasecmp(argv[0],"slave-serve-stale-data") && argc == 2) { + } else if ((!strcasecmp(argv[0],"slave-serve-stale-data") || + !strcasecmp(argv[0],"replica-serve-stale-data")) + && argc == 2) + { if ((server.repl_serve_stale_data = yesnotoi(argv[1])) == -1) { err = "argument must be 'yes' or 'no'"; goto loaderr; } - } else if (!strcasecmp(argv[0],"slave-read-only") && argc == 2) { + } else if ((!strcasecmp(argv[0],"slave-read-only") || + !strcasecmp(argv[0],"replica-read-only")) + && argc == 2) + { if ((server.repl_slave_ro = yesnotoi(argv[1])) == -1) { err = "argument must be 'yes' or 'no'"; goto loaderr; } + } else if ((!strcasecmp(argv[0],"slave-ignore-maxmemory") || + !strcasecmp(argv[0],"replica-ignore-maxmemory")) + && argc == 2) + { + if ((server.repl_slave_ignore_maxmemory = yesnotoi(argv[1])) == -1) { + err = "argument must be 'yes' or 'no'"; goto loaderr; + } } else if (!strcasecmp(argv[0],"rdbcompression") && argc == 2) { if ((server.rdb_compression = yesnotoi(argv[1])) == -1) { err = "argument must be 'yes' or 'no'"; goto loaderr; @@ -423,7 +450,9 @@ void loadServerConfigFromString(char *config) { if ((server.lazyfree_lazy_server_del = yesnotoi(argv[1])) == -1) { err = "argument 
must be 'yes' or 'no'"; goto loaderr; } - } else if (!strcasecmp(argv[0],"slave-lazy-flush") && argc == 2) { + } else if ((!strcasecmp(argv[0],"slave-lazy-flush") || + !strcasecmp(argv[0],"replica-lazy-flush")) && argc == 2) + { if ((server.repl_slave_lazy_flush = yesnotoi(argv[1])) == -1) { err = "argument must be 'yes' or 'no'"; goto loaderr; } @@ -440,10 +469,14 @@ void loadServerConfigFromString(char *config) { if ((server.daemonize = yesnotoi(argv[1])) == -1) { err = "argument must be 'yes' or 'no'"; goto loaderr; } + } else if (!strcasecmp(argv[0],"dynamic-hz") && argc == 2) { + if ((server.dynamic_hz = yesnotoi(argv[1])) == -1) { + err = "argument must be 'yes' or 'no'"; goto loaderr; + } } else if (!strcasecmp(argv[0],"hz") && argc == 2) { - server.hz = atoi(argv[1]); - if (server.hz < CONFIG_MIN_HZ) server.hz = CONFIG_MIN_HZ; - if (server.hz > CONFIG_MAX_HZ) server.hz = CONFIG_MAX_HZ; + server.config_hz = atoi(argv[1]); + if (server.config_hz < CONFIG_MIN_HZ) server.config_hz = CONFIG_MIN_HZ; + if (server.config_hz > CONFIG_MAX_HZ) server.config_hz = CONFIG_MAX_HZ; } else if (!strcasecmp(argv[0],"appendonly") && argc == 2) { int yes; @@ -508,7 +541,12 @@ void loadServerConfigFromString(char *config) { err = "Password is longer than CONFIG_AUTHPASS_MAX_LEN"; goto loaderr; } - server.requirepass = argv[1][0] ? zstrdup(argv[1]) : NULL; + /* The old "requirepass" directive just translates to setting + * a password to the default user. 
*/ + ACLSetUser(DefaultUser,"resetpass",-1); + sds aclop = sdscatprintf(sdsempty(),">%s",argv[1]); + ACLSetUser(DefaultUser,aclop,sdslen(aclop)); + sdsfree(aclop); } else if (!strcasecmp(argv[0],"pidfile") && argc == 2) { zfree(server.pidfile); server.pidfile = zstrdup(argv[1]); @@ -651,15 +689,17 @@ void loadServerConfigFromString(char *config) { err = "cluster migration barrier must zero or positive"; goto loaderr; } - } else if (!strcasecmp(argv[0],"cluster-slave-validity-factor") + } else if ((!strcasecmp(argv[0],"cluster-slave-validity-factor") || + !strcasecmp(argv[0],"cluster-replica-validity-factor")) && argc == 2) { server.cluster_slave_validity_factor = atoi(argv[1]); if (server.cluster_slave_validity_factor < 0) { - err = "cluster slave validity factor must be zero or positive"; + err = "cluster replica validity factor must be zero or positive"; goto loaderr; } - } else if (!strcasecmp(argv[0],"cluster-slave-no-failover") && + } else if ((!strcasecmp(argv[0],"cluster-slave-no-failover") || + !strcasecmp(argv[0],"cluster-replica-no-failover")) && argc == 2) { server.cluster_slave_no_failover = yesnotoi(argv[1]); @@ -669,6 +709,8 @@ void loadServerConfigFromString(char *config) { } } else if (!strcasecmp(argv[0],"lua-time-limit") && argc == 2) { server.lua_time_limit = strtoll(argv[1],NULL,10); + } else if (!strcasecmp(argv[0],"lua-replicate-commands") && argc == 2) { + server.lua_always_replicate_commands = yesnotoi(argv[1]); } else if (!strcasecmp(argv[0],"slowlog-log-slower-than") && argc == 2) { @@ -710,27 +752,37 @@ void loadServerConfigFromString(char *config) { if ((server.stop_writes_on_bgsave_err = yesnotoi(argv[1])) == -1) { err = "argument must be 'yes' or 'no'"; goto loaderr; } - } else if (!strcasecmp(argv[0],"slave-priority") && argc == 2) { + } else if ((!strcasecmp(argv[0],"slave-priority") || + !strcasecmp(argv[0],"replica-priority")) && argc == 2) + { server.slave_priority = atoi(argv[1]); - } else if 
(!strcasecmp(argv[0],"slave-announce-ip") && argc == 2) { + } else if ((!strcasecmp(argv[0],"slave-announce-ip") || + !strcasecmp(argv[0],"replica-announce-ip")) && argc == 2) + { zfree(server.slave_announce_ip); server.slave_announce_ip = zstrdup(argv[1]); - } else if (!strcasecmp(argv[0],"slave-announce-port") && argc == 2) { + } else if ((!strcasecmp(argv[0],"slave-announce-port") || + !strcasecmp(argv[0],"replica-announce-port")) && argc == 2) + { server.slave_announce_port = atoi(argv[1]); if (server.slave_announce_port < 0 || server.slave_announce_port > 65535) { err = "Invalid port"; goto loaderr; } - } else if (!strcasecmp(argv[0],"min-slaves-to-write") && argc == 2) { + } else if ((!strcasecmp(argv[0],"min-slaves-to-write") || + !strcasecmp(argv[0],"min-replicas-to-write")) && argc == 2) + { server.repl_min_slaves_to_write = atoi(argv[1]); if (server.repl_min_slaves_to_write < 0) { - err = "Invalid value for min-slaves-to-write."; goto loaderr; + err = "Invalid value for min-replicas-to-write."; goto loaderr; } - } else if (!strcasecmp(argv[0],"min-slaves-max-lag") && argc == 2) { + } else if ((!strcasecmp(argv[0],"min-slaves-max-lag") || + !strcasecmp(argv[0],"min-replicas-max-lag")) && argc == 2) + { server.repl_min_slaves_max_lag = atoi(argv[1]); if (server.repl_min_slaves_max_lag < 0) { - err = "Invalid value for min-slaves-max-lag."; goto loaderr; + err = "Invalid value for min-replicas-max-lag."; goto loaderr; } } else if (!strcasecmp(argv[0],"notify-keyspace-events") && argc == 2) { int flags = keyspaceEventsStringToFlags(argv[1]); @@ -749,6 +801,16 @@ void loadServerConfigFromString(char *config) { "Allowed values: 'upstart', 'systemd', 'auto', or 'no'"; goto loaderr; } + } else if (!strcasecmp(argv[0],"user") && argc >= 2) { + int argc_err; + if (ACLAppendUserForLoading(argv,argc,&argc_err) == C_ERR) { + char buf[1024]; + char *errmsg = ACLSetUserStringError(); + snprintf(buf,sizeof(buf),"Error in user declaration '%s': %s", + 
argv[argc_err],errmsg); + err = buf; + goto loaderr; + } } else if (!strcasecmp(argv[0],"loadmodule") && argc >= 2) { queueLoadModule(argv[1],&argv[2],argc-2); } else if (!strcasecmp(argv[0],"sentinel")) { @@ -772,7 +834,7 @@ void loadServerConfigFromString(char *config) { if (server.cluster_enabled && server.masterhost) { linenum = slaveof_linenum; i = linenum-1; - err = "slaveof directive not allowed in cluster mode"; + err = "replicaof directive not allowed in cluster mode"; goto loaderr; } @@ -856,6 +918,10 @@ void loadServerConfig(char *filename, char *options) { #define config_set_special_field(_name) \ } else if (!strcasecmp(c->argv[2]->ptr,_name)) { +#define config_set_special_field_with_alias(_name1,_name2) \ + } else if (!strcasecmp(c->argv[2]->ptr,_name1) || \ + !strcasecmp(c->argv[2]->ptr,_name2)) { + #define config_set_else } else void configSetCommand(client *c) { @@ -878,8 +944,15 @@ void configSetCommand(client *c) { server.rdb_filename = zstrdup(o->ptr); } config_set_special_field("requirepass") { if (sdslen(o->ptr) > CONFIG_AUTHPASS_MAX_LEN) goto badfmt; - zfree(server.requirepass); - server.requirepass = ((char*)o->ptr)[0] ? zstrdup(o->ptr) : NULL; + /* The old "requirepass" directive just translates to setting + * a password to the default user. */ + ACLSetUser(DefaultUser,"resetpass",-1); + sds aclop = sdscatprintf(sdsempty(),">%s",(char*)o->ptr); + ACLSetUser(DefaultUser,aclop,sdslen(aclop)); + sdsfree(aclop); + } config_set_special_field("masteruser") { + zfree(server.masteruser); + server.masteruser = ((char*)o->ptr)[0] ? zstrdup(o->ptr) : NULL; } config_set_special_field("masterauth") { zfree(server.masterauth); server.masterauth = ((char*)o->ptr)[0] ? 
zstrdup(o->ptr) : NULL; @@ -1015,7 +1088,9 @@ void configSetCommand(client *c) { if (flags == -1) goto badfmt; server.notify_keyspace_events = flags; - } config_set_special_field("slave-announce-ip") { + } config_set_special_field_with_alias("slave-announce-ip", + "replica-announce-ip") + { zfree(server.slave_announce_ip); server.slave_announce_ip = ((char*)o->ptr)[0] ? zstrdup(o->ptr) : NULL; @@ -1031,6 +1106,8 @@ void configSetCommand(client *c) { "cluster-require-full-coverage",server.cluster_require_full_coverage) { } config_set_bool_field( "cluster-slave-no-failover",server.cluster_slave_no_failover) { + } config_set_bool_field( + "cluster-replica-no-failover",server.cluster_slave_no_failover) { } config_set_bool_field( "aof-rewrite-incremental-fsync",server.aof_rewrite_incremental_fsync) { } config_set_bool_field( @@ -1041,8 +1118,16 @@ void configSetCommand(client *c) { "aof-use-rdb-preamble",server.aof_use_rdb_preamble) { } config_set_bool_field( "slave-serve-stale-data",server.repl_serve_stale_data) { + } config_set_bool_field( + "replica-serve-stale-data",server.repl_serve_stale_data) { } config_set_bool_field( "slave-read-only",server.repl_slave_ro) { + } config_set_bool_field( + "replica-read-only",server.repl_slave_ro) { + } config_set_bool_field( + "slave-ignore-maxmemory",server.repl_slave_ignore_maxmemory) { + } config_set_bool_field( + "replica-ignore-maxmemory",server.repl_slave_ignore_maxmemory) { } config_set_bool_field( "activerehashing",server.activerehashing) { } config_set_bool_field( @@ -1060,6 +1145,8 @@ void configSetCommand(client *c) { #endif } config_set_bool_field( "protected-mode",server.protected_mode) { + } config_set_bool_field( + "gopher-enabled",server.gopher_enabled) { } config_set_bool_field( "stop-writes-on-bgsave-error",server.stop_writes_on_bgsave_err) { } config_set_bool_field( @@ -1070,8 +1157,12 @@ void configSetCommand(client *c) { "lazyfree-lazy-server-del",server.lazyfree_lazy_server_del) { } config_set_bool_field( 
"slave-lazy-flush",server.repl_slave_lazy_flush) { + } config_set_bool_field( + "replica-lazy-flush",server.repl_slave_lazy_flush) { } config_set_bool_field( "no-appendfsync-on-rewrite",server.aof_no_fsync_on_rewrite) { + } config_set_bool_field( + "dynamic-hz",server.dynamic_hz) { /* Numerical fields. * config_set_numerical_field(name,var,min,max) */ @@ -1131,6 +1222,8 @@ void configSetCommand(client *c) { "latency-monitor-threshold",server.latency_monitor_threshold,0,LLONG_MAX){ } config_set_numerical_field( "repl-ping-slave-period",server.repl_ping_slave_period,1,INT_MAX) { + } config_set_numerical_field( + "repl-ping-replica-period",server.repl_ping_slave_period,1,INT_MAX) { } config_set_numerical_field( "repl-timeout",server.repl_timeout,1,INT_MAX) { } config_set_numerical_field( @@ -1139,14 +1232,24 @@ void configSetCommand(client *c) { "repl-diskless-sync-delay",server.repl_diskless_sync_delay,0,INT_MAX) { } config_set_numerical_field( "slave-priority",server.slave_priority,0,INT_MAX) { + } config_set_numerical_field( + "replica-priority",server.slave_priority,0,INT_MAX) { } config_set_numerical_field( "slave-announce-port",server.slave_announce_port,0,65535) { + } config_set_numerical_field( + "replica-announce-port",server.slave_announce_port,0,65535) { } config_set_numerical_field( "min-slaves-to-write",server.repl_min_slaves_to_write,0,INT_MAX) { refreshGoodSlavesCount(); + } config_set_numerical_field( + "min-replicas-to-write",server.repl_min_slaves_to_write,0,INT_MAX) { + refreshGoodSlavesCount(); } config_set_numerical_field( "min-slaves-max-lag",server.repl_min_slaves_max_lag,0,INT_MAX) { refreshGoodSlavesCount(); + } config_set_numerical_field( + "min-replicas-max-lag",server.repl_min_slaves_max_lag,0,INT_MAX) { + refreshGoodSlavesCount(); } config_set_numerical_field( "cluster-node-timeout",server.cluster_node_timeout,0,LLONG_MAX) { } config_set_numerical_field( @@ -1158,11 +1261,13 @@ void configSetCommand(client *c) { } 
config_set_numerical_field( "cluster-slave-validity-factor",server.cluster_slave_validity_factor,0,INT_MAX) { } config_set_numerical_field( - "hz",server.hz,0,INT_MAX) { + "cluster-replica-validity-factor",server.cluster_slave_validity_factor,0,INT_MAX) { + } config_set_numerical_field( + "hz",server.config_hz,0,INT_MAX) { /* Hz is more an hint from the user, so we accept values out of range * but cap them to reasonable values. */ - if (server.hz < CONFIG_MIN_HZ) server.hz = CONFIG_MIN_HZ; - if (server.hz > CONFIG_MAX_HZ) server.hz = CONFIG_MAX_HZ; + if (server.config_hz < CONFIG_MIN_HZ) server.config_hz = CONFIG_MIN_HZ; + if (server.config_hz > CONFIG_MAX_HZ) server.config_hz = CONFIG_MAX_HZ; } config_set_numerical_field( "watchdog-period",ll,0,INT_MAX) { if (ll) @@ -1175,9 +1280,9 @@ void configSetCommand(client *c) { } config_set_memory_field("maxmemory",server.maxmemory) { if (server.maxmemory) { if (server.maxmemory < zmalloc_used_memory()) { - serverLog(LL_WARNING,"WARNING: the new maxmemory value set via CONFIG SET is smaller than the current memory usage. This will result in keys eviction and/or inability to accept new write commands depending on the maxmemory-policy."); + serverLog(LL_WARNING,"WARNING: the new maxmemory value set via CONFIG SET is smaller than the current memory usage. 
This will result in key eviction and/or the inability to accept new write commands depending on the maxmemory-policy."); } - freeMemoryIfNeeded(); + freeMemoryIfNeededAndSafe(); } } config_set_memory_field( "proto-max-bulk-len",server.proto_max_bulk_len) { @@ -1253,7 +1358,7 @@ badfmt: /* Bad format errors */ void configGetCommand(client *c) { robj *o = c->argv[2]; - void *replylen = addDeferredMultiBulkLength(c); + void *replylen = addReplyDeferredLen(c); char *pattern = o->ptr; char buf[128]; int matches = 0; @@ -1261,13 +1366,15 @@ void configGetCommand(client *c) { /* String values */ config_get_string_field("dbfilename",server.rdb_filename); - config_get_string_field("requirepass",server.requirepass); + config_get_string_field("masteruser",server.masteruser); config_get_string_field("masterauth",server.masterauth); config_get_string_field("cluster-announce-ip",server.cluster_announce_ip); config_get_string_field("unixsocket",server.unixsocket); config_get_string_field("logfile",server.logfile); + config_get_string_field("aclfile",server.acl_filename); config_get_string_field("pidfile",server.pidfile); config_get_string_field("slave-announce-ip",server.slave_announce_ip); + config_get_string_field("replica-announce-ip",server.slave_announce_ip); /* Numerical values */ config_get_numerical_field("maxmemory",server.maxmemory); @@ -1320,19 +1427,25 @@ void configGetCommand(client *c) { config_get_numerical_field("tcp-backlog",server.tcp_backlog); config_get_numerical_field("databases",server.dbnum); config_get_numerical_field("repl-ping-slave-period",server.repl_ping_slave_period); + config_get_numerical_field("repl-ping-replica-period",server.repl_ping_slave_period); config_get_numerical_field("repl-timeout",server.repl_timeout); config_get_numerical_field("repl-backlog-size",server.repl_backlog_size); config_get_numerical_field("repl-backlog-ttl",server.repl_backlog_time_limit); config_get_numerical_field("maxclients",server.maxclients); 
config_get_numerical_field("watchdog-period",server.watchdog_period); config_get_numerical_field("slave-priority",server.slave_priority); + config_get_numerical_field("replica-priority",server.slave_priority); config_get_numerical_field("slave-announce-port",server.slave_announce_port); + config_get_numerical_field("replica-announce-port",server.slave_announce_port); config_get_numerical_field("min-slaves-to-write",server.repl_min_slaves_to_write); + config_get_numerical_field("min-replicas-to-write",server.repl_min_slaves_to_write); config_get_numerical_field("min-slaves-max-lag",server.repl_min_slaves_max_lag); - config_get_numerical_field("hz",server.hz); + config_get_numerical_field("min-replicas-max-lag",server.repl_min_slaves_max_lag); + config_get_numerical_field("hz",server.config_hz); config_get_numerical_field("cluster-node-timeout",server.cluster_node_timeout); config_get_numerical_field("cluster-migration-barrier",server.cluster_migration_barrier); config_get_numerical_field("cluster-slave-validity-factor",server.cluster_slave_validity_factor); + config_get_numerical_field("cluster-replica-validity-factor",server.cluster_slave_validity_factor); config_get_numerical_field("repl-diskless-sync-delay",server.repl_diskless_sync_delay); config_get_numerical_field("tcp-keepalive",server.tcpkeepalive); @@ -1341,12 +1454,22 @@ void configGetCommand(client *c) { server.cluster_require_full_coverage); config_get_bool_field("cluster-slave-no-failover", server.cluster_slave_no_failover); + config_get_bool_field("cluster-replica-no-failover", + server.cluster_slave_no_failover); config_get_bool_field("no-appendfsync-on-rewrite", server.aof_no_fsync_on_rewrite); config_get_bool_field("slave-serve-stale-data", server.repl_serve_stale_data); + config_get_bool_field("replica-serve-stale-data", + server.repl_serve_stale_data); config_get_bool_field("slave-read-only", server.repl_slave_ro); + config_get_bool_field("replica-read-only", + server.repl_slave_ro); + 
config_get_bool_field("slave-ignore-maxmemory", + server.repl_slave_ignore_maxmemory); + config_get_bool_field("replica-ignore-maxmemory", + server.repl_slave_ignore_maxmemory); config_get_bool_field("stop-writes-on-bgsave-error", server.stop_writes_on_bgsave_err); config_get_bool_field("daemonize", server.daemonize); @@ -1355,6 +1478,7 @@ void configGetCommand(client *c) { config_get_bool_field("activerehashing", server.activerehashing); config_get_bool_field("activedefrag", server.active_defrag_enabled); config_get_bool_field("protected-mode", server.protected_mode); + config_get_bool_field("gopher-enabled", server.gopher_enabled); config_get_bool_field("repl-disable-tcp-nodelay", server.repl_disable_tcp_nodelay); config_get_bool_field("repl-diskless-sync", @@ -1375,6 +1499,10 @@ void configGetCommand(client *c) { server.lazyfree_lazy_server_del); config_get_bool_field("slave-lazy-flush", server.repl_slave_lazy_flush); + config_get_bool_field("replica-lazy-flush", + server.repl_slave_lazy_flush); + config_get_bool_field("dynamic-hz", + server.dynamic_hz); /* Enum values */ config_get_enum_field("maxmemory-policy", @@ -1446,10 +1574,14 @@ void configGetCommand(client *c) { addReplyBulkCString(c,buf); matches++; } - if (stringmatch(pattern,"slaveof",1)) { + if (stringmatch(pattern,"slaveof",1) || + stringmatch(pattern,"replicaof",1)) + { + char *optname = stringmatch(pattern,"slaveof",1) ? 
+ "slaveof" : "replicaof"; char buf[256]; - addReplyBulkCString(c,"slaveof"); + addReplyBulkCString(c,optname); if (server.masterhost) snprintf(buf,sizeof(buf),"%s %d", server.masterhost, server.masterport); @@ -1475,7 +1607,17 @@ void configGetCommand(client *c) { sdsfree(aux); matches++; } - setDeferredMultiBulkLength(c,replylen,matches*2); + if (stringmatch(pattern,"requirepass",1)) { + addReplyBulkCString(c,"requirepass"); + sds password = ACLDefaultUserFirstPassword(); + if (password) { + addReplyBulkCBuffer(c,password,sdslen(password)); + } else { + addReplyBulkCString(c,""); + } + matches++; + } + setDeferredMapLen(c,replylen,matches); } /*----------------------------------------------------------------------------- @@ -1605,8 +1747,20 @@ struct rewriteConfigState *rewriteConfigReadOldFile(char *path) { /* Now we populate the state according to the content of this line. * Append the line and populate the option -> line numbers map. */ rewriteConfigAppendLine(state,line); - rewriteConfigAddLineNumberToOption(state,argv[0],linenum); + /* Translate options using the word "slave" to the corresponding name + * "replica", before adding such option to the config name -> lines + * mapping. */ + char *p = strstr(argv[0],"slave"); + if (p) { + sds alt = sdsempty(); + alt = sdscatlen(alt,argv[0],p-argv[0]); + alt = sdscatlen(alt,"replica",7); + alt = sdscatlen(alt,p+5,strlen(p+5)); + sdsfree(argv[0]); + argv[0] = alt; + } + rewriteConfigAddLineNumberToOption(state,argv[0],linenum); sdsfreesplitres(argv,argc); } fclose(fp); @@ -1781,6 +1935,38 @@ void rewriteConfigSaveOption(struct rewriteConfigState *state) { rewriteConfigMarkAsProcessed(state,"save"); } +/* Rewrite the user option. */ +void rewriteConfigUserOption(struct rewriteConfigState *state) { + /* If there is a user file defined we just mark this configuration + * directive as processed, so that all the lines containing users + * inside the config file get discarded.
*/ + if (server.acl_filename[0] != '\0') { + rewriteConfigMarkAsProcessed(state,"user"); + return; + } + + /* Otherwise scan the list of users and rewrite every line. Note that + * in case the list here is empty, the effect will just be to comment + * out all the user directives inside the config file. */ + raxIterator ri; + raxStart(&ri,Users); + raxSeek(&ri,"^",NULL,0); + while(raxNext(&ri)) { + user *u = ri.data; + sds line = sdsnew("user "); + line = sdscatsds(line,u->name); + line = sdscatlen(line," ",1); + sds descr = ACLDescribeUser(u); + line = sdscatsds(line,descr); + sdsfree(descr); + rewriteConfigRewriteLine(state,"user",line,1); + } + raxStop(&ri); + + /* Mark "user" as processed in case there are no defined users. */ + rewriteConfigMarkAsProcessed(state,"user"); +} + /* Rewrite the dir option, always using absolute paths.*/ void rewriteConfigDirOption(struct rewriteConfigState *state) { char cwd[1024]; @@ -1793,15 +1979,14 @@ void rewriteConfigDirOption(struct rewriteConfigState *state) { } /* Rewrite the slaveof option. */ -void rewriteConfigSlaveofOption(struct rewriteConfigState *state) { - char *option = "slaveof"; +void rewriteConfigSlaveofOption(struct rewriteConfigState *state, char *option) { sds line; /* If this is a master, we want all the slaveof config options * in the file to be removed. Note that if this is a cluster instance * we don't want a slaveof directive inside redis.conf.
*/ if (server.cluster_enabled || server.masterhost == NULL) { - rewriteConfigMarkAsProcessed(state,"slaveof"); + rewriteConfigMarkAsProcessed(state,option); return; } line = sdscatprintf(sdsempty(),"%s %s %d", option, @@ -1843,8 +2028,10 @@ void rewriteConfigClientoutputbufferlimitOption(struct rewriteConfigState *state rewriteConfigFormatMemory(soft,sizeof(soft), server.client_obuf_limits[j].soft_limit_bytes); + char *typename = getClientTypeName(j); + if (!strcmp(typename,"slave")) typename = "replica"; line = sdscatprintf(sdsempty(),"%s %s %s %s %ld", - option, getClientTypeName(j), hard, soft, + option, typename, hard, soft, (long) server.client_obuf_limits[j].soft_limit_seconds); rewriteConfigRewriteLine(state,option,line,force); } @@ -1872,6 +2059,26 @@ void rewriteConfigBindOption(struct rewriteConfigState *state) { rewriteConfigRewriteLine(state,option,line,force); } +/* Rewrite the requirepass option. */ +void rewriteConfigRequirepassOption(struct rewriteConfigState *state, char *option) { + int force = 1; + sds line; + sds password = ACLDefaultUserFirstPassword(); + + /* If there is no password set, we don't want the requirepass option + * to be present in the configuration at all. */ + if (password == NULL) { + rewriteConfigMarkAsProcessed(state,option); + return; + } + + line = sdsnew(option); + line = sdscatlen(line, " ", 1); + line = sdscatsds(line, password); + + rewriteConfigRewriteLine(state,option,line,force); +} + /* Glue together the configuration lines in the current configuration * rewrite state into a single string, stripping multiple empty lines. 
*/ sds rewriteConfigGetContentFromState(struct rewriteConfigState *state) { @@ -2022,36 +2229,40 @@ int rewriteConfig(char *path) { rewriteConfigOctalOption(state,"unixsocketperm",server.unixsocketperm,CONFIG_DEFAULT_UNIX_SOCKET_PERM); rewriteConfigNumericalOption(state,"timeout",server.maxidletime,CONFIG_DEFAULT_CLIENT_TIMEOUT); rewriteConfigNumericalOption(state,"tcp-keepalive",server.tcpkeepalive,CONFIG_DEFAULT_TCP_KEEPALIVE); - rewriteConfigNumericalOption(state,"slave-announce-port",server.slave_announce_port,CONFIG_DEFAULT_SLAVE_ANNOUNCE_PORT); + rewriteConfigNumericalOption(state,"replica-announce-port",server.slave_announce_port,CONFIG_DEFAULT_SLAVE_ANNOUNCE_PORT); rewriteConfigEnumOption(state,"loglevel",server.verbosity,loglevel_enum,CONFIG_DEFAULT_VERBOSITY); rewriteConfigStringOption(state,"logfile",server.logfile,CONFIG_DEFAULT_LOGFILE); + rewriteConfigStringOption(state,"aclfile",server.acl_filename,CONFIG_DEFAULT_ACL_FILENAME); rewriteConfigYesNoOption(state,"syslog-enabled",server.syslog_enabled,CONFIG_DEFAULT_SYSLOG_ENABLED); rewriteConfigStringOption(state,"syslog-ident",server.syslog_ident,CONFIG_DEFAULT_SYSLOG_IDENT); rewriteConfigSyslogfacilityOption(state); rewriteConfigSaveOption(state); + rewriteConfigUserOption(state); rewriteConfigNumericalOption(state,"databases",server.dbnum,CONFIG_DEFAULT_DBNUM); rewriteConfigYesNoOption(state,"stop-writes-on-bgsave-error",server.stop_writes_on_bgsave_err,CONFIG_DEFAULT_STOP_WRITES_ON_BGSAVE_ERROR); rewriteConfigYesNoOption(state,"rdbcompression",server.rdb_compression,CONFIG_DEFAULT_RDB_COMPRESSION); rewriteConfigYesNoOption(state,"rdbchecksum",server.rdb_checksum,CONFIG_DEFAULT_RDB_CHECKSUM); rewriteConfigStringOption(state,"dbfilename",server.rdb_filename,CONFIG_DEFAULT_RDB_FILENAME); rewriteConfigDirOption(state); - rewriteConfigSlaveofOption(state); - rewriteConfigStringOption(state,"slave-announce-ip",server.slave_announce_ip,CONFIG_DEFAULT_SLAVE_ANNOUNCE_IP); + 
rewriteConfigSlaveofOption(state,"replicaof"); + rewriteConfigStringOption(state,"replica-announce-ip",server.slave_announce_ip,CONFIG_DEFAULT_SLAVE_ANNOUNCE_IP); + rewriteConfigStringOption(state,"masteruser",server.masteruser,NULL); rewriteConfigStringOption(state,"masterauth",server.masterauth,NULL); rewriteConfigStringOption(state,"cluster-announce-ip",server.cluster_announce_ip,NULL); - rewriteConfigYesNoOption(state,"slave-serve-stale-data",server.repl_serve_stale_data,CONFIG_DEFAULT_SLAVE_SERVE_STALE_DATA); - rewriteConfigYesNoOption(state,"slave-read-only",server.repl_slave_ro,CONFIG_DEFAULT_SLAVE_READ_ONLY); - rewriteConfigNumericalOption(state,"repl-ping-slave-period",server.repl_ping_slave_period,CONFIG_DEFAULT_REPL_PING_SLAVE_PERIOD); + rewriteConfigYesNoOption(state,"replica-serve-stale-data",server.repl_serve_stale_data,CONFIG_DEFAULT_SLAVE_SERVE_STALE_DATA); + rewriteConfigYesNoOption(state,"replica-read-only",server.repl_slave_ro,CONFIG_DEFAULT_SLAVE_READ_ONLY); + rewriteConfigYesNoOption(state,"replica-ignore-maxmemory",server.repl_slave_ignore_maxmemory,CONFIG_DEFAULT_SLAVE_IGNORE_MAXMEMORY); + rewriteConfigNumericalOption(state,"repl-ping-replica-period",server.repl_ping_slave_period,CONFIG_DEFAULT_REPL_PING_SLAVE_PERIOD); rewriteConfigNumericalOption(state,"repl-timeout",server.repl_timeout,CONFIG_DEFAULT_REPL_TIMEOUT); rewriteConfigBytesOption(state,"repl-backlog-size",server.repl_backlog_size,CONFIG_DEFAULT_REPL_BACKLOG_SIZE); rewriteConfigBytesOption(state,"repl-backlog-ttl",server.repl_backlog_time_limit,CONFIG_DEFAULT_REPL_BACKLOG_TIME_LIMIT); rewriteConfigYesNoOption(state,"repl-disable-tcp-nodelay",server.repl_disable_tcp_nodelay,CONFIG_DEFAULT_REPL_DISABLE_TCP_NODELAY); rewriteConfigYesNoOption(state,"repl-diskless-sync",server.repl_diskless_sync,CONFIG_DEFAULT_REPL_DISKLESS_SYNC); rewriteConfigNumericalOption(state,"repl-diskless-sync-delay",server.repl_diskless_sync_delay,CONFIG_DEFAULT_REPL_DISKLESS_SYNC_DELAY); - 
rewriteConfigNumericalOption(state,"slave-priority",server.slave_priority,CONFIG_DEFAULT_SLAVE_PRIORITY); - rewriteConfigNumericalOption(state,"min-slaves-to-write",server.repl_min_slaves_to_write,CONFIG_DEFAULT_MIN_SLAVES_TO_WRITE); - rewriteConfigNumericalOption(state,"min-slaves-max-lag",server.repl_min_slaves_max_lag,CONFIG_DEFAULT_MIN_SLAVES_MAX_LAG); - rewriteConfigStringOption(state,"requirepass",server.requirepass,NULL); + rewriteConfigNumericalOption(state,"replica-priority",server.slave_priority,CONFIG_DEFAULT_SLAVE_PRIORITY); + rewriteConfigNumericalOption(state,"min-replicas-to-write",server.repl_min_slaves_to_write,CONFIG_DEFAULT_MIN_SLAVES_TO_WRITE); + rewriteConfigNumericalOption(state,"min-replicas-max-lag",server.repl_min_slaves_max_lag,CONFIG_DEFAULT_MIN_SLAVES_MAX_LAG); + rewriteConfigRequirepassOption(state,"requirepass"); rewriteConfigNumericalOption(state,"maxclients",server.maxclients,CONFIG_DEFAULT_MAX_CLIENTS); rewriteConfigBytesOption(state,"maxmemory",server.maxmemory,CONFIG_DEFAULT_MAXMEMORY); rewriteConfigBytesOption(state,"proto-max-bulk-len",server.proto_max_bulk_len,CONFIG_DEFAULT_PROTO_MAX_BULK_LEN); @@ -2076,10 +2287,10 @@ int rewriteConfig(char *path) { rewriteConfigYesNoOption(state,"cluster-enabled",server.cluster_enabled,0); rewriteConfigStringOption(state,"cluster-config-file",server.cluster_configfile,CONFIG_DEFAULT_CLUSTER_CONFIG_FILE); rewriteConfigYesNoOption(state,"cluster-require-full-coverage",server.cluster_require_full_coverage,CLUSTER_DEFAULT_REQUIRE_FULL_COVERAGE); - rewriteConfigYesNoOption(state,"cluster-slave-no-failover",server.cluster_slave_no_failover,CLUSTER_DEFAULT_SLAVE_NO_FAILOVER); + rewriteConfigYesNoOption(state,"cluster-replica-no-failover",server.cluster_slave_no_failover,CLUSTER_DEFAULT_SLAVE_NO_FAILOVER); rewriteConfigNumericalOption(state,"cluster-node-timeout",server.cluster_node_timeout,CLUSTER_DEFAULT_NODE_TIMEOUT); 
rewriteConfigNumericalOption(state,"cluster-migration-barrier",server.cluster_migration_barrier,CLUSTER_DEFAULT_MIGRATION_BARRIER); - rewriteConfigNumericalOption(state,"cluster-slave-validity-factor",server.cluster_slave_validity_factor,CLUSTER_DEFAULT_SLAVE_VALIDITY); + rewriteConfigNumericalOption(state,"cluster-replica-validity-factor",server.cluster_slave_validity_factor,CLUSTER_DEFAULT_SLAVE_VALIDITY); rewriteConfigNumericalOption(state,"slowlog-log-slower-than",server.slowlog_log_slower_than,CONFIG_DEFAULT_SLOWLOG_LOG_SLOWER_THAN); rewriteConfigNumericalOption(state,"latency-monitor-threshold",server.latency_monitor_threshold,CONFIG_DEFAULT_LATENCY_MONITOR_THRESHOLD); rewriteConfigNumericalOption(state,"slowlog-max-len",server.slowlog_max_len,CONFIG_DEFAULT_SLOWLOG_MAX_LEN); @@ -2097,8 +2308,9 @@ int rewriteConfig(char *path) { rewriteConfigYesNoOption(state,"activerehashing",server.activerehashing,CONFIG_DEFAULT_ACTIVE_REHASHING); rewriteConfigYesNoOption(state,"activedefrag",server.active_defrag_enabled,CONFIG_DEFAULT_ACTIVE_DEFRAG); rewriteConfigYesNoOption(state,"protected-mode",server.protected_mode,CONFIG_DEFAULT_PROTECTED_MODE); + rewriteConfigYesNoOption(state,"gopher-enabled",server.gopher_enabled,CONFIG_DEFAULT_GOPHER_ENABLED); rewriteConfigClientoutputbufferlimitOption(state); - rewriteConfigNumericalOption(state,"hz",server.hz,CONFIG_DEFAULT_HZ); + rewriteConfigNumericalOption(state,"hz",server.config_hz,CONFIG_DEFAULT_HZ); rewriteConfigYesNoOption(state,"aof-rewrite-incremental-fsync",server.aof_rewrite_incremental_fsync,CONFIG_DEFAULT_AOF_REWRITE_INCREMENTAL_FSYNC); rewriteConfigYesNoOption(state,"rdb-save-incremental-fsync",server.rdb_save_incremental_fsync,CONFIG_DEFAULT_RDB_SAVE_INCREMENTAL_FSYNC); rewriteConfigYesNoOption(state,"aof-load-truncated",server.aof_load_truncated,CONFIG_DEFAULT_AOF_LOAD_TRUNCATED); @@ -2107,7 +2319,8 @@ int rewriteConfig(char *path) { 
rewriteConfigYesNoOption(state,"lazyfree-lazy-eviction",server.lazyfree_lazy_eviction,CONFIG_DEFAULT_LAZYFREE_LAZY_EVICTION); rewriteConfigYesNoOption(state,"lazyfree-lazy-expire",server.lazyfree_lazy_expire,CONFIG_DEFAULT_LAZYFREE_LAZY_EXPIRE); rewriteConfigYesNoOption(state,"lazyfree-lazy-server-del",server.lazyfree_lazy_server_del,CONFIG_DEFAULT_LAZYFREE_LAZY_SERVER_DEL); - rewriteConfigYesNoOption(state,"slave-lazy-flush",server.repl_slave_lazy_flush,CONFIG_DEFAULT_SLAVE_LAZY_FLUSH); + rewriteConfigYesNoOption(state,"replica-lazy-flush",server.repl_slave_lazy_flush,CONFIG_DEFAULT_SLAVE_LAZY_FLUSH); + rewriteConfigYesNoOption(state,"dynamic-hz",server.dynamic_hz,CONFIG_DEFAULT_DYNAMIC_HZ); /* Rewrite Sentinel config if in Sentinel mode. */ if (server.sentinel_mode) rewriteConfigSentinelOption(state); diff --git a/src/config.h b/src/config.h index ee3ad508e..efa9d11f2 100644 --- a/src/config.h +++ b/src/config.h @@ -62,7 +62,9 @@ #endif /* Test for backtrace() */ -#if defined(__APPLE__) || (defined(__linux__) && defined(__GLIBC__)) +#if defined(__APPLE__) || (defined(__linux__) && defined(__GLIBC__)) || \ + defined(__FreeBSD__) || (defined(__OpenBSD__) && defined(USE_BACKTRACE))\ + || defined(__DragonFly__) #define HAVE_BACKTRACE 1 #endif diff --git a/src/db.c b/src/db.c index 055af71be..7950d5074 100644 --- a/src/db.c +++ b/src/db.c @@ -38,6 +38,8 @@ * C-level DB API *----------------------------------------------------------------------------*/ +int keyIsExpired(redisDb *db, robj *key); + /* Update LFU when an object is accessed. * Firstly, decrement the counter if the decrement time is reached. * Then logarithmically increment the counter, and update the access time. */ @@ -102,7 +104,10 @@ robj *lookupKeyReadWithFlags(redisDb *db, robj *key, int flags) { /* Key expired. If we are in the context of a master, expireIfNeeded() * returns 0 only when the key does not exist at all, so it's safe * to return NULL ASAP. 
*/ - if (server.masterhost == NULL) return NULL; + if (server.masterhost == NULL) { + server.stat_keyspace_misses++; + return NULL; + } /* However if we are in the context of a slave, expireIfNeeded() will * not really try to expire the key, it only returns information @@ -121,6 +126,7 @@ robj *lookupKeyReadWithFlags(redisDb *db, robj *key, int flags) { server.current_client->cmd && server.current_client->cmd->flags & CMD_READONLY) { + server.stat_keyspace_misses++; return NULL; } } @@ -184,14 +190,19 @@ void dbOverwrite(redisDb *db, robj *key, robj *val) { dictEntry *de = dictFind(db->dict,key->ptr); serverAssertWithInfo(NULL,key,de != NULL); + dictEntry auxentry = *de; + robj *old = dictGetVal(de); if (server.maxmemory_policy & MAXMEMORY_FLAG_LFU) { - robj *old = dictGetVal(de); - int saved_lru = old->lru; - dictReplace(db->dict, key->ptr, val); - val->lru = saved_lru; - } else { - dictReplace(db->dict, key->ptr, val); + val->lru = old->lru; } + dictSetVal(db->dict, de, val); + + if (server.lazyfree_lazy_server_del) { + freeObjAsync(old); + dictSetVal(db->dict, &auxentry, NULL); + } + + dictFreeVal(db->dict, &auxentry); } /* High level Set operation. This function can be used in order to set @@ -201,7 +212,7 @@ void dbOverwrite(redisDb *db, robj *key, robj *val) { * 2) clients WATCHing for the destination key notified. * 3) The expire time of the key is reset (the key is made persistent). * - * All the new keys in the database should be craeted via this interface. */ + * All the new keys in the database should be created via this interface. */ void setKey(redisDb *db, robj *key, robj *val) { if (lookupKeyWrite(db,key) == NULL) { dbAdd(db,key,val); @@ -230,7 +241,7 @@ robj *dbRandomKey(redisDb *db) { sds key; robj *keyobj; - de = dictGetRandomKey(db->dict); + de = dictGetFairRandomKey(db->dict); if (de == NULL) return NULL; key = dictGetKey(de); @@ -329,7 +340,7 @@ robj *dbUnshareStringValue(redisDb *db, robj *key, robj *o) { * database(s). 
Otherwise -1 is returned in the specific case the * DB number is out of range, and errno is set to EINVAL. */ long long emptyDb(int dbnum, int flags, void(callback)(void*)) { - int j, async = (flags & EMPTYDB_ASYNC); + int async = (flags & EMPTYDB_ASYNC); long long removed = 0; if (dbnum < -1 || dbnum >= server.dbnum) { @@ -337,8 +348,15 @@ long long emptyDb(int dbnum, int flags, void(callback)(void*)) { return -1; } - for (j = 0; j < server.dbnum; j++) { - if (dbnum != -1 && dbnum != j) continue; + int startdb, enddb; + if (dbnum == -1) { + startdb = 0; + enddb = server.dbnum-1; + } else { + startdb = enddb = dbnum; + } + + for (int j = startdb; j <= enddb; j++) { removed += dictSize(server.db[j].dict); if (async) { emptyDbAsync(&server.db[j]); @@ -430,10 +448,7 @@ void flushallCommand(client *c) { signalFlushedDb(-1); server.dirty += emptyDb(-1,flags,NULL); addReply(c,shared.ok); - if (server.rdb_child_pid != -1) { - kill(server.rdb_child_pid,SIGUSR1); - rdbRemoveTempFile(server.rdb_child_pid); - } + if (server.rdb_child_pid != -1) killRDBChild(); if (server.saveparamslen > 0) { /* Normally rdbSave() will reset dirty, but we don't want this here * as otherwise FLUSHALL will not be replicated nor put into the AOF. 
*/ @@ -507,7 +522,7 @@ void randomkeyCommand(client *c) { robj *key; if ((key = dbRandomKey(c->db)) == NULL) { - addReply(c,shared.nullbulk); + addReplyNull(c); return; } @@ -521,7 +536,7 @@ void keysCommand(client *c) { sds pattern = c->argv[1]->ptr; int plen = sdslen(pattern), allkeys; unsigned long numkeys = 0; - void *replylen = addDeferredMultiBulkLength(c); + void *replylen = addReplyDeferredLen(c); di = dictGetSafeIterator(c->db->dict); allkeys = (pattern[0] == '*' && pattern[1] == '\0'); @@ -531,7 +546,7 @@ void keysCommand(client *c) { if (allkeys || stringmatchlen(pattern,plen,key,sdslen(key),0)) { keyobj = createStringObject(key,sdslen(key)); - if (expireIfNeeded(c->db,keyobj) == 0) { + if (!keyIsExpired(c->db,keyobj)) { addReplyBulk(c,keyobj); numkeys++; } @@ -539,7 +554,7 @@ void keysCommand(client *c) { } } dictReleaseIterator(di); - setDeferredMultiBulkLength(c,replylen,numkeys); + setDeferredArrayLen(c,replylen,numkeys); } /* This callback is used by scanGenericCommand in order to collect elements @@ -764,10 +779,10 @@ void scanGenericCommand(client *c, robj *o, unsigned long cursor) { } /* Step 4: Reply to the client. */ - addReplyMultiBulkLen(c, 2); + addReplyArrayLen(c, 2); addReplyBulkLongLong(c,cursor); - addReplyMultiBulkLen(c, listLength(keys)); + addReplyArrayLen(c, listLength(keys)); while ((node = listFirst(keys)) != NULL) { robj *kobj = listNodeValue(node); addReplyBulk(c, kobj); @@ -1108,6 +1123,25 @@ void propagateExpire(redisDb *db, robj *key, int lazy) { decrRefCount(argv[1]); } +/* Check if the key is expired. */ +int keyIsExpired(redisDb *db, robj *key) { + mstime_t when = getExpire(db,key); + + if (when < 0) return 0; /* No expire for this key */ + + /* Don't expire anything while loading. It will be done later. */ + if (server.loading) return 0; + + /* If we are in the context of a Lua script, we pretend that time is + * blocked to when the Lua script started. 
This way a key can expire + * only the first time it is accessed and not in the middle of the + * script execution, making propagation to slaves / AOF consistent. + * See issue #1525 on Github for more information. */ + mstime_t now = server.lua_caller ? server.lua_time_start : mstime(); + + return now > when; +} + /* This function is called when we are going to perform some operation * in a given key, but such key may be already logically expired even if * it still exists in the database. The main way this function is called @@ -1128,32 +1162,17 @@ void propagateExpire(redisDb *db, robj *key, int lazy) { * The return value of the function is 0 if the key is still valid, * otherwise the function returns 1 if the key is expired. */ int expireIfNeeded(redisDb *db, robj *key) { - mstime_t when = getExpire(db,key); - mstime_t now; + if (!keyIsExpired(db,key)) return 0; - if (when < 0) return 0; /* No expire for this key */ - - /* Don't expire anything while loading. It will be done later. */ - if (server.loading) return 0; - - /* If we are in the context of a Lua script, we pretend that time is - * blocked to when the Lua script started. This way a key can expire - * only the first time it is accessed and not in the middle of the - * script execution, making propagation to slaves / AOF consistent. - * See issue #1525 on Github for more information. */ - now = server.lua_caller ? server.lua_time_start : mstime(); - - /* If we are running in the context of a slave, return ASAP: + /* If we are running in the context of a slave, instead of + * evicting the expired key from the database, we return ASAP: * the slave key expiration is controlled by the master that will * send us synthesized DEL operations for expired keys. * * Still we try to return the right information to the caller, * that is, 0 if we think the key should be still valid, 1 if * we think the key is expired at this time. 
*/ - if (server.masterhost != NULL) return now > when; - - /* Return when this key has not expired */ - if (now <= when) return 0; + if (server.masterhost != NULL) return 1; /* Delete the key */ server.stat_expiredkeys++; diff --git a/src/debug.c b/src/debug.c index 32be3c59c..0c6b5630c 100644 --- a/src/debug.c +++ b/src/debug.c @@ -37,7 +37,11 @@ #ifdef HAVE_BACKTRACE #include +#ifndef __OpenBSD__ #include +#else +typedef ucontext_t sigcontext_t; +#endif #include #include "bio.h" #include @@ -70,7 +74,7 @@ void xorDigest(unsigned char *digest, void *ptr, size_t len) { digest[j] ^= hash[j]; } -void xorObjectDigest(unsigned char *digest, robj *o) { +void xorStringObjectDigest(unsigned char *digest, robj *o) { o = getDecodedObject(o); xorDigest(digest,o->ptr,sdslen(o->ptr)); decrRefCount(o); @@ -100,12 +104,151 @@ void mixDigest(unsigned char *digest, void *ptr, size_t len) { SHA1Final(digest,&ctx); } -void mixObjectDigest(unsigned char *digest, robj *o) { +void mixStringObjectDigest(unsigned char *digest, robj *o) { o = getDecodedObject(o); mixDigest(digest,o->ptr,sdslen(o->ptr)); decrRefCount(o); } +/* This function computes the digest of a data structure stored in the + * object 'o'. It is the core of the DEBUG DIGEST command: when taking the + * digest of a whole dataset, we take the digest of the key and the value + * pair, and xor all those together. + * + * Note that this function does not reset the initial 'digest' passed, it + * will continue mixing this object digest to anything that was already + * present. 
*/ +void xorObjectDigest(redisDb *db, robj *keyobj, unsigned char *digest, robj *o) { + uint32_t aux = htonl(o->type); + mixDigest(digest,&aux,sizeof(aux)); + long long expiretime = getExpire(db,keyobj); + char buf[128]; + + /* Save the key and associated value */ + if (o->type == OBJ_STRING) { + mixStringObjectDigest(digest,o); + } else if (o->type == OBJ_LIST) { + listTypeIterator *li = listTypeInitIterator(o,0,LIST_TAIL); + listTypeEntry entry; + while(listTypeNext(li,&entry)) { + robj *eleobj = listTypeGet(&entry); + mixStringObjectDigest(digest,eleobj); + decrRefCount(eleobj); + } + listTypeReleaseIterator(li); + } else if (o->type == OBJ_SET) { + setTypeIterator *si = setTypeInitIterator(o); + sds sdsele; + while((sdsele = setTypeNextObject(si)) != NULL) { + xorDigest(digest,sdsele,sdslen(sdsele)); + sdsfree(sdsele); + } + setTypeReleaseIterator(si); + } else if (o->type == OBJ_ZSET) { + unsigned char eledigest[20]; + + if (o->encoding == OBJ_ENCODING_ZIPLIST) { + unsigned char *zl = o->ptr; + unsigned char *eptr, *sptr; + unsigned char *vstr; + unsigned int vlen; + long long vll; + double score; + + eptr = ziplistIndex(zl,0); + serverAssert(eptr != NULL); + sptr = ziplistNext(zl,eptr); + serverAssert(sptr != NULL); + + while (eptr != NULL) { + serverAssert(ziplistGet(eptr,&vstr,&vlen,&vll)); + score = zzlGetScore(sptr); + + memset(eledigest,0,20); + if (vstr != NULL) { + mixDigest(eledigest,vstr,vlen); + } else { + ll2string(buf,sizeof(buf),vll); + mixDigest(eledigest,buf,strlen(buf)); + } + + snprintf(buf,sizeof(buf),"%.17g",score); + mixDigest(eledigest,buf,strlen(buf)); + xorDigest(digest,eledigest,20); + zzlNext(zl,&eptr,&sptr); + } + } else if (o->encoding == OBJ_ENCODING_SKIPLIST) { + zset *zs = o->ptr; + dictIterator *di = dictGetIterator(zs->dict); + dictEntry *de; + + while((de = dictNext(di)) != NULL) { + sds sdsele = dictGetKey(de); + double *score = dictGetVal(de); + + snprintf(buf,sizeof(buf),"%.17g",*score); + memset(eledigest,0,20); + 
mixDigest(eledigest,sdsele,sdslen(sdsele)); + mixDigest(eledigest,buf,strlen(buf)); + xorDigest(digest,eledigest,20); + } + dictReleaseIterator(di); + } else { + serverPanic("Unknown sorted set encoding"); + } + } else if (o->type == OBJ_HASH) { + hashTypeIterator *hi = hashTypeInitIterator(o); + while (hashTypeNext(hi) != C_ERR) { + unsigned char eledigest[20]; + sds sdsele; + + memset(eledigest,0,20); + sdsele = hashTypeCurrentObjectNewSds(hi,OBJ_HASH_KEY); + mixDigest(eledigest,sdsele,sdslen(sdsele)); + sdsfree(sdsele); + sdsele = hashTypeCurrentObjectNewSds(hi,OBJ_HASH_VALUE); + mixDigest(eledigest,sdsele,sdslen(sdsele)); + sdsfree(sdsele); + xorDigest(digest,eledigest,20); + } + hashTypeReleaseIterator(hi); + } else if (o->type == OBJ_STREAM) { + streamIterator si; + streamIteratorStart(&si,o->ptr,NULL,NULL,0); + streamID id; + int64_t numfields; + + while(streamIteratorGetID(&si,&id,&numfields)) { + sds itemid = sdscatfmt(sdsempty(),"%U.%U",id.ms,id.seq); + mixDigest(digest,itemid,sdslen(itemid)); + sdsfree(itemid); + + while(numfields--) { + unsigned char *field, *value; + int64_t field_len, value_len; + streamIteratorGetField(&si,&field,&value, + &field_len,&value_len); + mixDigest(digest,field,field_len); + mixDigest(digest,value,value_len); + } + } + streamIteratorStop(&si); + } else if (o->type == OBJ_MODULE) { + RedisModuleDigest md; + moduleValue *mv = o->ptr; + moduleType *mt = mv->type; + moduleInitDigestContext(md); + if (mt->digest) { + mt->digest(&md,mv->value); + xorDigest(digest,md.x,sizeof(md.x)); + } + } else { + serverPanic("Unknown object type"); + } + /* If the key has an expire, add it to the mix */ + if (expiretime != -1) xorDigest(digest,"!!expire!!",10); +} + /* Compute the dataset digest. Since keys, sets elements, hashes elements * are not ordered, we use a trick: every aggregate digest is the xor * of the digests of their elements. 
This way the order will not change @@ -114,7 +257,6 @@ void mixObjectDigest(unsigned char *digest, robj *o) { * a different digest. */ void computeDatasetDigest(unsigned char *final) { unsigned char digest[20]; - char buf[128]; dictIterator *di = NULL; dictEntry *de; int j; @@ -137,7 +279,6 @@ void computeDatasetDigest(unsigned char *final) { while((de = dictNext(di)) != NULL) { sds key; robj *keyobj, *o; - long long expiretime; memset(digest,0,20); /* This key-val digest */ key = dictGetKey(de); @@ -146,134 +287,8 @@ void computeDatasetDigest(unsigned char *final) { mixDigest(digest,key,sdslen(key)); o = dictGetVal(de); + xorObjectDigest(db,keyobj,digest,o); - aux = htonl(o->type); - mixDigest(digest,&aux,sizeof(aux)); - expiretime = getExpire(db,keyobj); - - /* Save the key and associated value */ - if (o->type == OBJ_STRING) { - mixObjectDigest(digest,o); - } else if (o->type == OBJ_LIST) { - listTypeIterator *li = listTypeInitIterator(o,0,LIST_TAIL); - listTypeEntry entry; - while(listTypeNext(li,&entry)) { - robj *eleobj = listTypeGet(&entry); - mixObjectDigest(digest,eleobj); - decrRefCount(eleobj); - } - listTypeReleaseIterator(li); - } else if (o->type == OBJ_SET) { - setTypeIterator *si = setTypeInitIterator(o); - sds sdsele; - while((sdsele = setTypeNextObject(si)) != NULL) { - xorDigest(digest,sdsele,sdslen(sdsele)); - sdsfree(sdsele); - } - setTypeReleaseIterator(si); - } else if (o->type == OBJ_ZSET) { - unsigned char eledigest[20]; - - if (o->encoding == OBJ_ENCODING_ZIPLIST) { - unsigned char *zl = o->ptr; - unsigned char *eptr, *sptr; - unsigned char *vstr; - unsigned int vlen; - long long vll; - double score; - - eptr = ziplistIndex(zl,0); - serverAssert(eptr != NULL); - sptr = ziplistNext(zl,eptr); - serverAssert(sptr != NULL); - - while (eptr != NULL) { - serverAssert(ziplistGet(eptr,&vstr,&vlen,&vll)); - score = zzlGetScore(sptr); - - memset(eledigest,0,20); - if (vstr != NULL) { - mixDigest(eledigest,vstr,vlen); - } else { - 
ll2string(buf,sizeof(buf),vll); - mixDigest(eledigest,buf,strlen(buf)); - } - - snprintf(buf,sizeof(buf),"%.17g",score); - mixDigest(eledigest,buf,strlen(buf)); - xorDigest(digest,eledigest,20); - zzlNext(zl,&eptr,&sptr); - } - } else if (o->encoding == OBJ_ENCODING_SKIPLIST) { - zset *zs = o->ptr; - dictIterator *di = dictGetIterator(zs->dict); - dictEntry *de; - - while((de = dictNext(di)) != NULL) { - sds sdsele = dictGetKey(de); - double *score = dictGetVal(de); - - snprintf(buf,sizeof(buf),"%.17g",*score); - memset(eledigest,0,20); - mixDigest(eledigest,sdsele,sdslen(sdsele)); - mixDigest(eledigest,buf,strlen(buf)); - xorDigest(digest,eledigest,20); - } - dictReleaseIterator(di); - } else { - serverPanic("Unknown sorted set encoding"); - } - } else if (o->type == OBJ_HASH) { - hashTypeIterator *hi = hashTypeInitIterator(o); - while (hashTypeNext(hi) != C_ERR) { - unsigned char eledigest[20]; - sds sdsele; - - memset(eledigest,0,20); - sdsele = hashTypeCurrentObjectNewSds(hi,OBJ_HASH_KEY); - mixDigest(eledigest,sdsele,sdslen(sdsele)); - sdsfree(sdsele); - sdsele = hashTypeCurrentObjectNewSds(hi,OBJ_HASH_VALUE); - mixDigest(eledigest,sdsele,sdslen(sdsele)); - sdsfree(sdsele); - xorDigest(digest,eledigest,20); - } - hashTypeReleaseIterator(hi); - } else if (o->type == OBJ_STREAM) { - streamIterator si; - streamIteratorStart(&si,o->ptr,NULL,NULL,0); - streamID id; - int64_t numfields; - - while(streamIteratorGetID(&si,&id,&numfields)) { - sds itemid = sdscatfmt(sdsempty(),"%U.%U",id.ms,id.seq); - mixDigest(digest,itemid,sdslen(itemid)); - sdsfree(itemid); - - while(numfields--) { - unsigned char *field, *value; - int64_t field_len, value_len; - streamIteratorGetField(&si,&field,&value, - &field_len,&value_len); - mixDigest(digest,field,field_len); - mixDigest(digest,value,value_len); - } - } - streamIteratorStop(&si); - } else if (o->type == OBJ_MODULE) { - RedisModuleDigest md; - moduleValue *mv = o->ptr; - moduleType *mt = mv->type; - 
moduleInitDigestContext(md); - if (mt->digest) { - mt->digest(&md,mv->value); - xorDigest(digest,md.x,sizeof(md.x)); - } - } else { - serverPanic("Unknown object type"); - } - /* If the key has an expire, add it to the mix */ - if (expiretime != -1) xorDigest(digest,"!!expire!!",10); /* We can finally xor the key-val digest to the final digest */ xorDigest(final,digest,20); decrRefCount(keyobj); @@ -289,7 +304,9 @@ void debugCommand(client *c) { "CHANGE-REPL-ID -- Change the replication IDs of the instance. Dangerous, should be used only for testing the replication subsystem.", "CRASH-AND-RECOVER -- Hard crash and restart after delay.", "DIGEST -- Output a hex signature representing the current DB content.", +"DIGEST-VALUE ... -- Output a hex signature of the values of all the specified keys.", "ERROR -- Return a Redis protocol error with as message. Useful for clients unit tests to simulate Redis errors.", +"LOG -- write message to the server log.", "HTSTATS -- Return hash table statistics of the specified Redis database.", "HTSTATS-KEY -- Like htstats but for the hash table stored as key's value.", "LOADAOF -- Flush the AOF buffers on disk and reload the AOF in memory.", @@ -305,6 +322,7 @@ void debugCommand(client *c) { "SLEEP -- Stop the server for . 
Decimals allowed.", "STRUCTSIZE -- Return the size of different Redis core C structures.", "ZIPLIST -- Show low level info about the ziplist encoding.", +"STRINGMATCH-TEST -- Run a fuzz tester against the stringmatchlen() function.", NULL }; addReplyHelp(c, help); @@ -331,8 +349,10 @@ NULL zfree(ptr); addReply(c,shared.ok); } else if (!strcasecmp(c->argv[1]->ptr,"assert")) { - if (c->argc >= 3) c->argv[2] = tryObjectEncoding(c->argv[2]); serverAssertWithInfo(c,c->argv[0],1 == 2); + } else if (!strcasecmp(c->argv[1]->ptr,"log") && c->argc == 3) { + serverLog(LL_WARNING, "DEBUG LOG: %s", (char*)c->argv[2]->ptr); + addReply(c,shared.ok); } else if (!strcasecmp(c->argv[1]->ptr,"reload")) { rdbSaveInfo rsi, *rsiptr; rsiptr = rdbPopulateSaveInfo(&rsi); @@ -341,7 +361,10 @@ NULL return; } emptyDb(-1,EMPTYDB_NO_FLAGS,NULL); - if (rdbLoad(server.rdb_filename,NULL) != C_OK) { + protectClient(c); + int ret = rdbLoad(server.rdb_filename,NULL); + unprotectClient(c); + if (ret != C_OK) { addReplyError(c,"Error trying to load the RDB dump"); return; } @@ -350,7 +373,10 @@ NULL } else if (!strcasecmp(c->argv[1]->ptr,"loadaof")) { if (server.aof_state != AOF_OFF) flushAppendOnlyFile(1); emptyDb(-1,EMPTYDB_NO_FLAGS,NULL); - if (loadAppendOnlyFile(server.aof_filename) != C_OK) { + protectClient(c); + int ret = loadAppendOnlyFile(server.aof_filename); + unprotectClient(c); + if (ret != C_OK) { addReply(c,shared.err); return; } @@ -481,15 +507,80 @@ NULL } addReply(c,shared.ok); } else if (!strcasecmp(c->argv[1]->ptr,"digest") && c->argc == 2) { + /* DEBUG DIGEST (form without keys specified) */ unsigned char digest[20]; sds d = sdsempty(); - int j; computeDatasetDigest(digest); - for (j = 0; j < 20; j++) - d = sdscatprintf(d, "%02x",digest[j]); + for (int i = 0; i < 20; i++) d = sdscatprintf(d, "%02x",digest[i]); addReplyStatus(c,d); sdsfree(d); + } else if (!strcasecmp(c->argv[1]->ptr,"digest-value") && c->argc >= 2) { + /* DEBUG DIGEST-VALUE key key key ... key. 
*/ + addReplyArrayLen(c,c->argc-2); + for (int j = 2; j < c->argc; j++) { + unsigned char digest[20]; + memset(digest,0,20); /* Start with a clean result */ + robj *o = lookupKeyReadWithFlags(c->db,c->argv[j],LOOKUP_NOTOUCH); + if (o) xorObjectDigest(c->db,c->argv[j],digest,o); + + sds d = sdsempty(); + for (int i = 0; i < 20; i++) d = sdscatprintf(d, "%02x",digest[i]); + addReplyStatus(c,d); + sdsfree(d); + } + } else if (!strcasecmp(c->argv[1]->ptr,"protocol") && c->argc == 3) { + /* DEBUG PROTOCOL [string|integer|double|bignum|null|array|set|map| + * attrib|push|verbatim|true|false|state|err|bloberr] */ + char *name = c->argv[2]->ptr; + if (!strcasecmp(name,"string")) { + addReplyBulkCString(c,"Hello World"); + } else if (!strcasecmp(name,"integer")) { + addReplyLongLong(c,12345); + } else if (!strcasecmp(name,"double")) { + addReplyDouble(c,3.14159265359); + } else if (!strcasecmp(name,"bignum")) { + addReplyProto(c,"(1234567999999999999999999999999999999\r\n",40); + } else if (!strcasecmp(name,"null")) { + addReplyNull(c); + } else if (!strcasecmp(name,"array")) { + addReplyArrayLen(c,3); + for (int j = 0; j < 3; j++) addReplyLongLong(c,j); + } else if (!strcasecmp(name,"set")) { + addReplySetLen(c,3); + for (int j = 0; j < 3; j++) addReplyLongLong(c,j); + } else if (!strcasecmp(name,"map")) { + addReplyMapLen(c,3); + for (int j = 0; j < 3; j++) { + addReplyLongLong(c,j); + addReplyBool(c, j == 1); + } + } else if (!strcasecmp(name,"attrib")) { + addReplyAttributeLen(c,1); + addReplyBulkCString(c,"key-popularity"); + addReplyArrayLen(c,2); + addReplyBulkCString(c,"key:123"); + addReplyLongLong(c,90); + /* Attributes are not real replies, so a well formed reply should + * also have a normal reply type after the attribute. 
*/ + addReplyBulkCString(c,"Some real reply following the attribute"); + } else if (!strcasecmp(name,"push")) { + addReplyPushLen(c,2); + addReplyBulkCString(c,"server-cpu-usage"); + addReplyLongLong(c,42); + /* Push replies are not synchronous replies, so we emit also a + * normal reply in order for blocking clients just discarding the + * push reply, to actually consume the reply and continue. */ + addReplyBulkCString(c,"Some real reply following the push reply"); + } else if (!strcasecmp(name,"true")) { + addReplyBool(c,1); + } else if (!strcasecmp(name,"false")) { + addReplyBool(c,0); + } else if (!strcasecmp(name,"verbatim")) { + addReplyVerbatim(c,"This is a verbatim\nstring",25,"txt"); + } else { + addReplyError(c,"Wrong protocol type name. Please use one of the following: string|integer|double|bignum|null|array|set|map|attrib|push|verbatim|true|false|state|err|bloberr"); + } } else if (!strcasecmp(c->argv[1]->ptr,"sleep") && c->argc == 3) { double dtime = strtod(c->argv[2]->ptr,NULL); long long utime = dtime*1000000; @@ -581,6 +672,10 @@ NULL changeReplicationId(); clearReplicationId2(); addReply(c,shared.ok); + } else if (!strcasecmp(c->argv[1]->ptr,"stringmatch-test") && c->argc == 2) + { + stringmatchlen_fuzz_test(); + addReplyStatus(c,"Apparently Redis did not crash: test passed"); } else { addReplySubcommandSyntaxError(c); return; @@ -708,7 +803,7 @@ static void *getMcontextEip(ucontext_t *uc) { #endif #elif defined(__linux__) /* Linux */ - #if defined(__i386__) + #if defined(__i386__) || defined(__ILP32__) return (void*) uc->uc_mcontext.gregs[14]; /* Linux 32 */ #elif defined(__X86_64__) || defined(__x86_64__) return (void*) uc->uc_mcontext.gregs[16]; /* Linux 64 */ @@ -719,6 +814,22 @@ static void *getMcontextEip(ucontext_t *uc) { #elif defined(__aarch64__) /* Linux AArch64 */ return (void*) uc->uc_mcontext.pc; #endif +#elif defined(__FreeBSD__) + /* FreeBSD */ + #if defined(__i386__) + return (void*) uc->uc_mcontext.mc_eip; + #elif 
defined(__x86_64__) + return (void*) uc->uc_mcontext.mc_rip; + #endif +#elif defined(__OpenBSD__) + /* OpenBSD */ + #if defined(__i386__) + return (void*) uc->sc_eip; + #elif defined(__x86_64__) + return (void*) uc->sc_rip; + #endif +#elif defined(__DragonFly__) + return (void*) uc->uc_mcontext.mc_rip; #else return NULL; #endif @@ -804,7 +915,7 @@ void logRegisters(ucontext_t *uc) { /* Linux */ #elif defined(__linux__) /* Linux x86 */ - #if defined(__i386__) + #if defined(__i386__) || defined(__ILP32__) serverLog(LL_WARNING, "\n" "EAX:%08lx EBX:%08lx ECX:%08lx EDX:%08lx\n" @@ -860,6 +971,145 @@ void logRegisters(ucontext_t *uc) { ); logStackContent((void**)uc->uc_mcontext.gregs[15]); #endif +#elif defined(__FreeBSD__) + #if defined(__x86_64__) + serverLog(LL_WARNING, + "\n" + "RAX:%016lx RBX:%016lx\nRCX:%016lx RDX:%016lx\n" + "RDI:%016lx RSI:%016lx\nRBP:%016lx RSP:%016lx\n" + "R8 :%016lx R9 :%016lx\nR10:%016lx R11:%016lx\n" + "R12:%016lx R13:%016lx\nR14:%016lx R15:%016lx\n" + "RIP:%016lx EFL:%016lx\nCSGSFS:%016lx", + (unsigned long) uc->uc_mcontext.mc_rax, + (unsigned long) uc->uc_mcontext.mc_rbx, + (unsigned long) uc->uc_mcontext.mc_rcx, + (unsigned long) uc->uc_mcontext.mc_rdx, + (unsigned long) uc->uc_mcontext.mc_rdi, + (unsigned long) uc->uc_mcontext.mc_rsi, + (unsigned long) uc->uc_mcontext.mc_rbp, + (unsigned long) uc->uc_mcontext.mc_rsp, + (unsigned long) uc->uc_mcontext.mc_r8, + (unsigned long) uc->uc_mcontext.mc_r9, + (unsigned long) uc->uc_mcontext.mc_r10, + (unsigned long) uc->uc_mcontext.mc_r11, + (unsigned long) uc->uc_mcontext.mc_r12, + (unsigned long) uc->uc_mcontext.mc_r13, + (unsigned long) uc->uc_mcontext.mc_r14, + (unsigned long) uc->uc_mcontext.mc_r15, + (unsigned long) uc->uc_mcontext.mc_rip, + (unsigned long) uc->uc_mcontext.mc_rflags, + (unsigned long) uc->uc_mcontext.mc_cs + ); + logStackContent((void**)uc->uc_mcontext.mc_rsp); + #elif defined(__i386__) + serverLog(LL_WARNING, + "\n" + "EAX:%08lx EBX:%08lx ECX:%08lx EDX:%08lx\n" + "EDI:%08lx 
ESI:%08lx EBP:%08lx ESP:%08lx\n" + "SS :%08lx EFL:%08lx EIP:%08lx CS:%08lx\n" + "DS :%08lx ES :%08lx FS :%08lx GS:%08lx", + (unsigned long) uc->uc_mcontext.mc_eax, + (unsigned long) uc->uc_mcontext.mc_ebx, + (unsigned long) uc->uc_mcontext.mc_ecx, + (unsigned long) uc->uc_mcontext.mc_edx, + (unsigned long) uc->uc_mcontext.mc_edi, + (unsigned long) uc->uc_mcontext.mc_esi, + (unsigned long) uc->uc_mcontext.mc_ebp, + (unsigned long) uc->uc_mcontext.mc_esp, + (unsigned long) uc->uc_mcontext.mc_ss, + (unsigned long) uc->uc_mcontext.mc_eflags, + (unsigned long) uc->uc_mcontext.mc_eip, + (unsigned long) uc->uc_mcontext.mc_cs, + (unsigned long) uc->uc_mcontext.mc_ds, + (unsigned long) uc->uc_mcontext.mc_es, + (unsigned long) uc->uc_mcontext.mc_fs, + (unsigned long) uc->uc_mcontext.mc_gs + ); + logStackContent((void**)uc->uc_mcontext.mc_esp); + #endif +#elif defined(__OpenBSD__) + #if defined(__x86_64__) + serverLog(LL_WARNING, + "\n" + "RAX:%016lx RBX:%016lx\nRCX:%016lx RDX:%016lx\n" + "RDI:%016lx RSI:%016lx\nRBP:%016lx RSP:%016lx\n" + "R8 :%016lx R9 :%016lx\nR10:%016lx R11:%016lx\n" + "R12:%016lx R13:%016lx\nR14:%016lx R15:%016lx\n" + "RIP:%016lx EFL:%016lx\nCSGSFS:%016lx", + (unsigned long) uc->sc_rax, + (unsigned long) uc->sc_rbx, + (unsigned long) uc->sc_rcx, + (unsigned long) uc->sc_rdx, + (unsigned long) uc->sc_rdi, + (unsigned long) uc->sc_rsi, + (unsigned long) uc->sc_rbp, + (unsigned long) uc->sc_rsp, + (unsigned long) uc->sc_r8, + (unsigned long) uc->sc_r9, + (unsigned long) uc->sc_r10, + (unsigned long) uc->sc_r11, + (unsigned long) uc->sc_r12, + (unsigned long) uc->sc_r13, + (unsigned long) uc->sc_r14, + (unsigned long) uc->sc_r15, + (unsigned long) uc->sc_rip, + (unsigned long) uc->sc_rflags, + (unsigned long) uc->sc_cs + ); + logStackContent((void**)uc->sc_rsp); + #elif defined(__i386__) + serverLog(LL_WARNING, + "\n" + "EAX:%08lx EBX:%08lx ECX:%08lx EDX:%08lx\n" + "EDI:%08lx ESI:%08lx EBP:%08lx ESP:%08lx\n" + "SS :%08lx EFL:%08lx EIP:%08lx CS:%08lx\n" + "DS :%08lx ES :%08lx FS :%08lx GS:%08lx", + 
(unsigned long) uc->sc_eax, + (unsigned long) uc->sc_ebx, + (unsigned long) uc->sc_ecx, + (unsigned long) uc->sc_edx, + (unsigned long) uc->sc_edi, + (unsigned long) uc->sc_esi, + (unsigned long) uc->sc_ebp, + (unsigned long) uc->sc_esp, + (unsigned long) uc->sc_ss, + (unsigned long) uc->sc_eflags, + (unsigned long) uc->sc_eip, + (unsigned long) uc->sc_cs, + (unsigned long) uc->sc_ds, + (unsigned long) uc->sc_es, + (unsigned long) uc->sc_fs, + (unsigned long) uc->sc_gs + ); + logStackContent((void**)uc->sc_esp); + #endif +#elif defined(__DragonFly__) + serverLog(LL_WARNING, + "\n" + "RAX:%016lx RBX:%016lx\nRCX:%016lx RDX:%016lx\n" + "RDI:%016lx RSI:%016lx\nRBP:%016lx RSP:%016lx\n" + "R8 :%016lx R9 :%016lx\nR10:%016lx R11:%016lx\n" + "R12:%016lx R13:%016lx\nR14:%016lx R15:%016lx\n" + "RIP:%016lx EFL:%016lx\nCSGSFS:%016lx", + (unsigned long) uc->uc_mcontext.mc_rax, + (unsigned long) uc->uc_mcontext.mc_rbx, + (unsigned long) uc->uc_mcontext.mc_rcx, + (unsigned long) uc->uc_mcontext.mc_rdx, + (unsigned long) uc->uc_mcontext.mc_rdi, + (unsigned long) uc->uc_mcontext.mc_rsi, + (unsigned long) uc->uc_mcontext.mc_rbp, + (unsigned long) uc->uc_mcontext.mc_rsp, + (unsigned long) uc->uc_mcontext.mc_r8, + (unsigned long) uc->uc_mcontext.mc_r9, + (unsigned long) uc->uc_mcontext.mc_r10, + (unsigned long) uc->uc_mcontext.mc_r11, + (unsigned long) uc->uc_mcontext.mc_r12, + (unsigned long) uc->uc_mcontext.mc_r13, + (unsigned long) uc->uc_mcontext.mc_r14, + (unsigned long) uc->uc_mcontext.mc_r15, + (unsigned long) uc->uc_mcontext.mc_rip, + (unsigned long) uc->uc_mcontext.mc_rflags, + (unsigned long) uc->uc_mcontext.mc_cs + ); + logStackContent((void**)uc->uc_mcontext.mc_rsp); #else serverLog(LL_WARNING, " Dumping of registers not supported for this OS/arch"); @@ -1179,6 +1429,8 @@ void serverLogHexDump(int level, char *descr, void *value, size_t len) { void watchdogSignalHandler(int sig, siginfo_t *info, void *secret) { #ifdef HAVE_BACKTRACE ucontext_t *uc = (ucontext_t*) secret; +#else + (void)secret; #endif 
UNUSED(info); UNUSED(sig); diff --git a/src/dict.c b/src/dict.c index 2cf9d4839..106467ef7 100644 --- a/src/dict.c +++ b/src/dict.c @@ -739,6 +739,30 @@ unsigned int dictGetSomeKeys(dict *d, dictEntry **des, unsigned int count) { return stored; } +/* This is like dictGetRandomKey() from the POV of the API, but will do more + * work to ensure a better distribution of the returned element. + * + * This function improves the distribution because the dictGetRandomKey() + * problem is that it selects a random bucket, then it selects a random + * element from the chain in the bucket. However elements being in different + * chain lengths will have different probabilities of being reported. With + * this function instead what we do is to consider a "linear" range of the table + * that may be constituted of N buckets with chains of different lengths + * appearing one after the other. Then we report a random element in the range. + * In this way we smooth away the problem of different chain lengths. */ +#define GETFAIR_NUM_ENTRIES 15 +dictEntry *dictGetFairRandomKey(dict *d) { + dictEntry *entries[GETFAIR_NUM_ENTRIES]; + unsigned int count = dictGetSomeKeys(d,entries,GETFAIR_NUM_ENTRIES); + /* Note that dictGetSomeKeys() may return zero elements in an unlucky + * run() even if there are actually elements inside the hash table. So + * when we get zero, we call the true dictGetRandomKey() that will always + * yield an element if the hash table has at least one. */ + if (count == 0) return dictGetRandomKey(d); + unsigned int idx = rand() % count; + return entries[idx]; +} + /* Function to reverse bits. 
Algorithm from: * http://graphics.stanford.edu/~seander/bithacks.html#ReverseParallel */ static unsigned long rev(unsigned long v) { diff --git a/src/dict.h b/src/dict.h index 62018cc44..dec60f637 100644 --- a/src/dict.h +++ b/src/dict.h @@ -166,6 +166,7 @@ dictIterator *dictGetSafeIterator(dict *d); dictEntry *dictNext(dictIterator *iter); void dictReleaseIterator(dictIterator *iter); dictEntry *dictGetRandomKey(dict *d); +dictEntry *dictGetFairRandomKey(dict *d); unsigned int dictGetSomeKeys(dict *d, dictEntry **des, unsigned int count); void dictGetStats(char *buf, size_t bufsize, dict *d); uint64_t dictGenHashFunction(const void *key, int len); diff --git a/src/evict.c b/src/evict.c index ecc25dd8e..773916ce8 100644 --- a/src/evict.c +++ b/src/evict.c @@ -364,7 +364,7 @@ size_t freeMemoryGetNotCountedMemory(void) { } } if (server.aof_state != AOF_OFF) { - overhead += sdslen(server.aof_buf)+aofRewriteBufferSize(); + overhead += sdsalloc(server.aof_buf)+aofRewriteBufferSize(); } return overhead; } @@ -444,6 +444,10 @@ int getMaxmemoryState(size_t *total, size_t *logical, size_t *tofree, float *lev * Otherwise if we are over the memory limit, but not enough memory * was freed to return back under the limit, the function returns C_ERR. */ int freeMemoryIfNeeded(void) { + /* By default replicas should ignore maxmemory + * and just be exact copies of their masters. */ + if (server.masterhost && server.repl_slave_ignore_maxmemory) return C_OK; + size_t mem_reported, mem_tofree, mem_freed; mstime_t latency, eviction_latency; long long delta; @@ -618,3 +622,14 @@ cant_free: return C_ERR; } +/* This is a wrapper for freeMemoryIfNeeded() that only really calls the + * function if right now there are the conditions to do so safely: + * + * - There must be no script in timeout condition. + * - We must not be loading data right now. 
+ * + */ +int freeMemoryIfNeededAndSafe(void) { + if (server.lua_timedout || server.loading) return C_OK; + return freeMemoryIfNeeded(); +} diff --git a/src/geo.c b/src/geo.c index c78fadfcf..91a0421f5 100644 --- a/src/geo.c +++ b/src/geo.c @@ -466,7 +466,7 @@ void georadiusGeneric(client *c, int flags) { /* Look up the requested zset */ robj *zobj = NULL; - if ((zobj = lookupKeyReadOrReply(c, key, shared.emptymultibulk)) == NULL || + if ((zobj = lookupKeyReadOrReply(c, key, shared.null[c->resp])) == NULL || checkType(c, zobj, OBJ_ZSET)) { return; } @@ -566,7 +566,7 @@ void georadiusGeneric(client *c, int flags) { /* If no matching results, the user gets an empty reply. */ if (ga->used == 0 && storekey == NULL) { - addReply(c, shared.emptymultibulk); + addReplyNull(c); geoArrayFree(ga); return; } @@ -597,11 +597,11 @@ void georadiusGeneric(client *c, int flags) { if (withhash) option_length++; - /* The multibulk len we send is exactly result_length. The result is + /* The array len we send is exactly result_length. The result is * either all strings of just zset members *or* a nested multi-bulk * reply containing the zset member string _and_ all the additional * options the user enabled for this request. */ - addReplyMultiBulkLen(c, returned_items); + addReplyArrayLen(c, returned_items); /* Finally send results back to the caller */ int i; @@ -613,7 +613,7 @@ void georadiusGeneric(client *c, int flags) { * as a nested multi-bulk. Add 1 to account for result value * itself. 
*/ if (option_length) - addReplyMultiBulkLen(c, option_length + 1); + addReplyArrayLen(c, option_length + 1); addReplyBulkSds(c,gp->member); gp->member = NULL; @@ -625,7 +625,7 @@ void georadiusGeneric(client *c, int flags) { addReplyLongLong(c, gp->score); if (withcoords) { - addReplyMultiBulkLen(c, 2); + addReplyArrayLen(c, 2); addReplyHumanLongDouble(c, gp->longitude); addReplyHumanLongDouble(c, gp->latitude); } @@ -706,11 +706,11 @@ void geohashCommand(client *c) { /* Geohash elements one after the other, using a null bulk reply for * missing elements. */ - addReplyMultiBulkLen(c,c->argc-2); + addReplyArrayLen(c,c->argc-2); for (j = 2; j < c->argc; j++) { double score; if (!zobj || zsetScore(zobj, c->argv[j]->ptr, &score) == C_ERR) { - addReply(c,shared.nullbulk); + addReplyNull(c); } else { /* The internal format we use for geocoding is a bit different * than the standard, since we use as initial latitude range @@ -721,7 +721,7 @@ void geohashCommand(client *c) { /* Decode... */ double xy[2]; if (!decodeGeohash(score,xy)) { - addReply(c,shared.nullbulk); + addReplyNull(c); continue; } @@ -759,19 +759,19 @@ void geoposCommand(client *c) { /* Report elements one after the other, using a null bulk reply for * missing elements. */ - addReplyMultiBulkLen(c,c->argc-2); + addReplyArrayLen(c,c->argc-2); for (j = 2; j < c->argc; j++) { double score; if (!zobj || zsetScore(zobj, c->argv[j]->ptr, &score) == C_ERR) { - addReply(c,shared.nullmultibulk); + addReplyNullArray(c); } else { /* Decode... 
*/ double xy[2]; if (!decodeGeohash(score,xy)) { - addReply(c,shared.nullmultibulk); + addReplyNullArray(c); continue; } - addReplyMultiBulkLen(c,2); + addReplyArrayLen(c,2); addReplyHumanLongDouble(c,xy[0]); addReplyHumanLongDouble(c,xy[1]); } @@ -797,7 +797,7 @@ void geodistCommand(client *c) { /* Look up the requested zset */ robj *zobj = NULL; - if ((zobj = lookupKeyReadOrReply(c, c->argv[1], shared.nullbulk)) + if ((zobj = lookupKeyReadOrReply(c, c->argv[1], shared.null[c->resp])) == NULL || checkType(c, zobj, OBJ_ZSET)) return; /* Get the scores. We need both otherwise NULL is returned. */ @@ -805,13 +805,13 @@ void geodistCommand(client *c) { if (zsetScore(zobj, c->argv[2]->ptr, &score1) == C_ERR || zsetScore(zobj, c->argv[3]->ptr, &score2) == C_ERR) { - addReply(c,shared.nullbulk); + addReplyNull(c); return; } /* Decode & compute the distance. */ if (!decodeGeohash(score1,xyxy) || !decodeGeohash(score2,xyxy+2)) - addReply(c,shared.nullbulk); + addReplyNull(c); else addReplyDoubleDistance(c, geohashGetDistance(xyxy[0],xyxy[1],xyxy[2],xyxy[3]) / to_meter); diff --git a/src/geohash.c b/src/geohash.c index b40282e76..db5ae025a 100644 --- a/src/geohash.c +++ b/src/geohash.c @@ -127,8 +127,8 @@ int geohashEncode(const GeoHashRange *long_range, const GeoHashRange *lat_range, /* Return an error when trying to index outside the supported * constraints. */ - if (longitude > 180 || longitude < -180 || - latitude > 85.05112878 || latitude < -85.05112878) return 0; + if (longitude > GEO_LONG_MAX || longitude < GEO_LONG_MIN || + latitude > GEO_LAT_MAX || latitude < GEO_LAT_MIN) return 0; hash->bits = 0; hash->step = step; diff --git a/src/gopher.c b/src/gopher.c new file mode 100644 index 000000000..38e44f754 --- /dev/null +++ b/src/gopher.c @@ -0,0 +1,97 @@ +/* + * Copyright (c) 2019, Salvatore Sanfilippo + * All rights reserved. 
+ * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions are met: + * + * * Redistributions of source code must retain the above copyright notice, + * this list of conditions and the following disclaimer. + * * Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * * Neither the name of Redis nor the names of its contributors may be used + * to endorse or promote products derived from this software without + * specific prior written permission. + * + * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" + * AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE + * ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE + * LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR + * CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF + * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS + * INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN + * CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) + * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE + * POSSIBILITY OF SUCH DAMAGE. + */ + +#include "server.h" + +/* Emit an item in Gopher directory listing format: + * + * If descr or selector are NULL, then the "(NULL)" string is used instead. */ +void addReplyGopherItem(client *c, const char *type, const char *descr, + const char *selector, const char *hostname, int port) +{ + sds item = sdscatfmt(sdsempty(),"%s%s\t%s\t%s\t%i\r\n", + type, descr, + selector ? selector : "(NULL)", + hostname ? 
hostname : "(NULL)", + port); + addReplyProto(c,item,sdslen(item)); + sdsfree(item); +} + +/* This is called by processInputBuffer() when an inline request is processed + * with Gopher mode enabled, and the request happens to have zero or just one + * argument. In such case we get the relevant key and reply using the Gopher + * protocol. */ +void processGopherRequest(client *c) { + robj *keyname = c->argc == 0 ? createStringObject("/",1) : c->argv[0]; + robj *o = lookupKeyRead(c->db,keyname); + + /* If there is no such key, return with a Gopher error. */ + if (o == NULL || o->type != OBJ_STRING) { + char *errstr; + if (o == NULL) + errstr = "Error: no content at the specified key"; + else + errstr = "Error: selected key type is invalid " + "for Gopher output"; + addReplyGopherItem(c,"i",errstr,NULL,NULL,0); + addReplyGopherItem(c,"i","Redis Gopher server",NULL,NULL,0); + } else { + addReply(c,o); + } + + /* Cleanup, also make sure to emit the final ".CRLF" line. Note that + * the connection will be closed immediately after this because the client + * will be flagged with CLIENT_CLOSE_AFTER_REPLY, in accordance with the + * Gopher protocol. */ + if (c->argc == 0) decrRefCount(keyname); + + /* Note that in theory we should terminate the Gopher request with + * "." (called Lastline in the RFC) like that: + * + * addReplyProto(c,".\r\n",3); + * + * However after examining the current clients landscape, it's probably + * going to do more harm than good for several reasons: + * + * 1. Clients should not have any issue with missing . as for + * specification, and in the real world indeed certain servers + * implementations never used to send the terminator. + * + * 2. Redis does not know if it's serving a text file or a binary file: + * at the same time clients will not remove the "." bytes at + * the end when downloading a binary file from the server, so adding + * the "Lastline" terminator without knowing the content is just + * dangerous. + * + * 3. 
The utility gopher2redis.rb that we provide for Redis, and any + * other similar tool you may use as Gopher authoring system for + * Redis, can just add the "Lastline" when needed. + */ +} diff --git a/src/help.h b/src/help.h index c89f1f44b..184d76724 100644 --- a/src/help.h +++ b/src/help.h @@ -98,6 +98,11 @@ struct commandHelp { "Get the current connection name", 9, "2.6.9" }, + { "CLIENT ID", + "-", + "Returns the client ID for the current connection", + 9, + "5.0.0" }, { "CLIENT KILL", "[ip:port] [ID client-id] [TYPE normal|master|slave|pubsub] [ADDR ip:port] [SKIPME yes/no]", "Kill the connection of a client", @@ -123,6 +128,11 @@ struct commandHelp { "Set the current connection name", 9, "2.6.9" }, + { "CLIENT UNBLOCK", + "client-id [TIMEOUT|ERROR]", + "Unblock a client blocked in a blocking command from a different connection", + 9, + "5.0.0" }, { "CLUSTER ADDSLOTS", "slot [slot ...]", "Assign new hash slots to receiving node", @@ -145,7 +155,7 @@ struct commandHelp { "3.0.0" }, { "CLUSTER FAILOVER", "[FORCE|TAKEOVER]", - "Forces a slave to perform a manual failover of its master.", + "Forces a replica to perform a manual failover of its master.", 12, "3.0.0" }, { "CLUSTER FORGET", @@ -178,9 +188,14 @@ struct commandHelp { "Get Cluster config for the node", 12, "3.0.0" }, + { "CLUSTER REPLICAS", + "node-id", + "List replica nodes of the specified master node", + 12, + "5.0.0" }, { "CLUSTER REPLICATE", "node-id", - "Reconfigure a node as a slave of the specified master node", + "Reconfigure a node as a replica of the specified master node", 12, "3.0.0" }, { "CLUSTER RESET", @@ -205,7 +220,7 @@ struct commandHelp { "3.0.0" }, { "CLUSTER SLAVES", "node-id", - "List slave nodes of the specified master node", + "List replica nodes of the specified master node", 12, "3.0.0" }, { "CLUSTER SLOTS", @@ -690,12 +705,12 @@ struct commandHelp { "1.0.0" }, { "READONLY", "-", - "Enables read queries for a connection to a cluster slave node", + "Enables read queries for a 
connection to a cluster replica node", 12, "3.0.0" }, { "READWRITE", "-", - "Disables read queries for a connection to a cluster slave node", + "Disables read queries for a connection to a cluster replica node", 12, "3.0.0" }, { "RENAME", @@ -708,6 +723,11 @@ struct commandHelp { "Rename a key, only if the new key does not exist", 0, "1.0.0" }, + { "REPLICAOF", + "host port", + "Make the server a replica of another instance, or promote it as master.", + 9, + "5.0.0" }, { "RESTORE", "key ttl serialized-value [REPLACE]", "Create a key using the provided serialized value, previously obtained using DUMP.", @@ -845,7 +865,7 @@ struct commandHelp { "1.0.0" }, { "SLAVEOF", "host port", - "Make the server a slave of another instance, or promote it as master", + "Make the server a replica of another instance, or promote it as master. Deprecated starting with Redis 5. Use REPLICAOF instead.", 9, "1.0.0" }, { "SLOWLOG", @@ -954,7 +974,7 @@ struct commandHelp { 7, "2.2.0" }, { "WAIT", - "numslaves timeout", + "numreplicas timeout", "Wait for the synchronous replication of all the write commands sent in the context of the current connection", 0, "3.0.0" }, @@ -963,11 +983,36 @@ struct commandHelp { "Watch the given keys to determine execution of the MULTI/EXEC block", 7, "2.2.0" }, + { "XACK", + "key group ID [ID ...]", + "Marks a pending message as correctly processed, effectively removing it from the pending entries list of the consumer group. Return value of the command is the number of messages successfully acknowledged, that is, the IDs we were actually able to resolve in the PEL.", + 14, + "5.0.0" }, { "XADD", "key ID field string [field string ...]", "Appends a new entry to a stream", 14, "5.0.0" }, + { "XCLAIM", + "key group consumer min-idle-time ID [ID ...] 
[IDLE ms] [TIME ms-unix-time] [RETRYCOUNT count] [force] [justid]", + "Changes (or acquires) ownership of a message in a consumer group, as if the message was delivered to the specified consumer.", + 14, + "5.0.0" }, + { "XDEL", + "key ID [ID ...]", + "Removes the specified entries from the stream. Returns the number of items actually deleted, that may be different from the number of IDs passed in case certain IDs do not exist.", + 14, + "5.0.0" }, + { "XGROUP", + "[CREATE key groupname id-or-$] [SETID key id-or-$] [DESTROY key groupname] [DELCONSUMER key groupname consumername]", + "Create, destroy, and manage consumer groups.", + 14, + "5.0.0" }, + { "XINFO", + "[CONSUMERS key groupname] [GROUPS key] [STREAM key] [HELP]", + "Get information on streams and consumer groups", + 14, + "5.0.0" }, { "XLEN", "key", "Return the number of entires in a stream", @@ -998,6 +1043,11 @@ struct commandHelp { "Return a range of elements in a stream, with IDs matching the specified IDs interval, in reverse order (from greater to smaller IDs) compared to XRANGE", 14, "5.0.0" }, + { "XTRIM", + "key MAXLEN [~] count", + "Trims the stream to (approximately if '~' is passed) a certain size", + 14, + "5.0.0" }, { "ZADD", "key [NX|XX] [CH] [INCR] score member [score member ...]", "Add one or more members to a sorted set, or update its score if it already exists", diff --git a/src/hyperloglog.c b/src/hyperloglog.c index ba3a3ab60..fc21ea006 100644 --- a/src/hyperloglog.c +++ b/src/hyperloglog.c @@ -1512,7 +1512,7 @@ void pfdebugCommand(client *c) { } hdr = o->ptr; - addReplyMultiBulkLen(c,HLL_REGISTERS); + addReplyArrayLen(c,HLL_REGISTERS); for (j = 0; j < HLL_REGISTERS; j++) { uint8_t val; diff --git a/src/intset.c b/src/intset.c index 198c90aa1..4445a5ca6 100644 --- a/src/intset.c +++ b/src/intset.c @@ -123,7 +123,7 @@ static uint8_t intsetSearch(intset *is, int64_t value, uint32_t *pos) { } else { /* Check for the case where we know we cannot find the value, * but do know the insert 
position. */ - if (value > _intsetGet(is,intrev32ifbe(is->length)-1)) { + if (value > _intsetGet(is,max)) { if (pos) *pos = intrev32ifbe(is->length); return 0; } else if (value < _intsetGet(is,0)) { diff --git a/src/latency.c b/src/latency.c index e8d2af306..33aa1245b 100644 --- a/src/latency.c +++ b/src/latency.c @@ -476,19 +476,19 @@ sds createLatencyReport(void) { /* latencyCommand() helper to produce a time-delay reply for all the samples * in memory for the specified time series. */ void latencyCommandReplyWithSamples(client *c, struct latencyTimeSeries *ts) { - void *replylen = addDeferredMultiBulkLength(c); + void *replylen = addReplyDeferredLen(c); int samples = 0, j; for (j = 0; j < LATENCY_TS_LEN; j++) { int i = (ts->idx + j) % LATENCY_TS_LEN; if (ts->samples[i].time == 0) continue; - addReplyMultiBulkLen(c,2); + addReplyArrayLen(c,2); addReplyLongLong(c,ts->samples[i].time); addReplyLongLong(c,ts->samples[i].latency); samples++; } - setDeferredMultiBulkLength(c,replylen,samples); + setDeferredArrayLen(c,replylen,samples); } /* latencyCommand() helper to produce the reply for the LATEST subcommand, @@ -497,14 +497,14 @@ void latencyCommandReplyWithLatestEvents(client *c) { dictIterator *di; dictEntry *de; - addReplyMultiBulkLen(c,dictSize(server.latency_events)); + addReplyArrayLen(c,dictSize(server.latency_events)); di = dictGetIterator(server.latency_events); while((de = dictNext(di)) != NULL) { char *event = dictGetKey(de); struct latencyTimeSeries *ts = dictGetVal(de); int last = (ts->idx + LATENCY_TS_LEN - 1) % LATENCY_TS_LEN; - addReplyMultiBulkLen(c,4); + addReplyArrayLen(c,4); addReplyBulkCString(c,event); addReplyLongLong(c,ts->samples[last].time); addReplyLongLong(c,ts->samples[last].latency); @@ -560,19 +560,30 @@ sds latencyCommandGenSparkeline(char *event, struct latencyTimeSeries *ts) { /* LATENCY command implementations. * - * LATENCY SAMPLES: return time-latency samples for the specified event. 
+ * LATENCY HISTORY: return time-latency samples for the specified event. * LATENCY LATEST: return the latest latency for all the events classes. - * LATENCY DOCTOR: returns an human readable analysis of instance latency. + * LATENCY DOCTOR: returns a human readable analysis of instance latency. * LATENCY GRAPH: provide an ASCII graph of the latency of the specified event. + * LATENCY RESET: reset data of a specified event or all the data if no event provided. */ void latencyCommand(client *c) { + const char *help[] = { +"DOCTOR -- Returns a human readable latency analysis report.", +"GRAPH -- Returns an ASCII latency graph for the event class.", +"HISTORY -- Returns time-latency samples for the event class.", +"LATEST -- Returns the latest latency samples for all events.", +"RESET [event ...] -- Resets latency data of one or more event classes.", +" (default: reset all data for all event classes)", +"HELP -- Prints this help.", +NULL + }; struct latencyTimeSeries *ts; if (!strcasecmp(c->argv[1]->ptr,"history") && c->argc == 3) { /* LATENCY HISTORY */ ts = dictFetchValue(server.latency_events,c->argv[2]->ptr); if (ts == NULL) { - addReplyMultiBulkLen(c,0); + addReplyArrayLen(c,0); } else { latencyCommandReplyWithSamples(c,ts); } @@ -610,8 +621,10 @@ void latencyCommand(client *c) { resets += latencyResetEvent(c->argv[j]->ptr); addReplyLongLong(c,resets); } + } else if (!strcasecmp(c->argv[1]->ptr,"help") && c->argc >= 2) { + addReplyHelp(c, help); } else { - addReply(c,shared.syntaxerr); + addReplySubcommandSyntaxError(c); } return; diff --git a/src/lazyfree.c b/src/lazyfree.c index ac8a6bee9..3d3159c90 100644 --- a/src/lazyfree.c +++ b/src/lazyfree.c @@ -90,6 +90,17 @@ int dbAsyncDelete(redisDb *db, robj *key) { } } +/* Free an object, if the object is huge enough, free it in async way. 
*/ +void freeObjAsync(robj *o) { + size_t free_effort = lazyfreeGetFreeEffort(o); + if (free_effort > LAZYFREE_THRESHOLD && o->refcount == 1) { + atomicIncr(lazyfree_objects,1); + bioCreateBackgroundJob(BIO_LAZY_FREE,o,NULL,NULL); + } else { + decrRefCount(o); + } +} + /* Empty a Redis DB asynchronously. What the function does actually is to * create a new empty set of hash tables and scheduling the old ones for * lazy freeing. */ diff --git a/src/listpack.c b/src/listpack.c index c3070db6d..e1f4d9a02 100644 --- a/src/listpack.c +++ b/src/listpack.c @@ -707,6 +707,26 @@ unsigned char *lpInsert(unsigned char *lp, unsigned char *ele, uint32_t size, un } } lpSetTotalBytes(lp,new_listpack_bytes); + +#if 0 + /* This code path is normally disabled: what it does is to force listpack + * to return *always* a new pointer after performing some modification to + * the listpack, even if the previous allocation was enough. This is useful + * in order to spot bugs in code using listpacks: by doing so we can find + * if the caller forgets to set the new pointer where the listpack reference + * is stored, after an update. */ + unsigned char *oldlp = lp; + lp = lp_malloc(new_listpack_bytes); + memcpy(lp,oldlp,new_listpack_bytes); + if (newp) { + unsigned long offset = (*newp)-oldlp; + *newp = lp + offset; + } + /* Make sure the old allocation contains garbage. */ + memset(oldlp,'A',new_listpack_bytes); + lp_free(oldlp); +#endif + return lp; } diff --git a/src/lolwut.c b/src/lolwut.c new file mode 100644 index 000000000..19cbcf642 --- /dev/null +++ b/src/lolwut.c @@ -0,0 +1,56 @@ +/* + * Copyright (c) 2018, Salvatore Sanfilippo + * All rights reserved. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions are met: + * + * * Redistributions of source code must retain the above copyright notice, + * this list of conditions and the following disclaimer. 
+ * * Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * * Neither the name of Redis nor the names of its contributors may be used + * to endorse or promote products derived from this software without + * specific prior written permission. + * + * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" + * AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE + * ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE + * LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR + * CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF + * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS + * INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN + * CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) + * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE + * POSSIBILITY OF SUCH DAMAGE. + * + * ---------------------------------------------------------------------------- + * + * This file implements the LOLWUT command. The command should do something + * fun and interesting, and should be replaced by a new implementation at + * each new version of Redis. + */ + +#include "server.h" + +void lolwut5Command(client *c); + +/* The default target for LOLWUT if no matching version was found. + * This is what unstable versions of Redis will display. */ +void lolwutUnstableCommand(client *c) { + sds rendered = sdsnew("Redis ver. 
"); + rendered = sdscat(rendered,REDIS_VERSION); + rendered = sdscatlen(rendered,"\n",1); + addReplyBulkSds(c,rendered); +} + +void lolwutCommand(client *c) { + char *v = REDIS_VERSION; + if ((v[0] == '5' && v[1] == '.') || + (v[0] == '4' && v[1] == '.' && v[2] == '9')) + lolwut5Command(c); + else + lolwutUnstableCommand(c); +} diff --git a/src/lolwut5.c b/src/lolwut5.c new file mode 100644 index 000000000..8408b378d --- /dev/null +++ b/src/lolwut5.c @@ -0,0 +1,282 @@ +/* + * Copyright (c) 2018, Salvatore Sanfilippo + * All rights reserved. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions are met: + * + * * Redistributions of source code must retain the above copyright notice, + * this list of conditions and the following disclaimer. + * * Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * * Neither the name of Redis nor the names of its contributors may be used + * to endorse or promote products derived from this software without + * specific prior written permission. + * + * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" + * AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE + * ARE DISCLAIMED. 
IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE + * LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR + * CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF + * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS + * INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN + * CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) + * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE + * POSSIBILITY OF SUCH DAMAGE. + * + * ---------------------------------------------------------------------------- + * + * This file implements the LOLWUT command. The command should do something + * fun and interesting, and should be replaced by a new implementation at + * each new version of Redis. + */ + +#include "server.h" +#include <math.h> + +/* This structure represents our canvas. Drawing functions will take a pointer + * to a canvas to write to it. Later the canvas can be rendered to a string + * suitable to be printed on the screen, using unicode Braille characters. */ +typedef struct lwCanvas { + int width; + int height; + char *pixels; +} lwCanvas; + +/* Translate a group of 8 pixels (2x4 vertical rectangle) to the corresponding + * braille character. The byte should correspond to the pixels arranged as + * follows, where 0 is the least significant bit, and 7 the most significant + * bit: + * + * 0 3 + * 1 4 + * 2 5 + * 6 7 + * + * The corresponding utf8 encoded character is set into the three bytes + * pointed by 'output'. + */ +#include <stdio.h> +void lwTranslatePixelsGroup(int byte, char *output) { + int code = 0x2800 + byte; + /* Convert to unicode. This is in the U0800-UFFFF range, so we need to + * emit it like this in three bytes: + * 1110xxxx 10xxxxxx 10xxxxxx. 
*/ + output[0] = 0xE0 | (code >> 12); /* 1110-xxxx */ + output[1] = 0x80 | ((code >> 6) & 0x3F); /* 10-xxxxxx */ + output[2] = 0x80 | (code & 0x3F); /* 10-xxxxxx */ +} + +/* Allocate and return a new canvas of the specified size. */ +lwCanvas *lwCreateCanvas(int width, int height) { + lwCanvas *canvas = zmalloc(sizeof(*canvas)); + canvas->width = width; + canvas->height = height; + canvas->pixels = zmalloc(width*height); + memset(canvas->pixels,0,width*height); + return canvas; +} + +/* Free the canvas created by lwCreateCanvas(). */ +void lwFreeCanvas(lwCanvas *canvas) { + zfree(canvas->pixels); + zfree(canvas); +} + +/* Set a pixel to the specified color. Color is 0 or 1, where zero means no + * dot will be displayed, and 1 means dot will be displayed. + * Coordinates are arranged so that left-top corner is 0,0. You can write + * out of the size of the canvas without issues. */ +void lwDrawPixel(lwCanvas *canvas, int x, int y, int color) { + if (x < 0 || x >= canvas->width || + y < 0 || y >= canvas->height) return; + canvas->pixels[x+y*canvas->width] = color; +} + +/* Return the value of the specified pixel on the canvas. */ +int lwGetPixel(lwCanvas *canvas, int x, int y) { + if (x < 0 || x >= canvas->width || + y < 0 || y >= canvas->height) return 0; + return canvas->pixels[x+y*canvas->width]; +} + +/* Draw a line from x1,y1 to x2,y2 using the Bresenham algorithm. */ +void lwDrawLine(lwCanvas *canvas, int x1, int y1, int x2, int y2, int color) { + int dx = abs(x2-x1); + int dy = abs(y2-y1); + int sx = (x1 < x2) ? 1 : -1; + int sy = (y1 < y2) ? 1 : -1; + int err = dx-dy, e2; + + while(1) { + lwDrawPixel(canvas,x1,y1,color); + if (x1 == x2 && y1 == y2) break; + e2 = err*2; + if (e2 > -dy) { + err -= dy; + x1 += sx; + } + if (e2 < dx) { + err += dx; + y1 += sy; + } + } +} + +/* Draw a square centered at the specified x,y coordinates, with the specified + * rotation angle and size. 
In order to write a rotated square, we use the + * trivial fact that the parametric equation: + * + * x = sin(k) + * y = cos(k) + * + * Describes a circle for values going from 0 to 2*PI. So basically if we start + * at 45 degrees, that is k = PI/4, with the first point, and then we find + * the other three points incrementing K by PI/2 (90 degrees), we'll have the + * points of the square. In order to rotate the square, we just start with + * k = PI/4 + rotation_angle, and we are done. + * + * Of course the vanilla equations above will describe the square inside a + * circle of radius 1, so in order to draw larger squares we'll have to + * multiply the obtained coordinates, and then translate them. However this + * is much simpler than implementing the abstract concept of 2D shape and then + * performing the rotation/translation transformation, so for LOLWUT it's + * a good approach. */ +void lwDrawSquare(lwCanvas *canvas, int x, int y, float size, float angle) { + int px[4], py[4]; + + /* Adjust the desired size according to the fact that the square inscribed + * into a circle of radius 1 has the side of length SQRT(2). This way + * size becomes a simple multiplication factor we can use with our + * coordinates to magnify them. */ + size /= 1.4142135623; + size = round(size); + + /* Compute the four points. */ + float k = M_PI/4 + angle; + for (int j = 0; j < 4; j++) { + px[j] = round(sin(k) * size + x); + py[j] = round(cos(k) * size + y); + k += M_PI/2; + } + + /* Draw the square. */ + for (int j = 0; j < 4; j++) + lwDrawLine(canvas,px[j],py[j],px[(j+1)%4],py[(j+1)%4],1); +} + +/* Schotter, the output of LOLWUT of Redis 5, is a computer graphic art piece + * generated by Georg Nees in the 60s. It explores the relationship between + * chaos and order. + * + * The function creates the canvas itself, depending on the columns available + * in the output display and the number of squares per row and per column + * requested by the caller. 
*/ +lwCanvas *lwDrawSchotter(int console_cols, int squares_per_row, int squares_per_col) { + /* Calculate the canvas size. */ + int canvas_width = console_cols*2; + int padding = canvas_width > 4 ? 2 : 0; + float square_side = (float)(canvas_width-padding*2) / squares_per_row; + int canvas_height = square_side * squares_per_col + padding*2; + lwCanvas *canvas = lwCreateCanvas(canvas_width, canvas_height); + + for (int y = 0; y < squares_per_col; y++) { + for (int x = 0; x < squares_per_row; x++) { + int sx = x * square_side + square_side/2 + padding; + int sy = y * square_side + square_side/2 + padding; + /* Rotate and translate randomly as we go down to lower + * rows. */ + float angle = 0; + if (y > 1) { + float r1 = (float)rand() / RAND_MAX / squares_per_col * y; + float r2 = (float)rand() / RAND_MAX / squares_per_col * y; + float r3 = (float)rand() / RAND_MAX / squares_per_col * y; + if (rand() % 2) r1 = -r1; + if (rand() % 2) r2 = -r2; + if (rand() % 2) r3 = -r3; + angle = r1; + sx += r2*square_side/3; + sy += r3*square_side/3; + } + lwDrawSquare(canvas,sx,sy,square_side,angle); + } + } + + return canvas; +} + +/* Converts the canvas to an SDS string representing the UTF8 characters to + * print to the terminal in order to obtain a graphical representation of the + * logical canvas. The actual returned string will require a terminal that is + * width/2 large and height/4 tall in order to hold the whole image without + * overflowing or scrolling, since each Braille character is 2x4. */ +sds lwRenderCanvas(lwCanvas *canvas) { + sds text = sdsempty(); + for (int y = 0; y < canvas->height; y += 4) { + for (int x = 0; x < canvas->width; x += 2) { + /* We need to emit groups of 8 bits according to a specific + * arrangement. See lwTranslatePixelsGroup() for more info. 
*/ + int byte = 0; + if (lwGetPixel(canvas,x,y)) byte |= (1<<0); + if (lwGetPixel(canvas,x,y+1)) byte |= (1<<1); + if (lwGetPixel(canvas,x,y+2)) byte |= (1<<2); + if (lwGetPixel(canvas,x+1,y)) byte |= (1<<3); + if (lwGetPixel(canvas,x+1,y+1)) byte |= (1<<4); + if (lwGetPixel(canvas,x+1,y+2)) byte |= (1<<5); + if (lwGetPixel(canvas,x,y+3)) byte |= (1<<6); + if (lwGetPixel(canvas,x+1,y+3)) byte |= (1<<7); + char unicode[3]; + lwTranslatePixelsGroup(byte,unicode); + text = sdscatlen(text,unicode,3); + } + if (y != canvas->height-1) text = sdscatlen(text,"\n",1); + } + return text; +} + +/* The LOLWUT command: + * + * LOLWUT [terminal columns] [squares-per-row] [squares-per-col] + * + * By default the command uses 66 columns, 8 squares per row, 12 squares + * per column. + */ +void lolwut5Command(client *c) { + long cols = 66; + long squares_per_row = 8; + long squares_per_col = 12; + + /* Parse the optional arguments if any. */ + if (c->argc > 1 && + getLongFromObjectOrReply(c,c->argv[1],&cols,NULL) != C_OK) + return; + + if (c->argc > 2 && + getLongFromObjectOrReply(c,c->argv[2],&squares_per_row,NULL) != C_OK) + return; + + if (c->argc > 3 && + getLongFromObjectOrReply(c,c->argv[3],&squares_per_col,NULL) != C_OK) + return; + + /* Limits. We want LOLWUT to be always reasonably fast and cheap to execute + * so we have maximum number of columns, rows, and output resolution. */ + if (cols < 1) cols = 1; + if (cols > 1000) cols = 1000; + if (squares_per_row < 1) squares_per_row = 1; + if (squares_per_row > 200) squares_per_row = 200; + if (squares_per_col < 1) squares_per_col = 1; + if (squares_per_col > 200) squares_per_col = 200; + + /* Generate some computer art and reply. */ + lwCanvas *canvas = lwDrawSchotter(cols,squares_per_row,squares_per_col); + sds rendered = lwRenderCanvas(canvas); + rendered = sdscat(rendered, + "\nGeorg Nees - schotter, plotter on paper, 1968. Redis ver. 
"); + rendered = sdscat(rendered,REDIS_VERSION); + rendered = sdscatlen(rendered,"\n",1); + addReplyBulkSds(c,rendered); + lwFreeCanvas(canvas); +} diff --git a/src/lzf_d.c b/src/lzf_d.c index 93f43c27c..d44bfcc8d 100644 --- a/src/lzf_d.c +++ b/src/lzf_d.c @@ -52,6 +52,10 @@ #endif #endif +#if defined(__GNUC__) && __GNUC__ >= 5 +#pragma GCC diagnostic push +#pragma GCC diagnostic ignored "-Wimplicit-fallthrough" +#endif unsigned int lzf_decompress (const void *const in_data, unsigned int in_len, void *out_data, unsigned int out_len) @@ -86,8 +90,6 @@ lzf_decompress (const void *const in_data, unsigned int in_len, #ifdef lzf_movsb lzf_movsb (op, ip, ctrl); #else -#pragma GCC diagnostic push -#pragma GCC diagnostic ignored "-Wimplicit-fallthrough" switch (ctrl) { case 32: *op++ = *ip++; case 31: *op++ = *ip++; case 30: *op++ = *ip++; case 29: *op++ = *ip++; @@ -99,7 +101,6 @@ lzf_decompress (const void *const in_data, unsigned int in_len, case 8: *op++ = *ip++; case 7: *op++ = *ip++; case 6: *op++ = *ip++; case 5: *op++ = *ip++; case 4: *op++ = *ip++; case 3: *op++ = *ip++; case 2: *op++ = *ip++; case 1: *op++ = *ip++; } -#pragma GCC diagnostic pop #endif } else /* back reference */ @@ -185,4 +186,6 @@ lzf_decompress (const void *const in_data, unsigned int in_len, return op - (u8 *)out_data; } - +#if defined(__GNUC__) && __GNUC__ >= 5 +#pragma GCC diagnostic pop +#endif diff --git a/src/mkreleasehdr.sh b/src/mkreleasehdr.sh index 1ae95886b..e6d558b17 100755 --- a/src/mkreleasehdr.sh +++ b/src/mkreleasehdr.sh @@ -2,6 +2,9 @@ GIT_SHA1=`(git show-ref --head --hash=8 2> /dev/null || echo 00000000) | head -n1` GIT_DIRTY=`git diff --no-ext-diff 2> /dev/null | wc -l` BUILD_ID=`uname -n`"-"`date +%s` +if [ -n "$SOURCE_DATE_EPOCH" ]; then + BUILD_ID=$(date -u -d "@$SOURCE_DATE_EPOCH" +%s 2>/dev/null || date -u -r "$SOURCE_DATE_EPOCH" +%s 2>/dev/null || date -u %s) +fi test -f release.h || touch release.h (cat release.h | grep SHA1 | grep $GIT_SHA1) && \ (cat release.h | grep 
DIRTY | grep $GIT_DIRTY) && exit 0 # Already up-to-date diff --git a/src/module.c b/src/module.c index 9809cd74e..81982ba76 100644 --- a/src/module.c +++ b/src/module.c @@ -64,6 +64,7 @@ struct AutoMemEntry { #define REDISMODULE_AM_STRING 1 #define REDISMODULE_AM_REPLY 2 #define REDISMODULE_AM_FREED 3 /* Explicitly freed by user already. */ +#define REDISMODULE_AM_DICT 4 /* The pool allocator block. Redis Modules can allocate memory via this special * allocator that will automatically release it all once the callback returns. @@ -241,9 +242,21 @@ typedef struct RedisModuleKeyspaceSubscriber { /* The module keyspace notification subscribers list */ static list *moduleKeyspaceSubscribers; -/* Static client recycled for all notification clients, to avoid allocating - * per round. */ -static client *moduleKeyspaceSubscribersClient; +/* Static client recycled for when we need to provide a context with a client + * in a situation where there is no client to provide. This avoids allocating + * a new client per round. For instance this is used in the keyspace + * notifications, timers and cluster messages callbacks. */ +static client *moduleFreeContextReusedClient; + +/* Data structures related to the exported dictionary data structure. */ +typedef struct RedisModuleDict { + rax *rax; /* The radix tree. 
*/ +} RedisModuleDict; + +typedef struct RedisModuleDictIter { + RedisModuleDict *dict; + raxIterator ri; +} RedisModuleDictIter; /* -------------------------------------------------------------------------- * Prototypes @@ -256,6 +269,7 @@ robj **moduleCreateArgvFromUserFormat(const char *cmdname, const char *fmt, int void moduleReplicateMultiIfNeeded(RedisModuleCtx *ctx); void RM_ZsetRangeStop(RedisModuleKey *kp); static void zsetKeyReset(RedisModuleKey *key); +void RM_FreeDict(RedisModuleCtx *ctx, RedisModuleDict *d); /* -------------------------------------------------------------------------- * Heap allocation raw functions @@ -474,7 +488,7 @@ void moduleHandlePropagationAfterCommandCallback(RedisModuleCtx *ctx) { if (c->flags & CLIENT_LUA) return; /* Handle the replication of the final EXEC, since whatever a command - * emits is always wrappered around MULTI/EXEC. */ + * emits is always wrapped around MULTI/EXEC. */ if (ctx->flags & REDISMODULE_CTX_MULTI_EMITTED) { robj *propargv[1]; propargv[0] = createStringObject("EXEC",4); @@ -548,7 +562,7 @@ void RM_KeyAtPos(RedisModuleCtx *ctx, int pos) { ctx->keys_pos[ctx->keys_count++] = pos; } -/* Helper for RM_CreateCommand(). Truns a string representing command +/* Helper for RM_CreateCommand(). Turns a string representing command * flags into the command flags used by the Redis core. * * It returns the set of flags, or -1 if unknown flags are found. */ @@ -595,7 +609,7 @@ int commandFlagsFromString(char *s) { * And is supposed to always return REDISMODULE_OK. * * The set of flags 'strflags' specify the behavior of the command, and should - * be passed as a C string compoesd of space separated words, like for + * be passed as a C string composed of space separated words, like for * example "write deny-oom". 
The set of flags are: * * * **"write"**: The command may modify the data set (it may also read @@ -616,7 +630,7 @@ int commandFlagsFromString(char *s) { * * **"allow-stale"**: The command is allowed to run on slaves that don't * serve stale data. Don't use if you don't know what * this means. - * * **"no-monitor"**: Don't propoagate the command on monitor. Use this if + * * **"no-monitor"**: Don't propagate the command on monitor. Use this if * the command has sensible data among the arguments. * * **"fast"**: The command time complexity is not greater * than O(log(N)) where N is the size of the collection or @@ -670,6 +684,7 @@ int RM_CreateCommand(RedisModuleCtx *ctx, const char *name, RedisModuleCmdFunc c cp->rediscmd->calls = 0; dictAdd(server.commands,sdsdup(cmdname),cp->rediscmd); dictAdd(server.orig_commands,sdsdup(cmdname),cp->rediscmd); + cp->rediscmd->id = ACLGetCommandID(cmdname); /* ID used for ACL. */ return REDISMODULE_OK; } @@ -777,6 +792,7 @@ void autoMemoryCollect(RedisModuleCtx *ctx) { case REDISMODULE_AM_STRING: decrRefCount(ptr); break; case REDISMODULE_AM_REPLY: RM_FreeCallReply(ptr); break; case REDISMODULE_AM_KEY: RM_CloseKey(ptr); break; + case REDISMODULE_AM_DICT: RM_FreeDict(NULL,ptr); break; } } ctx->flags |= REDISMODULE_CTX_AUTO_MEMORY; @@ -794,19 +810,26 @@ void autoMemoryCollect(RedisModuleCtx *ctx) { * with RedisModule_FreeString(), unless automatic memory is enabled. * * The string is created by copying the `len` bytes starting - * at `ptr`. No reference is retained to the passed buffer. */ + * at `ptr`. No reference is retained to the passed buffer. + * + * The module context 'ctx' is optional and may be NULL if you want to create + * a string out of the context scope. However in that case, the automatic + * memory management will not be available, and the string memory must be + * managed manually. 
*/ RedisModuleString *RM_CreateString(RedisModuleCtx *ctx, const char *ptr, size_t len) { RedisModuleString *o = createStringObject(ptr,len); - autoMemoryAdd(ctx,REDISMODULE_AM_STRING,o); + if (ctx != NULL) autoMemoryAdd(ctx,REDISMODULE_AM_STRING,o); return o; } - /* Create a new module string object from a printf format and arguments. * The returned string must be freed with RedisModule_FreeString(), unless * automatic memory is enabled. * - * The string is created using the sds formatter function sdscatvprintf(). */ + * The string is created using the sds formatter function sdscatvprintf(). + * + * The passed context 'ctx' may be NULL if necessary, see the + * RedisModule_CreateString() documentation for more info. */ RedisModuleString *RM_CreateStringPrintf(RedisModuleCtx *ctx, const char *fmt, ...) { sds s = sdsempty(); @@ -816,7 +839,7 @@ RedisModuleString *RM_CreateStringPrintf(RedisModuleCtx *ctx, const char *fmt, . va_end(ap); RedisModuleString *o = createObject(OBJ_STRING, s); - autoMemoryAdd(ctx,REDISMODULE_AM_STRING,o); + if (ctx != NULL) autoMemoryAdd(ctx,REDISMODULE_AM_STRING,o); return o; } @@ -826,7 +849,10 @@ RedisModuleString *RM_CreateStringPrintf(RedisModuleCtx *ctx, const char *fmt, . * integer instead of taking a buffer and its length. * * The returned string must be released with RedisModule_FreeString() or by - * enabling automatic memory management. */ + * enabling automatic memory management. + * + * The passed context 'ctx' may be NULL if necessary, see the + * RedisModule_CreateString() documentation for more info. */ RedisModuleString *RM_CreateStringFromLongLong(RedisModuleCtx *ctx, long long ll) { char buf[LONG_STR_SIZE]; size_t len = ll2string(buf,sizeof(buf),ll); @@ -837,10 +863,13 @@ RedisModuleString *RM_CreateStringFromLongLong(RedisModuleCtx *ctx, long long ll * RedisModuleString. * * The returned string must be released with RedisModule_FreeString() or by - * enabling automatic memory management. 
*/ + * enabling automatic memory management. + * + * The passed context 'ctx' may be NULL if necessary, see the + * RedisModule_CreateString() documentation for more info. */ RedisModuleString *RM_CreateStringFromString(RedisModuleCtx *ctx, const RedisModuleString *str) { RedisModuleString *o = dupStringObject(str); - autoMemoryAdd(ctx,REDISMODULE_AM_STRING,o); + if (ctx != NULL) autoMemoryAdd(ctx,REDISMODULE_AM_STRING,o); return o; } @@ -849,10 +878,16 @@ RedisModuleString *RM_CreateStringFromString(RedisModuleCtx *ctx, const RedisMod * * It is possible to call this function even when automatic memory management * is enabled. In that case the string will be released ASAP and removed - * from the pool of string to release at the end. */ + * from the pool of string to release at the end. + * + * If the string was created with a NULL context 'ctx', it is also possible to + * pass ctx as NULL when releasing the string (but passing a context will not + * create any issue). Strings created with a context should be freed also passing + * the context, so if you want to free a string out of context later, make sure + * to create it using a NULL context. */ void RM_FreeString(RedisModuleCtx *ctx, RedisModuleString *str) { decrRefCount(str); - autoMemoryFreed(ctx,REDISMODULE_AM_STRING,str); + if (ctx != NULL) autoMemoryFreed(ctx,REDISMODULE_AM_STRING,str); } /* Every call to this function, will make the string 'str' requiring @@ -876,9 +911,11 @@ void RM_FreeString(RedisModuleCtx *ctx, RedisModuleString *str) { * Note that when memory management is turned off, you don't need * any call to RetainString() since creating a string will always result * into a string that lives after the callback function returns, if - * no FreeString() call is performed. */ + * no FreeString() call is performed. + * + * It is possible to call this function with a NULL context. 
*/ void RM_RetainString(RedisModuleCtx *ctx, RedisModuleString *str) { - if (!autoMemoryFreed(ctx,REDISMODULE_AM_STRING,str)) { + if (ctx == NULL || !autoMemoryFreed(ctx,REDISMODULE_AM_STRING,str)) { /* Increment the string reference counting only if we can't * just remove the object from the list of objects that should * be reclaimed. Why we do that, instead of just incrementing @@ -956,9 +993,9 @@ RedisModuleString *moduleAssertUnsharedString(RedisModuleString *str) { return str; } -/* Append the specified buffere to the string 'str'. The string must be a +/* Append the specified buffer to the string 'str'. The string must be a * string created by the user that is referenced only a single time, otherwise - * REDISMODULE_ERR is returend and the operation is not performed. */ + * REDISMODULE_ERR is returned and the operation is not performed. */ int RM_StringAppendBuffer(RedisModuleCtx *ctx, RedisModuleString *str, const char *buf, size_t len) { UNUSED(ctx); str = moduleAssertUnsharedString(str); @@ -1087,10 +1124,10 @@ int RM_ReplyWithArray(RedisModuleCtx *ctx, long len) { ctx->postponed_arrays = zrealloc(ctx->postponed_arrays,sizeof(void*)* (ctx->postponed_arrays_count+1)); ctx->postponed_arrays[ctx->postponed_arrays_count] = - addDeferredMultiBulkLength(c); + addReplyDeferredLen(c); ctx->postponed_arrays_count++; } else { - addReplyMultiBulkLen(c,len); + addReplyArrayLen(c,len); } return REDISMODULE_OK; } @@ -1118,7 +1155,7 @@ int RM_ReplyWithArray(RedisModuleCtx *ctx, long len) { * * Note that in the above example there is no reason to postpone the array * length, since we produce a fixed number of elements, but in the practice - * the code may use an interator or other ways of creating the output so + * the code may use an iterator or other ways of creating the output so * that is not easy to calculate in advance the number of elements. 
*/ void RM_ReplySetArrayLength(RedisModuleCtx *ctx, long len) { @@ -1133,7 +1170,7 @@ void RM_ReplySetArrayLength(RedisModuleCtx *ctx, long len) { return; } ctx->postponed_arrays_count--; - setDeferredMultiBulkLength(c, + setDeferredArrayLen(c, ctx->postponed_arrays[ctx->postponed_arrays_count], len); if (ctx->postponed_arrays_count == 0) { @@ -1169,7 +1206,7 @@ int RM_ReplyWithString(RedisModuleCtx *ctx, RedisModuleString *str) { int RM_ReplyWithNull(RedisModuleCtx *ctx) { client *c = moduleGetReplyClient(ctx); if (c == NULL) return REDISMODULE_OK; - addReply(c,shared.nullbulk); + addReplyNull(c); return REDISMODULE_OK; } @@ -1410,7 +1447,7 @@ int RM_SelectDb(RedisModuleCtx *ctx, int newid) { * to call other APIs with the key handle as argument to perform * operations on the key. * - * The return value is the handle repesenting the key, that must be + * The return value is the handle representing the key, that must be * closed with RM_CloseKey(). * * If the key does not exist and WRITE mode is requested, the handle @@ -1664,7 +1701,7 @@ int RM_StringTruncate(RedisModuleKey *key, size_t newlen) { * Key API for List type * -------------------------------------------------------------------------- */ -/* Push an element into a list, on head or tail depending on 'where' argumnet. +/* Push an element into a list, on head or tail depending on 'where' argument. * If the key pointer is about an empty key opened for writing, the key * is created. On error (key opened for read-only operations or of the wrong * type) REDISMODULE_ERR is returned, otherwise REDISMODULE_OK is returned. */ @@ -1769,7 +1806,7 @@ int RM_ZsetAdd(RedisModuleKey *key, double score, RedisModuleString *ele, int *f * The input and output flags, and the return value, have the same exact * meaning, with the only difference that this function will return * REDISMODULE_ERR even when 'score' is a valid double number, but adding it - * to the existing score resuts into a NaN (not a number) condition. 
+ * to the existing score results into a NaN (not a number) condition. * * This function has an additional field 'newscore', if not NULL is filled * with the new score of the element after the increment, if no error @@ -2150,7 +2187,9 @@ int RM_ZsetRangePrev(RedisModuleKey *key) { * * The function is variadic and the user must specify pairs of field * names and values, both as RedisModuleString pointers (unless the - * CFIELD option is set, see later). + * CFIELD option is set, see later). At the end of the field/value-ptr pairs, + * NULL must be specified as last argument to signal the end of the arguments + * in the variadic function. * * Example to set the hash argv[1] to the value argv[2]: * @@ -2658,6 +2697,7 @@ RedisModuleCallReply *RM_Call(RedisModuleCtx *ctx, const char *cmdname, const ch /* Create the client and dispatch the command. */ va_start(ap, fmt); c = createClient(-1); + c->user = NULL; /* Root user. */ argv = moduleCreateArgvFromUserFormat(cmdname,fmt,&argc,&flags,ap); replicate = flags & REDISMODULE_ARGV_REPLICATE; va_end(ap); @@ -2987,7 +3027,7 @@ int RM_ModuleTypeSetValue(RedisModuleKey *key, moduleType *mt, void *value) { } /* Assuming RedisModule_KeyType() returned REDISMODULE_KEYTYPE_MODULE on - * the key, returns the moduel type pointer of the value stored at key. + * the key, returns the module type pointer of the value stored at key. * * If the key is NULL, is not associated with a module type, or is empty, * then NULL is returned instead. */ @@ -3287,7 +3327,7 @@ void RM_DigestAddLongLong(RedisModuleDigest *md, long long ll) { mixDigest(md->o,buf,len); } -/* See the doucmnetation for `RedisModule_DigestAddElement()`. */ +/* See the documentation for `RedisModule_DigestAddElement()`. 
*/ void RM_DigestEndSequence(RedisModuleDigest *md) { xorDigest(md->x,md->o,sizeof(md->o)); memset(md->o,0,sizeof(md->o)); @@ -3484,7 +3524,7 @@ void unblockClientFromModule(client *c) { * reply_timeout: called when the timeout is reached in order to send an * error to the client. * - * free_privdata: called in order to free the privata data that is passed + * free_privdata: called in order to free the private data that is passed * by RedisModule_UnblockClient() call. */ RedisModuleBlockedClient *RM_BlockClient(RedisModuleCtx *ctx, RedisModuleCmdFunc reply_callback, RedisModuleCmdFunc timeout_callback, void (*free_privdata)(RedisModuleCtx*,void*), long long timeout_ms) { @@ -3631,8 +3671,8 @@ void moduleHandleBlockedClients(void) { * free the temporary client we just used for the replies. */ if (c) { if (bc->reply_client->bufpos) - addReplyString(c,bc->reply_client->buf, - bc->reply_client->bufpos); + addReplyProto(c,bc->reply_client->buf, + bc->reply_client->bufpos); if (listLength(bc->reply_client->reply)) listJoin(c->reply,bc->reply_client->reply); c->reply_bytes += bc->reply_client->reply_bytes; @@ -3681,7 +3721,7 @@ void moduleBlockedClientTimedOut(client *c) { bc->timeout_callback(&ctx,(void**)c->argv,c->argc); moduleFreeContext(&ctx); /* For timeout events, we do not want to call the disconnect callback, - * because the blocekd client will be automatically disconnected in + * because the blocked client will be automatically disconnected in * this case, and the user can still hook using the timeout callback. 
*/ bc->disconnect_callback = NULL; } @@ -3698,7 +3738,7 @@ int RM_IsBlockedTimeoutRequest(RedisModuleCtx *ctx) { return (ctx->flags & REDISMODULE_CTX_BLOCKED_TIMEOUT) != 0; } -/* Get the privata data set by RedisModule_UnblockClient() */ +/* Get the private data set by RedisModule_UnblockClient() */ void *RM_GetBlockedClientPrivateData(RedisModuleCtx *ctx) { return ctx->blocked_privdata; } @@ -3793,11 +3833,11 @@ void moduleReleaseGIL(void) { * -------------------------------------------------------------------------- */ /* Subscribe to keyspace notifications. This is a low-level version of the - * keyspace-notifications API. A module cand register callbacks to be notified + * keyspace-notifications API. A module can register callbacks to be notified * when keyspce events occur. * * Notification events are filtered by their type (string events, set events, - * etc), and the subsriber callback receives only events that match a specific + * etc), and the subscriber callback receives only events that match a specific * mask of event types. * * When subscribing to notifications with RedisModule_SubscribeToKeyspaceEvents @@ -3832,7 +3872,7 @@ void moduleReleaseGIL(void) { * used to send anything to the client, and has the db number where the event * occurred as its selected db number. * - * Notice that it is not necessary to enable norifications in redis.conf for + * Notice that it is not necessary to enable notifications in redis.conf for * module notifications to work. * * Warning: the notification callbacks are performed in a synchronous manner, @@ -3873,10 +3913,10 @@ void moduleNotifyKeyspaceEvent(int type, const char *event, robj *key, int dbid) if ((sub->event_mask & type) && sub->active == 0) { RedisModuleCtx ctx = REDISMODULE_CTX_INIT; ctx.module = sub->module; - ctx.client = moduleKeyspaceSubscribersClient; + ctx.client = moduleFreeContextReusedClient; selectDb(ctx.client, dbid); - /* mark the handler as activer to avoid reentrant loops. 
+ /* mark the handler as active to avoid reentrant loops. * If the subscriber performs an action triggering itself, * it will not be notified about it. */ sub->active = 1; @@ -3936,6 +3976,8 @@ void moduleCallClusterReceivers(const char *sender_id, uint64_t module_id, uint8 if (r->module_id == module_id) { RedisModuleCtx ctx = REDISMODULE_CTX_INIT; ctx.module = r->module; + ctx.client = moduleFreeContextReusedClient; + selectDb(ctx.client, 0); r->callback(&ctx,sender_id,type,payload,len); moduleFreeContext(&ctx); return; @@ -4084,7 +4126,7 @@ size_t RM_GetClusterSize(void) { * * * REDISMODULE_NODE_MYSELF This node * * REDISMODULE_NODE_MASTER The node is a master - * * REDISMODULE_NODE_SLAVE The ndoe is a slave + * * REDISMODULE_NODE_SLAVE The node is a replica * * REDISMODULE_NODE_PFAIL We see the node as failing * * REDISMODULE_NODE_FAIL The cluster agrees the node is failing * * REDISMODULE_NODE_NOFAILOVER The slave is configured to never failover @@ -4126,6 +4168,32 @@ int RM_GetClusterNodeInfo(RedisModuleCtx *ctx, const char *id, char *ip, char *m return REDISMODULE_OK; } +/* Set Redis Cluster flags in order to change the normal behavior of + * Redis Cluster, especially with the goal of disabling certain functions. + * This is useful for modules that use the Cluster API in order to create + * a different distributed system, but still want to use the Redis Cluster + * message bus. Flags that can be set: + * + * REDISMODULE_CLUSTER_FLAG_NO_FAILOVER + * REDISMODULE_CLUSTER_FLAG_NO_REDIRECTION + * + * With the following effects: + * + * NO_FAILOVER: prevent Redis Cluster slaves from failing over a failing master. + * Also disables the replica migration feature. + * + * NO_REDIRECTION: Every node will accept any key, without trying to perform + * partitioning according to the user Redis Cluster algorithm. + * Slot information will still be propagated across the + * cluster, but without effects.
*/ +void RM_SetClusterFlags(RedisModuleCtx *ctx, uint64_t flags) { + UNUSED(ctx); + if (flags & REDISMODULE_CLUSTER_FLAG_NO_FAILOVER) + server.cluster_module_flags |= CLUSTER_MODULE_FLAG_NO_FAILOVER; + if (flags & REDISMODULE_CLUSTER_FLAG_NO_REDIRECTION) + server.cluster_module_flags |= CLUSTER_MODULE_FLAG_NO_REDIRECTION; +} + /* -------------------------------------------------------------------------- * Modules Timers API * @@ -4155,6 +4223,7 @@ typedef struct RedisModuleTimer { RedisModule *module; /* Module reference. */ RedisModuleTimerProc callback; /* The callback to invoke on expire. */ void *data; /* Private data for the callback. */ + int dbid; /* Database number selected by the original client. */ } RedisModuleTimer; /* This is the timer handler that is called by the main event loop. We schedule @@ -4180,6 +4249,8 @@ int moduleTimerHandler(struct aeEventLoop *eventLoop, long long id, void *client RedisModuleCtx ctx = REDISMODULE_CTX_INIT; ctx.module = timer->module; + ctx.client = moduleFreeContextReusedClient; + selectDb(ctx.client, timer->dbid); timer->callback(&ctx,timer->data); moduleFreeContext(&ctx); raxRemove(Timers,(unsigned char*)ri.key,ri.key_len,NULL); @@ -4204,6 +4275,7 @@ RedisModuleTimerID RM_CreateTimer(RedisModuleCtx *ctx, mstime_t period, RedisMod timer->module = ctx->module; timer->callback = callback; timer->data = data; + timer->dbid = ctx->client->db->id; uint64_t expiretime = ustime()+period*1000; uint64_t key; @@ -4243,7 +4315,7 @@ RedisModuleTimerID RM_CreateTimer(RedisModuleCtx *ctx, mstime_t period, RedisMod } /* Stop a timer, returns REDISMODULE_OK if the timer was found, belonged to the - * calling module, and was stoped, otherwise REDISMODULE_ERR is returned. + * calling module, and was stopped, otherwise REDISMODULE_ERR is returned. * If not NULL, the data pointer is set to the value of the data argument when * the timer was created. 
*/ int RM_StopTimer(RedisModuleCtx *ctx, RedisModuleTimerID id, void **data) { @@ -4260,7 +4332,7 @@ int RM_StopTimer(RedisModuleCtx *ctx, RedisModuleTimerID id, void **data) { * (in milliseconds), and the private data pointer associated with the timer. * If the timer specified does not exist or belongs to a different module * no information is returned and the function returns REDISMODULE_ERR, otherwise - * REDISMODULE_OK is returned. The argumnets remaining or data can be NULL if + * REDISMODULE_OK is returned. The arguments remaining or data can be NULL if * the caller does not need certain information. */ int RM_GetTimerInfo(RedisModuleCtx *ctx, RedisModuleTimerID id, uint64_t *remaining, void **data) { RedisModuleTimer *timer = raxFind(Timers,(unsigned char*)&id,sizeof(id)); @@ -4275,6 +4347,257 @@ int RM_GetTimerInfo(RedisModuleCtx *ctx, RedisModuleTimerID id, uint64_t *remain return REDISMODULE_OK; } +/* -------------------------------------------------------------------------- + * Modules Dictionary API + * + * Implements a sorted dictionary (actually backed by a radix tree) with + * the usual get / set / del / num-items API, together with an iterator + * capable of going back and forth. + * -------------------------------------------------------------------------- */ + +/* Create a new dictionary. The 'ctx' pointer can be the current module context + * or NULL, depending on what you want. Please follow the following rules: + * + * 1. Use a NULL context if you plan to retain a reference to this dictionary + * that will survive the time of the module callback where you created it. + * 2. Use a NULL context if no context is available at the time you are creating + * the dictionary (of course...). + * 3. However use the current callback context as 'ctx' argument if the + * dictionary time to live is just limited to the callback scope. 
In this + * case, if enabled, you can enjoy the automatic memory management that will + * reclaim the dictionary memory, as well as the strings returned by the + * Next / Prev dictionary iterator calls. + */ +RedisModuleDict *RM_CreateDict(RedisModuleCtx *ctx) { + struct RedisModuleDict *d = zmalloc(sizeof(*d)); + d->rax = raxNew(); + if (ctx != NULL) autoMemoryAdd(ctx,REDISMODULE_AM_DICT,d); + return d; +} + +/* Free a dictionary created with RM_CreateDict(). You need to pass the + * context pointer 'ctx' only if the dictionary was created using the + * context instead of passing NULL. */ +void RM_FreeDict(RedisModuleCtx *ctx, RedisModuleDict *d) { + if (ctx != NULL) autoMemoryFreed(ctx,REDISMODULE_AM_DICT,d); + raxFree(d->rax); + zfree(d); +} + +/* Return the size of the dictionary (number of keys). */ +uint64_t RM_DictSize(RedisModuleDict *d) { + return raxSize(d->rax); +} + +/* Store the specified key into the dictionary, setting its value to the + * pointer 'ptr'. If the key was added with success, since it did not + * already exist, REDISMODULE_OK is returned. Otherwise if the key already + * exists the function returns REDISMODULE_ERR. */ +int RM_DictSetC(RedisModuleDict *d, void *key, size_t keylen, void *ptr) { + int retval = raxTryInsert(d->rax,key,keylen,ptr,NULL); + return (retval == 1) ? REDISMODULE_OK : REDISMODULE_ERR; +} + +/* Like RedisModule_DictSetC() but will replace the key with the new + * value if the key already exists. */ +int RM_DictReplaceC(RedisModuleDict *d, void *key, size_t keylen, void *ptr) { + int retval = raxInsert(d->rax,key,keylen,ptr,NULL); + return (retval == 1) ? REDISMODULE_OK : REDISMODULE_ERR; +} + +/* Like RedisModule_DictSetC() but takes the key as a RedisModuleString. */ +int RM_DictSet(RedisModuleDict *d, RedisModuleString *key, void *ptr) { + return RM_DictSetC(d,key->ptr,sdslen(key->ptr),ptr); +} + +/* Like RedisModule_DictReplaceC() but takes the key as a RedisModuleString. 
*/ +int RM_DictReplace(RedisModuleDict *d, RedisModuleString *key, void *ptr) { + return RM_DictReplaceC(d,key->ptr,sdslen(key->ptr),ptr); +} + +/* Return the value stored at the specified key. The function returns NULL + * both in the case the key does not exist, or if you actually stored + * NULL at key. So, optionally, if the 'nokey' pointer is not NULL, it will + * be set by reference to 1 if the key does not exist, or to 0 if the key + * exists. */ +void *RM_DictGetC(RedisModuleDict *d, void *key, size_t keylen, int *nokey) { + void *res = raxFind(d->rax,key,keylen); + if (nokey) *nokey = (res == raxNotFound); + return (res == raxNotFound) ? NULL : res; +} + +/* Like RedisModule_DictGetC() but takes the key as a RedisModuleString. */ +void *RM_DictGet(RedisModuleDict *d, RedisModuleString *key, int *nokey) { + return RM_DictGetC(d,key->ptr,sdslen(key->ptr),nokey); +} + +/* Remove the specified key from the dictionary, returning REDISMODULE_OK if + * the key was found and deleted, or REDISMODULE_ERR if instead there was + * no such key in the dictionary. When the operation is successful, if + * 'oldval' is not NULL, then '*oldval' is set to the value stored at the + * key before it was deleted. Using this feature it is possible to get + * a pointer to the value (for instance in order to release it), without + * having to call RedisModule_DictGet() before deleting the key. */ +int RM_DictDelC(RedisModuleDict *d, void *key, size_t keylen, void *oldval) { + int retval = raxRemove(d->rax,key,keylen,oldval); + return retval ? REDISMODULE_OK : REDISMODULE_ERR; +} + +/* Like RedisModule_DictDelC() but gets the key as a RedisModuleString.
*/ +int RM_DictDel(RedisModuleDict *d, RedisModuleString *key, void *oldval) { + return RM_DictDelC(d,key->ptr,sdslen(key->ptr),oldval); +} + +/* Return an iterator, set up in order to start iterating from the specified + * key by applying the operator 'op', which is just a string specifying the + * comparison operator to use in order to seek the first element. The + * operators available are: + * + * "^" -- Seek the first (lexicographically smaller) key. + * "$" -- Seek the last (lexicographically bigger) key. + * ">" -- Seek the first element greater than the specified key. + * ">=" -- Seek the first element greater than or equal to the specified key. + * "<" -- Seek the first element smaller than the specified key. + * "<=" -- Seek the first element smaller than or equal to the specified key. + * "==" -- Seek the first element matching exactly the specified key. + * + * Note that for "^" and "$" the passed key is not used, and the user may + * just pass NULL with a length of 0. + * + * If the element to start the iteration cannot be seeked based on the + * key and operator passed, RedisModule_DictNext() / Prev() will just return + * NULL at the first call, otherwise they'll produce elements. + */ +RedisModuleDictIter *RM_DictIteratorStartC(RedisModuleDict *d, const char *op, void *key, size_t keylen) { + RedisModuleDictIter *di = zmalloc(sizeof(*di)); + di->dict = d; + raxStart(&di->ri,d->rax); + raxSeek(&di->ri,op,key,keylen); + return di; +} + +/* Exactly like RedisModule_DictIteratorStartC, but the key is passed as a + * RedisModuleString. */ +RedisModuleDictIter *RM_DictIteratorStart(RedisModuleDict *d, const char *op, RedisModuleString *key) { + return RM_DictIteratorStartC(d,op,key->ptr,sdslen(key->ptr)); +} + +/* Release the iterator created with RedisModule_DictIteratorStart(). This call + * is mandatory otherwise a memory leak is introduced in the module.
*/ +void RM_DictIteratorStop(RedisModuleDictIter *di) { + raxStop(&di->ri); + zfree(di); +} + +/* After its creation with RedisModule_DictIteratorStart(), it is possible to + * change the currently selected element of the iterator by using this + * API call. The result based on the operator and key is exactly like + * the function RedisModule_DictIteratorStart(), however in this case the + * return value is just REDISMODULE_OK in case the seeked element was found, + * or REDISMODULE_ERR in case it was not possible to seek the specified + * element. It is possible to reseek an iterator as many times as you want. */ +int RM_DictIteratorReseekC(RedisModuleDictIter *di, const char *op, void *key, size_t keylen) { + return raxSeek(&di->ri,op,key,keylen); +} + +/* Like RedisModule_DictIteratorReseekC() but takes the key as a + * RedisModuleString. */ +int RM_DictIteratorReseek(RedisModuleDictIter *di, const char *op, RedisModuleString *key) { + return RM_DictIteratorReseekC(di,op,key->ptr,sdslen(key->ptr)); +} + +/* Return the current item of the dictionary iterator 'di' and step to the + * next element. If the iterator already yielded the last element and there + * are no other elements to return, NULL is returned, otherwise a pointer + * to a string representing the key is provided, and the '*keylen' length + * is set by reference (if keylen is not NULL). The '*dataptr', if not NULL + * is set to the value of the pointer stored at the returned key as auxiliary + * data (as set by the RedisModule_DictSet API). + * + * Usage example: + * + * ... create the iterator here ... + * char *key; + * void *data; + * while((key = RedisModule_DictNextC(iter,&keylen,&data)) != NULL) { + * printf("%.*s %p\n", (int)keylen, key, data); + * } + * + * The returned pointer is of type void because sometimes it makes sense + * to cast it to a char* and sometimes to an unsigned char*, depending on + * whether it contains binary data, so this API ends up being more + * comfortable to use.
+ * + * The returned pointer is valid until the next call to the + * next/prev iterator step. Also the pointer is no longer valid once the + * iterator is released. */ +void *RM_DictNextC(RedisModuleDictIter *di, size_t *keylen, void **dataptr) { + if (!raxNext(&di->ri)) return NULL; + if (keylen) *keylen = di->ri.key_len; + if (dataptr) *dataptr = di->ri.data; + return di->ri.key; +} + +/* This function is exactly like RedisModule_DictNextC() but after returning + * the currently selected element in the iterator, it selects the previous + * element (lexicographically smaller) instead of the next one. */ +void *RM_DictPrevC(RedisModuleDictIter *di, size_t *keylen, void **dataptr) { + if (!raxPrev(&di->ri)) return NULL; + if (keylen) *keylen = di->ri.key_len; + if (dataptr) *dataptr = di->ri.data; + return di->ri.key; +} + +/* Like RedisModule_DictNextC(), but instead of returning an internally allocated + * buffer and key length, it returns directly a module string object allocated + * in the specified context 'ctx' (that may be NULL exactly like for the main + * API RedisModule_CreateString). + * + * The returned string object should be deallocated after use, either manually + * or by using a context that has automatic memory management active. */ +RedisModuleString *RM_DictNext(RedisModuleCtx *ctx, RedisModuleDictIter *di, void **dataptr) { + size_t keylen; + void *key = RM_DictNextC(di,&keylen,dataptr); + if (key == NULL) return NULL; + return RM_CreateString(ctx,key,keylen); +} + +/* Like RedisModule_DictNext() but after returning the currently selected + * element in the iterator, it selects the previous element (lexicographically + * smaller) instead of the next one.
*/ +RedisModuleString *RM_DictPrev(RedisModuleCtx *ctx, RedisModuleDictIter *di, void **dataptr) { + size_t keylen; + void *key = RM_DictPrevC(di,&keylen,dataptr); + if (key == NULL) return NULL; + return RM_CreateString(ctx,key,keylen); +} + +/* Compare the element currently pointed by the iterator to the specified + * element given by key/keylen, according to the operator 'op' (the set of + * valid operators is the same as for RedisModule_DictIteratorStart). + * If the comparison is successful the function returns REDISMODULE_OK + * otherwise REDISMODULE_ERR is returned. + * + * This is useful when we want to just emit a lexicographical range, so + * in the loop, as we iterate elements, we can also check if we are still + * in range. + * + * The function returns REDISMODULE_ERR if the iterator reached the + * end of elements condition as well. */ +int RM_DictCompareC(RedisModuleDictIter *di, const char *op, void *key, size_t keylen) { + if (raxEOF(&di->ri)) return REDISMODULE_ERR; + int res = raxCompare(&di->ri,op,key,keylen); + return res ? REDISMODULE_OK : REDISMODULE_ERR; +} + +/* Like RedisModule_DictCompareC but gets the key to compare with the current + * iterator key as a RedisModuleString. */ +int RM_DictCompare(RedisModuleDictIter *di, const char *op, RedisModuleString *key) { + if (raxEOF(&di->ri)) return REDISMODULE_ERR; + int res = raxCompare(&di->ri,op,key->ptr,sdslen(key->ptr)); + return res ?
REDISMODULE_OK : REDISMODULE_ERR; +} + /* -------------------------------------------------------------------------- * Modules utility APIs * -------------------------------------------------------------------------- */ @@ -4336,8 +4659,9 @@ void moduleInitModulesSystem(void) { /* Set up the keyspace notification susbscriber list and static client */ moduleKeyspaceSubscribers = listCreate(); - moduleKeyspaceSubscribersClient = createClient(-1); - moduleKeyspaceSubscribersClient->flags |= CLIENT_MODULE; + moduleFreeContextReusedClient = createClient(-1); + moduleFreeContextReusedClient->flags |= CLIENT_MODULE; + moduleFreeContextReusedClient->user = NULL; /* root user. */ moduleRegisterCoreAPI(); if (pipe(server.module_blocked_pipe) == -1) { @@ -4475,7 +4799,7 @@ int moduleUnload(sds name) { moduleUnregisterCommands(module); - /* Remvoe any noification subscribers this module might have */ + /* Remove any notification subscribers this module might have */ moduleUnsubscribeNotifications(module); /* Unregister all the hooks. TODO: Yet no hooks support here. */ @@ -4497,6 +4821,25 @@ int moduleUnload(sds name) { return REDISMODULE_OK; } +/* Helper function for the MODULE and HELLO command: send the list of the + * loaded modules to the client. */ +void addReplyLoadedModules(client *c) { + dictIterator *di = dictGetIterator(modules); + dictEntry *de; + + addReplyArrayLen(c,dictSize(modules)); + while ((de = dictNext(di)) != NULL) { + sds name = dictGetKey(de); + struct RedisModule *module = dictGetVal(de); + addReplyMapLen(c,2); + addReplyBulkCString(c,"name"); + addReplyBulkCBuffer(c,name,sdslen(name)); + addReplyBulkCString(c,"ver"); + addReplyLongLong(c,module->ver); + } + dictReleaseIterator(di); +} + /* Redis MODULE command. * * MODULE LOAD [args...] 
*/ @@ -4544,20 +4887,7 @@ NULL addReplyErrorFormat(c,"Error unloading module: %s",errmsg); } } else if (!strcasecmp(subcmd,"list") && c->argc == 2) { - dictIterator *di = dictGetIterator(modules); - dictEntry *de; - - addReplyMultiBulkLen(c,dictSize(modules)); - while ((de = dictNext(di)) != NULL) { - sds name = dictGetKey(de); - struct RedisModule *module = dictGetVal(de); - addReplyMultiBulkLen(c,4); - addReplyBulkCString(c,"name"); - addReplyBulkCBuffer(c,name,sdslen(name)); - addReplyBulkCString(c,"ver"); - addReplyLongLong(c,module->ver); - } - dictReleaseIterator(di); + addReplyLoadedModules(c); } else { addReplySubcommandSyntaxError(c); return; @@ -4700,4 +5030,27 @@ void moduleRegisterCoreAPI(void) { REGISTER_API(BlockedClientDisconnected); REGISTER_API(SetDisconnectCallback); REGISTER_API(GetBlockedClientHandle); + REGISTER_API(SetClusterFlags); + REGISTER_API(CreateDict); + REGISTER_API(FreeDict); + REGISTER_API(DictSize); + REGISTER_API(DictSetC); + REGISTER_API(DictReplaceC); + REGISTER_API(DictSet); + REGISTER_API(DictReplace); + REGISTER_API(DictGetC); + REGISTER_API(DictGet); + REGISTER_API(DictDelC); + REGISTER_API(DictDel); + REGISTER_API(DictIteratorStartC); + REGISTER_API(DictIteratorStart); + REGISTER_API(DictIteratorStop); + REGISTER_API(DictIteratorReseekC); + REGISTER_API(DictIteratorReseek); + REGISTER_API(DictNextC); + REGISTER_API(DictPrevC); + REGISTER_API(DictNext); + REGISTER_API(DictPrev); + REGISTER_API(DictCompareC); + REGISTER_API(DictCompare); } diff --git a/src/modules/Makefile b/src/modules/Makefile index cffe68994..51ffac17d 100644 --- a/src/modules/Makefile +++ b/src/modules/Makefile @@ -13,7 +13,7 @@ endif .SUFFIXES: .c .so .xo .o -all: helloworld.so hellotype.so helloblock.so testmodule.so hellocluster.so hellotimer.so +all: helloworld.so hellotype.so helloblock.so testmodule.so hellocluster.so hellotimer.so hellodict.so .c.xo: $(CC) -I. 
$(CFLAGS) $(SHOBJ_CFLAGS) -fPIC -c $< -o $@ @@ -43,6 +43,11 @@ hellotimer.xo: ../redismodule.h hellotimer.so: hellotimer.xo $(LD) -o $@ $< $(SHOBJ_LDFLAGS) $(LIBS) -lc +hellodict.xo: ../redismodule.h + +hellodict.so: hellodict.xo + $(LD) -o $@ $< $(SHOBJ_LDFLAGS) $(LIBS) -lc + testmodule.xo: ../redismodule.h testmodule.so: testmodule.xo diff --git a/src/modules/helloblock.c b/src/modules/helloblock.c index 6bba17d33..b90ccaa50 100644 --- a/src/modules/helloblock.c +++ b/src/modules/helloblock.c @@ -77,7 +77,7 @@ void *HelloBlock_ThreadMain(void *arg) { /* An example blocked client disconnection callback. * * Note that in the case of the HELLO.BLOCK command, the blocked client is now - * owned by the thread calling sleep(). In this speciifc case, there is not + * owned by the thread calling sleep(). In this specific case, there is not * much we can do, however normally we could instead implement a way to * signal the thread that the client disconnected, and sleep the specified * amount of seconds with a while loop calling sleep(1), so that once we diff --git a/src/modules/hellocluster.c b/src/modules/hellocluster.c index 75d18f3e2..cb78187f9 100644 --- a/src/modules/hellocluster.c +++ b/src/modules/hellocluster.c @@ -69,7 +69,7 @@ int ListCommand_RedisCommand(RedisModuleCtx *ctx, RedisModuleString **argv, int RedisModule_ReplyWithLongLong(ctx,port); } RedisModule_FreeClusterNodesList(ids); - return RedisModule_ReplyWithSimpleString(ctx, "OK"); + return REDISMODULE_OK; } /* Callback for message MSGTYPE_PING */ @@ -77,6 +77,7 @@ void PingReceiver(RedisModuleCtx *ctx, const char *sender_id, uint8_t type, cons RedisModule_Log(ctx,"notice","PING (type %d) RECEIVED from %.*s: '%.*s'", type,REDISMODULE_NODE_ID_LEN,sender_id,(int)len, payload); RedisModule_SendClusterMessage(ctx,NULL,MSGTYPE_PONG,(unsigned char*)"Ohi!",4); + RedisModule_Call(ctx, "INCR", "c", "pings_received"); } /* Callback for message MSGTYPE_PONG. 
*/ @@ -102,6 +103,15 @@ int RedisModule_OnLoad(RedisModuleCtx *ctx, RedisModuleString **argv, int argc) ListCommand_RedisCommand,"readonly",0,0,0) == REDISMODULE_ERR) return REDISMODULE_ERR; + /* Disable Redis Cluster sharding and redirections. This way every node + * will be able to access every possible key, regardless of the hash slot. + * This way the PING message handler will be able to increment a specific + * variable. Normally you do that in order for the distributed system + * you create as a module to have total freedom in the keyspace + * manipulation. */ + RedisModule_SetClusterFlags(ctx,REDISMODULE_CLUSTER_FLAG_NO_REDIRECTION); + + /* Register our handlers for different message types. */ RedisModule_RegisterClusterMessageReceiver(ctx,MSGTYPE_PING,PingReceiver); RedisModule_RegisterClusterMessageReceiver(ctx,MSGTYPE_PONG,PongReceiver); return REDISMODULE_OK; diff --git a/src/modules/hellodict.c b/src/modules/hellodict.c new file mode 100644 index 000000000..651615b03 --- /dev/null +++ b/src/modules/hellodict.c @@ -0,0 +1,132 @@ +/* Hellodict -- An example of modules dictionary API + * + * This module implements a volatile key-value store on top of the + * dictionary exported by the Redis modules API. + * + * ----------------------------------------------------------------------------- + * + * Copyright (c) 2018, Salvatore Sanfilippo + * All rights reserved. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions are met: + * + * * Redistributions of source code must retain the above copyright notice, + * this list of conditions and the following disclaimer. + * * Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. 
+ * * Neither the name of Redis nor the names of its contributors may be used + * to endorse or promote products derived from this software without + * specific prior written permission. + * + * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" + * AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE + * ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE + * LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR + * CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF + * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS + * INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN + * CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) + * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE + * POSSIBILITY OF SUCH DAMAGE. + */ + +#define REDISMODULE_EXPERIMENTAL_API +#include "../redismodule.h" +#include +#include +#include +#include + +static RedisModuleDict *Keyspace; + +/* HELLODICT.SET + * + * Set the specified key to the specified value. */ +int cmd_SET(RedisModuleCtx *ctx, RedisModuleString **argv, int argc) { + if (argc != 3) return RedisModule_WrongArity(ctx); + RedisModule_DictSet(Keyspace,argv[1],argv[2]); + /* We need to keep a reference to the value stored at the key, otherwise + * it would be freed when this callback returns. */ + RedisModule_RetainString(NULL,argv[2]); + return RedisModule_ReplyWithSimpleString(ctx, "OK"); +} + +/* HELLODICT.GET + * + * Return the value of the specified key, or a null reply if the key + * is not defined. 
*/ +int cmd_GET(RedisModuleCtx *ctx, RedisModuleString **argv, int argc) { + if (argc != 2) return RedisModule_WrongArity(ctx); + RedisModuleString *val = RedisModule_DictGet(Keyspace,argv[1],NULL); + if (val == NULL) { + return RedisModule_ReplyWithNull(ctx); + } else { + return RedisModule_ReplyWithString(ctx, val); + } +} + +/* HELLODICT.KEYRANGE + * + * Return a list of matching keys, lexicographically between startkey + * and endkey. No more than 'count' items are emitted. */ +int cmd_KEYRANGE(RedisModuleCtx *ctx, RedisModuleString **argv, int argc) { + if (argc != 4) return RedisModule_WrongArity(ctx); + + /* Parse the count argument. */ + long long count; + if (RedisModule_StringToLongLong(argv[3],&count) != REDISMODULE_OK) { + return RedisModule_ReplyWithError(ctx,"ERR invalid count"); + } + + /* Seek the iterator. */ + RedisModuleDictIter *iter = RedisModule_DictIteratorStart( + Keyspace, ">=", argv[1]); + + /* Reply with the matching items. */ + char *key; + size_t keylen; + long long replylen = 0; /* Keep track of the emitted array len. */ + RedisModule_ReplyWithArray(ctx,REDISMODULE_POSTPONED_ARRAY_LEN); + while((key = RedisModule_DictNextC(iter,&keylen,NULL)) != NULL) { + if (replylen >= count) break; + if (RedisModule_DictCompare(iter,"<=",argv[2]) == REDISMODULE_ERR) + break; + RedisModule_ReplyWithStringBuffer(ctx,key,keylen); + replylen++; + } + RedisModule_ReplySetArrayLength(ctx,replylen); + + /* Cleanup. */ + RedisModule_DictIteratorStop(iter); + return REDISMODULE_OK; +} + +/* This function must be present on each Redis module. It is used in order to + * register the commands into the Redis server. 
*/ +int RedisModule_OnLoad(RedisModuleCtx *ctx, RedisModuleString **argv, int argc) { + REDISMODULE_NOT_USED(argv); + REDISMODULE_NOT_USED(argc); + + if (RedisModule_Init(ctx,"hellodict",1,REDISMODULE_APIVER_1) + == REDISMODULE_ERR) return REDISMODULE_ERR; + + if (RedisModule_CreateCommand(ctx,"hellodict.set", + cmd_SET,"write deny-oom",1,1,0) == REDISMODULE_ERR) + return REDISMODULE_ERR; + + if (RedisModule_CreateCommand(ctx,"hellodict.get", + cmd_GET,"readonly",1,1,0) == REDISMODULE_ERR) + return REDISMODULE_ERR; + + if (RedisModule_CreateCommand(ctx,"hellodict.keyrange", + cmd_KEYRANGE,"readonly",1,1,0) == REDISMODULE_ERR) + return REDISMODULE_ERR; + + /* Create our global dictionary. Here we'll set our keys and values. */ + Keyspace = RedisModule_CreateDict(NULL); + + return REDISMODULE_OK; +} diff --git a/src/modules/hellotimer.c b/src/modules/hellotimer.c index 6c3e1d7f4..57b111b7c 100644 --- a/src/modules/hellotimer.c +++ b/src/modules/hellotimer.c @@ -1,4 +1,4 @@ -/* Helloworld cluster -- A ping/pong cluster API example. +/* Timer API example -- Register and handle timer events * * ----------------------------------------------------------------------------- * @@ -37,9 +37,6 @@ #include #include -#define MSGTYPE_PING 1 -#define MSGTYPE_PONG 2 - /* Timer callback. */ void timerHandler(RedisModuleCtx *ctx, void *data) { REDISMODULE_NOT_USED(ctx); diff --git a/src/multi.c b/src/multi.c index 112ce0605..71090d8ed 100644 --- a/src/multi.c +++ b/src/multi.c @@ -35,6 +35,7 @@ void initClientMultiState(client *c) { c->mstate.commands = NULL; c->mstate.count = 0; + c->mstate.cmd_flags = 0; } /* Release all the resources associated with MULTI/EXEC state */ @@ -67,6 +68,7 @@ void queueMultiCommand(client *c) { for (j = 0; j < c->argc; j++) incrRefCount(mc->argv[j]); c->mstate.count++; + c->mstate.cmd_flags |= c->cmd->flags; } void discardTransaction(client *c) { @@ -132,7 +134,22 @@ void execCommand(client *c) { * in the second an EXECABORT error is returned. 
*/ if (c->flags & (CLIENT_DIRTY_CAS|CLIENT_DIRTY_EXEC)) { addReply(c, c->flags & CLIENT_DIRTY_EXEC ? shared.execaborterr : - shared.nullmultibulk); + shared.nullarray[c->resp]); + discardTransaction(c); + goto handle_monitor; + } + + /* If there are write commands inside the transaction, and this is a read + * only slave, we want to send an error. This happens when the transaction + * was initiated when the instance was a master or a writable replica and + * then the configuration changed (for example instance was turned into + * a replica). */ + if (!server.loading && server.masterhost && server.repl_slave_ro && + !(c->flags & CLIENT_MASTER) && c->mstate.cmd_flags & CMD_WRITE) + { + addReplyError(c, + "Transaction contains write commands but instance " + "is now a read-only replica. EXEC aborted."); discardTransaction(c); goto handle_monitor; } @@ -142,7 +159,7 @@ void execCommand(client *c) { orig_argv = c->argv; orig_argc = c->argc; orig_cmd = c->cmd; - addReplyMultiBulkLen(c,c->mstate.count); + addReplyArrayLen(c,c->mstate.count); for (j = 0; j < c->mstate.count; j++) { c->argc = c->mstate.commands[j].argc; c->argv = c->mstate.commands[j].argv; @@ -158,7 +175,7 @@ void execCommand(client *c) { must_propagate = 1; } - call(c,CMD_CALL_FULL); + call(c,server.loading ? CMD_CALL_NONE : CMD_CALL_FULL); /* Commands may alter argc/argv, restore mstate. */ c->mstate.commands[j].argc = c->argc; diff --git a/src/networking.c b/src/networking.c index af7422178..7479b72a8 100644 --- a/src/networking.c +++ b/src/networking.c @@ -33,7 +33,7 @@ #include #include -static void setProtocolError(const char *errstr, client *c, long pos); +static void setProtocolError(const char *errstr, client *c); /* Return the size consumed from the allocator, for the specified SDS string, * including internal fragmentation. 
This function is used in order to compute @@ -107,9 +107,11 @@ client *createClient(int fd) { uint64_t client_id; atomicGetIncr(server.next_client_id,client_id,1); c->id = client_id; + c->resp = 2; c->fd = fd; c->name = NULL; c->bufpos = 0; + c->qb_pos = 0; c->querybuf = sdsempty(); c->pending_querybuf = sdsempty(); c->querybuf_peak = 0; @@ -117,12 +119,15 @@ client *createClient(int fd) { c->argc = 0; c->argv = NULL; c->cmd = c->lastcmd = NULL; + c->user = DefaultUser; c->multibulklen = 0; c->bulklen = -1; c->sentlen = 0; c->flags = 0; c->ctime = c->lastinteraction = server.unixtime; - c->authenticated = 0; + /* If the default user does not require authentication, the user is + * directly authenticated. */ + c->authenticated = (c->user->flags & USER_FLAG_NOPASS) != 0; c->replstate = REPL_STATE_NONE; c->repl_put_online_on_ack = 0; c->reploff = 0; @@ -159,6 +164,32 @@ client *createClient(int fd) { return c; } +/* This function puts the client in the queue of clients that should write + * their output buffers to the socket. Note that it does not *yet* install + * the write handler: to start, clients are put in a queue of clients that need + * to write, so we try to do that before returning in the event loop (see the + * handleClientsWithPendingWrites() function). + * If we fail and there is more data to write, compared to what the socket + * buffers can hold, then we'll really install the handler. */ +void clientInstallWriteHandler(client *c) { + /* Schedule the client to write the output buffers to the socket only + * if not already done and, for slaves, if the slave can actually receive + * writes at this stage. */ + if (!(c->flags & CLIENT_PENDING_WRITE) && + (c->replstate == REPL_STATE_NONE || + (c->replstate == SLAVE_STATE_ONLINE && !c->repl_put_online_on_ack))) + { + /* Here instead of installing the write handler, we just flag the + * client and put it into a list of clients that have something + * to write to the socket. 
This way before re-entering the event + * loop, we can try to directly write to the client sockets avoiding + * a system call. We'll only really install the write handler if + * we'll not be able to write the whole reply at once. */ + c->flags |= CLIENT_PENDING_WRITE; + listAddNodeHead(server.clients_pending_write,c); + } +} + /* This function is called every time we are going to transmit new data * to the client. The behavior is the following: * @@ -196,24 +227,9 @@ int prepareClientToWrite(client *c) { if (c->fd <= 0) return C_ERR; /* Fake client for AOF loading. */ - /* Schedule the client to write the output buffers to the socket only - * if not already done (there were no pending writes already and the client - * was yet not flagged), and, for slaves, if the slave can actually - * receive writes at this stage. */ - if (!clientHasPendingReplies(c) && - !(c->flags & CLIENT_PENDING_WRITE) && - (c->replstate == REPL_STATE_NONE || - (c->replstate == SLAVE_STATE_ONLINE && !c->repl_put_online_on_ack))) - { - /* Here instead of installing the write handler, we just flag the - * client and put it into a list of clients that have something - * to write to the socket. This way before re-entering the event - * loop, we can try to directly write to the client sockets avoiding - * a system call. We'll only really install the write handler if - * we'll not be able to write the whole reply at once. */ - c->flags |= CLIENT_PENDING_WRITE; - listAddNodeHead(server.clients_pending_write,c); - } + /* Schedule the client to write the output buffers to the socket, unless + * it should already be setup to do so (it has already pending data). */ + if (!clientHasPendingReplies(c)) clientInstallWriteHandler(c); /* Authorize the caller to queue in the output buffer of this client. 
*/ return C_OK; @@ -240,7 +256,7 @@ int _addReplyToBuffer(client *c, const char *s, size_t len) { return C_OK; } -void _addReplyStringToList(client *c, const char *s, size_t len) { +void _addReplyProtoToList(client *c, const char *s, size_t len) { if (c->flags & CLIENT_CLOSE_AFTER_REPLY) return; listNode *ln = listLast(c->reply); @@ -287,7 +303,7 @@ void addReply(client *c, robj *obj) { if (sdsEncodedObject(obj)) { if (_addReplyToBuffer(c,obj->ptr,sdslen(obj->ptr)) != C_OK) - _addReplyStringToList(c,obj->ptr,sdslen(obj->ptr)); + _addReplyProtoToList(c,obj->ptr,sdslen(obj->ptr)); } else if (obj->encoding == OBJ_ENCODING_INT) { /* For integer encoded strings we just convert it into a string * using our optimized function, and attach the resulting string @@ -295,7 +311,7 @@ void addReply(client *c, robj *obj) { char buf[32]; size_t len = ll2string(buf,sizeof(buf),(long)obj->ptr); if (_addReplyToBuffer(c,buf,len) != C_OK) - _addReplyStringToList(c,buf,len); + _addReplyProtoToList(c,buf,len); } else { serverPanic("Wrong obj->encoding in addReply()"); } @@ -310,7 +326,7 @@ void addReplySds(client *c, sds s) { return; } if (_addReplyToBuffer(c,s,sdslen(s)) != C_OK) - _addReplyStringToList(c,s,sdslen(s)); + _addReplyProtoToList(c,s,sdslen(s)); sdsfree(s); } @@ -320,12 +336,12 @@ void addReplySds(client *c, sds s) { * * It is efficient because does not create an SDS object nor an Redis object * if not needed. The object will only be created by calling - * _addReplyStringToList() if we fail to extend the existing tail object + * _addReplyProtoToList() if we fail to extend the existing tail object * in the list of objects. */ -void addReplyString(client *c, const char *s, size_t len) { +void addReplyProto(client *c, const char *s, size_t len) { if (prepareClientToWrite(c) != C_OK) return; if (_addReplyToBuffer(c,s,len) != C_OK) - _addReplyStringToList(c,s,len); + _addReplyProtoToList(c,s,len); } /* Low level function called by the addReplyError...() functions. 
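The pending-write scheme that `clientInstallWriteHandler()` and `prepareClientToWrite()` implement in the hunks above can be sketched in isolation. This is a minimal toy model, not Redis code: the `toy_client` and `toy_server` types, the `PENDING_WRITE` flag, and all three functions are hypothetical stand-ins, and the flush step simply pretends the whole reply fits into the socket buffer.

```c
#include <assert.h>
#include <stddef.h>

#define PENDING_WRITE (1 << 0) /* hypothetical, mirrors CLIENT_PENDING_WRITE */

typedef struct toy_client {
    int flags;
    size_t unsent;            /* bytes queued but not yet written */
    struct toy_client *next;  /* link in the pending-write list */
} toy_client;

typedef struct toy_server {
    toy_client *pending_head; /* clients that have something to write */
} toy_server;

/* Flag the client and queue it once; a no-op while it is already flagged. */
void toy_install_write_handler(toy_server *s, toy_client *c) {
    assert(s != NULL && c != NULL);
    if (c->flags & PENDING_WRITE) return;
    c->flags |= PENDING_WRITE;
    c->next = s->pending_head; /* add at the head, like listAddNodeHead() */
    s->pending_head = c;
}

/* Called before queueing reply data: schedule the client only if it had
 * nothing pending, since otherwise it is already set up to write. */
void toy_prepare_to_write(toy_server *s, toy_client *c, size_t nbytes) {
    if (c->unsent == 0) toy_install_write_handler(s, c);
    c->unsent += nbytes;
}

/* Drain the list before re-entering the event loop. Returns the number of
 * clients processed. A real server would write to the socket here and
 * install a write handler only for clients with data left over. */
int toy_handle_pending_writes(toy_server *s) {
    int processed = 0;
    toy_client *c = s->pending_head;
    while (c != NULL) {
        toy_client *next = c->next;
        c->flags &= ~PENDING_WRITE;
        c->next = NULL;
        c->unsent = 0; /* assumption: everything was written at once */
        processed++;
        c = next;
    }
    s->pending_head = NULL;
    return processed;
}
```

Queueing several replies for one client before the drain step enqueues that client only once, which is the point of checking for pending data before scheduling.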
@@ -339,9 +355,9 @@ void addReplyString(client *c, const char *s, size_t len) { void addReplyErrorLength(client *c, const char *s, size_t len) { /* If the string already starts with "-..." then the error code * is provided by the caller. Otherwise we use "-ERR". */ - if (!len || s[0] != '-') addReplyString(c,"-ERR ",5); - addReplyString(c,s,len); - addReplyString(c,"\r\n",2); + if (!len || s[0] != '-') addReplyProto(c,"-ERR ",5); + addReplyProto(c,s,len); + addReplyProto(c,"\r\n",2); /* Sometimes it could be normal that a slave replies to a master with * an error and this function gets called. Actually the error will never @@ -353,19 +369,13 @@ void addReplyErrorLength(client *c, const char *s, size_t len) { * Where the master must propagate the first change even if the second * will produce an error. However it is useful to log such events since * they are rare and may hint at errors in a script or a bug in Redis. */ - if (c->flags & (CLIENT_MASTER|CLIENT_SLAVE)) { - char* to = c->flags & CLIENT_MASTER? "master": "slave"; - char* from = c->flags & CLIENT_MASTER? "slave": "master"; + if (c->flags & (CLIENT_MASTER|CLIENT_SLAVE) && !(c->flags & CLIENT_MONITOR)) { + char* to = c->flags & CLIENT_MASTER? "master": "replica"; + char* from = c->flags & CLIENT_MASTER? "replica": "master"; char *cmdname = c->lastcmd ? c->lastcmd->name : ""; serverLog(LL_WARNING,"== CRITICAL == This %s is sending an error " "to its %s: '%s' after processing the command " "'%s'", from, to, s, cmdname); - /* Here we want to panic because when a master is sending an - * error to some slave in the context of replication, this can - * only create some kind of offset or data desynchronization. Better - * to catch it ASAP and crash instead of continuing. */ - if (c->flags & CLIENT_SLAVE) - serverPanic("Continuing is unsafe: replication protocol violation."); } } @@ -390,9 +400,9 @@ void addReplyErrorFormat(client *c, const char *fmt, ...) 
{ } void addReplyStatusLength(client *c, const char *s, size_t len) { - addReplyString(c,"+",1); - addReplyString(c,s,len); - addReplyString(c,"\r\n",2); + addReplyProto(c,"+",1); + addReplyProto(c,s,len); + addReplyProto(c,"\r\n",2); } void addReplyStatus(client *c, const char *status) { @@ -410,28 +420,28 @@ void addReplyStatusFormat(client *c, const char *fmt, ...) { /* Adds an empty object to the reply list that will contain the multi bulk * length, which is not known when this function is called. */ -void *addDeferredMultiBulkLength(client *c) { +void *addReplyDeferredLen(client *c) { /* Note that we install the write event here even if the object is not * ready to be sent, since we are sure that before returning to the - * event loop setDeferredMultiBulkLength() will be called. */ + * event loop setDeferredAggregateLen() will be called. */ if (prepareClientToWrite(c) != C_OK) return NULL; listAddNodeTail(c->reply,NULL); /* NULL is our placeholder. */ return listLast(c->reply); } /* Populate the length object and try gluing it to the next chunk. */ -void setDeferredMultiBulkLength(client *c, void *node, long length) { +void setDeferredAggregateLen(client *c, void *node, long length, char prefix) { listNode *ln = (listNode*)node; clientReplyBlock *next; char lenstr[128]; - size_t lenstr_len = sprintf(lenstr, "*%ld\r\n", length); + size_t lenstr_len = sprintf(lenstr, "%c%ld\r\n", prefix, length); /* Abort when *node is NULL: when the client should not accept writes - * we return NULL in addDeferredMultiBulkLength() */ + * we return NULL in addReplyDeferredLen() */ if (node == NULL) return; serverAssert(!listNodeValue(ln)); - /* Normally we fill this dummy NULL node, added by addDeferredMultiBulkLength(), + /* Normally we fill this dummy NULL node, added by addReplyDeferredLen(), * with a new buffer structure containing the protocol needed to specify * the length of the array following. 
However sometimes when there is * little memory to move, we may instead remove this NULL node, and prefix @@ -461,18 +471,55 @@ void setDeferredMultiBulkLength(client *c, void *node, long length) { asyncCloseClientOnOutputBufferLimitReached(c); } +void setDeferredArrayLen(client *c, void *node, long length) { + setDeferredAggregateLen(c,node,length,'*'); +} + +void setDeferredMapLen(client *c, void *node, long length) { + int prefix = c->resp == 2 ? '*' : '%'; + if (c->resp == 2) length *= 2; + setDeferredAggregateLen(c,node,length,prefix); +} + +void setDeferredSetLen(client *c, void *node, long length) { + int prefix = c->resp == 2 ? '*' : '~'; + setDeferredAggregateLen(c,node,length,prefix); +} + +void setDeferredAttributeLen(client *c, void *node, long length) { + int prefix = c->resp == 2 ? '*' : '|'; + if (c->resp == 2) length *= 2; + setDeferredAggregateLen(c,node,length,prefix); +} + +void setDeferredPushLen(client *c, void *node, long length) { + int prefix = c->resp == 2 ? '*' : '>'; + setDeferredAggregateLen(c,node,length,prefix); +} + /* Add a double as a bulk reply */ void addReplyDouble(client *c, double d) { - char dbuf[128], sbuf[128]; - int dlen, slen; if (isinf(d)) { /* Libc in odd systems (Hi Solaris!) will format infinite in a * different way, so better to handle it in an explicit way. */ - addReplyBulkCString(c, d > 0 ? "inf" : "-inf"); + if (c->resp == 2) { + addReplyBulkCString(c, d > 0 ? "inf" : "-inf"); + } else { + addReplyProto(c, d > 0 ? ",inf\r\n" : ",-inf\r\n", + d > 0 ? 
6 : 7); + } } else { - dlen = snprintf(dbuf,sizeof(dbuf),"%.17g",d); - slen = snprintf(sbuf,sizeof(sbuf),"$%d\r\n%s\r\n",dlen,dbuf); - addReplyString(c,sbuf,slen); + char dbuf[MAX_LONG_DOUBLE_CHARS+3], + sbuf[MAX_LONG_DOUBLE_CHARS+32]; + int dlen, slen; + if (c->resp == 2) { + dlen = snprintf(dbuf,sizeof(dbuf),"%.17g",d); + slen = snprintf(sbuf,sizeof(sbuf),"$%d\r\n%s\r\n",dlen,dbuf); + addReplyProto(c,sbuf,slen); + } else { + dlen = snprintf(dbuf,sizeof(dbuf),",%.17g\r\n",d); + addReplyProto(c,dbuf,dlen); + } } } @@ -480,9 +527,17 @@ void addReplyDouble(client *c, double d) { * of the double instead of exposing the crude behavior of doubles to the * dear user. */ void addReplyHumanLongDouble(client *c, long double d) { - robj *o = createStringObjectFromLongDouble(d,1); - addReplyBulk(c,o); - decrRefCount(o); + if (c->resp == 2) { + robj *o = createStringObjectFromLongDouble(d,1); + addReplyBulk(c,o); + decrRefCount(o); + } else { + char buf[MAX_LONG_DOUBLE_CHARS]; + int len = ld2string(buf,sizeof(buf),d,1); + addReplyProto(c,",",1); + addReplyProto(c,buf,len); + addReplyProto(c,"\r\n",2); + } } /* Add a long long as integer reply or bulk len / multi bulk count. 
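The RESP2/RESP3 split used by `addReplyDouble()` and `setDeferredMapLen()` above can be illustrated with a standalone sketch that formats into a caller-supplied buffer instead of a client output buffer. The helpers `format_double_reply()` and `format_map_header()` are hypothetical names introduced here for illustration; they are not part of Redis, and the sketch assumes RESP3 negative infinity is sent as `,-inf`.

```c
#include <assert.h>
#include <math.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper, not a Redis function: write the wire form of a
 * double into 'out'. Returns the number of bytes written (excluding the
 * NUL terminator), or -1 if 'out' is too small. */
int format_double_reply(int resp, double d, char *out, size_t outlen) {
    char digits[64];
    int n;
    assert(out != NULL);
    if (isinf(d)) {
        /* Spell infinity explicitly instead of trusting libc's output. */
        snprintf(digits, sizeof(digits), "%s", d > 0 ? "inf" : "-inf");
    } else {
        snprintf(digits, sizeof(digits), "%.17g", d);
    }
    if (resp == 2) {
        /* RESP2 has no double type: send a length-prefixed bulk string. */
        n = snprintf(out, outlen, "$%zu\r\n%s\r\n", strlen(digits), digits);
    } else {
        /* RESP3 double type: ',' prefix and no length header. */
        n = snprintf(out, outlen, ",%s\r\n", digits);
    }
    return (n < 0 || (size_t)n >= outlen) ? -1 : n;
}

/* Hypothetical helper: header for a map holding 'pairs' key/value pairs. */
int format_map_header(int resp, long pairs, char *out, size_t outlen) {
    int n;
    assert(out != NULL);
    if (resp == 2) {
        /* RESP2 flattens the map into an array of 2*pairs elements. */
        n = snprintf(out, outlen, "*%ld\r\n", pairs * 2);
    } else {
        /* RESP3 map type: '%' prefix followed by the number of pairs. */
        n = snprintf(out, outlen, "%%%ld\r\n", pairs);
    }
    return (n < 0 || (size_t)n >= outlen) ? -1 : n;
}
```

For a RESP2 client, a two-field map such as the name/ver pair emitted per module by `MODULE LIST` gets the header `*4`, while a RESP3 client receives `%2`.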
@@ -506,7 +561,7 @@ void addReplyLongLongWithPrefix(client *c, long long ll, char prefix) { len = ll2string(buf+1,sizeof(buf)-1,ll); buf[len+1] = '\r'; buf[len+2] = '\n'; - addReplyString(c,buf,len+3); + addReplyProto(c,buf,len+3); } void addReplyLongLong(client *c, long long ll) { @@ -518,32 +573,70 @@ void addReplyLongLong(client *c, long long ll) { addReplyLongLongWithPrefix(c,ll,':'); } -void addReplyMultiBulkLen(client *c, long length) { - if (length < OBJ_SHARED_BULKHDR_LEN) +void addReplyAggregateLen(client *c, long length, int prefix) { + if (prefix == '*' && length < OBJ_SHARED_BULKHDR_LEN) addReply(c,shared.mbulkhdr[length]); else - addReplyLongLongWithPrefix(c,length,'*'); + addReplyLongLongWithPrefix(c,length,prefix); +} + +void addReplyArrayLen(client *c, long length) { + addReplyAggregateLen(c,length,'*'); +} + +void addReplyMapLen(client *c, long length) { + int prefix = c->resp == 2 ? '*' : '%'; + if (c->resp == 2) length *= 2; + addReplyAggregateLen(c,length,prefix); +} + +void addReplySetLen(client *c, long length) { + int prefix = c->resp == 2 ? '*' : '~'; + addReplyAggregateLen(c,length,prefix); +} + +void addReplyAttributeLen(client *c, long length) { + int prefix = c->resp == 2 ? '*' : '|'; + if (c->resp == 2) length *= 2; + addReplyAggregateLen(c,length,prefix); +} + +void addReplyPushLen(client *c, long length) { + int prefix = c->resp == 2 ? '*' : '>'; + addReplyAggregateLen(c,length,prefix); +} + +void addReplyNull(client *c) { + if (c->resp == 2) { + addReplyProto(c,"$-1\r\n",5); + } else { + addReplyProto(c,"_\r\n",3); + } +} + +void addReplyBool(client *c, int b) { + if (c->resp == 2) { + addReply(c, b ? shared.cone : shared.czero); + } else { + addReplyProto(c, b ? "#t\r\n" : "#f\r\n",4); + } +} + +/* A null array is a concept that no longer exists in RESP3. However + * RESP2 had it, so API-wise we have this call, that will emit the correct + * RESP2 protocol, however for RESP3 the reply will always be just the + * Null type "_\r\n". 
*/ +void addReplyNullArray(client *c) { + if (c->resp == 2) { + addReplyProto(c,"*-1\r\n",5); + } else { + addReplyProto(c,"_\r\n",3); + } } /* Create the length prefix of a bulk reply, example: $2234 */ void addReplyBulkLen(client *c, robj *obj) { - size_t len; - - if (sdsEncodedObject(obj)) { - len = sdslen(obj->ptr); - } else { - long n = (long)obj->ptr; - - /* Compute how many bytes will take this integer as a radix 10 string */ - len = 1; - if (n < 0) { - len++; - n = -n; - } - while((n = n/10) != 0) { - len++; - } - } + size_t len = stringObjectLen(obj); if (len < OBJ_SHARED_BULKHDR_LEN) addReply(c,shared.bulkhdr[len]); @@ -561,7 +654,7 @@ void addReplyBulk(client *c, robj *obj) { /* Add a C buffer as bulk reply */ void addReplyBulkCBuffer(client *c, const void *p, size_t len) { addReplyLongLongWithPrefix(c,len,'$'); - addReplyString(c,p,len); + addReplyProto(c,p,len); addReply(c,shared.crlf); } @@ -575,7 +668,7 @@ void addReplyBulkSds(client *c, sds s) { /* Add a C null term string as bulk reply */ void addReplyBulkCString(client *c, const char *s) { if (s == NULL) { - addReply(c,shared.nullbulk); + addReplyNull(c); } else { addReplyBulkCBuffer(c,s,strlen(s)); } @@ -590,13 +683,42 @@ void addReplyBulkLongLong(client *c, long long ll) { addReplyBulkCBuffer(c,buf,len); } +/* Reply with a verbatim type having the specified extension. + * + * The 'ext' is the "extension" of the file, actually just a three + * character type that describes the format of the verbatim string. + * For instance "txt" means it should be interpreted as a text only + * file by the receiver, "md " as markdown, and so forth. Only the + * three first characters of the extension are used, and if the + * provided one is shorter than that, the remaining is filled with + * spaces. 
*/ +void addReplyVerbatim(client *c, const char *s, size_t len, const char *ext) { + if (c->resp == 2) { + addReplyBulkCBuffer(c,s,len); + } else { + char buf[32]; + size_t preflen = snprintf(buf,sizeof(buf),"=%zu\r\nxxx:",len+4); + char *p = buf+preflen-4; + for (int i = 0; i < 3; i++) { + if (*ext == '\0') { + p[i] = ' '; + } else { + p[i] = *ext++; + } + } + addReplyProto(c,buf,preflen); + addReplyProto(c,s,len); + addReplyProto(c,"\r\n",2); + } +} + /* Add an array of C strings as status replies with a heading. * This function is typically invoked by from commands that support * subcommands in response to the 'help' subcommand. The help array * is terminated by NULL sentinel. */ void addReplyHelp(client *c, const char **help) { sds cmd = sdsnew((char*) c->argv[0]->ptr); - void *blenp = addDeferredMultiBulkLength(c); + void *blenp = addReplyDeferredLen(c); int blen = 0; sdstoupper(cmd); @@ -607,7 +729,7 @@ void addReplyHelp(client *c, const char **help) { while (help[blen]) addReplyStatus(c,help[blen++]); blen++; /* Account for the header line(s). */ - setDeferredMultiBulkLength(c,blenp,blen); + setDeferredArrayLen(c,blenp,blen); } /* Add a suggestive error reply. @@ -672,7 +794,7 @@ static void acceptCommonHandler(int fd, int flags, char *ip) { * user what to do to fix it if needed. */ if (server.protected_mode && server.bindaddr_count == 0 && - server.requirepass == NULL && + DefaultUser->flags & USER_FLAG_NOPASS && !(flags & CLIENT_UNIX_SOCKET) && ip != NULL) { @@ -817,6 +939,13 @@ void unlinkClient(client *c) { void freeClient(client *c) { listNode *ln; + /* If a client is protected, yet we need to free it right now, make sure + * to at least use asynchronous freeing. */ + if (c->flags & CLIENT_PROTECTED) { + freeClientAsync(c); + return; + } + /* If it is our master that's beging disconnected we should make sure * to cache the state to try a partial resynchronization later. 
* @@ -826,8 +955,7 @@ void freeClient(client *c) { serverLog(LL_WARNING,"Connection with master lost."); if (!(c->flags & (CLIENT_CLOSE_AFTER_REPLY| CLIENT_CLOSE_ASAP| - CLIENT_BLOCKED| - CLIENT_UNBLOCKED))) + CLIENT_BLOCKED))) { replicationCacheMaster(c); return; @@ -836,7 +964,7 @@ void freeClient(client *c) { /* Log link disconnection with slave */ if ((c->flags & CLIENT_SLAVE) && !(c->flags & CLIENT_MONITOR)) { - serverLog(LL_WARNING,"Connection with slave %s lost.", + serverLog(LL_WARNING,"Connection with replica %s lost.", replicationGetSlaveName(c)); } @@ -1054,6 +1182,10 @@ int handleClientsWithPendingWrites(void) { c->flags &= ~CLIENT_PENDING_WRITE; listDelNode(server.clients_pending_write,ln); + /* If a client is protected, don't do anything, + * that may trigger write error or recreate handler. */ + if (c->flags & CLIENT_PROTECTED) continue; + /* Try to write buffers to the client socket. */ if (writeToClient(c->fd,c,0) == C_ERR) continue; @@ -1105,6 +1237,34 @@ void resetClient(client *c) { } } +/* This function is used when we want to re-enter the event loop but there + * is the risk that the client we are dealing with will be freed in some + * way. This happens for instance in: + * + * * DEBUG RELOAD and similar. + * * When a Lua script is in -BUSY state. + * + * So the function will protect the client by doing two things: + * + * 1) It removes the file events. This way it is not possible that an + * error is signaled on the socket, freeing the client. + * 2) Moreover it makes sure that if the client is freed in a different code + * path, it is not really released, but only marked for later release. 
*/ +void protectClient(client *c) { + c->flags |= CLIENT_PROTECTED; + aeDeleteFileEvent(server.el,c->fd,AE_READABLE); + aeDeleteFileEvent(server.el,c->fd,AE_WRITABLE); +} + +/* This will undo the client protection done by protectClient() */ +void unprotectClient(client *c) { + if (c->flags & CLIENT_PROTECTED) { + c->flags &= ~CLIENT_PROTECTED; + aeCreateFileEvent(server.el,c->fd,AE_READABLE,readQueryFromClient,c); + if (clientHasPendingReplies(c)) clientInstallWriteHandler(c); + } +} + /* Like processMultibulkBuffer(), but for the inline protocol instead of RESP, * this function consumes the client query buffer and creates a command ready * to be executed inside the client structure. Returns C_OK if the command @@ -1119,29 +1279,29 @@ int processInlineBuffer(client *c) { size_t querylen; /* Search for end of line */ - newline = strchr(c->querybuf,'\n'); + newline = strchr(c->querybuf+c->qb_pos,'\n'); /* Nothing to do without a \r\n */ if (newline == NULL) { - if (sdslen(c->querybuf) > PROTO_INLINE_MAX_SIZE) { + if (sdslen(c->querybuf)-c->qb_pos > PROTO_INLINE_MAX_SIZE) { addReplyError(c,"Protocol error: too big inline request"); - setProtocolError("too big inline request",c,0); + setProtocolError("too big inline request",c); } return C_ERR; } /* Handle the \r\n case. 
*/ - if (newline && newline != c->querybuf && *(newline-1) == '\r') + if (newline && newline != c->querybuf+c->qb_pos && *(newline-1) == '\r') newline--, linefeed_chars++; /* Split the input buffer up to the \r\n */ - querylen = newline-(c->querybuf); - aux = sdsnewlen(c->querybuf,querylen); + querylen = newline-(c->querybuf+c->qb_pos); + aux = sdsnewlen(c->querybuf+c->qb_pos,querylen); argv = sdssplitargs(aux,&argc); sdsfree(aux); if (argv == NULL) { addReplyError(c,"Protocol error: unbalanced quotes in request"); - setProtocolError("unbalanced quotes in inline request",c,0); + setProtocolError("unbalanced quotes in inline request",c); return C_ERR; } @@ -1151,8 +1311,8 @@ int processInlineBuffer(client *c) { if (querylen == 0 && c->flags & CLIENT_SLAVE) c->repl_ack_time = server.unixtime; - /* Leave data after the first line of the query in the buffer */ - sdsrange(c->querybuf,querylen+linefeed_chars,-1); + /* Move querybuffer position to the next query in the buffer. */ + c->qb_pos += querylen+linefeed_chars; /* Setup argv array on client structure */ if (argc) { @@ -1173,19 +1333,19 @@ int processInlineBuffer(client *c) { return C_OK; } -/* Helper function. Trims query buffer to make the function that processes - * multi bulk requests idempotent. */ +/* Helper function. Record protocol error details in server log, + * and set the client as CLIENT_CLOSE_AFTER_REPLY. */ #define PROTO_DUMP_LEN 128 -static void setProtocolError(const char *errstr, client *c, long pos) { +static void setProtocolError(const char *errstr, client *c) { if (server.verbosity <= LL_VERBOSE) { sds client = catClientInfoString(sdsempty(),c); /* Sample some protocol to give an idea about what was inside. 
*/ char buf[256]; - if (sdslen(c->querybuf) < PROTO_DUMP_LEN) { - snprintf(buf,sizeof(buf),"Query buffer during protocol error: '%s'", c->querybuf); + if (sdslen(c->querybuf)-c->qb_pos < PROTO_DUMP_LEN) { + snprintf(buf,sizeof(buf),"Query buffer during protocol error: '%s'", c->querybuf+c->qb_pos); } else { - snprintf(buf,sizeof(buf),"Query buffer during protocol error: '%.*s' (... more %zu bytes ...) '%.*s'", PROTO_DUMP_LEN/2, c->querybuf, sdslen(c->querybuf)-PROTO_DUMP_LEN, PROTO_DUMP_LEN/2, c->querybuf+sdslen(c->querybuf)-PROTO_DUMP_LEN/2); + snprintf(buf,sizeof(buf),"Query buffer during protocol error: '%.*s' (... more %zu bytes ...) '%.*s'", PROTO_DUMP_LEN/2, c->querybuf+c->qb_pos, sdslen(c->querybuf)-c->qb_pos-PROTO_DUMP_LEN, PROTO_DUMP_LEN/2, c->querybuf+sdslen(c->querybuf)-PROTO_DUMP_LEN/2); } /* Remove non printable chars. */ @@ -1201,7 +1361,6 @@ static void setProtocolError(const char *errstr, client *c, long pos) { sdsfree(client); } c->flags |= CLIENT_CLOSE_AFTER_REPLY; - sdsrange(c->querybuf,pos,-1); } /* Process the query buffer for client 'c', setting up the client argument @@ -1217,7 +1376,6 @@ static void setProtocolError(const char *errstr, client *c, long pos) { * to be '*'. Otherwise for inline commands processInlineBuffer() is called. 
*/ int processMultibulkBuffer(client *c) { char *newline = NULL; - long pos = 0; int ok; long long ll; @@ -1226,34 +1384,32 @@ int processMultibulkBuffer(client *c) { serverAssertWithInfo(c,NULL,c->argc == 0); /* Multi bulk length cannot be read without a \r\n */ - newline = strchr(c->querybuf,'\r'); + newline = strchr(c->querybuf+c->qb_pos,'\r'); if (newline == NULL) { - if (sdslen(c->querybuf) > PROTO_INLINE_MAX_SIZE) { + if (sdslen(c->querybuf)-c->qb_pos > PROTO_INLINE_MAX_SIZE) { addReplyError(c,"Protocol error: too big mbulk count string"); - setProtocolError("too big mbulk count string",c,0); + setProtocolError("too big mbulk count string",c); } return C_ERR; } /* Buffer should also contain \n */ - if (newline-(c->querybuf) > ((signed)sdslen(c->querybuf)-2)) + if (newline-(c->querybuf+c->qb_pos) > (ssize_t)(sdslen(c->querybuf)-c->qb_pos-2)) return C_ERR; /* We know for sure there is a whole line since newline != NULL, * so go ahead and find out the multi bulk length. */ - serverAssertWithInfo(c,NULL,c->querybuf[0] == '*'); - ok = string2ll(c->querybuf+1,newline-(c->querybuf+1),&ll); + serverAssertWithInfo(c,NULL,c->querybuf[c->qb_pos] == '*'); + ok = string2ll(c->querybuf+1+c->qb_pos,newline-(c->querybuf+1+c->qb_pos),&ll); if (!ok || ll > 1024*1024) { addReplyError(c,"Protocol error: invalid multibulk length"); - setProtocolError("invalid mbulk count",c,pos); + setProtocolError("invalid mbulk count",c); return C_ERR; } - pos = (newline-c->querybuf)+2; - if (ll <= 0) { - sdsrange(c->querybuf,pos,-1); - return C_OK; - } + c->qb_pos = (newline-c->querybuf)+2; + + if (ll <= 0) return C_OK; c->multibulklen = ll; @@ -1266,64 +1422,67 @@ int processMultibulkBuffer(client *c) { while(c->multibulklen) { /* Read bulk length if unknown */ if (c->bulklen == -1) { - newline = strchr(c->querybuf+pos,'\r'); + newline = strchr(c->querybuf+c->qb_pos,'\r'); if (newline == NULL) { - if (sdslen(c->querybuf) > PROTO_INLINE_MAX_SIZE) { + if (sdslen(c->querybuf)-c->qb_pos > 
PROTO_INLINE_MAX_SIZE) { addReplyError(c, "Protocol error: too big bulk count string"); - setProtocolError("too big bulk count string",c,0); + setProtocolError("too big bulk count string",c); return C_ERR; } break; } /* Buffer should also contain \n */ - if (newline-(c->querybuf) > ((signed)sdslen(c->querybuf)-2)) + if (newline-(c->querybuf+c->qb_pos) > (ssize_t)(sdslen(c->querybuf)-c->qb_pos-2)) break; - if (c->querybuf[pos] != '$') { + if (c->querybuf[c->qb_pos] != '$') { addReplyErrorFormat(c, "Protocol error: expected '$', got '%c'", - c->querybuf[pos]); - setProtocolError("expected $ but got something else",c,pos); + c->querybuf[c->qb_pos]); + setProtocolError("expected $ but got something else",c); return C_ERR; } - ok = string2ll(c->querybuf+pos+1,newline-(c->querybuf+pos+1),&ll); + ok = string2ll(c->querybuf+c->qb_pos+1,newline-(c->querybuf+c->qb_pos+1),&ll); if (!ok || ll < 0 || ll > server.proto_max_bulk_len) { addReplyError(c,"Protocol error: invalid bulk length"); - setProtocolError("invalid bulk length",c,pos); + setProtocolError("invalid bulk length",c); return C_ERR; } - pos += newline-(c->querybuf+pos)+2; + c->qb_pos = newline-c->querybuf+2; if (ll >= PROTO_MBULK_BIG_ARG) { - size_t qblen; - /* If we are going to read a large object from network * try to make it likely that it will start at c->querybuf * boundary so that we can optimize object creation - * avoiding a large copy of data. */ - sdsrange(c->querybuf,pos,-1); - pos = 0; - qblen = sdslen(c->querybuf); - /* Hint the sds library about the amount of bytes this string is - * going to contain. */ - if (qblen < (size_t)ll+2) - c->querybuf = sdsMakeRoomFor(c->querybuf,ll+2-qblen); + * avoiding a large copy of data. + * + * But only when the data we have not parsed is less than + * or equal to ll+2. If the data length is greater than + * ll+2, trimming querybuf is just a waste of time, because + * at this time the querybuf contains not only our bulk. 
*/ + if (sdslen(c->querybuf)-c->qb_pos <= (size_t)ll+2) { + sdsrange(c->querybuf,c->qb_pos,-1); + c->qb_pos = 0; + /* Hint the sds library about the amount of bytes this string is + * going to contain. */ + c->querybuf = sdsMakeRoomFor(c->querybuf,ll+2); + } } c->bulklen = ll; } /* Read bulk argument */ - if (sdslen(c->querybuf)-pos < (size_t)(c->bulklen+2)) { + if (sdslen(c->querybuf)-c->qb_pos < (size_t)(c->bulklen+2)) { /* Not enough data (+2 == trailing \r\n) */ break; } else { /* Optimization: if the buffer contains JUST our bulk element * instead of creating a new object by *copying* the sds we * just use the current sds string. */ - if (pos == 0 && + if (c->qb_pos == 0 && c->bulklen >= PROTO_MBULK_BIG_ARG && sdslen(c->querybuf) == (size_t)(c->bulklen+2)) { @@ -1333,20 +1492,16 @@ int processMultibulkBuffer(client *c) { * likely... */ c->querybuf = sdsnewlen(SDS_NOINIT,c->bulklen+2); sdsclear(c->querybuf); - pos = 0; } else { c->argv[c->argc++] = - createStringObject(c->querybuf+pos,c->bulklen); - pos += c->bulklen+2; + createStringObject(c->querybuf+c->qb_pos,c->bulklen); + c->qb_pos += c->bulklen+2; } c->bulklen = -1; c->multibulklen--; } } - /* Trim to pos */ - if (pos) sdsrange(c->querybuf,pos,-1); - /* We're done when c->multibulk == 0 */ if (c->multibulklen == 0) return C_OK; @@ -1360,14 +1515,21 @@ int processMultibulkBuffer(client *c) { * pending query buffer, already representing a full command, to process. */ void processInputBuffer(client *c) { server.current_client = c; + /* Keep processing while there is something in the input buffer */ - while(sdslen(c->querybuf)) { + while(c->qb_pos < sdslen(c->querybuf)) { /* Return if clients are paused. */ if (!(c->flags & CLIENT_SLAVE) && clientsArePaused()) break; /* Immediately abort if the client is in the middle of something. */ if (c->flags & CLIENT_BLOCKED) break; + /* Don't process input from the master while there is a busy script + * condition on the slave. 
We want just to accumulate the replication + * stream (instead of replying -BUSY like we do with other clients) and + * later resume the processing. */ + if (server.lua_timedout && c->flags & CLIENT_MASTER) break; + /* CLIENT_CLOSE_AFTER_REPLY closes the connection once the reply is * written to the client. Make sure to not let the reply grow after * this flag has been set (i.e. don't process more commands). @@ -1377,7 +1539,7 @@ void processInputBuffer(client *c) { /* Determine request type when unknown. */ if (!c->reqtype) { - if (c->querybuf[0] == '*') { + if (c->querybuf[c->qb_pos] == '*') { c->reqtype = PROTO_REQ_MULTIBULK; } else { c->reqtype = PROTO_REQ_INLINE; @@ -1386,6 +1548,14 @@ void processInputBuffer(client *c) { if (c->reqtype == PROTO_REQ_INLINE) { if (processInlineBuffer(c) != C_OK) break; + /* If Gopher mode is enabled and we got zero or one argument, process + * the request in Gopher mode. */ + if (server.gopher_enabled && (c->argc == 1 || c->argc == 0)) { + processGopherRequest(c); + resetClient(c); + c->flags |= CLIENT_CLOSE_AFTER_REPLY; + break; + } } else if (c->reqtype == PROTO_REQ_MULTIBULK) { if (processMultibulkBuffer(c) != C_OK) break; } else { @@ -1400,7 +1570,7 @@ void processInputBuffer(client *c) { if (processCommand(c) == C_OK) { if (c->flags & CLIENT_MASTER && !(c->flags & CLIENT_MULTI)) { /* Update the applied replication offset of our master. 
*/ - c->reploff = c->read_reploff - sdslen(c->querybuf); + c->reploff = c->read_reploff - sdslen(c->querybuf) + c->qb_pos; } /* Don't reset the client structure for clients blocked in a @@ -1416,9 +1586,35 @@ void processInputBuffer(client *c) { if (server.current_client == NULL) break; } } + + /* Trim to pos */ + if (server.current_client != NULL && c->qb_pos) { + sdsrange(c->querybuf,c->qb_pos,-1); + c->qb_pos = 0; + } + server.current_client = NULL; } +/* This is a wrapper for processInputBuffer that also cares about handling + * the replication forwarding to the sub-slaves, in case the client 'c' + * is flagged as master. Usually you want to call this instead of the + * raw processInputBuffer(). */ +void processInputBufferAndReplicate(client *c) { + if (!(c->flags & CLIENT_MASTER)) { + processInputBuffer(c); + } else { + size_t prev_offset = c->reploff; + processInputBuffer(c); + size_t applied = c->reploff - prev_offset; + if (applied) { + replicationFeedSlavesFromMasterStream(server.slaves, + c->pending_querybuf, applied); + sdsrange(c->pending_querybuf,applied,-1); + } + } +} + void readQueryFromClient(aeEventLoop *el, int fd, void *privdata, int mask) { client *c = (client*) privdata; int nread, readlen; @@ -1438,7 +1634,9 @@ void readQueryFromClient(aeEventLoop *el, int fd, void *privdata, int mask) { { ssize_t remaining = (size_t)(c->bulklen+2)-sdslen(c->querybuf); - if (remaining < readlen) readlen = remaining; + /* Note that the 'remaining' variable may be zero in some edge case, + * for example once we resume a blocked client after CLIENT PAUSE. */ + if (remaining > 0 && remaining < readlen) readlen = remaining; } qblen = sdslen(c->querybuf); @@ -1486,18 +1684,7 @@ void readQueryFromClient(aeEventLoop *el, int fd, void *privdata, int mask) { * was actually applied to the master state: this quantity, and its * corresponding part of the replication stream, will be propagated to * the sub-slaves and to the replication backlog. 
*/ - if (!(c->flags & CLIENT_MASTER)) { - processInputBuffer(c); - } else { - size_t prev_offset = c->reploff; - processInputBuffer(c); - size_t applied = c->reploff - prev_offset; - if (applied) { - replicationFeedSlavesFromMasterStream(server.slaves, - c->pending_querybuf, applied); - sdsrange(c->pending_querybuf,applied,-1); - } - } + processInputBufferAndReplicate(c); } void getClientsMaxBuffers(unsigned long *longest_output_list, @@ -1586,7 +1773,7 @@ sds catClientInfoString(sds s, client *client) { if (emask & AE_WRITABLE) *p++ = 'w'; *p = '\0'; return sdscatfmt(s, - "id=%U addr=%s fd=%i name=%s age=%I idle=%I flags=%s db=%i sub=%i psub=%i multi=%i qbuf=%U qbuf-free=%U obl=%U oll=%U omem=%U events=%s cmd=%s", + "id=%U addr=%s fd=%i name=%s age=%I idle=%I flags=%s db=%i sub=%i psub=%i multi=%i qbuf=%U qbuf-free=%U obl=%U oll=%U omem=%U events=%s cmd=%s user=%s", (unsigned long long) client->id, getClientPeerId(client), client->fd, @@ -1604,7 +1791,8 @@ sds catClientInfoString(sds s, client *client) { (unsigned long long) listLength(client->reply), (unsigned long long) getClientOutputBufferMemoryUsage(client), events, - client->lastcmd ? client->lastcmd->name : "NULL"); + client->lastcmd ? client->lastcmd->name : "NULL", + client->user ? client->user->name : "(superuser)"); } sds getAllClientsInfoString(int type) { @@ -1623,6 +1811,45 @@ sds getAllClientsInfoString(int type) { return o; } +/* This function implements CLIENT SETNAME, including replying to the + * user with an error if the charset is wrong (in that case C_ERR is + * returned). If the function succeeded C_OK is returned, and it's up + * to the caller to send a reply if needed. + * + * Setting an empty string as name has the effect of unsetting the + * currently set name: the client will remain unnamed. + * + * This function is also used to implement the HELLO SETNAME option. 
*/ +int clientSetNameOrReply(client *c, robj *name) { + int len = sdslen(name->ptr); + char *p = name->ptr; + + /* Setting the client name to an empty string actually removes + * the current name. */ + if (len == 0) { + if (c->name) decrRefCount(c->name); + c->name = NULL; + addReply(c,shared.ok); + return C_OK; + } + + /* Otherwise check if the charset is ok. We need to do this otherwise + * CLIENT LIST format will break. You should always be able to + * split by space to get the different fields. */ + for (int j = 0; j < len; j++) { + if (p[j] < '!' || p[j] > '~') { /* ASCII is assumed. */ + addReplyError(c, + "Client names cannot contain spaces, " + "newlines or special characters."); + return C_ERR; + } + } + if (c->name) decrRefCount(c->name); + c->name = name; + incrRefCount(name); + return C_OK; +} + void clientCommand(client *c) { listNode *ln; listIter li; @@ -1635,10 +1862,10 @@ void clientCommand(client *c) { "kill -- Kill connection made from .", "kill