New commands:
`HRANDFIELD [<count> [WITHVALUES]]`
`ZRANDMEMBER [<count> [WITHSCORES]]`
Algorithms are similar to the one in SRANDMEMBER.
Both return a simple bulk response when no arguments are given, and an array otherwise.
In case values/scores are requested, RESP2 returns a long array, and RESP3 a nested array.
note: in all 3 commands, the only option that also provides random order is the one with negative count.
Changes to SRANDMEMBER
* Optimization when count is 1, we can use the more efficient algorithm of non-unique random
* optimization: work with sds strings rather than robj
Other changes:
* zzlGetScore: when zset needs to convert string to double, we use safer memcpy (in
case the buffer is too small)
* Solve a "bug" in SRANDMEMBER test: it intended to test a positive count (case 3 or
case 4) and by accident used a negative count
Co-authored-by: xinluton <xinluton@qq.com>
Co-authored-by: Oran Agra <oran@redislabs.com>
Fix wrong server dirty increment in
* spopWithCountCommand
* hsetCommand
* ltrimCommand
* pfaddCommand
Some didn't increment the amount of fields (just one per command).
Others had excessive increments.
Syntax:
COPY <key> <new-key> [DB <dest-db>] [REPLACE]
No support for module keys yet.
Co-authored-by: tmgauss
Co-authored-by: Itamar Haber <itamar@redislabs.com>
Co-authored-by: Oran Agra <oran@redislabs.com>
Add two optional callbacks to the RedisModuleTypeMethods structure, which is `free_effort`
and `unlink`. the `free_effort` callback indicates the effort required to free a module memory.
Currently, if the effort exceeds LAZYFREE_THRESHOLD, the module memory may be released
asynchronously. the `unlink` callback indicates the key has been removed from the DB by redis, and
may soon be freed by a background thread.
Add `lazyfreed_objects` info field, which represents the number of objects that have been
lazyfreed since redis was started.
Add `RM_GetTypeMethodVersion` API, which return the current redis-server runtime value of
`REDISMODULE_TYPE_METHOD_VERSION`. You can use that when calling `RM_CreateDataType` to know
which fields of RedisModuleTypeMethods are gonna be supported and which will be ignored.
Reference the correct "case", case 4, in the comment explaining the need
for case 3, when the number of request items is too close to the
cardinality of the set. Case 4 is indeed the "natural approach"
referenced earlier in that sentence.
This is a rebased version of #3078 originally by shaharmor
with the following patches by TysonAndre made after rebasing
to work with the updated C API:
1. Add 2 more unit tests
(wrong argument count error message, integer over 64 bits)
2. Use addReplyArrayLen instead of addReplyMultiBulkLen.
3. Undo changes to src/help.h - for the ZMSCORE PR,
I heard those should instead be automatically
generated from the redis-doc repo if it gets updated
Motivations:
- Example use case: Client code to efficiently check if each element of a set
of 1000 items is a member of a set of 10 million items.
(Similar to reasons for working on #7593)
- HMGET and ZMSCORE already exist. This may lead to developers deciding
to implement functionality that's best suited to a regular set with a
data type of sorted set or hash map instead, for the multi-get support.
Currently, multi commands or lua scripting to call sismember multiple times
would almost definitely be less efficient than a native smismember
for the following reasons:
- Need to fetch the set from the string every time
instead of reusing the C pointer.
- Using pipelining or multi-commands would result in more bytes sent
and received by the client for the repeated SISMEMBER KEY sections.
- Need to specially encode the data and decode it from the client
for lua-based solutions.
- Proposed solutions using Lua or SADD/SDIFF could trigger writes to
memory, which is undesirable on a redis replica server
or when commands get replicated to replicas.
Co-Authored-By: Shahar Mor <shahar@peer5.com>
Co-Authored-By: Tyson Andre <tysonandre775@hotmail.com>
Segfault introduced during a refactoring / warning suppression a few
commits away. This particular call assumed that it is safe to pass NULL
to the object pointer argument when we are sure the set has a given
encoding. This can't be assumed and is now guaranteed to segfault
because of the new API of setTypeNext().
This change fixes several warnings compiling at -O3 level with GCC
4.8.2, and at the same time, in case of misuse of the API, we have the
pointer initialize to NULL or the integer initialized to the value
-123456789 which is easy to spot by naked eye.
The old version of SPOP with "count" argument used an API call of dict.c
which was actually designed for a different goal, and was not capable of
good distribution. We follow a different three-cases approach optimized
for different ratiion between sets and requested number of elements.
The implementation is simpler and allowed the removal of a large amount
of code.
Severan problems are addressed but still a few missing.
Since replication of this command was more complex than others since it
needs to replicate multiple SREM commands, an old API able to do this
was reused (it was taken inside the implementation since it was pretty
obvious soon or later that would be useful). The API was improved a bit
so that now a command may opt-out for the standard command replication
when the server.dirty counter is incremented, in order to "manually"
replicate what it wants.
1. memory leak in t_set.c has been fixed
2. end-of-line spaces has been removed (from all over the place)
3. for loops have been ordered up to match existing Redis style (less weird)
4. comments format has been fixed (added * in the beggining of every comment line)
setTypeRandomElements() now returns unsigned long, and also uses unsigned long for anything related to count of members.
spopWithCountCommand() now uses unsigned long elements_returned instead of int, for values returned from setTypeRandomElements()
spopCommand() now runs spopWithCountCommand() in case the <count> param is found.
Added intsetRandomMembers() to Intset: Copies N random members from the set into inputted 'values' array. Uses either the Knuth or Floyd sample algos depending on ratio count/size.
Added setTypeRandomElements() to SET type: Returns a number of random elements from a non empty set. This is a version of setTypeRandomElement() that is modified in order to return multiple entries, using dictGetRandomKeys() and intsetRandomMembers().
Added tests for SPOP with <count>: unit/type/set, unit/scripting, integration/aof
--
Cleaned up code a bit to match with required Redis coding style
The bug could be easily triggered by:
SADD foo a b c 1 2 3 4 5 6
SDIFF foo foo
When the key was the same in two sets, an unsafe iterator was used to
check existence of elements in the same set we were iterating.
Usually this would just result into a wrong output, however with the
dict.c API misuse protection we have in place, the result was actually
an assertion failed that was triggered by the CI test, while creating
random datasets for the "MASTER and SLAVE consistency" test.
The previous implementation of SCAN parsed the cursor in the generic
function implementing SCAN, SSCAN, HSCAN and ZSCAN.
The actual higher-level command implementation only checked for empty
keys and return ASAP in that case. The result was that inverting the
arguments of, for instance, SSCAN for example and write:
SSCAN 0 key
Instead of
SSCAN key 0
Resulted into no error, since 0 is a non-existing key name very likely.
Just the iterator returned no elements at all.
In order to fix this issue the code was refactored to extract the
function to parse the cursor and return the error. Every higher level
command implementation now parses the cursor and later checks if the key
exist or not.
Previously two string encodings were used for string objects:
1) REDIS_ENCODING_RAW: a string object with obj->ptr pointing to an sds
stirng.
2) REDIS_ENCODING_INT: a string object where the obj->ptr void pointer
is casted to a long.
This commit introduces a experimental new encoding called
REDIS_ENCODING_EMBSTR that implements an object represented by an sds
string that is not modifiable but allocated in the same memory chunk as
the robj structure itself.
The chunk looks like the following:
+--------------+-----------+------------+--------+----+
| robj data... | robj->ptr | sds header | string | \0 |
+--------------+-----+-----+------------+--------+----+
| ^
+-----------------------+
The robj->ptr points to the contiguous sds string data, so the object
can be manipulated with the same functions used to manipulate plan
string objects, however we need just on malloc and one free in order to
allocate or release this kind of objects. Moreover it has better cache
locality.
This new allocation strategy should benefit both the memory usage and
the performances. A performance gain between 60 and 70% was observed
during micro-benchmarks, however there is more work to do to evaluate
the performance impact and the memory usage behavior.