58 Commits

Author SHA1 Message Date
John Sully
b47d7715a4 Merge commit '71134e357ffd6ced7c40c145205dbbac173ee181' into unstable
Former-commit-id: d5b057534a3dbf50f94465332107da2490811946
2020-05-21 17:32:53 -04:00
Guy Benoish
71134e357f DEBUG OBJECT should pass keyname to module when loading 2020-04-07 16:52:04 +02:00
John Sully
6193e9ad4f Merge remote-tracking branch 'redis/6.0' into redis_merge
Former-commit-id: ef9a3cadcf94326bf2f163db7698aad9a3c01690
2020-01-27 02:55:48 -05:00
John Sully
1116b63a0e Initial implementation of the CRON command
Former-commit-id: 3204a39ada15ec33ac7926dc8b8f0e1875b99acb
2020-01-21 19:50:28 -05:00
Oran Agra
68c6aacf3b Modules hooks: complete missing hooks for the initial set of hooks
* replication hooks: role change, master link status, replica online/offline
* persistence hooks: saving, loading, loading progress
* misc hooks: cron loop, shutdown, module loaded/unloaded
* change the way hooks test work, and add tests for all of the above

startLoading() now gets a flag indicating what is being loaded.
stopLoading() now gets an indication of success or failure.
Added startSaving() and stopSaving() with similar args and role.
2019-10-29 17:59:09 +02:00
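
A minimal sketch of subscribing to one of these hook families from a module, assuming the Redis 6 module API (RedisModule_SubscribeToServerEvent and the persistence subevents); the module name and log messages are illustrative:

    #include "redismodule.h"

    /* Called on persistence events; the subevent tells which phase fired. */
    static void PersistenceCallback(RedisModuleCtx *ctx, RedisModuleEvent e,
                                    uint64_t sub, void *data) {
        REDISMODULE_NOT_USED(e);
        REDISMODULE_NOT_USED(data);
        if (sub == REDISMODULE_SUBEVENT_PERSISTENCE_RDB_START)
            RedisModule_Log(ctx, "notice", "background save started");
        else if (sub == REDISMODULE_SUBEVENT_PERSISTENCE_ENDED)
            RedisModule_Log(ctx, "notice", "save/load ended");
    }

    int RedisModule_OnLoad(RedisModuleCtx *ctx, RedisModuleString **argv, int argc) {
        REDISMODULE_NOT_USED(argv);
        REDISMODULE_NOT_USED(argc);
        if (RedisModule_Init(ctx, "hookdemo", 1, REDISMODULE_APIVER_1) == REDISMODULE_ERR)
            return REDISMODULE_ERR;
        /* Register for the persistence hook family (saving/loading). */
        if (RedisModule_SubscribeToServerEvent(ctx, RedisModuleEvent_Persistence,
                                               PersistenceCallback) == REDISMODULE_ERR)
            return REDISMODULE_ERR;
        return REDISMODULE_OK;
    }
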
John Sully
4f19c5de9f Fix multi-master bugs: 1. We fail to create the temp file. 2. We use a master RDB as our backup even though we merged databases (and therefore it is not representative)
Former-commit-id: e776474f68a2824bb7d4082c41991a9a9f3a9c9d
2019-09-26 20:35:51 -04:00
Oran Agra
ff0780e8e6 Implement module api for aux data in rdb
Other changes:
* fix memory leak in error handling of rdb loading of type OBJ_MODULE
2019-07-22 21:15:33 +03:00
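
A sketch of what using this API could look like, assuming the aux_save/aux_load fields and REDISMODULE_AUX_* trigger flags of the module type methods struct; the callback names and payload are hypothetical:

    #include "redismodule.h"

    /* Save module-global state (not tied to any key) into the RDB. */
    static void MyAuxSave(RedisModuleIO *rdb, int when) {
        if (when != REDISMODULE_AUX_AFTER_RDB) return;
        RedisModule_SaveSigned(rdb, 42); /* hypothetical global counter */
    }

    static int MyAuxLoad(RedisModuleIO *rdb, int encver, int when) {
        (void)encver; (void)when;
        long long counter = RedisModule_LoadSigned(rdb);
        (void)counter; /* restore the global state here */
        return REDISMODULE_OK;
    }

    /* Inside RedisModule_OnLoad, when registering the type: */
    RedisModuleTypeMethods tm = {
        .version = REDISMODULE_TYPE_METHOD_VERSION,
        .aux_save = MyAuxSave,
        .aux_load = MyAuxLoad,
        .aux_save_triggers = REDISMODULE_AUX_AFTER_RDB,
    };
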
antirez
72b7b904c1 RDB: try to make error handling code more readable. 2019-07-17 17:30:02 +02:00
Oran Agra
73a945c73c prevent diskless replica from terminating on short read
Now that the replica can read the RDB directly from the socket, it should avoid exiting
on a short read and instead try to re-sync.

This commit tries to have minimal effect on non-diskless RDB reading,
and includes a test that tries to trigger this scenario in various read cases.
2019-07-17 16:46:22 +02:00
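
A hypothetical sketch of the idea (the function and flag names are invented for illustration): when the source is a socket, a short read becomes a recoverable error the caller can turn into a re-sync, while disk loading keeps treating it as fatal corruption:

    /* Returns 0 on success, -1 on a recoverable short read. */
    static int rdbReadRetryable(rio *r, void *buf, size_t len, int from_socket) {
        if (rioRead(r, buf, len)) return 0;   /* full read, all good */
        if (from_socket) return -1;           /* caller aborts load and re-syncs */
        /* Disk-based loading: a truncated file is real corruption. */
        serverLog(LL_WARNING,
            "Short read or OOM loading DB. Unrecoverable error, aborting now.");
        exit(1);
    }
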
John Sully
725dd6539f Add back user space buffering of RDB save
Former-commit-id: d918ca6fa57a6149b86b4effc787dbdde7350133
2019-07-06 00:36:23 -04:00
John Sully
c07aa5f3fe Additional MVCC work and fix a memory leak when loading objects from rdb
Former-commit-id: efde2e6be6dc2fc3425a17e2dc146c5b8823730a
2019-04-19 22:54:42 -04:00
John Sully
0e10e4f6f5 Start of MVCC support (and more C++)
Former-commit-id: c4621a5ed2a7d8ca5034f2fbe8b71550f290ea64
2019-04-16 23:16:03 -04:00
John Sully
68bec6f239 Move remaining files dependent on server.h over to C++
Former-commit-id: 8c133b605c65212b023d35b3cb71e63b6a4c443a
2019-04-08 01:00:48 -04:00
John Sully
f11840f6b2 Merge branch 'unstable' of https://github.com/antirez/redis into unstable
Former-commit-id: 9322d604eea7b48df3feff47ce2c04f82291228f
2019-03-21 20:15:59 -04:00
Yossi Gottlieb
dd405d4026 Add RedisModule_GetKeyNameFromIO(). 2019-03-15 10:23:27 +02:00
John Sully
d0de467103 Implement loading database dumps from S3. We already support saving.
Former-commit-id: a45f212693956a6fb1aacf465d88e940bbbfd56f
2019-03-13 16:53:37 -04:00
John Sully
2fc7811a28 Support AWS S3 saving via the s3 cli tools
Former-commit-id: 23a91df9f65fd5ac84003d24a2ef612ea7aa940c
2019-02-06 01:06:48 -05:00
John Sully
d2aa6471e5 Merge branch 'unstable' of https://github.com/antirez/redis into unstable
Former-commit-id: d8741595aea1f07b0c5ffdf63a086df2ca4e6b1b
2019-02-06 00:09:39 -05:00
John Sully
b660bfb6dd Make main headers C++ safe, and change rdb to use a file descriptor instead of a FILE pointer
Former-commit-id: 3c9dd6ffc254d089e4208ad39da7338b6fb0fba7
2019-02-05 23:36:40 -05:00
Oran Agra
c40f1008a1 fix redis-rdb-check to provide proper arguments to rdbLoadMillisecondTime
Due to an incorrect forward declaration, it didn't provide all arguments.
This led to a random value being read from the stack and an incorrect time being
returned, which in this case doesn't matter since no one uses it.
2018-06-19 16:54:22 +03:00
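
A simplified, self-contained illustration of this bug class (names hypothetical, not the actual redis-rdb-check code): an old-style declaration with an empty parameter list disables argument checking, so a call site can silently omit an argument and the callee reads an indeterminate value:

    #include <stdio.h>

    long long loadTime();   /* empty parens: no prototype, no argument checking */

    int main(void) {
        /* The caller passes only one argument... */
        printf("%lld\n", loadTime((void *)0));
        return 0;
    }

    /* ...but the definition expects two; rdbver is never passed, so it
     * holds whatever garbage was in the register or stack slot. */
    long long loadTime(void *rdb, int rdbver) {
        (void)rdb;
        return rdbver;
    }
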
antirez
96c5bd2043 Don't expire keys while loading RDB from AOF preamble.
The AOF tail of a combined RDB+AOF is based on the premise of applying
the AOF commands to the exact state that the server was in while
the RDB was persisted. By expiring keys while loading the RDB file, we
change that state, so applying the AOF tail later may produce a different,
inconsistent state.

Test case:

* Time1: SET a 10
* Time2: EXPIREAT a $time5
* Time3: INCR a
* Time4: PERSIST a. Start bgrewriteaof with RDB preamble. The value of a is 11 without an expire time.
* Time5: Restart redis from the RDB+AOF: consistency violation.

Thanks to @soloestoy for providing the patch.
Thanks to @trevor211 for the original issue report and the initial fix.

Check issue #4950 for more info.
2018-05-29 12:37:42 +02:00
WuYunlong
4a80f44955 Fix RDB save by allowing dumping of expired keys, so that when
we add a new slave and do a failover, whether manual or
not, other local slaves will delete the expired keys properly.
2018-05-29 12:35:15 +02:00
antirez
8358c580a0 RDB version 9. 2018-03-16 13:48:44 +01:00
antirez
0f94a377b0 RDB: Implement future-proof module AUX data loading. 2018-03-16 13:47:10 +01:00
antirez
d4d246f296 RDB: Ability to save LFU/LRU info.
This is a big win for caching use cases, since on reloading Redis will
still have some idea about what is worth evicting and what is not.
However this only solves part of the problem, because the information is
only partially propagated to slaves (on write operations). Reads will
not affect slaves' LFU and LRU counters, so after a failover the eviction
decisions are somewhat random until keys start to collect some aging/frequency info.

However, since new slaves are initially populated via RDB file transfer,
this means that if we spin up a new slave from a master and perform an
immediate manual failover (for instance in order to upgrade the master),
the slave will have eviction information to use for some time.

The LFU/LRU info is persisted only if the maxmemory policy is set to one
of the relevant types, even if no actual "maxmemory" limit is
set.
2018-03-15 13:15:55 +01:00
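
A sketch of the save-side logic under these rules, with names close to (but not guaranteed to match) the actual rdb.c code: the idle time or frequency counter is written just before the key, and only when the configured policy can use it on reload:

    /* Inside the key/value save path, before writing the key itself. */
    if (server.maxmemory_policy & MAXMEMORY_FLAG_LRU) {
        uint64_t idle = estimateObjectIdleTime(o) / 1000; /* ms -> seconds */
        if (rdbSaveType(rdb, RDB_OPCODE_IDLE) == -1) return -1;
        if (rdbSaveLen(rdb, idle) == -1) return -1;
    } else if (server.maxmemory_policy & MAXMEMORY_FLAG_LFU) {
        uint8_t freq = LFUDecrAndReturn(o);   /* current access frequency */
        if (rdbSaveType(rdb, RDB_OPCODE_FREQ) == -1) return -1;
        if (rdbWriteRaw(rdb, &freq, 1) == -1) return -1;
    }
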
Oran Agra
5821e8cf82 fix processing of large bulks (above 2GB)
- protocol parsing (processMultibulkBuffer) was limited to 32-bit positions in the buffer,
  a potential overflow in readQueryFromClient
- rioWriteBulkCount used int, although rioWriteBulkString gave it size_t
- several places in sds.c used int for string length or index
- bugfix in RM_SaveAuxField (return value was 1 or -1 and not the length)
- RM_SaveStringBuffer was limited to 32-bit lengths
2017-12-29 12:24:19 +02:00
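
A small standalone demo of the overflow class fixed here: a length above 2GB stored in a signed 32-bit int wraps negative on typical platforms, breaking every size check computed from it:

    #include <stdio.h>
    #include <stddef.h>

    int main(void) {
        size_t bulklen = 3UL * 1024 * 1024 * 1024; /* a 3GB bulk string */
        int narrowed = (int)bulklen;               /* the old 'int' code path */
        printf("size_t: %zu  int: %d\n", bulklen, narrowed);
        /* On LP64 this prints a negative int: checks like (narrowed > 0)
         * and buffer positions derived from it then misbehave. */
        return 0;
    }
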
antirez
27fd469a8d Streams: assign value of 6 to OBJ_STREAM + some refactoring. 2017-12-01 10:24:24 +01:00
antirez
1850ff43f4 Streams: 12 commits squashed into the initial Streams implementation. 2017-12-01 10:24:24 +01:00
antirez
af78ec8ccf PSYNC2: Fix the way replication info is saved/loaded from RDB.
This commit attempts to fix a number of bugs reported in #4316.
They are related to the way replication info, like the replication ID,
offsets, and the currently selected DB in the master client, is stored
and loaded by Redis. In order to avoid inconsistencies, the changes in
this commit try to enforce that:

1. Replication information is only stored when the RDB file is
generated by a slave that has a valid 'master' client, so that we can
always extract the currently selected DB.
2. When replication information is persisted in the RDB file, all the
info for a successful PSYNC is persisted, or nothing at all.
3. The RDB replication information is only loaded if the instance is
configured as a slave, otherwise a master could start with IDs that relate
to a different history of the data set, and still retain such IDs in the
future while receiving unrelated writes.
2017-09-19 23:03:39 +02:00
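
A sketch of the save side of rule 2, assuming helper names as in rdb.c (rdbSaveAuxFieldStrStr/rdbSaveAuxFieldStrInt): either every field needed for a successful PSYNC is written, or none is:

    /* rsi is non-NULL only when consistent replication info is available. */
    if (rsi) {
        if (rdbSaveAuxFieldStrInt(rdb, "repl-stream-db", rsi->repl_stream_db) == -1)
            return -1;
        if (rdbSaveAuxFieldStrStr(rdb, "repl-id", server.replid) == -1)
            return -1;
        if (rdbSaveAuxFieldStrInt(rdb, "repl-offset", server.master_repl_offset) == -1)
            return -1;
    }
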
antirez
2a26f2a9f6 RDB modules values serialization format version 2.
The original RDB serialization format was not parsable without the
module loaded, because the structure was managed only by the module
itself. Moreover RDB is a streaming protocol in the sense that it is
both produced in an append-only fashion, and is also sometimes directly
sent to the socket (in the case of diskless replication).

The fact that modules values cannot be parsed without the relevant
module loaded is a problem in many ways: RDB checking tools must have
loaded modules even for doing things not involving the value at all,
like splitting an RDB into N RDBs by key or alike, or just checking the
RDB for sanity.

In theory module values could be just a blob of data with a prefixed
length in order for us to be able to skip it. However prefixing the values
with a length would mean one of the following:

1. Being able to write some data at a previous offset. This breaks
streaming.
2. Buffering values before outputting them. This hurts performance.
3. Having some chunked RDB output format. This breaks simplicity.

Moreover, the above solution still makes module values a totally opaque
matter, with the following problems:

1. The RDB check tool can just skip the value without being able to at
least check the general structure. For datasets composed mostly of
module values this means checking only the outer level of the RDB, without
actually doing any check on most of the data itself.
2. It is not possible to do any recovery or processing of data for which a
module no longer exists in the future, or is unknown.

So this commit implements a different solution. The modules RDB
serialization API is composed of well defined calls to store integers,
floats, doubles or strings. After this commit, the parts generated by
the module API have a one-byte prefix for each of the above emitted
parts, and there is a final EOF byte as well. So even if we don't know
exactly how to interpret a module value, we can always parse it at a
high level, check the overall structure, understand the types used to
store the information, and easily skip the whole value.

The change is backward compatible: older RDB files can still be loaded
since the new encoding has a new RDB type: MODULE_2 (of value 7).
The commit also implements the ability to check RDB files for sanity,
taking advantage of the new feature.
2017-06-27 13:19:16 +02:00
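
A sketch of the high-level skipping this enables, with opcode names as in rdb.h (EOF=0, signed/unsigned int, float, double, string); close in spirit to the check added by this commit, not a verbatim copy:

    int rdbSkipModuleValue(rio *rdb) {
        while (1) {
            uint64_t opcode = rdbLoadLen(rdb, NULL);
            if (opcode == RDB_MODULE_OPCODE_EOF) return C_OK;
            switch (opcode) {
            case RDB_MODULE_OPCODE_SINT:
            case RDB_MODULE_OPCODE_UINT:
                rdbLoadLen(rdb, NULL);               /* integer payload */
                break;
            case RDB_MODULE_OPCODE_FLOAT:  { float f;  rioRead(rdb, &f, 4); break; }
            case RDB_MODULE_OPCODE_DOUBLE: { double d; rioRead(rdb, &d, 8); break; }
            case RDB_MODULE_OPCODE_STRING: {
                robj *s = rdbLoadStringObject(rdb);  /* load, then discard */
                if (s) decrRefCount(s);
                break;
            }
            default: return C_ERR;                   /* unknown part: corrupt */
            }
        }
    }
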
antirez
f53ed7a969 PSYNC2: different improvements to Redis replication.
The gist of the changes is that now partial resynchronizations between
slaves and masters (without the need of a full resync with RDB transfer
and so forth) work in a number of cases where they were impossible
in the past. For instance:

1. When a slave is promoted to master, the slaves of the old master can
partially resynchronize with the new master.

2. Chained slaves (slaves of slaves) can be moved to replicate to other
slaves or the master itself, without requiring a full resync.

3. The master itself, after being turned into a slave, is able to
partially resynchronize with the new master, when it joins replication
again.

In order to obtain this, the following main changes were made:

* Slaves also take a replication backlog, not just masters.

* Same stream replication for all the slaves and sub-slaves. The
replication stream is identical from the top level master to its slaves,
and is also the same from the slaves to their sub-slaves and so forth.
This means that if a slave is later promoted to master, it has the
same replication backlog, and can partially resynchronize with its
slaves (that were previously slaves of the old master).

* A given replication history is no longer identified by the `runid` of
a Redis node. There is instead a `replication ID` which changes every
time the instance has a new history no longer coherent with the past
one. So, for example, slaves publish the same replication history of
their master, however when they are turned into masters, they publish
a new replication ID, but still remember the old ID, so that they are
able to partially resynchronize with slaves of the old master (up to a
given offset).

* The replication protocol was slightly modified so that a new extended
+CONTINUE reply from the master is able to inform the slave of a
replication ID change.

* REPLCONF CAPA is used in order to notify masters that a slave is able
to understand the new +CONTINUE reply.

* The RDB file was extended with an auxiliary field that is able to
select a given DB after loading in the slave, so that the slave can
continue receiving the replication stream from the point it was
disconnected without requiring the master to insert "SELECT" statements.
This is useful in order to guarantee the "same stream" property, because
the slave must be able to accumulate an identical backlog.

* Slave pings to sub-slaves are now sent in a special form, when the
top-level master is disconnected, in order not to interfere with the
replication stream. We just use out of band "\n" bytes as in other parts
of the Redis protocol.

An old design document is available here:

https://gist.github.com/antirez/ae068f95c0d084891305

However the implementation is not identical to the description because,
during the work to implement it, different changes were needed in order
to make things work well.
2016-11-09 15:37:15 +01:00
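
A simplified sketch of the replica-side handling of the extended reply (the real logic lives in replication.c's partial-resync path; buffer handling here is illustrative): a bare +CONTINUE keeps the current ID, while +CONTINUE <replid> announces an ID change to adopt before resuming the stream:

    if (!strncmp(reply, "+CONTINUE", 9)) {
        char *start = strchr(reply, ' ');
        if (start) {
            /* Master switched history: keep the old ID around for our own
             * sub-slaves, then adopt the new one. */
            memcpy(server.replid2, server.replid, sizeof(server.replid));
            memcpy(server.replid, start + 1, CONFIG_RUN_ID_SIZE);
            server.replid[CONFIG_RUN_ID_SIZE] = '\0';
        }
        /* Resume accumulating the replication stream from the saved offset. */
    }
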
antirez
adcafbd283 Modules: API to save/load single precision floating point numbers.
When double precision is not needed, taking 2x the space in the
serialization is wasteful.
2016-10-03 00:08:35 +02:00
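
A sketch of the new calls in a module type's rdb callbacks (RedisModule_SaveFloat/RedisModule_LoadFloat; the point type is hypothetical):

    struct MyPoint { float x, y; };

    static void MyTypeRdbSave(RedisModuleIO *io, void *value) {
        struct MyPoint *pt = value;
        RedisModule_SaveFloat(io, pt->x);   /* 4 bytes on disk instead of 8 */
        RedisModule_SaveFloat(io, pt->y);
    }

    static void *MyTypeRdbLoad(RedisModuleIO *io, int encver) {
        (void)encver;
        struct MyPoint *pt = RedisModule_Alloc(sizeof(*pt));
        pt->x = RedisModule_LoadFloat(io);
        pt->y = RedisModule_LoadFloat(io);
        return pt;
    }
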
antirez
86a66265aa RDB AOF preamble: WIP 4 (Mixed RDB/AOF loading). 2016-08-11 15:42:28 +02:00
antirez
805f8c2c90 RDB AOF preamble: WIP 1. 2016-08-09 11:07:32 +02:00
antirez
9deb98167b Modules: support for modules native data types. 2016-06-03 18:14:04 +02:00
antirez
4e37d7d2c8 RDB v8: fix rdbLoadLen() return value. 2016-06-01 20:18:28 +02:00
antirez
fb9173a888 RDB v8: new ZSET storage format with binary doubles. 2016-06-01 12:12:26 +02:00
antirez
8bfdd07667 RDB v8: ability to save uint64_t lengths. 2016-06-01 11:35:47 +02:00
antirez
c15cac0d77 RDMF: More consistent define names. 2015-07-27 14:37:58 +02:00
antirez
6a424b5e36 RDMF (Redis/Disque merge friendlyness) refactoring WIP 1. 2015-07-26 15:17:18 +02:00
Matt Stancliff
0c611363e5 Improve RDB type correctness
It's possible large objects could be larger than 'int', so let's
upgrade all size counters to ssize_t.

This also fixes the rdbSaveObject serialized bytes calculation.
Entire serializations of data structures can be large,
so we don't want to limit their calculated size to a 32-bit signed max.

This commit increases object size calculation and
cascades the change back up to serializedlength printing.

Before:
127.0.0.1:6379> debug object hihihi
... encoding:quicklist serializedlength:-2147483559 ...

After:
127.0.0.1:6379> debug object hihihi
... encoding:quicklist serializedlength:2147483737 ...
2015-01-19 14:10:12 -05:00
antirez
d93e29bea0 RDB AUX fields support.
This commit introduces a new RDB data type called 'aux'. It is used in
order to insert inside an RDB file key-value pairs that may serve
different needs, without breaking backward compatibility when new
information is embedded inside an RDB file. The contract between Redis
versions is to ignore unknown aux fields when encountered.

Aux fields can be used in order to:

1. Augment the RDB file with info like the version of Redis that created the
RDB file, creation time, used memory while the RDB was created, and so
forth.
2. Add state about Redis inside the RDB file that we need to reload
later: replication offset, previous master run ID, in order to improve
failover safety and allow partial resynchronization after a slave
restart.
3. Anything that we may want to add to RDB files without breaking the
ability of past versions of Redis to load the file.
2015-01-08 09:52:55 +01:00
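
A sketch of how such a field is emitted (modeled on rdb.c's aux helpers; the wrapper name here is illustrative): the AUX opcode followed by two length-prefixed strings, which old readers simply skip:

    ssize_t saveAuxFieldDemo(rio *rdb, char *key, char *val) {
        if (rdbSaveType(rdb, RDB_OPCODE_AUX) == -1) return -1;
        if (rdbSaveRawString(rdb, (unsigned char *)key, strlen(key)) == -1) return -1;
        if (rdbSaveRawString(rdb, (unsigned char *)val, strlen(val)) == -1) return -1;
        return 1;
    }
    /* e.g. saveAuxFieldDemo(rdb, "redis-ver", REDIS_VERSION); */
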
antirez
4a56ebe7dd New RDB v7 opcode: RESIZEDB.
The new opcode is a hint about the size of the dataset (keys and number
of expires) we are going to load for a given Redis database inside the
RDB file. Since hash tables are resized accordingly ASAP, useless
rehashing is avoided, speeding up load times significantly, on the order
of ~20% or more for larger datasets.

Related issue: #1719
2015-01-08 09:52:47 +01:00
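
A sketch of the load-side handling inside the main RDB load loop (close to the actual rdb.c logic): read the two size hints and pre-size both hash tables so loading never triggers incremental rehashing:

    case RDB_OPCODE_RESIZEDB: {
        uint64_t db_size, expires_size;
        if ((db_size = rdbLoadLen(rdb, NULL)) == RDB_LENERR) goto eoferr;
        if ((expires_size = rdbLoadLen(rdb, NULL)) == RDB_LENERR) goto eoferr;
        dictExpand(db->dict, db_size);          /* main keyspace */
        dictExpand(db->expires, expires_size);  /* keys with a TTL */
        continue; /* back to reading the next opcode */
    }
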
Matt Stancliff
1dfe1cea49 Convert quicklist RDB to store ziplist nodes
Turns out it's a huge improvement during save/reload/migrate/restore
because, with compression enabled, we're compressing 4k or 8k
chunks of data consisting of multiple elements in one ziplist,
instead of compressing a series of smaller individual elements.
2015-01-02 11:16:09 -05:00
antirez
1900d091d7 Diskless replication: RDB -> slaves transfer draft implementation. 2014-10-14 10:11:29 +02:00
guiquanz
df7a5b7157 Fixed many typos. 2013-01-19 10:59:44 +01:00
antirez
a32d1ddff6 BSD license added to every C source and header file. 2012-11-08 18:31:32 +01:00
antirez
52d954912e No longer used macro rdbIsOpcode() removed. 2012-10-30 19:10:46 +01:00
Alex Mitrofanov
33700ab2a7 Fixed RESTORE hash failure (Issue #532)
(additional commit notes by antirez@gmail.com):

The rdbIsObjectType() macro was not updated when the new RDB object type
of ziplist encoded hashes was added.

As a result RESTORE, which uses rdbLoadObjectType(), failed when a
ziplist encoded hash was loaded.
This did not affect normal RDB loading because in that case the
lower-level function rdbLoadType() is used.

The commit also adds a regression test.
2012-06-02 10:24:27 +02:00
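
An illustration of the nature of the fix (ranges approximately as in rdb.h of that era): the validity macro's upper bound had to grow to admit type 13, the ziplist-encoded hash:

    /* before -- RESTORE rejected the new type 13:
     * #define rdbIsObjectType(t) ((t >= 0 && t <= 4) || (t >= 9 && t <= 12))
     */
    /* after: */
    #define rdbIsObjectType(t) ((t >= 0 && t <= 4) || (t >= 9 && t <= 13))
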
Grisha Trubetskoy
90e3a29f59 Add a 24-bit integer to ziplists to save one byte for ints that can
fit in 24 bits (thanks to antirez for catching and solving the two's complement
bug).

Increment REDIS_RDB_VERSION to 6
2012-04-24 12:02:19 +02:00
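
A standalone demo of the two's-complement subtlety the note refers to: the top bit of the third byte is the sign and must be extended into the high byte when widening to 32 bits:

    #include <stdio.h>
    #include <stdint.h>

    /* Widen a little-endian 24-bit two's-complement value to int32_t. */
    static int32_t load24(const uint8_t *p) {
        uint32_t u = (uint32_t)p[0] | ((uint32_t)p[1] << 8) | ((uint32_t)p[2] << 16);
        if (u & 0x800000u) u |= 0xFF000000u;   /* extend the sign bit */
        return (int32_t)u;
    }

    int main(void) {
        uint8_t neg[3] = {0xFF, 0xFF, 0xFF};   /* -1 in 24 bits */
        uint8_t pos[3] = {0x01, 0x00, 0x00};   /* +1 */
        printf("%d %d\n", load24(neg), load24(pos));  /* prints: -1 1 */
        return 0;
    }
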