futriix/src/aof.c

2558 lines
92 KiB
C
Raw Normal View History

/*
* Copyright (c) 2009-2012, Salvatore Sanfilippo <antirez at gmail dot com>
* All rights reserved.
*
* Redistribution and use in source and binary forms, with or without
* modification, are permitted provided that the following conditions are met:
*
* * Redistributions of source code must retain the above copyright notice,
* this list of conditions and the following disclaimer.
* * Redistributions in binary form must reproduce the above copyright
* notice, this list of conditions and the following disclaimer in the
* documentation and/or other materials provided with the distribution.
* * Neither the name of Redis nor the names of its contributors may be used
* to endorse or promote products derived from this software without
* specific prior written permission.
*
* THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
* AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
* IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
* ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE
* LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
* CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
* SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
* INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
* CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
* ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
* POSSIBILITY OF SUCH DAMAGE.
*/
#include "server.h"
#include "bio.h"
2011-09-22 15:15:26 +02:00
#include "rio.h"
#include "functions.h"
#include <signal.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <sys/time.h>
#include <sys/resource.h>
#include <sys/wait.h>
#include <sys/param.h>
void freeClientArgv(client *c);
Implement Multi Part AOF mechanism to avoid AOFRW overheads. (#9788) Implement Multi-Part AOF mechanism to avoid overheads during AOFRW. Introducing a folder with multiple AOF files tracked by a manifest file. The main issues with the the original AOFRW mechanism are: * buffering of commands that are processed during rewrite (consuming a lot of RAM) * freezes of the main process when the AOFRW completes to drain the remaining part of the buffer and fsync it. * double disk IO for the data that arrives during AOFRW (had to be written to both the old and new AOF files) The main modifications of this PR: 1. Remove the AOF rewrite buffer and related code. 2. Divide the AOF into multiple files, they are classified as two types, one is the the `BASE` type, it represents the full amount of data (Maybe AOF or RDB format) after each AOFRW, there is only one `BASE` file at most. The second is `INCR` type, may have more than one. They represent the incremental commands since the last AOFRW. 3. Use a AOF manifest file to record and manage these AOF files mentioned above. 4. The original configuration of `appendfilename` will be the base part of the new file name, for example: `appendonly.aof.1.base.rdb` and `appendonly.aof.2.incr.aof` 5. Add manifest-related TCL tests, and modified some existing tests that depend on the `appendfilename` 6. Remove the `aof_rewrite_buffer_length` field in info. 7. Add `aof-disable-auto-gc` configuration. By default we're automatically deleting HISTORY type AOFs. It also gives users the opportunity to preserve the history AOFs. just for testing use now. 8. Add AOFRW limiting measure. When the AOFRW failures reaches the threshold (3 times now), we will delay the execution of the next AOFRW by 1 minute. If the next AOFRW also fails, it will be delayed by 2 minutes. The next is 4, 8, 16, the maximum delay is 60 minutes (1 hour). During the limit period, we can still use the 'bgrewriteaof' command to execute AOFRW immediately. 9. Support upgrade (load) data from old version redis. 10. Add `appenddirname` configuration, as the directory name of the append only files. All AOF files and manifest file will be placed in this directory. 11. Only the last AOF file (BASE or INCR) can be truncated. Otherwise redis will exit even if `aof-load-truncated` is enabled. Co-authored-by: Oran Agra <oran@redislabs.com>
2022-01-04 01:14:13 +08:00
off_t getAppendOnlyFileSize(sds filename);
off_t getBaseAndIncrAppendOnlyFilesSize(aofManifest *am);
int getBaseAndIncrAppendOnlyFilesNum(aofManifest *am);
int aofFileExist(char *filename);
int rewriteAppendOnlyFile(char *filename);
aofManifest *aofLoadManifestFromFile(sds am_filepath);
void aofManifestFreeAndUpdate(aofManifest *am);
Allow an AOF rewrite buffer > 2GB (Fix for issue #504). During the AOF rewrite process, the parent process needs to accumulate the new writes in an in-memory buffer: when the child will terminate the AOF rewriting process this buffer (that ist the difference between the dataset when the rewrite was started, and the current dataset) is flushed to the new AOF file. We used to implement this buffer using an sds.c string, but sds.c has a 2GB limit. Sometimes the dataset can be big enough, the amount of writes so high, and the rewrite process slow enough that we overflow the 2GB limit, causing a crash, documented on github by issue #504. In order to prevent this from happening, this commit introduces a new system to accumulate writes, implemented by a linked list of blocks of 10 MB each, so that we also avoid paying the reallocation cost. Note that theoretically modern operating systems may implement realloc() simply as a remaping of the old pages, thus with very good performances, see for instance the mremap() syscall on Linux. However this is not always true, and jemalloc by default avoids doing this because there are issues with the current implementation of mremap(). For this reason we are using a linked list of blocks instead of a single block that gets reallocated again and again. The changes in this commit lacks testing, that will be performed before merging into the unstable branch. This fix will not enter 2.4 because it is too invasive. However 2.4 will log a warning when the AOF rewrite buffer is near to the 2GB limit.
2012-05-22 13:03:41 +02:00
/* ----------------------------------------------------------------------------
Implement Multi Part AOF mechanism to avoid AOFRW overheads. (#9788) Implement Multi-Part AOF mechanism to avoid overheads during AOFRW. Introducing a folder with multiple AOF files tracked by a manifest file. The main issues with the the original AOFRW mechanism are: * buffering of commands that are processed during rewrite (consuming a lot of RAM) * freezes of the main process when the AOFRW completes to drain the remaining part of the buffer and fsync it. * double disk IO for the data that arrives during AOFRW (had to be written to both the old and new AOF files) The main modifications of this PR: 1. Remove the AOF rewrite buffer and related code. 2. Divide the AOF into multiple files, they are classified as two types, one is the the `BASE` type, it represents the full amount of data (Maybe AOF or RDB format) after each AOFRW, there is only one `BASE` file at most. The second is `INCR` type, may have more than one. They represent the incremental commands since the last AOFRW. 3. Use a AOF manifest file to record and manage these AOF files mentioned above. 4. The original configuration of `appendfilename` will be the base part of the new file name, for example: `appendonly.aof.1.base.rdb` and `appendonly.aof.2.incr.aof` 5. Add manifest-related TCL tests, and modified some existing tests that depend on the `appendfilename` 6. Remove the `aof_rewrite_buffer_length` field in info. 7. Add `aof-disable-auto-gc` configuration. By default we're automatically deleting HISTORY type AOFs. It also gives users the opportunity to preserve the history AOFs. just for testing use now. 8. Add AOFRW limiting measure. When the AOFRW failures reaches the threshold (3 times now), we will delay the execution of the next AOFRW by 1 minute. If the next AOFRW also fails, it will be delayed by 2 minutes. The next is 4, 8, 16, the maximum delay is 60 minutes (1 hour). During the limit period, we can still use the 'bgrewriteaof' command to execute AOFRW immediately. 9. Support upgrade (load) data from old version redis. 10. Add `appenddirname` configuration, as the directory name of the append only files. All AOF files and manifest file will be placed in this directory. 11. Only the last AOF file (BASE or INCR) can be truncated. Otherwise redis will exit even if `aof-load-truncated` is enabled. Co-authored-by: Oran Agra <oran@redislabs.com>
2022-01-04 01:14:13 +08:00
* AOF Manifest file implementation.
Allow an AOF rewrite buffer > 2GB (Fix for issue #504). During the AOF rewrite process, the parent process needs to accumulate the new writes in an in-memory buffer: when the child will terminate the AOF rewriting process this buffer (that ist the difference between the dataset when the rewrite was started, and the current dataset) is flushed to the new AOF file. We used to implement this buffer using an sds.c string, but sds.c has a 2GB limit. Sometimes the dataset can be big enough, the amount of writes so high, and the rewrite process slow enough that we overflow the 2GB limit, causing a crash, documented on github by issue #504. In order to prevent this from happening, this commit introduces a new system to accumulate writes, implemented by a linked list of blocks of 10 MB each, so that we also avoid paying the reallocation cost. Note that theoretically modern operating systems may implement realloc() simply as a remaping of the old pages, thus with very good performances, see for instance the mremap() syscall on Linux. However this is not always true, and jemalloc by default avoids doing this because there are issues with the current implementation of mremap(). For this reason we are using a linked list of blocks instead of a single block that gets reallocated again and again. The changes in this commit lacks testing, that will be performed before merging into the unstable branch. This fix will not enter 2.4 because it is too invasive. However 2.4 will log a warning when the AOF rewrite buffer is near to the 2GB limit.
2012-05-22 13:03:41 +02:00
*
Implement Multi Part AOF mechanism to avoid AOFRW overheads. (#9788) Implement Multi-Part AOF mechanism to avoid overheads during AOFRW. Introducing a folder with multiple AOF files tracked by a manifest file. The main issues with the the original AOFRW mechanism are: * buffering of commands that are processed during rewrite (consuming a lot of RAM) * freezes of the main process when the AOFRW completes to drain the remaining part of the buffer and fsync it. * double disk IO for the data that arrives during AOFRW (had to be written to both the old and new AOF files) The main modifications of this PR: 1. Remove the AOF rewrite buffer and related code. 2. Divide the AOF into multiple files, they are classified as two types, one is the the `BASE` type, it represents the full amount of data (Maybe AOF or RDB format) after each AOFRW, there is only one `BASE` file at most. The second is `INCR` type, may have more than one. They represent the incremental commands since the last AOFRW. 3. Use a AOF manifest file to record and manage these AOF files mentioned above. 4. The original configuration of `appendfilename` will be the base part of the new file name, for example: `appendonly.aof.1.base.rdb` and `appendonly.aof.2.incr.aof` 5. Add manifest-related TCL tests, and modified some existing tests that depend on the `appendfilename` 6. Remove the `aof_rewrite_buffer_length` field in info. 7. Add `aof-disable-auto-gc` configuration. By default we're automatically deleting HISTORY type AOFs. It also gives users the opportunity to preserve the history AOFs. just for testing use now. 8. Add AOFRW limiting measure. When the AOFRW failures reaches the threshold (3 times now), we will delay the execution of the next AOFRW by 1 minute. If the next AOFRW also fails, it will be delayed by 2 minutes. The next is 4, 8, 16, the maximum delay is 60 minutes (1 hour). During the limit period, we can still use the 'bgrewriteaof' command to execute AOFRW immediately. 9. Support upgrade (load) data from old version redis. 10. Add `appenddirname` configuration, as the directory name of the append only files. All AOF files and manifest file will be placed in this directory. 11. Only the last AOF file (BASE or INCR) can be truncated. Otherwise redis will exit even if `aof-load-truncated` is enabled. Co-authored-by: Oran Agra <oran@redislabs.com>
2022-01-04 01:14:13 +08:00
* The following code implements the read/write logic of AOF manifest file, which
* is used to track and manage all AOF files.
Allow an AOF rewrite buffer > 2GB (Fix for issue #504). During the AOF rewrite process, the parent process needs to accumulate the new writes in an in-memory buffer: when the child will terminate the AOF rewriting process this buffer (that ist the difference between the dataset when the rewrite was started, and the current dataset) is flushed to the new AOF file. We used to implement this buffer using an sds.c string, but sds.c has a 2GB limit. Sometimes the dataset can be big enough, the amount of writes so high, and the rewrite process slow enough that we overflow the 2GB limit, causing a crash, documented on github by issue #504. In order to prevent this from happening, this commit introduces a new system to accumulate writes, implemented by a linked list of blocks of 10 MB each, so that we also avoid paying the reallocation cost. Note that theoretically modern operating systems may implement realloc() simply as a remaping of the old pages, thus with very good performances, see for instance the mremap() syscall on Linux. However this is not always true, and jemalloc by default avoids doing this because there are issues with the current implementation of mremap(). For this reason we are using a linked list of blocks instead of a single block that gets reallocated again and again. The changes in this commit lacks testing, that will be performed before merging into the unstable branch. This fix will not enter 2.4 because it is too invasive. However 2.4 will log a warning when the AOF rewrite buffer is near to the 2GB limit.
2012-05-22 13:03:41 +02:00
*
Implement Multi Part AOF mechanism to avoid AOFRW overheads. (#9788) Implement Multi-Part AOF mechanism to avoid overheads during AOFRW. Introducing a folder with multiple AOF files tracked by a manifest file. The main issues with the the original AOFRW mechanism are: * buffering of commands that are processed during rewrite (consuming a lot of RAM) * freezes of the main process when the AOFRW completes to drain the remaining part of the buffer and fsync it. * double disk IO for the data that arrives during AOFRW (had to be written to both the old and new AOF files) The main modifications of this PR: 1. Remove the AOF rewrite buffer and related code. 2. Divide the AOF into multiple files, they are classified as two types, one is the the `BASE` type, it represents the full amount of data (Maybe AOF or RDB format) after each AOFRW, there is only one `BASE` file at most. The second is `INCR` type, may have more than one. They represent the incremental commands since the last AOFRW. 3. Use a AOF manifest file to record and manage these AOF files mentioned above. 4. The original configuration of `appendfilename` will be the base part of the new file name, for example: `appendonly.aof.1.base.rdb` and `appendonly.aof.2.incr.aof` 5. Add manifest-related TCL tests, and modified some existing tests that depend on the `appendfilename` 6. Remove the `aof_rewrite_buffer_length` field in info. 7. Add `aof-disable-auto-gc` configuration. By default we're automatically deleting HISTORY type AOFs. It also gives users the opportunity to preserve the history AOFs. just for testing use now. 8. Add AOFRW limiting measure. When the AOFRW failures reaches the threshold (3 times now), we will delay the execution of the next AOFRW by 1 minute. If the next AOFRW also fails, it will be delayed by 2 minutes. The next is 4, 8, 16, the maximum delay is 60 minutes (1 hour). During the limit period, we can still use the 'bgrewriteaof' command to execute AOFRW immediately. 9. Support upgrade (load) data from old version redis. 10. Add `appenddirname` configuration, as the directory name of the append only files. All AOF files and manifest file will be placed in this directory. 11. Only the last AOF file (BASE or INCR) can be truncated. Otherwise redis will exit even if `aof-load-truncated` is enabled. Co-authored-by: Oran Agra <oran@redislabs.com>
2022-01-04 01:14:13 +08:00
* Append-only files consist of three types:
Allow an AOF rewrite buffer > 2GB (Fix for issue #504). During the AOF rewrite process, the parent process needs to accumulate the new writes in an in-memory buffer: when the child will terminate the AOF rewriting process this buffer (that ist the difference between the dataset when the rewrite was started, and the current dataset) is flushed to the new AOF file. We used to implement this buffer using an sds.c string, but sds.c has a 2GB limit. Sometimes the dataset can be big enough, the amount of writes so high, and the rewrite process slow enough that we overflow the 2GB limit, causing a crash, documented on github by issue #504. In order to prevent this from happening, this commit introduces a new system to accumulate writes, implemented by a linked list of blocks of 10 MB each, so that we also avoid paying the reallocation cost. Note that theoretically modern operating systems may implement realloc() simply as a remaping of the old pages, thus with very good performances, see for instance the mremap() syscall on Linux. However this is not always true, and jemalloc by default avoids doing this because there are issues with the current implementation of mremap(). For this reason we are using a linked list of blocks instead of a single block that gets reallocated again and again. The changes in this commit lacks testing, that will be performed before merging into the unstable branch. This fix will not enter 2.4 because it is too invasive. However 2.4 will log a warning when the AOF rewrite buffer is near to the 2GB limit.
2012-05-22 13:03:41 +02:00
*
Implement Multi Part AOF mechanism to avoid AOFRW overheads. (#9788) Implement Multi-Part AOF mechanism to avoid overheads during AOFRW. Introducing a folder with multiple AOF files tracked by a manifest file. The main issues with the the original AOFRW mechanism are: * buffering of commands that are processed during rewrite (consuming a lot of RAM) * freezes of the main process when the AOFRW completes to drain the remaining part of the buffer and fsync it. * double disk IO for the data that arrives during AOFRW (had to be written to both the old and new AOF files) The main modifications of this PR: 1. Remove the AOF rewrite buffer and related code. 2. Divide the AOF into multiple files, they are classified as two types, one is the the `BASE` type, it represents the full amount of data (Maybe AOF or RDB format) after each AOFRW, there is only one `BASE` file at most. The second is `INCR` type, may have more than one. They represent the incremental commands since the last AOFRW. 3. Use a AOF manifest file to record and manage these AOF files mentioned above. 4. The original configuration of `appendfilename` will be the base part of the new file name, for example: `appendonly.aof.1.base.rdb` and `appendonly.aof.2.incr.aof` 5. Add manifest-related TCL tests, and modified some existing tests that depend on the `appendfilename` 6. Remove the `aof_rewrite_buffer_length` field in info. 7. Add `aof-disable-auto-gc` configuration. By default we're automatically deleting HISTORY type AOFs. It also gives users the opportunity to preserve the history AOFs. just for testing use now. 8. Add AOFRW limiting measure. When the AOFRW failures reaches the threshold (3 times now), we will delay the execution of the next AOFRW by 1 minute. If the next AOFRW also fails, it will be delayed by 2 minutes. The next is 4, 8, 16, the maximum delay is 60 minutes (1 hour). During the limit period, we can still use the 'bgrewriteaof' command to execute AOFRW immediately. 9. Support upgrade (load) data from old version redis. 10. Add `appenddirname` configuration, as the directory name of the append only files. All AOF files and manifest file will be placed in this directory. 11. Only the last AOF file (BASE or INCR) can be truncated. Otherwise redis will exit even if `aof-load-truncated` is enabled. Co-authored-by: Oran Agra <oran@redislabs.com>
2022-01-04 01:14:13 +08:00
* BASE: Represents a Redis snapshot from the time of last AOF rewrite. The manifest
* file contains at most a single BASE file, which will always be the first file in the
* list.
*
* INCR: Represents all write commands executed by Redis following the last successful
* AOF rewrite. In some cases it is possible to have several ordered INCR files. For
* example:
* - During an on-going AOF rewrite
* - After an AOF rewrite was aborted/failed, and before the next one succeeded.
*
* HISTORY: After a successful rewrite, the previous BASE and INCR become HISTORY files.
* They will be automatically removed unless garbage collection is disabled.
*
* The following is a possible AOF manifest file content:
*
* file appendonly.aof.2.base.rdb seq 2 type b
* file appendonly.aof.1.incr.aof seq 1 type h
* file appendonly.aof.2.incr.aof seq 2 type h
* file appendonly.aof.3.incr.aof seq 3 type h
* file appendonly.aof.4.incr.aof seq 4 type i
* file appendonly.aof.5.incr.aof seq 5 type i
Allow an AOF rewrite buffer > 2GB (Fix for issue #504). During the AOF rewrite process, the parent process needs to accumulate the new writes in an in-memory buffer: when the child will terminate the AOF rewriting process this buffer (that ist the difference between the dataset when the rewrite was started, and the current dataset) is flushed to the new AOF file. We used to implement this buffer using an sds.c string, but sds.c has a 2GB limit. Sometimes the dataset can be big enough, the amount of writes so high, and the rewrite process slow enough that we overflow the 2GB limit, causing a crash, documented on github by issue #504. In order to prevent this from happening, this commit introduces a new system to accumulate writes, implemented by a linked list of blocks of 10 MB each, so that we also avoid paying the reallocation cost. Note that theoretically modern operating systems may implement realloc() simply as a remaping of the old pages, thus with very good performances, see for instance the mremap() syscall on Linux. However this is not always true, and jemalloc by default avoids doing this because there are issues with the current implementation of mremap(). For this reason we are using a linked list of blocks instead of a single block that gets reallocated again and again. The changes in this commit lacks testing, that will be performed before merging into the unstable branch. This fix will not enter 2.4 because it is too invasive. However 2.4 will log a warning when the AOF rewrite buffer is near to the 2GB limit.
2012-05-22 13:03:41 +02:00
* ------------------------------------------------------------------------- */
Implement Multi Part AOF mechanism to avoid AOFRW overheads. (#9788) Implement Multi-Part AOF mechanism to avoid overheads during AOFRW. Introducing a folder with multiple AOF files tracked by a manifest file. The main issues with the the original AOFRW mechanism are: * buffering of commands that are processed during rewrite (consuming a lot of RAM) * freezes of the main process when the AOFRW completes to drain the remaining part of the buffer and fsync it. * double disk IO for the data that arrives during AOFRW (had to be written to both the old and new AOF files) The main modifications of this PR: 1. Remove the AOF rewrite buffer and related code. 2. Divide the AOF into multiple files, they are classified as two types, one is the the `BASE` type, it represents the full amount of data (Maybe AOF or RDB format) after each AOFRW, there is only one `BASE` file at most. The second is `INCR` type, may have more than one. They represent the incremental commands since the last AOFRW. 3. Use a AOF manifest file to record and manage these AOF files mentioned above. 4. The original configuration of `appendfilename` will be the base part of the new file name, for example: `appendonly.aof.1.base.rdb` and `appendonly.aof.2.incr.aof` 5. Add manifest-related TCL tests, and modified some existing tests that depend on the `appendfilename` 6. Remove the `aof_rewrite_buffer_length` field in info. 7. Add `aof-disable-auto-gc` configuration. By default we're automatically deleting HISTORY type AOFs. It also gives users the opportunity to preserve the history AOFs. just for testing use now. 8. Add AOFRW limiting measure. When the AOFRW failures reaches the threshold (3 times now), we will delay the execution of the next AOFRW by 1 minute. If the next AOFRW also fails, it will be delayed by 2 minutes. The next is 4, 8, 16, the maximum delay is 60 minutes (1 hour). During the limit period, we can still use the 'bgrewriteaof' command to execute AOFRW immediately. 9. Support upgrade (load) data from old version redis. 10. Add `appenddirname` configuration, as the directory name of the append only files. All AOF files and manifest file will be placed in this directory. 11. Only the last AOF file (BASE or INCR) can be truncated. Otherwise redis will exit even if `aof-load-truncated` is enabled. Co-authored-by: Oran Agra <oran@redislabs.com>
2022-01-04 01:14:13 +08:00
/* Naming rules. */
#define BASE_FILE_SUFFIX ".base"
#define INCR_FILE_SUFFIX ".incr"
#define RDB_FORMAT_SUFFIX ".rdb"
#define AOF_FORMAT_SUFFIX ".aof"
#define MANIFEST_NAME_SUFFIX ".manifest"
#define MANIFEST_TEMP_NAME_PREFIX "temp_"
/* AOF manifest key. */
#define AOF_MANIFEST_KEY_FILE_NAME "file"
#define AOF_MANIFEST_KEY_FILE_SEQ "seq"
#define AOF_MANIFEST_KEY_FILE_TYPE "type"
/* Create an empty aofInfo. */
aofInfo *aofInfoCreate(void) {
return zcalloc(sizeof(aofInfo));
}
/* Free the aofInfo structure (pointed to by ai) and its embedded file_name. */
void aofInfoFree(aofInfo *ai) {
serverAssert(ai != NULL);
if (ai->file_name) sdsfree(ai->file_name);
zfree(ai);
}
/* Deep copy an aofInfo. */
aofInfo *aofInfoDup(aofInfo *orig) {
serverAssert(orig != NULL);
aofInfo *ai = aofInfoCreate();
ai->file_name = sdsdup(orig->file_name);
ai->file_seq = orig->file_seq;
ai->file_type = orig->file_type;
return ai;
}
/* Format aofInfo as a string and it will be a line in the manifest. */
sds aofInfoFormat(sds buf, aofInfo *ai) {
sds filename_repr = NULL;
if (sdsneedsrepr(ai->file_name))
filename_repr = sdscatrepr(sdsempty(), ai->file_name, sdslen(ai->file_name));
sds ret = sdscatprintf(buf, "%s %s %s %lld %s %c\n",
AOF_MANIFEST_KEY_FILE_NAME, filename_repr ? filename_repr : ai->file_name,
AOF_MANIFEST_KEY_FILE_SEQ, ai->file_seq,
AOF_MANIFEST_KEY_FILE_TYPE, ai->file_type);
sdsfree(filename_repr);
return ret;
}
Implement Multi Part AOF mechanism to avoid AOFRW overheads. (#9788) Implement Multi-Part AOF mechanism to avoid overheads during AOFRW. Introducing a folder with multiple AOF files tracked by a manifest file. The main issues with the the original AOFRW mechanism are: * buffering of commands that are processed during rewrite (consuming a lot of RAM) * freezes of the main process when the AOFRW completes to drain the remaining part of the buffer and fsync it. * double disk IO for the data that arrives during AOFRW (had to be written to both the old and new AOF files) The main modifications of this PR: 1. Remove the AOF rewrite buffer and related code. 2. Divide the AOF into multiple files, they are classified as two types, one is the the `BASE` type, it represents the full amount of data (Maybe AOF or RDB format) after each AOFRW, there is only one `BASE` file at most. The second is `INCR` type, may have more than one. They represent the incremental commands since the last AOFRW. 3. Use a AOF manifest file to record and manage these AOF files mentioned above. 4. The original configuration of `appendfilename` will be the base part of the new file name, for example: `appendonly.aof.1.base.rdb` and `appendonly.aof.2.incr.aof` 5. Add manifest-related TCL tests, and modified some existing tests that depend on the `appendfilename` 6. Remove the `aof_rewrite_buffer_length` field in info. 7. Add `aof-disable-auto-gc` configuration. By default we're automatically deleting HISTORY type AOFs. It also gives users the opportunity to preserve the history AOFs. just for testing use now. 8. Add AOFRW limiting measure. When the AOFRW failures reaches the threshold (3 times now), we will delay the execution of the next AOFRW by 1 minute. If the next AOFRW also fails, it will be delayed by 2 minutes. The next is 4, 8, 16, the maximum delay is 60 minutes (1 hour). During the limit period, we can still use the 'bgrewriteaof' command to execute AOFRW immediately. 9. Support upgrade (load) data from old version redis. 10. Add `appenddirname` configuration, as the directory name of the append only files. All AOF files and manifest file will be placed in this directory. 11. Only the last AOF file (BASE or INCR) can be truncated. Otherwise redis will exit even if `aof-load-truncated` is enabled. Co-authored-by: Oran Agra <oran@redislabs.com>
2022-01-04 01:14:13 +08:00
/* Method to free AOF list elements. */
void aofListFree(void *item) {
aofInfo *ai = (aofInfo *)item;
aofInfoFree(ai);
}
/* Method to duplicate AOF list elements. */
void *aofListDup(void *item) {
return aofInfoDup(item);
Allow an AOF rewrite buffer > 2GB (Fix for issue #504). During the AOF rewrite process, the parent process needs to accumulate the new writes in an in-memory buffer: when the child will terminate the AOF rewriting process this buffer (that ist the difference between the dataset when the rewrite was started, and the current dataset) is flushed to the new AOF file. We used to implement this buffer using an sds.c string, but sds.c has a 2GB limit. Sometimes the dataset can be big enough, the amount of writes so high, and the rewrite process slow enough that we overflow the 2GB limit, causing a crash, documented on github by issue #504. In order to prevent this from happening, this commit introduces a new system to accumulate writes, implemented by a linked list of blocks of 10 MB each, so that we also avoid paying the reallocation cost. Note that theoretically modern operating systems may implement realloc() simply as a remaping of the old pages, thus with very good performances, see for instance the mremap() syscall on Linux. However this is not always true, and jemalloc by default avoids doing this because there are issues with the current implementation of mremap(). For this reason we are using a linked list of blocks instead of a single block that gets reallocated again and again. The changes in this commit lacks testing, that will be performed before merging into the unstable branch. This fix will not enter 2.4 because it is too invasive. However 2.4 will log a warning when the AOF rewrite buffer is near to the 2GB limit.
2012-05-22 13:03:41 +02:00
}
Implement Multi Part AOF mechanism to avoid AOFRW overheads. (#9788) Implement Multi-Part AOF mechanism to avoid overheads during AOFRW. Introducing a folder with multiple AOF files tracked by a manifest file. The main issues with the the original AOFRW mechanism are: * buffering of commands that are processed during rewrite (consuming a lot of RAM) * freezes of the main process when the AOFRW completes to drain the remaining part of the buffer and fsync it. * double disk IO for the data that arrives during AOFRW (had to be written to both the old and new AOF files) The main modifications of this PR: 1. Remove the AOF rewrite buffer and related code. 2. Divide the AOF into multiple files, they are classified as two types, one is the the `BASE` type, it represents the full amount of data (Maybe AOF or RDB format) after each AOFRW, there is only one `BASE` file at most. The second is `INCR` type, may have more than one. They represent the incremental commands since the last AOFRW. 3. Use a AOF manifest file to record and manage these AOF files mentioned above. 4. The original configuration of `appendfilename` will be the base part of the new file name, for example: `appendonly.aof.1.base.rdb` and `appendonly.aof.2.incr.aof` 5. Add manifest-related TCL tests, and modified some existing tests that depend on the `appendfilename` 6. Remove the `aof_rewrite_buffer_length` field in info. 7. Add `aof-disable-auto-gc` configuration. By default we're automatically deleting HISTORY type AOFs. It also gives users the opportunity to preserve the history AOFs. just for testing use now. 8. Add AOFRW limiting measure. When the AOFRW failures reaches the threshold (3 times now), we will delay the execution of the next AOFRW by 1 minute. If the next AOFRW also fails, it will be delayed by 2 minutes. The next is 4, 8, 16, the maximum delay is 60 minutes (1 hour). During the limit period, we can still use the 'bgrewriteaof' command to execute AOFRW immediately. 9. Support upgrade (load) data from old version redis. 10. Add `appenddirname` configuration, as the directory name of the append only files. All AOF files and manifest file will be placed in this directory. 11. Only the last AOF file (BASE or INCR) can be truncated. Otherwise redis will exit even if `aof-load-truncated` is enabled. Co-authored-by: Oran Agra <oran@redislabs.com>
2022-01-04 01:14:13 +08:00
/* Create an empty aofManifest, which will be called in `aofLoadManifestFromDisk`. */
aofManifest *aofManifestCreate(void) {
aofManifest *am = zcalloc(sizeof(aofManifest));
am->incr_aof_list = listCreate();
am->history_aof_list = listCreate();
listSetFreeMethod(am->incr_aof_list, aofListFree);
listSetDupMethod(am->incr_aof_list, aofListDup);
listSetFreeMethod(am->history_aof_list, aofListFree);
listSetDupMethod(am->history_aof_list, aofListDup);
return am;
}
/* Free the aofManifest structure (pointed to by am) and its embedded members. */
void aofManifestFree(aofManifest *am) {
if (am->base_aof_info) aofInfoFree(am->base_aof_info);
if (am->incr_aof_list) listRelease(am->incr_aof_list);
if (am->history_aof_list) listRelease(am->history_aof_list);
zfree(am);
}
sds getAofManifestFileName() {
return sdscatprintf(sdsempty(), "%s%s", server.aof_filename,
MANIFEST_NAME_SUFFIX);
}
sds getTempAofManifestFileName() {
return sdscatprintf(sdsempty(), "%s%s%s", MANIFEST_TEMP_NAME_PREFIX,
server.aof_filename, MANIFEST_NAME_SUFFIX);
}
/* Returns the string representation of aofManifest pointed to by am.
*
* The string is multiple lines separated by '\n', and each line represents
* an AOF file.
*
* Each line is space delimited and contains 6 fields, as follows:
* "file" [filename] "seq" [sequence] "type" [type]
*
* Where "file", "seq" and "type" are keywords that describe the next value,
* [filename] and [sequence] describe file name and order, and [type] is one
* of 'b' (base), 'h' (history) or 'i' (incr).
*
* The base file, if exists, will always be first, followed by history files,
* and incremental files.
*/
sds getAofManifestAsString(aofManifest *am) {
serverAssert(am != NULL);
sds buf = sdsempty();
listNode *ln;
listIter li;
Allow an AOF rewrite buffer > 2GB (Fix for issue #504). During the AOF rewrite process, the parent process needs to accumulate the new writes in an in-memory buffer: when the child will terminate the AOF rewriting process this buffer (that ist the difference between the dataset when the rewrite was started, and the current dataset) is flushed to the new AOF file. We used to implement this buffer using an sds.c string, but sds.c has a 2GB limit. Sometimes the dataset can be big enough, the amount of writes so high, and the rewrite process slow enough that we overflow the 2GB limit, causing a crash, documented on github by issue #504. In order to prevent this from happening, this commit introduces a new system to accumulate writes, implemented by a linked list of blocks of 10 MB each, so that we also avoid paying the reallocation cost. Note that theoretically modern operating systems may implement realloc() simply as a remaping of the old pages, thus with very good performances, see for instance the mremap() syscall on Linux. However this is not always true, and jemalloc by default avoids doing this because there are issues with the current implementation of mremap(). For this reason we are using a linked list of blocks instead of a single block that gets reallocated again and again. The changes in this commit lacks testing, that will be performed before merging into the unstable branch. This fix will not enter 2.4 because it is too invasive. However 2.4 will log a warning when the AOF rewrite buffer is near to the 2GB limit.
2012-05-22 13:03:41 +02:00
Implement Multi Part AOF mechanism to avoid AOFRW overheads. (#9788) Implement Multi-Part AOF mechanism to avoid overheads during AOFRW. Introducing a folder with multiple AOF files tracked by a manifest file. The main issues with the the original AOFRW mechanism are: * buffering of commands that are processed during rewrite (consuming a lot of RAM) * freezes of the main process when the AOFRW completes to drain the remaining part of the buffer and fsync it. * double disk IO for the data that arrives during AOFRW (had to be written to both the old and new AOF files) The main modifications of this PR: 1. Remove the AOF rewrite buffer and related code. 2. Divide the AOF into multiple files, they are classified as two types, one is the the `BASE` type, it represents the full amount of data (Maybe AOF or RDB format) after each AOFRW, there is only one `BASE` file at most. The second is `INCR` type, may have more than one. They represent the incremental commands since the last AOFRW. 3. Use a AOF manifest file to record and manage these AOF files mentioned above. 4. The original configuration of `appendfilename` will be the base part of the new file name, for example: `appendonly.aof.1.base.rdb` and `appendonly.aof.2.incr.aof` 5. Add manifest-related TCL tests, and modified some existing tests that depend on the `appendfilename` 6. Remove the `aof_rewrite_buffer_length` field in info. 7. Add `aof-disable-auto-gc` configuration. By default we're automatically deleting HISTORY type AOFs. It also gives users the opportunity to preserve the history AOFs. just for testing use now. 8. Add AOFRW limiting measure. When the AOFRW failures reaches the threshold (3 times now), we will delay the execution of the next AOFRW by 1 minute. If the next AOFRW also fails, it will be delayed by 2 minutes. The next is 4, 8, 16, the maximum delay is 60 minutes (1 hour). During the limit period, we can still use the 'bgrewriteaof' command to execute AOFRW immediately. 9. Support upgrade (load) data from old version redis. 10. Add `appenddirname` configuration, as the directory name of the append only files. All AOF files and manifest file will be placed in this directory. 11. Only the last AOF file (BASE or INCR) can be truncated. Otherwise redis will exit even if `aof-load-truncated` is enabled. Co-authored-by: Oran Agra <oran@redislabs.com>
2022-01-04 01:14:13 +08:00
/* 1. Add BASE File information, it is always at the beginning
* of the manifest file. */
if (am->base_aof_info) {
buf = aofInfoFormat(buf, am->base_aof_info);
}
Allow an AOF rewrite buffer > 2GB (Fix for issue #504). During the AOF rewrite process, the parent process needs to accumulate the new writes in an in-memory buffer: when the child will terminate the AOF rewriting process this buffer (that ist the difference between the dataset when the rewrite was started, and the current dataset) is flushed to the new AOF file. We used to implement this buffer using an sds.c string, but sds.c has a 2GB limit. Sometimes the dataset can be big enough, the amount of writes so high, and the rewrite process slow enough that we overflow the 2GB limit, causing a crash, documented on github by issue #504. In order to prevent this from happening, this commit introduces a new system to accumulate writes, implemented by a linked list of blocks of 10 MB each, so that we also avoid paying the reallocation cost. Note that theoretically modern operating systems may implement realloc() simply as a remaping of the old pages, thus with very good performances, see for instance the mremap() syscall on Linux. However this is not always true, and jemalloc by default avoids doing this because there are issues with the current implementation of mremap(). For this reason we are using a linked list of blocks instead of a single block that gets reallocated again and again. The changes in this commit lacks testing, that will be performed before merging into the unstable branch. This fix will not enter 2.4 because it is too invasive. However 2.4 will log a warning when the AOF rewrite buffer is near to the 2GB limit.
2012-05-22 13:03:41 +02:00
Implement Multi Part AOF mechanism to avoid AOFRW overheads. (#9788) Implement Multi-Part AOF mechanism to avoid overheads during AOFRW. Introducing a folder with multiple AOF files tracked by a manifest file. The main issues with the the original AOFRW mechanism are: * buffering of commands that are processed during rewrite (consuming a lot of RAM) * freezes of the main process when the AOFRW completes to drain the remaining part of the buffer and fsync it. * double disk IO for the data that arrives during AOFRW (had to be written to both the old and new AOF files) The main modifications of this PR: 1. Remove the AOF rewrite buffer and related code. 2. Divide the AOF into multiple files, they are classified as two types, one is the the `BASE` type, it represents the full amount of data (Maybe AOF or RDB format) after each AOFRW, there is only one `BASE` file at most. The second is `INCR` type, may have more than one. They represent the incremental commands since the last AOFRW. 3. Use a AOF manifest file to record and manage these AOF files mentioned above. 4. The original configuration of `appendfilename` will be the base part of the new file name, for example: `appendonly.aof.1.base.rdb` and `appendonly.aof.2.incr.aof` 5. Add manifest-related TCL tests, and modified some existing tests that depend on the `appendfilename` 6. Remove the `aof_rewrite_buffer_length` field in info. 7. Add `aof-disable-auto-gc` configuration. By default we're automatically deleting HISTORY type AOFs. It also gives users the opportunity to preserve the history AOFs. just for testing use now. 8. Add AOFRW limiting measure. When the AOFRW failures reaches the threshold (3 times now), we will delay the execution of the next AOFRW by 1 minute. If the next AOFRW also fails, it will be delayed by 2 minutes. The next is 4, 8, 16, the maximum delay is 60 minutes (1 hour). During the limit period, we can still use the 'bgrewriteaof' command to execute AOFRW immediately. 9. Support upgrade (load) data from old version redis. 10. Add `appenddirname` configuration, as the directory name of the append only files. All AOF files and manifest file will be placed in this directory. 11. Only the last AOF file (BASE or INCR) can be truncated. Otherwise redis will exit even if `aof-load-truncated` is enabled. Co-authored-by: Oran Agra <oran@redislabs.com>
2022-01-04 01:14:13 +08:00
/* 2. Add HISTORY type AOF information. */
listRewind(am->history_aof_list, &li);
while ((ln = listNext(&li)) != NULL) {
aofInfo *ai = (aofInfo*)ln->value;
buf = aofInfoFormat(buf, ai);
Implement Multi Part AOF mechanism to avoid AOFRW overheads. (#9788) Implement Multi-Part AOF mechanism to avoid overheads during AOFRW. Introducing a folder with multiple AOF files tracked by a manifest file. The main issues with the the original AOFRW mechanism are: * buffering of commands that are processed during rewrite (consuming a lot of RAM) * freezes of the main process when the AOFRW completes to drain the remaining part of the buffer and fsync it. * double disk IO for the data that arrives during AOFRW (had to be written to both the old and new AOF files) The main modifications of this PR: 1. Remove the AOF rewrite buffer and related code. 2. Divide the AOF into multiple files, they are classified as two types, one is the the `BASE` type, it represents the full amount of data (Maybe AOF or RDB format) after each AOFRW, there is only one `BASE` file at most. The second is `INCR` type, may have more than one. They represent the incremental commands since the last AOFRW. 3. Use a AOF manifest file to record and manage these AOF files mentioned above. 4. The original configuration of `appendfilename` will be the base part of the new file name, for example: `appendonly.aof.1.base.rdb` and `appendonly.aof.2.incr.aof` 5. Add manifest-related TCL tests, and modified some existing tests that depend on the `appendfilename` 6. Remove the `aof_rewrite_buffer_length` field in info. 7. Add `aof-disable-auto-gc` configuration. By default we're automatically deleting HISTORY type AOFs. It also gives users the opportunity to preserve the history AOFs. just for testing use now. 8. Add AOFRW limiting measure. When the AOFRW failures reaches the threshold (3 times now), we will delay the execution of the next AOFRW by 1 minute. If the next AOFRW also fails, it will be delayed by 2 minutes. The next is 4, 8, 16, the maximum delay is 60 minutes (1 hour). During the limit period, we can still use the 'bgrewriteaof' command to execute AOFRW immediately. 9. Support upgrade (load) data from old version redis. 10. Add `appenddirname` configuration, as the directory name of the append only files. All AOF files and manifest file will be placed in this directory. 11. Only the last AOF file (BASE or INCR) can be truncated. Otherwise redis will exit even if `aof-load-truncated` is enabled. Co-authored-by: Oran Agra <oran@redislabs.com>
2022-01-04 01:14:13 +08:00
}
Implement Multi Part AOF mechanism to avoid AOFRW overheads. (#9788) Implement Multi-Part AOF mechanism to avoid overheads during AOFRW. Introducing a folder with multiple AOF files tracked by a manifest file. The main issues with the the original AOFRW mechanism are: * buffering of commands that are processed during rewrite (consuming a lot of RAM) * freezes of the main process when the AOFRW completes to drain the remaining part of the buffer and fsync it. * double disk IO for the data that arrives during AOFRW (had to be written to both the old and new AOF files) The main modifications of this PR: 1. Remove the AOF rewrite buffer and related code. 2. Divide the AOF into multiple files, they are classified as two types, one is the the `BASE` type, it represents the full amount of data (Maybe AOF or RDB format) after each AOFRW, there is only one `BASE` file at most. The second is `INCR` type, may have more than one. They represent the incremental commands since the last AOFRW. 3. Use a AOF manifest file to record and manage these AOF files mentioned above. 4. The original configuration of `appendfilename` will be the base part of the new file name, for example: `appendonly.aof.1.base.rdb` and `appendonly.aof.2.incr.aof` 5. Add manifest-related TCL tests, and modified some existing tests that depend on the `appendfilename` 6. Remove the `aof_rewrite_buffer_length` field in info. 7. Add `aof-disable-auto-gc` configuration. By default we're automatically deleting HISTORY type AOFs. It also gives users the opportunity to preserve the history AOFs. just for testing use now. 8. Add AOFRW limiting measure. When the AOFRW failures reaches the threshold (3 times now), we will delay the execution of the next AOFRW by 1 minute. If the next AOFRW also fails, it will be delayed by 2 minutes. The next is 4, 8, 16, the maximum delay is 60 minutes (1 hour). During the limit period, we can still use the 'bgrewriteaof' command to execute AOFRW immediately. 9. Support upgrade (load) data from old version redis. 10. Add `appenddirname` configuration, as the directory name of the append only files. All AOF files and manifest file will be placed in this directory. 11. Only the last AOF file (BASE or INCR) can be truncated. Otherwise redis will exit even if `aof-load-truncated` is enabled. Co-authored-by: Oran Agra <oran@redislabs.com>
2022-01-04 01:14:13 +08:00
/* 3. Add INCR type AOF information. */
listRewind(am->incr_aof_list, &li);
while ((ln = listNext(&li)) != NULL) {
aofInfo *ai = (aofInfo*)ln->value;
buf = aofInfoFormat(buf, ai);
}
Implement Multi Part AOF mechanism to avoid AOFRW overheads. (#9788) Implement Multi-Part AOF mechanism to avoid overheads during AOFRW. Introducing a folder with multiple AOF files tracked by a manifest file. The main issues with the the original AOFRW mechanism are: * buffering of commands that are processed during rewrite (consuming a lot of RAM) * freezes of the main process when the AOFRW completes to drain the remaining part of the buffer and fsync it. * double disk IO for the data that arrives during AOFRW (had to be written to both the old and new AOF files) The main modifications of this PR: 1. Remove the AOF rewrite buffer and related code. 2. Divide the AOF into multiple files, they are classified as two types, one is the the `BASE` type, it represents the full amount of data (Maybe AOF or RDB format) after each AOFRW, there is only one `BASE` file at most. The second is `INCR` type, may have more than one. They represent the incremental commands since the last AOFRW. 3. Use a AOF manifest file to record and manage these AOF files mentioned above. 4. The original configuration of `appendfilename` will be the base part of the new file name, for example: `appendonly.aof.1.base.rdb` and `appendonly.aof.2.incr.aof` 5. Add manifest-related TCL tests, and modified some existing tests that depend on the `appendfilename` 6. Remove the `aof_rewrite_buffer_length` field in info. 7. Add `aof-disable-auto-gc` configuration. By default we're automatically deleting HISTORY type AOFs. It also gives users the opportunity to preserve the history AOFs. just for testing use now. 8. Add AOFRW limiting measure. When the AOFRW failures reaches the threshold (3 times now), we will delay the execution of the next AOFRW by 1 minute. If the next AOFRW also fails, it will be delayed by 2 minutes. The next is 4, 8, 16, the maximum delay is 60 minutes (1 hour). During the limit period, we can still use the 'bgrewriteaof' command to execute AOFRW immediately. 9. Support upgrade (load) data from old version redis. 10. Add `appenddirname` configuration, as the directory name of the append only files. All AOF files and manifest file will be placed in this directory. 11. Only the last AOF file (BASE or INCR) can be truncated. Otherwise redis will exit even if `aof-load-truncated` is enabled. Co-authored-by: Oran Agra <oran@redislabs.com>
2022-01-04 01:14:13 +08:00
return buf;
}
Implement Multi Part AOF mechanism to avoid AOFRW overheads. (#9788) Implement Multi-Part AOF mechanism to avoid overheads during AOFRW. Introducing a folder with multiple AOF files tracked by a manifest file. The main issues with the the original AOFRW mechanism are: * buffering of commands that are processed during rewrite (consuming a lot of RAM) * freezes of the main process when the AOFRW completes to drain the remaining part of the buffer and fsync it. * double disk IO for the data that arrives during AOFRW (had to be written to both the old and new AOF files) The main modifications of this PR: 1. Remove the AOF rewrite buffer and related code. 2. Divide the AOF into multiple files, they are classified as two types, one is the the `BASE` type, it represents the full amount of data (Maybe AOF or RDB format) after each AOFRW, there is only one `BASE` file at most. The second is `INCR` type, may have more than one. They represent the incremental commands since the last AOFRW. 3. Use a AOF manifest file to record and manage these AOF files mentioned above. 4. The original configuration of `appendfilename` will be the base part of the new file name, for example: `appendonly.aof.1.base.rdb` and `appendonly.aof.2.incr.aof` 5. Add manifest-related TCL tests, and modified some existing tests that depend on the `appendfilename` 6. Remove the `aof_rewrite_buffer_length` field in info. 7. Add `aof-disable-auto-gc` configuration. By default we're automatically deleting HISTORY type AOFs. It also gives users the opportunity to preserve the history AOFs. just for testing use now. 8. Add AOFRW limiting measure. When the AOFRW failures reaches the threshold (3 times now), we will delay the execution of the next AOFRW by 1 minute. If the next AOFRW also fails, it will be delayed by 2 minutes. The next is 4, 8, 16, the maximum delay is 60 minutes (1 hour). During the limit period, we can still use the 'bgrewriteaof' command to execute AOFRW immediately. 9. Support upgrade (load) data from old version redis. 10. Add `appenddirname` configuration, as the directory name of the append only files. All AOF files and manifest file will be placed in this directory. 11. Only the last AOF file (BASE or INCR) can be truncated. Otherwise redis will exit even if `aof-load-truncated` is enabled. Co-authored-by: Oran Agra <oran@redislabs.com>
2022-01-04 01:14:13 +08:00
/* Load the manifest information from the disk to `server.aof_manifest`
* when the Redis server start.
*
* During loading, this function does strict error checking and will abort
* the entire Redis server process on error (I/O error, invalid format, etc.)
*
* If the AOF directory or manifest file do not exist, this will be ignored
* in order to support seamless upgrades from previous versions which did not
* use them.
*/
void aofLoadManifestFromDisk(void) {
server.aof_manifest = aofManifestCreate();
if (!dirExists(server.aof_dirname)) {
serverLog(LL_NOTICE, "The AOF directory %s doesn't exist", server.aof_dirname);
return;
}
sds am_name = getAofManifestFileName();
sds am_filepath = makePath(server.aof_dirname, am_name);
if (!fileExist(am_filepath)) {
serverLog(LL_NOTICE, "The AOF manifest file %s doesn't exist", am_name);
sdsfree(am_name);
sdsfree(am_filepath);
return;
}
aofManifest *am = aofLoadManifestFromFile(am_filepath);
if (am) aofManifestFreeAndUpdate(am);
sdsfree(am_name);
sdsfree(am_filepath);
}
/* Generic manifest loading function, used in `aofLoadManifestFromDisk` and redis-check-aof tool. */
#define MANIFEST_MAX_LINE 1024
aofManifest *aofLoadManifestFromFile(sds am_filepath) {
const char *err = NULL;
long long maxseq = 0;
aofManifest *am = aofManifestCreate();
Implement Multi Part AOF mechanism to avoid AOFRW overheads. (#9788) Implement Multi-Part AOF mechanism to avoid overheads during AOFRW. Introducing a folder with multiple AOF files tracked by a manifest file. The main issues with the the original AOFRW mechanism are: * buffering of commands that are processed during rewrite (consuming a lot of RAM) * freezes of the main process when the AOFRW completes to drain the remaining part of the buffer and fsync it. * double disk IO for the data that arrives during AOFRW (had to be written to both the old and new AOF files) The main modifications of this PR: 1. Remove the AOF rewrite buffer and related code. 2. Divide the AOF into multiple files, they are classified as two types, one is the the `BASE` type, it represents the full amount of data (Maybe AOF or RDB format) after each AOFRW, there is only one `BASE` file at most. The second is `INCR` type, may have more than one. They represent the incremental commands since the last AOFRW. 3. Use a AOF manifest file to record and manage these AOF files mentioned above. 4. The original configuration of `appendfilename` will be the base part of the new file name, for example: `appendonly.aof.1.base.rdb` and `appendonly.aof.2.incr.aof` 5. Add manifest-related TCL tests, and modified some existing tests that depend on the `appendfilename` 6. Remove the `aof_rewrite_buffer_length` field in info. 7. Add `aof-disable-auto-gc` configuration. By default we're automatically deleting HISTORY type AOFs. It also gives users the opportunity to preserve the history AOFs. just for testing use now. 8. Add AOFRW limiting measure. When the AOFRW failures reaches the threshold (3 times now), we will delay the execution of the next AOFRW by 1 minute. If the next AOFRW also fails, it will be delayed by 2 minutes. The next is 4, 8, 16, the maximum delay is 60 minutes (1 hour). During the limit period, we can still use the 'bgrewriteaof' command to execute AOFRW immediately. 9. Support upgrade (load) data from old version redis. 10. Add `appenddirname` configuration, as the directory name of the append only files. All AOF files and manifest file will be placed in this directory. 11. Only the last AOF file (BASE or INCR) can be truncated. Otherwise redis will exit even if `aof-load-truncated` is enabled. Co-authored-by: Oran Agra <oran@redislabs.com>
2022-01-04 01:14:13 +08:00
FILE *fp = fopen(am_filepath, "r");
if (fp == NULL) {
serverLog(LL_WARNING, "Fatal error: can't open the AOF manifest "
"file %s for reading: %s", am_filepath, strerror(errno));
Implement Multi Part AOF mechanism to avoid AOFRW overheads. (#9788) Implement Multi-Part AOF mechanism to avoid overheads during AOFRW. Introducing a folder with multiple AOF files tracked by a manifest file. The main issues with the the original AOFRW mechanism are: * buffering of commands that are processed during rewrite (consuming a lot of RAM) * freezes of the main process when the AOFRW completes to drain the remaining part of the buffer and fsync it. * double disk IO for the data that arrives during AOFRW (had to be written to both the old and new AOF files) The main modifications of this PR: 1. Remove the AOF rewrite buffer and related code. 2. Divide the AOF into multiple files, they are classified as two types, one is the the `BASE` type, it represents the full amount of data (Maybe AOF or RDB format) after each AOFRW, there is only one `BASE` file at most. The second is `INCR` type, may have more than one. They represent the incremental commands since the last AOFRW. 3. Use a AOF manifest file to record and manage these AOF files mentioned above. 4. The original configuration of `appendfilename` will be the base part of the new file name, for example: `appendonly.aof.1.base.rdb` and `appendonly.aof.2.incr.aof` 5. Add manifest-related TCL tests, and modified some existing tests that depend on the `appendfilename` 6. Remove the `aof_rewrite_buffer_length` field in info. 7. Add `aof-disable-auto-gc` configuration. By default we're automatically deleting HISTORY type AOFs. It also gives users the opportunity to preserve the history AOFs. just for testing use now. 8. Add AOFRW limiting measure. When the AOFRW failures reaches the threshold (3 times now), we will delay the execution of the next AOFRW by 1 minute. If the next AOFRW also fails, it will be delayed by 2 minutes. The next is 4, 8, 16, the maximum delay is 60 minutes (1 hour). During the limit period, we can still use the 'bgrewriteaof' command to execute AOFRW immediately. 9. Support upgrade (load) data from old version redis. 10. Add `appenddirname` configuration, as the directory name of the append only files. All AOF files and manifest file will be placed in this directory. 11. Only the last AOF file (BASE or INCR) can be truncated. Otherwise redis will exit even if `aof-load-truncated` is enabled. Co-authored-by: Oran Agra <oran@redislabs.com>
2022-01-04 01:14:13 +08:00
exit(1);
}
char buf[MANIFEST_MAX_LINE+1];
sds *argv = NULL;
int argc;
aofInfo *ai = NULL;
sds line = NULL;
int linenum = 0;
while (1) {
if (fgets(buf, MANIFEST_MAX_LINE+1, fp) == NULL) {
if (feof(fp)) {
if (linenum == 0) {
err = "Found an empty AOF manifest";
goto loaderr;
} else {
break;
}
} else {
err = "Read AOF manifest failed";
goto loaderr;
}
}
Implement Multi Part AOF mechanism to avoid AOFRW overheads. (#9788) Implement Multi-Part AOF mechanism to avoid overheads during AOFRW. Introducing a folder with multiple AOF files tracked by a manifest file. The main issues with the the original AOFRW mechanism are: * buffering of commands that are processed during rewrite (consuming a lot of RAM) * freezes of the main process when the AOFRW completes to drain the remaining part of the buffer and fsync it. * double disk IO for the data that arrives during AOFRW (had to be written to both the old and new AOF files) The main modifications of this PR: 1. Remove the AOF rewrite buffer and related code. 2. Divide the AOF into multiple files, they are classified as two types, one is the the `BASE` type, it represents the full amount of data (Maybe AOF or RDB format) after each AOFRW, there is only one `BASE` file at most. The second is `INCR` type, may have more than one. They represent the incremental commands since the last AOFRW. 3. Use a AOF manifest file to record and manage these AOF files mentioned above. 4. The original configuration of `appendfilename` will be the base part of the new file name, for example: `appendonly.aof.1.base.rdb` and `appendonly.aof.2.incr.aof` 5. Add manifest-related TCL tests, and modified some existing tests that depend on the `appendfilename` 6. Remove the `aof_rewrite_buffer_length` field in info. 7. Add `aof-disable-auto-gc` configuration. By default we're automatically deleting HISTORY type AOFs. It also gives users the opportunity to preserve the history AOFs. just for testing use now. 8. Add AOFRW limiting measure. When the AOFRW failures reaches the threshold (3 times now), we will delay the execution of the next AOFRW by 1 minute. If the next AOFRW also fails, it will be delayed by 2 minutes. The next is 4, 8, 16, the maximum delay is 60 minutes (1 hour). During the limit period, we can still use the 'bgrewriteaof' command to execute AOFRW immediately. 9. Support upgrade (load) data from old version redis. 10. Add `appenddirname` configuration, as the directory name of the append only files. All AOF files and manifest file will be placed in this directory. 11. Only the last AOF file (BASE or INCR) can be truncated. Otherwise redis will exit even if `aof-load-truncated` is enabled. Co-authored-by: Oran Agra <oran@redislabs.com>
2022-01-04 01:14:13 +08:00
linenum++;
/* Skip comments lines */
if (buf[0] == '#') continue;
if (strchr(buf, '\n') == NULL) {
err = "The AOF manifest file contains too long line";
goto loaderr;
}
Implement Multi Part AOF mechanism to avoid AOFRW overheads. (#9788) Implement Multi-Part AOF mechanism to avoid overheads during AOFRW. Introducing a folder with multiple AOF files tracked by a manifest file. The main issues with the the original AOFRW mechanism are: * buffering of commands that are processed during rewrite (consuming a lot of RAM) * freezes of the main process when the AOFRW completes to drain the remaining part of the buffer and fsync it. * double disk IO for the data that arrives during AOFRW (had to be written to both the old and new AOF files) The main modifications of this PR: 1. Remove the AOF rewrite buffer and related code. 2. Divide the AOF into multiple files, they are classified as two types, one is the the `BASE` type, it represents the full amount of data (Maybe AOF or RDB format) after each AOFRW, there is only one `BASE` file at most. The second is `INCR` type, may have more than one. They represent the incremental commands since the last AOFRW. 3. Use a AOF manifest file to record and manage these AOF files mentioned above. 4. The original configuration of `appendfilename` will be the base part of the new file name, for example: `appendonly.aof.1.base.rdb` and `appendonly.aof.2.incr.aof` 5. Add manifest-related TCL tests, and modified some existing tests that depend on the `appendfilename` 6. Remove the `aof_rewrite_buffer_length` field in info. 7. Add `aof-disable-auto-gc` configuration. By default we're automatically deleting HISTORY type AOFs. It also gives users the opportunity to preserve the history AOFs. just for testing use now. 8. Add AOFRW limiting measure. When the AOFRW failures reaches the threshold (3 times now), we will delay the execution of the next AOFRW by 1 minute. If the next AOFRW also fails, it will be delayed by 2 minutes. The next is 4, 8, 16, the maximum delay is 60 minutes (1 hour). During the limit period, we can still use the 'bgrewriteaof' command to execute AOFRW immediately. 9. Support upgrade (load) data from old version redis. 10. Add `appenddirname` configuration, as the directory name of the append only files. All AOF files and manifest file will be placed in this directory. 11. Only the last AOF file (BASE or INCR) can be truncated. Otherwise redis will exit even if `aof-load-truncated` is enabled. Co-authored-by: Oran Agra <oran@redislabs.com>
2022-01-04 01:14:13 +08:00
line = sdstrim(sdsnew(buf), " \t\r\n");
if (!sdslen(line)) {
err = "Invalid AOF manifest file format";
goto loaderr;
}
Implement Multi Part AOF mechanism to avoid AOFRW overheads. (#9788) Implement Multi-Part AOF mechanism to avoid overheads during AOFRW. Introducing a folder with multiple AOF files tracked by a manifest file. The main issues with the the original AOFRW mechanism are: * buffering of commands that are processed during rewrite (consuming a lot of RAM) * freezes of the main process when the AOFRW completes to drain the remaining part of the buffer and fsync it. * double disk IO for the data that arrives during AOFRW (had to be written to both the old and new AOF files) The main modifications of this PR: 1. Remove the AOF rewrite buffer and related code. 2. Divide the AOF into multiple files, they are classified as two types, one is the the `BASE` type, it represents the full amount of data (Maybe AOF or RDB format) after each AOFRW, there is only one `BASE` file at most. The second is `INCR` type, may have more than one. They represent the incremental commands since the last AOFRW. 3. Use a AOF manifest file to record and manage these AOF files mentioned above. 4. The original configuration of `appendfilename` will be the base part of the new file name, for example: `appendonly.aof.1.base.rdb` and `appendonly.aof.2.incr.aof` 5. Add manifest-related TCL tests, and modified some existing tests that depend on the `appendfilename` 6. Remove the `aof_rewrite_buffer_length` field in info. 7. Add `aof-disable-auto-gc` configuration. By default we're automatically deleting HISTORY type AOFs. It also gives users the opportunity to preserve the history AOFs. just for testing use now. 8. Add AOFRW limiting measure. When the AOFRW failures reaches the threshold (3 times now), we will delay the execution of the next AOFRW by 1 minute. If the next AOFRW also fails, it will be delayed by 2 minutes. The next is 4, 8, 16, the maximum delay is 60 minutes (1 hour). During the limit period, we can still use the 'bgrewriteaof' command to execute AOFRW immediately. 9. Support upgrade (load) data from old version redis. 10. Add `appenddirname` configuration, as the directory name of the append only files. All AOF files and manifest file will be placed in this directory. 11. Only the last AOF file (BASE or INCR) can be truncated. Otherwise redis will exit even if `aof-load-truncated` is enabled. Co-authored-by: Oran Agra <oran@redislabs.com>
2022-01-04 01:14:13 +08:00
argv = sdssplitargs(line, &argc);
/* 'argc < 6' was done for forward compatibility. */
if (argv == NULL || argc < 6 || (argc % 2)) {
err = "Invalid AOF manifest file format";
Implement Multi Part AOF mechanism to avoid AOFRW overheads. (#9788) Implement Multi-Part AOF mechanism to avoid overheads during AOFRW. Introducing a folder with multiple AOF files tracked by a manifest file. The main issues with the the original AOFRW mechanism are: * buffering of commands that are processed during rewrite (consuming a lot of RAM) * freezes of the main process when the AOFRW completes to drain the remaining part of the buffer and fsync it. * double disk IO for the data that arrives during AOFRW (had to be written to both the old and new AOF files) The main modifications of this PR: 1. Remove the AOF rewrite buffer and related code. 2. Divide the AOF into multiple files, they are classified as two types, one is the the `BASE` type, it represents the full amount of data (Maybe AOF or RDB format) after each AOFRW, there is only one `BASE` file at most. The second is `INCR` type, may have more than one. They represent the incremental commands since the last AOFRW. 3. Use a AOF manifest file to record and manage these AOF files mentioned above. 4. The original configuration of `appendfilename` will be the base part of the new file name, for example: `appendonly.aof.1.base.rdb` and `appendonly.aof.2.incr.aof` 5. Add manifest-related TCL tests, and modified some existing tests that depend on the `appendfilename` 6. Remove the `aof_rewrite_buffer_length` field in info. 7. Add `aof-disable-auto-gc` configuration. By default we're automatically deleting HISTORY type AOFs. It also gives users the opportunity to preserve the history AOFs. just for testing use now. 8. Add AOFRW limiting measure. When the AOFRW failures reaches the threshold (3 times now), we will delay the execution of the next AOFRW by 1 minute. If the next AOFRW also fails, it will be delayed by 2 minutes. The next is 4, 8, 16, the maximum delay is 60 minutes (1 hour). During the limit period, we can still use the 'bgrewriteaof' command to execute AOFRW immediately. 9. Support upgrade (load) data from old version redis. 10. Add `appenddirname` configuration, as the directory name of the append only files. All AOF files and manifest file will be placed in this directory. 11. Only the last AOF file (BASE or INCR) can be truncated. Otherwise redis will exit even if `aof-load-truncated` is enabled. Co-authored-by: Oran Agra <oran@redislabs.com>
2022-01-04 01:14:13 +08:00
goto loaderr;
}
ai = aofInfoCreate();
for (int i = 0; i < argc; i += 2) {
if (!strcasecmp(argv[i], AOF_MANIFEST_KEY_FILE_NAME)) {
ai->file_name = sdsnew(argv[i+1]);
if (!pathIsBaseName(ai->file_name)) {
err = "File can't be a path, just a filename";
goto loaderr;
}
} else if (!strcasecmp(argv[i], AOF_MANIFEST_KEY_FILE_SEQ)) {
ai->file_seq = atoll(argv[i+1]);
} else if (!strcasecmp(argv[i], AOF_MANIFEST_KEY_FILE_TYPE)) {
ai->file_type = (argv[i+1])[0];
}
/* else if (!strcasecmp(argv[i], AOF_MANIFEST_KEY_OTHER)) {} */
}
/* We have to make sure we load all the information. */
if (!ai->file_name || !ai->file_seq || !ai->file_type) {
err = "Invalid AOF manifest file format";
Implement Multi Part AOF mechanism to avoid AOFRW overheads. (#9788) Implement Multi-Part AOF mechanism to avoid overheads during AOFRW. Introducing a folder with multiple AOF files tracked by a manifest file. The main issues with the the original AOFRW mechanism are: * buffering of commands that are processed during rewrite (consuming a lot of RAM) * freezes of the main process when the AOFRW completes to drain the remaining part of the buffer and fsync it. * double disk IO for the data that arrives during AOFRW (had to be written to both the old and new AOF files) The main modifications of this PR: 1. Remove the AOF rewrite buffer and related code. 2. Divide the AOF into multiple files, they are classified as two types, one is the the `BASE` type, it represents the full amount of data (Maybe AOF or RDB format) after each AOFRW, there is only one `BASE` file at most. The second is `INCR` type, may have more than one. They represent the incremental commands since the last AOFRW. 3. Use a AOF manifest file to record and manage these AOF files mentioned above. 4. The original configuration of `appendfilename` will be the base part of the new file name, for example: `appendonly.aof.1.base.rdb` and `appendonly.aof.2.incr.aof` 5. Add manifest-related TCL tests, and modified some existing tests that depend on the `appendfilename` 6. Remove the `aof_rewrite_buffer_length` field in info. 7. Add `aof-disable-auto-gc` configuration. By default we're automatically deleting HISTORY type AOFs. It also gives users the opportunity to preserve the history AOFs. just for testing use now. 8. Add AOFRW limiting measure. When the AOFRW failures reaches the threshold (3 times now), we will delay the execution of the next AOFRW by 1 minute. If the next AOFRW also fails, it will be delayed by 2 minutes. The next is 4, 8, 16, the maximum delay is 60 minutes (1 hour). During the limit period, we can still use the 'bgrewriteaof' command to execute AOFRW immediately. 9. Support upgrade (load) data from old version redis. 10. Add `appenddirname` configuration, as the directory name of the append only files. All AOF files and manifest file will be placed in this directory. 11. Only the last AOF file (BASE or INCR) can be truncated. Otherwise redis will exit even if `aof-load-truncated` is enabled. Co-authored-by: Oran Agra <oran@redislabs.com>
2022-01-04 01:14:13 +08:00
goto loaderr;
}
sdsfreesplitres(argv, argc);
argv = NULL;
if (ai->file_type == AOF_FILE_TYPE_BASE) {
if (am->base_aof_info) {
Implement Multi Part AOF mechanism to avoid AOFRW overheads. (#9788) Implement Multi-Part AOF mechanism to avoid overheads during AOFRW. Introducing a folder with multiple AOF files tracked by a manifest file. The main issues with the the original AOFRW mechanism are: * buffering of commands that are processed during rewrite (consuming a lot of RAM) * freezes of the main process when the AOFRW completes to drain the remaining part of the buffer and fsync it. * double disk IO for the data that arrives during AOFRW (had to be written to both the old and new AOF files) The main modifications of this PR: 1. Remove the AOF rewrite buffer and related code. 2. Divide the AOF into multiple files, they are classified as two types, one is the the `BASE` type, it represents the full amount of data (Maybe AOF or RDB format) after each AOFRW, there is only one `BASE` file at most. The second is `INCR` type, may have more than one. They represent the incremental commands since the last AOFRW. 3. Use a AOF manifest file to record and manage these AOF files mentioned above. 4. The original configuration of `appendfilename` will be the base part of the new file name, for example: `appendonly.aof.1.base.rdb` and `appendonly.aof.2.incr.aof` 5. Add manifest-related TCL tests, and modified some existing tests that depend on the `appendfilename` 6. Remove the `aof_rewrite_buffer_length` field in info. 7. Add `aof-disable-auto-gc` configuration. By default we're automatically deleting HISTORY type AOFs. It also gives users the opportunity to preserve the history AOFs. just for testing use now. 8. Add AOFRW limiting measure. When the AOFRW failures reaches the threshold (3 times now), we will delay the execution of the next AOFRW by 1 minute. If the next AOFRW also fails, it will be delayed by 2 minutes. The next is 4, 8, 16, the maximum delay is 60 minutes (1 hour). During the limit period, we can still use the 'bgrewriteaof' command to execute AOFRW immediately. 9. Support upgrade (load) data from old version redis. 10. Add `appenddirname` configuration, as the directory name of the append only files. All AOF files and manifest file will be placed in this directory. 11. Only the last AOF file (BASE or INCR) can be truncated. Otherwise redis will exit even if `aof-load-truncated` is enabled. Co-authored-by: Oran Agra <oran@redislabs.com>
2022-01-04 01:14:13 +08:00
err = "Found duplicate base file information";
goto loaderr;
}
am->base_aof_info = ai;
am->curr_base_file_seq = ai->file_seq;
Implement Multi Part AOF mechanism to avoid AOFRW overheads. (#9788) Implement Multi-Part AOF mechanism to avoid overheads during AOFRW. Introducing a folder with multiple AOF files tracked by a manifest file. The main issues with the the original AOFRW mechanism are: * buffering of commands that are processed during rewrite (consuming a lot of RAM) * freezes of the main process when the AOFRW completes to drain the remaining part of the buffer and fsync it. * double disk IO for the data that arrives during AOFRW (had to be written to both the old and new AOF files) The main modifications of this PR: 1. Remove the AOF rewrite buffer and related code. 2. Divide the AOF into multiple files, they are classified as two types, one is the the `BASE` type, it represents the full amount of data (Maybe AOF or RDB format) after each AOFRW, there is only one `BASE` file at most. The second is `INCR` type, may have more than one. They represent the incremental commands since the last AOFRW. 3. Use a AOF manifest file to record and manage these AOF files mentioned above. 4. The original configuration of `appendfilename` will be the base part of the new file name, for example: `appendonly.aof.1.base.rdb` and `appendonly.aof.2.incr.aof` 5. Add manifest-related TCL tests, and modified some existing tests that depend on the `appendfilename` 6. Remove the `aof_rewrite_buffer_length` field in info. 7. Add `aof-disable-auto-gc` configuration. By default we're automatically deleting HISTORY type AOFs. It also gives users the opportunity to preserve the history AOFs. just for testing use now. 8. Add AOFRW limiting measure. When the AOFRW failures reaches the threshold (3 times now), we will delay the execution of the next AOFRW by 1 minute. If the next AOFRW also fails, it will be delayed by 2 minutes. The next is 4, 8, 16, the maximum delay is 60 minutes (1 hour). During the limit period, we can still use the 'bgrewriteaof' command to execute AOFRW immediately. 9. Support upgrade (load) data from old version redis. 10. Add `appenddirname` configuration, as the directory name of the append only files. All AOF files and manifest file will be placed in this directory. 11. Only the last AOF file (BASE or INCR) can be truncated. Otherwise redis will exit even if `aof-load-truncated` is enabled. Co-authored-by: Oran Agra <oran@redislabs.com>
2022-01-04 01:14:13 +08:00
} else if (ai->file_type == AOF_FILE_TYPE_HIST) {
listAddNodeTail(am->history_aof_list, ai);
Implement Multi Part AOF mechanism to avoid AOFRW overheads. (#9788) Implement Multi-Part AOF mechanism to avoid overheads during AOFRW. Introducing a folder with multiple AOF files tracked by a manifest file. The main issues with the the original AOFRW mechanism are: * buffering of commands that are processed during rewrite (consuming a lot of RAM) * freezes of the main process when the AOFRW completes to drain the remaining part of the buffer and fsync it. * double disk IO for the data that arrives during AOFRW (had to be written to both the old and new AOF files) The main modifications of this PR: 1. Remove the AOF rewrite buffer and related code. 2. Divide the AOF into multiple files, they are classified as two types, one is the the `BASE` type, it represents the full amount of data (Maybe AOF or RDB format) after each AOFRW, there is only one `BASE` file at most. The second is `INCR` type, may have more than one. They represent the incremental commands since the last AOFRW. 3. Use a AOF manifest file to record and manage these AOF files mentioned above. 4. The original configuration of `appendfilename` will be the base part of the new file name, for example: `appendonly.aof.1.base.rdb` and `appendonly.aof.2.incr.aof` 5. Add manifest-related TCL tests, and modified some existing tests that depend on the `appendfilename` 6. Remove the `aof_rewrite_buffer_length` field in info. 7. Add `aof-disable-auto-gc` configuration. By default we're automatically deleting HISTORY type AOFs. It also gives users the opportunity to preserve the history AOFs. just for testing use now. 8. Add AOFRW limiting measure. When the AOFRW failures reaches the threshold (3 times now), we will delay the execution of the next AOFRW by 1 minute. If the next AOFRW also fails, it will be delayed by 2 minutes. The next is 4, 8, 16, the maximum delay is 60 minutes (1 hour). During the limit period, we can still use the 'bgrewriteaof' command to execute AOFRW immediately. 9. Support upgrade (load) data from old version redis. 10. Add `appenddirname` configuration, as the directory name of the append only files. All AOF files and manifest file will be placed in this directory. 11. Only the last AOF file (BASE or INCR) can be truncated. Otherwise redis will exit even if `aof-load-truncated` is enabled. Co-authored-by: Oran Agra <oran@redislabs.com>
2022-01-04 01:14:13 +08:00
} else if (ai->file_type == AOF_FILE_TYPE_INCR) {
if (ai->file_seq <= maxseq) {
err = "Found a non-monotonic sequence number";
goto loaderr;
}
listAddNodeTail(am->incr_aof_list, ai);
am->curr_incr_file_seq = ai->file_seq;
Implement Multi Part AOF mechanism to avoid AOFRW overheads. (#9788) Implement Multi-Part AOF mechanism to avoid overheads during AOFRW. Introducing a folder with multiple AOF files tracked by a manifest file. The main issues with the the original AOFRW mechanism are: * buffering of commands that are processed during rewrite (consuming a lot of RAM) * freezes of the main process when the AOFRW completes to drain the remaining part of the buffer and fsync it. * double disk IO for the data that arrives during AOFRW (had to be written to both the old and new AOF files) The main modifications of this PR: 1. Remove the AOF rewrite buffer and related code. 2. Divide the AOF into multiple files, they are classified as two types, one is the the `BASE` type, it represents the full amount of data (Maybe AOF or RDB format) after each AOFRW, there is only one `BASE` file at most. The second is `INCR` type, may have more than one. They represent the incremental commands since the last AOFRW. 3. Use a AOF manifest file to record and manage these AOF files mentioned above. 4. The original configuration of `appendfilename` will be the base part of the new file name, for example: `appendonly.aof.1.base.rdb` and `appendonly.aof.2.incr.aof` 5. Add manifest-related TCL tests, and modified some existing tests that depend on the `appendfilename` 6. Remove the `aof_rewrite_buffer_length` field in info. 7. Add `aof-disable-auto-gc` configuration. By default we're automatically deleting HISTORY type AOFs. It also gives users the opportunity to preserve the history AOFs. just for testing use now. 8. Add AOFRW limiting measure. When the AOFRW failures reaches the threshold (3 times now), we will delay the execution of the next AOFRW by 1 minute. If the next AOFRW also fails, it will be delayed by 2 minutes. The next is 4, 8, 16, the maximum delay is 60 minutes (1 hour). During the limit period, we can still use the 'bgrewriteaof' command to execute AOFRW immediately. 9. Support upgrade (load) data from old version redis. 10. Add `appenddirname` configuration, as the directory name of the append only files. All AOF files and manifest file will be placed in this directory. 11. Only the last AOF file (BASE or INCR) can be truncated. Otherwise redis will exit even if `aof-load-truncated` is enabled. Co-authored-by: Oran Agra <oran@redislabs.com>
2022-01-04 01:14:13 +08:00
maxseq = ai->file_seq;
} else {
err = "Unknown AOF file type";
goto loaderr;
}
sdsfree(line);
line = NULL;
ai = NULL;
}
Implement Multi Part AOF mechanism to avoid AOFRW overheads. (#9788) Implement Multi-Part AOF mechanism to avoid overheads during AOFRW. Introducing a folder with multiple AOF files tracked by a manifest file. The main issues with the the original AOFRW mechanism are: * buffering of commands that are processed during rewrite (consuming a lot of RAM) * freezes of the main process when the AOFRW completes to drain the remaining part of the buffer and fsync it. * double disk IO for the data that arrives during AOFRW (had to be written to both the old and new AOF files) The main modifications of this PR: 1. Remove the AOF rewrite buffer and related code. 2. Divide the AOF into multiple files, they are classified as two types, one is the the `BASE` type, it represents the full amount of data (Maybe AOF or RDB format) after each AOFRW, there is only one `BASE` file at most. The second is `INCR` type, may have more than one. They represent the incremental commands since the last AOFRW. 3. Use a AOF manifest file to record and manage these AOF files mentioned above. 4. The original configuration of `appendfilename` will be the base part of the new file name, for example: `appendonly.aof.1.base.rdb` and `appendonly.aof.2.incr.aof` 5. Add manifest-related TCL tests, and modified some existing tests that depend on the `appendfilename` 6. Remove the `aof_rewrite_buffer_length` field in info. 7. Add `aof-disable-auto-gc` configuration. By default we're automatically deleting HISTORY type AOFs. It also gives users the opportunity to preserve the history AOFs. just for testing use now. 8. Add AOFRW limiting measure. When the AOFRW failures reaches the threshold (3 times now), we will delay the execution of the next AOFRW by 1 minute. If the next AOFRW also fails, it will be delayed by 2 minutes. The next is 4, 8, 16, the maximum delay is 60 minutes (1 hour). During the limit period, we can still use the 'bgrewriteaof' command to execute AOFRW immediately. 9. Support upgrade (load) data from old version redis. 10. Add `appenddirname` configuration, as the directory name of the append only files. All AOF files and manifest file will be placed in this directory. 11. Only the last AOF file (BASE or INCR) can be truncated. Otherwise redis will exit even if `aof-load-truncated` is enabled. Co-authored-by: Oran Agra <oran@redislabs.com>
2022-01-04 01:14:13 +08:00
fclose(fp);
return am;
Implement Multi Part AOF mechanism to avoid AOFRW overheads. (#9788) Implement Multi-Part AOF mechanism to avoid overheads during AOFRW. Introducing a folder with multiple AOF files tracked by a manifest file. The main issues with the the original AOFRW mechanism are: * buffering of commands that are processed during rewrite (consuming a lot of RAM) * freezes of the main process when the AOFRW completes to drain the remaining part of the buffer and fsync it. * double disk IO for the data that arrives during AOFRW (had to be written to both the old and new AOF files) The main modifications of this PR: 1. Remove the AOF rewrite buffer and related code. 2. Divide the AOF into multiple files, they are classified as two types, one is the the `BASE` type, it represents the full amount of data (Maybe AOF or RDB format) after each AOFRW, there is only one `BASE` file at most. The second is `INCR` type, may have more than one. They represent the incremental commands since the last AOFRW. 3. Use a AOF manifest file to record and manage these AOF files mentioned above. 4. The original configuration of `appendfilename` will be the base part of the new file name, for example: `appendonly.aof.1.base.rdb` and `appendonly.aof.2.incr.aof` 5. Add manifest-related TCL tests, and modified some existing tests that depend on the `appendfilename` 6. Remove the `aof_rewrite_buffer_length` field in info. 7. Add `aof-disable-auto-gc` configuration. By default we're automatically deleting HISTORY type AOFs. It also gives users the opportunity to preserve the history AOFs. just for testing use now. 8. Add AOFRW limiting measure. When the AOFRW failures reaches the threshold (3 times now), we will delay the execution of the next AOFRW by 1 minute. If the next AOFRW also fails, it will be delayed by 2 minutes. The next is 4, 8, 16, the maximum delay is 60 minutes (1 hour). During the limit period, we can still use the 'bgrewriteaof' command to execute AOFRW immediately. 9. Support upgrade (load) data from old version redis. 10. Add `appenddirname` configuration, as the directory name of the append only files. All AOF files and manifest file will be placed in this directory. 11. Only the last AOF file (BASE or INCR) can be truncated. Otherwise redis will exit even if `aof-load-truncated` is enabled. Co-authored-by: Oran Agra <oran@redislabs.com>
2022-01-04 01:14:13 +08:00
loaderr:
/* Sanitizer suppression: may report a false positive if we goto loaderr
* and exit(1) without freeing these allocations. */
if (argv) sdsfreesplitres(argv, argc);
if (ai) aofInfoFree(ai);
serverLog(LL_WARNING, "\n*** FATAL AOF MANIFEST FILE ERROR ***\n");
if (line) {
serverLog(LL_WARNING, "Reading the manifest file, at line %d\n", linenum);
serverLog(LL_WARNING, ">>> '%s'\n", line);
}
serverLog(LL_WARNING, "%s\n", err);
exit(1);
}
/* Deep copy an aofManifest from orig.
*
* In `backgroundRewriteDoneHandler` and `openNewIncrAofForAppend`, we will
* first deep copy a temporary AOF manifest from the `server.aof_manifest` and
* try to modify it. Once everything is modified, we will atomically make the
* `server.aof_manifest` point to this temporary aof_manifest.
*/
aofManifest *aofManifestDup(aofManifest *orig) {
serverAssert(orig != NULL);
aofManifest *am = zcalloc(sizeof(aofManifest));
am->curr_base_file_seq = orig->curr_base_file_seq;
am->curr_incr_file_seq = orig->curr_incr_file_seq;
am->dirty = orig->dirty;
if (orig->base_aof_info) {
am->base_aof_info = aofInfoDup(orig->base_aof_info);
}
am->incr_aof_list = listDup(orig->incr_aof_list);
am->history_aof_list = listDup(orig->history_aof_list);
serverAssert(am->incr_aof_list != NULL);
serverAssert(am->history_aof_list != NULL);
return am;
}
/* Change the `server.aof_manifest` pointer to 'am' and free the previous
* one if we have. */
void aofManifestFreeAndUpdate(aofManifest *am) {
serverAssert(am != NULL);
if (server.aof_manifest) aofManifestFree(server.aof_manifest);
server.aof_manifest = am;
}
/* Called in `backgroundRewriteDoneHandler` to get a new BASE file
* name, and mark the previous (if we have) BASE file as HISTORY type.
*
* BASE file naming rules: `server.aof_filename`.seq.base.format
*
* for example:
* appendonly.aof.1.base.aof (server.aof_use_rdb_preamble is no)
* appendonly.aof.1.base.rdb (server.aof_use_rdb_preamble is yes)
*/
sds getNewBaseFileNameAndMarkPreAsHistory(aofManifest *am) {
serverAssert(am != NULL);
if (am->base_aof_info) {
serverAssert(am->base_aof_info->file_type == AOF_FILE_TYPE_BASE);
am->base_aof_info->file_type = AOF_FILE_TYPE_HIST;
listAddNodeHead(am->history_aof_list, am->base_aof_info);
}
char *format_suffix = server.aof_use_rdb_preamble ?
RDB_FORMAT_SUFFIX:AOF_FORMAT_SUFFIX;
aofInfo *ai = aofInfoCreate();
ai->file_name = sdscatprintf(sdsempty(), "%s.%lld%s%s", server.aof_filename,
++am->curr_base_file_seq, BASE_FILE_SUFFIX, format_suffix);
ai->file_seq = am->curr_base_file_seq;
ai->file_type = AOF_FILE_TYPE_BASE;
am->base_aof_info = ai;
am->dirty = 1;
return am->base_aof_info->file_name;
}
Implement Multi Part AOF mechanism to avoid AOFRW overheads. (#9788) Implement Multi-Part AOF mechanism to avoid overheads during AOFRW. Introducing a folder with multiple AOF files tracked by a manifest file. The main issues with the the original AOFRW mechanism are: * buffering of commands that are processed during rewrite (consuming a lot of RAM) * freezes of the main process when the AOFRW completes to drain the remaining part of the buffer and fsync it. * double disk IO for the data that arrives during AOFRW (had to be written to both the old and new AOF files) The main modifications of this PR: 1. Remove the AOF rewrite buffer and related code. 2. Divide the AOF into multiple files, they are classified as two types, one is the the `BASE` type, it represents the full amount of data (Maybe AOF or RDB format) after each AOFRW, there is only one `BASE` file at most. The second is `INCR` type, may have more than one. They represent the incremental commands since the last AOFRW. 3. Use a AOF manifest file to record and manage these AOF files mentioned above. 4. The original configuration of `appendfilename` will be the base part of the new file name, for example: `appendonly.aof.1.base.rdb` and `appendonly.aof.2.incr.aof` 5. Add manifest-related TCL tests, and modified some existing tests that depend on the `appendfilename` 6. Remove the `aof_rewrite_buffer_length` field in info. 7. Add `aof-disable-auto-gc` configuration. By default we're automatically deleting HISTORY type AOFs. It also gives users the opportunity to preserve the history AOFs. just for testing use now. 8. Add AOFRW limiting measure. When the AOFRW failures reaches the threshold (3 times now), we will delay the execution of the next AOFRW by 1 minute. If the next AOFRW also fails, it will be delayed by 2 minutes. The next is 4, 8, 16, the maximum delay is 60 minutes (1 hour). During the limit period, we can still use the 'bgrewriteaof' command to execute AOFRW immediately. 9. Support upgrade (load) data from old version redis. 10. Add `appenddirname` configuration, as the directory name of the append only files. All AOF files and manifest file will be placed in this directory. 11. Only the last AOF file (BASE or INCR) can be truncated. Otherwise redis will exit even if `aof-load-truncated` is enabled. Co-authored-by: Oran Agra <oran@redislabs.com>
2022-01-04 01:14:13 +08:00
/* Get a new INCR type AOF name.
*
Implement Multi Part AOF mechanism to avoid AOFRW overheads. (#9788) Implement Multi-Part AOF mechanism to avoid overheads during AOFRW. Introducing a folder with multiple AOF files tracked by a manifest file. The main issues with the the original AOFRW mechanism are: * buffering of commands that are processed during rewrite (consuming a lot of RAM) * freezes of the main process when the AOFRW completes to drain the remaining part of the buffer and fsync it. * double disk IO for the data that arrives during AOFRW (had to be written to both the old and new AOF files) The main modifications of this PR: 1. Remove the AOF rewrite buffer and related code. 2. Divide the AOF into multiple files, they are classified as two types, one is the the `BASE` type, it represents the full amount of data (Maybe AOF or RDB format) after each AOFRW, there is only one `BASE` file at most. The second is `INCR` type, may have more than one. They represent the incremental commands since the last AOFRW. 3. Use a AOF manifest file to record and manage these AOF files mentioned above. 4. The original configuration of `appendfilename` will be the base part of the new file name, for example: `appendonly.aof.1.base.rdb` and `appendonly.aof.2.incr.aof` 5. Add manifest-related TCL tests, and modified some existing tests that depend on the `appendfilename` 6. Remove the `aof_rewrite_buffer_length` field in info. 7. Add `aof-disable-auto-gc` configuration. By default we're automatically deleting HISTORY type AOFs. It also gives users the opportunity to preserve the history AOFs. just for testing use now. 8. Add AOFRW limiting measure. When the AOFRW failures reaches the threshold (3 times now), we will delay the execution of the next AOFRW by 1 minute. If the next AOFRW also fails, it will be delayed by 2 minutes. The next is 4, 8, 16, the maximum delay is 60 minutes (1 hour). During the limit period, we can still use the 'bgrewriteaof' command to execute AOFRW immediately. 9. Support upgrade (load) data from old version redis. 10. Add `appenddirname` configuration, as the directory name of the append only files. All AOF files and manifest file will be placed in this directory. 11. Only the last AOF file (BASE or INCR) can be truncated. Otherwise redis will exit even if `aof-load-truncated` is enabled. Co-authored-by: Oran Agra <oran@redislabs.com>
2022-01-04 01:14:13 +08:00
* INCR AOF naming rules: `server.aof_filename`.seq.incr.aof
*
* for example:
* appendonly.aof.1.incr.aof
*/
sds getNewIncrAofName(aofManifest *am) {
aofInfo *ai = aofInfoCreate();
ai->file_type = AOF_FILE_TYPE_INCR;
ai->file_name = sdscatprintf(sdsempty(), "%s.%lld%s%s", server.aof_filename,
++am->curr_incr_file_seq, INCR_FILE_SUFFIX, AOF_FORMAT_SUFFIX);
ai->file_seq = am->curr_incr_file_seq;
listAddNodeTail(am->incr_aof_list, ai);
am->dirty = 1;
return ai->file_name;
}
Allow an AOF rewrite buffer > 2GB (Fix for issue #504). During the AOF rewrite process, the parent process needs to accumulate the new writes in an in-memory buffer: when the child will terminate the AOF rewriting process this buffer (that ist the difference between the dataset when the rewrite was started, and the current dataset) is flushed to the new AOF file. We used to implement this buffer using an sds.c string, but sds.c has a 2GB limit. Sometimes the dataset can be big enough, the amount of writes so high, and the rewrite process slow enough that we overflow the 2GB limit, causing a crash, documented on github by issue #504. In order to prevent this from happening, this commit introduces a new system to accumulate writes, implemented by a linked list of blocks of 10 MB each, so that we also avoid paying the reallocation cost. Note that theoretically modern operating systems may implement realloc() simply as a remaping of the old pages, thus with very good performances, see for instance the mremap() syscall on Linux. However this is not always true, and jemalloc by default avoids doing this because there are issues with the current implementation of mremap(). For this reason we are using a linked list of blocks instead of a single block that gets reallocated again and again. The changes in this commit lacks testing, that will be performed before merging into the unstable branch. This fix will not enter 2.4 because it is too invasive. However 2.4 will log a warning when the AOF rewrite buffer is near to the 2GB limit.
2012-05-22 13:03:41 +02:00
Implement Multi Part AOF mechanism to avoid AOFRW overheads. (#9788) Implement Multi-Part AOF mechanism to avoid overheads during AOFRW. Introducing a folder with multiple AOF files tracked by a manifest file. The main issues with the the original AOFRW mechanism are: * buffering of commands that are processed during rewrite (consuming a lot of RAM) * freezes of the main process when the AOFRW completes to drain the remaining part of the buffer and fsync it. * double disk IO for the data that arrives during AOFRW (had to be written to both the old and new AOF files) The main modifications of this PR: 1. Remove the AOF rewrite buffer and related code. 2. Divide the AOF into multiple files, they are classified as two types, one is the the `BASE` type, it represents the full amount of data (Maybe AOF or RDB format) after each AOFRW, there is only one `BASE` file at most. The second is `INCR` type, may have more than one. They represent the incremental commands since the last AOFRW. 3. Use a AOF manifest file to record and manage these AOF files mentioned above. 4. The original configuration of `appendfilename` will be the base part of the new file name, for example: `appendonly.aof.1.base.rdb` and `appendonly.aof.2.incr.aof` 5. Add manifest-related TCL tests, and modified some existing tests that depend on the `appendfilename` 6. Remove the `aof_rewrite_buffer_length` field in info. 7. Add `aof-disable-auto-gc` configuration. By default we're automatically deleting HISTORY type AOFs. It also gives users the opportunity to preserve the history AOFs. just for testing use now. 8. Add AOFRW limiting measure. When the AOFRW failures reaches the threshold (3 times now), we will delay the execution of the next AOFRW by 1 minute. If the next AOFRW also fails, it will be delayed by 2 minutes. The next is 4, 8, 16, the maximum delay is 60 minutes (1 hour). During the limit period, we can still use the 'bgrewriteaof' command to execute AOFRW immediately. 9. Support upgrade (load) data from old version redis. 10. Add `appenddirname` configuration, as the directory name of the append only files. All AOF files and manifest file will be placed in this directory. 11. Only the last AOF file (BASE or INCR) can be truncated. Otherwise redis will exit even if `aof-load-truncated` is enabled. Co-authored-by: Oran Agra <oran@redislabs.com>
2022-01-04 01:14:13 +08:00
/* Get the last INCR AOF name or create a new one. */
sds getLastIncrAofName(aofManifest *am) {
serverAssert(am != NULL);
/* If 'incr_aof_list' is empty, just create a new one. */
if (!listLength(am->incr_aof_list)) {
return getNewIncrAofName(am);
}
/* Or return the last one. */
listNode *lastnode = listIndex(am->incr_aof_list, -1);
aofInfo *ai = listNodeValue(lastnode);
return ai->file_name;
}
/* Called in `backgroundRewriteDoneHandler`. when AOFRW success, This
* function will change the AOF file type in 'incr_aof_list' from
* AOF_FILE_TYPE_INCR to AOF_FILE_TYPE_HIST, and move them to the
* 'history_aof_list'.
*/
void markRewrittenIncrAofAsHistory(aofManifest *am) {
serverAssert(am != NULL);
if (!listLength(am->incr_aof_list)) {
return;
}
listNode *ln;
listIter li;
listRewindTail(am->incr_aof_list, &li);
/* "server.aof_fd != -1" means AOF enabled, then we must skip the
* last AOF, because this file is our currently writing. */
if (server.aof_fd != -1) {
ln = listNext(&li);
serverAssert(ln != NULL);
}
/* Move aofInfo from 'incr_aof_list' to 'history_aof_list'. */
while ((ln = listNext(&li)) != NULL) {
aofInfo *ai = (aofInfo*)ln->value;
serverAssert(ai->file_type == AOF_FILE_TYPE_INCR);
aofInfo *hai = aofInfoDup(ai);
hai->file_type = AOF_FILE_TYPE_HIST;
listAddNodeHead(am->history_aof_list, hai);
listDelNode(am->incr_aof_list, ln);
}
am->dirty = 1;
}
/* Write the formatted manifest string to disk. */
int writeAofManifestFile(sds buf) {
int ret = C_OK;
ssize_t nwritten;
int len;
sds am_name = getAofManifestFileName();
sds am_filepath = makePath(server.aof_dirname, am_name);
sds tmp_am_name = getTempAofManifestFileName();
sds tmp_am_filepath = makePath(server.aof_dirname, tmp_am_name);
int fd = open(tmp_am_filepath, O_WRONLY|O_TRUNC|O_CREAT, 0644);
if (fd == -1) {
serverLog(LL_WARNING, "Can't open the AOF manifest file %s: %s",
tmp_am_name, strerror(errno));
ret = C_ERR;
goto cleanup;
}
len = sdslen(buf);
Allow an AOF rewrite buffer > 2GB (Fix for issue #504). During the AOF rewrite process, the parent process needs to accumulate the new writes in an in-memory buffer: when the child will terminate the AOF rewriting process this buffer (that ist the difference between the dataset when the rewrite was started, and the current dataset) is flushed to the new AOF file. We used to implement this buffer using an sds.c string, but sds.c has a 2GB limit. Sometimes the dataset can be big enough, the amount of writes so high, and the rewrite process slow enough that we overflow the 2GB limit, causing a crash, documented on github by issue #504. In order to prevent this from happening, this commit introduces a new system to accumulate writes, implemented by a linked list of blocks of 10 MB each, so that we also avoid paying the reallocation cost. Note that theoretically modern operating systems may implement realloc() simply as a remaping of the old pages, thus with very good performances, see for instance the mremap() syscall on Linux. However this is not always true, and jemalloc by default avoids doing this because there are issues with the current implementation of mremap(). For this reason we are using a linked list of blocks instead of a single block that gets reallocated again and again. The changes in this commit lacks testing, that will be performed before merging into the unstable branch. This fix will not enter 2.4 because it is too invasive. However 2.4 will log a warning when the AOF rewrite buffer is near to the 2GB limit.
2012-05-22 13:03:41 +02:00
while(len) {
Implement Multi Part AOF mechanism to avoid AOFRW overheads. (#9788) Implement Multi-Part AOF mechanism to avoid overheads during AOFRW. Introducing a folder with multiple AOF files tracked by a manifest file. The main issues with the the original AOFRW mechanism are: * buffering of commands that are processed during rewrite (consuming a lot of RAM) * freezes of the main process when the AOFRW completes to drain the remaining part of the buffer and fsync it. * double disk IO for the data that arrives during AOFRW (had to be written to both the old and new AOF files) The main modifications of this PR: 1. Remove the AOF rewrite buffer and related code. 2. Divide the AOF into multiple files, they are classified as two types, one is the the `BASE` type, it represents the full amount of data (Maybe AOF or RDB format) after each AOFRW, there is only one `BASE` file at most. The second is `INCR` type, may have more than one. They represent the incremental commands since the last AOFRW. 3. Use a AOF manifest file to record and manage these AOF files mentioned above. 4. The original configuration of `appendfilename` will be the base part of the new file name, for example: `appendonly.aof.1.base.rdb` and `appendonly.aof.2.incr.aof` 5. Add manifest-related TCL tests, and modified some existing tests that depend on the `appendfilename` 6. Remove the `aof_rewrite_buffer_length` field in info. 7. Add `aof-disable-auto-gc` configuration. By default we're automatically deleting HISTORY type AOFs. It also gives users the opportunity to preserve the history AOFs. just for testing use now. 8. Add AOFRW limiting measure. When the AOFRW failures reaches the threshold (3 times now), we will delay the execution of the next AOFRW by 1 minute. If the next AOFRW also fails, it will be delayed by 2 minutes. The next is 4, 8, 16, the maximum delay is 60 minutes (1 hour). During the limit period, we can still use the 'bgrewriteaof' command to execute AOFRW immediately. 9. Support upgrade (load) data from old version redis. 10. Add `appenddirname` configuration, as the directory name of the append only files. All AOF files and manifest file will be placed in this directory. 11. Only the last AOF file (BASE or INCR) can be truncated. Otherwise redis will exit even if `aof-load-truncated` is enabled. Co-authored-by: Oran Agra <oran@redislabs.com>
2022-01-04 01:14:13 +08:00
nwritten = write(fd, buf, len);
Allow an AOF rewrite buffer > 2GB (Fix for issue #504). During the AOF rewrite process, the parent process needs to accumulate the new writes in an in-memory buffer: when the child will terminate the AOF rewriting process this buffer (that ist the difference between the dataset when the rewrite was started, and the current dataset) is flushed to the new AOF file. We used to implement this buffer using an sds.c string, but sds.c has a 2GB limit. Sometimes the dataset can be big enough, the amount of writes so high, and the rewrite process slow enough that we overflow the 2GB limit, causing a crash, documented on github by issue #504. In order to prevent this from happening, this commit introduces a new system to accumulate writes, implemented by a linked list of blocks of 10 MB each, so that we also avoid paying the reallocation cost. Note that theoretically modern operating systems may implement realloc() simply as a remaping of the old pages, thus with very good performances, see for instance the mremap() syscall on Linux. However this is not always true, and jemalloc by default avoids doing this because there are issues with the current implementation of mremap(). For this reason we are using a linked list of blocks instead of a single block that gets reallocated again and again. The changes in this commit lacks testing, that will be performed before merging into the unstable branch. This fix will not enter 2.4 because it is too invasive. However 2.4 will log a warning when the AOF rewrite buffer is near to the 2GB limit.
2012-05-22 13:03:41 +02:00
Implement Multi Part AOF mechanism to avoid AOFRW overheads. (#9788) Implement Multi-Part AOF mechanism to avoid overheads during AOFRW. Introducing a folder with multiple AOF files tracked by a manifest file. The main issues with the the original AOFRW mechanism are: * buffering of commands that are processed during rewrite (consuming a lot of RAM) * freezes of the main process when the AOFRW completes to drain the remaining part of the buffer and fsync it. * double disk IO for the data that arrives during AOFRW (had to be written to both the old and new AOF files) The main modifications of this PR: 1. Remove the AOF rewrite buffer and related code. 2. Divide the AOF into multiple files, they are classified as two types, one is the the `BASE` type, it represents the full amount of data (Maybe AOF or RDB format) after each AOFRW, there is only one `BASE` file at most. The second is `INCR` type, may have more than one. They represent the incremental commands since the last AOFRW. 3. Use a AOF manifest file to record and manage these AOF files mentioned above. 4. The original configuration of `appendfilename` will be the base part of the new file name, for example: `appendonly.aof.1.base.rdb` and `appendonly.aof.2.incr.aof` 5. Add manifest-related TCL tests, and modified some existing tests that depend on the `appendfilename` 6. Remove the `aof_rewrite_buffer_length` field in info. 7. Add `aof-disable-auto-gc` configuration. By default we're automatically deleting HISTORY type AOFs. It also gives users the opportunity to preserve the history AOFs. just for testing use now. 8. Add AOFRW limiting measure. When the AOFRW failures reaches the threshold (3 times now), we will delay the execution of the next AOFRW by 1 minute. If the next AOFRW also fails, it will be delayed by 2 minutes. The next is 4, 8, 16, the maximum delay is 60 minutes (1 hour). During the limit period, we can still use the 'bgrewriteaof' command to execute AOFRW immediately. 9. Support upgrade (load) data from old version redis. 10. Add `appenddirname` configuration, as the directory name of the append only files. All AOF files and manifest file will be placed in this directory. 11. Only the last AOF file (BASE or INCR) can be truncated. Otherwise redis will exit even if `aof-load-truncated` is enabled. Co-authored-by: Oran Agra <oran@redislabs.com>
2022-01-04 01:14:13 +08:00
if (nwritten < 0) {
if (errno == EINTR) continue;
serverLog(LL_WARNING, "Error trying to write the temporary AOF manifest file %s: %s",
tmp_am_name, strerror(errno));
ret = C_ERR;
goto cleanup;
Allow an AOF rewrite buffer > 2GB (Fix for issue #504). During the AOF rewrite process, the parent process needs to accumulate the new writes in an in-memory buffer: when the child will terminate the AOF rewriting process this buffer (that ist the difference between the dataset when the rewrite was started, and the current dataset) is flushed to the new AOF file. We used to implement this buffer using an sds.c string, but sds.c has a 2GB limit. Sometimes the dataset can be big enough, the amount of writes so high, and the rewrite process slow enough that we overflow the 2GB limit, causing a crash, documented on github by issue #504. In order to prevent this from happening, this commit introduces a new system to accumulate writes, implemented by a linked list of blocks of 10 MB each, so that we also avoid paying the reallocation cost. Note that theoretically modern operating systems may implement realloc() simply as a remaping of the old pages, thus with very good performances, see for instance the mremap() syscall on Linux. However this is not always true, and jemalloc by default avoids doing this because there are issues with the current implementation of mremap(). For this reason we are using a linked list of blocks instead of a single block that gets reallocated again and again. The changes in this commit lacks testing, that will be performed before merging into the unstable branch. This fix will not enter 2.4 because it is too invasive. However 2.4 will log a warning when the AOF rewrite buffer is near to the 2GB limit.
2012-05-22 13:03:41 +02:00
}
Implement Multi Part AOF mechanism to avoid AOFRW overheads. (#9788) Implement Multi-Part AOF mechanism to avoid overheads during AOFRW. Introducing a folder with multiple AOF files tracked by a manifest file. The main issues with the the original AOFRW mechanism are: * buffering of commands that are processed during rewrite (consuming a lot of RAM) * freezes of the main process when the AOFRW completes to drain the remaining part of the buffer and fsync it. * double disk IO for the data that arrives during AOFRW (had to be written to both the old and new AOF files) The main modifications of this PR: 1. Remove the AOF rewrite buffer and related code. 2. Divide the AOF into multiple files, they are classified as two types, one is the the `BASE` type, it represents the full amount of data (Maybe AOF or RDB format) after each AOFRW, there is only one `BASE` file at most. The second is `INCR` type, may have more than one. They represent the incremental commands since the last AOFRW. 3. Use a AOF manifest file to record and manage these AOF files mentioned above. 4. The original configuration of `appendfilename` will be the base part of the new file name, for example: `appendonly.aof.1.base.rdb` and `appendonly.aof.2.incr.aof` 5. Add manifest-related TCL tests, and modified some existing tests that depend on the `appendfilename` 6. Remove the `aof_rewrite_buffer_length` field in info. 7. Add `aof-disable-auto-gc` configuration. By default we're automatically deleting HISTORY type AOFs. It also gives users the opportunity to preserve the history AOFs. just for testing use now. 8. Add AOFRW limiting measure. When the AOFRW failures reaches the threshold (3 times now), we will delay the execution of the next AOFRW by 1 minute. If the next AOFRW also fails, it will be delayed by 2 minutes. The next is 4, 8, 16, the maximum delay is 60 minutes (1 hour). During the limit period, we can still use the 'bgrewriteaof' command to execute AOFRW immediately. 9. Support upgrade (load) data from old version redis. 10. Add `appenddirname` configuration, as the directory name of the append only files. All AOF files and manifest file will be placed in this directory. 11. Only the last AOF file (BASE or INCR) can be truncated. Otherwise redis will exit even if `aof-load-truncated` is enabled. Co-authored-by: Oran Agra <oran@redislabs.com>
2022-01-04 01:14:13 +08:00
len -= nwritten;
buf += nwritten;
Allow an AOF rewrite buffer > 2GB (Fix for issue #504). During the AOF rewrite process, the parent process needs to accumulate the new writes in an in-memory buffer: when the child will terminate the AOF rewriting process this buffer (that ist the difference between the dataset when the rewrite was started, and the current dataset) is flushed to the new AOF file. We used to implement this buffer using an sds.c string, but sds.c has a 2GB limit. Sometimes the dataset can be big enough, the amount of writes so high, and the rewrite process slow enough that we overflow the 2GB limit, causing a crash, documented on github by issue #504. In order to prevent this from happening, this commit introduces a new system to accumulate writes, implemented by a linked list of blocks of 10 MB each, so that we also avoid paying the reallocation cost. Note that theoretically modern operating systems may implement realloc() simply as a remaping of the old pages, thus with very good performances, see for instance the mremap() syscall on Linux. However this is not always true, and jemalloc by default avoids doing this because there are issues with the current implementation of mremap(). For this reason we are using a linked list of blocks instead of a single block that gets reallocated again and again. The changes in this commit lacks testing, that will be performed before merging into the unstable branch. This fix will not enter 2.4 because it is too invasive. However 2.4 will log a warning when the AOF rewrite buffer is near to the 2GB limit.
2012-05-22 13:03:41 +02:00
}
Implement Multi Part AOF mechanism to avoid AOFRW overheads. (#9788) Implement Multi-Part AOF mechanism to avoid overheads during AOFRW. Introducing a folder with multiple AOF files tracked by a manifest file. The main issues with the the original AOFRW mechanism are: * buffering of commands that are processed during rewrite (consuming a lot of RAM) * freezes of the main process when the AOFRW completes to drain the remaining part of the buffer and fsync it. * double disk IO for the data that arrives during AOFRW (had to be written to both the old and new AOF files) The main modifications of this PR: 1. Remove the AOF rewrite buffer and related code. 2. Divide the AOF into multiple files, they are classified as two types, one is the the `BASE` type, it represents the full amount of data (Maybe AOF or RDB format) after each AOFRW, there is only one `BASE` file at most. The second is `INCR` type, may have more than one. They represent the incremental commands since the last AOFRW. 3. Use a AOF manifest file to record and manage these AOF files mentioned above. 4. The original configuration of `appendfilename` will be the base part of the new file name, for example: `appendonly.aof.1.base.rdb` and `appendonly.aof.2.incr.aof` 5. Add manifest-related TCL tests, and modified some existing tests that depend on the `appendfilename` 6. Remove the `aof_rewrite_buffer_length` field in info. 7. Add `aof-disable-auto-gc` configuration. By default we're automatically deleting HISTORY type AOFs. It also gives users the opportunity to preserve the history AOFs. just for testing use now. 8. Add AOFRW limiting measure. When the AOFRW failures reaches the threshold (3 times now), we will delay the execution of the next AOFRW by 1 minute. If the next AOFRW also fails, it will be delayed by 2 minutes. The next is 4, 8, 16, the maximum delay is 60 minutes (1 hour). During the limit period, we can still use the 'bgrewriteaof' command to execute AOFRW immediately. 9. Support upgrade (load) data from old version redis. 10. Add `appenddirname` configuration, as the directory name of the append only files. All AOF files and manifest file will be placed in this directory. 11. Only the last AOF file (BASE or INCR) can be truncated. Otherwise redis will exit even if `aof-load-truncated` is enabled. Co-authored-by: Oran Agra <oran@redislabs.com>
2022-01-04 01:14:13 +08:00
if (redis_fsync(fd) == -1) {
serverLog(LL_WARNING, "Fail to fsync the temp AOF file %s: %s.",
tmp_am_name, strerror(errno));
ret = C_ERR;
goto cleanup;
}
if (rename(tmp_am_filepath, am_filepath) != 0) {
serverLog(LL_WARNING,
"Error trying to rename the temporary AOF manifest file %s into %s: %s",
tmp_am_name, am_name, strerror(errno));
ret = C_ERR;
}
Implement Multi Part AOF mechanism to avoid AOFRW overheads. (#9788) Implement Multi-Part AOF mechanism to avoid overheads during AOFRW. Introducing a folder with multiple AOF files tracked by a manifest file. The main issues with the the original AOFRW mechanism are: * buffering of commands that are processed during rewrite (consuming a lot of RAM) * freezes of the main process when the AOFRW completes to drain the remaining part of the buffer and fsync it. * double disk IO for the data that arrives during AOFRW (had to be written to both the old and new AOF files) The main modifications of this PR: 1. Remove the AOF rewrite buffer and related code. 2. Divide the AOF into multiple files, they are classified as two types, one is the the `BASE` type, it represents the full amount of data (Maybe AOF or RDB format) after each AOFRW, there is only one `BASE` file at most. The second is `INCR` type, may have more than one. They represent the incremental commands since the last AOFRW. 3. Use a AOF manifest file to record and manage these AOF files mentioned above. 4. The original configuration of `appendfilename` will be the base part of the new file name, for example: `appendonly.aof.1.base.rdb` and `appendonly.aof.2.incr.aof` 5. Add manifest-related TCL tests, and modified some existing tests that depend on the `appendfilename` 6. Remove the `aof_rewrite_buffer_length` field in info. 7. Add `aof-disable-auto-gc` configuration. By default we're automatically deleting HISTORY type AOFs. It also gives users the opportunity to preserve the history AOFs. just for testing use now. 8. Add AOFRW limiting measure. When the AOFRW failures reaches the threshold (3 times now), we will delay the execution of the next AOFRW by 1 minute. If the next AOFRW also fails, it will be delayed by 2 minutes. The next is 4, 8, 16, the maximum delay is 60 minutes (1 hour). During the limit period, we can still use the 'bgrewriteaof' command to execute AOFRW immediately. 9. Support upgrade (load) data from old version redis. 10. Add `appenddirname` configuration, as the directory name of the append only files. All AOF files and manifest file will be placed in this directory. 11. Only the last AOF file (BASE or INCR) can be truncated. Otherwise redis will exit even if `aof-load-truncated` is enabled. Co-authored-by: Oran Agra <oran@redislabs.com>
2022-01-04 01:14:13 +08:00
cleanup:
if (fd != -1) close(fd);
sdsfree(am_name);
sdsfree(am_filepath);
sdsfree(tmp_am_name);
sdsfree(tmp_am_filepath);
return ret;
Allow an AOF rewrite buffer > 2GB (Fix for issue #504). During the AOF rewrite process, the parent process needs to accumulate the new writes in an in-memory buffer: when the child will terminate the AOF rewriting process this buffer (that ist the difference between the dataset when the rewrite was started, and the current dataset) is flushed to the new AOF file. We used to implement this buffer using an sds.c string, but sds.c has a 2GB limit. Sometimes the dataset can be big enough, the amount of writes so high, and the rewrite process slow enough that we overflow the 2GB limit, causing a crash, documented on github by issue #504. In order to prevent this from happening, this commit introduces a new system to accumulate writes, implemented by a linked list of blocks of 10 MB each, so that we also avoid paying the reallocation cost. Note that theoretically modern operating systems may implement realloc() simply as a remaping of the old pages, thus with very good performances, see for instance the mremap() syscall on Linux. However this is not always true, and jemalloc by default avoids doing this because there are issues with the current implementation of mremap(). For this reason we are using a linked list of blocks instead of a single block that gets reallocated again and again. The changes in this commit lacks testing, that will be performed before merging into the unstable branch. This fix will not enter 2.4 because it is too invasive. However 2.4 will log a warning when the AOF rewrite buffer is near to the 2GB limit.
2012-05-22 13:03:41 +02:00
}
Implement Multi Part AOF mechanism to avoid AOFRW overheads. (#9788) Implement Multi-Part AOF mechanism to avoid overheads during AOFRW. Introducing a folder with multiple AOF files tracked by a manifest file. The main issues with the the original AOFRW mechanism are: * buffering of commands that are processed during rewrite (consuming a lot of RAM) * freezes of the main process when the AOFRW completes to drain the remaining part of the buffer and fsync it. * double disk IO for the data that arrives during AOFRW (had to be written to both the old and new AOF files) The main modifications of this PR: 1. Remove the AOF rewrite buffer and related code. 2. Divide the AOF into multiple files, they are classified as two types, one is the the `BASE` type, it represents the full amount of data (Maybe AOF or RDB format) after each AOFRW, there is only one `BASE` file at most. The second is `INCR` type, may have more than one. They represent the incremental commands since the last AOFRW. 3. Use a AOF manifest file to record and manage these AOF files mentioned above. 4. The original configuration of `appendfilename` will be the base part of the new file name, for example: `appendonly.aof.1.base.rdb` and `appendonly.aof.2.incr.aof` 5. Add manifest-related TCL tests, and modified some existing tests that depend on the `appendfilename` 6. Remove the `aof_rewrite_buffer_length` field in info. 7. Add `aof-disable-auto-gc` configuration. By default we're automatically deleting HISTORY type AOFs. It also gives users the opportunity to preserve the history AOFs. just for testing use now. 8. Add AOFRW limiting measure. When the AOFRW failures reaches the threshold (3 times now), we will delay the execution of the next AOFRW by 1 minute. If the next AOFRW also fails, it will be delayed by 2 minutes. The next is 4, 8, 16, the maximum delay is 60 minutes (1 hour). During the limit period, we can still use the 'bgrewriteaof' command to execute AOFRW immediately. 9. Support upgrade (load) data from old version redis. 10. Add `appenddirname` configuration, as the directory name of the append only files. All AOF files and manifest file will be placed in this directory. 11. Only the last AOF file (BASE or INCR) can be truncated. Otherwise redis will exit even if `aof-load-truncated` is enabled. Co-authored-by: Oran Agra <oran@redislabs.com>
2022-01-04 01:14:13 +08:00
/* Persist the aofManifest information pointed to by am to disk. */
int persistAofManifest(aofManifest *am) {
if (am->dirty == 0) {
return C_OK;
}
sds amstr = getAofManifestAsString(am);
int ret = writeAofManifestFile(amstr);
sdsfree(amstr);
if (ret == C_OK) am->dirty = 0;
return ret;
}
/* Called in `loadAppendOnlyFiles` when we upgrade from a old version redis.
*
* 1) Create AOF directory use 'server.aof_dirname' as the name.
* 2) Use 'server.aof_filename' to construct a BASE type aofInfo and add it to
* aofManifest, then persist the manifest file to AOF directory.
* 3) Move the old AOF file (server.aof_filename) to AOF directory.
*
* If any of the above steps fails or crash occurs, this will not cause any
* problems, and redis will retry the upgrade process when it restarts.
*/
void aofUpgradePrepare(aofManifest *am) {
serverAssert(!aofFileExist(server.aof_filename));
/* Create AOF directory use 'server.aof_dirname' as the name. */
if (dirCreateIfMissing(server.aof_dirname) == -1) {
serverLog(LL_WARNING, "Can't open or create append-only dir %s: %s",
server.aof_dirname, strerror(errno));
exit(1);
}
/* Manually construct a BASE type aofInfo and add it to aofManifest. */
if (am->base_aof_info) aofInfoFree(am->base_aof_info);
aofInfo *ai = aofInfoCreate();
ai->file_name = sdsnew(server.aof_filename);
ai->file_seq = 1;
ai->file_type = AOF_FILE_TYPE_BASE;
am->base_aof_info = ai;
am->curr_base_file_seq = 1;
am->dirty = 1;
/* Persist the manifest file to AOF directory. */
if (persistAofManifest(am) != C_OK) {
exit(1);
}
/* Move the old AOF file to AOF directory. */
sds aof_filepath = makePath(server.aof_dirname, server.aof_filename);
if (rename(server.aof_filename, aof_filepath) == -1) {
serverLog(LL_WARNING,
"Error trying to move the old AOF file %s into dir %s: %s",
server.aof_filename,
server.aof_dirname,
strerror(errno));
sdsfree(aof_filepath);
exit(1);
Implement Multi Part AOF mechanism to avoid AOFRW overheads. (#9788) Implement Multi-Part AOF mechanism to avoid overheads during AOFRW. Introducing a folder with multiple AOF files tracked by a manifest file. The main issues with the the original AOFRW mechanism are: * buffering of commands that are processed during rewrite (consuming a lot of RAM) * freezes of the main process when the AOFRW completes to drain the remaining part of the buffer and fsync it. * double disk IO for the data that arrives during AOFRW (had to be written to both the old and new AOF files) The main modifications of this PR: 1. Remove the AOF rewrite buffer and related code. 2. Divide the AOF into multiple files, they are classified as two types, one is the the `BASE` type, it represents the full amount of data (Maybe AOF or RDB format) after each AOFRW, there is only one `BASE` file at most. The second is `INCR` type, may have more than one. They represent the incremental commands since the last AOFRW. 3. Use a AOF manifest file to record and manage these AOF files mentioned above. 4. The original configuration of `appendfilename` will be the base part of the new file name, for example: `appendonly.aof.1.base.rdb` and `appendonly.aof.2.incr.aof` 5. Add manifest-related TCL tests, and modified some existing tests that depend on the `appendfilename` 6. Remove the `aof_rewrite_buffer_length` field in info. 7. Add `aof-disable-auto-gc` configuration. By default we're automatically deleting HISTORY type AOFs. It also gives users the opportunity to preserve the history AOFs. just for testing use now. 8. Add AOFRW limiting measure. When the AOFRW failures reaches the threshold (3 times now), we will delay the execution of the next AOFRW by 1 minute. If the next AOFRW also fails, it will be delayed by 2 minutes. The next is 4, 8, 16, the maximum delay is 60 minutes (1 hour). During the limit period, we can still use the 'bgrewriteaof' command to execute AOFRW immediately. 9. Support upgrade (load) data from old version redis. 10. Add `appenddirname` configuration, as the directory name of the append only files. All AOF files and manifest file will be placed in this directory. 11. Only the last AOF file (BASE or INCR) can be truncated. Otherwise redis will exit even if `aof-load-truncated` is enabled. Co-authored-by: Oran Agra <oran@redislabs.com>
2022-01-04 01:14:13 +08:00
}
sdsfree(aof_filepath);
serverLog(LL_NOTICE, "Successfully migrated an old-style AOF file (%s) into the AOF directory (%s).",
server.aof_filename, server.aof_dirname);
}
/* When AOFRW success, the previous BASE and INCR AOFs will
* become HISTORY type and be moved into 'history_aof_list'.
*
* The function will traverse the 'history_aof_list' and submit
* the delete task to the bio thread.
*/
int aofDelHistoryFiles(void) {
if (server.aof_manifest == NULL ||
server.aof_disable_auto_gc == 1 ||
!listLength(server.aof_manifest->history_aof_list))
{
return C_OK;
}
Allow an AOF rewrite buffer > 2GB (Fix for issue #504). During the AOF rewrite process, the parent process needs to accumulate the new writes in an in-memory buffer: when the child will terminate the AOF rewriting process this buffer (that ist the difference between the dataset when the rewrite was started, and the current dataset) is flushed to the new AOF file. We used to implement this buffer using an sds.c string, but sds.c has a 2GB limit. Sometimes the dataset can be big enough, the amount of writes so high, and the rewrite process slow enough that we overflow the 2GB limit, causing a crash, documented on github by issue #504. In order to prevent this from happening, this commit introduces a new system to accumulate writes, implemented by a linked list of blocks of 10 MB each, so that we also avoid paying the reallocation cost. Note that theoretically modern operating systems may implement realloc() simply as a remaping of the old pages, thus with very good performances, see for instance the mremap() syscall on Linux. However this is not always true, and jemalloc by default avoids doing this because there are issues with the current implementation of mremap(). For this reason we are using a linked list of blocks instead of a single block that gets reallocated again and again. The changes in this commit lacks testing, that will be performed before merging into the unstable branch. This fix will not enter 2.4 because it is too invasive. However 2.4 will log a warning when the AOF rewrite buffer is near to the 2GB limit.
2012-05-22 13:03:41 +02:00
listNode *ln;
listIter li;
Implement Multi Part AOF mechanism to avoid AOFRW overheads. (#9788) Implement Multi-Part AOF mechanism to avoid overheads during AOFRW. Introducing a folder with multiple AOF files tracked by a manifest file. The main issues with the the original AOFRW mechanism are: * buffering of commands that are processed during rewrite (consuming a lot of RAM) * freezes of the main process when the AOFRW completes to drain the remaining part of the buffer and fsync it. * double disk IO for the data that arrives during AOFRW (had to be written to both the old and new AOF files) The main modifications of this PR: 1. Remove the AOF rewrite buffer and related code. 2. Divide the AOF into multiple files, they are classified as two types, one is the the `BASE` type, it represents the full amount of data (Maybe AOF or RDB format) after each AOFRW, there is only one `BASE` file at most. The second is `INCR` type, may have more than one. They represent the incremental commands since the last AOFRW. 3. Use a AOF manifest file to record and manage these AOF files mentioned above. 4. The original configuration of `appendfilename` will be the base part of the new file name, for example: `appendonly.aof.1.base.rdb` and `appendonly.aof.2.incr.aof` 5. Add manifest-related TCL tests, and modified some existing tests that depend on the `appendfilename` 6. Remove the `aof_rewrite_buffer_length` field in info. 7. Add `aof-disable-auto-gc` configuration. By default we're automatically deleting HISTORY type AOFs. It also gives users the opportunity to preserve the history AOFs. just for testing use now. 8. Add AOFRW limiting measure. When the AOFRW failures reaches the threshold (3 times now), we will delay the execution of the next AOFRW by 1 minute. If the next AOFRW also fails, it will be delayed by 2 minutes. The next is 4, 8, 16, the maximum delay is 60 minutes (1 hour). During the limit period, we can still use the 'bgrewriteaof' command to execute AOFRW immediately. 9. Support upgrade (load) data from old version redis. 10. Add `appenddirname` configuration, as the directory name of the append only files. All AOF files and manifest file will be placed in this directory. 11. Only the last AOF file (BASE or INCR) can be truncated. Otherwise redis will exit even if `aof-load-truncated` is enabled. Co-authored-by: Oran Agra <oran@redislabs.com>
2022-01-04 01:14:13 +08:00
listRewind(server.aof_manifest->history_aof_list, &li);
while ((ln = listNext(&li)) != NULL) {
aofInfo *ai = (aofInfo*)ln->value;
serverAssert(ai->file_type == AOF_FILE_TYPE_HIST);
serverLog(LL_NOTICE, "Removing the history file %s in the background", ai->file_name);
sds aof_filepath = makePath(server.aof_dirname, ai->file_name);
bg_unlink(aof_filepath);
sdsfree(aof_filepath);
listDelNode(server.aof_manifest->history_aof_list, ln);
}
server.aof_manifest->dirty = 1;
return persistAofManifest(server.aof_manifest);
}
/* Called after `loadDataFromDisk` when redis start. If `server.aof_state` is
* 'AOF_ON', It will do three things:
* 1. Force create a BASE file when redis starts with an empty dataset
* 2. Open the last opened INCR type AOF for writing, If not, create a new one
* 3. Synchronously update the manifest file to the disk
Implement Multi Part AOF mechanism to avoid AOFRW overheads. (#9788) Implement Multi-Part AOF mechanism to avoid overheads during AOFRW. Introducing a folder with multiple AOF files tracked by a manifest file. The main issues with the the original AOFRW mechanism are: * buffering of commands that are processed during rewrite (consuming a lot of RAM) * freezes of the main process when the AOFRW completes to drain the remaining part of the buffer and fsync it. * double disk IO for the data that arrives during AOFRW (had to be written to both the old and new AOF files) The main modifications of this PR: 1. Remove the AOF rewrite buffer and related code. 2. Divide the AOF into multiple files, they are classified as two types, one is the the `BASE` type, it represents the full amount of data (Maybe AOF or RDB format) after each AOFRW, there is only one `BASE` file at most. The second is `INCR` type, may have more than one. They represent the incremental commands since the last AOFRW. 3. Use a AOF manifest file to record and manage these AOF files mentioned above. 4. The original configuration of `appendfilename` will be the base part of the new file name, for example: `appendonly.aof.1.base.rdb` and `appendonly.aof.2.incr.aof` 5. Add manifest-related TCL tests, and modified some existing tests that depend on the `appendfilename` 6. Remove the `aof_rewrite_buffer_length` field in info. 7. Add `aof-disable-auto-gc` configuration. By default we're automatically deleting HISTORY type AOFs. It also gives users the opportunity to preserve the history AOFs. just for testing use now. 8. Add AOFRW limiting measure. When the AOFRW failures reaches the threshold (3 times now), we will delay the execution of the next AOFRW by 1 minute. If the next AOFRW also fails, it will be delayed by 2 minutes. The next is 4, 8, 16, the maximum delay is 60 minutes (1 hour). During the limit period, we can still use the 'bgrewriteaof' command to execute AOFRW immediately. 9. Support upgrade (load) data from old version redis. 10. Add `appenddirname` configuration, as the directory name of the append only files. All AOF files and manifest file will be placed in this directory. 11. Only the last AOF file (BASE or INCR) can be truncated. Otherwise redis will exit even if `aof-load-truncated` is enabled. Co-authored-by: Oran Agra <oran@redislabs.com>
2022-01-04 01:14:13 +08:00
*
* If any of the above steps fails, the redis process will exit.
Implement Multi Part AOF mechanism to avoid AOFRW overheads. (#9788) Implement Multi-Part AOF mechanism to avoid overheads during AOFRW. Introducing a folder with multiple AOF files tracked by a manifest file. The main issues with the the original AOFRW mechanism are: * buffering of commands that are processed during rewrite (consuming a lot of RAM) * freezes of the main process when the AOFRW completes to drain the remaining part of the buffer and fsync it. * double disk IO for the data that arrives during AOFRW (had to be written to both the old and new AOF files) The main modifications of this PR: 1. Remove the AOF rewrite buffer and related code. 2. Divide the AOF into multiple files, they are classified as two types, one is the the `BASE` type, it represents the full amount of data (Maybe AOF or RDB format) after each AOFRW, there is only one `BASE` file at most. The second is `INCR` type, may have more than one. They represent the incremental commands since the last AOFRW. 3. Use a AOF manifest file to record and manage these AOF files mentioned above. 4. The original configuration of `appendfilename` will be the base part of the new file name, for example: `appendonly.aof.1.base.rdb` and `appendonly.aof.2.incr.aof` 5. Add manifest-related TCL tests, and modified some existing tests that depend on the `appendfilename` 6. Remove the `aof_rewrite_buffer_length` field in info. 7. Add `aof-disable-auto-gc` configuration. By default we're automatically deleting HISTORY type AOFs. It also gives users the opportunity to preserve the history AOFs. just for testing use now. 8. Add AOFRW limiting measure. When the AOFRW failures reaches the threshold (3 times now), we will delay the execution of the next AOFRW by 1 minute. If the next AOFRW also fails, it will be delayed by 2 minutes. The next is 4, 8, 16, the maximum delay is 60 minutes (1 hour). During the limit period, we can still use the 'bgrewriteaof' command to execute AOFRW immediately. 9. Support upgrade (load) data from old version redis. 10. Add `appenddirname` configuration, as the directory name of the append only files. All AOF files and manifest file will be placed in this directory. 11. Only the last AOF file (BASE or INCR) can be truncated. Otherwise redis will exit even if `aof-load-truncated` is enabled. Co-authored-by: Oran Agra <oran@redislabs.com>
2022-01-04 01:14:13 +08:00
*/
void aofOpenIfNeededOnServerStart(void) {
if (server.aof_state != AOF_ON) {
return;
}
serverAssert(server.aof_manifest != NULL);
serverAssert(server.aof_fd == -1);
if (dirCreateIfMissing(server.aof_dirname) == -1) {
serverLog(LL_WARNING, "Can't open or create append-only dir %s: %s",
server.aof_dirname, strerror(errno));
exit(1);
}
/* If we start with an empty dataset, we will force create a BASE file. */
if (!server.aof_manifest->base_aof_info &&
!listLength(server.aof_manifest->incr_aof_list))
{
sds base_name = getNewBaseFileNameAndMarkPreAsHistory(server.aof_manifest);
sds base_filepath = makePath(server.aof_dirname, base_name);
if (rewriteAppendOnlyFile(base_filepath) != C_OK) {
exit(1);
}
sdsfree(base_filepath);
}
Implement Multi Part AOF mechanism to avoid AOFRW overheads. (#9788) Implement Multi-Part AOF mechanism to avoid overheads during AOFRW. Introducing a folder with multiple AOF files tracked by a manifest file. The main issues with the the original AOFRW mechanism are: * buffering of commands that are processed during rewrite (consuming a lot of RAM) * freezes of the main process when the AOFRW completes to drain the remaining part of the buffer and fsync it. * double disk IO for the data that arrives during AOFRW (had to be written to both the old and new AOF files) The main modifications of this PR: 1. Remove the AOF rewrite buffer and related code. 2. Divide the AOF into multiple files, they are classified as two types, one is the the `BASE` type, it represents the full amount of data (Maybe AOF or RDB format) after each AOFRW, there is only one `BASE` file at most. The second is `INCR` type, may have more than one. They represent the incremental commands since the last AOFRW. 3. Use a AOF manifest file to record and manage these AOF files mentioned above. 4. The original configuration of `appendfilename` will be the base part of the new file name, for example: `appendonly.aof.1.base.rdb` and `appendonly.aof.2.incr.aof` 5. Add manifest-related TCL tests, and modified some existing tests that depend on the `appendfilename` 6. Remove the `aof_rewrite_buffer_length` field in info. 7. Add `aof-disable-auto-gc` configuration. By default we're automatically deleting HISTORY type AOFs. It also gives users the opportunity to preserve the history AOFs. just for testing use now. 8. Add AOFRW limiting measure. When the AOFRW failures reaches the threshold (3 times now), we will delay the execution of the next AOFRW by 1 minute. If the next AOFRW also fails, it will be delayed by 2 minutes. The next is 4, 8, 16, the maximum delay is 60 minutes (1 hour). During the limit period, we can still use the 'bgrewriteaof' command to execute AOFRW immediately. 9. Support upgrade (load) data from old version redis. 10. Add `appenddirname` configuration, as the directory name of the append only files. All AOF files and manifest file will be placed in this directory. 11. Only the last AOF file (BASE or INCR) can be truncated. Otherwise redis will exit even if `aof-load-truncated` is enabled. Co-authored-by: Oran Agra <oran@redislabs.com>
2022-01-04 01:14:13 +08:00
/* Because we will 'exit(1)' if open AOF or persistent manifest fails, so
* we don't need atomic modification here. */
sds aof_name = getLastIncrAofName(server.aof_manifest);
/* Here we should use 'O_APPEND' flag. */
sds aof_filepath = makePath(server.aof_dirname, aof_name);
server.aof_fd = open(aof_filepath, O_WRONLY|O_APPEND|O_CREAT, 0644);
sdsfree(aof_filepath);
if (server.aof_fd == -1) {
serverLog(LL_WARNING, "Can't open the append-only file %s: %s",
aof_name, strerror(errno));
exit(1);
}
/* Persist our changes. */
Implement Multi Part AOF mechanism to avoid AOFRW overheads. (#9788) Implement Multi-Part AOF mechanism to avoid overheads during AOFRW. Introducing a folder with multiple AOF files tracked by a manifest file. The main issues with the the original AOFRW mechanism are: * buffering of commands that are processed during rewrite (consuming a lot of RAM) * freezes of the main process when the AOFRW completes to drain the remaining part of the buffer and fsync it. * double disk IO for the data that arrives during AOFRW (had to be written to both the old and new AOF files) The main modifications of this PR: 1. Remove the AOF rewrite buffer and related code. 2. Divide the AOF into multiple files, they are classified as two types, one is the the `BASE` type, it represents the full amount of data (Maybe AOF or RDB format) after each AOFRW, there is only one `BASE` file at most. The second is `INCR` type, may have more than one. They represent the incremental commands since the last AOFRW. 3. Use a AOF manifest file to record and manage these AOF files mentioned above. 4. The original configuration of `appendfilename` will be the base part of the new file name, for example: `appendonly.aof.1.base.rdb` and `appendonly.aof.2.incr.aof` 5. Add manifest-related TCL tests, and modified some existing tests that depend on the `appendfilename` 6. Remove the `aof_rewrite_buffer_length` field in info. 7. Add `aof-disable-auto-gc` configuration. By default we're automatically deleting HISTORY type AOFs. It also gives users the opportunity to preserve the history AOFs. just for testing use now. 8. Add AOFRW limiting measure. When the AOFRW failures reaches the threshold (3 times now), we will delay the execution of the next AOFRW by 1 minute. If the next AOFRW also fails, it will be delayed by 2 minutes. The next is 4, 8, 16, the maximum delay is 60 minutes (1 hour). During the limit period, we can still use the 'bgrewriteaof' command to execute AOFRW immediately. 9. Support upgrade (load) data from old version redis. 10. Add `appenddirname` configuration, as the directory name of the append only files. All AOF files and manifest file will be placed in this directory. 11. Only the last AOF file (BASE or INCR) can be truncated. Otherwise redis will exit even if `aof-load-truncated` is enabled. Co-authored-by: Oran Agra <oran@redislabs.com>
2022-01-04 01:14:13 +08:00
int ret = persistAofManifest(server.aof_manifest);
if (ret != C_OK) {
exit(1);
}
server.aof_last_incr_size = getAppendOnlyFileSize(aof_name);
}
int aofFileExist(char *filename) {
sds file_path = makePath(server.aof_dirname, filename);
int ret = fileExist(file_path);
sdsfree(file_path);
return ret;
}
/* Called in `rewriteAppendOnlyFileBackground`. If `server.aof_state`
* is 'AOF_ON' or 'AOF_WAIT_REWRITE', It will do two things:
Implement Multi Part AOF mechanism to avoid AOFRW overheads. (#9788) Implement Multi-Part AOF mechanism to avoid overheads during AOFRW. Introducing a folder with multiple AOF files tracked by a manifest file. The main issues with the the original AOFRW mechanism are: * buffering of commands that are processed during rewrite (consuming a lot of RAM) * freezes of the main process when the AOFRW completes to drain the remaining part of the buffer and fsync it. * double disk IO for the data that arrives during AOFRW (had to be written to both the old and new AOF files) The main modifications of this PR: 1. Remove the AOF rewrite buffer and related code. 2. Divide the AOF into multiple files, they are classified as two types, one is the the `BASE` type, it represents the full amount of data (Maybe AOF or RDB format) after each AOFRW, there is only one `BASE` file at most. The second is `INCR` type, may have more than one. They represent the incremental commands since the last AOFRW. 3. Use a AOF manifest file to record and manage these AOF files mentioned above. 4. The original configuration of `appendfilename` will be the base part of the new file name, for example: `appendonly.aof.1.base.rdb` and `appendonly.aof.2.incr.aof` 5. Add manifest-related TCL tests, and modified some existing tests that depend on the `appendfilename` 6. Remove the `aof_rewrite_buffer_length` field in info. 7. Add `aof-disable-auto-gc` configuration. By default we're automatically deleting HISTORY type AOFs. It also gives users the opportunity to preserve the history AOFs. just for testing use now. 8. Add AOFRW limiting measure. When the AOFRW failures reaches the threshold (3 times now), we will delay the execution of the next AOFRW by 1 minute. If the next AOFRW also fails, it will be delayed by 2 minutes. The next is 4, 8, 16, the maximum delay is 60 minutes (1 hour). During the limit period, we can still use the 'bgrewriteaof' command to execute AOFRW immediately. 9. Support upgrade (load) data from old version redis. 10. Add `appenddirname` configuration, as the directory name of the append only files. All AOF files and manifest file will be placed in this directory. 11. Only the last AOF file (BASE or INCR) can be truncated. Otherwise redis will exit even if `aof-load-truncated` is enabled. Co-authored-by: Oran Agra <oran@redislabs.com>
2022-01-04 01:14:13 +08:00
* 1. Open a new INCR type AOF for writing
* 2. Synchronously update the manifest file to the disk
*
* The above two steps of modification are atomic, that is, if
* any step fails, the entire operation will rollback and returns
* C_ERR, and if all succeeds, it returns C_OK.
* */
int openNewIncrAofForAppend(void) {
serverAssert(server.aof_manifest != NULL);
int newfd;
/* Only open new INCR AOF when AOF enabled. */
if (server.aof_state == AOF_OFF) return C_OK;
/* Dup a temp aof_manifest to modify. */
aofManifest *temp_am = aofManifestDup(server.aof_manifest);
/* Open new AOF. */
sds new_aof_name = getNewIncrAofName(temp_am);
sds new_aof_filepath = makePath(server.aof_dirname, new_aof_name);
newfd = open(new_aof_filepath, O_WRONLY|O_TRUNC|O_CREAT, 0644);
sdsfree(new_aof_filepath);
if (newfd == -1) {
serverLog(LL_WARNING, "Can't open the append-only file %s: %s",
new_aof_name, strerror(errno));
aofManifestFree(temp_am);
return C_ERR;
}
/* Persist AOF Manifest. */
int ret = persistAofManifest(temp_am);
if (ret == C_ERR) {
close(newfd);
aofManifestFree(temp_am);
return C_ERR;
}
/* If reaches here, we can safely modify the `server.aof_manifest`
* and `server.aof_fd`. */
/* Close old aof_fd if needed. */
if (server.aof_fd != -1) bioCreateCloseJob(server.aof_fd);
server.aof_fd = newfd;
/* Reset the aof_last_incr_size. */
server.aof_last_incr_size = 0;
/* Update `server.aof_manifest`. */
aofManifestFreeAndUpdate(temp_am);
return C_OK;
}
/* Whether to limit the execution of Background AOF rewrite.
*
* At present, if AOFRW fails, redis will automatically retry. If it continues
* to fail, we may get a lot of very small INCR files. so we need an AOFRW
* limiting measure.
*
* We can't directly use `server.aof_current_size` and `server.aof_last_incr_size`,
* because there may be no new writes after AOFRW fails.
*
* So, we use time delay to achieve our goal. When AOFRW fails, we delay the execution
* of the next AOFRW by 1 minute. If the next AOFRW also fails, it will be delayed by 2
* minutes. The next is 4, 8, 16, the maximum delay is 60 minutes (1 hour).
*
* During the limit period, we can still use the 'bgrewriteaof' command to execute AOFRW
* immediately.
*
* Return 1 means that AOFRW is limited and cannot be executed. 0 means that we can execute
* AOFRW, which may be that we have reached the 'next_rewrite_time' or the number of INCR
* AOFs has not reached the limit threshold.
* */
#define AOF_REWRITE_LIMITE_THRESHOLD 3
#define AOF_REWRITE_LIMITE_NAX_MINUTES 60 /* 1 hour */
int aofRewriteLimited(void) {
int limit = 0;
static int limit_deley_minutes = 0;
static time_t next_rewrite_time = 0;
unsigned long incr_aof_num = listLength(server.aof_manifest->incr_aof_list);
if (incr_aof_num >= AOF_REWRITE_LIMITE_THRESHOLD) {
if (server.unixtime < next_rewrite_time) {
limit = 1;
} else {
if (limit_deley_minutes == 0) {
limit = 1;
limit_deley_minutes = 1;
} else {
limit_deley_minutes *= 2;
Allow an AOF rewrite buffer > 2GB (Fix for issue #504). During the AOF rewrite process, the parent process needs to accumulate the new writes in an in-memory buffer: when the child will terminate the AOF rewriting process this buffer (that ist the difference between the dataset when the rewrite was started, and the current dataset) is flushed to the new AOF file. We used to implement this buffer using an sds.c string, but sds.c has a 2GB limit. Sometimes the dataset can be big enough, the amount of writes so high, and the rewrite process slow enough that we overflow the 2GB limit, causing a crash, documented on github by issue #504. In order to prevent this from happening, this commit introduces a new system to accumulate writes, implemented by a linked list of blocks of 10 MB each, so that we also avoid paying the reallocation cost. Note that theoretically modern operating systems may implement realloc() simply as a remaping of the old pages, thus with very good performances, see for instance the mremap() syscall on Linux. However this is not always true, and jemalloc by default avoids doing this because there are issues with the current implementation of mremap(). For this reason we are using a linked list of blocks instead of a single block that gets reallocated again and again. The changes in this commit lacks testing, that will be performed before merging into the unstable branch. This fix will not enter 2.4 because it is too invasive. However 2.4 will log a warning when the AOF rewrite buffer is near to the 2GB limit.
2012-05-22 13:03:41 +02:00
}
Implement Multi Part AOF mechanism to avoid AOFRW overheads. (#9788) Implement Multi-Part AOF mechanism to avoid overheads during AOFRW. Introducing a folder with multiple AOF files tracked by a manifest file. The main issues with the the original AOFRW mechanism are: * buffering of commands that are processed during rewrite (consuming a lot of RAM) * freezes of the main process when the AOFRW completes to drain the remaining part of the buffer and fsync it. * double disk IO for the data that arrives during AOFRW (had to be written to both the old and new AOF files) The main modifications of this PR: 1. Remove the AOF rewrite buffer and related code. 2. Divide the AOF into multiple files, they are classified as two types, one is the the `BASE` type, it represents the full amount of data (Maybe AOF or RDB format) after each AOFRW, there is only one `BASE` file at most. The second is `INCR` type, may have more than one. They represent the incremental commands since the last AOFRW. 3. Use a AOF manifest file to record and manage these AOF files mentioned above. 4. The original configuration of `appendfilename` will be the base part of the new file name, for example: `appendonly.aof.1.base.rdb` and `appendonly.aof.2.incr.aof` 5. Add manifest-related TCL tests, and modified some existing tests that depend on the `appendfilename` 6. Remove the `aof_rewrite_buffer_length` field in info. 7. Add `aof-disable-auto-gc` configuration. By default we're automatically deleting HISTORY type AOFs. It also gives users the opportunity to preserve the history AOFs. just for testing use now. 8. Add AOFRW limiting measure. When the AOFRW failures reaches the threshold (3 times now), we will delay the execution of the next AOFRW by 1 minute. If the next AOFRW also fails, it will be delayed by 2 minutes. The next is 4, 8, 16, the maximum delay is 60 minutes (1 hour). During the limit period, we can still use the 'bgrewriteaof' command to execute AOFRW immediately. 9. Support upgrade (load) data from old version redis. 10. Add `appenddirname` configuration, as the directory name of the append only files. All AOF files and manifest file will be placed in this directory. 11. Only the last AOF file (BASE or INCR) can be truncated. Otherwise redis will exit even if `aof-load-truncated` is enabled. Co-authored-by: Oran Agra <oran@redislabs.com>
2022-01-04 01:14:13 +08:00
if (limit_deley_minutes > AOF_REWRITE_LIMITE_NAX_MINUTES) {
limit_deley_minutes = AOF_REWRITE_LIMITE_NAX_MINUTES;
}
next_rewrite_time = server.unixtime + limit_deley_minutes * 60;
serverLog(LL_WARNING,
"Background AOF rewrite has repeatedly failed %ld times and triggered the limit, will retry in %d minutes",
incr_aof_num, limit_deley_minutes);
Allow an AOF rewrite buffer > 2GB (Fix for issue #504). During the AOF rewrite process, the parent process needs to accumulate the new writes in an in-memory buffer: when the child will terminate the AOF rewriting process this buffer (that ist the difference between the dataset when the rewrite was started, and the current dataset) is flushed to the new AOF file. We used to implement this buffer using an sds.c string, but sds.c has a 2GB limit. Sometimes the dataset can be big enough, the amount of writes so high, and the rewrite process slow enough that we overflow the 2GB limit, causing a crash, documented on github by issue #504. In order to prevent this from happening, this commit introduces a new system to accumulate writes, implemented by a linked list of blocks of 10 MB each, so that we also avoid paying the reallocation cost. Note that theoretically modern operating systems may implement realloc() simply as a remaping of the old pages, thus with very good performances, see for instance the mremap() syscall on Linux. However this is not always true, and jemalloc by default avoids doing this because there are issues with the current implementation of mremap(). For this reason we are using a linked list of blocks instead of a single block that gets reallocated again and again. The changes in this commit lacks testing, that will be performed before merging into the unstable branch. This fix will not enter 2.4 because it is too invasive. However 2.4 will log a warning when the AOF rewrite buffer is near to the 2GB limit.
2012-05-22 13:03:41 +02:00
}
Implement Multi Part AOF mechanism to avoid AOFRW overheads. (#9788) Implement Multi-Part AOF mechanism to avoid overheads during AOFRW. Introducing a folder with multiple AOF files tracked by a manifest file. The main issues with the the original AOFRW mechanism are: * buffering of commands that are processed during rewrite (consuming a lot of RAM) * freezes of the main process when the AOFRW completes to drain the remaining part of the buffer and fsync it. * double disk IO for the data that arrives during AOFRW (had to be written to both the old and new AOF files) The main modifications of this PR: 1. Remove the AOF rewrite buffer and related code. 2. Divide the AOF into multiple files, they are classified as two types, one is the the `BASE` type, it represents the full amount of data (Maybe AOF or RDB format) after each AOFRW, there is only one `BASE` file at most. The second is `INCR` type, may have more than one. They represent the incremental commands since the last AOFRW. 3. Use a AOF manifest file to record and manage these AOF files mentioned above. 4. The original configuration of `appendfilename` will be the base part of the new file name, for example: `appendonly.aof.1.base.rdb` and `appendonly.aof.2.incr.aof` 5. Add manifest-related TCL tests, and modified some existing tests that depend on the `appendfilename` 6. Remove the `aof_rewrite_buffer_length` field in info. 7. Add `aof-disable-auto-gc` configuration. By default we're automatically deleting HISTORY type AOFs. It also gives users the opportunity to preserve the history AOFs. just for testing use now. 8. Add AOFRW limiting measure. When the AOFRW failures reaches the threshold (3 times now), we will delay the execution of the next AOFRW by 1 minute. If the next AOFRW also fails, it will be delayed by 2 minutes. The next is 4, 8, 16, the maximum delay is 60 minutes (1 hour). During the limit period, we can still use the 'bgrewriteaof' command to execute AOFRW immediately. 9. Support upgrade (load) data from old version redis. 10. Add `appenddirname` configuration, as the directory name of the append only files. All AOF files and manifest file will be placed in this directory. 11. Only the last AOF file (BASE or INCR) can be truncated. Otherwise redis will exit even if `aof-load-truncated` is enabled. Co-authored-by: Oran Agra <oran@redislabs.com>
2022-01-04 01:14:13 +08:00
} else {
limit_deley_minutes = 0;
next_rewrite_time = 0;
Allow an AOF rewrite buffer > 2GB (Fix for issue #504). During the AOF rewrite process, the parent process needs to accumulate the new writes in an in-memory buffer: when the child will terminate the AOF rewriting process this buffer (that ist the difference between the dataset when the rewrite was started, and the current dataset) is flushed to the new AOF file. We used to implement this buffer using an sds.c string, but sds.c has a 2GB limit. Sometimes the dataset can be big enough, the amount of writes so high, and the rewrite process slow enough that we overflow the 2GB limit, causing a crash, documented on github by issue #504. In order to prevent this from happening, this commit introduces a new system to accumulate writes, implemented by a linked list of blocks of 10 MB each, so that we also avoid paying the reallocation cost. Note that theoretically modern operating systems may implement realloc() simply as a remaping of the old pages, thus with very good performances, see for instance the mremap() syscall on Linux. However this is not always true, and jemalloc by default avoids doing this because there are issues with the current implementation of mremap(). For this reason we are using a linked list of blocks instead of a single block that gets reallocated again and again. The changes in this commit lacks testing, that will be performed before merging into the unstable branch. This fix will not enter 2.4 because it is too invasive. However 2.4 will log a warning when the AOF rewrite buffer is near to the 2GB limit.
2012-05-22 13:03:41 +02:00
}
Implement Multi Part AOF mechanism to avoid AOFRW overheads. (#9788) Implement Multi-Part AOF mechanism to avoid overheads during AOFRW. Introducing a folder with multiple AOF files tracked by a manifest file. The main issues with the the original AOFRW mechanism are: * buffering of commands that are processed during rewrite (consuming a lot of RAM) * freezes of the main process when the AOFRW completes to drain the remaining part of the buffer and fsync it. * double disk IO for the data that arrives during AOFRW (had to be written to both the old and new AOF files) The main modifications of this PR: 1. Remove the AOF rewrite buffer and related code. 2. Divide the AOF into multiple files, they are classified as two types, one is the the `BASE` type, it represents the full amount of data (Maybe AOF or RDB format) after each AOFRW, there is only one `BASE` file at most. The second is `INCR` type, may have more than one. They represent the incremental commands since the last AOFRW. 3. Use a AOF manifest file to record and manage these AOF files mentioned above. 4. The original configuration of `appendfilename` will be the base part of the new file name, for example: `appendonly.aof.1.base.rdb` and `appendonly.aof.2.incr.aof` 5. Add manifest-related TCL tests, and modified some existing tests that depend on the `appendfilename` 6. Remove the `aof_rewrite_buffer_length` field in info. 7. Add `aof-disable-auto-gc` configuration. By default we're automatically deleting HISTORY type AOFs. It also gives users the opportunity to preserve the history AOFs. just for testing use now. 8. Add AOFRW limiting measure. When the AOFRW failures reaches the threshold (3 times now), we will delay the execution of the next AOFRW by 1 minute. If the next AOFRW also fails, it will be delayed by 2 minutes. The next is 4, 8, 16, the maximum delay is 60 minutes (1 hour). During the limit period, we can still use the 'bgrewriteaof' command to execute AOFRW immediately. 9. Support upgrade (load) data from old version redis. 10. Add `appenddirname` configuration, as the directory name of the append only files. All AOF files and manifest file will be placed in this directory. 11. Only the last AOF file (BASE or INCR) can be truncated. Otherwise redis will exit even if `aof-load-truncated` is enabled. Co-authored-by: Oran Agra <oran@redislabs.com>
2022-01-04 01:14:13 +08:00
return limit;
Allow an AOF rewrite buffer > 2GB (Fix for issue #504). During the AOF rewrite process, the parent process needs to accumulate the new writes in an in-memory buffer: when the child will terminate the AOF rewriting process this buffer (that ist the difference between the dataset when the rewrite was started, and the current dataset) is flushed to the new AOF file. We used to implement this buffer using an sds.c string, but sds.c has a 2GB limit. Sometimes the dataset can be big enough, the amount of writes so high, and the rewrite process slow enough that we overflow the 2GB limit, causing a crash, documented on github by issue #504. In order to prevent this from happening, this commit introduces a new system to accumulate writes, implemented by a linked list of blocks of 10 MB each, so that we also avoid paying the reallocation cost. Note that theoretically modern operating systems may implement realloc() simply as a remaping of the old pages, thus with very good performances, see for instance the mremap() syscall on Linux. However this is not always true, and jemalloc by default avoids doing this because there are issues with the current implementation of mremap(). For this reason we are using a linked list of blocks instead of a single block that gets reallocated again and again. The changes in this commit lacks testing, that will be performed before merging into the unstable branch. This fix will not enter 2.4 because it is too invasive. However 2.4 will log a warning when the AOF rewrite buffer is near to the 2GB limit.
2012-05-22 13:03:41 +02:00
}
/* ----------------------------------------------------------------------------
* AOF file implementation
* ------------------------------------------------------------------------- */
/* Return true if an AOf fsync is currently already in progress in a
* BIO thread. */
int aofFsyncInProgress(void) {
return bioPendingJobsOfType(BIO_AOF_FSYNC) != 0;
}
/* Starts a background task that performs fsync() against the specified
* file descriptor (the one of the AOF file) in another thread. */
void aof_background_fsync(int fd) {
bioCreateFsyncJob(fd);
}
/* Kills an AOFRW child process if exists */
void killAppendOnlyChild(void) {
int statloc;
/* No AOFRW child? return. */
if (server.child_type != CHILD_TYPE_AOF) return;
/* Kill AOFRW child, wait for child exit. */
serverLog(LL_NOTICE,"Killing running AOF rewrite child: %ld",
(long) server.child_pid);
if (kill(server.child_pid,SIGUSR1) != -1) {
while(waitpid(-1, &statloc, 0) != server.child_pid);
}
aofRemoveTempFile(server.child_pid);
resetChildState();
server.aof_rewrite_time_start = -1;
}
/* Called when the user switches from "appendonly yes" to "appendonly no"
* at runtime using the CONFIG command. */
void stopAppendOnly(void) {
2015-07-27 09:41:48 +02:00
serverAssert(server.aof_state != AOF_OFF);
flushAppendOnlyFile(1);
if (redis_fsync(server.aof_fd) == -1) {
serverLog(LL_WARNING,"Fail to fsync the AOF file: %s",strerror(errno));
} else {
server.aof_fsync_offset = server.aof_current_size;
server.aof_last_fsync = server.unixtime;
}
2011-12-21 12:17:02 +01:00
close(server.aof_fd);
2011-12-21 12:17:02 +01:00
server.aof_fd = -1;
server.aof_selected_db = -1;
2015-07-27 09:41:48 +02:00
server.aof_state = AOF_OFF;
server.aof_rewrite_scheduled = 0;
Implement Multi Part AOF mechanism to avoid AOFRW overheads. (#9788) Implement Multi-Part AOF mechanism to avoid overheads during AOFRW. Introducing a folder with multiple AOF files tracked by a manifest file. The main issues with the the original AOFRW mechanism are: * buffering of commands that are processed during rewrite (consuming a lot of RAM) * freezes of the main process when the AOFRW completes to drain the remaining part of the buffer and fsync it. * double disk IO for the data that arrives during AOFRW (had to be written to both the old and new AOF files) The main modifications of this PR: 1. Remove the AOF rewrite buffer and related code. 2. Divide the AOF into multiple files, they are classified as two types, one is the the `BASE` type, it represents the full amount of data (Maybe AOF or RDB format) after each AOFRW, there is only one `BASE` file at most. The second is `INCR` type, may have more than one. They represent the incremental commands since the last AOFRW. 3. Use a AOF manifest file to record and manage these AOF files mentioned above. 4. The original configuration of `appendfilename` will be the base part of the new file name, for example: `appendonly.aof.1.base.rdb` and `appendonly.aof.2.incr.aof` 5. Add manifest-related TCL tests, and modified some existing tests that depend on the `appendfilename` 6. Remove the `aof_rewrite_buffer_length` field in info. 7. Add `aof-disable-auto-gc` configuration. By default we're automatically deleting HISTORY type AOFs. It also gives users the opportunity to preserve the history AOFs. just for testing use now. 8. Add AOFRW limiting measure. When the AOFRW failures reaches the threshold (3 times now), we will delay the execution of the next AOFRW by 1 minute. If the next AOFRW also fails, it will be delayed by 2 minutes. The next is 4, 8, 16, the maximum delay is 60 minutes (1 hour). During the limit period, we can still use the 'bgrewriteaof' command to execute AOFRW immediately. 9. Support upgrade (load) data from old version redis. 10. Add `appenddirname` configuration, as the directory name of the append only files. All AOF files and manifest file will be placed in this directory. 11. Only the last AOF file (BASE or INCR) can be truncated. Otherwise redis will exit even if `aof-load-truncated` is enabled. Co-authored-by: Oran Agra <oran@redislabs.com>
2022-01-04 01:14:13 +08:00
server.aof_last_incr_size = 0;
killAppendOnlyChild();
sdsfree(server.aof_buf);
server.aof_buf = sdsempty();
}
/* Called when the user switches from "appendonly no" to "appendonly yes"
* at runtime using the CONFIG command. */
int startAppendOnly(void) {
2015-07-27 09:41:48 +02:00
serverAssert(server.aof_state == AOF_OFF);
Implement Multi Part AOF mechanism to avoid AOFRW overheads. (#9788) Implement Multi-Part AOF mechanism to avoid overheads during AOFRW. Introducing a folder with multiple AOF files tracked by a manifest file. The main issues with the the original AOFRW mechanism are: * buffering of commands that are processed during rewrite (consuming a lot of RAM) * freezes of the main process when the AOFRW completes to drain the remaining part of the buffer and fsync it. * double disk IO for the data that arrives during AOFRW (had to be written to both the old and new AOF files) The main modifications of this PR: 1. Remove the AOF rewrite buffer and related code. 2. Divide the AOF into multiple files, they are classified as two types, one is the the `BASE` type, it represents the full amount of data (Maybe AOF or RDB format) after each AOFRW, there is only one `BASE` file at most. The second is `INCR` type, may have more than one. They represent the incremental commands since the last AOFRW. 3. Use a AOF manifest file to record and manage these AOF files mentioned above. 4. The original configuration of `appendfilename` will be the base part of the new file name, for example: `appendonly.aof.1.base.rdb` and `appendonly.aof.2.incr.aof` 5. Add manifest-related TCL tests, and modified some existing tests that depend on the `appendfilename` 6. Remove the `aof_rewrite_buffer_length` field in info. 7. Add `aof-disable-auto-gc` configuration. By default we're automatically deleting HISTORY type AOFs. It also gives users the opportunity to preserve the history AOFs. just for testing use now. 8. Add AOFRW limiting measure. When the AOFRW failures reaches the threshold (3 times now), we will delay the execution of the next AOFRW by 1 minute. If the next AOFRW also fails, it will be delayed by 2 minutes. The next is 4, 8, 16, the maximum delay is 60 minutes (1 hour). During the limit period, we can still use the 'bgrewriteaof' command to execute AOFRW immediately. 9. Support upgrade (load) data from old version redis. 10. Add `appenddirname` configuration, as the directory name of the append only files. All AOF files and manifest file will be placed in this directory. 11. Only the last AOF file (BASE or INCR) can be truncated. Otherwise redis will exit even if `aof-load-truncated` is enabled. Co-authored-by: Oran Agra <oran@redislabs.com>
2022-01-04 01:14:13 +08:00
server.aof_state = AOF_WAIT_REWRITE;
if (hasActiveChildProcess() && server.child_type != CHILD_TYPE_AOF) {
server.aof_rewrite_scheduled = 1;
serverLog(LL_WARNING,"AOF was enabled but there is already another background operation. An AOF background was scheduled to start when possible.");
} else if (server.in_exec){
server.aof_rewrite_scheduled = 1;
serverLog(LL_WARNING,"AOF was enabled during a transaction. An AOF background was scheduled to start when possible.");
} else {
/* If there is a pending AOF rewrite, we need to switch it off and
2018-07-01 13:24:50 +08:00
* start a new one: the old one cannot be reused because it is not
* accumulating the AOF buffer. */
if (server.child_type == CHILD_TYPE_AOF) {
serverLog(LL_WARNING,"AOF was enabled but there is already an AOF rewriting in background. Stopping background AOF and starting a rewrite now.");
killAppendOnlyChild();
}
Implement Multi Part AOF mechanism to avoid AOFRW overheads. (#9788) Implement Multi-Part AOF mechanism to avoid overheads during AOFRW. Introducing a folder with multiple AOF files tracked by a manifest file. The main issues with the the original AOFRW mechanism are: * buffering of commands that are processed during rewrite (consuming a lot of RAM) * freezes of the main process when the AOFRW completes to drain the remaining part of the buffer and fsync it. * double disk IO for the data that arrives during AOFRW (had to be written to both the old and new AOF files) The main modifications of this PR: 1. Remove the AOF rewrite buffer and related code. 2. Divide the AOF into multiple files, they are classified as two types, one is the the `BASE` type, it represents the full amount of data (Maybe AOF or RDB format) after each AOFRW, there is only one `BASE` file at most. The second is `INCR` type, may have more than one. They represent the incremental commands since the last AOFRW. 3. Use a AOF manifest file to record and manage these AOF files mentioned above. 4. The original configuration of `appendfilename` will be the base part of the new file name, for example: `appendonly.aof.1.base.rdb` and `appendonly.aof.2.incr.aof` 5. Add manifest-related TCL tests, and modified some existing tests that depend on the `appendfilename` 6. Remove the `aof_rewrite_buffer_length` field in info. 7. Add `aof-disable-auto-gc` configuration. By default we're automatically deleting HISTORY type AOFs. It also gives users the opportunity to preserve the history AOFs. just for testing use now. 8. Add AOFRW limiting measure. When the AOFRW failures reaches the threshold (3 times now), we will delay the execution of the next AOFRW by 1 minute. If the next AOFRW also fails, it will be delayed by 2 minutes. The next is 4, 8, 16, the maximum delay is 60 minutes (1 hour). During the limit period, we can still use the 'bgrewriteaof' command to execute AOFRW immediately. 9. Support upgrade (load) data from old version redis. 10. Add `appenddirname` configuration, as the directory name of the append only files. All AOF files and manifest file will be placed in this directory. 11. Only the last AOF file (BASE or INCR) can be truncated. Otherwise redis will exit even if `aof-load-truncated` is enabled. Co-authored-by: Oran Agra <oran@redislabs.com>
2022-01-04 01:14:13 +08:00
if (rewriteAppendOnlyFileBackground() == C_ERR) {
Implement Multi Part AOF mechanism to avoid AOFRW overheads. (#9788) Implement Multi-Part AOF mechanism to avoid overheads during AOFRW. Introducing a folder with multiple AOF files tracked by a manifest file. The main issues with the the original AOFRW mechanism are: * buffering of commands that are processed during rewrite (consuming a lot of RAM) * freezes of the main process when the AOFRW completes to drain the remaining part of the buffer and fsync it. * double disk IO for the data that arrives during AOFRW (had to be written to both the old and new AOF files) The main modifications of this PR: 1. Remove the AOF rewrite buffer and related code. 2. Divide the AOF into multiple files, they are classified as two types, one is the the `BASE` type, it represents the full amount of data (Maybe AOF or RDB format) after each AOFRW, there is only one `BASE` file at most. The second is `INCR` type, may have more than one. They represent the incremental commands since the last AOFRW. 3. Use a AOF manifest file to record and manage these AOF files mentioned above. 4. The original configuration of `appendfilename` will be the base part of the new file name, for example: `appendonly.aof.1.base.rdb` and `appendonly.aof.2.incr.aof` 5. Add manifest-related TCL tests, and modified some existing tests that depend on the `appendfilename` 6. Remove the `aof_rewrite_buffer_length` field in info. 7. Add `aof-disable-auto-gc` configuration. By default we're automatically deleting HISTORY type AOFs. It also gives users the opportunity to preserve the history AOFs. just for testing use now. 8. Add AOFRW limiting measure. When the AOFRW failures reaches the threshold (3 times now), we will delay the execution of the next AOFRW by 1 minute. If the next AOFRW also fails, it will be delayed by 2 minutes. The next is 4, 8, 16, the maximum delay is 60 minutes (1 hour). During the limit period, we can still use the 'bgrewriteaof' command to execute AOFRW immediately. 9. Support upgrade (load) data from old version redis. 10. Add `appenddirname` configuration, as the directory name of the append only files. All AOF files and manifest file will be placed in this directory. 11. Only the last AOF file (BASE or INCR) can be truncated. Otherwise redis will exit even if `aof-load-truncated` is enabled. Co-authored-by: Oran Agra <oran@redislabs.com>
2022-01-04 01:14:13 +08:00
server.aof_state = AOF_OFF;
serverLog(LL_WARNING,"Redis needs to enable the AOF but can't trigger a background AOF rewrite operation. Check the above logs for more info about the error.");
return C_ERR;
}
}
server.aof_last_fsync = server.unixtime;
/* If AOF fsync error in bio job, we just ignore it and log the event. */
int aof_bio_fsync_status;
atomicGet(server.aof_bio_fsync_status, aof_bio_fsync_status);
if (aof_bio_fsync_status == C_ERR) {
serverLog(LL_WARNING,
"AOF reopen, just ignore the AOF fsync error in bio job");
atomicSet(server.aof_bio_fsync_status,C_OK);
}
/* If AOF was in error state, we just ignore it and log the event. */
if (server.aof_last_write_status == C_ERR) {
serverLog(LL_WARNING,"AOF reopen, just ignore the last error.");
server.aof_last_write_status = C_OK;
}
return C_OK;
}
/* This is a wrapper to the write syscall in order to retry on short writes
* or if the syscall gets interrupted. It could look strange that we retry
* on short writes given that we are writing to a block device: normally if
* the first call is short, there is a end-of-space condition, so the next
* is likely to fail. However apparently in modern systems this is no longer
* true, and in general it looks just more resilient to retry the write. If
* there is an actual error condition we'll get it at the next try. */
ssize_t aofWrite(int fd, const char *buf, size_t len) {
2017-11-30 10:22:12 +08:00
ssize_t nwritten = 0, totwritten = 0;
while(len) {
nwritten = write(fd, buf, len);
if (nwritten < 0) {
2019-07-12 12:18:33 +02:00
if (errno == EINTR) continue;
2017-11-30 10:22:12 +08:00
return totwritten ? totwritten : -1;
}
len -= nwritten;
buf += nwritten;
totwritten += nwritten;
}
return totwritten;
}
/* Write the append only file buffer on disk.
*
* Since we are required to write the AOF before replying to the client,
* and the only way the client socket can get a write is entering when
* the event loop, we accumulate all the AOF writes in a memory
* buffer and write it on disk using this function just before entering
* the event loop again.
*
* About the 'force' argument:
*
* When the fsync policy is set to 'everysec' we may delay the flush if there
* is still an fsync() going on in the background thread, since for instance
* on Linux write(2) will be blocked by the background fsync anyway.
* When this happens we remember that there is some aof buffer to be
* flushed ASAP, and will try to do that in the serverCron() function.
*
* However if force is set to 1 we'll write regardless of the background
* fsync. */
#define AOF_WRITE_LOG_ERROR_RATE 30 /* Seconds between errors logging. */
void flushAppendOnlyFile(int force) {
ssize_t nwritten;
int sync_in_progress = 0;
mstime_t latency;
if (sdslen(server.aof_buf) == 0) {
/* Check if we need to do fsync even the aof buffer is empty,
* because previously in AOF_FSYNC_EVERYSEC mode, fsync is
* called only when aof buffer is not empty, so if users
* stop write commands before fsync called in one second,
* the data in page cache cannot be flushed in time. */
if (server.aof_fsync == AOF_FSYNC_EVERYSEC &&
server.aof_fsync_offset != server.aof_current_size &&
server.unixtime > server.aof_last_fsync &&
!(sync_in_progress = aofFsyncInProgress())) {
goto try_fsync;
} else {
return;
}
}
if (server.aof_fsync == AOF_FSYNC_EVERYSEC)
sync_in_progress = aofFsyncInProgress();
if (server.aof_fsync == AOF_FSYNC_EVERYSEC && !force) {
/* With this append fsync policy we do background fsyncing.
* If the fsync is still in progress we can try to delay
* the write for a couple of seconds. */
if (sync_in_progress) {
if (server.aof_flush_postponed_start == 0) {
2014-07-12 18:02:17 +02:00
/* No previous write postponing, remember that we are
* postponing the flush and return. */
server.aof_flush_postponed_start = server.unixtime;
return;
} else if (server.unixtime - server.aof_flush_postponed_start < 2) {
2011-09-19 17:49:50 +02:00
/* We were already waiting for fsync to finish, but for less
* than two seconds this is still ok. Postpone again. */
return;
}
/* Otherwise fall through, and go write since we can't wait
* over two seconds. */
server.aof_delayed_fsync++;
2015-07-27 09:41:48 +02:00
serverLog(LL_NOTICE,"Asynchronous AOF fsync is taking too long (disk is busy?). Writing the AOF buffer without waiting for fsync to complete, this may slow down Redis.");
}
}
/* We want to perform a single write. This should be guaranteed atomic
* at least if the filesystem we are writing is a real physical one.
* While this will save us against the server being killed I don't think
* there is much to do about the whole server stopping for power problems
* or alike */
if (server.aof_flush_sleep && sdslen(server.aof_buf)) {
usleep(server.aof_flush_sleep);
}
latencyStartMonitor(latency);
nwritten = aofWrite(server.aof_fd,server.aof_buf,sdslen(server.aof_buf));
latencyEndMonitor(latency);
/* We want to capture different events for delayed writes:
* when the delay happens with a pending fsync, or with a saving child
* active, and when the above two conditions are missing.
* We also use an additional event name to save all samples which is
* useful for graphing / monitoring purposes. */
if (sync_in_progress) {
latencyAddSampleIfNeeded("aof-write-pending-fsync",latency);
} else if (hasActiveChildProcess()) {
latencyAddSampleIfNeeded("aof-write-active-child",latency);
} else {
latencyAddSampleIfNeeded("aof-write-alone",latency);
}
latencyAddSampleIfNeeded("aof-write",latency);
/* We performed the write so reset the postponed flush sentinel to zero. */
server.aof_flush_postponed_start = 0;
2017-11-30 10:27:12 +08:00
if (nwritten != (ssize_t)sdslen(server.aof_buf)) {
static time_t last_write_error_log = 0;
int can_log = 0;
/* Limit logging rate to 1 line per AOF_WRITE_LOG_ERROR_RATE seconds. */
if ((server.unixtime - last_write_error_log) > AOF_WRITE_LOG_ERROR_RATE) {
can_log = 1;
last_write_error_log = server.unixtime;
}
2014-07-12 18:02:17 +02:00
/* Log the AOF write error and record the error code. */
2011-08-18 12:27:34 +02:00
if (nwritten == -1) {
if (can_log) {
2015-07-27 09:41:48 +02:00
serverLog(LL_WARNING,"Error writing to the AOF file: %s",
strerror(errno));
server.aof_last_write_errno = errno;
}
2011-08-18 12:27:34 +02:00
} else {
if (can_log) {
2015-07-27 09:41:48 +02:00
serverLog(LL_WARNING,"Short write while writing to "
"the AOF file: (nwritten=%lld, "
"expected=%lld)",
(long long)nwritten,
(long long)sdslen(server.aof_buf));
}
Implement Multi Part AOF mechanism to avoid AOFRW overheads. (#9788) Implement Multi-Part AOF mechanism to avoid overheads during AOFRW. Introducing a folder with multiple AOF files tracked by a manifest file. The main issues with the the original AOFRW mechanism are: * buffering of commands that are processed during rewrite (consuming a lot of RAM) * freezes of the main process when the AOFRW completes to drain the remaining part of the buffer and fsync it. * double disk IO for the data that arrives during AOFRW (had to be written to both the old and new AOF files) The main modifications of this PR: 1. Remove the AOF rewrite buffer and related code. 2. Divide the AOF into multiple files, they are classified as two types, one is the the `BASE` type, it represents the full amount of data (Maybe AOF or RDB format) after each AOFRW, there is only one `BASE` file at most. The second is `INCR` type, may have more than one. They represent the incremental commands since the last AOFRW. 3. Use a AOF manifest file to record and manage these AOF files mentioned above. 4. The original configuration of `appendfilename` will be the base part of the new file name, for example: `appendonly.aof.1.base.rdb` and `appendonly.aof.2.incr.aof` 5. Add manifest-related TCL tests, and modified some existing tests that depend on the `appendfilename` 6. Remove the `aof_rewrite_buffer_length` field in info. 7. Add `aof-disable-auto-gc` configuration. By default we're automatically deleting HISTORY type AOFs. It also gives users the opportunity to preserve the history AOFs. just for testing use now. 8. Add AOFRW limiting measure. When the AOFRW failures reaches the threshold (3 times now), we will delay the execution of the next AOFRW by 1 minute. If the next AOFRW also fails, it will be delayed by 2 minutes. The next is 4, 8, 16, the maximum delay is 60 minutes (1 hour). During the limit period, we can still use the 'bgrewriteaof' command to execute AOFRW immediately. 9. Support upgrade (load) data from old version redis. 10. Add `appenddirname` configuration, as the directory name of the append only files. All AOF files and manifest file will be placed in this directory. 11. Only the last AOF file (BASE or INCR) can be truncated. Otherwise redis will exit even if `aof-load-truncated` is enabled. Co-authored-by: Oran Agra <oran@redislabs.com>
2022-01-04 01:14:13 +08:00
if (ftruncate(server.aof_fd, server.aof_last_incr_size) == -1) {
if (can_log) {
2015-07-27 09:41:48 +02:00
serverLog(LL_WARNING, "Could not remove short write "
"from the append-only file. Redis may refuse "
"to load the AOF the next time it starts. "
"ftruncate: %s", strerror(errno));
}
} else {
2014-07-12 18:02:17 +02:00
/* If the ftruncate() succeeded we can set nwritten to
* -1 since there is no longer partial data into the AOF. */
nwritten = -1;
}
server.aof_last_write_errno = ENOSPC;
}
/* Handle the AOF write error. */
if (server.aof_fsync == AOF_FSYNC_ALWAYS) {
/* We can't recover when the fsync policy is ALWAYS since the reply
* for the client is already in the output buffers (both writes and
* reads), and the changes to the db can't be rolled back. Since we
* have a contract with the user that on acknowledged or observed
* writes are is synced on disk, we must exit. */
2015-07-27 09:41:48 +02:00
serverLog(LL_WARNING,"Can't recover from AOF write error when the AOF fsync policy is 'always'. Exiting...");
exit(1);
} else {
/* Recover from failed write leaving data into the buffer. However
* set an error to stop accepting writes as long as the error
* condition is not cleared. */
server.aof_last_write_status = C_ERR;
/* Trim the sds buffer if there was a partial write, and there
* was no way to undo it with ftruncate(2). */
if (nwritten > 0) {
server.aof_current_size += nwritten;
Implement Multi Part AOF mechanism to avoid AOFRW overheads. (#9788) Implement Multi-Part AOF mechanism to avoid overheads during AOFRW. Introducing a folder with multiple AOF files tracked by a manifest file. The main issues with the the original AOFRW mechanism are: * buffering of commands that are processed during rewrite (consuming a lot of RAM) * freezes of the main process when the AOFRW completes to drain the remaining part of the buffer and fsync it. * double disk IO for the data that arrives during AOFRW (had to be written to both the old and new AOF files) The main modifications of this PR: 1. Remove the AOF rewrite buffer and related code. 2. Divide the AOF into multiple files, they are classified as two types, one is the the `BASE` type, it represents the full amount of data (Maybe AOF or RDB format) after each AOFRW, there is only one `BASE` file at most. The second is `INCR` type, may have more than one. They represent the incremental commands since the last AOFRW. 3. Use a AOF manifest file to record and manage these AOF files mentioned above. 4. The original configuration of `appendfilename` will be the base part of the new file name, for example: `appendonly.aof.1.base.rdb` and `appendonly.aof.2.incr.aof` 5. Add manifest-related TCL tests, and modified some existing tests that depend on the `appendfilename` 6. Remove the `aof_rewrite_buffer_length` field in info. 7. Add `aof-disable-auto-gc` configuration. By default we're automatically deleting HISTORY type AOFs. It also gives users the opportunity to preserve the history AOFs. just for testing use now. 8. Add AOFRW limiting measure. When the AOFRW failures reaches the threshold (3 times now), we will delay the execution of the next AOFRW by 1 minute. If the next AOFRW also fails, it will be delayed by 2 minutes. The next is 4, 8, 16, the maximum delay is 60 minutes (1 hour). During the limit period, we can still use the 'bgrewriteaof' command to execute AOFRW immediately. 9. Support upgrade (load) data from old version redis. 10. Add `appenddirname` configuration, as the directory name of the append only files. All AOF files and manifest file will be placed in this directory. 11. Only the last AOF file (BASE or INCR) can be truncated. Otherwise redis will exit even if `aof-load-truncated` is enabled. Co-authored-by: Oran Agra <oran@redislabs.com>
2022-01-04 01:14:13 +08:00
server.aof_last_incr_size += nwritten;
sdsrange(server.aof_buf,nwritten,-1);
}
return; /* We'll try again on the next call... */
}
} else {
/* Successful write(2). If AOF was in error state, restore the
* OK state and log the event. */
if (server.aof_last_write_status == C_ERR) {
2015-07-27 09:41:48 +02:00
serverLog(LL_WARNING,
"AOF write error looks solved, Redis can write again.");
server.aof_last_write_status = C_OK;
2011-08-18 12:27:34 +02:00
}
}
server.aof_current_size += nwritten;
Implement Multi Part AOF mechanism to avoid AOFRW overheads. (#9788) Implement Multi-Part AOF mechanism to avoid overheads during AOFRW. Introducing a folder with multiple AOF files tracked by a manifest file. The main issues with the the original AOFRW mechanism are: * buffering of commands that are processed during rewrite (consuming a lot of RAM) * freezes of the main process when the AOFRW completes to drain the remaining part of the buffer and fsync it. * double disk IO for the data that arrives during AOFRW (had to be written to both the old and new AOF files) The main modifications of this PR: 1. Remove the AOF rewrite buffer and related code. 2. Divide the AOF into multiple files, they are classified as two types, one is the the `BASE` type, it represents the full amount of data (Maybe AOF or RDB format) after each AOFRW, there is only one `BASE` file at most. The second is `INCR` type, may have more than one. They represent the incremental commands since the last AOFRW. 3. Use a AOF manifest file to record and manage these AOF files mentioned above. 4. The original configuration of `appendfilename` will be the base part of the new file name, for example: `appendonly.aof.1.base.rdb` and `appendonly.aof.2.incr.aof` 5. Add manifest-related TCL tests, and modified some existing tests that depend on the `appendfilename` 6. Remove the `aof_rewrite_buffer_length` field in info. 7. Add `aof-disable-auto-gc` configuration. By default we're automatically deleting HISTORY type AOFs. It also gives users the opportunity to preserve the history AOFs. just for testing use now. 8. Add AOFRW limiting measure. When the AOFRW failures reaches the threshold (3 times now), we will delay the execution of the next AOFRW by 1 minute. If the next AOFRW also fails, it will be delayed by 2 minutes. The next is 4, 8, 16, the maximum delay is 60 minutes (1 hour). During the limit period, we can still use the 'bgrewriteaof' command to execute AOFRW immediately. 9. Support upgrade (load) data from old version redis. 10. Add `appenddirname` configuration, as the directory name of the append only files. All AOF files and manifest file will be placed in this directory. 11. Only the last AOF file (BASE or INCR) can be truncated. Otherwise redis will exit even if `aof-load-truncated` is enabled. Co-authored-by: Oran Agra <oran@redislabs.com>
2022-01-04 01:14:13 +08:00
server.aof_last_incr_size += nwritten;
/* Re-use AOF buffer when it is small enough. The maximum comes from the
* arena size of 4k minus some overhead (but is otherwise arbitrary). */
2011-12-21 12:17:02 +01:00
if ((sdslen(server.aof_buf)+sdsavail(server.aof_buf)) < 4000) {
sdsclear(server.aof_buf);
} else {
2011-12-21 12:17:02 +01:00
sdsfree(server.aof_buf);
server.aof_buf = sdsempty();
}
try_fsync:
2011-08-18 12:25:59 +02:00
/* Don't fsync if no-appendfsync-on-rewrite is set to yes and there are
* children doing I/O in the background. */
if (server.aof_no_fsync_on_rewrite && hasActiveChildProcess())
return;
2011-08-18 12:25:59 +02:00
/* Perform the fsync if needed. */
if (server.aof_fsync == AOF_FSYNC_ALWAYS) {
/* redis_fsync is defined as fdatasync() for Linux in order to avoid
* flushing metadata. */
latencyStartMonitor(latency);
/* Let's try to get this data on the disk. To guarantee data safe when
* the AOF fsync policy is 'always', we should exit if failed to fsync
* AOF (see comment next to the exit(1) after write error above). */
if (redis_fsync(server.aof_fd) == -1) {
serverLog(LL_WARNING,"Can't persist AOF for fsync error when the "
"AOF fsync policy is 'always': %s. Exiting...", strerror(errno));
exit(1);
}
latencyEndMonitor(latency);
latencyAddSampleIfNeeded("aof-fsync-always",latency);
server.aof_fsync_offset = server.aof_current_size;
2011-12-21 12:17:02 +01:00
server.aof_last_fsync = server.unixtime;
} else if ((server.aof_fsync == AOF_FSYNC_EVERYSEC &&
2011-12-21 12:17:02 +01:00
server.unixtime > server.aof_last_fsync)) {
if (!sync_in_progress) {
aof_background_fsync(server.aof_fd);
server.aof_fsync_offset = server.aof_current_size;
}
2011-12-21 12:17:02 +01:00
server.aof_last_fsync = server.unixtime;
}
}
2011-08-18 13:03:04 +02:00
sds catAppendOnlyGenericCommand(sds dst, int argc, robj **argv) {
char buf[32];
int len, j;
robj *o;
buf[0] = '*';
len = 1+ll2string(buf+1,sizeof(buf)-1,argc);
buf[len++] = '\r';
buf[len++] = '\n';
dst = sdscatlen(dst,buf,len);
for (j = 0; j < argc; j++) {
2011-08-18 13:03:04 +02:00
o = getDecodedObject(argv[j]);
buf[0] = '$';
len = 1+ll2string(buf+1,sizeof(buf)-1,sdslen(o->ptr));
buf[len++] = '\r';
buf[len++] = '\n';
dst = sdscatlen(dst,buf,len);
dst = sdscatlen(dst,o->ptr,sdslen(o->ptr));
dst = sdscatlen(dst,"\r\n",2);
decrRefCount(o);
}
2011-08-18 13:03:04 +02:00
return dst;
}
/* Generate a piece of timestamp annotation for AOF if current record timestamp
* in AOF is not equal server unix time. If we specify 'force' argument to 1,
* we would generate one without check, currently, it is useful in AOF rewriting
* child process which always needs to record one timestamp at the beginning of
* rewriting AOF.
*
* Timestamp annotation format is "#TS:${timestamp}\r\n". "TS" is short of
* timestamp and this method could save extra bytes in AOF. */
sds genAofTimestampAnnotationIfNeeded(int force) {
sds ts = NULL;
if (force || server.aof_cur_timestamp < server.unixtime) {
server.aof_cur_timestamp = force ? time(NULL) : server.unixtime;
ts = sdscatfmt(sdsempty(), "#TS:%I\r\n", server.aof_cur_timestamp);
serverAssert(sdslen(ts) <= AOF_ANNOTATION_LINE_MAX_LEN);
}
return ts;
}
void feedAppendOnlyFile(int dictid, robj **argv, int argc) {
sds buf = sdsempty();
Sort out mess around propagation and MULTI/EXEC (#9890) The mess: Some parts use alsoPropagate for late propagation, others using an immediate one (propagate()), causing edge cases, ugly/hacky code, and the tendency for bugs The basic idea is that all commands are propagated via alsoPropagate (i.e. added to a list) and the top-most call() is responsible for going over that list and actually propagating them (and wrapping them in MULTI/EXEC if there's more than one command). This is done in the new function, propagatePendingCommands. Callers to propagatePendingCommands: 1. top-most call() (we want all nested call()s to add to the also_propagate array and just the top-most one to propagate them) - via `afterCommand` 2. handleClientsBlockedOnKeys: it is out of call() context and it may propagate stuff - via `afterCommand`. 3. handleClientsBlockedOnKeys edge case: if the looked-up key is already expired, we will propagate the expire but will not unblock any client so `afterCommand` isn't called. in that case, we have to propagate the deletion explicitly. 4. cron stuff: active-expire and eviction may also propagate stuff 5. modules: the module API allows to propagate stuff from just about anywhere (timers, keyspace notifications, threads). I could have tried to catch all the out-of-call-context places but it seemed easier to handle it in one place: when we free the context. in the spirit of what was done in call(), only the top-most freeing of a module context may cause propagation. 6. modules: when using a thread-safe ctx it's not clear when/if the ctx will be freed. we do know that the module must lock the GIL before calling RM_Replicate/RM_Call so we propagate the pending commands when releasing the GIL. A "known limitation", which were actually a bug, was fixed because of this commit (see propagate.tcl): When using a mix of RM_Call with `!` and RM_Replicate, the command would propagate out-of-order: first all the commands from RM_Call, and then the ones from RM_Replicate Another thing worth mentioning is that if, in the past, a client would issue a MULTI/EXEC with just one write command the server would blindly propagate the MULTI/EXEC too, even though it's redundant. not anymore. This commit renames propagate() to propagateNow() in order to cause conflicts in pending PRs. propagatePendingCommands is the only caller of propagateNow, which is now a static, internal helper function. Optimizations: 1. alsoPropagate will not add stuff to also_propagate if there's no AOF and replicas 2. alsoPropagate reallocs also_propagagte exponentially, to save calls to memmove Bugfixes: 1. CONFIG SET can create evictions, sending notifications which can cause to dirty++ with modules. we need to prevent it from propagating to AOF/replicas 2. We need to set current_client in RM_Call. buggy scenario: - CONFIG SET maxmemory, eviction notifications, module hook calls RM_Call - assertion in lookupKey crashes, because current_client has CONFIG SET, which isn't CMD_WRITE 3. minor: in eviction, call propagateDeletion after notification, like active-expire and all commands (we always send a notification before propagating the command)
2021-12-22 23:03:48 +01:00
serverAssert(dictid >= 0 && dictid < server.dbnum);
/* Feed timestamp if needed */
if (server.aof_timestamp_enabled) {
sds ts = genAofTimestampAnnotationIfNeeded(0);
if (ts != NULL) {
buf = sdscatsds(buf, ts);
sdsfree(ts);
}
}
2013-01-17 01:00:20 +08:00
/* The DB this command was targeting is not the same as the last command
2014-07-12 18:02:17 +02:00
* we appended. To issue a SELECT command is needed. */
2011-12-21 12:17:02 +01:00
if (dictid != server.aof_selected_db) {
char seldb[64];
snprintf(seldb,sizeof(seldb),"%d",dictid);
buf = sdscatprintf(buf,"*2\r\n$6\r\nSELECT\r\n$%lu\r\n%s\r\n",
(unsigned long)strlen(seldb),seldb);
2011-12-21 12:17:02 +01:00
server.aof_selected_db = dictid;
}
/* All commands should be propagated the same way in AOF as in replication.
* No need for AOF-specific translation. */
buf = catAppendOnlyGenericCommand(buf,argc,argv);
/* Append to the AOF buffer. This will be flushed on disk just before
* of re-entering the event loop, so before the client will get a
* positive reply about the operation performed. */
Implement Multi Part AOF mechanism to avoid AOFRW overheads. (#9788) Implement Multi-Part AOF mechanism to avoid overheads during AOFRW. Introducing a folder with multiple AOF files tracked by a manifest file. The main issues with the the original AOFRW mechanism are: * buffering of commands that are processed during rewrite (consuming a lot of RAM) * freezes of the main process when the AOFRW completes to drain the remaining part of the buffer and fsync it. * double disk IO for the data that arrives during AOFRW (had to be written to both the old and new AOF files) The main modifications of this PR: 1. Remove the AOF rewrite buffer and related code. 2. Divide the AOF into multiple files, they are classified as two types, one is the the `BASE` type, it represents the full amount of data (Maybe AOF or RDB format) after each AOFRW, there is only one `BASE` file at most. The second is `INCR` type, may have more than one. They represent the incremental commands since the last AOFRW. 3. Use a AOF manifest file to record and manage these AOF files mentioned above. 4. The original configuration of `appendfilename` will be the base part of the new file name, for example: `appendonly.aof.1.base.rdb` and `appendonly.aof.2.incr.aof` 5. Add manifest-related TCL tests, and modified some existing tests that depend on the `appendfilename` 6. Remove the `aof_rewrite_buffer_length` field in info. 7. Add `aof-disable-auto-gc` configuration. By default we're automatically deleting HISTORY type AOFs. It also gives users the opportunity to preserve the history AOFs. just for testing use now. 8. Add AOFRW limiting measure. When the AOFRW failures reaches the threshold (3 times now), we will delay the execution of the next AOFRW by 1 minute. If the next AOFRW also fails, it will be delayed by 2 minutes. The next is 4, 8, 16, the maximum delay is 60 minutes (1 hour). During the limit period, we can still use the 'bgrewriteaof' command to execute AOFRW immediately. 9. Support upgrade (load) data from old version redis. 10. Add `appenddirname` configuration, as the directory name of the append only files. All AOF files and manifest file will be placed in this directory. 11. Only the last AOF file (BASE or INCR) can be truncated. Otherwise redis will exit even if `aof-load-truncated` is enabled. Co-authored-by: Oran Agra <oran@redislabs.com>
2022-01-04 01:14:13 +08:00
if (server.aof_state == AOF_ON ||
server.child_type == CHILD_TYPE_AOF)
{
server.aof_buf = sdscatlen(server.aof_buf, buf, sdslen(buf));
}
sdsfree(buf);
}
/* ----------------------------------------------------------------------------
* AOF loading
* ------------------------------------------------------------------------- */
/* In Redis commands are always executed in the context of a client, so in
* order to load the append only file we need to create a fake client. */
struct client *createAOFClient(void) {
struct client *c = createClient(NULL);
c->id = CLIENT_ID_AOF; /* So modules can identify it's the AOF client. */
Unified MULTI, LUA, and RM_Call with respect to blocking commands (#8025) Blocking command should not be used with MULTI, LUA, and RM_Call. This is because, the caller, who executes the command in this context, expects a reply. Today, LUA and MULTI have a special (and different) treatment to blocking commands: LUA - Most commands are marked with no-script flag which are checked when executing and command from LUA, commands that are not marked (like XREAD) verify that their blocking mode is not used inside LUA (by checking the CLIENT_LUA client flag). MULTI - Command that is going to block, first verify that the client is not inside multi (by checking the CLIENT_MULTI client flag). If the client is inside multi, they return a result which is a match to the empty key with no timeout (for example blpop inside MULTI will act as lpop) For modules that perform RM_Call with blocking command, the returned results type is REDISMODULE_REPLY_UNKNOWN and the caller can not really know what happened. Disadvantages of the current state are: No unified approach, LUA, MULTI, and RM_Call, each has a different treatment Module can not safely execute blocking command (and get reply or error). Though It is true that modules are not like LUA or MULTI and should be smarter not to execute blocking commands on RM_Call, sometimes you want to execute a command base on client input (for example if you create a module that provides a new scripting language like javascript or python). While modules (on modules command) can check for REDISMODULE_CTX_FLAGS_LUA or REDISMODULE_CTX_FLAGS_MULTI to know not to block the client, there is no way to check if the command came from another module using RM_Call. So there is no way for a module to know not to block another module RM_Call execution. This commit adds a way to unify the treatment for blocking clients by introducing a new CLIENT_DENY_BLOCKING client flag. On LUA, MULTI, and RM_Call the new flag turned on to signify that the client should not be blocked. A blocking command verifies that the flag is turned off before blocking. If a blocking command sees that the CLIENT_DENY_BLOCKING flag is on, it's not blocking and return results which are matches to empty key with no timeout (as MULTI does today). The new flag is checked on the following commands: List blocking commands: BLPOP, BRPOP, BRPOPLPUSH, BLMOVE, Zset blocking commands: BZPOPMIN, BZPOPMAX Stream blocking commands: XREAD, XREADGROUP SUBSCRIBE, PSUBSCRIBE, MONITOR In addition, the new flag is turned on inside the AOF client, we do not want to block the AOF client to prevent deadlocks and commands ordering issues (and there is also an existing assert in the code that verifies it). To keep backward compatibility on LUA, all the no-script flags on existing commands were kept untouched. In addition, a LUA special treatment on XREAD and XREADGROUP was kept. To keep backward compatibility on MULTI (which today allows SUBSCRIBE, and PSUBSCRIBE). We added a special treatment on those commands to allow executing them on MULTI. The only backward compatibility issue that this PR introduces is that now MONITOR is not allowed inside MULTI. Tests were added to verify blocking commands are not blocking the client on LUA, MULTI, or RM_Call. Tests were added to verify the module can check for CLIENT_DENY_BLOCKING flag. Co-authored-by: Oran Agra <oran@redislabs.com> Co-authored-by: Itamar Haber <itamar@redislabs.com>
2020-11-17 18:58:55 +02:00
/*
* The AOF client should never be blocked (unlike master
* replication connection).
* This is because blocking the AOF client might cause
* deadlock (because potentially no one will unblock it).
* Also, if the AOF client will be blocked just for
* background processing there is a chance that the
* command execution order will be violated.
*/
c->flags = CLIENT_DENY_BLOCKING;
/* We set the fake client as a slave waiting for the synchronization
* so that Redis will not try to send replies to this client. */
2015-07-27 09:41:48 +02:00
c->replstate = SLAVE_STATE_WAIT_BGSAVE_START;
return c;
}
Implement Multi Part AOF mechanism to avoid AOFRW overheads. (#9788) Implement Multi-Part AOF mechanism to avoid overheads during AOFRW. Introducing a folder with multiple AOF files tracked by a manifest file. The main issues with the the original AOFRW mechanism are: * buffering of commands that are processed during rewrite (consuming a lot of RAM) * freezes of the main process when the AOFRW completes to drain the remaining part of the buffer and fsync it. * double disk IO for the data that arrives during AOFRW (had to be written to both the old and new AOF files) The main modifications of this PR: 1. Remove the AOF rewrite buffer and related code. 2. Divide the AOF into multiple files, they are classified as two types, one is the the `BASE` type, it represents the full amount of data (Maybe AOF or RDB format) after each AOFRW, there is only one `BASE` file at most. The second is `INCR` type, may have more than one. They represent the incremental commands since the last AOFRW. 3. Use a AOF manifest file to record and manage these AOF files mentioned above. 4. The original configuration of `appendfilename` will be the base part of the new file name, for example: `appendonly.aof.1.base.rdb` and `appendonly.aof.2.incr.aof` 5. Add manifest-related TCL tests, and modified some existing tests that depend on the `appendfilename` 6. Remove the `aof_rewrite_buffer_length` field in info. 7. Add `aof-disable-auto-gc` configuration. By default we're automatically deleting HISTORY type AOFs. It also gives users the opportunity to preserve the history AOFs. just for testing use now. 8. Add AOFRW limiting measure. When the AOFRW failures reaches the threshold (3 times now), we will delay the execution of the next AOFRW by 1 minute. If the next AOFRW also fails, it will be delayed by 2 minutes. The next is 4, 8, 16, the maximum delay is 60 minutes (1 hour). During the limit period, we can still use the 'bgrewriteaof' command to execute AOFRW immediately. 9. Support upgrade (load) data from old version redis. 10. Add `appenddirname` configuration, as the directory name of the append only files. All AOF files and manifest file will be placed in this directory. 11. Only the last AOF file (BASE or INCR) can be truncated. Otherwise redis will exit even if `aof-load-truncated` is enabled. Co-authored-by: Oran Agra <oran@redislabs.com>
2022-01-04 01:14:13 +08:00
/* Replay an append log file. On success AOF_OK or AOF_TRUNCATED is returned,
* otherwise, one of the following is returned:
* AOF_OPEN_ERR: Failed to open the AOF file.
* AOF_NOT_EXIST: AOF file doesn't exist.
* AOF_EMPTY: The AOF file is empty (nothing to load).
* AOF_FAILED: Failed to load the AOF file. */
Implement Multi Part AOF mechanism to avoid AOFRW overheads. (#9788) Implement Multi-Part AOF mechanism to avoid overheads during AOFRW. Introducing a folder with multiple AOF files tracked by a manifest file. The main issues with the the original AOFRW mechanism are: * buffering of commands that are processed during rewrite (consuming a lot of RAM) * freezes of the main process when the AOFRW completes to drain the remaining part of the buffer and fsync it. * double disk IO for the data that arrives during AOFRW (had to be written to both the old and new AOF files) The main modifications of this PR: 1. Remove the AOF rewrite buffer and related code. 2. Divide the AOF into multiple files, they are classified as two types, one is the the `BASE` type, it represents the full amount of data (Maybe AOF or RDB format) after each AOFRW, there is only one `BASE` file at most. The second is `INCR` type, may have more than one. They represent the incremental commands since the last AOFRW. 3. Use a AOF manifest file to record and manage these AOF files mentioned above. 4. The original configuration of `appendfilename` will be the base part of the new file name, for example: `appendonly.aof.1.base.rdb` and `appendonly.aof.2.incr.aof` 5. Add manifest-related TCL tests, and modified some existing tests that depend on the `appendfilename` 6. Remove the `aof_rewrite_buffer_length` field in info. 7. Add `aof-disable-auto-gc` configuration. By default we're automatically deleting HISTORY type AOFs. It also gives users the opportunity to preserve the history AOFs. just for testing use now. 8. Add AOFRW limiting measure. When the AOFRW failures reaches the threshold (3 times now), we will delay the execution of the next AOFRW by 1 minute. If the next AOFRW also fails, it will be delayed by 2 minutes. The next is 4, 8, 16, the maximum delay is 60 minutes (1 hour). During the limit period, we can still use the 'bgrewriteaof' command to execute AOFRW immediately. 9. Support upgrade (load) data from old version redis. 10. Add `appenddirname` configuration, as the directory name of the append only files. All AOF files and manifest file will be placed in this directory. 11. Only the last AOF file (BASE or INCR) can be truncated. Otherwise redis will exit even if `aof-load-truncated` is enabled. Co-authored-by: Oran Agra <oran@redislabs.com>
2022-01-04 01:14:13 +08:00
int loadSingleAppendOnlyFile(char *filename) {
struct client *fakeClient;
struct redis_stat sb;
int old_aof_state = server.aof_state;
long loops = 0;
off_t valid_up_to = 0; /* Offset of latest well-formed command loaded. */
off_t valid_before_multi = 0; /* Offset before MULTI command loaded. */
Implement Multi Part AOF mechanism to avoid AOFRW overheads. (#9788) Implement Multi-Part AOF mechanism to avoid overheads during AOFRW. Introducing a folder with multiple AOF files tracked by a manifest file. The main issues with the the original AOFRW mechanism are: * buffering of commands that are processed during rewrite (consuming a lot of RAM) * freezes of the main process when the AOFRW completes to drain the remaining part of the buffer and fsync it. * double disk IO for the data that arrives during AOFRW (had to be written to both the old and new AOF files) The main modifications of this PR: 1. Remove the AOF rewrite buffer and related code. 2. Divide the AOF into multiple files, they are classified as two types, one is the the `BASE` type, it represents the full amount of data (Maybe AOF or RDB format) after each AOFRW, there is only one `BASE` file at most. The second is `INCR` type, may have more than one. They represent the incremental commands since the last AOFRW. 3. Use a AOF manifest file to record and manage these AOF files mentioned above. 4. The original configuration of `appendfilename` will be the base part of the new file name, for example: `appendonly.aof.1.base.rdb` and `appendonly.aof.2.incr.aof` 5. Add manifest-related TCL tests, and modified some existing tests that depend on the `appendfilename` 6. Remove the `aof_rewrite_buffer_length` field in info. 7. Add `aof-disable-auto-gc` configuration. By default we're automatically deleting HISTORY type AOFs. It also gives users the opportunity to preserve the history AOFs. just for testing use now. 8. Add AOFRW limiting measure. When the AOFRW failures reaches the threshold (3 times now), we will delay the execution of the next AOFRW by 1 minute. If the next AOFRW also fails, it will be delayed by 2 minutes. The next is 4, 8, 16, the maximum delay is 60 minutes (1 hour). During the limit period, we can still use the 'bgrewriteaof' command to execute AOFRW immediately. 9. Support upgrade (load) data from old version redis. 10. Add `appenddirname` configuration, as the directory name of the append only files. All AOF files and manifest file will be placed in this directory. 11. Only the last AOF file (BASE or INCR) can be truncated. Otherwise redis will exit even if `aof-load-truncated` is enabled. Co-authored-by: Oran Agra <oran@redislabs.com>
2022-01-04 01:14:13 +08:00
off_t last_progress_report_size = 0;
int ret = C_OK;
Implement Multi Part AOF mechanism to avoid AOFRW overheads. (#9788) Implement Multi-Part AOF mechanism to avoid overheads during AOFRW. Introducing a folder with multiple AOF files tracked by a manifest file. The main issues with the the original AOFRW mechanism are: * buffering of commands that are processed during rewrite (consuming a lot of RAM) * freezes of the main process when the AOFRW completes to drain the remaining part of the buffer and fsync it. * double disk IO for the data that arrives during AOFRW (had to be written to both the old and new AOF files) The main modifications of this PR: 1. Remove the AOF rewrite buffer and related code. 2. Divide the AOF into multiple files, they are classified as two types, one is the the `BASE` type, it represents the full amount of data (Maybe AOF or RDB format) after each AOFRW, there is only one `BASE` file at most. The second is `INCR` type, may have more than one. They represent the incremental commands since the last AOFRW. 3. Use a AOF manifest file to record and manage these AOF files mentioned above. 4. The original configuration of `appendfilename` will be the base part of the new file name, for example: `appendonly.aof.1.base.rdb` and `appendonly.aof.2.incr.aof` 5. Add manifest-related TCL tests, and modified some existing tests that depend on the `appendfilename` 6. Remove the `aof_rewrite_buffer_length` field in info. 7. Add `aof-disable-auto-gc` configuration. By default we're automatically deleting HISTORY type AOFs. It also gives users the opportunity to preserve the history AOFs. just for testing use now. 8. Add AOFRW limiting measure. When the AOFRW failures reaches the threshold (3 times now), we will delay the execution of the next AOFRW by 1 minute. If the next AOFRW also fails, it will be delayed by 2 minutes. The next is 4, 8, 16, the maximum delay is 60 minutes (1 hour). During the limit period, we can still use the 'bgrewriteaof' command to execute AOFRW immediately. 9. Support upgrade (load) data from old version redis. 10. Add `appenddirname` configuration, as the directory name of the append only files. All AOF files and manifest file will be placed in this directory. 11. Only the last AOF file (BASE or INCR) can be truncated. Otherwise redis will exit even if `aof-load-truncated` is enabled. Co-authored-by: Oran Agra <oran@redislabs.com>
2022-01-04 01:14:13 +08:00
sds aof_filepath = makePath(server.aof_dirname, filename);
FILE *fp = fopen(aof_filepath, "r");
if (fp == NULL) {
int en = errno;
Implement Multi Part AOF mechanism to avoid AOFRW overheads. (#9788) Implement Multi-Part AOF mechanism to avoid overheads during AOFRW. Introducing a folder with multiple AOF files tracked by a manifest file. The main issues with the the original AOFRW mechanism are: * buffering of commands that are processed during rewrite (consuming a lot of RAM) * freezes of the main process when the AOFRW completes to drain the remaining part of the buffer and fsync it. * double disk IO for the data that arrives during AOFRW (had to be written to both the old and new AOF files) The main modifications of this PR: 1. Remove the AOF rewrite buffer and related code. 2. Divide the AOF into multiple files, they are classified as two types, one is the the `BASE` type, it represents the full amount of data (Maybe AOF or RDB format) after each AOFRW, there is only one `BASE` file at most. The second is `INCR` type, may have more than one. They represent the incremental commands since the last AOFRW. 3. Use a AOF manifest file to record and manage these AOF files mentioned above. 4. The original configuration of `appendfilename` will be the base part of the new file name, for example: `appendonly.aof.1.base.rdb` and `appendonly.aof.2.incr.aof` 5. Add manifest-related TCL tests, and modified some existing tests that depend on the `appendfilename` 6. Remove the `aof_rewrite_buffer_length` field in info. 7. Add `aof-disable-auto-gc` configuration. By default we're automatically deleting HISTORY type AOFs. It also gives users the opportunity to preserve the history AOFs. just for testing use now. 8. Add AOFRW limiting measure. When the AOFRW failures reaches the threshold (3 times now), we will delay the execution of the next AOFRW by 1 minute. If the next AOFRW also fails, it will be delayed by 2 minutes. The next is 4, 8, 16, the maximum delay is 60 minutes (1 hour). During the limit period, we can still use the 'bgrewriteaof' command to execute AOFRW immediately. 9. Support upgrade (load) data from old version redis. 10. Add `appenddirname` configuration, as the directory name of the append only files. All AOF files and manifest file will be placed in this directory. 11. Only the last AOF file (BASE or INCR) can be truncated. Otherwise redis will exit even if `aof-load-truncated` is enabled. Co-authored-by: Oran Agra <oran@redislabs.com>
2022-01-04 01:14:13 +08:00
if (redis_stat(aof_filepath, &sb) == 0 || errno != ENOENT) {
serverLog(LL_WARNING,"Fatal error: can't open the append log file %s for reading: %s", filename, strerror(en));
sdsfree(aof_filepath);
return AOF_OPEN_ERR;
} else {
Implement Multi Part AOF mechanism to avoid AOFRW overheads. (#9788) Implement Multi-Part AOF mechanism to avoid overheads during AOFRW. Introducing a folder with multiple AOF files tracked by a manifest file. The main issues with the the original AOFRW mechanism are: * buffering of commands that are processed during rewrite (consuming a lot of RAM) * freezes of the main process when the AOFRW completes to drain the remaining part of the buffer and fsync it. * double disk IO for the data that arrives during AOFRW (had to be written to both the old and new AOF files) The main modifications of this PR: 1. Remove the AOF rewrite buffer and related code. 2. Divide the AOF into multiple files, they are classified as two types, one is the the `BASE` type, it represents the full amount of data (Maybe AOF or RDB format) after each AOFRW, there is only one `BASE` file at most. The second is `INCR` type, may have more than one. They represent the incremental commands since the last AOFRW. 3. Use a AOF manifest file to record and manage these AOF files mentioned above. 4. The original configuration of `appendfilename` will be the base part of the new file name, for example: `appendonly.aof.1.base.rdb` and `appendonly.aof.2.incr.aof` 5. Add manifest-related TCL tests, and modified some existing tests that depend on the `appendfilename` 6. Remove the `aof_rewrite_buffer_length` field in info. 7. Add `aof-disable-auto-gc` configuration. By default we're automatically deleting HISTORY type AOFs. It also gives users the opportunity to preserve the history AOFs. just for testing use now. 8. Add AOFRW limiting measure. When the AOFRW failures reaches the threshold (3 times now), we will delay the execution of the next AOFRW by 1 minute. If the next AOFRW also fails, it will be delayed by 2 minutes. The next is 4, 8, 16, the maximum delay is 60 minutes (1 hour). During the limit period, we can still use the 'bgrewriteaof' command to execute AOFRW immediately. 9. Support upgrade (load) data from old version redis. 10. Add `appenddirname` configuration, as the directory name of the append only files. All AOF files and manifest file will be placed in this directory. 11. Only the last AOF file (BASE or INCR) can be truncated. Otherwise redis will exit even if `aof-load-truncated` is enabled. Co-authored-by: Oran Agra <oran@redislabs.com>
2022-01-04 01:14:13 +08:00
serverLog(LL_WARNING,"The append log file %s doesn't exist: %s", filename, strerror(errno));
sdsfree(aof_filepath);
return AOF_NOT_EXIST;
}
}
2011-03-04 16:13:54 +01:00
if (fp && redis_fstat(fileno(fp),&sb) != -1 && sb.st_size == 0) {
fclose(fp);
Implement Multi Part AOF mechanism to avoid AOFRW overheads. (#9788) Implement Multi-Part AOF mechanism to avoid overheads during AOFRW. Introducing a folder with multiple AOF files tracked by a manifest file. The main issues with the the original AOFRW mechanism are: * buffering of commands that are processed during rewrite (consuming a lot of RAM) * freezes of the main process when the AOFRW completes to drain the remaining part of the buffer and fsync it. * double disk IO for the data that arrives during AOFRW (had to be written to both the old and new AOF files) The main modifications of this PR: 1. Remove the AOF rewrite buffer and related code. 2. Divide the AOF into multiple files, they are classified as two types, one is the the `BASE` type, it represents the full amount of data (Maybe AOF or RDB format) after each AOFRW, there is only one `BASE` file at most. The second is `INCR` type, may have more than one. They represent the incremental commands since the last AOFRW. 3. Use a AOF manifest file to record and manage these AOF files mentioned above. 4. The original configuration of `appendfilename` will be the base part of the new file name, for example: `appendonly.aof.1.base.rdb` and `appendonly.aof.2.incr.aof` 5. Add manifest-related TCL tests, and modified some existing tests that depend on the `appendfilename` 6. Remove the `aof_rewrite_buffer_length` field in info. 7. Add `aof-disable-auto-gc` configuration. By default we're automatically deleting HISTORY type AOFs. It also gives users the opportunity to preserve the history AOFs. just for testing use now. 8. Add AOFRW limiting measure. When the AOFRW failures reaches the threshold (3 times now), we will delay the execution of the next AOFRW by 1 minute. If the next AOFRW also fails, it will be delayed by 2 minutes. The next is 4, 8, 16, the maximum delay is 60 minutes (1 hour). During the limit period, we can still use the 'bgrewriteaof' command to execute AOFRW immediately. 9. Support upgrade (load) data from old version redis. 10. Add `appenddirname` configuration, as the directory name of the append only files. All AOF files and manifest file will be placed in this directory. 11. Only the last AOF file (BASE or INCR) can be truncated. Otherwise redis will exit even if `aof-load-truncated` is enabled. Co-authored-by: Oran Agra <oran@redislabs.com>
2022-01-04 01:14:13 +08:00
sdsfree(aof_filepath);
return AOF_EMPTY;
2011-03-04 16:13:54 +01:00
}
/* Temporarily disable AOF, to prevent EXEC from feeding a MULTI
* to the same file we're about to read. */
2015-07-27 09:41:48 +02:00
server.aof_state = AOF_OFF;
client *old_client = server.current_client;
fakeClient = server.current_client = createAOFClient();
/* Check if the AOF file is in RDB format (it may be RDB encoded base AOF
* or old style RDB-preamble AOF). In that case we need to load the RDB file
* and later continue loading the AOF tail if it is an old style RDB-preamble AOF. */
char sig[5]; /* "REDIS" */
if (fread(sig,1,5,fp) != 5 || memcmp(sig,"REDIS",5) != 0) {
/* Not in RDB format, seek back at 0 offset. */
if (fseek(fp,0,SEEK_SET) == -1) goto readerr;
} else {
/* RDB format. Pass loading the RDB functions. */
rio rdb;
int old_style = !strcmp(filename, server.aof_filename);
if (old_style)
serverLog(LL_NOTICE, "Reading RDB preamble from AOF file...");
else
serverLog(LL_NOTICE, "Reading RDB base file on AOF loading...");
if (fseek(fp,0,SEEK_SET) == -1) goto readerr;
rioInitWithFile(&rdb,fp);
if (rdbLoadRio(&rdb,RDBFLAGS_AOF_PREAMBLE,NULL) != C_OK) {
if (old_style)
serverLog(LL_WARNING, "Error reading the RDB preamble of the AOF file %s, AOF loading aborted", filename);
else
serverLog(LL_WARNING, "Error reading the RDB base file %s, AOF loading aborted", filename);
goto readerr;
} else {
Implement Multi Part AOF mechanism to avoid AOFRW overheads. (#9788) Implement Multi-Part AOF mechanism to avoid overheads during AOFRW. Introducing a folder with multiple AOF files tracked by a manifest file. The main issues with the the original AOFRW mechanism are: * buffering of commands that are processed during rewrite (consuming a lot of RAM) * freezes of the main process when the AOFRW completes to drain the remaining part of the buffer and fsync it. * double disk IO for the data that arrives during AOFRW (had to be written to both the old and new AOF files) The main modifications of this PR: 1. Remove the AOF rewrite buffer and related code. 2. Divide the AOF into multiple files, they are classified as two types, one is the the `BASE` type, it represents the full amount of data (Maybe AOF or RDB format) after each AOFRW, there is only one `BASE` file at most. The second is `INCR` type, may have more than one. They represent the incremental commands since the last AOFRW. 3. Use a AOF manifest file to record and manage these AOF files mentioned above. 4. The original configuration of `appendfilename` will be the base part of the new file name, for example: `appendonly.aof.1.base.rdb` and `appendonly.aof.2.incr.aof` 5. Add manifest-related TCL tests, and modified some existing tests that depend on the `appendfilename` 6. Remove the `aof_rewrite_buffer_length` field in info. 7. Add `aof-disable-auto-gc` configuration. By default we're automatically deleting HISTORY type AOFs. It also gives users the opportunity to preserve the history AOFs. just for testing use now. 8. Add AOFRW limiting measure. When the AOFRW failures reaches the threshold (3 times now), we will delay the execution of the next AOFRW by 1 minute. If the next AOFRW also fails, it will be delayed by 2 minutes. The next is 4, 8, 16, the maximum delay is 60 minutes (1 hour). During the limit period, we can still use the 'bgrewriteaof' command to execute AOFRW immediately. 9. Support upgrade (load) data from old version redis. 10. Add `appenddirname` configuration, as the directory name of the append only files. All AOF files and manifest file will be placed in this directory. 11. Only the last AOF file (BASE or INCR) can be truncated. Otherwise redis will exit even if `aof-load-truncated` is enabled. Co-authored-by: Oran Agra <oran@redislabs.com>
2022-01-04 01:14:13 +08:00
loadingAbsProgress(ftello(fp));
last_progress_report_size = ftello(fp);
if (old_style) serverLog(LL_NOTICE, "Reading the remaining AOF tail...");
}
}
/* Read the actual AOF file, in REPL format, command by command. */
while(1) {
int argc, j;
unsigned long len;
robj **argv;
char buf[AOF_ANNOTATION_LINE_MAX_LEN];
sds argsds;
struct redisCommand *cmd;
/* Serve the clients from time to time */
Implement Multi Part AOF mechanism to avoid AOFRW overheads. (#9788) Implement Multi-Part AOF mechanism to avoid overheads during AOFRW. Introducing a folder with multiple AOF files tracked by a manifest file. The main issues with the the original AOFRW mechanism are: * buffering of commands that are processed during rewrite (consuming a lot of RAM) * freezes of the main process when the AOFRW completes to drain the remaining part of the buffer and fsync it. * double disk IO for the data that arrives during AOFRW (had to be written to both the old and new AOF files) The main modifications of this PR: 1. Remove the AOF rewrite buffer and related code. 2. Divide the AOF into multiple files, they are classified as two types, one is the the `BASE` type, it represents the full amount of data (Maybe AOF or RDB format) after each AOFRW, there is only one `BASE` file at most. The second is `INCR` type, may have more than one. They represent the incremental commands since the last AOFRW. 3. Use a AOF manifest file to record and manage these AOF files mentioned above. 4. The original configuration of `appendfilename` will be the base part of the new file name, for example: `appendonly.aof.1.base.rdb` and `appendonly.aof.2.incr.aof` 5. Add manifest-related TCL tests, and modified some existing tests that depend on the `appendfilename` 6. Remove the `aof_rewrite_buffer_length` field in info. 7. Add `aof-disable-auto-gc` configuration. By default we're automatically deleting HISTORY type AOFs. It also gives users the opportunity to preserve the history AOFs. just for testing use now. 8. Add AOFRW limiting measure. When the AOFRW failures reaches the threshold (3 times now), we will delay the execution of the next AOFRW by 1 minute. If the next AOFRW also fails, it will be delayed by 2 minutes. The next is 4, 8, 16, the maximum delay is 60 minutes (1 hour). During the limit period, we can still use the 'bgrewriteaof' command to execute AOFRW immediately. 9. Support upgrade (load) data from old version redis. 10. Add `appenddirname` configuration, as the directory name of the append only files. All AOF files and manifest file will be placed in this directory. 11. Only the last AOF file (BASE or INCR) can be truncated. Otherwise redis will exit even if `aof-load-truncated` is enabled. Co-authored-by: Oran Agra <oran@redislabs.com>
2022-01-04 01:14:13 +08:00
if (!(loops++ % 1024)) {
off_t progress_delta = ftello(fp) - last_progress_report_size;
loadingIncrProgress(progress_delta);
last_progress_report_size += progress_delta;
processEventsWhileBlocked();
processModuleLoadingProgressEvent(1);
}
if (fgets(buf,sizeof(buf),fp) == NULL) {
Implement Multi Part AOF mechanism to avoid AOFRW overheads. (#9788) Implement Multi-Part AOF mechanism to avoid overheads during AOFRW. Introducing a folder with multiple AOF files tracked by a manifest file. The main issues with the the original AOFRW mechanism are: * buffering of commands that are processed during rewrite (consuming a lot of RAM) * freezes of the main process when the AOFRW completes to drain the remaining part of the buffer and fsync it. * double disk IO for the data that arrives during AOFRW (had to be written to both the old and new AOF files) The main modifications of this PR: 1. Remove the AOF rewrite buffer and related code. 2. Divide the AOF into multiple files, they are classified as two types, one is the the `BASE` type, it represents the full amount of data (Maybe AOF or RDB format) after each AOFRW, there is only one `BASE` file at most. The second is `INCR` type, may have more than one. They represent the incremental commands since the last AOFRW. 3. Use a AOF manifest file to record and manage these AOF files mentioned above. 4. The original configuration of `appendfilename` will be the base part of the new file name, for example: `appendonly.aof.1.base.rdb` and `appendonly.aof.2.incr.aof` 5. Add manifest-related TCL tests, and modified some existing tests that depend on the `appendfilename` 6. Remove the `aof_rewrite_buffer_length` field in info. 7. Add `aof-disable-auto-gc` configuration. By default we're automatically deleting HISTORY type AOFs. It also gives users the opportunity to preserve the history AOFs. just for testing use now. 8. Add AOFRW limiting measure. When the AOFRW failures reaches the threshold (3 times now), we will delay the execution of the next AOFRW by 1 minute. If the next AOFRW also fails, it will be delayed by 2 minutes. The next is 4, 8, 16, the maximum delay is 60 minutes (1 hour). During the limit period, we can still use the 'bgrewriteaof' command to execute AOFRW immediately. 9. Support upgrade (load) data from old version redis. 10. Add `appenddirname` configuration, as the directory name of the append only files. All AOF files and manifest file will be placed in this directory. 11. Only the last AOF file (BASE or INCR) can be truncated. Otherwise redis will exit even if `aof-load-truncated` is enabled. Co-authored-by: Oran Agra <oran@redislabs.com>
2022-01-04 01:14:13 +08:00
if (feof(fp)) {
break;
Implement Multi Part AOF mechanism to avoid AOFRW overheads. (#9788) Implement Multi-Part AOF mechanism to avoid overheads during AOFRW. Introducing a folder with multiple AOF files tracked by a manifest file. The main issues with the the original AOFRW mechanism are: * buffering of commands that are processed during rewrite (consuming a lot of RAM) * freezes of the main process when the AOFRW completes to drain the remaining part of the buffer and fsync it. * double disk IO for the data that arrives during AOFRW (had to be written to both the old and new AOF files) The main modifications of this PR: 1. Remove the AOF rewrite buffer and related code. 2. Divide the AOF into multiple files, they are classified as two types, one is the the `BASE` type, it represents the full amount of data (Maybe AOF or RDB format) after each AOFRW, there is only one `BASE` file at most. The second is `INCR` type, may have more than one. They represent the incremental commands since the last AOFRW. 3. Use a AOF manifest file to record and manage these AOF files mentioned above. 4. The original configuration of `appendfilename` will be the base part of the new file name, for example: `appendonly.aof.1.base.rdb` and `appendonly.aof.2.incr.aof` 5. Add manifest-related TCL tests, and modified some existing tests that depend on the `appendfilename` 6. Remove the `aof_rewrite_buffer_length` field in info. 7. Add `aof-disable-auto-gc` configuration. By default we're automatically deleting HISTORY type AOFs. It also gives users the opportunity to preserve the history AOFs. just for testing use now. 8. Add AOFRW limiting measure. When the AOFRW failures reaches the threshold (3 times now), we will delay the execution of the next AOFRW by 1 minute. If the next AOFRW also fails, it will be delayed by 2 minutes. The next is 4, 8, 16, the maximum delay is 60 minutes (1 hour). During the limit period, we can still use the 'bgrewriteaof' command to execute AOFRW immediately. 9. Support upgrade (load) data from old version redis. 10. Add `appenddirname` configuration, as the directory name of the append only files. All AOF files and manifest file will be placed in this directory. 11. Only the last AOF file (BASE or INCR) can be truncated. Otherwise redis will exit even if `aof-load-truncated` is enabled. Co-authored-by: Oran Agra <oran@redislabs.com>
2022-01-04 01:14:13 +08:00
} else {
goto readerr;
Implement Multi Part AOF mechanism to avoid AOFRW overheads. (#9788) Implement Multi-Part AOF mechanism to avoid overheads during AOFRW. Introducing a folder with multiple AOF files tracked by a manifest file. The main issues with the the original AOFRW mechanism are: * buffering of commands that are processed during rewrite (consuming a lot of RAM) * freezes of the main process when the AOFRW completes to drain the remaining part of the buffer and fsync it. * double disk IO for the data that arrives during AOFRW (had to be written to both the old and new AOF files) The main modifications of this PR: 1. Remove the AOF rewrite buffer and related code. 2. Divide the AOF into multiple files, they are classified as two types, one is the the `BASE` type, it represents the full amount of data (Maybe AOF or RDB format) after each AOFRW, there is only one `BASE` file at most. The second is `INCR` type, may have more than one. They represent the incremental commands since the last AOFRW. 3. Use a AOF manifest file to record and manage these AOF files mentioned above. 4. The original configuration of `appendfilename` will be the base part of the new file name, for example: `appendonly.aof.1.base.rdb` and `appendonly.aof.2.incr.aof` 5. Add manifest-related TCL tests, and modified some existing tests that depend on the `appendfilename` 6. Remove the `aof_rewrite_buffer_length` field in info. 7. Add `aof-disable-auto-gc` configuration. By default we're automatically deleting HISTORY type AOFs. It also gives users the opportunity to preserve the history AOFs. just for testing use now. 8. Add AOFRW limiting measure. When the AOFRW failures reaches the threshold (3 times now), we will delay the execution of the next AOFRW by 1 minute. If the next AOFRW also fails, it will be delayed by 2 minutes. The next is 4, 8, 16, the maximum delay is 60 minutes (1 hour). During the limit period, we can still use the 'bgrewriteaof' command to execute AOFRW immediately. 9. Support upgrade (load) data from old version redis. 10. Add `appenddirname` configuration, as the directory name of the append only files. All AOF files and manifest file will be placed in this directory. 11. Only the last AOF file (BASE or INCR) can be truncated. Otherwise redis will exit even if `aof-load-truncated` is enabled. Co-authored-by: Oran Agra <oran@redislabs.com>
2022-01-04 01:14:13 +08:00
}
}
if (buf[0] == '#') continue; /* Skip annotations */
if (buf[0] != '*') goto fmterr;
2014-09-05 11:48:35 +02:00
if (buf[1] == '\0') goto readerr;
argc = atoi(buf+1);
if (argc < 1) goto fmterr;
if ((size_t)argc > SIZE_MAX / sizeof(robj*)) goto fmterr;
/* Load the next command in the AOF as our fake client
* argv. */
argv = zmalloc(sizeof(robj*)*argc);
fakeClient->argc = argc;
fakeClient->argv = argv;
fakeClient->argv_len = argc;
for (j = 0; j < argc; j++) {
/* Parse the argument len. */
char *readres = fgets(buf,sizeof(buf),fp);
if (readres == NULL || buf[0] != '$') {
fakeClient->argc = j; /* Free up to j-1. */
freeClientArgv(fakeClient);
if (readres == NULL)
goto readerr;
else
goto fmterr;
}
len = strtol(buf+1,NULL,10);
/* Read it into a string object. */
argsds = sdsnewlen(SDS_NOINIT,len);
if (len && fread(argsds,len,1,fp) == 0) {
sdsfree(argsds);
fakeClient->argc = j; /* Free up to j-1. */
freeClientArgv(fakeClient);
goto readerr;
}
argv[j] = createObject(OBJ_STRING,argsds);
/* Discard CRLF. */
if (fread(buf,2,1,fp) == 0) {
fakeClient->argc = j+1; /* Free up to j. */
freeClientArgv(fakeClient);
goto readerr;
}
}
/* Command lookup */
Treat subcommands as commands (#9504) ## Intro The purpose is to allow having different flags/ACL categories for subcommands (Example: CONFIG GET is ok-loading but CONFIG SET isn't) We create a small command table for every command that has subcommands and each subcommand has its own flags, etc. (same as a "regular" command) This commit also unites the Redis and the Sentinel command tables ## Affected commands CONFIG Used to have "admin ok-loading ok-stale no-script" Changes: 1. Dropped "ok-loading" in all except GET (this doesn't change behavior since there were checks in the code doing that) XINFO Used to have "read-only random" Changes: 1. Dropped "random" in all except CONSUMERS XGROUP Used to have "write use-memory" Changes: 1. Dropped "use-memory" in all except CREATE and CREATECONSUMER COMMAND No changes. MEMORY Used to have "random read-only" Changes: 1. Dropped "random" in PURGE and USAGE ACL Used to have "admin no-script ok-loading ok-stale" Changes: 1. Dropped "admin" in WHOAMI, GENPASS, and CAT LATENCY No changes. MODULE No changes. SLOWLOG Used to have "admin random ok-loading ok-stale" Changes: 1. Dropped "random" in RESET OBJECT Used to have "read-only random" Changes: 1. Dropped "random" in ENCODING and REFCOUNT SCRIPT Used to have "may-replicate no-script" Changes: 1. Dropped "may-replicate" in all except FLUSH and LOAD CLIENT Used to have "admin no-script random ok-loading ok-stale" Changes: 1. Dropped "random" in all except INFO and LIST 2. Dropped "admin" in ID, TRACKING, CACHING, GETREDIR, INFO, SETNAME, GETNAME, and REPLY STRALGO No changes. PUBSUB No changes. CLUSTER Changes: 1. Dropped "admin in countkeysinslots, getkeysinslot, info, nodes, keyslot, myid, and slots SENTINEL No changes. (note that DEBUG also fits, but we decided not to convert it since it's for debugging and anyway undocumented) ## New sub-command This commit adds another element to the per-command output of COMMAND, describing the list of subcommands, if any (in the same structure as "regular" commands) Also, it adds a new subcommand: ``` COMMAND LIST [FILTERBY (MODULE <module-name>|ACLCAT <cat>|PATTERN <pattern>)] ``` which returns a set of all commands (unless filters), but excluding subcommands. ## Module API A new module API, RM_CreateSubcommand, was added, in order to allow module writer to define subcommands ## ACL changes: 1. Now, that each subcommand is actually a command, each has its own ACL id. 2. The old mechanism of allowed_subcommands is redundant (blocking/allowing a subcommand is the same as blocking/allowing a regular command), but we had to keep it, to support the widespread usage of allowed_subcommands to block commands with certain args, that aren't subcommands (e.g. "-select +select|0"). 3. I have renamed allowed_subcommands to allowed_firstargs to emphasize the difference. 4. Because subcommands are commands in ACL too, you can now use "-" to block subcommands (e.g. "+client -client|kill"), which wasn't possible in the past. 5. It is also possible to use the allowed_firstargs mechanism with subcommand. For example: `+config -config|set +config|set|loglevel` will block all CONFIG SET except for setting the log level. 6. All of the ACL changes above required some amount of refactoring. ## Misc 1. There are two approaches: Either each subcommand has its own function or all subcommands use the same function, determining what to do according to argv[0]. For now, I took the former approaches only with CONFIG and COMMAND, while other commands use the latter approach (for smaller blamelog diff). 2. Deleted memoryGetKeys: It is no longer needed because MEMORY USAGE now uses the "range" key spec. 4. Bugfix: GETNAME was missing from CLIENT's help message. 5. Sentinel and Redis now use the same table, with the same function pointer. Some commands have a different implementation in Sentinel, so we redirect them (these are ROLE, PUBLISH, and INFO). 6. Command stats now show the stats per subcommand (e.g. instead of stats just for "config" you will have stats for "config|set", "config|get", etc.) 7. It is now possible to use COMMAND directly on subcommands: COMMAND INFO CONFIG|GET (The pipeline syntax was inspired from ACL, and can be used in functions lookupCommandBySds and lookupCommandByCString) 8. STRALGO is now a container command (has "help") ## Breaking changes: 1. Command stats now show the stats per subcommand (see (5) above)
2021-10-20 10:52:57 +02:00
cmd = lookupCommand(argv,argc);
if (!cmd) {
serverLog(LL_WARNING,
Implement Multi Part AOF mechanism to avoid AOFRW overheads. (#9788) Implement Multi-Part AOF mechanism to avoid overheads during AOFRW. Introducing a folder with multiple AOF files tracked by a manifest file. The main issues with the the original AOFRW mechanism are: * buffering of commands that are processed during rewrite (consuming a lot of RAM) * freezes of the main process when the AOFRW completes to drain the remaining part of the buffer and fsync it. * double disk IO for the data that arrives during AOFRW (had to be written to both the old and new AOF files) The main modifications of this PR: 1. Remove the AOF rewrite buffer and related code. 2. Divide the AOF into multiple files, they are classified as two types, one is the the `BASE` type, it represents the full amount of data (Maybe AOF or RDB format) after each AOFRW, there is only one `BASE` file at most. The second is `INCR` type, may have more than one. They represent the incremental commands since the last AOFRW. 3. Use a AOF manifest file to record and manage these AOF files mentioned above. 4. The original configuration of `appendfilename` will be the base part of the new file name, for example: `appendonly.aof.1.base.rdb` and `appendonly.aof.2.incr.aof` 5. Add manifest-related TCL tests, and modified some existing tests that depend on the `appendfilename` 6. Remove the `aof_rewrite_buffer_length` field in info. 7. Add `aof-disable-auto-gc` configuration. By default we're automatically deleting HISTORY type AOFs. It also gives users the opportunity to preserve the history AOFs. just for testing use now. 8. Add AOFRW limiting measure. When the AOFRW failures reaches the threshold (3 times now), we will delay the execution of the next AOFRW by 1 minute. If the next AOFRW also fails, it will be delayed by 2 minutes. The next is 4, 8, 16, the maximum delay is 60 minutes (1 hour). During the limit period, we can still use the 'bgrewriteaof' command to execute AOFRW immediately. 9. Support upgrade (load) data from old version redis. 10. Add `appenddirname` configuration, as the directory name of the append only files. All AOF files and manifest file will be placed in this directory. 11. Only the last AOF file (BASE or INCR) can be truncated. Otherwise redis will exit even if `aof-load-truncated` is enabled. Co-authored-by: Oran Agra <oran@redislabs.com>
2022-01-04 01:14:13 +08:00
"Unknown command '%s' reading the append only file %s",
(char*)argv[0]->ptr, filename);
freeClientArgv(fakeClient);
ret = AOF_FAILED;
goto cleanup;
}
if (cmd->proc == multiCommand) valid_before_multi = valid_up_to;
/* Run the command in the context of a fake client */
fakeClient->cmd = fakeClient->lastcmd = cmd;
if (fakeClient->flags & CLIENT_MULTI &&
fakeClient->cmd->proc != execCommand)
{
queueMultiCommand(fakeClient);
} else {
cmd->proc(fakeClient);
}
/* The fake client should not have a reply */
serverAssert(fakeClient->bufpos == 0 &&
listLength(fakeClient->reply) == 0);
/* The fake client should never get blocked */
2015-07-27 09:41:48 +02:00
serverAssert((fakeClient->flags & CLIENT_BLOCKED) == 0);
/* Clean up. Command code may have changed argv/argc so we use the
* argv/argc of the client instead of the local variables. */
freeClientArgv(fakeClient);
if (server.aof_load_truncated) valid_up_to = ftello(fp);
if (server.key_load_delay)
debugDelay(server.key_load_delay);
}
/* This point can only be reached when EOF is reached without errors.
* If the client is in the middle of a MULTI/EXEC, handle it as it was
* a short read, even if technically the protocol is correct: we want
* to remove the unprocessed tail and continue. */
if (fakeClient->flags & CLIENT_MULTI) {
serverLog(LL_WARNING,
Implement Multi Part AOF mechanism to avoid AOFRW overheads. (#9788) Implement Multi-Part AOF mechanism to avoid overheads during AOFRW. Introducing a folder with multiple AOF files tracked by a manifest file. The main issues with the the original AOFRW mechanism are: * buffering of commands that are processed during rewrite (consuming a lot of RAM) * freezes of the main process when the AOFRW completes to drain the remaining part of the buffer and fsync it. * double disk IO for the data that arrives during AOFRW (had to be written to both the old and new AOF files) The main modifications of this PR: 1. Remove the AOF rewrite buffer and related code. 2. Divide the AOF into multiple files, they are classified as two types, one is the the `BASE` type, it represents the full amount of data (Maybe AOF or RDB format) after each AOFRW, there is only one `BASE` file at most. The second is `INCR` type, may have more than one. They represent the incremental commands since the last AOFRW. 3. Use a AOF manifest file to record and manage these AOF files mentioned above. 4. The original configuration of `appendfilename` will be the base part of the new file name, for example: `appendonly.aof.1.base.rdb` and `appendonly.aof.2.incr.aof` 5. Add manifest-related TCL tests, and modified some existing tests that depend on the `appendfilename` 6. Remove the `aof_rewrite_buffer_length` field in info. 7. Add `aof-disable-auto-gc` configuration. By default we're automatically deleting HISTORY type AOFs. It also gives users the opportunity to preserve the history AOFs. just for testing use now. 8. Add AOFRW limiting measure. When the AOFRW failures reaches the threshold (3 times now), we will delay the execution of the next AOFRW by 1 minute. If the next AOFRW also fails, it will be delayed by 2 minutes. The next is 4, 8, 16, the maximum delay is 60 minutes (1 hour). During the limit period, we can still use the 'bgrewriteaof' command to execute AOFRW immediately. 9. Support upgrade (load) data from old version redis. 10. Add `appenddirname` configuration, as the directory name of the append only files. All AOF files and manifest file will be placed in this directory. 11. Only the last AOF file (BASE or INCR) can be truncated. Otherwise redis will exit even if `aof-load-truncated` is enabled. Co-authored-by: Oran Agra <oran@redislabs.com>
2022-01-04 01:14:13 +08:00
"Revert incomplete MULTI/EXEC transaction in AOF file %s", filename);
valid_up_to = valid_before_multi;
goto uxeof;
}
loaded_ok: /* DB loaded, cleanup and return C_OK to the caller. */
Implement Multi Part AOF mechanism to avoid AOFRW overheads. (#9788) Implement Multi-Part AOF mechanism to avoid overheads during AOFRW. Introducing a folder with multiple AOF files tracked by a manifest file. The main issues with the the original AOFRW mechanism are: * buffering of commands that are processed during rewrite (consuming a lot of RAM) * freezes of the main process when the AOFRW completes to drain the remaining part of the buffer and fsync it. * double disk IO for the data that arrives during AOFRW (had to be written to both the old and new AOF files) The main modifications of this PR: 1. Remove the AOF rewrite buffer and related code. 2. Divide the AOF into multiple files, they are classified as two types, one is the the `BASE` type, it represents the full amount of data (Maybe AOF or RDB format) after each AOFRW, there is only one `BASE` file at most. The second is `INCR` type, may have more than one. They represent the incremental commands since the last AOFRW. 3. Use a AOF manifest file to record and manage these AOF files mentioned above. 4. The original configuration of `appendfilename` will be the base part of the new file name, for example: `appendonly.aof.1.base.rdb` and `appendonly.aof.2.incr.aof` 5. Add manifest-related TCL tests, and modified some existing tests that depend on the `appendfilename` 6. Remove the `aof_rewrite_buffer_length` field in info. 7. Add `aof-disable-auto-gc` configuration. By default we're automatically deleting HISTORY type AOFs. It also gives users the opportunity to preserve the history AOFs. just for testing use now. 8. Add AOFRW limiting measure. When the AOFRW failures reaches the threshold (3 times now), we will delay the execution of the next AOFRW by 1 minute. If the next AOFRW also fails, it will be delayed by 2 minutes. The next is 4, 8, 16, the maximum delay is 60 minutes (1 hour). During the limit period, we can still use the 'bgrewriteaof' command to execute AOFRW immediately. 9. Support upgrade (load) data from old version redis. 10. Add `appenddirname` configuration, as the directory name of the append only files. All AOF files and manifest file will be placed in this directory. 11. Only the last AOF file (BASE or INCR) can be truncated. Otherwise redis will exit even if `aof-load-truncated` is enabled. Co-authored-by: Oran Agra <oran@redislabs.com>
2022-01-04 01:14:13 +08:00
loadingIncrProgress(ftello(fp) - last_progress_report_size);
server.aof_state = old_aof_state;
goto cleanup;
readerr: /* Read error. If feof(fp) is true, fall through to unexpected EOF. */
if (!feof(fp)) {
Implement Multi Part AOF mechanism to avoid AOFRW overheads. (#9788) Implement Multi-Part AOF mechanism to avoid overheads during AOFRW. Introducing a folder with multiple AOF files tracked by a manifest file. The main issues with the the original AOFRW mechanism are: * buffering of commands that are processed during rewrite (consuming a lot of RAM) * freezes of the main process when the AOFRW completes to drain the remaining part of the buffer and fsync it. * double disk IO for the data that arrives during AOFRW (had to be written to both the old and new AOF files) The main modifications of this PR: 1. Remove the AOF rewrite buffer and related code. 2. Divide the AOF into multiple files, they are classified as two types, one is the the `BASE` type, it represents the full amount of data (Maybe AOF or RDB format) after each AOFRW, there is only one `BASE` file at most. The second is `INCR` type, may have more than one. They represent the incremental commands since the last AOFRW. 3. Use a AOF manifest file to record and manage these AOF files mentioned above. 4. The original configuration of `appendfilename` will be the base part of the new file name, for example: `appendonly.aof.1.base.rdb` and `appendonly.aof.2.incr.aof` 5. Add manifest-related TCL tests, and modified some existing tests that depend on the `appendfilename` 6. Remove the `aof_rewrite_buffer_length` field in info. 7. Add `aof-disable-auto-gc` configuration. By default we're automatically deleting HISTORY type AOFs. It also gives users the opportunity to preserve the history AOFs. just for testing use now. 8. Add AOFRW limiting measure. When the AOFRW failures reaches the threshold (3 times now), we will delay the execution of the next AOFRW by 1 minute. If the next AOFRW also fails, it will be delayed by 2 minutes. The next is 4, 8, 16, the maximum delay is 60 minutes (1 hour). During the limit period, we can still use the 'bgrewriteaof' command to execute AOFRW immediately. 9. Support upgrade (load) data from old version redis. 10. Add `appenddirname` configuration, as the directory name of the append only files. All AOF files and manifest file will be placed in this directory. 11. Only the last AOF file (BASE or INCR) can be truncated. Otherwise redis will exit even if `aof-load-truncated` is enabled. Co-authored-by: Oran Agra <oran@redislabs.com>
2022-01-04 01:14:13 +08:00
serverLog(LL_WARNING,"Unrecoverable error reading the append only file %s: %s", filename, strerror(errno));
ret = AOF_FAILED;
goto cleanup;
}
uxeof: /* Unexpected AOF end of file. */
2014-09-05 11:48:35 +02:00
if (server.aof_load_truncated) {
Implement Multi Part AOF mechanism to avoid AOFRW overheads. (#9788) Implement Multi-Part AOF mechanism to avoid overheads during AOFRW. Introducing a folder with multiple AOF files tracked by a manifest file. The main issues with the the original AOFRW mechanism are: * buffering of commands that are processed during rewrite (consuming a lot of RAM) * freezes of the main process when the AOFRW completes to drain the remaining part of the buffer and fsync it. * double disk IO for the data that arrives during AOFRW (had to be written to both the old and new AOF files) The main modifications of this PR: 1. Remove the AOF rewrite buffer and related code. 2. Divide the AOF into multiple files, they are classified as two types, one is the the `BASE` type, it represents the full amount of data (Maybe AOF or RDB format) after each AOFRW, there is only one `BASE` file at most. The second is `INCR` type, may have more than one. They represent the incremental commands since the last AOFRW. 3. Use a AOF manifest file to record and manage these AOF files mentioned above. 4. The original configuration of `appendfilename` will be the base part of the new file name, for example: `appendonly.aof.1.base.rdb` and `appendonly.aof.2.incr.aof` 5. Add manifest-related TCL tests, and modified some existing tests that depend on the `appendfilename` 6. Remove the `aof_rewrite_buffer_length` field in info. 7. Add `aof-disable-auto-gc` configuration. By default we're automatically deleting HISTORY type AOFs. It also gives users the opportunity to preserve the history AOFs. just for testing use now. 8. Add AOFRW limiting measure. When the AOFRW failures reaches the threshold (3 times now), we will delay the execution of the next AOFRW by 1 minute. If the next AOFRW also fails, it will be delayed by 2 minutes. The next is 4, 8, 16, the maximum delay is 60 minutes (1 hour). During the limit period, we can still use the 'bgrewriteaof' command to execute AOFRW immediately. 9. Support upgrade (load) data from old version redis. 10. Add `appenddirname` configuration, as the directory name of the append only files. All AOF files and manifest file will be placed in this directory. 11. Only the last AOF file (BASE or INCR) can be truncated. Otherwise redis will exit even if `aof-load-truncated` is enabled. Co-authored-by: Oran Agra <oran@redislabs.com>
2022-01-04 01:14:13 +08:00
serverLog(LL_WARNING,"!!! Warning: short read while loading the AOF file %s!!!", filename);
serverLog(LL_WARNING,"!!! Truncating the AOF %s at offset %llu !!!",
filename, (unsigned long long) valid_up_to);
if (valid_up_to == -1 || truncate(aof_filepath,valid_up_to) == -1) {
if (valid_up_to == -1) {
2015-07-27 09:41:48 +02:00
serverLog(LL_WARNING,"Last valid command offset is invalid");
} else {
Implement Multi Part AOF mechanism to avoid AOFRW overheads. (#9788) Implement Multi-Part AOF mechanism to avoid overheads during AOFRW. Introducing a folder with multiple AOF files tracked by a manifest file. The main issues with the the original AOFRW mechanism are: * buffering of commands that are processed during rewrite (consuming a lot of RAM) * freezes of the main process when the AOFRW completes to drain the remaining part of the buffer and fsync it. * double disk IO for the data that arrives during AOFRW (had to be written to both the old and new AOF files) The main modifications of this PR: 1. Remove the AOF rewrite buffer and related code. 2. Divide the AOF into multiple files, they are classified as two types, one is the the `BASE` type, it represents the full amount of data (Maybe AOF or RDB format) after each AOFRW, there is only one `BASE` file at most. The second is `INCR` type, may have more than one. They represent the incremental commands since the last AOFRW. 3. Use a AOF manifest file to record and manage these AOF files mentioned above. 4. The original configuration of `appendfilename` will be the base part of the new file name, for example: `appendonly.aof.1.base.rdb` and `appendonly.aof.2.incr.aof` 5. Add manifest-related TCL tests, and modified some existing tests that depend on the `appendfilename` 6. Remove the `aof_rewrite_buffer_length` field in info. 7. Add `aof-disable-auto-gc` configuration. By default we're automatically deleting HISTORY type AOFs. It also gives users the opportunity to preserve the history AOFs. just for testing use now. 8. Add AOFRW limiting measure. When the AOFRW failures reaches the threshold (3 times now), we will delay the execution of the next AOFRW by 1 minute. If the next AOFRW also fails, it will be delayed by 2 minutes. The next is 4, 8, 16, the maximum delay is 60 minutes (1 hour). During the limit period, we can still use the 'bgrewriteaof' command to execute AOFRW immediately. 9. Support upgrade (load) data from old version redis. 10. Add `appenddirname` configuration, as the directory name of the append only files. All AOF files and manifest file will be placed in this directory. 11. Only the last AOF file (BASE or INCR) can be truncated. Otherwise redis will exit even if `aof-load-truncated` is enabled. Co-authored-by: Oran Agra <oran@redislabs.com>
2022-01-04 01:14:13 +08:00
serverLog(LL_WARNING,"Error truncating the AOF file %s: %s",
filename, strerror(errno));
}
} else {
/* Make sure the AOF file descriptor points to the end of the
* file after the truncate call. */
if (server.aof_fd != -1 && lseek(server.aof_fd,0,SEEK_END) == -1) {
Implement Multi Part AOF mechanism to avoid AOFRW overheads. (#9788) Implement Multi-Part AOF mechanism to avoid overheads during AOFRW. Introducing a folder with multiple AOF files tracked by a manifest file. The main issues with the the original AOFRW mechanism are: * buffering of commands that are processed during rewrite (consuming a lot of RAM) * freezes of the main process when the AOFRW completes to drain the remaining part of the buffer and fsync it. * double disk IO for the data that arrives during AOFRW (had to be written to both the old and new AOF files) The main modifications of this PR: 1. Remove the AOF rewrite buffer and related code. 2. Divide the AOF into multiple files, they are classified as two types, one is the the `BASE` type, it represents the full amount of data (Maybe AOF or RDB format) after each AOFRW, there is only one `BASE` file at most. The second is `INCR` type, may have more than one. They represent the incremental commands since the last AOFRW. 3. Use a AOF manifest file to record and manage these AOF files mentioned above. 4. The original configuration of `appendfilename` will be the base part of the new file name, for example: `appendonly.aof.1.base.rdb` and `appendonly.aof.2.incr.aof` 5. Add manifest-related TCL tests, and modified some existing tests that depend on the `appendfilename` 6. Remove the `aof_rewrite_buffer_length` field in info. 7. Add `aof-disable-auto-gc` configuration. By default we're automatically deleting HISTORY type AOFs. It also gives users the opportunity to preserve the history AOFs. just for testing use now. 8. Add AOFRW limiting measure. When the AOFRW failures reaches the threshold (3 times now), we will delay the execution of the next AOFRW by 1 minute. If the next AOFRW also fails, it will be delayed by 2 minutes. The next is 4, 8, 16, the maximum delay is 60 minutes (1 hour). During the limit period, we can still use the 'bgrewriteaof' command to execute AOFRW immediately. 9. Support upgrade (load) data from old version redis. 10. Add `appenddirname` configuration, as the directory name of the append only files. All AOF files and manifest file will be placed in this directory. 11. Only the last AOF file (BASE or INCR) can be truncated. Otherwise redis will exit even if `aof-load-truncated` is enabled. Co-authored-by: Oran Agra <oran@redislabs.com>
2022-01-04 01:14:13 +08:00
serverLog(LL_WARNING,"Can't seek the end of the AOF file %s: %s",
filename, strerror(errno));
} else {
2015-07-27 09:41:48 +02:00
serverLog(LL_WARNING,
Implement Multi Part AOF mechanism to avoid AOFRW overheads. (#9788) Implement Multi-Part AOF mechanism to avoid overheads during AOFRW. Introducing a folder with multiple AOF files tracked by a manifest file. The main issues with the the original AOFRW mechanism are: * buffering of commands that are processed during rewrite (consuming a lot of RAM) * freezes of the main process when the AOFRW completes to drain the remaining part of the buffer and fsync it. * double disk IO for the data that arrives during AOFRW (had to be written to both the old and new AOF files) The main modifications of this PR: 1. Remove the AOF rewrite buffer and related code. 2. Divide the AOF into multiple files, they are classified as two types, one is the the `BASE` type, it represents the full amount of data (Maybe AOF or RDB format) after each AOFRW, there is only one `BASE` file at most. The second is `INCR` type, may have more than one. They represent the incremental commands since the last AOFRW. 3. Use a AOF manifest file to record and manage these AOF files mentioned above. 4. The original configuration of `appendfilename` will be the base part of the new file name, for example: `appendonly.aof.1.base.rdb` and `appendonly.aof.2.incr.aof` 5. Add manifest-related TCL tests, and modified some existing tests that depend on the `appendfilename` 6. Remove the `aof_rewrite_buffer_length` field in info. 7. Add `aof-disable-auto-gc` configuration. By default we're automatically deleting HISTORY type AOFs. It also gives users the opportunity to preserve the history AOFs. just for testing use now. 8. Add AOFRW limiting measure. When the AOFRW failures reaches the threshold (3 times now), we will delay the execution of the next AOFRW by 1 minute. If the next AOFRW also fails, it will be delayed by 2 minutes. The next is 4, 8, 16, the maximum delay is 60 minutes (1 hour). During the limit period, we can still use the 'bgrewriteaof' command to execute AOFRW immediately. 9. Support upgrade (load) data from old version redis. 10. Add `appenddirname` configuration, as the directory name of the append only files. All AOF files and manifest file will be placed in this directory. 11. Only the last AOF file (BASE or INCR) can be truncated. Otherwise redis will exit even if `aof-load-truncated` is enabled. Co-authored-by: Oran Agra <oran@redislabs.com>
2022-01-04 01:14:13 +08:00
"AOF %s loaded anyway because aof-load-truncated is enabled", filename);
ret = AOF_TRUNCATED;
goto loaded_ok;
}
}
2014-09-05 11:48:35 +02:00
}
serverLog(LL_WARNING, "Unexpected end of file reading the append only file %s. You can: "
"1) Make a backup of your AOF file, then use ./redis-check-aof --fix <filename.manifest>. "
"2) Alternatively you can set the 'aof-load-truncated' configuration option to yes and restart the server.", filename);
ret = AOF_FAILED;
goto cleanup;
fmterr: /* Format error. */
serverLog(LL_WARNING, "Bad file format reading the append only file %s: "
"make a backup of your AOF file, then use ./redis-check-aof --fix <filename.manifest>", filename);
ret = AOF_FAILED;
/* fall through to cleanup. */
cleanup:
if (fakeClient) freeClient(fakeClient);
server.current_client = old_client;
fclose(fp);
Implement Multi Part AOF mechanism to avoid AOFRW overheads. (#9788) Implement Multi-Part AOF mechanism to avoid overheads during AOFRW. Introducing a folder with multiple AOF files tracked by a manifest file. The main issues with the the original AOFRW mechanism are: * buffering of commands that are processed during rewrite (consuming a lot of RAM) * freezes of the main process when the AOFRW completes to drain the remaining part of the buffer and fsync it. * double disk IO for the data that arrives during AOFRW (had to be written to both the old and new AOF files) The main modifications of this PR: 1. Remove the AOF rewrite buffer and related code. 2. Divide the AOF into multiple files, they are classified as two types, one is the the `BASE` type, it represents the full amount of data (Maybe AOF or RDB format) after each AOFRW, there is only one `BASE` file at most. The second is `INCR` type, may have more than one. They represent the incremental commands since the last AOFRW. 3. Use a AOF manifest file to record and manage these AOF files mentioned above. 4. The original configuration of `appendfilename` will be the base part of the new file name, for example: `appendonly.aof.1.base.rdb` and `appendonly.aof.2.incr.aof` 5. Add manifest-related TCL tests, and modified some existing tests that depend on the `appendfilename` 6. Remove the `aof_rewrite_buffer_length` field in info. 7. Add `aof-disable-auto-gc` configuration. By default we're automatically deleting HISTORY type AOFs. It also gives users the opportunity to preserve the history AOFs. just for testing use now. 8. Add AOFRW limiting measure. When the AOFRW failures reaches the threshold (3 times now), we will delay the execution of the next AOFRW by 1 minute. If the next AOFRW also fails, it will be delayed by 2 minutes. The next is 4, 8, 16, the maximum delay is 60 minutes (1 hour). During the limit period, we can still use the 'bgrewriteaof' command to execute AOFRW immediately. 9. Support upgrade (load) data from old version redis. 10. Add `appenddirname` configuration, as the directory name of the append only files. All AOF files and manifest file will be placed in this directory. 11. Only the last AOF file (BASE or INCR) can be truncated. Otherwise redis will exit even if `aof-load-truncated` is enabled. Co-authored-by: Oran Agra <oran@redislabs.com>
2022-01-04 01:14:13 +08:00
sdsfree(aof_filepath);
return ret;
}
/* Load the AOF files according the aofManifest pointed by am. */
int loadAppendOnlyFiles(aofManifest *am) {
serverAssert(am != NULL);
int ret = C_OK;
long long start;
off_t total_size = 0;
sds aof_name;
int total_num, aof_num = 0, last_file;
/* If the 'server.aof_filename' file exists in dir, we may be starting
* from an old redis version. We will use enter upgrade mode in three situations.
*
* 1. If the 'server.aof_dirname' directory not exist
* 2. If the 'server.aof_dirname' directory exists but the manifest file is missing
* 3. If the 'server.aof_dirname' directory exists and the manifest file it contains
* has only one base AOF record, and the file name of this base AOF is 'server.aof_filename',
* and the 'server.aof_filename' file not exist in 'server.aof_dirname' directory
* */
if (fileExist(server.aof_filename)) {
if (!dirExists(server.aof_dirname) ||
(am->base_aof_info == NULL && listLength(am->incr_aof_list) == 0) ||
(am->base_aof_info != NULL && listLength(am->incr_aof_list) == 0 &&
!strcmp(am->base_aof_info->file_name, server.aof_filename) && !aofFileExist(server.aof_filename)))
{
aofUpgradePrepare(am);
}
}
if (am->base_aof_info == NULL && listLength(am->incr_aof_list) == 0) {
return AOF_NOT_EXIST;
}
total_num = getBaseAndIncrAppendOnlyFilesNum(am);
serverAssert(total_num > 0);
/* Here we calculate the total size of all BASE and INCR files in
* advance, it will be set to `server.loading_total_bytes`. */
total_size = getBaseAndIncrAppendOnlyFilesSize(am);
Implement Multi Part AOF mechanism to avoid AOFRW overheads. (#9788) Implement Multi-Part AOF mechanism to avoid overheads during AOFRW. Introducing a folder with multiple AOF files tracked by a manifest file. The main issues with the the original AOFRW mechanism are: * buffering of commands that are processed during rewrite (consuming a lot of RAM) * freezes of the main process when the AOFRW completes to drain the remaining part of the buffer and fsync it. * double disk IO for the data that arrives during AOFRW (had to be written to both the old and new AOF files) The main modifications of this PR: 1. Remove the AOF rewrite buffer and related code. 2. Divide the AOF into multiple files, they are classified as two types, one is the the `BASE` type, it represents the full amount of data (Maybe AOF or RDB format) after each AOFRW, there is only one `BASE` file at most. The second is `INCR` type, may have more than one. They represent the incremental commands since the last AOFRW. 3. Use a AOF manifest file to record and manage these AOF files mentioned above. 4. The original configuration of `appendfilename` will be the base part of the new file name, for example: `appendonly.aof.1.base.rdb` and `appendonly.aof.2.incr.aof` 5. Add manifest-related TCL tests, and modified some existing tests that depend on the `appendfilename` 6. Remove the `aof_rewrite_buffer_length` field in info. 7. Add `aof-disable-auto-gc` configuration. By default we're automatically deleting HISTORY type AOFs. It also gives users the opportunity to preserve the history AOFs. just for testing use now. 8. Add AOFRW limiting measure. When the AOFRW failures reaches the threshold (3 times now), we will delay the execution of the next AOFRW by 1 minute. If the next AOFRW also fails, it will be delayed by 2 minutes. The next is 4, 8, 16, the maximum delay is 60 minutes (1 hour). During the limit period, we can still use the 'bgrewriteaof' command to execute AOFRW immediately. 9. Support upgrade (load) data from old version redis. 10. Add `appenddirname` configuration, as the directory name of the append only files. All AOF files and manifest file will be placed in this directory. 11. Only the last AOF file (BASE or INCR) can be truncated. Otherwise redis will exit even if `aof-load-truncated` is enabled. Co-authored-by: Oran Agra <oran@redislabs.com>
2022-01-04 01:14:13 +08:00
startLoading(total_size, RDBFLAGS_AOF_PREAMBLE, 0);
/* Load BASE AOF if needed. */
if (am->base_aof_info) {
serverAssert(am->base_aof_info->file_type == AOF_FILE_TYPE_BASE);
aof_name = (char*)am->base_aof_info->file_name;
updateLoadingFileName(aof_name);
last_file = ++aof_num == total_num;
start = ustime();
ret = loadSingleAppendOnlyFile(aof_name);
if (ret == AOF_OK || (ret == AOF_TRUNCATED && last_file)) {
serverLog(LL_NOTICE, "DB loaded from base file %s: %.3f seconds",
aof_name, (float)(ustime()-start)/1000000);
}
/* If an AOF exists in the manifest but not on the disk, Or the truncated
* file is not the last file, we consider this to be a fatal error. */
if (ret == AOF_NOT_EXIST || (ret == AOF_TRUNCATED && !last_file)) {
ret = AOF_FAILED;
}
if (ret == AOF_OPEN_ERR || ret == AOF_FAILED) {
goto cleanup;
}
}
/* Load INCR AOFs if needed. */
if (listLength(am->incr_aof_list)) {
listNode *ln;
listIter li;
listRewind(am->incr_aof_list, &li);
while ((ln = listNext(&li)) != NULL) {
aofInfo *ai = (aofInfo*)ln->value;
serverAssert(ai->file_type == AOF_FILE_TYPE_INCR);
aof_name = (char*)ai->file_name;
updateLoadingFileName(aof_name);
last_file = ++aof_num == total_num;
start = ustime();
ret = loadSingleAppendOnlyFile(aof_name);
if (ret == AOF_OK || (ret == AOF_TRUNCATED && last_file)) {
serverLog(LL_NOTICE, "DB loaded from incr file %s: %.3f seconds",
aof_name, (float)(ustime()-start)/1000000);
}
if (ret == AOF_NOT_EXIST || (ret == AOF_TRUNCATED && !last_file)) {
ret = AOF_FAILED;
}
if (ret == AOF_OPEN_ERR || ret == AOF_FAILED) {
goto cleanup;
}
}
}
server.aof_current_size = total_size;
server.aof_rewrite_base_size = server.aof_current_size;
server.aof_fsync_offset = server.aof_current_size;
cleanup:
stopLoading(ret == AOF_OK);
return ret;
}
/* ----------------------------------------------------------------------------
* AOF rewrite
* ------------------------------------------------------------------------- */
2011-05-14 12:36:22 +02:00
/* Delegate writing an object to writing a bulk string or bulk long long.
2016-04-25 16:49:57 +03:00
* This is not placed in rio.c since that adds the server.h dependency. */
2011-05-14 12:36:22 +02:00
int rioWriteBulkObject(rio *r, robj *obj) {
/* Avoid using getDecodedObject to help copy-on-write (we are often
* in a child process when this function is called). */
if (obj->encoding == OBJ_ENCODING_INT) {
2011-05-14 12:36:22 +02:00
return rioWriteBulkLongLong(r,(long)obj->ptr);
} else if (sdsEncodedObject(obj)) {
2011-05-14 12:36:22 +02:00
return rioWriteBulkString(r,obj->ptr,sdslen(obj->ptr));
} else {
2015-07-27 09:41:48 +02:00
serverPanic("Unknown string encoding");
2011-05-14 12:36:22 +02:00
}
}
/* Emit the commands needed to rebuild a list object.
* The function returns 0 on error, 1 on success. */
int rewriteListObject(rio *r, robj *key, robj *o) {
long long count = 0, items = listTypeLength(o);
if (o->encoding == OBJ_ENCODING_QUICKLIST) {
quicklist *list = o->ptr;
quicklistIter *li = quicklistGetIterator(list, AL_START_HEAD);
quicklistEntry entry;
while (quicklistNext(li,&entry)) {
if (count == 0) {
2015-07-27 09:41:48 +02:00
int cmd_items = (items > AOF_REWRITE_ITEMS_PER_CMD) ?
AOF_REWRITE_ITEMS_PER_CMD : items;
if (!rioWriteBulkCount(r,'*',2+cmd_items) ||
!rioWriteBulkString(r,"RPUSH",5) ||
!rioWriteBulkObject(r,key))
{
quicklistReleaseIterator(li);
return 0;
}
}
if (entry.value) {
if (!rioWriteBulkString(r,(char*)entry.value,entry.sz)) {
quicklistReleaseIterator(li);
return 0;
}
} else {
if (!rioWriteBulkLongLong(r,entry.longval)) {
quicklistReleaseIterator(li);
return 0;
}
}
2015-07-27 09:41:48 +02:00
if (++count == AOF_REWRITE_ITEMS_PER_CMD) count = 0;
items--;
}
quicklistReleaseIterator(li);
} else {
2015-07-27 09:41:48 +02:00
serverPanic("Unknown list encoding");
}
return 1;
}
/* Emit the commands needed to rebuild a set object.
* The function returns 0 on error, 1 on success. */
int rewriteSetObject(rio *r, robj *key, robj *o) {
long long count = 0, items = setTypeSize(o);
if (o->encoding == OBJ_ENCODING_INTSET) {
int ii = 0;
int64_t llval;
while(intsetGet(o->ptr,ii++,&llval)) {
if (count == 0) {
2015-07-27 09:41:48 +02:00
int cmd_items = (items > AOF_REWRITE_ITEMS_PER_CMD) ?
AOF_REWRITE_ITEMS_PER_CMD : items;
if (!rioWriteBulkCount(r,'*',2+cmd_items) ||
!rioWriteBulkString(r,"SADD",4) ||
!rioWriteBulkObject(r,key))
{
return 0;
}
}
if (!rioWriteBulkLongLong(r,llval)) return 0;
2015-07-27 09:41:48 +02:00
if (++count == AOF_REWRITE_ITEMS_PER_CMD) count = 0;
items--;
}
} else if (o->encoding == OBJ_ENCODING_HT) {
dictIterator *di = dictGetIterator(o->ptr);
dictEntry *de;
while((de = dictNext(di)) != NULL) {
sds ele = dictGetKey(de);
if (count == 0) {
2015-07-27 09:41:48 +02:00
int cmd_items = (items > AOF_REWRITE_ITEMS_PER_CMD) ?
AOF_REWRITE_ITEMS_PER_CMD : items;
if (!rioWriteBulkCount(r,'*',2+cmd_items) ||
!rioWriteBulkString(r,"SADD",4) ||
!rioWriteBulkObject(r,key))
{
dictReleaseIterator(di);
return 0;
}
}
if (!rioWriteBulkString(r,ele,sdslen(ele))) {
dictReleaseIterator(di);
return 0;
}
2015-07-27 09:41:48 +02:00
if (++count == AOF_REWRITE_ITEMS_PER_CMD) count = 0;
items--;
}
dictReleaseIterator(di);
} else {
2015-07-27 09:41:48 +02:00
serverPanic("Unknown set encoding");
}
return 1;
}
/* Emit the commands needed to rebuild a sorted set object.
* The function returns 0 on error, 1 on success. */
int rewriteSortedSetObject(rio *r, robj *key, robj *o) {
long long count = 0, items = zsetLength(o);
Replace all usage of ziplist with listpack for t_zset (#9366) Part two of implementing #8702 (zset), after #8887. ## Description of the feature Replaced all uses of ziplist with listpack in t_zset, and optimized some of the code to optimize performance. ## Rdb format changes New `RDB_TYPE_ZSET_LISTPACK` rdb type. ## Rdb loading improvements: 1) Pre-expansion of dict for validation of duplicate data for listpack and ziplist. 2) Simplifying the release of empty key objects when RDB loading. 3) Unify ziplist and listpack data verify methods for zset and hash, and move code to rdb.c. ## Interface changes 1) New `zset-max-listpack-entries` config is an alias for `zset-max-ziplist-entries` (same with `zset-max-listpack-value`). 2) OBJECT ENCODING will return listpack instead of ziplist. ## Listpack improvements: 1) Add `lpDeleteRange` and `lpDeleteRangeWithEntry` functions to delete a range of entries from listpack. 2) Improve the performance of `lpCompare`, converting from string to integer is faster than converting from integer to string. 3) Replace `snprintf` with `ll2string` to improve performance in converting numbers to strings in `lpGet()`. ## Zset improvements: 1) Improve the performance of `zzlFind` method, use `lpFind` instead of `lpCompare` in a loop. 2) Use `lpDeleteRangeWithEntry` instead of `lpDelete` twice to delete a element of zset. ## Tests 1) Add some unittests for `lpDeleteRange` and `lpDeleteRangeWithEntry` function. 2) Add zset RDB loading test. 3) Add benchmark test for `lpCompare` and `ziplsitCompare`. 4) Add empty listpack zset corrupt dump test.
2021-09-09 23:18:53 +08:00
if (o->encoding == OBJ_ENCODING_LISTPACK) {
unsigned char *zl = o->ptr;
unsigned char *eptr, *sptr;
unsigned char *vstr;
unsigned int vlen;
long long vll;
double score;
Replace all usage of ziplist with listpack for t_zset (#9366) Part two of implementing #8702 (zset), after #8887. ## Description of the feature Replaced all uses of ziplist with listpack in t_zset, and optimized some of the code to optimize performance. ## Rdb format changes New `RDB_TYPE_ZSET_LISTPACK` rdb type. ## Rdb loading improvements: 1) Pre-expansion of dict for validation of duplicate data for listpack and ziplist. 2) Simplifying the release of empty key objects when RDB loading. 3) Unify ziplist and listpack data verify methods for zset and hash, and move code to rdb.c. ## Interface changes 1) New `zset-max-listpack-entries` config is an alias for `zset-max-ziplist-entries` (same with `zset-max-listpack-value`). 2) OBJECT ENCODING will return listpack instead of ziplist. ## Listpack improvements: 1) Add `lpDeleteRange` and `lpDeleteRangeWithEntry` functions to delete a range of entries from listpack. 2) Improve the performance of `lpCompare`, converting from string to integer is faster than converting from integer to string. 3) Replace `snprintf` with `ll2string` to improve performance in converting numbers to strings in `lpGet()`. ## Zset improvements: 1) Improve the performance of `zzlFind` method, use `lpFind` instead of `lpCompare` in a loop. 2) Use `lpDeleteRangeWithEntry` instead of `lpDelete` twice to delete a element of zset. ## Tests 1) Add some unittests for `lpDeleteRange` and `lpDeleteRangeWithEntry` function. 2) Add zset RDB loading test. 3) Add benchmark test for `lpCompare` and `ziplsitCompare`. 4) Add empty listpack zset corrupt dump test.
2021-09-09 23:18:53 +08:00
eptr = lpSeek(zl,0);
2015-07-26 15:29:53 +02:00
serverAssert(eptr != NULL);
Replace all usage of ziplist with listpack for t_zset (#9366) Part two of implementing #8702 (zset), after #8887. ## Description of the feature Replaced all uses of ziplist with listpack in t_zset, and optimized some of the code to optimize performance. ## Rdb format changes New `RDB_TYPE_ZSET_LISTPACK` rdb type. ## Rdb loading improvements: 1) Pre-expansion of dict for validation of duplicate data for listpack and ziplist. 2) Simplifying the release of empty key objects when RDB loading. 3) Unify ziplist and listpack data verify methods for zset and hash, and move code to rdb.c. ## Interface changes 1) New `zset-max-listpack-entries` config is an alias for `zset-max-ziplist-entries` (same with `zset-max-listpack-value`). 2) OBJECT ENCODING will return listpack instead of ziplist. ## Listpack improvements: 1) Add `lpDeleteRange` and `lpDeleteRangeWithEntry` functions to delete a range of entries from listpack. 2) Improve the performance of `lpCompare`, converting from string to integer is faster than converting from integer to string. 3) Replace `snprintf` with `ll2string` to improve performance in converting numbers to strings in `lpGet()`. ## Zset improvements: 1) Improve the performance of `zzlFind` method, use `lpFind` instead of `lpCompare` in a loop. 2) Use `lpDeleteRangeWithEntry` instead of `lpDelete` twice to delete a element of zset. ## Tests 1) Add some unittests for `lpDeleteRange` and `lpDeleteRangeWithEntry` function. 2) Add zset RDB loading test. 3) Add benchmark test for `lpCompare` and `ziplsitCompare`. 4) Add empty listpack zset corrupt dump test.
2021-09-09 23:18:53 +08:00
sptr = lpNext(zl,eptr);
2015-07-26 15:29:53 +02:00
serverAssert(sptr != NULL);
while (eptr != NULL) {
Replace all usage of ziplist with listpack for t_zset (#9366) Part two of implementing #8702 (zset), after #8887. ## Description of the feature Replaced all uses of ziplist with listpack in t_zset, and optimized some of the code to optimize performance. ## Rdb format changes New `RDB_TYPE_ZSET_LISTPACK` rdb type. ## Rdb loading improvements: 1) Pre-expansion of dict for validation of duplicate data for listpack and ziplist. 2) Simplifying the release of empty key objects when RDB loading. 3) Unify ziplist and listpack data verify methods for zset and hash, and move code to rdb.c. ## Interface changes 1) New `zset-max-listpack-entries` config is an alias for `zset-max-ziplist-entries` (same with `zset-max-listpack-value`). 2) OBJECT ENCODING will return listpack instead of ziplist. ## Listpack improvements: 1) Add `lpDeleteRange` and `lpDeleteRangeWithEntry` functions to delete a range of entries from listpack. 2) Improve the performance of `lpCompare`, converting from string to integer is faster than converting from integer to string. 3) Replace `snprintf` with `ll2string` to improve performance in converting numbers to strings in `lpGet()`. ## Zset improvements: 1) Improve the performance of `zzlFind` method, use `lpFind` instead of `lpCompare` in a loop. 2) Use `lpDeleteRangeWithEntry` instead of `lpDelete` twice to delete a element of zset. ## Tests 1) Add some unittests for `lpDeleteRange` and `lpDeleteRangeWithEntry` function. 2) Add zset RDB loading test. 3) Add benchmark test for `lpCompare` and `ziplsitCompare`. 4) Add empty listpack zset corrupt dump test.
2021-09-09 23:18:53 +08:00
vstr = lpGetValue(eptr,&vlen,&vll);
score = zzlGetScore(sptr);
if (count == 0) {
2015-07-27 09:41:48 +02:00
int cmd_items = (items > AOF_REWRITE_ITEMS_PER_CMD) ?
AOF_REWRITE_ITEMS_PER_CMD : items;
if (!rioWriteBulkCount(r,'*',2+cmd_items*2) ||
!rioWriteBulkString(r,"ZADD",4) ||
!rioWriteBulkObject(r,key))
{
return 0;
}
}
if (!rioWriteBulkDouble(r,score)) return 0;
if (vstr != NULL) {
if (!rioWriteBulkString(r,(char*)vstr,vlen)) return 0;
} else {
if (!rioWriteBulkLongLong(r,vll)) return 0;
}
zzlNext(zl,&eptr,&sptr);
2015-07-27 09:41:48 +02:00
if (++count == AOF_REWRITE_ITEMS_PER_CMD) count = 0;
items--;
}
} else if (o->encoding == OBJ_ENCODING_SKIPLIST) {
zset *zs = o->ptr;
dictIterator *di = dictGetIterator(zs->dict);
dictEntry *de;
while((de = dictNext(di)) != NULL) {
sds ele = dictGetKey(de);
double *score = dictGetVal(de);
if (count == 0) {
2015-07-27 09:41:48 +02:00
int cmd_items = (items > AOF_REWRITE_ITEMS_PER_CMD) ?
AOF_REWRITE_ITEMS_PER_CMD : items;
if (!rioWriteBulkCount(r,'*',2+cmd_items*2) ||
!rioWriteBulkString(r,"ZADD",4) ||
!rioWriteBulkObject(r,key))
{
dictReleaseIterator(di);
return 0;
}
}
if (!rioWriteBulkDouble(r,*score) ||
!rioWriteBulkString(r,ele,sdslen(ele)))
{
dictReleaseIterator(di);
return 0;
}
2015-07-27 09:41:48 +02:00
if (++count == AOF_REWRITE_ITEMS_PER_CMD) count = 0;
items--;
}
dictReleaseIterator(di);
} else {
2015-07-27 09:41:48 +02:00
serverPanic("Unknown sorted zset encoding");
}
return 1;
}
/* Write either the key or the value of the currently selected item of a hash.
* The 'hi' argument passes a valid Redis hash iterator.
* The 'what' filed specifies if to write a key or a value and can be
* either OBJ_HASH_KEY or OBJ_HASH_VALUE.
*
* The function returns 0 on error, non-zero on success. */
2012-01-02 22:14:10 -08:00
static int rioWriteHashIteratorCursor(rio *r, hashTypeIterator *hi, int what) {
if (hi->encoding == OBJ_ENCODING_LISTPACK) {
2012-01-02 22:14:10 -08:00
unsigned char *vstr = NULL;
unsigned int vlen = UINT_MAX;
long long vll = LLONG_MAX;
hashTypeCurrentFromListpack(hi, what, &vstr, &vlen, &vll);
if (vstr)
2012-01-02 22:14:10 -08:00
return rioWriteBulkString(r, (char*)vstr, vlen);
else
2012-01-02 22:14:10 -08:00
return rioWriteBulkLongLong(r, vll);
} else if (hi->encoding == OBJ_ENCODING_HT) {
sds value = hashTypeCurrentFromHashTable(hi, what);
return rioWriteBulkString(r, value, sdslen(value));
2012-01-02 22:14:10 -08:00
}
2015-07-27 09:41:48 +02:00
serverPanic("Unknown hash encoding");
2012-01-02 22:14:10 -08:00
return 0;
}
2011-12-12 17:39:23 +01:00
/* Emit the commands needed to rebuild a hash object.
* The function returns 0 on error, 1 on success. */
int rewriteHashObject(rio *r, robj *key, robj *o) {
2012-01-02 22:14:10 -08:00
hashTypeIterator *hi;
2011-12-12 17:39:23 +01:00
long long count = 0, items = hashTypeLength(o);
2012-01-02 22:14:10 -08:00
hi = hashTypeInitIterator(o);
while (hashTypeNext(hi) != C_ERR) {
2012-01-02 22:14:10 -08:00
if (count == 0) {
2015-07-27 09:41:48 +02:00
int cmd_items = (items > AOF_REWRITE_ITEMS_PER_CMD) ?
AOF_REWRITE_ITEMS_PER_CMD : items;
2011-12-12 17:39:23 +01:00
if (!rioWriteBulkCount(r,'*',2+cmd_items*2) ||
!rioWriteBulkString(r,"HMSET",5) ||
!rioWriteBulkObject(r,key))
{
hashTypeReleaseIterator(hi);
return 0;
}
2011-12-12 17:39:23 +01:00
}
if (!rioWriteHashIteratorCursor(r, hi, OBJ_HASH_KEY) ||
!rioWriteHashIteratorCursor(r, hi, OBJ_HASH_VALUE))
{
hashTypeReleaseIterator(hi);
return 0;
}
2015-07-27 09:41:48 +02:00
if (++count == AOF_REWRITE_ITEMS_PER_CMD) count = 0;
2012-01-02 22:14:10 -08:00
items--;
}
2011-12-12 17:39:23 +01:00
2012-01-02 22:14:10 -08:00
hashTypeReleaseIterator(hi);
2011-12-12 17:39:23 +01:00
return 1;
}
2018-03-23 17:21:31 +01:00
/* Helper for rewriteStreamObject() that generates a bulk string into the
* AOF representing the ID 'id'. */
int rioWriteBulkStreamID(rio *r,streamID *id) {
int retval;
sds replyid = sdscatfmt(sdsempty(),"%U-%U",id->ms,id->seq);
2020-01-13 13:25:37 +01:00
retval = rioWriteBulkString(r,replyid,sdslen(replyid));
2018-03-23 17:21:31 +01:00
sdsfree(replyid);
return retval;
}
/* Helper for rewriteStreamObject(): emit the XCLAIM needed in order to
* add the message described by 'nack' having the id 'rawid', into the pending
* list of the specified consumer. All this in the context of the specified
* key and group. */
int rioWriteStreamPendingEntry(rio *r, robj *key, const char *groupname, size_t groupname_len, streamConsumer *consumer, unsigned char *rawid, streamNACK *nack) {
/* XCLAIM <key> <group> <consumer> 0 <id> TIME <milliseconds-unix-time>
RETRYCOUNT <count> JUSTID FORCE. */
streamID id;
streamDecodeID(rawid,&id);
if (rioWriteBulkCount(r,'*',12) == 0) return 0;
if (rioWriteBulkString(r,"XCLAIM",6) == 0) return 0;
if (rioWriteBulkObject(r,key) == 0) return 0;
if (rioWriteBulkString(r,groupname,groupname_len) == 0) return 0;
if (rioWriteBulkString(r,consumer->name,sdslen(consumer->name)) == 0) return 0;
if (rioWriteBulkString(r,"0",1) == 0) return 0;
if (rioWriteBulkStreamID(r,&id) == 0) return 0;
if (rioWriteBulkString(r,"TIME",4) == 0) return 0;
if (rioWriteBulkLongLong(r,nack->delivery_time) == 0) return 0;
if (rioWriteBulkString(r,"RETRYCOUNT",10) == 0) return 0;
if (rioWriteBulkLongLong(r,nack->delivery_count) == 0) return 0;
if (rioWriteBulkString(r,"JUSTID",6) == 0) return 0;
if (rioWriteBulkString(r,"FORCE",5) == 0) return 0;
return 1;
}
/* Helper for rewriteStreamObject(): emit the XGROUP CREATECONSUMER is
* needed in order to create consumers that do not have any pending entries.
* All this in the context of the specified key and group. */
int rioWriteStreamEmptyConsumer(rio *r, robj *key, const char *groupname, size_t groupname_len, streamConsumer *consumer) {
/* XGROUP CREATECONSUMER <key> <group> <consumer> */
if (rioWriteBulkCount(r,'*',5) == 0) return 0;
if (rioWriteBulkString(r,"XGROUP",6) == 0) return 0;
if (rioWriteBulkString(r,"CREATECONSUMER",14) == 0) return 0;
if (rioWriteBulkObject(r,key) == 0) return 0;
if (rioWriteBulkString(r,groupname,groupname_len) == 0) return 0;
if (rioWriteBulkString(r,consumer->name,sdslen(consumer->name)) == 0) return 0;
return 1;
}
/* Emit the commands needed to rebuild a stream object.
* The function returns 0 on error, 1 on success. */
int rewriteStreamObject(rio *r, robj *key, robj *o) {
2018-03-23 17:21:31 +01:00
stream *s = o->ptr;
streamIterator si;
2018-03-23 17:21:31 +01:00
streamIteratorStart(&si,s,NULL,NULL,0);
streamID id;
int64_t numfields;
if (s->length) {
/* Reconstruct the stream data using XADD commands. */
while(streamIteratorGetID(&si,&id,&numfields)) {
/* Emit a two elements array for each item. The first is
* the ID, the second is an array of field-value pairs. */
/* Emit the XADD <key> <id> ...fields... command. */
if (!rioWriteBulkCount(r,'*',3+numfields*2) ||
!rioWriteBulkString(r,"XADD",4) ||
!rioWriteBulkObject(r,key) ||
!rioWriteBulkStreamID(r,&id))
{
streamIteratorStop(&si);
return 0;
}
while(numfields--) {
unsigned char *field, *value;
int64_t field_len, value_len;
streamIteratorGetField(&si,&field,&value,&field_len,&value_len);
if (!rioWriteBulkString(r,(char*)field,field_len) ||
!rioWriteBulkString(r,(char*)value,value_len))
{
streamIteratorStop(&si);
return 0;
}
}
}
} else {
/* Use the XADD MAXLEN 0 trick to generate an empty stream if
* the key we are serializing is an empty string, which is possible
* for the Stream type. */
id.ms = 0; id.seq = 1;
if (!rioWriteBulkCount(r,'*',7) ||
!rioWriteBulkString(r,"XADD",4) ||
!rioWriteBulkObject(r,key) ||
!rioWriteBulkString(r,"MAXLEN",6) ||
!rioWriteBulkString(r,"0",1) ||
!rioWriteBulkStreamID(r,&id) ||
!rioWriteBulkString(r,"x",1) ||
!rioWriteBulkString(r,"y",1))
{
streamIteratorStop(&si);
return 0;
}
}
2018-03-23 17:21:31 +01:00
/* Append XSETID after XADD, make sure lastid is correct,
* in case of XDEL lastid. */
if (!rioWriteBulkCount(r,'*',3) ||
!rioWriteBulkString(r,"XSETID",6) ||
!rioWriteBulkObject(r,key) ||
!rioWriteBulkStreamID(r,&s->last_id))
{
streamIteratorStop(&si);
return 0;
}
2018-03-23 17:21:31 +01:00
/* Create all the stream consumer groups. */
if (s->cgroups) {
raxIterator ri;
raxStart(&ri,s->cgroups);
raxSeek(&ri,"^",NULL,0);
while(raxNext(&ri)) {
streamCG *group = ri.data;
/* Emit the XGROUP CREATE in order to create the group. */
if (!rioWriteBulkCount(r,'*',5) ||
!rioWriteBulkString(r,"XGROUP",6) ||
!rioWriteBulkString(r,"CREATE",6) ||
!rioWriteBulkObject(r,key) ||
!rioWriteBulkString(r,(char*)ri.key,ri.key_len) ||
!rioWriteBulkStreamID(r,&group->last_id))
{
raxStop(&ri);
streamIteratorStop(&si);
return 0;
}
2018-03-23 17:21:31 +01:00
/* Generate XCLAIMs for each consumer that happens to
* have pending entries. Empty consumers would be generated with
* XGROUP CREATECONSUMER. */
2018-03-23 17:21:31 +01:00
raxIterator ri_cons;
raxStart(&ri_cons,group->consumers);
raxSeek(&ri_cons,"^",NULL,0);
while(raxNext(&ri_cons)) {
streamConsumer *consumer = ri_cons.data;
/* If there are no pending entries, just emit XGROUP CREATECONSUMER */
if (raxSize(consumer->pel) == 0) {
if (rioWriteStreamEmptyConsumer(r,key,(char*)ri.key,
ri.key_len,consumer) == 0)
{
raxStop(&ri_cons);
raxStop(&ri);
streamIteratorStop(&si);
return 0;
}
continue;
}
2018-03-23 17:21:31 +01:00
/* For the current consumer, iterate all the PEL entries
* to emit the XCLAIM protocol. */
raxIterator ri_pel;
raxStart(&ri_pel,consumer->pel);
raxSeek(&ri_pel,"^",NULL,0);
while(raxNext(&ri_pel)) {
streamNACK *nack = ri_pel.data;
if (rioWriteStreamPendingEntry(r,key,(char*)ri.key,
ri.key_len,consumer,
ri_pel.key,nack) == 0)
{
raxStop(&ri_pel);
raxStop(&ri_cons);
raxStop(&ri);
streamIteratorStop(&si);
2018-03-23 17:21:31 +01:00
return 0;
}
}
raxStop(&ri_pel);
}
raxStop(&ri_cons);
}
raxStop(&ri);
}
streamIteratorStop(&si);
return 1;
}
/* Call the module type callback in order to rewrite a data type
2016-08-09 11:07:32 +02:00
* that is exported by a module and is not handled by Redis itself.
* The function returns 0 on error, 1 on success. */
int rewriteModuleObject(rio *r, robj *key, robj *o, int dbid) {
RedisModuleIO io;
moduleValue *mv = o->ptr;
moduleType *mt = mv->type;
moduleInitIOContext(io,mt,r,key,dbid);
mt->aof_rewrite(&io,key,mv->value);
if (io.ctx) {
moduleFreeContext(io.ctx);
zfree(io.ctx);
}
return io.error ? 0 : 1;
}
static int rewriteFunctions(rio *aof) {
dict *functions = functionsLibGet();
dictIterator *iter = dictGetIterator(functions);
dictEntry *entry = NULL;
while ((entry = dictNext(iter))) {
functionLibInfo *li = dictGetVal(entry);
if (li->desc) {
if (rioWrite(aof, "*7\r\n", 4) == 0) goto werr;
} else {
if (rioWrite(aof, "*5\r\n", 4) == 0) goto werr;
}
char function_load[] = "$8\r\nFUNCTION\r\n$4\r\nLOAD\r\n";
if (rioWrite(aof, function_load, sizeof(function_load) - 1) == 0) goto werr;
if (rioWriteBulkString(aof, li->ei->name, sdslen(li->ei->name)) == 0) goto werr;
if (rioWriteBulkString(aof, li->name, sdslen(li->name)) == 0) goto werr;
if (li->desc) {
if (rioWriteBulkString(aof, "description", 11) == 0) goto werr;
if (rioWriteBulkString(aof, li->desc, sdslen(li->desc)) == 0) goto werr;
}
if (rioWriteBulkString(aof, li->code, sdslen(li->code)) == 0) goto werr;
}
dictReleaseIterator(iter);
return 1;
werr:
dictReleaseIterator(iter);
return 0;
}
2016-08-09 16:41:40 +02:00
int rewriteAppendOnlyFileRio(rio *aof) {
dictIterator *di = NULL;
dictEntry *de;
2016-08-09 16:41:40 +02:00
int j;
long key_count = 0;
long long updated_time = 0;
/* Record timestamp at the beginning of rewriting AOF. */
if (server.aof_timestamp_enabled) {
sds ts = genAofTimestampAnnotationIfNeeded(1);
if (rioWrite(aof,ts,sdslen(ts)) == 0) { sdsfree(ts); goto werr; }
sdsfree(ts);
}
if (rewriteFunctions(aof) == 0) goto werr;
for (j = 0; j < server.dbnum; j++) {
char selectcmd[] = "*2\r\n$6\r\nSELECT\r\n";
redisDb *db = server.db+j;
dict *d = db->dict;
if (dictSize(d) == 0) continue;
di = dictGetSafeIterator(d);
/* SELECT the new DB */
2016-08-09 16:41:40 +02:00
if (rioWrite(aof,selectcmd,sizeof(selectcmd)-1) == 0) goto werr;
if (rioWriteBulkLongLong(aof,j) == 0) goto werr;
/* Iterate this DB writing every entry */
while((de = dictNext(di)) != NULL) {
2011-05-10 10:07:04 +02:00
sds keystr;
robj key, *o;
long long expiretime;
Use madvise(MADV_DONTNEED) to release memory to reduce COW (#8974) ## Backgroud As we know, after `fork`, one process will copy pages when writing data to these pages(CoW), and another process still keep old pages, they totally cost more memory. For redis, we suffered that redis consumed much memory when the fork child is serializing key/values, even that maybe cause OOM. But actually we find, in redis fork child process, the child process don't need to keep some memory and parent process may write or update that, for example, child process will never access the key-value that is serialized but users may update it in parent process. So we think it may reduce COW if the child process release memory that it is not needed. ## Implementation For releasing key value in child process, we may think we call `decrRefCount` to free memory, but i find the fork child process still use much memory when we don't write any data to redis, and it costs much more time that slows down bgsave. Maybe because memory allocator doesn't really release memory to OS, and it may modify some inner data for this free operation, especially when we free small objects. Moreover, CoW is based on pages, so it is a easy way that we only free the memory bulk that is not less than kernel page size. madvise(MADV_DONTNEED) can quickly release specified region pages to OS bypassing memory allocator, and allocator still consider that this memory still is used and don't change its inner data. There are some buffers we can release in the fork child process: - **Serialized key-values** the fork child process never access serialized key-values, so we try to free them. Because we only can release big bulk memory, and it is time consumed to iterate all items/members/fields/entries of complex data type. So we decide to iterate them and try to release them only when their average size of item/member/field/entry is more than page size of OS. - **Replication backlog** Because replication backlog is a cycle buffer, it will be changed quickly if redis has heavy write traffic, but in fork child process, we don't need to access that. - **Client buffers** If clients have requests during having the fork child process, clients' buffer also be changed frequently. The memory includes client query buffer, output buffer, and client struct used memory. To get child process peak private dirty memory, we need to count peak memory instead of last used memory, because the child process may continue to release memory (since COW used to only grow till now, the last was equivalent to the peak). Also we're adding a new `current_cow_peak` info variable (to complement the existing `current_cow_size`) Co-authored-by: Oran Agra <oran@redislabs.com>
2021-08-05 04:01:46 +08:00
size_t aof_bytes_before_key = aof->processed_bytes;
keystr = dictGetKey(de);
o = dictGetVal(de);
initStaticStringObject(key,keystr);
expiretime = getExpire(db,&key);
/* Save the key and associated value */
if (o->type == OBJ_STRING) {
/* Emit a SET command */
char cmd[]="*3\r\n$3\r\nSET\r\n";
2016-08-09 16:41:40 +02:00
if (rioWrite(aof,cmd,sizeof(cmd)-1) == 0) goto werr;
/* Key and value */
2016-08-09 16:41:40 +02:00
if (rioWriteBulkObject(aof,&key) == 0) goto werr;
if (rioWriteBulkObject(aof,o) == 0) goto werr;
} else if (o->type == OBJ_LIST) {
2016-08-09 16:41:40 +02:00
if (rewriteListObject(aof,&key,o) == 0) goto werr;
} else if (o->type == OBJ_SET) {
2016-08-09 16:41:40 +02:00
if (rewriteSetObject(aof,&key,o) == 0) goto werr;
} else if (o->type == OBJ_ZSET) {
2016-08-09 16:41:40 +02:00
if (rewriteSortedSetObject(aof,&key,o) == 0) goto werr;
} else if (o->type == OBJ_HASH) {
2016-08-09 16:41:40 +02:00
if (rewriteHashObject(aof,&key,o) == 0) goto werr;
} else if (o->type == OBJ_STREAM) {
if (rewriteStreamObject(aof,&key,o) == 0) goto werr;
} else if (o->type == OBJ_MODULE) {
if (rewriteModuleObject(aof,&key,o,j) == 0) goto werr;
} else {
2015-07-27 09:41:48 +02:00
serverPanic("Unknown object type");
}
Use madvise(MADV_DONTNEED) to release memory to reduce COW (#8974) ## Backgroud As we know, after `fork`, one process will copy pages when writing data to these pages(CoW), and another process still keep old pages, they totally cost more memory. For redis, we suffered that redis consumed much memory when the fork child is serializing key/values, even that maybe cause OOM. But actually we find, in redis fork child process, the child process don't need to keep some memory and parent process may write or update that, for example, child process will never access the key-value that is serialized but users may update it in parent process. So we think it may reduce COW if the child process release memory that it is not needed. ## Implementation For releasing key value in child process, we may think we call `decrRefCount` to free memory, but i find the fork child process still use much memory when we don't write any data to redis, and it costs much more time that slows down bgsave. Maybe because memory allocator doesn't really release memory to OS, and it may modify some inner data for this free operation, especially when we free small objects. Moreover, CoW is based on pages, so it is a easy way that we only free the memory bulk that is not less than kernel page size. madvise(MADV_DONTNEED) can quickly release specified region pages to OS bypassing memory allocator, and allocator still consider that this memory still is used and don't change its inner data. There are some buffers we can release in the fork child process: - **Serialized key-values** the fork child process never access serialized key-values, so we try to free them. Because we only can release big bulk memory, and it is time consumed to iterate all items/members/fields/entries of complex data type. So we decide to iterate them and try to release them only when their average size of item/member/field/entry is more than page size of OS. - **Replication backlog** Because replication backlog is a cycle buffer, it will be changed quickly if redis has heavy write traffic, but in fork child process, we don't need to access that. - **Client buffers** If clients have requests during having the fork child process, clients' buffer also be changed frequently. The memory includes client query buffer, output buffer, and client struct used memory. To get child process peak private dirty memory, we need to count peak memory instead of last used memory, because the child process may continue to release memory (since COW used to only grow till now, the last was equivalent to the peak). Also we're adding a new `current_cow_peak` info variable (to complement the existing `current_cow_size`) Co-authored-by: Oran Agra <oran@redislabs.com>
2021-08-05 04:01:46 +08:00
/* In fork child process, we can try to release memory back to the
* OS and possibly avoid or decrease COW. We guve the dismiss
* mechanism a hint about an estimated size of the object we stored. */
size_t dump_size = aof->processed_bytes - aof_bytes_before_key;
if (server.in_fork_child) dismissObject(o, dump_size);
/* Save the expire time */
if (expiretime != -1) {
char cmd[]="*3\r\n$9\r\nPEXPIREAT\r\n";
2016-08-09 16:41:40 +02:00
if (rioWrite(aof,cmd,sizeof(cmd)-1) == 0) goto werr;
if (rioWriteBulkObject(aof,&key) == 0) goto werr;
if (rioWriteBulkLongLong(aof,expiretime) == 0) goto werr;
}
/* Update info every 1 second (approximately).
* in order to avoid calling mstime() on each iteration, we will
* check the diff every 1024 keys */
if ((key_count++ & 1023) == 0) {
long long now = mstime();
if (now - updated_time >= 1000) {
sendChildInfo(CHILD_INFO_TYPE_CURRENT_INFO, key_count, "AOF rewrite");
updated_time = now;
}
}
/* Delay before next key if required (for testing) */
if (server.rdb_key_save_delay)
debugDelay(server.rdb_key_save_delay);
}
dictReleaseIterator(di);
di = NULL;
}
2016-08-09 11:07:32 +02:00
return C_OK;
werr:
if (di) dictReleaseIterator(di);
return C_ERR;
}
/* Write a sequence of commands able to fully rebuild the dataset into
* "filename". Used both by REWRITEAOF and BGREWRITEAOF.
*
* In order to minimize the number of commands needed in the rewritten
* log Redis uses variadic commands when possible, such as RPUSH, SADD
* and ZADD. However at max AOF_REWRITE_ITEMS_PER_CMD items per time
* are inserted using a single command. */
int rewriteAppendOnlyFile(char *filename) {
rio aof;
FILE *fp = NULL;
2016-08-09 11:07:32 +02:00
char tmpfile[256];
/* Note that we have to use a different temp name here compared to the
* one used by rewriteAppendOnlyFileBackground() function. */
snprintf(tmpfile,256,"temp-rewriteaof-%d.aof", (int) getpid());
fp = fopen(tmpfile,"w");
if (!fp) {
serverLog(LL_WARNING, "Opening the temp file for AOF rewrite in rewriteAppendOnlyFile(): %s", strerror(errno));
return C_ERR;
}
rioInitWithFile(&aof,fp);
if (server.aof_rewrite_incremental_fsync)
rioSetAutoSync(&aof,REDIS_AUTOSYNC_BYTES);
2016-08-09 11:07:32 +02:00
startSaving(RDBFLAGS_AOF_PREAMBLE);
2016-08-09 16:41:40 +02:00
if (server.aof_use_rdb_preamble) {
2016-08-09 11:07:32 +02:00
int error;
if (rdbSaveRio(SLAVE_REQ_NONE,&aof,&error,RDBFLAGS_AOF_PREAMBLE,NULL) == C_ERR) {
2016-08-09 11:07:32 +02:00
errno = error;
goto werr;
}
} else {
2016-08-09 16:41:40 +02:00
if (rewriteAppendOnlyFileRio(&aof) == C_ERR) goto werr;
2016-08-09 11:07:32 +02:00
}
/* Make sure data will not remain on the OS's output buffers */
if (fflush(fp)) goto werr;
if (fsync(fileno(fp))) goto werr;
if (fclose(fp)) { fp = NULL; goto werr; }
fp = NULL;
/* Use RENAME to make sure the DB file is changed atomically only
* if the generate DB file is ok. */
if (rename(tmpfile,filename) == -1) {
2015-07-27 09:41:48 +02:00
serverLog(LL_WARNING,"Error moving temp append only file on the final destination: %s", strerror(errno));
unlink(tmpfile);
stopSaving(0);
return C_ERR;
}
2015-07-27 09:41:48 +02:00
serverLog(LL_NOTICE,"SYNC append only file rewrite performed");
stopSaving(1);
Implement Multi Part AOF mechanism to avoid AOFRW overheads. (#9788) Implement Multi-Part AOF mechanism to avoid overheads during AOFRW. Introducing a folder with multiple AOF files tracked by a manifest file. The main issues with the the original AOFRW mechanism are: * buffering of commands that are processed during rewrite (consuming a lot of RAM) * freezes of the main process when the AOFRW completes to drain the remaining part of the buffer and fsync it. * double disk IO for the data that arrives during AOFRW (had to be written to both the old and new AOF files) The main modifications of this PR: 1. Remove the AOF rewrite buffer and related code. 2. Divide the AOF into multiple files, they are classified as two types, one is the the `BASE` type, it represents the full amount of data (Maybe AOF or RDB format) after each AOFRW, there is only one `BASE` file at most. The second is `INCR` type, may have more than one. They represent the incremental commands since the last AOFRW. 3. Use a AOF manifest file to record and manage these AOF files mentioned above. 4. The original configuration of `appendfilename` will be the base part of the new file name, for example: `appendonly.aof.1.base.rdb` and `appendonly.aof.2.incr.aof` 5. Add manifest-related TCL tests, and modified some existing tests that depend on the `appendfilename` 6. Remove the `aof_rewrite_buffer_length` field in info. 7. Add `aof-disable-auto-gc` configuration. By default we're automatically deleting HISTORY type AOFs. It also gives users the opportunity to preserve the history AOFs. just for testing use now. 8. Add AOFRW limiting measure. When the AOFRW failures reaches the threshold (3 times now), we will delay the execution of the next AOFRW by 1 minute. If the next AOFRW also fails, it will be delayed by 2 minutes. The next is 4, 8, 16, the maximum delay is 60 minutes (1 hour). During the limit period, we can still use the 'bgrewriteaof' command to execute AOFRW immediately. 9. Support upgrade (load) data from old version redis. 10. Add `appenddirname` configuration, as the directory name of the append only files. All AOF files and manifest file will be placed in this directory. 11. Only the last AOF file (BASE or INCR) can be truncated. Otherwise redis will exit even if `aof-load-truncated` is enabled. Co-authored-by: Oran Agra <oran@redislabs.com>
2022-01-04 01:14:13 +08:00
return C_OK;
werr:
2015-07-27 09:41:48 +02:00
serverLog(LL_WARNING,"Write error writing append only file on disk: %s", strerror(errno));
if (fp) fclose(fp);
unlink(tmpfile);
stopSaving(0);
return C_ERR;
}
/* ----------------------------------------------------------------------------
2014-07-12 18:02:17 +02:00
* AOF background rewrite
* ------------------------------------------------------------------------- */
/* This is how rewriting of the append only file in background works:
*
* 1) The user calls BGREWRITEAOF
* 2) Redis calls this function, that forks():
* 2a) the child rewrite the append only file in a temp file.
Implement Multi Part AOF mechanism to avoid AOFRW overheads. (#9788) Implement Multi-Part AOF mechanism to avoid overheads during AOFRW. Introducing a folder with multiple AOF files tracked by a manifest file. The main issues with the the original AOFRW mechanism are: * buffering of commands that are processed during rewrite (consuming a lot of RAM) * freezes of the main process when the AOFRW completes to drain the remaining part of the buffer and fsync it. * double disk IO for the data that arrives during AOFRW (had to be written to both the old and new AOF files) The main modifications of this PR: 1. Remove the AOF rewrite buffer and related code. 2. Divide the AOF into multiple files, they are classified as two types, one is the the `BASE` type, it represents the full amount of data (Maybe AOF or RDB format) after each AOFRW, there is only one `BASE` file at most. The second is `INCR` type, may have more than one. They represent the incremental commands since the last AOFRW. 3. Use a AOF manifest file to record and manage these AOF files mentioned above. 4. The original configuration of `appendfilename` will be the base part of the new file name, for example: `appendonly.aof.1.base.rdb` and `appendonly.aof.2.incr.aof` 5. Add manifest-related TCL tests, and modified some existing tests that depend on the `appendfilename` 6. Remove the `aof_rewrite_buffer_length` field in info. 7. Add `aof-disable-auto-gc` configuration. By default we're automatically deleting HISTORY type AOFs. It also gives users the opportunity to preserve the history AOFs. just for testing use now. 8. Add AOFRW limiting measure. When the AOFRW failures reaches the threshold (3 times now), we will delay the execution of the next AOFRW by 1 minute. If the next AOFRW also fails, it will be delayed by 2 minutes. The next is 4, 8, 16, the maximum delay is 60 minutes (1 hour). During the limit period, we can still use the 'bgrewriteaof' command to execute AOFRW immediately. 9. Support upgrade (load) data from old version redis. 10. Add `appenddirname` configuration, as the directory name of the append only files. All AOF files and manifest file will be placed in this directory. 11. Only the last AOF file (BASE or INCR) can be truncated. Otherwise redis will exit even if `aof-load-truncated` is enabled. Co-authored-by: Oran Agra <oran@redislabs.com>
2022-01-04 01:14:13 +08:00
* 2b) the parent open a new INCR AOF file to continue writing.
* 3) When the child finished '2a' exists.
Implement Multi Part AOF mechanism to avoid AOFRW overheads. (#9788) Implement Multi-Part AOF mechanism to avoid overheads during AOFRW. Introducing a folder with multiple AOF files tracked by a manifest file. The main issues with the the original AOFRW mechanism are: * buffering of commands that are processed during rewrite (consuming a lot of RAM) * freezes of the main process when the AOFRW completes to drain the remaining part of the buffer and fsync it. * double disk IO for the data that arrives during AOFRW (had to be written to both the old and new AOF files) The main modifications of this PR: 1. Remove the AOF rewrite buffer and related code. 2. Divide the AOF into multiple files, they are classified as two types, one is the the `BASE` type, it represents the full amount of data (Maybe AOF or RDB format) after each AOFRW, there is only one `BASE` file at most. The second is `INCR` type, may have more than one. They represent the incremental commands since the last AOFRW. 3. Use a AOF manifest file to record and manage these AOF files mentioned above. 4. The original configuration of `appendfilename` will be the base part of the new file name, for example: `appendonly.aof.1.base.rdb` and `appendonly.aof.2.incr.aof` 5. Add manifest-related TCL tests, and modified some existing tests that depend on the `appendfilename` 6. Remove the `aof_rewrite_buffer_length` field in info. 7. Add `aof-disable-auto-gc` configuration. By default we're automatically deleting HISTORY type AOFs. It also gives users the opportunity to preserve the history AOFs. just for testing use now. 8. Add AOFRW limiting measure. When the AOFRW failures reaches the threshold (3 times now), we will delay the execution of the next AOFRW by 1 minute. If the next AOFRW also fails, it will be delayed by 2 minutes. The next is 4, 8, 16, the maximum delay is 60 minutes (1 hour). During the limit period, we can still use the 'bgrewriteaof' command to execute AOFRW immediately. 9. Support upgrade (load) data from old version redis. 10. Add `appenddirname` configuration, as the directory name of the append only files. All AOF files and manifest file will be placed in this directory. 11. Only the last AOF file (BASE or INCR) can be truncated. Otherwise redis will exit even if `aof-load-truncated` is enabled. Co-authored-by: Oran Agra <oran@redislabs.com>
2022-01-04 01:14:13 +08:00
* 4) The parent will trap the exit code, if it's OK, it will:
* 4a) get a new BASE file name and mark the previous (if we have) as the HISTORY type
* 4b) rename(2) the temp file in new BASE file name
* 4c) mark the rewritten INCR AOFs as history type
* 4d) persist AOF manifest file
* 4e) Delete the history files use bio
*/
int rewriteAppendOnlyFileBackground(void) {
pid_t childpid;
if (hasActiveChildProcess()) return C_ERR;
Implement Multi Part AOF mechanism to avoid AOFRW overheads. (#9788) Implement Multi-Part AOF mechanism to avoid overheads during AOFRW. Introducing a folder with multiple AOF files tracked by a manifest file. The main issues with the the original AOFRW mechanism are: * buffering of commands that are processed during rewrite (consuming a lot of RAM) * freezes of the main process when the AOFRW completes to drain the remaining part of the buffer and fsync it. * double disk IO for the data that arrives during AOFRW (had to be written to both the old and new AOF files) The main modifications of this PR: 1. Remove the AOF rewrite buffer and related code. 2. Divide the AOF into multiple files, they are classified as two types, one is the the `BASE` type, it represents the full amount of data (Maybe AOF or RDB format) after each AOFRW, there is only one `BASE` file at most. The second is `INCR` type, may have more than one. They represent the incremental commands since the last AOFRW. 3. Use a AOF manifest file to record and manage these AOF files mentioned above. 4. The original configuration of `appendfilename` will be the base part of the new file name, for example: `appendonly.aof.1.base.rdb` and `appendonly.aof.2.incr.aof` 5. Add manifest-related TCL tests, and modified some existing tests that depend on the `appendfilename` 6. Remove the `aof_rewrite_buffer_length` field in info. 7. Add `aof-disable-auto-gc` configuration. By default we're automatically deleting HISTORY type AOFs. It also gives users the opportunity to preserve the history AOFs. just for testing use now. 8. Add AOFRW limiting measure. When the AOFRW failures reaches the threshold (3 times now), we will delay the execution of the next AOFRW by 1 minute. If the next AOFRW also fails, it will be delayed by 2 minutes. The next is 4, 8, 16, the maximum delay is 60 minutes (1 hour). During the limit period, we can still use the 'bgrewriteaof' command to execute AOFRW immediately. 9. Support upgrade (load) data from old version redis. 10. Add `appenddirname` configuration, as the directory name of the append only files. All AOF files and manifest file will be placed in this directory. 11. Only the last AOF file (BASE or INCR) can be truncated. Otherwise redis will exit even if `aof-load-truncated` is enabled. Co-authored-by: Oran Agra <oran@redislabs.com>
2022-01-04 01:14:13 +08:00
if (dirCreateIfMissing(server.aof_dirname) == -1) {
serverLog(LL_WARNING, "Can't open or create append-only dir %s: %s",
server.aof_dirname, strerror(errno));
return C_ERR;
}
/* We set aof_selected_db to -1 in order to force the next call to the
* feedAppendOnlyFile() to issue a SELECT command. */
Implement Multi Part AOF mechanism to avoid AOFRW overheads. (#9788) Implement Multi-Part AOF mechanism to avoid overheads during AOFRW. Introducing a folder with multiple AOF files tracked by a manifest file. The main issues with the the original AOFRW mechanism are: * buffering of commands that are processed during rewrite (consuming a lot of RAM) * freezes of the main process when the AOFRW completes to drain the remaining part of the buffer and fsync it. * double disk IO for the data that arrives during AOFRW (had to be written to both the old and new AOF files) The main modifications of this PR: 1. Remove the AOF rewrite buffer and related code. 2. Divide the AOF into multiple files, they are classified as two types, one is the the `BASE` type, it represents the full amount of data (Maybe AOF or RDB format) after each AOFRW, there is only one `BASE` file at most. The second is `INCR` type, may have more than one. They represent the incremental commands since the last AOFRW. 3. Use a AOF manifest file to record and manage these AOF files mentioned above. 4. The original configuration of `appendfilename` will be the base part of the new file name, for example: `appendonly.aof.1.base.rdb` and `appendonly.aof.2.incr.aof` 5. Add manifest-related TCL tests, and modified some existing tests that depend on the `appendfilename` 6. Remove the `aof_rewrite_buffer_length` field in info. 7. Add `aof-disable-auto-gc` configuration. By default we're automatically deleting HISTORY type AOFs. It also gives users the opportunity to preserve the history AOFs. just for testing use now. 8. Add AOFRW limiting measure. When the AOFRW failures reaches the threshold (3 times now), we will delay the execution of the next AOFRW by 1 minute. If the next AOFRW also fails, it will be delayed by 2 minutes. The next is 4, 8, 16, the maximum delay is 60 minutes (1 hour). During the limit period, we can still use the 'bgrewriteaof' command to execute AOFRW immediately. 9. Support upgrade (load) data from old version redis. 10. Add `appenddirname` configuration, as the directory name of the append only files. All AOF files and manifest file will be placed in this directory. 11. Only the last AOF file (BASE or INCR) can be truncated. Otherwise redis will exit even if `aof-load-truncated` is enabled. Co-authored-by: Oran Agra <oran@redislabs.com>
2022-01-04 01:14:13 +08:00
server.aof_selected_db = -1;
flushAppendOnlyFile(1);
if (openNewIncrAofForAppend() != C_OK) return C_ERR;
server.stat_aof_rewrites++;
if ((childpid = redisFork(CHILD_TYPE_AOF)) == 0) {
char tmpfile[256];
/* Child */
redisSetProcTitle("redis-aof-rewrite");
redisSetCpuAffinity(server.aof_rewrite_cpulist);
snprintf(tmpfile,256,"temp-rewriteaof-bg-%d.aof", (int) getpid());
if (rewriteAppendOnlyFile(tmpfile) == C_OK) {
sendChildCowInfo(CHILD_INFO_TYPE_AOF_COW_SIZE, "AOF rewrite");
exitFromChild(0);
} else {
exitFromChild(1);
}
} else {
/* Parent */
if (childpid == -1) {
server.aof_lastbgrewrite_status = C_ERR;
2015-07-27 09:41:48 +02:00
serverLog(LL_WARNING,
"Can't rewrite append only file in background: fork: %s",
strerror(errno));
return C_ERR;
}
2015-07-27 09:41:48 +02:00
serverLog(LL_NOTICE,
"Background append only file rewriting started by pid %ld",(long) childpid);
server.aof_rewrite_scheduled = 0;
server.aof_rewrite_time_start = time(NULL);
return C_OK;
}
return C_OK; /* unreached */
}
void bgrewriteaofCommand(client *c) {
if (server.child_type == CHILD_TYPE_AOF) {
addReplyError(c,"Background append only file rewriting already in progress");
} else if (hasActiveChildProcess() || server.in_exec) {
server.aof_rewrite_scheduled = 1;
addReplyStatus(c,"Background append only file rewriting scheduled");
} else if (rewriteAppendOnlyFileBackground() == C_OK) {
addReplyStatus(c,"Background append only file rewriting started");
} else {
addReplyError(c,"Can't execute an AOF background rewriting. "
"Please check the server logs for more information.");
}
}
void aofRemoveTempFile(pid_t childpid) {
char tmpfile[256];
snprintf(tmpfile,256,"temp-rewriteaof-bg-%d.aof", (int) childpid);
bg_unlink(tmpfile);
snprintf(tmpfile,256,"temp-rewriteaof-%d.aof", (int) childpid);
bg_unlink(tmpfile);
}
Implement Multi Part AOF mechanism to avoid AOFRW overheads. (#9788) Implement Multi-Part AOF mechanism to avoid overheads during AOFRW. Introducing a folder with multiple AOF files tracked by a manifest file. The main issues with the the original AOFRW mechanism are: * buffering of commands that are processed during rewrite (consuming a lot of RAM) * freezes of the main process when the AOFRW completes to drain the remaining part of the buffer and fsync it. * double disk IO for the data that arrives during AOFRW (had to be written to both the old and new AOF files) The main modifications of this PR: 1. Remove the AOF rewrite buffer and related code. 2. Divide the AOF into multiple files, they are classified as two types, one is the the `BASE` type, it represents the full amount of data (Maybe AOF or RDB format) after each AOFRW, there is only one `BASE` file at most. The second is `INCR` type, may have more than one. They represent the incremental commands since the last AOFRW. 3. Use a AOF manifest file to record and manage these AOF files mentioned above. 4. The original configuration of `appendfilename` will be the base part of the new file name, for example: `appendonly.aof.1.base.rdb` and `appendonly.aof.2.incr.aof` 5. Add manifest-related TCL tests, and modified some existing tests that depend on the `appendfilename` 6. Remove the `aof_rewrite_buffer_length` field in info. 7. Add `aof-disable-auto-gc` configuration. By default we're automatically deleting HISTORY type AOFs. It also gives users the opportunity to preserve the history AOFs. just for testing use now. 8. Add AOFRW limiting measure. When the AOFRW failures reaches the threshold (3 times now), we will delay the execution of the next AOFRW by 1 minute. If the next AOFRW also fails, it will be delayed by 2 minutes. The next is 4, 8, 16, the maximum delay is 60 minutes (1 hour). During the limit period, we can still use the 'bgrewriteaof' command to execute AOFRW immediately. 9. Support upgrade (load) data from old version redis. 10. Add `appenddirname` configuration, as the directory name of the append only files. All AOF files and manifest file will be placed in this directory. 11. Only the last AOF file (BASE or INCR) can be truncated. Otherwise redis will exit even if `aof-load-truncated` is enabled. Co-authored-by: Oran Agra <oran@redislabs.com>
2022-01-04 01:14:13 +08:00
off_t getAppendOnlyFileSize(sds filename) {
struct redis_stat sb;
Implement Multi Part AOF mechanism to avoid AOFRW overheads. (#9788) Implement Multi-Part AOF mechanism to avoid overheads during AOFRW. Introducing a folder with multiple AOF files tracked by a manifest file. The main issues with the the original AOFRW mechanism are: * buffering of commands that are processed during rewrite (consuming a lot of RAM) * freezes of the main process when the AOFRW completes to drain the remaining part of the buffer and fsync it. * double disk IO for the data that arrives during AOFRW (had to be written to both the old and new AOF files) The main modifications of this PR: 1. Remove the AOF rewrite buffer and related code. 2. Divide the AOF into multiple files, they are classified as two types, one is the the `BASE` type, it represents the full amount of data (Maybe AOF or RDB format) after each AOFRW, there is only one `BASE` file at most. The second is `INCR` type, may have more than one. They represent the incremental commands since the last AOFRW. 3. Use a AOF manifest file to record and manage these AOF files mentioned above. 4. The original configuration of `appendfilename` will be the base part of the new file name, for example: `appendonly.aof.1.base.rdb` and `appendonly.aof.2.incr.aof` 5. Add manifest-related TCL tests, and modified some existing tests that depend on the `appendfilename` 6. Remove the `aof_rewrite_buffer_length` field in info. 7. Add `aof-disable-auto-gc` configuration. By default we're automatically deleting HISTORY type AOFs. It also gives users the opportunity to preserve the history AOFs. just for testing use now. 8. Add AOFRW limiting measure. When the AOFRW failures reaches the threshold (3 times now), we will delay the execution of the next AOFRW by 1 minute. If the next AOFRW also fails, it will be delayed by 2 minutes. The next is 4, 8, 16, the maximum delay is 60 minutes (1 hour). During the limit period, we can still use the 'bgrewriteaof' command to execute AOFRW immediately. 9. Support upgrade (load) data from old version redis. 10. Add `appenddirname` configuration, as the directory name of the append only files. All AOF files and manifest file will be placed in this directory. 11. Only the last AOF file (BASE or INCR) can be truncated. Otherwise redis will exit even if `aof-load-truncated` is enabled. Co-authored-by: Oran Agra <oran@redislabs.com>
2022-01-04 01:14:13 +08:00
off_t size;
mstime_t latency;
Implement Multi Part AOF mechanism to avoid AOFRW overheads. (#9788) Implement Multi-Part AOF mechanism to avoid overheads during AOFRW. Introducing a folder with multiple AOF files tracked by a manifest file. The main issues with the the original AOFRW mechanism are: * buffering of commands that are processed during rewrite (consuming a lot of RAM) * freezes of the main process when the AOFRW completes to drain the remaining part of the buffer and fsync it. * double disk IO for the data that arrives during AOFRW (had to be written to both the old and new AOF files) The main modifications of this PR: 1. Remove the AOF rewrite buffer and related code. 2. Divide the AOF into multiple files, they are classified as two types, one is the the `BASE` type, it represents the full amount of data (Maybe AOF or RDB format) after each AOFRW, there is only one `BASE` file at most. The second is `INCR` type, may have more than one. They represent the incremental commands since the last AOFRW. 3. Use a AOF manifest file to record and manage these AOF files mentioned above. 4. The original configuration of `appendfilename` will be the base part of the new file name, for example: `appendonly.aof.1.base.rdb` and `appendonly.aof.2.incr.aof` 5. Add manifest-related TCL tests, and modified some existing tests that depend on the `appendfilename` 6. Remove the `aof_rewrite_buffer_length` field in info. 7. Add `aof-disable-auto-gc` configuration. By default we're automatically deleting HISTORY type AOFs. It also gives users the opportunity to preserve the history AOFs. just for testing use now. 8. Add AOFRW limiting measure. When the AOFRW failures reaches the threshold (3 times now), we will delay the execution of the next AOFRW by 1 minute. If the next AOFRW also fails, it will be delayed by 2 minutes. The next is 4, 8, 16, the maximum delay is 60 minutes (1 hour). During the limit period, we can still use the 'bgrewriteaof' command to execute AOFRW immediately. 9. Support upgrade (load) data from old version redis. 10. Add `appenddirname` configuration, as the directory name of the append only files. All AOF files and manifest file will be placed in this directory. 11. Only the last AOF file (BASE or INCR) can be truncated. Otherwise redis will exit even if `aof-load-truncated` is enabled. Co-authored-by: Oran Agra <oran@redislabs.com>
2022-01-04 01:14:13 +08:00
sds aof_filepath = makePath(server.aof_dirname, filename);
latencyStartMonitor(latency);
Implement Multi Part AOF mechanism to avoid AOFRW overheads. (#9788) Implement Multi-Part AOF mechanism to avoid overheads during AOFRW. Introducing a folder with multiple AOF files tracked by a manifest file. The main issues with the the original AOFRW mechanism are: * buffering of commands that are processed during rewrite (consuming a lot of RAM) * freezes of the main process when the AOFRW completes to drain the remaining part of the buffer and fsync it. * double disk IO for the data that arrives during AOFRW (had to be written to both the old and new AOF files) The main modifications of this PR: 1. Remove the AOF rewrite buffer and related code. 2. Divide the AOF into multiple files, they are classified as two types, one is the the `BASE` type, it represents the full amount of data (Maybe AOF or RDB format) after each AOFRW, there is only one `BASE` file at most. The second is `INCR` type, may have more than one. They represent the incremental commands since the last AOFRW. 3. Use a AOF manifest file to record and manage these AOF files mentioned above. 4. The original configuration of `appendfilename` will be the base part of the new file name, for example: `appendonly.aof.1.base.rdb` and `appendonly.aof.2.incr.aof` 5. Add manifest-related TCL tests, and modified some existing tests that depend on the `appendfilename` 6. Remove the `aof_rewrite_buffer_length` field in info. 7. Add `aof-disable-auto-gc` configuration. By default we're automatically deleting HISTORY type AOFs. It also gives users the opportunity to preserve the history AOFs. just for testing use now. 8. Add AOFRW limiting measure. When the AOFRW failures reaches the threshold (3 times now), we will delay the execution of the next AOFRW by 1 minute. If the next AOFRW also fails, it will be delayed by 2 minutes. The next is 4, 8, 16, the maximum delay is 60 minutes (1 hour). During the limit period, we can still use the 'bgrewriteaof' command to execute AOFRW immediately. 9. Support upgrade (load) data from old version redis. 10. Add `appenddirname` configuration, as the directory name of the append only files. All AOF files and manifest file will be placed in this directory. 11. Only the last AOF file (BASE or INCR) can be truncated. Otherwise redis will exit even if `aof-load-truncated` is enabled. Co-authored-by: Oran Agra <oran@redislabs.com>
2022-01-04 01:14:13 +08:00
if (redis_stat(aof_filepath, &sb) == -1) {
serverLog(LL_WARNING, "Unable to obtain the AOF file %s length. stat: %s",
filename, strerror(errno));
size = 0;
} else {
Implement Multi Part AOF mechanism to avoid AOFRW overheads. (#9788) Implement Multi-Part AOF mechanism to avoid overheads during AOFRW. Introducing a folder with multiple AOF files tracked by a manifest file. The main issues with the the original AOFRW mechanism are: * buffering of commands that are processed during rewrite (consuming a lot of RAM) * freezes of the main process when the AOFRW completes to drain the remaining part of the buffer and fsync it. * double disk IO for the data that arrives during AOFRW (had to be written to both the old and new AOF files) The main modifications of this PR: 1. Remove the AOF rewrite buffer and related code. 2. Divide the AOF into multiple files, they are classified as two types, one is the the `BASE` type, it represents the full amount of data (Maybe AOF or RDB format) after each AOFRW, there is only one `BASE` file at most. The second is `INCR` type, may have more than one. They represent the incremental commands since the last AOFRW. 3. Use a AOF manifest file to record and manage these AOF files mentioned above. 4. The original configuration of `appendfilename` will be the base part of the new file name, for example: `appendonly.aof.1.base.rdb` and `appendonly.aof.2.incr.aof` 5. Add manifest-related TCL tests, and modified some existing tests that depend on the `appendfilename` 6. Remove the `aof_rewrite_buffer_length` field in info. 7. Add `aof-disable-auto-gc` configuration. By default we're automatically deleting HISTORY type AOFs. It also gives users the opportunity to preserve the history AOFs. just for testing use now. 8. Add AOFRW limiting measure. When the AOFRW failures reaches the threshold (3 times now), we will delay the execution of the next AOFRW by 1 minute. If the next AOFRW also fails, it will be delayed by 2 minutes. The next is 4, 8, 16, the maximum delay is 60 minutes (1 hour). During the limit period, we can still use the 'bgrewriteaof' command to execute AOFRW immediately. 9. Support upgrade (load) data from old version redis. 10. Add `appenddirname` configuration, as the directory name of the append only files. All AOF files and manifest file will be placed in this directory. 11. Only the last AOF file (BASE or INCR) can be truncated. Otherwise redis will exit even if `aof-load-truncated` is enabled. Co-authored-by: Oran Agra <oran@redislabs.com>
2022-01-04 01:14:13 +08:00
size = sb.st_size;
}
latencyEndMonitor(latency);
Implement Multi Part AOF mechanism to avoid AOFRW overheads. (#9788) Implement Multi-Part AOF mechanism to avoid overheads during AOFRW. Introducing a folder with multiple AOF files tracked by a manifest file. The main issues with the the original AOFRW mechanism are: * buffering of commands that are processed during rewrite (consuming a lot of RAM) * freezes of the main process when the AOFRW completes to drain the remaining part of the buffer and fsync it. * double disk IO for the data that arrives during AOFRW (had to be written to both the old and new AOF files) The main modifications of this PR: 1. Remove the AOF rewrite buffer and related code. 2. Divide the AOF into multiple files, they are classified as two types, one is the the `BASE` type, it represents the full amount of data (Maybe AOF or RDB format) after each AOFRW, there is only one `BASE` file at most. The second is `INCR` type, may have more than one. They represent the incremental commands since the last AOFRW. 3. Use a AOF manifest file to record and manage these AOF files mentioned above. 4. The original configuration of `appendfilename` will be the base part of the new file name, for example: `appendonly.aof.1.base.rdb` and `appendonly.aof.2.incr.aof` 5. Add manifest-related TCL tests, and modified some existing tests that depend on the `appendfilename` 6. Remove the `aof_rewrite_buffer_length` field in info. 7. Add `aof-disable-auto-gc` configuration. By default we're automatically deleting HISTORY type AOFs. It also gives users the opportunity to preserve the history AOFs. just for testing use now. 8. Add AOFRW limiting measure. When the AOFRW failures reaches the threshold (3 times now), we will delay the execution of the next AOFRW by 1 minute. If the next AOFRW also fails, it will be delayed by 2 minutes. The next is 4, 8, 16, the maximum delay is 60 minutes (1 hour). During the limit period, we can still use the 'bgrewriteaof' command to execute AOFRW immediately. 9. Support upgrade (load) data from old version redis. 10. Add `appenddirname` configuration, as the directory name of the append only files. All AOF files and manifest file will be placed in this directory. 11. Only the last AOF file (BASE or INCR) can be truncated. Otherwise redis will exit even if `aof-load-truncated` is enabled. Co-authored-by: Oran Agra <oran@redislabs.com>
2022-01-04 01:14:13 +08:00
latencyAddSampleIfNeeded("aof-fstat", latency);
sdsfree(aof_filepath);
return size;
}
off_t getBaseAndIncrAppendOnlyFilesSize(aofManifest *am) {
off_t size = 0;
listNode *ln;
listIter li;
if (am->base_aof_info) {
serverAssert(am->base_aof_info->file_type == AOF_FILE_TYPE_BASE);
size += getAppendOnlyFileSize(am->base_aof_info->file_name);
}
listRewind(am->incr_aof_list, &li);
while ((ln = listNext(&li)) != NULL) {
aofInfo *ai = (aofInfo*)ln->value;
serverAssert(ai->file_type == AOF_FILE_TYPE_INCR);
size += getAppendOnlyFileSize(ai->file_name);
}
return size;
}
int getBaseAndIncrAppendOnlyFilesNum(aofManifest *am) {
int num = 0;
if (am->base_aof_info) num++;
if (am->incr_aof_list) num += listLength(am->incr_aof_list);
return num;
}
/* A background append only file rewriting (BGREWRITEAOF) terminated its work.
* Handle this. */
void backgroundRewriteDoneHandler(int exitcode, int bysignal) {
if (!bysignal && exitcode == 0) {
char tmpfile[256];
long long now = ustime();
Implement Multi Part AOF mechanism to avoid AOFRW overheads. (#9788) Implement Multi-Part AOF mechanism to avoid overheads during AOFRW. Introducing a folder with multiple AOF files tracked by a manifest file. The main issues with the the original AOFRW mechanism are: * buffering of commands that are processed during rewrite (consuming a lot of RAM) * freezes of the main process when the AOFRW completes to drain the remaining part of the buffer and fsync it. * double disk IO for the data that arrives during AOFRW (had to be written to both the old and new AOF files) The main modifications of this PR: 1. Remove the AOF rewrite buffer and related code. 2. Divide the AOF into multiple files, they are classified as two types, one is the the `BASE` type, it represents the full amount of data (Maybe AOF or RDB format) after each AOFRW, there is only one `BASE` file at most. The second is `INCR` type, may have more than one. They represent the incremental commands since the last AOFRW. 3. Use a AOF manifest file to record and manage these AOF files mentioned above. 4. The original configuration of `appendfilename` will be the base part of the new file name, for example: `appendonly.aof.1.base.rdb` and `appendonly.aof.2.incr.aof` 5. Add manifest-related TCL tests, and modified some existing tests that depend on the `appendfilename` 6. Remove the `aof_rewrite_buffer_length` field in info. 7. Add `aof-disable-auto-gc` configuration. By default we're automatically deleting HISTORY type AOFs. It also gives users the opportunity to preserve the history AOFs. just for testing use now. 8. Add AOFRW limiting measure. When the AOFRW failures reaches the threshold (3 times now), we will delay the execution of the next AOFRW by 1 minute. If the next AOFRW also fails, it will be delayed by 2 minutes. The next is 4, 8, 16, the maximum delay is 60 minutes (1 hour). During the limit period, we can still use the 'bgrewriteaof' command to execute AOFRW immediately. 9. Support upgrade (load) data from old version redis. 10. Add `appenddirname` configuration, as the directory name of the append only files. All AOF files and manifest file will be placed in this directory. 11. Only the last AOF file (BASE or INCR) can be truncated. Otherwise redis will exit even if `aof-load-truncated` is enabled. Co-authored-by: Oran Agra <oran@redislabs.com>
2022-01-04 01:14:13 +08:00
sds new_base_filename;
aofManifest *temp_am;
mstime_t latency;
2015-07-27 09:41:48 +02:00
serverLog(LL_NOTICE,
"Background AOF rewrite terminated with success");
Implement Multi Part AOF mechanism to avoid AOFRW overheads. (#9788) Implement Multi-Part AOF mechanism to avoid overheads during AOFRW. Introducing a folder with multiple AOF files tracked by a manifest file. The main issues with the the original AOFRW mechanism are: * buffering of commands that are processed during rewrite (consuming a lot of RAM) * freezes of the main process when the AOFRW completes to drain the remaining part of the buffer and fsync it. * double disk IO for the data that arrives during AOFRW (had to be written to both the old and new AOF files) The main modifications of this PR: 1. Remove the AOF rewrite buffer and related code. 2. Divide the AOF into multiple files, they are classified as two types, one is the the `BASE` type, it represents the full amount of data (Maybe AOF or RDB format) after each AOFRW, there is only one `BASE` file at most. The second is `INCR` type, may have more than one. They represent the incremental commands since the last AOFRW. 3. Use a AOF manifest file to record and manage these AOF files mentioned above. 4. The original configuration of `appendfilename` will be the base part of the new file name, for example: `appendonly.aof.1.base.rdb` and `appendonly.aof.2.incr.aof` 5. Add manifest-related TCL tests, and modified some existing tests that depend on the `appendfilename` 6. Remove the `aof_rewrite_buffer_length` field in info. 7. Add `aof-disable-auto-gc` configuration. By default we're automatically deleting HISTORY type AOFs. It also gives users the opportunity to preserve the history AOFs. just for testing use now. 8. Add AOFRW limiting measure. When the AOFRW failures reaches the threshold (3 times now), we will delay the execution of the next AOFRW by 1 minute. If the next AOFRW also fails, it will be delayed by 2 minutes. The next is 4, 8, 16, the maximum delay is 60 minutes (1 hour). During the limit period, we can still use the 'bgrewriteaof' command to execute AOFRW immediately. 9. Support upgrade (load) data from old version redis. 10. Add `appenddirname` configuration, as the directory name of the append only files. All AOF files and manifest file will be placed in this directory. 11. Only the last AOF file (BASE or INCR) can be truncated. Otherwise redis will exit even if `aof-load-truncated` is enabled. Co-authored-by: Oran Agra <oran@redislabs.com>
2022-01-04 01:14:13 +08:00
snprintf(tmpfile, 256, "temp-rewriteaof-bg-%d.aof",
(int)server.child_pid);
Implement Multi Part AOF mechanism to avoid AOFRW overheads. (#9788) Implement Multi-Part AOF mechanism to avoid overheads during AOFRW. Introducing a folder with multiple AOF files tracked by a manifest file. The main issues with the the original AOFRW mechanism are: * buffering of commands that are processed during rewrite (consuming a lot of RAM) * freezes of the main process when the AOFRW completes to drain the remaining part of the buffer and fsync it. * double disk IO for the data that arrives during AOFRW (had to be written to both the old and new AOF files) The main modifications of this PR: 1. Remove the AOF rewrite buffer and related code. 2. Divide the AOF into multiple files, they are classified as two types, one is the the `BASE` type, it represents the full amount of data (Maybe AOF or RDB format) after each AOFRW, there is only one `BASE` file at most. The second is `INCR` type, may have more than one. They represent the incremental commands since the last AOFRW. 3. Use a AOF manifest file to record and manage these AOF files mentioned above. 4. The original configuration of `appendfilename` will be the base part of the new file name, for example: `appendonly.aof.1.base.rdb` and `appendonly.aof.2.incr.aof` 5. Add manifest-related TCL tests, and modified some existing tests that depend on the `appendfilename` 6. Remove the `aof_rewrite_buffer_length` field in info. 7. Add `aof-disable-auto-gc` configuration. By default we're automatically deleting HISTORY type AOFs. It also gives users the opportunity to preserve the history AOFs. just for testing use now. 8. Add AOFRW limiting measure. When the AOFRW failures reaches the threshold (3 times now), we will delay the execution of the next AOFRW by 1 minute. If the next AOFRW also fails, it will be delayed by 2 minutes. The next is 4, 8, 16, the maximum delay is 60 minutes (1 hour). During the limit period, we can still use the 'bgrewriteaof' command to execute AOFRW immediately. 9. Support upgrade (load) data from old version redis. 10. Add `appenddirname` configuration, as the directory name of the append only files. All AOF files and manifest file will be placed in this directory. 11. Only the last AOF file (BASE or INCR) can be truncated. Otherwise redis will exit even if `aof-load-truncated` is enabled. Co-authored-by: Oran Agra <oran@redislabs.com>
2022-01-04 01:14:13 +08:00
serverAssert(server.aof_manifest != NULL);
Implement Multi Part AOF mechanism to avoid AOFRW overheads. (#9788) Implement Multi-Part AOF mechanism to avoid overheads during AOFRW. Introducing a folder with multiple AOF files tracked by a manifest file. The main issues with the the original AOFRW mechanism are: * buffering of commands that are processed during rewrite (consuming a lot of RAM) * freezes of the main process when the AOFRW completes to drain the remaining part of the buffer and fsync it. * double disk IO for the data that arrives during AOFRW (had to be written to both the old and new AOF files) The main modifications of this PR: 1. Remove the AOF rewrite buffer and related code. 2. Divide the AOF into multiple files, they are classified as two types, one is the the `BASE` type, it represents the full amount of data (Maybe AOF or RDB format) after each AOFRW, there is only one `BASE` file at most. The second is `INCR` type, may have more than one. They represent the incremental commands since the last AOFRW. 3. Use a AOF manifest file to record and manage these AOF files mentioned above. 4. The original configuration of `appendfilename` will be the base part of the new file name, for example: `appendonly.aof.1.base.rdb` and `appendonly.aof.2.incr.aof` 5. Add manifest-related TCL tests, and modified some existing tests that depend on the `appendfilename` 6. Remove the `aof_rewrite_buffer_length` field in info. 7. Add `aof-disable-auto-gc` configuration. By default we're automatically deleting HISTORY type AOFs. It also gives users the opportunity to preserve the history AOFs. just for testing use now. 8. Add AOFRW limiting measure. When the AOFRW failures reaches the threshold (3 times now), we will delay the execution of the next AOFRW by 1 minute. If the next AOFRW also fails, it will be delayed by 2 minutes. The next is 4, 8, 16, the maximum delay is 60 minutes (1 hour). During the limit period, we can still use the 'bgrewriteaof' command to execute AOFRW immediately. 9. Support upgrade (load) data from old version redis. 10. Add `appenddirname` configuration, as the directory name of the append only files. All AOF files and manifest file will be placed in this directory. 11. Only the last AOF file (BASE or INCR) can be truncated. Otherwise redis will exit even if `aof-load-truncated` is enabled. Co-authored-by: Oran Agra <oran@redislabs.com>
2022-01-04 01:14:13 +08:00
/* Dup a temporary aof_manifest for subsequent modifications. */
temp_am = aofManifestDup(server.aof_manifest);
/* Get a new BASE file name and mark the previous (if we have)
* as the HISTORY type. */
new_base_filename = getNewBaseFileNameAndMarkPreAsHistory(temp_am);
serverAssert(new_base_filename != NULL);
sds new_base_filepath = makePath(server.aof_dirname, new_base_filename);
Implement Multi Part AOF mechanism to avoid AOFRW overheads. (#9788) Implement Multi-Part AOF mechanism to avoid overheads during AOFRW. Introducing a folder with multiple AOF files tracked by a manifest file. The main issues with the the original AOFRW mechanism are: * buffering of commands that are processed during rewrite (consuming a lot of RAM) * freezes of the main process when the AOFRW completes to drain the remaining part of the buffer and fsync it. * double disk IO for the data that arrives during AOFRW (had to be written to both the old and new AOF files) The main modifications of this PR: 1. Remove the AOF rewrite buffer and related code. 2. Divide the AOF into multiple files, they are classified as two types, one is the the `BASE` type, it represents the full amount of data (Maybe AOF or RDB format) after each AOFRW, there is only one `BASE` file at most. The second is `INCR` type, may have more than one. They represent the incremental commands since the last AOFRW. 3. Use a AOF manifest file to record and manage these AOF files mentioned above. 4. The original configuration of `appendfilename` will be the base part of the new file name, for example: `appendonly.aof.1.base.rdb` and `appendonly.aof.2.incr.aof` 5. Add manifest-related TCL tests, and modified some existing tests that depend on the `appendfilename` 6. Remove the `aof_rewrite_buffer_length` field in info. 7. Add `aof-disable-auto-gc` configuration. By default we're automatically deleting HISTORY type AOFs. It also gives users the opportunity to preserve the history AOFs. just for testing use now. 8. Add AOFRW limiting measure. When the AOFRW failures reaches the threshold (3 times now), we will delay the execution of the next AOFRW by 1 minute. If the next AOFRW also fails, it will be delayed by 2 minutes. The next is 4, 8, 16, the maximum delay is 60 minutes (1 hour). During the limit period, we can still use the 'bgrewriteaof' command to execute AOFRW immediately. 9. Support upgrade (load) data from old version redis. 10. Add `appenddirname` configuration, as the directory name of the append only files. All AOF files and manifest file will be placed in this directory. 11. Only the last AOF file (BASE or INCR) can be truncated. Otherwise redis will exit even if `aof-load-truncated` is enabled. Co-authored-by: Oran Agra <oran@redislabs.com>
2022-01-04 01:14:13 +08:00
/* Rename the temporary aof file to 'new_base_filename'. */
latencyStartMonitor(latency);
Implement Multi Part AOF mechanism to avoid AOFRW overheads. (#9788) Implement Multi-Part AOF mechanism to avoid overheads during AOFRW. Introducing a folder with multiple AOF files tracked by a manifest file. The main issues with the the original AOFRW mechanism are: * buffering of commands that are processed during rewrite (consuming a lot of RAM) * freezes of the main process when the AOFRW completes to drain the remaining part of the buffer and fsync it. * double disk IO for the data that arrives during AOFRW (had to be written to both the old and new AOF files) The main modifications of this PR: 1. Remove the AOF rewrite buffer and related code. 2. Divide the AOF into multiple files, they are classified as two types, one is the the `BASE` type, it represents the full amount of data (Maybe AOF or RDB format) after each AOFRW, there is only one `BASE` file at most. The second is `INCR` type, may have more than one. They represent the incremental commands since the last AOFRW. 3. Use a AOF manifest file to record and manage these AOF files mentioned above. 4. The original configuration of `appendfilename` will be the base part of the new file name, for example: `appendonly.aof.1.base.rdb` and `appendonly.aof.2.incr.aof` 5. Add manifest-related TCL tests, and modified some existing tests that depend on the `appendfilename` 6. Remove the `aof_rewrite_buffer_length` field in info. 7. Add `aof-disable-auto-gc` configuration. By default we're automatically deleting HISTORY type AOFs. It also gives users the opportunity to preserve the history AOFs. just for testing use now. 8. Add AOFRW limiting measure. When the AOFRW failures reaches the threshold (3 times now), we will delay the execution of the next AOFRW by 1 minute. If the next AOFRW also fails, it will be delayed by 2 minutes. The next is 4, 8, 16, the maximum delay is 60 minutes (1 hour). During the limit period, we can still use the 'bgrewriteaof' command to execute AOFRW immediately. 9. Support upgrade (load) data from old version redis. 10. Add `appenddirname` configuration, as the directory name of the append only files. All AOF files and manifest file will be placed in this directory. 11. Only the last AOF file (BASE or INCR) can be truncated. Otherwise redis will exit even if `aof-load-truncated` is enabled. Co-authored-by: Oran Agra <oran@redislabs.com>
2022-01-04 01:14:13 +08:00
if (rename(tmpfile, new_base_filepath) == -1) {
2015-07-27 09:41:48 +02:00
serverLog(LL_WARNING,
"Error trying to rename the temporary AOF file %s into %s: %s",
tmpfile,
Implement Multi Part AOF mechanism to avoid AOFRW overheads. (#9788) Implement Multi-Part AOF mechanism to avoid overheads during AOFRW. Introducing a folder with multiple AOF files tracked by a manifest file. The main issues with the the original AOFRW mechanism are: * buffering of commands that are processed during rewrite (consuming a lot of RAM) * freezes of the main process when the AOFRW completes to drain the remaining part of the buffer and fsync it. * double disk IO for the data that arrives during AOFRW (had to be written to both the old and new AOF files) The main modifications of this PR: 1. Remove the AOF rewrite buffer and related code. 2. Divide the AOF into multiple files, they are classified as two types, one is the the `BASE` type, it represents the full amount of data (Maybe AOF or RDB format) after each AOFRW, there is only one `BASE` file at most. The second is `INCR` type, may have more than one. They represent the incremental commands since the last AOFRW. 3. Use a AOF manifest file to record and manage these AOF files mentioned above. 4. The original configuration of `appendfilename` will be the base part of the new file name, for example: `appendonly.aof.1.base.rdb` and `appendonly.aof.2.incr.aof` 5. Add manifest-related TCL tests, and modified some existing tests that depend on the `appendfilename` 6. Remove the `aof_rewrite_buffer_length` field in info. 7. Add `aof-disable-auto-gc` configuration. By default we're automatically deleting HISTORY type AOFs. It also gives users the opportunity to preserve the history AOFs. just for testing use now. 8. Add AOFRW limiting measure. When the AOFRW failures reaches the threshold (3 times now), we will delay the execution of the next AOFRW by 1 minute. If the next AOFRW also fails, it will be delayed by 2 minutes. The next is 4, 8, 16, the maximum delay is 60 minutes (1 hour). During the limit period, we can still use the 'bgrewriteaof' command to execute AOFRW immediately. 9. Support upgrade (load) data from old version redis. 10. Add `appenddirname` configuration, as the directory name of the append only files. All AOF files and manifest file will be placed in this directory. 11. Only the last AOF file (BASE or INCR) can be truncated. Otherwise redis will exit even if `aof-load-truncated` is enabled. Co-authored-by: Oran Agra <oran@redislabs.com>
2022-01-04 01:14:13 +08:00
new_base_filename,
strerror(errno));
Implement Multi Part AOF mechanism to avoid AOFRW overheads. (#9788) Implement Multi-Part AOF mechanism to avoid overheads during AOFRW. Introducing a folder with multiple AOF files tracked by a manifest file. The main issues with the the original AOFRW mechanism are: * buffering of commands that are processed during rewrite (consuming a lot of RAM) * freezes of the main process when the AOFRW completes to drain the remaining part of the buffer and fsync it. * double disk IO for the data that arrives during AOFRW (had to be written to both the old and new AOF files) The main modifications of this PR: 1. Remove the AOF rewrite buffer and related code. 2. Divide the AOF into multiple files, they are classified as two types, one is the the `BASE` type, it represents the full amount of data (Maybe AOF or RDB format) after each AOFRW, there is only one `BASE` file at most. The second is `INCR` type, may have more than one. They represent the incremental commands since the last AOFRW. 3. Use a AOF manifest file to record and manage these AOF files mentioned above. 4. The original configuration of `appendfilename` will be the base part of the new file name, for example: `appendonly.aof.1.base.rdb` and `appendonly.aof.2.incr.aof` 5. Add manifest-related TCL tests, and modified some existing tests that depend on the `appendfilename` 6. Remove the `aof_rewrite_buffer_length` field in info. 7. Add `aof-disable-auto-gc` configuration. By default we're automatically deleting HISTORY type AOFs. It also gives users the opportunity to preserve the history AOFs. just for testing use now. 8. Add AOFRW limiting measure. When the AOFRW failures reaches the threshold (3 times now), we will delay the execution of the next AOFRW by 1 minute. If the next AOFRW also fails, it will be delayed by 2 minutes. The next is 4, 8, 16, the maximum delay is 60 minutes (1 hour). During the limit period, we can still use the 'bgrewriteaof' command to execute AOFRW immediately. 9. Support upgrade (load) data from old version redis. 10. Add `appenddirname` configuration, as the directory name of the append only files. All AOF files and manifest file will be placed in this directory. 11. Only the last AOF file (BASE or INCR) can be truncated. Otherwise redis will exit even if `aof-load-truncated` is enabled. Co-authored-by: Oran Agra <oran@redislabs.com>
2022-01-04 01:14:13 +08:00
aofManifestFree(temp_am);
sdsfree(new_base_filepath);
goto cleanup;
}
latencyEndMonitor(latency);
Implement Multi Part AOF mechanism to avoid AOFRW overheads. (#9788) Implement Multi-Part AOF mechanism to avoid overheads during AOFRW. Introducing a folder with multiple AOF files tracked by a manifest file. The main issues with the the original AOFRW mechanism are: * buffering of commands that are processed during rewrite (consuming a lot of RAM) * freezes of the main process when the AOFRW completes to drain the remaining part of the buffer and fsync it. * double disk IO for the data that arrives during AOFRW (had to be written to both the old and new AOF files) The main modifications of this PR: 1. Remove the AOF rewrite buffer and related code. 2. Divide the AOF into multiple files, they are classified as two types, one is the the `BASE` type, it represents the full amount of data (Maybe AOF or RDB format) after each AOFRW, there is only one `BASE` file at most. The second is `INCR` type, may have more than one. They represent the incremental commands since the last AOFRW. 3. Use a AOF manifest file to record and manage these AOF files mentioned above. 4. The original configuration of `appendfilename` will be the base part of the new file name, for example: `appendonly.aof.1.base.rdb` and `appendonly.aof.2.incr.aof` 5. Add manifest-related TCL tests, and modified some existing tests that depend on the `appendfilename` 6. Remove the `aof_rewrite_buffer_length` field in info. 7. Add `aof-disable-auto-gc` configuration. By default we're automatically deleting HISTORY type AOFs. It also gives users the opportunity to preserve the history AOFs. just for testing use now. 8. Add AOFRW limiting measure. When the AOFRW failures reaches the threshold (3 times now), we will delay the execution of the next AOFRW by 1 minute. If the next AOFRW also fails, it will be delayed by 2 minutes. The next is 4, 8, 16, the maximum delay is 60 minutes (1 hour). During the limit period, we can still use the 'bgrewriteaof' command to execute AOFRW immediately. 9. Support upgrade (load) data from old version redis. 10. Add `appenddirname` configuration, as the directory name of the append only files. All AOF files and manifest file will be placed in this directory. 11. Only the last AOF file (BASE or INCR) can be truncated. Otherwise redis will exit even if `aof-load-truncated` is enabled. Co-authored-by: Oran Agra <oran@redislabs.com>
2022-01-04 01:14:13 +08:00
latencyAddSampleIfNeeded("aof-rename", latency);
Implement Multi Part AOF mechanism to avoid AOFRW overheads. (#9788) Implement Multi-Part AOF mechanism to avoid overheads during AOFRW. Introducing a folder with multiple AOF files tracked by a manifest file. The main issues with the the original AOFRW mechanism are: * buffering of commands that are processed during rewrite (consuming a lot of RAM) * freezes of the main process when the AOFRW completes to drain the remaining part of the buffer and fsync it. * double disk IO for the data that arrives during AOFRW (had to be written to both the old and new AOF files) The main modifications of this PR: 1. Remove the AOF rewrite buffer and related code. 2. Divide the AOF into multiple files, they are classified as two types, one is the the `BASE` type, it represents the full amount of data (Maybe AOF or RDB format) after each AOFRW, there is only one `BASE` file at most. The second is `INCR` type, may have more than one. They represent the incremental commands since the last AOFRW. 3. Use a AOF manifest file to record and manage these AOF files mentioned above. 4. The original configuration of `appendfilename` will be the base part of the new file name, for example: `appendonly.aof.1.base.rdb` and `appendonly.aof.2.incr.aof` 5. Add manifest-related TCL tests, and modified some existing tests that depend on the `appendfilename` 6. Remove the `aof_rewrite_buffer_length` field in info. 7. Add `aof-disable-auto-gc` configuration. By default we're automatically deleting HISTORY type AOFs. It also gives users the opportunity to preserve the history AOFs. just for testing use now. 8. Add AOFRW limiting measure. When the AOFRW failures reaches the threshold (3 times now), we will delay the execution of the next AOFRW by 1 minute. If the next AOFRW also fails, it will be delayed by 2 minutes. The next is 4, 8, 16, the maximum delay is 60 minutes (1 hour). During the limit period, we can still use the 'bgrewriteaof' command to execute AOFRW immediately. 9. Support upgrade (load) data from old version redis. 10. Add `appenddirname` configuration, as the directory name of the append only files. All AOF files and manifest file will be placed in this directory. 11. Only the last AOF file (BASE or INCR) can be truncated. Otherwise redis will exit even if `aof-load-truncated` is enabled. Co-authored-by: Oran Agra <oran@redislabs.com>
2022-01-04 01:14:13 +08:00
/* Change the AOF file type in 'incr_aof_list' from AOF_FILE_TYPE_INCR
* to AOF_FILE_TYPE_HIST, and move them to the 'history_aof_list'. */
markRewrittenIncrAofAsHistory(temp_am);
/* Persist our modifications. */
if (persistAofManifest(temp_am) == C_ERR) {
bg_unlink(new_base_filepath);
aofManifestFree(temp_am);
sdsfree(new_base_filepath);
goto cleanup;
}
sdsfree(new_base_filepath);
/* We can safely let `server.aof_manifest` point to 'temp_am' and free the previous one. */
aofManifestFreeAndUpdate(temp_am);
if (server.aof_fd != -1) {
/* AOF enabled. */
2011-12-21 12:17:02 +01:00
server.aof_selected_db = -1; /* Make sure SELECT is re-issued */
Implement Multi Part AOF mechanism to avoid AOFRW overheads. (#9788) Implement Multi-Part AOF mechanism to avoid overheads during AOFRW. Introducing a folder with multiple AOF files tracked by a manifest file. The main issues with the the original AOFRW mechanism are: * buffering of commands that are processed during rewrite (consuming a lot of RAM) * freezes of the main process when the AOFRW completes to drain the remaining part of the buffer and fsync it. * double disk IO for the data that arrives during AOFRW (had to be written to both the old and new AOF files) The main modifications of this PR: 1. Remove the AOF rewrite buffer and related code. 2. Divide the AOF into multiple files, they are classified as two types, one is the the `BASE` type, it represents the full amount of data (Maybe AOF or RDB format) after each AOFRW, there is only one `BASE` file at most. The second is `INCR` type, may have more than one. They represent the incremental commands since the last AOFRW. 3. Use a AOF manifest file to record and manage these AOF files mentioned above. 4. The original configuration of `appendfilename` will be the base part of the new file name, for example: `appendonly.aof.1.base.rdb` and `appendonly.aof.2.incr.aof` 5. Add manifest-related TCL tests, and modified some existing tests that depend on the `appendfilename` 6. Remove the `aof_rewrite_buffer_length` field in info. 7. Add `aof-disable-auto-gc` configuration. By default we're automatically deleting HISTORY type AOFs. It also gives users the opportunity to preserve the history AOFs. just for testing use now. 8. Add AOFRW limiting measure. When the AOFRW failures reaches the threshold (3 times now), we will delay the execution of the next AOFRW by 1 minute. If the next AOFRW also fails, it will be delayed by 2 minutes. The next is 4, 8, 16, the maximum delay is 60 minutes (1 hour). During the limit period, we can still use the 'bgrewriteaof' command to execute AOFRW immediately. 9. Support upgrade (load) data from old version redis. 10. Add `appenddirname` configuration, as the directory name of the append only files. All AOF files and manifest file will be placed in this directory. 11. Only the last AOF file (BASE or INCR) can be truncated. Otherwise redis will exit even if `aof-load-truncated` is enabled. Co-authored-by: Oran Agra <oran@redislabs.com>
2022-01-04 01:14:13 +08:00
server.aof_current_size = getAppendOnlyFileSize(new_base_filename) + server.aof_last_incr_size;
server.aof_rewrite_base_size = server.aof_current_size;
server.aof_fsync_offset = server.aof_current_size;
server.aof_last_fsync = server.unixtime;
}
Implement Multi Part AOF mechanism to avoid AOFRW overheads. (#9788) Implement Multi-Part AOF mechanism to avoid overheads during AOFRW. Introducing a folder with multiple AOF files tracked by a manifest file. The main issues with the the original AOFRW mechanism are: * buffering of commands that are processed during rewrite (consuming a lot of RAM) * freezes of the main process when the AOFRW completes to drain the remaining part of the buffer and fsync it. * double disk IO for the data that arrives during AOFRW (had to be written to both the old and new AOF files) The main modifications of this PR: 1. Remove the AOF rewrite buffer and related code. 2. Divide the AOF into multiple files, they are classified as two types, one is the the `BASE` type, it represents the full amount of data (Maybe AOF or RDB format) after each AOFRW, there is only one `BASE` file at most. The second is `INCR` type, may have more than one. They represent the incremental commands since the last AOFRW. 3. Use a AOF manifest file to record and manage these AOF files mentioned above. 4. The original configuration of `appendfilename` will be the base part of the new file name, for example: `appendonly.aof.1.base.rdb` and `appendonly.aof.2.incr.aof` 5. Add manifest-related TCL tests, and modified some existing tests that depend on the `appendfilename` 6. Remove the `aof_rewrite_buffer_length` field in info. 7. Add `aof-disable-auto-gc` configuration. By default we're automatically deleting HISTORY type AOFs. It also gives users the opportunity to preserve the history AOFs. just for testing use now. 8. Add AOFRW limiting measure. When the AOFRW failures reaches the threshold (3 times now), we will delay the execution of the next AOFRW by 1 minute. If the next AOFRW also fails, it will be delayed by 2 minutes. The next is 4, 8, 16, the maximum delay is 60 minutes (1 hour). During the limit period, we can still use the 'bgrewriteaof' command to execute AOFRW immediately. 9. Support upgrade (load) data from old version redis. 10. Add `appenddirname` configuration, as the directory name of the append only files. All AOF files and manifest file will be placed in this directory. 11. Only the last AOF file (BASE or INCR) can be truncated. Otherwise redis will exit even if `aof-load-truncated` is enabled. Co-authored-by: Oran Agra <oran@redislabs.com>
2022-01-04 01:14:13 +08:00
/* We don't care about the return value of `aofDelHistoryFiles`, because the history
* deletion failure will not cause any problems. */
aofDelHistoryFiles();
server.aof_lastbgrewrite_status = C_OK;
2015-07-27 09:41:48 +02:00
serverLog(LL_NOTICE, "Background AOF rewrite finished successfully");
/* Change state from WAIT_REWRITE to ON if needed */
2015-07-27 09:41:48 +02:00
if (server.aof_state == AOF_WAIT_REWRITE)
server.aof_state = AOF_ON;
2015-07-27 09:41:48 +02:00
serverLog(LL_VERBOSE,
"Background AOF rewrite signal handler took %lldus", ustime()-now);
} else if (!bysignal && exitcode != 0) {
server.aof_lastbgrewrite_status = C_ERR;
serverLog(LL_WARNING,
"Background AOF rewrite terminated with error");
} else {
/* SIGUSR1 is whitelisted, so we have a way to kill a child without
Squash merging 125 typo/grammar/comment/doc PRs (#7773) List of squashed commits or PRs =============================== commit 66801ea Author: hwware <wen.hui.ware@gmail.com> Date: Mon Jan 13 00:54:31 2020 -0500 typo fix in acl.c commit 46f55db Author: Itamar Haber <itamar@redislabs.com> Date: Sun Sep 6 18:24:11 2020 +0300 Updates a couple of comments Specifically: * RM_AutoMemory completed instead of pointing to docs * Updated link to custom type doc commit 61a2aa0 Author: xindoo <xindoo@qq.com> Date: Tue Sep 1 19:24:59 2020 +0800 Correct errors in code comments commit a5871d1 Author: yz1509 <pro-756@qq.com> Date: Tue Sep 1 18:36:06 2020 +0800 fix typos in module.c commit 41eede7 Author: bookug <bookug@qq.com> Date: Sat Aug 15 01:11:33 2020 +0800 docs: fix typos in comments commit c303c84 Author: lazy-snail <ws.niu@outlook.com> Date: Fri Aug 7 11:15:44 2020 +0800 fix spelling in redis.conf commit 1eb76bf Author: zhujian <zhujianxyz@gmail.com> Date: Thu Aug 6 15:22:10 2020 +0800 add a missing 'n' in comment commit 1530ec2 Author: Daniel Dai <764122422@qq.com> Date: Mon Jul 27 00:46:35 2020 -0400 fix spelling in tracking.c commit e517b31 Author: Hunter-Chen <huntcool001@gmail.com> Date: Fri Jul 17 22:33:32 2020 +0800 Update redis.conf Co-authored-by: Itamar Haber <itamar@redislabs.com> commit c300eff Author: Hunter-Chen <huntcool001@gmail.com> Date: Fri Jul 17 22:33:23 2020 +0800 Update redis.conf Co-authored-by: Itamar Haber <itamar@redislabs.com> commit 4c058a8 Author: 陈浩鹏 <chenhaopeng@heytea.com> Date: Thu Jun 25 19:00:56 2020 +0800 Grammar fix and clarification commit 5fcaa81 Author: bodong.ybd <bodong.ybd@alibaba-inc.com> Date: Fri Jun 19 10:09:00 2020 +0800 Fix typos commit 4caca9a Author: Pruthvi P <pruthvi@ixigo.com> Date: Fri May 22 00:33:22 2020 +0530 Fix typo eviciton => eviction commit b2a25f6 Author: Brad Dunbar <dunbarb2@gmail.com> Date: Sun May 17 12:39:59 2020 -0400 Fix a typo. commit 12842ae Author: hwware <wen.hui.ware@gmail.com> Date: Sun May 3 17:16:59 2020 -0400 fix spelling in redis conf commit ddba07c Author: Chris Lamb <chris@chris-lamb.co.uk> Date: Sat May 2 23:25:34 2020 +0100 Correct a "conflicts" spelling error. commit 8fc7bf2 Author: Nao YONASHIRO <yonashiro@r.recruit.co.jp> Date: Thu Apr 30 10:25:27 2020 +0900 docs: fix EXPIRE_FAST_CYCLE_DURATION to ACTIVE_EXPIRE_CYCLE_FAST_DURATION commit 9b2b67a Author: Brad Dunbar <dunbarb2@gmail.com> Date: Fri Apr 24 11:46:22 2020 -0400 Fix a typo. commit 0746f10 Author: devilinrust <63737265+devilinrust@users.noreply.github.com> Date: Thu Apr 16 00:17:53 2020 +0200 Fix typos in server.c commit 92b588d Author: benjessop12 <56115861+benjessop12@users.noreply.github.com> Date: Mon Apr 13 13:43:55 2020 +0100 Fix spelling mistake in lazyfree.c commit 1da37aa Merge: 2d4ba28 af347a8 Author: hwware <wen.hui.ware@gmail.com> Date: Thu Mar 5 22:41:31 2020 -0500 Merge remote-tracking branch 'upstream/unstable' into expiretypofix commit 2d4ba28 Author: hwware <wen.hui.ware@gmail.com> Date: Mon Mar 2 00:09:40 2020 -0500 fix typo in expire.c commit 1a746f7 Author: SennoYuki <minakami1yuki@gmail.com> Date: Thu Feb 27 16:54:32 2020 +0800 fix typo commit 8599b1a Author: dongheejeong <donghee950403@gmail.com> Date: Sun Feb 16 20:31:43 2020 +0000 Fix typo in server.c commit f38d4e8 Author: hwware <wen.hui.ware@gmail.com> Date: Sun Feb 2 22:58:38 2020 -0500 fix typo in evict.c commit fe143fc Author: Leo Murillo <leonardo.murillo@gmail.com> Date: Sun Feb 2 01:57:22 2020 -0600 Fix a few typos in redis.conf commit 1ab4d21 Author: viraja1 <anchan.viraj@gmail.com> Date: Fri Dec 27 17:15:58 2019 +0530 Fix typo in Latency API docstring commit ca1f70e Author: gosth <danxuedexing@qq.com> Date: Wed Dec 18 15:18:02 2019 +0800 fix typo in sort.c commit a57c06b Author: ZYunH <zyunhjob@163.com> Date: Mon Dec 16 22:28:46 2019 +0800 fix-zset-typo commit b8c92b5 Author: git-hulk <hulk.website@gmail.com> Date: Mon Dec 16 15:51:42 2019 +0800 FIX: typo in cluster.c, onformation->information commit 9dd981c Author: wujm2007 <jim.wujm@gmail.com> Date: Mon Dec 16 09:37:52 2019 +0800 Fix typo commit e132d7a Author: Sebastien Williams-Wynn <s.williamswynn.mail@gmail.com> Date: Fri Nov 15 00:14:07 2019 +0000 Minor typo change commit 47f44d5 Author: happynote3966 <01ssrmikururudevice01@gmail.com> Date: Mon Nov 11 22:08:48 2019 +0900 fix comment typo in redis-cli.c commit b8bdb0d Author: fulei <fulei@kuaishou.com> Date: Wed Oct 16 18:00:17 2019 +0800 Fix a spelling mistake of comments in defragDictBucketCallback commit 0def46a Author: fulei <fulei@kuaishou.com> Date: Wed Oct 16 13:09:27 2019 +0800 fix some spelling mistakes of comments in defrag.c commit f3596fd Author: Phil Rajchgot <tophil@outlook.com> Date: Sun Oct 13 02:02:32 2019 -0400 Typo and grammar fixes Redis and its documentation are great -- just wanted to submit a few corrections in the spirit of Hacktoberfest. Thanks for all your work on this project. I use it all the time and it works beautifully. commit 2b928cd Author: KangZhiDong <worldkzd@gmail.com> Date: Sun Sep 1 07:03:11 2019 +0800 fix typos commit 33aea14 Author: Axlgrep <axlgrep@gmail.com> Date: Tue Aug 27 11:02:18 2019 +0800 Fixed eviction spelling issues commit e282a80 Author: Simen Flatby <simen@oms.no> Date: Tue Aug 20 15:25:51 2019 +0200 Update comments to reflect prop name In the comments the prop is referenced as replica-validity-factor, but it is really named cluster-replica-validity-factor. commit 74d1f9a Author: Jim Green <jimgreen2013@qq.com> Date: Tue Aug 20 20:00:31 2019 +0800 fix comment error, the code is ok commit eea1407 Author: Liao Tonglang <liaotonglang@gmail.com> Date: Fri May 31 10:16:18 2019 +0800 typo fix fix cna't to can't commit 0da553c Author: KAWACHI Takashi <tkawachi@gmail.com> Date: Wed Jul 17 00:38:16 2019 +0900 Fix typo commit 7fc8fb6 Author: Michael Prokop <mika@grml.org> Date: Tue May 28 17:58:42 2019 +0200 Typo fixes s/familar/familiar/ s/compatiblity/compatibility/ s/ ot / to / s/itsef/itself/ commit 5f46c9d Author: zhumoing <34539422+zhumoing@users.noreply.github.com> Date: Tue May 21 21:16:50 2019 +0800 typo-fixes typo-fixes commit 321dfe1 Author: wxisme <850885154@qq.com> Date: Sat Mar 16 15:10:55 2019 +0800 typo fix commit b4fb131 Merge: 267e0e6 3df1eb8 Author: Nikitas Bastas <nikitasbst@gmail.com> Date: Fri Feb 8 22:55:45 2019 +0200 Merge branch 'unstable' of antirez/redis into unstable commit 267e0e6 Author: Nikitas Bastas <nikitasbst@gmail.com> Date: Wed Jan 30 21:26:04 2019 +0200 Minor typo fix commit 30544e7 Author: inshal96 <39904558+inshal96@users.noreply.github.com> Date: Fri Jan 4 16:54:50 2019 +0500 remove an extra 'a' in the comments commit 337969d Author: BrotherGao <yangdongheng11@gmail.com> Date: Sat Dec 29 12:37:29 2018 +0800 fix typo in redis.conf commit 9f4b121 Merge: 423a030 e504583 Author: BrotherGao <yangdongheng@xiaomi.com> Date: Sat Dec 29 11:41:12 2018 +0800 Merge branch 'unstable' of antirez/redis into unstable commit 423a030 Merge: 42b02b7 46a51cd Author: 杨东衡 <yangdongheng@xiaomi.com> Date: Tue Dec 4 23:56:11 2018 +0800 Merge branch 'unstable' of antirez/redis into unstable commit 42b02b7 Merge: 68c0e6e b8febe6 Author: Dongheng Yang <yangdongheng11@gmail.com> Date: Sun Oct 28 15:54:23 2018 +0800 Merge pull request #1 from antirez/unstable update local data commit 714b589 Author: Christian <crifei93@gmail.com> Date: Fri Dec 28 01:17:26 2018 +0100 fix typo "resulution" commit e23259d Author: garenchan <1412950785@qq.com> Date: Wed Dec 26 09:58:35 2018 +0800 fix typo: segfauls -> segfault commit a9359f8 Author: xjp <jianping_xie@aliyun.com> Date: Tue Dec 18 17:31:44 2018 +0800 Fixed REDISMODULE_H spell bug commit a12c3e4 Author: jdiaz <jrd.palacios@gmail.com> Date: Sat Dec 15 23:39:52 2018 -0600 Fixes hyperloglog hash function comment block description commit 770eb11 Author: 林上耀 <1210tom@163.com> Date: Sun Nov 25 17:16:10 2018 +0800 fix typo commit fd97fbb Author: Chris Lamb <chris@chris-lamb.co.uk> Date: Fri Nov 23 17:14:01 2018 +0100 Correct "unsupported" typo. commit a85522d Author: Jungnam Lee <jungnam.lee@oracle.com> Date: Thu Nov 8 23:01:29 2018 +0900 fix typo in test comments commit ade8007 Author: Arun Kumar <palerdot@users.noreply.github.com> Date: Tue Oct 23 16:56:35 2018 +0530 Fixed grammatical typo Fixed typo for word 'dictionary' commit 869ee39 Author: Hamid Alaei <hamid.a85@gmail.com> Date: Sun Aug 12 16:40:02 2018 +0430 fix documentations: (ThreadSafeContextStart/Stop -> ThreadSafeContextLock/Unlock), minor typo commit f89d158 Author: Mayank Jain <mayankjain255@gmail.com> Date: Tue Jul 31 23:01:21 2018 +0530 Updated README.md with some spelling corrections. Made correction in spelling of some misspelled words. commit 892198e Author: dsomeshwar <someshwar.dhayalan@gmail.com> Date: Sat Jul 21 23:23:04 2018 +0530 typo fix commit 8a4d780 Author: Itamar Haber <itamar@redislabs.com> Date: Mon Apr 30 02:06:52 2018 +0300 Fixes some typos commit e3acef6 Author: Noah Rosamilia <ivoahivoah@gmail.com> Date: Sat Mar 3 23:41:21 2018 -0500 Fix typo in /deps/README.md commit 04442fb Author: WuYunlong <xzsyeb@126.com> Date: Sat Mar 3 10:32:42 2018 +0800 Fix typo in readSyncBulkPayload() comment. commit 9f36880 Author: WuYunlong <xzsyeb@126.com> Date: Sat Mar 3 10:20:37 2018 +0800 replication.c comment: run_id -> replid. commit f866b4a Author: Francesco 'makevoid' Canessa <makevoid@gmail.com> Date: Thu Feb 22 22:01:56 2018 +0000 fix comment typo in server.c commit 0ebc69b Author: 줍 <jubee0124@gmail.com> Date: Mon Feb 12 16:38:48 2018 +0900 Fix typo in redis.conf Fix `five behaviors` to `eight behaviors` in [this sentence ](antirez/redis@unstable/redis.conf#L564) commit b50a620 Author: martinbroadhurst <martinbroadhurst@users.noreply.github.com> Date: Thu Dec 28 12:07:30 2017 +0000 Fix typo in valgrind.sup commit 7d8f349 Author: Peter Boughton <peter@sorcerersisle.com> Date: Mon Nov 27 19:52:19 2017 +0000 Update CONTRIBUTING; refer doc updates to redis-doc repo. commit 02dec7e Author: Klauswk <klauswk1@hotmail.com> Date: Tue Oct 24 16:18:38 2017 -0200 Fix typo in comment commit e1efbc8 Author: chenshi <baiwfg2@gmail.com> Date: Tue Oct 3 18:26:30 2017 +0800 Correct two spelling errors of comments commit 93327d8 Author: spacewander <spacewanderlzx@gmail.com> Date: Wed Sep 13 16:47:24 2017 +0800 Update the comment for OBJ_ENCODING_EMBSTR_SIZE_LIMIT's value The value of OBJ_ENCODING_EMBSTR_SIZE_LIMIT is 44 now instead of 39. commit 63d361f Author: spacewander <spacewanderlzx@gmail.com> Date: Tue Sep 12 15:06:42 2017 +0800 Fix <prevlen> related doc in ziplist.c According to the definition of ZIP_BIG_PREVLEN and other related code, the guard of single byte <prevlen> should be 254 instead of 255. commit ebe228d Author: hanael80 <hanael80@gmail.com> Date: Tue Aug 15 09:09:40 2017 +0900 Fix typo commit 6b696e6 Author: Matt Robenolt <matt@ydekproductions.com> Date: Mon Aug 14 14:50:47 2017 -0700 Fix typo in LATENCY DOCTOR output commit a2ec6ae Author: caosiyang <caosiyang@qiyi.com> Date: Tue Aug 15 14:15:16 2017 +0800 Fix a typo: form => from commit 3ab7699 Author: caosiyang <caosiyang@qiyi.com> Date: Thu Aug 10 18:40:33 2017 +0800 Fix a typo: replicationFeedSlavesFromMaster() => replicationFeedSlavesFromMasterStream() commit 72d43ef Author: caosiyang <caosiyang@qiyi.com> Date: Tue Aug 8 15:57:25 2017 +0800 fix a typo: servewr => server commit 707c958 Author: Bo Cai <charpty@gmail.com> Date: Wed Jul 26 21:49:42 2017 +0800 redis-cli.c typo: conut -> count. Signed-off-by: Bo Cai <charpty@gmail.com> commit b9385b2 Author: JackDrogon <jack.xsuperman@gmail.com> Date: Fri Jun 30 14:22:31 2017 +0800 Fix some spell problems commit 20d9230 Author: akosel <aaronjkosel@gmail.com> Date: Sun Jun 4 19:35:13 2017 -0500 Fix typo commit b167bfc Author: Krzysiek Witkowicz <krzysiekwitkowicz@gmail.com> Date: Mon May 22 21:32:27 2017 +0100 Fix #4008 small typo in comment commit 2b78ac8 Author: Jake Clarkson <jacobwclarkson@gmail.com> Date: Wed Apr 26 15:49:50 2017 +0100 Correct typo in tests/unit/hyperloglog.tcl commit b0f1cdb Author: Qi Luo <qiluo-msft@users.noreply.github.com> Date: Wed Apr 19 14:25:18 2017 -0700 Fix typo commit a90b0f9 Author: charsyam <charsyam@naver.com> Date: Thu Mar 16 18:19:53 2017 +0900 fix typos fix typos fix typos commit 8430a79 Author: Richard Hart <richardhart92@gmail.com> Date: Mon Mar 13 22:17:41 2017 -0400 Fixed log message typo in listenToPort. commit 481a1c2 Author: Vinod Kumar <kumar003vinod@gmail.com> Date: Sun Jan 15 23:04:51 2017 +0530 src/db.c: Correct "save" -> "safe" typo commit 586b4d3 Author: wangshaonan <wshn13@gmail.com> Date: Wed Dec 21 20:28:27 2016 +0800 Fix typo they->the in helloworld.c commit c1c4b5e Author: Jenner <hypxm@qq.com> Date: Mon Dec 19 16:39:46 2016 +0800 typo error commit 1ee1a3f Author: tielei <43289893@qq.com> Date: Mon Jul 18 13:52:25 2016 +0800 fix some comments commit 11a41fb Author: Otto Kekäläinen <otto@seravo.fi> Date: Sun Jul 3 10:23:55 2016 +0100 Fix spelling in documentation and comments commit 5fb5d82 Author: francischan <f1ancis621@gmail.com> Date: Tue Jun 28 00:19:33 2016 +0800 Fix outdated comments about redis.c file. It should now refer to server.c file. commit 6b254bc Author: lmatt-bit <lmatt123n@gmail.com> Date: Thu Apr 21 21:45:58 2016 +0800 Refine the comment of dictRehashMilliseconds func SLAVECONF->REPLCONF in comment - by andyli029 commit ee9869f Author: clark.kang <charsyam@naver.com> Date: Tue Mar 22 11:09:51 2016 +0900 fix typos commit f7b3b11 Author: Harisankar H <harisankarh@gmail.com> Date: Wed Mar 9 11:49:42 2016 +0530 Typo correction: "faield" --> "failed" Typo correction: "faield" --> "failed" commit 3fd40fc Author: Itamar Haber <itamar@redislabs.com> Date: Thu Feb 25 10:31:51 2016 +0200 Fixes a typo in comments commit 621c160 Author: Prayag Verma <prayag.verma@gmail.com> Date: Mon Feb 1 12:36:20 2016 +0530 Fix typo in Readme.md Spelling mistakes - `eviciton` > `eviction` `familar` > `familiar` commit d7d07d6 Author: WonCheol Lee <toctoc21c@gmail.com> Date: Wed Dec 30 15:11:34 2015 +0900 Typo fixed commit a4dade7 Author: Felix Bünemann <buenemann@louis.info> Date: Mon Dec 28 11:02:55 2015 +0100 [ci skip] Improve supervised upstart config docs This mentions that "expect stop" is required for supervised upstart to work correctly. See http://upstart.ubuntu.com/cookbook/#expect-stop for an explanation. commit d9caba9 Author: daurnimator <quae@daurnimator.com> Date: Mon Dec 21 18:30:03 2015 +1100 README: Remove trailing whitespace commit 72d42e5 Author: daurnimator <quae@daurnimator.com> Date: Mon Dec 21 18:29:32 2015 +1100 README: Fix typo. th => the commit dd6e957 Author: daurnimator <quae@daurnimator.com> Date: Mon Dec 21 18:29:20 2015 +1100 README: Fix typo. familar => familiar commit 3a12b23 Author: daurnimator <quae@daurnimator.com> Date: Mon Dec 21 18:28:54 2015 +1100 README: Fix typo. eviciton => eviction commit 2d1d03b Author: daurnimator <quae@daurnimator.com> Date: Mon Dec 21 18:21:45 2015 +1100 README: Fix typo. sever => server commit 3973b06 Author: Itamar Haber <itamar@garantiadata.com> Date: Sat Dec 19 17:01:20 2015 +0200 Typo fix commit 4f2e460 Author: Steve Gao <fu@2token.com> Date: Fri Dec 4 10:22:05 2015 +0800 Update README - fix typos commit b21667c Author: binyan <binbin.yan@nokia.com> Date: Wed Dec 2 22:48:37 2015 +0800 delete redundancy color judge in sdscatcolor commit 88894c7 Author: binyan <binbin.yan@nokia.com> Date: Wed Dec 2 22:14:42 2015 +0800 the example output shoule be HelloWorld commit 2763470 Author: binyan <binbin.yan@nokia.com> Date: Wed Dec 2 17:41:39 2015 +0800 modify error word keyevente Signed-off-by: binyan <binbin.yan@nokia.com> commit 0847b3d Author: Bruno Martins <bscmartins@gmail.com> Date: Wed Nov 4 11:37:01 2015 +0000 typo commit bbb9e9e Author: dawedawe <dawedawe@gmx.de> Date: Fri Mar 27 00:46:41 2015 +0100 typo: zimap -> zipmap commit 5ed297e Author: Axel Advento <badwolf.bloodseeker.rev@gmail.com> Date: Tue Mar 3 15:58:29 2015 +0800 Fix 'salve' typos to 'slave' commit edec9d6 Author: LudwikJaniuk <ludvig.janiuk@gmail.com> Date: Wed Jun 12 14:12:47 2019 +0200 Update README.md Co-Authored-By: Qix <Qix-@users.noreply.github.com> commit 692a7af Author: LudwikJaniuk <ludvig.janiuk@gmail.com> Date: Tue May 28 14:32:04 2019 +0200 grammar commit d962b0a Author: Nick Frost <nickfrostatx@gmail.com> Date: Wed Jul 20 15:17:12 2016 -0700 Minor grammar fix commit 24fff01aaccaf5956973ada8c50ceb1462e211c6 (typos) Author: Chad Miller <chadm@squareup.com> Date: Tue Sep 8 13:46:11 2020 -0400 Fix faulty comment about operation of unlink() commit 3cd5c1f3326c52aa552ada7ec797c6bb16452355 Author: Kevin <kevin.xgr@gmail.com> Date: Wed Nov 20 00:13:50 2019 +0800 Fix typo in server.c. From a83af59 Mon Sep 17 00:00:00 2001 From: wuwo <wuwo@wacai.com> Date: Fri, 17 Mar 2017 20:37:45 +0800 Subject: [PATCH] falure to failure From c961896 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?=E5=B7=A6=E6=87=B6?= <veficos@gmail.com> Date: Sat, 27 May 2017 15:33:04 +0800 Subject: [PATCH] fix typo From e600ef2 Mon Sep 17 00:00:00 2001 From: "rui.zou" <rui.zou@yunify.com> Date: Sat, 30 Sep 2017 12:38:15 +0800 Subject: [PATCH] fix a typo From c7d07fa Mon Sep 17 00:00:00 2001 From: Alexandre Perrin <alex@kaworu.ch> Date: Thu, 16 Aug 2018 10:35:31 +0200 Subject: [PATCH] deps README.md typo From b25cb67 Mon Sep 17 00:00:00 2001 From: Guy Korland <gkorland@gmail.com> Date: Wed, 26 Sep 2018 10:55:37 +0300 Subject: [PATCH 1/2] fix typos in header From ad28ca6 Mon Sep 17 00:00:00 2001 From: Guy Korland <gkorland@gmail.com> Date: Wed, 26 Sep 2018 11:02:36 +0300 Subject: [PATCH 2/2] fix typos commit 34924cdedd8552466fc22c1168d49236cb7ee915 Author: Adrian Lynch <adi_ady_ade@hotmail.com> Date: Sat Apr 4 21:59:15 2015 +0100 Typos fixed commit fd2a1e7 Author: Jan <jsteemann@users.noreply.github.com> Date: Sat Oct 27 19:13:01 2018 +0200 Fix typos Fix typos commit e14e47c1a234b53b0e103c5f6a1c61481cbcbb02 Author: Andy Lester <andy@petdance.com> Date: Fri Aug 2 22:30:07 2019 -0500 Fix multiple misspellings of "following" commit 79b948ce2dac6b453fe80995abbcaac04c213d5a Author: Andy Lester <andy@petdance.com> Date: Fri Aug 2 22:24:28 2019 -0500 Fix misspelling of create-cluster commit 1fffde52666dc99ab35efbd31071a4c008cb5a71 Author: Andy Lester <andy@petdance.com> Date: Wed Jul 31 17:57:56 2019 -0500 Fix typos commit 204c9ba9651e9e05fd73936b452b9a30be456cfe Author: Xiaobo Zhu <xiaobo.zhu@shopee.com> Date: Tue Aug 13 22:19:25 2019 +0800 fix typos Squashed commit of the following: commit 1d9aaf8 Author: danmedani <danmedani@gmail.com> Date: Sun Aug 2 11:40:26 2015 -0700 README typo fix. Squashed commit of the following: commit 32bfa7c Author: Erik Dubbelboer <erik@dubbelboer.com> Date: Mon Jul 6 21:15:08 2015 +0200 Fixed grammer Squashed commit of the following: commit b24f69c Author: Sisir Koppaka <sisir.koppaka@gmail.com> Date: Mon Mar 2 22:38:45 2015 -0500 utils/hashtable/rehashing.c: Fix typos Squashed commit of the following: commit 4e04082 Author: Erik Dubbelboer <erik@dubbelboer.com> Date: Mon Mar 23 08:22:21 2015 +0000 Small config file documentation improvements Squashed commit of the following: commit acb8773 Author: ctd1500 <ctd1500@gmail.com> Date: Fri May 8 01:52:48 2015 -0700 Typo and grammar fixes in readme commit 2eb75b6 Author: ctd1500 <ctd1500@gmail.com> Date: Fri May 8 01:36:18 2015 -0700 fixed redis.conf comment Squashed commit of the following: commit a8249a2 Author: Masahiko Sawada <sawada.mshk@gmail.com> Date: Fri Dec 11 11:39:52 2015 +0530 Revise correction of typos. Squashed commit of the following: commit 3c02028 Author: zhaojun11 <zhaojun11@jd.com> Date: Wed Jan 17 19:05:28 2018 +0800 Fix typos include two code typos in cluster.c and latency.c Squashed commit of the following: commit 9dba47c Author: q191201771 <191201771@qq.com> Date: Sat Jan 4 11:31:04 2020 +0800 fix function listCreate comment in adlist.c Update src/server.c commit 2c7c2cb536e78dd211b1ac6f7bda00f0f54faaeb Author: charpty <charpty@gmail.com> Date: Tue May 1 23:16:59 2018 +0800 server.c typo: modules system dictionary type comment Signed-off-by: charpty <charpty@gmail.com> commit a8395323fb63cb59cb3591cb0f0c8edb7c29a680 Author: Itamar Haber <itamar@redislabs.com> Date: Sun May 6 00:25:18 2018 +0300 Updates test_helper.tcl's help with undocumented options Specifically: * Host * Port * Client commit bde6f9ced15755cd6407b4af7d601b030f36d60b Author: wxisme <850885154@qq.com> Date: Wed Aug 8 15:19:19 2018 +0800 fix comments in deps files commit 3172474ba991532ab799ee1873439f3402412331 Author: wxisme <850885154@qq.com> Date: Wed Aug 8 14:33:49 2018 +0800 fix some comments commit 01b6f2b6858b5cf2ce4ad5092d2c746e755f53f0 Author: Thor Juhasz <thor@juhasz.pro> Date: Sun Nov 18 14:37:41 2018 +0100 Minor fixes to comments Found some parts a little unclear on a first read, which prompted me to have a better look at the file and fix some minor things I noticed. Fixing minor typos and grammar. There are no changes to configuration options. These changes are only meant to help the user better understand the explanations to the various configuration options
2020-09-10 13:43:38 +03:00
* triggering an error condition. */
if (bysignal != SIGUSR1)
server.aof_lastbgrewrite_status = C_ERR;
2015-07-27 09:41:48 +02:00
serverLog(LL_WARNING,
"Background AOF rewrite terminated by signal %d", bysignal);
}
cleanup:
aofRemoveTempFile(server.child_pid);
server.aof_rewrite_time_last = time(NULL)-server.aof_rewrite_time_start;
server.aof_rewrite_time_start = -1;
/* Schedule a new rewrite if we are waiting for it to switch the AOF ON. */
2015-07-27 09:41:48 +02:00
if (server.aof_state == AOF_WAIT_REWRITE)
server.aof_rewrite_scheduled = 1;
}