278 Commits

Author SHA1 Message Date
Tibault Damman
011fdb4772 fix build on 32bit systems
Code had an assert that tests if sizeof(long) == sizeof(long long),
which obviously fails on 32-bit architectures.
I believe the assert is incorrect on any architecture, considering
the following code is actually putting a long long in an _int_.

I replaced it with code that checks if the value fits in an int.

Signed-off-by: Tibault Damman <tibault.damman@basalte.be>
2024-03-07 19:50:35 -05:00
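
The range check described in the commit above can be sketched as follows. This is a minimal illustration, not the code from the commit: the helper names `fitsInInt` and `toIntChecked` are hypothetical, and the sketch assumes only that a `long long` value must be stored in an `int`.

```cpp
#include <cassert>
#include <climits>

// Hypothetical helper: true when the value is representable as an int.
// No assumption is made about sizeof(long) or sizeof(long long).
bool fitsInInt(long long value) {
    return value >= INT_MIN && value <= INT_MAX;
}

// Hypothetical helper: stores the value in *out only when it fits.
// Returns false and leaves *out untouched otherwise.
bool toIntChecked(long long value, int *out) {
    assert(out != nullptr);
    if (!fitsInInt(value)) {
        return false;
    }
    *out = static_cast<int>(value);
    return true;
}
```

Because the check compares against `INT_MIN` and `INT_MAX`, it behaves the same on 32-bit and 64-bit targets, whereas an assert on type sizes does not.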
Malavan Sotheeswaran
05945b60a8 missed a reference to isRocksdbSnapshotRepl 2023-10-26 11:40:50 -07:00
Malavan Sotheeswaran
486a6bb7b1 Merge remote-tracking branch 'internal/main' 2023-10-26 11:21:46 -07:00
John Sully
a7ac6c1e43 Remove unused variable 2023-10-23 15:23:11 -04:00
John Sully
d7977c468f Fastsync in-mem initial implementation 2023-10-23 15:23:11 -04:00
John Sully
7ae27f61f0 rename constants 2023-10-23 15:23:11 -04:00
Karthick Ariyaratnam
5b12251627
Fix integer overflow issue in the temp rdb file naming. (#702)
Co-authored-by: Karthick Ariyaratnam (A) <k00809413@china.huawei.com>
2023-09-28 21:01:47 -04:00
Malavan Sotheeswaran
38169682f0
Merge latest internal to OSS (#720)
* add docker build

* fix the working dir in Dockerfile

* add release publish docker image

* address indentation and use default release

* migrate keydb_modstatsd to keydb-internal

* rm

* add submodule cpp-statsd-client

* include trigger keydb_modstatsd Makefile in modules Makefile

* update

* have clean also trigger keydb_modstatsd clean

* move cpp-statsd-client to deps

* checkout to a06a5b9359f31d946fe163b9038586982971ae49

* update relative path in compilation

* remove submodule instead use the source

* include building keydb statsd module

* fix check in Dockerfile docker-entrypoint.sh

* fix

* fix the comment that caused the docker build to get stuck

* use commit hash as tag template

* fix

* test tag

* Revert "test tag"

This reverts commit 9cbc57137d57aab4fdd5a9283bae07391b3c7f8b.

* make docker build independent

* add new build to ci

* emit system free metrics with '/proc/meminfo'

* have emit system free memory within metrics_time_taken_us and also add metric time taken for it

* Remove Expireset (#217)

Major refactor to place expiry information directly in the object struct.

* update MemFree to MemAvailable in keydb statsd

* add metric emit for non-empty primary with less than 2 connected replicas

* address comments

* Multiply CPU percent metric by 100

* Fix memory leaks

* Fix slow to free when low lock contention

* fix nodename metricsname

* fix unnecessary replace

* Make propagating before freeing module context optional (#225)

* don't propagate on module context free for rdb load

* default in wrong place

* Flash expiration (#197)

Design Doc: https://docs.google.com/document/d/1NmnYGnHLdZp-KOUCUatX5iXpF-L3YK4VUc9Lm3Tqxpo/edit?usp=sharing

* Emit more aggregate metrics in modstatsd (#223)

* Permit keys of differing slots as long as they are served by this cluster and we are not migrating

* Fix over pessimistic checks that prevent replicas from serving mget

* Fix logic bug

* async rehash is preventing rehashing during RDB load after a db flush. Ensure it can't interfere after a flush

* make async rehash configurable

* only use internal locks when multithreaded (#205)

* Fix crossslot error migrating batches of keys

* Fix bug where we erroneously answer queries belonging to another shard

* fix mac compile

* enable level_compaction_dynamic_level_bytes after flush, and flush expires for FLASH (#229)

* enable level_compaction_dynamic_level_bytes after flush, and flush expires

* update debug reload for flash

* update debug reload for flash complete

* missing forward declare

* commit existing changes then track changes for debug reload

* missing args

* commitChanges is conditional

Co-authored-by: John Sully <jsully@snapchat.com>

---------

Co-authored-by: zliang <zliang@snapchat.com>
Co-authored-by: John Sully <jsully@snapchat.com>
Co-authored-by: Alex Cope <acope@snapchat.com>
Co-authored-by: John Sully <john@csquare.ca>
2023-09-28 18:13:27 -04:00
John Sully
f435218ed9 Fix memory leaks 2023-08-25 16:44:10 +00:00
Karthick Ariyaratnam
bcb542128a Fix a bug where a temp rdb file with zero bytes is generated in flash mode. (#696)
Co-authored-by: Karthick Ariyaratnam (A) <k00809413@china.huawei.com>
2023-07-25 21:33:30 -07:00
John Sully
bef77862e7 Prevent crash on free when using repl-disk-buffer-reserve (#207) 2023-07-14 14:39:23 -04:00
Malavan Sotheeswaran
c17b9f47ac Cherry picking keydb changes from keydbpro to main (#203)
* Audit Logging for KeyProxy and KeyDB (#144)

* Audit Log: log cert fingerprint (#151)

* Add more flash storage stats to info command.

* Remove unneeded libs when not building FLASH

* Fix mem leak

* Allow the reservation of localhost connections to ensure health checks always succeed even at maxclients (#181)

* Enable a force option for commands (#183)

* Fix missing newline and excessive logging in the CLI

* Support NO ONE for "CLUSTER REPLICATE" command.

Co-authored-by: Jacob Bohac <jbohac@snapchat.com>
Co-authored-by: Sergey Kolosov <skolosov@snapchat.com>
Co-authored-by: John Sully <jsully@snapchat.com>
Co-authored-by: John Sully <john@csquare.ca>
2023-06-27 16:23:20 -04:00
Malavan Sotheeswaran
715f832b00
make connectWithMaster error message less confusing (#592) 2023-03-08 16:00:53 -05:00
Malavan Sotheeswaran
7fcbfac103 snprintf fix 2023-02-14 17:51:19 -08:00
Malavan Sotheeswaran
5123e2b3a1
change hasActiveChildProcess to return true only when there is an actual child process (#558)
change hasActiveChildProcess to return true only when there is an actual child process, add hasActiveChildProcessOrBGSave to catch case of forkless bgsave
2023-02-03 13:36:06 -05:00
Malavan Sotheeswaran
2498e0fc1f
fix macos build warnings/ remove 32 bit CI run (#522)
* fix macos build warnings

* remove 32 bit ci run as we no longer support it
2022-12-15 15:49:44 -05:00
Malavan Sotheeswaran
f5f1bd7605
Merge main with oss release sep29 2022 (#521)
* need to include stdint for uintptr_t

* need to include stdint for uintptr_t

* use atomic_load for g_pserver->mstime

* use atomic_load for g_pserver->mstime

* Integrate readwritelock with Pro Code

* Integrate readwritelock with Pro Code

* Defensive asserts for RWLock

* Defensive asserts for RWLock

* Save and restore master info in rdb to allow active replica partial sync (#371)

* save replid for all masters in rdb

* expanded rdbSaveInfo to hold multiple master structs

* parse repl-masters from rdb

* recover replid info from rdb in active replica mode, attempt partial sync

* save offset from rdb into correct variable

* don't change replid based on master in active rep

* save and load psync info from correct fields

* Save and restore master info in rdb to allow active replica partial sync (#371)

* save replid for all masters in rdb

* expanded rdbSaveInfo to hold multiple master structs

* parse repl-masters from rdb

* recover replid info from rdb in active replica mode, attempt partial sync

* save offset from rdb into correct variable

* don't change replid based on master in active rep

* save and load psync info from correct fields

* placement new instead of memcpy

* placement new instead of memcpy

* Remove asserts, RW lock can go below zero in cases of aeAcquireLock

* Remove asserts, RW lock can go below zero in cases of aeAcquireLock

* Inclusive language

* Inclusive language

* update packaging for OS merge

* update packaging for OS merge

* modify dockerfile to build within image

* modify dockerfile to build within image

* Make active client balancing a configurable option

* Make active client balancing a configurable option

* With TLS throttle accepts if server is under heavy load - do not change non TLS behavior

* With TLS throttle accepts if server is under heavy load - do not change non TLS behavior

* Only run the tls-name-validation test if --tls is passed into runtest

* Only run the tls-name-validation test if --tls is passed into runtest

* Fix KeyDB not building with TLS < 1.1.1

* Fix KeyDB not building with TLS < 1.1.1

* update changelog to use replica as terminology

* update changelog to use replica as terminology

* update copyright

* update copyright

* update deb copyright

* update deb copyright

* call aeThreadOnline() earlier

* call aeThreadOnline() earlier

* Removed mergeReplicationId

* Removed mergeReplicationId

* acceptTLS is threadsafe like the non TLS version

* acceptTLS is threadsafe like the non TLS version

* setup Machamp ci

* setup Machamp ci

* make build_test.sh executable

* make build_test.sh executable

* PSYNC production fixes

* PSYNC production fixes

* fix the Machamp build

* fix the Machamp build

* break tests into steps

* break tests into steps

* Added multimaster test

* Added multimaster test

* Update ci.yml

Change min tested version to 18.04

* Update ci.yml

Change min tested version to 18.04

* fork lock for all threads, use fastlock for readwritelock

* fork lock for all threads, use fastlock for readwritelock

* hide forklock object in ae

* hide forklock object in ae

* only need to include readwritelock in ae

* only need to include readwritelock in ae

* time thread lock uses fastlock instead of std::mutex

* time thread lock uses fastlock instead of std::mutex

* set thread as offline when waiting for time thread lock

* set thread as offline when waiting for time thread lock

* update README resource links

* update README resource links

* Fix MALLOC=memkind build issues

* Fix MALLOC=memkind build issues

* Fix module test break

* Fix module test break

* Eliminate firewall dialogs on mac for regular and cluster tests.  There are still issues with the sentinel tests but attempting to bind only to localhost causes failures

* Eliminate firewall dialogs on mac for regular and cluster tests.  There are still issues with the sentinel tests but attempting to bind only to localhost causes failures

* remove unused var in networking.cpp

* remove unused var in networking.cpp

* check ziplist len to avoid crash on empty ziplist convert

* check ziplist len to avoid crash on empty ziplist convert

* remove nullptr subtraction

* remove nullptr subtraction

* cannot mod a pointer

* cannot mod a pointer

* need to include stdint for uintptr_t

* need to include stdint for uintptr_t

* use atomic_load for g_pserver->mstime

* use atomic_load for g_pserver->mstime

* Integrate readwritelock with Pro Code

* Integrate readwritelock with Pro Code

* Defensive asserts for RWLock

* Defensive asserts for RWLock

* Save and restore master info in rdb to allow active replica partial sync (#371)

* save replid for all masters in rdb

* expanded rdbSaveInfo to hold multiple master structs

* parse repl-masters from rdb

* recover replid info from rdb in active replica mode, attempt partial sync

* save offset from rdb into correct variable

* don't change replid based on master in active rep

* save and load psync info from correct fields

* Save and restore master info in rdb to allow active replica partial sync (#371)

* save replid for all masters in rdb

* expanded rdbSaveInfo to hold multiple master structs

* parse repl-masters from rdb

* recover replid info from rdb in active replica mode, attempt partial sync

* save offset from rdb into correct variable

* don't change replid based on master in active rep

* save and load psync info from correct fields

* placement new instead of memcpy

* placement new instead of memcpy

* Remove asserts, RW lock can go below zero in cases of aeAcquireLock

* Remove asserts, RW lock can go below zero in cases of aeAcquireLock

* Inclusive language

* Inclusive language

* call aeThreadOnline() earlier

* call aeThreadOnline() earlier

* Removed mergeReplicationId

* Removed mergeReplicationId

* Make active client balancing a configurable option

* Make active client balancing a configurable option

* With TLS throttle accepts if server is under heavy load - do not change non TLS behavior

* With TLS throttle accepts if server is under heavy load - do not change non TLS behavior

* acceptTLS is threadsafe like the non TLS version

* acceptTLS is threadsafe like the non TLS version

* PSYNC production fixes

* PSYNC production fixes

* Ensure we are responsive during storagecache clears

* Ensure we are responsive during storagecache clears

* Ensure recreated tables use the same settings as ones made at boot

* Ensure recreated tables use the same settings as ones made at boot

* Converted some existing PSYNC tests for multimaster

* Converted some existing PSYNC tests for multimaster

* Inclusive language fix

* Inclusive language fix

* Cleanup test suite

* Cleanup test suite

* Updated test replica configs so tests make sense

* Updated test replica configs so tests make sense

* active-rep test reliability

* active-rep test reliability

* Quick fix to make psync tests work

* Quick fix to make psync tests work

* Fix PSYNC test crashes

* Fix PSYNC test crashes

* Ensure we force moves not copies when ingesting bulk insert files

* Ensure we force moves not copies when ingesting bulk insert files

* Disable async for hget commands as it is not ready

* Disable FLASH

* Fix crash in save of masterinfo

* Fix musl/Alpine build failures

* Remove unnecessary libs

* update readme

* update readme

* remove Enterprise references

* Limit max overage to 20% during RDB save

* Delete COPYING to replace with BSD license

* update deb master changelog

* Update license

* Fix Readme typo from github org transition

Replace mention of scratch-file-path with db-s3-object

* Fix reference counting failure in the dict.  This is caused by std::swap also swapping refcounts

* Fix assertion in async rehash

* Prevent crash on shutdown by avoiding dtors (they are unnecessary anyways)

* Initialize noshrink, it was dangling

* Prevent us from starting a rehash when one wasn't already in progress.  This can cause severe issues for snapshots

* Avoid unnecessary rehashing when a rehash is abandoned

* Dictionary use correct acquire/release semantics

* Add fence barriers for the repl backlog (important for AARCH64 and other weak memory models)

* Silence TSAN errors on ustime and mstime.  Every CPU we support is atomic on aligned ints, but correctness matters

* Disable async commands by default

* Fix TSAN warnings on the repl backlog

* Merge OSS back into pro

* Fix unmerged files

* Fix O(n^2) algorithm in the GC cleanup logic

* Fix crash in expire when a snapshot is in flight.  Caused by a perf optimization getting the expire map out of sync with the val

* On Alpine we must have a reasonable stack size

* Revert ci.yml to unstable branch version

* Implements the soft shutdown feature to allow clients to cooperatively disconnect preventing disruption during shutdown

* Ensure clean shutdown with multiple threads

* update dockerfiles

* update deb pkg references and changelog

* update gem reference

* lpGetInteger returns int64_t, avoid overflow (#10068)

Fix #9410

Crucial for the ms and sequence deltas, but I changed all
calls, just in case (e.g. "flags")

Before this commit:
`ms_delta` and `seq_delta` could have overflown, causing `currid` to be wrong,
which in turn would cause `streamTrim` to trim the entire rax node (see new test)

* Fix issue #454 (BSD build break)

* Do not allow commands to run in background when in eval, Issue #452

* Fix certificate leak during connection when tls-allowlists are used

* Fix issue #480

* Fix crash running INFO command while a disk based backlog is set

* check tracking per db

* fix warnings

* Fix a race when undoConnectWithMaster changes mi->repl_transfer_s but the connection is not yet closed and the event handler runs

* Fix a race in processChanges/trackChanges with rdbLoadRio by acquiring the lock when trackChanges is set

* Fix ASAN use after free

* Additional fixes

* Fix integer overflow of the track changes counter

* Fix P99 latency issue for TLS where we leave work for the next event loop

tlsProcessPendingData() needs to be called before we execute queued commands because it may enqueue more commands

* Fix race removing key cache

* Prevent crash on load in long running KeyDB instances

* Fixes a crash where the server assertion failed when the key exists in DB during RDB load

* Remove old assertion which is commented out.

* avoid instantiating EpochHolder multiple times to improve performance and cpu utilization

* avoid instantiating EpochHolder multiple times to improve performance and cpu utilization

* src\redis-cli.c: fix potential null pointer dereference found by cppcheck

src\redis-cli.c:5488:35: warning: Either the condition
'!table' is redundant or there is possible null pointer dereference:
table. [nullPointerRedundantCheck]

* Fix Issue #486

* Workaround bug in snapshot sync - abort don't crash

* Improve reliability of async parts of the soft shutdown tests

* Improve reliability of fragmentation tests

* Verify that partial syncs do indeed occur

* Fix O(n) algorithm in INFO command

* Remove incorrect assert that fires when the repl backlog is used fully

* Make building flash optional

* Remove unneeded gitlab CI file

* [BUG] Moves key to another DB, the source key was removed if the move failed due to the key exists in the destination db #497 (#498)

Co-authored-by: Paul Chen <mingchen@Mings-MacBook-Pro.local>

* trigger repl_curr_off != master_repl_offset assert failure when having pending write case

* use debug for logging the message instead

* rocksdb log using up the diskspace on flash (#519)

* Fix OpenSSL 3.0.x related issues. (#10291)

* Drop obsolete initialization calls.
* Use decoder API for DH parameters.
* Enable auto DH parameters if not explicitly used, which should be the
  preferred configuration going forward.

* remove unnecessary forward declaration

* remove internal ci stuff

* remove more internal ci/publishing

* submodule update step

* use with syntax instead

* bump ci ubuntu old ver as latest is now 22.04

* include submodules on all ci jobs

* install all deps for all ci jobs

Co-authored-by: Vivek Saini <vsaini@snapchat.com>
Co-authored-by: Christian Legge <christian@eqalpha.com>
Co-authored-by: benschermel <bschermel@snapchat.com>
Co-authored-by: John Sully <john@csquare.ca>
Co-authored-by: zliang <zliang@snapchat.com>
Co-authored-by: malavan <malavan@eqalpha.com>
Co-authored-by: John Sully <jsully@snapchat.com>
Co-authored-by: jfinity <38383673+jfinity@users.noreply.github.com>
Co-authored-by: benschermel <43507366+benschermel@users.noreply.github.com>
Co-authored-by: guybe7 <guy.benoish@redislabs.com>
Co-authored-by: Karthick Ariyaratnam (A) <k00809413@china.huawei.com>
Co-authored-by: root <paul.chen1@huawei.com>
Co-authored-by: Ilya Shipitsin <chipitsine@gmail.com>
Co-authored-by: Paul Chen <32553156+paulmchen@users.noreply.github.com>
Co-authored-by: Paul Chen <mingchen@Mings-MacBook-Pro.local>
Co-authored-by: Yossi Gottlieb <yossigo@gmail.com>
2022-12-14 12:17:36 -05:00
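
The `lpGetInteger` item in the merge above notes that `ms_delta` and `seq_delta` could overflow when a 64-bit return value was not kept at full width. The class of bug can be sketched as below. This is an illustration only: `applyDelta` and `applyDeltaNarrowed` are hypothetical names, and the real stream code is not reproduced here.

```cpp
#include <cstdint>

// Keeps the delta at 64-bit width before adding it to the base ID.
uint64_t applyDelta(uint64_t base_ms, int64_t ms_delta) {
    return base_ms + static_cast<uint64_t>(ms_delta);
}

// Shows the failure mode: the delta is squeezed through a 32-bit
// integer first, so any bits above bit 31 are lost. (The narrowing
// conversion is implementation-defined before C++20 and modular
// from C++20 on.)
uint64_t applyDeltaNarrowed(uint64_t base_ms, int64_t ms_delta) {
    int32_t narrowed = static_cast<int32_t>(ms_delta);
    return base_ms + static_cast<uint64_t>(static_cast<int64_t>(narrowed));
}
```

For a delta that needs more than 32 bits, such as `1LL << 32`, the two functions return different IDs, which is how a wrong current ID can arise from a narrowed delta.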
John Sully
c97dc08e38 Additional fixes 2022-08-23 17:33:14 +00:00
John Sully
dd65d4af44 Fix ASAN use after free 2022-08-23 06:37:26 +00:00
John Sully
1810f8af35 Fix a race when undoConnectWithMaster changes mi->repl_transfer_s but the connection is not yet closed and the event handler runs 2022-08-21 22:35:08 +00:00
John Sully
a265f815e2 Merge OSS back into pro 2022-05-18 01:29:15 +00:00
John Sully
c7108ac57e PSYNC production fixes 2022-04-26 02:07:28 +00:00
Vivek Saini
a0208b7301 Removed mergeReplicationId 2022-04-26 01:55:22 +00:00
Vivek Saini
d7b4f1e492 call aeThreadOnline() earlier 2022-04-26 01:55:22 +00:00
Vivek Saini
4d053b1aa1 Inclusive language 2022-04-26 01:55:22 +00:00
Christian Legge
0ed0745d90 Save and restore master info in rdb to allow active replica partial sync (#371)
* save replid for all masters in rdb

* expanded rdbSaveInfo to hold multiple master structs

* parse repl-masters from rdb

* recover replid info from rdb in active replica mode, attempt partial sync

* save offset from rdb into correct variable

* don't change replid based on master in active rep

* save and load psync info from correct fields
2022-04-26 01:55:22 +00:00
Vivek Saini
c529f0e1ed Integrate readwritelock with Pro Code 2022-04-26 01:55:22 +00:00
Malavan Sotheeswaran
f35baf8e7d hide forklock object in ae 2022-04-26 01:55:22 +00:00
malavan
a352731178 fork lock for all threads, use fastlock for readwritelock 2022-04-26 01:55:21 +00:00
John Sully
d06b9cbbe0 Handle RREPLAY errors gracefully 2022-04-13 12:51:00 -04:00
John Sully
269b05b918 Log the connected masters in the INFO command 2022-04-02 01:20:45 -04:00
John Sully
c540e4b6e5 Do not save while loading 2022-04-01 05:08:08 +00:00
John Sully
b787828ef9 Fix mac build warnings 2022-03-07 19:28:39 -05:00
John Sully
53fbf7b362 Fix fast-sync perf issue while server is under load (batch size too small) 2022-03-07 16:40:01 -05:00
John Sully
d77bbee238 Fix mac build breaks and remove license checks (won't work on mac) 2022-03-07 14:50:31 -05:00
John Sully
1a597c78bd Handle the case where querybuf data is read by the fastsync read handler
Former-commit-id: c4a5b904e941e09132413abc3b4d86c59c342051
2021-12-27 00:15:09 -05:00
John Sully
30bba7f7de Fix partial sync failures
Former-commit-id: 7e9f7c0c4f520392a930ab72951e287f52c711ab
2021-12-26 05:16:58 -05:00
John Sully
cc0417d790 Fix repl-psync-flash test instability
Former-commit-id: 310b0cf5413dbf3e7aa67e9b9c31869f3e994291
2021-12-23 17:30:14 -05:00
John Sully
373f584465 Fix partial sync corruption with FLASH
Former-commit-id: 532f58c0539b775c040c0dd9a2ad3dc349faf87a
2021-12-23 00:04:28 -05:00
John Sully
b13134c501 Reenable TCP No Delay
Former-commit-id: e11211cdfeca46574a03f6f8210bbe1ab3d70961
2021-12-22 18:42:11 -05:00
John Sully
d22a9dd64a Fix memory leak
Former-commit-id: c5b9adf47e30658359071d458cfb16a094dc8e28
2021-12-22 18:41:50 -05:00
VivekSainiEQ
bfcea943ea Merge remote-tracking branch 'mainpro/PRO_RELEASE_6' into keydbpro
Former-commit-id: 5a32d66ee382b6d227a67073afc81ca058d605ed
2021-12-06 20:43:23 +00:00
John Sully
5a3080dd8c Implement the force-backlog-disk-reserve flag
Former-commit-id: d39f7f4407f8935b1540dd302be3e24ac02c5700
2021-12-04 02:07:03 +00:00
malavan
4dcb233e08 fix potential compilation fail on gettid in replication.cpp
Former-commit-id: edb6b240a7785b707652a168d90f46d8744dd4c7
2021-12-01 20:50:18 +00:00
Malavan Sotheeswaran
ac1775b8a6 Merge branch 'async_commands' of https://gitlab.eqalpha.com/external-collab/keydb-pro-6 into async_commands
Former-commit-id: 1afa51c4d21d695c052dbec690bf3880b243dbec
2021-11-30 12:17:46 -05:00
Malavan Sotheeswaran
250e5b39a7 Merge branch 'keydbpro' into async_commands
Former-commit-id: 9eaddb8ca1424ff3225dac9c144d23848228c7d2
2021-11-30 11:47:51 -05:00
Malavan Sotheeswaran
9afbac8a83 Merge branch 'fastsync_collab' into 'keydbpro'
Fastsync collab

See merge request external-collab/keydb-pro-6!9

Former-commit-id: 8f410e4b814ea6ac506ab540ee1394834fd1d8a8
2021-11-26 20:53:00 +00:00
christianEQ
47f453a5a1 don't save master if host is null
Former-commit-id: 8238d8da82c483c093f5248b9dac983bc542e20f
2021-11-26 20:36:46 +00:00
Malavan Sotheeswaran
414277b098 Merge branch 'flash-psync' into 'fastsync_collab'
Flash Partial Sync

See merge request external-collab/keydb-pro-6!6

Former-commit-id: 2ebf2a8abd8df5c4cf3a5d759491962d050e8cc5
2021-11-26 18:59:16 +00:00
malavan
76b900db22 client lock for fast sync replbuffer, delay fast sync for next replication cron
Former-commit-id: 9fe7f8328d66f9ec57060934462ad85ef60c36aa
2021-11-26 17:46:41 +00:00