11613 Commits

Author SHA1 Message Date
artix
b15f3515f3 Fixed memory write error in clusterManagerGetConfigSignature 2018-02-28 11:49:10 +01:00
artix
2f056b8331 Cluster Manager: reshard command, fixed slots
parsing bug and other minor bugs.
2018-02-28 10:44:14 +01:00
artix
66548863a4 Cluster Manager: reshard command, fixed slots
parsing bug and other minor bugs.
2018-02-28 10:44:14 +01:00
Salvatore Sanfilippo
7a73db7512
Merge pull request #4715 from charsyam/feature/refactoring-make-condition-clear-for-rdb
[BugFix] fix calculation length in rdbSaveAuxField
2018-02-27 10:15:27 -08:00
Salvatore Sanfilippo
c3934db151 Merge pull request #4715 from charsyam/feature/refactoring-make-condition-clear-for-rdb
[BugFix] fix calculation length in rdbSaveAuxField
2018-02-27 10:15:27 -08:00
antirez
92696e49d2 expireIfNeeded() needed a top comment documenting the behavior. 2018-02-27 16:44:43 +01:00
antirez
550181a96b expireIfNeeded() needed a top comment documenting the behavior. 2018-02-27 16:44:43 +01:00
antirez
b00c4ffab5 expireIfNeeded() comment: claim -> pretend. 2018-02-27 16:37:37 +01:00
antirez
4db08588cc expireIfNeeded() comment: claim -> pretend. 2018-02-27 16:37:37 +01:00
charsyam
76386c48b8 refactoring-make-condition-clear-for-rdb 2018-02-27 21:55:20 +09:00
charsyam
7bf2ef9dba refactoring-make-condition-clear-for-rdb 2018-02-27 21:55:20 +09:00
charsyam
6168d5a1a6 fix-out-of-index-range-for-redis-cli-findbigkey 2018-02-27 21:46:19 +09:00
charsyam
aecbdde3c0 fix-out-of-index-range-for-redis-cli-findbigkey 2018-02-27 21:46:19 +09:00
antirez
956350ef89 ae.c: insetad of not firing, on AE_BARRIER invert the sequence.
AE_BARRIER was implemented like:

    - Fire the readable event.
    - Do not fire the writabel event if the readable fired.

However this may lead to the writable event to never be called if the
readable event is always fired. There is an alterantive, we can just
invert the sequence of the calls in case AE_BARRIER is set. This commit
does that.
2018-02-27 13:06:42 +01:00
antirez
b745b98f3d ae.c: insetad of not firing, on AE_BARRIER invert the sequence.
AE_BARRIER was implemented like:

    - Fire the readable event.
    - Do not fire the writabel event if the readable fired.

However this may lead to the writable event to never be called if the
readable event is always fired. There is an alterantive, we can just
invert the sequence of the calls in case AE_BARRIER is set. This commit
does that.
2018-02-27 13:06:42 +01:00
antirez
75987431f0 AOF: fix a bug that may prevent proper fsyncing when fsync=always.
In case the write handler is already installed, it could happen that we
serve the reply of a query in the same event loop cycle we received it,
preventing beforeSleep() from guaranteeing that we do the AOF fsync
before sending the reply to the client.

The AE_BARRIER mechanism, introduced in a previous commit, prevents this
problem. This commit makes actual use of this new feature to fix the
bug.
2018-02-27 13:06:42 +01:00
antirez
5a57c953c9 AOF: fix a bug that may prevent proper fsyncing when fsync=always.
In case the write handler is already installed, it could happen that we
serve the reply of a query in the same event loop cycle we received it,
preventing beforeSleep() from guaranteeing that we do the AOF fsync
before sending the reply to the client.

The AE_BARRIER mechanism, introduced in a previous commit, prevents this
problem. This commit makes actual use of this new feature to fix the
bug.
2018-02-27 13:06:42 +01:00
antirez
533d0e0375 Cluster: improve crash-recovery safety after failover auth vote.
Add AE_BARRIER to the writable event loop so that slaves requesting
votes can't be served before we re-enter the event loop in the next
iteration, so clusterBeforeSleep() will fsync to disk in time.
Also add the call to explicitly fsync, given that we modified the last
vote epoch variable.
2018-02-27 13:06:42 +01:00
antirez
61da5a9da8 Cluster: improve crash-recovery safety after failover auth vote.
Add AE_BARRIER to the writable event loop so that slaves requesting
votes can't be served before we re-enter the event loop in the next
iteration, so clusterBeforeSleep() will fsync to disk in time.
Also add the call to explicitly fsync, given that we modified the last
vote epoch variable.
2018-02-27 13:06:42 +01:00
antirez
548e478e40 ae.c: introduce the concept of read->write barrier.
AOF fsync=always, and certain Redis Cluster bus operations, require to
fsync data on disk before replying with an acknowledge.
In such case, in order to implement Group Commits, we want to be sure
that queries that are read in a given cycle of the event loop, are never
served to clients in the same event loop iteration. This way, by using
the event loop "before sleep" callback, we can fsync the information
just one time before returning into the event loop for the next cycle.
This is much more efficient compared to calling fsync() multiple times.

Unfortunately because of a bug, this was not always guaranteed: the
actual way the events are installed was the sole thing that could
control. Normally this problem is hard to trigger when AOF is enabled
with fsync=always, because we try to flush the output buffers to the
socekt directly in the beforeSleep() function of Redis. However if the
output buffers are full, we actually install a write event, and in such
a case, this bug could happen.

This change to ae.c modifies the event loop implementation to make this
concept explicit. Write events that are registered with:

    AE_WRITABLE|AE_BARRIER

Are guaranteed to never fire after the readable event was fired for the
same file descriptor. In this way we are sure that data is persisted to
disk before the client performing the operation receives an
acknowledged.

However note that this semantics does not provide all the guarantees
that one may believe are automatically provided. Take the example of the
blocking list operations in Redis.

With AOF and fsync=always we could have:

    Client A doing: BLPOP myqueue 0
    Client B doing: RPUSH myqueue a b c

In this scenario, Client A will get the "a" elements immediately after
the Client B RPUSH will be executed, even before the operation is persisted.
However when Client B will get the acknowledge, it can be sure that
"b,c" are already safe on disk inside the list.

What to note here is that it cannot be assumed that Client A receiving
the element is a guaranteed that the operation succeeded from the point
of view of Client B.

This is due to the fact that the barrier exists within the same socket,
and not between different sockets. However in the case above, the
element "a" was not going to be persisted regardless, so it is a pretty
synthetic argument.
2018-02-27 13:06:42 +01:00
antirez
6c7d27c711 ae.c: introduce the concept of read->write barrier.
AOF fsync=always, and certain Redis Cluster bus operations, require to
fsync data on disk before replying with an acknowledge.
In such case, in order to implement Group Commits, we want to be sure
that queries that are read in a given cycle of the event loop, are never
served to clients in the same event loop iteration. This way, by using
the event loop "before sleep" callback, we can fsync the information
just one time before returning into the event loop for the next cycle.
This is much more efficient compared to calling fsync() multiple times.

Unfortunately because of a bug, this was not always guaranteed: the
actual way the events are installed was the sole thing that could
control. Normally this problem is hard to trigger when AOF is enabled
with fsync=always, because we try to flush the output buffers to the
socekt directly in the beforeSleep() function of Redis. However if the
output buffers are full, we actually install a write event, and in such
a case, this bug could happen.

This change to ae.c modifies the event loop implementation to make this
concept explicit. Write events that are registered with:

    AE_WRITABLE|AE_BARRIER

Are guaranteed to never fire after the readable event was fired for the
same file descriptor. In this way we are sure that data is persisted to
disk before the client performing the operation receives an
acknowledged.

However note that this semantics does not provide all the guarantees
that one may believe are automatically provided. Take the example of the
blocking list operations in Redis.

With AOF and fsync=always we could have:

    Client A doing: BLPOP myqueue 0
    Client B doing: RPUSH myqueue a b c

In this scenario, Client A will get the "a" elements immediately after
the Client B RPUSH will be executed, even before the operation is persisted.
However when Client B will get the acknowledge, it can be sure that
"b,c" are already safe on disk inside the list.

What to note here is that it cannot be assumed that Client A receiving
the element is a guaranteed that the operation succeeded from the point
of view of Client B.

This is due to the fact that the barrier exists within the same socket,
and not between different sockets. However in the case above, the
element "a" was not going to be persisted regardless, so it is a pretty
synthetic argument.
2018-02-27 13:06:42 +01:00
Salvatore Sanfilippo
d8830200b4
Merge pull request #3828 from oranagra/sdsnewlen_pr
add SDS_NOINIT option to sdsnewlen to avoid unnecessary memsets.
2018-02-27 04:04:32 -08:00
Salvatore Sanfilippo
3f379e3d70 Merge pull request #3828 from oranagra/sdsnewlen_pr
add SDS_NOINIT option to sdsnewlen to avoid unnecessary memsets.
2018-02-27 04:04:32 -08:00
antirez
813960dbdd Fix ziplist prevlen encoding description. See #4705. 2018-02-23 12:19:35 +01:00
antirez
ac49eb0c8a Fix ziplist prevlen encoding description. See #4705. 2018-02-23 12:19:35 +01:00
gechunlin
d4e6d1086f
Update object.c 2018-02-22 20:57:54 -06:00
gechunlin
c857ac5840 Update object.c 2018-02-22 20:57:54 -06:00
artix
8f4f001dc3 Cluster Manager:
- Almost all Cluster Manager related code moved to
  the same section.
- Many macroes converted to functions
- Added various comments
- Little code restyling
2018-02-22 18:35:40 +01:00
artix
749e093591 Cluster Manager:
- Almost all Cluster Manager related code moved to
  the same section.
- Many macroes converted to functions
- Added various comments
- Little code restyling
2018-02-22 18:35:40 +01:00
artix
87f5a7c0b4 - Fixed bug in clusterManagerGetAntiAffinityScore
- Code improvements
2018-02-22 18:35:40 +01:00
artix
4a89b1b7c5 - Fixed bug in clusterManagerGetAntiAffinityScore
- Code improvements
2018-02-22 18:35:40 +01:00
artix
605d7262e6 Cluster Manager: colorized output 2018-02-22 18:35:40 +01:00
artix
2e64e25ee1 Cluster Manager: colorized output 2018-02-22 18:35:40 +01:00
artix
4ca8dbdc2b Cluster Manager: improved cleanup/error handling in various functions 2018-02-22 18:35:40 +01:00
artix
b0e77e6afa Cluster Manager: improved cleanup/error handling in various functions 2018-02-22 18:35:40 +01:00
artix
8128f1bf03 Cluster Manager: 'call' command. 2018-02-22 18:35:40 +01:00
artix
46465a7942 Cluster Manager: 'call' command. 2018-02-22 18:35:40 +01:00
artix
7b9f945b37 Cluster Manager: CLUSTER_MANAGER_NODE_CONNECT macro 2018-02-22 18:35:40 +01:00
artix
273ba95485 Cluster Manager: CLUSTER_MANAGER_NODE_CONNECT macro 2018-02-22 18:35:40 +01:00
artix
dad69ac320 ClusterManager: added replicas count to clusterManagerNode 2018-02-22 18:35:40 +01:00
artix
bb958098f4 ClusterManager: added replicas count to clusterManagerNode 2018-02-22 18:35:40 +01:00
artix
956bec4ca8 Cluster Manager: cluster is considered consistent if only one node has been found 2018-02-22 18:35:40 +01:00
artix
8251a1b736 Cluster Manager: cluster is considered consistent if only one node has been found 2018-02-22 18:35:40 +01:00
artix
1b1f80e60f Cluster Manager: reply error catch for MEET command 2018-02-22 18:35:40 +01:00
artix
c6e8eae7ae Cluster Manager: reply error catch for MEET command 2018-02-22 18:35:40 +01:00
artix
be7e2b84bd Cluster Manager: slots coverage check. 2018-02-22 18:35:40 +01:00
artix
6b13b265e4 Cluster Manager: slots coverage check. 2018-02-22 18:35:40 +01:00
artix
d38045805d - Cluster Manager: fixed various memory leaks
- Cluster Manager: fixed flags assignment in
  clusterManagerNodeLoadInfo
2018-02-22 18:35:40 +01:00
artix
7769aa0f98 - Cluster Manager: fixed various memory leaks
- Cluster Manager: fixed flags assignment in
  clusterManagerNodeLoadInfo
2018-02-22 18:35:40 +01:00
artix
74dcd14d13 Added check for open slots (clusterManagerCheckCluster) 2018-02-22 18:35:40 +01:00