Fix petential cluster link error.

Funcion adjustOpenFilesLimit() has an implicit parameter, which is server.maxclients.
This function aims to ajust maximum file descriptor number according to server.maxclients
by best effort, which is "bestlimit" could be lower than "maxfiles" but greater than "oldlimit".
When we try to increase "maxclients" using CONFIG SET command, we could increase maximum
file descriptor number to a bigger value without calling aeResizeSetSize the same time.
When later more and more clients connect to server, the allocated fd could be bigger and bigger,
and eventually exceeds events size of aeEventLoop.events. When new nodes joins the cluster,
new link is created, together with new fd, but when calling aeCreateFileEvent, we did not
check the return value. In this case, we have a non-null "link" but the associated fd is not
registered.

So when we dynamically set "maxclients" we could reach an inconsistency between maximum file
descriptor number of the process and server.maxclients. And later could cause cluster link and link
fd inconsistency.

While setting "maxclients" dynamically, we consider it as failed when resulting "maxclients" is not
the same as expected. We try to restore back the maximum file descriptor number when we failed to set
"maxclients" to the specified value, so that server.maxclients could act as a guard as before.
This commit is contained in:
WuYunlong 2019-12-31 18:15:21 +08:00
parent 6e4f70b817
commit 0992ada2fe

View File

@ -2098,6 +2098,10 @@ static int updateMaxclients(long long val, long long prev, char **err) {
static char msg[128];
sprintf(msg, "The operating system is not able to handle the specified number of clients, try with %d", server.maxclients);
*err = msg;
if (server.maxclients > prev) {
server.maxclients = prev;
adjustOpenFilesLimit();
}
return 0;
}
if ((unsigned int) aeGetSetSize(server.el) <