Symptoms
The chkservd process opens test sessions with Dovecot to verify that the service (and its related protocols) is online. However, if the 'Maximum Number of Mail Processes' limit (set on WHM's Mailserver Configuration page) has been reached, the chkservd check fails with an error similar to the following:
lmtp [Service check failed to complete
Unable to connect to unix socket /var/run/dovecot/lmtp: Connection refused: Died[check command:+][socket connect:-][socket failure threshold:1/3]]..
Description
To determine whether this scenario is occurring, compare the number of active Dovecot processes to the configured process limit. The process limit can be found on WHM's Mailserver Configuration page, or in the output of the following command:
grep -A1 "new users aren't allowed to log in" /etc/dovecot/dovecot.conf | head -2
The number of active Dovecot processes can be found using a pgrep command similar to the following:
pgrep -c dovecot
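The comparison described above can be sketched as a small shell helper. This is only an illustration: the function name dovecot_limit_state, the 90% warning threshold, and the example limit of 512 are assumptions, not values from cPanel or Dovecot. Supply the limit shown on WHM's Mailserver Configuration page.

```shell
#!/bin/sh
# Hypothetical helper (not part of cPanel or Dovecot): report how close a
# process count is to a configured limit.
#   $1 = current process count
#   $2 = configured process limit
#   $3 = warning threshold as a percentage of the limit (assumed default: 90)
dovecot_limit_state() {
    count=$1
    limit=$2
    threshold=${3:-90}

    # Refuse a zero or negative limit to avoid dividing by zero below.
    if [ "$limit" -le 0 ]; then
        echo "invalid limit: $limit" >&2
        return 1
    fi

    # Integer arithmetic: usage as a whole-number percentage of the limit.
    pct=$(( count * 100 / limit ))

    if [ "$count" -ge "$limit" ]; then
        echo "at limit (${pct}%)"
    elif [ "$pct" -ge "$threshold" ]; then
        echo "near limit (${pct}%)"
    else
        echo "ok (${pct}%)"
    fi
}

# Example usage (512 is an illustrative limit, not a recommended value):
#   dovecot_limit_state "$(pgrep -c dovecot)" 512
```

Running the helper a few times over several minutes gives a better picture than a single sample, since the process count fluctuates with mail client activity.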
Workaround
If the number of active Dovecot processes is at or hovering near the configured process limit, then this scenario is the most likely cause of the false-positive service failure reports.
To resolve this issue, the administrator can either address the users that are opening a large number of Dovecot processes, or increase the 'Maximum Number of Mail Processes' limit to accommodate the increased traffic. In the former case, high-usage users can be identified using a command similar to the following:
ps aux | awk '/pop3/ || /imap/ {print $1}' | sort | uniq -c | sort -rn | head
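As a variation on the command above, the following sketch lists only the users whose pop3/imap process count meets a minimum. The function name heavy_mail_users and the default minimum of 10 are assumptions chosen for illustration. The helper reads process listings on standard input and assumes the user name is the first column, as in ps aux output.

```shell
#!/bin/sh
# Hypothetical helper: read process listing lines on stdin (user name in the
# first column) and print "user count" for each user with at least $1
# pop3/imap processes. The default minimum of 10 is an arbitrary assumption.
heavy_mail_users() {
    min=${1:-10}
    # Keep the owning user of every line that mentions pop3 or imap,
    # count occurrences per user, and sort with the busiest user first.
    awk '/pop3|imap/ {print $1}' | sort | uniq -c | sort -rn |
        # uniq -c prints "count user"; keep rows at or above the minimum.
        awk -v min="$min" '$1 >= min {print $2, $1}'
}

# Example usage:
#   ps aux | heavy_mail_users 20
```

Because the filter matches any line containing pop3 or imap, unrelated processes with those strings in their command line are counted as well, so treat the output as a starting point for investigation.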