SYSLOG Check Failed
I have two servers that keep giving errors regarding syslog.
[CODE=bash]Time: Mon Nov 30 01:00:13 2020 -0500
Error: Failed to detect code [UdtnxuePtqKOQsz49h] in SYSLOG_LOG [/var/log/messages]
SYSLOG may not be running correctly on host.example.com
I checked the service and it appears to be running fine except for some errors I saw in the journal: [CODE=bash][root@host ~]# grep UdtnxuePtqKOQsz49h /var/log/messages [root@host ~]# zgrep UdtnxuePtqKOQsz49h /var/log/messages-* [root@host ~]# /scripts/restartsrv_rsyslogd --check The 'rsyslog' service passed the check: rsyslog (/usr/sbin/rsyslogd -n) is running as root with PID 1516483 (systemd+/proc check method). [root@host ~]# cat /var/run/syslogd.pid 1516483 [root@host ~]# logger -p auth.notice "test123" [root@host ~]# grep "test123" /var/log/messages Nov 30 16:08:27 host centos: test123 [root@host ~]# /scripts/restartsrv_rsyslogd --status rsyslog (/usr/sbin/rsyslogd -n) is running as root with PID 1516483 (systemd+/proc check method). [root@host ~]# service rsyslog status -l Redirecting to /bin/systemctl status -l rsyslog.service ? rsyslog.service - System Logging Service Loaded: loaded (/usr/lib/systemd/system/rsyslog.service; enabled; vendor preset: enabled) Active: active (running) since Mon 2020-11-30 00:21:07 EST; 15h ago Docs: man:rsyslogd(8) http://www.rsyslog.com/doc/ Main PID: 1516483 (rsyslogd) CGroup: /system.slice/rsyslog.service ??1516483 /usr/sbin/rsyslogd -n Nov 30 01:00:00 host.example.com rsyslogd[1516483]: imjournal: begin to drop messages due to rate-limiting Nov 30 01:01:12 host.example.com rsyslogd[1516483]: imjournal: 2765 messages lost due to rate-limiting Nov 30 01:10:06 host.example.com rsyslogd[1516483]: imjournal: begin to drop messages due to rate-limiting Nov 30 01:11:13 host.example.com rsyslogd[1516483]: imjournal: 2490 messages lost due to rate-limiting Nov 30 01:20:07 host.example.com rsyslogd[1516483]: imjournal: begin to drop messages due to rate-limiting Nov 30 01:21:14 host.example.com rsyslogd[1516483]: imjournal: 2535 messages lost due to rate-limiting Nov 30 01:27:18 host.example.com rsyslogd[1516483]: imjournal: journal reloaded... [v8.24.0-57.el7_9 try http://www.rsyslog.com/e/0 ] Nov 30 01:27:18 host.example.com rsyslogd[1516483]: imjournal: journal reloaded... [v8.24.0-57.el7_9 try http://www.rsyslog.com/e/0 ] Nov 30 01:29:08 host.example.com rsyslogd[1516483]: imjournal: begin to drop messages due to rate-limiting Nov 30 01:31:16 host.example.com rsyslogd[1516483]: imjournal: 536 messages lost due to rate-limiting
So I see the notice about the rate-limiting so I checked: [CODE=bash][root@host ~]# cat /etc/systemd/journald.conf # This file is part of systemd. # # systemd is free software; you can redistribute it and/or modify it # under the terms of the GNU Lesser General Public License as published by # the Free Software Foundation; either version 2.1 of the License, or # (at your option) any later version. # # Entries in this file show the compile time defaults. # You can change settings by editing this file. # Defaults can be restored by simply deleting this file. # # See journald.conf(5) for details. [Journal] #Storage=auto #Compress=yes #Seal=yes #SplitMode=uid #SyncIntervalSec=5m #RateLimitInterval=30s #RateLimitBurst=1000 #SystemMaxUse= #SystemKeepFree= #SystemMaxFileSize= #RuntimeMaxUse= #RuntimeKeepFree= #RuntimeMaxFileSize= #MaxRetentionSec= #MaxFileSec=1month #ForwardToSyslog=yes #ForwardToKMsg=no #ForwardToConsole=no #ForwardToWall=yes #TTYPath=/dev/console #MaxLevelStore=debug #MaxLevelSyslog=debug #MaxLevelKMsg=notice #MaxLevelConsole=info #MaxLevelWall=emerg #LineMax=48K
It looks like it's not even being used. Where does cPanel control the rate-limiting? Or is there a way to check if a bigger problem is causing the issue? I suspect it's related to a server backup at night at the same time, but not sure how to work around that. Any help is appreciated.
I checked the service and it appears to be running fine except for some errors I saw in the journal: [CODE=bash][root@host ~]# grep UdtnxuePtqKOQsz49h /var/log/messages [root@host ~]# zgrep UdtnxuePtqKOQsz49h /var/log/messages-* [root@host ~]# /scripts/restartsrv_rsyslogd --check The 'rsyslog' service passed the check: rsyslog (/usr/sbin/rsyslogd -n) is running as root with PID 1516483 (systemd+/proc check method). [root@host ~]# cat /var/run/syslogd.pid 1516483 [root@host ~]# logger -p auth.notice "test123" [root@host ~]# grep "test123" /var/log/messages Nov 30 16:08:27 host centos: test123 [root@host ~]# /scripts/restartsrv_rsyslogd --status rsyslog (/usr/sbin/rsyslogd -n) is running as root with PID 1516483 (systemd+/proc check method). [root@host ~]# service rsyslog status -l Redirecting to /bin/systemctl status -l rsyslog.service ? rsyslog.service - System Logging Service Loaded: loaded (/usr/lib/systemd/system/rsyslog.service; enabled; vendor preset: enabled) Active: active (running) since Mon 2020-11-30 00:21:07 EST; 15h ago Docs: man:rsyslogd(8) http://www.rsyslog.com/doc/ Main PID: 1516483 (rsyslogd) CGroup: /system.slice/rsyslog.service ??1516483 /usr/sbin/rsyslogd -n Nov 30 01:00:00 host.example.com rsyslogd[1516483]: imjournal: begin to drop messages due to rate-limiting Nov 30 01:01:12 host.example.com rsyslogd[1516483]: imjournal: 2765 messages lost due to rate-limiting Nov 30 01:10:06 host.example.com rsyslogd[1516483]: imjournal: begin to drop messages due to rate-limiting Nov 30 01:11:13 host.example.com rsyslogd[1516483]: imjournal: 2490 messages lost due to rate-limiting Nov 30 01:20:07 host.example.com rsyslogd[1516483]: imjournal: begin to drop messages due to rate-limiting Nov 30 01:21:14 host.example.com rsyslogd[1516483]: imjournal: 2535 messages lost due to rate-limiting Nov 30 01:27:18 host.example.com rsyslogd[1516483]: imjournal: journal reloaded... [v8.24.0-57.el7_9 try http://www.rsyslog.com/e/0 ] Nov 30 01:27:18 host.example.com rsyslogd[1516483]: imjournal: journal reloaded... [v8.24.0-57.el7_9 try http://www.rsyslog.com/e/0 ] Nov 30 01:29:08 host.example.com rsyslogd[1516483]: imjournal: begin to drop messages due to rate-limiting Nov 30 01:31:16 host.example.com rsyslogd[1516483]: imjournal: 536 messages lost due to rate-limiting
So I see the notice about the rate-limiting so I checked: [CODE=bash][root@host ~]# cat /etc/systemd/journald.conf # This file is part of systemd. # # systemd is free software; you can redistribute it and/or modify it # under the terms of the GNU Lesser General Public License as published by # the Free Software Foundation; either version 2.1 of the License, or # (at your option) any later version. # # Entries in this file show the compile time defaults. # You can change settings by editing this file. # Defaults can be restored by simply deleting this file. # # See journald.conf(5) for details. [Journal] #Storage=auto #Compress=yes #Seal=yes #SplitMode=uid #SyncIntervalSec=5m #RateLimitInterval=30s #RateLimitBurst=1000 #SystemMaxUse= #SystemKeepFree= #SystemMaxFileSize= #RuntimeMaxUse= #RuntimeKeepFree= #RuntimeMaxFileSize= #MaxRetentionSec= #MaxFileSec=1month #ForwardToSyslog=yes #ForwardToKMsg=no #ForwardToConsole=no #ForwardToWall=yes #TTYPath=/dev/console #MaxLevelStore=debug #MaxLevelSyslog=debug #MaxLevelKMsg=notice #MaxLevelConsole=info #MaxLevelWall=emerg #LineMax=48K
It looks like it's not even being used. Where does cPanel control the rate-limiting? Or is there a way to check if a bigger problem is causing the issue? I suspect it's related to a server backup at night at the same time, but not sure how to work around that. Any help is appreciated.
-
I did see it in the service status journal logs. But looking in /messages specifically I see it as well: [CODE=bash][root@host ~]# grep "limiting" /var/log/messages Nov 30 00:51:11 host rsyslogd: imjournal: 3181 messages lost due to rate-limiting Nov 30 01:01:12 host rsyslogd: imjournal: 2765 messages lost due to rate-limiting Nov 30 01:11:13 host rsyslogd: imjournal: 2490 messages lost due to rate-limiting Nov 30 01:21:14 host rsyslogd: imjournal: 2535 messages lost due to rate-limiting Nov 30 01:31:16 host rsyslogd: imjournal: 536 messages lost due to rate-limiting
I looked over that link you sent a little bit, but I was curious how cPanel handles it. Does cPanel adjust the defaults in anyway? If not, is it safe to assume it's using the default of 200 messages in 5 seconds? If so, is it bad to increase this to say 1000 messages?0 -
Great - so that's what happening then. cPanel doesn't do anything with that on our side, but this just means there were too many messages to print at once due to the rate limit settings. Instead of bumping that, since you seem to be frequently hitting 2500+, it might be best to review the log to see what is writing so much data so see if it could be resolved at the source. 0 -
I see a lot of these, is this caused by backups? Not familiar with that message. That continues for each of the accounts (112) for about 30min. With about 20 a second by the looks of it, i could see how it could quickly hit that limit. I did some research and found this Logs flooded with systemd messages: Created slice, Starting Session - Red Hat Customer Portal, would you say that's a good solution for this situation? Not overly familiar with this logging so I don't want jump the gun here. [CODE=bash]Nov 30 01:01:38 host systemd: Removed slice User Slice of . Nov 30 01:01:38 host systemd: Created slice User Slice of . Nov 30 01:01:38 host systemd: Started Session c167544 of user . Nov 30 01:01:38 host systemd: Removed slice User Slice of . Nov 30 01:01:38 host systemd: Created slice User Slice of . Nov 30 01:01:38 host systemd: Started Session c167545 of user . Nov 30 01:01:38 host systemd: Removed slice User Slice of . Nov 30 01:01:38 host systemd: Created slice User Slice of . Nov 30 01:01:38 host systemd: Started Session c167546 of user . Nov 30 01:01:38 host systemd: Removed slice User Slice of . Nov 30 01:01:38 host systemd: Created slice User Slice of . Nov 30 01:01:38 host systemd: Started Session c167547 of user . Nov 30 01:01:38 host systemd: Removed slice User Slice of . Nov 30 01:01:38 host systemd: Created slice User Slice of . Nov 30 01:01:38 host systemd: Started Session c167548 of user . Nov 30 01:01:38 host systemd: Removed slice User Slice of . Nov 30 01:01:38 host systemd: Created slice User Slice of . Nov 30 01:01:38 host systemd: Started Session c167549 of user . Nov 30 01:01:38 host systemd: Removed slice User Slice of . Nov 30 01:01:38 host systemd: Created slice User Slice of . Nov 30 01:01:38 host systemd: Started Session c167550 of user . 0 -
I'd try it - it's normal behavior, and not related to cPanel, so trying that filter how they recommend would also be my best recommendation. 0 -
For future reference, I added this to one of my servers to test. We'll see if this fixes the issue. [CODE=bash]echo 'if $programname == "systemd" and ($msg contains "Starting Session" or $msg contains "Started Session" or $msg contains "Created slice" or $msg contains "Starting user-" or $msg contains "Starting User Slice of" or $msg contains "Removed session" or $msg contains "Removed slice User Slice of" or $msg contains "Stopping User Slice of" or $msg contains "Removed slice" or $msg contains "Stopping user-") then stop' >/etc/rsyslog.d/ignore-systemd-session-slice.conf systemctl restart rsyslog 0 -
I think it helped but it didn't fix the issue. I still got the SYSLOG notice this morning. I looked at the looked at the service log and see this: [CODE=bash]Dec 02 01:10:47 host.example.com rsyslogd[697298]: imjournal: begin to drop messages due to rate-limiting Dec 02 01:11:32 host.example.com rsyslogd[697298]: imjournal: 1569 messages lost due to rate-limiting Dec 02 01:21:24 host.example.com rsyslogd[697298]: imjournal: begin to drop messages due to rate-limiting Dec 02 01:21:33 host.example.com rsyslogd[697298]: imjournal: 316 messages lost due to rate-limiting Dec 02 01:30:04 host.example.com rsyslogd[697298]: imjournal: begin to drop messages due to rate-limiting Dec 02 01:31:37 host.example.com rsyslogd[697298]: imjournal: 559 messages lost due to rate-limiting
[CODE=bash][root@host ~]# grep "limiting" /var/log/messages Dec 2 00:51:29 host rsyslogd: imjournal: 3525 messages lost due to rate-limiting Dec 2 01:01:30 host rsyslogd: imjournal: 2200 messages lost due to rate-limiting Dec 2 01:11:32 host rsyslogd: imjournal: 1569 messages lost due to rate-limiting Dec 2 01:21:33 host rsyslogd: imjournal: 316 messages lost due to rate-limiting Dec 2 01:31:37 host rsyslogd: imjournal: 559 messages lost due to rate-limiting
I checked /var/log/messages again and counted the entries between 01:10 and 01:11 and there was only 388. I checked the conf files again and don't see anything explicitly set. But I also can't figure out what the limit actual is. Can you help there? [CODE=bash] [root@host ~]# grep -i rate /etc/systemd/journald.conf #RateLimitInterval=30s #RateLimitBurst=1000 [root@host ~]# grep -i rate /etc/rsyslog.conf0 -
I would also expect those values to be controlled as part of the /etc/systemd/journald.conf file with the two values you provided in that last reply. Setting those to 0 would disable the values, so it might be best to uncomment those if you wanted to adjust the values. 0
Please sign in to leave a comment.
Comments
8 comments