Exim not fully terminating DATA command, resulting in SMTP or LMTP "timeout after data"
I have some customers who forward their incoming email from an alias to gmail, which has worked in the past, but since release 62 something has changed.
The exact scenario is that the email is sent to an address at a domain hosted by us, but it is not a mailbox, just a direct forward to a gmail account. When sending to gmail, we see timeouts like this:
2017-02-09 03:55:04 1cbJVD-0004zc-1T == user@gmail.com (user@domainhostedbyme.com) R=lookuphost T=remote_smtp defer (110): Connection timed out H=alt4.gmail-smtp-in.l.google.com [74.125.192.26]: SMTP timeout after end of data (69349 bytes written)
If I manually try to deliver the email in Mail Queue Manager, I see this:
LOG: MAIN
cwd=/usr/local/cpanel/whostmgr/docroot 6 args: /usr/sbin/exim -C /etc/exim_outgoing.conf -v -M 1cbJVD-0004zc-1T
delivering 1cbJVD-0004zc-1T
LOG: MAIN
SMTP connection identification D=domainhostedbyme.com O=user@domainhostedbyme.com E=user@gmail.com M=1cbJVD-0004zc-1T U=username ID=559 B=redirect_resolver
LOG: MAIN
=> discarded R=has_alias_but_no_mailbox_discarded_to_prevent_loop
Connecting to gmail-smtp-in.l.google.com [64.233.188.26]:25 from myserver.ip ... connected
SMTP<< 220 mx.google.com ESMTP f4si8166997pgc.224 - gsmtp
SMTP>> EHLO myserver.com
SMTP<< 250-mx.google.com at your service, [myserver.ip]
250-SIZE 157286400
250-8BITMIME
250-STARTTLS
250-ENHANCEDSTATUSCODES
250-PIPELINING
250-CHUNKING
250 SMTPUTF8
SMTP>> STARTTLS
SMTP<< 220 2.0.0 Ready to start TLS
SMTP>> EHLO myserver.com
SMTP<< 250-mx.google.com at your service, [myserver.ip]
250-SIZE 157286400
250-8BITMIME
250-ENHANCEDSTATUSCODES
250-PIPELINING
250-CHUNKING
250 SMTPUTF8
SMTP>> MAIL FROM: SIZE=69814
SMTP>> RCPT TO:
will write message using CHUNKING
SMTP>> BDAT 3740
SMTP<< 250 2.1.0 OK f4si8166997pgc.224 - gsmtp
SMTP<< 250 2.1.5 OK f4si8166997pgc.224 - gsmtp
SMTP<< 250 2.0.0 OK f4si8166997pgc.224 - gsmtp
SMTP>> BDAT 65624 LAST
LOG: MAIN
SMTP timeout after end of data (69347 bytes written): Connection timed out
SMTP(close)>>
Nothing happens at all after that last BDAT command. It just times out. It seems like something is blocking the data from going out, or an ack coming back. I do use mailscanner and csf, but I've disabled both of those and still can't send successfully. I've tried enabling SRS, but that didn't work. I don't use DKIM verification at all. This only happens for users who forward to gmail from an alias. All other email to and from gmail works fine. I'm out of ideas honestly. I think it might have something to do with exim's CHUNKING support, but I can't seem to disable that in the exim configuration manager. Is there a way to disable that so I can confirm if that's the problem?
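To get a feel for how widespread these timeouts are, a rough shell sketch like the one below tallies the "SMTP timeout after end of data" defers per remote host. The function name count_data_timeouts is made up for illustration, and it relies only on the log line format shown above (an H=hostname field on the same line as the timeout text).

```shell
# Hypothetical helper: count "SMTP timeout after end of data" defers per host.
# Argument: path to an Exim main log file.
count_data_timeouts() {
  awk '/SMTP timeout after end of data/ {
         # The remote host appears as a field of the form H=hostname
         for (i = 1; i <= NF; i++)
           if ($i ~ /^H=/) { host = substr($i, 3); n[host]++ }
       }
       END { for (h in n) print n[h], h }' "$1" | sort -rn
}
```

It could be run as `count_data_timeouts /var/log/exim_mainlog`, where the path is an assumption (only the file name exim_mainlog comes up in this thread). If the list is dominated by a handful of large providers, that narrows down where to look.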
You might want to look up the emails here for more clues: WebHost Manager >> Email >> Mail Delivery Reports
Unfortunately that tells me exactly the same as I posted above.
Looks like this might be a bug in exim 4.88 which introduced chunking. 4.89 has some fixes for it, but not 100% sure if they will fix the problem.
Hello @Recifier,
The issue you have described appears related to the following bug with Exim:
Bug 1974: Handle missing final CRLF in chunked/BDAT transferred messages, resulting in a missing final LF in spool files
Essentially, Exim over-reports the number of bytes it will send in the BDAT chunk to the remote server. When fewer bytes are actually transferred, the remote server times out waiting for the additional bytes that Exim declared.
Internal case CPANEL-11193 is open to determine how best to address this Exim bug and alleviate the issue for customers on cPanel servers. It's likely we'll need to await an official patch from Exim and then include it with the version of Exim published with cPanel.
In the meantime, the current workaround is to disable chunking support by browsing to "WHM >> Exim Configuration Manager >> Advanced Editor", clicking on the "Add Additional Configuration Setting" button, and adding the following value:
chunking_advertise_hosts = nonexistent.tld
Note that this will not fix any existing messages affected by this issue that are in the mail queue, but it should address the issue for any new emails handled by Exim after making this change. I'll update this thread with more information on the status of internal case CPANEL-11193 as it becomes available.
Thank you.
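As a rough way to look at the byte-count mismatch described here, the sketch below adds up the sizes declared in the BDAT commands of a verbose delivery transcript and prints them next to the "bytes written" figure from the timeout line. The function name bdat_vs_written is hypothetical, and whether the "bytes written" counter is directly comparable to the BDAT sizes is an assumption; this thread does not say what that counter includes, so treat the result as a hint only.

```shell
# Hypothetical helper: compare declared BDAT sizes with "bytes written".
# Argument: file holding the output of a verbose delivery (exim -v -M <id>).
bdat_vs_written() {
  awk '
    # "SMTP>> BDAT 3740" or "SMTP>> BDAT 65624 LAST": third field is the size
    /SMTP>> BDAT [0-9]+/ { declared += $3 }
    # "... (69347 bytes written) ...": take the number after the parenthesis
    match($0, /\([0-9]+ bytes written\)/) {
      written = substr($0, RSTART + 1) + 0
    }
    END { printf "declared=%d written=%d\n", declared, written }
  ' "$1"
}
```

Fed the transcript from the first post, this prints `declared=69364 written=69347` (3740 + 65624 bytes declared against 69347 reported as written), which is at least consistent with the over-reporting explanation.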
Thanks @cPanelMichael and thanks for updating the title to something more sensible too :)
Adding chunking_advertise_hosts = nonexistent.tld wasn't enough to stop the problem in my case. I also had to add hosts_try_chunking = nonexistent.tld
to the smtp transports in exim.conf. That seems to have solved the issue so far.
I too have had this problem on multiple servers since the upgrade to cPanel 62.0.8. The cPanel team has worked on my ticket and seems to have classified it as bug CPANEL-11193. All I can add from observation is that it appears to only affect forwards to servers that support chunking, most notably gmail.com and outlook.com. Additionally, the failing messages are typically commercial mailings with multi-part MIME. I've only seen one "personal" email fail forwarding. cPanel has raised some suspicions over MailScanner's processing, but I'm very reluctant to remove that. It might be helpful for them to know if you're having this same problem, whether you're using MailScanner or not, and the nature of the failed messages as best you can tell.
Noted this behavior occurring a few weeks ago, glad to see it is being addressed.
It appears that adding hosts_try_chunking = nonexistent.tld to the smtp transports in exim.conf didn't fix the issue either. New messages coming in are still trying to use chunking and stalling.
hosts_try_chunking has worked for me and according to the cpanel support I received:
"That option can't be added to the main section of the Exim configuration in the WHM editor, and we do not offer the ability to modify the existing SMTP transports in the WHM editor. The only semi-permanent way to add that to the smtp transports is to add them to /usr/local/cpanel/etc/exim/replacecf/dkim/remote_smtp. This is temporary because a future WHM update could replace this file (if it changes, otherwise it won't be touched) and remove those options again."
Note the entries were added after the remote_smtp and the dkim_remote_smtp sections.
"After that is changed:
/scripts/buildeximconf
/scripts/restartsrv_exim
There are 2 smtp transports in the configuration so it now shows twice:
[15:10:04 host2 root@8220035 ~]cPs# grep hosts_try_chunking /etc/exim.conf
hosts_try_chunking = nonexistent.tld
hosts_try_chunking = nonexistent.tld"
Hope this helps as much as it has helped me. I had 200-300 messages back up on servers and they were all cleared out in less than an hour after the fix. I haven't seen any more stalled forwarders since.
Added hosts_try_chunking = nonexistent.tld
to smtp transports in exim_outgoing.conf and that is looking better.
Excellent work! That's fixed it for me too.
I missed this earlier... that fix is much better than mine, thanks! While my changes did work, they would be wiped out when exim confs are regenerated, so it's much better to use the "cpanel approved" method, even if it is only semi-permanent.
I want to confirm that doing everything described in this thread is working like a charm for me. Here is a summary of how to do it:
1. Exim Configuration Manager -> Advanced Editor -> Add Additional Configuration Setting:
chunking_advertise_hosts = nonexistent.tld
2. Add to /usr/local/cpanel/etc/exim/replacecf/dkim/remote_smtp:
2.a after the remote_smtp section:
hosts_try_chunking = nonexistent.tld
2.b after the dkim_remote_smtp section:
hosts_try_chunking = nonexistent.tld
3. Restart services:
/scripts/buildeximconf
/scripts/restartsrv_exim
4. Check everything is fine:
grep hosts_try_chunking /etc/exim.conf
hosts_try_chunking = nonexistent.tld
hosts_try_chunking = nonexistent.tld
I want to thank everyone who contributed to this thread. I had thousands of emails stuck on my servers, and the person who handled my support ticket at cPanel wanted me to uninstall MailScanner completely before he would help me, which I was not willing to do.
Regards, Sergio
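The grep in step 4 only counts matching lines; it does not show which transport each line belongs to. A slightly stricter check, sketched below, lists any smtp transport in the built config that still lacks a hosts_try_chunking line. The function name transports_missing_option is hypothetical, and the parsing assumes the usual Exim layout: a "begin transports" section, unindented "name:" headers, and option lines such as "driver = smtp" on a single line each.

```shell
# Hypothetical helper: print smtp transports without hosts_try_chunking.
# Argument: path to an Exim configuration file.
transports_missing_option() {
  awk '
    # Print the transport collected so far if it is smtp and lacks the option
    function report() {
      if (name != "" && is_smtp && !has_opt) print name
    }
    # A "begin <section>" line closes the current transport and section
    /^begin[ \t]+/ {
      report(); name = ""
      in_transports = ($2 == "transports")
      next
    }
    !in_transports { next }
    # An unindented "name:" line starts a new transport definition
    /^[A-Za-z0-9_]+:[ \t]*$/ {
      report()
      name = $1; sub(/:$/, "", name)
      is_smtp = 0; has_opt = 0
      next
    }
    /^[ \t]*driver[ \t]*=[ \t]*smtp[ \t]*$/ { is_smtp = 1 }
    /^[ \t]*hosts_try_chunking[ \t]*=/      { has_opt = 1 }
    END { report() }
  ' "$1"
}
```

Running `transports_missing_option /etc/exim.conf` after the rebuild should print nothing if both remote_smtp and dkim_remote_smtp picked up the setting.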
I had already tried uninstalling mailscanner on one of my servers to see if it was the culprit and it didn't change anything.
I contacted ConfigServer and told them what cPanel support wanted me to do with MailScanner, and said it was not a MailScanner issue, as it only appeared after the upgrade to 60.0.10. I tried to find a solution myself, but then Sarah told me about this thread, and thanks to you all my servers are now working great. From 500+ emails stuck in the queue with the same issue, everything is now back to normal.
Hello, Internal case CPANEL-11193 was included in cPanel version 62.0.14: Fixed case CPANEL-11193: Update exim to 4.88-3.cp1162. Additionally, internal case CPANEL-11327 was included in cPanel version 62.0.15: Fixed case CPANEL-11327: Disable CHUNKING in exim. Let us know if any additional issues persist once your system is updated to cPanel version 62.0.15 or newer. More information on how new builds are published is available on our
When is this going to be available on the RELEASE Tier?
62.0.15 is now published to the "Release" build tier. Thank you.
Thanks, I have updated all my servers.
Thanks @cPanelMichael for the fast turnaround. I've updated all my servers too and will monitor closely.
After we update to ver 62.0.15, do the lines added to Exim and remote_smtp have to be deleted?
chunking_advertise_hosts = nonexistent.tld
hosts_try_chunking = nonexistent.tld
I am asking because this now appears for every message logged in exim_mainlog, for example:
2017-02-17 10:16:01 SMTP connection from [xxx.xxx.xxx.xxx]:49908 (TCP/IP connection count = 4)
2017-02-17 10:16:02 no IP address found for host nonexistent.tld (during SMTP connection from (account) [xxx.xxx.xxx.xxx]:49908)
2017-02-17 10:16:02 H=(account) [xxx.xxx.xxx.xxx]:49908 Warning: Sender rate 36.0 / 1h
2017-02-17 10:16:02 SMTP call from (account) [xxx.xxx.xxx.xxx]:49908 dropped: too many syntax or protocol errors (last command was "RCPT TO: <'some@domain.com'>")
I have checked the remote_smtp lines and they have now changed to: hosts_try_chunking = 198.51.100.1
I recommend reverting any workarounds put in place, as the changes from these cases should address the issue. Thank you.
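When reverting, one quick way to find leftover workaround lines is to search the files mentioned in this thread for the nonexistent.tld placeholder, since a leftover entry is what produces the "no IP address found for host nonexistent.tld" log noise quoted above. The sketch below is only that, a sketch: the function name find_chunking_workarounds is hypothetical, and it relies on grep -H, which GNU and BSD grep both support.

```shell
# Hypothetical helper: show workaround lines still pointing at nonexistent.tld.
# Arguments: one or more configuration files to search.
# Prints file:line:content for each hit; exits non-zero when nothing is found.
find_chunking_workarounds() {
  grep -Hn -E \
    '^[[:space:]]*(chunking_advertise_hosts|hosts_try_chunking)[[:space:]]*=[[:space:]]*nonexistent\.tld' \
    "$@"
}
```

For example: `find_chunking_workarounds /etc/exim.conf /usr/local/cpanel/etc/exim/replacecf/dkim/remote_smtp`. Any line it prints is a candidate for removal, followed by /scripts/buildeximconf and /scripts/restartsrv_exim as described earlier in the thread.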