Filter UTF-8 Subject
I have been receiving a lot of spam with special characters in the subject. In every message, the Subject or From header looks like this:
From: =?UTF-8?B?4oCM4oGg4oCL4oCO4oGg4oCO4oCO4oCq4oCL4oCtVGFsYy5Jbmp1cnlNYXRjaC5jb20gQWTvu7/vu7/igI3igIzigI7igK3igI7igI7igI7igaA=?=
Subject: =?UTF-8?B?4oCsVO+7v+KAjmhlIEJh77u/YnkgUOKArOKAje+/um93ZGXvv7py4oCt4oCNIEPigI7igIxhbmNlciBD4oCs4oGgb25jZXLigIxu?=
I want to create a mail filter to detect these. I tried the following but it didn't work and I can't figure out why:
Any Header matches regex
=\?UTF-8\?B\?
But when I test with the headers shown above, I get the result:
Condition is false: $message_headers matches =\\?UTF-8\\?B\\?
Filtering did not set up a significant delivery.
Normal delivery will occur.
Thanks for any advice on how I can make this work...
What about "contains" rather than "matches regex"?
Thanks -- same result with "contains" or "begins with". :(
When you tried "contains", did you use =\?UTF-8\?B\? or =?UTF-8?B? (without the backslashes)?
I had it without backslashes originally when I used "contains" and "begins with" -- I only added the backslashes when I tried it as a regex. Here's something interesting that may help: if I change the regex to match "?UTF-8?B?" -- i.e., without the leading equals sign (=) -- and then change my test content so that the Subject header has no leading equals sign, for example:
From: =?UTF-8?B?4oCt4oCO4oCs77u/77+677+64oCN4oGg4oCN4oCqSG9tZSBJbnN1cmFuY2UgQ29ubmVjdOKAjuKArOKAi+KBoOKAre+7v+KAjeKAje+/uuKArA==?=
Subject: ?UTF-8?B?Q2hlY2sg4oCt77+6Zm/vu79yIGxv4oCLd2VyIOKAjkhvbeKAjGXigIsgaW5zdeKAjnJh4oCNbmPigIxlIOKAi3JhdGXigI5z4oCtIHRvZOKAquKBoGF5?=
then it matches and sets up delivery to /dev/null (discard rule):
Match expanded arguments: Subject = ?????????Home Insurance Connect????????
Condition is false: $h_X-Spam-Bar: contains ++++++++++
Condition is true: $message_headers contains ?UTF-8?B?
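For anyone wondering where the question marks in that test output come from: the text between =?UTF-8?B? and ?= is Base64-encoded UTF-8 (an RFC 2047 encoded-word), and decoding it shows the sender name padded with invisible formatting characters. Here is a rough Python sketch that decodes the From header quoted above using only the standard library. The helper names decode_header_value and split_invisible are made up for this example, and treating Unicode category "Cf" as "invisible" is a simplification of mine, not something the mail server is known to do.

```python
import unicodedata
from email.header import decode_header, make_header

# From header value quoted in the comment above (split only to keep the line short).
RAW_FROM = (
    "=?UTF-8?B?4oCt4oCO4oCs77u/77+677+64oCN4oGg4oCN4oCqSG9tZSBJbnN1cmFuY2Ug"
    "Q29ubmVjdOKAjuKArOKAi+KBoOKAre+7v+KAjeKAje+/uuKArA==?="
)


def decode_header_value(raw: str) -> str:
    """Decode RFC 2047 encoded-words in a header value into plain text."""
    return str(make_header(decode_header(raw)))


def split_invisible(text: str):
    """Separate visible text from format characters (Unicode category Cf).

    Hypothetical helper: "Cf" covers zero-width and directional marks,
    which is an assumption about what counts as padding here.
    """
    visible = "".join(ch for ch in text if unicodedata.category(ch) != "Cf")
    hidden = [f"U+{ord(ch):04X}" for ch in text if unicodedata.category(ch) == "Cf"]
    return visible, hidden


if __name__ == "__main__":
    decoded = decode_header_value(RAW_FROM)
    visible, hidden = split_invisible(decoded)
    print("visible text:", repr(visible))
    print("invisible characters:", len(hidden), sorted(set(hidden)))
```

Note that the decoded text no longer contains the =?UTF-8?B? prefix at all, which matters if the filter is comparing against decoded headers.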
I don't profess to understand regex, and I'm by no means an expert on this subject. However, based on what you just said, could it be that the phrase 'From: =' is not being read by the rule or regex, and the phrase you actually need is just 'UTF-8?B?'?
I doubt the "=" was being read the way it was entered, so I'd assume that's why you were able to get a match on the subject once it was removed. I'd second @keat63 here and skip the "=" for the UTF-8 section.
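One way to check whether the "=" itself or header decoding is behind this is to run the same pattern over both the raw header and its decoded form. The sketch below uses Python's re module, where a bare "=" is an ordinary character; it is not the mail server's own matcher. It assumes the server decodes encoded-words before comparing, which the "Match expanded arguments" output above hints at but the thread does not confirm. The sample header is built in the code (hypothetical text, not taken from the spam), and decode_value is a made-up helper name.

```python
import base64
import re
from email.header import decode_header, make_header

PATTERN = re.compile(r"=\?UTF-8\?B\?")        # regex from the original post
PATTERN_NO_EQ = re.compile(r"\?UTF-8\?B\?")   # variant without the leading "="


def decode_value(raw: str) -> str:
    """Decode RFC 2047 encoded-words; text that is not a valid encoded-word is kept as-is."""
    return str(make_header(decode_header(raw)))


# Hypothetical sample: visible text padded with zero-width characters.
payload = base64.b64encode("\u200bCheck\u200c rates".encode("utf-8")).decode("ascii")
raw_ok = f"=?UTF-8?B?{payload}?="      # well-formed encoded-word
raw_broken = f"?UTF-8?B?{payload}?="   # leading "=" removed, as in the test above

if __name__ == "__main__":
    # The pattern finds the prefix in the raw header text...
    print("raw, with '=':", bool(PATTERN.search(raw_ok)))
    # ...but once the encoded-word is decoded, the prefix is gone.
    print("decoded, with '=':", bool(PATTERN.search(decode_value(raw_ok))))
    # Without the "=", the value is not a valid encoded-word, so it stays literal
    # and the shorter pattern can match it.
    print("decoded, without '=':", bool(PATTERN_NO_EQ.search(decode_value(raw_broken))))
```

If the server does behave this way, the pattern would need to be tested against undecoded headers. Exim's documentation describes a $message_headers_raw variable for that, though whether the filter interface here exposes it is something to check rather than assume.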