[Bug 1859011] [NEW] bounce mail processed multiple times -> oom crash of BounceRunner
Public bug reported: System: RHEL 7.7 Mailman: mailman-2.1.15-26.el7_4.1.x86_64 Description: We have 2 Mailinglist with > 7000 Members which trigger a spike in processed bounces (more than 3000 in an sinle run), followed by an Out of Memory situation in the BounceRunner and an > 20 GB bounce-event-XXXX.pck file. We tried to mitigate the problem by increasing the system memory, running the BoucneRunner every minute and limiting the number of mails delivered at ounce by postfix. But it happened again: Dec 30 19:53:29 2019 (13392) <BounceRunner at 140395473885088> processing 4134 queued bounces ... Dec 30 19:53:59 mx09 kernel: [13392] 41 13392 2755797 1874474 5337 825695 0 python ... Dec 30 19:53:59 mx09 kernel: Out of memory: Kill process 13392 (python) score 896 or sacrifice child Dec 30 19:53:59 mx09 kernel: Killed process 13392 (python), UID 41, total-vm:11023188kB, anon-rss:7497896kB, file-rss:0kB, shmem-rss:0kB We analyzed the bounce-event file, extracting data with "stings". This time we extracted the postfix mail queue ids from the received headers with our listserver. We found the following: cat /tmp/bounce-20191230.txt | sed 's/;//' | sort | uniq -c | sort -n 1 01A7DE9314 1 10F6AE9319 1 18456E930E 1 27D0BAC960 1 3B51CE9316 1 57C2DAC992 1 5D3B2E9310 1 5EF11E9311 1 63054E9312 1 69377E9313 1 ED636E930F 2 29884E9315 2 49ECEAC98D 2 99A16A9DA7 2 (Postfix) 3 59EE1AC995 3 CEB61AC996 192 C12E9D3B48 3929 F2BEEE9318 4122 CC58AAC993 4134 6EFD1A9DA7 As the bounces the last 2 qids are from the original mail send to the list (6EFD1A9DA7), and one of mails send by mailman to 500 members of that list (CC58AAC993). (F2BEEE9318 and C12E9D3B48) are both bounces from members that where deliverd only once to /usr/lib/mailman/mail/mailman bounces NAME-OF-HUGE-LIST So from me it seems that somehow some few bounce where "multiplied" so that ~20 real bounce produced 4134 virtual bounce. I see the potential of a deny of service attack, as it could be used to fill up the disk where the bounce-event files get dumped But I don't know if this would warrant marking "This bug is a security vulnerability". ** Affects: mailman Importance: Undecided Status: New -- You received this bug notification because you are a member of Mailman Coders, which is subscribed to GNU Mailman. https://bugs.launchpad.net/bugs/1859011 Title: bounce mail processed multiple times -> oom crash of BounceRunner To manage notifications about this bug go to: https://bugs.launchpad.net/mailman/+bug/1859011/+subscriptions
We need more information to know what's happening. You say the MTA log shows only one delivery for, e.g., F2BEEE9318, to /usr/lib/mailman/mail/mailman bounces NAME-OF-HUGE-LIST yet strings on the bounce event file finds F2BEEE9318 a total of 3929 times. Can you examine the bounce-events-ppppp.pck file with Mailman's bin/dumpdb and see what the actual bounce messages are and where they all come from? ** Changed in: mailman Status: New => Incomplete -- You received this bug notification because you are a member of Mailman Coders, which is subscribed to GNU Mailman. https://bugs.launchpad.net/bugs/1859011 Title: bounce mail processed multiple times -> oom crash of BounceRunner To manage notifications about this bug go to: https://bugs.launchpad.net/mailman/+bug/1859011/+subscriptions
If it does appear that Mailman is somehow generating these multiple bounce-events entries, you might compare your BounceRunner.py to the 2.1.15 base at https://bazaar.launchpad.net/~mailman- coders/mailman/2.1/view/1357/Mailman/Queue/BounceRunner.py to see if there might be some corruption. -- You received this bug notification because you are a member of Mailman Coders, which is subscribed to GNU Mailman. https://bugs.launchpad.net/bugs/1859011 Title: bounce mail processed multiple times -> oom crash of BounceRunner To manage notifications about this bug go to: https://bugs.launchpad.net/mailman/+bug/1859011/+subscriptions
Thanks for the hint to the dumpdb command. though the command was killed by the system, it extracted sufficient data, and I was able to extract some useful informations. I did split the output into files for each object (csplit '/^<----- start object/' '{*}') 1. one bounced mail (qid F2BEEE9318) was multiple times in the bonce-event-13392.pck 2. the dumped objects for the duplicated bounce differ only in the 1st and 3rd line, first line beeing '<----- start object' and second line been the mailinglist 3. I guess the detected mail-address should be in the third line. For the first instance of the duplicated mail this was '\x00', for the other objects it seems to contain "random" data e.g. '\x00', '\x16\xc9\xeaNq\xf8\xfbU\xe9\xa6\xbd\xbc\x92\x19\xaf\xee.o.\xa3M\x82i\x98g\xaep=\xb3\\\x94S_[\x01', 'y\x9d\xf5O\x88\x05O\x88\x05O\xb0r/+\xbfQ\xd4<\xb2\xd7\xd2\xec\xf7E\xda\x1f\x86\xa7e\xb0\xa7e\xb0\xa52\xd8\xd3\xf6\xe6\x16{s\x8b\x0c\xb6\xd4\xfe|\xd4\xfe\xdcb\x7f', "\x87\xe5\xc7\xe3\xaa\xd2\xc5\xff\x8f\xb6\xbe\x15\xe2\x9b\xb8b\x8a\xa0\x842\xa8\x87\t0\xe5a\xd0\xb1\x9bqY\x1d\xc6\x9e\xddYY\x19\x93\xa7|\x14\x83\x02\xd8\xd9\xb3\x10\x83\xe6\xbb\xd9\x1f'\x13tVk\xb9\xb3Dq\x85\xd8`\xaa-W^!L\x81\xf2\xdew\xde~\x1a\x1f\xcf\xc7\x96\xf9\x9eg\xbe\xb7{^\xe91\xf6", 4. The envelope-from of the bounce mail did contain '=' 2 times prvs=126700a846=ndr@some.domain while the from address is ndr@some.domain 5. the bounced mail included the complete original mail with pdf attachment ~5 MB There is no change in my BounceRunner.py compared to the linked version. -- You received this bug notification because you are a member of Mailman Coders, which is subscribed to GNU Mailman. https://bugs.launchpad.net/bugs/1859011 Title: bounce mail processed multiple times -> oom crash of BounceRunner To manage notifications about this bug go to: https://bugs.launchpad.net/mailman/+bug/1859011/+subscriptions
Each object in the bounce events is a 4-tuple like (list-name, bouncing address, bounce date, bounce message). The bounce message is a Mailman.Message.Message object, but should be and apparently is rendered as the text of the message. So yes, the third line is the bouncing addresses, and it is clearly corrupted, but I suspect that is a symptom, not a cause. I still don't really understand what's happening. Given that the message contained a large PDF attachment, I suspect this isn't a real bounce, but rather is spam sent to the list-bounces address or possibly a list post mis-sent to the list-bounces address. The size of it may have something to do with the corruption that occurred, although I still don't understand how the duplication comes about. What's in Mailman's bounce log from the time this happened. I'm particularly interested in any 'bounce message w/no discernable addresses' or other messages just before the "Dec 30 19:53:29 2019 (13392) <BounceRunner at 140395473885088> processing 4134 queued bounces" message. Also are there any "VERP_PROBE_REGEXP doesn't yield the right match groups:" messages in Mailman's error log from this time. -- You received this bug notification because you are a member of Mailman Coders, which is subscribed to GNU Mailman. https://bugs.launchpad.net/bugs/1859011 Title: bounce mail processed multiple times -> oom crash of BounceRunner To manage notifications about this bug go to: https://bugs.launchpad.net/mailman/+bug/1859011/+subscriptions
To clarify some points: 1. the bounce mail is genuine. My logs show that a ~5MB mail was send to the list by the owner of the list, it was distributed to the members of the list, and caused some bounce mails by some recipients. The mail that caused the problem was normal non delivery notification with the characteristic that the complete original mail was included as message/rfc822. 2. I don't know if the size of the bounced mail has an impact on the number of duplications, but the size combined with the duplications resulted in the bounce process requesting using ~20 GB of ram 3. My logs indicate some cases with >2000 processed bounces that didn't crash the BounceRunner, most likely there was also a duplication, but the mail was smaller At the moment I try to create a striped down version that I can share that reproduces the problem -- You received this bug notification because you are a member of Mailman Coders, which is subscribed to GNU Mailman. https://bugs.launchpad.net/bugs/1859011 Title: bounce mail processed multiple times -> oom crash of BounceRunner To manage notifications about this bug go to: https://bugs.launchpad.net/mailman/+bug/1859011/+subscriptions
I was able to create a striped down version of the mail. I also found that the size of the attachment has an influence on the number of dulicates I have created 2 examples ** Attachment added: "example that creates 22 bounces out of one bounce" https://bugs.launchpad.net/mailman/+bug/1859011/+attachment/5320823/+files/b... -- You received this bug notification because you are a member of Mailman Coders, which is subscribed to GNU Mailman. https://bugs.launchpad.net/bugs/1859011 Title: bounce mail processed multiple times -> oom crash of BounceRunner To manage notifications about this bug go to: https://bugs.launchpad.net/mailman/+bug/1859011/+subscriptions
** Attachment added: "example that creates 81 bounces out of one bounce" https://bugs.launchpad.net/mailman/+bug/1859011/+attachment/5320825/+files/b... -- You received this bug notification because you are a member of Mailman Coders, which is subscribed to GNU Mailman. https://bugs.launchpad.net/bugs/1859011 Title: bounce mail processed multiple times -> oom crash of BounceRunner To manage notifications about this bug go to: https://bugs.launchpad.net/mailman/+bug/1859011/+subscriptions
Thanks for your work on this. I will use your examples to try to recreate the issue and diagnose further. ** Changed in: mailman Importance: Undecided => Medium ** Changed in: mailman Status: Incomplete => Triaged ** Changed in: mailman Milestone: None => 2.1.30 ** Changed in: mailman Assignee: (unassigned) => Mark Sapiro (msapiro) -- You received this bug notification because you are a member of Mailman Coders, which is subscribed to GNU Mailman. https://bugs.launchpad.net/bugs/1859011 Title: bounce mail processed multiple times -> oom crash of BounceRunner To manage notifications about this bug go to: https://bugs.launchpad.net/mailman/+bug/1859011/+subscriptions
I know what's going on. The bounce message itself is unusual. It consists of a simple text/plain body containing the text ---------------------------------------------------------------------------- This message was created automatically by mail delivery software. A message that you sent could not be delivered to one or more of its recipients. One or more of the recipients are not a valid user of the system. Please check to be sure the email address is correct. ---------------------------------------------------------------------------- followed by a message/rfc822 part containing the entire original message. Bounce processing has a number of recognizers that examine the message and if they recognize it, return a list of the bouncing addresses. One of these recognizers matches RFC compliant DSNs, but there are many non- compliant DSNs that we have seen that we try to recognize. Another 'heuristic' recognizer is SimpleMatch. It looks at every line of the message until it finds a line matching a start pattern. Tnen it continues to look for lines matching an address pattern until it finds a line matching a stop pattern. In this case, the start pattern that matches is 'A message that you( have)? sent could not be delivered' It then looks for addresses matching '<(?P<addr>[^>]*)>' until it finds a line matching '---' It never finds the stop pattern so it goes through the entire base64 decoded PDF line by line looking for any string between '<' and '>', and it finds a bunch of them, each of which is returned as a bouncing address even though they don't look at all like addresses, and in fact, the actual bouncing address is nowhere at all in the bounce message. Then each 'address' is recorded as a bounce in the bounce_events file. I will change the address patterns so the string inside the angle brackets has to at least resemble an email address. That should fix this. ** Changed in: mailman Status: Triaged => In Progress -- You received this bug notification because you are a member of Mailman Coders, which is subscribed to GNU Mailman. https://bugs.launchpad.net/bugs/1859011 Title: bounce mail processed multiple times -> oom crash of BounceRunner To manage notifications about this bug go to: https://bugs.launchpad.net/mailman/+bug/1859011/+subscriptions
** Branch linked: lp:mailman/2.1 -- You received this bug notification because you are a member of Mailman Coders, which is subscribed to GNU Mailman. https://bugs.launchpad.net/bugs/1859011 Title: bounce mail processed multiple times -> oom crash of BounceRunner To manage notifications about this bug go to: https://bugs.launchpad.net/mailman/+bug/1859011/+subscriptions
** Changed in: mailman Status: In Progress => Fix Committed -- You received this bug notification because you are a member of Mailman Coders, which is subscribed to GNU Mailman. https://bugs.launchpad.net/bugs/1859011 Title: bounce mail processed multiple times -> oom crash of BounceRunner To manage notifications about this bug go to: https://bugs.launchpad.net/mailman/+bug/1859011/+subscriptions
Thanks for the Fix am I correct that I would need the patches from https://bazaar.launchpad.net/~mailman-coders/mailman/2.1/revision/1834 and https://bazaar.launchpad.net/~mailman-coders/mailman/2.1/revision/1835 -- You received this bug notification because you are a member of Mailman Coders, which is subscribed to GNU Mailman. https://bugs.launchpad.net/bugs/1859011 Title: bounce mail processed multiple times -> oom crash of BounceRunner To manage notifications about this bug go to: https://bugs.launchpad.net/mailman/+bug/1859011/+subscriptions
Actually, https://bazaar.launchpad.net/~mailman- coders/mailman/2.1/revision/1834 will fix your issue, but it is not completely correct and https://bazaar.launchpad.net/~mailman- coders/mailman/2.1/revision/1835 fixes that, so yes, install both. -- You received this bug notification because you are a member of Mailman Coders, which is subscribed to GNU Mailman. https://bugs.launchpad.net/bugs/1859011 Title: bounce mail processed multiple times -> oom crash of BounceRunner To manage notifications about this bug go to: https://bugs.launchpad.net/mailman/+bug/1859011/+subscriptions
** Changed in: mailman Status: Fix Committed => Fix Released -- You received this bug notification because you are a member of Mailman Coders, which is subscribed to GNU Mailman. https://bugs.launchpad.net/bugs/1859011 Title: bounce mail processed multiple times -> oom crash of BounceRunner To manage notifications about this bug go to: https://bugs.launchpad.net/mailman/+bug/1859011/+subscriptions
participants (3)
-
Launchpad Bug Tracker
-
Mark Sapiro
-
Michael Menge