Hi,
We are running mailman version 2.1.29
[root@listserver in]# rpm -qa|grep mailman
mailman-2.1.29-10.module_el8.3.0+548+3169411d.x86_64
In the past 2 days we have noticed mailman posts queueing up considerably in the /var/spool/mailman/in directory. Some of the posts are either not being processed at all or come through only after a long delay.
[root@listserver in]# pwd
/var/spool/mailman/in
[root@listserver in]# ls -ls | wc -l
991
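A note on reading that number: `ls -ls` prints a leading "total" line, so `wc -l` reports one more than the number of directory entries (991 lines means 990 entries). The following is a minimal Python 3 sketch for counting queue entries and finding the age of the oldest one, which helps tell a backlog that is slowly draining from one that is stuck. It assumes queue entries are files ending in `.pck`; that suffix is an assumption about the Mailman 2.1 spool layout and is not stated in this thread. The function name `queue_backlog` is hypothetical.

```python
import os
import time


def queue_backlog(queue_dir, suffix=".pck", now=None):
    """Return (count, oldest_age_seconds) for files in queue_dir ending in suffix.

    oldest_age_seconds is None when no matching files exist.
    The ".pck" default is an assumption about the spool layout.
    """
    now = time.time() if now is None else now
    mtimes = []
    for entry in os.scandir(queue_dir):
        # Only count regular files with the expected suffix.
        if entry.is_file() and entry.name.endswith(suffix):
            mtimes.append(entry.stat().st_mtime)
    if not mtimes:
        return 0, None
    return len(mtimes), now - min(mtimes)
```

Calling `queue_backlog("/var/spool/mailman/in")` a few minutes apart would show whether the count is falling and whether the oldest entry keeps getting older.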
The runners are running as part of the systemctl start mailman command:
[root@listserver in]# ps -ef|grep runners
root 4904 2591 0 11:58 pts/0 00:00:00 grep --color=auto runners
[root@listserver in]# ps -ef|grep mailman
mailman 1954 1 0 11:49 ? 00:00:00 /usr/bin/python2 /usr/lib/mailman/bin/mailmanctl -s start
mailman 1956 1954 0 11:49 ? 00:00:00 /usr/bin/python2 /usr/lib/mailman/bin/qrunner --runner=ArchRunner:0:1 -s
mailman 1957 1954 0 11:49 ? 00:00:00 /usr/bin/python2 /usr/lib/mailman/bin/qrunner --runner=BounceRunner:0:1 -s
mailman 1958 1954 0 11:49 ? 00:00:00 /usr/bin/python2 /usr/lib/mailman/bin/qrunner --runner=CommandRunner:0:1 -s
mailman 1959 1954 0 11:49 ? 00:00:00 /usr/bin/python2 /usr/lib/mailman/bin/qrunner --runner=IncomingRunner:0:1 -s
mailman 1960 1954 0 11:49 ? 00:00:00 /usr/bin/python2 /usr/lib/mailman/bin/qrunner --runner=NewsRunner:0:1 -s
mailman 1961 1954 0 11:49 ? 00:00:00 /usr/bin/python2 /usr/lib/mailman/bin/qrunner --runner=OutgoingRunner:0:1 -s
mailman 1962 1954 0 11:49 ? 00:00:00 /usr/bin/python2 /usr/lib/mailman/bin/qrunner --runner=VirginRunner:0:1 -s
mailman 1963 1954 0 11:49 ? 00:00:00 /usr/bin/python2 /usr/lib/mailman/bin/qrunner --runner=RetryRunner:0:1 -s
root 4907 2591 0 11:58 pts/0 00:00:00 grep --color=auto mailman
I tried increasing the number of Incoming runners from 1 to 2, presuming this might speed up processing, but it had no effect.
The qrunner log doesn't show anything out of the ordinary; it only logs entries when I stop and start the service:
Mar 02 11:47:48 2023 (198033) Master watcher caught SIGTERM. Exiting.
Mar 02 11:47:48 2023 (198034) ArchRunner qrunner caught SIGTERM. Stopping.
Mar 02 11:47:48 2023 (198034) ArchRunner qrunner exiting.
Mar 02 11:47:48 2023 (198041) RetryRunner qrunner caught SIGTERM. Stopping.
Mar 02 11:47:48 2023 (198041) RetryRunner qrunner exiting.
Mar 02 11:47:48 2023 (198035) BounceRunner qrunner caught SIGTERM. Stopping.
Mar 02 11:47:48 2023 (198035) BounceRunner qrunner exiting.
Mar 02 11:47:48 2023 (198036) CommandRunner qrunner caught SIGTERM. Stopping.
Mar 02 11:47:49 2023 (198036) CommandRunner qrunner exiting.
Mar 02 11:47:48 2023 (198037) IncomingRunner qrunner caught SIGTERM. Stopping.
Mar 02 11:47:48 2023 (198038) NewsRunner qrunner caught SIGTERM. Stopping.
Mar 02 11:47:49 2023 (198038) NewsRunner qrunner exiting.
Mar 02 11:47:49 2023 (198033) Master watcher caught SIGTERM. Exiting.
Mar 02 11:47:49 2023 (198033) Master qrunner detected subprocess exit (pid: 198034, sig: None, sts: 15, class: ArchRunner, slice: 1/1)
Mar 02 11:47:49 2023 (198033) Master qrunner detected subprocess exit (pid: 198035, sig: None, sts: 15, class: BounceRunner, slice: 1/1)
Mar 02 11:47:49 2023 (198033) Master qrunner detected subprocess exit (pid: 198041, sig: None, sts: 15, class: RetryRunner, slice: 1/1)
Mar 02 11:47:49 2023 (198033) Master qrunner detected subprocess exit (pid: 198036, sig: None, sts: 15, class: CommandRunner, slice: 1/1)
Mar 02 11:47:48 2023 (198039) OutgoingRunner qrunner caught SIGTERM. Stopping.
Mar 02 11:47:49 2023 (198039) OutgoingRunner qrunner caught SIGTERM. Stopping.
Mar 02 11:47:49 2023 (198039) OutgoingRunner qrunner exiting.
Mar 02 11:47:48 2023 (198040) VirginRunner qrunner caught SIGTERM. Stopping.
Mar 02 11:47:49 2023 (198040) VirginRunner qrunner caught SIGTERM. Stopping.
Mar 02 11:47:49 2023 (198040) VirginRunner qrunner exiting.
Mar 02 11:47:49 2023 (198033) Master qrunner detected subprocess exit (pid: 198038, sig: 15, sts: None, class: NewsRunner, slice: 1/1)
Mar 02 11:47:49 2023 (198033) Master qrunner detected subprocess exit (pid: 198039, sig: None, sts: 15, class: OutgoingRunner, slice: 1/1)
Mar 02 11:47:49 2023 (198033) Master qrunner detected subprocess exit (pid: 198040, sig: None, sts: 15, class: VirginRunner, slice: 1/1)
Mar 02 11:47:49 2023 (198037) IncomingRunner qrunner caught SIGTERM. Stopping.
Mar 02 11:47:49 2023 (198037) IncomingRunner qrunner exiting.
Mar 02 11:47:49 2023 (198033) Master qrunner detected subprocess exit (pid: 198037, sig: None, sts: 15, class: IncomingRunner, slice: 1/1)
Mar 02 11:49:59 2023 (1956) ArchRunner qrunner started.
Mar 02 11:49:59 2023 (1963) RetryRunner qrunner started.
Mar 02 11:49:59 2023 (1960) NewsRunner qrunner started.
Mar 02 11:49:59 2023 (1961) OutgoingRunner qrunner started.
error logs have this
Mar 02 11:38:57 2023 (189330) Unable to retrieve data from https://publicsuffix.org/list/public_suffix_list.dat: <urlopen error [Errno 97] Address family not supported by protocol>
Mar 02 11:38:57 2023 (189331) Unable to retrieve data from https://publicsuffix.org/list/public_suffix_list.dat: <urlopen error
but I think this is related to some posts to unknown domains.
Could I get some advice on what else I should look into to debug this issue?
Kind Regards
ssaini
On 3/2/23 04:12, ssaini wrote:
...
Mar 02 11:47:49 2023 (198037) IncomingRunner qrunner exiting.
Mar 02 11:47:49 2023 (198033) Master qrunner detected subprocess exit (pid: 198037, sig: None, sts: 15, class: IncomingRunner, slice: 1/1)
Mar 02 11:49:59 2023 (1956) ArchRunner qrunner started.
Mar 02 11:49:59 2023 (1963) RetryRunner qrunner started.
Mar 02 11:49:59 2023 (1960) NewsRunner qrunner started.
Mar 02 11:49:59 2023 (1961) OutgoingRunner qrunner started.
The rest of the messages from the startup might be interesting.
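The quoted excerpt shows "started" lines for only four of the eight runners that appear in the ps output. One way to check the full log for the rest is a small Python 3 sketch like the one below. The expected runner names are taken from the ps listing in this thread; the function name `runners_not_started` and the regular expression are illustrative, and the pattern only assumes the "<Name> qrunner started." wording seen in the quoted log.

```python
import re

# Runner names as they appear in the ps output quoted in this thread.
EXPECTED_RUNNERS = frozenset({
    "ArchRunner", "BounceRunner", "CommandRunner", "IncomingRunner",
    "NewsRunner", "OutgoingRunner", "VirginRunner", "RetryRunner",
})

# Matches e.g. "(1956) ArchRunner qrunner started."
STARTED_RE = re.compile(r"\(\d+\) (\w+) qrunner started\.")


def runners_not_started(log_text, expected=EXPECTED_RUNNERS):
    """Return the sorted expected runner names with no 'started' line in log_text.

    Pass only the part of the log written after the most recent restart;
    older 'started' lines would otherwise hide a runner that is missing now.
    """
    started = set(STARTED_RE.findall(log_text))
    return sorted(set(expected) - started)
```

Applied to the four "started" lines quoted above, this reports BounceRunner, CommandRunner, IncomingRunner and VirginRunner as not yet logged, which may simply mean the excerpt was cut short.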
error logs have this
Mar 02 11:38:57 2023 (189330) Unable to retrieve data from https://publicsuffix.org/list/public_suffix_list.dat: <urlopen error [Errno 97] Address family not supported by protocol>
Mar 02 11:38:57 2023 (189331) Unable to retrieve data from https://publicsuffix.org/list/public_suffix_list.dat: <urlopen error
but I think this is related to some posts to unknown domains.
No, it isn't. It is not clear what exactly the urlopen issue is, but it is complaining that it can't retrieve https://publicsuffix.org/list/public_suffix_list.dat which could be the reason for IncomingRunner getting hung up.
Normally, the code which is encountering this error is executed only once upon the first post after startup.
Try the following withlist session:
$ bin/withlist -i
No list name supplied.
Python 2.7.18 (default, Jul 1 2022, 12:27:04)
[GCC 9.4.0] on linux2
Type "help", "copyright", "credits" or "license" for more information.
(InteractiveConsole)
>>> import urllib2
>>> d = urllib2.urlopen('https://publicsuffix.org/list/public_suffix_list.dat')
>>> print d.readlines()[0]
// This Source Code Form is subject to the terms of the Mozilla Public
>>>
If urllib2.urlopen throws an exception, you need to figure out why. Does
wget https://publicsuffix.org/list/public_suffix_list.dat
retrieve the data?
--
Mark Sapiro <mark@msapiro.net>        The highway is for gamblers,
San Francisco Bay Area, California    better use your sense - B. Dylan
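For a check that can be scripted outside of withlist, here is a minimal Python 3 sketch of the same retrieval test. It is not Mailman code: Mailman 2.1 itself runs under Python 2 and uses urllib2, while this uses the Python 3 standard library. The function name `probe_url` and the 10 second default timeout are assumptions chosen for illustration.

```python
import urllib.error
import urllib.request


def probe_url(url, timeout=10.0):
    """Try to fetch url and return (ok, detail).

    On success, detail is the first line of the response.
    On failure, detail is the error text, comparable to the
    "<urlopen error ...>" message in Mailman's error log.
    """
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            first = resp.readline().decode("utf-8", "replace").rstrip()
            return True, first
    except (urllib.error.URLError, OSError) as exc:
        return False, str(exc)
```

Running `probe_url("https://publicsuffix.org/list/public_suffix_list.dat")` on the list server should return the "// This Source Code Form ..." header line when outbound HTTPS works. The explicit timeout means a blocked connection returns an error instead of hanging.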
If urllib2.urlopen throws an exception, you need to figure out why. Does
wget https://publicsuffix.org/list/public_suffix_list.dat
retrieve the data?
After posting the message to this list, I checked whether the mailman server could reach https://publicsuffix.org, and it could not. In our incident the cause was a firewall rule: the rule that allowed outbound traffic on port 443 had been revoked, which delayed delivery and caused the build-up of posts. After the rule was added back, the spooled mailman posts in the "in" directory were delivered within minutes.
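Since the root cause here was blocked outbound traffic on port 443, a plain TCP connection test can separate a firewall or routing problem from a TLS or HTTP one. The following Python 3 sketch shows one way to do that with the standard library; the function name `can_connect` and the 5 second default timeout are assumptions for illustration.

```python
import socket


def can_connect(host, port, timeout=5.0):
    """Return True if a TCP connection to (host, port) succeeds within timeout.

    DNS failures, refused connections and timeouts all return False.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

For example, `can_connect("publicsuffix.org", 443)` run on the list server would be expected to return False while such a firewall rule is missing. A firewall that silently drops packets makes the call wait for the full timeout before returning.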
Good to know!
and thank you for the followup, we do appreciate it!
participants (3)
- Mark Sapiro
- ssaini
- Stephen J. Turnbull