Re: [Mailman-Developers] cron/gate_news watermark == 0

[Jim Tittsler]
Hmmmm... When checking the contents of my "data/gate_watermarks" marshal, I find several old mailing lists with an entry stating that their watermark is 0 -- even though these lists aren't actually gating any newsgroups.
As all these lists are rather old, I guess this could be how things were stored in some early Mailman version -- back when gate_news was forking processes to gate _all_ lists, regardless of whether the list in question had requested that any gating should be done, and the watermark file was written to by each and every child...
On checking this with CVS, it appears that revision 1.7 of gate_news indeed has the behaviour I suspected, while 1.8, which was checked in 1998/12/18 00:22:23, seems to do it "right".
The upshot of this is that any list touched by rev. 1.7 (and maybe earlier) which does not do any gating would possibly get *all* the messages present in the group it's gating from posted to it if it started gating after your patch was applied.
A possible way to approach all this would be to have "make update" replace those 0 values in gate_watermarks with None values iff there are no other None values (i.e. only change things on the first time "make update" is run).
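A minimal sketch of that first-run-only conversion, assuming the watermarks have already been loaded into a plain dict mapping list names to article numbers. The function name normalize_watermarks is hypothetical, not something from the Mailman source.

```python
def normalize_watermarks(watermarks):
    """Return a copy of `watermarks` with legacy 0 entries replaced by None.

    If any entry is already None, the conversion is assumed to have run
    before, so the values are returned unchanged.
    """
    if any(value is None for value in watermarks.values()):
        return dict(watermarks)
    return {name: (None if value == 0 else value)
            for name, value in watermarks.items()}
```

Checking for an existing None is what makes a second "make update" a no-op, so a genuine 0 written after the conversion would not be touched again.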
Or, even better, move the per-list watermark info from gate_watermarks into the list's config.db, replacing any 0 values in gate_watermarks with None values in config.db. This has the advantage of reducing the number of different files that would have to be changed when cloning/renaming a list.
Harald

On Sun, Feb 13, 2000 at 08:57:20PM +0100, Harald Meland wrote:
This seems logical to me, since I think the administrator really wants to do the "catch up" any time gateway_to_mail transitions from No to Yes (implying that it needs to be able to update the watermark, either in config.db or by locking(?) and updating gate_watermarks).
With the present behavior, watermarks don't appear to be deleted if you stop gatewaying a newsgroup for a while. So if you restarted gatewaying... you would flood the list with all of the posts that happened in the interim because the old watermark is still hanging around. (You could del the watermark entries any time gate_news was run and the list wasn't marked as gatewaying... that seems unclean, but a smaller patch. :-)
Jim

Back in February, there was some discussion about fixing gate_news and the gate_watermarks file...
"JT" == Jim Tittsler <jwt@dskk.co.jp> writes:
JT> A watermark value of 0 is used for two different purposes in
JT> cron/gate_news [...] The result of this is the first article
JT> posted to the newly created newsgroup fails to get gatewayed
JT> to the mailing list because the watermark for that group is
JT> (still) 0 which is used as a flag to "catch up" so all that
JT> happens is the watermark gets set to 1.
The gist of your message is spot on, and I took your advice to make None the marker that means the list hasn't been gated before and/or needs a mass catchup.
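The difference between the two markers can be sketched as below. This is an illustration only, not the gate_news code: articles_to_gate and its first/last parameters (the lowest and highest article numbers the news server reports for the group) are assumptions.

```python
def articles_to_gate(watermark, first, last):
    """Return (article_numbers, new_watermark) for one gating run.

    watermark is None  -> never gated, or reset: catch up, post nothing.
    watermark is an int -> post everything newer than the watermark.
    """
    if watermark is None:
        # Catch up: remember the newest article, but do not post anything.
        return [], last
    start = max(watermark + 1, first)
    return list(range(start, last + 1)), max(watermark, last)
```

With None as the only catch-up flag, a watermark of 0 on a brand-new group is an ordinary number, so article 1 is gated instead of being swallowed by the catch-up.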
"HM" == Harald Meland <Harald.Meland@usit.uio.no> writes:
HM> Hmmmm... When checking the contents of my
HM> "data/gate_watermarks" marshal, I find several old mailing
HM> lists with an entry stating that their watermark is 0 -- even
HM> though these lists aren't actually gating any newsgroups.
HM> [...]
HM> Or, even better, move the per-list watermark info from
HM> gate_watermarks into the list's config.db, replacing any 0
HM> values in gate_watermarks with None values in config.db. This
HM> has the advantage of reducing the number of different files
HM> that would have to be changed when cloning/renaming a list.
Makes tons of sense Harald! I like this idea a lot for your reasons, but also because it leads to a big simplification of gate_news. It's actually harder to synchronize access to a shared gate_watermarks file than it is to just keep the watermark in the list's attributes.
JT> This seems logical to me, since I think the administrator
JT> really wants to do the "catch up" any time gateway_to_mail
JT> transitions from No to Yes (implying that it needs to be able
JT> to update the watermark, either in config.db or by locking(?)
JT> and updating gate_watermarks).
Another excellent point. I've got a hack to the already horrible admin.py to None-ify the watermark whenever the gateway_to_mail property has changed. In reality we zap the watermark when it transitions to either "yes" or "no", but I don't /think/ that matters in practice. You could potentially want only watermark zapping on transition from no->yes so that you could write a little script to turn gatewaying on while preserving the watermark, but I just don't think this will be a common enough occurrence. The Right Thing To Do is probably to improve the admin page for gatewaying so that there's a flag the admin can tickle to zap the watermark or preserve it. Not for 2.0 :)
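A rough sketch of the reset-on-change behaviour, under stated assumptions: ListSettings is a hypothetical stand-in for a list object and set_gateway_to_mail is an invented helper, not the actual admin.py change. Only the attribute names gateway_to_mail and usenet_watermark come from the message.

```python
class ListSettings:
    """Hypothetical stand-in for a mailing list's stored attributes."""

    def __init__(self, gateway_to_mail=False, usenet_watermark=None):
        self.gateway_to_mail = gateway_to_mail
        self.usenet_watermark = usenet_watermark


def set_gateway_to_mail(mlist, new_value):
    """Store the new flag; reset the watermark if the flag changed.

    The reset happens on a change in either direction (no->yes or
    yes->no), so the next gating run starts with a catch-up.
    """
    if bool(new_value) != bool(mlist.gateway_to_mail):
        mlist.usenet_watermark = None
    mlist.gateway_to_mail = bool(new_value)
```

Resetting only on no->yes would need one more comparison, at the cost of the watermark going stale while gating is off.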
So now the watermark is stored in the list's config.db under the usenet_watermark attribute. gate_news O_CREAT|O_EXCL's a block file to ensure that only one gate_news process is running at any time, and it only deletes this block file when all its forked children exit (or on catastrophic exception). gate_news opens each list in turn unlocked, and then if the list is gating to mail, it forks a process for that list and immediately locks it. Then it gates the list largely as before. Similar to before, if the watermark is None it simply sets it to the last known article number and moves on, expecting the following gate_news to post subsequent messages.
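The O_CREAT|O_EXCL idiom can be sketched roughly as follows. This is a generic illustration, not the gate_news code: the block_file helper is hypothetical, and the forking and waiting for children is left out, so here the file is removed when the with-block exits.

```python
import contextlib
import os


@contextlib.contextmanager
def block_file(path):
    """Hold an exclusive block file for the duration of the with-block.

    os.open with O_CREAT | O_EXCL fails with FileExistsError if the file
    already exists, so only one process can create it at a time.
    """
    fd = os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY, 0o644)
    try:
        # Record the owner's pid to help debugging a stale block file.
        with os.fdopen(fd, "w") as fp:
            fp.write(str(os.getpid()))
        yield path
    finally:
        # Runs on normal exit and on exceptions raised in the with-block.
        os.unlink(path)
```

A second process that hits FileExistsError would simply exit and leave the work to the run already in progress.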
I also implemented your suggestion to bin/update to pull the info out of gate_watermarks and stick it in mlist.usenet_watermark. It then deletes gate_watermarks.
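That migration step might look roughly like the sketch below. It assumes gate_watermarks is a marshalled dict mapping list names to article numbers; migrate_watermarks and the `lists` argument are hypothetical, and locking and saving the real list objects is omitted.

```python
import marshal
import os


def migrate_watermarks(path, lists):
    """Move watermarks from the marshal file at `path` onto list objects.

    `lists` maps list names to objects that accept a usenet_watermark
    attribute (stand-ins for real list objects).
    """
    if not os.path.exists(path):
        return  # nothing to migrate, e.g. the update already ran
    with open(path, "rb") as fp:
        watermarks = marshal.load(fp)
    for name, mlist in lists.items():
        value = watermarks.get(name)
        # A legacy 0 or a missing entry both become None ("catch up").
        mlist.usenet_watermark = value or None
    os.unlink(path)
```

Deleting the file at the end is what keeps a later run from overwriting newer per-list watermarks with stale values.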
An outgrowth of all this is that I realized I really want a numeric, easily comparable Mailman version number, and a way to figure out what version we are upgrading from. That's surprisingly difficult now, but it would have been darn handy for bin/update. I've now implemented a HEX_VERSION using the same scheme that Python does, e.g. Mailman 2.0beta2 will have a HEX_VERSION of 0x020000b2, and this can be compared against for running only selective updates in bin/update. Every time bin/update completes successfully, it writes the current hex version string to data/last_mailman_version.
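A small sketch of such a packed version number, assuming the field layout of Python's sys.hexversion (major, minor, micro, release level, serial). The names REL_LEVELS and hex_version are illustrative rather than taken from the Mailman source. Under that layout, 2.0 beta 2 packs to 0x020000b2.

```python
# Release-level nibbles as used by Python's sys.hexversion.
REL_LEVELS = {"alpha": 0xA, "beta": 0xB, "candidate": 0xC, "final": 0xF}


def hex_version(major, minor, micro, level, serial):
    """Pack version fields into one integer that sorts in release order."""
    return ((major << 24) | (minor << 16) | (micro << 8)
            | (REL_LEVELS[level] << 4) | serial)


# Plain integer comparison is enough to decide which updates to run.
assert hex_version(2, 0, 0, "beta", 2) < hex_version(2, 0, 0, "final", 0)
```

Because "final" uses the highest nibble, every pre-release of a version compares lower than the release itself.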
[There's one minor icky bit: the warning about moving templates/options.html is only printed if 1) we can't determine the previous Mailman version, and 2) it is not a fresh install -- the determinant being any files in the logs subdir]
I think that's it. I've done some moderate testing of all this, but not on my production machines. I'm going to check it all in, and hope you take a look, but be aware I haven't banged on it terribly hard.
Slowly-catching-up-ly y'rs, -Barry

participants (3): Barry A. Warsaw, Harald Meland, Jim Tittsler