[Mailman-Developers] problem with view other subscriptions..

Harald Meland Harald.Meland@usit.uio.no
14 Jun 2000 14:10:18 +0200


[Barry A. Warsaw]

> I think __break() is fundamentally broken.  Actually, I think
> breaking locks gets us into all kinds of problems.

I completely agree.  However:

> But there's the trade-off of deadlocking the whole system for
> something we haven't thought about.

That is one rather humongously-sized "but".  [ Do not read the
previous sentence out loud to your wife/girlfriend ;) ]

> One approach might be to never break locks implicitly, but have
> qrunner (which now runs every minute) check for long-dead locks and
> send warning emails to the site admin.  A simple rm should clear up
> the problem.

My gut feeling says that this would be too cumbersome.  The problem
with breaking a lock is that it introduces race conditions; and by
having human admins break the lock "by hand", all you really gain is a
reduced probability that two or more admins will (try to) break the
same lock (nearly) simultaneously -- meaning that all but the first
admin really break non-stale locks.
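
For reference, the "check and warn, never break" idea quoted above could
be sketched roughly as below.  This is only a minimal illustration, not
Mailman's actual LockFile or qrunner code: the function name
find_stale_locks, the flat lock directory, and the use of the file's
mtime as the lock's age are all assumptions made for the example.

```python
import os
import time


def find_stale_locks(lock_dir, max_age_seconds, now=None):
    """Return paths of lock files in lock_dir older than max_age_seconds.

    Hypothetical helper: it only reports candidates and never removes
    anything, so it cannot break a live lock by itself.
    """
    if now is None:
        now = time.time()
    stale = []
    for name in sorted(os.listdir(lock_dir)):
        path = os.path.join(lock_dir, name)
        try:
            mtime = os.stat(path).st_mtime
        except FileNotFoundError:
            # The lock was released between listdir() and stat().
            continue
        if now - mtime > max_age_seconds:
            stale.append(path)
    return stale
```

A periodic job could mail the returned list to the site admin.  Note
that the race discussed here is untouched: it simply moves to whoever
runs rm on the reported files.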

As long as we do locking by use of the file system, I think there
_has_ to be some way to break stale locks.  Furthermore, I don't think
it's possible to make the method for breaking these locks *completely*
free of race conditions.

I think that our focus should therefore be on reducing the
probabilities of

  a) the occurrence of stale locks and
  b) multiple lock breakages in quick succession, as this could
     possibly lead to fresh, non-stale locks being broken.

I would vastly prefer reducing the probability of breaking non-stale
locks by automated means, e.g. by introducing (relatively) large
sleep()s in appropriate places.
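
A rough sketch of what such a sleep could look like follows.  It is an
illustration under stated assumptions, not the current __break() code:
break_if_still_stale, settle_seconds, and the (inode, mtime) snapshot
are hypothetical names and choices made for this example.

```python
import os
import time


def break_if_still_stale(path, max_age_seconds, settle_seconds=5.0):
    """Try to break a stale lock file; return True if we removed it.

    Hypothetical sketch: the lock is only removed if it looks stale
    and is still the very same file after sleeping.
    """
    def snapshot():
        st = os.stat(path)
        return (st.st_ino, st.st_mtime)

    try:
        before = snapshot()
    except FileNotFoundError:
        return False  # no lock to break
    if time.time() - before[1] <= max_age_seconds:
        return False  # lock is not old enough to count as stale

    # Give any other would-be breaker time to finish its own break
    # and for a new owner to create a fresh lock.
    time.sleep(settle_seconds)

    try:
        after = snapshot()
    except FileNotFoundError:
        return False  # released or broken by someone else meanwhile
    if after != before:
        return False  # lock was replaced or refreshed; leave it alone

    # A small race window remains between the check above and unlink().
    try:
        os.unlink(path)
    except FileNotFoundError:
        return False
    return True
```

The sleep only shrinks the window in which a fresh lock can be broken;
it does not close it, which matches the claim that the race cannot be
removed completely.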



Oh, hang on: I just realized that I might have misunderstood what
you're suggesting.  If you meant that there should only be *one*
process which is allowed to break locks, then I agree.  

Of course, there is a catch-22 inside that: The way of ensuring that
there is only *one* instance of that process running, is to obtain
some lock...
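
To make the catch-22 concrete, here is a minimal sketch of how a single
"lock breaker" process might claim its role.  The function name
try_become_breaker and the breaker lock path are hypothetical, and
exclusive file creation is just one assumed mechanism (O_EXCL has
historically been unreliable on NFS).

```python
import os


def try_become_breaker(breaker_lock_path):
    """Return True if this process created the breaker lock file.

    Hypothetical sketch: exclusive creation guarantees at most one
    winner on a local file system.
    """
    try:
        fd = os.open(breaker_lock_path,
                     os.O_CREAT | os.O_EXCL | os.O_WRONLY, 0o644)
    except FileExistsError:
        # Another breaker holds the role, or died while holding it.
        return False
    with os.fdopen(fd, "w") as f:
        # Record the owner so an admin can see who holds the role.
        f.write(str(os.getpid()))
    return True
```

If the winning process dies without removing its file, the breaker lock
is itself stale, and by construction nobody is allowed to break it.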

> As an interim approach, I'm just cranking up the qrunner and list
> lock lifetimes to really big numbers (like 10 hours and 5 hours
> respectively).

Ouch.  I do agree with you, though.

> All this is just whacked anyway.  What we really need is a
> transactional database underneath so we wouldn't need all these
> stupid list locks.  I still believe that's too much work for 2.0,
> but as this beta3 drags on, I'm starting to have doubts.

Ummm, by saying "transactional" you're ruling out MySQL, right?  I
think I read somewhere that the lack of "transactions" was the main
thing separating MySQL from other DBMSes.

As Mailman is GNU software, I'm wondering which free transactional
DBMS(es) Mailman could (or should :) live on top of.
-- 
Harald