[Catalog-sig] Fwd: [PyPI] Internal Error

"Martin v. Löwis" martin at v.loewis.de
Fri May 11 09:54:34 CEST 2007


A.M. Kuchling wrote:
> Here's a copy of the sporadically-received tracebacks from PyPI.

I think I have now fixed this problem, at least in the common case.
The problem was that I had a time stamp associated with the browse
tally, so that the tally can be regenerated when it is too old. I read
the time stamp, lock the tally table, regenerate it, and update the
time stamp.

Now, if two processes do the same simultaneously, they first each
acquire a read lock on the time stamp (implicitly, as a side effect of
the select). Then both ask for the table lock, and one of them gets it
and regenerates the tally. That one then writes the time stamp, which
requires a write lock on the time stamp, so it attempts to upgrade
its read lock. For this, the other reader would need to go away,
but it is waiting for the table lock that the first process holds:
a deadlock occurs.
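
To make the cycle explicit, here is a small sketch of the wait-for
relation described above. The process names "A" and "B" and the helper
find_cycle are purely illustrative and are not part of PyPI or
Postgres; the sketch only models who waits for whom.

```python
def find_cycle(waits_for, start):
    """Follow wait-for edges from start; return the cycle found, or None."""
    path = []
    node = start
    while node is not None and node not in path:
        path.append(node)
        node = waits_for.get(node)
    if node is None:
        return None  # the chain ended, so nobody waits in a circle
    return path[path.index(node):] + [node]


# Hypothetical labels: A got the table lock, B did not.
waits_for = {
    "A": "B",  # A wants the write lock on the time stamp; B's read lock blocks it
    "B": "A",  # B wants the table lock, which A holds
}

print(" -> ".join(find_cycle(waits_for, "A")))  # prints "A -> B -> A"
```

Neither process can proceed until the other releases something, which
is why the database has to break one of the locks.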

I'm not quite sure how Postgres proceeds exactly. It somehow
breaks the deadlock, apparently by breaking the other process's read
lock, which allows the upgrade to the write lock. So the first
process's entire update succeeds, but the second writer fails.

The solution is to commit the read operation before locking the
table, to release the read lock. In the new transaction, only
a table lock is required at first, and no deadlock arises.
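
The ordering described above can be sketched roughly as follows. This
is not the actual PyPI code: the function name, the table and column
names (timestamps, browse_tally, name, value), the MAX_AGE threshold,
the regenerate callback, and the psycopg-style %s placeholder are all
assumptions made for illustration. It expects a DB-API connection.

```python
import datetime

# Assumed staleness threshold; the real value is not given in this post.
MAX_AGE = datetime.timedelta(minutes=10)


def refresh_browse_tally(conn, regenerate, now=None):
    """Regenerate the tally if it is too old. Returns True if it was rebuilt.

    conn is a DB-API connection; regenerate is a caller-supplied
    function that rebuilds the tally using the given cursor.
    """
    now = now or datetime.datetime.now()

    # Transaction 1: read the time stamp only.
    cur = conn.cursor()
    cur.execute("SELECT value FROM timestamps WHERE name = 'browse_tally'")
    row = cur.fetchone()
    # Commit right away, so the lock taken by the select is released
    # before we ask for the table lock.
    conn.commit()

    if row is not None and now - row[0] < MAX_AGE:
        return False  # tally is fresh enough

    # Transaction 2: starts with the table lock and nothing else held.
    cur = conn.cursor()
    cur.execute("LOCK TABLE browse_tally")
    regenerate(cur)
    cur.execute(
        "UPDATE timestamps SET value = %s WHERE name = 'browse_tally'",
        (now,),
    )
    conn.commit()
    return True
```

With this ordering, a second process that also found the tally stale
simply waits for the table lock and regenerates the tally again after
the first one commits; that is redundant work, but not a deadlock.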

I have changed PyPI to operate that way, and on my local
installation, the error went away.

I'm unsure whether there could be conflicts with other
write operations, but at the moment, I can't see how.

Regards,
Martin

