[ mailman-Feature Requests-1059566 ] Message number assignment during archiving with pipermail...
SourceForge.net
noreply at sourceforge.net
Wed Nov 3 16:33:49 CET 2004
Feature Requests item #1059566, was opened at 2004-11-03 15:33
Message generated for change (Tracker Item Submitted) made by Item Submitter
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=350103&aid=1059566&group_id=103
Category: None
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: Brad Knowles (shub)
Assigned to: Nobody/Anonymous (nobody)
Summary: Message number assignment during archiving with pipermail...
Initial Comment:
Folks,
Currently, pipermail will generate a number for each message as it
is processing the archive, including importing mbox-format
archives from other sources. Unfortunately, the numbers are
specific to each installation, so importing an archive from
somewhere else causes the message numbers after import to be
different.
Among other things, this makes it difficult to move a list from one
server to another and to set up a simple redirect from the old
location to the new one, since the message numbers could well be
different between the two sites.
It would be really, really nice if pipermail would use something like
an md5 hash of the message headers to generate a unique archive
id that could be used instead, so that the id could be consistent
across systems.
Since the "Message-id:" header should be unique for every
message, and the date/time stamps and queue-ids used by the
various servers (and logged within the "Received:" headers) will
almost certainly differ from one message to the next, an md5 hash
of the headers makes it overwhelmingly likely that the resulting
pipermail archive id would likewise be unique.
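As a rough illustration of the idea, here is a minimal Python sketch
that derives an id from nothing but the header block of a message.
The function names (header_block, archive_id) are hypothetical and
not part of pipermail; the sketch also assumes the mbox "From "
separator line has already been stripped and that line endings may
need normalizing so both systems hash the same bytes.

```python
import hashlib


def header_block(raw_message: bytes) -> bytes:
    """Return the header section: everything before the first blank line."""
    # Normalize CRLF to LF so the same message hashes identically
    # regardless of how each system stored its line endings (assumption).
    normalized = raw_message.replace(b"\r\n", b"\n")
    headers, _sep, _body = normalized.partition(b"\n\n")
    return headers


def archive_id(raw_message: bytes) -> str:
    """Hypothetical archive id: md5 of the header block, as 32 hex characters."""
    return hashlib.md5(header_block(raw_message)).hexdigest()


if __name__ == "__main__":
    msg = (
        b"Message-id: <abc123@example.org>\n"
        b"Received: from mx1.example.org; Wed, 3 Nov 2004 16:33:49 +0100\n"
        b"Subject: test\n"
        b"\n"
        b"Body text is ignored by the hash.\n"
    )
    print(archive_id(msg))
```

Because only the headers are hashed, any two installations that
import the same mbox would compute the same id, which is the
property asked for above.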
You could use the lower 16 hex characters of the 32-character
(128-bit) md5 hash that is typically generated, and the
probability of a collision between any two messages would be
vanishingly small in the resulting 64-bit space.
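To put a number on "vanishingly small", the sketch below truncates
the digest and estimates the chance of any collision in an archive
using the standard birthday approximation. The names short_id and
collision_probability are hypothetical, and reading "lower 16 hex
characters" as the last 16 characters of the hex digest is an
assumption.

```python
import hashlib
import math


def short_id(header_bytes: bytes) -> str:
    """Last 16 hex characters (64 bits) of the md5 digest of the headers."""
    return hashlib.md5(header_bytes).hexdigest()[-16:]


def collision_probability(n_messages: int, bits: int = 64) -> float:
    """Birthday approximation: 1 - exp(-n(n-1) / 2^(bits+1))."""
    exponent = -n_messages * (n_messages - 1) / 2.0 ** (bits + 1)
    # expm1 keeps precision when the probability is extremely small.
    return -math.expm1(exponent)


if __name__ == "__main__":
    print(short_id(b"Message-id: <abc123@example.org>"))
    for n in (10_000, 1_000_000, 100_000_000):
        print(n, collision_probability(n))
```

Under this approximation, even an archive of a million messages has
a collision probability on the order of 1e-8 in a 64-bit space.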
If you're concerned about filename length, represent the data in
base64 format (6 bits per ASCII character instead of 4), and get
the whole 128-bit hash down to 22 output characters.
You could then take a smaller slice and still get more bits of hash
output.
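A minimal sketch of the base64 variant follows. It swaps in the
URL-safe base64 alphabet, because standard base64 can emit "/",
which is not usable in a filename; that substitution, the dropped
"=" padding, and the name b64_id are assumptions of this sketch
rather than anything pipermail does today.

```python
import base64
import hashlib


def b64_id(header_bytes: bytes, length: int = 22) -> str:
    """Hypothetical id: URL-safe base64 of the md5 digest, without padding."""
    digest = hashlib.md5(header_bytes).digest()  # 16 raw bytes = 128 bits
    # 16 bytes encode to 24 characters, the last two being '=' padding.
    encoded = base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")
    return encoded[:length]


if __name__ == "__main__":
    headers = b"Message-id: <abc123@example.org>"
    print(b64_id(headers))      # full 128-bit hash in 22 characters
    print(b64_id(headers, 12))  # 12 characters carry 72 bits
```

A 12-character slice already carries 72 bits, more than the 64 bits
of a 16-character hex id. One caveat: base64 is case-sensitive, so
this form would be a poor fit for case-insensitive filesystems.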
If you don't like md5 for personal reasons, then maybe sha-1?
But please, whatever is used, please, please, please let it be
something that could be derived from the headers of the messages
themselves and guaranteed to be consistent across systems.
----------------------------------------------------------------------
More information about the Mailman-coders
mailing list