[Mailman-Developers] "@" in mail **text** gets
brad.knowles at skynet.be
Sun Sep 28 10:33:45 EDT 2003
At 1:37 AM -0400 2003/09/28, Barry Warsaw wrote:
> I really really want to use something like message-ids to generate
> message file names.
IIRC, Earl talks about this in the FAQ. In short, for security
reasons, you can't trust any of the information you are given
anywhere in the message, unless you can scrub that information and
guarantee that it is now safe. Otherwise, you could get a message-id
like "<../.htaccess>" or some other equally nasty thing that could
potentially cause other files to be over-written inappropriately.
Moreover, given that there are a lot of people out there with
home networks using RFC 1918 private addressing, and this information
is being used to help generate otherwise properly formatted
message-ids, the probability of message-id collision increases
significantly. This issue was recently brought to my attention
because of my own RFC 1918 private networking here at home, and the
information my MUA uses to generate message-ids.
Therefore, I think we might want to be a bit more careful in how
we generate the file names.
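To make the "<../.htaccess>" concern concrete, here is a minimal sketch in modern Python. The archive directory and both helper names are hypothetical, invented only for illustration, and are not part of Mailman. It shows how a file name built directly from an untrusted message-id can resolve to a location outside the archive directory.

```python
import posixpath

# Hypothetical archive location, used only for this illustration.
ARCHIVE_DIR = "/var/mailman/archives/private/mylist"


def naive_path(message_id):
    """Unsafe: strip the angle brackets and use the rest verbatim."""
    name = message_id.strip("<>")
    return posixpath.normpath(posixpath.join(ARCHIVE_DIR, name))


def escapes_archive(path):
    """Return True if the normalized path lies outside ARCHIVE_DIR."""
    return posixpath.commonpath([ARCHIVE_DIR, path]) != ARCHIVE_DIR


if __name__ == "__main__":
    for mid in ("<abc@example.com>", "<../.htaccess>"):
        path = naive_path(mid)
        print(mid, "->", path, "escapes:", escapes_archive(path))
```

A check like escapes_archive() only detects the problem after the fact; not deriving the file name from message content at all avoids it entirely.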
> I want to be able to generate links to archived
> messages in the footers, but I think the best way to do that is to agree
> on a reproducible, independent algorithm for calculating them.
One thing that MHonArc does for messages that are not assigned a
message-id (to help detect and eliminate duplicates) is to calculate
an MD5 hash of the message headers and use that as a substitute. We
could do the same, or perhaps even use the MD5 hash instead of the
message-id, and then store hash/message-id mappings in a database.
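A rough sketch of that idea in modern Python follows. It assumes the digest covers every header line in the order received, which may not match exactly what MHonArc hashes. The names header_digest and record are hypothetical, and a plain dict stands in for the database.

```python
import hashlib
from email import message_from_string


def header_digest(raw_message):
    """Return the MD5 hex digest of the message's header lines.

    Assumption: all headers are hashed, in the order they appear.
    """
    msg = message_from_string(raw_message)
    md5 = hashlib.md5()
    for name, value in msg.items():
        line = "%s: %s\n" % (name, value)
        md5.update(line.encode("utf-8", "surrogateescape"))
    return md5.hexdigest()


def record(index, raw_message):
    """Store a digest -> message-id mapping and return the digest."""
    digest = header_digest(raw_message)
    index[digest] = message_from_string(raw_message).get("Message-ID")
    return digest
```

Because the digest consists only of hexadecimal characters, it is safe to use as a file name whatever the original message-id contained. Note that headers added in transit (Received: lines, for example) would change the digest, so a real scheme would have to settle on a stable subset of headers.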
> approach would be to put even the public archives behind a cgi and have
> that implement a mapping between message-id derived links and the
> sequential file names (although that won't fix the regen problem).
One problem that most OSes have is with too many files in a
single directory -- go much over 1000 files in one directory, and
accessing anything in it starts taking significantly longer
than it used to. If you use a sequential message numbering
system, it's hard to break those up into smaller chunks of messages
in a hashed directory scheme. With MD5 hashes, it would be a lot
easier to convert the hash into a path name, just by adding slashes
every so often in the hash value.
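As a sketch of "adding slashes every so often", the hypothetical helper below splits a hex digest into two leading directory levels of two characters each. The level count and width are arbitrary choices for illustration, not anything Mailman or MHonArc prescribes.

```python
def hashed_path(digest, levels=2, width=2):
    """Split a hex digest into nested directory components.

    The first `levels` chunks of `width` characters become directories;
    the remainder of the digest becomes the file name.
    """
    parts = [digest[i * width:(i + 1) * width] for i in range(levels)]
    parts.append(digest[levels * width:])
    return "/".join(parts)


if __name__ == "__main__":
    # MD5 of the empty string, used here only as a sample digest.
    print(hashed_path("d41d8cd98f00b204e9800998ecf8427e"))
    # -> d4/1d/8cd98f00b204e9800998ecf8427e
```

With two hex characters per level, each directory has at most 256 subdirectories, and two levels give 65,536 leaf directories, so each leaf holds only a small fraction of even a very large archive.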
Brad Knowles, <brad.knowles at skynet.be>
"They that can give up essential liberty to obtain a little temporary
safety deserve neither liberty nor safety."
-Benjamin Franklin, Historical Review of Pennsylvania.
GCS/IT d+(-) s:+(++)>: a C++(+++)$ UMBSHI++++$ P+>++ L+ !E-(---) W+++(--) N+
!w--- O- M++ V PS++(+++) PE- Y+(++) PGP>+++ t+(+++) 5++(+++) X++(+++) R+(+++)
tv+(+++) b+(++++) DI+(++++) D+(++) G+(++++) e++>++++ h--- r---(+++)* z(+++)