As promised, here's a patch that:

1) Adds the SPLIT_DIRS option which does this:

root@gandalf:/var/local/mailman/lists# l
total 16
drwxrwsr-x    4 root     mailman      4096 Jan  1 20:03 ./
drwxrwsr-x   19 mailman  mailman      4096 Jan  1 09:55 ../
drwxrwsr-x    3 root     mailman      4096 Jan  1 19:46 m/
lrwxrwxrwx    1 root     mailman        43 Jan  1 20:01 mailman-owner -> /var/local/mailman/lists/m/ma/mailman-owner/
drwxrwsr-x    3 root     mailman      4096 Jan  1 12:18 t/
lrwxrwxrwx    1 root     mailman        34 Jan  1 20:02 test -> /var/local/mailman/lists/t/te/test/
lrwxrwxrwx    1 root     mailman        35 Jan  1 20:03 test2 -> /var/local/mailman/lists/t/te/test2/

root@gandalf:/var/local/mailman/archives/private# l
total 16
drwxrws--x    4 root     mailman      4096 Jan  1 20:03 ./
drwxrwsr-x    4 root     mailman      4096 Jan  1 09:55 ../
drwxrwsr-x    3 root     mailman      4096 Jan  1 19:46 m/
lrwxrwxrwx    1 root     mailman        54 Jan  1 20:01 mailman-owner -> /var/local/mailman/archives/private/m/ma/mailman-owner/
lrwxrwxrwx    1 root     mailman        59 Jan  1 20:01 mailman-owner.mbox -> /var/local/mailman/archives/private/m/ma/mailman-owner.mbox/
drwxrwsr-x    3 root     mailman      4096 Jan  1 12:18 t/
lrwxrwxrwx    1 root     mailman        45 Jan  1 20:02 test -> /var/local/mailman/archives/private/t/te/test/
lrwxrwxrwx    1 root     mailman        46 Jan  1 20:03 test2 -> /var/local/mailman/archives/private/t/te/test2/
lrwxrwxrwx    1 root     mailman        51 Jan  1 20:03 test2.mbox -> /var/local/mailman/archives/private/t/te/test2.mbox/
lrwxrwxrwx    1 root     mailman        50 Jan  1 20:02 test.mbox -> /var/local/mailman/archives/private/t/te/test.mbox/

This gets around the 32K link / directory limit in many filesystems
(limiting mailman to 16K lists)

2) Creates the pipermail html dir at list creation time so that you don't get
   an http error when you view the archive of a list that doesn't have
   messages yet

3) rmlist now does what it advertises with -a (you couldn't erase archives
   after erasing a list)

root@gandalf:/var/local/mailman/bin# ./rmlist test2
Not removing archives. Reinvoke with -a to remove them.
Removing list info
Removing list info
root@gandalf:/var/local/mailman/bin# ./rmlist -a test2
List test2 does not exist or was already deleted, trying to remove archives.
Removing private archives
Removing private archives
Removing private archives
Removing private archives
Removing public archives
test2 public archives not found as /var/local/mailman/archives/public/t/te/test2
Removing public archives
test2 public archives not found as /var/local/mailman/archives/public/t/te/test2.mbox

If everyone is cool with this, I'll write a tool to convert an existing
installation to the new (optional) directory layout (I need this for
lists.sourceforge.net)

diff -urN mailman/Mailman/Archiver/Archiver.py mailman.subdirs/Mailman/Archiver/Archiver.py
--- mailman/Mailman/Archiver/Archiver.py Fri Oct 26 23:57:47 2001
+++ mailman.subdirs/Mailman/Archiver/Archiver.py Tue Jan 1 11:07:56 2002
@@ -76,6 +76,10 @@
         #     listname.mbox
         #     listname/
         #         lots-of-pipermail-stuff
+        # (note that if mm_cfg.SPLIT_DIRS is set, we create subdirectories and
+        # use symlinks (this gets around a 32k directories limit in some
+        # filesystems linked to a 32k hardlink limit per inode -- Marc))
+        #
         # public/
         #     listname.mbox@ -> ../private/listname.mbox
         #     listname@ -> ../private/listname
@@ -89,7 +93,25 @@
         omask = os.umask(0)
         try:
             try:
-                os.mkdir(self.archive_dir()+'.mbox', 02775)
+                listname=self.internal_name();
+                if mm_cfg.SPLIT_DIRS:
+                    archprivdir=os.path.join(mm_cfg.PRIVATE_ARCHIVE_FILE_DIR,
+                        listname[0], listname[0:2], listname + '.mbox')
+                    os.makedirs(archprivdir, 02775)
+                    os.symlink(archprivdir, self.archive_dir()+'.mbox')
+                else:
+                    os.mkdir(self.archive_dir()+'.mbox', 02775)
+                # We also create an empty pipermail archive directory (pipermail
+                # would create it, but in the meantime lists with no archives
+                # return errors when you browse the non existant archive dir)
+                # Besides, pipermail won't know about mm_cfg.SPLIT_DIRS -- Marc
+                if mm_cfg.SPLIT_DIRS:
+                    archprivdir=os.path.join(mm_cfg.PRIVATE_ARCHIVE_FILE_DIR,
+                        listname[0], listname[0:2], listname)
+                    os.makedirs(archprivdir, 02775)
+                    os.symlink(archprivdir, self.archive_dir())
+                else:
+                    os.mkdir(self.archive_dir(), 02775)
             except OSError, e:
                 if e.errno <> errno.EEXIST: raise
         finally:
diff -urN mailman/Mailman/Archiver/pipermail.py mailman.subdirs/Mailman/Archiver/pipermail.py
--- mailman/Mailman/Archiver/pipermail.py Fri Nov 30 09:07:32 2001
+++ mailman.subdirs/Mailman/Archiver/pipermail.py Tue Jan 1 11:28:15 2002
@@ -252,6 +252,9 @@
         self.database = database

         # If the directory doesn't exist, create it
+        # This code shouldn't get run anymore, we create the directory in
+        # Archiver.py. It should only get used by legacy lists created that
+        # are only receiving their first message in the HTML archive now -- Marc
         try:
             os.stat(self.basedir)
         except os.error, errdata:
diff -urN mailman/Mailman/Defaults.py.in mailman.subdirs/Mailman/Defaults.py.in
--- mailman/Mailman/Defaults.py.in Tue Jan 1 08:29:01 2002
+++ mailman.subdirs/Mailman/Defaults.py.in Tue Jan 1 12:50:35 2002
@@ -62,6 +62,13 @@
 HOME_PAGE = 'index.html'
 MAILMAN_SITE_LIST = 'mailman'

+# Set to '1' to have mailman start creating lists in directories like
+# ~mailman/lists/l/li/listname/ (same thing for the archives) to get around
+# the 32K directory limitation in some filesystems.
+# Because this sets symlinks to the expected positions, it is fully forward
+# and backward compatible -- Marc
+SPLIT_DIRS = 0
+


 #####
diff -urN mailman/Mailman/MailList.py mailman.subdirs/Mailman/MailList.py
--- mailman/Mailman/MailList.py Tue Jan 1 08:29:02 2002
+++ mailman.subdirs/Mailman/MailList.py Tue Jan 1 10:40:25 2002
@@ -388,7 +388,13 @@
             Utils.ValidateEmail(admin)
         omask = os.umask(0)
         try:
-            os.makedirs(os.path.join(mm_cfg.LIST_DATA_DIR, name), 02775)
+            listdir=os.path.join(mm_cfg.LIST_DATA_DIR, name)
+            if mm_cfg.SPLIT_DIRS:
+                splitdir=os.path.join(mm_cfg.LIST_DATA_DIR, name[0], name[0:2], name)
+                os.makedirs(splitdir, 02775)
+                os.symlink(splitdir, listdir)
+            else:
+                os.makedirs(listdir, 02775)
         finally:
             os.umask(omask)
         self._full_path = os.path.join(mm_cfg.LIST_DATA_DIR, name)
diff -urN mailman/bin/rmlist mailman.subdirs/bin/rmlist
--- mailman/bin/rmlist Sat Sep 8 01:18:47 2001
+++ mailman.subdirs/bin/rmlist Tue Jan 1 12:35:35 2002
@@ -1,4 +1,4 @@
-#! @PYTHON@
+#! /usr/bin/python
 #
 # Copyright (C) 1998,1999,2000,2001 by the Free Software Foundation, Inc.
 #
@@ -78,11 +78,6 @@
         usage(1)

     listname = args[0].lower().strip()
-    if not Utils.list_exists(listname):
-        usage(1, _('No such list: %(listname)s'))
-
-    mlist = MailList.MailList(listname, lock=0)
-
     removeArchives = 0
     for opt, arg in opts:
         if opt in ('-a', '--archives'):
@@ -90,28 +85,57 @@
         elif opt in ('-h', '--help'):
             usage(0)

+    if not Utils.list_exists(listname):
+        if not removeArchives:
+            usage(1, _('No such list (or list already deleted): %(listname)s'))
+        else:
+            print _('List %(listname)s does not exist or was already deleted, trying to remove archives.')
+
     if not removeArchives:
         print _('Not removing archives. Reinvoke with -a to remove them.')

-    # Do the MTA-specific list deletion tasks
-    if mm_cfg.MTA:
-        modname = 'Mailman.MTA.' + mm_cfg.MTA
-        __import__(modname)
-        sys.modules[modname].remove(mlist)

-    REMOVABLES = [('lists/%s', 'list info'),
-                  ]
+    listsplitsubdir = os.path.join(listname[0], listname[0:2], listname)
+    listmboxsplitsubdir = os.path.join(listname[0], listname[0:2], listname + ".mbox")
+    REMOVABLES = [ ]
+    if Utils.list_exists(listname):
+        mlist = MailList.MailList(listname, lock=0)
+
+        # Do the MTA-specific list deletion tasks
+        if mm_cfg.MTA:
+            modname = 'Mailman.MTA.' + mm_cfg.MTA
+            __import__(modname)
+            sys.modules[modname].remove(mlist)
+
+        REMOVABLES = [
+            (os.path.join('lists', listname), _('list info')),
+            (os.path.join('lists', listsplitsubdir), _('list info'))
+        ]
+

     if removeArchives:
-        REMOVABLES.extend(
-            [('archives/private/%s', _('private archives')),
-             ('archives/private/%s.mbox', _('private archives')),
-             ('archives/public/%s', _('public archives')),
-             ('archives/public/%s.mbox', _('public archives')),
-             ])
+        REMOVABLES.extend ([
+            (os.path.join('archives', 'private', listname),
+                _('private archives')),
+            (os.path.join('archives', 'private', listsplitsubdir),
+                _('private archives')),
+            (os.path.join('archives', 'private', listname + '.mbox'),
+                _('private archives')),
+            (os.path.join('archives', 'private', listmboxsplitsubdir),
+                _('private archives')),
+
+            (os.path.join('archives', 'public', listname),
+                _('public archives')),
+            (os.path.join('archives', 'public', listsplitsubdir),
+                _('public archives')),
+            (os.path.join('archives', 'public', listname + '.mbox'),
+                _('public archives')),
+            (os.path.join('archives', 'public', listmboxsplitsubdir),
+                _('public archives'))
+        ])

     for dirtmpl, msg in REMOVABLES:
-        dir = os.path.join(mm_cfg.VAR_PREFIX, dirtmpl % listname)
+        dir = os.path.join(mm_cfg.VAR_PREFIX, dirtmpl)
         remove_it(listname, dir, msg)

--
Microsoft is to operating systems & security ....
                                      .... what McDonalds is to gourmet cooking
Home page: http://marc.merlins.org/ | Finger marc_f@merlins.org for PGP key
"MM" == Marc MERLIN <marc_news@vasoftware.com> writes:
MM> As promised, here's a patch that:
MM> 1) Adds the SPLIT_DIRS option which does this:
Some questions: what if you already have an 'm' list or a 't' list? I guess you can't mix and match split dirs installations with non-split dirs installations. It also means...
MM> If everyone is cool with this, I'll write a tool to convert an
MM> existing installation to the new (optional) directory layout
MM> (I need this for lists.sourceforge.net)
...that this tool will be essential. Also, will the conversion script be able to go back to non-SPLIT_DIRS configuration? This also hints that maybe we don't need the SPLIT_DIRS variable, but perhaps we can simply auto-detect it (although if it makes life easier, I'm not opposed to the variable).
Are you sure other bits that look for the list's directory still work (like template searching, or extend.py)? They should because of the symlinks.
Also...
MM> 2) Creates the pipermail html dir at list creation time so
MM> that you don't get an http error when you view the archive of
MM> a list that doesn't have messages yet
MM> 3) rmlist now does what it advertises with -a (you couldn't
MM> erase archives after erasing a list)
These are both useful patches on their own, so it's best in general to split them up into individual patches. I think I can strip them out from your diff, so don't worry about it this time. I'll go ahead and apply these parts now, and if necessary you can re-generate the SPLIT_DIRS patch based on the above comments.
Thanks! -Barry
On Tue, Jan 01, 2002 at 03:26:11PM -0500, Barry A. Warsaw wrote:
> "MM" == Marc MERLIN <marc_news@vasoftware.com> writes:
> MM> As promised, here's a patch that:
> MM> 1) Adds the SPLIT_DIRS option which does this:
> Some questions: what if you already have an 'm' list or a 't' list? I
I thought about list names of less than 3 characters to see if the dir hashing would work, but I forgot about this case. So basically, it means that it'll get a bit ugly if you have one letter listnames. For that matter, it should work, but your 't' list would have directories with other lists inside of it, not very pretty.
In other words, we need at least a disclaimer in Defaults.py that if you turn the option on, you'd better have at least 2 letters for all your lists.
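To make the one-letter case concrete, here is a small sketch in modern Python 3 (not the Python 2 of the patch) of how the split path is derived. The helper names `split_subdir` and `flat_entry_collides` are made up for illustration; only the `name[0]` / `name[0:2]` slicing is taken from the patch.

```python
import os


def split_subdir(listname):
    """Relative path used by the split layout: l/li/listname."""
    return os.path.join(listname[0], listname[0:2], listname)


def flat_entry_collides(listname):
    """True when the flat entry lists/<listname> is also a top-level bucket.

    Top-level buckets are named after listname[0], so only a
    one-character list name can share its name with a bucket.
    """
    return listname == listname[0]


for name in ("test2", "te", "t"):
    print(name, split_subdir(name), flat_entry_collides(name))
```

With a two-letter name such as 'te', the flat entry lists/te and the bucket lists/t stay distinct, which is why a two-letter minimum would be enough.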
> guess you can't mix and match split dirs installations with non-split dirs installations. It also means...
You should be able to; I wrote the patch with this in mind. All it does is move the creation place of new lists and set a symlink back to the expected place. Unless I missed something (I've not given this real-life testing outside of my laptop yet), I don't see why it can't cohabit with lists created the old way. Sure, it'd be inconsistent, but it should work. Note too that rmlist was changed to deal with both kinds of lists (actually, it just deletes all possible names).
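The "deletes all possible names" approach can be sketched roughly as follows, in Python 3 rather than the Python 2 of the patch. This is a simplified illustration, not the rmlist code itself: the function name `removal_candidates` and its `var_prefix` parameter are hypothetical, and unlike the patch it always includes the list info paths instead of checking first whether the list still exists.

```python
import os


def removal_candidates(var_prefix, listname, remove_archives=False):
    """Return every path a list could occupy, in flat and split layouts."""
    split = os.path.join(listname[0], listname[0:2], listname)
    candidates = [
        os.path.join(var_prefix, "lists", listname),  # flat dir or symlink
        os.path.join(var_prefix, "lists", split),     # split dir
    ]
    if remove_archives:
        for visibility in ("private", "public"):
            for suffix in ("", ".mbox"):
                base = os.path.join(var_prefix, "archives", visibility)
                candidates.append(os.path.join(base, listname + suffix))
                candidates.append(os.path.join(base, split + suffix))
    return candidates


for path in removal_candidates("/var/local/mailman", "test2", True):
    print(path)
```

Paths that do not exist are simply reported as not found, which matches the "public archives not found as ..." lines in the rmlist output earlier in the thread.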
> MM> If everyone is cool with this, I'll write a tool to convert an
> MM> existing installation to the new (optional) directory layout
> MM> (I need this for lists.sourceforge.net)
> ...that this tool will be essential. Also, will the conversion script be able to go back to non-SPLIT_DIRS configuration? This also hints
I hadn't planned to write that, but I don't expect people to casually switch back and forth. Besides, if they hit 16K, they won't be able to go back anyway. That said, it shouldn't be too hard to write something to go back if someone needs it. It's also not too hard to do it by hand with a few shell commands, like
    cd ~mailman/lists; /bin/rm *; mv */*/* .; /bin/rm -rf [a-z]
(untested and crude, but you get the idea)
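A somewhat more cautious version of the same idea, sketched in Python 3, might look like the following. It is an illustration only, not a tested conversion tool: the function name `unsplit_lists` is hypothetical, and unlike the shell one-liner it only touches symlinks that point at the expected l/li/listname location and only removes bucket directories that end up empty.

```python
import os
import shutil


def unsplit_lists(root):
    """Move split-layout list dirs in `root` back to the flat layout.

    Returns the names of the lists that were moved.
    """
    moved = []
    for entry in sorted(os.listdir(root)):
        link = os.path.join(root, entry)
        if not os.path.islink(link):
            continue  # already a flat directory (or a bucket): leave it
        expected = os.path.join(root, entry[0], entry[0:2], entry)
        if not os.path.isdir(expected):
            continue
        if os.path.realpath(link) != os.path.realpath(expected):
            continue  # symlink points somewhere unexpected: leave it
        os.unlink(link)              # drop the symlink ...
        shutil.move(expected, link)  # ... and put the real dir in its place
        moved.append(entry)

    # Remove bucket directories (l/ and l/li/) only once they are empty.
    for entry in os.listdir(root):
        bucket = os.path.join(root, entry)
        if len(entry) != 1 or os.path.islink(bucket) or not os.path.isdir(bucket):
            continue
        for sub in os.listdir(bucket):
            subdir = os.path.join(bucket, sub)
            if os.path.isdir(subdir) and not os.listdir(subdir):
                os.rmdir(subdir)
        if not os.listdir(bucket):
            os.rmdir(bucket)
    return moved
```

As with the shell version, mailman would have to be stopped while this runs, and the same pass would be needed for the archives directories.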
> that maybe we don't need the SPLIT_DIRS variable, but perhaps we can simply auto-detect it (although if it makes life easier, I'm not opposed to the variable).
Mmmh, I'd rather not have mailman try to be too fancy and autodetect the setup. Besides, someone who can't afford downtime (you'd have to stop mailman while the conversion happens) could turn the switch on and have new lists created the new way while the old ones stay the way they were. The code was written so that, if you wanted to, you could turn the setting on and off for every other list.
> Are you sure other bits that look for the list's directory still work (like template searching, or extend.py)? They should because of the symlinks.
I've done basic testing, but nothing real-life. That said, the idea behind the symlinks is that I wouldn't have to care where all the code was, since it should indeed follow the said symlinks.
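The symlink argument can be checked in isolation with a short Python 3 sketch like the one below. It is not Mailman code: `create_split_list_dir` is a hypothetical helper that mimics what the patch does for the list directory, and the file name `placeholder.txt` is invented for the demonstration.

```python
import os
import tempfile


def create_split_list_dir(root, listname):
    """Create root/l/li/listname and a symlink root/listname pointing at it."""
    real_dir = os.path.join(root, listname[0], listname[0:2], listname)
    os.makedirs(real_dir)
    link = os.path.join(root, listname)
    os.symlink(real_dir, link)
    return link


with tempfile.TemporaryDirectory() as root:
    link = create_split_list_dir(root, "test2")
    # Code that only knows the old, flat path writes through the symlink ...
    with open(os.path.join(link, "placeholder.txt"), "w") as fh:
        fh.write("hello")
    # ... and the file ends up in the split directory.
    real_file = os.path.join(root, "t", "te", "test2", "placeholder.txt")
    print(os.path.islink(link), os.path.exists(real_file))
```

Code that opens or lists the old path should behave the same way; code that explicitly checks for a symlink or refuses to follow one would be the exception, which is the kind of thing real-life testing would have to catch.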
> Also...
> MM> 2) Creates the pipermail html dir at list creation time so
> MM> that you don't get an http error when you view the archive of
> MM> a list that doesn't have messages yet
> MM> 3) rmlist now does what it advertises with -a (you couldn't
> MM> erase archives after erasing a list)
> These are both useful patches on their own, so it's best in general to split them up into individual patches. I think I can strip them out from your diff, so don't worry about it this time. I'll go ahead and apply these parts now, and if necessary you can re-generate the SPLIT_DIRS patch based on the above comments.
Yes, I know about splitting patches, and I usually do, but as you'll see when splitting this one, I'd have had to write two versions, because the SPLIT_DIRS functionality changes (a bit) how 2 and 3 would have been written.
Marc
On Wed, Jan 02, 2002 at 10:45:59AM +0100, Marc MERLIN wrote:
> On Tue, Jan 01, 2002 at 03:26:11PM -0500, Barry A. Warsaw wrote:
> > "MM" == Marc MERLIN <marc_news@vasoftware.com> writes:
> > MM> As promised, here's a patch that:
> > MM> 1) Adds the SPLIT_DIRS option which does this:
> > Some questions: what if you already have an 'm' list or a 't' list? I
> I thought about list names of less than 3 characters to see if the dir hashing would work, but I forgot about this case. So basically, it means that it'll get a bit ugly if you have one letter listnames. For that matter, it should work, but your 't' list would have directories with other lists inside of it, not very pretty.
> In other words, we need at least a disclaimer in Defaults.py that if you turn the option on, you'd better have at least 2 letters for all your lists.
Actually, I just realized that if you have a list named 'a' and you try to delete it, my current rmlist code will blow away all the lists that start with 'a', because lists/a is also the directory that holds them. I think the best fix to this, instead of making mailman too complex or removing the flexibility of running a dual setup (with both old and new lists), is:
1) Document, above the setting in Defaults.py, that all list names need to have at least two letters.
2) Patch newlist and rmlist further to refuse to deal with one-letter lists.

For #2, we can do it in the following ways:
a) all the time (even if you don't have split_dirs enabled)
b) only if split_dirs is enabled (that would mean that if you enabled it and disabled it later, and then tried to delete list 'a', it would blow away all the lists that start with 'a' and that were created while split_dirs was enabled)
c) as soon as split_dirs is enabled and a list is created, newlist drops a ~mailman/lists/.splitdirs_do_not_delete file, and after that refuses to create or delete one-letter lists whether the setting is currently enabled or not
I do realize that none are ideal or foolproof, but I expect that few users are ever going to use this, and even fewer will toggle the setting back and forth. I believe choosing (b) or (c) and documenting appropriately should be enough.
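For what it's worth, option (c) could be sketched along these lines in Python 3. Nothing here exists in Mailman: the function names are hypothetical, and only the marker file name `.splitdirs_do_not_delete` comes from the proposal above.

```python
import os

MARKER_NAME = ".splitdirs_do_not_delete"  # file name proposed in option (c)


def record_split_dirs_use(lists_dir):
    """Drop the marker file; meant to be called when a split list is created."""
    with open(os.path.join(lists_dir, MARKER_NAME), "a"):
        pass


def one_letter_names_refused(lists_dir, split_dirs_enabled):
    """Refuse if the option is on now, or was ever used in this installation."""
    marker = os.path.join(lists_dir, MARKER_NAME)
    return bool(split_dirs_enabled) or os.path.exists(marker)


def check_listname(listname, lists_dir, split_dirs_enabled):
    """Raise ValueError for a one-letter name when such names are refused."""
    if len(listname) < 2 and one_letter_names_refused(lists_dir,
                                                      split_dirs_enabled):
        raise ValueError("refusing to handle one-letter list %r" % listname)
```

Both newlist and rmlist would call `check_listname` before doing anything else; option (b) is the same check without the marker file.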
What do you think?
Marc