On Mar 26, 2012, at 03:20 PM, David Jeske wrote:
>I'm writing to find out the state of and philosophy surrounding pipermail
>in mailman, to see if there is a productive way to provide some
>code/development-time to that part of mailman.
That's awesome. Pipermail (in its current state) is ancient, and it shows.
Its biggest flaw IMO is the lack of "stable URL" support, meaning if you
regenerate the archive from the underlying mbox file, you'll likely break all
your links. Pipermail has lots of other problems, but this one for example is
why it was ditched for Launchpad.
I think the time is ripe for new archivers, and Mailman 3's philosophy is to
provide a framework in which developers can easily experiment with new
archiver technology, integrated easily into Mailman.
I personally think it would be best to spend effort on one of the many
archiver projects being worked on, and mentioned here, rather than trying to
improve Pipermail. One possible connection though would be to consider how a
site would migrate from a Pipermail-based archiver to one of the new ones. Do
you keep the old URLs alive but just not add anything new to Pipermail? Do
you provide a mapping from Pipermail URLs to new-archiver URLs? etc.
>I've written code for this a number of times (eGroups, Yahoo Groups, Google
>Groups). I also released an open-source python/clearsilver/sqlite based
>archiver with redundant text-eliding, a few different thread views, and
>search... ( http://www.clearsilver.net/archive/ ) which is hardly used
>both because I don't try to popularize it, and because many sites just
>leave the default (pipermail).
Neat. Perhaps you'd like to contribute an implementation of the IArchiver
interface in Mailman 3 that would send posted messages to ClearSilver?
Here's the current interface definition:
and some example implementations:
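
As a rough illustration only (this is not the interface definition referred to above), an archiver plug-in might take roughly the following shape. Only `permalink()` is named elsewhere in this message; the other member names (`name`, `list_url`, `archive_message`), the `fqdn_listname` attribute, the base URL, and the hashing scheme are assumptions made for this sketch.

```python
import hashlib
from base64 import b32encode
from email.message import Message
from types import SimpleNamespace


class SketchArchiver:
    """Hypothetical archiver shaped like an IArchiver implementation."""

    name = 'sketch'                           # assumed attribute
    base_url = 'http://archive.example.com'   # hypothetical site

    def __init__(self):
        # permalink -> message; stands in for a real storage backend.
        self._store = {}

    def list_url(self, mlist):
        # Assumed method: URL of the list's archive index.
        return '{}/{}'.format(self.base_url, mlist.fqdn_listname)

    def permalink(self, mlist, msg):
        # Derive the URL only from the Message-ID, so it does not depend
        # on the order in which messages happened to be archived.
        message_id = msg['message-id'].strip().strip('<>')
        digest = hashlib.sha1(message_id.encode('utf-8')).digest()
        token = b32encode(digest).decode('ascii')
        return '{}/{}'.format(self.list_url(mlist), token)

    def archive_message(self, mlist, msg):
        # Assumed method: store the message and return its permalink.
        url = self.permalink(mlist, msg)
        self._store[url] = msg
        return url


# Stand-in for a mailing list object; only the one attribute is needed here.
mlist = SimpleNamespace(fqdn_listname='test@example.com')
msg = Message()
msg['Message-ID'] = '<abc123@example.com>'
msg.set_payload('Hello')

archiver = SketchArchiver()
url = archiver.archive_message(mlist, msg)
```

Because the permalink is computed from the message alone, a freshly created `SketchArchiver` that re-archives the same message produces the same URL, which is the property Pipermail lacks.
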
>a) pipermail is fine... if you want to fix a bug or two submit a patch, but
>we don't want to improve it
Note that before I ripped Pipermail out of the Mailman 3 trunk branch, I
created a project on Launchpad and pushed a semi-sanitized branch:
I don't personally plan to touch this, but if someone is really motivated to
hack on Pipermail specifically, I'd happily give you access.
>b) we're ditching pipermail entirely... in the future sites will have to
>choose and install an external archiver
Yes, but remember, you can choose one or more archivers in Mailman 3. So it
would be easy to archive to something local, and The Mail Archive, and Gmane,
>c) we'd love pipermail to be improved... but we still want it to be simple,
>static-html, and dependency free
>d) we'd love a dynamic-ui replacement for pipermail... as long as it uses
>the same cgi/templating model as mailman ui
Because Postorius (the official, but not required MM3 web ui) is Django-based,
I think it should be pretty flexible for customizing the look and feel. I
won't dictate what a new archiver should look like, but some principles that I
personally think should be followed include:
- Modern web technologies, with flexible templating, so that sites can
customize the look and feel as needed.
simple browser, but that also doesn't mean it has to be feature-equivalent
term? I.e. make it awesome for today's web, but usable in older browsers,
screen readers (for accessibility), etc.
- Dynamic generation of pages with caching. I'd love to see an enhanceable
approach to the actual HTML generation. Let's say for example, that I
suddenly want to recognize and hyperlink "bug numbers" so they point to my
tracker. I should be able to drop in some extension to do that. Or maybe
the spammers have cracked my email obfuscation algorithm. I should be able
to drop in a replacement, invalidate the cache, and all my new pages would
automatically get the new obfuscation algorithm.
- Separate, dynamic support for take down notices. You posted your personal
information and have asked the site's postmasters to take that down (either
the full message or the personal information parts of it). It should be
really simple for the site admins to do this, while possibly retaining the
original message in a publicly inaccessible location for forensic purposes.
- Support for private archivers, so configurable authentication would be
- Merging of forums, archives, newsgroups, and IMAP.
- A REST API for querying information about the site, its lists, individual
archived messages, metrics, etc. Maybe even some control (e.g. take downs)
for users with the proper permissions.
- Stable URLs, RFC 5064 + X-Message-ID-Hash. See the above links. If you can
implement the `IArchiver.permalink()` method and ensure that even if
completely wiped and regenerated from the underlying raw messages, your URLs
will remain stable, I think you will have won. :)
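
The "drop in an extension, then invalidate the cache" idea from the list above can be sketched as follows. This is a minimal illustration under stated assumptions, not a proposed design: the `PageRenderer` class, the filter functions, and the tracker URL are all hypothetical.

```python
import html
import re


class PageRenderer:
    """Render message bodies through replaceable filters, with a cache."""

    def __init__(self, filters):
        self.filters = list(filters)
        self._cache = {}   # message id -> rendered HTML

    def render(self, message_id, body):
        # Generate the page on first request, then serve it from the cache.
        if message_id not in self._cache:
            text = html.escape(body)
            for apply_filter in self.filters:
                text = apply_filter(text)
            self._cache[message_id] = text
        return self._cache[message_id]

    def replace_filters(self, filters):
        # Swapping the filters invalidates every cached page, so the next
        # request for any message is rendered with the new filters.
        self.filters = list(filters)
        self._cache.clear()


def link_bugs(text):
    # Hypothetical tracker URL; turns "bug #42" into a hyperlink.
    return re.sub(r'bug #(\d+)',
                  r'<a href="https://bugs.example.com/\1">bug #\1</a>',
                  text)


def obfuscate_v1(text):
    # The obfuscation that the spammers have "cracked".
    return text.replace('@', ' at ')


def obfuscate_v2(text):
    # Its drop-in replacement.
    return text.replace('@', '(a)')


body = 'See bug #42, mail me@example.com'
renderer = PageRenderer([link_bugs, obfuscate_v1])
first = renderer.render('<1@example.com>', body)

renderer.replace_filters([link_bugs, obfuscate_v2])
second = renderer.render('<1@example.com>', body)
```

Here `first` carries the old obfuscation and `second` the new one, even though the stored message never changed.
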
That's about all I can think of right now, but I'll say this: I think there's
huge untapped value in a really great archiving framework. I have lots of
ideas and it's something I'd like to work on eventually, once we get the core
engine stable and released.
Also, as you work on archivers and try to integrate them with mm3, please do
provide feedback on the IArchiver API and the integration implementation in
particular. What's missing? What's wrong with the API?
Note too that atm, the lp:mailman bzr trunk is a bit ahead of 3.0b1 here. I
had to disable some things at the last minute in the ArchiveRunner, and I've
since fixed and pushed updates. So you probably want to at least take a look
at the bzr branch.
On 28.03.12 18:47, Odhiambo Washington wrote:
> One more thing:
> In settings.py, I have this:
> REST_SERVER = 'http://192.168.40.252:8001'
> However, this doesn't seem to be respected when I do runserver:
> [root@jaribu] /usr/home/wash/Tools/Mailman/MM3/postorius/dev_setup#
> python manage.py runserver
> Validating models...
> 0 errors found
> Django version 1.4, using settings 'dev_setup.settings'
> Development server is running at http://127.0.0.1:8000/
> Quit the server with CONTROL-C.
> [root@jaribu] /usr/home/wash# sockstat -l | grep 800
> root python 77906 3 tcp4 127.0.0.1:8000 *:*
> root python 14108 43 tcp4 127.0.0.1:8001 *:*
> Since I am not using the server as a Desktop, I need a way to access it
> remotely, not via 127.0.0.1
The REST_SERVER setting defines the location of Mailman's REST API
(which is frequently accessed by Postorius), *not* the address of
Postorius itself. The API can only be accessed from localhost, so the
setting has to be 'http://localhost:8001'.
If you'd like to access Postorius from a different machine than the one
you're running it on, that's no problem:
Just run the development server like this and you're good to go:
python manage.py runserver 192.168.x.xxx:8000
(Don't do that on a machine that is exposed to the web though, since
Django's dev server is not meant to be run in a production environment.)
Hope that helps!
On 12-03-28 7:49 PM, Stephen J. Turnbull wrote:
> Hey, welcome, Ana! I suspect that the Systers organization has moved
> on, since their specific needs were satisfied AFAIK. But they have
> some really good people (both as engineers and as human beings), so
> please do get in touch with them (especially the people who worked on
> their Mailman projects). I'm sure they'll be glad to talk with you.
*laugh* Stephen, maybe I should have mentioned that I'm *also* a GSoC
mentor (and one of several org admins) for Systers, and we are indeed
doing some Mailman related projects this year.
For the most part, I've kept the suggested projects separate on the
respective GSoC 2012 projects page. Systers is the obvious candidate
for some Mailman 3 related projects, specifically dynamic sublists
(which was invented for Systers and I hope to have replace topics in
Mailman 3) and some usability work I'd like done on the admin interface
(Systers has an experienced usability expert on our mentoring team who's
agreed to help if we can find the right student). And of course,
Systers still has wishlists for their current custom install which is
based on Mailman 2.1.10 and other organizational projects.
There's a bit of a grey area where students could put in a submission
under either Systers or Mailman, though. Archives work, for example,
could probably go with either. I'm pretty much waiting to see and
hoping we've got enough mentors to go around.
On Mar 26, 2012, at 06:07 PM, David Jeske wrote:
>I highly recommend reconsidering this and including a standard archiver
>with mailman. If the number of sites that use pipermail is any indication,
>I think failing to include something will basically mean lots of lists
>without any archives.
I think you're right. People want a turnkey solution - download one thing,
run one command, and you've got everything you need. Of course, with mm2, you
*don't* though because Pipermail never provided searching for example.
There are lots of questions to ask about how we in the Mailman project would
provide that kind of turnkey solution that includes our best-of-breed
services. OTOH, it's also powerful to provide choice and not require any
My gut tells me that we're on the right track with Postorius, but that we're
pretty far away from being able to bless an archiver right now. It's
definitely not something I want to hold up the 3.0 final release for, so we
have to find the right way to manage our users' expectations.
Understand though that we're constrained in other ways when we start thinking
about bundling or officially blessing certain components. Mailman is a GNU
project and the core's copyright has been owned by the FSF for over a decade.
We require copyright assignments to the FSF. Postorius falls under the same
administrative structure, so there's no problem calling it the official GNU
Mailman web ui. A blessed, bundled archiver will probably have to adhere to the
same constraints, or at least we'd have to have that conversation with the FSF
But that absolutely shouldn't stop any other third party archiver from being
Mailman 3 compatible.
As I said, like the Python standard library, it's both a blessing and a
curse. :)
>As for the features it doesn't have from your list: Editing would be easy
>to add because it's sqlite (deciding on the auth system is probably more of
>an issue than the editing). Anti-Crawl code is really an issue of
>configuration for cheap in-memory state-management. NNTP is well. that
>would be a big job that I doubt will be bitten off by something as "small"
>as a list archiver.
Why can't we kill off Gmane while we're killing off Gmail, and *Groups? :).
>What is the REST UI used by? CSLA supports RSS. When it comes to a more
>involved REST UI, what software would be hitting it? I don't think I'll
>understand your other API/REST points until I see an answer to this.
I'm a list owner and someone requests that a post containing private
information be taken down.
As a drive-by archive user, I want to request that a message get sent to me so
that I can reply to it in my mail reader as if I had received the original.
I run a question/answer forum that gateways a list, and I want to +1 really
helpful messages, or give some extra kudos to really helpful users.
On Mar 27, 2012, at 05:20 PM, Jeff Breidenbach wrote:
>What is the incantation for enabling an external archiving service?
>Currently I only see this in mailman.cfg after following 5 minute guide.
That configuration variable just sets up the appropriate queue runner. Of
course without that, nothing would get archived, but it's not the interesting
bit from your perspective. :)
(I know that figuring out the configuration ini-file stack can be a little
confusing; we need better/more docs!)
So, every archiver that Mailman knows about will have a section such as
where <name> can currently be mhonarc, mail_archive, or prototype. It
shouldn't be difficult to add new archivers by doing something like this in
your mailman.cfg file:
It's the 'enable' variable that defines whether the archiver is enabled
system-wide or not. This is documented in schema.cfg (think: the bottom rung
of the ini-stack, although it's slightly different than mailman.cfg).
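
To make the section layout concrete, here is a small sketch that reads an ini-style fragment and reports which archivers are enabled. It uses Python's standard `configparser` purely for illustration and does not reflect how Mailman itself loads its configuration; the fragment and its values are made up, with only the `[archiver.<name>]` section naming and the 'enable' variable taken from the description above.

```python
from configparser import ConfigParser

# Illustrative fragment in the style of a mailman.cfg; values are invented.
SAMPLE_CFG = """
[archiver.mhonarc]
enable: yes

[archiver.mail_archive]
enable: no

[archiver.prototype]
enable: yes
"""


def enabled_archivers(cfg_text):
    """Return the names of archiver sections whose 'enable' flag is true."""
    parser = ConfigParser()
    parser.read_string(cfg_text)
    names = []
    for section in parser.sections():
        if not section.startswith('archiver.'):
            continue
        # A section without an 'enable' variable is treated as disabled here.
        if parser.getboolean(section, 'enable', fallback=False):
            names.append(section[len('archiver.'):])
    return names


print(enabled_archivers(SAMPLE_CFG))
```
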
(Aside: I think I need to start a YouTube channel on mm3 :).
The template for archiver sections is the [archiver.master] section in the
schema.cfg. You'll see all the standard configuration variables defined
there, with their default values.
These are all site-wide configurations which define, enable, and configure the
various archivers available to the system.
Mailing lists themselves have a more limited palette (currently) for
configuring their archiving behavior. There are two list-specific values:
- IMailingList.archive is a boolean which determines whether the list will
archive its messages at all or not. The default list style sets this to
- IMailingList.archive_private is a boolean only consulted in the
mail_archive archiver. If this is true, then it will not email messages to
I've thought about giving more control to individual lists, but I'm not sure
how much value there is in allowing a list owner to e.g. decide which of the
set of enabled archivers their messages get forwarded to.
>> archivers are configured site-wide, so there's almost nothing
>> to expose in the web-ui.
>I'm worried about confusion. The last thing we want is for a list to be
>accidentally archived contrary to the list administrator's wish. It sounds
>scary to me not to have any indication whatsoever in the web
Ah, sorry, yes the two booleans above should be exposed in the REST API.
I've marked it 'easy' and I think it would be.
I'm not sure whether you're asking if the ini-file settings should be exposed
in the REST API. I'm much less certain about that. I think it wouldn't be
difficult to expose them read-only, but I'm hesitant to let REST clients
change mailman.cfg variables, especially because having them take effect would
require a restart.
>Along similar lines, there seems opportunity for confusion if there are
>two independent mechanisms for archival; site wide configuration and
>also manually subscribing an archival subscriber such as
>archive(a)mail-archive.com. I can imagine someone turning off just
>one of these two mechanisms then being surprised that it had
>no practical effect.
Hopefully the above explains things. The mail-archive implementation of
IArchiver does just email the address specified in
[archiver.mail_archive]recipient, but it won't do this if the mailing list's
.archive_private setting is True.
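
How the site-wide and per-list settings might combine for the mail-archive case can be sketched as below. The two attribute names come from the discussion above; the `ListSettings` class, the function name, and the exact order of the checks are assumptions for illustration, not Mailman's actual code.

```python
from dataclasses import dataclass


@dataclass
class ListSettings:
    """Stand-in for the two list-level flags discussed above."""
    archive: bool
    archive_private: bool


def should_send_to_mail_archive(mlist, archiver_enabled):
    """Decide whether a post is emailed to the external archive address."""
    if not archiver_enabled:      # site-wide 'enable' variable is off
        return False
    if not mlist.archive:         # the list does not archive at all
        return False
    # Private lists are never forwarded to the public service.
    return not mlist.archive_private


public_list = ListSettings(archive=True, archive_private=False)
private_list = ListSettings(archive=True, archive_private=True)
unarchived_list = ListSettings(archive=False, archive_private=False)
```

With this shape, turning off either the site-wide switch or the list's own `archive` flag is enough to stop forwarding, which is the behaviour a list admin would most likely expect.
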
Suggestions for better integration are welcome.
>Finally, it sounds like there are architectural reasons for having
>archiving a site-wide configuration. But I do think list admins would
>appreciate some sort of per-list GUI option, to easily distinguish
>between public and private lists. These are often different folks
>from the sysadmin who can apt-get install mailman without
>giving a first glance at the mailman.cfg file.
Hopefully the above explains the state of things. The system needs to know
about all the available archivers, but list admins do have some small amount
of control over whether their list gets archived or not.
Thanks for the super-detailed replies... I'm separating these discussions,
so here I have some questions about licensing and bundling..
> So in some sense, CSLA needn't become *the* Mailman archiver, but it should
> definitely be *a* Mailman archiver. Then you can make all the engineering
> design decisions you prefer, but with the confidence that it will Just Work
> with Mailman 3.
Sure, but this isn't why I'm here. CSLA is already *a* mailman archiver. I
think we first released it in 2004. A few of us ex-egroups folks hacked it
out because we used it for private projects. We open-sourced it so we could
use it across organizational boundaries and because we were happy to give
it away to anyone who wanted it. We're just all primarily focused on
startup and commercial endeavors, so we haven't done much to package and
Right now I'm in between entrepreneurial endeavors and spending some time
'giving back' and coding/donating-to/helping several open-source projects.
As I engage with these projects, all of them are using Mailman, which is
fantastic. However, nearly all of them are also using pipermail, which is
not so great. They are using it because it's the default, so it was easy.
I started to talk to one of them about installing CSLA (or MHonArc, or
anything really), and realized I should see if you folks are interested in
a great bundled archiver, to fix the problem at the source. I'm not
particularly interested in promoting or maintaining an open-source project
around this, so if you folks don't want a shiny new (S-BSD licensed)
archiver to bundle, I'll probably just fix a few things, bump the CSLA
archiver to 0.3 and move on.
I admit that even with a pretty good knowledge of these many licenses, I'm
not familiar with the intricacies of FSF copyright assignment and non-GPL
The ClearsilverArchiver code (written by me and two others) is released
under the "Simplified BSD" license and "totally free". It's important to me
that any code I release be similarly free-and-unrestricted
(i.e. BSD/Python/Artistic/PublicDomain), not free under certain conditions
(i.e. GPL/LGPL). It's not possible to assert GPL restrictions on
totally-free code, because it's already totally free.
FSF says S-BSD is GPL-Compatible, which I believe means they are saying
they have no problem with GPL code depending on and being combined with
(i.e. linked with) S-BSD code, because the S-BSD code is fully open-source
and does not put restrictions on the use of the GPL code.
It's also my understanding that the primary reason for FSF copyright
assignment is to provide a coherent entity to enforce the terms of the GPL
by challenging violators who don't redistribute source.... something which
is not necessary for S-BSD. (Though I suppose they could enforce that folks
include the S-BSD copyright notices.)
So I guess this all drives to the following question:
Is the Mailman team interested in having a better built-in archiver that is
included in the distribution, but licensed under the less-restrictive S-BSD
Sorry for the length. This license stuff can be complicated.
On a weirdly unrelated coincidence, thanks for smtpd.py. I just hacked it
into smtp-to-maildir for a "private hosted webmail" installation. We were
migrating code/data to some new machines and smtpd.py seemed simpler than
fighting with qmail-installation or configuring postfix to "accept
everything" (something it doesn't seem designed to do).
On 26.03.12 19:03, Ana Cutillas wrote:
> my name is Ana Cutillas and I am a senior Computer Science student from
> Spain. I am really interested in working on the Mailman project either with
> you directly or with Systers.
> I have been reading the list of ideas to implement and I am very interested
> in the #6 Creating user profiles (
> http://blog.linuxgrrl.com/2012/03/13/mailman-brainstorm/). I have been
> wanting to work on a project that involved data mining for a while now and
> I think this could be a good opportunity.
This would definitely be a very interesting GSOC project! It might
involve working on a couple of different ends of the mailman family,
like the django web ui (launchpad.net/postorius), the archiver/searcher
(see "hyperkitty" - Toshio Kuratomi probably has more details) and maybe
even the Mailman3 core (see: launchpad.net/mailman).
> In general, I think profiles should have the really straightforward
> information: last time they started a conversation, last time they sent an
> email to the list, when did they sign up, what time of the day are they
> more active, etc. But it should be fairly easy to add cool stuff like, in
> case a list allows the use of more than one language, the language the user
> uses the most and maybe even percentages of usage, and with some
> information retrieval we could get keywords to know what they like to talk
> about the most.
Yes, those would all be very interesting pieces of data.
> I like some of the other ideas too, so I can talk to you about them if you
> want to.
Of course! I think a good place to start would be to install mailman and
postorius and have a look at the code. Also, Toshio could probably tell
you a little bit more about the current work on the archiver. You can
also find us on irc (#mailman on freenode, my handle is florianf,
Toshio's is abadger1999) if you run into problems or have any questions.
On Mar 24, 2012, at 03:19 AM, vikash agrawal wrote:
>I am Vikash and very much interested in contributing to mailman and being a
>GSoC student this year. So far, I have successfully installed mailman in my
>I do have skills in Python 2.7, but as I am very new to mailman I am
>looking for something small to hack on that is doable this summer. Also, the
>ideas page does not mention the skills required for each project, so it's
>somewhat difficult for me to choose one. I would therefore like you to guide
>me on this. I am willing to learn a lot this summer :-)
Fantastic! This is the right place to ask questions. Also many of us hang
out on IRC using the freenode channel #mailman.
Note that I've started to tag bugs in the tracker with 'easy' if I think they
are (but I could be wrong :). So you could search bugs.launchpad.net/mailman
for the 'easy' and 'mailman3' tags to find things to get started with.
On Tue, Mar 27, 2012 at 03:01:41PM +0530, Shayan Md wrote:
> On Tue, Mar 27, 2012 at 1:11 AM, Toshio Kuratomi <a.badger(a)gmail.com> wrote:
> Is this integration to be done with mailman2 or mailman3?
> In mailman3, the archivers are separated from the mailman core.
> I was working on mm3. But systers' indexer/searcher was implemented for
> mailman2. So it must be easy to integrate it with mm2.
> Looks like archiver for mm3 is still in development stage. As far as I
> understand searcher depends on the archiver, right? Not completely but it
> somewhat depends on archiver. I am not sure if searcher can be implemented
> without archiver. If possible I can implement for mm3 also.
The searcher wouldn't be much use without an archiver. There is a sample
archiver in mailman core -- if enabled, it stores the messages to lists in
maildirs. It does not have a frontend for retrieving or otherwise
displaying the archives.
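
For readers unfamiliar with the format, storing a posted message in a per-list maildir can be done with Python's standard `mailbox` module, roughly as below. This is only a sketch of the idea; the helper name and directory layout are assumptions and not the sample archiver's actual code.

```python
import mailbox
import os
import tempfile
from email.message import Message


def archive_to_maildir(root, list_name, msg):
    """Add msg to the maildir for list_name under root; return its key."""
    path = os.path.join(root, list_name)
    # create=True makes the maildir (with tmp/new/cur) if it is missing.
    box = mailbox.Maildir(path, create=True)
    try:
        return box.add(msg)
    finally:
        box.close()


# Usage with a throwaway directory and a hand-built message.
root = tempfile.mkdtemp()
msg = Message()
msg['Message-ID'] = '<abc123@example.com>'
msg['Subject'] = 'Hello'
msg.set_payload('First post\n')

key = archive_to_maildir(root, 'test@example.com', msg)
stored = mailbox.Maildir(os.path.join(root, 'test@example.com'))[key]
```

A searcher or web frontend would then read from the same maildir rather than keeping its own copy of each message.
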
In some of the other threads here, it was brought up that a builtin archiver
with the same feature set of pipermail could be desirable. If you're
interested in integrating search into mailman I'd watch (and participate!)
in that thread to see what the outcome of that discussion is.
Stephen J. Turnbull <stephen(a)xemacs.org> wrote:
> On Wed, Mar 28, 2012 at 4:21 AM, Terri Oda <terri(a)zone12.com> wrote:
> >> Looks like archiver for mm3 is still in development stage. As far as I
> >> understand searcher depends on the archiver, right? Not completely but it
> >> somewhat depends on archiver. I am not sure if searcher can be implemented
> >> without archiver. If possible I can implement for mm3 also.
> > Searcher and archiver are interdependent *if* we want to share caches and
> > data stores, which we probably do for any installation with larger archives
> > where storing 2 copies vs 4 of each message would make a difference. Plus,
> > many archive views may be basically searches "messages in the last month"
> > "messages which are replies to messageid $foo" etc.
> Actually, as far as I can see, the summary/search/index/retrieval
> functions depend only on the API for the message store. If you
> want, you can split this into the database layer and a presentation
> layer, of course. However, the database layer is surely going to
> have its own schema optimized for the kinds of retrieval its
> designer considers important. If the designer emphasizes
> threads, however, she is *not* going to try to store messages in
> thread order or anything like that. Rather, any reasonable store
> will be message-ID-addressable.
Right. UpLib has a 'message-store', which the threading code interacts
with to generate threads as data referring to document IDs. The
message-store API can take either message-IDs or UpLib document IDs and
> The only tricky issue is that we *do* have to worry about
> message-ID collisions of truly different messages and about
> messages without message IDs, especially for converted
> historical archives. So the API needs to be able to deal
> with these issues, probably by returning a set or sequence
> of messages.
Right. UpLib takes a message and creates multiple 'documents' (one for
the message, and one for each attachment), each of which has its own
unique 'doc ID', the assigned UpLib ID. In addition, the email is
assigned a 'mail-guid', which is calculated from some of the header
information and may also include the doc ID. The metadata of each
attachment refers back to the 'mail-guid' of the message it was part of.
Message-ID, mail-guid, and document ID are all separately indexed for
each document, and any of them can be searched on.
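
The lookup behaviour discussed above (a store-assigned unique ID per document, plus a Message-ID index that tolerates collisions and missing IDs by returning a sequence) can be sketched like this. It is a hypothetical in-memory illustration and does not represent UpLib's API or any Mailman interface; all names are invented.

```python
from collections import defaultdict
from email.message import Message
from itertools import count


class MessageStore:
    """Toy store addressable by store-assigned doc ID or by Message-ID."""

    def __init__(self):
        self._docs = {}                          # doc ID -> message
        self._by_message_id = defaultdict(list)  # Message-ID -> [doc IDs]
        self._next_id = count(1)

    def add(self, msg):
        # Every message gets a unique doc ID, even without a Message-ID.
        doc_id = 'doc-{}'.format(next(self._next_id))
        self._docs[doc_id] = msg
        message_id = msg['message-id']
        if message_id is not None:
            self._by_message_id[message_id.strip()].append(doc_id)
        return doc_id

    def by_doc_id(self, doc_id):
        return self._docs[doc_id]

    def by_message_id(self, message_id):
        # Always a list: empty if unknown, several entries on a collision.
        return [self._docs[d] for d in self._by_message_id.get(message_id, [])]


def make_message(message_id, body):
    msg = Message()
    if message_id is not None:
        msg['Message-ID'] = message_id
    msg.set_payload(body)
    return msg


store = MessageStore()
first = store.add(make_message('<dup@example.com>', 'one'))
second = store.add(make_message('<dup@example.com>', 'a different message'))
orphan = store.add(make_message(None, 'no Message-ID at all'))
```
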
> Oh, and we probably ought to have a more general notion
> of retrievable "object" rather than just messages, as some
> archive/retrieval backends may store some types of MIME
> part separately. Hopefully these would be presented to
> us as MIME parts with external bodies and content IDs.
Here's how I do it. In UpLib, a multipart email is analyzed into a
message plus possible attachments. The parts that are the 'message' are
unified and presented as a document. The parts that are attachments are
broken out and processed as independent documents, iconified links to
which are then put back into the 'message' document.
See http://uplib.parc.com/misc/noguchi.png for an example of the UpLib
reader, ReadUp, showing a plain-text email with an attached PDF file.
Most of the things that can be links there (like "Reply" or the email
addresses or my name or the URL or the attachment icon and name) are in
> And that's all we want to say about the archiver and the
> associated message-retrieval logic, I think. (In fact, it occurs to
> me that maybe we should say "RFC 3501" and be done with
> it. I don't mean that we necessarily implement IMAP protocol
> per se, but some subset of its functionality probably is what we
> need from an archiver.)
Yes, there's an IMAP server that runs in UpLib, and can export any
document via IMAP (including archived email). Though it currently
doesn't scale well; I need to re-write it with Tornado, too.