importing archived Maildir email lists into Mailman lists archives
Hi everybody!
I'm new to most of the fields related to this issue, so please accept my apologies in advance if the question, or the way to its answer, is too obvious!
I have a number of Maildir format mailing lists archives and I would like to add them to the archives of the same lists created in a new Mailman installation.
I have found several messages and scripts that do the job the other way round (mbox/pipermail/Mailman to Maildir), but nothing for Maildir to Mailman that a user without programming skills could manage.
Thanks for your time!!!
-- Ricardo Rodríguez Research Management and Promotion Technician Technical Secretariat Health Research Institute of Santiago de Compostela (IDIS) http://www.idisantiago.es
On 08/28/2015 12:52 AM, [IDIS Technical Secretariat] Ricardo Rodríguez wrote:
I have a number of Maildir format mailing lists archives and I would like to add them to the archives of the same lists created in a new Mailman installation.
This search <https://www.google.com/?gws_rd=ssl#q=convert+maildir+to+mbox> should return something useful.
-- Mark Sapiro <mark@msapiro.net>, San Francisco Bay Area, California. "The highway is for gamblers, better use your sense" - B. Dylan
Thanks Mark, all,
On Sun, Aug 30, 2015 at 4:09 AM, Mark Sapiro <mark@msapiro.net> wrote:
On 08/28/2015 12:52 AM, [IDIS Technical Secretariat] Ricardo Rodríguez wrote:
I have a number of Maildir format mailing lists archives and I would like to add them to the archives of the same lists created in a new Mailman installation.
This search <https://www.google.com/?gws_rd=ssl#q=convert+maildir+to+mbox> should return something useful.
This has been more or less my entry point to this issue. Perhaps I haven't explained the issue correctly, or I'm missing something. What I'm looking to move into mbox is a set of archives of Maildir mailing lists. Those archives don't contain the new, cur and tmp folders that all the scripts I've found look for. This is the structure of each archived email list folder I have here...
Ricardo-Rodriguezs-Mac-Pro:r.users rrodriguez$ ls
Log       bounce   digissue   headerremove  lock         mod      outlocal  remote
allow     bouncer  dignum     indexed       lockbounce   modsub   owner     subscribers
archive   config   editor     inlocal       mailinglist  num      prefix    text
archived  digest   headeradd  key           manager      outhost  public    tstdig
Within /archive, there are two folders, 0 and 1, with a number of files, each of them containing one message, and an index file.
Please, does this make sense to you? Am I completely lost? How could I deal with these "archives" to move them to mbox files and get them imported into our brand new Mailman server?
Thank you very much for your help!
Ricardo
-- Ricardo Rodríguez Research Management and Promotion Technician Technical Secretariat Health Research Institute of Santiago de Compostela (IDIS) http://www.idisantiago.es
On 08/31/2015 05:58 AM, [IDIS Technical Secretariat] Ricardo Rodríguez wrote:
This has been more or less my entry point to this issue. Perhaps I've not been able to explain the issue correctly or I'm missing something. What I'm looking to move into mbox is a set of archives of Maildir mailing lists.
Why do you refer to these archives as being of Maildir mailing lists? What is the actual software that created them.
Those archives don't contain the new, cur and tmp folders that all the scripts I've found look for. This is the structure of each archived email list folder I have here...
Ricardo-Rodriguezs-Mac-Pro:r.users rrodriguez$ ls
Log       bounce   digissue   headerremove  lock         mod      outlocal  remote
allow     bouncer  dignum     indexed       lockbounce   modsub   owner     subscribers
archive   config   editor     inlocal       mailinglist  num      prefix    text
archived  digest   headeradd  key           manager      outhost  public    tstdig
Within /archive, there are two folders, 0 and 1, with a number of files, each of them containing one message, and an index file.
This looks vaguely like Maildir, except Maildir has no index file. What is in the index file(s)? What is the difference between 0 and 1? Do they each contain some complete (including all headers) messages, i.e. each contains a set of messages that combined are all the messages, or does one of them contain message headers and the other bodies, or does one contain messages and the other contain metadata about the messages?
Please, does this make sense to you? Am I completely lost? How could I deal with these "archives" to move them to mbox files and get them imported into our brand new Mailman server?
You just need to add a *nix From_ separator to the beginning of each complete message and concatenate them.
If you can show me what's actually in some of those files, I could possibly create a conversion script for you.
It could be as simple as something like
cat archives/0/* archives/1/* > mbox
except for the index files, but probably not.
-- Mark Sapiro <mark@msapiro.net>, San Francisco Bay Area, California. "The highway is for gamblers, better use your sense" - B. Dylan
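A slightly more careful variant of the one-line cat idea might look like the sketch below. It is only an illustration under stated assumptions: every file other than index is taken to be one complete message, the function name dir_to_mbox is hypothetical, and the From_ line uses a placeholder sender and date (a real From_ line normally carries the envelope sender and the delivery date). Body lines that begin with "From " are escaped with ">" so they are not mistaken for separators.

```shell
#!/bin/sh
# Sketch: concatenate one-message-per-file directories into an mbox.
# Assumptions (not from the thread): every non-index file is a complete
# message, and a placeholder From_ line is acceptable as a separator.

dir_to_mbox() {
    # usage: dir_to_mbox DIR... > output.mbox
    for dir in "$@"; do
        for msg in "$dir"/*; do
            # Skip anything that is not a regular file.
            [ -f "$msg" ] || continue
            # Skip the per-directory index file.
            [ "$(basename "$msg")" = "index" ] && continue
            # Placeholder separator line; sender and date are made up.
            echo "From MAILER-DAEMON Thu Jan  1 00:00:00 1970"
            # Escape lines that look like separators while copying the message.
            sed 's/^From />From /' "$msg"
            # Blank line between messages.
            echo
        done
    done
}
```

It could be called as "dir_to_mbox archive/0 archive/1 > list.mbox". Files are taken in the shell's glob order, so the message order should be checked before relying on the result.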
Thanks for the answer! Please, read below...
On Tue, Sep 1, 2015 at 3:12 AM, Mark Sapiro <mark@msapiro.net> wrote:
On 08/31/2015 05:58 AM, [IDIS Technical Secretariat] Ricardo Rodríguez wrote:
This has been more or less my entry point to this issue. Perhaps I've not been able to explain the issue correctly or I'm missing something. What I'm looking to move into mbox is a set of archives of Maildir mailing lists.
Why do you refer to these archives as being of Maildir mailing lists? What is the actual software that created them.
The original sentence I got from the provider said, translated to English: the format is vpopmail (qmail - courier). I must admit that I'm not able to reconstruct the path that led me to say that we were using Maildir. Ezmlm was the actual software that created those lists. Both regular maildir account subfolders and ezmlm subfolders are stored in the same folder in the archive structure. I think this is why I assumed all of them were Maildir subfolders.
Those archives don't contain the new, cur and tmp folders that all the scripts I've found look for. This is the structure of each archived email list folder I have here...
Ricardo-Rodriguezs-Mac-Pro:r.users rrodriguez$ ls
Log       bounce   digissue   headerremove  lock         mod      outlocal  remote
allow     bouncer  dignum     indexed       lockbounce   modsub   owner     subscribers
archive   config   editor     inlocal       mailinglist  num      prefix    text
archived  digest   headeradd  key           manager      outhost  public    tstdig
Within /archive, there are two folders, 0 and 1, with a number of files, each of them containing one message, and an index file.
This looks vaguely like Maildir, except Maildir has no index file. What is in the index file(s)? What is the difference between 0 and 1? Do they each contain some complete (including all headers) messages, i.e. each contains a set of messages that combined are all the messages, or does one of them contain message headers and the other bodies, or does one contain messages and the other contain metadata about the messages?
Here you can access the whole structure of one of the subfolders...
http://datasource.idisantiago.es/r.users/
Here the indexes of subfolders 0 and 1 (ISO Latin 1 encoding required for a correct rendering of all the characters)...
http://datasource.idisantiago.es/r.users/archive/0/index
http://datasource.idisantiago.es/r.users/archive/1/index
It looks like each subfolder in the row within /archive holds a maximum of 100 messages numbered from 0 to 99.
Each file within subfolders 0 and 1 contains a complete message, including attachments (this is a guess, as file sizes range from a few KB to several MB).
Please, does this make sense to you? Am I completely lost? How could I deal with these "archives" to move them to mbox files and get them imported into our brand new Mailman server?
You just need to add a *nix From_ separator to the beginning of each complete message and concatenate them.
If you can show me what's actually in some of those files, I could possibly create a conversion script for you.
The link above gives access to a whole subfolder!
It could be as simple as something like
cat archives/0/* archives/1/* > mbox
except for the index files, but probably not.
I found this...
[Mailman-Users] Moving to mailman from ezmlm https://mail.python.org/pipermail/mailman-users/2003-November/032591.html
[Mailman-Users] Migrate from EZMLM to Mailman https://mail.python.org/pipermail/mailman-users/2012-November/074408.html
But I'm afraid I have not been able to find any message with more detailed information.
Once again: thank you for your help and, please, accept my apologies if I'm not able to provide accurate and clear information about the issue: at least for me, it is by no means easy to sail this sea full of different standards!
All the best,
Ricardo
-- Ricardo Rodríguez Research Management and Promotion Technician Technical Secretariat Health Research Institute of Santiago de Compostela (IDIS) http://www.idisantiago.es
[IDIS Technical Secretariat] Ricardo Rodríguez wrote:
[...]
Ricardo-Rodriguezs-Mac-Pro:r.users rrodriguez$ ls
Log bounce digissue headerremove lock mod outlocal remote
allow bouncer dignum indexed
That's Ezmlm. Have you tried something like this:
http://www.arctic.org/~dean/scripts/ezmlm2mbox
Get your Ezmlm archives to Mbox then that can be used as the basis for the Mailman archive.
Andrew.
Thanks! It worked like a charm for at least two of our old mailing lists! The script, the ezmlm structure and the mbox output are stored at...
http://datasource.idisantiago.es
... for easier reference.
Now I only have to import the mbox contents into the actual list. I guess this is the right path...
5.1. How do I import an archive into a new mailing list? http://wiki.list.org/DOC/How%20do%20I%20import%20an%20archive%20into%20a%20n...
Thank you so much for all your support!
Cheers,
Ricardo
-- Ricardo Rodríguez Research Management and Promotion Technician Technical Secretariat Health Research Institute of Santiago de Compostela (IDIS) http://www.idisantiago.es
[IDIS Technical Secretariat] Ricardo Rodríguez wrote:
[...]
Thanks! It worked like a charm for at least two of our old mailing lists! The script, the ezmlm structure and the mbox output are stored at...
http://datasource.idisantiago.es
... for easier reference.
Now I only have to import the mbox contents into the actual list. I guess this is the right path...
5.1. How do I import an archive into a new mailing list? http://wiki.list.org/DOC/How%20do%20I%20import%20an%20archive%20into%20a%20n...
Yup, this is similar to what I did. I recommend running cleanarch on that mbox before importing it, to ensure that it is ok. I took the second option in that FAQ, as it was a new list. Copy the Mbox file into the private directory after running it through cleanarch, then run the arch command to generate the HTML archives for the web interface.
Andrew.
Thanks! Please, read below...
On Wed, Sep 2, 2015 at 1:03 PM, Andrew Hodgson <andrew@hodgsonfamily.org> wrote:
[IDIS Technical Secretariat] Ricardo Rodríguez wrote:
[...]
Thanks! It worked like a charm for at least two of our old mailing lists! The script, the ezmlm structure and the mbox output are stored at...
http://datasource.idisantiago.es
... for easier reference.
Now I only have to import the mbox contents into the actual list. I guess this is the right path...
5.1. How do I import an archive into a new mailing list?
http://wiki.list.org/DOC/How%20do%20I%20import%20an%20archive%20into%20a%20n...
Yup, this is similar to what I did. I recommend running cleanarch on that mbox before importing it, to ensure that it is ok. I took the second option in that FAQ, as it was a new list. Copy the Mbox file into the private directory after running it through cleanarch, then run the arch command to generate the HTML archives for the web interface.
Andrew.
Before going ahead, there is one thing I'm curious about as I don't understand it. Please, why does the path to the mbox file include <listname>.mbox twice? Here...
http://wiki.list.org/DOC/How%20do%20I%20import%20an%20archive%20into%20a%20n...
It reads...
archives/private/<listname>.mbox/<listname>.mbox
Thanks!
Ricardo
-- Ricardo Rodríguez Research Management and Promotion Technician Technical Secretariat Health Research Institute of Santiago de Compostela (IDIS) http://www.idisantiago.es
[IDIS Technical Secretariat] Ricardo Rodríguez wrote:
Before going ahead, there is one thing I'm curious about as I don't understand it. Please, why does the path to the mbox file include <listname>.mbox twice? Here...
http://wiki.list.org/DOC/How%20do%20I%20import%20an%20archive%20into%20a%20n...
It reads...
archives/private/<listname>.mbox/<listname>.mbox
Yup, the file needs to be called listname.mbox in the directory ~/archives/private/listname.mbox. So for example, to copy the mailman-users archive into your home directory, assuming Mailman was installed in the default location, you would enter a command like:
cp /usr/local/mailman/archives/private/mailman-users.mbox/mailman-users.mbox ~
That would give you a file called mailman-users.mbox in your home directory with all the archives in mbox format.
If your list is new, then after running cleanarch on that mbox file I would copy it to ~/archives/private/listname.mbox/listname.mbox, ensure perms are correct by running check_perms, then run arch with the wipe parameter to build all the HTML indexes for the web pages.
If you have a list already, I would stop Mailman, then append the existing mbox file at ~/archives/private/listname.mbox/listname.mbox to a copy of your converted archives. I would then run cleanarch on the resulting file and put it in place at ~/archives/private/listname.mbox/listname.mbox, before checking perms and running arch with the wipe parameter.
Hope this clears things up for you. Andrew.
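The "new list" steps above could be sketched roughly as follows. This is an illustration, not a tested procedure: the MAILMAN_PREFIX default of /usr/local/mailman, the helper names run and import_new_list_archive, and the DRY_RUN switch are all assumptions made for the example, and packaged installations often keep bin/ and archives/ elsewhere.

```shell
#!/bin/sh
# Sketch of the import steps for a new list, under the assumptions that
# Mailman 2.1 lives under $MAILMAN_PREFIX and that the caller has enough
# privileges to write there. With DRY_RUN=1 commands are only printed.

MAILMAN_PREFIX="${MAILMAN_PREFIX:-/usr/local/mailman}"

run() {
    # Print the command; execute it only when not in dry-run mode.
    echo "+ $*"
    [ "${DRY_RUN:-0}" = "1" ] || "$@"
}

import_new_list_archive() {
    # usage: import_new_list_archive LISTNAME CONVERTED_MBOX
    list="$1"
    src="$2"
    dest="$MAILMAN_PREFIX/archives/private/$list.mbox/$list.mbox"
    # cleanarch is a stdin/stdout filter, so it needs shell redirection.
    run sh -c "$MAILMAN_PREFIX/bin/cleanarch < '$src' > '$src.clean'"
    # Put the cleaned mbox where the archiver expects it.
    run cp "$src.clean" "$dest"
    # Verify ownership and permissions of the installation.
    run "$MAILMAN_PREFIX/bin/check_perms"
    # Rebuild the HTML archive from scratch.
    run "$MAILMAN_PREFIX/bin/arch" --wipe "$list"
}
```

Running it first with DRY_RUN=1 set, for example "import_new_list_archive r.users r.users.all.mbox", shows the four commands without touching anything.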
Thanks Andrew and Mark! As a general rule of thumb, I usually avoid replying to a message without reflecting on its content for some hours! Sorry for asking things with an obvious answer: I was using an sftp client to access /private and didn't realise that there is a folder, and then a file, with the same name: listname.mbox. And thanks for the explanation!
Please, read below...
On Wed, Sep 2, 2015 at 1:57 PM, Andrew Hodgson <andrew@hodgsonfamily.org> wrote:
[IDIS Technical Secretariat] Ricardo Rodríguez wrote:
Before going ahead, there is one thing I'm curious about as I don't understand it. Please, why does the path to the mbox file include <listname>.mbox twice? Here...
http://wiki.list.org/DOC/How%20do%20I%20import%20an%20archive%20into%20a%20n...
It reads...
archives/private/<listname>.mbox/<listname>.mbox
Yup, the file needs to be called listname.mbox in the directory ~/archives/private/listname.mbox. So for example, to copy the mailman-users archive into your home directory, assuming Mailman was installed in the default location, you would enter a command like:
cp /usr/local/mailman/archives/private/mailman-users.mbox/mailman-users.mbox ~
That would give you a file called mailman-users.mbox in your home directory with all the archives in mbox format.
If your list is new, then after running cleanarch on that mbox file I would copy it to ~/archives/private/listname.mbox/listname.mbox, ensure perms are correct by running check_perms, then run arch with the wipe parameter to build all the HTML indexes for the web pages.
If you have a list already, I would stop Mailman, then append the existing mbox file at ~/archives/private/listname.mbox/listname.mbox to a copy of your converted archives. I would then run cleanarch on the resulting file and put it in place at ~/archives/private/listname.mbox/listname.mbox, before checking perms and running arch with the wipe parameter.
Hope this clears things up for you. Andrew.
It worked! Thank you very much! I've learnt a lot, about a lot of things! Together with a great program I've discovered a great community. Who knows? Perhaps in the future I'll be able to contribute in some way :-) Here we have our first "complete" list, consolidating the former ezmlm-managed contents with our brand new Mailman list of the same name...
http://lists.idisantiago.es/pipermail/r.users/
Still, I have a question about the syntax of Mailman commands. I do need to do...
[root@idis2 r.users.mbox]# cleanarch <r.users.all.mbox> r.users.all.clean.mbox
While <> are not required in command arch...
[root@idis2 r.users.mbox]# arch --wipe r.users r.users.mbox
If I don't use <> to enclose the input file name in cleanarch, I get the help page!
The usage lines of both commands, as per their help pages, are quite similar...
Usage: cleanarch [options] < inputfile > outputfile
Usage: /usr/lib/mailman/bin/arch [options] <listname> [<mbox>]
Also, arch alone shows its help page whereas cleanarch alone does "nothing". Those usage lines and command behaviour aren't too consistent, are they? Could this behaviour be caused by my local configuration?
Thanks!!!
Ricardo
-- Ricardo Rodríguez Research Management and Promotion Technician Technical Secretariat Health Research Institute of Santiago de Compostela (IDIS) http://www.idisantiago.es
On 09/02/2015 12:27 PM, [IDIS Technical Secretariat] Ricardo Rodríguez wrote:
Still, I have a question about the syntax of Mailman commands. I do need to do...
[root@idis2 r.users.mbox]# cleanarch <r.users.all.mbox> r.users.all.clean.mbox
While <> are not required in command arch...
[root@idis2 r.users.mbox]# arch --wipe r.users r.users.mbox
If I don't use <> to enclose the input file name in cleanarch, I get the help page!
You are not 'enclosing' the input file name in cleanarch with <>. cleanarch reads the input mailbox from stdin and writes the output mbox to stdout. You are actually saying '<r.users.all.mbox' which redirects stdin from the terminal to the file r.users.all.mbox and '> r.users.all.clean.mbox' which redirects stdout from the terminal to the file r.users.all.clean.mbox.
Note that you didn't need to do this as the <http://www.arctic.org/~dean/scripts/ezmlm2mbox> script already writes a 'clean' mbox.
On the other hand if you give a second argument to bin/arch, it assumes that is the mbox. It doesn't read stdin in any case and writes various progress info to stdout.
The usage lines of both commands, as per their help pages, are quite similar...
Usage: cleanarch [options] < inputfile > outputfile
This means use shell redirection to read the input from inputfile and write the output to outputfile.
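As a small sketch of that point, the following uses tr as a stand-in for any stdin-to-stdout filter such as cleanarch (cleanarch itself is not needed to see the effect). The function name upcase_filter and the file names are made up for the example. The shell opens the files and strips the < and > operators before the program starts, so spacing around them makes no difference.

```shell
#!/bin/sh
# Sketch: "<" and ">" are shell redirections, not part of a file name.
# tr stands in for a filter that reads stdin and writes stdout.

upcase_filter() {
    # Takes no file arguments at all; it only sees stdin and stdout.
    tr 'a-z' 'A-Z'
}

demo_dir=$(mktemp -d)
printf 'hello mbox\n' > "$demo_dir/input.txt"

# Written the way it appeared in the question: looks like "enclosing",
# but is really "< input" followed by "> output".
upcase_filter <"$demo_dir/input.txt"> "$demo_dir/output.txt"

# The same command with conventional spacing.
upcase_filter < "$demo_dir/input.txt" > "$demo_dir/output2.txt"

cat "$demo_dir/output.txt"
```

Both output files end up identical, which is why the "enclosed" form happened to work.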
Usage: /usr/lib/mailman/bin/arch [options] <listname> [<mbox>]
Here the notation <listname> means that is a variable to be replaced with the actual list name and [<mbox>] means that <mbox> is a variable to be replaced with the actual mbox, but the [] mean it's optional - if not provided it is computed as archives/private/<listname>.mbox/<listname>.mbox.
Also, arch alone shows its help page
because arch without at least a <listname> argument is invalid
whereas cleanarch alone does "nothing".
Actually cleanarch alone is valid and both reads its input from and writes its output to the terminal. That's why if you type just 'cleanarch', it doesn't return immediately to a command prompt but waits for input from the terminal.
Consider
mark@msapiro:~$ /var/MM/21/bin/cleanarch
- it's waiting for input - I type
From someone
- it responds
>From someone
- to stdout, and
Unix-From line changed: 1
From someone
- to stderr. Then I type ^D (control-D - end of file) and it responds
0 messages found
- and exits.
Those usage lines and command behaviour aren't too consistent, are they? Could this behaviour be caused by my local configuration?
No. They are caused by two different programs written by at least two different people. So yes, they aren't exactly consistent, but if you understand the different commands and the shell redirection, it may make more sense.
True, cleanarch could have been written to require an input file argument instead of using stdin, but using stdin and stdout makes it easy to insert it into a pipeline of commands as in for example
ezmlm2mbox [-d] ezmlm_dir | cleanarch | arch --wipe <listname> -
except for the fact that arch doesn't read its stdin even if the conventional '-' is used as the mbox file name.
-- Mark Sapiro <mark@msapiro.net>, San Francisco Bay Area, California. "The highway is for gamblers, better use your sense" - B. Dylan
It does make sense indeed!
I do hope to be able to take into account all these details when posting my next question to this list. In the meantime, I keep learning! Thank you so much for your time!
-- Ricardo Rodríguez Research Management and Promotion Technician Technical Secretariat Health Research Institute of Santiago de Compostela (IDIS) http://www.idisantiago.es
On 09/02/2015 04:30 AM, [IDIS Technical Secretariat] Ricardo Rodríguez wrote:
Before going ahead, there is one thing I'm curious about as I don't understand it. Please, why does the path to the mbox file include <listname>.mbox twice? Here...
http://wiki.list.org/DOC/How%20do%20I%20import%20an%20archive%20into%20a%20n...
It reads...
archives/private/<listname>.mbox/<listname>.mbox
That's just the way Mailman does it. archives/private/ contains two directories per list. The archives/private/<listname>/ directory contains the full pipermail archive (various files and directories), and the archives/private/<listname>.mbox/ directory contains the cumulative archive mbox file, also named <listname>.mbox.
-- Mark Sapiro <mark@msapiro.net>, San Francisco Bay Area, California. "The highway is for gamblers, better use your sense" - B. Dylan
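The two-directories-per-list layout can be made concrete with a tiny sketch. The helper names and the default prefix /usr/local/mailman are assumptions for illustration only; the actual prefix depends on how Mailman was installed.

```shell
#!/bin/sh
# Sketch: build the two per-list archive paths described above.
# The default prefix is an assumption; pass the real one as argument 2.

pipermail_dir() {
    # usage: pipermail_dir LISTNAME [PREFIX]
    # Directory holding the generated HTML (pipermail) archive.
    echo "${2:-/usr/local/mailman}/archives/private/$1"
}

default_mbox_path() {
    # usage: default_mbox_path LISTNAME [PREFIX]
    # The cumulative mbox: a file named like its parent directory.
    echo "${2:-/usr/local/mailman}/archives/private/$1.mbox/$1.mbox"
}

default_mbox_path r.users
```

The printed path shows the directory <listname>.mbox followed by the file <listname>.mbox, which is where the repeated name comes from.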
[IDIS Technical Secretariat] Ricardo Rodríguez writes:
Thanks Mark, all,
On Sun, Aug 30, 2015 at 4:09 AM, Mark Sapiro <mark@msapiro.net> wrote:
On 08/28/2015 12:52 AM, [IDIS Technical Secretariat] Ricardo Rodríguez wrote:
I have a number of Maildir format mailing lists archives
No, you don't have Maildir, at least not "Maildir" as most of the Internet understands it. Here's what Dan Bernstein (the inventor or at least popularizer of Maildir) says:
Can a maildir contain more than tmp, new, cur?
Yes:
.qmail: used to do direct deliveries with qmail-local.
bulletintime: empty file, used by system-wide bulletin programs.
bulletinlock: empty file, used by system-wide bulletin programs.
seriallock: empty file, used to serialize AutoTURN.
Ricardo-Rodriguezs-Mac-Pro:r.users rrodriguez$ ls
Log bounce digissue headerremove lock mod outlocal remote
"Lock" -- no, this isn't Maildir. The whole point of Maildir is that you don't need locks because reading and writing are done in different directories, and changes happen atomically. (This can even work with editing.)
allow bouncer dignum indexed lockbounce modsub owner subscribers
archive config editor inlocal mailinglist num prefix text
archived digest headeradd key manager outhost public tstdig
Within /archive, there are two folders, 0 and 1, with a number of files, each of them containing one message, and an index file.
Please, does this make sense to you?
I don't recall anything like that. Please try to find an explanation of the structure in the system documentation, or ask the vendor. However, since you think they're "Maildir", probably what is meant is that they have a structure that is one message per file rather than many messages per file. You probably just need to figure out how to get the order of messages right, then concatenate the messages.
Most likely, all you need for each list are the archive folders and the single messages, and maybe the index file will be of some use depending on what it contains. If your documentation and/or the old vendor are of no help, see if you can find a whole message file you can send to us *as a file attachment* -- we want to see what headers are included (it probably doesn't really matter, though, except for the "Unix From_" which they probably don't have). If for privacy reasons you don't want to broadcast any message on a public mailing list, you can send it to Mark and me personally. Also it may be helpful to figure out the rule for the folders whose names are numbers: are they the leading digits where files are named 000 to 999? Are they months? Years? etc.
Am I completely lost?
No, of course not. Just don't delete anything until you're sure the new system is working. I did qualify everything with "probably", it may take a couple of guesses to get it right. :-)
Thanks for your reply! Please, read below!
On Tue, Sep 1, 2015 at 3:19 AM, Stephen J. Turnbull <stephen@xemacs.org> wrote:
[IDIS Technical Secretariat] Ricardo Rodríguez writes:
Thanks Mark, all,
On Sun, Aug 30, 2015 at 4:09 AM, Mark Sapiro <mark@msapiro.net> wrote:
On 08/28/2015 12:52 AM, [IDIS Technical Secretariat] Ricardo Rodríguez wrote:
I have a number of Maildir format mailing lists archives
No, you don't have Maildir, at least not "Maildir" as most of the Internet understands it. Here's what Dan Bernstein (the inventor or at least popularizer of Maildir) says:
Can a maildir contain more than tmp, new, cur?
Yes:
.qmail: used to do direct deliveries with qmail-local.
bulletintime: empty file, used by system-wide bulletin programs.
bulletinlock: empty file, used by system-wide bulletin programs.
seriallock: empty file, used to serialize AutoTURN.
Ricardo-Rodriguezs-Mac-Pro:r.users rrodriguez$ ls
Log bounce digissue headerremove lock mod outlocal remote
"Lock" -- no, this isn't Maildir. The whole point of Maildir is that you don't need locks because reading and writing are done in different directories, and changes happen atomically. (This can even work with editing.)
That's now much clearer to me! As stated in a previous message replying to one of Mark's posts, I made a mess by wrongly interpreting several messages from our service provider and some googled information. Ezmlm was behind the scenes. Sorry for the misinformation!
allow bouncer dignum indexed lockbounce modsub owner subscribers
archive config editor inlocal mailinglist num prefix text
archived digest headeradd key manager outhost public tstdig
Within /archive, there are two folders, 0 and 1, with a number of files, each of them containing one message, and an index file.
Please, does this make sense to you?
I don't recall anything like that. Please try to find an explanation of the structure in the system documentation, or ask the vendor. However, since you think they're "Maildir", probably what is meant is that they have a structure that is one message per file rather than many messages per file. You probably just need to figure out how to get the order of messages right, then concatenate the messages.
That's correct: messages are stored in separate files, organised in subfolders numbered from 0 onward: 0, 1... Each subfolder holds one hundred files, each of them containing a complete message. I think the global order could be given by the subfolder order plus the name of each file within a subfolder.
Mark's reply contains a simple line to do some concatenation. I'll play with this idea and report back!
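The ordering idea (subfolder number first, then file name) could be sketched like this. It assumes, as described above, that subfolders and message files have purely numeric names; the function name list_messages_in_order is hypothetical. Sorting numerically matters once there are ten or more subfolders, because a plain glob would list 10 before 2.

```shell
#!/bin/sh
# Sketch: list message files by numeric subfolder, then numeric file name.
# Assumption: subfolders and message files have purely numeric names.

list_messages_in_order() {
    # usage: list_messages_in_order ARCHIVE_DIR
    archive="$1"
    for path in "$archive"/*/*; do
        [ -f "$path" ] || continue
        name=$(basename "$path")
        sub=$(basename "$(dirname "$path")")
        # Keep only purely numeric names; this drops the index files.
        case "$sub$name" in
            *[!0-9]*) continue ;;
        esac
        # Emit sort keys followed by the path itself.
        echo "$sub $name $path"
    done | sort -k1,1n -k2,2n | cut -d' ' -f3-
}
```

The resulting list can then be fed to whatever does the concatenation, one path per line.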
Most likely, all you need for each list are the archive folders and the single messages, and maybe the index file will be of some use depending on what it contains. If your documentation and/or the old vendor are of no help, see if you can find a whole message file you can send to us *as a file attachment* -- we want to see what headers are included (it probably doesn't really matter, though, except for the "Unix From_" which they probably don't have). If for privacy reasons you don't want to broadcast any message on a public mailing list, you can send it to Mark and me personally. Also it may be helpful to figure out the rule for the folders whose names are numbers: are they the leading digits where files are named 000 to 999? Are they months? Years? etc.
Most lists are public, so privacy is not a concern! You can find here the complete folders' structure for one of the lists...
http://datasource.idisantiago.es/r.users/
Am I completely lost?
No, of course not. Just don't delete anything until you're sure the new system is working. I did qualify everything with "probably", it may take a couple of guesses to get it right. :-)
Thanks for your help!
Ricardo
-- Ricardo Rodríguez Research Management and Promotion Technician Technical Secretariat Health Research Institute of Santiago de Compostela (IDIS) http://www.idisantiago.es
participants (4)
- [IDIS Technical Secretariat] Ricardo Rodríguez
- Andrew Hodgson
- Mark Sapiro
- Stephen J. Turnbull