I am interested in counting the number of times a phrase has occurred on a mailing list in a given time period. For example, I want to know how many times the word "example" occurred from 2008 - 2009. Is it possible to do this?
Thanks,
David
- David Doria <daviddoria@gmail.com>:
I am interested in counting the number of times a phrase has occurred on a mailing list in a given time period. For example, I want to know how many times the word "example" occurred from 2008 - 2009. Is it possible to do this?
You could analyse the archives for that. Usually, the archives are stored as an mbox, so use your favourite tools on that.
-- Ralf Hildebrandt Geschäftsbereich IT | Abteilung Netzwerk Charité - Universitätsmedizin Berlin Campus Benjamin Franklin Hindenburgdamm 30 | D-12203 Berlin Tel. +49 30 450 570 155 | Fax: +49 30 450 570 962 ralf.hildebrandt@charite.de | http://www.charite.de
Hi Ralf,
Is it possible to download the entire archive as one file? I see that they are broken down into months: http://www.vtk.org/pipermail/vtkusers/
Thanks,
David
- David Doria <daviddoria@gmail.com>:
Is it possible to download the entire archive as one file? I see that they are broken down into months: http://www.vtk.org/pipermail/vtkusers/
wget is your friend.
wget -r -l1 --no-parent-nd -A.txt.gz http://www.vtk.org/pipermail/vtkusers/ or something like that
-- Ralf Hildebrandt
David Doria wrote:
[...]
Hi Ralf,
Is it possible to download the entire archive as one file? I see that they are broken down into months: http://www.vtk.org/pipermail/vtkusers/
To get the .mbox file directly you will need to access the server and download the file as a privileged user. It may be quite large due to the size of the archive.
Alternatively you could use a crawl tool on the archives page like Wget or similar if you don't have access to the host.
Thanks. Andrew.
- Ralf Hildebrandt <Ralf.Hildebrandt@charite.de>:
- David Doria <daviddoria@gmail.com>:
Is it possible to download the entire archive as one file? I see that they are broken down into months: http://www.vtk.org/pipermail/vtkusers/
wget is your friend.
wget -r -l1 --no-parent-nd -A.txt.gz http://www.vtk.org/pipermail/vtkusers/ or something like that
Sorry, there should be a space between --no-parent and -nd:
wget -r -l1 --no-parent -nd -A.txt.gz http://www.vtk.org/pipermail/vtkusers/
-- Ralf Hildebrandt
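Once the monthly files have been fetched with wget, one way to end up with a single file is to decompress and concatenate them. This is only a minimal sketch, assuming the downloads are gzip-compressed, mbox-style text files named `*.txt.gz` sitting in one directory; the function name and the paths are hypothetical.

```python
import glob
import gzip
import os
import shutil


def merge_monthly_archives(directory, output_path):
    """Decompress every *.txt.gz file in `directory` into one combined file.

    Returns the number of monthly files that were merged.
    """
    # Collect the file list before the output file is created.
    paths = sorted(glob.glob(os.path.join(directory, "*.txt.gz")))
    with open(output_path, "wb") as out:
        for path in paths:
            with gzip.open(path, "rb") as src:
                shutil.copyfileobj(src, out)
            # Keep a blank line between files so the next "From " line
            # still starts a new message.
            out.write(b"\n")
    return len(paths)
```

The files are merged in alphabetical order of their names, which is not chronological for names like `2010-December.txt.gz`. That does not matter for counting, since each message carries its own Date header.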
On Wed, Dec 01, 2010 at 10:43:37AM -0500, David Doria wrote:
Is it possible to download the entire archive as one file? I see that they are broken down into months: http://www.vtk.org/pipermail/vtkusers/
http://lists.example.org/mailman/private/example-list.mbox/example-list.mbox, perhaps?
(works for me).
-- "How can you make good ideas sound so bad?" "I'm an engineer." -- Scott Adams
Adam McGreggor wrote:
[...]
http://lists.example.org/mailman/private/example-list.mbox/example-list.mbox, perhaps?
(works for me).
Luckily that doesn't work here. I don't believe you should be able to download the mbox files directly like that.
Andrew.
Andrew Hodgson wrote:
David Doria wrote:
Is it possible to download the entire archive as one file? I see that they are broken down into months: http://www.vtk.org/pipermail/vtkusers/
To get the .mbox file directly you will need to access the server and download the file as a privileged user. It may be quite large due to the size of the archive.
You don't need shell/privileged access to the server to get the archive mbox. You just need to be a list member.
Authenticate for private archive access by going to a URL like <http://www.vtk.org/mailman/private/vtkusers/> and logging in. Then get <http://www.vtk.org/mailman/private/vtkusers.mbox/vtkusers.mbox>.
-- Mark Sapiro <mark@msapiro.net> The highway is for gamblers, San Francisco Bay Area, California better use your sense - B. Dylan
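The two steps Mark describes (log in to the private archive, then fetch the mbox) could be scripted roughly as follows. This is a sketch under assumptions, not a confirmed recipe: it assumes the login page accepts a POST with form fields named `username` and `password` and sets a session cookie, which should be checked against the actual login form. The function names are hypothetical.

```python
import http.cookiejar
import shutil
import urllib.parse
import urllib.request


def build_login_request(login_url, email, password):
    """Build the POST request for the private-archive login form.

    The field names "username" and "password" are an assumption;
    inspect the login page's HTML to confirm them.
    """
    form = urllib.parse.urlencode({"username": email, "password": password})
    return urllib.request.Request(login_url, data=form.encode("ascii"))


def download_mbox(login_url, mbox_url, email, password, output_path):
    """Log in, keep the session cookie, then save the mbox to output_path."""
    jar = http.cookiejar.CookieJar()
    opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(jar))
    # Step 1: authenticate; the cookie jar stores whatever cookie is returned.
    opener.open(build_login_request(login_url, email, password)).close()
    # Step 2: fetch the mbox with the same opener so the cookie is sent along.
    with opener.open(mbox_url) as response, open(output_path, "wb") as out:
        shutil.copyfileobj(response, out)
```

With the URLs from the message above, a call might look like `download_mbox("http://www.vtk.org/mailman/private/vtkusers/", "http://www.vtk.org/mailman/private/vtkusers.mbox/vtkusers.mbox", "you@example.org", "your-list-password", "vtkusers.mbox")`, where the address and password are placeholders for your own list membership details.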
Mark Sapiro wrote:
[...]
Andrew Hodgson wrote:
David Doria wrote:
Is it possible to download the entire archive as one file? I see that they are broken down into months: http://www.vtk.org/pipermail/vtkusers/
To get the .mbox file directly you will need to access the server and download the file as a privileged user. It may be quite large due to the size of the archive.
You don't need shell/privileged access to the server to get the archive mbox. You just need to be a list member.
Authenticate for private archive access by going to a URL like <http://www.vtk.org/mailman/private/vtkusers/> and logging in. Then get <http://www.vtk.org/mailman/private/vtkusers.mbox/vtkusers.mbox>.
Ah, I didn't realise that was possible. Please disregard my previous message on this subject. Authentication was the key point.
Andrew.
On Wed, Dec 01, 2010 at 04:02:51PM +0000, Andrew Hodgson wrote:
I don't believe you should be able to directly download the mbox files like that.
Why ever not? What's so different about grabbing one mbox at once, rather than however many iterations/scraping?
-- "They accused us of suppressing freedom of expression. This was a lie and we could not let them publish it." -- Nelba Blandon (Nicaraguan Interior Ministry Director of Censorship)
Adam McGreggor wrote:
[...]
On Wed, Dec 01, 2010 at 04:02:51PM +0000, Andrew Hodgson wrote:
I don't believe you should be able to directly download the mbox files like that.
Why ever not? What's so different about grabbing one mbox at once, rather than however many iterations/scraping?
The point that I didn't grasp was that you needed to log into the private archives before you could in fact do this. From your message I mistakenly thought that you could just directly grab the mbox without authentication.
Sorry about that. Andrew.
Great, thanks everyone. I got the mbox file, now I just need to construct clever queries :)
Thanks,
David
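For the original question (how often a word or phrase occurred in a given period), one possible query against the downloaded mbox uses Python's standard `mailbox` module. This is a minimal sketch with some simplifying assumptions: matching is case-insensitive, only `text/plain` parts are searched, bodies are decoded as UTF-8 with undecodable bytes replaced, and messages with a missing or unparseable Date header are skipped. The function name is hypothetical.

```python
import mailbox
import re
from email.utils import parsedate_to_datetime


def count_phrase(mbox_path, phrase, first_year, last_year):
    """Count occurrences of `phrase` in messages dated first_year..last_year."""
    pattern = re.compile(re.escape(phrase), re.IGNORECASE)
    total = 0
    box = mailbox.mbox(mbox_path, create=False)
    try:
        for msg in box:
            try:
                sent = parsedate_to_datetime(str(msg.get("Date", "")))
            except (TypeError, ValueError):
                continue  # missing or malformed Date header
            if not first_year <= sent.year <= last_year:
                continue
            for part in msg.walk():
                if part.get_content_type() != "text/plain":
                    continue
                payload = part.get_payload(decode=True)
                if payload is None:
                    continue
                # Simplification: treat every body as UTF-8.
                text = payload.decode("utf-8", errors="replace")
                total += len(pattern.findall(text))
    finally:
        box.close()
    return total
```

For the example in the first message, the call would be `count_phrase("vtkusers.mbox", "example", 2008, 2009)`, assuming the file was saved under that name. Quoted text in replies is counted again each time it appears, so the total reflects occurrences in the archive rather than distinct uses.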