Standard module for parsing emails?
steve at REMOVE-THIS-cybersource.com.au
Thu Jul 31 04:25:37 CEST 2008
On Wed, 30 Jul 2008 07:11:45 -0700, Phillip B Oldham wrote:
> Most clients use ">" which is easy to check for, but I've seen some
> which use "|" and some which *don't* quote at all. It's causing us
> nightmares in parsing responses to system-generated emails. I was hoping
> someone might've seen the problem previously and released some code.
I've even seen clients that prefix new (unquoted) text with the quote
character.
Well, possibly it's not the mail client, but the user. Who knows?
I will sometimes quote text like this:
But I'm writing for a human audience, not for a program.
The simple answer is that you can catch 90% of cases by checking for ">",
and another 1% by checking for "|". If the email contains HTML, I have
found that quoted text is sometimes in another colour. As for the rest,
well, sometimes even human beings can't easily determine what's quoted
and what isn't. Good luck getting a program to do it.
(Percentages are plucked out of thin air. YMMV.)
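For what it's worth, the prefix check described above might look roughly like the sketch below. It is not a standard-library facility: the names QUOTE_PREFIX and split_quoted are made up for illustration, and it assumes a plain-text body where a quoted line starts (after optional whitespace) with ">" or "|". It will misfile any unquoted line that happens to begin with one of those characters, and it does nothing for HTML mail or for clients that don't quote at all.

```python
import re

# Assumption: a line is "quoted" if it begins, after optional leading
# whitespace, with ">" or "|". This is a heuristic, not a standard.
QUOTE_PREFIX = re.compile(r"^\s*[>|]")


def split_quoted(body):
    """Split a plain-text message body into (new_lines, quoted_lines)."""
    new_lines, quoted_lines = [], []
    for line in body.splitlines():
        if QUOTE_PREFIX.match(line):
            quoted_lines.append(line)
        else:
            new_lines.append(line)
    return new_lines, quoted_lines


# Hypothetical sample message, for illustration only.
sample = (
    "On Wed, Phillip wrote:\n"
    "> Most clients use \">\"\n"
    "| some use a bar\n"
    "My reply.\n"
)
new, quoted = split_quoted(sample)
```

Here `new` holds the attribution line and the reply, and `quoted` holds the two prefixed lines. Note that the attribution line ("On Wed, ... wrote:") ends up with the new text; stripping it would need a further heuristic of its own.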