Mail extraction problem (something's wrong with split methods)
pfortin at pfortin.com
Sun Sep 12 19:37:12 CEST 2004
On Sun, 12 Sep 2004 17:32:15 +0200 Luka wrote:
This msg has already been processed by something that appears to generate
list/tuple segments... I would suspect that whatever modified the message
has a string size limitation... However, it looks like whatever
manhandled this msg just did what looks like a Python print of a tuple...
If you really want to process this type of message instead of getting at
the real problem, then here's a clue...
Here, I reduced the contents to just the items...
['Received', # brackets, braces, parens are just text herein
'X-Scanned-By: MIMEDefang 2.42',
'[6964, 7086, ..., 6730', # "[" is just text here
', 6793, ..., 5534]', # "]" ditto
Further reducing the items shows the structure:
['s', 's', 's', 's', 's', 's', 's', 's', 's', 's', 's', 's', # headers
'', # header/body separator
's', # ---Code block---
's', # 1st part
's', # 2nd part
's' # ---Code block---
4815 # reported msg size
which boils down to:
(s,[s, ..., s],i) # aka: tuple(string,list(strings),int)
So... looks like you just need to isolate the strings between the
"---Code block---" strings (there could be more than 2 parts, or just 1),
concatenate them, strip the enclosing "[" and "]", and split the result
on ", "...
Straight-line brute forcing it:
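msg = .... # get the message as a tuple
lines = msg[1] # the list of strings inside the tuple
sep = "---Code block---"
start = lines.index(sep) # first separator
data = lines[start+1:]
end = data.index(sep) # next separator
data = data[:end]
# join the parts, drop the enclosing "[" and "]", split into items
print("".join(data)[1:-1].split(", "))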
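
The same steps can be wrapped in a small function. This is only a minimal sketch: it assumes the message really has the (string, list of strings, int) layout shown above and that the separator appears as an exact list item. The name extract_items and the sample data are made up for illustration.

```python
SEP = "---Code block---"  # separator text as it appears in the message


def extract_items(msg, sep=SEP):
    """Return the items found between the first two separator lines."""
    # Assumed layout: (string, list of strings, int)
    _response, lines, _size = msg
    start = lines.index(sep)           # raises ValueError if there is no separator
    end = lines.index(sep, start + 1)  # raises ValueError if there is no closing one
    joined = "".join(lines[start + 1:end])
    # Drop the enclosing "[" and "]", then split on the item delimiter
    return joined[1:-1].split(", ")


# Made-up sample with the same shape as the reduced message
sample = (
    "s",
    ["s", "", SEP, "[6964, 7086, 6730", ", 6793, 5534]", SEP],
    4815,
)
items = extract_items(sample)
print(items)
```

Because the parts are joined before splitting, an item delimiter that was cut in half at a segment boundary (as in 6730', ', 6793) is put back together first. If integers are needed, [int(x) for x in items] converts the result.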
> This is the original mail, sorry because of the size. As you can see,
> there are two problematic spots: 6730', ', and ','6573, at the end of
> the mail.