Mail extraction problem (something's wrong with split methods)

Pierre Fortin pfortin at pfortin.com
Sun Sep 12 19:37:12 CEST 2004


On Sun, 12 Sep 2004 17:32:15 +0200 Luka wrote:

This message has already been processed by something that appears to
generate list/tuple segments.  I would suspect that whatever modified the
message has a string size limitation.  However, it looks like whatever
manhandled it simply did a Python print of a tuple.  If you really want to
process this type of message instead of getting at the real problem, then
here's a clue...

Here, I reduced the contents to just the items...

('+OK',
 ['Received',    # brackets, braces, parens are just text herein
  'by',
  'for',
  'Date',
  'Message-Id',
  'From',
  'To',
  'Subject',
  'X-Scanned-By: MIMEDefang 2.42',
  'X-Virus-Scanned',
  'Content-Length: 4210',
  'Status:   ',
  '',
  '',
  '---Code block---',
  '[6964, 7086, ..., 6730',   # "[" is just text here
  ', 6793, ..., 5534]',       # "]" ditto
  '---Code block---'
 ],
 4815
)

Further reducing the items shows the structure:

('s',
  ['s', 's', 's', 's', 's', 's', 's', 's', 's', 's', 's', 's', # headers
   '',     # header/body separator
   '',
   's',    # ---Code block---
   's',    # 1st part
   's',    # 2nd part
   's'     # ---Code block---
  ],
  4815     # reported msg size
)

which boils down to:

  (s,[s, ..., s],i)  # aka: tuple(string,list(strings),int)
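
To make that shape concrete, here is a minimal sketch that unpacks such a
tuple and splits the headers from the body at the first empty string.  The
sample data is a shortened stand-in for the reduced message above, and the
names status, lines and size are just my labels for the three items, not
anything taken from the original program.

```python
# Shortened stand-in for the reduced message above (hypothetical sample data).
msg = ('+OK',
       ['Received', 'Subject', 'Status:   ',
        '', '',
        '---Code block---',
        '[6964, 7086, 6730',
        ', 6793, 5534]',
        '---Code block---'],
       4815)

# tuple(string, list(strings), int); the names are assumed labels
status, lines, size = msg

blank = lines.index('')        # the first empty string ends the headers
headers = lines[:blank]
body = lines[blank + 1:]       # everything after the separator

print(headers)
print(body)
```

Note that the body still starts with the second empty string; only the
first one is treated as the header/body separator here.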

So...  it looks like you just need to isolate the strings between the
"---Code block---" strings (there could be more than 2, or just 1),
concatenate them, and split the result...

Straight-line brute forcing it:

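msg = ....     # get the message as a tuple: (string, list of strings, int)
sep = "---Code block---"
start = msg[1].index(sep)      # position of the first separator
data = msg[1][start+1:]        # everything after it
end = data.index(sep)          # position of the closing separator
data = data[:end]              # only the strings between the two
print("".join(data)[1:-1].split(", "))   # drop "[" and "]", then split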
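
The same steps can be wrapped in a small function that also converts the
pieces to integers.  This is only a sketch under a few assumptions: the
function name extract_values is made up, the sample tuple is a shortened
stand-in for the real message, and the data between the separators is
assumed to be a single bracketed, ", "-separated list of integers.

```python
def extract_values(msg, sep="---Code block---"):
    """Return the integers found between the first pair of separator lines.

    msg is assumed to be a tuple(string, list(strings), int).
    """
    lines = msg[1]
    start = lines.index(sep)              # opening separator
    end = lines.index(sep, start + 1)     # closing separator
    joined = "".join(lines[start + 1:end])
    # strip the surrounding brackets, then split on ", "
    return [int(tok) for tok in joined.strip("[]").split(", ")]


# Shortened stand-in for the real message (hypothetical sample data).
sample = ('+OK',
          ['Subject', '', '',
           '---Code block---',
           '[6964, 7086, 6730',
           ', 6793, 5534]',
           '---Code block---'],
          4815)

print(extract_values(sample))   # [6964, 7086, 6730, 6793, 5534]
```

Joining before splitting is what removes the break between 6730 and 6793.
If either separator is missing, index() raises ValueError, so the caller
would need to catch that for messages without a code block.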

> This is the original mail, sorry because of the size. As you can see,
> there are two problematic spots: 6730', ', and ','6573, at the end of
> the mail.


Pierre


