Mail extraction problem (something's wrong with split methods)

Luka Milkovic luka.milkovic at public.srce.hr
Sun Sep 12 17:32:15 CEST 2004


On Sun, 12 Sep 2004 16:24:31 +0200, Diez B. Roggisch wrote:

> The email text. Whatever the reason for the unexpected behaviour is, its in
> there. 

('+OK 4815 octets', ['Received: from galileo.resean
(aelthegrin at cmung149.cmu.carnet.hr [193.198.128.149])', '\tby
jagor.srce.hr (8.12.10/8.12.10) with ESMTP id i8BFvSRt009065', '\tfor
<luka.milkovic at public.srce.hr>; Sat, 11 Sep 2004 17:57:28 +0200 (CEST)',
'Date: Sat, 11 Sep 2004 17:57:28 +0200 (CEST)', 'Message-Id:
<200409111557.i8BFvSRt009065 at jagor.srce.hr>', 'From:
luka.milkovic at public.srce.hr', 'To: luka.milkovic at public.srce.hr',
'Subject: OTP', 'X-Spam-Score: 5.544 (*****)
DATE_MISSING,MSGID_FROM_MTA_SHORT,NO_REAL_NAME', 'X-Scanned-By: MIMEDefang
2.42', 'X-Virus-Scanned: by amavisd-new at jagor.srce.hr',
'Content-Length: 4210', 'Status:   ', '', '', '---Code block---', '[6964,
7086, 3211, 7522, 9472, 3265, 3610, 104, 9729, 6706, 8035, 5439, 7142,
360, 677, 1667, 1382, 9417, 4493, 8289, 9613, 3470, 889, 1021, 3381, 3480,
1385, 2027, 956, 9317, 6567, 5552, 1114, 3311, 4437, 631, 5881, 2101,
9948, 4529, 3088, 5548, 3728, 8727, 7787, 5754, 8315, 8250, 8308, 510,
8183, 4052, 9046, 8217, 5107, 8333, 7799, 4589, 209, 7465, 1010, 4459,
5984, 8272, 5311, 4458, 3565, 5747, 8460, 9845, 9305, 1662, 2650, 5290,
9725, 5743, 6679, 9896, 4776, 8586, 3075, 8824, 9369, 6957, 8564, 7165,
112, 9940, 6291, 1489, 3561, 1218, 3890, 9970, 9973, 7624, 7721, 8620,
456, 872, 4546, 926, 2687, 8884, 8598, 7544, 6857, 5363, 6686, 8579, 7937,
8290, 3578, 5411, 6375, 5596, 6860, 8392, 5300, 5927, 8211, 2232, 2194,
1388, 9047, 5384, 876, 4773, 7331, 3238, 5699, 7498, 2789, 8344, 5198,
1732, 3330, 6832, 908, 4210, 8943, 2390, 1655, 5324, 993, 6281, 2909,
2178, 9929, 40, 5060, 964, 4752, 8570, 7714, 607, 6450, 5793, 9292, 6428,
5410, 7567, 6040, 543, 3602, 8022, 4052, 7222, 6324, 6729, 1030, 299,
8641, 4312, 8614, 423, 6730', ', 6793, 3453, 9470, 9382, 2037, 4103,
6427, 5312, 1366, 6287, 2316, 5745, 6916, 1640, 2381, 7510, 1156, 1538,
3015, 1592, 4136, 2170, 6263, 3829, 6869, 8079, 9724, 1830, 3245, 4694,
782, 9703, 3615, 2907, 4435, 7329, 7511, 5418, 2913, 1567, 7865, 3729,
8289, 373, 5635, 8292, 9569, 4370, 8728, 3082, 7829, 4797, 9632, 8283,
2741, 7887, 6366, 9821, 1604, 1099, 3256, 2722, 8474, 6261, 8582, 6431,
1762, 8615, 9745, 599, 4078, 4779, 1469, 90, 5432, 5475, 9098, 5614, 184,
9515, 8909, 3868, 4880, 2408, 9665, 8552, 5444, 9209, 993, 9008, 1495,
1885, 3871, 4774, 8698, 5212, 1303, 6629, 6011, 4490, 9329, 1062, 4558,
4338, 2279, 8502, 473, 9650, 5787, 8329, 6816, 6858, 3868, 1854, 2991,
9958, 8931, 9276, 7837, 9372, 6732, 2402, 5453, 6012, 2958, 2593, 2258,
6599, 2127, 2214, 5839, 3947, 5270, 10093, 8043, 2905, 686, 6451, 312,
1682, 1947, 3447, 4083, 6838, 7896, 3054, 9913, 6716, 3831, 1861, 7286,
6863, 7754, 5534, 8451, 9536, 7945, 9747, 7075, 3808, 6180, 5387, 930,
9663, 7337, 3513, 9535, 4329, 6056, 2114, 8972, 8336, 9743, 5397, 3112,
8023, 3392, 1488, 1707, 8223, 9982, 4498, 1840, 962, 2471, 7919, 2731,
7935, 2826, 6904, 4150, 8780, 9697, 5955, 412, 1816, 7017, 5219, 1290,
7106, 6747, 1180, 1230, 2564, 1568, 373, 9301, 59, 9632, 4667, 7701, 9141,
6240, 3290, 7172, 4006, 8018, 5744, 1125, 4388, 7109, 7357, 5188, 841,
7950, 666, 6754, 4894, 7222, 9275, 7291, 3038, 6510, 8543, 7400, 2218,
2671, 1, 1753, 5620, 4833, 2920, 3754, 9364, 9724, 3445, 6378, 1986, 9350,
4887, 633, 6400, 4586, 1541, 5883, 2696, 306, 5971, 8164, 748, 2464, 550,
9843, 9373, 5004, 4295, 1055, 6916, 6386, 8480, 4480, 8744, 2586, ',
'6573, 869, 9277, 6960, 4871, 9340, 6119, 4271, 7572, 1230, 1213, 5534]',
'---Code block---'], 4815)

This is the original mail; sorry about the size. As you can see, there
are two problematic spots near the end of the mail: 6730', ', 6793 and
2586, ', '6573.

I was wondering whether there is any way to modify the splitting code I
already posted. What I want is for the code to parse the e-mail as usual
and, when it reaches these problematic spots, remove the unnecessary
quotes and continue parsing...

Is there anything that could be done?
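
One possible direction, as a rough sketch only: the dump above has the
shape (response, lines, octets) that poplib's retr() returns, so the stray
quotes look like boundaries between list elements rather than characters
in the mail itself. Assuming that is the case, and assuming the numbers
always sit between two '---Code block---' lines, the pieces could be
joined back together before splitting on commas. The function name
extract_numbers and the MARKER constant are made up for this example.

```python
MARKER = '---Code block---'  # delimiter line as it appears in the dump above


def extract_numbers(retr_result):
    """Return the integers found between the two MARKER lines.

    retr_result is assumed to be a (response, lines, octets) tuple,
    which is the shape poplib's retr() returns.
    """
    _response, lines, _octets = retr_result
    # Newer Python versions hand back bytes; normalise to str first.
    text = [ln.decode('ascii', 'replace') if isinstance(ln, bytes) else ln
            for ln in lines]
    start = text.index(MARKER)
    end = text.index(MARKER, start + 1)
    # Glue the wrapped pieces back into one string, so the breaks
    # between list elements disappear.
    joined = ''.join(text[start + 1:end])
    # Drop the surrounding brackets, then split on commas.
    tokens = joined.strip().strip('[]').split(',')
    return [int(tok) for tok in tokens if tok.strip()]
```

With this, the quotes never need to be removed by hand: they only exist in
the printed form of the list, not in the strings being joined.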

Thank you once more:)
Luka


