[Tutor] Processing .NK2 files

Fri Feb 15 17:42:08 CET 2008

I need to delete records in an MS Outlook .NK2 file if they contain a
specific email address.
This code was inspired mainly by:
http://code.google.com/p/debunk2/wiki/fileformat

import re

NUL='\x00'
sep1 = '\x04H\xfe\x13' # record separator
sep2 = '\x00\xdd\x01\x0fT\x02\x00\x00\x01'  # record separator
p = re.compile('@emaildomain', re.IGNORECASE)

props = open('nk2properties.txt', 'r')
lines = props.readlines()

infile  = lines[0].strip()
outfile = lines[1].strip()

props.close()

f = open(infile, 'rb')
filedata = f.read()
f.close()

out = open(outfile, 'wb')
for z in filedata.split(sep1):
    for y in z.split(sep2):
        # break the chunk into fields and drop the NUL padding
        split1 = [x.replace(NUL, '') for x in y.split(NUL*3)]
        for item in split1:
            m = p.search(item)
            if m:
                print m.group()
                out.write(item)

out.close()

nk2properties.txt contents:

Outlook.NK2
outputfile

I'm fairly sure the script finds all matches. The trouble, it seems, is
determining where each record begins and ends, so that I can delete the
matching records and keep them in a separate file in case I have to
reimport them later.
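
One way to make the record boundaries explicit is to split on the separator
but keep it attached to each chunk, so that every chunk can be written back
byte for byte. The sketch below is only an illustration and is written for
Python 3 with bytes, unlike the Python 2 script above. It assumes that sep1
really does mark the start of each record, which is not verified here. The
names split_keep_sep and partition_records are made up for this example. It
ignores sep2 and does not touch anything before the first separator, so
whether Outlook accepts the resulting file is untested.

```python
import re

# Record separator copied from the script above (assumed to be correct).
SEP1 = b'\x04H\xfe\x13'

def split_keep_sep(data, sep):
    """Split data on sep, keeping sep at the start of every chunk after the first."""
    parts = data.split(sep)
    return [parts[0]] + [sep + part for part in parts[1:]]

def partition_records(data, pattern, sep=SEP1):
    """Return (kept, removed): the bytes of non-matching and matching chunks."""
    chunks = split_keep_sep(data, sep)
    kept = [chunks[0]]      # whatever precedes the first separator is always kept
    removed = []
    for chunk in chunks[1:]:
        # Strip NUL bytes before matching, as the original script does.
        text = chunk.replace(b'\x00', b'')
        if pattern.search(text):
            removed.append(chunk)   # stored unmodified, separator included
        else:
            kept.append(chunk)
    return b''.join(kept), b''.join(removed)
```

With something like this, kept would be written to the output file and
removed to a separate backup file, both opened in 'wb' mode. The pattern has
to be compiled from bytes, for example re.compile(b'@emaildomain',
re.IGNORECASE). Because the chunks are stored unmodified, the two outputs
together contain every byte of the input.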