can't open word document after string replacements
Antoine De Groote
antoine at vo.lu
Tue Oct 24 06:15:40 EDT 2006
Bruno Desthuilliers wrote:
> Antoine De Groote wrote:
>> Hi there,
>>
>> I have a Word document containing pictures and text. This document
>> holds several 'ABCDEF' strings which serve as placeholders for names.
>> Now I want to replace these occurrences with names in a list (members).
>
> Do you know that MS Word already provides this kind of feature?
No, I don't. Sounds interesting... What is this feature called?
>
>> I
>> open both input and output file in binary mode and do the
>> transformation. However, I can't open the resulting file; Word just
>> tells me that there was an error. Does anybody know what I am doing wrong?
>
> Hand-editing a non-documented binary format may lead to undesirable
> results...
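
One concrete way this can go wrong: binary formats commonly store lengths and offsets next to the data they describe, so a replacement of a different length leaves those numbers stale. The sketch below uses a made-up length-prefixed record, not the actual .doc layout; `pack_record` and `unpack_record` are hypothetical helpers for illustration only.

```python
import struct

def pack_record(text):
    """Toy format: 2-byte little-endian length, then the text bytes."""
    return struct.pack("<H", len(text)) + text

def unpack_record(blob):
    """Read back exactly as many bytes as the length prefix claims."""
    (length,) = struct.unpack_from("<H", blob, 0)
    return blob[2:2 + length]

record = pack_record(b"Dear ABCDEF,")

# Naive byte-level replacement: the payload grows, the stored length does not.
patched = record.replace(b"ABCDEF", b"Antoine De Groote")

# A reader that trusts the length prefix now sees a truncated text.
recovered = unpack_record(patched)
```

A real reader may reject such a file outright instead of truncating, which would match the error Word reports.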
>
>> Oh, and is this approach pythonic anyway?
>
> The pythonic approach is usually to start looking for existing
> solutions... In this case, using Word's builtin features and Python/COM
> integration would be a better choice IMHO.
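
A rough sketch of what the COM route could look like, assuming Windows, an installed copy of Word, and the pywin32 package. The function names `fill_placeholders` and `replace_in_word_file` are made up for this example, and the positional arguments follow the parameter order of Word's `Find.Execute` as I understand it, so check them against the Word object model reference before relying on this.

```python
WD_FIND_STOP = 0     # WdFindWrap.wdFindStop: do not wrap around at the end
WD_REPLACE_ONE = 1   # WdReplace.wdReplaceOne: replace only the first match

def fill_placeholders(document, placeholder, names):
    """Replace successive placeholders with successive names.

    `document` is expected to behave like a Word Document COM object.
    Returns the number of replacements made.
    """
    replaced = 0
    for name in names:
        find = document.Content.Find
        # Positional order: FindText, MatchCase, MatchWholeWord,
        # MatchWildcards, MatchSoundsLike, MatchAllWordForms, Forward,
        # Wrap, Format, ReplaceWith, Replace
        found = find.Execute(placeholder, True, False, False, False, False,
                             True, WD_FIND_STOP, False, name, WD_REPLACE_ONE)
        if not found:
            break  # no placeholder left
        replaced += 1
    return replaced

def replace_in_word_file(src_path, dst_path, names, placeholder="ABCDEF"):
    """Open src_path in Word, fill the placeholders, save as dst_path."""
    import win32com.client  # pywin32; imported here so the module loads anywhere
    word = win32com.client.Dispatch("Word.Application")
    word.Visible = False
    try:
        doc = word.Documents.Open(src_path)  # absolute paths are safest
        fill_placeholders(doc, placeholder, names)
        doc.SaveAs(dst_path)
        doc.Close()
    finally:
        word.Quit()
```

Because Word itself performs the edit, the file's internal structure stays consistent, which a byte-level replacement cannot guarantee.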
>
>> (I have a strong Java
>> background.)
>
> Nobody's perfect !-)
>
>> Regards,
>> antoine
>>
>>
>> import os
>>
>> members = somelist
>>
>> os.chdir(somefolder)
>>
>> doc = file('ttt.doc', 'rb')
>> docout = file('ttt1.doc', 'wb')
>>
>> counter = 0
>>
>> for line in doc:
>
> Since you opened the file as binary, you should use file.read() instead.
> Ever wondered what your 'lines' look like ?-)
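
To see what those 'lines' look like, here is a small demonstration on made-up bytes held in memory (the sample content is invented, not taken from a real .doc file). Iterating over a binary file object splits after every 0x0A byte, wherever it happens to fall, while read() returns everything in one piece.

```python
import io

# Made-up binary content; 0x0A is just another byte value here.
blob = b"\x01\x02\x03\x04\x0a\x00ABCDEF\x0a\x0a\xff\x00"

# Iterating a binary file object yields chunks ending at each 0x0A byte.
chunks = list(io.BytesIO(blob))

# read() returns the whole content as a single bytes object.
whole = io.BytesIO(blob).read()

for chunk in chunks:
    print(repr(chunk))
```

The chunks still add up to the original content, but their boundaries carry no meaning for the document.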
>
>>     while line.find('ABCDEF') > -1:
>
> .doc is a binary format. You may find such a byte sequence in its
> content in places that are *not* text content.
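
A toy illustration of that risk, using an invented layout (a 6-byte non-text field followed by the visible text) rather than the real .doc structure: a plain byte search cannot tell the two apart.

```python
PLACEHOLDER = b"ABCDEF"

# Made-up layout: a non-text field whose bytes happen to equal the
# placeholder, followed by the text the user actually sees.
non_text_field = bytes([0x41, 0x42, 0x43, 0x44, 0x45, 0x46])
visible_text = b"Dear ABCDEF, welcome."
blob = non_text_field + visible_text

# The search reports two matches, only one of which is real text.
hits = blob.count(PLACEHOLDER)

# Replacing "the first occurrence" clobbers the non-text field instead.
patched = blob.replace(PLACEHOLDER, b"Bruno", 1)
```

Here the visible text is left untouched while the non-text field is overwritten and shortened.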
>
>>         try:
>>             line = line.replace('ABCDEF', members[counter], 1)
>>             docout.write(line)
>
> You're writing back the whole chunk on each iteration. No surprise the
> resulting document is corrupted.
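
The effect is easy to reproduce on an ordinary bytes value in memory. In this sketch the function names are invented, an exhausted name list falls back to an empty replacement, and the trailing else branch of the original loop is left out to keep the comparison short.

```python
def write_on_every_pass(chunk, placeholder, names):
    """Mimics the posted loop: the chunk is written after each replacement."""
    written = []
    names = iter(names)
    while placeholder in chunk:
        chunk = chunk.replace(placeholder, next(names, b""), 1)
        written.append(chunk)  # partially filled chunk ends up in the output
    return b"".join(written)

def write_once(chunk, placeholder, names):
    """Finish all replacements first, then write the chunk a single time."""
    names = iter(names)
    while placeholder in chunk:
        chunk = chunk.replace(placeholder, next(names, b""), 1)
    return chunk

chunk = b"Hi ABCDEF and ABCDEF\n"
members = [b"Ann", b"Bob"]

duplicated = write_on_every_pass(chunk, b"ABCDEF", members)
clean = write_once(chunk, b"ABCDEF", members)
```

With two placeholders in one chunk, the first version emits the chunk twice, once half filled. Both loops assume no name contains the placeholder itself.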
>
>>             counter += 1
>
> seq = list("abcd")
> for indice, item in enumerate(seq):
>     print "%02d : %s" % (indice, item)
>
>
>>         except:
>>             docout.write(line.replace('ABCDEF', '', 1))
>>     else:
>>         docout.write(line)
>>
>> doc.close()
>> docout.close()
>>
>
>
>