Thank you all for your comments. I ended up saving the Word document as XML and then running a slightly modified version of the script from my original post on it. For those interested, there was also a problem with encodings. Regards, antoine