[Tutor] regular expressions question

Alan Gauld alan.gauld at freenet.co.uk
Sat Aug 12 11:33:32 CEST 2006


> The file's encoding is binary or something
>
> Here is the first section of the file:
> '\x00\x00\x00\x02\xb8,\x08\x9f\x00\x00z\xa8\x00\x00\x01\xf4\x00\x00\x01\xf4\x00\x00\x00t\x00f\x00i\x00l\x00e\x00:\x00/\x00h\x00o\x00m\x00e\x00/\x00a\x00l'
>
> Does that tell you anything?

Recall that on a 32-bit computer each "word" is 4 bytes long,
so to get anything meaningful you often need to consider 4-byte
blocks.

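00000002
B82C089F
00007AA8
000001F4
000001F4

The tricky bit is that printable bytes show up as characters in the
dump: the ',' is byte 0x2C and the 'z' is byte 0x7A (ord('z')).

etc.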

Now whether those numbers mean anything to you is a moot
point, but that's usually the starting point.
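
As a rough sketch of that first step, the snippet below splits the start
of the sample into 4-byte words with the struct module. It is written for
Python 3, so the dump appears as a bytes literal. Big-endian byte order is
an assumption, chosen only because the small values then come out as small
numbers; the names data and words are just for illustration.

```python
import struct

# The sample from the original post, as a Python 3 bytes literal.
data = (b'\x00\x00\x00\x02\xb8,\x08\x9f\x00\x00z\xa8\x00\x00\x01\xf4'
        b'\x00\x00\x01\xf4\x00\x00\x00t\x00f\x00i\x00l\x00e\x00:\x00/'
        b'\x00h\x00o\x00m\x00e\x00/\x00a\x00l')

# Assumption: big-endian, unsigned 32-bit words ('>6I' means six of them).
words = struct.unpack('>6I', data[:24])

# Show each word in hex and in decimal.
for w in words:
    print('%08X  %d' % (w, w))
```

Under that assumption the fourth and fifth words both come out as 500,
which at least looks like a deliberate value rather than noise.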

> I have been trying to replace the pesky \x00's with something less

But that is almost certainly the wrong approach: you'll never
figure out where the word boundaries are without them!
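
To illustrate, here is a small sketch (Python 3) of what happens to the
sample when the zero bytes are removed. Two things are assumed rather than
known: that the numeric fields are big-endian 32-bit words, and that the
text after them is big-endian UTF-16. Both are guesses from the dump, not
facts about the file format.

```python
import struct

# First 24 bytes of the sample from the original post.
header = (b'\x00\x00\x00\x02\xb8,\x08\x9f\x00\x00z\xa8'
          b'\x00\x00\x01\xf4\x00\x00\x01\xf4\x00\x00\x00t')
# The bytes that follow in the sample.
tail = b'\x00f\x00i\x00l\x00e\x00:\x00/\x00h\x00o\x00m\x00e\x00/\x00a\x00l'

# With the zeros intact: six 32-bit words (big-endian is an assumption).
intact = struct.unpack('>6I', header)

# With the zeros stripped, the leftover bytes regroup into different words.
stripped = header.replace(b'\x00', b'')
shifted = struct.unpack('>%dI' % (len(stripped) // 4), stripped)

print([hex(w) for w in intact])
print([hex(w) for w in shifted])

# Assumption: big-endian UTF-16, where each zero is half of a character.
print(tail.decode('utf-16-be'))
```

The stripped version still unpacks without complaint, which is the
danger: the numbers are simply different ones, and nothing tells you so.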

> Suggestions greatly appreciated!!

Go look on the Konqueror site (at the code if necessary) to find the
format of the data structure in the file and use the struct module
to unpack it.
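
Until the real format is known, any layout is guesswork, but the
unpacking itself might look like the sketch below (Python 3). The layout
is hypothetical: a big-endian 32-bit byte count followed by that many
bytes of UTF-16 text. It is suggested by the sample, where the word just
before the text is 0x74, but it is not confirmed. The helper read_string
is an illustrative name, not part of any library.

```python
import struct

def read_string(buf, offset):
    """Read one length-prefixed string; return (text, next_offset).

    Hypothetical layout, guessed from the dump: a big-endian unsigned
    32-bit byte count followed by that many bytes of UTF-16 text.
    """
    (nbytes,) = struct.unpack_from('>I', buf, offset)
    start = offset + struct.calcsize('>I')
    end = start + nbytes
    return buf[start:end].decode('utf-16-be'), end

# Build a small record in the same guessed layout to try it out.
payload = 'file:/home/al'.encode('utf-16-be')
record = struct.pack('>I', len(payload)) + payload

text, next_offset = read_string(record, 0)
print(text, next_offset)
```

Returning the next offset lets you call the helper repeatedly to walk
through consecutive fields once you know what they are.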

You might find the section in my tutorial (under Handling Files)
on binary files and using struct useful.

HTH

Alan Gauld
Author of the Learn to Program web site
http://www.freenetpages.co.uk/hp/alan.gauld 


