Searching binary data
tim_one at email.msn.com
Thu Feb 3 04:49:06 CET 2000
> Didn't have access to the internet today which forced me to have
> a creative thought of my own. Now to find out if I wasted my time.
> The problem is to find patterns in gobs of binary data.
> Treat it as a string and you see something like this.
> I found writing a re for patterns in that, a pain.
> What if I wanted r"[\000-\077]".
> It won't work because there are nulls in the result and re doesn't
> like that.
Actually, that works fine (if it didn't, what you just told us you did is
not what you actually did). You can't pass a pattern with an actual null to
re (minor flaw of the implementation, IMO), but the raw string
r"[\000-\077]" doesn't contain an actual null: it contains the 4-character
escape sequence "\000", which re converts to a null.
>>> import re
>>> p = re.compile(r"[\000-\001]")
>>> print p.match(chr(2))
None
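The distinction between the raw string and an actual null can be checked directly. The following is a minimal sketch, assuming a present-day Python 3 interpreter (hence print() as a function); the names raw and p are illustrative only.

```python
import re

raw = r"[\000-\077]"   # raw string: the backslash and digits are kept as typed

print(len(r"\000"))    # 4: backslash, '0', '0', '0'
print(len("\000"))     # 1: a single null character
print("\0" in raw)     # False: the pattern text holds no actual null

# re itself interprets \000 and \077 as octal escapes inside the class.
p = re.compile(raw)
print(p.match(chr(0)) is not None)      # True: null is in range
print(p.match(chr(0o77)) is not None)   # True: upper bound is inclusive
print(p.match(chr(0o100)) is None)      # True: just past the range
```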
> Not to mention all this octal to hex is annoying
Hex escapes work fine too: r"[\x00-\x3f]" means the same as the above.
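A quick way to confirm that the two spellings select the same characters is to try every value from 0 to 255 against both. This is a sketch assuming Python 3; the variable names are illustrative.

```python
import re

octal_class = re.compile(r"[\000-\077]")
hex_class = re.compile(r"[\x00-\x3f]")

# Collect every code point below 256 that each class accepts.
by_octal = [i for i in range(256) if octal_class.match(chr(i))]
by_hex = [i for i in range(256) if hex_class.match(chr(i))]

print(by_octal == by_hex)           # True
print(by_octal == list(range(64)))  # True: exactly 0x00 through 0x3f
```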
> and who knows what trouble Nulls will be.
I do: none <wink>. Really, nulls aren't special at all to re. The glitch
in *passing* an actual null in the pattern to re has to do with the engine's
C interface, which uses a char* for the pattern without an additional count
argument. That's as deep as this one goes.
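To illustrate that nulls in the searched data are handled like any other byte, here is a minimal sketch assuming a current Python 3 interpreter and its bytes patterns (neither existed when this was posted). The sample data is made up. The second pattern contains actual null bytes; current interpreters accept that, so the restriction described above should be read as a property of the engine at the time.

```python
import re

# Hypothetical sample: two nulls followed by two low bytes, framed by 0xff.
data = b"\xff\x00\x00\x10\x11\xff"

# Raw bytes pattern: the text holds escape sequences, which re converts itself.
escaped = re.compile(rb"\x00{2,}[\x01-\x3f]+")
m = escaped.search(data)
print(m.start(), m.group())   # 1 b'\x00\x00\x10\x11'

# Non-raw literal: this pattern really does contain two null bytes.
literal = re.compile(b"\x00\x00")
print(literal.search(data).span())   # (1, 3)
```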
> So I wrote an extension to convert everything to hex in the
> following format.
> Now I can treat the whole thing as a string :)
That's fine too.
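For comparison, a hex-text approach can be sketched with the standard binascii module instead of a custom extension. The exact format the poster used is not shown, so the layout here (two lowercase hex digits per byte, no separators) is an assumption, as are the sample data and names. One thing to watch is alignment: a pattern applied at an odd offset can match across two adjacent bytes.

```python
import binascii
import re

data = b"\xff\x00\x3f\x40"                          # hypothetical sample
hex_text = binascii.hexlify(data).decode("ascii")   # 'ff003f40'

# A byte in the range 0x00-0x3f is a hex pair whose first digit is 0-3.
low_byte = re.compile(r"[0-3][0-9a-f]")

# Test only even offsets so a match never straddles two bytes
# (an unaligned search would wrongly hit '03' at offset 3).
hits = [i // 2 for i in range(0, len(hex_text), 2)
        if low_byte.match(hex_text, i)]
print(hits)   # [1, 2]: byte offsets of 0x00 and 0x3f
```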