[Tutor] escape character regex

Ian D duxbuz at hotmail.com
Sun Mar 29 09:55:01 CEST 2015


Ha ha thanks Danny for the hex message!


I am basically looking to match two unknown hex digits, i.e. one byte, at the end of the 4-byte sequence.

I realise now that I was trying to use the numeric \d expression when what I need to match is 2 nibbles, i.e. one byte.


Is there a way to match the last byte with some sort of wildcard, since its value changes?
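
One possible answer, as a minimal sketch rather than a definitive one: in a regex, `.` stands for any single character, so it can act as the wildcard for the fourth byte. The re.DOTALL flag is an assumption about what is wanted here; without it `.` would refuse a byte with value 0x0a (newline). The data string is the one from the quoted message below, and the print call is written so that it runs on Python 3 as well as 2.7.

```python
import re

# Three literal null characters, then '.' for exactly one character of any value.
# re.DOTALL lets '.' also match '\n' (0x0a), which it skips by default.
pchars = re.compile('\x00\x00\x00.', re.DOTALL)

data = "['broadcast', 'd8on\x00\x00\x00\x11broadcast', 'd11on']"
matches = pchars.findall(data)

# %r shows the escapes instead of printing raw control characters.
print("found pchars : %r" % (matches,))
```

If only a restricted range of values is acceptable for the last byte, a character class such as `[\x10-\x1f]` in place of `.` would narrow the match.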


Thanks 

----------------------------------------
> Date: Sat, 28 Mar 2015 20:21:09 -0400
> From: davea at davea.name
> To: tutor at python.org
> Subject: Re: [Tutor] escape character regex
>
> On 03/28/2015 03:37 PM, Ian D wrote:
>> Hi
>>
>>
>> I run a regex like this:
>>
>>> pchars = re.compile('\x00\x00\x00') #with or without 'r' for raw
>
> Which one did you actually want? The 3 byte sequence consisting of
> nulls, or the 12 byte one containing zeroes and backslashes? I'm going
> to assume the former, in which case you cannot use 'r' for raw. Unless
> you've got a null key on your keyboard.
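
For what it is worth, a small sketch of the difference between the two spellings. The non-raw literal holds 3 characters and the raw one holds 12, but as far as I can tell from the re documentation the regex parser understands \xhh escapes on its own, which would explain why both forms found the same match. The variable names are only for illustration.

```python
import re

plain = '\x00\x00\x00'    # Python turns each \x00 into one null character
raw = r'\x00\x00\x00'     # backslashes are kept; re decodes each \x00 itself

lengths = (len(plain), len(raw))
print(lengths)            # (3, 12)

data = "['broadcast', 'd8on\x00\x00\x00\x11broadcast', 'd11on']"
found_plain = re.compile(plain).findall(data)
found_raw = re.compile(raw).findall(data)

# Both patterns describe three null characters, so the results agree.
print(found_plain == found_raw)   # True
```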
>
>>
>> on a string like this:
>>
>>> data = "['broadcast', 'd8on\x00\x00\x00\x11broadcast', 'd11on']"
>>
>>> print "found pchars :",pchars.findall(data)
>>
>> which returns:
>>
>>> found pchars : ['\x00\x00\x00']
>>
>>
>> But if I try to match the extra digits at the end like this:
>>
>>> pchars = re.compile('\x00\x00\x00\x\d+')
>>
>> I get an error:
>>
>>> ValueError: invalid \x escape
>
> The \x escape sequence must be followed by exactly two hex digits, and
> forms a single byte from them. What did you want that byte to be, and
> why didn't you specify it?
>
>>
>> Or if I use an IDE other than IDLE, it actually flags it as an "illegal hexadecimal escape sequence".
>>
>
> The question is not what the various IDEs produce, but what the Python
> compiler produces. So once you started getting errors, you really
> should have just run it in the interactive interpreter, without IDEs
> second-guessing you. Anyway, in 2.7.6's interactive interpreter, I get:
>
>>>> a = '\x00\x00\x00\x\d+'
> ValueError: invalid \x escape
>>>>
>
> So it has nothing to do with re, and is simply the result of trying an
> invalid string literal.
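
A short sketch of that point, with the bad assignment held as text and handed to the built-in compile() so that the example itself still runs. The name `source` is just for illustration. Python 2.7 reports a ValueError, as shown above; as far as I know Python 3 reports the same problem as a SyntaxError, so the sketch catches both.

```python
# The offending assignment, held as text so this example itself can run.
source = r"a = '\x00\x00\x00\x\d+'"

error = None
try:
    # compile() only parses the statement; the re module is never involved.
    compile(source, '<example>', 'exec')
except (SyntaxError, ValueError) as exc:
    error = exc

# Prints the exception type raised by the parser alone.
print(type(error).__name__)
```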
>
> What string were you hoping to get? You mention you wanted to match
> digits at the end (end of what?). Perhaps you wanted a real backslash
> followed by the letter d. In that case, since you cannot use a raw
> string (see my first response paragraph), you need to double the backslash.
>
>>>> a = '\x00\x00\x00\\d+'
>>>> print a
> \d+
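
To make clear what that doubled-backslash pattern would and would not match, here is a short sketch. The first sample string is made up for illustration; the second is the fragment from the data in this thread. The pattern asks for three nulls followed by decimal digit characters, which is not the same thing as the single byte 0x11.

```python
import re

a = '\x00\x00\x00\\d+'
print(len(a))        # 6: three nulls, a backslash, 'd' and '+'

digits = re.compile(a)

# Hypothetical input where real digit characters follow the nulls.
m1 = digits.search('d8on\x00\x00\x0042rest')
# The fragment from this thread: the nulls are followed by the byte 0x11.
m2 = digits.search('d8on\x00\x00\x00\x11broadcast')

print(m1 is not None)   # True: '42' satisfies \d+
print(m2 is not None)   # False: 0x11 is a control character, not a digit
```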
>
>
> Your data is funny, too, since it almost looks like it might be a string
> representation of a Python list. But assuming you meant it exactly like
> it is, there is a funny control character following the nulls.
>>
>> How could I match the \x00\x00\x00\x11 portion of the string?
>>
>
> There are no digits in that portion of the string, so I'm not sure why
> you were earlier trying to match digits.
>
> Perhaps you meant you were trying to match the single control character
> x'11'. In that case, you'd want
>
> a = '\x00\x00\x00\x11'
> pchars = re.compile(a)
>
>
> But if you wanted to match an arbitrary character following the nulls,
> you'd want something different.
>
> I think you'd better supply several strings to match against, and show
> which ones you'd expect a match for.
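
Along the lines of that request, a sketch with a few sample strings and the result expected for each. Only the first sample comes from the data in this thread; the others are hypothetical. It assumes the goal is three nulls followed by one byte of any value, and uses a capture group to pull that byte out.

```python
import re

# Three nulls, then capture exactly one following character of any value.
pchars = re.compile('\x00\x00\x00(.)', re.DOTALL)

samples = [
    'd8on\x00\x00\x00\x11broadcast',   # from the data in this thread
    'd8on\x00\x00\x00\x7fbroadcast',   # hypothetical: different last byte
    'd8on\x00\x00\x00\nbroadcast',     # hypothetical: last byte is a newline
    'd8on\x00\x00broadcast',           # hypothetical: only two nulls
]

results = []
for s in samples:
    m = pchars.search(s)
    # Record the captured byte as two hex digits, or None when nothing matches.
    results.append('%02x' % ord(m.group(1)) if m else None)

print(results)   # ['11', '7f', '0a', None]
```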
>
> --
> DaveA
> _______________________________________________
> Tutor maillist - Tutor at python.org
> To unsubscribe or change subscription options:
> https://mail.python.org/mailman/listinfo/tutor 		 	   		  

