[Python-ideas] Python octal escape character encoding "wats"

Richard Damon Richard at Damon-Family.org
Sat Nov 10 08:08:59 EST 2018


On 11/9/18 11:19 PM, Steven D'Aprano wrote:
> On Sat, Nov 10, 2018 at 12:56:07PM +1100, Chris Angelico wrote:
>
>> Not ambiguous. It takes as many valid octal digits as it can.
> What is the rationale for that? Hex escapes don't.
>
> My guess is, "Because that's what C does". And C probably does it 
> because "Dennis Ritchie wanted to minimize the number of keypresses when 
> he was typing" :-)
>
>
>> "Up to" means that one or two digits can also define a character. For
>> obvious reasons, it has to take digits greedily (otherwise "\777"
>> would be "\x07" followed by "77"), and it's not an error to have fewer
>> digits.
> In hindsight, I think we should have insisted that octal escapes must 
> always be three digits, just as hex escapes are always two. The status 
> quo has too much magical "Do What I Mean" in it for my liking:
>
> py> '\509\51'  # pair of brackets surrounding a nine
> '(9)'
> py> '\507\51'  # pair of brackets surrounding a seven
> 'G)'
>
> Dammit Python, that's not what I meant!
>
Since the 'normal' usage for octal escapes in C (which came long before
hex escapes) was to input control characters, the most likely being \0
and the next most likely \33 (Escape), with the great majority falling
in the range \0 - \37, requiring three digits every time would have
been very inconvenient. You would never use the escape for a printable
character and interleave it with other printable characters.

Yes, if you are putting in codes for a string of arbitrary byte values
using escapes, then you would likely always use 3 digits for
readability, but then you don't have the ambiguity as EVERY code is an
escape.
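
As a small illustration of that all-escapes case (a sketch only, with
variable names made up for the example), a Python bytes literal that
spells out every byte as a three-digit octal escape leaves nothing for
an escape to absorb by accident:

```python
# Every byte is written as a full three-digit octal escape, so no
# escape can accidentally swallow a following digit.
octal_form = b"\033\133\061\155"   # ESC [ 1 m

# The same four bytes written with a hex escape and plain characters.
mixed_form = b"\x1b[1m"

print(octal_form == mixed_form)    # True
print(list(octal_form))            # [27, 91, 49, 109]
```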

The one case where you might hit the problem is a control character
(like Escape) followed by a digit between 0 and 7: there you needed to
expand the escape to 3 digits. This was just one of the traps you
learned to live with (and terminal escape codes seemed to avoid the
issue by normally following the escape character with a non-digit
character.)
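
The trap can be sketched in Python, which follows the same greedy rule
for str literals (the variable names here are just for illustration):

```python
# Two-digit escape followed by a digit: the parser greedily takes all
# three octal digits, so this is ONE character (octal 337), not ESC + "7".
greedy = "\337"

# Padding the escape to three digits ends it before the "7".
padded = "\0337"

print(len(greedy), oct(ord(greedy)))   # 1 0o337
print(padded == "\x1b" + "7")          # True

# Followed by a non-digit such as "[", the short form is unambiguous.
print("\33[0m" == "\x1b[0m")           # True
```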


-- 
Richard Damon


