[Python-Dev] Draft Guide for code migration and modernization
Walter Dörwald
walter@livinglogic.de
Tue, 04 Jun 2002 16:06:29 +0200
Guido van Rossum wrote:
>>string.zfill has a "decadent feature": It also works for
>>non-string objects by calling repr before formatting.
>
>
> Hm, but repr() was the wrong thing to call here anyway. :-(
The old code used `x`. Should we change it to use str()?
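To make the repr()-versus-str() question concrete, here is a minimal sketch in present-day Python, where zfill only exists as a string method. The helper names zfill_repr and zfill_str are hypothetical, and Decimal is just a stand-in for any non-string object whose repr() and str() differ; none of this is the actual string.zfill source.

```python
from decimal import Decimal

def zfill_repr(x, width):
    """Hypothetical helper: non-strings go through repr(), like the old `x`."""
    s = x if isinstance(x, str) else repr(x)
    return s.zfill(width)

def zfill_str(x, width):
    """Hypothetical helper: non-strings go through str() instead."""
    s = x if isinstance(x, str) else str(x)
    return s.zfill(width)

value = Decimal("1.5")
print(zfill_repr(value, 8))  # pads the text "Decimal('1.5')", which is already wider than 8
print(zfill_str(value, 8))   # pads the text "1.5"
print(zfill_str(42, 5))      # plain integers look the same either way
```

For objects like ints the two agree, so the difference only shows up for types whose repr() carries extra decoration.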
>>> c in string.whitespace --> c.isspace()
>>
>>This changes the meaning slightly for unicode characters, because
>>chr(i).isspace() != unichr(i).isspace()
>>for i in { 0x1c, 0x1d, 0x1e, 0x1f, 0x85, 0xa0 }
>
>
> That's unfortunate, because I'd like unicode to be an extension of
> ASCII also in this kind of functionality. What are these and why are
> they considered spaces?
http://www.unicode.org/Public/UNIDATA/NamesList.txt says:
001C    <control>
        = INFORMATION SEPARATOR FOUR
        = file separator (FS)
001D    <control>
        = INFORMATION SEPARATOR THREE
        = group separator (GS)
001E    <control>
        = INFORMATION SEPARATOR TWO
        = record separator (RS)
001F    <control>
        = INFORMATION SEPARATOR ONE
        = unit separator (US)
0085    <control>
        = NEXT LINE (NEL)
00A0    NO-BREAK SPACE
        x (space - 0020)
        x (figure space - 2007)
        x (narrow no-break space - 202F)
        x (word joiner - 2060)
        x (zero width no-break space - FEFF)
        # <noBreak> 0020
> Would it hurt to make them spaces in ASCII
> too?
stringobject.c::string_isspace() currently uses the isspace()
function from <ctype.h>.
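The mismatch can be reproduced with a short loop. This is only a sketch under an assumption: it uses Python 3, where bytes plays the role of the 8-bit string and str the role of the unicode string. Unlike the <ctype.h> isspace() mentioned above, Python 3's bytes.isspace() is ASCII-only and does not depend on the locale, so the result is not necessarily what a 2.2 build would report.

```python
# Assumption: Python 3, with bytes standing in for the 8-bit string type
# and str standing in for the unicode type discussed in this thread.
differing = [
    i for i in range(256)
    if bytes([i]).isspace() != chr(i).isspace()
]

# Show the code points where the two isspace() answers disagree.
print([hex(i) for i in differing])
```

On such an interpreter the loop should report the same six code points listed earlier.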
>>New ones:
>>
>>Pattern: "foobar"[:3] == "foo" -> "foobar".startswith("foo")
>> "foobar"[-3:] == "bar" -> "foobar".endswith("bar")
>>Version: ??? (It was added on the string_methods branch)
>
>
> 2.0.
>
>
>>Benefits: Faster because no slice has to be created.
>> No danger of miscounting.
>>Locating: grep "\[\w*-[0-9]*\w*:\w*\]" | grep "=="
>> grep "\[\w*:\w*[0-9]*\w*\]" | grep "=="
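A small sketch of the rewrite described in this pattern, including the kind of miscount the slice form allows. The variable names are made up for illustration.

```python
s = "foobar"

# Slice-based tests: the length of the constant has to be counted by hand.
old_prefix = s[:3] == "foo"
old_suffix = s[-3:] == "bar"

# Method-based tests: no slice object is built and there is nothing to count.
new_prefix = s.startswith("foo")
new_suffix = s.endswith("bar")

# A miscounted slice fails silently instead of raising an error.
miscounted = s[:2] == "foo"

# Corner case: with an empty suffix, s[-0:] is the whole string,
# so the computed-slice form disagrees with endswith().
suffix = ""
slice_empty = s[-len(suffix):] == suffix
method_empty = s.endswith(suffix)

print(old_prefix, new_prefix, old_suffix, new_suffix)
print(miscounted, slice_empty, method_empty)
```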
>
>
> Are these regexes really worth making part of the migration guide?
> \w* isn't a good pattern to catch an arbitrary expression, it only
> catches simple identifiers!
Ouch, that was meant to be:
    grep "\[[[:space:]]*-[[:digit:]]*[[:space:]]*:[[:space:]]*\]" | grep "=="
    grep "\[[[:space:]]*:[[:space:]]*[[:digit:]]*[[:space:]]*\]" | grep "=="
This doesn't find "foobar"[-len("bar"):]=="bar", only constants.
But at least it's a little better than vgrep. ;)
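For anyone who would rather run the search from Python, here is a rough translation of the two corrected grep pipelines into the re module. It is a sketch, not part of the guide: \s and \d are used as stand-ins for [[:space:]] and [[:digit:]], and the function name candidate_lines and the sample lines are invented for the example.

```python
import re

# Assumed translations of the two POSIX patterns above.
SUFFIX_SLICE = re.compile(r"\[\s*-\d*\s*:\s*\]")   # e.g. [-3:]
PREFIX_SLICE = re.compile(r"\[\s*:\s*\d*\s*\]")    # e.g. [:3]

def candidate_lines(lines):
    """Return lines that compare with == and contain a constant slice."""
    return [
        line for line in lines
        if "==" in line
        and (SUFFIX_SLICE.search(line) or PREFIX_SLICE.search(line))
    ]

sample = [
    'if name[:3] == "foo":',
    'if name[-3:] == "bar":',
    'if name[-len("bar"):] == "bar":',   # computed bound: not found
    'tail = name[-3:]',                  # no comparison: filtered out
]

for line in candidate_lines(sample):
    print(line)
```

As with the grep version, only slices with constant bounds are reported; the len("bar") form slips through.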
Bye,
Walter Dörwald