[Python-Dev] String methods... finally
Mark Hammond
MHammond at skippinet.com.au
Wed Jun 16 00:53:00 CEST 1999
[Before I start: Skip mentioned "why L, not U". I know C/C++ uses L,
presumably to denote a "long" string (presumably keeping the analogy
between int and long ints). I guess Java has no such indicator, being
native Unicode?
Is there any sort of agreement that Python will use L"..." to denote
Unicode strings? I would be happy with it.
Also, should:
print L"foo" -> 'foo'
and
print `L"foo"` -> L'foo'
I would like to know if there is agreement for this, so I can change the
Pythonwin implementation of Unicode now to make things more seamless later.
]
> >> space = " "
> >> foo = L"foo"
> >> bar = L"bar"
> >> result = space.join((foo, bar))
>
> > The same should happen as for L"foo" + " " + L"bar".
I must admit Guido's position has real appeal, even if just from a
documentation POV. E.g., join can be defined as:
sep.join([s1, ..., sn])
Returns s1 + sep + s2 + sep + ... + sep + sn
Nice and simple to define and understand. Thus, if you can't add 2 items,
you can't join them.
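
To make the "join is just repeated addition" definition concrete, here is a minimal sketch in modern Python. The name join_by_addition is hypothetical and only for illustration; it is not how str.join is actually implemented. Returning sep[:0] for an empty list is my own assumption about what the empty case should produce.

```python
from functools import reduce


def join_by_addition(sep, items):
    """Hypothetical helper: s1 + sep + s2 + ... + sep + sn, using only +."""
    items = list(items)
    if not items:
        # Assumption: an empty join yields an empty value of the separator's type.
        return sep[:0]
    # Fold the remaining items onto the first one, adding sep between each pair.
    return reduce(lambda acc, s: acc + sep + s, items[1:], items[0])


result = join_by_addition(" ", ["foo", "bar"])
```

Under this definition, anything that cannot be added to the separator (an int in the list, say) fails with the same TypeError that the plain + expression would raise, which is exactly the "if you can't add them, you can't join them" rule.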
Assuming the Unicode changes allow us to say:
assert " " == L" ", "eek"
assert L" " + "" == L" "
assert " " + L"" == L" " # or even if this == " "
Then this still works well in a Unicode environment; Unicode and strings
could be mixed in the list, and as long as you understand what L" " + ""
returns, you will understand immediately what the result of join() is going
to be.
> The attraction of auto-conversion for me is that I had never once seen
> string.join blow up where the exception revealed a conceptual error; in
> every case conversion to string was the intent, and an obvious one at that.
OTOH, my gut tells me this is better - that an implicit conversion to the
separator type be performed. Also, it appears that this technique will
never surprise anyone in a bad way. It seems the rule above, while simple,
basically means "sep.join can only take string/Unicode objects", as all
other objects will currently fail the add test. So, given that our rule is
that the objects must all be strings, how can it hurt to help the user
conform?
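
For comparison, a rough sketch of the auto-converting alternative, again in modern Python. The name join_with_conversion is hypothetical, and I am assuming a str separator, so "convert to the separator type" reduces to calling str() on each item. This is not what str.join does today; it only shows the behaviour being argued for.

```python
def join_with_conversion(sep, items):
    """Hypothetical helper: convert each item to text, then join with sep."""
    # Assumption: sep is a str, so conversion to the separator type is str().
    return sep.join(str(item) for item in items)


mixed = join_with_conversion(" ", ["foo", 1, 2.5])
```

The trade-off is visible here: the call never blows up on non-string items, but it also no longer matches the simple s1 + sep + ... + sn definition, since "foo" + " " + 1 is still an error.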
> off-to-work-ly y'rs - tim
where-i-should-be-instead-of-writing-rambling-mails-ly,
Mark.