Re: [I18n-sig] Unicode experience
Marc-Andre Lemburg replied:
I think this would be helpful to have in the std library. Note that in JPython, you'd already use str() for this, and in Python 3000 this may also be the case. At some point in the design discussion for the current Unicode support we also thought that we wanted str() to do this (i.e. allow 8-bit and Unicode string returns), until we realized that there were too many places that would be very unhappy if str() returned a Unicode string!

The problem is similar to a situation you have with numbers: sometimes you want a coercion that converts everything to float except it should leave complex numbers complex. In other words it coerces up to float but it never coerces down to float. Luckily you can write that as "x+0.0", which converts int and long to float with the same value while leaving complex alone.

For strings there is no compact notation like "+0.0" if you want to convert to string or Unicode -- adding "" might work in Perl, but not in Python.

I propose ustr(x) with the semantics given by Toby. Class support (an __ustr__ method, with fallbacks on __str__ and __unicode__) would also be handy.
Yes, that's what we need. Thanks to Toby for pioneering this! --Guido van Rossum (home page: http://dinsdale.python.org/~guido/)
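The "x+0.0" idiom from the message above can be tried directly. This is a minimal sketch in Python 3 syntax, which is an assumption on my part since the thread predates it (Python 3 also has no separate long type); the helper name coerce_up_to_float is hypothetical.

```python
def coerce_up_to_float(x):
    """Coerce ints up to float while leaving complex values complex."""
    # Adding a float zero promotes int to float; complex + float stays complex.
    return x + 0.0


print(type(coerce_up_to_float(3)).__name__)       # float
print(type(coerce_up_to_float(2.5)).__name__)     # float
print(type(coerce_up_to_float(2 + 3j)).__name__)  # complex
```

Nothing comparable exists for strings, which is the gap the proposed ustr() is meant to fill.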
guido wrote:
I propose ustr(x) with the semantics given by Toby.
+1 on concept. not sure about the name and the semantics.

maybe a better name would be "unistr" (to match "unichr"). or maybe that's backwards? how about "String" (!). (the perfect name is "string", but that appears to be reserved by someone else...)

as for the semantics, note that __str__ is allowed to return a unicode string in the current code base ("str" converts it to 8-bit using the default encoding). ustr/unistr/String should pass that one right through:

    def ustr(s):
        # 8-bit and Unicode strings pass through unchanged
        if type(s) in (type(""), type(u"")):
            return s
        # anything else is stringified; __str__ may return either type
        s = s.__str__()
        if type(s) in (type(""), type(u"")):
            return s
        raise TypeError("__str__ returned wrong type")
Class support (an __ustr__ method, with fallbacks on __str__ and __unicode__) would also be handy.
-0 on this one. __str__ can already return either type, and if the goal is to get rid of both unichr and unistr in the future, we shouldn't add more hooks if we can avoid it. it's easier to remove stuff if you don't add it in the first place ;-) </F>
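For readers on a current interpreter, here is a rough Python 3 analogue of the ustr() semantics discussed above. It is only a sketch under one stated assumption: bytes stands in for the 8-bit string type and str for the Unicode type. It is not the function that was eventually added to any library.

```python
def ustr(obj):
    """Return obj unchanged if it is already a string type, else stringify it."""
    # Assumption: bytes plays the role of the 8-bit string, str of Unicode.
    if isinstance(obj, (str, bytes)):
        return obj
    # Call __str__ through the type so that either string type may come back.
    result = type(obj).__str__(obj)
    if isinstance(result, (str, bytes)):
        return result
    raise TypeError("__str__ returned wrong type")


class RawName:
    """Hypothetical class whose __str__ returns the 8-bit flavour."""

    def __str__(self):
        return b"raw"


print(ustr(42))         # 42
print(ustr(RawName()))  # b'raw'
```

Calling str(RawName()) would raise a TypeError in Python 3, so the pass-through behaviour only shows up because the sketch calls __str__ directly.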
Fredrik Lundh wrote:
Uhm, what's left then ;-) ?
Agreed. I'm just adding coercion support for instances using that technique: instances defining __str__ can return Unicode objects, which will then be used by the implementation wherever coercion to Unicode takes place. I'll add a similar hook to unicode().
--
Marc-Andre Lemburg
______________________________________________________________________
Business: http://www.lemburg.com/
Python Pages: http://www.lemburg.com/python/
Guido van Rossum wrote:
Actually, these days, foo+"" works in a LOT of languages. Perl, Java and JavaScript for sure. C++, depending on the type. Python's strictness about this issue has never caught a bug for me. It has only caused errors. Okay, some newbie may think that "5"+"5"=="10". But they could also expect [5]+[5]==[10]. There are limits to the extent that we can protect them from incorrect mental models. -- Paul Prescod - Not encumbered by corporate consensus Pop stars come and pop stars go, but amid all this change there is one eternal truth: Whenever Bob Dylan writes a song about a guy, the guy is guilty as sin. - http://www.nj.com/page1/ledger/e2efc7.html
[me]
[PaulPrescod]
Actually, these days, foo+"" works in a LOT of languages. Perl, Java and JavaScript for sure.
Really? Does 3+"" really convert the 3 to a string in Java?
Are you sure? This is the kind of error where you immediately see what's wrong and move on to the next bug.
I won't argue with you there :-) --Guido van Rossum (home page: http://dinsdale.python.org/~guido/)
[Guido]
BTW, "+0.0" is not a correct way to "boundedly coerce" to float under 754 arithmetic; "*1.0" is safer (the former does not always preserve the sign bit of a float zero correctly, but the latter does).
[Paul Prescod]
Actually, these days, foo+"" works in a LOT of languages. Perl, Java and JavaScript for sure.
[Guido]
Really? Does 3+"" really convert the 3 to a string in Java?
I don't remember about that specifically, but believe ""+3 does. OTOH, in *Perl* 3+"" converts "" to the *number* 0 and leaves 3 alone.
Python's strictness about this issue has never caught a bug for me. It has only caused errors.
Are you sure? This is the kind of error where you immediately see what's wrong and move on to the next bug.
It's certainly caught errors for me, and especially when introducing Perl programmers to Python, where "they expect" string+number to convert the string to a number, apparently the opposite of the arbitrary choice Paul prefers. It's ambiguous as hell -- screw it.
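The sign-of-zero remark at the top of this message is easy to check. A small sketch in Python 3 syntax (an assumption, as above), using math.copysign to expose the sign bit that a plain print of 0.0 versus -0.0 can hide; the helper name sign_of is hypothetical.

```python
import math


def sign_of(x):
    """Return 1.0 or -1.0 according to the sign bit of x, including zeros."""
    return math.copysign(1.0, x)


negative_zero = -0.0

via_add = negative_zero + 0.0  # (-0.0) + (+0.0) rounds to +0.0 under IEEE 754
via_mul = negative_zero * 1.0  # multiplication keeps the sign bit

print(sign_of(via_add))  # 1.0
print(sign_of(via_mul))  # -1.0
```

Both spellings still promote an int to float and leave a complex operand complex; only the treatment of a negative float zero differs.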
On Fri, 7 Jul 2000, Tim Peters wrote:
It's not arbitrary -- the decision is made according to the type of the *operator* rather than the type of the operands.

    anything + anything returns a number
    anything . anything returns a string

So "34"+0 converts to a number and 34."" converts to a string (i've seen both idioms fairly often).

Anyway, i still agree that it's best to avoid automatic coercion between numbers and strings -- since it's now very easy to say int("3") as opposed to import string; string.atoi("3"), there's really no excuse for not being explicit.

-- ?!ng

"Things are more like they are now than they ever were before." -- Dwight D. Eisenhower
[Tim]
[Ping]
Of the languages discussed, this is true only in Perl. The other languages map "+" to the string meaning, so it's arbitrary wrt the universe under discussion. Toss Icon into the Perl camp on this one, btw. Within Perl, that "+" means numeric addition and "." string catenation is also arbitrary -- it could just as well have been the other way around. Perl simply aped Awk's arbitrary <wink> choice for what "+" means.
So "34"+0 converts to a number and 34."" converts to a string (i've seen both idioms fairly often).
Yes, & about as often as you see explicit str() or int() calls in Python. It's not a question of not needing the functionality, but of how to spell it (both!).
... there's really no excuse for not being explicit.
i-think-that's-what-i-said-the-first-time<wink>-ly y'rs - tim
Ping> So "34"+0 converts to a number and 34."" converts to a string
Ping> (i've seen both idioms fairly often).

Hallelujah! Never thought I'd get Perl programming help here! We have this nagging problem with our XML-RPC stuff. When a band or venue name consisting of all digits (like "311") comes into our concert database server from a Perl client, it comes in as a number instead of a string. The only workaround so far was to check at all server interfaces where this might occur and call str() when we found a number. Now it looks like I can toss the problem back into the client and save a few cpu cycles on the server (and more to the point, not pay the cost of the test all the time)...

Ka-Ping> Anyway, i still agree that it's best to avoid automatic
Ka-Ping> coercion between numbers and strings

Amen to that!

Skip
"SM" == Skip Montanaro <skip@mojam.com> writes:
Ping> So "34"+0 converts to a number and 34."" converts to a string
Ping> (i've seen both idioms fairly often).

SM> Hallelujah! Never thought I'd get Perl programming help here!

Talk about a low signal-to-noise ratio! You get insights into a lot more than Python on python-dev.

Jeremy
[Skip Montanaro]
Hallelujah! Never thought I'd get Perl programming help here!
Skip, if you can't get Perl help on Python-Dev, where *could* you get it?! Passing out Perl help is one of Python-Dev's most important functions. never-leave-home-ly y'rs - tim
Ka-Ping Yee wrote:
I want to be clear that I'm not asking for automatic coercion between numbers and strings but rather automatic coercion of any type to string. -- Paul Prescod - Not encumbered by corporate consensus Pop stars come and pop stars go, but amid all this change there is one eternal truth: Whenever Bob Dylan writes a song about a guy, the guy is guilty as sin. - http://www.nj.com/page1/ledger/e2efc7.html
On Fri, 7 Jul 2000, Paul Prescod wrote:
I want to be clear that I'm not asking for automatic coercion between numbers and strings but rather automatic coercion of any type to string.
As it stands, with both 8-bit strings and Unicode strings, i think this would result in too much hidden complexity -- thinking about this can wait until we have only one string type. We could consider this after we unify strings, but even then it would take some arguing to convince me that automatic coercion is desirable. -- ?!ng
Guido van Rossum wrote:
Really? Does 3+"" really convert the 3 to a string in Java?
    class foo {
        void foo() {
            // the int 5 is converted to the string "5" before concatenation
            System.err.println( 5+"" );
        }
    }
I am sure that 99% of the time when I get an error message trying to add a string to something, it is because I expect the thing to be automatically coerced to the string. This probably comes from the other languages I have used.
I don't see the choice as arbitrary. Perl's choice is just insane. :)

According to the definition used in Java, JavaScript and C++ (sometimes)

    x+y

If y is a string then x+y is well-defined no matter what the type of x or the content of y.

Under the Perl definition, it totally depends on the type of y and the contents of x. Insane!

I advocate special casing of strings (which are already special cased in various ways) whereas Perl special cases particular string values. Insane!

-- Paul Prescod - Not encumbered by corporate consensus Pop stars come and pop stars go, but amid all this change there is one eternal truth: Whenever Bob Dylan writes a song about a guy, the guy is guilty as sin. - http://www.nj.com/page1/ledger/e2efc7.html
On Fri, Jul 07, 2000 at 06:04:20PM -0500, Paul Prescod wrote:
According to the definition used in Java, Javascript and C++ (sometimes)
x+y
If y is a string then x+y is well-defined no matter what the type of x or the content of y.
The right hand side of the operation determines what the result is ? Somehow I find that confusing and counter-intuitive, but perhaps I'm too used to __add__ vs. __radd__ ;) Anyway, I've tried to learn C++ and Java both twice now, buying books by *good* authors (like Peter v/d Linden) and I still manage to run screaming after reading a single page of feature listings ;)
Under the Perl definition, it totally depends on the type of y and the contents of x. Insane!
Nono, absolutely not. Perl is weirdly-typed: the type is determined by the *operator*, not the operands. If you do '$x + $y', and both $x and $y are strings, they will both be converted to numbers (ending up 0 if it fails) and added as numbers, and the result is a number. (But since perl is weirdly-typed, if you need it to result in a string, the number will be converted to a string as necessary.) If you want string-concatenation, use '.', which will turn numbers into strings.

But Perl wouldn't be Perl if it hadn't *some* exceptions to this kind of rule ;) '$x++', where $x is a string, can return a string. The 'string-auto-increment' operator is Magic(tm): if the string matches /^[a-zA-Z]*[0-9]*$/, that is, something like 'spam001', the ++ operator adds one to the number at the end of the string, so 'spam001' would become 'spam002'. If you do it to 'spam999', it'll be incremented to 'span000', 'spaz999' becomes 'spba000', and so on.

However, Perl wouldn't be Perl if it hadn't some exception to *this* too ;) Well, not really an exception, but... The string *must* exactly match the regexp above, or the '++' operator will turn it into a number '1' instead. So your string cannot be '-spam01', or 'sp0m01', or anything like that. That renders this magic behaviour pretty useless in general, because it's easier, more readable and less buggy to do the work the magic ++ does by hand.

(We actually had some database 'corruption' in our old user-database because of this... a tool accepted a 'template' username and tried to create a specified number of accounts with a name in sequence to that template... The original code had a safeguard (the above regexp) but it got removed somehow, and eventually someone used a wrong template, and we ended up with accounts named '1' through '8' ;P)
Well, yes. But Perl special cases about everything and their dog. Not Good. (Okay, I promise to stay on-topic for a while now ;) -- Thomas Wouters <thomas@xs4all.net> Hi! I'm a .signature virus! copy me into your .signature file to help me spread!
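The string auto-increment rule described in this message can be modelled in a few lines. This is a hedged Python 3 sketch of the behaviour as stated here, not Perl's actual implementation: the function name magic_increment is hypothetical, and any string that fails the regexp is simply mapped to the number 1, as in the examples above (Perl's handling of strings with a leading numeric prefix is not modelled).

```python
import re

# The pattern quoted in the message: letters followed by digits, nothing else.
_MAGIC = re.compile(r"[a-zA-Z]*[0-9]*")


def magic_increment(s):
    """Model of Perl's magic ++ on a string, per the description above."""
    if s == "" or not _MAGIC.fullmatch(s):
        return 1  # simplification: treated as the number 0, then incremented
    if s.isdigit():
        return int(s) + 1  # an all-digit string is just a number
    chars = list(s)
    for i in range(len(chars) - 1, -1, -1):
        c = chars[i]
        if c == "9":
            chars[i] = "0"  # wrap and carry into the next position
        elif c == "z":
            chars[i] = "a"
        elif c == "Z":
            chars[i] = "A"
        else:
            chars[i] = chr(ord(c) + 1)  # no carry needed, so we are done
            return "".join(chars)
    # Carry ran off the front: grow the string with one more leading letter.
    return chars[0] + "".join(chars)


print(magic_increment("spam001"))  # spam002
print(magic_increment("spam999"))  # span000
print(magic_increment("spaz999"))  # spba000
print(magic_increment("sp0m01"))   # 1
```

The last call shows how the account-naming accident above could happen: once the template stops matching the pattern, every increment yields a plain number.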
Thomas Wouters wrote:
...
The right hand side of the operation determines what the result is ?
No, either side determines what the result is. This is not magic. It's just like floating point coercion or complex number coercion.
This can be easily defined in terms of add and radd:

    class PaulsString:
        def __init__( self, str ):
            self.str=str
        def __add__( self, foo ):
            return self.str + str( foo )
        def __radd__( self, bar ):
            return str( bar ) + self.str

    print PaulsString("abcdef")+5
    print open+PaulsString("ghijkl")

    abcdef5
    <built-in function open>ghijkl

Here's an even simpler definition:

    class PaulsString:
        def __init__( self, str ):
            self.str=str
        def __coerce__( self, other ):
            return (self.str, str( other ))

Ping says:
I don't see any hidden complexity. Python has features specifically designed to allow this sort of thing on a per-type basis. -- Paul Prescod - Not encumbered by corporate consensus Pop stars come and pop stars go, but amid all this change there is one eternal truth: Whenever Bob Dylan writes a song about a guy, the guy is guilty as sin. - http://www.nj.com/page1/ledger/e2efc7.html
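For anyone who wants to run the add/radd version of PaulsString today, here is the same idea in Python 3 syntax. Treat it as a sketch: the attribute is renamed to text so it no longer shadows the built-in str, and the __coerce__ variant is left out because that hook no longer exists in Python 3.

```python
class PaulsString:
    """String wrapper that stringifies whatever is added to it, on either side."""

    def __init__(self, text):
        self.text = text

    def __add__(self, other):
        # PaulsString + anything
        return self.text + str(other)

    def __radd__(self, other):
        # anything + PaulsString, used when the left operand cannot handle it
        return str(other) + self.text


print(PaulsString("abcdef") + 5)  # abcdef5
print(3.5 + PaulsString(" volts"))  # 3.5 volts
```

The coercion stays opt-in per type here: plain str objects are unaffected, so "abc" + 5 still raises a TypeError.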
On Sat, 8 Jul 2000, Paul Prescod wrote:
Well... what happens when the other operand is a Unicode string? Do you also do automatic coercion when adding something to a Unicode string? When you add one to an arbitrary object, how do you convert the other object into a Unicode string? When you add an 8-bit string and Unicode together, what do you get?

It's not that i don't think you might be able to come up with consistent rules. I just suspect that when you do, it might amount to more hidden stuff than i'm comfortable with.

Of course you could also just use Itpl.py :) or a built-in version of same (Am i half-serious? Half-kidding? Well, let's just throw it out there...). Instead of:

    print PaulsString("abcdef")+5
    print open+PaulsString("ghijkl")

with Itpl.py it would just be:

    printpl("abcdef${5}")
    printpl("${open}ghijkl")

A built-in Itpl-like operator might almost be justifiable, actually... i mean, we already have

    "%(open)sghijkl" % globals()

Well, i don't know. Perhaps it looks too frighteningly like Perl.

Anyway, the rules as implemented (see http://www.lfw.org/python/ for the actual Itpl.py module) are:

    1. $$ becomes $
    2. ${ } around any expression evaluates the expression
    3. $ followed by identifier, followed by zero or more of:
       a. .identifier
       b. [identifier]
       c. (identifier)
       evaluates the expression

What i'm getting at with this approach is that you are clear from the start that the goal is a string: you have this string thing, and you're going to insert some stringified expressions and objects into it. I think it's clearer & less error-prone for interpolation to be its own operation, rather than overloading +. It also means you could start with a Unicode string with $s in it, and you would be assured of ending up with a Unicode string, for example.

-- ?!ng
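To make the interpolation rules above concrete, here is a small Python 3 sketch that covers rules 1 and 2 plus the bare "$identifier" part of rule 3. It is an illustration under stated assumptions and not the Itpl.py implementation: the name interpolate is hypothetical, the trailing .identifier / [identifier] / (identifier) forms are not handled, and eval() makes it unsuitable for untrusted templates.

```python
import re

# Matches "$$", "${expression}" or "$identifier", in that order of preference.
_TOKEN = re.compile(r"\$\$|\$\{([^}]*)\}|\$([A-Za-z_][A-Za-z0-9_]*)")


def interpolate(template, namespace):
    """Substitute $-expressions in template, evaluated against namespace."""

    def replace(match):
        if match.group(0) == "$$":
            return "$"  # rule 1: $$ becomes $
        expression = match.group(1) if match.group(1) is not None else match.group(2)
        # Rules 2 and 3 (bare identifier only): evaluate, then stringify.
        return str(eval(expression, {}, namespace))

    return _TOKEN.sub(replace, template)


print(interpolate("abcdef${5}", {}))                   # abcdef5
print(interpolate("$$5 for $name", {"name": "spam"}))  # $5 for spam
```

Because the result is always built from the template, the caller knows up front that a string comes back, which is the property argued for above.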
participants (11)
- Fredrik Lundh
- Guido van Rossum
- Jeremy Hylton
- Ka-Ping Yee
- Ka-Ping Yee
- M.-A. Lemburg
- Paul Prescod
- Skip Montanaro
- Thomas Wouters
- Tim Peters
- Toby Dickenson