re: More spillover re the division PEP
Kirby writes -
I think Arthur puts too much emphasis on someone writing their own programs and following the unambiguous rules in the tutorial, vs. someone inheriting code written by others, and trying to puzzle out what's going on.
If we are disagreeing, I'm not sure where. I thought we were coming to similar places - just from different angles. Kirby also writes -
In any case, these are all just hats. I'm a newbie with respect to language A, casual user of B, non-user of C, pro with D. Always a newbie, always a pro - that's life in the big city.
Here we are disagreeing. I liked where you started better - newbie in the sense of new to programming is different from newbie in the sense of having prior programming experience but being new to Python. Quite different. I attempt to add a third category. Scripter.

You seem to disagree with my assertion that we need to be more explicit than "newbie" when discussing something like the division operator. Who stumbles, why, and what accommodations to their pre-existing expectations are possible or wise.

I think it is impossible to speak productively of "newbies" as a group in having such a conversation - because all the conversation ends up being about is different people's different definitions of the word "newbie", cast as a discussion about something else.

ART
If we are disagreeing, I'm not sure where.
I thought we were coming to similar places - just from different angles.
I think I was responding to your sentiment that we don't want to pander to people who don't make it to chapter 3, i.e. if someone really learns the rules completely, then a/b, as currently defined, shouldn't be problematic. After all, it's all spelled out in black and white. You don't want Python to be dumbed down as it panders to some illusory "newbie" audience, a.k.a. people without the discipline to do enough learning. In the long run, it's better to have a steep on ramp than a dumbed down language.

Whereas I agree with the sentiment, what I'm saying is that a/b is less problematic from a typical newbie standpoint of just trying to get something done in the language on one's own time. Typically, the posture of a newbie is: I want to program something that will do X, i.e. is one of writing from scratch. As such, the newbie is both the author and debugger of her own code, and therefore has deeper knowledge than lexemes like a/b taken in isolation. You can presume more insight and comprehension, because the reader is also the writer.

Further along in a typical programming learning curve, you're not just writing your own stuff, but joining in teams, or inheriting code from others. You start eyeballing these vast programs, with thousands of lines of code, none of which you wrote. This is where you start to feel real gratitude for comments, clear documentation, cross-references. And this is where a/b starts to be problematic, because you don't know if it's losing information, returning 0 for 1/2, 3/5, 6/7, 8/9... because you don't necessarily have all the clues you need to decide the types of a and b. Whereas if the proposed change is implemented, then I know for sure that a/b is a float-style division, and a//b isn't.

It matters more with division, because a-b, a+b and a*b, if a,b are integers, usually don't give essentially different results than if a,b were floats. But with /, the current behavior is radically different either way.

So it's not a matter of not understanding what / does for reason of not having read and/or understood the docs, it's a matter of knowing full well how / behaves, and finding this a stumbling block, especially because, unlike C or Java (especially Java), you don't have to declare the types of your arguments. In Java, you would go something like:

    int myfunc(int x, int y) {
        return x / y;
    }

Fine. myfunc expects two integers and returns an integer. No problem telling how / will behave. But in Python:

    def myfunc(x, y):
        return x/y

How am I supposed to know what this does? It's not a matter of not having studied the docs, it's just that there's a lot more ambiguity built in to the language (by design). To compensate for that ambiguity (an advantage, a selling point), we maybe need to be more explicit about what / is doing right where it's being used. This is what the PEP is aimed at fixing.

Where I think we're agreeing is on the point that this has little to do with newbies vs. pros. However, I'm adding that I think the limitations of the current / are more likely to trip up a pro than a dedicated, serious-minded newbie, simply because pros are often immersed in the mundane drudgery of scanning large programs written by others, whereas the paradigm newbie is often more blissfully self-involved in pet projects, teaching herself the language, but not forced to rub her nose in a lot of code by others (except maybe the short examples offered in the tutorials and "How to" books).

Kirby
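As a minimal sketch of the ambiguity described above, here is how the two operators behave on a current Python 3 interpreter, where the PEP's semantics are the default (that interpreter is an assumption; under the Python of this thread, `/` on two integers truncated). The function names `true_ratio` and `floor_ratio` are hypothetical, chosen only for illustration.

```python
def true_ratio(x, y):
    # In Python 3, '/' is true division regardless of operand types.
    return x / y


def floor_ratio(x, y):
    # '//' is floor division: the fractional part is discarded.
    return x // y


# The integer pairs mentioned in the message above.
for a, b in [(1, 2), (3, 5), (6, 7), (8, 9)]:
    print(a, b, true_ratio(a, b), floor_ratio(a, b))
```

With both spellings available, a reader of `x / y` no longer needs to know the argument types to tell whether information is being dropped.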
I hate to continue this thread, but I have a point I haven't seen yet:

Ask a grade-school student who is at the point where she can divide integers, but hasn't learned about fractions or decimals, "What's 10 divided by 3?" She will answer, "3".

Students initially learn division as a lossy operator, and only later (though not much later) learn that it can do more. Students don't learn that this 'lossiness' is wrong -- they learn that it is one sense of division; one that is useful for long division and writing mixed fractions, for example.

Similarly, those learning Python should learn the 'lossy' operator first, and should learn that the 'lossy' behavior is useful in many situations. They should learn about the 'lossless' form of division shortly thereafter, learn why they both exist, and learn that even 'lossless' division loses something on a computer.

I agree that the current notation for 'lossless' division is clumsy and non-intuitive. I feel that changing the semantics of an existing operator will cause more compatibility problems[1] than adding a new operator with new semantics. So, if I were king of my own little universe, Python would get a new '//' operator which does 'lossless' division, and the '/' operator would retain its original semantics.

'course, I'm no king. :-)

Dustin

[1] Note there are two flavors of compatibility problems: old scripts on a new interpreter, and new scripts on an old interpreter. To maximize usability of the language, both combinations are important.

)O(
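The remark that even 'lossless' division loses something on a computer can be illustrated with a short sketch. It assumes a current Python 3 interpreter and uses the standard-library `fractions` module as the exact reference; neither appears in the original message.

```python
from fractions import Fraction

lossy = 10 // 3            # integer ("lossy") division
floating = 10 / 3          # float division: close to 10/3, but not exact
exact = Fraction(10, 3)    # exact rational arithmetic, for comparison

print(lossy)
print(floating)
print(exact)

# The float is only the nearest binary fraction to 10/3.
print(Fraction(floating) == exact)   # False

# The exact rational multiplies back to 10 with nothing lost.
print(exact * 3 == 10)               # True
```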
Ask a grade-school student who is at the point where she can divide integers, but hasn't learned about fractions or decimals, "What's 10 divided by 3?" She will answer, "3".
Grade school students learn long division, which is divmod() in Python - it returns a quotient and a remainder. So she will answer (3,1) ;)

Dave
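For reference, a minimal sketch of the `divmod()` point, written for a current Python 3 interpreter (an assumption; the built-in itself is unchanged from the Python of this thread):

```python
# divmod returns the quotient and the remainder in one call.
quotient, remainder = divmod(10, 3)
print(quotient, remainder)   # 3 1

# Together they reconstruct the original number, so nothing is lost.
assert quotient * 3 + remainder == 10

# The pair is the same as floor division plus modulo.
assert (10 // 3, 10 % 3) == (quotient, remainder)
```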
I agree. They say "3 remainder 1" -- gradeschoolers don't just toss the remainder. Although integer division is useful, it's more obscure than regular floating point. Even though converting / to floating, and using // for integer return is going to break more code, I think it's better in the long run, for all the new code as yet unwritten. There's some pain in changing such a basic, primitive feature in a language already in production. Going to case-insensitivity would have been an even bigger nightmare. I'm OK with Guido pushing ahead with the above, and am glad the CI issue is being dropped. Kirby
When teaching Python, do we begin with a treatment of low level representations of data? Or is that saved for a later time? For example:
>>> import struct,array,sys
>>> sys.byteorder
'little'
>>> m = array.array('I','THIS')
>>> m
array('I', [1397311572L])
>>> k = [hex(ord(i))[2:] for i in 'THIS']
>>> k.reverse()
>>> k
['53', '49', '48', '54']
>>> k = "".join(k)
>>> k
'53494854'
>>> eval('0x'+k)
1397311572
>>> struct.pack('i',1397311572)
'THIS'
Basically, I'm taking the string 'THIS' and converting it to an integer. The way this is done internally corresponds to concatenating 4 hex bytes in reverse order, i.e. 'T' = 0x54:
>>> ord('T')
84
>>> hex(84)
'0x54'
Link those bytes to return a bigger number in decimal form (1397311572). Then tell struct it's reading an integer, which it returns as a corresponding string: lo and behold, our THIS is returned to us. Kirby
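The same round trip can be sketched for a current Python 3 interpreter, which is an assumption on the editor's part: there the data must be a `bytes` object rather than a `str`, and the byte order can be stated explicitly with `'<'` instead of relying on the machine's native order.

```python
import struct

data = b"THIS"   # Python 3 needs bytes here, not str

# '<I' reads the four bytes as a little-endian unsigned 32-bit integer.
(value,) = struct.unpack("<I", data)
print(value)        # 1397311572

# The same conversion without struct.
assert int.from_bytes(data, "little") == value

# The hex digits show the bytes in reverse order: 'S', 'I', 'H', 'T'.
print(hex(value))   # 0x53494854

# Packing the integer the same way returns the original bytes.
assert struct.pack("<I", value) == data
```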
Another way to look at the relationship between bytes, integers, characters:
>>> import binascii
>>> [hex(i) for i in array.array('b',"THIS")]
['0x54', '0x48', '0x49', '0x53']
>>> binascii.hexlify("THIS")
'54484953'
>>> eval("0x"+binascii.hexlify("THIS"))
1414023507
>>> struct.pack('i',1414023507)
'SIHT'
hexlify strings the hex bytes together in the order they appear in the string. That is the reverse of the order the 'i' type uses on this little-endian machine, so packing the resulting integer gives back the four-byte string in reverse order. 'h' works the same way on byte pairs:
>>> binascii.hexlify("MY")
'4d59'
>>> [hex(i) for i in array.array('b',"MY")]
['0x4d', '0x59']
>>> eval("0x"+binascii.hexlify("MY"))
19801
>>> struct.pack('h',19801)
'YM'
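The reversal seen above comes from byte order. A sketch for a current Python 3 interpreter (an assumption) makes the two orders explicit with the `struct` prefixes `'<'` (little-endian) and `'>'` (big-endian) instead of the native order used in the session:

```python
import binascii
import struct

data = b"THIS"

# hexlify keeps the bytes in written order, which is a big-endian reading.
as_big = int(binascii.hexlify(data), 16)
assert as_big == int.from_bytes(data, "big") == 1414023507

# Little-endian packing reverses the bytes; big-endian packing restores them.
print(struct.pack("<i", as_big))   # b'SIHT'
print(struct.pack(">i", as_big))   # b'THIS'

# The same holds for two-byte shorts ('h').
short = int.from_bytes(b"MY", "big")   # 0x4d59 == 19801
print(struct.pack("<h", short))    # b'YM'
print(struct.pack(">h", short))    # b'MY'
```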
You may also be able to go "double-wide" with 8-byte mappings of characters to double-precision floats (type 'd'):
>>> array.array('d',"ABCDEFHI")
array('d', [1.0826786300000142e+045])
>>> struct.pack('d',1.0826786300000142e+045)
'ABCDEFHI'
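A Python 3 sketch of the same 8-byte mapping, assuming an explicit little-endian format (`'<d'`) to match the machine used in the session:

```python
import struct

data = b"ABCDEFHI"   # the same eight bytes used in the session above

# Read the eight bytes as one little-endian IEEE-754 double.
(x,) = struct.unpack("<d", data)
print(x)

# Packing the double the same way returns the original bytes.
assert struct.pack("<d", x) == data

# Read big-endian, the same bytes describe an entirely different number.
(y,) = struct.unpack(">d", data)
print(y)
```

The round trip works here because these bytes happen to encode an ordinary finite double; arbitrary byte strings can also decode to NaN or infinity.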
Question: Why does float -> long work like this:

>>> long(1.0826786300000142e+045)
1082678630000014218353234260713996413124476928L

and not like this?

>>> long(1.0826786300000142e+045)
1082678630000014200000000000000000000000000000L

Kirby
[Kirby Urner]
... Question:
Why does float -> long work like this:
>>> long(1.0826786300000142e+045)
1082678630000014218353234260713996413124476928L
and not like this?
>>> long(1.0826786300000142e+045)
1082678630000014200000000000000000000000000000L
Kirby
Because your machine floating-point isn't decimal, it's binary:
>>> x = 2.**100
>>> print x
1.26765060023e+030
>>> print long(x)
1267650600228229401496703205376
>>> print 2L ** 100
1267650600228229401496703205376
So making

>>> long(2.**100)
1267650600230000000000000000000L

instead would be plain wrong. The same thing happens in your example, but is harder to picture because, offhand, I bet you don't know the exact IEEE-754 bit pattern corresponding to 1.0826786300000142e+045 <wink>.
>>> x = 1.0826786300000142e+045
>>> import math
>>> mantissa, exponent = math.frexp(x)
>>> print mantissa
0.758577950788
>>> print exponent
150
>>> print math.ldexp(mantissa, 53)
6.832662753e+015
>>> mantissa_as_long = long(math.ldexp(mantissa, 53))
>>> mantissa_as_long
6832662753002049L
>>> print mantissa_as_long << (exponent-53)
1082678630000014218353234260713996413124476928
>>> print long(x)
1082678630000014218353234260713996413124476928
IOW, long(some_huge_integer_in_float_format) *does* have a large number of trailing zeroes, but in base 2, not necessarily in base 10.
>>> print hex(long(x))
0x308C8A88868482000000000000000000000000L
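The decomposition above carries over to a current Python 3 interpreter (an assumption), where `int` replaces `long` and `print` is a function. A minimal sketch, relying on the 53-bit significand of IEEE-754 doubles:

```python
import math

x = 1.0826786300000142e+45

# frexp splits x into mantissa * 2**exponent with 0.5 <= mantissa < 1.
mantissa, exponent = math.frexp(x)

# A double has 53 significant bits, so scaling by 2**53 gives an exact integer.
mantissa_as_int = int(math.ldexp(mantissa, 53))

# Shifting that integer back into place reproduces int(x) exactly.
assert mantissa_as_int << (exponent - 53) == int(x)
print(int(x))   # 1082678630000014218353234260713996413124476928

# The trailing zeroes are there, but in base 2 (shown here in hex).
print(hex(int(x)))

# The power-of-two case converts exactly as well.
assert int(2.0 ** 100) == 2 ** 100
```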
participants (5)
- Arthur_Siegel@rsmi.com
- David Scherer
- Dustin Mitchell
- Kirby Urner
- Tim Peters