Multi-dimensional list initialization

Andrew Robinson andrew3 at r3dsolutions.com
Wed Nov 7 23:02:22 CET 2012


On 11/06/2012 05:55 PM, Steven D'Aprano wrote:
> On Tue, 06 Nov 2012 14:41:24 -0800, Andrew Robinson wrote:
>
>> Yes.  But this isn't going to cost any more time than figuring out
>> whether or not the list multiplication is going to cause quirks, itself.
>>   Human psychology *tends* (it's a FAQ!) to automatically assume the
>> purpose of the list multiplication is to pre-allocate memory for the
>> equivalent (using lists) of a multi-dimensional array.  Note the OP even
>> said "4d array".
> I'm not entirely sure what your point is here. The OP screwed up -- he
> didn't generate a 4-dimensional array. He generated a 2-dimensional
> array. If his intuition about the number of dimensions is so poor, why
> should his intuition about list multiplication be treated as sacrosanct?
Yes he did screw up.
There is a great deal of value in studying how people screw up, and 
designing interfaces which tend to discourage it.  "Candy machine 
interfaces".

> As they say, the only truly intuitive interface is the nipple.
No it's not -- that interface really sucks.  :)
Have you ever seen a cat trying to suck a human nipple -- ?
Or, have you ever asked a young child who was weaned early and doesn't 
remember nursing -- what a breast is for ?  Once the oral stage is left, 
remaining behavior must be re-learned.
>   There are
> many places where people's intuition about programming fail. And many
> places where Fred's intuition is the opposite of Barney's intuition.
OK.  But that doesn't mean that *all* places have opposite intuition; 
Nor does it mean that one intuition which is statistically *always* 
wrong shouldn't be discouraged, or re-routed into useful behavior.

Take the candy machine,  if the items being sold are listed by number -- 
and the prices are also numbers; it's very easy to type in the price 
instead of the object number because one *forgets* that the numbers have 
different meaning and the machine can't always tell from the price, 
which object a person wanted (duplicate prices...); Hence a common 
mistake... people get the wrong item, by typing in the price.

By merely avoiding a numeric keypad -- the user is re-routed into 
choosing the correct item by not being able to make the mistake.

For this reason, Python tends to *like* things such as named parameters 
and occasionally enforces their use.  etc.
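
A minimal sketch of the named-parameter point, assuming Python 3's keyword-only arguments (the bare `*` in the signature). The function `vend` and its parameter names are hypothetical, invented only to mirror the candy machine:

```python
# Hypothetical candy-machine API; the names are illustrative only.
def vend(*, item_number, price_cents):
    """Both arguments must be passed by name (note the bare '*')."""
    return "item %d for %d cents" % (item_number, price_cents)

print(vend(item_number=7, price_cents=125))

try:
    vend(7, 125)          # positional call: which number is which?
except TypeError as exc:
    print("rejected:", exc)
```

The positional call is refused outright, so the caller cannot mix up the item number and the price.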

> Even more exciting, there are places where people's intuition is
> *inconsistent*, where they expect a line of code to behave differently
> depending on their intention, rather than on the code. And intuition is
> often sub-optimal: e.g. isn't it intuitively obvious that "42" + 1 should
> give 43? (Unless it is intuitively obvious that it should give 421.)
I agree, and in places where an *exception* can be raised; it's 
appropriate to do so.
Ambiguity, like the candy machine, is *bad*.
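
As a small illustration of raising an exception rather than guessing, here is a sketch of what Python 3 does with the quoted "42" + 1 example. The helper name `add_checked` is made up for this example:

```python
def add_checked(a, b):
    """Return a + b, or the name of the exception Python raises."""
    try:
        return a + b
    except TypeError as exc:
        return type(exc).__name__

print(add_checked("42", 1))        # mixing str and int is refused
print(add_checked(int("42"), 1))   # explicit conversion: numeric addition
print(add_checked("42", str(1)))   # explicit conversion: concatenation
```

Either reading (43 or "421") is available, but only when the caller states which one is meant.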

> So while I prefer intuitively obvious behaviour where possible, it is not
> the holy grail, and I am quite happy to give it up.
"where possible"; OK, fine -- I agree.  I'm not "happy" to give it up; 
but I am willing.
I don't like the man hours wasted on ambiguous behavior; and I don't 
ever think that should make someone "happy".

>> The OP's original construction was simple, elegant, easy to read and
>> very commonly done by newbies learning the language because it's
>> *intuitive*.  His second try was still intuitive, but less easy to read,
>> and not as elegant.
> Yes. And list multiplication is one of those areas where intuition is
> suboptimal -- it produces a worse outcome overall, even if one minor use-
> case gets a better outcome.
>
> I'm not disputing that [[0]*n]*m is intuitively obvious and easy. I'm
> disputing that this matters. Python would be worse off if list
> multiplication behaved intuitively.
How would it be worse off?

I can agree, for example, that in "C" -- realloc -- is too general.
One can't look at the line where realloc is being used, and decide if it is:
1) mallocing
2) deleting
3) resizing

Number (3) is the only non-redundant behavior the function provides.
There is, perhaps, a very clear reason that I haven't discovered why the 
extra functionality in list multiplication would be bad;  That reason is 
*not* because list multiplication is unable to solve all the copying 
problems in the world;  (realloc is bad, precisely because of that);  But 
a function ought to do at least *one* thing well.

Draw up some use cases for the multiplication operator (I'm calling on 
your experience, let's not trust mine, right?);  What are all the 
typical ways people *do* use it now?

If those use cases do not *primarily* center around *wanting* an effect 
explicitly caused by reference duplication -- then it may be better to 
abolish list multiplication altogether; and rather, improve the list 
comprehensions to overcome the memory, clarity, and speed pitfalls in 
the most common case of initializing a list.
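
For reference, a short sketch of the behaviour under discussion, assuming Python 3; the names `shared` and `separate` are illustrative. It contrasts list multiplication with the list comprehension that is usually recommended today:

```python
n, m = 3, 2

# List multiplication repeats references: both rows are the same object.
shared = [[0] * n] * m
shared[0][0] = 1
print(shared)              # the change shows up in every row

# A comprehension evaluates [0] * n once per row, giving distinct rows.
separate = [[0] * n for _ in range(m)]
separate[0][0] = 1
print(separate)            # only the first row changes
```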

For example, in initialization use cases; often the variable of a for 
loop isn't needed and all the initializers have parameters which only 
need to be evaluated *once* (no side effects).

Hence, there is an opportunity for speed and memory gains, while 
maintaining clarity and *consistency*.

Some ideas of use cases:
[ (0) in xrange(10) ]  # The function to create a tuple caches the
                       # parameter '0', makes 10 (0)'s
[ dict.__new__(dict) in xrange(10) ]  # dict.__new__: the dict parameter
                                      # is cached -- makes 10 dicts.
[ lambda x:(0) in xrange(10) ]  # lambda caches (0), returns a
                                # *reference* to it multiple times.
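
The bracketed forms above are proposed syntax. For comparison, here is a sketch of how the same three initializations can be spelled with comprehensions that already exist, assuming Python 3 (so `range` rather than `xrange`). Note that `(0)` is the integer 0 in Python; a one-element tuple would be written `(0,)`. The variable names are illustrative only:

```python
zeros = [0 for _ in range(10)]               # ten references to the int 0
dicts = [dict() for _ in range(10)]          # ten distinct dict objects
funcs = [(lambda x: 0) for _ in range(10)]   # ten functions, each returning 0

dicts[0]["key"] = 1                          # touches only the first dict
print(dicts[:2])
```

The unused loop variable is conventionally named `_`, which is the clutter the proposal is trying to remove.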


> An analogy: the intuitively obvious thing to do with a screw is to bang
> it in with a hammer. It's long, thin, has a point at the end, and a flat
> head that just screams "hit me". But if you do the intuitive thing, your
> carpentry will be *much worse* than the alternatives.
:)
I agree.  Good point and Good "thin point".

> Having list multiplication copy has consequences beyond 2D arrays. Those
> consequences make the intuitive behaviour you are requesting a negative
> rather than a positive. If that means that newbie programmers have to
> learn not to hammer screws in, so be it. It might be harder, slower, and
> less elegant to drill a pilot hole and then screw the screw in, but the
> overall result is better.
No, the overall result is still bad.  If the answer is *don't* hammer 
screws, then it's better to raise an exception when it's tried.  There's 
no way to do that with list multiplication.

>>> * Consistency of semantics is better than a plethora of special
>>>     cases. Python has a very simple and useful rule: objects should not
>>>     be copied unless explicitly requested to be copied. This is much
>>>     better than having to remember whether this operation or that
>>>     operation makes a copy. The answer is consistent:
>> Bull.  Even in the last thread I noted the range() object produces
>> special cases.
>>   >>>  range(0,5)[1]
>> 1
>>   >>>  range(0,5)[1:3]
>> range(1, 3)
> What's the special case here? What do you think is copied?
>
>
> You take a slice of a range object, you get a new range object.
You weren't paying attention: OCCASIONALLY you get an integer, or a list.
 >>> range(3)[2]
2

LOOOOK! That's not a range object, that's an integer.  Use Python 3.2 
and try it.
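
The two results being argued about can be reproduced with a short sketch, assuming Python 3 (where `range()` returns a range object rather than a list). The variable names are only for illustration:

```python
r = range(0, 5)

item = r[1]       # indexing yields an element
part = r[1:3]     # slicing yields another range object

print(type(item).__name__, item)
print(type(part).__name__, part, list(part))
```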

> I'm honestly not getting what you think is inconsistent about this.
How about now?

> Two-dimensional arrays in Python using lists are quite rare. Anyone 
> who is doing serious numeric work where they need 2D arrays is using 
> numpy, not lists.
Game programmers routinely use 2D lists to represent the screen layout;
For example, they might use 'b' to represent a brick tile, and 'w' to 
represent a water tile.
This is quite common in simple games;  I have seen several use 2D lists 
(or tuples) to do this.
Serious numeric work is not needed in most simple games; especially if 
motion is not involved.
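
A minimal sketch of such a tile map, assuming Python 3. The map size and the names `WIDTH`, `HEIGHT` and `screen` are hypothetical, chosen only for this example:

```python
# Hypothetical 4x3 map: 'b' is a brick tile, 'w' is a water tile.
WIDTH, HEIGHT = 4, 3

# Build each row separately so rows can be edited independently.
screen = [['b'] * WIDTH for _ in range(HEIGHT)]
screen[1][2] = 'w'          # put water at row 1, column 2

for row in screen:
    print(''.join(row))
```

Writing `[['b'] * WIDTH] * HEIGHT` instead would put the water tile in every row, which is exactly the mistake this thread is about.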

There are *many* non-serious uses of matrix mathematics and 2D lists.
Numpy isn't desired even if it would work.  Cost benefit analysis....

Crossword puzzles, periodic table of the elements with different sources 
of weights listed under each element, etc.  (That can also be done with 
a dict, but I've seen an implementation do it the other way.) etc.

> There are millions of people using Python, so it's hardly surprising 
> that once or twice a year some newbie trips over this. But it's not 
> something that people tend to trip over again and again and again, 
> like C's "assignment is an expression" misfeature.
Good point.  I don't have a statistic -- except the handful of times I 
searched for some other topic -- and I have seen it three times already....

>> I read some of the documentation on why Python 3 chose to implement it
>> this way.
> What documentation is this?
The documentation for range() -- which I just read because of 
another thread we both were in.  You're misconstruing the subject -- 
which was whether "inconsistency" in Python is allowed; not "is list 
multiplication inconsistent."
>>       Q: What about [[]]*10?
>>       A: No, the elements are never copied.
>>
>> YES! For the obvious reason that such a construction is making mutable
>> lists that the user wants to populate later.  If they *didn't* want to
>> populate them later, they ought to have used tuples -- which take less
>> overhead.  Who even does this thing you are suggesting?!
> Who knows? Who cares? Nobody does:
exactly !!!!  But I do care, even though I don't do it (because it 
doesn't *work*)

>
> n -= n
>
> instead of just n=0, but that doesn't mean that we should give it some
> sort of special meaning different from n -= m. If it turns out that the
> definition of list multiplication is such that NOBODY, EVER, uses [[]]*n,
> that is *still* not a good reason for special-casing it.
Ahh... but some people *DO* try to use it for another purpose.
Your example is a bad analogy.

>
> Special cases aren't special enough to break the rules.
finish the sentence: ALTHOUGH practicality beats purity.

> There are perfectly good ways to generate a 2D array out of lists, and
> even better reasons not to use lists for that in the first place. (Numpy
> arrays are much better suited for serious work.)
Duh... I answered that....


> I'm afraid you've just lost an awful lot of credibility there.
>
> py>  x = [{}]*5
> py>  x
> [{}, {}, {}, {}, {}]
No, I showed what happens when you do {}*3;
That *DOESN'T* work;  You aren't multiplying the dictionary, you are 
multiplying the LIST of dictionaries.  Very different things.
You were complaining that my method doesn't multiply them -- well, gee 
-- either mine DOES or python DOESN'T.  Double standards are *crap*.
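
The distinction between multiplying a dictionary and multiplying a list that holds a dictionary can be checked with a short sketch, assuming Python 3. The names `dict_result` and `listed` are illustrative:

```python
try:
    {} * 3                       # dict does not implement multiplication
except TypeError as exc:
    dict_result = type(exc).__name__
else:
    dict_result = "no error"

listed = [{}] * 3                # list repetition: three references to one dict
print(dict_result, listed, listed[0] is listed[2])
```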


> py>  x[0]['key'] = 1
> py>  x
> [{'key': 1}, {'key': 1}, {'key': 1}, {'key': 1}, {'key': 1}]
>
> And similarly for any other mutable object.
>
> If you don't understand that lists can contain other mutable objects
> apart from lists, then you really shouldn't be discussing this issue.

I do; that's why I DEMONSTRATED this issue in my own replies.

>
>>> Your proposal throws away consistency for a trivial benefit on a rare
>>> use- case, and replaces it with a bunch of special cases:
>> RARE!!!! You are NUTS!!!!
> Yes, rare. I base that on about 15 years of Python coding and many
> thousands (tens of thousands?) of hours on Python forums like this one.
> What's your opinion based on?
Which opinion?

2D lists are NOT rare; I've seen them in dozens of python programs not 
written by me.

As to my other opinion regarding why change it, there are two separate 
issues:

One was that a poster asked if it would be difficult to do without 
introducing bugs;  That's the question I answered affirmatively.  The 
_OP_ problem can be removed using a simple fix which isn't going to 
break any major number of programs.  That's a fact by your admission as 
well.

The second issue is would it be consistent to make the change (and NOTE 
I wasn't asked that, and wasn't answering that) You and others brought 
it up tangentially.  I agree, There is a problem, as I noted, with 
subclassing.  I also will note that ([],[],[]) is effectively the same 
question as sub-classing -- () are merely lists that are immutable.

As to how often people make the mistake -- There's a big difference 
between how often someone makes the mistake, and how often it shows up 
on a forum.  The issue shows up in a forum when someone can't figure out 
what they did wrong after a long debugging session.

There is more than one way to resolve such an issue, even if a person 
doesn't know why their construction is wrong.  If it is easy to see the 
construction produces the wrong result, one can simply try list 
comprehensions which will work correctly.  Then, either the person 
learns why they made the mistake -- or they don't.  If they are able to 
get it to work sometimes, but sometimes they can't -- they may or may not 
stop using it.   I cite Java programming where the API is notorious for 
that kind of inconsistent behavior; Yet Java programmers feel compelled 
to still use those constructions. etc.



> values = [None]*n  # or 0 is another popular starting value
>
> Using it twice to generate a 2D array is even rarer.
Sure, and if it is used that way -- I doubt it is ever used like [ {} 
]*n, because that object will have a side effect.  SO again, if this is 
the *main* use case, the default behavior is not the main reason they 
use it -- but they can often work around the default behavior by using 
it with specially thought about data.

>
>>>       Q: How about if I use delegation to proxy a list? A: Oh no, they
>>>       definitely won't be copied.
>> Give an example usage of why someone would want to do this.  Then we can
>> discuss it.
> Proxying objects is hardly a rare scenario. Delegation is less common
> since you can subclass built-ins, but it is still used. It is a standard
> design pattern.
I was not judging you; I was asking for an example to discuss.

> Python is a twenty year old language. Do you really think this is the 
> first time somebody has noticed it? It's hard to search for 
> discussions on the dev list, because the obvious search terms bring up 
> many false positives.
No, I don't think it's the first time.

> But here are a couple of bug reports closed as "won't fix": 
> http://bugs.python.org/issue1408 http://bugs.python.org/issue12597 I 
> suspect it is long past time for a PEP so this can be rejected once 
> and for all. 
Yeah, that'd be good -- and perhaps they'd abolish it. :)
>> The copy speed will be the same or *faster*, and the typing less -- and
>> the psychological mistakes *less*, the elegance more.
> You think that it is *faster* to copy a list than to make a new pointer
> to it? Your credibility is not looking too good here.
YES -- WHEN copying by reference is a BUG, then copying is NOT by reference.
That's the use case I spelled out.  You're changing the subject to make 
me look dumb? or being purposely facile to hide being destroyed on the 
substance of the argument ?

When true copying is desired, then doing it at the "C" level is better 
than the interpreter Level.
ergo: You're not looking too intelligent either.

>> It's hardly going to confuse anyone to say that lists are copied with
>> list multiplication, but the elements are not.
> Well, that confuses me. What about a list where the elements are lists?
> Are they copied?
YES!  It's a LIST copy; all the lists, and only the lists, are copied; 
the rest are referenced.
> What about other mutable objects? Are they copied?
No.
It's a list copy, not random mutable object copy.
> What about mutable objects which are uncopyable, like file objects?
No.
It's a list copy, not a file copy.
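
The three answers above can be sketched as an ordinary function. This is only an illustration of the proposed semantics, not anything Python implements: the name `repeat_copying_lists` is hypothetical, and it assumes that the copy of each inner list is shallow and that `isinstance` is the test for "is a list" (which also matches list subclasses, the problem case already noted in this thread):

```python
def repeat_copying_lists(seq, count):
    """Hypothetical stand-in for the proposed seq * count.

    Elements that are lists are shallow-copied for each repetition;
    every other element is repeated by reference, as happens today.
    """
    result = []
    for _ in range(count):
        for element in seq:
            if isinstance(element, list):
                result.append(list(element))   # new list per repetition
            else:
                result.append(element)         # same object each time
    return result

grid = repeat_copying_lists([[0, 0]], 3)
grid[0][0] = 1
print(grid)                      # rows are independent lists

dicts = repeat_copying_lists([{}], 3)
print(dicts[0] is dicts[1])      # non-list elements are still shared
```
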
>
>> Every time someone passes a list to a function, they *know* that the
>> list is passed by value -- and the elements are passed by reference.
> And there goes the last of your credibility. *You* might "know" this, but
> that doesn't make it so.
No, it's not gone.  You saying so, doesn't make it gone.

People aren't ignorant of the passing mechanism just because they didn't 
transfer from another language like "C", or "Pascal", "ADA", "Fortran", 
etc.  But when they do transfer from a language which makes a 
distinction -- then, yes, it's weird.
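
For readers who want to check what is actually passed, here is a short sketch assuming Python 3; the function names `mutate` and `rebind` are made up for illustration. The function receives a reference to the caller's list object rather than a copy, so in-place changes are visible to the caller, while rebinding the parameter name is not:

```python
def mutate(items):
    items.append(99)        # changes the caller's list in place

def rebind(items):
    items = [0]             # rebinds the local name only

data = [1, 2]
mutate(data)
print(data)                 # the append is visible to the caller
rebind(data)
print(data)                 # unchanged by the rebinding
```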

> Python's calling behaviour is identical to that used by languages 
> including Java (excluding unboxed primitives) and Ruby, to mention 
> only two. You're starting to shout and yell, so perhaps it's best if I 
> finish this here. 
Huh?
I'm not yelling any more than you are.  Are ???YOU??? yelling?

:-\
