Python arrays and string formatting options

Wed Oct 1 06:38:12 EDT 2008

On Wed, 01 Oct 2008 09:35:03 +0000, Steven D'Aprano wrote:

> On Wed, 01 Oct 2008 06:58:11 +0000, Marc 'BlackJack' Rintsch wrote:
> 
>>> I would weaken that claim a tad... I'd say it is "usual" to write
>>> something like this:
>>> 
>>> alist = []
>>> for x in some_values:
>>>     alist.append(something_from_x)
>>> 
>>> 
>>> but it is not uncommon (at least not in my code) to write something
>>> like this equivalent code instead:
>>> 
>>> alist = [None]*len(some_values)
>>> for i, x in enumerate(some_values):
>>>     alist[i] = something_from_x
>> 
>> I have never done this, except when I was just beginning to use Python,
>> and -- maybe more importantly -- I've never seen this in others' code.
>> It really looks like a construct from someone who is still programming
>> in some other language(s).
> 
> 
> It occurs at least twice in the 2.5 standard library, once in
> sre_parse.py:
> 
>     groups = []
>     groupsappend = groups.append
>     literals = [None] * len(p)
>     for c, s in p:
>         if c is MARK:
>             groupsappend((i, s))
>             # literal[i] is already None
>         else:
>             literals[i] = s
> 
> and another time in xdrlib.py:
> 
>     succeedlist = [1] * len(packtest)
>     count = 0
>     for method, args in packtest:
>         print 'pack test', count,
>         try:
>             method(*args)
>             print 'succeeded'
>         except ConversionError, var:
>             print 'ConversionError:', var.msg
>             succeedlist[count] = 0
>         count = count + 1

I guess the first falls into the "micro-optimization" category because it
binds `groups.append` to a name to spare the attribute lookup within the
loop.

Both have in common that not every iteration changes the list, i.e. the
preset values are not just placeholders but values that are actually
used sometimes.  That is different from creating a list of placeholders
that are all overwritten in any case.
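
To make that distinction concrete, here is a small sketch in Python 3
syntax.  The names `values`, `squares`, and `ok` are made up for
illustration and do not come from the thread or the standard library.

```python
values = [3, -1, 4, -1, 5]

# Placeholders that are all overwritten anyway: preallocating buys nothing.
squares = [None] * len(values)
for i, x in enumerate(values):
    squares[i] = x * x

# The same result, built directly without placeholders.
squares_direct = [x * x for x in values]

# Preset values that are actually used: only some slots change,
# similar to the xdrlib excerpt where a failure flips a 1 to a 0.
ok = [1] * len(values)
for i, x in enumerate(values):
    if x < 0:
        ok[i] = 0
```

In the first loop every `None` is replaced, so the list comprehension says
the same thing more directly.  In the last loop the preset `1` survives
wherever nothing goes wrong, which is the situation in both library excerpts.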

>>> - the with statement acts by magic; if you don't know what it does,
>>> it's an opaque black box.
>> 
>> Everything acts by magic unless you know what it does.  The Fortran
>> 
>>   read(*,*)(a(i,j,k),j=1,3)
>> 
>> in the OP's first post looks like magic too.
> 
> It sure does. My memories of Fortran aren't good enough to remember what
> that does.
> 
> But I think you do Python a disservice. One of my Perl coders was
> writing some Python code the other day, and he was amazed at how
> guessable Python was. You can often guess the right way to do something.

I think my code would be as guessable to a Lisp, Scheme, or Haskell 
coder.  Okay, Lispers and Schemers might object to the ugly syntax.  ;-)

>> I admit that my code shows off advanced Python features but I don't
>> think ``with`` is one of them. It makes it easier to write robust code
>> and maybe even understandable without documentation by just reading it
>> as "English text".
> 
> The first problem with "with" is that it looks like the Pascal "with"
> statement, but acts nothing like it. That may confuse anyone with Pascal
> experience, and there are a lot of us out there.

But Python is not Pascal either.  Nonetheless, a Pascal coder might guess
what the ``with`` does.  Not all the gory details, but that it opens a
file and introduces `lines` should be more or less obvious to someone who
has programmed before.

> The second difficulty is that:
> 
>     with open('test.txt') as lines:
> 
> binds the result of open() to the name "lines". How is that different
> from "lines = open('test.txt')"? I know the answer, but we shouldn't
> expect newbies coming across it to be anything but perplexed.

Even if newbies don't understand all the details, they should be
introduced to ``with`` right away IMHO.  If you explain all the details
of doing it by hand, they will likely ignore that knowledge even if they
understand it, because doing it right takes a lot of boilerplate code.
So people usually write less robust code, and ``with`` is a simple way to
solve that problem.
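
As a rough sketch of the boilerplate in question (Python 3 syntax; the
temporary file and the names `manual` and `managed` are assumptions made
only to keep the example self-contained):

```python
import os
import tempfile

# Create a throwaway file so the sketch can run on its own.
fd, path = tempfile.mkstemp(suffix=".txt")
with os.fdopen(fd, "w") as out:
    out.write("first\n\nsecond\n")

# Robust, but this is the boilerplate that tends to get skipped:
f = open(path)
try:
    manual = f.read()
finally:
    f.close()

# The same guarantee with ``with``: the file is closed on leaving the
# block, whether or not an exception occurred.
with open(path) as g:
    managed = g.read()

os.remove(path)
```

Both variants close the file even if reading fails; the second one is
simply short enough that people actually write it.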

> Now that the newbie has determined that lines is a file object, the very
> next thing you do is assign something completely different to 'lines':
> 
>         lines = (line for line in lines if line.strip())
> 
> So the reader needs to know that brackets aren't just for grouping like
> in most other languages, but also that (x) can be equivalent to a for-
> loop. They need to know, or guess, that iterating over a file object
> returns lines of the file, and they have to keep the two different
> bindings of "lines" straight in their head in a piece of code that uses
> "lines" twice and "line" three times.

Yes, the reader needs to know a basic Python syntax construct to
understand this, plus some knowledge from the tutorial about files.  So
what?
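
For readers who have not met the construct yet, a minimal sketch of what
the generator expression does (Python 3 syntax; `io.StringIO` stands in
for an open text file, and `non_blank` is a hypothetical helper written
only for comparison):

```python
import io

# Stand-in for an open text file; iterating over it yields its lines.
source = io.StringIO("alpha\n\n  \nbeta\n")

# Generator expression: lazily keeps only lines with non-whitespace content.
lines = (line for line in source if line.strip())

# Roughly the same thing, spelled out as an explicit generator function.
def non_blank(iterable):
    for line in iterable:
        if line.strip():
            yield line

result = list(lines)
explicit = list(non_blank(io.StringIO("alpha\n\n  \nbeta\n")))
```

Nothing is read from `source` until `lines` is iterated, and both spellings
yield the same non-blank lines.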

> And then they hit the next line, which includes a function called
> "partial", which has a technical meaning out of functional languages and
> I am sure it will mean nothing whatsoever to anyone unfamiliar to it.
> It's not something that is guessable, unlike open() or len() or
> append().

Why on earth does everything have to be guessable for someone who doesn't
know Python or even programming at all?
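
For reference, `partial` lives in the standard `functools` module and
simply pre-fills some arguments of a callable.  The line from the earlier
post is not quoted in this message, so the following is only a generic
sketch; `parse_binary` and `numbers` are made-up names.

```python
from functools import partial

# int() accepts an optional ``base`` argument; partial() freezes it,
# giving a new callable that only needs the string.
parse_binary = partial(int, base=2)

numbers = [parse_binary(s) for s in ["101", "11", "0"]]
```

Calling `parse_binary("101")` is then the same as calling
`int("101", base=2)`.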

>>> - you re-use the same name for different uses, which can cause
>>> confusion.
>> 
>> Do you mean `lines`?  Then I disagree because the (duck) type is always
>> "iterable over lines".  I just changed the content by filtering.
> 
> Nevertheless, for people coming from less dynamic languages than Python
> (such as Fortran), it is a common idiom to never use the same variable
> for two different things.  It's not a bad choice really: imagine reading
> a function where the name "lines" started off as an integer number of
> lines, then became a template string, then was used for a list of
> character positions...

Which I'm not doing at all.  It has the same duck type all the time: 
"iterable of lines".

> Of course I'm not suggesting that your code was that bad. But rebinding
> a name does make code harder to understand.

Introducing a new name here would be worse IMHO because then the file
object would still be reachable by a name, which it shouldn't be.
Rebinding `lines` documents that the file object itself won't be used
anymore in the following code.

Again, I don't think I have written something deliberately obfuscated,
but rather readable, concise, and straightforward code -- for people who
know the language, of course.

If someone asks how to write this code from language X in Python, I
actually write Python, and not something that is an almost literal 1:1
translation of the code in language X.

*I* think I would do Python a disservice if I encouraged people to
continue writing Python code as if it were language X, or pretended that
Python is all about "readable, executable pseudocode for anyone".  Python
has dynamic typing, first-class functions, "functional" syntax
constructs, and it seems the developers like iterators and generators.
Those are the basic building blocks of the language, so I use them, even
in public.  :-)

Ciao,
	Marc 'BlackJack' Rintsch