[Tutor] python equivalents for perl list operators?

Martin A. Brown martin at linux-ip.net
Sat Apr 23 04:57:06 EDT 2016


Greetings and welcome Malcolm,

>hey folks - I've been a long time perl programmer and only recently 
>tried my hand a python, so it's probable that these questions are 
>non-sensical in this context but for the moment I'm trying to stay 
>afloat

Although I wrote Python first, I spent years writing in both Python 
and Perl, so I may be able to offer a tip or two.

OK, so on to your questions.  I'll paste my examples from an 
interactive Python3 shell.  I hope you have already found that you 
can simply run an interactive interpreter:

  $ python3
  Python 3.4.1 (default, May 23 2014, 17:48:28) [GCC] on linux
  Type "help", "copyright", "credits" or "license" for more information.
  >>> 

>I've been dabbling a bit with some lists and trying to work out how best
>to abitrarily sort and filter these. Perl has a number of operators that
>help with this in map(), grep() and sort() as follows:
>
>  @raw = (2, 1, 4, 3);
>  @grepped = grep { $_ >= 3 } @raw; # (4, 3)
>  @mapped = map { $_ + 1 } @raw; # (3, 2, 5, 4)
>  @sorted = sort { $a > $b } @raw; # (1, 2, 3, 4)
>
>in this case:
>
>grep() will return all list items for which the code block returns true

Use 'filter' in Python.

  https://docs.python.org/3/library/functions.html#filter

Vaguely Perl-like invocation:

  >>> list(filter(lambda x: x >= 3, [2, 1, 4, 3]))
  [4, 3]

Old-school Pythonic way:

  >>> def ge3(x):
  ...     return x >= 3
  >>> list(filter(ge3, [2, 1, 4, 3]))
  [4, 3]

New-school Pythonic way:

  >>> [x for x in [2, 1, 4, 3] if x >= 3]
  [4, 3]

[But, also, please see my readability remarks below on the 
immediately preceding example.]

>map() will return all list items as modified by the code block

Use 'map' in Python, too.

  https://docs.python.org/3/library/functions.html#map

Vaguely Perl-like invocation:

  >>> list(map(lambda x: x + 1, [2, 1, 4, 3]))
  [3, 2, 5, 4]

Old-school Pythonic way:

  >>> def plus1(x):
  ...     return x + 1
  >>> list(map(plus1, [2, 1, 4, 3]))
  [3, 2, 5, 4]  

New-school Pythonic way (which you already discovered):

  >>> [x+1 for x in [2, 1, 4, 3]]
  [3, 2, 5, 4]


>sort() will return a sorted list of items, using the code block to 
>compare them (where $a and $b represent two items to be compared)

Use 'sorted' in Python.

  https://docs.python.org/3/library/functions.html#sorted

Or, sort a list in place with the method .sort().

  https://docs.python.org/3/library/stdtypes.html#list.sort

What's the difference, you ask?  I'll generate a list of random 
integers and then show you:

  >>> import random
  >>> raw = [random.randint(0, 100) for _ in range(10)]
  >>> raw
  [91, 79, 4, 89, 17, 83, 89, 50, 74, 71]
  >>> new = sorted(raw)
  >>> new
  [4, 17, 50, 71, 74, 79, 83, 89, 89, 91]
  >>> raw
  [91, 79, 4, 89, 17, 83, 89, 50, 74, 71]
  >>> raw.sort()
  >>> raw
  [4, 17, 50, 71, 74, 79, 83, 89, 89, 91]

You may have gotten accustomed to writing your own function (or a 
block) for each call to Perl's sort.  This technique is far less 
common in Python.

The notion of the Schwartzian transform in Perl is referred to by a 
more general name in Python-land.  It's called the 
Decorate-Sort-Undecorate (DSU) technique.  Much more often, though, 
you will see a developer pass key= to the sort() method or the 
sorted() function.
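
To make the comparison concrete, here is a minimal sketch of both 
approaches, sorting a few words by length.  The word list and the 
variable names are only illustrative assumptions on my part.

```python
# Illustrative data (an assumption for this sketch).
words = ['pear', 'fig', 'banana', 'kiwi']

# Decorate-Sort-Undecorate, the Schwartzian transform spelled out:
decorated = [(len(w), w) for w in words]   # decorate with the sort key
decorated.sort()                           # sort the (key, word) tuples
dsu = [w for _, w in decorated]            # undecorate

# The usual modern spelling: let key= compute the sort key.
by_key = sorted(words, key=len)

print(dsu)
print(by_key)
```

The two results can differ on ties: this plain DSU falls back to 
comparing the words themselves, while key= keeps tied items in their 
original order, because Python's sort is stable.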

>so - I've been able to at least work out the map() case above with a
>list comprehension
>
>  raw = [2, 1, 4, 3]
>  mapped = [ x + 1 for x in raw] # [3, 2, 5, 4]

Yes!  Another way to do it.  You are using something called a list 
comprehension.  (You see I used it to generate random numbers.)  
This is a wonderful idiomatic technique to use in Python.

I mentioned that I would remark about readability.  When writing in 
Python, readability matters an awful lot to the majority of Python 
programmers.  So, I refer again to my example:

  [x for x in [2, 1, 4, 3] if x >= 3]

The 'if' condition is pretty far away from the action, and as such, 
I generally don't like to use an 'if' condition like this in a more 
complex list comprehension.  It takes me longer to figure out when I 
read it later.  So, I would make this into two lines, or use a 
different approach entirely.

  l = [2, 1, 4, 3]
  [x for x in l if x >= 3]

>and I know that .sorted() would do what I want in this limited 
>example, but I'm after the ability to put abitrary code in here to 
>determine sort order or test an item for filtering (because the 
>items they're testing may be complex structures rather than these 
>simple integers, for example)

You definitely want to read this:

  https://docs.python.org/3/howto/sorting.html
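
As a rough sketch of "arbitrary code" for filtering and sorting more 
complex items, something along these lines should work.  The 'hosts' 
records and the compare() function are hypothetical, invented purely 
for illustration; functools.cmp_to_key is the standard library adapter 
for a Perl-style two-argument comparison.

```python
from functools import cmp_to_key

# Hypothetical records standing in for "complex structures".
hosts = [
    {'name': 'web2', 'load': 0.7, 'up': True},
    {'name': 'db1', 'load': 2.5, 'up': False},
    {'name': 'web1', 'load': 0.7, 'up': True},
]

# Arbitrary test for filtering (Perl: grep { $_->{up} } @hosts).
alive = [h for h in hosts if h['up']]

# Arbitrary sort order via a key function: by load, then by name.
by_load = sorted(hosts, key=lambda h: (h['load'], h['name']))

# Perl-style two-argument comparator, adapted with cmp_to_key.
def compare(a, b):
    # Negative, zero or positive, like Perl's <=> operator.
    return (a['load'] > b['load']) - (a['load'] < b['load'])

by_cmp = sorted(hosts, key=cmp_to_key(compare))

print([h['name'] for h in alive])
print([h['name'] for h in by_load])
print([h['name'] for h in by_cmp])
```

In practice a key function is usually all you need; I would reach for 
cmp_to_key only when the ordering really cannot be expressed as a key.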

>these seem so useful things to want to do that I'd imagine they're 
>probably a basic part of the language, but so far I've not seen 
>anything that might cover them with the exeption of map() as above 
>- I am slowly trawling my way through Learning Python (5ed) so I 
>might yet get to something related, I don't know

Python's sort() is pretty darned good.  Additionally, there's plenty 
of flexibility with the key parameter.  Here's an example.  I'm 
going to create a dead-simple object using a module you probably 
haven't heard of before, but hopefully, this is enough like a dict() 
(a Perl hash) or an object (did you use the grafted-on OO in Perl?) 
that it'll make sense.

Now, I'll omit the interactive shell prompts (for easier pasting).

If I run this code to create a Pilot and store it in 'p' ...

  from collections import namedtuple
  Pilot = namedtuple('Pilot', ['surname', 'age', 'biplane'])
  p = Pilot('Vernon', 32, 'Pander')

Then, p looks like this (if I were to print it):

  Pilot(surname='Vernon', age=32, biplane='Pander')

So, now, I'll take a few of these Pilot()s and put them in a list 
and show you how you can sort based on any of the attributes of the 
Pilot:

  aces = list()
  aces.append(Pilot('Vernon', 32, 'Pander'))
  aces.append(Pilot('Ehringhaus', 41, 'Curtiss'))
  aces.append(Pilot('Wilkins', 28, 'Sopwith'))
  aces.append(Pilot('Tessler', 37, 'de Havilland'))

Our flying aces are lined up!  Which ones are wearing blue?

  from operator import attrgetter
  aces.sort(key=attrgetter('surname'))

Now the order would look like this:

  [Pilot(surname='Ehringhaus', age=41, biplane='Curtiss'),
   Pilot(surname='Tessler', age=37, biplane='de Havilland'),
   Pilot(surname='Vernon', age=32, biplane='Pander'),
   Pilot(surname='Wilkins', age=28, biplane='Sopwith')]

But, that's no good... I really wanted to order them by plane:

  aces.sort(key=attrgetter('biplane'))

And now:

  [Pilot(surname='Ehringhaus', age=41, biplane='Curtiss'),
   Pilot(surname='Vernon', age=32, biplane='Pander'),
   Pilot(surname='Wilkins', age=28, biplane='Sopwith'),
   Pilot(surname='Tessler', age=37, biplane='de Havilland')]

But, when choosing who gets to drive off the airfield first, it's always done
by age:

  aces.sort(key=attrgetter('age'), reverse=True)

So, the story ends:

  [Pilot(surname='Ehringhaus', age=41, biplane='Curtiss'),
   Pilot(surname='Tessler', age=37, biplane='de Havilland'),
   Pilot(surname='Vernon', age=32, biplane='Pander'),
   Pilot(surname='Wilkins', age=28, biplane='Sopwith')]
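
If one attribute is not enough, a small sketch like the following may 
help.  It assumes the same Pilot namedtuple as above, plus one extra 
pilot ('Abbott') that I made up so that two pilots share an age.

```python
from collections import namedtuple
from operator import attrgetter

Pilot = namedtuple('Pilot', ['surname', 'age', 'biplane'])
aces = [
    Pilot('Vernon', 32, 'Pander'),
    Pilot('Ehringhaus', 41, 'Curtiss'),
    Pilot('Wilkins', 28, 'Sopwith'),
    Pilot('Tessler', 37, 'de Havilland'),
    Pilot('Abbott', 32, 'Sopwith'),   # hypothetical extra pilot
]

# Several attribute names give a tuple key: age first, then surname.
by_age_then_name = sorted(aces, key=attrgetter('age', 'surname'))

# A lambda does the same job when the key needs arbitrary code,
# here a case-insensitive sort on the plane name.
by_plane = sorted(aces, key=lambda p: p.biplane.lower())

print([p.surname for p in by_age_then_name])
print([p.biplane for p in by_plane])
```

The case-insensitive key also explains the ordering by plane shown 
earlier: a plain string comparison puts the lowercase 'de Havilland' 
after every capitalized name.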

I will make one other note about thinking in Perl vs. thinking in Python.
(This is something that I found strange and occasionally quite convenient in
Perl.)

Perl will flatten your lists when you pass them into functions.  Suppose you
pass two lists into a function:

    &some_func(@a, @b)

In some_func, there's just a list of arguments, and all distinction between
the two lists is lost.  That's why I saw (and wrote) much more often:

    &some_func(\@a, \@b)

In Python, the list is not flattened (thank goodness).  It's just like any
other variable.  It may contain any type, basic, derived, class, function,
whatever.  I do not know if other people have found this list behaviour
surprising when writing in both languages, but I do like the Python approach
better on the list handling.
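
Here is a small sketch of the difference.  The function names some_func 
and flattened are just placeholders of mine; the second one shows how you 
could ask for Perl-like flattening explicitly with * unpacking.

```python
def some_func(a, b):
    # Each argument arrives as its own list, like \@a, \@b in Perl.
    return len(a), len(b)

def flattened(*args):
    # *args collects every positional argument into one tuple.
    return len(args)

a = [1, 2, 3]
b = [4, 5]

separate = some_func(a, b)       # the two lists stay distinct
merged = flattened(*(a + b))     # explicit, opt-in flattening

print(separate)
print(merged)
```

So the flattening is still available when you want it, but you have to 
ask for it at the call site.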

Python's standard library contains a wider assortment of tools than the Perl
standard library, so when you are looking for a toolkit to handle something,
have a peek in the standard library list of modules; it may already be there.

  https://docs.python.org/3/py-modindex.html

If not, check out PyPI (the Python equivalent of CPAN):

  https://pypi.python.org/PyPI

Best of luck, Malcolm, and welcome to the world of Python where everything is a
first class object.

-Martin


import pprint
from operator import attrgetter
from collections import namedtuple

Pilot = namedtuple('Pilot', ['surname', 'age', 'biplane'])
aces = list()
aces.append(Pilot('Vernon', 32, 'Pander'))
aces.append(Pilot('Ehringhaus', 41, 'Curtiss'))
aces.append(Pilot('Wilkins', 28, 'Sopwith'))
aces.append(Pilot('Tessler', 37, 'de Havilland'))

l = sorted(aces, key=attrgetter('surname'))
pprint.pprint(l)

l = sorted(aces, key=attrgetter('age'))
pprint.pprint(l)

l = sorted(aces, key=attrgetter('biplane'))
pprint.pprint(l)

-- 
Martin A. Brown
http://linux-ip.net/

