[Tutor] python equivalents for perl list operators?
Steven D'Aprano
steve at pearwood.info
Sat Apr 23 05:37:48 EDT 2016
Hi Malcolm, and welcome!
On Sat, Apr 23, 2016 at 10:15:52AM +1000, Malcolm Herbert wrote:
> I've been dabbling a bit with some lists and trying to work out how best
> to arbitrarily sort and filter these. Perl has a number of operators that
> help with this in map(), grep() and sort() as follows:
>
> @raw = (2, 1, 4, 3);
> @grepped = grep { $_ >= 3 } @raw; # (4, 3)
> @mapped = map { $_ + 1 } @raw; # (3, 2, 5, 4)
> @sorted = sort { $a > $b } @raw; # (1, 2, 3, 4)
>
> in this case:
>
> grep() will return all list items for which the code block returns true
>
> map() will return all list items as modified by the code block
>
> sort() will return a sorted list of items, using the code block to
> compare them (where $a and $b represent two items to be compared)
Thank you for explaining what the Perl code does!
Python doesn't have a compact short-cut for arbitrarily complex code
blocks. If the code can be written as a single expression, you can embed
it in a list comprehension, or use a "lambda" short-cut for creating a
function. But for code blocks with multiple statements, you will need to
predefine a function first.
The "grep" example can be done using either a list comprehension or the
filter() function. Here is a version using a pre-defined function:
def big_enough(num):
    return num >= 3

raw = [2, 1, 4, 3]  # A list rather than a tuple, so it can be sorted in place.
grepped = filter(big_enough, raw)
We can skip the "big_enough" function and write it in place using a
lambda:
grepped = filter(lambda num: num >= 3, raw)
(The name "lambda" comes from theoretical computer science -- google for
"lambda calculus" if you care. But in Python, it is syntactic sugar for
creating a function on the fly, as an expression, rather than as a
statement. So unlike "def", lambda can be embedded in other expressions,
but it is limited to a body consisting of a single expression.)
Here's a version using a list comprehension:
grepped = [num for num in raw if num >= 3]
List comprehensions are syntactic sugar for for-loops, based on "set
builder notation" from mathematics. You can read the above as more or
less equivalent to:
grepped = []  # Empty list.
for num in raw:
    if num >= 3:
        grepped.append(num)
except more compact.
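One caveat, sketched here on the assumption that you are running Python 3: there filter() returns a lazy iterator rather than a list, so you need to wrap it in list() to see the items all at once. The variable names below are only for illustration.

```python
raw = [2, 1, 4, 3]

# In Python 3, filter() returns a lazy iterator, so wrap it in list()
# to get all the matching items at once.
grepped_filter = list(filter(lambda num: num >= 3, raw))

# The list comprehension builds a list directly in Python 2 and 3 alike.
grepped_comp = [num for num in raw if num >= 3]

print(grepped_filter)  # [4, 3]
print(grepped_comp)    # [4, 3]
```

Both forms keep the items in their original order, just like the Perl grep.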
The map example:
> @mapped = map { $_ + 1 } @raw; # (3, 2, 5, 4)
is similar in Python. You can use the map() function, or a list
comprehension:
mapped = map(lambda n: n+1, raw)
mapped = [n+1 for n in raw]
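As a rough sketch, again assuming Python 3 (where map() is lazy, like filter()), here are both spellings side by side, plus a comprehension that filters and maps in one pass. The names mapped_map, mapped_comp and combined are just for illustration.

```python
raw = [2, 1, 4, 3]

# map() is lazy in Python 3, so list() is needed to materialise the result.
mapped_map = list(map(lambda n: n + 1, raw))

# The equivalent list comprehension.
mapped_comp = [n + 1 for n in raw]

# A comprehension can filter and transform in one step:
# keep the items >= 3, then add one to each.
combined = [n + 1 for n in raw if n >= 3]

print(mapped_map)   # [3, 2, 5, 4]
print(mapped_comp)  # [3, 2, 5, 4]
print(combined)     # [5, 4]
```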
The sort example:
> @sorted = sort { $a > $b } @raw; # (1, 2, 3, 4)
can be done two ways, either in-place, or copying the list into a new
list. By default, sort goes from smallest to largest:
# in-place
raw.sort()
# copy to a new list, then sort
newlist = sorted(raw)
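A small sketch of the difference between the two, which trips up a lot of newcomers: sorted() hands back a new list and leaves the original alone, while the sort method rearranges the list in place and returns None. It needs a list to work on, since tuples have no sort method.

```python
raw = [2, 1, 4, 3]

# sorted() returns a new list; raw itself is untouched.
newlist = sorted(raw)
print(newlist)  # [1, 2, 3, 4]
print(raw)      # [2, 1, 4, 3]

# reverse=True sorts from largest to smallest.
descending = sorted(raw, reverse=True)
print(descending)  # [4, 3, 2, 1]

# list.sort() works in place and returns None, so don't assign its result.
result = raw.sort()
print(result)  # None
print(raw)     # [1, 2, 3, 4]
```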
Both the sort method and the sorted function allow you to specify how
the sort is done. In Python 2, you have a choice of using a comparison
function (but beware, that tends to be slow for large lists) or a key
function. In Python 3, you can only use a key function.
The comparison function specifies a function which takes two elements,
and then returns -1, 0 or 1 depending on whether the first is less than,
equal to, or greater than the second. So sorting odd and even numbers
separately:
def odds_evens(a, b):
    if a%2 == b%2 == 0:
        # Both even, sort smaller to larger.
        return cmp(a, b)
    elif a%2 == b%2 == 1:
        # Both odd, sort larger to smaller.
        return -cmp(a, b)
    else:
        # Odd numbers first.
        if a%2 == 1:  # a is odd, so it comes first.
            return -1
        # Otherwise b is odd, so it comes first.
        return 1
And here is an example of how to use it (in Python 2):
py> import random
py> numbers = range(10)
py> random.shuffle(numbers)
py> print numbers
[1, 0, 2, 9, 7, 4, 5, 6, 8, 3]
py> print sorted(numbers, odds_evens)
[9, 7, 5, 3, 1, 0, 2, 4, 6, 8]
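If you are on Python 3, where both the comparison-function argument and the cmp() built-in are gone, one possible approach is functools.cmp_to_key, which wraps an old-style comparison function so it can be passed as a key. This is only a sketch: the small cmp() helper below is my own stand-in for the removed built-in, and the input is the shuffled list from the session above written out by hand.

```python
from functools import cmp_to_key

def cmp(a, b):
    # Stand-in for the Python 2 built-in cmp(), which Python 3 removed.
    return (a > b) - (a < b)

def odds_evens(a, b):
    if a % 2 == b % 2 == 0:
        # Both even, sort smaller to larger.
        return cmp(a, b)
    elif a % 2 == b % 2 == 1:
        # Both odd, sort larger to smaller.
        return -cmp(a, b)
    elif a % 2 == 1:
        # a is odd and b is even, so a comes first.
        return -1
    # Otherwise b is odd, so it comes first.
    return 1

numbers = [1, 0, 2, 9, 7, 4, 5, 6, 8, 3]
result = sorted(numbers, key=cmp_to_key(odds_evens))
print(result)  # [9, 7, 5, 3, 1, 0, 2, 4, 6, 8]
```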
Alternatively, you can specify a key function, using a keyword argument.
This implements the DSU (decorate-sort-undecorate) idiom that you might
be familiar with under the name "Schwartzian transform". Here's how I
might sort a bunch of strings by length:
py> strings = ['aaa', 'bbbb', 'c', 'dd', 'eeeeee', 'fffff']
py> print sorted(strings, key=len)
['c', 'dd', 'aaa', 'bbbb', 'fffff', 'eeeeee']
Notice that I can just use the built-in len() function as the key=
argument.
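To make the connection to decorate-sort-undecorate concrete, here is a rough sketch of doing it by hand, followed by the odds-and-evens ordering expressed as a key function instead of a comparison function. The tuple-returning lambda is just one way of writing such a key, not the only one.

```python
strings = ['aaa', 'bbbb', 'c', 'dd', 'eeeeee', 'fffff']

# Decorate: pair each string with its length. The index breaks ties,
# so equal-length items keep their original order.
decorated = [(len(s), i, s) for i, s in enumerate(strings)]
# Sort: tuples compare element by element, so length is compared first.
decorated.sort()
# Undecorate: throw away the length and index, keeping the strings.
by_hand = [s for (_, _, s) in decorated]

print(by_hand)                   # ['c', 'dd', 'aaa', 'bbbb', 'fffff', 'eeeeee']
print(sorted(strings, key=len))  # same result

# A key function can return a tuple to sort on several criteria at once.
# Here: odd numbers first (False sorts before True), odds from larger to
# smaller (negated), evens from smaller to larger.
numbers = [1, 0, 2, 9, 7, 4, 5, 6, 8, 3]
odds_first = sorted(numbers, key=lambda n: (n % 2 == 0, n if n % 2 == 0 else -n))
print(odds_first)  # [9, 7, 5, 3, 1, 0, 2, 4, 6, 8]
```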
[...]
> but I'm after the ability to put arbitrary code in here to determine
> sort order or test an item for filtering (because the items they're
> testing may be complex structures rather than these simple integers, for
> example)
As I mentioned above, you can't embed arbitrarily complex
multi-statement code blocks in function calls. If your test is complex
enough that it needs more than one expression, you have to put it in a
function first, like the odds_evens example above.
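For instance, here is a minimal sketch with more complex items. The records, field names and helper functions are all made up for illustration; the point is only that the test and the sort key live in ordinary named functions, which can contain as many statements as you like.

```python
# Hypothetical records, invented purely for this example.
people = [
    {"name": "Ann", "age": 31, "langs": ["perl", "python"]},
    {"name": "Bob", "age": 25, "langs": ["c"]},
    {"name": "Cy", "age": 31, "langs": ["python"]},
]

def knows_python(person):
    # A multi-statement test: tolerate records without a "langs" field.
    langs = person.get("langs", [])
    return "python" in langs

def by_age_then_name(person):
    # Oldest first (negated age), ties broken alphabetically by name.
    return (-person["age"], person["name"])

selected = [p["name"] for p in people if knows_python(p)]
ordered = [p["name"] for p in sorted(people, key=by_age_then_name)]

print(selected)  # ['Ann', 'Cy']
print(ordered)   # ['Ann', 'Cy', 'Bob']
```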
--
Steve