[Tutor] python equivalents for perl list operators?

Steven D'Aprano steve at pearwood.info
Sat Apr 23 05:37:48 EDT 2016


Hi Malcolm, and welcome!

On Sat, Apr 23, 2016 at 10:15:52AM +1000, Malcolm Herbert wrote:

> I've been dabbling a bit with some lists and trying to work out how best
> to arbitrarily sort and filter these. Perl has a number of operators that
> help with this in map(), grep() and sort() as follows:
> 
>   @raw = (2, 1, 4, 3);
>   @grepped = grep { $_ >= 3 } @raw; # (4, 3)
>   @mapped = map { $_ + 1 } @raw; # (3, 2, 5, 4)
>   @sorted = sort { $a > $b } @raw; # (1, 2, 3, 4)
> 
> in this case:
> 
> grep() will return all list items for which the code block returns true
> 
> map() will return all list items as modified by the code block
> 
> sort() will return a sorted list of items, using the code block to
> compare them (where $a and $b represent two items to be compared)

Thank you for explaining what the Perl code does!

Python doesn't have a compact short-cut for arbitrarily complex code 
blocks. If the code can be written as a single expression, you can embed 
it in a list comprehension, or use a "lambda" short-cut for creating a 
function. But for code blocks with multiple statements, you will need to 
predefine a function first.

The "grep" example can be done using either a list comprehension or the 
filter() function. Here is a version using a pre-defined function:


def big_enough(num):
    return num >= 3

raw = [2, 1, 4, 3]  # a list rather than a tuple, so that raw.sort() works
grepped = filter(big_enough, raw)


We can skip the "big_enough" function and write it in place using a 
lambda:

grepped = filter(lambda num: num >= 3, raw)


(The name "lambda" comes from theoretical computer science -- google for 
"lambda calculus" if you care. But in Python, it is syntactic sugar for 
creating a function on the fly, as an expression, rather than as a 
statement. So unlike "def", lambda can be embedded in other expressions, 
but it is limited to a body consisting of a single expression.)
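
For what it's worth, here is a small sketch putting the two filter() 
spellings side by side. One thing to flag: under Python 3, filter() 
returns a lazy iterator rather than a list, so the sketch wraps it in 
list() to see the results (that is harmless under Python 2 as well).

```python
# Sketch: a named predicate and a lambda are interchangeable in filter().
def big_enough(num):
    return num >= 3

raw = [2, 1, 4, 3]

# In Python 3, filter() returns an iterator; list() collects the results.
grepped_def = list(filter(big_enough, raw))
grepped_lambda = list(filter(lambda num: num >= 3, raw))

print(grepped_def)     # [4, 3]
print(grepped_lambda)  # [4, 3]
```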

Here's a version using a list comprehension:

grepped = [num for num in raw if num >= 3]

List comprehensions are syntactic sugar for for-loops, based on "set 
builder notation" from mathematics. You can read the above as more or 
less equivalent to:

grepped = []  # Empty list.
for num in raw:
    if num >= 3:
        grepped.append(num)


except more compact.


The map example:
>   @mapped = map { $_ + 1 } @raw; # (3, 2, 5, 4)

is similar in Python. You can use the map() function, or a list 
comprehension:

mapped = map(lambda n: n+1, raw)

mapped = [n+1 for n in raw]
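
As a rough sketch (written for Python 3, where map() is lazy and needs 
list() to produce an actual list): the two spellings give the same 
result, and a comprehension can also do the grep and the map in a single 
pass, much like chaining map { ... } grep { ... } @raw in Perl.

```python
raw = [2, 1, 4, 3]

# The map() spelling and the comprehension spelling agree.
mapped_func = list(map(lambda n: n + 1, raw))
mapped_comp = [n + 1 for n in raw]
print(mapped_func)  # [3, 2, 5, 4]
print(mapped_comp)  # [3, 2, 5, 4]

# Filter and transform together: keep items >= 3, then add 1 to each.
both = [n + 1 for n in raw if n >= 3]
print(both)  # [5, 4]
```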


The sort example:
>   @sorted = sort { $a > $b } @raw; # (1, 2, 3, 4)


can be done two ways, either in-place, or copying the list into a new 
list. By default, sort goes from smallest to largest:

# in-place
raw.sort()

# copy to a new list, then sort
newlist = sorted(raw)


Both the sort method and the sorted function allow you to specify how 
the sort is done. In Python 2, you have a choice of using a comparison 
function (but beware, that tends to be slow for large lists) or a key 
function. In Python 3, you can only use a key function, although 
functools.cmp_to_key() will wrap an old-style comparison function as a 
key if you need one.

The comparison function takes two elements and returns a negative 
number, zero, or a positive number (conventionally -1, 0 or 1) depending 
on whether the first should sort before, the same as, or after the 
second. So sorting odd and even numbers separately:

def odds_evens(a, b):
    if a%2 == b%2 == 0:
        # Both even, sort smaller to larger.
        return cmp(a, b)
    elif a%2 == b%2 == 1:
        # Both odd, sort larger to smaller.
        return -cmp(a, b)
    else:
        # Odd numbers first.
        if a%2 == 1:  # a is odd, so it comes first.
            return -1
        # Otherwise b is odd, so it comes first.
        return 1


And here is an example of how to use it (this session is Python 2, where 
cmp() and the print statement still exist):

py> import random
py> numbers = range(10)
py> random.shuffle(numbers)
py> print numbers
[1, 0, 2, 9, 7, 4, 5, 6, 8, 3]
py> print sorted(numbers, odds_evens)
[9, 7, 5, 3, 1, 0, 2, 4, 6, 8]
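
If you are on Python 3, the same comparison function can still be used 
by wrapping it with functools.cmp_to_key(). The sketch below is one way 
to do that, not the only one. Note that the small cmp() helper is my own 
stand-in, since Python 3 removed the built-in, and the numbers are just 
the shuffled list from the session above.

```python
from functools import cmp_to_key

def cmp(a, b):
    # Stand-in for Python 2's built-in cmp(), which Python 3 removed.
    return (a > b) - (a < b)

def odds_evens(a, b):
    if a % 2 == b % 2 == 0:
        # Both even, sort smaller to larger.
        return cmp(a, b)
    elif a % 2 == b % 2 == 1:
        # Both odd, sort larger to smaller.
        return -cmp(a, b)
    elif a % 2 == 1:
        # a is odd, so it comes first.
        return -1
    # Otherwise b is odd, so it comes first.
    return 1

numbers = [1, 0, 2, 9, 7, 4, 5, 6, 8, 3]
result = sorted(numbers, key=cmp_to_key(odds_evens))
print(result)  # [9, 7, 5, 3, 1, 0, 2, 4, 6, 8]
```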


Alternatively, you can specify a key function, using a keyword argument. 
This implements the DSU (decorate-sort-undecorate) idiom that you might 
be familiar with under the name "Schwartzian transform". Here's how I 
might sort a bunch of strings by length:

py> strings = ['aaa', 'bbbb', 'c', 'dd', 'eeeeee', 'fffff']
py> print sorted(strings, key=len)
['c', 'dd', 'aaa', 'bbbb', 'fffff', 'eeeeee']


Notice that I can just use the built-in len() function as the key= 
argument.
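
To make the DSU idea concrete, here is a hand-rolled sketch of roughly 
what key=len saves you from writing. It is only an illustration of the 
idiom, not a description of how sorted() is implemented internally; the 
index in the middle of each tuple is my own addition so that ties keep 
their original order.

```python
strings = ['aaa', 'bbbb', 'c', 'dd', 'eeeeee', 'fffff']

# Decorate: pair each string with its key (the length), plus its index
# so that equal-length strings keep their original relative order.
decorated = [(len(s), i, s) for i, s in enumerate(strings)]

# Sort: tuples compare element by element, so length decides first.
decorated.sort()

# Undecorate: throw away the key and index, keep the string.
by_hand = [s for _, _, s in decorated]

print(by_hand)  # ['c', 'dd', 'aaa', 'bbbb', 'fffff', 'eeeeee']
print(by_hand == sorted(strings, key=len))  # True
```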


[...]
> but I'm after the ability to put arbitrary code in here to determine
> sort order or test an item for filtering (because the items they're
> testing may be complex structures rather than these simple integers, for
> example)

As I mentioned above, you can't embed arbitrarily complex 
multi-statement code blocks in function calls. If your test is complex 
enough that it needs more than one expression, you have to put it in a 
function first, like the odds_evens example above.
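
For complex structures, that might look something like the sketch below. 
Everything in it is made up for illustration: the records, the field 
names, and the is_interesting() and sort_key() helpers are hypothetical, 
so substitute whatever your real data needs.

```python
# Hypothetical records; the field names are invented for this example.
people = [
    {"name": "Ann", "age": 31, "tags": ["admin"]},
    {"name": "bob", "age": 25, "tags": []},
    {"name": "Cy", "age": 31, "tags": ["admin", "dev"]},
    {"name": "dee", "age": 19, "tags": ["dev"]},
]

def is_interesting(person):
    # A multi-statement test: too much for a lambda, fine for a def.
    if person["age"] < 21:
        return False
    tags = person["tags"]
    return "admin" in tags or len(tags) == 0

def sort_key(person):
    # Oldest first, then by name ignoring case.
    return (-person["age"], person["name"].lower())

chosen = [p for p in people if is_interesting(p)]
ordered = sorted(chosen, key=sort_key)
names = [p["name"] for p in ordered]
print(names)  # ['Ann', 'Cy', 'bob']
```

Returning a tuple from the key function is a handy way to sort on 
several criteria at once, since tuples compare element by element.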



-- 
Steve

