[Tutor] feedback on permutations script

Steven D'Aprano steve at pearwood.info
Sun Jul 11 05:29:58 CEST 2010


On Sun, 11 Jul 2010 02:48:38 am Isaac wrote:
> This is my interpretation of an algorithm to generate all
> permutations of items in a list. I would appreciate any feedback
> regarding advice for improvements.

Some of this is going to just be personal style, some of this is going 
to be slightly stronger semi-official recommended style (PEP 8 if you 
want to google further), and right at the end some hints regarding the 
algorith.


> #!/usr/bin/python

If you're on Linux, it is generally recommended to use a shebang line 
of "#!/usr/bin/env python". Python can be installed in different 
places, while /usr/bin is the standard location for env. Of course, 
this isn't entirely portable either...


> def add_at_index(new_item, target_array, target_index):
>     """
>     new_item is a single list item.
>     target_array is a list or iterable.
>     target_index is a number.
>
>     This function returns a new list that inserts new_item inside
> target_array at target_array's target_index. The new list will be 1
> element longer than before.
>     """
>
>     new_list = target_array[:]
>
>     new_list.insert(target_index, new_item)
>     return new_list

This has misleading function and argument names: it's called 
add_at_index and takes an argument called target_array, but:

(1) target_array is not expected to be an array, but a list. People 
forget that Python *does* have a standard array type. Just import array 
to be reminded :)

(2) target_array is documented as being any iterable, but that's simply 
not correct. Try passing a tuple, string or iterator and watch it 
explode.

(3) The function and argument names imply that it *adds* to a *target*, 
but that's not what it does. It makes a copy. Unfortunately Python 
doesn't have a common convention for indicating functions that make a 
copy.

(4) target_index is said to be a number. Can you pass floats? Fractions? 
Decimals? Complex numbers?

(5) And the function is awfully verbose for such a tiny little thing.


I'm not very happy with the function name, but can't think of anything 
better that's not excessively long, so I'll keep the name. Here is how 
I would re-write the code:

def add_at_index(item, target, where):
    """Return a copy of target as a list and insert item at index where.

    item: any object
    target: any finite iterable
    where: an integer index
    """
    new_list = list(target)
    new_list.insert(where, item)
    return new_list


> def append_lists(working_list, list_of_lists):
>     # appends every list in list_of_lists to working_list; returns
> working_list 
>     for _list in list_of_lists: 
>         working_list.append(_list)
>     return working_list

You like those verbose variable names, don't you? :)

You have a comment, but no doc-string. Is that deliberate?

You have named a local variable "_list". This isn't *wrong* exactly, but 
it's very unusual. The standard convention in Python is that names 
starting with a single underscore are "private" names, but local 
variables inside a function are always private! Nothing outside of the 
function can get access to or see that variable, so marking it private 
is unnecessary.

If your only reason for using the underscore is to distinguish the 
variable from the built-in "list", then consider these alternatives:

* since this variable is just a generic name, you can use any 
  old generic name: L, mylist, alist, seq, ...

* where you absolutely have to use the word "list", the usual 
  Python convention to avoid shadowing the built-in is to 
  append the underscore to the end of the name: list_

* in a function this small, where you don't use the built-in
  list for anything, feel free to shadow the built-in. It won't
  hurt, although some people will hate it, and code checkers like
  PyLint or PyChecker may complain.


According to the comment and the name, list_of_lists must contain only 
lists. But why? There doesn't seem to be anything in the code that 
*requires* the items to be lists. This seems to be more of a 
multi-append than an append-lists function. So re-writing it:

def multi_append(target, items):
    """Return target list with each item of items appended.

    target: a list
    items: any sequence or iterable
    """
    for item in items:
        target.append(item)
    return target

But this function is entirely unnecessary, because that's exactly what 
the "extend" method does:

target.extend(items)


> def insert_each(orphan_item, target_array):
>     """
>     orphan_item is a single list item.
>     target_array is a list or iterable.
>
>     This function returns a list of lists that place orphan_item in
> between and at
>     the ends of each list element in target_array
>     """
>
>     new_array_length = len(target_array) + 1
>     array_of_arrays = []
>
>     for count in range(new_array_length):
>         array_of_arrays.append(add_at_index(orphan_item,
> target_array, count))
>
>     return array_of_arrays


The description of this function is complicated enough that it needs an 
example in the doc-string. Here's my version:


def insert_each(item, target):
    """Return a list of lists generated from target list by 
    inserting item into each position of the copy:

    >>> insert_each(0, [1, 2, 3])
    [[0, 1, 2, 3], [1, 0, 2, 3], [1, 2, 0, 3], [1, 2, 3, 0]]

    item: any object
    target: any finite sequence or iterable.
    """
    result = []
    for i in range(len(target) + 1):
        new_list = add_at_index(item, target, i)
        result.append(new_list)
    return result



> def permute(original_list, list_of_lists):
>     """
>     accept a list of items, original_list, to insert one at a time
> into list_of_lists
>     using the function insert_each. Called for the first time, 
>     new_list is a list containing an empty list. Then call
> recursively using an updated list_of_lists, called new_list_of_lists
> below, and the remaining items in original list.
>     """

This tells *how* the function works, but not *why*, or *what* it is 
supposed to do. It says to set new_list to [[]] but there is no 
argument new_list, so what is the user supposed to do with it?

Similarly, your variable names tend to describe what they are made of, 
rather than what they represent. "list_of_lists" isn't just some 
arbitrary list containing other lists. It is a list of permutations. 
The fact that each permutation is a list is unimportant.

If the user must call a function with a particular argument, then don't 
make that a requirement. Just have the function create it itself.

There's no need to have the comments "call recursively". Presumably the 
reader can actually read. If you're reading a function called "permute" 
and see a call to permute(), then it's pretty safe to bet that it's a 
recursive call. That's not quite so bad as the archetypal Bad Comment:

    x = x+1  # add one to x

    Wow, thanks Captain Obvious, what would we do without you?
    *rolls eyes*

but it is approaching it. The only time it does need a comment is if, 
somehow, it ISN'T a recursive call.

So, re-writing using the updated functions above and (IMO) more sensible 
variable names, but leaving the basic algorithm alone:

def permute(seq, _permutations=None):
    """Return a list of all the permutations of seq.

    seq: a finite sequence or iterable.
    _permutations: private argument, do not supply this.

    >>> permute([1, 2, 3])
    [[1, 2, 3], [2, 1, 3], [2, 3, 1], ... [3, 2, 1]]
    """
    if _permutations is None:
        _permutations = [[]]
    if not isinstance(seq, list):
        seq = list(seq)
    try:
        item = seq.pop()
    except IndexError:
        # The final iteration has been completed, so we're done.
        return _permutations
    if _permutations == [[]]:
        _permutations = [[item]]
    else:
        temp = []
        for perm in _permutations[:]:
            perm = insert_each(item, perm)
            temp.extend(perm)
        _permutations = temp
    return permute(seq, _permutations)


A couple of comments:

* This is going to be inefficient even for quite small sizes 
  of the input list. It seems to me that you're doing a lot 
  of unnecessary copying of lists and inserting. But don't 
  take that as gospel, I'd need to think about it a bit more
  to be sure of that.

* However the basic approach -- to generate all the 
  permutations at once -- truly is very inefficient. The 
  number of permutations grows exponentially fast, and so do 
  the time and memory requirements. A better approach is to 
  generate the permutations lazily, one at a time: see Python 
  generators and the yield statement.

* Recursion in Python isn't as efficient as some other 
  languages, so this will be unnecessarily slow compared to
  a version using iteration. For small lists you won't notice
  the difference; for large lists you'll run out of memory 
  first; but for intermediate lists you could see some gains
  by re-writing using a loop.




-- 
Steven D'Aprano


More information about the Tutor mailing list