Pythonification of the asterisk-based collection packing/unpacking syntax

Steven D'Aprano steve+comp.lang.python at pearwood.info
Wed Dec 28 01:25:35 EST 2011


On Wed, 28 Dec 2011 15:06:37 +1100, Chris Angelico wrote:

> On Wed, Dec 28, 2011 at 10:10 AM, Steven D'Aprano
> <steve+comp.lang.python at pearwood.info> wrote:
>> Your original use-case, where you want to change the type of tail from
>> a list to something else, is simply solved by one extra line of code:
>>
>> head, *tail = sequence
>> tail = tuple(tail)
> 
> That achieves the goal of having tail as a different type, but it does
> have the additional cost of constructing and then discarding a temporary
> list. I know this is contrived, but suppose you have a huge
> set/frozenset using tuples as the keys, and one of your operations is to
> shorten all keys by removing their first elements. Current Python
> roughly doubles the cost of this operation, since you can't choose what
> type the tail is made into.

The First Rule of Program Optimization:
- Don't do it.

The Second Rule of Program Optimization (for experts only):
- Don't do it yet.


Building syntax to optimize imagined problems is rarely a good idea. The 
difference between 2 seconds processing your huge set and 4 seconds 
processing it is unlikely to be significant unless you have dozens of 
such huge sets and less than a minute to process them all.
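
If the cost really matters, measure it before proposing syntax. A rough sketch using the standard timeit module; the ten-element key and the function names are made up for illustration, and the numbers will vary by machine and Python version:

```python
import timeit

# Hypothetical key, standing in for one tuple from the "huge" set.
key = tuple(range(10))

def via_unpacking():
    # Star-unpacking builds a list, which is then copied into a tuple.
    head, *tail = key
    return tuple(tail)

def via_slicing():
    # Slicing a tuple produces the shortened tuple directly.
    return key[1:]

number = 10000
t_unpack = timeit.timeit(via_unpacking, number=number)
t_slice = timeit.timeit(via_slicing, number=number)
print("unpack + tuple(): %.4f s for %d calls" % (t_unpack, number))
print("slice:            %.4f s for %d calls" % (t_slice, number))
```

Whether the difference is worth caring about depends on how many keys there are and how often the operation runs.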

And your idea of "huge" is probably not that big... it makes me laugh 
when people ask how to optimize code "because my actual data has HUNDREDS 
of items!". Whoop-de-doo. Come back when you have a hundred million 
items, then I'll take your question seriously.

(All references to "you" and "your" are generic, and not aimed at Chris 
personally. Stupid English language.)


> But if that's what you're trying to do, it's probably best to slice
> instead of unpacking.

Assuming the iterable is a sequence.

Fortunately, most iterable constructors accept iterators directly, so for 
the cost of an extra line (three instead of two), you can handle data 
structures as big as will fit into memory:

# I want to keep both the old and the new set
it = iter(huge_set_of_tuples)
head = next(it)  # actually an arbitrary item
tail = set(x[1:] for x in it)  # and everything else
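
The same pattern works for any iterable and any constructor that accepts an iterator. A minimal sketch; the helper name head_and_tail and its factory parameter are illustrative, not anything in the standard library:

```python
def head_and_tail(iterable, factory=tuple):
    """Return (first item, factory(remaining items)).

    No explicit intermediate list is created here; what the constructor
    does internally is an implementation detail.
    """
    it = iter(iterable)
    head = next(it)  # raises StopIteration if the iterable is empty
    return head, factory(it)

# Works on sequences and on one-shot iterators alike.
print(head_and_tail(range(5)))  # (0, (1, 2, 3, 4))
print(head_and_tail((n * n for n in range(4)), factory=list))  # (0, [1, 4, 9])
```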

If you don't need both the old and the new:

head = huge_set_of_tuples.pop()
tail = set()
while huge_set_of_tuples:
    tail.add(huge_set_of_tuples.pop()[1:])
assert huge_set_of_tuples == set()

If you rely on language features, who knows how efficient the compiler 
will be?

head, tail::tuple = ::sequence

may create a temporary list before building the tuple anyway. And why 
not? That's what this *must* do:

head, second, middle::tuple, second_from_last, last = ::iterator

because tuples are immutable and can't be grown or shrunk, so why assume 
the language designers special-cased the first form above?
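
For comparison, existing star-unpacking in Python 3 already behaves this way as far as the visible result goes: the starred target always receives a list, and converting it is a separate step. A small sketch (the sample iterator is made up):

```python
# Hypothetical data: any iterator of at least four items will do.
iterator = iter(range(10))

head, second, *middle, second_from_last, last = iterator

# The starred name is always bound to a list, whatever the source type.
assert type(middle) is list

# Getting a tuple costs one more line and one more copy.
middle = tuple(middle)
print(head, second, middle, second_from_last, last)
```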


> Fortunately, the Zen of Python "one obvious way to
> do it" doesn't stop there being other ways that work too.

Exactly. It is astonishing how many people think that if there isn't a 
built-in language feature, with special syntax, to do something, there's 
a problem that needs to be solved.


-- 
Steven


