Pythonification of the asterisk-based collection packing/unpacking syntax
Steven D'Aprano
steve+comp.lang.python at pearwood.info
Wed Dec 28 01:25:35 EST 2011
On Wed, 28 Dec 2011 15:06:37 +1100, Chris Angelico wrote:
> On Wed, Dec 28, 2011 at 10:10 AM, Steven D'Aprano
> <steve+comp.lang.python at pearwood.info> wrote:
>> Your original use-case, where you want to change the type of tail from
>> a list to something else, is simply solved by one extra line of code:
>>
>> head, *tail = sequence
>> tail = tuple(tail)
>
> That achieves the goal of having tail as a different type, but it does
> have the additional cost of constructing and then discarding a temporary
> list. I know this is contrived, but suppose you have a huge
> set/frozenset using tuples as the keys, and one of your operations is to
> shorten all keys by removing their first elements. Current Python
> roughly doubles the cost of this operation, since you can't choose what
> type the tail is made into.
The First Rule of Program Optimization:
- Don't do it.
The Second Rule of Program Optimization (for experts only):
- Don't do it yet.
Building syntax to optimize imagined problems is rarely a good idea. The
difference between 2 seconds processing your huge set and 4 seconds
processing it is unlikely to be significant unless you have dozens of
such huge sets and less than a minute to process them all.
And your idea of "huge" is probably not that big... it makes me laugh
when people ask how to optimize code "because my actual data has HUNDREDS
of items!". Whoop-de-doo. Come back when you have a hundred million
items, then I'll take your question seriously.
(All references to "you" and "your" are generic, and not aimed at Chris
personally. Stupid English language.)
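One way to find out whether the temporary list matters is to measure it before proposing new syntax. The following is only a sketch: the key count, the key length, and the function names via_unpacking and via_slicing are made up for illustration, and the timings will vary by machine and Python version.

```python
import timeit

# Hypothetical data: 10,000 five-element tuples standing in for the "huge" keys.
keys = [tuple(range(i, i + 5)) for i in range(10_000)]


def via_unpacking(keys):
    """Shorten each key with star-unpacking, then convert the list to a tuple."""
    out = set()
    for key in keys:
        head, *tail = key        # tail is a temporary list
        out.add(tuple(tail))     # converted and then discarded
    return out


def via_slicing(keys):
    """Shorten each key by slicing, which yields a tuple directly."""
    return {key[1:] for key in keys}


if __name__ == "__main__":
    for func in (via_unpacking, via_slicing):
        seconds = timeit.timeit(lambda: func(keys), number=5)
        print(f"{func.__name__}: {seconds:.3f}s for 5 runs")
```

Both functions build the same set; only the measured difference on real data can say whether the extra list is worth worrying about.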
> But if that's what you're trying to do, it's probably best to slice
> instead of unpacking.
Assuming the iterable is a sequence.
Fortunately, most iterable constructors accept iterators directly, so for
the cost of an extra line (three instead of two), you can handle data
structures as big as will fit into memory:
# I want to keep both the old and the new set
it = iter(huge_set_of_tuples)
head = next(it) # actually an arbitrary item
tail = set(x[1:] for x in it) # and everything else
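For anyone who wants to run that snippet as-is, here is the same thing with a small made-up set standing in for huge_set_of_tuples (the sample values are assumptions, not real data):

```python
# Hypothetical sample data standing in for huge_set_of_tuples.
huge_set_of_tuples = {(1, 2, 3), (4, 5, 6), (7, 8, 9)}

it = iter(huge_set_of_tuples)
head = next(it)                  # an arbitrary item, since sets are unordered
tail = set(x[1:] for x in it)    # every remaining key, minus its first element

# The original set is left untouched, and head is not represented in tail.
print(len(huge_set_of_tuples), head in huge_set_of_tuples, len(tail))
```

No intermediate list is built: the generator expression feeds the shortened keys straight into set().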
If you don't need both the old and the new:
head = huge_set_of_tuples.pop()
tail = set()
while huge_set_of_tuples:
    tail.add(huge_set_of_tuples.pop()[1:])
assert huge_set_of_tuples == set([])
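The destructive version can be wrapped in a function for testing. This is a sketch only: the name shorten_destructively and the sample keys are hypothetical, and like the snippet above it raises KeyError if the set starts out empty.

```python
def shorten_destructively(keys):
    """Empty the set `keys` in place and return (head, tail).

    head is one arbitrary original key; tail is the set of all the
    other keys with their first element removed.
    """
    head = keys.pop()
    tail = set()
    while keys:
        # Each key is removed from the old set as it is added to the new
        # one, so the two sets together never hold much more than one copy.
        tail.add(keys.pop()[1:])
    return head, tail


# Hypothetical sample data.
keys = {("a", 1), ("b", 2), ("c", 3)}
originals = set(keys)            # kept only so the result can be checked
head, tail = shorten_destructively(keys)
print(head, tail, keys)
```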
If you rely on language features, who knows how efficient the compiler
will be?
head, tail::tuple = ::sequence
may create a temporary list before building the tuple anyway. And why
not? That's what this *must* do:
head, second, middle::tuple, second_from_last, last = ::iterator
because tuples are immutable and can't be grown or shrunk, so why assume
the language designers special cased the first form above?
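The tail::tuple spelling is only the syntax proposed in this thread, so it cannot be run, but the existing star-unpacking shows the same constraint. In this sketch (the range(10) input is made up), the starred target comes back as a list because the interpreter has to exhaust the iterator before it knows which items are the last two:

```python
iterator = iter(range(10))

# Existing Python syntax: the starred name in the middle collects a list.
head, second, *middle, second_from_last, last = iterator
middle_type = type(middle)       # list in current Python

# Getting a tuple means converting that list afterwards.
middle = tuple(middle)

print(middle_type.__name__, head, second, middle, second_from_last, last)
```

Any implementation of the proposed form would have to buffer the middle items somewhere before it could build an immutable tuple of the right size.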
> Fortunately, the Zen of Python "one obvious way to
> do it" doesn't stop there being other ways that work too.
Exactly. It is astonishing how many people think that if there isn't a
built-in language feature, with special syntax, to do something, there's
a problem that needs to be solved.
--
Steven