Syntactic sugar for assignment statements: one value to multiple targets?

gc gc1223 at gmail.com
Tue Aug 16 20:14:16 EDT 2011


On Aug 16, 4:39 pm, "Martin P. Hellwig" <martin.hell... at gmail.com>
wrote:
> On 03/08/2011 02:45, gc wrote:
> <cut>
>
> > a,b,c,d,e = *dict()
>
> > where * in this context means something like "assign separately to
> > all.

> <snip> . . . it has a certain code smell to it. <snip>
> I would love to see an example where you would need such a construct.

Perfectly reasonable request! Maybe there aren't as many cases when
multiple variables need to be initialized to the same value as I think
there are.

I'm a heavy user of collections, especially counters and defaultdicts.
One frequent pattern involves boiling (typically) SQLite records down
into Python data structures for further manipulation. (OK, arguably
that has some code smell right there--but I often have to do very
expensive analysis on large subsets with complex definitions which can
be very expensive to pull, sometimes requiring table scans over tens
of gigabytes. I *like* being able to use dicts or other structures as
a way to cache and structure query results in ways amenable to
analysis procedures, even if it doesn't impress Joe Celko.)

defaultdict(list) is a very clean way to do this. I'll often have four
or five of them collecting different subsets of a single SQL pull, as
in:

# PROPOSED SYNTAX:
all_pets_by_pet_store, blue_dogs_by_pet_store,
green_cats_by_pet_store, red_cats_and_birds_by_pet_store =
*defautdict(list)

# (Yes, indexes on color and kind would speed up this query, but the
actual fields can be way quite complex and have much higher
cardinality.)
for pet_store, pet_kind, pet_color, pet_weight, pet_height in
cur.execute("""SELECT s, k, c, w, h FROM SuperExpensivePetTable WHERE
CostlyCriterionA(criterion_basis) IN("you", "get", "the", "idea")"""):
    all_pets_by_pet_store[pet_store].append(pet_weight, pet_height)
    if pet_color in ("Blue", "Cyan", "Dark Blue") and pet_kind in
("Dog", "Puppy"):
        blue_dogs_by_pet_store[pet_store].append(pet_weight,
pet_height)
        #... and so forth

all_pets_analysis  =
BootstrappedMarkovDecisionForestFromHell(all_pets_by_pet_store)
blue_dogs_analysis =
BootstrappedMarkovDecisionForestFromHell(blue_dogs_by_pet_store)
red_cats_and_birds_analysis =
BMDFFHPreyInteracton(red_cats_and_bird_by_pet_store)
#... and so forth

Point is, I'd like to be able to create such collections cleanly, and
a,b,c = *defaultdict(list) seems as clean as it gets. Plus, when I
realize I need six (or only three) it's annoying to need to change two
redundant things (i.e. both the variable names and the count.)

if tl_dr: break()

Generally speaking, when you know you need several variables of the
same (maybe complex) type, the proposed syntax lets the equals sign
fully partition the variables question (how many variables do I need,
and what are they called?) from the data structure question (what type
should the variables be?) The other syntaxes (except Tim's generator-
slurping one, which as Tim points out has its own issues) all spread
the variables question across the equals sign, breaking the analogy
with single assignment (where the form is Variable = DataStructure).
If we're allowing multiple assignment, why can't we allow some form of
Variable1, Variable2, Variable3 = DataStructure without reaching for
list comprehensions, copy-pasting and/or keeping a side count of the
variables?

if much_tl_dr: break()

Let me address one smell from my particular example, which may be the
one you're noticing. If I needed fifty parallel collections I would
not use separate variables; I've coded a ghastly defaultdefaultdict
just for this purpose, which effectively permits what most people
would express as defaultdict(defaultdict(list)) [not possible AFAIK
with the existing defaultdict class]. But for reasons of explicitness
and simplicity I try to avoid hash-tables of hash-tables (and higher
iterations). I'm not trying to use dicts to inner-platform a fully
hashed version of SQL, but to boil things down.

Also bear in mind, reading the above, that I do a lot of script-type
programming which is constantly being changed under severe time
pressure (often as I'm sitting next to my clients), which needs to be
as simple as possible, and which tech-savvy non-programmers need to
understand. A lot of my value comes from quickly creating code which
other people (usually research academics) can feel ownership over.
Python is already a great language for this, and anything which makes
the syntax cleaner and more expressive and which eliminates redundancy
helps me do my job better. If I were coding big, stable applications
where the defaultdicts were created in a header file and wouldn't be
touched again for three years, I would care less about a more awkward
initialization method.



More information about the Python-list mailing list