Namedtuples: some unexpected inconveniences
MRAB
python at mrabarnett.plus.com
Wed Apr 12 16:42:12 EDT 2017
On 2017-04-12 20:57, Deborah Swanson wrote:
> I won't say the following points are categorically true, but I became
> convinced enough they were true in this instance that I abandoned the
> advised strategy. Which was to use defaultdict to group the list of
> namedtuples by one of the fields for the purpose of determining whether
> certain other fields in each group were either missing values or
> contained contradictory values.
>
> Are these bugs, or was there something I could have done to avoid these
> problems? Or are they just things you need to know working with
> namedtuples?
>
> The list of namedtuples was created with:
>
> infile = open("E:\\Coding projects\\Pycharm\\Moving\\Moving 2017 in - test.csv")
> rows = csv.reader(infile)
> fieldnames = next(rows)
> Record = namedtuple("Record", fieldnames)
> records = [Record._make(fieldnames)]
> records.extend(Record._make(row) for row in rows)
> . . .
> (many lines of field processing code)
> . . .
>
> then the attempt to group the records by title:
>
> import operator
> records[1:] = sorted(records[1:], key=operator.attrgetter("title", "Date"))
> groups = defaultdict()
> for r in records[1:]:
>     # if the key doesn't exist, make a new group
>     if r.title not in groups.keys():
>         groups[r.title] = [r]
>     # if key (group) exists, append this record
>     else:
>         groups[r.title].append(r)
>
> (Please note that this default dict will not automatically make new keys
> when they are encountered, possibly because the keys of the defaultdict
> are made from namedtuples and the values are namedtuples. So you have to
> include the step to make a new key when a key is not found.)
>
The defaultdict _will_ work when you use it properly. :-)
The line should be:
groups = defaultdict(list)
so that it'll make a new list every time a new key is automatically added.
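Called with no argument, defaultdict() has no default factory, so a missing key raises KeyError just as it would with a plain dict. Here is a minimal sketch of the grouping with a factory supplied. The field names and sample rows are made up for illustration; they are not your CSV data.

```python
from collections import defaultdict, namedtuple

# Hypothetical stand-in for the records read from the CSV file.
Record = namedtuple("Record", ["title", "Date"])
records = [
    Record("Apt A", "2017-04-01"),
    Record("Apt B", "2017-04-02"),
    Record("Apt A", "2017-04-03"),
]

groups = defaultdict(list)
for r in records:
    # A missing key gets a new empty list on first access,
    # so no explicit "is the key there yet?" test is needed.
    groups[r.title].append(r)
```

Note that the lists in groups hold references to the very same Record objects that are in records; nothing is copied.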
Another point: namedtuples, as with normal tuples, are immutable; once
created, you can't change an attribute. A dict might be a better bet.
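That immutability is probably what makes it look as though the groups hold copies. A rough sketch, with hypothetical field names: the only way to "change" a namedtuple is _replace, which returns a new tuple, so the other list still refers to the old one.

```python
from collections import namedtuple

Record = namedtuple("Record", ["title", "Date"])  # hypothetical fields
original = Record("Apt A", "")
records = [original]
group = records[:]  # a second list referring to the same object

# Direct assignment to a field is rejected.
try:
    original.Date = "2017-04-12"
    assignment_failed = False
except AttributeError:
    assignment_failed = True

# _replace builds a new tuple; the entry in records is untouched.
group[0] = group[0]._replace(Date="2017-04-12")
same_object = group[0] is records[0]

# A dict can be updated in place, so every list that refers to it
# sees the change.
row = original._asdict()
rows = [row]
row_group = rows[:]
row_group[0]["Date"] = "2017-04-12"
```

With dicts (or any mutable record type) the change made through row_group is visible through rows as well, because both lists refer to one object.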
> If you succeed in modifying records in a group, the dismaying thing is
> that the underlying records are not updated, making the entire exercise
> totally pointless, which was a severe and unexpected inconvenience.
>
> It looks like the values and the structure were only copied from the
> original list of namedtuples to the defaultdict. The rows of the
> grouped-by dict still behave like namedtuples, but they are no longer
> the same namedtuples as the original list of namedtuples. (I'm sure I
> didn't say that quite right, please correct me if you have better words
> for it.)
>
> It might be possible to complete the operation and then write out the
> groups of rows of namedtuples in the dict to a simple list of
> namedtuples, discarding the original, but at the time I noticed that
> modifying rows in a group didn't change the values in the original list
> of namedtuples, I still had further to go with the dict of groups, and
> it was looking easier by the minute to solve the missing values problem
> directly from the original list of namedtuples, so that's what I did.
>
> If requested I can reproduce how I saw that the original list of
> namedtuples was not changed when I modified field values in group rows
> of the dict, but it's lengthy and messy. It might be worthwhile though
> if someone might see a mistake I made, though I found the same behavior
> several different ways. Which was when I called it barking up the wrong
> tree and quit trying to solve the problem that way.
>
> Another inconvenience is that there appears to be no way to access field
> values of a named tuple by variable, although I've had limited success
> accessing by variable indices. However, direct attempts to do so, like:
>
> values = {row[label] for row in group}
> (where 'label' is a variable for the field names of a namedtuple)
>
> gets "object has no attribute 'label'"
>
> or, where 'record' is a row in a list of namedtuples and 'label' is a
> variable for the fieldnames of a namedtuple:
>
> value = getattr(record, label)
> setattr(record, label, value)
>
> also don't work.
>
> You get the error "object has no attribute 'label'" every time.
>
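As far as I can tell, getattr with a variable holding the field name does work on a namedtuple; it is indexing with a string and setattr that fail. A small sketch, again with made-up field names, assuming label holds one of the names in Record._fields:

```python
from collections import namedtuple

Record = namedtuple("Record", ["title", "Date"])  # hypothetical fields
record = Record("Apt A", "2017-04-12")
label = "Date"  # a variable holding a field name

# Reading a field by name held in a variable works.
value = getattr(record, label)

# Indexing needs an integer position, not a field name.
try:
    record[label]
    index_error = None
except TypeError as exc:
    index_error = exc

# The position can be looked up from the _fields tuple.
position = Record._fields.index(label)
same_value = record[position]

# Setting a field fails because the tuple is immutable.
try:
    setattr(record, label, "2017-05-01")
    set_failed = False
except AttributeError:
    set_failed = True
```

If label held a string that is not a field name (the literal 'label', say), getattr would raise the AttributeError you describe.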