Namedtuples problem
Peter Otten
__peter__ at web.de
Thu Feb 23 05:34:11 EST 2017
Deborah Swanson wrote:
> This is how the list of namedtuples is originally created from a csv:
>
> infile = open("E:\\Coding projects\\Pycharm\\Moving\\Moving 2017 in - test.csv")
> rows = csv.reader(infile)
> fieldnames = next(rows)
> Record = namedtuple("Record", fieldnames)
> records = [Record._make(fieldnames)]
> records.extend(Record._make(row) for row in rows)
>
> Thanks to Peter Otten for this succinct code, and to Greg Ewing for
> suggesting namedtuples for this type of problem to begin with.
>
> Namedtuples worked beautifully for the first two thirds of this code,
> but I've run into a snag attempting to proceed.
>
> Here's my code up to the snag, and I'll explain afterwards what I'm
> trying to do:
>
> import operator
> records[1:] = sorted(records[1:], key=operator.attrgetter("title",
> "Date"))
>
> groups = defaultdict()
> for r in records[1:]:
>     # if the key doesn't exist, make a new group
>     if r.title not in groups.keys():
>         groups[r.title] = [r]
>     # if key (group) exists, append this record
>     else:
>         groups[r.title].append(r)
>
> # make lookup table: indices for field names
> records_idx = {}
> for idx, label in enumerate(records[0]):
>     records_idx[label] = idx
>
> LABELS = ['Location', 'ST', 'co', 'miles', 'first', 'Kind', 'Notes']
> # look at field values for each label on group
> for group in groups.values():
>     values = []
>     for idx, row in enumerate(group):
>         for label in LABELS:
>             values.append(group[[idx][records_idx[label]]])    <-snag
>
> I want to get lists of field values from the list of namedtuples, one
> list of field values for each row in each group (groups are defined in
> the section beginning with "groups = defaultdict()").
>
> LABELS defines the field names for the columns of field values of
> interest. So all the locations in this group would be in one list, all
> the states in another list, etc. (Jussi, I'm looking at your suggestion
> for the next part.)
>
> (I'm quite sure this bit of code could be written with list and dict
> comprehensions, but here I was just trying to get it to work, and
> comprehensions still confuse me a little.)
>
> Using the debugger's watch window, from
> group[[idx][records_idx[label]]], I get:
>
> idx = {int}: 0
> records_idx[label] = {int}: 4
>
> which is the correct indices for the first row of the current group (idx
> = 0) and the first field label in LABELS, 'Location' (records_idx[label]
> = 4).
>
> And if I look at
>
> group[0][4] = 'Longview'
>
> this is also correct. Longview is the Location field value for the first
> row of this group.
>
> However,
>
> group[[idx][records_idx[label]]]
> gets an Index Error: list index out of range
>
> I've run into this kind of problem with namedtuples before, trying to
> access field values with variable names, like:
>
> label = 'Location'
> records.label
>
> and I get something like "'records' has no attribute 'label'". This can
> be fixed by using the subscript form and an index, like:
>
> for idx, r in enumerate(records):
>     ...
>     records[idx] = r
>
> But here, I get the Index Error and I'm a bit baffled why. Both
> subscripts evaluate to valid indices and give the correct value when
> explicitly used.
>
> Can anyone see why I'm getting this Index error? and how to fix it?
I'm not completely sure I can follow you, but you seem to be mixing two
problems
(1) split a list into groups
(2) convert a list of rows into a list of columns
and making a kind of mess in the process. Functions to the rescue:
# untested
import operator
from collections import defaultdict, namedtuple

def split_into_groups(records, key):
    groups = defaultdict(list)
    for record in records:
        # no need to check if a group already exists
        # an empty list will automatically be added for every
        # missing key
        groups[key(record)].append(record)
    return groups

def extract_column(records, name):
    # you will agree that extracting one column is easy :)
    return [getattr(record, name) for record in records]

def extract_columns(records, names):
    # we can build on that to make a list of columns
    return [extract_column(records, name) for name in names]

wanted_columns = ['Location', ...]
records = ...
groups = split_into_groups(records, operator.attrgetter("title"))
Columns = namedtuple("Columns", wanted_columns)
for title, group in groups.items():
    # for easier access we turn the list of columns
    # into a namedtuple of columns
    groups[title] = Columns._make(extract_columns(group, wanted_columns))
If all worked well you should now be able to get a group with
groups["whatever"]
and all locations for that group with
groups["whatever"].Location
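
Here is a self-contained sketch of the same approach that can be run as is. The three records, their field values, and the name columns_by_title are made up for illustration; the real rows would come from the csv file. Instead of overwriting groups in place, this version collects the result in a separate dict, which is only one possible choice.

```python
import operator
from collections import defaultdict, namedtuple

def split_into_groups(records, key):
    # defaultdict(list) creates an empty list for every missing key
    groups = defaultdict(list)
    for record in records:
        groups[key(record)].append(record)
    return groups

def extract_column(records, name):
    # one column: the same field taken from every record
    return [getattr(record, name) for record in records]

def extract_columns(records, names):
    # a list of columns, in the order given by names
    return [extract_column(records, name) for name in names]

# Hypothetical sample data standing in for the csv rows.
Record = namedtuple("Record", ["title", "Date", "Location", "ST"])
records = [
    Record("house", "2017-02-01", "Longview", "WA"),
    Record("house", "2017-02-03", "Kelso", "WA"),
    Record("cabin", "2017-02-02", "Astoria", "OR"),
]

wanted_columns = ["Location", "ST"]
Columns = namedtuple("Columns", wanted_columns)

groups = split_into_groups(records, operator.attrgetter("title"))
columns_by_title = {
    title: Columns._make(extract_columns(group, wanted_columns))
    for title, group in groups.items()
}

print(columns_by_title["house"].Location)  # ['Longview', 'Kelso']
print(columns_by_title["cabin"].ST)        # ['OR']
```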
If there is a bug you can pinpoint the function that doesn't work and ask
for specific help on that one.
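
As for the IndexError in the quoted code: as far as I can tell, in group[[idx][records_idx[label]]] the inner [idx] is a one-element list literal, so [idx][4] is evaluated first and fails before group is ever indexed. The debugger expression group[0][4] worked because it chains two subscripts instead of nesting them. A minimal sketch with made-up data (the two-field Record and its values are assumptions) that shows the failing expression, the chained form, and getattr() for a field name held in a variable:

```python
from collections import namedtuple

# Hypothetical two-field record; the real one has more columns.
Record = namedtuple("Record", ["title", "Location"])
group = [Record("house", "Longview"), Record("house", "Kelso")]
idx, col = 0, 1

try:
    # [idx] is the list [0]; indexing it with 1 is out of range
    group[[idx][col]]
except IndexError as exc:
    error = str(exc)

# chained subscripts: pick the row first, then the field
by_index = group[idx][col]

# field name held in a variable: use getattr() instead of record.label
label = "Location"
by_name = getattr(group[idx], label)

print(error)
print(by_index, by_name)
```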