[Tutor] process and modify a list of strings, in place
Steven D'Aprano
steve at pearwood.info
Fri Feb 11 01:31:05 CET 2011
John Martinetti wrote:
> Hello -
>
> I'm a novice programmer, not even amateur level and I need some help with
> developing an algorithm to process a list of strings.
> I hope this list is tolerant of n00bs, if not, please let me know and I'll
> take this elsewhere.
Hi John! This list is specifically for n00bs, so don't be shy.
> #! /usr/bin/python
> import sys, re
> txtreport=open("open_pos.txt",'r')
>
> openPOs=[]
>
> while 1:
>
>     record = txtreport.readline() # - reads in a line of data from the report file
>
>     vendornum = record[:6] # - breaks out each column of the report into fields
[snip slicing and dicing of the line]
>     # - create a record from all the fields
>     linedata = (vendornum, vendorname, ordernum, ordersuffix, orderdate,
>                 buyer, partnum, qty, comment)
>     # - append the record to the list of records representing the CQ report
>     openPOs.append(linedata)
>
>     # if not the end of the file, do it again
>     if not record:
>         break
The first thing that comes to mind is that you can simplify the line
processing by using a for loop instead of an infinite while loop:
for line in txtreport:
    # walks the txtreport file, yielding each line in turn
    # and stopping at the end of the file
    vendornum = line[:6]
    # etc.
    record = (vendornum, vendorname, ordernum, ordersuffix,
              orderdate, buyer, partnum, qty, comment)
    openPOs.append(record)
Notice I've swapped the names around -- what you called "record" I'm
calling "line", and what you called "linedata" I'm calling a record.
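To see the loop run end to end, here is a minimal sketch. The real slicing was snipped above, so the column widths and sample values below are invented for illustration, only three of the nine fields are shown, and io.StringIO stands in for the open report file. make_line is a hypothetical helper that just builds the fake fixed-width lines.

```python
import io

# Hypothetical helper: pads each value to an assumed column width
# (6, 10 and 3 characters). The real report layout is not known here.
def make_line(vendornum, vendorname, buyer):
    return "{:<6}{:<10}{:<3}\n".format(vendornum, vendorname, buyer)

# Stand-in for open("open_pos.txt"); a real file object iterates the same way.
txtreport = io.StringIO(
    make_line("100001", "ACME CORP", "JM")
    + make_line("100002", "WIDGETCO", "")
)

openPOs = []
for line in txtreport:
    vendornum = line[:6]
    vendorname = line[6:16]
    buyer = line[16:19]
    record = (vendornum, vendorname, buyer)
    openPOs.append(record)

print(len(openPOs))
```

The loop ends by itself when the file runs out, so there is no need for a separate end-of-file test or a break.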
> The part I'm having a problem with is once I've detected that a record has a
> blank buyerID field, I can't seem to figure out how to change it, in place.
> I've tried finding the index of the openPOs using this:
> openPOs.index(line)
> and then trying to change the item like this:
> openPOs[openPOs.index(line),5] = "NOBUYER"
> but I'm getting a strange error: "TypeError: 'tuple' object does not
> support item assignment"
Correct. Out of the box, Python supports two main built-in sequence
types, the tuple and the list. A tuple is designed for fixed-size
records, and is designed to be immutable -- once created, you can't
modify it.
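A small sketch of that error, using a made-up three-field record rather than your real data:

```python
record = ('100001', 'ACME CORP', 'JM')   # sample values, invented

try:
    record[2] = 'NOBUYER'                # tuples refuse item assignment
except TypeError as err:
    message = str(err)
    print(message)

# The tuple is left exactly as it was.
print(record)
```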
Instead of writing to a tuple in place, you would create a new tuple and
replace it:
for i in range(len(openPOs)):  # i = 0, 1, 2, ... N
    record = openPOs[i]
    if record[5] == ' ':
        record = record[:5] + ('NOBUYER',) + record[6:]
        openPOs[i] = record
The only part that's a bit tricky is that you need a one-item tuple for
the NOBUYER element. To do that, you need to know a trick for creating a
one-element tuple: take note of the comma inside the brackets.
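Here is a quick illustration of the comma trick, followed by the slice-and-rebuild step on a single record. The nine field values are invented for the example.

```python
not_a_tuple = ('NOBUYER')    # brackets alone: this is still just a string
one_item = ('NOBUYER',)      # the trailing comma is what makes it a tuple

print(type(not_a_tuple).__name__)
print(type(one_item).__name__)

# Made-up record with a blank buyer in position 5.
record = ('100001', 'ACME CORP', '000123', '00', '02/10/11',
          ' ', 'P-1', '5', 'rush')

# Everything before index 5, the new value, then everything after index 5.
new_record = record[:5] + one_item + record[6:]
print(new_record[5])
```

The original record is untouched; new_record is a brand new tuple of the same length.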
The alternative is to use a list instead of a tuple. Lists are designed
for variable length data structures, and so lists can shrink and grow as
needed, and elements can be changed in place. To use a list, change the
line in the earlier for-loop to use square brackets instead of round:
    record = [vendornum, vendorname, ordernum, ordersuffix,
              orderdate, buyer, partnum, qty, comment]
and then replace NOBUYER fields in place:
for i in range(len(openPOs)):
    record = openPOs[i]
    if record[5] == ' ':
        record[5] = 'NOBUYER'
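A runnable sketch of the list version, with invented sample records. One assumption worth flagging: if the buyer column in the real report is several characters wide, a blank buyer will be several spaces and will not equal a single space, so this sketch tests record[5].strip() instead. That is a substitution on my part, so adjust it to match your data.

```python
# Made-up records; each one is a list, so it can be changed in place.
openPOs = [
    ['100001', 'ACME CORP', '000123', '00', '02/10/11', 'JM ', 'P-1', '5', ''],
    ['100002', 'WIDGETCO', '000124', '00', '02/10/11', '   ', 'P-2', '1', ''],
]

for i in range(len(openPOs)):
    record = openPOs[i]
    if record[5].strip() == '':   # empty or whitespace-only buyer (assumed test)
        record[5] = 'NOBUYER'     # changes the list that openPOs already holds

print(openPOs[1][5])
```

Because record is the very same list object that sits inside openPOs, there is no need to assign it back.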
One final improvement... whether you use a list or a tuple, the code
that checks for blank buyers can be simplified. Both versions start off
with the same two lines of boilerplate code:
for i in range(len(openPOs)):
    record = openPOs[i]
    # ...
We can simplify this to a single line that gets the record number i and
the record itself together:
for i, record in enumerate(openPOs):
    # ...
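Putting that together for the tuple version, as a sketch with invented sample records:

```python
# Made-up records stored as tuples.
openPOs = [
    ('100001', 'ACME CORP', '000123', '00', '02/10/11', 'JM', 'P-1', '5', ''),
    ('100002', 'WIDGETCO', '000124', '00', '02/10/11', ' ', 'P-2', '1', ''),
]

for i, record in enumerate(openPOs):
    if record[5] == ' ':
        # Build a replacement tuple and store it back at the same position.
        openPOs[i] = record[:5] + ('NOBUYER',) + record[6:]

print(openPOs[1][5])
```

Replacing an item while looping like this is safe because the length of openPOs never changes. With the list version you do not need the index at all, since record[5] = 'NOBUYER' changes the record in place, so a plain "for record in openPOs:" would do.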
Hope this helps.
--
Steven