[Tutor] process and modify a list of strings, in place

Steven D'Aprano steve at pearwood.info
Fri Feb 11 01:31:05 CET 2011


John Martinetti wrote:
> Hello -
> 
> I'm a novice programmer, not even amateur level and I need some help with
> developing an algorithm to process a list of strings.
> I hope this list is tolerant of n00bs, if not, please let me know and I'll
> take this elsewhere.

Hi John! This list is specifically for n00bs, so don't be shy.


> #! /usr/bin/python
> import sys, re
> txtreport=open("open_pos.txt",'r')
> 
> openPOs=[]
> 
> while 1:
> 
>    record = txtreport.readline() # - reads in a line of data from the report file
> 
>    vendornum = record[:6]       # - breaks out each column of the report into fields

[snip slicing and dicing of the line]

>    # - create a record from all the fields
>    linedata = (vendornum, vendorname, ordernum, ordersuffix, orderdate,
> buyer, partnum, qty, comment)
>    # - append the record to the list of records representing the CQ report
>    openPOs.append(linedata)
> 
>    # if not the end of the file, do it again
>    if not record:
>       break

The first thing that comes to mind is that you can simplify the line 
processing by using a for loop instead of an infinite while loop:

for line in txtreport:
     # walks the txtreport file, yielding each line in turn
     # and stopping at the end of the file
     vendornum = line[:6]
     # etc.
     record = (vendornum, vendorname, ordernum, ordersuffix,
               orderdate, buyer, partnum, qty, comment)
     openPOs.append(record)

Notice I've swapped the names around -- what you called "record" I'm 
calling "line", and what you called "linedata" I'm calling a record.
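
As a rough, self-contained sketch of that loop (assuming Python 3): 
the sample data and the buyer column position below are made up for 
illustration, and only the six-character vendor number comes from your 
script. io.StringIO stands in for the open report file. A side benefit 
of the for loop is that it never appends an empty record at end of file, 
which the while loop does, because it appends before testing for the 
end.

```python
import io

# Hypothetical fixed-width sample: a 6-character vendor number followed
# by a 6-character buyer field. Only the vendor number width is taken
# from the original script; the rest is assumed.
SAMPLE = (
    "100234ACME  \n"
    "100777      \n"
)

def read_records(lines):
    """Return one (vendornum, buyer) tuple per line of the report."""
    records = []
    for line in lines:          # stops by itself at the end of the file
        vendornum = line[:6]
        buyer = line[6:12]      # assumed column position
        records.append((vendornum, buyer))
    return records

openPOs = read_records(io.StringIO(SAMPLE))
print(openPOs)
```

With a real file you would pass the open file object instead of the 
StringIO, since both yield one line at a time.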


> The part I'm having a problem with is once I've detected that a record has a
> blank buyerID field, I can't seem to figure out how to change it, in place.
> I've tried finding the index of the openPOs using this:
> openPOs.index(line)
> and then trying to change the item like this:
> openPOs[openPOs.index(line),5] = "NOBUYER"
> but I'm getting a strange error:  "TypeError: 'tuple' object does not
> support item assignment"

Correct. Out of the box, Python supports two main built-in sequence 
types, the tuple and the list. A tuple is designed for fixed-size 
records and is immutable -- once created, you can't modify it.

Instead of writing to a tuple in place, you would create a new tuple and 
replace it:

for i in range(len(openPOs)):  # i = 0, 1, 2, ..., N-1
     record = openPOs[i]
     if record[5] == '      ':
         record = record[:5] + ('NOBUYER',) + record[6:]
         openPOs[i] = record


The only tricky part is the NOBUYER element: tuples can only be 
concatenated with other tuples, so it has to be a one-item tuple. Take 
note of the trailing comma inside the round brackets; that comma is 
what makes it a tuple.
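
Here is a small demonstration of the difference the comma makes. The 
three-field record is invented just to keep the example short:

```python
record = ('100777', 'Widget Co', '      ')   # made-up three-field record

not_a_tuple = ('NOBUYER')    # brackets alone only group: this is a str
one_item = ('NOBUYER',)      # the trailing comma makes it a tuple

print(type(not_a_tuple).__name__, type(one_item).__name__)

# Slices of a tuple are tuples, so all three pieces can be joined with +.
fixed = record[:2] + one_item + record[3:]
print(fixed)
```

Leaving the comma out would make the concatenation fail with a 
TypeError, because a tuple and a string can't be added together.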

The alternative is to use a list instead of a tuple. Lists are designed 
for variable length data structures, and so lists can shrink and grow as 
needed, and elements can be changed in place. To use a list, change the 
line in the earlier for-loop to use square brackets instead of round:

     record = [vendornum, vendorname, ordernum, ordersuffix,
               orderdate, buyer, partnum, qty, comment]


and then replace NOBUYER fields in place:

for i in range(len(openPOs)):
     record = openPOs[i]
     if record[5] == '      ':
         record[5] = 'NOBUYER'
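
To see why no "openPOs[i] = record" line is needed here, the sketch 
below runs the same loop on two invented records. Note that I've swapped 
the comparison against six spaces for a strip() test, which is my own 
variation: it treats any empty or whitespace-only buyer as blank, 
whatever the real width of that column is.

```python
# Made-up records; index 5 is the buyer field, as in the loops above.
openPOs = [
    ['100234', 'ACME', '5512', '00', '020111', 'JM    ', 'P-1', '10', ''],
    ['100777', 'Widget Co', '5513', '00', '020311', '      ', 'P-2', '4', ''],
]

for i in range(len(openPOs)):
    record = openPOs[i]          # the same list object, not a copy
    if not record[5].strip():    # empty or whitespace-only buyer
        record[5] = 'NOBUYER'    # so this changes openPOs[i] as well

print([rec[5] for rec in openPOs])
```

Because record and openPOs[i] are two names for one list, changing an 
element through either name is visible through the other.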


One final improvement... whether you use a list or a tuple, the code 
that checks for blank buyers can be simplified. Both versions start off 
with the same two lines of boilerplate code:

for i in range(len(openPOs)):
     record = openPOs[i]
     # ...

We can simplify this to a single line that gets the record number i and 
the record itself together:

for i, record in enumerate(openPOs):
     # ...
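
Filling in the "# ..." part, the two versions might look something 
like this. The function names and sample records are hypothetical, and 
the buyer field is assumed to be six spaces when blank, as in the 
earlier loops:

```python
BLANK = '      '   # six spaces, as in the comparison used earlier

def fill_blank_buyers_tuples(records):
    """Tuple version: build a new tuple and store it back at index i."""
    for i, record in enumerate(records):
        if record[5] == BLANK:
            records[i] = record[:5] + ('NOBUYER',) + record[6:]

def fill_blank_buyers_lists(records):
    """List version: change the buyer field of each record in place."""
    for i, record in enumerate(records):
        if record[5] == BLANK:
            record[5] = 'NOBUYER'

# Made-up sample data with the buyer in position 5.
fields = ['100777', 'Widget Co', '5513', '00', '020311', BLANK, 'P-2', '4', '']
tuple_pos = [tuple(fields)]
list_pos = [list(fields)]

fill_blank_buyers_tuples(tuple_pos)
fill_blank_buyers_lists(list_pos)
print(tuple_pos[0][5], list_pos[0][5])
```

In the list version the index i ends up unused, so a plain 
"for record in records:" would do just as well there.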



Hope this helps.



-- 
Steven

