FW: [Tutor] Finding items in list of lists.
Doug.Shawhan@gecits.ge.com
Tue Mar 18 09:47:02 2003
Much food for thought. Thanks for the clarification regarding collating.
-----Original Message-----
From: Bob Gailer [mailto:bgailer@alum.rpi.edu]
Sent: Monday, March 17, 2003 5:08 PM
To: Shawhan, Doug (CAP, ITS, US); tutor@python.org
Subject: Re: FW: [Tutor] Finding items in list of lists.
At 03:32 PM 3/17/2003 -0500, Doug.Shawhan@gecits.ge.com wrote:
>------------------------snip-------------------------
>
>import string
>import xreadlines
># Grab data from disk
>f=open("\\tmp\\sample.txt","r")
>rawData=[]
>for line in xreadlines.xreadlines(f):
> rawData.append(string.split(line,'\t'))
># Get rid of the top "row", since it contains no useful data by default
>del rawData[0]
># We want to sort by the shared value which is in the tenth "column"
First, it looks like you are collating rather than sorting. Sorting implies
putting in order and all I see this code doing is creating dictionary
entries.
>db = {}
>gather = []
>for lines in rawData:
>    parentItem = lines[9]
>    for line in rawData:
>        if line[9] == parentItem:
>            gather.append(line)
>    db[parentItem]=gather
>    gather = []
Immediate observation and refinement: once a parentItem is found we don't
need to find and process it again, so after
parentItem = lines[9]
add
if parentItem not in db:
then continue with:
for line in rawData:
etc.
Also, you could use a list comprehension:
    db[parentItem] = [line for line in rawData if line[9] == parentItem]
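
The refinement above still rescans rawData once per distinct shared value. As a minimal sketch (not part of the original exchange), the same collation can be done in a single pass with dict.setdefault. The function name collate and the sample rows are made up for illustration, and the key column is passed as a parameter so the sample can stay short; the real data would use 9.

```python
def collate(rows, key_col):
    """Group rows by the value found in column key_col, in one pass."""
    db = {}
    for row in rows:
        # setdefault returns the list already stored for this key, or
        # stores and returns a new empty list the first time it is seen.
        db.setdefault(row[key_col], []).append(row)
    return db

# Hypothetical sample rows: [serial, type, shared value]
sample_rows = [
    ["S1", "PRT", "A"],
    ["S2", "DPLX", "A"],
    ["S3", "LAP", "B"],
]
db = collate(sample_rows, 2)
```

Each row is visited once, so the cost grows with the number of rows rather than with rows times distinct shared values.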
># Now we take the data that have been sorted by parentItem and further sort them by
># what type of item they are. For example, if the line has both a printer and a duplex unit
># therein, the printer and duplex are sorted out and given an entry of their own. This
># enables the items to be uploaded into DAM with no issues.
>
>cookedData = {} # <-- new dictionary for the second sort.
>for each in db.keys():
>    sortdb = {} # <-- new dictionary for the item sort
>    for item in db[each]:
>        sortdb[item[12]] = item
>    # filter out the Printer/Duplex combinations
>    if sortdb.has_key('DPLX') and sortdb.has_key('PRT'):
>        print '%s printer // duplexer match'%each
>        filtered=[sortdb['PRT'], sortdb['DPLX']]
>        signify = sortdb['PRT']
>        signify = signify[8]
>        cookedData[signify]=filtered
>        del sortdb['PRT']
>        del sortdb['DPLX']
>    # and the Laptop/Keyboard combinations
>    elif sortdb.has_key('KBD') and sortdb.has_key('LAP'):
>        print '%s laptop // keyboard match'%each
>        filtered=[sortdb['LAP'], sortdb['KBD']]
>        signify = sortdb['LAP']
>        signify = signify[8]
>        cookedData[signify]=filtered
>        del sortdb['LAP']
>        del sortdb['KBD']
>    # now sort out the leftover items (usually Cpu/Monitor combinations)
>    else:
>        old_potato = [] # <--A type of leftover (I crack me up.)
>        for leftover in sortdb.keys():
>            old_potato.append(sortdb[leftover])
>        # and finally add the leftovers to the cookedData.
>        cookedData[item[8]]=old_potato
>
># Now we place the various data into a single long string suitable for DAM to ingest
>for item in cookedData.keys():
> print item, cookedData[item]
>
>--------------------snip-----------------------
>
>Any suggestions for cleanup or concision are welcomed!
An idea (untested). It assumes there will be a pair of records for each shared
value. If there could be fewer or more, some modifications are needed.
# copy the major and minor sort items to the front of each list.
sortableData = map((lambda x:list((x[9],x[12]))+x), rawData)
# do the desired major/minor sort; all items of one shared value will now be
# together and the types within each shared value will be in order.
sortableData.sort()
types = {'DPLX': ('PRT', '%s printer // duplexer match', 0),
         'KBD': ('LAP', '%s laptop // keyboard match', 0)}  # etc.
# key is the alphabetically earlier of the types
# 1st element of each tuple is the alphabetically later of the types
# 2nd element of each tuple is the message to print
# 3rd element of each tuple is the significantOffset. If 'DPLX' were the
# signifying item instead of 'PRT' then this offset would be -1
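
To make the decorate-and-sort step concrete, here is a small sketch on made-up rows. To keep the sample short it treats column 2 as the shared value and column 1 as the type (the real data uses 9 and 12), and the name decorate_and_sort exists only for this illustration.

```python
def decorate_and_sort(rows, major_col, minor_col):
    """Prefix each row with its (major, minor) sort keys, then sort."""
    decorated = [[row[major_col], row[minor_col]] + row for row in rows]
    # Lists compare element by element, so the two prefixed keys
    # decide the order before any of the original columns are looked at.
    decorated.sort()
    return decorated

# Hypothetical sample rows: [serial, type, shared value]
sample_rows = [
    ["S3", "PRT", "A"],
    ["S1", "LAP", "B"],
    ["S2", "DPLX", "A"],
    ["S4", "KBD", "B"],
]
sorted_rows = decorate_and_sort(sample_rows, 2, 1)
for row in sorted_rows:
    print(row)
```

Within each shared value the alphabetically earlier type ends up first, which is why the types table can be keyed on 'DPLX' and 'KBD'. Note that the original columns shift right by two after decoration, so column 8 of a raw row is column 10 of a decorated one.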
cookedData = {}  # results, keyed by the signifying item's column 10
sharevalue = None
index = 0
old_potato = []
# a while loop instead of a for loop, so we can access more than one item
while index < len(sortableData):
    item = sortableData[index]
    if item[0] != sharevalue: # start processing first or next shared value
        sharevalue = item[0]
        if old_potato: # left over from previous shared value set
            cookedData[sortableData[index-1][10]] = old_potato
        if item[1] in types:
            expect, msg, significantOffset = types[item[1]]
            old_potato = []
        else: # must be a leftover
            old_potato = [item]
    else: # continue with next item of current shared value
        if old_potato: # add next leftover
            old_potato.append(item)
        else:
            if item[1] == expect: # we have a pair
                print msg%item[0]
                # keep in mind that the shared value and type appear at
                # the head of the list
                filtered = sortableData[index-1:index+1]
                signify = sortableData[index + significantOffset]
                signify = signify[10]
                cookedData[signify] = filtered
            else: # deal with unmatched pair
                pass
    index += 1
if old_potato: # left over from last shared value set
    cookedData[sortableData[index-1][10]] = old_potato
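
As an alternative sketch of the same pairing idea (this is not the loop above, and it has not been run against the real data), itertools.groupby can hand over each run of rows that share a value, which removes the manual index bookkeeping. The names PAIRS and pair_up are illustrative, the column positions are parameters so the sample rows can stay short, and unmatched groups are filed whole under the last row's id, loosely following the original cookedData[item[8]] line.

```python
from itertools import groupby

# Hypothetical pair table: (signifying type, companion type)
PAIRS = [("PRT", "DPLX"), ("LAP", "KBD")]

def pair_up(rows, share_col, type_col, id_col):
    """Collate rows by shared value, then split out the known pairs."""
    cooked = {}
    # groupby only merges adjacent rows, so sort by shared value first.
    ordered = sorted(rows, key=lambda r: (r[share_col], r[type_col]))
    for _, group in groupby(ordered, key=lambda r: r[share_col]):
        group = list(group)
        by_type = {}
        for row in group:
            by_type[row[type_col]] = row
        for main, companion in PAIRS:
            if main in by_type and companion in by_type:
                # Known pair: store it under the signifying row's id.
                cooked[by_type[main][id_col]] = [by_type[main], by_type[companion]]
                break
        else:
            # No known pair in this group: keep every row as a leftover.
            cooked[group[-1][id_col]] = group
    return cooked

# Hypothetical sample rows: [id, type, shared value]
rows = [
    ["id1", "PRT", "A"], ["id2", "DPLX", "A"],
    ["id3", "LAP", "B"], ["id4", "KBD", "B"],
    ["id5", "CPU", "C"], ["id6", "MON", "C"],
]
cooked = pair_up(rows, 2, 1, 0)
```

As in the original code, any extra rows in a group that also contains a matched pair are not stored; handling them would need an explicit decision about where they belong.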
Bob Gailer
PLEASE NOTE NEW EMAIL ADDRESS bgailer@alum.rpi.edu
303 442 2625