removing the ' from list elements
Steve Holden
sholden at bellatlantic.net
Wed Mar 8 09:25:51 EST 2000
doyen at mediaone.net wrote:
>
> I think this is a newbie oops situation, but I have read the tutorial,
> library reference and language reference without any Eureka!'s. Maybe
> it's too many years of basic.
>
Yes, that CAN warp an otherwise good mind :-)
> I've written a program that reads ftp LIST output, filters it and puts
> it in a list for sorting. I'm taking each line from the file listing I
> retrieve, parsing it into a list of year, month, day, time, size, name
> and source by using string.split. I then use string.join on each 'mini
> list' into a formatted line and place it in a 'master' list to use
> list.sort() and list.reverse() for sorting by date last modified.
>
Seems you might be better-advised to just retain the parsed-line list,
and keep those in a list-of-lists. You can sort this just the same.
> When I retrieve (or examine) the elements within the list, I use
> string.split again to parse them, then use string.replace('[', '') for
> comparisons, or for output.
By retaining the original lists you can access the first element as
myList[0], the second as myList[1], and so on. Since each subscript
holds a particular piece of information you could even set variables
to constant values, allowing you to refer to myList[FILENAME],
myList[DATE], and so on.
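A minimal sketch of that idea, written for current Python 3 rather than the 1.5-era string module used in this thread. The field order, the constant names (YEAR, MONTH and so on) and the sample records are assumptions made for illustration, not taken from the original program.

```python
# Hypothetical field positions; the names and their order are illustrative only.
YEAR, MONTH, DAY, TIME, SIZE, FILENAME = range(6)

# Each record stays a real list, so no brackets or quotes ever need stripping.
records = [
    ["2000", "01", "15", "00:00", 1024, "readme.txt"],
    ["2000", "03", "08", "09:25", 2048, "setup.py"],
    ["1999", "12", "31", "00:00", 512, "notes.txt"],
]

# Lists compare element by element, so this orders by year, month, day, time.
records.sort()
records.reverse()  # newest first

newest = records[0]
print(newest[FILENAME], newest[YEAR], newest[MONTH], newest[DAY])
```

Because the date fields come first and are zero-filled strings, the plain sort already gives date order without any formatting step.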
> I've got a routine that I use for removing them, but I'd rather not
> add them in the first place. A single space delimiter would work.
>
But, as I hope I've convinced you, retaining the list format would be
MUCH simpler.
> Is there a format that I can create my 'columns' without using the list
> format ["", "", ""] with these characters that I have to keep removing?
>
You are actually working with the string representations of lists rather
than the lists themselves, which is bound to create more work (as you have
observed).
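To see where the extra characters come from, here is a small sketch in current Python 3. The sample field values are made up for illustration.

```python
fields = ["2000", "03", "08", "setup.py"]

# str() of a list produces its printable representation, and the brackets,
# quotes and commas are part of that string.
as_text = str(fields)
print(as_text)

# The list itself holds clean values; nothing needs removing.
print(fields[3])
```

The quotes only appear once the list has been converted to a string, so keeping the list avoids them entirely.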
> btw: I'm not complaining, I'm loving Python, it runs fast, quick to
> code, easy to read, I just feel bad about adding then removing the extra
> characters.
>
Yep, Python is great :-)
> Other python newbie questions are....
> Should I always return something from a function (note the return 1's)
A function with no return statement automatically returns None. If
you really want a procedure this is fine, since Python will throw away
the unwanted result.
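A short sketch of that behaviour in current Python 3. The function name log_line and its arguments are hypothetical, chosen only for this example.

```python
def log_line(text, sink):
    """A 'procedure': it works by side effect and has no return statement."""
    sink.append(text)

lines = []
result = log_line("hello", lines)

print(result is None)  # the implicit return value is None
print(lines)
```

Calling log_line("hello", lines) on a line by itself simply discards that None, so there is no need to add return 1 just to have something to return.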
> I use string.find on the whole string first, to see if I need to do a
> string.split, thinking it would be more efficient if most of the strings
> would not return a match.
Don't know. But regular expressions, while initially intimidating, are the
fastest way to see whether a string matches a pattern. They also have the
advantage of making it quite easy to extract particular pieces from the
matched string.
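As a rough sketch of what that could look like in current Python 3: the pattern below assumes a Unix-style "ls -l" line, which is what many FTP servers return for LIST, but servers differ and the real listing may need a different pattern. The names LIST_LINE and parse_list_line and the sample line are hypothetical.

```python
import re

# Assumed layout: a Unix-style "ls -l" line as many FTP servers return for LIST.
LIST_LINE = re.compile(
    r"^\S+\s+\d+\s+\S+\s+\S+\s+"          # permissions, links, owner, group
    r"(?P<size>\d+)\s+"                   # size in bytes
    r"(?P<month>[A-Z][a-z]{2})\s+"        # three-letter month name
    r"(?P<day>\d{1,2})\s+"                # day of month
    r"(?P<when>\d{4}|\d{1,2}:\d{2})\s+"   # a year, or a time for recent files
    r"(?P<name>.+)$"                      # file name (may contain spaces)
)

def parse_list_line(line):
    """Return a dict of captured fields, or None when the line does not match."""
    match = LIST_LINE.match(line)
    if match is None:
        return None
    return match.groupdict()

sample = "-rw-r--r--   1 owner group     1024 Jan 15 09:25 readme.txt"
fields = parse_list_line(sample)
print(fields["month"], fields["day"], fields["when"], fields["name"])
print(parse_list_line("total 42"))  # not a file entry, so no match
```

One match both tests the line and pulls out the pieces, which replaces the separate find and split steps.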
> Am I doing my filter modification routine correctly (see below)
> list = filter('i', 'Jan', list) # only leaves items with Jan
> or
> list = filter('r', 'Jan', list) # to remove any item with Jan in it.
>
> Here's some of my code snippets, suggestions welcome
>
Some random suggestions below, but regular expressions are really what you
need...
> def cleanline(inline):
>     outline = string.replace(inline, "'", "")
>     outline = string.replace(outline, '"', '')
>     outline = string.replace(outline, ",", "")
>     outline = string.replace(outline, "[", "")
>     outline = string.replace(outline, "]", "")
>     return outline
>
I suspect you might be interested in this:
>>> import string
>>> a = ["this", 'is', 'a', 'list']
>>> string.join(a)
'this is a list'
>>> string.join(a,"")
'thisisalist'
>>>
Simpler and quicker, wouldn't you agree?
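The same join in current Python 3 is a method on the separator string. One detail worth a sketch: join only accepts strings, so a record holding a numeric size needs each field converted first. The sample record is an assumption for illustration.

```python
record = ["2000", "03", "08", "09:25", 2048, "setup.py"]

# Convert every field to a string on the way out; the record itself is unchanged.
line = " ".join(str(field) for field in record)
print(line)
```

This builds the display line only at output time, so the underlying list stays available for sorting and comparison.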
> def getdatelist(ftpline):
>     yr = m = dy = tm = 'z' # set to error if not converted
yr = m = dy = tm = None # Would seem more Pythonic
>     tt = string.split(ftpline) # returns list of year, month, day and
>                                # time all numeric for sorting
>     yr = str(tt[7])
>     tp = string.find(yr, ':') # current year, ftp shows time
>     if tp > -1:
>         tm = yr
>         yr = 2000
>     else:
>         tm = '00:00' # use a dummy time of 00:00
>     dy = string.zfill(tt[6], 2) # zero fill date for sorting
>     mo = tt[5] # convert alpha month to number for sorting
mths = {'Jan': '01', 'Feb': '02', ... , 'Dec': '12'} # As initialisation, once
...
>     if mo == 'Jan':
>         m = '01'
>     elif mo == 'Feb':
>         m = '02'
>     elif mo == 'Mar':
>         m = '03'
>     elif mo == 'Apr':
>         m = '04'
>     elif mo == 'May':
>         m = '05'
>     elif mo == 'Jun':
>         m = '06'
>     elif mo == 'Jul':
>         m = '07'
>     elif mo == 'Aug':
>         m = '08'
>     elif mo == 'Sep':
>         m = '09'
>     elif mo == 'Oct':
>         m = '10'
>     elif mo == 'Nov':
>         m = '11'
>     elif mo == 'Dec':
>         m = '12'
>     else:
>         m = '99'
try:
    m = mths[mo]
except KeyError:
    m = None
or, alternatively, you could test using mths.has_key(mo) in an if/else.
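A sketch of the table lookup in current Python 3, where has_key no longer exists and the usual spellings are the in operator or the get method. The names MONTHS and month_number are hypothetical; the month table itself mirrors the if/elif chain quoted above.

```python
# Built once, at module level, rather than on every call.
MONTHS = {
    "Jan": "01", "Feb": "02", "Mar": "03", "Apr": "04",
    "May": "05", "Jun": "06", "Jul": "07", "Aug": "08",
    "Sep": "09", "Oct": "10", "Nov": "11", "Dec": "12",
}

def month_number(name):
    """Return the two-digit month string, or None for an unknown name."""
    return MONTHS.get(name)  # get() returns None when the key is missing

print(month_number("Mar"))
print(month_number("Foo"))
```

Using get keeps the "unknown month" case in one place, much as the try/except does.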
>     outlst = [yr, m, dy, tm]
>     return outlst
>
> def addtoclist(thisline, pgmname, ppath):
>     global clist
>     pathstr = ' (' + ppath + ')'
>     tl = string.split(thisline)
>     q = string.find(thisline, pgmname)
>     if q > -1: # Check for match in entire line
>         tln = str(tl[-1])
>         q = string.find(tln, pgmname)
>         if q > -1: # make sure match was from Name portion of line
>             dl = getdatelist(thisline)
>             sz = string.rjust(str(tl[4]), 7) # the size
>             dl.append(sz) # append the size to aid in comparison of files
>             dl.append(tln) # append name onto list
>             dl.append(pathstr)
>             clist.append(str(dl)) # add to tlist for sorting
>     return 1 # default always return true
>
> def filter(ftype, txt, chklist): # after the initial display, this
>     # allows lines to be removed by supplying a string that is either
>     # included, or to be removed.
>     newlist = [] # I want to modify the supplied chklist,
>     for l in chklist:
>         x = string.find(l, txt)
>         if ftype == 'i':
>             if x >= 0:
>                 newlist.append(l)
>         else:
>             if x < 0:
>                 newlist.append(l)
>     list = newlist[:]
>     return list
>
> def showfound(curlist, title, count): # count is a maximum number of
>     # items, if -1 display the entire list
>     tc = len(curlist) # count of items in list
>     if tc == 0:
>         print title
>         print 'No records found '
>         return
>     if count == -1: # a -1 means print the whole list
>         count = tc
>     if count > tc: # if there are fewer items in list, reset count to
>         count = tc # avoid a subscript error
>     i = 1
>     print title
>     while i: # lists are relative 0 subscripts so we subtract 1
>         o = cleanline(curlist[i-1]) # remove list delimiters
>         i2 = string.rjust(str(i), 3) # line number
>         print i2, o
>         i = i + 1
>         if i > count:
>             break
>         if i % 20 == 0:
>             c = raw_input("Press return to continue ") # pause every
>                                                        # 20 items to view display
>             if c == 'q' or c == 'Q': # q quits
>                 menu
>                 break
>     print title
>     return 1
>
> doyen at mediaone.net
And so on. Overall this program gives the impression of having grown by
accretion -- it started out fairly simple, and then lots of bits got stuck
on, each with slightly conflicting goals. Your command of low-level Python
is clearly quite good, but it may be better to treat the program so far as
a prototype.
Now you know what your real requirements are, re-design the data structures
using more appropriate representations and you may well find you get a
cleaner program. Please note that none of this is intended to discourage:
keep it up!
regards
Steve
--
"If computing ever stops being fun, I'll stop doing it"