removing the ' from list elements

Steve Holden sholden at bellatlantic.net
Wed Mar 8 09:25:51 EST 2000


doyen at mediaone.net wrote:
> 
> I think this is a newbie oops situation,  but I have read the tutorial,
> library reference and language reference without any Eureka!'s. Maybe
> it's too many years of basic.
> 
Yes, that CAN warp an otherwise good mind :-)

> I've written a program that reads ftp LIST output, filters it and puts
> it in a list for sorting. I'm taking each line from the file listing I
> retrieve, parsing it into a list of year, month, day, time, size, name
> and source by using string.split. I then use string.join on each 'mini
> list'  into a formatted line and place it in a 'master' list to use
> list.sort() and list.reverse for sorting by date last modified.
> 
Seems you might be better-advised to just retain the parsed-line list,
and keep those in a list-of-lists.  You can sort this just the same.

> When I retrieve (or examine) the elements within the list, I use
> string.split again to parse them, then use string.replace('[', '') for
> comparisons, or for output.

By retaining the original lists you can access the first element as
myList[0], the second as myList[1], and so on.  Since each subscript
holds a particular piece of information you could even set variables
to constant values, allowing you to refer to myList[FILENAME],
myList[DATE], and so on.
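
A minimal sketch of that idea, written in current Python 3 syntax (an
assumption on my part). The sample records and the index names YEAR,
MONTH and so on are made up for illustration:

```python
# Index constants: hypothetical names for the fields of each parsed record.
YEAR, MONTH, DAY, TIME, SIZE, FILENAME = range(6)

# Each record stays a list; the master list is a list of lists.
records = [
    ["2000", "03", "08", "09:25", "1024", "notes.txt"],
    ["1999", "12", "31", "00:00", "2048", "old.dat"],
    ["2000", "01", "15", "14:02", "512", "prog.py"],
]

# Lists compare element by element, so a plain sort orders by
# year, then month, then day, then time.
records.sort()
records.reverse()  # newest first

newest = records[0]
print(newest[FILENAME], newest[YEAR], newest[MONTH])
```

Because the fields are zero-filled strings of equal width, comparing them
as text gives the same order as comparing them as dates.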

> I've got a routine that I use for removing them, but I'd rather not
> add them in the first place. A single space delimiter would work.
> 

But, as I hope I've convinced you, retaining the list format would be
MUCH simpler.

> Is there a format that I can create my 'columns' without using the list
> format  ["", "", ""] with these characters that I have to keep removing?
> 

You are actually working with the string representations of lists rather
than the lists themselves, which is bound to create more work (as you have
observed).
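
To make the difference concrete, here is a small sketch in current
Python 3 syntax (an assumption; the record contents are invented). It
contrasts str() of a list, which is where the brackets, quotes and commas
come from, with joining the elements:

```python
record = ["2000", "03", "08", "09:25", "notes.txt"]

as_repr = str(record)        # the list's printed form, brackets and quotes included
as_line = " ".join(record)   # just the elements, separated by single spaces

print(as_repr)
print(as_line)
```

Only the first form needs cleaning up afterwards; the second is already a
single-space-delimited line.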

> btw: I'm not complaining, I'm loving Python, it runs fast, quick to
> code, easy to read, I just feel bad about adding then removing the extra
> characters.
> 
Yep, Python is great :-)

> Other python newbie questions are....
> Should I always return something from a function (note the return 1's)

A function with no return statement automatically returns None.  If
you really want a procedure this is fine, since Python will throw away
the unwanted result.
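
A quick way to see this for yourself (Python 3 syntax assumed; the
function name show is just an example):

```python
def show(title):
    # No return statement: the call still evaluates to None.
    print(title)

result = show("listing")
print(result is None)
```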

> I use string.find on the whole string first, to see if I need to do a
> string.split, thinking it would be more efficient if most of the strings
> would not return a match.

Don't know.  But regular expressions, while initially intimidating, are the
fastest way to see whether a string matches a pattern.  They also have the
advantage of making it quite easy to extract particular pieces from the
matched string.
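
As a rough sketch of what I mean, using the re module in current Python 3
syntax (an assumption). The pattern and the sample line are invented for
illustration and will not cover every ftp server's LIST format:

```python
import re

# Hypothetical pattern: month name, day, then either a year or an HH:MM
# time, followed by the file name at the end of the line.
LIST_TAIL = re.compile(
    r"(?P<month>[A-Z][a-z]{2})\s+(?P<day>\d{1,2})\s+"
    r"(?P<when>\d{4}|\d{1,2}:\d{2})\s+(?P<name>\S+)$"
)

line = "-rw-r--r--   1 ftp  ftp   1024 Mar  8 09:25 notes.txt"  # made-up sample
match = LIST_TAIL.search(line)
if match:
    fields = match.groupdict()  # one dictionary entry per named group
else:
    fields = None               # the line did not look like a file entry

print(fields)
```

One search both tests the line and pulls out the pieces, so there is no
separate find-then-split step.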

> Am I doing my filter modification routine correctly (see below)
> list = filter('i', 'Jan', list) # only leaves items with Jan
> or
> list = filter('r', 'Jan', list) # to remove any item with Jan in it.
> 
> Here's some of my code snippets, suggestions welcome
> 
Some random suggestions below, but regular expressions are really what you
need...

> def cleanline(inline):
>     outline = string.replace(inline, "'", "")
>     outline = string.replace(outline, '"', '')
>     outline = string.replace(outline, ",", "")
>     outline = string.replace(outline, "[", "")
>     outline = string.replace(outline, "]", "")
>     return outline
> 
I suspect you might be interested in this:

>>> import string
>>> a = ["this", 'is', 'a', 'list']
>>> string.join(a)
'this is a list'
>>> string.join(a,"")
'thisisalist'
>>> 

Simpler and quicker, wouldn't you agree?

> def getdatelist(ftpline):
>     yr = m = dy = tm = 'z'  # set to error if not converted

yr = m = dy = tm = None # Would seem more Pythonic

>     tt = string.split(ftpline)  # returns list of year, month, day and time all numeric for sorting
>     yr = str(tt[7])
>     tp = string.find(yr, ':')  # current year, ftp shows time
>     if tp > -1:
>             tm = yr
>             yr = 2000
>     else:
>             tm = '00:00'   # use a dummy time of 00:00
>     dy = string.zfill(tt[6], 2)  # zero fill date for sorting
>     mo = tt[5]    # convert alpha month to number for sorting

mths = {'Jan': '01', 'Feb': '02', ... , 'Dec': '12'} # As initialisation, once
...
>     if mo == 'Jan':
>         m = '01'
>     elif mo == 'Feb':
>         m = '02'
>     elif mo == 'Mar':
>         m = '03'
>     elif mo == 'Apr':
>         m = '04'
>     elif mo == 'May':
>         m = '05'
>     elif mo == 'Jun':
>         m = '06'
>     elif mo == 'Jul':
>         m = '07'
>     elif mo == 'Aug':
>         m = '08'
>     elif mo == 'Sep':
>         m = '09'
>     elif mo == 'Oct':
>         m = '10'
>     elif mo == 'Nov':
>         m = '11'
>     elif mo == 'Dec':
>         m = '12'
>     else:
>         m = '99'

try:
    m = mths[mo]
except KeyError:
    m = None

or, alternatively, you could test using mths.has_key(mo) in an if/else.
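
For completeness, a self-contained sketch of the table lookup in current
Python 3 syntax (an assumption). It keys the table by month name and uses
the dictionary's get method, which returns None for a missing key; the
helper name month_number is made up:

```python
names = "Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec".split()

# Build the table once: 'Jan' -> '01', ..., 'Dec' -> '12'.
mths = {name: "%02d" % (i + 1) for i, name in enumerate(names)}

def month_number(mo):
    # Hypothetical helper: zero-filled month number, or None if unknown.
    return mths.get(mo)

print(month_number("Mar"), month_number("Foo"))
```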

>     outlst = [yr, m, dy, tm]
>     return outlst
> 
> def addtoclist(thisline, pgmname, ppath):
>     global clist
>     pathstr = ' (' + ppath + ')'
>     tl = string.split(thisline)
>     q = string.find(thisline, pgmname)
>     if q > -1:   # Check for match in entire line
>      tln = str(tl[-1])
>      q = string.find(tln, pgmname)
>      if q > -1:   # make sure match was from Name portion of line
>         dl = getdatelist(thisline)
>         sz = string.rjust(str(tl[4]), 7) # the size
>         dl.append(sz)  # append the size to aid in comparison of files
>         dl.append(tln)  # append name onto list
>         dl.append(pathstr)
>         clist.append(str(dl)) # add to tlist for sorting
>     return 1  # default always return true
> 
> # After the initial display, this allows lines to be removed by supplying
> # a string that is either included, or to be removed.
> def filter(ftype, txt, chklist):
>     newlist = []    # I want to modify the supplied chklist,
>     for l in chklist:
>         x = string.find(l, txt)
>         if ftype == 'i':
>             if x >= 0:
>                 newlist.append(l)
>         else:
>             if x < 0:
>                 newlist.append(l)
>     list = newlist[:]
>     return list
> 
> def showfound(curlist, title, count):
>         # count is a maximum number of items, if -1 display the entire list
>         tc = len(curlist)  # count of items in list
>         if tc == 0:
>             print title
>             print 'No records found '
>             return
>         if count == -1:   # a -1 means print the whole list
>             count = tc
>         if count > tc:   # if there are fewer items in list, reset count
>             count = tc   # to avoid a subscript error
>         i = 1
>         print title
>         while i:   # lists are relative 0 subscripts so we subtract 1
>             o = cleanline(curlist[i-1])    # remove list delimeters
>             i2 = string.rjust(str(i), 3)    # line number
>             print i2, o
>             i = i + 1
>             if i > count:
>                 break
>             if i % 20 == 0:
>                 # pause every 20 items to view display
>                 c = raw_input("Press return to continue ")
>                 if c == 'q' or c == 'Q':    # q quits menu
>                     break
>                 print title
>         return 1
> 
> doyen at mediaone.net

And so on.  Overall this program gives the impression of having grown by
accretion -- it started out fairly simple, and then lots of bits got stuck
on, each with slightly conflicting goals.  Your command of low-level Python
is clearly quite good, but it may be better to treat the program so far as
a prototype.

Now you know what your real requirements are, re-design the data structures
using more appropriate representations and you may well find you get a
cleaner program.  Please note that none of this is intended to discourage:
keep it up!

regards
 Steve
--
"If computing ever stops being fun, I'll stop doing it"


