Splitting a list of strings

Sean Ross sross at connectmail.carleton.ca
Tue Sep 17 19:59:54 EDT 2002


Hi.
I need to read a list of strings from a file, weed out the comments and
separate "attribute" descriptions from the data. I have a working
implementation, but I'm interested in finding out if there's a better way to
do it.

Here's the code:

def getDataAndAttributes(filename):
    "Filters *.arff file into lists of data and attribute strings"
    lines = open(filename).readlines()

    # All lines beginning with "@" are attributes, except the last one.
    attributes = [str for str in lines if str[0] == "@"]
    # Remove attributes from lines, so we don't process them again to find
    # data.
    lines = [str for str in lines if not str in attributes]
    attributes = attributes[:-1]

    # Filter comments out to get data.
    data = [str for str in lines if not str[0]=="%" ]

    return (data, attributes)

What I'm interested in knowing is, can I extract all of the "attribute"
strings from lines in such a way that I get my list of "attribute" strings,
but all of those strings have been removed from lines as well, all at the
same time? i.e., without filtering the list a second time. As I've stated,
the code above works, but I'm concerned that it may be sub-optimal.
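
To make the question concrete, the single-pass behaviour I have in mind
might look roughly like the sketch below. This is only an illustration, not
a tested replacement: the name split_lines is made up, and it takes the
already-read list of lines rather than a filename. It assumes, as in the
code above, that the last "@" line is not an attribute.

```python
def split_lines(lines):
    """Partition lines into (data, attributes) in a single pass.

    Hypothetical helper: 'lines' is any iterable of strings, e.g. the
    result of readlines().
    """
    data, attributes = [], []
    for line in lines:
        if line.startswith("@"):
            # Candidate attribute line.
            attributes.append(line)
        elif not line.startswith("%"):
            # Neither an attribute nor a comment, so treat it as data.
            data.append(line)
    # Assumption carried over from the original code: the last "@" line
    # is not an attribute, so drop it.
    return data, attributes[:-1]
```

Each line is examined once and goes into exactly one list (or is dropped as
a comment), so there is no second filtering pass and no "in attributes"
membership test.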

Thanks, in advance, for your suggestions,
Sean Ross




