generic text read function
Michael Spencer
mahs at telcopartners.com
Fri Mar 18 01:11:34 EST 2005
John Hunter wrote:
>>>>>>"les" == les ander <les_ander at yahoo.com> writes:
>
>
> les> Hi, matlab has a useful function called "textread" which I am
> les> trying to reproduce in python.
>
> les> two inputs: filename, format (%s for string, %d for integers,
> les> etc and arbitrary delimiters)
>
Building on John's solution, this is still not quite what you're looking for (the
delimiter preference is set for the whole line as a separate argument), but it's
one step closer, and may give you some ideas:
import re

# Map each format specifier to the callable that converts a field
dispatcher = {'%s' : str,
              '%d' : int,
              '%f' : float,
              }
parser = re.compile("|".join(dispatcher))

def textread(iterable, formats, delimiter = None):
    # Splits on any combination of one or more chars in delimiter
    # or whitespace by default
    splitter = re.compile("[%s]+" % (delimiter or r"\s"))

    # Parse the format string into a list of converters
    # Note that white space in the format string is ignored
    # unlike the spec which calls for significant delimiters
    try:
        converters = [dispatcher[format] for format in parser.findall(formats)]
    except KeyError as err:
        raise KeyError("Unrecognized format: %s" % err)
    format_length = len(converters)

    # Use any line-based iterable - like file
    iterator = iter(iterable)
    for line in iterator:
        cols = re.split(splitter, line)
        if len(cols) != format_length:
            raise ValueError("Illegal line: %s" % cols)
        yield [func(val) for func, val in zip(converters, cols)]
# Example Usage:
source1 = """Item 5 8.0
Item2 6 9.0"""
source2 = """Item 1 \t42
Item 2\t43"""
>>> for i in textread(source1.splitlines(),"%s %d %f"): print i
...
['Item', 5, 8.0]
['Item2', 6, 9.0]
>>> for i in textread(source2.splitlines(),"%s %f", "\t"): print i
...
['Item 1 ', 42.0]
['Item 2', 43.0]
>>> for item, value in textread(source2.splitlines(),"%s %f", "\t"): print item, value
...
Item 1  42.0
Item 2 43.0
Item 2 43.0
>>>
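
For anyone trying this on Python 3, where the print statement used in the session above no longer parses, here is a minimal sketch of the same idea. It is not a drop-in replacement, and the names textread3, DISPATCHER and SPEC are made up for this example. It makes two assumptions that go beyond the function above: delimiter characters are taken literally (so they are passed through re.escape), and the line terminator is stripped before splitting, because a trailing "\n" from a real file would otherwise show up as an extra empty column.

```python
import io
import re

# Same specifier-to-converter table as in the post (names are hypothetical).
DISPATCHER = {"%s": str, "%d": int, "%f": float}
SPEC = re.compile("|".join(DISPATCHER))


def textread3(lines, formats, delimiter=None):
    """Yield one list of converted fields per input line."""
    # Assumption: delimiter characters are meant literally, so escape them
    # before placing them inside the character class.
    chars = re.escape(delimiter) if delimiter else r"\s"
    splitter = re.compile("[%s]+" % chars)
    converters = [DISPATCHER[spec] for spec in SPEC.findall(formats)]
    for line in lines:
        # Assumption: drop the line terminator so that lines read from a
        # file do not end in an extra empty column.
        cols = splitter.split(line.rstrip("\r\n"))
        if len(cols) != len(converters):
            raise ValueError("Illegal line: %r" % cols)
        yield [func(val) for func, val in zip(converters, cols)]


# A file-like object stands in for an open text file here.
data = io.StringIO("Item 5 8.0\nItem2 6 9.0\n")
for row in textread3(data, "%s %d %f"):
    print(row)
```

Leading or trailing delimiters on a line still produce empty columns and raise ValueError, so input that is padded with whitespace would need extra stripping.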