Considering migrating to Python from Visual Basic 6 for engineering applications
Peter Otten
__peter__ at web.de
Fri Feb 19 09:58:13 EST 2016
Tim Chase wrote:
> On 2016-02-19 02:47, wrong.address.1 at gmail.com wrote:
>> 2 12.657823 0.1823467E-04 114 0
>> 3 4 5 9 11
>> "Lower"
>> 278.15
>>
>> Is it straightforward to read this, or does one have to read one
>> character at a time and then figure out what the numbers are? --
>
> It's easy to read. What you do with that mess of data is the complex
> part. They come in as byte-strings, but you'd have to convert them
> to the corresponding formats:
>
>     from shlex import shlex
>     USE_LEX = True # False
>     with open('data.txt') as f:
>         for i, line in enumerate(f, 1):
>             if USE_LEX:
>                 bits = shlex(line)
>             else:
>                 bits = line.split()
>             for j, bit in enumerate(bits, 1):
>                 if bit.isdigit():
>                     result = int(bit)
>                     t = "an int"
>                 elif '"' in bit:
>                     result = bit
>                     t = "a string"
>                 else:
>                     result = float(bit)
>                     t = "a float"
>                 print("On line %i I think that item %i %r is %s: %r" % (
>                     i,
>                     j,
>                     bit,
>                     t,
>                     result,
>                     ))
>
> The USE_LEX controls whether the example code uses string-splitting
> on white-space, or uses the built-in "shlex" module to parse for
> quoted strings that might contain a space. The naive way of
> string-splitting will be faster, but choke on string-data containing
> spaces.
>
> You'd have to make up your own heuristics for determining what type
> each data "bit" is, parsing it out (with int(), float() or whatever),
> but the above gives you some rough ideas with at least one known
> bug/edge-case.
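
If you do want to guess, one common approach is to simply try the conversions in order rather than test with isdigit(), which rejects a leading minus sign. A rough sketch, not a definitive implementation: guess_value() and guess_line() are names made up for this example, and the second input line is invented to show a negative number and a quoted string containing a space.

```python
import shlex


def guess_value(token):
    """Return token as an int if possible, else a float, else the str itself."""
    for convert in (int, float):
        try:
            return convert(token)
        except ValueError:
            pass
    return token


def guess_line(line):
    # shlex.split() honours double quotes, so "two words" stays one token.
    return [guess_value(token) for token in shlex.split(line)]


print(guess_line('2 12.657823 0.1823467E-04 114 0'))
# [2, 12.657823, 1.823467e-05, 114, 0]
print(guess_line('-3 "Lower limit" 278.15'))
# [-3, 'Lower limit', 278.15]
```

The guess still has blind spots: shlex.split() drops the quotes, so a quoted "42" comes back as the int 42, and float() happily accepts tokens like "nan" and "inf".
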
Or just tell the parser what to expect:
$ cat read_data_shlex2.py
import shlex

CONVERTERS = {
    "i": int,
    "f": float,
    "s": str
}


def parse_line(types, line=None, file=None):
    if line is None:
        line = file.readline()
    values = shlex.split(line)
    if len(values) != len(types):
        raise ValueError("Too few/many values %r <-- %r" % (types, values))
    return tuple(CONVERTERS[t](v) for t, v in zip(types, values))


with open("data.txt") as f:
    print(parse_line("iffii", file=f))
    print(parse_line("iiiii", file=f))
    print(parse_line("s", file=f))
    print(parse_line("fsi", file=f))
    print(parse_line("ff", file=f))
$ cat data.txt
2 12.657823 0.1823467E-04 114 0
3 4 5 9 11
"Lower"
1.2 "foo \" bar \\ baz" 42
278.15
$ python3 read_data_shlex2.py
(2, 12.657823, 1.823467e-05, 114, 0)
(3, 4, 5, 9, 11)
('Lower',)
(1.2, 'foo " bar \\ baz', 42)
Traceback (most recent call last):
  File "read_data_shlex2.py", line 24, in <module>
    print(parse_line("ff", file=f))
  File "read_data_shlex2.py", line 15, in parse_line
    raise ValueError("Too few/many values %r <-- %r" % (types, values))
ValueError: Too few/many values 'ff' <-- ['278.15']
$
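
The traceback is the length check doing its job: the last line holds one value while "ff" asks for two. If the layout of the file is known up front, the per-line specs can live in a list, and the error can name the offending line. A sketch under those assumptions: parse_records() is a hypothetical helper, the spec list is my reading of the four-line sample from the original question, and io.StringIO stands in for the real file.

```python
import io
import shlex

CONVERTERS = {"i": int, "f": float, "s": str}


def parse_records(lines, specs):
    """Yield one tuple per line, converted according to the matching spec."""
    for lineno, (types, line) in enumerate(zip(specs, lines), 1):
        values = shlex.split(line)
        if len(values) != len(types):
            raise ValueError(
                "line %d: expected %d values (%r), got %r"
                % (lineno, len(types), types, values)
            )
        yield tuple(CONVERTERS[t](v) for t, v in zip(types, values))


# io.StringIO stands in for open("data.txt") so the sketch runs on its own.
data = io.StringIO(
    '2 12.657823 0.1823467E-04 114 0\n'
    '3 4 5 9 11\n'
    '"Lower"\n'
    '278.15\n'
)
for record in parse_records(data, ["iffii", "iiiii", "s", "f"]):
    print(record)
```

Note that zip() stops at the shorter of the two inputs, so a file with fewer lines than specs ends quietly instead of raising; add an explicit check if that matters.
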
But we can't do *all* the work for you ;-)
If this thread goes long enough eventually we will ;)