Simple code and suggestion
Jussi Piitulainen
jussi.piitulainen at helsinki.fi
Wed Nov 30 09:01:38 EST 2016
g thakuri writes:
> I would want to avoid using multiple split in the below code , what
> options do we have before tokenising the line?, may be validate the
> first line any other ideas
>
> cmd = 'utility %s' % (file)
> out, err, exitcode = command_runner(cmd)
> data = stdout.strip().split('\n')[0].split()[5][:-2]
That .strip() looks suspicious to me, but perhaps you know better.
Also, stdout should be out, right?
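One possible reason to be wary of the .strip(): it removes leading blank lines as well as trailing whitespace, so it can change which line counts as the first. A small sketch with made-up output (the real format of the utility is not known here):

```python
# Hypothetical output that happens to start with an empty line.
out = "\nheader line\nf0 f1 f2 f3 f4 value42 f6\n"

# Without strip, the first line is the empty one.
first_without_strip = out.split('\n')[0]

# With strip, the leading newline is gone and the next line becomes "first".
first_with_strip = out.strip().split('\n')[0]

print(repr(first_without_strip))
print(repr(first_with_strip))
```

Whether that is a bug or the intended behaviour depends on what the utility actually prints.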
You can use io.StringIO to turn a string into an object that you can
read line by line just like a file object. This reads just the first
line and picks the part that you want:
import io
data = next(io.StringIO(out)).split()[5][:-2]
I don't know how much this affects performance, but it's kind of neat.
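Here is a self-contained sketch of that idea, assuming some invented output in place of what command_runner returns. The sample text and the helper name first_line_field are only for illustration.

```python
import io

# Made-up output; only the first line matters.
out = "a b c d e 1234ms g\nsecond line is never read\n"

def first_line_field(text):
    """Return the sixth whitespace-separated field of the first line,
    without its last two characters."""
    # next() reads one line from the file-like object and raises
    # StopIteration if the text is empty.
    line = next(io.StringIO(text))
    # split() with no arguments also discards the trailing newline.
    return line.split()[5][:-2]

data = first_line_field(out)
print(data)
```

Note that an empty string makes next() raise StopIteration, and a first line with fewer than six fields raises IndexError, so bad input still fails loudly.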
A thing I like to do is name all the fields even if I don't use them all. The
assignment will fail with an exception if there's an unexpected number
of fields, and that's usually what I want when input is bad:
line = next(io.StringIO(out))
ID, FORM, LEMMA, POS, TAGS, WEV, ETC = line.split()
data = WEV[:-2]
(Those are probably not appropriate names for your fields :)
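A sketch of how the failing assignment might be turned into a more readable error. The count of seven fields and the placeholder names are assumptions taken from the example above, not from the real utility, and extract is a hypothetical helper.

```python
import io

def extract(out):
    """Return the sixth field of the first line, minus its last two characters.

    Raises ValueError if the first line does not have exactly seven fields.
    """
    line = next(io.StringIO(out))
    fields = line.split()
    try:
        # Unpacking fails with ValueError on too few or too many fields.
        f0, f1, f2, f3, f4, wanted, f6 = fields
    except ValueError:
        raise ValueError('expected 7 fields, got %d: %r' % (len(fields), line))
    return wanted[:-2]

print(extract("a b c d e 1234ms g\n"))
```

The plain unpacking already raises ValueError on its own; the wrapper only adds the offending line to the message.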
Just a couple of ideas that you may like to consider.