how to extract columns like awk $1 $5
Roy Smith
roy at panix.com
Sat Jan 8 00:19:08 EST 2005
Dan Valentine <nobody at invalid.domain> wrote:
> On Fri, 07 Jan 2005 12:15:48 -0500, Anand S Bisen wrote:
>
> > Is there a simple way to extract words separated by a space in python
> > the way i do it in awk '{print $4 $5}' . I am sure there should be some
> > but i dont know it.
>
> i guess it depends on how faithfully you want to reproduce awk's behavior
> and options.
>
> as several people have mentioned, strings have the split() method for
> simple tokenization, but blindly indexing into the resulting sequence
> can give you an out-of-range exception. out of range indexes are no
> problem for awk; it would just return an empty string without complaint.
It's pretty easy to create a list type which has awk-ish behavior:
class awkList (list):
    def __getitem__ (self, key):
        try:
            return list.__getitem__ (self, key)
        except IndexError:
            return ""

l = awkList ("foo bar baz".split())
print "l[0] = ", repr (l[0])
print "l[5] = ", repr (l[5])
-----------
Roy-Smiths-Computer:play$ ./awk.py
l[0] = 'foo'
l[5] = ''
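
To tie this back to the original question (awk '{print $4 $5}'), here is a rough sketch of the same idea in Python 3 syntax. The names AwkList and fields are made up for this illustration and are not from any library. Field numbers are assumed to be 1 or greater, as in awk's $1, $2, and so on; $0 (the whole line) is not handled.

```python
class AwkList(list):
    """List that returns "" for out-of-range integer indexes, like awk fields."""

    def __getitem__(self, key):
        try:
            return list.__getitem__(self, key)
        except IndexError:
            return ""


def fields(line, *numbers):
    """Return the requested 1-based fields of a whitespace-split line.

    Hypothetical helper: assumes every field number is >= 1.
    """
    words = AwkList(line.split())
    # awk numbers fields from 1; Python indexes from 0.
    return [words[n - 1] for n in numbers]


print(fields("a b c d e f", 4, 5))  # ['d', 'e']
print(fields("a b c d", 4, 5))      # ['d', '']
```

A missing fifth field comes back as an empty string instead of raising, which is the awk behavior discussed above.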
Hmmm. There's something going on here I don't understand. The ref
manual (3.3.5 Emulating container types) says for __getitem__(), "Note:
for loops expect that an IndexError will be raised for illegal indexes
to allow proper detection of the end of the sequence." I expected my
little demo class to therefore break for loops, but they seem to work
fine:
>>> import awk
>>> l = awk.awkList ("foo bar baz".split())
>>> l
['foo', 'bar', 'baz']
>>> for i in l:
...     print i
...
foo
bar
baz
>>> l[5]
''
Given that I've caught the IndexError, I'm not sure how that's working.
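
One likely explanation, offered as a sketch rather than a definitive account: list defines its own __iter__, and a for loop uses __iter__ when it exists. In CPython the list iterator does not go through the subclass's overridden __getitem__, so swallowing IndexError there does not affect the loop. The note in the manual applies to classes that define only __getitem__, where the loop falls back to calling it with 0, 1, 2, ... until IndexError is raised. The example below uses Python 3 syntax; AwkSeq is a hypothetical class written only for this comparison.

```python
import itertools


class AwkList(list):
    """list subclass: iteration goes through list.__iter__."""

    def __getitem__(self, key):
        try:
            return list.__getitem__(self, key)
        except IndexError:
            return ""


class AwkSeq:
    """No __iter__, so for loops fall back to __getitem__(0), (1), (2), ..."""

    def __init__(self, items):
        self._items = list(items)

    def __getitem__(self, key):
        try:
            return self._items[key]
        except IndexError:
            return ""


words = "foo bar baz".split()

# Stops after three items even though __getitem__ hides IndexError.
print(list(AwkList(words)))  # ['foo', 'bar', 'baz']

# Never raises IndexError, so plain iteration would not end;
# islice caps it at five items for the demonstration.
print(list(itertools.islice(AwkSeq(words), 5)))  # ['foo', 'bar', 'baz', '', '']
```

So the warning about for loops would bite a from-scratch container like AwkSeq, but not a subclass of list.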