convert script awk in python

Avi Gross avigross at verizon.net
Wed Mar 24 11:22:02 EDT 2021


Cameron,

I agree with you. I first encountered AWK in 1982 when I went to work for
Bell Labs.

I have not had any reason to use AWK since before the year 2000 so I was not
sure that uninitialized variables defaulted to zero. The code seemed to
assume that. I have learned quite a few languages since and after a while,
they tend to blend into each other. 
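
A minimal sketch of how that AWK default might be mirrored in Python, where
reading an unassigned name raises NameError instead of yielding zero. The
function name count_words and its inputs are made up for illustration and
are not taken from the original script.

```python
from collections import defaultdict


def count_words(lines):
    """Rough Python take on: awk '{ count[$1] += NF; total += NF }'."""
    counts = defaultdict(int)  # missing keys start life as 0, like AWK numbers
    total = 0                  # a plain name has to be initialized explicitly
    for line in lines:
        fields = line.split()  # whitespace split, similar to AWK's default FS
        if not fields:
            continue           # skip blank lines so fields[0] always exists
        counts[fields[0]] += len(fields)
        total += len(fields)
    return dict(counts), total
```

The defaultdict covers the array case; for scalars there is no shortcut, so
the explicit "total = 0" plays the role of a BEGIN block.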

I think it would indeed have been more AWKthonic (or should that be called
AWKward?) to have the functions declared up front and a BEGIN section in
which variables were clearly initialized but the language does allow some
quick and dirty ways to do things and clearly the original programmer used
some.

Which brings us back to languages like python. When I started using AWK and
a slew of other UNIX programs years ago, what I found interesting is how
much AWK was patterned a bit on the C language, not a surprise as the K in
AWK is Brian Kernighan who had a hand in C. But unlike C that made me wait
around as it compiled, AWK was a bit more of an interpreted language and I
could write one-liner shell scripts (well, stretched over a few lines if
needed) that did things. True, if you stuck an entire program in a BEGIN
block and did not actually loop over data, it seemed a tad wasteful. But
sometimes it was handy to use it to test out a bit of C code I was writing
without waiting for the whole compile thing. In a sense, it was a bit like
using the python REPL and getting rapid feedback. Of course, when I was an
early adopter of C++, too many things were not in AWK!

What gets me is the original question which made it sound a bit like asking
how you would translate some fairly simple program from language A to
language B. For some fairly simple programs, the translation effort could be
minimal. There are often trivial mappings between similar constructs. Quite
a bit of python simply takes a block of code in another language that is
between curly braces, and lines it up indented below whatever it modifies
and after a colon. The reverse may be similarly trivial. There are of course
many such changes needed for some languages but when some novel twist is
used that the language does not directly support, you may need to innovate
or do a rewrite that avoids it. But still, except in complicated
expressions, you can rewrite x++ to "x += 1" if that is available or "x = x
+ 1" or "x -> x + 1" or whatever.
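
As a small illustration of that kind of mapping, here is a sketch of how a
braces-and-increment AWK fragment might come out in Python. The AWK line in
the comment and the name count_matching are invented for this example.

```python
def count_matching(lines, needle):
    """Rough translation of: awk -v needle=... 'index($0, needle) { n++ } END { print n }'."""
    n = 0                   # AWK would supply this zero implicitly
    for line in lines:      # AWK's implicit loop over input records
        if needle in line:  # the pattern part: a plain substring test
            n += 1          # Python has no ++; "++n" parses as +(+n), a no-op
    return n                # the END action, returned instead of printed
```

The body between the braces ends up indented under the "if", after a colon,
and the increment becomes "n += 1".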

What gets me here is that AWK in his program was being used exactly for
what it was designed. Python is more general-purpose. Had we been asked (not
on this forum) to convert that AWK script to PERL, it would have been much
more straightforward because PERL was also designed to be able to read in
lines and break them into parts and act on them. It has constructs like the
diamond operator or split that make it easy.

Hence, at the end, I suggested Tomasz may want to do his task not using just
basic python but some module others have already shared that emulates some
of the filter aspects of AWK. That may make it easier to just translate the
bits of code to python while largely leaving the logic in place, depending
on the module.
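
Short of such a module, the standard library gets part of the way there. The
sketch below assumes nothing about the original script: records and
sum_column are hypothetical helpers, and fileinput.input() stands in for the
read-files-or-stdin behavior that AWK and PERL provide for free.

```python
import fileinput


def records(lines, sep=None):
    """Yield (NR, fields) per line, loosely like AWK's implicit main loop."""
    for nr, line in enumerate(lines, start=1):
        # sep=None splits on runs of whitespace, close to AWK's default FS
        yield nr, line.rstrip("\n").split(sep)


def sum_column(lines, column, sep=None):
    """Rough equivalent of: awk '{ s += $column } END { print s }'."""
    total = 0.0
    for _nr, fields in records(lines, sep):
        if len(fields) >= column:
            try:
                total += float(fields[column - 1])  # AWK fields are 1-based
            except ValueError:
                pass  # skip text fields; AWK counts text with no numeric prefix as 0
    return total


def main():
    # fileinput.input() reads the files named in sys.argv[1:], or stdin if none
    print(sum_column(fileinput.input(), column=2))
```

Calling main() from a script would then behave much like an AWK filter at the
end of a pipeline, with the per-line logic kept in ordinary functions.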

Just to go way off the rails, was our annoying cross-poster from a while
back also promising to include a language like AWK into their universal
translator by just saving some JSON descriptions?

-----Original Message-----
From: Python-list <python-list-bounces+avigross=verizon.net at python.org> On
Behalf Of Cameron Simpson
Sent: Tuesday, March 23, 2021 6:38 PM
To: Tomasz Rola <rtomek at ceti.pl>
Cc: Avi Gross via Python-list <python-list at python.org>
Subject: Re: convert script awk in python

On 23Mar2021 16:37, Tomasz Rola <rtomek at ceti.pl> wrote:
>On Tue, Mar 23, 2021 at 10:40:01AM -0400, Avi Gross via Python-list wrote:
>[...]
>> I am a tad concerned as to where any of the variables x, y or z have 
>> been defined at this point. I have not seen a BEGIN {...} 
>> pattern/action or anywhere these have been initialized but they are 
>> set in a function that as far as I know has not been called. Weird. 
>> Maybe awk is allowing an uninitialized variable to be tested for in 
>> your code but if so, you need to be cautious how you do this in python.
>
>As far as I can say, the type of uninitialised variable is grokked from 
>the first operation on it. I.e., "count += 1" first initializes count 
>to 0 and then adds 1.
>
>This might depend on the exact awk being used. There were a few of them 
>during the last 30+ years. I just assume it does as I wrote above.

I'm pretty sure this behaviour's been there since very early times. I think
it was there when I learnt awk, decades ago.

>Using BEGIN would be in better style, of course.

Aye. Always good to be up front about initial values.

>There is a very nice book, "The AWK Programming Language" by Aho, 
>Kernighan and Weinberger. First printed in 1988, now free and in pdf 
>format. Go search.

Yes, a really nice book. [... walks into the other room to get his copy ...]
October 1988.

Wow. There're 11 pages of good example programmes before any need for user
variables at all. But at "1.5, Counting" is the sentence:

    Awk variables used as numbers begin life with the value 0, so we 
    don't need to initialise emp.

Which is great for writing ad hoc scripts, particularly on the command line.
But not a great style for anything complex.

Cheers,
Cameron Simpson <cs at cskk.id.au>
--
https://mail.python.org/mailman/listinfo/python-list


