Forth as a scripting language

Albert van der Horst albert at spenarnc.xs4all.nl
Wed Aug 14 10:11:41 CEST 2002


In article <just-3D93E5.19410108082002 at news1.xs4all.nl>,
Just  <just at xs4all.nl> wrote:
>In article <o58uia-q8q.ln at drebbelstraat20.dyndns.org>,
> Mart van de Wege <mvdwege.usenet at drebbelstraat20.dyndns.org> wrote:
>
>> For example, this is what I recently did to extract all IPs from my
>> access.log:
>>
>> ----- BEGIN SCRIPT -----
>> #!/usr/bin/perl
>>
>> use warnings; # Make Perl picky about syntax.
>> use strict; # Make Perl *really* picky.
>>
>> my @iplist; # Declare an array to hold all IP addresses.
>>
>> open (FILE, '/var/log/apache/access.log');
>>
>> while (<FILE>) {
>>      /^(\d+\.\d+\.\d+\.\d+)?/;
>>      next unless $1; # Skip if the first field is somehow empty.
>>      next if $1 eq '127.0.0.1'; # Skip localhost.
>>      push @iplist, $1;
>> }
>> # @iplist now holds all IPs in the first field of access.log.

Full of idiosyncrasies:
   < > around FILE
   / / around a reg exp
   reg exp working on $_ (uck)
   next construction
   unless is a superfluous language embellishment

>>
>> ----- END SCRIPT -----
>>
>> Python can do this too of course, but somehow this is the sort of task
>> that comes naturally to me in Perl. Note the use of the regexp:
>>
>> 1. I don't have to explicitly declare and compile it.
>> 2. It operates on the default input variable ($_), so I don't have to
>> specify its target, I just use a bare regexp.
>
>
>import re
>
>iplist = []
>
>for line in open("/var/log/httpd/access_log"):
>   m = re.match(r"^(\d+\.\d+\.\d+\.\d+)", line)
>   if m:
>      ip = m.group(1)
>      if ip != "127.0.0.1":
>         iplist.append(ip)
>
>
>I don't think that's significantly worse (or better...) than the Perl
>version?

I think it is clearly superior!
- There is just the right amount of code to keep track of what is going on
- It properly reflects the logic of the program
- Constructions are consistent and evocative, even without
  studying the ``re'' module.

This is the way I want scripting done in Forth.

It would work out approximately this way:
---------------

REQUIRE RE-MATCH
REQUIRE COMPARE
REQUIRE SET

1000 SET Iplist  Iplist SET!

: 2SET+! >R SWAP R@ SET+! R> SET+! ;

"/var/log/httpd/access_log" GET-FILE $DO
    $LINE "^(\d+\.\d+\.\d+\.\d+)" RE-MATCH IF
        \ COMPARE yields 0 on equality, so nonzero means: not localhost.
        \1 "127.0.0.1" COMPARE IF \1 Iplist 2SET+! THEN
    THEN
$LOOP
---------------

This is admittedly inferior to Python, but not by too much,
given that Forth is such a lightweight language.

Note that this doesn't work without slurping the file.
The pointers stored in Iplist would be worthless if pointing
to a fixed buffer reused all the time by READ-LINE.
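
To make the point concrete, here is a small Python illustration (not
Forth, and not how any particular READ-LINE is implemented). The sample
lines, the 64-byte buffer and all names are made up for the example; a
memoryview stands in for a stored pointer.

```python
# Two fake log lines; the first 8 bytes of each are the IP field.
lines = [b"10.0.0.1 - - GET /", b"10.0.0.2 - - GET /"]

# Case 1: one fixed buffer reused for every line (assumed READ-LINE style).
buf = bytearray(64)
stale = []
for line in lines:
    buf[:len(line)] = line              # overwrite the buffer in place
    stale.append(memoryview(buf)[:8])   # keep a "pointer" to the IP field
stale_ips = [bytes(v) for v in stale]   # every pointer now sees the last line

# Case 2: the whole file slurped into memory once; pointers stay valid.
data = b"\n".join(lines)
kept = []
offset = 0
for line in lines:
    kept.append(memoryview(data)[offset:offset + 8])
    offset += len(line) + 1             # skip the line and its newline
kept_ips = [bytes(v) for v in kept]

print(stale_ips)
print(kept_ips)
```

In the first case both stored views end up showing the last line read;
in the second each view still shows its own line.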

Extensions to be built in Forth for this purpose:
1. SET (can be borrowed from ciforth, a one-screener)
2. interpretive loops (idem)
3. GET-FILE (also known as SLURP, rather common)
4. $DO $LINE $LOOP ( To be done)
5. RE-MATCH  ( To be done)

Conventions for using regular expressions in Forth, however, have not
yet converged (in 20+ years; how long did it take in Python?).
The work of (at least) Marcel Hendrix and Putka comes close to
the kind of regular expression matching wanted here.

This article will appear once in linux.advocacy. Those
who are interested in language issues can follow the thread
in one of the other groups.

Forthers are kindly requested to trim python / perl from the
Newsgroups: as soon as the article is no longer on topic in that
other group.

Groetjes Albert
-- 
Albert van der Horst,Oranjestr 8,3511 RA UTRECHT,THE NETHERLANDS
To suffer is the prerogative of the strong. The weak -- perish.
albert at spenarnc.xs4all.nl     http://home.hccnet.nl/a.w.m.van.der.horst



