[Tutor] 2016-02-01 Filter STRINGS in Log File and Pass as VARAIBLE within PYTHON script

Cameron Simpson cs at zip.com.au
Tue Feb 2 18:36:40 EST 2016

On 02Feb2016 21:14, knnleow GOOGLE <knnleow at gmail.com> wrote:
>Sorry, forget to make use of SET() ....... this is the new update.....
>appreciate your advice if we can still optimized further...

A few remarks, interleaved below.

>myArray = sys.argv

I would not make this a global. Instead, pass sys.argv to main at the bottom 
where you call main:

  sys.exit(main(sys.argv))

and set up main as:

  def main(argv):

>def checkInputs():
>        if('-date' not in myArray):
>                #print(__doc__)
>                print('''

Also pass argv to checkInputs, so:

  def checkInputs(argv):
    if '-date' not in argv:

BTW, this is not C or Perl, you don't need brackets around your "if" 
conditions.

Also, have a read of PEP 8.

Although it is the style guide for the standard library, most of its 
recommendations are followed by most Python code. I have in mind the section on 
function names; in Python it is common to name functions and variables using 
lowercase_with_underscores, so "checkInputs" would normally be named 
"check_inputs".

>USAGE:    python fail2ban-banned-ipAddress.py -date <YYYY-MM-DD>
>EXAMPLE:  python fail2ban-banned-ipAddress.py -date 2016-01-31
>                ''')
>                sys.exit(1)
>def main():

As remarked, go for:

  def main(argv):

>        try:
>                checkInputs()

and then:

  checkInputs(argv)

>                myDate = myArray[myArray.index('-date') + 1]

It is better to properly examine the argument list.

For example (for something this simple), I often go:

  cmd = argv.pop(0)
  badopts = False
  if not argv:
    print("%s: missing options" % (cmd,), file=sys.stderr)
    badopts = True
  else:
    option = argv.pop(0)
    if option != '-date':
      print("%s: unrecognised option: %s" % (cmd, option), file=sys.stderr)
      badopts = True
    elif not argv:
      print("%s: %s: missing date" % (cmd, option), file=sys.stderr)
      badopts = True
    else:
      my_date = argv.pop(0)
      ... check date for sanity here ...

  if argv:
    print("%s: extra arguments: %r" % (cmd, argv), file=sys.stderr)
    badopts = True

  if badopts:
    print(USAGE, file=sys.stderr)
    return 2

  ... proceed with main program here ...

See how it checks for options and their arguments, and has the scope to make 
many complaints before quitting?
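
The "check date for sanity here" step could be sketched as below. This is 
only one way to do it: parse_date is a made-up helper name, and it assumes the 
date must be a real calendar date in YYYY-MM-DD form, as in your usage message.

```python
from datetime import datetime


def parse_date(text):
    """Return a datetime.date for a YYYY-MM-DD string, or None if it is invalid.

    parse_date is a hypothetical helper, not part of the original script.
    """
    try:
        # strptime raises ValueError for malformed or impossible dates.
        return datetime.strptime(text, "%Y-%m-%d").date()
    except ValueError:
        return None
```

In the option parsing above you would call it on my_date and, if it returns 
None, print a complaint to sys.stderr and set badopts = True like the other 
checks.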

If you have several options you will want to reach for a module like argparse, 
but your program has only one.
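
For comparison, here is roughly what the argparse version might look like if 
the program grows more options. It is a sketch, not a drop-in replacement: 
parse_args is a made-up function name, and the help text is my own wording. 
The single-dash "-date" spelling is kept to match your usage message.

```python
import argparse


def parse_args(argv):
    """Parse the command line; argv is the full sys.argv-style list."""
    parser = argparse.ArgumentParser(
        prog=argv[0],
        description="Report banned IP addresses for one day.",  # wording assumed
    )
    # required=True makes argparse complain if -date is left out.
    parser.add_argument("-date", required=True, metavar="YYYY-MM-DD",
                        help="the day to report on")
    # On bad input argparse prints a usage message and exits with status 2.
    return parser.parse_args(argv[1:])
```

Calling parse_args(['prog', '-date', '2016-01-31']) gives an object whose 
.date attribute is the string '2016-01-31'; you would still want to check that 
string for sanity yourself.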

>                timestamp01 = time.strftime("%Y-%m-%d")
>                timestamp02 = time.strftime("%Y-%m-%d-%H%M%S")
>                wd01 = ("/var/tmp/myKNN/1_mySAMPLEpython-ver-001/" + 
>                wd02 = ("/var/tmp/myKNN/1_mySAMPLEpython-ver-001/" + 

You never use these. Also, these pathnames are very special looking; I would 
put them up the top as tunable constants (Python doesn't have constants, but it 
has an idiom of globals with UPPERCASE names for the same purpose).
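
A small illustration of that idiom follows. WORK_DIR is the visible start of 
the paths in your script; the rest of those paths is not shown above, so the 
"report-" file name and the report_path function are invented purely for the 
example.

```python
import os.path

# Tunable "constants": ordinary module-level globals, UPPERCASE by convention.
WORK_DIR = "/var/tmp/myKNN/1_mySAMPLEpython-ver-001/"
DATE_FORMAT = "%Y-%m-%d"


def report_path(day):
    """Return a per-day file path under WORK_DIR (file name is hypothetical)."""
    return os.path.join(WORK_DIR, "report-" + day + ".txt")
```

Anyone wanting to relocate the output then has one obvious line to change at 
the top of the file.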

>                # LOOP through the SET and WHOIS
>                for i in banIP_addrs:

You might do better with a more descriptive name than "i", for example 
"ipaddr". More readable to others, and also to yourself later.

Also, you _may_ want to sort the addresses for reporting purposes (entirely 
your call); you could go:

  for ipaddr in sorted(banIP_addrs):
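
One thing to be aware of: sorted() on strings compares them character by 
character, so "10.0.0.2" comes out before "9.0.0.1". If you want numeric 
order, a sketch using the standard ipaddress module (Python 3.3 and later) 
might look like this. The addresses are made up, and it assumes they are all 
IPv4; mixing IPv4 and IPv6 addresses in one sort raises TypeError.

```python
import ipaddress

# Made-up sample data standing in for the set of banned addresses.
banned = {"192.168.1.10", "10.0.0.2", "9.0.0.1"}

# key=ipaddress.ip_address compares the addresses numerically,
# while the loop variable is still the original string.
for ipaddr in sorted(banned, key=ipaddress.ip_address):
    print(ipaddr)
```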

>                        print("i:", i)
>                        whoisVAR = os.popen("whois -H " + i + " |egrep 
>-i \"name|country|mail\" |sort -u").read()

Again, here you are running a shell pipeline which could be done in Python.  
Look at the subprocess module, and invoke:

  whois = subprocess.Popen(['whois', '-H', ipaddr],
                           stdout=subprocess.PIPE, universal_newlines=True)
  for line in whois.stdout:
    ... gather up name/country/mail as you did with the log file ...
  whois.wait()
  ... now print report ...
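
Fleshed out a little, that might look like the sketch below. The names WANTED, 
interesting_lines and whois_summary are all invented for the example, and the 
filtering only roughly mirrors your egrep -i and sort -u (it strips whitespace 
from each line before removing duplicates).

```python
import re
import subprocess

# Same pattern as the egrep -i "name|country|mail" in the shell pipeline.
WANTED = re.compile(r"name|country|mail", re.IGNORECASE)


def interesting_lines(lines):
    """Return the matching lines, stripped, de-duplicated and sorted."""
    return sorted({line.strip() for line in lines if WANTED.search(line)})


def whois_summary(ipaddr):
    """Run whois for one address and return its interesting lines."""
    # The with-statement closes the pipe and waits for whois to finish.
    with subprocess.Popen(['whois', '-H', ipaddr], stdout=subprocess.PIPE,
                          universal_newlines=True) as whois:
        return interesting_lines(whois.stdout)
```

Keeping the filtering in its own function means you can test it on a list of 
sample lines without running whois at all.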

Cameron Simpson <cs at zip.com.au>
