[Tutor] getopt options

Danny Yoo dyoo@hkn.eecs.berkeley.edu
Sun, 11 Feb 2001 03:58:48 -0800 (PST)


On Sun, 11 Feb 2001, Tom Connor wrote:

> I'm trying to find a tutorial for getopt. But most things seem to refer
> to Unix or some other source not suitable for my current level of
> understanding.  I find the Python documentation pretty cryptic.  Does
> anyone have suggestions where I should go to improve my understanding of
> getopt and it's options?

Good evening; you're probably looking at the reference material for the
getopt module here:

    http://python.org/doc/current/lib/module-getopt.html

[note: this message is very long, and will look at the very last example
in some detail.  Apologies in advance.]

Could you tell us where the reference material starts sounding funny?  We
can help interpret the examples at the bottom.

Let's take a look at the very last example that the reference manual
brings up.  I'll simplify the example so we don't look at the "long"
argument stuff.  We'll be looking at variations of this:

###
    opts, args = getopt.getopt(sys.argv[1:], "ho:")
###

getopt() will return two lists: "opts" will contain all the option
name-value pairs that we're looking for, while "args" will contain
whatever is left over once the options have been read (filenames, for
example).  When we use getopt(), we need to tell it what options we're
paying attention to.

Let's take a look at the second argument to that getopt() call:

    "ho:"

This means that our program will recognize 2 kinds of "short"
one-character options, '-h' and '-o'.  For example, we could pass the
following arguments:

    -h
    -h -oOutput
    -h -o Output
    -oRabbit

which should all be legal.  That colon after the 'o' means that we
expect it to take an additional argument, so getopt will suck up the very
next word after any "-o".  If we try to use the '-o' option
without something after it, getopt() will respond with an error.  That's
the theory, at least.  *grin*  Let's put it into practice.


Interpreter time.

###
## Case 1
>>> from getopt import getopt
>>> getopt(['-h', '-o', 'Object'], 'ho:')
([('-h', ''), ('-o', 'Object')], [])
###

In this case, we've probably sent getopt the following command line:

    some_program_name -h -o Object

Python will automagically hand us the command line as the list sys.argv;
sys.argv[0] is the program name, which is why we pass sys.argv[1:] to
getopt.  We see that getopt returns a tuple of two things.  The first
contains all the options.  The options themselves have an interesting
structure: each "option" is a 2-tuple:

    [('-h', ''), ('-o', 'Object')]

But why doesn't it do this instead:

    [-h, ('-o', 'Object')]   ?

Isn't this more efficient?  getopt always uses 2-tuples as a matter of
consistency.  When we write programs to figure out which options have
been turned on, it's easier if we can expect getopt to return something
with a very uniform structure.  If we look later at the code:

###
    for o, a in opts:
###

we expect to place the option name in 'o' and the argument value in 'a';
we wouldn't be able to do this unless we were absolutely sure that every
element is a 2-tuple; otherwise, the for loop wouldn't be able to unpack
each element properly.
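
To make that loop a little more concrete, here is a minimal sketch of how
a program might act on the pairs.  The function name parse() and the
variables show_help and output are made up for illustration; only
getopt.getopt() itself comes from the library.

```python
import getopt

def parse(argv):
    """Hypothetical helper: turn an argument list into simple settings."""
    opts, args = getopt.getopt(argv, "ho:")
    show_help = False
    output = None
    # Every element of opts is a 2-tuple, so this unpacking always works.
    for o, a in opts:
        if o == "-h":
            show_help = True      # '-h' carries no value; a is ''
        elif o == "-o":
            output = a            # '-o' carries its argument in a
    return show_help, output, args

print(parse(["-h", "-o", "Object"]))   # (True, 'Object', [])
```

In a real script we would probably call parse(sys.argv[1:]) instead of
passing a hand-written list.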


Let's take a look at another call:

###
## Case 2
>>> getopt(['-o=Object'], 'ho:')
([('-o', '=Object')], [])
###

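Here, we see that the options are optional; even though we expect to see
'-h' or '-o', nothing in getopt will break if we leave one of them
out.  Notice also that getopt doesn't treat '=' specially for short
options: everything after '-o' in that word, '=' included, becomes the
value.  However, getopt is equipped to recognize when an option is
incomplete.  That's the next case: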


###
## Case 3
>>> getopt(['-o'], 'ho:')
Traceback (innermost last):
# [edited for brevity]
getopt.error: option -o requires argument
###

Here, since '-o' needs to have something after it, getopt() will
fail and complain with an exception.  This error reporting is
actually useful, though, because we can use exception handling to respond
appropriately to these situations.  This message is too long already, so I
won't talk about exception handling for now.
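
For the curious, here is just a rough sketch of what catching that error
could look like.  It assumes a reasonably modern Python, where the
exception is spelled getopt.GetoptError (getopt.error is kept as an
alias); the function name safe_parse() is made up for this example.

```python
import getopt

def safe_parse(argv):
    """Hypothetical helper: return (opts, args), or None on a bad option."""
    try:
        return getopt.getopt(argv, "ho:")
    except getopt.GetoptError as err:
        # err describes what went wrong, e.g. a missing argument for -o
        print("bad command line:", err)
        return None

safe_parse(["-o"])         # complains and returns None
print(safe_parse(["-h"]))  # ([('-h', '')], [])
```

A real program would more likely print a usage message and exit at that
point rather than return None.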



Here's a tricky case:
###
## Case 4
>>> getopt(['-homer'], 'ho:')
([('-h', ''), ('-o', 'mer')], [])
###

What's going on?  The trick is that, in UNIX tradition, when we put
something like:

    -homer

we really mean:

    -h -o -m -e -r

as shorthand...  That is, unless -o is an option that takes an argument
value.  Since we've defined -o as such, the rest of the word, 'mer',
becomes the argument to '-o'.  We can see the plain shorthand more
clearly with another example:

###
>>> getopt(['-abc'], 'abcd')
([('-a', ''), ('-b', ''), ('-c', '')], [])
###
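
One way to convince ourselves of this is to compare the squashed-together
form against the spelled-out form.  This is only a small check, and the
variable names are made up for the example:

```python
from getopt import getopt

# '-homer' squashed together, versus the same options written separately
clustered = getopt(['-homer'], 'ho:')
spelled_out = getopt(['-h', '-o', 'mer'], 'ho:')

print(clustered)                 # ([('-h', ''), ('-o', 'mer')], [])
print(clustered == spelled_out)  # True
```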



Finally, here's a wacky case:

###
## Case 5
>>> getopt(['-h', 'radish', '-o=Object'], 'ho:')
([('-h', '')], ['radish', '-o=Object'])
###

What's happening?  You might be wondering, why didn't '-o=Object' get
parsed out as an option, as in Case 2:

    ('-o', '=Object')   ??

The reason is that all options need to come _before_ anything that
looks like a regular argument (like a filename).  So: ['-h', '-o=Object',
'radish'] would have worked normally.  Options can be irritating that way,
but that's how they're defined: as soon as getopt sees an argument
that doesn't look like an option, it stops parsing and hands back
everything from there on, untouched, as "args".
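
Here is a small side-by-side sketch of that ordering rule.  The second
call uses getopt.gnu_getopt(), which is not in the Python this message
was written against; it exists in later versions of the module and
allows options and regular arguments to be mixed (unless the
POSIXLY_CORRECT environment variable is set).  The variable names are
made up for the example.

```python
import getopt

argv = ['-h', 'radish', '-o=Object']

# Plain getopt stops at the first non-option argument, 'radish'.
strict = getopt.getopt(argv, 'ho:')

# gnu_getopt keeps scanning past 'radish' (assumes a later Python version).
relaxed = getopt.gnu_getopt(argv, 'ho:')

print(strict)    # ([('-h', '')], ['radish', '-o=Object'])
print(relaxed)   # ([('-h', ''), ('-o', '=Object')], ['radish'])
```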

If you have any questions, please feel free to ask; it's much too quiet in
this mailing list.  *grin*