[getopt-sig] More about commands on the command line

David Boddie david@sleepydog.net
Fri, 8 Mar 2002 12:55:49 +0000


[I've only quoted the text I wanted to reply to, so this may appear
quite disjointed in certain places.]

On Friday 08 Mar 2002 10:59 am, A.T. Hofkamp wrote:

> Looking at the wild variety of command-lines for all programs, I'd say
> there is not much fundamental understanding of what is good or bad, and
> why. I suspect that 99.9% of the programs choose something 'because
> that and that program also do it', or 'because that is what my option
> processing package assumes', not because they know it is the best approach.

I imagine that in many cases the syntax for the arguments passed to the
program is dictated both by the ease of parsing those arguments and the
type of functionality offered by the program. Therefore, I suspect that
we see something of the internal operation of utilities such as "tar"
and "rpm" in their syntax definitions.

> I consider my experiments as a way of gaining knowledge about the option
> processing problem, so that we can weigh the pros and cons well, rather
> than blindly adopting some standard because it just seems nice (or because
> 'everybody does it') without knowing the consequences and the alternatives
> (e.g. what does option processing look like if we do want to be able to
> handle 'cvs commit'-like command lines).

We could give the parser the ability to handle different styles of command
line.

> It makes options and (command) more equal citizens. Except for the special
> treatment of for example -spam (which may be magically interpreted as '-s
> -p -a -m'), commands and options have equal status.

This would depend on the style of command line which you are asking the
parser to deal with. For example, -spam may be interpreted as

1. An argument, not an option.
2. A single option: -spam
3. A number of options: -s -p -a -m
4. A single option with a following argument: -s pam
5. Some other combination of options and arguments.

Although we can hack option libraries to deal with some of these in
a natural manner and cope with the others as special cases, I propose
that to remove ambiguity a given style of command line would not mix
these option styles.

So, the command line

    -spam -viking -longship

could not be interpreted as

    -spam -v -i -k -i -n -g -l ongship

or any other confused input.
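
To make the idea of a single style per command line concrete, here is a
minimal sketch in Python. The style names ("argument", "single", "bundled",
"attached") and both function names are labels invented for this
illustration; they correspond to cases 1 to 4 above and are not taken from
any existing package.

```python
def interpret(token, style):
    """Interpret one word under exactly one style.

    Returns a list of (kind, text, value) tuples. The style names are
    hypothetical labels for cases 1 to 4 in the list above.
    """
    if style == "argument" or not token.startswith("-") or token == "-":
        return [("argument", token, None)]          # case 1
    body = token[1:]
    if style == "single":
        return [("option", body, None)]             # case 2: -spam
    if style == "bundled":
        return [("option", letter, None) for letter in body]  # case 3
    if style == "attached":
        return [("option", body[0], body[1:] or None)]        # case 4
    raise ValueError("unknown style: %r" % (style,))


def parse_line(words, style):
    """Apply the same style to every word, so styles are never mixed."""
    result = []
    for word in words:
        result.extend(interpret(word, style))
    return result


if __name__ == "__main__":
    line = ["-spam", "-viking", "-longship"]
    for style in ("argument", "single", "bundled", "attached"):
        print(style, parse_line(line, style))
```

Because the style is chosen once for the whole line, the mixed reading
shown above simply cannot be produced.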

> It is true that you can parse cvs-like command lines with multiple
> instances of parser, but it is a work-around rather than a proper solution.
> I mean, you fix the problem by writing a solution around the limitations of
> the option processing package.

I agree. We are in danger of rewriting options packages to deal with many
special cases rather than addressing the more general problem.

Indeed, in the cvs-like case, the complexity of the command line syntax
is being "passed upwards" to the programmer, who then may have to perform
simple syntax checking on command lines.

> The main reason for pursuing a `real' solution is that I have learned that
> code that relies on work-arounds tends to have some basic assumption that
> isn't true, at least not in all cases. A solution that can really cope with
> the situation does not have that assumption, and is thus a more generic
> solution to the problem.

We need to specify our requirements for such a solution, but not make it
too general.

> * With the equal status of commands and options, I can have commands that
> act as options, like 'cvs verbose commit'. Maybe this is not normal now,
> but can anybody give me a good reason why 'verbose' is bad, and '--verbose'
> or '-v' is not ? At least, 'verbose commit' looks more intuitive and less
> technical to me, which may be a + for non computer-experts (until now, I
> cannot give a good reason to such people why we need to write a '-' in
> front of options rather than my example).

With command line styles you could allow "verbose", "-verbose" or
"--verbose", but a mixture of these might prove problematic. You could
equally well allow both "-v" and "+v" and have them mean the same thing,
or different things.
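
As a rough sketch of how the prefix could become part of the style, the
following Python function takes a mapping from prefix to the value that
prefix gives the flag. The function name parse_flag and the shape of the
mapping are assumptions made for this example only.

```python
def parse_flag(token, prefixes, names):
    """Match a word against a set of known flag names.

    prefixes maps a prefix string ("" for bare words, "-", "--", "+")
    to the value a flag takes when written with that prefix.
    Returns (name, value), or None if the word is not a flag in this style.
    """
    for prefix, value in prefixes.items():
        if token.startswith(prefix) and token[len(prefix):] in names:
            return token[len(prefix):], value
    return None


if __name__ == "__main__":
    names = {"verbose", "v"}
    # A style with bare words as flags: 'cvs verbose commit'
    print(parse_flag("verbose", {"": True}, names))
    # A style where only a single dash is accepted
    print(parse_flag("--verbose", {"-": True}, names))
    # "-v" and "+v" meaning different things
    print(parse_flag("+v", {"-": True, "+": False}, names))
```

Whether "-v" and "+v" agree or differ is then a property of the style
definition rather than of the parsing code.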

> * With the equal status of commands and options, I can have optional
> dashes, like in 'tar xzf myfile.tgz'. Not pretty and not recommended, but
> it fits in my solution without major headaches.

Without special characters to denote options, parsing would be slightly
more difficult in this case. I imagine that the position of the options
is important in the case of "tar", so it may be a special case command
line with positional options/commands.

Certainly, in extreme cases of this sort of command line, there is plenty
of scope for ambiguity.
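
One way to picture a positional style of this kind is sketched below in
Python. It assumes that the first word is always a bundle of single-letter
options and that options needing a value take them, in order, from the
words that follow. The function name and the takes_value parameter are
invented for the illustration; this is not a description of how "tar"
itself is implemented.

```python
def parse_positional_bundle(argv, takes_value):
    """Parse a line whose first word is a dashless bundle of options.

    takes_value is the set of letters that consume a following word.
    Returns (options, remaining_arguments).
    """
    if not argv:
        raise ValueError("no option bundle given")
    bundle, rest = argv[0], list(argv[1:])
    options = {}
    for letter in bundle:
        if letter in takes_value:
            if not rest:
                raise ValueError("option %r needs a value" % (letter,))
            options[letter] = rest.pop(0)   # value is chosen by position
        else:
            options[letter] = True
    return options, rest


if __name__ == "__main__":
    print(parse_positional_bundle(["xzf", "myfile.tgz"], {"f"}))
```

The ambiguity is easy to see here: nothing in the word "xzf" marks it as a
bundle of options rather than a file name, so only its position says so.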

> I consider it advantageous to have the more generic solution. I learned a
> few things, and I have more power to do things like I want rather than
> being forced by the option processing package.
> Sooner or later, somebody will need that power.

I think that we should be clear on what an option processing package
should contain, and make it sufficiently modular to allow users to
leave out or replace features they don't want or need.

The package should:

1. Parse the command line, possibly using an appropriate style definition.

2. Check the input against a syntax definition to prevent invalid or
   ambiguous input. Optional extras include:

  a. Correcting the user's input using the syntax definition and
     confirming the corrections with the user.

  b. Providing a more specific error message for the error encountered.

  c. Clarifying what the user meant in cases of ambiguous syntax
     definitions or input.

3. Extract the values given by the user and make them available to the
   programmer in a useful form.

4. Convert values according to type declarations.

The first feature would resolve any debate over the preferred style of
command line to support. It would leave only a debate on what should be
the default style. <wink>

I haven't seen much enthusiasm for the second feature so far, although I
would find it quite useful. It would allow one-shot parsing which
produces either a collection of values or an exception, depending on
whether a successful match was found.

The last two features are already present in many existing packages,
but I imagine that there is some scope for allowing different ways of
presenting values to the programmer.
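
A minimal sketch of the one-shot idea, in Python, might look like the
following. It assumes a very simple syntax definition: a dictionary mapping
each "--name" option to a converter function, or to None for an option that
takes no value. The names parse_once and CommandLineError are hypothetical
and the "--" style is fixed only to keep the example short.

```python
class CommandLineError(Exception):
    """Raised when the input does not match the syntax definition."""


def parse_once(argv, syntax):
    """Parse, check, extract and convert in a single pass.

    Returns (values, arguments) on success, raises CommandLineError
    otherwise.
    """
    values, arguments = {}, []
    words = list(argv)
    while words:
        word = words.pop(0)
        if not word.startswith("--"):
            arguments.append(word)
            continue
        name = word[2:]
        if name not in syntax:
            raise CommandLineError("unknown option: --%s" % name)
        if name in values:
            raise CommandLineError("option given twice: --%s" % name)
        convert = syntax[name]
        if convert is None:              # option without a value
            values[name] = True
            continue
        if not words:
            raise CommandLineError("--%s needs a value" % name)
        raw = words.pop(0)
        try:
            values[name] = convert(raw)  # type conversion (feature 4)
        except ValueError:
            raise CommandLineError("bad value for --%s: %r" % (name, raw))
    return values, arguments


if __name__ == "__main__":
    syntax = {"count": int, "verbose": None}
    print(parse_once(["--count", "3", "--verbose", "file"], syntax))
```

The caller either receives a complete set of converted values or a single
exception, and never has to inspect a half-parsed command line.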

> I hope to have made clear that I haven't yet reached the point where I
> consider everything 'understood', although the number of obscure points is
> getting smaller. I think there is still progress in the understanding. I
> thought that sharing the experiments was nice, but apparently not everybody
> shares that opinion.

Although I don't have the time to compare lots of libraries, I appreciate
the discussion of ideas. I feel that without discussion we could end up
with a library which suits a particular way of thinking without solving
some of the more fundamental problems involving command lines.

This wouldn't be too bad, but I'm sure that many people would then go
back to writing their own parsers as a result.

> The discussion of what should and should not be part of the option
> processing package is a separate discussion to me. I can imagine that my
> generic approach looks very wild, and seems to be wildly outside what is
> considered 'option processing'. On the other hand, there does seem to be a
> need for something stronger than what e.g. Optik delivers by default. That
> 'something stronger' is currently in the form of a work-around, which
> happens to function for some cases (like cvs). It does not handle all
> cases, and neither is there any hope that it ever will in its current form.

I believe that we shouldn't build an option processing package on a case
by case basis.

David
