[Novalug] command line option/file ordering

Mike Miller mtmiller@ieee.org
Tue May 18 16:21:44 EDT 2010


On Tue, May 18, 2010 at 11:07 AM, Jon LaBadie <novalugml@jgcomp.com> wrote:
> The recent discussion of grep caused me to experiment on
> the command line and I came up with a surprise (at least
> to this old UNIX-head) about command line ordering.
> Maybe someone can tell me when and why linux and unix
> diverged in this respect.  And how to tell if/when
> command line ordering is important.

This is one of the many examples of GNU vs. POSIX; I like to make that
distinction rather than Linux vs. UNIX.  POSIX is a set of standards that GNU
(libraries, utilities, etc.) does attempt to conform to, but in some cases GNU
provides extensions that are the default on a GNU system.  Command-line
shuffling is one of those extensions, provided by the getopt(3) function in
GNU's implementation of libc.  I believe all of the "standard" utilities use
the same getopt, so the behavior is consistent.

> The SYNOPSIS section of a man page is supposed to document
> the valid form(s) of the command line.  Some samples:
>
>  wc [OPTION]... [FILE]...
>  ls [OPTION]... [FILE]...
>  grep [options] PATTERN [FILE...]
>  grep [options] [-e PATTERN | -f FILE] [FILE...]
>
> I learned to read this (particularly the first three)
> as "options come before the filenames" (and the PATTERN
> for grep).  A unix example:
>
>  $ ls .profile -l
>  -l: No such file or directory
>  .profile
>  $
>
> But with the gnu utilities it seems options and files
> can be in basically any order and mixed.

The magic word is POSIXLY_CORRECT.  If that environment variable is defined,
it is supposed to be recognized by all GNU libraries, utilities, and so on, to
disable the GNU extensions and more closely match the behavior dictated by
POSIX.  This is documented sparsely here and there in various man and info
pages.  For getopt, it turns off the argument shuffling, which is what you
want.

The POSIX rule is that option parsing stops at the first argument that doesn't
look like an option, or at "--" by itself, which indicates "end of options".
The GNU behavior essentially bubbles the non-option arguments toward the end
of the argument list, so all options are handled regardless of position and
all non-options are left at the end, with their relative order to each other
preserved.
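
To make the two rules concrete, here is a rough sketch in plain sh.  It only
mimics the *effect* of each rule and is not how getopt(3) is implemented.  The
function names permute_args and posix_args are made up for illustration, and
the sketch assumes every option is a simple flag (no option takes an
argument).

```shell
nl='
'

# GNU-style effect: every option is collected, wherever it appears;
# non-options keep their relative order and end up after "--".
permute_args() {
    opts=""
    rest=""
    end_of_opts=0
    for arg in "$@"; do
        if [ "$end_of_opts" -eq 1 ]; then
            rest="$rest$arg$nl"      # after "--" nothing is an option
            continue
        fi
        case $arg in
            --)  end_of_opts=1 ;;
            -?*) opts="$opts$arg$nl" ;;
            *)   rest="$rest$arg$nl" ;;
        esac
    done
    printf '%s--\n%s' "$opts" "$rest"
}

# POSIX-style effect: stop at the first non-option (or at "--");
# everything from there on is an operand, even if it starts with "-".
posix_args() {
    opts=""
    while [ $# -gt 0 ]; do
        case $1 in
            --)  shift; break ;;
            -?*) opts="$opts$1$nl"; shift ;;
            *)   break ;;
        esac
    done
    printf '%s--\n' "$opts"
    for arg in "$@"; do
        printf '%s\n' "$arg"
    done
}

permute_args .profile -l   # prints -l, then --, then .profile
posix_args .profile -l     # prints --, then .profile, then -l
```

Each function prints one argument per line, with "--" marking where the
options end and the operands begin.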

Here's your ls example on the box I'm on now (RHEL 5.2):

    $ ls .bash_profile -l
    -rw-r----- 1 mike users 603 Apr 27 13:27 .bash_profile
    $ export POSIXLY_CORRECT=y
    $ ls .bash_profile -l
    ls: -l: No such file or directory
    .bash_profile
    $
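
If you only want the POSIX behavior for one command, you don't have to export
the variable for the whole session.  A small sketch, assuming GNU ls; the
scratch file and the scope_check variable are just for illustration, and the
"|| true" is there because ls exits non-zero when it treats -l as a missing
file.

```shell
unset POSIXLY_CORRECT          # start from a known state
tmpfile=$(mktemp)              # scratch file to list; the name is arbitrary

# Default environment: on a GNU system the trailing -l is taken as an option.
ls "$tmpfile" -l || true

# Variable set for this one command only: GNU ls should now treat -l as a
# filename and complain that it does not exist.
POSIXLY_CORRECT=1 ls "$tmpfile" -l || true

# The assignment was scoped to that single command, not to this shell.
scope_check=${POSIXLY_CORRECT-unset}
echo "POSIXLY_CORRECT in this shell: $scope_check"

rm -f "$tmpfile"
```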

In fact, if you invoke bash as "sh" it automatically sets the POSIXLY_CORRECT
variable.  The variable is set but not exported, so it only affects bash's
internal behavior and is not passed on to the commands it runs.  If you run
"/bin/sh" and export POSIXLY_CORRECT yourself, you'll have a shell environment
that's more POSIX-like, including all the standard utilities.

> Also, I would have read the 4th SYNOPSIS example above
> to mean that instead of PATTERN alone, I could use
> "-e PATTERN" OR "-f FILE".  Note the OR denoted by the
> "|".  In testing I find that should read AND/OR because
> both the -e and the -f can be used on the same command
> line and are additive.  Further, they can be used
> multiple times mixed in with datafile names.

I agree; the additive property of these options is not explicitly stated in
this man page.  It's not the first or last piece missing from a man page.
When in doubt, fall back to trial and error :)
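
In that trial-and-error spirit, here is a quick way to see the additive
behavior for yourself.  The file names and contents are invented for the test;
the pattern options come before the data file, so this doesn't depend on the
GNU argument shuffling.

```shell
# Build a small data file and a pattern file in a scratch directory.
tmpdir=$(mktemp -d)
printf '%s\n' apple banana cherry date > "$tmpdir/data.txt"
printf '%s\n' cherry > "$tmpdir/patterns.txt"

# Two -e options plus one -f option: a line is printed if ANY pattern matches.
matches=$(grep -e apple -e date -f "$tmpdir/patterns.txt" "$tmpdir/data.txt")
printf '%s\n' "$matches"       # prints apple, cherry, date (one per line)

rm -rf "$tmpdir"
```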

-- 
mike :: mtmiller at ieee dot org


