Option -o Conventions? - Linux


  1. Option -o Conventions?



    Greetings,

    I've always thought that outfile specification in Unix utilities could
    be better standardized or implemented ... somehow.

    For example, I am writing a little utility of the form:

    myprog -o outfile infile

    where the -o option is optional (as this output may not always be
    wanted by the user). If outfile is "-", output is written to standard
    output; if infile is "-", input is read from standard input.
    So far, so good.

    However, what is a good convention to use to allow the user to specify
    that the input file is to be overwritten (via the internal use of a
    mkstemp file) by the output?

    One idea that occurred to me is to allow something like:

    myprog -o + infile

    to specify that the input file is to be overwritten. There seems to be
    a need for another character in addition to "-" for special argument
    signaling. Any good suggestions other than "+"?

    Another approach is to use -o with an *optional* argument, which if
    absent, would signify the overwriting of infile. However, I find
    the standard GNU getopt() handling of such optional option arguments
    to be a tad creepy:

    myprog -o infile

    Note that in this case getopt does not consider infile to be an
    argument of -o as an "=" between them would be required for that.

    or

    myprog -o= infile

    but this approach would require the use of the "=" always when
    a real argument is needed:

    myprog -o=outfile infile

    Sigh.
    Yet another approach would be to use an alternate set of options
    that do not take arguments to handle the overwrite case:

    myprog -O infile


    Are there any good precedents for cases like this?

    Bear in mind that I actually have multiple output files each of
    which should allow for independent specification.



    Thanks in advance for any advice,

    Mike Shell

  2. Re: Option -o Conventions?

    Michael Shell wrote:
    > One idea that occurred to me is to allow something like:
    >
    > myprog -o + infile
    >
    > to specify that the input file is to be overwritten.


    How about just:

    myprog -o infile infile

    with appropriate support within myprog, of course.


  3. Re: Option -o Conventions?

    Hi,

    >>One idea that occurred to me is to allow something like:

    ....
    >>to specify that the input file is to be overwritten.

    > How about just:
    >
    > myprog -o infile infile

    ....

    You'll end up with infile being a two-byte file - once you start
    overwriting infile, EOF is set appropriately.

    There are three work-arounds: (1) use fseek and fsetpos on a file opened
    in "r+" mode - which is a lot of work. (2) write to infile.tmp in your
    program and call "mv infile.tmp infile" at the end of the program or (3)
    skip the whole affair and call your program with

    # myprog -o tmp infile ; mv tmp infile

    on the shell.

    Ciao...

  4. Re: Option -o Conventions?

    On 17 Oct 2006 10:49:56 -0700
    "Kaz Kylheku" wrote:

    > How about just:
    >
    > myprog -o infile infile
    >
    > with appropriate support within myprog, of course.


    On Wed, 18 Oct 2006 09:39:57 +0200
    Bernhard Agthe wrote:

    : You'll end up with infile being a two-byte file - once you start
    : overwriting infile, EOF is set appropriately.


    I think what Kaz means is that the program should detect when both files
    are the same and then automatically invoke the use of a temp file. I had
    considered this, but there are two potential problems. (1) is that my
    program generally opens all output files in append mode. That is, if
    the output file exists, it is not overwritten, but rather appended to.
    Of course, infile=outfile is a special case in itself. And (2) is
    there a reliable means to determine if the two files really are the
    same? I'm serious. I'm thinking of complications with regard to
    symlinks or the use of both relative and absolute paths, e.g., ~/myfile
    versus /home/mshell/myfile. Sigh. I know a user that would do such a
    thing within a single command is sick, but still I would like to handle
    it correctly for all possible cases.

    : (2) write to infile.tmp in your program and call "mv infile.tmp infile"

    There is a gotcha here. If the tmp file is on another file system,
    the rename will fail and I will have to then do a full copy. (I want
    to handle all this with system calls and not calls to /usr/bin/mv).
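
    A minimal sketch of that fallback, written in Python for brevity (the
    same logic maps onto rename(2) and errno in C). The helper name
    move_into_place is hypothetical. Note that the copy path is not atomic,
    unlike a same-file-system rename:

```python
import errno
import os
import shutil


def move_into_place(tmp_path, dest_path):
    """Move tmp_path over dest_path, copying if they are on different file systems."""
    try:
        # Atomic when both paths are on the same file system.
        os.rename(tmp_path, dest_path)
    except OSError as exc:
        if exc.errno != errno.EXDEV:
            raise
        # Cross-device rename is not possible: copy the data, then
        # remove the temporary file. This path is NOT atomic.
        shutil.copyfile(tmp_path, dest_path)
        os.unlink(tmp_path)
```

    Creating the temp file next to the destination avoids the copy path in
    most cases.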


    See what I mean by there being a need for a convention that handles all
    possible cases (overwrite input file, append versus overwrite output
    file, and to be able to specify this on a file by file basis, etc.)?


    Mike Shell

  5. Re: Option -o Conventions?

    Bernhard Agthe writes:

    > You'll end up with infile being a two-byte file - once you start
    > overwriting infile, EOF is set appropriately.


    > program and call "mv infile.tmp infile" at the end of the program or
    > (3) skip the whole affair and call your program with
    >
    > # myprog -o tmp infile ; mv tmp infile


    Better:
    $ myprog -o tmp infile && mv tmp infile

    You seldom want the infile to be replaced with garbage (and you
    shouldn't need root).

    --
    Lajos Parkatti


  6. Re: Option -o Conventions?

    Michael Shell writes:
    > On 17 Oct 2006 10:49:56 -0700
    > "Kaz Kylheku" wrote:


    > > myprog -o infile infile
    > >
    > > with appropriate support within myprog, of course.


    > two potential problems. (1) is that my
    > program generally opens all output files in append mode. That is, if
    > the output file exists, it is not overwritten, but rather appended to.
    > Of course, infile=outfile is a special case in itself.


    In that case - if the situation makes sense at all - you have to keep
    track of where the file originally ended. I do not see any real problem.
    But an option to overwrite the infile without having the program
    figure out what to do might be cleaner. Depends on the program.

    > And (2) is
    > there a reliable means to determine if the two files really are the
    > same? I'm serious. I'm thinking of complications with regard to
    > symlinks or the use of both relative and absolute paths, e.g., ~/myfile
    > versus /home/mshell/myfile. Sigh.


    You could check whether inode number and device are the same (man stat).
    There might of course be a situation with the same file accessed
    through two different devices (smb to localhost or something), but if
    that is a real problem I think you should guard the file itself. You
    should perhaps check that the original file doesn't grow.
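
    A minimal sketch of the inode/device comparison, in Python for brevity
    (in C this would be two stat(2) calls comparing st_dev and st_ino). The
    function name is_same_file is hypothetical; Python's standard library
    offers os.path.samefile for the same purpose:

```python
import os


def is_same_file(path_a, path_b):
    """Return True if both paths refer to the same existing file."""
    try:
        st_a = os.stat(path_a)  # os.stat follows symlinks
        st_b = os.stat(path_b)
    except FileNotFoundError:
        # A file that does not exist yet cannot alias the other one.
        return False
    # Same device and same inode number means the same underlying file,
    # regardless of symlinks, hard links, or relative vs. absolute paths.
    return (st_a.st_dev, st_a.st_ino) == (st_b.st_dev, st_b.st_ino)
```

    This also catches hard links, which a comparison of canonical path
    names would miss.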

    > I know a user that would do such a
    > thing within a single command is sick, but still I would like to handle
    > it correctly for all possible cases.


    He might not know about the symlink or he might have written a script
    carelessly. The program should handle it sensibly.

    > : (2) write to infile.tmp in your program and call "mv infile.tmp infile"
    >
    > There is a gotcha here. If the tmp file is on another file system,
    > the rename will fail and I will have to then do a full copy. (I want
    > to handle all this with system calls and not calls to /usr/bin/mv).


    If you create the tempfile in /tmp, yes. But if it is in the same
    directory as the final file, how could they be on different file
    systems? Unless infile is a symlink and you want to overwrite/append to
    the file it points to - in that case you have to create the tempfile in
    the target's directory too.
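
    A minimal sketch of that approach, in Python for brevity: resolve
    symlinks first, create the mkstemp file in the directory of the real
    target, then rename it over the target. The names rewrite_in_place and
    transform, and the ".myprog-" prefix, are hypothetical. Note that
    mkstemp creates the file with mode 0600, so the original permissions
    are not preserved in this sketch:

```python
import os
import tempfile


def rewrite_in_place(path, transform):
    """Replace the contents of path with transform(old_contents)."""
    target = os.path.realpath(path)       # follow symlinks to the real file
    target_dir = os.path.dirname(target)  # temp file goes next to the target
    with open(target, "rb") as src:
        data = src.read()
    fd, tmp_path = tempfile.mkstemp(dir=target_dir, prefix=".myprog-")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(transform(data))
        # Same directory, hence same file system: the rename is atomic.
        os.replace(tmp_path, target)
    except BaseException:
        # On any failure, leave the original untouched and clean up.
        os.unlink(tmp_path)
        raise
```

    Because the input is only replaced after the output has been written
    completely, a failed run leaves infile as it was.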

    > See what I mean by there being a need for a convention that handles all
    > possible cases (overwrite input file, append versus overwrite output
    > file, and to be able to specify this on a file by file basis, etc.)?


    For some of the cases the present conventions are lacking, but I think
    few programs need to handle all those cases. Usually one can run the
    program a few times instead and use some pre- and postprocessing.

    If you want different options to be effective for different files on
    the command line, you could allow options to appear after files which
    shouldn't be affected by those options.

    --
    Lajos Parkatti


  7. Re: Option -o Conventions?

    "Michael Shell" wrote:

    > However, what is a good convention to use to allow the user to specify
    > that the input file is to be overwritten (via the internal use of a
    > mkstemp file) by the output?


    Stealing blatantly from Perl, how about

    -i.bak infile

    This would indicate that the input file would be renamed
    as "infile.bak" and the output file would be "infile". With
    options like this, you want to make it quite hard for a user
    to destroy his input file.
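
    A minimal sketch of the rename-to-backup behaviour described above, in
    Python for brevity. The function name replace_with_backup and its
    parameters are hypothetical; this is not how Perl implements -i
    internally, only the same visible effect:

```python
import os


def replace_with_backup(path, new_contents, suffix=".bak"):
    """Keep the old file as path + suffix and write new_contents to path."""
    backup = path + suffix
    os.rename(path, backup)  # the original survives as e.g. infile.bak
    try:
        with open(path, "wb") as out:
            out.write(new_contents)
    except BaseException:
        # Writing failed: put the original back under its old name.
        os.replace(backup, path)
        raise
    return backup
```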

    John





  8. Re: Option -o Conventions?


    Michael Shell wrote:

    > I've always thought that outfile specification in Unix utilities could
    > be better standardized or implemented ... somehow.
    >
    > For example, I am writing a little utility of the form:
    >
    > myprog -o outfile infile
    >
    > where the -o option is optional (as this output may not always be
    > wanted by the user). If outfile is "-", output is written to standard
    > output; if infile is "-", input is read from standard input.
    > So far, so good.


    This may be a minor point, but that's really not the way to do it.
    More appropriate is:
    myprog [-o outfile] [infile]
    where outfile and infile default to stdout and stdin, respectively.

    I find it really annoying to be required to specify "-" as a file name.
    Sometimes it is appropriate (e.g. diff), but if myprog is reading from
    a single file and writing to a single file, then invocation with no
    arguments should work on the standard streams. Maybe I'm
    reading your post incorrectly, but it sounds like you
    want me to type "myprog -o - -". If the user doesn't want
    the output, then let them redirect it to a bit-bucket.
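
    A minimal sketch of that interface using Python's argparse, where both
    arguments are optional and "-" is still accepted explicitly. The
    function names build_parser and open_streams are hypothetical:

```python
import argparse
import sys


def build_parser():
    """myprog [-o outfile] [infile], defaulting to the standard streams."""
    parser = argparse.ArgumentParser(prog="myprog")
    parser.add_argument("-o", dest="outfile", default="-",
                        help='output file (default: "-", standard output)')
    parser.add_argument("infile", nargs="?", default="-",
                        help='input file (default: "-", standard input)')
    return parser


def open_streams(args):
    """Map "-" to the standard streams, anything else to a real file."""
    src = sys.stdin if args.infile == "-" else open(args.infile)
    dst = sys.stdout if args.outfile == "-" else open(args.outfile, "w")
    return src, dst
```

    With this, "myprog < in > out", "myprog in" and "myprog -o out in" all
    work without the user ever having to type "-".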

    --
    Bill Pursell

