Setting up automatic delete - SGI



Thread: Setting up automatic delete

  1. Setting up automatic delete


    Hi

    We want to have a disk set up for automatic scheduled deletion, and just
    want to know exactly how we go about doing that. We had it set up before,
    but the system crashed, so we need to re-do it but have forgotten how.
    Basically, we want files on the disk with certain extensions that are
    older than, say, 3 months to be deleted automatically (periodically,
    maybe daily).
    Thanks

    Andrew


  2. Re: Setting up automatic delete

    On 2003-12-01, Andrew wrote:
    > We want to have a disk set up for automatic scheduled delete, and just
    > want to know how exactly we go about doing that. We had it setup before
    > but the system crashed, so just need to re-do it again but forgot how.
    > Basically we want the disk to have files with certain extension, that
    > are older than say 3months, to be deleted automatically (periodically
    > maybe say daily).
    > Thanks


    find / -mtime +90 -type f \( \
    -name "*.ext1" -o \
    -name "*.ext2" -o \
    ... \
    \) | xargs rm -f

    See crontab(1) to schedule the task daily:

    0 0 * * * find ...
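
    For reference, a complete entry might look like the following; /data and
    the extensions are placeholders for your own filesystem and file types:

```
# hypothetical entry: every night at midnight, purge matching files
# older than ~3 months (90 days) from /data
0 0 * * * find /data -type f -mtime +90 \( -name "*.ext1" -o -name "*.ext2" \) | xargs rm -f
```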

    --
    wave++ (also known, in some places, as "Yuri D'Elia") http://www.yuv.info/
    The email address is fake (thanks swen)! You know how to contact me anyway.

  3. Re: Setting up automatic delete

    In article ,
    wave++ wrote:
    |On 2003-12-01, Andrew wrote:
    |> Basically we want the disk to have files with certain extension, that
    |> are older than say 3months, to be deleted automatically (periodically
    |> maybe say daily).

    |find / -mtime +90 -type f \( \
    | -name "*.ext1" -o \
    | -name "*.ext2" -o \
    | ... \
    |\) | xargs rm -f


    You should not use xargs for that purpose, as the filenames might
    have spaces in them. Use the -exec action of find instead

    -exec /bin/rm '{}' ';'

    (quoting the {} and the semicolon is important, so the shell passes
    them through to find instead of interpreting them.)
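
    A minimal end-to-end sketch of the -exec form; the directory and
    extension are placeholders, and touch -t backdates one file so that
    -mtime +90 matches it:

```shell
#!/bin/sh
# one old file (mtime set to Jan 2020) and one fresh file
mkdir -p /tmp/exec_demo
touch -t 202001010000 /tmp/exec_demo/old.log
touch /tmp/exec_demo/new.log

# -exec hands each path to rm as a single argument, so spaces are safe;
# only the backdated file matches -mtime +90
find /tmp/exec_demo -type f -mtime +90 -name "*.log" -exec rm -f '{}' ';'
```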
    --
    Reviewers should be required to produce a certain number of
    negative reviews - like police given quotas for handing out
    speeding tickets. -- The Audio Anarchist

  4. Re: Setting up automatic delete

    On 2003-12-01, Walter Roberson wrote:
    >|> Basically we want the disk to have files with certain extension, that
    >|> are older than say 3months, to be deleted automatically (periodically
    >|> maybe say daily).
    >
    >|find / -mtime +90 -type f \( \
    >| -name "*.ext1" -o \
    >| -name "*.ext2" -o \
    >| ... \
    >|\) | xargs rm -f
    >
    > You should not use xargs for that purpose, as the filenames might
    > have spaces in them. Use the -exec action of find instead
    >
    > -exec /bin/rm '{}' ';'
    >
    > (quoting the {} and the semicolon is important.)


    Yes, I forgot about spaces, as I don't usually have to handle them.
    Using xargs, however, should be faster. If you have GNU utilities in place
    you could do this:

    find ... -print0 | xargs -0 rm -f
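
    A quick illustration of why the NUL delimiter matters; the directory
    and file name are made up, and the name deliberately contains a space:

```shell
#!/bin/sh
# with -print0 / -0 the space in the name is preserved as data,
# not treated as a word separator by xargs
mkdir -p /tmp/print0_demo
touch "/tmp/print0_demo/report 2003.tmp"
find /tmp/print0_demo -type f -name "*.tmp" -print0 | xargs -0 rm -f
```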

    --
    wave++ (also known, in some places, as "Yuri D'Elia") http://www.yuv.info/
    The email address is fake (thanks swen)! You know how to contact me anyway.

  5. Re: Setting up automatic delete

    In article , wave++ wrote:
    >
    > Yes, forgot about spaces as I don't have to handle them usually .
    > Using xargs however should be faster. If you have GNU utilities in place
    > you could do this:
    >
    > find ... -print0 | xargs -0 rm -f
    >


    % mkdir delete_etc
    % for i in `seq 1 10000` ; do touch delete_etc/$i; done
    % touch delete_etc/passwd
    (have fam monitor and notify when files are being deleted
    from this dir)
    % mv delete_etc delete_etc-
    % ln -s /etc delete_etc

    Whoops, where did my /etc/passwd go?


    -jf

  6. Re: Setting up automatic delete

    In article ,
    Jan-Frode Myklebust wrote:
    |In article , wave++ wrote:
    |> Yes, forgot about spaces as I don't have to handle them usually.
    |> Using xargs however should be faster. If you have GNU utilities in place
    |> you could do this:
    |> find ... -print0 | xargs -0 rm -f

    | % mkdir delete_etc
    | % for i in `seq 1 10000` ; do touch delete_etc/$i; done
    | % touch delete_etc/passwd
    | (have fam monitor and notify when files are being deleted
    | from this dir)
    | % mv delete_etc delete_etc-
    | % ln -s /etc delete_etc

    |Whoops, where did my /etc/passwd go?

    Took me a bit of time to understand what's going on in Jan-Frode's code.

    The idea is that you create a file and leave it sitting around long
    enough to be subject to the automatic cleaning. The 'find' process
    will discover the name of the file, but will not, in the 'xargs' form
    above, immediately delete the file; it will simply at first record
    the file name. Then, after the name has been recorded and the deletions
    start, you race the deletion section of the xargs and substitute
    a symbolic link at the -directory- level above. The rm is just fed
    the -name- of the file+directory, so it goes ahead and follows
    the symbolic link over to the new location (not having been told that
    it used to be a real directory before instead of a symlink),
    and what gets deleted is the system file, not the user file.

    It's a clever use of race conditions.

    The race condition is certainly made a lot easier by the
    find piped into xargs, but it would also happen if you used
    find -exec /bin/rm '{}' ';' -- it would just be a much, much shorter
    time window. With a modification of the above code, you could
    do the same thing for 10000 different directories instead of
    1 directory with 10000 files. The race only has to be won
    once over those 10000 times for there to be a Big Problem.

    What this tells us is that a good cleanup routine should
    somehow lock each file while it looks at its stats and determines
    whether to delete it. Of course, open()'ing the file would change
    its access times, which a good cleanup routine should not do,
    so you end up trading problems. I do not know offhand what
    a good alternative would be; I have an approach in mind, but would
    want to think further to see if there is a different way of exploiting
    race conditions.
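
    On systems with GNU findutils (not the stock IRIX find), one practical
    mitigation is the -delete action: it removes each file relative to the
    directory find is already traversing, rather than re-resolving a full
    path, which closes most of the substitution window that
    'find | xargs rm' leaves open. A minimal sketch with a throwaway
    directory:

```shell
#!/bin/sh
# GNU find's -delete unlinks relative to the traversed directory, so a
# symlink swapped in at a higher path level is not followed.
mkdir -p /tmp/scrub_demo
touch /tmp/scrub_demo/stale.dat
find /tmp/scrub_demo -type f -name "*.dat" -delete
```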
    --
    Look out, there are llamas!

  7. Re: Setting up automatic delete

    Walter Roberson wrote:
    > In article ,
    > Jan-Frode Myklebust wrote:
    >| % mv delete_etc delete_etc-
    >| % ln -s /etc delete_etc
    >
    >|Whoops, where did my /etc/passwd go?
    >
    > [...]
    >
    > What this tells us is that a good cleanup routine should
    > somehow lock each file while it looks at its stats and determines
    > whether to delete it. Of course, open()'ing the file would change
    > its access times, which a good cleanup routine should not do,
    > so you end up trading problems.

    Since find wouldn't follow symlinks, all you'd need to do is replace
    rm with a script which would check to see if any element in the path
    is a symlink and if it is, to unlink the symlink itself.

    This would reduce the window, but not remove it.

    I suppose you could have the script do the above check and, if it passes,
    create a hardlink to the file and then remove it, do the check again, and
    then either unlink the hardlinked file or rename it back to the original
    and remove the bogus symlink.

    Ivan
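
    The check Ivan describes could be sketched like this; 'saferm' is a
    made-up name, and as he notes this narrows the window rather than
    removing it:

```shell
#!/bin/sh
# saferm: refuse to delete a file if any directory component of its
# path is a symbolic link (hypothetical helper, not a standard tool)
saferm() {
    p=$1
    d=`dirname "$p"`
    while [ "$d" != "/" ] && [ "$d" != "." ]; do
        if [ -L "$d" ]; then
            echo "saferm: refusing, $d is a symlink" >&2
            return 1
        fi
        d=`dirname "$d"`
    done
    rm -f -- "$p"
}
```

    In practice this would live in a small script invoked as find's -exec
    target, since -exec cannot call a shell function directly.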

  8. Re: Setting up automatic delete

    I think a pretty safe cleanup could be something like:

    #! /bin/ksh -
    find /somedir -local -type f -mtime +30 -print | \
    while read victim
    do
        d=${victim%/*}     # same as: dirname "$victim"
        f=${victim##*/}    # same as: basename "$victim"
        cd "$d" || continue
        if [[ "`/sbin/pwd`" = "$d" ]]
        then
            rm -f -- "./$f"
        fi
    done

    i.e. find a file, go to its directory and check that pwd is
    what you expect it to be before deleting the file by its basename.

    But this will be _very_ slow. It's probably better to try building
    the 'tmpwatch' cleaner that comes with RedHat.


    -jf

  9. Re: Setting up automatic delete

    On 2003-12-03, Ivan Rayner wrote:
    >> The idea is that you create a file and leave it sitting around long
    >> enough to be subject to the automatic cleaning. [...] you race the
    >> deletion section of the xargs and substitute a symbolic link at the
    >> -directory- level above. [...] and what gets deleted is the system
    >> file, not the user file.

    A _true_ solution would consist of stat(2)-ing all files encountered
    during the recursive scan to record their inodes, and then verifying
    them when removing the files themselves. As a further optimization,
    you could verify inodes only on directories.

    The problem, however, could still persist if files were moved around
    during the scan (assuming you unlink files immediately, you may still
    have problems on large directory trees). You have to double-check that
    all files reside inside the root you specify.
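
    A shell sketch of that inode check (GNU stat syntax; the demo directory
    is made up): record the directory's device:inode at scan time and
    verify it just before the unlink. Since stat follows symlinks, a
    directory swapped for a symlink pointing elsewhere would show a
    different identity and the deletion would be skipped:

```shell
#!/bin/sh
# record the parent directory's identity at scan time...
mkdir -p /tmp/inode_demo
touch /tmp/inode_demo/victim
before=`stat -c %d:%i /tmp/inode_demo`

# ...and verify it has not been replaced just before deleting.
# (A small window between the check and the rm still remains.)
now=`stat -c %d:%i /tmp/inode_demo`
if [ "$before" = "$now" ]; then
    rm -f /tmp/inode_demo/victim
else
    echo "directory changed identity, skipping" >&2
fi
```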

    > Since find wouldn't follow symlinks, all you'd need to do is replace
    > rm with a script which would check to see if any element in the path
    > is a symlink and if it is, to unlink the symlink itself.
    >
    > This would reduce the window, but not remove it.
    >
    > I suppose you could have the script do the above check and if it passes
    > create a hardlink to the the file and then remove it, do the check again and
    > then either unlink the hardlinked file or rename it back to the original and
    > remove the bogus symlink.
    >
    > Ivan


    This case definitely deserves a dedicated tool.

    --
    wave++ (also known, in some places, as "Yuri D'Elia") http://www.yuv.info/
    The email address is fake (thanks swen)! You know how to contact me anyway.

  10. Re: Setting up automatic delete

    On Mon, 01 Dec 2003 12:22:51 -0500
    Andrew wrote:

    > We want to have a disk set up for automatic scheduled delete, and just
    > want to know how exactly we go about doing that. [...]
    >
    > Andrew



    Here are two slightly more complex bash functions that basically do the following:

    a) clean up your dumpsters, deleting all files which have not been accessed for four weeks
    b) move all trash files into the dumpsters
    c) wait some time before the next run is allowed (one month)
    d) once the waiting period has expired, start over with a)

    This script can either be run from the startup sequence, if the machine is not running 24/7 but is frequently turned on and off, or it can be executed by cron.

    function empty_dumpster
    {
        if [ -r /var/adm/locate.gz ]
        then
            DUMPSTERS=`echo /var/tmp; gzcat /var/adm/locate.gz | egrep '/dumpster$'`

            for i in $DUMPSTERS
            do
                find $i -atime +28 -exec rm -r {} \; -print | logger -t deleted -p local0.notice
            done
        fi

        trash_cleanup
    }

    function trash_cleanup
    {
        # each month a new flag
        flag=/var/adm/last_trash_cleanup.`date +%m`

        if [ ! -r $flag ]
        then
            echo do cleanup
            rm /var/adm/last_trash_cleanup.* 2>/dev/null
            touch $flag

            # quote the -name patterns so the shell cannot expand them,
            # and group the -o alternatives explicitly
            TRASH=`find / \( -name "*[!.]~" \
                -o -name core \
                -o -name Thumbs.db \
                -o -name DEADJOE \
                -o -name ",*" \
                -o -name "#*" \
                -o -name "-sbc-*" \)`

            for i in $TRASH
            do
                # encode the full path into the file name: /a/b/c -> =a=b=c
                j=`echo $i | sed 's/\//=/g'`

                # skip files that already sit in a dumpster
                if ! echo $i | grep dumpster > /dev/null
                then
                    if ! echo $j | grep "export=people" > /dev/null
                    then
                        mv $i /dumpster/$j
                        echo $i" -> /dumpster/"$j | logger -t cleanup -p local0.notice
                    else
                        user=`echo $j | sed 's/=export=people=//' | sed 's/=.*//'`
                        file=`echo $j | sed "s/=export=people=$user=//"`
                        dest=/export/people/$user/dumpster/$file
                        echo $i" -> "$dest | logger -t cleanup -p local0.notice
                        mv $i $dest
                    fi
                fi
            done
        fi
    }

    The function empty_dumpster() is the one that must be called; it first deletes all files from the dumpsters that have not been accessed for 4 weeks. The file locate.gz is generated the following way:

    cd / && find . | gzip -c > /var/adm/locate.gz

    This script (or rather the function 'trash_cleanup') will only execute once a month, using a flag file to indicate whether it has already run in the current month. The user directories (on my machine) reside under /export/people.
    First, 'find' collects all trash files and stores them in a variable, then the 'for' loop iterates over all of them. Each file is checked to see whether it is already in a dumpster; if it is, it is ignored, otherwise the complete original path of the file is encoded into the file name like this:

    /path/to/junk/file.foo -> =path=to=junk=file.foo

    Then the file is renamed and moved either into the user's dumpster or into root's dumpster, depending on who owned the file. This way you can always put an accidentally removed file back into its original location (and fix the trash-removal script). When a file has been moved into a dumpster, or finally nuked, you'll see it in the syslog.
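
    The encoding and its inverse are just a pair of sed substitutions; note
    that the leading '/' of an absolute path becomes a leading '=':

```shell
#!/bin/sh
# encode: every '/' becomes '='; decode reverses it
encoded=`echo "/path/to/junk/file.foo" | sed 's,/,=,g'`
decoded=`echo "$encoded" | sed 's,=,/,g'`
echo "$encoded"   # =path=to=junk=file.foo
echo "$decoded"   # /path/to/junk/file.foo
```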

    hope this helps.
