Crawl local filesystem site for output? - Linux

  1. Crawl local filesystem site for output?

    I have a large site with .html files in many places. I would like to
    find a script or product that would crawl the site root (as a local
    filesystem) and create a "pretty" text file showing the
    directory/folder structure of each file.

    Thanks.


  2. Re: Crawl local filesystem site for output?

    bradwiseathome@hotmail.com writes:

    >I have a large site with .html files in many places. I would like to
    >find a script or product that would crawl the site root (as a local
    >filesystem) and create a "pretty" text file showing the
    >directory/folder structure of each file.


    find . -type f -name \*.html -print

    Nick.
    --
    http://www.nick-andrew.net/ http://aus.news-admin.org/
    I prefer USENET replies. Don't send email copies. Drop the spamtrap to reply.

  3. Re: Crawl local filesystem site for output?

    Nick Andrew wrote:
    > bradwiseathome@hotmail.com writes:
    >
    > >I have a large site with .html files in many places. I would like to
    > >find a script or product that would crawl the site root (as a local
    > >filesystem) and create a "pretty" text file showing the
    > >directory/folder structure of each file.
    >
    > find . -type f -name \*.html -print


    ... and if you want that pretty, pass it through some filter that
    determines the common prefix each line of output has with its previous
    line, and clobbers it with spaces. E.g.

    1: /usr/local
    2: /usr/local/bin
    3: /usr/local/bin/bar
    4: /usr/local/bin/foo
    5: /usr/local/bin/xyzzy
    6: /usr/share
    7: /var

    Line 2 shares the prefix /usr/local with line 1, so that prefix is
    replaced with spaces. Line 3 shares /usr/local/bin with line 2, line 4
    shares /usr/local/bin/ (trailing slash included) with line 3, etc.

    1: /usr/local
    2:           /bin
    3:               /bar
    4:                foo
    5:                xyzzy
    6:      share
    7:  var
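
    A minimal Python sketch of that prefix-blanking filter, for anyone who
    wants to try it. The function name blank_common_prefix and the
    hard-coded paths list are made up for illustration; the comparison is
    character by character, and the input is assumed to be sorted already.

```python
import os


def blank_common_prefix(lines):
    """Overwrite the prefix each line shares with the previous line with spaces."""
    out = []
    prev = ""
    for line in lines:
        # commonprefix compares character by character, not per path component.
        n = len(os.path.commonprefix([prev, line]))
        out.append(" " * n + line[n:])
        prev = line
    return out


# Sample input: the sorted listing from the example above.
paths = [
    "/usr/local",
    "/usr/local/bin",
    "/usr/local/bin/bar",
    "/usr/local/bin/foo",
    "/usr/local/bin/xyzzy",
    "/usr/share",
    "/var",
]

blanked = blank_common_prefix(paths)
for row in blanked:
    print(row)
```

    With real data you would feed it the find output instead of the
    hard-coded list. find does not promise sorted output, so run it through
    sort first.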

    As an additional post-processing step, apply these rules until no rule
    can be applied any longer:
    1. In each line whose first non-whitespace character is /not/ a slash,
    replace the space immediately before that character with a slash.
    2. Whenever a line contains a slash, and the immediately preceding line
    has a space character in the same position as that slash, replace that
    space with a slash.

    Rule 1 could be rolled into the prefix removal logic: prefixes could be
    defined as excluding trailing slashes, so the slashes before foo,
    xyzzy, share and var in the above example remain undeleted.
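
    One way to sketch that variant in Python is to strip trailing slashes
    from the common prefix before blanking it. The name
    blank_prefix_keep_slash is hypothetical, and the comparison is still
    character by character on sorted input.

```python
import os


def blank_prefix_keep_slash(lines):
    """Blank the shared prefix, but leave the slash in front of each name."""
    out = []
    prev = ""
    for line in lines:
        # Drop trailing slashes from the prefix so they stay in the output.
        prefix = os.path.commonprefix([prev, line]).rstrip("/")
        n = len(prefix)
        out.append(" " * n + line[n:])
        prev = line
    return out


# Sample input: the same listing as in the running example.
paths = [
    "/usr/local",
    "/usr/local/bin",
    "/usr/local/bin/bar",
    "/usr/local/bin/foo",
    "/usr/local/bin/xyzzy",
    "/usr/share",
    "/var",
]

kept = blank_prefix_keep_slash(paths)
for row in kept:
    print(row)
```

    Sibling names that start with the same letters (say bar and baz) would
    still be split in the middle of the name; comparing whole path
    components instead of characters would avoid that.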

    The result of these rules on our continuing example is:

    1: /usr/local
    2: /   /     /bin
    3: /   /         /bar
    4: /   /         /foo
    5: /   /         /xyzzy
    6: /   /share
    7: /var

    Of course, these rules can be applied in a single pass if you process
    the lines in reverse. The columns of slashes grow until they encounter a
    non-blank.
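
    The reverse pass could look roughly like this in Python. It is a sketch
    only: fill_slashes and the hard-coded rows list are made-up names, and
    it assumes no file name contains a space. Rule 1 is applied to each
    line before rule 2, because a slash propagated into column 0 would
    otherwise hide the fact that a name such as foo still needs its own
    slash.

```python
def fill_slashes(rows):
    """Apply rules 1 and 2 in one bottom-to-top pass over prefix-blanked rows."""
    out = []
    below = ""  # the already finished line beneath the current one
    for row in reversed(rows):
        chars = list(row)
        # Rule 1: if the first non-blank is not a slash, put a slash
        # into the space immediately before it.
        first = len(row) - len(row.lstrip(" "))
        if 0 < first < len(chars) and chars[first] != "/":
            chars[first - 1] = "/"
        # Rule 2: a slash in the line below turns a space in the
        # same column of this line into a slash.
        for col, ch in enumerate(below):
            if ch == "/" and col < len(chars) and chars[col] == " ":
                chars[col] = "/"
        below = "".join(chars)
        out.append(below)
    out.reverse()
    return out


# Sample input: the running example after its common prefixes were blanked.
rows = [
    "/usr/local",
    " " * 10 + "/bin",
    " " * 14 + "/bar",
    " " * 15 + "foo",
    " " * 15 + "xyzzy",
    " " * 5 + "share",
    " " * 1 + "var",
]

tree = fill_slashes(rows)
for row in tree:
    print(row)
```

    Each line only needs the finished line below it, which is why one pass
    from the bottom up is enough.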

    Pretty enough for me, after a few beers.

