lynx to dump text file like using p and saving. - SCO



  1. lynx to dump text file like using p and saving.

    Hello,

    I have over 10,000 html files and I want to dump them to .txt files.

    I can do lynx html/file.html

    then enter "p" and choose save to file. What I want to do is do this at
    the command prompt. I have tried

    lynx -print html/file.html > txt/file.txt
    lynx -print -dump html/file.html > txt/file.txt
    lynx -crawl -dump html/file.html > txt/file.txt
    lynx -dump html/file.html > txt/file.txt

    Does anyone know the right syntax to use at the command prompt to dump an
    HTML file to a text file? I want to write a loop that dumps a text
    version of each HTML file.

    Thanks,

    --
    Boyd Gerber
    ZENEZ 1042 East Fort Union #135, Midvale Utah 84047

  2. Re: lynx to dump text file like using p and saving.

    On Sun, 10 Feb 2008 21:17:21 -0700, Boyd Lynn Gerber wrote:

    > Hello,
    >
    > I have over 10,000 html files and I want to dump them to .txt files.
    >
    > I can do lynx html/file.html
    >
    > then enter "p" and choose save to file. What I want to do is do this at
    > the command prompt. I have tried
    >
    > lynx -print html/file.html > txt/file.txt
    > lynx -print -dump html/file.html > txt/file.txt
    > lynx -crawl -dump html/file.html > txt/file.txt
    > lynx -dump html/file.html > txt/file.txt
    >
    > Does anyone know the right syntax to use at the command prompt to dump a
    > html file to a text file? I want to do a loop that will dump a text
    > version of the html file.
    >
    > Thanks,
    >


    The last option you wrote should work.
    Try it more simply first.

    lynx -dump html/file.html

    If you get text on your console, then try it again with a redirect:
    lynx -dump html/file.html > file.txt

    In your last example you redirect to txt/file.txt.

    Maybe the directory txt does not exist?

    Perhaps you could install html2text.
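
The missing-directory guess above can be ruled out in the script itself. This is only a minimal sketch, assuming a POSIX-style shell (ksh or bash rather than an old Bourne sh); the function name dump_one is made up for illustration and is not part of lynx.

```shell
# Hypothetical helper (not part of lynx): dump one HTML file to text,
# creating the output directory first.
dump_one() {
    src=$1
    dest=$2
    # The shell opens the redirect target before lynx runs, so a missing
    # directory makes the whole command fail before lynx is even started.
    mkdir -p "$(dirname "$dest")" || return 1
    lynx -dump "$src" > "$dest"
}
```

It would be called as: dump_one html/file.html txt/file.txt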

  3. Re: lynx to dump text file like using p and saving.

    On Feb 11, 1:05 am, jellybean stonerfish wrote:
    > On Sun, 10 Feb 2008 21:17:21 -0700, Boyd Lynn Gerber wrote:
    > > Does anyone know the right syntax to use at the command prompt to dump a
    > > html file to a text file? I want to do a loop that will dump a text
    > > version of the html file.
    > >
    > > Thanks,
    >
    > The last option you wrote should work.
    > Try it more simply first.
    >
    > lynx -dump html/file.html

    You may also want to add -nolist
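
For the loop the original poster asked about, something along these lines might do. It is a sketch under assumptions, not a tested recipe for SCO: it assumes a POSIX-style shell, a flat source directory of .html files, and the function name dump_all is invented here. The -dump and -nolist options are the ones mentioned in this thread.

```shell
# Hypothetical helper (not part of lynx): convert every .html file in
# one directory to a .txt file of the same base name in another.
dump_all() {
    src_dir=$1
    out_dir=$2
    mkdir -p "$out_dir" || return 1
    for f in "$src_dir"/*.html; do
        # If nothing matches, the pattern stays unexpanded; skip it.
        [ -f "$f" ] || continue
        name=$(basename "$f" .html)
        # -dump writes the rendered page to stdout; -nolist leaves out
        # the numbered list of link targets that lynx appends by default.
        lynx -dump -nolist "$f" > "$out_dir/$name.txt"
    done
}
```

It would be called as: dump_all html txt

Because the for loop expands the file list inside the shell, 10,000 files should not run into the argument-length limit that a single command line with that many names could hit.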

  4. Re: lynx to dump text file like using p and saving.

    On Feb 10, 11:17 pm, Boyd Lynn Gerber wrote:
    > Hello,
    >
    > I have over 10,000 html files and I want to dump them to .txt files.
    >
    > I can do lynx html/file.html
    >
    > then enter "p" and choose save to file. What I want to do is do this at
    > the command prompt. I have tried
    >
    > lynx -print html/file.html > txt/file.txt
    > lynx -print -dump html/file.html > txt/file.txt
    > lynx -crawl -dump html/file.html > txt/file.txt
    > lynx -dump html/file.html > txt/file.txt
    >
    > Does anyone know the right syntax to use at the command prompt to dump a
    > html file to a text file? I want to do a loop that will dump a text
    > version of the html file.


    html2ascii is available on OSR507 and OSR6 and can be scripted to do
    what you want. By default, it uses lynx to do the conversion.

  5. Re: lynx to dump text file like using p and saving.

    On Mon, 11 Feb 2008, Tony Lawrence wrote:
    > On Feb 11, 1:05 am, jellybean stonerfish wrote:
    > > On Sun, 10 Feb 2008 21:17:21 -0700, Boyd Lynn Gerber wrote:
    > > > Does anyone know the right syntax to use at the command prompt to dump a
    > > > html file to a text file? I want to do a loop that will dump a text
    > > > version of the html file.
    > >
    > > The last option you wrote should work.
    > > Try it more simply first.
    > >
    > > lynx -dump html/file.html
    >
    > You may also want to add -nolist

    The -nolist removed the html. I found that html2txt also worked. I had
    forgotten about it. Thanks to all.

    Thanks,


    --
    Boyd Gerber
    ZENEZ 1042 East Fort Union #135, Midvale Utah 84047

  6. Re: lynx to dump text file like using p and saving. (SOLVED)

    Thanks to all.

    The -nolist was the key.

    Thanks,

    --
    Boyd Gerber
    ZENEZ 1042 East Fort Union #135, Midvale Utah 84047
