Towards a GREP Utility for WS4 Files... The end? - CP/M


  1. Towards a GREP Utility for WS4 Files... The end?

    Towards a GREP utility for WS4 files... The end?

    A few days (weeks?) ago, I published a small program (WS4WORD)
    displaying the lines containing a given word inside a single
    file.

    I asked if anybody out there had a subroutine for searching
    ambiguous files.

    As usual, I got no answer.

    So, it was time to go back to the manual.

    It has been years since I last opened it. Normally, I write my
    programs as simply as possible, so they are portable to CP/M
    2.2, CP/M-86 and MS-DOS. One way to do it is by not opening the
    manual(s), so as not to be influenced by what they are saying.
    When you are fluent in a language, you know how to make yourself
    understood by others.

    Since I programmed my beloved Epson QX-10 under CP/M Plus for
    15 years, I had such a subroutine (in assembly language).

    However, I wanted something more portable, so I wanted something
    in a high-level language.

    A high-level language for the above-mentioned 3 Operating Systems
    means "BASIC".

    In my personal case, since discovering Mallard BASIC on the
    Amstrad PCW8256, I totally stopped using MBASIC v5.22 (even if
    it is me who provided v5.29 to the Retroarchive Web site...).

    Until then, since I was working on one single file, I simply
    typed "dir" in "Direct Mode" (that is to say: not in "Programmed
    Mode"), saw the files present, and chose one for running my
    program. That is to say: DIR was displaying the filenames on the
    screen... but how do you get BASIC to work on several files?

    After WS4WORD was running, I wanted to have a version of this
    program able to find a string inside several WS4 files. So, how
    to do it in high-level language for 3 different Operating
    Systems?

    So, back to the "Mallard BASIC Reference Manual". In the "Disc
    and Directory Operations" section, I found that "The following
    commands act on complete files or their names. Some of the
    commands have forms which are intended for use in Direct Mode,
    and which mimic the syntax of the equivalent operating system
    commands."

    Ok. This seems logical. I will only give the first line of the
    following paragraphs.

    The RESET command has two effects.
    The DIR and FILES commands produce a directory listing to the console.
    The DEL, ERA, and KILL commands may be used to delete files.
    The REN and NAME commands may be used to change the name of a file.
    The TYPE and DISPLAY commands read the given file and display it
    on the console.

    The problem being that those commands were sending their outputs
    to the console... But how could BASIC read the console? (And
    some people talk about Artificial Intelligence!) Fortunately,
    the last command looked interesting:

    The FIND$ function returns the full name of the first or Nth
    file whose name matches the given name.

    ??? What does it mean? Later, the manual says that the syntax
    is: FIND$ filespec, ordinal? What's that?

    As usual, some interactive tests should help us to understand.

    load"dir
    Ok
    list
    10 PRINT
    20 DIR *.ws4
    30 PRINT
    40 END
    Ok
    run

    AYRSWRIT.WS4 YAK1686 .WS4 IANDAY  .WS4 CAPTVOSS.WS4 LIFERET .WS4
    MONITOR .WS4 AYRSLIST.WS4 PRIDEJOY.WS4 AYRSTALK.WS4 EPICURE .WS4
    COROSION.WS4 ODYSSEUS.WS4 YAKHEAT .WS4 WS4FIND .WS4
    Ok

    (Those are the files about sailing that I published in a Yahoo
    group since October 2006. That's why I was so quiet, lately, on
    the comp.os.cpm Newsgroup.)

    Ok. Now, let us try the FIND$ function.

    load"find1
    Ok
    list
    10 PRINT
    20 found$ = FIND$ ("*.ws4")
    30 PRINT found$
    40 PRINT
    50 END
    Ok
    run

    AYRSWRIT.WS4

    Ok

    Well, obviously, FIND$ returns the first file that DIR
    displayed. Now, let us try the famous "ordinal". In order not to
    get the first file again, let us choose, say, the last one on the
    first line: the 5th file.

    load"find2
    Ok
    list
    10 PRINT
    20 found$ = FIND$ ("*.ws4", 5)
    30 PRINT found$
    40 PRINT
    50 END
    Ok
    run

    LIFERET .WS4

    Ok

    Wahoo! It worked! So, if I want to know all the WS4 files on a
    disk, I use the ordinal version of the FIND$ function, calling it
    with 1, 2, 3... up to the number of WS4 files present. Last
    thing to know: FIND$ returns an empty string if there is no such
    file, or no more files.

    Since, at the beginning, we have no idea how many WS4 files are
    going to be present, let us ask BASIC to do the counting for us.

    load"find3
    Ok
    list
    10 PRINT
    20 WHILE FIND$ ("*.ws4") <> ""
    30 ordinal = ordinal + 1
    40 found$ = FIND$ ("*.ws4", ordinal)
    50 IF found$ = "" THEN GOTO 80
    60 PRINT found$
    70 WEND
    80 PRINT
    90 END
    Ok
    run

    AYRSWRIT.WS4
    YAK1686 .WS4
    IANDAY  .WS4
    CAPTVOSS.WS4
    LIFERET .WS4
    MONITOR .WS4
    AYRSLIST.WS4
    PRIDEJOY.WS4
    AYRSTALK.WS4
    EPICURE .WS4
    COROSION.WS4
    ODYSSEUS.WS4
    YAKHEAT .WS4
    WS4FIND .WS4

    Ok
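
    The FIND3 loop above tests the first match on every pass and
    leaves through a GOTO. A minimal sketch of the same enumeration
    follows, relying only on the FIND$ behaviour described earlier (an
    empty string once there are no more files). The variable name
    mask$ is an illustrative choice, and this variant has not been run
    on the author's system:

    ```basic
    10 REM Sketch: list every file matching a mask with FIND$
    20 mask$ = "*.ws4"
    30 ordinal = 1
    40 found$ = FIND$ (mask$, ordinal)
    50 WHILE found$ <> ""
    60 PRINT found$
    70 ordinal = ordinal + 1
    80 found$ = FIND$ (mask$, ordinal)
    90 WEND
    100 PRINT ordinal - 1 ; "file(s) found."
    110 END
    ```

    Fetching the next name at the bottom of the loop lets the WHILE
    condition end the loop by itself, so the count of files is simply
    the last ordinal tried minus one.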

    It worked! All that remains to do is modify WS4WORD so that it
    will parse each file in turn, looking for a particular string.
    After a few massages, I got the following on my console (I did a
    lot of tests, but am only showing you a nice one, involving only
    2 files):

    load"ws4find
    Ok
    run

    WS4FIND> Enter WS4 File Name: ? yak*
    Enter word to find: it is

    before we make into Harwich, appeared to us at noon, for IT IS high land, and

    There is 1 time the word "IT IS" in file YAK1686 .WS4.

    arduous weather conditions of this sort, IT IS perhaps opportune for AYRS'
    gaseous constituents which are irritants. So, IT IS essential in the confines
    vapour at a particular temperature; IT IS then said to be saturated: IT IS at
    If further action is taken, IT IS suggested that it could well proceed in the

    There are 5 times the word "IT IS" in file YAKHEAT .WS4.

    (The above lines are 78-columns wide... This is normal for 25x80
    screens!)

    Conclusion: I finally have a utility helping me to find if a
    given string is present in my WS4 files. Since I have about 800
    WS4 files (compare that with the handful that can be found on
    the Internet!) on my DR-DOS 7.03 system, such a utility was much
    needed.

    Remark: "GREP" is a commonly-used expression to name a program
    doing the same thing but, in fact, it means "Get and REPlace".
    Since I don't care about replacing the target string, and just
    want to know if it is present in a file, and its context (hence
    the CONtext file generated for each "hit"), I chose to name it
    WS4FIND, rather than WS4GREP. In addition, it is using the FIND$
    function, so the name fits nicely.

    Now, all that remains is to translate it into assembly
    language... Maybe a good subject for Literate Programming?

    Yours Sincerely,
    Mr Emmanuel Roche


    Post-Scriptum:
    It would be funny if I was not showing you the program! But I am
    too kind:

    list
    10 REM WS4FIND.BAS by Emmanuel ROCHE
    20 :
    30 PRINT
    40 INPUT "WS4FIND> Enter WS4 File Name: " ; file$
    50 LINE INPUT " Enter word to find: " ; word$
    60 PRINT
    70 WHILE FIND$ ("*.WS4") <> ""
    80 found$ = FIND$ (file$ + ".WS4")
    90 IF found$ = "" THEN GOTO 180
    100 ordinal = ordinal + 1
    110 file1$ = FIND$ (file$ + ".WS4", ordinal)
    120 file2$ = LEFT$ (file1$, 8) + ".CON" ' Short for CONtext...
    130 IF file1$ = "" THEN GOTO 160
    140 GOSUB 200
    150 WEND
    160 END
    170 :
    180 PRINT CHR$ (7) "File not found." : PRINT : END
    190 :
    200 OPEN "R", #1, file1$, 1
    210 FIELD #1, 1 AS byte$
    220 OPEN "O", #2, file2$
    230 :
    240 ln = 0
    250 total = 0
    260 ' Trick if we use a WHILE NOT EOF...
    270 GET #1
    280 GOSUB 580
    290 :
    300 WHILE NOT EOF (1)
    310 GET #1
    320 IF ASC (byte$) = &H1A THEN GOSUB 880 : CLOSE : RETURN
    330 GOSUB 580
    340 WEND
    350 RETURN
    360 :
    370 ' Echo ASCII char to screen?
    380 line$ = line$ + STRIP$ (byte$)
    390 RETURN
    400 :
    410 ' Echo char > ASCII to screen?
    420 GET #1
    430 line$ = line$ + byte$
    440 GET #1
    450 RETURN
    460 :
    470 ' Get rid of WS4 internal commands.
    480 i$ = CHR$(9)+CHR$(10)+CHR$(13)+CHR$(27)+CHR$(155)
    490 i = INSTR (i$, byte$)
    500 REM Bytes: 09 0A 0D 1B 9B
    510 ON i GOSUB 380, 380, 380, 420, 420
    520 IF ASC (byte$) = &H82 THEN RETURN
    530 IF ASC (byte$) > &H1F THEN GOSUB 380
    540 IF ASC (byte$) = &HA THEN GOSUB 740 : line$ = ""
    550 IF ASC (byte$) = &H1A THEN CLOSE : RETURN
    560 RETURN
    570 :
    580 IF byte$ = "." THEN GOTO 680
    590 ' WS4 text
    600 GOSUB 480
    610 WHILE ASC (byte$) <> &HA
    620 GET #1
    630 GOSUB 480
    640 IF ASC (byte$) = &H8A THEN RETURN
    650 WEND
    660 :
    670 ' Dot commands
    680 WHILE ASC (byte$) <> &HA
    690 GET #1
    700 IF ASC (byte$) = &H8A THEN RETURN
    710 WEND
    720 RETURN
    730 :
    740 ' Find word
    750 IF INSTR (line$, word$) = 0 THEN RETURN ELSE found = 1
    760 IF INSTR (line$, word$) <> 0 THEN found = 1
    770 IF INSTR (found, line$, word$) <> 0 THEN total = total + 1 ELSE GOTO 810
    780 found = INSTR (found, line$, word$) + LEN (word$)
    790 line$ = LEFT$ (line$, found - LEN (word$) - 1) + UPPER$ (word$) + MID$ (line$, found, 255)
    800 GOTO 770
    810 PRINT line$ ;
    820 PRINT #2, line$ ;
    830 ln = ln + 1
    840 IF ln = 24 THEN PRINT : PRINT "Press ENTER to Continue " ; : WHILE INKEY$ = "" : WEND : PRINT : PRINT : ln = 0
    850 RETURN
    860 :
    870 ' Display statistics
    880 IF total = 0 THEN GOTO 900
    890 PRINT
    900 PRINT "There " ;
    910 IF total = 1 THEN PRINT "is" ; ELSE PRINT "are" ;
    920 PRINT total "time" ;
    930 IF total <> 1 THEN PRINT "s" ;
    940 PRINT " the word " CHR$ (34) UPPER$ (word$) CHR$ (34) ;
    950 PRINT " in file " UPPER$ (file1$) "."
    960 PRINT
    970 RETURN
    Ok
    system
    A>That's all, folks!

    (Lines 790 and 840 were longer than the width of my screen.)
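
    The counting step in lines 750 to 800 of the listing is easier to
    follow in isolation. The sketch below is a simplified variant, not
    the listing's own code: it counts every occurrence of word$ in
    line$ by restarting INSTR just past each match. The sample text
    and the variable name p are illustrative assumptions, and the
    search is case-sensitive, as in the listing:

    ```basic
    10 REM Sketch: count every occurrence of word$ in line$
    20 line$ = "it is what it is"
    30 word$ = "it is"
    40 total = 0
    50 p = INSTR (line$, word$)
    60 WHILE p <> 0
    70 total = total + 1
    80 REM Resume the search just after the match found at p
    90 p = INSTR (p + LEN (word$), line$, word$)
    100 WEND
    110 PRINT total
    120 END
    ```

    WS4FIND additionally replaces each match with UPPER$ (word$) so
    that the hits stand out in the displayed line. This sketch leaves
    the line unchanged.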


    EOF


  2. Re: Towards a GREP Utility for WS4 Files... The end?

    --{ roche182@laposte.net plopped this: }--

    > Remark: "GREP" is a commonly-used expression to name a program
    > doing the same thing but, in fact, it means "Get and REPlace".



    No... Grep means "Go Regular Expression, Print".


    --
    http://tontonth.free.fr/libsound77.html

  3. Re: Towards a GREP Utility for WS4 Files... The end?

    On Wed, 23 Jan 2008 01:06:34 +0100, "Thierry B."
    wrote:

    >--{ roche182@laposte.net plopped this: }--
    >
    >> Remark: "GREP" is a commonly-used expression to name a program
    >> doing the same thing but, in fact, it means "Get and REPlace".

    >
    >
    > No... Grep means "Go Regular Expression, Print".


    Back in my minimainframe days GREP was Global Regular Expression
    PARSER.


    Allison

  4. Re: Towards a GREP Utility for WS4 Files... The end?

    --{ no.spam@no.uce.bellatlantic.net plopped this: }--

    >>> Remark: "GREP" is a commonly-used expression to name a program
    >>> doing the same thing but, in fact, it means "Get and REPlace".

    >>
    >>
    >> No... Grep means "Go Regular Expression, Print".

    >
    > Back in my minimainframe days GREP was Global Regular Expression
    > PARSER.


    Just curious: what is a "Global Regular Expression Parser",
    and what kind of mainframe ?


    --
    Imagine your dilemma if, while looking at 2 pretty screenshots, someone
    said to you: So, "XP vs VISTA", which do you choose?

  5. Re: Towards a GREP Utility for WS4 Files... The end?

    On Wed, 23 Jan 2008 10:25:34 +0100, "Thierry B."
    wrote:

    >--{ no.spam@no.uce.bellatlantic.net plopped this: }--
    >
    >>>> Remark: "GREP" is a commonly-used expression to name a program
    >>>> doing the same thing but, in fact, it means "Get and REPlace".
    >>>
    >>>
    >>> No... Grep means "Go Regular Expression, Print".

    >>
    >> Back in my minimainframe days GREP was Global Regular Expression
    >> PARSER.

    >
    > Just curious: what is a "Global Regular Expression Parser",
    > and what kind of mainframe ?


    PDP11 running Unix V6, microVAX2000 running Ultrix4.2, VAX11/780
    running some flavor of unix (1980).

    Allison

  6. Re: Towards a GREP Utility for WS4 Files... The end?

    no.spam@no.uce.bellatlantic.net wrote:
    > On Wed, 23 Jan 2008 01:06:34 +0100, "Thierry B."

    (snip)

    >> No... Grep means "Go Regular Expression, Print".


    > Back in my minimainframe days GREP was Global Regular Expression
    > PARSER.


    Average the two for global regular expression print.

    I believe it came from ed or ex, the command:

    g/re/p

    where re is any regular expression, g is the global command
    which will execute the following command for all matching lines.
    p is the print command.

    In vi you can execute ex commands following a colon. I found

    :g/re/d

    useful some years ago to delete all lines matching the regular
    expression, or

    :v/re/d

    to delete all lines not matching /re/

    -- glen

