CP/M spell checker source? - CP/M
CP/M spell checker source?
I am looking at writing a spell checker in Perl and was wondering,
since the English language has not changed much in 30 years, if there
are any CP/M spell checkers out there with source I could look at for
code clues / algorithms, since any CP/M code had to be (or should have
been) fast and compact.
Bill H
-
Re: CP/M spell checker source?
On 2007-09-23, Bill H wrote:
> I am looking at writing a spell checker in Perl and was wondering,
> since the English language has not changed much in 30 years, if there
> are any CP/M spell checkers out there with source I could look at for
> code clues / algorithms, since any CP/M code had to be (or should have
> been) fast and compact.
Don't know of any CP/M ones with source, but there was an
early Unix spellchecker that calculated the frequencies of
three-character sequences and pointed out places that had rare
sequences. I've always thought that would be interesting to play with,
but never have.
I forget where I read about it, though...
--
roger ivie
rivie@ridgenet.net
-
Re: CP/M spell checker source?
Roger Ivie wrote:
> Bill H wrote:
>
>> I am looking at writing a spell checker in Perl and was wondering,
>> since the English language has not changed much in 30 years, if
>> there are any CP/M spell checkers out there with source I could
>> look at for code clues / algorithms, since any CP/M code had to be
>> (or should have been) fast and compact.
>
> Don't know of any CP/M ones with source, but there was an early
> Unix spellchecker that calculated the frequencies of
> three-character sequences and pointed out places that had rare
> sequences. I've always thought that would be interesting to play
> with, but never have.
To give you an idea of the problems, the 'words' file on my
(unixlike) system is 3776 kbytes long. This contains the correctly
spelled words against which to compare, in alphabetical order. A
CP/M spell checker will use a smaller word list, probably
compressed, but the elementary method is to compare each word
against this file.
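
A minimal sketch of that elementary method, assuming the word list is plain text with one word per line. Because the list is kept in alphabetical order, a binary search can replace a linear scan. The function names `load_words` and `is_known` are made up for this illustration.

```python
from bisect import bisect_left

def load_words(lines):
    """Build a sorted, lowercased, duplicate-free word list from an iterable of lines."""
    return sorted({line.strip().lower() for line in lines if line.strip()})

def is_known(word, words):
    """Binary-search the sorted list; True if the word is present."""
    w = word.lower()
    i = bisect_left(words, w)
    return i < len(words) and words[i] == w

if __name__ == "__main__":
    # Tiny inline list for illustration; a real run would read a words file.
    words = load_words(["banana\n", "Apple\n", "cherry\n"])
    for candidate in ("apple", "appel"):
        print(candidate, "ok" if is_known(candidate, words) else "not found")
```

On a Unix-like system the list could be loaded with something like `load_words(open("/usr/share/dict/words"))`, though that path is an assumption and varies between systems.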
--
Chuck F (cbfalconer at maineline dot net)
Available for consulting/temporary embedded and systems.
-
Re: CP/M spell checker source?
On 2007-09-23, CBFalconer wrote:
> Roger Ivie wrote:
>>
>> Don't know of any CP/M ones with source, but there was an early
>> Unix spellchecker that calculated the frequencies of
>> three-character sequences and pointed out places that had rare
>> sequences. I've always thought that would be interesting to play
>> with, but never have.
>
> To give you an idea of the problems, the 'words' file on my
> (unixlike) system is 3776 kbytes long. This contains the correctly
> spelled words against which to compare, in alphabetical order.
This is precisely why I was intrigued by the little Unix spell
checker I talked about above. It doesn't have a dictionary, but
just points out words that contain infrequent letter sequences.
It was allegedly pretty good at catching typos.
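
A rough sketch of the idea, not the original Unix program: count three-character sequences in some reference text, then flag words that contain a sequence seen rarely or never. The padding with spaces, the `threshold` parameter, and all function names are assumptions made for this illustration.

```python
import re
from collections import Counter

WORD_RE = re.compile(r"[A-Za-z]+")

def trigrams(word):
    """Return the three-character sequences of a word, padded with one space each side."""
    padded = f" {word.lower()} "
    return [padded[i:i + 3] for i in range(len(padded) - 2)]

def build_counts(reference_text):
    """Count how often each trigram occurs in the reference text."""
    counts = Counter()
    for word in WORD_RE.findall(reference_text):
        counts.update(trigrams(word))
    return counts

def suspicious_words(text, counts, threshold=1):
    """Return words containing at least one trigram seen `threshold` times or fewer."""
    return [
        word
        for word in WORD_RE.findall(text)
        if any(counts[t] <= threshold for t in trigrams(word))
    ]

if __name__ == "__main__":
    counts = build_counts("the the the cat cat cat")
    print(suspicious_words("the thx cat", counts))
```

With a reference text this small almost everything looks rare; the approach only becomes useful when the counts come from a large body of correctly spelled text.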
--
roger ivie
rivie@ridgenet.net
-
Re: CP/M spell checker source?
On Sun, 23 Sep 2007 18:55:01 -0400, CBFalconer
wrote:
>Roger Ivie wrote:
>> Bill H wrote:
>>
>>> I am looking at writing a spell checker in Perl and was wondering,
>>> since the English language has not changed much in 30 years, if
>>> there are any CP/M spell checkers out there with source I could
>>> look at for code clues / algorithms, since any CP/M code had to be
>>> (or should have been) fast and compact.
>>
>> Don't know of any CP/M ones with source, but there was an early
>> Unix spellchecker that calculated the frequencies of
>> three-character sequences and pointed out places that had rare
>> sequences. I've always thought that would be interesting to play
>> with, but never have.
>
>To give you an idea of the problems, the 'words' file on my
>(unixlike) system is 3776 kbytes long. This contains the correctly
>spelled words against which to compare, in alphabetical order. A
>CP/M spell checker will use a smaller word list, probably
>compressed, but the elementary method is to compare each word
>against this file.
This is the common CP/M scheme with a few tweaks.
The dictionary is usually around 30,000 words, though I've seen
a few that hit 50,000. There is some compression as well as the
ability to add a user list of new words. Most of the software
I've encountered knows how to deal with simple modifiers like
plurals and simple prefixes on base words. This is done to keep the
dictionary small.
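
A hedged sketch of how plurals and simple prefixes might be handled against a small base dictionary. The word set, prefix list, and suffix rules below are illustrative assumptions, not taken from any particular CP/M product.

```python
# Illustrative base dictionary; a real one would hold tens of thousands of words.
BASE_WORDS = {"check", "spell", "box", "city", "do", "happy"}

# Hypothetical affix tables: prefixes are stripped, suffixes are replaced.
PREFIXES = ("un", "re")
SUFFIX_RULES = (("ies", "y"), ("es", ""), ("s", ""))

def candidates(word):
    """Yield the word itself, then each form with one suffix rule undone."""
    yield word
    for suffix, replacement in SUFFIX_RULES:
        if word.endswith(suffix) and len(word) > len(suffix):
            yield word[:-len(suffix)] + replacement

def accepts(word, base_words=BASE_WORDS):
    """True if the word, minus an optional prefix and suffix, is a base word."""
    w = word.lower()
    stems = [w] + [w[len(p):] for p in PREFIXES
                   if w.startswith(p) and len(w) > len(p)]
    return any(c in base_words for stem in stems for c in candidates(stem))

if __name__ == "__main__":
    for word in ("checks", "boxes", "cities", "respell", "unhappy", "chekc"):
        print(word, accepts(word))
```

The trade-off is that the rules also accept non-words such as "uncity", which is one reason the real programs kept the rule set small.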
It is possible to do a rules-based program (i before e except...)
but a dictionary is still needed for the exceptions.
I'd be checking the various archives for freeware source code
to build on.
Allison
-
Re: CP/M spell checker source?
On Sep 23, 11:54 am, Bill H wrote:
> I am looking at writing a spell checker in Perl and was wondering,
> since the English language has not changed much in 30 years, if there
> are any CP/M spell checkers out there with source I could look at for
> code clues / algorithms, since any CP/M code had to be (or should have
> been) fast and compact.
>
> Bill H
Bill -
Not that this matters much but I would disagree with your claim that
the "English language has not changed much in the last 30 years."
In Dr. Dobb's Journal, Number 66, April 1982 Alan Bomberger wrote "A
Poor Person's Spelling Checker" on pages 42-53. The article contains
source code in Z80 assembler. I have a paper copy of this article. If
you are interested, I could scan it and send it to you. A CP/M library
file which has documentation and executables is at
http://www.retroarchive.org/cpm/cdro.../I/PMSPELL.LBR. The
source code for the spell checker is NOT in this library. If you were
really interested in getting the source code to the spell checker
you'd have to run OCR software against the article and then do the
usual edits to get it to assemble.
- Lee Bradley
-
Re: CP/M spell checker source?
"Bill H" wrote in message
news:1190562895.508392.21200@w3g2000hsg.googlegroups.com...
> I am looking at writing a spell checker in Perl and was wondering,
> since the English language has not changed much in 30 years, if there
> are any CP/M spell checkers out there with source I could look at for
> code clues / algorithms, since any CP/M code had to be (or should have
> been) fast and compact.
>
> Bill H
>
One possible lead to research comes from my previous mention of John
Dvorak's article "What Ever Happened To CBASIC?" at
http://www.dvorak.org/blog/?page_id=8221
A brief citation reads: "... Eubanks went on to form C&E software with
Dennis Coleman who had made a name for himself by developing the first
modern Spell Checker. They were to develop the Q&A integrated word
processor/database program. "
-
Re: CP/M spell checker source?
On Sep 23, 11:54 am, Bill H wrote:
> I am looking at writing a spell checker in Perl and was wondering,
> since the English language has not changed much in 30 years, if there
> are any CP/M spell checkers out there with source I could look at for
> code clues / algorithms, since any CP/M code had to be (or should have
> been) fast and compact.
>
> Bill H
It seems those who have replied are more interested in this topic than
the original poster!
I was able to locate the source code to Alan Bomberger's "A Poor
Person's Spelling Checker." I've also scanned the Dr. Dobb's article.
If anyone is interested, please see
http://primepuzzle.com/ppspell/A_Poo...ng_Checker.pdf
and
http://primepuzzle.com/ppspell/SPELL11A.MAC
The dictionary file(s) this speller uses can be compacted. I was able
to contact Alan Bomberger. When I asked him if he had the source to
the compactor, he replied:
"Sounds like a good exercise for the reader. Heck a 20 line
C program would do that. I am somewhat surprised to see SPELL
distributed on the Walnut Creek disk. I am surprised that you
still have one. I actually think I know where I have one. Don't
these old bits ever die?
Having lowered the priority of locating and restoring this software
considerably you should head to a C manual and not wait for me."
He's right. I've written pseudocode so far. I'll be back w/ a working
C program in a bit.
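
The thread does not describe the dictionary format Bomberger's speller uses, so the following is not his compactor. It is a sketch of front coding, one common way to compact a sorted word list: each word is stored as the number of leading characters it shares with the previous word plus the remaining suffix. The function names are made up for this illustration.

```python
def front_encode(sorted_words):
    """Encode each word as (shared prefix length, remaining suffix)."""
    encoded = []
    prev = ""
    for word in sorted_words:
        n = 0
        limit = min(len(prev), len(word))
        # Count how many leading characters match the previous word.
        while n < limit and prev[n] == word[n]:
            n += 1
        encoded.append((n, word[n:]))
        prev = word
    return encoded

def front_decode(encoded):
    """Rebuild the word list from (shared prefix length, suffix) pairs."""
    words = []
    prev = ""
    for n, suffix in encoded:
        word = prev[:n] + suffix
        words.append(word)
        prev = word
    return words

if __name__ == "__main__":
    words = ["spell", "spelled", "speller", "spelling", "spend"]
    packed = front_encode(words)
    print(packed)
    print(front_decode(packed) == words)
```

The saving comes from neighbouring words in an alphabetical list sharing long prefixes; the list has to be decoded sequentially, so lookups usually work on blocks with an uncompressed first word.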
-
Re: CP/M spell checker source?
On Oct 5, 12:15 pm, Lee wrote:
> It seems those who have replied are more interested in this topic than
> the original poster!
>
> I was able to locate the source code to Alan Bomberger's "A Poor
> Person's Spelling Checker." I've also scanned the
> Dr. Dobb's article. If anyone is interested, please see
>
> http://primepuzzle.com/ppspell/A_Poo...ng_Checker.pdf
>
> and
>
> http://primepuzzle.com/ppspell/SPELL11A.MAC
>
> The dictionary file(s) this speller uses can be compacted. I was able
> to contact Alan Bomberger. When I asked him if he had the source to
> the compactor, he replied:
>
> "Sounds like a good exercise for the reader. Heck a 20 line
> C program would do that. I am somewhat surprised to see SPELL
> distributed on the Walnut Creek disk. I am surprised that you
> still have one. I actually think I know where I have one. Don't
> these old bits ever die?
>
> Having lowered the priority of locating and restoring this software
> considerably you should head to a C manual and not wait for me."
>
> He's right. I've written pseudocode so far. I'll be back w/ a working
> C program in a bit.
No, I am very interested in it; I haven't begun coding yet, that's all. A
few of the methods I found that are used now involve converting a word
into a phonetic key and then checking to see what other words have the
same key, and if the word isn't in that list, showing the other
possibilities (explained at: http://author.handalak.com/archives/042003/000078 ).
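
As one concrete instance of a phonetic key, here is a sketch using American Soundex. The linked article may describe a different key, so treat the choice of Soundex as an assumption; `suggest` and the sample word list are made up for this illustration.

```python
# Soundex digit for each consonant; vowels, h, w and y have no digit.
CODES = {}
for letters, digit in (("bfpv", "1"), ("cgjkqsxz", "2"), ("dt", "3"),
                       ("l", "4"), ("mn", "5"), ("r", "6")):
    for ch in letters:
        CODES[ch] = digit

def soundex(word):
    """Return the four-character Soundex key, or '' if the word has no letters."""
    letters = [c for c in word.lower() if c.isalpha()]
    if not letters:
        return ""
    key = letters[0].upper()
    prev = CODES.get(letters[0], "")
    for ch in letters[1:]:
        if ch in "hw":
            continue  # h and w do not separate letters with the same code
        code = CODES.get(ch, "")
        if code and code != prev:
            key += code
        prev = code  # a vowel resets prev, so a repeated code counts again
    return (key + "000")[:4]

def suggest(word, dictionary):
    """Return dictionary words that share the word's phonetic key."""
    key = soundex(word)
    return sorted(w for w in dictionary if soundex(w) == key and w != word.lower())

if __name__ == "__main__":
    words = {"spelling", "spoiling", "checker"}
    if "speling" not in words:
        print(suggest("speling", words))
```

A real checker would precompute a table from key to words once instead of rescanning the dictionary for every lookup.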
Bill H
-
Re: CP/M spell checker source?
On Oct 5, 6:02 pm, Bill H wrote:
> No, I am very interested in it; I haven't begun coding yet, that's all. A
> few of the methods I found that are used now involve converting a word
> into a phonetic key and then checking to see what other words have the
> same key, and if the word isn't in that list, showing the other
> possibilities (explained at: http://author.handalak.com/archives/042003/000078 ).
>
> Bill H
Glad to hear it! Good luck w/ phonetic key etc. research. I'd like to
see what you finally come up with. As I mentioned to you "personally,"
another spell checker was written for CP/M in the early 80's by a guy
named Michael Adler. Much more sophisticated than Alan Bomberger's,
this one supports a highly compressed dictionary.
http://primepuzzle.com/ppspell/SPELL21.LBR
http://primepuzzle.com/ppspell/SPELL21X.LBR
As promised, I've written and tested a compactor for the dictionary
files used by Bomberger's speller. It was not that easy to write!
http://primepuzzle.com/ppspell/CMPRLEX1.C
Executables and source for all files related to "A Poor Person's
Spelling Checker" are in
http://primepuzzle.com/ppspell/PPSPELL.LBR