
[seul-edu] [Fwd: "If you know of such a lexical analysis program, ..."]



"Brown, Rodney" wrote:

> S. Barret Dolph posed an interesting problem:
>
>       I would like to be able to check some books for root words.
>
>       Example.... look for occurrences of ped, cor, cit, in
> swift.txt
>
>       Would it be possible with Perl, or anything else, to find
> the occurrences of a list
> of words, show where those words are, and save this task to a
> file?
>
> While not a direct solution, I believe the stemmer in the GPL
> program mg
> <http://www.cs.mu.oz.au/mg/> as described in
> "The second edition of Managing Gigabytes: Compressing and
> Indexing Documents and Images
> by Ian H. Witten, Alistair Moffat, and Timothy C. Bell, is now
> available (May 1999),
> published by Morgan Kaufmann Publishing, San Francisco, ISBN
> 1-55860-570-3."
> may be a basis for what you want to do. The indexing works on the
> stemmed words, so it could
> go part of the way. I have a copy of mg-1.3f from somewhere
> (possibly New Zealand) too,
> so the mg-1.2.1 source linked off the page may not be the latest
> available.
>
> While I haven't gone looking, I thought tools for generating
> concordances etc. had been
> around for ages... You may get more help from the Information
> Retrieval community
> (ACM SIGIR for example).
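
For the simple case in the original question (a short list of roots, one
text file), a rough sketch in Python may be enough before reaching for a
full stemmer. It is not the mg stemmer and does no real morphological
analysis: it only treats a root as a substring of a word. The function
names find_roots and write_report are made up for this illustration.

```python
import re
from collections import defaultdict


def find_roots(text, roots):
    """Map each root to a list of (line_number, word) pairs.

    A word counts as a hit when the root appears anywhere inside it,
    ignoring case. This is plain substring matching, not stemming.
    """
    hits = defaultdict(list)
    for line_no, line in enumerate(text.splitlines(), start=1):
        for word in re.findall(r"[A-Za-z]+", line):
            lowered = word.lower()
            for root in roots:
                if root in lowered:
                    hits[root].append((line_no, word))
    return dict(hits)


def write_report(hits, path):
    """Save the hits to a text file, grouped by root."""
    with open(path, "w", encoding="utf-8") as out:
        for root in sorted(hits):
            out.write(f"{root}: {len(hits[root])} occurrence(s)\n")
            for line_no, word in hits[root]:
                out.write(f"  line {line_no}: {word}\n")
```

To run it on the book from the question, one could read swift.txt into a
string, call find_roots(text, ["ped", "cor", "cit"]), and pass the result
to write_report along with an output file name. Substring matching will
also catch words that only happen to contain the letters (for example
"cit" inside "explicit"), so the report still needs a human eye.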

--
Doug Loss                 God is a comedian playing
Data Network Coordinator  to an audience too afraid
Bloomsburg University     to laugh.
dloss@bloomu.edu                Voltaire