
Re: FW: FW: [kidsgames] word familiarity



>From: Brian Thompson [mailto:briandthompson@home.com]
>Sent: Wednesday, February 23, 2000 7:27 PM
>To: sjbaker1@airmail.net
>Subject: RE: FW: [kidsgames] word familiarity
>
>
>   By necessity I think we have to go by entries by definition, if you use
>wordnet for example the result is in instances of the word found.  I think
>that to be useful this idea should be implemented.  We are essentially then
>talking about billions of possible entries, through multiple definitions and
>translation as well, but only a few word files or picture files (well
>perhaps we could allow people to link in their own picture files) necessary
>because the words, except for translation of course, will all sound the same
>for multiple definitions.

Differing pronunciations like 'bow' are very rare, but it is not uncommon
for nouns to be pronounced differently from verbs (e.g., abuse).  Still,
the list of exceptions is small enough that they can be recorded separately.
I still don't see the point of recording the words, but that's another
discussion.

>  I agree entirely that some of the usability such
>as picture or 3d models is not necessary early on.  I have already thought
>of what the translation entry page would look like.  Words in different
>languages with the same definitions would have the same entry numbers in
>their own wordtable, making the words easier to translate on the fly: you
>just look up the word sound file in each table and play both.
>   I think the first version should be very basic, database structure and
>forms for most of the data entry complete.  I think the first words should
>be colors, shapes, basic animals (dog cat).  I think this will be a good
>start and will show us where we need to improve the database.  I realize
>that with wordnet, associations to other words such as hypernyms and
>hyponyms are used, but at first we should get the basics only.  Then, as
>words are added, we can link them and continue to grow in functionality as
>well as size.  Since we have paradigms, it will be easier to link words into
>realms for educators at all levels, and it will only take a drop-down menu
>for them to add this to the records.
>
>												Brian
>>
>> When you have multiple languages, how do you deal with the fact
>> that a word in one language may have multiple meanings ("Bow" -
>> something you shoot an arrow with, the sharp end of a ship or
>> respectfully bending at the waist, a fancy knot...etc) - but
>> those multiple meanings are very different in the other
>> language - there is no one word in French that means all of
>> those things.  What's worse, the French word 'noeud'
>> (which is the word for 'bow' in the context of something you'd
>> put on the wrapping of a gift) also means 'knot' - just any
>> old knot - the shape a snake makes when it coils up - and a
>> collection of gem stones on an item of jewellery...so there
>> isn't a 1:1 relationship between words in one language and
>> words in another - it's a mess!
>>
>> Seems like you suddenly need the 'records' in your database to be
>> fundamental meanings and not words.

This is how wordnet is organized.  For translation you just have to map
specific senses of the word to specific senses of (possibly different)
words in the other language.  For example, you must associate 'noeud'
with both 'knot' and 'bow'.  EuroWordNet claims to do this for a dozen
languages, so the problems are not insurmountable.
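
As a rough illustration of that sense-to-sense mapping, here is a minimal
Python sketch.  The sense IDs, the table layout and every entry other than
'noeud' covering both 'knot' and 'bow' are my own assumptions for the
example; none of it is taken from WordNet or EuroWordNet.

```python
# Toy lexicon: each language maps a word to the sense IDs it can express.
# The sense IDs ("weapon", "ship_front", ...) are invented labels.
LEXICON = {
    "en": {
        "bow": ["weapon", "ship_front", "gesture", "decorative_knot"],
        "knot": ["knot"],
    },
    "fr": {
        "arc": ["weapon"],
        "proue": ["ship_front"],
        "noeud": ["decorative_knot", "knot"],
    },
}


def words_for_sense(sense_id, lang):
    """Return every word in `lang` that carries the given sense."""
    return sorted(w for w, senses in LEXICON[lang].items() if sense_id in senses)


def translate(word, src, dst):
    """Map each sense of `word` in `src` to the words sharing it in `dst`."""
    return {s: words_for_sense(s, dst) for s in LEXICON[src].get(word, [])}


print(translate("noeud", "fr", "en"))
print(translate("bow", "en", "fr"))
```

Because the link is made per sense rather than per word, 'noeud' comes back
as 'bow' for one sense and 'knot' for the other, and a sense with no entry
in the target language simply yields an empty list.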

>>  What's more, you need to cope
>> with the fact that a 'meaning' like "Snow" is only a single concept
>> in English - in Eskimo, there are many (what was it - 10?) different
>> words that relate to OUR concept of snow - because there are finer
>> gradations of meaning in Eskimo.

A myth.  English has just as many.  Just ask a skier what the conditions
are like, and you will hear responses like 'corn', 'powder' or 'slush',
or maybe that there was a 'blizzard' or 'white-out'.  Plus, you can do
what WordNet does, and not bother with fine distinctions.  This is a
lack in WordNet: it would be nice to provide, within each sense, the
fine distinctions which allow you to choose the appropriate word.
Even without these, I find WordNet to be a very useful resource.

>>
>> If you have a database of 'meanings' - how the heck do you index
>> into it?
>>
>>                    Time  flies like an arrow,
>>                    Fruit flies like a banana,
>>                    Green flies like a lettuce.

You access it via word, and sense number within the word.  You could just
as well use an arbitrary sense index for each of the ~100000 senses in the
database.  Note that for each sense, wordnet provides a set of words, so
it is effectively a thesaurus as well as a dictionary.
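
To make the indexing concrete, here is a small Python sketch of the
word-plus-sense-number lookup.  The four records and their glosses are
invented for the example and are not real WordNet data; the 1-based sense
numbering is likewise just a choice made here.

```python
# Each record is one sense: a set of words sharing a meaning, plus a gloss.
# Its position in the list doubles as the arbitrary sense index.
SYNSETS = [
    {"pos": "v", "words": ["fly", "wing"], "gloss": "travel through the air"},
    {"pos": "n", "words": ["fly"], "gloss": "two-winged insect"},
    {"pos": "v", "words": ["fly", "flee"], "gloss": "run away quickly"},
    {"pos": "n", "words": ["arrow"], "gloss": "projectile shot from a bow"},
]


def build_index(synsets):
    """Build word -> ordered list of sense indexes into `synsets`."""
    index = {}
    for sense_index, synset in enumerate(synsets):
        for word in synset["words"]:
            index.setdefault(word, []).append(sense_index)
    return index


INDEX = build_index(SYNSETS)


def lookup(word, sense_number):
    """Fetch a sense by word and 1-based sense number within that word."""
    return SYNSETS[INDEX[word][sense_number - 1]]


def synonyms(word, sense_number):
    """Thesaurus view: the other words attached to the same sense."""
    return [w for w in lookup(word, sense_number)["words"] if w != word]


print(lookup("fly", 2)["gloss"])
print(synonyms("fly", 3))
```

The same records serve both purposes: `lookup` is the dictionary view and
`synonyms` is the thesaurus view, since each sense carries its own word set.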

>>
>> ASIDE (and just my $0.02):
>>
>> It seems to me that this wordlist thing is in grave danger of
>> growing into something SO big and complex that it'll never get
>> implemented.
>>
>> I would strongly suggest retreating to an achievable goal of
>> getting a few thousand words of a child's vocabulary into a
>> format with the word, a picture and a sound.  Once that's
>> established, you can consider growing the number of fields
>> and the number of words...but the way this is going, it's
>> going to be the sum total of all human knowledge.

Or you could just hang them off the current wordnet structure, and
list which words you've provided sounds and pictures for, which is
what is being proposed, is it not?  Even standard dictionaries provide
pictures for some of their entries.
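
A sketch of what "hanging them off" might look like in Python, assuming
senses are keyed by (word, sense number).  The `Media` record, the file
names and the helper functions are all hypothetical; nothing here reflects
an agreed format for the project.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Media:
    """Optional sound and picture files attached to one sense."""
    sound: Optional[str] = None
    picture: Optional[str] = None


# Only senses that actually have media get an entry; the file names are
# placeholders for the example.
MEDIA = {
    ("dog", 1): Media(sound="dog.wav", picture="dog.png"),
    ("cat", 1): Media(sound="cat.wav"),
}


def media_for(word, sense_number=1):
    """Return the media for a sense, or an empty record if none exists."""
    return MEDIA.get((word, sense_number), Media())


def covered(kind):
    """List the senses that have a file of the given kind ('sound' or 'picture')."""
    return sorted(key for key, m in MEDIA.items() if getattr(m, kind) is not None)


print(covered("sound"))
print(covered("picture"))
print(media_for("bow", 4))
```

The word database itself stays untouched; the media table is a sparse
overlay that can start with a handful of entries and grow.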

>>
>> Just that - by itself - would be a tremendous resource.
>>
>> I've worked on LOTS of OpenSource projects, and I can tell you
>> that the successful ones are those that start with modest goals
>> and grow only AFTER they have a tangible product to keep people
>> interested in it.
>>
>> You have to pick something you can do - and finish in a reasonable
>> time...resist the temptation to get more complex until that's
>> done.  In one project I'm working on (the PPE 3D modeller) we
>> have an acronym 'NIV100' - which is short for:
>>
>>     "Not In Version 1.0.0"
>>
>> ...I'd argue that foreign languages, synonyms and antonyms,
>> multiple sounds, etc are all NIV100 features.  It doesn't
>> mean "we won't do this" - it means "this is a good idea but
>> it's a distraction that (when added to all the other NIV100
>> features) would mean we'd never get ANYTHING out of the door".

Except that, by leveraging off an existing resource, synonyms and 
antonyms come for free.  And the foreign language vs. the multimedia
projects are really separate but parallel.  How many of us here
could really hope to contribute to a French dictionary?  Though
one would certainly be useful for a French tutor program, embedded
in a suitable game environment of course :)

Paul Kienzle
pkienzle@kienzle.powernet.co.uk

>>
>> --
>> Steve Baker                  http://web2.airmail.net/sjbaker1
>> sjbaker1@airmail.net (home)  http://www.woodsoup.org/~sbaker
>> sjbaker@hti.com      (work)
>>
>
>-
>kidsgames@smluc.org  -- To get off this list send "unsubscribe kidsgames"
>in the body of a message to majordomo@smluc.org
>
>