__/ [ Fred Hedges ] on Friday 05 May 2006 16:18 \__
> I notice our website (http://www.thermoteknix.com/) is 3rd in the google
> rankings for Miniature Infrared Camera, but nowhere for Small Infrared
> Camera. I thought the engines were smarter than that. I'm thinking it
> must be fairly easy to use Synonyms (especially in English), to find a
> "root" expression - or token - to represent a word or phrase, making the
> term a little more generic. In reality, "keyword" is a rather unforgiving
> concept. Don't you agree?
I fully agree, but the implementation issues are tremendous. Nearly any
word has a synonym, or two, or more. These words, in turn, have their own
synonyms, which may not be suitable substitutes for the first word (it's a
graph then, not a tree; and the graph is not acyclic either).
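To make the graph point concrete, here is a minimal sketch in Python. The thesaurus is a made-up toy (the words and links are my own assumptions, not real lexical data), and within_hops is a hypothetical helper. It only shows that following synonyms of synonyms drifts away from the original meaning, and that the links loop back on themselves.

```python
# Hypothetical toy thesaurus; the words and links are invented for illustration.
SYNONYMS = {
    "small": {"miniature", "little", "minor"},
    "miniature": {"small", "tiny"},
    "little": {"small", "brief"},
    "minor": {"small", "underage"},
    "tiny": {"miniature"},
    "brief": {"little"},
    "underage": {"minor"},
}


def within_hops(word, hops):
    """Return every word reachable from `word` in at most `hops` synonym links."""
    seen = {word}
    frontier = {word}
    for _ in range(hops):
        # Expand one level; dropping already-seen words copes with the cycles
        # (small -> miniature -> small) that a tree would not have.
        frontier = {n for w in frontier for n in SYNONYMS.get(w, ())} - seen
        seen |= frontier
    return seen - {word}


one_hop = within_hops("small", 1)
two_hops = within_hops("small", 2)

print(sorted(one_hop))             # direct synonyms of "small"
print(sorted(two_hops - one_hop))  # synonyms of synonyms only
```

In this toy data the second hop already yields "underage" and "brief", neither of which you would want substituted into a query about a small camera.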
Now, in search engine hyperspace you have axes for the different keywords
and each one is assumed to be quite independent and isolated (setting
aside proximity tests). If you start to merge dimensions in a fuzzy way
(relying on synonyms that are suitable to a lesser or greater extent), the
complexity becomes vast. It already /is/ vast at present.
There is one alternative however. When you type in a search phrase, you
could opt for a slower process (or one which gets distributed across
multiple machines or gets hyperthreaded), whereby your words are
swapped for known synonyms and each variant is queried in isolation. Then
you apply some relevance weighting (to favour the original phrase) and
merge the results accordingly. This might work rather nicely, but again,
it's very computationally expensive and it might not produce more relevant
results. This could maybe be added as an option with a tickbox, but it'll
confuse users. Simplicity is better...
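The alternative described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the synonym table, the three documents, the 0.5 penalty per substituted word and all function names are hypothetical, and match() is a trivial stand-in for a real index lookup. It is not how any actual engine does it.

```python
from itertools import product

# All data below is hypothetical and exists only to make the sketch runnable.
SYNONYMS = {"miniature": ["small", "tiny"]}
DOCS = {
    "doc_a": "miniature infrared camera for industrial use",
    "doc_b": "small infrared camera with usb output",
    "doc_c": "infrared thermometer",
}
SYNONYM_PENALTY = 0.5  # assumed weight applied once per substituted word


def variants(query):
    """Yield (words, weight) for the original phrase and each synonym swap."""
    words = query.lower().split()
    options = [
        [(w, 1.0)] + [(s, SYNONYM_PENALTY) for s in SYNONYMS.get(w, [])]
        for w in words
    ]
    # The number of variants is the product of the option counts per word,
    # which is where the computational expense comes from.
    for combo in product(*options):
        weight = 1.0
        for _, penalty in combo:
            weight *= penalty
        yield [w for w, _ in combo], weight


def match(words, text):
    """Stand-in for one isolated query: every word must occur in the text."""
    tokens = set(text.split())
    return all(w in tokens for w in words)


def expanded_search(query):
    """Query each variant in isolation, then merge, keeping the best weight."""
    scores = {}
    for words, weight in variants(query):
        for doc_id, text in DOCS.items():
            if match(words, text):
                scores[doc_id] = max(scores.get(doc_id, 0.0), weight)
    return sorted(scores.items(), key=lambda kv: -kv[1])


results = expanded_search("Miniature Infrared Camera")
print(results)
```

With this toy data the page that uses the original wording ranks first and the "small" page still appears, only lower. With several synonyms per word the variant count multiplies quickly, which is the cost I mean.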
Roy S. Schestowitz | "Slashdot is standard-compliant... in Japan"
http://Schestowitz.com | GNU is Not UNIX ¦ PGP-Key: 0x74572E8E
4:35pm up 7 days 23:32, 12 users, load average: 1.01, 0.87, 0.70
http://iuron.com - proposing a non-profit search engine