
Re: Don't engines use Synonyms yet?

__/ [ Fred Hedges ] on Friday 05 May 2006 16:18 \__

> I notice our website (http://www.thermoteknix.com/)  is 3rd in the google
> rankings for Miniature Infrared Camera, but nowhere for Small Infrared
> Camera.  I thought the engines were smarter than that.  I'm thinking it
> must be fairly easy to use Synonyms (especially in English), to find a
> "root" expression - or token - to represent a word or phrase, making the
> search
> term a little more generic.  In reality, "keyword" is a rather unforgiving
> concept.  Don't you agree?

I fully agree, but the implementation issues are tremendous. Nearly every
word has a synonym, or two, or more. These words, in turn, have their own
synonyms, which may not be suitable substitutes for the first word (it's a
graph then, not a tree; and the graph is not cyclic either).
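
To make the drift concrete, here is a rough Python sketch that walks a toy
synonym table outwards from one word and records how many hops away each
word is. The table and the function name within_hops are made up for
illustration; a real thesaurus would be far larger and messier.

```python
from collections import deque

# Hypothetical toy synonym table (assumption, not real thesaurus data).
SYNONYMS = {
    "small": ["miniature", "little", "minor"],
    "miniature": ["tiny"],
    "minor": ["unimportant"],
    "unimportant": ["trivial"],
}

def within_hops(word, max_hops):
    """Breadth-first walk: map each reachable word to its hop distance."""
    distance = {word: 0}
    queue = deque([word])
    while queue:
        current = queue.popleft()
        if distance[current] == max_hops:
            continue  # do not expand beyond the hop limit
        for neighbour in SYNONYMS.get(current, []):
            if neighbour not in distance:  # keep the shortest distance only
                distance[neighbour] = distance[current] + 1
                queue.append(neighbour)
    return distance

if __name__ == "__main__":
    for word, hops in within_hops("small", 3).items():
        print(hops, word)
```

One hop from "small" gives usable substitutes; by the second and third hop
the walk has reached "unimportant" and "trivial", which nobody searching
for a small camera would want.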

Now, in search engine hyperspace you have axes for the different keywords,
and each one is assumed to be quite independent and isolated (setting
aside proximity tests). If you start to merge dimensions in a fuzzy way
(relying on synonyms that are suitable to a lesser or greater extent), the
complexity becomes vast. It already /is/ vast at present.

There is one alternative, however. When you type in a search phrase, you
could opt for a slower process (or one which gets distributed across
multiple machines or gets hyperthreaded), whereby your words are swapped
for known synonyms and each variant is queried in isolation. Then you
apply some relevance weighting (to favour the original phrase) and merge
the results accordingly. This might work rather nicely, but again, it's
very computationally expensive and it might not produce more relevant
results. This could maybe be added as an option with a tickbox, but it'll
confuse users. Simplicity is better...
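
A minimal sketch of that expand, query, weight and merge idea, in Python.
Everything here is an assumption for illustration: the synonym table, the
three toy documents, the 0.5 penalty per substituted word, and the
search_exact stand-in for whatever keyword lookup a real engine would do.

```python
from itertools import product

# Hypothetical synonym table and toy index (assumptions, not engine data).
SYNONYMS = {
    "miniature": ["small"],
    "small": ["miniature"],
    "camera": ["imager"],
}
DOCUMENTS = {
    "doc1": "miniature infrared camera for industrial use",
    "doc2": "small infrared camera module",
    "doc3": "infrared thermometer",
}
SYNONYM_PENALTY = 0.5  # assumed weight multiplier per substituted word

def expand(query):
    """Yield (words, weight) for every combination of original/synonym."""
    options = []
    for word in query.lower().split():
        choices = [(word, 1.0)]
        choices += [(syn, SYNONYM_PENALTY) for syn in SYNONYMS.get(word, [])]
        options.append(choices)
    for combo in product(*options):
        weight = 1.0
        for _, factor in combo:
            weight *= factor
        yield [word for word, _ in combo], weight

def search_exact(words):
    """Stand-in keyword query: a document matches if it has every word."""
    return [doc_id for doc_id, text in DOCUMENTS.items()
            if all(word in text.split() for word in words)]

def search_with_synonyms(query):
    """Query each variant in isolation, then merge by best weight."""
    scores = {}
    for words, weight in expand(query):
        for doc_id in search_exact(words):
            scores[doc_id] = max(scores.get(doc_id, 0.0), weight)
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))

if __name__ == "__main__":
    print(search_with_synonyms("small infrared camera"))
```

The number of variants is the product of the choices per word, so even
this three-word query with one synonym for two of its words turns into
four separate lookups. That is where the expense comes from.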

'Nuff rambling...


Roy S. Schestowitz      | "Slashdot is standard-compliant... in Japan"
http://Schestowitz.com  |  GNU is Not UNIX  ¦     PGP-Key: 0x74572E8E
  4:35pm  up 7 days 23:32,  12 users,  load average: 1.01, 0.87, 0.70
      http://iuron.com - proposing a non-profit search engine
