Roy Schestowitz wrote:
> Chris Hope wrote:
>> Fritz M wrote:
>>> Chris Hope wrote:
>>>> In a lot of cases Google picks up typos when the search is done and
>>>> you get the page asking you if you meant to spell it the correct way.
>>> In Roy's "Oogle earth" example, though, Google isn't offering
>>> corrections. Interesting.
>> I was surprised it didn't pick that one up myself. One of the
>> misspellings I made once was "Freebsie" when the brand name is actually
>> "Freesbie". Google still hasn't learnt it's an incorrect spelling
>> yet, and the two sites I have it misspelled on come up within the
>> first few results. The original site I left with the misspelling (hey
>> why not!) and the second one I added it in with a note saying it's
>> often misspelled and had the misspelling. Gets me a bit of traffic each
>> day :)
>> The funny thing is, when you spell it correctly Google asks if you
>> meant to search on "frisbie"
>>> I intentionally misspell words on some of my web pages or include
>>> common variations of a word, usually for proper nouns like city
>>> names, restaurants, people and so forth.
> I hadn't realised what Fritz pointed out until he did. However, Google
> corrections are often as naive as one would expect them to be.
> A Google index of valid tokens considers 'Oogle' (whatever it may be)
> to be a valid word. It also considers 'Earth' to be a valid word.
> Finding the correlation between words and proposing corrections based
> on strings of words is computationally a hard task. Google Suggest is
> capable of pairing (or tupling) words based on the number of results,
> but if you introduce this extra dimension of misspellings and consider
> all possible things that can go wrong with spelling, you ask for too
> much. You would not get search results quickly enough OR, if doing it
> off-line, you could have Google spend a lot of computer power
> optimising searches in this way.
Compared with the crunching Google does to return the search results, I
don't think the spelling thing would necessarily be all that difficult.
However, I suspect that the way it works is by using some sort of
simple dictionary lookup.
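
To show what I mean by a simple dictionary lookup, here is a rough Python
sketch. It is only a guess at the general idea, not how Google actually does
it: the WORD_COUNTS table, its numbers and the suggest() function are all
made up for illustration. The point is that a word already in the index
(like "oogle") is treated as valid and never gets a suggestion.

```python
# Hypothetical index of known tokens with made-up result counts.
WORD_COUNTS = {"google": 1000, "oogle": 5, "earth": 800}

LETTERS = "abcdefghijklmnopqrstuvwxyz"


def edits1(word):
    """Return every string one edit (delete, swap, replace, insert) away."""
    splits = [(word[:i], word[i:]) for i in range(len(word) + 1)]
    deletes = [a + b[1:] for a, b in splits if b]
    swaps = [a + b[1] + b[0] + b[2:] for a, b in splits if len(b) > 1]
    replaces = [a + c + b[1:] for a, b in splits if b for c in LETTERS]
    inserts = [a + c + b for a, b in splits for c in LETTERS]
    return set(deletes + swaps + replaces + inserts)


def suggest(word):
    """Return a 'did you mean' word, or None if no suggestion is made."""
    word = word.lower()
    if word in WORD_COUNTS:
        # Already a known token, so it is assumed to be spelled correctly.
        return None
    candidates = [w for w in edits1(word) if w in WORD_COUNTS]
    if not candidates:
        return None
    # Pick the known word with the highest (made-up) result count.
    return max(candidates, key=WORD_COUNTS.get)
```

With this table, suggest("gogle") and suggest("erth") both offer a
correction, but suggest("oogle") offers nothing because "oogle" is in the
index in its own right. Note that it only looks at one word at a time, so
it knows nothing about "Oogle earth" as a phrase.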
This is a funny Google search that's doing the rounds at the moment
which illustrates the spelling thing (and of course the real reason it
does the suggestion is because there's no ' in isn't):
Chris Hope | www.electrictoolbox.com | www.linuxcdmall.com