__/ [Jim Carlock] on Sunday 12 February 2006 18:18 \__
> Google seems to drop a lot of things when I search for specific
> sequences of text...
> For example:
> "PHP: if" site:www.php.net
> turns up garbage for some reason. It completely misses the page with the
> title "PHP: if".
> Now, for some odd reason, although this is not a Google problem, but
> when I employ the search mechanism on the PHP site to search the
> "online documentation (en)" it takes me to Google, which in turn, the
> first link that appears inside of Google takes me to a French page
> with a different title:
> Go figure. I click upon the If link inside that page, and it takes me to
> the French If as I would expect from that page. But once there I can
> change the /fr/ in the address bar to /en/ and I get to the appropriate
> English page...
> Anyways, my question involves Google specifically.
> If I want to search for:
> "PHP: if - manual"
> how do I get Google to search for this specific string (I only want
> exact matches). Google seems to do the things that the other search
> engines have done in the past... by changing what I want to find and
> throwing up links that I don't need. Specifically this occurs with the
> php.net site right at the moment, but in the past Google seems to
> drop certain items from the searches, like brackets "", parentheses
> "()", semicolons ";", periods ".", commas "," et al. Is there a way to
> turn off this mess and get Google to return specific sequences of
> Jim Carlock
> Post replies to the group.
I don't think there are many participants in this newsgroup, so I suggest you
subscribe to alt.internet.search-engines where this could definitely start
an elaborate discussion.
Google does not do exact string matches, not even if you enclose the string
in delimiters like quotes, brackets or apostrophes. In practice, quoting
gives you something else, which has only roughly the desired effect. It is
not AltaVista, which many of us still recall. Pluses and minuses carry little
weight as well. The nature of the search is different because of the way
Google engineers its index. You could use allintitle:, allintext: and the
like to get something more meaningful. That's all I can say...
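As a rough illustration of the operator approach, here is a minimal Python
sketch that prefixes a query with an operator such as allintitle and builds
a search URL from it. The helper name build_search_url is made up for this
example, and how Google treats the operator on any given day is not
something the code can promise.

```python
from urllib.parse import urlencode


def build_search_url(terms, operator=None):
    """Hypothetical helper: join terms into one query, optionally prefixed
    with a search operator such as 'allintitle' or 'allintext'."""
    query = " ".join(terms)
    if operator:
        # The operator goes in front of the whole query, e.g. allintitle:PHP if
        query = f"{operator}:{query}"
    # urlencode percent-escapes the colon and turns spaces into '+'
    return "https://www.google.com/search?" + urlencode({"q": query})


if __name__ == "__main__":
    print(build_search_url(["PHP", "if"], operator="allintitle"))
```

This only restricts where the terms must appear; it still does not force an
exact string match.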
Moving on to the issue of punctuation, Google separates words (terms, to be
more general) from a variety of symbols. Do not attempt to include these in
your queries. The exception is file types, which can be searched for using
the filetype: operator, if I recall correctly.
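To see why the symbols vanish, consider a toy tokenizer. This is only a
sketch under the assumption that an index stores lowercase word tokens; it
is not Google's actual tokenizer, and the function name tokenize is made up
for the example.

```python
import re


def tokenize(query):
    """Toy tokenizer: keep runs of word characters, drop everything else."""
    return re.findall(r"\w+", query.lower())


if __name__ == "__main__":
    # Both queries reduce to the same list of terms once symbols are dropped.
    print(tokenize('"PHP: if - manual"'))
    print(tokenize("PHP if manual"))
```

Once the colon, hyphen and quotes are gone, the two queries cannot be told
apart, which is roughly the effect described in the original question.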
Searching the web will reveal more on issues pertaining to punctuation in
modern search engines (filenames, for example, are a good analogy, as spaces
in them are ambiguous). There are many issues associated with such an
approach. I am sure that, with a little thought, they will become apparent to a
Roy S. Schestowitz | "Life is too short to proofread"
http://Schestowitz.com | SuSE Linux | PGP-Key: 0x74572E8E
6:25am up 27 days 1:41, 12 users, load average: 0.22, 0.41, 0.58
http://iuron.com - next generation of search paradigms