Re: Detecting Content Mirrors

Subject: Re: Detecting Content Mirrors
From: Roy Schestowitz <newsgroups@xxxxxxxxxxxxxxx>
Date: Fri, 26 Aug 2005 09:32:55 +0100
Newsgroups: alt.internet.search-engines
Organization: schestowitz.com / Manchester University
References: <dekrj8$47m$1@nwrdmz03.dmz.ncs.ea.ibs-infra.bt.com> <deksi9$2c6l$1@godfrey.mcc.ac.uk> <dekura$crp$1@nwrdmz03.dmz.ncs.ea.ibs-infra.bt.com> <430e1f91$0$18636$14726298@news.sunsite.dk> <dem2vi$2ppa$3@godfrey.mcc.ac.uk> <430ec59c$0$18639$14726298@news.sunsite.dk> <demhct$1438$1@godfrey.mcc.ac.uk> <430eca73$0$18648$14726298@news.sunsite.dk> <demibq$148p$2@godfrey.mcc.ac.uk> <Xns96BE22CE9993Ecastleamber@130.133.1.4>
Reply-to: newsgroups@xxxxxxxxxxxxxxx
User-agent: KNode/0.7.2

__/ On Friday 26 August 2005 09:27, [John Bokma] wrote : \__

> Roy Schestowitz <newsgroups@xxxxxxxxxxxxxxx> wrote:
> 
>> ___/ On Friday 26 August 2005 08:53, [Mikkel Møldrup-Lakjer] wrote : \___
>> 
>>> "Roy Schestowitz" <newsgroups@xxxxxxxxxxxxxxx> wrote in message
>>> news:demhct$1438$1@xxxxxxxxxxxxxxxxxxxx
>>>>
>>>> It sure gets copied rather quickly. Here is an example I found by
>>>> doing some searches...
>>> 
>>> At least the first one mentions its sources, something which seems to
>>> have been left out on the other two. I think most pages that use the
>>> Factbook don't like to mention their source. Now why would that be?
>> 
>> An agent would come home and they would... *ahem*.. 'disappear'. I bet
>> you were referring to the lack of reliability though... not to mention
>> lack of intellect or creativity.
>> 
>> This shows the need for a penalty on duplicates. But how can this be
>> done automatically if the copier does not even acknowledge (link to)
>> the source?! It's a hit-or-miss situation.
> 
> Nah, I am sure there are ways to see if (part of) page A is also on page
> B.

Sorry, but I must disagree. Let us say that T is the original page and F
(false) is the copy.

If F = T + A, where A is some extra content, then you have problems.

If T = F + A, then your assumption is correct.

If T = F, you can rely on links (acknowledgements).

What would you do when:

F = T1 + A + T2 + B + T3

Or even worse:

F = T1/2 + A + T2/2 + B 

To a black hat SEO, it would be no problem to automate this and deceive the
search engines. It is much easier to carry out a robbery than it is for the
police to spot the crook in a town of millions.
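
To make the interleaved case (F = T1 + A + T2 + B + T3) concrete, here is a
minimal sketch of one common way to measure overlap: word shingles and a
containment score. This is an illustration only, not how any search engine
is known to work. The function names, the window size w=3 and the sample
strings are all made up for the example.

```python
def shingles(text, w=3):
    """Return the set of all runs of w consecutive words in text."""
    words = text.lower().split()
    if len(words) < w:
        # Text shorter than the window: treat it as a single shingle.
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + w]) for i in range(len(words) - w + 1)}


def containment(source, candidate, w=3):
    """Fraction of the source's shingles that also occur in the candidate."""
    src = shingles(source, w)
    if not src:
        return 0.0
    return len(src & shingles(candidate, w)) / len(src)


# Hypothetical sample data: T1..T3 are pieces of the original page,
# A and B are filler that the copier mixes in.
T1 = "the country has a population of five million people"
T2 = "its capital city lies on the eastern coast"
T3 = "the main exports are fish timber and machinery"
A = "click here for cheap flights and hotel deals"
B = "best online casino bonus offers available today"

T = " ".join([T1, T2, T3])          # the original
F = " ".join([T1, A, T2, B, T3])    # F = T1 + A + T2 + B + T3

score_copy = containment(T, F)                  # copy with filler mixed in
score_unrelated = containment(T, A + " " + B)   # filler only
print(score_copy, score_unrelated)
```

In this toy case the interleaved copy still shares most of T's shingles,
while the filler alone shares none. Note what the sketch does not do: it
cannot say which of the two pages is the original, and the score drops as
the copied pieces get shorter or are reworded, which is the F = T1/2 + A +
T2/2 + B situation. It also says nothing about finding candidate pairs
among millions of pages in the first place.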

Roy

-- 
Roy S. Schestowitz        Y |-(1^2)|^(1/2)+1 K
http://Schestowitz.com

Follow-Ups:
- Re: Detecting Content Mirrors
  - From: John Bokma

References:
- Re: Great source for content
  - From: Roy Schestowitz
- Re: Great source for content
  - From: T.J.
- Re: CIA Factbook Errors
  - From: Roy Schestowitz
- Re: CIA Factbook Mirrors
  - From: Roy Schestowitz
- Re: CIA Factbook Mirrors
  - From: Roy Schestowitz
