Roy Schestowitz <newsgroups@xxxxxxxxxxxxxxx> wrote:
> __/ On Friday 26 August 2005 09:27, [John Bokma] wrote : \__
>> Roy Schestowitz <newsgroups@xxxxxxxxxxxxxxx> wrote:
>>> ___/ On Friday 26 August 2005 08:53, [Mikkel Møldrup-Lakjer] wrote :
>>>> "Roy Schestowitz" <newsgroups@xxxxxxxxxxxxxxx> skrev i en
>>>> meddelelse news:demhct$1438$1@xxxxxxxxxxxxxxxxxxxx
>>>>> It sure gets copied rather quickly. Here is an example I found by
>>>>> doing some
>>>> At least the first one mentions its sources, something which seems
>>>> to have been left out on the other two. I think most pages that use
>>>> the Factbook don't like to mention their source. Now why would that
>>> An agent would come home and they would... *ahem*... 'disappear'. I
>>> bet you were referring to the lack of reliability though... not to
>>> mention the lack of intellect or creativity.
>>> This shows the need for a penalty on duplicates. But how can this be
>>> done automatically if the copier does not even acknowledge (link to)
>>> the source?! It's a hit-or-miss situation.
>> Nah, I am sure there are ways to see if (part of) page A is also on
>> page B.
> Sorry, but I must disagree. Let us say that T is the original page and
> F (false) is the copy.
> If F = T + A, where A is some extra content, then you have problems:
> an exact match will no longer flag F as a copy of T.
Not really: you can define similarity based on sentences, words, etc.
You don't have to look for exact matches; similar is close enough.
I am sure there has already been a lot of research done on this, for
example on detecting students who copy papers written by others.
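
To make this concrete, here is a rough sketch in Python of the
shingling idea (my own toy illustration, not what any search engine
actually runs). The containment score addresses your F = T + A case:
it stays high even when the copier pads the page with extra content.

# Toy fuzzy duplicate detection via word shingling (illustrative only).

def shingles(text, w=4):
    """Set of overlapping w-word windows (shingles) from the text."""
    words = text.lower().split()
    return set(tuple(words[i:i + w])
               for i in range(max(len(words) - w + 1, 0)))

def jaccard(a, b):
    """Jaccard similarity |A & B| / |A | B|; 1.0 means identical sets."""
    return len(a & b) / len(a | b) if (a or b) else 1.0

def containment(orig, cand):
    """Fraction of the original's shingles that appear in the candidate.
    Stays at 1.0 for F = T + A: a verbatim copy padded with extras."""
    return len(orig & cand) / len(orig) if orig else 1.0

# Made-up example text, not a real Factbook entry.
T = "the factbook entry describes the history people and economy of x"
F = "my own introduction here " + T + " and my own closing remarks"

s_t, s_f = shingles(T), shingles(F)
print(jaccard(s_t, s_f))      # well below 1.0: F has extra shingles
print(containment(s_t, s_f))  # 1.0: every shingle of T is inside F

As far as I know, real systems hash the shingles into compact
fingerprints so they can compare millions of pages, but the principle
is the same.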
> To a black-hat SEO it would be no problem to automate this and deceive
> the search engines. It is much easier to carry out a robbery than it
> is for the police to spot the crook in a town of millions.
You don't do exact matches in cases like this, just fuzzy matches.
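
If you want to try this at home: Python's standard difflib module
ships a ready-made fuzzy matcher (the strings below are made up, of
course):

from difflib import SequenceMatcher

a = "It sure gets copied rather quickly."
b = "Some intro text. It sure gets copied rather quickly. And more."

# ratio() gives a similarity score between 0 and 1; it stays high here
# even though the two strings are not an exact match.
print(SequenceMatcher(None, a, b).ratio())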
John

Perl SEO tools: http://johnbokma.com/perl/
Experienced (web) developer: http://castleamber.com/
Get an SEO report of your site for just 100 USD: