__/ [ catherine yronwode ] on Saturday 25 February 2006 03:16 \__
> Does anyone know of either a software package or a subscription service
> that employs search engine technology to roam the net looking for
> samples of copyright infringement / plagiarism?
I wrote about it last month, among other methods:
> The way i envision it, the search engine bot would be given the client's
> domain name, then run around comparing random snips from each of the
> client's pages with results at, say, google. When it finds a match, it
> reports back with a daily log listing all files that duplicate portions
> or the entirety of a client's files.
Yes, that is probably how Copyscape works. It automates the analogous but
more laborious process that a human would otherwise undertake. Some
lecturers used to use Google to detect plagiarism; submission in
electronic form has its merits.
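The snip-and-search step described above is easy to sketch. A minimal
example in Python (the function name `random_snippets` and its parameters
are my own invention, not part of any existing product): pull a few
contiguous word runs from a page's text, each of which, wrapped in quotes,
makes a good exact-phrase query for a search engine.

```python
import random

def random_snippets(text, n_snippets=3, words_per_snippet=8, seed=None):
    """Pick a few contiguous word runs from a page's visible text.

    Each snippet, submitted as a quoted exact-phrase query, will only
    match pages that reproduce the client's wording verbatim.
    """
    rng = random.Random(seed)
    words = text.split()
    if len(words) <= words_per_snippet:
        # Page is shorter than one snippet; search for the whole thing.
        return [" ".join(words)]
    snippets = []
    for _ in range(n_snippets):
        start = rng.randrange(len(words) - words_per_snippet)
        snippets.append(" ".join(words[start:start + words_per_snippet]))
    return snippets
```

The snippets would then be fed to a search API and any result URL
outside the client's own domain flagged for the daily log.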
> At this point what is sold could be just the search engine bot software
> (to tech-oriented clients) or a subscription service (for business
> clients without an interest in tech matters).
> The SEO bot would run X number of pages per day, and check the site
> through once, or it could go back and re-work the same site on a
> continual, ongoing subscription basis.
> The subscription service would also do a whois lookup and find the email
> and street addresses of the domain owner and the domain host. It then
> (hand-supervised, probably) would auto-generate legal letters of
> complaint to the domain contact(s) and the contacts for the isp hosting
> the site. This information would go into a weekly log. As a service, it
> could be programmed to auto-generate the exact forms requested by the
> major isps, such as yahoo. It would also presumably continually revisit
> pages that it found had been infringed until the infringement was removed.
That's a lot of automated traffic, which raises several concerns. What, for
example, will you do when the offending site copies only in part, or
attributes the source with a link? That calls for careful attention and
judgment by a human, preferably the victim. Also, imagine the load on
abuse@isp . net.
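Even with the whois lookup automated, the hand-supervised letter step
proposed above amounts to filling a template with the contact details the
lookup returns. A minimal sketch (the template wording, the function
`draft_complaint`, and all names in it are hypothetical; a real notice
would have to follow the host's own abuse-report procedure):

```python
import string
import textwrap

# Hypothetical letter body; major isps each request their own form,
# so a real service would keep one template per host.
COMPLAINT = string.Template(textwrap.dedent("""\
    To: $contact_email

    Dear $contact_name,

    The page at $infringing_url reproduces material from
    $original_url without permission. Please remove the copied
    material or reply stating the basis on which it is licensed.
    """))

def draft_complaint(contact_email, contact_name, infringing_url, original_url):
    """Fill the complaint template with whois-derived contact details.

    The result is a draft for a human to review before sending,
    matching the 'hand-supervised' step in the proposal.
    """
    return COMPLAINT.substitute(
        contact_email=contact_email,
        contact_name=contact_name,
        infringing_url=infringing_url,
        original_url=original_url,
    )
```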
> Thinking farther ahead, if a partnership with google were made, google
> could agree to de-list sites that did not comply with the legal complaint
> procedures (e.g. those bootleg Romanian sites). (I do not want to get off
> onto a tangent about google's own copyright infringement issues; i know
> about them and i hope and trust that they will be resolved. This is just
> an idea, that's all, so please do not turn it into an excuse for
> google-bashing. Thanks.)
Why just Google? *smile* It promotes monoculture.
> I would pay a yearly fee for such a service.
> Does it exist?
I doubt it.
> If not, why not? (And can those restrictions be overcome?)
Such a tool would need to hammer a search engine quite heavily. How would the
search engine feel about that, and what would it stand to gain?
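If the tool is to avoid hammering the search engine (or the sites it
revisits), the queries need to be throttled. A minimal sketch of one way
to do that (the `Throttle` class is my own illustration, not any
particular crawler's API): enforce a minimum interval between requests.

```python
import time

class Throttle:
    """Enforce a minimum delay between successive requests."""

    def __init__(self, min_interval):
        self.min_interval = min_interval  # seconds between requests
        self._last = 0.0

    def wait(self):
        """Block until at least min_interval has passed since the last call."""
        now = time.monotonic()
        remaining = self.min_interval - (now - self._last)
        if remaining > 0:
            time.sleep(remaining)
        self._last = time.monotonic()
```

With, say, `Throttle(2.0)` called before every query, the bot spreads its
X pages per day out rather than arriving as a burst — which also makes it
a better-behaved customer if a search engine ever does sell API access
for this purpose.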
> cat yronwode
> Blues Lyrics and Hoodoo
With friendly regards,
Roy S. Schestowitz
http://Schestowitz.com | SuSE Linux | PGP-Key: 0x74572E8E
6:30am up 7 days 18:49, 9 users, load average: 1.15, 1.04, 1.00
http://iuron.com - help build a non-profit search engine