Knowledge Engines - A Formal Proposition

Subject: Knowledge Engines - A Formal Proposition
From: Roy Schestowitz <newsgroups@xxxxxxxxxxxxxxx>
Date: Tue, 18 Oct 2005 13:14:09 +0100
Newsgroups: alt.internet.search-engines
Organization: schestowitz.com / MCC / Manchester University
Reply-to: newsgroups@xxxxxxxxxxxxxxx
User-agent: KNode/0.7.2

ABSTRACT

Search engine technologies have been discussed to death and developed
endlessly over the past decade. However, such engines have no so-called
"thirst for knowledge", but rather a thirst for text. We continue to live
in an age where the best results for a query are produced from an input
comprising keywords. The outcome, rather than answers or self-tailored
content, is merely a linear collection of pages whose static content
resembles the keywords. There is no guarantee, however, that such pages
will provide the desired information, or that the information they do
provide is credible.

Iuron  is set to become a collection of tools for knowledge engines, which
are  intended to crawl the World Wide Web. The aim is to create a semantic
entity that captures facts from a large number of pages, thereby providing
an  intelligent  front-end for user search. Results are generated 'on  the
fly' based on acquired knowledge and are solely intended to serve individ-
ual users.

OVERVIEW

Let us think of the Internet as a collection of complex, inter-related
information. Taken as a whole, it holds an immense number of hypotheses
and thus can contain valid, consistent knowledge. Although we can process
(scan) all the information, the higher-level knowledge that is derived
from collections of pages is still missing. There is enough knowledge
across the World Wide Web to answer more or less any question, assuming it
is not subjective. All that is done at present is word indexing with the
notion of word proximity.
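
To make "word indexing with the notion of word proximity" concrete, here
is a minimal sketch in Python. It is an illustration only, not how any
particular engine works; the function names build_index and min_distance
and the two sample pages are assumptions made up for this example.

```python
from collections import defaultdict

def build_index(pages):
    """Map each word to {page_id: [positions of the word on that page]}."""
    index = defaultdict(lambda: defaultdict(list))
    for page_id, text in pages.items():
        for position, word in enumerate(text.lower().split()):
            index[word][page_id].append(position)
    return index

def min_distance(index, page_id, first, second):
    """Smallest gap (in words) between two terms on one page.

    Returns None if either term does not occur on the page.
    """
    a = index.get(first, {}).get(page_id, [])
    b = index.get(second, {}).get(page_id, [])
    if not a or not b:
        return None
    return min(abs(i - j) for i in a for j in b)

# Hypothetical sample pages, held in memory rather than crawled.
pages = {
    "p1": "cheap flights to manchester",
    "p2": "manchester has cheap hotels and many flights",
}
index = build_index(pages)
print(min_distance(index, "p1", "cheap", "flights"))
print(min_distance(index, "p2", "cheap", "flights"))
```

Note that the index only records where words sit. Nothing in it says
what either page actually claims, which is the gap described above.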

Let us face the fact that among the more popular uses of search engines
are searches for commercial companies that provide products or services.
Results returned by the engines sometimes correspond to the most valid and
relevant authority for a given niche. This may be fine for insight into
the magnitude and breadth of companies (or their Web sites), but just as
often it misleads the user.

Search engines at present fail to extend beyond a potentially morbid state
of "dominance prevails". Rather than providing users with the most
reasonable answer and/or a reference to a site, an engine provides a Web
link to what is most cited, typically due to fraudulent practices or
subjective search engine optimisations.

All in all, search engines at present encourage link-related spam and
content-related spam. In worse scenarios, their backlinks-based algorithms
lead to a rise in sponsored listings, whereas our natural incentive is to
prefer what would "work best for us", not what is recommended by automated
tools. These tools, which work at a shallow level without understanding,
end up prioritising large corporations with money to spend on good
listings and inbound links.

Iuron is a project that addresses the issues above. First and foremost, it
converts  the vast amount of information in the World Wide Web into facts.
Moreover,  it serves as an impartial source for answers and is not  highly
susceptible to deceit as it can discern true from false.

METHODS

A variety of plausible ideas have been expressed at some depth, alongside
their pitfalls, in the manifestation document. To name one of them
briefly, pages should be obtained from the World Wide Web and then reduced
to a set of facts. Facts will be assigned varying weights depending on
credibility factors. Frequently repeated facts will be encouraged, while
falsified facts will be discouraged or rejected altogether. First-order
logic serves as the holy grail by which a sequence of words (elements)
becomes a set of arguments with associated semantics.
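
The weighting idea can be sketched roughly as follows. This is only one
possible reading of it, not Iuron's actual design: the FactStore class,
the additive scoring, the threshold and the credibility numbers are all
assumptions for illustration. A fact is written as a tuple (predicate,
argument, argument), loosely echoing a first-order predicate.

```python
from collections import defaultdict

class FactStore:
    """Toy repository that accumulates support for facts (hypothetical)."""

    def __init__(self):
        self.support = defaultdict(float)  # fact -> summed credibility
        self.refuted = set()               # facts known to be false

    def observe(self, fact, credibility):
        """Record one sighting of a fact on a page of given credibility."""
        self.support[fact] += credibility

    def refute(self, fact):
        """Mark a fact as falsified so that it is rejected altogether."""
        self.refuted.add(fact)

    def weight(self, fact):
        """Summed credibility, or 0.0 for refuted and unseen facts."""
        if fact in self.refuted:
            return 0.0
        return self.support.get(fact, 0.0)

    def accepted(self, threshold):
        """Facts whose weight reaches the (assumed) acceptance threshold."""
        return sorted(f for f in self.support if self.weight(f) >= threshold)

store = FactStore()
paris = ("capital_of", "paris", "france")
lyon = ("capital_of", "lyon", "france")
store.observe(paris, 0.9)   # seen on a fairly credible page
store.observe(paris, 0.6)   # repeated elsewhere, so support grows
store.observe(lyon, 0.2)    # seen once on a dubious page
store.refute(lyon)
print(store.accepted(threshold=1.0))
```

Summing credibility is the simplest way to reward repetition. A real
system would need to guard against the same claim being copied across
many pages, which simple addition does not do.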

PRACTICABILITY

The fundamental approach to tackling the problem is not overly
complicated. The goal is certainly feasible; the resources needed to make
it practical are the primary barrier.

Since Iuron is an Open Source project, assembly and construction of the
libraries would be rapid, making use of existing projects that fall under
the General Public Licence (GPL). In return, Iuron will provide a
potentially distributed environment, wherein any idle computer across the
world can assist with crawling and report back to a main knowledge
repository. Think of it as a public-driven, reciprocal effort to process
and then centralise human knowledge.
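
The "crawl and report back" arrangement might look something like the
sketch below. Everything in it is assumed for illustration: local threads
stand in for volunteer machines, pages are read from an in-memory
dictionary rather than fetched over the network, and extract_facts is a
deliberately naive stand-in for real fact extraction.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor

def extract_facts(text):
    """Toy extractor: every three-word sentence becomes one triple."""
    facts = []
    for sentence in text.split("."):
        words = sentence.lower().split()
        if len(words) == 3:
            facts.append(tuple(words))
    return facts

def crawl(assignment, fetch):
    """Work done by one volunteer: fetch its pages, return their facts."""
    found = []
    for url in assignment:
        found.extend(extract_facts(fetch(url)))
    return found

def gather(assignments, fetch, workers=4):
    """Main repository: merge reports, counting how often facts recur."""
    repository = Counter()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for report in pool.map(lambda a: crawl(a, fetch), assignments):
            repository.update(report)
    return repository

# Hypothetical pages and a split of the work between two volunteers.
pages = {
    "u1": "Mercury orbits Sun. Venus orbits Sun.",
    "u2": "Mercury orbits Sun.",
    "u3": "Mars orbits Sun. Hello there.",
}
repository = gather([["u1"], ["u2", "u3"]], pages.get)
for fact, count in sorted(repository.items()):
    print(count, fact)
```

Merging happens only in the main repository, so volunteers need not know
about one another. They only need a list of pages and somewhere to send
their reports.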
