The agent-based approach to Web mining involves the development of sophisticated AI systems that can act autonomously or semi-autonomously on behalf of a particular user to discover and organize Web-based information. Generally, agent-based Web mining systems fall into the following three categories:
94],
FAQ-Finder [HBML95], Information Manifold [KLSS95],
OCCAM [KW96], and ParaSite [Spe97] rely either on
pre-specified, domain-specific information about particular types of
documents, or on hard-coded models of the information sources, to
retrieve and interpret documents. Other agents, such as ShopBot
[DEW96] and ILA (Internet Learning Agent) [PE95], attempt to
interact with and learn the structure of unfamiliar information sources.
ShopBot retrieves product information from a variety of vendor sites using
only general information about the product domain. ILA, on the other hand,
learns models of various information sources and translates these into its
own internal concept hierarchy.
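
As an illustration of how an agent might learn a model of an unfamiliar source, the sketch below queries a source about objects the agent already knows and votes on which output field matches which internal attribute. It is a minimal sketch under stated assumptions, not ILA's published algorithm: the function name `learn_field_mapping`, the record layout, and the exact-match comparison are all hypothetical choices made for this example.

```python
from collections import Counter


def learn_field_mapping(known_objects, source_responses):
    """Guess which output field of a source holds which internal attribute.

    known_objects:    {key: {attribute: value}}, the agent's own model (assumed)
    source_responses: {key: tuple of strings},   what the source returned (assumed)
    """
    votes = Counter()
    for key, fields in source_responses.items():
        attributes = known_objects.get(key, {})
        for index, field in enumerate(fields):
            for attribute, value in attributes.items():
                # Hypothetical matching rule: case-insensitive exact match.
                if field.strip().lower() == str(value).strip().lower():
                    votes[(index, attribute)] += 1

    # Greedily keep the best-supported pairing for each field position.
    mapping = {}
    for (index, attribute), _count in votes.most_common():
        if index not in mapping and attribute not in mapping.values():
            mapping[index] = attribute
    return mapping


known = {
    "alice": {"name": "Alice Smith", "phone": "555-0101", "dept": "CS"},
    "bob": {"name": "Bob Jones", "phone": "555-0202", "dept": "EE"},
}
responses = {
    "alice": ("CS", "Alice Smith", "555-0101"),
    "bob": ("EE", "Bob Jones", "555-0202"),
}
print(learn_field_mapping(known, responses))
```

Once such a mapping is available, each field returned by the source can be relabeled with the agent's own attribute names, which is one simple reading of translating a source model into an internal concept hierarchy.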
96]. For example,
HyPursuit [WVS+96] uses semantic information embedded in link
structures as well as document content to create cluster hierarchies of
hypertext documents and to structure an information space. BO (Bookmark
Organizer) [MS96] combines hierarchical clustering techniques and
user interaction to organize a collection of Web documents based on
conceptual information.
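
The idea of building a cluster hierarchy from both document content and link structure can be sketched as follows. This is an illustrative average-link agglomerative clustering, not the actual HyPursuit or BO algorithm: the similarity measure (a weighted mix of term cosine similarity and Jaccard overlap of outgoing links), the `link_weight` parameter, and the document layout are assumptions introduced for this example.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two term-frequency Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def jaccard(a, b):
    """Jaccard overlap between two sets of link targets."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def doc_similarity(d1, d2, link_weight=0.5):
    """Assumed blend of content similarity and shared-link similarity."""
    content = cosine(Counter(d1["text"].lower().split()),
                     Counter(d2["text"].lower().split()))
    links = jaccard(set(d1["links"]), set(d2["links"]))
    return (1 - link_weight) * content + link_weight * links


def cluster(docs, link_weight=0.5):
    """Average-link agglomerative clustering; returns nested tuples of doc ids."""
    clusters = [(name, [name]) for name in docs]  # (subtree, member ids)
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                sims = [doc_similarity(docs[a], docs[b], link_weight)
                        for a in clusters[i][1] for b in clusters[j][1]]
                score = sum(sims) / len(sims)
                if best is None or score > best[0]:
                    best = (score, i, j)
        _, i, j = best
        merged = ((clusters[i][0], clusters[j][0]),
                  clusters[i][1] + clusters[j][1])
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return clusters[0][0]


docs = {
    "a": {"text": "python web crawler tutorial", "links": ["x.org", "y.org"]},
    "b": {"text": "python crawler for the web", "links": ["x.org", "y.org"]},
    "c": {"text": "chocolate cake recipe", "links": ["food.org"]},
    "d": {"text": "easy cake recipe", "links": ["food.org"]},
}
print(cluster(docs))
```

In an interactive setting such as the one described for BO, a user could inspect the resulting hierarchy and adjust it; that interaction step is not modeled here.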