Path Analysis

Because a graph represents some relation defined on Web pages (or other objects), many different kinds of graphs can be formed for path analysis. The most obvious is a graph representing the physical layout of a Web site, with Web pages as nodes and hypertext links between pages as directed edges. Other graphs can be formed from the types of Web pages, with edges representing similarity between pages, or from usage data, with each edge giving the number of users who go from one page to another [PPR96]. Most of the work to date involves determining frequent traversal patterns or large reference sequences from the physical-layout type of graph. The navigation-content transactions of [CMS97], the maximal forward reference transactions of [CPY96], or the user sessions of [PPR96] can be used as input for path analysis. Path analysis can be used to determine the most frequently visited paths in a Web site. Other examples of information that can be discovered through path analysis are:

The first rule suggests that /company/products/file2.html contains useful information but, since users tend to take a circuitous route to reach it, the page is not clearly marked. The second rule simply states that the majority of users enter the site through a page other than the main page (assumed to be /company in this example), so it might be a good idea to include directory-type information on that entry page if it is not there already. The last rule indicates an attrition rate for the site. Since many users do not browse further than four pages into the site, it would be prudent to ensure that important information is contained within four pages of the common site entry points.
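As a rough illustration of the kinds of statistics discussed above, the following Python sketch computes user-count edge weights, frequent contiguous paths, entry-page shares, and an attrition rate from a handful of sessions. The session data, function names, and support threshold are all hypothetical and chosen only for illustration; they are not taken from [CMS97], [CPY96], or [PPR96]. In particular, `frequent_paths` simply counts contiguous page sequences and is not the maximal forward reference algorithm.

```python
from collections import Counter

# Hypothetical user sessions: each is the ordered list of pages one visitor requested.
sessions = [
    ["/company", "/company/whatsnew", "/company/products",
     "/company/products/file1.html", "/company/products/file2.html"],
    ["/company/products", "/company/products/file2.html"],
    ["/company/products", "/company/products/file1.html"],
    ["/company", "/company/products"],
]


def edge_counts(sessions):
    """Count, for each directed edge (page -> next page), how many sessions traverse it."""
    counts = Counter()
    for s in sessions:
        # Use a set so a session that repeats an edge is counted only once.
        counts.update(set(zip(s, s[1:])))
    return counts


def frequent_paths(sessions, length, min_support):
    """Return contiguous page sequences of the given length whose support
    (fraction of sessions containing them) is at least min_support."""
    if not sessions:
        return {}
    counts = Counter()
    for s in sessions:
        # Distinct contiguous subsequences of this session.
        grams = {tuple(s[i:i + length]) for i in range(len(s) - length + 1)}
        counts.update(grams)
    n = len(sessions)
    return {path: c / n for path, c in counts.items() if c / n >= min_support}


def entry_page_share(sessions):
    """Fraction of non-empty sessions that start at each page."""
    counts = Counter(s[0] for s in sessions if s)
    n = sum(counts.values())
    return {page: c / n for page, c in counts.items()}


def attrition_rate(sessions, max_pages=4):
    """Fraction of sessions that end after at most max_pages page references."""
    if not sessions:
        return 0.0
    return sum(1 for s in sessions if len(s) <= max_pages) / len(sessions)
```

On this sample data, `entry_page_share(sessions)` assigns a share of 0.5 each to /company and /company/products, and `attrition_rate(sessions)` is 0.75, since three of the four sessions contain four or fewer page references. Counting each edge once per session, rather than once per traversal, keeps the edge weights interpretable as numbers of users, which matches the usage-based graph described earlier.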



Bamshad Mobasher
Wed Jul 16 02:08:33 CDT 1997