The Mining Process

The key component of Web mining is the mining process itself. As discussed in this paper, Web mining has adapted techniques from the fields of data mining, databases, and information retrieval, and has also developed some techniques of its own, e.g. path analysis. Much work remains to be done, both in adapting known mining techniques and in developing new ones. Specifically, the following issues must be addressed:

  1. New Types of Knowledge: Web usage mining studies reported to date have mined for association rules, temporal sequences, clusters, and path expressions. As the ways in which the Web is used continue to expand, there is a continual need to identify new kinds of knowledge about user behavior that should be mined for.

  2. Improved Mining Algorithms: The quality of a mining algorithm can be measured both by how effective it is at discovering knowledge and by how efficient it is computationally. There will always be a need to improve mining algorithms along both of these dimensions.

  3. Incremental Web mining: Usage data collection on the Web is incremental in nature. Hence, there is a need to develop mining algorithms that take as input the existing data, the knowledge already mined from it, and the newly arrived data, and that produce an updated model efficiently, without re-mining everything from scratch.

  4. Distributed Web mining: Usage data collection on the Web is distributed by its very nature. If all the data could be integrated before mining, a lot of valuable information could be extracted. However, collecting data from all possible server logs is neither scalable nor practical. Hence, an approach is needed in which knowledge mined separately from the various logs can be integrated into a more comprehensive model.
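
As a rough illustration of issues 3 and 4, the sketch below keeps a running count of how often pairs of pages occur together in a user session. This is not an algorithm from this paper; it is a minimal example under stated assumptions. The class name PairSupport, its methods, and the sample URLs are all hypothetical, each session is assumed to be already identified as a list of page URLs, and merging assumes that no single session is split across two server logs. Because the summary consists only of counts, new sessions can be folded in without re-reading old data (incremental), and summaries built on separate servers can be added together (distributed).

```python
from collections import Counter
from itertools import combinations


class PairSupport:
    """Hypothetical running summary: co-occurrence counts of page pairs."""

    def __init__(self):
        self.pair_counts = Counter()  # (page_a, page_b) -> number of sessions
        self.n_sessions = 0

    def update(self, sessions):
        """Incremental step: fold newly logged sessions into existing counts."""
        for session in sessions:
            # Deduplicate and sort so each unordered pair has one canonical key.
            pages = sorted(set(session))
            self.pair_counts.update(combinations(pages, 2))
            self.n_sessions += 1
        return self

    def merge(self, other):
        """Distributed step: combine summaries mined from separate logs.

        Assumes the two summaries cover disjoint sets of sessions.
        """
        merged = PairSupport()
        merged.pair_counts = self.pair_counts + other.pair_counts
        merged.n_sessions = self.n_sessions + other.n_sessions
        return merged

    def frequent_pairs(self, min_support):
        """Return pairs whose fraction of sessions is at least min_support."""
        if self.n_sessions == 0:
            return {}
        return {
            pair: count / self.n_sessions
            for pair, count in self.pair_counts.items()
            if count / self.n_sessions >= min_support
        }


# Two servers summarize their own logs independently.
server_a = PairSupport().update([["/", "/products", "/cart"], ["/", "/products"]])
server_b = PairSupport().update([["/", "/about"], ["/", "/products"]])

# New data arrives at server A; only the new session is processed.
server_a.update([["/products", "/cart"]])

# Summaries are integrated without shipping the raw logs.
combined = server_a.merge(server_b)
print(combined.frequent_pairs(min_support=0.5))
```

Count-based summaries merge exactly only because addition is order-independent; richer models (clusters, sequential patterns) generally need more elaborate update and integration schemes, which is the open problem the list above describes.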






Bamshad Mobasher
Wed Jul 16 02:08:33 CDT 1997