On-Line Analytical Processing (OLAP) is emerging as a very powerful paradigm
for strategic analysis of databases in business settings. Some of the key
characteristics of strategic analysis include 1) very large data volume,
2) explicit support for the temporal dimension, 3) support for various
kinds of information aggregation, and 4) long-range analysis, where overall
trends are more important than details of individual data items. While
OLAP can be performed directly on top of relational databases,
industry has developed specialized tools to
make it more efficient and effective, e.g. [Adv97]. Also,
the research community has recently demonstrated that the functional and
performance needs of OLAP require that new information structures be
designed. This has led to the development of the data cube information
model [GBLP96], and techniques for its efficient implementation
[HRU96, SDNR96, AAD+96].
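To make the data cube idea concrete, the following is a minimal sketch, not the implementation of [GBLP96] or of any of the cited systems. It naively computes one aggregate for every subset of the dimensions and marks each rolled-up dimension with a placeholder value. The function name `data_cube`, the `ALL` marker, and the sample `sales` records are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict
from itertools import combinations

ALL = "ALL"  # placeholder marking a dimension that has been aggregated away


def data_cube(rows, dims, measure):
    """Sum `measure` for every subset of `dims` (naive, one pass per subset)."""
    cube = defaultdict(int)
    for kept_count in range(len(dims) + 1):
        for kept in combinations(dims, kept_count):
            for row in rows:
                # Keep the value of retained dimensions, roll up the others.
                key = tuple(row[d] if d in kept else ALL for d in dims)
                cube[key] += row[measure]
    return dict(cube)


# Hypothetical fact table with three dimensions and one measure.
sales = [
    {"year": 1996, "region": "east", "product": "pen", "units": 10},
    {"year": 1996, "region": "west", "product": "pen", "units": 5},
    {"year": 1997, "region": "east", "product": "ink", "units": 7},
]

cube = data_cube(sales, ["year", "region", "product"], "units")
grand_total = cube[(ALL, ALL, ALL)]
units_1996 = cube[(1996, ALL, ALL)]
units_east = cube[(ALL, "east", ALL)]
```

The sketch scans the input once per subset of dimensions, which is exponential in the number of dimensions; avoiding that cost is the concern of the efficient implementation techniques cited above.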
Recent work [Dyr97] has shown that the analysis needs of Web usage data have much in common with those of a data warehouse, and hence OLAP techniques are quite applicable. The access information in server logs is modeled as an append-only history, which grows over time. A single access log is unlikely to contain the entire request history for the pages on a server, especially since many clients use a proxy server. Information on access requests will therefore be distributed, and there is a need to integrate it. Because server logs grow quite rapidly, it may not be possible to provide on-line analysis of all of the data, so the log data must be summarized, perhaps in various ways, to make its on-line analysis feasible. Making portions of the log selectively (in)visible to various analysts may also be required for security reasons. These requirements suggest that OLAP techniques may be quite applicable to Web usage data analysis, and this issue needs further investigation.
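The three requirements named above (integration, summarization, and selective visibility) can be sketched as follows. This is only an illustration under stated assumptions, not the approach of [Dyr97]: the record layout, the function names `integrate` and `summarize`, the `visible_prefixes` parameter, and the sample log entries are all hypothetical.

```python
from collections import Counter
from datetime import datetime


def integrate(*logs):
    """Merge several append-only logs into one history ordered by time."""
    merged = [entry for log in logs for entry in log]
    merged.sort(key=lambda entry: entry["time"])
    return merged


def summarize(entries, visible_prefixes=("/",)):
    """Count requests per (day, page), keeping only pages the analyst may see."""
    counts = Counter()
    for entry in entries:
        if entry["page"].startswith(tuple(visible_prefixes)):
            day = entry["time"].date().isoformat()
            counts[(day, entry["page"])] += 1
    return counts


# Hypothetical entries: one log kept by the server, one kept by a proxy.
server_log = [
    {"time": datetime(1997, 5, 1, 9, 0), "page": "/index.html"},
    {"time": datetime(1997, 5, 2, 8, 30), "page": "/private/report.html"},
]
proxy_log = [
    {"time": datetime(1997, 5, 1, 9, 5), "page": "/index.html"},
    {"time": datetime(1997, 5, 1, 10, 0), "page": "/docs/faq.html"},
]

history = integrate(server_log, proxy_log)
full_summary = summarize(history)
restricted_summary = summarize(history, visible_prefixes=("/index", "/docs"))
```

Here the summary replaces individual requests with daily counts per page, and an analyst given the restricted summary never sees entries under the hypothetical `/private` path.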