
Integrating Semantic Similarity with Collaborative Filtering

As noted earlier, the item-based CF framework provides a computational advantage over user-based approaches, since item similarities can be computed offline, prior to the online task of generating recommendations. This framework also provides another important advantage: because the computation of item similarities is independent of the methods used for generating predictions or recommendations, other sources of evidence about items (in addition to item ratings or weights) can be used in the similarity computations.

The integration of semantic similarities for items with rating (or usage-based) similarities provides two primary advantages. First, the semantic attributes of items provide additional clues about the underlying reasons why a user may or may not be interested in particular items (reasons that are otherwise hidden behind the rating values). This, in turn, allows the system to make inferences based on this additional source of knowledge, possibly improving the accuracy of recommendations. Second, in cases where little or no rating (or usage) information is available (such as for newly added items, or in very sparse data sets), the system can still use the semantic similarities to provide reasonable recommendations for users.
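The second advantage can be illustrated with a small sketch. The matrices below are made-up toy data, and `cosine_sim` is a hypothetical helper using plain cosine similarity (an assumption; the text does not fix the similarity measure here). A newly added item with no ratings has zero rating-based similarity to every other item, while its semantic attributes still yield usable similarities.

```python
import numpy as np

def cosine_sim(X):
    """Pairwise cosine similarity between the rows of X (one row per item)."""
    norms = np.linalg.norm(X, axis=1, keepdims=True)
    norms[norms == 0] = 1.0  # leave all-zero rows as zero vectors
    Xn = X / norms
    return Xn @ Xn.T

# Toy user-item ratings (rows: users, columns: items); item 2 is new and unrated.
ratings = np.array([[5.0, 4.0, 0.0],
                    [3.0, 0.0, 0.0],
                    [4.0, 5.0, 0.0]])

# Toy semantic attribute matrix (rows: items, columns: attributes).
attributes = np.array([[1.0, 0.0, 1.0],
                       [0.0, 1.0, 1.0],
                       [1.0, 0.0, 1.0]])

rating_sim = cosine_sim(ratings.T)    # transpose so that items are rows
semantic_sim = cosine_sim(attributes)

print("rating similarities of new item:  ", rating_sim[2])
print("semantic similarities of new item:", semantic_sim[2])
```

In this toy setting the rating-based row for the new item carries no information, so any neighbors for it must come from the semantic similarities.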

In the following, we describe our approach for integrating semantic similarities into the standard item-based collaborative filtering framework. We first perform latent semantic analysis on the semantic attribute matrix obtained using the process described in Section 3.1; this step is necessary to reduce noise and to collapse highly correlated attributes. We then compute two sets of item similarities: one based on the reduced semantic attribute matrix, and one based on the user-item ratings (or usage) matrix. Finally, we perform item-based collaborative filtering using a combined similarity measure, defined as a linear combination of the two similarities.
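The three steps can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation used in this work: the function names, the mixing parameter `alpha`, the number of latent dimensions `k`, plain cosine similarity, the weighted-average prediction rule, and the toy matrices are all hypothetical choices made for the example. The semantic attribute matrix `S` is assumed to have one row per item, and the ratings matrix one column per item.

```python
import numpy as np

def cosine_sim(X):
    """Pairwise cosine similarity between the rows of X (one row per item)."""
    norms = np.linalg.norm(X, axis=1, keepdims=True)
    norms[norms == 0] = 1.0  # leave all-zero rows as zero vectors
    Xn = X / norms
    return Xn @ Xn.T

def reduce_attributes(S, k):
    """Represent each item (row of S) in the top-k latent dimensions via truncated SVD."""
    U, s, _ = np.linalg.svd(S, full_matrices=False)
    return U[:, :k] * s[:k]

def combined_similarity(ratings, S, alpha, k):
    """Linear combination: alpha * semantic similarity + (1 - alpha) * rating similarity."""
    sem_sim = cosine_sim(reduce_attributes(S, k))
    rate_sim = cosine_sim(ratings.T)  # transpose so that items are rows
    return alpha * sem_sim + (1.0 - alpha) * rate_sim

def predict(ratings, sim, user, item):
    """Similarity-weighted average of the user's ratings on other items (0 means unrated)."""
    rated = np.flatnonzero(ratings[user])
    rated = rated[rated != item]
    weights = sim[item, rated]
    denom = np.abs(weights).sum()
    if denom == 0:
        return 0.0  # no usable neighbors
    return float(weights @ ratings[user, rated] / denom)

# Toy data: 3 users x 3 items, and 3 items x 4 semantic attributes.
ratings = np.array([[5.0, 3.0, 0.0],
                    [4.0, 0.0, 0.0],
                    [1.0, 1.0, 0.0]])
S = np.array([[1.0, 0.0, 1.0, 0.0],
              [0.0, 1.0, 0.0, 1.0],
              [1.0, 0.0, 1.0, 0.0]])

sim = combined_similarity(ratings, S, alpha=0.5, k=2)
prediction = predict(ratings, sim, user=0, item=2)
print(sim)
print(prediction)
```

Setting `alpha` to 0 in this sketch recovers purely rating-based item similarities, while setting it to 1 uses the semantic similarities alone; intermediate values weight the two sources of evidence against each other.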


Bamshad Mobasher 2004-03-09