Online Resources & Reference Material
- Tools and Software
- Data Mining with WEKA, A Tutorial
- Online Papers
- Data Sets and Sources of Data
- Preprocessed DePaul CTI Web Usage Data -
Cleaned, filtered, and sessionized data on visits to the main CTI site during a 2-week period
in April 2002. The data also includes basic statistics on users and
sessions.
- Cleaned DePaul CTI Web Usage Data - The full cleaned CTI
Web usage data for April 2002. This data set has been cleaned
(including spider removal) and converted into tab-delimited format.
However, no user identification, sessionization, or other data
preparation steps have been performed.
- Non-Preprocessed DePaul CTI Web Usage Data -
The full CTI Web usage data for April 2002. The only cleaning step
performed on this data was the removal of references to auxiliary
files (e.g., image files). No other cleaning or preprocessing has been
performed. The data is in the original log format used by Microsoft
IIS.
- Movie Ratings Data - Real movie ratings data from the www.movielens.org Web site. Contains ratings on 1600+ movies by 1000 users.
- UCI KDD Archive - An online repository of large data sets that encompasses a wide variety of data types, analysis tasks, and application areas.