DePaul University DePaul CTI Homepage




General Information

 

Instructor:

Bamshad Mobasher
Email: mobasher@cs.depaul.edu
Office: Loop Campus, CTI Building, Room 833
Phone: (312) 362-5174
Office Hours: Mondays, 4:00-5:30 (or by appointment)

Description and Objectives:

Web mining refers to the automatic discovery of interesting and useful patterns from the data associated with the usage, content, and linkage structure of Web resources. It has quickly become one of the most popular areas in computing and information systems because of its direct applications in e-commerce, e-CRM, Web analytics, information retrieval/filtering, Web personalization, and recommender systems. Employees knowledgeable about Web mining techniques and their applications are highly sought after by major Web companies such as Google, Amazon, Yahoo, MSN, and others that need to understand user behavior and use patterns discovered from terabytes of user profile data to design more intelligent applications.

The primary focus of this course is on Web usage mining and its applications to e-commerce and business intelligence. Specifically, we will consider techniques from machine learning, data mining, text mining, and databases for extracting useful knowledge from Web data that can be used for site management, automatic personalization, recommendation, and user profiling. The first half of the course will give a detailed overview of the data mining process and techniques, specifically those most relevant to Web mining. The second half will concentrate on the applications of these techniques to Web and e-commerce data, and their use in Web analytics, user profiling, and personalization.

This course also counts as an advanced course for Computer Science students concentrating in AI or Data Analysis.

Textbooks and Reading Material:

Prerequisites:

  • CSC 383 or 393 (or equivalent background in data structures and algorithms) and CSC 449 (or other equivalent course in relational databases)

Grading Policy:

The final grade will be determined (tentatively) based on the following components:
 
    Assignments = 65%
    Final Project = 35%

     
The general grading scheme will be based on a curve. At the end of the quarter, some adjustments may be made based on overall class performance as well as signs of individual effort. Plusses and minuses will be given at the high/low ends of each grade range.

Assignments:

There will be 4-5 assignments during the quarter involving the concepts and techniques discussed in class. The assignments may involve experimenting with various tools, as well as other written or problem-oriented exercises. These assignments must be done individually. Late assignments will be penalized 10% per day (with weekends counting as one day).

Course Project:

For the class project, students can choose to do an implementation project, a data analysis project, or a research paper. Implementation projects may be done individually or in groups of 2 people (depending on the complexity and type of the project). Research papers and data analysis projects must be done individually. Each group or individual will submit a specific project proposal for approval. More details about the possible project options, as well as due dates for the proposal and the final submission, are available in the Project section.

Tentative List of Topics

The following issues and topics will be covered throughout the course. Many of these topics will be revisited several times during the course in a variety of contexts.

  • Data Mining and Knowledge Discovery
    • The KDD process and methodology
    • Data preparation for knowledge discovery
    • Overview of data mining techniques
    • Market basket analysis
    • Classification and prediction
    • Clustering
    • Memory-based reasoning
    • Evaluation and Interpretation
  • Web Usage Mining Process and Techniques
    • Data collection and sources of data
    • Data preparation for usage mining
    • Mining navigational patterns
    • Integrating e-commerce data
    • Leveraging site content and structure
    • User tracking and profiling
    • E-Metrics: measuring success in e-commerce
    • Privacy issues
  • Web Mining Applications and Other Topics
    • Data integration for e-commerce
    • Web personalization and recommender systems
    • Web content and structure mining
    • Web data warehousing
    • Review of tools, applications, and systems


Copyright © 2007-2008, Bamshad Mobasher, School of CTI, DePaul University.