Tools and Software

  • WEKA
    WEKA is an open-source data mining package containing a comprehensive collection of machine learning algorithms for data mining problems. It is written in Java and runs on almost any platform. The algorithms can be applied directly to a dataset or called from your own Java code. It includes implementations of schemes for classification, association rule discovery, clustering, and prediction, among others. The full distribution of WEKA, along with additional information and supporting material, can be found at the official WEKA Web site. The site also includes additional data sets (from the UCI data repository) already converted into ARFF, the format used by WEKA.
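For readers who have not seen ARFF before, the following is a minimal sketch of how a small table could be written out in that format. The helper name `to_arff` and the toy weather data are hypothetical and exist only for illustration; the sketch covers just numeric and nominal attributes and does no quoting or escaping.

```python
def to_arff(relation, attributes, rows):
    """Render rows as ARFF text.

    attributes: list of (name, kind) pairs, where kind is the string
    "numeric" or a list of allowed nominal values.
    """
    lines = [f"@relation {relation}", ""]
    for name, kind in attributes:
        if isinstance(kind, str):
            # Numeric attribute: "@attribute temperature numeric"
            lines.append(f"@attribute {name} {kind}")
        else:
            # Nominal attribute: "@attribute play {yes,no}"
            lines.append(f"@attribute {name} {{{','.join(kind)}}}")
    lines += ["", "@data"]
    for row in rows:
        # One comma-separated line per instance, in attribute order.
        lines.append(",".join(str(v) for v in row))
    return "\n".join(lines) + "\n"


# Hypothetical toy data, not taken from the UCI repository.
attributes = [
    ("outlook", ["sunny", "overcast", "rainy"]),
    ("temperature", "numeric"),
    ("play", ["yes", "no"]),
]
rows = [("sunny", 85, "no"), ("overcast", 83, "yes"), ("rainy", 70, "yes")]

arff_text = to_arff("weather", attributes, rows)
print(arff_text)
```

Saving the returned text to a file with an `.arff` extension should give something WEKA can open, although values containing spaces or commas would need quoting that this sketch leaves out.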

  • Clustering and Profile Generation Tools
    This is a set of programs developed here for clustering and for generating profiles based on the clustering results. The set also includes some programs to assist in characterizing the generated clusters. The documentation for each program and some example data sets are included in the distribution. All of these programs and the documentation are included in a single Zip Archive.

  • ODBCMine
    ODBCMine is a data mining tool for classification that generates decision trees from ODBC databases using the C4.5 classification algorithm. It analyzes the data in any ODBC data source and creates graphical decision trees in Scalable Vector Graphics (SVG) format.
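C4.5 picks the attribute to split on using the gain ratio, that is, information gain divided by the split information of the attribute. The sketch below computes that quantity for nominal attributes on a few in-memory rows. It is not ODBCMine's code: the function names and the toy rows are hypothetical, and the rows merely stand in for records that would come from an ODBC data source.

```python
from collections import Counter
from math import log2


def entropy(values):
    """Shannon entropy (in bits) of a list of discrete values."""
    total = len(values)
    return -sum((n / total) * log2(n / total) for n in Counter(values).values())


def gain_ratio(rows, attr_index, label_index=-1):
    """Gain ratio of splitting rows on the nominal attribute at attr_index."""
    labels = [r[label_index] for r in rows]
    # Group the class labels by the value of the candidate attribute.
    partitions = {}
    for r in rows:
        partitions.setdefault(r[attr_index], []).append(r[label_index])
    total = len(rows)
    # Expected entropy of the class after the split.
    remainder = sum(len(p) / total * entropy(p) for p in partitions.values())
    info_gain = entropy(labels) - remainder
    # Split information is the entropy of the attribute's own value distribution.
    split_info = entropy([r[attr_index] for r in rows])
    return info_gain / split_info if split_info > 0 else 0.0


# Hypothetical rows: (outlook, humidity, play)
rows = [
    ("sunny", "high", "no"),
    ("sunny", "normal", "no"),
    ("rainy", "high", "yes"),
    ("rainy", "normal", "yes"),
]

for index, name in enumerate(["outlook", "humidity"]):
    print(name, gain_ratio(rows, index))
```

In this toy table `outlook` separates the classes perfectly while `humidity` carries no information, so a C4.5-style learner would split on `outlook` first. Continuous attributes and missing values, which C4.5 also handles, are outside the scope of the sketch.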

  • Magnum Opus
    Magnum Opus is a tool for finding association rules from data. It uses a highly efficient search algorithm for fast association rule discovery and does not rely on sparse data for efficient processing.
    More information on Magnum Opus, as well as an evaluation version for download, can be found at G.I. Webb & Associates.

  • See5/C5.0
    See5 is the commercial version of the C4.5 decision tree algorithm developed by Ross Quinlan. See5/C5.0 classifiers are expressed as decision trees or sets of if-then rules. RuleQuest provides C source code so that classifiers constructed by See5/C5.0 can be embedded in your own systems.

    • More information on See5/C5.0, as well as an evaluation version for download, can be found at the RuleQuest site.

    • The evaluation version can also be downloaded locally from here (zip archive). Note that the evaluation version is limited to 200 cases. Please read the Help files for the program to become familiar with the data format (the distribution includes sample data sets). Also available locally (and from the RuleQuest site) is the file see5-public.zip, which contains public source code and binary programs for applying the classifier (obtained after running See5) to new cases. Please read the documentation to see how these programs can be used.
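To make the idea of a classifier expressed as if-then rules concrete, here is a minimal sketch of applying a small rule set to new cases. Everything in it is hypothetical: the attributes, thresholds, and class names are invented, the rules are not in See5's file format, and the first-matching-rule policy is a simplifying assumption rather than a description of how See5 or the see5-public programs resolve competing rules.

```python
# Each rule pairs a set of conditions with the class it predicts.
# Conditions map an attribute name to a predicate on that attribute's value.
RULES = [
    ({"temperature": lambda t: t > 80, "pressure": lambda p: p > 2.0}, "high"),
    ({"temperature": lambda t: t > 60}, "medium"),
]
DEFAULT_CLASS = "low"  # used when no rule fires


def classify(case, rules=RULES, default=DEFAULT_CLASS):
    """Return the class of the first rule whose conditions all hold."""
    for conditions, label in rules:
        if all(test(case[attr]) for attr, test in conditions.items()):
            return label
    return default


new_cases = [
    {"temperature": 85, "pressure": 2.5},
    {"temperature": 70, "pressure": 1.0},
    {"temperature": 50, "pressure": 3.0},
]
predictions = [classify(case) for case in new_cases]
print(predictions)
```

Keeping the rules as plain data, separate from the code that evaluates them, mirrors the workflow described above: a classifier is learned once and then applied to new cases by a separate program.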

  • Cubist
    Yet another program from RuleQuest. Cubist builds rule-based predictive models that output numeric values, complementing See5/C5.0, which predicts categories. For instance, See5/C5.0 might classify the yield from some process as "high", "medium", or "low", whereas Cubist would output a number such as 73%.

    • Information on Cubist and the evaluation version for download can be found at the RuleQuest site.

    • A local download of the evaluation version (limited to 200 cases) can be found here (zip archive).

  • CBA
    CBA is a data mining tool for the discovery of association rules and for classification. The classification technique used in CBA is based on a subset of the discovered association rules. CBA implements two versions of the Apriori algorithm (one using a single minimum support parameter, and another using multiple minimum supports at different levels). It also includes features for visualizing association rules using a tree structure. A paper describing the multiple minimum support feature can be found here in Postscript format, and another paper describes the Association Based Classification method.

    • More information on CBA can be found at the DMII site.

    • A local download of the full educational version of CBA can be found here (zip archive).
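As a rough illustration of the single minimum support variant of Apriori mentioned above, the following sketch finds all frequent itemsets in a handful of transactions. It is the textbook level-wise algorithm, not CBA's implementation, and it does not cover the multiple minimum support variant. The function name `apriori` and the toy market-basket data are hypothetical.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {frozenset(itemset): support} for every frequent itemset."""
    transactions = [frozenset(t) for t in transactions]
    n = len(transactions)

    def support(itemset):
        # Fraction of transactions that contain every item in the itemset.
        return sum(itemset <= t for t in transactions) / n

    items = {item for t in transactions for item in t}
    current = {frozenset([item]) for item in items}  # candidate 1-itemsets
    frequent = {}
    k = 1
    while current:
        # Keep only the candidates that meet the minimum support.
        level = {c: s for c in current if (s := support(c)) >= min_support}
        frequent.update(level)
        k += 1
        # Join step: merge pairs of frequent (k-1)-itemsets into k-itemsets.
        candidates = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must be frequent.
        current = {
            c for c in candidates
            if all(frozenset(s) in level for s in combinations(c, k - 1))
        }
    return frequent


# Hypothetical market-basket data.
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]
frequent = apriori(transactions, min_support=0.5)
for itemset, s in sorted(frequent.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), s)
```

Association rules would then be derived from these frequent itemsets by checking a confidence threshold; that step is omitted to keep the sketch short.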



Copyright © 2007-2009, Bamshad Mobasher, CDM, DePaul University.