Fuzzy Clustering and Data Analysis Toolbox
Posted:
Apr 21, 2005 1:39 PM
The first release of the toolbox is now available from
http://www.fmt.vein.hu/softcomp/fclusttoolbox/
The toolbox was developed to provide a standard, continuously extensible tool that any Matlab user can adapt to their own purposes. Chapter 1 of the downloadable documentation gives a theoretical introduction covering the algorithms, the definitions of the validity measures and the visualization tools, which helps in understanding the Matlab files. Chapter 2 describes the files and the individual algorithms, illustrated with simple examples. In Chapter 3 the whole toolbox is tested on real data sets by solving three clustering problems: comparing and selecting algorithms; estimating the optimal number of clusters; and examining multidimensional data sets.
About the Toolbox
The Fuzzy Clustering and Data Analysis Toolbox is a collection of Matlab functions. The toolbox provides five categories of functions:
- Clustering algorithms. These functions group a given data set into clusters using different approaches: Kmeans and Kmedoid are hard partitioning methods, while FCMclust, GKclust and GGclust are fuzzy partitioning methods with different distance norms.
- Evaluation with cluster prototypes. Based on the clustering results for a data set, these functions calculate memberships for "unseen" data sets. In the 2-dimensional case the functions draw a contour map in the data space to visualize the results.
- Validation. The validity function provides cluster validity measures for each partition. It is useful when the number of clusters is not known a priori. The optimal partition can be determined from the extrema of the validation indexes as a function of the number of clusters. The indexes calculated are: Partition Coefficient (PC), Classification Entropy (CE), Partition Index (SC), Separation Index (S), Xie and Beni's Index (XB), Dunn's Index (DI) and Alternative Dunn Index (DII).
- Visualization. The visualization part of the toolbox provides a modified Sammon mapping of the data. This mapping is a multidimensional scaling method described by Sammon.
- Examples. An example based on an industrial data set that demonstrates the usefulness of the toolbox and its algorithms.
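For readers who want a feel for what fuzzy partitioning and the validation step compute, here is a minimal sketch. It is not the toolbox's code: it is written in plain Python rather than Matlab, it implements only the standard fuzzy c-means update with the Euclidean norm, and the function names (fuzzy_c_means, partition_coefficient, classification_entropy) and the toy data are made up for illustration. The two indexes follow the usual definitions, PC = (1/N) * sum of u_ik^2 and CE = -(1/N) * sum of u_ik * ln(u_ik).

```python
import math
import random


def fuzzy_c_means(points, c, m=2.0, tol=1e-6, max_iter=200, seed=0):
    """Standard fuzzy c-means with the Euclidean norm (illustrative sketch).

    Returns (centers, U), where U[i][k] is the membership of point k in
    cluster i and centers are the prototypes used to compute that U.
    """
    rng = random.Random(seed)
    n, dim = len(points), len(points[0])
    # Random initial partition matrix; each column is normalised to sum to 1.
    U = [[rng.random() for _ in range(n)] for _ in range(c)]
    for k in range(n):
        s = sum(U[i][k] for i in range(c))
        for i in range(c):
            U[i][k] /= s
    centers = []
    for _ in range(max_iter):
        # Cluster prototypes: means of the data weighted by u_ik^m.
        centers = []
        for i in range(c):
            w = [U[i][k] ** m for k in range(n)]
            total = sum(w)
            centers.append([sum(w[k] * points[k][d] for k in range(n)) / total
                            for d in range(dim)])
        # Membership update from squared distances to the prototypes.
        new_U = [[0.0] * n for _ in range(c)]
        for k in range(n):
            d2 = [sum((points[k][d] - centers[i][d]) ** 2 for d in range(dim))
                  for i in range(c)]
            zero = [i for i in range(c) if d2[i] == 0.0]
            if zero:
                # The point coincides with one or more prototypes.
                for i in zero:
                    new_U[i][k] = 1.0 / len(zero)
            else:
                inv = [d2[i] ** (-1.0 / (m - 1.0)) for i in range(c)]
                s = sum(inv)
                for i in range(c):
                    new_U[i][k] = inv[i] / s
        change = max(abs(new_U[i][k] - U[i][k])
                     for i in range(c) for k in range(n))
        U = new_U
        if change < tol:
            break
    return centers, U


def partition_coefficient(U):
    """PC: mean squared membership; 1/c for a uniform partition, 1 for a hard one."""
    n = len(U[0])
    return sum(u * u for row in U for u in row) / n


def classification_entropy(U):
    """CE: mean entropy of the memberships; 0 for a hard partition."""
    n = len(U[0])
    return -sum(u * math.log(u) for row in U for u in row if u > 0.0) / n


if __name__ == "__main__":
    # Toy data: two well-separated groups in the plane (made up for this sketch).
    data = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
            (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
    for c in (2, 3):
        _, U = fuzzy_c_means(data, c)
        print(c, partition_coefficient(U), classification_entropy(U))
```

Running the clustering for several values of c and comparing the indexes mirrors the idea in the Validation item: PC is conventionally maximised and CE minimised, so the extrema over c suggest a number of clusters. The other indexes listed (SC, S, XB, DI, DII) also use the distances between data and prototypes and are not covered by this sketch.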
--------------------------
Janos Abonyi, Ph.D
Head of the Department of Process Engineering
University of Veszprem
P.O.Box 158 H-8200, Veszprem, Hungary
Tel: +36-88-624209 or 36-88-622793
Fax: +36-88-421-709
www.fmt.vein.hu/softcomp
You can order our new book (Fuzzy Model Identification for Control) from Birkhauser Boston (Springer - NY) http://www.springer-ny.com/detail.tpl?cart=1048164347947749&ISBN=0817642382 or from Amazon.com http://www.amazon.com/exec/obidos/ASIN/0817642382/