Topic: Fuzzy Clustering and Data Analysis Toolbox
Janos Abonyi

Posted: Apr 21, 2005 1:39 PM

Fuzzy Clustering and Data Analysis Toolbox

The first release of the toolbox is now available from

The toolbox was developed to provide a standard, continuously
extensible tool that any Matlab user can adapt to their own
purposes. Chapter 1 of the downloadable documentation gives a
theoretical introduction: the theory of the algorithms, the
definitions of the validity measures, and the visualization tools,
all of which help in understanding the Matlab files. Chapter 2
describes the files and the individual algorithms and illustrates
them with simple examples. In Chapter 3 the whole toolbox is tested
on real data sets by solving three clustering problems: comparison
and selection of algorithms; estimating the optimal number of
clusters; and examining multidimensional data sets.

About the Toolbox

The Fuzzy Clustering and Data Analysis Toolbox is a collection of
Matlab functions. The toolbox provides five categories of functions:

- Clustering algorithms. These functions group the given data set
into clusters by different approaches: Kmeans and Kmedoid are hard
partitioning methods, while FCMclust, GKclust and GGclust are fuzzy
partitioning methods with different distance norms.

- Evaluation with cluster prototypes. Based on the clustering
results for a data set, this set of functions can calculate
memberships for "unseen" data sets. In the 2-dimensional case the
functions draw a contour map in the data space to visualize the
results.

- Validation. The validity function provides cluster validity
measures for each partition. It is useful when the number of
clusters is unknown a priori. The optimal partition can be
determined from the extrema of the validation indexes as a function
of the number of clusters. The indexes calculated are: Partition
Coefficient (PC), Classification Entropy (CE), Partition Index (SC),
Separation Index (S), Xie and Beni's Index (XB), Dunn's Index (DI)
and Alternative Dunn Index (DII).

- Visualization. The Visualization part of this toolbox provides the
modified Sammon mapping of the data. This mapping method is a
multidimensional scaling method described by Sammon.

- Examples. An example based on an industrial data set demonstrates
the usefulness of the toolbox and its algorithms.
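
For readers who want to see the ideas behind the clustering,
evaluation and validation categories in a few lines, here is a
minimal sketch in plain Python (not Matlab, and not the toolbox's own
code or interface). It assumes the standard fuzzy c-means update with
a Euclidean distance norm and the usual definitions of the Partition
Coefficient and Classification Entropy. All function names
(memberships, fcm, partition_coefficient, classification_entropy),
the fuzzifier default m=2 and the toy data are hypothetical choices
made for this illustration only.

```python
import math
import random


def memberships(points, centers, m=2.0):
    """Fuzzy membership of every point in every cluster (each row sums to 1).

    Also usable for "unseen" points once the centers are known.
    """
    rows = []
    for x in points:
        # Squared Euclidean distance from x to each cluster center.
        d2 = [sum((a - b) ** 2 for a, b in zip(x, c)) for c in centers]
        if min(d2) == 0.0:
            # x coincides with one or more centers: share membership among them.
            hits = [1.0 if d == 0.0 else 0.0 for d in d2]
            rows.append([h / sum(hits) for h in hits])
            continue
        inv = [d ** (-1.0 / (m - 1.0)) for d in d2]
        total = sum(inv)
        rows.append([v / total for v in inv])
    return rows


def fcm(points, c, m=2.0, iters=100, tol=1e-6, seed=0):
    """Plain fuzzy c-means; returns (centers, membership rows)."""
    rng = random.Random(seed)
    # Assumption: initialise the centers with c randomly chosen data points.
    centers = [list(p) for p in rng.sample(points, c)]
    dim = len(points[0])
    for _ in range(iters):
        u = memberships(points, centers, m)
        new_centers = []
        for i in range(c):
            # Each center is the mean of the data weighted by u^m.
            w = [row[i] ** m for row in u]
            total = sum(w)
            new_centers.append(
                [sum(wk * x[j] for wk, x in zip(w, points)) / total
                 for j in range(dim)]
            )
        shift = max(abs(a - b)
                    for old, new in zip(centers, new_centers)
                    for a, b in zip(old, new))
        centers = new_centers
        if shift < tol:
            break
    return centers, memberships(points, centers, m)


def partition_coefficient(u):
    """PC: mean of squared memberships; closer to 1 means a crisper partition."""
    return sum(v * v for row in u for v in row) / len(u)


def classification_entropy(u):
    """CE: mean entropy of the memberships; closer to 0 means a crisper partition."""
    return -sum(v * math.log(v) for row in u for v in row if v > 0.0) / len(u)


# Toy data: two well-separated groups of three 2-D points.
data = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.3),
        (5.0, 5.1), (5.2, 4.9), (4.9, 5.3)]

centers, u = fcm(data, c=2)
print("PC:", partition_coefficient(u))
print("CE:", classification_entropy(u))
print("unseen point:", memberships([(0.1, 0.1)], centers))
```

Running fcm for several values of c and comparing the resulting PC
and CE values is one simple way to look for the extrema mentioned in
the Validation item above. The Gustafson-Kessel and Gath-Geva
variants differ mainly in the distance norm, which this sketch does
not attempt to reproduce.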

Janos Abonyi, Ph.D

Head of the Department of Process Engineering
University of Veszprem
P.O.Box 158 H-8200, Veszprem, Hungary
Tel: +36-88-624209 or 36-88-622793
Fax: +36-88-421-709

You can order our new book (Fuzzy Model Identification for Control)
from Birkhauser Boston (Springer - NY)
or from
