Fuzzy c-means (FCM) clustering has been shown to be a powerful tool for partitioning samples into categories. However, the FCM objective function is based only on the membership-weighted sum of squared distances from samples to their cluster centers, which equals the trace of the within-cluster scatter matrix. In this study, we propose a clustering algorithm based on both within- and between-cluster scatter matrices, extended from linear discriminant analysis (LDA), and apply it to unsupervised feature extraction (FE). The proposed methods use between- and within-cluster scatter matrices modified from the between- and within-class scatter matrices of LDA; the LDA scatter matrices are special cases of the proposed unsupervised scatter matrices. Experiments on both synthetic and real data show that the proposed clustering algorithm produces results similar to or better than those of 11 popular clustering algorithms: K-means, K-medoid, FCM, Gustafson-Kessel, Gath-Geva, possibilistic c-means (PCM), fuzzy PCM, possibilistic FCM, fuzzy compactness and separation, a fuzzy clustering algorithm based on a fuzzy treatment of finite mixtures of multivariate Student's t distributions, and a fuzzy mixture of Student's t factor analyzers model. The results also show that the proposed FE method outperforms principal component analysis and independent component analysis.
- Cluster scatter matrices
- Linear discriminant analysis (LDA)
- Unsupervised feature extraction (FE)
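
The relation between the FCM objective and the within-cluster scatter matrix mentioned in the abstract can be checked numerically. The following is a minimal sketch, not the authors' implementation: it assumes the standard FCM objective (memberships raised to a fuzzifier `m`, squared Euclidean distances) and an unnormalized fuzzy within-cluster scatter matrix built from the same weights. The function names, the toy data, the centers `V`, and the memberships `U` are all hypothetical and chosen only for illustration.

```python
def fcm_objective(X, V, U, m=2.0):
    """Sum over clusters i and samples k of u_ik^m * ||x_k - v_i||^2."""
    total = 0.0
    for i, v in enumerate(V):
        for k, x in enumerate(X):
            sq_dist = sum((xj - vj) ** 2 for xj, vj in zip(x, v))
            total += (U[i][k] ** m) * sq_dist
    return total


def fuzzy_within_scatter(X, V, U, m=2.0):
    """Assumed form: S_W = sum_i sum_k u_ik^m (x_k - v_i)(x_k - v_i)^T."""
    d = len(X[0])
    S = [[0.0] * d for _ in range(d)]
    for i, v in enumerate(V):
        for k, x in enumerate(X):
            diff = [xj - vj for xj, vj in zip(x, v)]
            w = U[i][k] ** m
            # Accumulate the weighted outer product of the deviation vector.
            for a in range(d):
                for b in range(d):
                    S[a][b] += w * diff[a] * diff[b]
    return S


def trace(S):
    """Sum of the diagonal entries of a square matrix given as nested lists."""
    return sum(S[a][a] for a in range(len(S)))


# Hypothetical toy data: four 2-D samples, two cluster centers,
# and a membership matrix U[i][k] whose columns sum to 1.
X = [[0.0, 0.0], [1.0, 0.0], [4.0, 4.0], [5.0, 5.0]]
V = [[0.5, 0.0], [4.5, 4.5]]
U = [[0.9, 0.8, 0.1, 0.2],
     [0.1, 0.2, 0.9, 0.8]]

J = fcm_objective(X, V, U)
S_W = fuzzy_within_scatter(X, V, U)
print("FCM objective:", J)
print("trace(S_W):   ", trace(S_W))
```

Under these assumptions the two printed values agree up to floating-point error, because the trace of each outer product equals the squared norm of the deviation vector. This also illustrates why an objective built only on this trace ignores the off-diagonal structure of the scatter matrix and any between-cluster information.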