TY - JOUR
T1 - An accelerated K-means clustering algorithm using selection and erasure rules
AU - Lee, Suiang Shyan
AU - Lin, Chih-Ching
PY - 2012/10/1
Y1 - 2012/10/1
N2 - The K-means method is a well-known clustering algorithm with an extensive range of applications, such as biological classification, disease analysis, data mining, and image compression. However, the plain K-means method is not fast when the number of clusters or the number of data points becomes large. A modified K-means algorithm was presented by Fahim et al. (2006). The modified algorithm produced clusters whose mean square error was very similar to that of the plain K-means, but the execution time was shorter. In this study, we try to further increase its speed. There are two rules in our method: a selection rule, used to acquire a good candidate as the initial center to be checked, and an erasure rule, used to delete one or many unqualified centers each time a specified condition is satisfied. Our clustering results are identical to those of Fahim et al. (2006). However, our method further cuts computation time when the number of clusters increases. The mathematical reasoning used in our design is included.
AB - The K-means method is a well-known clustering algorithm with an extensive range of applications, such as biological classification, disease analysis, data mining, and image compression. However, the plain K-means method is not fast when the number of clusters or the number of data points becomes large. A modified K-means algorithm was presented by Fahim et al. (2006). The modified algorithm produced clusters whose mean square error was very similar to that of the plain K-means, but the execution time was shorter. In this study, we try to further increase its speed. There are two rules in our method: a selection rule, used to acquire a good candidate as the initial center to be checked, and an erasure rule, used to delete one or many unqualified centers each time a specified condition is satisfied. Our clustering results are identical to those of Fahim et al. (2006). However, our method further cuts computation time when the number of clusters increases. The mathematical reasoning used in our design is included.
KW - Acceleration
KW - Erasure
KW - K-means clustering
KW - Selection
KW - Vector quantization
UR - http://www.scopus.com/inward/record.url?scp=84870876648&partnerID=8YFLogxK
U2 - 10.1631/jzus.C1200078
DO - 10.1631/jzus.C1200078
M3 - Article
AN - SCOPUS:84870876648
VL - 13
SP - 761
EP - 768
JO - Frontiers of Information Technology and Electronic Engineering
JF - Frontiers of Information Technology and Electronic Engineering
SN - 2095-9184
IS - 10
ER -