Abstract—Representing data by a smaller number of clusters necessarily loses certain fine details but achieves simplification. K-means is the most commonly used efficient clustering technique. Because its results depend on the initial centroids, better results are usually obtained only by running the algorithm more than once. In this paper, a new approach is proposed for computing the initial centroids for K-means. The approach uses the first principal component generated by Principal Component Analysis (PCA) to initialize the centroids. First, the principal components of the dataset are obtained using PCA. From these, the first principal component is used to initialize the cluster centroids. As a result, the proposed technique decreases the clustering time while also achieving better clustering accuracy than the existing technique.
Index Terms—K-means clustering, initial centroid, computational time, PCA, clustering time.
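The initialization summarized in the abstract can be illustrated with a minimal sketch. The abstract does not give the partitioning details, so the code below assumes one plausible reading: project the data onto the first principal component, sort the points along it, split them into k contiguous groups of nearly equal size, and use each group's mean as an initial centroid. The function names `pca_initial_centroids` and `kmeans` are hypothetical and are not taken from the paper; only NumPy is used.

```python
import numpy as np


def pca_initial_centroids(X, k):
    """Hypothetical helper: seed k centroids by partitioning the data
    along its first principal component (an assumed reading of the abstract)."""
    X = np.asarray(X, dtype=float)
    centered = X - X.mean(axis=0)
    # Rows of vt are the principal directions, ordered by decreasing variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    scores = centered @ vt[0]          # projection onto the first component
    order = np.argsort(scores)         # point indices sorted along that axis
    # Assumption: k contiguous groups of nearly equal size along the axis.
    groups = np.array_split(order, k)
    return np.vstack([X[g].mean(axis=0) for g in groups])


def kmeans(X, centroids, max_iter=100):
    """Plain Lloyd iterations starting from the supplied centroids."""
    X = np.asarray(X, dtype=float)
    centroids = np.array(centroids, dtype=float)
    for _ in range(max_iter):
        # Distance from every point to every centroid, shape (n, k).
        dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        new_centroids = centroids.copy()
        for j in range(len(centroids)):
            members = X[labels == j]
            if len(members):           # leave an empty cluster's centroid unchanged
                new_centroids[j] = members.mean(axis=0)
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids
    return centroids, labels
```

A typical call would be `final, labels = kmeans(X, pca_initial_centroids(X, k))`. Because the seeding is deterministic, a single run is reproducible, which is consistent with the abstract's aim of avoiding repeated runs; the accuracy and timing gains reported in the abstract are not demonstrated by this sketch.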
Adnan Alrabea is with Albalqa Applied University, Jordan (email: adnan_alrabea@yahoo.com).
A.V. Senthil Kumar is with the Department of MCAt, Hindusthan College of Arts and Science, Coimbatore, India (email: avsenthilkumar@yahoo.com).
Hasan Alshalabi is with Al-Hussein Bin Talal University, Maan, Jordan (email: hmfnamYahoo.com).
Ahmad Bader is with the American Academy of Cosmetic Surgery Hospital, Dubai (email: ahmad_baderjo@yahoo.com).
Cite: Adnan Alrabea, A. V. Senthilkumar, Hasan Al-Shalabi, and Ahmad Bader, "Enhancing K-Means Algorithm with Initial Cluster Centers Derived from Data Partitioning along the Data Axis with PCA," Journal of Advances in Computer Networks, vol. 1, no. 2, pp. 137-142, 2013.