Implementation and Analysis of Clustering Algorithms in Data Mining
DOI:
https://doi.org/10.24297/ijct.v6i1.4448
Abstract
Data mining plays an important role in the information industry and in society because of the huge amounts of data now available. Organizations worldwide are already aware of data mining. Data mining is the process of using various data analysis tools to obtain patterns from data, also referred to as knowledge discovery from data. Clustering is called an unsupervised learning technique because the groups are not predefined but are defined by the data. Data mining has many research areas. This paper focuses on the performance and evaluation of three clustering algorithms: K-means, self-organizing maps (SOM), and hierarchical agglomerative clustering (HAC). The evaluation of these three algorithms is based purely on survey-based analysis. The algorithms are analyzed by applying them to a banking data set, which is very high-dimensional, and their performance is compared. Our results indicate that the SOM technique is better than K-means and as good as or better than the hierarchical clustering technique. We have also written code in Orange Python for an enhanced algorithm based on a hybrid approach of SOM, K-means, and HAC.
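To make concrete the idea that clusters are defined by the data rather than predefined, the following is a minimal sketch of plain K-means in standard-library Python. It is an illustration only, not the Orange-based hybrid code mentioned in the abstract. The function name `kmeans`, its parameters, and the toy points are assumptions introduced here, and the points are not the banking data set.

```python
import random

def kmeans(points, k, iterations=20, seed=0):
    """Plain k-means on a list of equal-length numeric tuples (illustrative sketch)."""
    rng = random.Random(seed)
    # Initial centroids: k distinct data points chosen at random.
    centroids = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid
        # (squared Euclidean distance).
        labels = [
            min(range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = tuple(sum(dim) / len(members) for dim in zip(*members))
    return labels, centroids

# Hypothetical toy data: two well-separated groups, no labels supplied.
data = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (8.0, 8.0), (8.2, 7.9), (7.8, 8.1)]
labels, centroids = kmeans(data, k=2)
```

No group labels are passed in: the two groups emerge from the distances between points alone. SOM and HAC differ in how they form groups, but they share this unsupervised setting.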