EFFICIENT CLUSTERING ALGORITHM USING BIRCH CLUSTERS
Abstract
The search for useful patterns in large data sets has recently attracted considerable interest, and one of the most common problems in this area is the identification of clusters, or densely populated regions, in a multidimensional data set. Previous work does not adequately address the problem of large data sets and the minimization of I/O costs. Clustering is a widely used technique in data mining. Many clustering algorithms exist at present, but most of them are either limited to a single type of attribute, or can handle both types of data but are not efficient when clustering large data sets. Only a few algorithms can do both well. Clustering is the process of grouping data, where the groups are established by finding similarities between data items based on their characteristics. Such groups are termed clusters. A comparative study of clustering algorithms across two different data items is performed here. The performance of the clustering algorithms is compared based on the time taken to form the estimated clusters, and the experimental results are depicted as a graph. It can be concluded that the time taken to form the clusters increases as the number of clusters increases. The BIRCH clustering algorithm takes only a few seconds to cluster the data items, whereas simple KMeans takes the longest time to perform clustering. The experimental results suggest that the BIRCH algorithm is more effective than the k-means algorithm: it is efficient and produces clusters of better quality.
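The kind of timing comparison described above can be illustrated with a minimal sketch. It does not reproduce the experiments, data, or tools used in this study, and all function names (`kmeans`, `cf_single_pass`, `make_blobs`, `timed`) are hypothetical. The function `cf_single_pass` is a deliberately simplified stand-in for the first phase of BIRCH: it keeps a flat list of clustering features (count and linear sum only) instead of a CF tree, uses a centroid-distance threshold, and omits the global clustering phase. It is meant only to show why a single scan over the data can be cheaper than the repeated scans of k-means.

```python
import random
import time


def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, k, iters=10, seed=0):
    """Plain Lloyd's k-means: every iteration rescans all points."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: each point goes to its nearest center.
        for i, p in enumerate(points):
            labels[i] = min(range(k), key=lambda c: sq_dist(p, centers[c]))
        # Update step: each center moves to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels


def cf_single_pass(points, threshold):
    """Single scan that absorbs each point into the nearest subcluster summary.

    Assumption: a flat list of [count, linear_sum] entries replaces the CF tree
    of full BIRCH; there is no branching factor and no global clustering step.
    """
    features = []
    labels = []
    for p in points:
        best, best_d = None, None
        for idx, (n, ls) in enumerate(features):
            centroid = [s / n for s in ls]
            d = sq_dist(p, centroid) ** 0.5
            if best_d is None or d < best_d:
                best, best_d = idx, d
        if best is not None and best_d <= threshold:
            # Absorb the point by updating the summary, not by storing the point.
            features[best][0] += 1
            features[best][1] = [s + a for s, a in zip(features[best][1], p)]
        else:
            features.append([1, list(p)])
            best = len(features) - 1
        labels.append(best)
    return labels, len(features)


def make_blobs(k, per_cluster=200, spread=0.2, seed=0):
    """Synthetic 2-D data: k well-separated Gaussian blobs (illustrative only)."""
    rng = random.Random(seed)
    pts = []
    for c in range(k):
        center = 5.0 * c
        pts += [(rng.gauss(center, spread), rng.gauss(center, spread))
                for _ in range(per_cluster)]
    rng.shuffle(pts)
    return pts


def timed(fn, *args, **kwargs):
    """Return (result, elapsed seconds) for a single call."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - start


if __name__ == "__main__":
    for k in (2, 4, 8):
        data = make_blobs(k)
        _, t_km = timed(kmeans, data, k)
        (_, found), t_cf = timed(cf_single_pass, data, threshold=3.0)
        print(f"k={k}: k-means {t_km:.4f}s, single-pass CF {t_cf:.4f}s "
              f"({found} subclusters)")
```

Timings printed by this sketch depend on the machine and on the synthetic data, so they should not be read as a confirmation of the reported results. A faithful comparison would use a full BIRCH implementation and the data sets of the study.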