An Instance Selection Algorithm Based On Reverse k Nearest Neighbor
DOI:
https://doi.org/10.24297/ijct.v10i7.3217
Keywords:
Reverse Nearest Neighbor, Reverse k Nearest Neighbor, reduction rate, classification
Abstract
Classification is one of the most important data mining techniques and belongs to supervised learning. Its objective is to assign class labels to unlabelled data. As data volumes grow rapidly, handling them has become a major concern, so preprocessing, and data reduction in particular, is essential before classification. Data reduction extracts a subset of features from the full feature set of a data set. It decreases the storage requirement and increases the efficiency of classification, and it can be measured by the reduction rate. The key task is choosing representative samples for the final data set. Many instance selection algorithms are based on the nearest neighbor (NN) decision rule. These algorithms select samples using either an incremental or a decremental strategy, and both kinds take considerable processing time because they iteratively scan the data set. Another instance selection algorithm, reverse nearest neighbor reduction (RNNR), is based on the concept of the reverse nearest neighbor (RNN) and does not iteratively scan the data set. In this paper, we extend RNN to reverse k nearest neighbor (RkNN) and apply the concept of RNNR to RkNN. An RkNN query finds the objects that have the query point among their k nearest neighbors. Our approach thus retains the advantage of RNN while using the concept of RkNN. We have taken a data set of theatres, hospitals and restaurants and extracted a sample set from it. Classification has then been performed on the resulting sample data set. We observe two parameters: classification accuracy and reduction rate.
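To make the RkNN definition in the abstract concrete, the following is a minimal brute-force sketch in Python. It is not the paper's implementation. The function names (`k_nearest`, `reverse_knn`, `reduction_rate`), the use of Euclidean distance, the tie-breaking by index, and the toy points are all assumptions made for illustration. The reduction rate is taken here to be the fraction of instances removed, (|T| - |S|) / |T|, which is a common convention but is not defined in the abstract. The rule by which RNNR picks samples from the reverse neighbor sets is not described in the abstract, so it is not shown.

```python
from math import dist  # Euclidean distance, available since Python 3.8


def k_nearest(points, i, k):
    """Return the indices of the k points closest to points[i], excluding i.

    Ties are broken by index (an arbitrary choice made for this sketch).
    """
    others = [j for j in range(len(points)) if j != i]
    others.sort(key=lambda j: (dist(points[i], points[j]), j))
    return others[:k]


def reverse_knn(points, k):
    """Map each index q to the set of indices that have q among their k nearest neighbors."""
    rknn = {q: set() for q in range(len(points))}
    for i in range(len(points)):
        for q in k_nearest(points, i, k):
            rknn[q].add(i)
    return rknn


def reduction_rate(original_size, selected_size):
    """Fraction of instances removed (assumed definition, not taken from the paper)."""
    return (original_size - selected_size) / original_size


# Hypothetical toy data: four points on a line.
points = [(0.0, 0.0), (1.0, 0.0), (3.0, 0.0), (10.0, 0.0)]
rknn = reverse_knn(points, k=1)
```

With `k=1`, the point at index 1 is the nearest neighbor of the points at indices 0 and 2, so `rknn[1]` is `{0, 2}`, while the outlying point at index 3 is nobody's nearest neighbor and has an empty reverse neighbor set. Raising `k` enlarges the reverse neighbor sets, which is the extra information RkNN offers over RNN. The brute-force search is quadratic in the number of points and is meant only to illustrate the definition.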