Outlier Analysis of Categorical Data Using Infrequency

Authors

  • Lakshmi Sreenivasareddy Dirisinapu, Rise Gandhi Groups of Institutions
  • Krishna Murthy Mudumbi, ANU, Guntur
  • Govardhan Aliseri, JNTUH, Hyderabad

DOI:

https://doi.org/10.24297/ijct.v8i3.3397

Keywords:

Data Mining, Outlier detection, FPOF score, FDOD Score, MAD score

Abstract

Anomalies are objects that behave differently from, and do not conform to, the remaining records in a database. Detecting anomalies is an important issue in many fields. Although many methods are available for detecting anomalies in numerical datasets, only a few are available for categorical datasets. In this work, a new method is proposed. The algorithm finds anomalies based on the infrequent itemsets in each record. These itemsets are generated by applying the Apriori property to the values of each record in the dataset. Previous methods may not distinguish between different records with the same frequency; they give such records the same score. For each record, a score based on its infrequent itemsets is generated, which is called the MAD score in this paper. The algorithm utilizes the frequency of each value in the dataset. The FPOF method uses the concept of frequent itemsets, and the Otey method uses infrequent itemsets, but these cannot distinguish records perfectly. The proposed algorithm has been applied to the Nursery dataset and the Bank dataset taken from the “UCI Machine Learning Repository”. Numerical attributes are excluded from the datasets for this analysis. The experimental results show that the method is efficient for outlier detection in categorical datasets.
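
The general idea of scoring records by their infrequent itemsets can be illustrated with a minimal sketch. This is not the MAD score, whose definition is not given in the abstract. The function name `infrequency_scores`, the parameters `min_support` and `max_len`, the 1/length weighting, and the toy records are all assumptions made for illustration, and itemsets are enumerated by brute force rather than pruned with the Apriori property.

```python
from collections import Counter
from itertools import combinations


def infrequency_scores(records, min_support=0.5, max_len=2):
    """Score each categorical record by the infrequent itemsets it contains.

    Illustrative assumption: every itemset whose support is below
    min_support adds 1/len(itemset) to the record's score.
    """
    n = len(records)
    # An item is an (attribute index, value) pair, so that equal values
    # appearing in different attributes are kept distinct.
    itemized = [tuple(enumerate(r)) for r in records]

    # Count how many records contain each itemset of size 1..max_len.
    support = Counter()
    for items in itemized:
        for k in range(1, max_len + 1):
            support.update(combinations(items, k))

    scores = []
    for items in itemized:
        score = 0.0
        for k in range(1, max_len + 1):
            for itemset in combinations(items, k):
                # Relative support below the threshold marks the itemset
                # as infrequent.
                if support[itemset] / n < min_support:
                    score += 1.0 / k
        scores.append(score)
    return scores


# Hypothetical toy data: three identical records and one unusual record.
records = [
    ("a", "x"),
    ("a", "x"),
    ("a", "x"),
    ("b", "y"),
]
scores = infrequency_scores(records, min_support=0.5, max_len=2)
```

In this toy example the three identical records contain no infrequent itemsets and score 0, while the last record collects a contribution from each of its two rare values and from their pair, so a higher score indicates a more likely outlier.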

Author Biographies

Lakshmi Sreenivasareddy Dirisinapu, Rise Gandhi Groups of Institutions

Department of CSE

Krishna Murthy Mudumbi, ANU, Guntur

Research scholar

Govardhan Aliseri, JNTUH, Hyderabad

Director of Evaluation

Published

2013-06-30

How to Cite

Dirisinapu, L. S., Mudumbi, K. M., & Aliseri, G. (2013). Outlier Analysis of Categorical Data Using Infrequency. INTERNATIONAL JOURNAL OF COMPUTERS & TECHNOLOGY, 8(3), 868–873. https://doi.org/10.24297/ijct.v8i3.3397

Section

Research Articles