The Automated VSMs to Categorize Arabic Text Data Sets
DOI:
https://doi.org/10.24297/ijct.v13i1.2925
Keywords:
Arabic data sets, Data mining, Text categorisation, Term weighting, VSM.
Abstract
Text categorization is one of the most important tasks in information retrieval and data mining. This paper investigates different variations of vector space models (VSMs) using the KNN algorithm. We used 242 Arabic abstract documents that were previously used by Hmeidi & Kanaan (1997). Our comparison is based on the most popular text evaluation measures: recall, precision, and F1. The experimental results against the Saudi data sets reveal that the Cosine coefficient outperformed the Dice and Jaccard coefficients.
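To make the comparison concrete, the following is a minimal sketch of how the three coefficients could be computed over term-weight vectors and plugged into a KNN classifier. It is not the paper's implementation: the abstract does not state the exact formulas or the term-weighting scheme, so the sketch assumes the common weighted (vector) forms of Dice and Jaccard, documents represented as dictionaries of raw term weights, and a simple majority vote among the k nearest neighbours. All function names and the toy data are hypothetical.

```python
import math
from collections import Counter

# Assumption: each document is a dict mapping term -> weight (e.g. raw term frequency).

def _dot(a, b):
    # Inner product over the terms the two documents share.
    return sum(a[t] * b[t] for t in a.keys() & b.keys())

def _sq_norm(a):
    # Sum of squared weights of one document vector.
    return sum(w * w for w in a.values())

def cosine(a, b):
    denom = math.sqrt(_sq_norm(a)) * math.sqrt(_sq_norm(b))
    return _dot(a, b) / denom if denom else 0.0

def dice(a, b):
    # Weighted Dice coefficient: 2(a.b) / (|a|^2 + |b|^2).
    denom = _sq_norm(a) + _sq_norm(b)
    return 2 * _dot(a, b) / denom if denom else 0.0

def jaccard(a, b):
    # Weighted Jaccard coefficient: (a.b) / (|a|^2 + |b|^2 - a.b).
    dot = _dot(a, b)
    denom = _sq_norm(a) + _sq_norm(b) - dot
    return dot / denom if denom else 0.0

def knn_predict(query, training, similarity, k=3):
    # training is a list of (vector, label) pairs.
    # Rank training documents by similarity to the query, then take a
    # majority vote among the k most similar ones.
    ranked = sorted(training, key=lambda pair: similarity(query, pair[0]), reverse=True)
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]

# Hypothetical toy data, for illustration only.
training = [
    ({"economy": 2, "bank": 1}, "economics"),
    ({"bank": 1, "market": 2}, "economics"),
    ({"match": 2, "goal": 1}, "sports"),
]
query = {"bank": 1, "market": 1}
predictions = {
    name: knn_predict(query, training, fn, k=3)
    for name, fn in (("cosine", cosine), ("dice", dice), ("jaccard", jaccard))
}
```

Swapping the `similarity` argument is the only change needed to compare the coefficients under otherwise identical conditions; recall, precision, and F1 would then be computed per category from the predicted and true labels.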