SSAAR: An enhanced System for Sentiment Analysis of Arabic Reviews
Keywords: machine learning, natural language processing, opinion mining, sentiment analysis
Sentiment Analysis, or Opinion Mining, has recently attracted the interest of researchers worldwide. With the increasing use of the internet, the web is becoming overloaded with data containing useful information that can be exploited in different fields. Many studies have addressed Sentiment Analysis of online data in different languages; however, research dealing with the Arabic language is still limited. In this paper, an empirical study of Sentiment Analysis of online reviews written in Modern Standard Arabic is conducted. A new system called SSAAR (System for Sentiment Analysis of Arabic Reviews) is proposed, which automatically classifies reviews into three classes (positive, negative, neutral). The input data of this system is built using a proposed framework called SPPARF (Scraping and double Preprocessing Arabic Reviews Framework), which generates a structured and clean dataset. Moreover, the system experiments with two improved approaches for sentiment classification based on supervised learning: the double preprocessing method and the feature selection method. Both approaches are trained using five algorithms (Naïve Bayes, Stochastic Gradient Descent (SGD) Classifier, Logistic Regression, K-Nearest Neighbors, and Random Forest) and then compared under the same conditions. The experimental results show that the feature selection method using the SGD Classifier achieves the best accuracy (77.1%). The SSAAR system therefore proved to be efficient and gives better results with the feature selection method; nevertheless, satisfactory results were also obtained with the double preprocessing approach, which is consequently also considered suitable for the proposed system.
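As a rough illustration of the feature selection approach, the sketch below scores terms, keeps the top-ranked ones, and trains a three-class Naïve Bayes classifier on the reduced vocabulary. It is a minimal sketch under stated assumptions, not the SSAAR implementation: the abstract does not name the selection criterion, so chi-square is assumed here; the tokenized reviews are made-up English placeholders rather than data produced by SPPARF; and the function names (`chi2_scores`, `select_top_k`, `train_nb`, `predict_nb`) are hypothetical.

```python
import math
from collections import Counter, defaultdict

# Made-up, already-tokenized placeholder reviews (NOT data from the paper).
TRAIN = [
    (["great", "hotel", "clean", "room"], "positive"),
    (["excellent", "service", "great", "staff"], "positive"),
    (["bad", "room", "dirty", "hotel"], "negative"),
    (["terrible", "service", "bad", "staff"], "negative"),
    (["average", "hotel", "ordinary", "room"], "neutral"),
    (["ordinary", "service", "average", "staff"], "neutral"),
]


def chi2_scores(docs):
    """Score each term by its largest one-vs-rest chi-square value (assumed criterion)."""
    n = len(docs)
    class_count = Counter(label for _, label in docs)
    doc_freq = Counter()                 # term -> number of documents containing it
    doc_freq_class = defaultdict(Counter)  # term -> class -> documents containing it
    for tokens, label in docs:
        for term in set(tokens):
            doc_freq[term] += 1
            doc_freq_class[term][label] += 1
    scores = {}
    for term in doc_freq:
        best = 0.0
        for label, n_label in class_count.items():
            a = doc_freq_class[term][label]  # term present, in class
            b = doc_freq[term] - a           # term present, other classes
            c = n_label - a                  # term absent, in class
            d = n - a - b - c                # term absent, other classes
            denom = (a + b) * (c + d) * (a + c) * (b + d)
            if denom:
                best = max(best, n * (a * d - b * c) ** 2 / denom)
        scores[term] = best
    return scores


def select_top_k(scores, k):
    """Keep the k highest-scoring terms; ties are broken alphabetically."""
    ranked = sorted(scores, key=lambda term: (-scores[term], term))
    return set(ranked[:k])


def train_nb(docs, vocab):
    """Collect per-class document counts and term counts restricted to vocab."""
    class_docs = Counter()
    term_counts = defaultdict(Counter)
    for tokens, label in docs:
        class_docs[label] += 1
        for term in tokens:
            if term in vocab:
                term_counts[label][term] += 1
    return class_docs, term_counts


def predict_nb(model, vocab, tokens):
    """Multinomial Naive Bayes prediction with add-one (Laplace) smoothing."""
    class_docs, term_counts = model
    total_docs = sum(class_docs.values())
    best_label, best_logp = None, -math.inf
    for label in sorted(class_docs):
        logp = math.log(class_docs[label] / total_docs)
        total_terms = sum(term_counts[label].values())
        for term in tokens:
            if term in vocab:  # terms dropped by feature selection are ignored
                logp += math.log(
                    (term_counts[label][term] + 1) / (total_terms + len(vocab))
                )
        if logp > best_logp:
            best_label, best_logp = label, logp
    return best_label


scores = chi2_scores(TRAIN)
vocab = select_top_k(scores, k=4)
model = train_nb(TRAIN, vocab)
print(sorted(vocab))
print(predict_nb(model, vocab, ["great", "room"]))
```

In this toy setting, terms that occur equally often in every class (such as "hotel") receive a score of zero and are discarded, which is the intuition behind pruning uninformative features before training. The same selected vocabulary could then be passed to any of the five classifiers listed above for a comparison under identical conditions.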
Copyright (c) 2020 Manal Nejjari, Abdelouafi Meziane
This work is licensed under a Creative Commons Attribution 4.0 International License.