Comparison Study of Logistic Regression Model for Albanian Texts

Authors

  • Denisa Salillari, Polytechnic University of Tirana, Sheshi Nene Tereza, nr. 1, Tirana
  • Luela Prifti, Polytechnic University of Tirana, Sheshi Nene Tereza, nr. 1, Tirana

DOI:

https://doi.org/10.24297/jam.v12i9.127

Keywords:

Logistic regression, classification, R

Abstract

Considering authorship attribution as a classification problem, we attempt to estimate the probability of identifying the correct author of each text under study. In this paper, using R, we first improve the simple model for six Albanian texts (I) by increasing the number of texts and the number of independent variables, and then compare the results with those of the multinomial logistic regression model (II). The model was applied to a set of one hundred texts by ten different authors. Across all the authors under study, the average correctly predicted probability is 0.918. Analysis of data from different Albanian texts shows that about 40% of their letters are vowels. In conclusion, comparing our results with those of (II), the multinomial logistic regression model has more advantages for Albanian texts than the logistic regression model.
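As a rough illustration of the quantities the abstract mentions, the following minimal sketch computes the share of vowels among the letters of a text and fits a small multinomial (softmax) logistic regression by gradient descent. It is written in Python with the standard library only, not in R as in the paper, and it is not the authors' model: the function names, the toy feature vectors, the learning rate, and the choice to count single characters (ignoring Albanian digraphs such as "sh" or "dh") are all assumptions made for illustration.

```python
import math

# The seven Albanian vowels; letters are counted per character (assumption).
ALBANIAN_VOWELS = set("aeëiouy")


def vowel_share(text):
    """Fraction of alphabetic characters in `text` that are vowels."""
    letters = [ch for ch in text.lower() if ch.isalpha()]
    if not letters:
        return 0.0
    return sum(ch in ALBANIAN_VOWELS for ch in letters) / len(letters)


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def predict_proba(weights, x):
    """Class probabilities for one feature vector; the bias is the last weight."""
    xb = list(x) + [1.0]
    return softmax([sum(w * v for w, v in zip(row, xb)) for row in weights])


def fit_multinomial(X, y, n_classes, lr=0.5, epochs=2000):
    """Fit softmax regression by batch gradient descent on the log-loss."""
    n_cols = len(X[0]) + 1  # features plus bias
    weights = [[0.0] * n_cols for _ in range(n_classes)]
    for _ in range(epochs):
        grads = [[0.0] * n_cols for _ in range(n_classes)]
        for x, label in zip(X, y):
            probs = predict_proba(weights, x)
            xb = list(x) + [1.0]
            for k in range(n_classes):
                err = probs[k] - (1.0 if k == label else 0.0)
                for j, v in enumerate(xb):
                    grads[k][j] += err * v
        for k in range(n_classes):
            for j in range(n_cols):
                weights[k][j] -= lr * grads[k][j] / len(X)
    return weights


def mean_correct_probability(weights, X, y):
    """Average probability the model assigns to each text's true author."""
    total = sum(predict_proba(weights, x)[label] for x, label in zip(X, y))
    return total / len(X)


# Made-up feature vectors for three hypothetical authors (not the paper's data).
TOY_X = [[0.0, 0.0], [0.1, 0.2], [0.2, 0.1],
         [1.0, 0.0], [0.9, 0.1], [1.0, 0.2],
         [0.0, 1.0], [0.1, 0.9], [0.2, 1.0]]
TOY_Y = [0, 0, 0, 1, 1, 1, 2, 2, 2]
```

Calling `fit_multinomial(TOY_X, TOY_Y, n_classes=3)` and passing the result to `mean_correct_probability` gives a figure analogous in spirit to the average correctly predicted probability reported above, although the toy data here says nothing about the paper's actual result.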

Author Biographies

Denisa Salillari, Polytechnic University of Tirana, Sheshi Nene Tereza, nr. 1, Tirana

Department of Mathematical Engineering

Luela Prifti, Polytechnic University of Tirana, Sheshi Nene Tereza, nr. 1, Tirana

Department of Mathematical Engineering

References

I. D. Salillari, L. Prifti, Sh. Kuka, “Logistic regression for authorship attribution in Albanian text,” 7th Annual Meeting of Institute Alb-Shkenca, Conference of Natural Sciences.
II. D. Salillari, L. Prifti, “A multinomial logistic regression model for text in Albanian language,” Journal of Advances in Mathematics, Volume 12, Number 07.
III. T. Hastie, R. Tibshirani, J. Friedman, “The Elements of Statistical Learning: Data Mining, Inference, and Prediction,” Second Edition.
IV. G. James, D. Witten, T. Hastie, R. Tibshirani, “An Introduction to Statistical Learning with Applications in R,” Springer, 2013.

Published

2016-09-28

How to Cite

Salillari, D., & Prifti, L. (2016). Comparison Study of Logistic Regression Model for Albanian Texts. JOURNAL OF ADVANCES IN MATHEMATICS, 12(9), 6572–6575. https://doi.org/10.24297/jam.v12i9.127

Section

Articles