A multinomial logistic regression model for text in Albanian language
DOI:
https://doi.org/10.24297/jam.v12i7.5486
Keywords:
Multinomial logistic regression, classification
Abstract
In this paper we present a multinomial logistic regression model for authorship identification in Albanian-language texts. In the fitted model, the dependent variable is categorical, taking a value from 1 to 10 for each author, and the independent variables are the number of words, letters, vowels, consonants, punctuation marks, and sentences in each text. The model was applied successfully to a set of ten authors, each represented by one hundred texts they authored. The first, second, and third authors have the highest percentages of correctly predicted texts, and the highest overall correct-prediction rate obtained was 0.738. We conclude that adding the number of consonants, the number of punctuation marks, and the number of sentences to the model as independent variables increases the overall percentage of correct predictions.
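As an illustration only, the six predictors named in the abstract and the multinomial logistic (softmax) probability they feed into can be sketched in Python as below. This is not the authors' implementation: the function names `extract_features` and `predict_proba`, the tokenization rules, the punctuation set, and the choice to count digraphs such as "sh" as two letters are all assumptions made for the example.

```python
import math
import re

# Assumption: vowels are the single characters a, e, ë, i, o, u, y, and
# digraphs such as "sh" or "dh" are counted as two consonant letters.
VOWELS = set("aeëiouy")
# Assumption: a small, hypothetical set of punctuation marks.
PUNCTUATION = set(".,;:!?\"'()-")


def extract_features(text):
    """Return the six counts named in the abstract for one text."""
    letters = [ch for ch in text.lower() if ch.isalpha()]
    n_vowels = sum(ch in VOWELS for ch in letters)
    # A sentence is approximated as a non-empty run between ., ! or ?.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    return {
        "words": len(re.findall(r"\w+", text)),
        "letters": len(letters),
        "vowels": n_vowels,
        "consonants": len(letters) - n_vowels,
        "punctuation": sum(ch in PUNCTUATION for ch in text),
        "sentences": len(sentences),
    }


def predict_proba(x, weights, intercepts):
    """Softmax probabilities over author classes for feature vector x.

    weights holds one coefficient list per author; intercepts holds one
    intercept per author. Both would come from a fitted model.
    """
    scores = [b + sum(w_j * x_j for w_j, x_j in zip(w, x))
              for w, b in zip(weights, intercepts)]
    top = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]
```

In this sketch the predicted author is the class with the largest probability returned by `predict_proba`; the coefficients themselves would have to be estimated from the labelled texts, which the example does not attempt.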
Published
2016-07-18
How to Cite
Salillari, D., & Prifti, L. (2016). A multinomial logistic regression model for text in Albanian language. JOURNAL OF ADVANCES IN MATHEMATICS, 12(7), 6407–6411. https://doi.org/10.24297/jam.v12i7.5486
Section
Articles
License
All articles published in Journal of Advances in Linguistics are licensed under a Creative Commons Attribution 4.0 International License.