Abstract
Introduction. Twitter has become a medium through which citizens express themselves on politics, conveying their feelings and opinions in tweets. Analyzing this data makes it possible to discover trends and turning points in political opinion. The aim of this study was to develop an automatic sentiment analysis process for Honduran political tweets using supervised machine learning techniques.

Methods. A total of 1,800 Honduran political tweets were collected using filters based on users and hashtags for the period from January to September 2022, and were then labeled manually. The following natural language processing techniques were applied: Bag of Words (BOW) and term frequency-inverse document frequency (TF-IDF). The classifiers considered were linear SVM, logistic regression, and multinomial Naïve Bayes (MNB). The performance metrics used to compare the classifiers were F1-score, accuracy, and time (training and validation).

Results. MNB was selected because of its higher F1-score (62.48%) and shorter training time; linear SVM obtained 61.80% and logistic regression 61.34%. The final performance of MNB on new tweets was an F1-score of 63.37%.

Conclusion. For the data set presented, the best classifier was MNB. However, the performance gap between the classifiers is small, which suggests that preprocessing optimizations and larger-scale data collection should be considered.

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
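As a rough illustration of the kind of classifier and metric named in the Methods, the following is a minimal sketch of a multinomial Naïve Bayes model over bag-of-words counts, scored with F1 and accuracy and timed during training. It is not the study's implementation, which the abstract does not describe. The class name `BowMultinomialNB`, the whitespace tokenizer, the smoothing value `alpha=1.0`, the two labels `pos` and `neg`, the macro averaging of F1, and the toy English tweets are all assumptions made for the example.

```python
import math
import time
from collections import Counter


def tokenize(text):
    # Assumption: lowercase whitespace tokens; real tweets need more cleaning.
    return text.lower().split()


class BowMultinomialNB:
    """Hypothetical minimal multinomial Naive Bayes over bag-of-words counts."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha  # Laplace smoothing strength (assumed value)

    def fit(self, docs, labels):
        self.classes = sorted(set(labels))
        # Per-class word counts: this is the bag-of-words representation.
        self.counts = {c: Counter() for c in self.classes}
        for doc, label in zip(docs, labels):
            self.counts[label].update(tokenize(doc))
        self.vocab = set()
        for counter in self.counts.values():
            self.vocab.update(counter)
        doc_totals = Counter(labels)
        self.log_prior = {
            c: math.log(doc_totals[c] / len(labels)) for c in self.classes
        }
        self.token_totals = {c: sum(self.counts[c].values()) for c in self.classes}
        return self

    def predict(self, docs):
        return [self._predict_one(doc) for doc in docs]

    def _predict_one(self, doc):
        vocab_size = len(self.vocab)
        scores = {}
        for c in self.classes:
            score = self.log_prior[c]
            denom = self.token_totals[c] + self.alpha * vocab_size
            for token in tokenize(doc):
                if token in self.vocab:  # ignore words never seen in training
                    score += math.log((self.counts[c][token] + self.alpha) / denom)
            scores[c] = score
        return max(self.classes, key=lambda c: scores[c])


def macro_f1(y_true, y_pred):
    # F1 per class is 2*TP / (2*TP + FP + FN); macro averaging is an assumption.
    per_class = []
    for c in sorted(set(y_true)):
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        per_class.append(2 * tp / denom if denom else 0.0)
    return sum(per_class) / len(per_class)


def accuracy(y_true, y_pred):
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Toy, made-up data standing in for manually labeled tweets.
train_docs = [
    "great plan for the economy",
    "proud of this new law",
    "excellent work by congress",
    "terrible decision by the government",
    "corrupt and shameful vote",
    "awful plan bad for the economy",
]
train_labels = ["pos", "pos", "pos", "neg", "neg", "neg"]
test_docs = ["great new law", "shameful and terrible vote"]
test_labels = ["pos", "neg"]

start = time.perf_counter()
model = BowMultinomialNB().fit(train_docs, train_labels)
train_seconds = time.perf_counter() - start

preds = model.predict(test_docs)
print("F1:", macro_f1(test_labels, preds))
print("accuracy:", accuracy(test_labels, preds))
print("training time (s):", train_seconds)
```

With a real corpus, the same three measurements (F1-score, accuracy, and elapsed time) would be collected for each candidate classifier and text representation, so that they can be compared on equal terms.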