Sentiment Analysis on WeTV Application Reviews Using Naïve Bayes: A Study of Preprocessing, Balancing, and Model Performance


Wilis Brawijaya
Khothibul Umam
Siti Nur'aini
Maya Rini Handayani

Abstract

This study investigates the use of the Naïve Bayes classification algorithm for sentiment analysis of user reviews of the WeTV application on the Google Play Store. The methodology consisted of data scraping, heuristic sentiment labeling, multi-stage preprocessing, class balancing with the Synthetic Minority Over-sampling Technique (SMOTE), and performance evaluation using standard metrics. Before balancing, the model performed well on the dominant class but underperformed on the minority class. Applying SMOTE improved F1-scores, most notably for positive sentiment, which rose from 61% to 64%, while overall accuracy remained at around 71%. These findings indicate that Naïve Bayes, when supported by effective preprocessing and data balancing, can deliver robust and interpretable classification results in text mining tasks. This research contributes to the growing literature on machine learning for opinion mining and has practical implications for developers who want to extract structured insights from large-scale user reviews.
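The balancing and classification steps named in the abstract can be illustrated with a minimal, standard-library-only sketch. It is not the authors' implementation: the toy reviews, the labels, and all function names (`vectorize`, `smote`, `train_nb`, `predict_nb`, `f1`) are hypothetical, whitespace tokenization stands in for the multi-stage preprocessing, and the abstract does not state which tools or parameters were actually used.

```python
import math
import random

def vectorize(docs, vocab):
    """Bag-of-words counts; lowercasing and whitespace splitting stand in for real preprocessing."""
    index = {w: i for i, w in enumerate(vocab)}
    rows = []
    for doc in docs:
        row = [0.0] * len(vocab)
        for tok in doc.lower().split():
            if tok in index:
                row[index[tok]] += 1.0
        rows.append(row)
    return rows

def smote(minority, n_new, k=2, seed=0):
    """SMOTE-style oversampling: interpolate between a minority sample
    and one of its k nearest minority-class neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        others = [m for m in minority if m is not base]
        neighbours = sorted(others, key=lambda m: math.dist(base, m))[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> other
        synthetic.append([b + gap * (o - b) for b, o in zip(base, other)])
    return synthetic

def train_nb(X, y, alpha=1.0):
    """Multinomial Naive Bayes with Laplace smoothing; returns log-priors and log-likelihoods."""
    n_features = len(X[0])
    model = {}
    for c in sorted(set(y)):
        rows = [x for x, label in zip(X, y) if label == c]
        totals = [sum(col) for col in zip(*rows)]
        denom = sum(totals) + alpha * n_features
        model[c] = (math.log(len(rows) / len(X)),
                    [math.log((t + alpha) / denom) for t in totals])
    return model

def predict_nb(model, x):
    """Return the class with the highest posterior log-score."""
    def score(c):
        log_prior, log_lik = model[c]
        return log_prior + sum(v * l for v, l in zip(x, log_lik))
    return max(model, key=score)

def f1(y_true, y_pred, positive):
    """F1-score for one class: 2*TP / (2*TP + FP + FN)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    return 2 * tp / (2 * tp + fp + fn) if tp else 0.0

# Invented toy reviews (not from the study's dataset): 4 negative, 2 positive.
reviews = ["bad app ads", "ads bad slow", "slow app bad", "bad ads",
           "good app", "great good"]
labels = ["neg", "neg", "neg", "neg", "pos", "pos"]
vocab = sorted({tok for r in reviews for tok in r.split()})
X = vectorize(reviews, vocab)

# Oversample the minority class until both classes have the same size.
minority = [x for x, label in zip(X, labels) if label == "pos"]
synthetic = smote(minority, n_new=labels.count("neg") - len(minority))
X_bal = X + synthetic
y_bal = labels + ["pos"] * len(synthetic)

model = train_nb(X_bal, y_bal)
test_docs = ["good great", "bad ads slow"]
predictions = [predict_nb(model, x) for x in vectorize(test_docs, vocab)]
print(predictions, f1(["pos", "neg"], predictions, positive="pos"))
```

In this sketch, each synthetic vector lies on the line segment between two existing minority-class vectors, which is the core idea of SMOTE. For real data, oversampling would normally be applied to the training split only, so that synthetic samples do not leak into the evaluation set.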

Article Details

How to Cite
Brawijaya, W., Umam, K., Nur'aini, S., & Handayani, M. R. (2025). Sentiment Analysis on WeTV Application Reviews Using Naïve Bayes: A Study of Preprocessing, Balancing, and Model Performance. JUSIFO (Jurnal Sistem Informasi), 11(1), 43-52. https://doi.org/10.19109/jusifo.v11i1.27925
