Comparing the Accuracy of BERT and DistilBERT on a Twitter Dataset

Faisal Fajri
Bambang Tutuko
Sukemi Sukemi

Abstract

The growth of digital media has been extremely rapid, which has made consuming information a challenging task. Machine-learning-assisted processing of social media content has proven very helpful in the digital era. Sentiment analysis is a fundamental task in Natural Language Processing (NLP). As the number of social media users increases, the volume of data stored on social media platforms also grows rapidly, and many researchers now conduct studies that draw on this data. Opinion Mining (OM), also known as Sentiment Analysis (SA), is one of the methods used to analyze information contained in social media text, and several earlier studies have applied Data Mining (DM) techniques to this task. The objective of this research is to compare the accuracy of BERT and DistilBERT. DistilBERT is a distilled variant of BERT that offers faster inference while preserving classification performance. The findings indicate that DistilBERT achieved an accuracy of 97%, precision of 99%, recall of 99%, and F1-score of 99%, higher than BERT, which yielded an accuracy of 87%, precision of 91%, recall of 91%, and F1-score of 89%.
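The four reported figures can all be derived from a classifier's confusion matrix. As a minimal sketch, using hypothetical binary sentiment labels rather than the paper's actual Twitter data, the metrics are computed as follows:

```python
# Sketch of the evaluation metrics reported above (accuracy, precision,
# recall, F1-score), computed from hypothetical binary sentiment labels
# (1 = positive, 0 = negative) -- not the paper's actual data.

def classification_metrics(y_true, y_pred):
    """Compute accuracy, precision, recall, and F1 for binary labels."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    accuracy = (tp + tn) / len(y_true)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return accuracy, precision, recall, f1

# Hypothetical predictions from a sentiment classifier on 10 tweets.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 1, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 1, 1, 0]
acc, prec, rec, f1 = classification_metrics(y_true, y_pred)
print(f"accuracy={acc:.2f} precision={prec:.2f} "
      f"recall={rec:.2f} f1={f1:.2f}")
# -> accuracy=0.80 precision=0.80 recall=0.80 f1=0.80
```

In practice, libraries such as scikit-learn provide these metrics directly (e.g. `sklearn.metrics.classification_report`); the hand-rolled version above only makes the definitions explicit.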

Article Details

How to Cite
Fajri, F., Tutuko, B., & Sukemi, S. (2022). Membandingkan Nilai Akurasi BERT dan DistilBERT pada Dataset Twitter. JUSIFO (Jurnal Sistem Informasi), 8(2), 71-80. https://doi.org/10.19109/jusifo.v8i2.13885
Section
Articles