A Content-Based Thesis Supervisor Recommendation System Based on Research Interest Clustering and Cosine Similarity
Abstract
The assignment of thesis supervisors is a critical academic decision that directly affects research quality and completion outcomes. However, supervisor selection in many higher education institutions remains reliant on subjective judgment and manual inspection of lecturers’ research profiles. This study proposes a content-based thesis supervisor recommendation system that integrates research interest clustering and cosine similarity to support more objective and transparent supervisor assignment. Lecturers’ research interests are derived from publication titles and abstracts collected from Google Scholar and represented using TF–IDF weighting. K-means clustering is applied to model dominant research interest themes, while cosine similarity is used to match students’ thesis proposal texts with clustered publication data. The proposed approach was implemented as a web-based decision-support system and evaluated using publication data from 21 lecturers comprising 469 records. The results indicate that research interest clustering provides a structured and interpretable representation of academic expertise, enabling contextually relevant supervisor recommendations. The system demonstrates practical value by enhancing transparency, consistency, and efficiency in academic decision-making. This study contributes to applied research on academic recommendation systems by extending publication-based approaches through cluster-level modeling of research interests.
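The pipeline described in the abstract (TF-IDF weighting, K-means clustering, cosine-similarity matching) can be illustrated with a minimal, standard-library sketch. It is not the authors' implementation. The toy corpus, the lecturer names, the smoothed IDF variant, the fixed initial centroids, the number of clusters, and the rule of scoring each lecturer by their best-matching publication in the nearest cluster are all assumptions made for illustration; the abstract does not specify these details.

```python
import math
from collections import Counter, defaultdict

# Hypothetical toy corpus: (lecturer, publication title) pairs.
publications = [
    ("Lecturer A", "deep learning image classification convolutional network"),
    ("Lecturer A", "image segmentation deep neural network"),
    ("Lecturer B", "database query optimization index"),
    ("Lecturer B", "distributed database transaction index"),
    ("Lecturer C", "neural network speech recognition"),
]

def tokenize(text):
    return text.lower().split()

def fit_idf(docs):
    # Smoothed inverse document frequency (an assumed variant).
    n = len(docs)
    df = Counter(t for d in docs for t in set(tokenize(d)))
    return {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}

def tfidf(text, idf):
    # Sparse, L2-normalized TF-IDF vector; out-of-vocabulary terms are dropped.
    counts = Counter(t for t in tokenize(text) if t in idf)
    vec = {t: c * idf[t] for t, c in counts.items()}
    norm = math.sqrt(sum(w * w for w in vec.values()))
    return {t: w / norm for t, w in vec.items()} if norm else {}

def cosine(a, b):
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def sq_dist(a, b):
    return sum((a.get(t, 0.0) - b.get(t, 0.0)) ** 2 for t in set(a) | set(b))

def mean(vectors):
    total = defaultdict(float)
    for v in vectors:
        for t, w in v.items():
            total[t] += w
    return {t: w / len(vectors) for t, w in total.items()}

def kmeans(vectors, init, max_iter=20):
    # Plain K-means; `init` lists the documents used as starting centroids
    # (fixed here so the example is deterministic).
    centroids = [dict(vectors[i]) for i in init]
    labels = None
    for _ in range(max_iter):
        new = [min(range(len(centroids)), key=lambda c: sq_dist(v, centroids[c]))
               for v in vectors]
        if new == labels:
            break
        labels = new
        for c in range(len(centroids)):
            members = [v for v, l in zip(vectors, labels) if l == c]
            if members:
                centroids[c] = mean(members)
    return labels, centroids

def recommend(proposal, idf, vectors, labels, centroids, owners):
    # Pick the cluster closest to the proposal, then rank the lecturers in
    # that cluster by their best-matching publication (assumed scoring rule).
    query = tfidf(proposal, idf)
    best = max(range(len(centroids)), key=lambda c: cosine(query, centroids[c]))
    scores = {}
    for vec, label, owner in zip(vectors, labels, owners):
        if label == best:
            scores[owner] = max(scores.get(owner, 0.0), cosine(query, vec))
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

owners = [name for name, _ in publications]
texts = [title for _, title in publications]
idf = fit_idf(texts)
vectors = [tfidf(t, idf) for t in texts]
labels, centroids = kmeans(vectors, init=(0, 2))
ranking = recommend("convolutional network for image classification",
                    idf, vectors, labels, centroids, owners)
print(labels)
print(ranking)
```

In this toy run the proposal falls into the cluster of machine-learning titles, so only lecturers with publications in that cluster are ranked. A production system would also need the text preprocessing steps (for example stemming and stop-word removal) and a principled choice of the number of clusters, neither of which is shown here.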
Article Details

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.