Clustering-Based Identification of Student Support Needs in Higher Education Transition
Abstract
The transition from secondary to higher education is a critical phase shaped by both academic readiness and socio-economic conditions. This study proposes a clustering-based approach to identifying student support needs during this transition by analyzing multidimensional student profiles. Using secondary data from 1,226 senior high school students, three unsupervised clustering algorithms (K-Means, DBSCAN, and BIRCH) were applied to academic performance and socio-economic variables. Cluster quality was assessed using internal validation metrics, including the Silhouette Score, the Davies–Bouldin Index, and the Calinski–Harabasz Index. The results indicate that clustering-based methods provide richer insights than traditional rule-based approaches by capturing heterogeneous student profiles and revealing atypical cases. Among the evaluated algorithms, BIRCH showed the most balanced performance in terms of cluster compactness and separation, K-Means offered stable and interpretable results, and DBSCAN was effective at identifying outliers. Interpreted within the college readiness framework, the identified clusters highlight differentiated student support needs, enabling more targeted and equitable intervention strategies. These findings underscore the potential of educational data mining to support data-driven decision-making in facilitating students’ transition to higher education.
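The comparison described in the abstract can be illustrated with a minimal sketch, assuming scikit-learn is available. The study's dataset, feature set, and hyperparameters are not given here, so the synthetic data from `make_blobs`, the cluster count of 3, and the `eps`, `min_samples`, and `threshold` values are all illustrative assumptions rather than the authors' settings. The hypothetical `evaluate` helper drops DBSCAN noise points before scoring, which is one possible convention among several.

```python
from sklearn.cluster import KMeans, DBSCAN, Birch
from sklearn.datasets import make_blobs
from sklearn.metrics import (
    calinski_harabasz_score,
    davies_bouldin_score,
    silhouette_score,
)
from sklearn.preprocessing import StandardScaler


def evaluate(X, labels):
    """Score a clustering with three internal metrics.

    Points labelled -1 (DBSCAN noise) are excluded before scoring.
    Returns None when fewer than two clusters remain, because the
    metrics are undefined in that case.
    """
    mask = labels != -1
    n_clusters = len(set(labels[mask]))
    if n_clusters < 2:
        return None
    return {
        "n_clusters": n_clusters,
        "n_noise": int((~mask).sum()),
        "silhouette": float(silhouette_score(X[mask], labels[mask])),
        "davies_bouldin": float(davies_bouldin_score(X[mask], labels[mask])),
        "calinski_harabasz": float(calinski_harabasz_score(X[mask], labels[mask])),
    }


# Synthetic stand-in for student profiles (assumption: 4 numeric features).
# Only the sample size mirrors the abstract; the values are random.
X_raw, _ = make_blobs(n_samples=1226, centers=3, n_features=4, random_state=0)

# Standardize so that no single variable dominates the distance computations.
X = StandardScaler().fit_transform(X_raw)

# Illustrative hyperparameters, not the study's settings.
models = {
    "K-Means": KMeans(n_clusters=3, n_init=10, random_state=0),
    "DBSCAN": DBSCAN(eps=0.5, min_samples=5),
    "BIRCH": Birch(n_clusters=3, threshold=0.3),
}

results = {name: evaluate(X, model.fit_predict(X)) for name, model in models.items()}

for name, scores in results.items():
    print(name, scores)
```

When reading the scores, a higher Silhouette Score and Calinski–Harabasz Index indicate more compact, better separated clusters, while a lower Davies–Bouldin Index is better. The `n_noise` count reports how many points DBSCAN left unassigned, which corresponds to the atypical cases the abstract mentions.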
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.