Comparison Of The Performance Of C4.5 And Naive Bayes Algorithms For Student Graduation Prediction
DOI: https://doi.org/10.24014/coreit.v9i2.24931
Abstract
As technology has advanced, data storage capacity has grown steadily. Educational institutions are among the organizations that hold large volumes of data, and they use it to obtain information, especially about students. Student data has many attributes, which makes it possible to predict outcomes such as student performance, scholarship eligibility, and graduation. Data mining methods in education are classified into five dimensions, one of which is prediction, that is, estimating output values from input data. In this study, which covers every stage from initial preparation through testing, the C4.5 algorithm achieved higher accuracy than Naïve Bayes because, in its classification stage, C4.5 processes the attribute data one attribute at a time. Naïve Bayes, by contrast, is influenced by the amount of data used and by the ratio of training data to testing data. The feasibility of the resulting models is supported by the high accuracy, precision, recall, and AUC obtained from both algorithms. The C4.5 algorithm achieved an accuracy of 79.91%, a precision of 89.06%, a recall of 81.38%, and an AUC of 0.823. Naïve Bayes achieved an accuracy of 76.95%, a precision of 75.95%, a recall of 98.38%, and an AUC of 0.838.
Keywords: Graduation, Prediction, Data Mining, C4.5, Naïve Bayes
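The four evaluation measures reported in the abstract can be illustrated with a minimal Python sketch. It assumes a binary task in which the positive class means "graduates on time"; the class `ConfusionCounts`, the function names, and all counts and scores below are hypothetical and are not taken from the study's data.

```python
from dataclasses import dataclass


@dataclass
class ConfusionCounts:
    """Binary confusion-matrix counts (positive class assumed: graduates on time)."""
    tp: int  # predicted positive, actually positive
    fp: int  # predicted positive, actually negative
    fn: int  # predicted negative, actually positive
    tn: int  # predicted negative, actually negative


def accuracy(c: ConfusionCounts) -> float:
    """Share of all predictions that are correct."""
    return (c.tp + c.tn) / (c.tp + c.fp + c.fn + c.tn)


def precision(c: ConfusionCounts) -> float:
    """Share of predicted positives that are truly positive."""
    return c.tp / (c.tp + c.fp)


def recall(c: ConfusionCounts) -> float:
    """Share of true positives that the model finds."""
    return c.tp / (c.tp + c.fn)


def auc(labels: list[int], scores: list[float]) -> float:
    """AUC as the probability that a random positive outranks a random negative.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical counts and scores, for illustration only.
toy = ConfusionCounts(tp=80, fp=10, fn=20, tn=40)
toy_labels = [1, 1, 1, 0, 0]
toy_scores = [0.9, 0.8, 0.4, 0.5, 0.1]

print(f"accuracy={accuracy(toy):.4f}")
print(f"precision={precision(toy):.4f}")
print(f"recall={recall(toy):.4f}")
print(f"auc={auc(toy_labels, toy_scores):.4f}")
```

Because precision and recall share the true-positive count but divide by different totals, a model can score high on one and lower on the other, which is the pattern the two algorithms show in the reported figures.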
License
The Authors submitting a manuscript do so on the understanding that if accepted for publication, copyright of the article shall be assigned to CoreIT journal and published by Informatics Engineering Department Universitas Islam Negeri Sultan Syarif Kasim Riau as publisher of the journal.
Authors who publish with this journal agree to the following terms:
Authors automatically transfer the copyright to the journal and grant the journal right of first publication, with the work simultaneously licensed under a Creative Commons Attribution-ShareAlike (CC BY-SA) license that allows others to share the work with an acknowledgement of the work's authorship and initial publication in this journal.
Authors are able to enter into separate permission for non-exclusive distribution of the journal's published version of the work (e.g., post it to an institutional repository or publish it in a book), with an acknowledgement of its initial publication in this journal.
Authors are permitted and encouraged to post their work online (e.g., in institutional repositories or on their website) prior to and during the submission process, as it can lead to productive exchanges, as well as earlier and greater citation of published work (See The Effect of Open Access).