Jurnal Kajian Komputasi Inovatif https://oaj.jurnalhst.com/index.php/jkki id-ID Jurnal Kajian Komputasi Inovatif BREAST CANCER CLASSIFICATION USING MACHINE LEARNING https://oaj.jurnalhst.com/index.php/jkki/article/view/19766 <p><em>Breast cancer remains one of the most prevalent malignancies, and early, accurate diagnosis is critical to improving patient outcomes. This study compares five supervised machine learning algorithms, namely Support Vector Machine (SVM), Decision Tree (DT), k-Nearest Neighbors (KNN), Random Forest (RF), and Logistic Regression (LR), for automated breast cancer classification on the Wisconsin Diagnostic Breast Cancer (WDBC) dataset. The dataset contains 569 samples with 30 numerical features extracted from digitized fine needle aspirate (FNA) images, each labeled as benign or malignant. The experimental protocol uses an 80/20 stratified train–test split, feature standardization for scale-sensitive models, and RandomizedSearchCV with 5-fold cross-validation for hyperparameter optimization. Models are evaluated using accuracy, precision, recall (sensitivity), specificity, F1-score, ROC-AUC, confusion matrices, and cross-validation statistics, complemented by approximate 95% confidence intervals and McNemar’s test for pairwise comparison. The optimized SVM with a radial basis function kernel achieves a test accuracy of 98.25%, precision of 100%, recall of 95.24%, specificity of 100%, and ROC-AUC of 0.9960, outperforming the other models, with statistically significant improvements over DT and KNN. Feature importance analysis from the tree-based models highlights the “worst” size and shape descriptors (area_worst, perimeter_worst, radius_worst, concave_points_worst) as dominant predictors, consistent with the cytopathological understanding of malignant nuclei. 
The results demonstrate that properly tuned traditional models can provide robust and interpretable performance on tabular medical data, and they establish a reproducible baseline for future research in breast cancer classification.</em></p> I Gede Wahyu Surya Dharma I Gede Karang Komala Putra Copyright (c) 2026 Jurnal Kajian Komputasi Inovatif 2026-02-28 2026-02-28 17 2 IMPLEMENTATION OF THE NAÏVE BAYES ALGORITHM IN DIABETES CLASSIFICATION https://oaj.jurnalhst.com/index.php/jkki/article/view/20128 <p><em>Diabetes Mellitus is a major global health threat that requires precise early detection to minimize the risk of chronic complications. However, the effectiveness of classification models is often limited by the non-ideal characteristics of medical record data. This study analyzes the data characteristics and evaluates the performance of the Gaussian Naïve Bayes algorithm in classifying diabetes risk among patients at Dirgahayu Hospital, Samarinda. The proposed methodology comprises a comprehensive preprocessing pipeline: mean imputation to fill missing values while preserving data integrity, followed by Z-score transformation to normalize the scale of the clinical features so that the probability calculations in the Naïve Bayes algorithm are more objective. To mitigate prediction bias caused by data imbalance, the Synthetic Minority Over-sampling Technique (SMOTE) is applied to the training data. Systematic experiments then compare three train/test split ratios: 90:10, 80:20, and 70:30. The results confirm that combining Naïve Bayes with SMOTE yields a stable improvement in classification performance. Among the evaluated scenarios, the 80:20 split produces the best model, with an accuracy of 83.19%, precision of 79.17%, recall of 82.61%, and F1-score of 80.85%. 
The findings indicate that appropriate preprocessing and synthetic data balancing are effective in enhancing the reliability of computational diagnosis, so the approach can serve as a decision-support instrument for healthcare professionals in hospital settings.</em></p> Arini Diah Vita Loka Wawan Joko Pranoto Azhima Yoga Siswa Copyright (c) 2026 Jurnal Kajian Komputasi Inovatif 2026-02-28 2026-02-28 17 2
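The preprocessing order described in the diabetes abstract above (mean imputation, Z-score transformation, then SMOTE on the training data only) might be sketched roughly as follows. This is a minimal standard-library illustration and not the authors' implementation: the function names, the toy data, the neighbour count `k=2`, and the choice to learn the imputation and scaling statistics from the training split alone are all assumptions made for the example.

```python
import math
import random
from statistics import fmean, pstdev


def fit_preprocessor(rows):
    """Learn per-feature mean and std from the training rows only."""
    columns = list(zip(*rows))
    # Mean of the observed (non-missing) values in each feature column.
    means = [fmean([v for v in col if v is not None]) for col in columns]
    stds = []
    for col, mu in zip(columns, means):
        filled = [mu if v is None else v for v in col]
        stds.append(pstdev(filled) or 1.0)  # fall back to 1.0 for constant columns
    return means, stds


def transform(rows, means, stds):
    """Mean-impute missing values (None), then apply the Z-score (x - mean) / std."""
    return [
        [((mu if v is None else v) - mu) / sd for v, mu, sd in zip(row, means, stds)]
        for row in rows
    ]


def smote(minority, n_new, k=2, seed=0):
    """SMOTE-style oversampling: interpolate between a minority sample
    and one of its k nearest minority-class neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(base, p),
        )[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> other
        synthetic.append([b + gap * (o - b) for b, o in zip(base, other)])
    return synthetic


# Hypothetical toy training split: two features, None marks a missing value.
X_train = [
    [1.0, 10.0], [2.0, None], [3.0, 30.0], [4.0, 40.0], [5.0, 50.0],
    [8.0, 80.0], [9.0, None], [10.0, 100.0],
]
y_train = [0, 0, 0, 0, 0, 1, 1, 1]  # class 1 is the minority

means, stds = fit_preprocessor(X_train)
X_scaled = transform(X_train, means, stds)

# Oversample the minority class until both classes have the same size.
minority = [x for x, y in zip(X_scaled, y_train) if y == 1]
majority_count = sum(1 for y in y_train if y == 0)
new_points = smote(minority, majority_count - len(minority))

X_balanced = X_scaled + new_points
y_balanced = y_train + [1] * len(new_points)
```

A held-out test split would be passed through `transform` with the same `means` and `stds` and would never be oversampled, so that the reported metrics reflect the original class distribution. In practice a library implementation such as the SMOTE class in imbalanced-learn would normally replace the hand-written `smote` function.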