Application of Cross-Validation Techniques to Handle Overfitting in a Case Study of Decision Tree Implementation for Lung Cancer Prediction

Authors

  • Faurika Faurika, Institut Teknologi, Sains, dan Kesehatan RS. DR. Soepraoen Kesdam V/BRW
  • Ahsanun Naseh Khudori, Institut Teknologi, Sains, dan Kesehatan RS. DR. Soepraoen Kesdam V/BRW
  • M. Syauqi Haris, Institut Teknologi, Sains, dan Kesehatan RS. DR. Soepraoen Kesdam V/BRW

DOI:

https://doi.org/10.25181/rt.v2i2.3631

Keywords:

Machine learning, Decision tree, Rules, Cross-validation, Lung cancer

Abstract

Lung cancer is a condition caused by cancer cells growing in the lungs. It leads to a weakened immune system, tumors, and other abnormalities that prevent the body from functioning properly. Lung cancer examination relies on technologies such as CT scans and X-rays, but these examinations are relatively expensive and time-consuming. Machine learning can support lung cancer diagnosis: with the large amount of medical data available today, it can recognize patterns in the data and so make the diagnostic process more effective. This study aims to use cross-validation techniques to correct the overfitting found in previous research that applied the decision tree method to lung cancer prediction. We use a public dataset from Data World with 25 attributes and 1,000 records. The results of this research are rules obtained from the decision tree, which, when evaluated, yield 96.7% accuracy, 96.7% precision, 96.7% recall, and a 96.7% F1-score. These results indicate that the decision tree method performs well in predicting lung cancer early and that cross-validation can mitigate overfitting in decision trees, giving more general and stable results.
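
The evaluation described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes scikit-learn, replaces the Data World dataset with synthetic data of the same size (1,000 records), and picks the number of folds, the tree depth, the binary target, and macro averaging purely for illustration. Comparing training scores with held-out fold scores is one common way to see whether a tree is overfitting.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import StratifiedKFold, cross_validate
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the real dataset (assumption): 1,000 records,
# 24 features plus a binary target. The actual data are not reproduced here.
X, y = make_classification(
    n_samples=1000, n_features=24, n_informative=10, random_state=0
)

# Hypothetical settings: the depth limit and 10 folds are illustrative choices,
# not values reported in the abstract.
model = DecisionTreeClassifier(max_depth=5, random_state=0)
cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)

# The four metrics named in the abstract, macro-averaged over classes.
scoring = ["accuracy", "precision_macro", "recall_macro", "f1_macro"]
results = cross_validate(
    model, X, y, cv=cv, scoring=scoring, return_train_score=True
)

# A large gap between training and held-out scores suggests overfitting.
for name in scoring:
    train_mean = results[f"train_{name}"].mean()
    test_mean = results[f"test_{name}"].mean()
    print(f"{name}: train={train_mean:.3f}, held-out={test_mean:.3f}")
```

Because each record is held out exactly once across the folds, the averaged held-out scores give a more stable estimate of generalization than a single train/test split.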

Published

2024-07-19

Section

Articles