Heart Disease Prediction Tool

Documentation and Technical Specifications

Project Overview

This project implements machine learning classifiers (K-Nearest Neighbors, Random Forest, and XGBoost) to predict heart disease risk from patient risk factors. The models are trained and validated on the "Heart Disease Dataset" from Kaggle.

Key Problems

Leading Cause of Death

Coronary heart disease remains one of the leading causes of death worldwide.

Risk Awareness Gap

Awareness of critical risk factors remains limited among the general population.

Technology Integration

Data-driven prediction technologies remain underused in healthcare.

Statistical Support

WHO Statistics

Cardiovascular diseases account for approximately 30% of global deaths. In Indonesia, hypertension prevalence among adults aged 18 and over is 34.1% (Riskesdas 2018).

JAMA Research

Over 70% of heart disease risk is attributed to lifestyle factors according to the Journal of the American Medical Association.

Dataset Attributes

ATTRIBUTE   DESCRIPTION
age         Age in years
sex         Sex (1 = male; 0 = female)
cp          Chest pain type
trestbps    Resting blood pressure (in mm Hg)
chol        Serum cholesterol (in mg/dl)
fbs         Fasting blood sugar > 120 mg/dl (1 = true; 0 = false)
restecg     Resting electrocardiographic results
thalach     Maximum heart rate achieved during exercise
exang       Exercise-induced angina (1 = yes; 0 = no)
oldpeak     ST depression induced by exercise relative to rest
slope       Slope of the peak exercise ST segment
ca          Number of major vessels (0-3) colored by fluoroscopy
thal        3 = normal; 6 = fixed defect; 7 = reversible defect
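
To make the attribute coding concrete, the following is a minimal sketch of a record check built only from the codings the table spells out (the binary flags and the 0-3 range of ca). The function name validate_record and the example values are hypothetical and are not taken from the project or the dataset.

```python
# Attribute names as listed in the table above.
FIELDS = ("age", "sex", "cp", "trestbps", "chol", "fbs", "restecg",
          "thalach", "exang", "oldpeak", "slope", "ca", "thal")

# Attributes the table documents as 0/1 flags.
BINARY_FIELDS = ("sex", "fbs", "exang")


def validate_record(record):
    """Return a list of problems found in one patient record (empty if none)."""
    problems = []
    for name in FIELDS:
        if name not in record:
            problems.append(f"missing attribute: {name}")
    for name in BINARY_FIELDS:
        if name in record and record[name] not in (0, 1):
            problems.append(f"{name} must be 0 or 1, got {record[name]!r}")
    # The table gives 0-3 as the valid range for ca.
    if "ca" in record and record["ca"] not in (0, 1, 2, 3):
        problems.append(f"ca must be between 0 and 3, got {record['ca']!r}")
    return problems


# Made-up values for illustration only.
example = {"age": 63, "sex": 1, "cp": 3, "trestbps": 145, "chol": 233,
           "fbs": 1, "restecg": 0, "thalach": 150, "exang": 0,
           "oldpeak": 2.3, "slope": 0, "ca": 0, "thal": 3}

print(validate_record(example))  # an empty list means no problems were found
```

A check like this could run before training so that miscoded rows are caught early rather than silently skewing the models.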

Model Configuration

K-Nearest Neighbors (KNN)

n_neighbors = 5
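
As a rough illustration, this setting could be applied with scikit-learn as sketched below. The use of scikit-learn, the synthetic data standing in for the 13 attributes, the 80/20 split, and the feature scaling step are all assumptions of this sketch rather than details confirmed by the project.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the 13 dataset attributes (NOT the Kaggle data).
X, y = make_classification(n_samples=300, n_features=13, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# Scaling is an assumption here: KNN is distance-based, so attributes with
# large numeric ranges would otherwise dominate the distance.
knn = make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=5))
knn.fit(X_train, y_train)

knn_predictions = knn.predict(X_test)
print(knn_predictions[:10])
```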

Random Forest

n_estimators = 20, criterion = 'entropy'
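
A minimal sketch of this configuration, assuming scikit-learn's RandomForestClassifier is the implementation in use. The data is synthetic, and pairing the importances with the attribute names is purely illustrative, so the printed ranking says nothing about the real dataset.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

FEATURES = ["age", "sex", "cp", "trestbps", "chol", "fbs", "restecg",
            "thalach", "exang", "oldpeak", "slope", "ca", "thal"]

# Synthetic stand-in with one column per attribute (NOT the Kaggle data).
X, y = make_classification(n_samples=300, n_features=len(FEATURES),
                           random_state=0)

# 20 trees, each split chosen by information gain (entropy).
forest = RandomForestClassifier(n_estimators=20, criterion="entropy",
                                random_state=0)
forest.fit(X, y)

# Impurity-based importance per column, largest first.
importances = dict(zip(FEATURES, forest.feature_importances_))
for name, score in sorted(importances.items(), key=lambda kv: -kv[1]):
    print(f"{name:10s} {score:.3f}")
```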

XGBoost

learning_rate = 0.1, max_depth = 15, n_estimators = 100

Performance Metrics

Accuracy

Measures the percentage of correct predictions

Precision

Measures the proportion of positive predictions that are actually positive

Recall

Measures the proportion of actual positives correctly predicted
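
The three metrics can be computed from the counts of true positives (TP), false positives (FP), false negatives (FN), and true negatives (TN). The sketch below is a plain-Python illustration with made-up labels; the function names are hypothetical and the project may well rely on a library for this instead.

```python
def confusion_counts(y_true, y_pred):
    """Count TP, FP, FN, TN for binary labels (1 = disease, 0 = no disease)."""
    tp = fp = fn = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == 1 and truth == 1:
            tp += 1
        elif pred == 1 and truth == 0:
            fp += 1
        elif pred == 0 and truth == 1:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def accuracy(tp, fp, fn, tn):
    # Correct predictions over all predictions.
    return (tp + tn) / (tp + fp + fn + tn)


def precision(tp, fp):
    # Share of positive predictions that are truly positive.
    return tp / (tp + fp) if (tp + fp) else 0.0


def recall(tp, fn):
    # Share of actual positives that were predicted positive.
    return tp / (tp + fn) if (tp + fn) else 0.0


# Made-up labels for illustration only.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]

tp, fp, fn, tn = confusion_counts(y_true, y_pred)
print(accuracy(tp, fp, fn, tn))  # 0.8
print(precision(tp, fp))         # 0.75
print(recall(tp, fn))            # 0.75
```

In a screening context, recall is often watched most closely, since a false negative means a patient at risk goes unflagged.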