Heart Disease Prediction Tool

Documentation and Technical Specifications

Project Overview

This project implements machine learning classifiers (K-Nearest Neighbors, Random Forest, and XGBoost) to predict heart disease risk from patient risk factors. The models are trained and validated on the "Heart Disease Dataset" from Kaggle.

Key Problems

Leading Cause of Death

Coronary heart disease remains one of the leading causes of death worldwide.

Risk Awareness Gap

Limited awareness of critical risk factors among the general population.

Technology Integration

Underutilization of data-driven prediction technologies in healthcare.

Statistical Support

WHO Statistics

Cardiovascular diseases account for approximately 30% of global deaths. In Indonesia, hypertension prevalence among adults aged 18 and over is 34.1% (Riskesdas 2018).

JAMA Research

Over 70% of heart disease risk is attributed to lifestyle factors according to the Journal of the American Medical Association.

Dataset Attributes

ATTRIBUTE	DESCRIPTION
age	Age in years
sex	Sex (1 = male; 0 = female)
cp	Chest pain type
trestbps	Resting blood pressure (in mm Hg)
chol	Serum cholesterol in mg/dl
fbs	Fasting blood sugar > 120 mg/dl (1 = true; 0 = false)
restecg	Resting electrocardiographic results
thalach	Maximum heart rate achieved during exercise
exang	Exercise induced angina (1 = yes; 0 = no)
oldpeak	ST depression induced by exercise relative to rest
slope	Slope of peak exercise ST segment
ca	Number of major vessels (0-3) colored by fluoroscopy
thal	Thallium stress test result (3 = normal; 6 = fixed defect; 7 = reversible defect)
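
As a rough illustration of how the attributes above could be turned into model input, the sketch below converts one patient record into a numeric feature vector. The column order, the function name `to_feature_vector`, and the example values are assumptions made for illustration; they are not taken from the project or from the dataset.

```python
# Column names come from the attribute table; using them in this
# order is an assumption.
FEATURES = ["age", "sex", "cp", "trestbps", "chol", "fbs", "restecg",
            "thalach", "exang", "oldpeak", "slope", "ca", "thal"]


def to_feature_vector(record):
    """Convert one patient record (dict of column -> value) to a list of floats."""
    missing = [name for name in FEATURES if name not in record]
    if missing:
        raise ValueError(f"missing attributes: {missing}")
    return [float(record[name]) for name in FEATURES]


# Hypothetical record with made-up values, for illustration only.
example = {"age": 54, "sex": 1, "cp": 0, "trestbps": 130, "chol": 246,
           "fbs": 0, "restecg": 1, "thalach": 150, "exang": 0,
           "oldpeak": 1.2, "slope": 1, "ca": 0, "thal": 3}
vector = to_feature_vector(example)
```

Rejecting records with missing attributes up front keeps incomplete input from reaching a classifier silently.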

Model Configuration

K-Nearest Neighbors (KNN)

n_neighbors = 5
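
To show what n_neighbors = 5 means in practice, here is a minimal pure-Python sketch of KNN classification: the predicted label is the majority vote among the five training points closest to the query. The Euclidean distance metric, the function name `knn_predict`, and the toy 2-D points are assumptions for illustration, not the project's actual implementation.

```python
import math
from collections import Counter


def knn_predict(train_X, train_y, query, n_neighbors=5):
    """Predict the label of `query` by majority vote among its nearest neighbors."""
    if n_neighbors > len(train_X):
        raise ValueError("n_neighbors exceeds the number of training samples")
    # Euclidean distance from the query to every training point.
    distances = [(math.dist(x, query), label) for x, label in zip(train_X, train_y)]
    distances.sort(key=lambda pair: pair[0])
    nearest_labels = [label for _, label in distances[:n_neighbors]]
    return Counter(nearest_labels).most_common(1)[0][0]


# Toy 2-D points (made up for illustration): label 1 clusters near (5, 5).
train_X = [(0, 0), (0, 1), (1, 0), (1, 1), (5, 5), (5, 6), (6, 5), (6, 6)]
train_y = [0, 0, 0, 0, 1, 1, 1, 1]
prediction = knn_predict(train_X, train_y, (5.5, 5.5))
```

Because KNN relies on distances, attributes measured on large scales (such as chol) can dominate those on small scales (such as oldpeak), so feature scaling is usually worth considering before fitting.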

Random Forest

n_estimators = 20, criterion = 'entropy'
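
The 'entropy' criterion means candidate splits are scored by how much they reduce Shannon entropy (information gain). The sketch below computes that quantity for a single binary split; the helper names `entropy` and `information_gain` and the toy labels are assumptions for illustration, and a real Random Forest evaluates many such splits across its trees.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())


def information_gain(parent, left, right):
    """Entropy reduction achieved by splitting `parent` into `left` and `right`."""
    weight_left = len(left) / len(parent)
    weight_right = len(right) / len(parent)
    children = weight_left * entropy(left) + weight_right * entropy(right)
    return entropy(parent) - children


# Toy labels (made up): a split that separates the classes perfectly
# versus one that leaves both children as mixed as the parent.
parent = [0, 0, 1, 1]
gain_perfect = information_gain(parent, [0, 0], [1, 1])
gain_useless = information_gain(parent, [0, 1], [0, 1])
```

A split with higher information gain produces purer child nodes, which is why it would be preferred when growing each tree.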

XGBoost

learning_rate = 0.1, max_depth = 15, n_estimators = 100

Performance Metrics

Accuracy

Measures the percentage of correct predictions

Precision

Measures the proportion of positive predictions that are correct

Recall

Measures the proportion of actual positives correctly predicted
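
The three metrics above can be computed directly from true and predicted labels. The following is a minimal sketch for the binary case; the function name `classification_metrics`, the choice of 1 as the positive class, and the example labels are assumptions for illustration.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Return accuracy, precision, and recall for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    correct = sum(1 for t, p in pairs if t == p)

    accuracy = correct / len(pairs)
    # Guard against division by zero when nothing is predicted or
    # labeled positive.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall}


# Made-up labels for illustration: 3 true positives, 2 false positives,
# 1 false negative, 2 true negatives.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 1]
metrics = classification_metrics(y_true, y_pred)
```

In a screening setting, recall is often watched closely because a false negative corresponds to a patient with heart disease who is not flagged.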