Prediction and Control of Stroke by Data Mining

Leila Amini, Reza Azarpazhouh, Mohammad Taghi Farzadfar, Sayed Ali Mousavi, Farahnaz Jazaieri, Fariborz Khorvash, Rasul Norouzi, Nafiseh Toghianifar


Background: Medical science now holds large volumes of data collected on cases of various diseases. By mining these data, physicians can uncover new findings about diseases and the procedures for managing them. This study was performed to predict stroke incidence.

Methods: This study was carried out in Esfahan Al‑Zahra and Mashhad Ghaem hospitals during 2010‑2011. Information on 807 healthy and sick subjects was collected using a standard checklist covering 50 risk factors for stroke, such as history of cardiovascular disease, diabetes, hyperlipidemia, smoking, and alcohol consumption. The data were analyzed in WEKA using two data mining techniques: K‑nearest neighbor and the C4.5 decision tree.
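
As a rough illustration of how a K‑nearest neighbor classifier assigns a label and how classification accuracy is computed, a minimal pure-Python sketch follows. It is not the WEKA implementation used in the study. The three binary features, the toy records, the choice of k = 3, and the function names `knn_predict` and `accuracy` are all assumptions made for illustration, and none of the values come from the study's data.

```python
import math
from collections import Counter


def knn_predict(train_X, train_y, query, k=3):
    """Return the majority label among the k training rows closest to query."""
    # Euclidean distance from the query to every training row.
    distances = [
        (math.dist(row, query), label) for row, label in zip(train_X, train_y)
    ]
    # Sort by distance only; ties keep their original order.
    distances.sort(key=lambda pair: pair[0])
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]


def accuracy(true_labels, predicted_labels):
    """Fraction of predictions that match the true labels."""
    correct = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return correct / len(true_labels)


# Hypothetical toy records (NOT study data). Each row is three 0/1 flags:
# [cardiovascular history, diabetes, smoker].
train_X = [[1, 1, 1], [1, 1, 0], [1, 0, 1], [0, 0, 0], [0, 0, 1], [0, 1, 0]]
train_y = ["stroke", "stroke", "stroke", "healthy", "healthy", "healthy"]

test_X = [[1, 1, 1], [0, 0, 0]]
test_y = ["stroke", "healthy"]

predictions = [knn_predict(train_X, train_y, row, k=3) for row in test_X]
print(predictions, accuracy(test_y, predictions))
```

With real checklist data, features on different scales would normally be normalized before computing distances, and accuracy would be estimated on held-out records rather than on a handful of hand-picked rows.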

Results: The accuracy of the C4.5 decision tree algorithm and K‑nearest neighbor in predicting stroke was 95.42% and 94.18%, respectively.

Conclusions: Both algorithms, the C4.5 decision tree and K‑nearest neighbor, can be used to predict stroke in high‑risk groups.

Keywords: Data mining, decision tree, K‑nearest neighbor, prediction, stroke
