WorldCIST'19 - 7th World Conference on Information Systems and Technologies

Breast cancer classification with missing data imputation

Missing Data (MD) is a common problem when applying Data Mining (DM) to breast cancer datasets, since it degrades the performance of DM classifiers. This study evaluates the influence of MD on three classifiers: the C4.5 decision tree, Support Vector Machine (SVM), and Multi-Layer Perceptron (MLP). For this purpose, 162 experiments (3 classifiers × 3 missingness mechanisms × 9 MD percentages × 2 datasets) were conducted using KNN imputation with three missingness mechanisms (MCAR, MAR and NMAR) and nine MD percentages (from 10% to 90%), applied to two Wisconsin breast cancer datasets. The MD percentage negatively affects classifier performance. MLP achieved the lowest accuracy rates regardless of the MD mechanism or percentage.
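The abstract does not describe the tooling used, so the following is only a minimal sketch of the design under stated assumptions, not the authors' implementation. It enumerates the factor grid to show where the 162 experiments come from, injects MCAR missingness into a small table, and fills the gaps with a basic KNN imputation. The dataset labels, the function names, the distance measure, and the default `k=3` are all hypothetical choices made for illustration; MAR and NMAR injection are omitted.

```python
import itertools
import math
import random

# Factors named in the abstract. The dataset labels are placeholders (assumption).
CLASSIFIERS = ["C4.5", "SVM", "MLP"]
MECHANISMS = ["MCAR", "MAR", "NMAR"]
PERCENTAGES = list(range(10, 100, 10))  # 10%, 20%, ..., 90%
DATASETS = ["wisconsin_a", "wisconsin_b"]  # hypothetical labels


def experiment_grid():
    """Enumerate every (classifier, mechanism, percentage, dataset) combination."""
    return list(itertools.product(CLASSIFIERS, MECHANISMS, PERCENTAGES, DATASETS))


def inject_mcar(rows, percentage, rng):
    """Return a copy of rows with `percentage`% of cells set to None.

    Cells are chosen uniformly at random, independent of any value (MCAR).
    """
    cells = [(i, j) for i in range(len(rows)) for j in range(len(rows[0]))]
    n_missing = round(len(cells) * percentage / 100)
    masked = [list(r) for r in rows]
    for i, j in rng.sample(cells, n_missing):
        masked[i][j] = None
    return masked


def _distance(a, b):
    """Root mean squared difference over features observed in both rows.

    Returns None when the two rows share no observed feature.
    """
    shared = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
    if not shared:
        return None
    return math.sqrt(sum((x - y) ** 2 for x, y in shared) / len(shared))


def knn_impute(rows, k=3):
    """Fill each None with the mean of that feature over the k nearest rows.

    Only rows that observe the feature are candidates. If no candidate is
    comparable, fall back to the column mean of observed values (assumption).
    """
    imputed = [list(r) for r in rows]
    for i, row in enumerate(rows):
        for j in range(len(row)):
            if row[j] is not None:
                continue
            candidates = []
            for m, other in enumerate(rows):
                if m == i or other[j] is None:
                    continue
                d = _distance(row, other)
                if d is not None:
                    candidates.append((d, other[j]))
            if candidates:
                candidates.sort(key=lambda c: c[0])
                nearest = candidates[:k]
                imputed[i][j] = sum(v for _, v in nearest) / len(nearest)
            else:
                observed = [r[j] for r in rows if r[j] is not None]
                if observed:
                    imputed[i][j] = sum(observed) / len(observed)
    return imputed


if __name__ == "__main__":
    print(len(experiment_grid()))  # 3 * 3 * 9 * 2 combinations
    rng = random.Random(0)
    toy = [[float(i + j) for j in range(4)] for i in range(10)]
    print(knn_impute(inject_mcar(toy, 30, rng)))
```

In a full pipeline, each grid entry would inject missingness at the given percentage, impute, train the classifier, and record its accuracy. An established library imputer would normally replace the hand-written `knn_impute` shown here.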

Imane Chlioui
University Mohammed V in Rabat
Morocco

Ibtissam Abnane
University Mohammed V in Rabat
Morocco

Ali Idri
University Mohammed V in Rabat
Morocco

Jose Luis Fernandez-Aleman
University of Murcia
Spain

 

