Research Article Open Access

Missing Values Treatment and Feature Reduction Analysis to Enhance Classification

D. Muralidharan1, K. Renuka1, Mulagala Jaswant1, J. Karthikeyan1 and G.R. Brindha1
  • 1 SASTRA Deemed University, India

Abstract

Datasets may have a large number of features, which makes classification hard and time-consuming. They may also contain irrelevant and noisy features, as well as missing values. Missing values should be treated properly so that classifier accuracy can be improved. There is also a need to reduce the feature set and select only the features the classifier needs. Principal Component Analysis (PCA) is commonly used to reduce the number of features in a dataset, and the reduced components can then be supplied as input to the classifiers. In this study, standard datasets are checked for missing values and classified using Support Vector Machines (SVM) and Naive Bayes, both with and without feature reduction by PCA. The proposed missing value imputation algorithm is then applied to the datasets and the same analysis is carried out. Accuracy is evaluated using a confusion matrix. The results are discussed with respect to the nature of the features and the missing values, and to how different datasets behave when used with machine learning algorithms.
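The two bookkeeping steps named in the abstract, treating missing values and scoring a classifier from a confusion matrix, can be sketched as follows. This is a minimal illustration only: the abstract does not specify the proposed imputation algorithm, so column-mean imputation is used here purely as a stand-in, and the function names `impute_column_means`, `confusion_matrix` and `accuracy` are hypothetical. The PCA, SVM and Naive Bayes steps are omitted.

```python
def impute_column_means(rows):
    """Replace None entries with the mean of the observed values in that column.

    Stand-in only: NOT the imputation algorithm proposed in the paper.
    """
    n_cols = len(rows[0])
    means = []
    for c in range(n_cols):
        observed = [r[c] for r in rows if r[c] is not None]
        # Fall back to 0.0 if a column has no observed values at all.
        means.append(sum(observed) / len(observed) if observed else 0.0)
    return [[means[c] if r[c] is None else r[c] for c in range(n_cols)]
            for r in rows]


def confusion_matrix(y_true, y_pred, labels):
    """matrix[i][j] counts samples whose true label is labels[i]
    and whose predicted label is labels[j]."""
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for t, p in zip(y_true, y_pred):
        matrix[index[t]][index[p]] += 1
    return matrix


def accuracy(matrix):
    """Fraction of samples on the diagonal (correct predictions)."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total if total else 0.0
```

In a comparison of the kind the abstract describes, the same `accuracy` value would be computed for each classifier on the original data, on the imputed data, and on the PCA-reduced versions of both.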

Journal of Computer Science
Volume 16 No. 2, 2020, 211-216

DOI: https://doi.org/10.3844/jcssp.2020.211.216

Submitted On: 7 July 2019 Published On: 20 February 2020

How to Cite: Muralidharan, D., Renuka, K., Jaswant, M., Karthikeyan, J. & Brindha, G. (2020). Missing Values Treatment and Feature Reduction Analysis to Enhance Classification. Journal of Computer Science, 16(2), 211-216. https://doi.org/10.3844/jcssp.2020.211.216

Keywords

  • PCA
  • SVM
  • Naive Bayes
  • Missing Value Treatment