Review Article Open Access

A Survey of Methods for Managing the Classification and Solution of Data Imbalance Problem

Khan Md. Hasib1, Md. Sadiq Iqbal2, Faisal Muhammad Shah1, Jubayer Al Mahmud3, Mahmudul Hasan Popel1, Md. Imran Hossain Showrov4, Shakil Ahmed2 and Obaidur Rahman2
  • 1 Ahsanullah University of Science and Technology, Bangladesh
  • 2 Bangladesh University, Bangladesh
  • 3 University of Dhaka, Bangladesh
  • 4 Institute of Computer Science, Bangladesh

Abstract

Class imbalance is a widespread problem in numerous real-world applications. In such cases, nearly all examples are labeled as one class, called the majority class, while far fewer examples are labeled as the other class, called the minority class, which is usually the more important one. Over the last few years, several lines of research have addressed class imbalance, including data sampling, cost-sensitive analysis, Genetic Programming based models, bagging and boosting. In this survey paper, we review 24 related studies published in 2003, 2008, 2010, 2012 and 2014 to 2019, focusing on the architecture of single, hybrid and ensemble method designs, to understand the current status of machine learning techniques for improving classification performance on class-imbalanced data. The survey also includes a statistical analysis of the classification algorithms under various methods and experimental conditions, as well as the datasets used in the surveyed papers.

Journal of Computer Science
Volume 16 No. 11, 2020, 1546-1557

DOI: https://doi.org/10.3844/jcssp.2020.1546.1557

Submitted On: 24 July 2020 Published On: 13 November 2020

How to Cite: Hasib, K. M., Iqbal, M. S., Shah, F. M., Al Mahmud, J., Popel, M. H., Showrov, M. I. H., Ahmed, S. & Rahman, O. (2020). A Survey of Methods for Managing the Classification and Solution of Data Imbalance Problem. Journal of Computer Science, 16(11), 1546-1557. https://doi.org/10.3844/jcssp.2020.1546.1557

Keywords

  • Class Imbalance
  • Ensemble
  • Survey Methods
  • Hybrid