Constructing a classification model for multi-class data is a critical problem in many areas. In practical applications, multi-class data are often imbalanced, which can yield a classification model with a high overall accuracy rate but a low accuracy rate for the minority class. In practice, however, the minority class is usually more important than the other classes. This study integrates dual response surface methodology, logistic regression analysis, and the desirability function to develop an optimal re-sampling strategy for classifying multi-class imbalanced data. The strategy aims to improve the low classification accuracy rate of the minority class(es) while still maintaining a certain accuracy rate for the majority class(es). Three data sets drawn from the KEEL database were used in the numerical experiments. The results showed that the proposed method improves the classification accuracy rate of the minority class more effectively than previous work.
- Multi-class imbalanced data
- Design of experiments
- Dual response surface methodology
- Re-sampling strategy