KernelADASYN: Kernel based adaptive synthetic data generation for imbalanced learning

Document Type

Conference Proceeding

Date of Original Version



In imbalanced learning, most standard classification algorithms usually fail to properly represent data distribution and provide unfavorable classification performance. More specifically, the decision rule of minority class is usually weaker than majority class, leading to many misclassification of expensive minority class data. Motivated by our previous work ADASYN [1], this paper presents a novel kernel based adaptive synthetic over-sampling approach, named KernelADASYN, for imbalanced data classification problems. The idea is to construct an adaptive over-sampling distribution to generate synthetic minority class data. The adaptive over-sampling distribution is first estimated with kernel density estimation methods and is further weighted by the difficulty level for different minority class data. The classification performance of our proposed adaptive over-sampling approach is evaluated on several real-life benchmarks, specifically on medical and healthcare applications. The experimental results show the competitive classification performance for many real-life imbalanced data classification problems.

Publication Title

2015 IEEE Congress on Evolutionary Computation, CEC 2015 - Proceedings