RAMOBoost: Ranked minority oversampling in boosting

Document Type

Article

Date of Original Version

10-1-2010

Abstract

In recent years, learning from imbalanced data has attracted growing attention from both academia and industry due to the explosive growth of applications that use and produce imbalanced data. However, because of the complex characteristics of imbalanced data, many real-world solutions struggle to provide robust efficiency in learning-based applications. In an effort to address this problem, this paper presents Ranked Minority Oversampling in Boosting (RAMOBoost), which is a RAMO technique based on the idea of adaptive synthetic data generation in an ensemble learning system. Briefly, RAMOBoost adaptively ranks minority class instances at each learning iteration according to a sampling probability distribution that is based on the underlying data distribution, and can adaptively shift the decision boundary toward difficult-to-learn minority and majority class instances by using a hypothesis assessment procedure. Simulation analysis on 19 real-world datasets, assessed over various metrics (including overall accuracy, precision, recall, F-measure, G-mean, and receiver operating characteristic analysis), is used to illustrate the effectiveness of this method. © 2010 IEEE.
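The ranking-and-sampling idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the form of the sampling distribution, so the sketch assumes that a minority instance is ranked by how many majority instances fall among its k nearest neighbours, that this count is squashed with a logistic function, and that synthetic points are produced by linear interpolation toward a nearby minority instance. The function names and the parameters `k` and `alpha` are hypothetical, and the boosting loop and hypothesis assessment procedure are omitted entirely.

```python
import math
import random


def sampling_distribution(minority, majority, k=3, alpha=0.3):
    """Return one sampling probability per minority point.

    Assumption: points with more majority-class neighbours are harder to
    learn and therefore receive a higher probability.
    """
    points = [(p, 1) for p in minority] + [(p, 0) for p in majority]
    scores = []
    for i, x in enumerate(minority):
        # Distances from x to every other point (minority points come first
        # in `points`, so index i is x itself and is skipped).
        others = [
            (math.dist(x, p), label)
            for j, (p, label) in enumerate(points)
            if j != i
        ]
        others.sort(key=lambda t: t[0])
        n_majority = sum(1 for _, label in others[:k] if label == 0)
        # Logistic squashing of the majority-neighbour count (assumed form).
        scores.append(1.0 / (1.0 + math.exp(-alpha * n_majority)))
    total = sum(scores)
    return [s / total for s in scores]


def synthesize(minority, probs, n_new, k=2, rng=None):
    """Draw minority points according to `probs` and interpolate each one
    toward a randomly chosen neighbour among its k nearest minority points.
    Requires at least two minority points."""
    rng = rng or random.Random(0)
    synthetic = []
    for _ in range(n_new):
        i = rng.choices(range(len(minority)), weights=probs, k=1)[0]
        x = minority[i]
        neighbours = sorted(
            (j for j in range(len(minority)) if j != i),
            key=lambda j: math.dist(x, minority[j]),
        )[:k]
        nb = minority[rng.choice(neighbours)]
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append(tuple(a + gap * (b - a) for a, b in zip(x, nb)))
    return synthetic
```

In a boosting setting, a step like this would be repeated at each iteration before training the next weak learner, so that the ranking can change as the ensemble evolves. How that update is performed is described in the paper itself, not in this sketch.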

Publication Title

IEEE Transactions on Neural Networks

Volume

21

Issue

10
