
Feature Sub-Spacing Based Stacking for Effective Imbalance Handling in Sensitive Data


Affiliations
1 Department of Computer Science, Bharathidasan University, India
     



Abstract

Many real-world classification applications suffer from data imbalance, and handling it is crucial to building an effective classification system. This work presents a classifier ensemble, the Feature Sub-spacing Stacking Model (FSSM), designed to operate on highly imbalanced, complex, and sensitive data. FSSM creates subspaces of the features, both to reduce data complexity and to handle the imbalance. At the first level, base models are trained on these feature subspaces. At the second level, a stacking architecture is trained on the predictions of the first-level base models, which improves prediction quality. Experiments were conducted on bank data and on the NSL-KDD data. The results show highly effective performance compared with existing models.
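The two-level idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say how FSSM selects its feature subspaces or which learners it uses, so the random subspace selection, the toy centroid-based base model, the logistic-regression meta-learner, the half/half training split, and the synthetic data below are all assumptions made purely for illustration. All names (`make_subspaces`, `CentroidScorer`, `LogisticMeta`, `fit_stack`, `predict_stack`, `make_data`) are hypothetical.

```python
import math
import random

def make_subspaces(n_features, n_subspaces, size, seed=0):
    # Assumption: subspaces are random feature subsets; the abstract does
    # not state how FSSM actually chooses them.
    rng = random.Random(seed)
    return [sorted(rng.sample(range(n_features), size)) for _ in range(n_subspaces)]

class CentroidScorer:
    """Toy first-level model that only sees one feature subspace."""
    def __init__(self, cols):
        self.cols = cols

    def fit(self, X, y):
        self.centroids = {}
        for label in (0, 1):
            rows = [[x[c] for c in self.cols] for x, t in zip(X, y) if t == label]
            self.centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]
        return self

    def score(self, x):
        v = [x[c] for c in self.cols]
        d = {k: sum((a - b) ** 2 for a, b in zip(v, c))
             for k, c in self.centroids.items()}
        return d[0] - d[1]  # positive when x is closer to the class-1 centroid

def sigmoid(z):
    z = max(-30.0, min(30.0, z))  # clamp so math.exp stays well-behaved
    return 1.0 / (1.0 + math.exp(-z))

class LogisticMeta:
    """Second-level model: logistic regression fitted by gradient ascent."""
    def fit(self, S, y, lr=0.001, epochs=200):
        self.w, self.b = [0.0] * len(S[0]), 0.0
        n = len(S)
        for _ in range(epochs):
            gw, gb = [0.0] * len(self.w), 0.0
            for s, t in zip(S, y):
                err = t - sigmoid(sum(wi * si for wi, si in zip(self.w, s)) + self.b)
                gw = [g + err * si for g, si in zip(gw, s)]
                gb += err
            self.w = [wi + lr * g / n for wi, g in zip(self.w, gw)]
            self.b += lr * gb / n
        return self

    def predict(self, s):
        return int(sum(wi * si for wi, si in zip(self.w, s)) + self.b > 0)

def fit_stack(X, y, subspaces):
    # Assumption: base models are fitted on the first half of the training
    # data and the meta-learner on base-model scores for the second half,
    # so the second level never trains on scores for rows the bases saw.
    half = len(X) // 2
    bases = [CentroidScorer(cols).fit(X[:half], y[:half]) for cols in subspaces]
    S = [[b.score(x) for b in bases] for x in X[half:]]
    meta = LogisticMeta().fit(S, y[half:])
    return bases, meta

def predict_stack(model, X):
    bases, meta = model
    return [meta.predict([b.score(x) for b in bases]) for x in X]

def make_data(n, n_features=6, seed=1):
    # Synthetic stand-in data with 10% positives; not the bank or NSL-KDD data.
    rng = random.Random(seed)
    y = [1 if i % 10 == 0 else 0 for i in range(n)]
    X = [[rng.gauss(3.0 * t, 1.0) for _ in range(n_features)] for t in y]
    return X, y

X_train, y_train = make_data(400, seed=1)
X_test, y_test = make_data(200, seed=2)
subspaces = make_subspaces(n_features=6, n_subspaces=3, size=3)
model = fit_stack(X_train, y_train, subspaces)
preds = predict_stack(model, X_test)
```

Each base model sees only its own subset of columns, and the second level sees only the base-model scores, which mirrors the two-level structure the abstract describes. In practice the toy learners here would presumably be replaced by stronger classifiers, and out-of-fold predictions would be a more common way to build the second-level training set than a single split.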

Keywords

Classification, Data Imbalance, Ensemble, Stacking, Feature Sub-Spacing.

Authors

S. Josephine Theresa
Department of Computer Science, Bharathidasan University, India
D. J. Evanjaline
Department of Computer Science, Bharathidasan University, India

