Supervised Feature Selection based on the Law of Total Variance
Keywords: correlation-based measure, dimensionality reduction, feature selection, law of total variance, classification
Feature selection is a fundamental pre-processing step in machine learning that reduces data dimensionality by removing redundant and irrelevant features. This study proposes a supervised feature selection method that ranks features by relevance using the law of total variance (LTV). Specifically, the LTV is used to quantify the relevance of each feature by analysing the association between the feature and the class label. Six classifiers were employed to evaluate the performance and reliability of the proposed method in terms of classification accuracy. The results showed that a feature subset selected by the proposed method can achieve classification accuracy comparable to that of the full feature set when half or fewer of the original features are retained. The proposed method was also shown to be versatile, as it achieves adequate classification accuracy with all six classifiers, which use different learning schemes. In addition, a comparison with a similar feature selection method (AmRMR) shows that the proposed method yields more accurate classification.
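The abstract does not give the exact scoring formula, so the following is only a minimal sketch of one way an LTV-based relevance score could be computed. The law of total variance states that Var(X) = E[Var(X | Y)] + Var(E[X | Y]); the sketch assumes that relevance is the fraction of a feature's variance explained by the class label, Var(E[X | Y]) / Var(X) (the squared correlation ratio). The function names `ltv_relevance` and `select_top_features` are hypothetical and are not taken from the paper.

```python
import numpy as np


def ltv_relevance(X, y):
    """Score each feature by Var(E[X | Y]) / Var(X).

    Assumption: this ratio stands in for the paper's relevance measure,
    which the abstract does not spell out.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)

    total_var = X.var(axis=0)          # Var(X), population variance (ddof=0)
    overall_mean = X.mean(axis=0)      # E[X]

    # Var(E[X | Y]): class-probability-weighted squared deviation of
    # each class mean from the overall mean.
    between_var = np.zeros(X.shape[1])
    for label in np.unique(y):
        mask = y == label
        class_prob = mask.mean()
        class_mean = X[mask].mean(axis=0)
        between_var += class_prob * (class_mean - overall_mean) ** 2

    # Constant features carry no information; leave their score at 0.
    scores = np.zeros(X.shape[1])
    nonconstant = total_var > 0
    scores[nonconstant] = between_var[nonconstant] / total_var[nonconstant]
    return scores


def select_top_features(X, y, k):
    """Return the indices of the k highest-scoring features."""
    scores = ltv_relevance(X, y)
    return np.argsort(scores)[::-1][:k]
```

Under this assumption a score near 1 means the class label accounts for almost all of the feature's variance, while a score near 0 means the class-conditional means barely differ. Retaining half of the features, as in the evaluation described above, would correspond to calling `select_top_features(X, y, X.shape[1] // 2)`.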
Copyright (c) 2023 The Author(s)
This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.