A STABILITY-AWARE HYBRID FEATURE SELECTION ALGORITHM BASED ON FILTERED SPARSITY AND WRAPPER REFINEMENT
Authors
Rashidov Khusan Shirinboyevich

Abstract
Feature selection is a critical step in supervised machine learning, particularly for medium- to high-dimensional tabular data, where irrelevant, uninformative, and correlated predictors degrade model generalization, interpretability, and runtime. Although filter, wrapper, and embedded methods have been studied extensively, each class suffers from inherent limitations: instability, high computational cost, or an inability to capture dependencies among features. This paper introduces the Filtered Sparse Stability Wrapper (FSSW), a novel stability-oriented hybrid feature selection method that combines statistical relevance filtering, sparsity-inducing embedded regularization, and iterative wrapper refinement within a bootstrap aggregation procedure. FSSW is formulated as a multi-criteria optimization problem that balances predictive accuracy, feature-subset compactness, and selection stability. Its theoretical grounding rests on reduction of the combinatorial search space and stability-aware optimization. The method is validated on benchmark tabular datasets from the UCI Machine Learning Repository, using nested cross-validation to ensure unbiased performance estimates. Experimental results show that FSSW consistently outperforms individual filter, embedded, and wrapper baselines in classification accuracy, F1-score, degree of dimensionality reduction, and stability as measured by the Kuncheva index.
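As one plausible formalization of this multi-criteria view (the scalarized form and weights below are illustrative assumptions, not the paper's stated objective), the selected subset can be written as

\[
S^{*} = \arg\min_{S \subseteq \{1,\dots,n\}} \; \lambda_{1}\,\mathrm{Err}(S) \;+\; \lambda_{2}\,\frac{|S|}{n} \;-\; \lambda_{3}\,\mathrm{Stab}(S),
\]

where \(\mathrm{Err}(S)\) is the cross-validated prediction error of a model trained on subset \(S\), \(|S|/n\) penalizes subset size relative to the \(n\) available features, \(\mathrm{Stab}(S)\) rewards agreement of the selection across bootstrap resamples, and \(\lambda_{1},\lambda_{2},\lambda_{3} \ge 0\) trade off the three criteria.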
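The stability metric named above, Kuncheva's consistency index (Kuncheva, 2007), compares two selected subsets \(A, B \subseteq \{1,\dots,n\}\) of equal size \(k\) with overlap \(r = |A \cap B|\):

\[
I_{C}(A, B) = \frac{rn - k^{2}}{k(n - k)},
\]

which corrects the raw overlap for the amount expected by chance: it equals 1 for identical subsets and is near 0 for random agreement. A selector's overall stability is the average of \(I_{C}\) over all pairs of subsets produced across resampling runs.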
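To make the described filter-embedded-wrapper chain concrete, the following is a minimal sketch of an FSSW-style pipeline in Python with scikit-learn. The dataset, the mutual-information filter, the L1 penalty strength, the bootstrap count, the 0.6 selection-frequency threshold, and the forward-selection wrapper are all illustrative assumptions; the paper's exact scoring functions, thresholds, and refinement schedule may differ.

# A minimal sketch of an FSSW-style hybrid pipeline (illustrative assumptions
# throughout; not the paper's exact implementation).
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.feature_selection import SequentialFeatureSelector, mutual_info_classif
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.preprocessing import StandardScaler
from sklearn.utils import resample

X, y = load_breast_cancer(return_X_y=True)
X = StandardScaler().fit_transform(X)
n_samples, d = X.shape

# Stage 1 (filter): keep the top half of features ranked by mutual information.
mi = mutual_info_classif(X, y, random_state=0)
filtered = np.argsort(mi)[-(d // 2):]

# Stage 2 (embedded + bootstrap aggregation): fit an L1-penalized logistic
# model on B bootstrap resamples and count how often each feature survives.
B = 30
counts = np.zeros(len(filtered))
for b in range(B):
    Xb, yb = resample(X[:, filtered], y, random_state=b)
    model = LogisticRegression(penalty="l1", solver="liblinear", C=0.1)
    model.fit(Xb, yb)
    counts += np.abs(model.coef_.ravel()) > 1e-8
stable = filtered[counts / B >= 0.6]  # selection-frequency threshold (assumed)

# Stage 3 (wrapper): greedy forward refinement over the stable candidates,
# stopping when cross-validated accuracy no longer improves by tol.
sfs = SequentialFeatureSelector(
    LogisticRegression(max_iter=1000),
    n_features_to_select="auto", tol=1e-3, cv=5,
)
sfs.fit(X[:, stable], y)
final = stable[sfs.get_support()]

acc = cross_val_score(LogisticRegression(max_iter=1000), X[:, final], y, cv=5)
print(f"{len(final)} features selected; CV accuracy = {acc.mean():.3f}")

In a full evaluation, this entire pipeline would sit inside the outer loop of a nested cross-validation, so that feature selection never sees the held-out test folds.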
The paper concludes that FSSW offers an effective, interpretable, and practical approach to feature selection for tabular classification tasks.
References:
Guyon, I., & Elisseeff, A. (2003). An introduction to variable and feature selection. Journal of Machine Learning Research, 3, 1157–1182.
Pirgazi, J., Alimoradi, M., Esmaeili Abharian, T., & Olyaee, M. H. (2019). An efficient hybrid filter-wrapper metaheuristic-based gene selection method for high-dimensional datasets. Scientific Reports, 9(1), 18580. https://doi.org/10.1038/s41598-019-54866-2
Nogueira, S., Sechidis, K., & Brown, G. (2018). On the stability of feature selection algorithms. Journal of Machine Learning Research, 18(174), 1–54.
Tibshirani, R. (1996). Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society: Series B (Methodological), 58(1), 267–288.
Meinshausen, N., & Bühlmann, P. (2010). Stability selection. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 72(4), 417–473. https://doi.org/10.1111/j.1467-9868.2010.00740.x
Kuncheva, L. I. (2007, February). A stability index for feature selection. In Artificial Intelligence and Applications (pp. 421–427).
Brown, G., Pocock, A., Zhao, M. J., & Luján, M. (2012). Conditional likelihood maximisation: A unifying framework for information theoretic feature selection. Journal of Machine Learning Research, 13, 27–66.
Guyon, I., Weston, J., Barnhill, S., & Vapnik, V. (2002). Gene selection for cancer classification using support vector machines. Machine Learning, 46(1), 389–422.
Asuncion, A., & Newman, D. (2007). UCI machine learning repository. University of California, Irvine. http://archive.ics.uci.edu/ml
