Deep-gKnock: Nonlinear group-feature selection with deep neural networks
Document Type
Article
Date of Original Version
3-1-2021
Abstract
Feature selection is central to contemporary high-dimensional data analysis. Group structure among features arises naturally in various scientific problems. Many methods have been proposed to incorporate group structure into feature selection, but they are typically restricted to a linear regression setting. To relax the linearity constraint, we design a new Deep Neural Network (DNN) architecture and integrate it with the recently proposed knockoff technique to perform nonlinear group-feature selection with controlled group-wise False Discovery Rate (gFDR). Experimental results on high-dimensional synthetic data demonstrate that our method achieves the highest power among state-of-the-art methods while maintaining accurate gFDR control. Deep-gKnock performs especially well in five situations: (1) a nonlinear relationship between features and response; (2) dimension p greater than sample size n; (3) high between-group correlation; (4) high within-group correlation; (5) a large number of associated groups. Deep-gKnock is also shown to be robust to misspecification of the feature distribution and to changes in the network architecture. Moreover, Deep-gKnock achieves scientifically meaningful group-feature selection results on cutting-edge real-world datasets.
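The gFDR control described in the abstract rests on the knockoff filter's data-dependent threshold, applied to group-wise importance statistics rather than individual features. As a minimal sketch of that selection step only: the function name and the assumption that the statistics W come from a trained DNN are hypothetical, while the thresholding rule itself is the standard knockoff+ rule from the knockoff literature, applied at the group level.

```python
import numpy as np

def knockoff_group_select(W, q=0.1):
    """Select groups from knockoff-style group statistics W.

    W[g] is assumed large and positive when group g looks more important
    than its knockoff copy, and roughly symmetric around zero for null
    groups (e.g., W derived from a DNN's group-wise importances).
    q is the target group-wise false discovery rate (gFDR).
    """
    W = np.asarray(W, dtype=float)
    # Candidate thresholds: the distinct magnitudes of the statistics.
    thresholds = np.sort(np.unique(np.abs(W[W != 0])))
    for t in thresholds:
        # Knockoff+ estimate of the group-wise false discovery proportion.
        fdp = (1 + np.sum(W <= -t)) / max(1, np.sum(W >= t))
        if fdp <= q:
            return np.where(W >= t)[0]   # indices of selected groups
    return np.array([], dtype=int)       # no threshold meets the target

# Example usage with synthetic statistics (hypothetical values):
# selected = knockoff_group_select(np.array([2.1, -0.3, 0.9, -1.2, 3.4]), q=0.2)
```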
Publication Title, e.g., Journal
Neural Networks
Volume
135
Citation/Publisher Attribution
Zhu, Guangyu, and Tingting Zhao. "Deep-gKnock: Nonlinear group-feature selection with deep neural networks." Neural Networks 135 (2021): 139-147. doi: 10.1016/j.neunet.2020.12.004.