Fully Adaptive Elastic-Net (FAElastic) For Gene Selection In High Dimensional Cancer Classification
AUTHOR(S)
Isah Aliyu Kargi, Norazlina bint Ismail, Ismail bin Mohammad
KEYWORDS
Adaptive elastic net, Classification of cancer, Gene selection, Regularized logistic regression
ABSTRACT
Cancer classification from high-dimensional DNA microarray data is a significant field of research. However, because high-dimensional data pose challenges for gene selection and classification, many penalized likelihood methods fail to identify a small subset of significant genes. To address this problem, the present study proposes and applies a Fully Adaptive Elastic-net (FAElastic) model that performs gene selection and estimation of gene coefficients simultaneously. The proposed technique, FAElastic, is assessed in terms of AUC, number of genes selected, sensitivity, specificity, and informedness. The findings, computed on a colon cancer microarray data set, confirm that FAElastic outperforms the other four techniques on the performance metrics considered: (i) number of genes selected, (ii) AUC, (iii) sensitivity and specificity, and (iv) informedness. In addition, FAElastic can be applied in practice to other related high-dimensional data for cancer classification. The proposed FAElastic technique is therefore efficient enough for practical use in medical research.
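The threshold-based metrics named in the abstract can be computed from a binary confusion matrix. The following is a minimal Python sketch, assuming labels coded as 1 (tumor) and 0 (normal) and taking informedness in its standard form, sensitivity + specificity - 1. The function names and the example labels are illustrative and are not taken from the study.

```python
def confusion_counts(y_true, y_pred):
    """Count TP, FN, TN, FP for binary labels coded as 1 (positive) and 0 (negative)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    return tp, fn, tn, fp


def classification_metrics(y_true, y_pred):
    """Return sensitivity, specificity, and informedness for binary predictions."""
    tp, fn, tn, fp = confusion_counts(y_true, y_pred)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must be present in y_true")
    sensitivity = tp / (tp + fn)  # true positive rate
    specificity = tn / (tn + fp)  # true negative rate
    informedness = sensitivity + specificity - 1  # Youden's J statistic
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "informedness": informedness,
    }


# Hypothetical labels for illustration only (not results from the study).
example_true = [1, 1, 1, 1, 0, 0, 0, 0]
example_pred = [1, 1, 1, 0, 0, 0, 1, 0]
example_metrics = classification_metrics(example_true, example_pred)
```

Informedness ranges from -1 to 1, and a value of 0 corresponds to chance-level classification, which makes it a useful complement to sensitivity and specificity when the two classes are imbalanced.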