Gene Selection for Cancer Diagnosis via Iterative Graph Clustering-based Approach

The development of microarray technology has led to the accumulation of large DNA microarray datasets, which allow physicians to examine many aspects of gene expression for cancer diagnosis. As these data accumulate rapidly, classifying high-dimensional DNA microarray data poses considerable challenges for machine learning. Gene selection is a popular and powerful approach for dealing with such high-dimensional cancer data. In this paper, a novel graph clustering-based gene selection approach is developed. The approach has two main objectives: maximizing the relevance of the selected genes and minimizing their redundancy. In each iteration, one subgraph (cluster) is extracted, and appropriate genes are then selected from that cluster using a filter-based measure. Results reported on five cancer datasets indicate that the developed gene selection approach can improve the accuracy of cancer diagnosis.
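
To make the iterative procedure concrete, the following is a minimal sketch of one way such a loop could look. It is not the method as specified in this paper: the abstract does not state the similarity measure, the clustering rule, or the filter measure, so the sketch assumes absolute Pearson correlation as the edge weight, a seed-plus-neighbors rule for extracting each subgraph, and the Fisher score as the filter-based measure. The names select_genes, sim_threshold, and per_cluster are hypothetical.

```python
import math


def mean(values):
    return sum(values) / len(values)


def fisher_score(values, labels):
    """Assumed filter measure: between-class over within-class scatter of one gene."""
    overall = mean(values)
    between = within = 0.0
    for c in set(labels):
        group = [v for v, l in zip(values, labels) if l == c]
        m = mean(group)
        between += len(group) * (m - overall) ** 2
        within += sum((v - m) ** 2 for v in group)
    return between / (within + 1e-12)  # small constant avoids division by zero


def abs_pearson(a, b):
    """Assumed edge weight: absolute Pearson correlation of two expression profiles."""
    ma, mb = mean(a), mean(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    norm = math.sqrt(sum((x - ma) ** 2 for x in a) * sum((y - mb) ** 2 for y in b))
    return abs(cov / norm) if norm > 0 else 0.0


def select_genes(X, y, n_genes, sim_threshold=0.7, per_cluster=1):
    """X: list of samples (rows), each a list of expression values; y: class labels."""
    n = len(X[0])
    genes = [[row[j] for row in X] for j in range(n)]  # one profile per gene
    relevance = [fisher_score(g, y) for g in genes]
    remaining = set(range(n))
    selected = []
    while remaining and len(selected) < n_genes:
        # Seed the next subgraph with the most relevant gene still in the graph.
        seed = max(remaining, key=lambda j: relevance[j])
        # Subgraph: the seed plus every remaining gene strongly similar to it.
        cluster = [j for j in remaining
                   if j == seed or abs_pearson(genes[seed], genes[j]) >= sim_threshold]
        # Relevance maximization: keep only the top-scoring gene(s) of the cluster.
        cluster.sort(key=lambda j: relevance[j], reverse=True)
        selected.extend(cluster[:per_cluster])
        # Redundancy minimization: drop the whole cluster from the graph.
        remaining.difference_update(cluster)
    return selected[:n_genes]
```

Under these assumptions, two highly correlated genes fall into the same subgraph and only the more relevant one is kept, so the selected set tends to contain genes that are informative about the class label without duplicating one another.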