MapReduce based Classification for Microarray data using Parallel Genetic Algorithm

Authors

  • E. Gothai, Associate Professor, Perundurai, Tamilnadu, India
  • P. Aarthi, Research Scholar, Kongu Engineering College, Perundurai, Tamilnadu, India

DOI:

https://doi.org/10.24297/jac.v12i15.2413

Keywords:

MapReduce, Hadoop, Microarray, genes, mutual information, parallel attribute clustering, classification

Abstract

Microarray technology is a high-throughput technique used to measure the expression of thousands of genes. Only a few of these thousands of gene expression values are useful for disease prediction and disease classification in a medical setting. Supervised attribute clustering is used to find the initial groups of coexpressed genes whose joint expression is strongly related to the class label. Mutual information, which draws on the information of the sample categories, measures the similarity among attributes, and redundant and irrelevant attributes are removed on that basis. After the clusters are formed, a Parallel Genetic Algorithm (PGA) is supplied as the mapper function to find the optimal features and so improve class separability. Because this search runs in parallel, diagnosis can be made easier and more effective. Predictive accuracy is estimated with three classifiers: K-nearest neighbours, naive Bayes, and Support Vector Machine. The overall approach, which uses a reducer function, provides excellent predictive capability for accurate medical diagnosis.
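
As a rough illustration of the mutual-information step described in the abstract, the sketch below scores each gene by its mutual information with the class label. It is not the authors' implementation: the function names, the mean-threshold discretisation, and the toy data are all assumptions made for the example, and the clustering, genetic algorithm, and MapReduce stages are left out.

```python
import math
from collections import Counter


def mutual_information(xs, ys):
    """Return I(X;Y) in bits for two equal-length sequences of discrete values."""
    n = len(xs)
    px = Counter(xs)
    py = Counter(ys)
    pxy = Counter(zip(xs, ys))
    mi = 0.0
    for (x, y), count in pxy.items():
        # p(x,y) * log2( p(x,y) / (p(x) * p(y)) ), written with raw counts
        mi += (count / n) * math.log2(count * n / (px[x] * py[y]))
    return mi


def discretize(values):
    """Assumed binarisation: 1 if the value is above the gene's mean, else 0."""
    mean = sum(values) / len(values)
    return [1 if v > mean else 0 for v in values]


def rank_genes(expression, labels):
    """Sort genes by mutual information with the class labels, highest first."""
    scores = {
        gene: mutual_information(discretize(values), labels)
        for gene, values in expression.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Toy data, invented for illustration only (four samples, two classes).
expression = {
    "gene_a": [0.1, 0.2, 0.9, 1.1],  # tracks the class label
    "gene_b": [0.5, 0.9, 0.4, 1.0],  # unrelated to the class label
}
labels = [0, 0, 1, 1]

ranking = rank_genes(expression, labels)
for gene, score in ranking:
    print(f"{gene}: {score:.3f} bits")
```

In this toy case the gene that tracks the class label scores highest, while the unrelated gene scores zero and would be a candidate for removal as irrelevant.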


Author Biography

E. Gothai, Associate Professor, Perundurai, Tamilnadu, India

Department of CSE, Kongu Engineering College


Published

2016-08-01

How to Cite

Gothai, E., & Aarthi, P. (2016). MapReduce based Classification for Microarray data using Parallel Genetic Algorithm. Journal of Advances in Chemistry, 12(15), 4860–4866. https://doi.org/10.24297/jac.v12i15.2413
