Gene Co-AdaBoost: A semi-supervised approach for classifying gene expression data

  • SUNY Buffalo

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

2 Scopus citations

Abstract

Co-training has proven successful in classifying many kinds of data, such as text and web data, that have naturally split views. Using each view as a separate feature set, classifiers can make fewer generalization errors by maximizing their agreement over the unlabeled data. However, this method has limited performance on gene expression data. The first reason is that most gene expression data lacks naturally split views. The second reason is that gene expression datasets usually contain some noisy samples. Furthermore, some semi-supervised algorithms tend to add these misclassified samples to the training set, which misleads the classification. In this paper, a co-training-based algorithm named Gene Co-Adaboost is proposed that uses a limited number of labeled gene expression samples to predict the class variables. The method splits the gene features into relatively independent views and keeps performance stable by using a Cascade Judgment technique to refuse to add unlabeled examples that may be wrongly labeled to the training set. Experiments on four public microarray datasets indicate that Gene Co-Adaboost effectively uses the unlabeled samples to improve classification accuracy.
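To make the general idea concrete, here is a minimal sketch of two-view co-training with a conservative acceptance rule. It is not the paper's algorithm: the abstract does not specify how the views are split, how AdaBoost is used, or how Cascade Judgment works. The sketch assumes a simple nearest-centroid classifier per view, takes the two views as caller-supplied column lists, and accepts an unlabeled sample only when both views agree and are confident. The names `co_train`, `view_a`, `view_b`, and `min_margin` are hypothetical.

```python
def centroid_fit(X, y):
    """Fit a nearest-centroid classifier: one mean vector per class."""
    sums, counts = {}, {}
    for row, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(row))
        for j, value in enumerate(row):
            acc[j] += value
        counts[label] = counts.get(label, 0) + 1
    return {c: [s / counts[c] for s in sums[c]] for c in sums}


def centroid_predict(model, row):
    """Return (label, margin): nearest class and its squared-distance gap to the runner-up."""
    dists = sorted(
        (sum((a - b) ** 2 for a, b in zip(row, centre)), label)
        for label, centre in model.items()
    )
    margin = dists[1][0] - dists[0][0] if len(dists) > 1 else float("inf")
    return dists[0][1], margin


def take(row, cols):
    """Project a sample onto one view (a list of feature indices)."""
    return [row[j] for j in cols]


def co_train(X_lab, y_lab, X_unl, view_a, view_b, min_margin=1.0, rounds=5):
    """Grow the labeled set with unlabeled samples that both views label
    identically and confidently. This agreement-plus-margin filter is an
    assumed stand-in for a rejection step, not the paper's Cascade Judgment."""
    X_lab, y_lab, pool = list(X_lab), list(y_lab), list(X_unl)
    for _ in range(rounds):
        model_a = centroid_fit([take(r, view_a) for r in X_lab], y_lab)
        model_b = centroid_fit([take(r, view_b) for r in X_lab], y_lab)
        remaining, added = [], 0
        for row in pool:
            label_a, margin_a = centroid_predict(model_a, take(row, view_a))
            label_b, margin_b = centroid_predict(model_b, take(row, view_b))
            if label_a == label_b and min(margin_a, margin_b) >= min_margin:
                X_lab.append(row)
                y_lab.append(label_a)
                added += 1
            else:
                remaining.append(row)  # refuse doubtful samples
        pool = remaining
        if added == 0:
            break  # nothing new was accepted, so the models would not change
    return X_lab, y_lab, pool
```

A sample whose two views disagree stays in the unlabeled pool and never enters the training set, which is the behavior the abstract motivates: a wrongly labeled addition costs more than a skipped one.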

Original language: English
Title of host publication: 2011 ACM Conference on Bioinformatics, Computational Biology and Biomedicine, BCB 2011
Pages: 531-535
Number of pages: 5
State: Published - 2011
Event: 2011 ACM Conference on Bioinformatics, Computational Biology and Biomedicine, ACM-BCB 2011 - Chicago, IL, United States
Duration: Aug 1 2011 to Aug 3 2011

Publication series

Name: 2011 ACM Conference on Bioinformatics, Computational Biology and Biomedicine, BCB 2011

Conference

Conference: 2011 ACM Conference on Bioinformatics, Computational Biology and Biomedicine, ACM-BCB 2011
Country/Territory: United States
City: Chicago, IL
Period: 08/1/11 to 08/3/11

Keywords

  • Cascade judgment
  • Co-training
  • Gene Co-Adaboost
  • Gene features split
  • Multi-views
