University of Surrey


A novel feature selection approach for biomedical data classification.

Peng, Y, Wu, Z and Jiang, J (2010) A novel feature selection approach for biomedical data classification. J Biomed Inform, 43 (1). pp. 15-23.

Full text not available from this repository.

Abstract

This paper presents a novel feature selection approach to address the high dimensionality of biomedical data in classification. Extensive research has been performed in pattern recognition and machine learning, and dozens of feature selection methods have been developed in the literature; these fall into three main categories: filter, wrapper and hybrid approaches. Filter methods apply an independent test without involving any learning algorithm, while wrapper methods require a predetermined learning algorithm for feature subset evaluation. Filter and wrapper methods each have drawbacks and are complementary: filter approaches have low computational cost but insufficient reliability in classification, while wrapper methods tend to have superior classification accuracy but require great computational power. The approach proposed in this paper integrates filter and wrapper methods into a sequential search procedure with the aim of improving the classification performance of the selected features. The proposed approach is characterized by (1) a pre-selection step that makes the search for feature subsets with improved classification performance more effective and (2) the use of Receiver Operating Characteristic (ROC) curves to characterize the performance of individual features and feature subsets in classification. In a comparison with the conventional Sequential Forward Floating Search (SFFS), which has been considered one of the best feature selection methods in the literature, experimental results demonstrate that (i) the proposed approach selects feature subsets with better classification performance than the SFFS method and (ii) the integrated feature pre-selection mechanism, by means of a new selection criterion and filter method, helps to solve over-fitting problems and reduces the chance of settling on a locally optimal solution.
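
The filter-then-wrapper idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm: the abstract does not give their selection criterion, classifier, or search details, so everything below is an assumption made for illustration. The filter step keeps features whose individual ROC AUC reaches a hypothetical threshold of 0.7, the wrapper step scores a subset with a simple mean-difference linear score evaluated on the training data (no cross-validation), and the search is plain sequential forward selection without the floating (backward) step of SFFS. The function names and the toy data are likewise hypothetical.

```python
def auc(scores, labels):
    """ROC AUC via the Mann-Whitney statistic; ties count as 0.5."""
    pos = [s for s, t in zip(scores, labels) if t == 1]
    neg = [s for s, t in zip(scores, labels) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def preselect(X, y, threshold=0.7):
    """Filter step (assumed criterion): keep features whose single-feature
    AUC, in either direction, is at least `threshold`."""
    keep = []
    for j in range(len(X[0])):
        a = auc([row[j] for row in X], y)
        if max(a, 1.0 - a) >= threshold:
            keep.append(j)
    return keep


def subset_auc(X, y, subset):
    """Wrapper score (assumed learner): project each sample onto the
    difference of class means over `subset`, then take the AUC."""
    pos = [r for r, t in zip(X, y) if t == 1]
    neg = [r for r, t in zip(X, y) if t == 0]
    w = [sum(r[j] for r in pos) / len(pos) - sum(r[j] for r in neg) / len(neg)
         for j in subset]
    scores = [sum(wj * r[j] for wj, j in zip(w, subset)) for r in X]
    return auc(scores, y)


def forward_select(X, y, candidates):
    """Greedy forward search: add the best feature while the AUC improves."""
    selected, best = [], 0.0
    remaining = list(candidates)
    while remaining:
        scored = [(subset_auc(X, y, selected + [j]), j) for j in remaining]
        top_score, top_j = max(scored, key=lambda t: t[0])
        if top_score <= best:
            break  # no candidate improves the current subset
        selected.append(top_j)
        remaining.remove(top_j)
        best = top_score
    return selected, best


# Toy data (made up): 8 samples, 4 features, binary labels.
y = [0, 0, 0, 0, 1, 1, 1, 1]
X = list(zip(
    [1, 2, 3, 4, 5, 6, 7, 8],   # feature 0: increases with the label
    [5, 1, 7, 3, 6, 2, 8, 4],   # feature 1: mostly noise
    [8, 7, 6, 5, 4, 3, 2, 1],   # feature 2: decreases with the label
    [1, 1, 1, 1, 1, 1, 1, 1],   # feature 3: constant
))

candidates = preselect(X, y)
selected, score = forward_select(X, y, candidates)
print("pre-selected:", candidates)
print("selected:", selected, "AUC:", score)
```

The pre-selection step shrinks the candidate pool before the more expensive wrapper evaluations run, which is the cost trade-off between filter and wrapper methods that the abstract describes. A realistic experiment would estimate the wrapper score on held-out data rather than on the training samples used here.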

Item Type: Article
Authors :
Name | Email | ORCID
Peng, Y | UNSPECIFIED | UNSPECIFIED
Wu, Z | UNSPECIFIED | UNSPECIFIED
Jiang, J | jianmin.jiang@surrey.ac.uk | UNSPECIFIED
Date : February 2010
Identification Number : 10.1016/j.jbi.2009.07.008
Uncontrolled Keywords : Algorithms; Area Under Curve; Artificial Intelligence; Automatic Data Processing; Breast Neoplasms; Computational Biology; Female; Gene Expression Profiling; Heart; Humans; Neural Networks (Computer); Oligonucleotide Array Sequence Analysis; Pattern Recognition, Automated; ROC Curve; Software
Depositing User : Symplectic Elements
Date Deposited : 17 May 2017 12:24
Last Modified : 17 May 2017 15:03
URI: http://epubs.surrey.ac.uk/id/eprint/835190
