Title: A cross-validation study to select a classification procedure for clinical diagnosis based on proteomic mass spectrometry
Author:
Abstract: We present an approach to construct a classification rule based on the mass spectrometry data provided by the organizers of the "Classification Competition on Clinical Mass Spectrometry Proteomic Diagnosis Data." Before constructing a classification rule, we attempted to pre-process the data and to select features of the spectra that were likely due to true biological signals (i.e., peptides/proteins). As a result, we selected a set of 92 features. To construct the classification rule, we considered eight methods for selecting a subset of the features, combined with seven classification methods. The performance of the resulting 56 combinations was evaluated by using a cross-validation procedure with 1000 re-sampled data sets. The best result, as indicated by the lowest overall misclassification rate, was obtained by using the whole set of 92 features as the input for a support-vector machine (SVM) with a linear kernel. This method was therefore used to construct the classification rule. For the training data set, the total error rate for the classification rule, as estimated by using leave-one-out cross-validation, was equal to 0.16, with the sensitivity and specificity equal to 0.87 and 0.82, respectively.
Language: English
Source (journal): Statistical applications in genetics and molecular biology. - [Berkeley, CA], 2002, currens
Publication: [Berkeley, CA] : Berkeley Electronic Press, 2008
ISSN: 1544-6115
Volume/pages: 7:2 (2008), 22 p.
Article Reference: 12
ISI: 000254568100009
Medium: E-only publication