Publication | Closed Access
Generalizable mass spectrometry mining used to identify disease state biomarkers from blood serum
Year: 2003 | Citations: 21 | References: 2
We bring a "spectrum" of classical data mining and statistical analysis methods to bear on discrimination of two groups of spectra from 24 diseased and 17 normal patients. Our primary goal is to accurately estimate the generalizability of this small dataset. After an aggressive preprocessing step that reduces consideration to only 55 peaks, we conduct over 35 out-of-sample cross-validation simulations of logistic regression, binary decision trees, and linear discriminant analysis. Misclassification rates grow worse as the size of the holdout sample increases, with many exceeding 30 percent. The ability to generalize is clearly tempered by the statistical, instrumentation, and biophysical characteristics of the study.
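The holdout procedure described in the abstract can be illustrated with a minimal sketch. It is not the authors' code: the data are synthetic (only the group sizes of 24 and 17 and the 55-peak width are taken from the abstract), the nearest-centroid classifier is a deliberately simple stand-in for the logistic regression, decision tree, and linear discriminant models, and every function name, the `shift` parameter, the choice of five informative peaks, and the use of 35 as a repeat count are assumptions made for illustration.

```python
import random

def make_synthetic(n_diseased=24, n_normal=17, n_peaks=55, shift=1.0, seed=0):
    """Synthetic stand-in data, NOT the study's spectra.

    Assumption: only the first 5 peaks differ between groups.
    """
    rng = random.Random(seed)
    X, y = [], []
    for label, n in ((1, n_diseased), (0, n_normal)):
        for _ in range(n):
            X.append([rng.gauss(shift if (label == 1 and j < 5) else 0.0, 1.0)
                      for j in range(n_peaks)])
            y.append(label)
    return X, y

def fit_centroids(X, y):
    """Compute the per-class mean intensity of every peak."""
    centroids = {}
    for label in set(y):
        rows = [x for x, lab in zip(X, y) if lab == label]
        centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]
    return centroids

def predict(centroids, x):
    """Assign x to the class whose centroid is closest (squared Euclidean)."""
    def sq_dist(c):
        return sum((a - b) ** 2 for a, b in zip(x, c))
    return min(centroids, key=lambda label: sq_dist(centroids[label]))

def holdout_error(X, y, holdout_size, n_repeats=35, seed=0):
    """Average out-of-sample misclassification rate over random splits."""
    rng = random.Random(seed)
    idx = list(range(len(X)))
    errors = 0
    for _ in range(n_repeats):
        rng.shuffle(idx)
        test, train = idx[:holdout_size], idx[holdout_size:]
        # Fit on the training part only, then score the held-out spectra.
        model = fit_centroids([X[i] for i in train], [y[i] for i in train])
        errors += sum(predict(model, X[i]) != y[i] for i in test)
    return errors / (n_repeats * holdout_size)

X, y = make_synthetic()
rates = {k: holdout_error(X, y, k) for k in (1, 5, 10, 20)}
for k, r in rates.items():
    print(f"holdout={k:2d}  misclassification rate={r:.3f}")
```

With only 41 samples, each additional held-out spectrum removes a noticeable fraction of the training data, which is one plausible reason an error estimate would degrade as the holdout grows. The rates printed here come from synthetic data and say nothing about the study's actual results.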