TY - GEN
T1 - Boosted Spectral Embedding (BoSE)
T2 - 2011 8th IEEE International Symposium on Biomedical Imaging: From Nano to Macro, ISBI'11
AU - Sridhar, Akshay
AU - Doyle, Scott
AU - Madabhushi, Anant
PY - 2011
Y1 - 2011
N2 - In machine learning, non-linear dimensionality reduction (NLDR) is commonly used to embed high-dimensional data into a low-dimensional space while preserving local object adjacencies. However, the majority of NLDR methods define object adjacencies using distance metrics that do not account for the quality of the features in the high-dimensional space. In this paper we present Boosted Spectral Embedding (BoSE), a variant of the traditional Spectral Embedding (SE) that utilizes a Boosted Distance Metric (BDM) to improve the low-dimensional representation of the data. Under the naive assumption that all features are equally important, SE uses the Euclidean distance metric to define object-distance relationships. However, the BDM selectively weights features via the AdaBoost algorithm such that the low-dimensional representation contains only the most discriminating features. In this work BoSE is evaluated against SE in the context of digitized histopathology images using (a) content-based image retrieval and (b) classification via Random Forest of the low-dimensional representation. Using images from a cohort of 58 prostate cancer patient studies, BoSE and SE separated benign and malignant samples with areas under the precision-recall curve (AUPRCs) of 0.95 and 0.67 and classification accuracies using a Random Forest (RF) classifer were 0.93 and 0.79, respectively. For a cohort of 55 breast cancer studies, BoSE and SE performed comparably in terms of both RF accuracy and AUPRC. In addition, a qualitative visualization of the low-dimensional data representations suggests that BoSE exhibits improved class separability over SE.
AB - In machine learning, non-linear dimensionality reduction (NLDR) is commonly used to embed high-dimensional data into a low-dimensional space while preserving local object adjacencies. However, the majority of NLDR methods define object adjacencies using distance metrics that do not account for the quality of the features in the high-dimensional space. In this paper we present Boosted Spectral Embedding (BoSE), a variant of the traditional Spectral Embedding (SE) that utilizes a Boosted Distance Metric (BDM) to improve the low-dimensional representation of the data. Under the naive assumption that all features are equally important, SE uses the Euclidean distance metric to define object-distance relationships. However, the BDM selectively weights features via the AdaBoost algorithm such that the low-dimensional representation contains only the most discriminating features. In this work BoSE is evaluated against SE in the context of digitized histopathology images using (a) content-based image retrieval and (b) classification via Random Forest of the low-dimensional representation. Using images from a cohort of 58 prostate cancer patient studies, BoSE and SE separated benign and malignant samples with areas under the precision-recall curve (AUPRCs) of 0.95 and 0.67 and classification accuracies using a Random Forest (RF) classifer were 0.93 and 0.79, respectively. For a cohort of 55 breast cancer studies, BoSE and SE performed comparably in terms of both RF accuracy and AUPRC. In addition, a qualitative visualization of the low-dimensional data representations suggests that BoSE exhibits improved class separability over SE.
KW - boosting
KW - BoSE
KW - breast cancer
KW - content-based image retrieval
KW - histopathology
KW - prostate cancer
KW - spectral embedding
UR - https://www.scopus.com/pages/publications/80055049728
U2 - 10.1109/ISBI.2011.5872779
DO - 10.1109/ISBI.2011.5872779
M3 - Conference contribution
AN - SCOPUS:80055049728
SN - 9781424441280
T3 - Proceedings - International Symposium on Biomedical Imaging
SP - 1897
EP - 1900
BT - 2011 8th IEEE International Symposium on Biomedical Imaging
Y2 - 30 March 2011 through 2 April 2011
ER -