Supervised Multi-view Canonical Correlation Analysis (sMVCCA): Integrating histologic and proteomic features for predicting recurrent prostate cancer

  • George Lee
  • Asha Singanamalli
  • Haibo Wang
  • Michael D. Feldman
  • Stephen R. Master
  • Natalie N.C. Shih
  • Elaine Spangler
  • Timothy Rebbeck
  • John E. Tomaszewski
  • Anant Madabhushi
  • Case Western Reserve University
  • University of Pennsylvania

Research output: Contribution to journal › Article › peer-review

88 Scopus citations

Abstract

In this work, we present a new methodology to facilitate prediction of recurrent prostate cancer (CaP) following radical prostatectomy (RP) via the integration of quantitative image features and protein expression in the excised prostate. Creating a fused predictor from high-dimensional data streams is challenging because the classifier must 1) account for the 'curse of dimensionality' problem, which hinders classifier performance when the number of features exceeds the number of patient studies, and 2) balance potential mismatches in the number of features across different channels to avoid classifier bias towards channels with more features. Our new data integration methodology, supervised Multi-view Canonical Correlation Analysis (sMVCCA), aims to integrate any number of views of high-dimensional data to provide more amenable data representations for disease classification. Additionally, we demonstrate sMVCCA using Spearman's rank correlation which, unlike Pearson's correlation, can capture nonlinear correlations and is less sensitive to outliers. Forty CaP patients with pathological Gleason scores 6-8 were considered for this study. Twenty-one of these men experienced biochemical recurrence (BCR) following RP, while 19 did not. For each patient, 189 quantitative histomorphometric attributes and 650 protein expression levels were extracted from the primary tumor nodule. The fused histomorphometric/proteomic representation via sMVCCA combined with a random forest classifier predicted BCR with a mean AUC of 0.74 and a maximum AUC of 0.9286. We found that sMVCCA performed statistically significantly (p < 0.05) better than comparative state-of-the-art data fusion strategies for predicting BCR. Furthermore, Kaplan-Meier analysis demonstrated improved BCR-free survival prediction for the sMVCCA-fused classifier as compared to histology or proteomic features alone.

Original language: English
Article number: 2355175
Pages (from-to): 284-297
Number of pages: 14
Journal: IEEE Transactions on Medical Imaging
Volume: 34
Issue number: 1
State: Published - Jan 1 2015

Keywords

  • Data fusion
  • digital pathology
  • dimensionality reduction
  • mass spectrometry
  • prostate cancer
  • proteomics
