TY - GEN
T1 - Explainable Feature Embeddings from Histopathology Foundation Models
T2 - Medical Imaging 2025: Digital and Computational Pathology
AU - Kasireddy, Harishwar Reddy
AU - Lucarelli, Nicholas
AU - Yun, Donghwan
AU - Moon, Kyung Chul
AU - La Rosa, Patricio S.
AU - Tomaszewski, John E.
AU - Han, Seung Seok
AU - Shickel, Benjamin
AU - Naglah, Ahmed
AU - Sarder, Pinaki
N1 - Publisher Copyright: © 2025 SPIE.
PY - 2025
Y1 - 2025
N2 - Foundational models (FMs) based on advanced neural network architectures have demonstrated improved performance in pathology image analysis across various organs due to their increased generalizability. However, their clinical adoption requires explainability, as their black-box nature limits transparency. Understanding the specific features these models learn for a given downstream task is crucial for explainability and for integrating FMs into clinical workflows more effectively. We propose a computational pipeline that enhances explainability by correlating domain-specific handcrafted features (HFs) with hidden features, i.e., feature embeddings (FEs), from FMs. We correlate and combine HFs from Detectron 2 DeepLabv3+ segmentation with FEs from the Prov-Gigapath (PG) and UNI FMs for improved explainability and performance. In this work, HFs are extracted from segmented functional tissue units, including arteries, tubules, globally sclerotic glomeruli, and non-globally sclerotic glomeruli. FEs are extracted at the tile and slide levels for PG and at the tile level for UNI. We use the Pearson correlation coefficient to identify significant correspondences between these feature sets. To evaluate our proposed methodology, we use 56 diabetic nephropathy kidney biopsy whole slide images (WSIs) from Seoul National University Hospital. The task is to predict end-stage kidney disease (ESKD) two years post-biopsy using leave-one-out cross-validation on the 56 WSIs, with 16 from ESKD patients and 40 from non-ESKD patients. We combine the top correlated features from the FEs of the FMs with HFs and train logistic regression (LR) and k-nearest neighbor (kNN) classifiers. The LR model trained on the combined feature set improved accuracy, balanced accuracy, Matthews correlation coefficient, F1-score, precision, and recall to 0.8393, 0.7938, 0.5993, 0.8377, 0.8367, and 0.8393, respectively, when compared to LR and kNN models trained on individual feature sets. PG excelled in specificity (1.000) and AUROC (0.8281), while UNI showed superior AUPRC (0.7813) performance. We also present feature explainability maps corresponding to each feature in the FEs.
AB - Foundational models (FMs) based on advanced neural network architectures have demonstrated improved performance in pathology image analysis across various organs due to their increased generalizability. However, their clinical adoption requires explainability, as their black-box nature limits transparency. Understanding the specific features these models learn for a given downstream task is crucial for explainability and for integrating FMs into clinical workflows more effectively. We propose a computational pipeline that enhances explainability by correlating domain-specific handcrafted features (HFs) with hidden features, i.e., feature embeddings (FEs), from FMs. We correlate and combine HFs from Detectron 2 DeepLabv3+ segmentation with FEs from the Prov-Gigapath (PG) and UNI FMs for improved explainability and performance. In this work, HFs are extracted from segmented functional tissue units, including arteries, tubules, globally sclerotic glomeruli, and non-globally sclerotic glomeruli. FEs are extracted at the tile and slide levels for PG and at the tile level for UNI. We use the Pearson correlation coefficient to identify significant correspondences between these feature sets. To evaluate our proposed methodology, we use 56 diabetic nephropathy kidney biopsy whole slide images (WSIs) from Seoul National University Hospital. The task is to predict end-stage kidney disease (ESKD) two years post-biopsy using leave-one-out cross-validation on the 56 WSIs, with 16 from ESKD patients and 40 from non-ESKD patients. We combine the top correlated features from the FEs of the FMs with HFs and train logistic regression (LR) and k-nearest neighbor (kNN) classifiers. The LR model trained on the combined feature set improved accuracy, balanced accuracy, Matthews correlation coefficient, F1-score, precision, and recall to 0.8393, 0.7938, 0.5993, 0.8377, 0.8367, and 0.8393, respectively, when compared to LR and kNN models trained on individual feature sets. PG excelled in specificity (1.000) and AUROC (0.8281), while UNI showed superior AUPRC (0.7813) performance. We also present feature explainability maps corresponding to each feature in the FEs.
KW - Diabetic nephropathy
KW - classification
KW - digital pathology
KW - end stage kidney disease
KW - explainability
KW - feature embeddings
KW - foundation models
UR - https://www.scopus.com/pages/publications/105004794356
U2 - 10.1117/12.3047908
DO - 10.1117/12.3047908
M3 - Conference contribution
AN - SCOPUS:105004794356
T3 - Progress in Biomedical Optics and Imaging - Proceedings of SPIE
BT - Medical Imaging 2025
A2 - Tomaszewski, John E.
A2 - Ward, Aaron D.
PB - SPIE
Y2 - 18 February 2025 through 20 February 2025
ER -