TY - GEN
T1 - Renal Cell Type and State Estimation in Brightfield Histology Images
T2 - Medical Imaging 2025: Digital and Computational Pathology
AU - Fermin, Jamie L.
AU - Border, Samuel
AU - Naglah, Ahmed
AU - Shickel, Benjamin
AU - La Rosa, Patricio S.
AU - Tomaszewski, John E.
AU - Jain, Sanjay
AU - El-Achkar, Tarek M.
AU - Eadon, Michael T.
AU - Sarder, Pinaki
N1 - Publisher Copyright:
© 2025 SPIE.
PY - 2025
Y1 - 2025
N2 - Multi-omics data, such as 10X Genomics Visium (spatial transcriptomics), measure gene expression and molecular pathway activities and can predict cell types/states, but are often expensive and inaccessible in clinical settings. Thus, despite the emergence of multi-omics technologies, histopathological assessments under brightfield microscopy remain the diagnostic gold standard. In this work, we examine machine learning-based pipelines for predicting cell types/states from brightfield histology images using state-of-the-art (SOTA) deep learning (DL) models, aiming to enhance diagnostics and prognostics in clinical medicine. Our proposed pipeline consists of two stages: (1) an Image-To-Text retrieval Network (ITTN) that leverages the CONtrastive learning from Captions for Histopathology (CONCH) model to assign a histopathological text prompt to a brightfield histology image, and (2) a Vision Language Model (VLM), which is built on the same CONCH model used in the ITTN but incorporates a regression head to predict cell type/state proportions based on the paired image and text inputs. During training, we classify the image into one of four structural types (glomerulus, tubules, vessels, and interstitium) using the ITTN. These classification labels are then used to construct a new text prompt with a suitable histopathological description for each image in the test set. The new text prompt and raw image are used as paired inputs to the VLM to predict cell types/states. We also utilize SOTA models, such as CONCH (using only the vision encoder), ViT, and ResNet, which employ image-only inputs in separate regression pipelines. We evaluated our proposed pipelines on a set of 10X Visium formalin-fixed paraffin-embedded whole slide images of diabetic nephropathy samples collected at Indiana University.
Our experiments yielded a mean squared error of 0.0027 for the proposed pipeline, showing improvements of 20.59%, 27.03%, and 32.50% over CONCH (image only), ViT, and ResNet, respectively. The proposed pipeline aims to bridge the gap between traditional histopathology and molecular diagnostics, enhancing disease diagnosis and prognosis.
AB - Multi-omics data, such as 10X Genomics Visium (spatial transcriptomics), measure gene expression and molecular pathway activities and can predict cell types/states, but are often expensive and inaccessible in clinical settings. Thus, despite the emergence of multi-omics technologies, histopathological assessments under brightfield microscopy remain the diagnostic gold standard. In this work, we examine machine learning-based pipelines for predicting cell types/states from brightfield histology images using state-of-the-art (SOTA) deep learning (DL) models, aiming to enhance diagnostics and prognostics in clinical medicine. Our proposed pipeline consists of two stages: (1) an Image-To-Text retrieval Network (ITTN) that leverages the CONtrastive learning from Captions for Histopathology (CONCH) model to assign a histopathological text prompt to a brightfield histology image, and (2) a Vision Language Model (VLM), which is built on the same CONCH model used in the ITTN but incorporates a regression head to predict cell type/state proportions based on the paired image and text inputs. During training, we classify the image into one of four structural types (glomerulus, tubules, vessels, and interstitium) using the ITTN. These classification labels are then used to construct a new text prompt with a suitable histopathological description for each image in the test set. The new text prompt and raw image are used as paired inputs to the VLM to predict cell types/states. We also utilize SOTA models, such as CONCH (using only the vision encoder), ViT, and ResNet, which employ image-only inputs in separate regression pipelines. We evaluated our proposed pipelines on a set of 10X Visium formalin-fixed paraffin-embedded whole slide images of diabetic nephropathy samples collected at Indiana University.
Our experiments yielded a mean squared error of 0.0027 for the proposed pipeline, showing improvements of 20.59%, 27.03%, and 32.50% over CONCH (image only), ViT, and ResNet, respectively. The proposed pipeline aims to bridge the gap between traditional histopathology and molecular diagnostics, enhancing disease diagnosis and prognosis.
KW - Diabetic nephropathy
KW - digital pathology
KW - foundation model
KW - gene expression
KW - regression
KW - spatial transcriptomics
UR - https://www.scopus.com/pages/publications/105004789155
U2 - 10.1117/12.3047996
DO - 10.1117/12.3047996
M3 - Conference contribution
AN - SCOPUS:105004789155
T3 - Progress in Biomedical Optics and Imaging - Proceedings of SPIE
BT - Medical Imaging 2025
A2 - Tomaszewski, John E.
A2 - Ward, Aaron D.
PB - SPIE
Y2 - 18 February 2025 through 20 February 2025
ER -