DDW ePoster Library

Abstract

Number: Su1831
IDENTIFICATION AND RANKING OF GENE MODULE ASSOCIATION WITH HISTOLOGIC FEATURES IN ULCERATIVE COLITIS

Society: AGA
Track: Inflammatory Bowel Diseases

Histologic features in digitized whole slide images (WSIs) of Inflammatory Bowel Disease (IBD) biopsies capture aspects of disease activity that are increasingly seen as valuable in characterizing mucosal healing and remission in clinical trials. Given the success of weakly supervised deep learning (DL) models in estimating histologic disease severity from WSIs, we investigated whether such models can identify visual histologic phenotypes associated with gene co-expression modules (sets of genes with highly coordinated expression across samples) related to IBD biology. Using transcriptomic profiling (microarray) data from biopsies adjacent to 1967 H&E-stained colonic biopsies collected from 1599 patients in three Ulcerative Colitis (UC) clinical trials (NCT01959282, NCT01988961, and NCT01988961), we applied correlation and clustering analysis to identify 16 gene co-expression modules. After removing trial-specific transcriptomics batch effects by preprocessing with ComBat, we used 1579 WSIs of the above biopsies to train a multi-instance learning (MIL) multitask DL model with self-attention to estimate Gene Set Variation Analysis (GSVA) signatures computed for each of those modules. To compensate for training data limitations, our signature estimation pipeline incorporated a foundation model, a very large neural network that uses self-supervised learning to learn histological features from additional unlabeled WSI datasets. To create fixed-length representations of input WSIs, we used the Self-Distillation with No Labels (DINO) v2 foundation model with 10^8 parameters, pretrained on over 56,000 WSIs of biopsies obtained from patients with IBD or several different types of cancer in multiple clinical trial and retrospective study datasets.
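As a rough illustration of how a multitask MIL model with self-attention can map a bag of tile embeddings to one score per gene module, the following minimal sketch uses NumPy with random weights. The abstract does not specify the architecture, so the single attention head, the mean pooling step, the dimensions, and all names (mil_self_attention_forward, w_query, w_key, w_value, w_heads) are assumptions made for illustration only, not the authors' implementation.

```python
import numpy as np

def softmax(x, axis=-1):
    # Subtract the max for numerical stability before exponentiating.
    shifted = x - x.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)

def mil_self_attention_forward(tiles, w_query, w_key, w_value, w_heads):
    """Hypothetical forward pass: one slide (bag of tiles) -> one score per module.

    tiles:   (n_tiles, d_in) fixed-length tile embeddings from a frozen encoder
    w_query, w_key, w_value: (d_in, d_attn) self-attention projections
    w_heads: (d_attn, n_modules) one linear regression head per gene module
    """
    queries = tiles @ w_query
    keys = tiles @ w_key
    values = tiles @ w_value
    # Scaled dot-product self-attention across the tiles of one slide.
    attn = softmax(queries @ keys.T / np.sqrt(keys.shape[1]), axis=1)
    contextual = attn @ values                 # (n_tiles, d_attn)
    slide_embedding = contextual.mean(axis=0)  # pool the bag into one vector
    predictions = slide_embedding @ w_heads    # (n_modules,) multitask outputs
    return predictions, attn

# Synthetic inputs; the sizes are illustrative assumptions, except that
# n_modules = 16 mirrors the 16 modules reported in the abstract.
rng = np.random.default_rng(0)
n_tiles, d_in, d_attn, n_modules = 50, 32, 16, 16
tiles = rng.normal(size=(n_tiles, d_in))
w_query = rng.normal(size=(d_in, d_attn)) * 0.1
w_key = rng.normal(size=(d_in, d_attn)) * 0.1
w_value = rng.normal(size=(d_in, d_attn)) * 0.1
w_heads = rng.normal(size=(d_attn, n_modules)) * 0.1

predictions, attn = mil_self_attention_forward(tiles, w_query, w_key, w_value, w_heads)
print(predictions.shape, attn.shape)
```

In a trained model the weights would be learned by regressing the outputs against the GSVA signatures; here they are random, so only the shapes and the data flow are meaningful.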
We measured model predictive performance by computing, for each module, the root mean squared error (RMSE) of the estimated signature values, normalized by the signature range and averaged across 4 cross-validation folds; mean normalized RMSE values for the modules ranged from 0.22 to 0.30 on a scale of [0, 1]. By ranking model performance on a test subset of 388 WSIs for each of the target tasks, we identified the modules whose signatures are most strongly associated with histological image data. The modules with the best performance (lowest RMSE), which included those corresponding to immune signaling, granulocytes, stromal tissue, and plasma cells, are all associated with active inflammation and other aspects of immune response. These findings suggest that estimating gene expression associated with immune response and inflammation from adjacent histology images may provide a richer assessment of histologic disease activity than existing severity measures alone, without requiring additional and costly transcriptomic analysis of the imaged biopsy tissue.
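The performance measure described above can be sketched as follows: compute a per-module RMSE, divide it by the range of the true signature values, average over the fold models, and sort the modules by the result. This is a minimal sketch on synthetic data, assuming the range is taken from the holdout values themselves; the function names (normalized_rmse, rank_modules), the four toy modules, and the noise levels are hypothetical and do not reproduce the reported results.

```python
import numpy as np

def normalized_rmse(y_true, y_pred):
    """Per-module RMSE divided by the range of the true signature values.

    y_true, y_pred: (n_samples, n_modules)
    """
    rmse = np.sqrt(np.mean((y_true - y_pred) ** 2, axis=0))
    value_range = y_true.max(axis=0) - y_true.min(axis=0)
    return rmse / value_range

def rank_modules(folds):
    """Average normalized RMSE across folds and rank modules, best first.

    folds: list of (y_true, y_pred) pairs, one per cross-validation fold model
    """
    per_fold = np.stack([normalized_rmse(t, p) for t, p in folds])
    mean_nrmse = per_fold.mean(axis=0)
    order = np.argsort(mean_nrmse)  # lowest error (best module) first
    return mean_nrmse, order

# Synthetic holdout set: 388 samples mirrors the test subset size in the
# abstract; the 4 modules and their noise levels are made up for illustration.
rng = np.random.default_rng(0)
noise_sd = np.array([0.1, 0.2, 0.4, 0.8])
y_true = rng.uniform(-1.0, 1.0, size=(388, 4))

# One set of predictions per fold model, all scored on the same holdout set.
folds = []
for _ in range(4):
    y_pred = y_true + rng.normal(size=y_true.shape) * noise_sd
    folds.append((y_true, y_pred))

mean_nrmse, order = rank_modules(folds)
print(order)
```

Dividing by the range puts modules with differently scaled signatures on a comparable footing, which is what makes a ranking across modules meaningful.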

Module-specific performance of multitask MIL model for estimation of GSVA signatures evaluated on a single holdout dataset and averaged across 4 cross-validation folds.

Gene correlation network (GCN) generated from colonic biopsy transcriptomics data from patients with UC.
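One simple way to derive co-expression modules from a gene correlation network is to threshold the gene-gene Pearson correlation matrix and take connected components of the resulting graph. The abstract states only that correlation and clustering analysis were used, so the threshold, the connected-components clustering, and the function name coexpression_modules are assumptions for illustration and may differ from the method actually applied.

```python
import numpy as np

def coexpression_modules(expr, threshold=0.8):
    """Assign each gene to a module via a thresholded correlation network.

    expr: (n_samples, n_genes) expression matrix
    Returns an integer module label per gene.
    """
    corr = np.corrcoef(expr, rowvar=False)  # (n_genes, n_genes)
    adjacency = corr >= threshold           # edges link strongly co-expressed genes
    n_genes = adjacency.shape[0]
    labels = np.full(n_genes, -1)
    current = 0
    for start in range(n_genes):
        if labels[start] != -1:
            continue
        # Depth-first search collects one connected component (one module).
        labels[start] = current
        stack = [start]
        while stack:
            gene = stack.pop()
            for neighbor in np.flatnonzero(adjacency[gene]):
                if labels[neighbor] == -1:
                    labels[neighbor] = current
                    stack.append(neighbor)
        current += 1
    return labels

# Synthetic data: two latent programs, each driving three genes.
rng = np.random.default_rng(0)
n_samples = 200
program_a = rng.normal(size=(n_samples, 1))
program_b = rng.normal(size=(n_samples, 1))
expr = np.hstack([
    program_a + 0.1 * rng.normal(size=(n_samples, 3)),
    program_b + 0.1 * rng.normal(size=(n_samples, 3)),
])

labels = coexpression_modules(expr)
print(labels)
```

On real microarray data the choice of threshold and clustering method strongly affects the number and size of modules, so these would need to be tuned rather than fixed as here.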

Author(s): Lev Givon, Chaitanya Parmar, Gabriela O. Cula, Weiwei Schultz, Patrick Branigan, Aleksandar Stojmirovic, Louis R. Ghanem, Dylan Richards, Tom C. Freeman, Kristopher Standish
Affiliations (all authors): Data Science & Digital Health, Janssen Research and Development LLC, Raritan, New Jersey, United States
DDW ePoster Library. Givon L. 05/19/2024; 415811; Su1831