Digital profiling of gene expression from histology images with linearized attention

Based on 7,584 tumor samples spanning 16 cancer types and validated on 1,368 tumors, this study evaluates the performance of SEQUOIA, a machine-learning algorithm that uses histology slide images to predict genetic variations in breast cancers and to identify tumors at risk of recurrence.

Nature Communications, Volume 15, Issue 1, Page 9886, 2024, open-access article

Abstract

Cancer is a heterogeneous disease requiring costly genetic profiling for better understanding and management. Recent advances in deep learning have enabled cost-effective predictions of genetic alterations from whole slide images (WSIs). While transformers have driven significant progress in non-medical domains, their application to WSIs lags behind due to high model complexity and limited dataset sizes. Here, we introduce SEQUOIA, a linearized transformer model that predicts cancer transcriptomic profiles from WSIs. SEQUOIA is developed using 7584 tumor samples across 16 cancer types, with its generalization capacity validated on two independent cohorts comprising 1368 tumors. Accurately predicted genes are associated with key cancer processes, including inflammatory response, cell cycles and metabolism. Further, we demonstrate the value of SEQUOIA in stratifying the risk of breast cancer recurrence and in resolving spatial gene expression at loco-regional levels. SEQUOIA hence deciphers clinically relevant information from WSIs, opening avenues for personalized cancer management.
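
The abstract does not say which linearization SEQUOIA uses, so the following is only a generic sketch of the idea behind linearized attention, not the authors' implementation. It assumes a kernel feature map (elu(x) + 1, a common choice in the linear-attention literature) and treats each row of the hypothetical arrays q, k and v as one tile of a slide. Keys and values are summarized once into a small matrix, so the cost grows linearly with the number of tiles instead of quadratically.

```python
import numpy as np


def feature_map(x):
    # elu(x) + 1: equals x + 1 for x > 0 and exp(x) otherwise, so it is always positive.
    # The minimum() guard avoids overflow in exp for large positive inputs.
    return np.where(x > 0, x + 1.0, np.exp(np.minimum(x, 0.0)))


def linear_attention(q, k, v):
    """Kernel-based attention in O(n * d * d_v) time, with n the number of tiles."""
    phi_q = feature_map(q)            # (n, d)
    phi_k = feature_map(k)            # (n, d)
    kv = phi_k.T @ v                  # (d, d_v): summary of all keys and values
    z = phi_k.sum(axis=0)             # (d,): normalization summary
    numerator = phi_q @ kv            # (n, d_v)
    denominator = phi_q @ z           # (n,)
    return numerator / denominator[:, None]


def quadratic_reference(q, k, v):
    """Same attention computed via the explicit (n, n) weight matrix, for checking."""
    weights = feature_map(q) @ feature_map(k).T
    weights = weights / weights.sum(axis=1, keepdims=True)
    return weights @ v


# Illustrative sizes only; they are not taken from the paper.
rng = np.random.default_rng(0)
n_tiles, d_model = 512, 16
q = rng.normal(size=(n_tiles, d_model))
k = rng.normal(size=(n_tiles, d_model))
v = rng.normal(size=(n_tiles, d_model))

out = linear_attention(q, k, v)
```

Both functions compute the same quantity; the linear version simply reorders the matrix products so that the n-by-n weight matrix is never formed, which is what makes slides with many tiles tractable.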