Improving Clinical Predictions with Multi-Modal Pre-training in Retinal Imaging
Language of the title:
English
Original book title:
2024 IEEE International Symposium on Biomedical Imaging (ISBI)
Original abstract:
Self-supervised learning has emerged as a foundational approach for building robust and adaptable artificial intelligence (AI) systems in medical imaging. In particular, contrastive representation learning methods, trained on extensive multi-modal datasets, have proven remarkably effective at producing adaptable representations suitable for a multitude of downstream tasks. In the field of ophthalmology, modern retinal imaging devices capture both 2D fundus images and 3D optical coherence tomography (OCT) scans. As a result, large multi-modal imaging datasets are readily available and allow us to explore uni-modal versus multi-modal contrastive pre-training. After pre-training on 153,306 scan pairs, we demonstrate the transferability and efficacy of the learned representations via fine-tuning on multiple external datasets, focusing specifically on several clinically pertinent prediction tasks derived from OCT data. Additionally, we illustrate how multi-modal pre-training enhances the exchange of information between OCT, the richer modality, and the more cost-effective fundus imaging, ultimately amplifying the predictive capacity of fundus-based models.
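
As a rough illustration of the bi-modal contrastive pre-training the abstract describes, the sketch below implements a symmetric InfoNCE (CLIP-style) objective over paired fundus/OCT embeddings. The function name, embedding dimensions, and temperature value are illustrative assumptions for this sketch, not details taken from the paper, which may use a different loss formulation or encoder architecture.

```python
# Minimal sketch of a symmetric contrastive (InfoNCE) objective for
# paired fundus/OCT embeddings; hypothetical, not the authors' exact code.
import torch
import torch.nn.functional as F


def contrastive_loss(fundus_emb: torch.Tensor,
                     oct_emb: torch.Tensor,
                     temperature: float = 0.07) -> torch.Tensor:
    """Symmetric InfoNCE loss over a batch of paired fundus/OCT embeddings.

    Both inputs have shape (batch, dim); row i of each tensor comes from
    the same patient/eye, so the diagonal entries are the positive pairs.
    """
    # L2-normalize so dot products become cosine similarities.
    f = F.normalize(fundus_emb, dim=-1)
    o = F.normalize(oct_emb, dim=-1)
    # (batch, batch) similarity matrix; diagonal = matched scan pairs.
    logits = f @ o.t() / temperature
    targets = torch.arange(f.size(0), device=f.device)
    # Cross-entropy in both directions: fundus->OCT and OCT->fundus.
    loss_f2o = F.cross_entropy(logits, targets)
    loss_o2f = F.cross_entropy(logits.t(), targets)
    return (loss_f2o + loss_o2f) / 2


# Usage example with random stand-in embeddings (8 pairs, 256 dims).
fundus_emb = torch.randn(8, 256)
oct_emb = torch.randn(8, 256)
print(contrastive_loss(fundus_emb, oct_emb))
```

In this formulation, each modality's encoder is pushed to embed a scan close to its paired scan from the other modality and far from the other pairs in the batch, which is one plausible mechanism for the OCT-to-fundus information transfer the abstract reports.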