A Novel Multi-Modal Deep Learning Framework for Early Detection of Ocular Diseases

Ranjith, D. and Sakthivanitha, M. (2025) A Novel Multi-Modal Deep Learning Framework for Early Detection of Ocular Diseases. In: 2025 International Conference on Intelligent Computing and Control Systems (ICICCS), Erode, India.

Full text not available from this repository.

Abstract

Cataracts and glaucoma, the leading causes of blindness worldwide, require early and accurate detection, which calls for diagnostic methods that can be deployed at scale. Earlier diagnostic methods relied mostly on subjective, non-quantitative clinical evaluation and reporting, which makes them unreliable. This paper presents a multi-modal deep learning architecture with a new approach to combining information from several data sources: OCT, fundus images, visual field exams, and clinical metadata. The framework uses a hybrid fusion approach that integrates attention, late fusion, and early fusion with graph neural networks (GNNs) to exploit both modality-specific and cross-modality features. Explainable artificial intelligence methods such as Grad-CAM and SHAP increase clinician trust by making the model’s predictions interpretable. On a dataset of 10,000 multi-modal samples, the architecture achieves 95% accuracy for cataract detection and 92% for glaucoma detection, with AUC-ROC scores of 0.97 and 0.94 respectively. The model’s interpretability received a satisfaction score of approximately 4.5/5. The proposed framework outperformed the other models it was compared with in both versatility and detection accuracy. Although this work is a promising first step toward improving clinical methods for assessing vision loss and diagnosing cataracts and glaucoma, further testing is required to address real-world challenges. This work focuses on developing a reliable, explainable, and generalizable deep learning framework that can advance the early detection of ocular diseases. By narrowing the gap in multi-modal data integration and model interpretability, our approach aims to reduce disparities in diagnosis, improve clinical decision-making, and improve patient treatment outcomes.
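
The abstract does not specify how the fusion is implemented. As a purely illustrative sketch of one ingredient it names, attention-weighted late fusion, the following Python combines per-modality disease probabilities using softmax-normalized attention scores. The function names, the modality probabilities, and the attention scores are all hypothetical and are not taken from the paper; the authors' actual method also involves early fusion and GNNs, which this sketch omits.

```python
import math


def softmax(scores):
    """Normalize raw scores into weights that sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_late_fusion(modality_probs, attention_scores):
    """Fuse per-modality probabilities with attention weights.

    modality_probs: one predicted disease probability per modality.
    attention_scores: one raw (unnormalized) relevance score per modality.
    Returns the fused probability and the normalized weights.
    """
    if len(modality_probs) != len(attention_scores):
        raise ValueError("need one attention score per modality")
    weights = softmax(attention_scores)
    fused = sum(w * p for w, p in zip(weights, modality_probs))
    return fused, weights


# Hypothetical outputs of four single-modality classifiers (assumed values).
probs = {"oct": 0.90, "fundus": 0.80, "visual_field": 0.60, "metadata": 0.50}
# Equal raw scores give uniform weights, so fusion reduces to a plain average.
scores = [0.0, 0.0, 0.0, 0.0]

fused, weights = attention_late_fusion(list(probs.values()), scores)
print(f"fused probability: {fused:.2f}")
```

In a trained model the attention scores would be learned per sample rather than fixed, so that a more informative modality for a given patient receives a larger weight.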

Item Type: Conference or Workshop Item (Paper)
Subjects: Computer Science Engineering > Deep Learning
Domains: Computer Science Engineering
Depositing User: Mr IR Admin
Date Deposited: 11 Aug 2025 10:15
Last Modified: 11 Aug 2025 10:15
URI: https://ir.vistas.ac.in/id/eprint/9920
