ePoster listing and sessions

Topic: ESOPRS 2021 ePoster sessions
Time: Sep 17, 2021 16:00 Amsterdam, Berlin, Rome, Stockholm, Vienna, 15:00 London

 

 



External Validation of AI-Based Facial Image Analysis for Thyroid Eye Disease: Results from a Spanish Patient Cohort

Author: Antonio Manuel Garrido Hermosilla
ePoster Number: 128


Purpose

Three AI-assisted software tools (Glandy CAS, Glandy EXO, and Glandy LID) have been developed to evaluate thyroid eye disease (TED) activity and severity from facial photographs. This study aimed to externally validate all three systems in a cohort of Spanish patients.


Methods

To assess disease activity, 1,118 facial images from 140 TED patients were analyzed with Glandy CAS, which classifies TED as active (Clinical Activity Score, CAS ≥ 3) or inactive (CAS < 3). Its performance was evaluated against reference CAS values determined in person by an oculoplastic specialist, and compared with photo-based CAS ratings by a general ophthalmologist. For severity assessment, 1,102 facial images from 137 patients were used to estimate proptosis with Glandy EXO, and 1,119 images from 140 patients were used to evaluate eyelid retraction with Glandy LID. Reference standards were clinical exophthalmometry for proptosis and manually measured margin-reflex distances (MRD1, MRD2) for eyelid position. Classification performance was evaluated using F1 score, sensitivity, and specificity; measurement performance was evaluated using mean absolute error (MAE), mean absolute percentage error (MAPE), and Pearson correlation coefficients.
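As an illustration of the metrics named above, the following is a minimal sketch and not the study's analysis code. The function names and all input values are hypothetical. It assumes the standard textbook definitions of each metric, with "active" defined as reference CAS ≥ 3.

```python
import math


def _ratio(numerator, denominator):
    """Return numerator/denominator, or NaN when the denominator is zero."""
    return numerator / denominator if denominator else float("nan")


def binary_metrics(reference_cas, predicted_active, threshold=3):
    """Sensitivity, specificity, and F1 for active (CAS >= threshold) vs inactive."""
    tp = fp = tn = fn = 0
    for cas, predicted in zip(reference_cas, predicted_active):
        truth = cas >= threshold
        if truth and predicted:
            tp += 1
        elif truth and not predicted:
            fn += 1
        elif not truth and predicted:
            fp += 1
        else:
            tn += 1
    sensitivity = _ratio(tp, tp + fn)
    specificity = _ratio(tn, tn + fp)
    precision = _ratio(tp, tp + fp)
    f1 = _ratio(2 * precision * sensitivity, precision + sensitivity)
    return {"sensitivity": sensitivity, "specificity": specificity, "f1": f1}


def regression_metrics(reference, estimate):
    """MAE, MAPE (in percent), and Pearson r between reference and estimate."""
    n = len(reference)
    mae = sum(abs(e - r) for r, e in zip(reference, estimate)) / n
    mape = 100 * sum(abs(e - r) / abs(r) for r, e in zip(reference, estimate)) / n
    mean_r = sum(reference) / n
    mean_e = sum(estimate) / n
    cov = sum((r - mean_r) * (e - mean_e) for r, e in zip(reference, estimate))
    var_r = sum((r - mean_r) ** 2 for r in reference)
    var_e = sum((e - mean_e) ** 2 for e in estimate)
    pearson = _ratio(cov, math.sqrt(var_r * var_e))
    return {"mae": mae, "mape": mape, "pearson_r": pearson}


# Hypothetical toy inputs, not study data.
toy_reference_cas = [0, 1, 3, 4, 2, 5]
toy_predicted_active = [False, False, True, True, True, False]
activity = binary_metrics(toy_reference_cas, toy_predicted_active)

toy_reference_mm = [20.0, 18.0, 22.0, 16.0]   # e.g. exophthalmometry readings
toy_estimate_mm = [21.0, 17.0, 22.0, 18.0]    # e.g. image-based estimates
proptosis = regression_metrics(toy_reference_mm, toy_estimate_mm)
```

The same `regression_metrics` helper would apply to MRD1 and MRD2 as to proptosis, since all three are continuous measurements compared against a reference value.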


Results

Glandy CAS achieved an F1 score of 0.77, sensitivity of 80.5%, and specificity of 87.8%, outperforming the general ophthalmologist (F1: 0.69, sensitivity: 73.5%, specificity: 82.4%). For proptosis, Glandy EXO showed strong agreement with the clinical reference (MAE: 1.74 mm, MAPE: 8.67%, r = 0.7043 using a single image; MAE: 1.56 mm, MAPE: 7.86%, r = 0.7611 using the average of three images). Glandy LID also demonstrated excellent performance: for MRD1, MAE was 0.42 mm (MAPE: 9.53%, r = 0.9387); for MRD2, MAE was 0.75 mm (MAPE: 14.63%, r = 0.9123).
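The three-image result above can be pictured with a short sketch. The abstract does not describe how the three images were selected or combined, so this assumes a plain arithmetic mean of the first three per-image estimates for each patient. The function name, patient identifiers, and values are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def average_per_patient(per_image_estimates, max_images=3):
    """Average up to max_images per-image estimates (mm) for each patient key."""
    grouped = defaultdict(list)
    for patient_id, estimate_mm in per_image_estimates:
        # Assumption: keep the first max_images images seen for each patient.
        if len(grouped[patient_id]) < max_images:
            grouped[patient_id].append(estimate_mm)
    return {patient_id: mean(values) for patient_id, values in grouped.items()}


# Hypothetical per-image proptosis estimates, not study data.
per_image = [
    ("P01", 19.0), ("P01", 21.0), ("P01", 20.0),
    ("P02", 16.5), ("P02", 17.5), ("P02", 17.0),
]
averaged = average_per_patient(per_image)
```

Averaging several estimates tends to cancel per-image noise such as head pose or lighting variation, which is one plausible reason the averaged values sit closer to the reference than single-image values.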


Conclusion

This integrated validation study confirms the robust performance of Glandy CAS, EXO, and LID in evaluating TED activity and severity from facial images in a Spanish cohort. Despite the absence of Spanish data in model training, all systems demonstrated strong generalizability and agreement with clinical reference standards, supporting their use in real-world practice.


Additional Authors

First name    Last name        Base Hospital / Institution
Marina        Soto Sierra      Virgen Macarena University Hospital (Seville, Spain)
Raquel        Monge Carmona    Virgen Macarena University Hospital (Seville, Spain)
Mariola       Méndez Muros     Virgen Macarena University Hospital (Seville, Spain)
Kyubo         Shin             Thyroscope Inc. (Seoul, Republic of Korea)
Jae Hoon      Moon             Thyroscope Inc. (Seoul, Republic of Korea)
Jongchan      Kim              Thyroscope Inc. (Seoul, Republic of Korea)
Joonhyeon     Park             Thyroscope Inc. (Seoul, Republic of Korea)

Abstract ID: 25-257