Cancers | Free Full-Text | Detection and Classification of Hysteroscopic Images Using Deep Learning
1. Introduction
In the present study, we aimed to develop a deep learning (DL) model as an automated tool for detecting endometrial pathologies and classifying them as benign or malignant intrauterine lesions, using hysteroscopic images from a consecutive series of women with pathologically confirmed endometrial lesions.
2. Materials and Methods
2.1. Study Protocol and Selection Criteria
We reviewed clinical records, electronic databases, and stored videos of hysteroscopies from all consecutive patients with pathological confirmation of intracavitary uterine lesions at IRCCS Azienda Ospedaliero-Universitaria di Bologna, Bologna, Italy, from January 2021 to May 2021. Retrieved hysteroscopic images were used to build a DL model for the classification and identification of intracavitary uterine lesions with and without the aid of clinical factors.
Intracavitary uterine lesions included endometrial polyps, fibroids, endometrial hyperplasia with and without atypia, and endometrial cancer diagnosed at histological examination of hysteroscopic specimens.
The exclusion criteria were the absence of adequate histological examination, absence of iconographic documentation, presence of uterine dysmorphism, and absence of intrauterine pathology.
2.2. Study Outcomes
The primary outcome was the accuracy of the DL model in classifying intracavitary uterine lesions (overall and by lesion category) without the aid of specific clinical factors.
The secondary outcomes were the following:
- accuracy of the DL model in classifying intracavitary uterine lesions (overall and by lesion category) with the aid of specific clinical factors;
- precision, sensitivity, specificity, and F1 score (i.e., the harmonic mean of precision and sensitivity) of the DL model in classifying intracavitary uterine lesions (overall and by lesion category), with and without the aid of specific clinical factors;
- precision, sensitivity, and F1 score of the DL model in identifying intracavitary uterine lesions, with and without the aid of specific clinical factors;
- the best performance of the DL model during testing in identifying and classifying intracavitary uterine lesions (overall and by lesion category).
Classification referred to the discrimination among three categories of intracavitary uterine lesions: benign focal lesions (i.e., polyps and myomas), benign diffuse lesions (i.e., non-atypical endometrial hyperplasia), and pre-neoplastic/neoplastic lesions (i.e., atypical endometrial hyperplasia and endometrial cancer). Identification, by contrast, referred to the detection of intracavitary uterine lesions. Because only patients with histologically diagnosed intracavitary uterine lesions were included, there were no true negatives for the identification metrics, so specificity was not assessed for identification. For the classification metrics, intracavitary uterine lesions of other categories were considered false negatives.
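The classification metrics listed above can be illustrated with a minimal sketch. It assumes a one-vs-rest reading of a 3 × 3 confusion matrix, which is one common convention and not necessarily the exact counting used in the study. The function name `per_class_metrics` and the example counts are hypothetical.

```python
def per_class_metrics(confusion, labels):
    """One-vs-rest metrics from a square confusion matrix.

    confusion[i][j] is the number of images whose true class is labels[i]
    and whose predicted class is labels[j].
    """
    total = sum(sum(row) for row in confusion)
    metrics = {}
    for k, name in enumerate(labels):
        tp = confusion[k][k]
        fn = sum(confusion[k]) - tp                 # true class k, predicted as another class
        fp = sum(row[k] for row in confusion) - tp  # predicted k, true class differs
        tn = total - tp - fn - fp
        precision = tp / (tp + fp) if tp + fp else 0.0
        sensitivity = tp / (tp + fn) if tp + fn else 0.0
        specificity = tn / (tn + fp) if tn + fp else 0.0
        denom = precision + sensitivity
        # F1 is the harmonic mean of precision and sensitivity.
        f1 = 2 * precision * sensitivity / denom if denom else 0.0
        metrics[name] = {
            "precision": precision,
            "sensitivity": sensitivity,
            "specificity": specificity,
            "f1": f1,
        }
    accuracy = sum(confusion[k][k] for k in range(len(labels))) / total
    return accuracy, metrics


# Hypothetical counts, for illustration only (not study data).
example_labels = ["benign focal", "benign diffuse", "pre-neoplastic/neoplastic"]
example_confusion = [[8, 1, 1], [2, 6, 2], [0, 1, 9]]
example_accuracy, example_metrics = per_class_metrics(example_confusion, example_labels)
print(example_accuracy, example_metrics["benign focal"])
```

For the identification task, the same precision, sensitivity, and F1 formulas apply, but the true-negative count is zero by design, so the specificity term carries no information.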
Clinical factors assessed for aiding DL model performance were age, menopausal status, abnormal uterine bleeding (AUB), hormonal therapy, and tamoxifen use.
2.3. Hysteroscopy and Image Processing
Hysteroscopy with targeted biopsy of intracavitary uterine lesions, using 5 French instruments, was performed in an outpatient setting with 0.9% saline distension and a Bettocchi hysteroscope (Karl Storz, Tuttlingen, Germany). Stills and images from hysteroscopic videos of eligible patients were processed for DL model building. Images and videos were captured with two different hysteroscopic systems, one high-definition and one standard-definition. Features were extracted from the original image: the system extracted the region of interest around the detected lesion at 224 × 224 pixels, as required for the classification task. Manual segmentation was performed by an experienced hysteroscopist.
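As a rough illustration of extracting a 224 × 224 region of interest, the sketch below crops a rectangular box and resizes it by nearest-neighbour sampling. The box format, the interpolation method, and the function name `crop_and_resize` are assumptions made for the example. The source does not describe how the study pipeline cropped or resampled images.

```python
def crop_and_resize(image, box, size=224):
    """Crop box = (top, left, bottom, right) from a 2-D image and resize it.

    image is a list of rows (each row a list of pixel values). The crop is
    resampled to size x size pixels with nearest-neighbour sampling.
    """
    top, left, bottom, right = box
    crop = [row[left:right] for row in image[top:bottom]]
    height, width = len(crop), len(crop[0])
    return [
        [crop[(y * height) // size][(x * width) // size] for x in range(size)]
        for y in range(size)
    ]


# Hypothetical 10 x 10 image whose pixel value encodes its position.
toy_image = [[r * 10 + c for c in range(10)] for r in range(10)]
roi = crop_and_resize(toy_image, box=(2, 3, 6, 9))
print(len(roi), len(roi[0]))
```

In practice, the box would come from the manual segmentation, and an image library with smoother interpolation would normally replace the nearest-neighbour step.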
2.4. Deep Learning
We developed an end-to-end DL model for intracavitary uterine lesion identification and classification. The deep learning process comprised three parts: training, validation, and testing. The dataset was randomly divided into three groups at a ratio of 60:20:20. Two groups were used for training and validation, and the remaining group was used for testing.
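A random 60:20:20 split can be sketched as follows. The function name `split_60_20_20` and the fixed seed are illustrative choices. The source does not state whether the split was made per image or per patient, so the sketch works on any list of identifiers.

```python
import random


def split_60_20_20(items, seed=0):
    """Shuffle items and split them into training, validation, and test lists."""
    items = list(items)
    random.Random(seed).shuffle(items)
    n = len(items)
    n_train = n * 60 // 100  # integer arithmetic avoids floating-point rounding
    n_val = n * 20 // 100
    train = items[:n_train]
    val = items[n_train:n_train + n_val]
    test = items[n_train + n_val:]  # the remainder goes to the test set
    return train, val, test


# 1500 image indices, matching the dataset size reported in the Results.
train_ids, val_ids, test_ids = split_60_20_20(range(1500))
print(len(train_ids), len(val_ids), len(test_ids))  # prints 900 300 300
```

If several images come from the same patient, splitting on patient identifiers instead of image identifiers is a common way to keep one patient's images from appearing in both the training and test sets.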
ResNet50 was used as the deep learning model since it can achieve relatively high accuracy with smaller datasets and lower training costs. ResNet50 was pre-trained on a million natural images from the Microsoft Common Objects in Context dataset and was fine-tuned using images from the training and validation datasets.
We used established techniques to reduce overfitting during the validation process with an iterative method: (a) data augmentation, a process that synthetically generates additional training examples by applying random image transformations; (b) “early stopping”, by which the weights of the network at the point of best performance are saved, as opposed to the weights obtained at the end of training. The performance of the DL model was evaluated using a balanced sampler on image units.
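The early-stopping idea described above can be sketched in a framework-agnostic way. The callables `train_one_epoch` and `validate`, the `patience` parameter, and the use of `copy.deepcopy` to snapshot the model are assumptions for illustration. The source only states that the best-performing weights were kept.

```python
import copy


def fit_with_early_stopping(model, train_one_epoch, validate,
                            max_epochs=50, patience=5):
    """Train until the validation score stops improving.

    Returns a snapshot of the model at its best validation score, not the
    model as it stands when training ends.
    """
    best_score = float("-inf")
    best_state = None
    best_epoch = -1
    for epoch in range(max_epochs):
        train_one_epoch(model)
        score = validate(model)
        if score > best_score:
            best_score, best_epoch = score, epoch
            best_state = copy.deepcopy(model)  # save the best weights so far
        elif epoch - best_epoch >= patience:
            break  # no improvement for `patience` consecutive epochs
    return best_state, best_score, best_epoch
```

With a real network, the snapshot would typically be the framework's weight dictionary written to disk instead of a deep copy held in memory.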
In our methodology, data augmentation was implemented online, meaning it was applied in real-time during the training of the model. This approach differs significantly from the traditional offline augmentation, where an augmented dataset is prepared in advance before the training process begins. Each training batch underwent a unique set of random transformations, ensuring that the model encountered a diverse range of variations in the training images. This dynamic approach to augmentation is crucial in preventing the model from overfitting, as it learns to generalize better from a constantly varying dataset. The specific augmentation steps included in our process were as follows:
- Random Vertical and Horizontal Flipping: each image in the training batch had a chance of being flipped either vertically or horizontally. This step introduces a variety of orientations, helping the model to learn features that are orientation-invariant.
- Random Brightness Adjustment: the brightness of each image was altered using a random factor ranging from 0.8 to 1.2. This variance in brightness ensures the model’s robustness against different lighting conditions.
- Random Contrast Adjustment: similarly, the contrast of each image was modified with a random factor within the same range (0.8 to 1.2). This step helps in training the model to identify features under various contrast levels.
By incorporating these random transformations, our DL model benefits from a more comprehensive and challenging training environment. This online method of data augmentation plays a significant role in enhancing the model’s ability to accurately classify and identify lesions under diverse imaging conditions, ultimately improving its diagnostic efficacy.
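The three augmentation steps can be sketched for a single grayscale image stored as rows of floats in [0, 1]. Several details are assumptions, because the source does not specify them: the 0.5 flip probability, independent vertical and horizontal flips, multiplicative brightness scaling, and contrast scaling about the image mean. The function name `augment` is hypothetical.

```python
import random


def _clamp01(value):
    """Keep a pixel value inside the valid [0, 1] range."""
    return min(1.0, max(0.0, value))


def augment(image, rng, flip_prob=0.5, low=0.8, high=1.2):
    """Apply random flips, brightness, and contrast to one image.

    Intended to be called anew on every image of every training batch
    (online augmentation), so each epoch sees different variants.
    """
    out = [row[:] for row in image]  # never modify the stored original
    if rng.random() < flip_prob:
        out = out[::-1]                    # vertical flip: reverse the row order
    if rng.random() < flip_prob:
        out = [row[::-1] for row in out]   # horizontal flip: reverse each row
    brightness = rng.uniform(low, high)
    out = [[_clamp01(p * brightness) for p in row] for row in out]
    contrast = rng.uniform(low, high)
    mean = sum(sum(row) for row in out) / (len(out) * len(out[0]))
    # Stretch or compress pixel values around the image mean.
    out = [[_clamp01(mean + contrast * (p - mean)) for p in row] for row in out]
    return out


rng = random.Random(0)
toy = [[0.1, 0.2], [0.3, 0.4]]
print(augment(toy, rng))
```

Because the random factors are drawn at call time, no augmented dataset needs to be stored in advance, which is the distinction from offline augmentation drawn above.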
3. Results
3.1. Study Population and Dataset
During the study period, 703 patients underwent hysteroscopy in our center. Four hundred and thirty-seven were excluded from analysis due to lack of imaging or histological examination or both.
We reviewed a total of 1500 images from 266 patients (image-to-patient ratio = 5.6): 186 (69.92%) patients had benign focal lesions (image-to-patient ratio = 5.97), 25 (9.40%) had benign diffuse lesions (image-to-patient ratio = 5.6), and 55 (20.68%) had pre-neoplastic/neoplastic lesions (image-to-patient ratio = 4.55).
Among the benign focal lesions, 21 were myomas and 165 were polyps; among the benign diffuse lesions, 19 were polypoid endometrium and 6 were endometrial hyperplasia without atypia; among the pre-neoplastic/neoplastic lesions, 7 were atypical endometrial hyperplasia, 12 were endometrial intraepithelial neoplasia, and 36 were endometrial cancers.
3.2. Model Performance
Overall, the accuracy of the model in classifying uterine intracavitary lesions without the aid of specific clinical factors was 85.09 ± 1.18%. By category, accuracy was 79.55 ± 1.29% for benign focal lesions, 90.1 ± 0.91% for benign diffuse lesions, and 85.63 ± 1.16% for malignant lesions.
For the identification task, the best performance was achieved with the aid of clinical factors with detection of 85.82%, precision of 93.12%, recall of 91.63%, and an F1 score of 92.37%.
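As a quick consistency check, the reported F1 score can be recomputed from the reported precision and recall using the harmonic-mean definition given in Section 2.2. The helper name `f1_score` is illustrative.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Precision and recall reported for the identification task.
f1 = f1_score(0.9312, 0.9163)
print(f"{100 * f1:.2f}%")  # prints 92.37%
```

The recomputed value agrees with the reported F1 score of 92.37%.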
4. Discussion
This study showed that the DL model had low overall accuracy in the detection and classification of uterine intracavitary diseases. The best performance of the DL model was obtained with the aid of clinical factors for both tasks. However, this improvement was slight.
The low accuracy of our DL model in the detection and classification of intracavitary diseases may reflect the heterogeneity of uterine intracavitary pathology and the small size and heterogeneity of the dataset. Moreover, the lack of images of normal cavities and the small number of patients led to a dataset imbalance problem.
Nevertheless, the best performance of our DL model is close to that of the above-mentioned larger studies. Our DL model might be an updated starting point for future improved DL models in the field.
5. Conclusions
In this study, our DL model achieved a low diagnostic performance in the detection and classification of intracavitary uterine lesions from hysteroscopic images. Although the best diagnostic performance was obtained with the aid of clinical data, this improvement was slight. However, our DL model might be an updated starting point for future improved DL models in the field based on larger datasets.
Our study underscores the importance of continued research in refining DL models for uterine lesion detection and classification. Future efforts should prioritize the expansion of datasets with high-definition images, the inclusion of diverse uterine pathologies, and external validation across multiple centers. Moreover, adding normal uterine cavity images and rarer intrauterine lesions to the training set might enhance the DL model’s diagnostic accuracy.
In conclusion, while our DL model represents a promising step towards automated uterine lesion diagnosis, further refinement and validation are needed before its integration into clinical practice. By addressing current limitations and leveraging advances in AI technology, future DL models hold the potential to significantly improve the accuracy and efficiency of uterine pathology diagnosis, ultimately benefiting patient care and outcomes.