Abstract
This study investigated the ability of an auxiliary diagnostic model based on YOLO-v7 to classify cervical lymphadenopathy images and compared its performance against qualitative visual evaluation by experienced radiologists. Three types of lymph nodes were sampled randomly but not uniformly. The dataset was randomly divided into training, validation, and testing subsets. The model was constructed with PyTorch. It was trained on the training set, and its weighting parameters were tuned on the validation set. Diagnostic performance was compared with that of the radiologists on the testing set. The mAP of the model was 96.4% at the 50% intersection-over-union threshold. The accuracy values of the model were 0.962 for benign lymph nodes, 0.982 for lymphomas, and 0.960 for metastatic lymph nodes. The precision values of the model were 0.928 for benign lymph nodes, 0.975 for lymphomas, and 0.927 for metastatic lymph nodes. The accuracy values of the radiologists were 0.659 for benign lymph nodes, 0.836 for lymphomas, and 0.580 for metastatic lymph nodes. The precision values of the radiologists were 0.478 for benign lymph nodes, 0.329 for lymphomas, and 0.596 for metastatic lymph nodes. The model effectively classifies lymphadenopathies from ultrasound images and outperforms qualitative visual evaluation by experienced radiologists in differential diagnosis.
Introduction
Cervical lymphoma often presents as an abnormal enlargement of lymph nodes in the neck and must be differentiated from several other cervical lymphadenopathies (e.g., granulomatous inflammation, reactive inflammation, immune deficiency syndrome, tuberculosis, systemic lupus erythematosus, and metastasis) to develop an appropriate therapeutic plan1,2. The current diagnostic standard for cervical lymphadenopathy is a pathological examination, though views differ on the proper biopsy sample method for obtaining sufficient tissue for histologic examination of a given lymphadenopathy3. Excision biopsy is generally recommended for classification of lymphoma4,5,6, while only a needle biopsy is required for other cervical lymphadenopathies7. Excision biopsy has a greater risk of trauma-related symptoms and complications than needle biopsy8, so patients with enlarged cervical lymph nodes will usually be examined first with non-invasive imaging. If non-invasive imaging reveals benign lesions, patients can avoid the more invasive biopsy procedure.
Both computed tomography and ultrasound are commonly used for non-invasive diagnostic imaging of cervical lymphadenopathies9,10, though ultrasound is regarded as the first-line examination because it is radiation-free11. Lymph node diagnostic accuracy in differentiating benign from malignant lymph nodes has been improved by a new predictive scoring system based on ultrasound features12, while the diagnostic accuracy of lymphoma has improved with ultrasound-guided machine-learning models13. However, health professionals often encounter challenging cases with overlapping imaging features in differentiating metastasis from lymphoma only with conventional B-mode US and even with Doppler US14. It has been reported that a combination of ultrasound and contrast-enhanced ultrasound has good diagnostic value in distinguishing between cervical lymphadenitis and primary lymphoma15. However, these methods still depend on the subjectivity of the radiologist’s judgment in ultrasound interpretation. Besides, contrast-enhanced ultrasound is expensive and cumbersome to operate, which is not conducive to widespread implementation in grassroots hospitals.
Computer vision approaches to diagnostic image interpretation may overcome such limitations. Multiple computer vision studies have shown that convolutional neural networks (CNNs) have good potential for the detection and classification of cervical lymphadenopathy16,17. Notably, CNNs have been demonstrated to be particularly suitable for computer vision, especially in image interpretation18. Representative CNN algorithms include Region-based Convolutional Neural Networks (R-CNN), Fast R-CNN, Single Shot MultiBox Detector (SSD), and You Only Look Once (YOLO)19,20,21. YOLO identifies objects by localizing them with a bounding box and, at the same time, classifying them according to the probability of belonging to a given class22. The YOLO series represents one-stage algorithms, which are more suited to practical applications than two-stage algorithms (such as Faster R-CNN) owing to their better balance between accuracy and speed23. Zhong et al.24 reported that the YOLO model was superior to the Faster R-CNN model for the Helicobacter pylori detection task. YOLO-v7 leverages a trainable bag-of-freebies approach, enabling significant improvements in precision for real-time detection tasks without incurring additional inference costs. By integrating extended and compound scaling, it effectively reduces the number of parameters and computations, resulting in a substantial acceleration of the detection rate25. To the best of our knowledge, no studies have applied this YOLO model to the diagnostic distinction of cervical lymphadenopathy.
In this study, we aimed to build a YOLO-v7-based artificial intelligence diagnostic model for cervical lymphadenopathy that would reduce subjective influence from radiologists and improve the accuracy of cervical lymphoma detection.
Materials and methods
This retrospective study was approved by the research ethics committee of Zhangzhou Affiliated Hospital of Fujian Medical University (Protocol No. 2022KYB138). All experiments were performed in accordance with relevant guidelines and regulations. The need for informed consent was waived by the ethics committee of Zhangzhou Affiliated Hospital of Fujian Medical University.
Dataset
Ultrasound images of cervical lymph nodes were collected from our hospital between January 2017 and June 2022 retrospectively, including B-mode and Doppler ultrasound. All lymph nodes had determinate pathological results. Patients with incomplete information and unclear pathological results were excluded. The entire dataset comprises three categories: benign lymph nodes (n = 2807), lymphomas (n = 1108), and metastatic lymph nodes (n = 4580).
Ultrasound images were captured with the Mindray Resona 7S Ultrasound Scanner (Mindray BioMedical, Shenzhen, China), Acuson S3000 Scanner (Siemens Medical Solutions USA, Malvern, PA), and Hitachi Vision Preirus Scanner (Hitachi Medical Corp., Chiba, Japan).
Augmenting datasets, labeling images, and dividing image datasets
All images were resized to 640 × 640 pixels and then augmented via rotation and contrast changes to increase the sample size of the training dataset. Because the lymphoma sample set is relatively small, the augmentation ratio of this subset was higher than that of the other two categories. Augmentation variations for the benign lymph node and metastatic lymph node datasets were: rotation (10° clockwise, 10° counter-clockwise, 80° clockwise, 80° counter-clockwise, 90° clockwise, 100° clockwise, 150° clockwise) and contrast change (by a factor of 0.5 or 1.5). Augmentation variations for the lymphoma dataset were: rotation (10° clockwise, 10° counter-clockwise, 45° clockwise, 45° counter-clockwise, 50° clockwise, 50° counter-clockwise, 80° clockwise, 80° counter-clockwise, 90° clockwise, 90° counter-clockwise, 100° clockwise, 100° counter-clockwise, 150° clockwise) and contrast change (by a factor of 0.5, 0.6, 1.0, or 1.5). After augmentation, the total dataset reached 93,814 images distributed as: benign lymph node (n = 28,070), lymphoma (n = 19,944), and metastatic lymph node (n = 45,800) (Table 1). We used the “random.sample” function in Python to randomly divide the total dataset into three subsets: training, validation, and testing. The sampling process did not consider lymph node type; images were drawn at random from the entire dataset. The training set received 90% of the images, while the validation and testing sets each received 5%. The number of lymph nodes in each dataset is detailed in Table 2. The training set was used to train the model. The validation set was used to adjust the weight parameters of the model. The testing set was used to compare the model with qualitative visual evaluation by experienced radiologists.
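The augmentation counts and the 90/5/5 split described above can be sketched as follows. This is a minimal illustration, not the study's code: the variant counts are tallied from the lists in the text (each original image is kept alongside its augmented copies), while the function names, the seed, the integer stand-ins for image files, and the rounding of subset sizes are assumptions.

```python
import random

# Original images per class, as reported in the Dataset section.
ORIGINALS = {"benign": 2807, "lymphoma": 1108, "metastatic": 4580}

# Augmented variants per original image, counted from the lists above:
# benign/metastatic: 7 rotations + 2 contrast changes;
# lymphoma: 13 rotations + 4 contrast changes.
VARIANTS = {"benign": 7 + 2, "lymphoma": 13 + 4, "metastatic": 7 + 2}


def augmented_counts(originals, variants):
    """Each original contributes itself plus one image per variant."""
    return {name: n * (1 + variants[name]) for name, n in originals.items()}


def split_dataset(items, train_frac=0.90, val_frac=0.05, seed=0):
    """Shuffle with random.sample, then cut into train/validation/test."""
    rng = random.Random(seed)  # the seed is an assumption, for repeatability
    shuffled = rng.sample(items, len(items))  # random permutation of all items
    n_train = int(len(items) * train_frac)
    n_val = int(len(items) * val_frac)
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]  # remainder goes to the testing set
    return train, val, test


counts = augmented_counts(ORIGINALS, VARIANTS)
total = sum(counts.values())
image_ids = list(range(total))  # hypothetical stand-ins for image files
train, val, test = split_dataset(image_ids)
print(counts)
print(total, len(train), len(val), len(test))
```

Because the split ignores class labels, the class proportions in each subset only approximate those of the full dataset, and the exact subset sizes depend on how the 5% shares are rounded.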
An experienced radiologist (21 years of experience) manually labeled the images using the graphical annotation software LabelImg. Each lymph node was enclosed in a rectangular bounding box and assigned a category label, such as “benign lymph node”, “lymphoma”, or “metastatic lymph nodes”. The annotations were then saved to a .txt file (Fig. 1).
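For readers unfamiliar with the annotation file, the sketch below assumes the .txt files follow the YOLO text format that LabelImg can export: one line per box, holding a class index followed by the box center and size, each normalized by the image dimensions. The class order, the helper name, and the example box are hypothetical.

```python
# Class order is an assumption; the index written to the label file
# must match the order used in the training configuration.
CLASS_NAMES = ["benign lymph node", "lymphoma", "metastatic lymph nodes"]


def to_yolo_line(class_name, box, image_w=640, image_h=640):
    """Convert a pixel box (x_min, y_min, x_max, y_max) to a YOLO label line."""
    x_min, y_min, x_max, y_max = box
    x_center = (x_min + x_max) / 2 / image_w  # normalized box center
    y_center = (y_min + y_max) / 2 / image_h
    width = (x_max - x_min) / image_w  # normalized box size
    height = (y_max - y_min) / image_h
    class_id = CLASS_NAMES.index(class_name)
    return f"{class_id} {x_center:.6f} {y_center:.6f} {width:.6f} {height:.6f}"


# Hypothetical box drawn around a lymphoma on a 640 x 640 image.
line = to_yolo_line("lymphoma", (160, 200, 480, 440))
print(line)
```

Normalizing by image size keeps the labels valid after the images are resized, provided the aspect ratio is unchanged.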
Training YOLO-v7 model
We acquired both the source code and the pre-trained YOLO-v7 weights from GitHub. The model had originally been trained from scratch on the MS COCO dataset by its original authors26; we further trained it using our own images. Throughout the training process, the model automatically fine-tuned its network structure and optimized its loss function.
Model training was performed on a machine with an Intel Core i9-12900H processor, 32 GB RAM, and a GPU with 8 GB memory. The hardware and software parameters of the training system are shown in Table 3.
The YOLO-v7 model was trained for 300 epochs on the training set. The validation set was used to adjust weighting in the training model. The testing set was used to analyze the capability of the model. A block diagram of the complete methodology is shown in Fig. 2. The classification experiment was performed first on the validation subset and then on the testing subset, and the accuracy value, precision value, recall value, and F1 score of the model were automatically calculated by the software according to Eqs. (1)–(4). True Positive (TP) is a positive sample that is correctly classified. False Positive (FP) is a negative sample incorrectly classified as positive. True Negative (TN) is a negative sample that is correctly classified. False Negative (FN) is a positive sample incorrectly classified as negative.
Accuracy = (TP + TN) / (TP + TN + FP + FN) (1)
Precision = TP / (TP + FP) (2)
Recall = TP / (TP + FN) (3)
F1 score = 2 × Precision × Recall / (Precision + Recall) (4)
Background FN: the model misclassified a lesion as background. Background FP: the model misclassified background as a lesion. The mAP is used to measure the performance of the target detection algorithm. It is obtained by averaging the average precision of all detected categories. AP (average precision): for each category, the area under its precision-recall curve is calculated to obtain AP. This summarizes the performance of the model across different levels of precision and recall. mAP (mean average precision): the AP values of all categories are averaged to obtain mAP, which is a comprehensive evaluation of overall performance.
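As an illustration of how these per-class values follow from TP, FP, TN, and FN, the sketch below treats each class one-vs-rest on a three-class confusion matrix. The matrix is invented toy data, not the study's results, and it omits the background row and column that a detection confusion matrix also carries; the function name is hypothetical.

```python
def per_class_metrics(confusion, class_index):
    """One-vs-rest metrics for one class.

    confusion[i][j] is the number of samples whose true class is i
    and whose predicted class is j.
    """
    n = len(confusion)
    total = sum(sum(row) for row in confusion)
    tp = confusion[class_index][class_index]
    fn = sum(confusion[class_index]) - tp  # true class missed
    fp = sum(confusion[r][class_index] for r in range(n)) - tp  # wrongly assigned
    tn = total - tp - fn - fp
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Toy data only: rows are true classes, columns are predicted classes,
# in the order benign, lymphoma, metastatic.
toy_confusion = [
    [8, 1, 1],
    [0, 9, 1],
    [2, 0, 8],
]
lymphoma_metrics = per_class_metrics(toy_confusion, 1)
print(lymphoma_metrics)
```

In the one-vs-rest view, accuracy counts correct rejections of the other two classes as true negatives, which is why per-class accuracy can be high even when recall is low.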
Visual evaluation by radiologists
The testing subset images were analyzed and diagnostically classified by two radiologists with more than 22 and 29 years of experience in ultrasonography. Radiologists were blinded to patient clinical information and pathological results. The ultrasound features of cervical lymphadenopathy are shown in Fig. 3 and include: absence of echogenic hilum, non-circumscribed margin, necrosis, calcification, grid-like echo, and peripheral vascular pattern27.
Results
Overall performance of YOLO-v7 model
After 200 iterations, the mAP, which is an indicator for measuring the quality of the detection model, gradually stabilizes. The model had the highest mAP at an intersection-over-union threshold of 50% (Fig. 4). At an intersection-over-union threshold of 50%, the model mAP on the testing dataset of 4691 images is 96.4% (Fig. 5).
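To make the 50% intersection-over-union threshold concrete, the following sketch shows how the overlap between a predicted box and a ground-truth box can be computed, and how AP accumulates as the area under the precision-recall curve. It is a simplified, non-interpolated version under assumed inputs; detection toolkits typically interpolate the curve, so their values would differ slightly. All names and numbers are illustrative.

```python
def iou(box_a, box_b):
    """Intersection over union of two (x_min, y_min, x_max, y_max) boxes."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def average_precision(scores, is_true_positive, n_ground_truth):
    """Area under the precision-recall curve, swept by descending confidence."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    tp = fp = 0
    ap = 0.0
    prev_recall = 0.0
    for i in order:
        if is_true_positive[i]:
            tp += 1
        else:
            fp += 1
        recall = tp / n_ground_truth
        precision = tp / (tp + fp)
        ap += (recall - prev_recall) * precision  # rectangle under the curve
        prev_recall = recall
    return ap


def mean_average_precision(ap_per_class):
    """mAP is the unweighted mean of the per-class AP values."""
    return sum(ap_per_class) / len(ap_per_class)


# A detection counts as a true positive at the 50% threshold only if
# its IoU with a ground-truth box of the same class is at least 0.5.
overlap_low = iou((0, 0, 10, 10), (5, 0, 15, 10))
overlap_high = iou((0, 0, 10, 10), (2, 0, 12, 10))
toy_ap = average_precision([0.9, 0.8, 0.7, 0.6], [True, False, True, True], 3)
print(overlap_low >= 0.5, overlap_high >= 0.5, toy_ap)
```

Raising the threshold above 50% demands tighter localization, so fewer detections count as true positives and the mAP generally falls.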
The loss function of the model (Fig. 6) shows that the YOLO-v7 algorithm curve gradually converges as the number of iterations increases and the loss value of classification decreases. After 300 iterations, the loss value of classification stabilizes near zero and the network essentially converges. The confusion matrix (Fig. 7) shows that the YOLO-v7 model has a recall value of 0.842, 0.925, and 0.882 for benign lymph nodes, lymphomas, and metastatic lymph nodes, respectively.
Comparison of performance between the YOLO-v7 model and visual evaluation by radiologists
Table 4 shows the multi-class and individual class parameters for accuracy, recall, precision, and F1 score for the testing set. The values for visual evaluation by radiologists were all lower than those of the YOLO-v7 model. For the radiologists, the recall value for lymphomas was only 0.237.
Discussion
Ultrasound is the preferred diagnostic method for cervical lymphadenopathy. However, the diagnostic accuracy of ultrasound depends critically upon image quality, the professional experience of the radiologists, and the ultrasound instrument itself28,29. Object-detector-based deep learning models have been used to detect, segment, and classify lesions30. Before the emergence of YOLO, object detection algorithms such as DCNN generally required generating a large number of candidate regions and then classifying targets among them. Compared with region-based methods, YOLO-v7 does not require prior detection of potential target regions. It can output the category and location of all targets in a single pass over the image. We trained a YOLO-v7 model that can identify the location of potential lesions on an entire ultrasound image and simultaneously classify lymph nodes as benign, lymphoma, or metastatic. The present study shows that the YOLO-v7 model is clearly superior to qualitative visual evaluation by experienced radiologists in a diagnostic test. The multi-class accuracy and F1 score for the YOLO-v7 model were 0.952 and 0.912, respectively, indicating that the model can accurately identify lymphoma and also effectively distinguish benign and metastatic lymph nodes. To increase the amount of training data, the ultrasound images were augmented by rotation so that the neural network could better learn to discriminate features in a given image. A higher mAP indicates higher average detection accuracy and greater performance. Our model produced a maximum mAP of 96.4%, showing that the model can sensitively detect various classes of cervical lymphadenopathies, especially lymphomas; therefore, this model has achieved our goal. Compared with the YOLO-v7 model, visual evaluation by experienced radiologists would miss most patients with lymphoma and cause unnecessary biopsies prior to lymph node resection.
In a study of ultrasound-based prenatal abnormality detection, the sensitivity range across different medical institutions was 27.5–96%31. This indicates that US is highly dependent on the experience and skills of the radiologists. It has therefore become necessary to find a new technology that can overcome the subjectivity of US diagnosis. Our study showed that the precision value of the radiologists for lymphomas was only 0.329, indicating that many patients would be misdiagnosed with lymphoma and undergo unnecessary lymphadenectomies. This further indicates that visual evaluation by radiologists depends greatly on the personal experience of the radiologist and is therefore highly subjective. It remains difficult to accurately differentiate lymph node diseases, even for senior radiologists. To address these limitations, some studies have combined conventional ultrasound, shear-wave elastography, and contrast-enhanced ultrasonography to detect the stiffness, perfusion pattern, and characteristics of lymph nodes. Experiments have shown that multimodal ultrasonography is a valuable tool for differentiating between benign and malignant lymphadenopathies32. Some advanced automated ultrasound image analysis methods have also been developed to improve the objectivity, accuracy, and intelligence of ultrasound diagnostics and image-guided intervention33,34. The most widely used method is automated feature extraction and informatics analysis using radiomics, but the image segmentation step usually relies on manual delineation, which introduces errors in image feature calculation33. This paper presents an automated and accurate deep-learning-based cervical lymphadenopathy diagnostic technique that does not involve extra feature extraction operations. It is convenient and highly accurate. The accuracy, precision, recall, and F1 score of the model for the three pathological types of lymph nodes are all better than those of qualitative visual evaluation by experienced radiologists.
Given this accuracy and stability, we believe that the YOLO-v7 model is superior to visual evaluation by radiologists for the differential diagnosis of lymphadenopathy.
In clinical practice, it is common for US images to feature calipers and body markers. Consequently, these elements were retained in the US images used to train the YOLO-v7 model. While this decision might introduce some interference during the training phase, it enhances the model's ability to accurately identify target objects and minimize errors caused by non-target elements in subsequent applications.
The main limitation of this project is its scope: the study is retrospective and was performed at a single institution. Prospective multicenter studies will be conducted in the future to further validate our findings. In addition, our study focuses only on ultrasound diagnostics. In future research, we will collect more data from computed tomography and magnetic resonance imaging to train a more efficient YOLO-v7 model.
In conclusion, the YOLO-v7 model classifies ultrasound images effectively and outperforms visual evaluation by radiologists in the differential diagnosis of cervical lymphadenopathy. This suggests that the YOLO-v7 model has high clinical applicability and could be used for rapid screening at low cost.
Data availability
The datasets used and analysed during the current study are available from the corresponding author on reasonable request.
Code availability
The source code of the YOLO-v7 model was publicly obtained from GitHub. The project code has been uploaded to GitHub (https://github.com/YCY2023/yolo-v7-code.git).
References
Zhu, Y. et al. Deep learning radiomics of dual-modality ultrasound images for hierarchical diagnosis of unexplained cervical lymphadenopathy. BMC Med. 20(1), 269 (2022).
Tomita, H. et al. Radiomics analysis for differentiating of cervical lymphadenopathy between cancer of unknown primary and malignant lymphoma on unenhanced computed tomography. Nagoya J. Med. Sci. 84(2), 269–285 (2022).
Eichenauer, D. A. et al. Hodgkin lymphoma: ESMO Clinical Practice Guidelines for diagnosis, treatment and follow-up. Ann. Oncol. 29, 19–29 (2018).
Syrykh, C. et al. Lymph node excisions provide more precise lymphoma diagnoses than core biopsies: A French lymphopath network survey. Blood 140(24), 2573–2583 (2022).
Sigaard, R. K., Wennervaldt, K., Munksgaard, L., Rahbek Gjerdrum, L. M. & Homøe, P. Core needle biopsy is an inferior tool for diagnosing cervical lymphoma compared to lymph node excision. Acta Oncol. 60(7), 904–910 (2021).
Chatani, S. et al. Image-guided core needle biopsy in the diagnosis of malignant lymphoma: Comparison with surgical excision biopsy. Eur. J. Radiol. 127, 108990 (2020).
Oh, K. H. et al. Efficacy of ultrasound-guided core needle gun biopsy in diagnosing cervical lymphadenopathy. Eur. Ann. Otorhinolaryngol. Head Neck Dis. 133(6), 401–404 (2016).
Bassiouni, M., Kang, G., Olze, H., Dommerich, S. & Arens, P. The diagnostic yield of excisional biopsy in cervical lymphadenopathy: A retrospective analysis of 158 biopsies in adults. Ear Nose Throat J. 102(10), 645–649 (2023).
Forghani, R. An update on advanced dual-energy CT for head and neck cancer imaging. Expert Rev. Anticancer Ther. 19(7), 633–644 (2019).
Kang, H. J. et al. Comparison of diagnostic performance of B-mode ultrasonography and Shear Wave Elastography in cervical lymph nodes. Ultrasound Q. 35(3), 290–296 (2019).
Liu, Y. et al. Ultrasound-based radiomics can classify the etiology of cervical lymphadenopathy: A multi-center retrospective study. Front. Oncol. 12, 856605 (2022).
Shen, H. et al. The clinical value of new scoring system of cervical lymph node. Ultrasound Q. 35(3), 269–274 (2019).
Lu, W. et al. A model to predict the prognosis of diffuse large B-cell lymphoma based on ultrasound images. Sci. Rep. 13(1), 3346 (2023).
Białek, E. J. & Jakubowski, W. Mistakes in ultrasound diagnosis of superficial lymph nodes. J. Ultrason. 17(68), 59–65 (2017).
Liu, N. et al. A combination of ultrasound and contrast-enhanced ultrasound improves diagnostic accuracy for the differentiation of cervical tuberculous lymphadenitis from primary lymphoma. Clin. Hemorheol. Microcirc. 85(3), 261–275 (2023).
Seidler, M. et al. Dual-energy CT texture analysis with machine learning for the evaluation and characterization of cervical lymphadenopathy. Comput. Struct. Biotechnol. J. 17, 1009–1015 (2019).
Zhang, W. et al. Deep learning combined with radiomics for the classification of enlarged cervical lymph nodes. J. Cancer Res. Clin. Oncol. 148(10), 2773–2780 (2022).
Tama, B. A., Kim, D. H., Kim, G., Kim, S. W. & Lee, S. Recent advances in the application of artificial intelligence in otorhinolaryngology-head and neck surgery. Clin. Exp. Otorhinolaryngol. 13(4), 326–339 (2020).
Schaap, M. J. et al. Image-based automated psoriasis area severity index scoring by Convolutional Neural Networks. J. Eur. Acad. Dermatol. Venereol. 36(1), 68–75 (2022).
Lenatti, M., Narteni, S., Paglialonga, A., Rampa, V. & Mongelli, M. Dual-view single-shot multibox detector at urban intersections: Settings and performance evaluation. Sensors (Basel) 23(6), 3195 (2023).
Redmon, J., Divvala, S., Girshick, R. & Farhadi, A. You only look once: Unified, real-time object detection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition 779–788 (2016).
Azam, M. A. et al. Deep Learning applied to white light and narrow band imaging videolaryngoscopy: Toward real-time laryngeal cancer detection. Laryngoscope 132(9), 1798–1806 (2022).
Qiu, R. Z. et al. An automatic identification system for citrus greening disease (Huanglongbing) using a YOLO convolutional neural network. Front. Plant Sci. 13, 1002606 (2022).
Zhong, Z. et al. A study on the diagnosis of the Helicobacter pylori coccoid form with artificial intelligence technology. Front. Microbiol. 13, 1008346 (2022).
Soeb, M. J. A. et al. Tea leaf disease detection and identification based on YOLOv7 (YOLO-T). Sci. Rep. 13(1), 6078 (2023).
Wang, C. Y., Bochkovskiy, A. & Liao, H. Y. M. YOLOv7: Trainable bag-of-freebies sets new state-of-the-art for real-time object detectors. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition 7464–7475 (2023).
Ryu, K. H. et al. Cervical lymph node imaging reporting and data system for ultrasound of cervical lymphadenopathy: A pilot study. AJR Am. J. Roentgenol. 206(6), 1286–1291 (2016).
Cai, D. & Wu, S. Efficacy of logistic regression model based on multiparametric ultrasound in assessment of cervical lymphadenopathy—A retrospective study. Dentomaxillofac. Radiol. 51(2), 20210308 (2022).
Junn, J. C., Soderlund, K. A. & Glastonbury, C. M. Imaging of head and neck cancer with CT, MRI, and US. Semin. Nucl. Med. 51(1), 3–12 (2021).
Li, J. et al. Primary bone tumor detection and classification in full-field bone radiographs via YOLO deep learning model. Eur. Radiol. 33(6), 4237–4248 (2023).
Diniz, P. H. B., Yin, Y. & Collins, S. Deep learning strategies for ultrasound in pregnancy. Eur. Med. J. Reprod. Health 6(1), 73–80 (2020).
Yang, J. R., Song, Y., Jia, Y. L. & Ruan, L. T. Application of multimodal ultrasonography for differentiating benign and malignant cervical lymphadenopathy. Jpn. J. Radiol. 39(10), 938–945 (2021).
Li, Z., Wang, Y., Yu, J., Guo, Y. & Cao, W. Deep learning based radiomics (DLR) and its usage in noninvasive IDH1 prediction for low grade glioma. Sci. Rep. 7(1), 5467 (2017).
Liu, S. et al. Deep learning in medical ultrasound analysis: A review. Engineering. 5(2), 261–275 (2019).
Acknowledgements
We gratefully acknowledge the contributions and efforts of all patients who participated in this study. We thank the Natural Science Foundation of Fujian Province (Grant No. 2022J011478) for funding this study.
Author information
Contributions
Wang, Y.G., Yang, C.Y., Yang, Q.T., Zhong, R. and Wang, K.J. performed the research; Shen, H.L. designed the research study; Shen, H.L. and Wang, Y.G. analyzed the data; and Wang, Y.G. and Yang, C.Y. wrote the paper.
Ethics declarations
Competing interests
The authors declare no competing interests.
Additional information
Publisher's note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Wang, Y., Yang, C., Yang, Q. et al. Diagnosis of cervical lymphoma using a YOLO-v7-based model with transfer learning. Sci Rep 14, 11073 (2024). https://doi.org/10.1038/s41598-024-61955-x
DOI: https://doi.org/10.1038/s41598-024-61955-x