
Automatic captioning of early gastric cancer using magnification endoscopy with narrow-band imaging

  • Lixin Gong
    Affiliations
    College of Medicine and Biological Information Engineering, Northeastern University, Shenyang, China

    CAS Key Laboratory of Molecular Imaging, Beijing Key Laboratory of Molecular Imaging, The State Key Laboratory for Management and Control of Complex Systems, Institute of Automation, Chinese Academy of Sciences, Beijing, China
  • Min Wang
    Affiliations
    Department of Gastroenterology, Hepatology and Nutrition, Shanghai Children’s Hospital, Shanghai Jiaotong University, Shanghai, China
  • Lei Shu
    Affiliations
    Department of Gastroenterology, No. 1 Hospital of Wuhan, Wuhan, China
  • Jie He
    Affiliations
    Endoscopy Center, Zhongshan Hospital (Xiamen Branch), Fudan University, Xiamen, China

    Department of Gastroenterology, The Affiliated Dongnan Hospital of Xiamen University, Zhangzhou, China
  • Bin Qin
    Affiliations
    Department of Gastroenterology, the Second Affiliated Hospital of Xi’an Jiaotong University, Xi’an, China
  • Jiacheng Xu
    Affiliations
    Endoscopy Center and Endoscopy Research Institute, Zhongshan Hospital, Fudan University, Shanghai, China

    Shanghai Collaborative Innovation Center of Endoscopy, Shanghai, China
  • Wei Su
    Affiliations
    Endoscopy Center and Endoscopy Research Institute, Zhongshan Hospital, Fudan University, Shanghai, China

    Shanghai Collaborative Innovation Center of Endoscopy, Shanghai, China
  • Di Dong
    Affiliations
    CAS Key Laboratory of Molecular Imaging, Beijing Key Laboratory of Molecular Imaging, The State Key Laboratory for Management and Control of Complex Systems, Institute of Automation, Chinese Academy of Sciences, Beijing, China
  • Hao Hu
    Correspondence
    Reprint requests: Hao Hu, MD, Endoscopy Center and Endoscopy Research Institute, Zhongshan Hospital of Fudan University, 180 Fenglin Rd, Xuhui District, Shanghai, 200032, China; Jie Tian, PhD, Director of the CAS Key Laboratory of Molecular Imaging, Institute of Automation, Chinese Academy of Sciences, No. 95, Zhongguancun East Rd, Beijing, 100190, China.
    Affiliations
    Endoscopy Center and Endoscopy Research Institute, Zhongshan Hospital, Fudan University, Shanghai, China

    Shanghai Collaborative Innovation Center of Endoscopy, Shanghai, China

    Department of Gastroenterology, Shigatse People’s Hospital, Shigatse, China
  • Jie Tian
    Affiliations
    College of Medicine and Biological Information Engineering, Northeastern University, Shenyang, China

    CAS Key Laboratory of Molecular Imaging, Beijing Key Laboratory of Molecular Imaging, The State Key Laboratory for Management and Control of Complex Systems, Institute of Automation, Chinese Academy of Sciences, Beijing, China

    Beijing Advanced Innovation Center for Big Data-Based Precision Medicine, School of Medicine and Engineering, Beihang University, Beijing, China
  • Pinghong Zhou
    Affiliations
    Endoscopy Center and Endoscopy Research Institute, Zhongshan Hospital, Fudan University, Shanghai, China

    Shanghai Collaborative Innovation Center of Endoscopy, Shanghai, China

      Background and Aims

      The detection rate of early gastric cancer (EGC) remains unsatisfactory, and mastering the diagnostic skills of magnifying endoscopy with narrow-band imaging (ME-NBI) requires considerable expertise and experience. We aimed to develop an EGC captioning model (EGCCap) that automatically describes the visual characteristics of ME-NBI images for endoscopists.

      Methods

      ME-NBI images (n = 1886) from 294 cases were collected from multiple centers, and 5658 corresponding text descriptions were constructed following the magnifying endoscopy simple diagnostic algorithm for EGC (MESDA-G). EGCCap was developed using a multiscale meshed-memory transformer. We conducted comprehensive evaluations of EGCCap, including quantitative and qualitative performance, generalization, robustness, interpretability, and assistive value analyses. The metrics used were BLEU, CIDEr, METEOR, ROUGE, SPICE, accuracy, sensitivity, and specificity. Two-sided statistical tests were conducted, and statistical significance was defined as P < .05.
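
      For reference, the caption-quality metrics named above (BLEU, CIDEr, METEOR, ROUGE, SPICE) are the standard COCO-style caption evaluation scores. The sketch below shows how such scores could be computed with the open-source pycocoevalcap toolkit; the package choice and the example captions are illustrative assumptions, not the authors' evaluation code.

```python
# Minimal sketch of COCO-style caption scoring, assuming the open-source
# pycocoevalcap package (pip install pycocoevalcap). The captions below are
# invented placeholders, not sentences from the EGCCap study.
from pycocoevalcap.bleu.bleu import Bleu
from pycocoevalcap.cider.cider import Cider
from pycocoevalcap.rouge.rouge import Rouge

# Reference (ground-truth) and model-generated captions, keyed by image ID.
# Each value is a list of whitespace-tokenized, lower-cased caption strings.
gts = {"img_001": ["a demarcation line is present with an irregular microvascular pattern"]}
res = {"img_001": ["demarcation line is present and the microvascular pattern is irregular"]}

bleu, _ = Bleu(4).compute_score(gts, res)    # [BLEU-1, BLEU-2, BLEU-3, BLEU-4]
cider, _ = Cider().compute_score(gts, res)
rouge, _ = Rouge().compute_score(gts, res)   # ROUGE-L

print(f"BLEU-1: {bleu[0]:.3f}  CIDEr: {cider:.3f}  ROUGE-L: {rouge:.3f}")
# METEOR and SPICE follow the same compute_score(gts, res) interface in
# pycocoevalcap but additionally require a Java runtime, so they are omitted here.
```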

      Results

      EGCCap achieved satisfactory captioning performance, producing correct, coherent, and clinically meaningful sentences in the internal test cohort (BLEU1 = 52.434, CIDEr = 36.734, METEOR = 27.823, ROUGE = 49.949, SPICE = 35.548), and maintained over 80% of this performance when applied to other centers or to corrupted data. The diagnostic ability of endoscopists improved with the assistance of EGCCap, an improvement that was especially significant (P < .05) for junior endoscopists. Endoscopists gave EGCCap an average rating of 7.182, indicating acceptance of the model.
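
      To make the reader-study metrics concrete, the sketch below shows one way to compute accuracy, sensitivity, and specificity for cancer/non-cancer calls and to compare paired diagnoses made without and with EGCCap assistance. McNemar's exact test is used here only as an example of a two-sided test for paired binary outcomes; the abstract does not state which test the authors applied, and all names are hypothetical.

```python
# Illustrative diagnostic-metric helpers; not the authors' analysis code.
import numpy as np
from statsmodels.stats.contingency_tables import mcnemar

def diagnostic_metrics(y_true, y_pred):
    """Accuracy, sensitivity, and specificity for binary labels (1 = EGC, 0 = non-cancer)."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    tp = np.sum((y_true == 1) & (y_pred == 1))
    tn = np.sum((y_true == 0) & (y_pred == 0))
    fp = np.sum((y_true == 0) & (y_pred == 1))
    fn = np.sum((y_true == 1) & (y_pred == 0))
    return {
        "accuracy": (tp + tn) / y_true.size,
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }

def paired_two_sided_p(correct_without, correct_with):
    """Two-sided P value for paired per-image correctness, e.g. one endoscopist
    reading the same images without vs. with EGCCap assistance (McNemar's exact
    test, chosen here purely as an illustration)."""
    a = np.asarray(correct_without, dtype=bool)
    b = np.asarray(correct_with, dtype=bool)
    table = [[np.sum(a & b), np.sum(a & ~b)],
             [np.sum(~a & b), np.sum(~a & ~b)]]
    return mcnemar(table, exact=True).pvalue
```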

      Conclusions

      EGCCap exhibited promising captioning performance and demonstrated satisfactory generalization, robustness, and interpretability. Our study suggests potential value in aiding and improving the diagnosis of EGC and in facilitating the development of automated reporting in the future.

      Abbreviations:

      AI (artificial intelligence), DHXU (Affiliated Dongnan Hospital of Xiamen University), EGC (early gastric cancer), EGCCap (early gastric cancer captioning model), FDZS (Endoscopy Center of Zhongshan Hospital), ME-NBI (magnifying endoscopy with narrow-band imaging), MESDA-G (magnifying endoscopy simple diagnostic algorithm for early gastric cancer), NPV (negative predictive value), PPV (positive predictive value), WHFH (Wuhan First Hospital)

      References

        • Cao W, Chen H-D, Yu Y-W, et al. Changing profiles of cancer burden worldwide and in China: a secondary analysis of the global cancer statistics 2020. Chin Med J 2021;134:783-791.
        • Sun D, Cao M, Li H, et al. Cancer burden and trends in China: a review and comparison with Japan and South Korea. Chin J Cancer Res 2020;32:129.
        • Kato M, Kaise M, Yonezawa J, et al. Magnifying endoscopy with narrow-band imaging achieves superior accuracy in the differential diagnosis of superficial gastric lesions identified with white-light endoscopy: a prospective study. Gastrointest Endosc 2010;72:523-529.
        • Ezoe Y, Muto M, Uedo N, et al. Magnifying narrowband imaging is more accurate than conventional white-light imaging in diagnosis of gastric mucosal cancer. Gastroenterology 2011;141:2017-2025.
        • Yao K, Doyama H, Gotoda T, et al. Diagnostic performance and limitations of magnifying narrow-band imaging in screening endoscopy of early gastric cancer: a prospective multicenter feasibility study. Gastric Cancer 2014;17:669-679.
        • Yamada S, Doyama H, Yao K, et al. An efficient diagnostic strategy for small, depressed early gastric cancer with magnifying narrow-band imaging: a post-hoc analysis of a prospective randomized controlled trial. Gastrointest Endosc 2014;79:55-63.
        • Muto M, Yao K, Kaise M, et al. Magnifying endoscopy simple diagnostic algorithm for early gastric cancer (MESDA-G). Dig Endosc 2016;28:379-393.
        • Yao K, Iwashita A, Tanabe H, et al. Novel zoom endoscopy technique for diagnosis of small flat gastric cancer: a prospective, blind study. Clin Gastroenterol Hepatol 2007;5:869-878.
        • Liao H, Long Y, Han R, et al. Deep learning-based classification and mutation prediction from histopathological images of hepatocellular carcinoma. Clin Translat Med 2020;10:e102.
        • Ding Y, Ruan S, Wang Y, et al. Novel deep learning radiomics model for preoperative evaluation of hepatocellular carcinoma differentiation based on computed tomography data. Clin Translat Med 2021;11:e570.
        • Dong D, Tang L, Li Z-Y, et al. Development and validation of an individualized nomogram to identify occult peritoneal metastasis in patients with advanced gastric cancer. Ann Oncol 2019;30:431-438.
        • Dong D, Fang M-J, Tang L, et al. Deep learning radiomic nomogram can predict the number of lymph node metastasis in locally advanced gastric cancer: an international multicenter study. Ann Oncol 2020;31:912-920.
        • Hu H, Gong L, Dong D, et al. Identifying early gastric cancer under magnifying narrow-band images with deep learning: a multicenter study. Gastrointest Endosc 2021;93:1333-1341.
        • Li L, Chen Y, Shen Z, et al. Convolutional neural network for the diagnosis of early gastric cancer based on magnifying narrow band imaging. Gastric Cancer 2020;23:126-132.
        • Horiuchi Y, Aoyama K, Tokai Y, et al. Convolutional neural network for differentiating gastric cancer from gastritis using magnified endoscopy with narrow band imaging. Dig Dis Sci 2020;65:1355-1363.
        • Fonollà R, van der Zander QE, Schreuder RM, et al. Automatic image and text-based description for colorectal polyps using BASIC classification. Artif Intell Med 2021;121:102178.
        • Ayesha H, Iqbal S, Tariq M, et al. Automatic medical image interpretation: state of the art and future directions. Pattern Recogn 2021;114:107856.
        • Jing B, Xie P, Xing E. On the automatic generation of medical imaging reports. In: Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics, Vol 1 (Long Papers). 2018:2577-2586.
        • Babar Z, van Laarhoven T, Zanzotto FM, et al. Evaluating diagnostic content of AI-generated radiology reports of chest X-rays. Artif Intell Med 2021;116:102075.
        • Singh A, Krishna Raguru J, Prasad G, et al. Medical image captioning using optimized deep learning model. Comput Intell Neurosci 2022;2022:9638438.
        • Mishra S, Banerjee M. Automatic caption generation of retinal diseases with self-trained RNN merge model. In: Advanced Computing and Systems for Security. Springer; 2020:1-10.
        • He K, Zhang X, Ren S, et al. Deep residual learning for image recognition. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2016:770-778.
        • Papineni K, Roukos S, Ward T, et al. BLEU: a method for automatic evaluation of machine translation. In: Proceedings of the 40th Annual Meeting of the Association for Computational Linguistics. 2002:311-318.
        • Vedantam R, Lawrence Zitnick C, et al. CIDEr: consensus-based image description evaluation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2015:4566-4575.
        • Banerjee S, Lavie A. METEOR: an automatic metric for MT evaluation with improved correlation with human judgments. In: Proceedings of the ACL Workshop on Intrinsic and Extrinsic Evaluation Measures for Machine Translation and/or Summarization. 2005:65-72.
        • Lin C-Y. ROUGE: a package for automatic evaluation of summaries. In: Text Summarization Branches Out. 2004:74-81.
        • Anderson P, Fernando B, Johnson M, et al. SPICE: semantic propositional image caption evaluation. In: European Conference on Computer Vision. Springer; 2016:382-398.
        • Sundararajan M, Taly A, Yan Q. Axiomatic attribution for deep networks. In: International Conference on Machine Learning. PMLR; 2017:3319-3328.
        • Chen S, Zhao Q. Boosted attention: leveraging human attention for image captioning. In: Proceedings of the European Conference on Computer Vision. 2018:68-84.
        • Pavlopoulos J, Kougia V, Androutsopoulos I. A survey on biomedical image captioning. In: Proceedings of the Second Workshop on Shortcomings in Vision and Language. 2019:26-36.
        • Deng C, Ding N, Tan M, et al. Length-controllable image captioning. Cham, Switzerland: Springer International Publishing; 2020:712-729.
        • Shin H-C, Roberts K, Lu L, et al. Learning to read chest X-rays: recurrent neural cascade model for automated image annotation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2016:2497-2506.
        • Demner-Fushman D, Antani S, Simpson M, et al. Design and development of a multimodal biomedical information retrieval system. J Comput Sci Eng 2012;6:168-177.
        • Pelka O, Koitka S, Rückert J, et al. Radiology Objects in COntext (ROCO): a multimodal image dataset. In: Intravascular Imaging and Computer Assisted Stenting and Large-Scale Annotation of Biomedical Data and Expert Label Synthesis. Springer; 2018:180-189.
        • Demner-Fushman D, Kohli MD, Rosenman MB, et al. Preparing a collection of radiology examinations for distribution and retrieval. J Am Med Inform Assoc 2016;23:304-310.
        • Wang X, Peng Y, Lu L, et al. ChestX-ray8: hospital-scale chest X-ray database and benchmarks on weakly-supervised classification and localization of common thorax diseases. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2017:2097-2106.
        • Irvin J, Rajpurkar P, Ko M, et al. CheXpert: a large chest radiograph dataset with uncertainty labels and expert comparison. In: Proceedings of the AAAI Conference on Artificial Intelligence. 2019:590-597.

      References

        • Lambert R. The Paris endoscopic classification of superficial neoplastic lesions: esophagus, stomach, and colon: November 30 to December 1, 2002. Gastrointest Endosc 2003;58:S3-S43.
        • Axon A, Diebold M, Fujino M, et al. Update on the Paris classification of superficial neoplastic lesions in the digestive tract. Endoscopy 2005;37:570-578.
        • Dixon M. Gastrointestinal epithelial neoplasia: Vienna revisited. Gut 2002;51:130-131.
        • Chen X, Fang H, Lin T-Y, et al. Microsoft COCO captions: data collection and evaluation server. arXiv preprint arXiv:1504.00325; 2015. Available at: https://arxiv.org/abs/1504.00325. Accessed February 5, 2022.