American Society for Gastrointestinal Endoscopy Artificial Intelligence Task Force|Articles in Press

Framework and metrics for the clinical use and implementation of artificial intelligence algorithms into endoscopy practice: recommendations from the American Society for Gastrointestinal Endoscopy Artificial Intelligence Task Force

Published: February 08, 2023
      The past few years have seen a surge in the development of artificial intelligence (AI) algorithms addressing a variety of needs in GI endoscopy. Before AI algorithms can be accepted into clinical practice, their effectiveness, clinical value, and reliability need to be rigorously assessed. In this article, we provide a guiding framework for all stakeholders in the endoscopy AI ecosystem regarding the standards, metrics, and evaluation methods for emerging and existing AI applications to aid in their clinical adoption and implementation. We also provide guidance and best practices for evaluating AI technologies as they mature in the endoscopy space. Note that this is a living document; periodic updates will be published as progress is made and applications evolve in the field of AI in endoscopy.


      Abbreviations: AI (artificial intelligence), CONSORT (Consolidated Standards of Reporting Trials), FP (false positive), RCT (randomized clinical trial), TP (true positive)