Development and validation of a machine learning-based model for varices screening in compensated cirrhosis (CHESS2001): An international multicenter study

Published: October 14, 2022


      Background and Aims

      The prevalence of high-risk varices (HRV) is low among patients with compensated cirrhosis who undergo esophagogastroduodenoscopy (EGD). Our study aimed to develop a novel machine learning-based model, named ML EGD, for ruling out HRV and avoiding unnecessary EGDs in patients with compensated cirrhosis.


      Methods

      An international cohort from 17 institutions in China, Singapore, and India was enrolled (CHESS2001, NCT04307264). The three variables with the highest importance scores (liver stiffness, platelet count, and total bilirubin) were selected by Shapley additive explanation (SHAP) and input into a light gradient boosting machine (LightGBM) algorithm to develop ML EGD for identification of HRV. Furthermore, we built a free, open-access web-based calculator for ML EGD. The number of spared EGDs and the rate of missed HRV were used to assess the efficacy and safety of the model for varices screening.
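      The two screening endpoints named above can be illustrated with a minimal Python sketch. It is not the study's code: the names `ScreeningSummary` and `summarize_screening` are hypothetical, the toy data are invented for illustration, and the denominator of the missed HRV rate (spared EGDs) is an assumption, since the abstract does not state which denominator was used.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class ScreeningSummary:
    """Counts needed for the two endpoints (hypothetical helper type)."""
    n_patients: int
    n_spared: int       # patients the rule classifies as "no EGD needed"
    n_missed_hrv: int   # spared patients who actually have HRV

    @property
    def spared_rate(self) -> float:
        # Efficacy: share of all screened patients who avoid an EGD.
        return self.n_spared / self.n_patients

    @property
    def missed_hrv_rate(self) -> float:
        # Safety. ASSUMPTION: the denominator is the number of spared EGDs;
        # other definitions divide by all patients with HRV instead.
        return self.n_missed_hrv / self.n_spared if self.n_spared else 0.0


def summarize_screening(ruled_out: Sequence[bool],
                        has_hrv: Sequence[bool]) -> ScreeningSummary:
    """Tally spared EGDs and missed HRV from per-patient flags."""
    if len(ruled_out) != len(has_hrv):
        raise ValueError("ruled_out and has_hrv must have the same length")
    n_spared = sum(1 for r in ruled_out if r)
    n_missed = sum(1 for r, h in zip(ruled_out, has_hrv) if r and h)
    return ScreeningSummary(len(ruled_out), n_spared, n_missed)


# Toy data (invented): 10 patients, 4 ruled out, 1 of those 4 has HRV.
ruled_out = [True, True, True, True, False, False, False, False, False, False]
has_hrv = [False, False, False, True, True, True, False, False, False, False]
summary = summarize_screening(ruled_out, has_hrv)
print(f"spared: {summary.spared_rate:.1%}, missed HRV: {summary.missed_hrv_rate:.1%}")
```

      Keeping the counts separate from the derived rates makes it easy to swap in a different denominator for the missed HRV rate if the full paper defines it differently.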


      Results

      A total of 2,794 patients were enrolled. Of these, 1,283 patients in a real-world cohort from one university hospital in China were used to develop and internally validate ML EGD for varices screening; they were randomly assigned to the training (n = 1,154) and validation (n = 129) cohorts at a ratio of 9:1. In the training cohort, ML EGD spared 607 (52.6%) unnecessary EGDs with a missed HRV rate of 3.6%. In the validation cohort, ML EGD spared 75 (58.1%) EGDs with a missed HRV rate of 1.4%. For external testing, 966 patients from 14 university hospitals in China formed test cohort 1, and 545 patients from two hospitals in Singapore and India formed test cohort 2. In test cohort 1, ML EGD spared 506 (52.4%) EGDs with a missed HRV rate of 2.8%. In test cohort 2, ML EGD spared 224 (41.1%) EGDs with a missed HRV rate of 3.1%. Compared with the Baveno VI criteria, ML EGD spared more screening EGDs in all cohorts (training cohort, 52.6% vs 29.4%; validation cohort, 58.1% vs 44.2%; test cohort 1, 52.4% vs 26.5%; test cohort 2, 41.1% vs 21.1%) (p < 0.001).
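      As a consistency check on the figures above, the following Python sketch recomputes the spared-EGD percentages from the reported counts and expresses the difference from the Baveno VI criteria in percentage points. All numbers are copied from the abstract; the names `COHORTS`, `spared_pct`, `recomputed`, and `gain_points` are illustrative only, and the sketch does not reproduce the reported p-value.

```python
# Per cohort: (patients, EGDs spared by ML EGD,
#              reported ML EGD spared %, reported Baveno VI spared %)
COHORTS = {
    "training": (1154, 607, 52.6, 29.4),
    "validation": (129, 75, 58.1, 44.2),
    "test cohort 1": (966, 506, 52.4, 26.5),
    "test cohort 2": (545, 224, 41.1, 21.1),
}


def spared_pct(n_spared: int, n_patients: int) -> float:
    """Percentage of patients in a cohort who were spared an EGD."""
    return 100.0 * n_spared / n_patients


# Cohort sizes should add up to the enrolled total and the derivation cohort.
total_enrolled = sum(n for n, _, _, _ in COHORTS.values())
derivation_total = COHORTS["training"][0] + COHORTS["validation"][0]

# Recompute each spared percentage from the raw counts (one decimal place).
recomputed = {
    name: round(spared_pct(spared, n), 1)
    for name, (n, spared, _, _) in COHORTS.items()
}

# Difference between ML EGD and Baveno VI, in percentage points.
gain_points = {
    name: round(ml_pct - baveno_pct, 1)
    for name, (_, _, ml_pct, baveno_pct) in COHORTS.items()
}

print(f"enrolled: {total_enrolled}, derivation cohort: {derivation_total}")
for name in COHORTS:
    print(f"{name}: {recomputed[name]}% spared, "
          f"+{gain_points[name]} points vs Baveno VI")
```

      The recomputed percentages agree with the reported ones, and the cohort sizes sum to the 2,794 enrolled patients (1,283 of them in the derivation cohort).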


      Conclusions

      We developed a novel model, named ML EGD, based on liver stiffness, platelet count, and total bilirubin, and made it available as a free web-based calculator. ML EGD could efficiently help rule out HRV and avoid unnecessary EGDs in patients with compensated cirrhosis.

      Acronyms and abbreviations:

      gastroesophageal varices (GEV), high-risk varices (HRV), esophagogastroduodenoscopy (EGD), liver stiffness measurement (LSM), transient elastography (TE), platelet count (PLT), machine learning (ML), total bilirubin (TBIL), alkaline phosphatase (ALP), aspartate aminotransferase (AST), alanine aminotransferase (ALT), white blood cells (WBC), international normalized ratio (INR), hemoglobin (Hb), gamma-glutamyl transpeptidase (GGT), prothrombin time (PT), creatinine (Cr), albumin, light gradient boosting machine algorithm (LightGBM), Shapley additive explanation (SHAP), gradient-based one-side sampling (GOSS), exclusive feature bundling (EFB), interquartile range (IQR), standard deviation (SD), receiver operating characteristic (ROC), area under the ROC curve (AUC), positive predictive value (PPV), negative predictive value (NPV)
      Journal: Gastrointestinal Endoscopy