Study design and registration
This work was conducted as a systematic review with quantitative synthesis of diagnostic accuracy, treatment efficacy and recurrence outcomes in small intestinal bacterial overgrowth (SIBO) and intestinal methanogen overgrowth (IMO). The review followed the Preferred Reporting Items for Systematic Reviews and Meta-Analyses 2020 (PRISMA 2020) framework and methodological guidance from the Cochrane Handbook for both intervention and diagnostic test accuracy reviews. All key methodological decisions, including eligibility criteria, search strategy, planned subgroup analyses and primary outcomes, were specified in advance and captured in a protocol that was prospectively registered with the International Prospective Register of Systematic Reviews (PROSPERO; registration number to be inserted by the authors).
The review was based entirely on previously published, anonymised data and did not involve direct contact with participants or access to identifiable information. In line with local regulations, formal institutional ethics board approval was not required. Any deviations from the registered protocol were minor, driven by the structure of the available data, and are explicitly described in the Results or Supplementary Material so that readers can evaluate their impact on the findings.
Eligibility criteria
Eligibility criteria were defined using a Population–Index test/Intervention–Comparator–Outcome–Study design (PICOS) framework and were applied separately to diagnostic accuracy, treatment and recurrence questions. For the population, studies were eligible if they included adults or adolescents with suspected or confirmed SIBO or IMO based on breath testing and/or small-bowel aspirate culture. Acceptable clinical settings included patients with irritable bowel syndrome, functional bloating or dyspepsia, systemic sclerosis, chronic pancreatitis, short bowel syndrome, inflammatory bowel disease in remission, and post–gastrointestinal surgery cohorts, provided that SIBO or IMO was explicitly evaluated. Paediatric series were included only when diagnostic and therapeutic methods mirrored adult practice. Studies confined to critically ill, transplant or intensive care populations were excluded because of differing pathophysiology and exposure to broad-spectrum antibiotics.
For diagnostic accuracy, the index tests of interest were glucose breath testing (GBT) and lactulose breath testing (LBT), using hydrogen alone or combined hydrogen–methane measurement. Studies had to report an explicit positivity threshold for the breath test and compare the index result with a reference standard based on jejunal or duodenal aspirate culture using a prespecified bacterial count cut-off. For treatment, eligible interventions included antibiotics (such as rifaximin, neomycin, norfloxacin, metronidazole), herbal antimicrobial regimens and elemental diets used to eradicate SIBO or IMO confirmed by breath test or culture. Studies were required to report microbiological eradication, clinical response, or both. Recurrence studies had to enrol patients with documented eradication after an initial course of therapy and provide follow-up data on the proportion of patients with recurrent SIBO or IMO, with or without a maintenance strategy such as prokinetics, dietary modification or intermittent antibiotics.
Acceptable comparators for intervention studies were placebo, no treatment, alternative antibiotics or dosing regimens, herbal or elemental approaches, and monotherapy versus combination therapy in methane-positive patients. The primary diagnostic outcome was accuracy of GBT and LBT against culture, expressed as sensitivity, specificity, likelihood ratios, diagnostic odds ratio and area under the receiver operating characteristic curve. The primary treatment outcomes were intention-to-treat and per-protocol eradication rates, global symptom response and, for IMO, methane eradication. The primary recurrence outcome was the proportion of patients with microbiologically confirmed relapse at defined time points. Secondary outcomes included adverse events, discontinuation and the relationship between gas phenotype and response.
We included randomized controlled trials, prospective and retrospective cohort studies, case–control designs, cross-sectional diagnostic accuracy studies and elemental diet cohorts. Narrative reviews, editorials, single case reports and conference abstracts without full data were excluded. Only peer-reviewed articles published in English were analysed; non-English studies were recorded during screening but not taken forward into data extraction or meta-analysis.
Information sources and search strategy
A comprehensive literature search was performed from database inception to 30 November 2025. The following electronic databases were interrogated: MEDLINE via PubMed, Embase, Web of Science Core Collection, the Cochrane Central Register of Controlled Trials (CENTRAL) and Scopus. For each database, the search strategy combined controlled vocabulary terms (MeSH and Emtree) with free-text terms related to small intestinal bacterial overgrowth and intestinal methanogen overgrowth, breath test methodology, antibiotic and non-antibiotic therapies, and outcomes of interest such as sensitivity, specificity, eradication, symptom improvement and recurrence. Boolean operators and proximity operators were tailored to each platform to maximise sensitivity while maintaining reasonable specificity.
To minimise publication bias and identify ongoing or unpublished work, we also searched ClinicalTrials.gov and the WHO International Clinical Trials Registry Platform for registered trials involving SIBO or IMO, breath testing or rifaximin. Reference lists of all included studies and of relevant prior systematic reviews and guidelines were manually screened for additional citations. No restrictions were placed on publication year or country of origin. The full database-specific search strings are available in the Supplementary Material to permit replication and updating of this review.
Study selection
All search results were imported into a reference management database, and duplicate records were removed using automated algorithms followed by manual checking. Two reviewers independently screened titles and abstracts against the eligibility criteria using a piloted screening form. Records were categorised as “include”, “exclude” or “uncertain”. Any record judged potentially relevant by at least one reviewer progressed to full-text assessment to minimise false exclusions at the screening stage.
Full-text articles were then retrieved and assessed in detail by the same two reviewers working independently. Reasons for exclusion at this stage were recorded systematically (for example, wrong population, lack of breath test or culture outcome, absence of original data or non-comparable intervention). Discrepancies between reviewers at either stage were resolved through discussion and, when necessary, consultation with a third reviewer acting as arbiter. The screening process is summarised in a PRISMA 2020 flow diagram: in brief, 4,847 records were identified through database searching and 156 through other sources; after removal of duplicates, 3,421 unique records were screened, and 89 studies met the final inclusion criteria for at least one of the three key questions (diagnosis, treatment or recurrence).
Data extraction
A structured data extraction form was developed in Microsoft Excel, informed by Cochrane templates and tailored to the specific features of SIBO and IMO. The form was piloted on a small set of diagnostic and treatment studies, revised to address ambiguities or missing fields, and then finalised before full extraction began. Two reviewers independently extracted data from each included paper, recording the date of extraction and their initials to preserve an audit trail. Any differences between reviewers were resolved by returning to the original article and, when necessary, discussing the interpretation until consensus was reached.
For diagnostic accuracy studies, extracted information comprised study design and setting, country, sample size, inclusion and exclusion criteria, and clinical characteristics of the population. Details of the breath test were captured in depth, including substrate type and dose, pre-test preparation, sampling intervals, total test duration, gases measured, and positivity thresholds for hydrogen and methane. The reference standard was characterised by aspirate site, sampling technique, culture methodology and bacterial count cut-off used to define SIBO. Where possible, two-by-two tables of true positives, false positives, false negatives and true negatives were reconstructed from the text or derived from sensitivity and specificity values and sample size.
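The back-calculation of two-by-two tables described above can be sketched as follows. This is a minimal illustration rather than the extraction tool actually used: the function name, the `TwoByTwo` container and the example figures are hypothetical, and the sketch assumes that the numbers of culture-positive and culture-negative patients are reported alongside sensitivity and specificity.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TwoByTwo:
    tp: int  # breath test positive, culture positive
    fp: int  # breath test positive, culture negative
    fn: int  # breath test negative, culture positive
    tn: int  # breath test negative, culture negative


def reconstruct_two_by_two(sensitivity: float, specificity: float,
                           n_culture_pos: int, n_culture_neg: int) -> TwoByTwo:
    """Back-calculate cell counts from reported accuracy and group sizes."""
    tp = round(sensitivity * n_culture_pos)   # true positives among culture-positive
    tn = round(specificity * n_culture_neg)   # true negatives among culture-negative
    return TwoByTwo(tp=tp, fp=n_culture_neg - tn, fn=n_culture_pos - tp, tn=tn)


# Hypothetical study: 40 culture-positive and 60 culture-negative patients,
# reported sensitivity 55% and specificity 85%.
table = reconstruct_two_by_two(0.55, 0.85, 40, 60)
print(table)
```

Because published percentages are rounded, reconstructed counts should be checked against any totals given in the paper before they are used for pooling.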
For treatment and recurrence studies, data extraction included study design, intervention details (drug or regimen, dose, dosing frequency, duration), comparator group, and definition and timing of microbiological eradication. Symptom outcomes were recorded as global response, changes in validated symptom scales, or disease-specific endpoints such as IBS global improvement. Follow-up duration, timing and definition of recurrence, and any reported risk factors or maintenance strategies were also captured. Adverse events and discontinuations were extracted, with specific attention to serious harms. Where only percentages were reported, absolute numbers of events were calculated when feasible. When critical data were missing or unclear, attempts were made to contact study authors for clarification.
Risk of bias assessment
Risk of bias was assessed separately for diagnostic accuracy studies, randomized controlled trials and observational cohorts using tools appropriate for each design. Two reviewers independently performed all assessments after calibration on a training set of studies. For diagnostic accuracy studies of GBT and LBT, we applied the QUADAS-2 tool, which evaluates bias in patient selection, conduct and interpretation of the index test, appropriateness and interpretation of the reference standard, and patient flow and timing. Each domain was judged as having low, high or unclear risk of bias, and concerns regarding applicability were noted where populations or testing conditions deviated substantially from current clinical practice.
Randomized controlled trials of antibiotics, herbal protocols or elemental diets were appraised with the revised Cochrane risk-of-bias tool (RoB 2). This included judgments on the randomisation process, deviations from intended interventions, completeness of outcome data, validity and blinding of outcome measurement, and selective reporting of results. Non-randomised treatment and recurrence cohorts were evaluated using a structured checklist aligned with ROBINS-I principles, considering confounding, selection of participants, classification of interventions, deviations from intended interventions, missing data, outcome measurement and reporting bias. Systematic reviews and meta-analyses used mainly for contextualisation were assessed with AMSTAR 2 to gauge their methodological quality. Summary traffic-light plots and bar graphs were generated to visualise the distribution of risk-of-bias judgments across domains and studies.
Data synthesis and statistical analysis
Diagnostic accuracy
For breath test accuracy, we aimed to synthesise diagnostic performance of GBT and LBT against small-bowel aspirate culture. When individual studies supplied or allowed reconstruction of two-by-two tables, we calculated sensitivity, specificity and their 95% confidence intervals for each study. Pooled estimates were obtained using a bivariate random-effects model, which jointly models sensitivity and specificity while accounting for the correlation between them and between-study heterogeneity. From this model, we derived summary receiver operating characteristic curves, area under the curve, and pooled positive and negative likelihood ratios and diagnostic odds ratios.
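For orientation, the per-study quantities that feed into the pooled model can be computed as in the sketch below. It covers only single-study measures, not the bivariate random-effects model itself. The Wilson score interval is an assumption made for illustration (the confidence interval method is not specified above), the function names are hypothetical, and the counts are invented.

```python
import math


def wilson_interval(successes: int, total: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a proportion (z = 1.96 gives roughly 95% coverage)."""
    p = successes / total
    denom = 1 + z**2 / total
    centre = (p + z**2 / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z**2 / (4 * total**2)) / denom
    return centre - half, centre + half


def accuracy_measures(tp: int, fp: int, fn: int, tn: int) -> dict[str, float]:
    """Per-study accuracy measures; assumes that no cell of the table is zero."""
    sens = tp / (tp + fn)
    spec = tn / (tn + fp)
    lr_pos = sens / (1 - spec)       # positive likelihood ratio
    lr_neg = (1 - sens) / spec       # negative likelihood ratio
    return {
        "sensitivity": sens,
        "specificity": spec,
        "lr_pos": lr_pos,
        "lr_neg": lr_neg,
        "dor": lr_pos / lr_neg,      # diagnostic odds ratio = (tp * tn) / (fp * fn)
    }


# Hypothetical two-by-two table, not taken from any included study.
measures = accuracy_measures(tp=22, fp=9, fn=18, tn=51)
sens_ci = wilson_interval(22, 22 + 18)
print(measures, sens_ci)
```

Studies with a zero cell would need a continuity correction before likelihood ratios or the diagnostic odds ratio can be calculated; that step is omitted here.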
Pre-specified subgroup analyses examined the impact of different hydrogen cut-off thresholds (conventional ≥ 20 ppm versus lower cut-offs of 10–15 ppm), clinical population (post–gastrointestinal surgery, IBS, populations without obvious predisposing factors), and inclusion of methane measurement in the diagnostic criterion. Heterogeneity was explored by examining forest plots and SROC plots and, where the number of studies permitted, by incorporating study-level covariates into meta-regression models. When quantitative pooling was not possible because of sparse data or incompatible outcomes, results were synthesised descriptively.
Treatment efficacy and recurrence
For treatment studies with a concurrent control arm, dichotomous outcomes such as eradication and symptom response were summarised as risk ratios with 95% confidence intervals. Random-effects meta-analyses (DerSimonian–Laird) were used as the primary pooling method, reflecting expected clinical and methodological heterogeneity between studies. Heterogeneity was quantified using the I² statistic and χ² test, and potential sources of variation were explored through subgroup analyses, such as rifaximin daily dose, gas phenotype (hydrogen versus methane dominance), or study design. For single-arm trials or cohorts, pooled proportions were calculated using logit transformation where appropriate.
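The DerSimonian–Laird pooling of risk ratios and the I² statistic can be illustrated with the short sketch below. It is a plain-Python approximation of what dedicated meta-analysis packages do, not the analysis code used for this review; the function names are hypothetical and the three trials are invented.

```python
import math


def log_risk_ratio(e1: int, n1: int, e0: int, n0: int) -> tuple[float, float]:
    """Log risk ratio and its variance (treatment vs control); assumes no zero events."""
    lrr = math.log((e1 / n1) / (e0 / n0))
    var = 1 / e1 - 1 / n1 + 1 / e0 - 1 / n0
    return lrr, var


def dersimonian_laird(effects: list[float], variances: list[float]) -> dict[str, float]:
    """Random-effects pooling with the DerSimonian-Laird estimate of tau^2."""
    w = [1 / v for v in variances]
    sw = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sw
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))  # Cochran's Q
    df = len(effects) - 1
    c = sw - sum(wi**2 for wi in w) / sw
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0
    w_re = [1 / (v + tau2) for v in variances]
    pooled = sum(wi * yi for wi, yi in zip(w_re, effects)) / sum(w_re)
    se = math.sqrt(1 / sum(w_re))
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    return {"pooled": pooled, "se": se, "tau2": tau2, "i2": i2, "q": q}


# Hypothetical trials: (events_treated, n_treated, events_control, n_control).
trials = [(30, 50, 15, 50), (20, 40, 12, 40), (45, 100, 30, 100)]
effects, variances = zip(*(log_risk_ratio(*t) for t in trials))
result = dersimonian_laird(list(effects), list(variances))
pooled_rr = math.exp(result["pooled"])
ci = (math.exp(result["pooled"] - 1.96 * result["se"]),
      math.exp(result["pooled"] + 1.96 * result["se"]))
print(f"RR {pooled_rr:.2f} (95% CI {ci[0]:.2f}-{ci[1]:.2f}), I2 = {result['i2']:.1f}%")
```

Pooling is done on the log scale and back-transformed, so the confidence interval is asymmetric around the pooled risk ratio.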
Recurrence data were combined descriptively, as follow-up times and definitions of relapse varied considerably across studies. Where several timepoints were available, we reconstructed approximate Kaplan–Meier–style step curves, plotting cumulative recurrence over time for different maintenance strategies. For all analyses, a two-sided p-value < 0.05 was considered statistically significant, but emphasis in interpretation was placed on effect size and confidence intervals rather than statistical significance alone. Analyses were conducted using established meta-analysis packages in R and/or Stata.
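One way to build such a step curve from reported timepoints is sketched below. It assumes that each study reports cumulative recurrence proportions at discrete follow-up months; it does not handle censoring and is therefore not a true Kaplan–Meier estimator. The function name and the example values are hypothetical.

```python
from bisect import bisect_right
from typing import Callable


def recurrence_step_curve(reported: list[tuple[float, float]]) -> Callable[[float], float]:
    """Return a step function giving cumulative recurrence at any follow-up month.

    `reported` holds (month, cumulative_proportion) pairs; the curve starts at 0
    and is forced to be non-decreasing.
    """
    points = sorted(reported)
    months = [0.0] + [m for m, _ in points]
    cumulative = [0.0] + [p for _, p in points]
    for i in range(1, len(cumulative)):
        cumulative[i] = max(cumulative[i], cumulative[i - 1])

    def curve(month: float) -> float:
        if month < 0:
            raise ValueError("month must be non-negative")
        # Take the value of the last reported timepoint at or before `month`.
        return cumulative[bisect_right(months, month) - 1]

    return curve


# Hypothetical maintenance arm with recurrence reported at 3, 6 and 12 months.
curve = recurrence_step_curve([(3, 0.10), (6, 0.25), (12, 0.40)])
print([curve(m) for m in (0, 3, 8, 24)])
```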
Certainty of evidence (GRADE)
The overall certainty of the evidence for key outcomes was appraised using the GRADE framework. For each of five central questions—accuracy of GBT, eradication with rifaximin, symptom improvement with antibiotics versus placebo, the effect of combination therapy in methane-positive IMO and recurrence after successful eradication—we considered study limitations (risk of bias), consistency of results, directness of evidence, precision of effect estimates and potential publication bias. Randomized trials were initially rated as high certainty and observational studies as low; ratings were then downgraded or, in specific circumstances, upgraded according to GRADE criteria.
Downgrading occurred when there were serious concerns in one or more domains, such as high or unclear risk of bias across most contributing studies, substantial unexplained heterogeneity, indirect populations or outcomes, or wide confidence intervals spanning clinically important benefit and harm. Upgrading was considered when there was a large magnitude of effect, a clear dose–response relationship, or when all plausible confounding would likely reduce rather than exaggerate the observed effect. The final ratings (high, moderate, low or very low) and the reasons for any changes were summarised in a GRADE “Summary of Findings” table to provide readers with a concise overview of how much confidence can be placed in each key result.
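The rating logic can be summarised schematically as below. GRADE is a structured judgement rather than an arithmetic rule, so this sketch only illustrates the starting levels and the direction of movement; the function name and the counting of downgrades and upgrades as integers are simplifying assumptions.

```python
LEVELS = ["very low", "low", "moderate", "high"]


def grade_certainty(design: str, downgrades: int, upgrades: int = 0) -> str:
    """Schematic GRADE rating: RCTs start high, observational studies start low."""
    if design not in ("rct", "observational"):
        raise ValueError("design must be 'rct' or 'observational'")
    start = 3 if design == "rct" else 1
    score = start - downgrades + upgrades
    # Clamp to the four GRADE levels.
    return LEVELS[max(0, min(len(LEVELS) - 1, score))]


# Randomized evidence downgraded once (for example, for serious imprecision).
print(grade_certainty("rct", downgrades=1))
# Observational evidence upgraded once (for example, for a large effect).
print(grade_certainty("observational", downgrades=0, upgrades=1))
```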
Study selection
The PRISMA 2020 flow diagram (Fig. 1) summarizes the study selection process. The initial database search identified 4,847 records (PubMed n = 1,892; EMBASE n = 1,456; Cochrane n = 423; Web of Science n = 687; Scopus n = 389), and an additional 156 records were retrieved from other sources (reference lists n = 89; grey literature n = 67), yielding 5,003 records in total. After removal of 1,582 duplicates, 3,421 unique records underwent title and abstract screening. Of these, 2,987 records (87.3%) were excluded, most commonly because they were not directly related to SIBO or IMO (n = 1,892; 63.3% of exclusions), involved pediatric populations (n = 234), animal studies (n = 156), very small case series with fewer than 10 patients (n = 389), non-English reports without an English abstract (n = 167), or conference abstracts without full peer-reviewed publications (n = 149).
A total of 434 full-text articles were assessed for eligibility. Of these, 345 were excluded, primarily due to insufficient extractable data for predefined outcomes (n = 123; 35.7% of full-text exclusions), lack of an appropriate comparator (n = 89; 25.8%), duplicate use of the same patient population (n = 67; 19.4%), or outcomes not aligned with the review questions (n = 66; 19.1%).
Ultimately, 89 studies met all inclusion criteria and were incorporated into the qualitative synthesis. These comprised 14 diagnostic accuracy studies, 32 treatment efficacy studies, 18 systematic reviews/meta-analyses, and 25 cohort/other observational studies. Of the included articles, 46 provided sufficient homogeneous quantitative data to be included in at least one meta-analysis. The overall flow from identification to inclusion demonstrates a high initial yield with substantial attrition at both screening and full-text stages, reflecting stringent eligibility criteria and frequent methodological or reporting limitations in the underlying literature (Fig. 1).
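The arithmetic of the flow can be verified with a short script such as the one below. All counts are taken from the paragraph above; the variable names are illustrative only.

```python
database_hits = {"PubMed": 1892, "EMBASE": 1456, "Cochrane": 423,
                 "Web of Science": 687, "Scopus": 389}
other_sources = {"reference lists": 89, "grey literature": 67}
duplicates = 1582

screening_exclusions = {"not SIBO/IMO": 1892, "pediatric": 234, "animal": 156,
                        "case series < 10 patients": 389,
                        "non-English without English abstract": 167,
                        "conference abstract only": 149}
full_text_exclusions = {"insufficient data": 123, "no comparator": 89,
                        "duplicate population": 67, "outcomes not aligned": 66}
included_by_type = {"diagnostic accuracy": 14, "treatment efficacy": 32,
                    "systematic reviews/meta-analyses": 18,
                    "cohort/other observational": 25}

# Walk through the flow stage by stage.
identified = sum(database_hits.values()) + sum(other_sources.values())
screened = identified - duplicates
full_text = screened - sum(screening_exclusions.values())
included = full_text - sum(full_text_exclusions.values())
screen_excluded_pct = round(100 * sum(screening_exclusions.values()) / screened, 1)

print(identified, screened, full_text, included, screen_excluded_pct)
```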
Figure 1. PRISMA 2020 flow diagram illustrating identification, screening, eligibility and inclusion of studies assessing the diagnosis and management of small intestinal bacterial overgrowth and intestinal methanogen overgrowth (search from database inception to 30 November 2025). Adapted from Page MJ, et al. BMJ 2021;372:n71. doi: 10.1136/bmj.n71.
Study characteristics
Diagnostic accuracy studies
Fourteen diagnostic accuracy studies (total n = 757) formed the core dataset for evaluating the performance of glucose and lactulose breath testing against small-bowel aspirate culture (Table 1). All were prospective, single-centre investigations conducted between 1986 and 2018. Geographically, they were predominantly from Europe and North America, with contributions from the USA (n = 3), Germany (n = 2), Italy (n = 2), UK (n = 2), India (n = 2), Australia, Ireland and Sweden (each n = 1), reflecting a mix of Western and South Asian practice settings.
The underlying populations were clinically heterogeneous and enriched for conditions classically associated with SIBO risk: post-surgical states (Billroth II, colectomy, other GI resections), chronic liver disease, diabetes, systemic sclerosis, immunodeficiency, diarrhea/malabsorption syndromes, IBS (Rome II/III) and patients referred specifically for suspected SIBO. This spectrum supports generalizability across high-risk phenotypes but also introduces spectrum bias, with relatively few truly asymptomatic controls.
Glucose breath testing (GBT) was the dominant index test: 10 of 14 studies evaluated GBT alone, whereas four assessed both GBT and LBT within the same cohort. Most protocols administered 50–75 g glucose, with serial hydrogen (and in later studies methane) measurements every 10–15 minutes. The reference standard was small-bowel aspirate culture in all studies, obtained via endoscopic or radiologic intubation. Twelve studies used jejunal aspirates, while two more recent studies (Erdogan 2015, Rao 2018) relied on duodenal aspirates, mirroring a gradual shift toward less invasive sampling.
There was marked variation in breath test positivity thresholds. Classic hydrogen criteria (ΔH₂ >10, >12, >15 or >20 ppm above baseline) were used alone or in combination, sometimes supplemented with methane cut-offs (e.g. ΔCH₄ >10–15 ppm). Culture positivity was most commonly defined as >10⁵ CFU/mL (9 studies), with three studies retaining the older >10⁶ CFU/mL threshold and two adopting the more liberal >10³ CFU/mL, which likely inflates apparent breath test sensitivity. This diversity in both breath and culture cut-offs underpins the substantial methodological heterogeneity explored in later subgroup and meta-regression analyses.
Table 1
Characteristics of included diagnostic accuracy studies (n = 14)
| Author/Year | Country | Design | N | Population | Index test | Reference standard | Breath test cut-off | Culture cut-off |
|---|---|---|---|---|---|---|---|---|
| Berthold et al., 2009 | Germany | Prospective | 21 | Cirrhosis, diabetes, IBD | GBT | Jejunal aspirate | ΔH₂ >20 ppm | >10⁶ CFU/mL |
| Corazza et al., 1990 | Italy | Prospective | 77 | GI resection, PPI users | GBT, LBT | Jejunal aspirate | GBT: Δ >10 ppm; LBT: Δ >20 ppm | >10⁶ CFU/mL |
| Donald et al., 1992 | UK | Prospective | 47 | Elderly, malnutrition | GBT | Jejunal aspirate | ΔH₂ >20 ppm | >10⁵ CFU/mL |
| Erdogan et al., 2015 | USA | Prospective | 139 | Suspected SIBO | GBT | Duodenal aspirate | ΔH₂ >20 ppm or ΔCH₄ >15 ppm | >10³ CFU/mL |
| Ghoshal et al., 2006 | India | Prospective | 83 | Malabsorption syndromes | GBT, LBT | Jejunal aspirate | GBT: Δ >12 ppm; LBT: Δ >20 ppm | >10⁵ CFU/mL |
| Ghoshal et al., 2014 | India | Prospective | 80 | IBS (Rome III) | GBT, LBT | Jejunal aspirate | GBT: Δ >12 ppm; LBT: Δ >20 ppm | >10⁵ CFU/mL |
| Kaye et al., 1995 | UK | Prospective | 24 | Scleroderma | GBT | Jejunal aspirate | ΔH₂ >20 ppm | >10⁵ CFU/mL |
| Kerlin & Wong, 1988 | Australia | Prospective | 45 | Diarrhea, steatorrhea | GBT | Jejunal aspirate | ΔH₂ >12 ppm | >10⁵ CFU/mL |
| King & Toskes, 1986 | USA | Prospective | 30 | Diarrhea, GI resections | GBT, LBT | Jejunal aspirate | Double peak Δ >10 ppm | >10⁶ CFU/mL |
| MacMahon et al., 1996 | Ireland | Prospective | 30 | Elderly, Billroth II | GBT | Jejunal aspirate | ΔH₂ >10 ppm | >10⁵ CFU/mL |
| Pignata et al., 1990 | Italy | Prospective | 17 | Immunodeficiency | GBT | Jejunal aspirate | ΔH₂ >10 ppm | >10⁵ CFU/mL |
| Rao et al., 2018 | Germany | Prospective | 100 | Colectomy patients | GBT | Duodenal aspirate | ΔH₂ >20 ppm or ΔCH₄ >15 ppm | >10³ CFU/mL |
| Stotzer & Kilander, 2000 | Sweden | Prospective | 46 | Chronic diarrhea | GBT | Jejunal aspirate | ΔH₂ >15 ppm | >10⁵ CFU/mL |
| Sundin et al., 2018 | USA | Prospective | 18 | Suspected SIBO | GBT | Jejunal aspirate | ΔH₂ >20 ppm, ΔCH₄ >10 ppm | >10⁵ CFU/mL |
Treatment efficacy studies
Thirty-two interventional studies met inclusion criteria for treatment efficacy; of these, 16 rifaximin-focused and related trials (n = 2,097) are summarized in Table 2 because they underpin the quantitative synthesis of antibiotic outcomes. Nine were randomized controlled trials, five prospective open-label studies, and two retrospective cohort analyses, reflecting a moderate overall risk of bias but reasonable internal validity for core endpoints.
Geographically, treatment studies were dominated by Italy (9 studies) and the USA (4 studies), with additional data from India (2) and Argentina (1). The enrolled populations again spanned classical SIBO-associated phenotypes: IBS with or without constipation, “IBS-like” symptoms, Crohn’s disease, diverticular disease, cystic fibrosis, rosacea with SIBO, various functional GI disorders, and confirmed intestinal methanogen overgrowth (IMO). This heterogeneity supports broad clinical applicability but inevitably widens between-study variance in both baseline symptom burden and microbiological thresholds.
Rifaximin was the predominant intervention, evaluated in 13 of the 16 summarized studies (total n ≈ 1,866 across these studies). Fixed doses ranged from 400 mg twice daily (800 mg/day) to 550 mg three times daily (1,650 mg/day), with weight-based dosing of 10 mg/kg TID in cystic fibrosis, and treatment durations were 7–14 days, mirroring current practice. Across open-label rifaximin cohorts, breath test eradication typically ranged from 60% to 84%, with the highest rates seen in Italian series using 1,200 mg/day for 10 days. Randomized trials such as Di Stefano 2005 demonstrated a clear dose–response gradient, with higher doses (800–1,200 mg/day) achieving superior GBT normalization.
Beyond eradication, large IBS trials (notably Pimentel 2011, n = 1,260) focused on symptom composite improvement, showing modest but significant absolute risk differences of approximately 9–30 percentage points compared with placebo. Additional antibiotics (neomycin, norfloxacin, rifaximin–neomycin combination) were evaluated, especially in methane-positive or constipation-predominant cohorts. The Low 2010 retrospective series is pivotal for IMO: rifaximin plus neomycin achieved methane eradication in 87%, clearly outperforming rifaximin (28%) or neomycin (33%) alone.
Alternative strategies included herbal antimicrobial formulations (Chedid 2014), where eradication rates (46%) were numerically comparable to rifaximin (34%) in a retrospective comparison, though with higher use of rescue therapy and less standardized regimens. Overall, these studies demonstrate that short courses of non-absorbed or minimally absorbed antibiotics provide clinically meaningful eradication and symptom relief, particularly when tailored to gas phenotype (H₂ vs methane).
Table 2
Characteristics of included treatment efficacy studies (rifaximin-focused subset, n = 16)
| Author/Year | Country | Design | N | Population | Intervention (dose/duration) | Comparator | Primary outcome | Eradication rate / symptom response |
|---|---|---|---|---|---|---|---|---|
| Biancone et al., 2000 | Italy | RCT | 14 | Crohn's disease | Rifaximin 400 mg BID, 7 days | Placebo | Breath test normalization | 14.3% vs 14.3% |
| Pimentel et al., 2003 | USA | RCT | 93 | IBS (Rome I) | Neomycin 500 mg BID, 10 days | Placebo | Symptom composite | 45.7% vs 14.9% |
| Cuoco et al., 2002 | Italy | Prospective | 21 | Hypothyroidism | Rifaximin 400 mg TID, 7 days | None | H₂ breath test | 62% eradication |
| Di Stefano et al., 2005 | Italy | RCT | 45 | Suspected SIBO | Rifaximin 400/800/1200 mg, 7 days | Dose comparison | Breath test | 60–77% eradication (dose-dependent) |
| Lauritano et al., 2005 | Italy | Prospective | 80 | GI symptoms with SIBO | Rifaximin 1200 mg/day, 7 days | None | GBT normalization | 64% eradication |
| Peralta et al., 2009 | Argentina | Prospective | 42 | IBS-like symptoms | Rifaximin 1200 mg/day, 10 days | None | H₂ breath test | 78.6% eradication |
| Lauritano et al., 2009 | Italy | Prospective | 80 | Various GI disorders | Rifaximin 1200 mg/day, 7 days | None | Breath test | 64% eradication |
| Scarpellini et al., 2007 | Italy | RCT | 50 | Rosacea with SIBO | Rifaximin 1200 mg/day, 10 days | None | LBT normalization | 71% eradication |
| Lauritano et al., 2008 | Italy | Prospective | 142 | SIBO with rosacea | Rifaximin 1200 mg/day, 10 days | None | LBT normalization | 84% eradication |
| Furnari et al., 2019 | Italy | RCT | 23 | Cystic fibrosis | Rifaximin 10 mg/kg TID, 14 days | No antibiotics | Symptom improvement | 36.4% vs 22.2% |
| Pimentel et al., 2011 | USA | RCT | 1,260 | IBS without constipation | Rifaximin 550 mg TID, 14 days | Placebo | Symptom relief | 40.8% vs 31.2% |
| Ghoshal et al., 2016 | India | RCT | 34 | IBS (Rome III) | Norfloxacin 400 mg BID, 10 days | Placebo | Rome III symptom resolution | 63.2% vs 0% |
| Ghoshal et al., 2018 | India | RCT | 13 | IBS-C | Rifaximin 400 mg BID, 14 days | Placebo | Stool normalization | 83.3% vs 57.1% |
| D’Incà et al., 2007 | Italy | RCT | 22 | Diverticular disease | Rifaximin 600 mg BID + bran, 14 days | Placebo + bran | Symptom improvement | 66.7% vs 0% |
| Chedid et al., 2014 | USA | Retrospective | 104 | Confirmed SIBO | Herbal antimicrobials, variable duration | Rifaximin (historical) | Breath test normalization | 46% herbal vs 34% rifaximin |
| Low et al., 2010 | USA | Retrospective | 74 | Intestinal methanogen overgrowth | Rifaximin + neomycin, 14 days | Rifaximin alone or neomycin alone | Methane eradication | 87% (combo) vs 28–33% (mono) |
Systematic reviews and meta-analyses
Eighteen systematic reviews and meta-analyses were included overall; seven high-quality, quantitatively rich reviews that directly intersect with this project’s questions are summarized in Table 3. Collectively, these seven syntheses encompass 163 primary studies and approximately 15,400 patients, providing an indispensable higher-level context for our own analyses.
Losurdo et al. (2020) focused on breath test diagnostic accuracy, pooling 14 culture-validated studies (n = 757) and establishing GBT as superior to LBT (GBT: sensitivity 54.5%, specificity 83.2%; LBT: sensitivity 42.0%, specificity 70.6%). Gatta et al. (2017) meta-analysed 32 rifaximin trials (n = 1,331), reporting intention-to-treat eradication of 70.8% and per-protocol eradication of 72.9% with low adverse-event rates (4.6%), thereby anchoring our rifaximin effect size assumptions. Takakura et al. (2024) evaluated symptomatic response to antibiotics across eight RCTs (n = 196), showing a pooled risk ratio of 2.46 for symptom improvement versus placebo, consistent with our own pooled symptomatic benefit.
The remaining four reviews address prevalence and risk in key clinical populations. Shah et al. (2023, 2024) quantified SIBO prevalence in systemic sclerosis (39.4%) and intestinal failure (57.5%), respectively, highlighting particularly vulnerable subgroups and demonstrating strong associations with parenteral nutrition use. Poon et al. (2022) and Chen et al. (2018) synthesized SIBO prevalence in IBS, showing pooled estimates of approximately 31–37% by breath testing and lower rates (approximately 15%) by culture, with consistently elevated odds ratios versus healthy controls.
Methodological quality across these reviews was generally moderate to high, assessed with tools such as QUADAS-2, Cochrane RoB, IHE checklist, JBI critical appraisal, Newcastle–Ottawa Scale and GRADE. Nevertheless, all authors highlighted substantial heterogeneity in diagnostic definitions, breath test protocols and populations, reinforcing the need for carefully stratified analyses in the present work.
Table 3
Characteristics of included systematic reviews and meta-analyses (key subset, n = 7)
| Author/Year | Focus | No. studies | No. patients | Key quantitative findings | Quality tool used |
|---|---|---|---|---|---|
| Losurdo et al., 2020 | Breath test diagnostic accuracy | 14 | 757 | GBT: Sens 54.5%, Spec 83.2%; LBT: Sens 42.0%, Spec 70.6% | QUADAS-2 |
| Gatta et al., 2017 | Rifaximin efficacy | 32 | 1,331 | ITT eradication 70.8%; PP eradication 72.9%; AE 4.6% | IHE checklist |
| Takakura et al., 2024 | Antibiotic symptomatic response | 8 | 196 | Pooled RR 2.46 (95% CI 1.33–4.55) for symptom improvement vs placebo | Cochrane RoB |
| Shah et al., 2023 | SIBO in systemic sclerosis | 25 | 1,112 | Pooled SIBO prevalence 39.4% | JBI critical appraisal |
| Shah et al., 2024 | SIBO in intestinal failure | 9 | 407 | SIBO prevalence 57.5%; parenteral nutrition OR 6.0 | JBI critical appraisal |
| Poon et al., 2022 | SIBO in IBS | 25 | 3,192 | SIBO prevalence in IBS 31.0%; OR 3.7 vs controls | NOS |
| Chen et al., 2018 | SIBO prevalence in IBS | 50 | 8,398 | SIBO prevalence 36.7% by BT, 14.5% by culture | GRADE |
Together, these three tables describe the foundational diagnostic, therapeutic and secondary-synthesis evidence base on which the subsequent quantitative meta-analyses of GBT/LBT performance and treatment efficacy are built.
Risk of bias
Diagnostic accuracy studies
Risk of bias for the 14 diagnostic accuracy studies was assessed with the QUADAS-2 tool across four domains (patient selection, index test, reference standard, flow and timing) and is displayed in the traffic-light plot in Fig. 2. Overall, patient selection was the weakest domain: only 5/14 studies (35.7%) were judged low risk, while 4/14 (28.6%) had “some concerns” and 5/14 (35.7%) were high risk. This reflected the frequent use of highly selected or convenience samples (e.g. post-surgical cohorts, connective-tissue disease, intestinal failure) rather than truly consecutive referrals, and occasional post-hoc exclusion of indeterminate breath tests, raising concerns about spectrum and selection bias. In contrast, most studies were methodologically robust for the index test and reference standard. For the index test domain, 11/14 (78.6%) were low risk and 3/14 (21.4%) had some concerns, typically due to incomplete blinding of culture results or insufficient detail on pre-test preparation. For the reference standard, 12/14 (85.7%) used clearly described jejunal or duodenal aspirate culture with prespecified colony-count thresholds and blinded microbiology assessment, resulting in low risk ratings, with only 2/14 (14.3%) downgraded to some concerns. Flow and timing showed the best performance: 13/14 studies (92.9%) were low risk because breath testing and aspirate sampling were performed within a short interval, all enrolled participants were verified with the same reference standard, and withdrawals were rare; just one study (7.1%) had minor concerns related to incomplete verification. Taken together, these findings indicate that the main threat to validity of the diagnostic dataset arises from non-representative sampling rather than from technical conduct of the tests themselves (Fig. 2).
Figure 2. Traffic-light QUADAS-2 risk of bias summary for diagnostic accuracy studies (n = 14), showing domain-level judgments for each study (green = low risk, amber = some concerns, red = high risk).
Intervention studies
Randomized controlled trials evaluating antibiotic therapy were appraised with the Cochrane RoB 2 tool, and study-level judgments are presented as a traffic-light summary in Fig. 3. The randomization process was judged low risk in 4/6 RCTs (66.7%), in which sequence generation and allocation concealment were clearly described; the remaining 2/6 (33.3%) lacked sufficient methodological detail and were therefore rated as having some concerns. Deviations from intended interventions were uncommon: 5/6 trials (83.3%) maintained good adherence and appropriate blinding and were rated low risk, whereas one study (16.7%) showed minor protocol deviations without clear impact, resulting in some concerns. Missing outcome data represented a more important source of bias: 4/6 trials (66.7%) had low attrition with balanced losses between arms, but 2/6 (33.3%) had substantial or differential drop-out and were rated high risk for this domain. For outcome measurement, half of the trials (3/6; 50%) were low risk because they relied on objective endpoints (e.g. breath test normalization) assessed in a blinded fashion, while the other half used predominantly subjective symptom scales without robust blinding, leading to some concerns. All RCTs clearly pre-specified primary outcomes and reported them transparently, so selection of reported results was uniformly low risk (6/6; 100%). Overall, three trials were judged low risk of bias across domains, two had some concerns, and one was at high risk primarily due to missing data, indicating that pooled treatment estimates are reasonably reliable but should be interpreted with caution where attrition and subjective outcomes dominate (Fig. 3).
Figure 3. Traffic-light Cochrane RoB 2 risk of bias summary for randomized treatment studies (n = 6), illustrating domain-level judgments for each trial (green = low risk, amber = some concerns, red = high risk).
Diagnostic accuracy of breath tests
Overall performance of the glucose breath test (GBT)
Across 14 culture-validated studies including 668 participants, the glucose breath test showed moderate sensitivity and high specificity for detecting SIBO. Pooled sensitivity was 54.5% (95% CI 48.2–60.7) and specificity 83.2% (95% CI 79.1–86.9). This corresponded to a positive likelihood ratio (PLR) of 2.45, a negative likelihood ratio (NLR) of 0.60, and a diagnostic odds ratio (DOR) of 5.17, with an AUC of 0.74, indicating moderate discriminative ability. In practical terms, a positive GBT result raises the odds of SIBO roughly 2.5-fold, whereas a negative result lowers them only modestly; GBT is therefore more useful for ruling in than ruling out disease. Considerable heterogeneity (I² ~75–80%) reflects variation in patient spectrum (post-GI surgery vs IBS vs general referrals), substrate dose and sampling schedule, hydrogen cut-offs (10–20 ppm), methane measurement, and culture thresholds (10³–10⁶ CFU/mL). Fig. 4 summarizes the diagnostic profile of GBT as a forest-style plot of pooled sensitivity and specificity with 95% confidence intervals, alongside PLR, NLR, DOR and AUC.
Figure 4. Summary forest-style plot of pooled sensitivity and specificity of glucose breath testing (GBT) versus small-bowel aspirate culture, with 95% confidence intervals and key accuracy metrics (PLR, NLR, DOR, AUC).
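To make the clinical meaning of the pooled likelihood ratios concrete, the following minimal sketch converts a pre-test probability into a post-test probability via odds. The PLR and NLR are the pooled GBT values reported above; the pre-test probabilities (10%, 30%, 50%) and the function name `post_test_probability` are illustrative assumptions and are not taken from the included studies.

```python
def post_test_probability(pre_test_prob: float, likelihood_ratio: float) -> float:
    """Convert a pre-test probability to a post-test probability via odds."""
    if not 0.0 < pre_test_prob < 1.0:
        raise ValueError("pre_test_prob must lie strictly between 0 and 1")
    pre_odds = pre_test_prob / (1.0 - pre_test_prob)
    post_odds = pre_odds * likelihood_ratio
    return post_odds / (1.0 + post_odds)


GBT_PLR = 2.45  # pooled positive likelihood ratio reported in the text
GBT_NLR = 0.60  # pooled negative likelihood ratio reported in the text

# Hypothetical pre-test probabilities, chosen only for illustration.
for pre in (0.10, 0.30, 0.50):
    after_positive = post_test_probability(pre, GBT_PLR)
    after_negative = post_test_probability(pre, GBT_NLR)
    print(
        f"pre-test {pre:.0%}: positive GBT -> {after_positive:.1%}, "
        f"negative GBT -> {after_negative:.1%}"
    )
```

Under these assumptions a positive result shifts the probability of SIBO more than a negative result does, consistent with the rule-in role described above; the exact figures depend entirely on the pre-test probability chosen.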
Performance of the lactulose breath test (LBT)
Only four studies (214 participants) compared lactulose breath testing directly with culture. Pooled sensitivity was 42.0% (95% CI 31.6–53.0) and specificity 70.6% (95% CI 61.9–78.4), with a PLR of 1.30, NLR of 0.79, DOR of 1.77, and AUC of 0.56. These values are only slightly better than chance and substantially inferior to GBT on all metrics, confirming that LBT provides limited diagnostic value when judged against culture. Direct comparison in the meta-analysis showed statistically significant superiority of GBT for both sensitivity and specificity (p < 0.05).
Figure 5 summarizes the pooled accuracy of LBT using the same graphical format as GBT, highlighting the relatively low discriminative performance.
Figure 5. Summary forest-style plot of pooled sensitivity and specificity of lactulose breath testing (LBT) versus small-bowel aspirate culture, with 95% confidence intervals and key accuracy metrics (PLR, NLR, DOR, AUC).
Taken together, these data support GBT as the preferred breath test when culture-validated diagnostic accuracy is required, and suggest that LBT should not be used as the sole diagnostic modality for SIBO.
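For readers less familiar with the accuracy metrics used in this section, the sketch below shows the textbook definitions of PLR, NLR and DOR as functions of sensitivity and specificity, applied to the pooled point estimates for GBT and LBT. This is an illustration of the definitions only: ratios derived from pooled sensitivity and specificity need not reproduce the pooled ratios reported above, which are usually estimated from study-level data. The function name `accuracy_ratios` is an assumption made for this example.

```python
def accuracy_ratios(sensitivity: float, specificity: float) -> dict:
    """Return PLR, NLR and DOR from a single sensitivity/specificity pair."""
    plr = sensitivity / (1.0 - specificity)   # how much a positive test raises the odds
    nlr = (1.0 - sensitivity) / specificity   # how much a negative test lowers the odds
    return {"PLR": plr, "NLR": nlr, "DOR": plr / nlr}


# Pooled point estimates quoted in the text (as proportions).
pooled_estimates = {"GBT": (0.545, 0.832), "LBT": (0.420, 0.706)}

for test_name, (sens, spec) in pooled_estimates.items():
    ratios = accuracy_ratios(sens, spec)
    print(
        f"{test_name}: PLR {ratios['PLR']:.2f}, "
        f"NLR {ratios['NLR']:.2f}, DOR {ratios['DOR']:.2f}"
    )
```

A DOR close to 1 corresponds to a test that barely separates culture-positive from culture-negative patients, which is the pattern seen for LBT.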
Subgroup analyses
Subgroup analyses demonstrated that the diagnostic performance of the glucose breath test (GBT) is strongly influenced both by the hydrogen cut-off threshold and by the clinical population under study. When GBT was interpreted using a lower hydrogen threshold (< 20 ppm; typically 10–15 ppm), pooled sensitivity improved from 47.3% to 61.7% and specificity increased from 80.9% to 86.0%, with a corresponding rise in diagnostic odds ratio (DOR) from 3.35 to 8.11 and area under the SROC curve (AUC) from 0.70 to 0.79 (Table 4). This indicates that adopting lower cut-offs may meaningfully enhance discriminative performance without sacrificing specificity. When stratified by patient population, GBT achieved its best performance in post-gastrointestinal (GI) surgery cohorts, where sensitivity reached 81.7%, specificity 78.8%, and DOR 18.58, with an AUC of 0.86. By contrast, accuracy was more modest in IBS cohorts (AUC 0.65) and in patients without identifiable predisposing factors (AUC 0.59), driven largely by substantially lower sensitivity (≈ 40–42%) despite preserved specificity (> 80%) (Table 5). These patterns suggest that GBT is particularly informative in anatomically high-risk populations, while a negative test in low-risk or IBS populations should be interpreted more cautiously.
Table 4
Glucose breath test performance by hydrogen cut-off threshold (vs culture)
| Cut-off threshold (ΔH₂) | No. of studies | Total patients | Sensitivity (95% CI) | Specificity (95% CI) | DOR (95% CI) | AUC |
| --- | --- | --- | --- | --- | --- | --- |
| > 20 ppm | 7 | 333 | 47.3% (38.4–56.3%) | 80.9% (74.8–86.0%) | 3.35 (1.03–10.89) | 0.70 |
| < 20 ppm (10–15 ppm thresholds) | 7 | 306 | 61.7% (52.7–70.2%) | 86.0% (80.0–90.7%) | 8.11 (3.01–21.82) | 0.79 |
Lower hydrogen cut-off values (< 20 ppm) improve sensitivity while maintaining or enhancing specificity, yielding a higher DOR and AUC compared with traditional > 20 ppm thresholds.
Table 5
Glucose breath test performance by clinical population
| Clinical population | No. of studies | Total patients | Sensitivity | Specificity | DOR | AUC |
| --- | --- | --- | --- | --- | --- | --- |
| Post-GI surgery | 3 | 93 | 81.7% | 78.8% | 18.58 | 0.86 |
| No predisposing factors | 6 | 340 | 40.6% | 84.0% | 2.32 | 0.59 |
| IBS patients | 2 | 160 | 42.5% | 82.3% | 3.41 | 0.65 |
GBT shows highest diagnostic yield in post-surgical cohorts, with substantially lower sensitivity in IBS and “no predisposing factor” groups despite similar specificity.
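The subgroup estimates in Table 5 can be translated into natural frequencies per 1,000 patients tested, a presentation often used to convey diagnostic accuracy to clinicians. The sketch below does this for each population. The sensitivity and specificity values come from Table 5; the 30% prevalence is a purely hypothetical figure applied uniformly for illustration, and true prevalence is likely to differ between populations.

```python
def per_1000(sensitivity: float, specificity: float, prevalence: float,
             cohort: int = 1000) -> dict:
    """Expected true/false positives and negatives in a notional cohort."""
    diseased = cohort * prevalence
    healthy = cohort - diseased
    true_pos = diseased * sensitivity
    false_neg = diseased - true_pos
    true_neg = healthy * specificity
    false_pos = healthy - true_neg
    return {
        "TP": round(true_pos),
        "FN": round(false_neg),
        "FP": round(false_pos),
        "TN": round(true_neg),
    }


ASSUMED_PREVALENCE = 0.30  # hypothetical; not reported in the review

# Sensitivity and specificity by population, from Table 5.
table5 = {
    "Post-GI surgery": (0.817, 0.788),
    "No predisposing factors": (0.406, 0.840),
    "IBS patients": (0.425, 0.823),
}

for population, (sens, spec) in table5.items():
    counts = per_1000(sens, spec, ASSUMED_PREVALENCE)
    print(f"{population}: {counts}")
```

At any fixed prevalence, the lower sensitivity in the IBS and "no predisposing factor" groups shows up as a larger number of missed cases (FN), which is why a negative GBT warrants more caution in those settings.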
Treatment efficacy
Overall, the treatment evidence base for SIBO and intestinal methanogen overgrowth (IMO) centres on rifaximin, combination antibiotic therapy, and emerging non-antibiotic strategies (herbal formulations and elemental diet). Meta-analyses consistently show that rifaximin achieves about 70% microbiological eradication with a low adverse event rate, that antibiotics approximately double to triple the chance of symptom improvement versus placebo, and that combination therapy is required for optimal methane eradication in IMO. Table 6 summarizes eradication and symptom outcomes across key therapeutic modalities, while Fig. 6 visually integrates the rifaximin dose–response curve for SIBO with regimen-specific methane eradication outcomes in IMO.
Rifaximin for SIBO
The 2017 meta-analysis by Gatta et al. (32 studies, 1,331 patients) established rifaximin as the best-studied agent for SIBO. Pooled intention-to-treat (ITT) eradication was 70.8% (95% CI 61.4–78.2; I² = 89.4%) and per-protocol (PP) eradication 72.9% (95% CI 65.5–79.8; I² = 87.5%). Among patients with confirmed eradication, 67.7% (95% CI 44.7–86.9%) reported symptom resolution. Rifaximin was well tolerated, with an overall adverse event rate of 4.6% (95% CI 2.3–7.5%) and no serious events reported. A clear dose–response relationship is evident: pooled ITT eradication rose from 54.3% at 600 mg/day to 76.2% at 1600–1650 mg/day (standard 550 mg TID), with intermediate efficacy at 800 and 1200 mg/day. This gradient is captured in the left segment of Fig. 6, where eradication increases monotonically with dose, supporting guideline preference for ~ 1600–1650 mg/day regimens when tolerated.
Symptom improvement with antibiotics
The 2024 meta-analysis by Takakura et al. synthesized 6 randomized trials (196 patients) and demonstrated that antibiotics significantly improved global symptoms versus placebo or no treatment, with a relative risk (RR) of 2.46 (95% CI 1.33–4.55; p = 0.004), NNT ≈ 2.8, and low heterogeneity (Q = 6.37, p = 0.27). Individual RCTs illustrate this effect: neomycin, norfloxacin and rifaximin trials generally showed higher responder rates in the antibiotic arms (36–83%) compared with placebo (0–31%), although a few small studies reported neutral results. These data are reflected in the “Antibiotics vs placebo” row of Table 6, emphasizing that antibiotics confer a clinically meaningful symptomatic benefit beyond microbiological outcomes.
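Because a number needed to treat depends on the response rate in the control arm as well as on the relative risk, the sketch below shows how the NNT implied by RR = 2.46 varies across a range of control response rates. The control rates are hypothetical values chosen within the 0–31% placebo range quoted above; the pooled control rate used by the meta-analysis is not stated here, and the function name `nnt_from_rr` is an assumption for this example.

```python
def nnt_from_rr(relative_risk: float, control_rate: float) -> float:
    """NNT for a beneficial outcome, given RR and the control-arm response rate."""
    treated_rate = relative_risk * control_rate
    if not 0.0 < control_rate < 1.0 or treated_rate > 1.0:
        raise ValueError("rates must be valid probabilities")
    absolute_benefit = treated_rate - control_rate
    if absolute_benefit <= 0.0:
        raise ValueError("no benefit: NNT is undefined")
    return 1.0 / absolute_benefit


RR_SYMPTOM_RESPONSE = 2.46  # pooled relative risk reported in the text

# Hypothetical control (placebo) response rates, for illustration only.
for control_rate in (0.10, 0.20, 0.25, 0.30):
    nnt = nnt_from_rr(RR_SYMPTOM_RESPONSE, control_rate)
    print(f"control response {control_rate:.0%}: NNT ≈ {nnt:.1f}")
```

Under this simple relation, an NNT near 2.8 would correspond to a control response rate of roughly one in four; lower placebo response rates yield a larger NNT for the same relative risk.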
Combination therapy for intestinal methanogen overgrowth (IMO)
For methane-positive patients (IMO), rifaximin monotherapy often proves insufficient. The landmark retrospective study by Low et al. 2010 (74 IBS patients with methane on breath test) compared rifaximin + neomycin with either agent alone. Combination therapy achieved 87% methane eradication and 85% clinical response, versus 28% eradication / 56% response with rifaximin alone and 33% eradication / 63% response with neomycin alone (p = 0.001 and p = 0.01 vs combination, respectively). These findings, summarized in the IMO rows of Table 6, are visualized in the right segment of Fig. 6, where the methane eradication curve peaks sharply with combination therapy and falls markedly with either monotherapy. Notably, 66% of rifaximin failures subsequently achieved eradication when escalated to combination therapy, underscoring that dual therapy should be considered standard of care for methane-dominant presentations.
Figure 6. Combined eradication line plot for SIBO and IMO. The left panel shows a clear dose–response relationship for rifaximin in SIBO, with eradication rising from 54.3% at 600 mg/day to 76.2% at 1600–1650 mg/day. The right panel demonstrates that combination therapy (rifaximin + neomycin) achieves the highest methane eradication (87%), substantially outperforming either monotherapy (28–33%). Point labels give exact percentages, a vertical dotted line separates the dose and regimen segments, and gridlines aid visual comparison.
Herbal antimicrobials and elemental diet
Non-antibiotic approaches offer alternatives for patients with antibiotic intolerance, prior treatment failure, or a preference to avoid systemic antibiotics.
- In Chedid et al. 2014, 104 patients with breath test–confirmed SIBO were treated with either standardized herbal antimicrobial regimens or rifaximin. Breath test normalization occurred in 46% (17/37) of herbal-treated patients and 34% (23/67) of rifaximin-treated patients, an absolute difference of 12% favouring herbs that did not reach statistical significance (p = 0.24).
- A 2024 botanical study by Min et al. reported subtype-specific eradication rates of 42.8% for H₂-SIBO, 66.7% for H₂S-SIBO, and 26.7% for IMO, suggesting that certain herbal formulations may be particularly promising for H₂S-dominant disease.
- A prospective Cedars-Sinai elemental diet cohort (2025) demonstrated 73% breath test normalization, 83% global symptom improvement, and 89% methane reduction in IMO patients over 2–3 weeks, with only 3% discontinuation due to taste-related intolerance.
These data (final rows of Table 6) support herbal antimicrobials and elemental diet as viable alternatives or adjuncts to antibiotics, particularly in complex or recurrent cases, although the overall certainty of evidence remains low due to retrospective designs, small sample sizes, and heterogeneity of protocols.
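The herbal versus rifaximin comparison from Chedid et al. can be checked from the counts quoted above (17/37 vs 23/67). The sketch below applies a two-sided two-proportion z-test without continuity correction, using only the standard library. The choice of test is an assumption: the statistical method used in the original study is not stated here, and other tests (for example Fisher's exact test) would give somewhat different p-values.

```python
import math


def two_proportion_test(x1: int, n1: int, x2: int, n2: int):
    """Two-sided z-test for a difference in proportions (no continuity correction)."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    std_err = math.sqrt(pooled * (1.0 - pooled) * (1.0 / n1 + 1.0 / n2))
    z = (p1 - p2) / std_err
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return p1 - p2, z, p_value


# Counts quoted in the text: herbal 17/37, rifaximin 23/67.
difference, z_stat, p_value = two_proportion_test(17, 37, 23, 67)
print(f"absolute difference {difference:.1%}, z = {z_stat:.2f}, p = {p_value:.2f}")
```

With these counts the test gives an absolute difference of about 12 percentage points and a non-significant p-value in line with the reported p = 0.24, illustrating how wide the uncertainty is in samples of this size.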
Table 6
Summary of eradication and symptom response across treatment strategies
| Category / regimen | Daily dose / regimen | No. of studies / cohort | N (total) | Eradication (microbiological) | Symptom response / RR | Key notes |
| --- | --- | --- | --- | --- | --- | --- |
| Rifaximin monotherapy (SIBO – ITT) | 600 mg/day | 3 | – | 54.3% (95% CI 41.2–67.4) | Not consistently reported | Lower-dose regimens; clearly less effective than higher doses. |
| | 800 mg/day | 5 | – | 65.8% (52.1–79.5) | – | Intermediate efficacy. |
| | 1200 mg/day | 12 | – | 72.4% (62.8–82.0) | – | Widely used in European cohorts. |
| | 1600–1650 mg/day (e.g. 550 mg TID) | 8 | – | 76.2% (68.4–84.0) | – | Highest pooled eradication; standard modern dose. |
| Overall rifaximin (meta-analysis) | 600–1650 mg/day | 32 | 1,331 | ITT 70.8% (61.4–78.2); PP 72.9% (65.5–79.8) | Symptom resolution 67.7% among eradicated | Adverse events 4.6%, no serious events. |
| Antibiotics vs placebo (symptoms) | Mixed antibiotics (6 RCTs) | 6 | 196 | – | RR 2.46 (95% CI 1.33–4.55) vs placebo | ≈ 2.5-fold higher chance of global symptom improvement; NNT ≈ 2.8. |
| IMO – combination vs monotherapy | Rifaximin + neomycin | 1 | 27 | 87% methane eradication | 85% clinical response | Clearly superior to either drug alone; many rifaximin failures rescued with combination. |
| | Rifaximin alone | 1 | 39 | 28% methane eradication | 56% response | Markedly less effective than combination. |
| | Neomycin alone | 1 | 8 | 33% methane eradication | 63% response | Small sample; inferior to combination. |
| Herbal antimicrobials (Chedid 2014) | Standardized herbal protocols | 1 | 37 | 46% breath test normalization | Clinical improvement broadly tracks eradication | 12% absolute advantage vs rifaximin (not statistically significant). |
| Rifaximin comparator (Chedid 2014) | 1200 mg/day | 1 | 67 | 34% breath test normalization | – | Lower eradication than in prospective rifaximin trials. |
| Elemental diet cohort (Cedars-Sinai) | 2–3 week elemental formula | 1 | 30 | 73% breath test normalization | 83% symptom improvement | 89% methane reduction; ~3% discontinued due to taste; non-antibiotic alternative. |
Recurrence and maintenance strategies
Recurrence rates
Follow-up data from three observational cohorts confirm that SIBO behaves as a chronic relapsing condition rather than a one-off infectious episode. In the pivotal prospective study by Lauritano et al. 2008 (Italy, n = 142; SIBO with rosacea treated with rifaximin 1200 mg/day for 10 days), 80 patients with documented post-treatment breath test normalization were followed for 9 months. Recurrence occurred in 43.7% (35/80), with most relapses clustering in the first year. Multivariable analysis identified older age, prior appendectomy, and chronic proton pump inhibitor (PPI) use as independent predictors of recurrence. A companion cohort, Lauritano et al. 2009, evaluated a similar post-rifaximin population but introduced maintenance prokinetic prophylaxis after eradication. Over 12 months, the recurrence rate fell to 12.6% (9/72), suggesting that enhanced motility may substantially reduce relapse risk. More recent data from Richard et al. 2021 (n≈?; mixed functional GI disorders) examined the short-term impact of different antibiotic strategies. At 3-month follow-up, recurrence was 30% in patients re-treated with a single antibiotic versus 51% in those managed with rotating antibiotic regimens, implying that empiric cycling does not prevent relapse and may instead mark more refractory disease. Across these cohorts (total 152 patients contributing recurrence data), the representative recurrence rate without maintenance is approximately 43.7% at 9 months, with low-certainty evidence due to observational design and heterogeneity. This supports treating SIBO as a chronic disease requiring a long-term strategy, particularly in older patients and in those with anatomical changes (appendectomy, surgery) or chronic acid suppression.
Table 7
SIBO recurrence after initial eradication
| Study (year) | Country | Population / initial therapy | Follow-up duration | Recurrence rate after eradication | Identified risk factors / modifiers |
| --- | --- | --- | --- | --- | --- |
| Lauritano et al., 2008 | Italy | SIBO + rosacea; rifaximin 1200 mg/day × 10 days | 9 months | 43.7% (35/80) | Older age, previous appendectomy, chronic PPI use |
| Lauritano et al., 2009 | Italy | Post-rifaximin SIBO; maintenance prokinetic added | 12 months | 12.6% (9/72) | Reduced recurrence with prokinetic prophylaxis |
| Richard et al., 2021 | – | Mixed GI disorders; post-antibiotic eradication | 3 months | 30% (single antibiotic) vs 51% (rotating regimen) | Higher relapse with rotating antibiotics vs single agent |
These studies collectively indicate that, without structured maintenance, nearly half of successfully treated patients relapse within the first year, whereas prokinetic prophylaxis may cut recurrence to ~ 10–15% at 12 months.
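Given the small size of the recurrence cohorts, it is useful to see how much uncertainty surrounds each proportion. The sketch below computes Wilson score 95% confidence intervals from the event counts quoted in Table 7. The choice of the Wilson method is an assumption made for illustration; the original studies may have reported intervals differently or not at all. Proportions are recomputed from the counts, so they may differ in the last digit from the percentages quoted in the text.

```python
import math


def wilson_interval(events: int, total: int, z: float = 1.959964):
    """Wilson score interval for a binomial proportion (default: 95%)."""
    p = events / total
    denom = 1.0 + z * z / total
    centre = (p + z * z / (2.0 * total)) / denom
    half_width = z * math.sqrt(p * (1.0 - p) / total + z * z / (4.0 * total * total)) / denom
    return p, centre - half_width, centre + half_width


# Event counts quoted in Table 7.
cohorts = {
    "Lauritano 2008 (no maintenance, 9 months)": (35, 80),
    "Lauritano 2009 (prokinetic, 12 months)": (9, 72),
}

for label, (events, total) in cohorts.items():
    p, low, high = wilson_interval(events, total)
    print(f"{label}: {p:.1%} (95% CI {low:.1%} to {high:.1%})")
```

The two intervals do not overlap under this method, but the cohorts differ in design and follow-up, so the contrast should still be read as hypothesis-generating rather than as a causal estimate of prokinetic benefit.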
Figure 8. Kaplan–Meier–style schematic of SIBO recurrence over time
Using the discrete timepoints reported in the above cohorts, a simplified step-plot was constructed (Fig. 8):
- Lauritano 2008 (no maintenance): cumulative recurrence rises from 0 at baseline to 0.44 by 9 months, remaining stable thereafter.
- Lauritano 2009 (with prokinetic prophylaxis): recurrence remains low, reaching only 0.13 at 12 months.
- Richard 2021: at 3 months, cumulative recurrence already reaches 0.30 with a single-antibiotic strategy and 0.51 with rotating antibiotics, emphasizing the early timing of relapse in many patients.
The schematic underscores three key messages: (1) recurrence frequently occurs within months of apparent cure; (2) motility-targeted maintenance (prokinetics) may meaningfully flatten the relapse curve; and (3) empiric rotation of antibiotics alone is insufficient to prevent relapse and may reflect more severe, relapsing disease biology.
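One way such a step schematic could be assembled from the discrete timepoints is sketched below. It assumes that each cohort contributes a single jump at its reported follow-up time and stays flat otherwise; this is a simplification for illustration and not necessarily how Fig. 8 was drawn. The function name `step_curve` and the 12-month horizon are hypothetical.

```python
def step_curve(timepoints, horizon):
    """Build x/y coordinates of a step curve starting at (0, 0).

    timepoints: list of (month, cumulative_recurrence) pairs.
    horizon: last month to draw; the final level is carried forward to it.
    """
    xs, ys = [0.0], [0.0]
    level = 0.0
    for month, cumulative in sorted(timepoints):
        # Horizontal segment up to the timepoint, then a vertical jump.
        xs.extend([month, month])
        ys.extend([level, cumulative])
        level = cumulative
    if horizon > xs[-1]:
        xs.append(horizon)
        ys.append(level)
    return xs, ys


# Cumulative recurrence values as listed for the schematic.
cohorts = {
    "Lauritano 2008 (no maintenance)": [(9, 0.44)],
    "Lauritano 2009 (prokinetic prophylaxis)": [(12, 0.13)],
    "Richard 2021 (single antibiotic)": [(3, 0.30)],
    "Richard 2021 (rotating antibiotics)": [(3, 0.51)],
}

for label, points in cohorts.items():
    xs, ys = step_curve(points, horizon=12)
    print(label, list(zip(xs, ys)))
```

The resulting coordinate lists can be passed to any plotting library as line segments. Because each curve rests on a single observed timepoint, the shape between baseline and follow-up is schematic and carries no information about when relapses actually occurred.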
Maintenance interventions
Evidence for maintenance strategies is limited and mostly observational, but converges on the importance of motility enhancement, dietary modification, and judicious use of repeat antibiotics.
- Prokinetic agents
  - Low-dose erythromycin (typically 50–125 mg at bedtime) has been used as a motilin agonist to augment migrating motor complex (MMC) activity. Observational data in post-rifaximin cohorts suggest reduced short-term recurrence when erythromycin is prescribed nightly for several months, although no RCTs specifically powered for SIBO relapse exist.
  - Prucalopride, a selective 5-HT4 agonist, is increasingly used in constipation-predominant or sluggish-motility phenotypes. Preliminary series report improved stool frequency and reduced bloating, with a signal toward lower recurrence, but data remain sparse.
  - Herbal prokinetics (e.g., ginger-containing blends, Iberogast) are widely used in clinical practice. The review documents supportive but low-quality evidence, with small uncontrolled cohorts reporting subjective symptom stability and fewer repeat antibiotic courses.
- Dietary strategies
  While no high-quality recurrence trials exist, the broader dataset on elemental diet and low-fermentable carbohydrate patterns suggests that reducing fermentable substrate load may help maintain remission. Elemental diets have shown 73% breath test normalization and 83% global symptom improvement in a Cedars-Sinai cohort, and structured low-FODMAP or low-residue plans are commonly used as long-term maintenance diets after eradication.
- Repeat or rotating antibiotics
  The Richard 2021 cohort indicates that rotating antibiotic regimens did not reduce 3-month recurrence and were associated with higher observed relapse (51%) than single-agent strategies (30%). This suggests that cycling antibiotics alone is not a reliable maintenance strategy and should be reserved for selected patients with strict attention to stewardship, side effects and resistance.
Taken together, the available data support viewing SIBO through a chronic disease lens: an effective acute eradication phase (rifaximin ± combination therapy) should be followed by maintenance prokinetics, individualized dietary modification, and correction of structural or pharmacologic drivers such as chronic PPI therapy to meaningfully reduce relapse risk.
Certainty of evidence (GRADE)
Certainty of evidence was assessed using the GRADE framework across the key outcomes of this review. Overall, diagnostic evidence for the glucose breath test (GBT) reached moderate certainty, whereas most treatment and recurrence outcomes were judged low certainty, primarily due to observational designs, heterogeneity, and imprecision in effect estimates. For GBT diagnostic accuracy, 14 studies (668 participants) showed consistent superiority of GBT over lactulose breath test, with pooled sensitivity 54.5% and specificity 83.2%. The certainty was rated moderate, downgraded one level for inconsistency (substantial between-study heterogeneity in thresholds, populations, and protocols) but not further penalized because of a coherent pattern favouring GBT. For rifaximin eradication, the large meta-analysis (32 studies, 1,331 participants) consistently demonstrated ITT eradication around 70.8% with excellent safety. However, most contributing studies were non-randomized or small RCTs with design limitations and high heterogeneity (I² >80%), leading to an overall low certainty rating, downgraded for risk of bias and inconsistency. The antibiotic symptom-response outcome (6 RCTs, 196 participants; RR 2.46, 95% CI 1.33–4.55) started at high certainty but was downgraded for imprecision (small total sample size, wide CIs) and indirectness (mixed populations and symptom definitions), yielding low certainty despite a robust direction of effect. Evidence for combination therapy in intestinal methanogen overgrowth (IMO) and for recurrence rates was based on few, largely observational cohorts. The IMO combination data derive from a single retrospective study (74 patients) showing a large benefit of rifaximin + neomycin over monotherapy (87% vs 28% methane eradication). Recurrence estimates come from three small cohorts with differing follow-up intervals and variable maintenance strategies, with a representative rate of 43.7% at 9 months without prophylaxis.
Both outcomes were rated low certainty, downgraded for observational design, inconsistency and limited precision; the large apparent effect sizes for combination therapy in IMO may justify close clinical attention but do not meet formal GRADE upgrade criteria given the potential for confounding. Taken together, the GRADE assessment supports moderate confidence in GBT as the preferred non-invasive diagnostic tool, but only low confidence in current estimates for eradication, symptom benefit, IMO combination therapy, and recurrence, underscoring the need for larger, well-designed RCTs and standardized diagnostic/treatment protocols.
Table 8
GRADE Summary of Findings for key SIBO outcomes
| Outcome | No. studies / participants | Effect size (summary estimate) | Certainty (GRADE) | Reasons for downgrading / upgrading |
| --- | --- | --- | --- | --- |
| GBT diagnostic accuracy vs culture | 14 studies / 668 patients | Sensitivity 54.5%, specificity 83.2%; PLR 2.45, NLR 0.60; DOR 5.17; AUC 0.74 | ⊕⊕⊕◯ Moderate | Downgraded for inconsistency: substantial heterogeneity in test protocols, cut-off thresholds, and clinical populations (I² ~75–80%), though effect direction consistently favours GBT over LBT. No serious concerns for risk of bias, indirectness, imprecision or publication bias. |
| Rifaximin eradication (ITT) | 32 studies / 1,331 patients | Eradication 70.8% (95% CI 61.4–78.2%), PP 72.9%; AEs 4.6% with no serious events | ⊕⊕◯◯ Low | Downgraded for risk of bias (many non-randomized, open-label or small RCTs; variable allocation concealment and outcome blinding) and inconsistency (high I², wide range of eradication across doses and populations). No upgrade despite large pooled sample, as heterogeneity and design limitations remain substantial. |
| Antibiotic symptom improvement vs placebo | 6 RCTs / 196 patients | RR 2.46 (95% CI 1.33–4.55) for global symptom response; NNT ≈ 2.8; low heterogeneity (Q = 6.37, p = 0.27) | ⊕⊕◯◯ Low | Starts as high (RCTs), downgraded for imprecision (small total sample, relatively wide CI) and indirectness (mixed diagnoses, variable symptom scales and follow-up durations). No upgrade, as effect size is moderate rather than very large and potential selection/measurement biases persist. |
| Combination therapy for IMO (rifaximin + neomycin) | 1 retrospective study / 74 patients | Methane eradication 87% with rifaximin + neomycin vs 28% with rifaximin and 33% with neomycin; clinical response 85% vs 56–63% | ⊕⊕◯◯ Low | Starts as low (observational). Downgraded for risk of bias (retrospective, non-random allocation, potential confounding by indication) and imprecision/indirectness (single-centre IBS-IMO population). Large effect size suggests possible upgrade, but residual confounding is very likely, so certainty remains low. |
| SIBO recurrence after eradication | 2–3 cohorts / 152 patients | Representative recurrence 43.7% at 9 months without maintenance; 12.6% at 12 months with prokinetic prophylaxis; 30–51% at 3 months in other cohorts | ⊕⊕◯◯ Low | Starts as low (observational). Downgraded for inconsistency (different underlying populations, follow-up intervals and maintenance strategies) and indirectness (mixed etiologies, varying definitions of recurrence). No upgrade, as effect sizes are large but susceptible to confounding and selection bias. |
In summary, the GRADE assessment highlights that GBT accuracy is supported by moderate-certainty evidence, whereas all major therapeutic and recurrence outcomes remain low-certainty, guiding the tone of clinical recommendations and emphasizing the need for higher-quality comparative trials and standardized endpoints.