16S rRNA sequencing analysis of the oral and fecal microbiota in colorectal cancer positives versus colorectal cancer negatives in Iranian population



Colorectal cancer (CRC) poses a significant healthcare challenge, accounting for approximately 6.1% of global cancer cases. Early detection, facilitated by population screening with innovative biomarkers, is pivotal for reducing CRC incidence. This study examines the fecal and salivary microbiomes of CRC-positive individuals (CPs) in comparison to CRC-negative counterparts (CNs) to enhance early CRC diagnosis through microbial biomarkers.

Material and methods

A total of 80 oral and stool samples were collected from Taleghani Hospital, Shahid Beheshti University of Medical Sciences, Tehran, Iran, encompassing both CPs and CNs undergoing screening. Microbial profiling was conducted using 16S rRNA sequencing assays, employing the Nextera XT Index Kit on an Illumina NovaSeq platform.


Distinct microbial profiles were observed in the saliva and stool samples of CPs, diverging significantly from those of CNs at the phylum, family, and species levels. Saliva samples from CPs contained Calothrix parietina, Granulicatella adiacens, Rothia dentocariosa, and Rothia mucilaginosa, which were absent in CNs. Additionally, Lachnospiraceae and Prevotellaceae were markedly higher in CPs' feces, while the Fusobacteria phylum was significantly elevated in CPs' saliva. Conversely, the non-pathogenic bacterium Akkermansia muciniphila was significantly decreased in CPs' fecal samples compared to CNs.


Through meticulous selection of saliva and stool microbes based on Mean Decrease GINI values and employing logistic regression for saliva and support vector machine models for stool, we successfully developed a microbiota test with heightened sensitivity and specificity for early CRC detection.


Colorectal cancer (CRC) stands as a leading cause of cancer-related mortality in both developed and developing countries [1]. Implementation of population-based CRC screening has demonstrated a potential to reduce CRC incidence, garnering strong recommendations [3, 4]. Notably, over 85% of CRC cases originate from pre-malignant adenoma polyps, emphasizing the preventive nature of early detection [5]. The primary objective of CRC screening is to identify pre-symptomatic neoplastic lesions, thereby reducing the overall incidence through timely intervention and examination [6].

The prevailing CRC screening approaches involve fecal immunochemical tests (FIT) coupled with subsequent colonoscopies for positive cases, or periodic endoscopic procedures such as flexible sigmoidoscopy every 5 years or colonoscopy every 10 years [8, 9]. Ongoing considerations include alternative screening methods like fecal DNA analysis and CT colonography [5]. However, the efficacy of any screening program hinges on two pivotal factors: compliance and accuracy [10]. Despite the success observed in various strategies, overall individual compliance remains suboptimal, with rates falling below 52% in CRC screening initiatives [5]. Therefore, there is a growing consensus that novel strategies, encompassing the amalgamation of established tests or the introduction of convenient screening alternatives, could significantly enhance population-based CRC screening adherence [11, 12].

Remarkably, altered microbiota composition has emerged as a potential foundation for a highly sensitive and specific CRC screening test [13,14,15,16,17,18]. Beyond microbiota, their proteins and metabolites contribute to CRC pathogenesis, with reciprocal interactions influencing host proteins and metabolites in CRC development [19]. Significantly, signatures derived from the abundance of bacterial proteins, particularly those associated with signal transduction systems like sensory proteins, hold promise in distinguishing between healthy and diseased states [19].

In this context, our study represents a continuation of previous efforts focused on early CRC detection based on microbial biomarkers [15, 20, 21]. We aim to assess fecal and oral microbiota through 16S rRNA sequencing analysis, exploring the abundance and variation of pathogenic oral and fecal microbiota composition between CRC-positive individuals (CPs) and CRC-negative counterparts (CNs) in the Iranian population. Additionally, we investigate the status of nonpathogenic microorganisms, including probiotics and short-chain fatty acid (SCFA)-producing bacteria, in the feces of CPs compared to CNs. Ultimately, we endeavor to develop classifier models utilizing oral and fecal microbiota profiles, with the intent of enhancing the diagnostic capabilities for early CRC detection with high sensitivity and specificity.


Demographic results

The demographic characteristics of the participants, with the corresponding p-values for comparisons between CPs and CNs, are presented in Table 1. The two groups showed similar distributions of gender, viral infection, alcohol consumption, and dietary habits. Profession, family history, disease and surgical history, smoking habit, and physical activity differed significantly between CPs and CNs.

Table 1 Demographic characteristics of CRC positives (CPs) and CRC negatives (CNs)

16S rRNA sequencing analysis of clinical samples:

Top 10 microbes with more abundance in CPs versus CNs

We compared the frequency of the top 10 microbes that were most abundant in CPs, in both fecal and oral samples, at the phylum, family, and species levels against CN samples (see Fig. 1). Notably, some of these microbes were completely absent in CNs, while others differed significantly in abundance.

Fig. 1

The frequency of top 10 bacteria that were most abundant in oral and fecal samples of colorectal cancer positives (CPs) for phylum, family, and species versus colorectal cancer negatives (CNs) [# = CRC-exclusive bacteria, * = significant CRC vs. normal differences]

In the saliva of CPs, Chloroflexi, Lactobacillaceae, Rivulariaceae, Calothrix parietina, Rothia dentocariosa, and Rothia mucilaginosa ranked among the top 10 microbes, none of which were present in the saliva of CN individuals. Conversely, in the feces of CRC patients, Coprobacillaceae, Enterococcaceae, Neisseriaceae, Streptococcaceae, Bacteroides cellulosilyticus, Coprobacillus cateniformis, Porphyromonas asaccharolytica, Sphingobacterium bambusae, and Streptococcus vestibularis were identified among the 10 most abundant microbes at the family and species levels, with none of them present in CN participants.

Furthermore, our analysis revealed a higher abundance of microbes such as Fusobacteria in the saliva of CRC patients compared to CN individuals. Additionally, Lachnospiraceae and Prevotellaceae were significantly more abundant in the stool of CPs than in controls, indicating that these microbes are present in both CNs and CPs but are elevated in CPs.

Table 2 reports the median and p-value for each of these 10 most abundant microbes in the saliva and feces of CRC patients compared to CNs at the phylum, family, and species levels.

Table 2 Median (first quartile, third quartile) and p-value of each candidate bacterium based on abundance

Non-pathogenic microbiota

An investigation into a range of commensal microbiota, including Lactobacillaceae, Bifidobacteriaceae, Ruminococcaceae, Lachnospiraceae, Lactobacillus, Bifidobacterium, Akkermansia, Roseburia, Faecalibacterium, and Ruminococcus, was conducted in the feces of CPs in comparison to CNs (see Fig. 2). Notably, among all the non-pathogenic microbes analyzed in the stool samples, the genus Akkermansia and the species Akkermansia muciniphila were significantly more abundant in the CN group than in CRC patients.

Fig. 2

Higher abundance of the genus Akkermansia and the species Akkermansia muciniphila, among all the non-pathogenic microbes examined, in the stool samples of colorectal cancer negatives versus colorectal cancer positives

Based on the microbial variables with the least missing data, 24 microbes in saliva and 27 microbes in stool were selected. AUROC, sensitivity (SE), specificity (SP), PPV, NPV, and ACC were calculated for each bacterium. For ROC analysis, four different models were used: logistic regression, support vector machine, naïve Bayes, and neural network. Table 3 shows which microbes are most important in predicting CRC. In saliva, the four with the highest AUC were Porphyromonadaceae, Unclassified at Family level, Fusobacteria, and Streptococcus infantis. In stool, the four with the highest AUC were Lachnospiraceae, Proteobacteria, Nitrospirae, and Escherichia albertii. Confidence intervals (CI) are reported for SE, SP, PPV, NPV, and ACC.

Table 3 The Prediction performance using logistic regression for each microbiota

In Fig. 3, important microbes in predicting CRC in saliva include Streptococcus infantis, Fusobacteria, Actinobacteria, Porphyromonadaceae, Streptococcus tigurinus, Streptococcaceae, Spirochaetes, Unclassified at Family level, and Unclassified at phylum level. Also, important microbes in predicting CRC in stool include Lachnospiraceae, Proteobacteria, Nitrospirae, Prevotellaceae, Escherichia albertii, Ruminococcaceae, Veillonellaceae, Clostridiaceae, and Alcaligenaceae.

Fig. 3

Mean Decrease GINI model for colorectal cancer prediction. A higher mean decrease in GINI indicates that a bacterium is more important in predicting CRC. *Microbes with the highest Mean Decrease GINI are those whose removal most degrades the model's ability to predict CRC and whose presence most strengthens it

Combination of selected variable microbiota based on mean decrease GINI model for improvement of the diagnostic ability for early detection of CRC

The desired microbial variables were selected based on Mean Decrease GINI, and multiple regression was then examined. Here, multiple regression means using several microbiota simultaneously in a given statistical model to predict CRC patients. Four models (logistic regression, support vector machine, naïve Bayes, and neural network) were fitted to the microbiota selected by GINI. For saliva, the logistic model was the best, owing to its simplicity and its AUC of 91%, SE of 87%, SP of 80%, PPV of 87%, NPV of 80%, and ACC of 84% (Table 4). For stool, the support vector machine was the best model, with the highest AUC of 97%, SE of 92%, SP of 93%, PPV of 96%, NPV of 87%, and ACC of 90%, outperforming the other models, including simple logistic regression (Table 4).

Table 4 The Prediction performance using logistic regression with selected variables for each microbiota

ROC curves showing the performance of the logistic regression, support vector machine, naïve Bayes, and neural network models with the microbiota selected by mean decrease GINI are presented in Fig. 4. At the best cutoff value, this panel of bacteria could be used to discriminate CP patients from CN individuals.

Fig. 4

ROC curves with performance of logistic model, support vector machine, naïve bayes and neural network models using selected variables


In this study, we conducted the first-ever examination of the integrated microbiome from stool and saliva samples of colorectal cancer (CRC) patients in comparison to healthy controls (CNs) within the Iranian population, utilizing the 16S rRNA sequencing method. The utilization of microbiota as biomarkers for disease and health has gained significant traction, particularly with the advancements in 16S rRNA sequencing technology.

Our results, as depicted in the demographic table, reveal a noteworthy difference between CPs and CNs concerning occupation, physical activity, and smoking habits. Interestingly, housewives and retired individuals exhibited a higher prevalence of CRC compared to working and non-retired individuals. Furthermore, smoking and a lack of exercise were more prevalent among CP patients compared to CNs.

In general, the incidence of CRC tends to be higher in individuals over 50 years old, whereas those under 50 years old, who typically undergo screening, are generally healthier. This age-related discrepancy is a noteworthy factor contributing to the differences observed between the CP and CN groups. Additionally, the occurrence of CRC in individuals with a family history of the disease and a personal history of other illnesses and surgeries was more prevalent than in CNs. This implies that individuals with a susceptibility marked by a history of other diseases and surgeries are more predisposed to CRC than those without such histories.

The distinct microbial profiles observed in CPs and CNs suggest that the microbiome may play a crucial role in the initiation and development of CRC. For instance, Chloroflexi, Lactobacillaceae, Rivulariaceae, Calothrix parietina, Rothia dentocariosa, and Rothia mucilaginosa were among the most abundant microbes in the saliva of CRC patients but were entirely absent in CN individuals. Similarly, Coprobacillaceae, Enterococcaceae, Neisseriaceae, Streptococcaceae, Bacteroides cellulosilyticus, Coprobacillus cateniformis, Porphyromonas asaccharolytica, Sphingobacterium bambusae, and Streptococcus vestibularis were among the most abundant microbes in the feces of CRC patients, whereas they were absent in CN individuals.

While our findings suggest a compelling association between the presence or absence of certain microbes and CRC, it is essential to conduct studies on a larger population to provide more definitive insights. Our results align with the research by Flemer et al. [18], who identified 63 operational taxonomic units (OTU) distinguishing CRC cases from CNs, including 29 oral OTU and 34 stool OTU. Additionally, our findings are consistent with previous studies that have highlighted the ability of specific microbiota to differentiate individuals with CRC or adenoma polyps from healthy individuals.

Notably, research conducted across various geographical regions such as the USA, Canada, Ireland, Spain, China, Colorado, France, and India has explored the increased presence of bacteria in CRC. Despite differences in ethnicity and geography influencing microbial patterns, it is intriguing that many of the microbes identified in these studies closely correlate with those increased in our CRC patients, including Fusobacterium, Porphyromonas, Prevotella, Bacteroides, and Streptococcus [18, 22,23,24,25,26,27,28].

Identifying a group of microbes with higher abundance in CPs than in healthy CNs and demonstrating statistical significance is crucial, as it facilitates the selection of potential biomarker candidates. In our study, we observed an increased number of Fusobacteria in the saliva of CRC patients compared to CNs, as well as a higher abundance of Lachnospiraceae and Prevotellaceae in the stool of CRC patients compared to CNs. Consistent with our findings, Flemer et al. reported differential abundance of certain oral microbiotas between CPs and CNs, including Parvimonas, Haemophilus, Prevotella, Alloprevotella, Neisseria, Lachnoanaerobaculum, and Streptococcus [18].

Furthermore, non-pathogenic microbiota in the human gut or microbiota that produces short-chain fatty acids (SCFA) play a crucial role in human health and disease prevention [29]. In our research, Akkermansia muciniphila showed significantly higher abundance in CNs compared to CPs. Akkermansia muciniphila is an important bacterium that degrades mucin in the gut, and its role is debated regarding whether it is beneficial or harmful [30]. Patients with conditions such as overweight, obesity, type 2 diabetes [31], and inflammatory bowel disease (ulcerative colitis and Crohn's disease) [33, 34] have exhibited reduced levels of Akkermansia muciniphila in their intestines. In contrast to our findings, Wang et al. reported that Akkermansia muciniphila exacerbated the development of colitis-associated CRC in mice [35]. However, similar to our study, Gu et al. concluded that an increased number of Akkermansia muciniphila is associated with protection against inflammatory bowel disease (IBD) and CRC following interventions with nutrients, prebiotics, probiotics, and medications [36]. They noted that despite these therapeutic benefits, some animal studies, such as Wang et al.'s experiment, have reported a negative association with Akkermansia muciniphila [35, 36]. Therefore, it is advisable to consider Akkermansia muciniphila as both a "friend and foe" until additional research and clinical examinations provide further clarity.

A limitation of this study is the small sample size of the cohort, which limits geographical coverage and the broader applicability of the microbiome-based biomarker approach. Validation and confirmation of these findings would benefit from a larger population. Additionally, there is an age difference between the CPs and CNs, which should be minimized in future studies.

Furthermore, utilizing a combination of selected variable microbiota based on the Mean Decrease GINI model platform, we aimed to enhance the diagnostic ability for the early detection of CRC. For saliva, logistic regression emerged as the optimal model due to its simplicity, boasting an AUC of 91%, sensitivity of 87%, specificity of 80%, PPV of 87%, NPV of 80%, and an ACC of 84%. In contrast, for stool, the support vector machine outperformed other models, achieving the highest AUC of 97%, sensitivity of 92%, specificity of 93%, PPV of 96%, NPV of 87%, and ACC of 90%.

In previous studies, we examined fecal samples from CRC and polyp cases versus normal individuals in the Iranian population, employing three models (logistic regression, simple linear combination, and factor analysis) with the q-PCR method, and ultimately determined specific biomarkers [15]. We identified elevated counts of F. nucleatum, Enterococcus faecalis, Streptococcus bovis, Enterotoxigenic Bacteroides fragilis, and Porphyromonas spp. in CRC stages 0 and I, as well as in adenomatous polyp cases, specifically in tubular adenomas and notably in villous and tubulovillous adenomas. This contrasts with samples from the normal, hyperplastic, and sessile serrated adenoma groups.

In the current study, however, we investigated the entire fecal and saliva microbiota of CRC patients and CNs in the Iranian population using the 16S rRNA sequencing technique. Statistical modeling was not limited to stool but extended to saliva as well. Sensitivity and specificity were determined, and biomarker candidates were selected. In parallel with our study, Flemer et al. [18] identified 16 oral microbiota OTUs that distinguished CRC patients from CN individuals with a sensitivity of 53% and specificity of 96%. Using fecal microbiota, their model distinguished CRC patients with a sensitivity of 22% and a specificity of 95%. However, with the combination of oral and stool microbiota, the model's sensitivity increased to 76% for CRC detection.

Furthermore, an identical set of biomarkers between our study and the studies of Yuan et al., Deng et al., and Choi et al. included Bacteroides, Prevotella, Fusobacterium nucleatum, and Veillonella dispar [37,38,39]. By comparing the differences and similarities between our study and these findings, we emphasize the necessity of investigating a large cohort consisting of different geographical populations of CP and CN individuals from Europe, Asia, and America to comprehensively compare the microbiome.


Our findings indicate that both oral and fecal microbiota have the potential to differentiate CPs from CNs. Additionally, our study revealed a reduction in the abundance of Akkermansia muciniphila in the stool of patients with CRC. This raises the question of whether these microbes play a crucial role in maintaining health and whether their diminished presence is associated with the pathogenesis of CRC.

Given these observations, further research into the cellular and molecular mechanisms of Akkermansia muciniphila is warranted and should be conducted extensively. Moreover, we recommend larger prospective studies that encompass diverse geographical populations with varying diets. These studies should incorporate the analysis of FIT, fecal microbiota, and oral microbiota composition to validate the promising results obtained in our study.


Study population

The current study follows a case–control design, and clinical samples, including saliva and stool (n = 80), were gathered from participants who underwent colonoscopy at Taleghani Hospital in Tehran, Iran, between 2020 and 2021. All participants volunteered to take part in the study, and samples were obtained prior to the colonoscopy procedure. Those enrolled in the study presented symptoms such as rectal bleeding, changes in bowel movements, abdominal pains, and anemia, prompting their initial screening. CN individuals also underwent the screening test, and their colonoscopy results indicated normal findings. The inclusion and exclusion criteria are thoroughly detailed in our recently published article [16]. Additionally, demographic information for the studied groups was collected through questionnaire forms.

Stool and saliva samples collection, storage, and extraction

Fecal samples were collected before colonoscopy, at a point when the gut microbiota had returned to baseline levels [15, 20]. These stool samples were preserved at −80 °C at Taleghani Hospital until subsequent analysis. Similarly, saliva samples were stored at −80 °C until used in the experiments. The comprehensive protocol for sample collection has been detailed in our prior study [16].

Patients were diagnosed by colonoscopy and histopathological review of any biopsy. Oral specimens were thawed on ice, and genomic DNA was extracted using the QIAamp DNA Microbiome Kit (Qiagen, Hilden, Germany). In parallel, stool specimens were thawed, and DNA was extracted using the QIAamp DNA Fecal Mini Kit (Qiagen), following the procedures described earlier [21, 22].

PCR amplification and sequencing

The gene-specific sequences used here target the 16S rRNA V3 and V4 regions, with a forward primer (5′-TCGTCGGCAGCGTCAGATGTGTATAAGAGACAGCCTACGGGNGGCWGCAG-3′) and a reverse primer (5′-GTCTCGTGGGCTCGGAGATGTGTATAAGAGACAGGACTACHVGGGTATCTAATCC-3′). The 25 µL PCR was set up as follows: 12.5 µL of 2× KAPA HiFi HotStart ReadyMix, 5 µL of forward primer (1 µM), 5 µL of reverse primer (1 µM), and 2.5 µL of bacterial genomic DNA (5 ng/µL in 10 mM Tris pH 8.5) per sample. The thermal cycling conditions were as follows: initial denaturation at 98 °C for 3 min, followed by 30 cycles of denaturation at 94 °C for 30 s, annealing at 55 °C for 30 s, and extension at 72 °C for 30 s, with a final extension at 72 °C for 5 min [16]. Then, 1 µL of PCR product was run on a Bioanalyzer DNA 1000 chip to verify the size. With the V3 and V4 primer pair used in the current study, the expected size on a Bioanalyzer trace after the amplicon PCR step is ~550 bp. Amplicons were purified with AMPure XP beads according to the manufacturer's protocol to remove contaminants and PCR artifacts. Purified amplicons were used to construct the library based on standard protocols, and sequencing was done using the Nextera XT Index Kit on an Illumina NovaSeq platform (Illumina, San Diego, CA, USA) [16].
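As a quick arithmetic check on the reaction described above, the following sketch sums the component volumes and derives the template mass and final primer concentration. All input values are taken from the text; the variable names are illustrative only.

```python
# Component volumes (in microlitres) of one PCR, as listed in the text.
reaction_ul = {
    "2x KAPA HiFi HotStart ReadyMix": 12.5,
    "forward primer (1 uM)": 5.0,
    "reverse primer (1 uM)": 5.0,
    "genomic DNA (5 ng/uL)": 2.5,
}

# Total reaction volume; should equal the stated 25 uL.
total_ul = sum(reaction_ul.values())

# Template mass: volume of DNA times its concentration.
dna_ng = reaction_ul["genomic DNA (5 ng/uL)"] * 5.0

# Final concentration of each primer after dilution into the full reaction.
primer_final_um = reaction_ul["forward primer (1 uM)"] * 1.0 / total_ul

print(f"total volume: {total_ul} uL")
print(f"template DNA: {dna_ng} ng")
print(f"final primer concentration: {primer_final_um} uM")
```

The volumes add up to the stated 25 µL, which corresponds to 12.5 ng of template DNA and a final concentration of 0.2 µM for each primer.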

Demultiplexed raw sequences were imported into QIIME2 v.2022-2 [40] and were denoised and clustered using DADA2 [41]. Taxonomy was assigned using a classifier pre-trained, via scikit-learn [42], on SILVA [43] release 138 99% full-length sequences. The resulting amplicon sequence variant (ASV) table, taxonomy assignment, and appropriate metadata were used as input for the Marker Data Profiling module of the online platform MicrobiomeAnalyst [44]. Features with low counts (< 4 and < 20% prevalence in samples, n = 1815), along with those with low variance (based on interquartile range, n = 25), were excluded from the downstream analyses. Counts were normalized using Total Sum Scaling (TSS).
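The low-count filter and TSS normalization can be illustrated with a minimal sketch. It is not the MicrobiomeAnalyst implementation: it assumes the filter keeps a feature when its count reaches 4 in at least 20% of samples, it omits the interquartile-range variance filter, and the function names and ASV counts are hypothetical.

```python
def filter_low_count(table, min_count=4, min_prevalence=0.20):
    """Keep features whose count reaches min_count in at least
    min_prevalence of samples (assumed reading of the filter).

    table maps a feature name to its list of per-sample counts.
    """
    kept = {}
    for feature, counts in table.items():
        n_present = sum(1 for c in counts if c >= min_count)
        if n_present / len(counts) >= min_prevalence:
            kept[feature] = counts
    return kept


def total_sum_scaling(table):
    """Divide every count by its sample total so each sample sums to 1."""
    n_samples = len(next(iter(table.values())))
    totals = [sum(counts[i] for counts in table.values())
              for i in range(n_samples)]
    return {
        feature: [c / t if t else 0.0 for c, t in zip(counts, totals)]
        for feature, counts in table.items()
    }


# Hypothetical ASV counts for five samples (illustrative values only).
asv_counts = {
    "asv_a": [10, 0, 25, 8, 12],
    "asv_b": [0, 0, 3, 0, 1],   # never reaches 4 reads, so it is dropped
    "asv_c": [30, 20, 0, 12, 48],
}
filtered = filter_low_count(asv_counts)
relative = total_sum_scaling(filtered)
```

After scaling, every sample's relative abundances sum to 1, which makes samples with different sequencing depths comparable.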

Statistical analysis

Descriptive statistics were presented as mean ± standard deviation (SD) and median (interquartile range [IQR]) for quantitative data by group (CNs and CPs). The independent t-test was applied to compare mean age between the CRC and normal groups. The Fisher exact test or exact Pearson Chi-Square test was used to evaluate the relation between categorical variables and group. Barplots were used to show the frequency of microbiota and compare them between the CP and CN groups. The "*" symbol in barplots represents statistically significant differences between CRC samples and normal samples, while the "#" symbol highlights CRC-exclusive bacteria. Analyses were conducted using SPSS (version 26) and R (version 4.2.1). p-values less than 0.05 were considered statistically significant.

Machine learning algorithm

In the current study, subjects were randomly divided into two groups: training specimens (70% of samples) and validation specimens (30% of samples). Models were built on the training data and tested on the validation data. Each patient appeared in only one of the two sets. The training data were used to develop models including logistic regression (LR), naïve Bayes (NB), support vector machine (SVM), and neural network (NN) [45,46,47].
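A random 70/30 split at the subject level can be sketched as follows. The study's own analysis was run in R and its exact procedure is not given; the function name, the seed, and the 40 subject IDs below are hypothetical and serve only to show that each subject lands in exactly one set.

```python
import random


def split_subjects(subject_ids, train_fraction=0.70, seed=0):
    """Shuffle subjects and split them into training and validation sets."""
    ids = list(subject_ids)
    random.Random(seed).shuffle(ids)          # reproducible shuffle
    n_train = round(len(ids) * train_fraction)
    return ids[:n_train], ids[n_train:]


# Hypothetical subject identifiers (not the study's actual cohort size).
subjects = [f"subject_{i:02d}" for i in range(1, 41)]
train_ids, validation_ids = split_subjects(subjects)

print(len(train_ids), "training subjects,",
      len(validation_ids), "validation subjects")
```

Splitting on subject IDs rather than on individual samples keeps a participant's saliva and stool samples on the same side of the split.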

Tune parameters

Each of the methods described here has a number of parameters associated with it, and it is crucial to select the most appropriate values in order to produce a model that is both optimal and minimal. Each algorithm was fine-tuned in order to predict disease accurately. Fivefold cross-validation with ten iterations was used to tune each machine learning algorithm, using available statistical code and R packages.
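The resampling scheme can be sketched as below, reading "fivefold cross validation with ten iterations" as five-fold cross-validation repeated ten times with a fresh shuffle each time. That reading is an assumption, as are the function name, the seed, and the sample count of 28.

```python
import random


def repeated_kfold(n_samples, n_folds=5, n_repeats=10, seed=0):
    """Yield (repeat, fold, train_idx, test_idx) for repeated k-fold CV."""
    rng = random.Random(seed)
    for repeat in range(n_repeats):
        order = list(range(n_samples))
        rng.shuffle(order)                     # new shuffle for every repeat
        folds = [order[k::n_folds] for k in range(n_folds)]
        for k, test_idx in enumerate(folds):
            # Training indices are all folds except the held-out one.
            train_idx = [i for j, fold in enumerate(folds) if j != k
                         for i in fold]
            yield repeat, k, train_idx, test_idx


# 28 is a hypothetical training-set size used only for illustration.
splits = list(repeated_kfold(n_samples=28))
print(len(splits), "train/test partitions")
```

Each candidate parameter setting would be scored on all partitions and the setting with the best average score retained; this sketch does not stratify folds by class, which a real analysis might prefer.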

Performance evaluation

The area under the receiver operating characteristic (ROC) curve (AUC) was used to estimate and compare models, followed by sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV), and accuracy (ACC). AUC was used as the criterion for selecting the most effective model for clinical decision-making. The ROC curve depicts the trade-off between sensitivity and specificity across the cutoff values of a diagnostic test. An AUC of 0.5 indicates no discrimination (that is, no ability to distinguish cases with the disease from those without), 0.7–0.8 is acceptable, 0.8–0.9 is excellent, and more than 0.9 is exceptional [48]. Sensitivity is the percentage of patients with the disease whom the model predicts to have the disease. To attain 100% sensitivity, the model must correctly identify all CRC cases. Specificity is the percentage of individuals without CRC whom the model predicts to be CNs. To be 100% specific, the model must correctly identify all CNs. PPV is the percentage of individuals predicted to have CRC who really have it. NPV is the proportion of individuals predicted to be CNs who really do not have CRC. ACC is the number of correct predictions divided by the number of observations.
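These definitions translate directly into code. The sketch below computes the five metrics from a confusion matrix and estimates AUC as the probability that a randomly chosen case scores higher than a randomly chosen control. The counts and scores are hypothetical and are not the study's data.

```python
def confusion_metrics(tp, fp, tn, fn):
    """Sensitivity, specificity, PPV, NPV and accuracy from a 2x2 table."""
    return {
        "SE": tp / (tp + fn),                    # true positives among all cases
        "SP": tn / (tn + fp),                    # true negatives among all controls
        "PPV": tp / (tp + fp),                   # cases among predicted positives
        "NPV": tn / (tn + fn),                   # controls among predicted negatives
        "ACC": (tp + tn) / (tp + fp + tn + fn),  # correct predictions overall
    }


def auc(scores_pos, scores_neg):
    """Fraction of case/control pairs ranked correctly (ties count half)."""
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


# Hypothetical validation results (illustrative values only).
metrics = confusion_metrics(tp=18, fp=3, tn=17, fn=2)
example_auc = auc([0.9, 0.8, 0.4], [0.7, 0.3, 0.2])
```

With these counts, sensitivity is 18/20 = 0.90, specificity is 17/20 = 0.85, and accuracy is 35/40 = 0.875. In the AUC example, 8 of the 9 case/control pairs are ranked correctly, giving an AUC of 8/9.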

Selection variable

A Random Forest technique was used to characterize variable importance based on the mean decrease in GINI. A higher mean decrease in GINI for a gut bacterium indicates that it is more important in predicting CRC [49]. Fivefold cross-validation with 10 iterations was applied to tune the parameters of the random forest.
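To show what a decrease in GINI measures, the following sketch computes the Gini impurity of a binary-labelled node and the impurity reduction achieved by a single threshold split on one microbe's abundance. A random forest aggregates such reductions over every split that uses the variable across all trees; this sketch covers only one split, and the abundances, labels, and function names are hypothetical.

```python
def gini_impurity(labels):
    """Gini impurity of a node with binary labels (1 = CP, 0 = CN)."""
    n = len(labels)
    if n == 0:
        return 0.0
    p = sum(labels) / n
    return 2.0 * p * (1.0 - p)   # equals 1 - p**2 - (1 - p)**2


def gini_decrease(values, labels, threshold):
    """Impurity of the parent node minus the weighted impurity of its children."""
    left = [y for x, y in zip(values, labels) if x <= threshold]
    right = [y for x, y in zip(values, labels) if x > threshold]
    n = len(labels)
    weighted = (len(left) / n) * gini_impurity(left) \
        + (len(right) / n) * gini_impurity(right)
    return gini_impurity(labels) - weighted


# Hypothetical relative abundances of one microbe and the matching labels.
abundance = [0.01, 0.02, 0.03, 0.20, 0.25, 0.30]
labels = [0, 0, 0, 1, 1, 1]

perfect_split = gini_decrease(abundance, labels, threshold=0.10)
weaker_split = gini_decrease(abundance, labels, threshold=0.025)
```

A threshold of 0.10 separates the two groups completely and removes all of the parent impurity (0.5), whereas a threshold of 0.025 removes only half of it (0.25). Microbes that repeatedly produce large reductions receive a high mean decrease in GINI.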

Availability of data and materials

Most data generated and analyzed during this study are included in this published article. Additional datasets used and/or analyzed during the current study are available from the corresponding author on reasonable request.



CRC: Colorectal cancer

CPs: Colorectal cancer positives

CNs: Colorectal cancer negatives

FIT: Fecal immunochemical test

qPCR: Quantitative PCR

CI: Confidence interval

OTU: Operational taxonomic units

SD: Standard deviation

IQR: Interquartile range

LR: Logistic regression models

NB: Naïve Bayes models

SVM: Support vector machines

NN: Neural network models

ROC: Receiver operating characteristics

AUC: Area under curve

PPV: Positive predictive value

NPV: Negative predictive value




  1. Rawla P, Sunkara T, Barsouk A. Epidemiology of colorectal cancer: incidence, mortality, survival, and risk factors. Przeglad Gastroenterol. 2019;14(2):89–103.

  2. Roselló S, Simón S, Cervantes A. Programmed colorectal cancer screening decreases incidence and mortality. Transl Gastroenterol Hepatol. 2019;4:84.

  3. Zhang J, Chen G, Li Z, Zhang P, Li X, Gan D, et al. Colonoscopic screening is associated with reduced Colorectal Cancer incidence and mortality: a systematic review and meta-analysis. J Cancer. 2020;11(20):5953–70.

  4. Rex DK, Boland CR, Dominitz JA, Giardiello FM, Johnson DA, Kaltenbach T, et al. Colorectal cancer screening: recommendations for physicians and patients from the US multi-society task force on colorectal cancer. Am J Gastroenterol. 2017;112(7):1016–30.

  5. Quintero E, Hassan C, Senore C, Saito Y. Progress and challenges in colorectal cancer screening. Gastroenterol Res Pract. 2012;2012:846985.

  6. Gupta N, Kupfer SS, Davis AM. Colorectal cancer screening. JAMA. 2019;321(20):2022–3.

  7. Bucchi L, Mancini S, Baldacchini F, Ravaioli A, Giuliani O, Vattiato R, et al. How a faecal immunochemical test screening programme changes annual colorectal cancer incidence rates: an Italian intention-to-screen study. Br J Cancer. 2022;127(3):541–8.

  8. Levin B, Lieberman DA, McFarland B, Andrews KS, Brooks D, Bond J, et al. Screening and surveillance for the early detection of colorectal cancer and adenomatous polyps, 2008: a joint guideline from the American Cancer Society, the US Multi-Society Task Force on Colorectal Cancer, and the American College of Radiology. Gastroenterology. 2008;134(5):1570–95.

  9. Atkin WS, Edwards R, Kralj-Hans I, Wooldrage K, Hart AR, Northover JM, et al. Once-only flexible sigmoidoscopy screening in prevention of colorectal cancer: a multicentre randomised controlled trial. Lancet (London). 2010;375(9726):1624–33.

  10. Shaukat A, Levin TR. Current and future colorectal cancer screening strategies. Nat Rev Gastroenterol Hepatol. 2022;19(8):521–31.

  11. Iragorri N, Spackman E. Assessing the value of screening tools: reviewing the challenges and opportunities of cost-effectiveness analysis. Public Health Rev. 2018;39(1):17.

  12. Hol L, van Leerdam ME, van Ballegooijen M, van Vuuren AJ, van Dekken H, Reijerink JC, et al. Screening for colorectal cancer: randomised trial comparing guaiac-based and immunochemical faecal occult blood testing and flexible sigmoidoscopy. Gut. 2010;59(1):62–8.

  13. Sánchez-Alcoholado L, Ramos-Molina B, Otero A, Laborda-Illanes A, Ordóñez R, Medina JA, et al. The role of the gut microbiome in colorectal cancer development and therapy response. Cancers. 2020;12(6):1.

  14. Cheng Y, Ling Z, Li L. The intestinal microbiota and colorectal cancer. Front Immunol. 2020;11:615056.

  15. Rezasoltani S, Sharafkhah M, Asadzadeh Aghdaei H, Nazemalhosseini Mojarad E, Dabiri H, Akhavan Sepahi A, et al. Applying simple linear combination, multiple logistic and factor analysis methods for candidate fecal bacteria as novel biomarkers for early detection of adenomatous polyps and colon cancer. J Microbiol Methods. 2018;155:82–8.

  16. Rezasoltani S, Aghdaei HA, Jasemi S, Gazouli M, Dovrolis N, Sadeghi A, et al. Oral microbiota as novel biomarkers for colorectal cancer screening. Cancers. 2023;15(1):1.

  17. Liang Q, Chiu J, Chen Y, Huang Y, Higashimori A, Fang J, et al. Fecal bacteria act as novel biomarkers for noninvasive diagnosis of colorectal cancer. Clin Cancer Res. 2017;23(8):2061–70.

  18. Flemer B, Warren RD, Barrett MP, Cisek K, Das A, Jeffery IB, et al. The oral microbiota in colorectal cancer is distinctive and predictive. Gut. 2018;67(8):1454–63.

  19. Bhar S, Singh R, Pinna Nishal K, Bose T, Dutta A, Mande SS. Sensing host health: insights from sensory protein signature of the metagenome. Appl Environ Microbiol. 2022;88(15):e00596-e622.

  20. Rezasoltani S, Asadzadeh Aghdaei H, Dabiri H, Akhavan Sepahi A, Modarressi MH, Nazemalhosseini ME. The association between fecal microbiota and different types of colorectal polyp as precursors of colorectal cancer. Microb Pathog. 2018;124:244–9.

  21. Rezasoltani S, Ghanbari R, Looha MA, Mojarad EN, Yadegar A, Stewart D, et al. Expression of main toll-like receptors in patients with different types of colorectal polyps and their relationship with gut microbiota. Int J Mol Sci. 2020;21(23):1.

  22. Baxter NT, Ruffin MTT, Rogers MA, Schloss PD. Microbiota-based model improves the sensitivity of fecal immunochemical test for detecting colonic lesions. Genome Med. 2016;8(1):37.

  23. Zeller G, Tap J, Voigt AY, Sunagawa S, Kultima JR, Costea PI, et al. Potential of fecal microbiota for early-stage detection of colorectal cancer. Mol Syst Biol. 2014;10(11):766.

  24. Allali I, Delgado S, Marron PI, Astudillo A, Yeh JJ, Ghazal H, et al. Gut microbiome compositional and functional differences between tumor and non-tumor adjacent tissues from cohorts from the US and Spain. Gut Microbes. 2015;6(3):161–72.

  25. Gao Z, Guo B, Gao R, Zhu Q, Qin H. Microbiota disbiosis is associated with colorectal cancer. Front Microbiol. 2015;6:1.

  26. Dejea CM, Wick EC, Hechenbleikner EM, White JR, Mark Welch JL, Rossetti BJ, et al. Microbiota organization is a distinct feature of proximal colorectal cancers. Proc Natl Acad Sci USA. 2014;111(51):18321–6.

  27. Marchesi JR, Dutilh BE, Hall N, Peters WH, Roelofs R, Boleij A, et al. Towards the human colorectal cancer microbiome. PLoS ONE. 2011;6(5):e20447.

  28. Sobhani I, Tap J, Roudot-Thoraval F, Roperch JP, Letulle S, Langella P, et al. Microbial dysbiosis in colorectal cancer (CRC) patients. PLoS ONE. 2011;6(1):e16393.

  29. Silva YP, Bernardi A, Frozza RL. The role of short-chain fatty acids from gut microbiota in gut-brain communication. Front Endocrinol. 2020;11:1.

  30. Kim S, Shin YC, Kim TY, Kim Y, Lee YS, Lee SH, et al. Mucin degrader Akkermansia muciniphila accelerates intestinal stem cell-mediated epithelial development. Gut Microbes. 2021;13(1):1–20.

  31. Depommier C, Everard A, Druart C, Plovier H, Van Hul M, Vieira-Silva S, et al. Supplementation with Akkermansia muciniphila in overweight and obese human volunteers: a proof-of-concept exploratory study. Nat Med. 2019;25(7):1096–103.

  32. Roshanravan N, Mahdavi R, Alizadeh E, Ghavami A, Rahbar Saadat Y, Mesri Alamdari N, et al. The effects of sodium butyrate and inulin supplementation on angiotensin signaling pathway via promotion of Akkermansia muciniphila abundance in type 2 diabetes: a randomized, double-blind, placebo-controlled trial. J Cardiovasc Thor Res. 2017;9(4):183–90.

  33. Zhang T, Ji X, Lu G, Zhang F. The potential of Akkermansia muciniphila in inflammatory bowel disease. Appl Microbiol Biotechnol. 2021;105:1.

  34. Rodrigues VF, Elias-Oliveira J, Pereira ÍS, Pereira JA, Barbosa SC, Machado MSG, et al. Akkermansia muciniphila and gut immune system: a good friendship that attenuates inflammatory bowel disease, obesity, and diabetes. Front Immunol. 2022;13:1.

  35. Wang F, Cai K, Xiao Q, He L, Xie L, Liu Z. Akkermansia muciniphila administration exacerbated the development of colitis-associated colorectal cancer in mice. J Cancer. 2022;13(1):124–33.

  36. Gu ZY, Pei WL, Zhang Y, Zhu J, Li L, Zhang Z. Akkermansia muciniphila in inflammatory bowel disease and colorectal cancer. Chin Med J. 2021;134(23):2841–3.

  37. Yuan B, Ma B, Yu J, Meng Q, Du T, Li H, et al. Fecal bacteria as non-invasive biomarkers for colorectal adenocarcinoma. Front Oncol. 2021;11:1.

  38. Choi S, Chung J, Cho M-L, Park D, Choi SS. Analysis of changes in microbiome compositions related to the prognosis of colorectal cancer patients based on tissue-derived 16S rRNA sequences. J Transl Med. 2021;19(1):485.

  39. Deng X, Li Z, Li G, Li B, Jin X, Lyu G. Comparison of microbiota in patients treated by surgery or chemotherapy by 16S rRNA sequencing reveals potential biomarkers for colorectal cancer therapy. Front Microbiol. 2018;9:1.

  40. Bolyen E, Rideout JR, Dillon MR, Bokulich NA, Abnet CC, Al-Ghalith GA, et al. Reproducible, interactive, scalable and extensible microbiome data science using QIIME 2. Nat Biotechnol. 2019;37(8):852–7.

  41. Callahan BJ, McMurdie PJ, Rosen MJ, Han AW, Johnson AJA, Holmes SP. DADA2: high-resolution sample inference from Illumina amplicon data. Nat Methods. 2016;13(7):581–3.

  42. Pedregosa F, Varoquaux G, Gramfort A, Michel V, Thirion B, Grisel O, et al. Scikit-learn: machine learning in Python. J Mach Learn Res. 2011;12:2825–30.

  43. Pruesse E, Quast C, Knittel K, Fuchs BM, Ludwig W, Peplies J, et al. SILVA: a comprehensive online resource for quality checked and aligned ribosomal RNA sequence data compatible with ARB. Nucl Acids Res. 2007;35(21):7188–96.

  44. Dhariwal A, Chong J, Habib S, King IL, Agellon LB, Xia J. MicrobiomeAnalyst: a web-based tool for comprehensive statistical, visual and meta-analysis of microbiome data. Nucl Acids Res. 2017;45(W1):W180–8.

  45. Wu X, Kumar V, Ross Quinlan J, Ghosh J, Yang Q, Motoda H, et al. Top 10 algorithms in data mining. Knowl Inf Syst. 2008;14(1):1–37.

  46. Moore AW, Komarek P, editors. Logistic regression for data mining and high-dimensional classification; 2004.

  47. Craven MW, Shavlik JW. Using neural networks for data mining. Futur Gener Comput Syst. 1997;13(2):211–29.

  48. Hosmer DW, Lemeshow S, Sturdivant RX. Applied logistic regression. London: Wiley; 2013.

  49. Breiman L. Random forests. Mach Learn. 2001;45(1):5–32.


We would like to thank Professor Sadegh Massarrat for his endless support throughout this research.


This research was supported by the Research Institute for Gastroenterology and Liver Diseases, Shahid Beheshti University of Medical Sciences, Tehran, Iran, under Grant [RIGLD1065]. The 16S rRNA sequencing was supported by the UNISS FAR fondi ricercar 2021, 2022 and Fondazione di Sardegna 2017 to L.A.S., and by a grant from the Regione Autonoma della Sardegna (legge regionale 12 dicembre 2022, n. 22) to L.A.S.

Author information

Authors and Affiliations



Conceptualization, SR, MMF, and HAA; methodology, SR, MMF, and MAL; software, LAS, MAL, SJ, MG, and DN; validation, SR and MMF; formal analysis, SR, SJ, LAS, MAL, and MG; investigation, SR, MMF, and HAA; sample collection, SR, HAA, AS, ST, and RB; writing (original draft preparation), SR; writing (review and editing), SR, MMF, HS, and LAS; visualization, SR, MMF, and LAS; supervision, MMF and LAS; project administration, MMF and SR; funding acquisition, HAA, MRZ, and LAS. All authors have read and agreed to the published version of the manuscript.

Corresponding authors

Correspondence to Leonardo Antonio Sechi or Mohammad Mehdi Feizabadi.

Ethics declarations

Ethics approval and consent to participate

The case–control study was approved by the Clinical Research Ethics Committee of the Shahid Beheshti University of Medical Sciences and the Ethics Committee of Taleghani Hospital, Tehran, Iran (IR.SBMU.RIGLD.REC.1398.039).

Consent for publication

Informed consent was obtained in all cases.

Competing interests

The authors declare that there are no competing interests.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

The original version of the article was revised: The affiliation information in authorship has been corrected.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. A copy of this licence is available from Creative Commons. The Creative Commons Public Domain Dedication waiver applies to the data made available in this article, unless otherwise stated in a credit line to the data.

About this article

Cite this article

Rezasoltani, S., Azizmohammad Looha, M., Asadzadeh Aghdaei, H. et al. 16S rRNA sequencing analysis of the oral and fecal microbiota in colorectal cancer positives versus colorectal cancer negatives in Iranian population. Gut Pathog 16, 9 (2024).
