In-silico computational approaches to study microbiota impacts on diseases and pharmacotherapy

Shokri Garjan, Hassan; Omidi, Yadollah; Poursheikhali Asghari, Mehdi; Ferdousi, Reza

doi:10.1186/s13099-023-00535-2

Gut Pathogens

Table 3 List of all the data that was utilized in the microbe–disease prediction

Data	Source	Original state	Similarity process	URL
HMDAD	The HMDAD database documents disease-related disorders of microbial populations reported in PubMed	HMDAD integrates 483 disease–microbe entries covering 39 diseases and 292 microbes	The entries are reduced to 450 known microbe–disease associations (MDAs), which are then used to calculate GIP kernel, cosine, and Spearman correlation similarities	https://www.cuilab.cn/hmdad
PERYTON	The content of Peryton is based entirely on manual curation of biomedical journals. Database dictionaries are built with reference tools, so diseases and microbiota are supplied in a well-structured, well-organized format	The database currently holds over 7,900 entries linking 43 diseases and 1,396 microorganisms	Peryton also provides interactive visualizations, and the data can be downloaded for local storage and analysis	https://dianalab.e-ce.uth.gr/peryton/
GENE-BASED	DisGeNET collects gene–disease associations (GDAs) from UNIPROT, CGI, ClinGen, Genomics England, CTD (human subset), PsyGeNET, and Orphanet, as well as GDAs produced by text mining MEDLINE abstracts	628 685 GDAs are covered, between 17 549 genes and 24 166 diseases. Size/coverage in HMDAD: 37 diseases mapped, 1850 chromosomes, and 2715 GDAs	The neighbor-based similarity approach calculates GDA scores, which are used to find further commonalities among a selection of diseases	https://www.disgenet.org
SYMPTOM-BASED disease data	HSDN draws on PubMed's large-scale medical bibliographic records of disease–symptom relationships	Co-occurrence counts and TF-IDF weights for 322 symptoms and 4442 diseases, with 147 97 connections; 22 mapped diseases, 269 symptoms, and 1858 disease–symptom associations	Symptom-based disease similarity is calculated from the co-occurrence TF-IDF weights between a disease and its symptoms	https://www.nature.com/articles/ncomms5212
Semantics-based disease data	MeSH trees from the National Library of Medicine provide a hierarchical definition of diseases	Hierarchical trees systematically describe a variety of diseases. Size/coverage in HMDAD: 33 diseases mapped	The DAG-based semantic similarity of two diseases, each represented as a tree of hierarchical descriptors, is calculated	https://meshb.nlm.nih.gov/search
PROTEIN	STRING is a database that collects protein–protein interactions and protein data from several sources	At the species level, 1391 microbes were mapped, with gene neighborhood scores for 932 370 pairs of COGs	The neighborhood score is used to determine whether there is an edge between two COGs. STRING also provides interactive visualizations	https://string-db.org
Comprehensive Antibiotic Resistance Database (CARD)	A carefully curated resource offering high-quality reference material on the molecular basis of antimicrobial resistance (AMR), with a focus on the genes, proteins, and mutations implicated in AMR	CARD contains 2441 model reference sequences and 853 single nucleotide alterations, as well as an increasing number of indels, frameshift, and nonsense mutations linked to antimicrobial resistance	The ontology also includes additional search criteria: mutations conferring AMR (where relevant) and curated BLAST(P/N) bit score cut-offs	https://card.mcmaster.ca/
Disbiome	Created in 2018, Disbiome is a more comprehensive database that is updated every three months	As of December 2019, the Disbiome database includes 322 diseases, 1,470 microbiome organisms, and 9,102 experiments published in 1,018 scholarly articles	Manual annotation ensures a clear, organized, and accessible presentation of the material	https://disbiome.ugent.be/home/
MicroPhenoDB	MicroPhenoDB contains 5677 non-redundant associations between 1781 microorganisms and 542 human disease phenotypes across more than 22 human body sites	In addition, MicroPhenoDB has 696,934 connections between 27,277 clade-specific core genes and 685 microorganisms	The software lets scientists search DNA and RNA sequences for potential pathogens without running the usual metagenomic data processing and assembly steps	http://www.liwzlab.cn/microphenodb
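
The HMDAD row mentions GIP kernel and cosine similarities computed from known microbe–disease associations, but the table does not give the formulas. The following is a minimal sketch, assuming the commonly used Gaussian interaction profile definition, exp(-gamma * ||p_i - p_j||^2) with gamma = gamma' / mean(||p||^2). The toy matrix, the function names, and the default gamma' = 1 are illustrative assumptions, not values taken from the article or from HMDAD.

```python
import math


def gip_kernel(profiles, gamma_prime=1.0):
    """Gaussian interaction profile (GIP) kernel between rows of `profiles`.

    Assumed definition: K[i][j] = exp(-gamma * ||p_i - p_j||^2), where
    gamma = gamma_prime / (mean squared norm of all profiles).
    """
    sq_norms = [sum(x * x for x in row) for row in profiles]
    mean_sq = sum(sq_norms) / len(sq_norms)
    if mean_sq == 0:
        raise ValueError("every interaction profile is empty")
    gamma = gamma_prime / mean_sq
    return [
        [math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(p, q)))
         for q in profiles]
        for p in profiles
    ]


def cosine_similarity(profiles):
    """Cosine similarity between rows; an all-zero row gets similarity 0."""
    norms = [math.sqrt(sum(x * x for x in row)) for row in profiles]
    n = len(profiles)
    sim = [[0.0] * n for _ in range(n)]
    for i, p in enumerate(profiles):
        for j, q in enumerate(profiles):
            if norms[i] > 0 and norms[j] > 0:
                dot = sum(a * b for a, b in zip(p, q))
                sim[i][j] = dot / (norms[i] * norms[j])
    return sim


def transpose(matrix):
    """Swap rows and columns of a rectangular list-of-lists matrix."""
    return [list(col) for col in zip(*matrix)]


# Hypothetical toy association matrix: rows = diseases, columns = microbes.
# A 1 marks a known association; these values are made up for illustration.
associations = [
    [1, 0, 1, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
]

disease_gip = gip_kernel(associations)             # 3 x 3, disease side
microbe_gip = gip_kernel(transpose(associations))  # 4 x 4, microbe side
disease_cos = cosine_similarity(associations)      # 3 x 3, disease side
```

The same association matrix yields both a disease-side and a microbe-side similarity: rows are disease profiles, and the transposed matrix gives microbe profiles. Normalizing gamma by the mean squared profile norm keeps the kernel width comparable across datasets with different association densities.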


ISSN: 1757-4749
