Gfkb: Difference between revisions

From HIVE Lab
Jkeeney (talk | contribs)
Added links to current SlimNT files.
Latest revision as of 14:18, 29 August 2025

Gutfeeling KnowledgeBase
We have developed a proof-of-concept prototype of a gut microbiome monitoring system, built on the sequencing and analysis pipeline implemented during our previous I-Corps award (see below).

From the individuals enrolled in our study, we have collected three separate fecal samples for metagenomic sequencing, anthropometric measurements, and questionnaires covering diet history, gastrointestinal symptoms, perceived stress, physical activity, and sleep. We have also begun analyzing fecal samples from the Human Microbiome Project together with their associated metadata. The real value of our prototype will come from integrating these data into a single knowledgebase of comparable samples, all processed with our optimized pipeline.

HIVE Metagenomics Pipeline

We use a two-step pipeline for metagenomic analysis: CensuScope followed by HIVE-hexagon. CensuScope is a census-based tool that randomly samples a user-defined number of reads and BLASTs them against a reference database. Our reference database (a filtered version of NTdb) is the NCBI Nucleotide database with all sequences lacking a clear taxonomic lineage filtered out. All artificial sequences have been removed, either by our automated filter or manually once an artificial sequence is identified during post-analysis processing. Sequences identified by CensuScope are then used as references in HIVE-hexagon alignments. HIVE-hexagon, a k-mer-based aligner, is more sensitive and faster than current standard alignment algorithms, and it reduces computational cost, memory requirements, and processing time.
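
The read-sampling step described above can be illustrated with a minimal sketch. This is not CensuScope's code: it assumes plain four-line FASTQ input and uses reservoir sampling as one possible way to draw a user-defined number of reads uniformly at random in a single pass. The names parse_fastq and sample_reads are hypothetical, and the BLAST step itself is not shown.

```python
import random
from typing import Iterable, Iterator, List, Tuple

Read = Tuple[str, str]  # (read identifier, nucleotide sequence)


def parse_fastq(lines: Iterable[str]) -> Iterator[Read]:
    """Yield (id, sequence) pairs, assuming strict 4-line FASTQ records."""
    it = iter(lines)
    for header in it:
        if not header.strip():
            continue  # tolerate blank lines between or after records
        seq = next(it, "").strip()
        next(it, "")  # '+' separator line (ignored)
        next(it, "")  # quality line (ignored in this sketch)
        yield header.strip()[1:], seq  # drop the leading '@'


def sample_reads(reads: Iterable[Read], n: int, seed: int = 0) -> List[Read]:
    """Reservoir-sample up to n reads uniformly at random in one pass.

    Illustrative only: CensuScope's actual sampling scheme may differ.
    """
    rng = random.Random(seed)
    reservoir: List[Read] = []
    for i, read in enumerate(reads):
        if i < n:
            reservoir.append(read)  # fill the reservoir first
        else:
            j = rng.randint(0, i)  # inclusive on both ends
            if j < n:
                reservoir[j] = read  # replace with probability n / (i + 1)
    return reservoir
```

With a FASTQ file opened as a text handle, sample_reads(parse_fastq(handle), n=1000) would return at most 1000 reads without loading the whole file into memory. The sampled reads would then be passed to BLAST against the reference database.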

Current Slim NT database:

https://hive.biochemistry.gwu.edu/static/slimNT.fa.gz

Current Slim NT taxonomy database:

https://hive.biochemistry.gwu.edu/static/slimNT.db.gz
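
Once downloaded, the database file can be inspected without decompressing it to disk. The sketch below assumes slimNT.fa.gz is a gzip-compressed FASTA file, as its extension suggests. The function name iter_fasta and the local path in the usage note are hypothetical. The format of slimNT.db.gz is not documented here, so it is not covered.

```python
import gzip
from typing import Iterator, List, Optional, Tuple


def iter_fasta(path: str) -> Iterator[Tuple[str, str]]:
    """Stream (header, sequence) pairs from a gzip-compressed FASTA file."""
    header: Optional[str] = None
    chunks: List[str] = []
    with gzip.open(path, "rt") as handle:
        for line in handle:
            line = line.rstrip("\n")
            if line.startswith(">"):
                if header is not None:
                    yield header, "".join(chunks)  # emit the previous record
                header, chunks = line[1:], []
            elif header is not None:
                chunks.append(line.strip())  # sequence may span several lines
    if header is not None:
        yield header, "".join(chunks)  # emit the final record
```

For example, sum(1 for _ in iter_fasta("slimNT.fa.gz")) would count the sequences in a local copy of the file, one record at a time.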

Publications

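Shamsaddini et al.: https://doi.org/10.1186/1471-2164-15-918

Santana-Quintero et al.: https://doi.org/10.1371/journal.pone.0099033

Simonyan et al.: https://doi.org/10.1093/database/baw022

Simonyan V, Mazumder R: https://doi.org/10.3390/genes5040957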

Funding

LOI_ID#L02496974, NSF_Lineage_Award #1546491