Dataset Resources
Revision as of 20:31, 16 December 2024
HIVE Team Datasets
BCO HCV
We demonstrated that the use of the IEEE 2791-2020 standard (BioCompute Objects [BCOs]) enables complete and concise communication of NGS data analysis results. One arm of a clinical trial was replicated using synthetically generated data made to resemble real biological data. Two separate, independent analyses were then carried out using BCOs as the tool for communicating the analysis: one to simulate a pharmaceutical regulatory submission to the FDA, and another to simulate the FDA review. The two sets of results were compared and tabulated for concordance analysis: of the 118 simulated patient samples generated, the final results of 117 (99.15%) were in agreement. This high concordance rate demonstrates the ability of a BCO, when a verification kit is included, to effectively capture and clearly communicate NGS analyses within regulatory submissions. BCOs promote transparency and reproducibility, thereby reinforcing trust in the regulatory submission process.
- Base line 1: Version: 1.0, Size: 48 MB, Format: FASTQ
- Base line 2: Version: 1.0, Size: 48 MB, Format: FASTQ
- Treatment Failure 1: Version: 1.0, Size: 48 MB, Format: FASTQ
- Treatment Failure 2: Version: 1.0, Size: 48 MB, Format: FASTQ
Full Dataset Download:
- manifest.json: Version: 1.0, Size: 163 KB, Format: JSON
- hcvALL.zip: Version: 1.0, Size: 4.4 GB, Format: ZIP
Citation: https://doi.org/10.1101/2020.12.07.415059
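The concordance figure quoted in the BCO HCV description can be reproduced with simple arithmetic. The snippet below is a minimal sketch: the function name `concordance_rate` is a hypothetical helper introduced here for illustration, and only the counts (117 of 118) come from the text above.

```python
def concordance_rate(agreeing: int, total: int) -> float:
    """Return the percentage of samples on which two analyses agree."""
    if total <= 0:
        raise ValueError("total must be a positive number of samples")
    if not 0 <= agreeing <= total:
        raise ValueError("agreeing must be between 0 and total")
    return 100.0 * agreeing / total


# Counts taken from the description above: 117 of 118 samples agreed.
rate = concordance_rate(117, 118)
print(f"{rate:.2f}%")  # 99.15%
```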
GFKB
The Gut Feeling KnowledgeBase (GutFeelingKB, GFKB) is a reference database of the healthy human gut microbiome. It is generated by a metagenomic analysis pipeline described in our paper (https://doi.org/10.1371/journal.pone.0206484), which includes three tools that are integrated into the HIVE platform. To create GutFeelingKB, 49 healthy samples sequenced at GWU and 49 healthy samples taken from the Human Microbiome Project were analyzed.
- Version: 4.0
- Citation: https://pubmed.ncbi.nlm.nih.gov/31509535/ (PMID: 31509535)
- Downloads Link: https://hive.biochemistry.gwu.edu/gfkb
Polyester Simulated RNA-seq Reads for Chromosome 22
Simulated RNA-seq reads were generated using the R package polyester for Chromosome 22 of the human reference genome GRCh38. Two samples were generated, each containing two unique transcripts that are expressed at 20-fold higher than normal levels to serve as positive controls. These reads can be used to test RNA-seq analysis pipelines and to gauge how much variability an analysis introduces when recovering the 20-fold difference of the positive control transcripts between samples.
- Version: 1.0. More info: Bioconductor polyester page (https://bioconductor.org/packages/release/bioc/html/polyester.html)
- [sample_01_01.fasta]: Forward reads for sample 1. Size: 1.7 GB, Format: FASTA
- [sample_01_2.fasta]: Reverse reads for sample 1. Size: 1.7 GB, Format: FASTA
- [sample_02_01.fasta]: Forward reads for sample 2. Size: 1.8 GB, Format: FASTA
- [sample_02_02.fasta]: Reverse reads for sample 2. Size: 1.8 GB, Format: FASTA
- [sim_tx_info.txt]: Summary of fold changes per transcript. Size: 142 KB, Format: TXT
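As a starting point for checking a pipeline against the positive controls, the fold-change summary can be scanned for transcripts at or above the 20-fold level. The following is a minimal sketch under stated assumptions: it assumes sim_tx_info.txt is tab-delimited, that its first column holds the transcript identifier, and that fold-change columns have names beginning with `foldchange`. Check these against the actual file header before relying on it. The function name `find_positive_controls` and the transcript IDs in the sample text are hypothetical.

```python
import csv
import io


def find_positive_controls(text: str, min_fold: float = 20.0) -> dict:
    """Return {transcript_id: {column: fold_change}} for rows where any
    fold-change column is at or above min_fold.

    Assumes tab-delimited input with the transcript ID in the first column
    and fold-change columns whose names start with "foldchange".
    """
    reader = csv.DictReader(io.StringIO(text), delimiter="\t")
    fields = reader.fieldnames
    if not fields:
        raise ValueError("input has no header row")
    id_col = fields[0]
    fold_cols = [c for c in fields if c.lower().startswith("foldchange")]
    if not fold_cols:
        raise ValueError("no fold-change columns found in header")

    hits = {}
    for row in reader:
        values = {c: float(row[c]) for c in fold_cols}
        if any(v >= min_fold for v in values.values()):
            hits[row[id_col]] = values
    return hits


# Made-up sample rows for illustration only; not taken from the real file.
sample = (
    "transcriptid\tfoldchange.1\tfoldchange.2\n"
    "txA\t20\t1\n"
    "txB\t1\t1\n"
    "txC\t1\t20\n"
)
for tx, folds in find_positive_controls(sample).items():
    print(tx, folds)
```

To run this on the downloaded file, read it with `open("sim_tx_info.txt").read()` and pass the resulting string to the function. The transcripts it reports can then be compared with the differential expression calls produced by the pipeline under test.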