<?xml version="1.0"?>
<feed xmlns="http://www.w3.org/2005/Atom" xml:lang="en">
	<id>https://hivelab.biochemistry.gwu.edu/wiki/api.php?action=feedcontributions&amp;feedformat=atom&amp;user=Jeetvora</id>
	<title>HIVE Lab - User contributions [en]</title>
	<link rel="self" type="application/atom+xml" href="https://hivelab.biochemistry.gwu.edu/wiki/api.php?action=feedcontributions&amp;feedformat=atom&amp;user=Jeetvora"/>
	<link rel="alternate" type="text/html" href="https://hivelab.biochemistry.gwu.edu/wiki/Special:Contributions/Jeetvora"/>
	<updated>2026-05-06T12:25:50Z</updated>
	<subtitle>User contributions</subtitle>
	<generator>MediaWiki 1.42.1</generator>
	<entry>
		<id>https://hivelab.biochemistry.gwu.edu/wiki/index.php?title=Volunteership_Spring_2026&amp;diff=1253</id>
		<title>Volunteership Spring 2026</title>
		<link rel="alternate" type="text/html" href="https://hivelab.biochemistry.gwu.edu/wiki/index.php?title=Volunteership_Spring_2026&amp;diff=1253"/>
		<updated>2026-04-15T15:29:29Z</updated>

		<summary type="html">&lt;p&gt;Jeetvora: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&lt;br /&gt;
== 2026 Spring Volunteer Program Details ==&lt;br /&gt;
&lt;br /&gt;
=== Dates ===&lt;br /&gt;
&#039;&#039;&#039;Application Deadline&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
January 9, 2026, noon (email your updated resume and your projects in order of preference). Acceptance letters/emails will be sent to candidates no later than the day after the kick-off meeting.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Volunteer Zoom Kick-Off Meeting&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
January 12, 2026 | 11:00 AM to 12:00 PM&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Program Dates: January, 2026 –  April, 2026&#039;&#039;&#039; (13 weeks)&lt;br /&gt;
&lt;br /&gt;
Remote | Hybrid for GW employees and students (Ross Hall 5th floor)&lt;br /&gt;
&lt;br /&gt;
[[Volunteership Fall 2025|Fall 2025 Volunteership]] (Closed)&lt;br /&gt;
----&lt;br /&gt;
&lt;br /&gt;
=== Volunteer Expectations ===&lt;br /&gt;
&lt;br /&gt;
# Minimum commitment of 10 hours per week.&lt;br /&gt;
# Progress updates via Slack at least 3 days per week (scrum).&lt;br /&gt;
# 30-minute Zoom meetings (during regular work hours) once a week or every other week with the assigned project point of contact (POC).&lt;br /&gt;
# Attend some lectures or seminars remotely (max 4-5).&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;&#039;&#039;Important:&#039;&#039;&#039; If the scrum is not updated for 2 consecutive working days, the candidate will be automatically dropped from the program.&#039;&#039;&lt;br /&gt;
----&lt;br /&gt;
&lt;br /&gt;
=== Potential Projects ===&lt;br /&gt;
We are excited to continue our bioinformatics volunteership program in Spring 2026. This program offers students the opportunity to work on bioinformatics projects supported by agencies such as the NIH, ARPA-H, and FDA. Participants will gain exposure to a variety of activities within a bioinformatics lab, including data analysis, computational biology, and genomics. If you are interested, please email your resume and a ranked list of the projects that interest you most to mazumder_lab@gwu.edu. You can also indicate any specific areas you would like to focus on.&lt;br /&gt;
&lt;br /&gt;
# BiomarkerKB (biomarkerkb.org) project: Biomarker curation project. Involves reading papers and collecting biomarkers.&lt;br /&gt;
# GlyGen (glygen.org) project: Review glycomics and glycoproteomics data and curate tissue, disease, and other related information.&lt;br /&gt;
# ARGOS (argosdb.org) project: Analyze genomics data using HIVE to identify reference genome assemblies.&lt;br /&gt;
# PredictMod (hivelab.biochemistry.gwu.edu/predictmod) project: Curate PMIDs to train LLM recommendation of intervention outcome prediction datasets.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;Note: Individuals involved in the above projects with a background in programming and/or machine learning may also undertake additional tasks to support the development of ML models, which can be integrated into PredictMod or used to enhance AI/ML-ready datasets within GlyGen. &amp;lt;u&amp;gt;We are also looking for individuals who have previously worked with us to take on a coordinator role&amp;lt;/u&amp;gt;.&#039;&#039;&lt;br /&gt;
----&lt;br /&gt;
&lt;br /&gt;
==== 1. BiomarkerKB Biocuration Project Ideas ====&lt;br /&gt;
POC: Maria Kim, Cyrus Yeung, Jeet Vora&lt;br /&gt;
&lt;br /&gt;
# Curate biomarkers for a specific disease or for a treatment&lt;br /&gt;
## The student will do manual curation for about 4 weeks, with regular check-ins with the POC to ensure it is being done correctly.&lt;br /&gt;
## The next 4 weeks can be dedicated to developing an LLM or an automated process to extract biomarker details with data collected in the first 4 weeks as training data/example data.&lt;br /&gt;
# Top 50 biomarkers&lt;br /&gt;
## Curate the top 50 biomarkers for biomarkerkb.org.&lt;br /&gt;
## Define what constitutes a top 50 biomarker.&lt;br /&gt;
## Begin curating biomarkers from different sources and papers by collecting fields mentioned in the data model, as well as collecting cross-references.&lt;br /&gt;
# Biocuration of biomarkers from NLP/LLM work&lt;br /&gt;
## Use the biomarkers collected from NLP work.&lt;br /&gt;
## Curate biomarkers. The data provided does not follow the biomarker data model.&lt;br /&gt;
## While curating the biomarkers, check if data collected from NLP is correct.&lt;br /&gt;
## After completion, the student can start using curated data to work on NLP/LLM methods.&lt;br /&gt;
# Continue working on LLM methods started by volunteers in Fall 2025.&lt;br /&gt;
::: The data is available as well as some preliminary research and work done by previous volunteers in this area.&lt;br /&gt;
&lt;br /&gt;
If the student has any other ideas, diseases, treatments, or methods they want to focus on, please reach out to daniallmasood@gwu.edu to discuss your idea and check if it will be feasible as a project.&lt;br /&gt;
&lt;br /&gt;
==== 2. GlyGen Biocuration Project Ideas ====&lt;br /&gt;
POC: Rene Ranzinger, Kate Warner, Urnisha Bhuiyan &lt;br /&gt;
&lt;br /&gt;
Over the last three decades, numerous glycomics database projects have been initiated to collect valuable information about glycans, proteins, and their interactions. Some of these databases have been discontinued due to the end of project funding; however, the data contained within them remains highly valuable to the research community. Integrating these legacy datasets into modern databases or knowledgebases, such as GlyGen, presents a significant challenge because much of the associated metadata (e.g., species, tissue, disease, cell line) is recorded as free-text that does not conform to the standardized dictionaries and ontologies used by current resources.&lt;br /&gt;
&lt;br /&gt;
To address this challenge, this project will leverage large language models (LLMs) to automate the mapping of free-text metadata from legacy databases, specifically CarbBank and CFG, to standardized accessions in authoritative resources such as NCBI Taxonomy, Disease Ontology, and Cellosaurus. The LLM-based workflow will identify and normalize synonyms, abbreviations, and spelling variants (e.g., “human,” “man,” or “h. sapiens” mapped to Homo sapiens), enabling scalable and reproducible metadata harmonization that would otherwise require extensive manual curation. The LLM tasks will be performed using OpenAI resources integrated into the GlyGen curation pipeline. The project involves developing Python scripts to read and write data, invoke the OpenAI API, and compare results with manually curated data. Another aspect of the work is developing and fine-tuning a prompt for ChatGPT to ensure that the mapping is reliable and accurate.&lt;br /&gt;
&lt;br /&gt;
While the mapping process will be largely automated, manual validation will be incorporated as a quality-control step to assess model performance, verify correctness, and identify edge cases requiring refinement. This hybrid approach significantly reduces curator burden while ensuring high-quality, ontology-aligned annotations.&lt;br /&gt;
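&lt;br /&gt;
The quality-control step described above can be sketched as follows. This is only a minimal illustration under stated assumptions, not the GlyGen pipeline: the function names, the small synonym table, and the sample records are hypothetical, and the lookup table stands in for the LLM call so that the example runs offline. The taxonomy IDs used (9606 for Homo sapiens, 10090 for Mus musculus, 9913 for Bos taurus) are NCBI Taxonomy accessions.&lt;br /&gt;

```python
from typing import Optional

# Hypothetical stand-in for the LLM mapping step: a tiny synonym table keyed by
# normalized free text. NCBI Taxonomy 9606 is Homo sapiens, 10090 is Mus musculus.
SYNONYMS = {
    "human": "9606",
    "man": "9606",
    "h. sapiens": "9606",
    "homo sapiens": "9606",
    "mouse": "10090",
    "mus musculus": "10090",
}


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so spelling variants compare equal."""
    return " ".join(text.lower().split())


def map_species(free_text: str) -> Optional[str]:
    """Return a taxonomy ID for the free text, or None if it is unmapped."""
    return SYNONYMS.get(normalize(free_text))


def agreement(predicted: dict, curated: dict) -> float:
    """Fraction of manually curated records where the prediction matches."""
    if not curated:
        return 0.0
    hits = sum(1 for key, value in curated.items() if predicted.get(key) == value)
    return hits / len(curated)


# Hypothetical legacy records (record id -> free-text species) and the
# corresponding manual curation (record id -> taxonomy ID).
legacy = {"r1": "Human", "r2": "  H.  sapiens ", "r3": "mouse", "r4": "bovine"}
manual = {"r1": "9606", "r2": "9606", "r3": "10090", "r4": "9913"}

predicted = {rid: map_species(text) for rid, text in legacy.items()}
# Records where automated and manual curation disagree need human review.
mismatches = sorted(rid for rid in manual if predicted.get(rid) != manual[rid])

print(agreement(predicted, manual))
print(mismatches)
```

In this sketch, the unmapped record ("bovine") surfaces as a mismatch, which is the kind of edge case the manual validation step is meant to catch.&lt;br /&gt;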
&lt;br /&gt;
The goal of this effort is to migrate and modernize datasets from CarbBank and CFG, making them interoperable with GlyGen and other contemporary glycoinformatics resources through a scalable, AI-assisted curation strategy.&lt;br /&gt;
&lt;br /&gt;
For any questions, please contact Rene Ranzinger (rene@ccrc.uga.edu) or Kate Warner (k.warner1@email.gwu.edu). &lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;3. GlyGen Publication Analysis Project Ideas&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
POC: Rene Ranzinger and Urnisha Bhuiyan&lt;br /&gt;
&lt;br /&gt;
One of the challenges for any bioinformatics project is understanding the size of its community, how well the project serves this community, and how widely its software/database is used. A potential solution is to analyze PubMed publication data. We are seeking applicants with programming skills (in Python or Java) to perform this analysis.&lt;br /&gt;
&lt;br /&gt;
The project involves:&lt;br /&gt;
&lt;br /&gt;
# Using the PubMed web API to filter publications based on keywords.&lt;br /&gt;
# Analyzing paper abstracts to identify research institutions and groups that form the community.&lt;br /&gt;
# Filtering the community list to exclude unrelated co-authors.&lt;br /&gt;
# Prioritizing papers identified by GlycoSiteMiner for curation via TableMaker.&lt;br /&gt;
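&lt;br /&gt;
As an illustration of the first step, the sketch below builds a keyword query for the NCBI E-utilities ESearch endpoint and parses the PMIDs from its JSON response. It is a hedged starting point rather than the project's actual code: the helper names and the example keywords are assumptions, and the sample payload is hand-written with placeholder PMIDs so that nothing is fetched over the network.&lt;br /&gt;

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# NCBI E-utilities ESearch endpoint for querying PubMed.
ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_query(keywords: list) -> str:
    """OR the keywords together, restricting each to title and abstract."""
    return " OR ".join(f'"{word}"[Title/Abstract]' for word in keywords)


def build_url(keywords: list, retmax: int = 20) -> str:
    """Return an ESearch URL that asks for matching PMIDs as JSON."""
    params = {
        "db": "pubmed",
        "term": build_query(keywords),
        "retmode": "json",
        "retmax": retmax,
    }
    return ESEARCH_URL + "?" + urlencode(params)


def parse_pmids(payload: str) -> list:
    """Extract the PMID list from an ESearch JSON response body."""
    return json.loads(payload)["esearchresult"]["idlist"]


def search(keywords: list, retmax: int = 20) -> list:
    """Run the search over the network and return PMIDs (not called here)."""
    with urlopen(build_url(keywords, retmax)) as response:
        return parse_pmids(response.read().decode("utf-8"))


# Offline demonstration with a hand-written payload in the ESearch JSON shape;
# the PMIDs are placeholders, not real search results.
sample = '{"esearchresult": {"count": "2", "idlist": ["11111111", "22222222"]}}'
print(build_query(["GlyGen", "glycoinformatics"]))
print(parse_pmids(sample))
```

Keeping URL construction and response parsing separate from the network call makes both easy to test without contacting NCBI.&lt;br /&gt;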
&lt;br /&gt;
A subproject will involve analyzing the full text of papers (when available) for keywords or resource and database names. The results of the analysis will be discussed with GlyGen project members, who will suggest changes and improvements to the analysis and data presentation. Source code developed as part of this project will be documented and shared in a public GitHub repository. If you have any other ideas or methods you would like to explore, please reach out to rene@ccrc.uga.edu to discuss them.&lt;br /&gt;
&lt;br /&gt;
==== 4. PredictMod Machine Learning (ML) Modeling Project ====&lt;br /&gt;
POC: Lori Krammer, Pat McNeely (optional)&lt;br /&gt;
&lt;br /&gt;
Volunteers will conduct ML modeling using publicly-available -omics datasets that were previously identified (see [[Recommended Publications for Intervention Outcome Prediction Models|https://hivelab.biochemistry.gwu.edu/wiki/Recommended_Publications_for_Intervention_Outcome_Prediction_Models]]). This volunteership will involve data harmonization, model training, and pipeline documentation.&lt;br /&gt;
&lt;br /&gt;
Tasks associated with this project include:&lt;br /&gt;
&lt;br /&gt;
# Exploring and understanding the data found in relevant PMIDs that can be used to train intervention outcome prediction models.&lt;br /&gt;
# Preparing the data for model training and model performance evaluation&lt;br /&gt;
# Testing the modeling tutorial, PredictMod platform, and associated project tools&lt;br /&gt;
# Documentation of the ML pipeline and testing results&lt;br /&gt;
Deliverables for this project include:&lt;br /&gt;
&lt;br /&gt;
# ML-ready datasets&lt;br /&gt;
# Trained model scripts&lt;br /&gt;
# Pipeline documentation captured in BioCompute Objects (BCOs) and testing reports&lt;br /&gt;
# Volunteership documentation (final report or weekly progress reports)&lt;br /&gt;
&lt;br /&gt;
Interested individuals should reach out to lorikrammer@gwu.edu. Please note that this project requires attendance at biweekly meetings and a final presentation of your work.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;5. FDA-ARGOS Computation and Pathogen Curation Project&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;Note:&#039;&#039; For anyone interested in ARGOS, you may be assigned to another project of your choice. This project is contingent on a contract extension. Please complete your project selection in order of preference.&lt;br /&gt;
&lt;br /&gt;
POC: Christie Woodside&lt;br /&gt;
&lt;br /&gt;
Qualifications: basic to intermediate programming skills and knowledge of basic bioinformatics platforms and methods.&lt;br /&gt;
&lt;br /&gt;
# Curate and report on currently circulating pathogens to upload to ARGOS&lt;br /&gt;
## The student will manually curate circulating pathogens to be added to data.argosdb.org, with regular check-ins and reports on what was found.&lt;br /&gt;
## Locate assembly IDs, reads, and metagenomic information for these pathogens to be used in computations and deposited into data.argosdb.org.&lt;br /&gt;
## Provide documentation on why they were curated, why they are important, how they were selected, and how data was collected.&lt;br /&gt;
# QC Analysis using HIVE&lt;br /&gt;
## Analyze the curated pathogens using our QC ARGOS one-click pipeline.&lt;br /&gt;
## The results will be added to our ARGOS database.&lt;br /&gt;
# Report Results&lt;br /&gt;
## Defend the pathogens you have selected for addition to the database. Explain their importance and the value they would bring to the scientific community.&lt;br /&gt;
&lt;br /&gt;
If the student has any other ideas or methods they want to focus on, please reach out to christie.woodside@email.gwu.edu to discuss your idea and check if it will be feasible as a project for the Spring.&lt;br /&gt;
----&lt;br /&gt;
&lt;br /&gt;
=== Requirements for Completion ===&lt;br /&gt;
&#039;&#039;&#039;Note:&#039;&#039;&#039; The following are mandatory. Failure to complete any will result in an incomplete volunteer record.&lt;br /&gt;
&lt;br /&gt;
==== Documentation ====&lt;br /&gt;
All volunteers must maintain adequate documentation of their work, including written protocols and scripts submitted to GitHub.&lt;br /&gt;
&lt;br /&gt;
==== Written Report ====&lt;br /&gt;
Submit a 1–2 page summary of your tasks and accomplishments to the Admin during the final week of your program.&lt;br /&gt;
&lt;br /&gt;
==== Presentation &amp;amp; Slide Submission ====&lt;br /&gt;
Present your work during the last week of the 13-week period.&lt;br /&gt;
&lt;br /&gt;
Slides must be submitted to the POCs.&lt;br /&gt;
----&lt;br /&gt;
&lt;br /&gt;
=== Completion Certificate ===&lt;br /&gt;
A certificate of completion and a letter of recommendation will be provided to all participants who successfully complete the program. Additional recognition will be given to the top three volunteers with exceptional presentations at the end of the program.&lt;br /&gt;
----&lt;br /&gt;
&lt;br /&gt;
=== Contact ===&lt;br /&gt;
mazumder_lab@gwu.edu.&lt;br /&gt;
----&lt;br /&gt;
&lt;br /&gt;
=== Volunteers (TBD) ===&lt;br /&gt;
{| class=&amp;quot;wikitable&amp;quot;&lt;br /&gt;
|+&lt;br /&gt;
!Name&lt;br /&gt;
!Project Assigned&lt;br /&gt;
!POC Assigned&lt;br /&gt;
!Projects Interested&lt;br /&gt;
|-&lt;br /&gt;
|[https://www.linkedin.com/in/diya-kamalabharathy-62557935a/ Diya Kamalabharathy*]&lt;br /&gt;
|PredictMod; GlyGen&lt;br /&gt;
|Lori Krammer; Urnisha Bhuiyan; Rene Ranzinger&lt;br /&gt;
|PredictMod; Glyco web development&lt;br /&gt;
|-&lt;br /&gt;
|Sampurna Chakravorty&lt;br /&gt;
|PredictMod&lt;br /&gt;
|Lori Krammer&lt;br /&gt;
|PredictMod; ARGOS; BiomarkerKB&lt;br /&gt;
|-&lt;br /&gt;
|Vishal Muthusekaran*&lt;br /&gt;
|BiomarkerKB&lt;br /&gt;
|Maria Kim; Cyrus Yeung; Jeet Vora&lt;br /&gt;
|BiomarkerKB&lt;br /&gt;
|-&lt;br /&gt;
|[https://www.linkedin.com/in/ashley-tien/ Ashley Tien*]&lt;br /&gt;
|PredictMod&lt;br /&gt;
|Lori Krammer&lt;br /&gt;
|PredictMod&lt;br /&gt;
|-&lt;br /&gt;
|[https://www.linkedin.com/in/conner-cognata/ Conner Cognata]&lt;br /&gt;
|BiomarkerKB&lt;br /&gt;
|Maria Kim; Cyrus Yeung; Jeet Vora&lt;br /&gt;
|BiomarkerKB&lt;br /&gt;
|-&lt;br /&gt;
|Venya Gulati&lt;br /&gt;
|ARGOS&lt;br /&gt;
|Christie Woodside&lt;br /&gt;
|ARGOS; PredictMod; BiomarkerKB&lt;br /&gt;
|-&lt;br /&gt;
|Isaac Kim&lt;br /&gt;
|GlyGen&lt;br /&gt;
|Urnisha Bhuiyan; Rene Ranzinger&lt;br /&gt;
|PredictMod; GlyGen biocuration; ARGOS&lt;br /&gt;
|-&lt;br /&gt;
|Miao Wang**&lt;br /&gt;
|ARGOS&lt;br /&gt;
|Christie Woodside; Lori Krammer&lt;br /&gt;
|&lt;br /&gt;
|-&lt;br /&gt;
|Vishal Bakshi**&lt;br /&gt;
|PredictMod&lt;br /&gt;
|Lori Krammer&lt;br /&gt;
|&lt;br /&gt;
|}&lt;br /&gt;
&amp;lt;nowiki&amp;gt;*&amp;lt;/nowiki&amp;gt;Returning volunteer.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;nowiki&amp;gt;**&amp;lt;/nowiki&amp;gt;Not directly involved in the semester curriculum; long-term volunteer.&lt;br /&gt;
&lt;br /&gt;
== Spring 2026 Symposium ==&lt;br /&gt;
The Spring symposium will be held virtually.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Date:&#039;&#039;&#039; April 15th, 2026 (Wednesday)&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; 4 - 6 PM&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Zoom Link:&#039;&#039;&#039; [https://gwu-edu.zoom.us/j/93790551366?pwd=C0aN4b95CUbxahO9By6pTj35D9lFIx.1&amp;amp;jst=2#success https://gwu-edu.zoom.us/j/93790551366?pwd=C0aN4b95CUbxahO9By6pTj35D9lFIx.1&amp;amp;jst=2]&lt;br /&gt;
{| class=&amp;quot;wikitable&amp;quot;&lt;br /&gt;
!Time&lt;br /&gt;
!Project&lt;br /&gt;
!Presentation Title&lt;br /&gt;
!Presenter(s)&lt;br /&gt;
|-&lt;br /&gt;
|4:00-4:05 PM&lt;br /&gt;
| colspan=&amp;quot;2&amp;quot; |Welcome &amp;amp; Introduction&lt;br /&gt;
|Raja Mazumder&lt;br /&gt;
|-&lt;br /&gt;
|4:05-4:30 PM&lt;br /&gt;
|GlyGen&lt;br /&gt;
|&lt;br /&gt;
* 8 + 5 mins QA - presentation #1&lt;br /&gt;
* 8 + 5 mins QA - presentation #2&lt;br /&gt;
|Diya Kamalabharathy; Isaac Kim&lt;br /&gt;
|-&lt;br /&gt;
|4:30-4:45 PM&lt;br /&gt;
|ARGOS&lt;br /&gt;
|&lt;br /&gt;
* 8 + 5 mins QA - presentation #1&lt;br /&gt;
|Venya Gulati&lt;br /&gt;
|-&lt;br /&gt;
|4:45-5:10 PM&lt;br /&gt;
|BiomarkerKB&lt;br /&gt;
|&lt;br /&gt;
* 8 + 5 mins QA - presentation #1&lt;br /&gt;
* 8 + 5 mins QA - presentation #2&lt;br /&gt;
|Vishal Muthusekaran; Conner Cognata&lt;br /&gt;
|-&lt;br /&gt;
|5:10-5:30 PM&lt;br /&gt;
|PredictMod&lt;br /&gt;
|&lt;br /&gt;
* 15 + 5 mins QA - group presentation &lt;br /&gt;
|Diya Kamalabharathy; Sampurna Chakravorty; Ashley Tien&lt;br /&gt;
|-&lt;br /&gt;
|5:30-5:45 PM&lt;br /&gt;
| PredictMod&lt;br /&gt;
|&lt;br /&gt;
* 8 + 5 mins QA - presentation&lt;br /&gt;
|Vishal Bakshi&lt;br /&gt;
|-&lt;br /&gt;
|5:45-6:00 PM&lt;br /&gt;
| colspan=&amp;quot;2&amp;quot; |Remarks&lt;br /&gt;
|Raja Mazumder&lt;br /&gt;
|}&lt;/div&gt;</summary>
		<author><name>Jeetvora</name></author>
	</entry>
	<entry>
		<id>https://hivelab.biochemistry.gwu.edu/wiki/index.php?title=Volunteership_Summer_2026&amp;diff=1217</id>
		<title>Volunteership Summer 2026</title>
		<link rel="alternate" type="text/html" href="https://hivelab.biochemistry.gwu.edu/wiki/index.php?title=Volunteership_Summer_2026&amp;diff=1217"/>
		<updated>2026-04-02T21:01:52Z</updated>

		<summary type="html">&lt;p&gt;Jeetvora: /* 1. BiomarkerKB Biocuration Project Ideas */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== 2026 Summer Volunteer Program Details ==&lt;br /&gt;
&lt;br /&gt;
=== Dates ===&lt;br /&gt;
&#039;&#039;&#039;Application Deadline&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
Date TBD | 12:00 PM ET&lt;br /&gt;
&lt;br /&gt;
Please email your updated resume and your projects in order of preference. Acceptance letters/emails will be sent to candidates no later than the day after the kick-off meeting.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Volunteer Zoom Kick-Off Meeting&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
Date TBD | 11:00 AM to 12:00 PM&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Program Dates: June 1, 2026 –  July 31, 2026&#039;&#039;&#039; (9 weeks)&lt;br /&gt;
&lt;br /&gt;
Remote | Hybrid for GW employees and students (Ross Hall 5th floor)&lt;br /&gt;
&lt;br /&gt;
[[Volunteership Spring 2026|Spring 2026 Volunteership]] (Closed)&lt;br /&gt;
----&lt;br /&gt;
&lt;br /&gt;
=== Volunteer Expectations ===&lt;br /&gt;
&lt;br /&gt;
# Minimum commitment of 10 hours per week.&lt;br /&gt;
# Progress updates via Slack at least 3 days per week (scrum).&lt;br /&gt;
# 30-minute Zoom meetings (during regular work hours) once a week or every other week with the assigned project point of contact (POC).&lt;br /&gt;
# Attend some lectures or seminars remotely (max 4-5).&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;&#039;&#039;Important:&#039;&#039;&#039; If the scrum is not updated for 2 consecutive working days, the candidate will be automatically dropped from the program.&#039;&#039;&lt;br /&gt;
----&lt;br /&gt;
&lt;br /&gt;
=== Potential Projects ===&lt;br /&gt;
We are excited to continue our bioinformatics volunteership program in Summer 2026. This program offers students the opportunity to work on bioinformatics projects supported by agencies such as the NIH, ARPA-H, and FDA. Participants will gain exposure to a variety of activities within a bioinformatics lab, including data analysis, computational biology, and genomics. If you are interested, please email &#039;&#039;mazumder_lab@gwu.edu&#039;&#039; your resume and a ranked list of the projects that interest you most. You can also indicate if you want to focus on specific areas that are of interest to you.&lt;br /&gt;
&lt;br /&gt;
# BiomarkerKB (biomarkerkb.org) project: Biomarker curation project. Involves reading papers and collecting biomarkers.&lt;br /&gt;
# GlyGen (glygen.org) project: Review glycomics and glycoproteomics data and curate tissue, disease, and other related information.&lt;br /&gt;
# ARGOS (argosdb.org) project: Analyze genomics data using HIVE to identify reference genome assemblies.&lt;br /&gt;
# PredictMod (hivelab.biochemistry.gwu.edu/predictmod) project: Curate PMIDs to train LLM recommendation of intervention outcome prediction datasets.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;Note: Individuals involved in the above projects with a background in programming and/or machine learning may also undertake additional tasks to support the development of ML models, which can be integrated into PredictMod or used to enhance AI/ML-ready datasets within GlyGen. &amp;lt;u&amp;gt;We are also looking for individuals who have previously worked with us to take on a coordinator role&amp;lt;/u&amp;gt;.&#039;&#039;&lt;br /&gt;
----&lt;br /&gt;
&lt;br /&gt;
==== 1. BiomarkerKB Biocuration Project Ideas ====&lt;br /&gt;
&lt;br /&gt;
# Review existing published biomarkers for correctness and validity&lt;br /&gt;
#* Validate biomarker–disease associations using primary literature&lt;br /&gt;
#* Assess evidence strength&lt;br /&gt;
#* Identify outdated, conflicting, or unsupported biomarker claims&lt;br /&gt;
# Biocurate biomarkers from publications based on disease and entity type&lt;br /&gt;
#* Identify and curate novel biomarkers from recent publications&lt;br /&gt;
#* Standardize biomarker representation using controlled vocabularies and ontologies&lt;br /&gt;
#* Classify biomarkers by type and disease context&lt;br /&gt;
# Review and Map Electronic Health Records Normal Entity Data&lt;br /&gt;
#* Identify relevant EHR data elements (lab tests, diagnoses, procedures)&lt;br /&gt;
#* Map entities to standard terminologies (e.g., SNOMED CT, LOINC, ICD codes)&lt;br /&gt;
#* Resolve ambiguities and inconsistencies in mapping and clinical terminology&lt;br /&gt;
# Continue working on LLM methods started by previous volunteers.&lt;br /&gt;
#* The data is available as well as some preliminary research and work done by previous volunteers in this area.&lt;br /&gt;
&lt;br /&gt;
If the student has any other ideas or would like to know more, please reach out to jeetvora@gwu.edu&lt;br /&gt;
&lt;br /&gt;
==== 2. GlyGen Biocuration Project Ideas ====&lt;br /&gt;
POC: Rene Ranzinger, Kate Warner, and Urnisha Bhuiyan&lt;br /&gt;
&lt;br /&gt;
Over the last three decades, numerous glycomics database projects have been initiated to collect valuable information about glycans, proteins, and their interactions. Some of these databases have been discontinued due to the end of project funding; however, the data contained within them remains highly valuable to the research community. Integrating these legacy datasets into modern databases or knowledgebases, such as GlyGen, presents a significant challenge because much of the associated metadata (e.g., species, tissue, disease, cell line) is recorded as free-text that does not conform to the standardized dictionaries and ontologies used by current resources.&lt;br /&gt;
&lt;br /&gt;
To address this challenge, this project will leverage large language models (LLMs) to automate the mapping of free-text metadata from legacy databases, specifically CarbBank and CFG, to standardized accessions in authoritative resources such as NCBI Taxonomy, Disease Ontology, and Cellosaurus. The LLM-based workflow will identify and normalize synonyms, abbreviations, and spelling variants (e.g., “human,” “man,” or “h. sapiens” mapped to Homo sapiens), enabling scalable and reproducible metadata harmonization that would otherwise require extensive manual curation. The LLM tasks will be performed using OpenAI resources integrated into the GlyGen curation pipeline. The project involves developing Python scripts to read and write data, invoke the OpenAI API, and compare results with manually curated data. Another aspect of the work is developing and fine-tuning a prompt for ChatGPT to ensure that the mapping is reliable and accurate.&lt;br /&gt;
&lt;br /&gt;
While the mapping process will be largely automated, manual validation will be incorporated as a quality-control step to assess model performance, verify correctness, and identify edge cases requiring refinement. This hybrid approach significantly reduces curator burden while ensuring high-quality, ontology-aligned annotations.&lt;br /&gt;
&lt;br /&gt;
The goal of this effort is to migrate and modernize datasets from CarbBank and CFG, making them interoperable with GlyGen and other contemporary glycoinformatics resources through a scalable, AI-assisted curation strategy.&lt;br /&gt;
&lt;br /&gt;
For any questions, please contact Rene Ranzinger (rene@ccrc.uga.edu) or Kate Warner (k.warner1@email.gwu.edu).&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;3. GlyGen Publication Analysis Project Ideas&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
POC: Rene Ranzinger, Kate Warner, and Urnisha Bhuiyan&lt;br /&gt;
&lt;br /&gt;
One of the challenges for any bioinformatics project is understanding the size of its community, how well the project serves this community, and how widely its software/database is used. A potential solution is to analyze PubMed publication data. We are seeking applicants with programming skills (in Python or Java) to perform this analysis.&lt;br /&gt;
&lt;br /&gt;
The project involves:&lt;br /&gt;
&lt;br /&gt;
# Using the PubMed web API to filter publications based on keywords.&lt;br /&gt;
# Analyzing paper abstracts to identify research institutions and groups that form the community.&lt;br /&gt;
# Filtering the community list to exclude unrelated co-authors&lt;br /&gt;
&lt;br /&gt;
A subproject will involve analyzing the full text of papers (when available) for keywords or resource and database names. The results of the analysis will be discussed with GlyGen project members, who will suggest changes and improvements to the analysis and data presentation. Source code developed as part of this project will be documented and shared in a public GitHub repository. If you have any other ideas or methods you would like to explore, please reach out to rene@ccrc.uga.edu to discuss them.&lt;br /&gt;
&lt;br /&gt;
==== 4. PredictMod Machine Learning (ML) Modeling Project ====&lt;br /&gt;
POC: Lori Krammer, Pat McNeely (optional)&lt;br /&gt;
&lt;br /&gt;
Volunteers will conduct ML modeling using publicly-available -omics datasets that were previously identified (see our [[Recommended Publications for Intervention Outcome Prediction Models|Recommended Publications for IOPMs]] page). This volunteership will involve data harmonization, model training, and pipeline documentation.&lt;br /&gt;
&lt;br /&gt;
Tasks associated with this project include:&lt;br /&gt;
&lt;br /&gt;
# Exploring and understanding the data found in relevant PMIDs that can be used to train intervention outcome prediction models.&lt;br /&gt;
# Preparing the data for model training and model performance evaluation&lt;br /&gt;
# Testing the modeling tutorial, PredictMod platform, and associated project tools&lt;br /&gt;
# Documentation of the ML pipeline and testing results&lt;br /&gt;
&lt;br /&gt;
Deliverables for this project include:&lt;br /&gt;
&lt;br /&gt;
# ML-ready datasets&lt;br /&gt;
# Trained model scripts&lt;br /&gt;
# Pipeline documentation captured in BioCompute Objects (BCOs) and testing reports&lt;br /&gt;
# Volunteership documentation (final report, progress updates, symposium presentation)&lt;br /&gt;
&lt;br /&gt;
Interested individuals should reach out to lorikrammer@gwu.edu. Please note that this project requires attendance at biweekly meetings and a final presentation of your work.&lt;br /&gt;
&lt;br /&gt;
==== 5. BioCompute Objects User Research Project ====&lt;br /&gt;
POC: Lori Krammer, Pat McNeely&lt;br /&gt;
&lt;br /&gt;
Volunteers will conduct individual audits and user research to improve the human readability of BioCompute Objects (BCOs) and the project documentation. This volunteership will involve user research, prototyping, and documentation.&lt;br /&gt;
&lt;br /&gt;
Tasks associated with the project include:&lt;br /&gt;
&lt;br /&gt;
# Reviewing existing documentation to gain a comprehensive understanding of BioCompute Objects, their relevance to bioinformatics, and key user personas. The volunteer will identify and report gaps in the current documentation.&lt;br /&gt;
# Conducting user research to understand pain points and desired outcomes. The volunteer will develop user stories based on interviews with BCO users.&lt;br /&gt;
# Prototyping improvements to the BCO documentation and/or portal based on user stories. This could involve visual diagrams, wiki restructuring, or decision logs.&lt;br /&gt;
&lt;br /&gt;
Deliverables will include:&lt;br /&gt;
&lt;br /&gt;
# User research report with user story maps&lt;br /&gt;
# BCO documentation improvement plan&lt;br /&gt;
# Volunteership documentation (final report, progress updates, symposium presentation)&lt;br /&gt;
&lt;br /&gt;
Interested individuals should reach out to lorikrammer@gwu.edu. Please note that this project requires attendance at biweekly meetings and a final presentation of your work.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;6. FDA-ARGOS Computation and Pathogen Curation Project&#039;&#039;&#039;&lt;br /&gt;
----&lt;br /&gt;
&lt;br /&gt;
=== Requirements for Completion ===&lt;br /&gt;
&#039;&#039;&#039;Note:&#039;&#039;&#039; The following are mandatory. Failure to complete any will result in an incomplete volunteer record.&lt;br /&gt;
&lt;br /&gt;
==== Documentation ====&lt;br /&gt;
All volunteers must maintain adequate documentation of their work, including written protocols and scripts submitted to GitHub.&lt;br /&gt;
&lt;br /&gt;
==== Written Report ====&lt;br /&gt;
Submit a 1–2 page summary of your tasks and accomplishments to the Admin during the final week of your program.&lt;br /&gt;
&lt;br /&gt;
==== Presentation &amp;amp; Slide Submission ====&lt;br /&gt;
Present your work during the last week of the 9-week period.&lt;br /&gt;
&lt;br /&gt;
Slides must be submitted to the POCs.&lt;br /&gt;
----&lt;br /&gt;
&lt;br /&gt;
=== Completion Certificate ===&lt;br /&gt;
A certificate of completion and a letter of recommendation will be provided to all participants who successfully complete the program. Additional recognition will be given to the top three volunteers with exceptional presentations at the end of the program.&lt;br /&gt;
----&lt;br /&gt;
&lt;br /&gt;
=== Contact ===&lt;br /&gt;
mazumder_lab@gwu.edu.&lt;br /&gt;
----&lt;br /&gt;
&lt;br /&gt;
=== Volunteers (TBD) ===&lt;br /&gt;
{| class=&amp;quot;wikitable&amp;quot;&lt;br /&gt;
|+&lt;br /&gt;
!Name&lt;br /&gt;
!Project Assigned&lt;br /&gt;
!POC Assigned&lt;br /&gt;
!Projects Interested&lt;br /&gt;
|-&lt;br /&gt;
|Sahana Adusumilli&lt;br /&gt;
|BiomarkerKB&lt;br /&gt;
|Jeet Vora&lt;br /&gt;
|Review EHR Normal Ranges&lt;br /&gt;
|-&lt;br /&gt;
|Abhirama Chillara&lt;br /&gt;
|BiomarkerKB&lt;br /&gt;
|Jeet Vora/Maria&lt;br /&gt;
|TBD&lt;br /&gt;
|-&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|-&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|-&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|-&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|-&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|-&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|-&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|}&lt;br /&gt;
&amp;lt;nowiki&amp;gt;*&amp;lt;/nowiki&amp;gt;Returning volunteer.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;nowiki&amp;gt;**&amp;lt;/nowiki&amp;gt;Not directly involved in the semester curriculum; long-term volunteer.&lt;br /&gt;
&lt;br /&gt;
== Summer 2026 Symposium ==&lt;br /&gt;
The Summer symposium will be held virtually.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Date:&#039;&#039;&#039; TBD&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Time:&#039;&#039;&#039; 4 - 6 PM&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Zoom Link&#039;&#039;&#039; - TBA&lt;br /&gt;
&lt;br /&gt;
=== Agenda (All times are in Eastern Time) ===&lt;br /&gt;
{| class=&amp;quot;wikitable&amp;quot;&lt;br /&gt;
!Time&lt;br /&gt;
!Project&lt;br /&gt;
!Presentation Title&lt;br /&gt;
!Presenter(s)&lt;br /&gt;
|-&lt;br /&gt;
|&lt;br /&gt;
| colspan=&amp;quot;2&amp;quot; |&lt;br /&gt;
|&lt;br /&gt;
|-&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|-&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|-&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|-&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|-&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|&lt;br /&gt;
|-&lt;br /&gt;
|&lt;br /&gt;
| colspan=&amp;quot;2&amp;quot; |&lt;br /&gt;
|&lt;br /&gt;
|}&lt;/div&gt;</summary>
		<author><name>Jeetvora</name></author>
	</entry>
	<entry>
		<id>https://hivelab.biochemistry.gwu.edu/wiki/index.php?title=Projects&amp;diff=1076</id>
		<title>Projects</title>
		<link rel="alternate" type="text/html" href="https://hivelab.biochemistry.gwu.edu/wiki/index.php?title=Projects&amp;diff=1076"/>
		<updated>2025-10-23T19:27:21Z</updated>

		<summary type="html">&lt;p&gt;Jeetvora: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;{{DISPLAYTITLE:&amp;lt;span style=&amp;quot;position: absolute; clip: rect(1px 1px 1px 1px); clip: rect(1px, 1px, 1px, 1px);&amp;quot;&amp;gt;{{FULLPAGENAME}}&amp;lt;/span&amp;gt;}}&lt;br /&gt;
__NOTOC__&lt;br /&gt;
&amp;lt;!-- BANNER ACROSS TOP OF PAGE --&amp;gt;&lt;br /&gt;
&amp;lt;div id=&amp;quot;ggw-topbanner&amp;quot; style=&amp;quot;clear:both; position:relative; box-sizing:border-box; width:100%; margin:1.2em 0 6px; min-width:47em; border:1px solid #ddd; background-color:#f9f9f9; color:#000;&amp;quot;&amp;gt;&lt;br /&gt;
    &amp;lt;div style=&amp;quot;margin:0.4em; text-align:center;&amp;quot;&amp;gt;&lt;br /&gt;
        &amp;lt;div style=&amp;quot;font-size:160%; padding:.1em;&amp;quot;&amp;gt;Current Projects&amp;lt;/div&amp;gt;&lt;br /&gt;
        &amp;lt;/div&amp;gt;&lt;br /&gt;
    &amp;lt;/div&amp;gt;&lt;br /&gt;
&amp;lt;/div&amp;gt;&lt;br /&gt;
&amp;lt;div style=&amp;quot;clear: both;&amp;quot;&amp;gt;&amp;lt;/div&amp;gt;&lt;br /&gt;
 &lt;br /&gt;
&amp;lt;div id=&amp;quot;ggw_row2&amp;quot; style=&amp;quot;display: flex; flex-flow: row wrap; justify-content: space-between; padding: 0; margin: 0 -5px 0 -5px;&amp;quot;&amp;gt;&lt;br /&gt;
    &amp;lt;div style=&amp;quot;flex: 1; margin: 5px; min-width: 210px; border: 1px solid #CCC;	padding: 0 10px 10px 10px; box-shadow: 0 2px 2px rgba(0,0,0,0.1); background: #f5faff;&amp;quot;&amp;gt;&lt;br /&gt;
        &amp;lt;h3&amp;gt;[https://hive.biochemistry.gwu.edu/dna.cgi?cmd=main The High-performance Integrated Virtual Environment (HIVE) platform]&amp;lt;/h3&amp;gt;&lt;br /&gt;
        &amp;lt;div style=&amp;quot;border-top: 1px solid #CCC; padding-top: 0.5em;&amp;quot;&amp;gt;&lt;br /&gt;
HIVE is a cloud-based environment optimized for the storage and analysis of extra-large data, such as biomedical data, clinical data, next-generation sequencing (NGS) data, mass spectrometry files, confocal microscopy images, post-market surveillance data, medical recall data, and many others. HIVE provides secure web access for authorized users to deposit, retrieve, annotate and compute on Big Data, and analyze the outcomes using web user interfaces. [https://docs.google.com/document/d/1F5iq00uKkJfdSsbwanvKOy-nPnwijH56mwbwa_HhzfY/edit?tab=t.0#heading=h.7dlfmngwfzih More here].&lt;br /&gt;
&lt;br /&gt;
The HIVE platform and associated algorithms, such as CensuScope and HIVE-Hexagon, are used to support the metagenomics analysis infrastructure.&amp;lt;br&amp;gt;&amp;lt;br&amp;gt;&lt;br /&gt;
[[GW-HIVE WIKI]]&lt;br /&gt;
&lt;br /&gt;
[[METAGENOMICS WIKI]]&lt;br /&gt;
        &lt;br /&gt;
&amp;lt;/div&amp;gt;&lt;br /&gt;
    &amp;lt;/div&amp;gt;&lt;br /&gt;
	&amp;lt;div style=&amp;quot;flex: 1; margin: 5px; min-width: 210px; border: 1px solid #CCC;	padding: 0 10px 10px 10px; box-shadow: 0 2px 2px rgba(0,0,0,0.1); background: #f5faff;&amp;quot;&amp;gt;&lt;br /&gt;
        &amp;lt;h3&amp;gt;[https://data.argosdb.org/ FDA-ARGOS Project (Food and Drug Administration-dAtabase for Regulatory-Grade micrObial Sequences)]&amp;lt;/h3&amp;gt;&lt;br /&gt;
        &amp;lt;div style=&amp;quot;border-top: 1px solid #CCC; padding-top: 0.5em;&amp;quot;&amp;gt;&lt;br /&gt;
The FDA-ARGOS Project (Food and Drug Administration-dAtabase for Regulatory-Grade micrObial Sequences) is a collaborative effort to create a high-quality genomic database for identifying and characterizing microbial pathogens. Developed in partnership with the FDA, University of Maryland, and NCBI, the project provides regulatory-grade genomic data, crucial for public health and diagnostic use. Expanded in 2021 with support from GWU, Temple University, and Embleema, FDA-ARGOS aims to enhance infectious disease research through rigorous quality control protocols. ArgosDB hosts this data, offering downloadable sequences and reproducible workflows for research and regulatory applications. [https://www.fda.gov/medical-devices/science-and-research-medical-devices/database-reference-grade-microbial-sequences-fda-argos More here].&amp;lt;br&amp;gt;&amp;lt;br&amp;gt;&lt;br /&gt;
[[FDA-ARGOS WIKI]]&lt;br /&gt;
&amp;lt;/div&amp;gt;&lt;br /&gt;
    &amp;lt;/div&amp;gt;&lt;br /&gt;
&lt;br /&gt;
    &amp;lt;div style=&amp;quot;flex: 1; margin: 5px; min-width: 210px; border: 1px solid #CCC;	padding: 0 10px 10px 10px; box-shadow: 0 2px 2px rgba(0,0,0,0.1); background: #f5faff;&amp;quot;&amp;gt;&lt;br /&gt;
        &amp;lt;h3&amp;gt;[https://www.biocomputeobject.org/ BioCompute Objects (BCO)]&amp;lt;/h3&amp;gt;&lt;br /&gt;
        &amp;lt;div style=&amp;quot;border-top: 1px solid #CCC; padding-top: 0.5em;&amp;quot;&amp;gt;&lt;br /&gt;
BioCompute is an FDA-funded project to establish a framework for community-based development of standards for harmonization of High-throughput Sequencing (HTS), standardization of data formats, promotion of interoperability, and bioinformatics verification protocols. The BioCompute Object (BCO) was developed as part of the High-throughput Sequencing Computational Standards for Regulatory Sciences (HTS-CSRS) initiative, in the BioCompute Objects Portal (BOP), a web portal that serves as a collaborative space for dialogue on interoperability between different bioinformatic pipelines, industries, and developers. HIVE capabilities have been leveraged to support the development of the BCO. The BCO is versatile and adaptable to other common HTS analysis platforms. [https://docs.google.com/document/d/1WQFZm_PFiQXob4NyOKq6y-2ywnbmNoFHSS27fYf3l4Y/edit?tab=t.0#heading=h.bs8eki17tykx More here].&amp;lt;br&amp;gt;&amp;lt;br&amp;gt;&lt;br /&gt;
[https://wiki.biocomputeobject.org/Main_Page BIOCOMPUTE OBJECTS WIKI]&lt;br /&gt;
&amp;lt;/div&amp;gt;&lt;br /&gt;
    &amp;lt;/div&amp;gt;&lt;br /&gt;
    &amp;lt;div style=&amp;quot;flex: 1; margin: 5px; min-width: 210px; border: 1px solid #CCC;	padding: 0 10px 10px 10px; box-shadow: 0 2px 2px rgba(0,0,0,0.1); background: #f5faff;&amp;quot;&amp;gt;&lt;br /&gt;
        &amp;lt;h3&amp;gt;[https://www.glygen.org/ GlyGen]&amp;lt;/h3&amp;gt;&lt;br /&gt;
	&amp;lt;div style=&amp;quot;border-top: 1px solid #CCC; padding-top: 0.5em;&amp;quot;&amp;gt;&lt;br /&gt;
GlyGen (gly-glycobiology; gen-information; https://www.glygen.org/) is an advanced glycoinformatics resource developed to facilitate discovery in basic and translational glycobiology research and to enhance the integration of multidisciplinary information from diverse resources. GlyGen includes knowledge about molecular, biophysical, and functional properties of glycans, genes, and proteins organized in pathways and ontologies, plus a rapidly growing body of biological big data related to cancer mutation and expression. GlyGen adopts an innovative user-driven approach for implementing and prioritizing knowledge-dissemination tools that address the questions and needs of the glycobiology community. GlyGen is funded by the National Institute of General Medical Sciences under grant # 1R24GM146616-01 and the National Institutes of Health Office of Strategic Coordination - The Common Fund under grant # 1OT2OD032092. More information about GlyGen: https://www.glygen.org/about/ &amp;lt;br&amp;gt;&amp;lt;br&amp;gt;&lt;br /&gt;
[https://wiki.glygen.org/Main_Page GlyGen WIKI]&lt;br /&gt;
        &amp;lt;/div&amp;gt;&lt;br /&gt;
    &amp;lt;/div&amp;gt;&lt;br /&gt;
&amp;lt;/div&amp;gt;&amp;lt;div id=&amp;quot;ggw_row3&amp;quot; style=&amp;quot;display: flex; flex-flow: row wrap; justify-content: space-between; padding: 0; margin: 0 -5px 0 -5px;&amp;quot;&amp;gt;&lt;br /&gt;
    &amp;lt;div style=&amp;quot;flex: 1; margin: 5px; min-width: 210px; border: 1px solid #CCC;	padding: 0 10px 10px 10px; box-shadow: 0 2px 2px rgba(0,0,0,0.1); background: #f5faff;&amp;quot;&amp;gt;&lt;br /&gt;
        &amp;lt;h3&amp;gt;[https://hivelab.biochemistry.gwu.edu/predictmod PredictMod]&amp;lt;/h3&amp;gt;&lt;br /&gt;
        &amp;lt;div style=&amp;quot;border-top: 1px solid #CCC; padding-top: 0.5em;&amp;quot;&amp;gt;&lt;br /&gt;
PredictMod is an application designed to predict the outcome of an intervention before a patient begins treatment. Our goal is to provide clinicians with a powerful decision-making tool that enhances clinical understanding of patient-level data. The PredictMod platform utilizes machine learning tools and complex datasets based on electronic health records, gut microbiome, and -omics data to forecast patient outcomes, often in response to treatment for a particular condition. While our primary condition of interest is prediabetes, the tool is designed to be used for a variety of conditions, interventions, and data types.&amp;lt;br&amp;gt;&amp;lt;br&amp;gt;&lt;br /&gt;
[https://hivelab.biochemistry.gwu.edu/wiki/PredictMod PredictMod WIKI]&lt;br /&gt;
&amp;lt;/div&amp;gt;&lt;br /&gt;
    &amp;lt;/div&amp;gt;&lt;br /&gt;
    &amp;lt;div style=&amp;quot;flex: 1; margin: 5px; min-width: 210px; border: 1px solid #CCC;	padding: 0 10px 10px 10px; box-shadow: 0 2px 2px rgba(0,0,0,0.1); background: #f5faff;&amp;quot;&amp;gt;&lt;br /&gt;
        &amp;lt;h3&amp;gt;[[GW-FEAST]]&amp;lt;/h3&amp;gt;&lt;br /&gt;
        &amp;lt;div style=&amp;quot;border-top: 1px solid #CCC; padding-top: 0.5em;&amp;quot;&amp;gt;&lt;br /&gt;
The GW Federated Ecosystems for Analytics and Standardized Technologies (GW-FEAST) project is part of the ARPA-H FEAST performer team initiative that includes academic and industry partners. The goal of the ARPA-H performer teams is “to create bridges across data silos to make health data more accessible and usable”. &amp;lt;br&amp;gt;&amp;lt;br&amp;gt;&lt;br /&gt;
[https://hivelab.biochemistry.gwu.edu/wiki/GW-FEAST GW-FEAST WIKI]&lt;br /&gt;
&amp;lt;/div&amp;gt;&lt;br /&gt;
    &amp;lt;/div&amp;gt;&lt;br /&gt;
    &amp;lt;div style=&amp;quot;flex: 1; margin: 5px; min-width: 210px; border: 1px solid #CCC;	padding: 0 10px 10px 10px; box-shadow: 0 2px 2px rgba(0,0,0,0.1); background: #f5faff;&amp;quot;&amp;gt;&lt;br /&gt;
        &amp;lt;h3&amp;gt;[https://biomarkerkb.org/ Biomarker Knowledgebase]&amp;lt;/h3&amp;gt;&lt;br /&gt;
        &amp;lt;div style=&amp;quot;border-top: 1px solid #CCC; padding-top: 0.5em;&amp;quot;&amp;gt;&lt;br /&gt;
The Biomarker Partnership is a CFDE-sponsored project to develop a knowledgebase that will organize and integrate biomarker data from different public sources. The data will be connected to contextual information to show a novel systems-level view of biomarkers. The motivation for this project is to improve the harmonization and organization of biomarker data. This will be done by mapping biomarkers from public sources to, and across, CF data elements. This mapping will bridge knowledge across multiple DCCs and biomedical disciplines.&amp;lt;br&amp;gt;&amp;lt;br&amp;gt;&lt;br /&gt;
[https://wiki.biomarkerkb.org/Main_Page BioMarkerKB WIKI]&lt;br /&gt;
&amp;lt;/div&amp;gt;&lt;br /&gt;
    &amp;lt;/div&amp;gt;&lt;br /&gt;
&amp;lt;/div&amp;gt;&lt;br /&gt;
&amp;lt;!-- BANNER ACROSS TOP OF PAGE --&amp;gt;&lt;br /&gt;
&amp;lt;div id=&amp;quot;ggw-topbanner&amp;quot; style=&amp;quot;clear:both; position:relative; box-sizing:border-box; width:100%; margin:1.2em 0 6px; min-width:47em; border:1px solid #ddd; background-color:#f9f9f9; color:#000;&amp;quot;&amp;gt;&lt;br /&gt;
    &amp;lt;div style=&amp;quot;margin:0.4em; text-align:center;&amp;quot;&amp;gt;&lt;br /&gt;
        &amp;lt;div style=&amp;quot;font-size:160%; padding:.1em;&amp;quot;&amp;gt;Past Projects&amp;lt;/div&amp;gt;&lt;br /&gt;
        &amp;lt;/div&amp;gt;&lt;br /&gt;
    &amp;lt;/div&amp;gt;&lt;br /&gt;
&amp;lt;/div&amp;gt;&lt;br /&gt;
&amp;lt;div style=&amp;quot;clear: both;&amp;quot;&amp;gt;&amp;lt;/div&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;div id=&amp;quot;ggw_row2&amp;quot; style=&amp;quot;display: flex; flex-flow: row wrap; justify-content: space-between; padding: 0; margin: 0 -5px 0 -5px;&amp;quot;&amp;gt;&lt;br /&gt;
    &amp;lt;div style=&amp;quot;flex: 1; margin: 5px; min-width: 210px; border: 1px solid #CCC;	padding: 0 10px 10px 10px; box-shadow: 0 2px 2px rgba(0,0,0,0.1); background: #f5faff;&amp;quot;&amp;gt;&lt;br /&gt;
        &amp;lt;h3&amp;gt;[https://hivelab.tst.biochemistry.gwu.edu/gfkb Gut Microbiome Analytic System (Microbiome)]&amp;lt;/h3&amp;gt;&lt;br /&gt;
        &amp;lt;div style=&amp;quot;border-top: 1px solid #CCC; padding-top: 0.5em;&amp;quot;&amp;gt;&lt;br /&gt;
The HIVE team received NSF funding to develop a Gut Microbiome Monitoring System (GutFeeling), a tool that, when used over time, will allow users to adjust their diet (such as consumption of probiotics and prebiotics) and other lifestyle habits and help restore their normal microbiome. Rapid analysis of large amounts of metagenomic data, a major bottleneck, has been addressed by our group through the development of a novel algorithm and accompanying software called CensuScope. Through analysis of healthy gut microbiome data, we are actively developing a Knowledge Base (GutFeelingKB) to provide a clearer picture of an ideal personalized microbiome and to establish baseline characteristics for each customer. The Mazumder Lab is collaborating with the Milken School of Public Health and Kamtek Sequencing Facility to investigate the relationship between bacterial species commonly present in the digestive tract, diet, physical activity, lifestyle habits, and metabolic risk factors. [https://docs.google.com/document/d/18WyVTJrrf-FR0sHt634vO8Lwel-4OQxP9sNar7gYYro/edit?tab=t.0#heading=h.7qbm3f7lky31 More here].&lt;br /&gt;
        &amp;lt;/div&amp;gt;&lt;br /&gt;
    &amp;lt;/div&amp;gt;&lt;br /&gt;
	&lt;br /&gt;
    &amp;lt;div style=&amp;quot;flex: 1; margin: 5px; min-width: 210px; border: 1px solid #CCC;	padding: 0 10px 10px 10px; box-shadow: 0 2px 2px rgba(0,0,0,0.1); background: #f5faff;&amp;quot;&amp;gt;&lt;br /&gt;
        &amp;lt;h3&amp;gt;HIVE-EQAPOL Project on HIVE NGS Data Processing and Analysis&amp;lt;/h3&amp;gt;&lt;br /&gt;
        &amp;lt;div style=&amp;quot;border-top: 1px solid #CCC; padding-top: 0.5em;&amp;quot;&amp;gt;&lt;br /&gt;
For this project, our group works closely with the External Quality Assurance Program Oversight Laboratory (EQAPOL) team to analyze, store, and track HIV NGS data. Reliable identification of strains is critical for developing new assays, validating assay platforms, assisting regulators in evaluating test kits, monitoring HIV drug resistance, and informing vaccine development. The HIVE tools and platform are used for virus identification, recombination analysis, and clone discovery.&lt;br /&gt;
&amp;lt;/div&amp;gt;&lt;br /&gt;
    &amp;lt;/div&amp;gt;&lt;br /&gt;
    &amp;lt;div style=&amp;quot;flex: 1; margin: 5px; min-width: 210px; border: 1px solid #CCC;	padding: 0 10px 10px 10px; box-shadow: 0 2px 2px rgba(0,0,0,0.1); background: #f5faff;&amp;quot;&amp;gt;&lt;br /&gt;
        &amp;lt;h3&amp;gt;[https://www.oncomx.org/ OncoMX]&amp;lt;/h3&amp;gt;&lt;br /&gt;
	&amp;lt;div style=&amp;quot;border-top: 1px solid #CCC; padding-top: 0.5em;&amp;quot;&amp;gt;&lt;br /&gt;
The OncoMX mission is to create an integrated cancer mutation and expression resource for exploring cancer biomarkers. OncoMX is a collaboration between the George Washington University (GW), NASA&#039;s Jet Propulsion Laboratory (JPL), the Swiss Institute of Bioinformatics (SIB), and the University of Delaware (UD). The core knowledgebase of OncoMX is derived from BioMuta and BioXpress integrated cancer mutation and expression databases which are actively maintained. Normal expression data from Bgee and custom text mining software augment the cancer data to improve functional interpretation of the reported variants and expression profiles. All data are wrapped into the OncoMX database and web portal, mapped to additional functional information from NCI Early Detection Research Network (EDRN) and Reactome. It is expected that the large-scale integration of cancer data and supporting information, provided by OncoMX with direct community feedback, will benefit cancer research by improving synthesis of information and may make earlier detection a reality.&lt;br /&gt;
        &amp;lt;/div&amp;gt;&lt;br /&gt;
    &amp;lt;/div&amp;gt;&lt;br /&gt;
&amp;lt;div style=&amp;quot;flex: 1; margin: 5px; min-width: 210px; border: 1px solid #CCC;	padding: 0 10px 10px 10px; box-shadow: 0 2px 2px rgba(0,0,0,0.1); background: #f5faff;&amp;quot;&amp;gt;&lt;br /&gt;
        &amp;lt;h3&amp;gt;[https://hive.biochemistry.gwu.edu/dna.cgi?cmd=main Glycoproteomics Characterization Workflow and Data-Analysis Pipeline for Vaccines and Biosimilars]&amp;lt;/h3&amp;gt;&lt;br /&gt;
	&amp;lt;div style=&amp;quot;border-top: 1px solid #CCC; padding-top: 0.5em;&amp;quot;&amp;gt;&lt;br /&gt;
In this FDA-funded project, we are extending High-performance Integrated Virtual Environment (HIVE) capabilities through the development and integration of software tools and datasets for comparative analysis of glycoproteins. Glycomic analysis has many angles and has been extensively reviewed in recent literature. We propose to rely on the independent development of the glycomics field and incorporate these approaches in the HIVE pipeline as they mature while we develop a standardized glycoinformatics pipeline that will benefit investigators and regulators at the FDA.&lt;br /&gt;
        &amp;lt;/div&amp;gt;&lt;br /&gt;
    &amp;lt;/div&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/div&amp;gt;&lt;br /&gt;
&amp;lt;!-- BANNER ACROSS TOP OF PAGE --&amp;gt;&lt;br /&gt;
&amp;lt;div id=&amp;quot;ggw-topbanner&amp;quot; style=&amp;quot;clear:both; position:relative; box-sizing:border-box; width:100%; margin:1.2em 0 6px; min-width:47em; border:1px solid #ddd; background-color:#f9f9f9; color:#000;&amp;quot;&amp;gt;&lt;br /&gt;
    &amp;lt;div style=&amp;quot;margin:0.4em; text-align:center;&amp;quot;&amp;gt;&lt;br /&gt;
        &amp;lt;div style=&amp;quot;font-size:160%; padding:.1em;&amp;quot;&amp;gt;RESOURCES&amp;lt;/div&amp;gt;&lt;br /&gt;
        &amp;lt;/div&amp;gt;&lt;br /&gt;
    &amp;lt;/div&amp;gt;&lt;br /&gt;
&amp;lt;/div&amp;gt;&lt;br /&gt;
&amp;lt;div style=&amp;quot;clear: both;&amp;quot;&amp;gt;&amp;lt;/div&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;div style=&amp;quot;clear: both;&amp;quot;&amp;gt;&lt;br /&gt;
        &amp;lt;div style=&amp;quot;border-top: 1px solid #CCC; padding-top: 0.5em;&amp;quot;&amp;gt;&amp;lt;div id=&amp;quot;ggw_row3&amp;quot; style=&amp;quot;display: flex; flex-flow: row wrap; justify-content: space-between; padding: 0; margin: 0 -5px 0 -5px;&amp;quot;&amp;gt;&lt;br /&gt;
    &amp;lt;div style=&amp;quot;flex: 1; margin: 5px; min-width: 210px; border: 1px solid #CCC;	padding: 0 10px 10px 10px; box-shadow: 0 2px 2px rgba(0,0,0,0.1); background: #f5faff;&amp;quot;&amp;gt;&lt;br /&gt;
        &amp;lt;h3&amp;gt;[[Tool Resources]]&amp;lt;/h3&amp;gt;&lt;br /&gt;
        &amp;lt;div style=&amp;quot;border-top: 1px solid #CCC; padding-top: 0.5em;&amp;quot;&amp;gt;&lt;br /&gt;
&amp;amp;nbsp;&amp;amp;nbsp;&amp;amp;nbsp;&amp;amp;nbsp;&#039;&#039;&#039;&#039;&#039;Main article:&#039;&#039;&#039; [[Tool Resources]]&#039;&#039;&amp;lt;br&amp;gt;There are a variety of bioinformatic tool resources developed by our team.&lt;br /&gt;
&amp;lt;br /&amp;gt;&lt;br /&gt;
&amp;lt;/div&amp;gt;&lt;br /&gt;
    &amp;lt;/div&amp;gt;&lt;br /&gt;
&amp;lt;/div&amp;gt;&lt;br /&gt;
&amp;lt;/div&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;div style=&amp;quot;clear: both;&amp;quot;&amp;gt;&lt;br /&gt;
        &amp;lt;div style=&amp;quot;border-top: 1px solid #CCC; padding-top: 0.5em;&amp;quot;&amp;gt;&amp;lt;div id=&amp;quot;ggw_row3&amp;quot; style=&amp;quot;display: flex; flex-flow: row wrap; justify-content: space-between; padding: 0; margin: 0 -5px 0 -5px;&amp;quot;&amp;gt;&lt;br /&gt;
    &amp;lt;div style=&amp;quot;flex: 1; margin: 5px; min-width: 210px; border: 1px solid #CCC;	padding: 0 10px 10px 10px; box-shadow: 0 2px 2px rgba(0,0,0,0.1); background: #f5faff;&amp;quot;&amp;gt;&lt;br /&gt;
        &amp;lt;h3&amp;gt;[[Dataset Resources]]&amp;lt;/h3&amp;gt;&lt;br /&gt;
        &amp;lt;div style=&amp;quot;border-top: 1px solid #CCC; padding-top: 0.5em;&amp;quot;&amp;gt;&lt;br /&gt;
&amp;amp;nbsp;&amp;amp;nbsp;&amp;amp;nbsp;&amp;amp;nbsp;&#039;&#039;&#039;&#039;&#039;Main article:&#039;&#039;&#039; [[Dataset Resources]]&#039;&#039;&amp;lt;br&amp;gt;There are a variety of bioinformatic dataset resources integrated by our team.&lt;br /&gt;
&amp;lt;/div&amp;gt;&lt;br /&gt;
    &amp;lt;/div&amp;gt;&lt;br /&gt;
&amp;lt;/div&amp;gt;&lt;br /&gt;
&amp;lt;/div&amp;gt;&lt;/div&gt;</summary>
		<author><name>Jeetvora</name></author>
	</entry>
	<entry>
		<id>https://hivelab.biochemistry.gwu.edu/wiki/index.php?title=Tool_Resources&amp;diff=929</id>
		<title>Tool Resources</title>
		<link rel="alternate" type="text/html" href="https://hivelab.biochemistry.gwu.edu/wiki/index.php?title=Tool_Resources&amp;diff=929"/>
		<updated>2025-08-14T20:21:16Z</updated>

		<summary type="html">&lt;p&gt;Jeetvora: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;h2&amp;gt;Tools/Applications&amp;lt;/h2&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;[https://hive.biochemistry.gwu.edu/dna.cgi?cmd=main GW HIVE] - GW instance of the High-performance Integrated Virtual Environment (PMID: [https://pubmed.ncbi.nlm.nih.gov/25271953/ 25271953]; [https://pubmed.ncbi.nlm.nih.gov/26989153/ 26989153])&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;AWS HIVE - AWS instance of the High-performance Integrated Virtual Environment (PMID: [https://pubmed.ncbi.nlm.nih.gov/25271953/ 25271953]; [https://pubmed.ncbi.nlm.nih.gov/26989153/ 26989153])&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;[https://www.biocomputeobject.org/ BioCompute Objects Portal (BCOP)] - Portal for making BioCompute objects&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;[https://www.biocomputeobject.org/ PhyloSNP] - generate phylogenetic trees from single-nucleotide variation data (PMID: [https://pubmed.ncbi.nlm.nih.gov/24930720 24930720])&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;[https://hivelab.tst.biochemistry.gwu.edu/CensuScope CensuScope] - detect taxonomic composition of a metagenomic data set (PMID: [https://pubmed.ncbi.nlm.nih.gov/25336203 25336203])&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;[https://hivelab.biochemistry.gwu.edu/predictmod PredictMod] - A machine-learning-based platform for tools that predict patient-based clinical outcomes (PMID: [https://pubmed.ncbi.nlm.nih.gov/33814114 33814114])&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;h2&amp;gt;Resources&amp;lt;/h2&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;[https://hivelab.tst.biochemistry.gwu.edu/biomuta_overview BioMuta] - browse human disease associated single-nucleotide variations (PMID: [https://pubmed.ncbi.nlm.nih.gov/24667251/ 24667251])&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;[https://hivelab.tst.biochemistry.gwu.edu/bioxpress_overview BioXpress] - a curated gene expression and disease association database. (PMID: [https://pubmed.ncbi.nlm.nih.gov/25819073/ 25819073])&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;[https://hivelab.tst.biochemistry.gwu.edu/gfkb Gut Feeling Knowledge Base] - a reference database of the healthy human gut microbiome. (DOI:[https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0206484 10.1371/journal.pone.0206484])&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;[https://hivelab.tst.biochemistry.gwu.edu/tools FilteredNT]&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;[https://www.oncomx.org/ OncoMX] - KB of unified cancer genomics data from integrated mutation, expression, literature, and biomarker databases, accessible through web portal&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;[https://www.glygen.org/ GlyGen] - data integration and dissemination project for carbohydrate and glycoconjugate related data. (PMID: [https://www.ncbi.nlm.nih.gov/pubmed/31616925 31616925]; [https://www.ncbi.nlm.nih.gov/pubmed/32324859 32324859])&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;[https://biomarkerkb.org/ BiomarkerKB] - KB for organized and integrated biomarker data from different public sources with contextual information about biomarkers.&amp;lt;/li&amp;gt;&lt;/div&gt;</summary>
		<author><name>Jeetvora</name></author>
	</entry>
	<entry>
		<id>https://hivelab.biochemistry.gwu.edu/wiki/index.php?title=Tool_Resources&amp;diff=928</id>
		<title>Tool Resources</title>
		<link rel="alternate" type="text/html" href="https://hivelab.biochemistry.gwu.edu/wiki/index.php?title=Tool_Resources&amp;diff=928"/>
		<updated>2025-08-14T20:19:12Z</updated>

		<summary type="html">&lt;p&gt;Jeetvora: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;h2&amp;gt;Tools/Applications&amp;lt;/h2&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;[https://hive.biochemistry.gwu.edu/dna.cgi?cmd=main GW HIVE] - GW instance of the High-performance Integrated Virtual Environment (PMID: [https://pubmed.ncbi.nlm.nih.gov/25271953/ 25271953]; [https://pubmed.ncbi.nlm.nih.gov/26989153/ 26989153])&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;AWS HIVE - AWS instance of the High-performance Integrated Virtual Environment (PMID: [https://pubmed.ncbi.nlm.nih.gov/25271953/ 25271953]; [https://pubmed.ncbi.nlm.nih.gov/26989153/ 26989153])&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;[https://www.biocomputeobject.org/ BioCompute Objects Portal (BCOP)] - Portal for making BioCompute objects&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;[https://www.biocomputeobject.org/ PhyloSNP] - generate phylogenetic trees from single-nucleotide variation data (PMID: [https://pubmed.ncbi.nlm.nih.gov/24930720 24930720])&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;[https://hivelab.tst.biochemistry.gwu.edu/CensuScope CensuScope] - detect taxonomic composition of a metagenomic data set (PMID: [https://pubmed.ncbi.nlm.nih.gov/25336203 25336203])&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;[https://hivelab.biochemistry.gwu.edu/predictmod PredictMod] - A machine-learning-based platform for tools that predict patient-based clinical outcomes (PMID: [https://pubmed.ncbi.nlm.nih.gov/33814114 33814114])&amp;lt;/li&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;h2&amp;gt;Resources&amp;lt;/h2&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;[https://hivelab.tst.biochemistry.gwu.edu/biomuta_overview BioMuta] - browse human disease associated single-nucleotide variations (PMID: [https://pubmed.ncbi.nlm.nih.gov/24667251/ 24667251])&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;[https://hivelab.tst.biochemistry.gwu.edu/bioxpress_overview BioXpress] - a curated gene expression and disease association database. (PMID: [https://pubmed.ncbi.nlm.nih.gov/25819073/ 25819073])&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;[https://hivelab.tst.biochemistry.gwu.edu/gfkb Gut Feeling Knowledge Base] - a reference database of the healthy human gut microbiome. (DOI:[https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0206484 10.1371/journal.pone.0206484])&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;[https://hivelab.tst.biochemistry.gwu.edu/tools FilteredNT]&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;[https://www.oncomx.org/ OncoMX] - KB of unified cancer genomics data from integrated mutation, expression, literature, and biomarker databases, accessible through web portal&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;[https://www.glygen.org/ GlyGen] - data integration and dissemination project for carbohydrate and glycoconjugate related data.&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;[https://biomarkerkb.org/ BiomarkerKB] - KB for organized and integrated biomarker data from different public sources with contextual information about biomarkers.&amp;lt;/li&amp;gt;&lt;/div&gt;</summary>
		<author><name>Jeetvora</name></author>
	</entry>
	<entry>
		<id>https://hivelab.biochemistry.gwu.edu/wiki/index.php?title=Symposium_2025&amp;diff=907</id>
		<title>Symposium 2025</title>
		<link rel="alternate" type="text/html" href="https://hivelab.biochemistry.gwu.edu/wiki/index.php?title=Symposium_2025&amp;diff=907"/>
		<updated>2025-07-31T13:39:57Z</updated>

		<summary type="html">&lt;p&gt;Jeetvora: /* Agenda */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;The HIVE Lab summer symposium is scheduled for Thursday, July 31, 2025. It is an exciting time for the lab volunteers and interns to present their findings on the projects they worked on for 8 weeks.&lt;br /&gt;
&lt;br /&gt;
[[File:DC.png|center|frame]]&lt;br /&gt;
&lt;br /&gt;
== &#039;&#039;&#039;Program and Information&#039;&#039;&#039; ==&lt;br /&gt;
&lt;br /&gt;
=== &#039;&#039;&#039;Symposium Venue&#039;&#039;&#039; ===&lt;br /&gt;
The HIVE Lab symposium will be held in person at The George Washington University, Washington, DC, with an option to join virtually.&lt;br /&gt;
&lt;br /&gt;
In Person - Ross 643, Ross Hall, School of Medicine and Health Sciences, The George Washington University, Washington, DC ([https://maps.app.goo.gl/PHQmZacA4hWDvTCh6 MAP])&lt;br /&gt;
&lt;br /&gt;
Virtual - Zoom&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Zoom Link  -&#039;&#039;&#039; https://gwu-edu.zoom.us/j/98841344003    &lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Add to - [https://gwu-edu.zoom.us/meeting/tJwlc-irqj8qGtdu0zteQ__E2Fqo0fbPF6P7/calendar/google/add Google Calendar] | [https://gwu-edu.zoom.us/meeting/tJwlc-irqj8qGtdu0zteQ__E2Fqo0fbPF6P7/ics Outlook Calendar] | [https://calendar.yahoo.com/?v=60&amp;amp;VIEW=d&amp;amp;TITLE=2025%20CFDE-GlyGen-HIVE%20Lab%20Summer%20Symposium&amp;amp;in_loc=https%3A%2F%2Fgwu-edu.zoom.us%2Fj%2F98841344003&amp;amp;URL=https%3A%2F%2Fgwu-edu.zoom.us%2Fj%2F98841344003&amp;amp;ST=20250731T140000Z&amp;amp;DUR=0700&amp;amp;DESC=Jeet%20Vora%20%28GlyGen%20-%20GW%29%20is%20inviting%20you%20to%20a%20scheduled%20Zoom%20meeting.%0D%0AJoin%20Zoom%20Meeting%0D%0Ahttps%3A%2F%2Fgwu-edu.zoom.us%2Fj%2F98841344003%0D%0A%0D%0AMeeting%20ID%3A%20988%204134%204003%0D%0A%0D%0A---%0D%0A%0D%0AOne%20tap%20mobile%0D%0A%2B14042012656%2C%2C98841344003%23%20US%0D%0A%2B13097403221%2C%2C98841344003%23%20US%0D%0A%0D%0A---%0D%0A%0D%0ADial%20by%20your%20location%0D%0A%E2%80%A2%20%2B14042012656%20US%0D%0A%E2%80%A2%20%2B1%20309%20740%203221%20US%0D%0A%E2%80%A2%20%2B12122258997%20US%0D%0A%E2%80%A2%20%2B16462551997%20US%0D%0A%E2%80%A2%20%2B14805624901%20US%0D%0A%E2%80%A2%20%2B81345107597%20Japan%0D%0A%E2%80%A2%20%2B44%20201%20151%208517%20United%20Kingdom%0D%0A%E2%80%A2%20%2B442079791833%20United%20Kingdom%0D%0A%E2%80%A2%20%2B61280318153%20Australia%0D%0A%E2%80%A2%20%2B41434569439%20Switzerland%0D%0A%0D%0AMeeting%20ID%3A%20988%204134%204003%0D%0A%0D%0AFind%20your%20local%20number%3A%20https%3A%2F%2Fgwu-edu.zoom.us%2Fu%2FadB0kGNOyd%0D%0A%0D%0A---%0D%0A%0D%0AJoin%20by%20SIP%0D%0A%E2%80%A2%2098841344003%40zoomcrc.com%0D%0A%0D%0A---%0D%0A%0D%0AJoin%20by%20H.323%0D%0A%E2%80%A2%20144.195.19.161%20%28US%20West%29%0D%0A%E2%80%A2%20206.247.11.121%20%28US%20East%29%0D%0A%E2%80%A2%20115.114.131.7%20%28India%20Mumbai%29%0D%0A%E2%80%A2%20115.114.115.7%20%28India%20Hyderabad%29%0D%0A%E2%80%A2%20159.124.15.191%20%28Amsterdam%20Netherlands%29%0D%0A%E2%80%A2%20159.124.47.249%20%28Germany%29%0D%0A%E2%80%A2%20159.124.104.213%2
0%28Australia%20Sydney%29%0D%0A%E2%80%A2%20159.124.74.212%20%28Australia%20Melbourne%29%0D%0A%E2%80%A2%20170.114.180.219%20%28Singapore%29%0D%0A%E2%80%A2%2064.211.144.160%20%28Brazil%29%0D%0A%E2%80%A2%20159.124.132.243%20%28Mexico%29%0D%0A%E2%80%A2%20159.124.168.213%20%28Canada%20Toronto%29%0D%0A%E2%80%A2%20159.124.196.25%20%28Canada%20Vancouver%29%0D%0A%E2%80%A2%20170.114.194.163%20%28Japan%20Tokyo%29%0D%0A%E2%80%A2%20147.124.100.25%20%28Japan%20Osaka%29%0D%0A%0D%0AMeeting%20ID%3A%20988%204134%204003%0D%0A%0D%0A Yahoo Calendar]&#039;&#039;&#039;      &lt;br /&gt;
&lt;br /&gt;
== &#039;&#039;&#039;Agenda&#039;&#039;&#039; ==&lt;br /&gt;
All times are in Eastern Daylight Time (EDT).&lt;br /&gt;
{| class=&amp;quot;wikitable&amp;quot;&lt;br /&gt;
|&#039;&#039;&#039;Time (ET)&#039;&#039;&#039;&lt;br /&gt;
|&#039;&#039;&#039;Project&#039;&#039;&#039;&lt;br /&gt;
|&#039;&#039;&#039;Title&#039;&#039;&#039;&lt;br /&gt;
|&#039;&#039;&#039;Presenter&#039;&#039;&#039;&lt;br /&gt;
|-&lt;br /&gt;
|&#039;&#039;&#039;10:00am&#039;&#039;&#039;&lt;br /&gt;
| colspan=&amp;quot;2&amp;quot; |                                                                                            &#039;&#039;&#039;Welcome and Introduction&#039;&#039;&#039;&lt;br /&gt;
|&#039;&#039;&#039;Michael Tiemeyer (10 min)&#039;&#039;&#039;&lt;br /&gt;
|-&lt;br /&gt;
| colspan=&amp;quot;4&amp;quot; |                                                                                                                         &#039;&#039;Group 1 Moderator : Nathan Edwards&#039;&#039;&lt;br /&gt;
|-&lt;br /&gt;
|10:10am &lt;br /&gt;
|CFDE&lt;br /&gt;
|Integrating Biocuration and Data Standardization to Generate Machine Learning-Ready Glycan Datasets + 5 min Q/A&lt;br /&gt;
|Ana Jaramillo and Yuxin Zou (20 min)&lt;br /&gt;
|-&lt;br /&gt;
|10:30am&lt;br /&gt;
|CFDE&lt;br /&gt;
|Machine Learning Models for Linkage Prediction in Glycan Images + 5 min Q/A&lt;br /&gt;
|Campbell Ross (15 min)&lt;br /&gt;
|-&lt;br /&gt;
|10:45am&lt;br /&gt;
|CFDE&lt;br /&gt;
|A Graph-Based AI Workflow for Mining Glycan Biomarkers and Related Annotations from Publications + 5 min Q/A&lt;br /&gt;
|Cyrus Chun Hong Au Yeung (15 min)&lt;br /&gt;
|-&lt;br /&gt;
|11:00am&lt;br /&gt;
|BiomarkerKB&lt;br /&gt;
|Comprehensive Identification and LLM Based Curation of the Top 50 Clinically Relevant Disease Biomarkers + 5 min Q/A&lt;br /&gt;
|Sohana Bahl, Isaac Kim, Sparsh Gupta (15 min)&lt;br /&gt;
|-&lt;br /&gt;
|11:15am&lt;br /&gt;
|BiomarkerKB&lt;br /&gt;
|Lupus Discovery Project + 5 min Q/A&lt;br /&gt;
|Nathan Ressom, Ana Vohralikova, Mathias Belay (15 min)&lt;br /&gt;
|-&lt;br /&gt;
|11:30am&lt;br /&gt;
|BiomarkerKB&lt;br /&gt;
|Systematic Curation and Large Language Model-Based Extraction of Alzheimer’s Disease Biomarkers&lt;br /&gt;
|John McCaffery, Alma Ogunsina, Akale Kinfe (15 min)&lt;br /&gt;
|-&lt;br /&gt;
|&#039;&#039;&#039;11:45am&#039;&#039;&#039;&lt;br /&gt;
| colspan=&amp;quot;2&amp;quot; |&#039;&#039;&#039;Open Q and A&#039;&#039;&#039;&lt;br /&gt;
|&#039;&#039;&#039;All (30 min)&#039;&#039;&#039;&lt;br /&gt;
|-&lt;br /&gt;
|&#039;&#039;&#039;12:30pm&#039;&#039;&#039;&lt;br /&gt;
| colspan=&amp;quot;3&amp;quot; |                                                                                                          &#039;&#039;&#039;LUNCH Ross 505 (90 mins)&#039;&#039;&#039;&lt;br /&gt;
|-&lt;br /&gt;
| colspan=&amp;quot;4&amp;quot; |                                                                                                                         &#039;&#039;Group 2 Moderator : Rene Ranzinger&#039;&#039;&lt;br /&gt;
|-&lt;br /&gt;
|1:55pm&lt;br /&gt;
| colspan=&amp;quot;2&amp;quot; |&#039;&#039;&#039;Introduction&#039;&#039;&#039;&lt;br /&gt;
|Raja Mazumder&lt;br /&gt;
|-&lt;br /&gt;
|2:00pm&lt;br /&gt;
|GlyGen&lt;br /&gt;
|GlyGen Biocuration Project + 5 min Q/A&lt;br /&gt;
|Aise Arpinar, Haravinay P. Gujjulla, Nahom Abel (20 min)&lt;br /&gt;
|-&lt;br /&gt;
|2:20pm&lt;br /&gt;
|PredictMod Curation&lt;br /&gt;
|PredictMod: PubMed Curation for Training an LLM for Recommendation + 5 min Q/A&lt;br /&gt;
|Grace Chong, Aaron Ressom, Diya Kamalabharathy (15 min)&lt;br /&gt;
|-&lt;br /&gt;
|2:35pm&lt;br /&gt;
|Glycobiology Web Development&lt;br /&gt;
|A Resource Drill Down and Visualization for the Glyspace Alliance + 5 min Q/A&lt;br /&gt;
|Diya Kamalabharathy (5 min)&lt;br /&gt;
|-&lt;br /&gt;
|2:40pm&lt;br /&gt;
|PredictMod AI-READI&lt;br /&gt;
|Robust Classification of Glycemic Health States from Continuous Glucose + 5 min Q/A&lt;br /&gt;
|Nikhil Arethiya (15 min)&lt;br /&gt;
|-&lt;br /&gt;
|2:55pm&lt;br /&gt;
|Argos&lt;br /&gt;
|Curation of Emerging Pathogen Genomes for FDA-ARGOS Database Expansion + 5 min Q/A&lt;br /&gt;
|Miao Wang (15 min)&lt;br /&gt;
|-&lt;br /&gt;
|3:10pm&lt;br /&gt;
|GlycoSiteMiner&lt;br /&gt;
|Categorization of glycan names&lt;br /&gt;
|Filmawit Zeru (15 min)&lt;br /&gt;
|-&lt;br /&gt;
|&#039;&#039;&#039;3:25pm&#039;&#039;&#039;&lt;br /&gt;
| colspan=&amp;quot;2&amp;quot; |&#039;&#039;&#039;&amp;lt;nowiki&amp;gt;Open Q and A  | Closing Remarks&amp;lt;/nowiki&amp;gt;&#039;&#039;&#039;&lt;br /&gt;
|&#039;&#039;&#039;&amp;lt;nowiki&amp;gt;All (20 min) | Raja Mazumder&amp;lt;/nowiki&amp;gt;&#039;&#039;&#039;&lt;br /&gt;
|-&lt;br /&gt;
|&#039;&#039;3:45pm&#039;&#039;&lt;br /&gt;
| colspan=&amp;quot;3&amp;quot; |&#039;&#039;&amp;lt;nowiki&amp;gt;Certificate Distribution | Photo | Break (15 min)&amp;lt;/nowiki&amp;gt;&#039;&#039;&lt;br /&gt;
|-&lt;br /&gt;
|4:00pm&lt;br /&gt;
| colspan=&amp;quot;2&amp;quot; |&#039;&#039;&#039;Guest Lecture: Crafting a Strong LinkedIn Profile and Resume&#039;&#039;&#039; &lt;br /&gt;
|&#039;&#039;&#039;Sara Orrick, Senior Career Consultant (45 min)&#039;&#039;&#039;&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
== &#039;&#039;&#039;Project Description&#039;&#039;&#039; ==&lt;br /&gt;
&lt;br /&gt;
=== CFDE Project ===&lt;br /&gt;
The CFDE project focuses on integrating biocuration and data standardization to generate machine learning-ready glycan datasets. It brings together curated information and structured metadata to ensure that glycan-related data is both interoperable and computationally accessible. As part of this effort, the project supports the development of machine learning models for linkage prediction in glycan images, enabling automated interpretation of glycan structures from visual representations. In addition, a graph-based AI workflow is being implemented to mine glycan biomarkers and related annotations from scientific publications, helping to uncover novel insights and associations. These approaches collectively advance the integration of glycobiology into broader biomedical research by making glycan data more usable for downstream AI applications.&lt;br /&gt;
&lt;br /&gt;
=== GlyGen Project ===&lt;br /&gt;
The GlyGen Biocuration project focuses on integrating legacy, yet valuable, data from the CarbBank and CFG databases into the GlyGen infrastructure. A key challenge is mapping metadata, such as species names and publication references, to standardized dictionaries and ontologies. While most entries have been automatically matched using custom scripts, remaining inconsistencies, including outdated, misspelled, or abbreviated terms, require manual curation using resources such as Google, PubMed, and domain-specific dictionaries and ontologies.&lt;br /&gt;
&lt;br /&gt;
=== BiomarkerKB Biocuration Project ===&lt;br /&gt;
The BiomarkerKB Biocuration project focuses on curating biomarkers from abstracts and publications into the BiomarkerKB data model. A key challenge is the vast amount of biomarker data spread across many publications. Manual curation requires reading and understanding the key elements of biomarker data, drawing inferences from them, and mapping them to the defined biomarker data model. LLM-based methods can help by recognizing biomarker and condition data, mapping the extracted information into the data model, and automatically attaching contextual and standardized data so that the result is ready for AI and machine learning.&lt;br /&gt;
&lt;br /&gt;
=== ArgosDB Curation Project ===&lt;br /&gt;
This project focuses on evaluating and curating high-quality genomes of emerging and clinically relevant pathogens, with an emphasis on fungal species. Using public genomic repositories and the FDA-ARGOS inclusion criteria, the project identifies candidate organisms for database expansion to support diagnostic assay development and public health surveillance.&lt;br /&gt;
&lt;br /&gt;
=== PredictMod Curation Project ===&lt;br /&gt;
Identifying relevant and useful publicly available datasets for machine learning is currently a resource-intensive task. This curation project aims to develop a corpus for training an AI model to recommend PMIDs with publicly available datasets that are useful for intervention outcome prediction models. The corpus will include an annotation spreadsheet and annotated PDFs for PubMed articles relevant to prostate, lung, breast, and liver cancer, and will focus on indicators such as condition, intervention, and response.&lt;br /&gt;
&lt;br /&gt;
=== PredictMod AI-READI Project ===&lt;br /&gt;
This project creates a data-driven pipeline that uses continuous glucose monitoring (CGM) data to distinguish truly healthy individuals from those with underlying glycemic dysregulation, even when they are mislabeled. Using the AI-READI dataset funded by the NIH Bridge2AI program, the pipeline combines unsupervised clustering, handcrafted feature engineering, and LSTM-based deep learning to identify metabolic health states and extract insights into glycemic variability, with potential real-time applications in personalized health monitoring.&lt;br /&gt;
&lt;br /&gt;
=== Glycobiology Web Development ===&lt;br /&gt;
This project creates an EDAM ontology of glycobiology and glycoinformatics resources to support web development for the Glyspace Alliance. It compiles the resources associated with the Glyspace Alliance and organizes them by type, topic, tool operation, and data, with the goal of a more user-friendly tool for accessing them.&lt;br /&gt;
&lt;br /&gt;
=== GlycoSiteMiner Project ===&lt;br /&gt;
This project establishes a set of rules to broadly classify glycan names into structure-based and function-based categories and applies these rules to organize entries in the glycan dictionary. The classification rules will evolve over time, enabling more refined hierarchical categories. The ultimate goal of this rule-based categorization is to support automated literature mining, specifically the identification of glycan names in PubMed articles.&lt;/div&gt;</summary>
		<author><name>Jeetvora</name></author>
	</entry>
	<entry>
		<id>https://hivelab.biochemistry.gwu.edu/wiki/index.php?title=Symposium_2025&amp;diff=906</id>
		<title>Symposium 2025</title>
		<link rel="alternate" type="text/html" href="https://hivelab.biochemistry.gwu.edu/wiki/index.php?title=Symposium_2025&amp;diff=906"/>
		<updated>2025-07-31T13:24:10Z</updated>

		<summary type="html">&lt;p&gt;Jeetvora: /* Agenda */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;The HIVE Lab summer symposium is scheduled for Thursday, July 31, 2025. It is an exciting time for the lab volunteers and interns, who will present findings from the projects they worked on over 8 weeks.&lt;br /&gt;
&lt;br /&gt;
[[File:DC.png|center|frame]]&lt;br /&gt;
&lt;br /&gt;
== &#039;&#039;&#039;Program and Information&#039;&#039;&#039; ==&lt;br /&gt;
&lt;br /&gt;
=== &#039;&#039;&#039;Symposium Venue&#039;&#039;&#039; ===&lt;br /&gt;
The HIVE Lab symposium will be held in person at The George Washington University, Washington, DC, with an option to join virtually.&lt;br /&gt;
&lt;br /&gt;
In Person - Ross 643, Ross Hall, School of Health and Medical Sciences, The George Washington University, Washington DC ([https://maps.app.goo.gl/PHQmZacA4hWDvTCh6 MAP])&lt;br /&gt;
&lt;br /&gt;
Virtual - Zoom&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Zoom Link  -&#039;&#039;&#039; https://gwu-edu.zoom.us/j/98841344003    &lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Add to - [https://gwu-edu.zoom.us/meeting/tJwlc-irqj8qGtdu0zteQ__E2Fqo0fbPF6P7/calendar/google/add Google Calendar] | [https://gwu-edu.zoom.us/meeting/tJwlc-irqj8qGtdu0zteQ__E2Fqo0fbPF6P7/ics Outlook Calendar] | [https://calendar.yahoo.com/?v=60&amp;amp;VIEW=d&amp;amp;TITLE=2025%20CFDE-GlyGen-HIVE%20Lab%20Summer%20Symposium&amp;amp;in_loc=https%3A%2F%2Fgwu-edu.zoom.us%2Fj%2F98841344003&amp;amp;URL=https%3A%2F%2Fgwu-edu.zoom.us%2Fj%2F98841344003&amp;amp;ST=20250731T140000Z&amp;amp;DUR=0700&amp;amp;DESC=Jeet%20Vora%20%28GlyGen%20-%20GW%29%20is%20inviting%20you%20to%20a%20scheduled%20Zoom%20meeting.%0D%0AJoin%20Zoom%20Meeting%0D%0Ahttps%3A%2F%2Fgwu-edu.zoom.us%2Fj%2F98841344003%0D%0A%0D%0AMeeting%20ID%3A%20988%204134%204003%0D%0A%0D%0A---%0D%0A%0D%0AOne%20tap%20mobile%0D%0A%2B14042012656%2C%2C98841344003%23%20US%0D%0A%2B13097403221%2C%2C98841344003%23%20US%0D%0A%0D%0A---%0D%0A%0D%0ADial%20by%20your%20location%0D%0A%E2%80%A2%20%2B14042012656%20US%0D%0A%E2%80%A2%20%2B1%20309%20740%203221%20US%0D%0A%E2%80%A2%20%2B12122258997%20US%0D%0A%E2%80%A2%20%2B16462551997%20US%0D%0A%E2%80%A2%20%2B14805624901%20US%0D%0A%E2%80%A2%20%2B81345107597%20Japan%0D%0A%E2%80%A2%20%2B44%20201%20151%208517%20United%20Kingdom%0D%0A%E2%80%A2%20%2B442079791833%20United%20Kingdom%0D%0A%E2%80%A2%20%2B61280318153%20Australia%0D%0A%E2%80%A2%20%2B41434569439%20Switzerland%0D%0A%0D%0AMeeting%20ID%3A%20988%204134%204003%0D%0A%0D%0AFind%20your%20local%20number%3A%20https%3A%2F%2Fgwu-edu.zoom.us%2Fu%2FadB0kGNOyd%0D%0A%0D%0A---%0D%0A%0D%0AJoin%20by%20SIP%0D%0A%E2%80%A2%2098841344003%40zoomcrc.com%0D%0A%0D%0A---%0D%0A%0D%0AJoin%20by%20H.323%0D%0A%E2%80%A2%20144.195.19.161%20%28US%20West%29%0D%0A%E2%80%A2%20206.247.11.121%20%28US%20East%29%0D%0A%E2%80%A2%20115.114.131.7%20%28India%20Mumbai%29%0D%0A%E2%80%A2%20115.114.115.7%20%28India%20Hyderabad%29%0D%0A%E2%80%A2%20159.124.15.191%20%28Amsterdam%20Netherlands%29%0D%0A%E2%80%A2%20159.124.47.249%20%28Germany%29%0D%0A%E2%80%A2%20159.124.104.213%2
0%28Australia%20Sydney%29%0D%0A%E2%80%A2%20159.124.74.212%20%28Australia%20Melbourne%29%0D%0A%E2%80%A2%20170.114.180.219%20%28Singapore%29%0D%0A%E2%80%A2%2064.211.144.160%20%28Brazil%29%0D%0A%E2%80%A2%20159.124.132.243%20%28Mexico%29%0D%0A%E2%80%A2%20159.124.168.213%20%28Canada%20Toronto%29%0D%0A%E2%80%A2%20159.124.196.25%20%28Canada%20Vancouver%29%0D%0A%E2%80%A2%20170.114.194.163%20%28Japan%20Tokyo%29%0D%0A%E2%80%A2%20147.124.100.25%20%28Japan%20Osaka%29%0D%0A%0D%0AMeeting%20ID%3A%20988%204134%204003%0D%0A%0D%0A Yahoo Calendar]&#039;&#039;&#039;      &lt;br /&gt;
&lt;br /&gt;
== &#039;&#039;&#039;Agenda&#039;&#039;&#039; ==&lt;br /&gt;
All times are in Eastern Daylight Time (EDT).&lt;br /&gt;
{| class=&amp;quot;wikitable&amp;quot;&lt;br /&gt;
|&#039;&#039;&#039;Time (ET)&#039;&#039;&#039;&lt;br /&gt;
|&#039;&#039;&#039;Project&#039;&#039;&#039;&lt;br /&gt;
|&#039;&#039;&#039;Title&#039;&#039;&#039;&lt;br /&gt;
|&#039;&#039;&#039;Presenter&#039;&#039;&#039;&lt;br /&gt;
|-&lt;br /&gt;
|&#039;&#039;&#039;10:00am&#039;&#039;&#039;&lt;br /&gt;
| colspan=&amp;quot;2&amp;quot; |                                                                                            &#039;&#039;&#039;Welcome and Introduction&#039;&#039;&#039;&lt;br /&gt;
|&#039;&#039;&#039;Michael Tiemeyer (10 min)&#039;&#039;&#039;&lt;br /&gt;
|-&lt;br /&gt;
| colspan=&amp;quot;4&amp;quot; |                                                                                                                         &#039;&#039;Group 1 Moderator : Nathan Edwards&#039;&#039;&lt;br /&gt;
|-&lt;br /&gt;
|10:10am &lt;br /&gt;
|CFDE&lt;br /&gt;
|Integrating Biocuration and Data Standardization to Generate Machine Learning-Ready Glycan Datasets + 5 min Q/A&lt;br /&gt;
|Ana Jaramillo and Yuxin Zou (20 min)&lt;br /&gt;
|-&lt;br /&gt;
|10:30am&lt;br /&gt;
|CFDE&lt;br /&gt;
|Machine Learning Models for Linkage Prediction in Glycan Images + 5 min Q/A&lt;br /&gt;
|Campbell Ross (15 min)&lt;br /&gt;
|-&lt;br /&gt;
|10:45am&lt;br /&gt;
|CFDE&lt;br /&gt;
|A Graph-Based AI Workflow for Mining Glycan Biomarkers and Related Annotations from Publications + 5 min Q/A&lt;br /&gt;
|Cyrus Chun Hong Au Yeung (15 min)&lt;br /&gt;
|-&lt;br /&gt;
|11:00am&lt;br /&gt;
|BiomarkerKB&lt;br /&gt;
|Comprehensive Identification and LLM Based Curation of the Top 50 Clinically Relevant Disease Biomarkers + 5 min Q/A&lt;br /&gt;
|Sohana Bahl, Isaac Kim, Sparsh Gupta (15 min)&lt;br /&gt;
|-&lt;br /&gt;
|11:15am&lt;br /&gt;
|BiomarkerKB&lt;br /&gt;
|TBA + 5 min Q/A&lt;br /&gt;
|Nathan Ressom, Ana Vohralikova, Mathias Belay (15 min)&lt;br /&gt;
|-&lt;br /&gt;
|11:30am&lt;br /&gt;
|BiomarkerKB&lt;br /&gt;
|Systematic Curation and Large Language Model-Based Extraction of Alzheimer’s Disease Biomarkers&lt;br /&gt;
|John McCaffery, Alma Ogunsina, Akale Kinfe (15 min)&lt;br /&gt;
|-&lt;br /&gt;
|&#039;&#039;&#039;11:45am&#039;&#039;&#039;&lt;br /&gt;
| colspan=&amp;quot;2&amp;quot; |&#039;&#039;&#039;Open Q and A&#039;&#039;&#039;&lt;br /&gt;
|&#039;&#039;&#039;All (30 min)&#039;&#039;&#039;&lt;br /&gt;
|-&lt;br /&gt;
|&#039;&#039;&#039;12:30pm&#039;&#039;&#039;&lt;br /&gt;
| colspan=&amp;quot;3&amp;quot; |                                                                                                          &#039;&#039;&#039;LUNCH Ross 505 (90 mins)&#039;&#039;&#039;&lt;br /&gt;
|-&lt;br /&gt;
| colspan=&amp;quot;4&amp;quot; |                                                                                                                         &#039;&#039;Group 2 Moderator : Rene Ranzinger&#039;&#039;&lt;br /&gt;
|-&lt;br /&gt;
|1:55pm&lt;br /&gt;
| colspan=&amp;quot;2&amp;quot; |&#039;&#039;&#039;Introduction&#039;&#039;&#039;&lt;br /&gt;
|Raja Mazumder&lt;br /&gt;
|-&lt;br /&gt;
|2:00pm&lt;br /&gt;
|GlyGen&lt;br /&gt;
|GlyGen Biocuration Project + 5 min Q/A&lt;br /&gt;
|Aise Arpinar, Haravinay P. Gujjulla, Nahom Abel (20 min)&lt;br /&gt;
|-&lt;br /&gt;
|2:20pm&lt;br /&gt;
|PredictMod Curation&lt;br /&gt;
|PredictMod: PubMed Curation for Training an LLM for Recommendation + 5 min Q/A&lt;br /&gt;
|Grace Chong, Aaron Ressom, Diya Kamalabharathy (15 min)&lt;br /&gt;
|-&lt;br /&gt;
|2:35pm&lt;br /&gt;
|Glycobiology Web Development&lt;br /&gt;
|A Resource Drill Down and Visualization for the Glyspace Alliance + 5 min Q/A&lt;br /&gt;
|Diya Kamalabharathy (5 min)&lt;br /&gt;
|-&lt;br /&gt;
|2:40pm&lt;br /&gt;
|PredictMod AI-READI&lt;br /&gt;
|Robust Classification of Glycemic Health States from Continuous Glucose + 5 min Q/A&lt;br /&gt;
|Nikhil Arethiya (15 min)&lt;br /&gt;
|-&lt;br /&gt;
|2:55pm&lt;br /&gt;
|Argos&lt;br /&gt;
|Curation of Emerging Pathogen Genomes for FDA-ARGOS Database Expansion + 5 min Q/A&lt;br /&gt;
|Miao Wang (15 min)&lt;br /&gt;
|-&lt;br /&gt;
|3:10pm&lt;br /&gt;
|GlycoSiteMiner&lt;br /&gt;
|Categorization of glycan names&lt;br /&gt;
|Filmawit Zeru (15 min)&lt;br /&gt;
|-&lt;br /&gt;
|&#039;&#039;&#039;3:25pm&#039;&#039;&#039;&lt;br /&gt;
| colspan=&amp;quot;2&amp;quot; |&#039;&#039;&#039;&amp;lt;nowiki&amp;gt;Open Q and A  | Closing Remarks&amp;lt;/nowiki&amp;gt;&#039;&#039;&#039;&lt;br /&gt;
|&#039;&#039;&#039;&amp;lt;nowiki&amp;gt;All (20 min) | Raja Mazumder&amp;lt;/nowiki&amp;gt;&#039;&#039;&#039;&lt;br /&gt;
|-&lt;br /&gt;
|&#039;&#039;3:45pm&#039;&#039;&lt;br /&gt;
| colspan=&amp;quot;3&amp;quot; |&#039;&#039;&amp;lt;nowiki&amp;gt;Certificate Distribution | Photo | Break (15 min)&amp;lt;/nowiki&amp;gt;&#039;&#039;&lt;br /&gt;
|-&lt;br /&gt;
|4:00pm&lt;br /&gt;
| colspan=&amp;quot;2&amp;quot; |&#039;&#039;&#039;Guest Lecture: Crafting a Strong LinkedIn Profile and Resume&#039;&#039;&#039; &lt;br /&gt;
|&#039;&#039;&#039;Sara Orrick, Senior Career Consultant (45 min)&#039;&#039;&#039;&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
== &#039;&#039;&#039;Project Description&#039;&#039;&#039; ==&lt;br /&gt;
&lt;br /&gt;
=== CFDE Project ===&lt;br /&gt;
The CFDE project focuses on integrating biocuration and data standardization to generate machine learning-ready glycan datasets. It brings together curated information and structured metadata to ensure that glycan-related data is both interoperable and computationally accessible. As part of this effort, the project supports the development of machine learning models for linkage prediction in glycan images, enabling automated interpretation of glycan structures from visual representations. In addition, a graph-based AI workflow is being implemented to mine glycan biomarkers and related annotations from scientific publications, helping to uncover novel insights and associations. These approaches collectively advance the integration of glycobiology into broader biomedical research by making glycan data more usable for downstream AI applications.&lt;br /&gt;
&lt;br /&gt;
=== GlyGen Project ===&lt;br /&gt;
The GlyGen Biocuration project focuses on integrating legacy, yet valuable, data from the CarbBank and CFG databases into the GlyGen infrastructure. A key challenge is mapping metadata, such as species names and publication references, to standardized dictionaries and ontologies. While most entries have been automatically matched using custom scripts, remaining inconsistencies, including outdated, misspelled, or abbreviated terms, require manual curation using resources such as Google, PubMed, and domain-specific dictionaries and ontologies.&lt;br /&gt;
&lt;br /&gt;
=== BiomarkerKB Biocuration Project ===&lt;br /&gt;
The BiomarkerKB Biocuration project focuses on curating biomarkers from abstracts and publications into the BiomarkerKB data model. A key challenge is the vast amount of biomarker data spread across many publications. Manual curation requires reading and understanding the key elements of biomarker data, drawing inferences from them, and mapping them to the defined biomarker data model. LLM-based methods can help by recognizing biomarker and condition data, mapping the extracted information into the data model, and automatically attaching contextual and standardized data so that the result is ready for AI and machine learning.&lt;br /&gt;
&lt;br /&gt;
=== ArgosDB Curation Project ===&lt;br /&gt;
This project focuses on evaluating and curating high-quality genomes of emerging and clinically relevant pathogens, with an emphasis on fungal species. Using public genomic repositories and the FDA-ARGOS inclusion criteria, the project identifies candidate organisms for database expansion to support diagnostic assay development and public health surveillance.&lt;br /&gt;
&lt;br /&gt;
=== PredictMod Curation Project ===&lt;br /&gt;
Identifying relevant and useful publicly available datasets for machine learning is currently a resource-intensive task. This curation project aims to develop a corpus for training an AI model to recommend PMIDs with publicly available datasets that are useful for intervention outcome prediction models. The corpus will include an annotation spreadsheet and annotated PDFs for PubMed articles relevant to prostate, lung, breast, and liver cancer, and will focus on indicators such as condition, intervention, and response.&lt;br /&gt;
&lt;br /&gt;
=== PredictMod AI-READI Project ===&lt;br /&gt;
This project creates a data-driven pipeline that uses continuous glucose monitoring (CGM) data to distinguish truly healthy individuals from those with underlying glycemic dysregulation, even when they are mislabeled. Using the AI-READI dataset funded by the NIH Bridge2AI program, the pipeline combines unsupervised clustering, handcrafted feature engineering, and LSTM-based deep learning to identify metabolic health states and extract insights into glycemic variability, with potential real-time applications in personalized health monitoring.&lt;br /&gt;
&lt;br /&gt;
=== Glycobiology Web Development ===&lt;br /&gt;
This project creates an EDAM ontology of glycobiology and glycoinformatics resources to support web development for the Glyspace Alliance. It compiles the resources associated with the Glyspace Alliance and organizes them by type, topic, tool operation, and data, with the goal of a more user-friendly tool for accessing them.&lt;br /&gt;
&lt;br /&gt;
=== GlycoSiteMiner Project ===&lt;br /&gt;
This project establishes a set of rules to broadly classify glycan names into structure-based and function-based categories and applies these rules to organize entries in the glycan dictionary. The classification rules will evolve over time, enabling more refined hierarchical categories. The ultimate goal of this rule-based categorization is to support automated literature mining, specifically the identification of glycan names in PubMed articles.&lt;/div&gt;</summary>
		<author><name>Jeetvora</name></author>
	</entry>
	<entry>
		<id>https://hivelab.biochemistry.gwu.edu/wiki/index.php?title=Symposium_2025&amp;diff=904</id>
		<title>Symposium 2025</title>
		<link rel="alternate" type="text/html" href="https://hivelab.biochemistry.gwu.edu/wiki/index.php?title=Symposium_2025&amp;diff=904"/>
		<updated>2025-07-31T03:35:15Z</updated>

		<summary type="html">&lt;p&gt;Jeetvora: /* Agenda */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;The HIVE Lab summer symposium is scheduled for Thursday, July 31, 2025. It is an exciting time for the lab volunteers and interns, who will present findings from the projects they worked on over 8 weeks.&lt;br /&gt;
&lt;br /&gt;
[[File:DC.png|center|frame]]&lt;br /&gt;
&lt;br /&gt;
== &#039;&#039;&#039;Program and Information&#039;&#039;&#039; ==&lt;br /&gt;
&lt;br /&gt;
=== &#039;&#039;&#039;Symposium Venue&#039;&#039;&#039; ===&lt;br /&gt;
The HIVE Lab symposium will be held in person at The George Washington University, Washington, DC, with an option to join virtually.&lt;br /&gt;
&lt;br /&gt;
In Person - Ross 637, Ross Hall, School of Health and Medical Sciences, The George Washington University, Washington DC ([https://maps.app.goo.gl/PHQmZacA4hWDvTCh6 MAP])&lt;br /&gt;
&lt;br /&gt;
Virtual - Zoom&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Zoom Link  -&#039;&#039;&#039; https://gwu-edu.zoom.us/j/98841344003    &lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Add to - [https://gwu-edu.zoom.us/meeting/tJwlc-irqj8qGtdu0zteQ__E2Fqo0fbPF6P7/calendar/google/add Google Calendar] | [https://gwu-edu.zoom.us/meeting/tJwlc-irqj8qGtdu0zteQ__E2Fqo0fbPF6P7/ics Outlook Calendar] | [https://calendar.yahoo.com/?v=60&amp;amp;VIEW=d&amp;amp;TITLE=2025%20CFDE-GlyGen-HIVE%20Lab%20Summer%20Symposium&amp;amp;in_loc=https%3A%2F%2Fgwu-edu.zoom.us%2Fj%2F98841344003&amp;amp;URL=https%3A%2F%2Fgwu-edu.zoom.us%2Fj%2F98841344003&amp;amp;ST=20250731T140000Z&amp;amp;DUR=0700&amp;amp;DESC=Jeet%20Vora%20%28GlyGen%20-%20GW%29%20is%20inviting%20you%20to%20a%20scheduled%20Zoom%20meeting.%0D%0AJoin%20Zoom%20Meeting%0D%0Ahttps%3A%2F%2Fgwu-edu.zoom.us%2Fj%2F98841344003%0D%0A%0D%0AMeeting%20ID%3A%20988%204134%204003%0D%0A%0D%0A---%0D%0A%0D%0AOne%20tap%20mobile%0D%0A%2B14042012656%2C%2C98841344003%23%20US%0D%0A%2B13097403221%2C%2C98841344003%23%20US%0D%0A%0D%0A---%0D%0A%0D%0ADial%20by%20your%20location%0D%0A%E2%80%A2%20%2B14042012656%20US%0D%0A%E2%80%A2%20%2B1%20309%20740%203221%20US%0D%0A%E2%80%A2%20%2B12122258997%20US%0D%0A%E2%80%A2%20%2B16462551997%20US%0D%0A%E2%80%A2%20%2B14805624901%20US%0D%0A%E2%80%A2%20%2B81345107597%20Japan%0D%0A%E2%80%A2%20%2B44%20201%20151%208517%20United%20Kingdom%0D%0A%E2%80%A2%20%2B442079791833%20United%20Kingdom%0D%0A%E2%80%A2%20%2B61280318153%20Australia%0D%0A%E2%80%A2%20%2B41434569439%20Switzerland%0D%0A%0D%0AMeeting%20ID%3A%20988%204134%204003%0D%0A%0D%0AFind%20your%20local%20number%3A%20https%3A%2F%2Fgwu-edu.zoom.us%2Fu%2FadB0kGNOyd%0D%0A%0D%0A---%0D%0A%0D%0AJoin%20by%20SIP%0D%0A%E2%80%A2%2098841344003%40zoomcrc.com%0D%0A%0D%0A---%0D%0A%0D%0AJoin%20by%20H.323%0D%0A%E2%80%A2%20144.195.19.161%20%28US%20West%29%0D%0A%E2%80%A2%20206.247.11.121%20%28US%20East%29%0D%0A%E2%80%A2%20115.114.131.7%20%28India%20Mumbai%29%0D%0A%E2%80%A2%20115.114.115.7%20%28India%20Hyderabad%29%0D%0A%E2%80%A2%20159.124.15.191%20%28Amsterdam%20Netherlands%29%0D%0A%E2%80%A2%20159.124.47.249%20%28Germany%29%0D%0A%E2%80%A2%20159.124.104.213%2
0%28Australia%20Sydney%29%0D%0A%E2%80%A2%20159.124.74.212%20%28Australia%20Melbourne%29%0D%0A%E2%80%A2%20170.114.180.219%20%28Singapore%29%0D%0A%E2%80%A2%2064.211.144.160%20%28Brazil%29%0D%0A%E2%80%A2%20159.124.132.243%20%28Mexico%29%0D%0A%E2%80%A2%20159.124.168.213%20%28Canada%20Toronto%29%0D%0A%E2%80%A2%20159.124.196.25%20%28Canada%20Vancouver%29%0D%0A%E2%80%A2%20170.114.194.163%20%28Japan%20Tokyo%29%0D%0A%E2%80%A2%20147.124.100.25%20%28Japan%20Osaka%29%0D%0A%0D%0AMeeting%20ID%3A%20988%204134%204003%0D%0A%0D%0A Yahoo Calendar]&#039;&#039;&#039;      &lt;br /&gt;
&lt;br /&gt;
== &#039;&#039;&#039;Agenda&#039;&#039;&#039; ==&lt;br /&gt;
All times are in Eastern Daylight Time (EDT).&lt;br /&gt;
{| class=&amp;quot;wikitable&amp;quot;&lt;br /&gt;
|&#039;&#039;&#039;Time (ET)&#039;&#039;&#039;&lt;br /&gt;
|&#039;&#039;&#039;Project&#039;&#039;&#039;&lt;br /&gt;
|&#039;&#039;&#039;Title&#039;&#039;&#039;&lt;br /&gt;
|&#039;&#039;&#039;Presenter&#039;&#039;&#039;&lt;br /&gt;
|-&lt;br /&gt;
|&#039;&#039;&#039;10:00am&#039;&#039;&#039;&lt;br /&gt;
| colspan=&amp;quot;2&amp;quot; |                                                                                            &#039;&#039;&#039;Welcome and Introduction&#039;&#039;&#039;&lt;br /&gt;
|&#039;&#039;&#039;Michael Tiemeyer (10 min)&#039;&#039;&#039;&lt;br /&gt;
|-&lt;br /&gt;
| colspan=&amp;quot;4&amp;quot; |                                                                                                                         &#039;&#039;Group 1 Moderator : Nathan Edwards&#039;&#039;&lt;br /&gt;
|-&lt;br /&gt;
|10:10am &lt;br /&gt;
|CFDE&lt;br /&gt;
|Integrating Biocuration and Data Standardization to Generate Machine Learning-Ready Glycan Datasets + 5 min Q/A&lt;br /&gt;
|Ana Jaramillo and Yuxin Zou (20 min)&lt;br /&gt;
|-&lt;br /&gt;
|10:30am&lt;br /&gt;
|CFDE&lt;br /&gt;
|Machine Learning Models for Linkage Prediction in Glycan Images + 5 min Q/A&lt;br /&gt;
|Campbell Ross (15 min)&lt;br /&gt;
|-&lt;br /&gt;
|10:45am&lt;br /&gt;
|CFDE&lt;br /&gt;
|A Graph-Based AI Workflow for Mining Glycan Biomarkers and Related Annotations from Publications + 5 min Q/A&lt;br /&gt;
|Cyrus Chun Hong Au Yeung (15 min)&lt;br /&gt;
|-&lt;br /&gt;
|11:00am&lt;br /&gt;
|BiomarkerKB&lt;br /&gt;
|Comprehensive Identification and LLM Based Curation of the Top 50 Clinically Relevant Disease Biomarkers + 5 min Q/A&lt;br /&gt;
|Sohana Bahl, Isaac Kim, Sparsh Gupta (15 min)&lt;br /&gt;
|-&lt;br /&gt;
|11:15am&lt;br /&gt;
|BiomarkerKB&lt;br /&gt;
|TBA + 5 min Q/A&lt;br /&gt;
|Nathan Ressom, Ana Vohralikova, Mathias Belay (15 min)&lt;br /&gt;
|-&lt;br /&gt;
|11:30am&lt;br /&gt;
|BiomarkerKB&lt;br /&gt;
|Systematic Curation and Large Language Model-Based Extraction of Alzheimer’s Disease Biomarkers&lt;br /&gt;
|John McCaffery, Alma Ogunsina, Akale Kinfe (15 min)&lt;br /&gt;
|-&lt;br /&gt;
|&#039;&#039;&#039;11:45am&#039;&#039;&#039;&lt;br /&gt;
| colspan=&amp;quot;2&amp;quot; |&#039;&#039;&#039;Open Q and A&#039;&#039;&#039;&lt;br /&gt;
|&#039;&#039;&#039;All (30 min)&#039;&#039;&#039;&lt;br /&gt;
|-&lt;br /&gt;
|&#039;&#039;&#039;12:30pm&#039;&#039;&#039;&lt;br /&gt;
| colspan=&amp;quot;3&amp;quot; |                                                                                                          &#039;&#039;&#039;LUNCH (90 mins)&#039;&#039;&#039;&lt;br /&gt;
|-&lt;br /&gt;
| colspan=&amp;quot;4&amp;quot; |                                                                                                                         &#039;&#039;Group 2 Moderator : Rene Ranzinger&#039;&#039;&lt;br /&gt;
|-&lt;br /&gt;
|1:55pm&lt;br /&gt;
| colspan=&amp;quot;2&amp;quot; |&#039;&#039;&#039;Introduction&#039;&#039;&#039;&lt;br /&gt;
|Raja Mazumder&lt;br /&gt;
|-&lt;br /&gt;
|2:00pm&lt;br /&gt;
|GlyGen&lt;br /&gt;
|GlyGen Biocuration Project + 5 min Q/A&lt;br /&gt;
|Aise Arpinar, Haravinay P. Gujjulla, Nahom Abel (20 min)&lt;br /&gt;
|-&lt;br /&gt;
|2:20pm&lt;br /&gt;
|PredictMod Curation&lt;br /&gt;
|PredictMod: PubMed Curation for Training an LLM for Recommendation + 5 min Q/A&lt;br /&gt;
|Grace Chong, Aaron Ressom, Diya Kamalabharathy (15 min)&lt;br /&gt;
|-&lt;br /&gt;
|2:35pm&lt;br /&gt;
|Glycobiology Web Development&lt;br /&gt;
|A Resource Drill Down and Visualization for the Glyspace Alliance + 5 min Q/A&lt;br /&gt;
|Diya Kamalabharathy (5 min)&lt;br /&gt;
|-&lt;br /&gt;
|2:40pm&lt;br /&gt;
|PredictMod AI-READI&lt;br /&gt;
|Robust Classification of Glycemic Health States from Continuous Glucose + 5 min Q/A&lt;br /&gt;
|Nikhil Arethiya (15 min)&lt;br /&gt;
|-&lt;br /&gt;
|2:55pm&lt;br /&gt;
|Argos&lt;br /&gt;
|Curation of Emerging Pathogen Genomes for FDA-ARGOS Database Expansion + 5 min Q/A&lt;br /&gt;
|Miao Wang (15 min)&lt;br /&gt;
|-&lt;br /&gt;
|3:10pm&lt;br /&gt;
|GlycoSiteMiner&lt;br /&gt;
|Categorization of glycan names&lt;br /&gt;
|Filmawit Zeru (15 min)&lt;br /&gt;
|-&lt;br /&gt;
|&#039;&#039;&#039;3:25pm&#039;&#039;&#039;&lt;br /&gt;
| colspan=&amp;quot;2&amp;quot; |&#039;&#039;&#039;&amp;lt;nowiki&amp;gt;Open Q and A  | Closing Remarks&amp;lt;/nowiki&amp;gt;&#039;&#039;&#039;&lt;br /&gt;
|&#039;&#039;&#039;&amp;lt;nowiki&amp;gt;All (20 min) | Raja Mazumder&amp;lt;/nowiki&amp;gt;&#039;&#039;&#039;&lt;br /&gt;
|-&lt;br /&gt;
|&#039;&#039;3:45pm&#039;&#039;&lt;br /&gt;
| colspan=&amp;quot;3&amp;quot; |&#039;&#039;&amp;lt;nowiki&amp;gt;Certificate Distribution | Photo | Break (15mins)&amp;lt;/nowiki&amp;gt;&#039;&#039;&lt;br /&gt;
|-&lt;br /&gt;
|4:00pm&lt;br /&gt;
| colspan=&amp;quot;2&amp;quot; |&#039;&#039;&#039;Guest Lecture: Crafting a Strong LinkedIn Profile and Resume&#039;&#039;&#039; &lt;br /&gt;
|&#039;&#039;&#039;Sara Orrick, Senior Career Consultant (45 min)&#039;&#039;&#039;&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
== &#039;&#039;&#039;Project Description&#039;&#039;&#039; ==&lt;br /&gt;
&lt;br /&gt;
=== CFDE Project ===&lt;br /&gt;
The CFDE project focuses on integrating biocuration and data standardization to generate machine learning-ready glycan datasets. It brings together curated information and structured metadata to ensure that glycan-related data is both interoperable and computationally accessible. As part of this effort, the project supports the development of machine learning models for linkage prediction in glycan images, enabling automated interpretation of glycan structures from visual representations. In addition, a graph-based AI workflow is being implemented to mine glycan biomarkers and related annotations from scientific publications, helping to uncover novel insights and associations. These approaches collectively advance the integration of glycobiology into broader biomedical research by making glycan data more usable for downstream AI applications.&lt;br /&gt;
&lt;br /&gt;
=== GlyGen Project ===&lt;br /&gt;
The GlyGen Biocuration project focuses on integrating legacy, yet valuable, data from the CarbBank and CFG databases into the GlyGen infrastructure. A key challenge is mapping metadata, such as species names and publication references, to standardized dictionaries and ontologies. While most entries have been automatically matched using custom scripts, remaining inconsistencies, including outdated, misspelled, or abbreviated terms, require manual curation using resources such as Google, PubMed, and domain-specific dictionaries and ontologies.&lt;br /&gt;
&lt;br /&gt;
=== BiomarkerKB Biocuration Project ===&lt;br /&gt;
The Biomarker Biocuration project focuses on curating biomarkers from abstracts and publications into the BiomarkerKB data model. A key challenge is the vast amount of biomarker data spread across many publications. Manual curation requires reading and interpreting the key elements of biomarker data and mapping them to the defined biomarker data model. LLM-based methods are expected to help by recognizing biomarker and condition data, mapping the extracted information into the data model, and automatically mapping other contextual and standardized data to the model, so that the data is ready for AI and machine learning.&lt;br /&gt;
&lt;br /&gt;
=== ArgosDB Curation Project ===&lt;br /&gt;
This project focuses on evaluating and curating high-quality genomes of emerging and clinically relevant pathogens, with an emphasis on fungal species. Using public genomic repositories and FDA-ARGOS inclusion criteria, the project identifies candidate organisms for database expansion to support diagnostic assay development and public health surveillance.&lt;br /&gt;
&lt;br /&gt;
=== PredictMod Curation Project ===&lt;br /&gt;
Identifying relevant and useful publicly available datasets for machine learning is currently a resource-intensive task. This curation project aims to develop a corpus for training an AI model to recommend PMIDs with publicly available datasets useful for intervention outcome prediction models. The corpus will include an annotation spreadsheet and annotated PDFs for PubMed articles relevant to prostate, lung, breast, and liver cancer, and will focus on indicators such as condition, intervention, and response.&lt;br /&gt;
&lt;br /&gt;
=== PredictMod AI-READI Project ===&lt;br /&gt;
This project creates a data-driven pipeline that uses continuous glucose monitoring (CGM) data to distinguish truly healthy individuals from those with underlying glycemic dysregulation, even if they&#039;re mislabeled. Using the AI-READI dataset funded by the NIH Bridge2AI program, the pipeline combines unsupervised clustering, handcrafted feature engineering, and LSTM-based deep learning to identify metabolic health states and extract insights into glycemic variability, with potential real-time applications in personalized health monitoring.&lt;br /&gt;
&lt;br /&gt;
=== Glycobiology Web Development ===&lt;br /&gt;
This project involves creating an EDAM ontology of glycobiology and glycoinformatics resources for the Glyspace Alliance web development. It includes compiling and organizing everything related to Glyspace to create a more user-friendly tool for accessing these resources. This process involves compiling a list of resources associated with the Glyspace Alliance and sorting them by type, topic, tool operation, and data.&lt;br /&gt;
&lt;br /&gt;
=== &#039;&#039;&#039;GlycoSiteMiner Project&#039;&#039;&#039; ===&lt;br /&gt;
This project establishes a set of rules to broadly classify glycan names into structure-based and function-based categories and applies these rules to organize entries in the glycan dictionary. These classification rules will evolve over time, enabling the creation of more refined hierarchical categories. The ultimate goal of this rule-based glycan name categorization is to support automated literature mining, specifically for identifying glycan names in PubMed articles.&lt;/div&gt;</summary>
		<author><name>Jeetvora</name></author>
	</entry>
	<entry>
		<id>https://hivelab.biochemistry.gwu.edu/wiki/index.php?title=Symposium_2025&amp;diff=903</id>
		<title>Symposium 2025</title>
		<link rel="alternate" type="text/html" href="https://hivelab.biochemistry.gwu.edu/wiki/index.php?title=Symposium_2025&amp;diff=903"/>
		<updated>2025-07-31T03:13:28Z</updated>

		<summary type="html">&lt;p&gt;Jeetvora: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;The HIVE Lab summer symposium is scheduled for Thursday, July 31, 2025. It is an exciting time for the lab volunteers and interns to present their findings on the projects they worked on for 8 weeks.&lt;br /&gt;
&lt;br /&gt;
[[File:DC.png|center|frame]]&lt;br /&gt;
&lt;br /&gt;
== &#039;&#039;&#039;Program and Information&#039;&#039;&#039; ==&lt;br /&gt;
&lt;br /&gt;
=== &#039;&#039;&#039;Symposium Venue&#039;&#039;&#039; ===&lt;br /&gt;
The HIVE Lab symposium will be held in person at The George Washington University, Washington DC, with an option to join virtually.&lt;br /&gt;
&lt;br /&gt;
In Person - Ross 637, Ross Hall, School of Health and Medical Sciences, The George Washington University, Washington DC ([https://maps.app.goo.gl/PHQmZacA4hWDvTCh6 MAP])&lt;br /&gt;
&lt;br /&gt;
Virtual - Zoom&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Zoom Link  -&#039;&#039;&#039; https://gwu-edu.zoom.us/j/98841344003    &lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Add to - [https://gwu-edu.zoom.us/meeting/tJwlc-irqj8qGtdu0zteQ__E2Fqo0fbPF6P7/calendar/google/add Google Calendar] | [https://gwu-edu.zoom.us/meeting/tJwlc-irqj8qGtdu0zteQ__E2Fqo0fbPF6P7/ics Outlook Calendar] | [https://calendar.yahoo.com/?v=60&amp;amp;VIEW=d&amp;amp;TITLE=2025%20CFDE-GlyGen-HIVE%20Lab%20Summer%20Symposium&amp;amp;in_loc=https%3A%2F%2Fgwu-edu.zoom.us%2Fj%2F98841344003&amp;amp;URL=https%3A%2F%2Fgwu-edu.zoom.us%2Fj%2F98841344003&amp;amp;ST=20250731T140000Z&amp;amp;DUR=0700&amp;amp;DESC=Jeet%20Vora%20%28GlyGen%20-%20GW%29%20is%20inviting%20you%20to%20a%20scheduled%20Zoom%20meeting.%0D%0AJoin%20Zoom%20Meeting%0D%0Ahttps%3A%2F%2Fgwu-edu.zoom.us%2Fj%2F98841344003%0D%0A%0D%0AMeeting%20ID%3A%20988%204134%204003%0D%0A%0D%0A---%0D%0A%0D%0AOne%20tap%20mobile%0D%0A%2B14042012656%2C%2C98841344003%23%20US%0D%0A%2B13097403221%2C%2C98841344003%23%20US%0D%0A%0D%0A---%0D%0A%0D%0ADial%20by%20your%20location%0D%0A%E2%80%A2%20%2B14042012656%20US%0D%0A%E2%80%A2%20%2B1%20309%20740%203221%20US%0D%0A%E2%80%A2%20%2B12122258997%20US%0D%0A%E2%80%A2%20%2B16462551997%20US%0D%0A%E2%80%A2%20%2B14805624901%20US%0D%0A%E2%80%A2%20%2B81345107597%20Japan%0D%0A%E2%80%A2%20%2B44%20201%20151%208517%20United%20Kingdom%0D%0A%E2%80%A2%20%2B442079791833%20United%20Kingdom%0D%0A%E2%80%A2%20%2B61280318153%20Australia%0D%0A%E2%80%A2%20%2B41434569439%20Switzerland%0D%0A%0D%0AMeeting%20ID%3A%20988%204134%204003%0D%0A%0D%0AFind%20your%20local%20number%3A%20https%3A%2F%2Fgwu-edu.zoom.us%2Fu%2FadB0kGNOyd%0D%0A%0D%0A---%0D%0A%0D%0AJoin%20by%20SIP%0D%0A%E2%80%A2%2098841344003%40zoomcrc.com%0D%0A%0D%0A---%0D%0A%0D%0AJoin%20by%20H.323%0D%0A%E2%80%A2%20144.195.19.161%20%28US%20West%29%0D%0A%E2%80%A2%20206.247.11.121%20%28US%20East%29%0D%0A%E2%80%A2%20115.114.131.7%20%28India%20Mumbai%29%0D%0A%E2%80%A2%20115.114.115.7%20%28India%20Hyderabad%29%0D%0A%E2%80%A2%20159.124.15.191%20%28Amsterdam%20Netherlands%29%0D%0A%E2%80%A2%20159.124.47.249%20%28Germany%29%0D%0A%E2%80%A2%20159.124.104.213%2
0%28Australia%20Sydney%29%0D%0A%E2%80%A2%20159.124.74.212%20%28Australia%20Melbourne%29%0D%0A%E2%80%A2%20170.114.180.219%20%28Singapore%29%0D%0A%E2%80%A2%2064.211.144.160%20%28Brazil%29%0D%0A%E2%80%A2%20159.124.132.243%20%28Mexico%29%0D%0A%E2%80%A2%20159.124.168.213%20%28Canada%20Toronto%29%0D%0A%E2%80%A2%20159.124.196.25%20%28Canada%20Vancouver%29%0D%0A%E2%80%A2%20170.114.194.163%20%28Japan%20Tokyo%29%0D%0A%E2%80%A2%20147.124.100.25%20%28Japan%20Osaka%29%0D%0A%0D%0AMeeting%20ID%3A%20988%204134%204003%0D%0A%0D%0A Yahoo Calendar]&#039;&#039;&#039;      &lt;br /&gt;
&lt;br /&gt;
== &#039;&#039;&#039;Agenda&#039;&#039;&#039; ==&lt;br /&gt;
All times are in Eastern Time (ET)&lt;br /&gt;
{| class=&amp;quot;wikitable&amp;quot;&lt;br /&gt;
|&#039;&#039;&#039;Time (ET)&#039;&#039;&#039;&lt;br /&gt;
|&#039;&#039;&#039;Project&#039;&#039;&#039;&lt;br /&gt;
|&#039;&#039;&#039;Title&#039;&#039;&#039;&lt;br /&gt;
|&#039;&#039;&#039;Presenter&#039;&#039;&#039;&lt;br /&gt;
|-&lt;br /&gt;
|&#039;&#039;&#039;10:00am&#039;&#039;&#039;&lt;br /&gt;
| colspan=&amp;quot;2&amp;quot; |                                                                                            &#039;&#039;&#039;Welcome and Introduction&#039;&#039;&#039;&lt;br /&gt;
|&#039;&#039;&#039;Michael Tiemeyer (10 min)&#039;&#039;&#039;&lt;br /&gt;
|-&lt;br /&gt;
| colspan=&amp;quot;4&amp;quot; |                                                                                                                         &#039;&#039;Group 1 Moderator : Nathan Edwards&#039;&#039;&lt;br /&gt;
|-&lt;br /&gt;
|10:10am &lt;br /&gt;
|CFDE&lt;br /&gt;
|Integrating Biocuration and Data Standardization to Generate Machine Learning-Ready Glycan Datasets + 5 min Q/A&lt;br /&gt;
|Ana Jaramillo and Yuxin Zou (20 min)&lt;br /&gt;
|-&lt;br /&gt;
|10:30am&lt;br /&gt;
|CFDE&lt;br /&gt;
|Machine Learning Models for Linkage Prediction in Glycan Images + 5 min Q/A&lt;br /&gt;
|Campbell Ross (15 min)&lt;br /&gt;
|-&lt;br /&gt;
|10:45am&lt;br /&gt;
|CFDE&lt;br /&gt;
|A Graph-Based AI Workflow for Mining Glycan Biomarkers and Related Annotations from Publications + 5 min Q/A&lt;br /&gt;
|Cyrus Chun Hong Au Yeung (15 min)&lt;br /&gt;
|-&lt;br /&gt;
|11:00am&lt;br /&gt;
|BiomarkerKB&lt;br /&gt;
|Comprehensive Identification and LLM Based Curation of the Top 50 Clinically Relevant Disease Biomarkers + 5 min Q/A&lt;br /&gt;
|Sohana Bahl, Isaac Kim, Sparsh Gupta (15 min)&lt;br /&gt;
|-&lt;br /&gt;
|11:15am&lt;br /&gt;
|BiomarkerKB&lt;br /&gt;
|TBA + 5 min Q/A&lt;br /&gt;
|Nathan Ressom, Ana Vohralikova, Mathias Belay (15 min)&lt;br /&gt;
|-&lt;br /&gt;
|11:30am&lt;br /&gt;
|BiomarkerKB&lt;br /&gt;
|Systematic Curation and Large Language Model-Based Extraction of Alzheimer’s Disease Biomarkers&lt;br /&gt;
|John McCaffery, Alma Ogunsina, Akale Kinfe (15 min)&lt;br /&gt;
|-&lt;br /&gt;
|&#039;&#039;&#039;11:45am&#039;&#039;&#039;&lt;br /&gt;
| colspan=&amp;quot;2&amp;quot; |&#039;&#039;&#039;Open Q and A&#039;&#039;&#039;&lt;br /&gt;
|&#039;&#039;&#039;All (30 min)&#039;&#039;&#039;&lt;br /&gt;
|-&lt;br /&gt;
|&#039;&#039;&#039;12:30pm&#039;&#039;&#039;&lt;br /&gt;
| colspan=&amp;quot;3&amp;quot; |                                                                                                          &#039;&#039;&#039;LUNCH (90 mins)&#039;&#039;&#039;&lt;br /&gt;
|-&lt;br /&gt;
| colspan=&amp;quot;4&amp;quot; |                                                                                                                         &#039;&#039;Group 2 Moderator : Rene Ranzinger&#039;&#039;&lt;br /&gt;
|-&lt;br /&gt;
|1:55pm&lt;br /&gt;
| colspan=&amp;quot;2&amp;quot; |&#039;&#039;&#039;Introduction&#039;&#039;&#039;&lt;br /&gt;
|Raja Mazumder&lt;br /&gt;
|-&lt;br /&gt;
|2:00pm&lt;br /&gt;
|GlyGen&lt;br /&gt;
|GlyGen Biocuration Project + 5 min Q/A&lt;br /&gt;
|Aise Arpinar, Haravinay P. Gujjulla, Nahom Abel (20 min)&lt;br /&gt;
|-&lt;br /&gt;
|2:20pm&lt;br /&gt;
|PredictMod Curation&lt;br /&gt;
|PredictMod: PubMed Curation for Training an LLM for Recommendation + 5 min Q/A&lt;br /&gt;
|Grace Chong, Aaron Ressom, Diya Kamalabharathy (15 min)&lt;br /&gt;
|-&lt;br /&gt;
|2:35pm&lt;br /&gt;
|Glycobiology Web Development&lt;br /&gt;
|A Resource Drill Down and Visualization for the Glyspace Alliance + 5 min Q/A&lt;br /&gt;
|Diya Kamalabharathy (5 min)&lt;br /&gt;
|-&lt;br /&gt;
|2:40pm&lt;br /&gt;
|PredictMod AI-READI&lt;br /&gt;
|Robust Classification of Glycemic Health States from Continuous Glucose + 5 min Q/A&lt;br /&gt;
|Nikhil Arethiya (15 min)&lt;br /&gt;
|-&lt;br /&gt;
|2:55pm&lt;br /&gt;
|Argos&lt;br /&gt;
|Curation of Emerging Pathogen Genomes for FDA-ARGOS Database Expansion + 5 min Q/A&lt;br /&gt;
|Miao Wang (15 min)&lt;br /&gt;
|-&lt;br /&gt;
|3:10pm&lt;br /&gt;
|GlycoSiteMiner&lt;br /&gt;
|Categorization of glycan names&lt;br /&gt;
|Filmawit Zeru (15 min)&lt;br /&gt;
|-&lt;br /&gt;
|&#039;&#039;&#039;3:25pm&#039;&#039;&#039;&lt;br /&gt;
| colspan=&amp;quot;2&amp;quot; |&#039;&#039;&#039;&amp;lt;nowiki&amp;gt;Open Q and A  | Closing Remarks&amp;lt;/nowiki&amp;gt;&#039;&#039;&#039;&lt;br /&gt;
|&#039;&#039;&#039;&amp;lt;nowiki&amp;gt;All (20 min) | Raja Mazumder&amp;lt;/nowiki&amp;gt;&#039;&#039;&#039;&lt;br /&gt;
|-&lt;br /&gt;
|&#039;&#039;3:45pm&#039;&#039;&lt;br /&gt;
| colspan=&amp;quot;3&amp;quot; |&#039;&#039;Break (15mins)&#039;&#039;&lt;br /&gt;
|-&lt;br /&gt;
|4:00pm&lt;br /&gt;
| colspan=&amp;quot;2&amp;quot; |&#039;&#039;&#039;Guest Lecture: Crafting a Strong LinkedIn Profile and Resume&#039;&#039;&#039; &lt;br /&gt;
|&#039;&#039;&#039;Sara Orrick, Senior Career Consultant (45 min)&#039;&#039;&#039;&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
== &#039;&#039;&#039;Project Description&#039;&#039;&#039; ==&lt;br /&gt;
&lt;br /&gt;
=== CFDE Project ===&lt;br /&gt;
The CFDE project focuses on integrating biocuration and data standardization to generate machine learning-ready glycan datasets. It brings together curated information and structured metadata to ensure that glycan-related data is both interoperable and computationally accessible. As part of this effort, the project supports the development of machine learning models for linkage prediction in glycan images, enabling automated interpretation of glycan structures from visual representations. In addition, a graph-based AI workflow is being implemented to mine glycan biomarkers and related annotations from scientific publications, helping to uncover novel insights and associations. These approaches collectively advance the integration of glycobiology into broader biomedical research by making glycan data more usable for downstream AI applications.&lt;br /&gt;
&lt;br /&gt;
=== GlyGen Project ===&lt;br /&gt;
The GlyGen Biocuration project focuses on integrating legacy, yet valuable, data from the CarbBank and CFG databases into the GlyGen infrastructure. A key challenge is mapping metadata, such as species names and publication references, to standardized dictionaries and ontologies. While most entries have been automatically matched using custom scripts, remaining inconsistencies, including outdated, misspelled, or abbreviated terms, require manual curation using resources such as Google, PubMed, and domain-specific dictionaries and ontologies.&lt;br /&gt;
&lt;br /&gt;
=== BiomarkerKB Biocuration Project ===&lt;br /&gt;
The Biomarker Biocuration project focuses on curating biomarkers from abstracts and publications into the BiomarkerKB data model. A key challenge is the vast amount of biomarker data spread across many publications. Manual curation requires reading and interpreting the key elements of biomarker data and mapping them to the defined biomarker data model. LLM-based methods are expected to help by recognizing biomarker and condition data, mapping the extracted information into the data model, and automatically mapping other contextual and standardized data to the model, so that the data is ready for AI and machine learning.&lt;br /&gt;
&lt;br /&gt;
=== ArgosDB Curation Project ===&lt;br /&gt;
This project focuses on evaluating and curating high-quality genomes of emerging and clinically relevant pathogens, with an emphasis on fungal species. Using public genomic repositories and FDA-ARGOS inclusion criteria, the project identifies candidate organisms for database expansion to support diagnostic assay development and public health surveillance.&lt;br /&gt;
&lt;br /&gt;
=== PredictMod Curation Project ===&lt;br /&gt;
Identifying relevant and useful publicly available datasets for machine learning is currently a resource-intensive task. This curation project aims to develop a corpus for training an AI model to recommend PMIDs with publicly available datasets useful for intervention outcome prediction models. The corpus will include an annotation spreadsheet and annotated PDFs for PubMed articles relevant to prostate, lung, breast, and liver cancer, and will focus on indicators such as condition, intervention, and response.&lt;br /&gt;
&lt;br /&gt;
=== PredictMod AI-READI Project ===&lt;br /&gt;
This project creates a data-driven pipeline that uses continuous glucose monitoring (CGM) data to distinguish truly healthy individuals from those with underlying glycemic dysregulation, even if they&#039;re mislabeled. Using the AI-READI dataset funded by the NIH Bridge2AI program, the pipeline combines unsupervised clustering, handcrafted feature engineering, and LSTM-based deep learning to identify metabolic health states and extract insights into glycemic variability, with potential real-time applications in personalized health monitoring.&lt;br /&gt;
&lt;br /&gt;
=== Glycobiology Web Development ===&lt;br /&gt;
This project involves creating an EDAM ontology of glycobiology and glycoinformatics resources for the Glyspace Alliance web development. It includes compiling and organizing everything related to Glyspace to create a more user-friendly tool for accessing these resources. This process involves compiling a list of resources associated with the Glyspace Alliance and sorting them by type, topic, tool operation, and data.&lt;br /&gt;
&lt;br /&gt;
=== &#039;&#039;&#039;GlycoSiteMiner Project&#039;&#039;&#039; ===&lt;br /&gt;
This project establishes a set of rules to broadly classify glycan names into structure-based and function-based categories and applies these rules to organize entries in the glycan dictionary. These classification rules will evolve over time, enabling the creation of more refined hierarchical categories. The ultimate goal of this rule-based glycan name categorization is to support automated literature mining, specifically for identifying glycan names in PubMed articles.&lt;/div&gt;</summary>
		<author><name>Jeetvora</name></author>
	</entry>
	<entry>
		<id>https://hivelab.biochemistry.gwu.edu/wiki/index.php?title=Symposium_2025&amp;diff=902</id>
		<title>Symposium 2025</title>
		<link rel="alternate" type="text/html" href="https://hivelab.biochemistry.gwu.edu/wiki/index.php?title=Symposium_2025&amp;diff=902"/>
		<updated>2025-07-29T14:48:15Z</updated>

		<summary type="html">&lt;p&gt;Jeetvora: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;The HIVE Lab summer symposium is scheduled for Thursday, July 31, 2025. It is an exciting time for the lab volunteers and interns to present their findings on the projects they worked on for 8 weeks.&lt;br /&gt;
&lt;br /&gt;
[[File:DC.png|center|frame]]&lt;br /&gt;
&lt;br /&gt;
== &#039;&#039;&#039;Program and Information&#039;&#039;&#039; ==&lt;br /&gt;
&lt;br /&gt;
=== &#039;&#039;&#039;Symposium Venue&#039;&#039;&#039; ===&lt;br /&gt;
The HIVE Lab symposium will be held in person at The George Washington University, Washington DC, with an option to join virtually.&lt;br /&gt;
&lt;br /&gt;
In Person - Ross 637, Ross Hall, School of Health and Medical Sciences, The George Washington University, Washington DC ([https://maps.app.goo.gl/PHQmZacA4hWDvTCh6 MAP])&lt;br /&gt;
&lt;br /&gt;
Virtual - Zoom&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Zoom Link  -&#039;&#039;&#039; https://gwu-edu.zoom.us/j/98841344003    &lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Add to - [https://gwu-edu.zoom.us/meeting/tJwlc-irqj8qGtdu0zteQ__E2Fqo0fbPF6P7/calendar/google/add Google Calendar] | [https://gwu-edu.zoom.us/meeting/tJwlc-irqj8qGtdu0zteQ__E2Fqo0fbPF6P7/ics Outlook Calendar] | [https://calendar.yahoo.com/?v=60&amp;amp;VIEW=d&amp;amp;TITLE=2025%20CFDE-GlyGen-HIVE%20Lab%20Summer%20Symposium&amp;amp;in_loc=https%3A%2F%2Fgwu-edu.zoom.us%2Fj%2F98841344003&amp;amp;URL=https%3A%2F%2Fgwu-edu.zoom.us%2Fj%2F98841344003&amp;amp;ST=20250731T140000Z&amp;amp;DUR=0700&amp;amp;DESC=Jeet%20Vora%20%28GlyGen%20-%20GW%29%20is%20inviting%20you%20to%20a%20scheduled%20Zoom%20meeting.%0D%0AJoin%20Zoom%20Meeting%0D%0Ahttps%3A%2F%2Fgwu-edu.zoom.us%2Fj%2F98841344003%0D%0A%0D%0AMeeting%20ID%3A%20988%204134%204003%0D%0A%0D%0A---%0D%0A%0D%0AOne%20tap%20mobile%0D%0A%2B14042012656%2C%2C98841344003%23%20US%0D%0A%2B13097403221%2C%2C98841344003%23%20US%0D%0A%0D%0A---%0D%0A%0D%0ADial%20by%20your%20location%0D%0A%E2%80%A2%20%2B14042012656%20US%0D%0A%E2%80%A2%20%2B1%20309%20740%203221%20US%0D%0A%E2%80%A2%20%2B12122258997%20US%0D%0A%E2%80%A2%20%2B16462551997%20US%0D%0A%E2%80%A2%20%2B14805624901%20US%0D%0A%E2%80%A2%20%2B81345107597%20Japan%0D%0A%E2%80%A2%20%2B44%20201%20151%208517%20United%20Kingdom%0D%0A%E2%80%A2%20%2B442079791833%20United%20Kingdom%0D%0A%E2%80%A2%20%2B61280318153%20Australia%0D%0A%E2%80%A2%20%2B41434569439%20Switzerland%0D%0A%0D%0AMeeting%20ID%3A%20988%204134%204003%0D%0A%0D%0AFind%20your%20local%20number%3A%20https%3A%2F%2Fgwu-edu.zoom.us%2Fu%2FadB0kGNOyd%0D%0A%0D%0A---%0D%0A%0D%0AJoin%20by%20SIP%0D%0A%E2%80%A2%2098841344003%40zoomcrc.com%0D%0A%0D%0A---%0D%0A%0D%0AJoin%20by%20H.323%0D%0A%E2%80%A2%20144.195.19.161%20%28US%20West%29%0D%0A%E2%80%A2%20206.247.11.121%20%28US%20East%29%0D%0A%E2%80%A2%20115.114.131.7%20%28India%20Mumbai%29%0D%0A%E2%80%A2%20115.114.115.7%20%28India%20Hyderabad%29%0D%0A%E2%80%A2%20159.124.15.191%20%28Amsterdam%20Netherlands%29%0D%0A%E2%80%A2%20159.124.47.249%20%28Germany%29%0D%0A%E2%80%A2%20159.124.104.213%2
0%28Australia%20Sydney%29%0D%0A%E2%80%A2%20159.124.74.212%20%28Australia%20Melbourne%29%0D%0A%E2%80%A2%20170.114.180.219%20%28Singapore%29%0D%0A%E2%80%A2%2064.211.144.160%20%28Brazil%29%0D%0A%E2%80%A2%20159.124.132.243%20%28Mexico%29%0D%0A%E2%80%A2%20159.124.168.213%20%28Canada%20Toronto%29%0D%0A%E2%80%A2%20159.124.196.25%20%28Canada%20Vancouver%29%0D%0A%E2%80%A2%20170.114.194.163%20%28Japan%20Tokyo%29%0D%0A%E2%80%A2%20147.124.100.25%20%28Japan%20Osaka%29%0D%0A%0D%0AMeeting%20ID%3A%20988%204134%204003%0D%0A%0D%0A Yahoo Calendar]&#039;&#039;&#039;      &lt;br /&gt;
&lt;br /&gt;
== &#039;&#039;&#039;Agenda&#039;&#039;&#039; ==&lt;br /&gt;
All times are in Eastern Time (ET)&lt;br /&gt;
{| class=&amp;quot;wikitable&amp;quot;&lt;br /&gt;
|&#039;&#039;&#039;Time (ET)&#039;&#039;&#039;&lt;br /&gt;
|&#039;&#039;&#039;Project&#039;&#039;&#039;&lt;br /&gt;
|&#039;&#039;&#039;Title&#039;&#039;&#039;&lt;br /&gt;
|&#039;&#039;&#039;Presenter&#039;&#039;&#039;&lt;br /&gt;
|-&lt;br /&gt;
|&#039;&#039;&#039;10:00am&#039;&#039;&#039;&lt;br /&gt;
| colspan=&amp;quot;2&amp;quot; |                                                                                            &#039;&#039;&#039;Welcome and Introduction&#039;&#039;&#039;&lt;br /&gt;
|&#039;&#039;&#039;Michael Tiemeyer (10 min)&#039;&#039;&#039;&lt;br /&gt;
|-&lt;br /&gt;
| colspan=&amp;quot;4&amp;quot; |                                                                                                                         &#039;&#039;Group 1 Moderator : Nathan Edwards&#039;&#039;&lt;br /&gt;
|-&lt;br /&gt;
|10:10am &lt;br /&gt;
|CFDE&lt;br /&gt;
|Integrating Biocuration and Data Standardization to Generate Machine Learning-Ready Glycan Datasets + 5 min Q/A&lt;br /&gt;
|Ana Jaramillo and Yuxin Zou (20 min)&lt;br /&gt;
|-&lt;br /&gt;
|10:30am&lt;br /&gt;
|CFDE&lt;br /&gt;
|Machine Learning Models for Linkage Prediction in Glycan Images + 5 min Q/A&lt;br /&gt;
|Campbell Ross (15 min)&lt;br /&gt;
|-&lt;br /&gt;
|10:45am&lt;br /&gt;
|CFDE&lt;br /&gt;
|A Graph-Based AI Workflow for Mining Glycan Biomarkers and Related Annotations from Publications + 5 min Q/A&lt;br /&gt;
|Cyrus Chun Hong Au Yeung (15 min)&lt;br /&gt;
|-&lt;br /&gt;
|11:00am&lt;br /&gt;
|BiomarkerKB&lt;br /&gt;
|Comprehensive Identification and LLM Based Curation of the Top 50 Clinically Relevant Disease Biomarkers + 5 min Q/A&lt;br /&gt;
|Sohana Bahl, Isaac Kim, Sparsh Gupta (15 min)&lt;br /&gt;
|-&lt;br /&gt;
|11:15am&lt;br /&gt;
|BiomarkerKB&lt;br /&gt;
|TBA + 5 min Q/A&lt;br /&gt;
|Nathan Ressom, Ana Vohralikova, Mathias Belay (15 min)&lt;br /&gt;
|-&lt;br /&gt;
|11:30am&lt;br /&gt;
|BiomarkerKB&lt;br /&gt;
|Systematic Curation and Large Language Model-Based Extraction of Alzheimer’s Disease Biomarkers&lt;br /&gt;
|John McCaffery, Alma Ogunsina, Akale Kinfe (15 min)&lt;br /&gt;
|-&lt;br /&gt;
|&#039;&#039;&#039;11:45am&#039;&#039;&#039;&lt;br /&gt;
| colspan=&amp;quot;2&amp;quot; |&#039;&#039;&#039;Open Q and A&#039;&#039;&#039;&lt;br /&gt;
|&#039;&#039;&#039;All (30 min)&#039;&#039;&#039;&lt;br /&gt;
|-&lt;br /&gt;
|12:30pm&lt;br /&gt;
| colspan=&amp;quot;3&amp;quot; |                                                                                                          &#039;&#039;&#039;LUNCH (90 mins)&#039;&#039;&#039;&lt;br /&gt;
|-&lt;br /&gt;
| colspan=&amp;quot;4&amp;quot; |                                                                                                                         &#039;&#039;Group 2 Moderator : Rene Ranzinger&#039;&#039;&lt;br /&gt;
|-&lt;br /&gt;
|1:55pm&lt;br /&gt;
| colspan=&amp;quot;2&amp;quot; |&#039;&#039;&#039;Introduction&#039;&#039;&#039;&lt;br /&gt;
|Raja Mazumder&lt;br /&gt;
|-&lt;br /&gt;
|2:00pm&lt;br /&gt;
|GlyGen&lt;br /&gt;
|GlyGen Biocuration Project + 5 min Q/A&lt;br /&gt;
|Aise Arpinar, Haravinay P. Gujjulla, Nahom Abel (20 min)&lt;br /&gt;
|-&lt;br /&gt;
|2:20pm&lt;br /&gt;
|PredictMod Curation&lt;br /&gt;
|PredictMod: PubMed Curation for Training an LLM for Recommendation + 5 min Q/A&lt;br /&gt;
|Grace Chong, Aaron Ressom, Diya Kamalabharathy (15 min)&lt;br /&gt;
|-&lt;br /&gt;
|2:35pm&lt;br /&gt;
|Glycobiology Web Development&lt;br /&gt;
|A Resource Drill Down and Visualization for the Glyspace Alliance + 5 min Q/A&lt;br /&gt;
|Diya Kamalabharathy (5 min)&lt;br /&gt;
|-&lt;br /&gt;
|2:40pm&lt;br /&gt;
|PredictMod AI-READI&lt;br /&gt;
|Robust Classification of Glycemic Health States from Continuous Glucose + 5 min Q/A&lt;br /&gt;
|Nikhil Arethiya (15 min)&lt;br /&gt;
|-&lt;br /&gt;
|2:55pm&lt;br /&gt;
|Argos&lt;br /&gt;
|Curation of Emerging Pathogen Genomes for FDA-ARGOS Database Expansion + 5 min Q/A&lt;br /&gt;
|Miao Wang (15 min)&lt;br /&gt;
|-&lt;br /&gt;
|3:10pm&lt;br /&gt;
|GlycoSiteMiner&lt;br /&gt;
|Categorization of glycan names&lt;br /&gt;
|Filmawit Zeru (15 min)&lt;br /&gt;
|-&lt;br /&gt;
|&#039;&#039;&#039;3:25pm&#039;&#039;&#039;&lt;br /&gt;
| colspan=&amp;quot;2&amp;quot; |&#039;&#039;&#039;&amp;lt;nowiki&amp;gt;Open Q and A  | Closing Remarks&amp;lt;/nowiki&amp;gt;&#039;&#039;&#039;&lt;br /&gt;
|&#039;&#039;&#039;&amp;lt;nowiki&amp;gt;All (20 min) | Raja Mazumder&amp;lt;/nowiki&amp;gt;&#039;&#039;&#039;&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
== &#039;&#039;&#039;Project Description&#039;&#039;&#039; ==&lt;br /&gt;
&lt;br /&gt;
=== CFDE Project ===&lt;br /&gt;
The CFDE project focuses on integrating biocuration and data standardization to generate machine learning-ready glycan datasets. It brings together curated information and structured metadata to ensure that glycan-related data is both interoperable and computationally accessible. As part of this effort, the project supports the development of machine learning models for linkage prediction in glycan images, enabling automated interpretation of glycan structures from visual representations. In addition, a graph-based AI workflow is being implemented to mine glycan biomarkers and related annotations from scientific publications, helping to uncover novel insights and associations. These approaches collectively advance the integration of glycobiology into broader biomedical research by making glycan data more usable for downstream AI applications.&lt;br /&gt;
&lt;br /&gt;
=== GlyGen Project ===&lt;br /&gt;
The GlyGen Biocuration project focuses on integrating legacy, yet valuable, data from the CarbBank and CFG databases into the GlyGen infrastructure. A key challenge is mapping metadata, such as species names and publication references, to standardized dictionaries and ontologies. While most entries have been automatically matched using custom scripts, remaining inconsistencies, including outdated, misspelled, or abbreviated terms, require manual curation using resources such as Google, PubMed, and domain-specific dictionaries and ontologies.&lt;br /&gt;
&lt;br /&gt;
=== BiomarkerKB Biocuration Project ===&lt;br /&gt;
The Biomarker Biocuration project focuses on curating biomarkers from abstracts and publications into the BiomarkerKB data model. A key challenge is the vast amount of data spread across many publications. Manual curation requires reading, inferring, and understanding key elements of biomarker data and mapping them to the defined biomarker data model. LLM methodologies are expected to help in recognizing biomarker and condition data, mapping the extracted information into the data model, and automatically mapping other contextual and standardized data to the model so that the data is AI and machine learning ready.&lt;br /&gt;
&lt;br /&gt;
=== ArgosDB Curation Project ===&lt;br /&gt;
This project focuses on evaluating and curating high-quality genomes of emerging and clinically relevant pathogens, with an emphasis on fungal species. Using public genomic repositories and FDA-ARGOS inclusion criteria, the project identifies candidate organisms for database expansion to support diagnostic assay development and public health surveillance.&lt;br /&gt;
&lt;br /&gt;
=== PredictMod Curation Project ===&lt;br /&gt;
Identifying relevant and useful publicly available datasets for machine learning is currently a resource-intensive task. This curation project aims to develop a corpus for training an AI model to recommend PMIDs with publicly available datasets useful for intervention outcome prediction models. The corpus will include an annotation spreadsheet and annotated PDFs for PubMed articles relevant to prostate, lung, breast, and liver cancer, and will focus on indicators such as condition, intervention, and response.&lt;br /&gt;
&lt;br /&gt;
=== PredictMod AI-READI Project ===&lt;br /&gt;
This project creates a data-driven pipeline that uses continuous glucose monitoring (CGM) data to distinguish truly healthy individuals from those with underlying glycemic dysregulation, even if they&#039;re mislabeled. Using the AI-READI dataset funded by the NIH Bridge2AI program, the pipeline combines unsupervised clustering, handcrafted feature engineering, and LSTM-based deep learning to identify metabolic health states and extract insights into glycemic variability, with potential real-time applications in personalized health monitoring.&lt;br /&gt;
&lt;br /&gt;
=== Glycobiology Web Development ===&lt;br /&gt;
This project involves creating an EDAM ontology of glycobiology and glycoinformatics resources for the Glyspace Alliance web development effort. It compiles the resources associated with the Glyspace Alliance and sorts them by type, topic, tool operation, and data, with the aim of creating a more user-friendly tool for accessing them.&lt;br /&gt;
&lt;br /&gt;
=== &#039;&#039;&#039;GlycoSiteMiner Project&#039;&#039;&#039; ===&lt;br /&gt;
This project establishes a set of rules to broadly classify glycan names into structure-based and function-based categories and applies these rules to organize entries in the glycan dictionary. The classification rules will evolve over time, enabling the creation of more refined hierarchical categories. The ultimate goal of this rule-based glycan name categorization is to support automated literature mining, specifically for identifying glycan names in PubMed articles.&lt;/div&gt;</summary>
		<author><name>Jeetvora</name></author>
	</entry>
	<entry>
		<id>https://hivelab.biochemistry.gwu.edu/wiki/index.php?title=Symposium_2025&amp;diff=894</id>
		<title>Symposium 2025</title>
		<link rel="alternate" type="text/html" href="https://hivelab.biochemistry.gwu.edu/wiki/index.php?title=Symposium_2025&amp;diff=894"/>
		<updated>2025-07-23T12:52:15Z</updated>

		<summary type="html">&lt;p&gt;Jeetvora: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;The HIVE Lab summer symposium is scheduled for Thursday July 31, 2025. It is an exciting time for the lab volunteers and interns to present their findings on the projects they worked on for 8 weeks.&lt;br /&gt;
&lt;br /&gt;
[[File:DC.png|center|frame]]&lt;br /&gt;
&lt;br /&gt;
== &#039;&#039;&#039;Program and Information&#039;&#039;&#039; ==&lt;br /&gt;
&lt;br /&gt;
=== &#039;&#039;&#039;Symposium Venue&#039;&#039;&#039; ===&lt;br /&gt;
The HIVE Lab symposium will be held in person at The George Washington University, Washington, DC, with an option to join virtually.&lt;br /&gt;
&lt;br /&gt;
In Person - Ross 637, Ross Hall, School of Health and Medical Sciences, The George Washington University, Washington DC ([https://maps.app.goo.gl/PHQmZacA4hWDvTCh6 MAP])&lt;br /&gt;
&lt;br /&gt;
Virtual - Zoom&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Zoom Link  -&#039;&#039;&#039; https://gwu-edu.zoom.us/j/98841344003    &lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Add to - [https://gwu-edu.zoom.us/meeting/tJwlc-irqj8qGtdu0zteQ__E2Fqo0fbPF6P7/calendar/google/add Google Calendar] | [https://gwu-edu.zoom.us/meeting/tJwlc-irqj8qGtdu0zteQ__E2Fqo0fbPF6P7/ics Outlook Calendar] | [https://calendar.yahoo.com/?v=60&amp;amp;VIEW=d&amp;amp;TITLE=2025%20CFDE-GlyGen-HIVE%20Lab%20Summer%20Symposium&amp;amp;in_loc=https%3A%2F%2Fgwu-edu.zoom.us%2Fj%2F98841344003&amp;amp;URL=https%3A%2F%2Fgwu-edu.zoom.us%2Fj%2F98841344003&amp;amp;ST=20250731T140000Z&amp;amp;DUR=0700&amp;amp;DESC=Jeet%20Vora%20%28GlyGen%20-%20GW%29%20is%20inviting%20you%20to%20a%20scheduled%20Zoom%20meeting.%0D%0AJoin%20Zoom%20Meeting%0D%0Ahttps%3A%2F%2Fgwu-edu.zoom.us%2Fj%2F98841344003%0D%0A%0D%0AMeeting%20ID%3A%20988%204134%204003%0D%0A%0D%0A---%0D%0A%0D%0AOne%20tap%20mobile%0D%0A%2B14042012656%2C%2C98841344003%23%20US%0D%0A%2B13097403221%2C%2C98841344003%23%20US%0D%0A%0D%0A---%0D%0A%0D%0ADial%20by%20your%20location%0D%0A%E2%80%A2%20%2B14042012656%20US%0D%0A%E2%80%A2%20%2B1%20309%20740%203221%20US%0D%0A%E2%80%A2%20%2B12122258997%20US%0D%0A%E2%80%A2%20%2B16462551997%20US%0D%0A%E2%80%A2%20%2B14805624901%20US%0D%0A%E2%80%A2%20%2B81345107597%20Japan%0D%0A%E2%80%A2%20%2B44%20201%20151%208517%20United%20Kingdom%0D%0A%E2%80%A2%20%2B442079791833%20United%20Kingdom%0D%0A%E2%80%A2%20%2B61280318153%20Australia%0D%0A%E2%80%A2%20%2B41434569439%20Switzerland%0D%0A%0D%0AMeeting%20ID%3A%20988%204134%204003%0D%0A%0D%0AFind%20your%20local%20number%3A%20https%3A%2F%2Fgwu-edu.zoom.us%2Fu%2FadB0kGNOyd%0D%0A%0D%0A---%0D%0A%0D%0AJoin%20by%20SIP%0D%0A%E2%80%A2%2098841344003%40zoomcrc.com%0D%0A%0D%0A---%0D%0A%0D%0AJoin%20by%20H.323%0D%0A%E2%80%A2%20144.195.19.161%20%28US%20West%29%0D%0A%E2%80%A2%20206.247.11.121%20%28US%20East%29%0D%0A%E2%80%A2%20115.114.131.7%20%28India%20Mumbai%29%0D%0A%E2%80%A2%20115.114.115.7%20%28India%20Hyderabad%29%0D%0A%E2%80%A2%20159.124.15.191%20%28Amsterdam%20Netherlands%29%0D%0A%E2%80%A2%20159.124.47.249%20%28Germany%29%0D%0A%E2%80%A2%20159.124.104.213%2
0%28Australia%20Sydney%29%0D%0A%E2%80%A2%20159.124.74.212%20%28Australia%20Melbourne%29%0D%0A%E2%80%A2%20170.114.180.219%20%28Singapore%29%0D%0A%E2%80%A2%2064.211.144.160%20%28Brazil%29%0D%0A%E2%80%A2%20159.124.132.243%20%28Mexico%29%0D%0A%E2%80%A2%20159.124.168.213%20%28Canada%20Toronto%29%0D%0A%E2%80%A2%20159.124.196.25%20%28Canada%20Vancouver%29%0D%0A%E2%80%A2%20170.114.194.163%20%28Japan%20Tokyo%29%0D%0A%E2%80%A2%20147.124.100.25%20%28Japan%20Osaka%29%0D%0A%0D%0AMeeting%20ID%3A%20988%204134%204003%0D%0A%0D%0A Yahoo Calendar]&#039;&#039;&#039;      &lt;br /&gt;
&lt;br /&gt;
== &#039;&#039;&#039;Agenda&#039;&#039;&#039; ==&lt;br /&gt;
All times are in Eastern Time (ET)&lt;br /&gt;
{| class=&amp;quot;wikitable&amp;quot;&lt;br /&gt;
|&#039;&#039;&#039;Time (ET)&#039;&#039;&#039;&lt;br /&gt;
|&#039;&#039;&#039;Project&#039;&#039;&#039;&lt;br /&gt;
|&#039;&#039;&#039;Title&#039;&#039;&#039;&lt;br /&gt;
|&#039;&#039;&#039;Presenter&#039;&#039;&#039;&lt;br /&gt;
|-&lt;br /&gt;
|&#039;&#039;&#039;10:00am&#039;&#039;&#039;&lt;br /&gt;
| colspan=&amp;quot;2&amp;quot; |                                                                                            &#039;&#039;&#039;Welcome and Introduction&#039;&#039;&#039;&lt;br /&gt;
|&#039;&#039;&#039;Michael Tiemeyer (10 min)&#039;&#039;&#039;&lt;br /&gt;
|-&lt;br /&gt;
| colspan=&amp;quot;4&amp;quot; |                                                                                                                         &#039;&#039;Group 1 Moderator : Nathan Edwards&#039;&#039;&lt;br /&gt;
|-&lt;br /&gt;
|10:10am &lt;br /&gt;
|CFDE&lt;br /&gt;
|Integrating Biocuration and Data Standardization to Generate Machine Learning-Ready Glycan Datasets + 5 min Q/A&lt;br /&gt;
|Ana Jaramillo and Yuxin Zou (20 min)&lt;br /&gt;
|-&lt;br /&gt;
|10:30am&lt;br /&gt;
|CFDE&lt;br /&gt;
|Machine Learning Models for Linkage Prediction in Glycan Images + 5 min Q/A&lt;br /&gt;
|Campbell Ross (15 min)&lt;br /&gt;
|-&lt;br /&gt;
|10:45am&lt;br /&gt;
|CFDE&lt;br /&gt;
|A Graph-Based AI Workflow for Mining Glycan Biomarkers and Related Annotations from Publications + 5 min Q/A&lt;br /&gt;
|Cyrus Chun Hong Au Yeung (15 min)&lt;br /&gt;
|-&lt;br /&gt;
|11:00am&lt;br /&gt;
|BiomarkerKB&lt;br /&gt;
|Comprehensive Identification and LLM Based Curation of the Top 50 Clinically Relevant Disease Biomarkers + 5 min Q/A&lt;br /&gt;
|Sohana Bahl, Isaac Kim, Sparsh Gupta (15 min)&lt;br /&gt;
|-&lt;br /&gt;
|11:15am&lt;br /&gt;
|BiomarkerKB&lt;br /&gt;
|TBA + 5 min Q/A&lt;br /&gt;
|Nathan Ressom, Ana Vohralikova, Mathias Belay (15 min)&lt;br /&gt;
|-&lt;br /&gt;
|11:30am&lt;br /&gt;
|BiomarkerKB&lt;br /&gt;
|Systematic Curation and Large Language Model-Based Extraction of Alzheimer’s Disease Biomarkers&lt;br /&gt;
|John McCaffery, Alma Ogunsina, Akale Kinfe (15 min)&lt;br /&gt;
|-&lt;br /&gt;
|&#039;&#039;&#039;11:45am&#039;&#039;&#039;&lt;br /&gt;
| colspan=&amp;quot;2&amp;quot; |&#039;&#039;&#039;Open Q and A&#039;&#039;&#039;&lt;br /&gt;
|&#039;&#039;&#039;All (30 min)&#039;&#039;&#039;&lt;br /&gt;
|-&lt;br /&gt;
|12:30pm&lt;br /&gt;
| colspan=&amp;quot;3&amp;quot; |                                                                                                          &#039;&#039;&#039;LUNCH (90 mins)&#039;&#039;&#039;&lt;br /&gt;
|-&lt;br /&gt;
| colspan=&amp;quot;4&amp;quot; |                                                                                                                         &#039;&#039;Group 2 Moderator : Rene Ranzinger&#039;&#039;&lt;br /&gt;
|-&lt;br /&gt;
|1:55pm&lt;br /&gt;
| colspan=&amp;quot;2&amp;quot; |&#039;&#039;&#039;Introduction&#039;&#039;&#039;&lt;br /&gt;
|Raja Mazumder&lt;br /&gt;
|-&lt;br /&gt;
|2:00pm&lt;br /&gt;
|GlyGen&lt;br /&gt;
|GlyGen Biocuration Project + 5 min Q/A&lt;br /&gt;
|Aise Arpinar, Haravinay P. Gujjulla, Nahom Abel (20 min)&lt;br /&gt;
|-&lt;br /&gt;
|2:20pm&lt;br /&gt;
|PredictMod Curation&lt;br /&gt;
|PredictMod: PubMed Curation for Training an LLM for Recommendation + 5 min Q/A&lt;br /&gt;
|Grace Chong, Aaron Ressom, Diya Kamalabharathy (15 min)&lt;br /&gt;
|-&lt;br /&gt;
|2:35pm&lt;br /&gt;
|Glycobiology Web Development&lt;br /&gt;
|A Resource Drill Down and Visualization for the Glyspace Alliance + 5 min Q/A&lt;br /&gt;
|Diya Kamalabharathy (5 min)&lt;br /&gt;
|-&lt;br /&gt;
|2:40pm&lt;br /&gt;
|PredictMod AI-READI&lt;br /&gt;
|Robust Classification of Glycemic Health States from Continuous Glucose + 5 min Q/A&lt;br /&gt;
|Nikhil Arethiya (15 min)&lt;br /&gt;
|-&lt;br /&gt;
|2:55pm&lt;br /&gt;
|Argos&lt;br /&gt;
|Curation of Emerging Pathogen Genomes for FDA-ARGOS Database Expansion + 5 min Q/A&lt;br /&gt;
|Miao Wang (15 min)&lt;br /&gt;
|-&lt;br /&gt;
|3:10pm&lt;br /&gt;
|GlycoSiteMiner&lt;br /&gt;
|TBA&lt;br /&gt;
|(15 min)&lt;br /&gt;
|-&lt;br /&gt;
|&#039;&#039;&#039;3:25pm&#039;&#039;&#039;&lt;br /&gt;
| colspan=&amp;quot;2&amp;quot; |&#039;&#039;&#039;&amp;lt;nowiki&amp;gt;Open Q and A  | Closing Remarks&amp;lt;/nowiki&amp;gt;&#039;&#039;&#039;&lt;br /&gt;
|&#039;&#039;&#039;&amp;lt;nowiki&amp;gt;All (20 min) | Raja Mazumder&amp;lt;/nowiki&amp;gt;&#039;&#039;&#039;&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
== &#039;&#039;&#039;Project Description&#039;&#039;&#039; ==&lt;br /&gt;
&lt;br /&gt;
=== CFDE Project ===&lt;br /&gt;
The CFDE project focuses on integrating biocuration and data standardization to generate machine learning-ready glycan datasets. It brings together curated information and structured metadata to ensure that glycan-related data is both interoperable and computationally accessible. As part of this effort, the project supports the development of machine learning models for linkage prediction in glycan images, enabling automated interpretation of glycan structures from visual representations. In addition, a graph-based AI workflow is being implemented to mine glycan biomarkers and related annotations from scientific publications, helping to uncover novel insights and associations. These approaches collectively advance the integration of glycobiology into broader biomedical research by making glycan data more usable for downstream AI applications.&lt;br /&gt;
&lt;br /&gt;
=== GlyGen Project ===&lt;br /&gt;
The GlyGen Biocuration project focuses on integrating legacy, yet valuable, data from the CarbBank and CFG databases into the GlyGen infrastructure. A key challenge is mapping metadata, such as species names and publication references, to standardized dictionaries and ontologies. While most entries have been automatically matched using custom scripts, remaining inconsistencies, including outdated, misspelled, or abbreviated terms, require manual curation using resources such as Google, PubMed, and domain-specific dictionaries and ontologies.&lt;br /&gt;
&lt;br /&gt;
=== BiomarkerKB Biocuration Project ===&lt;br /&gt;
The Biomarker Biocuration project focuses on curating biomarkers from abstracts and publications into the BiomarkerKB data model. A key challenge is the vast amount of data spread across many publications. Manual curation requires reading, inferring, and understanding key elements of biomarker data and mapping them to the defined biomarker data model. LLM methodologies are expected to help in recognizing biomarker and condition data, mapping the extracted information into the data model, and automatically mapping other contextual and standardized data to the model so that the data is AI and machine learning ready.&lt;br /&gt;
&lt;br /&gt;
=== ArgosDB Curation Project ===&lt;br /&gt;
This project focuses on evaluating and curating high-quality genomes of emerging and clinically relevant pathogens, with an emphasis on fungal species. Using public genomic repositories and FDA-ARGOS inclusion criteria, the project identifies candidate organisms for database expansion to support diagnostic assay development and public health surveillance.&lt;br /&gt;
&lt;br /&gt;
=== PredictMod Curation Project ===&lt;br /&gt;
Identifying relevant and useful publicly available datasets for machine learning is currently a resource-intensive task. This curation project aims to develop a corpus for training an AI model to recommend PMIDs with publicly available datasets useful for intervention outcome prediction models. The corpus will include an annotation spreadsheet and annotated PDFs for PubMed articles relevant to prostate, lung, breast, and liver cancer, and will focus on indicators such as condition, intervention, and response.&lt;br /&gt;
&lt;br /&gt;
=== PredictMod AI-READI Project ===&lt;br /&gt;
This project creates a data-driven pipeline that uses continuous glucose monitoring (CGM) data to distinguish truly healthy individuals from those with underlying glycemic dysregulation, even if they&#039;re mislabeled. Using the AI-READI dataset funded by the NIH Bridge2AI program, the pipeline combines unsupervised clustering, handcrafted feature engineering, and LSTM-based deep learning to identify metabolic health states and extract insights into glycemic variability, with potential real-time applications in personalized health monitoring.&lt;br /&gt;
&lt;br /&gt;
=== Glycobiology Web Development ===&lt;br /&gt;
This project involves creating an EDAM ontology of glycobiology and glycoinformatics resources for the Glyspace Alliance web development effort. It compiles the resources associated with the Glyspace Alliance and sorts them by type, topic, tool operation, and data, with the aim of creating a more user-friendly tool for accessing them.&lt;/div&gt;</summary>
		<author><name>Jeetvora</name></author>
	</entry>
	<entry>
		<id>https://hivelab.biochemistry.gwu.edu/wiki/index.php?title=Symposium_2025&amp;diff=889</id>
		<title>Symposium 2025</title>
		<link rel="alternate" type="text/html" href="https://hivelab.biochemistry.gwu.edu/wiki/index.php?title=Symposium_2025&amp;diff=889"/>
		<updated>2025-07-18T18:27:56Z</updated>

		<summary type="html">&lt;p&gt;Jeetvora: /* Symposium Venue */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;The HIVE Lab summer symposium is scheduled for Thursday July 31, 2025. It is an exciting time for the lab volunteers and interns to present their findings on the projects they worked on for 8 weeks.&lt;br /&gt;
&lt;br /&gt;
[[File:DC.png|center|frame]]&lt;br /&gt;
&lt;br /&gt;
== &#039;&#039;&#039;Program and Information&#039;&#039;&#039; ==&lt;br /&gt;
&lt;br /&gt;
=== &#039;&#039;&#039;Symposium Venue&#039;&#039;&#039; ===&lt;br /&gt;
The HIVE Lab symposium will be held in person at The George Washington University, Washington, DC, with an option to join virtually.&lt;br /&gt;
&lt;br /&gt;
In Person - Ross 637, Ross Hall, School of Health and Medical Sciences, The George Washington University, Washington DC ([https://maps.app.goo.gl/PHQmZacA4hWDvTCh6 MAP])&lt;br /&gt;
&lt;br /&gt;
Virtual - Zoom&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Zoom Link  -&#039;&#039;&#039; https://gwu-edu.zoom.us/j/98841344003    &lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Add to - [https://gwu-edu.zoom.us/meeting/tJwlc-irqj8qGtdu0zteQ__E2Fqo0fbPF6P7/calendar/google/add Google Calendar] | [https://gwu-edu.zoom.us/meeting/tJwlc-irqj8qGtdu0zteQ__E2Fqo0fbPF6P7/ics Outlook Calendar] | [https://calendar.yahoo.com/?v=60&amp;amp;VIEW=d&amp;amp;TITLE=2025%20CFDE-GlyGen-HIVE%20Lab%20Summer%20Symposium&amp;amp;in_loc=https%3A%2F%2Fgwu-edu.zoom.us%2Fj%2F98841344003&amp;amp;URL=https%3A%2F%2Fgwu-edu.zoom.us%2Fj%2F98841344003&amp;amp;ST=20250731T140000Z&amp;amp;DUR=0700&amp;amp;DESC=Jeet%20Vora%20%28GlyGen%20-%20GW%29%20is%20inviting%20you%20to%20a%20scheduled%20Zoom%20meeting.%0D%0AJoin%20Zoom%20Meeting%0D%0Ahttps%3A%2F%2Fgwu-edu.zoom.us%2Fj%2F98841344003%0D%0A%0D%0AMeeting%20ID%3A%20988%204134%204003%0D%0A%0D%0A---%0D%0A%0D%0AOne%20tap%20mobile%0D%0A%2B14042012656%2C%2C98841344003%23%20US%0D%0A%2B13097403221%2C%2C98841344003%23%20US%0D%0A%0D%0A---%0D%0A%0D%0ADial%20by%20your%20location%0D%0A%E2%80%A2%20%2B14042012656%20US%0D%0A%E2%80%A2%20%2B1%20309%20740%203221%20US%0D%0A%E2%80%A2%20%2B12122258997%20US%0D%0A%E2%80%A2%20%2B16462551997%20US%0D%0A%E2%80%A2%20%2B14805624901%20US%0D%0A%E2%80%A2%20%2B81345107597%20Japan%0D%0A%E2%80%A2%20%2B44%20201%20151%208517%20United%20Kingdom%0D%0A%E2%80%A2%20%2B442079791833%20United%20Kingdom%0D%0A%E2%80%A2%20%2B61280318153%20Australia%0D%0A%E2%80%A2%20%2B41434569439%20Switzerland%0D%0A%0D%0AMeeting%20ID%3A%20988%204134%204003%0D%0A%0D%0AFind%20your%20local%20number%3A%20https%3A%2F%2Fgwu-edu.zoom.us%2Fu%2FadB0kGNOyd%0D%0A%0D%0A---%0D%0A%0D%0AJoin%20by%20SIP%0D%0A%E2%80%A2%2098841344003%40zoomcrc.com%0D%0A%0D%0A---%0D%0A%0D%0AJoin%20by%20H.323%0D%0A%E2%80%A2%20144.195.19.161%20%28US%20West%29%0D%0A%E2%80%A2%20206.247.11.121%20%28US%20East%29%0D%0A%E2%80%A2%20115.114.131.7%20%28India%20Mumbai%29%0D%0A%E2%80%A2%20115.114.115.7%20%28India%20Hyderabad%29%0D%0A%E2%80%A2%20159.124.15.191%20%28Amsterdam%20Netherlands%29%0D%0A%E2%80%A2%20159.124.47.249%20%28Germany%29%0D%0A%E2%80%A2%20159.124.104.213%2
0%28Australia%20Sydney%29%0D%0A%E2%80%A2%20159.124.74.212%20%28Australia%20Melbourne%29%0D%0A%E2%80%A2%20170.114.180.219%20%28Singapore%29%0D%0A%E2%80%A2%2064.211.144.160%20%28Brazil%29%0D%0A%E2%80%A2%20159.124.132.243%20%28Mexico%29%0D%0A%E2%80%A2%20159.124.168.213%20%28Canada%20Toronto%29%0D%0A%E2%80%A2%20159.124.196.25%20%28Canada%20Vancouver%29%0D%0A%E2%80%A2%20170.114.194.163%20%28Japan%20Tokyo%29%0D%0A%E2%80%A2%20147.124.100.25%20%28Japan%20Osaka%29%0D%0A%0D%0AMeeting%20ID%3A%20988%204134%204003%0D%0A%0D%0A Yahoo Calendar]&#039;&#039;&#039;      &lt;br /&gt;
&lt;br /&gt;
== &#039;&#039;&#039;Agenda&#039;&#039;&#039; ==&lt;br /&gt;
All times are in Eastern Time (ET)&lt;br /&gt;
{| class=&amp;quot;wikitable&amp;quot;&lt;br /&gt;
|&#039;&#039;&#039;Time (ET)&#039;&#039;&#039;&lt;br /&gt;
|&#039;&#039;&#039;Project&#039;&#039;&#039;&lt;br /&gt;
|&#039;&#039;&#039;Title&#039;&#039;&#039;&lt;br /&gt;
|&#039;&#039;&#039;Presenter&#039;&#039;&#039;&lt;br /&gt;
|-&lt;br /&gt;
|&#039;&#039;&#039;10:00am&#039;&#039;&#039;&lt;br /&gt;
| colspan=&amp;quot;2&amp;quot; |                                                                                            &#039;&#039;&#039;Welcome and Introduction&#039;&#039;&#039;&lt;br /&gt;
|&#039;&#039;&#039;Michael Tiemeyer (10 min)&#039;&#039;&#039;&lt;br /&gt;
|-&lt;br /&gt;
| colspan=&amp;quot;4&amp;quot; |                                                                                                                         &#039;&#039;Group 1 Moderator : Nathan Edwards&#039;&#039;&lt;br /&gt;
|-&lt;br /&gt;
|10:10am &lt;br /&gt;
|CFDE&lt;br /&gt;
|Integrating Biocuration and Data Standardization to Generate Machine Learning-Ready Glycan Datasets&lt;br /&gt;
|Ana Jaramillo and Yuxin Zou (20 min)&lt;br /&gt;
|-&lt;br /&gt;
|10:30am&lt;br /&gt;
|CFDE&lt;br /&gt;
|Machine Learning Models for Linkage Prediction in Glycan Images&lt;br /&gt;
|Campbell Ross (15 min)&lt;br /&gt;
|-&lt;br /&gt;
|10:45am&lt;br /&gt;
|CFDE&lt;br /&gt;
|A Graph-Based AI Workflow for Mining Glycan Biomarkers and Related Annotations from Publications&lt;br /&gt;
|Cyrus Chun Hong Au Yeung (15 min)&lt;br /&gt;
|-&lt;br /&gt;
|11:00am&lt;br /&gt;
|BiomarkerKB&lt;br /&gt;
|TBA&lt;br /&gt;
|Sohana Bahl, Isaac Kim, Sparsh Gupta (15 min)&lt;br /&gt;
|-&lt;br /&gt;
|11:15am&lt;br /&gt;
|BiomarkerKB&lt;br /&gt;
|TBA&lt;br /&gt;
|Nathan Ressom, Ana Vohralikova, Mathias Belay (15 min)&lt;br /&gt;
|-&lt;br /&gt;
|11:30am&lt;br /&gt;
|BiomarkerKB&lt;br /&gt;
|TBA&lt;br /&gt;
|John McCaffery, Alma Ogunsina, Akale Kinfe (15 min)&lt;br /&gt;
|-&lt;br /&gt;
|&#039;&#039;&#039;11:45am&#039;&#039;&#039;&lt;br /&gt;
| colspan=&amp;quot;2&amp;quot; |&#039;&#039;&#039;Open Q and A&#039;&#039;&#039;&lt;br /&gt;
|&#039;&#039;&#039;All (30 min)&#039;&#039;&#039;&lt;br /&gt;
|-&lt;br /&gt;
|12:30pm&lt;br /&gt;
| colspan=&amp;quot;3&amp;quot; |                                                                                                          &#039;&#039;&#039;LUNCH (90 mins)&#039;&#039;&#039;&lt;br /&gt;
|-&lt;br /&gt;
| colspan=&amp;quot;4&amp;quot; |                                                                                                                         &#039;&#039;Group 2 Moderator : Rene Ranzinger&#039;&#039;&lt;br /&gt;
|-&lt;br /&gt;
|2:00pm&lt;br /&gt;
|GlyGen&lt;br /&gt;
|GlyGen Biocuration Project&lt;br /&gt;
|Aise Arpinar, Haravinay P. Gujjulla, Nahom Abel (20 min)&lt;br /&gt;
|-&lt;br /&gt;
|2:20pm&lt;br /&gt;
|Glycobiology Web Development&lt;br /&gt;
|A Resource Drill Down and Visualization for the Glyspace Alliance&lt;br /&gt;
|Diya Kamalabharathy (5 min)&lt;br /&gt;
|-&lt;br /&gt;
|2:25pm&lt;br /&gt;
|PredictMod Curation&lt;br /&gt;
|PredictMod: PubMed Curation for Training an LLM for Recommendation&lt;br /&gt;
|Grace Chong, Aaron Ressom, Diya Kamalabharathy (15 min)&lt;br /&gt;
|-&lt;br /&gt;
|2:40pm&lt;br /&gt;
|PredictMod AI-READI&lt;br /&gt;
|Robust Classification of Glycemic Health States from Continuous Glucose&lt;br /&gt;
|Nikhil Arethiya (15 min)&lt;br /&gt;
|-&lt;br /&gt;
|2:55pm&lt;br /&gt;
|Argos&lt;br /&gt;
|Curation of Emerging Pathogen Genomes for FDA-ARGOS Database Expansion&lt;br /&gt;
|Miao Wang (15 min)&lt;br /&gt;
|-&lt;br /&gt;
|3:10pm&lt;br /&gt;
|GlycoSiteMiner&lt;br /&gt;
|TBA&lt;br /&gt;
|(15 min)&lt;br /&gt;
|-&lt;br /&gt;
|&#039;&#039;&#039;3:25pm&#039;&#039;&#039;&lt;br /&gt;
| colspan=&amp;quot;2&amp;quot; |&#039;&#039;&#039;&amp;lt;nowiki&amp;gt;Open Q and A  | Closing Remarks&amp;lt;/nowiki&amp;gt;&#039;&#039;&#039;&lt;br /&gt;
|&#039;&#039;&#039;&amp;lt;nowiki&amp;gt;All (20 min) | Raja Mazumder&amp;lt;/nowiki&amp;gt;&#039;&#039;&#039;&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
== &#039;&#039;&#039;Project Description&#039;&#039;&#039; ==&lt;br /&gt;
&lt;br /&gt;
=== CFDE Project ===&lt;br /&gt;
The CFDE project focuses on integrating biocuration and data standardization to generate machine learning-ready glycan datasets. It brings together curated information and structured metadata to ensure that glycan-related data is both interoperable and computationally accessible. As part of this effort, the project supports the development of machine learning models for linkage prediction in glycan images, enabling automated interpretation of glycan structures from visual representations. In addition, a graph-based AI workflow is being implemented to mine glycan biomarkers and related annotations from scientific publications, helping to uncover novel insights and associations. These approaches collectively advance the integration of glycobiology into broader biomedical research by making glycan data more usable for downstream AI applications.&lt;br /&gt;
&lt;br /&gt;
=== GlyGen Project ===&lt;br /&gt;
The GlyGen Biocuration project focuses on integrating legacy, yet valuable, data from the CarbBank and CFG databases into the GlyGen infrastructure. A key challenge is mapping metadata, such as species names and publication references, to standardized dictionaries and ontologies. While most entries have been automatically matched using custom scripts, remaining inconsistencies, including outdated, misspelled, or abbreviated terms, require manual curation using resources such as Google, PubMed, and domain-specific dictionaries and ontologies.&lt;br /&gt;
&lt;br /&gt;
=== BiomarkerKB Biocuration Project ===&lt;br /&gt;
The Biomarker Biocuration project focuses on curating biomarkers from abstracts and publications into the BiomarkerKB data model. A key challenge is the vast amount of data spread across many publications. Manual curation requires reading, inferring, and understanding key elements of biomarker data and mapping them to the defined biomarker data model. LLM methodologies are expected to help in recognizing biomarker and condition data, mapping the extracted information into the data model, and automatically mapping other contextual and standardized data to the model so that the data is AI and machine learning ready.&lt;br /&gt;
&lt;br /&gt;
=== ArgosDB Curation Project ===&lt;br /&gt;
This project focuses on evaluating and curating high-quality genomes of emerging and clinically relevant pathogens, with an emphasis on fungal species. Using public genomic repositories and FDA-ARGOS inclusion criteria, the project identifies candidate organisms for database expansion to support diagnostic assay development and public health surveillance.&lt;/div&gt;</summary>
		<author><name>Jeetvora</name></author>
	</entry>
	<entry>
		<id>https://hivelab.biochemistry.gwu.edu/wiki/index.php?title=Symposium_2025&amp;diff=888</id>
		<title>Symposium 2025</title>
		<link rel="alternate" type="text/html" href="https://hivelab.biochemistry.gwu.edu/wiki/index.php?title=Symposium_2025&amp;diff=888"/>
		<updated>2025-07-18T18:18:34Z</updated>

		<summary type="html">&lt;p&gt;Jeetvora: /* Agenda */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;The HIVE Lab summer symposium is scheduled for Thursday July 31, 2025. It is an exciting time for the lab volunteers and interns to present their findings on the projects they worked on for 8 weeks.&lt;br /&gt;
&lt;br /&gt;
[[File:DC.png|center|frame]]&lt;br /&gt;
&lt;br /&gt;
== &#039;&#039;&#039;Program and Information&#039;&#039;&#039; ==&lt;br /&gt;
&lt;br /&gt;
=== &#039;&#039;&#039;Symposium Venue&#039;&#039;&#039; ===&lt;br /&gt;
The HIVE Lab symposium will be held in person at The George Washington University, Washington, DC, with an option to join virtually.&lt;br /&gt;
&lt;br /&gt;
In Person - Ross 647, Ross Hall, School of Health and Medical Sciences, The George Washington University, Washington DC ([https://maps.app.goo.gl/PHQmZacA4hWDvTCh6 MAP])&lt;br /&gt;
&lt;br /&gt;
Virtual - Zoom&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Zoom Link  -&#039;&#039;&#039; https://gwu-edu.zoom.us/j/98841344003    &lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Add to - [https://gwu-edu.zoom.us/meeting/tJwlc-irqj8qGtdu0zteQ__E2Fqo0fbPF6P7/calendar/google/add Google Calendar] | [https://gwu-edu.zoom.us/meeting/tJwlc-irqj8qGtdu0zteQ__E2Fqo0fbPF6P7/ics Outlook Calendar] | [https://calendar.yahoo.com/?v=60&amp;amp;VIEW=d&amp;amp;TITLE=2025%20CFDE-GlyGen-HIVE%20Lab%20Summer%20Symposium&amp;amp;in_loc=https%3A%2F%2Fgwu-edu.zoom.us%2Fj%2F98841344003&amp;amp;URL=https%3A%2F%2Fgwu-edu.zoom.us%2Fj%2F98841344003&amp;amp;ST=20250731T140000Z&amp;amp;DUR=0700&amp;amp;DESC=Jeet%20Vora%20%28GlyGen%20-%20GW%29%20is%20inviting%20you%20to%20a%20scheduled%20Zoom%20meeting.%0D%0AJoin%20Zoom%20Meeting%0D%0Ahttps%3A%2F%2Fgwu-edu.zoom.us%2Fj%2F98841344003%0D%0A%0D%0AMeeting%20ID%3A%20988%204134%204003%0D%0A%0D%0A---%0D%0A%0D%0AOne%20tap%20mobile%0D%0A%2B14042012656%2C%2C98841344003%23%20US%0D%0A%2B13097403221%2C%2C98841344003%23%20US%0D%0A%0D%0A---%0D%0A%0D%0ADial%20by%20your%20location%0D%0A%E2%80%A2%20%2B14042012656%20US%0D%0A%E2%80%A2%20%2B1%20309%20740%203221%20US%0D%0A%E2%80%A2%20%2B12122258997%20US%0D%0A%E2%80%A2%20%2B16462551997%20US%0D%0A%E2%80%A2%20%2B14805624901%20US%0D%0A%E2%80%A2%20%2B81345107597%20Japan%0D%0A%E2%80%A2%20%2B44%20201%20151%208517%20United%20Kingdom%0D%0A%E2%80%A2%20%2B442079791833%20United%20Kingdom%0D%0A%E2%80%A2%20%2B61280318153%20Australia%0D%0A%E2%80%A2%20%2B41434569439%20Switzerland%0D%0A%0D%0AMeeting%20ID%3A%20988%204134%204003%0D%0A%0D%0AFind%20your%20local%20number%3A%20https%3A%2F%2Fgwu-edu.zoom.us%2Fu%2FadB0kGNOyd%0D%0A%0D%0A---%0D%0A%0D%0AJoin%20by%20SIP%0D%0A%E2%80%A2%2098841344003%40zoomcrc.com%0D%0A%0D%0A---%0D%0A%0D%0AJoin%20by%20H.323%0D%0A%E2%80%A2%20144.195.19.161%20%28US%20West%29%0D%0A%E2%80%A2%20206.247.11.121%20%28US%20East%29%0D%0A%E2%80%A2%20115.114.131.7%20%28India%20Mumbai%29%0D%0A%E2%80%A2%20115.114.115.7%20%28India%20Hyderabad%29%0D%0A%E2%80%A2%20159.124.15.191%20%28Amsterdam%20Netherlands%29%0D%0A%E2%80%A2%20159.124.47.249%20%28Germany%29%0D%0A%E2%80%A2%20159.124.104.213%2
0%28Australia%20Sydney%29%0D%0A%E2%80%A2%20159.124.74.212%20%28Australia%20Melbourne%29%0D%0A%E2%80%A2%20170.114.180.219%20%28Singapore%29%0D%0A%E2%80%A2%2064.211.144.160%20%28Brazil%29%0D%0A%E2%80%A2%20159.124.132.243%20%28Mexico%29%0D%0A%E2%80%A2%20159.124.168.213%20%28Canada%20Toronto%29%0D%0A%E2%80%A2%20159.124.196.25%20%28Canada%20Vancouver%29%0D%0A%E2%80%A2%20170.114.194.163%20%28Japan%20Tokyo%29%0D%0A%E2%80%A2%20147.124.100.25%20%28Japan%20Osaka%29%0D%0A%0D%0AMeeting%20ID%3A%20988%204134%204003%0D%0A%0D%0A Yahoo Calendar]&#039;&#039;&#039;      &lt;br /&gt;
&lt;br /&gt;
== &#039;&#039;&#039;Agenda&#039;&#039;&#039; ==&lt;br /&gt;
All times are in Eastern Time (ET).&lt;br /&gt;
{| class=&amp;quot;wikitable&amp;quot;&lt;br /&gt;
|&#039;&#039;&#039;Time (ET)&#039;&#039;&#039;&lt;br /&gt;
|&#039;&#039;&#039;Project&#039;&#039;&#039;&lt;br /&gt;
|&#039;&#039;&#039;Title&#039;&#039;&#039;&lt;br /&gt;
|&#039;&#039;&#039;Presenter&#039;&#039;&#039;&lt;br /&gt;
|-&lt;br /&gt;
|&#039;&#039;&#039;10:00am&#039;&#039;&#039;&lt;br /&gt;
| colspan=&amp;quot;2&amp;quot; |                                                                                            &#039;&#039;&#039;Welcome and Introduction&#039;&#039;&#039;&lt;br /&gt;
|&#039;&#039;&#039;Michael Tiemeyer (10 min)&#039;&#039;&#039;&lt;br /&gt;
|-&lt;br /&gt;
| colspan=&amp;quot;4&amp;quot; |                                                                                                                         &#039;&#039;Group 1 Moderator : Nathan Edwards&#039;&#039;&lt;br /&gt;
|-&lt;br /&gt;
|10:10am &lt;br /&gt;
|CFDE&lt;br /&gt;
|Integrating Biocuration and Data Standardization to Generate Machine Learning-Ready Glycan Datasets&lt;br /&gt;
|Ana Jaramillo and Yuxin Zou (20 min)&lt;br /&gt;
|-&lt;br /&gt;
|10:30am&lt;br /&gt;
|CFDE&lt;br /&gt;
|Machine Learning Models for Linkage Prediction in Glycan Images&lt;br /&gt;
|Campbell Ross (15 min)&lt;br /&gt;
|-&lt;br /&gt;
|10:45am&lt;br /&gt;
|CFDE&lt;br /&gt;
|A Graph-Based AI Workflow for Mining Glycan Biomarkers and Related Annotations from Publications&lt;br /&gt;
|Cyrus Chun Hong Au Yeung (15 min)&lt;br /&gt;
|-&lt;br /&gt;
|11:00am&lt;br /&gt;
|BiomarkerKB&lt;br /&gt;
|TBA&lt;br /&gt;
|Sohana Bahl, Isaac Kim, Sparsh Gupta (15 min)&lt;br /&gt;
|-&lt;br /&gt;
|11:15am&lt;br /&gt;
|BiomarkerKB&lt;br /&gt;
|TBA&lt;br /&gt;
|Nathan Ressom, Ana Vohralikova, Mathias Belay (15 min)&lt;br /&gt;
|-&lt;br /&gt;
|11:30am&lt;br /&gt;
|BiomarkerKB&lt;br /&gt;
|TBA&lt;br /&gt;
|John McCaffery, Alma Ogunsina, Akale Kinfe (15 min)&lt;br /&gt;
|-&lt;br /&gt;
|&#039;&#039;&#039;11:45am&#039;&#039;&#039;&lt;br /&gt;
| colspan=&amp;quot;2&amp;quot; |&#039;&#039;&#039;Open Q and A&#039;&#039;&#039;&lt;br /&gt;
|&#039;&#039;&#039;All (30 min)&#039;&#039;&#039;&lt;br /&gt;
|-&lt;br /&gt;
|12:30pm&lt;br /&gt;
| colspan=&amp;quot;3&amp;quot; |                                                                                                          &#039;&#039;&#039;LUNCH (90 mins)&#039;&#039;&#039;&lt;br /&gt;
|-&lt;br /&gt;
| colspan=&amp;quot;4&amp;quot; |                                                                                                                         &#039;&#039;Group 2 Moderator : Rene Ranzinger&#039;&#039;&lt;br /&gt;
|-&lt;br /&gt;
|2:00pm&lt;br /&gt;
|GlyGen&lt;br /&gt;
|GlyGen Biocuration Project&lt;br /&gt;
|Aise Arpinar, Haravinay P. Gujjulla, Nahom Abel (20 min)&lt;br /&gt;
|-&lt;br /&gt;
|2:20pm&lt;br /&gt;
|Glycobiology Web Development&lt;br /&gt;
|A Resource Drill Down and Visualization for the Glyspace Alliance&lt;br /&gt;
|Diya Kamalabharathy (5 min)&lt;br /&gt;
|-&lt;br /&gt;
|2:25pm&lt;br /&gt;
|PredictMod Curation&lt;br /&gt;
|PredictMod: PubMed Curation for Training an LLM for Recommendation&lt;br /&gt;
|Grace Chong, Aaron Ressom, Diya Kamalabharathy (15 min)&lt;br /&gt;
|-&lt;br /&gt;
|2:40pm&lt;br /&gt;
|PredictMod AI-READI&lt;br /&gt;
|Robust Classification of Glycemic Health States from Continuous Glucose&lt;br /&gt;
|Nikhil Arethiya (15 min)&lt;br /&gt;
|-&lt;br /&gt;
|2:55pm&lt;br /&gt;
|Argos&lt;br /&gt;
|Curation of Emerging Pathogen Genomes for FDA-ARGOS Database Expansion&lt;br /&gt;
|Miao Wang (15 min)&lt;br /&gt;
|-&lt;br /&gt;
|3:10pm&lt;br /&gt;
|GlycoSiteMiner&lt;br /&gt;
|TBA&lt;br /&gt;
|(15 min)&lt;br /&gt;
|-&lt;br /&gt;
|&#039;&#039;&#039;3:25pm&#039;&#039;&#039;&lt;br /&gt;
| colspan=&amp;quot;2&amp;quot; |&#039;&#039;&#039;&amp;lt;nowiki&amp;gt;Open Q and A  | Closing Remarks&amp;lt;/nowiki&amp;gt;&#039;&#039;&#039;&lt;br /&gt;
|&#039;&#039;&#039;&amp;lt;nowiki&amp;gt;All (20 min) | Raja Mazumder&amp;lt;/nowiki&amp;gt;&#039;&#039;&#039;&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
== &#039;&#039;&#039;Project Description&#039;&#039;&#039; ==&lt;br /&gt;
&lt;br /&gt;
=== CFDE Project ===&lt;br /&gt;
The CFDE project focuses on integrating biocuration and data standardization to generate machine learning-ready glycan datasets. It brings together curated information and structured metadata to ensure that glycan-related data is both interoperable and computationally accessible. As part of this effort, the project supports the development of machine learning models for linkage prediction in glycan images, enabling automated interpretation of glycan structures from visual representations. In addition, a graph-based AI workflow is being implemented to mine glycan biomarkers and related annotations from scientific publications, helping to uncover novel insights and associations. These approaches collectively advance the integration of glycobiology into broader biomedical research by making glycan data more usable for downstream AI applications.&lt;br /&gt;
&lt;br /&gt;
=== GlyGen Project ===&lt;br /&gt;
The GlyGen Biocuration project focuses on integrating legacy, yet valuable, data from the CarbBank and CFG databases into the GlyGen infrastructure. A key challenge is mapping metadata, such as species names and publication references, to standardized dictionaries and ontologies. While most entries have been automatically matched using custom scripts, remaining inconsistencies, including outdated, misspelled, or abbreviated terms, require manual curation using resources such as Google, PubMed, and domain-specific dictionaries and ontologies.&lt;br /&gt;
&lt;br /&gt;
=== BiomarkerKB Biocuration Project ===&lt;br /&gt;
The Biomarker Biocuration project focuses on curating biomarkers from abstracts and publications into the BiomarkerKB data model. A key challenge is the sheer volume of biomarker data spread across publications. Manual curation requires reading and interpreting the key elements of biomarker data and mapping them to the defined biomarker data model. LLM-based methods are expected to help by recognizing biomarker and condition data, mapping the extracted information into the data model, and automatically mapping other contextual and standardized data to the model so that the data is AI- and machine learning-ready.&lt;br /&gt;
&lt;br /&gt;
=== ArgosDB Curation Project ===&lt;br /&gt;
This project focuses on evaluating and curating high-quality genomes of emerging and clinically relevant pathogens, with an emphasis on fungal species. Using public genomic repositories and FDA-ARGOS inclusion criteria, the project identifies candidate organisms for database expansion to support diagnostic assay development and public health surveillance.&lt;/div&gt;</summary>
		<author><name>Jeetvora</name></author>
	</entry>
	<entry>
		<id>https://hivelab.biochemistry.gwu.edu/wiki/index.php?title=Symposium_2025&amp;diff=887</id>
		<title>Symposium 2025</title>
		<link rel="alternate" type="text/html" href="https://hivelab.biochemistry.gwu.edu/wiki/index.php?title=Symposium_2025&amp;diff=887"/>
		<updated>2025-07-18T16:37:44Z</updated>

		<summary type="html">&lt;p&gt;Jeetvora: /* Program and Information and Registeration */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;The HIVE Lab summer symposium is scheduled for Thursday, July 31, 2025. It is an exciting opportunity for lab volunteers and interns to present findings from the projects they worked on for 8 weeks.&lt;br /&gt;
&lt;br /&gt;
[[File:DC.png|center|frame]]&lt;br /&gt;
&lt;br /&gt;
== &#039;&#039;&#039;Program and Information&#039;&#039;&#039; ==&lt;br /&gt;
&lt;br /&gt;
=== &#039;&#039;&#039;Symposium Venue&#039;&#039;&#039; ===&lt;br /&gt;
The HIVE Lab symposium will be held in person at The George Washington University, Washington DC, with an option to join virtually.&lt;br /&gt;
&lt;br /&gt;
In Person - Ross 647, Ross Hall, School of Health and Medical Sciences, The George Washington University, Washington DC ([https://maps.app.goo.gl/PHQmZacA4hWDvTCh6 MAP])&lt;br /&gt;
&lt;br /&gt;
Virtual - Zoom&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Zoom Link  -&#039;&#039;&#039; https://gwu-edu.zoom.us/j/98841344003    &lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Add to - [https://gwu-edu.zoom.us/meeting/tJwlc-irqj8qGtdu0zteQ__E2Fqo0fbPF6P7/calendar/google/add Google Calendar] | [https://gwu-edu.zoom.us/meeting/tJwlc-irqj8qGtdu0zteQ__E2Fqo0fbPF6P7/ics Outlook Calendar] | [https://calendar.yahoo.com/?v=60&amp;amp;VIEW=d&amp;amp;TITLE=2025%20CFDE-GlyGen-HIVE%20Lab%20Summer%20Symposium&amp;amp;in_loc=https%3A%2F%2Fgwu-edu.zoom.us%2Fj%2F98841344003&amp;amp;URL=https%3A%2F%2Fgwu-edu.zoom.us%2Fj%2F98841344003&amp;amp;ST=20250731T140000Z&amp;amp;DUR=0700&amp;amp;DESC=Jeet%20Vora%20%28GlyGen%20-%20GW%29%20is%20inviting%20you%20to%20a%20scheduled%20Zoom%20meeting.%0D%0AJoin%20Zoom%20Meeting%0D%0Ahttps%3A%2F%2Fgwu-edu.zoom.us%2Fj%2F98841344003%0D%0A%0D%0AMeeting%20ID%3A%20988%204134%204003%0D%0A%0D%0A---%0D%0A%0D%0AOne%20tap%20mobile%0D%0A%2B14042012656%2C%2C98841344003%23%20US%0D%0A%2B13097403221%2C%2C98841344003%23%20US%0D%0A%0D%0A---%0D%0A%0D%0ADial%20by%20your%20location%0D%0A%E2%80%A2%20%2B14042012656%20US%0D%0A%E2%80%A2%20%2B1%20309%20740%203221%20US%0D%0A%E2%80%A2%20%2B12122258997%20US%0D%0A%E2%80%A2%20%2B16462551997%20US%0D%0A%E2%80%A2%20%2B14805624901%20US%0D%0A%E2%80%A2%20%2B81345107597%20Japan%0D%0A%E2%80%A2%20%2B44%20201%20151%208517%20United%20Kingdom%0D%0A%E2%80%A2%20%2B442079791833%20United%20Kingdom%0D%0A%E2%80%A2%20%2B61280318153%20Australia%0D%0A%E2%80%A2%20%2B41434569439%20Switzerland%0D%0A%0D%0AMeeting%20ID%3A%20988%204134%204003%0D%0A%0D%0AFind%20your%20local%20number%3A%20https%3A%2F%2Fgwu-edu.zoom.us%2Fu%2FadB0kGNOyd%0D%0A%0D%0A---%0D%0A%0D%0AJoin%20by%20SIP%0D%0A%E2%80%A2%2098841344003%40zoomcrc.com%0D%0A%0D%0A---%0D%0A%0D%0AJoin%20by%20H.323%0D%0A%E2%80%A2%20144.195.19.161%20%28US%20West%29%0D%0A%E2%80%A2%20206.247.11.121%20%28US%20East%29%0D%0A%E2%80%A2%20115.114.131.7%20%28India%20Mumbai%29%0D%0A%E2%80%A2%20115.114.115.7%20%28India%20Hyderabad%29%0D%0A%E2%80%A2%20159.124.15.191%20%28Amsterdam%20Netherlands%29%0D%0A%E2%80%A2%20159.124.47.249%20%28Germany%29%0D%0A%E2%80%A2%20159.124.104.213%2
0%28Australia%20Sydney%29%0D%0A%E2%80%A2%20159.124.74.212%20%28Australia%20Melbourne%29%0D%0A%E2%80%A2%20170.114.180.219%20%28Singapore%29%0D%0A%E2%80%A2%2064.211.144.160%20%28Brazil%29%0D%0A%E2%80%A2%20159.124.132.243%20%28Mexico%29%0D%0A%E2%80%A2%20159.124.168.213%20%28Canada%20Toronto%29%0D%0A%E2%80%A2%20159.124.196.25%20%28Canada%20Vancouver%29%0D%0A%E2%80%A2%20170.114.194.163%20%28Japan%20Tokyo%29%0D%0A%E2%80%A2%20147.124.100.25%20%28Japan%20Osaka%29%0D%0A%0D%0AMeeting%20ID%3A%20988%204134%204003%0D%0A%0D%0A Yahoo Calendar]&#039;&#039;&#039;      &lt;br /&gt;
&lt;br /&gt;
== &#039;&#039;&#039;Agenda&#039;&#039;&#039; ==&lt;br /&gt;
All times are in Eastern Time (ET).&lt;br /&gt;
{| class=&amp;quot;wikitable&amp;quot;&lt;br /&gt;
|&#039;&#039;&#039;Time (ET)&#039;&#039;&#039;&lt;br /&gt;
|&#039;&#039;&#039;Project&#039;&#039;&#039;&lt;br /&gt;
|&#039;&#039;&#039;Title&#039;&#039;&#039;&lt;br /&gt;
|&#039;&#039;&#039;Presenter&#039;&#039;&#039;&lt;br /&gt;
|-&lt;br /&gt;
|&#039;&#039;&#039;10:00am&#039;&#039;&#039;&lt;br /&gt;
| colspan=&amp;quot;2&amp;quot; |                                                                                            &#039;&#039;&#039;Welcome and Introduction&#039;&#039;&#039;&lt;br /&gt;
|&#039;&#039;&#039;Michael Tiemeyer (10 min)&#039;&#039;&#039;&lt;br /&gt;
|-&lt;br /&gt;
| colspan=&amp;quot;4&amp;quot; |                                                                                                                         &#039;&#039;Group 1 Moderator : Nathan Edwards&#039;&#039;&lt;br /&gt;
|-&lt;br /&gt;
|10:10am &lt;br /&gt;
|CFDE&lt;br /&gt;
|Integrating Biocuration and Data Standardization to Generate Machine Learning-Ready Glycan Datasets&lt;br /&gt;
|Ana Jaramillo and Yuxin Zou (20 min)&lt;br /&gt;
|-&lt;br /&gt;
|10:30am&lt;br /&gt;
|CFDE&lt;br /&gt;
|Machine Learning Models for Linkage Prediction in Glycan Images&lt;br /&gt;
|Campbell Ross (15 min)&lt;br /&gt;
|-&lt;br /&gt;
|10:45am&lt;br /&gt;
|CFDE&lt;br /&gt;
|A Graph-Based AI Workflow for Mining Glycan Biomarkers and Related Annotations from Publications&lt;br /&gt;
|Cyrus Chun Hong Au Yeung (15 min)&lt;br /&gt;
|-&lt;br /&gt;
|11:00am&lt;br /&gt;
|BiomarkerKB&lt;br /&gt;
|TBA&lt;br /&gt;
|Sohana Bahl, Isaac Kim, Sparsh Gupta (15 min)&lt;br /&gt;
|-&lt;br /&gt;
|11:15am&lt;br /&gt;
|BiomarkerKB&lt;br /&gt;
|TBA&lt;br /&gt;
|Nathan Ressom, Ana Vohralikova, Mathias Belay (15 min)&lt;br /&gt;
|-&lt;br /&gt;
|11:30am&lt;br /&gt;
|BiomarkerKB&lt;br /&gt;
|TBA&lt;br /&gt;
|John McCaffery, Alma Ogunsina, Akale Kinfe (15 min)&lt;br /&gt;
|-&lt;br /&gt;
|&#039;&#039;&#039;11:45am&#039;&#039;&#039;&lt;br /&gt;
| colspan=&amp;quot;2&amp;quot; |&#039;&#039;&#039;Open Q and A&#039;&#039;&#039;&lt;br /&gt;
|&#039;&#039;&#039;All (30 min)&#039;&#039;&#039;&lt;br /&gt;
|-&lt;br /&gt;
|12:30pm&lt;br /&gt;
| colspan=&amp;quot;3&amp;quot; |                                                                                                          &#039;&#039;&#039;LUNCH (90 mins)&#039;&#039;&#039;&lt;br /&gt;
|-&lt;br /&gt;
| colspan=&amp;quot;4&amp;quot; |                                                                                                                         &#039;&#039;Group 2 Moderator : Rene Ranzinger&#039;&#039;&lt;br /&gt;
|-&lt;br /&gt;
|2:00pm&lt;br /&gt;
|PredictMod AI-READI&lt;br /&gt;
|Robust Classification of Glycemic Health States from Continuous Glucose &lt;br /&gt;
|Nikhil Arethiya (15 min)&lt;br /&gt;
|-&lt;br /&gt;
|2:15pm&lt;br /&gt;
|PredictMod Curation&lt;br /&gt;
|PredictMod: PubMed Curation for Training an LLM for Recommendation&lt;br /&gt;
|Grace Chong, Aaron Ressom, Diya Kamalabharathy (15 min)&lt;br /&gt;
|-&lt;br /&gt;
|2:30pm&lt;br /&gt;
|Glycobiology Web Development&lt;br /&gt;
|A Resource Drill Down and Visualization for the Glyspace Alliance&lt;br /&gt;
|Diya Kamalabharathy (5 min)&lt;br /&gt;
|-&lt;br /&gt;
|2:35pm&lt;br /&gt;
|Argos&lt;br /&gt;
|Curation of Emerging Pathogen Genomes for FDA-ARGOS Database Expansion&lt;br /&gt;
|Miao Wang (15 min)&lt;br /&gt;
|-&lt;br /&gt;
|2:50pm&lt;br /&gt;
|GlyGen&lt;br /&gt;
|GlyGen Biocuration Project&lt;br /&gt;
|Aise Arpinar, Haravinay P. Gujjulla, Nahom Abel (20 min)&lt;br /&gt;
|-&lt;br /&gt;
|3:10pm&lt;br /&gt;
|GlycoSiteMiner&lt;br /&gt;
|TBA&lt;br /&gt;
|(15 min)&lt;br /&gt;
|-&lt;br /&gt;
|&#039;&#039;&#039;3:25pm&#039;&#039;&#039;&lt;br /&gt;
| colspan=&amp;quot;2&amp;quot; |&#039;&#039;&#039;&amp;lt;nowiki&amp;gt;Open Q and A  | Closing Remarks&amp;lt;/nowiki&amp;gt;&#039;&#039;&#039;&lt;br /&gt;
|&#039;&#039;&#039;&amp;lt;nowiki&amp;gt;All (20 min) | Raja Mazumder&amp;lt;/nowiki&amp;gt;&#039;&#039;&#039;&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
== &#039;&#039;&#039;Project Description&#039;&#039;&#039; ==&lt;br /&gt;
&lt;br /&gt;
=== CFDE Project ===&lt;br /&gt;
The CFDE project focuses on integrating biocuration and data standardization to generate machine learning-ready glycan datasets. It brings together curated information and structured metadata to ensure that glycan-related data is both interoperable and computationally accessible. As part of this effort, the project supports the development of machine learning models for linkage prediction in glycan images, enabling automated interpretation of glycan structures from visual representations. In addition, a graph-based AI workflow is being implemented to mine glycan biomarkers and related annotations from scientific publications, helping to uncover novel insights and associations. These approaches collectively advance the integration of glycobiology into broader biomedical research by making glycan data more usable for downstream AI applications.&lt;br /&gt;
&lt;br /&gt;
=== GlyGen Project ===&lt;br /&gt;
The GlyGen Biocuration project focuses on integrating legacy, yet valuable, data from the CarbBank and CFG databases into the GlyGen infrastructure. A key challenge is mapping metadata, such as species names and publication references, to standardized dictionaries and ontologies. While most entries have been automatically matched using custom scripts, remaining inconsistencies, including outdated, misspelled, or abbreviated terms, require manual curation using resources such as Google, PubMed, and domain-specific dictionaries and ontologies.&lt;br /&gt;
&lt;br /&gt;
=== BiomarkerKB Biocuration Project ===&lt;br /&gt;
The Biomarker Biocuration project focuses on curating biomarkers from abstracts and publications into the BiomarkerKB data model. A key challenge is the sheer volume of biomarker data spread across publications. Manual curation requires reading and interpreting the key elements of biomarker data and mapping them to the defined biomarker data model. LLM-based methods are expected to help by recognizing biomarker and condition data, mapping the extracted information into the data model, and automatically mapping other contextual and standardized data to the model so that the data is AI- and machine learning-ready.&lt;br /&gt;
&lt;br /&gt;
=== ArgosDB Curation Project ===&lt;br /&gt;
This project focuses on evaluating and curating high-quality genomes of emerging and clinically relevant pathogens, with an emphasis on fungal species. Using public genomic repositories and FDA-ARGOS inclusion criteria, the project identifies candidate organisms for database expansion to support diagnostic assay development and public health surveillance.&lt;/div&gt;</summary>
		<author><name>Jeetvora</name></author>
	</entry>
	<entry>
		<id>https://hivelab.biochemistry.gwu.edu/wiki/index.php?title=Symposium_2025&amp;diff=886</id>
		<title>Symposium 2025</title>
		<link rel="alternate" type="text/html" href="https://hivelab.biochemistry.gwu.edu/wiki/index.php?title=Symposium_2025&amp;diff=886"/>
		<updated>2025-07-18T16:37:34Z</updated>

		<summary type="html">&lt;p&gt;Jeetvora: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;The HIVE Lab summer symposium is scheduled for Thursday, July 31, 2025. It is an exciting opportunity for lab volunteers and interns to present findings from the projects they worked on for 8 weeks.&lt;br /&gt;
&lt;br /&gt;
[[File:DC.png|center|frame]]&lt;br /&gt;
&lt;br /&gt;
== &#039;&#039;&#039;Program and Information and Registration&#039;&#039;&#039; ==&lt;br /&gt;
&lt;br /&gt;
=== &#039;&#039;&#039;Symposium Venue&#039;&#039;&#039; ===&lt;br /&gt;
The HIVE Lab symposium will be held in person at The George Washington University, Washington DC, with an option to join virtually.&lt;br /&gt;
&lt;br /&gt;
In Person - Ross 647, Ross Hall, School of Health and Medical Sciences, The George Washington University, Washington DC ([https://maps.app.goo.gl/PHQmZacA4hWDvTCh6 MAP])&lt;br /&gt;
&lt;br /&gt;
Virtual - Zoom&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Zoom Link  -&#039;&#039;&#039; https://gwu-edu.zoom.us/j/98841344003    &lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Add to - [https://gwu-edu.zoom.us/meeting/tJwlc-irqj8qGtdu0zteQ__E2Fqo0fbPF6P7/calendar/google/add Google Calendar] | [https://gwu-edu.zoom.us/meeting/tJwlc-irqj8qGtdu0zteQ__E2Fqo0fbPF6P7/ics Outlook Calendar] | [https://calendar.yahoo.com/?v=60&amp;amp;VIEW=d&amp;amp;TITLE=2025%20CFDE-GlyGen-HIVE%20Lab%20Summer%20Symposium&amp;amp;in_loc=https%3A%2F%2Fgwu-edu.zoom.us%2Fj%2F98841344003&amp;amp;URL=https%3A%2F%2Fgwu-edu.zoom.us%2Fj%2F98841344003&amp;amp;ST=20250731T140000Z&amp;amp;DUR=0700&amp;amp;DESC=Jeet%20Vora%20%28GlyGen%20-%20GW%29%20is%20inviting%20you%20to%20a%20scheduled%20Zoom%20meeting.%0D%0AJoin%20Zoom%20Meeting%0D%0Ahttps%3A%2F%2Fgwu-edu.zoom.us%2Fj%2F98841344003%0D%0A%0D%0AMeeting%20ID%3A%20988%204134%204003%0D%0A%0D%0A---%0D%0A%0D%0AOne%20tap%20mobile%0D%0A%2B14042012656%2C%2C98841344003%23%20US%0D%0A%2B13097403221%2C%2C98841344003%23%20US%0D%0A%0D%0A---%0D%0A%0D%0ADial%20by%20your%20location%0D%0A%E2%80%A2%20%2B14042012656%20US%0D%0A%E2%80%A2%20%2B1%20309%20740%203221%20US%0D%0A%E2%80%A2%20%2B12122258997%20US%0D%0A%E2%80%A2%20%2B16462551997%20US%0D%0A%E2%80%A2%20%2B14805624901%20US%0D%0A%E2%80%A2%20%2B81345107597%20Japan%0D%0A%E2%80%A2%20%2B44%20201%20151%208517%20United%20Kingdom%0D%0A%E2%80%A2%20%2B442079791833%20United%20Kingdom%0D%0A%E2%80%A2%20%2B61280318153%20Australia%0D%0A%E2%80%A2%20%2B41434569439%20Switzerland%0D%0A%0D%0AMeeting%20ID%3A%20988%204134%204003%0D%0A%0D%0AFind%20your%20local%20number%3A%20https%3A%2F%2Fgwu-edu.zoom.us%2Fu%2FadB0kGNOyd%0D%0A%0D%0A---%0D%0A%0D%0AJoin%20by%20SIP%0D%0A%E2%80%A2%2098841344003%40zoomcrc.com%0D%0A%0D%0A---%0D%0A%0D%0AJoin%20by%20H.323%0D%0A%E2%80%A2%20144.195.19.161%20%28US%20West%29%0D%0A%E2%80%A2%20206.247.11.121%20%28US%20East%29%0D%0A%E2%80%A2%20115.114.131.7%20%28India%20Mumbai%29%0D%0A%E2%80%A2%20115.114.115.7%20%28India%20Hyderabad%29%0D%0A%E2%80%A2%20159.124.15.191%20%28Amsterdam%20Netherlands%29%0D%0A%E2%80%A2%20159.124.47.249%20%28Germany%29%0D%0A%E2%80%A2%20159.124.104.213%2
0%28Australia%20Sydney%29%0D%0A%E2%80%A2%20159.124.74.212%20%28Australia%20Melbourne%29%0D%0A%E2%80%A2%20170.114.180.219%20%28Singapore%29%0D%0A%E2%80%A2%2064.211.144.160%20%28Brazil%29%0D%0A%E2%80%A2%20159.124.132.243%20%28Mexico%29%0D%0A%E2%80%A2%20159.124.168.213%20%28Canada%20Toronto%29%0D%0A%E2%80%A2%20159.124.196.25%20%28Canada%20Vancouver%29%0D%0A%E2%80%A2%20170.114.194.163%20%28Japan%20Tokyo%29%0D%0A%E2%80%A2%20147.124.100.25%20%28Japan%20Osaka%29%0D%0A%0D%0AMeeting%20ID%3A%20988%204134%204003%0D%0A%0D%0A Yahoo Calendar]&#039;&#039;&#039;      &lt;br /&gt;
&lt;br /&gt;
== &#039;&#039;&#039;Agenda&#039;&#039;&#039; ==&lt;br /&gt;
All times are in Eastern Time (ET).&lt;br /&gt;
{| class=&amp;quot;wikitable&amp;quot;&lt;br /&gt;
|&#039;&#039;&#039;Time (ET)&#039;&#039;&#039;&lt;br /&gt;
|&#039;&#039;&#039;Project&#039;&#039;&#039;&lt;br /&gt;
|&#039;&#039;&#039;Title&#039;&#039;&#039;&lt;br /&gt;
|&#039;&#039;&#039;Presenter&#039;&#039;&#039;&lt;br /&gt;
|-&lt;br /&gt;
|&#039;&#039;&#039;10:00am&#039;&#039;&#039;&lt;br /&gt;
| colspan=&amp;quot;2&amp;quot; |                                                                                            &#039;&#039;&#039;Welcome and Introduction&#039;&#039;&#039;&lt;br /&gt;
|&#039;&#039;&#039;Michael Tiemeyer (10 min)&#039;&#039;&#039;&lt;br /&gt;
|-&lt;br /&gt;
| colspan=&amp;quot;4&amp;quot; |                                                                                                                         &#039;&#039;Group 1 Moderator : Nathan Edwards&#039;&#039;&lt;br /&gt;
|-&lt;br /&gt;
|10:10am &lt;br /&gt;
|CFDE&lt;br /&gt;
|Integrating Biocuration and Data Standardization to Generate Machine Learning-Ready Glycan Datasets&lt;br /&gt;
|Ana Jaramillo and Yuxin Zou (20 min)&lt;br /&gt;
|-&lt;br /&gt;
|10:30am&lt;br /&gt;
|CFDE&lt;br /&gt;
|Machine Learning Models for Linkage Prediction in Glycan Images&lt;br /&gt;
|Campbell Ross (15 min)&lt;br /&gt;
|-&lt;br /&gt;
|10:45am&lt;br /&gt;
|CFDE&lt;br /&gt;
|A Graph-Based AI Workflow for Mining Glycan Biomarkers and Related Annotations from Publications&lt;br /&gt;
|Cyrus Chun Hong Au Yeung (15 min)&lt;br /&gt;
|-&lt;br /&gt;
|11:00am&lt;br /&gt;
|BiomarkerKB&lt;br /&gt;
|TBA&lt;br /&gt;
|Sohana Bahl, Isaac Kim, Sparsh Gupta (15 min)&lt;br /&gt;
|-&lt;br /&gt;
|11:15am&lt;br /&gt;
|BiomarkerKB&lt;br /&gt;
|TBA&lt;br /&gt;
|Nathan Ressom, Ana Vohralikova, Mathias Belay (15 min)&lt;br /&gt;
|-&lt;br /&gt;
|11:30am&lt;br /&gt;
|BiomarkerKB&lt;br /&gt;
|TBA&lt;br /&gt;
|John McCaffery, Alma Ogunsina, Akale Kinfe (15 min)&lt;br /&gt;
|-&lt;br /&gt;
|&#039;&#039;&#039;11:45am&#039;&#039;&#039;&lt;br /&gt;
| colspan=&amp;quot;2&amp;quot; |&#039;&#039;&#039;Open Q and A&#039;&#039;&#039;&lt;br /&gt;
|&#039;&#039;&#039;All (30 min)&#039;&#039;&#039;&lt;br /&gt;
|-&lt;br /&gt;
|12:30pm&lt;br /&gt;
| colspan=&amp;quot;3&amp;quot; |                                                                                                          &#039;&#039;&#039;LUNCH (90 mins)&#039;&#039;&#039;&lt;br /&gt;
|-&lt;br /&gt;
| colspan=&amp;quot;4&amp;quot; |                                                                                                                         &#039;&#039;Group 2 Moderator : Rene Ranzinger&#039;&#039;&lt;br /&gt;
|-&lt;br /&gt;
|2:00pm&lt;br /&gt;
|PredictMod AI-READI&lt;br /&gt;
|Robust Classification of Glycemic Health States from Continuous Glucose &lt;br /&gt;
|Nikhil Arethiya (15 min)&lt;br /&gt;
|-&lt;br /&gt;
|2:15pm&lt;br /&gt;
|PredictMod Curation&lt;br /&gt;
|PredictMod: PubMed Curation for Training an LLM for Recommendation&lt;br /&gt;
|Grace Chong, Aaron Ressom, Diya Kamalabharathy (15 min)&lt;br /&gt;
|-&lt;br /&gt;
|2:30pm&lt;br /&gt;
|Glycobiology Web Development&lt;br /&gt;
|A Resource Drill Down and Visualization for the Glyspace Alliance&lt;br /&gt;
|Diya Kamalabharathy (5 min)&lt;br /&gt;
|-&lt;br /&gt;
|2:35pm&lt;br /&gt;
|Argos&lt;br /&gt;
|Curation of Emerging Pathogen Genomes for FDA-ARGOS Database Expansion&lt;br /&gt;
|Miao Wang (15 min)&lt;br /&gt;
|-&lt;br /&gt;
|2:50pm&lt;br /&gt;
|GlyGen&lt;br /&gt;
|GlyGen Biocuration Project&lt;br /&gt;
|Aise Arpinar, Haravinay P. Gujjulla, Nahom Abel (20 min)&lt;br /&gt;
|-&lt;br /&gt;
|3:10pm&lt;br /&gt;
|GlycoSiteMiner&lt;br /&gt;
|TBA&lt;br /&gt;
|(15 min)&lt;br /&gt;
|-&lt;br /&gt;
|&#039;&#039;&#039;3:25pm&#039;&#039;&#039;&lt;br /&gt;
| colspan=&amp;quot;2&amp;quot; |&#039;&#039;&#039;&amp;lt;nowiki&amp;gt;Open Q and A  | Closing Remarks&amp;lt;/nowiki&amp;gt;&#039;&#039;&#039;&lt;br /&gt;
|&#039;&#039;&#039;&amp;lt;nowiki&amp;gt;All (20 min) | Raja Mazumder&amp;lt;/nowiki&amp;gt;&#039;&#039;&#039;&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
== &#039;&#039;&#039;Project Description&#039;&#039;&#039; ==&lt;br /&gt;
&lt;br /&gt;
=== CFDE Project ===&lt;br /&gt;
The CFDE project focuses on integrating biocuration and data standardization to generate machine learning-ready glycan datasets. It brings together curated information and structured metadata to ensure that glycan-related data is both interoperable and computationally accessible. As part of this effort, the project supports the development of machine learning models for linkage prediction in glycan images, enabling automated interpretation of glycan structures from visual representations. In addition, a graph-based AI workflow is being implemented to mine glycan biomarkers and related annotations from scientific publications, helping to uncover novel insights and associations. These approaches collectively advance the integration of glycobiology into broader biomedical research by making glycan data more usable for downstream AI applications.&lt;br /&gt;
&lt;br /&gt;
=== GlyGen Project ===&lt;br /&gt;
The GlyGen Biocuration project focuses on integrating legacy, yet valuable, data from the CarbBank and CFG databases into the GlyGen infrastructure. A key challenge is mapping metadata, such as species names and publication references, to standardized dictionaries and ontologies. While most entries have been automatically matched using custom scripts, remaining inconsistencies, including outdated, misspelled, or abbreviated terms, require manual curation using resources such as Google, PubMed, and domain-specific dictionaries and ontologies.&lt;br /&gt;
&lt;br /&gt;
=== BiomarkerKB Biocuration Project ===&lt;br /&gt;
The Biomarker Biocuration project focuses on curating biomarkers from abstracts and publications into the BiomarkerKB data model. A key challenge is the sheer volume of biomarker data spread across publications. Manual curation requires reading and interpreting the key elements of biomarker data and mapping them to the defined biomarker data model. LLM-based methods are expected to help by recognizing biomarker and condition data, mapping the extracted information into the data model, and automatically mapping other contextual and standardized data to the model so that the data is AI- and machine learning-ready.&lt;br /&gt;
&lt;br /&gt;
=== ArgosDB Curation Project ===&lt;br /&gt;
This project focuses on evaluating and curating high-quality genomes of emerging and clinically relevant pathogens, with an emphasis on fungal species. Using public genomic repositories and FDA-ARGOS inclusion criteria, the project identifies candidate organisms for database expansion to support diagnostic assay development and public health surveillance.&lt;/div&gt;</summary>
		<author><name>Jeetvora</name></author>
	</entry>
	<entry>
		<id>https://hivelab.biochemistry.gwu.edu/wiki/index.php?title=Symposium_2025&amp;diff=885</id>
		<title>Symposium 2025</title>
		<link rel="alternate" type="text/html" href="https://hivelab.biochemistry.gwu.edu/wiki/index.php?title=Symposium_2025&amp;diff=885"/>
		<updated>2025-07-18T16:18:48Z</updated>

		<summary type="html">&lt;p&gt;Jeetvora: /* Zoom Registration */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;The HIVE Lab summer symposium is scheduled for Thursday, July 31, 2025. It is an exciting opportunity for lab volunteers and interns to present findings from the projects they worked on for 8 weeks.&lt;br /&gt;
&lt;br /&gt;
[[File:DC.png|center|frame]]&lt;br /&gt;
&lt;br /&gt;
== &#039;&#039;&#039;Program and Information and Registration&#039;&#039;&#039; ==&lt;br /&gt;
&lt;br /&gt;
=== &#039;&#039;&#039;Symposium Venue&#039;&#039;&#039; ===&lt;br /&gt;
The HIVE Lab symposium will be held in person at The George Washington University, Washington DC, with an option to join virtually.&lt;br /&gt;
&lt;br /&gt;
In Person - Ross 647, Ross Hall, School of Health and Medical Sciences, The George Washington University, Washington DC ([https://maps.app.goo.gl/PHQmZacA4hWDvTCh6 MAP])&lt;br /&gt;
&lt;br /&gt;
Virtual - Zoom&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Zoom Link  -&#039;&#039;&#039; https://gwu-edu.zoom.us/my/jeetvora?omn=98841344003    &lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Add to - [https://gwu-edu.zoom.us/meeting/tJwlc-irqj8qGtdu0zteQ__E2Fqo0fbPF6P7/calendar/google/add Google Calendar] | [https://gwu-edu.zoom.us/meeting/tJwlc-irqj8qGtdu0zteQ__E2Fqo0fbPF6P7/ics Outlook Calendar] | [https://calendar.yahoo.com/?v=60&amp;amp;VIEW=d&amp;amp;TITLE=2025%20CFDE-GlyGen-HIVE%20Lab%20Summer%20Symposium&amp;amp;in_loc=https%3A%2F%2Fgwu-edu.zoom.us%2Fj%2F5554320007%3Fomn%3D98841344003&amp;amp;URL=https%3A%2F%2Fgwu-edu.zoom.us%2Fj%2F5554320007%3Fomn%3D98841344003&amp;amp;ST=20250729T140000Z&amp;amp;DUR=0700&amp;amp;DESC=Jeet%20Vora%20%28GlyGen%20-%20GW%29%20is%20inviting%20you%20to%20a%20scheduled%20Zoom%20meeting.%0D%0AJoin%20Zoom%20Meeting%0D%0Ahttps%3A%2F%2Fgwu-edu.zoom.us%2Fj%2F5554320007%3Fomn%3D98841344003%0D%0A%0D%0AMeeting%20ID%3A%20555%20432%200007%0D%0A%0D%0A---%0D%0A%0D%0AOne%20tap%20mobile%0D%0A%2B14042012656%2C%2C5554320007%23%20US%0D%0A%2B13097403221%2C%2C5554320007%23%20US%0D%0A%0D%0A---%0D%0A%0D%0ADial%20by%20your%20location%0D%0A%E2%80%A2%20%2B14042012656%20US%0D%0A%E2%80%A2%20%2B1%20309%20740%203221%20US%0D%0A%E2%80%A2%20%2B12122258997%20US%0D%0A%E2%80%A2%20%2B16462551997%20US%0D%0A%E2%80%A2%20%2B14805624901%20US%0D%0A%E2%80%A2%20%2B81345107597%20Japan%0D%0A%E2%80%A2%20%2B44%20201%20151%208517%20United%20Kingdom%0D%0A%E2%80%A2%20%2B442079791833%20United%20Kingdom%0D%0A%E2%80%A2%20%2B61280318153%20Australia%0D%0A%E2%80%A2%20%2B41434569439%20Switzerland%0D%0A%0D%0AMeeting%20ID%3A%20555%20432%200007%0D%0A%0D%0AFind%20your%20local%20number%3A%20https%3A%2F%2Fgwu-edu.zoom.us%2Fu%2FacHiPyrvkB%0D%0A%0D%0A---%0D%0A%0D%0AJoin%20by%20SIP%0D%0A%E2%80%A2%205554320007%40zoomcrc.com%0D%0A%0D%0A---%0D%0A%0D%0AJoin%20by%20H.323%0D%0A%E2%80%A2%20144.195.19.161%20%28US%20West%29%0D%0A%E2%80%A2%20206.247.11.121%20%28US%20East%29%0D%0A%E2%80%A2%20115.114.131.7%20%28India%20Mumbai%29%0D%0A%E2%80%A2%20115.114.115.7%20%28India%20Hyderabad%29%0D%0A%E2%80%A2%20159.124.15.191%20%28Amsterdam%20Netherlands%29%0D%0A%E2%80%A2%20159.124.47.24
9%20%28Germany%29%0D%0A%E2%80%A2%20159.124.104.213%20%28Australia%20Sydney%29%0D%0A%E2%80%A2%20159.124.74.212%20%28Australia%20Melbourne%29%0D%0A%E2%80%A2%20170.114.180.219%20%28Singapore%29%0D%0A%E2%80%A2%2064.211.144.160%20%28Brazil%29%0D%0A%E2%80%A2%20159.124.132.243%20%28Mexico%29%0D%0A%E2%80%A2%20159.124.168.213%20%28Canada%20Toronto%29%0D%0A%E2%80%A2%20159.124.196.25%20%28Canada%20Vancouver%29%0D%0A%E2%80%A2%20170.114.194.163%20%28Japan%20Tokyo%29%0D%0A%E2%80%A2%20147.124.100.25%20%28Japan%20Osaka%29%0D%0A%0D%0AMeeting%20ID%3A%20555%20432%200007%0D%0A%0D%0A Yahoo Calendar]&#039;&#039;&#039;      &lt;br /&gt;
&lt;br /&gt;
== &#039;&#039;&#039;Agenda&#039;&#039;&#039; ==&lt;br /&gt;
All times are in Eastern Time (ET)&lt;br /&gt;
{| class=&amp;quot;wikitable&amp;quot;&lt;br /&gt;
|&#039;&#039;&#039;Time (ET)&#039;&#039;&#039;&lt;br /&gt;
|&#039;&#039;&#039;Project&#039;&#039;&#039;&lt;br /&gt;
|&#039;&#039;&#039;Title&#039;&#039;&#039;&lt;br /&gt;
|&#039;&#039;&#039;Presenter&#039;&#039;&#039;&lt;br /&gt;
|-&lt;br /&gt;
|&#039;&#039;&#039;10:00am&#039;&#039;&#039;&lt;br /&gt;
| colspan=&amp;quot;2&amp;quot; |                                                                                            &#039;&#039;&#039;Welcome and Introduction&#039;&#039;&#039;&lt;br /&gt;
|&#039;&#039;&#039;Michael Tiemeyer (10 min)&#039;&#039;&#039;&lt;br /&gt;
|-&lt;br /&gt;
| colspan=&amp;quot;4&amp;quot; |                                                                                                                         &#039;&#039;Group 1 Moderator : Nathan Edwards&#039;&#039;&lt;br /&gt;
|-&lt;br /&gt;
|10:10am &lt;br /&gt;
|CFDE&lt;br /&gt;
|Integrating Biocuration and Data Standardization to Generate Machine Learning-Ready Glycan Datasets&lt;br /&gt;
|Ana Jaramillo and Yuxin Zou (20 min)&lt;br /&gt;
|-&lt;br /&gt;
|10:30am&lt;br /&gt;
|CFDE&lt;br /&gt;
|Machine Learning Models for Linkage Prediction in Glycan Images&lt;br /&gt;
|Campbell Ross (15 min)&lt;br /&gt;
|-&lt;br /&gt;
|10:45am&lt;br /&gt;
|CFDE&lt;br /&gt;
|A Graph-Based AI Workflow for Mining Glycan Biomarkers and Related Annotations from Publications&lt;br /&gt;
|Cyrus Chun Hong Au Yeung (15 min)&lt;br /&gt;
|-&lt;br /&gt;
|11:00am&lt;br /&gt;
|BiomarkerKB&lt;br /&gt;
|TBA&lt;br /&gt;
|Sohana Bahl, Isaac Kim, Sparsh Gupta (15 min)&lt;br /&gt;
|-&lt;br /&gt;
|11:15am&lt;br /&gt;
|BiomarkerKB&lt;br /&gt;
|TBA&lt;br /&gt;
|Nathan Ressom, Ana Vohralikova, Mathias Belay (15 min)&lt;br /&gt;
|-&lt;br /&gt;
|11:30am&lt;br /&gt;
|BiomarkerKB&lt;br /&gt;
|TBA&lt;br /&gt;
|John McCaffery, Alma Ogunsina, Akale Kinfe (15 min)&lt;br /&gt;
|-&lt;br /&gt;
|&#039;&#039;&#039;11:45am&#039;&#039;&#039;&lt;br /&gt;
| colspan=&amp;quot;2&amp;quot; |&#039;&#039;&#039;Open Q and A&#039;&#039;&#039;&lt;br /&gt;
|&#039;&#039;&#039;All (30 min)&#039;&#039;&#039;&lt;br /&gt;
|-&lt;br /&gt;
|12:30pm&lt;br /&gt;
| colspan=&amp;quot;3&amp;quot; |                                                                                                          &#039;&#039;&#039;LUNCH (90 mins)&#039;&#039;&#039;&lt;br /&gt;
|-&lt;br /&gt;
| colspan=&amp;quot;4&amp;quot; |                                                                                                                         &#039;&#039;Group 2 Moderator : Rene Ranzinger&#039;&#039;&lt;br /&gt;
|-&lt;br /&gt;
|2:00pm&lt;br /&gt;
|Predictmod AI-READI&lt;br /&gt;
|Robust Classification of Glycemic Health States from Continuous Glucose &lt;br /&gt;
|Nikhil Arethiya (15 min)&lt;br /&gt;
|-&lt;br /&gt;
|2:15pm&lt;br /&gt;
|Predictmod Curation&lt;br /&gt;
|PredictMod: PubMed Curation for Training an LLM for Recommendation&lt;br /&gt;
|Grace Chong, Aaron Ressom, Diya Kamalabharathy (15 min)&lt;br /&gt;
|-&lt;br /&gt;
|2:30pm&lt;br /&gt;
|Glycobiology Web Development&lt;br /&gt;
|A Resource Drill Down and Visualization for the Glyspace Alliance&lt;br /&gt;
|Diya Kamalabharathy (5 min)&lt;br /&gt;
|-&lt;br /&gt;
|2:35pm&lt;br /&gt;
|Argos&lt;br /&gt;
|Curation of Emerging Pathogen Genomes for FDA-ARGOS Database Expansion&lt;br /&gt;
|Miao Wang (15 min)&lt;br /&gt;
|-&lt;br /&gt;
|2:50pm&lt;br /&gt;
|GlyGen&lt;br /&gt;
|GlyGen Biocuration Project&lt;br /&gt;
|Aise Arpinar, Haravinay P. Gujjulla, Nahom Abel (20 min)&lt;br /&gt;
|-&lt;br /&gt;
|3:10pm&lt;br /&gt;
|GlycoSiteMiner&lt;br /&gt;
|TBA&lt;br /&gt;
|(15 min)&lt;br /&gt;
|-&lt;br /&gt;
|&#039;&#039;&#039;3:25pm&#039;&#039;&#039;&lt;br /&gt;
| colspan=&amp;quot;2&amp;quot; |&#039;&#039;&#039;&amp;lt;nowiki&amp;gt;Open Q and A  | Closing Remarks&amp;lt;/nowiki&amp;gt;&#039;&#039;&#039;&lt;br /&gt;
|&#039;&#039;&#039;&amp;lt;nowiki&amp;gt;All (20 min) | Raja Mazumder&amp;lt;/nowiki&amp;gt;&#039;&#039;&#039;&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
== &#039;&#039;&#039;Project Description&#039;&#039;&#039; ==&lt;br /&gt;
&lt;br /&gt;
=== CFDE Project ===&lt;br /&gt;
The CFDE project focuses on integrating biocuration and data standardization to generate machine learning-ready glycan datasets. It brings together curated information and structured metadata to ensure that glycan-related data is both interoperable and computationally accessible. As part of this effort, the project supports the development of machine learning models for linkage prediction in glycan images, enabling automated interpretation of glycan structures from visual representations. In addition, a graph-based AI workflow is being implemented to mine glycan biomarkers and related annotations from scientific publications, helping to uncover novel insights and associations. These approaches collectively advance the integration of glycobiology into broader biomedical research by making glycan data more usable for downstream AI applications.&lt;br /&gt;
&lt;br /&gt;
=== GlyGen Project ===&lt;br /&gt;
The GlyGen Biocuration project focuses on integrating legacy, yet valuable, data from the CarbBank and CFG databases into the GlyGen infrastructure. A key challenge is mapping metadata, such as species names and publication references, to standardized dictionaries and ontologies. While most entries have been automatically matched using custom scripts, remaining inconsistencies, including outdated, misspelled, or abbreviated terms, require manual curation using resources such as Google, PubMed, and domain-specific dictionaries and ontologies.&lt;br /&gt;
&lt;br /&gt;
=== BiomarkerKB Biocuration Project ===&lt;br /&gt;
The Biomarker Biocuration project focuses on curating biomarkers from abstracts and publications into the BiomarkerKB data model. A key challenge is the sheer volume of biomarker data spread across publications. Manual curation requires reading and interpreting the key elements of each biomarker and mapping them to the defined biomarker data model. LLM-based methods can help by recognizing biomarker and condition data, mapping the information found into the data model, and automatically attaching other contextual and standardized data so that the result is AI- and machine learning-ready.&lt;br /&gt;
&lt;br /&gt;
=== ArgosDB Curation Project ===&lt;br /&gt;
This project focuses on evaluating and curating high-quality genomes of emerging and clinically relevant pathogens, with an emphasis on fungal species. Using public genomic repositories and FDA-ARGOS inclusion criteria, the project identifies candidate organisms for database expansion to support diagnostic assay development and public health surveillance.&lt;/div&gt;</summary>
		<author><name>Jeetvora</name></author>
	</entry>
	<entry>
		<id>https://hivelab.biochemistry.gwu.edu/wiki/index.php?title=Symposium_2025&amp;diff=884</id>
		<title>Symposium 2025</title>
		<link rel="alternate" type="text/html" href="https://hivelab.biochemistry.gwu.edu/wiki/index.php?title=Symposium_2025&amp;diff=884"/>
		<updated>2025-07-18T16:12:51Z</updated>

		<summary type="html">&lt;p&gt;Jeetvora: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;The HIVE Lab summer symposium is scheduled for Thursday, July 31, 2025. It is an exciting time for the lab volunteers and interns to present their findings on the projects they worked on for 8 weeks.&lt;br /&gt;
&lt;br /&gt;
[[File:DC.png|center|frame]]&lt;br /&gt;
&lt;br /&gt;
== &#039;&#039;&#039;Program, Information, and Registration&#039;&#039;&#039; ==&lt;br /&gt;
&lt;br /&gt;
=== &#039;&#039;&#039;Symposium Venue&#039;&#039;&#039; ===&lt;br /&gt;
The HIVE Lab symposium will be held in person at The George Washington University, Washington DC, with an option to join virtually.&lt;br /&gt;
&lt;br /&gt;
In Person - Ross 647, Ross Hall, School of Health and Medical Sciences, The George Washington University, Washington DC ([https://maps.app.goo.gl/PHQmZacA4hWDvTCh6 MAP])&lt;br /&gt;
&lt;br /&gt;
Virtual - Zoom (register using the link below)&lt;br /&gt;
&lt;br /&gt;
=== &#039;&#039;&#039;Zoom Registration&#039;&#039;&#039; ===&lt;br /&gt;
&#039;&#039;&#039;Registration Link  - &amp;lt;nowiki&amp;gt;https://gwu-edu.zoom.us/meeting/register/nziV1ZJQRt6V4ZOEfjpU_g&amp;lt;/nowiki&amp;gt;&#039;&#039;&#039;    &lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Add to - [https://gwu-edu.zoom.us/meeting/tJwlc-irqj8qGtdu0zteQ__E2Fqo0fbPF6P7/calendar/google/add Google Calendar] | [https://gwu-edu.zoom.us/meeting/tJwlc-irqj8qGtdu0zteQ__E2Fqo0fbPF6P7/ics Outlook Calendar] | [https://calendar.yahoo.com/?v=60&amp;amp;VIEW=d&amp;amp;TITLE=2025%20CFDE-GlyGen-HIVE%20Lab%20Summer%20Symposium&amp;amp;in_loc=https%3A%2F%2Fgwu-edu.zoom.us%2Fj%2F98841344003&amp;amp;URL=https%3A%2F%2Fgwu-edu.zoom.us%2Fj%2F98841344003&amp;amp;ST=20250729T140000Z&amp;amp;DUR=0700&amp;amp;DESC=Jeet%20Vora%20%28GlyGen%20-%20GW%29%20is%20inviting%20you%20to%20a%20scheduled%20Zoom%20meeting.%0D%0AJoin%20Zoom%20Meeting%0D%0Ahttps%3A%2F%2Fgwu-edu.zoom.us%2Fj%2F98841344003%0D%0A%0D%0AMeeting%20ID%3A%20988%204134%204003%0D%0A%0D%0A---%0D%0A%0D%0AOne%20tap%20mobile%0D%0A%2B14042012656%2C%2C98841344003%23%20US%0D%0A%2B13097403221%2C%2C98841344003%23%20US%0D%0A%0D%0A---%0D%0A%0D%0ADial%20by%20your%20location%0D%0A%E2%80%A2%20%2B14042012656%20US%0D%0A%E2%80%A2%20%2B1%20309%20740%203221%20US%0D%0A%E2%80%A2%20%2B12122258997%20US%0D%0A%E2%80%A2%20%2B16462551997%20US%0D%0A%E2%80%A2%20%2B14805624901%20US%0D%0A%E2%80%A2%20%2B81345107597%20Japan%0D%0A%E2%80%A2%20%2B41434569439%20Switzerland%0D%0A%E2%80%A2%20%2B44%20201%20151%208517%20United%20Kingdom%0D%0A%E2%80%A2%20%2B442079791833%20United%20Kingdom%0D%0A%E2%80%A2%20%2B61280318153%20Australia%0D%0A%0D%0AMeeting%20ID%3A%20988%204134%204003%0D%0A%0D%0AFind%20your%20local%20number%3A%20https%3A%2F%2Fgwu-edu.zoom.us%2Fu%2FadB0kGNOyd%0D%0A%0D%0A---%0D%0A%0D%0AJoin%20by%20SIP%0D%0A%E2%80%A2%2098841344003%40zoomcrc.com%0D%0A%0D%0A---%0D%0A%0D%0AJoin%20by%20H.323%0D%0A%E2%80%A2%20144.195.19.161%20%28US%20West%29%0D%0A%E2%80%A2%20206.247.11.121%20%28US%20East%29%0D%0A%E2%80%A2%20115.114.131.7%20%28India%20Mumbai%29%0D%0A%E2%80%A2%20115.114.115.7%20%28India%20Hyderabad%29%0D%0A%E2%80%A2%20159.124.15.191%20%28Amsterdam%20Netherlands%29%0D%0A%E2%80%A2%20159.124.47.249%20%28Germany%29%0D%0A%E2%80%A2%20159.124.104.213%2
0%28Australia%20Sydney%29%0D%0A%E2%80%A2%20159.124.74.212%20%28Australia%20Melbourne%29%0D%0A%E2%80%A2%20170.114.180.219%20%28Singapore%29%0D%0A%E2%80%A2%2064.211.144.160%20%28Brazil%29%0D%0A%E2%80%A2%20159.124.132.243%20%28Mexico%29%0D%0A%E2%80%A2%20159.124.168.213%20%28Canada%20Toronto%29%0D%0A%E2%80%A2%20159.124.196.25%20%28Canada%20Vancouver%29%0D%0A%E2%80%A2%20170.114.194.163%20%28Japan%20Tokyo%29%0D%0A%E2%80%A2%20147.124.100.25%20%28Japan%20Osaka%29%0D%0A%0D%0AMeeting%20ID%3A%20988%204134%204003%0D%0A%0D%0A Yahoo Calendar]&#039;&#039;&#039;      &lt;br /&gt;
&lt;br /&gt;
== &#039;&#039;&#039;Agenda&#039;&#039;&#039; ==&lt;br /&gt;
All times are in Eastern Time (ET)&lt;br /&gt;
{| class=&amp;quot;wikitable&amp;quot;&lt;br /&gt;
|&#039;&#039;&#039;Time (ET)&#039;&#039;&#039;&lt;br /&gt;
|&#039;&#039;&#039;Project&#039;&#039;&#039;&lt;br /&gt;
|&#039;&#039;&#039;Title&#039;&#039;&#039;&lt;br /&gt;
|&#039;&#039;&#039;Presenter&#039;&#039;&#039;&lt;br /&gt;
|-&lt;br /&gt;
|&#039;&#039;&#039;10:00am&#039;&#039;&#039;&lt;br /&gt;
| colspan=&amp;quot;2&amp;quot; |                                                                                            &#039;&#039;&#039;Welcome and Introduction&#039;&#039;&#039;&lt;br /&gt;
|&#039;&#039;&#039;Michael Tiemeyer (10 min)&#039;&#039;&#039;&lt;br /&gt;
|-&lt;br /&gt;
| colspan=&amp;quot;4&amp;quot; |                                                                                                                         &#039;&#039;Group 1 Moderator : Nathan Edwards&#039;&#039;&lt;br /&gt;
|-&lt;br /&gt;
|10:10am &lt;br /&gt;
|CFDE&lt;br /&gt;
|Integrating Biocuration and Data Standardization to Generate Machine Learning-Ready Glycan Datasets&lt;br /&gt;
|Ana Jaramillo and Yuxin Zou (20 min)&lt;br /&gt;
|-&lt;br /&gt;
|10:30am&lt;br /&gt;
|CFDE&lt;br /&gt;
|Machine Learning Models for Linkage Prediction in Glycan Images&lt;br /&gt;
|Campbell Ross (15 min)&lt;br /&gt;
|-&lt;br /&gt;
|10:45am&lt;br /&gt;
|CFDE&lt;br /&gt;
|A Graph-Based AI Workflow for Mining Glycan Biomarkers and Related Annotations from Publications&lt;br /&gt;
|Cyrus Chun Hong Au Yeung (15 min)&lt;br /&gt;
|-&lt;br /&gt;
|11:00am&lt;br /&gt;
|BiomarkerKB&lt;br /&gt;
|TBA&lt;br /&gt;
|Sohana Bahl, Isaac Kim, Sparsh Gupta (15 min)&lt;br /&gt;
|-&lt;br /&gt;
|11:15am&lt;br /&gt;
|BiomarkerKB&lt;br /&gt;
|TBA&lt;br /&gt;
|Nathan Ressom, Ana Vohralikova, Mathias Belay (15 min)&lt;br /&gt;
|-&lt;br /&gt;
|11:30am&lt;br /&gt;
|BiomarkerKB&lt;br /&gt;
|TBA&lt;br /&gt;
|John McCaffery, Alma Ogunsina, Akale Kinfe (15 min)&lt;br /&gt;
|-&lt;br /&gt;
|&#039;&#039;&#039;11:45am&#039;&#039;&#039;&lt;br /&gt;
| colspan=&amp;quot;2&amp;quot; |&#039;&#039;&#039;Open Q and A&#039;&#039;&#039;&lt;br /&gt;
|&#039;&#039;&#039;All (30 min)&#039;&#039;&#039;&lt;br /&gt;
|-&lt;br /&gt;
|12:30pm&lt;br /&gt;
| colspan=&amp;quot;3&amp;quot; |                                                                                                          &#039;&#039;&#039;LUNCH (90 mins)&#039;&#039;&#039;&lt;br /&gt;
|-&lt;br /&gt;
| colspan=&amp;quot;4&amp;quot; |                                                                                                                         &#039;&#039;Group 2 Moderator : Rene Ranzinger&#039;&#039;&lt;br /&gt;
|-&lt;br /&gt;
|2:00pm&lt;br /&gt;
|Predictmod AI-READI&lt;br /&gt;
|Robust Classification of Glycemic Health States from Continuous Glucose &lt;br /&gt;
|Nikhil Arethiya (15 min)&lt;br /&gt;
|-&lt;br /&gt;
|2:15pm&lt;br /&gt;
|Predictmod Curation&lt;br /&gt;
|PredictMod: PubMed Curation for Training an LLM for Recommendation&lt;br /&gt;
|Grace Chong, Aaron Ressom, Diya Kamalabharathy (15 min)&lt;br /&gt;
|-&lt;br /&gt;
|2:30pm&lt;br /&gt;
|Glycobiology Web Development&lt;br /&gt;
|A Resource Drill Down and Visualization for the Glyspace Alliance&lt;br /&gt;
|Diya Kamalabharathy (5 min)&lt;br /&gt;
|-&lt;br /&gt;
|2:35pm&lt;br /&gt;
|Argos&lt;br /&gt;
|Curation of Emerging Pathogen Genomes for FDA-ARGOS Database Expansion&lt;br /&gt;
|Miao Wang (15 min)&lt;br /&gt;
|-&lt;br /&gt;
|2:50pm&lt;br /&gt;
|GlyGen&lt;br /&gt;
|GlyGen Biocuration Project&lt;br /&gt;
|Aise Arpinar, Haravinay P. Gujjulla, Nahom Abel (20 min)&lt;br /&gt;
|-&lt;br /&gt;
|3:10pm&lt;br /&gt;
|GlycoSiteMiner&lt;br /&gt;
|TBA&lt;br /&gt;
|(15 min)&lt;br /&gt;
|-&lt;br /&gt;
|&#039;&#039;&#039;3:25pm&#039;&#039;&#039;&lt;br /&gt;
| colspan=&amp;quot;2&amp;quot; |&#039;&#039;&#039;&amp;lt;nowiki&amp;gt;Open Q and A  | Closing Remarks&amp;lt;/nowiki&amp;gt;&#039;&#039;&#039;&lt;br /&gt;
|&#039;&#039;&#039;&amp;lt;nowiki&amp;gt;All (20 min) | Raja Mazumder&amp;lt;/nowiki&amp;gt;&#039;&#039;&#039;&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
== &#039;&#039;&#039;Project Description&#039;&#039;&#039; ==&lt;br /&gt;
&lt;br /&gt;
=== CFDE Project ===&lt;br /&gt;
The CFDE project focuses on integrating biocuration and data standardization to generate machine learning-ready glycan datasets. It brings together curated information and structured metadata to ensure that glycan-related data is both interoperable and computationally accessible. As part of this effort, the project supports the development of machine learning models for linkage prediction in glycan images, enabling automated interpretation of glycan structures from visual representations. In addition, a graph-based AI workflow is being implemented to mine glycan biomarkers and related annotations from scientific publications, helping to uncover novel insights and associations. These approaches collectively advance the integration of glycobiology into broader biomedical research by making glycan data more usable for downstream AI applications.&lt;br /&gt;
&lt;br /&gt;
=== GlyGen Project ===&lt;br /&gt;
The GlyGen Biocuration project focuses on integrating legacy, yet valuable, data from the CarbBank and CFG databases into the GlyGen infrastructure. A key challenge is mapping metadata, such as species names and publication references, to standardized dictionaries and ontologies. While most entries have been automatically matched using custom scripts, remaining inconsistencies, including outdated, misspelled, or abbreviated terms, require manual curation using resources such as Google, PubMed, and domain-specific dictionaries and ontologies.&lt;br /&gt;
&lt;br /&gt;
=== BiomarkerKB Biocuration Project ===&lt;br /&gt;
The Biomarker Biocuration project focuses on curating biomarkers from abstracts and publications into the BiomarkerKB data model. A key challenge is the sheer volume of biomarker data spread across publications. Manual curation requires reading and interpreting the key elements of each biomarker and mapping them to the defined biomarker data model. LLM-based methods can help by recognizing biomarker and condition data, mapping the information found into the data model, and automatically attaching other contextual and standardized data so that the result is AI- and machine learning-ready.&lt;br /&gt;
&lt;br /&gt;
=== ArgosDB Curation Project ===&lt;br /&gt;
This project focuses on evaluating and curating high-quality genomes of emerging and clinically relevant pathogens, with an emphasis on fungal species. Using public genomic repositories and FDA-ARGOS inclusion criteria, the project identifies candidate organisms for database expansion to support diagnostic assay development and public health surveillance.&lt;/div&gt;</summary>
		<author><name>Jeetvora</name></author>
	</entry>
	<entry>
		<id>https://hivelab.biochemistry.gwu.edu/wiki/index.php?title=Symposium_2025&amp;diff=883</id>
		<title>Symposium 2025</title>
		<link rel="alternate" type="text/html" href="https://hivelab.biochemistry.gwu.edu/wiki/index.php?title=Symposium_2025&amp;diff=883"/>
		<updated>2025-07-17T20:46:29Z</updated>

		<summary type="html">&lt;p&gt;Jeetvora: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;The HIVE Lab summer symposium is scheduled for Thursday, July 31, 2025. It is an exciting time for the lab volunteers and interns to present their findings on the projects they worked on for 8 weeks.&lt;br /&gt;
&lt;br /&gt;
[[File:DC.png|center|frame]]&lt;br /&gt;
&lt;br /&gt;
== &#039;&#039;&#039;Program and Information&#039;&#039;&#039; ==&lt;br /&gt;
&lt;br /&gt;
=== &#039;&#039;&#039;Symposium Venue&#039;&#039;&#039; ===&lt;br /&gt;
The HIVE Lab symposium will be held in person at The George Washington University, Washington DC, with an option to join virtually.&lt;br /&gt;
&lt;br /&gt;
In Person - Ross 647, Ross Hall, School of Health and Medical Sciences, The George Washington University, Washington DC ([https://maps.app.goo.gl/PHQmZacA4hWDvTCh6 MAP])&lt;br /&gt;
&lt;br /&gt;
Virtual - Zoom&lt;br /&gt;
&lt;br /&gt;
== &#039;&#039;&#039;Agenda&#039;&#039;&#039; ==&lt;br /&gt;
All times are in Eastern Time (ET)&lt;br /&gt;
{| class=&amp;quot;wikitable&amp;quot;&lt;br /&gt;
|&#039;&#039;&#039;Time (ET)&#039;&#039;&#039;&lt;br /&gt;
|&#039;&#039;&#039;Project&#039;&#039;&#039;&lt;br /&gt;
|&#039;&#039;&#039;Title&#039;&#039;&#039;&lt;br /&gt;
|&#039;&#039;&#039;Presenter&#039;&#039;&#039;&lt;br /&gt;
|-&lt;br /&gt;
|&#039;&#039;&#039;10:00am&#039;&#039;&#039;&lt;br /&gt;
| colspan=&amp;quot;2&amp;quot; |                                                                                            &#039;&#039;&#039;Welcome and Introduction&#039;&#039;&#039;&lt;br /&gt;
|&#039;&#039;&#039;Michael Tiemeyer (10 min)&#039;&#039;&#039;&lt;br /&gt;
|-&lt;br /&gt;
| colspan=&amp;quot;4&amp;quot; |                                                                                                                         &#039;&#039;Group 1 Moderator : Nathan Edwards&#039;&#039;&lt;br /&gt;
|-&lt;br /&gt;
|10:10am &lt;br /&gt;
|CFDE&lt;br /&gt;
|Integrating Biocuration and Data Standardization to Generate Machine Learning-Ready Glycan Datasets&lt;br /&gt;
|Ana Jaramillo and Yuxin Zou (20 min)&lt;br /&gt;
|-&lt;br /&gt;
|10:30am&lt;br /&gt;
|CFDE&lt;br /&gt;
|Machine Learning Models for Linkage Prediction in Glycan Images&lt;br /&gt;
|Campbell Ross (15 min)&lt;br /&gt;
|-&lt;br /&gt;
|10:45am&lt;br /&gt;
|CFDE&lt;br /&gt;
|A Graph-Based AI Workflow for Mining Glycan Biomarkers and Related Annotations from Publications&lt;br /&gt;
|Cyrus Chun Hong Au Yeung (15 min)&lt;br /&gt;
|-&lt;br /&gt;
|11:00am&lt;br /&gt;
|BiomarkerKB&lt;br /&gt;
|&lt;br /&gt;
|Sohana Bahl, Isaac Kim, Sparsh Gupta (15 min)&lt;br /&gt;
|-&lt;br /&gt;
|11:15am&lt;br /&gt;
|BiomarkerKB&lt;br /&gt;
|&lt;br /&gt;
|Nathan Ressom, Ana Vohralikova, Mathias Belay (15 min)&lt;br /&gt;
|-&lt;br /&gt;
|11:30am&lt;br /&gt;
|BiomarkerKB&lt;br /&gt;
|&lt;br /&gt;
|John McCaffery, Alma Ogunsina, Akale Kinfe (15 min)&lt;br /&gt;
|-&lt;br /&gt;
|&#039;&#039;&#039;11:45am&#039;&#039;&#039;&lt;br /&gt;
| colspan=&amp;quot;2&amp;quot; |&#039;&#039;&#039;Open Q and A&#039;&#039;&#039;&lt;br /&gt;
|&#039;&#039;&#039;All (30 min)&#039;&#039;&#039;&lt;br /&gt;
|-&lt;br /&gt;
|12:30pm&lt;br /&gt;
| colspan=&amp;quot;3&amp;quot; |                                                                                                          &#039;&#039;&#039;LUNCH (90 mins)&#039;&#039;&#039;&lt;br /&gt;
|-&lt;br /&gt;
| colspan=&amp;quot;4&amp;quot; |                                                                                                                         &#039;&#039;Group 2 Moderator : Rene Ranzinger&#039;&#039;&lt;br /&gt;
|-&lt;br /&gt;
|2:00pm&lt;br /&gt;
|Predictmod AI-READI&lt;br /&gt;
|Robust Classification of Glycemic Health States from Continuous Glucose &lt;br /&gt;
|Nikhil Arethiya (15 min)&lt;br /&gt;
|-&lt;br /&gt;
|2:15pm&lt;br /&gt;
|Predictmod Curation&lt;br /&gt;
|PredictMod: PubMed Curation for Training an LLM for Recommendation&lt;br /&gt;
|Grace Chong, Aaron Ressom, Diya Kamalabharathy (15 min)&lt;br /&gt;
|-&lt;br /&gt;
|2:30pm&lt;br /&gt;
|Glycobiology Web Development&lt;br /&gt;
|A Resource Drill Down and Visualization for the Glyspace Alliance&lt;br /&gt;
|Diya Kamalabharathy (5 min)&lt;br /&gt;
|-&lt;br /&gt;
|2:35pm&lt;br /&gt;
|Argos&lt;br /&gt;
|Curation of Emerging Pathogen Genomes for FDA-ARGOS Database Expansion&lt;br /&gt;
|Miao Wang (15 min)&lt;br /&gt;
|-&lt;br /&gt;
|2:50pm&lt;br /&gt;
|GlyGen&lt;br /&gt;
|GlyGen Biocuration Project&lt;br /&gt;
|Aise Arpinar, Haravinay P. Gujjulla, Nahom Abel (20 min)&lt;br /&gt;
|-&lt;br /&gt;
|3:10pm&lt;br /&gt;
|GlycoSiteMiner&lt;br /&gt;
|&lt;br /&gt;
|(15 min)&lt;br /&gt;
|-&lt;br /&gt;
|&#039;&#039;&#039;3:25pm&#039;&#039;&#039;&lt;br /&gt;
| colspan=&amp;quot;2&amp;quot; |&#039;&#039;&#039;&amp;lt;nowiki&amp;gt;Open Q and A  | Closing Remarks&amp;lt;/nowiki&amp;gt;&#039;&#039;&#039;&lt;br /&gt;
|&#039;&#039;&#039;&amp;lt;nowiki&amp;gt;All (20 min) | Raja Mazumder&amp;lt;/nowiki&amp;gt;&#039;&#039;&#039;&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
== &#039;&#039;&#039;Project Description&#039;&#039;&#039; ==&lt;br /&gt;
&lt;br /&gt;
=== CFDE Project ===&lt;br /&gt;
The CFDE project focuses on integrating biocuration and data standardization to generate machine learning-ready glycan datasets. It brings together curated information and structured metadata to ensure that glycan-related data is both interoperable and computationally accessible. As part of this effort, the project supports the development of machine learning models for linkage prediction in glycan images, enabling automated interpretation of glycan structures from visual representations. In addition, a graph-based AI workflow is being implemented to mine glycan biomarkers and related annotations from scientific publications, helping to uncover novel insights and associations. These approaches collectively advance the integration of glycobiology into broader biomedical research by making glycan data more usable for downstream AI applications.&lt;br /&gt;
&lt;br /&gt;
=== GlyGen Project ===&lt;br /&gt;
The GlyGen Biocuration project focuses on integrating legacy, yet valuable, data from the CarbBank and CFG databases into the GlyGen infrastructure. A key challenge is mapping metadata, such as species names and publication references, to standardized dictionaries and ontologies. While most entries have been automatically matched using custom scripts, remaining inconsistencies, including outdated, misspelled, or abbreviated terms, require manual curation using resources such as Google, PubMed, and domain-specific dictionaries and ontologies.&lt;br /&gt;
&lt;br /&gt;
=== BiomarkerKB Biocuration Project ===&lt;br /&gt;
The Biomarker Biocuration project focuses on curating biomarkers from abstracts and publications into the BiomarkerKB data model. A key challenge is the sheer volume of biomarker data spread across publications. Manual curation requires reading and interpreting the key elements of each biomarker and mapping them to the defined biomarker data model. LLM-based methods can help by recognizing biomarker and condition data, mapping the information found into the data model, and automatically attaching other contextual and standardized data so that the result is AI- and machine learning-ready.&lt;br /&gt;
&lt;br /&gt;
=== ArgosDB Curation Project ===&lt;br /&gt;
This project focuses on evaluating and curating high-quality genomes of emerging and clinically relevant pathogens, with an emphasis on fungal species. Using public genomic repositories and FDA-ARGOS inclusion criteria, the project identifies candidate organisms for database expansion to support diagnostic assay development and public health surveillance.&lt;/div&gt;</summary>
		<author><name>Jeetvora</name></author>
	</entry>
	<entry>
		<id>https://hivelab.biochemistry.gwu.edu/wiki/index.php?title=Symposium_2025&amp;diff=882</id>
		<title>Symposium 2025</title>
		<link rel="alternate" type="text/html" href="https://hivelab.biochemistry.gwu.edu/wiki/index.php?title=Symposium_2025&amp;diff=882"/>
		<updated>2025-07-17T19:20:44Z</updated>

		<summary type="html">&lt;p&gt;Jeetvora: /* Agenda */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;The HIVE Lab summer symposium is scheduled for Thursday, July 31, 2025. It is an exciting time for the lab volunteers and interns to present their findings on the projects they worked on for 8 weeks.&lt;br /&gt;
&lt;br /&gt;
[[File:DC.png|center|frame]]&lt;br /&gt;
&lt;br /&gt;
== &#039;&#039;&#039;Program and Information&#039;&#039;&#039; ==&lt;br /&gt;
&lt;br /&gt;
=== &#039;&#039;&#039;Symposium Venue&#039;&#039;&#039; ===&lt;br /&gt;
The HIVE Lab symposium will be held in person at The George Washington University, Washington DC, with an option to join virtually.&lt;br /&gt;
&lt;br /&gt;
In Person - Ross 647, Ross Hall, School of Health and Medical Sciences, The George Washington University, Washington DC ([https://maps.app.goo.gl/PHQmZacA4hWDvTCh6 MAP])&lt;br /&gt;
&lt;br /&gt;
Virtual - Zoom&lt;br /&gt;
&lt;br /&gt;
== &#039;&#039;&#039;Agenda&#039;&#039;&#039; ==&lt;br /&gt;
All times are in Eastern Time (ET)&lt;br /&gt;
{| class=&amp;quot;wikitable&amp;quot;&lt;br /&gt;
|&#039;&#039;&#039;Time (ET)&#039;&#039;&#039;&lt;br /&gt;
|&#039;&#039;&#039;Project&#039;&#039;&#039;&lt;br /&gt;
|&#039;&#039;&#039;Title&#039;&#039;&#039;&lt;br /&gt;
|&#039;&#039;&#039;Presenter&#039;&#039;&#039;&lt;br /&gt;
|-&lt;br /&gt;
|&#039;&#039;&#039;10:00am&#039;&#039;&#039;&lt;br /&gt;
| colspan=&amp;quot;2&amp;quot; |                                                                                            &#039;&#039;&#039;Welcome and Introduction&#039;&#039;&#039;&lt;br /&gt;
|&#039;&#039;&#039;Michael Tiemeyer (10 min)&#039;&#039;&#039;&lt;br /&gt;
|-&lt;br /&gt;
| colspan=&amp;quot;4&amp;quot; |                                                                                                                         &#039;&#039;Group 1 Moderator : Nathan Edwards&#039;&#039;&lt;br /&gt;
|-&lt;br /&gt;
|10:10am &lt;br /&gt;
|CFDE&lt;br /&gt;
|Integrating Biocuration and Data Standardization to Generate Machine Learning-Ready Glycan Datasets&lt;br /&gt;
|Ana Jaramillo and Yuxin Zou (20 min)&lt;br /&gt;
|-&lt;br /&gt;
|10:30am&lt;br /&gt;
|CFDE&lt;br /&gt;
|&lt;br /&gt;
|Campbell Ross (15 min)&lt;br /&gt;
|-&lt;br /&gt;
|10:45am&lt;br /&gt;
|CFDE&lt;br /&gt;
|A Graph-Based AI Workflow for Mining Glycan Biomarkers and Related Annotations from Publications&lt;br /&gt;
|Cyrus Chun Hong Au Yeung (15 min)&lt;br /&gt;
|-&lt;br /&gt;
|11:00am&lt;br /&gt;
|BiomarkerKB&lt;br /&gt;
|&lt;br /&gt;
|Sohana Bahl, Isaac Kim, Sparsh Gupta (15 min)&lt;br /&gt;
|-&lt;br /&gt;
|11:15am&lt;br /&gt;
|BiomarkerKB&lt;br /&gt;
|&lt;br /&gt;
|Nathan Ressom, Ana Vohralikova, Mathias Belay (15 min)&lt;br /&gt;
|-&lt;br /&gt;
|11:30am&lt;br /&gt;
|BiomarkerKB&lt;br /&gt;
|&lt;br /&gt;
|John McCaffery, Alma Ogunsina, Akale Kinfe (15 min)&lt;br /&gt;
|-&lt;br /&gt;
|&#039;&#039;&#039;11:45am&#039;&#039;&#039;&lt;br /&gt;
| colspan=&amp;quot;2&amp;quot; |&#039;&#039;&#039;Open Q and A&#039;&#039;&#039;&lt;br /&gt;
|&#039;&#039;&#039;All (30 min)&#039;&#039;&#039;&lt;br /&gt;
|-&lt;br /&gt;
|12:30pm&lt;br /&gt;
| colspan=&amp;quot;3&amp;quot; |                                                                                                          &#039;&#039;&#039;LUNCH (90 mins)&#039;&#039;&#039;&lt;br /&gt;
|-&lt;br /&gt;
| colspan=&amp;quot;4&amp;quot; |                                                                                                                         &#039;&#039;Group 2 Moderator : Rene Ranzinger&#039;&#039;&lt;br /&gt;
|-&lt;br /&gt;
|2:00pm&lt;br /&gt;
|Predictmod AI-READI&lt;br /&gt;
|Robust Classification of Glycemic Health States from Continuous Glucose &lt;br /&gt;
|Nikhil Arethiya (15 min)&lt;br /&gt;
|-&lt;br /&gt;
|2:15pm&lt;br /&gt;
|Predictmod Curation&lt;br /&gt;
|PredictMod: PubMed Curation for Training an LLM for Recommendation&lt;br /&gt;
|Grace Chong, Aaron Ressom, Diya Kamalabharathy (15 min)&lt;br /&gt;
|-&lt;br /&gt;
|2:30pm&lt;br /&gt;
|Argos&lt;br /&gt;
|Curation of Emerging Pathogen Genomes for FDA-ARGOS Database Expansion&lt;br /&gt;
|Miao Wang (15 min)&lt;br /&gt;
|-&lt;br /&gt;
|2:45pm&lt;br /&gt;
|GlyGen&lt;br /&gt;
|GlyGen Biocuration Project&lt;br /&gt;
|Aise Arpinar, Haravinay P. Gujjulla, Nahom Abel (20 min)&lt;br /&gt;
|-&lt;br /&gt;
|3:05pm&lt;br /&gt;
|GlycoSiteMiner&lt;br /&gt;
|&lt;br /&gt;
|(15 min)&lt;br /&gt;
|-&lt;br /&gt;
|3:20pm&lt;br /&gt;
|Glycobiology Web Development&lt;br /&gt;
|A Resource Drill Down and Visualization for the Glyspace Alliance&lt;br /&gt;
|Diya Kamalabharathy (5 min)&lt;br /&gt;
|-&lt;br /&gt;
|&#039;&#039;&#039;3:25pm&#039;&#039;&#039;&lt;br /&gt;
| colspan=&amp;quot;2&amp;quot; |&#039;&#039;&#039;&amp;lt;nowiki&amp;gt;Open Q and A  | Closing Remarks&amp;lt;/nowiki&amp;gt;&#039;&#039;&#039;&lt;br /&gt;
|&#039;&#039;&#039;&amp;lt;nowiki&amp;gt;All (20 min) | Raja Mazumder&amp;lt;/nowiki&amp;gt;&#039;&#039;&#039;&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
== &#039;&#039;&#039;Project Description&#039;&#039;&#039; ==&lt;br /&gt;
&lt;br /&gt;
=== GlyGen Project ===&lt;br /&gt;
The GlyGen Biocuration project focuses on integrating legacy, yet valuable, data from the CarbBank and CFG databases into the GlyGen infrastructure. A key challenge is mapping metadata, such as species names and publication references, to standardized dictionaries and ontologies. While most entries have been automatically matched using custom scripts, remaining inconsistencies, including outdated, misspelled, or abbreviated terms, require manual curation using resources such as Google, PubMed, and domain-specific dictionaries and ontologies.&lt;br /&gt;
&lt;br /&gt;
=== BiomarkerKB Biocuration Project ===&lt;br /&gt;
The BiomarkerKB Biocuration project focuses on curating biomarkers from abstracts and publications into the BiomarkerKB data model. A key challenge is the vast amount of biomarker data spread across publications. Manual curation requires reading and understanding the key elements of biomarker data and mapping them to the defined data model. LLM methodologies are expected to help by recognizing biomarker and condition data, mapping the extracted information into the data model, and automatically mapping other contextual and standardized data to the model, so that the data is AI- and machine-learning-ready.&lt;br /&gt;
&lt;br /&gt;
=== ArgosDB Curation Project ===&lt;br /&gt;
This project focuses on evaluating and curating high-quality genomes of emerging and clinically relevant pathogens, with an emphasis on fungal species. Using public genomic repositories and FDA-ARGOS inclusion criteria, I identify candidate organisms for database expansion to support diagnostic assay development and public health surveillance.&lt;/div&gt;</summary>
		<author><name>Jeetvora</name></author>
	</entry>
	<entry>
		<id>https://hivelab.biochemistry.gwu.edu/wiki/index.php?title=Symposium_2025&amp;diff=881</id>
		<title>Symposium 2025</title>
		<link rel="alternate" type="text/html" href="https://hivelab.biochemistry.gwu.edu/wiki/index.php?title=Symposium_2025&amp;diff=881"/>
		<updated>2025-07-17T19:18:08Z</updated>

		<summary type="html">&lt;p&gt;Jeetvora: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;The HIVE Lab summer symposium is scheduled for Thursday, July 31, 2025. It is an exciting time for the lab volunteers and interns to present their findings on the projects they worked on for 8 weeks.&lt;br /&gt;
&lt;br /&gt;
[[File:DC.png|center|frame]]&lt;br /&gt;
&lt;br /&gt;
== &#039;&#039;&#039;Program and Information&#039;&#039;&#039; ==&lt;br /&gt;
&lt;br /&gt;
=== &#039;&#039;&#039;Symposium Venue&#039;&#039;&#039; ===&lt;br /&gt;
The HIVE Lab symposium will be held in person at The George Washington University, Washington, DC, with an option to join virtually.&lt;br /&gt;
&lt;br /&gt;
In Person - Ross 647, Ross Hall, School of Medicine and Health Sciences, The George Washington University, Washington, DC ([https://maps.app.goo.gl/PHQmZacA4hWDvTCh6 MAP])&lt;br /&gt;
&lt;br /&gt;
Virtual - Zoom&lt;br /&gt;
&lt;br /&gt;
== &#039;&#039;&#039;Agenda&#039;&#039;&#039; ==&lt;br /&gt;
All times are in Eastern Time (ET).&lt;br /&gt;
{| class=&amp;quot;wikitable&amp;quot;&lt;br /&gt;
|&#039;&#039;&#039;Time (ET)&#039;&#039;&#039;&lt;br /&gt;
|&#039;&#039;&#039;Project&#039;&#039;&#039;&lt;br /&gt;
|&#039;&#039;&#039;Title&#039;&#039;&#039;&lt;br /&gt;
|&#039;&#039;&#039;Presenter&#039;&#039;&#039;&lt;br /&gt;
|-&lt;br /&gt;
|&#039;&#039;&#039;10:00am&#039;&#039;&#039;&lt;br /&gt;
| colspan=&amp;quot;2&amp;quot; |                                                                                            &#039;&#039;&#039;Welcome and Introduction&#039;&#039;&#039;&lt;br /&gt;
|&#039;&#039;&#039;Michael Tiemeyer (10 min)&#039;&#039;&#039;&lt;br /&gt;
|-&lt;br /&gt;
| colspan=&amp;quot;4&amp;quot; |                                                                                                                         &#039;&#039;Group 1 Moderator : Nathan Edwards&#039;&#039;&lt;br /&gt;
|-&lt;br /&gt;
|10:10am &lt;br /&gt;
|CFDE&lt;br /&gt;
|Integrating Biocuration and Data Standardization to Generate Machine Learning-Ready Glycan Datasets&lt;br /&gt;
|Ana Jaramillo and Yuxin Zou (20 min)&lt;br /&gt;
|-&lt;br /&gt;
|10:30am&lt;br /&gt;
|CFDE&lt;br /&gt;
|&lt;br /&gt;
|Campbell Ross (15 min)&lt;br /&gt;
|-&lt;br /&gt;
|10:45am&lt;br /&gt;
|CFDE&lt;br /&gt;
|A Graph-Based AI Workflow for Mining Glycan Biomarkers and Related Annotations from Publications&lt;br /&gt;
|Cyrus Chun Hong Au Yeung (15 min)&lt;br /&gt;
|-&lt;br /&gt;
|11:00am&lt;br /&gt;
|BiomarkerKB&lt;br /&gt;
|&lt;br /&gt;
|Sohana Bahl, Isaac Kim, Sparsh Gupta (15 min)&lt;br /&gt;
|-&lt;br /&gt;
|11:15am&lt;br /&gt;
|BiomarkerKB&lt;br /&gt;
|&lt;br /&gt;
|Nathan Ressom, Ana Vohralikova, Mathias Belay (15 min)&lt;br /&gt;
|-&lt;br /&gt;
|11:30am&lt;br /&gt;
|BiomarkerKB&lt;br /&gt;
|&lt;br /&gt;
|John McCaffrey, Alma Ogunsina, Akale Kinfe (15 min)&lt;br /&gt;
|-&lt;br /&gt;
|&#039;&#039;&#039;11:45am&#039;&#039;&#039;&lt;br /&gt;
| colspan=&amp;quot;2&amp;quot; |&#039;&#039;&#039;Open Q and A&#039;&#039;&#039;&lt;br /&gt;
|&#039;&#039;&#039;All (30 min)&#039;&#039;&#039;&lt;br /&gt;
|-&lt;br /&gt;
|12:30pm&lt;br /&gt;
| colspan=&amp;quot;3&amp;quot; |                                                                                                          &#039;&#039;&#039;LUNCH (90 mins)&#039;&#039;&#039;&lt;br /&gt;
|-&lt;br /&gt;
| colspan=&amp;quot;4&amp;quot; |                                                                                                                         &#039;&#039;Group 2 Moderator : Rene Ranzinger&#039;&#039;&lt;br /&gt;
|-&lt;br /&gt;
|2:00pm&lt;br /&gt;
|PredictMod AI-READI&lt;br /&gt;
|Robust Classification of Glycemic Health States from Continuous Glucose &lt;br /&gt;
|Nikhil Arethiya (15 min)&lt;br /&gt;
|-&lt;br /&gt;
|2:15pm&lt;br /&gt;
|PredictMod Curation&lt;br /&gt;
|PredictMod: PubMed Curation for Training an LLM for Recommendation&lt;br /&gt;
|Grace Chong, Aaron Ressom, Diya Kamalabharathy (15 min)&lt;br /&gt;
|-&lt;br /&gt;
|2:30pm&lt;br /&gt;
|Argos&lt;br /&gt;
|Curation of Emerging Pathogen Genomes for FDA-ARGOS Database Expansion&lt;br /&gt;
|Miao Wang (15 min)&lt;br /&gt;
|-&lt;br /&gt;
|2:45pm&lt;br /&gt;
|GlyGen&lt;br /&gt;
|GlyGen Biocuration Project&lt;br /&gt;
|Aise Arpinar, Harivinay P. Gujjula, Nahom Abel (20 min)&lt;br /&gt;
|-&lt;br /&gt;
|3:05pm&lt;br /&gt;
|GlycoSiteMiner&lt;br /&gt;
|&lt;br /&gt;
|(15 min)&lt;br /&gt;
|-&lt;br /&gt;
|3:20pm&lt;br /&gt;
|Glycobiology Web Development&lt;br /&gt;
|A Resource Drill Down and Visualization for the Glyspace Alliance&lt;br /&gt;
|Diya Kamalabharathy (5 min)&lt;br /&gt;
|-&lt;br /&gt;
|&#039;&#039;&#039;3:25pm&#039;&#039;&#039;&lt;br /&gt;
| colspan=&amp;quot;2&amp;quot; |&#039;&#039;&#039;&amp;lt;nowiki&amp;gt;Open Q and A  | Closing Remarks&amp;lt;/nowiki&amp;gt;&#039;&#039;&#039;&lt;br /&gt;
|&#039;&#039;&#039;&amp;lt;nowiki&amp;gt;All (20 min) | Raja Mazumder&amp;lt;/nowiki&amp;gt;&#039;&#039;&#039;&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
== &#039;&#039;&#039;Project Description&#039;&#039;&#039; ==&lt;br /&gt;
&lt;br /&gt;
=== GlyGen Project ===&lt;br /&gt;
The GlyGen Biocuration project focuses on integrating legacy, yet valuable, data from the CarbBank and CFG databases into the GlyGen infrastructure. A key challenge is mapping metadata, such as species names and publication references, to standardized dictionaries and ontologies. While most entries have been automatically matched using custom scripts, remaining inconsistencies, including outdated, misspelled, or abbreviated terms, require manual curation using resources such as Google, PubMed, and domain-specific dictionaries and ontologies.&lt;br /&gt;
&lt;br /&gt;
=== BiomarkerKB Biocuration Project ===&lt;br /&gt;
The BiomarkerKB Biocuration project focuses on curating biomarkers from abstracts and publications into the BiomarkerKB data model. A key challenge is the vast amount of biomarker data spread across publications. Manual curation requires reading and understanding the key elements of biomarker data and mapping them to the defined data model. LLM methodologies are expected to help by recognizing biomarker and condition data, mapping the extracted information into the data model, and automatically mapping other contextual and standardized data to the model, so that the data is AI- and machine-learning-ready.&lt;br /&gt;
&lt;br /&gt;
=== ArgosDB Curation Project ===&lt;br /&gt;
This project focuses on evaluating and curating high-quality genomes of emerging and clinically relevant pathogens, with an emphasis on fungal species. Using public genomic repositories and FDA-ARGOS inclusion criteria, I identify candidate organisms for database expansion to support diagnostic assay development and public health surveillance.&lt;/div&gt;</summary>
		<author><name>Jeetvora</name></author>
	</entry>
	<entry>
		<id>https://hivelab.biochemistry.gwu.edu/wiki/index.php?title=Symposium_2025&amp;diff=872</id>
		<title>Symposium 2025</title>
		<link rel="alternate" type="text/html" href="https://hivelab.biochemistry.gwu.edu/wiki/index.php?title=Symposium_2025&amp;diff=872"/>
		<updated>2025-07-17T14:50:02Z</updated>

		<summary type="html">&lt;p&gt;Jeetvora: Created page with &amp;quot;The HIVE Lab symposium is scheduled for Thursday July 31, 2025. It is an exciting time for the lab volunteers and interns to present their finding on the projects they worked on for 8 weeks.  frame  == &amp;#039;&amp;#039;&amp;#039;Program and Information&amp;#039;&amp;#039;&amp;#039; ==  === &amp;#039;&amp;#039;&amp;#039;Symposium Venue&amp;#039;&amp;#039;&amp;#039; === The HIVE lab symposium will held in person at The George Washington University, Washington DC with an option to join virtually.  In Person - Ross 647, Ross Hall, School of Health and Med...&amp;quot;&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;The HIVE Lab symposium is scheduled for Thursday, July 31, 2025. It is an exciting time for the lab volunteers and interns to present their findings on the projects they worked on for 8 weeks.&lt;br /&gt;
&lt;br /&gt;
[[File:DC.png|center|frame]]&lt;br /&gt;
&lt;br /&gt;
== &#039;&#039;&#039;Program and Information&#039;&#039;&#039; ==&lt;br /&gt;
&lt;br /&gt;
=== &#039;&#039;&#039;Symposium Venue&#039;&#039;&#039; ===&lt;br /&gt;
The HIVE Lab symposium will be held in person at The George Washington University, Washington, DC, with an option to join virtually.&lt;br /&gt;
&lt;br /&gt;
In Person - Ross 647, Ross Hall, School of Medicine and Health Sciences, The George Washington University, Washington, DC ([https://maps.app.goo.gl/PHQmZacA4hWDvTCh6 MAP])&lt;br /&gt;
&lt;br /&gt;
Virtual - Zoom&lt;br /&gt;
&lt;br /&gt;
== &#039;&#039;&#039;Agenda&#039;&#039;&#039; ==&lt;br /&gt;
All times are in Eastern Time (ET).&lt;br /&gt;
{| class=&amp;quot;wikitable&amp;quot;&lt;br /&gt;
|&#039;&#039;&#039;Time (ET)&#039;&#039;&#039;&lt;br /&gt;
|&#039;&#039;&#039;Project&#039;&#039;&#039;&lt;br /&gt;
|&#039;&#039;&#039;Title&#039;&#039;&#039;&lt;br /&gt;
|&#039;&#039;&#039;Presenter&#039;&#039;&#039;&lt;br /&gt;
|-&lt;br /&gt;
|&#039;&#039;&#039;10:00am&#039;&#039;&#039;&lt;br /&gt;
| colspan=&amp;quot;2&amp;quot; |                                                                                            &#039;&#039;&#039;Welcome and Introduction&#039;&#039;&#039;&lt;br /&gt;
|&#039;&#039;&#039;Michael Tiemeyer (10 min)&#039;&#039;&#039;&lt;br /&gt;
|-&lt;br /&gt;
| colspan=&amp;quot;4&amp;quot; |                                                                                                                         &#039;&#039;Group 1 Moderator : Nathan Edwards&#039;&#039;&lt;br /&gt;
|-&lt;br /&gt;
|10:10am &lt;br /&gt;
|CFDE&lt;br /&gt;
|Integrating Biocuration and Data Standardization to Generate Machine Learning-Ready Glycan Datasets&lt;br /&gt;
|Ana Jaramillo and Yuxin Zou (20 min)&lt;br /&gt;
|-&lt;br /&gt;
|10:30am&lt;br /&gt;
|CFDE&lt;br /&gt;
|&lt;br /&gt;
|Campbell Ross (15 min)&lt;br /&gt;
|-&lt;br /&gt;
|10:45am&lt;br /&gt;
|CFDE&lt;br /&gt;
|A Graph-Based AI Workflow for Mining Glycan Biomarkers and Related Annotations from Publications&lt;br /&gt;
|Cyrus Chun Hong Au Yeung (15 min)&lt;br /&gt;
|-&lt;br /&gt;
|11:00am&lt;br /&gt;
|BiomarkerKB&lt;br /&gt;
|&lt;br /&gt;
|(15 min)&lt;br /&gt;
|-&lt;br /&gt;
|11:15am&lt;br /&gt;
|BiomarkerKB&lt;br /&gt;
|&lt;br /&gt;
|(15 min)&lt;br /&gt;
|-&lt;br /&gt;
|11:30am&lt;br /&gt;
|BiomarkerKB&lt;br /&gt;
|&lt;br /&gt;
|(15 min)&lt;br /&gt;
|-&lt;br /&gt;
|&#039;&#039;&#039;11:45am&#039;&#039;&#039;&lt;br /&gt;
| colspan=&amp;quot;2&amp;quot; |&#039;&#039;&#039;Open Q and A&#039;&#039;&#039;&lt;br /&gt;
|&#039;&#039;&#039;All (30 min)&#039;&#039;&#039;&lt;br /&gt;
|-&lt;br /&gt;
|12:30pm&lt;br /&gt;
| colspan=&amp;quot;3&amp;quot; |                                                                                                          &#039;&#039;&#039;LUNCH (90 mins)&#039;&#039;&#039;&lt;br /&gt;
|-&lt;br /&gt;
| colspan=&amp;quot;4&amp;quot; |                                                                                                                         &#039;&#039;Group 1 Moderator : Nathan Edwards&#039;&#039;&lt;br /&gt;
|-&lt;br /&gt;
|2:00pm&lt;br /&gt;
|PredictMod AI-READI&lt;br /&gt;
|Robust Classification of Glycemic Health States from Continuous Glucose &lt;br /&gt;
|Nikhil Arethiya (15 min)&lt;br /&gt;
|-&lt;br /&gt;
|2:15pm&lt;br /&gt;
|PredictMod Curation&lt;br /&gt;
|PredictMod: PubMed Curation for Training an LLM for Recommendation&lt;br /&gt;
|Grace Chong, Aaron Ressom, Diya Kamalabharathy (15 min)&lt;br /&gt;
|-&lt;br /&gt;
|2:30pm&lt;br /&gt;
|Argos&lt;br /&gt;
|&lt;br /&gt;
|(15 min)&lt;br /&gt;
|-&lt;br /&gt;
|2:45pm&lt;br /&gt;
|GlyGen&lt;br /&gt;
|&lt;br /&gt;
|(20 min)&lt;br /&gt;
|-&lt;br /&gt;
|3:05pm&lt;br /&gt;
|GlycoSiteMiner&lt;br /&gt;
|&lt;br /&gt;
|(15 min)&lt;br /&gt;
|-&lt;br /&gt;
|3:20pm&lt;br /&gt;
|Glycobiology Web Development&lt;br /&gt;
|A Resource Drill Down and Visualization for the Glyspace Alliance&lt;br /&gt;
|Diya Kamalabharathy (5 min)&lt;br /&gt;
|-&lt;br /&gt;
|&#039;&#039;&#039;3:25pm&#039;&#039;&#039;&lt;br /&gt;
| colspan=&amp;quot;2&amp;quot; |&#039;&#039;&#039;Open Q and A&#039;&#039;&#039;&lt;br /&gt;
|&#039;&#039;&#039;All (20 min)&#039;&#039;&#039;&lt;br /&gt;
|-&lt;br /&gt;
|3:45pm&lt;br /&gt;
| colspan=&amp;quot;2&amp;quot; |                                                                                       &#039;&#039;&#039;Closing Remarks&#039;&#039;&#039;&lt;br /&gt;
|&#039;&#039;&#039;Raja Mazumder&#039;&#039;&#039;&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
== &#039;&#039;&#039;Project Description&#039;&#039;&#039; ==&lt;br /&gt;
&lt;br /&gt;
=== GlyGen Project ===&lt;br /&gt;
The GlyGen Biocuration project focuses on integrating legacy, yet valuable, data from the CarbBank and CFG databases into the GlyGen infrastructure. A key challenge is mapping metadata, such as species names and publication references, to standardized dictionaries and ontologies. While most entries have been automatically matched using custom scripts, remaining inconsistencies, including outdated, misspelled, or abbreviated terms, require manual curation using resources such as Google, PubMed, and domain-specific dictionaries and ontologies.&lt;/div&gt;</summary>
		<author><name>Jeetvora</name></author>
	</entry>
	<entry>
		<id>https://hivelab.biochemistry.gwu.edu/wiki/index.php?title=File:DC.png&amp;diff=871</id>
		<title>File:DC.png</title>
		<link rel="alternate" type="text/html" href="https://hivelab.biochemistry.gwu.edu/wiki/index.php?title=File:DC.png&amp;diff=871"/>
		<updated>2025-07-17T14:00:45Z</updated>

		<summary type="html">&lt;p&gt;Jeetvora: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;DC&lt;/div&gt;</summary>
		<author><name>Jeetvora</name></author>
	</entry>
	<entry>
		<id>https://hivelab.biochemistry.gwu.edu/wiki/index.php?title=Volunteership_2025&amp;diff=870</id>
		<title>Volunteership 2025</title>
		<link rel="alternate" type="text/html" href="https://hivelab.biochemistry.gwu.edu/wiki/index.php?title=Volunteership_2025&amp;diff=870"/>
		<updated>2025-07-16T20:15:15Z</updated>

		<summary type="html">&lt;p&gt;Jeetvora: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;h2&amp;gt;2025 Volunteer Program Details&amp;lt;/h2&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;h3&amp;gt;Dates&amp;lt;/h3&amp;gt;&lt;br /&gt;
&amp;lt;strong&amp;gt;Volunteer Zoom Kick-Off Meeting&amp;lt;/strong&amp;gt;&amp;lt;br&amp;gt;&lt;br /&gt;
May 27, 2025 | 3:30 to 4:30 PM&lt;br /&gt;
&lt;br /&gt;
&amp;lt;strong&amp;gt;Program Dates: June 2nd, 2025 – July 25th, 2025&amp;lt;/strong&amp;gt; (8 weeks)&amp;lt;br&amp;gt;&lt;br /&gt;
Monday to Friday | Remote | No breaks&lt;br /&gt;
&lt;br /&gt;
&amp;lt;hr&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;h3&amp;gt;Volunteer Expectations&amp;lt;/h3&amp;gt;&lt;br /&gt;
&amp;lt;ol&amp;gt;&lt;br /&gt;
  &amp;lt;li&amp;gt;Daily progress updates via Slack (scrum).&amp;lt;/li&amp;gt;&lt;br /&gt;
  &amp;lt;li&amp;gt;Regular Zoom meetings with the assigned project point of contact.&amp;lt;/li&amp;gt;&amp;lt;li&amp;gt;Expected to dedicate 5–6 hours per day to project work, with the remaining time focused on skill development or reading. &amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ol&amp;gt;&lt;br /&gt;
&amp;lt;p style=&amp;quot;color: red;&amp;quot;&amp;gt;&amp;lt;strong&amp;gt;Important:&amp;lt;/strong&amp;gt; If the scrum is not updated for 2 consecutive days, the candidate will be &amp;lt;u&amp;gt;automatically dropped&amp;lt;/u&amp;gt; from the program.&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;hr&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;h3&amp;gt;Potential Projects&amp;lt;/h3&amp;gt;&lt;br /&gt;
&amp;lt;ol&amp;gt;&lt;br /&gt;
  &amp;lt;li&amp;gt;BiomarkerKB ([https://biomarkerkb.org biomarkerkb.org]) project: Biomarker curation project. Involves reading papers and collecting biomarkers.&amp;lt;/li&amp;gt;&lt;br /&gt;
  &amp;lt;li&amp;gt;GlyGen ([https://glygen.org glygen.org]) project: Review glycomics and glycoproteomics data and curate tissue, disease, and other related information. &amp;lt;/li&amp;gt;&amp;lt;li&amp;gt;ARGOS ([https://argosdb.org argosdb.org]) project: Analyze genomics data using HIVE to identify reference genome assemblies. &amp;lt;/li&amp;gt;&amp;lt;li&amp;gt;PredictMod ([https://hivelab.biochemistry.gwu.edu/predictmod hivelab.biochemistry.gwu.edu/predictmod]) project: Identify datasets and harmonize them so that they can be used to generate ML models.  &amp;lt;/li&amp;gt;&amp;lt;/ol&amp;gt;&#039;&#039;Note: Individuals involved in the above projects with a background in programming and/or machine learning may also undertake additional tasks to support the development of ML models, which can be integrated into PredictMod or used to enhance AI/ML-ready datasets within GlyGen.&#039;&#039;&amp;lt;hr&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;h4&amp;gt;1. BiomarkerKB Biocuration Project Ideas&amp;lt;/h4&amp;gt;POC: Daniall Masood, Maria Kim&lt;br /&gt;
# Curate biomarkers for a specific disease (Alzheimer&#039;s)&lt;br /&gt;
## The student would perform manual curation for about 4 weeks, with regular check-ins with the project POC to ensure it is being done correctly.&lt;br /&gt;
## The next 4 weeks can be dedicated to developing an LLM or an automated process to extract biomarker details, using the data collected in the first 4 weeks as training or example data.&lt;br /&gt;
# Top 50 biomarkers&lt;br /&gt;
## Curate the top 50 biomarkers for biomarkerkb.org.&lt;br /&gt;
## Define what constitutes a top 50 biomarker.&lt;br /&gt;
## Begin curating biomarkers from different sources and papers by collecting fields mentioned in the data model, as well as collecting cross-references.&lt;br /&gt;
# Biocuration of biomarkers from NLP/LLM work&lt;br /&gt;
## Use the biomarkers collected from NLP work.&lt;br /&gt;
## Curate the biomarkers; the NLP-derived data is not provided in the biomarker data model format.&lt;br /&gt;
## While curating the biomarkers, check if data collected from NLP is correct.&lt;br /&gt;
## After completion, the student can start using curated data to work on the NLP/LLM method.&lt;br /&gt;
# Curate biomarkers for a treatment&lt;br /&gt;
## See #1 above.&lt;br /&gt;
&lt;br /&gt;
If you have any other ideas, diseases, treatments, or methods you want to focus on, please reach out to daniallmasood@gwu.edu to discuss your idea and check whether it is feasible as a project for the summer.&lt;br /&gt;
&lt;br /&gt;
==== 2. GlyGen Biocuration Project Ideas ====&lt;br /&gt;
POC: Rene Ranzinger and Urnisha Bhuiyan&lt;br /&gt;
&lt;br /&gt;
Over the last three decades, numerous glycomics database projects have been initiated to collect valuable information about glycans, proteins, and their interactions. Some of these databases have been discontinued due to the end of project funding. However, the data within these databases remains highly valuable to the community. Integrating these datasets into modern databases or knowledgebases, such as GlyGen, presents a challenge because much of the valuable metadata (e.g., species, tissue, disease, cell line) annotations are free-text terms that do not align with established standard dictionaries and ontologies used in modern resources. Automated matching of this information with dictionaries or ontologies is often not possible due to the use of synonyms, spelling errors, or abbreviations. For example, &amp;quot;human,&amp;quot; &amp;quot;man,&amp;quot; and &amp;quot;h. sapiens&amp;quot; all map to the scientific species name &amp;quot;Homo sapiens.&amp;quot;&lt;br /&gt;
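&lt;br /&gt;
The synonym problem described above can be sketched in a few lines of Python. This is only an illustration: the synonym table and the function names are hypothetical, and a real curation run would look terms up in a taxonomy or ontology service rather than a hand-written dictionary. Terms that cannot be matched automatically are set aside for manual curation.&lt;br /&gt;
&lt;br /&gt;
```python
import re

# Hypothetical synonym table (assumption for illustration only); a real
# curation run would consult a taxonomy or ontology instead.
SPECIES_SYNONYMS = {
    "human": "Homo sapiens",
    "man": "Homo sapiens",
    "h. sapiens": "Homo sapiens",
    "homo sapiens": "Homo sapiens",
}


def normalize_species(raw_term):
    """Return the standard species name, or None if manual curation is needed."""
    # Lowercase, trim, and collapse repeated whitespace before the lookup.
    key = re.sub(r"\s+", " ", raw_term.strip().lower())
    return SPECIES_SYNONYMS.get(key)


def split_for_curation(raw_terms):
    """Separate automatically mapped terms from those left for a curator."""
    mapped, unmapped = {}, []
    for term in raw_terms:
        standard = normalize_species(term)
        if standard is None:
            unmapped.append(term)
        else:
            mapped[term] = standard
    return mapped, unmapped
```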
&lt;br /&gt;
The GlyGen project aims to make datasets from two older databases (CarbBank, CFG) accessible by migrating the data and metadata into our database. For this project, we are seeking curators with a medical or biology background who are interested in helping map metadata terms from these old databases to standard dictionaries and ontologies.&lt;br /&gt;
&lt;br /&gt;
The project involves:&lt;br /&gt;
&lt;br /&gt;
# Using internet resources (e.g., Google, Wikipedia) to identify terms used in the old database.&lt;br /&gt;
# Mapping identified terms to corresponding dictionaries and ontologies using the webpages and search interfaces of these projects.&lt;br /&gt;
# Finding papers based on titles and author lists that may contain spelling errors.&lt;br /&gt;
# Interacting and discussing with other curators in case terms are mapped differently.&lt;br /&gt;
&lt;br /&gt;
If you have any other ideas or methods you would like to focus on, please reach out to rene@ccrc.uga.edu to discuss them.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;3. GlyGen Publication Analysis Project Ideas&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
POC: Rene Ranzinger and Urnisha Bhuiyan&lt;br /&gt;
&lt;br /&gt;
One of the challenges for any bioinformatics project is understanding the size of its community, how well the project serves this community, and how widely its software/database is used. A potential solution is to analyze PubMed publication data. We are seeking applicants with programming skills (in Python or Java) to perform this analysis.&lt;br /&gt;
&lt;br /&gt;
The project involves:&lt;br /&gt;
&lt;br /&gt;
# Using the PubMed web API to filter publications based on keywords.&lt;br /&gt;
# Analyzing paper abstracts to identify research institutions and groups that form the community.&lt;br /&gt;
# Filtering the community list to exclude unrelated co-authors.&lt;br /&gt;
# Prioritizing papers identified by GlycoSiteMiner for curation via TableMaker.&lt;br /&gt;
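&lt;br /&gt;
As a rough sketch of the first step, the Python snippet below builds a keyword query and reads back the matching PubMed IDs. It assumes the PubMed web API in question is the NCBI E-utilities ESearch endpoint; the function names and example keywords are hypothetical, and the project may settle on a different client or query strategy.&lt;br /&gt;
&lt;br /&gt;
```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# NCBI E-utilities search endpoint (assumed to be the PubMed web API meant above).
ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_esearch_url(keywords, max_results=20):
    """Build an ESearch URL that requires every keyword to match."""
    term = " AND ".join(keywords)
    query = urlencode({"db": "pubmed", "term": term,
                       "retmode": "json", "retmax": max_results})
    return ESEARCH_URL + "?" + query


def parse_pmids(response_text):
    """Extract the list of PubMed IDs from an ESearch JSON response."""
    payload = json.loads(response_text)
    return payload.get("esearchresult", {}).get("idlist", [])


def search_pubmed(keywords, max_results=20):
    """Query PubMed and return matching PubMed IDs (needs network access)."""
    with urlopen(build_esearch_url(keywords, max_results)) as response:
        return parse_pmids(response.read().decode("utf-8"))
```

Calling search_pubmed with a keyword list such as GlyGen and glycan would return PubMed IDs whose abstracts can then be fetched and analyzed in the later steps.&lt;br /&gt;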
&lt;br /&gt;
A subproject will involve analyzing the full text of papers (when available) for keywords or resource and database names. The results of the analysis will be discussed with GlyGen project members, who will suggest changes and improvements to the analysis and data presentation. Source code developed as part of this project will be documented and shared in a public GitHub repository. If you have any other ideas or methods you would like to explore, please reach out to rene@ccrc.uga.edu to discuss them.&lt;br /&gt;
&lt;br /&gt;
==== 4. PredictMod Machine Learning Project Ideas ====&lt;br /&gt;
POC: Lori Krammer&lt;br /&gt;
&lt;br /&gt;
Data Identification &amp;amp; Curation: &lt;br /&gt;
&lt;br /&gt;
# Identify publicly-available datasets from scientific literature that can be used for intervention outcome prediction models.&lt;br /&gt;
# Curate indicators of useful ML publications that could be used to train an LLM to recommend relevant publications for cancer modeling.&lt;br /&gt;
&lt;br /&gt;
Modeling &amp;amp; Integration (for those with experience in programming/ML)&lt;br /&gt;
&lt;br /&gt;
# Conduct data harmonization and pre-processing following established project pipelines to produce an ML-ready dataset and data dictionary.&lt;br /&gt;
# Perform model training and document the ML pipeline in a BioCompute Object (BCO).&lt;br /&gt;
# Integrate the model into the PredictMod platform.&lt;br /&gt;
&lt;br /&gt;
Individuals with a background or interest in machine learning should reach out to lorikrammer@gwu.edu with a potential dataset to determine if it is a feasible project for the summer.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;5. FDA-ARGOS Computation and Pathogen Curation Project&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
POC: Christie Woodside, Jonathon Keeney&lt;br /&gt;
&lt;br /&gt;
# Update data tables for more efficient computations&lt;br /&gt;
## Student would review and input additional data and IDs in the tables/sheets used to perform computations. This is manual but important work that requires close attention to detail. ~1 week&#039;s worth of work.&lt;br /&gt;
## Requires Python/shell coding background. Student would run scripts that prepare and format data tables that are pushed to data.argosdb.org. Coding knowledge is needed in case of errors, bugs, or other mishaps in the code. Ongoing work as computations are performed.&lt;br /&gt;
# Curate and report on current pathogens to upload to ARGOS&lt;br /&gt;
## Student would work on manual curation of circulating pathogens to be added to data.argosdb.org, with regular check-ins and reports on what was found. ~4-10 weeks&#039; worth of work.&lt;br /&gt;
## Locate assembly IDs, reads, and metagenomic information for these pathogens to be used in computations and deposited into data.argosdb.org.&lt;br /&gt;
## Provide documentation on why they were curated, why they are important, how they were selected, and how data was collected.&lt;br /&gt;
&lt;br /&gt;
If you have any other ideas or methods you want to focus on, please reach out to christie.woodside@email.gwu.edu to discuss your idea and check whether it is feasible as a project for the summer.&amp;lt;hr&amp;gt;&lt;br /&gt;
&amp;lt;h3&amp;gt;Requirements for Completion&amp;lt;/h3&amp;gt;&lt;br /&gt;
&amp;lt;p&amp;gt;&amp;lt;strong&amp;gt;Note:&amp;lt;/strong&amp;gt; The following are &amp;lt;u&amp;gt;mandatory&amp;lt;/u&amp;gt;. Failure to complete any will result in an incomplete volunteer record.&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;h4&amp;gt;Documentation&amp;lt;/h4&amp;gt;&lt;br /&gt;
&amp;lt;p&amp;gt;All volunteers must maintain adequate documentation of their work, including written protocols and scripts submitted to GitHub.&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;h4&amp;gt;Written Report&amp;lt;/h4&amp;gt;&lt;br /&gt;
&amp;lt;p&amp;gt;Submit a 1–2 page summary of your tasks and accomplishments to the Admin during the final week of your program.&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;h4&amp;gt;Presentation &amp;amp; Slide Submission&amp;lt;/h4&amp;gt;&lt;br /&gt;
&amp;lt;p&amp;gt;Present your work during the last week of the 8-week period.&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;p&amp;gt;Slides must be submitted to the Admin Team and should follow the Symposium Slide Guidelines below.&amp;lt;/p&amp;gt;&lt;br /&gt;
Contact the Admin Team to access previously submitted slides.&lt;br /&gt;
&amp;lt;hr&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Completion Certificate ===&lt;br /&gt;
A certificate of completion and a letter of recommendation will be provided to all participants who successfully complete the program.&lt;br /&gt;
&amp;lt;hr&amp;gt;&lt;br /&gt;
=== Contact ===&lt;br /&gt;
mazumder_lab@gwu.edu.&lt;br /&gt;
&amp;lt;hr&amp;gt;&lt;br /&gt;
=== Volunteers ===&lt;br /&gt;
{| class=&amp;quot;wikitable&amp;quot;&lt;br /&gt;
|+&lt;br /&gt;
|-&lt;br /&gt;
! Name&lt;br /&gt;
!Project&lt;br /&gt;
!Projects Interested&lt;br /&gt;
|-&lt;br /&gt;
| [https://www.linkedin.com/in/gracesjchong/ Grace Chong]&lt;br /&gt;
|PredictMod&lt;br /&gt;
|&lt;br /&gt;
# PredictMod&lt;br /&gt;
# BiomarkerKB Biocuration&lt;br /&gt;
# GlyGen Biocuration&lt;br /&gt;
|-&lt;br /&gt;
|[https://www.linkedin.com/in/alma-ogunsina-4959072b1/ Alma Ogunsina]&lt;br /&gt;
|Biomarker curation&lt;br /&gt;
|&lt;br /&gt;
# BiomarkerKB&lt;br /&gt;
# ARGOS&lt;br /&gt;
# PredictMod&lt;br /&gt;
|-&lt;br /&gt;
|[https://www.linkedin.com/in/diya-kamalabharathy-62557935a/ Diya Kamalabharathy]&lt;br /&gt;
|PredictMod&lt;br /&gt;
|&lt;br /&gt;
# BiomarkerKB Biocuration&lt;br /&gt;
# PredictMod Machine Learning&lt;br /&gt;
# GlyGen Biocuration&lt;br /&gt;
|-&lt;br /&gt;
|[https://www.linkedin.com/in/harivinay-prasad-reddy-gujjula-a06ba71bb/ Harivinay P. Gujjula]&lt;br /&gt;
|GlyGen curation&lt;br /&gt;
|&lt;br /&gt;
# GlyGen Biocuration&lt;br /&gt;
# BiomarkerKB Biocuration&lt;br /&gt;
|-&lt;br /&gt;
|[https://www.linkedin.com/in/miao-wang-88b602290/ Miao Wang]&lt;br /&gt;
|ARGOS&lt;br /&gt;
|&lt;br /&gt;
# BiomarkerKB Biocuration Project Ideas&lt;br /&gt;
# FDA-ARGOS Computation and Pathogen Curation Project&lt;br /&gt;
# PredictMod Machine Learning Project Ideas&lt;br /&gt;
|-&lt;br /&gt;
|[https://www.linkedin.com/in/nahom-gebreselassie-1545ab336/ Nahom Abel]&lt;br /&gt;
|GlyGen curation&lt;br /&gt;
|&lt;br /&gt;
# BiomarkerKB Biocuration&lt;br /&gt;
# GlyGen Biocuration&lt;br /&gt;
# PredictMod&lt;br /&gt;
|-&lt;br /&gt;
|[https://www.linkedin.com/in/kajal-patel-cs/ Kajal Sanjaykumar Patel]&lt;br /&gt;
|GlyGen and PubMed project&lt;br /&gt;
|&lt;br /&gt;
#PredictMod&lt;br /&gt;
#BiomarkerKB&lt;br /&gt;
#GlyGen&lt;br /&gt;
|-&lt;br /&gt;
|[https://www.linkedin.com/in/john-mccaffrey-b8850930a/ John McCaffrey]&lt;br /&gt;
|Biomarker curation&lt;br /&gt;
|&lt;br /&gt;
# PredictMod&lt;br /&gt;
# BiomarkerKB&lt;br /&gt;
# GlyGen Biocuration &lt;br /&gt;
|-&lt;br /&gt;
|[https://www.linkedin.com/in/nathan-ressom/ Nathan Ressom]&lt;br /&gt;
|BiomarkerKB&lt;br /&gt;
|&lt;br /&gt;
# PredictMod &lt;br /&gt;
# GlyGen Biocuration&lt;br /&gt;
# BiomarkerKB Biocuration&lt;br /&gt;
|-&lt;br /&gt;
|[https://www.linkedin.com/in/aaron-ressom/ Aaron Ressom] &lt;br /&gt;
|PredictMod&lt;br /&gt;
|&lt;br /&gt;
# BiomarkerKB &lt;br /&gt;
# PredictMod &lt;br /&gt;
# GlyGen&lt;br /&gt;
|-&lt;br /&gt;
|[https://www.linkedin.com/in/akale-kinfe/ Akale Kinfe]&lt;br /&gt;
|Biomarker curation&lt;br /&gt;
|&lt;br /&gt;
# BiomarkerKB Biocuration&lt;br /&gt;
# GlyGen Biocuration&lt;br /&gt;
# ARGOS&lt;br /&gt;
|-&lt;br /&gt;
|[https://www.linkedin.com/in/aise-arpinar-a8bb9b373/?original_referer= Aise Arpinar]&lt;br /&gt;
|GlyGen curation&lt;br /&gt;
|&lt;br /&gt;
# GlyGen Biocuration&lt;br /&gt;
# BiomarkerKB Biocuration&lt;br /&gt;
# GlyGen Publication Analysis&lt;br /&gt;
|-&lt;br /&gt;
|[https://www.linkedin.com/in/piyush-pandey-906b582b5/ Piyush Pandey]&lt;br /&gt;
|Biomarker curation&lt;br /&gt;
|&lt;br /&gt;
# BiomarkerKB Biocuration &lt;br /&gt;
# PredictMod &lt;br /&gt;
# GlyGen Biocuration &lt;br /&gt;
|-&lt;br /&gt;
|[http://www.linkedin.com/in/filmawit-zeru-203272363 Filmawit Zeru]&lt;br /&gt;
|GlycoSiteMiner project&lt;br /&gt;
|&lt;br /&gt;
# BiomarkerKB&lt;br /&gt;
# GlyGen&lt;br /&gt;
# ARGOS&lt;br /&gt;
|-&lt;br /&gt;
|[https://www.linkedin.com/in/mathias-belay-03b51a2a3/ Mathias Belay]&lt;br /&gt;
|Biomarker curation&lt;br /&gt;
|&lt;br /&gt;
# GlyGen&lt;br /&gt;
# PredictMod&lt;br /&gt;
# BiomarkerKB&lt;br /&gt;
|-&lt;br /&gt;
|[https://www.linkedin.com/in/isaac-kim-b644bb231/ Isaac Kim]&lt;br /&gt;
|Biomarker curation&lt;br /&gt;
|&lt;br /&gt;
# BiomarkerKB&lt;br /&gt;
# PredictMod&lt;br /&gt;
# GlyGen&lt;br /&gt;
|-&lt;br /&gt;
|Sohana Bahl&lt;br /&gt;
|Biomarker curation&lt;br /&gt;
|&lt;br /&gt;
# BiomarkerKB&lt;br /&gt;
|-&lt;br /&gt;
|[https://www.linkedin.com/in/ana-vohralikova-794a4433a?utm_source=share&amp;amp;utm_campaign=share_via&amp;amp;utm_content=profile&amp;amp;utm_medium=ios_app Ana Vohralikova]&lt;br /&gt;
|Biomarker curation&lt;br /&gt;
|&lt;br /&gt;
# BiomarkerKB Biocuration Project&lt;br /&gt;
# GlyGen Biocuration Project&lt;br /&gt;
# FDA-ARGOS Computation and Pathogen&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
&amp;lt;hr&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Symposium Slide Guidelines ===&lt;br /&gt;
&#039;&#039;&#039;Content Clarity&#039;&#039;&#039; &lt;br /&gt;
&lt;br /&gt;
   •       &#039;&#039;&#039;Keep It Simple:&#039;&#039;&#039; Use concise bullet points instead of long paragraphs. Aim for no more than 6 bullet points per slide. &lt;br /&gt;
&lt;br /&gt;
  •        &#039;&#039;&#039;Focus on Key Points:&#039;&#039;&#039; Highlight the main ideas or data you want your audience to remember. &lt;br /&gt;
&lt;br /&gt;
  •        &#039;&#039;&#039;Consistent Layout:&#039;&#039;&#039; Use a consistent layout for each slide, including fonts, colors, and background. This helps maintain a professional look. &lt;br /&gt;
&lt;br /&gt;
  •        &#039;&#039;&#039;High-Quality Images:&#039;&#039;&#039; Use high-resolution images and graphics to illustrate your points. Avoid using clip art. &lt;br /&gt;
&lt;br /&gt;
  •        &#039;&#039;&#039;Readable Fonts:&#039;&#039;&#039; Use easy-to-read fonts (e.g., Arial, Calibri) and ensure font sizes are large enough to be seen from a distance (24 pt or larger for main text). &lt;br /&gt;
&lt;br /&gt;
  •        &#039;&#039;&#039;Contrast:&#039;&#039;&#039; Ensure there is high contrast between text and background (e.g., dark text on a light background). &lt;br /&gt;
&lt;br /&gt;
  •        &#039;&#039;&#039;Citation:&#039;&#039;&#039; Cite publications that support the information presented, using a proper citation format. &lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Outline for Symposium presentation&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
1.       Introduction: &lt;br /&gt;
&lt;br /&gt;
2.       Project Descriptions: &lt;br /&gt;
&lt;br /&gt;
3.       Objectives and Goals: &lt;br /&gt;
&lt;br /&gt;
4.       Methods, Results, Achievements and Contributions: &lt;br /&gt;
&lt;br /&gt;
5.       Future Plans: &lt;br /&gt;
&lt;br /&gt;
6.       Skills and Knowledge Gained: &lt;br /&gt;
&lt;br /&gt;
7.       Acknowledgments: &lt;br /&gt;
&lt;br /&gt;
8.       Q&amp;amp;A Session: &lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Outline&#039;&#039;&#039; &lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;1. Introduction:&#039;&#039;&#039;  (1 slide)&lt;br /&gt;
&lt;br /&gt;
  - Briefly introduce yourself.  &lt;br /&gt;
&lt;br /&gt;
  - Add your picture and name on the introduction slide. For a group project, add the group picture.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;2. Project Descriptions:&#039;&#039;&#039;  (1 slide)&lt;br /&gt;
&lt;br /&gt;
  - Provide context and background information about the project.  &lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;3. Project Objectives and Goals:&#039;&#039;&#039;  (1 slide)&lt;br /&gt;
&lt;br /&gt;
  - Describe the main objectives of the project or initiative.  &lt;br /&gt;
&lt;br /&gt;
  - Discuss any additional goals or desired outcomes.  &lt;br /&gt;
&lt;br /&gt;
  - Explain why these objectives and goals are important.  &lt;br /&gt;
&lt;br /&gt;
  &lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;4. Methods, Results, Achievements and Contributions:&#039;&#039;&#039;  &lt;br /&gt;
&lt;br /&gt;
  -  Highlight the methods/tools used in the project.  &lt;br /&gt;
&lt;br /&gt;
  - Highlight the key results and outcomes of the project.  &lt;br /&gt;
&lt;br /&gt;
  - Discuss the most significant achievements and milestones reached.  &lt;br /&gt;
&lt;br /&gt;
  - Explain how each team member contributed to the project (for group projects). &lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;5. Future Plans&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
  - Next steps or future plans for the project&lt;br /&gt;
&lt;br /&gt;
  &lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;6. Skills and Knowledge Gained:&#039;&#039;&#039;  (1 slide)&lt;br /&gt;
&lt;br /&gt;
  -   Detail any technical skills acquired or improved.  &lt;br /&gt;
&lt;br /&gt;
  - Highlight any soft skills, such as communication or teamwork, that were developed.  &lt;br /&gt;
&lt;br /&gt;
  - Discuss new knowledge gained in specific areas or subjects.  &lt;br /&gt;
&lt;br /&gt;
  -  Share any personal reflections on the experience and what was learned.  &lt;br /&gt;
&lt;br /&gt;
  - Discuss any challenges or obstacles encountered and how they were overcome.  &lt;br /&gt;
&lt;br /&gt;
  - Provide key insights or lessons learned from the project.  &lt;br /&gt;
&lt;br /&gt;
  &lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;7. Acknowledgments:&#039;&#039;&#039;  (1 slide)&lt;br /&gt;
&lt;br /&gt;
  - Acknowledge the contributions of team members and collaborators.  &lt;br /&gt;
&lt;br /&gt;
  - Recognize the guidance and support of mentors and advisors.  &lt;br /&gt;
&lt;br /&gt;
  - Acknowledge the project funding (e.g., CFDE).&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;8. Q&amp;amp;A Session:&#039;&#039;&#039;  &lt;br /&gt;
&lt;br /&gt;
  - Invite the audience to ask questions and engage in discussion.  &lt;br /&gt;
&lt;br /&gt;
  - Provide clear and thoughtful responses to audience questions.  &lt;br /&gt;
&lt;br /&gt;
  - Offer closing remarks and thank the audience for their participation. &lt;br /&gt;
&lt;br /&gt;
Note: If you have limited presentation time, you can merge a few topics into one.&lt;/div&gt;</summary>
		<author><name>Jeetvora</name></author>
	</entry>
	<entry>
		<id>https://hivelab.biochemistry.gwu.edu/wiki/index.php?title=Volunteership_2025&amp;diff=869</id>
		<title>Volunteership 2025</title>
		<link rel="alternate" type="text/html" href="https://hivelab.biochemistry.gwu.edu/wiki/index.php?title=Volunteership_2025&amp;diff=869"/>
		<updated>2025-07-16T20:13:34Z</updated>

		<summary type="html">&lt;p&gt;Jeetvora: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;h2&amp;gt;2025 Volunteer Program Details&amp;lt;/h2&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;h3&amp;gt;Dates&amp;lt;/h3&amp;gt;&lt;br /&gt;
&amp;lt;strong&amp;gt;Volunteer Zoom Kick-Off Meeting&amp;lt;/strong&amp;gt;&amp;lt;br&amp;gt;&lt;br /&gt;
May 27, 2025 | 3:30 to 4:30 PM&lt;br /&gt;
&lt;br /&gt;
&amp;lt;strong&amp;gt;Program Dates: June 2nd, 2025 – July 25th, 2025&amp;lt;/strong&amp;gt; (8 weeks)&amp;lt;br&amp;gt;&lt;br /&gt;
Monday to Friday | Remote | No breaks&lt;br /&gt;
&lt;br /&gt;
&amp;lt;hr&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;h3&amp;gt;Volunteer Expectations&amp;lt;/h3&amp;gt;&lt;br /&gt;
&amp;lt;ol&amp;gt;&lt;br /&gt;
  &amp;lt;li&amp;gt;Daily progress updates via Slack (scrum).&amp;lt;/li&amp;gt;&lt;br /&gt;
  &amp;lt;li&amp;gt;Regular Zoom meetings with the assigned project point of contact.&amp;lt;/li&amp;gt;&amp;lt;li&amp;gt;Expected to dedicate 5–6 hours per day to project work, with the remaining time focused on skill development or reading. &amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ol&amp;gt;&lt;br /&gt;
&amp;lt;p style=&amp;quot;color: red;&amp;quot;&amp;gt;&amp;lt;strong&amp;gt;Important:&amp;lt;/strong&amp;gt; If the scrum is not updated for 2 consecutive days, the candidate will be &amp;lt;u&amp;gt;automatically dropped&amp;lt;/u&amp;gt; from the program.&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;hr&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;h3&amp;gt;Potential Projects&amp;lt;/h3&amp;gt;&lt;br /&gt;
&amp;lt;ol&amp;gt;&lt;br /&gt;
  &amp;lt;li&amp;gt;BiomarkerKB ([https://biomarkerkb.org biomarkerkb.org]) project: Biomarker curation project. Involves reading papers and collecting biomarkers.&amp;lt;/li&amp;gt;&lt;br /&gt;
  &amp;lt;li&amp;gt;GlyGen ([https://glygen.org glygen.org]) project: Review glycomics and glycoproteomics data and curate tissue, disease, and other related information. &amp;lt;/li&amp;gt;&amp;lt;li&amp;gt;ARGOS ([https://argosdb.org argosdb.org]) project: Analyze genomics data using HIVE to identify reference genome assemblies. &amp;lt;/li&amp;gt;&amp;lt;li&amp;gt;PredictMod ([https://hivelab.biochemistry.gwu.edu/predictmod hivelab.biochemistry.gwu.edu/predictmod]) project. Identifying datasets and harmonizing them so that they can be used to generate ML models.  &amp;lt;/li&amp;gt;&amp;lt;/ol&amp;gt;&#039;&#039;Note: Individuals involved in the above projects with a background in programming and/or machine learning may also undertake additional tasks to support the development of ML models, which can be integrated into PredictMod or used to enhance AI/ML-ready datasets within GlyGen.&#039;&#039;&amp;lt;hr&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;h4&amp;gt;1. BiomarkerKB Biocuration Project Ideas&amp;lt;/h4&amp;gt;POC: Daniall Masood, Maria Kim&lt;br /&gt;
# Curate biomarkers for a specific disease (Alzheimer&#039;s disease)&lt;br /&gt;
## The student would do manual curation for about 4 weeks, with regular check-ins with the project POC to ensure it is being done correctly.&lt;br /&gt;
## The next 4 weeks can be dedicated to developing an LLM-based or other automated process to extract biomarker details, using the data collected in the first 4 weeks as training or example data.&lt;br /&gt;
# Top 50 biomarkers&lt;br /&gt;
## Curate the top 50 biomarkers for biomarkerkb.org.&lt;br /&gt;
## Define what constitutes a top 50 biomarker.&lt;br /&gt;
## Begin curating biomarkers from different sources and papers by collecting fields mentioned in the data model, as well as collecting cross-references.&lt;br /&gt;
# Biocuration of biomarkers from NLP/LLM work&lt;br /&gt;
## Use the biomarkers collected from NLP work.&lt;br /&gt;
## Curate the biomarkers; the data provided does not follow the biomarker data model.&lt;br /&gt;
## While curating the biomarkers, check if data collected from NLP is correct.&lt;br /&gt;
## After completion, the student can start using curated data to work on the NLP/LLM method.&lt;br /&gt;
# Curate biomarkers for a treatment&lt;br /&gt;
## See #1 above.&lt;br /&gt;
&lt;br /&gt;
If you have other ideas, diseases, treatments, or methods you want to focus on, please reach out to daniallmasood@gwu.edu to discuss them and check whether they are feasible as a summer project.&lt;br /&gt;
&lt;br /&gt;
==== 2. GlyGen Biocuration Project Ideas ====&lt;br /&gt;
POC: Rene Ranzinger and Urnisha Bhuiyan&lt;br /&gt;
&lt;br /&gt;
Over the last three decades, numerous glycomics database projects have been initiated to collect valuable information about glycans, proteins, and their interactions. Some of these databases have been discontinued due to the end of project funding. However, the data within these databases remains highly valuable to the community. Integrating these datasets into modern databases or knowledgebases, such as GlyGen, presents a challenge because many of the valuable metadata annotations (e.g., species, tissue, disease, cell line) are free-text terms that do not align with the standard dictionaries and ontologies used in modern resources. Automated matching of this information with dictionaries or ontologies is often not possible due to the use of synonyms, spelling errors, or abbreviations. For example, &amp;quot;human,&amp;quot; &amp;quot;man,&amp;quot; and &amp;quot;h. sapiens&amp;quot; all map to the scientific species name &amp;quot;Homo sapiens.&amp;quot;&lt;br /&gt;
&lt;br /&gt;
The GlyGen project aims to make datasets from two older databases (CarbBank, CFG) accessible by migrating the data and metadata into our database. For this project, we are seeking curators with a medical or biology background who are interested in helping map metadata terms from these old databases to standard dictionaries and ontologies.&lt;br /&gt;
&lt;br /&gt;
The project involves:&lt;br /&gt;
&lt;br /&gt;
# Using internet resources (e.g., Google, Wikipedia) to identify terms used in the old database.&lt;br /&gt;
# Mapping identified terms to corresponding dictionaries and ontologies using the webpages and search interfaces of these projects.&lt;br /&gt;
# Finding papers based on titles and author lists that may contain spelling errors.&lt;br /&gt;
# Interacting and discussing with other curators in case terms are mapped differently.&lt;br /&gt;
&lt;br /&gt;
If you have any other ideas or methods you would like to focus on, please reach out to rene@ccrc.uga.edu to discuss them.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;3. GlyGen Publication Analysis Project Ideas&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
POC: Rene Ranzinger and Urnisha Bhuiyan&lt;br /&gt;
&lt;br /&gt;
One of the challenges for any bioinformatics project is understanding the size of its community, how well the project serves this community, and how widely its software/database is used. A potential solution is to analyze PubMed publication data. We are seeking applicants with programming skills (in Python or Java) to perform this analysis.&lt;br /&gt;
&lt;br /&gt;
The project involves:&lt;br /&gt;
&lt;br /&gt;
# Using the PubMed web API to filter publications based on keywords.&lt;br /&gt;
# Analyzing paper abstracts to identify research institutions and groups that form the community.&lt;br /&gt;
# Filtering the community list to exclude unrelated co-authors.&lt;br /&gt;
# Prioritizing papers identified by GlycoSiteMiner for curation via TableMaker.&lt;br /&gt;
&lt;br /&gt;
A subproject will involve analyzing the full text of papers (when available) for keywords or resource and database names. The results of the analysis will be discussed with a GlyGen project member, who will suggest changes and improvements to the analysis and data presentation. Source code developed as part of this project will be documented and shared in a public GitHub repository. If you have any other ideas or methods you would like to explore, please reach out to rene@ccrc.uga.edu to discuss them.&lt;br /&gt;
&lt;br /&gt;
==== 4. PredictMod Machine Learning Project Ideas ====&lt;br /&gt;
POC: Lori Krammer&lt;br /&gt;
&lt;br /&gt;
Data Identification &amp;amp; Curation: &lt;br /&gt;
&lt;br /&gt;
# Identify publicly-available datasets from scientific literature that can be used for intervention outcome prediction models.&lt;br /&gt;
# Curate indicators of useful ML publications that could be used to train an LLM to recommend relevant publications for cancer modeling.&lt;br /&gt;
&lt;br /&gt;
Modeling &amp;amp; Integration (for those with experience in programming/ML)&lt;br /&gt;
&lt;br /&gt;
# Conduct data harmonization and pre-processing following established project pipelines to produce an ML-ready dataset and data dictionary.&lt;br /&gt;
# Perform model training and document the ML pipeline in a BioCompute Object (BCO).&lt;br /&gt;
# Integrate the model into the PredictMod platform.&lt;br /&gt;
&lt;br /&gt;
Individuals with a background or interest in machine learning should reach out to lorikrammer@gwu.edu with a potential dataset to determine if it is a feasible project for the summer.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;5. FDA-ARGOS Computation and Pathogen Curation Project&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
POC: Christie Woodside, Jonathon Keeney&lt;br /&gt;
&lt;br /&gt;
# Update data tables for more efficient computations&lt;br /&gt;
## Student would review and input additional data and IDs in the tables/sheets used to perform computations. This is manual but important work that requires close attention to detail. About 1 week&#039;s worth of work.&lt;br /&gt;
## Requires Python/shell coding background. Student would run scripts that prepare and format data tables that are pushed to data.argosdb.org. Coding knowledge is needed in case of errors, bugs, or other mishaps in the code. Ongoing work as computations are performed.&lt;br /&gt;
# Curate and report on current pathogens to upload to ARGOS&lt;br /&gt;
## Student would manually curate circulating pathogens to be added to data.argosdb.org, with regular check-ins and reports on what was found. About 4-10 weeks&#039; worth of work.&lt;br /&gt;
## Locate assembly IDs, reads, and metagenomic information for these pathogens to be used in computations and deposited into data.argosdb.org.&lt;br /&gt;
## Provide documentation on why they were curated, why they are important, how they were selected, and how data was collected.&lt;br /&gt;
&lt;br /&gt;
If you have other ideas or methods you want to focus on, please reach out to christie.woodside@email.gwu.edu to discuss them and check whether they are feasible as a summer project.&amp;lt;hr&amp;gt;&lt;br /&gt;
&amp;lt;h3&amp;gt;Requirements for Completion&amp;lt;/h3&amp;gt;&lt;br /&gt;
&amp;lt;p&amp;gt;&amp;lt;strong&amp;gt;Note:&amp;lt;/strong&amp;gt; The following are &amp;lt;u&amp;gt;mandatory&amp;lt;/u&amp;gt;. Failure to complete any will result in an incomplete volunteer record.&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;h4&amp;gt;Documentation&amp;lt;/h4&amp;gt;&lt;br /&gt;
&amp;lt;p&amp;gt;All volunteers must maintain adequate documentation of their work, including written protocols and scripts submitted to GitHub.&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;h4&amp;gt;Written Report&amp;lt;/h4&amp;gt;&lt;br /&gt;
&amp;lt;p&amp;gt;Submit a 1–2 page summary of your tasks and accomplishments to the Admin during the final week of your program.&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;h4&amp;gt;Presentation &amp;amp; Slide Submission&amp;lt;/h4&amp;gt;&lt;br /&gt;
&amp;lt;p&amp;gt;Present your work during the last week of the 8-week period.&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;p&amp;gt;Slides must be submitted to the Admin Team and should include:&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
  &amp;lt;li&amp;gt;A title slide with your name, date, and mentor&amp;lt;/li&amp;gt;&lt;br /&gt;
  &amp;lt;li&amp;gt;At least 3 content slides&amp;lt;/li&amp;gt;&lt;br /&gt;
  &amp;lt;li&amp;gt;A final slide with acknowledgements or references&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
Contact the Admin Team to access previously submitted slides.&lt;br /&gt;
&amp;lt;hr&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Completion Certificate ===&lt;br /&gt;
A certificate of completion and a letter of recommendation will be provided to all participants who successfully complete the program.&lt;br /&gt;
&amp;lt;hr&amp;gt;&lt;br /&gt;
=== Contact ===&lt;br /&gt;
mazumder_lab@gwu.edu.&lt;br /&gt;
&amp;lt;hr&amp;gt;&lt;br /&gt;
=== Volunteers ===&lt;br /&gt;
{| class=&amp;quot;wikitable&amp;quot;&lt;br /&gt;
|+&lt;br /&gt;
|-&lt;br /&gt;
! Name&lt;br /&gt;
!Project&lt;br /&gt;
!Projects Interested&lt;br /&gt;
|-&lt;br /&gt;
| [https://www.linkedin.com/in/gracesjchong/ Grace Chong]&lt;br /&gt;
|PredictMod&lt;br /&gt;
|&lt;br /&gt;
# PredictMod&lt;br /&gt;
# BiomarkerKB Biocuration&lt;br /&gt;
# GlyGen Biocuration&lt;br /&gt;
|-&lt;br /&gt;
|[https://www.linkedin.com/in/alma-ogunsina-4959072b1/ Alma Ogunsina]&lt;br /&gt;
|Biomarker curation&lt;br /&gt;
|&lt;br /&gt;
# BiomarkerKB&lt;br /&gt;
# ARGOS&lt;br /&gt;
# PredictMod&lt;br /&gt;
|-&lt;br /&gt;
|[https://www.linkedin.com/in/diya-kamalabharathy-62557935a/ Diya Kamalabharathy]&lt;br /&gt;
|PredictMod&lt;br /&gt;
|&lt;br /&gt;
# BiomarkerKB Biocuration&lt;br /&gt;
# PredictMod Machine Learning&lt;br /&gt;
# GlyGen Biocuration&lt;br /&gt;
|-&lt;br /&gt;
|[https://www.linkedin.com/in/harivinay-prasad-reddy-gujjula-a06ba71bb/ Harivinay P. Gujjula]&lt;br /&gt;
|GlyGen curation&lt;br /&gt;
|&lt;br /&gt;
# GlyGen Biocuration&lt;br /&gt;
# BiomarkerKB Biocuration&lt;br /&gt;
|-&lt;br /&gt;
|[https://www.linkedin.com/in/miao-wang-88b602290/Miao&amp;amp;#x20;Wang Miao Wang]&lt;br /&gt;
|ARGOS&lt;br /&gt;
|&lt;br /&gt;
# BiomarkerKB Biocuration Project Ideas&lt;br /&gt;
# FDA-ARGOS Computation and Pathogen Curation Project&lt;br /&gt;
# PredictMod Machine Learning Project Ideas&lt;br /&gt;
|-&lt;br /&gt;
|[https://www.linkedin.com/in/nahom-gebreselassie-1545ab336/ Nahom Abel]&lt;br /&gt;
|GlyGen curation&lt;br /&gt;
|&lt;br /&gt;
# BiomarkerKB Biocuration&lt;br /&gt;
# GlyGen Biocuration&lt;br /&gt;
# PredictMod&lt;br /&gt;
|-&lt;br /&gt;
|[https://www.linkedin.com/in/kajal-patel-cs/ Kajal Sanjaykumar Patel]&lt;br /&gt;
|GlyGen and PubMed project&lt;br /&gt;
|&lt;br /&gt;
#PredictMod&lt;br /&gt;
#BiomarkerKB&lt;br /&gt;
#GlyGen&lt;br /&gt;
|-&lt;br /&gt;
|[https://www.linkedin.com/in/john-mccaffrey-b8850930a/ John McCaffrey]&lt;br /&gt;
|Biomarker curation&lt;br /&gt;
|&lt;br /&gt;
# PredictMod&lt;br /&gt;
# BiomarkerKB&lt;br /&gt;
# GlyGen Biocuration &lt;br /&gt;
|-&lt;br /&gt;
|[https://www.linkedin.com/in/nathan-ressom/ Nathan Ressom]&lt;br /&gt;
|BiomarkerKB&lt;br /&gt;
|&lt;br /&gt;
# PredictMod &lt;br /&gt;
# GlyGen Biocuration&lt;br /&gt;
# BiomarkerKB Biocuration&lt;br /&gt;
|-&lt;br /&gt;
|[https://www.linkedin.com/in/aaron-ressom/ Aaron Ressom] &lt;br /&gt;
|PredictMod&lt;br /&gt;
|&lt;br /&gt;
# BiomarkerKB &lt;br /&gt;
# PredictMod &lt;br /&gt;
# GlyGen&lt;br /&gt;
|-&lt;br /&gt;
|[https://www.linkedin.com/in/akale-kinfe/ Akale Kinfe]&lt;br /&gt;
|Biomarker curation&lt;br /&gt;
|&lt;br /&gt;
# BiomarkerKB Biocuration&lt;br /&gt;
# GlyGen Biocuration&lt;br /&gt;
# ARGOS&lt;br /&gt;
|-&lt;br /&gt;
|[https://www.linkedin.com/in/aise-arpinar-a8bb9b373/?original_referer= Aise Arpinar]&lt;br /&gt;
|GlyGen curation&lt;br /&gt;
|&lt;br /&gt;
# GlyGen Biocuration&lt;br /&gt;
# BiomarkerKB Biocuration&lt;br /&gt;
# GlyGen Publication Analysis&lt;br /&gt;
|-&lt;br /&gt;
|[https://www.linkedin.com/in/piyush-pandey-906b582b5/ Piyush Pandey]&lt;br /&gt;
|Biomarker curation&lt;br /&gt;
|&lt;br /&gt;
# BiomarkerKB Biocuration &lt;br /&gt;
# PredictMod &lt;br /&gt;
# GlyGen Biocuration &lt;br /&gt;
|-&lt;br /&gt;
|[http://www.linkedin.com/in/filmawit-zeru-203272363 Filmawit Zeru]&lt;br /&gt;
|GlycoSiteMiner project&lt;br /&gt;
|&lt;br /&gt;
# BiomarkerKB&lt;br /&gt;
# GlyGen&lt;br /&gt;
# ARGOS&lt;br /&gt;
|-&lt;br /&gt;
|[https://www.linkedin.com/in/mathias-belay-03b51a2a3/ Mathias Belay]&lt;br /&gt;
|Biomarker curation&lt;br /&gt;
|&lt;br /&gt;
# GlyGen&lt;br /&gt;
# PredictMod&lt;br /&gt;
# BiomarkerKB&lt;br /&gt;
|-&lt;br /&gt;
|[https://www.linkedin.com/in/isaac-kim-b644bb231/ Isaac Kim]&lt;br /&gt;
|Biomarker curation&lt;br /&gt;
|&lt;br /&gt;
# BiomarkerKB&lt;br /&gt;
# PredictMod&lt;br /&gt;
# GlyGen&lt;br /&gt;
|-&lt;br /&gt;
|Sohana Bahl&lt;br /&gt;
|Biomarker curation&lt;br /&gt;
|&lt;br /&gt;
# BiomarkerKB&lt;br /&gt;
|-&lt;br /&gt;
|[https://www.linkedin.com/in/ana-vohralikova-794a4433a?utm_source=share&amp;amp;utm_campaign=share_via&amp;amp;utm_content=profile&amp;amp;utm_medium=ios_app Ana Vohralikova]&lt;br /&gt;
|Biomarker curation&lt;br /&gt;
|&lt;br /&gt;
# BiomarkerKB Biocuration Project&lt;br /&gt;
# GlyGen Biocuration Project&lt;br /&gt;
# FDA-ARGOS Computation and Pathogen&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
&amp;lt;hr&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Symposium Slide Guidelines ===&lt;br /&gt;
&#039;&#039;&#039;Content Clarity&#039;&#039;&#039; &lt;br /&gt;
&lt;br /&gt;
   •       &#039;&#039;&#039;Keep It Simple:&#039;&#039;&#039; Use concise bullet points instead of long paragraphs. Aim for no more than 6 bullet points per slide. &lt;br /&gt;
&lt;br /&gt;
  •        &#039;&#039;&#039;Focus on Key Points:&#039;&#039;&#039; Highlight the main ideas or data you want your audience to remember. &lt;br /&gt;
&lt;br /&gt;
  •        &#039;&#039;&#039;Consistent Layout:&#039;&#039;&#039; Use a consistent layout for each slide, including fonts, colors, and background. This helps maintain a professional look. &lt;br /&gt;
&lt;br /&gt;
  •        &#039;&#039;&#039;High-Quality Images:&#039;&#039;&#039; Use high-resolution images and graphics to illustrate your points. Avoid using clip art. &lt;br /&gt;
&lt;br /&gt;
  •        &#039;&#039;&#039;Readable Fonts:&#039;&#039;&#039; Use easy-to-read fonts (e.g., Arial, Calibri) and ensure font sizes are large enough to be seen from a distance (24 pt or larger for main text). &lt;br /&gt;
&lt;br /&gt;
  •        &#039;&#039;&#039;Contrast:&#039;&#039;&#039; Ensure there is high contrast between text and background (e.g., dark text on a light background). &lt;br /&gt;
&lt;br /&gt;
  •        &#039;&#039;&#039;Citation:&#039;&#039;&#039; Cite publications that support the information presented, using a proper citation format. &lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Outline for Symposium presentation&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
1.       Introduction: &lt;br /&gt;
&lt;br /&gt;
2.       Project Descriptions: &lt;br /&gt;
&lt;br /&gt;
3.       Objectives and Goals: &lt;br /&gt;
&lt;br /&gt;
4.       Methods, Results, Achievements and Contributions: &lt;br /&gt;
&lt;br /&gt;
5.       Future Plans: &lt;br /&gt;
&lt;br /&gt;
6.       Skills and Knowledge Gained: &lt;br /&gt;
&lt;br /&gt;
7.       Acknowledgments: &lt;br /&gt;
&lt;br /&gt;
8.       Q&amp;amp;A Session: &lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Outline&#039;&#039;&#039; &lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;1. Introduction:&#039;&#039;&#039;  (1 slide)&lt;br /&gt;
&lt;br /&gt;
  - Briefly introduce yourself.  &lt;br /&gt;
&lt;br /&gt;
  - Add your picture and name on the introduction slide. For a group project, add the group picture.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;2. Project Descriptions:&#039;&#039;&#039;  (1 slide)&lt;br /&gt;
&lt;br /&gt;
  - Provide context and background information about the project.  &lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;3. Project Objectives and Goals:&#039;&#039;&#039;  (1 slide)&lt;br /&gt;
&lt;br /&gt;
  - Describe the main objectives of the project or initiative.  &lt;br /&gt;
&lt;br /&gt;
  - Discuss any additional goals or desired outcomes.  &lt;br /&gt;
&lt;br /&gt;
  - Explain why these objectives and goals are important.  &lt;br /&gt;
&lt;br /&gt;
  &lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;4. Methods, Results, Achievements and Contributions:&#039;&#039;&#039;  &lt;br /&gt;
&lt;br /&gt;
  -  Highlight the methods/tools used in the project.  &lt;br /&gt;
&lt;br /&gt;
  - Highlight the key results and outcomes of the project.  &lt;br /&gt;
&lt;br /&gt;
  - Discuss the most significant achievements and milestones reached.  &lt;br /&gt;
&lt;br /&gt;
  - Explain how each team member contributed to the project (for group projects). &lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;5. Future Plans&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
  - Next steps or future plans for the project&lt;br /&gt;
&lt;br /&gt;
  &lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;6. Skills and Knowledge Gained:&#039;&#039;&#039;  (1 slide)&lt;br /&gt;
&lt;br /&gt;
  -   Detail any technical skills acquired or improved.  &lt;br /&gt;
&lt;br /&gt;
  - Highlight any soft skills, such as communication or teamwork, that were developed.  &lt;br /&gt;
&lt;br /&gt;
  - Discuss new knowledge gained in specific areas or subjects.  &lt;br /&gt;
&lt;br /&gt;
  -  Share any personal reflections on the experience and what was learned.  &lt;br /&gt;
&lt;br /&gt;
  - Discuss any challenges or obstacles encountered and how they were overcome.  &lt;br /&gt;
&lt;br /&gt;
  - Provide key insights or lessons learned from the project.  &lt;br /&gt;
&lt;br /&gt;
  &lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;7. Acknowledgments:&#039;&#039;&#039;  (1 slide)&lt;br /&gt;
&lt;br /&gt;
  - Acknowledge the contributions of team members and collaborators.  &lt;br /&gt;
&lt;br /&gt;
  - Recognize the guidance and support of mentors and advisors.  &lt;br /&gt;
&lt;br /&gt;
  - Acknowledge the project funding (e.g., CFDE).&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;8. Q&amp;amp;A Session:&#039;&#039;&#039;  &lt;br /&gt;
&lt;br /&gt;
  - Invite the audience to ask questions and engage in discussion.  &lt;br /&gt;
&lt;br /&gt;
  - Provide clear and thoughtful responses to audience questions.  &lt;br /&gt;
&lt;br /&gt;
  - Offer closing remarks and thank the audience for their participation. &lt;br /&gt;
&lt;br /&gt;
Note: If you have limited presentation time, you can merge a few topics into one.&lt;/div&gt;</summary>
		<author><name>Jeetvora</name></author>
	</entry>
	<entry>
		<id>https://hivelab.biochemistry.gwu.edu/wiki/index.php?title=Volunteership_2025&amp;diff=868</id>
		<title>Volunteership 2025</title>
		<link rel="alternate" type="text/html" href="https://hivelab.biochemistry.gwu.edu/wiki/index.php?title=Volunteership_2025&amp;diff=868"/>
		<updated>2025-07-16T20:12:47Z</updated>

		<summary type="html">&lt;p&gt;Jeetvora: /* Volunteers */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;h2&amp;gt;2025 Volunteer Program Details&amp;lt;/h2&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;h3&amp;gt;Dates&amp;lt;/h3&amp;gt;&lt;br /&gt;
&amp;lt;strong&amp;gt;Volunteer Zoom Kick-Off Meeting&amp;lt;/strong&amp;gt;&amp;lt;br&amp;gt;&lt;br /&gt;
May 27, 2025 | 3:30 to 4:30 PM&lt;br /&gt;
&lt;br /&gt;
&amp;lt;strong&amp;gt;Program Dates: June 2nd, 2025 – July 25th, 2025&amp;lt;/strong&amp;gt; (8 weeks)&amp;lt;br&amp;gt;&lt;br /&gt;
Monday to Friday | Remote | No breaks&lt;br /&gt;
&lt;br /&gt;
&amp;lt;hr&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;h3&amp;gt;Volunteer Expectations&amp;lt;/h3&amp;gt;&lt;br /&gt;
&amp;lt;ol&amp;gt;&lt;br /&gt;
  &amp;lt;li&amp;gt;Daily progress updates via Slack (scrum).&amp;lt;/li&amp;gt;&lt;br /&gt;
  &amp;lt;li&amp;gt;Regular Zoom meetings with the assigned project point of contact.&amp;lt;/li&amp;gt;&amp;lt;li&amp;gt;Expected to dedicate 5–6 hours per day to project work, with the remaining time focused on skill development or reading. &amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ol&amp;gt;&lt;br /&gt;
&amp;lt;p style=&amp;quot;color: red;&amp;quot;&amp;gt;&amp;lt;strong&amp;gt;Important:&amp;lt;/strong&amp;gt; If the scrum is not updated for 2 consecutive days, the candidate will be &amp;lt;u&amp;gt;automatically dropped&amp;lt;/u&amp;gt; from the program.&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;hr&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;h3&amp;gt;Potential Projects&amp;lt;/h3&amp;gt;&lt;br /&gt;
&amp;lt;ol&amp;gt;&lt;br /&gt;
  &amp;lt;li&amp;gt;BiomarkerKB ([https://biomarkerkb.org biomarkerkb.org]) project: Biomarker curation project. Involves reading papers and collecting biomarkers.&amp;lt;/li&amp;gt;&lt;br /&gt;
  &amp;lt;li&amp;gt;GlyGen ([https://glygen.org glygen.org]) project: Review glycomics and glycoproteomics data and curate tissue, disease, and other related information. &amp;lt;/li&amp;gt;&amp;lt;li&amp;gt;ARGOS ([https://argosdb.org argosdb.org]) project: Analyze genomics data using HIVE to identify reference genome assemblies. &amp;lt;/li&amp;gt;&amp;lt;li&amp;gt;PredictMod ([https://hivelab.biochemistry.gwu.edu/predictmod hivelab.biochemistry.gwu.edu/predictmod]) project. Identifying datasets and harmonizing them so that they can be used to generate ML models.  &amp;lt;/li&amp;gt;&amp;lt;/ol&amp;gt;&#039;&#039;Note: Individuals involved in the above projects with a background in programming and/or machine learning may also undertake additional tasks to support the development of ML models, which can be integrated into PredictMod or used to enhance AI/ML-ready datasets within GlyGen.&#039;&#039;&amp;lt;hr&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== 1. BiomarkerKB Biocuration Project Ideas ====&lt;br /&gt;
POC: Daniall Masood, Maria Kim&lt;br /&gt;
# Curate biomarkers for a specific disease (Alzheimer&#039;s)&lt;br /&gt;
## The student would perform manual curation for about 4 weeks, with regular check-ins with the project POC to ensure it is being done correctly.&lt;br /&gt;
## The next 4 weeks can be dedicated to developing an LLM or an automated process to extract biomarker details, using the data collected in the first 4 weeks as training or example data.&lt;br /&gt;
# Top 50 biomarkers&lt;br /&gt;
## Curate the top 50 biomarkers for biomarkerkb.org.&lt;br /&gt;
## Define what constitutes a top 50 biomarker.&lt;br /&gt;
## Begin curating biomarkers from different sources and papers by collecting fields mentioned in the data model, as well as collecting cross-references.&lt;br /&gt;
# Biocuration of biomarkers from NLP/LLM work&lt;br /&gt;
## Use the biomarkers collected from NLP work.&lt;br /&gt;
## Curate biomarkers. The data provided is not in the biomarker data model format.&lt;br /&gt;
## While curating the biomarkers, check if data collected from NLP is correct.&lt;br /&gt;
## After completion, the student can start using curated data to work on the NLP/LLM method.&lt;br /&gt;
# Curate biomarkers for a treatment&lt;br /&gt;
## See #1 above.&lt;br /&gt;
&lt;br /&gt;
If you have any other ideas, diseases, treatments, or methods you want to focus on, please reach out to daniallmasood@gwu.edu to discuss your idea and check whether it is feasible as a project for this program.&lt;br /&gt;
&lt;br /&gt;
==== 2. GlyGen Biocuration Project Ideas ====&lt;br /&gt;
POC: Rene Ranzinger and Urnisha Bhuiyan&lt;br /&gt;
&lt;br /&gt;
Over the last three decades, numerous glycomics database projects have been initiated to collect valuable information about glycans, proteins, and their interactions. Some of these databases have been discontinued due to the end of project funding. However, the data within these databases remains highly valuable to the community. Integrating these datasets into modern databases or knowledgebases, such as GlyGen, presents a challenge because many of the valuable metadata annotations (e.g., species, tissue, disease, cell line) are free-text terms that do not align with the standard dictionaries and ontologies used in modern resources. Automated matching of this information with dictionaries or ontologies is often not possible due to the use of synonyms, spelling errors, or abbreviations. For example, &amp;quot;human,&amp;quot; &amp;quot;man,&amp;quot; and &amp;quot;h. sapiens&amp;quot; all map to the scientific species name &amp;quot;Homo sapiens.&amp;quot;&lt;br /&gt;
&lt;br /&gt;
The GlyGen project aims to make datasets from two older databases (CarbBank, CFG) accessible by migrating the data and metadata into our database. For this project, we are seeking curators with a medical or biology background who are interested in helping map metadata terms from these old databases to standard dictionaries and ontologies.&lt;br /&gt;
&lt;br /&gt;
The project involves:&lt;br /&gt;
&lt;br /&gt;
# Using internet resources (e.g., Google, Wikipedia) to identify terms used in the old database.&lt;br /&gt;
# Mapping identified terms to corresponding dictionaries and ontologies using the webpages and search interfaces of these projects.&lt;br /&gt;
# Finding papers based on titles and author lists that may contain spelling errors.&lt;br /&gt;
# Discussing mappings with other curators when terms are mapped differently.&lt;br /&gt;
&lt;br /&gt;
If you have any other ideas or methods you would like to focus on, please reach out to rene@ccrc.uga.edu to discuss them.&lt;br /&gt;
&lt;br /&gt;
==== 3. GlyGen Publication Analysis Project Ideas ====&lt;br /&gt;
&lt;br /&gt;
POC: Rene Ranzinger and Urnisha Bhuiyan&lt;br /&gt;
&lt;br /&gt;
One of the challenges for any bioinformatics project is understanding the size of its community, how well the project serves this community, and how widely its software/database is used. A potential solution is to analyze PubMed publication data. We are seeking applicants with programming skills (in Python or Java) to perform this analysis.&lt;br /&gt;
&lt;br /&gt;
The project involves:&lt;br /&gt;
&lt;br /&gt;
# Using the PubMed web API to filter publications based on keywords.&lt;br /&gt;
# Analyzing paper abstracts to identify research institutions and groups that form the community.&lt;br /&gt;
# Filtering the community list to exclude unrelated co-authors.&lt;br /&gt;
# Prioritizing papers identified by GlycoSiteMiner for curation via TableMaker.&lt;br /&gt;
&lt;br /&gt;
A subproject will involve analyzing the full text of papers (when available) for keywords or resource and database names. The results of the analysis will be discussed with GlyGen project members, who will suggest changes and improvements to the analysis and data presentation. Source code developed as part of this project will be documented and shared in a public GitHub repository. If you have any other ideas or methods you would like to explore, please reach out to rene@ccrc.uga.edu to discuss them.&lt;br /&gt;
&lt;br /&gt;
==== 4. PredictMod Machine Learning Project Ideas ====&lt;br /&gt;
POC: Lori Krammer&lt;br /&gt;
&lt;br /&gt;
Data Identification &amp;amp; Curation:&lt;br /&gt;
&lt;br /&gt;
# Identify publicly available datasets from scientific literature that can be used for intervention outcome prediction models.&lt;br /&gt;
# Curate indicators of useful ML publications that could be used to train an LLM to recommend relevant publications for cancer modeling.&lt;br /&gt;
&lt;br /&gt;
Modeling &amp;amp; Integration (for those with experience in programming/ML):&lt;br /&gt;
&lt;br /&gt;
# Conduct data harmonization and pre-processing following established project pipelines to produce an ML-ready dataset and data dictionary.&lt;br /&gt;
# Perform model training and document the ML pipeline in a BioCompute Object (BCO).&lt;br /&gt;
# Integrate the model into the PredictMod platform.&lt;br /&gt;
&lt;br /&gt;
Individuals with a background or interest in machine learning should reach out to lorikrammer@gwu.edu with a potential dataset to determine whether it is a feasible project for this program.&lt;br /&gt;
&lt;br /&gt;
==== 5. FDA-ARGOS Computation and Pathogen Curation Project ====&lt;br /&gt;
&lt;br /&gt;
POC: Christie Woodside, Jonathon Keeney&lt;br /&gt;
&lt;br /&gt;
# Update data tables for more efficient computations&lt;br /&gt;
## Student would review and input additional data and IDs in the tables/sheets used to perform computations. This is manual but important work that requires close attention to detail. About 1 week of work.&lt;br /&gt;
## Requires Python/shell coding background. Student would run scripts that prepare and format data tables that are pushed to data.argosdb.org. Coding knowledge is needed to handle errors, bugs, or other problems in the code. Ongoing work as computations are performed.&lt;br /&gt;
# Curate and report on current pathogens to upload to ARGOS&lt;br /&gt;
## Student would work on manual curation of circulating pathogens to be added to data.argosdb.org, with regular check-ins and reports on what was found. About 4–10 weeks of work.&lt;br /&gt;
## Locate assembly IDs, reads, and metagenomic information for these pathogens to be used in computations and deposited into data.argosdb.org.&lt;br /&gt;
## Provide documentation on why they were curated, why they are important, how they were selected, and how data was collected.&lt;br /&gt;
&lt;br /&gt;
If you have any other ideas or methods you want to focus on, please reach out to christie.woodside@email.gwu.edu to discuss your idea and check whether it is feasible as a project for this program.&amp;lt;hr&amp;gt;&lt;br /&gt;
&amp;lt;h3&amp;gt;Requirements for Completion&amp;lt;/h3&amp;gt;&lt;br /&gt;
&amp;lt;p&amp;gt;&amp;lt;strong&amp;gt;Note:&amp;lt;/strong&amp;gt; The following are &amp;lt;u&amp;gt;mandatory&amp;lt;/u&amp;gt;. Failure to complete any will result in an incomplete volunteer record.&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;h4&amp;gt;Documentation&amp;lt;/h4&amp;gt;&lt;br /&gt;
&amp;lt;p&amp;gt;All volunteers must maintain adequate documentation of their work, including written protocols and scripts submitted to GitHub.&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;h4&amp;gt;Written Report&amp;lt;/h4&amp;gt;&lt;br /&gt;
&amp;lt;p&amp;gt;Submit a 1–2 page summary of your tasks and accomplishments to the Admin during the final week of your program.&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;h4&amp;gt;Presentation &amp;amp; Slide Submission&amp;lt;/h4&amp;gt;&lt;br /&gt;
&amp;lt;p&amp;gt;Present your work during the last week of the 8-week period.&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;p&amp;gt;Slides must be submitted to the Admin Team and should include:&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
  &amp;lt;li&amp;gt;A title slide with your name, date, and mentor&amp;lt;/li&amp;gt;&lt;br /&gt;
  &amp;lt;li&amp;gt;At least 3 content slides&amp;lt;/li&amp;gt;&lt;br /&gt;
  &amp;lt;li&amp;gt;A final slide with acknowledgements or references&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
Contact the Admin Team to access previously submitted slides.&lt;br /&gt;
&amp;lt;hr&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Completion Certificate ===&lt;br /&gt;
A certificate of completion and a letter of recommendation will be provided to all participants who successfully complete the program.&lt;br /&gt;
&amp;lt;hr&amp;gt;&lt;br /&gt;
=== Contact ===&lt;br /&gt;
mazumder_lab@gwu.edu.&lt;br /&gt;
&amp;lt;hr&amp;gt;&lt;br /&gt;
=== Volunteers ===&lt;br /&gt;
{| class=&amp;quot;wikitable&amp;quot;&lt;br /&gt;
|+&lt;br /&gt;
|-&lt;br /&gt;
! Name&lt;br /&gt;
! Assigned Project&lt;br /&gt;
! Projects of Interest&lt;br /&gt;
|-&lt;br /&gt;
| [https://www.linkedin.com/in/gracesjchong/ Grace Chong]&lt;br /&gt;
|PredictMod&lt;br /&gt;
|&lt;br /&gt;
# PredictMod&lt;br /&gt;
# BiomarkerKB Biocuration&lt;br /&gt;
# GlyGen Biocuration&lt;br /&gt;
|-&lt;br /&gt;
|[https://www.linkedin.com/in/alma-ogunsina-4959072b1/ Alma Ogunsina]&lt;br /&gt;
|Biomarker curation&lt;br /&gt;
|&lt;br /&gt;
# BiomarkerKB&lt;br /&gt;
# ARGOS&lt;br /&gt;
# PredictMod&lt;br /&gt;
|-&lt;br /&gt;
|[https://www.linkedin.com/in/diya-kamalabharathy-62557935a/ Diya Kamalabharathy]&lt;br /&gt;
|PredictMod&lt;br /&gt;
|&lt;br /&gt;
# BiomarkerKB Biocuration&lt;br /&gt;
# PredictMod Machine Learning&lt;br /&gt;
# GlyGen Biocuration&lt;br /&gt;
|-&lt;br /&gt;
|[https://www.linkedin.com/in/harivinay-prasad-reddy-gujjula-a06ba71bb/ Harivinay P. Gujjula]&lt;br /&gt;
|GlyGen curation&lt;br /&gt;
|&lt;br /&gt;
# GlyGen Biocuration&lt;br /&gt;
# BiomarkerKB Biocuration&lt;br /&gt;
|-&lt;br /&gt;
|[https://www.linkedin.com/in/miao-wang-88b602290/ Miao Wang]&lt;br /&gt;
|ARGOS&lt;br /&gt;
|&lt;br /&gt;
# BiomarkerKB Biocuration Project Ideas&lt;br /&gt;
# FDA-ARGOS Computation and Pathogen Curation Project&lt;br /&gt;
# PredictMod Machine Learning Project Ideas&lt;br /&gt;
|-&lt;br /&gt;
|[https://www.linkedin.com/in/nahom-gebreselassie-1545ab336/ Nahom Abel]&lt;br /&gt;
|GlyGen curation&lt;br /&gt;
|&lt;br /&gt;
# BiomarkerKB Biocuration&lt;br /&gt;
# GlyGen Biocuration&lt;br /&gt;
# PredictMod&lt;br /&gt;
|-&lt;br /&gt;
|[https://www.linkedin.com/in/kajal-patel-cs/ Kajal Sanjaykumar Patel]&lt;br /&gt;
|GlyGen and PubMed project&lt;br /&gt;
|&lt;br /&gt;
# PredictMod&lt;br /&gt;
# BiomarkerKB&lt;br /&gt;
# GlyGen&lt;br /&gt;
|-&lt;br /&gt;
|[https://www.linkedin.com/in/john-mccaffrey-b8850930a/ John McCaffrey]&lt;br /&gt;
|Biomarker curation&lt;br /&gt;
|&lt;br /&gt;
# PredictMod&lt;br /&gt;
# BiomarkerKB&lt;br /&gt;
# GlyGen Biocuration &lt;br /&gt;
|-&lt;br /&gt;
|[https://www.linkedin.com/in/nathan-ressom/ Nathan Ressom]&lt;br /&gt;
|BiomarkerKB&lt;br /&gt;
|&lt;br /&gt;
# PredictMod &lt;br /&gt;
# GlyGen Biocuration&lt;br /&gt;
# BiomarkerKB Biocuration&lt;br /&gt;
|-&lt;br /&gt;
|[https://www.linkedin.com/in/aaron-ressom/ Aaron Ressom] &lt;br /&gt;
|PredictMod&lt;br /&gt;
|&lt;br /&gt;
# BiomarkerKB &lt;br /&gt;
# PredictMod &lt;br /&gt;
# GlyGen&lt;br /&gt;
|-&lt;br /&gt;
|[https://www.linkedin.com/in/akale-kinfe/ Akale Kinfe]&lt;br /&gt;
|Biomarker curation&lt;br /&gt;
|&lt;br /&gt;
# BiomarkerKB Biocuration&lt;br /&gt;
# GlyGen Biocuration&lt;br /&gt;
# ARGOS&lt;br /&gt;
|-&lt;br /&gt;
|[https://www.linkedin.com/in/aise-arpinar-a8bb9b373/?original_referer= Aise Arpinar]&lt;br /&gt;
|GlyGen curation&lt;br /&gt;
|&lt;br /&gt;
# GlyGen Biocuration&lt;br /&gt;
# BiomarkerKB Biocuration&lt;br /&gt;
# GlyGen Publication Analysis&lt;br /&gt;
|-&lt;br /&gt;
|[https://www.linkedin.com/in/piyush-pandey-906b582b5/ Piyush Pandey]&lt;br /&gt;
|Biomarker curation&lt;br /&gt;
|&lt;br /&gt;
# BiomarkerKB Biocuration &lt;br /&gt;
# PredictMod &lt;br /&gt;
# GlyGen Biocuration &lt;br /&gt;
|-&lt;br /&gt;
|[http://www.linkedin.com/in/filmawit-zeru-203272363 Filmawit Zeru]&lt;br /&gt;
|GlycoSiteMiner project&lt;br /&gt;
|&lt;br /&gt;
# BiomarkerKB&lt;br /&gt;
# GlyGen&lt;br /&gt;
# ARGOS&lt;br /&gt;
|-&lt;br /&gt;
|[https://www.linkedin.com/in/mathias-belay-03b51a2a3/ Mathias Belay]&lt;br /&gt;
|Biomarker curation&lt;br /&gt;
|&lt;br /&gt;
# GlyGen&lt;br /&gt;
# PredictMod&lt;br /&gt;
# BiomarkerKB&lt;br /&gt;
|-&lt;br /&gt;
|[https://www.linkedin.com/in/isaac-kim-b644bb231/ Isaac Kim]&lt;br /&gt;
|Biomarker curation&lt;br /&gt;
|&lt;br /&gt;
# BiomarkerKB&lt;br /&gt;
# PredictMod&lt;br /&gt;
# GlyGen&lt;br /&gt;
|-&lt;br /&gt;
|Sohana Bahl&lt;br /&gt;
|Biomarker curation&lt;br /&gt;
|&lt;br /&gt;
# BiomarkerKB&lt;br /&gt;
|-&lt;br /&gt;
|[https://www.linkedin.com/in/ana-vohralikova-794a4433a?utm_source=share&amp;amp;utm_campaign=share_via&amp;amp;utm_content=profile&amp;amp;utm_medium=ios_app Ana Vohralikova]&lt;br /&gt;
|Biomarker curation&lt;br /&gt;
|&lt;br /&gt;
# BiomarkerKB Biocuration Project&lt;br /&gt;
# GlyGen Biocuration Project&lt;br /&gt;
# FDA-ARGOS Computation and Pathogen&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
=== Symposium Slide Guidelines ===&lt;br /&gt;
&#039;&#039;&#039;Content Clarity&#039;&#039;&#039; &lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Keep It Simple:&#039;&#039;&#039; Use concise bullet points instead of long paragraphs. Aim for no more than 6 bullet points per slide.&lt;br /&gt;
* &#039;&#039;&#039;Focus on Key Points:&#039;&#039;&#039; Highlight the main ideas or data you want your audience to remember.&lt;br /&gt;
* &#039;&#039;&#039;Consistent Layout:&#039;&#039;&#039; Use a consistent layout for each slide, including fonts, colors, and background. This helps maintain a professional look.&lt;br /&gt;
* &#039;&#039;&#039;High-Quality Images:&#039;&#039;&#039; Use high-resolution images and graphics to illustrate your points. Avoid using clip art.&lt;br /&gt;
* &#039;&#039;&#039;Readable Fonts:&#039;&#039;&#039; Use easy-to-read fonts (e.g., Arial, Calibri) and ensure font sizes are large enough to be seen from a distance (24 pt or larger for main text).&lt;br /&gt;
* &#039;&#039;&#039;Contrast:&#039;&#039;&#039; Ensure there is high contrast between text and background (e.g., dark text on a light background).&lt;br /&gt;
* &#039;&#039;&#039;Citation:&#039;&#039;&#039; Cite publications that support the information presented, using a proper citation format.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Outline for Symposium presentation&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
# Introduction&lt;br /&gt;
# Project Descriptions&lt;br /&gt;
# Objectives and Goals&lt;br /&gt;
# Methods, Results, Achievements and Contributions&lt;br /&gt;
# Future Plans&lt;br /&gt;
# Skills and Knowledge Gained&lt;br /&gt;
# Acknowledgments&lt;br /&gt;
# Q&amp;amp;A Session&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Outline&#039;&#039;&#039; &lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;1. Introduction:&#039;&#039;&#039; (1 slide)&lt;br /&gt;
&lt;br /&gt;
* Briefly introduce yourself.&lt;br /&gt;
* Add your picture and name to the introduction slide. For a group project, add a group picture.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;2. Project Descriptions:&#039;&#039;&#039; (1 slide)&lt;br /&gt;
&lt;br /&gt;
* Provide context and background information about the project.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;3. Project Objectives and Goals:&#039;&#039;&#039; (1 slide)&lt;br /&gt;
&lt;br /&gt;
* Describe the main objectives of the project or initiative.&lt;br /&gt;
* Discuss any additional goals or desired outcomes.&lt;br /&gt;
* Explain why these objectives and goals are important.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;4. Methods, Results, Achievements and Contributions:&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
* Highlight the methods/tools used in the project.&lt;br /&gt;
* Highlight the key results and outcomes of the project.&lt;br /&gt;
* Discuss the most significant achievements and milestones reached.&lt;br /&gt;
* Explain how each team member contributed to the project (for group projects).&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;5. Future Plans:&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
* Describe the next steps or future plans for the project.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;6. Skills and Knowledge Gained:&#039;&#039;&#039; (1 slide)&lt;br /&gt;
&lt;br /&gt;
* Detail any technical skills acquired or improved.&lt;br /&gt;
* Highlight any soft skills, such as communication or teamwork, that were developed.&lt;br /&gt;
* Discuss new knowledge gained in specific areas or subjects.&lt;br /&gt;
* Share any personal reflections on the experience and what was learned.&lt;br /&gt;
* Discuss any challenges or obstacles encountered and how they were overcome.&lt;br /&gt;
* Provide key insights or lessons learned from the project.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;7. Acknowledgments:&#039;&#039;&#039; (1 slide)&lt;br /&gt;
&lt;br /&gt;
* Acknowledge the contributions of team members and collaborators.&lt;br /&gt;
* Recognize the guidance and support of mentors and advisors.&lt;br /&gt;
* Acknowledge the project funding (e.g., CFDE).&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;8. Q&amp;amp;A Session:&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
* Invite the audience to ask questions and engage in discussion.&lt;br /&gt;
* Provide clear and thoughtful responses to audience questions.&lt;br /&gt;
* Offer closing remarks and thank the audience for their participation.&lt;br /&gt;
&lt;br /&gt;
Note: If you have limited presentation time, you can merge a few topics into one.&lt;/div&gt;</summary>
		<author><name>Jeetvora</name></author>
	</entry>
	<entry>
		<id>https://hivelab.biochemistry.gwu.edu/wiki/index.php?title=Events&amp;diff=286</id>
		<title>Events</title>
		<link rel="alternate" type="text/html" href="https://hivelab.biochemistry.gwu.edu/wiki/index.php?title=Events&amp;diff=286"/>
		<updated>2025-02-03T15:53:37Z</updated>

		<summary type="html">&lt;p&gt;Jeetvora: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;h2&amp;gt;Current/Past Events&amp;lt;/h2&amp;gt;&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
    &amp;lt;li style=&amp;quot;margin-bottom: 15px;&amp;quot;&amp;gt;&lt;br /&gt;
        [https://wiki.glygen.org/GlyGen_Webinar_Series GlySpace Alliance / GlyGen Webinar Series]&amp;lt;br&amp;gt;&lt;br /&gt;
        Frequency: monthly&lt;br /&gt;
    &amp;lt;/li&amp;gt;&lt;br /&gt;
    &amp;lt;li style=&amp;quot;margin-bottom: 15px;&amp;quot;&amp;gt;&lt;br /&gt;
        [https://docs.google.com/document/d/15DOqNE_BLI0QPlzYgXcR9fyG-plqJKeRQc_AtYjTeaE/edit?usp=sharing 2025 GW-Bioinformatics Symposium] &amp;lt;br&amp;gt;&lt;br /&gt;
        April 29 at GW in Washington, DC, USA&lt;br /&gt;
    &amp;lt;/li&amp;gt;&lt;br /&gt;
    &amp;lt;li style=&amp;quot;margin-bottom: 15px;&amp;quot;&amp;gt;&lt;br /&gt;
        [https://wiki.glygen.org/GlyGen_and_Foundation_for_Advanced_Education_in_the_Sciences_(FAES)_2024_Workshop 2024 GlyGen-FAES Workshop] &amp;lt;br&amp;gt;&lt;br /&gt;
        Nov 4 at FAES-NIH in Bethesda, MD, USA&lt;br /&gt;
    &amp;lt;/li&amp;gt;&lt;br /&gt;
    &amp;lt;li style=&amp;quot;margin-bottom: 15px;&amp;quot;&amp;gt;&lt;br /&gt;
        [https://wiki.biocomputeobject.org/BioCompute_Conference_and_Workshop#Conference_Webinar 2024 BioCompute Conference and Workshop] &amp;lt;br&amp;gt;&lt;br /&gt;
        May 10 at FDA in White Oak, MD, USA&lt;br /&gt;
    &amp;lt;/li&amp;gt;&lt;br /&gt;
    &amp;lt;li style=&amp;quot;margin-bottom: 15px;&amp;quot;&amp;gt;&lt;br /&gt;
        [https://www.biocomputeobject.org/ 2019 BioCompute Workshop] &amp;lt;br&amp;gt;&lt;br /&gt;
        May 14 at FDA in White Oak, MD, USA&lt;br /&gt;
    &amp;lt;/li&amp;gt;&lt;br /&gt;
    &amp;lt;li style=&amp;quot;margin-bottom: 15px;&amp;quot;&amp;gt;&lt;br /&gt;
        2018 OncoMX Workshop &amp;lt;br&amp;gt;&lt;br /&gt;
        May 22 at GW in Washington, DC, USA&lt;br /&gt;
    &amp;lt;/li&amp;gt;&lt;br /&gt;
    &amp;lt;li style=&amp;quot;margin-bottom: 15px;&amp;quot;&amp;gt;&lt;br /&gt;
        2018 Glycoinformatics Tools Workshop &amp;lt;br&amp;gt;&lt;br /&gt;
        Mar 16 at GW in Washington, DC, USA&lt;br /&gt;
    &amp;lt;/li&amp;gt;&lt;br /&gt;
    &amp;lt;li style=&amp;quot;margin-bottom: 15px;&amp;quot;&amp;gt;&lt;br /&gt;
        2018 BioCompute PoC Workshop &amp;lt;br&amp;gt;&lt;br /&gt;
        Mar 23 at GW in Washington, DC, USA&lt;br /&gt;
    &amp;lt;/li&amp;gt;&lt;br /&gt;
    &amp;lt;li style=&amp;quot;margin-bottom: 15px;&amp;quot;&amp;gt;&lt;br /&gt;
        2017 HTS Computational Standards for Regulatory Sciences Workshop &amp;lt;br&amp;gt;&lt;br /&gt;
        Mar 16-17 at NIH in Bethesda, MD, USA&lt;br /&gt;
    &amp;lt;/li&amp;gt;&lt;br /&gt;
    &amp;lt;li style=&amp;quot;margin-bottom: 15px;&amp;quot;&amp;gt;&lt;br /&gt;
        2014 Next Generation Sequencing Standards &amp;lt;br&amp;gt;&lt;br /&gt;
        Sep 24-25 at NIH in Bethesda, MD, USA&lt;br /&gt;
    &amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;/div&gt;</summary>
		<author><name>Jeetvora</name></author>
	</entry>
	<entry>
		<id>https://hivelab.biochemistry.gwu.edu/wiki/index.php?title=Projects&amp;diff=221</id>
		<title>Projects</title>
		<link rel="alternate" type="text/html" href="https://hivelab.biochemistry.gwu.edu/wiki/index.php?title=Projects&amp;diff=221"/>
		<updated>2024-12-10T21:19:32Z</updated>

		<summary type="html">&lt;p&gt;Jeetvora: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;{{DISPLAYTITLE:&amp;lt;span style=&amp;quot;position: absolute; clip: rect(1px 1px 1px 1px); clip: rect(1px, 1px, 1px, 1px);&amp;quot;&amp;gt;{{FULLPAGENAME}}&amp;lt;/span&amp;gt;}}&lt;br /&gt;
__NOTOC__&lt;br /&gt;
&amp;lt;!-- BANNER ACROSS TOP OF PAGE --&amp;gt;&lt;br /&gt;
&amp;lt;div id=&amp;quot;ggw-topbanner&amp;quot; style=&amp;quot;clear:both; position:relative; box-sizing:border-box; width:100%; margin:1.2em 0 6px; min-width:47em; border:1px solid #ddd; background-color:#f9f9f9; color:#000;&amp;quot;&amp;gt;&lt;br /&gt;
    &amp;lt;div style=&amp;quot;margin:0.4em; text-align:center;&amp;quot;&amp;gt;&lt;br /&gt;
        &amp;lt;div style=&amp;quot;font-size:160%; padding:.1em;&amp;quot;&amp;gt;Current Projects&amp;lt;/div&amp;gt;&lt;br /&gt;
        &amp;lt;/div&amp;gt;&lt;br /&gt;
    &amp;lt;/div&amp;gt;&lt;br /&gt;
&amp;lt;/div&amp;gt;&lt;br /&gt;
&amp;lt;div style=&amp;quot;clear: both;&amp;quot;&amp;gt;&amp;lt;/div&amp;gt;&lt;br /&gt;
 &lt;br /&gt;
&amp;lt;div id=&amp;quot;ggw_row2&amp;quot; style=&amp;quot;display: flex; flex-flow: row wrap; justify-content: space-between; padding: 0; margin: 0 -5px 0 -5px;&amp;quot;&amp;gt;&lt;br /&gt;
    &amp;lt;div style=&amp;quot;flex: 1; margin: 5px; min-width: 210px; border: 1px solid #CCC;	padding: 0 10px 10px 10px; box-shadow: 0 2px 2px rgba(0,0,0,0.1); background: #f5faff;&amp;quot;&amp;gt;&lt;br /&gt;
        &amp;lt;h3&amp;gt;[https://hive.biochemistry.gwu.edu/dna.cgi?cmd=main The High-performance Integrated Virtual Environment (HIVE) platform]&amp;lt;/h3&amp;gt;&lt;br /&gt;
        &amp;lt;div style=&amp;quot;border-top: 1px solid #CCC; padding-top: 0.5em;&amp;quot;&amp;gt;&lt;br /&gt;
HIVE is a cloud-based environment optimized for the storage and analysis of extra-large data, such as biomedical data, clinical data, next-generation sequencing (NGS) data, mass spectrometry files, confocal microscopy images, post-market surveillance data, medical recall data, and many others. HIVE provides secure web access for authorized users to deposit, retrieve, annotate and compute on Big Data, and analyze the outcomes using web user interfaces. [https://docs.google.com/document/d/1F5iq00uKkJfdSsbwanvKOy-nPnwijH56mwbwa_HhzfY/edit?tab=t.0#heading=h.7dlfmngwfzih More here].&lt;br /&gt;
        &amp;lt;/div&amp;gt;&lt;br /&gt;
    &amp;lt;/div&amp;gt;&lt;br /&gt;
	&lt;br /&gt;
    &amp;lt;div style=&amp;quot;flex: 1; margin: 5px; min-width: 210px; border: 1px solid #CCC;	padding: 0 10px 10px 10px; box-shadow: 0 2px 2px rgba(0,0,0,0.1); background: #f5faff;&amp;quot;&amp;gt;&lt;br /&gt;
        &amp;lt;h3&amp;gt;[https://www.biocomputeobject.org/ BioCompute Objects (BCO)]&amp;lt;/h3&amp;gt;&lt;br /&gt;
        &amp;lt;div style=&amp;quot;border-top: 1px solid #CCC; padding-top: 0.5em;&amp;quot;&amp;gt;&lt;br /&gt;
BioCompute is an FDA-funded project to establish a framework for community-based development of standards for harmonization of high-throughput sequencing (HTS), standardization of data formats, promotion of interoperability, and bioinformatics verification protocols. The BioCompute Object (BCO) was developed in the High-throughput Sequencing Computational Standards for Regulatory Sciences (HTS-CSRS) initiative in the BioCompute Objects Portal (BOP), a web portal that serves as a collaborative ground to encourage dialogue and facilitate interoperability between different bioinformatics pipelines, industries, and developers. HIVE capabilities have been leveraged to support the development of the BCO. The BCO is versatile and adaptable to other common HTS analysis platforms. [https://docs.google.com/document/d/1WQFZm_PFiQXob4NyOKq6y-2ywnbmNoFHSS27fYf3l4Y/edit?tab=t.0#heading=h.bs8eki17tykx More here].&lt;br /&gt;
&amp;lt;/div&amp;gt;&lt;br /&gt;
    &amp;lt;/div&amp;gt;&lt;br /&gt;
    &amp;lt;div style=&amp;quot;flex: 1; margin: 5px; min-width: 210px; border: 1px solid #CCC;	padding: 0 10px 10px 10px; box-shadow: 0 2px 2px rgba(0,0,0,0.1); background: #f5faff;&amp;quot;&amp;gt;&lt;br /&gt;
        &amp;lt;h3&amp;gt;[https://www.glygen.org/ GlyGen]&amp;lt;/h3&amp;gt;&lt;br /&gt;
	&amp;lt;div style=&amp;quot;border-top: 1px solid #CCC; padding-top: 0.5em;&amp;quot;&amp;gt;&lt;br /&gt;
GlyGen (gly-glycobiology; gen-information), https://www.glygen.org/, is an advanced glycoinformatics resource developed to facilitate discovery in basic and translational glycobiology research along with enhancing the integration of multidisciplinary information from diverse resources. GlyGen includes knowledge about molecular, biophysical and functional properties of glycans, genes, and proteins organized in pathways and ontologies, plus a rapidly growing body of biological big data related to cancer mutation and expression. GlyGen adopts an innovative user-driven approach for implementing, prioritizing and knowledge disseminating tools to address the questions and needs of the glycobiology community. GlyGen is funded by the National Institute of General Medical Sciences under the grant # 1R24GM146616 - 01 and the National Institutes of Health Office of Strategic Coordination - The Common Fund under the grant # 1OT2OD032092. More information about GlyGen: https://www.glygen.org/about/&lt;br /&gt;
        &amp;lt;/div&amp;gt;&lt;br /&gt;
    &amp;lt;/div&amp;gt;&lt;br /&gt;
&amp;lt;/div&amp;gt;&amp;lt;div id=&amp;quot;ggw_row3&amp;quot; style=&amp;quot;display: flex; flex-flow: row wrap; justify-content: space-between; padding: 0; margin: 0 -5px 0 -5px;&amp;quot;&amp;gt;&lt;br /&gt;
    &amp;lt;div style=&amp;quot;flex: 1; margin: 5px; min-width: 210px; border: 1px solid #CCC;	padding: 0 10px 10px 10px; box-shadow: 0 2px 2px rgba(0,0,0,0.1); background: #f5faff;&amp;quot;&amp;gt;&lt;br /&gt;
        &amp;lt;h3&amp;gt;[https://hivelab.biochemistry.gwu.edu/predictmod PredictMod]&amp;lt;/h3&amp;gt;&lt;br /&gt;
        &amp;lt;div style=&amp;quot;border-top: 1px solid #CCC; padding-top: 0.5em;&amp;quot;&amp;gt;&lt;br /&gt;
PredictMod is an application designed to predict the outcome of an intervention prior to a patient initiating treatment. Our goal is to provide clinicians with a powerful decision-making tool that enhances clinical understanding of patient-level data. The PredictMod platform utilizes machine learning tools and complex datasets based on electronic health records, gut microbiome, and -omics data to forecast patient outcomes, often in response to treatment for a particular condition. While our primary condition of interest is prediabetes, the tool is designed to be used for a variety of conditions, interventions, and data types.&lt;br /&gt;
&amp;lt;/div&amp;gt;&lt;br /&gt;
    &amp;lt;/div&amp;gt;&lt;br /&gt;
    &amp;lt;div style=&amp;quot;flex: 1; margin: 5px; min-width: 210px; border: 1px solid #CCC;	padding: 0 10px 10px 10px; box-shadow: 0 2px 2px rgba(0,0,0,0.1); background: #f5faff;&amp;quot;&amp;gt;&lt;br /&gt;
        &amp;lt;h3&amp;gt;[[GW-FEAST]]&amp;lt;/h3&amp;gt;&lt;br /&gt;
        &amp;lt;div style=&amp;quot;border-top: 1px solid #CCC; padding-top: 0.5em;&amp;quot;&amp;gt;&lt;br /&gt;
The GW Federated Ecosystems for Analytics and Standardized Technologies (GW-FEAST) project is part of the ARPA-H FEAST performer team initiative that includes academic and industry partners. The goal of the ARPA-H performer teams is “to create bridges across data silos to make health data more accessible and usable”.&lt;br /&gt;
&amp;lt;/div&amp;gt;&lt;br /&gt;
    &amp;lt;/div&amp;gt;&lt;br /&gt;
    &amp;lt;div style=&amp;quot;flex: 1; margin: 5px; min-width: 210px; border: 1px solid #CCC;	padding: 0 10px 10px 10px; box-shadow: 0 2px 2px rgba(0,0,0,0.1); background: #f5faff;&amp;quot;&amp;gt;&lt;br /&gt;
        &amp;lt;h3&amp;gt;[https://hivelab.tst.biochemistry.gwu.edu/biomarker-partnership Biomarker Partnership]&amp;lt;/h3&amp;gt;&lt;br /&gt;
        &amp;lt;div style=&amp;quot;border-top: 1px solid #CCC; padding-top: 0.5em;&amp;quot;&amp;gt;&lt;br /&gt;
The Biomarker Partnership is a CFDE-sponsored project to develop a knowledgebase that will organize and integrate biomarker data from different public sources. The data will be connected to contextual information to show a novel systems-level view of biomarkers. The motivation for this project is to improve the harmonization and organization of biomarker data. This will be done by mapping biomarkers from public sources to, and across, CF data elements. This mapping will bridge knowledge across multiple DCCs and biomedical disciplines.&lt;br /&gt;
&amp;lt;/div&amp;gt;&lt;br /&gt;
    &amp;lt;/div&amp;gt;&lt;br /&gt;
&amp;lt;/div&amp;gt;&lt;br /&gt;
{{DISPLAYTITLE:&amp;lt;span style=&amp;quot;position: absolute; clip: rect(1px 1px 1px 1px); clip: rect(1px, 1px, 1px, 1px);&amp;quot;&amp;gt;{{FULLPAGENAME}}&amp;lt;/span&amp;gt;}}&lt;br /&gt;
__NOTOC__&lt;br /&gt;
&amp;lt;!-- BANNER ACROSS TOP OF PAGE --&amp;gt;&lt;br /&gt;
&amp;lt;div id=&amp;quot;ggw-topbanner&amp;quot; style=&amp;quot;clear:both; position:relative; box-sizing:border-box; width:100%; margin:1.2em 0 6px; min-width:47em; border:1px solid #ddd; background-color:#f9f9f9; color:#000;&amp;quot;&amp;gt;&lt;br /&gt;
    &amp;lt;div style=&amp;quot;margin:0.4em; text-align:center;&amp;quot;&amp;gt;&lt;br /&gt;
        &amp;lt;div style=&amp;quot;font-size:160%; padding:.1em;&amp;quot;&amp;gt;Past Projects&amp;lt;/div&amp;gt;&lt;br /&gt;
        &amp;lt;/div&amp;gt;&lt;br /&gt;
    &amp;lt;/div&amp;gt;&lt;br /&gt;
&amp;lt;/div&amp;gt;&lt;br /&gt;
&amp;lt;div style=&amp;quot;clear: both;&amp;quot;&amp;gt;&amp;lt;/div&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;div id=&amp;quot;ggw_row2&amp;quot; style=&amp;quot;display: flex; flex-flow: row wrap; justify-content: space-between; padding: 0; margin: 0 -5px 0 -5px;&amp;quot;&amp;gt;&lt;br /&gt;
    &amp;lt;div style=&amp;quot;flex: 1; margin: 5px; min-width: 210px; border: 1px solid #CCC;	padding: 0 10px 10px 10px; box-shadow: 0 2px 2px rgba(0,0,0,0.1); background: #f5faff;&amp;quot;&amp;gt;&lt;br /&gt;
        &amp;lt;h3&amp;gt;[https://hivelab.tst.biochemistry.gwu.edu/gfkb Gut Microbiome Analytic System (Microbiome)]&amp;lt;/h3&amp;gt;&lt;br /&gt;
        &amp;lt;div style=&amp;quot;border-top: 1px solid #CCC; padding-top: 0.5em;&amp;quot;&amp;gt;&lt;br /&gt;
The HIVE team received NSF funding to develop a Gut Microbiome Monitoring System (GutFeeling), a tool that, when used over time, will allow users to adjust their dietary habits (such as consumption of probiotics and prebiotics) and other lifestyle habits and help restore their normal microbiome. Rapid analysis of the large amount of metagenomic data, a major bottleneck, has been resolved by our group through the development of a novel algorithm and accompanying software called CensuScope. Through analysis of healthy gut microbiome data, we are actively developing a Knowledge Base (GutFeelingKB) not only to provide a clearer picture of an ideal personalized microbiome but also to establish baseline characteristics for each customer. The Mazumder Lab is collaborating with the Milken School of Public Health and Kamtek Sequencing Facility to investigate the relationship between bacterial species commonly present in the digestive tract, diet, physical activity, lifestyle habits, and metabolic risk factors. [https://docs.google.com/document/d/18WyVTJrrf-FR0sHt634vO8Lwel-4OQxP9sNar7gYYro/edit?tab=t.0#heading=h.7qbm3f7lky31 More here].&lt;br /&gt;
        &amp;lt;/div&amp;gt;&lt;br /&gt;
    &amp;lt;/div&amp;gt;&lt;br /&gt;
	&lt;br /&gt;
    &amp;lt;div style=&amp;quot;flex: 1; margin: 5px; min-width: 210px; border: 1px solid #CCC;	padding: 0 10px 10px 10px; box-shadow: 0 2px 2px rgba(0,0,0,0.1); background: #f5faff;&amp;quot;&amp;gt;&lt;br /&gt;
        &amp;lt;h3&amp;gt;HIVE-EQAPOL Project on HIVE NGS Data Processing and Analysis&amp;lt;/h3&amp;gt;&lt;br /&gt;
        &amp;lt;div style=&amp;quot;border-top: 1px solid #CCC; padding-top: 0.5em;&amp;quot;&amp;gt;&lt;br /&gt;
For this project, our group works closely with the External Quality Assurance Program Oversight Laboratory (EQAPOL) team to analyze, store, and track HIV NGS data. Reliable identification of strains is critical for developing new assays, validating assay platforms, helping regulators evaluate test kits, monitoring HIV drug resistance, and informing vaccine development. The HIVE tools and platform are used for virus identification, recombination analysis, and clone discovery.&lt;br /&gt;
&amp;lt;/div&amp;gt;&lt;br /&gt;
    &amp;lt;/div&amp;gt;&lt;br /&gt;
    &amp;lt;div style=&amp;quot;flex: 1; margin: 5px; min-width: 210px; border: 1px solid #CCC;	padding: 0 10px 10px 10px; box-shadow: 0 2px 2px rgba(0,0,0,0.1); background: #f5faff;&amp;quot;&amp;gt;&lt;br /&gt;
        &amp;lt;h3&amp;gt;[https://www.oncomx.org/ OncoMX]&amp;lt;/h3&amp;gt;&lt;br /&gt;
	&amp;lt;div style=&amp;quot;border-top: 1px solid #CCC; padding-top: 0.5em;&amp;quot;&amp;gt;&lt;br /&gt;
The OncoMX mission is to create an integrated cancer mutation and expression resource for exploring cancer biomarkers. OncoMX is a collaboration between the George Washington University (GW), NASA&#039;s Jet Propulsion Laboratory (JPL), the Swiss Institute of Bioinformatics (SIB), and the University of Delaware (UD). The core knowledgebase of OncoMX is derived from BioMuta and BioXpress integrated cancer mutation and expression databases. Normal expression data from Bgee and custom text mining software augment the cancer data to improve functional interpretation of the reported variants and expression profiles. All data are wrapped into the OncoMX database and web portal, mapped to additional functional information from NCI Early Detection Research Network (EDRN) and Reactome. It is expected that the large-scale integration of cancer data and supporting information, provided by OncoMX with direct community feedback, will benefit cancer research by improving synthesis of information and may make earlier detection a reality.&lt;br /&gt;
        &amp;lt;/div&amp;gt;&lt;br /&gt;
    &amp;lt;/div&amp;gt;&lt;br /&gt;
&amp;lt;div style=&amp;quot;flex: 1; margin: 5px; min-width: 210px; border: 1px solid #CCC;	padding: 0 10px 10px 10px; box-shadow: 0 2px 2px rgba(0,0,0,0.1); background: #f5faff;&amp;quot;&amp;gt;&lt;br /&gt;
        &amp;lt;h3&amp;gt;[https://hive.biochemistry.gwu.edu/dna.cgi?cmd=main Glycoproteomics Characterization Workflow and Data-Analysis Pipeline for Vaccines and Biosimilars]&amp;lt;/h3&amp;gt;&lt;br /&gt;
	&amp;lt;div style=&amp;quot;border-top: 1px solid #CCC; padding-top: 0.5em;&amp;quot;&amp;gt;&lt;br /&gt;
In this FDA-funded project, we are extending High-performance Integrated Virtual Environment (HIVE) capabilities through the development and integration of software tools and datasets for comparative analysis of glycoproteins. Glycomic analysis has many angles and has been extensively reviewed in recent literature. We propose to rely on the independent development of the glycomics field and incorporate these approaches into the HIVE pipeline as they mature, while we develop a standardized glycoinformatics pipeline that will benefit investigators and regulators at the FDA.&lt;br /&gt;
        &amp;lt;/div&amp;gt;&lt;br /&gt;
    &amp;lt;/div&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/div&amp;gt;&lt;/div&gt;</summary>
		<author><name>Jeetvora</name></author>
	</entry>
	<entry>
		<id>https://hivelab.biochemistry.gwu.edu/wiki/index.php?title=Projects&amp;diff=220</id>
		<title>Projects</title>
		<link rel="alternate" type="text/html" href="https://hivelab.biochemistry.gwu.edu/wiki/index.php?title=Projects&amp;diff=220"/>
		<updated>2024-12-10T21:19:09Z</updated>

		<summary type="html">&lt;p&gt;Jeetvora: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;{{DISPLAYTITLE:&amp;lt;span style=&amp;quot;position: absolute; clip: rect(1px 1px 1px 1px); clip: rect(1px, 1px, 1px, 1px);&amp;quot;&amp;gt;{{FULLPAGENAME}}&amp;lt;/span&amp;gt;}}&lt;br /&gt;
__NOTOC__&lt;br /&gt;
&amp;lt;!-- BANNER ACROSS TOP OF PAGE --&amp;gt;&lt;br /&gt;
&amp;lt;div id=&amp;quot;ggw-topbanner&amp;quot; style=&amp;quot;clear:both; position:relative; box-sizing:border-box; width:100%; margin:1.2em 0 6px; min-width:47em; border:1px solid #ddd; background-color:#f9f9f9; color:#000;&amp;quot;&amp;gt;&lt;br /&gt;
    &amp;lt;div style=&amp;quot;margin:0.4em; text-align:center;&amp;quot;&amp;gt;&lt;br /&gt;
        &amp;lt;div style=&amp;quot;font-size:160%; padding:.1em;&amp;quot;&amp;gt;Current Projects&amp;lt;/div&amp;gt;&lt;br /&gt;
        &amp;lt;/div&amp;gt;&lt;br /&gt;
    &amp;lt;/div&amp;gt;&lt;br /&gt;
&amp;lt;/div&amp;gt;&lt;br /&gt;
&amp;lt;div style=&amp;quot;clear: both;&amp;quot;&amp;gt;&amp;lt;/div&amp;gt;&lt;br /&gt;
 &lt;br /&gt;
&amp;lt;div id=&amp;quot;ggw_row2&amp;quot; style=&amp;quot;display: flex; flex-flow: row wrap; justify-content: space-between; padding: 0; margin: 0 -5px 0 -5px;&amp;quot;&amp;gt;&lt;br /&gt;
    &amp;lt;div style=&amp;quot;flex: 1; margin: 5px; min-width: 210px; border: 1px solid #CCC;	padding: 0 10px 10px 10px; box-shadow: 0 2px 2px rgba(0,0,0,0.1); background: #f5faff;&amp;quot;&amp;gt;&lt;br /&gt;
        &amp;lt;h3&amp;gt;[https://hive.biochemistry.gwu.edu/dna.cgi?cmd=main The High-performance Integrated Virtual Environment (HIVE) platform]&amp;lt;/h3&amp;gt;&lt;br /&gt;
        &amp;lt;div style=&amp;quot;border-top: 1px solid #CCC; padding-top: 0.5em;&amp;quot;&amp;gt;&lt;br /&gt;
HIVE is a cloud-based environment optimized for the storage and analysis of extra-large data, such as biomedical data, clinical data, next-generation sequencing (NGS) data, mass spectrometry files, confocal microscopy images, post-market surveillance data, medical recall data, and many others. HIVE provides secure web access for authorized users to deposit, retrieve, annotate and compute on Big Data, and analyze the outcomes using web user interfaces. [https://docs.google.com/document/d/1F5iq00uKkJfdSsbwanvKOy-nPnwijH56mwbwa_HhzfY/edit?tab=t.0#heading=h.7dlfmngwfzih More here].&lt;br /&gt;
        &amp;lt;/div&amp;gt;&lt;br /&gt;
    &amp;lt;/div&amp;gt;&lt;br /&gt;
	&lt;br /&gt;
    &amp;lt;div style=&amp;quot;flex: 1; margin: 5px; min-width: 210px; border: 1px solid #CCC;	padding: 0 10px 10px 10px; box-shadow: 0 2px 2px rgba(0,0,0,0.1); background: #f5faff;&amp;quot;&amp;gt;&lt;br /&gt;
        &amp;lt;h3&amp;gt;[https://www.biocomputeobject.org/ BioCompute Objects (BCO)]&amp;lt;/h3&amp;gt;&lt;br /&gt;
        &amp;lt;div style=&amp;quot;border-top: 1px solid #CCC; padding-top: 0.5em;&amp;quot;&amp;gt;&lt;br /&gt;
BioCompute is an FDA-funded project to establish a framework for community-based development of standards for harmonization of High-throughput Sequencing (HTS), standardization of data formats, promotion of interoperability, and bioinformatics verification protocols. The BioCompute Object (BCO) was developed under the High-throughput Sequencing Computational Standards for Regulatory Sciences (HTS-CSRS) initiative in the BioCompute Objects Portal (BOP), a web portal that serves as a collaborative space for dialogue on interoperability between different bioinformatics pipelines, industries, and developers. HIVE capabilities have been leveraged to support the development of the BCO. The BCO is versatile and adaptable to other common HTS analysis platforms. [https://docs.google.com/document/d/1WQFZm_PFiQXob4NyOKq6y-2ywnbmNoFHSS27fYf3l4Y/edit?tab=t.0#heading=h.bs8eki17tykx More here].&lt;br /&gt;
&amp;lt;/div&amp;gt;&lt;br /&gt;
    &amp;lt;/div&amp;gt;&lt;br /&gt;
    &amp;lt;div style=&amp;quot;flex: 1; margin: 5px; min-width: 210px; border: 1px solid #CCC;	padding: 0 10px 10px 10px; box-shadow: 0 2px 2px rgba(0,0,0,0.1); background: #f5faff;&amp;quot;&amp;gt;&lt;br /&gt;
        &amp;lt;h3&amp;gt;[https://www.glygen.org/ GlyGen]&amp;lt;/h3&amp;gt;&lt;br /&gt;
	&amp;lt;div style=&amp;quot;border-top: 1px solid #CCC; padding-top: 0.5em;&amp;quot;&amp;gt;&lt;br /&gt;
[https://www.glygen.org/ GlyGen] (gly: glycobiology; gen: information) is an advanced glycoinformatics resource developed to facilitate discovery in basic and translational glycobiology research and to enhance the integration of multidisciplinary information from diverse resources. GlyGen includes knowledge about the molecular, biophysical, and functional properties of glycans, genes, and proteins organized in pathways and ontologies, plus a rapidly growing body of biological big data related to cancer mutation and expression. GlyGen adopts an innovative user-driven approach to prioritizing and implementing tools and disseminating knowledge, in order to address the questions and needs of the glycobiology community. GlyGen is funded by the National Institute of General Medical Sciences under grant 1R24GM146616-01 and by the National Institutes of Health Office of Strategic Coordination - The Common Fund under grant 1OT2OD032092. [https://www.glygen.org/about/ More information about GlyGen].&lt;br /&gt;
        &amp;lt;/div&amp;gt;&lt;br /&gt;
    &amp;lt;/div&amp;gt;&lt;br /&gt;
&amp;lt;/div&amp;gt;&amp;lt;div id=&amp;quot;ggw_row3&amp;quot; style=&amp;quot;display: flex; flex-flow: row wrap; justify-content: space-between; padding: 0; margin: 0 -5px 0 -5px;&amp;quot;&amp;gt;&lt;br /&gt;
    &amp;lt;div style=&amp;quot;flex: 1; margin: 5px; min-width: 210px; border: 1px solid #CCC;	padding: 0 10px 10px 10px; box-shadow: 0 2px 2px rgba(0,0,0,0.1); background: #f5faff;&amp;quot;&amp;gt;&lt;br /&gt;
        &amp;lt;h3&amp;gt;[https://hivelab.biochemistry.gwu.edu/predictmod PredictMod]&amp;lt;/h3&amp;gt;&lt;br /&gt;
        &amp;lt;div style=&amp;quot;border-top: 1px solid #CCC; padding-top: 0.5em;&amp;quot;&amp;gt;&lt;br /&gt;
PredictMod is an application designed to predict the outcome of an intervention before a patient initiates treatment. Our goal is to provide clinicians with a powerful decision-making tool that enhances clinical understanding of patient-level data. The PredictMod platform uses machine learning tools and complex datasets based on electronic health records, gut microbiome, and -omics data to forecast patient outcomes, often in response to treatment for a particular condition. While our primary condition of interest is prediabetes, the tool is designed to be used for a variety of conditions, interventions, and data types.&lt;br /&gt;
&amp;lt;/div&amp;gt;&lt;br /&gt;
    &amp;lt;/div&amp;gt;&lt;br /&gt;
    &amp;lt;div style=&amp;quot;flex: 1; margin: 5px; min-width: 210px; border: 1px solid #CCC;	padding: 0 10px 10px 10px; box-shadow: 0 2px 2px rgba(0,0,0,0.1); background: #f5faff;&amp;quot;&amp;gt;&lt;br /&gt;
        &amp;lt;h3&amp;gt;[[GW-FEAST]]&amp;lt;/h3&amp;gt;&lt;br /&gt;
        &amp;lt;div style=&amp;quot;border-top: 1px solid #CCC; padding-top: 0.5em;&amp;quot;&amp;gt;&lt;br /&gt;
The GW Federated Ecosystems for Analytics and Standardized Technologies (GW-FEAST) project is part of the ARPA-H FEAST performer team initiative that includes academic and industry partners. The goal of the ARPA-H performer teams is “to create bridges across data silos to make health data more accessible and usable”.&lt;br /&gt;
&amp;lt;/div&amp;gt;&lt;br /&gt;
    &amp;lt;/div&amp;gt;&lt;br /&gt;
    &amp;lt;div style=&amp;quot;flex: 1; margin: 5px; min-width: 210px; border: 1px solid #CCC;	padding: 0 10px 10px 10px; box-shadow: 0 2px 2px rgba(0,0,0,0.1); background: #f5faff;&amp;quot;&amp;gt;&lt;br /&gt;
        &amp;lt;h3&amp;gt;[https://hivelab.tst.biochemistry.gwu.edu/biomarker-partnership Biomarker Partnership]&amp;lt;/h3&amp;gt;&lt;br /&gt;
        &amp;lt;div style=&amp;quot;border-top: 1px solid #CCC; padding-top: 0.5em;&amp;quot;&amp;gt;&lt;br /&gt;
The Biomarker Partnership is a project sponsored by the Common Fund Data Ecosystem (CFDE) to develop a knowledgebase that organizes and integrates biomarker data from different public sources. The data will be connected to contextual information to provide a novel systems-level view of biomarkers. The motivation for this project is to improve the harmonization and organization of biomarker data by mapping biomarkers from public sources to, and across, Common Fund (CF) data elements. This mapping will bridge knowledge across multiple Data Coordinating Centers (DCCs) and biomedical disciplines.&lt;br /&gt;
&amp;lt;/div&amp;gt;&lt;br /&gt;
    &amp;lt;/div&amp;gt;&lt;br /&gt;
&amp;lt;/div&amp;gt;&lt;br /&gt;
&amp;lt;!-- BANNER ACROSS TOP OF PAGE --&amp;gt;&lt;br /&gt;
&amp;lt;div id=&amp;quot;ggw-topbanner&amp;quot; style=&amp;quot;clear:both; position:relative; box-sizing:border-box; width:100%; margin:1.2em 0 6px; min-width:47em; border:1px solid #ddd; background-color:#f9f9f9; color:#000;&amp;quot;&amp;gt;&lt;br /&gt;
    &amp;lt;div style=&amp;quot;margin:0.4em; text-align:center;&amp;quot;&amp;gt;&lt;br /&gt;
        &amp;lt;div style=&amp;quot;font-size:160%; padding:.1em;&amp;quot;&amp;gt;Past Projects&amp;lt;/div&amp;gt;&lt;br /&gt;
        &amp;lt;/div&amp;gt;&lt;br /&gt;
    &amp;lt;/div&amp;gt;&lt;br /&gt;
&amp;lt;/div&amp;gt;&lt;br /&gt;
&amp;lt;div style=&amp;quot;clear: both;&amp;quot;&amp;gt;&amp;lt;/div&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;div id=&amp;quot;ggw_row2&amp;quot; style=&amp;quot;display: flex; flex-flow: row wrap; justify-content: space-between; padding: 0; margin: 0 -5px 0 -5px;&amp;quot;&amp;gt;&lt;br /&gt;
    &amp;lt;div style=&amp;quot;flex: 1; margin: 5px; min-width: 210px; border: 1px solid #CCC;	padding: 0 10px 10px 10px; box-shadow: 0 2px 2px rgba(0,0,0,0.1); background: #f5faff;&amp;quot;&amp;gt;&lt;br /&gt;
        &amp;lt;h3&amp;gt;[https://hivelab.tst.biochemistry.gwu.edu/gfkb Gut Microbiome Analytic System (Microbiome)]&amp;lt;/h3&amp;gt;&lt;br /&gt;
        &amp;lt;div style=&amp;quot;border-top: 1px solid #CCC; padding-top: 0.5em;&amp;quot;&amp;gt;&lt;br /&gt;
The HIVE team received NSF funding to develop a Gut Microbiome Monitoring System (GutFeeling), a tool that, when used over time, will allow users to adjust their dietary habits (such as consumption of probiotics and prebiotics) and other lifestyle habits and help restore their normal microbiome. Rapid analysis of the large amount of metagenomic data, a major bottleneck, has been resolved by our group through the development of a novel algorithm and accompanying software called CensuScope. Through analysis of healthy gut microbiome data, we are actively developing a Knowledge Base (GutFeelingKB) not only to provide a clearer picture of an ideal personalized microbiome but also to establish baseline characteristics for each customer. The Mazumder Lab is collaborating with the Milken School of Public Health and Kamtek Sequencing Facility to investigate the relationship between bacterial species commonly present in the digestive tract, diet, physical activity, lifestyle habits, and metabolic risk factors. [https://docs.google.com/document/d/18WyVTJrrf-FR0sHt634vO8Lwel-4OQxP9sNar7gYYro/edit?tab=t.0#heading=h.7qbm3f7lky31 More here].&lt;br /&gt;
        &amp;lt;/div&amp;gt;&lt;br /&gt;
    &amp;lt;/div&amp;gt;&lt;br /&gt;
	&lt;br /&gt;
    &amp;lt;div style=&amp;quot;flex: 1; margin: 5px; min-width: 210px; border: 1px solid #CCC;	padding: 0 10px 10px 10px; box-shadow: 0 2px 2px rgba(0,0,0,0.1); background: #f5faff;&amp;quot;&amp;gt;&lt;br /&gt;
        &amp;lt;h3&amp;gt;HIVE-EQAPOL Project on HIVE NGS Data Processing and Analysis&amp;lt;/h3&amp;gt;&lt;br /&gt;
        &amp;lt;div style=&amp;quot;border-top: 1px solid #CCC; padding-top: 0.5em;&amp;quot;&amp;gt;&lt;br /&gt;
For this project, our group works closely with the External Quality Assurance Program Oversight Laboratory (EQAPOL) team to analyze, store, and track HIV NGS data. Reliable identification of strains is critical for developing new assays, validating assay platforms, helping regulators evaluate test kits, monitoring HIV drug resistance, and informing vaccine development. The HIVE tools and platform are used for virus identification, recombination analysis, and clone discovery.&lt;br /&gt;
&amp;lt;/div&amp;gt;&lt;br /&gt;
    &amp;lt;/div&amp;gt;&lt;br /&gt;
    &amp;lt;div style=&amp;quot;flex: 1; margin: 5px; min-width: 210px; border: 1px solid #CCC;	padding: 0 10px 10px 10px; box-shadow: 0 2px 2px rgba(0,0,0,0.1); background: #f5faff;&amp;quot;&amp;gt;&lt;br /&gt;
        &amp;lt;h3&amp;gt;[https://www.oncomx.org/ OncoMX]&amp;lt;/h3&amp;gt;&lt;br /&gt;
	&amp;lt;div style=&amp;quot;border-top: 1px solid #CCC; padding-top: 0.5em;&amp;quot;&amp;gt;&lt;br /&gt;
The OncoMX mission is to create an integrated cancer mutation and expression resource for exploring cancer biomarkers. OncoMX is a collaboration between the George Washington University (GW), NASA&#039;s Jet Propulsion Laboratory (JPL), the Swiss Institute of Bioinformatics (SIB), and the University of Delaware (UD). The core knowledgebase of OncoMX is derived from BioMuta and BioXpress integrated cancer mutation and expression databases. Normal expression data from Bgee and custom text mining software augment the cancer data to improve functional interpretation of the reported variants and expression profiles. All data are wrapped into the OncoMX database and web portal, mapped to additional functional information from NCI Early Detection Research Network (EDRN) and Reactome. It is expected that the large-scale integration of cancer data and supporting information, provided by OncoMX with direct community feedback, will benefit cancer research by improving synthesis of information and may make earlier detection a reality.&lt;br /&gt;
        &amp;lt;/div&amp;gt;&lt;br /&gt;
    &amp;lt;/div&amp;gt;&lt;br /&gt;
&amp;lt;div style=&amp;quot;flex: 1; margin: 5px; min-width: 210px; border: 1px solid #CCC;	padding: 0 10px 10px 10px; box-shadow: 0 2px 2px rgba(0,0,0,0.1); background: #f5faff;&amp;quot;&amp;gt;&lt;br /&gt;
        &amp;lt;h3&amp;gt;[https://hive.biochemistry.gwu.edu/dna.cgi?cmd=main Glycoproteomics Characterization Workflow and Data-Analysis Pipeline for Vaccines and Biosimilars]&amp;lt;/h3&amp;gt;&lt;br /&gt;
	&amp;lt;div style=&amp;quot;border-top: 1px solid #CCC; padding-top: 0.5em;&amp;quot;&amp;gt;&lt;br /&gt;
In this FDA-funded project, we are extending High-performance Integrated Virtual Environment (HIVE) capabilities through the development and integration of software tools and datasets for comparative analysis of glycoproteins. Glycomic analysis has many angles and has been extensively reviewed in recent literature. We propose to rely on the independent development of the glycomics field and incorporate these approaches into the HIVE pipeline as they mature, while we develop a standardized glycoinformatics pipeline that will benefit investigators and regulators at the FDA.&lt;br /&gt;
        &amp;lt;/div&amp;gt;&lt;br /&gt;
    &amp;lt;/div&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/div&amp;gt;&lt;/div&gt;</summary>
		<author><name>Jeetvora</name></author>
	</entry>
	<entry>
		<id>https://hivelab.biochemistry.gwu.edu/wiki/index.php?title=Projects&amp;diff=219</id>
		<title>Projects</title>
		<link rel="alternate" type="text/html" href="https://hivelab.biochemistry.gwu.edu/wiki/index.php?title=Projects&amp;diff=219"/>
		<updated>2024-12-10T21:18:41Z</updated>

		<summary type="html">&lt;p&gt;Jeetvora: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;{{DISPLAYTITLE:&amp;lt;span style=&amp;quot;position: absolute; clip: rect(1px 1px 1px 1px); clip: rect(1px, 1px, 1px, 1px);&amp;quot;&amp;gt;{{FULLPAGENAME}}&amp;lt;/span&amp;gt;}}&lt;br /&gt;
__NOTOC__&lt;br /&gt;
&amp;lt;!-- BANNER ACROSS TOP OF PAGE --&amp;gt;&lt;br /&gt;
&amp;lt;div id=&amp;quot;ggw-topbanner&amp;quot; style=&amp;quot;clear:both; position:relative; box-sizing:border-box; width:100%; margin:1.2em 0 6px; min-width:47em; border:1px solid #ddd; background-color:#f9f9f9; color:#000;&amp;quot;&amp;gt;&lt;br /&gt;
    &amp;lt;div style=&amp;quot;margin:0.4em; text-align:center;&amp;quot;&amp;gt;&lt;br /&gt;
        &amp;lt;div style=&amp;quot;font-size:160%; padding:.1em;&amp;quot;&amp;gt;Current Projects&amp;lt;/div&amp;gt;&lt;br /&gt;
        &amp;lt;/div&amp;gt;&lt;br /&gt;
    &amp;lt;/div&amp;gt;&lt;br /&gt;
&amp;lt;/div&amp;gt;&lt;br /&gt;
&amp;lt;div style=&amp;quot;clear: both;&amp;quot;&amp;gt;&amp;lt;/div&amp;gt;&lt;br /&gt;
 &lt;br /&gt;
&amp;lt;div id=&amp;quot;ggw_row2&amp;quot; style=&amp;quot;display: flex; flex-flow: row wrap; justify-content: space-between; padding: 0; margin: 0 -5px 0 -5px;&amp;quot;&amp;gt;&lt;br /&gt;
    &amp;lt;div style=&amp;quot;flex: 1; margin: 5px; min-width: 210px; border: 1px solid #CCC;	padding: 0 10px 10px 10px; box-shadow: 0 2px 2px rgba(0,0,0,0.1); background: #f5faff;&amp;quot;&amp;gt;&lt;br /&gt;
        &amp;lt;h3&amp;gt;[https://hive.biochemistry.gwu.edu/dna.cgi?cmd=main The High-performance Integrated Virtual Environment (HIVE) platform]&amp;lt;/h3&amp;gt;&lt;br /&gt;
        &amp;lt;div style=&amp;quot;border-top: 1px solid #CCC; padding-top: 0.5em;&amp;quot;&amp;gt;&lt;br /&gt;
HIVE is a cloud-based environment optimized for the storage and analysis of extra-large data, such as biomedical data, clinical data, next-generation sequencing (NGS) data, mass spectrometry files, confocal microscopy images, post-market surveillance data, medical recall data, and many others. HIVE provides secure web access for authorized users to deposit, retrieve, annotate and compute on Big Data, and analyze the outcomes using web user interfaces. [https://docs.google.com/document/d/1F5iq00uKkJfdSsbwanvKOy-nPnwijH56mwbwa_HhzfY/edit?tab=t.0#heading=h.7dlfmngwfzih More here].&lt;br /&gt;
        &amp;lt;/div&amp;gt;&lt;br /&gt;
    &amp;lt;/div&amp;gt;&lt;br /&gt;
	&lt;br /&gt;
    &amp;lt;div style=&amp;quot;flex: 1; margin: 5px; min-width: 210px; border: 1px solid #CCC;	padding: 0 10px 10px 10px; box-shadow: 0 2px 2px rgba(0,0,0,0.1); background: #f5faff;&amp;quot;&amp;gt;&lt;br /&gt;
        &amp;lt;h3&amp;gt;[https://www.biocomputeobject.org/ BioCompute Objects (BCO)]&amp;lt;/h3&amp;gt;&lt;br /&gt;
        &amp;lt;div style=&amp;quot;border-top: 1px solid #CCC; padding-top: 0.5em;&amp;quot;&amp;gt;&lt;br /&gt;
BioCompute is an FDA-funded project to establish a framework for community-based development of standards for harmonization of High-throughput Sequencing (HTS), standardization of data formats, promotion of interoperability, and bioinformatics verification protocols. The BioCompute Object (BCO) was developed under the High-throughput Sequencing Computational Standards for Regulatory Sciences (HTS-CSRS) initiative in the BioCompute Objects Portal (BOP), a web portal that serves as a collaborative space for dialogue on interoperability between different bioinformatics pipelines, industries, and developers. HIVE capabilities have been leveraged to support the development of the BCO. The BCO is versatile and adaptable to other common HTS analysis platforms. [https://docs.google.com/document/d/1WQFZm_PFiQXob4NyOKq6y-2ywnbmNoFHSS27fYf3l4Y/edit?tab=t.0#heading=h.bs8eki17tykx More here].&lt;br /&gt;
&amp;lt;/div&amp;gt;&lt;br /&gt;
    &amp;lt;/div&amp;gt;&lt;br /&gt;
    &amp;lt;div style=&amp;quot;flex: 1; margin: 5px; min-width: 210px; border: 1px solid #CCC;	padding: 0 10px 10px 10px; box-shadow: 0 2px 2px rgba(0,0,0,0.1); background: #f5faff;&amp;quot;&amp;gt;&lt;br /&gt;
        &amp;lt;h3&amp;gt;[https://www.glygen.org/ GlyGen]&amp;lt;/h3&amp;gt;&lt;br /&gt;
	&amp;lt;div style=&amp;quot;border-top: 1px solid #CCC; padding-top: 0.5em;&amp;quot;&amp;gt;&lt;br /&gt;
[https://www.glygen.org/ GlyGen] (gly: glycobiology; gen: information) is an advanced glycoinformatics resource developed to facilitate discovery in basic and translational glycobiology research and to enhance the integration of multidisciplinary information from diverse resources. GlyGen includes knowledge about the molecular, biophysical, and functional properties of glycans, genes, and proteins organized in pathways and ontologies, plus a rapidly growing body of biological big data related to cancer mutation and expression. GlyGen adopts an innovative user-driven approach to prioritizing and implementing tools and disseminating knowledge, in order to address the questions and needs of the glycobiology community. GlyGen is funded by the National Institute of General Medical Sciences under grant 1R24GM146616-01 and by the National Institutes of Health Office of Strategic Coordination - The Common Fund under grant 1OT2OD032092. [https://www.glygen.org/about/ More information about GlyGen].&lt;br /&gt;
        &amp;lt;/div&amp;gt;&lt;br /&gt;
    &amp;lt;/div&amp;gt;&lt;br /&gt;
&amp;lt;/div&amp;gt;&amp;lt;div id=&amp;quot;ggw_row3&amp;quot; style=&amp;quot;display: flex; flex-flow: row wrap; justify-content: space-between; padding: 0; margin: 0 -5px 0 -5px;&amp;quot;&amp;gt;&lt;br /&gt;
 &amp;lt;/div&amp;gt;&lt;br /&gt;
&amp;lt;/div&amp;gt;&amp;lt;div id=&amp;quot;ggw_row3&amp;quot; style=&amp;quot;display: flex; flex-flow: row wrap; justify-content: space-between; padding: 0; margin: 0 -5px 0 -5px;&amp;quot;&amp;gt;&lt;br /&gt;
    &amp;lt;div style=&amp;quot;flex: 1; margin: 5px; min-width: 210px; border: 1px solid #CCC;	padding: 0 10px 10px 10px; box-shadow: 0 2px 2px rgba(0,0,0,0.1); background: #f5faff;&amp;quot;&amp;gt;&lt;br /&gt;
        &amp;lt;h3&amp;gt;[https://hivelab.biochemistry.gwu.edu/predictmod PredictMod]&amp;lt;/h3&amp;gt;&lt;br /&gt;
        &amp;lt;div style=&amp;quot;border-top: 1px solid #CCC; padding-top: 0.5em;&amp;quot;&amp;gt;&lt;br /&gt;
PredictMod is an application designed to predict the outcome of an intervention before a patient initiates treatment. Our goal is to provide clinicians with a powerful decision-making tool that enhances clinical understanding of patient-level data. The PredictMod platform uses machine learning tools and complex datasets based on electronic health records, gut microbiome, and -omics data to forecast patient outcomes, often in response to treatment for a particular condition. While our primary condition of interest is prediabetes, the tool is designed to be used for a variety of conditions, interventions, and data types.&lt;br /&gt;
&amp;lt;/div&amp;gt;&lt;br /&gt;
    &amp;lt;/div&amp;gt;&lt;br /&gt;
    &amp;lt;div style=&amp;quot;flex: 1; margin: 5px; min-width: 210px; border: 1px solid #CCC;	padding: 0 10px 10px 10px; box-shadow: 0 2px 2px rgba(0,0,0,0.1); background: #f5faff;&amp;quot;&amp;gt;&lt;br /&gt;
        &amp;lt;h3&amp;gt;[[GW-FEAST]]&amp;lt;/h3&amp;gt;&lt;br /&gt;
        &amp;lt;div style=&amp;quot;border-top: 1px solid #CCC; padding-top: 0.5em;&amp;quot;&amp;gt;&lt;br /&gt;
The GW Federated Ecosystems for Analytics and Standardized Technologies (GW-FEAST) project is part of the ARPA-H FEAST performer team initiative that includes academic and industry partners. The goal of the ARPA-H performer teams is “to create bridges across data silos to make health data more accessible and usable”.&lt;br /&gt;
&amp;lt;/div&amp;gt;&lt;br /&gt;
    &amp;lt;/div&amp;gt;&lt;br /&gt;
    &amp;lt;div style=&amp;quot;flex: 1; margin: 5px; min-width: 210px; border: 1px solid #CCC;	padding: 0 10px 10px 10px; box-shadow: 0 2px 2px rgba(0,0,0,0.1); background: #f5faff;&amp;quot;&amp;gt;&lt;br /&gt;
        &amp;lt;h3&amp;gt;[https://hivelab.tst.biochemistry.gwu.edu/biomarker-partnership Biomarker Partnership]&amp;lt;/h3&amp;gt;&lt;br /&gt;
        &amp;lt;div style=&amp;quot;border-top: 1px solid #CCC; padding-top: 0.5em;&amp;quot;&amp;gt;&lt;br /&gt;
The Biomarker Partnership is a Common Fund Data Ecosystem (CFDE)-sponsored project to develop a knowledgebase that will organize and integrate biomarker data from different public sources. The data will be connected to contextual information to provide a novel systems-level view of biomarkers. The motivation for this project is to improve the harmonization and organization of biomarker data by mapping biomarkers from public sources to, and across, Common Fund (CF) data elements. This mapping will bridge knowledge across multiple Data Coordinating Centers (DCCs) and biomedical disciplines.&lt;br /&gt;
&amp;lt;/div&amp;gt;&lt;br /&gt;
    &amp;lt;/div&amp;gt;&lt;br /&gt;
&amp;lt;/div&amp;gt;&lt;br /&gt;
{{DISPLAYTITLE:&amp;lt;span style=&amp;quot;position: absolute; clip: rect(1px 1px 1px 1px); clip: rect(1px, 1px, 1px, 1px);&amp;quot;&amp;gt;{{FULLPAGENAME}}&amp;lt;/span&amp;gt;}}&lt;br /&gt;
__NOTOC__&lt;br /&gt;
&amp;lt;!-- BANNER ACROSS TOP OF PAGE --&amp;gt;&lt;br /&gt;
&amp;lt;div id=&amp;quot;ggw-topbanner&amp;quot; style=&amp;quot;clear:both; position:relative; box-sizing:border-box; width:100%; margin:1.2em 0 6px; min-width:47em; border:1px solid #ddd; background-color:#f9f9f9; color:#000;&amp;quot;&amp;gt;&lt;br /&gt;
    &amp;lt;div style=&amp;quot;margin:0.4em; text-align:center;&amp;quot;&amp;gt;&lt;br /&gt;
        &amp;lt;div style=&amp;quot;font-size:160%; padding:.1em;&amp;quot;&amp;gt;Past Projects&amp;lt;/div&amp;gt;&lt;br /&gt;
        &amp;lt;/div&amp;gt;&lt;br /&gt;
    &amp;lt;/div&amp;gt;&lt;br /&gt;
&amp;lt;/div&amp;gt;&lt;br /&gt;
&amp;lt;div style=&amp;quot;clear: both;&amp;quot;&amp;gt;&amp;lt;/div&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;div id=&amp;quot;ggw_row2&amp;quot; style=&amp;quot;display: flex; flex-flow: row wrap; justify-content: space-between; padding: 0; margin: 0 -5px 0 -5px;&amp;quot;&amp;gt;&lt;br /&gt;
    &amp;lt;div style=&amp;quot;flex: 1; margin: 5px; min-width: 210px; border: 1px solid #CCC;	padding: 0 10px 10px 10px; box-shadow: 0 2px 2px rgba(0,0,0,0.1); background: #f5faff;&amp;quot;&amp;gt;&lt;br /&gt;
        &amp;lt;h3&amp;gt;[https://hivelab.tst.biochemistry.gwu.edu/gfkb Gut Microbiome Analytic System (Microbiome)]&amp;lt;/h3&amp;gt;&lt;br /&gt;
        &amp;lt;div style=&amp;quot;border-top: 1px solid #CCC; padding-top: 0.5em;&amp;quot;&amp;gt;&lt;br /&gt;
The HIVE team received NSF funding to develop a Gut Microbiome Monitoring System (GutFeeling), a tool that, when used over time, will allow users to adjust their dietary habits (such as consumption of probiotics and prebiotics) and other lifestyle habits to help restore their normal microbiome. Rapid analysis of large amounts of metagenomic data, a major bottleneck, has been made possible by our group through the development of a novel algorithm and accompanying software called CensuScope. Through analysis of healthy gut microbiome data, we are actively developing a Knowledge Base (GutFeelingKB) that will not only provide a clearer picture of an ideal personalized microbiome but also establish baseline characteristics for each customer. The Mazumder Lab is collaborating with the Milken School of Public Health and Kamtek Sequencing Facility to investigate the relationship between bacterial species commonly present in the digestive tract, diet, physical activity, lifestyle habits, and metabolic risk factors. [https://docs.google.com/document/d/18WyVTJrrf-FR0sHt634vO8Lwel-4OQxP9sNar7gYYro/edit?tab=t.0#heading=h.7qbm3f7lky31 More here].&lt;br /&gt;
        &amp;lt;/div&amp;gt;&lt;br /&gt;
    &amp;lt;/div&amp;gt;&lt;br /&gt;
	&lt;br /&gt;
    &amp;lt;div style=&amp;quot;flex: 1; margin: 5px; min-width: 210px; border: 1px solid #CCC;	padding: 0 10px 10px 10px; box-shadow: 0 2px 2px rgba(0,0,0,0.1); background: #f5faff;&amp;quot;&amp;gt;&lt;br /&gt;
        &amp;lt;h3&amp;gt;HIVE-EQAPOL Project on HIV NGS Data Processing and Analysis&amp;lt;/h3&amp;gt;&lt;br /&gt;
        &amp;lt;div style=&amp;quot;border-top: 1px solid #CCC; padding-top: 0.5em;&amp;quot;&amp;gt;&lt;br /&gt;
For this project, our group works closely with the External Quality Assurance Program Oversight Laboratory (EQAPOL) team to analyze, store, and track HIV NGS data. Reliable identification of strains is critical for developing new assays, validating assay platforms, helping regulators evaluate test kits, monitoring HIV drug resistance, and informing vaccine development. The HIVE tools and platform are used for virus identification, recombination analysis, and clone discovery.&lt;br /&gt;
&amp;lt;/div&amp;gt;&lt;br /&gt;
    &amp;lt;/div&amp;gt;&lt;br /&gt;
    &amp;lt;div style=&amp;quot;flex: 1; margin: 5px; min-width: 210px; border: 1px solid #CCC;	padding: 0 10px 10px 10px; box-shadow: 0 2px 2px rgba(0,0,0,0.1); background: #f5faff;&amp;quot;&amp;gt;&lt;br /&gt;
        &amp;lt;h3&amp;gt;[https://www.oncomx.org/ OncoMX]&amp;lt;/h3&amp;gt;&lt;br /&gt;
	&amp;lt;div style=&amp;quot;border-top: 1px solid #CCC; padding-top: 0.5em;&amp;quot;&amp;gt;&lt;br /&gt;
The OncoMX mission is to create an integrated cancer mutation and expression resource for exploring cancer biomarkers. OncoMX is a collaboration between the George Washington University (GW), NASA&#039;s Jet Propulsion Laboratory (JPL), the Swiss Institute of Bioinformatics (SIB), and the University of Delaware (UD). The core knowledgebase of OncoMX is derived from BioMuta and BioXpress integrated cancer mutation and expression databases. Normal expression data from Bgee and custom text mining software augment the cancer data to improve functional interpretation of the reported variants and expression profiles. All data are wrapped into the OncoMX database and web portal, mapped to additional functional information from NCI Early Detection Research Network (EDRN) and Reactome. It is expected that the large-scale integration of cancer data and supporting information, provided by OncoMX with direct community feedback, will benefit cancer research by improving synthesis of information and may make earlier detection a reality.&lt;br /&gt;
        &amp;lt;/div&amp;gt;&lt;br /&gt;
    &amp;lt;/div&amp;gt;&lt;br /&gt;
&amp;lt;div style=&amp;quot;flex: 1; margin: 5px; min-width: 210px; border: 1px solid #CCC;	padding: 0 10px 10px 10px; box-shadow: 0 2px 2px rgba(0,0,0,0.1); background: #f5faff;&amp;quot;&amp;gt;&lt;br /&gt;
        &amp;lt;h3&amp;gt;[https://hive.biochemistry.gwu.edu/dna.cgi?cmd=main Glycoproteomics Characterization Workflow and Data-Analysis Pipeline for Vaccines and Biosimilars]&amp;lt;/h3&amp;gt;&lt;br /&gt;
	&amp;lt;div style=&amp;quot;border-top: 1px solid #CCC; padding-top: 0.5em;&amp;quot;&amp;gt;&lt;br /&gt;
In this FDA-funded project, we are extending High-performance Integrated Virtual Environment (HIVE) capabilities through the development and integration of software tools and datasets for comparative analysis of glycoproteins. Glycomic analysis has many angles and has been extensively reviewed in recent literature. We propose to rely on the independent development of the glycomics field and incorporate these approaches into the HIVE pipeline as they mature, while we develop a standardized glycoinformatics pipeline that will benefit investigators and regulators at the FDA.&lt;br /&gt;
        &amp;lt;/div&amp;gt;&lt;br /&gt;
    &amp;lt;/div&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/div&amp;gt;&lt;/div&gt;</summary>
		<author><name>Jeetvora</name></author>
	</entry>
	<entry>
		<id>https://hivelab.biochemistry.gwu.edu/wiki/index.php?title=Projects&amp;diff=218</id>
		<title>Projects</title>
		<link rel="alternate" type="text/html" href="https://hivelab.biochemistry.gwu.edu/wiki/index.php?title=Projects&amp;diff=218"/>
		<updated>2024-12-10T21:17:26Z</updated>

		<summary type="html">&lt;p&gt;Jeetvora: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;{{DISPLAYTITLE:&amp;lt;span style=&amp;quot;position: absolute; clip: rect(1px 1px 1px 1px); clip: rect(1px, 1px, 1px, 1px);&amp;quot;&amp;gt;{{FULLPAGENAME}}&amp;lt;/span&amp;gt;}}&lt;br /&gt;
__NOTOC__&lt;br /&gt;
&amp;lt;!-- BANNER ACROSS TOP OF PAGE --&amp;gt;&lt;br /&gt;
&amp;lt;div id=&amp;quot;ggw-topbanner&amp;quot; style=&amp;quot;clear:both; position:relative; box-sizing:border-box; width:100%; margin:1.2em 0 6px; min-width:47em; border:1px solid #ddd; background-color:#f9f9f9; color:#000;&amp;quot;&amp;gt;&lt;br /&gt;
    &amp;lt;div style=&amp;quot;margin:0.4em; text-align:center;&amp;quot;&amp;gt;&lt;br /&gt;
        &amp;lt;div style=&amp;quot;font-size:160%; padding:.1em;&amp;quot;&amp;gt;Current Projects&amp;lt;/div&amp;gt;&lt;br /&gt;
        &amp;lt;/div&amp;gt;&lt;br /&gt;
    &amp;lt;/div&amp;gt;&lt;br /&gt;
&amp;lt;/div&amp;gt;&lt;br /&gt;
&amp;lt;div style=&amp;quot;clear: both;&amp;quot;&amp;gt;&amp;lt;/div&amp;gt;&lt;br /&gt;
 &lt;br /&gt;
&amp;lt;div id=&amp;quot;ggw_row2&amp;quot; style=&amp;quot;display: flex; flex-flow: row wrap; justify-content: space-between; padding: 0; margin: 0 -5px 0 -5px;&amp;quot;&amp;gt;&lt;br /&gt;
    &amp;lt;div style=&amp;quot;flex: 1; margin: 5px; min-width: 210px; border: 1px solid #CCC;	padding: 0 10px 10px 10px; box-shadow: 0 2px 2px rgba(0,0,0,0.1); background: #f5faff;&amp;quot;&amp;gt;&lt;br /&gt;
        &amp;lt;h3&amp;gt;[https://hive.biochemistry.gwu.edu/dna.cgi?cmd=main The High-performance Integrated Virtual Environment (HIVE) platform]&amp;lt;/h3&amp;gt;&lt;br /&gt;
        &amp;lt;div style=&amp;quot;border-top: 1px solid #CCC; padding-top: 0.5em;&amp;quot;&amp;gt;&lt;br /&gt;
HIVE is a cloud-based environment optimized for the storage and analysis of extra-large data, such as biomedical data, clinical data, next-generation sequencing (NGS) data, mass spectrometry files, confocal microscopy images, post-market surveillance data, medical recall data, and many others. HIVE provides secure web access for authorized users to deposit, retrieve, annotate and compute on Big Data, and analyze the outcomes using web user interfaces. [https://docs.google.com/document/d/1F5iq00uKkJfdSsbwanvKOy-nPnwijH56mwbwa_HhzfY/edit?tab=t.0#heading=h.7dlfmngwfzih More here].&lt;br /&gt;
        &amp;lt;/div&amp;gt;&lt;br /&gt;
    &amp;lt;/div&amp;gt;&lt;br /&gt;
	&lt;br /&gt;
    &amp;lt;div style=&amp;quot;flex: 1; margin: 5px; min-width: 210px; border: 1px solid #CCC;	padding: 0 10px 10px 10px; box-shadow: 0 2px 2px rgba(0,0,0,0.1); background: #f5faff;&amp;quot;&amp;gt;&lt;br /&gt;
        &amp;lt;h3&amp;gt;[https://www.biocomputeobject.org/ BioCompute Objects (BCO)]&amp;lt;/h3&amp;gt;&lt;br /&gt;
        &amp;lt;div style=&amp;quot;border-top: 1px solid #CCC; padding-top: 0.5em;&amp;quot;&amp;gt;&lt;br /&gt;
BioCompute is an FDA-funded project to establish a framework for community-based development of standards for harmonization of High-throughput Sequencing (HTS), standardization of data formats, promotion of interoperability, and bioinformatics verification protocols. The BioCompute Object (BCO) was developed under the High-throughput Sequencing Computational Standards for Regulatory Sciences (HTS-CSRS) initiative through the BioCompute Objects Portal (BOP), a web portal that serves as a collaborative space for dialogue on interoperability between different bioinformatics pipelines, industries, and developers. HIVE capabilities have been leveraged to support the development of the BCO. The BCO is versatile and adaptable to other common HTS analysis platforms. [https://docs.google.com/document/d/1WQFZm_PFiQXob4NyOKq6y-2ywnbmNoFHSS27fYf3l4Y/edit?tab=t.0#heading=h.bs8eki17tykx More here].&lt;br /&gt;
&amp;lt;/div&amp;gt;&lt;br /&gt;
    &amp;lt;/div&amp;gt;&lt;br /&gt;
    &amp;lt;div style=&amp;quot;flex: 1; margin: 5px; min-width: 210px; border: 1px solid #CCC;	padding: 0 10px 10px 10px; box-shadow: 0 2px 2px rgba(0,0,0,0.1); background: #f5faff;&amp;quot;&amp;gt;&lt;br /&gt;
        &amp;lt;h3&amp;gt;[https://www.glygen.org/ GlyGen]&amp;lt;/h3&amp;gt;&lt;br /&gt;
	&amp;lt;div style=&amp;quot;border-top: 1px solid #CCC; padding-top: 0.5em;&amp;quot;&amp;gt;&lt;br /&gt;
GlyGen (gly-glycobiology; gen-information) is an advanced glycoinformatics resource developed to facilitate discovery in basic and translational glycobiology research and to enhance the integration of multidisciplinary information from diverse resources. GlyGen includes knowledge about molecular, biophysical and functional properties of glycans, genes, and proteins organized in pathways and ontologies, plus a rapidly growing body of biological big data related to cancer mutation and expression. GlyGen adopts an innovative user-driven approach for implementing and prioritizing tools and disseminating knowledge to address the questions and needs of the glycobiology community. GlyGen is funded by the National Institute of General Medical Sciences under grant 1R24GM146616-01 and the National Institutes of Health Office of Strategic Coordination - The Common Fund under grant 1OT2OD032092. More information about GlyGen: https://www.glygen.org/about/&lt;br /&gt;
        &amp;lt;/div&amp;gt;&lt;br /&gt;
    &amp;lt;/div&amp;gt;&lt;br /&gt;
&amp;lt;/div&amp;gt;&amp;lt;div id=&amp;quot;ggw_row3&amp;quot; style=&amp;quot;display: flex; flex-flow: row wrap; justify-content: space-between; padding: 0; margin: 0 -5px 0 -5px;&amp;quot;&amp;gt;&lt;br /&gt;
    &amp;lt;div style=&amp;quot;flex: 1; margin: 5px; min-width: 210px; border: 1px solid #CCC;	padding: 0 10px 10px 10px; box-shadow: 0 2px 2px rgba(0,0,0,0.1); background: #f5faff;&amp;quot;&amp;gt;&lt;br /&gt;
        &amp;lt;h3&amp;gt;[https://hivelab.biochemistry.gwu.edu/predictmod PredictMod]&amp;lt;/h3&amp;gt;&lt;br /&gt;
        &amp;lt;div style=&amp;quot;border-top: 1px solid #CCC; padding-top: 0.5em;&amp;quot;&amp;gt;&lt;br /&gt;
PredictMod is an application designed to predict the outcome of an intervention before a patient initiates treatment. Our goal is to provide clinicians with a powerful decision-making tool that enhances clinical understanding of patient-level data. The PredictMod platform uses machine learning tools and complex datasets based on electronic health records, gut microbiome, and -omics data to forecast patient outcomes, often in response to treatment for a particular condition. While our primary condition of interest is prediabetes, the tool is designed to be used for a variety of conditions, interventions, and data types.&lt;br /&gt;
&amp;lt;/div&amp;gt;&lt;br /&gt;
    &amp;lt;/div&amp;gt;&lt;br /&gt;
    &amp;lt;div style=&amp;quot;flex: 1; margin: 5px; min-width: 210px; border: 1px solid #CCC;	padding: 0 10px 10px 10px; box-shadow: 0 2px 2px rgba(0,0,0,0.1); background: #f5faff;&amp;quot;&amp;gt;&lt;br /&gt;
        &amp;lt;h3&amp;gt;[[GW-FEAST]]&amp;lt;/h3&amp;gt;&lt;br /&gt;
        &amp;lt;div style=&amp;quot;border-top: 1px solid #CCC; padding-top: 0.5em;&amp;quot;&amp;gt;&lt;br /&gt;
The GW Federated Ecosystems for Analytics and Standardized Technologies (GW-FEAST) project is part of the ARPA-H FEAST performer team initiative that includes academic and industry partners. The goal of the ARPA-H performer teams is “to create bridges across data silos to make health data more accessible and usable”.&lt;br /&gt;
&amp;lt;/div&amp;gt;&lt;br /&gt;
    &amp;lt;/div&amp;gt;&lt;br /&gt;
    &amp;lt;div style=&amp;quot;flex: 1; margin: 5px; min-width: 210px; border: 1px solid #CCC;	padding: 0 10px 10px 10px; box-shadow: 0 2px 2px rgba(0,0,0,0.1); background: #f5faff;&amp;quot;&amp;gt;&lt;br /&gt;
        &amp;lt;h3&amp;gt;[https://hivelab.tst.biochemistry.gwu.edu/biomarker-partnership Biomarker Partnership]&amp;lt;/h3&amp;gt;&lt;br /&gt;
        &amp;lt;div style=&amp;quot;border-top: 1px solid #CCC; padding-top: 0.5em;&amp;quot;&amp;gt;&lt;br /&gt;
The Biomarker Partnership is a Common Fund Data Ecosystem (CFDE)-sponsored project to develop a knowledgebase that will organize and integrate biomarker data from different public sources. The data will be connected to contextual information to provide a novel systems-level view of biomarkers. The motivation for this project is to improve the harmonization and organization of biomarker data by mapping biomarkers from public sources to, and across, Common Fund (CF) data elements. This mapping will bridge knowledge across multiple Data Coordinating Centers (DCCs) and biomedical disciplines.&lt;br /&gt;
&amp;lt;/div&amp;gt;&lt;br /&gt;
    &amp;lt;/div&amp;gt;&lt;br /&gt;
&amp;lt;/div&amp;gt;&lt;br /&gt;
{{DISPLAYTITLE:&amp;lt;span style=&amp;quot;position: absolute; clip: rect(1px 1px 1px 1px); clip: rect(1px, 1px, 1px, 1px);&amp;quot;&amp;gt;{{FULLPAGENAME}}&amp;lt;/span&amp;gt;}}&lt;br /&gt;
__NOTOC__&lt;br /&gt;
&amp;lt;!-- BANNER ACROSS TOP OF PAGE --&amp;gt;&lt;br /&gt;
&amp;lt;div id=&amp;quot;ggw-topbanner&amp;quot; style=&amp;quot;clear:both; position:relative; box-sizing:border-box; width:100%; margin:1.2em 0 6px; min-width:47em; border:1px solid #ddd; background-color:#f9f9f9; color:#000;&amp;quot;&amp;gt;&lt;br /&gt;
    &amp;lt;div style=&amp;quot;margin:0.4em; text-align:center;&amp;quot;&amp;gt;&lt;br /&gt;
        &amp;lt;div style=&amp;quot;font-size:160%; padding:.1em;&amp;quot;&amp;gt;Past Projects&amp;lt;/div&amp;gt;&lt;br /&gt;
        &amp;lt;/div&amp;gt;&lt;br /&gt;
    &amp;lt;/div&amp;gt;&lt;br /&gt;
&amp;lt;/div&amp;gt;&lt;br /&gt;
&amp;lt;div style=&amp;quot;clear: both;&amp;quot;&amp;gt;&amp;lt;/div&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;div id=&amp;quot;ggw_row2&amp;quot; style=&amp;quot;display: flex; flex-flow: row wrap; justify-content: space-between; padding: 0; margin: 0 -5px 0 -5px;&amp;quot;&amp;gt;&lt;br /&gt;
    &amp;lt;div style=&amp;quot;flex: 1; margin: 5px; min-width: 210px; border: 1px solid #CCC;	padding: 0 10px 10px 10px; box-shadow: 0 2px 2px rgba(0,0,0,0.1); background: #f5faff;&amp;quot;&amp;gt;&lt;br /&gt;
        &amp;lt;h3&amp;gt;[https://hivelab.tst.biochemistry.gwu.edu/gfkb Gut Microbiome Analytic System (Microbiome)]&amp;lt;/h3&amp;gt;&lt;br /&gt;
        &amp;lt;div style=&amp;quot;border-top: 1px solid #CCC; padding-top: 0.5em;&amp;quot;&amp;gt;&lt;br /&gt;
The HIVE team received NSF funding to develop a Gut Microbiome Monitoring System (GutFeeling), a tool that, when used over time, will allow users to adjust their dietary habits (such as consumption of probiotics and prebiotics) and other lifestyle habits to help restore their normal microbiome. Rapid analysis of large amounts of metagenomic data, a major bottleneck, has been made possible by our group through the development of a novel algorithm and accompanying software called CensuScope. Through analysis of healthy gut microbiome data, we are actively developing a Knowledge Base (GutFeelingKB) that will not only provide a clearer picture of an ideal personalized microbiome but also establish baseline characteristics for each customer. The Mazumder Lab is collaborating with the Milken School of Public Health and Kamtek Sequencing Facility to investigate the relationship between bacterial species commonly present in the digestive tract, diet, physical activity, lifestyle habits, and metabolic risk factors. [https://docs.google.com/document/d/18WyVTJrrf-FR0sHt634vO8Lwel-4OQxP9sNar7gYYro/edit?tab=t.0#heading=h.7qbm3f7lky31 More here].&lt;br /&gt;
        &amp;lt;/div&amp;gt;&lt;br /&gt;
    &amp;lt;/div&amp;gt;&lt;br /&gt;
	&lt;br /&gt;
    &amp;lt;div style=&amp;quot;flex: 1; margin: 5px; min-width: 210px; border: 1px solid #CCC;	padding: 0 10px 10px 10px; box-shadow: 0 2px 2px rgba(0,0,0,0.1); background: #f5faff;&amp;quot;&amp;gt;&lt;br /&gt;
        &amp;lt;h3&amp;gt;HIVE-EQAPOL Project on HIV NGS Data Processing and Analysis&amp;lt;/h3&amp;gt;&lt;br /&gt;
        &amp;lt;div style=&amp;quot;border-top: 1px solid #CCC; padding-top: 0.5em;&amp;quot;&amp;gt;&lt;br /&gt;
For this project, our group works closely with the External Quality Assurance Program Oversight Laboratory (EQAPOL) team to analyze, store, and track HIV NGS data. Reliable identification of strains is critical for developing new assays, validating assay platforms, helping regulators evaluate test kits, monitoring HIV drug resistance, and informing vaccine development. The HIVE tools and platform are used for virus identification, recombination analysis, and clone discovery.&lt;br /&gt;
&amp;lt;/div&amp;gt;&lt;br /&gt;
    &amp;lt;/div&amp;gt;&lt;br /&gt;
    &amp;lt;div style=&amp;quot;flex: 1; margin: 5px; min-width: 210px; border: 1px solid #CCC;	padding: 0 10px 10px 10px; box-shadow: 0 2px 2px rgba(0,0,0,0.1); background: #f5faff;&amp;quot;&amp;gt;&lt;br /&gt;
        &amp;lt;h3&amp;gt;[https://www.oncomx.org/ OncoMX]&amp;lt;/h3&amp;gt;&lt;br /&gt;
	&amp;lt;div style=&amp;quot;border-top: 1px solid #CCC; padding-top: 0.5em;&amp;quot;&amp;gt;&lt;br /&gt;
The OncoMX mission is to create an integrated cancer mutation and expression resource for exploring cancer biomarkers. OncoMX is a collaboration between the George Washington University (GW), NASA&#039;s Jet Propulsion Laboratory (JPL), the Swiss Institute of Bioinformatics (SIB), and the University of Delaware (UD). The core knowledgebase of OncoMX is derived from BioMuta and BioXpress integrated cancer mutation and expression databases. Normal expression data from Bgee and custom text mining software augment the cancer data to improve functional interpretation of the reported variants and expression profiles. All data are wrapped into the OncoMX database and web portal, mapped to additional functional information from NCI Early Detection Research Network (EDRN) and Reactome. It is expected that the large-scale integration of cancer data and supporting information, provided by OncoMX with direct community feedback, will benefit cancer research by improving synthesis of information and may make earlier detection a reality.&lt;br /&gt;
        &amp;lt;/div&amp;gt;&lt;br /&gt;
    &amp;lt;/div&amp;gt;&lt;br /&gt;
&amp;lt;div style=&amp;quot;flex: 1; margin: 5px; min-width: 210px; border: 1px solid #CCC;	padding: 0 10px 10px 10px; box-shadow: 0 2px 2px rgba(0,0,0,0.1); background: #f5faff;&amp;quot;&amp;gt;&lt;br /&gt;
        &amp;lt;h3&amp;gt;[https://hive.biochemistry.gwu.edu/dna.cgi?cmd=main Glycoproteomics Characterization Workflow and Data-Analysis Pipeline for Vaccines and Biosimilars]&amp;lt;/h3&amp;gt;&lt;br /&gt;
	&amp;lt;div style=&amp;quot;border-top: 1px solid #CCC; padding-top: 0.5em;&amp;quot;&amp;gt;&lt;br /&gt;
In this FDA-funded project, we are extending High-performance Integrated Virtual Environment (HIVE) capabilities through the development and integration of software tools and datasets for comparative analysis of glycoproteins. Glycomic analysis has many angles and has been extensively reviewed in recent literature. We propose to rely on the independent development of the glycomics field and incorporate these approaches into the HIVE pipeline as they mature, while we develop a standardized glycoinformatics pipeline that will benefit investigators and regulators at the FDA.&lt;br /&gt;
        &amp;lt;/div&amp;gt;&lt;br /&gt;
    &amp;lt;/div&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/div&amp;gt;&lt;/div&gt;</summary>
		<author><name>Jeetvora</name></author>
	</entry>
	<entry>
		<id>https://hivelab.biochemistry.gwu.edu/wiki/index.php?title=Projects&amp;diff=217</id>
		<title>Projects</title>
		<link rel="alternate" type="text/html" href="https://hivelab.biochemistry.gwu.edu/wiki/index.php?title=Projects&amp;diff=217"/>
		<updated>2024-12-10T21:15:41Z</updated>

		<summary type="html">&lt;p&gt;Jeetvora: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;{{DISPLAYTITLE:&amp;lt;span style=&amp;quot;position: absolute; clip: rect(1px 1px 1px 1px); clip: rect(1px, 1px, 1px, 1px);&amp;quot;&amp;gt;{{FULLPAGENAME}}&amp;lt;/span&amp;gt;}}&lt;br /&gt;
__NOTOC__&lt;br /&gt;
&amp;lt;!-- BANNER ACROSS TOP OF PAGE --&amp;gt;&lt;br /&gt;
&amp;lt;div id=&amp;quot;ggw-topbanner&amp;quot; style=&amp;quot;clear:both; position:relative; box-sizing:border-box; width:100%; margin:1.2em 0 6px; min-width:47em; border:1px solid #ddd; background-color:#f9f9f9; color:#000;&amp;quot;&amp;gt;&lt;br /&gt;
    &amp;lt;div style=&amp;quot;margin:0.4em; text-align:center;&amp;quot;&amp;gt;&lt;br /&gt;
        &amp;lt;div style=&amp;quot;font-size:160%; padding:.1em;&amp;quot;&amp;gt;Current Projects&amp;lt;/div&amp;gt;&lt;br /&gt;
        &amp;lt;/div&amp;gt;&lt;br /&gt;
    &amp;lt;/div&amp;gt;&lt;br /&gt;
&amp;lt;/div&amp;gt;&lt;br /&gt;
&amp;lt;div style=&amp;quot;clear: both;&amp;quot;&amp;gt;&amp;lt;/div&amp;gt;&lt;br /&gt;
 &lt;br /&gt;
&amp;lt;div id=&amp;quot;ggw_row2&amp;quot; style=&amp;quot;display: flex; flex-flow: row wrap; justify-content: space-between; padding: 0; margin: 0 -5px 0 -5px;&amp;quot;&amp;gt;&lt;br /&gt;
    &amp;lt;div style=&amp;quot;flex: 1; margin: 5px; min-width: 210px; border: 1px solid #CCC;	padding: 0 10px 10px 10px; box-shadow: 0 2px 2px rgba(0,0,0,0.1); background: #f5faff;&amp;quot;&amp;gt;&lt;br /&gt;
        &amp;lt;h3&amp;gt;[https://hive.biochemistry.gwu.edu/dna.cgi?cmd=main The High-performance Integrated Virtual Environment (HIVE) platform]&amp;lt;/h3&amp;gt;&lt;br /&gt;
        &amp;lt;div style=&amp;quot;border-top: 1px solid #CCC; padding-top: 0.5em;&amp;quot;&amp;gt;&lt;br /&gt;
HIVE is a cloud-based environment optimized for the storage and analysis of extra-large data, such as biomedical data, clinical data, next-generation sequencing (NGS) data, mass spectrometry files, confocal microscopy images, post-market surveillance data, medical recall data, and many others. HIVE provides secure web access for authorized users to deposit, retrieve, annotate and compute on Big Data, and analyze the outcomes using web user interfaces. [https://docs.google.com/document/d/1F5iq00uKkJfdSsbwanvKOy-nPnwijH56mwbwa_HhzfY/edit?tab=t.0#heading=h.7dlfmngwfzih More here].&lt;br /&gt;
        &amp;lt;/div&amp;gt;&lt;br /&gt;
    &amp;lt;/div&amp;gt;&lt;br /&gt;
	&lt;br /&gt;
    &amp;lt;div style=&amp;quot;flex: 1; margin: 5px; min-width: 210px; border: 1px solid #CCC;	padding: 0 10px 10px 10px; box-shadow: 0 2px 2px rgba(0,0,0,0.1); background: #f5faff;&amp;quot;&amp;gt;&lt;br /&gt;
        &amp;lt;h3&amp;gt;[https://www.biocomputeobject.org/ BioCompute Objects (BCO)]&amp;lt;/h3&amp;gt;&lt;br /&gt;
        &amp;lt;div style=&amp;quot;border-top: 1px solid #CCC; padding-top: 0.5em;&amp;quot;&amp;gt;&lt;br /&gt;
BioCompute is an FDA-funded project to establish a framework for community-based development of standards for harmonization of High-throughput Sequencing (HTS), standardization of data formats, promotion of interoperability, and bioinformatics verification protocols. The BioCompute Object (BCO) was developed under the High-throughput Sequencing Computational Standards for Regulatory Sciences (HTS-CSRS) initiative through the BioCompute Objects Portal (BOP), a web portal that serves as a collaborative space for dialogue on interoperability between different bioinformatics pipelines, industries, and developers. HIVE capabilities have been leveraged to support the development of the BCO. The BCO is versatile and adaptable to other common HTS analysis platforms. [https://docs.google.com/document/d/1WQFZm_PFiQXob4NyOKq6y-2ywnbmNoFHSS27fYf3l4Y/edit?tab=t.0#heading=h.bs8eki17tykx More here].&lt;br /&gt;
&amp;lt;/div&amp;gt;&lt;br /&gt;
    &amp;lt;/div&amp;gt;&lt;br /&gt;
    &amp;lt;div style=&amp;quot;flex: 1; margin: 5px; min-width: 210px; border: 1px solid #CCC;	padding: 0 10px 10px 10px; box-shadow: 0 2px 2px rgba(0,0,0,0.1); background: #f5faff;&amp;quot;&amp;gt;&lt;br /&gt;
        &amp;lt;h3&amp;gt;[https://www.glygen.org/ GlyGen]&amp;lt;/h3&amp;gt;&lt;br /&gt;
	&amp;lt;div style=&amp;quot;border-top: 1px solid #CCC; padding-top: 0.5em;&amp;quot;&amp;gt;&lt;br /&gt;
GlyGen (gly-glycobiology; gen-information) is an advanced glycoinformatics resource developed to facilitate discovery in basic and translational glycobiology research and to enhance the integration of multidisciplinary information from diverse resources. GlyGen includes knowledge about molecular, biophysical and functional properties of glycans, genes, and proteins organized in pathways and ontologies, plus a rapidly growing body of biological big data related to cancer mutation and expression. GlyGen adopts an innovative user-driven approach for implementing and prioritizing knowledge-dissemination tools that address the questions and needs of the glycobiology community.&lt;br /&gt;
        &amp;lt;/div&amp;gt;&lt;br /&gt;
    &amp;lt;/div&amp;gt;&lt;br /&gt;
&amp;lt;/div&amp;gt;&amp;lt;div id=&amp;quot;ggw_row3&amp;quot; style=&amp;quot;display: flex; flex-flow: row wrap; justify-content: space-between; padding: 0; margin: 0 -5px 0 -5px;&amp;quot;&amp;gt;&lt;br /&gt;
    &amp;lt;div style=&amp;quot;flex: 1; margin: 5px; min-width: 210px; border: 1px solid #CCC;	padding: 0 10px 10px 10px; box-shadow: 0 2px 2px rgba(0,0,0,0.1); background: #f5faff;&amp;quot;&amp;gt;&lt;br /&gt;
        &amp;lt;h3&amp;gt;[https://hivelab.biochemistry.gwu.edu/predictmod PredictMod]&amp;lt;/h3&amp;gt;&lt;br /&gt;
        &amp;lt;div style=&amp;quot;border-top: 1px solid #CCC; padding-top: 0.5em;&amp;quot;&amp;gt;&lt;br /&gt;
PredictMod is an application designed to predict the outcome of an intervention prior to a patient initiating treatment. Our goal is to provide clinicians with a powerful decision-making tool that enhances clinical understanding of patient-level data. The PredictMod platform utilizes machine learning tools and complex datasets based on electronic health records, gut microbiome, and -omics data to forecast patient outcomes, often in response to treatment for a particular condition. While our primary condition of interest is prediabetes, the tool is designed to be used for a variety of conditions, interventions, and data types.&lt;br /&gt;
&amp;lt;/div&amp;gt;&lt;br /&gt;
    &amp;lt;/div&amp;gt;&lt;br /&gt;
    &amp;lt;div style=&amp;quot;flex: 1; margin: 5px; min-width: 210px; border: 1px solid #CCC;	padding: 0 10px 10px 10px; box-shadow: 0 2px 2px rgba(0,0,0,0.1); background: #f5faff;&amp;quot;&amp;gt;&lt;br /&gt;
        &amp;lt;h3&amp;gt;[[GW-FEAST]]&amp;lt;/h3&amp;gt;&lt;br /&gt;
        &amp;lt;div style=&amp;quot;border-top: 1px solid #CCC; padding-top: 0.5em;&amp;quot;&amp;gt;&lt;br /&gt;
The GW Federated Ecosystems for Analytics and Standardized Technologies (GW-FEAST) project is part of the ARPA-H FEAST performer team initiative that includes academic and industry partners. The goal of the ARPA-H performer teams is “to create bridges across data silos to make health data more accessible and usable”.&lt;br /&gt;
&amp;lt;/div&amp;gt;&lt;br /&gt;
    &amp;lt;/div&amp;gt;&lt;br /&gt;
    &amp;lt;div style=&amp;quot;flex: 1; margin: 5px; min-width: 210px; border: 1px solid #CCC;	padding: 0 10px 10px 10px; box-shadow: 0 2px 2px rgba(0,0,0,0.1); background: #f5faff;&amp;quot;&amp;gt;&lt;br /&gt;
        &amp;lt;h3&amp;gt;[https://hivelab.tst.biochemistry.gwu.edu/biomarker-partnership Biomarker Partnership]&amp;lt;/h3&amp;gt;&lt;br /&gt;
        &amp;lt;div style=&amp;quot;border-top: 1px solid #CCC; padding-top: 0.5em;&amp;quot;&amp;gt;&lt;br /&gt;
The Biomarker Partnership is a CFDE-sponsored project to develop a knowledgebase that will organize and integrate biomarker data from different public sources. The data will be connected to contextual information to show a novel systems-level view of biomarkers. The motivation for this project is to improve the harmonization and organization of biomarker data. This will be done by mapping biomarkers from public sources to, and across, CF data elements. This mapping will bridge knowledge across multiple DCCs and biomedical disciplines.&lt;br /&gt;
&amp;lt;/div&amp;gt;&lt;br /&gt;
    &amp;lt;/div&amp;gt;&lt;br /&gt;
&amp;lt;/div&amp;gt;&lt;br /&gt;
{{DISPLAYTITLE:&amp;lt;span style=&amp;quot;position: absolute; clip: rect(1px 1px 1px 1px); clip: rect(1px, 1px, 1px, 1px);&amp;quot;&amp;gt;{{FULLPAGENAME}}&amp;lt;/span&amp;gt;}}&lt;br /&gt;
__NOTOC__&lt;br /&gt;
&amp;lt;!-- BANNER ACROSS TOP OF PAGE --&amp;gt;&lt;br /&gt;
&amp;lt;div id=&amp;quot;ggw-topbanner&amp;quot; style=&amp;quot;clear:both; position:relative; box-sizing:border-box; width:100%; margin:1.2em 0 6px; min-width:47em; border:1px solid #ddd; background-color:#f9f9f9; color:#000;&amp;quot;&amp;gt;&lt;br /&gt;
    &amp;lt;div style=&amp;quot;margin:0.4em; text-align:center;&amp;quot;&amp;gt;&lt;br /&gt;
        &amp;lt;div style=&amp;quot;font-size:160%; padding:.1em;&amp;quot;&amp;gt;Past Projects&amp;lt;/div&amp;gt;&lt;br /&gt;
        &amp;lt;/div&amp;gt;&lt;br /&gt;
    &amp;lt;/div&amp;gt;&lt;br /&gt;
&amp;lt;/div&amp;gt;&lt;br /&gt;
&amp;lt;div style=&amp;quot;clear: both;&amp;quot;&amp;gt;&amp;lt;/div&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;div id=&amp;quot;ggw_row2&amp;quot; style=&amp;quot;display: flex; flex-flow: row wrap; justify-content: space-between; padding: 0; margin: 0 -5px 0 -5px;&amp;quot;&amp;gt;&lt;br /&gt;
    &amp;lt;div style=&amp;quot;flex: 1; margin: 5px; min-width: 210px; border: 1px solid #CCC;	padding: 0 10px 10px 10px; box-shadow: 0 2px 2px rgba(0,0,0,0.1); background: #f5faff;&amp;quot;&amp;gt;&lt;br /&gt;
        &amp;lt;h3&amp;gt;[https://hivelab.tst.biochemistry.gwu.edu/gfkb Gut Microbiome Analytic System (Microbiome)]&amp;lt;/h3&amp;gt;&lt;br /&gt;
        &amp;lt;div style=&amp;quot;border-top: 1px solid #CCC; padding-top: 0.5em;&amp;quot;&amp;gt;&lt;br /&gt;
The HIVE team received NSF funding to develop a Gut Microbiome Monitoring System (GutFeeling), a tool that, when used over time, will allow users to correct their dietary habits (such as consumption of probiotics and prebiotics) and other lifestyle habits and help restore their normal microbiome. Rapid analysis of large amounts of metagenomic data, a major bottleneck, has been made possible by our group's development of a novel algorithm and accompanying software called CensuScope. Through analysis of healthy gut microbiome data, we are actively developing a Knowledge Base (GutFeelingKB) that will not only provide a clearer picture of an ideal personalized microbiome but also establish baseline characteristics for each customer. The Mazumder Lab is collaborating with the Milken School of Public Health and Kamtek Sequencing Facility to investigate the relationship between bacterial species commonly present in the digestive tract, diet, physical activity, lifestyle habits, and metabolic risk factors. [https://docs.google.com/document/d/18WyVTJrrf-FR0sHt634vO8Lwel-4OQxP9sNar7gYYro/edit?tab=t.0#heading=h.7qbm3f7lky31 More here].&lt;br /&gt;
        &amp;lt;/div&amp;gt;&lt;br /&gt;
    &amp;lt;/div&amp;gt;&lt;br /&gt;
	&lt;br /&gt;
    &amp;lt;div style=&amp;quot;flex: 1; margin: 5px; min-width: 210px; border: 1px solid #CCC;	padding: 0 10px 10px 10px; box-shadow: 0 2px 2px rgba(0,0,0,0.1); background: #f5faff;&amp;quot;&amp;gt;&lt;br /&gt;
        &amp;lt;h3&amp;gt;HIVE-EQAPOL Project on HIVE NGS Data Processing and Analysis&amp;lt;/h3&amp;gt;&lt;br /&gt;
        &amp;lt;div style=&amp;quot;border-top: 1px solid #CCC; padding-top: 0.5em;&amp;quot;&amp;gt;&lt;br /&gt;
For this project, our group works closely with the External Quality Assurance Program Oversight Laboratory (EQAPOL) team to conduct HIV NGS data analysis and to collaborate on analyzing, storing, and tracking HIV NGS data. Reliable identification of strains is critical for developing new assays, validating assay platforms, assisting regulators to evaluate test kits, monitoring HIV drug resistance, and informing vaccine development. The HIVE tools and platform are used for virus identification, recombination analysis, and clone discovery.&lt;br /&gt;
&amp;lt;/div&amp;gt;&lt;br /&gt;
    &amp;lt;/div&amp;gt;&lt;br /&gt;
    &amp;lt;div style=&amp;quot;flex: 1; margin: 5px; min-width: 210px; border: 1px solid #CCC;	padding: 0 10px 10px 10px; box-shadow: 0 2px 2px rgba(0,0,0,0.1); background: #f5faff;&amp;quot;&amp;gt;&lt;br /&gt;
        &amp;lt;h3&amp;gt;[https://www.oncomx.org/ OncoMX]&amp;lt;/h3&amp;gt;&lt;br /&gt;
	&amp;lt;div style=&amp;quot;border-top: 1px solid #CCC; padding-top: 0.5em;&amp;quot;&amp;gt;&lt;br /&gt;
The OncoMX mission is to create an integrated cancer mutation and expression resource for exploring cancer biomarkers. OncoMX is a collaboration between the George Washington University (GW), NASA&#039;s Jet Propulsion Laboratory (JPL), the Swiss Institute of Bioinformatics (SIB), and the University of Delaware (UD). The core knowledgebase of OncoMX is derived from the BioMuta and BioXpress integrated cancer mutation and expression databases. Normal expression data from Bgee and custom text mining software augment the cancer data to improve functional interpretation of the reported variants and expression profiles. All data are wrapped into the OncoMX database and web portal and mapped to additional functional information from the NCI Early Detection Research Network (EDRN) and Reactome. It is expected that the large-scale integration of cancer data and supporting information, provided by OncoMX with direct community feedback, will benefit cancer research by improving synthesis of information and may make earlier detection a reality.&lt;br /&gt;
        &amp;lt;/div&amp;gt;&lt;br /&gt;
    &amp;lt;/div&amp;gt;&lt;br /&gt;
&amp;lt;div style=&amp;quot;flex: 1; margin: 5px; min-width: 210px; border: 1px solid #CCC;	padding: 0 10px 10px 10px; box-shadow: 0 2px 2px rgba(0,0,0,0.1); background: #f5faff;&amp;quot;&amp;gt;&lt;br /&gt;
        &amp;lt;h3&amp;gt;[https://hive.biochemistry.gwu.edu/dna.cgi?cmd=main Glycoproteomics Characterization Workflow and Data-Analysis Pipeline for Vaccines and Biosimilars]&amp;lt;/h3&amp;gt;&lt;br /&gt;
	&amp;lt;div style=&amp;quot;border-top: 1px solid #CCC; padding-top: 0.5em;&amp;quot;&amp;gt;&lt;br /&gt;
In this FDA-funded project, we are extending High-performance Integrated Virtual Environment (HIVE) capabilities through the development and integration of software tools and datasets for comparative analysis of glycoproteins. Glycomic analysis has many angles and has been extensively reviewed in recent literature. We propose to rely on the independent development of the glycomics field and incorporate these approaches in the HIVE pipeline as they mature while we develop a standardized glycoinformatics pipeline that will benefit investigators and regulators at the FDA.&lt;br /&gt;
        &amp;lt;/div&amp;gt;&lt;br /&gt;
    &amp;lt;/div&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/div&amp;gt;&lt;/div&gt;</summary>
		<author><name>Jeetvora</name></author>
	</entry>
	<entry>
		<id>https://hivelab.biochemistry.gwu.edu/wiki/index.php?title=Projects&amp;diff=216</id>
		<title>Projects</title>
		<link rel="alternate" type="text/html" href="https://hivelab.biochemistry.gwu.edu/wiki/index.php?title=Projects&amp;diff=216"/>
		<updated>2024-12-10T21:14:19Z</updated>

		<summary type="html">&lt;p&gt;Jeetvora: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;{{DISPLAYTITLE:&amp;lt;span style=&amp;quot;position: absolute; clip: rect(1px 1px 1px 1px); clip: rect(1px, 1px, 1px, 1px);&amp;quot;&amp;gt;{{FULLPAGENAME}}&amp;lt;/span&amp;gt;}}&lt;br /&gt;
__NOTOC__&lt;br /&gt;
&amp;lt;!-- BANNER ACROSS TOP OF PAGE --&amp;gt;&lt;br /&gt;
&amp;lt;div id=&amp;quot;ggw-topbanner&amp;quot; style=&amp;quot;clear:both; position:relative; box-sizing:border-box; width:100%; margin:1.2em 0 6px; min-width:47em; border:1px solid #ddd; background-color:#f9f9f9; color:#000;&amp;quot;&amp;gt;&lt;br /&gt;
    &amp;lt;div style=&amp;quot;margin:0.4em; text-align:center;&amp;quot;&amp;gt;&lt;br /&gt;
        &amp;lt;div style=&amp;quot;font-size:160%; padding:.1em;&amp;quot;&amp;gt;Current Projects&amp;lt;/div&amp;gt;&lt;br /&gt;
        &amp;lt;/div&amp;gt;&lt;br /&gt;
    &amp;lt;/div&amp;gt;&lt;br /&gt;
&amp;lt;/div&amp;gt;&lt;br /&gt;
&amp;lt;div style=&amp;quot;clear: both;&amp;quot;&amp;gt;&amp;lt;/div&amp;gt;&lt;br /&gt;
 &lt;br /&gt;
&amp;lt;div id=&amp;quot;ggw_row2&amp;quot; style=&amp;quot;display: flex; flex-flow: row wrap; justify-content: space-between; padding: 0; margin: 0 -5px 0 -5px;&amp;quot;&amp;gt;&lt;br /&gt;
    &amp;lt;div style=&amp;quot;flex: 1; margin: 5px; min-width: 210px; border: 1px solid #CCC;	padding: 0 10px 10px 10px; box-shadow: 0 2px 2px rgba(0,0,0,0.1); background: #f5faff;&amp;quot;&amp;gt;&lt;br /&gt;
        &amp;lt;h3&amp;gt;[https://hive.biochemistry.gwu.edu/dna.cgi?cmd=main The High-performance Integrated Virtual Environment (HIVE) platform]&amp;lt;/h3&amp;gt;&lt;br /&gt;
        &amp;lt;div style=&amp;quot;border-top: 1px solid #CCC; padding-top: 0.5em;&amp;quot;&amp;gt;&lt;br /&gt;
HIVE is a cloud-based environment optimized for the storage and analysis of extra-large data, such as biomedical data, clinical data, next-generation sequencing (NGS) data, mass spectrometry files, confocal microscopy images, post-market surveillance data, medical recall data, and many others. HIVE provides secure web access for authorized users to deposit, retrieve, annotate and compute on Big Data, and analyze the outcomes using web user interfaces. [https://docs.google.com/document/d/1F5iq00uKkJfdSsbwanvKOy-nPnwijH56mwbwa_HhzfY/edit?tab=t.0#heading=h.7dlfmngwfzih More here].&lt;br /&gt;
        &amp;lt;/div&amp;gt;&lt;br /&gt;
    &amp;lt;/div&amp;gt;&lt;br /&gt;
	&lt;br /&gt;
    &amp;lt;div style=&amp;quot;flex: 1; margin: 5px; min-width: 210px; border: 1px solid #CCC;	padding: 0 10px 10px 10px; box-shadow: 0 2px 2px rgba(0,0,0,0.1); background: #f5faff;&amp;quot;&amp;gt;&lt;br /&gt;
        &amp;lt;h3&amp;gt;[https://www.biocomputeobject.org/ BioCompute Objects (BCO)]&amp;lt;/h3&amp;gt;&lt;br /&gt;
        &amp;lt;div style=&amp;quot;border-top: 1px solid #CCC; padding-top: 0.5em;&amp;quot;&amp;gt;&lt;br /&gt;
BioCompute is an FDA-funded project to establish a framework for community-based development of standards for harmonization of High-throughput Sequencing (HTS), standardization of data formats, promotion of interoperability, and bioinformatics verification protocols. The BioCompute Object (BCO) was developed in the High-throughput Sequencing Computational Standards for Regulatory Sciences (HTS-CSRS) initiative in the BioCompute Objects Portal (BOP), a web portal that serves as a collaborative ground for dialogue to facilitate interoperability between different bioinformatic pipelines, industries, and developers. HIVE capabilities have been leveraged to support the development of the BCO. The BCO is versatile and adaptable to other common HTS analysis platforms. [https://docs.google.com/document/d/1WQFZm_PFiQXob4NyOKq6y-2ywnbmNoFHSS27fYf3l4Y/edit?tab=t.0#heading=h.bs8eki17tykx More here].&lt;br /&gt;
&amp;lt;/div&amp;gt;&lt;br /&gt;
    &amp;lt;/div&amp;gt;&lt;br /&gt;
    &amp;lt;div style=&amp;quot;flex: 1; margin: 5px; min-width: 210px; border: 1px solid #CCC;	padding: 0 10px 10px 10px; box-shadow: 0 2px 2px rgba(0,0,0,0.1); background: #f5faff;&amp;quot;&amp;gt;&lt;br /&gt;
        &amp;lt;h3&amp;gt;[https://www.glygen.org/ GlyGen]&amp;lt;/h3&amp;gt;&lt;br /&gt;
	&amp;lt;div style=&amp;quot;border-top: 1px solid #CCC; padding-top: 0.5em;&amp;quot;&amp;gt;&lt;br /&gt;
GlyGen (gly-glycobiology; gen-information) is an advanced glycoinformatics resource developed to facilitate discovery in basic and translational glycobiology research and to enhance the integration of multidisciplinary information from diverse resources. GlyGen includes knowledge about molecular, biophysical and functional properties of glycans, genes, and proteins organized in pathways and ontologies, plus a rapidly growing body of biological big data related to cancer mutation and expression. GlyGen adopts an innovative user-driven approach for implementing and prioritizing knowledge-dissemination tools that address the questions and needs of the glycobiology community.&lt;br /&gt;
        &amp;lt;/div&amp;gt;&lt;br /&gt;
    &amp;lt;/div&amp;gt;&lt;br /&gt;
&amp;lt;/div&amp;gt;&amp;lt;div id=&amp;quot;ggw_row3&amp;quot; style=&amp;quot;display: flex; flex-flow: row wrap; justify-content: space-between; padding: 0; margin: 0 -5px 0 -5px;&amp;quot;&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/div&amp;gt;&lt;br /&gt;
&amp;lt;/div&amp;gt;&amp;lt;div id=&amp;quot;ggw_row3&amp;quot; style=&amp;quot;display: flex; flex-flow: row wrap; justify-content: space-between; padding: 0; margin: 0 -5px 0 -5px;&amp;quot;&amp;gt;&lt;br /&gt;
    &amp;lt;div style=&amp;quot;flex: 1; margin: 5px; min-width: 210px; border: 1px solid #CCC;	padding: 0 10px 10px 10px; box-shadow: 0 2px 2px rgba(0,0,0,0.1); background: #f5faff;&amp;quot;&amp;gt;&lt;br /&gt;
        &amp;lt;h3&amp;gt;[https://hivelab.biochemistry.gwu.edu/predictmod PredictMod]&amp;lt;/h3&amp;gt;&lt;br /&gt;
        &amp;lt;div style=&amp;quot;border-top: 1px solid #CCC; padding-top: 0.5em;&amp;quot;&amp;gt;&lt;br /&gt;
PredictMod is an application designed to predict the outcome of an intervention prior to a patient initiating treatment. Our goal is to provide clinicians with a powerful decision-making tool that enhances clinical understanding of patient-level data. The PredictMod platform utilizes machine learning tools and complex datasets based on electronic health records, gut microbiome, and -omics data to forecast patient outcomes, often in response to treatment for a particular condition. While our primary condition of interest is prediabetes, the tool is designed to be used for a variety of conditions, interventions, and data types.&lt;br /&gt;
&amp;lt;/div&amp;gt;&lt;br /&gt;
    &amp;lt;/div&amp;gt;&lt;br /&gt;
    &amp;lt;div style=&amp;quot;flex: 1; margin: 5px; min-width: 210px; border: 1px solid #CCC;	padding: 0 10px 10px 10px; box-shadow: 0 2px 2px rgba(0,0,0,0.1); background: #f5faff;&amp;quot;&amp;gt;&lt;br /&gt;
        &amp;lt;h3&amp;gt;[[GW-FEAST]]&amp;lt;/h3&amp;gt;&lt;br /&gt;
        &amp;lt;div style=&amp;quot;border-top: 1px solid #CCC; padding-top: 0.5em;&amp;quot;&amp;gt;&lt;br /&gt;
The GW Federated Ecosystems for Analytics and Standardized Technologies (GW-FEAST) project is part of the ARPA-H FEAST performer team initiative that includes academic and industry partners. The goal of the ARPA-H performer teams is “to create bridges across data silos to make health data more accessible and usable”.&lt;br /&gt;
&amp;lt;/div&amp;gt;&lt;br /&gt;
    &amp;lt;/div&amp;gt;&lt;br /&gt;
    &amp;lt;div style=&amp;quot;flex: 1; margin: 5px; min-width: 210px; border: 1px solid #CCC;	padding: 0 10px 10px 10px; box-shadow: 0 2px 2px rgba(0,0,0,0.1); background: #f5faff;&amp;quot;&amp;gt;&lt;br /&gt;
        &amp;lt;h3&amp;gt;[https://hivelab.tst.biochemistry.gwu.edu/biomarker-partnership Biomarker Partnership]&amp;lt;/h3&amp;gt;&lt;br /&gt;
        &amp;lt;div style=&amp;quot;border-top: 1px solid #CCC; padding-top: 0.5em;&amp;quot;&amp;gt;&lt;br /&gt;
The Biomarker Partnership is a CFDE-sponsored project to develop a knowledgebase that will organize and integrate biomarker data from different public sources. The data will be connected to contextual information to show a novel systems-level view of biomarkers. The motivation for this project is to improve the harmonization and organization of biomarker data. This will be done by mapping biomarkers from public sources to, and across, CF data elements. This mapping will bridge knowledge across multiple DCCs and biomedical disciplines.&lt;br /&gt;
&amp;lt;/div&amp;gt;&lt;br /&gt;
    &amp;lt;/div&amp;gt;&lt;br /&gt;
&amp;lt;/div&amp;gt;&lt;br /&gt;
{{DISPLAYTITLE:&amp;lt;span style=&amp;quot;position: absolute; clip: rect(1px 1px 1px 1px); clip: rect(1px, 1px, 1px, 1px);&amp;quot;&amp;gt;{{FULLPAGENAME}}&amp;lt;/span&amp;gt;}}&lt;br /&gt;
__NOTOC__&lt;br /&gt;
&amp;lt;!-- BANNER ACROSS TOP OF PAGE --&amp;gt;&lt;br /&gt;
&amp;lt;div id=&amp;quot;ggw-topbanner&amp;quot; style=&amp;quot;clear:both; position:relative; box-sizing:border-box; width:100%; margin:1.2em 0 6px; min-width:47em; border:1px solid #ddd; background-color:#f9f9f9; color:#000;&amp;quot;&amp;gt;&lt;br /&gt;
    &amp;lt;div style=&amp;quot;margin:0.4em; text-align:center;&amp;quot;&amp;gt;&lt;br /&gt;
        &amp;lt;div style=&amp;quot;font-size:160%; padding:.1em;&amp;quot;&amp;gt;Past Projects&amp;lt;/div&amp;gt;&lt;br /&gt;
        &amp;lt;/div&amp;gt;&lt;br /&gt;
    &amp;lt;/div&amp;gt;&lt;br /&gt;
&amp;lt;/div&amp;gt;&lt;br /&gt;
&amp;lt;div style=&amp;quot;clear: both;&amp;quot;&amp;gt;&amp;lt;/div&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;div id=&amp;quot;ggw_row2&amp;quot; style=&amp;quot;display: flex; flex-flow: row wrap; justify-content: space-between; padding: 0; margin: 0 -5px 0 -5px;&amp;quot;&amp;gt;&lt;br /&gt;
    &amp;lt;div style=&amp;quot;flex: 1; margin: 5px; min-width: 210px; border: 1px solid #CCC;	padding: 0 10px 10px 10px; box-shadow: 0 2px 2px rgba(0,0,0,0.1); background: #f5faff;&amp;quot;&amp;gt;&lt;br /&gt;
        &amp;lt;h3&amp;gt;[https://hivelab.tst.biochemistry.gwu.edu/gfkb Gut Microbiome Analytic System (Microbiome)]&amp;lt;/h3&amp;gt;&lt;br /&gt;
        &amp;lt;div style=&amp;quot;border-top: 1px solid #CCC; padding-top: 0.5em;&amp;quot;&amp;gt;&lt;br /&gt;
The HIVE team received NSF funding to develop a Gut Microbiome Monitoring System (GutFeeling), a tool that, when used over time, will allow users to correct their dietary habits (such as consumption of probiotics and prebiotics) and other lifestyle habits and help restore their normal microbiome. Rapid analysis of large amounts of metagenomic data, a major bottleneck, has been made possible by our group's development of a novel algorithm and accompanying software called CensuScope. Through analysis of healthy gut microbiome data, we are actively developing a Knowledge Base (GutFeelingKB) that will not only provide a clearer picture of an ideal personalized microbiome but also establish baseline characteristics for each customer. The Mazumder Lab is collaborating with the Milken School of Public Health and Kamtek Sequencing Facility to investigate the relationship between bacterial species commonly present in the digestive tract, diet, physical activity, lifestyle habits, and metabolic risk factors. [https://docs.google.com/document/d/18WyVTJrrf-FR0sHt634vO8Lwel-4OQxP9sNar7gYYro/edit?tab=t.0#heading=h.7qbm3f7lky31 More here].&lt;br /&gt;
        &amp;lt;/div&amp;gt;&lt;br /&gt;
    &amp;lt;/div&amp;gt;&lt;br /&gt;
	&lt;br /&gt;
    &amp;lt;div style=&amp;quot;flex: 1; margin: 5px; min-width: 210px; border: 1px solid #CCC;	padding: 0 10px 10px 10px; box-shadow: 0 2px 2px rgba(0,0,0,0.1); background: #f5faff;&amp;quot;&amp;gt;&lt;br /&gt;
        &amp;lt;h3&amp;gt;HIVE-EQAPOL Project on HIVE NGS Data Processing and Analysis&amp;lt;/h3&amp;gt;&lt;br /&gt;
        &amp;lt;div style=&amp;quot;border-top: 1px solid #CCC; padding-top: 0.5em;&amp;quot;&amp;gt;&lt;br /&gt;
For this project, our group works closely with the External Quality Assurance Program Oversight Laboratory (EQAPOL) team to conduct HIV NGS data analysis and to collaborate on analyzing, storing, and tracking HIV NGS data. Reliable identification of strains is critical for developing new assays, validating assay platforms, assisting regulators to evaluate test kits, monitoring HIV drug resistance, and informing vaccine development. The HIVE tools and platform are used for virus identification, recombination analysis, and clone discovery.&lt;br /&gt;
&amp;lt;/div&amp;gt;&lt;br /&gt;
    &amp;lt;/div&amp;gt;&lt;br /&gt;
    &amp;lt;div style=&amp;quot;flex: 1; margin: 5px; min-width: 210px; border: 1px solid #CCC;	padding: 0 10px 10px 10px; box-shadow: 0 2px 2px rgba(0,0,0,0.1); background: #f5faff;&amp;quot;&amp;gt;&lt;br /&gt;
        &amp;lt;h3&amp;gt;[https://www.oncomx.org/ OncoMX]&amp;lt;/h3&amp;gt;&lt;br /&gt;
	&amp;lt;div style=&amp;quot;border-top: 1px solid #CCC; padding-top: 0.5em;&amp;quot;&amp;gt;&lt;br /&gt;
The OncoMX mission is to create an integrated cancer mutation and expression resource for exploring cancer biomarkers. OncoMX is a collaboration between the George Washington University (GW), NASA&#039;s Jet Propulsion Laboratory (JPL), the Swiss Institute of Bioinformatics (SIB), and the University of Delaware (UD). The core knowledgebase of OncoMX is derived from the BioMuta and BioXpress integrated cancer mutation and expression databases. Normal expression data from Bgee and custom text mining software augment the cancer data to improve functional interpretation of the reported variants and expression profiles. All data are wrapped into the OncoMX database and web portal and mapped to additional functional information from the NCI Early Detection Research Network (EDRN) and Reactome. It is expected that the large-scale integration of cancer data and supporting information, provided by OncoMX with direct community feedback, will benefit cancer research by improving synthesis of information and may make earlier detection a reality.&lt;br /&gt;
        &amp;lt;/div&amp;gt;&lt;br /&gt;
    &amp;lt;/div&amp;gt;&lt;br /&gt;
&amp;lt;div style=&amp;quot;flex: 1; margin: 5px; min-width: 210px; border: 1px solid #CCC;	padding: 0 10px 10px 10px; box-shadow: 0 2px 2px rgba(0,0,0,0.1); background: #f5faff;&amp;quot;&amp;gt;&lt;br /&gt;
        &amp;lt;h3&amp;gt;[https://hive.biochemistry.gwu.edu/dna.cgi?cmd=main Glycoproteomics Characterization Workflow and Data-Analysis Pipeline for Vaccines and Biosimilars]&amp;lt;/h3&amp;gt;&lt;br /&gt;
	&amp;lt;div style=&amp;quot;border-top: 1px solid #CCC; padding-top: 0.5em;&amp;quot;&amp;gt;&lt;br /&gt;
In this FDA-funded project, we are extending High-performance Integrated Virtual Environment (HIVE) capabilities through the development and integration of software tools and datasets for comparative analysis of glycoproteins. Glycomic analysis has many angles and has been extensively reviewed in recent literature. We propose to rely on the independent development of the glycomics field and incorporate these approaches in the HIVE pipeline as they mature while we develop a standardized glycoinformatics pipeline that will benefit investigators and regulators at the FDA.&lt;br /&gt;
        &amp;lt;/div&amp;gt;&lt;br /&gt;
    &amp;lt;/div&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/div&amp;gt;&lt;/div&gt;</summary>
		<author><name>Jeetvora</name></author>
	</entry>
	<entry>
		<id>https://hivelab.biochemistry.gwu.edu/wiki/index.php?title=Projects&amp;diff=215</id>
		<title>Projects</title>
		<link rel="alternate" type="text/html" href="https://hivelab.biochemistry.gwu.edu/wiki/index.php?title=Projects&amp;diff=215"/>
		<updated>2024-12-10T21:13:24Z</updated>

		<summary type="html">&lt;p&gt;Jeetvora: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;{{DISPLAYTITLE:&amp;lt;span style=&amp;quot;position: absolute; clip: rect(1px 1px 1px 1px); clip: rect(1px, 1px, 1px, 1px);&amp;quot;&amp;gt;{{FULLPAGENAME}}&amp;lt;/span&amp;gt;}}&lt;br /&gt;
__NOTOC__&lt;br /&gt;
&amp;lt;!-- BANNER ACROSS TOP OF PAGE --&amp;gt;&lt;br /&gt;
&amp;lt;div id=&amp;quot;ggw-topbanner&amp;quot; style=&amp;quot;clear:both; position:relative; box-sizing:border-box; width:100%; margin:1.2em 0 6px; min-width:47em; border:1px solid #ddd; background-color:#f9f9f9; color:#000;&amp;quot;&amp;gt;&lt;br /&gt;
    &amp;lt;div style=&amp;quot;margin:0.4em; text-align:center;&amp;quot;&amp;gt;&lt;br /&gt;
        &amp;lt;div style=&amp;quot;font-size:160%; padding:.1em;&amp;quot;&amp;gt;Current Projects&amp;lt;/div&amp;gt;&lt;br /&gt;
        &amp;lt;/div&amp;gt;&lt;br /&gt;
    &amp;lt;/div&amp;gt;&lt;br /&gt;
&amp;lt;/div&amp;gt;&lt;br /&gt;
&amp;lt;div style=&amp;quot;clear: both;&amp;quot;&amp;gt;&amp;lt;/div&amp;gt;&lt;br /&gt;
 &lt;br /&gt;
&amp;lt;div id=&amp;quot;ggw_row2&amp;quot; style=&amp;quot;display: flex; flex-flow: row wrap; justify-content: space-between; padding: 0; margin: 0 -5px 0 -5px;&amp;quot;&amp;gt;&lt;br /&gt;
    &amp;lt;div style=&amp;quot;flex: 1; margin: 5px; min-width: 210px; border: 1px solid #CCC;	padding: 0 10px 10px 10px; box-shadow: 0 2px 2px rgba(0,0,0,0.1); background: #f5faff;&amp;quot;&amp;gt;&lt;br /&gt;
        &amp;lt;h3&amp;gt;[https://hive.biochemistry.gwu.edu/dna.cgi?cmd=main The High-performance Integrated Virtual Environment (HIVE) platform]&amp;lt;/h3&amp;gt;&lt;br /&gt;
        &amp;lt;div style=&amp;quot;border-top: 1px solid #CCC; padding-top: 0.5em;&amp;quot;&amp;gt;&lt;br /&gt;
HIVE is a cloud-based environment optimized for the storage and analysis of extra-large data, such as biomedical data, clinical data, next-generation sequencing (NGS) data, mass spectrometry files, confocal microscopy images, post-market surveillance data, medical recall data, and many others. HIVE provides secure web access for authorized users to deposit, retrieve, annotate and compute on Big Data, and analyze the outcomes using web user interfaces. [https://docs.google.com/document/d/1F5iq00uKkJfdSsbwanvKOy-nPnwijH56mwbwa_HhzfY/edit?tab=t.0#heading=h.7dlfmngwfzih More here].&lt;br /&gt;
        &amp;lt;/div&amp;gt;&lt;br /&gt;
    &amp;lt;/div&amp;gt;&lt;br /&gt;
	&lt;br /&gt;
    &amp;lt;div style=&amp;quot;flex: 1; margin: 5px; min-width: 210px; border: 1px solid #CCC;	padding: 0 10px 10px 10px; box-shadow: 0 2px 2px rgba(0,0,0,0.1); background: #f5faff;&amp;quot;&amp;gt;&lt;br /&gt;
        &amp;lt;h3&amp;gt;[https://www.biocomputeobject.org/ BioCompute Objects (BCO)]&amp;lt;/h3&amp;gt;&lt;br /&gt;
        &amp;lt;div style=&amp;quot;border-top: 1px solid #CCC; padding-top: 0.5em;&amp;quot;&amp;gt;&lt;br /&gt;
BioCompute is an FDA-funded project to establish a framework for community-based development of standards for harmonization of High-throughput Sequencing (HTS), standardization of data formats, promotion of interoperability, and bioinformatics verification protocols. The BioCompute Object (BCO) was developed in the High-throughput Sequencing Computational Standards for Regulatory Sciences (HTS-CSRS) initiative in the BioCompute Objects Portal (BOP), a web portal that serves as a collaborative ground for dialogue to facilitate interoperability between different bioinformatic pipelines, industries, and developers. HIVE capabilities have been leveraged to support the development of the BCO. The BCO is versatile and adaptable to other common HTS analysis platforms. [https://docs.google.com/document/d/1WQFZm_PFiQXob4NyOKq6y-2ywnbmNoFHSS27fYf3l4Y/edit?tab=t.0#heading=h.bs8eki17tykx More here].&lt;br /&gt;
&amp;lt;/div&amp;gt;&lt;br /&gt;
    &amp;lt;/div&amp;gt;&lt;br /&gt;
    &amp;lt;div style=&amp;quot;flex: 1; margin: 5px; min-width: 210px; border: 1px solid #CCC;	padding: 0 10px 10px 10px; box-shadow: 0 2px 2px rgba(0,0,0,0.1); background: #f5faff;&amp;quot;&amp;gt;&lt;br /&gt;
        &amp;lt;h3&amp;gt;[https://www.glygen.org/ GlyGen]&amp;lt;/h3&amp;gt;&lt;br /&gt;
	&amp;lt;div style=&amp;quot;border-top: 1px solid #CCC; padding-top: 0.5em;&amp;quot;&amp;gt;&lt;br /&gt;
GlyGen (gly-glycobiology; gen-information) is an advanced glycoinformatics resource developed to facilitate discovery in basic and translational glycobiology research and to enhance the integration of multidisciplinary information from diverse resources. GlyGen includes knowledge about molecular, biophysical, and functional properties of glycans, genes, and proteins organized in pathways and ontologies, plus a rapidly growing body of biological big data related to cancer mutation and expression. GlyGen adopts an innovative user-driven approach for implementing and prioritizing knowledge-dissemination tools to address the questions and needs of the glycobiology community.&lt;br /&gt;
        &amp;lt;/div&amp;gt;&lt;br /&gt;
    &amp;lt;/div&amp;gt;&lt;br /&gt;
&amp;lt;/div&amp;gt;&amp;lt;div id=&amp;quot;ggw_row3&amp;quot; style=&amp;quot;display: flex; flex-flow: row wrap; justify-content: space-between; padding: 0; margin: 0 -5px 0 -5px;&amp;quot;&amp;gt;&lt;br /&gt;
    &amp;lt;div style=&amp;quot;flex: 1; margin: 5px; min-width: 210px; border: 1px solid #CCC;	padding: 0 10px 10px 10px; box-shadow: 0 2px 2px rgba(0,0,0,0.1); background: #f5faff;&amp;quot;&amp;gt;&lt;br /&gt;
        &amp;lt;h3&amp;gt;[https://hivelab.biochemistry.gwu.edu/predictmod PredictMod]&amp;lt;/h3&amp;gt;&lt;br /&gt;
        &amp;lt;div style=&amp;quot;border-top: 1px solid #CCC; padding-top: 0.5em;&amp;quot;&amp;gt;&lt;br /&gt;
PredictMod is an application designed to predict the outcome of an intervention before a patient initiates treatment. Our goal is to provide clinicians with a powerful decision-making tool that enhances clinical understanding of patient-level data. The PredictMod platform uses machine learning tools and complex datasets based on electronic health records, gut microbiome, and -omics data to forecast patient outcomes, often in response to treatment for a particular condition. While our primary condition of interest is prediabetes, the tool is designed to be used for a variety of conditions, interventions, and data types.&lt;br /&gt;
&amp;lt;/div&amp;gt;&lt;br /&gt;
    &amp;lt;/div&amp;gt;&lt;br /&gt;
    &amp;lt;div style=&amp;quot;flex: 1; margin: 5px; min-width: 210px; border: 1px solid #CCC;	padding: 0 10px 10px 10px; box-shadow: 0 2px 2px rgba(0,0,0,0.1); background: #f5faff;&amp;quot;&amp;gt;&lt;br /&gt;
        &amp;lt;h3&amp;gt;[[GW-FEAST]]&amp;lt;/h3&amp;gt;&lt;br /&gt;
        &amp;lt;div style=&amp;quot;border-top: 1px solid #CCC; padding-top: 0.5em;&amp;quot;&amp;gt;&lt;br /&gt;
The GW Federated Ecosystems for Analytics and Standardized Technologies (GW-FEAST) project is part of the ARPA-H FEAST performer team initiative that includes academic and industry partners. The goal of the ARPA-H performer teams is “to create bridges across data silos to make health data more accessible and usable”.&lt;br /&gt;
&amp;lt;/div&amp;gt;&lt;br /&gt;
    &amp;lt;/div&amp;gt;&lt;br /&gt;
    &amp;lt;div style=&amp;quot;flex: 1; margin: 5px; min-width: 210px; border: 1px solid #CCC;	padding: 0 10px 10px 10px; box-shadow: 0 2px 2px rgba(0,0,0,0.1); background: #f5faff;&amp;quot;&amp;gt;&lt;br /&gt;
        &amp;lt;h3&amp;gt;[https://hivelab.tst.biochemistry.gwu.edu/biomarker-partnership Biomarker Partnership]&amp;lt;/h3&amp;gt;&lt;br /&gt;
        &amp;lt;div style=&amp;quot;border-top: 1px solid #CCC; padding-top: 0.5em;&amp;quot;&amp;gt;&lt;br /&gt;
The Biomarker Partnership is a CFDE-sponsored project to develop a knowledgebase that will organize and integrate biomarker data from different public sources. The data will be connected to contextual information to show a novel systems-level view of biomarkers. The motivation for this project is to improve the harmonization and organization of biomarker data. This will be done by mapping biomarkers from public sources to, and across, CF data elements. This mapping will bridge knowledge across multiple DCCs and biomedical disciplines.&lt;br /&gt;
&amp;lt;/div&amp;gt;&lt;br /&gt;
    &amp;lt;/div&amp;gt;&lt;br /&gt;
&amp;lt;/div&amp;gt;&lt;br /&gt;
{{DISPLAYTITLE:&amp;lt;span style=&amp;quot;position: absolute; clip: rect(1px 1px 1px 1px); clip: rect(1px, 1px, 1px, 1px);&amp;quot;&amp;gt;{{FULLPAGENAME}}&amp;lt;/span&amp;gt;}}&lt;br /&gt;
__NOTOC__&lt;br /&gt;
&amp;lt;!-- BANNER ACROSS TOP OF PAGE --&amp;gt;&lt;br /&gt;
&amp;lt;div id=&amp;quot;ggw-topbanner&amp;quot; style=&amp;quot;clear:both; position:relative; box-sizing:border-box; width:100%; margin:1.2em 0 6px; min-width:47em; border:1px solid #ddd; background-color:#f9f9f9; color:#000;&amp;quot;&amp;gt;&lt;br /&gt;
    &amp;lt;div style=&amp;quot;margin:0.4em; text-align:center;&amp;quot;&amp;gt;&lt;br /&gt;
        &amp;lt;div style=&amp;quot;font-size:160%; padding:.1em;&amp;quot;&amp;gt;Past Projects&amp;lt;/div&amp;gt;&lt;br /&gt;
        &amp;lt;/div&amp;gt;&lt;br /&gt;
    &amp;lt;/div&amp;gt;&lt;br /&gt;
&amp;lt;/div&amp;gt;&lt;br /&gt;
&amp;lt;div style=&amp;quot;clear: both;&amp;quot;&amp;gt;&amp;lt;/div&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;div id=&amp;quot;ggw_row2&amp;quot; style=&amp;quot;display: flex; flex-flow: row wrap; justify-content: space-between; padding: 0; margin: 0 -5px 0 -5px;&amp;quot;&amp;gt;&lt;br /&gt;
    &amp;lt;div style=&amp;quot;flex: 1; margin: 5px; min-width: 210px; border: 1px solid #CCC;	padding: 0 10px 10px 10px; box-shadow: 0 2px 2px rgba(0,0,0,0.1); background: #f5faff;&amp;quot;&amp;gt;&lt;br /&gt;
        &amp;lt;h3&amp;gt;[https://hivelab.tst.biochemistry.gwu.edu/gfkb Gut Microbiome Analytic System (Microbiome)]&amp;lt;/h3&amp;gt;&lt;br /&gt;
        &amp;lt;div style=&amp;quot;border-top: 1px solid #CCC; padding-top: 0.5em;&amp;quot;&amp;gt;&lt;br /&gt;
The HIVE team received NSF funding to develop a Gut Microbiome Monitoring System (GutFeeling), a tool that, when used over time, will allow users to adjust their dietary habits (such as consumption of probiotics and prebiotics) and other lifestyle habits to help restore their normal microbiome. A major bottleneck, the rapid analysis of large amounts of metagenomic data, has been resolved by our group through the development of a novel algorithm and accompanying software called CensuScope. Through analysis of healthy gut microbiome data, we are actively developing a Knowledge Base (GutFeelingKB) not only to provide a clearer picture of an ideal personalized microbiome but also to establish baseline characteristics for each customer. The Mazumder Lab is collaborating with the Milken School of Public Health and Kamtek Sequencing Facility to investigate the relationship between bacterial species commonly present in the digestive tract, diet, physical activity, lifestyle habits, and metabolic risk factors. [https://docs.google.com/document/d/18WyVTJrrf-FR0sHt634vO8Lwel-4OQxP9sNar7gYYro/edit?tab=t.0#heading=h.7qbm3f7lky31 More here].&lt;br /&gt;
        &amp;lt;/div&amp;gt;&lt;br /&gt;
    &amp;lt;/div&amp;gt;&lt;br /&gt;
	&lt;br /&gt;
    &amp;lt;div style=&amp;quot;flex: 1; margin: 5px; min-width: 210px; border: 1px solid #CCC;	padding: 0 10px 10px 10px; box-shadow: 0 2px 2px rgba(0,0,0,0.1); background: #f5faff;&amp;quot;&amp;gt;&lt;br /&gt;
        &amp;lt;h3&amp;gt;HIVE-EQAPOL Project on HIVE NGS Data Processing and Analysis&amp;lt;/h3&amp;gt;&lt;br /&gt;
        &amp;lt;div style=&amp;quot;border-top: 1px solid #CCC; padding-top: 0.5em;&amp;quot;&amp;gt;&lt;br /&gt;
For this project, our group works closely with the External Quality Assurance Program Oversight Laboratory (EQAPOL) team to analyze, store, and track HIV NGS data. Reliable identification of strains is critical for developing new assays, validating assay platforms, assisting regulators in evaluating test kits, monitoring HIV drug resistance, and informing vaccine development. The HIVE tools and platform are used for virus identification, recombination analysis, and clone discovery.&lt;br /&gt;
&amp;lt;/div&amp;gt;&lt;br /&gt;
    &amp;lt;/div&amp;gt;&lt;br /&gt;
    &amp;lt;div style=&amp;quot;flex: 1; margin: 5px; min-width: 210px; border: 1px solid #CCC;	padding: 0 10px 10px 10px; box-shadow: 0 2px 2px rgba(0,0,0,0.1); background: #f5faff;&amp;quot;&amp;gt;&lt;br /&gt;
        &amp;lt;h3&amp;gt;[https://www.oncomx.org/ OncoMX]&amp;lt;/h3&amp;gt;&lt;br /&gt;
	&amp;lt;div style=&amp;quot;border-top: 1px solid #CCC; padding-top: 0.5em;&amp;quot;&amp;gt;&lt;br /&gt;
The OncoMX mission is to create an integrated cancer mutation and expression resource for exploring cancer biomarkers. OncoMX is a collaboration between the George Washington University (GW), NASA&#039;s Jet Propulsion Laboratory (JPL), the Swiss Institute of Bioinformatics (SIB), and the University of Delaware (UD). The core knowledgebase of OncoMX is derived from BioMuta and BioXpress integrated cancer mutation and expression databases. Normal expression data from Bgee and custom text mining software augment the cancer data to improve functional interpretation of the reported variants and expression profiles. All data are wrapped into the OncoMX database and web portal, mapped to additional functional information from NCI Early Detection Research Network (EDRN) and Reactome. It is expected that the large-scale integration of cancer data and supporting information, provided by OncoMX with direct community feedback, will benefit cancer research by improving synthesis of information and may make earlier detection a reality.&lt;br /&gt;
        &amp;lt;/div&amp;gt;&lt;br /&gt;
    &amp;lt;/div&amp;gt;&lt;br /&gt;
&amp;lt;div style=&amp;quot;flex: 1; margin: 5px; min-width: 210px; border: 1px solid #CCC;	padding: 0 10px 10px 10px; box-shadow: 0 2px 2px rgba(0,0,0,0.1); background: #f5faff;&amp;quot;&amp;gt;&lt;br /&gt;
        &amp;lt;h3&amp;gt;[https://hive.biochemistry.gwu.edu/dna.cgi?cmd=main Glycoproteomics Characterization Workflow and Data-Analysis Pipeline for Vaccines and Biosimilars]&amp;lt;/h3&amp;gt;&lt;br /&gt;
	&amp;lt;div style=&amp;quot;border-top: 1px solid #CCC; padding-top: 0.5em;&amp;quot;&amp;gt;&lt;br /&gt;
In this FDA-funded project, we are extending High-performance Integrated Virtual Environment (HIVE) capabilities through the development and integration of software tools and datasets for comparative analysis of glycoproteins. Glycomic analysis has many angles and has been extensively reviewed in recent literature. We propose to rely on the independent development of the glycomics field and incorporate these approaches in the HIVE pipeline as they mature while we develop a standardized glycoinformatics pipeline that will benefit investigators and regulators at the FDA.&lt;br /&gt;
        &amp;lt;/div&amp;gt;&lt;br /&gt;
    &amp;lt;/div&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/div&amp;gt;&lt;/div&gt;</summary>
		<author><name>Jeetvora</name></author>
	</entry>
	<entry>
		<id>https://hivelab.biochemistry.gwu.edu/wiki/index.php?title=Projects&amp;diff=214</id>
		<title>Projects</title>
		<link rel="alternate" type="text/html" href="https://hivelab.biochemistry.gwu.edu/wiki/index.php?title=Projects&amp;diff=214"/>
		<updated>2024-12-10T21:12:52Z</updated>

		<summary type="html">&lt;p&gt;Jeetvora: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;{{DISPLAYTITLE:&amp;lt;span style=&amp;quot;position: absolute; clip: rect(1px 1px 1px 1px); clip: rect(1px, 1px, 1px, 1px);&amp;quot;&amp;gt;{{FULLPAGENAME}}&amp;lt;/span&amp;gt;}}&lt;br /&gt;
__NOTOC__&lt;br /&gt;
&amp;lt;!-- BANNER ACROSS TOP OF PAGE --&amp;gt;&lt;br /&gt;
&amp;lt;div id=&amp;quot;ggw-topbanner&amp;quot; style=&amp;quot;clear:both; position:relative; box-sizing:border-box; width:100%; margin:1.2em 0 6px; min-width:47em; border:1px solid #ddd; background-color:#f9f9f9; color:#000;&amp;quot;&amp;gt;&lt;br /&gt;
    &amp;lt;div style=&amp;quot;margin:0.4em; text-align:center;&amp;quot;&amp;gt;&lt;br /&gt;
        &amp;lt;div style=&amp;quot;font-size:160%; padding:.1em;&amp;quot;&amp;gt;Current Projects&amp;lt;/div&amp;gt;&lt;br /&gt;
        &amp;lt;/div&amp;gt;&lt;br /&gt;
    &amp;lt;/div&amp;gt;&lt;br /&gt;
&amp;lt;/div&amp;gt;&lt;br /&gt;
&amp;lt;div style=&amp;quot;clear: both;&amp;quot;&amp;gt;&amp;lt;/div&amp;gt;&lt;br /&gt;
 &lt;br /&gt;
&amp;lt;div id=&amp;quot;ggw_row2&amp;quot; style=&amp;quot;display: flex; flex-flow: row wrap; justify-content: space-between; padding: 0; margin: 0 -5px 0 -5px;&amp;quot;&amp;gt;&lt;br /&gt;
    &amp;lt;div style=&amp;quot;flex: 1; margin: 5px; min-width: 210px; border: 1px solid #CCC;	padding: 0 10px 10px 10px; box-shadow: 0 2px 2px rgba(0,0,0,0.1); background: #f5faff;&amp;quot;&amp;gt;&lt;br /&gt;
        &amp;lt;h3&amp;gt;[https://hive.biochemistry.gwu.edu/dna.cgi?cmd=main The High-performance Integrated Virtual Environment (HIVE) platform]&amp;lt;/h3&amp;gt;&lt;br /&gt;
        &amp;lt;div style=&amp;quot;border-top: 1px solid #CCC; padding-top: 0.5em;&amp;quot;&amp;gt;&lt;br /&gt;
HIVE is a cloud-based environment optimized for the storage and analysis of extra-large data, such as biomedical data, clinical data, next-generation sequencing (NGS) data, mass spectrometry files, confocal microscopy images, post-market surveillance data, medical recall data, and many others. HIVE provides secure web access for authorized users to deposit, retrieve, annotate and compute on Big Data, and analyze the outcomes using web user interfaces. [https://docs.google.com/document/d/1F5iq00uKkJfdSsbwanvKOy-nPnwijH56mwbwa_HhzfY/edit?tab=t.0#heading=h.7dlfmngwfzih More here].&lt;br /&gt;
        &amp;lt;/div&amp;gt;&lt;br /&gt;
    &amp;lt;/div&amp;gt;&lt;br /&gt;
	&lt;br /&gt;
    &amp;lt;div style=&amp;quot;flex: 1; margin: 5px; min-width: 210px; border: 1px solid #CCC;	padding: 0 10px 10px 10px; box-shadow: 0 2px 2px rgba(0,0,0,0.1); background: #f5faff;&amp;quot;&amp;gt;&lt;br /&gt;
        &amp;lt;h3&amp;gt;[https://www.biocomputeobject.org/ BioCompute Objects (BCO)]&amp;lt;/h3&amp;gt;&lt;br /&gt;
        &amp;lt;div style=&amp;quot;border-top: 1px solid #CCC; padding-top: 0.5em;&amp;quot;&amp;gt;&lt;br /&gt;
BioCompute is an FDA-funded project to establish a framework for community-based development of standards for harmonization of High-throughput Sequencing (HTS), standardization of data formats, promotion of interoperability, and bioinformatics verification protocols. The BioCompute Object (BCO) was developed in the High-throughput Sequencing Computational Standards for Regulatory Sciences (HTS-CSRS) initiative in the BioCompute Objects Portal (BOP), a web portal that serves as a collaborative ground to encourage dialogue and facilitate interoperability between different bioinformatic pipelines, industries, and developers. HIVE capabilities have been leveraged to support the development of the BCO. The BCO is versatile and adaptable to other common HTS analysis platforms. [https://docs.google.com/document/d/1WQFZm_PFiQXob4NyOKq6y-2ywnbmNoFHSS27fYf3l4Y/edit?tab=t.0#heading=h.bs8eki17tykx More here].&lt;br /&gt;
&amp;lt;/div&amp;gt;&lt;br /&gt;
    &amp;lt;/div&amp;gt;&lt;br /&gt;
    &amp;lt;div style=&amp;quot;flex: 1; margin: 5px; min-width: 210px; border: 1px solid #CCC;	padding: 0 10px 10px 10px; box-shadow: 0 2px 2px rgba(0,0,0,0.1); background: #f5faff;&amp;quot;&amp;gt;&lt;br /&gt;
        &amp;lt;h3&amp;gt;[https://www.glygen.org/ GlyGen]&amp;lt;/h3&amp;gt;&lt;br /&gt;
	&amp;lt;div style=&amp;quot;border-top: 1px solid #CCC; padding-top: 0.5em;&amp;quot;&amp;gt;&lt;br /&gt;
GlyGen (gly-glycobiology; gen-information) is an advanced glycoinformatics resource developed to facilitate discovery in basic and translational glycobiology research and to enhance the integration of multidisciplinary information from diverse resources. GlyGen includes knowledge about molecular, biophysical, and functional properties of glycans, genes, and proteins organized in pathways and ontologies, plus a rapidly growing body of biological big data related to cancer mutation and expression. GlyGen adopts an innovative user-driven approach for implementing and prioritizing knowledge-dissemination tools to address the questions and needs of the glycobiology community.&lt;br /&gt;
        &amp;lt;/div&amp;gt;&lt;br /&gt;
    &amp;lt;/div&amp;gt;&lt;br /&gt;
&amp;lt;/div&amp;gt;&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;/div&amp;gt;&lt;br /&gt;
&amp;lt;/div&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;div id=&amp;quot;ggw_row3&amp;quot; style=&amp;quot;display: flex; flex-flow: row wrap; justify-content: space-between; padding: 0; margin: 0 -5px 0 -5px;&amp;quot;&amp;gt;&lt;br /&gt;
    &amp;lt;div style=&amp;quot;flex: 1; margin: 5px; min-width: 210px; border: 1px solid #CCC;	padding: 0 10px 10px 10px; box-shadow: 0 2px 2px rgba(0,0,0,0.1); background: #f5faff;&amp;quot;&amp;gt;&lt;br /&gt;
        &amp;lt;h3&amp;gt;[https://hivelab.biochemistry.gwu.edu/predictmod PredictMod]&amp;lt;/h3&amp;gt;&lt;br /&gt;
        &amp;lt;div style=&amp;quot;border-top: 1px solid #CCC; padding-top: 0.5em;&amp;quot;&amp;gt;&lt;br /&gt;
PredictMod is an application designed to predict the outcome of an intervention before a patient initiates treatment. Our goal is to provide clinicians with a powerful decision-making tool that enhances clinical understanding of patient-level data. The PredictMod platform uses machine learning tools and complex datasets based on electronic health records, gut microbiome, and -omics data to forecast patient outcomes, often in response to treatment for a particular condition. While our primary condition of interest is prediabetes, the tool is designed to be used for a variety of conditions, interventions, and data types.&lt;br /&gt;
&amp;lt;/div&amp;gt;&lt;br /&gt;
    &amp;lt;/div&amp;gt;&lt;br /&gt;
    &amp;lt;div style=&amp;quot;flex: 1; margin: 5px; min-width: 210px; border: 1px solid #CCC;	padding: 0 10px 10px 10px; box-shadow: 0 2px 2px rgba(0,0,0,0.1); background: #f5faff;&amp;quot;&amp;gt;&lt;br /&gt;
        &amp;lt;h3&amp;gt;[[GW-FEAST]]&amp;lt;/h3&amp;gt;&lt;br /&gt;
        &amp;lt;div style=&amp;quot;border-top: 1px solid #CCC; padding-top: 0.5em;&amp;quot;&amp;gt;&lt;br /&gt;
The GW Federated Ecosystems for Analytics and Standardized Technologies (GW-FEAST) project is part of the ARPA-H FEAST performer team initiative that includes academic and industry partners. The goal of the ARPA-H performer teams is “to create bridges across data silos to make health data more accessible and usable”.&lt;br /&gt;
&amp;lt;/div&amp;gt;&lt;br /&gt;
    &amp;lt;/div&amp;gt;&lt;br /&gt;
    &amp;lt;div style=&amp;quot;flex: 1; margin: 5px; min-width: 210px; border: 1px solid #CCC;	padding: 0 10px 10px 10px; box-shadow: 0 2px 2px rgba(0,0,0,0.1); background: #f5faff;&amp;quot;&amp;gt;&lt;br /&gt;
        &amp;lt;h3&amp;gt;[https://hivelab.tst.biochemistry.gwu.edu/biomarker-partnership Biomarker Partnership]&amp;lt;/h3&amp;gt;&lt;br /&gt;
        &amp;lt;div style=&amp;quot;border-top: 1px solid #CCC; padding-top: 0.5em;&amp;quot;&amp;gt;&lt;br /&gt;
The Biomarker Partnership is a CFDE-sponsored project to develop a knowledgebase that will organize and integrate biomarker data from different public sources. The data will be connected to contextual information to show a novel systems-level view of biomarkers. The motivation for this project is to improve the harmonization and organization of biomarker data. This will be done by mapping biomarkers from public sources to, and across, CF data elements. This mapping will bridge knowledge across multiple DCCs and biomedical disciplines.&lt;br /&gt;
&amp;lt;/div&amp;gt;&lt;br /&gt;
    &amp;lt;/div&amp;gt;&lt;br /&gt;
&amp;lt;/div&amp;gt;&lt;br /&gt;
&amp;lt;!-- BANNER ACROSS TOP OF PAGE --&amp;gt;&lt;br /&gt;
&amp;lt;div id=&amp;quot;ggw-topbanner&amp;quot; style=&amp;quot;clear:both; position:relative; box-sizing:border-box; width:100%; margin:1.2em 0 6px; min-width:47em; border:1px solid #ddd; background-color:#f9f9f9; color:#000;&amp;quot;&amp;gt;&lt;br /&gt;
    &amp;lt;div style=&amp;quot;margin:0.4em; text-align:center;&amp;quot;&amp;gt;&lt;br /&gt;
        &amp;lt;div style=&amp;quot;font-size:160%; padding:.1em;&amp;quot;&amp;gt;Past Projects&amp;lt;/div&amp;gt;&lt;br /&gt;
        &amp;lt;/div&amp;gt;&lt;br /&gt;
    &amp;lt;/div&amp;gt;&lt;br /&gt;
&amp;lt;/div&amp;gt;&lt;br /&gt;
&amp;lt;div style=&amp;quot;clear: both;&amp;quot;&amp;gt;&amp;lt;/div&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;div id=&amp;quot;ggw_row2&amp;quot; style=&amp;quot;display: flex; flex-flow: row wrap; justify-content: space-between; padding: 0; margin: 0 -5px 0 -5px;&amp;quot;&amp;gt;&lt;br /&gt;
    &amp;lt;div style=&amp;quot;flex: 1; margin: 5px; min-width: 210px; border: 1px solid #CCC;	padding: 0 10px 10px 10px; box-shadow: 0 2px 2px rgba(0,0,0,0.1); background: #f5faff;&amp;quot;&amp;gt;&lt;br /&gt;
        &amp;lt;h3&amp;gt;[https://hivelab.tst.biochemistry.gwu.edu/gfkb Gut Microbiome Analytic System (Microbiome)]&amp;lt;/h3&amp;gt;&lt;br /&gt;
        &amp;lt;div style=&amp;quot;border-top: 1px solid #CCC; padding-top: 0.5em;&amp;quot;&amp;gt;&lt;br /&gt;
The HIVE team received NSF funding to develop a Gut Microbiome Monitoring System (GutFeeling), a tool that, when used over time, will allow users to adjust their dietary habits (such as consumption of probiotics and prebiotics) and other lifestyle habits to help restore their normal microbiome. A major bottleneck, the rapid analysis of large amounts of metagenomic data, has been resolved by our group through the development of a novel algorithm and accompanying software called CensuScope. Through analysis of healthy gut microbiome data, we are actively developing a Knowledge Base (GutFeelingKB) not only to provide a clearer picture of an ideal personalized microbiome but also to establish baseline characteristics for each customer. The Mazumder Lab is collaborating with the Milken School of Public Health and Kamtek Sequencing Facility to investigate the relationship between bacterial species commonly present in the digestive tract, diet, physical activity, lifestyle habits, and metabolic risk factors. [https://docs.google.com/document/d/18WyVTJrrf-FR0sHt634vO8Lwel-4OQxP9sNar7gYYro/edit?tab=t.0#heading=h.7qbm3f7lky31 More here].&lt;br /&gt;
        &amp;lt;/div&amp;gt;&lt;br /&gt;
    &amp;lt;/div&amp;gt;&lt;br /&gt;
	&lt;br /&gt;
    &amp;lt;div style=&amp;quot;flex: 1; margin: 5px; min-width: 210px; border: 1px solid #CCC;	padding: 0 10px 10px 10px; box-shadow: 0 2px 2px rgba(0,0,0,0.1); background: #f5faff;&amp;quot;&amp;gt;&lt;br /&gt;
        &amp;lt;h3&amp;gt;HIVE-EQAPOL Project on HIVE NGS Data Processing and Analysis&amp;lt;/h3&amp;gt;&lt;br /&gt;
        &amp;lt;div style=&amp;quot;border-top: 1px solid #CCC; padding-top: 0.5em;&amp;quot;&amp;gt;&lt;br /&gt;
For this project, our group works closely with the External Quality Assurance Program Oversight Laboratory (EQAPOL) team to analyze, store, and track HIV NGS data. Reliable identification of strains is critical for developing new assays, validating assay platforms, assisting regulators in evaluating test kits, monitoring HIV drug resistance, and informing vaccine development. The HIVE tools and platform are used for virus identification, recombination analysis, and clone discovery.&lt;br /&gt;
&amp;lt;/div&amp;gt;&lt;br /&gt;
    &amp;lt;/div&amp;gt;&lt;br /&gt;
    &amp;lt;div style=&amp;quot;flex: 1; margin: 5px; min-width: 210px; border: 1px solid #CCC;	padding: 0 10px 10px 10px; box-shadow: 0 2px 2px rgba(0,0,0,0.1); background: #f5faff;&amp;quot;&amp;gt;&lt;br /&gt;
        &amp;lt;h3&amp;gt;[https://www.oncomx.org/ OncoMX]&amp;lt;/h3&amp;gt;&lt;br /&gt;
	&amp;lt;div style=&amp;quot;border-top: 1px solid #CCC; padding-top: 0.5em;&amp;quot;&amp;gt;&lt;br /&gt;
The OncoMX mission is to create an integrated cancer mutation and expression resource for exploring cancer biomarkers. OncoMX is a collaboration between the George Washington University (GW), NASA&#039;s Jet Propulsion Laboratory (JPL), the Swiss Institute of Bioinformatics (SIB), and the University of Delaware (UD). The core knowledgebase of OncoMX is derived from BioMuta and BioXpress integrated cancer mutation and expression databases. Normal expression data from Bgee and custom text mining software augment the cancer data to improve functional interpretation of the reported variants and expression profiles. All data are wrapped into the OncoMX database and web portal, mapped to additional functional information from NCI Early Detection Research Network (EDRN) and Reactome. It is expected that the large-scale integration of cancer data and supporting information, provided by OncoMX with direct community feedback, will benefit cancer research by improving synthesis of information and may make earlier detection a reality.&lt;br /&gt;
        &amp;lt;/div&amp;gt;&lt;br /&gt;
    &amp;lt;/div&amp;gt;&lt;br /&gt;
&amp;lt;div style=&amp;quot;flex: 1; margin: 5px; min-width: 210px; border: 1px solid #CCC;	padding: 0 10px 10px 10px; box-shadow: 0 2px 2px rgba(0,0,0,0.1); background: #f5faff;&amp;quot;&amp;gt;&lt;br /&gt;
        &amp;lt;h3&amp;gt;[https://hive.biochemistry.gwu.edu/dna.cgi?cmd=main Glycoproteomics Characterization Workflow and Data-Analysis Pipeline for Vaccines and Biosimilars]&amp;lt;/h3&amp;gt;&lt;br /&gt;
	&amp;lt;div style=&amp;quot;border-top: 1px solid #CCC; padding-top: 0.5em;&amp;quot;&amp;gt;&lt;br /&gt;
In this FDA-funded project, we are extending High-performance Integrated Virtual Environment (HIVE) capabilities through the development and integration of software tools and datasets for comparative analysis of glycoproteins. Glycomic analysis has many angles and has been extensively reviewed in recent literature. We propose to rely on the independent development of the glycomics field and incorporate these approaches in the HIVE pipeline as they mature while we develop a standardized glycoinformatics pipeline that will benefit investigators and regulators at the FDA.&lt;br /&gt;
        &amp;lt;/div&amp;gt;&lt;br /&gt;
    &amp;lt;/div&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/div&amp;gt;&lt;/div&gt;</summary>
		<author><name>Jeetvora</name></author>
	</entry>
</feed>