HIVE Lab - User contributions [en]

https://hivelab.biochemistry.gwu.edu/wiki/api.php?action=feedcontributions&feedformat=atom&user=Vishal.bakshi HIVE Lab - User contributions [en] 2026-05-28T00:32:48Z User contributions MediaWiki 1.42.1 https://hivelab.biochemistry.gwu.edu/wiki/index.php?title=Volunteership_2025&diff=836 Volunteership 2025 2025-05-09T14:30:42Z

<p>Vishal.bakshi: </p> <hr /> <div><h2>2025 Volunteer Program Details</h2><br /> <br /> <h3>Dates</h3><br /> <strong>Volunteer Zoom Kick-Off Meeting</strong><br><br /> May 26, 2025 | 3:30 to 4:30 PM<br /> <br /> <strong>Program Dates: June 2nd, 2025 – July 25th, 2025</strong> (8 weeks)<br><br /> Monday to Friday | Remote | No breaks<br /> <br /> <hr><br /> <br /> <h3>Volunteer Expectations</h3><br /> <ol><br /> <li>Daily progress updates via Slack (scrum).</li><br /> <li>Regular Zoom meetings with the assigned project point of contact.</li><li>Expected to dedicate 5–6 hours per day to project work, with the remaining time focused on skill development or reading. </li><br /> </ol><br /> <p style="color: red;"><strong>Important:</strong> If the scrum is not updated for 2 consecutive days, the candidate will be <u>automatically dropped</u> from the program.</p><br /> <hr><br /> <br /> <h3>Potential Projects</h3><br /> <ol><br /> <li>BiomarkerKB ([https://biomarkerkb.org biomarkerkb.org]) project: Biomarker curation project. Involves reading papers and collecting biomarkers.</li><br /> <li>GlyGen ([https://glygen.org glygen.org]) project: Review glycomics and glycoproteomics data and curate tissue, disease, and other related information. </li><li>ARGOS ([https://argosdb.org argosdb.org]) project: Analyze genomics data using HIVE to identify reference genome assemblies. </li><li>PredictMod ([https://hivelab.biochemistry.gwu.edu/predictmod hivelab.biochemistry.gwu.edu/predictmod]) project. Identifying datasets and harmonizing them so that they can be used to generate ML models. </li></ol>''Note: Individuals involved in the above projects with a background in programming and/or machine learning may also undertake additional tasks to support the development of ML models, which can be integrated into PredictMod or used to enhance AI/ML-ready datasets within GlyGen.''<hr><br /> <br /> <h4>1. 
BiomarkerKB Biocuration Project Ideas</h4>POC: Daniall Masood, Maria Kim<br /> # Curate biomarkers for a specific disease (Alzheimer's)<br /> ## The student would do manual curation for about 4 weeks, with regular check-ins with the project POC to ensure it is being done correctly.<br /> ## The next 4 weeks can be dedicated to developing an LLM or an automated process to extract biomarker details, using the data collected in the first 4 weeks as training or example data.<br /> # Top 50 biomarkers<br /> ## Curate the top 50 biomarkers for biomarkerkb.org.<br /> ## Define what constitutes a top 50 biomarker.<br /> ## Begin curating biomarkers from different sources and papers by collecting fields mentioned in the data model, as well as collecting cross-references.<br /> # Biocuration of biomarkers from NLP/LLM work<br /> ## Use the biomarkers collected from NLP work.<br /> ## Curate biomarkers. The data provided is not in the biomarker data model format.<br /> ## While curating the biomarkers, check if data collected from NLP is correct.<br /> ## After completion, the student can start using curated data to work on the NLP/LLM method.<br /> # Curate biomarkers for a treatment<br /> ## See #1 above.<br /> <br /> If you have any other ideas, diseases, treatments, or methods you want to focus on, please reach out to daniallmasood@gwu.edu to discuss your idea and check whether it is feasible as a project for the summer.<br /> <br /> ==== 2. GlyGen Biocuration Project Ideas ====<br /> POC: Rene Ranzinger and Urnisha Bhuiyan<br /> <br /> Over the last three decades, numerous glycomics database projects have been initiated to collect valuable information about glycans, proteins, and their interactions. Some of these databases have been discontinued due to the end of project funding. However, the data within these databases remains highly valuable to the community. 
Integrating these datasets into modern databases or knowledgebases, such as GlyGen, presents a challenge because many of the valuable metadata annotations (e.g., species, tissue, disease, cell line) are free-text terms that do not align with the standard dictionaries and ontologies used in modern resources. Automated matching of this information with dictionaries or ontologies is often not possible due to the use of synonyms, spelling errors, or abbreviations. For example, "human," "man," and "h. sapiens" all map to the scientific species name "Homo sapiens."<br /> <br /> The GlyGen project aims to make datasets from two older databases (CarbBank, CFG) accessible by migrating the data and metadata into our database. For this project, we are seeking curators with a medical or biology background who are interested in helping map metadata terms from these old databases to standard dictionaries and ontologies.<br /> <br /> The project involves:<br /> <br /> # Using internet resources (e.g., Google, Wikipedia) to identify terms used in the old database.<br /> # Mapping identified terms to corresponding dictionaries and ontologies using the webpages and search interfaces of these projects.<br /> # Finding papers based on titles and author lists that may contain spelling errors.<br /> # Discussing with other curators when terms are mapped differently.<br /> <br /> If you have any other ideas or methods you would like to focus on, please reach out to rene@ccrc.uga.edu to discuss them.<br /> <br /> '''3. GlyGen Publication Analysis Project Ideas'''<br /> <br /> POC: Rene Ranzinger and Urnisha Bhuiyan<br /> <br /> One of the challenges for any bioinformatics project is understanding the size of its community, how well the project serves this community, and how widely its software/database is used. A potential solution is to analyze PubMed publication data. 
We are seeking applicants with programming skills (in Python or Java) to perform this analysis.<br /> <br /> The project involves:<br /> <br /> # Using the PubMed web API to filter publications based on keywords.<br /> # Analyzing paper abstracts to identify research institutions and groups that form the community.<br /> # Filtering the community list to exclude unrelated co-authors.<br /> <br /> A subproject will involve analyzing the full text of papers (when available) for keywords or resource and database names. The results of the analysis will be discussed with a GlyGen project member, who will suggest changes and improvements to the analysis and data presentation. Source code developed as part of this project will be documented and shared in a public GitHub repository. If you have any other ideas or methods you would like to explore, please reach out to rene@ccrc.uga.edu to discuss them.<br /> <br /> ==== 4. PredictMod Machine Learning Project Ideas ====<br /> POC: Lori Krammer<br /> <br /> Data Identification & Harmonization: <br /> <br /> # Identify publicly available datasets from scientific literature that can be used for intervention outcome prediction models.<br /> # Conduct data harmonization and pre-processing following established project pipelines to produce an ML-ready dataset and data dictionary.<br /> <br /> Modeling & Integration (for those with experience in programming/ML):<br /> <br /> # Perform model training and document the ML pipeline in a BioCompute Object (BCO). <br /> # Integrate the model into the PredictMod platform.<br /> <br /> Individuals with a background or interest in machine learning should reach out to lorikrammer@gwu.edu with a potential dataset to determine if it is a feasible project for the summer.<br /> <br /> '''5. 
FDA-ARGOS Computation and Pathogen Curation Project'''<br /> <br /> POC: Christie Woodside<br /> <br /> # Update data tables for more efficient computations<br /> ## Student would review and input additional data and IDs in the tables/sheets used to perform computations. This is manual but very important work that requires close attention to detail. ~1 week's worth of work.<br /> ## Requires Python/shell coding background. Student would run scripts that prepare and format data tables that are pushed to data.argosdb.org. Coding knowledge is needed in case of errors, bugs, or other mishaps in the code. Ongoing work as computations are performed.<br /> # Curate and report on current pathogens to upload to ARGOS<br /> ## Student would work on manual curation of circulating pathogens to be added to data.argosdb.org, with regular check-ins and reports of what was found. ~4-10 weeks' worth of work.<br /> ## Locate assembly IDs, reads, and metagenomic information for these pathogens to be used in computations and deposited into data.argosdb.org.<br /> ## Provide documentation on why they were curated, why they are important, how they were selected, and how data was collected.<br /> <br /> If you have any other ideas or methods you want to focus on, please reach out to christie.woodside@email.gwu.edu to discuss your idea and check whether it is feasible as a project for the summer.<hr><br /> <h3>Requirements for Completion</h3><br /> <p><strong>Note:</strong> The following are <u>mandatory</u>. 
Failure to complete any of them will result in an incomplete volunteer record.</p><br /> <br /> <h4>Documentation</h4><br /> <p>All volunteers must maintain adequate documentation of their work, including written protocols and scripts submitted to GitHub.</p><br /> <br /> <h4>Written Report</h4><br /> <p>Submit a 1–2 page summary of your tasks and accomplishments to the Admin Team during the final week of your program.</p><br /> <br /> <h4>Presentation & Slide Submission</h4><br /> <p>Present your work during the last week of the 8-week period.</p><br /> <p>Slides must be submitted to the Admin Team and should include:</p><br /> <ul><br /> <li>A title slide with your name, date, and mentor</li><br /> <li>At least 3 content slides</li><br /> <li>A final slide with acknowledgements or references</li><br /> </ul><br /> Contact the Admin Team to access previously submitted slides.<br /> <hr><br /> <br /> === Completion Certificate ===<br /> A certificate of completion and a letter of recommendation will be provided to all participants who successfully complete the program.<br /> <hr><br /> === Contact ===<br /> mazumder_lab@gwu.edu.<br /> <hr><br /> === Volunteers ===<br /> {| class="wikitable"<br /> |+<br /> |-<br /> ! Name<br /> !Project<br /> !Projects of Interest<br /> |-<br /> | [https://www.linkedin.com/in/gracesjchong/ Grace Chong]<br /> |PredictMod<br /> |<br /> # PredictMod<br /> # BiomarkerKB Biocuration<br /> # GlyGen Biocuration<br /> |-<br /> |[https://www.linkedin.com/in/alma-ogunsina-4959072b1/ Alma Ogunsina]<br /> |Biomarker curation<br /> |<br /> # BiomarkerKB<br /> # ARGOS<br /> # PredictMod<br /> |-<br /> |[https://www.linkedin.com/in/diya-kamalabharathy-62557935a/ Diya Kamalabharathy]<br /> |PredictMod<br /> |<br /> # BiomarkerKB Biocuration<br /> # PredictMod Machine Learning<br /> # GlyGen Biocuration<br /> |-<br /> |[https://www.linkedin.com/in/harivinay-prasad-reddy-gujjula-a06ba71bb/ Harivinay P. 
Gujjula]<br /> |GlyGen curation<br /> |<br /> # GlyGen Biocuration<br /> # BiomarkerKB Biocuration<br /> |-<br /> |[https://www.linkedin.com/in/miao-wang-88b602290/ Miao Wang]<br /> |ARGOS<br /> |<br /> # BiomarkerKB Biocuration Project Ideas<br /> # FDA-ARGOS Computation and Pathogen Curation Project<br /> # PredictMod Machine Learning Project Ideas<br /> |-<br /> |[https://www.linkedin.com/in/nahom-gebreselassie-1545ab336/ Nahom Abel]<br /> |GlyGen curation<br /> |<br /> # BiomarkerKB Biocuration<br /> # GlyGen Biocuration<br /> # PredictMod<br /> |-<br /> |[https://www.linkedin.com/in/kajal-patel-cs/ Kajal Sanjaykumar Patel]<br /> |GlyGen and PubMed project<br /> |<br /> # PredictMod<br /> # BiomarkerKB<br /> # GlyGen<br /> |-<br /> |[https://www.linkedin.com/in/john-mccaffrey-b8850930a/ John McCaffrey]<br /> |Biomarker curation<br /> |<br /> # PredictMod<br /> # BiomarkerKB<br /> # GlyGen Biocuration <br /> |-<br /> |[https://www.linkedin.com/in/nathan-ressom/ Nathan Ressom]<br /> |ARGOS<br /> |<br /> # PredictMod <br /> # GlyGen Biocuration<br /> # BiomarkerKB Biocuration<br /> |-<br /> |[https://www.linkedin.com/in/aaron-ressom/ Aaron Ressom] <br /> |PredictMod<br /> |<br /> # BiomarkerKB <br /> # PredictMod <br /> # GlyGen<br /> |-<br /> |[https://www.linkedin.com/in/akale-kinfe/ Akale Kinfe]<br /> |<br /> |<br /> # BiomarkerKB Biocuration<br /> # GlyGen Biocuration<br /> # ARGOS<br /> |-<br /> |Aise Arpinar <br /> |<br /> |<br /> # GlyGen Biocuration<br /> # BiomarkerKB Biocuration<br /> # GlyGen Publication Analysis<br /> |-<br /> |[https://www.linkedin.com/in/piyush-pandey-906b582b5/ Piyush Pandey]<br /> |<br /> |<br /> # BiomarkerKB Biocuration <br /> # PredictMod <br /> # GlyGen Biocuration <br /> |-<br /> |[http://www.linkedin.com/in/filmawit-zeru-203272363 Filmawit Zeru]<br /> |Generic Curation<br /> |<br /> # BiomarkerKB<br /> # GlyGen<br /> # ARGOS<br /> |-<br /> |[https://www.linkedin.com/in/mathias-belay-03b51a2a3/ Mathias Belay]<br /> 
|<br /> |<br /> # GlyGen<br /> # PredictMod<br /> # BiomarkerKB<br /> |}</div>
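The term-mapping task described under the GlyGen Biocuration project above (for example, "human," "man," and "h. sapiens" all mapping to "Homo sapiens") could be supported by a small lookup helper. The following is only a minimal sketch: the synonym table, function names, and the idea of returning unresolved terms for manual review are illustrative assumptions, not part of any GlyGen tooling.

```python
import re

# Hypothetical synonym table built by curators; a real mapping would
# point to identifiers in a standard dictionary or ontology.
SPECIES_SYNONYMS = {
    "human": "Homo sapiens",
    "man": "Homo sapiens",
    "h. sapiens": "Homo sapiens",
    "homo sapiens": "Homo sapiens",
}


def normalize_term(raw):
    """Lowercase, trim, and collapse whitespace so lookups tolerate formatting noise."""
    return re.sub(r"\s+", " ", raw.strip().lower())


def map_species(raw):
    """Return the standard species name, or None when a curator must review the term."""
    return SPECIES_SYNONYMS.get(normalize_term(raw))


def split_for_review(terms):
    """Separate terms that map automatically from those needing manual curation."""
    mapped, unresolved = {}, []
    for term in terms:
        standard = map_species(term)
        if standard is None:
            unresolved.append(term)
        else:
            mapped[term] = standard
    return mapped, unresolved
```

Terms with spelling errors fall into the unresolved list on purpose, so that a curator decides the mapping rather than an automatic guess.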

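For the GlyGen Publication Analysis project above, the first step (filtering PubMed publications by keyword) could start from the NCBI E-utilities ESearch endpoint. This is a sketch under assumptions: the function names, the choice to restrict keywords to the Title/Abstract field, and the default result limit are illustrative and were not specified by the project.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# NCBI E-utilities ESearch endpoint for querying PubMed.
ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_esearch_url(keywords, max_results=20):
    """Build an ESearch URL that requires every keyword in the title or abstract."""
    # Assumption: keywords are combined with AND; a real analysis may need OR.
    term = " AND ".join(f"{kw}[Title/Abstract]" for kw in keywords)
    params = {
        "db": "pubmed",
        "term": term,
        "retmode": "json",
        "retmax": max_results,
    }
    return f"{ESEARCH_URL}?{urlencode(params)}"


def search_pubmed(keywords, max_results=20):
    """Return the list of matching PubMed IDs (requires network access)."""
    with urlopen(build_esearch_url(keywords, max_results), timeout=30) as response:
        payload = json.load(response)
    return payload["esearchresult"]["idlist"]
```

Calling `search_pubmed(["GlyGen"])` would return PubMed IDs as strings; abstracts and affiliations for those IDs would need a follow-up request, which is left out of this sketch.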

Vishal.bakshi https://hivelab.biochemistry.gwu.edu/wiki/index.php?title=Volunteership_2025&diff=830 Volunteership 2025 2025-05-05T18:44:32Z

<p>Vishal.bakshi: </p> <hr /> <div><h2>2025 Volunteer Program Details</h2><br /> <br /> <h3>Dates</h3><br /> <strong>Volunteer Zoom Kick-Off Meeting</strong><br><br /> May 26, 2025 | 3:30 to 4:30 PM<br /> <br /> <strong>Program Dates: June 2nd, 2025 – July 25th, 2025</strong> (8 weeks)<br><br /> Monday to Friday | Remote | No breaks<br /> <br /> <hr><br /> <br /> <h3>Volunteer Expectations</h3><br /> <ol><br /> <li>Daily progress updates via Slack (scrum).</li><br /> <li>Regular Zoom meetings with the assigned project point of contact.</li><li>Expected to dedicate 5–6 hours per day to project work, with the remaining time focused on skill development or reading. </li><br /> </ol><br /> <p style="color: red;"><strong>Important:</strong> If the scrum is not updated for 2 consecutive days, the candidate will be <u>automatically dropped</u> from the program.</p><br /> <hr><br /> <br /> <h3>Potential Projects</h3><br /> <ol><br /> <li>BiomarkerKB ([https://biomarkerkb.org biomarkerkb.org]) project: Biomarker curation project. Involves reading papers and collecting biomarkers.</li><br /> <li>GlyGen ([https://glygen.org glygen.org]) project: Review glycomics and glycoproteomics data and curate tissue, disease, and other related information. </li><li>ARGOS ([https://argosdb.org argosdb.org]) project: Analyze genomics data using HIVE to identify reference genome assemblies. </li><li>PredictMod ([https://hivelab.biochemistry.gwu.edu/predictmod hivelab.biochemistry.gwu.edu/predictmod]) project. Identifying datasets and harmonizing them so that they can be used to generate ML models. </li></ol>''Note: Individuals involved in the above projects with a background in programming and/or machine learning may also undertake additional tasks to support the development of ML models, which can be integrated into PredictMod or used to enhance AI/ML-ready datasets within GlyGen.''<hr><br /> <br /> <h4>1. 
BiomarkerKB Biocuration Project Ideas</h4>POC: Daniall Masood, Maria Kim<br /> # Curate biomarkers for a specific disease (Alzheimer's)<br /> ## The student would do manual curation for about 4 weeks, with regular check-ins with the project POC to ensure it is being done correctly.<br /> ## The next 4 weeks can be dedicated to developing an LLM or an automated process to extract biomarker details, using the data collected in the first 4 weeks as training or example data.<br /> # Top 50 biomarkers<br /> ## Curate the top 50 biomarkers for biomarkerkb.org.<br /> ## Define what constitutes a top 50 biomarker.<br /> ## Begin curating biomarkers from different sources and papers by collecting fields mentioned in the data model, as well as collecting cross-references.<br /> # Biocuration of biomarkers from NLP/LLM work<br /> ## Use the biomarkers collected from NLP work.<br /> ## Curate biomarkers. The data provided is not in the biomarker data model format.<br /> ## While curating the biomarkers, check if data collected from NLP is correct.<br /> ## After completion, the student can start using curated data to work on the NLP/LLM method.<br /> # Curate biomarkers for a treatment<br /> ## See #1 above.<br /> <br /> If you have any other ideas, diseases, treatments, or methods you want to focus on, please reach out to daniallmasood@gwu.edu to discuss your idea and check whether it is feasible as a project for the summer.<br /> <br /> ==== 2. GlyGen Biocuration Project Ideas ====<br /> POC: Rene Ranzinger and Urnisha Bhuiyan<br /> <br /> Over the last three decades, numerous glycomics database projects have been initiated to collect valuable information about glycans, proteins, and their interactions. Some of these databases have been discontinued due to the end of project funding. However, the data within these databases remains highly valuable to the community. 
Integrating these datasets into modern databases or knowledgebases, such as GlyGen, presents a challenge because many of the valuable metadata annotations (e.g., species, tissue, disease, cell line) are free-text terms that do not align with the standard dictionaries and ontologies used in modern resources. Automated matching of this information with dictionaries or ontologies is often not possible due to the use of synonyms, spelling errors, or abbreviations. For example, "human," "man," and "h. sapiens" all map to the scientific species name "Homo sapiens."<br /> <br /> The GlyGen project aims to make datasets from two older databases (CarbBank, CFG) accessible by migrating the data and metadata into our database. For this project, we are seeking curators with a medical or biology background who are interested in helping map metadata terms from these old databases to standard dictionaries and ontologies.<br /> <br /> The project involves:<br /> <br /> # Using internet resources (e.g., Google, Wikipedia) to identify terms used in the old database.<br /> # Mapping identified terms to corresponding dictionaries and ontologies using the webpages and search interfaces of these projects.<br /> # Finding papers based on titles and author lists that may contain spelling errors.<br /> # Discussing with other curators when terms are mapped differently.<br /> <br /> If you have any other ideas or methods you would like to focus on, please reach out to rene@ccrc.uga.edu to discuss them.<br /> <br /> '''3. GlyGen Publication Analysis Project Ideas'''<br /> <br /> POC: Rene Ranzinger and Urnisha Bhuiyan<br /> <br /> One of the challenges for any bioinformatics project is understanding the size of its community, how well the project serves this community, and how widely its software/database is used. A potential solution is to analyze PubMed publication data. 
We are seeking applicants with programming skills (in Python or Java) to perform this analysis.<br /> <br /> The project involves:<br /> <br /> # Using the PubMed web API to filter publications based on keywords.<br /> # Analyzing paper abstracts to identify research institutions and groups that form the community.<br /> # Filtering the community list to exclude unrelated co-authors.<br /> <br /> A subproject will involve analyzing the full text of papers (when available) for keywords or resource and database names. The results of the analysis will be discussed with a GlyGen project member, who will suggest changes and improvements to the analysis and data presentation. Source code developed as part of this project will be documented and shared in a public GitHub repository. If you have any other ideas or methods you would like to explore, please reach out to rene@ccrc.uga.edu to discuss them.<br /> <br /> ==== 4. PredictMod Machine Learning Project Ideas ====<br /> POC: Lori Krammer<br /> <br /> Data Identification & Harmonization: <br /> <br /> # Identify publicly available datasets from scientific literature that can be used for intervention outcome prediction models.<br /> # Conduct data harmonization and pre-processing following established project pipelines to make an ML-ready dataset and data dictionary.<br /> <br /> Modeling & Integration (for those with experience in programming/ML)<br /> <br /> # Perform model training and document the ML pipeline in a BioCompute Object (BCO). <br /> # Integrate the model into the PredictMod platform.<br /> <br /> Individuals with a background or interest in machine learning should reach out to lorikrammer@gwu.edu with a potential dataset to determine if it is a feasible project for the summer.<br /> <br /> '''5. 
FDA-ARGOS Computation and Pathogen Curation Project'''<br /> <br /> POC: Christie Woodside<br /> <br /> # Update data tables for more efficient computations<br /> ## The student would review and enter additional data and IDs in the tables/sheets used to perform computations. This is manual but important work that requires close attention to detail. ~1 week's worth of work.<br /> ## Requires a Python/shell coding background. The student would run scripts that prepare and format data tables that are pushed to data.argosdb.org. Coding knowledge is needed in case of errors, bugs, or other mishaps in the code. Ongoing work as computations are performed.<br /> # Curate and report on current pathogens to upload to ARGOS<br /> ## The student would manually curate circulating pathogens to be added to data.argosdb.org, with regular check-ins and reports of what was found. ~4-10 weeks' worth of work.<br /> ## Locate assembly IDs, reads, and metagenomic information for these pathogens to be used in computations and deposited into data.argosdb.org.<br /> ## Provide documentation on why the pathogens were curated, why they are important, how they were selected, and how the data was collected.<br /> <br /> If you have any other ideas or methods you want to focus on, please reach out to christie.woodside@email.gwu.edu to discuss your idea and check whether it is feasible as a project for the summer.<hr><br /> <h3>Requirements for Completion</h3><br /> <p><strong>Note:</strong> The following are <u>mandatory</u>. 
Failure to complete any of them will result in an incomplete volunteer record.</p><br /> <br /> <h4>Documentation</h4><br /> <p>All volunteers must maintain adequate documentation of their work, including written protocols and scripts submitted to GitHub.</p><br /> <br /> <h4>Written Report</h4><br /> <p>Submit a 1–2 page summary of your tasks and accomplishments to the Admin Team during the final week of your program.</p><br /> <br /> <h4>Presentation & Slide Submission</h4><br /> <p>Present your work during the last week of the 8-week period.</p><br /> <p>Slides must be submitted to the Admin Team and should include:</p><br /> <ul><br /> <li>A title slide with your name, date, and mentor</li><br /> <li>At least 3 content slides</li><br /> <li>A final slide with acknowledgements or references</li><br /> </ul><br /> Contact the Admin Team to access previously submitted slides.<br /> <hr><br /> <br /> === Completion Certificate ===<br /> A certificate of completion and a letter of recommendation will be provided to all participants who successfully complete the program.<br /> <hr><br /> === Contact ===<br /> mazumder_lab@gwu.edu.<br /> <hr><br /> === Volunteers ===<br /> {| class="wikitable"<br /> |+<br /> |-<br /> ! Name<br /> !Project<br /> !Projects of Interest<br /> |-<br /> | [https://www.linkedin.com/in/gracesjchong/ Grace Chong]<br /> |PredictMod<br /> |<br /> # PredictMod<br /> # BiomarkerKB Biocuration<br /> # GlyGen Biocuration<br /> |-<br /> |[https://www.linkedin.com/in/alma-ogunsina-4959072b1/ Alma Ogunsina]<br /> |Biomarker curation<br /> |<br /> # BiomarkerKB<br /> # ARGOS<br /> # PredictMod<br /> |-<br /> |[https://www.linkedin.com/in/diya-kamalabharathy-62557935a/ Diya Kamalabharathy]<br /> |PredictMod<br /> |<br /> # BiomarkerKB Biocuration<br /> # PredictMod Machine Learning<br /> # GlyGen Biocuration<br /> |-<br /> |[https://www.linkedin.com/in/harivinay-prasad-reddy-gujjula-a06ba71bb/ Harivinay P. 
Gujjula]<br /> |GlyGen curation<br /> |<br /> # GlyGen Biocuration<br /> # BioMarkerKB Biocuration<br /> |-<br /> |[https://www.linkedin.com/in/miao-wang-88b602290/Miao&#x20;Wang Miao Wang]<br /> |ARGOS<br /> |<br /> # BiomarkerKB Biocuration Project Ideas<br /> # FDA-ARGOS Computation and Pathogen Curation Project<br /> # PredictMod Machine Learning Project Ideas<br /> |-<br /> |[https://www.linkedin.com/in/nahom-gebreselassie-1545ab336/ Nahom Abel]<br /> |GlyGen curation<br /> |<br /> # BiomarkerKB Biocuration<br /> # GlyGen Biocuration<br /> # PredictMod<br /> |-<br /> |[https://www.linkedin.com/in/kajal-patel-cs/ Kajal Sanjaykumar Patel]<br /> |GlyGen and PubMed project<br /> |<br /> #PredictMod<br /> #BiomarkerKB<br /> #GlyGen<br /> |-<br /> |[https://www.linkedin.com/in/john-mccaffrey-b8850930a/ John McCaffrey]<br /> |Biomarker curation<br /> |<br /> # PredictMod<br /> # BiomarkerKB<br /> # GlyGen Biocuration <br /> |-<br /> |[https://www.linkedin.com/in/nathan-ressom/ Nathan Ressom]<br /> |ARGOS<br /> |<br /> # PredictMod <br /> # GlyGen Biocuration<br /> # BiomarkerKB Biocuration<br /> |-<br /> |[https://www.linkedin.com/in/aaron-ressom/ Aaron Ressom] <br /> |<br /> |<br /> # BiomarkerKB <br /> # PredictMod <br /> # GlyGen<br /> |-<br /> |[https://www.linkedin.com/in/akale-kinfe/ Akale Kinfe]<br /> |<br /> |<br /> # BiomarkerKB Biocuration<br /> # GlyGen Biocuration<br /> # ARGOS<br /> |-<br /> |Aise Arpinar <br /> |<br /> |<br /> # GlyGen Biocuration<br /> # BiomarkerKB Biocuration<br /> # GlyGen Publication Analysis<br /> |-<br /> |[https://www.linkedin.com/in/piyush-pandey-906b582b5/ Piyush Pandey]<br /> |<br /> |<br /> # BiomarkerKB Biocuration <br /> # PredictMod <br /> # GlyGen Biocuration <br /> |-<br /> |[http://www.linkedin.com/in/filmawit-zeru-203272363 Filmawit Zeru]<br /> |<br /> |<br /> # BiomarkerKB<br /> # GlyGen<br /> # ARGOS<br /> |-<br /> |Mathias Belay<br /> |<br /> |<br /> # GlyGen<br /> # PredictMod<br /> # BiomarkerKB<br /> |}</div>

Vishal.bakshi https://hivelab.biochemistry.gwu.edu/wiki/index.php?title=Volunteership_2025&diff=828 Volunteership 2025 2025-05-02T23:41:10Z

<p>Vishal.bakshi: </p> <hr /> <div><h2>2025 Volunteer Program Details</h2><br /> <br /> <h3>Dates</h3><br /> <strong>Volunteer Zoom Kick-Off Meeting</strong><br><br /> May 26, 2025 | 3:30 to 4:30 PM<br /> <br /> <strong>Program Dates: June 2nd, 2025 – July 25th, 2025</strong> (8 weeks)<br><br /> Monday to Friday | Remote | No breaks<br /> <br /> <hr><br /> <br /> <h3>Volunteer Expectations</h3><br /> <ol><br /> <li>Daily progress updates via Slack (scrum).</li><br /> <li>Regular Zoom meetings with the assigned project point of contact.</li><li>Expected to dedicate 5–6 hours per day to project work, with the remaining time focused on skill development or reading. </li><br /> </ol><br /> <p style="color: red;"><strong>Important:</strong> If the scrum is not updated for 2 consecutive days, the candidate will be <u>automatically dropped</u> from the program.</p><br /> <hr><br /> <br /> <h3>Potential Projects</h3><br /> <ol><br /> <li>BiomarkerKB ([https://biomarkerkb.org biomarkerkb.org]) project: Biomarker curation project. Involves reading papers and collecting biomarkers.</li><br /> <li>GlyGen ([https://glygen.org glygen.org]) project: Review glycomics and glycoproteomics data and curate tissue, disease, and other related information. </li><li>ARGOS ([https://argosdb.org argosdb.org]) project: Analyze genomics data using HIVE to identify reference genome assemblies. </li><li>PredictMod ([https://hivelab.biochemistry.gwu.edu/predictmod hivelab.biochemistry.gwu.edu/predictmod]) project. Identifying datasets and harmonizing them so that they can be used to generate ML models. </li></ol>''Note: Individuals involved in the above projects with a background in programming and/or machine learning may also undertake additional tasks to support the development of ML models, which can be integrated into PredictMod or used to enhance AI/ML-ready datasets within GlyGen.''<hr><br /> <br /> <h4>1. 
BiomarkerKB Biocuration Project Ideas</h4>POC: Daniall Masood<br /> # Curate biomarkers for a specific disease (Alzheimer's)<br /> ## The student would do manual curation for about 4 weeks, with regular check-ins with the POC to ensure it is being done correctly.<br /> ## The next 4 weeks can be dedicated to developing an LLM or an automated process to extract biomarker details, using the data collected in the first 4 weeks as training data/example data.<br /> # Top 50 biomarkers<br /> ## Curate the top 50 biomarkers for biomarkerkb.org.<br /> ## Define what constitutes a top 50 biomarker.<br /> ## Begin curating biomarkers from different sources and papers by collecting fields mentioned in the data model, as well as collecting cross-references.<br /> # Biocuration of biomarkers from NLP/LLM work<br /> ## Use the biomarkers collected from NLP work.<br /> ## Curate biomarkers. The data was not provided in the biomarker data model format.<br /> ## While curating the biomarkers, check if data collected from NLP is correct.<br /> ## After completion, the student can start using curated data to work on the NLP/LLM method.<br /> # Curate biomarkers for a treatment<br /> ## See #1 above.<br /> <br /> If you have any other ideas, diseases, treatments, or methods you want to focus on, please reach out to daniallmasood@gwu.edu to discuss your idea and check whether it is feasible as a project for the summer.<br /> <br /> ==== 2. GlyGen Biocuration Project Ideas ====<br /> POC: Rene Ranzinger and Urnisha Bhuiyan<br /> <br /> Over the last three decades, numerous glycomics database projects have been initiated to collect valuable information about glycans, proteins, and their interactions. Some of these databases have been discontinued due to the end of project funding. However, the data within these databases remains highly valuable to the community. 
Integrating these datasets into modern databases or knowledgebases, such as GlyGen, presents a challenge because many of the valuable metadata annotations (e.g., species, tissue, disease, cell line) are free-text terms that do not align with established standard dictionaries and ontologies used in modern resources. Automated matching of this information with dictionaries or ontologies is often not possible due to the use of synonyms, spelling errors, or abbreviations. For example, "human," "man," and "h. sapiens" all map to the scientific species name "Homo sapiens."<br /> <br /> The GlyGen project aims to make datasets from two older databases (CarbBank, CFG) accessible by migrating the data and metadata into our database. For this project, we are seeking curators with a medical or biology background who are interested in helping map metadata terms from these old databases to standard dictionaries and ontologies.<br /> <br /> The project involves:<br /> <br /> # Using internet resources (e.g., Google, Wikipedia) to identify terms used in the old database.<br /> # Mapping identified terms to corresponding dictionaries and ontologies using the webpages and search interfaces of these projects.<br /> # Finding papers based on titles and author lists that may contain spelling errors.<br /> # Interacting and discussing with other curators in case terms are mapped differently.<br /> <br /> If you have any other ideas or methods you would like to focus on, please reach out to rene@ccrc.uga.edu to discuss them.<br /> <br /> '''3. GlyGen Publication Analysis Project Ideas'''<br /> <br /> POC: Rene Ranzinger and Urnisha Bhuiyan<br /> <br /> One of the challenges for any bioinformatics project is understanding the size of its community, how well the project serves this community, and how widely its software/database is used. A potential solution is to analyze PubMed publication data. 
We are seeking applicants with programming skills (in Python or Java) to perform this analysis.<br /> <br /> The project involves:<br /> <br /> # Using the PubMed web API to filter publications based on keywords.<br /> # Analyzing paper abstracts to identify research institutions and groups that form the community.<br /> # Filtering the community list to exclude unrelated co-authors.<br /> <br /> A subproject will involve analyzing the full text of papers (when available) for keywords or resource and database names. The results of the analysis will be discussed with a GlyGen project member, who will suggest changes and improvements to the analysis and data presentation. Source code developed as part of this project will be documented and shared in a public GitHub repository. If you have any other ideas or methods you would like to explore, please reach out to rene@ccrc.uga.edu to discuss them.<br /> <br /> ==== 4. PredictMod Machine Learning Project Ideas ====<br /> POC: Lori Krammer<br /> <br /> Data Identification & Harmonization: <br /> <br /> # Identify publicly available datasets from scientific literature that can be used for intervention outcome prediction models.<br /> # Conduct data harmonization and pre-processing following established project pipelines to make an ML-ready dataset and data dictionary.<br /> <br /> Modeling & Integration (for those with experience in programming/ML)<br /> <br /> # Perform model training and document the ML pipeline in a BioCompute Object (BCO). <br /> # Integrate the model into the PredictMod platform.<br /> <br /> Individuals with a background or interest in machine learning should reach out to lorikrammer@gwu.edu with a potential dataset to determine if it is a feasible project for the summer.<br /> <br /> '''5. 
FDA-ARGOS Computation and Pathogen Curation Project'''<br /> <br /> POC: Christie Woodside<br /> <br /> # Update data tables for more efficient computations<br /> ## The student would review and enter additional data and IDs in the tables/sheets used to perform computations. This is manual but important work that requires close attention to detail. ~1 week's worth of work.<br /> ## Requires a Python/shell coding background. The student would run scripts that prepare and format data tables that are pushed to data.argosdb.org. Coding knowledge is needed in case of errors, bugs, or other mishaps in the code. Ongoing work as computations are performed.<br /> # Curate and report on current pathogens to upload to ARGOS<br /> ## The student would manually curate circulating pathogens to be added to data.argosdb.org, with regular check-ins and reports of what was found. ~4-10 weeks' worth of work.<br /> ## Locate assembly IDs, reads, and metagenomic information for these pathogens to be used in computations and deposited into data.argosdb.org.<br /> ## Provide documentation on why the pathogens were curated, why they are important, how they were selected, and how the data was collected.<br /> <br /> If you have any other ideas or methods you want to focus on, please reach out to christie.woodside@email.gwu.edu to discuss your idea and check whether it is feasible as a project for the summer.<hr><br /> <h3>Requirements for Completion</h3><br /> <p><strong>Note:</strong> The following are <u>mandatory</u>. 
Failure to complete any of them will result in an incomplete volunteer record.</p><br /> <br /> <h4>Documentation</h4><br /> <p>All volunteers must maintain adequate documentation of their work, including written protocols and scripts submitted to GitHub.</p><br /> <br /> <h4>Written Report</h4><br /> <p>Submit a 1–2 page summary of your tasks and accomplishments to the Admin Team during the final week of your program.</p><br /> <br /> <h4>Presentation & Slide Submission</h4><br /> <p>Present your work during the last week of the 8-week period.</p><br /> <p>Slides must be submitted to the Admin Team and should include:</p><br /> <ul><br /> <li>A title slide with your name, date, and mentor</li><br /> <li>At least 3 content slides</li><br /> <li>A final slide with acknowledgements or references</li><br /> </ul><br /> Contact the Admin Team to access previously submitted slides.<br /> <hr><br /> <br /> === Completion Certificate ===<br /> A certificate of completion and a letter of recommendation will be provided to all participants who successfully complete the program.<br /> <hr><br /> === Contact ===<br /> mazumder_lab@gwu.edu.<br /> <hr><br /> === Volunteers ===<br /> {| class="wikitable"<br /> |+<br /> |-<br /> ! Name<br /> !Project<br /> !Projects of Interest<br /> |-<br /> | [https://www.linkedin.com/in/gracesjchong/ Grace Chong]<br /> |PredictMod<br /> |<br /> # PredictMod<br /> # BiomarkerKB Biocuration<br /> # GlyGen Biocuration<br /> |-<br /> |[https://www.linkedin.com/in/alma-ogunsina-4959072b1/ Alma Ogunsina]<br /> |Biomarker curation<br /> |<br /> # BiomarkerKB<br /> # ARGOS<br /> # PredictMod<br /> |-<br /> |[https://www.linkedin.com/in/diya-kamalabharathy-62557935a/ Diya Kamalabharathy]<br /> |PredictMod<br /> |<br /> # BiomarkerKB Biocuration<br /> # PredictMod Machine Learning<br /> # GlyGen Biocuration<br /> |-<br /> |[https://www.linkedin.com/in/harivinay-prasad-reddy-gujjula-a06ba71bb/ Harivinay P. 
Gujjula]<br /> |GlyGen curation<br /> |<br /> # GlyGen Biocuration<br /> # BioMarkerKB Biocuration<br /> |-<br /> |[https://www.linkedin.com/in/miao-wang-88b602290/Miao&#x20;Wang Miao Wang]<br /> |ARGOS<br /> |<br /> # BiomarkerKB Biocuration Project Ideas<br /> # FDA-ARGOS Computation and Pathogen Curation Project<br /> # PredictMod Machine Learning Project Ideas<br /> |-<br /> |[https://www.linkedin.com/in/nahom-gebreselassie-1545ab336/ Nahom Abel]<br /> |GlyGen curation<br /> |<br /> # BiomarkerKB Biocuration<br /> # GlyGen Biocuration<br /> # PredictMod<br /> |-<br /> |[https://www.linkedin.com/in/kajal-patel-cs/ Kajal Sanjaykumar Patel]<br /> |GlyGen and PubMed project<br /> |<br /> #PredictMod<br /> #BiomarkerKB<br /> #GlyGen<br /> |-<br /> |[https://www.linkedin.com/in/john-mccaffrey-b8850930a/ John McCaffrey]<br /> |Biomarker curation<br /> |<br /> # PredictMod<br /> # BiomarkerKB<br /> # GlyGen Biocuration <br /> |-<br /> |[https://www.linkedin.com/in/nathan-ressom/ Nathan Ressom]<br /> |ARGOS<br /> |<br /> # PredictMod <br /> # GlyGen Biocuration<br /> # BiomarkerKB Biocuration<br /> |-<br /> |[https://www.linkedin.com/in/aaron-ressom/ Aaron Ressom] <br /> |<br /> |<br /> # BiomarkerKB <br /> # PredictMod <br /> # GlyGen<br /> |-<br /> |[https://www.linkedin.com/in/akale-kinfe/ Akale Kinfe]<br /> |<br /> |<br /> # BiomarkerKB Biocuration<br /> # GlyGen Biocuration<br /> # ARGOS<br /> |-<br /> |Aise Arpinar <br /> |<br /> |<br /> # GlyGen Biocuration<br /> # BiomarkerKB Biocuration<br /> # GlyGen Publication Analysis<br /> |-<br /> |Piyush Pandey<br /> |<br /> |<br /> # BiomarkerKB Biocuration <br /> # PredictMod <br /> # GlyGen Biocuration <br /> |}</div>

Vishal.bakshi https://hivelab.biochemistry.gwu.edu/wiki/index.php?title=Volunteership_2025&diff=827 Volunteership 2025 2025-05-02T23:38:43Z

<p>Vishal.bakshi: /* Volunteers */</p> <hr /> <div><h2>2025 Volunteer Program Details</h2><br /> <br /> <h3>Dates</h3><br /> <strong>Volunteer Zoom Kick-Off Meeting</strong><br><br /> May 26, 2025 | 3:30 to 4:30 PM<br /> <br /> <strong>Program Dates: June 2nd, 2025 – July 25th, 2025</strong> (8 weeks)<br><br /> Monday to Friday | Remote | No breaks<br /> <br /> <hr><br /> <br /> <h3>Volunteer Expectations</h3><br /> <ol><br /> <li>Daily progress updates via Slack (scrum).</li><br /> <li>Regular Zoom meetings with the assigned project point of contact.</li><li>Expected to dedicate 5–6 hours per day to project work, with the remaining time focused on skill development or reading. </li><br /> </ol><br /> <p style="color: red;"><strong>Important:</strong> If the scrum is not updated for 2 consecutive days, the candidate will be <u>automatically dropped</u> from the program.</p><br /> <hr><br /> <br /> <h3>Potential Projects</h3><br /> <ol><br /> <li>BiomarkerKB ([https://biomarkerkb.org biomarkerkb.org]) project: Biomarker curation project. Involves reading papers and collecting biomarkers.</li><br /> <li>GlyGen ([https://glygen.org glygen.org]) project: Review glycomics and glycoproteomics data and curate tissue, disease, and other related information. </li><li>ARGOS ([https://argosdb.org argosdb.org]) project: Analyze genomics data using HIVE to identify reference genome assemblies. </li><li>PredictMod ([https://hivelab.biochemistry.gwu.edu/predictmod hivelab.biochemistry.gwu.edu/predictmod]) project. Identifying datasets and harmonizing them so that they can be used to generate ML models. </li></ol>''Note: Individuals involved in the above projects with a background in programming and/or machine learning may also undertake additional tasks to support the development of ML models, which can be integrated into PredictMod or used to enhance AI/ML-ready datasets within GlyGen.''<hr><br /> <br /> <h4>1. 
BiomarkerKB Biocuration Project Ideas</h4>POC: Daniall Masood<br /> # Curate biomarkers for a specific disease (Alzheimer's)<br /> ## The student would do manual curation for about 4 weeks, with regular check-ins with the POC to ensure it is being done correctly.<br /> ## The next 4 weeks can be dedicated to developing an LLM or an automated process to extract biomarker details, using the data collected in the first 4 weeks as training data/example data.<br /> # Top 50 biomarkers<br /> ## Curate the top 50 biomarkers for biomarkerkb.org.<br /> ## Define what constitutes a top 50 biomarker.<br /> ## Begin curating biomarkers from different sources and papers by collecting fields mentioned in the data model, as well as collecting cross-references.<br /> # Biocuration of biomarkers from NLP/LLM work<br /> ## Use the biomarkers collected from NLP work.<br /> ## Curate biomarkers. The data was not provided in the biomarker data model format.<br /> ## While curating the biomarkers, check if data collected from NLP is correct.<br /> ## After completion, the student can start using curated data to work on the NLP/LLM method.<br /> # Curate biomarkers for a treatment<br /> ## See #1 above.<br /> <br /> If you have any other ideas, diseases, treatments, or methods you want to focus on, please reach out to daniallmasood@gwu.edu to discuss your idea and check whether it is feasible as a project for the summer.<br /> <br /> ==== 2. GlyGen Biocuration Project Ideas ====<br /> POC: Rene Ranzinger and Urnisha Bhuiyan<br /> <br /> Over the last three decades, numerous glycomics database projects have been initiated to collect valuable information about glycans, proteins, and their interactions. Some of these databases have been discontinued due to the end of project funding. However, the data within these databases remains highly valuable to the community. 
Integrating these datasets into modern databases or knowledgebases, such as GlyGen, presents a challenge because many of the valuable metadata annotations (e.g., species, tissue, disease, cell line) are free-text terms that do not align with established standard dictionaries and ontologies used in modern resources. Automated matching of this information with dictionaries or ontologies is often not possible due to the use of synonyms, spelling errors, or abbreviations. For example, "human," "man," and "h. sapiens" all map to the scientific species name "Homo sapiens."<br /> <br /> The GlyGen project aims to make datasets from two older databases (CarbBank, CFG) accessible by migrating the data and metadata into our database. For this project, we are seeking curators with a medical or biology background who are interested in helping map metadata terms from these old databases to standard dictionaries and ontologies.<br /> <br /> The project involves:<br /> <br /> # Using internet resources (e.g., Google, Wikipedia) to identify terms used in the old database.<br /> # Mapping identified terms to corresponding dictionaries and ontologies using the webpages and search interfaces of these projects.<br /> # Finding papers based on titles and author lists that may contain spelling errors.<br /> # Interacting and discussing with other curators in case terms are mapped differently.<br /> <br /> If you have any other ideas or methods you would like to focus on, please reach out to rene@ccrc.uga.edu to discuss them.<br /> <br /> '''3. GlyGen Publication Analysis Project Ideas'''<br /> <br /> POC: Rene Ranzinger and Urnisha Bhuiyan<br /> <br /> One of the challenges for any bioinformatics project is understanding the size of its community, how well the project serves this community, and how widely its software/database is used. A potential solution is to analyze PubMed publication data. 
We are seeking applicants with programming skills (in Python or Java) to perform this analysis.<br /> <br /> The project involves:<br /> <br /> # Using the PubMed web API to filter publications based on keywords.<br /> # Analyzing paper abstracts to identify research institutions and groups that form the community.<br /> # Filtering the community list to exclude unrelated co-authors.<br /> <br /> A subproject will involve analyzing the full text of papers (when available) for keywords or resource and database names. The results of the analysis will be discussed with a GlyGen project member, who will suggest changes and improvements to the analysis and data presentation. Source code developed as part of this project will be documented and shared in a public GitHub repository. If you have any other ideas or methods you would like to explore, please reach out to rene@ccrc.uga.edu to discuss them.<br /> <br /> ==== 4. PredictMod Machine Learning Project Ideas ====<br /> POC: Lori Krammer<br /> <br /> Data Identification & Harmonization: <br /> <br /> # Identify publicly available datasets from scientific literature that can be used for intervention outcome prediction models.<br /> # Conduct data harmonization and pre-processing following established project pipelines to make an ML-ready dataset and data dictionary.<br /> <br /> Modeling & Integration (for those with experience in programming/ML)<br /> <br /> # Perform model training and document the ML pipeline in a BioCompute Object (BCO). <br /> # Integrate the model into the PredictMod platform.<br /> <br /> Individuals with a background or interest in machine learning should reach out to lorikrammer@gwu.edu with a potential dataset to determine if it is a feasible project for the summer.<br /> <br /> '''5. 
FDA-ARGOS Computation and Pathogen Curation Project'''<br /> <br /> POC: Christie Woodside<br /> <br /> # Update data tables for more efficient computations<br /> ## The student would review and enter additional data and IDs in the tables/sheets used to perform computations. This is manual but important work that requires close attention to detail. ~1 week's worth of work.<br /> ## Requires a Python/shell coding background. The student would run scripts that prepare and format data tables that are pushed to data.argosdb.org. Coding knowledge is needed in case of errors, bugs, or other mishaps in the code. Ongoing work as computations are performed.<br /> # Curate and report on current pathogens to upload to ARGOS<br /> ## The student would manually curate circulating pathogens to be added to data.argosdb.org, with regular check-ins and reports of what was found. ~4-10 weeks' worth of work.<br /> ## Locate assembly IDs, reads, and metagenomic information for these pathogens to be used in computations and deposited into data.argosdb.org.<br /> ## Provide documentation on why the pathogens were curated, why they are important, how they were selected, and how the data was collected.<br /> <br /> If you have any other ideas or methods you want to focus on, please reach out to christie.woodside@email.gwu.edu to discuss your idea and check whether it is feasible as a project for the summer.<hr><br /> <h3>Requirements for Completion</h3><br /> <p><strong>Note:</strong> The following are <u>mandatory</u>. 
Failure to complete any of them will result in an incomplete volunteer record.</p><br /> <br /> <h4>Documentation</h4><br /> <p>All volunteers must maintain adequate documentation of their work, including written protocols and scripts submitted to GitHub.</p><br /> <br /> <h4>Written Report</h4><br /> <p>Submit a 1–2 page summary of your tasks and accomplishments to the Admin Team during the final week of your program.</p><br /> <br /> <h4>Presentation & Slide Submission</h4><br /> <p>Present your work during the last week of the 8-week period.</p><br /> <p>Slides must be submitted to the Admin Team and should include:</p><br /> <ul><br /> <li>A title slide with your name, date, and mentor</li><br /> <li>At least 3 content slides</li><br /> <li>A final slide with acknowledgements or references</li><br /> </ul><br /> Contact the Admin Team to access previously submitted slides.<br /> <hr><br /> <br /> === Completion Certificate ===<br /> A certificate of completion and a letter of recommendation will be provided to all participants who successfully complete the program.<br /> <hr><br /> === Contact ===<br /> mazumder_lab@gwu.edu.<br /> <hr><br /> === Volunteers ===<br /> {| class="wikitable"<br /> |+<br /> |-<br /> ! Name<br /> !Project<br /> !Projects of Interest<br /> |-<br /> | [https://www.linkedin.com/in/gracesjchong/ Grace Chong]<br /> |PredictMod<br /> |<br /> # PredictMod<br /> # BiomarkerKB Biocuration<br /> # GlyGen Biocuration<br /> |-<br /> |[https://www.linkedin.com/in/alma-ogunsina-4959072b1/ Alma Ogunsina]<br /> |Biomarker curation<br /> |<br /> # BiomarkerKB<br /> # ARGOS<br /> # PredictMod<br /> |-<br /> |[https://www.linkedin.com/in/diya-kamalabharathy-62557935a/ Diya Kamalabharathy]<br /> |PredictMod<br /> |<br /> # BiomarkerKB Biocuration<br /> # PredictMod Machine Learning<br /> # GlyGen Biocuration<br /> |-<br /> |[https://www.linkedin.com/in/harivinay-prasad-reddy-gujjula-a06ba71bb/ Harivinay P. 
Gujjula]<br /> |GlyGen curation<br /> |<br /> # GlyGen Biocuration<br /> # BioMarkerKB Biocuration<br /> |-<br /> |[https://www.linkedin.com/in/miao-wang-88b602290/Miao&#x20;Wang Miao Wang]<br /> |ARGOS<br /> |<br /> # BiomarkerKB Biocuration Project Ideas<br /> # FDA-ARGOS Computation and Pathogen Curation Project<br /> # PredictMod Machine Learning Project Ideas<br /> |-<br /> |[https://www.linkedin.com/in/nahom-gebreselassie-1545ab336/ Nahom Abel]<br /> |GlyGen curation<br /> |<br /> # BiomarkerKB Biocuration<br /> # GlyGen Biocuration<br /> # PredictMod<br /> |-<br /> |[https://www.linkedin.com/in/kajal-patel-cs/ Kajal Sanjaykumar Patel]<br /> |GlyGen and PubMed project<br /> |<br /> #PredictMod<br /> #BiomarkerKB<br /> #GlyGen<br /> |-<br /> |[https://www.linkedin.com/in/john-mccaffrey-b8850930a/ John McCaffrey]<br /> |Biomarker curation<br /> |<br /> # PredictMod<br /> # BiomarkerKB<br /> # GlyGen Biocuration <br /> |-<br /> |[https://www.linkedin.com/in/nathan-ressom/ Nathan Ressom]<br /> |ARGOS<br /> |<br /> # PredictMod <br /> # GlyGen Biocuration<br /> # BiomarkerKB Biocuration<br /> |-<br /> |[https://www.linkedin.com/in/aaron-ressom/ Aaron Ressom] <br /> |<br /> |<br /> # BiomarkerKB <br /> # PredictMod <br /> # GlyGen<br /> |-<br /> |[https://www.linkedin.com/in/akale-kinfe/ Akale Kinfe]<br /> |<br /> |<br /> # BiomarkerKB Biocuration<br /> # GlyGen Biocuration<br /> # ARGOS<br /> |-<br /> |Aise Arpinar <br /> |<br /> |<br /> # GlyGen Biocuration<br /> # BiomarkerKB Biocuration<br /> # GlyGen Publication Analysis<br /> |}</div>

Vishal.bakshi https://hivelab.biochemistry.gwu.edu/wiki/index.php?title=Volunteership_2025&diff=825 Volunteership 2025 2025-04-30T15:39:29Z

<p>Vishal.bakshi: /* Volunteers */</p> <hr /> <div><h2>2025 Volunteer Program Details</h2><br /> <br /> <h3>Dates</h3><br /> <strong>Volunteer Zoom Kick-Off Meeting</strong><br><br /> May 26, 2025 | 3:30 to 4:30 PM<br /> <br /> <strong>Program Dates: June 2nd, 2025 – July 25th, 2025</strong> (8 weeks)<br><br /> Monday to Friday | Remote | No breaks<br /> <br /> <hr><br /> <br /> <h3>Volunteer Expectations</h3><br /> <ol><br /> <li>Daily progress updates via Slack (scrum).</li><br /> <li>Regular Zoom meetings with the assigned project point of contact.</li><li>Expected to dedicate 5–6 hours per day to project work, with the remaining time focused on skill development or reading. </li><br /> </ol><br /> <p style="color: red;"><strong>Important:</strong> If the scrum is not updated for 2 consecutive days, the candidate will be <u>automatically dropped</u> from the program.</p><br /> <hr><br /> <br /> <h3>Potential Projects</h3><br /> <ol><br /> <li>BiomarkerKB ([https://biomarkerkb.org biomarkerkb.org]) project: Biomarker curation project. Involves reading papers and collecting biomarkers.</li><br /> <li>GlyGen ([https://glygen.org glygen.org]) project: Review glycomics and glycoproteomics data and curate tissue, disease, and other related information. </li><li>ARGOS ([https://argosdb.org argosdb.org]) project: Analyze genomics data using HIVE to identify reference genome assemblies. </li><li>PredictMod ([https://hivelab.biochemistry.gwu.edu/predictmod hivelab.biochemistry.gwu.edu/predictmod]) project. Identifying datasets and harmonizing them so that they can be used to generate ML models. </li></ol>''Note: Individuals involved in the above projects with a background in programming and/or machine learning may also undertake additional tasks to support the development of ML models, which can be integrated into PredictMod or used to enhance AI/ML-ready datasets within GlyGen.''<hr><br /> <br /> <h4>1. 
BiomarkerKB Biocuration Project Ideas</h4>POC: Daniall Masood<br /> # Curate biomarkers for a specific disease (Alzheimer's disease)<br /> ## The student would do manual curation for about 4 weeks, with regular check-ins with the POC to ensure it is being done correctly.<br /> ## The next 4 weeks can be dedicated to developing an LLM or an automated process to extract biomarker details, using the data collected in the first 4 weeks as training or example data.<br /> # Top 50 biomarkers<br /> ## Curate the top 50 biomarkers for biomarkerkb.org.<br /> ## Define what constitutes a top 50 biomarker.<br /> ## Begin curating biomarkers from different sources and papers by collecting the fields mentioned in the data model, as well as cross-references.<br /> # Biocuration of biomarkers from NLP/LLM work<br /> ## Use the biomarkers collected from NLP work.<br /> ## Curate these biomarkers; the NLP output was not provided in the biomarker data model format.<br /> ## While curating the biomarkers, check whether the data collected from NLP is correct.<br /> ## After completion, the student can start using the curated data to work on the NLP/LLM method.<br /> # Curate biomarkers for a treatment<br /> ## See #1 above.<br /> <br /> If you have any other ideas, diseases, treatments, or methods you want to focus on, please reach out to daniallmasood@gwu.edu to discuss them and check whether they are feasible as a project for the summer.<br /> <br /> ==== 2. GlyGen Biocuration Project Ideas ====<br /> POC: Rene Ranzinger and Urnisha Bhuiyan<br /> <br /> Over the last three decades, numerous glycomics database projects have been initiated to collect valuable information about glycans, proteins, and their interactions. Some of these databases have been discontinued due to the end of project funding. However, the data within these databases remains highly valuable to the community. 
Integrating these datasets into modern databases or knowledgebases, such as GlyGen, presents a challenge because many of the valuable metadata annotations (e.g., species, tissue, disease, cell line) are free-text terms that do not align with the standard dictionaries and ontologies used in modern resources. Automated matching of this information with dictionaries or ontologies is often not possible due to the use of synonyms, spelling errors, or abbreviations. For example, "human," "man," and "h. sapiens" all map to the scientific species name "Homo sapiens."<br /> <br /> The GlyGen project aims to make datasets from two older databases (CarbBank, CFG) accessible by migrating the data and metadata into our database. For this project, we are seeking curators with a medical or biology background who are interested in helping map metadata terms from these old databases to standard dictionaries and ontologies.<br /> <br /> The project involves:<br /> <br /> # Using internet resources (e.g., Google, Wikipedia) to identify terms used in the old database.<br /> # Mapping identified terms to corresponding dictionaries and ontologies using the webpages and search interfaces of these projects.<br /> # Finding papers based on titles and author lists that may contain spelling errors.<br /> # Interacting and discussing with other curators when terms are mapped differently.<br /> <br /> If you have any other ideas or methods you would like to focus on, please reach out to rene@ccrc.uga.edu to discuss them.<br /> <br /> '''3. GlyGen Publication Analysis Project Ideas'''<br /> <br /> POC: Rene Ranzinger and Urnisha Bhuiyan<br /> <br /> One of the challenges for any bioinformatics project is understanding the size of its community, how well the project serves this community, and how widely its software/database is used. A potential solution is to analyze PubMed publication data. 
We are seeking applicants with programming skills (in Python or Java) to perform this analysis.<br /> <br /> The project involves:<br /> <br /> # Using the PubMed web API to filter publications based on keywords.<br /> # Analyzing paper abstracts to identify research institutions and groups that form the community.<br /> # Filtering the community list to exclude unrelated co-authors.<br /> <br /> A subproject will involve analyzing the full text of papers (when available) for keywords or resource and database names. The results of the analysis will be discussed with a GlyGen project member, who will suggest changes and improvements to the analysis and data presentation. Source code developed as part of this project will be documented and shared in a public GitHub repository. If you have any other ideas or methods you would like to explore, please reach out to rene@ccrc.uga.edu to discuss them.<br /> <br /> ==== 4. PredictMod Machine Learning Project Ideas ====<br /> POC: Lori Krammer<br /> <br /> Data Identification & Harmonization: <br /> <br /> # Identify publicly available datasets from scientific literature that can be used for intervention outcome prediction models.<br /> # Conduct data harmonization and pre-processing following established project pipelines to produce an ML-ready dataset and data dictionary.<br /> <br /> Modeling & Integration (for those with experience in programming/ML)<br /> <br /> # Perform model training and document the ML pipeline in a BioCompute Object (BCO). <br /> # Integrate the model into the PredictMod platform.<br /> <br /> Individuals with a background or interest in machine learning should reach out to lorikrammer@gwu.edu with a potential dataset to determine if it is a feasible project for the summer.<br /> <br /> '''5. 
FDA-ARGOS Computation and Pathogen Curation Project'''<br /> <br /> POC: Christie Woodside<br /> <br /> # Update data tables for more efficient computations<br /> ## The student would review and enter additional data and IDs in the tables/sheets used to perform computations. This is manual but important work that requires close attention to detail. Approximately 1 week of work.<br /> ## Requires a Python/shell coding background. The student would run scripts that prepare and format data tables that are pushed to data.argosdb.org. Coding knowledge is needed in case of errors, bugs, or other mishaps in the code. Ongoing work as computations are performed.<br /> # Curate and report on current pathogens to upload to ARGOS<br /> ## The student would work on manual curation of circulating pathogens to be added to data.argosdb.org, with regular check-ins and reports on what was found. Approximately 4–10 weeks of work.<br /> ## Locate assembly IDs, reads, and metagenomic information for these pathogens to be used in computations and deposited into data.argosdb.org.<br /> ## Provide documentation on why the pathogens were curated, why they are important, how they were selected, and how the data was collected.<br /> <br /> If you have any other ideas or methods you want to focus on, please reach out to christie.woodside@email.gwu.edu to discuss them and check whether they are feasible as a project for the summer.<hr><br /> <h3>Requirements for Completion</h3><br /> <p><strong>Note:</strong> The following are <u>mandatory</u>. 
Failure to complete any of them will result in an incomplete volunteer record.</p><br /> <br /> <h4>Documentation</h4><br /> <p>All volunteers must maintain adequate documentation of their work, including written protocols and scripts submitted to GitHub.</p><br /> <br /> <h4>Written Report</h4><br /> <p>Submit a 1–2 page summary of your tasks and accomplishments to the Admin Team during the final week of your program.</p><br /> <br /> <h4>Presentation & Slide Submission</h4><br /> <p>Present your work during the last week of the 8-week period.</p><br /> <p>Slides must be submitted to the Admin Team and should include:</p><br /> <ul><br /> <li>A title slide with your name, date, and mentor</li><br /> <li>At least 3 content slides</li><br /> <li>A final slide with acknowledgements or references</li><br /> </ul><br /> Contact the Admin Team to access previously submitted slides.<br /> <hr><br /> <br /> === Completion Certificate ===<br /> A certificate of completion and a letter of recommendation will be provided to all participants who successfully complete the program.<br /> <hr><br /> === Contact ===<br /> mazumder_lab@gwu.edu.<br /> <hr><br /> === Volunteers ===<br /> {| class="wikitable"<br /> |+<br /> |-<br /> ! Name<br /> !Project<br /> !Projects Interested<br /> |-<br /> | [https://www.linkedin.com/in/gracesjchong/ Grace Chong]<br /> |PredictMod<br /> |<br /> # PredictMod<br /> # BiomarkerKB Biocuration<br /> # GlyGen Biocuration<br /> |-<br /> |[https://www.linkedin.com/in/alma-ogunsina-4959072b1/ Alma Ogunsina]<br /> |Biomarker curation<br /> |<br /> # BiomarkerKB<br /> # ARGOS<br /> # PredictMod<br /> |-<br /> |[https://www.linkedin.com/in/diya-kamalabharathy-62557935a/ Diya Kamalabharathy]<br /> |PredictMod<br /> |<br /> # BiomarkerKB Biocuration<br /> # PredictMod Machine Learning<br /> # GlyGen Biocuration<br /> |-<br /> |[https://www.linkedin.com/in/harivinay-prasad-reddy-gujjula-a06ba71bb/ Harivinay P. 
Gujjula]<br /> |GlyGen curation<br /> |<br /> # GlyGen Biocuration<br /> # BiomarkerKB Biocuration<br /> |-<br /> |[https://www.linkedin.com/in/miao-wang-88b602290/ Miao Wang]<br /> |ARGOS<br /> |<br /> # BiomarkerKB Biocuration Project Ideas<br /> # FDA-ARGOS Computation and Pathogen Curation Project<br /> # PredictMod Machine Learning Project Ideas<br /> |-<br /> |[https://www.linkedin.com/in/nahom-gebreselassie-1545ab336/ Nahom Abel]<br /> |GlyGen curation<br /> |<br /> # BiomarkerKB Biocuration<br /> # GlyGen Biocuration<br /> # PredictMod<br /> |-<br /> |[https://www.linkedin.com/in/kajal-patel-cs/ Kajal Sanjaykumar Patel]<br /> |GlyGen and PubMed project<br /> |<br /> # PredictMod<br /> # BiomarkerKB<br /> # GlyGen<br /> |-<br /> |[https://www.linkedin.com/in/john-mccaffrey-b8850930a/ John McCaffrey]<br /> |Biomarker curation<br /> |<br /> # PredictMod<br /> # BiomarkerKB<br /> # GlyGen Biocuration<br /> |-<br /> |[https://www.linkedin.com/in/nathan-ressom/ Nathan Ressom]<br /> |ARGOS<br /> |<br /> # PredictMod<br /> # GlyGen Biocuration<br /> # BiomarkerKB Biocuration<br /> |-<br /> |[https://www.linkedin.com/in/aaron-ressom/ Aaron Ressom]<br /> |<br /> |<br /> # BiomarkerKB<br /> # PredictMod<br /> # GlyGen<br /> |-<br /> |[https://www.linkedin.com/in/akale-kinfe/ Akale Kinfe]<br /> |<br /> |<br /> # BiomarkerKB Biocuration<br /> # GlyGen Biocuration<br /> # ARGOS<br /> |}</div>

Vishal.bakshi https://hivelab.biochemistry.gwu.edu/wiki/index.php?title=Volunteership_2025&diff=814 Volunteership 2025 2025-04-25T22:03:35Z

<p>Vishal.bakshi: /* Volunteers */</p> <hr /> <div><h2>2025 Volunteer Program Details</h2><br /> <br /> <h3>Dates</h3><br /> <strong>Volunteer Zoom Kick-Off Meeting</strong><br><br /> May 26, 2025 | 3:30 to 4:30 PM<br /> <br /> <strong>Program Dates: June 2nd, 2025 – July 25th, 2025</strong> (8 weeks)<br><br /> Monday to Friday | Remote | No breaks<br /> <br /> <hr><br /> <br /> <h3>Volunteer Expectations</h3><br /> <ol><br /> <li>Daily progress updates via Slack (scrum).</li><br /> <li>Regular Zoom meetings with the assigned project point of contact.</li><li>Expected to dedicate 5–6 hours per day to project work, with the remaining time focused on skill development or reading. </li><br /> </ol><br /> <p style="color: red;"><strong>Important:</strong> If the scrum is not updated for 2 consecutive days, the candidate will be <u>automatically dropped</u> from the program.</p><br /> <hr><br /> <br /> <h3>Potential Projects</h3><br /> <ol><br /> <li>BiomarkerKB ([https://biomarkerkb.org biomarkerkb.org]) project: Biomarker curation project. Involves reading papers and collecting biomarkers.</li><br /> <li>GlyGen ([https://glygen.org glygen.org]) project: Review glycomics and glycoproteomics data and curate tissue, disease, and other related information. </li><li>ARGOS ([https://argosdb.org argosdb.org]) project: Analyze genomics data using HIVE to identify reference genome assemblies. </li><li>PredictMod ([https://hivelab.biochemistry.gwu.edu/predictmod hivelab.biochemistry.gwu.edu/predictmod]) project. Identifying datasets and harmonizing them so that they can be used to generate ML models. </li></ol>''Note: Individuals involved in the above projects with a background in programming and/or machine learning may also undertake additional tasks to support the development of ML models, which can be integrated into PredictMod or used to enhance AI/ML-ready datasets within GlyGen.''<hr><br /> <br /> <h4>1. 
BiomarkerKB Biocuration Project Ideas</h4>POC: Daniall Masood<br /> # Curate biomarkers for a specific disease (Alzheimer's disease)<br /> ## The student would do manual curation for about 4 weeks, with regular check-ins with the POC to ensure it is being done correctly.<br /> ## The next 4 weeks can be dedicated to developing an LLM or an automated process to extract biomarker details, using the data collected in the first 4 weeks as training or example data.<br /> # Top 50 biomarkers<br /> ## Curate the top 50 biomarkers for biomarkerkb.org.<br /> ## Define what constitutes a top 50 biomarker.<br /> ## Begin curating biomarkers from different sources and papers by collecting the fields mentioned in the data model, as well as cross-references.<br /> # Biocuration of biomarkers from NLP/LLM work<br /> ## Use the biomarkers collected from NLP work.<br /> ## Curate these biomarkers; the NLP output was not provided in the biomarker data model format.<br /> ## While curating the biomarkers, check whether the data collected from NLP is correct.<br /> ## After completion, the student can start using the curated data to work on the NLP/LLM method.<br /> # Curate biomarkers for a treatment<br /> ## See #1 above.<br /> <br /> If you have any other ideas, diseases, treatments, or methods you want to focus on, please reach out to daniallmasood@gwu.edu to discuss them and check whether they are feasible as a project for the summer.<br /> <br /> ==== 2. GlyGen Biocuration Project Ideas ====<br /> POC: Rene Ranzinger and Urnisha Bhuiyan<br /> <br /> Over the last three decades, numerous glycomics database projects have been initiated to collect valuable information about glycans, proteins, and their interactions. Some of these databases have been discontinued due to the end of project funding. However, the data within these databases remains highly valuable to the community. 
Integrating these datasets into modern databases or knowledgebases, such as GlyGen, presents a challenge because many of the valuable metadata annotations (e.g., species, tissue, disease, cell line) are free-text terms that do not align with the standard dictionaries and ontologies used in modern resources. Automated matching of this information with dictionaries or ontologies is often not possible due to the use of synonyms, spelling errors, or abbreviations. For example, "human," "man," and "h. sapiens" all map to the scientific species name "Homo sapiens."<br /> <br /> The GlyGen project aims to make datasets from two older databases (CarbBank, CFG) accessible by migrating the data and metadata into our database. For this project, we are seeking curators with a medical or biology background who are interested in helping map metadata terms from these old databases to standard dictionaries and ontologies.<br /> <br /> The project involves:<br /> <br /> # Using internet resources (e.g., Google, Wikipedia) to identify terms used in the old database.<br /> # Mapping identified terms to corresponding dictionaries and ontologies using the webpages and search interfaces of these projects.<br /> # Finding papers based on titles and author lists that may contain spelling errors.<br /> # Interacting and discussing with other curators when terms are mapped differently.<br /> <br /> If you have any other ideas or methods you would like to focus on, please reach out to rene@ccrc.uga.edu to discuss them.<br /> <br /> '''3. GlyGen Publication Analysis Project Ideas'''<br /> <br /> POC: Rene Ranzinger and Urnisha Bhuiyan<br /> <br /> One of the challenges for any bioinformatics project is understanding the size of its community, how well the project serves this community, and how widely its software/database is used. A potential solution is to analyze PubMed publication data. 
We are seeking applicants with programming skills (in Python or Java) to perform this analysis.<br /> <br /> The project involves:<br /> <br /> # Using the PubMed web API to filter publications based on keywords.<br /> # Analyzing paper abstracts to identify research institutions and groups that form the community.<br /> # Filtering the community list to exclude unrelated co-authors.<br /> <br /> A subproject will involve analyzing the full text of papers (when available) for keywords or resource and database names. The results of the analysis will be discussed with a GlyGen project member, who will suggest changes and improvements to the analysis and data presentation. Source code developed as part of this project will be documented and shared in a public GitHub repository. If you have any other ideas or methods you would like to explore, please reach out to rene@ccrc.uga.edu to discuss them.<br /> <br /> ==== 4. PredictMod Machine Learning Project Ideas ====<br /> POC: Lori Krammer<br /> <br /> Data Identification & Harmonization: <br /> <br /> # Identify publicly available datasets from scientific literature that can be used for intervention outcome prediction models.<br /> # Conduct data harmonization and pre-processing following established project pipelines to produce an ML-ready dataset and data dictionary.<br /> <br /> Modeling & Integration (for those with experience in programming/ML)<br /> <br /> # Perform model training and document the ML pipeline in a BioCompute Object (BCO). <br /> # Integrate the model into the PredictMod platform.<br /> <br /> Individuals with a background or interest in machine learning should reach out to lorikrammer@gwu.edu with a potential dataset to determine if it is a feasible project for the summer.<br /> <br /> '''5. 
FDA-ARGOS Computation and Pathogen Curation Project'''<br /> <br /> POC: Christie Woodside<br /> <br /> # Update data tables for more efficient computations<br /> ## The student would review and enter additional data and IDs in the tables/sheets used to perform computations. This is manual but important work that requires close attention to detail. Approximately 1 week of work.<br /> ## Requires a Python/shell coding background. The student would run scripts that prepare and format data tables that are pushed to data.argosdb.org. Coding knowledge is needed in case of errors, bugs, or other mishaps in the code. Ongoing work as computations are performed.<br /> # Curate and report on current pathogens to upload to ARGOS<br /> ## The student would work on manual curation of circulating pathogens to be added to data.argosdb.org, with regular check-ins and reports on what was found. Approximately 4–10 weeks of work.<br /> ## Locate assembly IDs, reads, and metagenomic information for these pathogens to be used in computations and deposited into data.argosdb.org.<br /> ## Provide documentation on why the pathogens were curated, why they are important, how they were selected, and how the data was collected.<br /> <br /> If you have any other ideas or methods you want to focus on, please reach out to christie.woodside@email.gwu.edu to discuss them and check whether they are feasible as a project for the summer.<hr><br /> <h3>Requirements for Completion</h3><br /> <p><strong>Note:</strong> The following are <u>mandatory</u>. 
Failure to complete any of them will result in an incomplete volunteer record.</p><br /> <br /> <h4>Documentation</h4><br /> <p>All volunteers must maintain adequate documentation of their work, including written protocols and scripts submitted to GitHub.</p><br /> <br /> <h4>Written Report</h4><br /> <p>Submit a 1–2 page summary of your tasks and accomplishments to the Admin Team during the final week of your program.</p><br /> <br /> <h4>Presentation & Slide Submission</h4><br /> <p>Present your work during the last week of the 8-week period.</p><br /> <p>Slides must be submitted to the Admin Team and should include:</p><br /> <ul><br /> <li>A title slide with your name, date, and mentor</li><br /> <li>At least 3 content slides</li><br /> <li>A final slide with acknowledgements or references</li><br /> </ul><br /> Contact the Admin Team to access previously submitted slides.<br /> <hr><br /> <br /> === Completion Certificate ===<br /> A certificate of completion and a letter of recommendation will be provided to all participants who successfully complete the program.<br /> <hr><br /> === Contact ===<br /> mazumder_lab@gwu.edu.<br /> <hr><br /> === Volunteers ===<br /> {| class="wikitable"<br /> |+<br /> |-<br /> ! Name<br /> !Project<br /> !Projects Interested<br /> |-<br /> | [https://www.linkedin.com/in/gracesjchong/ Grace Chong]<br /> |PredictMod<br /> |<br /> # PredictMod<br /> # BiomarkerKB Biocuration<br /> # GlyGen Biocuration<br /> |-<br /> |[https://www.linkedin.com/in/alma-ogunsina-4959072b1/ Alma Ogunsina]<br /> |Biomarker curation<br /> |<br /> # BiomarkerKB<br /> # ARGOS<br /> # PredictMod<br /> |-<br /> |[https://www.linkedin.com/in/diya-kamalabharathy-62557935a/ Diya Kamalabharathy]<br /> |PredictMod<br /> |<br /> # BiomarkerKB Biocuration<br /> # PredictMod Machine Learning<br /> # GlyGen Biocuration<br /> |-<br /> |[https://www.linkedin.com/in/harivinay-prasad-reddy-gujjula-a06ba71bb/ Harivinay P. 
Gujjula]<br /> |GlyGen curation<br /> |<br /> # GlyGen Biocuration<br /> # BiomarkerKB Biocuration<br /> |-<br /> |[https://www.linkedin.com/in/miao-wang-88b602290/ Miao Wang]<br /> |ARGOS<br /> |<br /> # BiomarkerKB Biocuration Project Ideas<br /> # FDA-ARGOS Computation and Pathogen Curation Project<br /> # PredictMod Machine Learning Project Ideas<br /> |-<br /> |[https://www.linkedin.com/in/nahom-gebreselassie-1545ab336/ Nahom Abel]<br /> |GlyGen curation<br /> |<br /> # BiomarkerKB Biocuration<br /> # GlyGen Biocuration<br /> # PredictMod<br /> |-<br /> |[https://www.linkedin.com/in/kajal-patel-cs/ Kajal Sanjaykumar Patel]<br /> |GlyGen and PubMed project<br /> |<br /> # PredictMod<br /> # BiomarkerKB<br /> # GlyGen<br /> |-<br /> |[https://www.linkedin.com/in/john-mccaffrey-b8850930a/ John McCaffrey]<br /> |Biomarker curation<br /> |<br /> # PredictMod<br /> # BiomarkerKB<br /> # GlyGen Biocuration<br /> |-<br /> |[https://www.linkedin.com/in/nathan-ressom/ Nathan Ressom]<br /> |ARGOS<br /> |<br /> # PredictMod<br /> # GlyGen Biocuration<br /> # BiomarkerKB Biocuration<br /> |-<br /> |[https://www.linkedin.com/in/aaron-ressom/ Aaron Ressom]<br /> |<br /> |<br /> # BiomarkerKB<br /> # PredictMod<br /> # GlyGen<br /> |}</div>

Vishal.bakshi https://hivelab.biochemistry.gwu.edu/wiki/index.php?title=Volunteership_2025&diff=803 Volunteership 2025 2025-04-25T19:54:37Z

<p>Vishal.bakshi: Added hyperlinks</p> <hr /> <div><h2>2025 Volunteer Program Details</h2><br /> <br /> <h3>Dates</h3><br /> <strong>Volunteer Zoom Kick-Off Meeting</strong><br><br /> May 26, 2025 | 3:30 to 4:30 PM<br /> <br /> <strong>Program Dates: June 2nd, 2025 – July 25th, 2025</strong> (8 weeks)<br><br /> Monday to Friday | Remote | No breaks<br /> <br /> <hr><br /> <br /> <h3>Volunteer Expectations</h3><br /> <ol><br /> <li>Daily progress updates via Slack (scrum).</li><br /> <li>Regular Zoom meetings with the assigned project point of contact.</li><li>Expected to dedicate 5–6 hours per day to project work, with the remaining time focused on skill development or reading. </li><br /> </ol><br /> <p style="color: red;"><strong>Important:</strong> If the scrum is not updated for 2 consecutive days, the candidate will be <u>automatically dropped</u> from the program.</p><br /> <hr><br /> <br /> <h3>Potential Projects</h3><br /> <ol><br /> <li>BiomarkerKB ([https://biomarkerkb.org biomarkerkb.org]) project: Biomarker curation project. Involves reading papers and collecting biomarkers.</li><br /> <li>GlyGen ([https://glygen.org glygen.org]) project: Review glycomics and glycoproteomics data and curate tissue, disease, and other related information. </li><li>ARGOS ([https://argosdb.org argosdb.org]) project: Analyze genomics data using HIVE to identify reference genome assemblies. </li><li>PredictMod ([https://hivelab.biochemistry.gwu.edu/predictmod hivelab.biochemistry.gwu.edu/predictmod]) project. Identifying datasets and harmonizing them so that they can be used to generate ML models. </li></ol>''Note: Individuals involved in the above projects with a background in programming and/or machine learning may also undertake additional tasks to support the development of ML models, which can be integrated into PredictMod or used to enhance AI/ML-ready datasets within GlyGen.''<hr><br /> <br /> <h4>1. 
BiomarkerKB Biocuration Project Ideas</h4>POC: Daniall Masood<br /> # Curate biomarkers for a specific disease (Alzheimer's disease)<br /> ## The student would do manual curation for about 4 weeks, with regular check-ins with the POC to ensure it is being done correctly.<br /> ## The next 4 weeks can be dedicated to developing an LLM or an automated process to extract biomarker details, using the data collected in the first 4 weeks as training or example data.<br /> # Top 50 biomarkers<br /> ## Curate the top 50 biomarkers for biomarkerkb.org.<br /> ## Define what constitutes a top 50 biomarker.<br /> ## Begin curating biomarkers from different sources and papers by collecting the fields mentioned in the data model, as well as cross-references.<br /> # Biocuration of biomarkers from NLP/LLM work<br /> ## Use the biomarkers collected from NLP work.<br /> ## Curate these biomarkers; the NLP output was not provided in the biomarker data model format.<br /> ## While curating the biomarkers, check whether the data collected from NLP is correct.<br /> ## After completion, the student can start using the curated data to work on the NLP/LLM method.<br /> # Curate biomarkers for a treatment<br /> ## See #1 above.<br /> <br /> If you have any other ideas, diseases, treatments, or methods you want to focus on, please reach out to daniallmasood@gwu.edu to discuss them and check whether they are feasible as a project for the summer.<br /> <br /> ==== 2. GlyGen Biocuration Project Ideas ====<br /> POC: Rene Ranzinger and Urnisha Bhuiyan<br /> <br /> Over the last three decades, numerous glycomics database projects have been initiated to collect valuable information about glycans, proteins, and their interactions. Some of these databases have been discontinued due to the end of project funding. However, the data within these databases remains highly valuable to the community. 
Integrating these datasets into modern databases or knowledgebases, such as GlyGen, presents a challenge because many of the valuable metadata annotations (e.g., species, tissue, disease, cell line) are free-text terms that do not align with the standard dictionaries and ontologies used in modern resources. Automated matching of this information with dictionaries or ontologies is often not possible due to the use of synonyms, spelling errors, or abbreviations. For example, "human," "man," and "h. sapiens" all map to the scientific species name "Homo sapiens."<br /> <br /> The GlyGen project aims to make datasets from two older databases (CarbBank, CFG) accessible by migrating the data and metadata into our database. For this project, we are seeking curators with a medical or biology background who are interested in helping map metadata terms from these old databases to standard dictionaries and ontologies.<br /> <br /> The project involves:<br /> <br /> # Using internet resources (e.g., Google, Wikipedia) to identify terms used in the old database.<br /> # Mapping identified terms to corresponding dictionaries and ontologies using the webpages and search interfaces of these projects.<br /> # Finding papers based on titles and author lists that may contain spelling errors.<br /> # Interacting and discussing with other curators when terms are mapped differently.<br /> <br /> If you have any other ideas or methods you would like to focus on, please reach out to rene@ccrc.uga.edu to discuss them.<br /> <br /> '''3. GlyGen Publication Analysis Project Ideas'''<br /> <br /> POC: Rene Ranzinger and Urnisha Bhuiyan<br /> <br /> One of the challenges for any bioinformatics project is understanding the size of its community, how well the project serves this community, and how widely its software/database is used. A potential solution is to analyze PubMed publication data. 
We are seeking applicants with programming skills (in Python or Java) to perform this analysis.<br /> <br /> The project involves:<br /> <br /> # Using the PubMed web API to filter publications based on keywords.<br /> # Analyzing paper abstracts to identify research institutions and groups that form the community.<br /> # Filtering the community list to exclude unrelated co-authors.<br /> <br /> A subproject will involve analyzing the full text of papers (when available) for keywords or resource and database names. The results of the analysis will be discussed with a GlyGen project member, who will suggest changes and improvements to the analysis and data presentation. Source code developed as part of this project will be documented and shared in a public GitHub repository. If you have any other ideas or methods you would like to explore, please reach out to rene@ccrc.uga.edu to discuss them.<br /> <br /> ==== 4. PredictMod Machine Learning Project Ideas ====<br /> POC: Lori Krammer<br /> <br /> Data Identification & Harmonization: <br /> <br /> # Identify publicly available datasets from scientific literature that can be used for intervention outcome prediction models.<br /> # Conduct data harmonization and pre-processing following established project pipelines to produce an ML-ready dataset and data dictionary.<br /> <br /> Modeling & Integration (for those with experience in programming/ML)<br /> <br /> # Perform model training and document the ML pipeline in a BioCompute Object (BCO). <br /> # Integrate the model into the PredictMod platform.<br /> <br /> Individuals with a background or interest in machine learning should reach out to lorikrammer@gwu.edu with a potential dataset to determine if it is a feasible project for the summer.<br /> <br /> '''5. 
FDA-ARGOS Computation and Pathogen Curation Project'''<br /> <br /> POC: Christie Woodside<br /> <br /> # Update data tables for more efficient computations<br /> ## Student would review and input additional data and IDs in the tables/sheets used to perform computations. This is manual but very important work that requires close attention to detail. ~1 week's worth of work.<br /> ## Requires Python/shell coding background. Student would run scripts that prepare and format data tables that are pushed to data.argosdb.org. Coding knowledge is needed in case of errors, bugs, or other mishaps in the code. Ongoing work as computations are performed.<br /> # Curate and report on current pathogens to upload to ARGOS<br /> ## Student would work on manual curation of circulating pathogens to be added to data.argosdb.org. Regular check-ins and reports of what was found. ~4-10 weeks' worth of work.<br /> ## Locate assembly IDs, reads, and metagenomic information for these pathogens to be used in computations and deposited into data.argosdb.org.<br /> ## Provide documentation on why they were curated, why they are important, how they were selected, and how data was collected.<br /> <br /> If you have any other ideas or methods you want to focus on, please reach out to christie.woodside@email.gwu.edu to discuss your idea and check whether it is feasible as a project for the summer.<hr><br /> <h3>Requirements for Completion</h3><br /> <p><strong>Note:</strong> The following are <u>mandatory</u>. 
Failure to complete any of them will result in an incomplete volunteer record.</p><br /> <br /> <h4>Documentation</h4><br /> <p>All volunteers must maintain adequate documentation of their work, including written protocols and scripts submitted to GitHub.</p><br /> <br /> <h4>Written Report</h4><br /> <p>Submit a 1–2 page summary of your tasks and accomplishments to the Admin Team during the final week of your program.</p><br /> <br /> <h4>Presentation & Slide Submission</h4><br /> <p>Present your work during the last week of the 8-week period.</p><br /> <p>Slides must be submitted to the Admin Team and should include:</p><br /> <ul><br /> <li>A title slide with your name, date, and mentor</li><br /> <li>At least 3 content slides</li><br /> <li>A final slide with acknowledgements or references</li><br /> </ul><br /> Contact the Admin Team to access previously submitted slides.<br /> <hr><br /> <br /> === Completion Certificate ===<br /> A certificate of completion and a letter of recommendation will be provided to all participants who successfully complete the program.<br /> <hr><br /> === Contact ===<br /> mazumder_lab@gwu.edu.<br /> <hr><br /> === Volunteers ===<br /> {| class="wikitable"<br /> |+<br /> |-<br /> ! Name<br /> !Project<br /> !Projects Interested<br /> |-<br /> | [https://www.linkedin.com/in/gracesjchong/ Grace Chong]<br /> |PredictMod<br /> |<br /> # PredictMod<br /> # BiomarkerKB Biocuration<br /> # GlyGen Biocuration<br /> |-<br /> |[https://www.linkedin.com/in/alma-ogunsina-4959072b1/ Alma Ogunsina]<br /> |Biomarker curation<br /> |<br /> # BiomarkerKB<br /> # ARGOS<br /> # PredictMod<br /> |-<br /> |[https://www.linkedin.com/in/diya-kamalabharathy-62557935a/ Diya Kamalabharathy]<br /> |PredictMod<br /> |<br /> # BiomarkerKB Biocuration<br /> # PredictMod Machine Learning<br /> # GlyGen Biocuration<br /> |-<br /> |[https://www.linkedin.com/in/harivinay-prasad-reddy-gujjula-a06ba71bb/ Harivinay P. 
Gujjula]<br /> |GlyGen curation<br /> |<br /> # GlyGen Biocuration<br /> # BioMarkerKB Biocuration<br /> |-<br /> |[https://www.linkedin.com/in/miao-wang-88b602290/Miao&#x20;Wang Miao Wang]<br /> |ARGOS<br /> |<br /> # BiomarkerKB Biocuration Project Ideas<br /> # FDA-ARGOS Computation and Pathogen Curation Project<br /> # PredictMod Machine Learning Project Ideas<br /> |-<br /> |[https://www.linkedin.com/in/nahom-gebreselassie-1545ab336/ Nahom Abel]<br /> |GlyGen curation<br /> |<br /> # BiomarkerKB Biocuration<br /> # GlyGen Biocuration<br /> # PredictMod<br /> |-<br /> |[https://www.linkedin.com/in/kajal-patel-cs/ Kajal Sanjaykumar Patel]<br /> |GlyGen and PubMed project<br /> |<br /> #PredictMod<br /> #BiomarkerKB<br /> #GlyGen<br /> |-<br /> |[https://www.linkedin.com/in/john-mccaffrey-b8850930a/ John McCaffrey]<br /> |Biomarker curation<br /> |<br /> # PredictMod<br /> # BiomarkerKB<br /> # GlyGen Biocuration <br /> |-<br /> |[https://www.linkedin.com/in/nathan-ressom/ Nathan Ressom]<br /> |ARGOS<br /> |<br /> # PredictMod <br /> # GlyGen Biocuration<br /> # BiomarkerKB Biocuration<br /> |}</div>

Vishal.bakshi https://hivelab.biochemistry.gwu.edu/wiki/index.php?title=Volunteership_2025&diff=801 Volunteership 2025 2025-04-25T19:49:55Z

<p>Vishal.bakshi: /* Volunteers */</p> <hr /> <div><h2>2025 Volunteer Program Details</h2><br /> <br /> <h3>Dates</h3><br /> <strong>Volunteer Zoom Kick-Off Meeting</strong><br><br /> May 26, 2025 | 3:30 to 4:30 PM<br /> <br /> <strong>Program Dates: June 2nd, 2025 – July 25th, 2025</strong> (8 weeks)<br><br /> Monday to Friday | Remote | No breaks<br /> <br /> <hr><br /> <br /> <h3>Volunteer Expectations</h3><br /> <ol><br /> <li>Daily progress updates via Slack (scrum).</li><br /> <li>Regular Zoom meetings with the assigned project point of contact.</li><li>Expected to dedicate 5–6 hours per day to project work, with the remaining time focused on skill development or reading. </li><br /> </ol><br /> <p style="color: red;"><strong>Important:</strong> If the scrum is not updated for 2 consecutive days, the candidate will be <u>automatically dropped</u> from the program.</p><br /> <hr><br /> <br /> <h3>Potential Projects</h3><br /> <ol><br /> <li>BiomarkerKB ([https://biomarkerkb.org biomarkerkb.org]) project: Biomarker curation project. Involves reading papers and collecting biomarkers.</li><br /> <li>GlyGen ([https://glygen.org glygen.org]) project: Review glycomics and glycoproteomics data and curate tissue, disease, and other related information. </li><li>ARGOS ([https://argosdb.org argosdb.org]) project: Analyze genomics data using HIVE to identify reference genome assemblies. </li><li>PredictMod ([https://hivelab.biochemistry.gwu.edu/predictmod hivelab.biochemistry.gwu.edu/predictmod]) project. Identifying datasets and harmonizing them so that they can be used to generate ML models. </li></ol>''Note: Individuals involved in the above projects with a background in programming and/or machine learning may also undertake additional tasks to support the development of ML models, which can be integrated into PredictMod or used to enhance AI/ML-ready datasets within GlyGen.''<hr><br /> <br /> <h4>1. 
BiomarkerKB Biocuration Project Ideas</h4>POC: Daniall Masood<br /> # Curate biomarkers for a specific disease (Alzheimer's)<br /> ## The student would do manual curation for about 4 weeks, with regular check-ins with the POC to ensure it is being done correctly.<br /> ## The next 4 weeks can be dedicated to developing an LLM or an automated process to extract biomarker details, using the data collected in the first 4 weeks as training/example data.<br /> # Top 50 biomarkers<br /> ## Curate the top 50 biomarkers for biomarkerkb.org.<br /> ## Define what constitutes a top 50 biomarker.<br /> ## Begin curating biomarkers from different sources and papers by collecting fields mentioned in the data model, as well as collecting cross-references.<br /> # Biocuration of biomarkers from NLP/LLM work<br /> ## Use the biomarkers collected from NLP work.<br /> ## Curate biomarkers. The NLP-collected data was not provided in the biomarker data model format.<br /> ## While curating the biomarkers, check if data collected from NLP is correct.<br /> ## After completion, the student can start using curated data to work on the NLP/LLM method.<br /> # Curate biomarkers for a treatment<br /> ## See #1 above.<br /> <br /> If you have any other ideas, diseases, treatments, or methods you want to focus on, please reach out to daniallmasood@gwu.edu to discuss your idea and check whether it is feasible as a project for the summer.<br /> <br /> ==== 2. GlyGen Biocuration Project Ideas ====<br /> POC: Rene Ranzinger and Urnisha Bhuiyan<br /> <br /> Over the last three decades, numerous glycomics database projects have been initiated to collect valuable information about glycans, proteins, and their interactions. Some of these databases have been discontinued due to the end of project funding. However, the data within these databases remains highly valuable to the community. 
Integrating these datasets into modern databases or knowledgebases, such as GlyGen, presents a challenge because many of the valuable metadata annotations (e.g., species, tissue, disease, cell line) are free-text terms that do not align with the standard dictionaries and ontologies used in modern resources. Automated matching of this information with dictionaries or ontologies is often not possible due to the use of synonyms, spelling errors, or abbreviations. For example, "human," "man," and "h. sapiens" all map to the scientific species name "Homo sapiens."<br /> <br /> The GlyGen project aims to make datasets from two older databases (CarbBank, CFG) accessible by migrating the data and metadata into our database. For this project, we are seeking curators with a medical or biology background who are interested in helping map metadata terms from these old databases to standard dictionaries and ontologies.<br /> <br /> The project involves:<br /> <br /> # Using internet resources (e.g., Google, Wikipedia) to identify terms used in the old databases.<br /> # Mapping identified terms to corresponding dictionaries and ontologies using the webpages and search interfaces of these projects.<br /> # Finding papers based on titles and author lists that may contain spelling errors.<br /> # Discussing with other curators when terms are mapped differently.<br /> <br /> If you have any other ideas or methods you would like to focus on, please reach out to rene@ccrc.uga.edu to discuss them.<br /> <br /> '''3. GlyGen Publication Analysis Project Ideas'''<br /> <br /> POC: Rene Ranzinger and Urnisha Bhuiyan<br /> <br /> One of the challenges for any bioinformatics project is understanding the size of its community, how well the project serves this community, and how widely its software/database is used. A potential solution is to analyze PubMed publication data. 
We are seeking applicants with programming skills (in Python or Java) to perform this analysis.<br /> <br /> The project involves:<br /> <br /> # Using the PubMed web API to filter publications based on keywords.<br /> # Analyzing paper abstracts to identify research institutions and groups that form the community.<br /> # Filtering the community list to exclude unrelated co-authors.<br /> <br /> A subproject will involve analyzing the full text of papers (when available) for keywords or resource and database names. The results of the analysis will be discussed with a GlyGen project member, who will suggest changes and improvements to the analysis and data presentation. Source code developed as part of this project will be documented and shared in a public GitHub repository. If you have any other ideas or methods you would like to explore, please reach out to rene@ccrc.uga.edu to discuss them.<br /> <br /> ==== 4. PredictMod Machine Learning Project Ideas ====<br /> POC: Lori Krammer<br /> <br /> Data Identification & Harmonization: <br /> <br /> # Identify publicly available datasets from scientific literature that can be used for intervention outcome prediction models.<br /> # Conduct data harmonization and pre-processing following established project pipelines to produce an ML-ready dataset and data dictionary.<br /> <br /> Modeling & Integration (for those with experience in programming/ML)<br /> <br /> # Perform model training and document the ML pipeline in a BioCompute Object (BCO). <br /> # Integrate the model into the PredictMod platform.<br /> <br /> Individuals with a background or interest in machine learning should reach out to lorikrammer@gwu.edu with a potential dataset to determine if it is a feasible project for the summer.<br /> <br /> '''5. 
FDA-ARGOS Computation and Pathogen Curation Project'''<br /> <br /> POC: Christie Woodside<br /> <br /> # Update data tables for more efficient computations<br /> ## Student would review and input additional data and IDs in the tables/sheets used to perform computations. This is manual but very important work that requires close attention to detail. ~1 week's worth of work.<br /> ## Requires Python/shell coding background. Student would run scripts that prepare and format data tables that are pushed to data.argosdb.org. Coding knowledge is needed in case of errors, bugs, or other mishaps in the code. Ongoing work as computations are performed.<br /> # Curate and report on current pathogens to upload to ARGOS<br /> ## Student would work on manual curation of circulating pathogens to be added to data.argosdb.org. Regular check-ins and reports of what was found. ~4-10 weeks' worth of work.<br /> ## Locate assembly IDs, reads, and metagenomic information for these pathogens to be used in computations and deposited into data.argosdb.org.<br /> ## Provide documentation on why they were curated, why they are important, how they were selected, and how data was collected.<br /> <br /> If you have any other ideas or methods you want to focus on, please reach out to christie.woodside@email.gwu.edu to discuss your idea and check whether it is feasible as a project for the summer.<hr><br /> <h3>Requirements for Completion</h3><br /> <p><strong>Note:</strong> The following are <u>mandatory</u>. 
Failure to complete any of them will result in an incomplete volunteer record.</p><br /> <br /> <h4>Documentation</h4><br /> <p>All volunteers must maintain adequate documentation of their work, including written protocols and scripts submitted to GitHub.</p><br /> <br /> <h4>Written Report</h4><br /> <p>Submit a 1–2 page summary of your tasks and accomplishments to the Admin Team during the final week of your program.</p><br /> <br /> <h4>Presentation & Slide Submission</h4><br /> <p>Present your work during the last week of the 8-week period.</p><br /> <p>Slides must be submitted to the Admin Team and should include:</p><br /> <ul><br /> <li>A title slide with your name, date, and mentor</li><br /> <li>At least 3 content slides</li><br /> <li>A final slide with acknowledgements or references</li><br /> </ul><br /> Contact the Admin Team to access previously submitted slides.<br /> <hr><br /> <br /> === Completion Certificate ===<br /> A certificate of completion and a letter of recommendation will be provided to all participants who successfully complete the program.<br /> <hr><br /> === Contact ===<br /> mazumder_lab@gwu.edu.<br /> <hr><br /> === Volunteers ===<br /> {| class="wikitable"<br /> |+<br /> |-<br /> ! Name<br /> !Project<br /> !Projects Interested<br /> |-<br /> | Grace Chong<br /> |PredictMod<br /> |<br /> # PredictMod<br /> # BiomarkerKB Biocuration<br /> # GlyGen Biocuration<br /> |-<br /> |Alma Ogunsina<br /> |Biomarker curation<br /> |<br /> # BiomarkerKB<br /> # ARGOS<br /> # PredictMod<br /> |-<br /> |Diya Kamalabharathy<br /> |PredictMod<br /> |<br /> # BiomarkerKB Biocuration<br /> # PredictMod Machine Learning<br /> # GlyGen Biocuration<br /> |-<br /> |Harivinay P. 
Gujjula<br /> |GlyGen curation<br /> |<br /> # GlyGen Biocuration<br /> # BioMarkerKB Biocuration<br /> |-<br /> |Miao Wang<br /> |ARGOS<br /> |<br /> # BiomarkerKB Biocuration Project Ideas<br /> # FDA-ARGOS Computation and Pathogen Curation Project<br /> # PredictMod Machine Learning Project Ideas<br /> |-<br /> |Nahom Abel<br /> |GlyGen curation<br /> |<br /> # BiomarkerKB Biocuration<br /> # GlyGen Biocuration<br /> # PredictMod<br /> |-<br /> |Kajal Sanjaykumar Patel<br /> |GlyGen and PubMed project<br /> |<br /> #PredictMod<br /> #BiomarkerKB<br /> #GlyGen<br /> |-<br /> |[https://www.linkedin.com/in/john-mccaffrey-b8850930a/ John McCaffrey]<br /> |Biomarker curation<br /> |<br /> # PredictMod<br /> # BiomarkerKB<br /> # GlyGen Biocuration <br /> |-<br /> |Nathan Ressom<br /> |ARGOS<br /> |<br /> # PredictMod <br /> # GlyGen Biocuration<br /> # BiomarkerKB Biocuration<br /> |}</div>

Vishal.bakshi https://hivelab.biochemistry.gwu.edu/wiki/index.php?title=Volunteership_2025&diff=799 Volunteership 2025 2025-04-25T19:49:10Z

<p>Vishal.bakshi: /* Volunteers */</p> <hr /> <div><h2>2025 Volunteer Program Details</h2><br /> <br /> <h3>Dates</h3><br /> <strong>Volunteer Zoom Kick-Off Meeting</strong><br><br /> May 26, 2025 | 3:30 to 4:30 PM<br /> <br /> <strong>Program Dates: June 2nd, 2025 – July 25th, 2025</strong> (8 weeks)<br><br /> Monday to Friday | Remote | No breaks<br /> <br /> <hr><br /> <br /> <h3>Volunteer Expectations</h3><br /> <ol><br /> <li>Daily progress updates via Slack (scrum).</li><br /> <li>Regular Zoom meetings with the assigned project point of contact.</li><li>Expected to dedicate 5–6 hours per day to project work, with the remaining time focused on skill development or reading. </li><br /> </ol><br /> <p style="color: red;"><strong>Important:</strong> If the scrum is not updated for 2 consecutive days, the candidate will be <u>automatically dropped</u> from the program.</p><br /> <hr><br /> <br /> <h3>Potential Projects</h3><br /> <ol><br /> <li>BiomarkerKB ([https://biomarkerkb.org biomarkerkb.org]) project: Biomarker curation project. Involves reading papers and collecting biomarkers.</li><br /> <li>GlyGen ([https://glygen.org glygen.org]) project: Review glycomics and glycoproteomics data and curate tissue, disease, and other related information. </li><li>ARGOS ([https://argosdb.org argosdb.org]) project: Analyze genomics data using HIVE to identify reference genome assemblies. </li><li>PredictMod ([https://hivelab.biochemistry.gwu.edu/predictmod hivelab.biochemistry.gwu.edu/predictmod]) project. Identifying datasets and harmonizing them so that they can be used to generate ML models. </li></ol>''Note: Individuals involved in the above projects with a background in programming and/or machine learning may also undertake additional tasks to support the development of ML models, which can be integrated into PredictMod or used to enhance AI/ML-ready datasets within GlyGen.''<hr><br /> <br /> <h4>1. 
BiomarkerKB Biocuration Project Ideas</h4>POC: Daniall Masood<br /> # Curate biomarkers for a specific disease (Alzheimer's)<br /> ## The student would do manual curation for about 4 weeks, with regular check-ins with the POC to ensure it is being done correctly.<br /> ## The next 4 weeks can be dedicated to developing an LLM or an automated process to extract biomarker details, using the data collected in the first 4 weeks as training/example data.<br /> # Top 50 biomarkers<br /> ## Curate the top 50 biomarkers for biomarkerkb.org.<br /> ## Define what constitutes a top 50 biomarker.<br /> ## Begin curating biomarkers from different sources and papers by collecting fields mentioned in the data model, as well as collecting cross-references.<br /> # Biocuration of biomarkers from NLP/LLM work<br /> ## Use the biomarkers collected from NLP work.<br /> ## Curate biomarkers. The NLP-collected data was not provided in the biomarker data model format.<br /> ## While curating the biomarkers, check if data collected from NLP is correct.<br /> ## After completion, the student can start using curated data to work on the NLP/LLM method.<br /> # Curate biomarkers for a treatment<br /> ## See #1 above.<br /> <br /> If you have any other ideas, diseases, treatments, or methods you want to focus on, please reach out to daniallmasood@gwu.edu to discuss your idea and check whether it is feasible as a project for the summer.<br /> <br /> ==== 2. GlyGen Biocuration Project Ideas ====<br /> POC: Rene Ranzinger and Urnisha Bhuiyan<br /> <br /> Over the last three decades, numerous glycomics database projects have been initiated to collect valuable information about glycans, proteins, and their interactions. Some of these databases have been discontinued due to the end of project funding. However, the data within these databases remains highly valuable to the community. 
Integrating these datasets into modern databases or knowledgebases, such as GlyGen, presents a challenge because many of the valuable metadata annotations (e.g., species, tissue, disease, cell line) are free-text terms that do not align with the standard dictionaries and ontologies used in modern resources. Automated matching of this information with dictionaries or ontologies is often not possible due to the use of synonyms, spelling errors, or abbreviations. For example, "human," "man," and "h. sapiens" all map to the scientific species name "Homo sapiens."<br /> <br /> The GlyGen project aims to make datasets from two older databases (CarbBank, CFG) accessible by migrating the data and metadata into our database. For this project, we are seeking curators with a medical or biology background who are interested in helping map metadata terms from these old databases to standard dictionaries and ontologies.<br /> <br /> The project involves:<br /> <br /> # Using internet resources (e.g., Google, Wikipedia) to identify terms used in the old databases.<br /> # Mapping identified terms to corresponding dictionaries and ontologies using the webpages and search interfaces of these projects.<br /> # Finding papers based on titles and author lists that may contain spelling errors.<br /> # Discussing with other curators when terms are mapped differently.<br /> <br /> If you have any other ideas or methods you would like to focus on, please reach out to rene@ccrc.uga.edu to discuss them.<br /> <br /> '''3. GlyGen Publication Analysis Project Ideas'''<br /> <br /> POC: Rene Ranzinger and Urnisha Bhuiyan<br /> <br /> One of the challenges for any bioinformatics project is understanding the size of its community, how well the project serves this community, and how widely its software/database is used. A potential solution is to analyze PubMed publication data. 
We are seeking applicants with programming skills (in Python or Java) to perform this analysis.<br /> <br /> The project involves:<br /> <br /> # Using the PubMed web API to filter publications based on keywords.<br /> # Analyzing paper abstracts to identify research institutions and groups that form the community.<br /> # Filtering the community list to exclude unrelated co-authors.<br /> <br /> A subproject will involve analyzing the full text of papers (when available) for keywords or resource and database names. The results of the analysis will be discussed with a GlyGen project member, who will suggest changes and improvements to the analysis and data presentation. Source code developed as part of this project will be documented and shared in a public GitHub repository. If you have any other ideas or methods you would like to explore, please reach out to rene@ccrc.uga.edu to discuss them.<br /> <br /> ==== 4. PredictMod Machine Learning Project Ideas ====<br /> POC: Lori Krammer<br /> <br /> Data Identification & Harmonization: <br /> <br /> # Identify publicly available datasets from scientific literature that can be used for intervention outcome prediction models.<br /> # Conduct data harmonization and pre-processing following established project pipelines to produce an ML-ready dataset and data dictionary.<br /> <br /> Modeling & Integration (for those with experience in programming/ML)<br /> <br /> # Perform model training and document the ML pipeline in a BioCompute Object (BCO). <br /> # Integrate the model into the PredictMod platform.<br /> <br /> Individuals with a background or interest in machine learning should reach out to lorikrammer@gwu.edu with a potential dataset to determine if it is a feasible project for the summer.<br /> <br /> '''5. 
FDA-ARGOS Computation and Pathogen Curation Project'''<br /> <br /> POC: Christie Woodside<br /> <br /> # Update data tables for more efficient computations<br /> ## Student would review and input additional data and IDs in the tables/sheets used to perform computations. This is manual but very important work that requires close attention to detail. ~1 week's worth of work.<br /> ## Requires Python/shell coding background. Student would run scripts that prepare and format data tables that are pushed to data.argosdb.org. Coding knowledge is needed in case of errors, bugs, or other mishaps in the code. Ongoing work as computations are performed.<br /> # Curate and report on current pathogens to upload to ARGOS<br /> ## Student would work on manual curation of circulating pathogens to be added to data.argosdb.org. Regular check-ins and reports of what was found. ~4-10 weeks' worth of work.<br /> ## Locate assembly IDs, reads, and metagenomic information for these pathogens to be used in computations and deposited into data.argosdb.org.<br /> ## Provide documentation on why they were curated, why they are important, how they were selected, and how data was collected.<br /> <br /> If you have any other ideas or methods you want to focus on, please reach out to christie.woodside@email.gwu.edu to discuss your idea and check whether it is feasible as a project for the summer.<hr><br /> <h3>Requirements for Completion</h3><br /> <p><strong>Note:</strong> The following are <u>mandatory</u>. 
Failure to complete any of them will result in an incomplete volunteer record.</p><br /> <br /> <h4>Documentation</h4><br /> <p>All volunteers must maintain adequate documentation of their work, including written protocols and scripts submitted to GitHub.</p><br /> <br /> <h4>Written Report</h4><br /> <p>Submit a 1–2 page summary of your tasks and accomplishments to the Admin Team during the final week of your program.</p><br /> <br /> <h4>Presentation & Slide Submission</h4><br /> <p>Present your work during the last week of the 8-week period.</p><br /> <p>Slides must be submitted to the Admin Team and should include:</p><br /> <ul><br /> <li>A title slide with your name, date, and mentor</li><br /> <li>At least 3 content slides</li><br /> <li>A final slide with acknowledgements or references</li><br /> </ul><br /> Contact the Admin Team to access previously submitted slides.<br /> <hr><br /> <br /> === Completion Certificate ===<br /> A certificate of completion and a letter of recommendation will be provided to all participants who successfully complete the program.<br /> <hr><br /> === Contact ===<br /> mazumder_lab@gwu.edu.<br /> <hr><br /> === Volunteers ===<br /> {| class="wikitable"<br /> |+<br /> |-<br /> ! Name<br /> !Project<br /> !Projects Interested<br /> |-<br /> | Grace Chong<br /> |PredictMod<br /> |<br /> # PredictMod<br /> # BiomarkerKB Biocuration<br /> # GlyGen Biocuration<br /> |-<br /> |Alma Ogunsina<br /> |Biomarker curation<br /> |<br /> # BiomarkerKB<br /> # ARGOS<br /> # PredictMod<br /> |-<br /> |Diya Kamalabharathy<br /> |PredictMod<br /> |<br /> # BiomarkerKB Biocuration<br /> # PredictMod Machine Learning<br /> # GlyGen Biocuration<br /> |-<br /> |Harivinay P. 
Gujjula<br /> |GlyGen curation<br /> |<br /> # GlyGen Biocuration<br /> # BioMarkerKB Biocuration<br /> |-<br /> |Miao Wang<br /> |ARGOS<br /> |<br /> # BiomarkerKB Biocuration Project Ideas<br /> # FDA-ARGOS Computation and Pathogen Curation Project<br /> # PredictMod Machine Learning Project Ideas<br /> |-<br /> |Nahom Abel<br /> |GlyGen curation<br /> |<br /> # BiomarkerKB Biocuration<br /> # GlyGen Biocuration<br /> # PredictMod<br /> |-<br /> |Kajal Sanjaykumar Patel<br /> |GlyGen and PubMed project<br /> |<br /> #PredictMod<br /> #BiomarkerKB<br /> #GlyGen<br /> |-<br /> |John McCaffrey <br /> |Biomarker curation<br /> |<br /> # PredictMod<br /> # BiomarkerKB<br /> # GlyGen Biocuration <br /> |-<br /> |Nathan Ressom<br /> |ARGOS<br /> |<br /> # PredictMod <br /> # GlyGen Biocuration<br /> # BiomarkerKB Biocuration<br /> |}</div>

Vishal.bakshi https://hivelab.biochemistry.gwu.edu/wiki/index.php?title=2025_Bioinformatics_Symposium&diff=788 2025 Bioinformatics Symposium 2025-04-24T18:40:05Z

<p>Vishal.bakshi: </p> <hr /> <div>{{DISPLAYTITLE: 2025 Bioinformatics Symposium}}<br /> <br /> '''Title''': 2025 Inaugural GW Bioinformatics Symposium <br /> <br /> '''When''': April 29th, 2025, 9am to 6pm<br /> <br /> '''Venue''': Talks: SEH B1220. Refreshments, lunch, and posters: Green wall area (SEH B1167)<br /> <br /> Join us for a full-day, all-hands GW Bioinformatics Symposium featuring posters, talks, and roundtable discussions. Open to GW students, staff, and faculty! <br /> <br /> '''NOTE: We have received nearly 130 registrations, while SEH B1220 has a seating capacity of 90. Be sure to arrive early to secure a spot. For those who cannot be accommodated in the main room, we’ve arranged an overflow space in GWCC 8040 Conference Room, where the sessions will be live-streamed.'''<br /> <br /> '''REGISTRATION IS CLOSED. Event Registration:''' Space is limited. Please register by '''<u>April 12th, 2025</u>''' for the event through this '''<big>[https://docs.google.com/forms/d/e/1FAIpQLSd_VfXIL_S59cVgOBxx_0b0E-wMBphWbBVuK6-JOSm9-cqiJA/viewform?usp=sharing form]</big>'''. If you encounter any issues, please email Raja Mazumder (mazumder@gwu.edu) your name and the lab you are representing, and he will register you.<br /> <br /> '''ABSTRACT SUBMISSION IS CLOSED. Abstract submission:''' Please submit your abstract by '''<u>April 12th, 2025</u>''' through [https://cri-datacap.org/surveys/?s=8LFM3TKDWY338KDC '''<big>REDCap</big>''']<br /> <br /> '''OPT-OUT LUNCH PICKUP.''' We’re excited by the overwhelming response. Almost 130 participants from nearly all GW schools have signed up. Please note that seating in the main room is limited to the first 90 attendees. We encourage you to arrive early to secure a seat. For those who arrive later, we’re working to set up a spillover room with TV monitors so everyone can still follow the sessions. Lunch will be provided for all attendees who plan to stay for most of the day. 
If you’re attending only briefly and do not plan to pick up lunch, please let us know to help minimize food waste. You can either fill out this [https://docs.google.com/forms/d/e/1FAIpQLScWaw6JFsbtQgcdsk9XCZQMJ0mMb3Z0UR5uHZnTZ1DgBYJlDQ/viewform?usp=header form] or email mazumder@gwu.edu. We appreciate your understanding and cooperation.<br /> <br /> == Abstract/Overview ==<br /> <br /> The GW Bioinformatics Symposium on April 29, 2025, is a full-day event designed to bring together faculty, staff, and student bioinformatics researchers, as well as researchers who use bioinformatics in their labs, from across GW to foster networking, collaboration, and knowledge exchange. The symposium will feature talks from GW labs that focus on bioinformatics and related research, poster presentations, roundtable discussions, and sessions on resources, funding, and career opportunities in bioinformatics. Topics will span bioinformatics, computational methods, IT/security, and training, highlighting the breadth of bioinformatics in various GW schools and centers. This event offers a unique opportunity for attendees to engage in meaningful discussions, explore potential collaborations, and stay informed about the latest advancements in the field. The symposium is a great way to connect with the GW bioinformatics community.<br /> <br /> == Poster ==<br /> Participants are invited to submit a brief poster abstract by March 31st at 11:59 PM (ET). We encourage submissions from bioinformatics labs and from other labs that do not primarily focus on bioinformatics but have research relevant to bioinformatics topics. A select few will be chosen for lightning talks. Due to the limited number of poster boards, priority will be given to ensure each lab/group has at least one designated board. If the number of submitted poster abstracts exceeds the available poster boards, additional posters may be printed as flyers with QR codes, enabling attendees to scan, view, or download them electronically. 
<br /> <br /> '''Size:''' Poster sizes can be up to 48 (width) x 36 (height) inches.<br /> <br /> '''Poster Abstract Submission Portal:''' [https://cri-datacap.org/surveys/?s=8LFM3TKDWY338KDC Click here].<br /> <br /> '''Poster Printing Instructions'''<br /> <br /> Download the poster template from GW Research Day Resources: [https://guides.himmelfarb.gwu.edu/ResearchDay/poster-design-layout Poster Design & Layout].<br /> <br /> After you create the PPT for your poster, request free poster printing from Gelman Library [https://library.gwu.edu/3-d-and-large-format-printing using this form].<br /> <br /> Submit your printing request by April 15th.<br /> <br /> == Schedule ([https://hivelab.biochemistry.gwu.edu/wiki/2025_Bioinformatics_Symposium#Talk_Titles Talk Titles]) ==<br /> <br /> {| class="wikitable"<br /> |+<br /> !Time<br /> !Duration <br /> !Topic<br /> !Presenter(s)<br /> |-<br /> | colspan="4" |'''Morning Session'''<br /> '''Topics: Registration, introduction, and health-related topics.'''<br /> |-<br /> |8:30 - 9:00 AM<br /> |30 min<br /> |<u>Registration & coffee</u><br /> Lead: Raechelle McCants, Jewel Dias<br /> <br /> * Registration<br /> * Coffee<br /> * Poster Setup<br /> * Slide/AV setup & check<br /> |<br /> |-<br /> |9 - 11:00 AM<br /> |120 min<br /> |<u>Welcome</u><br /> Rong Li (Chair, Dept. BMM, SMHS)<br /> <br /> Alison Hall (Senior Assoc Dean for Res, SMHS)<br /> <br /> Raja Mazumder<br /> <br /> Anelia Horvath<br /> <br /> <u>Talks (10 mins + questions)</u><br /> <br /> Session chairs: Anelia Horvath, Raja Mazumder<br /> |1. Anelia Horvath (Biochemistry)<br /> 2. Ljubica Caldovic (Children’s)<br /> <br /> 3. Dae Young Kim (Children’s; Muhammad Rahman lab)<br /> <br /> 4. Seth Berger (Children’s)<br /> <br /> 5. Raja Mazumder (Biochemistry)<br /> <br /> 6. Yi-Wen Chen (Children’s)<br /> <br /> 7. Aintzane Santaquiteria Gil (Biology, Orti Lab)<br /> <br /> 8. 
*Marc Garbey (Neurology)<br /> <br /> <u>Two 4-min Poster Flash Talks</u><br /> <br /> 9. Jane Ulianova<br /> <br /> 10. Aiste Gulla<br /> |-<br /> |11 - 11:15 AM<br /> |15 min<br /> |Refreshment Break. <br /> |<br /> |-<br /> |11:15 - 12:30 PM<br /> |75 min<br /> |Talks (10 mins + questions)<br /> Session chairs: Ali Rahnavard, Ljubica Caldovic<br /> |1. Jo Lynne Rokita (Children’s)<br /> 2. Max Alekseyev (Milken)<br /> <br /> 3. Erika Hubbard (Crandall lab, Milken)<br /> <br /> 4. Ali R. Taheriyoun (Rahnavard Lab, Milken)<br /> <br /> 5. Hiroki Morizono (Children’s)<br /> <br /> <u>One 4-min Poster Flash Talk</u><br /> <br /> 6. Vania Ballesteros Prieto<br /> |-<br /> |12:30 - 2 PM<br /> |90 min<br /> |Lunch and poster session<br /> Lead: Anelia Horvath, Raechelle McCants, Jewel Dias<br /> |'''Poster Judging Committee:'''<br /> Anelia Horvath <br /> <br /> Ali Rahnavard <br /> <br /> Hiroki Morizono <br /> <br /> Yi-Wen Chen <br /> <br /> Jimmy Saw<br /> |-<br /> | colspan="4" |'''Afternoon Session'''<br /> '''Topics: Breadth of bioinformatics in biological research; IT/security; Training'''<br /> |-<br /> |2:00 - 3:30 PM<br /> |90 min<br /> |Talks (10 mins + questions)<br /> Session Chairs: Howie Huang, Jimmy Saw, Chen Zeng<br /> <br /> <u>3:10 PM remarks</u><br /> <br /> Evangeline J. Downie (Associate Dean for Research, CCAS)<br /> |1. Mohammad Hammas Saeed (Howie Huang Lab, Engineering)<br /> 2. Nan Wu (ECE, Engineering)<br /> <br /> 3. Ayah Zirikly (Computer Science, GW/JHU)<br /> <br /> 4. Chen Zeng (Physics)<br /> <br /> 5. Weiqun Peng (Physics)<br /> <br /> 6. Shekhar Nagar (Jimmy Saw Lab, Biology)<br /> <br /> <u>Two 4-min Poster Flash Talks</u><br /> <br /> 7. Chelcie Puetz<br /> <br /> 8. Kai Leung (Adam) Wong<br /> <br /> |-<br /> |3:30 - 4:30 PM<br /> |60 min<br /> |Session Chair: Jonathon Keeney.<br /> Co-chairs: Hiroki Morizono, Anelia Horvath<br /> <br /> * 5 min. 
Introduction: IT, omics support, and related topics<br /> <br /> * Questions from audience<br /> ** Round table discussion<br /> ** Careers in bioinformatics<br /> ** Funding opportunities<br /> |Clark Gaylord (Director, Research Technology Services)<br /> Brian Choi (MFA)<br /> <br /> Anelia Horvath (MGPC core/Bioinformatics support)<br /> <br /> Jack Villani (GW Genomics Core)<br /> <br /> Ali Rahnavard (CBI Analytics)<br /> <br /> Hiroki Morizono (Children's)<br /> |-<br /> |4:30 - 6:00 PM<br /> |<br /> |Networking event, poster prizes, and refreshments<br /> |Anelia Horvath<br /> Jonathon Keeney<br /> <br /> Raja Mazumder<br /> <br /> Keith Crandall<br /> |}<br /> <nowiki>*</nowiki>Talk titles TBD<br /> <br /> == Presentation/Discussion Sessions ==<br /> There will be a Q&A session and a networking event at the end of the symposium.<br /> <br /> == Scientific Organizing Committee ==<br /> <br /> Raja Mazumder (Symposium Chair), Anelia Horvath, Hiroki Morizono, Ljubica Caldovic, Keith Crandall, Jorge Sepulveda, Howie Huang, Chen Zeng, Jimmy Saw, Clark Gaylord.<br /> <br /> == Logistics Organizing Committee ==<br /> <br /> Raja Mazumder, Anelia Horvath, Raechelle McCants, Jewel Das. 
Student volunteers: Jane, Sofia, Allison, Chloe, Trupri and Lincoln.<br /> <br /> == Talk Titles ==<br /> {| class="wikitable"<br /> |+<br /> !Name<br /> !Department<br /> !School<br /> !Title<br /> |-<br /> |Jo Lynne Rokita<br /> |Pediatrics<br /> |CNH<br /> |Accelerating discovery and target identification for pediatric brain tumors through open-source platforms and tools<br /> |-<br /> |Erika Hubbard<br /> |Bioinformatics and Biostatistics<br /> |SPH<br /> |Machine Learning to Determine Endotypes of Lupus<br /> |-<br /> |Raja Mazumder<br /> |Biochemistry and Molecular Medicine<br /> |SMHS<br /> |Integrating Biomedical Knowledgebases and Clinical Data for ML/AI-Powered Insights<br /> |-<br /> |Jack Villani<br /> |GW Genomics Core<br /> |SPH<br /> |GW Genomics Core: An Introduction & Overview (panel discussion)<br /> |-<br /> | Ayah Zirikly || Computer Science || SEAS/Johns Hopkins University || Developments in NLP and AI for Mental Health: Insights from the Last Decade and Future Directions – A Focus on the CLPsych Workshop<br /> |-<br /> | Weiqun Peng || Department of Physics || CCAS || Finding structures and their associated functions in genome wide of profiles of chromatin architecture<br /> |-<br /> | Nan Wu || Electrical and Computer Engineering || SEAS || Directed Graph Representation Learning for Circuits, Boolean Networks, and Beyond<br /> |-<br /> | Seth Berger || Biochemistry and Molecular Medicine / Pediatrics || SMHS || Blindspots in Clinical Genetic Testing: Integration of Multiomics to Improve Diagnostic Yields<br /> |-<br /> | Ali Reza Taheriyoun || Biostatistics and Bioinformatics || SPH || Dynamics of Gut Microbiome and Metabolome of Moderate and Severe Obesity Patients Under Sleeve Gastrectomy<br /> |-<br /> | Yi-Wen Chen || Biochemistry and Molecular Medicine / Pediatrics || SMHS || From gene to treatment: omics approaches for understanding facioscapulohumeral muscular dystrophy<br /> |-<br /> | Shekhar Nagar || Biological Sciences || CCAS || 
Metabolic flexibility and dissemination of antibiotic resistomes from Actinobacteria in Hawaii hydrothermal steam vents<br /> |-<br /> |Dae Young Kim<br /> |Center for Translational Research<br /> |CNH<br /> |mhGPT: A Lightweight Domain-Specific Language Model for Mental Health Analysis<br /> |-<br /> |Anelia Horvath<br /> |Biochemistry and Molecular Medicine<br /> |SMHS<br /> |AI driven Functional SNV Discovery from long read Single-Cell RNA-Seq Data<br /> |-<br /> |Ljubica Caldovic<br /> |Center for Genetic Medicine Research<br /> |CNH<br /> |Active Learning of Data Science and Bioinformatics<br /> |-<br /> |Max Alekseyev<br /> |Mathematics / Biostatistics & Bioinformatics<br /> |CCAS/SPH<br /> |Bioinformatics Meets Quantum Informatics: from Genome Rearrangements to Weingarten Calculus<br /> |-<br /> |Aintzane Santaquiteria Gil<br /> |Department of Biological Sciences<br /> |CCAS<br /> |Using comparative genomics to link genes with convergently evolved traits. <br /> |-<br /> |Chen Zeng<br /> |Department of Physics<br /> |CCAS<br /> |Modeling RNA-protein Interactions with network guided machine learning<br /> |-<br /> |Mohammad Hammas Saeed<br /> |Electrical and Computer Engineering<br /> |SEAS<br /> |AI for Good: Leveraging Graph-Based Methods and Large Language Models to Address Real-World Challenges<br /> |-<br /> |Hiroki Morizono<br /> |Center for Genetic Medicine Research<br /> |CNH<br /> |Biomedical data resources at Children's National<br /> |-<br /> |Marc Garbey/Henry Kaminski<br /> |Neurology & Rehabilitation Medicine<br /> |MFA<br /> |Moving towards a digital twin for myasthenia gravis ''(tentative title)''<br /> |} <br /> <br /> == Acknowledgments ==<br /> <br /> Sponsors: Dept. of Biochemistry and Molecular Medicine (coffee, refreshments, lunch, poster prizes), IBS (poster boards), Milken Institute School of Public Health (happy hour, poster prizes). 
<br /> <br /> == Contact ==<br /> '''For questions about registration, abstract submission or general inquiries, please contact:''' <br /> <br /> Raja Mazumder: mazumder@gwu.edu<br /> <br /> == Poster Presentations ==<br /> {| class="wikitable"<br /> !Poster Number<br /> ! Name !! Presentation Title<br /> |-<br /> |1<br /> | Sunisha Harish || AI-Driven Drug Response Prediction in Cancer Using Long-Read Single-Cell RNA-Seq<br /> |-<br /> |2<br /> | Dae Young Kim || mhGPT: A Lightweight Domain-Specific Language Model for Mental Health Analysis<br /> |-<br /> |3<br /> | Vania Ballesteros Prieto || Uncovering the Contributions of Expressed Genetic Variants, Isoforms, and RNA Editing to Tumor Heterogeneity via Long-Read Single-Cell RNA-Seq Analysis<br /> |-<br /> |4<br /> | Sarah Tiufekchiev-Grieco || Promoting Resolution of Inflammation as a Potential Therapy for DMD<br /> |-<br /> |5<br /> | Karli Gilbert || Machine Learning Models Predict Treatment Outcome from Serum Proteins in Patients with Myasthenia Gravis that received Thymectomy<br /> |-<br /> |6<br /> | Reny Mathew || Identification of anti-helminthic drug resistance associated Quantitative trait loci (QTLs) in the canine hookworm, Ancylostoma caninum: A pooled-sequencing approach<br /> |-<br /> |7<br /> | Jo Lynne Rokita || Accelerating discovery and target identification for pediatric brain tumors through open-source platforms and tools<br /> |-<br /> |8<br /> | Henry Kaminski || Moving towards a digital twin for myasthenia gravis <br /> |-<br /> |9<br /> | Huai Chin Chiang || Single-Cell Transcriptomic and Phenotypic Profiling Reveals T Cell Dysfunction in BRCA1 Mutation Carriers<br /> |-<br /> |10<br /> | Lori Krammer || GW-FEAST: a federated ecosystem for data analysis and machine learning<br /> |-<br /> |11<br /> | Medha Kurukunda || Analyzing the Use of Artificial Intelligence to Enhance the Identification of Food Insecure Areas in Washington, D.C. 
<br /> |-<br /> |12<br /> | Christie Rose Woodside || Bridging Genomics and Preparedness: Regulatory-Grade Genomics and Quality Control Metrics and Analysis for Emerging and Circulating Avian Influenza in 2024-2025<br /> |-<br /> |13<br /> | Aiste Gulla, MD, PhD || Clinical Outcomes and Long-Term Survival of Pancreatic Cancers by Histological Sub-Type in the Epic Cosmos Database: Results from 2010-2025<br /> |-<br /> |14<br /> | Jane Ulianova || Comparison of alignment performance between the T2T-CHM13 and GRCh38/hg38 reference genome assemblies for RNAseq<br /> |-<br /> |15<br /> | Zhe Yu || Automated Tracking of Freezing Behavior in Paired House Mice Using DeepLabCut<br /> |-<br /> |16<br /> | Zhe Yu || Behavioral Bioinformatics for Temporal Analysis of Freezing Behavior in Dyad Mice<br /> |-<br /> |17<br /> | Karim Ismat || Generation of a single nuclei RNA sequencing atlas of dysferlin-deficient skeletal muscle <br /> |-<br /> |18<br /> | Kai Leung (Adam) Wong || An Experience of carrying out GPU-accelerated Genomic Analysis on Pegasus<br /> |-<br /> |19<br /> | Gabriel Batzli || Defining macrophage heterogeneity in murine skin wounds during inflammation <br /> |-<br /> |20<br /> | Hovhannes Arestakesyan || Recurrent Somatic scSNVs in Single-Cell RNA-Seq: Insights into Tumor Heterogeneity and RNA-Level Variants<br /> |-<br /> |21<br /> | Chloe Sachs || Secretome distinguishes spectrum of NF1 associated peripheral nerve sheath tumors<br /> |-<br /> |22<br /> | Nikhil Arethiya || A Time-Series Approach to Glucose-Based Participant Classification<br /> |-<br /> |23<br /> | Siera Martinez || Hetero-GNN Link Prediction of RNA Editing in Single Cells<br /> |-<br /> |24<br /> | Renxi Li || Thirty-day outcomes of infrainguinal bypass surgery with concurrent iliac artery stenting in patients with chronic limb-threatening ischemia <br /> |-<br /> |25<br /> | Matthew Mollerus || ResLens: Detecting Antibiotic Resistance Genes with Large Language Models<br /> |-<br /> 
|26<br /> | Parimala Nagaraj || Cybersecurity at the Intersection of Genomics and Data Science: Securing the Future of Bioinformatics<br /> |-<br /> |27<br /> | Shekhar Nagar || Metabolic flexibility and dissemination of antibiotic resistomes from Actinobacteria in Hawaii hydrothermal steam vents<br /> |-<br /> |28<br /> | Cristina Fenollar Ferrer || Functional impact of PIP2 on the Serotonin Transporter (SERT)<br /> |-<br /> |29<br /> | Ali Taheriyoun || Dynamics of gut microbiome and metabolome of obesity patients under sleeve gastrectomy<br /> |-<br /> |30<br /> | Irene Zohn || Next Generation sequencing approaches to understand developmental defects<br /> |-<br /> |31<br /> | Max Alekseyev || Bioinformatics meets Quantum Informatics: from genome rearrangements to Weingarten calculus<br /> |-<br /> |32<br /> | Lausanne Lee Oliver || Phylogenetic analysis of novel phages from Hawaiian fumaroles<br /> |-<br /> |33<br /> |Mahdi Baghbanzadeh<br /> |seqLens: optimizing language models for genomic predictions<br /> |-<br /> |34<br /> |Dezhao Fu<br /> |varLens - enhancers genetic testing using language models<br /> |-<br /> |35<br /> |Lilly Shaw<br /> |Uncovering Shared and Unique Biomarkers Across 23 Cancer Types Using The Cancer Genome Atlas (TCGA)<br /> |-<br /> |36<br /> |Daniall Masood<br /> |BiomarkerKB: A Comprehensive Biomarker Knowledgebase<br /> |-<br /> |37<br /> |Ljubica Caldovic<br /> |Active Learning of Data Science and Bioinformatics<br /> |-<br /> |38<br /> |Urnisha Bhuiyan<br /> |GlyGen: A Comprehensive Resource for Glycoscience Data Integration and Discovery<br /> |-<br /> |39<br /> |Surajit Bhattacharya<br /> |Redefining Human Airway Biology in Children from The Top Down: Unique Features of the Nasal Airway Epithelium.<br /> |-<br /> |40<br /> |Anelia Horvath<br /> |A Machine Learning Approach to Functional SNV Discovery via Isoform-Aware Single-Cell RNA-Seq<br /> |-<br /> |41<br /> |Christie Rose Woodside<br /> |Enhanced QC Metrics for 
Reference-Grade Genomic Data<br /> |-<br /> |42<br /> |Yi-Wen Chen<br /> |From gene to treatment: omics approaches for understanding facioscapulohumeral muscular dystrophy<br /> |-<br /> |43<br /> |Pia Sen<br /> |Investigating the role of bacteriophage diversity in Hawaiian steam vent microbial communities<br /> |-<br /> |44<br /> |Emily Williams*<br /> |TRIM28 regulates endogenous retroviral element expression in prostate cancer<br /> |-<br /> |45<br /> |Cadina Powell<br /> |Be Smart And Use Smartphones for Telemedicine: Narrative Review <br /> |-<br /> |46<br /> |Xinyang Zhang<br /> |Meta-analytic microbiome target discovery for immune checkpoint inhibitor response in advanced melanoma<br /> |-<br /> |47<br /> |Cyrus Chun Hong Au Yeung<br /> |Leveraging Large Language Models for Scalable Glycan-Disease Relation Extraction<br /> |-<br /> |48<br /> |Ashley Garrison<br /> |Gut Microbiome Composition as an Indicator of Preclinical Alzheimer's Disease<br /> |-<br /> |49<br /> |Chelcie Puetz<br /> |Combined Neuroinflammatory and Neurovascular Molecular Screening for Early Detection of Blood-Brain Barrier Dysfunction in Patients with Traumatic Brain Injury<br /> |}</div>

Vishal.bakshi https://hivelab.biochemistry.gwu.edu/wiki/index.php?title=Volunteership_2025&diff=787 Volunteership 2025 2025-04-24T17:33:12Z

<p>Vishal.bakshi: </p> <hr /> <div><h2>2025 Volunteer Program Details</h2><br /> <br /> <h3>Dates</h3><br /> <strong>Volunteer Kick-Off Meeting</strong><br><br /> May 26, 2025 | 3:30 to 4:30 PM<br /> <br /> <strong>Program Dates: June 2nd, 2025 – July 25th, 2025</strong> (8 weeks)<br><br /> Monday to Friday | Remote | No breaks<br /> <br /> <hr><br /> <br /> <h3>Volunteer Expectations</h3><br /> <ol><br /> <li>Daily progress updates via Slack (scrum).</li><br /> <li>Regular Zoom meetings with the assigned project point of contact.</li><li>Expected to dedicate 5–6 hours per day to project work, with the remaining time focused on skill development or reading. </li><br /> </ol><br /> <p style="color: red;"><strong>Important:</strong> If the scrum is not updated for 2 consecutive days, the candidate will be <u>automatically dropped</u> from the program.</p><br /> <hr><br /> <br /> <h3>Potential Projects</h3><br /> <ol><br /> <li>BiomarkerKB ([https://biomarkerkb.org biomarkerkb.org]) project: Biomarker curation project. Involves reading papers and collecting biomarkers.</li><br /> <li>GlyGen ([https://glygen.org glygen.org]) project: Review glycomics and glycoproteomics data and curate tissue, disease, and other related information. </li><li>ARGOS ([https://argosdb.org argosdb.org]) project: Analyze genomics data using HIVE to identify reference genome assemblies. </li><li>PredictMod ([https://hivelab.biochemistry.gwu.edu/predictmod hivelab.biochemistry.gwu.edu/predictmod]) project. Identifying datasets and harmonizing them so that they can be used to generate ML models. </li></ol>''Note: Individuals involved in the above projects with a background in programming and/or machine learning may also undertake additional tasks to support the development of ML models, which can be integrated into PredictMod or used to enhance AI/ML-ready datasets within GlyGen.''<hr><br /> <br /> <h4>1. 
BiomarkerKB Biocuration Project Ideas</h4>POC: Daniall Masood<br /> # Curate biomarkers for a specific disease (Alzheimer's disease)<br /> ## The student would do manual curation for about 4 weeks, with regular check-ins with the POC to ensure it is being done correctly.<br /> ## The next 4 weeks can be dedicated to developing an LLM or an automated process to extract biomarker details, using the data collected in the first 4 weeks as training/example data.<br /> # Top 50 biomarkers<br /> ## Curate the top 50 biomarkers for biomarkerkb.org.<br /> ## Define what constitutes a top 50 biomarker.<br /> ## Begin curating biomarkers from different sources and papers by collecting fields mentioned in the data model, as well as collecting cross-references.<br /> # Biocuration of biomarkers from NLP/LLM work<br /> ## Use the biomarkers collected from NLP work.<br /> ## Curate biomarkers. The data was not provided in the biomarker data model format.<br /> ## While curating the biomarkers, check if data collected from NLP is correct.<br /> ## After completion, the student can start using curated data to work on the NLP/LLM method.<br /> # Curate biomarkers for a treatment<br /> ## See #1 above.<br /> <br /> If you have any other ideas, diseases, treatments, or methods you want to focus on, please reach out to daniallmasood@gwu.edu to discuss your idea and check whether it is feasible as a project for the summer.<br /> <br /> ==== 2. GlyGen Biocuration Project Ideas ====<br /> POC: Rene Ranzinger and Urnisha Bhuiyan<br /> <br /> Over the last three decades, numerous glycomics database projects have been initiated to collect valuable information about glycans, proteins, and their interactions. Some of these databases have been discontinued due to the end of project funding. However, the data within these databases remains highly valuable to the community. 
Integrating these datasets into modern databases or knowledgebases, such as GlyGen, presents a challenge because much of the valuable metadata (e.g., species, tissue, disease, cell line) annotations are free-text terms that do not align with established standard dictionaries and ontologies used in modern resources. Automated matching of this information with dictionaries or ontologies is often not possible due to the use of synonyms, spelling errors, or abbreviations. For example, "human," "man," and "h. sapiens" all map to the scientific species name "Homo sapiens."<br /> <br /> The GlyGen project aims to make datasets from two older databases (CarbBank, CFG) accessible by migrating the data and metadata into our database. For this project, we are seeking curators with a medical or biology background who are interested in helping map metadata terms from these old databases to standard dictionaries and ontologies.<br /> <br /> The project involves:<br /> <br /> # Using internet resources (e.g., Google, Wikipedia) to identify terms used in the old database.<br /> # Mapping identified terms to corresponding dictionaries and ontologies using the webpages and search interfaces of these projects.<br /> # Finding papers based on titles and author lists that may contain spelling errors.<br /> # Interacting and discussing with other curators in case terms are mapped differently.<br /> <br /> If you have any other ideas or methods you would like to focus on, please reach out to rene@ccrc.uga.edu to discuss them.<br /> <br /> '''3. GlyGen Publication Analysis Project Ideas'''<br /> <br /> POC: Rene Ranzinger and Urnisha Bhuiyan<br /> <br /> One of the challenges for any bioinformatics project is understanding the size of its community, how well the project serves this community, and how widely its software/database is used. A potential solution is to analyze PubMed publication data. 
We are seeking applicants with programming skills (in Python or Java) to perform this analysis.<br /> <br /> The project involves:<br /> <br /> # Using the PubMed web API to filter publications based on keywords.<br /> # Analyzing paper abstracts to identify research institutions and groups that form the community.<br /> # Filtering the community list to exclude unrelated co-authors.<br /> <br /> A subproject will involve analyzing the full text of papers (when available) for keywords or resource and database names. The results of the analysis will be discussed with GlyGen project members, who will suggest changes and improvements to the analysis and data presentation. Source code developed as part of this project will be documented and shared in a public GitHub repository. If you have any other ideas or methods you would like to explore, please reach out to rene@ccrc.uga.edu to discuss them.<br /> <br /> ==== 4. PredictMod Machine Learning Project Ideas ====<br /> POC: Lori Krammer<br /> <br /> Data Identification & Harmonization: <br /> <br /> # Identify publicly available datasets from scientific literature that can be used for intervention outcome prediction models.<br /> # Conduct data harmonization and pre-processing following established project pipelines to produce an ML-ready dataset and data dictionary.<br /> <br /> Modeling & Integration (for those with experience in programming/ML)<br /> <br /> # Perform model training and document the ML pipeline in a BioCompute Object (BCO). <br /> # Integrate the model into the PredictMod platform.<br /> <br /> Individuals with a background or interest in machine learning should reach out to lorikrammer@gwu.edu with a potential dataset to determine if it is a feasible project for the summer.<br /> <br /> '''5. 
FDA-ARGOS Computation and Pathogen Curation Project'''<br /> <br /> POC: Christie Woodside<br /> <br /> # Update data tables for more efficient computations<br /> ## Student would review and input additional data and IDs in the tables/sheets used to perform computations. This is manual but important work that requires close attention to detail (~1 week of work).<br /> ## Requires Python/shell coding background. Student would run scripts that prepare and format data tables that are pushed to data.argosdb.org. Coding knowledge is needed in case of errors, bugs, or other mishaps in the code. Ongoing work as computations are performed.<br /> # Curate and report on current pathogens to upload to ARGOS<br /> ## Student would work on manual curation of circulating pathogens to be added to data.argosdb.org, with regular check-ins and reports of what was found (~4-10 weeks of work).<br /> ## Locate assembly IDs, reads, and metagenomic information for these pathogens to be used in computations and deposited into data.argosdb.org.<br /> ## Provide documentation on why they were curated, why they are important, how they were selected, and how data was collected.<br /> <br /> If you have any other ideas or methods you want to focus on, please reach out to christie.woodside@email.gwu.edu to discuss your idea and check whether it is feasible as a project for the summer.<hr><br /> <h3>Requirements for Completion</h3><br /> <p><strong>Note:</strong> The following are <u>mandatory</u>. 
Failure to complete any of them will result in an incomplete volunteer record.</p><br /> <br /> <h4>Documentation</h4><br /> <p>All volunteers must maintain adequate documentation of their work, including written protocols and scripts submitted to GitHub.</p><br /> <br /> <h4>Written Report</h4><br /> <p>Submit a 1–2 page summary of your tasks and accomplishments to the Admin Team during the final week of your program.</p><br /> <br /> <h4>Presentation & Slide Submission</h4><br /> <p>Present your work during the last week of the 8-week period.</p><br /> <p>Slides must be submitted to the Admin Team and should include:</p><br /> <ul><br /> <li>A title slide with your name, date, and mentor</li><br /> <li>At least 3 content slides</li><br /> <li>A final slide with acknowledgements or references</li><br /> </ul><br /> Contact the Admin Team to access previously submitted slides.<br /> <hr><br /> <br /> === Completion Certificate ===<br /> A certificate of completion and a letter of recommendation will be provided to all participants who successfully complete the program.<br /> <hr><br /> === Contact ===<br /> mazumder_lab@gwu.edu.<br /> <hr><br /> === Volunteers ===<br /> {| class="wikitable"<br /> |+<br /> |-<br /> ! Name<br /> !School<br /> ! 
Skills<br /> !Projects Interested<br /> |-<br /> | Grace Chong<br /> |M.S. Computer Science, Northeastern University<br /> | Python, Machine Learning, NLP, Analysis & Mathematics<br /> |<br /> # PredictMod<br /> # BiomarkerKB Biocuration<br /> # GlyGen Biocuration<br /> |-<br /> |Alma Ogunsina<br /> |M.S. Bioinformatics, Georgetown University<br /> |Molecular Biology, Python, ML, and Data Analysis<br /> |<br /> # BiomarkerKB<br /> # ARGOS<br /> # PredictMod<br /> |-<br /> |Diya Kamalabharathy<br /> |Poolesville High School - Global Ecology House<br /> |Computational Biology, Python Programming, Molecular Biology Techniques,<br /> Scientific Writing, Data Analysis<br /> |<br /> # BiomarkerKB Biocuration<br /> # PredictMod Machine Learning<br /> # GlyGen Biocuration<br /> |-<br /> |Harivinay P. Gujjula<br /> |B.S. in Biochemistry and Molecular Biology, Pennsylvania State University<br /> |Molecular Biology, Protein Analysis, Immunoassays, Spectroscopy Techniques, Genetic Engineering<br /> |<br /> # GlyGen Biocuration<br /> # BiomarkerKB Biocuration<br /> |-<br /> |Miao Wang<br /> |M.S. 
in Bioinformatics, Georgetown University<br /> |Python, R, Machine Learning, Bioinformatics Tools (e.g., DESeq2, KEGG, GO, Ensembl VEP), SQL<br /> |<br /> # BiomarkerKB Biocuration Project Ideas<br /> # FDA-ARGOS Computation and Pathogen Curation Project<br /> # PredictMod Machine Learning Project Ideas<br /> |-<br /> |Nahom Abel<br /> |Bachelor of Arts, University of Maryland Baltimore County<br /> |Finding datasets online, Working with Excel/CSV files, Research and Organizational Skills, Experience with Data Entry<br /> |<br /> # BiomarkerKB Biocuration<br /> # GlyGen Biocuration<br /> # PredictMod<br /> |-<br /> |Kajal Sanjaykumar Patel<br /> |M.S Computer Science, George Washington University<br /> |Python, Apache, SparkETL, Machine Learning, AWS<br /> |<br /> #PredictMod<br /> #BiomarkerKB<br /> #GlyGen<br /> |-<br /> |John McCaffrey <br /> |Chemistry major Class of 2028 – Honors Track, Boston College<br /> |Chemistry<br /> |<br /> # PredictMod<br /> # BiomarkerKB<br /> # GlyGen Biocuration <br /> |-<br /> |Nathan Ressom<br /> |The University of Virginia <br /> |Python, ML, AI, Web Development<br /> |<br /> # PredictMod <br /> # GlyGen Biocuration<br /> # BiomarkerKB Biocuration<br /> |}</div>

Vishal.bakshi https://hivelab.biochemistry.gwu.edu/wiki/index.php?title=2025_Bioinformatics_Symposium&diff=780 2025 Bioinformatics Symposium 2025-04-23T19:44:57Z

<p>Vishal.bakshi: Rectified: Spellings and text style</p> <hr /> <div>{{DISPLAYTITLE: 2025 Bioinformatics Symposium}}<br /> <br /> '''Title''': 2025 Inaugural GW Bioinformatics Symposium <br /> <br /> '''When''': April 29th 2025, 9am to 6pm<br /> <br /> '''Venue''': Talks: SEH B1220. Refreshments, lunch, and posters: Green wall area (SEH B1167)<br /> <br /> Join us for a full-day, all-hands GW Bioinformatics Symposium featuring posters, talks, and roundtable discussions. Open to GW students, staff, and faculty! <br /> <br /> '''NOTE: We have received nearly 130 registrations, while SEH B1220 has a seating capacity of 90. Be sure to arrive early to secure a spot. For those who cannot be accommodated in the main room, we’ve arranged an overflow space in GWCC 8040 Conference Room, where the sessions will be live-streamed.'''<br /> <br /> '''REGISTRATION IS CLOSED. Event Registration:''' Space is limited. Please register by '''<u>April 12th. 2025</u>''' for the event through this '''<big>[https://docs.google.com/forms/d/e/1FAIpQLSd_VfXIL_S59cVgOBxx_0b0E-wMBphWbBVuK6-JOSm9-cqiJA/viewform?usp=sharing form]</big>'''. If you encounter any issues, please email Raja Mazumder (mazumder@gwu.edu) your name and the lab you are representing and he will register you.<br /> <br /> '''ABSTRACT SUBMISSION IS CLOSED. Abstract submission:''' Please submit your abstract by '''<u>April 12th. 2025</u>''' through [https://cri-datacap.org/surveys/?s=8LFM3TKDWY338KDC '''<big>REDCap</big>''']<br /> <br /> '''OPT-OUT LUNCH PICKUP.''' We’re excited by the overwhelming response. Almost 130 participants from nearly all GW schools have signed up. Please note that seating in the main room is limited to the first 90 attendees. We encourage you to arrive early to secure a seat. For those who arrive later, we’re working to set up a spillover room with TV monitors so everyone can still follow the sessions. Lunch will be provided for all attendees who plan to stay for most of the day. 
If you’re attending only briefly and do not plan to pick up lunch, please let us know to help minimize food waste. You can either fill out this [https://docs.google.com/forms/d/e/1FAIpQLScWaw6JFsbtQgcdsk9XCZQMJ0mMb3Z0UR5uHZnTZ1DgBYJlDQ/viewform?usp=header form] or email mazumder@gwu.edu. We appreciate your understanding and cooperation.<br /> <br /> == Abstract/Overview ==<br /> <br /> The GW Bioinformatics Symposium, a full-day event on April 29, 2025, is designed to bring together faculty, staff, and student bioinformatics researchers, as well as researchers who use bioinformatics in their labs, from across GW to foster networking, collaboration, and knowledge exchange. The symposium will feature talks from GW labs that focus on bioinformatics and related research, poster presentations, roundtable discussions, and sessions on resources, funding, and career opportunities in bioinformatics. Topics will span bioinformatics, computational methods, IT/security, and training, highlighting the breadth of bioinformatics in various GW schools and centers. This event offers a unique opportunity for attendees to engage in meaningful discussions, explore potential collaborations, and stay informed about the latest advancements in the field. The symposium is a great way to connect with the GW bioinformatics community.<br /> <br /> == Poster ==<br /> Participants are invited to submit a brief poster abstract by March 31st at 11:59 PM (ET). We encourage submissions from bioinformatics labs and also other labs that do not primarily focus on bioinformatics but have research relevant to bioinformatics topics. A select few will be chosen for lightning talks. Due to the limited number of poster boards, priority will be given to ensure each lab/group has at least one designated board. If the number of submitted poster abstracts exceeds the available poster boards, additional posters may be printed as flyers with QR codes, enabling attendees to scan, view, or download them electronically. 
<br /> <br /> '''Size:''' Poster sizes can be up to 48 (width) x 36 (height) inches.<br /> <br /> '''Poster Abstract Submission Portal:''' [https://cri-datacap.org/surveys/?s=8LFM3TKDWY338KDC Click here].<br /> <br /> '''Poster Printing Instructions'''<br /> <br /> Download the poster template from GW Research Day Resources: [https://guides.himmelfarb.gwu.edu/ResearchDay/poster-design-layout Poster Design & Layout].<br /> <br /> After you create the PPT for your poster, request free poster printing from Gelman Library [https://library.gwu.edu/3-d-and-large-format-printing using this form].<br /> <br /> Submit your printing request by April 15th.<br /> <br /> == Schedule ([https://hivelab.biochemistry.gwu.edu/wiki/2025_Bioinformatics_Symposium#Talk_Titles Talk Titles]) ==<br /> <br /> {| class="wikitable"<br /> |+<br /> !Time<br /> !Duration <br /> !Topic<br /> !Presenter(s)<br /> |-<br /> | colspan="4" |'''Morning Session'''<br /> '''Topics: Registration, introduction, and health-related topics.'''<br /> |-<br /> |8:30 - 9:00 AM<br /> |30 min<br /> |<u>Registration & coffee</u><br /> Lead: Raechelle McCants, Jewel Dias<br /> <br /> * Registration<br /> * Coffee<br /> * Poster Setup<br /> * Slide/AV setup & check<br /> |<br /> |-<br /> |9 - 11:00 AM<br /> |120 min<br /> |<u>Welcome</u><br /> Rong Li (Chair, Dept. 
BMM, SMHS)<br /> <br /> Alison Hall (Senior Assoc Dean for Res, SMHS)<br /> <br /> Raja Mazumder<br /> <br /> Anelia Horvath<br /> <br /> <u>Talks (10 mins + questions)</u><br /> <br /> Session chairs: Anelia Horvath, Raja Mazumder<br /> |Anelia Horvath (Biochemistry)<br /> Ljubica Caldovic (Children’s)<br /> <br /> Dae Young Kim (Children’s; Muhammad Rahman lab)<br /> <br /> Seth Berger (Children’s)<br /> <br /> Raja Mazumder (Biochemistry)<br /> <br /> Yi-Wen Chen (Children’s)<br /> <br /> Aintzane Santaquiteria Gil (Biology, Orti Lab)<br /> <br /> <nowiki>*</nowiki>Marc Garbey (Neurology)<br /> <br /> <u>Two 5 mins Poster Flash Talks</u><br /> <br /> Jane Ulianova<br /> <br /> Aiste Gulla<br /> |-<br /> |11 - 11:15 AM<br /> |15 min<br /> |Refreshment Break. <br /> |<br /> |-<br /> |11:15 - 12:30 PM<br /> |75 min<br /> |Talks (10 mins + questions)<br /> Session chairs: Ali Rahnavard, Ljubica Caldovic<br /> |Jo Lynne Rokita (Children’s)<br /> Max Alekseyev (Milken)<br /> <br /> Erika Hubbard (Crandall lab; Milken)<br /> <br /> Ali R. Taheriyoun (Rahnavard Lab; Milken)<br /> <br /> Hiroki Morizono (Children’s)<br /> <br /> <u>One 5 mins Poster Flash Talk</u><br /> <br /> Vania Ballesteros Prieto<br /> |-<br /> |12:30 - 2 PM<br /> |90 min<br /> |Lunch and poster session<br /> Lead: Anelia Horvath, Raechelle McCants, Jewel Dias<br /> |'''Poster Judging Committee:'''<br /> Anelia Horvath <br /> <br /> Ali Rahnavard <br /> <br /> Hiroki Morizono <br /> <br /> Yi-Wen Chen <br /> <br /> Jimmy Saw<br /> |-<br /> | colspan="4" |'''Afternoon Session'''<br /> '''Topics: Breadth of bioinformatics in biological research; IT/security; Training'''<br /> |-<br /> |2:00 - 3:30 PM<br /> |90 min<br /> |Talks (10 mins + questions)<br /> Session Chairs: Howie Huang, Jimmy Saw, Chen Zeng<br /> <br /> 3:10 PM remarks: Evangeline J. 
Downie (Associate Dean for Research (CCAS))<br /> |Mohammad Hammas Saeed (Howie Huang Lab, Engineering)<br /> Nan Wu (ECE, Engineering)<br /> <br /> Ayah Zirikly (Computer Science, GW/JHU)<br /> <br /> Chen Zeng (Physics)<br /> <br /> Weiqun Peng (Physics)<br /> <br /> Shekhar Nagar (Jimmy Saw Lab, Biology)<br /> <br /> <u>Two 5 mins Poster Flash Talks</u><br /> <br /> Parimala Nagaraj<br /> <br /> Kai Leung (Adam) Wong<br /> <br /> |-<br /> |3:30 - 4:30 PM<br /> |60 min<br /> |Session Chair: Jonathon Keeney.<br /> Co-chairs: Hiroki Morizono, Anelia Horvath<br /> <br /> * 5 min. Introduction: IT, omics support, and related topics<br /> <br /> * Questions from audience<br /> ** Round table discussion<br /> ** Careers in bioinformatics<br /> ** Funding opportunities<br /> |Clark Gaylord (Director, Research Technology Services)<br /> Brian Choi (MFA)<br /> <br /> Anelia Horvath (MGPC core/Bioinformatics support)<br /> <br /> Jack Villani (GW Genomics Core)<br /> <br /> Ali Rahnavard (CBI Analytics)<br /> <br /> Hiroki Morizono (Children's)<br /> |-<br /> |4:30 - 6:00 PM<br /> |<br /> |Networking event, poster prizes, and refreshments<br /> |Anelia Horvath<br /> Jonathon Keeney<br /> <br /> Raja Mazumder<br /> <br /> Keith Crandall<br /> |}<br /> <nowiki>*</nowiki>Talk titles TBD<br /> <br /> == Presentation/Discussion Sessions ==<br /> There will be a Q&A session and a networking event at the end of the symposium.<br /> <br /> == Scientific Organizing Committee ==<br /> <br /> Raja Mazumder (Symposium Chair), Anelia Horvath, Hiroki Morizono, Ljubica Caldovic, Keith Crandall, Jorge Sepulveda, Howie Huang, Chen Zeng, Jimmy Saw, Clark Gaylord.<br /> <br /> == Logistics Organizing Committee ==<br /> <br /> Raja Mazumder, Anelia Horvath, Raechelle McCants, Jewel Das. 
Student volunteers: Jane, Sofia, Allison, Chloe, Trupri and Lincoln.<br /> <br /> == Talk Titles ==<br /> {| class="wikitable"<br /> |+<br /> !Name<br /> !Department<br /> !School<br /> !Title<br /> |-<br /> |Jo Lynne Rokita<br /> |Pediatrics<br /> |CNH<br /> |Accelerating discovery and target identification for pediatric brain tumors through open-source platforms and tools<br /> |-<br /> |Erika Hubbard<br /> |Bioinformatics and Biostatistics<br /> |SPH<br /> |Machine Learning to Determine Endotypes of Lupus<br /> |-<br /> |Raja Mazumder<br /> |Biochemistry and Molecular Medicine<br /> |SMHS<br /> |Integrating Biomedical Knowledgebases and Clinical Data for ML/AI-Powered Insights<br /> |-<br /> |Jack Villani<br /> |GW Genomics Core<br /> |SPH<br /> |GW Genomics Core: An Introduction & Overview (panel discussion)<br /> |-<br /> | Ayah Zirikly || Computer Science || SEAS/Johns Hopkins University || Developments in NLP and AI for Mental Health: Insights from the Last Decade and Future Directions – A Focus on the CLPsych Workshop<br /> |-<br /> | Weiqun Peng || Department of Physics || CCAS || Finding structures and their associated functions in genome wide of profiles of chromatin architecture<br /> |-<br /> | Nan Wu || Electrical and Computer Engineering || SEAS || Directed Graph Representation Learning for Circuits, Boolean Networks, and Beyond<br /> |-<br /> | Seth Berger || Biochemistry and Molecular Medicine / Pediatrics || SMHS || Blindspots in Clinical Genetic Testing: Integration of Multiomics to Improve Diagnostic Yields<br /> |-<br /> | Ali Reza Taheriyoun || Biostatistics and Bioinformatics || SPH || Dynamics of Gut Microbiome and Metabolome of Moderate and Severe Obesity Patients Under Sleeve Gastrectomy<br /> |-<br /> | Yi-Wen Chen || Biochemistry and Molecular Medicine / Pediatrics || SMHS || From gene to treatment: omics approaches for understanding facioscapulohumeral muscular dystrophy<br /> |-<br /> | Shekhar Nagar || Biological Sciences || CCAS || 
Metabolic flexibility and dissemination of antibiotic resistomes from Actinobacteria in Hawaii hydrothermal steam vents<br /> |-<br /> |Dae Young Kim<br /> |Center for Translational Research<br /> |CNH<br /> |mhGPT: A Lightweight Domain-Specific Language Model for Mental Health Analysis<br /> |-<br /> |Anelia Horvath<br /> |Biochemistry and Molecular Medicine<br /> |SMHS<br /> |AI driven Functional SNV Discovery from long read Single-Cell RNA-Seq Data<br /> |-<br /> |Ljubica Caldovic<br /> |Center for Genetic Medicine Research<br /> |CNH<br /> |Active Learning of Data Science and Bioinformatics<br /> |-<br /> |Max Alekseyev<br /> |Mathematics / Biostatistics & Bioinformatics<br /> |CCAS/SPH<br /> |Bioinformatics Meets Quantum Informatics: from Genome Rearrangements to Weingarten Calculus<br /> |-<br /> |Aintzane Santaquiteria Gil<br /> |Department of Biological Sciences<br /> |CCAS<br /> |Using comparative genomics to link genes with convergently evolved traits. <br /> |-<br /> |Chen Zeng<br /> |Department of Physics<br /> |CCAS<br /> |Modeling RNA-protein Interactions with network guided machine learning<br /> |-<br /> |Mohammad Hammas Saeed<br /> |Electrical and Computer Engineering<br /> |SEAS<br /> |AI for Good: Leveraging Graph-Based Methods and Large Language Models to Address Real-World Challenges<br /> |-<br /> |Hiroki Morizono<br /> |Center for Genetic Medicine Research<br /> |CNH<br /> |Biomedical data resources at Children's National<br /> |-<br /> |Marc Garbey/Henry Kaminski<br /> |Neurology & Rehabilitation Medicine<br /> |MFA<br /> |Moving towards a digital twin for myasthenia gravis ''(tentative title)''<br /> |} <br /> <br /> == Acknowledgments ==<br /> <br /> Sponsors: Dept. of Biochemistry and Molecular Medicine (coffee, refreshments, lunch, poster prizes), IBS (poster boards), Milken Institute School of Public Health (happy hour, poster prizes). 
<br /> <br /> == Contact ==<br /> '''For questions about registration, abstract submission or general inquiries, please contact:''' <br /> <br /> Raja Mazumder: mazumder@gwu.edu<br /> <br /> == Poster Presentations ==<br /> {| class="wikitable"<br /> !Poster Number<br /> ! Name !! Presentation Title<br /> |-<br /> |1<br /> | Sunisha Harish || AI-Driven Drug Response Prediction in Cancer Using Long-Read Single-Cell RNA-Seq<br /> |-<br /> |2<br /> | Dae Young Kim || mhGPT: A Lightweight Domain-Specific Language Model for Mental Health Analysis<br /> |-<br /> |3<br /> | Vania Ballesteros Prieto || Uncovering the Contributions of Expressed Genetic Variants, Isoforms, and RNA Editing to Tumor Heterogeneity via Long-Read Single-Cell RNA-Seq Analysis<br /> |-<br /> |4<br /> | Sarah Tiufekchiev-Grieco || Promoting Resolution of Inflammation as a Potential Therapy for DMD<br /> |-<br /> |5<br /> | Karli Gilbert || Machine Learning Models Predict Treatment Outcome from Serum Proteins in Patients with Myasthenia Gravis that received Thymectomy<br /> |-<br /> |6<br /> | Reny Mathew || Identification of anti-helminthic drug resistance associated Quantitative trait loci (QTLs) in the canine hookworm, Ancylostoma caninum: A pooled-sequencing approach<br /> |-<br /> |7<br /> | Jo Lynne Rokita || Accelerating discovery and target identification for pediatric brain tumors through open-source platforms and tools<br /> |-<br /> |8<br /> | Henry Kaminski || Moving towards a digital twin for myasthenia gravis <br /> |-<br /> |9<br /> | Huai Chin Chiang || Single-Cell Transcriptomic and Phenotypic Profiling Reveals T Cell Dysfunction in BRCA1 Mutation Carriers<br /> |-<br /> |10<br /> | Lori Krammer || GW-FEAST: a federated ecosystem for data analysis and machine learning<br /> |-<br /> |11<br /> | Medha Kurukunda || Analyzing the Use of Artificial Intelligence to Enhance the Identification of Food Insecure Areas in Washington, D.C. 
<br /> |-<br /> |12<br /> | Christie Rose Woodside || Bridging Genomics and Preparedness: Regulatory-Grade Genomics and Quality Control Metrics and Analysis for Emerging and Circulating Avian Influenza in 2024-2025<br /> |-<br /> |13<br /> | Aiste Gulla, MD, PhD || Clinical Outcomes and Long-Term Survival of Pancreatic Cancers by Histological Sub-Type in the Epic Cosmos Database: Results from 2010-2025<br /> |-<br /> |14<br /> | Jane Ulianova || Comparison of alignment performance between the T2T-CHM13 and GRCh38/hg38 reference genome assemblies for RNAseq<br /> |-<br /> |15<br /> | Zhe Yu || Automated Tracking of Freezing Behavior in Paired House Mice Using DeepLabCut<br /> |-<br /> |16<br /> | Zhe Yu || Behavioral Bioinformatics for Temporal Analysis of Freezing Behavior in Dyad Mice<br /> |-<br /> |17<br /> | Karim Ismat || Generation of a single nuclei RNA sequencing atlas of dysferlin-deficient skeletal muscle <br /> |-<br /> |18<br /> | Kai Leung (Adam) Wong || An Experience of carrying out GPU-accelerated Genomic Analysis on Pegasus<br /> |-<br /> |19<br /> | Gabriel Batzli || Defining macrophage heterogeneity in murine skin wounds during inflammation <br /> |-<br /> |20<br /> | Hovhannes Arestakesyan || Recurrent Somatic scSNVs in Single-Cell RNA-Seq: Insights into Tumor Heterogeneity and RNA-Level Variants<br /> |-<br /> |21<br /> | Chloe Sachs || Secretome distinguishes spectrum of NF1 associated peripheral nerve sheath tumors<br /> |-<br /> |22<br /> | Nikhil Arethiya || A Time-Series Approach to Glucose-Based Participant Classification<br /> |-<br /> |23<br /> | Siera Martinez || Hetero-GNN Link Prediction of RNA Editing in Single Cells<br /> |-<br /> |24<br /> | Renxi Li || Thirty-day outcomes of infrainguinal bypass surgery with concurrent iliac artery stenting in patients with chronic limb-threatening ischemia <br /> |-<br /> |25<br /> | Matthew Mollerus || ResLens: Detecting Antibiotic Resistance Genes with Large Language Models<br /> |-<br /> 
|26<br /> | Parimala Nagaraj || Cybersecurity at the Intersection of Genomics and Data Science: Securing the Future of Bioinformatics<br /> |-<br /> |27<br /> | Shekhar Nagar || Metabolic flexibility and dissemination of antibiotic resistomes from Actinobacteria in Hawaii hydrothermal steam vents<br /> |-<br /> |28<br /> | Cristina Fenollar Ferrer || Functional impact of PIP2 on the Serotonin Transporter (SERT)<br /> |-<br /> |29<br /> | Ali Taheriyoun || Dynamics of gut microbiome and metabolome of obesity patients under sleeve gastrectomy<br /> |-<br /> |30<br /> | Irene Zohn || Next Generation sequencing approaches to understand developmental defects<br /> |-<br /> |31<br /> | Max Alekseyev || Bioinformatics meets Quantum Informatics: from genome rearrangements to Weingarten calculus<br /> |-<br /> |32<br /> | Lausanne Lee Oliver || Phylogenetic analysis of novel phages from Hawaiian fumaroles<br /> |-<br /> |33<br /> |Mahdi Baghbanzadeh<br /> |seqLens: optimizing language models for genomic predictions<br /> |-<br /> |34<br /> |Dezhao Fu<br /> |varLens - enhancers genetic testing using language models<br /> |-<br /> |35<br /> |Lilly Shaw<br /> |Uncovering Shared and Unique Biomarkers Across 23 Cancer Types Using The Cancer Genome Atlas (TCGA)<br /> |-<br /> |36<br /> |Daniall Masood<br /> |BiomarkerKB: A Comprehensive Biomarker Knowledgebase<br /> |-<br /> |37<br /> |Ljubica Caldovic<br /> |Active Learning of Data Science and Bioinformatics<br /> |-<br /> |38<br /> |Urnisha Bhuiyan<br /> |GlyGen: A Comprehensive Resource for Glycoscience Data Integration and Discovery<br /> |-<br /> |39<br /> |Surajit Bhattacharya<br /> |Redefining Human Airway Biology in Children from The Top Down: Unique Features of the Nasal Airway Epithelium.<br /> |-<br /> |40<br /> |Anelia Horvath<br /> |A Machine Learning Approach to Functional SNV Discovery via Isoform-Aware Single-Cell RNA-Seq<br /> |-<br /> |41<br /> |Christie Rose Woodside<br /> |Enhanced QC Metrics for 
Reference-Grade Genomic Data<br /> |-<br /> |42<br /> |Yi-Wen Chen<br /> |From gene to treatment: omics approaches for understanding facioscapulohumeral muscular dystrophy<br /> |-<br /> |43<br /> |Pia Sen<br /> |Investigating the role of bacteriophage diversity in Hawaiian steam vent microbial communities<br /> |-<br /> |44<br /> |Emily Williams*<br /> |TRIM28 regulates endogenous retroviral element expression in prostate cancer<br /> |-<br /> |45<br /> |Alexander Thiersch<br /> |Patient-centric approaches for antipsychotic medication research: an application of the Desirability of Outcome Ranking (DOOR) and Global Benefit-Risk (GBR) Score <br /> |-<br /> |46<br /> |Cadina Powell<br /> |Be Smart And Use Smartphones for Telemedicine: Narrative Review <br /> |-<br /> |47<br /> |Xinyang Zhang<br /> |Meta-analytic microbiome target discovery for immune checkpoint inhibitor response in advanced melanoma<br /> |-<br /> |48<br /> |Cyrus Chun Hong Au Yeung<br /> |Leveraging Large Language Models for Scalable Glycan-Disease Relation Extraction<br /> |-<br /> |49<br /> |Ashley Garrison<br /> |Gut Microbiome Composition as an Indicator of Preclinical Alzheimer's Disease<br /> |-<br /> |50<br /> |Chelcie Puetz<br /> |Combined Neuroinflammatory and Neurovascular Molecular Screening for Early Detection of Blood-Brain Barrier Dysfunction in Patients with Traumatic Brain Injury<br /> |}</div>

Vishal.bakshi https://hivelab.biochemistry.gwu.edu/wiki/index.php?title=Volunteership_2025&diff=774 Volunteership 2025 2025-04-23T15:47:59Z

<p>Vishal.bakshi: Added candidates study track and school information</p> <hr /> <div><h2>2025 Volunteer Program Details</h2><br /> <br /> <h3>Dates</h3><br /> <strong>Volunteer Kick-Off Meeting</strong><br><br /> May 26, 2025 | 3:30 to 4:30 PM<br /> <br /> <strong>Program Dates: June 2nd, 2025 – July 25th, 2025</strong> (8 weeks)<br><br /> Monday to Friday | Remote | No breaks<br /> <br /> <hr><br /> <br /> <h3>Volunteer Expectations</h3><br /> <ol><br /> <li>Daily progress updates via Slack (scrum).</li><br /> <li>Regular Zoom meetings with the assigned project point of contact.</li><li>Expected to dedicate 5–6 hours per day to project work, with the remaining time focused on skill development or reading. </li><br /> </ol><br /> <p style="color: red;"><strong>Important:</strong> If the scrum is not updated for 2 consecutive days, the candidate will be <u>automatically dropped</u> from the program.</p><br /> <hr><br /> <br /> <h3>Potential Projects</h3><br /> <ol><br /> <li>BiomarkerKB ([https://biomarkerkb.org biomarkerkb.org]) project: Biomarker curation project. Involves reading papers and collecting biomarkers.</li><br /> <li>GlyGen ([https://glygen.org glygen.org]) project: Review glycomics and glycoproteomics data and curate tissue, disease, and other related information. </li><li>ARGOS ([https://argosdb.org argosdb.org]) project: Analyze genomics data using HIVE to identify reference genome assemblies. </li><li>PredictMod ([https://hivelab.biochemistry.gwu.edu/predictmod hivelab.biochemistry.gwu.edu/predictmod]) project. Identifying datasets and harmonizing them so that they can be used to generate ML models. </li></ol>''Note: Individuals involved in the above projects with a background in programming and/or machine learning may also undertake additional tasks to support the development of ML models, which can be integrated into PredictMod or used to enhance AI/ML-ready datasets within GlyGen.''<hr><br /> <br /> <h4>1. 
BiomarkerKB Biocuration Project Ideas</h4>POC: Daniall Masood<br /> # Curate biomarkers for a specific disease (Alzheimer's)<br /> ## The student would do manual curation for about 4 weeks, with regular check-ins with the POC to ensure it is being done correctly.<br /> ## The next 4 weeks can be dedicated to developing an LLM-based or other automated process to extract biomarker details, using the data collected in the first 4 weeks as training/example data.<br /> # Top 50 biomarkers<br /> ## Curate the top 50 biomarkers for biomarkerkb.org.<br /> ## Define what constitutes a top 50 biomarker.<br /> ## Begin curating biomarkers from different sources and papers by collecting the fields mentioned in the data model, as well as cross-references.<br /> # Biocuration of biomarkers from NLP/LLM work<br /> ## Use the biomarkers collected from NLP work.<br /> ## Curate biomarkers. The provided data is not in the biomarker data model format.<br /> ## While curating the biomarkers, check whether the data collected from NLP is correct.<br /> ## After completion, the student can start using the curated data to work on the NLP/LLM method.<br /> # Curate biomarkers for a treatment<br /> ## See #1 above.<br /> <br /> If you have any other ideas, diseases, treatments, or methods you want to focus on, please reach out to daniallmasood@gwu.edu to discuss them and check whether they are feasible as a project for the summer.<br /> <br /> ==== 2. GlyGen Biocuration Project Ideas ====<br /> POC: Rene Ranzinger and Urnisha Bhuiyan<br /> <br /> Over the last three decades, numerous glycomics database projects have been initiated to collect valuable information about glycans, proteins, and their interactions. Some of these databases have been discontinued due to the end of project funding. However, the data within these databases remains highly valuable to the community. 
Integrating these datasets into modern databases or knowledgebases, such as GlyGen, presents a challenge because many of the valuable metadata annotations (e.g., species, tissue, disease, cell line) are free-text terms that do not align with the established standard dictionaries and ontologies used in modern resources. Automated matching of this information with dictionaries or ontologies is often not possible due to the use of synonyms, spelling errors, or abbreviations. For example, "human," "man," and "h. sapiens" all map to the scientific species name "Homo sapiens."<br /> <br /> The GlyGen project aims to make datasets from two older databases (CarbBank, CFG) accessible by migrating the data and metadata into our database. For this project, we are seeking curators with a medical or biology background who are interested in helping map metadata terms from these old databases to standard dictionaries and ontologies.<br /> <br /> The project involves:<br /> <br /> # Using internet resources (e.g., Google, Wikipedia) to identify terms used in the old databases.<br /> # Mapping identified terms to corresponding dictionaries and ontologies using the webpages and search interfaces of these projects.<br /> # Finding papers based on titles and author lists that may contain spelling errors.<br /> # Discussing with other curators when terms are mapped differently.<br /> <br /> If you have any other ideas or methods you would like to focus on, please reach out to rene@ccrc.uga.edu to discuss them.<br /> <br /> '''3. GlyGen Publication Analysis Project Ideas'''<br /> <br /> POC: Rene Ranzinger and Urnisha Bhuiyan<br /> <br /> One of the challenges for any bioinformatics project is understanding the size of its community, how well the project serves this community, and how widely its software/database is used. A potential solution is to analyze PubMed publication data. 
We are seeking applicants with programming skills (in Python or Java) to perform this analysis.<br /> <br /> The project involves:<br /> <br /> # Using the PubMed web API to filter publications based on keywords.<br /> # Analyzing paper abstracts to identify research institutions and groups that form the community.<br /> # Filtering the community list to exclude unrelated co-authors.<br /> <br /> A subproject will involve analyzing the full text of papers (when available) for keywords or resource and database names. The results of the analysis will be discussed with a GlyGen project member, who will suggest changes and improvements to the analysis and data presentation. Source code developed as part of this project will be documented and shared in a public GitHub repository. If you have any other ideas or methods you would like to explore, please reach out to rene@ccrc.uga.edu to discuss them.<br /> <br /> ==== 4. PredictMod Machine Learning Project Ideas ====<br /> POC: Lori Krammer<br /> <br /> Data Identification & Harmonization: <br /> <br /> # Identify publicly available datasets from scientific literature that can be used for intervention outcome prediction models.<br /> # Conduct data harmonization and pre-processing following established project pipelines to produce an ML-ready dataset and data dictionary.<br /> <br /> Modeling & Integration (for those with experience in programming/ML)<br /> <br /> # Perform model training and document the ML pipeline in a BioCompute Object (BCO). <br /> # Integrate the model into the PredictMod platform.<br /> <br /> Individuals with a background or interest in machine learning should reach out to lorikrammer@gwu.edu with a potential dataset to determine if it is a feasible project for the summer.<br /> <br /> '''5. 
FDA-ARGOS Computation and Pathogen Curation Project'''<br /> <br /> POC: Christie Woodside<br /> <br /> # Update data tables for more efficient computations<br /> ## The student would review and enter additional data and IDs in the tables/sheets used to perform computations. This is manual but important work that requires close attention to detail. Estimated effort: ~1 week.<br /> ## Requires a Python/shell coding background. The student would run scripts that prepare and format data tables that are pushed to data.argosdb.org. Coding knowledge is needed in case of errors, bugs, or other mishaps in the code. This is ongoing work as computations are performed.<br /> # Curate and report on current pathogens to upload to ARGOS<br /> ## The student would work on manual curation of circulating pathogens to be added to data.argosdb.org, with regular check-ins and reports on what was found. Estimated effort: ~4-10 weeks.<br /> ## Locate assembly IDs, reads, and metagenomic information for these pathogens to be used in computations and deposited into data.argosdb.org.<br /> ## Provide documentation on why they were curated, why they are important, how they were selected, and how data was collected.<br /> <br /> If you have any other ideas or methods you want to focus on, please reach out to christie.woodside@email.gwu.edu to discuss them and check whether they are feasible as a project for the summer.<hr><br /> <h3>Requirements for Completion</h3><br /> <p><strong>Note:</strong> The following are <u>mandatory</u>. 
Failure to complete any of them will result in an incomplete volunteer record.</p><br /> <br /> <h4>Documentation</h4><br /> <p>All volunteers must maintain adequate documentation of their work, including written protocols and scripts submitted to GitHub.</p><br /> <br /> <h4>Written Report</h4><br /> <p>Submit a 1–2 page summary of your tasks and accomplishments to the Admin Team during the final week of your program.</p><br /> <br /> <h4>Presentation & Slide Submission</h4><br /> <p>Present your work during the last week of the 8-week period.</p><br /> <p>Slides must be submitted to the Admin Team and should include:</p><br /> <ul><br /> <li>A title slide with your name, date, and mentor</li><br /> <li>At least 3 content slides</li><br /> <li>A final slide with acknowledgements or references</li><br /> </ul><br /> Contact the Admin Team to access previously submitted slides.<br /> <hr><br /> <br /> === Completion Certificate ===<br /> A certificate of completion and a letter of recommendation will be provided to all participants who successfully complete the program.<br /> <hr><br /> === Contact ===<br /> mazumder_lab@gwu.edu.<br /> <hr><br /> === Volunteers ===<br /> {| class="wikitable"<br /> |+<br /> |-<br /> ! Name<br /> !School<br /> ! 
Skills<br /> !Projects of Interest<br /> |-<br /> | Grace Chong<br /> |M.S. Computer Science, Northeastern University<br /> | Python, Machine Learning, NLP, Analysis & Mathematics<br /> |<br /> # PredictMod<br /> # BiomarkerKB Biocuration<br /> # GlyGen Biocuration<br /> |-<br /> |Alma Ogunsina<br /> |M.S. Bioinformatics, Georgetown University<br /> |Molecular Biology, Python, ML, and Data Analysis<br /> |<br /> # BiomarkerKB<br /> # ARGOS<br /> # PredictMod<br /> |-<br /> |Diya Kamalabharathy<br /> |Poolesville High School - Global Ecology House<br /> |Computational Biology, Python Programming, Molecular Biology Techniques,<br /> Scientific Writing, Data Analysis<br /> |<br /> # BiomarkerKB Biocuration<br /> # PredictMod Machine Learning<br /> # GlyGen Biocuration<br /> |-<br /> |Harivinay P. Gujjula<br /> |B.S. in Biochemistry and Molecular Biology, Pennsylvania State University<br /> |Molecular Biology, Protein Analysis, Immunoassays, Spectroscopy Techniques, Genetic Engineering<br /> |<br /> # GlyGen Biocuration<br /> # BiomarkerKB Biocuration<br /> |-<br /> |Miao Wang<br /> |M.S. 
in Bioinformatics, Georgetown University<br /> |Python, R, Machine Learning, Bioinformatics Tools (e.g., DESeq2, KEGG, GO, Ensembl VEP), SQL<br /> |<br /> # BiomarkerKB Biocuration Project Ideas<br /> # FDA-ARGOS Computation and Pathogen Curation Project<br /> # PredictMod Machine Learning Project Ideas<br /> |-<br /> |Nahom Abel<br /> |Bachelor of Arts, University of Maryland Baltimore County<br /> |Finding datasets online, Working with Excel/CSV files, Research and Organizational Skills, Experience with Data Entry<br /> |<br /> # BiomarkerKB Biocuration<br /> # GlyGen Biocuration<br /> # PredictMod<br /> |-<br /> |Kajal Sanjaykumar Patel<br /> |M.S. Computer Science, George Washington University<br /> |Python, Apache Spark, ETL, Machine Learning, AWS<br /> |<br /> # PredictMod<br /> # BiomarkerKB<br /> # GlyGen<br /> |-<br /> |John McCaffrey <br /> |Chemistry major Class of 2028 – Honors Track, Boston College<br /> |Chemistry<br /> |<br /> # PredictMod<br /> # BiomarkerKB<br /> # GlyGen Biocuration <br /> |}</div>

Vishal.bakshi https://hivelab.biochemistry.gwu.edu/wiki/index.php?title=Volunteership_2025&diff=741 Volunteership 2025 2025-04-16T17:37:28Z

<p>Vishal.bakshi: /* Volunteers */</p> <hr /> <div><h2>2025 Volunteer Program Details</h2><br /> <br /> <h3>Dates</h3><br /> <strong>Volunteer Kick-Off Meeting</strong><br><br /> May 26, 2025 | 3:30 to 4:30 PM<br /> <br /> <strong>Program Dates: June 2nd, 2025 – July 25th, 2025</strong> (8 weeks)<br><br /> Monday to Friday | Remote | No breaks<br /> <br /> <hr><br /> <br /> <h3>Volunteer Expectations</h3><br /> <ol><br /> <li>Daily progress updates via Slack (scrum).</li><br /> <li>Regular Zoom meetings with the assigned project point of contact.</li><li>Expected to dedicate 5–6 hours per day to project work, with the remaining time focused on skill development or reading. </li><br /> </ol><br /> <p style="color: red;"><strong>Important:</strong> If the scrum is not updated for 2 consecutive days, the candidate will be <u>automatically dropped</u> from the program.</p><br /> <hr><br /> <br /> <h3>Potential Projects</h3><br /> <ol><br /> <li>BiomarkerKB ([https://biomarkerkb.org biomarkerkb.org]) project: Biomarker curation project. Involves reading papers and collecting biomarkers.</li><br /> <li>GlyGen ([https://glygen.org glygen.org]) project: Review glycomics and glycoproteomics data and curate tissue, disease, and other related information. </li><li>ARGOS ([https://argosdb.org argosdb.org]) project: Analyze genomics data using HIVE to identify reference genome assemblies. </li><li>PredictMod ([https://hivelab.biochemistry.gwu.edu/predictmod hivelab.biochemistry.gwu.edu/predictmod]) project. Identifying datasets and harmonizing them so that they can be used to generate ML models. </li></ol>''Note: Individuals involved in the above projects with a background in programming and/or machine learning may also undertake additional tasks to support the development of ML models, which can be integrated into PredictMod or used to enhance AI/ML-ready datasets within GlyGen.''<hr><br /> <br /> <h4>1. 
BiomarkerKB Biocuration Project Ideas</h4>POC: Daniall Masood<br /> # Curate biomarkers for a specific disease (Alzheimer's)<br /> ## The student would do manual curation for about 4 weeks, with regular check-ins with the POC to ensure it is being done correctly.<br /> ## The next 4 weeks can be dedicated to developing an LLM-based or other automated process to extract biomarker details, using the data collected in the first 4 weeks as training/example data.<br /> # Top 50 biomarkers<br /> ## Curate the top 50 biomarkers for biomarkerkb.org.<br /> ## Define what constitutes a top 50 biomarker.<br /> ## Begin curating biomarkers from different sources and papers by collecting the fields mentioned in the data model, as well as cross-references.<br /> # Biocuration of biomarkers from NLP/LLM work<br /> ## Use the biomarkers collected from NLP work.<br /> ## Curate biomarkers. The provided data is not in the biomarker data model format.<br /> ## While curating the biomarkers, check whether the data collected from NLP is correct.<br /> ## After completion, the student can start using the curated data to work on the NLP/LLM method.<br /> # Curate biomarkers for a treatment<br /> ## See #1 above.<br /> <br /> If you have any other ideas, diseases, treatments, or methods you want to focus on, please reach out to daniallmasood@gwu.edu to discuss them and check whether they are feasible as a project for the summer.<br /> <br /> ==== 2. GlyGen Biocuration Project Ideas ====<br /> POC: Rene Ranzinger and Urnisha Bhuiyan<br /> <br /> Over the last three decades, numerous glycomics database projects have been initiated to collect valuable information about glycans, proteins, and their interactions. Some of these databases have been discontinued due to the end of project funding. However, the data within these databases remains highly valuable to the community. 
Integrating these datasets into modern databases or knowledgebases, such as GlyGen, presents a challenge because many of the valuable metadata annotations (e.g., species, tissue, disease, cell line) are free-text terms that do not align with the standard dictionaries and ontologies used in modern resources. Automated matching of this information with dictionaries or ontologies is often not possible due to the use of synonyms, spelling errors, or abbreviations. For example, "human," "man," and "h. sapiens" all map to the scientific species name "Homo sapiens."<br /> <br /> The GlyGen project aims to make datasets from two older databases (CarbBank, CFG) accessible by migrating the data and metadata into our database. For this project, we are seeking curators with a medical or biology background who are interested in helping map metadata terms from these old databases to standard dictionaries and ontologies.<br /> <br /> The project involves:<br /> <br /> # Using internet resources (e.g., Google, Wikipedia) to identify terms used in the old database.<br /> # Mapping identified terms to corresponding dictionaries and ontologies using the webpages and search interfaces of these projects.<br /> # Finding papers based on titles and author lists that may contain spelling errors.<br /> # Interacting and discussing with other curators in case terms are mapped differently.<br /> <br /> If you have any other ideas or methods you would like to focus on, please reach out to rene@ccrc.uga.edu to discuss them.<br /> <br /> '''3. GlyGen Publication Analysis Project Ideas'''<br /> <br /> POC: Rene Ranzinger and Urnisha Bhuiyan<br /> <br /> One of the challenges for any bioinformatics project is understanding the size of its community, how well the project serves this community, and how widely its software/database is used. A potential solution is to analyze PubMed publication data.
We are seeking applicants with programming skills (in Python or Java) to perform this analysis.<br /> <br /> The project involves:<br /> <br /> # Using the PubMed web API to filter publications based on keywords.<br /> # Analyzing paper abstracts to identify research institutions and groups that form the community.<br /> # Filtering the community list to exclude unrelated co-authors.<br /> <br /> A subproject will involve analyzing the full text of papers (when available) for keywords or resource and database names. The results of the analysis will be discussed with a GlyGen project member, who will suggest changes and improvements to the analysis and data presentation. Source code developed as part of this project will be documented and shared in a public GitHub repository. If you have any other ideas or methods you would like to explore, please reach out to rene@ccrc.uga.edu to discuss them.<br /> <br /> ==== 4. PredictMod Machine Learning Project Ideas ====<br /> POC: Lori Krammer<br /> <br /> Data Identification & Harmonization: <br /> <br /> # Identify publicly available datasets from scientific literature that can be used for intervention outcome prediction models.<br /> # Conduct data harmonization and pre-processing following established project pipelines to produce an ML-ready dataset and data dictionary.<br /> <br /> Modeling & Integration (for those with experience in programming/ML)<br /> <br /> # Perform model training and document the ML pipeline in a BioCompute Object (BCO). <br /> # Integrate the model into the PredictMod platform.<br /> <br /> Individuals with a background or interest in machine learning should reach out to lorikrammer@gwu.edu with a potential dataset to determine if it is a feasible project for the summer.<br /> <br /> '''5. 
FDA-ARGOS Computation and Pathogen Curation Project'''<br /> <br /> POC: Christie Woodside<br /> <br /> # Update data tables for more efficient computations<br /> ## Student would review and input additional data and IDs in the tables/sheets used to perform computations. This is manual but important work that requires close attention to detail. ~1 week's worth of work.<br /> ## Requires Python/shell coding background. Student would run scripts that prepare and format data tables that are pushed to data.argosdb.org. Coding knowledge is needed in case of errors, bugs, or other mishaps in the code. Ongoing work as computations are performed.<br /> # Curate and report on current pathogens to upload to ARGOS<br /> ## Student would work on manual curation of circulating pathogens to be added to data.argosdb.org. Regular check-ins and reports of what was found. ~4-10 weeks' worth of work.<br /> ## Locate assembly IDs, reads, and metagenomic information for these pathogens to be used in computations and deposited into data.argosdb.org.<br /> ## Provide documentation on why they were curated, why they are important, how they were selected, and how data was collected.<br /> <br /> If you have any other ideas or methods you want to focus on, please reach out to christie.woodside@email.gwu.edu to discuss your idea and check whether it is feasible as a project for the summer.<hr><br /> <h3>Requirements for Completion</h3><br /> <p><strong>Note:</strong> The following are <u>mandatory</u>. 
Failure to complete any of them will result in an incomplete volunteer record.</p><br /> <br /> <h4>Documentation</h4><br /> <p>All volunteers must maintain adequate documentation of their work, including written protocols and scripts submitted to GitHub.</p><br /> <br /> <h4>Written Report</h4><br /> <p>Submit a 1–2 page summary of your tasks and accomplishments to the Admin Team during the final week of your program.</p><br /> <br /> <h4>Presentation & Slide Submission</h4><br /> <p>Present your work during the last week of the 8-week period.</p><br /> <p>Slides must be submitted to the Admin Team and should include:</p><br /> <ul><br /> <li>A title slide with your name, date, and mentor</li><br /> <li>At least 3 content slides</li><br /> <li>A final slide with acknowledgements or references</li><br /> </ul><br /> Contact the Admin Team to access previously submitted slides.<br /> <hr><br /> <br /> === Completion Certificate ===<br /> A certificate of completion and a letter of recommendation will be provided to all participants who successfully complete the program.<br /> <hr><br /> === Contact ===<br /> mazumder_lab@gwu.edu.<br /> <hr><br /> === Volunteers ===<br /> {| class="wikitable"<br /> |+<br /> |-<br /> ! Name<br /> ! Skills<br /> !Projects Interested<br /> |-<br /> | Grace Chong<br /> | Python, Machine Learning, NLP, Analysis & Mathematics<br /> |<br /> # PredictMod<br /> # BiomarkerKB Biocuration<br /> # GlyGen Biocuration<br /> |-<br /> |Alma Ogunsina<br /> |Molecular Biology, Python, ML, and Data Analysis<br /> |<br /> # BiomarkerKB<br /> # ARGOS<br /> # PredictMod<br /> |-<br /> |Diya Kamalabharathy<br /> |Computational Biology, Python Programming, Molecular Biology Techniques<br /> Scientific Writing, Data Analysis<br /> |<br /> # BiomarkerKB Biocuration<br /> # PredictMod Machine Learning<br /> # GlyGen Biocuration<br /> |-<br /> |Harivinay P. 
Gujjula<br /> |Molecular Biology, Protein Analysis, Immunoassays, Spectroscopy Techniques, Genetic Engineering<br /> |<br /> # GlyGen Biocuration<br /> # BiomarkerKB Biocuration<br /> |-<br /> |Miao Wang<br /> |Python, R, Machine Learning, Bioinformatics Tools (e.g., DESeq2, KEGG, GO, Ensembl VEP), SQL<br /> |<br /> # BiomarkerKB Biocuration Project Ideas<br /> # FDA-ARGOS Computation and Pathogen Curation Project<br /> # PredictMod Machine Learning Project Ideas<br /> |-<br /> |Nahom Abel<br /> |Finding datasets online, Working with Excel/CSV files, Research and Organizational Skills, Experience with Data Entry<br /> |<br /> # BiomarkerKB Biocuration<br /> # GlyGen Biocuration<br /> # PredictMod<br /> |-<br /> |Kajal Sanjaykumar Patel<br /> |Python, Apache, SparkETL, Machine Learning, AWS<br /> |<br /> #PredictMod<br /> #BiomarkerKB<br /> #GlyGen<br /> |}</div>

Vishal.bakshi https://hivelab.biochemistry.gwu.edu/wiki/index.php?title=Volunteership_2025&diff=740 Volunteership 2025 2025-04-16T17:35:28Z

<p>Vishal.bakshi: Added new volunteer</p> <hr /> <div><h2>2025 Volunteer Program Details</h2><br /> <br /> <h3>Dates</h3><br /> <strong>Volunteer Kick-Off Meeting</strong><br><br /> May 26, 2025 | 3:30 to 4:30 PM<br /> <br /> <strong>Program Dates: June 2nd, 2025 – July 25th, 2025</strong> (8 weeks)<br><br /> Monday to Friday | Remote | No breaks<br /> <br /> <hr><br /> <br /> <h3>Volunteer Expectations</h3><br /> <ol><br /> <li>Daily progress updates via Slack (scrum).</li><br /> <li>Regular Zoom meetings with the assigned project point of contact.</li><li>Expected to dedicate 5–6 hours per day to project work, with the remaining time focused on skill development or reading. </li><br /> </ol><br /> <p style="color: red;"><strong>Important:</strong> If the scrum is not updated for 2 consecutive days, the candidate will be <u>automatically dropped</u> from the program.</p><br /> <hr><br /> <br /> <h3>Potential Projects</h3><br /> <ol><br /> <li>BiomarkerKB ([https://biomarkerkb.org biomarkerkb.org]) project: Biomarker curation project. Involves reading papers and collecting biomarkers.</li><br /> <li>GlyGen ([https://glygen.org glygen.org]) project: Review glycomics and glycoproteomics data and curate tissue, disease, and other related information. </li><li>ARGOS ([https://argosdb.org argosdb.org]) project: Analyze genomics data using HIVE to identify reference genome assemblies. </li><li>PredictMod ([https://hivelab.biochemistry.gwu.edu/predictmod hivelab.biochemistry.gwu.edu/predictmod]) project. Identifying datasets and harmonizing them so that they can be used to generate ML models. </li></ol>''Note: Individuals involved in the above projects with a background in programming and/or machine learning may also undertake additional tasks to support the development of ML models, which can be integrated into PredictMod or used to enhance AI/ML-ready datasets within GlyGen.''<hr><br /> <br /> <h4>1. 
BiomarkerKB Biocuration Project Ideas</h4>POC: Daniall Masood<br /> # Curate biomarkers for a specific disease (Alzheimer's)<br /> ## The student would do manual curation for about 4 weeks, with regular check-ins with the POC to ensure it is being done correctly.<br /> ## The next 4 weeks can be dedicated to developing an LLM or an automated process to extract biomarker details, using the data collected in the first 4 weeks as training/example data.<br /> # Top 50 biomarkers<br /> ## Curate the top 50 biomarkers for biomarkerkb.org.<br /> ## Define what constitutes a top 50 biomarker.<br /> ## Begin curating biomarkers from different sources and papers by collecting fields mentioned in the data model, as well as collecting cross-references.<br /> # Biocuration of biomarkers from NLP/LLM work<br /> ## Use the biomarkers collected from NLP work.<br /> ## Curate biomarkers. The data provided does not follow the biomarker data model.<br /> ## While curating the biomarkers, check if data collected from NLP is correct.<br /> ## After completion, the student can start using curated data to work on the NLP/LLM method.<br /> # Curate biomarkers for a treatment<br /> ## See #1 above.<br /> <br /> If you have any other ideas, diseases, treatments, or methods you want to focus on, please reach out to daniallmasood@gwu.edu to discuss your idea and check whether it is feasible as a project for the summer.<br /> <br /> ==== 2. GlyGen Biocuration Project Ideas ====<br /> POC: Rene Ranzinger and Urnisha Bhuiyan<br /> <br /> Over the last three decades, numerous glycomics database projects have been initiated to collect valuable information about glycans, proteins, and their interactions. Some of these databases have been discontinued due to the end of project funding. However, the data within these databases remains highly valuable to the community.
Integrating these datasets into modern databases or knowledgebases, such as GlyGen, presents a challenge because many of the valuable metadata annotations (e.g., species, tissue, disease, cell line) are free-text terms that do not align with the standard dictionaries and ontologies used in modern resources. Automated matching of this information with dictionaries or ontologies is often not possible due to the use of synonyms, spelling errors, or abbreviations. For example, "human," "man," and "h. sapiens" all map to the scientific species name "Homo sapiens."<br /> <br /> The GlyGen project aims to make datasets from two older databases (CarbBank, CFG) accessible by migrating the data and metadata into our database. For this project, we are seeking curators with a medical or biology background who are interested in helping map metadata terms from these old databases to standard dictionaries and ontologies.<br /> <br /> The project involves:<br /> <br /> # Using internet resources (e.g., Google, Wikipedia) to identify terms used in the old database.<br /> # Mapping identified terms to corresponding dictionaries and ontologies using the webpages and search interfaces of these projects.<br /> # Finding papers based on titles and author lists that may contain spelling errors.<br /> # Interacting and discussing with other curators in case terms are mapped differently.<br /> <br /> If you have any other ideas or methods you would like to focus on, please reach out to rene@ccrc.uga.edu to discuss them.<br /> <br /> '''3. GlyGen Publication Analysis Project Ideas'''<br /> <br /> POC: Rene Ranzinger and Urnisha Bhuiyan<br /> <br /> One of the challenges for any bioinformatics project is understanding the size of its community, how well the project serves this community, and how widely its software/database is used. A potential solution is to analyze PubMed publication data.
We are seeking applicants with programming skills (in Python or Java) to perform this analysis.<br /> <br /> The project involves:<br /> <br /> # Using the PubMed web API to filter publications based on keywords.<br /> # Analyzing paper abstracts to identify research institutions and groups that form the community.<br /> # Filtering the community list to exclude unrelated co-authors.<br /> <br /> A subproject will involve analyzing the full text of papers (when available) for keywords or resource and database names. The results of the analysis will be discussed with a GlyGen project member, who will suggest changes and improvements to the analysis and data presentation. Source code developed as part of this project will be documented and shared in a public GitHub repository. If you have any other ideas or methods you would like to explore, please reach out to rene@ccrc.uga.edu to discuss them.<br /> <br /> ==== 4. PredictMod Machine Learning Project Ideas ====<br /> POC: Lori Krammer<br /> <br /> Data Identification & Harmonization: <br /> <br /> # Identify publicly available datasets from scientific literature that can be used for intervention outcome prediction models.<br /> # Conduct data harmonization and pre-processing following established project pipelines to produce an ML-ready dataset and data dictionary.<br /> <br /> Modeling & Integration (for those with experience in programming/ML)<br /> <br /> # Perform model training and document the ML pipeline in a BioCompute Object (BCO). <br /> # Integrate the model into the PredictMod platform.<br /> <br /> Individuals with a background or interest in machine learning should reach out to lorikrammer@gwu.edu with a potential dataset to determine if it is a feasible project for the summer.<br /> <br /> '''5. 
FDA-ARGOS Computation and Pathogen Curation Project'''<br /> <br /> POC: Christie Woodside<br /> <br /> # Update data tables for more efficient computations<br /> ## Student would review and input additional data and IDs in the tables/sheets used to perform computations. This is manual but important work that requires close attention to detail. ~1 week's worth of work.<br /> ## Requires Python/shell coding background. Student would run scripts that prepare and format data tables that are pushed to data.argosdb.org. Coding knowledge is needed in case of errors, bugs, or other mishaps in the code. Ongoing work as computations are performed.<br /> # Curate and report on current pathogens to upload to ARGOS<br /> ## Student would work on manual curation of circulating pathogens to be added to data.argosdb.org. Regular check-ins and reports of what was found. ~4-10 weeks' worth of work.<br /> ## Locate assembly IDs, reads, and metagenomic information for these pathogens to be used in computations and deposited into data.argosdb.org.<br /> ## Provide documentation on why they were curated, why they are important, how they were selected, and how data was collected.<br /> <br /> If you have any other ideas or methods you want to focus on, please reach out to christie.woodside@email.gwu.edu to discuss your idea and check whether it is feasible as a project for the summer.<hr><br /> <h3>Requirements for Completion</h3><br /> <p><strong>Note:</strong> The following are <u>mandatory</u>. 
Failure to complete any of them will result in an incomplete volunteer record.</p><br /> <br /> <h4>Documentation</h4><br /> <p>All volunteers must maintain adequate documentation of their work, including written protocols and scripts submitted to GitHub.</p><br /> <br /> <h4>Written Report</h4><br /> <p>Submit a 1–2 page summary of your tasks and accomplishments to the Admin Team during the final week of your program.</p><br /> <br /> <h4>Presentation & Slide Submission</h4><br /> <p>Present your work during the last week of the 8-week period.</p><br /> <p>Slides must be submitted to the Admin Team and should include:</p><br /> <ul><br /> <li>A title slide with your name, date, and mentor</li><br /> <li>At least 3 content slides</li><br /> <li>A final slide with acknowledgements or references</li><br /> </ul><br /> Contact the Admin Team to access previously submitted slides.<br /> <hr><br /> <br /> === Completion Certificate ===<br /> A certificate of completion and a letter of recommendation will be provided to all participants who successfully complete the program.<br /> <hr><br /> === Contact ===<br /> mazumder_lab@gwu.edu.<br /> <hr><br /> === Volunteers ===<br /> {| class="wikitable"<br /> |+<br /> |-<br /> ! Name<br /> ! Skills<br /> !Projects Interested<br /> |-<br /> | Grace Chong<br /> | Python, Machine Learning, NLP, Analysis & Mathematics<br /> |<br /> # PredictMod<br /> # BiomarkerKB Biocuration<br /> # GlyGen Biocuration<br /> |-<br /> |Alma Ogunsina<br /> |Molecular Biology, Python, ML, and Data Analysis<br /> |<br /> # BiomarkerKB<br /> # ARGOS<br /> # PredictMod<br /> |-<br /> |Diya Kamalabharathy<br /> |Computational Biology, Python Programming, Molecular Biology Techniques<br /> Scientific Writing, Data Analysis<br /> |<br /> # BiomarkerKB Biocuration<br /> # PredictMod Machine Learning<br /> # GlyGen Biocuration<br /> |-<br /> |Harivinay P. 
Gujjula<br /> |Molecular Biology, Protein Analysis, Immunoassays, Spectroscopy Techniques, Genetic Engineering<br /> |<br /> # GlyGen Biocuration<br /> # BiomarkerKB Biocuration<br /> |-<br /> |Miao Wang<br /> |Python, R, Machine Learning, Bioinformatics Tools (e.g., DESeq2, KEGG, GO, Ensembl VEP), SQL<br /> |<br /> # BiomarkerKB Biocuration Project Ideas<br /> # FDA-ARGOS Computation and Pathogen Curation Project<br /> # PredictMod Machine Learning Project Ideas<br /> |-<br /> |Nahom Abel<br /> |Finding datasets online, Working with Excel/CSV files, Research and Organizational Skills, Experience with Data Entry<br /> |<br /> # BiomarkerKB Biocuration<br /> # GlyGen Biocuration<br /> # PredictMod<br /> |-<br /> |Kajal Sanjaykumar Patel<br /> |Python, Apache, SparkETL, Machine Learning, AWS<br /> |<br /> #<br /> |}</div>

Vishal.bakshi https://hivelab.biochemistry.gwu.edu/wiki/index.php?title=2025_Bioinformatics_Symposium&diff=727 2025 Bioinformatics Symposium 2025-04-15T19:16:58Z

<p>Vishal.bakshi: /* Poster Presentations */</p> <hr /> <div>{{DISPLAYTITLE: 2025 Bioinformatics Symposium}}<br /> <br /> '''Title''': 2025 Inaugural GW Bioinformatics Symposium <br /> <br /> '''When''': April 29th 2025, 9am to 6pm<br /> <br /> '''Venue''': Talks: SEH B1220. Refreshments, lunch, and posters: Green wall area (SEH B1167)<br /> <br /> '''Join us for a full-day, all-hands GW Bioinformatics Symposium featuring posters, talks, and roundtable discussions. Open to GW students, staff, and faculty!'''<br /> <br /> '''REGISTRATION IS CLOSED. Event Registration:''' Space is limited. Please register by '''<u>April 12th, 2025</u>''' for the event through this '''<big>[https://docs.google.com/forms/d/e/1FAIpQLSd_VfXIL_S59cVgOBxx_0b0E-wMBphWbBVuK6-JOSm9-cqiJA/viewform?usp=sharing form]</big>'''. If you encounter any issues, please email Raja Mazumder (mazumder@gwu.edu) your name and the lab you are representing, and he will register you.<br /> <br /> '''ABSTRACT SUBMISSION IS CLOSED. Abstract submission:''' Please submit your abstract by '''<u>April 12th, 2025</u>''' through [https://cri-datacap.org/surveys/?s=8LFM3TKDWY338KDC '''<big>REDCap</big>''']<br /> <br /> '''OPT-OUT LUNCH PICKUP.''' We’re excited by the overwhelming response. Over 120 participants from nearly all GW schools have signed up. Please note that seating in the main room is limited to the first 90 attendees. We encourage you to arrive early to secure a seat. For those who arrive later, we’re working to set up a spillover room with TV monitors so everyone can still follow the sessions. Lunch will be provided for all attendees who plan to stay for most of the day. If you’re attending only briefly and do not plan to pick up lunch, please let us know to help minimize food waste. You can either fill out this [https://docs.google.com/forms/d/e/1FAIpQLScWaw6JFsbtQgcdsk9XCZQMJ0mMb3Z0UR5uHZnTZ1DgBYJlDQ/viewform?usp=header form] or email mazumder@gwu.edu. 
We appreciate your understanding and cooperation.<br /> <br /> == Abstract/Overview ==<br /> <br /> The GW Bioinformatics Symposium on April 29, 2025, is a full-day event designed to bring together faculty, staff, and student bioinformatics researchers, as well as researchers who use bioinformatics in their labs, from across GW to foster networking, collaboration, and knowledge exchange. The symposium will feature talks from GW labs that focus on bioinformatics and related research, poster presentations, roundtable discussions, and sessions on resources, funding, and career opportunities in bioinformatics. Topics will span bioinformatics, computational methods, IT/security, and training, highlighting the breadth of bioinformatics in various GW schools and centers. This event offers a unique opportunity for attendees to engage in meaningful discussions, explore potential collaborations, and stay informed about the latest advancements in the field. The symposium is a great way to connect with the GW bioinformatics community.<br /> <br /> == Poster ==<br /> Participants are invited to submit a brief poster abstract by March 31st at 11:59 PM (ET). We encourage submissions from bioinformatics labs and also from other labs that do not primarily focus on bioinformatics but have research relevant to bioinformatics topics. A select few will be chosen for lightning talks. Due to the limited number of poster boards, priority will be given to ensure each lab/group has at least one designated board. If the number of submitted poster abstracts exceeds the available poster boards, additional posters may be printed as flyers with QR codes, enabling attendees to scan, view, or download them electronically. 
<br /> <br /> '''Size:''' Poster sizes can be up to 42 (width) x 36 (height) inches.<br /> <br /> '''Poster Abstract Submission Portal:''' [https://cri-datacap.org/surveys/?s=8LFM3TKDWY338KDC Click here].<br /> <br /> '''Poster Printing Instructions'''<br /> <br /> Download the poster template from GW Research Day Resources: [https://guides.himmelfarb.gwu.edu/ResearchDay/poster-design-layout Poster Design & Layout].<br /> <br /> After you create the PPT for your poster, request free poster printing from Gelman Library [https://library.gwu.edu/3-d-and-large-format-printing using this form].<br /> <br /> Submit your printing request by April 15th.<br /> <br /> == Schedule ([https://hivelab.biochemistry.gwu.edu/wiki/2025_Bioinformatics_Symposium#Talk_Titles Talk Titles]) ==<br /> <br /> {| class="wikitable"<br /> |+<br /> !Time<br /> !Duration <br /> !Topic<br /> !Presenter(s)<br /> |-<br /> | colspan="4" |'''Morning Session'''<br /> '''Topics: Registration, introduction, and health-related topics'''<br /> |-<br /> |8:30 - 9:00 AM<br /> |30 min<br /> |<u>Registration & coffee</u><br /> Lead: Raechelle McCants, Sunisha Harish<br /> <br /> * Registration<br /> * Coffee<br /> * Poster Setup<br /> * Slide/AV setup & check<br /> |<br /> |-<br /> |9 - 11:00 AM<br /> |120 min<br /> |<u>Welcome</u><br /> Rong Li (Chair, Dept. BMM, SMHS)<br /> <br /> Alison Hall (Senior Assoc Dean for Res, SMHS)<br /> <br /> Raja Mazumder<br /> <br /> Anelia Horvath<br /> <br /> <u>Talks</u><br /> <br /> Session chairs: Anelia Horvath, Raja Mazumder<br /> |Anelia Horvath (Biochemistry)<br /> Ljubica Caldovic (Children’s)<br /> <br /> Dae Young Kim (Children’s; Muhammad Rahman lab)<br /> <br /> Seth Berger (Children’s)<br /> <br /> Raja Mazumder (Biochemistry)<br /> <br /> Yi-Wen Chen (Children’s)<br /> <br /> Guillermo Orti (Biology)<br /> <br /> ''Marc Garbey (Neurology)''<br /> <br /> ''Additional presenters TBD''<br /> |-<br /> |11 - 11:15 AM<br /> |15 min<br /> |Refreshment Break. 
<br /> |<br /> |-<br /> |11:15 - 12:30 PM<br /> |75 min<br /> |Talks<br /> Session chairs: Ali Rahnavard, Ljubica Caldovic<br /> |''Jo Lynne Rokita (Children’s)''<br /> Max Alekseyev (Milken)<br /> <br /> Erika Hubbard (Crandall lab; Milken)<br /> <br /> Ali R. Taheriyoun (Rahnavard Lab; Milken)<br /> <br /> Hiroki Morizono (Children’s)<br /> <br /> ''Additional presenters TBD''<br /> |-<br /> |12:30 - 2 PM<br /> |90 min<br /> |Lunch and poster session<br /> Lead: Raechelle McCants, Jewel Dias<br /> |'''Poster Judging Committee:'''<br /> Ali Rahnavard <br /> <br /> Hiroki Morizono <br /> <br /> Yi-Wen Chen <br /> <br /> Jimmy Saw<br /> |-<br /> | colspan="4" |'''Afternoon Session'''<br /> '''Topics: Breadth of bioinformatics in biological research; IT/security; Training'''<br /> |-<br /> |2:00 - 3:30 PM<br /> |90 min<br /> |Talks<br /> Session Chairs: Howie Huang, Jimmy Saw, Chen Zeng<br /> |Howie Huang (Engineering)<br /> Nan Wu (ECE, Engineering)<br /> <br /> Aya Zirikly (Computer Science, GW/JHU)<br /> <br /> Chen Zeng (Physics)<br /> <br /> Weiqun Peng (Physics)<br /> <br /> Xiangyun Qiu (Physics)<br /> <br /> Shekhar Nagar (Jimmy Saw Lab, Biology)<br /> <br /> ''Some speakers might be moved to the morning sessions'' <br /> <br /> |-<br /> |3:30 - 4:30 PM<br /> |60 min<br /> |Session Chair: Jonathon Keeney.<br /> Co-chairs: Hiroki Morizono, Anelia Horvath<br /> * Talks on IT, omics support, and related topics<br /> * Round table discussion<br /> * Careers in bioinformatics<br /> * Funding opportunities<br /> * Poster awards<br /> |Clark Gaylord (Director, Research Technology Services)<br /> Brian Choi (MFA)<br /> <br /> Anelia Horvath (MGPC core/Bioinformatics support)<br /> <br /> Jack Villani (GW Genomics Core)<br /> <br /> Ali Rahnavard (CBI Analytics)<br /> |-<br /> |4:30 - 6:00 PM<br /> |<br /> |Networking event, poster prizes, and refreshments<br /> |Keith Crandall<br /> |}<br /> <br /> == Presentation/Discussion Sessions ==<br /> There will be a Q&A 
session and a networking event at the end of the workshop.<br /> <br /> == Scientific Organizing Committee ==<br /> <br /> Raja Mazumder (Symposium Chair), Anelia Horvath, Hiroki Morizono, Ljubica Caldovic, Keith Crandall, Jorge Sepulveda, Howie Huang, Chen Zeng, Jimmy Saw, Clark Gaylord.<br /> <br /> == Logistics Organizing Committee ==<br /> <br /> Raja Mazumder, Anelia Horvath, Raechelle McCants, Jewel Das. Student volunteers: Jane, Sofia, Allison, Chloe, Trupri and Lincoln.<br /> <br /> == Talk Titles ==<br /> {| class="wikitable"<br /> |+<br /> !Name<br /> !Department<br /> !School<br /> !Title<br /> |-<br /> |Jo Lynne Rokita<br /> |Pediatrics<br /> |CNH<br /> |Accelerating discovery and target identification for pediatric brain tumors through open-source platforms and tools<br /> |-<br /> |Erika Hubbard<br /> |Bioinformatics and Biostatistics<br /> |SPH<br /> |Machine Learning to Determine Endotypes of Lupus<br /> |-<br /> |Raja Mazumder<br /> |Biochemistry and Molecular Medicine<br /> |SMHS<br /> |Integrating Biomedical Knowledgebases and Clinical Data for ML/AI-Powered Insights<br /> |-<br /> |Jack Villani<br /> |GW Genomics Core<br /> |SPH<br /> |GW Genomics Core: An Introduction & Overview (panel discussion)<br /> |-<br /> | Ayah Zirikly || Computer Science || SEAS/Johns Hopkins University || Developments in NLP and AI for Mental Health: Insights from the Last Decade and Future Directions – A Focus on the CLPsych Workshop<br /> |-<br /> | Weiqun Peng || Physics || CCAS || Finding structures and their associated functions in genome wide of profiles of chromatin architecture<br /> |-<br /> | Nan Wu || Electrical and Computer Engineering || SEAS || Directed Graph Representation Learning for Circuits, Boolean Networks, and Beyond<br /> |-<br /> | Seth Berger || Biochemistry and Molecular Medicine / Pediatrics || SMHS || Blindspots in Clinical Genetic Testing: Integration of Multiomics to Improve Diagnostic Yields<br /> |-<br /> | Ali Reza Taheriyoun || 
Biostatistics and Bioinformatics || SPH || Dynamics of Gut Microbiome and Metabolome of Moderate and Severe Obesity Patients Under Sleeve Gastrectomy<br /> |-<br /> | Yi-Wen Chen || Biochemistry and Molecular Medicine / Pediatrics || SMHS || From gene to treatment: omics approaches for understanding facioscapulohumeral muscular dystrophy<br /> |-<br /> | Mohammad Saeed || Computer Science || SEAS || Biases in AI-Driven Healthcare: Challenges and Implications for Clinical Decision-Making<br /> |-<br /> | Shekhar Nagar || Biological Sciences || CCAS || Metabolic flexibility and dissemination of antibiotic resistomes from Actinobacteria in Hawaii hydrothermal steam vents<br /> |-<br /> |Dae Young Kim<br /> |Center for Translational Research<br /> |CNH<br /> |mhGPT: A Lightweight Domain-Specific Language Model for Mental Health Analysis<br /> |-<br /> |Anelia Horvath<br /> |Biochemistry and Molecular Medicine<br /> |SMHS<br /> |AI driven Functional SNV Discovery from long read Single-Cell RNA-Seq Data<br /> |} <br /> <br /> == Acknowledgments ==<br /> <br /> Sponsors: Dept. of Biochemistry and Molecular Medicine (coffee, refreshments, lunch, poster prizes), IBS (poster boards), Milken Institute School of Public Health (happy hour, poster prizes). <br /> <br /> == Contact ==<br /> '''For questions about registration, abstract submission or general inquiries, please contact:''' <br /> <br /> Raja Mazumder: mazumder@gwu.edu<br /> <br /> == Poster Presentations ==<br /> {| class="wikitable"<br /> !Poster Number<br /> ! Name !! 
Presentation Title<br /> |-<br /> |1<br /> | Sunisha Harish || AI-Driven Drug Response Prediction in Cancer Using Long-Read Single-Cell RNA-Seq<br /> |-<br /> |2<br /> | Dae Young Kim || mhGPT: A Lightweight Domain-Specific Language Model for Mental Health Analysis<br /> |-<br /> |3<br /> | Vania Ballesteros Prieto || Uncovering the Contributions of Expressed Genetic Variants, Isoforms, and RNA Editing to Tumor Heterogeneity via Long-Read Single-Cell RNA-Seq Analysis<br /> |-<br /> |4<br /> | Sarah Tiufekchiev-Grieco || Promoting Resolution of Inflammation as a Potential Therapy for DMD<br /> |-<br /> |5<br /> | Karli Gilbert || Machine Learning Models Predict Treatment Outcome from Serum Proteins in Patients with Myasthenia Gravis that received Thymectomy<br /> |-<br /> |6<br /> | Reny Mathew || Identification of anti-helminthic drug resistance associated Quantitative trait loci (QTLs) in the canine hookworm, Ancylostoma caninum: A pooled-sequencing approach<br /> |-<br /> |7<br /> | Jo Lynne Rokita || Accelerating discovery and target identification for pediatric brain tumors through open-source platforms and tools<br /> |-<br /> |8<br /> | Henry Kaminski || Moving towards a digital twin for myasthenia gravis <br /> |-<br /> |9<br /> | Huai Chin Chiang || Single-Cell Transcriptomic and Phenotypic Profiling Reveals T Cell Dysfunction in BRCA1 Mutation Carriers<br /> |-<br /> |10<br /> | Lori Krammer || GW-FEAST: a federated ecosystem for data analysis and machine learning<br /> |-<br /> |11<br /> | Medha Kurukunda || Analyzing the Use of Artificial Intelligence to Enhance the Identification of Food Insecure Areas in Washington, D.C. 
<br /> |-<br /> |12<br /> | Christie Rose Woodside || Bridging Genomics and Preparedness: Regulatory-Grade Genomics and Quality Control Metrics and Analysis for Emerging and Circulating Avian Influenza in 2024-2025<br /> |-<br /> |13<br /> | Aiste Gulla, MD, PhD || Clinical Outcomes and Long-Term Survival of Pancreatic Cancers by Histological Sub-Type in the Epic Cosmos Database: Results from 2010-2025<br /> |-<br /> |14<br /> | Jane Ulianova || Comparison of alignment performance between the T2T-CHM13 and GRCh38/hg38 reference genome assemblies for RNAseq<br /> |-<br /> |15<br /> | Zhe Yu || Automated Tracking of Freezing Behavior in Paired House Mice Using DeepLabCut<br /> |-<br /> |16<br /> | Zhe Yu || Behavioral Bioinformatics for Temporal Analysis of Freezing Behavior in Dyad Mice<br /> |-<br /> |17<br /> | Karim Ismat || Generation of a single nuclei RNA sequencing atlas of dysferlin-deficient skeletal muscle <br /> |-<br /> |18<br /> | Kai Leung (Adam) Wong || An Experience of carrying out GPU-accelerated Genomic Analysis on Pegasus<br /> |-<br /> |19<br /> | Gabriel Batzli || Defining macrophage heterogeneity in murine skin wounds during inflammation <br /> |-<br /> |20<br /> | Hovhannes Arestakesyan || Recurrent Somatic scSNVs in Single-Cell RNA-Seq: Insights into Tumor Heterogeneity and RNA-Level Variants<br /> |-<br /> |21<br /> | Chloe Sachs || Secretome distinguishes spectrum of NF1 associated peripheral nerve sheath tumors<br /> |-<br /> |22<br /> | Nikhil Arethiya || A Time-Series Approach to Glucose-Based Participant Classification<br /> |-<br /> |23<br /> | Siera Martinez || Hetero-GNN Link Prediction of RNA Editing in Single Cells<br /> |-<br /> |24<br /> | Renxi Li || Thirty-day outcomes of infrainguinal bypass surgery with concurrent iliac artery stenting in patients with chronic limb-threatening ischemia <br /> |-<br /> |25<br /> | Matthew Mollerus || ResLens: Detecting Antibiotic Resistance Genes with Large Language Models<br /> |-<br /> 
|26<br /> | Parimala Nagaraj || Cybersecurity at the Intersection of Genomics and Data Science: Securing the Future of Bioinformatics<br /> |-<br /> |27<br /> | Shekhar Nagar || Metabolic flexibility and dissemination of antibiotic resistomes from Actinobacteria in Hawaii hydrothermal steam vents<br /> |-<br /> |28<br /> | Cristina Fenollar Ferrer || Functional impact of PIP2 on the Serotonin Transporter (SERT)<br /> |-<br /> |29<br /> | Ali Taheriyoun || Dynamics of gut microbiome and metabolome of obesity patients under sleeve gastrectomy<br /> |-<br /> |30<br /> | Irene Zohn || Next Generation sequencing approaches to understand developmental defects<br /> |-<br /> |31<br /> | Max Alekseyev || Bioinformatics meets Quantum Informatics: from genome rearrangements to Weingarten calculus<br /> |-<br /> |32<br /> | Lausanne Lee Oliver || Phylogenetic analysis of novel phages from Hawaiian fumaroles<br /> |-<br /> |33<br /> |Mahdi Baghbanzadeh<br /> |seqLens: optimizing language models for genomic predictions<br /> |-<br /> |34<br /> |Dezhao Fu<br /> |varLens - enhancers genetic testing using language models<br /> |-<br /> |35<br /> |Lilly Shaw<br /> |Uncovering Shared and Unique Biomarkers Across 23 Cancer Types Using The Cancer Genome Atlas (TCGA)<br /> |-<br /> |36<br /> |Daniall Masood<br /> |BiomarkerKB: A Comprehensive Biomarker Knowledgebase<br /> |-<br /> |37<br /> |Ljubica Caldovic<br /> |Active Learning of Data Science and Bioinformatics<br /> |-<br /> |38<br /> |Urnisha Bhuiyan<br /> |GlyGen: A Comprehensive Resource for Glycoscience Data Integration and Discovery<br /> |-<br /> |39<br /> |Surajit Bhattacharya<br /> |Redefining Human Airway Biology in Children from The Top Down: Unique Features of the Nasal Airway Epithelium.<br /> |-<br /> |40<br /> |Anelia Horvath<br /> |A Machine Learning Approach to Functional SNV Discovery via Isoform-Aware Single-Cell RNA-Seq<br /> |-<br /> |41<br /> |Christie Rose Woodside<br /> |Enhanced QC Metrics for 
Reference-Grade Genomic Data<br /> |-<br /> |42<br /> |Yi-Wen Chen<br /> |From gene to treatment: omics approaches for understanding facioscapulohumeral muscular dystrophy<br /> |-<br /> |43<br /> |Pia Sen<br /> |Investigating the role of bacteriophage diversity in Hawaiian steam vent microbial communities<br /> |-<br /> |44<br /> |Emily Williams*<br /> |TRIM28 regulates endogenous retroviral element expression in prostate cancer<br /> |-<br /> |45<br /> |Alexander Thiersch<br /> |Patient-centric approaches for antipsychotic medication research: an application of the Desirability of Outcome Ranking (DOOR) and Global Benefit-Risk (GBR) Score <br /> |-<br /> |46<br /> |Cadina Powell<br /> |Be Smart And Use Smartphones for Telemedicine: Narrative Review <br /> |-<br /> |47<br /> |Xinyang Zhang<br /> |Meta-analytic microbiome target discovery for immune checkpoint inhibitor response in advanced melanoma<br /> |-<br /> |48<br /> |Cyrus Chun Hong Au Yeung<br /> |Leveraging Large Language Models for Scalable Glycan-Disease Relation Extraction<br /> |-<br /> |49<br /> |Ashley Garrison<br /> |Gut Microbiome Composition as an Indicator of Preclinical Alzheimer's Disease<br /> |-<br /> |50<br /> |Chelcie Puetz<br /> |Combined Neuroinflammatory and Neurovascular Molecular Screening for Early Detection of Blood-Brain Barrier Dysfunction in Patients with Traumatic Brain Injury<br /> |}</div>

Vishal.bakshi https://hivelab.biochemistry.gwu.edu/wiki/index.php?title=Volunteership_2025&diff=722 Volunteership 2025 2025-04-14T15:21:24Z

<p>Vishal.bakshi: </p> <hr /> <div><h2>2025 Volunteer Program Details</h2><br /> <br /> <h3>Dates</h3><br /> <strong>Volunteer Kick-Off Meeting</strong><br><br /> May 26, 2025 | 3:30 to 4:30 PM<br /> <br /> <strong>Program Dates: June 2nd, 2025 – July 25th, 2025</strong> (8 weeks)<br><br /> Monday to Friday | Remote | No breaks<br /> <br /> <hr><br /> <br /> <h3>Volunteer Expectations</h3><br /> <ol><br /> <li>Daily progress updates via Slack (scrum).</li><br /> <li>Regular Zoom meetings with the assigned project point of contact.</li><li>Expected to dedicate 5–6 hours per day to project work, with the remaining time focused on skill development or reading. </li><br /> </ol><br /> <p style="color: red;"><strong>Important:</strong> If the scrum is not updated for 2 consecutive days, the candidate will be <u>automatically dropped</u> from the program.</p><br /> <hr><br /> <br /> <h3>Potential Projects</h3><br /> <ol><br /> <li>BiomarkerKB ([https://biomarkerkb.org biomarkerkb.org]) project: Biomarker curation project. Involves reading papers and collecting biomarkers.</li><br /> <li>GlyGen ([https://glygen.org glygen.org]) project: Review glycomics and glycoproteomics data and curate tissue, disease, and other related information. </li><li>ARGOS ([https://argosdb.org argosdb.org]) project: Analyze genomics data using HIVE to identify reference genome assemblies. </li><li>PredictMod ([https://hivelab.biochemistry.gwu.edu/predictmod hivelab.biochemistry.gwu.edu/predictmod]) project. Identifying datasets and harmonizing them so that they can be used to generate ML models. </li></ol>''Note: Individuals involved in the above projects with a background in programming and/or machine learning may also undertake additional tasks to support the development of ML models, which can be integrated into PredictMod or used to enhance AI/ML-ready datasets within GlyGen.''<hr><br /> <br /> <h4>1. 
BiomarkerKB Biocuration Project Ideas</h4>POC: Daniall Masood<br /> # Curate biomarkers for a specific disease (Alzheimer's disease)<br /> ## The student would do manual curation for about 4 weeks, with regular check-ins with the POC to ensure it is being done correctly.<br /> ## The next 4 weeks can be dedicated to developing an LLM or an automated process to extract biomarker details, using the data collected in the first 4 weeks as training or example data.<br /> # Top 50 biomarkers<br /> ## Curate the top 50 biomarkers for biomarkerkb.org.<br /> ## Define what constitutes a top 50 biomarker.<br /> ## Begin curating biomarkers from different sources and papers by collecting the fields mentioned in the data model, as well as cross-references.<br /> # Biocuration of biomarkers from NLP/LLM work<br /> ## Use the biomarkers collected from NLP work.<br /> ## Curate biomarkers. The data provided does not follow the biomarker data model.<br /> ## While curating the biomarkers, check whether the data collected from NLP is correct.<br /> ## After completion, the student can start using the curated data to work on the NLP/LLM method.<br /> # Curate biomarkers for a treatment<br /> ## See #1 above.<br /> <br /> If you have any other ideas, diseases, treatments, or methods you want to focus on, please reach out to daniallmasood@gwu.edu to discuss them and check whether they are feasible as a project for the summer.<br /> <br /> ==== 2. GlyGen Biocuration Project Ideas ====<br /> POC: Rene Ranzinger and Urnisha Bhuiyan<br /> <br /> Over the last three decades, numerous glycomics database projects have been initiated to collect valuable information about glycans, proteins, and their interactions. Some of these databases have been discontinued due to the end of project funding. However, the data within these databases remains highly valuable to the community. 
Integrating these datasets into modern databases or knowledgebases, such as GlyGen, presents a challenge because many of the valuable metadata annotations (e.g., species, tissue, disease, cell line) are free-text terms that do not align with the standard dictionaries and ontologies used in modern resources. Automated matching of this information with dictionaries or ontologies is often not possible due to the use of synonyms, spelling errors, or abbreviations. For example, "human," "man," and "h. sapiens" all map to the scientific species name "Homo sapiens."<br /> <br /> The GlyGen project aims to make datasets from two older databases (CarbBank, CFG) accessible by migrating the data and metadata into our database. For this project, we are seeking curators with a medical or biology background who are interested in helping map metadata terms from these old databases to standard dictionaries and ontologies.<br /> <br /> The project involves:<br /> <br /> # Using internet resources (e.g., Google, Wikipedia) to identify terms used in the old databases.<br /> # Mapping identified terms to corresponding dictionaries and ontologies using the webpages and search interfaces of these projects.<br /> # Finding papers based on titles and author lists that may contain spelling errors.<br /> # Discussing with other curators when terms are mapped differently.<br /> <br /> If you have any other ideas or methods you would like to focus on, please reach out to rene@ccrc.uga.edu to discuss them.<br /> <br /> '''3. GlyGen Publication Analysis Project Ideas'''<br /> <br /> POC: Rene Ranzinger and Urnisha Bhuiyan<br /> <br /> One of the challenges for any bioinformatics project is understanding the size of its community, how well the project serves this community, and how widely its software/database is used. A potential solution is to analyze PubMed publication data. 
We are seeking applicants with programming skills (in Python or Java) to perform this analysis.<br /> <br /> The project involves:<br /> <br /> # Using the PubMed web API to filter publications based on keywords.<br /> # Analyzing paper abstracts to identify research institutions and groups that form the community.<br /> # Filtering the community list to exclude unrelated co-authors.<br /> <br /> A subproject will involve analyzing the full text of papers (when available) for keywords or resource and database names. The results of the analysis will be discussed with GlyGen project members, who will suggest changes and improvements to the analysis and data presentation. Source code developed as part of this project will be documented and shared in a public GitHub repository. If you have any other ideas or methods you would like to explore, please reach out to rene@ccrc.uga.edu to discuss them.<br /> <br /> ==== 4. PredictMod Machine Learning Project Ideas ====<br /> POC: Lori Krammer<br /> <br /> Data Identification & Harmonization: <br /> <br /> # Identify publicly available datasets from scientific literature that can be used for intervention outcome prediction models.<br /> # Conduct data harmonization and pre-processing following established project pipelines to produce an ML-ready dataset and data dictionary.<br /> <br /> Modeling & Integration (for those with experience in programming/ML):<br /> <br /> # Perform model training and document the ML pipeline in a BioCompute Object (BCO). <br /> # Integrate the model into the PredictMod platform.<br /> <br /> Individuals with a background or interest in machine learning should reach out to lorikrammer@gwu.edu with a potential dataset to determine if it is a feasible project for the summer.<br /> <br /> '''5. 
FDA-ARGOS Computation and Pathogen Curation Project'''<br /> <br /> POC: Christie Woodside<br /> <br /> # Update data tables for more efficient computations<br /> ## Student would review and input additional data and IDs in the tables/sheets used to perform computations. This is manual but important work that requires close attention to detail. ~1 week's worth of work<br /> ## Requires Python/shell coding background. Student would run scripts that prepare and format data tables that are pushed to data.argosdb.org. Coding knowledge is needed in case of errors, bugs, or other mishaps in the code. Ongoing work as computations are performed.<br /> # Curate and report on current pathogens to upload to ARGOS<br /> ## Student would work on manual curation of circulating pathogens to be added to data.argosdb.org, with regular check-ins and reports on what was found. ~4-10 weeks' worth of work<br /> ## Locate assembly IDs, reads, and metagenomic information for these pathogens to be used in computations and deposited into data.argosdb.org.<br /> ## Provide documentation on why they were curated, why they are important, how they were selected, and how data was collected.<br /> <br /> If you have any other ideas or methods you want to focus on, please reach out to christie.woodside@email.gwu.edu to discuss them and check whether they are feasible as a project for the summer.<hr><br /> <h3>Requirements for Completion</h3><br /> <p><strong>Note:</strong> The following are <u>mandatory</u>. 
Failure to complete any of them will result in an incomplete volunteer record.</p><br /> <br /> <h4>Documentation</h4><br /> <p>All volunteers must maintain adequate documentation of their work, including written protocols and scripts submitted to GitHub.</p><br /> <br /> <h4>Written Report</h4><br /> <p>Submit a 1–2 page summary of your tasks and accomplishments to the Admin Team during the final week of your program.</p><br /> <br /> <h4>Presentation & Slide Submission</h4><br /> <p>Present your work during the last week of the 8-week period.</p><br /> <p>Slides must be submitted to the Admin Team and should include:</p><br /> <ul><br /> <li>A title slide with your name, date, and mentor</li><br /> <li>At least 3 content slides</li><br /> <li>A final slide with acknowledgements or references</li><br /> </ul><br /> Contact the Admin Team to access previously submitted slides.<br /> <hr><br /> <br /> === Completion Certificate ===<br /> A certificate of completion and a letter of recommendation will be provided to all participants who successfully complete the program.<br /> <hr><br /> === Contact ===<br /> mazumder_lab@gwu.edu.<br /> <hr><br /> === Volunteers ===<br /> {| class="wikitable"<br /> |+<br /> |-<br /> ! Name<br /> ! Skills<br /> !Projects of Interest<br /> |-<br /> | Grace Chong<br /> | Python, Machine Learning, NLP, Analysis & Mathematics<br /> |<br /> # PredictMod<br /> # BiomarkerKB Biocuration<br /> # GlyGen Biocuration<br /> |-<br /> |Alma Ogunsina<br /> |Molecular Biology, Python, ML, and Data Analysis<br /> |<br /> # BiomarkerKB<br /> # ARGOS<br /> # PredictMod<br /> |-<br /> |Diya Kamalabharathy<br /> |Computational Biology, Python Programming, Molecular Biology Techniques,<br /> Scientific Writing, Data Analysis<br /> |<br /> # BiomarkerKB Biocuration<br /> # PredictMod Machine Learning<br /> # GlyGen Biocuration<br /> |-<br /> |Harivinay P. 
Gujjula<br /> |Molecular Biology, Protein Analysis, Immunoassays, Spectroscopy Techniques, Genetic Engineering<br /> |<br /> # GlyGen Biocuration<br /> # BiomarkerKB Biocuration<br /> |-<br /> |Miao Wang<br /> |Python, R, Machine Learning, Bioinformatics Tools (e.g., DESeq2, KEGG, GO, Ensembl VEP), SQL<br /> |<br /> # BiomarkerKB Biocuration Project Ideas<br /> # FDA-ARGOS Computation and Pathogen Curation Project<br /> # PredictMod Machine Learning Project Ideas<br /> |-<br /> |Nahom Abel<br /> |Finding datasets online, Working with Excel/CSV files, Research and Organizational Skills, Experience with Data Entry<br /> |<br /> # BiomarkerKB Biocuration<br /> # GlyGen Biocuration<br /> # PredictMod<br /> |}</div>

Vishal.bakshi https://hivelab.biochemistry.gwu.edu/wiki/index.php?title=Volunteership_2025&diff=721 Volunteership 2025 2025-04-14T14:58:36Z

<p>Vishal.bakshi: </p> <hr /> <div><h2>2025 Volunteer Program Details</h2><br /> <br /> <h3>Dates</h3><br /> <strong>Volunteer Kick-Off Meeting</strong><br><br /> May 26, 2025 | 3:30 to 4:30 PM<br /> <br /> <strong>Program Dates: June 2nd, 2025 – July 25th, 2025</strong> (8 weeks)<br><br /> Monday to Friday | Remote | No breaks<br /> <br /> <hr><br /> <br /> <h3>Volunteer Expectations</h3><br /> <ol><br /> <li>Daily progress updates via Slack (scrum).</li><br /> <li>Regular Zoom meetings with the assigned project point of contact.</li><li>Expected to dedicate 5–6 hours per day to project work, with the remaining time focused on skill development or reading. </li><br /> </ol><br /> <p style="color: red;"><strong>Important:</strong> If the scrum is not updated for 2 consecutive days, the candidate will be <u>automatically dropped</u> from the program.</p><br /> <hr><br /> <br /> <h3>Potential Projects</h3><br /> <ol><br /> <li>BiomarkerKB ([https://biomarkerkb.org biomarkerkb.org]) project: Biomarker curation project. Involves reading papers and collecting biomarkers.</li><br /> <li>GlyGen ([https://glygen.org glygen.org]) project: Review glycomics and glycoproteomics data and curate tissue, disease, and other related information. </li><li>ARGOS ([https://argosdb.org argosdb.org]) project: Analyze genomics data using HIVE to identify reference genome assemblies. </li><li>PredictMod ([https://hivelab.biochemistry.gwu.edu/predictmod hivelab.biochemistry.gwu.edu/predictmod]) project. Identifying datasets and harmonizing them so that they can be used to generate ML models. </li></ol>''Note: Individuals involved in the above projects with a background in programming and/or machine learning may also undertake additional tasks to support the development of ML models, which can be integrated into PredictMod or used to enhance AI/ML-ready datasets within GlyGen.''<hr><br /> <br /> <h4>1. 
BiomarkerKB Biocuration Project Ideas</h4>POC: Daniall Masood<br /> # Curate biomarkers for a specific disease (Alzheimer's disease)<br /> ## The student would do manual curation for about 4 weeks, with regular check-ins with the POC to ensure it is being done correctly.<br /> ## The next 4 weeks can be dedicated to developing an LLM or an automated process to extract biomarker details, using the data collected in the first 4 weeks as training or example data.<br /> # Top 50 biomarkers<br /> ## Curate the top 50 biomarkers for biomarkerkb.org.<br /> ## Define what constitutes a top 50 biomarker.<br /> ## Begin curating biomarkers from different sources and papers by collecting the fields mentioned in the data model, as well as cross-references.<br /> # Biocuration of biomarkers from NLP/LLM work<br /> ## Use the biomarkers collected from NLP work.<br /> ## Curate biomarkers. The data provided does not follow the biomarker data model.<br /> ## While curating the biomarkers, check whether the data collected from NLP is correct.<br /> ## After completion, the student can start using the curated data to work on the NLP/LLM method.<br /> # Curate biomarkers for a treatment<br /> ## See #1 above.<br /> <br /> If you have any other ideas, diseases, treatments, or methods you want to focus on, please reach out to daniallmasood@gwu.edu to discuss them and check whether they are feasible as a project for the summer.<br /> <br /> ==== 2. GlyGen Biocuration Project Ideas ====<br /> POC: Rene Ranzinger and Urnisha Bhuiyan<br /> <br /> Over the last three decades, numerous glycomics database projects have been initiated to collect valuable information about glycans, proteins, and their interactions. Some of these databases have been discontinued due to the end of project funding. However, the data within these databases remains highly valuable to the community. 
Integrating these datasets into modern databases or knowledgebases, such as GlyGen, presents a challenge because many of the valuable metadata annotations (e.g., species, tissue, disease, cell line) are free-text terms that do not align with the standard dictionaries and ontologies used in modern resources. Automated matching of this information with dictionaries or ontologies is often not possible due to the use of synonyms, spelling errors, or abbreviations. For example, "human," "man," and "h. sapiens" all map to the scientific species name "Homo sapiens."<br /> <br /> The GlyGen project aims to make datasets from two older databases (CarbBank, CFG) accessible by migrating the data and metadata into our database. For this project, we are seeking curators with a medical or biology background who are interested in helping map metadata terms from these old databases to standard dictionaries and ontologies.<br /> <br /> The project involves:<br /> <br /> # Using internet resources (e.g., Google, Wikipedia) to identify terms used in the old databases.<br /> # Mapping identified terms to corresponding dictionaries and ontologies using the webpages and search interfaces of these projects.<br /> # Finding papers based on titles and author lists that may contain spelling errors.<br /> # Discussing with other curators when terms are mapped differently.<br /> <br /> If you have any other ideas or methods you would like to focus on, please reach out to rene@ccrc.uga.edu to discuss them.<br /> <br /> '''3. GlyGen Publication Analysis Project Ideas'''<br /> <br /> POC: Rene Ranzinger and Urnisha Bhuiyan<br /> <br /> One of the challenges for any bioinformatics project is understanding the size of its community, how well the project serves this community, and how widely its software/database is used. A potential solution is to analyze PubMed publication data. 
We are seeking applicants with programming skills (in Python or Java) to perform this analysis.<br /> <br /> The project involves:<br /> <br /> # Using the PubMed web API to filter publications based on keywords.<br /> # Analyzing paper abstracts to identify research institutions and groups that form the community.<br /> # Filtering the community list to exclude unrelated co-authors.<br /> <br /> A subproject will involve analyzing the full text of papers (when available) for keywords or resource and database names. The results of the analysis will be discussed with GlyGen project members, who will suggest changes and improvements to the analysis and data presentation. Source code developed as part of this project will be documented and shared in a public GitHub repository. If you have any other ideas or methods you would like to explore, please reach out to rene@ccrc.uga.edu to discuss them.<br /> <br /> ==== 4. PredictMod Machine Learning Project Ideas ====<br /> POC: Lori Krammer<br /> <br /> Data Identification & Harmonization: <br /> <br /> # Identify publicly available datasets from scientific literature that can be used for intervention outcome prediction models.<br /> # Conduct data harmonization and pre-processing following established project pipelines to produce an ML-ready dataset and data dictionary.<br /> <br /> Modeling & Integration (for those with experience in programming/ML):<br /> <br /> # Perform model training and document the ML pipeline in a BioCompute Object (BCO). <br /> # Integrate the model into the PredictMod platform.<br /> <br /> Individuals with a background or interest in machine learning should reach out to lorikrammer@gwu.edu with a potential dataset to determine if it is a feasible project for the summer.<br /> <br /> '''5. 
FDA-ARGOS Computation and Pathogen Curation Project'''<br /> <br /> POC: Christie Woodside<br /> <br /> # Update data tables for more efficient computations<br /> ## Student would review and input additional data and IDs in the tables/sheets used to perform computations. This is manual but important work that requires close attention to detail. ~1 week's worth of work<br /> ## Requires Python/shell coding background. Student would run scripts that prepare and format data tables that are pushed to data.argosdb.org. Coding knowledge is needed in case of errors, bugs, or other mishaps in the code. Ongoing work as computations are performed.<br /> # Curate and report on current pathogens to upload to ARGOS<br /> ## Student would work on manual curation of circulating pathogens to be added to data.argosdb.org, with regular check-ins and reports on what was found. ~4-10 weeks' worth of work<br /> ## Locate assembly IDs, reads, and metagenomic information for these pathogens to be used in computations and deposited into data.argosdb.org.<br /> ## Provide documentation on why they were curated, why they are important, how they were selected, and how data was collected.<br /> <br /> If you have any other ideas or methods you want to focus on, please reach out to christie.woodside@email.gwu.edu to discuss them and check whether they are feasible as a project for the summer.<hr><br /> <h3>Requirements for Completion</h3><br /> <p><strong>Note:</strong> The following are <u>mandatory</u>. 
Failure to complete any of them will result in an incomplete volunteer record.</p><br /> <br /> <h4>Documentation</h4><br /> <p>All volunteers must maintain adequate documentation of their work, including written protocols and scripts submitted to GitHub.</p><br /> <br /> <h4>Written Report</h4><br /> <p>Submit a 1–2 page summary of your tasks and accomplishments to the Admin Team during the final week of your program.</p><br /> <br /> <h4>Presentation & Slide Submission</h4><br /> <p>Present your work during the last week of the 8-week period.</p><br /> <p>Slides must be submitted to the Admin Team and should include:</p><br /> <ul><br /> <li>A title slide with your name, date, and mentor</li><br /> <li>At least 3 content slides</li><br /> <li>A final slide with acknowledgements or references</li><br /> </ul><br /> Contact the Admin Team to access previously submitted slides.<br /> <hr><br /> <br /> === Completion Certificate ===<br /> A certificate of completion and a letter of recommendation will be provided to all participants who successfully complete the program.<br /> <hr><br /> === Contact ===<br /> mazumder_lab@gwu.edu.<br /> <hr><br /> === Volunteers ===<br /> {| class="wikitable"<br /> |+<br /> |-<br /> ! Name<br /> ! Skills<br /> !Projects of Interest<br /> |-<br /> | Grace Chong<br /> | Python, Machine Learning, NLP, Analysis & Mathematics<br /> |<br /> # PredictMod<br /> # BiomarkerKB Biocuration<br /> # GlyGen Biocuration<br /> |-<br /> |Alma Ogunsina<br /> |Molecular Biology, Python, ML, and Data Analysis<br /> |<br /> # BiomarkerKB<br /> # ARGOS<br /> # PredictMod<br /> |-<br /> |Diya Kamalabharathy<br /> |Computational Biology, Python Programming, Molecular Biology Techniques,<br /> Scientific Writing, Data Analysis<br /> |<br /> # BiomarkerKB Biocuration<br /> # PredictMod Machine Learning<br /> # GlyGen Biocuration<br /> |-<br /> |Harivinay P. 
Gujjula<br /> |Molecular Biology, Protein Analysis, Immunoassays, Spectroscopy Techniques, Genetic Engineering<br /> |<br /> # GlyGen Biocuration<br /> # BiomarkerKB Biocuration<br /> |-<br /> |Miao Wang<br /> |Python, R, Machine Learning, Bioinformatics Tools (e.g., DESeq2, KEGG, GO, Ensembl VEP), SQL<br /> |<br /> # BiomarkerKB Biocuration Project Ideas<br /> # FDA-ARGOS Computation and Pathogen Curation Project<br /> # PredictMod Machine Learning Project Ideas<br /> |}</div>

Vishal.bakshi https://hivelab.biochemistry.gwu.edu/wiki/index.php?title=Volunteership_2025&diff=720 Volunteership 2025 2025-04-14T13:52:49Z

<p>Vishal.bakshi: /* Volunteers */</p> <hr /> <div><h2>2025 Volunteer Program Details</h2><br /> <br /> <h3>Dates</h3><br /> <p><strong>June 2nd, 2025 – July 25th, 2025</strong> (8 weeks)<br><br /> Monday to Friday | Remote | No breaks</p><br /> <br /> <hr><br /> <br /> <h3>Volunteer Expectations</h3><br /> <ol><br /> <li>Daily progress updates via Slack (scrum).</li><br /> <li>Regular Zoom meetings with the assigned project point of contact.</li><li>Expected to dedicate 5–6 hours per day to project work, with the remaining time focused on skill development or reading. </li><br /> </ol><br /> <p style="color: red;"><strong>Important:</strong> If the scrum is not updated for 2 consecutive days, the candidate will be <u>automatically dropped</u> from the program.</p><br /> <hr><br /> <br /> <h3>Potential Projects</h3><br /> <ol><br /> <li>BiomarkerKB ([https://biomarkerkb.org biomarkerkb.org]) project: Biomarker curation project. Involves reading papers and collecting biomarkers.</li><br /> <li>GlyGen ([https://glygen.org glygen.org]) project: Review glycomics and glycoproteomics data and curate tissue, disease, and other related information. </li><li>ARGOS ([https://argosdb.org argosdb.org]) project: Analyze genomics data using HIVE to identify reference genome assemblies. </li><li>PredictMod ([https://hivelab.biochemistry.gwu.edu/predictmod hivelab.biochemistry.gwu.edu/predictmod]) project. Identifying datasets and harmonizing them so that they can be used to generate ML models. </li></ol>''Note: Individuals involved in the above projects with a background in programming and/or machine learning may also undertake additional tasks to support the development of ML models, which can be integrated into PredictMod or used to enhance AI/ML-ready datasets within GlyGen.''<hr><br /> <br /> <h4>1. 
BiomarkerKB Biocuration Project Ideas</h4>POC: Daniall Masood<br /> # Curate biomarkers for a specific disease (Alzheimer's disease)<br /> ## The student would do manual curation for about 4 weeks, with regular check-ins with the POC to ensure it is being done correctly.<br /> ## The next 4 weeks can be dedicated to developing an LLM or an automated process to extract biomarker details, using the data collected in the first 4 weeks as training or example data.<br /> # Top 50 biomarkers<br /> ## Curate the top 50 biomarkers for biomarkerkb.org.<br /> ## Define what constitutes a top 50 biomarker.<br /> ## Begin curating biomarkers from different sources and papers by collecting the fields mentioned in the data model, as well as cross-references.<br /> # Biocuration of biomarkers from NLP/LLM work<br /> ## Use the biomarkers collected from NLP work.<br /> ## Curate biomarkers. The data provided does not follow the biomarker data model.<br /> ## While curating the biomarkers, check whether the data collected from NLP is correct.<br /> ## After completion, the student can start using the curated data to work on the NLP/LLM method.<br /> # Curate biomarkers for a treatment<br /> ## See #1 above.<br /> <br /> If you have any other ideas, diseases, treatments, or methods you want to focus on, please reach out to daniallmasood@gwu.edu to discuss them and check whether they are feasible as a project for the summer.<br /> <br /> ==== 2. GlyGen Biocuration Project Ideas ====<br /> POC: Rene Ranzinger and Urnisha Bhuiyan<br /> <br /> Over the last three decades, numerous glycomics database projects have been initiated to collect valuable information about glycans, proteins, and their interactions. Some of these databases have been discontinued due to the end of project funding. However, the data within these databases remains highly valuable to the community. 
Integrating these datasets into modern databases or knowledgebases, such as GlyGen, presents a challenge because much of the valuable metadata (e.g., species, tissue, disease, cell line) annotations are free-text terms that do not align with established standard dictionaries and ontologies used in modern resources. Automated matching of this information with dictionaries or ontologies is often not possible due to the use of synonyms, spelling errors, or abbreviations. For example, "human," "man," and "h. sapiens" all map to the scientific species name "Homo sapiens."<br /> <br /> The GlyGen project aims to make datasets from two older databases (CarbBank, CFG) accessible by migrating the data and metadata into our database. For this project, we are seeking curators with a medical or biology background who are interested in helping map metadata terms from these old databases to standard dictionaries and ontologies.<br /> <br /> The project involves:<br /> <br /> # Using internet resources (e.g., Google, Wikipedia) to identify terms used in the old database.<br /> # Mapping identified terms to corresponding dictionaries and ontologies using the webpages and search interfaces of these projects.<br /> # Finding papers based on titles and author lists that may contain spelling errors.<br /> # Interacting and discussing with other curators in case terms are mapped differently.<br /> <br /> If you have any other ideas or methods you would like to focus on, please reach out to rene@ccrc.uga.edu to discuss them.<br /> <br /> '''3. GlyGen Publication Analysis Project Ideas'''<br /> <br /> POC: Rene Ranzinger and Urnisha Bhuiyan<br /> <br /> One of the challenges for any bioinformatics project is understanding the size of its community, how well the project serves this community, and how widely its software/database is used. A potential solution is to analyze PubMed publication data. 
We are seeking applicants with programming skills (in Python or Java) to perform this analysis.<br /> <br /> The project involves:<br /> <br /> # Using the PubMed web API to filter publications based on keywords.<br /> # Analyzing paper abstracts to identify research institutions and groups that form the community.<br /> # Filtering the community list to exclude unrelated co-authors.<br /> <br /> A subproject will involve analyzing the full text of papers (when available) for keywords or resource and database names. The results of the analysis will be discussed with a GlyGen project member, who will suggest changes and improvements to the analysis and data presentation. Source code developed as part of this project will be documented and shared in a public GitHub repository. If you have any other ideas or methods you would like to explore, please reach out to rene@ccrc.uga.edu to discuss them.<br /> <br /> ==== 4. PredictMod Machine Learning Project Ideas ====<br /> POC: Lori Krammer<br /> <br /> Data Identification & Harmonization: <br /> <br /> # Identify publicly available datasets from scientific literature that can be used for intervention outcome prediction models.<br /> # Conduct data harmonization and pre-processing following established project pipelines to produce an ML-ready dataset and data dictionary.<br /> <br /> Modeling & Integration (for those with experience in programming/ML):<br /> <br /> # Perform model training and document the ML pipeline in a BioCompute Object (BCO). <br /> # Integrate the model into the PredictMod platform.<br /> <br /> Individuals with a background or interest in machine learning should reach out to lorikrammer@gwu.edu with a potential dataset to determine if it is a feasible project for the summer.<br /> <br /> '''5. 
FDA-ARGOS Computation and Pathogen Curation Project'''<br /> <br /> POC: Christie Woodside<br /> <br /> # Update data tables for more efficient computations<br /> ## The student would review and input additional data and IDs in the tables/sheets used to perform computations. This is manual but important work that requires close attention to detail. Approximately 1 week of work.<br /> ## Requires Python/shell coding background. The student would run scripts that prepare and format data tables that are pushed to data.argosdb.org. Coding knowledge is needed in case of errors, bugs, or other mishaps in the code. Ongoing work as computations are performed.<br /> # Curate and report on current pathogens to upload to ARGOS<br /> ## The student would work on manual curation of circulating pathogens to be added to data.argosdb.org. Includes regular check-ins and reports on what was found. Approximately 4-10 weeks of work.<br /> ## Locate assembly IDs, reads, and metagenomic information for these pathogens to be used in computations and deposited into data.argosdb.org.<br /> ## Provide documentation on why these pathogens were curated, why they are important, how they were selected, and how data was collected.<br /> <br /> If you have any other ideas or methods you want to focus on, please reach out to christie.woodside@email.gwu.edu to discuss your idea and check whether it is feasible as a project for the summer.<hr><br /> <h3>Requirements for Completion</h3><br /> <p><strong>Note:</strong> The following are <u>mandatory</u>. 
Failure to complete any of them will result in an incomplete volunteer record.</p><br /> <br /> <h4>Documentation</h4><br /> <p>All volunteers must maintain adequate documentation of their work, including written protocols and scripts submitted to GitHub.</p><br /> <br /> <h4>Written Report</h4><br /> <p>Submit a 1–2 page summary of your tasks and accomplishments to the Admin Team during the final week of the program.</p><br /> <br /> <h4>Presentation & Slide Submission</h4><br /> <p>Present your work during the last week of the 8-week period.</p><br /> <p>Slides must be submitted to the Admin Team and should include:</p><br /> <ul><br /> <li>A title slide with your name, date, and mentor</li><br /> <li>At least 3 content slides</li><br /> <li>A final slide with acknowledgements or references</li><br /> </ul><br /> Contact the Admin Team to access previously submitted slides.<br /> <hr><br /> <br /> === Completion Certificate ===<br /> A certificate of completion and a letter of recommendation will be provided to all participants who successfully complete the program.<br /> <hr><br /> === Contact ===<br /> mazumder_lab@gwu.edu<br /> <hr><br /> === Volunteers ===<br /> {| class="wikitable"<br /> |+<br /> |-<br /> ! Name<br /> ! Skills<br /> ! Projects of Interest<br /> |-<br /> | Grace Chong<br /> | Python, Machine Learning, NLP, Analysis & Mathematics<br /> |<br /> # PredictMod<br /> # BiomarkerKB Biocuration<br /> # GlyGen Biocuration<br /> |-<br /> |Alma Ogunsina<br /> |Molecular Biology, Python, ML, and Data Analysis<br /> |<br /> # BiomarkerKB<br /> # ARGOS<br /> # PredictMod<br /> |-<br /> |Diya Kamalabharathy<br /> |Computational Biology, Python Programming, Molecular Biology Techniques,<br /> Scientific Writing, Data Analysis<br /> |<br /> # BiomarkerKB Biocuration<br /> # PredictMod Machine Learning<br /> # GlyGen Biocuration<br /> |-<br /> |Harivinay P. 
Gujjula<br /> |Molecular Biology, Protein Analysis, Immunoassays, Spectroscopy Techniques, Genetic Engineering<br /> |<br /> # GlyGen Biocuration<br /> # BiomarkerKB Biocuration<br /> |-<br /> |Miao Wang<br /> |Python, R, Machine Learning, Bioinformatics Tools (e.g., DESeq2, KEGG, GO, Ensembl VEP), SQL<br /> |<br /> # BiomarkerKB Biocuration Project Ideas<br /> # FDA-ARGOS Computation and Pathogen Curation Project<br /> # PredictMod Machine Learning Project Ideas<br /> |}</div>

Vishal.bakshi https://hivelab.biochemistry.gwu.edu/wiki/index.php?title=Volunteership_2025&diff=719 Volunteership 2025 2025-04-14T13:19:21Z

<p>Vishal.bakshi: /* Volunteers */</p> <hr /> <div><h2>2025 Volunteer Program Details</h2><br /> <br /> <h3>Dates</h3><br /> <p><strong>June 2nd, 2025 – July 25th, 2025</strong> (8 weeks)<br><br /> Monday to Friday | Remote | No breaks</p><br /> <br /> <hr><br /> <br /> <h3>Volunteer Expectations</h3><br /> <ol><br /> <li>Daily progress updates via Slack (scrum).</li><br /> <li>Regular Zoom meetings with the assigned project point of contact.</li><li>Expected to dedicate 5–6 hours per day to project work, with the remaining time focused on skill development or reading. </li><br /> </ol><br /> <p style="color: red;"><strong>Important:</strong> If the scrum is not updated for 2 consecutive days, the candidate will be <u>automatically dropped</u> from the program.</p><br /> <hr><br /> <br /> <h3>Potential Projects</h3><br /> <ol><br /> <li>BiomarkerKB ([https://biomarkerkb.org biomarkerkb.org]) project: Biomarker curation project. Involves reading papers and collecting biomarkers.</li><br /> <li>GlyGen ([https://glygen.org glygen.org]) project: Review glycomics and glycoproteomics data and curate tissue, disease, and other related information. </li><li>ARGOS ([https://argosdb.org argosdb.org]) project: Analyze genomics data using HIVE to identify reference genome assemblies. </li><li>PredictMod ([https://hivelab.biochemistry.gwu.edu/predictmod hivelab.biochemistry.gwu.edu/predictmod]) project. Identifying datasets and harmonizing them so that they can be used to generate ML models. </li></ol>''Note: Individuals involved in the above projects with a background in programming and/or machine learning may also undertake additional tasks to support the development of ML models, which can be integrated into PredictMod or used to enhance AI/ML-ready datasets within GlyGen.''<hr><br /> <br /> <h4>1. 
BiomarkerKB Biocuration Project Ideas</h4>POC: Daniall Masood<br /> # Curate biomarkers for a specific disease (Alzheimer's disease)<br /> ## The student would do manual curation for about 4 weeks, with regular check-ins with the POC to ensure it is being done correctly.<br /> ## The next 4 weeks can be dedicated to developing an LLM or an automated process to extract biomarker details, using the data collected in the first 4 weeks as training/example data.<br /> # Top 50 biomarkers<br /> ## Curate the top 50 biomarkers for biomarkerkb.org.<br /> ## Define what constitutes a top 50 biomarker.<br /> ## Begin curating biomarkers from different sources and papers by collecting fields mentioned in the data model, as well as collecting cross-references.<br /> # Biocuration of biomarkers from NLP/LLM work<br /> ## Use the biomarkers collected from NLP work.<br /> ## Curate biomarkers. The data provided does not follow the biomarker data model.<br /> ## While curating the biomarkers, check if data collected from NLP is correct.<br /> ## After completion, the student can start using curated data to work on the NLP/LLM method.<br /> # Curate biomarkers for a treatment<br /> ## See #1 above.<br /> <br /> If you have any other ideas, diseases, treatments, or methods you want to focus on, please reach out to daniallmasood@gwu.edu to discuss your idea and check whether it is feasible as a project for the summer.<br /> <br /> ==== 2. GlyGen Biocuration Project Ideas ====<br /> POC: Rene Ranzinger and Urnisha Bhuiyan<br /> <br /> Over the last three decades, numerous glycomics database projects have been initiated to collect valuable information about glycans, proteins, and their interactions. Some of these databases have been discontinued due to the end of project funding. However, the data within these databases remains highly valuable to the community. 
Integrating these datasets into modern databases or knowledgebases, such as GlyGen, presents a challenge because much of the valuable metadata (e.g., species, tissue, disease, cell line) annotations are free-text terms that do not align with established standard dictionaries and ontologies used in modern resources. Automated matching of this information with dictionaries or ontologies is often not possible due to the use of synonyms, spelling errors, or abbreviations. For example, "human," "man," and "h. sapiens" all map to the scientific species name "Homo sapiens."<br /> <br /> The GlyGen project aims to make datasets from two older databases (CarbBank, CFG) accessible by migrating the data and metadata into our database. For this project, we are seeking curators with a medical or biology background who are interested in helping map metadata terms from these old databases to standard dictionaries and ontologies.<br /> <br /> The project involves:<br /> <br /> # Using internet resources (e.g., Google, Wikipedia) to identify terms used in the old database.<br /> # Mapping identified terms to corresponding dictionaries and ontologies using the webpages and search interfaces of these projects.<br /> # Finding papers based on titles and author lists that may contain spelling errors.<br /> # Interacting and discussing with other curators in case terms are mapped differently.<br /> <br /> If you have any other ideas or methods you would like to focus on, please reach out to rene@ccrc.uga.edu to discuss them.<br /> <br /> '''3. GlyGen Publication Analysis Project Ideas'''<br /> <br /> POC: Rene Ranzinger and Urnisha Bhuiyan<br /> <br /> One of the challenges for any bioinformatics project is understanding the size of its community, how well the project serves this community, and how widely its software/database is used. A potential solution is to analyze PubMed publication data. 
We are seeking applicants with programming skills (in Python or Java) to perform this analysis.<br /> <br /> The project involves:<br /> <br /> # Using the PubMed web API to filter publications based on keywords.<br /> # Analyzing paper abstracts to identify research institutions and groups that form the community.<br /> # Filtering the community list to exclude unrelated co-authors.<br /> <br /> A subproject will involve analyzing the full text of papers (when available) for keywords or resource and database names. The results of the analysis will be discussed with a GlyGen project member, who will suggest changes and improvements to the analysis and data presentation. Source code developed as part of this project will be documented and shared in a public GitHub repository. If you have any other ideas or methods you would like to explore, please reach out to rene@ccrc.uga.edu to discuss them.<br /> <br /> ==== 4. PredictMod Machine Learning Project Ideas ====<br /> POC: Lori Krammer<br /> <br /> Data Identification & Harmonization: <br /> <br /> # Identify publicly available datasets from scientific literature that can be used for intervention outcome prediction models.<br /> # Conduct data harmonization and pre-processing following established project pipelines to produce an ML-ready dataset and data dictionary.<br /> <br /> Modeling & Integration (for those with experience in programming/ML):<br /> <br /> # Perform model training and document the ML pipeline in a BioCompute Object (BCO). <br /> # Integrate the model into the PredictMod platform.<br /> <br /> Individuals with a background or interest in machine learning should reach out to lorikrammer@gwu.edu with a potential dataset to determine if it is a feasible project for the summer.<br /> <br /> '''5. 
FDA-ARGOS Computation and Pathogen Curation Project'''<br /> <br /> POC: Christie Woodside<br /> <br /> # Update data tables for more efficient computations<br /> ## The student would review and input additional data and IDs in the tables/sheets used to perform computations. This is manual but important work that requires close attention to detail. Approximately 1 week of work.<br /> ## Requires Python/shell coding background. The student would run scripts that prepare and format data tables that are pushed to data.argosdb.org. Coding knowledge is needed in case of errors, bugs, or other mishaps in the code. Ongoing work as computations are performed.<br /> # Curate and report on current pathogens to upload to ARGOS<br /> ## The student would work on manual curation of circulating pathogens to be added to data.argosdb.org. Includes regular check-ins and reports on what was found. Approximately 4-10 weeks of work.<br /> ## Locate assembly IDs, reads, and metagenomic information for these pathogens to be used in computations and deposited into data.argosdb.org.<br /> ## Provide documentation on why these pathogens were curated, why they are important, how they were selected, and how data was collected.<br /> <br /> If you have any other ideas or methods you want to focus on, please reach out to christie.woodside@email.gwu.edu to discuss your idea and check whether it is feasible as a project for the summer.<hr><br /> <h3>Requirements for Completion</h3><br /> <p><strong>Note:</strong> The following are <u>mandatory</u>. 
Failure to complete any of them will result in an incomplete volunteer record.</p><br /> <br /> <h4>Documentation</h4><br /> <p>All volunteers must maintain adequate documentation of their work, including written protocols and scripts submitted to GitHub.</p><br /> <br /> <h4>Written Report</h4><br /> <p>Submit a 1–2 page summary of your tasks and accomplishments to the Admin Team during the final week of the program.</p><br /> <br /> <h4>Presentation & Slide Submission</h4><br /> <p>Present your work during the last week of the 8-week period.</p><br /> <p>Slides must be submitted to the Admin Team and should include:</p><br /> <ul><br /> <li>A title slide with your name, date, and mentor</li><br /> <li>At least 3 content slides</li><br /> <li>A final slide with acknowledgements or references</li><br /> </ul><br /> Contact the Admin Team to access previously submitted slides.<br /> <hr><br /> <br /> === Completion Certificate ===<br /> A certificate of completion and a letter of recommendation will be provided to all participants who successfully complete the program.<br /> <hr><br /> === Contact ===<br /> mazumder_lab@gwu.edu<br /> <hr><br /> === Volunteers ===<br /> {| class="wikitable"<br /> |+<br /> |-<br /> ! Name<br /> ! Skills<br /> ! Projects of Interest<br /> |-<br /> | Grace Chong<br /> | Python, Machine Learning, NLP, Analysis & Mathematics<br /> |<br /> # PredictMod<br /> # BiomarkerKB Biocuration<br /> # GlyGen Biocuration<br /> |-<br /> |Alma Ogunsina<br /> |Molecular Biology, Python, ML, and Data Analysis<br /> |<br /> # BiomarkerKB<br /> # ARGOS<br /> # PredictMod<br /> |-<br /> |Diya Kamalabharathy<br /> |Computational Biology, Python Programming, Molecular Biology Techniques,<br /> Scientific Writing, Data Analysis<br /> |<br /> # BiomarkerKB Biocuration<br /> # PredictMod Machine Learning<br /> # GlyGen Biocuration<br /> |-<br /> |Harivinay P. 
Gujjula<br /> |Molecular Biology, Protein Analysis, Immunoassays, Spectroscopy Techniques, Genetic Engineering<br /> |<br /> # GlyGen Biocuration<br /> # BiomarkerKB Biocuration<br /> |}</div>

Vishal.bakshi https://hivelab.biochemistry.gwu.edu/wiki/index.php?title=Volunteership_2025&diff=710 Volunteership 2025 2025-04-11T15:51:57Z

<p>Vishal.bakshi: /* 2. GlyGen Biocuration Project Ideas */</p> <hr /> <div><h2>2025 Volunteer Program Details</h2><br /> <br /> <h3>Dates</h3><br /> <p><strong>June 2nd, 2025 – July 25th, 2025</strong> (8 weeks)<br><br /> Monday to Friday | Remote | No breaks</p><br /> <br /> <hr><br /> <br /> <h3>Volunteer Expectations</h3><br /> <ol><br /> <li>Daily progress updates via Slack (scrum).</li><br /> <li>Regular Zoom meetings with the assigned project point of contact.</li><li>Expected to dedicate 5–6 hours per day to project work, with the remaining time focused on skill development or reading. </li><br /> </ol><br /> <p style="color: red;"><strong>Important:</strong> If the scrum is not updated for 2 consecutive days, the candidate will be <u>automatically dropped</u> from the program.</p><br /> <hr><br /> <br /> <h3>Potential Projects</h3><br /> <ol><br /> <li>BiomarkerKB ([https://biomarkerkb.org biomarkerkb.org]) project: Biomarker curation project. Involves reading papers and collecting biomarkers.</li><br /> <li>GlyGen ([https://glygen.org glygen.org]) project: Review glycomics and glycoproteomics data and curate tissue, disease, and other related information. </li><li>ARGOS ([https://argosdb.org argosdb.org]) project: Analyze genomics data using HIVE to identify reference genome assemblies. </li><li>PredictMod ([https://hivelab.biochemistry.gwu.edu/predictmod hivelab.biochemistry.gwu.edu/predictmod]) project. Identifying datasets and harmonizing them so that they can be used to generate ML models. </li></ol>''Note: Individuals involved in the above projects with a background in programming and/or machine learning may also undertake additional tasks to support the development of ML models, which can be integrated into PredictMod or used to enhance AI/ML-ready datasets within GlyGen.''<hr><br /> <br /> <h4>1. 
BiomarkerKB Biocuration Project Ideas</h4>POC: Daniall Masood<br /> # Curate biomarkers for a specific disease (Alzheimer's disease)<br /> ## The student would do manual curation for about 4 weeks, with regular check-ins with the POC to ensure it is being done correctly.<br /> ## The next 4 weeks can be dedicated to developing an LLM or an automated process to extract biomarker details, using the data collected in the first 4 weeks as training/example data.<br /> # Top 50 biomarkers<br /> ## Curate the top 50 biomarkers for biomarkerkb.org.<br /> ## Define what constitutes a top 50 biomarker.<br /> ## Begin curating biomarkers from different sources and papers by collecting fields mentioned in the data model, as well as collecting cross-references.<br /> # Biocuration of biomarkers from NLP/LLM work<br /> ## Use the biomarkers collected from NLP work.<br /> ## Curate biomarkers. The data provided does not follow the biomarker data model.<br /> ## While curating the biomarkers, check if data collected from NLP is correct.<br /> ## After completion, the student can start using curated data to work on the NLP/LLM method.<br /> # Curate biomarkers for a treatment<br /> ## See #1 above.<br /> <br /> If you have any other ideas, diseases, treatments, or methods you want to focus on, please reach out to daniallmasood@gwu.edu to discuss your idea and check whether it is feasible as a project for the summer.<br /> <br /> ==== 2. GlyGen Biocuration Project Ideas ====<br /> POC: Rene Ranzinger and Urnisha Bhuiyan<br /> <br /> Over the last three decades, numerous glycomics database projects have been initiated to collect valuable information about glycans, proteins, and their interactions. Some of these databases have been discontinued due to the end of project funding. However, the data within these databases remains highly valuable to the community. 
Integrating these datasets into modern databases or knowledgebases, such as GlyGen, presents a challenge because much of the valuable metadata (e.g., species, tissue, disease, cell line) annotations are free-text terms that do not align with established standard dictionaries and ontologies used in modern resources. <br /> <br /> Automated matching of this information with dictionaries or ontologies is often not possible due to the use of synonyms, spelling errors, or abbreviations. For example, "human," "man," and "homo sapiens" all map to the scientific species name "Homo sapiens."<br /> <br /> The GlyGen project aims to make datasets from two older databases (CarbBank, CFG) accessible by migrating the data and metadata into our database. For this project, we are seeking curators with a medical or biology background who are interested in helping map metadata terms from these old databases to standard dictionaries and ontologies.<br /> <br /> '''The project involves:'''<br /> <br /> # Using internet resources (e.g., Google, Wikipedia) to identify terms used in the old database.<br /> # Mapping identified terms to corresponding dictionaries and ontologies using the webpages and search interfaces of these projects.<br /> # Finding papers based on titles and author lists that may contain spelling errors.<br /> # Interacting and discussing with other curators in case terms are mapped differently.<br /> <br /> If you have any other ideas or methods you would like to focus on, please reach out to rene@ccrc.uga.edu to discuss them.<br /> <br /> '''3. GlyGen Publication Analysis Project Ideas'''<br /> <br /> POC: Rene Ranzinger and Urnisha Bhuiyan<br /> <br /> One of the challenges for any bioinformatics project is understanding the size of its community, how well the project serves this community, and how widely its software/database is used. 
A potential solution is to analyze PubMed publication data. We are seeking applicants with programming skills (in Python or Java) to perform this analysis.<br /> <br /> '''The project involves:'''<br /> <br /> # Using the PubMed web API to filter publications based on keywords.<br /> # Analyzing paper abstracts to identify research institutions and groups that form the community.<br /> # Filtering the community list to exclude unrelated co-authors.<br /> <br /> A subproject will involve analyzing the full text of papers (when available) for keywords or resource and database names. The results of the analysis will be discussed with a GlyGen project member, who will suggest changes and improvements to the analysis and data presentation.<br /> <br /> Source code developed as part of this project will be documented and shared in a public GitHub repository. If you have any other ideas or methods you would like to explore, please reach out to rene@ccrc.uga.edu to discuss them.<br /> <br /> ==== 4. PredictMod Machine Learning Project Ideas ====<br /> POC: Lori Krammer<br /> <br /> Data Identification & Harmonization: <br /> <br /> # Identify publicly available datasets from scientific literature that can be used for intervention outcome prediction models.<br /> # Conduct data harmonization and pre-processing following established project pipelines to produce an ML-ready dataset and data dictionary.<br /> <br /> Modeling & Integration (for those with experience in programming/ML):<br /> <br /> # Perform model training and document the ML pipeline in a BioCompute Object (BCO). <br /> # Integrate the model into the PredictMod platform.<br /> <br /> Individuals with a background or interest in machine learning should reach out to lorikrammer@gwu.edu with a potential dataset to determine if it is a feasible project for the summer.<br /> <br /> '''5. 
FDA-ARGOS Computation and Pathogen Curation Project'''<br /> <br /> POC: Christie Woodside<br /> <br /> # Update data tables for more efficient computations<br /> ## The student would review and input additional data and IDs in the tables/sheets used to perform computations. This is manual but important work that requires close attention to detail. Approximately 1 week of work.<br /> ## Requires Python/shell coding background. The student would run scripts that prepare and format data tables that are pushed to data.argosdb.org. Coding knowledge is needed in case of errors, bugs, or other mishaps in the code. Ongoing work as computations are performed.<br /> # Curate and report on current pathogens to upload to ARGOS<br /> ## The student would work on manual curation of circulating pathogens to be added to data.argosdb.org. Includes regular check-ins and reports on what was found. Approximately 4-10 weeks of work.<br /> ## Locate assembly IDs, reads, and metagenomic information for these pathogens to be used in computations and deposited into data.argosdb.org.<br /> ## Provide documentation on why these pathogens were curated, why they are important, how they were selected, and how data was collected.<br /> <br /> If you have any other ideas or methods you want to focus on, please reach out to christie.woodside@email.gwu.edu to discuss your idea and check whether it is feasible as a project for the summer.<hr><br /> <h3>Requirements for Completion</h3><br /> <p><strong>Note:</strong> The following are <u>mandatory</u>. 
Failure to complete any of them will result in an incomplete volunteer record.</p><br /> <br /> <h4>Documentation</h4><br /> <p>All volunteers must maintain adequate documentation of their work, including written protocols and scripts submitted to GitHub.</p><br /> <br /> <h4>Written Report</h4><br /> <p>Submit a 1–2 page summary of your tasks and accomplishments to the Admin Team during the final week of the program.</p><br /> <br /> <h4>Presentation & Slide Submission</h4><br /> <p>Present your work during the last week of the 8-week period.</p><br /> <p>Slides must be submitted to the Admin Team and should include:</p><br /> <ul><br /> <li>A title slide with your name, date, and mentor</li><br /> <li>At least 3 content slides</li><br /> <li>A final slide with acknowledgements or references</li><br /> </ul><br /> Contact the Admin Team to access previously submitted slides.<br /> <hr><br /> <br /> === Completion Certificate ===<br /> A certificate of completion and a letter of recommendation will be provided to all participants who successfully complete the program.<br /> <hr><br /> === Contact ===<br /> mazumder_lab@gwu.edu<br /> <hr><br /> === Volunteers ===<br /> {| class="wikitable"<br /> |+<br /> |-<br /> ! Name<br /> ! Skills<br /> ! Projects of Interest<br /> |-<br /> | Grace Chong<br /> | Python, Machine Learning, NLP, Analysis & Mathematics<br /> |<br /> # PredictMod<br /> # BiomarkerKB Biocuration<br /> # GlyGen Biocuration<br /> |-<br /> |Alma Ogunsina<br /> |Molecular Biology, Python, ML, and Data Analysis<br /> |<br /> # BiomarkerKB<br /> # ARGOS<br /> # PredictMod<br /> |-<br /> |Diya Kamalabharathy<br /> |Computational Biology, Python Programming, Molecular Biology Techniques,<br /> Scientific Writing, Data Analysis<br /> |<br /> # BiomarkerKB Biocuration<br /> # PredictMod Machine Learning<br /> # GlyGen Biocuration<br /> |}</div>

Vishal.bakshi https://hivelab.biochemistry.gwu.edu/wiki/index.php?title=Volunteership_2025&diff=709 Volunteership 2025 2025-04-11T15:51:00Z

<p>Vishal.bakshi: </p> <hr /> <div><h2>2025 Volunteer Program Details</h2><br /> <br /> <h3>Dates</h3><br /> <p><strong>June 2nd, 2025 – July 25th, 2025</strong> (8 weeks)<br><br /> Monday to Friday | Remote | No breaks</p><br /> <br /> <hr><br /> <br /> <h3>Volunteer Expectations</h3><br /> <ol><br /> <li>Daily progress updates via Slack (scrum).</li><br /> <li>Regular Zoom meetings with the assigned project point of contact.</li><li>Expected to dedicate 5–6 hours per day to project work, with the remaining time focused on skill development or reading. </li><br /> </ol><br /> <p style="color: red;"><strong>Important:</strong> If the scrum is not updated for 2 consecutive days, the candidate will be <u>automatically dropped</u> from the program.</p><br /> <hr><br /> <br /> <h3>Potential Projects</h3><br /> <ol><br /> <li>BiomarkerKB ([https://biomarkerkb.org biomarkerkb.org]) project: Biomarker curation project. Involves reading papers and collecting biomarkers.</li><br /> <li>GlyGen ([https://glygen.org glygen.org]) project: Review glycomics and glycoproteomics data and curate tissue, disease, and other related information. </li><li>ARGOS ([https://argosdb.org argosdb.org]) project: Analyze genomics data using HIVE to identify reference genome assemblies. </li><li>PredictMod ([https://hivelab.biochemistry.gwu.edu/predictmod hivelab.biochemistry.gwu.edu/predictmod]) project. Identifying datasets and harmonizing them so that they can be used to generate ML models. </li></ol>''Note: Individuals involved in the above projects with a background in programming and/or machine learning may also undertake additional tasks to support the development of ML models, which can be integrated into PredictMod or used to enhance AI/ML-ready datasets within GlyGen.''<hr><br /> <br /> <h4>1. 
BiomarkerKB Biocuration Project Ideas</h4>POC: Daniall Masood<br /> # Curate biomarkers for a specific disease (Alzheimer's disease)<br /> ## The student would do manual curation for about 4 weeks, with regular check-ins with the POC to ensure it is being done correctly.<br /> ## The next 4 weeks can be dedicated to developing an LLM or an automated process to extract biomarker details, using the data collected in the first 4 weeks as training/example data.<br /> # Top 50 biomarkers<br /> ## Curate the top 50 biomarkers for biomarkerkb.org.<br /> ## Define what constitutes a top 50 biomarker.<br /> ## Begin curating biomarkers from different sources and papers by collecting fields mentioned in the data model, as well as collecting cross-references.<br /> # Biocuration of biomarkers from NLP/LLM work<br /> ## Use the biomarkers collected from NLP work.<br /> ## Curate biomarkers. The data provided does not follow the biomarker data model.<br /> ## While curating the biomarkers, check if data collected from NLP is correct.<br /> ## After completion, the student can start using curated data to work on the NLP/LLM method.<br /> # Curate biomarkers for a treatment<br /> ## See #1 above.<br /> <br /> If you have any other ideas, diseases, treatments, or methods you want to focus on, please reach out to daniallmasood@gwu.edu to discuss your idea and check whether it is feasible as a project for the summer.<br /> <br /> ==== 2. GlyGen Biocuration Project Ideas ====<br /> POC: Rene Ranzinger and Urnisha Bhuiyan<br /> <br /> Over the last three decades, numerous glycomics database projects have been initiated to collect valuable information about glycans, proteins, and their interactions. Some of these databases have been discontinued due to the end of project funding. However, the data within these databases remains highly valuable to the community. 
Integrating these datasets into modern databases or knowledgebases, such as GlyGen, presents a challenge because many of the valuable metadata annotations (e.g., species, tissue, disease, cell line) are free-text terms that do not align with established standard dictionaries and ontologies used in modern resources. <br /> <br /> Automated matching of this information with dictionaries or ontologies is often not possible due to the use of synonyms, spelling errors, or abbreviations. For example, "human," "man," and "homo sapiens" all map to the scientific species name "Homo sapiens."<br /> <br /> The GlyGen project aims to make datasets from two older databases (CarbBank, CFG) accessible by migrating the data and metadata into our database. For this project, we are seeking curators with a medical or biology background who are interested in helping map metadata terms from these old databases to standard dictionaries and ontologies.<br /> <br /> '''The project involves:'''<br /> <br /> # Using internet resources (e.g., Google, Wikipedia) to identify terms used in the old database.<br /> # Mapping identified terms to corresponding dictionaries and ontologies using the webpages and search interfaces of these projects.<br /> # Finding papers based on titles and author lists that may contain spelling errors.<br /> # Interacting and discussing with other curators in case terms are mapped differently.<br /> <br /> If you have any other ideas or methods you would like to focus on, please reach out to rene@ccrc.uga.edu to discuss them.<br /> <br /> '''3. GlyGen Publication Analysis Project Ideas'''<br /> <br /> POC: Rene Ranzinger and Urnisha Bhuiyan<br /> <br /> One of the challenges for any bioinformatics project is understanding the size of its community, how well the project serves this community, and how widely its software/database is used. 
A potential solution is to analyze PubMed publication data. We are seeking applicants with programming skills (in Python or Java) to perform this analysis.<br /> <br /> '''The project involves:'''<br /> <br /> # Using the PubMed web API to filter publications based on keywords.<br /> # Analyzing paper abstracts to identify research institutions and groups that form the community.<br /> # Filtering the community list to exclude unrelated co-authors.<br /> <br /> A subproject will involve analyzing the full text of papers (when available) for keywords or resource and database names. The results of the analysis will be discussed with a GlyGen project member, who will suggest changes and improvements to the analysis and data presentation.<br /> <br /> Source code developed as part of this project will be documented and shared in a public GitHub repository.<br /> <br /> If you have any other ideas or methods you would like to explore, please reach out to rene@ccrc.uga.edu to discuss them.<br /> <br /> ==== 4. PredictMod Machine Learning Project Ideas ====<br /> POC: Lori Krammer<br /> <br /> Data Identification & Harmonization: <br /> <br /> # Identify publicly available datasets from scientific literature that can be used for intervention outcome prediction models.<br /> # Conduct data harmonization and pre-processing following established project pipelines to produce an ML-ready dataset and data dictionary.<br /> <br /> Modeling & Integration (for those with experience in programming/ML)<br /> <br /> # Perform model training and document the ML pipeline in a BioCompute Object (BCO). <br /> # Integrate the model into the PredictMod platform.<br /> <br /> Individuals with a background or interest in machine learning should reach out to lorikrammer@gwu.edu with a potential dataset to determine if it is a feasible project for the summer.<br /> <br /> '''5. 
FDA-ARGOS Computation and Pathogen Curation Project'''<br /> <br /> POC: Christie Woodside<br /> <br /> # Update data tables for more efficient computations<br /> ## Student would review and input additional data and IDs in the tables/sheets used to perform computations. This is manual but important work that requires close attention to detail. ~1 week's worth of work.<br /> ## Requires Python/shell coding background. Student would run scripts that prepare and format data tables that are pushed to data.argosdb.org. Coding knowledge is needed in case of errors, bugs, or other mishaps in the code. Ongoing work as computations are performed.<br /> # Curate and report on current pathogens to upload to ARGOS<br /> ## Student would work on manual curation of circulating pathogens to be added to data.argosdb.org. Regular check-ins and reports of what was found. ~4-10 weeks' worth of work.<br /> ## Locate assembly IDs, reads, and metagenomic information for these pathogens to be used in computations and deposited into data.argosdb.org.<br /> ## Provide documentation on why they were curated, why they are important, how they were selected, and how data was collected.<br /> <br /> If you have any other ideas or methods you want to focus on, please reach out to christie.woodside@email.gwu.edu to discuss your idea and check whether it is feasible as a project for the summer.<hr><br /> <h3>Requirements for Completion</h3><br /> <p><strong>Note:</strong> The following are <u>mandatory</u>. 
Failure to complete any of them will result in an incomplete volunteer record.</p><br /> <br /> <h4>Documentation</h4><br /> <p>All volunteers must maintain adequate documentation of their work, including written protocols and scripts submitted to GitHub.</p><br /> <br /> <h4>Written Report</h4><br /> <p>Submit a 1–2 page summary of your tasks and accomplishments to the Admin Team during the final week of your program.</p><br /> <br /> <h4>Presentation & Slide Submission</h4><br /> <p>Present your work during the last week of the 8-week period.</p><br /> <p>Slides must be submitted to the Admin Team and should include:</p><br /> <ul><br /> <li>A title slide with your name, date, and mentor</li><br /> <li>At least 3 content slides</li><br /> <li>A final slide with acknowledgements or references</li><br /> </ul><br /> Contact the Admin Team to access previously submitted slides.<br /> <hr><br /> <br /> === Completion Certificate ===<br /> A certificate of completion and a letter of recommendation will be provided to all participants who successfully complete the program.<br /> <hr><br /> === Contact ===<br /> mazumder_lab@gwu.edu.<br /> <hr><br /> === Volunteers ===<br /> {| class="wikitable"<br /> |+<br /> |-<br /> ! Name<br /> ! Skills<br /> !Projects Interested<br /> |-<br /> | Grace Chong<br /> | Python, Machine Learning, NLP, Analysis & Mathematics<br /> |<br /> # PredictMod<br /> # BiomarkerKB Biocuration<br /> # GlyGen Biocuration<br /> |-<br /> |Alma Ogunsina<br /> |Molecular Biology, Python, ML, and Data Analysis<br /> |<br /> # BiomarkerKB<br /> # ARGOS<br /> # PredictMod<br /> |-<br /> |Diya Kamalabharathy<br /> |Computational Biology, Python Programming, Molecular Biology Techniques<br /> Scientific Writing, Data Analysis<br /> |<br /> # BiomarkerKB Biocuration<br /> # PredictMod Machine Learning<br /> # GlyGen Biocuration<br /> |}</div>

Vishal.bakshi https://hivelab.biochemistry.gwu.edu/wiki/index.php?title=Volunteership_2025&diff=708 Volunteership 2025 2025-04-11T15:50:18Z

<p>Vishal.bakshi: /* GlyGen Biocuration Project Ideas */</p> <hr /> <div><h2>2025 Volunteer Program Details</h2><br /> <br /> <h3>Dates</h3><br /> <p><strong>June 2nd, 2025 – July 25th, 2025</strong> (8 weeks)<br><br /> Monday to Friday | Remote | No breaks</p><br /> <br /> <hr><br /> <br /> <h3>Volunteer Expectations</h3><br /> <ol><br /> <li>Daily progress updates via Slack (scrum).</li><br /> <li>Regular Zoom meetings with the assigned project point of contact.</li><li>Expected to dedicate 5–6 hours per day to project work, with the remaining time focused on skill development or reading. </li><br /> </ol><br /> <p style="color: red;"><strong>Important:</strong> If the scrum is not updated for 2 consecutive days, the candidate will be <u>automatically dropped</u> from the program.</p><br /> <hr><br /> <br /> <h3>Potential Projects</h3><br /> <ol><br /> <li>BiomarkerKB ([https://biomarkerkb.org biomarkerkb.org]) project: Biomarker curation project. Involves reading papers and collecting biomarkers.</li><br /> <li>GlyGen ([https://glygen.org glygen.org]) project: Review glycomics and glycoproteomics data and curate tissue, disease, and other related information. </li><li>ARGOS ([https://argosdb.org argosdb.org]) project: Analyze genomics data using HIVE to identify reference genome assemblies. </li><li>PredictMod ([https://hivelab.biochemistry.gwu.edu/predictmod hivelab.biochemistry.gwu.edu/predictmod]) project. Identifying datasets and harmonizing them so that they can be used to generate ML models. 
</li></ol>''Note: Individuals involved in the above projects with a background in programming and/or machine learning may also undertake additional tasks to support the development of ML models, which can be integrated into PredictMod or used to enhance AI/ML-ready datasets within GlyGen.''<hr><br /> <br /> <h4>BiomarkerKB Biocuration Project Ideas</h4>POC: Daniall Masood<br /> # Curate biomarkers for a specific disease (Alzheimer's)<br /> ## The student would be doing manual curation for about 4 weeks, with regular check-ins with the POC to ensure it is being done correctly.<br /> ## The next 4 weeks can be dedicated to developing an LLM or an automated process to extract biomarker details, with data collected in the first 4 weeks as training data/example data.<br /> # Top 50 biomarkers<br /> ## Curate the top 50 biomarkers for biomarkerkb.org.<br /> ## Define what constitutes a top 50 biomarker.<br /> ## Begin curating biomarkers from different sources and papers by collecting fields mentioned in the data model, as well as collecting cross-references.<br /> # Biocuration of biomarkers from NLP/LLM work<br /> ## Use the biomarkers collected from NLP work.<br /> ## Curate biomarkers. 
The data provided does not follow the biomarker data model.<br /> ## While curating the biomarkers, check if data collected from NLP is correct.<br /> ## After completion, the student can start using curated data to work on the NLP/LLM method.<br /> # Curate biomarkers for a treatment<br /> ## See #1 above.<br /> <br /> If you have any other ideas, diseases, treatments, or methods you want to focus on, please reach out to daniallmasood@gwu.edu to discuss your idea and check whether it is feasible as a project for the summer.<br /> <br /> ==== GlyGen Biocuration Project Ideas ====<br /> POC: Rene Ranzinger and Urnisha Bhuiyan<br /> <br /> Over the last three decades, numerous glycomics database projects have been initiated to collect valuable information about glycans, proteins, and their interactions. Some of these databases have been discontinued due to the end of project funding. However, the data within these databases remains highly valuable to the community. Integrating these datasets into modern databases or knowledgebases, such as GlyGen, presents a challenge because many of the valuable metadata annotations (e.g., species, tissue, disease, cell line) are free-text terms that do not align with established standard dictionaries and ontologies used in modern resources. <br /> <br /> Automated matching of this information with dictionaries or ontologies is often not possible due to the use of synonyms, spelling errors, or abbreviations. For example, "human," "man," and "homo sapiens" all map to the scientific species name "Homo sapiens."<br /> <br /> The GlyGen project aims to make datasets from two older databases (CarbBank, CFG) accessible by migrating the data and metadata into our database. 
For this project, we are seeking curators with a medical or biology background who are interested in helping map metadata terms from these old databases to standard dictionaries and ontologies.<br /> <br /> '''The project involves:'''<br /> <br /> # Using internet resources (e.g., Google, Wikipedia) to identify terms used in the old database.<br /> # Mapping identified terms to corresponding dictionaries and ontologies using the webpages and search interfaces of these projects.<br /> # Finding papers based on titles and author lists that may contain spelling errors.<br /> # Interacting and discussing with other curators in case terms are mapped differently.<br /> <br /> If you have any other ideas or methods you would like to focus on, please reach out to rene@ccrc.uga.edu to discuss them.<br /> <br /> '''GlyGen Publication Analysis Project Ideas'''<br /> <br /> POC: Rene Ranzinger and Urnisha Bhuiyan<br /> <br /> One of the challenges for any bioinformatics project is understanding the size of its community, how well the project serves this community, and how widely its software/database is used. A potential solution is to analyze PubMed publication data. We are seeking applicants with programming skills (in Python or Java) to perform this analysis.<br /> <br /> '''The project involves:'''<br /> <br /> # Using the PubMed web API to filter publications based on keywords.<br /> # Analyzing paper abstracts to identify research institutions and groups that form the community.<br /> # Filtering the community list to exclude unrelated co-authors.<br /> <br /> A subproject will involve analyzing the full text of papers (when available) for keywords or resource and database names. 
The results of the analysis will be discussed with a GlyGen project member, who will suggest changes and improvements to the analysis and data presentation.<br /> <br /> Source code developed as part of this project will be documented and shared in a public GitHub repository.<br /> <br /> If you have any other ideas or methods you would like to explore, please reach out to rene@ccrc.uga.edu to discuss them.<br /> <br /> ==== PredictMod Machine Learning Project Ideas ====<br /> POC: Lori Krammer<br /> <br /> Data Identification & Harmonization: <br /> <br /> # Identify publicly available datasets from scientific literature that can be used for intervention outcome prediction models.<br /> # Conduct data harmonization and pre-processing following established project pipelines to produce an ML-ready dataset and data dictionary.<br /> <br /> Modeling & Integration (for those with experience in programming/ML)<br /> <br /> # Perform model training and document the ML pipeline in a BioCompute Object (BCO). <br /> # Integrate the model into the PredictMod platform.<br /> <br /> Individuals with a background or interest in machine learning should reach out to lorikrammer@gwu.edu with a potential dataset to determine if it is a feasible project for the summer.<br /> <br /> '''FDA-ARGOS Computation and Pathogen Curation Project'''<br /> <br /> POC: Christie Woodside<br /> <br /> # Update data tables for more efficient computations<br /> ## Student would review and input additional data and IDs in the tables/sheets used to perform computations. This is manual but important work that requires close attention to detail. ~1 week's worth of work.<br /> ## Requires Python/shell coding background. Student would run scripts that prepare and format data tables that are pushed to data.argosdb.org. Coding knowledge is needed in case of errors, bugs, or other mishaps in the code. 
Ongoing work as computations are performed.<br /> # Curate and report on current pathogens to upload to ARGOS<br /> ## Student would work on manual curation of circulating pathogens to be added to data.argosdb.org. Regular check-ins and reports of what was found. ~4-10 weeks' worth of work.<br /> ## Locate assembly IDs, reads, and metagenomic information for these pathogens to be used in computations and deposited into data.argosdb.org.<br /> ## Provide documentation on why they were curated, why they are important, how they were selected, and how data was collected.<br /> <br /> If you have any other ideas or methods you want to focus on, please reach out to christie.woodside@email.gwu.edu to discuss your idea and check whether it is feasible as a project for the summer.<hr><br /> <h3>Requirements for Completion</h3><br /> <p><strong>Note:</strong> The following are <u>mandatory</u>. Failure to complete any of them will result in an incomplete volunteer record.</p><br /> <br /> <h4>Documentation</h4><br /> <p>All volunteers must maintain adequate documentation of their work, including written protocols and scripts submitted to GitHub.</p><br /> <br /> <h4>Written Report</h4><br /> <p>Submit a 1–2 page summary of your tasks and accomplishments to the Admin Team during the final week of your program.</p><br /> <br /> <h4>Presentation & Slide Submission</h4><br /> <p>Present your work during the last week of the 8-week period.</p><br /> <p>Slides must be submitted to the Admin Team and should include:</p><br /> <ul><br /> <li>A title slide with your name, date, and mentor</li><br /> <li>At least 3 content slides</li><br /> <li>A final slide with acknowledgements or references</li><br /> </ul><br /> Contact the Admin Team to access previously submitted slides.<br /> <hr><br /> <br /> === Completion Certificate ===<br /> A certificate of completion and a letter of recommendation will be provided to all participants who successfully complete the program.<br /> <hr><br /> === Contact 
===<br /> mazumder_lab@gwu.edu.<br /> <hr><br /> === Volunteers ===<br /> {| class="wikitable"<br /> |+<br /> |-<br /> ! Name<br /> ! Skills<br /> !Projects Interested<br /> |-<br /> | Grace Chong<br /> | Python, Machine Learning, NLP, Analysis & Mathematics<br /> |<br /> # PredictMod<br /> # BiomarkerKB Biocuration<br /> # GlyGen Biocuration<br /> |-<br /> |Alma Ogunsina<br /> |Molecular Biology, Python, ML, and Data Analysis<br /> |<br /> # BiomarkerKB<br /> # ARGOS<br /> # PredictMod<br /> |-<br /> |Diya Kamalabharathy<br /> |Computational Biology, Python Programming, Molecular Biology Techniques<br /> Scientific Writing, Data Analysis<br /> |<br /> # BiomarkerKB Biocuration<br /> # PredictMod Machine Learning<br /> # GlyGen Biocuration<br /> |}</div>

Vishal.bakshi https://hivelab.biochemistry.gwu.edu/wiki/index.php?title=2025_Bioinformatics_Symposium&diff=695 2025 Bioinformatics Symposium 2025-04-09T22:25:11Z

<p>Vishal.bakshi: /* Poster Presentations */</p> <hr /> <div>{{DISPLAYTITLE: 2025 Bioinformatics Symposium}}<br /> <br /> '''Title''': 2025 GW Bioinformatics Symposium <br /> <br /> '''When''': April 29th, 2025, 9am to 6pm<br /> <br /> '''Venue''': Talks: SEH B1220. Refreshments, lunch, and posters: Green wall area (SEH B1167)<br /> <br /> '''Join us for a full-day, all-hands GW Bioinformatics Symposium featuring posters, talks, and roundtable discussions. Open to GW students, staff, and faculty!'''<br /> <br /> '''Event Registration:''' Space is limited. Please register by '''<u>April 12th, 2025</u>''' for the event through this '''<big>[https://docs.google.com/forms/d/e/1FAIpQLSd_VfXIL_S59cVgOBxx_0b0E-wMBphWbBVuK6-JOSm9-cqiJA/viewform?usp=sharing form]</big>'''. If you encounter any issues, please email Raja Mazumder (mazumder@gwu.edu) your name and the lab you are representing, and he will register you.<br /> <br /> '''Abstract submission:''' Please submit your abstract by '''<u>April 12th, 2025</u>''' through [https://cri-datacap.org/surveys/?s=8LFM3TKDWY338KDC '''<big>REDCap</big>''']<br /> <br /> == Abstract/Overview ==<br /> <br /> The GW Bioinformatics Symposium on April 29, 2025, is a full-day event designed to bring together faculty, staff, and student bioinformatics researchers, as well as researchers who use bioinformatics in their labs, from across GW to foster networking, collaboration, and knowledge exchange. The symposium will feature talks from GW labs that focus on bioinformatics and related research, poster presentations, roundtable discussions, and sessions on resources, funding, and career opportunities in bioinformatics. Topics will span bioinformatics, computational methods, IT/security, and training, highlighting the breadth of bioinformatics in various GW schools and centers. 
This event offers a unique opportunity for attendees to engage in meaningful discussions, explore potential collaborations, and stay informed about the latest advancements in the field. The symposium is a great way to connect with the GW bioinformatics community.<br /> <br /> == Poster ==<br /> Participants are invited to submit a brief poster abstract by March 31st at 11:59 PM (ET). We encourage submissions from bioinformatics labs and also other labs that do not primarily focus on bioinformatics but have research relevant to bioinformatics topics. A select few will be chosen for lightning talks. Due to the limited number of poster boards, priority will be given to ensure each lab/group has at least one designated board. If the number of submitted poster abstracts exceeds the available poster boards, additional posters may be printed as flyers with QR codes, enabling attendees to scan, view, or download them electronically. <br /> <br /> '''Size:''' Poster sizes can be up to 42 (width) x 36 (height) inches.<br /> <br /> '''Poster Abstract Submission Portal:''' [https://cri-datacap.org/surveys/?s=8LFM3TKDWY338KDC Click here].<br /> <br /> '''Poster Printing Instructions'''<br /> <br /> Download the poster template from GW Research Day Resources: [https://guides.himmelfarb.gwu.edu/ResearchDay/poster-design-layout Poster Design & Layout].<br /> <br /> After you create the PPT for your poster, request free poster printing from Gelman Library [https://library.gwu.edu/3-d-and-large-format-printing using this form].<br /> <br /> Submit your printing request by April 15th.<br /> <br /> == Schedule ([https://hivelab.biochemistry.gwu.edu/wiki/2025_Bioinformatics_Symposium#Talk_Titles Talk Titles]) ==<br /> <br /> {| class="wikitable"<br /> |+<br /> !Time<br /> !Duration <br /> !Topic<br /> !Presenter(s)<br /> |-<br /> | colspan="4" |'''Morning Session'''<br /> '''Topics: Registration, introduction, and health-related topics'''<br /> |-<br /> |8:30 - 9:00 AM<br /> |30 min<br 
/> |Registration & coffee<br /> Lead: Raechelle McCants, Sunisha Harish<br /> <br /> * Registration<br /> * Coffee<br /> * Poster Setup<br /> * Slide/AV setup & check<br /> |<br /> |-<br /> |9 - 11:00 AM<br /> |120 min<br /> |Talks<br /> Session chairs: Anelia Horvath, Raja Mazumder<br /> |Anelia Horvath (Biochemistry)<br /> Ljubica Caldovic (Children’s)<br /> <br /> Dae Young Kim (Children’s; Muhammad Rahman lab)<br /> <br /> Seth Berger (Children’s)<br /> <br /> Raja Mazumder (Biochemistry)<br /> <br /> Yi-Wen Chen (Children’s)<br /> <br /> ''Marc Garbey (Neurology)''<br /> <br /> ''Additional presenters TBD''<br /> |-<br /> |11 - 11:15 AM<br /> |15 min<br /> |Refreshment Break. <br /> |<br /> |-<br /> |11:15 - 12:30 PM<br /> |75 min<br /> |Talks<br /> Session chairs: Ali Rahnavard, Ljubica Caldovic<br /> |''Jo Lynne Rokita (Children’s)''<br /> Max Alekseyev (Milken)<br /> <br /> Erika Hubbard (Crandall lab; Milken)<br /> <br /> Ali R. Taheriyoun (Rahnavard Lab; Milken)<br /> <br /> Hiroki Morizono (Children’s)<br /> <br /> ''Additional presenters TBD''<br /> |-<br /> |12:30 - 2 PM<br /> |90 min<br /> |Lunch and poster session<br /> Lead: Raechelle McCants, Sunisha Harish<br /> |'''Poster Judging Committee:'''<br /> Ali Rahnavard <br /> <br /> Hiroki Morizono <br /> <br /> Yi-Wen Chen <br /> <br /> Jimmy Saw<br /> |-<br /> | colspan="4" |'''Afternoon Session'''<br /> '''Topics: Breadth of bioinformatics in biological research; IT/security; Training'''<br /> |-<br /> |2:00 - 3:30 PM<br /> |90 min<br /> |Talks<br /> Session Chairs: Howie Huang, Jimmy Saw, Chen Zeng<br /> |Howie Huang (Engineering)<br /> Nan Wu (ECE, Engineering)<br /> <br /> Aya Zirikly (Computer Science, GW/JHU)<br /> <br /> Chen Zeng (Physics)<br /> <br /> Weiqun Peng (Physics)<br /> <br /> Xiangyun Qiu (Physics)<br /> <br /> Shekhar Nagar (Jimmy Saw Lab, Biology)<br /> <br /> Guillermo Orti (Biology)<br /> <br /> ''Some speakers might be moved to the morning sessions'' <br /> <br /> |-<br /> |3: 
30 - 4:30 PM<br /> |60 min<br /> |Session Chair: Jonathon Keeney.<br /> Co-chairs: Hiroki Morizono, Anelia Horvath<br /> * Talks on IT, omics support, and related topics<br /> * Round table discussion<br /> * Careers in Bioinformatics<br /> * Funding opportunities<br /> * Lightning Poster talks & awards<br /> |Clark Gaylord (Director, Research Technology Services)<br /> Brian Choi (MFA)<br /> <br /> Anelia Horvath (MGPC core/Bioinformatics support)<br /> <br /> Jack Villani (GW Genomics Core)<br /> <br /> Ali Rahnavard (CBI Analytics)<br /> <br /> Adam Ciarleglio (Biostatistics and Epidemiology Consulting Service - BECS)<br /> <br /> ''Additional presenters TBD''<br /> |-<br /> |4:30 - 6:00 PM<br /> |<br /> |Networking event, poster prizes, and refreshments<br /> |Keith Crandall<br /> |}<br /> <br /> == Presentation/Discussion Sessions ==<br /> There will be a Q&A session and a networking event at the end of the workshop.<br /> <br /> == Scientific Organizing Committee ==<br /> <br /> Raja Mazumder (Symposium Chair), Anelia Horvath, Hiroki Morizono, Ljubica Caldovic, Keith Crandall, Jorge Sepulveda, Howie Huang, Chen Zeng, Jimmy Saw, Clark Gaylord.<br /> <br /> == Logistics Organizing Committee ==<br /> <br /> Raja Mazumder, Anelia Horvath, Raechelle McCants, Jewel Das. 
Student volunteers: Jane, Sofia, Allison, Chloe, Trupri and Lincoln.<br /> <br /> == Talk Titles ==<br /> {| class="wikitable"<br /> |+<br /> !Name<br /> !Department<br /> !School<br /> !Title<br /> |-<br /> |Jo Lynne Rokita<br /> |Pediatrics<br /> |CNH<br /> |Accelerating discovery and target identification for pediatric brain tumors through open-source platforms and tools<br /> |-<br /> |Erika Hubbard<br /> |Bioinformatics and Biostatistics<br /> |SPH<br /> |Machine Learning to Determine Endotypes of Lupus<br /> |-<br /> |Raja Mazumder<br /> |Biochemistry and Molecular Medicine<br /> |SMHS<br /> |Integrating Biomedical Knowledgebases and Clinical Data for ML/AI-Powered Insights<br /> |-<br /> |Jack Villani<br /> |GW Genomics Core<br /> |SPH<br /> |GW Genomics Core: An Introduction & Overview (panel discussion)<br /> |-<br /> | Ayah Zirikly || Computer Science || SEAS/Johns Hopkins University || Developments in NLP and AI for Mental Health: Insights from the Last Decade and Future Directions – A Focus on the CLPsych Workshop<br /> |-<br /> | Weiqun Peng || Physics || CCAS || Finding structures and their associated functions in genome wide of profiles of chromatin architecture<br /> |-<br /> | Nan Wu || Electrical and Computer Engineering || SEAS || Directed Graph Representation Learning for Circuits, Boolean Networks, and Beyond<br /> |-<br /> | Seth Berger || Biochemistry and Molecular Medicine / Pediatrics || SMHS || Blindspots in Clinical Genetic Testing: Integration of Multiomics to Improve Diagnostic Yields<br /> |-<br /> | Ali Reza Taheriyoun || Biostatistics and Bioinformatics || SPH || Dynamics of Gut Microbiome and Metabolome of Moderate and Severe Obesity Patients Under Sleeve Gastrectomy<br /> |-<br /> | Yi-Wen Chen || Biochemistry and Molecular Medicine / Pediatrics || SMHS || From gene to treatment: omics approaches for understanding facioscapulohumeral muscular dystrophy<br /> |-<br /> | Mohammad Saeed || Computer Science || SEAS || Biases in 
AI-Driven Healthcare: Challenges and Implications for Clinical Decision-Making<br /> |-<br /> | Shekhar Nagar || Biological Sciences || CCAS || Metabolic flexibility and dissemination of antibiotic resistomes from Actinobacteria in Hawaii hydrothermal steam vents<br /> |} <br /> <br /> == Acknowledgments ==<br /> <br /> Sponsors: Dept. of Biochemistry and Molecular Medicine (coffee, refreshments, lunch, poster prizes), IBS (poster boards), Milken Institute School of Public Health (happy hour, poster prizes). <br /> <br /> == Contact ==<br /> '''For questions about registration, abstract submission or general inquiries, please contact:''' <br /> <br /> Raja Mazumder: mazumder@gwu.edu<br /> <br /> == Poster Presentations ==<br /> {| class="wikitable"<br /> !Poster Number<br /> ! Name !! Presentation Title<br /> |-<br /> |1<br /> | Sunisha Harish || AI-Driven Drug Response Prediction in Cancer Using Long-Read Single-Cell RNA-Seq<br /> |-<br /> |2<br /> | Dae Young Kim || mhGPT: A Lightweight Domain-Specific Language Model for Mental Health Analysis<br /> |-<br /> |3<br /> | Vania Ballesteros Prieto || Uncovering the Contributions of Expressed Genetic Variants, Isoforms, and RNA Editing to Tumor Heterogeneity via Long-Read Single-Cell RNA-Seq Analysis<br /> |-<br /> |4<br /> | Sarah Tiufekchiev-Grieco || Promoting Resolution of Inflammation as a Potential Therapy for DMD<br /> |-<br /> |5<br /> | Karli Gilbert || Machine Learning Models Predict Treatment Outcome from Serum Proteins in Patients with Myasthenia Gravis that received Thymectomy<br /> |-<br /> |6<br /> | Reny Mathew || Identification of anti-helminthic drug resistance associated Quantitative trait loci (QTLs) in the canine hookworm, Ancylostoma caninum: A pooled-sequencing approach<br /> |-<br /> |7<br /> | Jo Lynne Rokita || Accelerating discovery and target identification for pediatric brain tumors through open-source platforms and tools<br /> |-<br /> |8<br /> | Henry Kaminski || Moving towards a digital 
twin for myasthenia gravis <br /> |-<br /> |9<br /> | Huai Chin Chiang || Single-Cell Transcriptomic and Phenotypic Profiling Reveals T Cell Dysfunction in BRCA1 Mutation Carriers<br /> |-<br /> |10<br /> | Lori Krammer || GW-FEAST: a federated ecosystem for data analysis and machine learning<br /> |-<br /> |11<br /> | Medha Kurukunda || Analyzing the Use of Artificial Intelligence to Enhance the Identification of Food Insecure Areas in Washington, D.C. <br /> |-<br /> |12<br /> | Christie Rose Woodside || Bridging Genomics and Preparedness: Regulatory-Grade Genomics and Quality Control Metrics and Analysis for Emerging and Circulating Avian Influenza in 2024-2025<br /> |-<br /> |13<br /> | Aiste Gulla, MD, PhD || Clinical Outcomes and Long-Term Survival of Pancreatic Cancers by Histological Sub-Type in the Epic Cosmos Database: Results from 2010-2025<br /> |-<br /> |14<br /> | Jane Ulianova || Comparison of alignment performance between the T2T-CHM13 and GRCh38/hg38 reference genome assemblies for RNAseq<br /> |-<br /> |15<br /> | Zhe Yu || Automated Tracking of Freezing Behavior in Paired House Mice Using DeepLabCut<br /> |-<br /> |16<br /> | Zhe Yu || Behavioral Bioinformatics for Temporal Analysis of Freezing Behavior in Dyad Mice<br /> |-<br /> |17<br /> | Karim Ismat || Generation of a single nuclei RNA sequencing atlas of dysferlin-deficient skeletal muscle <br /> |-<br /> |18<br /> | Kai Leung (Adam) Wong || An Experience of carrying out GPU-accelerated Genomic Analysis on Pegasus<br /> |-<br /> |19<br /> | Gabriel Batzli || Defining macrophage heterogeneity in murine skin wounds during inflammation <br /> |-<br /> |20<br /> | Hovhannes Arestakesyan || Recurrent Somatic scSNVs in Single-Cell RNA-Seq: Insights into Tumor Heterogeneity and RNA-Level Variants<br /> |-<br /> |21<br /> | Chloe Sachs || Secretome distinguishes spectrum of NF1 associated peripheral nerve sheath tumors<br /> |-<br /> |22<br /> | Nikhil Arethiya || A Time-Series Approach to 
Glucose-Based Participant Classification<br /> |-<br /> |23<br /> | Siera Martinez || Hetero-GNN Link Prediction of RNA Editing in Single Cells<br /> |-<br /> |24<br /> | Renxi Li || Thirty-day outcomes of infrainguinal bypass surgery with concurrent iliac artery stenting in patients with chronic limb-threatening ischemia <br /> |-<br /> |25<br /> | Matthew Mollerus || ResLens: Detecting Antibiotic Resistance Genes with Large Language Models<br /> |-<br /> |26<br /> | Parimala Nagaraj || Cybersecurity at the Intersection of Genomics and Data Science: Securing the Future of Bioinformatics<br /> |-<br /> |27<br /> | Shekhar Nagar || Metabolic flexibility and dissemination of antibiotic resistomes from Actinobacteria in Hawaii hydrothermal steam vents<br /> |-<br /> |28<br /> | Cristina Fenollar Ferrer || Functional impact of PIP2 on the Serotonin Transporter (SERT)<br /> |-<br /> |29<br /> | Ali Taheriyoun || Dynamics of gut microbiome and metabolome of obesity patients under sleeve gastrectomy<br /> |-<br /> |30<br /> | Zohn Lab Zohn Lab || Next Generation sequencing approaches to understand developmental defects<br /> |-<br /> |31<br /> | Max Alekseyev || Bioinformatics meets Quantum Informatics: from genome rearrangements to Weingarten calculus<br /> |-<br /> |32<br /> | Lausanne Lee Oliver || Phylogenetic analysis of novel phages from Hawaiian fumaroles<br /> |}</div>

Vishal.bakshi https://hivelab.biochemistry.gwu.edu/wiki/index.php?title=Volunteership_2025&diff=694 Volunteership 2025 2025-04-09T22:02:06Z

<p>Vishal.bakshi: /* Volunteers */</p> <hr /> <div><h2>2025 Volunteer Program Details</h2><br /> <br /> <h3>Dates</h3><br /> <p><strong>June 2nd, 2025 – July 25th, 2025</strong> (8 weeks)<br><br /> Monday to Friday | Remote | No breaks</p><br /> <br /> <hr><br /> <br /> <h3>Volunteer Expectations</h3><br /> <ol><br /> <li>Daily progress updates via Slack (scrum).</li><br /> <li>Regular Zoom meetings with the assigned project point of contact.</li><li>Expected to dedicate 5–6 hours per day to project work, with the remaining time focused on skill development or reading. </li><br /> </ol><br /> <p style="color: red;"><strong>Important:</strong> If the scrum is not updated for 2 consecutive days, the candidate will be <u>automatically dropped</u> from the program.</p><br /> <hr><br /> <br /> <h3>Potential Projects</h3><br /> <ol><br /> <li>BiomarkerKB ([https://biomarkerkb.org biomarkerkb.org]) project: Biomarker curation project. Involves reading papers and collecting biomarkers.</li><br /> <li>GlyGen ([https://glygen.org glygen.org]) project: Review glycomics and glycoproteomics data and curate tissue, disease, and other related information. </li><li>ARGOS ([https://argosdb.org argosdb.org]) project: Analyze genomics data using HIVE to identify reference genome assemblies. </li><li>PredictMod ([https://hivelab.biochemistry.gwu.edu/predictmod hivelab.biochemistry.gwu.edu/predictmod]) project. Identifying datasets and harmonizing them so that they can be used to generate ML models. 
</li></ol>''Note: Individuals involved in the above projects with a background in programming and/or machine learning may also undertake additional tasks to support the development of ML models, which can be integrated into PredictMod or used to enhance AI/ML-ready datasets within GlyGen.''<hr><br /> <br /> <h4>BiomarkerKB Biocuration Project Ideas</h4>POC: Daniall Masood<br /> # Curate biomarkers for a specific disease (Alzheimer's)<br /> ## The student would be doing manual curation for about 4 weeks, with regular check-ins with me to ensure it is being done correctly.<br /> ## The next 4 weeks can be dedicated to developing an LLM or an automated process to extract biomarker details, with the data collected in the first 4 weeks serving as training/example data.<br /> # Top 50 biomarkers<br /> ## Curate the top 50 biomarkers for biomarkerkb.org.<br /> ## Define what constitutes a top 50 biomarker.<br /> ## Begin curating biomarkers from different sources and papers by collecting fields mentioned in the data model, as well as collecting cross-references.<br /> # Biocuration of biomarkers from NLP/LLM work<br /> ## Use the biomarkers collected from NLP work.<br /> ## Curate biomarkers. The data was not provided in the biomarker data model.<br /> ## While curating the biomarkers, check if data collected from NLP is correct.<br /> ## After completion, the student can start using curated data to work on the NLP/LLM method.<br /> # Curate biomarkers for a treatment<br /> ## See #1 above.<br /> <br /> If the student has any other ideas, diseases, treatments, or methods they want to focus on, they should reach out to daniallmasood@gwu.edu to discuss the idea and check whether it is feasible as a project for the summer.<br /> <br /> ==== GlyGen Biocuration Project Ideas ====<br /> POC: Rene Ranzinger and Urnisha Bhuiyan<br /> <br /> Using TableMaker in GlyGen, individuals will curate glycomics and glycoproteomics data from previous database resources that are now defunct. 
There might also be biocuration projects that involve curating papers. <br /> <br /> ==== PredictMod Machine Learning Project Ideas ====<br /> POC: Lori Krammer<br /> <br /> Data Identification & Harmonization: <br /> <br /> # Identify publicly available datasets from scientific literature that can be used for intervention outcome prediction models.<br /> # Conduct data harmonization and pre-processing following established project pipelines to produce an ML-ready dataset and data dictionary.<br /> <br /> Modeling & Integration (for those with experience in programming/ML)<br /> <br /> # Perform model training and document the ML pipeline in a BioCompute Object (BCO). <br /> # Integrate the model into the PredictMod platform.<br /> <br /> Individuals with a background or interest in machine learning should reach out to lorikrammer@gwu.edu with a potential dataset to determine if it is a feasible project for the summer.<br /> <br /> '''FDA-ARGOS Computation and Pathogen Curation Project'''<br /> <br /> POC: Christie Woodside<br /> <br /> # Update data tables for more efficient computations<br /> ## Student would review and input additional data and IDs in the tables/sheets used to perform computations. This is manual (but very important) work that requires close attention to detail. ~1 week's worth of work<br /> ## Requires Python/shell coding background. Student would run scripts that prepare and format data tables that are pushed to data.argosdb.org. Coding knowledge is needed in case of errors, bugs, or other mishaps in the code. Ongoing work as computations are performed.<br /> # Curate and report on current pathogens to upload to ARGOS<br /> ## Student would work on manual curation of circulating pathogens to be added to data.argosdb.org. Regular check-ins and reports of what was found. 
~4-10 weeks' worth of work<br /> ## Locate assembly IDs, reads, and metagenomic information for these pathogens to be used in computations and deposited into data.argosdb.org.<br /> ## Provide documentation on why they were curated, why they are important, how they were selected, and how data was collected.<br /> <br /> If the student has any other ideas or methods they want to focus on, they should reach out to christie.woodside@email.gwu.edu to discuss the idea and check whether it is feasible as a project for the summer.<hr><br /> <h3>Requirements for Completion</h3><br /> <p><strong>Note:</strong> The following are <u>mandatory</u>. Failure to complete any will result in an incomplete volunteer record.</p><br /> <br /> <h4>Documentation</h4><br /> <p>All volunteers must maintain adequate documentation of their work, including written protocols and scripts submitted to GitHub.</p><br /> <br /> <h4>Written Report</h4><br /> <p>Submit a 1–2 page summary of your tasks and accomplishments to the Admin Team during the final week of your program.</p><br /> <br /> <h4>Presentation & Slide Submission</h4><br /> <p>Present your work during the last week of the 8-week period.</p><br /> <p>Slides must be submitted to the Admin Team and should include:</p><br /> <ul><br /> <li>A title slide with your name, date, and mentor</li><br /> <li>At least 3 content slides</li><br /> <li>A final slide with acknowledgements or references</li><br /> </ul><br /> Contact the Admin Team to access previously submitted slides.<br /> <hr><br /> <br /> === Completion Certificate ===<br /> A certificate of completion and a letter of recommendation will be provided to all participants who successfully complete the program.<br /> <hr><br /> === Contact ===<br /> mazumder_lab@gwu.edu.<br /> <hr><br /> === Volunteers ===<br /> {| class="wikitable"<br /> |+<br /> |-<br /> ! Name<br /> ! 
Skills<br /> !Projects Interested<br /> |-<br /> | Grace Chong<br /> | Python, Machine Learning, NLP, Analysis & Mathematics<br /> |<br /> # PredictMod<br /> # BiomarkerKB Biocuration<br /> # GlyGen Biocuration<br /> |-<br /> |Alma Ogunsina<br /> |Molecular Biology, Python, ML, and Data Analysis<br /> |<br /> # BiomarkerKB<br /> # ARGOS<br /> # PredictMod<br /> |-<br /> |Diya Kamalabharathy<br /> |Computational Biology, Python Programming, Molecular Biology Techniques,<br /> Scientific Writing, Data Analysis<br /> |<br /> # BiomarkerKB Biocuration<br /> # PredictMod Machine Learning<br /> # GlyGen Biocuration<br /> |}</div>

Vishal.bakshi https://hivelab.biochemistry.gwu.edu/wiki/index.php?title=2025_Bioinformatics_Symposium&diff=689 2025 Bioinformatics Symposium 2025-04-09T17:07:34Z

<p>Vishal.bakshi: </p> <hr /> <div>{{DISPLAYTITLE: 2025 Bioinformatics Symposium}}<br /> <br /> '''Title''': 2025 GW Bioinformatics Symposium <br /> <br /> '''When''': April 29th, 2025, 9am to 6pm<br /> <br /> '''Venue''': Talks: SEH B1220. Refreshments, lunch, and posters: Green wall area (SEH B1167)<br /> <br /> '''Join us for a full-day, all-hands GW Bioinformatics Symposium featuring posters, talks, and roundtable discussions. Open to GW students, staff, and faculty!'''<br /> <br /> '''Event Registration:''' Space is limited. Please register by '''<u>April 12th, 2025</u>''' for the event through this '''<big>[https://docs.google.com/forms/d/e/1FAIpQLSd_VfXIL_S59cVgOBxx_0b0E-wMBphWbBVuK6-JOSm9-cqiJA/viewform?usp=sharing form]</big>'''. If you encounter any issues, please email Raja Mazumder (mazumder@gwu.edu) your name and the lab you are representing and he will register you.<br /> <br /> '''Abstract submission:''' Please submit your abstract by '''<u>April 12th, 2025</u>''' through [https://cri-datacap.org/surveys/?s=8LFM3TKDWY338KDC '''<big>REDCap</big>''']<br /> <br /> == Abstract/Overview ==<br /> <br /> The GW Bioinformatics Symposium on April 29, 2025 is a full-day event designed to bring together faculty, staff, and student bioinformatics researchers, as well as researchers who use bioinformatics in their labs, from across GW to foster networking, collaboration, and knowledge exchange. The symposium will feature talks from GW labs that focus on bioinformatics and related research, poster presentations, roundtable discussions, and sessions on resources, funding, and career opportunities in bioinformatics. Topics will span bioinformatics, computational methods, IT/security, and training, highlighting the breadth of bioinformatics in various GW schools and centers. This event offers a unique opportunity for attendees to engage in meaningful discussions, explore potential collaborations, and stay informed about the latest advancements in the field. 
The symposium is a great way to connect with the GW bioinformatics community.<br /> <br /> == Poster ==<br /> Participants are invited to submit a brief poster abstract by March 31st at 11:59 PM (ET). We encourage submissions from bioinformatics labs and also other labs that do not primarily focus on bioinformatics but have research relevant to bioinformatics topics. A select few will be chosen for lightning talks. Due to the limited number of poster boards, priority will be given to ensure each lab/group has at least one designated board. If the number of submitted poster abstracts exceeds the available poster boards, additional posters may be printed as flyers with QR codes, enabling attendees to scan, view, or download them electronically. <br /> <br /> '''Size:''' Poster sizes can be up to 42 (width) x 36 (height) inches.<br /> <br /> '''Poster Abstract Submission Portal:''' [https://cri-datacap.org/surveys/?s=8LFM3TKDWY338KDC Click here].<br /> <br /> '''Poster Printing Instructions'''<br /> <br /> Download the poster template from GW Research Day Resources: [https://guides.himmelfarb.gwu.edu/ResearchDay/poster-design-layout Poster Design & Layout].<br /> <br /> After you create the PPT for your poster, request free poster printing from Gelman Library [https://library.gwu.edu/3-d-and-large-format-printing using this form].<br /> <br /> Submit your printing request by April 15th.<br /> <br /> == Schedule ([https://hivelab.biochemistry.gwu.edu/wiki/2025_Bioinformatics_Symposium#Talk_Titles Talk Titles]) ==<br /> <br /> {| class="wikitable"<br /> |+<br /> !Time<br /> !Duration <br /> !Topic<br /> !Presenter(s)<br /> |-<br /> | colspan="4" |'''Morning Session'''<br /> '''Topics: Registration, introduction, and health-related topics'''<br /> |-<br /> |8:30 - 9:00 AM<br /> |30 min<br /> |Registration & coffee<br /> Lead: Raechelle McCants, Sunisha Harish<br /> <br /> * Registration<br /> * Coffee<br /> * Poster Setup<br /> * Slide/AV setup & check<br /> |<br /> 
|-<br /> |9:00 - 11:00 AM<br /> |120 min<br /> |Talks<br /> Session chairs: Anelia Horvath, Raja Mazumder<br /> |Anelia Horvath (Biochemistry)<br /> Ljubica Caldovic (Children’s)<br /> <br /> Dae Young Kim (Children’s; Muhammad Rahman lab)<br /> <br /> Seth Berger (Children’s)<br /> <br /> Raja Mazumder (Biochemistry)<br /> <br /> Yi-Wen Chen (Children’s)<br /> <br /> ''Marc Garbey (Neurology)''<br /> <br /> ''Additional presenters TBD''<br /> |-<br /> |11:00 - 11:15 AM<br /> |15 min<br /> |Refreshment Break<br /> |<br /> |-<br /> |11:15 - 12:30 PM<br /> |75 min<br /> |Talks<br /> Session chairs: Ali Rahnavard, Ljubica Caldovic<br /> |''Jo Lynne Rokita (Children’s)''<br /> Max Alekseyev (Milken)<br /> <br /> Erika Hubbard (Crandall lab; Milken)<br /> <br /> Ali R. Taheriyoun (Rahnavard Lab; Milken)<br /> <br /> Hiroki Morizono (Children’s)<br /> <br /> Leon Grayfer (Biology)<br /> <br /> ''Additional presenters TBD''<br /> |-<br /> |12:30 - 2:00 PM<br /> |90 min<br /> |Lunch and poster session<br /> Lead: Raechelle McCants, Sunisha Harish<br /> |'''Poster Judging Committee:'''<br /> Ali Rahnavard <br /> <br /> Hiroki Morizono <br /> <br /> Yi-Wen Chen <br /> <br /> Jimmy Saw<br /> |-<br /> | colspan="4" |'''Afternoon Session'''<br /> '''Topics: Breadth of bioinformatics in biological research; IT/security; Training'''<br /> |-<br /> |2:00 - 3:30 PM<br /> |90 min<br /> |Talks<br /> Session Chairs: Howie Huang, Jimmy Saw, Chen Zeng<br /> |Howie Huang (Engineering)<br /> Nan Wu (ECE, Engineering)<br /> <br /> Ayah Zirikly (Computer Science, GW/JHU)<br /> <br /> Chen Zeng (Physics)<br /> <br /> Weiqun Peng (Physics)<br /> <br /> Xiangyun Qiu (Physics)<br /> <br /> Jimmy Saw (Biology)<br /> <br /> Guillermo Orti (Biology)<br /> <br /> ''Some speakers might be moved to the morning sessions'' <br /> <br /> |-<br /> |3:30 - 4:30 PM<br /> |60 min<br /> |Session Chair: Jonathon Keeney.<br /> Co-chairs: Hiroki Morizono, Anelia Horvath<br /> * Talks on IT, omics support, and related 
topics<br /> * Round table discussion<br /> * Careers in Bioinformatics<br /> * Funding opportunities<br /> * Lightning Poster talks & awards<br /> |Clark Gaylord (Director, Research Technology Services)<br /> Brian Choi (MFA)<br /> <br /> Anelia Horvath (MGPC core/Bioinformatics support)<br /> <br /> Jack Villani (GW Genomics Core)<br /> <br /> Ali Rahnavard (CBI Analytics)<br /> <br /> Adam Ciarleglio (Biostatistics and Epidemiology Consulting Service - BECS)<br /> <br /> ''Additional presenters TBD''<br /> |-<br /> |4:30 - 6:00 PM<br /> |90 min<br /> |Networking event, poster prizes, and refreshments<br /> |Keith Crandall<br /> |}<br /> <br /> == Presentation/Discussion Sessions ==<br /> There will be a Q&A session and a networking event at the end of the symposium.<br /> <br /> == Scientific Organizing Committee ==<br /> <br /> Raja Mazumder (Symposium Chair), Anelia Horvath, Hiroki Morizono, Ljubica Caldovic, Keith Crandall, Jorge Sepulveda, Howie Huang, Chen Zeng, Jimmy Saw, Clark Gaylord.<br /> <br /> == Logistics Organizing Committee ==<br /> <br /> Raja Mazumder, Anelia Horvath, Raechelle McCants, Sunisha Harish. 
Student volunteers: Jane, Sofia, Allison, Chloe, Trupri and Lincoln.<br /> <br /> == Talk Titles ==<br /> {| class="wikitable"<br /> |+<br /> !Name<br /> !Department<br /> !School<br /> !Title<br /> |-<br /> |Jo Lynne Rokita<br /> |Pediatrics<br /> |CNH<br /> |Accelerating discovery and target identification for pediatric brain tumors through open-source platforms and tools<br /> |-<br /> |Erika Hubbard<br /> |Bioinformatics and Biostatistics<br /> |SPH<br /> |Machine Learning to Determine Endotypes of Lupus<br /> |-<br /> |Raja Mazumder<br /> |Biochemistry and Molecular Medicine<br /> |SMHS<br /> |Integrating Biomedical Knowledgebases and Clinical Data for ML/AI-Powered Insights<br /> |-<br /> |Jack Villani<br /> |GW Genomics Core<br /> |SPH<br /> |GW Genomics Core: An Introduction & Overview (panel discussion)<br /> |-<br /> | Ayah Zirikly || Computer Science || SEAS/Johns Hopkins University || Developments in NLP and AI for Mental Health: Insights from the Last Decade and Future Directions – A Focus on the CLPsych Workshop<br /> |-<br /> | Weiqun Peng || Physics || CCAS || Finding structures and their associated functions in genome wide of profiles of chromatin architecture<br /> |-<br /> | Nan Wu || Electrical and Computer Engineering || SEAS || Directed Graph Representation Learning for Circuits, Boolean Networks, and Beyond<br /> |-<br /> | Seth Berger || Biochemistry and Molecular Medicine / Pediatrics || SMHS || Blindspots in Clinical Genetic Testing: Integration of Multiomics to Improve Diagnostic Yields<br /> |-<br /> | Ali Reza Taheriyoun || Biostatistics and Bioinformatics || SPH || Dynamics of Gut Microbiome and Metabolome of Moderate and Severe Obesity Patients Under Sleeve Gastrectomy<br /> |-<br /> | Yi-Wen Chen || Biochemistry and Molecular Medicine / Pediatrics || SMHS || From gene to treatment: omics approaches for understanding facioscapulohumeral muscular dystrophy<br /> |-<br /> | Mohammad Saeed || Computer Science || SEAS || Biases in 
AI-Driven Healthcare: Challenges and Implications for Clinical Decision-Making<br /> |-<br /> | Shekhar Nagar || Biological Sciences || CCAS || Metabolic flexibility and dissemination of antibiotic resistomes from Actinobacteria in Hawaii hydrothermal steam vents<br /> |} <br /> <br /> == Acknowledgments ==<br /> <br /> Sponsors: Dept. of Biochemistry and Molecular Medicine (coffee, refreshments, lunch, poster prizes), IBS (poster boards), Milken Institute School of Public Health (happy hour, poster prizes). <br /> <br /> == Contact ==<br /> '''For questions about registration, abstract submission or general inquiries, please contact:''' <br /> <br /> Raja Mazumder: mazumder@gwu.edu<br /> <br /> == Poster Presentations ==<br /> {| class="wikitable"<br /> ! Name !! Presentation Title<br /> |-<br /> | Sunisha Harish || AI-Driven Drug Response Prediction in Cancer Using Long-Read Single-Cell RNA-Seq<br /> |-<br /> | Dae Young Kim || mhGPT: A Lightweight Domain-Specific Language Model for Mental Health Analysis<br /> |-<br /> | Vania Ballesteros Prieto || Uncovering the Contributions of Expressed Genetic Variants, Isoforms, and RNA Editing to Tumor Heterogeneity via Long-Read Single-Cell RNA-Seq Analysis<br /> |-<br /> | Sarah Tiufekchiev-Grieco || Promoting Resolution of Inflammation as a Potential Therapy for DMD<br /> |-<br /> | Karli Gilbert || Machine Learning Models Predict Treatment Outcome from Serum Proteins in Patients with Myasthenia Gravis that received Thymectomy<br /> |-<br /> | Reny Mathew || Identification of anti-helminthic drug resistance associated Quantitative trait loci (QTLs) in the canine hookworm, Ancylostoma caninum: A pooled-sequencing approach<br /> |-<br /> | Jo Lynne Rokita || Accelerating discovery and target identification for pediatric brain tumors through open-source platforms and tools<br /> |-<br /> | Henry Kaminski || Moving towards a digital twin for myasthenia gravis <br /> |-<br /> | HUAI-CHIN CHIANG || Single-Cell Transcriptomic 
and Phenotypic Profiling Reveals T Cell Dysfunction in BRCA1 Mutation Carriers<br /> |-<br /> | Lori Krammer || GW-FEAST: a federated ecosystem for data analysis and machine learning<br /> |-<br /> | Medha Kurukunda || Analyzing the Use of Artificial Intelligence to Enhance the Identification of Food Insecure Areas in Washington, D.C. <br /> |-<br /> | Christie Rose Woodside || Bridging Genomics and Preparedness: Regulatory-Grade Genomics and Quality Control Metrics and Analysis for Emerging and Circulating Avian Influenza in 2024-2025<br /> |-<br /> | Aiste Gulla, MD, PhD || Clinical Outcomes and Long-Term Survival of Pancreatic Cancers by Histological Sub-Type in the Epic Cosmos Database: Results from 2010-2025<br /> |-<br /> | Jane Ulianova || Comparison of alignment performance between the T2T-CHM13 and GRCh38/hg38 reference genome assemblies for RNAseq<br /> |-<br /> | Zhe Yu || Automated Tracking of Freezing Behavior in Paired House Mice Using DeepLabCut<br /> |-<br /> | Zhe Yu || Behavioral Bioinformatics for Temporal Analysis of Freezing Behavior in Dyad Mice<br /> |-<br /> | Karim Ismat || Generation of a single nuclei RNA sequencing atlas of dysferlin-deficient skeletal muscle <br /> |-<br /> | Kai Leung (Adam) Wong || An Experience of carrying out GPU-accelerated Genomic Analysis on Pegasus<br /> |-<br /> | Gabriel Batzli || Defining macrophage heterogeneity in murine skin wounds during inflammation <br /> |-<br /> | Hovhannes Arestakesyan || Recurrent Somatic scSNVs in Single-Cell RNA-Seq: Insights into Tumor Heterogeneity and RNA-Level Variants<br /> |-<br /> | Chloe Sachs || Secretome distinguishes spectrum of NF1 associated peripheral nerve sheath tumors<br /> |-<br /> | Nikhil Arethiya || A Time-Series Approach to Glucose-Based Participant Classification<br /> |-<br /> | Siera Martinez || Hetero-GNN Link Prediction of RNA Editing in Single Cells<br /> |-<br /> | Renxi Li || Thirty-day outcomes of infrainguinal bypass surgery with concurrent iliac 
artery stenting in patients with chronic limb-threatening ischemia <br /> |-<br /> | Matthew Mollerus || ResLens: Detecting Antibiotic Resistance Genes with Large Language Models<br /> |-<br /> | Parimala Nagaraj || Cybersecurity at the Intersection of Genomics and Data Science: Securing the Future of Bioinformatics<br /> |-<br /> | Shekhar Nagar || Metabolic flexibility and dissemination of antibiotic resistomes from Actinobacteria in Hawaii hydrothermal steam vents<br /> |-<br /> | Cristina Fenollar Ferrer || Functional impact of PIP2 on the Serotonin Transporter (SERT)<br /> |-<br /> | Ali Taheriyoun || Dynamics of gut microbiome and metabolome of obesity patients under sleeve gastrectomy<br /> |-<br /> | Zohn Lab Zohn Lab || Next Generation sequencing approaches to understand developmental defects<br /> |-<br /> | Max Alekseyev || Bioinformatics meets Quantum Informatics: from genome rearrangements to Weingarten calculus<br /> |-<br /> | Lausanne Lee Oliver || Phylogenetic analysis of novel phages from Hawaiian fumaroles<br /> |}</div>

Vishal.bakshi https://hivelab.biochemistry.gwu.edu/wiki/index.php?title=2025_Bioinformatics_Symposium&diff=686 2025 Bioinformatics Symposium 2025-04-08T18:36:46Z

<p>Vishal.bakshi: </p> <hr /> <div>{{DISPLAYTITLE: 2025 Bioinformatics Symposium}}<br /> <br /> '''Title''': 2025 GW Bioinformatics Symposium <br /> <br /> '''When''': April 29th, 2025, 9am to 6pm<br /> <br /> '''Venue''': Talks: SEH B1220. Refreshments, lunch, and posters: Green wall area (SEH B1167)<br /> <br /> '''Join us for a full-day, all-hands GW Bioinformatics Symposium featuring posters, talks, and roundtable discussions. Open to GW students, staff, and faculty!'''<br /> <br /> '''Event Registration:''' Space is limited. Please register by '''<u>April 12th, 2025</u>''' for the event through this '''<big>[https://docs.google.com/forms/d/e/1FAIpQLSd_VfXIL_S59cVgOBxx_0b0E-wMBphWbBVuK6-JOSm9-cqiJA/viewform?usp=sharing form]</big>'''. If you encounter any issues, please email Raja Mazumder (mazumder@gwu.edu) your name and the lab you are representing and he will register you.<br /> <br /> '''Abstract submission:''' Please submit your abstract by '''<u>April 12th, 2025</u>''' through [https://cri-datacap.org/surveys/?s=8LFM3TKDWY338KDC '''<big>REDCap</big>''']<br /> <br /> == Abstract/Overview ==<br /> <br /> The GW Bioinformatics Symposium on April 29, 2025 is a full-day event designed to bring together faculty, staff, and student bioinformatics researchers, as well as researchers who use bioinformatics in their labs, from across GW to foster networking, collaboration, and knowledge exchange. The symposium will feature talks from GW labs that focus on bioinformatics and related research, poster presentations, roundtable discussions, and sessions on resources, funding, and career opportunities in bioinformatics. Topics will span bioinformatics, computational methods, IT/security, and training, highlighting the breadth of bioinformatics in various GW schools and centers. This event offers a unique opportunity for attendees to engage in meaningful discussions, explore potential collaborations, and stay informed about the latest advancements in the field. 
The symposium is a great way to connect with the GW bioinformatics community.<br /> <br /> == Poster ==<br /> Participants are invited to submit a brief poster abstract by March 31st at 11:59 PM (ET). We encourage submissions from bioinformatics labs and also other labs that do not primarily focus on bioinformatics but have research relevant to bioinformatics topics. A select few will be chosen for lightning talks. Due to the limited number of poster boards, priority will be given to ensure each lab/group has at least one designated board. If the number of submitted poster abstracts exceeds the available poster boards, additional posters may be printed as flyers with QR codes, enabling attendees to scan, view, or download them electronically. <br /> <br /> '''Size:''' Poster sizes can be up to 42 (width) x 36 (height) inches.<br /> <br /> '''Poster Abstract Submission Portal:''' [https://cri-datacap.org/surveys/?s=8LFM3TKDWY338KDC Click here].<br /> <br /> '''Poster Printing Instructions'''<br /> <br /> Download the poster template from GW Research Day Resources: [https://guides.himmelfarb.gwu.edu/ResearchDay/poster-design-layout Poster Design & Layout].<br /> <br /> After you create the PPT for your poster, request free poster printing from Gelman Library [https://library.gwu.edu/3-d-and-large-format-printing using this form].<br /> <br /> Submit your printing request by April 15th.<br /> <br /> == Schedule ([https://hivelab.biochemistry.gwu.edu/wiki/2025_Bioinformatics_Symposium#Talk_Titles Talk Titles]) ==<br /> <br /> {| class="wikitable"<br /> |+<br /> !Time<br /> !Duration <br /> !Topic<br /> !Presenter(s)<br /> |-<br /> | colspan="4" |'''Morning Session'''<br /> '''Topics: Registration, introduction, and health-related topics'''<br /> |-<br /> |8:30 - 9:00 AM<br /> |30 min<br /> |Registration & coffee<br /> Lead: Raechelle McCants, Sunisha Harish<br /> <br /> * Registration<br /> * Coffee<br /> * Poster Setup<br /> * Slide/AV setup & check<br /> |<br /> 
|-<br /> |9:00 - 11:00 AM<br /> |120 min<br /> |Talks<br /> Session chairs: Anelia Horvath, Raja Mazumder<br /> |Anelia Horvath (Biochemistry)<br /> Ljubica Caldovic (Children’s)<br /> <br /> Dae Young Kim (Children’s; Muhammad Rahman lab)<br /> <br /> Seth Berger (Children’s)<br /> <br /> Raja Mazumder (Biochemistry)<br /> <br /> Yi-Wen Chen (Children’s)<br /> <br /> ''Marc Garbey (Neurology)''<br /> <br /> ''Additional presenters TBD''<br /> |-<br /> |11:00 - 11:15 AM<br /> |15 min<br /> |Refreshment Break<br /> |<br /> |-<br /> |11:15 - 12:30 PM<br /> |75 min<br /> |Talks<br /> Session chairs: Ali Rahnavard, Ljubica Caldovic<br /> |''Jo Lynne Rokita (Children’s)''<br /> Max Alekseyev (Milken)<br /> <br /> Erika Hubbard (Crandall lab; Milken)<br /> <br /> Ali R. Taheriyoun (Rahnavard Lab; Milken)<br /> <br /> Hiroki Morizono (Children’s)<br /> <br /> Leon Grayfer (Biology)<br /> <br /> ''Additional presenters TBD''<br /> |-<br /> |12:30 - 2:00 PM<br /> |90 min<br /> |Lunch and poster session<br /> Lead: Raechelle McCants, Sunisha Harish<br /> |'''Poster Judging Committee:'''<br /> Ali Rahnavard <br /> <br /> Hiroki Morizono <br /> <br /> Yi-Wen Chen <br /> <br /> Jimmy Saw<br /> |-<br /> | colspan="4" |'''Afternoon Session'''<br /> '''Topics: Breadth of bioinformatics in biological research; IT/security; Training'''<br /> |-<br /> |2:00 - 3:30 PM<br /> |90 min<br /> |Talks<br /> Session Chairs: Howie Huang, Jimmy Saw, Chen Zeng<br /> |Howie Huang (Engineering)<br /> Nan Wu (ECE, Engineering)<br /> <br /> Ayah Zirikly (Computer Science, GW/JHU)<br /> <br /> Chen Zeng (Physics)<br /> <br /> Weiqun Peng (Physics)<br /> <br /> Xiangyun Qiu (Physics)<br /> <br /> Jimmy Saw (Biology)<br /> <br /> Guillermo Orti (Biology)<br /> <br /> ''Some speakers might be moved to the morning sessions'' <br /> <br /> |-<br /> |3:30 - 4:30 PM<br /> |60 min<br /> |Session Chair: Jonathon Keeney.<br /> Co-chairs: Hiroki Morizono, Anelia Horvath<br /> * Talks on IT, omics support, and related 
topics<br /> * Round table discussion<br /> * Careers in Bioinformatics<br /> * Funding opportunities<br /> * Lightning Poster talks & awards<br /> |Clark Gaylord (Director, Research Technology Services)<br /> Brian Choi (MFA)<br /> <br /> Anelia Horvath (MGPC core/Bioinformatics support)<br /> <br /> Jack Villani (GW Genomics Core)<br /> <br /> Ali Rahnavard (CBI Analytics)<br /> <br /> Adam Ciarleglio (Biostatistics and Epidemiology Consulting Service - BECS)<br /> <br /> ''Additional presenters TBD''<br /> |-<br /> |4:30 - 6:00 PM<br /> |90 min<br /> |Networking event, poster prizes, and refreshments<br /> |Keith Crandall<br /> |}<br /> <br /> == Presentation/Discussion Sessions ==<br /> There will be a Q&A session and a networking event at the end of the symposium.<br /> <br /> == Scientific Organizing Committee ==<br /> <br /> Raja Mazumder (Symposium Chair), Anelia Horvath, Hiroki Morizono, Ljubica Caldovic, Keith Crandall, Jorge Sepulveda, Howie Huang, Chen Zeng, Jimmy Saw, Clark Gaylord.<br /> <br /> == Logistics Organizing Committee ==<br /> <br /> Raja Mazumder, Anelia Horvath, Raechelle McCants, Sunisha Harish. 
Student volunteers: Jane, Sofia, Allison, Chloe, Trupri and Lincoln.<br /> <br /> == Talk Titles ==<br /> {| class="wikitable"<br /> |+<br /> !Name<br /> !Department<br /> !School<br /> !Title<br /> |-<br /> |Jo Lynne Rokita<br /> |Pediatrics<br /> |CNH<br /> |Accelerating discovery and target identification for pediatric brain tumors through open-source platforms and tools<br /> |-<br /> |Erika Hubbard<br /> |Bioinformatics and Biostatistics<br /> |SPH<br /> |Machine Learning to Determine Endotypes of Lupus<br /> |-<br /> |Raja Mazumder<br /> |Biochemistry and Molecular Medicine<br /> |SMHS<br /> |Integrating Biomedical Knowledgebases and Clinical Data for ML/AI-Powered Insights<br /> |-<br /> |Jack Villani<br /> |GW Genomics Core<br /> |SPH<br /> |GW Genomics Core: An Introduction & Overview<br /> |} <br /> <br /> == Acknowledgments ==<br /> <br /> Sponsors: Dept. of Biochemistry and Molecular Medicine (coffee, refreshments, lunch, poster prizes), IBS (poster boards), Milken Institute School of Public Health (happy hour, poster prizes). <br /> <br /> == Contact ==<br /> '''For questions about registration, abstract submission or general inquiries, please contact:''' <br /> <br /> Raja Mazumder: mazumder@gwu.edu<br /> <br /> == Presenters and Presentation Titles ==<br /> {| class="wikitable"<br /> ! Name !! 
Presentation Title<br /> |-<br /> | Sunisha Harish || AI-Driven Drug Response Prediction in Cancer Using Long-Read Single-Cell RNA-Seq<br /> |-<br /> | Dae Young Kim || mhGPT: A Lightweight Domain-Specific Language Model for Mental Health Analysis<br /> |-<br /> | Vania Ballesteros Prieto || Uncovering the Contributions of Expressed Genetic Variants, Isoforms, and RNA Editing to Tumor Heterogeneity via Long-Read Single-Cell RNA-Seq Analysis<br /> |-<br /> | Sarah Tiufekchiev-Grieco || Promoting Resolution of Inflammation as a Potential Therapy for DMD<br /> |-<br /> | Karli Gilbert || Machine Learning Models Predict Treatment Outcome from Serum Proteins in Patients with Myasthenia Gravis that received Thymectomy<br /> |-<br /> | Reny Mathew || Identification of anti-helminthic drug resistance associated Quantitative trait loci (QTLs) in the canine hookworm, Ancylostoma caninum: A pooled-sequencing approach<br /> |-<br /> | Jo Lynne Rokita || Accelerating discovery and target identification for pediatric brain tumors through open-source platforms and tools<br /> |-<br /> | Henry Kaminski || Moving towards a digital twin for myasthenia gravis <br /> |-<br /> | HUAI-CHIN CHIANG || Single-Cell Transcriptomic and Phenotypic Profiling Reveals T Cell Dysfunction in BRCA1 Mutation Carriers<br /> |-<br /> | Lori Krammer || GW-FEAST: a federated ecosystem for data analysis and machine learning<br /> |-<br /> | Medha Kurukunda || Analyzing the Use of Artificial Intelligence to Enhance the Identification of Food Insecure Areas in Washington, D.C. 
<br /> |-<br /> | Christie Rose Woodside || Bridging Genomics and Preparedness: Regulatory-Grade Genomics and Quality Control Metrics and Analysis for Emerging and Circulating Avian Influenza in 2024-2025<br /> |-<br /> | Aiste Gulla, MD, PhD || Clinical Outcomes and Long-Term Survival of Pancreatic Cancers by Histological Sub-Type in the Epic Cosmos Database: Results from 2010-2025<br /> |-<br /> | Jane Ulianova || Comparison of alignment performance between the T2T-CHM13 and GRCh38/hg38 reference genome assemblies for RNAseq<br /> |-<br /> | Zhe Yu || Automated Tracking of Freezing Behavior in Paired House Mice Using DeepLabCut<br /> |-<br /> | Zhe Yu || Behavioral Bioinformatics for Temporal Analysis of Freezing Behavior in Dyad Mice<br /> |-<br /> | Karim Ismat || Generation of a single nuclei RNA sequencing atlas of dysferlin-deficient skeletal muscle <br /> |-<br /> | Kai Leung (Adam) Wong || An Experience of carrying out GPU-accelerated Genomic Analysis on Pegasus<br /> |-<br /> | Gabriel Batzli || Defining macrophage heterogeneity in murine skin wounds during inflammation <br /> |-<br /> | Hovhannes Arestakesyan || Recurrent Somatic scSNVs in Single-Cell RNA-Seq: Insights into Tumor Heterogeneity and RNA-Level Variants<br /> |-<br /> | Chloe Sachs || Secretome distinguishes spectrum of NF1 associated peripheral nerve sheath tumors<br /> |-<br /> | Nikhil Arethiya || A Time-Series Approach to Glucose-Based Participant Classification<br /> |-<br /> | Siera Martinez || Hetero-GNN Link Prediction of RNA Editing in Single Cells<br /> |-<br /> | Renxi Li || Thirty-day outcomes of infrainguinal bypass surgery with concurrent iliac artery stenting in patients with chronic limb-threatening ischemia <br /> |-<br /> | Matthew Mollerus || ResLens: Detecting Antibiotic Resistance Genes with Large Language Models<br /> |-<br /> | Parimala Nagaraj || Cybersecurity at the Intersection of Genomics and Data Science: Securing the Future of Bioinformatics<br /> |-<br /> | 
Shekhar Nagar || Metabolic flexibility and dissemination of antibiotic resistomes from Actinobacteria in Hawaii hydrothermal steam vents<br /> |-<br /> | Cristina Fenollar Ferrer || Functional impact of PIP2 on the Serotonin Transporter (SERT)<br /> |-<br /> | Ali Taheriyoun || Dynamics of gut microbiome and metabolome of obesity patients under sleeve gastrectomy<br /> |-<br /> | Zohn Lab || Next Generation sequencing approaches to understand developmental defects<br /> |-<br /> | Max Alekseyev || Bioinformatics meets Quantum Informatics: from genome rearrangements to Weingarten calculus<br /> |-<br /> | Lausanne Lee Oliver || Phylogenetic analysis of novel phages from Hawaiian fumaroles<br /> |}</div>

Vishal.bakshi https://hivelab.biochemistry.gwu.edu/wiki/index.php?title=Volunteership_2025&diff=685 Volunteership 2025 2025-04-07T16:54:03Z

<p>Vishal.bakshi: </p> <hr /> <div><h2>2025 Volunteer Program Details</h2><br /> <br /> <h3>Dates</h3><br /> <p><strong>June 2nd, 2025 – July 25th, 2025</strong> (8 weeks)<br><br /> Monday to Friday | Remote | No breaks</p><br /> <br /> <hr><br /> <br /> <h3>Volunteer Expectations</h3><br /> <ol><br /> <li>Daily progress updates via Slack (scrum).</li><br /> <li>Regular Zoom meetings with the assigned project point of contact.</li><li>Expected to dedicate 5–6 hours per day to project work, with the remaining time focused on skill development or reading. </li><br /> </ol><br /> <p style="color: red;"><strong>Important:</strong> If the scrum is not updated for 2 consecutive days, the candidate will be <u>automatically dropped</u> from the program.</p><br /> <hr><br /> <br /> <h3>Potential Projects</h3><br /> <ol><br /> <li>BiomarkerKB ([https://biomarkerkb.org biomarkerkb.org]) project: Biomarker curation project. Involves reading papers and collecting biomarkers.</li><br /> <li>GlyGen ([https://glygen.org glygen.org]) project: Review glycomics and glycoproteomics data and curate tissue, disease, and other related information. </li><li>ARGOS ([https://argosdb.org argosdb.org]) project: Analyze genomics data using HIVE to identify reference genome assemblies. </li><li>PredictMod ([https://hivelab.biochemistry.gwu.edu/predictmod hivelab.biochemistry.gwu.edu/predictmod]) project. Identifying datasets and harmonizing them so that they can be used to generate ML models. 
</li></ol>''Note: Individuals involved in the above projects with a background in programming and/or machine learning may also undertake additional tasks to support the development of ML models, which can be integrated into PredictMod or used to enhance AI/ML-ready datasets within GlyGen.''<hr><br /> <br /> <h4>BiomarkerKB Biocuration Project Ideas</h4>POC: Daniall Masood<br /> # Curate biomarkers for a specific disease (Alzheimer's)<br /> ## The student would do manual curation for about 4 weeks, with regular check-ins with the POC to ensure it is being done correctly.<br /> ## The next 4 weeks can be dedicated to developing an LLM or an automated process to extract biomarker details, using the data collected in the first 4 weeks as training/example data.<br /> # Top 50 biomarkers<br /> ## Curate the top 50 biomarkers for biomarkerkb.org.<br /> ## Define what constitutes a top 50 biomarker.<br /> ## Begin curating biomarkers from different sources and papers by collecting fields mentioned in the data model, as well as collecting cross-references.<br /> # Biocuration of biomarkers from NLP/LLM work<br /> ## Use the biomarkers collected from NLP work.<br /> ## Curate biomarkers. The NLP-derived data was not provided in the biomarker data model format.<br /> ## While curating the biomarkers, check whether the data collected from NLP is correct.<br /> ## After completion, the student can start using curated data to work on the NLP/LLM method.<br /> # Curate biomarkers for a treatment<br /> ## See #1 above.<br /> <br /> If you have any other ideas, diseases, treatments, or methods you want to focus on, please reach out to daniallmasood@gwu.edu to discuss your idea and check whether it is feasible as a project for the summer.<br /> <br /> ==== GlyGen Biocuration Project Ideas ====<br /> POC: Rene Ranzinger and Urnisha Bhuiyan<br /> <br /> Using TableMaker in GlyGen, individuals will curate glycomics and glycoproteomics data from previous database resources that are now defunct. 
There might also be biocuration projects that involve curating papers. <br /> <br /> ==== PredictMod Machine Learning Project Ideas ====<br /> POC: Lori Krammer<br /> <br /> Data Identification & Harmonization: <br /> <br /> # Identify publicly available datasets from scientific literature that can be used for intervention outcome prediction models.<br /> # Conduct data harmonization and pre-processing following established project pipelines to produce an ML-ready dataset and data dictionary.<br /> <br /> Modeling & Integration (for those with experience in programming/ML)<br /> <br /> # Perform model training and document the ML pipeline in a BioCompute Object (BCO). <br /> # Integrate the model into the PredictMod platform.<br /> <br /> Individuals with a background or interest in machine learning should reach out to lorikrammer@gwu.edu with a potential dataset to determine if it is a feasible project for the summer.<br /> <br /> '''FDA-ARGOS Computation and Pathogen Curation Project'''<br /> <br /> POC: Christie Woodside<br /> <br /> # Update data tables for more efficient computations<br /> ## Student would review and input additional data and IDs in the tables/sheets used to perform computations. This is manual but important work that requires close attention to detail. ~1 week's worth of work<br /> ## Requires Python/shell coding background. Student would run scripts that prepare and format data tables that are pushed to data.argosdb.org. Coding knowledge is needed in case of errors, bugs, or other mishaps in the code. Ongoing work as computations are performed.<br /> # Curate and report on current pathogens to upload to ARGOS<br /> ## Student would work on manual curation of circulating pathogens to be added to data.argosdb.org. Regular check-ins and reports of what was found. 
~4-10 weeks' worth of work<br /> ## Locate assembly IDs, reads, and metagenomic information for these pathogens to be used in computations and deposited into data.argosdb.org.<br /> ## Provide documentation on why they were curated, why they are important, how they were selected, and how data was collected.<br /> <br /> If you have any other ideas or methods you want to focus on, please reach out to christie.woodside@email.gwu.edu to discuss your idea and check whether it is feasible as a project for the summer.<hr><br /> <h3>Requirements for Completion</h3><br /> <p><strong>Note:</strong> The following are <u>mandatory</u>. Failure to complete any will result in an incomplete volunteer record.</p><br /> <br /> <h4>Documentation</h4><br /> <p>All volunteers must maintain adequate documentation of their work, including written protocols and scripts submitted to GitHub.</p><br /> <br /> <h4>Written Report</h4><br /> <p>Submit a 1–2 page summary of your tasks and accomplishments to the Admin Team during the final week of your program.</p><br /> <br /> <h4>Presentation & Slide Submission</h4><br /> <p>Present your work during the last week of the 8-week period.</p><br /> <p>Slides must be submitted to the Admin Team and should include:</p><br /> <ul><br /> <li>A title slide with your name, date, and mentor</li><br /> <li>At least 3 content slides</li><br /> <li>A final slide with acknowledgements or references</li><br /> </ul><br /> Contact the Admin Team to access previously submitted slides.<br /> <hr><br /> <br /> === Completion Certificate ===<br /> A certificate of completion and a letter of recommendation will be provided to all participants who successfully complete the program.<br /> <hr><br /> === Contact ===<br /> mazumder_lab@gwu.edu.<br /> <hr><br /> === Volunteers ===<br /> {| class="wikitable"<br /> |+<br /> |-<br /> ! Name<br /> ! 
Skills<br /> !Projects Interested<br /> |-<br /> | Grace Chong<br /> | Python, Machine Learning, NLP, Analysis & Mathematics<br /> |<br /> # PredictMod<br /> # BiomarkerKB Biocuration<br /> # GlyGen Biocuration<br /> |-<br /> |Alma Ogunsina<br /> |Molecular Biology, Python, ML, and Data Analysis<br /> |<br /> # BiomarkerKB<br /> # ARGOS<br /> # PredictMod<br /> |}</div>

Vishal.bakshi https://hivelab.biochemistry.gwu.edu/wiki/index.php?title=Volunteership_2025&diff=684 Volunteership 2025 2025-04-07T16:53:11Z

<p>Vishal.bakshi: </p> <hr /> <div><h2>2025 Volunteer Program Details</h2><br /> <br /> <h3>Dates</h3><br /> <p><strong>June 2nd, 2025 – July 25th, 2025</strong> (8 weeks)<br><br /> Monday to Friday | Remote | No breaks</p><br /> <br /> <hr><br /> <br /> <h3>Volunteer Expectations</h3><br /> <ol><br /> <li>Daily progress updates via Slack (scrum).</li><br /> <li>Regular Zoom meetings with the assigned project point of contact.</li><li>Expected to dedicate 5–6 hours per day to project work, with the remaining time focused on skill development or reading. </li><br /> </ol><br /> <p style="color: red;"><strong>Important:</strong> If the scrum is not updated for 2 consecutive days, the candidate will be <u>automatically dropped</u> from the program.</p><br /> <hr><br /> <br /> <h3>Potential Projects</h3><br /> <ol><br /> <li>BiomarkerKB ([https://biomarkerkb.org biomarkerkb.org]) project: Biomarker curation project. Involves reading papers and collecting biomarkers.</li><br /> <li>GlyGen ([https://glygen.org glygen.org]) project: Review glycomics and glycoproteomics data and curate tissue, disease, and other related information. </li><li>ARGOS ([https://argosdb.org argosdb.org]) project: Analyze genomics data using HIVE to identify reference genome assemblies. </li><li>PredictMod ([https://hivelab.biochemistry.gwu.edu/predictmod hivelab.biochemistry.gwu.edu/predictmod]) project. Identifying datasets and harmonizing them so that they can be used to generate ML models. 
</li></ol>''Note: Individuals involved in the above projects with a background in programming and/or machine learning may also undertake additional tasks to support the development of ML models, which can be integrated into PredictMod or used to enhance AI/ML-ready datasets within GlyGen.''<hr><br /> <br /> <h4>BiomarkerKB Biocuration Project Ideas</h4>POC: Daniall Masood<br /> # Curate biomarkers for a specific disease (Alzheimer's)<br /> ## The student would do manual curation for about 4 weeks, with regular check-ins with the POC to ensure it is being done correctly.<br /> ## The next 4 weeks can be dedicated to developing an LLM or an automated process to extract biomarker details, using the data collected in the first 4 weeks as training/example data.<br /> # Top 50 biomarkers<br /> ## Curate the top 50 biomarkers for biomarkerkb.org.<br /> ## Define what constitutes a top 50 biomarker.<br /> ## Begin curating biomarkers from different sources and papers by collecting fields mentioned in the data model, as well as collecting cross-references.<br /> # Biocuration of biomarkers from NLP/LLM work<br /> ## Use the biomarkers collected from NLP work.<br /> ## Curate biomarkers. The NLP-derived data was not provided in the biomarker data model format.<br /> ## While curating the biomarkers, check whether the data collected from NLP is correct.<br /> ## After completion, the student can start using curated data to work on the NLP/LLM method.<br /> # Curate biomarkers for a treatment<br /> ## See #1 above.<br /> <br /> If you have any other ideas, diseases, treatments, or methods you want to focus on, please reach out to daniallmasood@gwu.edu to discuss your idea and check whether it is feasible as a project for the summer.<br /> <br /> ==== GlyGen Biocuration Project Ideas ====<br /> POC: Rene Ranzinger and Urnisha Bhuiyan<br /> <br /> Using TableMaker in GlyGen, individuals will curate glycomics and glycoproteomics data from previous database resources that are now defunct. 
There might also be biocuration projects that involve curating papers. <br /> <br /> ==== PredictMod Machine Learning Project Ideas ====<br /> POC: Lori Krammer<br /> <br /> Data Identification & Harmonization: <br /> <br /> # Identify publicly available datasets from scientific literature that can be used for intervention outcome prediction models.<br /> # Conduct data harmonization and pre-processing following established project pipelines to produce an ML-ready dataset and data dictionary.<br /> <br /> Modeling & Integration (for those with experience in programming/ML)<br /> <br /> # Perform model training and document the ML pipeline in a BioCompute Object (BCO). <br /> # Integrate the model into the PredictMod platform.<br /> <br /> Individuals with a background or interest in machine learning should reach out to lorikrammer@gwu.edu with a potential dataset to determine if it is a feasible project for the summer.<br /> <br /> '''FDA-ARGOS Computation and Pathogen Curation Project'''<br /> <br /> POC: Christie Woodside<br /> <br /> # Update data tables for more efficient computations<br /> ## Student would review and input additional data and IDs in the tables/sheets used to perform computations. This is manual but important work that requires close attention to detail. ~1 week's worth of work<br /> ## Requires Python/shell coding background. Student would run scripts that prepare and format data tables that are pushed to data.argosdb.org. Coding knowledge is needed in case of errors, bugs, or other mishaps in the code. Ongoing work as computations are performed.<br /> # Curate and report on current pathogens to upload to ARGOS<br /> ## Student would work on manual curation of circulating pathogens to be added to data.argosdb.org. Regular check-ins and reports of what was found. 
~4-10 weeks' worth of work<br /> ## Locate assembly IDs, reads, and metagenomic information for these pathogens to be used in computations and deposited into data.argosdb.org.<br /> ## Provide documentation on why they were curated, why they are important, how they were selected, and how data was collected.<br /> <br /> If you have any other ideas or methods you want to focus on, please reach out to christie.woodside@email.gwu.edu to discuss your idea and check whether it is feasible as a project for the summer.<hr><br /> <h3>Requirements for Completion</h3><br /> <p><strong>Note:</strong> The following are <u>mandatory</u>. Failure to complete any will result in an incomplete volunteer record.</p><br /> <br /> <h4>Documentation</h4><br /> <p>All volunteers must maintain adequate documentation of their work, including written protocols and scripts submitted to GitHub.</p><br /> <br /> <h4>Written Report</h4><br /> <p>Submit a 1–2 page summary of your tasks and accomplishments to the Admin Team during the final week of your program.</p><br /> <br /> <h4>Presentation & Slide Submission</h4><br /> <p>Present your work during the last week of the 8-week period.</p><br /> <p>Slides must be submitted to the Admin Team and should include:</p><br /> <ul><br /> <li>A title slide with your name, date, and mentor</li><br /> <li>At least 3 content slides</li><br /> <li>A final slide with acknowledgements or references</li><br /> </ul><br /> Contact the Admin Team to access previously submitted slides.<br /> <hr><br /> <br /> === Completion Certificate ===<br /> A certificate of completion and a letter of recommendation will be provided to all participants who successfully complete the program.<br /> <hr><br /> === Contact ===<br /> mazumder_lab@gwu.edu.<br /> <hr><br /> === Volunteers ===<br /> {| class="wikitable"<br /> |+<br /> |-<br /> ! Name<br /> ! 
Skills<br /> !Projects Interested<br /> !Resume<br /> |-<br /> | Grace Chong<br /> | Python, Machine Learning, NLP, Analysis & Mathematics<br /> |<br /> # PredictMod<br /> # BiomarkerKB Biocuration<br /> # GlyGen Biocuration<br /> |[[File:Grace S Chong Resume.pdf|center|Grace S Chong Resume]]<br /> |-<br /> |Alma Ogunsina<br /> |Molecular Biology, Python, ML, and Data Analysis<br /> |<br /> # BiomarkerKB<br /> # ARGOS<br /> # PredictMod<br /> |[[File:Resume - Alma Ogunsina.pdf|center|Alma Ogunsina Resume]]<br /> |}</div>

Vishal.bakshi https://hivelab.biochemistry.gwu.edu/wiki/index.php?title=File:Grace_S_Chong_Resume.pdf&diff=683 File:Grace S Chong Resume.pdf 2025-04-07T16:52:59Z

<p>Vishal.bakshi: </p> <hr /> <div>Grace S Chong Resume</div>

Vishal.bakshi https://hivelab.biochemistry.gwu.edu/wiki/index.php?title=Volunteership_2025&diff=682 Volunteership 2025 2025-04-07T16:52:04Z

<p>Vishal.bakshi: </p> <hr /> <div><h2>2025 Volunteer Program Details</h2><br /> <br /> <h3>Dates</h3><br /> <p><strong>June 2nd, 2025 – July 25th, 2025</strong> (8 weeks)<br><br /> Monday to Friday | Remote | No breaks</p><br /> <br /> <hr><br /> <br /> <h3>Volunteer Expectations</h3><br /> <ol><br /> <li>Daily progress updates via Slack (scrum).</li><br /> <li>Regular Zoom meetings with the assigned project point of contact.</li><li>Expected to dedicate 5–6 hours per day to project work, with the remaining time focused on skill development or reading. </li><br /> </ol><br /> <p style="color: red;"><strong>Important:</strong> If the scrum is not updated for 2 consecutive days, the candidate will be <u>automatically dropped</u> from the program.</p><br /> <hr><br /> <br /> <h3>Potential Projects</h3><br /> <ol><br /> <li>BiomarkerKB ([https://biomarkerkb.org biomarkerkb.org]) project: Biomarker curation project. Involves reading papers and collecting biomarkers.</li><br /> <li>GlyGen ([https://glygen.org glygen.org]) project: Review glycomics and glycoproteomics data and curate tissue, disease, and other related information. </li><li>ARGOS ([https://argosdb.org argosdb.org]) project: Analyze genomics data using HIVE to identify reference genome assemblies. </li><li>PredictMod ([https://hivelab.biochemistry.gwu.edu/predictmod hivelab.biochemistry.gwu.edu/predictmod]) project. Identifying datasets and harmonizing them so that they can be used to generate ML models. 
</li></ol>''Note: Individuals involved in the above projects with a background in programming and/or machine learning may also undertake additional tasks to support the development of ML models, which can be integrated into PredictMod or used to enhance AI/ML-ready datasets within GlyGen.''<hr><br /> <br /> <h4>BiomarkerKB Biocuration Project Ideas</h4>POC: Daniall Masood<br /> # Curate biomarkers for a specific disease (Alzheimer's)<br /> ## The student would do manual curation for about 4 weeks, with regular check-ins with the POC to ensure it is being done correctly.<br /> ## The next 4 weeks can be dedicated to developing an LLM or an automated process to extract biomarker details, using the data collected in the first 4 weeks as training/example data.<br /> # Top 50 biomarkers<br /> ## Curate the top 50 biomarkers for biomarkerkb.org.<br /> ## Define what constitutes a top 50 biomarker.<br /> ## Begin curating biomarkers from different sources and papers by collecting fields mentioned in the data model, as well as collecting cross-references.<br /> # Biocuration of biomarkers from NLP/LLM work<br /> ## Use the biomarkers collected from NLP work.<br /> ## Curate biomarkers. The NLP-derived data was not provided in the biomarker data model format.<br /> ## While curating the biomarkers, check whether the data collected from NLP is correct.<br /> ## After completion, the student can start using curated data to work on the NLP/LLM method.<br /> # Curate biomarkers for a treatment<br /> ## See #1 above.<br /> <br /> If you have any other ideas, diseases, treatments, or methods you want to focus on, please reach out to daniallmasood@gwu.edu to discuss your idea and check whether it is feasible as a project for the summer.<br /> <br /> ==== GlyGen Biocuration Project Ideas ====<br /> POC: Rene Ranzinger and Urnisha Bhuiyan<br /> <br /> Using TableMaker in GlyGen, individuals will curate glycomics and glycoproteomics data from previous database resources that are now defunct. 
There might also be biocuration projects that involve curating papers. <br /> <br /> ==== PredictMod Machine Learning Project Ideas ====<br /> POC: Lori Krammer<br /> <br /> Data Identification & Harmonization: <br /> <br /> # Identify publicly available datasets from scientific literature that can be used for intervention outcome prediction models.<br /> # Conduct data harmonization and pre-processing following established project pipelines to produce an ML-ready dataset and data dictionary.<br /> <br /> Modeling & Integration (for those with experience in programming/ML)<br /> <br /> # Perform model training and document the ML pipeline in a BioCompute Object (BCO). <br /> # Integrate the model into the PredictMod platform.<br /> <br /> Individuals with a background or interest in machine learning should reach out to lorikrammer@gwu.edu with a potential dataset to determine if it is a feasible project for the summer.<br /> <br /> '''FDA-ARGOS Computation and Pathogen Curation Project'''<br /> <br /> POC: Christie Woodside<br /> <br /> # Update data tables for more efficient computations<br /> ## Student would review and input additional data and IDs in the tables/sheets used to perform computations. This is manual but important work that requires close attention to detail. ~1 week's worth of work<br /> ## Requires Python/shell coding background. Student would run scripts that prepare and format data tables that are pushed to data.argosdb.org. Coding knowledge is needed in case of errors, bugs, or other mishaps in the code. Ongoing work as computations are performed.<br /> # Curate and report on current pathogens to upload to ARGOS<br /> ## Student would work on manual curation of circulating pathogens to be added to data.argosdb.org. Regular check-ins and reports of what was found. 
~4-10 weeks' worth of work<br /> ## Locate assembly IDs, reads, and metagenomic information for these pathogens to be used in computations and deposited into data.argosdb.org.<br /> ## Provide documentation on why they were curated, why they are important, how they were selected, and how data was collected.<br /> <br /> If you have any other ideas or methods you want to focus on, please reach out to christie.woodside@email.gwu.edu to discuss your idea and check whether it is feasible as a project for the summer.<hr><br /> <h3>Requirements for Completion</h3><br /> <p><strong>Note:</strong> The following are <u>mandatory</u>. Failure to complete any will result in an incomplete volunteer record.</p><br /> <br /> <h4>Documentation</h4><br /> <p>All volunteers must maintain adequate documentation of their work, including written protocols and scripts submitted to GitHub.</p><br /> <br /> <h4>Written Report</h4><br /> <p>Submit a 1–2 page summary of your tasks and accomplishments to the Admin Team during the final week of your program.</p><br /> <br /> <h4>Presentation & Slide Submission</h4><br /> <p>Present your work during the last week of the 8-week period.</p><br /> <p>Slides must be submitted to the Admin Team and should include:</p><br /> <ul><br /> <li>A title slide with your name, date, and mentor</li><br /> <li>At least 3 content slides</li><br /> <li>A final slide with acknowledgements or references</li><br /> </ul><br /> Contact the Admin Team to access previously submitted slides.<br /> <hr><br /> <br /> === Completion Certificate ===<br /> A certificate of completion and a letter of recommendation will be provided to all participants who successfully complete the program.<br /> <hr><br /> === Contact ===<br /> mazumder_lab@gwu.edu.<br /> <hr><br /> === Volunteers ===<br /> {| class="wikitable"<br /> |+<br /> |-<br /> ! Name<br /> ! 
Skills<br /> !Projects Interested<br /> !Resume<br /> |-<br /> | Grace Chong<br /> | Python, Machine Learning, NLP, Analysis & Mathematics<br /> |<br /> # PredictMod<br /> # BiomarkerKB Biocuration<br /> # GlyGen Biocuration<br /> |<br /> |-<br /> |Alma Ogunsina<br /> |Molecular Biology, Python, ML, and Data Analysis<br /> |<br /> # BiomarkerKB<br /> # ARGOS<br /> # PredictMod<br /> |[[File:Resume - Alma Ogunsina.pdf|center|Alma Ogunsina Resume]]<br /> |}</div>

Vishal.bakshi https://hivelab.biochemistry.gwu.edu/wiki/index.php?title=File:Resume_-_Alma_Ogunsina.pdf&diff=681 File:Resume - Alma Ogunsina.pdf 2025-04-07T16:51:00Z

<p>Vishal.bakshi: </p> <hr /> <div> Alma Ogunsina Resume</div>

Vishal.bakshi https://hivelab.biochemistry.gwu.edu/wiki/index.php?title=Volunteership_2025&diff=680 Volunteership 2025 2025-04-07T15:54:38Z

<p>Vishal.bakshi: </p> <hr /> <div><h2>2025 Volunteer Program Details</h2><br /> <br /> <h3>Dates</h3><br /> <p><strong>June 2nd, 2025 – July 25th, 2025</strong> (8 weeks)<br><br /> Monday to Friday | Remote | No breaks</p><br /> <br /> <hr><br /> <br /> <h3>Volunteer Expectations</h3><br /> <ol><br /> <li>Daily progress updates via Slack (scrum).</li><br /> <li>Regular Zoom meetings with the assigned project point of contact.</li><li>Expected to dedicate 5–6 hours per day to project work, with the remaining time focused on skill development or reading. </li><br /> </ol><br /> <p style="color: red;"><strong>Important:</strong> If the scrum is not updated for 2 consecutive days, the candidate will be <u>automatically dropped</u> from the program.</p><br /> <hr><br /> <br /> <h3>Potential Projects</h3><br /> <ol><br /> <li>BiomarkerKB ([https://biomarkerkb.org biomarkerkb.org]) project: Biomarker curation project. Involves reading papers and collecting biomarkers.</li><br /> <li>GlyGen ([https://glygen.org glygen.org]) project: Review glycomics and glycoproteomics data and curate tissue, disease, and other related information. </li><li>ARGOS ([https://argosdb.org argosdb.org]) project: Analyze genomics data using HIVE to identify reference genome assemblies. </li><li>PredictMod ([https://hivelab.biochemistry.gwu.edu/predictmod hivelab.biochemistry.gwu.edu/predictmod]) project. Identifying datasets and harmonizing them so that they can be used to generate ML models. 
</li></ol>''Note: Individuals involved in the above projects with a background in programming and/or machine learning may also undertake additional tasks to support the development of ML models, which can be integrated into PredictMod or used to enhance AI/ML-ready datasets within GlyGen.''<hr><br /> <br /> <h4>BiomarkerKB Biocuration Project Ideas</h4>POC: Daniall Masood<br /> # Curate biomarkers for a specific disease (Alzheimer's)<br /> ## The student would do manual curation for about 4 weeks, with regular check-ins with the POC to ensure it is being done correctly.<br /> ## The next 4 weeks can be dedicated to developing an LLM or an automated process to extract biomarker details, using the data collected in the first 4 weeks as training/example data.<br /> # Top 50 biomarkers<br /> ## Curate the top 50 biomarkers for biomarkerkb.org.<br /> ## Define what constitutes a top 50 biomarker.<br /> ## Begin curating biomarkers from different sources and papers by collecting fields mentioned in the data model, as well as collecting cross-references.<br /> # Biocuration of biomarkers from NLP/LLM work<br /> ## Use the biomarkers collected from NLP work.<br /> ## Curate biomarkers. The NLP-derived data was not provided in the biomarker data model format.<br /> ## While curating the biomarkers, check whether the data collected from NLP is correct.<br /> ## After completion, the student can start using curated data to work on the NLP/LLM method.<br /> # Curate biomarkers for a treatment<br /> ## See #1 above.<br /> <br /> If you have any other ideas, diseases, treatments, or methods you want to focus on, please reach out to daniallmasood@gwu.edu to discuss your idea and check whether it is feasible as a project for the summer.<br /> <br /> ==== GlyGen Biocuration Project Ideas ====<br /> POC: Rene Ranzinger and Urnisha Bhuiyan<br /> <br /> Using TableMaker in GlyGen, individuals will curate glycomics and glycoproteomics data from previous database resources that are now defunct. 
There might also be biocuration projects that involve curating papers. <br /> <br /> ==== PredictMod Machine Learning Project Ideas ====<br /> POC: Lori Krammer<br /> <br /> Data Identification & Harmonization: <br /> <br /> # Identify publicly available datasets from scientific literature that can be used for intervention outcome prediction models.<br /> # Conduct data harmonization and pre-processing following established project pipelines to produce an ML-ready dataset and data dictionary.<br /> <br /> Modeling & Integration (for those with experience in programming/ML)<br /> <br /> # Perform model training and document the ML pipeline in a BioCompute Object (BCO). <br /> # Integrate the model into the PredictMod platform.<br /> <br /> Individuals with a background or interest in machine learning should reach out to lorikrammer@gwu.edu with a potential dataset to determine if it is a feasible project for the summer.<br /> <br /> '''FDA-ARGOS Computation and Pathogen Curation Project'''<br /> <br /> POC: Christie Woodside<br /> <br /> # Update data tables for more efficient computations<br /> ## Student would review and input additional data and IDs in the tables/sheets used to perform computations. This is manual but important work that requires close attention to detail. ~1 week's worth of work<br /> ## Requires Python/shell coding background. Student would run scripts that prepare and format data tables that are pushed to data.argosdb.org. Coding knowledge is needed in case of errors, bugs, or other mishaps in the code. Ongoing work as computations are performed.<br /> # Curate and report on current pathogens to upload to ARGOS<br /> ## Student would work on manual curation of circulating pathogens to be added to data.argosdb.org. Regular check-ins and reports of what was found. 
~4-10 weeks' worth of work<br /> ## Locate assembly IDs, reads, and metagenomic information for these pathogens to be used in computations and deposited into data.argosdb.org.<br /> ## Provide documentation on why they were curated, why they are important, how they were selected, and how data was collected.<br /> <br /> If you have any other ideas or methods you want to focus on, please reach out to christie.woodside@email.gwu.edu to discuss your idea and check whether it is feasible as a project for the summer.<hr><br /> <h3>Requirements for Completion</h3><br /> <p><strong>Note:</strong> The following are <u>mandatory</u>. Failure to complete any will result in an incomplete volunteer record.</p><br /> <br /> <h4>Documentation</h4><br /> <p>All volunteers must maintain adequate documentation of their work, including written protocols and scripts submitted to GitHub.</p><br /> <br /> <h4>Written Report</h4><br /> <p>Submit a 1–2 page summary of your tasks and accomplishments to the Admin Team during the final week of your program.</p><br /> <br /> <h4>Presentation & Slide Submission</h4><br /> <p>Present your work during the last week of the 8-week period.</p><br /> <p>Slides must be submitted to the Admin Team and should include:</p><br /> <ul><br /> <li>A title slide with your name, date, and mentor</li><br /> <li>At least 3 content slides</li><br /> <li>A final slide with acknowledgements or references</li><br /> </ul><br /> Contact the Admin Team to access previously submitted slides.<br /> <hr><br /> <br /> === Completion Certificate ===<br /> A certificate of completion and a letter of recommendation will be provided to all participants who successfully complete the program.<br /> <hr><br /> === Contact ===<br /> mazumder_lab@gwu.edu.<br /> <hr><br /> === Volunteers ===<br /> {| class="wikitable"<br /> |+<br /> |-<br /> ! Name<br /> ! 
Skills<br /> ! Projects of Interest<br /> |-<br /> | Grace Chong<br /> | Python, Machine Learning, NLP, Analysis & Mathematics<br /> |<br /> # PredictMod<br /> # BiomarkerKB Biocuration<br /> # GlyGen Biocuration<br /> |}</div>

Vishal.bakshi https://hivelab.biochemistry.gwu.edu/wiki/index.php?title=Volunteership_2025&diff=664 Volunteership 2025 2025-04-04T17:23:15Z

<p>Vishal.bakshi: </p> <hr /> <div><h2>2025 Volunteer Program Details</h2><br /> <br /> <h3>Dates</h3><br /> <p><strong>June 2nd, 2025 – July 25th, 2025</strong> (8 weeks)<br><br /> Monday to Friday | Remote | No breaks</p><br /> <br /> <hr><br /> <br /> <h3>Volunteer Expectations</h3><br /> <ol><br /> <li>Daily progress updates via Slack (scrum).</li><br /> <li>Regular Zoom meetings with the assigned project point of contact.</li><li>Expected to dedicate 5–6 hours per day to project work, with the remaining time focused on skill development or reading. </li><br /> </ol><br /> <p style="color: red;"><strong>Important:</strong> If the scrum is not updated for 2 consecutive days, the candidate will be <u>automatically dropped</u> from the program.</p><br /> <hr><br /> <br /> <h3>Potential Projects</h3><br /> <ol><br /> <li>BiomarkerKB ([https://biomarkerkb.org biomarkerkb.org]) project: Biomarker curation project. Involves reading papers and collecting biomarkers.</li><br /> <li>GlyGen ([https://glygen.org glygen.org]) project: Review glycomics and glycoproteomics data and curate tissue, disease, and other related information. </li><li>ARGOS ([https://argosdb.org argosdb.org]) project: Analyze genomics data using HIVE to identify reference genome assemblies. </li><li>PredictMod ([https://hivelab.biochemistry.gwu.edu/predictmod hivelab.biochemistry.gwu.edu/predictmod]) project. Identifying datasets and harmonizing them so that they can be used to generate ML models. 
</li></ol>Individuals with a background in programming and/or machine learning may take on additional tasks that contribute to the development of ML models, which can be integrated into PredictMod.<hr><br /> <br /> <h4>BiomarkerKB Biocuration Project Ideas</h4>POC: Daniall Masood<br /> # Curate biomarkers for a specific disease (Alzheimer's)<br /> ## The student would do manual curation for about 4 weeks, with regular check-ins with the POC to ensure it is being done correctly.<br /> ## The next 4 weeks can be dedicated to developing an LLM or an automated process to extract biomarker details, using the data collected in the first 4 weeks as training or example data.<br /> # Top 50 biomarkers<br /> ## Curate the top 50 biomarkers for biomarkerkb.org.<br /> ## Define what constitutes a top 50 biomarker.<br /> ## Begin curating biomarkers from different sources and papers by collecting fields mentioned in the data model, as well as collecting cross-references.<br /> # Biocuration of biomarkers from NLP/LLM work<br /> ## Use the biomarkers collected from NLP work.<br /> ## Curate biomarkers. The data provided does not follow the biomarker data model.<br /> ## While curating the biomarkers, check if data collected from NLP is correct.<br /> ## After completion, the student can start using curated data to work on the NLP/LLM method.<br /> # Curate biomarkers for a treatment<br /> ## See #1 above.<br /> <br /> If you have other ideas, diseases, treatments, or methods you want to focus on, please reach out to daniallmasood@gwu.edu to discuss them and check whether they are feasible as a project for the summer.<br /> <br /> ==== GlyGen Biocuration Project ====<br /> POC: Rene Ranzinger and Urnisha Bhuiyan<br /> <br /> Using TableMaker in GlyGen, individuals will curate glycomics and glycoproteomics data from previous database resources that are now defunct. There might also be biocuration projects that involve curating papers. 
<br /> <br /> ==== PredictMod Machine Learning Project Ideas ====<br /> POC: Lori Krammer<br /> <br /> Data Identification & Harmonization: <br /> <br /> # Identify publicly available datasets from scientific literature that can be used for intervention outcome prediction models.<br /> # Conduct data harmonization and pre-processing following established project pipelines to produce an ML-ready dataset and data dictionary.<br /> <br /> Modeling & Integration (for those with experience in programming/ML)<br /> <br /> # Perform model training and document the ML pipeline in a BioCompute Object (BCO). <br /> # Integrate the model into the PredictMod platform.<br /> <br /> Individuals with a background or interest in machine learning should reach out to lorikrammer@gwu.edu with a potential dataset to determine if it is a feasible project for the summer.<hr><br /> <h3>Requirements for Completion</h3><br /> <p><strong>Note:</strong> The following are <u>mandatory</u>. Failure to complete any will result in an incomplete volunteer record.</p><br /> <br /> <h4>Documentation</h4><br /> <p>All volunteers must maintain adequate documentation of their work, including written protocols and scripts submitted to GitHub.</p><br /> <br /> <h4>Written Report</h4><br /> <p>Submit a 1–2 page summary of your tasks and accomplishments to the Admin Team during the final week of your program.</p><br /> <br /> <h4>Presentation & Slide Submission</h4><br /> <p>Present your work during the last week of the 8-week period.</p><br /> <p>Slides must be submitted to the Admin Team and should include:</p><br /> <ul><br /> <li>A title slide with your name, date, and mentor</li><br /> <li>At least 3 content slides</li><br /> <li>A final slide with acknowledgements or references</li><br /> </ul><br /> Contact the Admin Team to access previously submitted slides.<br /> <hr><br /> <br /> === Completion Certificate ===<br /> A certificate of completion and a letter of recommendation will be provided to all participants who 
successfully complete the program.<br /> <hr><br /> === Contact ===<br /> mazumder_lab@gwu.edu.<br /> <hr><br /> === Volunteers ===<br /> {| class="wikitable"<br /> |+<br /> |-<br /> ! Name<br /> ! Skills<br /> ! Projects of Interest<br /> |-<br /> | Grace Chong<br /> | Python, Machine Learning, NLP, Analysis & Mathematics<br /> |<br /> # PredictMod<br /> # BiomarkerKB Biocuration<br /> # GlyGen Biocuration<br /> |}</div>

Vishal.bakshi https://hivelab.biochemistry.gwu.edu/wiki/index.php?title=Volunteership_2025&diff=659 Volunteership 2025 2025-04-04T14:13:49Z

<p>Vishal.bakshi: </p> <hr /> <div><h2>2025 Volunteer Program Details</h2><br /> <br /> <h3>Dates</h3><br /> <p><strong>June 2nd, 2025 – July 25th, 2025</strong> (8 weeks)<br><br /> Monday to Friday | Remote | No breaks</p><br /> <br /> <hr><br /> <br /> <h3>Volunteer Expectations</h3><br /> <ol><br /> <li>Daily progress updates via Slack (scrum).</li><br /> <li>Regular Zoom meetings with the assigned project point of contact.</li><li>Expected to dedicate 5–6 hours per day to project work, with the remaining time focused on skill development or reading. </li><br /> </ol><br /> <p style="color: red;"><strong>Important:</strong> If the scrum is not updated for 2 consecutive days, the candidate will be <u>automatically dropped</u> from the program.</p><br /> <hr><br /> <br /> <h3>Potential Projects</h3><br /> <ol><br /> <li>BiomarkerKB ([https://biomarkerkb.org biomarkerkb.org]) project: Biomarker curation project. Involves reading papers and collecting biomarkers.</li><br /> <li>GlyGen ([https://glygen.org glygen.org]) project: Review glycomics and glycoproteomics data and curate tissue, disease, and other related information. </li><li>ARGOS ([https://argosdb.org argosdb.org]) project: Analyze genomics data using HIVE to identify reference genome assemblies. </li><li>PredictMod ([https://hivelab.biochemistry.gwu.edu/predictmod hivelab.biochemistry.gwu.edu/predictmod]) project. Identifying datasets and harmonizing them so that they can be used to generate ML models. Individuals with a background in programming and/or machine learning may take on additional tasks that contribute to the development of ML models, which can be integrated into PredictMod. 
</li></ol><hr><br /> <br /> <h4>BiomarkerKB Biocuration Project Ideas</h4>POC: Daniall Masood<br /> # Curate biomarkers for a specific disease (Alzheimer's)<br /> ## The student would do manual curation for about 4 weeks, with regular check-ins with the POC to ensure it is being done correctly.<br /> ## The next 4 weeks can be dedicated to developing an LLM or an automated process to extract biomarker details, using the data collected in the first 4 weeks as training or example data.<br /> # Top 50 biomarkers<br /> ## Curate the top 50 biomarkers for biomarkerkb.org.<br /> ## Define what constitutes a top 50 biomarker.<br /> ## Begin curating biomarkers from different sources and papers by collecting fields mentioned in the data model, as well as collecting cross-references.<br /> # Biocuration of biomarkers from NLP/LLM work<br /> ## Use the biomarkers collected from NLP work.<br /> ## Curate biomarkers. The data provided does not follow the biomarker data model.<br /> ## While curating the biomarkers, check if data collected from NLP is correct.<br /> ## After completion, the student can start using curated data to work on the NLP/LLM method.<br /> # Curate biomarkers for a treatment<br /> ## See #1 above.<br /> <br /> If you have other ideas, diseases, treatments, or methods you want to focus on, please reach out to daniallmasood@gwu.edu to discuss them and check whether they are feasible as a project for the summer.<br /> <br /> ==== GlyGen Biocuration Project ====<br /> POC: Rene Ranzinger and Urnisha Bhuiyan<br /> <br /> Using TableMaker in GlyGen, individuals will curate glycomics and glycoproteomics data from previous database resources that are now defunct. There might also be biocuration projects that involve curating papers. 
<br /> <br /> ==== PredictMod Machine Learning Project Ideas ====<br /> POC: Lori Krammer<br /> <br /> Data Identification & Harmonization: <br /> <br /> # Identify publicly available datasets from scientific literature that can be used for intervention outcome prediction models.<br /> # Conduct data harmonization and pre-processing following established project pipelines to produce an ML-ready dataset and data dictionary.<br /> <br /> Modeling & Integration (for those with experience in programming/ML)<br /> <br /> # Perform model training and document the ML pipeline in a BioCompute Object (BCO). <br /> # Integrate the model into the PredictMod platform.<br /> <br /> Individuals with a background or interest in machine learning should reach out to lorikrammer@gwu.edu with a potential dataset to determine if it is a feasible project for the summer.<hr><br /> <h3>Requirements for Completion</h3><br /> <p><strong>Note:</strong> The following are <u>mandatory</u>. Failure to complete any will result in an incomplete volunteer record.</p><br /> <br /> <h4>Documentation</h4><br /> <p>All volunteers must maintain adequate documentation of their work, including written protocols and scripts submitted to GitHub.</p><br /> <br /> <h4>Written Report</h4><br /> <p>Submit a 1–2 page summary of your tasks and accomplishments to the Admin Team during the final week of your program.</p><br /> <br /> <h4>Presentation & Slide Submission</h4><br /> <p>Present your work during the last week of the 8-week period.</p><br /> <p>Slides must be submitted to the Admin Team and should include:</p><br /> <ul><br /> <li>A title slide with your name, date, and mentor</li><br /> <li>At least 3 content slides</li><br /> <li>A final slide with acknowledgements or references</li><br /> </ul><br /> Contact the Admin Team to access previously submitted slides.<br /> <hr><br /> <br /> === Completion Certificate ===<br /> A certificate of completion and a letter of recommendation will be provided to all participants who 
successfully complete the program.<br /> <hr><br /> === Contact ===<br /> mazumder_lab@gwu.edu.<br /> <hr><br /> === Volunteers ===<br /> {| class="wikitable"<br /> |+<br /> |-<br /> ! Name<br /> ! Skills<br /> |-<br /> | Grace Chong<br /> | Python, Machine Learning, NLP, Analysis & Mathematics<br /> |}</div>

Vishal.bakshi https://hivelab.biochemistry.gwu.edu/wiki/index.php?title=Volunteership_2025&diff=658 Volunteership 2025 2025-04-04T14:13:06Z

<p>Vishal.bakshi: /* Contact */</p> <hr /> <div><h2>2025 Volunteer Program Details</h2><br /> <br /> <h3>Dates</h3><br /> <p><strong>June 2nd, 2025 – July 25th, 2025</strong> (8 weeks)<br><br /> Monday to Friday | Remote | No breaks</p><br /> <br /> <hr><br /> <br /> <h3>Volunteer Expectations</h3><br /> <ol><br /> <li>Daily progress updates via Slack (scrum).</li><br /> <li>Regular Zoom meetings with the assigned project point of contact.</li><li>Expected to dedicate 5–6 hours per day to project work, with the remaining time focused on skill development or reading. </li><br /> </ol><br /> <p style="color: red;"><strong>Important:</strong> If the scrum is not updated for 2 consecutive days, the candidate will be <u>automatically dropped</u> from the program.</p><br /> <hr><br /> <br /> <h3>Potential Projects</h3><br /> <ol><br /> <li>BiomarkerKB ([https://biomarkerkb.org biomarkerkb.org]) project: Biomarker curation project. Involves reading papers and collecting biomarkers.</li><br /> <li>GlyGen ([https://glygen.org glygen.org]) project: Review glycomics and glycoproteomics data and curate tissue, disease, and other related information. </li><li>ARGOS ([https://argosdb.org argosdb.org]) project: Analyze genomics data using HIVE to identify reference genome assemblies. </li><li>PredictMod ([https://hivelab.biochemistry.gwu.edu/predictmod hivelab.biochemistry.gwu.edu/predictmod]) project. Identifying datasets and harmonizing them so that they can be used to generate ML models. Individuals with a background in programming and/or machine learning may take on additional tasks that contribute to the development of ML models, which can be integrated into PredictMod. 
</li></ol><hr><br /> <br /> <h4>BiomarkerKB Biocuration Project Ideas</h4>POC: Daniall Masood<br /> # Curate biomarkers for a specific disease (Alzheimer's)<br /> ## The student would do manual curation for about 4 weeks, with regular check-ins with the POC to ensure it is being done correctly.<br /> ## The next 4 weeks can be dedicated to developing an LLM or an automated process to extract biomarker details, using the data collected in the first 4 weeks as training or example data.<br /> # Top 50 biomarkers<br /> ## Curate the top 50 biomarkers for biomarkerkb.org.<br /> ## Define what constitutes a top 50 biomarker.<br /> ## Begin curating biomarkers from different sources and papers by collecting fields mentioned in the data model, as well as collecting cross-references.<br /> # Biocuration of biomarkers from NLP/LLM work<br /> ## Use the biomarkers collected from NLP work.<br /> ## Curate biomarkers. The data provided does not follow the biomarker data model.<br /> ## While curating the biomarkers, check if data collected from NLP is correct.<br /> ## After completion, the student can start using curated data to work on the NLP/LLM method.<br /> # Curate biomarkers for a treatment<br /> ## See #1 above.<br /> <br /> If you have other ideas, diseases, treatments, or methods you want to focus on, please reach out to daniallmasood@gwu.edu to discuss them and check whether they are feasible as a project for the summer.<br /> <br /> ==== GlyGen Biocuration Project ====<br /> POC: Rene Ranzinger and Urnisha Bhuiyan<br /> <br /> Using TableMaker in GlyGen, individuals will curate glycomics and glycoproteomics data from previous database resources that are now defunct. There might also be biocuration projects that involve curating papers. 
<br /> <br /> ==== PredictMod Machine Learning Project Ideas ====<br /> POC: Lori Krammer<br /> <br /> Data Identification & Harmonization: <br /> <br /> # Identify publicly available datasets from scientific literature that can be used for intervention outcome prediction models.<br /> # Conduct data harmonization and pre-processing following established project pipelines to produce an ML-ready dataset and data dictionary.<br /> <br /> Modeling & Integration (for those with experience in programming/ML)<br /> <br /> # Perform model training and document the ML pipeline in a BioCompute Object (BCO). <br /> # Integrate the model into the PredictMod platform.<br /> <br /> Individuals with a background or interest in machine learning should reach out to lorikrammer@gwu.edu with a potential dataset to determine if it is a feasible project for the summer.<hr><br /> <h3>Requirements for Completion</h3><br /> <p><strong>Note:</strong> The following are <u>mandatory</u>. Failure to complete any will result in an incomplete volunteer record.</p><br /> <br /> <h4>Documentation</h4><br /> <p>All volunteers must maintain adequate documentation of their work, including written protocols and scripts submitted to GitHub.</p><br /> <br /> <h4>Written Report</h4><br /> <p>Submit a 1–2 page summary of your tasks and accomplishments to the Admin Team during the final week of your program.</p><br /> <br /> <h4>Presentation & Slide Submission</h4><br /> <p>Present your work during the last week of the 8-week period.</p><br /> <p>Slides must be submitted to the Admin Team and should include:</p><br /> <ul><br /> <li>A title slide with your name, date, and mentor</li><br /> <li>At least 3 content slides</li><br /> <li>A final slide with acknowledgements or references</li><br /> </ul><br /> Contact the Admin Team to access previously submitted slides.<br /> <hr><br /> <br /> === Completion Certificate ===<br /> A certificate of completion and a letter of recommendation will be provided to all participants who 
successfully complete the program.<br /> <hr><br /> === Contact ===<br /> mazumder_lab@gwu.edu.<br /> === Volunteers ===<br /> {| class="wikitable"<br /> |+<br /> |-<br /> ! Name<br /> ! Skills<br /> |-<br /> | Grace Chong<br /> | Python, Machine Learning, NLP, Analysis & Mathematics<br /> |}</div>

Vishal.bakshi https://hivelab.biochemistry.gwu.edu/wiki/index.php?title=Volunteership_2025&diff=657 Volunteership 2025 2025-04-04T14:12:45Z

<p>Vishal.bakshi: </p> <hr /> <div><h2>2025 Volunteer Program Details</h2><br /> <br /> <h3>Dates</h3><br /> <p><strong>June 2nd, 2025 – July 25th, 2025</strong> (8 weeks)<br><br /> Monday to Friday | Remote | No breaks</p><br /> <br /> <hr><br /> <br /> <h3>Volunteer Expectations</h3><br /> <ol><br /> <li>Daily progress updates via Slack (scrum).</li><br /> <li>Regular Zoom meetings with the assigned project point of contact.</li><li>Expected to dedicate 5–6 hours per day to project work, with the remaining time focused on skill development or reading. </li><br /> </ol><br /> <p style="color: red;"><strong>Important:</strong> If the scrum is not updated for 2 consecutive days, the candidate will be <u>automatically dropped</u> from the program.</p><br /> <hr><br /> <br /> <h3>Potential Projects</h3><br /> <ol><br /> <li>BiomarkerKB ([https://biomarkerkb.org biomarkerkb.org]) project: Biomarker curation project. Involves reading papers and collecting biomarkers.</li><br /> <li>GlyGen ([https://glygen.org glygen.org]) project: Review glycomics and glycoproteomics data and curate tissue, disease, and other related information. </li><li>ARGOS ([https://argosdb.org argosdb.org]) project: Analyze genomics data using HIVE to identify reference genome assemblies. </li><li>PredictMod ([https://hivelab.biochemistry.gwu.edu/predictmod hivelab.biochemistry.gwu.edu/predictmod]) project. Identifying datasets and harmonizing them so that they can be used to generate ML models. Individuals with a background in programming and/or machine learning may take on additional tasks that contribute to the development of ML models, which can be integrated into PredictMod. 
</li></ol><hr><br /> <br /> <h4>BiomarkerKB Biocuration Project Ideas</h4>POC: Daniall Masood<br /> # Curate biomarkers for a specific disease (Alzheimer's)<br /> ## The student would do manual curation for about 4 weeks, with regular check-ins with the POC to ensure it is being done correctly.<br /> ## The next 4 weeks can be dedicated to developing an LLM or an automated process to extract biomarker details, using the data collected in the first 4 weeks as training or example data.<br /> # Top 50 biomarkers<br /> ## Curate the top 50 biomarkers for biomarkerkb.org.<br /> ## Define what constitutes a top 50 biomarker.<br /> ## Begin curating biomarkers from different sources and papers by collecting fields mentioned in the data model, as well as collecting cross-references.<br /> # Biocuration of biomarkers from NLP/LLM work<br /> ## Use the biomarkers collected from NLP work.<br /> ## Curate biomarkers. The data provided does not follow the biomarker data model.<br /> ## While curating the biomarkers, check if data collected from NLP is correct.<br /> ## After completion, the student can start using curated data to work on the NLP/LLM method.<br /> # Curate biomarkers for a treatment<br /> ## See #1 above.<br /> <br /> If you have other ideas, diseases, treatments, or methods you want to focus on, please reach out to daniallmasood@gwu.edu to discuss them and check whether they are feasible as a project for the summer.<br /> <br /> ==== GlyGen Biocuration Project ====<br /> POC: Rene Ranzinger and Urnisha Bhuiyan<br /> <br /> Using TableMaker in GlyGen, individuals will curate glycomics and glycoproteomics data from previous database resources that are now defunct. There might also be biocuration projects that involve curating papers. 
<br /> <br /> ==== PredictMod Machine Learning Project Ideas ====<br /> POC: Lori Krammer<br /> <br /> Data Identification & Harmonization: <br /> <br /> # Identify publicly available datasets from scientific literature that can be used for intervention outcome prediction models.<br /> # Conduct data harmonization and pre-processing following established project pipelines to produce an ML-ready dataset and data dictionary.<br /> <br /> Modeling & Integration (for those with experience in programming/ML)<br /> <br /> # Perform model training and document the ML pipeline in a BioCompute Object (BCO). <br /> # Integrate the model into the PredictMod platform.<br /> <br /> Individuals with a background or interest in machine learning should reach out to lorikrammer@gwu.edu with a potential dataset to determine if it is a feasible project for the summer.<hr><br /> <h3>Requirements for Completion</h3><br /> <p><strong>Note:</strong> The following are <u>mandatory</u>. Failure to complete any will result in an incomplete volunteer record.</p><br /> <br /> <h4>Documentation</h4><br /> <p>All volunteers must maintain adequate documentation of their work, including written protocols and scripts submitted to GitHub.</p><br /> <br /> <h4>Written Report</h4><br /> <p>Submit a 1–2 page summary of your tasks and accomplishments to the Admin Team during the final week of your program.</p><br /> <br /> <h4>Presentation & Slide Submission</h4><br /> <p>Present your work during the last week of the 8-week period.</p><br /> <p>Slides must be submitted to the Admin Team and should include:</p><br /> <ul><br /> <li>A title slide with your name, date, and mentor</li><br /> <li>At least 3 content slides</li><br /> <li>A final slide with acknowledgements or references</li><br /> </ul><br /> Contact the Admin Team to access previously submitted slides.<br /> <hr><br /> <br /> === Completion Certificate ===<br /> A certificate of completion and a letter of recommendation will be provided to all participants who 
successfully complete the program.<br /> <hr><br /> === Contact ===<br /> mazumder_lab AT gwu.edu.<br /> === Volunteers ===<br /> {| class="wikitable"<br /> |+<br /> |-<br /> ! Name<br /> ! Skills<br /> |-<br /> | Grace Chong<br /> | Python, Machine Learning, NLP, Analysis & Mathematics<br /> |}</div>

Vishal.bakshi https://hivelab.biochemistry.gwu.edu/wiki/index.php?title=Volunteership_2025&diff=632 Volunteership 2025 2025-04-02T18:07:47Z

<p>Vishal.bakshi: </p> <hr /> <div><html><br /> <h2>2025 Volunteer Program Details</h2><br /> <br /> <h3>Dates</h3><br /> <p><strong>June 2nd, 2025 – July 25th, 2025</strong> (8 weeks)<br><br /> Monday to Friday | Remote | No breaks</p><br /> <br /> <hr><br /> <br /> <h3>Volunteer Expectations</h3><br /> <ol><br /> <li>Daily scrumming (progress updates)</li><br /> <li>Regular Zoom meetings with the assigned project point of contact</li><br /> </ol><br /> <p style="color: red;"><strong>Important:</strong> If the scrum is not updated for 2 consecutive days, the candidate will be <u>automatically dropped</u> from the program.</p><br /> <hr><br /> <br /> <h3>Requirements for Completion</h3><br /> <p><strong>Note:</strong> The following are <u>mandatory</u>. Failure to complete any will result in an incomplete volunteer record.</p><br /> <br /> <h4>Documentation</h4><br /> <p>All volunteers must maintain adequate documentation of their work, including written protocols and scripts submitted to GitHub.</p><br /> <br /> <h4>Written Report</h4><br /> <p>Submit a 1–2 page summary of your tasks and accomplishments to the Admin Team during the final week of your program.</p><br /> <br /> <h4>Presentation & Slide Submission</h4><br /> <p>Present your work on 21st July, 2025.</p><br /> <p>Slides must be submitted to the Admin Team and should include:</p><br /> <ul><br /> <li>A title slide with your name, date, and mentor</li><br /> <li>At least 3 content slides</li><br /> <li>A final slide with acknowledgements or references</li><br /> </ul><br /> <p>Contact the Admin Team to access previously submitted slides.</p></div>

Vishal.bakshi https://hivelab.biochemistry.gwu.edu/wiki/index.php?title=Volunteership_2025&diff=631 Volunteership 2025 2025-04-02T18:05:40Z

<p>Vishal.bakshi: </p> <hr /> <div><html><br /> <h2>2025 Volunteer Program Details</h2><br /> <br /> <h3>Dates</h3><br /> <p><strong>June 2nd, 2025 – July 25th, 2025</strong> (8 weeks)<br><br /> Monday to Friday | Remote | No breaks</p><br /> <p style="color: red;"><strong>Important:</strong> If the scrum is not updated for 2 consecutive days, the candidate will be <u>automatically dropped</u> from the program.</p><br /> <br /> <hr><br /> <br /> <h3>Volunteer Expectations</h3><br /> <ol><br /> <li>Daily scrumming (progress updates)</li><br /> <li>Regular Zoom meetings with the assigned project point of contact</li><br /> </ol><br /> <br /> <hr><br /> <br /> <h3>Requirements for Completion</h3><br /> <p><strong>Note:</strong> The following are <u>mandatory</u>. Failure to complete any will result in an incomplete volunteer record.</p><br /> <br /> <h4>Documentation</h4><br /> <p>All volunteers must maintain adequate documentation of their work, including written protocols and scripts submitted to GitHub.</p><br /> <br /> <h4>Written Report</h4><br /> <p>Submit a 1–2 page summary of your tasks and accomplishments to the Admin Team during the final week of your program.</p><br /> <br /> <h4>Presentation & Slide Submission</h4><br /> <p>Present your work on 21st July, 2025.</p><br /> <p>Slides must be submitted to the Admin Team and should include:</p><br /> <ul><br /> <li>A title slide with your name, date, and mentor</li><br /> <li>At least 3 content slides</li><br /> <li>A final slide with acknowledgements or references</li><br /> </ul><br /> <p>Contact the Admin Team to access previously submitted slides.</p></div>

Vishal.bakshi https://hivelab.biochemistry.gwu.edu/wiki/index.php?title=Volunteership_2025&diff=630 Volunteership 2025 2025-04-02T17:55:16Z

<p>Vishal.bakshi: Summer Volunteer-ship 2025</p> <hr /> <div><nowiki><h2>2025 Volunteer Program Details</h2></nowiki><br /> <br /> <nowiki><h3>📅 Dates</h3></nowiki><br /> <br /> <nowiki><p><strong>June 2nd, 2025 – July 25th, 2025</strong></nowiki> (8 weeks)<nowiki><br></nowiki><br /> <br /> Monday to Friday | Remote | No breaks<nowiki></p></nowiki><br /> <br /> <nowiki><p style="color: red;"><strong>Important:</strong></nowiki> If the scrum is not updated for 2 days in a row, the candidate will be <nowiki><u>automatically dropped</u></nowiki> from the program.<nowiki></p></nowiki><br /> <br /> <nowiki><hr></nowiki><br /> <br /> <nowiki><h3>🎯 Volunteer Expectations</h3></nowiki><br /> <br /> <nowiki><ol></nowiki><br /> <br /> <nowiki><li>Scrumming every day (daily progress updates)</li></nowiki><br /> <br /> <nowiki><li>Zoom meetings with your assigned Project Point of Contact</li></nowiki><br /> <br /> <nowiki></ol></nowiki><br /> <br /> <nowiki><hr></nowiki><br /> <br /> <nowiki><h3>✅ Requirements for Completion</h3></nowiki><br /> <br /> <nowiki><p><strong>Note:</strong></nowiki> All of the following are mandatory. Failure to complete any will result in an incomplete volunteer record.<nowiki></p></nowiki><br /> <br /> <nowiki><h4>📝 Documentation</h4></nowiki><br /> <br /> <nowiki><p>Volunteers must maintain clear documentation of their work. This includes, but is not limited to, written protocols and scripts submitted to GitHub.</p></nowiki><br /> <br /> <nowiki><h4>🧾 Written Report</h4></nowiki><br /> <br /> <nowiki><p>Volunteers must submit a 1–2 page summary of tasks and accomplishments during the program. This should be emailed to the Admin Team during the final week.</p></nowiki><br /> <br /> <nowiki><h4>📊 Presentation & Slide Submission</h4></nowiki><br /> <br /> <nowiki><p>Volunteers must present their work during an All-Hands Meeting. 
Presentations must be 5–7 minutes long, followed by 3–5 minutes of Q&A.</p></nowiki><br /> <br /> <nowiki><p>Slides must be submitted to the Admin Team. Presentations must include:</p></nowiki><br /> <br /> <nowiki><ul></nowiki><br /> <br /> <nowiki><li>Title Slide (with Name, Date, and Mentor)</li></nowiki><br /> <br /> <nowiki><li>At least three slides total</li></nowiki><br /> <br /> <nowiki><li>End slide with acknowledgements/references</li></nowiki><br /> <br /> <nowiki></ul></nowiki><br /> <br /> <nowiki><p>For access to past slides, contact the Admin Team.</p></nowiki></div>

Vishal.bakshi https://hivelab.biochemistry.gwu.edu/wiki/index.php?title=MediaWiki:Sidebar&diff=457 MediaWiki:Sidebar 2025-03-10T17:38:27Z

<p>Vishal.bakshi: </p> <hr /> <div>•⁠ ⁠navigation<br /> ** mainpage|mainpage-description<br /> ** https://hivelab.biochemistry.gwu.edu/wiki/Projects |Projects<br /> ** https://hivelab.biochemistry.gwu.edu/wiki/Publications |Publications<br /> ** https://hivelab.biochemistry.gwu.edu/wiki/People |People<br /> ** https://hivelab.biochemistry.gwu.edu/wiki/Events |Events<br /> ** https://hivelab.biochemistry.gwu.edu/wiki/Collaborators |Collaborators<br /> ** https://hivelab.biochemistry.gwu.edu/wiki/Contact |Contact<br /> ** https://hivelab.biochemistry.gwu.edu/wiki/MediaWiki:Sidebar |Edit Sidebar<br /> ** recentchanges-url|recentchanges<br /> ** randompage-url|randompage<br /> ** helppage|help-mediawiki<br /> •⁠ ⁠SEARCH<br /> •⁠ ⁠TOOLBOX<br /> •⁠ ⁠LANGUAGES</div>

Vishal.bakshi https://hivelab.biochemistry.gwu.edu/wiki/index.php?title=MediaWiki:Sidebar&diff=456 MediaWiki:Sidebar 2025-03-10T17:37:39Z

<p>Vishal.bakshi: </p> <hr /> <div>* navigation<br /> {{#ifeq: {{PAGENAME}} | 2025_Bioinformatics_Symposium |<br /> ** mainpage|mainpage-description<br /> ** https://hivelab.biochemistry.gwu.edu/wiki/Contact |Contact<br /> ** https://hivelab.biochemistry.gwu.edu/wiki/MediaWiki:Sidebar |Edit Sidebar<br /> ** recentchanges-url|recentchanges<br /> ** randompage-url|randompage<br /> ** helppage|help-mediawiki<br /> |<br /> ** mainpage|mainpage-description<br /> ** https://hivelab.biochemistry.gwu.edu/wiki/Projects |Projects<br /> ** https://hivelab.biochemistry.gwu.edu/wiki/Publications |Publications<br /> ** https://hivelab.biochemistry.gwu.edu/wiki/People |People<br /> ** https://hivelab.biochemistry.gwu.edu/wiki/Events |Events<br /> ** https://hivelab.biochemistry.gwu.edu/wiki/Collaborators |Collaborators<br /> ** https://hivelab.biochemistry.gwu.edu/wiki/Contact |Contact<br /> ** https://hivelab.biochemistry.gwu.edu/wiki/MediaWiki:Sidebar |Edit Sidebar<br /> ** recentchanges-url|recentchanges<br /> ** randompage-url|randompage<br /> ** helppage|help-mediawiki<br /> }}<br /> * SEARCH<br /> * TOOLBOX<br /> * LANGUAGES</div>

Vishal.bakshi https://hivelab.biochemistry.gwu.edu/wiki/index.php?title=MediaWiki:Sidebar&diff=455 MediaWiki:Sidebar 2025-03-10T17:36:37Z

<p>Vishal.bakshi: </p> <hr /> <div>* navigation<br /> ** mainpage|mainpage-description<br /> ** https://hivelab.biochemistry.gwu.edu/wiki/Projects |Projects<br /> ** https://hivelab.biochemistry.gwu.edu/wiki/Publications |Publications<br /> ** https://hivelab.biochemistry.gwu.edu/wiki/People |People<br /> ** https://hivelab.biochemistry.gwu.edu/wiki/Events |Events<br /> ** https://hivelab.biochemistry.gwu.edu/wiki/Collaborators |Collaborators<br /> ** https://hivelab.biochemistry.gwu.edu/wiki/Contact |Contact<br /> ** https://hivelab.biochemistry.gwu.edu/wiki/MediaWiki:Sidebar |Edit Sidebar<br /> ** recentchanges-url|recentchanges<br /> ** randompage-url|randompage<br /> ** helppage|help-mediawiki<br /> * SEARCH<br /> * TOOLBOX<br /> * LANGUAGES</div>

Vishal.bakshi https://hivelab.biochemistry.gwu.edu/wiki/index.php?title=MediaWiki:Sidebar&diff=454 MediaWiki:Sidebar 2025-03-10T17:36:00Z

<p>Vishal.bakshi: </p> <hr /> <div>* navigation<br /> {{#ifeq: {{PAGENAME}} | 2025_Bioinformatics_Symposium |<br /> ** mainpage|mainpage-description<br /> ** https://hivelab.biochemistry.gwu.edu/wiki/Contact |Contact<br /> ** https://hivelab.biochemistry.gwu.edu/wiki/MediaWiki:Sidebar |Edit Sidebar<br /> ** recentchanges-url|recentchanges<br /> ** randompage-url|randompage<br /> ** helppage|help-mediawiki<br /> |<br /> ** mainpage|mainpage-description<br /> ** https://hivelab.biochemistry.gwu.edu/wiki/Projects |Projects<br /> ** https://hivelab.biochemistry.gwu.edu/wiki/Publications |Publications<br /> ** https://hivelab.biochemistry.gwu.edu/wiki/People |People<br /> ** https://hivelab.biochemistry.gwu.edu/wiki/Events |Events<br /> ** https://hivelab.biochemistry.gwu.edu/wiki/Collaborators |Collaborators<br /> ** https://hivelab.biochemistry.gwu.edu/wiki/Contact |Contact<br /> ** https://hivelab.biochemistry.gwu.edu/wiki/MediaWiki:Sidebar |Edit Sidebar<br /> ** recentchanges-url|recentchanges<br /> ** randompage-url|randompage<br /> ** helppage|help-mediawiki<br /> }}<br /> * SEARCH<br /> * TOOLBOX<br /> * LANGUAGES</div>

Vishal.bakshi https://hivelab.biochemistry.gwu.edu/wiki/index.php?title=GW_Bioinformatics_Network&diff=451 GW Bioinformatics Network 2025-03-10T16:20:34Z

<p>Vishal.bakshi: </p> <hr /> <div>== Contents ==<br /> __TOC__<br /> <br /> == Upcoming Meetings ==<br /> === 2025 Spring Meeting (GW-Bioinformatics Symposium) ===<br /> '''Date:''' April 29th. <br /> '''Overall idea:''' All-hands, all-day meeting/conference. We will have posters and talks. Students, staff, and faculty will be encouraged to join. <br /> '''Link:''' [https://hivelab.biochemistry.gwu.edu/wiki/2025_Bioinformatics_Symposium 2025 Bioinformatics Symposium]<br /> <br /> === 2025 Fall Meeting ===<br /> '''Date:''' To be decided. <br /> '''Fall Retreat (PD/PI-Only):''' A smaller meeting dedicated to discussions about proposals, funding, and our strategic mission. <br /> <br /> == Past Meetings ==<br /> [[File:sym.jpg|thumb|right|250px|Attendee picture | 1st Bioinformatics Retreat (2024 Fall)]]<br /> [[File:Notes.png|thumb|right|250px|Attendee picture | Attendee notes | 1st Bioinformatics Retreat (2024 Fall)]]<br /> === 2024 Fall Meeting ===<br /> '''Date:''' 10/31/2024 <br /> '''Fall Retreat (PD/PI-Only):''' A smaller meeting dedicated to discussions about proposals, funding, and our strategic mission. <br /> '''Notes:''' Notes from Fall 2024 meeting.<br /> <br /> <br /> <br /> == LISTSERV ==<br /> Anyone interested in bioinformatics at GW can subscribe. To subscribe to the listserv, which we can use to share funding, collaboration, and training opportunities, please follow these steps: <br /> <br /> # Go to your email program and choose '''Compose'''. Keep '''SUBJECT''' empty. 
<br /> # Type the bolded text below: <br /> #'''TO:''' listserv@hermes.gwu.edu <br /> #'''MESSAGE TEXT:''' '''subscribe BIOINFO your-first-name your-last-name''' <br /> <br /> <br /> == GW Bioinformatics Labs/Groups ==<br /> * [https://cblab.org/ Alekseyev, Max Lab] <br /> * Bradley, Brenda Lab (Anthropology) <br /> * Berger, Seth Lab <br /> * Broniatowski, David Lab <br /> * Caldovic, Ljubica Lab <br /> * Callier, Shawneequa Lab <br /> * Crandall, Keith Lab <br /> * Garbey, Marc Lab <br /> * Gaylord, Clark Group <br /> * Horvath, Anelia Lab <br /> * Huang Lab <br /> * Kaminski, Henry Lab <br /> * [https://hivelab.biochemistry.gwu.edu/ Mazumder, Raja Lab (Biochemistry, SMHS)] <br /> * Morizono, Hiroki Lab <br /> * Orti, Guillermo Lab <br /> * Peng, Weiqun Lab <br /> * Perez-Losada, Marcos Lab <br /> * Pyron, Alex Lab <br /> * Qiu, Xiangyun Lab <br /> * Rahman, Muhammad Lab <br /> * Rahnavard, Ali Lab <br /> * Jo Lynne Rokita Lab <br /> * Saw, Jimmy Lab <br /> * Sepulveda, Jorge Lab <br /> * Simha, Rahul Lab <br /> * Zeng, Chen Lab <br /> * Zeng, Qing Lab<br /> <br /> == Contact ==<br /> To add your name to this list, please email mazumder_lab@gwu.edu. If you have a URL for your lab, please let us know so that we can link it. This will allow others to learn about your research and also find other faculty and staff in your lab/group.</div>

Vishal.bakshi https://hivelab.biochemistry.gwu.edu/wiki/index.php?title=File:Notes.png&diff=450 File:Notes.png 2025-03-10T16:18:34Z

<p>Vishal.bakshi: </p> <hr /> <div></div>

Vishal.bakshi https://hivelab.biochemistry.gwu.edu/wiki/index.php?title=GW_Bioinformatics_Network&diff=448 GW Bioinformatics Network 2025-03-10T16:04:20Z

<p>Vishal.bakshi: </p> <hr /> <div>== Contents ==<br /> __TOC__<br /> <br /> == Upcoming Meetings ==<br /> === 2025 Spring Meeting (GW-Bioinformatics Symposium) ===<br /> '''Date:''' April 29th. Please hold this week for now. <br /> '''Overall idea:''' All-hands, all-day meeting/conference. We will have posters and talks. Students, staff, and faculty will be encouraged to join. <br /> '''Link:''' [https://hivelab.biochemistry.gwu.edu/wiki/2025_Bioinformatics_Symposium 2025 Bioinformatics Symposium]<br /> <br /> === 2025 Fall Meeting ===<br /> '''Date:''' To be decided. <br /> '''Fall Retreat (PD/PI-Only):''' A smaller meeting dedicated to discussions about proposals, funding, and our strategic mission. <br /> <br /> == Past Meetings ==<br /> [[File:sym.jpg|thumb|right|250px|Attendee picture | 1st Bioinformatics Retreat (2024 Fall)]]<br /> === 2024 Fall Meeting ===<br /> '''Date:''' 10/31/2024 <br /> '''Fall Retreat (PD/PI-Only):''' A smaller meeting dedicated to discussions about proposals, funding, and our strategic mission. <br /> '''Notes:''' Notes from Fall 2024 meeting.<br /> <br /> <br /> <br /> == LISTSERV ==<br /> Anyone interested in bioinformatics at GW can subscribe. To subscribe to the listserv, which we can use to share funding, collaboration, and training opportunities, please follow these steps: <br /> <br /> # Go to your email program and choose '''Compose'''. Keep '''SUBJECT''' empty. 
<br /> # Type the bolded text below: <br /> #'''TO:''' listserv@hermes.gwu.edu <br /> #'''MESSAGE TEXT:''' '''subscribe BIOINFO your-first-name your-last-name''' <br /> <br /> <br /> == GW Bioinformatics Labs/Groups ==<br /> * [https://cblab.org/ Alekseyev, Max Lab] <br /> * Bradley, Brenda Lab (Anthropology) <br /> * Berger, Seth Lab <br /> * Broniatowski, David Lab <br /> * Caldovic, Ljubica Lab <br /> * Callier, Shawneequa Lab <br /> * Crandall, Keith Lab <br /> * Garbey, Marc Lab <br /> * Gaylord, Clark Group <br /> * Horvath, Anelia Lab <br /> * Huang Lab <br /> * Kaminski, Henry Lab <br /> * [https://hivelab.biochemistry.gwu.edu/ Mazumder, Raja Lab (Biochemistry, SMHS)] <br /> * Morizono, Hiroki Lab <br /> * Orti, Guillermo Lab <br /> * Peng, Weiqun Lab <br /> * Perez-Losada, Marcos Lab <br /> * Pyron, Alex Lab <br /> * Qiu, Xiangyun Lab <br /> * Rahman, Muhammad Lab <br /> * Rahnavard, Ali Lab <br /> * Jo Lynne Rokita Lab <br /> * Saw, Jimmy Lab <br /> * Sepulveda, Jorge Lab <br /> * Simha, Rahul Lab <br /> * Zeng, Chen Lab <br /> * Zeng, Qing Lab<br /> <br /> == Contact ==<br /> To add your name to this list, please email mazumder_lab@gwu.edu. If you have a URL for your lab, please let us know so people can find other faculty and staff in your lab/group.</div>

Vishal.bakshi https://hivelab.biochemistry.gwu.edu/wiki/index.php?title=File:Sym.jpg&diff=447 File:Sym.jpg 2025-03-10T16:01:30Z

<p>Vishal.bakshi: </p> <hr /> <div></div>

Vishal.bakshi https://hivelab.biochemistry.gwu.edu/wiki/index.php?title=GW_Bioinformatics_Network&diff=446 GW Bioinformatics Network 2025-03-10T15:58:10Z

<p>Vishal.bakshi: </p> <hr /> <div>== Contents ==<br /> __TOC__<br /> <br /> == Upcoming Meetings ==<br /> === 2025 Spring Meeting (GW-Bioinformatics Symposium) ===<br /> '''Date:''' April 29th. Please hold this week for now. <br /> '''Overall idea:''' All-hands, all-day meeting/conference. We will have posters and talks. Students, staff, and faculty will be encouraged to join. <br /> '''Link:''' [https://hivelab.biochemistry.gwu.edu/wiki/2025_Bioinformatics_Symposium 2025 Bioinformatics Symposium]<br /> <br /> === 2025 Fall Meeting ===<br /> '''Date:''' To be decided. <br /> '''Fall Retreat (PD/PI-Only):''' A smaller meeting dedicated to discussions about proposals, funding, and our strategic mission. <br /> <br /> == Past Meetings ==<br /> === 2024 Fall Meeting ===<br /> '''Date:''' 10/31/2024 <br /> '''Fall Retreat (PD/PI-Only):''' A smaller meeting dedicated to discussions about proposals, funding, and our strategic mission. <br /> '''Notes:''' Notes from Fall 2024 meeting.<br /> <br /> == LISTSERV ==<br /> Anyone interested in bioinformatics at GW can subscribe. To subscribe to the listserv, which we can use to share funding, collaboration, and training opportunities, please follow these steps: <br /> <br /> # Go to your email program and choose '''Compose'''. Keep '''SUBJECT''' empty. 
<br /> # Type the bolded text below: <br /> #'''TO:''' listserv@hermes.gwu.edu <br /> #'''MESSAGE TEXT:''' '''subscribe BIOINFO your-first-name your-last-name''' <br /> <br /> <br /> == GW Bioinformatics Labs/Groups ==<br /> * [https://cblab.org/ Alekseyev, Max Lab] <br /> * Bradley, Brenda Lab (Anthropology) <br /> * Berger, Seth Lab <br /> * Broniatowski, David Lab <br /> * Caldovic, Ljubica Lab <br /> * Callier, Shawneequa Lab <br /> * Crandall, Keith Lab <br /> * Garbey, Marc Lab <br /> * Gaylord, Clark Group <br /> * Horvath, Anelia Lab <br /> * Huang Lab <br /> * Kaminski, Henry Lab <br /> * [https://hivelab.biochemistry.gwu.edu/ Mazumder, Raja Lab (Biochemistry, SMHS)] <br /> * Morizono, Hiroki Lab <br /> * Orti, Guillermo Lab <br /> * Peng, Weiqun Lab <br /> * Perez-Losada, Marcos Lab <br /> * Pyron, Alex Lab <br /> * Qiu, Xiangyun Lab <br /> * Rahman, Muhammad Lab <br /> * Rahnavard, Ali Lab <br /> * Jo Lynne Rokita Lab <br /> * Saw, Jimmy Lab <br /> * Sepulveda, Jorge Lab <br /> * Simha, Rahul Lab <br /> * Zeng, Chen Lab <br /> * Zeng, Qing Lab<br /> <br /> == Contact ==<br /> To add your name to this list, please email mazumder_lab@gwu.edu. If you have a URL for your lab, please let us know so people can find other faculty and staff in your lab/group.</div>

Vishal.bakshi