National Breast Cancer Coalition (NBCC) Data

From HIVE Lab
Revision as of 19:27, 13 March 2025 by Lorikrammer (talk | contribs)

Introduction

The National Breast Cancer Coalition (NBCC) dataset contains twelve terabytes of -omics data for 26,546 participants, including a cohort of 575 breast cancer patients and 3,479 patient family members. The original data was collected through the DNA.Land project (PMID 29374253) and is currently housed on the GW-FEAST data browser. Access is restricted and must be approved by NBCC on a case-by-case basis through the NBCC Data Access Request (DAR) Form.

De-identification

The NBCC data has been de-identified using the NBCC De-IDN Tool v1.0, developed by Dr. Robel Kahsay from our group [1], which applies the Safe Harbor approach to de-identification. The tool is specific to the NBCC data, and any changes to its parameters will be captured in subsequent versions.
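
For readers unfamiliar with Safe Harbor de-identification, the following is a minimal illustrative sketch of the kind of transformations that approach involves: dropping direct identifiers, reducing dates to the year, aggregating ages over 89, and truncating ZIP codes to three digits. It is not the NBCC De-IDN Tool and does not reflect its parameters. The field names (name, birth_date, age, zip, and so on), the function deidentify_record, and the low_population_zip3 argument are all hypothetical and chosen only for illustration.

```python
from datetime import date

# Hypothetical direct-identifier field names; NOT the NBCC schema.
DIRECT_IDENTIFIERS = frozenset(
    {"name", "email", "phone", "ssn", "street_address", "medical_record_number"}
)


def deidentify_record(record, low_population_zip3=frozenset()):
    """Return a copy of `record` with Safe Harbor-style generalizations applied.

    `low_population_zip3` is an assumed, caller-supplied set of 3-digit ZIP
    prefixes covering 20,000 or fewer people; those are replaced with "000".
    """
    # Drop fields that identify a person directly.
    out = {k: v for k, v in record.items() if k not in DIRECT_IDENTIFIERS}

    # Keep only the year of a date tied to the individual.
    birth_date = out.pop("birth_date", None)
    if isinstance(birth_date, date):
        out["birth_year"] = birth_date.year

    # Aggregate ages over 89 and drop the year that would reveal such an age.
    age = out.get("age")
    if isinstance(age, int) and age > 89:
        out["age"] = "90+"
        out.pop("birth_year", None)

    # Truncate ZIP codes to the first three digits.
    zip_code = out.pop("zip", None)
    if zip_code is not None:
        zip3 = str(zip_code)[:3]
        out["zip3"] = "000" if zip3 in low_population_zip3 else zip3

    return out


# Example call on a made-up record (not NBCC data).
example = deidentify_record(
    {
        "name": "Jane Doe",
        "birth_date": date(1950, 4, 2),
        "age": 74,
        "zip": "20052",
        "diagnosis": "breast cancer",
    }
)
print(example)
```

A production tool would also need to handle free-text fields, other date fields, and the remaining identifier categories, which this sketch leaves out.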

Data Schema

Data Sample

To download a de-identified single-patient NBCC dataset, please visit GW-FEAST De-identified Data Templates.

References

[1] Dr. Robel Kahsay (Mazumder Lab, Dept. of Biochemistry & Molecular Medicine). GW Institutional Review Board (IRB), FWA00005945. Subject: NCR224302, "Analysis of prostate MRI image data and its integration with biomedical data." Haji-Momenian, Shahriar (Shawn), MD; Sepulveda, Jorge, MD, PhD; Whalen, Michael, MD; Kahsay, Robel, PhD.