National Breast Cancer Coalition (NBCC) Data
Latest revision as of 19:44, 13 March 2025
Introduction
The National Breast Cancer Coalition (NBCC) dataset contains twelve terabytes of -omics data for 26,546 participants, including a cohort of 575 breast cancer patients and 3,479 patient family members. The data was originally collected through the DNA.Land project (PMID 29374253) and is currently housed on the GW-FEAST data browser. Access is restricted and must be approved by NBCC on a case-by-case basis through the NBCC Data Access Request (DAR) Form.
De-identification
NBCC De-IDN Tool v1.0
The NBCC data has been de-identified with the NBCC De-IDN Tool v1.0, developed by Dr. Robel Kahsay from our group [1], which follows the Safe Harbor approach to de-identification. The tool is specific to the NBCC data, and any changes to its parameters will be captured in subsequent versions.
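As a rough illustration of what a Safe Harbor style pass involves, the sketch below applies three of the rule's well-known generalizations to a single record: direct identifiers are dropped, dates are reduced to the year, ages over 89 are grouped as "90+", and ZIP codes are truncated to three digits. This is not the NBCC De-IDN Tool and does not reflect its parameters. The field names, the `deidentify` function, and the contents of `SPARSE_ZIP3` are all hypothetical and chosen only for the example.

```python
from datetime import date

# Hypothetical direct-identifier fields; the real NBCC schema may differ.
DIRECT_IDENTIFIERS = {"name", "email", "phone", "street_address", "ssn"}

# Illustrative placeholder for three-digit ZIP prefixes that cover 20,000 or
# fewer people. Safe Harbor requires such prefixes to be replaced with "000".
SPARSE_ZIP3 = {"036", "059", "102"}


def deidentify(record: dict) -> dict:
    """Return a copy of `record` with Safe Harbor style generalizations applied."""
    # Drop direct identifiers outright.
    out = {k: v for k, v in record.items() if k not in DIRECT_IDENTIFIERS}

    # Dates directly related to an individual: keep only the year.
    birth = out.pop("birth_date", None)
    if isinstance(birth, date):
        out["birth_year"] = birth.year

    # Ages over 89 are aggregated into a single "90+" category.
    age = out.get("age")
    if isinstance(age, int) and age > 89:
        out["age"] = "90+"
        out.pop("birth_year", None)  # the year would reveal the exact age

    # ZIP codes: keep only the first three digits, or "000" for sparse areas.
    zip_code = out.pop("zip", None)
    if zip_code:
        zip3 = str(zip_code)[:3]
        out["zip3"] = "000" if zip3 in SPARSE_ZIP3 else zip3

    return out
```

A call such as `deidentify({"name": "Jane Doe", "birth_date": date(1980, 5, 17), "age": 44, "zip": "20052"})` would drop the name, keep the birth year, and retain only the ZIP prefix. A production tool would need to cover all eighteen Safe Harbor identifier categories, which this sketch does not attempt.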
NBCC De-identification Workflow
[Figure: Nbcc deidn tool v1.0.png]
Dataset Information
Data Schema
[Figure: Nbcc_schema.png]
Data Sample
To download a de-identified single-patient NBCC dataset, please visit GW-FEAST De-identified Data Templates.
References
1. Dr. Robel Kahsay (Mazumder Lab, Dept. of Biochemistry & Molecular Medicine). GW Institutional Review Board (IRB), FWA00005945. Subject: NCR224302, "Analysis of prostate MRI image data and its integration with biomedical data." Haji-Momenian, Shahriar (Shawn), MD; Sepulveda, Jorge, MD, PhD; Whalen, Michael, MD; Kahsay, Robel, PhD.