Latest revision as of 21:50, 9 October 2024
Additional Notes
All the mapping files are available in the scripts repository in the folder: `pipeline/convert_step2/mapping`
ICGC uses TCGA study terms, so the same TCGA-to-DOID parent-term mapping is used (generated from the previous BioMuta mapping):
DO_slim_id | DO_slim_name | TCGA_project |
---|---|---|
DOID:5041 | esophageal cancer | TCGA-ESCA |
DOID:2531 | hematologic cancer | TCGA-DLBC |
DOID:9256 | colorectal cancer | TCGA-READ |
DOID:1319 | brain cancer | TCGA-GBM |
DOID:1319 | brain cancer | TCGA-LGG |
DOID:1781 | thyroid cancer | TCGA-THCA |
DOID:11054 | urinary bladder cancer | TCGA-BLCA |
DOID:363 | uterine cancer | TCGA-UCEC |
DOID:169 | neuroendocrine tumor | TCGA-PCPG |
DOID:4362 | cervical cancer | TCGA-CESC |
DOID:363 | uterine cancer | TCGA-UCS |
DOID:3277 | thymus cancer | TCGA-THYM |
DOID:3571 | liver cancer | TCGA-LIHC |
DOID:11934 | head and neck cancer | TCGA-HNSC |
DOID:2174 | ocular cancer | TCGA-UVM |
DOID:4159 | skin cancer | TCGA-SKCM |
DOID:9256 | colorectal cancer | TCGA-COAD |
DOID:3953 | adrenal gland cancer | TCGA-ACC |
DOID:1793 | pancreatic cancer | TCGA-PAAD |
DOID:2994 | germ cell cancer | TCGA-TGCT |
DOID:1324 | lung cancer | TCGA-LUSC |
DOID:1790 | malignant mesothelioma | TCGA-MESO |
DOID:2394 | ovarian cancer | TCGA-OV |
DOID:1115 | sarcoma | TCGA-SARC |
DOID:263 | kidney cancer | TCGA-KIRP |
DOID:10534 | stomach cancer | TCGA-STAD |
DOID:2531 | hematologic cancer | TCGA-LAML |
DOID:10283 | prostate cancer | TCGA-PRAD |
DOID:1324 | lung cancer | TCGA-LUAD |
DOID:1612 | breast cancer | TCGA-BRCA |
DOID:263 | kidney cancer | TCGA-KIRC |
DOID:263 | kidney cancer | TCGA-KICH |
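A minimal sketch of how the table above could be applied as a lookup. The dictionary holds only a few rows copied from the table, and the names `TCGA_TO_DOID` and `map_project_to_doid` are hypothetical, not taken from the pipeline scripts:

```python
# Subset of the TCGA-to-DOID table above (illustrative, not the full mapping).
TCGA_TO_DOID = {
    "TCGA-ESCA": ("DOID:5041", "esophageal cancer"),
    "TCGA-GBM": ("DOID:1319", "brain cancer"),
    "TCGA-LGG": ("DOID:1319", "brain cancer"),
    "TCGA-KIRC": ("DOID:263", "kidney cancer"),
}


def map_project_to_doid(project):
    """Return (DO_slim_id, DO_slim_name) for a TCGA project code, or None if unmapped."""
    # Normalise whitespace and case before the lookup (an assumption about input quality).
    return TCGA_TO_DOID.get(project.strip().upper())


print(map_project_to_doid("TCGA-GBM"))
print(map_project_to_doid("TCGA-UNKNOWN"))
```

Several TCGA projects share one DOID parent term (for example, TCGA-GBM and TCGA-LGG both map to brain cancer), so the lookup is many-to-one by design.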
UniProt Accession:
- `human_protein_transcriptlocus.csv`
Each transcript ID (starting with `ENSP`) was mapped to a UniProt isoform accession.
Mapping to the UniProt canonical accession was NOT performed, because it caused a problem in the final dataset: the same mutation would be listed under a single canonical accession with different amino acid changes.
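The problem with canonical accessions can be illustrated with a small sketch. All IDs and amino acid changes below are made-up placeholders, and deriving the canonical accession by stripping the isoform suffix is a simplifying assumption for the example, not the pipeline's actual logic:

```python
from collections import defaultdict

# Hypothetical placeholder IDs, not real Ensembl or UniProt records.
ENSP_TO_ISOFORM = {
    "ENSP_EXAMPLE_1": "P00000-1",
    "ENSP_EXAMPLE_2": "P00000-2",
}

# One genomic mutation annotated on two protein products (made-up values).
records = [
    {"mutation": "chr1:12345:G>A", "ensp": "ENSP_EXAMPLE_1", "aa_change": "R100H"},
    {"mutation": "chr1:12345:G>A", "ensp": "ENSP_EXAMPLE_2", "aa_change": "R58H"},
]


def group_changes(records, use_canonical):
    """Collect the amino acid changes reported per (mutation, accession) pair."""
    grouped = defaultdict(set)
    for rec in records:
        accession = ENSP_TO_ISOFORM[rec["ensp"]]
        if use_canonical:
            # Assumption for illustration: canonical = accession without the isoform suffix.
            accession = accession.split("-")[0]
        grouped[(rec["mutation"], accession)].add(rec["aa_change"])
    return dict(grouped)


by_isoform = group_changes(records, use_canonical=False)
by_canonical = group_changes(records, use_canonical=True)

# Keys with more than one amino acid change are the ambiguous entries.
conflicts = {key: sorted(changes) for key, changes in by_canonical.items() if len(changes) > 1}
print(conflicts)
```

With isoform accessions, every (mutation, accession) pair carries exactly one amino acid change; collapsing to the canonical accession merges the two records into one key with two conflicting changes.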